Use Web Sources in Cortex
Web sources are the fallback when a site has no usable RSS or API endpoint. They work best when the page structure is stable enough to target with selectors.
Add a listing-page URL plus selectors for the repeating items and links Cortex should extract.
If needed, Cortex can also open each linked article page to refine metadata such as the title or publish date.
Web sources are a good fit for:
- Blog, newsroom, or updates pages with no usable RSS
- Stable listing pages built around repeatable cards or articles
- Publisher sections where metadata lives on the linked page
- Cases where scraping is acceptable because no cleaner source exists
Web is the most flexible source type, but it also has the highest maintenance cost. Use it when the site does not offer a clean RSS feed or API endpoint and you still need Cortex to monitor a stable page structure directly.
Web Setup
Start with the narrowest listing page you can find, then identify the selectors that mark each item and its canonical link. Add page-level selectors only when the listing page does not carry enough metadata on its own.
Key Configuration Fields
| Name | Description |
|---|---|
| URL | Use the listing page that already gathers the items you want. A broad homepage is usually a worse starting point than a purpose-built announcements or blog index page. Example: https://example.com/blog |
| Item Selector | Target the repeating card or article container on the listing page. This is the selector that defines what Cortex treats as one item candidate. Example: article.post |
| Link Selector | Point at the article link inside each item container so Cortex can resolve a canonical URL for each extracted item. Example: a.title |
| Wait For | Use this when cards load after the initial page render. It helps Cortex wait for the listing page structure before extraction begins. Example: css:div.card |
| Page Title Selector | Use page-level selectors when the listing page only exposes short cards and you want Cortex to refine the title or publish date from the linked article page. Example: h1 |
For the full field reference, see Create Source in the API docs.
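To make the relationship between the Item Selector and the Link Selector concrete, here is a minimal Python sketch of listing-page extraction. It is an illustration under stated assumptions, not Cortex's implementation: the `ListingExtractor` class, the `extract_links` helper, and the sample HTML are all hypothetical, and the matcher only understands simple `tag.class` selectors.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


def matches(tag, attrs, selector):
    """Match a simple 'tag.class' selector (the only form this sketch supports)."""
    want_tag, _, want_cls = selector.partition(".")
    classes = (dict(attrs).get("class") or "").split()
    tag_ok = not want_tag or tag == want_tag
    cls_ok = not want_cls or want_cls in classes
    return tag_ok and cls_ok


class ListingExtractor(HTMLParser):
    """Hypothetical helper: collects one resolved link per matched item container."""

    def __init__(self, base_url, item_selector, link_selector):
        super().__init__()
        self.base_url = base_url
        self.item_selector = item_selector
        self.link_selector = link_selector
        self.depth = 0            # 0 means "outside any item container"
        self.item_tag = None
        self.current_link = None
        self.links = []           # one entry per item; None if no link matched

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            if not matches(tag, attrs, self.item_selector):
                return
            self.depth, self.item_tag, self.current_link = 1, tag, None
        elif tag == self.item_tag:
            self.depth += 1       # nested element with the same tag name
        # The first matching link wins; the item element itself may be the link.
        if self.current_link is None and matches(tag, attrs, self.link_selector):
            href = dict(attrs).get("href")
            if href:
                self.current_link = urljoin(self.base_url, href)

    def handle_endtag(self, tag):
        if self.depth and tag == self.item_tag:
            self.depth -= 1
            if self.depth == 0:
                self.links.append(self.current_link)


def extract_links(html, base_url, item_selector, link_selector):
    parser = ListingExtractor(base_url, item_selector, link_selector)
    parser.feed(html)
    parser.close()
    return parser.links


SAMPLE_HTML = """
<main>
  <article class="post"><a class="title" href="/blog/first">First</a></article>
  <article class="post featured"><a class="title" href="https://example.com/blog/second">Second</a></article>
  <article class="post"><span>No link here</span></article>
</main>
"""

links = extract_links(SAMPLE_HTML, "https://example.com/blog", "article.post", "a.title")
print(links)
```

The third entry in the result is `None`: the item container matched, but the link selector found nothing inside it. That is the same pattern to look for when a source runs but produces no usable items.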
Examples
These examples show the kinds of page structures Web sources usually target successfully.
Blog Listing Page
Self-linking card grid where each listing item is already the canonical article link. Use this when the listing page exposes enough metadata to avoid heavier page-level extraction.
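A source definition for this case might look like the sketch below, written as a Python dict and serialized to JSON. The key names `type`, `url`, `itemSelector`, and `linkSelector` are assumptions made for illustration, as is the idea that the link selector can target the same element as the item selector; check Create Source in the API docs for the actual field names and behavior.

```python
import json

# Hypothetical payload: key names and the "web" type value are assumptions.
blog_listing_source = {
    "type": "web",
    "url": "https://example.com/blog",
    # Each card is itself the anchor element, so both selectors
    # point at the same node (assumed to be allowed).
    "itemSelector": "a.card",
    "linkSelector": "a.card",
}

payload = json.dumps(blog_listing_source, indent=2)
print(payload)
```

No page-level selectors appear here because the listing is assumed to carry enough metadata on its own.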
JS-Rendered Blog Page
JavaScript-rendered listing that needs an explicit wait and keeps page-level fallback selectors. Use this when cards load late or article pages carry more reliable metadata than the index.
Page selectors: pageTitleSelector and pageDateSelector extract metadata from each linked article page, not the listing. Use these when the listing cards lack reliable dates.
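As a rough sketch, a JavaScript-rendered listing with page-level fallbacks could be described like this. `pageTitleSelector` and `pageDateSelector` are the names used on this page; every other key (`type`, `url`, `waitFor`, `itemSelector`, `linkSelector`) and the `time` selector value are hypothetical and only illustrate the shape.

```python
import json

# Hypothetical payload: only pageTitleSelector and pageDateSelector are
# names taken from this page; the remaining keys are assumed.
js_rendered_source = {
    "type": "web",
    "url": "https://example.com/blog",
    "waitFor": "css:div.card",      # wait until the cards have rendered
    "itemSelector": "div.card",     # one rendered card = one item candidate
    "linkSelector": "a.title",      # link to the article inside each card
    "pageTitleSelector": "h1",      # read from the linked article page
    "pageDateSelector": "time",     # assumed selector for the publish date
}

payload = json.dumps(js_rendered_source, indent=2)
print(payload)
```

The wait and the item selector target the same card element in this sketch, so extraction would only begin once at least one item candidate exists.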
Troubleshooting & FAQ
The source runs, but no items are appearing.
The listing selectors are probably not matching the current page structure.
Recheck the repeating container selector first, then confirm that the link selector actually points at a usable article URL inside each matched item.
The page clearly has content, but Cortex still misses it.
The page may render its cards after the initial load.
Add a small wait or a Wait For selector so Cortex does not extract before the listing page is ready.
When should I use RSS or API Endpoint instead?
Prefer RSS or API Endpoint whenever a stable feed or JSON API already exists.
Web is usually the right choice only when the publisher offers no cleaner feed or API and scraping the page is the only way to monitor it.