Scraping dynamic content in a website

If the content is generated dynamically, you can use Windmill or Selenium to drive a real browser and read the data once the page has rendered.

You can find an example here.
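
As a rough illustration of the browser-driven approach, here is a minimal sketch using Selenium's Python bindings. It assumes Selenium and a Firefox driver are installed; the function name `fetch_rendered_html`, its parameters, and the idea of waiting for a CSS selector are my own choices for the example, not something taken from the site in question.

```python
def fetch_rendered_html(url, css_selector, timeout=10):
    """Open url in a browser, wait for css_selector to appear, return the HTML.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Imported inside the function so the sketch can be loaded and read
    # even on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumes a working Firefox driver
    try:
        driver.get(url)
        # Block until the dynamically inserted element exists in the DOM,
        # or raise TimeoutException after `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        # page_source now reflects the DOM after the scripts have run.
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

You would call it with the page address and a selector for an element that only exists once the data has loaded, then hand the returned HTML to whatever parser you already use.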

The polite option would be to ask the owners of the site if they have an API which allows you access to their news stories.

The less polite option would be to trace the HTTP transactions that take place while the page is loading and work out which one is the AJAX call that pulls in the data.

It looks like it's this one. However, the URL appears to contain session data, so I don't know how long it will keep working.
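
Once you have identified the AJAX call, you can often request it directly without a browser. The following is only a sketch using the Python standard library: the helper names are hypothetical, the `X-Requested-With` and `Referer` headers are a guess at what such an endpoint might expect, and it assumes the response is JSON. It does nothing about expiring session data, so the same caveat applies.

```python
import json
import urllib.request


def build_ajax_request(url, referer=None):
    """Build a request that resembles the browser's own AJAX call.

    Hypothetical helper; which headers the server checks is an assumption.
    """
    headers = {
        # Many JavaScript libraries send this header with AJAX requests.
        "X-Requested-With": "XMLHttpRequest",
        "Accept": "application/json",
    }
    if referer is not None:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


def decode_json(raw_bytes, encoding="utf-8"):
    """Decode a response body into Python objects, assuming it is JSON."""
    return json.loads(raw_bytes.decode(encoding))


def fetch_json(url, referer=None, timeout=10):
    """Send the request and return the decoded JSON payload."""
    request = build_ajax_request(url, referer=referer)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_json(response.read())
```

Copy the exact URL from your HTTP trace and pass it to `fetch_json`. If the server starts rejecting the request, the session data embedded in the URL has probably expired and you will need to capture a fresh one.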
