I have this webpage that uses client-side JavaScript to format data on the page before it's displayed to the user.

Is it possible to use wget to download the page and then run it through some sort of client-side JavaScript engine, so the data is formatted as it would be displayed in a browser?


Solution 1:

You could probably make that happen with something like PhantomJS.

You can write a PhantomJS script that loads the page like a browser would, and then either take screenshots or use JS to inspect the page and pull out data.
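
To illustrate the "use JS to inspect the page" part, here is a minimal sketch rather than a definitive implementation. The filename extract.js, the helper name extractText, and the default selector 'h1' are all made up for the example; it assumes the data you want can be reached with a CSS selector.

```javascript
// extract.js (hypothetical filename): print the text of every element
// matching a CSS selector, after the page has loaded in PhantomJS.

// This function is handed to page.evaluate, so it runs inside the page.
// It can only use its arguments and the page's own globals (e.g. document).
function extractText(selector) {
  var nodes = document.querySelectorAll(selector);
  return Array.prototype.map.call(nodes, function (node) {
    return node.textContent.trim();
  });
}

// Only drive a page when running under PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var system = require('system');
  var address = system.args[1];
  var selector = system.args[2] || 'h1'; // 'h1' is an arbitrary default

  page.open(address, function (status) {
    if (status !== 'success') {
      console.log('** Error loading url.');
      phantom.exit(1);
      return;
    }
    // The return value is serialized back from the page context.
    var texts = page.evaluate(extractText, selector);
    console.log(texts.join('\n'));
    phantom.exit();
  });
}
```

You would run it along the lines of: phantomjs /path/to/extract.js "http://www.google.com" "a"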

Solution 2:

Here is a simple little PhantomJS script that loads a webpage, lets its JavaScript run, and prints the resulting HTML so you can save it locally:

file: get.js

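// Load the URL given on the command line and print the rendered HTML to stdout.
var page = require('webpage').create(),
    system = require('system'),
    address;

address = system.args[1];
// Start scrolled 4000px down the page.
page.scrollPosition = { top: 4000, left: 0 };
page.open(address, function (status) {
  if (status !== 'success') {
    console.log('** Error loading url.');
    phantom.exit(1); // non-zero exit code signals the failure to the shell
  } else {
    // page.content is the page's current HTML, after its scripts ran on load.
    console.log(page.content);
    phantom.exit();
  }
});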

Use it as follows:
$> phantomjs /path/to/get.js "http://www.google.com" > "google.html"

Replace /path/to, the URL, and the output filename with your own.