wget + JavaScript?
I have this webpage that uses client-side JavaScript to format data on the page before it's displayed to the user.
Is it possible to somehow use wget to download the page and use some sort of client-side JavaScript engine to format the data as it would be displayed in a browser?
Solution 1:
You could probably make that happen with something like PhantomJS.
You can write a PhantomJS script that loads the page like a browser would, and then either takes screenshots or uses JS to inspect the page and pull out data.
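As a rough sketch of the "use JS to inspect the page" route, something like the following might work. The file name extract.js and the helper collectText are made up for illustration, and pulling the body's visible text is just one example of what you could extract. The function passed to page.evaluate runs inside the loaded page, so it can only rely on page globals such as document.

```javascript
// extract.js: a sketch, run as `phantomjs extract.js <url>`.
// collectText is a hypothetical helper. It is executed inside the page,
// so it may only use page globals such as `document`.
function collectText() {
    return document.body ? document.body.innerText : '';
}

// Only start the PhantomJS work when the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var system = require('system');

    page.open(system.args[1], function (status) {
        if (status !== 'success') {
            console.log('** Error loading url.');
            phantom.exit(1);
            return;
        }
        // page.evaluate runs the function in the page's context and
        // hands back its (serializable) return value.
        console.log(page.evaluate(collectText));
        phantom.exit();
    });
}
```

Swap the body of collectText for whatever DOM lookup fits your page, for example reading a particular element's text.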
Solution 2:
Here is a simple PhantomJS script that loads a webpage, lets its JavaScript run, and prints the resulting HTML so you can save it locally:
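file: get.js
    var page = require('webpage').create(),
        system = require('system'),
        address;

    // The first command-line argument is the URL to load.
    address = system.args[1];

    // Scroll position for the page (4000 px from the top).
    page.scrollPosition = { top: 4000, left: 0 };

    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('** Error loading url.');
        } else {
            // Print the page's HTML as it stands after loading.
            console.log(page.content);
        }
        phantom.exit();
    });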
Use it as follows:
    $> phantomjs /path/to/get.js "http://www.google.com" > "google.html"
Change /path/to, the URL, and the output filename to what you want.
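One caveat: get.js prints page.content as soon as the load callback fires, so data that the page's scripts format a little later may not be in the output yet. A possible workaround, sketched below, is to wait a moment before dumping the page. The file name get-delayed.js, the helper parseDelay, the optional second argument, and the 2000 ms default are all assumptions for illustration; a fixed delay is a blunt tool and the right value depends on the page.

```javascript
// get-delayed.js: a sketch, run as `phantomjs get-delayed.js <url> [delayMs]`.
// parseDelay is a hypothetical helper: it turns the optional second
// argument into a wait in milliseconds, falling back to a default.
function parseDelay(arg, fallbackMs) {
    var ms = parseInt(arg, 10);
    return isNaN(ms) || ms < 0 ? fallbackMs : ms;
}

// Only start the PhantomJS work when the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var system = require('system');
    // 2000 ms is an arbitrary default, not a recommended value.
    var delayMs = parseDelay(system.args[2], 2000);

    page.open(system.args[1], function (status) {
        if (status !== 'success') {
            console.log('** Error loading url.');
            phantom.exit(1);
            return;
        }
        // Give the page's scripts time to run before dumping the DOM.
        setTimeout(function () {
            console.log(page.content);
            phantom.exit();
        }, delayMs);
    });
}
```

It is used the same way as get.js, with the wait as an extra argument: $> phantomjs /path/to/get-delayed.js "http://www.google.com" 3000 > "google.html"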