Trying to use the DOMParser with node js
Many browser features, such as DOM manipulation or XHR, are not available natively in Node.js, because accessing the DOM is not a typical server task. You'll have to use an external library for that.
DOM capabilities depend a lot on the library. Here's a quick comparison of the main tools you can use:
- jsdom: implements DOM level 4, the latest DOM standard, so most of what you can do in a modern browser you can also do in jsdom. It is the de facto industry standard for doing browser stuff on Node, used by Mocha, Vue Test Utils, Webpack Prerender SPA Plugin, and many others:
    const jsdom = require("jsdom");
    const dom = new jsdom.JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
    dom.window.document.querySelector("p").textContent; // 'Hello world'
- deno_dom: if using Deno instead of Node is an option, this library provides DOM parsing capabilities:
    import { DOMParser } from "https://deno.land/x/deno_dom/deno-dom-wasm.ts";
    const parser = new DOMParser();
    const document = parser.parseFromString('<p>Hello world</p>', 'text/html');
    document.querySelector('p').textContent; // 'Hello world'
- htmlparser2: a streaming, callback-based parser rather than a full DOM implementation like jsdom. It offers better performance and flexibility at the price of a more complex API:
    const htmlparser = require("htmlparser2");
    const parser = new htmlparser.Parser({
        onopentag: (name, attribs) => {
            if (name === 'p') console.log('a paragraph element is opening');
        }
    }, { decodeEntities: true });
    parser.write(`<!DOCTYPE html><p>Hello world</p>`);
    parser.end();
    // console output: 'a paragraph element is opening'
- cheerio: an implementation of core jQuery for the server, based on HTML DOM parsing by htmlparser2:
    const cheerio = require('cheerio');
    const $ = cheerio.load(`<!DOCTYPE html><p>Hello world</p>`);
    $('p').text('Bye moon');
    $.html(); // serialized document, now containing '<p>Bye moon</p>'
- xmldom: fully implements DOM level 2 and partially implements DOM level 3. Works with both HTML and XML.
- dom-parser: regex-based DOM parser that implements a few DOM methods like getElementById. Since parsing HTML with regular expressions is a very bad idea, I wouldn't recommend this one for production.
There is no DOMParser in Node.js; that's a browser thing. You can try any of these modules though:
https://github.com/joyent/node/wiki/modules#wiki-parsers-xml
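To make that concrete, here is a minimal sketch that checks whether a DOMParser is available and otherwise falls back to whichever package happens to be installed. The helper name `resolveDOMParser` and the order of candidates are assumptions for illustration only; `@xmldom/xmldom` is the scoped name under which xmldom is currently published. It returns null when nothing is found instead of throwing.

```javascript
// Hypothetical helper: returns a DOMParser constructor, or null if none is available.
function resolveDOMParser() {
  // Browsers expose DOMParser as a global; plain Node.js does not.
  if (typeof globalThis.DOMParser === 'function') {
    return globalThis.DOMParser;
  }
  // Try packages that export a DOMParser; skip the ones that are not installed.
  for (const name of ['@xmldom/xmldom', 'xmldom']) {
    try {
      const { DOMParser } = require(name);
      if (typeof DOMParser === 'function') return DOMParser;
    } catch (err) {
      // Package not installed: try the next candidate.
    }
  }
  // Last resort: borrow the DOMParser of a jsdom window, if jsdom is installed.
  try {
    const { JSDOM } = require('jsdom');
    return new JSDOM().window.DOMParser;
  } catch (err) {
    return null;
  }
}

const DOMParserImpl = resolveDOMParser();
console.log(DOMParserImpl ? 'DOMParser available' : 'No DOMParser found; install xmldom or jsdom');
```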
You can use a Node implementation of DOMParser, such as xmldom. This will allow you to access DOMParser outside of the browser. For example:
var DOMParser = require('xmldom').DOMParser;
var parser = new DOMParser();
var document = parser.parseFromString('Your XML String', 'text/xml');
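Once the string is parsed, the result can be queried with the usual DOM methods. The following is a rough sketch, not part of xmldom itself: the helper `textOfElements` is a hypothetical name, and it takes the DOMParser constructor as an argument so the same code should work with xmldom, jsdom, or a browser.

```javascript
// Hypothetical helper: collects the text of every <tagName> element in an XML string.
// DOMParserImpl is any DOMParser-compatible constructor (for example xmldom's).
function textOfElements(DOMParserImpl, xml, tagName) {
  const doc = new DOMParserImpl().parseFromString(xml, 'text/xml');
  const nodes = doc.getElementsByTagName(tagName);
  const texts = [];
  // Use the standard length/item() interface rather than assuming a real array.
  for (let i = 0; i < nodes.length; i++) {
    texts.push(nodes.item(i).textContent);
  }
  return texts;
}

// Usage, assuming xmldom is installed:
// const { DOMParser } = require('xmldom');
// textOfElements(DOMParser, '<books><title>A</title><title>B</title></books>', 'title');
```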
I used jsdom because it's got a ton of usage and is written by a prominent web hero. No promises that its behavior perfectly matches your browser's (or even that every browser's behavior is the same), but it worked for me:
const jsdom = require("jsdom")
const { JSDOM } = jsdom
global.DOMParser = new JSDOM().window.DOMParser
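Assigning to `global` affects every module in the process. If that is a concern (in a test suite, for instance), one possible variation is to install the constructor only for the duration of a call. This is a sketch under that assumption: `withGlobalDOMParser` is a hypothetical helper, not a jsdom API, and it only handles synchronous callbacks.

```javascript
// Hypothetical helper: installs a DOMParser on the global object only while fn runs.
// Only suitable for synchronous fn; an async fn would outlive the restore step.
function withGlobalDOMParser(DOMParserImpl, fn) {
  const hadOwn = Object.prototype.hasOwnProperty.call(globalThis, 'DOMParser');
  const previous = globalThis.DOMParser;
  globalThis.DOMParser = DOMParserImpl;
  try {
    return fn();
  } finally {
    // Restore the previous state, even if fn throws.
    if (hadOwn) {
      globalThis.DOMParser = previous;
    } else {
      delete globalThis.DOMParser;
    }
  }
}

// Usage, assuming jsdom is installed:
// const { JSDOM } = require('jsdom');
// withGlobalDOMParser(new JSDOM().window.DOMParser, () => {
//   return new DOMParser().parseFromString('<p>Hello world</p>', 'text/html');
// });
```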
I really like htmlparser2. It's a fantastic, fast and lightweight library. I've created a small demo on how to use it on RunKit: https://runkit.com/jfahrenkrug/htmlparser2-demo/1.0.0