Preventing XSS in Node.js / server side javascript

Any idea how one would go about preventing XSS attacks on a node.js app? Any libs out there that handle removing javascript in hrefs, onclick attributes,etc. from POSTed data?

I don't want to have to write a regex for all that :)

Any suggestions?

I've created a module that bundles the Caja HTML Sanitizer

npm install sanitizer

http://github.com/theSmaw/Caja-HTML-Sanitizer

https://www.npmjs.com/package/sanitizer

Any feedback appreciated.

One of the answers to Sanitize/Rewrite HTML on the Client Side suggests borrowing the whitelist-based HTML sanitizer in JS from Google Caja which, as far as I can tell from a quick scroll-through, implements an HTML SAX parser without relying on the browser's DOM.

Update: Also, keep in mind that the Caja sanitizer has apparently been given a full, professional security review while regexes are known for being very easy to typo in security-compromising ways.

Update 2017-09-24: There is also now DOMPurify. I haven't used it yet, but it looks like it meets or exceeds every point I look for:

Relies on functionality provided by the runtime environment wherever possible. (Important both for performance and to maximize security by relying on well-tested, mature implementations as much as possible.)
- Relies on either a browser's DOM or jsdom for Node.JS.
Default configuration designed to strip as little as possible while still guaranteeing removal of javascript.
- Supports HTML, MathML, and SVG
- Falls back to Microsoft's proprietary, un-configurable toStaticHTML under IE8 and IE9.
Highly configurable, making it suitable for enforcing limitations on an input which can contain arbitrary HTML, such as a WYSIWYG or Markdown comment field. (In fact, it's the top of the pile here)
- Supports the usual tag/attribute whitelisting/blacklisting and URL regex whitelisting
- Has special options to sanitize further for certain common types of HTML template metacharacters.
They're serious about compatibility and reliability
- Automated tests running on 16 different browsers as well as three diffferent major versions of Node.JS.
- To ensure developers and CI hosts are all on the same page, lock files are published.

All usual techniques apply to node.js output as well, which means:

Blacklists will not work.
You're not supposed to filter input in order to protect HTML output. It will not work or will work by needlessly malforming the data.
You're supposed to HTML-escape text in HTML output.

I'm not sure if node.js comes with some built-in for this, but something like that should do the job:

function htmlEscape(text) {
   return text.replace(/&/g, '&amp;').
     replace(/</g, '&lt;').  // it's not neccessary to escape >
     replace(/"/g, '&quot;').
     replace(/'/g, '&#039;');
}

I recently discovered node-validator by chriso.

Example

get('/', function (req, res) {

  //Sanitize user input
  req.sanitize('textarea').xss(); // No longer supported
  req.sanitize('foo').toBoolean();

});

XSS Function Deprecation

The XSS function is no longer available in this library.

https://github.com/chriso/validator.js#deprecations

Preventing XSS in Node.js / server side javascript

Example

XSS Function Deprecation

Related

Recent Posts