JavaScript .replace command replace page text?

Can the JavaScript command .replace replace text in any webpage? I want to create a Chrome extension that replaces specific words in any webpage with something else (for example, "cake" instead of "pie").


Solution 1:

The .replace method is a string operation, so it's not immediately simple to run the operation on HTML documents, which are composed of DOM Node objects.
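To make the "string operation" point concrete, here is a small sketch of what .replace does on a plain string. The sample sentence is made up for illustration.

```javascript
// String.prototype.replace works on strings and returns a new string;
// the original string is left untouched.
var sentence = "I like pie. Pie is great, pie is life.";

// A string pattern replaces only the first occurrence.
var firstOnly = sentence.replace("pie", "cake");

// A regular expression with the g flag replaces every occurrence;
// adding the i flag also matches "Pie".
var allMatches = sentence.replace(/pie/gi, "cake");

console.log(firstOnly);  // I like cake. Pie is great, pie is life.
console.log(allMatches); // I like cake. cake is great, cake is life.
```

Nothing here knows about elements or nodes, which is why the page has to be walked node by node before .replace can be applied.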

Use TreeWalker API

The best way to go through every node in a DOM and replace text in it is to use the document.createTreeWalker method to create a TreeWalker object. This is a practice that is used in a number of Chrome extensions!

// create a TreeWalker of all text nodes
var allTextNodes = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT),
    // some temp references for performance
    tmptxt,
    tmpnode,
    // compile the RE and cache the replace string, for performance
    cakeRE = /cake/g,
    replaceValue = "pie";

// iterate through all text nodes
while (allTextNodes.nextNode()) {
    tmpnode = allTextNodes.currentNode;
    tmptxt = tmpnode.nodeValue;
    tmpnode.nodeValue = tmptxt.replace(cakeRE, replaceValue);
}

To replace parts of text with another element or to add an element in the middle of text, use the DOM splitText, createElement, and insertBefore methods.
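As a rough sketch of how those three methods could fit together, the function below wraps the first occurrence of a word in a new element. The name wrapFirstMatch, its parameters, and the choice of a mark element are assumptions made for this example, not part of any library.

```javascript
// Wrap the first occurrence of `word` inside a text node in a new element.
// `wrapFirstMatch` and its parameters are hypothetical names for this sketch.
function wrapFirstMatch(textNode, word, tagName) {
    var index = textNode.nodeValue.indexOf(word);
    if (index === -1) {
        return null; // nothing to wrap in this node
    }
    // splitText(offset) cuts the node in two and returns the new second half,
    // so after these two calls matchNode contains exactly `word`
    var matchNode = textNode.splitText(index);
    matchNode.splitText(word.length);

    // create the wrapper, put it where the match is, then move the match into it
    var wrapper = textNode.ownerDocument.createElement(tagName);
    matchNode.parentNode.insertBefore(wrapper, matchNode);
    wrapper.appendChild(matchNode);
    return wrapper;
}

// Usage in a page: wrap "cake" in the first text node of the first paragraph.
if (typeof document !== "undefined") {
    var paragraph = document.querySelector("p");
    if (paragraph && paragraph.firstChild && paragraph.firstChild.nodeType === 3) {
        wrapFirstMatch(paragraph.firstChild, "cake", "mark");
    }
}
```

Because only text nodes are split and moved, existing elements and their event listeners stay in place.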

Don't use innerHTML or innerText or jQuery .html()

// the innerHTML property of any DOM node is a string
document.body.innerHTML = document.body.innerHTML.replace(/cake/g,'pie')
  • It's generally slower (especially on mobile devices).
  • It effectively removes and replaces the entire DOM, which can have side effects: it destroys all event listeners attached in JavaScript code (via addEventListener or .onxxxx properties), partially or completely breaking the page's functionality.
  • This is, however, a common, quick, and very dirty way to do it.

Solution 2:

OK, so the createTreeWalker method is the RIGHT way of doing this, and it's a good way. Unfortunately, I needed to support IE8, which does not support document.createTreeWalker. Sad Ian is sad.

If you want to do this with a .replace on the page's innerHTML like a naughty child, you need to be careful, because it WILL replace text inside tags (tag names and attribute values included), leading to XSS vulnerabilities and general destruction of your page.
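A quick way to see the problem is to run the naive replace on a small HTML string. The markup below is made up for illustration.

```javascript
// A serialized fragment such as innerHTML might return (made-up markup).
var html = '<a href="/cake-recipes" title="cake">Best cake ever</a>';

// The naive replace cannot tell markup from text, so it rewrites the URL
// and the title attribute along with the visible word.
var naive = html.replace(/cake/g, "pie");

console.log(naive);
// <a href="/pie-recipes" title="pie">Best pie ever</a>
```

The link now points somewhere else, even though only the visible text was supposed to change.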

What you need to do is only replace text OUTSIDE of tags, which I matched with:

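// the match includes the text around the word, so capture it and put it back: "$1" + replacement + "$3"
var search_re = new RegExp("(>[^<]*)(" + stringToReplace + ")([^>]*<)", "gi");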

Gross, isn't it? You may want to mitigate any slowness by replacing some results right away and then deferring the rest in a setTimeout call, like so:

// replace some chunk of stuff, the first section of your page works nicely
// if you happen to have that organization
//
setTimeout(function() { /* replace the rest */ }, 10);

This returns immediately after replacing the first chunk, letting your page continue with its happy life. For your replace calls, you're also going to want to do the work on large chunks in a temp string

var tmp = element.innerHTML.replace(search_re, whatever); 
/* more replace calls, maybe this is in a for loop, i don't know what you're doing */  
element.innerHTML = tmp;

so as to minimize reflows (when the page recalculates positioning and re-renders everything). For large pages, this can be slow unless you're careful, hence the optimization pointers. Again, don't do this unless you absolutely need to. Use the createTreeWalker method zetlen has kindly posted above.
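If you do end up on the innerHTML route, here is one way the pieces might fit together. This is only a sketch and a variation on the pattern above: it uses a negative lookahead instead of matching the surrounding text, and it escapes the search term first. The helper names escapeRegExp and replaceOutsideTags are made up for this example.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so a search term like "c++" is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace `search` in an HTML string, skipping matches that sit inside a tag.
// The lookahead (?![^<>]*>) rejects a match when the next angle bracket after
// it is a closing ">", which is the case for tag names and attributes.
function replaceOutsideTags(html, search, replacement) {
    var re = new RegExp(escapeRegExp(search) + "(?![^<>]*>)", "gi");
    return html.replace(re, replacement);
}

var before = '<p class="cake">Cake and more cake</p>';
var after = replaceOutsideTags(before, "cake", "pie");
console.log(after); // <p class="cake">pie and more pie</p>

// In a page, do all the string work first and write innerHTML once:
// var tmp = replaceOutsideTags(element.innerHTML, "cake", "pie");
// element.innerHTML = tmp;
```

It still has blind spots: the contents of script and style elements look like ordinary text to the regex, and attribute values containing a raw ">" can confuse it. Treat it as a fallback for old browsers, not a replacement for walking the text nodes.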