How do I select text nodes with jQuery?

I would like to get all descendant text nodes of an element, as a jQuery collection. What is the best way to do that?


jQuery doesn't have a convenient function for this. You need to combine contents(), which will give just child nodes but includes text nodes, with find(), which gives all descendant elements but no text nodes. Here's what I've come up with:

var getTextNodesIn = function(el) {
    return $(el).find(":not(iframe)").addBack().contents().filter(function() {
        return this.nodeType == 3;
    });
};

getTextNodesIn(el);

Note: If you're using jQuery 1.7 or earlier, the code above will not work. To fix this, replace addBack() with andSelf(). andSelf() is deprecated in favour of addBack() from 1.8 onwards.

This is somewhat inefficient compared to pure DOM methods and has to include an ugly workaround for jQuery's overloading of its contents() function (thanks to @rabidsnail in the comments for pointing that out), so here is non-jQuery solution using a simple recursive function. The includeWhitespaceNodes parameter controls whether or not whitespace text nodes are included in the output (in jQuery they are automatically filtered out).

Update: Fixed bug when includeWhitespaceNodes is falsy.

function getTextNodesIn(node, includeWhitespaceNodes) {
    var textNodes = [], nonWhitespaceMatcher = /\S/;

    function getTextNodes(node) {
        if (node.nodeType == 3) {
            if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)) {
                textNodes.push(node);
            }
        } else {
            for (var i = 0, len = node.childNodes.length; i < len; ++i) {
                getTextNodes(node.childNodes[i]);
            }
        }
    }

    getTextNodes(node);
    return textNodes;
}

getTextNodesIn(el);

Jauco posted a good solution in a comment, so I'm copying it here:

$(elem)
  .contents()
  .filter(function() {
    return this.nodeType === 3; //Node.TEXT_NODE
  });

$('body').find('*').contents().filter(function () { return this.nodeType === 3; });

jQuery.contents() can be used with jQuery.filter to find all child text nodes. With a little twist, you can find grandchildren text nodes as well. No recursion required:

$(function() {
  var $textNodes = $("#test, #test *").contents().filter(function() {
    return this.nodeType === Node.TEXT_NODE;
  });
  /*
   * for testing
   */
  $textNodes.each(function() {
    console.log(this);
  });
});
div { margin-left: 1em; }
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>

<div id="test">
  child text 1<br>
  child text 2
  <div>
    grandchild text 1
    <div>grand-grandchild text 1</div>
    grandchild text 2
  </div>
  child text 3<br>
  child text 4
</div>

jsFiddle


I was getting a lot of empty text nodes with the accepted filter function. If you're only interested in selecting text nodes that contain non-whitespace, try adding a nodeValue conditional to your filter function, like a simple $.trim(this.nodevalue) !== '':

$('element')
    .contents()
    .filter(function(){
        return this.nodeType === 3 && $.trim(this.nodeValue) !== '';
    });

http://jsfiddle.net/ptp6m97v/

Or to avoid strange situations where the content looks like whitespace, but is not (e.g. the soft hyphen &shy; character, newlines \n, tabs, etc.), you can try using a Regular Expression. For example, \S will match any non-whitespace characters:

$('element')
        .contents()
        .filter(function(){
            return this.nodeType === 3 && /\S/.test(this.nodeValue);
        });