What is the best practice for parsing remote content with jQuery?

Following a jQuery ajax call to retrieve an entire XHTML document, what is the best way to select specific elements from the resulting string? Perhaps there is a library or plugin that solves this issue?

jQuery can only select XHTML elements that exist in a string if they're normally allowed in a div in the W3C specification; therefore, I'm curious about selecting things like <title>, <script>, and <style>.

According to the jQuery documentation:

http://docs.jquery.com/Core/jQuery#htmlownerDocument

The HTML string cannot contain elements that are invalid within a div, such as html, head, body, or title elements.

Therefore, since we have established that jQuery does not provide a way to do this, how would I select these elements? As an example, if you can show me how to select the remote page's title, that would be perfect!

Thanks, Pete


Instead of hacking jQuery to do this, I'd suggest you drop out of jQuery for a minute and use raw XML DOM methods. Since the response is XHTML, it can be parsed as XML, and then you can do this:

  window.onload = function() {
    $.ajax({
      type: 'GET',
      url: 'text.html',
      dataType: 'html',
      success: function(data) {
        // Declared locally so nothing leaks into the global scope.
        var xmlDoc;

        // Cross-browser XML document creation (adapted from w3schools).
        try {
          // Internet Explorer: throws elsewhere because ActiveXObject is undefined.
          xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
          xmlDoc.async = false; // load synchronously
          xmlDoc.loadXML(data);
        } catch (e) {
          try {
            // Firefox, Mozilla, Opera, etc.
            var parser = new DOMParser();
            xmlDoc = parser.parseFromString(data, "text/xml");
          } catch (e2) {
            alert(e2.message);
            return;
          }
        }

        // Text of the first <title> element in the parsed document.
        alert(xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue);
      }
    });
  };

No messing about with iframes etc.
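
One caveat: the last line of the snippet above throws if the document has no title, and XML parsing only works on well-formed markup. As a rough sketch (the helper names `hasParseError` and `getTitleText` are made up for illustration, not part of jQuery or the DOM), the lookup could be guarded like this. It assumes that browsers report XML parse failures through a `parsererror` element and that IE's XML DOM reports them through `parseError.errorCode`.

```javascript
// Hypothetical helper: true when the parsed document looks like a parse failure.
function hasParseError(doc) {
  // IE's XML DOM (assumption: MSXML-style object) exposes parseError.errorCode.
  if (doc.parseError && doc.parseError.errorCode !== 0) {
    return true;
  }
  // DOMParser puts a <parsererror> element into the result instead of throwing.
  return doc.getElementsByTagName('parsererror').length > 0;
}

// Hypothetical helper: trimmed text of the first <title>, or null if there is none.
function getTitleText(doc) {
  var titles = doc.getElementsByTagName('title');
  if (titles.length === 0) {
    return null;
  }
  var title = titles[0];
  // textContent is the standard property; IE's XML DOM nodes expose `text`.
  var text = title.textContent !== undefined ? title.textContent : title.text;
  return text == null ? '' : String(text).replace(/^\s+|\s+$/g, '');
}

// Possible use inside the success callback, once xmlDoc has been created:
//   if (hasParseError(xmlDoc)) { alert('Response is not well-formed XML'); }
//   else { alert(getTitleText(xmlDoc)); }
```

Both helpers only rely on `getElementsByTagName`, so they work the same way on a document produced by either branch of the try/catch.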


Just an idea, tested in Firefox and Safari: it seems to work if you create an iframe to hold the document temporarily. Of course, if you are doing this, it might be smarter to just use the iframe's src property to load the document and do whatever you want in its "onload" handler.

  $(function() {
    $.ajax({
      type: 'GET', 
      url: 'result.html',
      dataType: 'html',
      success: function(data) {
        var $frame = $("<iframe src='about:blank'/>").hide();
        $frame.appendTo('body');
        var doc = $frame.get(0).contentWindow.document;
        // Open and close the document explicitly so the write is finalized.
        doc.open();
        doc.write(data);
        doc.close();
        var $title = $("title", doc);
        alert('Title: ' + $title.text());
        $frame.remove();
      }
    });
  });

I had to append the iframe to the body to get it to have a .contentWindow.