To get the HTML source of a page I can use the code below, but it won't get the generated source: it won't contain any of the HTML that was added dynamically by JavaScript in the browser. How do I get the final generated HTML source?


WebRequest req = WebRequest.Create(""); 
WebResponse res = req.GetResponse(); 
StreamReader sr = new StreamReader(res.GetResponseStream()); 
string html = sr.ReadToEnd();

If I try the code below, it returns the document without the content injected by JavaScript:

Public Class Form1

    Dim WB As WebBrowser = Nothing

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load

        WB = New WebBrowser()
        AddHandler WB.DocumentCompleted, AddressOf WebBrowser1_DocumentCompleted


    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)

        'Dim htmlcode As String = WebBrowser1.Document.Body.OuterHtml()
        Dim s As String = WB.DocumentText

    End Sub
End Class

HTML returned

<!DOCTYPE html>

<html xmlns="">
<head runat="server">

    <form id="form1" runat="server">
    <div id="center_text_panel">
    //test text  this text should be here

    <script type="text/javascript">

        document.getElementById("center_text_panel").innerText = "test text";


Solution 1:

You can use WebKit.NET

Look here for official tutorials

It can not only grab the source, but also run the page's JavaScript as the page loads.


Then, handle the DocumentCompleted event, and:

string documentContent = webKitBrowser1.DocumentText;

Edit - This might be the better open source WebKit option:

Solution 2:

Just put a WebBrowser control on your form and use the following code:


     private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
     {
           string htmlcode = webBrowser1.Document.Body.InnerHtml; // or read each field or element, or use webBrowser1.DocumentText
     }


To also get the HTML that is generated dynamically by JavaScript, you have two options:

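  1. Run the following code after the webBrowser1_DocumentCompleted event:
     // Note: Document.All also contains nested elements, so the same markup is appended more than once.
     StringBuilder htmlcode = new StringBuilder();
     foreach (HtmlElement item in webBrowser1.Document.All)
         htmlcode.Append(item.InnerHtml);
  2. Write a JavaScript function that returns document.documentElement.innerHTML and call it with the InvokeScript function to get the result:
     // "javascriptcode" is the name of a script function defined in the page
     var htmlcode = webBrowser1.Document.InvokeScript("javascriptcode");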
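
For what it's worth, here is a minimal sketch of how the second option could look end to end. It is not a definitive implementation. The class name GeneratedSourceForm, the GeneratedHtml property and the example.com address are placeholders I made up for illustration. It assumes the page's script engine accepts a call to the built-in eval function through InvokeScript, which may not hold for every page, so it falls back to reading the html element's OuterHtml from the live DOM. Scripts that run on timers or after AJAX calls may still not have finished when DocumentCompleted fires.

```csharp
using System;
using System.Windows.Forms;

public class GeneratedSourceForm : Form
{
    // Hypothetical placeholder address; replace it with the page you want to load.
    private const string PageUrl = "http://example.com/";

    private readonly WebBrowser webBrowser1 = new WebBrowser();

    // Holds the serialized DOM once the page has finished loading.
    public string GeneratedHtml { get; private set; }

    public GeneratedSourceForm()
    {
        webBrowser1.Dock = DockStyle.Fill;
        webBrowser1.ScriptErrorsSuppressed = true;
        webBrowser1.DocumentCompleted += WebBrowser1_DocumentCompleted;
        Controls.Add(webBrowser1);
        webBrowser1.Navigate(PageUrl);
    }

    private void WebBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted also fires for frames; only handle the top-level page.
        if (e.Url != webBrowser1.Url) return;

        // Ask the page's script engine to serialize the live DOM.
        // Assumption: "eval" is callable here; the result is null if the call fails.
        object result = webBrowser1.Document.InvokeScript(
            "eval", new object[] { "document.documentElement.outerHTML" });

        string html = result as string;
        if (html == null)
        {
            // Fall back to serializing the live DOM from the managed side.
            html = webBrowser1.Document.GetElementsByTagName("html")[0].OuterHtml;
        }

        GeneratedHtml = html;
        Text = "Captured " + html.Length + " characters";
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new GeneratedSourceForm());
    }
}
```

Unlike DocumentText, which returns the markup as it was downloaded, both calls above read the current DOM, so elements added by script before DocumentCompleted should be included. If the page keeps changing after load, you would need to read the DOM again later, for example from a Timer.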