how to get html content from a webview?

Solution 1:

Actually this question has many answers. Here are 2 of them :

  • This first is almost the same as yours, I guess we got it from the same tutorial.

public class TestActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.webview);
        final WebView webview = (WebView) findViewById(R.id.browser);
        webview.getSettings().setJavaScriptEnabled(true);
        webview.addJavascriptInterface(new MyJavaScriptInterface(this), "HtmlViewer");

        webview.setWebViewClient(new WebViewClient() {
            @Override
            public void onPageFinished(WebView view, String url) {
                webview.loadUrl("javascript:window.HtmlViewer.showHTML" +
                        "('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');");
            }
        });

        webview.loadUrl("http://android-in-action.com/index.php?post/" +
                "Common-errors-and-bugs-and-how-to-solve-avoid-them");
    }

    class MyJavaScriptInterface {

        private Context ctx;

        MyJavaScriptInterface(Context ctx) {
            this.ctx = ctx;
        }

        public void showHTML(String html) {
            new AlertDialog.Builder(ctx).setTitle("HTML").setMessage(html)
                    .setPositiveButton(android.R.string.ok, null).setCancelable(false).create().show();
        }

    }
}

This way your grab the html through javascript. Not the prettiest way but when you have your javascript interface, you can add other methods to tinker it.


  • An other way is using an HttpClient like there.

The option you choose also depends, I think, on what you intend to do with the retrieved html...

Solution 2:

In KitKat and above, you could use evaluateJavascript method on webview

wvbrowser.evaluateJavascript(
        "(function() { return ('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>'); })();",
         new ValueCallback<String>() {
            @Override
            public void onReceiveValue(String html) {
                Log.d("HTML", html); 
                // code here
            }
    });

See this answer for more examples

Solution 3:

For android 4.2, dont forget to add @JavascriptInterface to all javascript functions

Solution 4:

Android WebView is just another render engine that render HTML contents downloaded from a HTTP server, much like Chrome or FireFox. I don't know the reason why you need get the rendered page (or screenshot) from WebView. For most of situation, this is not necessary. You can always get the raw HTML content from HTTP server directly.

There are already answers posted talking about getting the raw stream using HttpUrlConnection or HttpClient. Alternatively, there is a very handy library when dealing with HTML content parse/process on Android: JSoup, it provide very simple API to get HTML contents form HTTP server, and provide an abstract representation of HTML document to help us manage HTML parsing not only in a more OO style but also much easily:

// Single line of statement to get HTML document from HTTP server.
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();

It is handy when, for example, you want to download HTML document first then add some custom css or javascript to it before passing it to WebView for rendering. Much more on their official web site, worth to check it out.