Programmatically get a screenshot of a page

I'm writing a specialized crawler and parser for internal use, and I require the ability to take a screenshot of a web page in order to check what colours are being used throughout. The program will take in around ten web addresses and will save them as a bitmap image.

From there I plan to use LockBits in order to create a list of the five most used colours within the image. To my knowledge, it's the easiest way to get the colours used within a web page, but if there is an easier way to do it please chime in with your suggestions.

Anyway, I was going to use ACA WebThumb ActiveX Control until I saw the price tag. I'm also fairly new to C#, having only used it for a few months. Is there a solution to my problem of taking a screenshot of a web page in order to extract the colour scheme?

A quick and dirty way would be to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console app is slightly tricky because you have to be aware of the implications of hosting a STAThread control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept which captures a web page to an 800x600 BMP file:

namespace WebBrowserScreenshotSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Threading;
    using System.Windows.Forms;

    class Program
    {
        [STAThread]
        static void Main()
        {
            int width = 800;
            int height = 600;

            using (WebBrowser browser = new WebBrowser())
            {
                browser.Width = width;
                browser.Height = height;
                browser.ScrollBarsEnabled = true;

                // This will be called when the page finishes loading
                browser.DocumentCompleted += Program.OnDocumentCompleted;

                browser.Navigate("https://stackoverflow.com/");

                // This prevents the application from exiting until
                // Application.Exit is called
                Application.Run();
            }
        }

        static void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Now that the page is loaded, save it to a bitmap
            WebBrowser browser = (WebBrowser)sender;

            using (Graphics graphics = browser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(browser.Width, browser.Height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
                browser.DrawToBitmap(bitmap, bounds);
                bitmap.Save("screenshot.bmp", ImageFormat.Bmp);
            }

            // Instruct the application to exit
            Application.Exit();
        }
    }
}

To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.

UPDATE: I rewrote the code to avoid having to using the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.

UPDATE 2: You indicate that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible=false) instance of a WebBrowser on your form and use it the same way I show above. Here is another sample which shows the user code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While I was working on this, I discovered a peculiarity which I didn't handle before -- the DocumentCompleted event can actually be raised multiple times depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want:

namespace WebBrowserScreenshotFormsSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    public partial class MainForm : Form
    {
        public MainForm()
        {
            this.InitializeComponent();

            // Register for this event; we'll save the screenshot when it fires
            this.webBrowser.DocumentCompleted += 
                new WebBrowserDocumentCompletedEventHandler(this.OnDocumentCompleted);
        }

        private void OnClickGenerateScreenshot(object sender, EventArgs e)
        {
            // Disable button to prevent multiple concurrent operations
            this.generateScreenshotButton.Enabled = false;

            string webAddressString = this.webAddressTextBox.Text;

            Uri webAddress;
            if (Uri.TryCreate(webAddressString, UriKind.Absolute, out webAddress))
            {
                this.webBrowser.Navigate(webAddress);
            }
            else
            {
                MessageBox.Show(
                    "Please enter a valid URI.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Exclamation);

                // Re-enable button on error before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // This event can be raised multiple times depending on how much of the
            // document has loaded, if there are multiple frames, etc.
            // We only want the final page result, so we do the following check:
            if (this.webBrowser.ReadyState == WebBrowserReadyState.Complete &&
                e.Url == this.webBrowser.Url)
            {
                // Generate the file name here
                string screenshotFileName = Path.GetFullPath(
                    "screenshot_" + DateTime.Now.Ticks + ".png");

                this.SaveScreenshot(screenshotFileName);
                MessageBox.Show(
                    "Screenshot saved to '" + screenshotFileName + "'.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Information);

                // Re-enable button before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void SaveScreenshot(string fileName)
        {
            int width = this.webBrowser.Width;
            int height = this.webBrowser.Height;
            using (Graphics graphics = this.webBrowser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(width, height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, width, height);
                this.webBrowser.DrawToBitmap(bitmap, bounds);
                bitmap.Save(fileName, ImageFormat.Png);
            }
        }
    }
}

https://screenshotlayer.com/documentation is the only free service I can find lately...

You'll need to use HttpWebRequest to download the binary of the image. See the provided url above for details.

HttpWebRequest request = HttpWebRequest.Create("https://[url]") as HttpWebRequest;
Bitmap bitmap;
using (Stream stream = request.GetResponse().GetResponseStream())
{
    bitmap = new Bitmap(stream);
}
// now that you have a bitmap, you can do what you need to do...

Programmatically get a screenshot of a page

Related

Recent Posts