Fastest method of screen capturing on Windows

I want to write a screencasting program for the Windows platform, but am unsure of how to capture the screen. The only method I'm aware of is to use GDI, but I'm curious whether there are other ways to go about this, and, if there are, which incurs the least overhead? Speed is a priority.

The screencasting program will be for recording game footage, although, if this does narrow down the options, I'm still open for any other suggestions that fall out of this scope. Knowledge isn't bad, after all.

Edit: I came across this article: Various methods for capturing the screen. It has introduced me to the Windows Media API way of doing it and the DirectX way of doing it. It mentions in the Conclusion that disabling hardware acceleration could drastically improve the performance of the capture application. I'm curious as to why this is. Could anyone fill in the missing blanks for me?

Edit: I read that screencasting programs such as Camtasia use their own capture driver. Could someone give me an in-depth explanation on how it works, and why it is faster? I may also need guidance on implementing something like that, but I'm sure there is existing documentation anyway.

Also, I now know how FRAPS records the screen. It hooks the underlying graphics API to read from the back buffer. From what I understand, this is faster than reading from the front buffer, because you are reading from system RAM, rather than video RAM. You can read the article here.


Solution 1:

This is what I use to collect single frames, but if you modify this and keep the two targets open all the time then you could "stream" it to disk using a static counter for the file name. - I can't recall where I found this, but it has been modified, thanks to whoever!

void dump_buffer()
{
   IDirect3DSurface9* pRenderTarget=NULL;
   IDirect3DSurface9* pDestTarget=NULL;
     const char file[] = "Pickture.bmp";
   // sanity checks.
   if (Device == NULL)
      return;

   // get the render target surface.
   HRESULT hr = Device->GetRenderTarget(0, &pRenderTarget);
   // get the current adapter display mode.
   //hr = pDirect3D->GetAdapterDisplayMode(D3DADAPTER_DEFAULT,&d3ddisplaymode);

   // create a destination surface.
   hr = Device->CreateOffscreenPlainSurface(DisplayMde.Width,
                         DisplayMde.Height,
                         DisplayMde.Format,
                         D3DPOOL_SYSTEMMEM,
                         &pDestTarget,
                         NULL);
   //copy the render target to the destination surface.
   hr = Device->GetRenderTargetData(pRenderTarget, pDestTarget);
   //save its contents to a bitmap file.
   hr = D3DXSaveSurfaceToFile(file,
                              D3DXIFF_BMP,
                              pDestTarget,
                              NULL,
                              NULL);

   // clean up.
   pRenderTarget->Release();
   pDestTarget->Release();
}

Solution 2:

EDIT: I can see that this is listed under your first edit link as "the GDI way". This is still a decent way to go even with the performance advisory on that site, you can get to 30fps easily I would think.

From this comment (I have no experience doing this, I'm just referencing someone who does):

HDC hdc = GetDC(NULL); // get the desktop device context
HDC hDest = CreateCompatibleDC(hdc); // create a device context to use yourself

// get the height and width of the screen
int height = GetSystemMetrics(SM_CYVIRTUALSCREEN);
int width = GetSystemMetrics(SM_CXVIRTUALSCREEN);

// create a bitmap
HBITMAP hbDesktop = CreateCompatibleBitmap( hdc, width, height);

// use the previously created device context with the bitmap
SelectObject(hDest, hbDesktop);

// copy from the desktop device context to the bitmap device context
// call this once per 'frame'
BitBlt(hDest, 0,0, width, height, hdc, 0, 0, SRCCOPY);

// after the recording is done, release the desktop context you got..
ReleaseDC(NULL, hdc);

// ..delete the bitmap you were using to capture frames..
DeleteObject(hbDesktop);

// ..and delete the context you created
DeleteDC(hDest);

I'm not saying this is the fastest, but the BitBlt operation is generally very fast if you're copying between compatible device contexts.

For reference, Open Broadcaster Software implements something like this as part of their "dc_capture" method, although rather than creating the destination context hDest using CreateCompatibleDC they use an IDXGISurface1, which works with DirectX 10+. If there is no support for this they fall back to CreateCompatibleDC.

To change it to use a specific application, you need to change the first line to GetDC(game) where game is the handle of the game's window, and then set the right height and width of the game's window too.

Once you have the pixels in hDest/hbDesktop, you still need to save it to a file, but if you're doing screen capture then I would think you would want to buffer a certain number of them in memory and save to the video file in chunks, so I will not point to code for saving a static image to disk.

Solution 3:

I wrote a video capture software, similar to FRAPS for DirectX applications. The source code is available and my article explains the general technique. Look at http://blog.nektra.com/main/2013/07/23/instrumenting-direct3d-applications-to-capture-video-and-calculate-frames-per-second/

Respect to your questions related to performance,

  • DirectX should be faster than GDI except when you are reading from the frontbuffer which is very slow. My approach is similar to FRAPS (reading from backbuffer). I intercept a set of methods from Direct3D interfaces.

  • For video recording in realtime (with minimal application impact), a fast codec is essential. FRAPS uses it's own lossless video codec. Lagarith and HUFFYUV are generic lossless video codecs designed for realtime applications. You should look at them if you want to output video files.

  • Another approach to recording screencasts could be to write a Mirror Driver. According to Wikipedia: When video mirroring is active, each time the system draws to the primary video device at a location inside the mirrored area, a copy of the draw operation is executed on the mirrored video device in real-time. See mirror drivers at MSDN: http://msdn.microsoft.com/en-us/library/windows/hardware/ff568315(v=vs.85).aspx.

Solution 4:

I use d3d9 to get the backbuffer, and save that to a png file using the d3dx library:

    IDirect3DSurface9 *surface ;

    // GetBackBuffer
    idirect3ddevice9->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &surface ) ;

    // save the surface
    D3DXSaveSurfaceToFileA( "filename.png", D3DXIFF_PNG, surface, NULL, NULL ) ;

    SAFE_RELEASE( surface ) ;

To do this you should create your swapbuffer with

d3dpps.SwapEffect = D3DSWAPEFFECT_COPY ; // for screenshots.

(So you guarantee the backbuffer isn't mangled before you take the screenshot).