How to get the word under the cursor in Windows?

I want to create an application that gets the word under the cursor (not only in text fields), but I can't find out how to do that. Using OCR is pretty hard. The only thing I've seen working is the Deskperience components. They support a 'native' way, but they cost a lot. Now I'm trying to figure out what this 'native' way is (maybe some kind of hooking). Any help will be appreciated.

EDIT: I found a way, but it gets only the whole text of the control. Any idea how to get only the word under the cursor from the whole text?


Solution 1:

On recent versions of Windows, the recommended way to gather information from another application (if you don't own the targeted application, of course) is to use the UI Automation technology. Wikipedia is a good starting point for more information: Microsoft UI Automation

Basically, UI Automation will use all necessary means to gather what can be gathered.

Here is a small console application that spies on the UI of other apps. Run it and move the mouse over different applications. Each application supports a different set of "UI Automation patterns". For example, the Value pattern and the Text pattern are both demonstrated here.

// Requires references to UIAutomationClient, UIAutomationTypes, WindowsBase,
// System.Drawing and System.Windows.Forms.
using System;
using System.Threading;
using System.Windows.Automation;
using System.Windows.Automation.Text;

class Program
{
    static void Main(string[] args)
    {
        do
        {
            // use the Windows Forms cursor position instead of the WPF one
            System.Drawing.Point mouse = System.Windows.Forms.Cursor.Position;
            AutomationElement element = AutomationElement.FromPoint(new System.Windows.Point(mouse.X, mouse.Y));
            if (element == null)
            {
                // no element under the mouse: stop spying
                return;
            }

            Console.WriteLine("Element at position " + mouse + " is '" + element.Current.Name + "'");

            object pattern;
            // the "Value" pattern is supported by many applications (including IE & FF)
            if (element.TryGetCurrentPattern(ValuePattern.Pattern, out pattern))
            {
                ValuePattern valuePattern = (ValuePattern)pattern;
                Console.WriteLine(" Value=" + valuePattern.Current.Value);
            }

            // the "Text" pattern is supported by some applications (including Notepad)
            // and can return, for example, the current selection
            if (element.TryGetCurrentPattern(TextPattern.Pattern, out pattern))
            {
                TextPattern textPattern = (TextPattern)pattern;
                foreach (TextPatternRange range in textPattern.GetSelection())
                {
                    // -1 means "no length limit"
                    Console.WriteLine(" SelectionRange=" + range.GetText(-1));
                }
            }

            // poll once per second
            Thread.Sleep(1000);
            Console.WriteLine(); Console.WriteLine();
        }
        while (true);
    }
}

UI automation is actually supported by Internet Explorer and Firefox, but not by Chrome to my knowledge. See this link: When will Google Chrome be accessible?

Now, this is just the beginning of work for you :-), because:

  • Most of the time, all this has serious security implications. Using this technology (or a direct Windows API such as WindowFromPoint) requires sufficient rights (such as being an administrator). And I don't think Deskperience has any way to overcome these limitations, unless they install a kernel driver on the computer.

  • Some applications will not expose anything to anyone, even with proper rights. For example, if I'm writing a banking application, I don't want you to spy on what my application will display :-). Other applications such as Outlook with DRM will not expose anything for the same reasons.

  • Only the UI Automation Text pattern can give more information (like the word) than just the whole text. Alas, this specific pattern is supported by neither IE nor FF, even though they support UI Automation in general.
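
To illustrate the last point: when an element does support the Text pattern, the word under a screen point can be obtained with something like the sketch below. WordUnderCursor and TryGetWord are made-up names for this example; RangeFromPoint and ExpandToEnclosingUnit are the UI Automation calls doing the actual work. I have not tried this against every application, and the returned word may carry trailing whitespace depending on the provider, hence the Trim().

```csharp
using System;
using System.Windows;
using System.Windows.Automation;
using System.Windows.Automation.Text;

// Hypothetical helper, not part of UI Automation itself.
static class WordUnderCursor
{
    // Returns the word at the given screen point, or null when the element
    // there does not expose the Text pattern.
    public static string TryGetWord(Point screenPoint)
    {
        AutomationElement element = AutomationElement.FromPoint(screenPoint);
        object pattern;
        if (element == null ||
            !element.TryGetCurrentPattern(TextPattern.Pattern, out pattern))
        {
            return null;
        }

        TextPattern textPattern = (TextPattern)pattern;
        try
        {
            // Empty range at the text position nearest to the point...
            TextPatternRange range = textPattern.RangeFromPoint(screenPoint);
            // ...grown to cover the whole word around that position.
            range.ExpandToEnclosingUnit(TextUnit.Word);
            return range.GetText(-1).Trim();
        }
        catch (ArgumentException)
        {
            // RangeFromPoint rejects points outside the element's bounds.
            return null;
        }
    }
}
```

In the console sample above, this would be called as WordUnderCursor.TryGetWord(new System.Windows.Point(mouse.X, mouse.Y)) inside the loop.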

So, if all this does not work for you, you will have to dive deeper and use OCR or shape recognition techniques. Even then, there will be some cases where you won't be able to do it at all (because of security rights).

Solution 2:

This is non-trivial if the application you want to "spy" on is drawing the text itself. One possible solution is to trigger the other application to paint a portion of its window by invalidating the area directly under the cursor.
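
The invalidation half of this idea could look roughly like the sketch below. It only forces the target window to repaint a small square around the cursor; it does not intercept anything. RepaintTrigger, RepaintUnderCursor and the halfSize parameter are names invented for this example, and the user32 functions are declared via P/Invoke. Treat it as an untested sketch.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: asks the window under the cursor to repaint a small area.
static class RepaintTrigger
{
    [StructLayout(LayoutKind.Sequential)]
    struct POINT { public int X; public int Y; }

    [StructLayout(LayoutKind.Sequential)]
    struct RECT { public int Left; public int Top; public int Right; public int Bottom; }

    const uint RDW_INVALIDATE = 0x0001; // mark the rectangle as needing a repaint
    const uint RDW_UPDATENOW = 0x0100;  // have the window paint before returning

    [DllImport("user32.dll")]
    static extern bool GetCursorPos(out POINT point);

    [DllImport("user32.dll")]
    static extern IntPtr WindowFromPoint(POINT point);

    [DllImport("user32.dll")]
    static extern bool ScreenToClient(IntPtr hWnd, ref POINT point);

    [DllImport("user32.dll")]
    static extern bool RedrawWindow(IntPtr hWnd, ref RECT update, IntPtr region, uint flags);

    // halfSize is half the edge length, in pixels, of the square to repaint.
    public static bool RepaintUnderCursor(int halfSize)
    {
        POINT p;
        if (!GetCursorPos(out p)) return false;

        IntPtr hwnd = WindowFromPoint(p);
        if (hwnd == IntPtr.Zero) return false;

        // RedrawWindow takes the rectangle in the window's client coordinates.
        ScreenToClient(hwnd, ref p);
        RECT area = new RECT
        {
            Left = p.X - halfSize,
            Top = p.Y - halfSize,
            Right = p.X + halfSize,
            Bottom = p.Y + halfSize
        };
        return RedrawWindow(hwnd, ref area, IntPtr.Zero, RDW_INVALIDATE | RDW_UPDATENOW);
    }
}
```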

When the other application paints, you will have to intercept the text drawing calls. One way to do so is to inject code into the other application and intercept its calls to the GDI functions that draw text. This kind of code patching is similar in spirit to what Visual Studio does to implement breakpoints when you debug native applications. To test the idea you could use a library like Detours (older versions were not free for commercial use; it has since been released as open source under the MIT license).

You could also check whether the application supports one of the accessibility APIs that Windows provides to facilitate things like screen readers for blind people.

One word of caution: I have not done any of this myself.

Solution 3:

If the app needs to handle more than just .NET apps, I would start by importing these functions (P/Invoke):

  • WindowFromPoint
  • ChildWindowFromPointEx
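
For reference, the two functions can be declared roughly like this. NativeMethods and WindowFinder are names chosen for the example, and GetCursorPos and ScreenToClient are added here because ChildWindowFromPointEx expects coordinates relative to the parent's client area. A sketch, not tested against every kind of window:

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    [StructLayout(LayoutKind.Sequential)]
    public struct POINT { public int X; public int Y; }

    // Skip invisible children but still consider disabled ones.
    public const uint CWP_SKIPINVISIBLE = 0x0001;

    [DllImport("user32.dll")]
    public static extern bool GetCursorPos(out POINT point);

    [DllImport("user32.dll")]
    public static extern IntPtr WindowFromPoint(POINT point);

    [DllImport("user32.dll")]
    public static extern IntPtr ChildWindowFromPointEx(IntPtr parent, POINT point, uint flags);

    [DllImport("user32.dll")]
    public static extern bool ScreenToClient(IntPtr hWnd, ref POINT point);
}

// Hypothetical helper built on the declarations above.
static class WindowFinder
{
    // Returns the handle of the window or control under the cursor,
    // or IntPtr.Zero when nothing is found.
    public static IntPtr WindowUnderCursor()
    {
        NativeMethods.POINT screen;
        if (!NativeMethods.GetCursorPos(out screen)) return IntPtr.Zero;

        IntPtr hwnd = NativeMethods.WindowFromPoint(screen);
        if (hwnd == IntPtr.Zero) return IntPtr.Zero;

        // Convert to the client coordinates of the window we found,
        // then look for a child (possibly disabled) at that position.
        NativeMethods.POINT client = screen;
        NativeMethods.ScreenToClient(hwnd, ref client);
        IntPtr child = NativeMethods.ChildWindowFromPointEx(
            hwnd, client, NativeMethods.CWP_SKIPINVISIBLE);

        return child != IntPtr.Zero ? child : hwnd;
    }
}
```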

Later you can iterate over the controls and try to get the text from each one based on its type. If I find some time, I will try to publish such code.
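
Until then, here is a minimal sketch of the "get the text" step for classic Win32 controls, assuming the control answers WM_GETTEXT (custom-drawn controls typically will not give you anything useful). ControlText, Read and WordAt are names invented for this example. WordAt also assumes you already know a character index inside the text; for a standard edit control the EM_CHARFROMPOS message is one possible source for that, which is not shown here.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class ControlText
{
    const uint WM_GETTEXT = 0x000D;
    const uint WM_GETTEXTLENGTH = 0x000E;

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, StringBuilder lParam);

    // Reads the whole text of a control, even one owned by another process.
    public static string Read(IntPtr hwnd)
    {
        int length = SendMessage(hwnd, WM_GETTEXTLENGTH, IntPtr.Zero, IntPtr.Zero).ToInt32();
        StringBuilder buffer = new StringBuilder(length + 1); // room for the terminator
        SendMessage(hwnd, WM_GETTEXT, (IntPtr)buffer.Capacity, buffer);
        return buffer.ToString();
    }

    // Returns the word containing the character at index, or "" when the
    // index is out of range or points at a non-word character.
    public static string WordAt(string text, int index)
    {
        if (string.IsNullOrEmpty(text) || index < 0 || index >= text.Length
            || !IsWordChar(text[index]))
        {
            return "";
        }

        int start = index;
        while (start > 0 && IsWordChar(text[start - 1])) start--;

        int end = index;
        while (end < text.Length - 1 && IsWordChar(text[end + 1])) end++;

        return text.Substring(start, end - start + 1);
    }

    // What counts as part of a word is a judgment call; adjust as needed.
    static bool IsWordChar(char c)
    {
        return char.IsLetterOrDigit(c) || c == '_' || c == '\'';
    }
}
```

The word boundaries are kept deliberately simple (letters, digits, underscore and apostrophe); anything language-specific would need a proper word breaker.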

After some checking, it looks like the best way (unfortunately also the hardest) is to hook into GDI text rendering (some discussion).