Python code to automate desktop activities in Windows
Have a look at SIKULI.
Sikuli is a visual technology to automate and test graphical user interfaces (GUI) using images (screenshots).
Sikuli cleverly combines taking screenshots with embedding them in your Python (actually Jython) script.
You take screenshots of the GUI elements you want to interact with and then use them in your code.
You can try Automa.
It's a Windows GUI automation tool written in Python which is very simple to use. For example, you can do the following:
# to double click on an icon on the desktop
doubleclick("Recycle Bin")
# to maximize
click("Maximize")
# to input some text and press ENTER
write("Some text", into="Label of the text field")
press(ENTER)
The full list of available commands can be found here.
Disclaimer: I'm one of Automa's developers.
There are different ways of automating user interfaces in Windows that can be accessed via Python (using ctypes or some of the Python Windows bindings):
Raw Windows APIs -- GetCursorPos/SetCursorPos for the mouse, and HWND APIs like GetFocus and GetForegroundWindow.
AutoIt -- an automation scripting language (see "Calling AutoIt Functions in Python").
Microsoft Active Accessibility (MSAA) / WinEvent -- an API for interrogating a UI through the accessibility APIs introduced in Win95.
UI Automation (UIA) -- a replacement for MSAA introduced in Vista (available for XP SP3 IIRC).
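As a rough illustration of the "raw Windows APIs" route, here is a minimal ctypes sketch. The helper names (get_cursor_pos, set_cursor_pos, foreground_window) are made up for this example; only the user32 functions they call are real. It is guarded so that it does nothing on non-Windows platforms.

```python
import ctypes
import sys


class POINT(ctypes.Structure):
    """Mirror of the Win32 POINT structure (two LONG fields)."""
    _fields_ = [("x", ctypes.c_long), ("y", ctypes.c_long)]


def get_cursor_pos():
    """Return the mouse cursor position as (x, y) screen coordinates."""
    pt = POINT()
    if not ctypes.windll.user32.GetCursorPos(ctypes.byref(pt)):
        raise ctypes.WinError()
    return pt.x, pt.y


def set_cursor_pos(x, y):
    """Move the mouse cursor to the given screen coordinates."""
    if not ctypes.windll.user32.SetCursorPos(int(x), int(y)):
        raise ctypes.WinError()


def foreground_window():
    """Return the HWND of the window the user is currently working in."""
    fn = ctypes.windll.user32.GetForegroundWindow
    fn.restype = ctypes.c_void_p  # HWND is pointer-sized
    return fn()


if sys.platform == "win32":
    # ctypes.windll only exists on Windows, so the demo is guarded.
    x, y = get_cursor_pos()
    set_cursor_pos(x + 10, y + 10)  # nudge the pointer diagonally
    print("foreground HWND:", foreground_window())
```

This only covers the mouse position and the foreground window; anything that needs to know about individual controls is where MSAA or UIA come in.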
Automating a user interface to test it is a non-trivial task. There are a lot of gotchas that can trip you up.
I would suggest testing your automation framework in an automated way so you can verify that it works on the platforms you are testing (to identify failures in the automation API vs. failures in the application).
Another consideration is how to deal with localization. Note also that the names for Minimize/Maximize/... are localized as well, and can be in a different language to the application (system vs. user locale)!
In pseudo-code, an MSAA program to minimize an application would look something like:
window = AccessibleObjectFromWindow(FindWindow("My Window"))
titlebar = [x for x in window.AccessibleChildren if x.accRole == TitleBar]
minimize = [x for x in titlebar[0].AccessibleChildren if x.Name == "Minimize"]
if minimize:  # empty when there is no Minimize button, e.g. already minimized
    minimize[0].accDoDefaultAction()
MSAA accessible items are stored as (object: IAccessible, childId: int) pairs. Care is needed here to get the calls correct (e.g. get_accChildCount only uses the IAccessible, so when childId is not 0 you must return 0 instead of calling get_accChildCount)!
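One way to keep the (IAccessible, childId) rule in one place is a small wrapper like the sketch below. AccItem and child_count are names invented for this example, and it assumes the COM wrapper exposes get_accChildCount as an accChildCount attribute (as Python COM bindings typically do); FakeAccessible is just a stand-in so the snippet runs anywhere.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AccItem:
    """An MSAA item: the IAccessible object plus a child id (0 = the object itself)."""
    obj: Any
    child_id: int = 0


def child_count(item: AccItem) -> int:
    """Number of children of an MSAA item.

    A non-zero child id refers to an element that has no IAccessible of its
    own, so it cannot have children. Asking item.obj would give the count for
    the parent object instead, which is the mistake to avoid.
    """
    if item.child_id != 0:
        return 0
    return item.obj.accChildCount


class FakeAccessible:
    """Stand-in for a real IAccessible wrapper (assumption for the demo)."""
    accChildCount = 3


parent = AccItem(FakeAccessible())
simple_child = AccItem(parent.obj, child_id=2)
print(child_count(parent), child_count(simple_child))
```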
IAccessible calls can return different error codes to indicate "this object does not support this property" -- e.g. DISP_E_MEMBERNOTFOUND or E_NOTIMPL.
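A sketch of how you might fold those error codes into a single "not supported" result. ComError here is a hypothetical stand-in for whatever exception your COM binding raises (the sketch only assumes it carries the HRESULT in an hresult attribute), and get_property is an invented helper name; the two HRESULT values are the standard Windows ones.

```python
DISP_E_MEMBERNOTFOUND = 0x80020003
E_NOTIMPL = 0x80004001
NOT_SUPPORTED = {DISP_E_MEMBERNOTFOUND, E_NOTIMPL}


class ComError(Exception):
    """Hypothetical stand-in for a COM wrapper's exception type."""

    def __init__(self, hresult):
        super().__init__(hex(hresult & 0xFFFFFFFF))
        self.hresult = hresult


def get_property(getter, default=None):
    """Call getter(); map "property not supported" HRESULTs to default."""
    try:
        return getter()
    except ComError as exc:
        # Wrappers may report HRESULTs as signed 32-bit ints, so normalise
        # to the unsigned form before comparing.
        if (exc.hresult & 0xFFFFFFFF) in NOT_SUPPORTED:
            return default
        raise  # a genuine failure, not just a missing property
```

Usage would be along the lines of get_property(lambda: acc.accName(child_id), default=""), so that callers do not need to care which of the codes a particular control happens to return.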
Be aware of the state of the window. If the window is maximized then minimized, restore will restore the window to its maximized state, so you need to restore it again to get it back to the normal/windowed state.
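A minimal sketch of the "restore twice" logic using the Win32 calls IsIconic, IsZoomed and ShowWindow. restore_to_normal is an invented helper name, and the user32 parameter exists only so the logic can be exercised with a fake object on other platforms; on Windows it defaults to ctypes.windll.user32.

```python
SW_RESTORE = 9  # Win32 ShowWindow command


def restore_to_normal(hwnd, user32=None):
    """Bring a window back to the normal (windowed) state."""
    if user32 is None:
        import ctypes
        user32 = ctypes.windll.user32  # Windows only

    # First restore: minimized -> whatever state preceded minimizing,
    # which may be maximized.
    if user32.IsIconic(hwnd):
        user32.ShowWindow(hwnd, SW_RESTORE)

    # Second restore: maximized -> normal.
    if user32.IsZoomed(hwnd):
        user32.ShowWindow(hwnd, SW_RESTORE)
```

Checking the state before each call avoids restoring a window that is already in the normal state.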
The MSAA and UIA APIs don't support right mouse button clicks, so you need to use a Win32 API to trigger them.
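For example, a right click could be synthesized with SetCursorPos and mouse_event, roughly as sketched here. right_click is an invented helper name and the user32 parameter is only there so the call sequence can be checked without Windows. mouse_event is the older API (Microsoft recommends SendInput in its place), but it keeps the sketch short.

```python
MOUSEEVENTF_RIGHTDOWN = 0x0008
MOUSEEVENTF_RIGHTUP = 0x0010


def right_click(x, y, user32=None):
    """Move the cursor to screen coordinates (x, y) and right-click there."""
    if user32 is None:
        import ctypes
        user32 = ctypes.windll.user32  # Windows only

    user32.SetCursorPos(int(x), int(y))
    # Button events act at the current cursor position, so dx/dy stay 0.
    user32.mouse_event(MOUSEEVENTF_RIGHTDOWN, 0, 0, 0, 0)
    user32.mouse_event(MOUSEEVENTF_RIGHTUP, 0, 0, 0, 0)
```

The coordinates would typically come from the element's location as reported by MSAA (accLocation) or UIA (its bounding rectangle), e.g. the centre of that rectangle.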
The MSAA model does not support treeview hierarchy information -- it exposes the tree as a flat list. On the other hand, UIA will only enumerate elements that are visible, so you will not be able to access elements in the UIA tree that are collapsed.