Python API
Desktop
- class RPA.Desktop.Desktop(locators_path: Optional[str] = None)
Desktop is a cross-platform library for navigating and interacting with desktop environments. It can be used to automate applications through the same interfaces that are available to human users.
The library includes the following features:
Mouse and keyboard input emulation
Starting and stopping applications
Finding elements through image template matching
Scraping text from given regions
Taking screenshots
Clipboard management
Warning
Windows element selectors are not currently supported, and require the use of RPA.Desktop.Windows.
Installation
The basic features such as mouse and keyboard input and application control work with a default rpaframework install.
Advanced computer-vision features such as image template matching and OCR require an additional library called rpaframework-recognition. The dependency should be added separately by specifying it in your conda.yaml, for example as rpaframework-recognition==5.0.1. If installing recognition through pip instead of conda, the OCR feature also requires tesseract.
Locating elements
To automate actions on the desktop, a robot needs to interact with various graphical elements such as buttons or input fields. The locations of these elements can be found using a feature called locators.
A locator describes the properties or features of an element. This information can be later used to locate similar elements even when window positions or states change.
The currently supported locator types are:
Name      Arguments                                           Description
alias     name (str)                                          A custom named locator from the locator database, the default.
image     path (str)                                          Image of an element that is matched to current screen content.
point     x (int), y (int)                                    Pixel coordinates as absolute position.
offset    x (int), y (int)                                    Pixel coordinates relative to current mouse position.
size      width (int), height (int)                           Region of fixed size, around point or screen top-left.
region    left (int), top (int), right (int), bottom (int)    Bounding coordinates for a rectangular region.
ocr       text (str), confidence (float, optional)            Text to find from the current screen.
A locator is defined by its type and arguments, divided by a colon. Some example usages are shown below. Note that the prefix for alias can be omitted as it's the default type.

    Click         point:50,100
    Click         region:20,20,100,30

    Move mouse    image:%{ROBOT_ROOT}/logo.png
    Move mouse    offset:200,0
    Click

    Click         alias:SpareBin.Login
    Click         SpareBin.Login

    Click         ocr:"Create New Account"
You can also pass internal region objects as locators:

    ${region}=    Find Element    ocr:"Customer name"
    Click         ${region}
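The "type:arguments" convention described above can be illustrated with a small parser. This is a minimal sketch, not the library's own parsing code: the helper name split_locator is hypothetical, and it naively splits arguments on commas, so it would mishandle OCR text that itself contains a comma.

```python
from typing import List, Tuple

# Locator type names taken from the table of supported locator types.
KNOWN_TYPES = {"alias", "image", "point", "offset", "size", "region", "ocr"}


def split_locator(locator: str) -> Tuple[str, List[str]]:
    """Split 'type:arg1,arg2' into (type, [args]). Hypothetical helper."""
    prefix, sep, rest = locator.partition(":")
    if sep and prefix.strip().lower() in KNOWN_TYPES:
        typename, args = prefix.strip().lower(), rest
    else:
        # No recognized prefix: treat the whole string as an alias name,
        # since alias is the default locator type.
        typename, args = "alias", locator
    return typename, [part.strip().strip('"') for part in args.split(",")]


print(split_locator("point:50,100"))
print(split_locator("SpareBin.Login"))
print(split_locator('ocr:"Create New Account"'))
```

Only the first colon is treated as the separator, so a path such as image:C:/images/logo.png keeps its drive letter in the argument.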
Locator chaining
Often it is not enough to have one locator, but instead an element is defined through a relationship of various locators. For this use case the library supports a special syntax, which we will call locator chaining.
An example of chaining:

    # Read text from area on the right side of logo
    Read text    image:logo.png + offset:600,0 + size:400,200

The supported operators are:
Operator        Description
then, +         Base locator relative to the previous one
and, &&, &      Both locators should be found
or, ||, |       Either of the locators should be found
not, !          The locator should not be found
Further examples:

    # Click below either label
    Click    (image:name.png or image:email.png) then offset:0,300

    # Wait until dialog disappears
    Wait for element    not image:cookie.png

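When a chain is assembled in Python, it can help to build the string from its parts instead of concatenating by hand. The following is a minimal sketch; the chain helper is hypothetical and not part of RPA.Desktop. It only produces the locator string, which would then be passed to a keyword that accepts locators.

```python
def chain(*locators: str, operator: str = "+") -> str:
    """Join locator strings with a binary chaining operator (hypothetical helper)."""
    # Binary operators from the operator table; "not" is a prefix operator
    # and is therefore not handled here.
    allowed = {"then", "+", "and", "&&", "&", "or", "||", "|"}
    if operator not in allowed:
        raise ValueError(f"Unsupported operator: {operator}")
    return f" {operator} ".join(locators)


right_of_logo = chain("image:logo.png", "offset:600,0", "size:400,200")
print(right_of_logo)

either_label = chain("image:name.png", "image:email.png", operator="or")
print(either_label)
```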
Named locators
The library supports storing locators in a database, which contains all of the required fields and various bits of metadata. This enables having one source of truth, which can be updated if a website's or application's UI changes. Robot Framework scripts then only need to contain a reference to a stored locator by name.
The main way to create named locators is with VSCode.
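The idea of referring to a stored locator by name can be sketched with a plain dictionary. This is an illustration only: the LOCATORS mapping, its entries, the image path, and the resolve helper are all hypothetical, and the real locator database has its own storage format and fields.

```python
# Hypothetical in-memory stand-in for a locator database. The names and
# values are made up for illustration.
LOCATORS = {
    "SpareBin.Login": "image:images/login_button.png",
    "SpareBin.Username": "ocr:Username",
}


def resolve(reference: str) -> str:
    """Return the stored locator for a name, with or without the alias: prefix."""
    prefix = "alias:"
    name = reference[len(prefix):] if reference.startswith(prefix) else reference
    try:
        return LOCATORS[name]
    except KeyError:
        raise KeyError(f"Unknown named locator: {name}") from None


print(resolve("SpareBin.Login"))
print(resolve("alias:SpareBin.Username"))
```

If the UI changes, only the stored entry needs to be updated; scripts that refer to SpareBin.Login stay the same.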
Read more on identifying elements and crafting locators:
Keyboard and mouse
Keyboard keywords can emulate typing text, but also pressing various function keys. The name of a key is case-insensitive and spaces will be converted to underscores, i.e. the key names Page Down and page_down are equivalent.
The following function keys are supported:
Key             Description
shift           A generic Shift key. This is a modifier.
shift_l         The left Shift key. This is a modifier.
shift_r         The right Shift key. This is a modifier.
ctrl            A generic Ctrl key. This is a modifier.
ctrl_l          The left Ctrl key. This is a modifier.
ctrl_r          The right Ctrl key. This is a modifier.
alt             A generic Alt key. This is a modifier.
alt_l           The left Alt key. This is a modifier.
alt_r           The right Alt key. This is a modifier.
alt_gr          The AltGr key. This is a modifier.
cmd             A generic command button (Windows / Command / Super key). This may be a modifier.
cmd_l           The left command button (Windows / Command / Super key). This may be a modifier.
cmd_r           The right command button (Windows / Command / Super key). This may be a modifier.
up              An up arrow key.
down            A down arrow key.
left            A left arrow key.
right           A right arrow key.
enter           The Enter or Return key.
space           The Space key.
tab             The Tab key.
backspace       The Backspace key.
delete          The Delete key.
esc             The Esc key.
home            The Home key.
end             The End key.
page_down       The Page Down key.
page_up         The Page Up key.
caps_lock       The Caps Lock key.
f1 to f20       The function keys.
insert          The Insert key. This may be undefined for some platforms.
menu            The Menu key. This may be undefined for some platforms.
num_lock        The Num Lock key. This may be undefined for some platforms.
pause           The Pause / Break key. This may be undefined for some platforms.
print_screen    The Print Screen key. This may be undefined for some platforms.
scroll_lock     The Scroll Lock key. This may be undefined for some platforms.
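The naming rule described for keys (case-insensitive, with spaces converted to underscores) can be written out as a short function. This is a sketch of the rule as stated in the text, not the library's internal code, and normalize_key is a hypothetical name.

```python
def normalize_key(name: str) -> str:
    """Lowercase a key name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


print(normalize_key("Page Down"))
print(normalize_key("page_down"))
print(normalize_key("CTRL"))
```

Under this rule, "Page Down", "PAGE DOWN", and "page_down" all map to the same table entry.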
When controlling the mouse, several types of actions are available. Their names follow the same formatting rules as function keys. They are as follows:
Action          Description
click           Click with left mouse button
left_click      Click with left mouse button
double_click    Double click with left mouse button
triple_click    Triple click with left mouse button
right_click     Click with right mouse button
The supported mouse button types are left, right, and middle.
Examples
Both Robot Framework and Python examples follow.
The library must be imported first.

    *** Settings ***
    Library    RPA.Desktop

    from RPA.Desktop import Desktop

    desktop = Desktop()

The library can open applications and interact with them through keyboard and mouse events.

    *** Keywords ***
    Write entry in accounting
        [Arguments]    ${entry}
        Open application    erp_client.exe
        Click    image:%{ROBOT_ROOT}/images/create.png
        Type text    ${entry}
        Press keys    ctrl    s
        Press keys    enter

    import os

    # Assumes the ROBOT_ROOT environment variable is set, as in the
    # Robot Framework example above.
    ROBOT_ROOT = os.environ["ROBOT_ROOT"]

    def write_entry_in_accounting(entry):
        desktop.open_application("erp_client.exe")
        desktop.click(f"image:{ROBOT_ROOT}/images/create.png")
        desktop.type_text(entry)
        desktop.press_keys("ctrl", "s")
        desktop.press_keys("enter")

Targeting can currently be done using coordinates (absolute or relative), but template matching is preferred.

    *** Keywords ***
    Write to field
        [Arguments]    ${text}
        Move mouse    image:input_label.png
        Move mouse    offset:200,0
        Click
        Type text    ${text}
        Press keys    enter

    def write_to_field(text):
        desktop.move_mouse("image:input_label.png")
        desktop.move_mouse("offset:200,0")
        desktop.click()
        desktop.type_text(text)
        desktop.press_keys("enter")

Elements can be found by text too.

    *** Keywords ***
    Click New
        Click    ocr:New

    def click_new():
        desktop.click('ocr:"New"')

It is recommended to wait for the elements to be visible before trying any interaction. You can also pass region objects as locators.

    *** Keywords ***
    Click New
        ${region}=    Wait for element    ocr:New
        Click    ${region}

    def click_new():
        region = desktop.wait_for_element("ocr:New")
        desktop.click(region)

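The reason for waiting can be illustrated with a generic polling loop: the lookup is retried until it succeeds or a deadline passes. This is a conceptual sketch only and not how the library implements its wait keyword; the wait_for helper, its parameters, and the fake finder are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(find: Callable[[], Optional[T]],
             timeout: float = 10.0,
             interval: float = 0.5) -> T:
    """Call find() repeatedly until it returns a non-None result or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Nothing found within {timeout} seconds")
        time.sleep(interval)


# Fake finder that fails twice before "finding" something, standing in
# for an element that appears on screen after a short delay.
attempts = iter([None, None, "region"])
print(wait_for(lambda: next(attempts), timeout=2.0, interval=0.01))
```

Interacting only after the lookup succeeds avoids clicking on a screen that has not finished rendering.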
Another way to find elements is by offsetting from an anchor:

    *** Keywords ***
    Type Notes
        [Arguments]    ${text}
        Click With Offset    ocr:Notes    500    0
        Type Text    ${text}

    def type_notes(text):
        desktop.click_with_offset("ocr:Notes", 500, 0)
        desktop.type_text(text)

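The arithmetic behind offsetting from an anchor can be sketched as follows. The sketch assumes the offset is applied to the center of the anchor's bounding box, which may differ from the library's actual behavior; the Box class, the offset_target function, and the coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box:
    """Hypothetical bounding box in screen pixels."""
    left: int
    top: int
    right: int
    bottom: int

    def center(self) -> Tuple[int, int]:
        return ((self.left + self.right) // 2, (self.top + self.bottom) // 2)


def offset_target(anchor: Box, x: int, y: int) -> Tuple[int, int]:
    """Return the point reached by moving (x, y) pixels from the anchor's center."""
    cx, cy = anchor.center()
    return (cx + x, cy + y)


# Made-up position of a "Notes" label; the input field is assumed to sit
# 500 pixels to its right.
notes_label = Box(left=100, top=200, right=160, bottom=220)
print(offset_target(notes_label, 500, 0))  # (630, 210)
```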
- ROBOT_LIBRARY_DOC_FORMAT = 'REST'
- ROBOT_LIBRARY_SCOPE = 'GLOBAL'
- add_library_components(library_components: List, translation: Optional[dict] = None, translated_kw_names: Optional[list] = None)
- get_keyword_arguments(name)
- get_keyword_documentation(name)
- get_keyword_names()
- get_keyword_source(keyword_name)
- get_keyword_tags(name)
- get_keyword_types(name)
- run_keyword(name, args, kwargs=None)