Python API
DocumentAI
- class RPA.DocumentAI.DocumentAI.DocumentAI
Wrapper library offering generic keywords for initializing, scanning and retrieving results as fields from documents (PDF, PNG etc.).
The library requires at least rpaframework version 19.0.0.
This is a helper facade for the following libraries:
RPA.Cloud.Google (requires rpaframework-google)
RPA.DocumentAI.Base64AI
RPA.DocumentAI.Nanonets
Where the following steps are required:
Engine initialization:
Init Engine
Document scan:
Predict
Result retrieval:
Get Result
So no matter the engine you’re using, the very same keywords can be used, as only the passed parameters will differ (please check the docs on each library for particularities). Once initialized, you can jump between the engines with Switch Engine. Before scanning documents, you must first configure the service with a model to scan the files with and an API key for authorizing access.
See the Portal example: https://robocorp.com/portal/robot/robocorp/example-document-ai
Example: Robot Framework
*** Settings ***
Library    RPA.DocumentAI
Library    Collections    # provides Log List

*** Tasks ***
Scan Documents
    Init Engine    base64ai    vault=document_ai:base64ai
    Init Engine    nanonets    vault=document_ai:nanonets

    Switch Engine    base64ai
    Predict    invoice.png
    ${data} =    Get Result
    Log List    ${data}

    Switch Engine    nanonets
    Predict    invoice.png    model=858e4b37-6679-4552-9481-d5497dfc0b4a
    ${data} =    Get Result
    Log List    ${data}
Example: Python
from RPA.DocumentAI import DocumentAI, EngineName

lib_docai = DocumentAI()
lib_docai.init_engine(
    EngineName.GOOGLE, vault="document_ai:serviceaccount", region="eu"
)
lib_docai.predict(
    "invoice.pdf", model="df1d166771005ff4",
    project_id="complete-agency-347912", region="eu"
)
print(lib_docai.get_result())
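The initialize, predict, retrieve flow and the engine switching described above can be pictured with a toy facade. The sketch below is not the library's implementation: MiniFacade and EchoEngine are hypothetical stand-ins that make no network calls and only mirror the keyword names documented here. That initializing an engine also makes it the active one is an assumption.

```python
from typing import Any, Dict, Optional


class EchoEngine:
    """Hypothetical stand-in for a wrapped engine library; makes no API calls."""

    def __init__(self, name: str) -> None:
        self.name = name

    def scan(self, location: str) -> Dict[str, str]:
        # A real engine would send the file to its service here.
        return {"engine": self.name, "file": location}


class MiniFacade:
    """Toy facade: one set of methods, several interchangeable engines."""

    def __init__(self) -> None:
        self._engines: Dict[str, EchoEngine] = {}
        self._active: Optional[str] = None
        self._result: Any = None

    def init_engine(self, name: str) -> None:
        # Assumption: initializing an engine also makes it the active one.
        self._engines[name] = EchoEngine(name)
        self._active = name

    def switch_engine(self, name: str) -> None:
        if name not in self._engines:
            raise RuntimeError(f"engine {name!r} was not initialized")
        self._active = name

    def predict(self, location: str) -> None:
        if self._active is None:
            raise RuntimeError("initialize an engine before predicting")
        # Store the result internally for later retrieval.
        self._result = self._engines[self._active].scan(location)

    def get_result(self) -> Any:
        return self._result


facade = MiniFacade()
facade.init_engine("base64ai")
facade.init_engine("nanonets")
facade.switch_engine("base64ai")
facade.predict("invoice.png")
first = facade.get_result()
```

The calling code stays the same for every engine; only the name passed to switch_engine changes.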
- ROBOT_AUTO_KEYWORDS = False
- ROBOT_LIBRARY_DOC_FORMAT = 'REST'
- ROBOT_LIBRARY_SCOPE = 'GLOBAL'
- property engine: Any
- get_result(extended: bool = False) → Dict[Hashable, str | int | float | bool | list | dict | None] | List[str | int | float | bool | list | dict | None] | str | int | float | bool | list | dict | None | Document
Retrieve the result data previously obtained with Predict. The stored raw result is usually pre-processed with a library-specific keyword before it is returned.
- Parameters:
extended – Get all the details inside the result data. (main fields only by default)
- Returns:
Usually a list of fields detected in the document.
Example: Robot Framework
*** Tasks ***
Scan With Base64
    Document AI Base64
    ${data} =    Get Result
    Log List    ${data}
Example: Python
result = lib_docai.get_result()
for field in result:
    print(field)
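Because the return annotation allows a mapping, a list, or a scalar, calling code that always wants to iterate over fields may normalize the value first. A minimal sketch that assumes nothing about the engine-specific field layout; as_field_list is a hypothetical helper, not part of the library.

```python
from typing import Any, List


def as_field_list(result: Any) -> List[Any]:
    """Return the result as a list so callers can always iterate over it."""
    if result is None:
        return []
    if isinstance(result, list):
        return result
    if isinstance(result, dict):
        # Turn a mapping into (name, value) pairs, one per field.
        return list(result.items())
    # Scalars and engine-specific objects are wrapped unchanged.
    return [result]
```

With the library in place, this could be used as as_field_list(lib_docai.get_result()).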
- init_engine(name: EngineName | str, secret: str | Path | Tuple | List | Dict | None = None, vault: Dict | str | None = None, **kwargs) → None
Initialize the engine you want to scan documents with.
This is required before being able to run Predict. Once initialized, you don’t need to run this again; simply use Switch Engine to jump between the engines. The final secret value (passed directly with secret or picked up automatically from the Vault with vault) is either split into authorization args and kwargs or passed as-is to the wrapped library. Keep in mind that some engines expect API keys, whereas others expect tokens or private keys. Any optional keyword argument is passed on to the wrapped library.
- Parameters:
name – Name of the engine.
secret – Authenticate with a string/file/object secret directly.
vault – Specify the Vault storage name and secret key in order to authenticate. (‘name:key’ or {name: key} formats are supported)
How secret resolution works
When vault is passed in, the corresponding Vault is retrieved and the value of the specified field is returned as the secret. If secret is passed instead, its value is returned as-is, unless it is a path pointing to a file, in which case the value held in that file is returned. In the absence of both secret and vault, environment variables are relied upon.
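The resolution order described above can be sketched as follows. This is an illustration, not the library's code: parse_vault_spec and resolve_secret are hypothetical names, the vaults dictionary is a stand-in for the real Vault storage, and giving vault precedence over secret when both are passed is an assumption.

```python
from pathlib import Path
from typing import Any, Dict, Optional, Tuple, Union

VaultSpec = Union[str, Dict[str, str]]


def parse_vault_spec(vault: VaultSpec) -> Tuple[str, str]:
    """Accept 'name:key' or {name: key} and return (name, key)."""
    if isinstance(vault, dict):
        if len(vault) != 1:
            raise ValueError("expected exactly one {name: key} pair")
        return next(iter(vault.items()))
    name, sep, key = vault.partition(":")
    if not sep or not name or not key:
        raise ValueError("expected the 'name:key' format")
    return name, key


def _is_existing_file(value: Union[str, Path]) -> bool:
    # Long or odd strings (e.g. a JSON token) may not be valid paths at all.
    try:
        return Path(value).is_file()
    except (OSError, ValueError):
        return False


def resolve_secret(
    secret: Any = None,
    vault: Optional[VaultSpec] = None,
    vaults: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Any:
    """Resolve a secret: Vault first, then a direct value or file, else None."""
    if vault is not None:
        name, key = parse_vault_spec(vault)
        # `vaults` is a plain dict standing in for the real Vault storage.
        return (vaults or {})[name][key]
    if secret is not None:
        # A string or Path naming an existing file yields the file content.
        if isinstance(secret, (str, Path)) and _is_existing_file(secret):
            return Path(secret).read_text().strip()
        return secret
    # Neither given: leave it to the wrapped library's environment variables.
    return None
```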
Expected secret value formats:
google: <json-service/token> (RPA.Cloud.Google.Init Document AI)
base64ai: <e-mail>,<api-key> (RPA.DocumentAI.Base64AI.Set Authorization)
nanonets: <api-key> (RPA.DocumentAI.Nanonets.Set Authorization)
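The formats above suggest how a resolved secret might be turned into authorization arguments. A minimal sketch, assuming the comma-separated layout shown for base64ai; split_secret is a hypothetical helper and the actual splitting logic in the library may differ.

```python
from typing import Any, Tuple


def split_secret(engine: str, secret: Any) -> Tuple[Any, ...]:
    """Split a secret into positional authorization arguments per engine."""
    if engine == "base64ai" and isinstance(secret, str):
        # base64ai: "<e-mail>,<api-key>" -> (e-mail, api-key)
        email, sep, api_key = secret.partition(",")
        if not sep or not email.strip() or not api_key.strip():
            raise ValueError("base64ai expects '<e-mail>,<api-key>'")
        return email.strip(), api_key.strip()
    # google and nanonets: a single value, passed through unchanged.
    return (secret,)
```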
Example: Robot Framework
*** Keywords ***
Init Base64
    Init Engine    base64ai    vault=document_ai:base64ai
Example: Python
from RPA.DocumentAI import DocumentAI
from RPA.Robocorp.Vault import Vault

lib_docai = DocumentAI()
mail_apikey = Vault().get_secret("document_ai")["base64ai"]
lib_docai.init_engine("base64ai", secret=mail_apikey)
- predict(location: Path | str, model: str | List[str] | None = None, **kwargs) → None
Scan a document with the currently active engine and store the result internally for later retrieval.
Depending on the selected engine, this wraps a chain of libraries that ends with a call to a service API, where the passed file is analyzed. Any optional keyword argument is passed on to the wrapped library. (some engines require mandatory parameters like project ID or region)
- Parameters:
location – Path to a local file or URL address of a remote one. (not all engines work with URLs)
model – Model name(s) to scan with. (some engines guess the model if not specified)
Example: Robot Framework
*** Tasks *** Document AI Base64 [Setup] Init Base64 Predict https://site.com/path/to/invoice.png
Example: Python
lib_docai.predict("local/path/to/invoice.png", model="finance/invoice")
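Since location may be a local path or a URL, and not every engine accepts URLs, a caller may want to classify the value before calling predict. A sketch using only the standard library; is_url is a hypothetical helper, and treating only http and https as URLs is an assumption.

```python
from pathlib import Path
from typing import Union
from urllib.parse import urlparse


def is_url(location: Union[str, Path]) -> bool:
    """Return True when the location looks like an http(s) URL."""
    parsed = urlparse(str(location))
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

A caller could then pass URL inputs only to engines known to accept them, or download the file first.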
- property result: Dict[Hashable, str | int | float | bool | list | dict | None] | List[str | int | float | bool | list | dict | None] | str | int | float | bool | list | dict | None | Document
- switch_engine(name: EngineName | str) → None
Switch between already initialized engines.
Use this to jump between engines when scanning with multiple of them.
- Parameters:
name – Name of the engine to be set as active. (choose between: google, base64ai, nanonets)
Example: Robot Framework
*** Tasks ***
Document AI All
    @{engines} =    Create List    base64ai    nanonets
    FOR    ${engine}    IN    @{engines}
        Switch Engine    ${engine}
        Log    Scanning with engine: ${engine}...
        Predict    invoice.png
        ${data} =    Get Result
        Log List    ${data}
    END
Example: Python
lib_docai.switch_engine("base64ai")
lib_docai.predict("invoice.png")
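The Robot Framework loop over several engines has a straightforward Python counterpart. The sketch below relies only on the switch_engine, predict and get_result methods documented here; scan_with_engines is a hypothetical helper, and FakeDocumentAI is a stand-in that lets the example run without credentials. With the real library, an already initialized DocumentAI instance would be passed instead.

```python
from typing import Any, Dict, Iterable


def scan_with_engines(lib: Any, engines: Iterable[str], location: str) -> Dict[str, Any]:
    """Scan one document with each already initialized engine."""
    results: Dict[str, Any] = {}
    for name in engines:
        lib.switch_engine(name)
        lib.predict(location)
        results[name] = lib.get_result()
    return results


class FakeDocumentAI:
    """Stand-in so the sketch runs without credentials; not the real library."""

    def __init__(self) -> None:
        self._active = ""
        self._result: Any = None

    def switch_engine(self, name: str) -> None:
        self._active = name

    def predict(self, location: str) -> None:
        # Record which engine "scanned" which file instead of calling a service.
        self._result = [f"{self._active}:{location}"]

    def get_result(self) -> Any:
        return self._result


results = scan_with_engines(FakeDocumentAI(), ["base64ai", "nanonets"], "invoice.png")
```

Engines that need extra parameters, such as a model, would require those to be passed to predict as well.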