Python API

Base64AI

class RPA.DocumentAI.Base64AI.Base64AI

Bases: object

Library for the Base64.ai intelligent document processing (IDP) service.

Added with rpaframework version 19.0.0.

The service identifies fields in documents, which can be submitted in several file formats or by URL.

Robot Framework example usage

*** Settings ***
Library   RPA.DocumentAI.Base64AI
Library   RPA.Robocorp.Vault

*** Tasks ***
Identify document
    ${secrets}=   Get Secret  base64ai-auth
    Set Authorization  ${secrets}[email-address]   ${secrets}[apikey]
    ${results}=  Scan Document File
    ...   ${CURDIR}${/}invoice.pdf
    ...   model_types=finance/check/usa,finance/invoice/usa
    # Scan response contains list of detected models in the document
    FOR  ${result}  IN  @{results}
        Log To Console  Model: ${result}[model]
        Log To Console  Field keys: ${{','.join($result['fields'].keys())}}
        Log To Console  Fields: ${result}[fields]
        Log To Console  Text (OCR): ${result}[ocr]
    END

Python example usage

from RPA.DocumentAI.Base64AI import Base64AI
from RPA.Robocorp.Vault import Vault

secrets = Vault().get_secret("base64ai-auth")
baselib = Base64AI()
baselib.set_authorization(secrets["email-address"], secrets["apikey"])
result = baselib.scan_document_file(
    "invoice.pdf",
    model_types="finance/invoice,finance/check/usa",
)
for r in result:
    print(f"Model: {r['model']}")
    for key, props in r["fields"].items():
        print(f"FIELD {key}: {props['value']}")
    print(f"Text (OCR): {r['ocr']}")

BASE_URL = 'https://base64.ai'

ROBOT_LIBRARY_DOC_FORMAT = 'REST'

ROBOT_LIBRARY_SCOPE = 'GLOBAL'

get_fields_from_prediction_result(prediction: Optional[Union[Dict[Hashable, Optional[Union[str, int, float, bool, list, dict]]], List[Optional[Union[str, int, float, bool, list, dict]]], str, int, float, bool, list, dict]]) -> List

Helper keyword that returns the fields found in a prediction result. For examples, see the Scan Document File and Scan Document URL keywords.

Parameters

prediction – prediction result dictionary

Returns

list of found fields
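As a rough illustration of what extracting fields involves, the sketch below walks a scan result shaped like the examples in this document (a list of entries, each with a `fields` mapping whose items carry a `value`) and collects the values by key. The name `collect_field_values` and the sample data are hypothetical, and this is not the library's implementation of `get_fields_from_prediction_result`.

```python
from typing import Any, Dict, List


def collect_field_values(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten the 'fields' of every result entry into one key -> value dict."""
    values: Dict[str, Any] = {}
    for entry in results:
        # Entries without a "fields" mapping are skipped rather than treated as errors.
        for key, props in entry.get("fields", {}).items():
            values[key] = props.get("value")
    return values


# Made-up sample data (assumption); real responses carry more keys per entry.
sample_results = [
    {"model": "finance/invoice", "fields": {"total": {"value": "120.00"}}, "ocr": "Invoice"},
    {"model": "finance/check/usa", "ocr": "Check"},
]
flat = collect_field_values(sample_results)
print(flat)
```

If two detected models report the same field key, the later entry overwrites the earlier one in this sketch; keeping a per-model dictionary would avoid that.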

get_user_data() -> Dict

Get user data, including details on credits used and credits remaining for the Base64 service.

The returned user data contains the following keys:

  • givenName

  • familyName

  • email

  • hasWorkEmail

  • companyName

  • numberOfCredits

  • numberOfPages

  • numberOfUploads

  • numberOfCreditsSpentOnDocuments (visible if used)

  • numberOfCreditsSpentOnFaceDetection (visible if used)

  • numberOfCreditsSpentOnFaceRecognition (visible if used)

  • hasActiveAwsContract

  • subscriptionType

  • subscriptionPeriod

  • tags

  • ccEmails

  • status

  • remainingCredits (calculated by the keyword)

Returns

object containing details on the API user

Robot Framework example:

${userdata}=   Get User Data
Log To Console  I still have ${userdata}[remainingCredits] credits left

Python example:

userdata = baselib.get_user_data()
print(f"I still have {userdata['remainingCredits']} credits left")
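
The key list above notes that `remainingCredits` is calculated by the keyword, but the formula is not given here. A minimal sketch, assuming it is `numberOfCredits` minus whichever `numberOfCreditsSpentOn...` values are present, might look like the following. The function name `estimate_remaining_credits` and the sample numbers are hypothetical.

```python
from typing import Any, Dict

SPENT_KEYS = (
    "numberOfCreditsSpentOnDocuments",
    "numberOfCreditsSpentOnFaceDetection",
    "numberOfCreditsSpentOnFaceRecognition",
)


def estimate_remaining_credits(userdata: Dict[str, Any]) -> int:
    """Assumed formula: total credits minus all reported spending."""
    # The "spent" keys are only visible if used, so missing keys count as 0.
    spent = sum(userdata.get(key, 0) for key in SPENT_KEYS)
    return userdata["numberOfCredits"] - spent


# Made-up sample data (assumption), not a real API response.
sample_userdata = {"numberOfCredits": 100, "numberOfCreditsSpentOnDocuments": 37}
remaining = estimate_remaining_credits(sample_userdata)
print(remaining)
```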

scan_document_file(file_path: str, model_types: Optional[Union[str, List[str]]] = None, mock: bool = False) -> Optional[Union[Dict[Hashable, Optional[Union[str, int, float, bool, list, dict]]], List[Optional[Union[str, int, float, bool, list, dict]]], str, int, float, bool, list, dict]]

Scan a document file. Pass model_types to target specific models.

Parameters

  • file_path – path to the file

  • model_types – single model type or list of model types

  • mock – set to True to use the /mock/scan endpoint instead of /scan

Returns

result of the document scan

Robot Framework example:

${results}=    Scan Document File
...    ${CURDIR}${/}files${/}IMG_8277.jpeg
...    model_types=finance/check/usa,finance/invoice
FOR    ${result}    IN    @{results}
    Log To Console    Model: ${result}[model]
    Log To Console    Fields: ${result}[fields]
    Log To Console    Text (OCR): ${result}[ocr]
END

Python example:

result = baselib.scan_document_file(
    "./files/Invoice-1120.pdf",
    model_types="finance/invoice,finance/check/usa",
)
for r in result:
    print(f"Model: {r['model']}")
    for key, val in r["fields"].items():
        print(f"{key}: {val['value']}")
    print(f"Text (OCR): {r['ocr']}")
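
Because `model_types` accepts either a single string or a list, and the examples pass a comma-separated string, a caller may want to normalize the value before building it dynamically. The sketch below is one way to do that; it assumes a comma-separated string denotes several model types, and `normalize_model_types` is a hypothetical helper, not part of the library.

```python
from typing import List, Optional, Union


def normalize_model_types(model_types: Optional[Union[str, List[str]]]) -> List[str]:
    """Return model types as a clean list, whatever form they were given in."""
    if model_types is None:
        return []
    if isinstance(model_types, str):
        # Assumption: a comma separates multiple model types in one string.
        model_types = model_types.split(",")
    # Drop surrounding whitespace and empty items such as a trailing comma.
    return [m.strip() for m in model_types if m.strip()]


as_string = normalize_model_types("finance/invoice, finance/check/usa")
as_list = normalize_model_types(["finance/invoice", "finance/check/usa"])
print(as_string == as_list)
```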

scan_document_url(url: str, model_types: Optional[Union[str, List[str]]] = None, mock: bool = False) -> Optional[Union[Dict[Hashable, Optional[Union[str, int, float, bool, list, dict]]], List[Optional[Union[str, int, float, bool, list, dict]]], str, int, float, bool, list, dict]]

Scan a document at a URL. Pass model_types to target specific models.

Parameters

  • url – valid URL of a file

  • model_types – single model type or list of model types

  • mock – set to True to use the /mock/scan endpoint instead of /scan

Returns

result of the document scan

Robot Framework example:

${results}=    Scan Document URL
...    https://base64.ai/static/content/features/data-extraction/models//2.png
FOR    ${result}    IN    @{results}
    Log To Console    Model: ${result}[model]
    Log To Console    Fields: ${result}[fields]
    Log To Console    Text (OCR): ${result}[ocr]
END

Python example:

result = baselib.scan_document_url(
    "https://base64.ai/static/content/features/data-extraction/models//2.png"
)
for r in result:
    print(f"Model: {r['model']}")
    for key, props in r["fields"].items():
        print(f"FIELD {key}: {props['value']}")
    print(f"Text (OCR): {r['ocr']}")
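
Since the keyword expects a valid URL and `mock` switches between two endpoints, a small pre-check can help when the input comes from user data. The following sketch uses only the standard library; `looks_like_http_url` and `scan_path` are hypothetical helpers, and the full endpoint URL is not shown in this document, so only the path suffix named above is returned.

```python
from urllib.parse import urlparse


def looks_like_http_url(url: str) -> bool:
    """Cheap syntactic check; it does not verify that the file exists."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def scan_path(mock: bool = False) -> str:
    """Path suffix selected by the mock flag, as described in the parameters."""
    return "/mock/scan" if mock else "/scan"


print(looks_like_http_url("https://base64.ai/static/example.png"))
print(looks_like_http_url("invoice.pdf"))
print(scan_path(mock=True))
```

A local path such as `invoice.pdf` fails the check, which is a hint to call `scan_document_file` instead.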

set_authorization(api_email: str, api_key: str) -> None

Set the Base64.ai request headers using the email address and key associated with the API account.

Parameters

  • api_email – email address associated with the API account

  • api_key – API key associated with the API account

Robot Framework example:

${secrets}=   Get Secret  base64ai-auth
Set Authorization    ${secrets}[email-address]    ${secrets}[apikey]

Python example:

secrets = Vault().get_secret("base64ai-auth")
baselib = Base64AI()
baselib.set_authorization(secrets["email-address"], secrets["apikey"])
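
The examples above read the keys `email-address` and `apikey` from a secret. A minimal sketch of validating those keys before calling `set_authorization` is shown below, so a misconfigured secret fails with a clear message. `extract_credentials` is a hypothetical helper, and a plain dict with dummy values stands in for the Vault secret.

```python
from typing import Mapping, Tuple

REQUIRED_KEYS = ("email-address", "apikey")


def extract_credentials(secret: Mapping[str, str]) -> Tuple[str, str]:
    """Return (email, api_key), or raise if either value is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not secret.get(key)]
    if missing:
        raise KeyError(f"Secret is missing: {', '.join(missing)}")
    return secret["email-address"], secret["apikey"]


# Dummy values for illustration only.
email, api_key = extract_credentials(
    {"email-address": "user@example.com", "apikey": "dummy-key"}
)
print(email)
```

The returned pair can then be passed on, for example `baselib.set_authorization(email, api_key)`.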