Python API
Nanonets
- class RPA.DocumentAI.Nanonets.Nanonets
Library supporting the Nanonets service for intelligent document processing (IDP).
The library requires rpaframework version 19.0.0 or later.
The service identifies fields in documents, which can be submitted in several file formats or via URL.
Robot Framework example usage
    *** Settings ***
    Library    RPA.DocumentAI.Nanonets
    Library    RPA.Robocorp.Vault

    *** Tasks ***
    Identify document
        ${secrets}=    Get Secret    nanonets-auth
        Set Authorization    ${secrets}[apikey]
        ${result}=    Predict File
        ...    ${CURDIR}${/}files${/}eckero.jpg
        ...    ${secrets}[receipts-model-id]
        ${fields}=    Get Fields From Prediction Result    ${result}
        FOR    ${field}    IN    @{fields}
            Log To Console    Label:${field}[label] Text:${field}[ocr_text]
        END
        ${tables}=    Get Tables From Prediction Result    ${result}
        FOR    ${table}    IN    @{tables}
            FOR    ${rows}    IN    ${table}[rows]
                FOR    ${row}    IN    @{rows}
                    ${cells}=    Evaluate    [cell['text'] for cell in $row]
                    Log To Console    ROW:${{" | ".join($cells)}}
                END
            END
        END
Python example usage
    from RPA.DocumentAI.Nanonets import Nanonets
    from RPA.Robocorp.Vault import Vault
    from RPA.Tables import Tables

    # Path to the document to process (same file as in the Robot Framework example)
    file_to_scan = "./files/eckero.jpg"

    secrets = Vault().get_secret("nanonets-auth")
    nanolib = Nanonets()
    nanolib.set_authorization(secrets["apikey"])
    result = nanolib.predict_file(file_to_scan, secrets["receipts-model-id"])
    fields = nanolib.get_fields_from_prediction_result(result)
    for field in fields:
        print(f"Label: {field['label']} Text: {field['ocr_text']}")
    tables = nanolib.get_tables_from_prediction_result(result)
    for table in tables:
        # Optionally build an RPA.Tables compatible table from the rows
        rpatable = Tables().create_table(table["rows"])
        for row in table["rows"]:
            cells = [cell["text"] for cell in row]
            print(f"ROW: {' | '.join(cells)}")
- ROBOT_LIBRARY_DOC_FORMAT = 'REST'
- ROBOT_LIBRARY_SCOPE = 'GLOBAL'
- get_all_models() → Dict
Get all available models related to the API key.
- Returns:
object containing available models
Robot Framework example:
    ${models}=    Get All Models
    FOR    ${model}    IN    @{models}
        Log To Console    Model ID: ${model}[model_id]
        Log To Console    Model Type: ${model}[model_type]
    END
Python example:
    models = nanolib.get_all_models()
    for model in models:
        print(f"model id: {model['model_id']}")
        print(f"model type: {model['model_type']}")
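When an API key has access to many models, it can help to group them by type before choosing a model id. The following is a minimal sketch, assuming each model entry carries the `model_id` and `model_type` keys used in the examples above. `group_models_by_type` is a hypothetical helper, not part of the library, and the sample data is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_models_by_type(models: Iterable[dict]) -> Dict[str, List[str]]:
    """Group model ids by model type (hypothetical helper, not a library keyword)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for model in models:
        # Keys assumed from the examples above: 'model_id' and 'model_type'.
        grouped[model["model_type"]].append(model["model_id"])
    return dict(grouped)


# Made-up sample entries in the assumed shape; with the real library the
# entries would come from nanolib.get_all_models().
sample_models = [
    {"model_id": "id-1", "model_type": "ocr"},
    {"model_id": "id-2", "model_type": "classification"},
    {"model_id": "id-3", "model_type": "ocr"},
]
print(group_models_by_type(sample_models))
```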
- get_fields_from_prediction_result(prediction: Dict[Hashable, str | int | float | bool | list | dict | None] | List[str | int | float | bool | list | dict | None] | str | int | float | bool | list | dict | None) → List
Helper keyword to get found fields from a prediction result.
For an example, see the Predict File keyword.
- Parameters:
prediction – prediction result dictionary
- Returns:
list of found fields
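The returned list can be awkward to query when a specific label is needed. As a sketch only, assuming each field is a dictionary with the `label` and `ocr_text` keys shown in the Predict File examples, the fields can be collected per label. `fields_by_label` is a hypothetical helper, and the labels and values below are invented sample data.

```python
from typing import Dict, Iterable, List


def fields_by_label(fields: Iterable[dict]) -> Dict[str, List[str]]:
    """Collect OCR texts per label; the same label may occur more than once."""
    collected: Dict[str, List[str]] = {}
    for field in fields:
        # Keys assumed from the examples in this document.
        collected.setdefault(field["label"], []).append(field["ocr_text"])
    return collected


# Invented sample data; with the real library this list would come from
# nanolib.get_fields_from_prediction_result(result).
sample_fields = [
    {"label": "Merchant_Name", "ocr_text": "Eckero"},
    {"label": "Total_Amount", "ocr_text": "12.50"},
    {"label": "Total_Amount", "ocr_text": "12,50"},
]
print(fields_by_label(sample_fields))
```

Values are kept in a list per label so that repeated labels are not silently overwritten.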
- get_tables_from_prediction_result(prediction: Dict[Hashable, str | int | float | bool | list | dict | None] | List[str | int | float | bool | list | dict | None] | str | int | float | bool | list | dict | None) → List
Helper keyword to get found tables from a prediction result.
For another example, see the Predict File keyword.
- Parameters:
prediction – prediction result dictionary
- Returns:
list of found tables
Robot Framework example:
    # It is possible to create ``RPA.Tables`` compatible tables from the result
    ${tables}=    Get Tables From Prediction Result    ${result}
    FOR    ${table}    IN    @{tables}
        ${rpatable}=    Create Table    ${table}[rows]
        FOR    ${row}    IN    @{rpatable}
            Log To Console    ${row}
        END
    END
Python example:
    # It is possible to create ``RPA.Tables`` compatible tables from the result
    from RPA.Tables import Tables

    tables = nanolib.get_tables_from_prediction_result(result)
    for table in tables:
        rpatable = Tables().create_table(table["rows"])
        for row in rpatable:
            print(row)
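If only the cell texts are needed, the rows can also be flattened with plain Python. This is a rough sketch, assuming each table is a dictionary whose `rows` entry is a list of rows and each row is a list of cell dictionaries with a `text` key, as the examples in this document suggest. `table_to_text_rows` and `format_rows` are hypothetical helpers, and the sample table is invented.

```python
from typing import Iterable, List


def table_to_text_rows(table: dict) -> List[List[str]]:
    """Reduce every cell of a table to its text (assumed 'rows' / 'text' keys)."""
    return [[cell["text"] for cell in row] for row in table["rows"]]


def format_rows(text_rows: Iterable[List[str]], separator: str = " | ") -> List[str]:
    """Join the cell texts of each row into one printable line."""
    return [separator.join(cells) for cells in text_rows]


# Invented sample table in the assumed shape; with the real library the
# tables would come from nanolib.get_tables_from_prediction_result(result).
sample_table = {
    "rows": [
        [{"text": "Item"}, {"text": "Price"}],
        [{"text": "Coffee"}, {"text": "2.50"}],
    ]
}
for line in format_rows(table_to_text_rows(sample_table)):
    print(f"ROW: {line}")
```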
- ocr_fulltext(filename: str, filepath: str) → List
Run full-text OCR on a given file. Returns the recognized words and the full text.
The filename and the filepath need to be given separately.
- Parameters:
filename – name of the file
filepath – path of the file
- Returns:
the result in a list format
Robot Framework example:
    ${results}=    OCR Fulltext
    ...    invoice.pdf
    ...    ${CURDIR}${/}invoice.pdf
    FOR    ${result}    IN    @{results}
        Log To Console    Filename: ${result}[filename]
        FOR    ${pagenum}    ${page}    IN ENUMERATE    @{result}[page_data]    start=1
            Log To Console    Page ${pagenum} raw Text: ${page}[raw_text]
        END
    END
Python example:
    results = nanolib.ocr_fulltext("IMG_8277.jpeg", "./IMG_8277.jpeg")
    for result in results:
        print(f"FILENAME: {result['filename']}")
        for page in result["page_data"]:
            print(f"Page {page['page']+1}: {page['raw_text']}")
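To get one text blob per file rather than per page, the page texts can be joined in page order. The sketch below assumes the result shape used in the Python example above: each result has `filename` and `page_data`, and each page has a zero-based `page` number and `raw_text`. `full_text_per_file` is a hypothetical helper and the sample data is invented.

```python
from typing import Dict, Iterable


def full_text_per_file(results: Iterable[dict]) -> Dict[str, str]:
    """Map each filename to the text of all its pages, joined in page order."""
    texts: Dict[str, str] = {}
    for result in results:
        # Sort explicitly, since page order in the response is not guaranteed here.
        pages = sorted(result["page_data"], key=lambda page: page["page"])
        texts[result["filename"]] = "\n".join(page["raw_text"] for page in pages)
    return texts


# Invented sample data in the assumed shape; with the real library this
# would come from nanolib.ocr_fulltext(filename, filepath).
sample_results = [
    {
        "filename": "invoice.pdf",
        "page_data": [
            {"page": 1, "raw_text": "second page"},
            {"page": 0, "raw_text": "first page"},
        ],
    }
]
print(full_text_per_file(sample_results))
```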
- predict_file(filepath: str, model_id: str) → Dict[Hashable, str | int | float | bool | list | dict | None] | List[str | int | float | bool | list | dict | None] | str | int | float | bool | list | dict | None
Get prediction result for a file by a given model id.
- Parameters:
filepath – path to the file
model_id – ID of the Nanonets model used to process the file
- Returns:
the prediction result as JSON-compatible data
Robot Framework example:
    ${result}=    Predict File    ./document.pdf    ${MODEL_ID}
    ${fields}=    Get Fields From Prediction Result    ${result}
    FOR    ${field}    IN    @{fields}
        Log To Console    Label:${field}[label] Text:${field}[ocr_text]
    END
    ${tables}=    Get Tables From Prediction Result    ${result}
    FOR    ${table}    IN    @{tables}
        FOR    ${rows}    IN    ${table}[rows]
            FOR    ${row}    IN    @{rows}
                ${cells}=    Evaluate    [cell['text'] for cell in $row]
                Log To Console    ROW:${{" | ".join($cells)}}
            END
        END
    END
Python example:
    result = nanolib.predict_file("./docu.pdf", secrets["receipts-model-id"])
    fields = nanolib.get_fields_from_prediction_result(result)
    for field in fields:
        print(f"Label: {field['label']} Text: {field['ocr_text']}")
    tables = nanolib.get_tables_from_prediction_result(result)
    for table in tables:
        for row in table["rows"]:
            cells = [cell["text"] for cell in row]
            print(f"ROW: {' | '.join(cells)}")
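Because the keyword takes a local file path, it may be worth checking that the file exists before any request is made. The wrapper below is one possible sketch: `predict_existing_file` is a hypothetical helper, not part of the library, and it accepts the prediction function as an argument so that it makes no assumption about how `predict_file` itself reports a missing file.

```python
from pathlib import Path
from typing import Any, Callable


def predict_existing_file(
    predict: Callable[[str, str], Any], filepath: str, model_id: str
) -> Any:
    """Call predict(filepath, model_id) only if filepath is an existing file."""
    path = Path(filepath)
    if not path.is_file():
        # Fail early with a clear message instead of sending a request.
        raise FileNotFoundError(f"Document not found: {path}")
    return predict(str(path), model_id)


# Intended usage with the library (not executed here, needs a valid API key):
# result = predict_existing_file(nanolib.predict_file, "./document.pdf", model_id)
```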
- set_authorization(apikey: str) → None
Set the Nanonets request headers using the API key.
- Parameters:
apikey – the Nanonets API key
Robot Framework example:
    ${secrets}=    Get Secret    nanonets-auth
    Set Authorization    ${secrets}[apikey]
Python example:
    secrets = Vault().get_secret("nanonets-auth")
    nanolib = Nanonets()
    nanolib.set_authorization(secrets["apikey"])