A full example client in Python is available at https://github.com/globalw/pdf2ocr_client
The PDF to OCR API allows you to develop your client application with ease.
Replace the base URL in your requests with the actual API host URL.
In the event of an error, a JSON response is returned with a detail field, which is an array of Validation objects. Each Validation contains loc (location of the error), msg (error message), and type (error type).
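The error structure described above could be unpacked along these lines. This is a minimal sketch, not part of the official client: the helper name summarize_validation_errors is hypothetical, the sample body is made up for illustration, and it assumes loc arrives as a list of path segments (a plain string is also tolerated).

```python
import json
from typing import List


def summarize_validation_errors(body: str) -> List[str]:
    """Turn an error response body into one readable line per Validation object."""
    data = json.loads(body)
    lines = []
    for item in data.get("detail", []):
        loc = item.get("loc", [])
        # Assumption: loc is a list of path segments such as ["body", "base64_file"].
        if isinstance(loc, (list, tuple)):
            location = ".".join(str(part) for part in loc)
        else:
            location = str(loc)
        lines.append(f"{location}: {item.get('msg', '')} ({item.get('type', '')})")
    return lines


# Hypothetical sample body, shaped like the description above (not captured from the API).
sample = json.dumps({
    "detail": [
        {"loc": ["body", "base64_file"], "msg": "field required", "type": "value_error.missing"}
    ]
})
messages = summarize_validation_errors(sample)
for message in messages:
    print(message)
```

In a real client, the body passed in would be response.text from a request that came back with an error status.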
This schema is used to send file data for OCR processing. It requires base64_file.
base64_file: Base64 encoded content of the file.
This schema is used to retrieve OCR processed data. It requires uuid.
uuid: Unique identifier of the OCR processing job.
Uploads a file for OCR processing in base64 format.
import json

import requests

# requests needs a full URL including the scheme; replace the placeholders with your own values.
url = "https://YOUR-RAPIDAPI-HOST/process-pdf"

# Request body: the file name plus the Base64 encoded content of the file.
payload = json.dumps({
    "filename": "testfile.pdf",
    "base64_file": "base64 string"
})
headers = {
    'Content-Type': 'application/json',
    'X-RapidAPI-Host': 'YOUR-RAPIDAPI-HOST',
    'X-RapidAPI-Key': 'YOUR-RAPIDAPI-KEY'
}

response = requests.post(url, headers=headers, data=payload)
print(response.text)
The API will return a uuid which you can use to fetch the OCR result once it's ready.
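The "base64 string" placeholder in the upload request has to be replaced with the encoded file content. One way to build that request body from raw bytes or a local file is sketched below. The helper names build_upload_payload and build_upload_payload_from_path are hypothetical, and the sketch assumes the API expects the standard Base64 alphabet without any data-URI prefix.

```python
import base64
import json
from pathlib import Path


def build_upload_payload(filename: str, content: bytes) -> str:
    """Return the JSON request body for an upload, given the raw file bytes."""
    # Assumption: standard Base64 alphabet, no "data:...;base64," prefix.
    encoded = base64.b64encode(content).decode("ascii")
    return json.dumps({"filename": filename, "base64_file": encoded})


def build_upload_payload_from_path(path: str) -> str:
    """Read a local file and build the upload body, using the file's own name."""
    file_path = Path(path)
    return build_upload_payload(file_path.name, file_path.read_bytes())


# Demonstration with in-memory bytes standing in for a real PDF.
payload = build_upload_payload("testfile.pdf", b"%PDF-1.4 sample bytes")
print(json.loads(payload)["filename"])
```

The resulting string can be passed as the data argument of the upload request in place of the hand-written payload.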
Retrieve the OCR processed file using the uuid.
import json

import requests

# requests needs a full URL including the scheme; replace the placeholders with your own values.
url = "https://YOUR-RAPIDAPI-HOST/ocr-pdf"

# Request body: the uuid returned by the upload request.
payload = json.dumps({
    "uuid": "98e2c23d93794bbd9b7ae8c1f8cf905d"
})
headers = {
    'Content-Type': 'application/json',
    'X-RapidAPI-Host': 'YOUR-RAPIDAPI-HOST',
    'X-RapidAPI-Key': 'YOUR-RAPIDAPI-KEY'
}

response = requests.post(url, headers=headers, data=payload)
print(response.text)
If the OCR processing is successful and complete, the API will return the OCR result as a string in the response.
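Because the result is only available once processing has finished, a client will usually need to retry the retrieval request. The sketch below shows one possible retry loop. The function poll_for_result is hypothetical, and this guide does not say what the API returns while a job is still running, so the decision is left to a caller-supplied fetch function that returns None until the result is ready.

```python
import time
from typing import Callable, Optional


def poll_for_result(
    fetch: Callable[[], Optional[str]],
    attempts: int = 10,
    delay_seconds: float = 2.0,
) -> Optional[str]:
    """Call fetch() until it returns a result or the attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Wait between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None


# Demonstration with a fake fetch that becomes ready on the third call.
_fake_responses = iter([None, None, "recognized text"])
demo = poll_for_result(lambda: next(_fake_responses), attempts=5, delay_seconds=0.0)
print(demo)
```

In a real client, fetch would wrap the retrieval request shown above and translate whatever "not ready" looks like for this API into None.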
For more advanced usage and error handling, refer to the complete schema definitions and handle the HTTP 422 Validation Error appropriately. This is a general guide to get started with the API, and the actual implementation may vary based on the programming language and libraries you use.