PDF to OCR

By globalw | Updated 1 month ago | Text Analysis

PDF To OCR API

A full example client in Python is available at https://github.com/globalw/pdf2ocr_client

The PDF to OCR API accepts a base64-encoded file, runs OCR on it, and returns the recognized text as a string.

Base URL

Replace YOUR-RAPIDAPI-HOST in the example requests with the actual API host.

Error Handling

In the event of an error, a JSON response is returned with a detail field, which is an array of Validation objects. Each Validation contains loc (location of the error), msg (error message), and type (error type).
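As a rough illustration of reading such an error body, the sketch below flattens the detail array into one readable line per Validation. The helper name format_validation_errors and the sample body are hypothetical; only the detail, loc, msg, and type fields come from the description above, and treating loc as a list of path parts is an assumption.

```python
from typing import Any


def format_validation_errors(body: dict[str, Any]) -> list[str]:
    """Turn an error body with a `detail` array into one line per Validation."""
    lines = []
    for item in body.get("detail", []):
        loc = item.get("loc", [])
        # Assumption: loc is usually a list of path parts; fall back to str() otherwise.
        if isinstance(loc, (list, tuple)):
            location = ".".join(str(part) for part in loc)
        else:
            location = str(loc)
        lines.append(f"{location}: {item.get('msg', '')} ({item.get('type', '')})")
    return lines


# Hypothetical error body shaped as described above (not captured from the API).
sample_error = {
    "detail": [
        {"loc": ["body", "base64_file"], "msg": "field required", "type": "value_error.missing"}
    ]
}
print("\n".join(format_validation_errors(sample_error)))
```

With the requests library, such a body would typically be obtained from response.json() after checking response.status_code.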

Schemas

ProcessFileRequest

This schema is used to send file data for OCR processing. It requires base64_file.

  • base64_file: Base64 encoded content of the file.
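A minimal sketch of building this request body from raw file bytes follows. The helper name build_process_file_request is hypothetical, and the use of standard (not URL-safe) base64 is an assumption, since the schema does not say which alphabet the API expects.

```python
import base64


def build_process_file_request(pdf_bytes: bytes) -> dict[str, str]:
    """Build a ProcessFileRequest body from raw file bytes."""
    # b64encode returns bytes; decode to str so the value is JSON-serializable.
    # Assumption: the API expects the standard base64 alphabet.
    return {"base64_file": base64.b64encode(pdf_bytes).decode("ascii")}


# Placeholder bytes standing in for a real PDF, e.g. open(path, "rb").read().
request_body = build_process_file_request(b"%PDF-1.4 example")
print(request_body["base64_file"])
```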

OCRRequest

This schema is used to retrieve OCR processed data. It requires uuid.

  • uuid: Unique identifier of the OCR processing job.
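One way to guard against malformed identifiers before sending the request is sketched below. The helper name build_ocr_request is hypothetical, and normalizing to the 32-character hex form is an assumption based on the uuid shown in the /ocr-pdf example request; the API may accept other forms as well.

```python
import uuid


def build_ocr_request(job_id: str) -> dict[str, str]:
    """Build an OCRRequest body, rejecting strings that are not valid UUIDs."""
    # uuid.UUID raises ValueError for malformed input and accepts the
    # identifier with or without hyphens; .hex is the 32-character form.
    return {"uuid": uuid.UUID(job_id).hex}


print(build_ocr_request("98e2c23d93794bbd9b7ae8c1f8cf905d"))
```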

Endpoints

POST /process-pdf

Uploads a base64-encoded file for OCR processing.

Request

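import requests
import json

# Replace YOUR-RAPIDAPI-HOST and YOUR-RAPIDAPI-KEY with your own values.
# The URL needs a scheme, otherwise requests raises a MissingSchema error.
url = "https://YOUR-RAPIDAPI-HOST/process-pdf"

# base64_file holds the base64-encoded content of the file.
payload = json.dumps({
  "filename": "testfile.pdf",
  "base64_file": "base64 string"
})
headers = {
  'Content-Type': 'application/json',
  'X-RapidAPI-Host': 'YOUR-RAPIDAPI-HOST',
  'X-RapidAPI-Key': 'YOUR-RAPIDAPI-KEY'
}

response = requests.post(url, headers=headers, data=payload)

print(response.text)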

Response

The API will return a uuid which you can use to fetch the OCR result once it's ready.


POST /ocr-pdf

Retrieves the OCR result for the job identified by uuid.

Request

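import requests
import json

# Replace YOUR-RAPIDAPI-HOST and YOUR-RAPIDAPI-KEY with your own values.
# The URL needs a scheme, otherwise requests raises a MissingSchema error.
url = "https://YOUR-RAPIDAPI-HOST/ocr-pdf"

# uuid is the identifier returned by POST /process-pdf.
payload = json.dumps({
  "uuid": "98e2c23d93794bbd9b7ae8c1f8cf905d"
})
headers = {
  'Content-Type': 'application/json',
  'X-RapidAPI-Host': 'YOUR-RAPIDAPI-HOST',
  'X-RapidAPI-Key': 'YOUR-RAPIDAPI-KEY'
}

response = requests.post(url, headers=headers, data=payload)

print(response.text)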

Response

If the OCR processing is successful and complete, the API will return the OCR result as a string in the response.
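Because the result is only available once processing has finished, a client will likely need to retry this call. The sketch below shows one possible polling loop. The name poll_for_result, the attempt count, and the delay are all hypothetical, and how the API signals a job that is not yet finished is not documented here, so that decision is left to a caller-supplied fetch function that returns None while the job is pending.

```python
import time
from typing import Callable, Optional


def poll_for_result(
    fetch: Callable[[], Optional[str]],
    attempts: int = 10,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it returns text, pausing delay_seconds between tries."""
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"OCR result not ready after {attempts} attempts")


# Stand-in for a function that POSTs the uuid to /ocr-pdf and returns None
# while the job is pending; here it succeeds on the third call.
_responses = iter([None, None, "recognized text"])
text = poll_for_result(lambda: next(_responses), delay_seconds=0.0)
print(text)
```

In a real client, fetch would wrap the /ocr-pdf request shown above and map whatever "not ready" response the API actually returns to None.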


For more advanced usage and error handling, refer to the complete schema definitions and handle the HTTP 422 Validation Error appropriately. This is a general guide to get started with the API, and the actual implementation may vary based on the programming language and libraries you use.
