OCR

By API 4 AI | Updated a month ago | Visual Recognition

How to recognize a container number using OCR API

Logistics centers, ports, and warehouses handle containers, wagons, and other units of cargo transportation. As a result, their staff must manually inspect every piece of cargo.

We propose using the OCR API to read the identification numbers marked on cargo containers in accordance with the ISO 6346 standard. The OCR API can detect words in an image, return bounding boxes around them, and extract the complete text of the image.

The API supports a wide range of modern and archaic languages. It operates in two modes: recognizing individual words or extracting the entire text of an image.

Python implementation

Before you start, we recommend getting familiar with the OCR API: read the documentation at OCR API Docs and browse the practical examples in the OCR API Examples repository.
These resources explain what the API can do and make the rest of this tutorial easier to follow.

Getting a container number

Send a photo of a container to the OCR API, and it returns the text extracted from the image.
Container numbers follow a standardized format defined by ISO 6346: a three-letter owner code, a one-letter equipment category identifier, a six-digit serial number, and a single check digit.
We therefore use a regular expression to find the container number in the extracted text.

import re
import sys
from pathlib import Path

import requests
from requests.adapters import Retry, HTTPAdapter

API_URL = 'https://ocr43.p.rapidapi.com'


def get_container_number(photo_path: Path, api_key: str):
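    # We strongly recommend you use exponential backoff.
    # By default urllib3 does not retry POST on error statuses, so allow any
    # method explicitly (the allowed_methods argument needs urllib3 >= 1.26).
    error_statuses = (408, 409, 429, 500, 502, 503, 504)
    s = requests.Session()
    retries = Retry(backoff_factor=1.5, status_forcelist=error_statuses,
                    allowed_methods=None)

    s.mount('https://', HTTPAdapter(max_retries=retries))

    url = f'{API_URL}/v1/results'
    with photo_path.open('rb') as f:
        api_res = s.post(url, files={'image': f},
                         headers={'X-RapidAPI-Key': api_key}, timeout=20)

    # Handle processing failure. Check the HTTP status before parsing the
    # body, since an error response is not guaranteed to be JSON.
    if api_res.status_code != 200:
        print('Image processing failed.')
        sys.exit(1)
    api_res_json = api_res.json()
    if api_res_json['results'][0]['status']['code'] == 'failure':
        print('Image processing failed.')
        sys.exit(1)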

    # Parse container number using regular expression according to https://en.wikipedia.org/wiki/ISO_6346
    text = api_res_json['results'][0]['entities'][0]['objects'][0]['entities'][0]['text']
    serial_match = re.search(r'([a-zA-Z]{3})([a-zA-Z])\s?(\d{6})\s?(\d)', text)
    if not serial_match:
        return None
    owner_code = serial_match.group(1)
    category_id = serial_match.group(2)
    serial = serial_match.group(3)
    check_digit = serial_match.group(4)
    container_type = text[serial_match.end() + 1:serial_match.end() + 5]
    return owner_code, category_id, serial, check_digit, container_type
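
The parsing steps in this function can be tried without calling the API. The sketch below is only an illustration: the helper names `extract_text` and `find_container_number` are hypothetical, and the mock response contains just the keys this tutorial reads, with placeholder values. It walks the same nested path defensively and applies the same regular expression.

```python
import re
from typing import Optional, Tuple

# Same pattern as in get_container_number: owner code, category
# identifier, serial number, check digit.
CONTAINER_RE = re.compile(r'([a-zA-Z]{3})([a-zA-Z])\s?(\d{6})\s?(\d)')


def extract_text(response_json: dict) -> Optional[str]:
    """Return the recognized text, or None if any level of the path is missing."""
    try:
        return (response_json['results'][0]['entities'][0]
                ['objects'][0]['entities'][0]['text'])
    except (KeyError, IndexError, TypeError):
        return None


def find_container_number(text: str) -> Optional[Tuple[str, str, str, str]]:
    """Return (owner_code, category_id, serial, check_digit) or None."""
    match = CONTAINER_RE.search(text)
    return match.groups() if match else None


# Hypothetical response for illustration; the 'ok' status value and the
# recognized text are placeholders, not real API output.
mock_response = {
    'results': [{
        'status': {'code': 'ok'},
        'entities': [{'objects': [{'entities': [
            {'text': 'CSQU 305438 3\n22G1'},
        ]}]}],
    }]
}

text = extract_text(mock_response)
print(find_container_number(text) if text else None)  # ('CSQ', 'U', '305438', '3')
print(extract_text({'results': []}))  # None
```

Returning None for an incomplete response lets the caller report "not found" instead of failing with a KeyError or IndexError.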

Parse command-line arguments

The script takes the API key and the path to the container photo as command-line arguments, parsed with argparse.

import argparse


def parse_args():
    """Parse command line arguments."""
    parser = argparse.ArgumentParser()
    # Get your token at https://rapidapi.com/api4ai-api4ai-default/api/ocr43/pricing
    parser.add_argument('--api-key', help='Rapid API token.', required=True)
    parser.add_argument('photo', type=Path,
                        help='Path to a photo.')
    return parser.parse_args()
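
argparse stores `--api-key` in the attribute `api_key` and converts the positional argument to a `Path`. As a minimal sketch of that behavior, the variant below assumes an optional `argv` parameter (not part of the original script) so the parser can be exercised without a real command line. The key and file name are placeholders.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse arguments from argv, or from sys.argv[1:] when argv is None."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--api-key', help='Rapid API token.', required=True)
    parser.add_argument('photo', type=Path, help='Path to a photo.')
    return parser.parse_args(argv)


# Placeholder values, for illustration only.
args = parse_args(['--api-key', 'demo-key', 'container.jpg'])
print(args.api_key, args.photo)  # demo-key container.jpg
```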

Main function

The main function calls the functions defined above and prints the container number.

def main():
    args = parse_args()
    info = get_container_number(args.photo, args.api_key)
    if info:
        owner_code, category_id, serial, check_digit, container_type = info
        print(f'Container number: {owner_code}{category_id}{serial}{check_digit}.')
    else:
        print('Container number not found.')

If you run the script with the photo, you’ll get the container info:

python3 main.py --api-key <API_KEY> <PATH_TO_PHOTO>
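
OCR can misread a character, so it may be worth verifying the result. ISO 6346 derives the final digit from the ten characters before it: each character is mapped to a number, multiplied by 2 raised to its position, and the sum is reduced modulo 11 (a remainder of 10 is written as 0). The sketch below shows one way to check this. The function names are illustrative and are not part of the OCR API or the original script.

```python
import re


def iso6346_check_digit(prefix: str) -> int:
    """Compute the check digit for the first 10 characters of a container number."""
    if not re.fullmatch(r'[A-Za-z]{4}[0-9]{6}', prefix):
        raise ValueError('expected 4 letters followed by 6 digits')
    # Letters map to 10..38, skipping multiples of 11 (11, 22, 33).
    letter_values = {}
    value = 10
    for letter in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
        if value % 11 == 0:
            value += 1
        letter_values[letter] = value
        value += 1
    total = 0
    for position, char in enumerate(prefix.upper()):
        number = int(char) if char.isdigit() else letter_values[char]
        total += number * 2 ** position
    # A remainder of 10 is written as 0.
    return total % 11 % 10


def is_valid_container_number(owner_code: str, category_id: str,
                              serial: str, check_digit: str) -> bool:
    """Compare the recognized check digit with the computed one."""
    expected = iso6346_check_digit(owner_code + category_id + serial)
    return expected == int(check_digit)


print(iso6346_check_digit('CSQU305438'))  # 3
print(is_valid_container_number('CSQ', 'U', '305438', '3'))  # True
print(is_valid_container_number('CSQ', 'U', '305438', '4'))  # False
```

The four arguments of `is_valid_container_number` match the first four values returned by `get_container_number`, so a failed check can be treated as a likely misread.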


Full Python code

"""
Get cargo container number from a photo.

Run with:
`python3 main.py --api-key <RAPID_API_TOKEN> <PATH_TO_PHOTO>`
"""

import argparse
import re
import sys
from pathlib import Path

import requests
from requests.adapters import Retry, HTTPAdapter

API_URL = 'https://ocr43.p.rapidapi.com'


def get_container_number(photo_path: Path, api_key: str):
    """Get container number info according to ISO 6346."""
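    # We strongly recommend you use exponential backoff.
    # By default urllib3 does not retry POST on error statuses, so allow any
    # method explicitly (the allowed_methods argument needs urllib3 >= 1.26).
    error_statuses = (408, 409, 429, 500, 502, 503, 504)
    s = requests.Session()
    retries = Retry(backoff_factor=1.5, status_forcelist=error_statuses,
                    allowed_methods=None)

    s.mount('https://', HTTPAdapter(max_retries=retries))

    url = f'{API_URL}/v1/results'
    with photo_path.open('rb') as f:
        api_res = s.post(url, files={'image': f},
                         headers={'X-RapidAPI-Key': api_key}, timeout=20)

    # Handle processing failure. Check the HTTP status before parsing the
    # body, since an error response is not guaranteed to be JSON.
    if api_res.status_code != 200:
        print('Image processing failed.')
        sys.exit(1)
    api_res_json = api_res.json()
    if api_res_json['results'][0]['status']['code'] == 'failure':
        print('Image processing failed.')
        sys.exit(1)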

    # Parse container number using regular expression according to https://en.wikipedia.org/wiki/ISO_6346.
    text = api_res_json['results'][0]['entities'][0]['objects'][0]['entities'][0]['text']
    serial_match = re.search(r'([a-zA-Z]{3})([a-zA-Z])\s?(\d{6})\s?(\d)', text)
    if not serial_match:
        return None
    owner_code = serial_match.group(1)
    category_id = serial_match.group(2)
    serial = serial_match.group(3)
    check_digit = serial_match.group(4)
    container_type = text[serial_match.end() + 1:serial_match.end() + 5]
    return owner_code, category_id, serial, check_digit, container_type


def parse_args():
    """Parse command line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--api-key', help='Rapid API token.', required=True)  # Get your token at https://rapidapi.com/api4ai-api4ai-default/api/ocr43/pricing
    parser.add_argument('photo', type=Path,
                        help='Path to a photo.')
    return parser.parse_args()


def main():
    """
    Script entry point.
    """
    args = parse_args()
    info = get_container_number(args.photo, args.api_key)
    if info:
        owner_code, category_id, serial, check_digit, container_type = info
        print(f'Container number: {owner_code}{category_id}{serial}{check_digit}.')
    else:
        print('Container number not found.')


if __name__ == '__main__':
    main()

Conclusion

The OCR API is a versatile tool whose uses go well beyond reading container numbers. It can help with tasks such as digitizing historical publications, scanning and analyzing documents, and supporting the translation of text found in images.

That range makes it useful to professionals and researchers in many fields. By automating text extraction, individuals and organizations can work faster and more accurately than they could by transcribing text manually.