Requirements:
python -m pip install requests
Example Input Table from us-002.pdf
Save the following code as test.py in your working directory, replacing YOUR_KEY with your x-rapidapi-key, which can be found on RapidAPI under Endpoints->Headers->X-RapidAPI-Key:
import requests

url = "https://extract-table-documentdev.p.rapidapi.com/extracttable"
headers = {
    "content-type": "application/octet-stream",
    "pages": "2",  # optional header, defaults to 1
    "x-rapidapi-key": "YOUR_KEY",
    "x-rapidapi-host": "extract-table-documentdev.p.rapidapi.com",
}

# Send the PDF as the raw request body; the with-block closes the file afterwards.
with open("us-002.pdf", "rb") as payload:
    response = requests.post(url, data=payload, headers=headers)

print(response.text)
Run py test.py in your terminal at the root of the workspace. You will receive a response structured like the one below.
Note: pages is an optional header that defaults to 1. It accepts a number between 1 and 10 inclusive, or the word “all”.
{"tables": [{"stats": {"accuracy": 99.18, "whitespace": 4.44, "order": 1, "page": 2}, "titleEstimate": " Table 2: NNI Budget, by Agency, 2009\u20132011\n(dollars in millions)", "data": {"Agency": {"DOE": {"2009 Actual": "332.6", "2009 Recovery": "293.2", "2010 Estimated": "372.9", "2011 Proposed": "423.9"}, "NSF": {"2009 Actual": "408.6", "2009 Recovery": "101.2", "2010 Estimated": "417.7", "2011 Proposed": "401.3"}, "HHS/NIH": {"2009 Actual": "342.8", "2009 Recovery": "73.4", "2010 Estimated": "360.6", "2011 Proposed": "382.4"}, "DOD": {"2009 Actual": "459.0", "2009 Recovery": "0.0", "2010 Estimated": "436.4", "2011 Proposed": "348.5"}, "DOC/NIST": {"2009 Actual": "93.4", "2009 Recovery": "43.4", "2010 Estimated": "114.4", "2011 Proposed": "108.0"}, "EPA": {"2009 Actual": "11.6", "2009 Recovery": "0.0", "2010 Estimated": "17.7", "2011 Proposed": "20.0"}, "HHS/NIOSH": {"2009 Actual": "6.7", "2009 Recovery": "0.0", "2010 Estimated": "9.5", "2011 Proposed": "16.5"}, "NASA": {"2009 Actual": "13.7", "2009 Recovery": "0.0", "2010 Estimated": "13.7", "2011 Proposed": "15.8"}, "HHS/FDA": {"2009 Actual": "6.5", "2009 Recovery": "0.0", "2010 Estimated": "7.3", "2011 Proposed": "15.0"}, "DHS": {"2009 Actual": "9.1", "2009 Recovery": "0.0", "2010 Estimated": "11.7", "2011 Proposed": "11.7"}, "USDA/NIFA": {"2009 Actual": "9.9", "2009 Recovery": "0.0", "2010 Estimated": "10.4", "2011 Proposed": "8.9"}, "USDA/FS": {"2009 Actual": "5.4", "2009 Recovery": "0.0", "2010 Estimated": "5.4", "2011 Proposed": "5.4"}, "CPSC": {"2009 Actual": "0.2", "2009 Recovery": "0.0", "2010 Estimated": "0.2", "2011 Proposed": "2.2"}, "DOT/FHWA": {"2009 Actual": "0.9", "2009 Recovery": "0.0", "2010 Estimated": "3.2", "2011 Proposed": "2.0"}, "DOJ": {"2009 Actual": "1.2", "2009 Recovery": "0.0", "2010 Estimated": "0.0", "2011 Proposed": "0.0"}, "TOTAL": {"2009 Actual": "1,701.5", "2009 Recovery": "511.3", "2010 Estimated": "1,781.1", "2011 Proposed": "1,761.6"}}}}]}
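Because the response is plain JSON, a table can be flattened into rows with the standard library. The sketch below assumes the layout seen in the sample response, where data maps a header label such as "Agency" to row labels, and each row label maps column names to string values. The helper name table_to_rows is hypothetical and not part of the API, and a trimmed copy of the sample stands in for a live call.

```python
import json


def table_to_rows(table):
    """Flatten one entry of "tables" into a list of rows, header row first.

    Assumed layout (taken from the sample response, not from the schema):
    data -> {header label -> {row label -> {column name -> value}}}
    """
    rows = []
    for header_label, records in table["data"].items():
        # Collect column names in first-seen order across all rows.
        columns = []
        for values in records.values():
            for name in values:
                if name not in columns:
                    columns.append(name)
        rows.append([header_label] + columns)
        for row_label, values in records.items():
            rows.append([row_label] + [values.get(name, "") for name in columns])
    return rows


# Trimmed copy of the sample response, used here instead of a live API call.
sample = json.loads('''{"tables": [{"stats": {"accuracy": 99.18, "whitespace": 4.44, "order": 1, "page": 2},
 "titleEstimate": "Table 2", "data": {"Agency": {
   "DOE": {"2009 Actual": "332.6", "2011 Proposed": "423.9"},
   "TOTAL": {"2009 Actual": "1,701.5", "2011 Proposed": "1,761.6"}}}}]}''')

rows = table_to_rows(sample["tables"][0])
for row in rows:
    print(row)

# Cell values arrive as strings; strip thousands separators before converting.
total_2009 = float(rows[-1][1].replace(",", ""))
```

With a real response, json.loads(response.text) (or response.json()) would take the place of the hard-coded sample, and the resulting rows can be passed to csv.writer if a CSV file is needed.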
The full response schema is shown below:
{
  "type": "object",
  "properties": {
    "tables": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "stats": {
            "type": "object",
            "properties": {
              "accuracy": {
                "type": "number"
              },
              "whitespace": {
                "type": "number"
              },
              "order": {
                "type": "integer"
              },
              "page": {
                "type": "integer"
              }
            }
          },
          "titleEstimate": {
            "type": "string"
          },
          "data": {
            "type": "object"
          }
        }
      }
    }
  }
}
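A response can be checked against this schema before it is used. The sketch below is a hand-written, standard-library-only check of the types listed in the schema, not a full JSON Schema validator. Since the schema declares no required fields, a property is only checked when it is present. The name check_response is a hypothetical helper, not part of the API.

```python
def _is_number(value):
    # bool is a subclass of int in Python, but JSON true/false are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_integer(value):
    return isinstance(value, int) and not isinstance(value, bool)


# Expected type check for each property of "stats", as listed in the schema.
STATS_CHECKS = {
    "accuracy": _is_number,
    "whitespace": _is_number,
    "order": _is_integer,
    "page": _is_integer,
}


def check_response(doc):
    """Return a list of problems; an empty list means the shape matches."""
    if not isinstance(doc, dict):
        return ["response must be an object"]
    tables = doc.get("tables", [])
    if not isinstance(tables, list):
        return ["tables must be an array"]

    problems = []
    for i, table in enumerate(tables):
        if not isinstance(table, dict):
            problems.append(f"tables[{i}] must be an object")
            continue
        stats = table.get("stats", {})
        if not isinstance(stats, dict):
            problems.append(f"tables[{i}].stats must be an object")
        else:
            for key, is_valid in STATS_CHECKS.items():
                if key in stats and not is_valid(stats[key]):
                    problems.append(f"tables[{i}].stats.{key} has the wrong type")
        if "titleEstimate" in table and not isinstance(table["titleEstimate"], str):
            problems.append(f"tables[{i}].titleEstimate must be a string")
        if "data" in table and not isinstance(table["data"], dict):
            problems.append(f"tables[{i}].data must be an object")
    return problems


good = {"tables": [{"stats": {"accuracy": 99.18, "whitespace": 4.44, "order": 1, "page": 2},
                    "titleEstimate": "Table 2", "data": {}}]}
bad = {"tables": [{"stats": {"page": "2"}, "data": []}]}

print(check_response(good))
print(check_response(bad))
```

For stricter validation, a dedicated JSON Schema library could be given the schema directly; the manual check here only avoids an extra dependency.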