Data integrity: each API request returns at most 100 results that match the query, although many more results may match the filter parameters. To retrieve all of them, make follow-up calls that pass the “next” values from the previous response (“next_ts”, “next_q”, and “next_tsi”) as parameters of the next API call, and repeat until no results remain pending.
Pagination: results are returned in ascending order by “crawled” date, so following the “next” values in successive requests ingests all available results in an ordered and structured way.
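The loop described above can be sketched as follows. This is a minimal illustration, not the official client: `fetch_page` is a hypothetical callable that performs one API request and returns the parsed `response` object, and the request parameter names `q`, `ts`, and `tsi` are assumptions about how the “next” values map onto the call.

```python
from typing import Any, Callable, Dict, Iterator

Response = Dict[str, Any]
# Hypothetical fetcher: takes request parameters, returns the parsed "response" object.
FetchPage = Callable[[Dict[str, Any]], Response]


def iter_publications(
    fetch_page: FetchPage,
    params: Dict[str, Any],
    max_calls: int = 10_000,
) -> Iterator[Dict[str, Any]]:
    """Yield every publication matching the query by following the "next" values."""
    current = dict(params)
    for _ in range(max_calls):  # safety cap so a misbehaving server cannot loop forever
        response = fetch_page(current)
        data = response.get("data", [])
        yield from data
        # restResults counts publications still pending; it may arrive as a string.
        remaining = int(response.get("restResults") or 0)
        if remaining <= 0 or not data:
            return
        # Assumed mapping from the "next" values to request parameter names.
        current = {
            **current,
            "q": response["next_q"],
            "ts": response["next_ts"],
            "tsi": response["next_tsi"],
        }
```

Injecting `fetch_page` keeps the pagination logic independent of the HTTP library and the endpoint, so it can be exercised with canned responses before pointing it at the real API.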
Field | Description | Findable | Type | Format |
---|---|---|---|---|
id | Identification code of each crawled publication assigned by Trawlingweb. | No | String | |
title | Title of the publication. | Yes | String | |
text | Body text of the publication. | Yes | String | |
published | Publication date-time. | Yes | Date-Time | ISO 8601-UTC |
crawled | Crawl date-time. Time Zone = GMT +1 | Yes | Integer | UNIX Timestamp milliseconds |
url | Website address of the publication. | No | String | |
author | Author of the publication. | Yes | String | |
language | Language of the publication. | Yes | String | ISO 639-1 |
domain | Web domain of the publication. | Yes | String | |
site | Website of the publication. | Yes | String | |
site_type | Type of website (news, blogs, discussions and general). | Yes | String | |
site_language | Website language. | Yes | String | ISO 639-1 |
site_country | Website country. | Yes | String | ISO 3166-2 |
site_region | Website region. | Yes | String | ISO 3166-2:ES |
site_section | Website section of the publication. | No | String | |
section | Section of the publication. | No | String | |
value | Estimated economic value. | No | Float | |
rank | Domain ranking. | No | Integer | |
unique_visitors | Estimated number of unique visitors. | No | Integer | |
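The date formats in the table above can be converted to native types as in the sketch below. The helper names are hypothetical, and the exact ISO 8601 variant the API emits for `published` is an assumption. The table notes GMT +1 for `crawled`; a UNIX timestamp is itself timezone-independent, so this sketch returns UTC and leaves any shift to local time to the caller.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def parse_crawled(value: Any) -> datetime:
    """Convert a UNIX timestamp in milliseconds to a timezone-aware UTC datetime."""
    millis = int(value)
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    seconds, remainder = divmod(millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=remainder)


def parse_published(value: str) -> datetime:
    """Parse an ISO 8601 date-time, treating a missing offset as UTC."""
    # Older Python versions do not accept a trailing "Z", so normalise it first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a publication with its date fields parsed."""
    result = dict(record)
    if result.get("crawled"):
        result["crawled"] = parse_crawled(result["crawled"])
    if result.get("published"):
        result["published"] = parse_published(result["published"])
    return result
```

For example, `parse_crawled("1677067077000")` yields 2023-02-22 11:57:57 UTC.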
Data | Description | Type |
---|---|---|
totalResults | Number of publications found by the search. | Integer |
restResults | Number of publications pending to be served. | Integer |
next_ts | Initial date-time limit, as Unix time in milliseconds. Defaults to one month ago. | Integer |
next_tsi | Final date-time limit. Defaults to the current time. | Integer |
next_q | Query established for the search. | String |
"response" : {
  "data" : [{
    "id" : "...",
    "title" : "...",
    "text" : "...",
    "published" : "...",
    "crawled" : "...",
    "url" : "...",
    "author" : "...",
    "language" : "...",
    "domain" : "...",
    "site" : "...",
    "site_type" : "...",
    "site_language" : "...",
    "site_country" : "...",
    "site_region" : "...",
    "site_section" : "...",
    "section" : "...",
    "value" : "...",
    "rank" : "...",
    "unique_visitors" : "..."
  }],
  "totalResults" : "...",
  "restResults" : "...",
  "next_ts" : "1675160515891",
  "next_tsi" : "1677067077000",
  "next_q" : "barcelona"
}
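The paging metadata in this response can be read into a typed structure, as in the sketch below. `PageInfo` and `has_more` are hypothetical names introduced for illustration. The sample shows the numeric values quoted as strings, so the sketch converts them with `int()`, which also accepts plain integers.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class PageInfo:
    """Paging metadata taken from one "response" object."""

    total_results: int
    rest_results: int
    next_ts: int
    next_tsi: int
    next_q: str

    @classmethod
    def from_response(cls, response: Dict[str, Any]) -> "PageInfo":
        # int() accepts both quoted ("1675160515891") and unquoted numbers.
        return cls(
            total_results=int(response["totalResults"]),
            rest_results=int(response["restResults"]),
            next_ts=int(response["next_ts"]),
            next_tsi=int(response["next_tsi"]),
            next_q=str(response["next_q"]),
        )

    @property
    def has_more(self) -> bool:
        """True while publications are still pending to be served."""
        return self.rest_results > 0
```

A caller would build `PageInfo.from_response(payload["response"])` after each request and stop paging once `has_more` is false.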