Pharma Industry Data Collection

By TRAWLINGWEB DATA | Updated a month ago | News, Media

Data integrity and pagination

Best practice

Each API request returns at most 100 results that match the query, although many more results may match the filter parameters. To retrieve all of them, make follow-up calls that pass the “next” values from the previous response as parameters of the next API call. These values are “next_ts”, “next_q”, and “next_tsi”; use them in successive calls until every result that matches the query has been retrieved.

Pagination gives access to all available results in an ordered, structured way. Results are returned in ascending order by “crawled” date, and the “next” values are required to build the follow-up requests that ingest the remaining results.
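
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the provider's client code: the request parameter names `q`, `ts`, and `tsi`, the `fetch_page` callable, and the rule of stopping when `restResults` reaches zero are all assumptions made for the example. Only the response keys (`data`, `restResults`, `next_ts`, `next_tsi`, `next_q`) come from the documentation on this page.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_all_results(
    fetch_page: Callable[[Dict[str, str]], Page],
    query: str,
    max_pages: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every publication by following the "next" values page by page.

    fetch_page is a hypothetical callable that performs one API request with
    the given parameters and returns the decoded JSON body.
    """
    # Assumed parameter name for the first call: "q" carries the query.
    params: Dict[str, str] = {"q": query}

    for _ in range(max_pages):  # hard cap so a bad response cannot loop forever
        response = fetch_page(params)["response"]
        records = response.get("data", [])
        yield from records

        # Assumed stop rule: nothing pending, or an empty page.
        if not records or int(response.get("restResults", 0)) <= 0:
            break

        # Feed the "next" values back as the parameters of the following call.
        # The names "ts" and "tsi" are assumptions for this sketch.
        params = {
            "q": str(response["next_q"]),
            "ts": str(response["next_ts"]),
            "tsi": str(response["next_tsi"]),
        }
```

In practice `fetch_page` would wrap an authenticated HTTP GET against the API endpoint. Keeping it as an injected callable lets the pagination logic be exercised without network access.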

News data

| Field | Description | Findable | Type | Format |
|---|---|---|---|---|
| id | Identification code of each crawled publication, assigned by Trawlingweb. | No | String | |
| title | Title of the publication. | Yes | String | |
| text | Body text of the publication. | Yes | String | |
| published | Publication date-time. | Yes | Date-Time | ISO 8601 (UTC) |
| crawled | Crawl date-time. Time zone: GMT+1. | Yes | Integer | UNIX timestamp (milliseconds) |
| url | Website address of the publication. | No | String | |
| author | Author of the publication. | Yes | String | |
| language | Language of the publication. | Yes | String | ISO 639-1 |
| domain | Web domain of the publication. | Yes | String | |
| site | Website of the publication. | Yes | String | |
| site_type | Type of website (news, blogs, discussions, and general). | Yes | String | |
| site_language | Website language. | Yes | String | ISO 639-1 |
| site_country | Website country. | Yes | String | ISO 3166-2 |
| site_region | Website region. | Yes | String | ISO 3166-2:ES |
| site_section | Website section of the publication. | No | String | |
| section | Section of the publication. | No | String | |
| value | Estimated economic value. | No | Float | |
| rank | Domain ranking. | No | Integer | |
| unique_visitors | Estimated number of unique visitors. | No | Integer | |
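
Because “published” and “crawled” use different formats, a consumer usually has to normalize both before comparing them. The sketch below shows one way to do that. The helper name `parse_record_dates` is hypothetical, and it assumes “published” arrives as an ISO 8601 string with either a `Z` suffix or an explicit offset, which the table does not spell out. The “crawled” value is treated as a plain epoch offset in milliseconds and rendered in UTC; whether the GMT+1 note in the table calls for any further adjustment is not stated here.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def parse_record_dates(record: Dict[str, Any]) -> Dict[str, datetime]:
    """Convert the two date fields of one publication into aware datetimes."""
    # "crawled" is a UNIX timestamp in milliseconds; split it into whole
    # seconds and leftover milliseconds to avoid float rounding.
    crawled_ms = int(record["crawled"])
    crawled = datetime.fromtimestamp(crawled_ms // 1000, tz=timezone.utc)
    crawled += timedelta(milliseconds=crawled_ms % 1000)

    # "published" is ISO 8601; older Python versions do not accept a
    # trailing "Z", so normalize it to an explicit UTC offset first.
    published = datetime.fromisoformat(record["published"].replace("Z", "+00:00"))

    return {"crawled": crawled, "published": published}


# Sample values for illustration only (not real API output).
sample = {"crawled": "1675160515891", "published": "2023-01-31T09:00:00Z"}
dates = parse_record_dates(sample)
print(dates["crawled"].isoformat(), dates["published"].isoformat())
```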

Request data

| Data | Description | Type |
|---|---|---|
| totalResults | Number of publications found by the search. | Integer |
| restResults | Number of publications still pending to be served. | Integer |
| next_ts | Initial date-time limit reference in Unix time (milliseconds). Defaults to 1 month ago. | Integer |
| next_tsi | Final time delimiter. Defaults to now. | Integer |
| next_q | Established query. | String |
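
Since each request serves at most 100 results, “restResults” gives a rough idea of how many follow-up calls are still needed. A small sketch of that arithmetic, assuming every follow-up call returns a full page until the last one (the helper name `remaining_requests` is hypothetical):

```python
import math

PAGE_SIZE = 100  # maximum results per request, as stated in this tutorial


def remaining_requests(rest_results: int, page_size: int = PAGE_SIZE) -> int:
    """Estimate how many more calls are needed to serve the pending results."""
    if rest_results <= 0:
        return 0
    return math.ceil(rest_results / page_size)


print(remaining_requests(250))  # 250 pending results need 3 more calls
```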

Example output

JSON format response:

{
    "response" : {
        "data" : [{
            "id" : "...",
            "title" : "...",
            "text" : "...",
            "published" : "...",
            "crawled" : "...",
            "url" : "...",
            "author" : "...",
            "language" : "...",
            "domain" : "...",
            "site" : "...",
            "site_type" : "...",
            "site_language" : "...",
            "site_country" : "...",
            "site_region" : "...",
            "site_section" : "...",
            "section" : "...",
            "value" : "...",
            "rank" : "...",
            "unique_visitors" : "..."
        }],
        "totalResults" : "...",
        "restResults" : "...",
        "next_ts" : "1675160515891",
        "next_tsi" : "1677067077000",
        "next_q" : "barcelona"
    }
}