WebData Crawler

FREEMIUM
By RicardoDMelo | Updated 21 days ago | Data

README

Hello! This is an HTML crawler for retrieving website data.
Given a URL, the crawler extracts as much data as it can from the page and returns it in a single response object.

How to use

Using this crawler is simple. There are two endpoints.

GetWebData

https://webdata-crawler.p.mashape.com/api/HtmlCrawl/webdata?url=[URLHERE]&imgarr=[boolean]&generic=[boolean]

This service returns a JSON object with data from the URL you provide.

Params

  • url [REQUIRED]: The URL you want to crawl. It must not contain a fragment (“#”), and it must return HTML.
  • imgarr: Default is FALSE. “false” returns a single image from the URL; “true” returns all images found in the HTML.
  • generic: Default is TRUE. “true” tries to pick a single generic image from the page when the HTML does not explicitly provide one, e.g. the first image in an article or the biggest image on the website. This feature is ignored when the imgarr parameter is used.

Response

  • webData: JSON object. May contain any of these properties: title, type, image (can be an array, depending on the imgarr parameter), url, description, siteName.
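
As a minimal sketch of calling GetWebData from Python, the helper below builds the request URL from the three documented parameters and fetches the JSON body. The function names are hypothetical, and the `X-Mashape-Key` header is an assumption: this README does not say how requests are authenticated.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://webdata-crawler.p.mashape.com/api/HtmlCrawl/webdata"


def build_webdata_url(page_url, imgarr=False, generic=True):
    """Build the GetWebData request URL from the documented parameters."""
    query = urllib.parse.urlencode({
        "url": page_url,
        # Booleans are sent as lowercase "true"/"false" strings.
        "imgarr": str(imgarr).lower(),
        "generic": str(generic).lower(),
    })
    return f"{BASE_URL}?{query}"


def get_web_data(page_url, api_key, imgarr=False, generic=True):
    """Call GetWebData and return the decoded JSON body.

    Assumption: the API key travels in an X-Mashape-Key header.
    """
    request = urllib.request.Request(
        build_webdata_url(page_url, imgarr, generic),
        headers={"X-Mashape-Key": api_key, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Every property of webData is optional, so reading fields with `dict.get` (for example `data.get("title")`) is safer than indexing directly.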

GetImages

https://webdata-crawler.p.mashape.com/api/HtmlCrawl/images?url=[URLHERE]

This service returns an array of images collected from all the img tags in the HTML returned by the provided URL.

Params

  • url [REQUIRED]: The URL you want to crawl. It must not contain a fragment (“#”), and it must return HTML.

Response

  • imgs: Array containing all images found.
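
Because the url parameter may not contain a “#”, it can help to strip the fragment before building the request. The sketch below does that with the standard library and reads the image list from an already decoded response. The helper names are hypothetical, and the exact response shape (an object with an imgs key versus a bare array) is an assumption, so both are accepted.

```python
import urllib.parse

IMAGES_ENDPOINT = "https://webdata-crawler.p.mashape.com/api/HtmlCrawl/images"


def build_images_url(page_url):
    """Build the GetImages request URL, dropping any #fragment first."""
    clean_url, _fragment = urllib.parse.urldefrag(page_url)
    return f"{IMAGES_ENDPOINT}?{urllib.parse.urlencode({'url': clean_url})}"


def extract_images(payload):
    """Return the image list from a decoded GetImages response.

    Assumption: the array sits under an "imgs" key; a bare list is
    also accepted in case the service returns the array directly.
    """
    if isinstance(payload, list):
        return payload
    return list(payload.get("imgs") or [])
```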

Want to adapt your website?

The best way to get your website seen by other crawlers and robots is the OpenGraph pattern!
OpenGraph is a pattern for describing your website. Google, Facebook, Telegram and several other applications crawl your website for this information.
To turn your websites into graph objects, see how here!
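
To illustrate what a crawler looks for, here is a small sketch that reads Open Graph `<meta property="og:...">` tags from a page using only the Python standard library. It is not the implementation behind this service; the class name, function name, and sample page are hypothetical.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(name[3:], content)


def read_open_graph(html):
    """Return a dict of Open Graph properties found in the HTML text."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical page head carrying the most common Open Graph tags.
SAMPLE = """
<html><head>
<meta property="og:title" content="Example article" />
<meta property="og:type" content="article" />
<meta property="og:image" content="https://example.com/cover.png" />
<meta property="og:url" content="https://example.com/article" />
<meta property="og:description" content="A short summary." />
<meta property="og:site_name" content="Example Site" />
</head><body></body></html>
"""

print(read_open_graph(SAMPLE))
```

The tags in the sample line up with the properties listed for webData above (title, type, image, url, description, siteName), although whether the service reads them this way is an assumption.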
