Product AutoExtract
FREEMIUM
By crawlify
Updated 7 days ago

Overview

Crawlify develops algorithms that automatically extract data from any web page using machine learning and natural language processing.
No rules, coding or site specific selectors required.

The APIs we are working on are:

  • Product AutoExtract (Launched)
  • Jobs AutoExtract (Coming soon)
  • Real Estate AutoExtract (Coming soon)

AutoExtract Product API

Overview

The Product AutoExtract API extracts clean product data from any e-commerce product page.
Retrieve pricing, title, main image, brand information and more data points automatically.
Features include:

  1. No code or selectors needed: our algorithms use machine learning to extract data.
  2. We handle proxies and JavaScript rendering for you.
  3. Our algorithms are resilient to website changes.
  4. The algorithms work across multiple site languages and currencies.

Usage

We currently use the RapidAPI marketplace to provide access to our APIs. To get started, create an account on RapidAPI.
Visit our Product AutoExtract API page and subscribe to one of our plans.
Once subscribed, you will be issued an API key to start using our service.
Send the API key in the x-rapidapi-key header of every request to authorize it.

Example request using Node.js and the request module:

var request = require("request");

var options = {
  method: 'GET',
  url: 'https://product-autoextract.p.rapidapi.com/v1/product',
  qs: {
    // Pass the plain product URL: the request module URL-encodes qs values itself,
    // so a pre-encoded value would be encoded twice.
    url: 'https://www.walmart.com/ip/Apple-AirPods-with-Charging-Case-Latest-Model/604342441'
  },
  headers: {
    'x-rapidapi-host': 'product-autoextract.p.rapidapi.com',
    // Replace with the API key issued to you by RapidAPI.
    'x-rapidapi-key': 'YOUR_RAPIDAPI_KEY'
  }
};

request(options, function (error, response, body) {
  // error is already an Error object, so rethrow it as-is.
  if (error) throw error;

  // body is the raw JSON string returned by the API.
  console.log(body);
});

The endpoint has the following query parameters:

| Parameter | Type | Required? | Description |
| --- | --- | --- | --- |
| url | string | required | Web page URL of the product to process (URL encoded) |
| timeout | int64 | optional | Number of milliseconds to wait for the data retrieval from the requested URL. The default timeout is 30 seconds (30000). |
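
As a rough sketch of how the two query parameters fit together, the helper below assembles the request URL with Node's built-in URLSearchParams, which percent-encodes the url value. The function name buildProductRequestUrl, its validation rules, and the sample page URL are assumptions made for illustration; only the endpoint and the parameter names come from this page.

```javascript
// Endpoint as shown in the example request on this page.
const ENDPOINT = 'https://product-autoextract.p.rapidapi.com/v1/product';

// Hypothetical helper (not part of the API): builds the full request URL.
//   pageUrl   - plain, not yet encoded, product page URL
//   timeoutMs - optional timeout in milliseconds; leave out to use the default
function buildProductRequestUrl(pageUrl, timeoutMs) {
  if (typeof pageUrl !== 'string' || pageUrl.length === 0) {
    throw new TypeError('pageUrl is required');
  }
  // URLSearchParams percent-encodes the value, so pass the raw URL.
  const params = new URLSearchParams({ url: pageUrl });
  if (timeoutMs !== undefined) {
    if (!Number.isInteger(timeoutMs) || timeoutMs <= 0) {
      throw new RangeError('timeoutMs must be a positive integer');
    }
    params.set('timeout', String(timeoutMs));
  }
  return `${ENDPOINT}?${params.toString()}`;
}

// Example: a 10 second timeout for a made-up product page.
console.log(buildProductRequestUrl('https://www.example.com/item?id=1', 10000));
```

Encoding in one place avoids encoding the url value twice, which can happen when an already-encoded string is handed to a library that encodes query values itself.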

The Product API returns data in JSON format. Example response:

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{
  "humanLanguage": "en",
  "resolvedPageUrl": "https://ecommerce-site/product/final-redirect-url",
  "type": "product",
  "primaryImage": "https://i5.cdn-stores.com/asr/92f2df9a-8ffc-4fd1-9592106d.jpeg",
  "title": "Instant Pot LUX60 6 Qt 6-in-1 Multi-Use Programmable Pressure Cooker",
  "offerPrice": 69,
  "regularPrice": 99,
  "brand": "Abrand",
  "description": "Product Description",
  "productId": "55505580",
  "availability": true,
  "url": "https://ecommerce-site/product"
}

The response data contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| type | string | Type of page (always product). |
| url | string | URL of the page the API was called with. |
| resolvedPageUrl | string | Final URL if the submitted url was redirected. |
| title | string | Title of the product. |
| brand | string | (Beta) Product brand name; not present on all pages. |
| offerPrice | float | Offer or actual/final price of the product. |
| regularPrice | float | Regular or original price of the product, if available. |
| productId | string | (Alpha) Unique product ID determined by Crawlify. May be a UPC, SKU, or MPN, or be extracted from the product URL. |
| primaryImage | object | Primary image details, including url, title, width, and height. |
| humanLanguage | string | Two-letter ISO 639-1 language code of the submitted page. |
| availability | bool | (Beta) Item's availability, either true or false. |
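
To illustrate how these fields might be consumed, here is a small sketch that derives a discount percentage and a stock flag from a parsed response. The helper name summarizeProduct and the shape of its return value are hypothetical, and it assumes offerPrice and regularPrice arrive as plain numbers, as in the example response; regularPrice and availability may be missing, so both are guarded.

```javascript
// Hypothetical helper (not part of the API): condenses a parsed response
// into a few convenience values.
function summarizeProduct(product) {
  const { title, offerPrice, regularPrice, availability } = product;

  // regularPrice is only returned "if available", so check before dividing.
  const hasDiscount =
    typeof offerPrice === 'number' &&
    typeof regularPrice === 'number' &&
    regularPrice > 0 &&
    offerPrice < regularPrice;

  // Discount relative to the regular price, rounded to a whole percent.
  const discountPercent = hasDiscount
    ? Math.round(((regularPrice - offerPrice) / regularPrice) * 100)
    : 0;

  return {
    title: title || null,
    price: typeof offerPrice === 'number' ? offerPrice : null,
    discountPercent,
    // availability is a beta field; treat anything but true as not in stock.
    inStock: availability === true,
  };
}

// Usage with values taken from the example response on this page.
const sampleBody =
  '{"title":"Instant Pot LUX60","offerPrice":69,"regularPrice":99,"availability":true}';
console.log(summarizeProduct(JSON.parse(sampleBody)));
```

Treating a missing availability as "not in stock" is a conservative choice for this sketch; callers that need to distinguish "unknown" from "unavailable" would want a third state.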

Possible errors:

| Error code | Description |
| --- | --- |
| 400 Bad Request | Required fields were invalid or not specified. |
| 401 Unauthorized | The access token is invalid or has been revoked. |
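
One possible way to act on these codes in client code is sketched below. The helper name describeStatus and the wording of its messages are assumptions; only the 400 and 401 codes are documented here, so every other non-2xx status is reported generically.

```javascript
// Hypothetical helper (not part of the API): maps an HTTP status code
// to a short diagnostic for logging or display.
function describeStatus(statusCode) {
  if (statusCode >= 200 && statusCode < 300) {
    return { ok: true, message: 'Success' };
  }
  switch (statusCode) {
    case 400:
      // Documented: required fields were invalid or not specified.
      return { ok: false, message: 'Bad Request: check that the url parameter is present and valid.' };
    case 401:
      // Documented: the credential is invalid or has been revoked.
      return { ok: false, message: 'Unauthorized: check the x-rapidapi-key header.' };
    default:
      // Not documented on this page, so no specific advice is given.
      return { ok: false, message: `Unexpected status ${statusCode}` };
  }
}

console.log(describeStatus(401).message);
```

In the request callback shown earlier, this could be called with response.statusCode before attempting to parse the body.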