Article Extractor

FREEMIUM
By PWSHub | Updated 3 days ago | Data
API Creator:
Rapid account: PWS Hub

README

Article Extractor API

This API extracts the main content from a blog post or news article at a given URL.

Key Features

  • Core Functionality: Extracts the core text of the article, providing a concise summary of the information it conveys.
  • Simple Usage: Offers a single endpoint, /article/parse, accessible through both GET and POST requests.
  • Comprehensive Extraction: Aims to capture the essential details of the article while maintaining readability.

Authentication

Include the following headers in your requests:

  • X-RapidAPI-Key: your unique RapidAPI key obtained upon registration
  • X-RapidAPI-Host: set to article-extractor2.p.rapidapi.com
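
As a minimal sketch, the two headers above could be assembled by a small helper. The function name buildAuthHeaders and the RAPIDAPI_KEY environment variable mentioned in the comment are hypothetical names chosen for illustration; only the header names and the host value come from this README.

```javascript
// Host value taken from the Authentication section above.
const RAPIDAPI_HOST = 'article-extractor2.p.rapidapi.com'

// Hypothetical helper (not part of the API): returns the headers
// that every request to the Article Extractor API must carry.
function buildAuthHeaders (apiKey) {
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    throw new Error('A RapidAPI key is required')
  }
  return {
    'X-RapidAPI-Key': apiKey,
    'X-RapidAPI-Host': RAPIDAPI_HOST,
  }
}

// Possible usage, assuming the key is stored in an environment
// variable named RAPIDAPI_KEY (an assumption, pick any name you like):
// const headers = buildAuthHeaders(process.env.RAPIDAPI_KEY)
```

Keeping the key out of the source code, for example in an environment variable, avoids committing it by accident.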

Request Parameters

  • url (required): the URL of the blog post or news entry to extract the main content from.
  • word_per_minute (optional): the reading speed used to calculate “time to read.” The default is 300 words per minute.
  • desc_truncate_len (optional): the maximum length of the generated description. The default is 210 characters. Longer descriptions are truncated to this limit.
  • desc_len_min (optional): the minimum character count for the description. The default is 180 characters. If the extracted description falls below this threshold, the API will return “null”.
  • content_len_min (optional): the minimum character count for the extracted content. The default is 200 characters. If the content falls below this minimum, the API will return “null”.
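
The parameters above can be combined into a GET request URL roughly as follows. This is a sketch, not the API's own client: buildParseUrl is a hypothetical helper, and the positive-integer check is a client-side assumption about what the numeric parameters accept, not documented API behavior.

```javascript
const ENDPOINT = 'https://article-extractor2.p.rapidapi.com/article/parse'

// Optional parameters named in the list above.
const OPTIONAL_PARAMS = [
  'word_per_minute',
  'desc_truncate_len',
  'desc_len_min',
  'content_len_min',
]

// Hypothetical helper: builds the request URL for /article/parse.
function buildParseUrl (articleUrl, options = {}) {
  if (!articleUrl) {
    throw new Error('The url parameter is required')
  }
  const target = new URL(ENDPOINT)
  // searchParams takes care of percent-encoding the article URL.
  target.searchParams.set('url', articleUrl)
  for (const name of OPTIONAL_PARAMS) {
    const value = options[name]
    if (value === undefined) continue // omitted: the API default applies
    // Assumption: these parameters are meant to be positive integers.
    if (!Number.isInteger(value) || value <= 0) {
      throw new Error(`${name} must be a positive integer`)
    }
    target.searchParams.set(name, String(value))
  }
  return target.toString()
}

console.log(buildParseUrl('https://example.com/post', { word_per_minute: 250 }))
```

Parameters left out of the options object are not sent, so the defaults listed above stay in effect.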

Example

Request from Node.js server

We prefer the native approach using the built-in fetch function:

import querystring from 'node:querystring'

// Requests the extracted article for the given URL.
// Resolves to the parsed JSON response, or null if the request fails.
export async function extract (url) {
  try {
    // word_per_minute is optional; 320 overrides the default of 300
    const queryString = querystring.stringify({
      url,
      word_per_minute: 320,
    })

    const target = `https://article-extractor2.p.rapidapi.com/article/parse?${queryString}`

    const res = await fetch(target, {
      headers: {
        'X-RapidAPI-Key': 'YOUR_OWN_RAPID_API_KEY',
        'X-RapidAPI-Host': 'article-extractor2.p.rapidapi.com'
      },
    })
    const json = await res.json()
    return json
  } catch (err) {
    console.error(err)
    return null
  }
}

const data = await extract('https://css-tricks.com/empathetic-animation/')
console.log(data)

You can also use axios or any other HTTP client to send requests:

import axios from 'axios'

const options = {
  method: 'GET',
  url: 'https://article-extractor2.p.rapidapi.com/article/parse',
  params: {
    url: 'https://css-tricks.com/empathetic-animation/',
    word_per_minute: 320,
  },
  headers: {
    'X-RapidAPI-Key': 'YOUR_OWN_RAPID_API_KEY',
    'X-RapidAPI-Host': 'article-extractor2.p.rapidapi.com'
  }
}

try {
  const response = await axios.request(options);
  console.log(response.data);
} catch (error) {
  console.error(error);
}

Response in JSON format:

{
  "error": 0,
  "message": "Article extraction success",
  "data": {
    "url": "https://css-tricks.com/empathetic-animation/",
    "title": "Empathetic Animation | CSS-Tricks",
    "description": "Animation on the web is often a contentious topic. I think, in part, it’s because bad animation is blindingly obvious, whereas well-executed animation fades seamlessly into the background. When handled well,...",
    "links": [
      "https://css-tricks.com/empathetic-animation/",
      "https://css-tricks.com/?p=358975"
    ],
    "image": "https://css-tricks.com/wp-json/social-image-generator/v1/image/358975",
    "content": "a very long HTML string ",
    "author": "@cassiecodes",
    "favicon": "https://i0.wp.com/css-tricks.com/wp-content/uploads/2021/07/star.png?fit=180%2C180&ssl=1",
    "source": "css-tricks.com",
    "published": "2022-01-18T09:38:10-08:00",
    "ttr": 150,
    "type": "article"
  }
}
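
As an illustration of consuming the "data" object shown above, a caller might pick out a few fields and turn "published" into a Date. The helper name summarizeArticle and the shape of its return value are hypothetical; the field names come from the sample response, and whether every field is present for every article is not stated here, so the sketch guards against a missing date.

```javascript
// Hypothetical helper: reduces the "data" object of a successful
// response to a few commonly used fields.
function summarizeArticle (article) {
  // "published" is an ISO 8601 string in the sample response.
  const published = article.published ? new Date(article.published) : null
  const isValidDate = published !== null && !Number.isNaN(published.getTime())
  return {
    title: article.title,
    source: article.source,
    author: article.author,
    publishedAt: isValidDate ? published : null,
  }
}

// Values copied from the sample response above.
const summary = summarizeArticle({
  title: 'Empathetic Animation | CSS-Tricks',
  source: 'css-tricks.com',
  author: '@cassiecodes',
  published: '2022-01-18T09:38:10-08:00',
})
console.log(summary.title, summary.publishedAt)
```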

If something goes wrong, the API returns:

{
  "error": 1,
  "message": "error description",
  "data": null
}
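
Given the two response shapes above, a caller could separate success from failure with a small wrapper. This is a sketch under the assumption that "error" is 0 on success and non-zero otherwise, as the two samples suggest; unwrapResponse is a hypothetical name, not something the API provides.

```javascript
// Hypothetical helper: returns the "data" object on success and
// throws an Error carrying the API's message on failure.
function unwrapResponse (payload) {
  if (payload === null || typeof payload !== 'object') {
    throw new Error('Unexpected response payload')
  }
  // Assumption: any value other than 0 signals a failure.
  if (payload.error !== 0) {
    throw new Error(payload.message || 'Article extraction failed')
  }
  return payload.data
}

try {
  unwrapResponse({ error: 1, message: 'error description', data: null })
} catch (err) {
  console.error(err.message)
}
```

Throwing on failure lets the calling code handle API errors and network errors in the same catch block.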