Extract data from online news & articles. Get full metadata with content, images, authors, summary, category, keywords, topics, and more.
Automatic data extraction from articles, products, discussions, and more. This API uses advanced AI technology to retrieve clean, structured data without the need for manual rules or site-specific training.
The most advanced article extraction API with AI/ML summary, category prediction, all images, blog logo, authors, keywords, tags, sentiment and more.
Features:
- Extracts full HTML/Text: Using AI, we extract the full HTML even from JavaScript-heavy websites.
- Metadata: Get full article metadata including images, keywords, tags, and more.
Extracted Fields:
- String url: URL of the article
- String title: Title of the article
- String author: Main author of the article
- String html: Cleaned HTML; you can use this to add "Reader Mode" to your app.
- String text: Cleaned text content of the article
- String length: Total length of the article's content
- String description: Extracted description of the article.
- String siteName: The name of the blog/website the article was published on.
- String topImage: URL of the main featured image.
- String date: Date of publication of the article.
- String keywords: Extracted keywords
- String summary: AI-generated summary made up of the top 5 sentences that sum up the article.
- List sentiment
  - Score: Ranges from -5 to +5. The score is calculated by adding the sentiment values of recognized words.
  - Comparative: Comparative score of the input string.
  - Calculation: An array of words with a negative or positive valence, each with its AFINN score.
  - Positive: List of positive words in the input string found in the AFINN list.
  - Negative: List of negative words in the input string found in the AFINN list.
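
To make the sentiment fields above more concrete, here is a minimal Python sketch of an AFINN-style calculation. It is not the API's actual implementation: the function name analyze, the tiny LEXICON, its word values, and the tokenizer are all assumptions made for illustration. It also assumes that the comparative score is the total score divided by the number of tokens, as is common in AFINN-based libraries, and it does not clamp the total to any range.

```python
import re

# Tiny illustrative lexicon. The words and values are assumptions for this
# sketch; a real AFINN list contains thousands of entries rated -5 to +5.
LEXICON = {"good": 3, "great": 3, "love": 3, "bad": -3, "terrible": -3, "hate": -3}


def analyze(text: str) -> dict:
    """Return a sentiment record shaped like the fields described above."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())

    # Keep only the words that carry a valence in the lexicon.
    calculation = [{token: LEXICON[token]} for token in tokens if token in LEXICON]

    # Score: the sum of the valences of all recognized words.
    score = sum(LEXICON[token] for token in tokens if token in LEXICON)

    # Comparative (assumed definition): score normalized by token count.
    comparative = score / len(tokens) if tokens else 0.0

    return {
        "score": score,
        "comparative": comparative,
        "calculation": calculation,
        "positive": [t for t in tokens if LEXICON.get(t, 0) > 0],
        "negative": [t for t in tokens if LEXICON.get(t, 0) < 0],
    }


result = analyze("The food was great but the service was terrible and bad")
print(result["score"], result["positive"], result["negative"])
```

With this toy lexicon, one positive word and two negative words are recognized, so the summed score is negative and the comparative value is that sum spread across all eleven tokens.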