This API allows you to fetch the content, title and images from an article on the web. Using machine learning techniques, we determine which parts of the page are ads, menus and other boilerplate. Our service strips the irrelevant content and responds with a clean, structured version of the article.
Diffbot extracts data from web pages automatically and returns structured JSON. For example, our Article API returns an article's title, author, date and full text. Use the web as your database! We use computer vision, machine learning and natural language processing to add structure to just about any web page.
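
As a rough illustration of what calling such an article endpoint and reading its JSON might look like, here is a minimal Python sketch. The endpoint URL, the `token` and `url` query parameters, and the `objects` wrapper in the response are assumptions that are not stated in the text above, and the sample response is made up for the example. Check the official documentation for the actual request and response format.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; not taken from the description above.
ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"


def build_article_request(token, page_url):
    """Build the GET URL for an article extraction request.

    The parameter names `token` and `url` are assumptions.
    """
    query = urlencode({"token": token, "url": page_url})
    return f"{ARTICLE_ENDPOINT}?{query}"


def summarize_article(response_body):
    """Pick the title, author, date and full text out of a JSON response.

    Assumes the extracted article is the first entry of an `objects` list;
    missing fields come back as None.
    """
    payload = json.loads(response_body)
    objects = payload.get("objects") or [{}]
    article = objects[0]
    return {field: article.get(field) for field in ("title", "author", "date", "text")}


# Hypothetical response body, invented for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "objects": [{
        "title": "Example headline",
        "author": "Jane Doe",
        "date": "2020-01-01",
        "text": "Body of the article without ads or menus.",
    }]
})

if __name__ == "__main__":
    print(build_article_request("YOUR_TOKEN", "https://example.com/post"))
    print(summarize_article(SAMPLE_RESPONSE))
```

To try it against a live service, the URL from `build_article_request` could be fetched with any HTTP client and the response body passed to `summarize_article`.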