The sanitizer will remove certain tags (script, marquee, head, frame, menu, object, et al.). It retains predominantly 'content' tags. The sanitizer will remove most attributes. It will keep only hrefs on a tags and colspans on td/th tags. The sanitizer can be a great tool for cleaning up the HTML saved by the likes of Word and OpenOffice.
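The filtering rules described above can be illustrated with a minimal sketch built on Python's standard `html.parser`. This is not the service's implementation: the tag sets, the `Sanitizer` class and the `sanitize` helper are assumptions made for illustration, and only the attribute rule (href on a, colspan on td/th) comes from the description.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these tags are dropped together with everything inside them.
DROP_WITH_CONTENT = {"script", "head", "object", "marquee", "menu"}
# Assumption: an illustrative allowlist of "content" tags that are retained.
CONTENT_TAGS = {"p", "a", "b", "i", "em", "strong", "ul", "ol", "li",
                "h1", "h2", "h3", "table", "tr", "td", "th"}
# From the description: only href on a, and colspan on td/th, survive.
KEPT_ATTRS = {"a": {"href"}, "td": {"colspan"}, "th": {"colspan"}}


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.dropped_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.dropped_depth += 1
            return
        if self.dropped_depth or tag not in CONTENT_TAGS:
            return  # unknown tags (frame, span, o:p, ...) lose their markup
        allowed = KEPT_ATTRS.get(tag, set())
        attr_text = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in allowed and value is not None
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            if self.dropped_depth:
                self.dropped_depth -= 1
            return
        if self.dropped_depth or tag not in CONTENT_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.dropped_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

In this sketch, tags outside both sets keep their text but lose their markup, which is one plausible reading of how Word-specific elements would be handled.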
Parse, validate and get location information about a phone number. Use this API to validate local and international phone numbers. You can determine what kind of number this is (e.g. fixed line or mobile), the location of the number and also reformat the number into local and international dialing formats.
Diffbot extracts data from web pages automatically and returns structured JSON. For example, our Article API returns an article's title, author, date and full text. Use the web as your database! We use computer vision, machine learning and natural language processing to add structure to just about any web page.
The Mozscape API allows you to customize and integrate data from our dynamic Mozscape index right into your own applications. Our Mozscape index is updated frequently to ensure that you're getting the freshest look at the web possible. With billions of links in our index, intelligent metrics, and thorough URL data, our Mozscape API offers unlimited possibilities.
Free Search API with 1 million free queries per month. Web Search: more than 2 billion pages indexed; English, German and Chinese results; sorted by relevancy. News Search: news articles from newspapers, magazines and blogs; sorted by publishing date, with author and article image. Trending News: trending news, grouped by topic; topics sorted by buzz (number of sources reporting on the same topic); one main article per topic plus related links. Trending Topics: trending news, grouped by topic; topics sorted by buzz (number of sources reporting on the same topic); all full articles per topic, sorted by publishing time. Suggestions: autocomplete suggestions for query substrings and corrections for misspelled terms.
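The "buzz" ordering described above (topics ranked by the number of sources reporting on them) can be sketched as follows. The field names `topic`, `source` and `published`, the `trending_topics` function, and the newest-first ordering within a topic are all assumptions for illustration; the API's actual response format is not given here.

```python
from collections import defaultdict


def trending_topics(articles):
    """Group articles by topic and rank topics by number of distinct sources."""
    by_topic = defaultdict(list)
    for article in articles:
        by_topic[article["topic"]].append(article)

    topics = []
    for topic, items in by_topic.items():
        buzz = len({a["source"] for a in items})  # distinct reporting sources
        # Assumption: newest first; ISO 8601 timestamps sort correctly as text.
        ordered = sorted(items, key=lambda a: a["published"], reverse=True)
        topics.append({"topic": topic, "buzz": buzz, "articles": ordered})

    topics.sort(key=lambda t: t["buzz"], reverse=True)
    return topics


# Hypothetical sample data, not real API output.
SAMPLE = [
    {"topic": "election", "source": "paper-a", "published": "2014-01-02T08:00:00"},
    {"topic": "election", "source": "blog-b", "published": "2014-01-02T09:30:00"},
    {"topic": "weather", "source": "paper-a", "published": "2014-01-02T07:00:00"},
]
```

Calling `trending_topics(SAMPLE)` puts the election topic first because two distinct sources report on it.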
The PeerReach API allows you to give context to the content produced by any Twitter profile. PeerReach has analysed over 100 million accounts and can return information such as areas of expertise, interests, gender, age and location. This free version of our API allows you to make 2400 daily calls.
Rapleaf's Personalization API provides rich demographic data about email addresses in real time. You will need a Rapleaf API key in order to use this service. You can get one (for free) here: https://www.rapleaf.com/developers/api_access. With an API key, you will have free, unlimited access to age, gender, and city/state/country data. Additionally, we provide many other premium fields for a small price. Please visit our website or email us at [email protected] to get more info on pricing and our premium fields.
In an effort to create transparency and encourage technological innovation, the Bureau of Labor Statistics (BLS) is releasing its Application Programming Interface (API) to the public. The BLS Public Data API gives the public access to raw economic data from all BLS programs. It is the Bureau's hope that talented developers and programmers will use the BLS Public Data API to create original, inventive applications with published BLS data.
Multilingual sentiment analysis of texts from different sources (blogs, social networks, etc.). Besides polarity at sentence and global level, Sentiment Analysis uses advanced natural language processing techniques to also detect the polarity associated with both entities and concepts in the text. Sentiment Analysis also lets the user detect the polarity of user-defined entities and concepts, making the service a flexible tool applicable to any kind of scenario. Additionally, Sentiment Analysis detects whether the processed text is subjective or objective and whether it contains irony marks [beta], both at global and sentence level, giving the user additional information about the reliability of the polarity obtained from the sentiment analysis.
Automatic multilingual text classification according to pre-established categories defined in a model. The algorithm combines statistical classification with rule-based filtering, which yields a high degree of precision across very different environments. Three models are available: IPTC (International Press Telecommunications Council standard), EuroVocs and Corporate Reputation model. Languages covered are Spanish, English, French, Italian, Portuguese and Catalan.
The 3taps Search API is responsible for searching against the database of postings. For example, it can be used to find all postings from a particular data source, category and location, or to find postings with a given annotation value. A search request is made to the 3taps Search API, and search results are returned back to the caller. The search request can include any number of search criteria, and the results will be paginated to keep the search process manageable.
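The request/response flow described above (any number of criteria in, paginated results out) can be sketched against an in-memory list. This is a toy model, not the 3taps API: the `search` and `matches` functions, the field names `source`, `category`, `location` and `annotations`, and the `page`/`per_page` parameters are hypothetical.

```python
def matches(posting, criteria):
    """Return True if the posting satisfies every search criterion."""
    for key, wanted in criteria.items():
        if key == "annotations":
            have = posting.get("annotations", {})
            # Every requested annotation must be present with the given value.
            if any(have.get(name) != value for name, value in wanted.items()):
                return False
        elif posting.get(key) != wanted:
            return False
    return True


def search(postings, criteria, page=0, per_page=2):
    """Filter postings by criteria and return one page of the results."""
    hits = [p for p in postings if matches(p, criteria)]
    start = page * per_page
    return {
        "total": len(hits),
        "page": page,
        "results": hits[start:start + per_page],
    }


# Hypothetical sample postings, not real 3taps data.
POSTINGS = [
    {"id": 1, "source": "SRC1", "category": "VEHICLES", "location": "NYC",
     "annotations": {"make": "Honda"}},
    {"id": 2, "source": "SRC1", "category": "VEHICLES", "location": "NYC",
     "annotations": {"make": "Ford"}},
    {"id": 3, "source": "SRC2", "category": "VEHICLES", "location": "NYC",
     "annotations": {"make": "Honda"}},
    {"id": 4, "source": "SRC1", "category": "VEHICLES", "location": "NYC",
     "annotations": {"make": "Honda"}},
]
```

A caller would request page 0, read `total`, and keep requesting further pages until `results` comes back empty.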
The PubNub Network makes real-time communications simple with an easy API. It has two functions: send/receive (publish/subscribe). We provide a web-scale API for businesses to build scalable data push communication apps on mobile, tablet and web. Bidirectional JSON. Ask for commit access via Twitter (@pubnub) or via IRC (#pubnub on FreeNode).
Bike Sharing Networks Around the World. This project started in Barcelona, when I was unable to get a decent Android client to work with the Bicing system. We figured out that the same problem would apply to the rest of the world, so solving it for us locally could also work globally. Then we also found out how important this data was for everyone, and that there's no reason for companies to keep it private. Most of these bike sharing systems are built using public money, and we believe this data should be available to their own citizens. The main reason this project exists is for people to realize the benefits of providing free data.
The United States Code in JSON. The Code of Laws of the United States of America (variously abbreviated to Code of Laws of the United States, United States Code, U.S. Code, or U.S.C.) is a compilation and codification of the general and permanent federal laws of the United States. It contains 51 titles (along with a further 4 proposed titles). The main edition is published every six years by the Office of the Law Revision Counsel of the House of Representatives, and cumulative supplements are published annually. The current edition of the code was published in 2012, and according to the Government Printing Office, is over 200,000 pages long.
Recomio delivers recommendations with a few lines of code. With Recomio, you can build your own recommendation engine with simplicity, both on the web and mobile. All you need to do is use our free code, which instantly enables star-based ratings, likes, item relations and recommendations. Recomio will then aggregate all the information in real time, provide recommendations and predict user behavior by analyzing user and item patterns and relationships.
Tagdef.com is the world's largest hashtag dictionary. Use this API to access these definitions. The content is user-generated, and the directory currently contains over 60,000 definitions. Each hashtag can have many definitions, ordered by user votes. The hashtags are often related to Twitter, but are also commonly used on Pinterest and Google+. This API is free to use, but you must provide a clickable link back to the tagdef.com page for the given tag when using information from the API. This link can be found in the uri part of the reply.
Synapsify’s REST API is based on a core set of linguistic algorithms, utilizing phonemics, natural language processing (NLP), machine learning, etc. The API has been designed to be as flexible as possible, and can be adapted to a very wide range of domains and fields of interest, including: Reading any type of written content based on several dimensions; Revealing the quality, balance, credibility and quotability of such content and its most important topics and phrases; Indexing and matching against other written content or customized indexes; Enhancing the ability to discover, understand and segment actionable insights.
DuckDuckGo Zero-click Info includes topic summaries, categories, disambiguation, official sites, !bang redirects, definitions and more. You can use this API for many things, e.g. to define people, places, things, words and concepts; provide direct links to other services (via !bang syntax); list related topics; and give official sites when available.
Topics Extraction tags locations, people, companies, dates and many other elements appearing in a text written in Spanish, English, French, Italian, Portuguese or Catalan. This detection process combines a number of complex natural language processing techniques that produce morphological, syntactic and semantic analyses of a text and use them to identify different types of significant elements.
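As a rough sketch of how a client might use such an API, the helpers below build a query URL and pull a few fields out of a decoded JSON answer. The endpoint `https://api.duckduckgo.com/`, the `q`, `format` and `no_html` parameters, and the `Heading`, `AbstractText`, `Redirect` and `RelatedTopics` field names reflect DuckDuckGo's public Instant Answer API as best I know it, not anything stated in this listing, so check them against the current documentation. `build_query_url` and `summarize` are hypothetical helper names, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Instant Answer API.
BASE_URL = "https://api.duckduckgo.com/"


def build_query_url(query, no_html=True):
    """Build a request URL asking for a JSON answer to the query."""
    params = {"q": query, "format": "json"}
    if no_html:
        params["no_html"] = "1"  # ask for plain text instead of HTML markup
    return BASE_URL + "?" + urlencode(params)


def summarize(answer):
    """Pick the commonly used fields out of a decoded JSON answer."""
    related = [
        topic["Text"]
        for topic in answer.get("RelatedTopics", [])
        if "Text" in topic  # skip grouped entries that carry no text
    ]
    return {
        "heading": answer.get("Heading", ""),
        "abstract": answer.get("AbstractText", ""),
        "redirect": answer.get("Redirect", ""),  # set for !bang queries
        "related": related,
    }
```

A !bang query such as `build_query_url("!w python")` is percent-encoded like any other query string.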
Get your API key at https://api.bitext.com/#/signup/api or contact us at [email protected]. You can use this REST API to perform: Sentiment Analysis: Structure every part of an opinion into positive/neutral/negative, identify the topic of the opinion and the sentiment expression used, and get a numeric value for each of the sentiment-bearing phrases. Entity Extraction: Extract names, places, firms, Twitter users and other entities from text. Categorization: Classify text using a custom-built taxonomy. Concept Analysis: Linguistically based structuring of text.
API for the United States Federal Court System (commonly referred to as PACER). Search for cases, get detailed docket information, and download filings. Save money by downloading cached copies of documents. Docket Alarm is for: (1) Financial institutions that want to run background litigation checks on companies; (2) Document management or doc review systems that want to automatically sync data with the courts; and (3) Law firms that want to automate tracking court filings in their cases.
Similarsitecheck is a free and open search engine to find similar and related websites. Our specially developed similarity algorithm Similarsitecheck helps to find alternative webpages. We provide an API for developers or whoever would like to use our data. It's free to get started using the API. There is a rate limit of 5,000 queries per day. If you expect to exceed that, please get in touch with us. Input a domain and receive an output of 20 similar websites with title, description, domainpower, website language and their similarity score.
This is an unofficial Pinterest API. Pinterest is a pinboard-style photo sharing website that allows users to create and manage theme-based image collections such as events, interests, hobbies, and more. Users can browse other pinboards for inspiration, 're-pin' images to their own pinboards, or 'like' photos. (Credits to http://pinterestapi.co.uk for the user-specific Pinterest APIs).
Trendn aggregates the most viral and socially-shared content on the web. Ranking is based on social engagement, which refers to how interesting or relevant people have found an item or category to be. Examples of engagement include sharing with your friends, bookmarking an article, leaving a comment on a blog, or clicking a link to read a news item.
Precisely annotate text with fine senses using the world's only API that disambiguates both common words (all parts of speech) and proper nouns (NEs) with near-human accuracy. Use specialized recipes for well-formed text, queries, and social media (e.g. tweets). Get lexical annotation, statistical confidence scores, external links (Wikipedia, Twitter verified accounts, etc.), and precise classification of NEs. Tags: disambiguation, wsd, text analytics, language, sense annotation, semantic, extraction. Documentation: http://idilia.com/docs/rest_api/text-disambiguate Developer forum: http://groups.google.com/forum/?fromgroups#!forum/idilia-developers