OSoMe - Observatory on Social Media

FREE
By truthy
Updated a month ago
Education
Popularity Score: 7.1/10
Latency: 6491ms
Success Rate: 87%

OSoMe - Observatory on Social Media API Overview

OSoMe (pronounced awe•some) lets you submit queries to our massive social media database. Given a set of Twitter hashtags and a time period, we can count the number of matching tweets, generate a time series, count the number of tweets from each user, or just list matching Tweet IDs. You can then use Twitter's REST API to retrieve more information about tweets or users of interest.

Overview

Ever wanted to know which is more popular: #ff or #tbt? With Truthy's API, you can! Search our large sample of tweets by hashtag and obtain time series, counts by user, tweet counts, or Tweet IDs.

Due to Twitter's TOS we are unable to share the content of the tweets, but you can retrieve a list of tweet IDs and then use Twitter's API to retrieve their content.

We currently offer just a few types of searches and outputs, but expect each to be expanded.

Queries

A query contains three main parts:

  • start & end: These designate the time period for the search. Specify these as Dates like 2016-01-07 or Datetimes like 2016-01-07T12:34:56.
  • q: The query term(s). Specify a single term or a comma-separated list of terms. Any tweet matching any of these terms is counted, i.e. this is an OR query.

Queries must consist of #hashtags and/or URLs (i.e. not just words). Here are some tips about constructing queries:

  • Trailing wildcards are allowed: #academyaward* will match e.g. #academyaward, #academyawards, and #academyawards2017.
  • Aggregate multiple hashtags/URLs on one chart line using a comma-separated list: #oscars, #academyaward*
  • URL schemes and subdomains must be specified separately: http://cnn.com*, https://cnn.com*, http://m.cnn.com*, https://m.cnn.com*
  • Hashtag queries must consist of either at least two characters (e.g. #ff), or at least three characters with a trailing wildcard (e.g. #ows*).
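
The rules above can be checked on the client before a query is sent. The following is a minimal sketch, not part of the API: the function name buildQuery is hypothetical, and it assumes the length limits count the characters after the "#" and before any trailing "*" (which is what the #ff and #ows* examples suggest).

```javascript
// Hypothetical helper: validates terms against the query rules and joins
// them into the comma-separated `q` value alongside `start` and `end`.
function buildQuery(terms, start, end) {
  const cleaned = terms.map((term) => term.trim());
  for (const term of cleaned) {
    if (term.startsWith("#")) {
      const wildcard = term.endsWith("*");
      // Characters after "#" and before any trailing "*" (assumed rule).
      const body = term.slice(1, wildcard ? -1 : undefined);
      const minLength = wildcard ? 3 : 2;
      if (body.length < minLength) {
        throw new Error(`Hashtag too short: ${term}`);
      }
    } else if (!/^https?:\/\//.test(term)) {
      // Plain words are not accepted; only #hashtags and URLs are.
      throw new Error(`Term must be a #hashtag or a URL: ${term}`);
    }
  }
  return { q: cleaned.join(","), start, end };
}
```

For example, buildQuery(["#oscars", "#academyaward*"], "2016-01-07", "2016-01-14") yields an object whose q value is "#oscars,#academyaward*", while buildQuery(["oscars"], ...) throws because a bare word is not a valid term.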

As detailed below, all queries return JSON objects containing links to fetch the query output.

Time Series

Counts the number of tweets per day matching the query. Output is a whitespace-separated file with column 1 containing dates in ISO format and column 2 containing the number of statuses measured that day.

NOTE: This is the number of Tweets measured by OSoMe, which represents approximately 10% of the total Twitter stream.

Example output:

2016-01-01       833
2016-01-02       744
2016-01-03       786
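
Once the result file has been downloaded, the two columns are easy to turn into structured data. This is a small sketch assuming the file looks exactly like the example output above; parseTimeSeries is a hypothetical name, not something the API provides.

```javascript
// Hypothetical parser for the time series output:
// one "<ISO date> <count>" pair per line, separated by whitespace.
function parseTimeSeries(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // ignore blank lines
    .map((line) => {
      const [date, count] = line.split(/\s+/);
      return { date, count: Number(count) };
    });
}
```

Each entry keeps the date as a string, so it can be passed to a charting library or to new Date(...) as needed.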

Tweet Counts

For each hashtag returned in the search, counts the number of tweets containing this hashtag. Output is a whitespace-separated file with column 1 containing the hashtags and column 2 containing the counts.

This query is often combined with wildcards, e.g. #partyinthe*, as each matching hashtag will be counted separately.

Example output:

#partyinthehouse    3
#partyinthebarn    1
#partyintheamtour    8
#partyinthebackfield    12
#partyinthepaint    2
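
When a wildcard query matches many hashtags, it is often useful to see the most frequent ones first. A possible sketch, assuming the output format shown above (rankHashtags is a hypothetical name):

```javascript
// Hypothetical helper: reads "<hashtag> <count>" lines and returns them
// ordered from the most to the least frequent hashtag.
function rankHashtags(text) {
  const rows = [];
  for (const line of text.split("\n")) {
    const parts = line.trim().split(/\s+/);
    if (parts.length !== 2) continue; // skip blank or malformed lines
    rows.push({ hashtag: parts[0], count: parseInt(parts[1], 10) });
  }
  return rows.sort((a, b) => b.count - a.count);
}
```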

Tweet IDs

Retrieve a list of Tweet IDs matching any element of the query. Output is a single column of Tweet IDs, separated by newlines.

These Tweet IDs are suitable for piping into Twitter's REST API in order to obtain fully-hydrated tweet objects.

Example output:

682712956589703168
682713052937064448
682713426477604864
682714428970778624
682714667865608192
...
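
Two practical points when handling this output in JavaScript: Tweet IDs are larger than Number.MAX_SAFE_INTEGER, so they should stay strings, and lookups against Twitter's API are usually done in batches. The sketch below groups IDs into batches; the default batch size of 100 is an assumption based on the limit of Twitter's tweet lookup endpoints, so check Twitter's current documentation before relying on it. chunkTweetIds is a hypothetical name.

```javascript
// Hypothetical helper: splits a Tweet ID file into batches of ID strings.
// IDs are never converted to numbers, which would lose precision.
function chunkTweetIds(text, batchSize = 100) {
  const ids = text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^\d+$/.test(line)); // keep only lines that are IDs
  const batches = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    batches.push(ids.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch can then be joined with commas and passed as the ID list of a single lookup request.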

User Post Count

Counts the number of tweets from each user matching the query. Output is a whitespace-separated file with column 1 containing the username and user id, and column 2 containing the number of posts from that user matching the query.

Example output:

0nlyYoutubers(2504266956)    1
3eeriiq(415720323)    1
412Carter(2342215570)    1
444_nal4b(2365799569)    2
59cent_(272158291)    1
9Assel(568993835)    2
A1_Cyber(272374050)    1
ADDAMPINKK(2576073802)    1

PROTIP: Command-line wizards can use sort -rnk2 <filename> to sort this file in descending order by number of posts.
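
For those who would rather stay in Node.js, the same descending sort can be sketched as follows. It assumes every line follows the username(userid) count layout of the example output; parseUserCounts is a hypothetical name.

```javascript
// Hypothetical parser for the user post count output.
// Each line looks like: username(userid)    count
function parseUserCounts(text) {
  const rows = [];
  for (const line of text.split("\n")) {
    const match = line.trim().match(/^(.+)\((\d+)\)\s+(\d+)$/);
    if (!match) continue; // skip blank or unexpected lines
    rows.push({
      username: match[1],
      userId: match[2], // kept as a string, like Tweet IDs
      posts: Number(match[3]),
    });
  }
  // Descending by number of posts, like `sort -rnk2`.
  return rows.sort((a, b) => b.posts - a.posts);
}
```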

Responses

Mashape enforces timeout restrictions on requests, but sometimes queries of this large database can take a minute or two to return results. In order to allow this, most of our queries will quickly return a JSON object containing a result_url. You can then GET that URL, which will redirect you to the result file once the query finishes. Both the result_url and the results file should be available for at least a day after your query finishes.

The "Tweet IDs" query works slightly differently. This is a fast query that often returns a large number of results. Thus the response JSON has a files entry, a list containing URLs, each pointing to a file full of Tweet IDs.

While this workflow may seem complicated, it is necessary in order for us to reliably serve such large files of results.
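
The two response shapes described above can be handled by one small routine. This is a sketch under stated assumptions: it relies only on the result_url and files fields named in this section, uses the global fetch available in Node.js 18 and later, and the function names are hypothetical. It does not retry or poll; it simply follows the redirect that result_url issues once the query finishes.

```javascript
// Hypothetical helper: returns the list of URLs to download for a query
// response. "Tweet IDs" responses carry `files`; others carry `result_url`.
function collectResultUrls(queryResponse) {
  if (Array.isArray(queryResponse.files)) {
    return queryResponse.files;
  }
  if (typeof queryResponse.result_url === "string") {
    return [queryResponse.result_url];
  }
  throw new Error("Response contains neither result_url nor files");
}

// Hypothetical helper: downloads every result file as text.
// fetchImpl defaults to the global fetch, which follows redirects.
async function downloadResults(queryResponse, fetchImpl = fetch) {
  const bodies = [];
  for (const url of collectResultUrls(queryResponse)) {
    const res = await fetchImpl(url);
    if (!res.ok) {
      throw new Error(`Request for ${url} failed with status ${res.status}`);
    }
    bodies.push(await res.text());
  }
  return bodies;
}
```

Because result files for slow queries may take a minute or two to appear, callers may want to set a generous timeout around downloadResults.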

Troubleshooting

If your result_url redirects you to a URL that responds with a "404 Not Found" error, either your query returned no results, or else there was an error with the query. In order to aid troubleshooting, you can take a look at the query log files. Your result_url will redirect to a URL that looks something like http://osome.iuni.iu.edu/moe/jobs/2016-04-29/9ec9668b-f3e2-4e04-9625-4a7efd953bd7/data/timestampCount_0.txt. In this case, you can point a web browser at the job root directory, http://osome.iuni.iu.edu/moe/jobs/2016-04-29/9ec9668b-f3e2-4e04-9625-4a7efd953bd7/. In this directory, you will find at least three items:

  • output.log & error.log: Contain information about the query process. These are valuable for troubleshooting failed queries.
  • data: This folder contains the results of your query. Depending on the query type, the results may be in a subdirectory called mrOutput. When present, the tweetIds subdirectory contains Twitter tweet IDs for tweets matching the query.
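
Deriving the job root directory from a result file URL can be automated. The sketch below assumes result URLs always contain a /data/ segment, as in the example URL above; jobRootUrl and jobLogUrls are hypothetical names.

```javascript
// Hypothetical helper: strips everything from "data/" onward, leaving the
// job root directory (with its trailing slash).
function jobRootUrl(resultFileUrl) {
  const index = resultFileUrl.indexOf("/data/");
  if (index === -1) {
    throw new Error("URL does not contain a /data/ segment");
  }
  return resultFileUrl.slice(0, index + 1);
}

// Hypothetical helper: builds the URLs of the two log files in the job root.
function jobLogUrls(resultFileUrl) {
  const root = jobRootUrl(resultFileUrl);
  return { output: root + "output.log", error: root + "error.log" };
}
```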

Rate Limits

Mashape users are currently limited to 100 queries per day. This will be adjusted as we observe traffic and performance.

Attribution

Please cite the following publication when using this data:

Davis CA, Ciampaglia GL, Aiello LM, Chung K, Conover MD, Ferrara E, Flammini A, Fox GC, Gao X, Gonçalves B, Grabowicz PA, Hong K, Hui P, McCaulay S, McKelvey K, Meiss MR, Patil S, Peli Kankanamalage C, Pentchev V, Qiu J, Ratkiewicz J, Rudnick A, Serrette B, Shiralkar P, Varol O, Weng L, Wu T, Younge AJ, Menczer F. (2016) OSoMe: The IUNI observatory on social media. PeerJ Preprints 4:e2008v1 doi: 10.7287/peerj.preprints.2008v1

or as BibTeX:

@Article{Davis2016,
  Title   = {{OSoMe}: The {IUNI} observatory on social media},
  Author  = {Davis, Clayton A and Ciampaglia, Giovanni Luca and Aiello, Luca Maria and Chung, Keychul and Conover, Michael D and Ferrara, Emilio and Flammini, Alessandro and Fox, Geoffrey C and Gao, Xiaoming and Gon{\c{c}}alves, Bruno and Grabowicz, Przemyslaw A and Hong, Kibeom and Hui, Pik-Mai and McCaulay, Scott and McKelvey, Karissa and Meiss, Mark R and Patil, Snehal and Peli Kankanamalage, Chathuri and Pentchev, Valentin and Qiu, Judy and Ratkiewicz, Jacob and Rudnick, Alex and Serrette, Benjamin and Shiralkar, Prashant and Varol, Onur and Weng, Lilian and Wu, Tak-Lon and Younge, Andrew J and Menczer, Filippo},
  Journal = {{PeerJ} Preprints},
  Year    = {2016},
  Pages   = {e2008v1},
  Volume  = {4},
  Doi     = {10.7287/peerj.preprints.2008v1}
}

Install SDK for NodeJS

Installing

To use unirest for Node.js, install the npm module:

$ npm install unirest

After installing the npm package you can now start simplifying requests like so:

var unirest = require('unirest');

Creating Request

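// The original snippet omits the HTTP method and the endpoint path; GET is
// assumed here. Append the path of the endpoint you want to call to the URL.
unirest.get("https://osome-public.p.rapidapi.com")
.header("X-RapidAPI-Key", "YOUR_RAPIDAPI_KEY") // placeholder: use your own key
.header("Content-Type", "application/x-www-form-urlencoded")
.end(function (result) {
  console.log(result.status, result.headers, result.body);
});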