Compare ratings, e.g. for two items from an online store, and estimate the probability that the first item is truly more highly rated on average than the second. The computation uses a state-of-the-art empirical Bayesian model that requires no prior distribution to be specified and has been trained on a large database of Amazon reviews.
Currently RaterBayes accepts data in the 1-star to 5-star rating format and assumes that the number of stars given by each person is known, but additional features are planned for working with other types of data.
A new endpoint allows the URLs of two Amazon products to be specified directly for comparison.
Small samples warning
So that its normal distribution approximations hold, RaterBayes currently requires at least 10 data points (i.e. ratings) in total and at least 1 data point for each star rating. These requirements apply to both item A and item B, and no result will be returned for queries that do not meet them. Features will be developed to work around this limitation and provide output for small samples.
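The requirements above can be checked on the client side before sending a query. The following is a minimal sketch in Python, not part of RaterBayes itself; the function names meets_requirements and query_is_valid are hypothetical and chosen only for illustration.

```python
from typing import Sequence

MIN_TOTAL_RATINGS = 10  # minimum number of ratings per item, as stated above
MIN_PER_STAR = 1        # minimum number of ratings at each star level


def meets_requirements(counts: Sequence[int]) -> bool:
    """Return True if one item's star counts satisfy the stated minimums.

    `counts` holds the number of 1-star to 5-star ratings, in that order.
    """
    if len(counts) != 5:
        raise ValueError("expected exactly five counts (1-star to 5-star)")
    enough_total = sum(counts) >= MIN_TOTAL_RATINGS
    every_star_present = all(c >= MIN_PER_STAR for c in counts)
    return enough_total and every_star_present


def query_is_valid(item_a: Sequence[int], item_b: Sequence[int]) -> bool:
    """Both items must meet the requirements for a result to be returned."""
    return meets_requirements(item_a) and meets_requirements(item_b)
```

For example, an item with counts 0, 2, 3, 4, 5 fails the check because it has no 1-star ratings, even though it has 14 ratings in total.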
Parameters are:
A1 : Number of 1-star reviews for item A
A2 : Number of 2-star reviews for item A
A3 : Number of 3-star reviews for item A
A4 : Number of 4-star reviews for item A
A5 : Number of 5-star reviews for item A
B1 : Number of 1-star reviews for item B
B2 : Number of 2-star reviews for item B
B3 : Number of 3-star reviews for item B
B4 : Number of 4-star reviews for item B
B5 : Number of 5-star reviews for item B
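As an illustration of how the ten parameters fit together, the sketch below assembles them into a URL query string using the Python standard library. The base URL is a placeholder and the use of a GET-style query string is an assumption; neither is taken from the RaterBayes documentation, and build_query is a hypothetical helper name.

```python
from typing import Sequence
from urllib.parse import urlencode

# Placeholder only: the real endpoint address is not given in this text.
BASE_URL = "https://example.invalid/raterbayes"


def build_query(item_a: Sequence[int], item_b: Sequence[int],
                base_url: str = BASE_URL) -> str:
    """Map star counts for items A and B onto the parameters A1..A5, B1..B5."""
    if len(item_a) != 5 or len(item_b) != 5:
        raise ValueError("each item needs five counts (1-star to 5-star)")
    params = {}
    for star, count in enumerate(item_a, start=1):
        params[f"A{star}"] = int(count)
    for star, count in enumerate(item_b, start=1):
        params[f"B{star}"] = int(count)
    # Assumption: parameters are passed as an ordinary URL query string.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query([2, 3, 5, 10, 20], [1, 1, 2, 3, 3]))
```

Here item A has 2 one-star ratings up to 20 five-star ratings, and item B has 1 one-star rating up to 3 five-star ratings.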
Output
The output is 3 JSON fields giving:
Statistical method and modelling data
RaterBayes uses the same modelling approach as the Bigger or False Discovery Rate (BFDR), implemented in the priorsplitteR R package. For details of the statistical approach, simulation experiments, and applications of the BFDR method to human genetic data, please read this preprint.
More documentation will be added shortly on how the model was fitted to data from Amazon ratings. It is hoped that more ratings databases will be added in the future, to allow selection of the most appropriate prior distribution for the user’s rating data.