Google Prediction API

Posted on May 19, 2010

This looks very cool.

The Prediction API enables access to Google’s machine learning algorithms to analyze your historic data and predict likely future outcomes. Upload your data to Google Storage for Developers, then use the Prediction API to make real-time decisions in your applications. The Prediction API implements supervised learning algorithms as a RESTful web service to let you leverage patterns in your data, providing more relevant information to your users. Run your predictions on Google’s infrastructure and scale effortlessly as your data grows in size and complexity.

Accessible from many platforms: Google App Engine, Apps Script (Google Spreadsheets), web & desktop apps, and command line.

The Prediction API supports CSV formatted training data, up to 100 MB in size. Numeric or unstructured text can be sent as input features, and discrete categories (up to a few hundred different ones) can be provided as output labels.
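To make the training-data description concrete, here is a minimal sketch of preparing such a CSV file in Python. It is an illustration, not Google's code: the function name, the label-in-first-column layout, the reading of the size limit as 100 MB, and the cap of 300 labels (standing in for "a few hundred") are all assumptions.

```python
import csv
import io

# Assumptions: the stated size limit is read as 100 MB, and 300 stands in
# for "a few hundred" distinct output labels.
MAX_BYTES = 100 * 1024 * 1024
MAX_LABELS = 300


def build_training_csv(examples):
    """Serialize (label, feature, ...) tuples to CSV text and check the limits.

    Assumes one example per row with the output label in the first column,
    followed by numeric or free-text input features.
    """
    buf = io.StringIO()
    # Quote text fields so commas inside unstructured text stay in one column.
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
    labels = set()
    for label, *features in examples:
        labels.add(label)
        writer.writerow([label, *features])

    data = buf.getvalue()
    size = len(data.encode("utf-8"))
    if size > MAX_BYTES:
        raise ValueError(f"training data is {size} bytes, over the size limit")
    if len(labels) > MAX_LABELS:
        raise ValueError(f"{len(labels)} distinct labels, over the label limit")
    return data


if __name__ == "__main__":
    # A tiny language-identification training set: label first, then text.
    print(build_training_csv([
        ("English", "The quick brown fox jumps over the lazy dog"),
        ("Spanish", "El veloz zorro marrón salta sobre el perro perezoso"),
    ]))
```

The same row layout would cover both kinds of input mentioned above: a structured example has several numeric feature columns after the label, while a text-classification example has a single free-text column.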

Example uses:

Language identification
Customer sentiment analysis
Product recommendations & upsell opportunities
Document and email classification

Related: The Second 5,000 Days of the Web - Robot Independently Applies the Scientific Method - Controlled Experiments for Software Solutions - Statistical Learning as the Ultimate Agile Development Tool by Peter Norvig

One Response to “Google Prediction API”

  1. Tom Parsons
    October 28th, 2010 @ 5:50 am

    This looks very interesting, and I’m wondering whether anyone has tried it with an email or document corpus. The only example I can find on the web shows a prediction analysis across listening tastes for music. I can see how this would work at a data level, and I guess it would use Google’s algorithms to analyse the text, but I’m not sure how you would actually use it or how you would “train” it.

    The movie recommendation example given in the Google code examples requires the data to be structured, while the spam/ham example appears to take unstructured data in a text file. I imagine you could train on a corpus of documents that a user likes and then use the model to predict which other documents that user would be interested in. This type of thing would be great for enterprise-level, context-based search.
