Say Hello to Highlighter

Most people have encountered some sort of personalization in their use of the Internet.  If you’ve bought anything on Amazon, you have definitely seen the “People who bought this also bought….”  It’s an amazing feature that makes Amazon a lot more useful, but, sadly, it only works on Amazon.  Why can’t Amazon purchases give me great recommendations on eBay?  Why don’t the things I read on the LA Times help me find great stuff on USA Today?  Wouldn’t it be great if the whole web could understand me and show me great stuff everywhere I go?
Today, we are excited to launch Highlighter, a little Chrome extension that is designed to “highlight” a personalized newsfeed of what you might want to read on any content site you visit.  No biggie, it just personalizes the whole freakin’ Internet.
Here’s how it works:
  1. Install Highlighter for your Chrome browser from the Google store here.
  2. Highlighter uses your recent browsing history to get to know you and build your Interest Graph based on the content you’ve been reading.
  3. As you surf the web, Highlighter shows this handy dandy tab when it thinks there are stories on the current site you’re visiting that you’ll like.
  4. Click the tab to see the whole newsfeed of content Highlighter has picked just for you.
  5. The more you use it, the smarter Highlighter gets.
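To make steps 2 through 4 a little more concrete, here is a toy sketch of the idea. It is not how Highlighter actually works: the topic labels, the `build_interest_profile` and `rank_stories` helpers, and the simple counting score are all assumptions made up for illustration.

```python
from collections import Counter

def build_interest_profile(history_topics):
    """Count how often each topic shows up across recently read pages."""
    profile = Counter()
    for topics in history_topics:
        profile.update(topics)
    return profile

def rank_stories(profile, stories, min_score=1):
    """Score each story by summing the profile weight of its topics.

    Stories scoring below min_score are dropped; the rest are returned
    best first, with ties broken alphabetically by title.
    """
    scored = []
    for title, topics in stories.items():
        score = sum(profile[t] for t in topics)
        if score >= min_score:
            scored.append((score, title))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]

# Hypothetical topics extracted from pages in the browsing history.
history = [["baseball", "dodgers"], ["baseball", "mlb"], ["cooking"]]

# Hypothetical stories available on the site currently being visited.
stories = {
    "World Series preview": ["baseball", "mlb"],
    "Best pasta recipes": ["cooking"],
    "Stock market update": ["finance"],
}

newsfeed = rank_stories(build_interest_profile(history), stories)
print(newsfeed)  # ['World Series preview', 'Best pasta recipes']
```

In this toy version, "getting smarter" (step 5) would simply mean the counts in the profile keep growing as more pages are read.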
We are always looking for ways to make tools like Highlighter better. Feel free to inundate us with any and all feedback at
We hope you enjoy it!

Announcing Gravity Labs!

I am pleased to announce the launch of Gravity Labs, an initial peek into our underlying interest graph infrastructure as well as a showcase of some of our Open Source projects.

For the last 2+ years we have been working on productionizing a web-scale system that leverages a wide range of unique disciplines, from natural language processing to large scale semantics and ontology development, to real time behavioral algorithms, all the way to a variety of different machine learning techniques.

We didn’t set out to build a complex system that encompasses so many different disciplines; we set out to personalize the web.  Unfortunately, it didn’t take long for us to realize that the generally accepted collaborative filtering and behavioral targeting algorithms available today didn’t meet many of our core requirements.  There are quite a few, but the primary requirements for a cloud-based, web-scale personalization engine are:

Real time capable:

  • New user events occur tens, hundreds, and even millions of times per second.  A user’s personalized experience needs to update in real time as each event, or group of events, occurs.
  • New content is created across the web at similar rates.  Content needs to be available for recommendation to a user immediately upon its creation.


  • Signal generated by a user on one site needs to be applicable to that user’s recommendations on all other sites across the web.  Just as users are now able to interact with their social graph as they use the broader web, they need to be able to take their interest graph (or “personalization profile”) with them to every website they visit, and be able to both apply it and augment it anywhere.


  • We are all unique.  You can’t put everyone in a bucket.  While neighborhood/bucketing based algorithms do work (and are one, albeit small, component of our infrastructure), they make generalizations about people’s actions in order to enable scalability at the cost of accuracy.  A true personalization engine should group users together as little as possible, and treat each individual as a unique entity with a unique set of interests.

Filter Safe:

  • The fears of the filter bubble are real, and existing personalization and contextual recommendation engines often drive users down a narrower and narrower content discovery path.  A successful personalization engine needs to have the capability to inject serendipity into a user’s experience at an individual level.  Both the general, real time consensus of content that is important across the global web regardless of a user’s interests, and the semantic relationships between very different but highly connected interests, need to be taken into account.

It has been (and will continue to be) quite a challenge.  It has required the minds of very different people with many different core skillsets.

And it took a long time.  Candidly, one reason we have been so quiet about our development efforts is that we wanted to make sure we could get far enough ahead of everyone else :).  We’ve popped in and out of the news with test/data acquisition products here and there, but the goal has always been a system that can accurately process all of the interest-based signal data across the entire web, and leverage it to personalize every user’s internet experience.

We are proud to announce that the above system, or the “Gravity Interest Service” as we call it internally, officially went live at production scale 6 months ago.

Since then we have:

  * Created over 400 million user interest graphs

  * Served over 13 million pieces of personalized content per day

  * Personalized the daily internet experience of tens of millions of users per month

  * Processed over 25 million inbound interest signals per day

And with our current growth rate we will be handling 10X all of these numbers in under 6 months.
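For a rough sense of what that implies, the quick calculation below works out the compounding monthly growth needed to reach 10X in exactly 6 months. It assumes steady month-over-month growth, which is a simplification, and since the target is "under 6 months" the result is a lower bound. The daily figures are the ones listed above; the variable names are just for illustration.

```python
# Figures from the list above (per day).
content_served_per_day = 13_000_000
signals_per_day = 25_000_000

growth_factor = 10   # "10X"
months = 6           # upper bound: "in under 6 months"

# Steady compounding: multiplier ** months == growth_factor
monthly_multiplier = growth_factor ** (1 / months)
monthly_growth_pct = (monthly_multiplier - 1) * 100

print(f"Required monthly growth: at least {monthly_growth_pct:.1f}%")
print(f"Content served per day at 10X: {content_served_per_day * growth_factor:,}")
print(f"Signals processed per day at 10X: {signals_per_day * growth_factor:,}")
```

That works out to roughly 47% growth every month, sustained for half a year.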

It’s an exciting time for us here, so we have decided to give a (small) peek under the hood, as well as open source some of our non-core components.  We leverage a significant amount of open source software for a good portion of our data storage and processing, and want to contribute what we can back to the community.

Thanks for your interest in Gravity. There is a lot more coming in the very near future, but our new Beta Labs Section should give you enough to play with until then.



Twinterest: A New Twitter Game by Gravity

Today we launched Twinterest on stage at The Web 2.0 Summit 2010 in San Francisco!

Twinterest is a Twitter-based game that analyzes your tweets to figure out what you’re into and shows how your interests compare to your friends’. You can play Twinterest here:

Here’s how the game works. First, you connect with Twitter. Next, we pull your tweet history and use our natural language processing technology to determine your interests based on what you’ve said on Twitter. After we process your tweets, we create a personal interest report for you. The report shows your interests and how they compare to those of your friends who have already played. You can tweet your results to your friends and followers or @mention specific friends so that they’ll play and you can compare interests. Give it a whirl and let us know what you think.
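If you’re curious what "comparing interests" could look like in code, here is a minimal sketch. It uses Jaccard similarity (shared interests divided by all interests across both people), which is a stand-in chosen for illustration and not necessarily the measure Twinterest uses. The `interest_overlap` function and the sample interests are hypothetical.

```python
def interest_overlap(mine, theirs):
    """Return (Jaccard similarity, shared interests) for two interest lists."""
    mine, theirs = set(mine), set(theirs)
    union = mine | theirs
    if not union:
        # Neither person has any interests yet, so there is nothing to compare.
        return 0.0, set()
    shared = mine & theirs
    return len(shared) / len(union), shared

# Hypothetical interests, as if already extracted from each person's tweets.
my_interests = ["photography", "startups", "coffee"]
friend_interests = ["coffee", "startups", "cycling", "jazz"]

score, shared = interest_overlap(my_interests, friend_interests)
print(f"Overlap: {score:.0%}, in common: {sorted(shared)}")
# Overlap: 40%, in common: ['coffee', 'startups']
```

The hard part, of course, is the step this sketch skips: turning raw tweets into a clean list of interests in the first place.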

Twinterest is the first game built on our platform. We built Twinterest for a few reasons. First, we wanted to show you your Interest Graph, an online representation of your real-world interests, and give you a sense of how we can apply it. In this case, Twinterest shows you what you have in common with your friends and followers on Twitter. Second, we wanted to tune our interest graphing algorithms and our ontology. Whenever a user removes an interest, our interest service gets a little smarter. This crowdsourcing is invaluable for making our platform better. Finally, we wanted to establish a connection with users so that we can show them future products built on their interest graphs.

Enjoy Twinterest, and please bear with us while it scales. We are working hard to make sure you have a great experience!