Tuesday, April 20, 2010

World's Greatest Cloud Service Provider

Along with several of my colleagues, I am particularly attracted to our cloud computing roadmap, and I spend a fair amount of time exploring related ideas and companies. It came as a bit of a surprise to find that the world's greatest cloud service provider was a virtual unknown just a week ago:

(Credit to my friends at Parallels for sharing this image with me.)

Wednesday, April 07, 2010

Spam is everywhere, not just in email

About five years ago, I focused on building highly valuable businesses on the back of user-generated content. I led my first related investment in 2005. It was in a San Francisco-based startup called Yelp. Over the following 18 months, I led three other Bessemer investments based on the same fundamental roadmap: Wikia, OLX and LinkedIn.

One nice thing about the roadmap approach to investing is that when you discover a compelling roadmap, follow it and identify talented entrepreneurs who share your conviction, you end up investing in some great companies. LinkedIn, Wikia and Yelp are each now among the top-100 ranked websites by Quantcast. It's a bit harder to measure the size of OLX because it’s largely used by consumers outside the USA across several domains (http://www.olx.com.br, http://www.olx.pt, http://www.olx.ru and many others), but it is probably the largest site in my portfolio and reached more than 100,000,000 unique visitors last month. Just as satisfying as these companies' success in pleasing consumers is that each has a proven, working business model. I wish all of my investment roadmaps were as productive as the one based on user-generated content.

But it's not all happiness in user-generated content land. There are lots of challenges with it, and perhaps at the top of the list is that you don't actually control your site's content. That leads to troublesome spam.

Spam at Wikia
Wikia is a giant collection of wikis that anyone can edit. I think it was about a month after we invested that one of my colleagues rushed into my office to ask when I had last visited Wikia's home page. Without hesitation, I pulled up the site and was greeted by a massive image of two naked adults exploring one of the more creative entries in the Kama Sutra. Sadly, it feels as though for every well-meaning Internet consumer, there's at least one evildoer. (The ratio is probably much better than 1:1, but the evildoers each seem to make the "contributions" of 10 or even 100 people, and so the "effective" ratio is probably much worse than 1:1.)

Spam at OLX
OLX provides free classified sites in almost 50 languages in 100 countries around the world. They now collect more than 100,000 new free postings every day. These posts advertise the likes of a used car for sale, an apartment for rent or a job opening. Incredibly, OLX's computer algorithms automatically tag 50% of those new postings as spam and delete them immediately. FIFTY percent! Even more incredible is that OLX tunes its algorithm to be extremely conservative: if it might not be spam, they don't delete it. Once the computers are done reviewing the content, a crack team of customer service agents in Buenos Aires manually reviews every remaining post. The human reviewers end up tagging half of what's left as spam too! That means for every useful post, there are three pieces of spam. Left unchecked, the spammers would completely destroy the utility of OLX because you'd have to wade through gobs of crap to find real posts.
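For anyone who wants to check the three-to-one math, here is a quick back-of-the-envelope sketch in Python. It assumes a round 100,000 postings per day and treats both spam rates as exactly 50%. The function and field names are mine, purely for illustration, and have nothing to do with how OLX actually works.

```python
def spam_breakdown(total_posts, auto_spam_rate=0.5, manual_spam_rate=0.5):
    """Split a day's postings into spam and useful posts.

    Assumptions (illustrative only): the automated filter removes
    auto_spam_rate of all postings, and human reviewers then remove
    manual_spam_rate of whatever is left.
    """
    auto_removed = total_posts * auto_spam_rate   # deleted by the algorithm
    remaining = total_posts - auto_removed        # passed on to human review
    manual_removed = remaining * manual_spam_rate # deleted by reviewers
    useful = remaining - manual_removed           # real posts that survive
    spam = auto_removed + manual_removed
    return {
        "auto_removed": auto_removed,
        "manual_removed": manual_removed,
        "useful": useful,
        "spam": spam,
        "spam_per_useful": spam / useful,
    }


result = spam_breakdown(100_000)
print(result)
```

Under those assumptions, 50,000 postings are deleted automatically, 25,000 more are removed by the reviewers, and 25,000 useful posts remain: three pieces of spam for every real one.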

Spam at Yelp
Yelp is a site where consumers post their subjective reviews of local businesses. Yelp identified the spam problem early in its development and came up with a highly effective way to filter out spam on its site. Spam in the context of Yelp is most often a shill review written to make a particular business look great (or its competitors look horrible). The sentiment embedded in Yelp reviews about most local establishments feels real because most of the spam has been removed. The opacity of Yelp's spam filter frustrated a number of business owners, and so Yelp just launched a new feature to let consumers look at the reviews that were automatically removed.

The result is amazing. In a random sample of business profile pages I just visited on Yelp, the "filtered" reviews account for anywhere from 10% to 50% of the total! Much of the spam is truly shameless. My favorite reaction to Yelp's new feature was from The Next Web, and it's priceless:
“Looking through the reviews that Yelp’s algorithm filtered out brings to mind a single thought: ‘Wow. It’s like viewing Yelp through Citysearch goggles.’”

Everyone knows that not a day goes by when spam doesn't invade our email inboxes. We use filters and other defenses to prevent spam from destroying the utility of email. It is less obvious that, if left unchecked, spam would also destroy many of the most useful properties on the Internet. Thankfully, some of the best user-generated content sites have developed sophisticated techniques to keep the spam at bay. Let's just hope they keep innovating fast enough to stay a step ahead of the bad guys.