By now, you may have read Danny Sullivan’s recent post: “Google: Bing is Cheating, Copying Our Search Results” and heard Microsoft’s response, “We do not copy Google's results.” However you define copying, the bottom line is, these Bing results came directly from Google.
I’d like to give you some background and details of the experiments that led us to understand just how Bing is using Google web search results.
It all started with tarsorrhaphy. Really. As it happens, tarsorrhaphy is a rare surgical procedure on eyelids. And in the summer of 2010, we were looking at the search results for an unusual misspelled query [torsorophy]. Google returned the correct spelling—tarsorrhaphy—along with results for the corrected query. At that time, Bing had no results for the misspelling. Later in the summer, Bing started returning our first result to their users without offering the spell correction (see screenshots below). This was very strange. How could they return our first result to their users without the correct spelling? Had they known the correct spelling, they could have returned several more relevant results for the corrected query.
This example opened our eyes, and over the next few months we noticed that URLs from Google search results would later appear in Bing with increasing frequency for all kinds of queries: popular queries, rare or unusual queries and misspelled queries. Even search results that we would consider mistakes of our algorithms started showing up on Bing.
We couldn’t shake the feeling that something was going on, and our suspicions became much stronger in late October 2010 when we noticed a significant increase in how often Google’s top search result appeared at the top of Bing’s ranking for a variety of queries. This statistical pattern was too striking to ignore. To test our hypothesis, we needed an experiment to determine whether Microsoft was really using Google’s search results in Bing’s ranking.
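To make the statistical pattern concrete, here is a minimal sketch of one way to measure how often two engines agree on the top result. It is illustrative only and is not the analysis described in this post: the function name `top1_agreement_rate`, the queries, and the URLs are all hypothetical placeholders, and it assumes each engine's results are available as a ranked list of URLs per query.

```python
def top1_agreement_rate(results_a, results_b):
    """Fraction of shared queries whose top-ranked URL is identical in both engines.

    Each argument maps a query string to a ranked list of URLs (best first).
    Queries missing from either engine, or with an empty list, are skipped.
    """
    shared = [q for q in results_a
              if q in results_b and results_a[q] and results_b[q]]
    if not shared:
        return 0.0
    matches = sum(1 for q in shared if results_a[q][0] == results_b[q][0])
    return matches / len(shared)


# Hypothetical sample data; every query and URL here is a placeholder.
engine_a = {
    "query one": ["https://example.com/a", "https://example.com/b"],
    "query two": ["https://example.org/x"],
    "query three": ["https://example.net/m"],
    "query four": ["https://example.com/z"],
}
engine_b = {
    "query one": ["https://example.com/a"],
    "query two": ["https://example.org/y", "https://example.org/x"],
    "query three": ["https://example.net/m"],
    "query five": ["https://example.com/q"],
}

rate = top1_agreement_rate(engine_a, engine_b)
print(f"Top-1 agreement on shared queries: {rate:.2f}")
```

Tracking a rate like this over time is what would surface a sudden jump of the kind we saw in late October. On its own, though, a high rate only shows correlation, which is why a controlled experiment was needed.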
We created about 100 “synthetic queries”—queries that you would never expect a user to type, such as [hiybbprqag]. As a one-time experiment, for each synthetic query we inserted as Google’s top result a unique (real) webpage which had nothing to do with the query. Below is an example:
To be clear, the synthetic query had no relationship with the inserted result we chose—the query didn’t appear on the webpage, and there were no links to the webpage with that query phrase. In other words, there was absolutely no reason for any search engine to return that webpage for that synthetic query. You can think of the synthetic queries with inserted results as the search engine equivalent of marked bills in a bank.
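The "marked bills" setup can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the tooling used in the experiment: the helper names, the query length of 10 letters, and the placeholder page URLs are all hypothetical. It generates nonsense queries and pairs each one with its own distinct page, so that any later appearance of that pairing elsewhere can be traced back to its source.

```python
import random
import string


def make_synthetic_query(rng, length=10):
    """Return a random lowercase string that no user would plausibly type."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def build_honeypots(pages, rng):
    """Pair each page with its own distinct nonsense query.

    Returns a dict mapping synthetic query -> planted page URL.
    """
    honeypots = {}
    for page in pages:
        query = make_synthetic_query(rng)
        # Regenerate on the (very unlikely) chance of a duplicate query.
        while query in honeypots:
            query = make_synthetic_query(rng)
        honeypots[query] = page
    return honeypots


# Placeholder URLs standing in for about 100 real, unrelated webpages.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
pages = [f"https://example.com/page-{i}" for i in range(100)]
honeypots = build_honeypots(pages, rng)
print(len(honeypots), "synthetic query/page pairs")
```

The sketch does not model the other half of the requirement, which is checking that the query string appears nowhere on the chosen page or in links pointing to it.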
We gave 20 of our engineers laptops with a fresh install of Microsoft Windows running Internet Explorer 8 with Bing Toolbar installed. As part of the install process, we opted in to the “Suggested Sites” feature of IE8, and we accepted the default options for the Bing Toolbar.
We asked these engineers to enter the synthetic queries into the search box on the Google home page and click on the results, i.e., the results we had inserted. We were surprised that within a couple of weeks of starting this experiment, our inserted results started appearing in Bing. Below is an example: a search for [hiybbprqag] on Bing returned a page about seating at a theater in Los Angeles. As far as we know, the only connection between the query and result is Google’s result page (shown above).
We saw this happen for multiple queries. For the query [delhipublicschool40 chdjob] we inserted a search result for a credit union:
The same credit union soon showed up on Bing for that query:
For the query [juegosdeben1ogrande] we inserted a page of hip hop bling jewelry:
And the same hip hop bling page showed up in Bing:
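Checking for these matches amounts to a simple lookup: for each synthetic query, see whether the planted page shows up in the other engine's results. The sketch below is a hypothetical illustration, not the process actually used. The function `find_leaks` and all URLs are placeholders, and the observed results are hard-coded sample data rather than real measurements.

```python
def find_leaks(honeypots, observed_results):
    """Return the synthetic queries whose planted page appears in observed results.

    honeypots: dict mapping synthetic query -> planted page URL.
    observed_results: dict mapping query -> list of URLs another engine returned.
    """
    leaks = {}
    for query, planted_url in honeypots.items():
        if planted_url in observed_results.get(query, []):
            leaks[query] = planted_url
    return leaks


# Placeholder data: the queries come from this post, the URLs are invented.
honeypots = {
    "hiybbprqag": "https://example.com/theater-seating",
    "juegosdeben1ogrande": "https://example.com/jewelry",
    "delhipublicschool40 chdjob": "https://example.com/credit-union",
}
observed_results = {
    "hiybbprqag": ["https://example.com/theater-seating"],
    "juegosdeben1ogrande": [],  # no results returned for this query
}

leaks = find_leaks(honeypots, observed_results)
for query, url in leaks.items():
    print(f"[{query}] returned planted page {url}")
```

Because the planted pages have no other connection to their queries, even a single match is hard to explain by coincidence.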
As we see it, this experiment confirms our suspicion that Bing is using some combination of:
- Internet Explorer 8, which can send data to Microsoft via its Suggested Sites feature
- the Bing Toolbar, which can send data via Microsoft’s Customer Experience Improvement Program
At Google we strongly believe in innovation and are proud of our search quality. We’ve invested thousands of person-years into developing our search algorithms because we want our users to get the right answer every time they search, and that’s not easy. We look forward to competing with genuinely new search algorithms out there—algorithms built on core innovation, and not on recycled search results from a competitor. So to all the users out there looking for the most authentic, relevant search results, we encourage you to come directly to Google. And to those who have asked what we want out of all this, the answer is simple: we'd like for this practice to stop.
Posted by Amit Singhal, Google Fellow