Archives For Bias

In the latest congressional hearing, purportedly analyzing Google’s “stacking the deck” in the online advertising marketplace, much of the opening statement and questioning by Senator Mike Lee and later questioning by Senator Josh Hawley focused on an episode of alleged anti-conservative bias by Google in threatening to demonetize The Federalist, a conservative publisher, unless it exercised a greater degree of control over its comments section. The senators connected this to Google’s “dominance,” arguing that it is only because Google’s ad services are essential that Google can dictate terms to a conservative website. A similar impulse motivates Section 230 reform efforts as well: allegedly anti-conservative online platforms wield their dominance to censor conservative speech, either through deplatforming or demonetization.

Before even getting into the analysis of how to incorporate political bias into antitrust analysis, though, it should be noted that there likely is no viable antitrust remedy. Even aside from the Section 230 debate, online platforms like Google are First Amendment speakers who have editorial discretion over their sites and apps, much like newspapers. An antitrust remedy compelling these companies to carry speech they disagree with would almost certainly violate the First Amendment.

But even aside from the First Amendment aspect of this debate, there is no easy way to incorporate concerns about political bias into antitrust. Perhaps the best way to understand this argument in the antitrust sense is as a non-price effects analysis. 

Political bias could be seen by end consumers as an important aspect of product quality. Conservatives have made the case that not only Google, but also Facebook and Twitter, have discriminated against conservative voices. The argument would then follow that consumer welfare is harmed when these dominant platforms leverage their control of the social media marketplace into the marketplace of ideas by censoring voices with whom they disagree. 

While this has theoretical plausibility, there are real practical difficulties. As Geoffrey Manne and I have written previously, in the context of incorporating privacy into antitrust analysis:

The Horizontal Merger Guidelines have long recognized that anticompetitive effects may “be manifested in non-price terms and conditions that adversely affect customers.” But this notion, while largely unobjectionable in the abstract, still presents significant problems in actual application. 

First, product quality effects can be extremely difficult to distinguish from price effects. Quality-adjusted price is usually the touchstone by which antitrust regulators assess prices for competitive effects analysis. Disentangling (allegedly) anticompetitive quality effects from simultaneous (neutral or pro-competitive) price effects is an imprecise exercise, at best. For this reason, proving a product-quality case alone is very difficult and requires connecting the degradation of a particular element of product quality to a net gain in advantage for the monopolist. 

Second, invariably product quality can be measured on more than one dimension. For instance, product quality could include both function and aesthetics: A watch’s quality lies in both its ability to tell time as well as how nice it looks on your wrist. A non-price effects analysis involving product quality across multiple dimensions becomes exceedingly difficult if there is a tradeoff in consumer welfare between the dimensions. Thus, for example, a smaller watch battery may improve its aesthetics, but also reduce its reliability. Any such analysis would necessarily involve a complex and imprecise comparison of the relative magnitudes of harm/benefit to consumers who prefer one type of quality to another.

Just as with privacy and other product qualities, the analysis becomes increasingly complex first when tradeoffs between price and quality are introduced, and then even more so when tradeoffs between what different consumer groups perceive as quality are added. In fact, it is more complex than privacy. All but the most exhibitionistic would prefer more to less privacy, all other things being equal. But with political media consumption, most would prefer to have more of what they want to read available, even if it comes at the expense of what others may want. There is no easy way to understand what consumer welfare means in a situation where one group’s preferences need to come at the expense of another’s in moderation decisions.

Consider the case of The Federalist again. The allegation is that Google is imposing its anti-conservative bias by “forcing” the website to clean up its comments section. The argument is that since The Federalist needs Google’s advertising money, it must play by Google’s rules. And since it did so, there is now one less avenue for conservative speech.

What this argument misses is the balance Google and other online services must strike as multi-sided platforms. The goal is to connect advertisers on one side of the platform, to the users on the other. If a site wants to take advantage of the ad network, it seems inevitable that intermediaries like Google will need to create rules about what can and can’t be shown or they run the risk of losing advertisers who don’t want to be associated with certain speech or conduct. For instance, most companies don’t want to be associated with racist commentary. Thus, they will take great pains to make sure they don’t sponsor or place ads in venues associated with racism. Online platforms connecting advertisers to potential consumers must take that into consideration.

Users, like those who frequent The Federalist, have unpriced access to content across those sites and apps which are part of ad networks like Google’s. Other models, like paid subscriptions (which The Federalist also has available), are also possible. But it isn’t clear that conservative voices or conservative consumers have been harmed overall by the option of unpriced access on one side of the platform, with advertisers paying on the other side. If anything, it seems the opposite is the case since conservatives long complained about legacy media having a bias and lauded the Internet as an opportunity to gain a foothold in the marketplace of ideas.

Online platforms like Google must balance the interests of users from across the political spectrum. If their moderation practices are too politically biased in one direction or another, users could switch to another online platform with one click or swipe. Assuming online platforms wish to maximize revenue, they will have a strong incentive to limit political bias in their moderation practices. The ease of switching to another platform which markets itself as more free speech-friendly, like Parler, shows entrepreneurs can take advantage of market opportunities if Google and other online platforms go too far with political bias.

While one could perhaps argue that the major online platforms are colluding to keep out conservative voices, this is difficult to square with the different moderation practices each employs, as well as the data that suggests conservative voices are consistently among the most shared on Facebook.

Antitrust is not a cure-all law. Conservatives who normally understand this need to reconsider whether antitrust is really well-suited for litigating concerns about anti-conservative bias online. 

If you do research involving statistical analysis, you’ve heard of John Ioannidis. If you haven’t heard of him, you will. He’s gone after the fields of medicine, psychology, and economics. He may be coming for your field next.

Ioannidis is after bias in research. He is perhaps best known for a 2005 paper “Why Most Published Research Findings Are False.” A professor at Stanford, he has built a career in the field of meta-research and may be one of the most highly cited researchers alive.

In 2017, he published “The Power of Bias in Economics Research.” He recently talked to Russ Roberts on the EconTalk podcast about his research and what it means for economics.

He focuses on two factors that contribute to bias in economics research: publication bias and low power. These are complicated topics. This post hopes to provide a simplified explanation of these issues and why bias and power matter.

What is bias?

We frequently hear the word bias. “Fake news” is biased news. For dinner, I am biased toward steak over chicken. That’s different from statistical bias.

In statistics, bias means that a researcher’s estimate of a variable or effect is different from the “true” value or effect. The “true” probability of getting heads from tossing a fair coin is 50 percent. Let’s say that no matter how many times I toss a particular coin, I find that I’m getting heads about 75 percent of the time. My instrument, the coin, may be biased. I may be the most honest coin flipper, but my experiment has biased results. In other words, biased results do not imply biased research or biased researchers.
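The coin example can be sketched in a few lines. This is only an illustration under assumptions of my own: the function name, the seed, and the 10,000 simulated tosses are all hypothetical, and the coin is assumed to land heads 75 percent of the time. The flipper is perfectly honest, yet the estimate sits about 25 points away from the fair-coin value.

```python
import random

def estimated_heads_rate(true_heads_rate: float, tosses: int, seed: int = 0) -> float:
    """Toss a simulated coin `tosses` times and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    heads = sum(rng.random() < true_heads_rate for _ in range(tosses))
    return heads / tosses

FAIR_RATE = 0.50  # the "true" value the researcher believes is being measured

# Hypothetical biased instrument: a coin that lands heads 75 percent of the time.
estimate = estimated_heads_rate(true_heads_rate=0.75, tosses=10_000)

# Statistical bias: the gap between the estimate and the "true" value.
bias = estimate - FAIR_RATE
print(f"estimate={estimate:.3f}  bias={bias:+.3f}")
```

Adding more tosses narrows the noise around 75 percent, but it does nothing to close the gap with 50 percent.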

Publication bias

Publication bias occurs because peer-reviewed publications tend to favor publishing positive, statistically significant results and to reject insignificant results. Informally, this is known as the “file drawer” problem. Nonsignificant results remain unsubmitted in the researcher’s file drawer or, if submitted, remain in limbo in an editor’s file drawer.

Studies are more likely to be published in peer-reviewed publications if they have statistically significant findings, build on previous published research, and can potentially garner citations for the journal with sensational findings. Studies that don’t have statistically significant findings or don’t build on previous research are less likely to be published.

The importance of “sensational” findings means that ho-hum findings, even if statistically significant, are less likely to be published. For example, research finding that a 10 percent increase in the minimum wage is associated with a one-tenth of 1 percent reduction in employment (i.e., an elasticity of –0.01) would be less likely to be published than a study finding a 3 percent reduction in employment (i.e., elasticity of –0.3).
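For readers who want to check the arithmetic, here is a minimal sketch of the elasticity calculation. The function name is mine and purely illustrative; the only inputs are the percentage changes from the example above.

```python
def elasticity(pct_change_employment: float, pct_change_wage: float) -> float:
    """Elasticity of employment with respect to the minimum wage:
    percent change in employment divided by percent change in the wage."""
    return pct_change_employment / pct_change_wage

# A 10 percent wage increase paired with a 0.1 percent drop in employment.
ho_hum = elasticity(-0.1, 10.0)

# A 10 percent wage increase paired with a 3 percent drop in employment.
sensational = elasticity(-3.0, 10.0)

print(round(ho_hum, 4), round(sensational, 4))
```

Both elasticities are negative because employment falls as the wage rises; the "sensational" one is 30 times larger in magnitude.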

“Man bites dog” findings—those that are counterintuitive or contradict previously published research—may be less likely to be published. A study finding an upward sloping demand curve is likely to be rejected because economists “know” demand curves slope downward.

On the other hand, man bites dog findings may also be more likely to be published. Card and Krueger’s 1994 study finding that a minimum wage hike was associated with an increase in low-wage employment was published in the top-tier American Economic Review. Had the study been conducted by lesser-known economists, it’s much less likely it would have been accepted for publication. The results were sensational, judging from the attention the article got from the New York Times, the Wall Street Journal, and even the Clinton administration. Sometimes a man does bite a dog.

Low power

A study with low statistical power has a reduced chance of detecting a true effect.

Consider our criminal legal system. We seek to find criminals guilty, while ensuring the innocent go free. Using the language of statistical testing, the presumption of innocence is our null hypothesis. We set a high threshold for our test: Innocent until proven guilty, beyond a reasonable doubt. We hypothesize innocence and only after overcoming our reasonable doubt do we reject that hypothesis.

[Figure: Type I and Type II errors]

An innocent person found guilty is considered a serious error—a “miscarriage of justice.” The presumption of innocence (null hypothesis) combined with a high burden of proof (beyond a reasonable doubt) are designed to reduce these errors. In statistics, this is known as “Type I” error, or “false positive.” The probability of a Type I error is called alpha, which is set to some arbitrarily low number, like 10 percent, 5 percent, or 1 percent.

Failing to convict a known criminal is also a serious error, but it is generally agreed to be less serious than a wrongful conviction. Statistically speaking, this is a “Type II” error or “false negative” and the probability of making a Type II error is beta.

By now, it should be clear there’s a relationship between Type I and Type II errors. If we reduce the chance of a wrongful conviction, we are going to increase the chance of letting some criminals go free. It can be mathematically shown (not here), that a reduction in the probability of Type I error is associated with an increase in Type II error.
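The tradeoff can be seen numerically in a simple setting. The sketch below assumes a one-sided z-test and a true effect that sits two standard errors away from zero; both are my assumptions for illustration, not anything taken from Ioannidis. Holding the effect and sample fixed, a stricter alpha pushes beta up.

```python
from statistics import NormalDist

def type_ii_error(alpha: float, standardized_effect: float) -> float:
    """Beta for a one-sided z-test: the chance of failing to reject the null
    when the true effect is `standardized_effect` standard errors from zero."""
    z = NormalDist()  # standard normal distribution
    critical_value = z.inv_cdf(1.0 - alpha)  # rejection threshold set by alpha
    return z.cdf(critical_value - standardized_effect)

# Hypothetical true effect: two standard errors away from zero.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, standardized_effect=2.0)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

The only way to lower both error rates at once in this setup is to raise the standardized effect, which in practice means more data or a more precise measurement.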

Consider O.J. Simpson. Simpson was not found guilty in his criminal trial for murder, but was found liable for the deaths of Nicole Simpson and Ron Goldman in a civil trial. One reason for these different outcomes is the higher burden of proof for a criminal conviction (“beyond a reasonable doubt,” alpha = 1 percent) than for a finding of civil liability (“preponderance of evidence,” alpha = 50 percent). If O.J. truly is guilty of the murders, the criminal trial would have been less likely to find guilt than the civil trial would.

In econometrics, we construct the null hypothesis to be the opposite of what we hypothesize to be the relationship. For example, if we hypothesize that an increase in the minimum wage decreases employment, the null hypothesis would be: “A change in the minimum wage has no impact on employment.” If the research involves regression analysis, the null hypothesis would be: “The estimated coefficient on the elasticity of employment with respect to the minimum wage would be zero.” If we set the probability of Type I error to 5 percent, then regression results with a p-value of less than 0.05 would be sufficient to reject the null hypothesis of no relationship. If we increase the probability of Type I error, we increase the likelihood of finding a relationship, but we also increase the chance of finding a relationship when none exists.

Now, we’re getting to power.

Power is the chance of detecting a true effect. In the legal system, it would be the probability that a truly guilty person is found guilty.

By definition, a low power study has a small chance of discovering a relationship that truly exists. Low power studies produce more false negatives than high power studies. If a set of studies has a power of 20 percent, then if we know that there are 100 actual effects, the studies will find only 20 of them. In other words, out of 100 truly guilty suspects, a legal system with a power of 20 percent will find only 20 of them guilty.

Suppose we expect 25 percent of those accused of a crime are truly guilty of the crime. Thus the odds of guilt are R = 0.25 / 0.75 = 0.33. Assume we set alpha to 0.05, and conclude the accused is guilty if our test statistic provides p < 0.05. Using Ioannidis’ formula for positive predictive value, we find:

  • If the power of the test is 20 percent, the probability that a “guilty” verdict reflects true guilt is 57 percent.
  • If the power of the test is 80 percent, the probability that a “guilty” verdict reflects true guilt is 84 percent.

In other words, a “guilty” verdict from a low power test is more likely to land on an innocent person than a “guilty” verdict from a high power test.
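The two figures above can be reproduced with a short sketch of the positive predictive value calculation, PPV = (power × R) / (power × R + alpha), where R is the prior odds of guilt. The function and variable names are mine; the inputs are the ones from the example.

```python
def positive_predictive_value(power: float, alpha: float, prior_odds: float) -> float:
    """Share of 'guilty' verdicts (positive findings) that are true,
    given the test's power, its alpha, and the prior odds R."""
    true_positives = power * prior_odds   # proportional to correct convictions
    false_positives = alpha               # proportional to wrongful convictions
    return true_positives / (true_positives + false_positives)

prior_odds = 0.25 / 0.75  # 25 percent of the accused are truly guilty

for power in (0.20, 0.80):
    ppv = positive_predictive_value(power, alpha=0.05, prior_odds=prior_odds)
    print(f"power={power:.0%}  PPV={ppv:.0%}")
# prints:
# power=20%  PPV=57%
# power=80%  PPV=84%
```

Note that alpha is the same in both rows. What changes is how many true convictions there are to dilute the fixed stream of false ones.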

In our minimum wage example, a statistically significant finding from a low power study is more likely to report a relationship between a change in the minimum wage and employment when no relationship truly exists. By extension, even if a relationship truly exists, a low power study that detects it would be more likely to report a bigger impact than a high power study. The figure below demonstrates this phenomenon.

[Figure: funnel graph of minimum wage research]

Across the 1,424 studies surveyed, the average elasticity with respect to the minimum wage is –0.190 (i.e., a 10 percent increase in the minimum wage would be associated with a 1.9 percent decrease in employment). When adjusted for the studies’ precision, the weighted average elasticity is –0.054. By this simple analysis, the unadjusted average is 3.5 times bigger than the adjusted average. Ioannidis and his coauthors estimate that, among the 60 studies with “adequate” power, the weighted average elasticity is –0.011.
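To see why weighting by precision can shrink an average so much, consider a rough sketch. The four studies below are invented for illustration, and the inverse-variance weighting is my assumption about what a precision adjustment might look like; it is not necessarily the weighting used in the research discussed here. Only the last line uses figures from this post.

```python
def precision_weighted_average(estimates, standard_errors):
    """Weighted average with inverse-variance weights (1 / SE^2), so that
    more precise studies count for more. An assumed weighting scheme."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

# Hypothetical studies: the noisiest ones report the largest elasticities.
estimates = [-0.60, -0.30, -0.05, -0.02]
standard_errors = [0.30, 0.20, 0.02, 0.01]

simple_average = sum(estimates) / len(estimates)
weighted_average = precision_weighted_average(estimates, standard_errors)
print(f"simple={simple_average:.3f}  weighted={weighted_average:.3f}")

# Ratio of the unadjusted to the adjusted average reported in the post.
reported_ratio = 0.190 / 0.054
print(f"reported ratio={reported_ratio:.1f}")
```

In the made-up data, the two imprecise studies dominate the simple average but barely move the weighted one.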

(By the way, my own unpublished studies of minimum wage impacts at the state level had an estimated short-run elasticity of –0.03 and “precision” of 122 for Oregon and short-run elasticity of –0.048 and “precision” of 259 for Colorado. These results are in line with the more precise studies in the figure above.)

Is economics bogus?

It’s tempting to walk away from this discussion thinking all of econometrics is bogus. Ioannidis himself responds to this temptation:

Although the discipline has gotten a bad rap, economics can be quite reliable and trustworthy. Where evidence is deemed unreliable, we need more investment in the science of economics, not less.

For policymakers, the reliance on economic evidence is even more important, according to Ioannidis:

[P]oliticians rarely use economic science to make decisions and set new laws. Indeed, it is scary how little science informs political choices on a global scale. Those who decide the world’s economic fate typically have a weak scientific background or none at all.

Ioannidis and his colleagues identify several ways to address the reliability problems in economics and other fields (social psychology is one of the worst). However, these are longer term solutions.

In the short term, researchers and policymakers should view sensational findings with skepticism, especially if those sensational findings support their own biases. That skepticism should begin with one simple question: “What’s the confidence interval?”
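As a rough sketch of what answering that question looks like, the snippet below builds a 95 percent confidence interval from an estimate and its standard error, assuming the estimate is approximately normally distributed. Both studies are hypothetical; the numbers are chosen only to contrast a large but noisy estimate with a small but precise one.

```python
from statistics import NormalDist

def confidence_interval(estimate: float, standard_error: float, level: float = 0.95):
    """Normal-approximation confidence interval around an estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95 percent
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Hypothetical "sensational" study: big elasticity, large standard error.
noisy_low, noisy_high = confidence_interval(-0.19, 0.12)

# Hypothetical "ho-hum" study: small elasticity, small standard error.
precise_low, precise_high = confidence_interval(-0.05, 0.01)

print(f"noisy:   ({noisy_low:.3f}, {noisy_high:.3f})")
print(f"precise: ({precise_low:.3f}, {precise_high:.3f})")
```

The noisy interval straddles zero, so the headline number is consistent with no effect at all; the precise interval is narrow and entirely below zero.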

 

In my series of three posts (here, here and here) drawn from my empirical study on search bias I have examined whether search bias exists, and, if so, how frequently it occurs.  This, the final post in the series, assesses the results of the study (as well as the Edelman & Lockwood (E&L) study to which it responds) to determine whether the own-content bias I’ve identified is in fact consistent with anticompetitive foreclosure or is otherwise sufficient to warrant antitrust intervention.

As I’ve repeatedly emphasized, while I refer to differences among search engines’ rankings of their own or affiliated content as “bias,” without more these differences do not imply anticompetitive conduct.  It is wholly unsurprising and indeed consistent with vigorous competition among engines that differentiation emerges with respect to algorithms.  However, it is especially important to note that the theories of anticompetitive foreclosure raised by Google’s rivals involve very specific claims about these differences.  Properly articulated vertical foreclosure theories proffer that bias is both (1) sufficient in magnitude to exclude Google’s rivals from achieving efficient scale, and (2) actually directed at Google’s rivals.  Unfortunately for search engine critics, their theories fail on both counts.  The observed own-content bias appears neither to be extensive enough to prevent rivals from gaining access to distribution nor to target Google’s rivals; rather, it seems to be a natural result of intense competition between search engines and of significant benefit to consumers.

Vertical foreclosure arguments are premised upon the notion that rivals are excluded with sufficient frequency and intensity as to render their efforts to compete for distribution uneconomical.  Yet the empirical results simply do not indicate that market conditions are in fact conducive to the types of harmful exclusion contemplated by application of the antitrust laws.  Rather, the evidence indicates that (1) the absolute level of search engine “bias” is extremely low, and (2) “bias” is not a function of market power, but an effective strategy that has arisen as a result of serious competition and innovation between and by search engines.  The first finding undermines competitive foreclosure arguments on their own terms, that is, even if there were no pro-consumer justifications for the integration of Google content with Google search results.  The second finding, even more importantly, reveals that the evolution of consumer preferences for more sophisticated and useful search results has driven rival search engines to satisfy that demand.  Both Bing and Google have shifted toward these results, rendering the complained-of conduct equivalent to satisfying the standard of care in the industry–not restraining competition.

A significant lack of search bias emerges in the representative sample of queries.  This result is entirely unsurprising, given that bias is relatively infrequent even in E&L’s sample of queries specifically designed to identify maximum bias.  In the representative sample, the total percentage of queries for which Google references its own content when rivals do not is even lower—only about 8%—meaning that Google favors its own content far less often than critics have suggested.  This fact is crucial and highly problematic for search engine critics, as their burden in articulating a cognizable antitrust harm includes not only demonstrating that bias exists, but further that it is actually competitively harmful.  As I’ve discussed, bias alone is simply not sufficient to demonstrate any prima facie anticompetitive harm as it is far more often procompetitive or competitively neutral than actively harmful.  Moreover, given that bias occurs in less than 10% of queries run on Google, anticompetitive exclusion arguments appear unsustainable.

Indeed, theories of vertical foreclosure find virtually zero empirical support in the data.  Moreover, it appears that, rather than being a function of monopolistic abuse of power, search bias has emerged as an efficient competitive strategy, allowing search engines to differentiate their products in ways that benefit consumers.  I find that when search engines do reference their own content on their search results pages, it is generally unlikely that another engine will reference this same content.  However, the fact that both this percentage and the absolute level of own content inclusion are similar across engines indicates that this practice is not a function of market power (or its abuse), but is rather an industry standard.  In fact, despite conducting a much smaller percentage of total consumer searches, Bing is consistently more biased than Google, illustrating that the benefits search engines enjoy from integrating their own content into results are not necessarily a function of search engine size or volume of queries.  These results are consistent with a business practice that is efficient and in significant tension with arguments that such integration is designed to facilitate competitive foreclosure.

My last two posts on search bias (here and here) have analyzed and critiqued Edelman & Lockwood’s small study on search bias.  This post extends this same methodology and analysis to a random sample of 1,000 Google queries (released by AOL in 2006), to develop a more comprehensive understanding of own-content bias.  As I’ve stressed, these analyses provide useful—but importantly limited—glimpses into the nature of the search engine environment.  While these studies are descriptively helpful, actual harm to consumer welfare must always be demonstrated before cognizable antitrust injuries arise.  And naked identifications of own-content bias simply do not inherently translate to negative effects on consumers (see, e.g., here and here for more comprehensive discussion).

Now that that’s settled, let’s jump into the results of the 1,000 random search query study.

How Do Search Engines Rank Their Own Content?

Consistent with our earlier analysis, a starting point for measuring differentiation among search engines with respect to placing their own content is to compare how a search engine ranks its own content relative to how other engines place that same content (e.g. to compare how Google ranks “Google Maps” relative to how Bing or Blekko rank it).  Restricting attention exclusively to the first or “top” position, I find that Google simply does not refer to its own content in over 90% of queries.  Similarly, Bing does not reference Microsoft content in 85.4% of queries.  Google refers to its own content in the first position when other search engines do not in only 6.7% of queries, while Bing does so over twice as often, referencing Microsoft content that no other engine references in the first position in 14.3% of queries.  The following two charts illustrate the percentage of Google or Bing first position results, respectively, dedicated to own content across search engines.

The most striking aspect of these results is the small fraction of queries for which placement of own-content is relevant.  The results are similar when I expand consideration to the entire first page of results; interestingly, however, while the levels of own-content bias are similar considering the entire first page of results, Bing is far more likely than Google to reference its own content in its very first results position.

Examining Search Engine “Bias” on Google

Two distinct differences between the results of this larger study and my replication of Edelman & Lockwood emerge: (1) Google and Bing refer to their own content in a significantly smaller percentage of cases here than in the non-random sample; and (2) in general, when Google or Bing does rank its own content highly, rival engines are unlikely to similarly rank that same content.

The following table reports the percentages of queries for which Google’s ranking of its own content and its rivals’ rankings of that same content differ significantly. When Google refers to its own content within its Top 5 results, at least one other engine similarly ranks this content for only about 5% of queries.

The following table presents the likelihood that Google content will appear in a Google search, relative to searches conducted on rival engines (reported in odds ratios).

The first and third columns report results indicating that Google-affiliated content is more likely to appear in a search executed on Google rather than rival engines.  Google is approximately 16 times more likely than any other engine to refer to its own content on its first page.  Bing and Blekko are both significantly less likely to refer to Google content in their first result or on their first page than Google is to refer to Google content within these same parameters.  In each iteration, Bing is more likely to refer to Google content than is Blekko, and in the case of the first result, Bing is much more likely to do so.  Again, to be clear, the fact that Bing is more likely to rank its own content does not suggest that the practice is problematic.  Quite the contrary: given the demonstration that firms both with and without market power in search (to the extent that is a relevant antitrust market) engage in similar conduct, the correct inference is that there must be efficiency explanations for the practice.  The standard response, of course, is that the competitive implications of a practice are different when a firm with market power does it.  That’s not exactly right.  It is true that firms with market power can engage in conduct that gives rise to potential antitrust problems when the same conduct from a firm without market power would not; however, when firms without market power engage in the same business practice, it demands that antitrust analysts seriously consider the efficiency implications of the practice.  In other words, there is nothing in the mantra that things are “different” when larger firms do them that undercuts potential efficiency explanations.

Examining Search Engine “Bias” on Bing

For queries within the larger sample, Bing refers to Microsoft content within its Top 1 and 3 results when no other engine similarly references this content for a slightly smaller percentage of queries than in my Edelman & Lockwood replication.  Yet Bing continues to exhibit a strong tendency to rank Microsoft content more prominently than rival engines.  For example, when Bing refers to Microsoft content within its Top 5 results, other engines agree with this ranking for less than 2% of queries; and when Bing refers to Microsoft content within its Top 3 results, no other engine does so for 99.2% of those queries:

Regression analysis further illustrates Bing’s propensity to reference Microsoft content that rivals do not.  The following table reports the likelihood that Microsoft content is referred to in a Bing search as compared to searches on rival engines (again reported in odds ratios).

Bing refers to Microsoft content in its first results position about 56 times more often than rival engines refer to Microsoft content in this same position.  Across the entire first page, Microsoft content appears on a Bing search about 25 times more often than it does on any other engine.  Both Google and Blekko are accordingly significantly less likely to reference Microsoft content.  Notice further that, contrary to the findings in the smaller study, Google is slightly less likely to return Microsoft content than is Blekko, both in its first results position and across its entire first page.
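For readers less familiar with odds ratios, here is a minimal sketch of how one is computed from two appearance rates. The rates below are hypothetical and are not taken from the study's data or regressions; the function names are mine. The sketch also shows that an odds ratio is close to, but not the same thing as, a simple ratio of frequencies.

```python
def odds(probability: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    return probability / (1.0 - probability)

def odds_ratio(p_own_engine: float, p_rival_engine: float) -> float:
    """Odds that content appears on the owner's engine, relative to
    the odds that the same content appears on a rival engine."""
    return odds(p_own_engine) / odds(p_rival_engine)

# Hypothetical rates: content appears in 14 percent of queries on the
# owner's engine and in 0.3 percent of queries on a rival engine.
p_own, p_rival = 0.14, 0.003

ratio_of_odds = odds_ratio(p_own, p_rival)
ratio_of_frequencies = p_own / p_rival
print(f"odds ratio={ratio_of_odds:.1f}  frequency ratio={ratio_of_frequencies:.1f}")
```

When both rates are small the two measures track each other closely, which is why odds ratios are often read loosely as "times more likely."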

A Closer Look at Google v. Bing

Consistent with the smaller sample, I find again that Bing is more biased than Google using these metrics.  In other words, Bing ranks its own content significantly more highly than its rivals do more frequently than Google does, although the discrepancy between the two engines is smaller here than in the study of Edelman & Lockwood’s queries.  As noted above, Bing is over twice as likely as Google to refer to its own content in the first results position.

Figures 7 and 8 present the same data reported above, but with Blekko removed, to allow for a direct visual comparison of own-content bias between Google and Bing.

Consistent with my earlier results, the cases in which Bing ranks Microsoft content higher than Google ranks that same content appear to be more frequent than the cases in which Google ranks Google content higher than Bing ranks that same content.

This result is particularly interesting given the strength of the accusations condemning Google for behaving in precisely this way.  That Bing references Microsoft content just as often as—and frequently even more often than!—Google references its own content strongly suggests that this behavior is a function of procompetitive product differentiation, and not abuse of market power.  But I’ll save an in-depth analysis of this issue for my next post, where I’ll also discuss whether any of the results reported in this series of posts support anticompetitive foreclosure theories or otherwise suggest antitrust intervention is warranted.

In my last post, I discussed Edelman & Lockwood’s (E&L’s) attempt to catch search engines in the act of biasing their results—as well as their failure to actually do so.  In this post, I present my own results from replicating their study.  Unlike E&L, I find that Bing is consistently more biased than Google, for reasons discussed further below, although neither engine references its own content as frequently as E&L suggest.

I ran searches for E&L’s original 32 non-random queries using three different search engines—Google, Bing, and Blekko—between June 23 and July 5 of this year.  This replication is useful, as search technology has changed dramatically since E&L recorded their results in August 2010.  Bing now powers Yahoo, and Blekko has had more time to mature and enhance its results.  Blekko serves as a helpful “control” engine in my study, as it is totally independent of Google and Microsoft, and so has no incentive to refer to Google or Microsoft content unless it is actually relevant to users.  In addition, because Blekko’s model is significantly different from Google’s and Microsoft’s, if results on all three engines agree that specific content is highly relevant to the user query, it lends significant credibility to the notion that the content places well on the merits rather than being attributable to bias or other factors.

How Do Search Engines Rank Their Own Content?

Focusing solely upon the first position, Google refers to its own products or services when no other search engine does in 21.9% of queries; in another 21.9% of queries, both Google and at least one rival search engine (i.e., Bing or Blekko) refer to the same Google content with their first links.

But restricting focus to the first position is too narrow.  It would be a mistake to treat every instance in which Google or Bing ranks its own content first, and rivals do not, as bias.  Such a restrictive definition would sweep in cases in which all three search engines rank the same content prominently, agreeing that it is highly relevant, although not all in the first position.

The entire first page of results provides a more informative comparison.  I find that Google and at least one other engine return Google content on the first page of results in 7% of the queries.  Google refers to its own content on the first page of results without agreement from either rival search engine in only 7.9% of the queries.  Meanwhile, Bing and at least one other engine refer to Microsoft content in 3.2% of the queries.  Bing references Microsoft content without agreement from either Google or Blekko in 13.2% of the queries:

This evidence indicates that Google’s ranking of its own content differs significantly from its rivals’ rankings in only 7.9% of queries, and that when Google ranks its own content prominently it is generally perceived as relevant.  Further, these results suggest that Bing’s organic search results are significantly more biased in favor of Microsoft content than Google’s search results are in favor of Google’s content.
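To make the tally above concrete, here is a minimal sketch of how each query might be sorted into one of three buckets: the engine shows none of its own content on the first page, it shows own content that at least one rival also shows ("agreed"), or it shows own content that no rival shows ("unilateral").  Everything in it is an assumption for illustration: the domain list, the function names, the exact-URL matching rule, and the toy result pages are hypothetical and are not the data or procedure behind the percentages reported here.

```python
from urllib.parse import urlparse

# Hypothetical list of domains treated as Google's "own content".
GOOGLE_DOMAINS = ("google.com", "youtube.com")


def is_own(url, own_domains):
    """True if the URL's host is one of own_domains or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in own_domains)


def classify_query(own_page, rival_pages, own_domains):
    """Label one query as 'none', 'agreed', or 'unilateral'."""
    own_urls = {u for u in own_page if is_own(u, own_domains)}
    if not own_urls:
        return "none"
    # Assumption: "agreement" means a rival lists the very same URL.
    if any(own_urls & set(page) for page in rival_pages):
        return "agreed"
    return "unilateral"


def shares(pages, engine, rivals, own_domains):
    """Fraction of all queries falling into each bucket."""
    counts = {"none": 0, "agreed": 0, "unilateral": 0}
    for by_engine in pages.values():
        rival_pages = [by_engine[r] for r in rivals]
        counts[classify_query(by_engine[engine], rival_pages, own_domains)] += 1
    return {label: n / len(pages) for label, n in counts.items()}


# Toy first-page results for four made-up queries.
toy_pages = {
    "maps": {"google": ["https://maps.google.com/", "https://example.org/a"],
             "bing": ["https://maps.google.com/", "https://example.org/b"],
             "blekko": ["https://example.org/a"]},
    "mail": {"google": ["https://mail.google.com/"],
             "bing": ["https://example.com/mail"],
             "blekko": ["https://example.net/mail"]},
    "news": {"google": ["https://example.org/news"],
             "bing": ["https://example.org/news"],
             "blekko": ["https://example.net/news"]},
    "video": {"google": ["https://www.youtube.com/"],
              "bing": ["https://example.com/video"],
              "blekko": ["https://www.youtube.com/"]},
}

result = shares(toy_pages, "google", ["bing", "blekko"], GOOGLE_DOMAINS)
print(result)  # {'none': 0.25, 'agreed': 0.5, 'unilateral': 0.25}
```

Only the "unilateral" share corresponds to the kind of disagreement discussed above.  A looser matching rule (for example, matching on the product rather than the exact URL) would move queries from "unilateral" to "agreed", so the definition of agreement matters a great deal to the headline number.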

Examining Search Engine “Bias” on Google

The following table presents the percentages of queries for which Google’s ranking of its own content differs significantly from its rivals’ ranking of that same content.

Note that percentages below 50 in this table indicate that rival search engines generally see the referenced Google content as relevant and independently believe that it should be ranked similarly.

So when Google ranks its own content highly, at least one rival engine typically agrees with this ranking; for example, when Google places its own content in its Top 3 results, at least one rival agrees with this ranking in over 70% of queries.  Bing especially agrees with Google’s rankings of Google content within its Top 3 and 5 results, failing to include Google content that Google ranks similarly in only a little more than a third of queries.
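The Top 1, Top 3, and Top 5 comparisons can be read as conditional rates: among the queries for which an engine places its own content within its Top N results, in what share does no rival place that same content within its own Top N?  The sketch below illustrates that calculation under stated assumptions.  The function name, the `is_own` predicate, the rule that "agreement" means the same URL within the same cutoff, and the toy rankings are all hypothetical and are not drawn from the study's data.

```python
def disagreement_rate(results, engine, rivals, is_own, n):
    """Among queries where `engine` has own content in its Top n, return the
    share where no rival lists that content in its Top n (None if no such query)."""
    eligible = 0
    disagreed = 0
    for by_engine in results.values():
        own = {u for u in by_engine[engine][:n] if is_own(u)}
        if not own:
            continue  # engine shows no own content within the cutoff
        eligible += 1
        rival_urls = set()
        for rival in rivals:
            rival_urls.update(by_engine[rival][:n])
        if not (own & rival_urls):
            disagreed += 1
    return disagreed / eligible if eligible else None


# Toy ranked results (best first) for two made-up queries.
toy = {
    "q1": {"google": ["g.example/own1", "x", "y"],
           "bing": ["x", "g.example/own1", "z"],
           "blekko": ["x", "y", "z"]},
    "q2": {"google": ["a", "g.example/own2", "b"],
           "bing": ["a", "b", "c"],
           "blekko": ["a", "c", "b"]},
}


def own_content(url):
    # Hypothetical marker for the engine's own content.
    return url.startswith("g.example/")


top1 = disagreement_rate(toy, "google", ["bing", "blekko"], own_content, 1)
top3 = disagreement_rate(toy, "google", ["bing", "blekko"], own_content, 3)
print(top1, top3)  # 1.0 0.5
```

On this reading, a rate below 0.5 at a given cutoff means that rivals agree with the engine's placement of its own content more often than not.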

Examining Search Engine “Bias” on Bing

Bing refers to Microsoft content in its search results far more frequently than its rivals reference the same Microsoft content.  For example, Bing’s top result references Microsoft content for 5 queries, while neither Google nor Blekko ever ranks Microsoft content in the first position:

This table illustrates the significant discrepancies between Bing’s treatment of Microsoft content and its treatment by Google and Blekko.  Neither rival engine refers to the Microsoft content that Bing ranks within its Top 3 results, and in nearly 80% of queries Google and Blekko do not include any of the Microsoft content that Bing refers to on its first page of results.

Moreover, Bing frequently ranks Microsoft content highly even when rival engines do not refer to the same content at all in the first page of results.  For example, of the 5 queries for which Bing ranks Microsoft content in its top result, Google refers to only one of these 5 within its first page of results, while Blekko refers to none.  Even when comparing results across each engine’s full page of results, Google and Blekko only agree with Bing’s referral of Microsoft content in 20.4% of queries.

Although there are not enough Bing data to test results in the first position in E&L’s sample, Microsoft content appears as results on the first page of a Bing search about 7 times more often than Microsoft content appears on the first page of rival engines.  Also, Google is much more likely to refer to Microsoft content than Blekko, though both refer to significantly less Microsoft content than Bing.

A Closer Look at Google v. Bing

On E&L’s own terms, Bing results are more biased than Google results; rivals are more likely to agree with Google’s algorithmic assessment (than with Bing’s) that its own content is relevant to user queries.  Bing refers to Microsoft content that other engines do not rank at all more often than Google refers to its own content without any agreement from rivals.  Figures 1 and 2 display the same data presented above in order to facilitate direct comparisons between Google and Bing.

As Figures 1 and 2 illustrate, Bing search results for these 32 queries are more frequently “biased” in favor of its own content than are Google’s.  The bias is greatest for the Top 1 and Top 3 search results.

My study finds that Bing exhibits far more “bias” than E&L identify in their earlier analysis.  For example, in E&L’s study, Bing does not refer to Microsoft content at all in its Top 1 or Top 3 results; moreover, Bing refers to Microsoft content within its entire first page 11 times, while Google and Yahoo refer to Microsoft content 8 and 9 times, respectively.  Most likely, the significant increase in Bing’s “bias” differential is largely a function of Bing’s introduction of localized and personalized search results and represents serious competitive efforts on Bing’s part.

Again, it’s important to stress E&L’s limited and non-random sample, and to emphasize the danger of making strong inferences about the general nature or magnitude of search bias based upon these data alone.  However, the data indicate that Google’s own-content bias is relatively small even in a sample collected precisely to focus upon the queries most likely to generate it.  In fact—as I’ll discuss in my next post—own-content bias occurs even less often in a more representative sample of queries, strongly suggesting that such bias does not raise the competitive concerns attributed to it.