“Data is the new oil,” said Jaron Lanier in a recent op-ed for The New York Times. Lanier’s use of this metaphor is only the latest instance of what has become the dumbest meme in tech policy. As the digital economy becomes more prominent in our lives, it is not unreasonable to seek to understand one of its most important inputs. But this analogy to the physical economy is fundamentally flawed. Worse, introducing regulations premised upon faulty assumptions like this will likely do far more harm than good. Here are seven reasons why “data is the new oil” misses the mark:
1. Oil is rivalrous; data is non-rivalrous
If someone uses a barrel of oil, it can’t be consumed again. But, as Alan McQuinn, a senior policy analyst at the Information Technology and Innovation Foundation, noted, “when consumers ‘pay with data’ to access a website, they still have the same amount of data after the transaction as before. As a result, users have an infinite resource available to them to access free online services.” Imposing restrictions on data collection makes this infinite resource finite.
2. Oil is excludable; data is non-excludable
Oil is highly excludable because, as a physical commodity, it can be stored in ways that prevent use by non-authorized parties. However, as my colleagues pointed out in a recent comment to the FTC: “While databases may be proprietary, the underlying data usually is not.” They go on to argue that this can lead to under-investment in data collection:
[C]ompanies that have acquired a valuable piece of data will struggle both to prevent their rivals from obtaining the same data as well as to derive competitive advantage from the data. For these reasons, it also means that firms may well be more reluctant to invest in data generation than is socially optimal. In fact, to the extent this is true there is arguably more risk of companies under-investing in data generation than of firms over-investing in order to create data troves with which to monopolize a market. This contrasts with oil, where complete excludability is the norm.
3. Oil is fungible; data is non-fungible
Oil is a commodity, so, by definition, one barrel of oil of a given grade is equivalent to any other barrel of that grade. Data, on the other hand, is heterogeneous. Each person’s data is unique and may consist of a practically unlimited number of different attributes that can be collected into a profile. This means that oil will follow the law of one price, while a dataset’s value will be highly contingent on its particular properties and commercialization potential.
4. Oil has positive marginal costs; data has zero marginal costs
There is a significant expense to producing and distributing an additional barrel of oil (as low as $5.49 per barrel in Saudi Arabia; as high as $21.66 in the U.K.). Data is merely encoded information (1s and 0s), so gathering, storing, and transferring it is nearly costless (though, to be clear, setting up systems for collecting and processing it can be a large fixed cost). Under perfect competition, the market-clearing price is equal to the marginal cost of production, which is why data is traded for free services while oil still requires cold, hard cash.
5. Oil is a search good; data is an experience good
Oil is a search good, meaning its value can be assessed prior to purchasing. By contrast, data tends to be an experience good because companies don’t know how much a new dataset is worth until it has been combined with pre-existing datasets and deployed using algorithms (from which value is derived). This is one reason why purpose limitation rules can have unintended consequences. If firms are unable to predict what data they will need in order to develop new products, then restricting what data they’re allowed to collect is per se anti-innovation.
6. Oil has constant returns to scale; data has rapidly diminishing returns
As an energy input into a mechanical process, oil has relatively constant returns to scale (e.g., when oil is used as the fuel source to power a machine). When data is used as an input for an algorithm, it shows rapidly diminishing returns, as the charts collected in a presentation by Google’s Hal Varian demonstrate. The initial training data is hugely valuable for increasing an algorithm’s accuracy. But as you increase the dataset by a fixed amount each time, the improvements steadily decline (because new data is only helpful insofar as it’s differentiated from the existing dataset).
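To make the shape of that curve concrete, here is a minimal sketch. It assumes a purely hypothetical learning curve in which an algorithm’s error shrinks in proportion to one over the square root of the number of training examples; that functional form and the function names are illustrative assumptions, not figures from Varian’s presentation.

```python
import math

def accuracy(n_examples, scale=1.0):
    """Hypothetical learning curve: error falls as scale / sqrt(n_examples)."""
    return 1.0 - scale / math.sqrt(n_examples)

def marginal_gains(start, step, rounds):
    """Accuracy gained from each successive, equally sized batch of new data."""
    gains = []
    n = start
    for _ in range(rounds):
        gains.append(accuracy(n + step) - accuracy(n))
        n += step
    return gains

# Start with 1,000 examples and add 1,000 more, five times over.
gains = marginal_gains(start=1_000, step=1_000, rounds=5)
for batch, gain in enumerate(gains, start=1):
    print(f"batch {batch}: +{gain:.4f} accuracy")
```

Under this assumed curve, every batch is the same size but each one adds less accuracy than the batch before it, which is the pattern described above.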
7. Oil is valuable; data is worthless
The features detailed above — rivalrousness, fungibility, marginal cost, returns to scale — all lead to perhaps the most important distinction between oil and data: The average barrel of oil is valuable (currently $56.49) and the average dataset is worthless (on the open market). As Will Rinehart showed, putting a price on data is a difficult task. But when data brokers and other intermediaries in the digital economy do try to value data, the prices are almost uniformly low. The Financial Times had the most detailed numbers on what personal data is sold for in the market:
“General information about a person, such as their age, gender and location is worth a mere $0.0005 per person, or $0.50 per 1,000 people.”
“A person who is shopping for a car, a financial product or a vacation is more valuable to companies eager to pitch those goods. Auto buyers, for instance, are worth about $0.0021 a pop, or $2.11 per 1,000 people.”
“Knowing that a woman is expecting a baby and is in her second trimester of pregnancy, for instance, sends the price tag for that information about her to $0.11.”
“For $0.26 per person, buyers can access lists of people with specific health conditions or taking certain prescriptions.”
“The company estimates that the value of a relatively high Klout score adds up to more than $3 in word-of-mouth marketing value.”
“[T]he sum total for most individuals often is less than a dollar.”
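A quick back-of-the-envelope check puts those figures side by side. The sketch below uses only the per-person prices quoted above and the $56.49 barrel price cited earlier; the idea of a single person who falls into all four categories is a hypothetical added for illustration.

```python
# Per-person prices quoted in the Financial Times excerpts (US dollars).
quoted_prices = {
    "general demographics": 0.0005,
    "auto buyer": 0.0021,
    "second-trimester pregnancy": 0.11,
    "health condition or prescription": 0.26,
}

OIL_PRICE_PER_BARREL = 56.49  # the figure cited in the text

def per_thousand(price_per_person):
    """Convert a per-person price into a price per 1,000 people."""
    return price_per_person * 1000

for label, price in quoted_prices.items():
    print(f"{label}: ${price:.4f} per person, ${per_thousand(price):.2f} per 1,000")

# Hypothetical person who appears on all four lists at once.
total = sum(quoted_prices.values())
print(f"all four attributes combined: ${total:.4f}")

# How many basic demographic records would it take to match one barrel of oil?
records_per_barrel = round(OIL_PRICE_PER_BARREL / quoted_prices["general demographics"])
print(f"demographic records per barrel of oil: {records_per_barrel:,}")
```

Note that the rounded $0.0021 auto-buyer figure converts to $2.10 per 1,000, slightly below the $2.11 in the quote, which suggests the per-person number was rounded. Even the hypothetical person with all four attributes comes to well under a dollar.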
Data is a specific asset, meaning it has “a significantly higher value within a particular transacting relationship than outside the relationship.” We only think data is so valuable because tech companies are so valuable. In reality, it is the combination of high-skilled labor, large capital expenditures, and cutting-edge technologies (e.g., machine learning) that makes those companies so valuable. Yes, data is an important component of these production functions. But to claim that data is responsible for all the value created by these businesses, as Lanier does in his NYT op-ed, is farcical (and reminiscent of the labor theory of value).
People who analogize data to oil or gold may merely be trying to convey that data is as valuable in the 21st century as those commodities were in the 20th century (though, as argued, a dubious proposition). If the comparison stopped there, it would be relatively harmless. But there is a real risk that policymakers might take the analogy literally and regulate data in the same way they regulate commodities. As this article shows, data has many unique properties that are simply incompatible with 20th-century modes of regulation.
A better — though imperfect — analogy, as author Bernard Marr suggests, would be renewable energy. The sources of renewable energy are all around us — solar, wind, hydroelectric — and there is more available than we could ever use. We just need the right incentives and technology to capture it. The same is true for data. We leave our digital fingerprints everywhere — we just need to dust for them.
Next week the FCC is slated to vote on the second iteration of Chairman Wheeler’s proposed broadband privacy rules. Of course, as has become all too common, none of us outside the Commission has actually seen the proposal. But earlier this month Chairman Wheeler released a Fact Sheet that suggests some of the ways it would update the rules he initially proposed.
According to the Fact Sheet, the new proposed rules are
designed to evolve with changing technologies and encourage innovation, and are in harmony with other key privacy frameworks and principles — including those outlined by the Federal Trade Commission and the Administration’s Consumer Privacy Bill of Rights.
Unfortunately, the Chairman’s proposal appears to fall short of the mark on both counts.
As I discuss in detail in a letter filed with the Commission yesterday, despite the Chairman’s rhetoric, the rules described in the Fact Sheet fail to align with the FTC’s approach to privacy regulation embodied in its 2012 Privacy Report in at least two key ways:
First, the Fact Sheet significantly expands the scope of information that would be considered “sensitive” beyond that contemplated by the FTC. That, in turn, would impose onerous and unnecessary consumer consent obligations on commonplace uses of data, undermining consumer welfare, depriving consumers of information and access to new products and services, and restricting competition.
Second, unlike the FTC’s framework, the proposal described by the Fact Sheet ignores the crucial role of “context” in determining the appropriate level of consumer choice before affected companies may use consumer data. Instead, the Fact Sheet takes a rigid, acontextual approach that would stifle innovation and harm consumers.
The Chairman’s proposal moves far beyond the FTC’s definition of “sensitive” information requiring “opt-in” consent
The FTC’s privacy guidance is, in its design at least, appropriately flexible, aimed at balancing the immense benefits of information flows with sensible consumer protections. Thus it eschews an “inflexible list of specific practices” that would automatically trigger onerous consent obligations and “risk undermining companies’ incentives to innovate and develop new products and services….”
Under the FTC’s regime, depending on the context in which it is used (on which see the next section, below), the sensitivity of data delineates the difference between data uses that require “express affirmative” (opt-in) consent and those that do not (requiring only “other protections” short of opt-in consent — e.g., opt-out).
Because the distinction is so important — because opt-in consent is much more likely to staunch data flows — the FTC endeavors to provide guidance as to what data should be considered sensitive, and to cabin the scope of activities requiring opt-in consent. Thus, the FTC explains that “information about children, financial and health information, Social Security numbers, and precise geolocation data [should be treated as] sensitive.” But beyond those instances, the FTC doesn’t consider any other type of data as inherently sensitive.
By contrast, and without explanation, Chairman Wheeler’s Fact Sheet significantly expands what constitutes “sensitive” information requiring “opt-in” consent by adding “web browsing history,” “app usage history,” and “the content of communications” to the list of categories of data deemed sensitive in all cases.
By treating some of the most common and important categories of data as always “sensitive,” and by making the sensitivity of data the sole determinant for opt-in consent, the Chairman’s proposal would make it almost impossible for ISPs to make routine (to say nothing of innovative), appropriate, and productive uses of data comparable to those undertaken by virtually every major Internet company. This goes well beyond anything contemplated by the FTC — with no evidence of any corresponding benefit to consumers and with obvious harm to competition, innovation, and the overall economy online.
And because the Chairman’s proposal would impose these inappropriate and costly restrictions only on ISPs, it would create a barrier to competition by ISPs in other platform markets, without offering a defensible consumer protection rationale to justify either the disparate treatment or the restriction on competition.
“Opt-in” offers no greater privacy protection than allowing consumers to “opt-out”…, yet it imposes significantly higher costs on consumers, businesses, and the economy.
Not surprisingly, these costs fall disproportionately on the relatively poor and the less technology-literate. In the former case, opt-in requirements may deter companies from offering services at all, even to people who would make a very different trade-off between privacy and monetary price. In the latter case, because an initial decision to opt-in must be taken in relative ignorance, users without much experience to guide their decisions will face effectively higher decision-making costs than more knowledgeable users.
The Chairman’s proposal ignores the central role of context in the FTC’s privacy framework
In part for these reasons, central to the FTC’s more flexible framework is the establishment of a sort of “safe harbor” for data uses where the benefits clearly exceed the costs and consumer consent may be inferred:
Companies do not need to provide choice before collecting and using consumer data for practices that are consistent with the context of the transaction or the company’s relationship with the consumer….
Thus for many straightforward uses of data, the “context of the transaction,” not the asserted “sensitivity” of the underlying data, is the threshold question in evaluating the need for consumer choice in the FTC’s framework.
Chairman Wheeler’s Fact Sheet, by contrast, ignores this central role of context in its analysis. Instead, it focuses solely on data sensitivity, claiming that doing so is “in line with customer expectations.”
But this is inconsistent with the FTC’s approach.
In fact, the FTC’s framework explicitly rejects a pure “consumer expectations” standard:
Rather than relying solely upon the inherently subjective test of consumer expectations, the… standard focuses on more objective factors related to the consumer’s relationship with a business.
And while everyone agrees that sensitivity is a key part of pegging privacy regulation to actual consumer and corporate relationships, the FTC also recognizes that the importance of the sensitivity of the underlying data varies with the context in which it is used. Or, in the words of the White House’s 2012 Consumer Data Privacy in a Networked World Report (introducing its Consumer Privacy Bill of Rights), “[c]ontext should shape the balance and relative emphasis of particular principles” guiding the regulation of privacy.
By contrast, Chairman Wheeler’s “sensitivity-determines-consumer-expectations” framing is a transparent attempt to claim fealty to the FTC’s (and the Administration’s) privacy standards while actually implementing a privacy regime that is flatly inconsistent with them.
The FTC’s approach isn’t perfect, but that’s no excuse to double down on its failings
The FTC’s privacy guidance, and even more so its privacy enforcement practices under Section 5, are far from perfect. The FTC should be commended for its acknowledgement that consumers’ privacy preferences and companies’ uses of data will change over time, and that there are trade-offs inherent in imposing any constraints on the flow of information. But even the FTC fails to actually assess the magnitude of the costs and benefits of, and the deep complexities involved in, the trade-off, and puts an unjustified thumb on the scale in favor of limiting data use.
But that’s no excuse for Chairman Wheeler to ignore what the FTC gets right, and to double down on its failings. Based on the Fact Sheet (and the initial NPRM), it’s a virtual certainty that the Chairman’s proposal doesn’t heed the FTC’s refreshing call for humility and flexibility regarding the application of privacy rules to ISPs (and other Internet platforms):
These are complex and rapidly evolving areas, and more work should be done to learn about the practices of all large platform providers, their technical capabilities with respect to consumer data, and their current and expected uses of such data.
The rhetoric of the Chairman’s Fact Sheet is correct: the FCC should in fact conform its approach to privacy to the framework established by the FTC. Unfortunately, the reality of the Fact Sheet simply doesn’t comport with its rhetoric.
As the FCC’s vote on the Chairman’s proposal rapidly nears, and in light of its significant defects, we can only hope that the rest of the Commission refrains from reflexively adopting the proposed regime, and works to ensure that these problematic deviations from the FTC’s framework are addressed before moving forward.
The CPI Antitrust Chronicle published Geoffrey Manne’s and my recent paper, The Problems and Perils of Bootstrapping Privacy and Data into an Antitrust Framework, as part of a symposium on Big Data in the May 2015 issue. All of the papers are worth reading and pondering, but of course ours is the best ;).
In it, we analyze two of the most prominent theories of antitrust harm arising from data collection: privacy as a factor of non-price competition, and price discrimination facilitated by data collection. We also analyze whether data is serving as a barrier to entry and effectively preventing competition. We argue that, in the current marketplace, there are no plausible harms to competition arising from either non-price effects or price discrimination due to data collection online and that there is no data barrier to entry preventing effective competition.
The questions of how to regulate privacy, and of what role competition authorities should play in doing so, are only likely to increase in importance as the Internet marketplace continues to grow and evolve. Scholars and advocates have called on the European Commission and the FTC to give greater consideration to privacy concerns during merger review, and have even encouraged them to bring monopolization claims based upon data dominance. These calls should be rejected unless these theories can satisfy the rigorous economic review of antitrust law. In our humble opinion, they cannot do so at this time.
PRIVACY AS AN ELEMENT OF NON-PRICE COMPETITION
The Horizontal Merger Guidelines have long recognized that anticompetitive effects may “be manifested in non-price terms and conditions that adversely affect customers.” But this notion, while largely unobjectionable in the abstract, still presents significant problems in actual application.
First, product quality effects can be extremely difficult to distinguish from price effects. Quality-adjusted price is usually the touchstone by which antitrust regulators assess prices for competitive effects analysis. Disentangling (allegedly) anticompetitive quality effects from simultaneous (neutral or pro-competitive) price effects is an imprecise exercise, at best. For this reason, proving a product-quality case alone is very difficult and requires connecting the degradation of a particular element of product quality to a net gain in advantage for the monopolist.
Second, product quality can invariably be measured on more than one dimension. For instance, product quality could include both function and aesthetics: A watch’s quality lies both in its ability to tell time and in how nice it looks on your wrist. A non-price effects analysis involving product quality across multiple dimensions becomes exceedingly difficult if there is a tradeoff in consumer welfare between the dimensions. Thus, for example, a smaller battery may improve a watch’s aesthetics, but also reduce its reliability. Any such analysis would necessarily involve a complex and imprecise comparison of the relative magnitudes of harm/benefit to consumers who prefer one type of quality to another.
PRICE DISCRIMINATION AS A PRIVACY HARM
If non-price effects cannot be relied upon to establish competitive injury (as explained above), then what can be the basis for incorporating privacy concerns into antitrust? One argument is that major data collectors (e.g., Google and Facebook) facilitate price discrimination.
The argument can be summed up as follows: Price discrimination could be a harm to consumers that antitrust law takes into consideration. Because companies like Google and Facebook are able to collect a great deal of data about their users for analysis, businesses could segment groups based on certain characteristics and offer them different deals. The resulting price discrimination could lead to many consumers paying more than they would in the absence of the data collection. Therefore, the data collection by these major online companies facilitates price discrimination that harms consumer welfare.
This argument misses a large part of the story, however. The flip side is that price discrimination could have benefits to those who receive lower prices from the scheme than they would have in the absence of the data collection, a possibility explored by the recent White House Report on Big Data and Differential Pricing.
While privacy advocates have focused on the possible negative effects of price discrimination to one subset of consumers, they generally ignore the positive effects of businesses being able to expand output by serving previously underserved consumers. It is inconsistent with basic economic logic to suggest that a business relying on metrics would want to serve only those who can pay more by charging them a lower price, while charging those who cannot afford it a larger one. If anything, price discrimination would likely promote more egalitarian outcomes by allowing companies to offer lower prices to poorer segments of the population—segments that can be identified by data collection and analysis.
If this group favored by “personalized pricing” is as big as—or bigger than—the group that pays higher prices, then it is difficult to state that the practice leads to a reduction in consumer welfare, even if this can be divorced from total welfare. Again, the question becomes one of magnitudes that has yet to be considered in detail by privacy advocates.
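The point about magnitudes can be illustrated with a stylized sketch. Everything in it is hypothetical: two consumer segments, each with a single willingness to pay, a constant unit cost, and a seller that either sets one profit-maximizing price for everyone or charges each segment its own willingness to pay. The function names and numbers are illustrative assumptions, not drawn from the paper or the White House report.

```python
def best_uniform_price(segments, unit_cost):
    """Profit-maximizing single price, chosen among the segments' willingness to pay."""
    def profit(price):
        buyers = sum(size for wtp, size in segments if wtp >= price)
        return (price - unit_cost) * buyers
    return max((wtp for wtp, _ in segments), key=profit)

def compare(segments, unit_cost):
    """Count who gains access and who pays more once each segment gets its own price."""
    uniform = best_uniform_price(segments, unit_cost)
    # Priced out under the single price, but profitable to serve at their own price.
    newly_served = sum(size for wtp, size in segments if unit_cost <= wtp < uniform)
    # Bought at the single price, but now charged their higher willingness to pay.
    pay_more = sum(size for wtp, size in segments if wtp > uniform)
    return {"uniform_price": uniform, "newly_served": newly_served, "pay_more": pay_more}

# Each segment is (willingness to pay, number of consumers); numbers are made up.
mostly_high_value = [(10, 300), (4, 100)]
mostly_low_value = [(10, 100), (4, 300)]

result_a = compare(mostly_high_value, unit_cost=1)
result_b = compare(mostly_low_value, unit_cost=1)
print(result_a)  # single price is 10; 100 consumers gain access, nobody pays more
print(result_b)  # single price is 4; nobody gains access, 100 consumers pay more
```

In the first market the single price would shut out the lower-value segment, so personalized pricing expands output without raising anyone’s price. In the second market the single price is already low, so personalized pricing only raises the price for the higher-value segment. Which case dominates in practice is exactly the empirical question of magnitudes.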
Both of these theories of harm are predicated on the inability of competitors, or at least their difficulty, in developing alternative products in the marketplace: the so-called “data barrier to entry.” The argument is that upstarts do not have sufficient data to compete with established players like Google and Facebook, which in turn employ their data both to attract online advertisers and to foreclose their competitors from this crucial source of revenue. There are at least four reasons to be dubious of such arguments:
Data is useful to all industries, not just online companies;
It’s not the amount of data, but how you use it;
Competition online is one click or swipe away; and
Access to data is not exclusive.
Privacy advocates have thus far failed to make their case. Even in their most plausible forms, the arguments for incorporating privacy and data concerns into antitrust analysis do not survive legal and economic scrutiny. In the absence of strong arguments suggesting likely anticompetitive effects, and in the face of enormous analytical problems (and thus a high risk of error cost), privacy should remain a matter of consumer protection, not of antitrust.
Recent years have seen an increasing interest in incorporating privacy into antitrust analysis. The FTC and regulators in Europe have rejected these calls so far, but certain scholars and activists continue their attempts to breathe life into this novel concept. Elsewhere we have written at length on the scholarship addressing the issue and found the case for incorporation wanting. Among the errors proponents make is a persistent (and woefully unsubstantiated) assertion that online data can amount to a barrier to entry, insulating incumbent services from competition and ensuring that only the largest providers thrive. This data barrier to entry, it is alleged, can then allow firms with monopoly power to harm consumers, either directly through “bad acts” like price discrimination, or indirectly by raising the costs of advertising, which then get passed on to consumers.
A case in point was on display at last week’s George Mason Law & Economics Center Briefing on Big Data, Privacy, and Antitrust. Building on their growing body of advocacy work, Nathan Newman and Allen Grunes argued that this hypothesized data barrier to entry actually exists, and that it prevents effective competition from search engines and social networks that are interested in offering services with heightened privacy protections.
According to Newman and Grunes, network effects and economies of scale ensure that dominant companies in search and social networking (they specifically named Google and Facebook — implying that they are in separate markets) operate without effective competition. This results in antitrust harm, they assert, because it precludes competition on the non-price factor of privacy protection.
In other words, according to Newman and Grunes, even though Google and Facebook offer their services for a price of $0 and constantly innovate and upgrade their products, consumers are nevertheless harmed because the business models of less-privacy-invasive alternatives are foreclosed by insufficient access to data (an almost self-contradicting and silly narrative for many reasons, including the big question of whether consumers prefer greater privacy protection to free stuff). Without access to, and use of, copious amounts of data, Newman and Grunes argue, the algorithms underlying search and targeted advertising are necessarily less effective and thus the search product without such access is less useful to consumers. And even more importantly to Newman, the value to advertisers of the resulting consumer profiles is diminished.
Newman has put forth a number of other possible antitrust harms that purportedly result from this alleged data barrier to entry, as well. Among these is the increased cost of advertising to those who wish to reach consumers. Presumably this would harm end users who have to pay more for goods and services because the costs of advertising are passed on to them. On top of that, Newman argues that ad networks inherently facilitate price discrimination, an outcome that he asserts amounts to antitrust harm.
FTC Commissioner Maureen Ohlhausen (who also spoke at the George Mason event) recently made the case that antitrust law is not well-suited to handling privacy problems. She argues — convincingly — that competition policy and consumer protection should be kept separate to preserve doctrinal stability. Antitrust law deals with harms to competition through the lens of economic analysis. Consumer protection law is tailored to deal with broader societal harms and aims at protecting the “sanctity” of consumer transactions. Antitrust law can, in theory, deal with privacy as a non-price factor of competition, but this is an uneasy fit because of the difficulties of balancing quality over two dimensions: Privacy may be something some consumers want, but others would prefer a better algorithm for search and social networks, and targeted ads with free content, for instance.
In fact, there is general agreement with Commissioner Ohlhausen on her basic points, even among critics like Newman and Grunes. But, as mentioned above, views diverge over whether there are some privacy harms that should nevertheless factor into competition analysis, and on whether there is in fact a data barrier to entry that makes these harms possible.
As we explain below, however, the notion of data as an antitrust-relevant barrier to entry is simply a myth. And, because all of the theories of “privacy as an antitrust harm” are essentially predicated on this, they are meritless.
First, data is useful to all industries — this is not some new phenomenon particular to online companies
It bears repeating (because critics seem to forget it in their rush to embrace “online exceptionalism”) that offline retailers also receive substantial benefit from, and greatly benefit consumers by, knowing more about what consumers want and when they want it. Through devices like coupons and loyalty cards (to say nothing of targeted mailing lists and the age-old practice of data mining check-out receipts), brick-and-mortar retailers can track purchase data and better serve consumers. Not only do consumers receive better deals for using them, but retailers know what products to stock and advertise and when and on what products to run sales. For instance:
Following its acquisition of Kosmix in 2011, Walmart established @WalmartLabs, which created its own product search engine for online shoppers. In the first year of its use alone, the number of customers buying a product on Walmart.com after researching a purchase increased by 20 percent. According to Ron Bensen, the vice president of engineering at @WalmartLabs, the combination of in-store and online data could give brick-and-mortar retailers like Walmart an advantage over strictly online stores.
Panera and a whole host of restaurants, grocery stores, drug stores and retailers use loyalty cards to advertise and learn about consumer preferences.
And of course there is a host of other uses for data as well, including security, fraud prevention, product optimization, risk reduction to the insured, knowing what content is most interesting to readers, etc. The importance of data stretches far beyond the online world, and far beyond mere retail uses more generally. To describe even online giants like Amazon, Apple, Microsoft, Facebook and Google as having a monopoly on data is silly.
Second, it’s not the amount of data that leads to success but building a better mousetrap
The value of knowing someone’s birthday, for example, is not in that tidbit itself, but in the fact that you know this is a good day to give that person a present. Most of the data that supports the advertising networks underlying the Internet ecosphere is of this sort: Information is important to companies because of the value that can be drawn from it, not for the inherent value of the data itself. Companies don’t collect information about you to stalk you, but to better provide goods and services to you.
Moreover, data itself is not only less important than what can be drawn from it, but data is also less important than the underlying product it informs. For instance, Snapchat created a challenger to Facebook so successfully (and in such a short time) that Facebook attempted to buy it for $3 billion (Google offered $4 billion). But Facebook’s interest in Snapchat wasn’t about its data. Instead, Snapchat was valuable, and a competitive challenge to Facebook, because it cleverly incorporated the (apparently novel) insight that many people wanted to share information in a more private way.
Relatedly, Twitter, Instagram, LinkedIn, Yelp, Pinterest (and Facebook itself) all started with little (or no) data and they have had a lot of success. Meanwhile, despite its supposed data advantages, Google’s attempt at social networking, Google+, has never caught up to Facebook in terms of popularity with users (and thus not with advertisers either). And scrappy social network Ello is starting to build a significant base without data collection for advertising at all.
At the same time, it’s simply not the case that the alleged data giants (the ones supposedly insulating themselves behind data barriers to entry) actually have the type of data most relevant to startups anyway. As Andres Lerner has argued, if you wanted to start a travel business, the data from Kayak or Priceline would be far more relevant. Or if you wanted to start a ride-sharing business, data from cab companies would be more useful than the broad, market-cross-cutting profiles Google and Facebook have. Consider companies like Uber, Lyft and Sidecar that had no customer data when they began to challenge established cab companies that did possess such data. If data were really so significant, they could never have competed successfully. But Uber, Lyft and Sidecar have been able to compete effectively because they built products that users wanted to use: they came up with an idea for a better mousetrap. The data they have accrued came after they innovated, entered the market and mounted their successful challenges, not before.
In reality, those who complain about data facilitating unassailable competitive advantages have it exactly backwards. Companies need to innovate to attract consumer data; otherwise, consumers will switch to competitors (including both new entrants and established incumbents). As a result, the desire to make use of more and better data drives competitive innovation, with manifestly impressive results: The continued explosion of new products, services and apps is evidence that data is not a bottleneck to competition but a spur to it.
Third, competition online is one click or thumb swipe away; that is, barriers to entry and switching costs are low
Somehow, in the face of alleged data barriers to entry, competition online continues to soar, with newcomers constantly emerging and triumphing. This suggests that the barriers to entry are not so high as to prevent robust competition.
Again, despite the supposed data-based monopolies of Facebook, Google, Amazon, Apple and others, there exist powerful competitors in the marketplaces they compete in:
Google flight search has failed to seriously challenge — let alone displace — its competitors, as critics feared. Kayak, Expedia and the like remain the most prominent travel search sites — despite Google having literally purchased ITA’s trove of flight data and data-processing acumen.
People looking for local reviews go to Yelp and TripAdvisor (and, increasingly, Facebook) as often as Google.
With its recent acquisition of the shopping search engine TheFind and its test run of a “buy” button, Facebook is also gearing up to become a major competitor in e-commerce, challenging Amazon.
Likewise, Amazon recently launched its own ad network, “Amazon Sponsored Links,” to challenge other advertising players.
Even assuming for the sake of argument that data creates a barrier to entry, there is little evidence that consumers cannot easily switch to a competitor. While there are sometimes network effects online, as with social networking, history still shows that people will switch. MySpace was considered a dominant network until it made a series of bad business decisions and everyone ended up on Facebook instead. Similarly, Internet users can and do use Bing, DuckDuckGo, Yahoo, and a plethora of more specialized search engines alongside or instead of Google. And don’t forget that Google itself was once an upstart that replaced household names like Yahoo and AltaVista.
Fourth, access to data is not exclusive
Critics like Newman have compared Google to Standard Oil and argued that government authorities need to step in to limit Google’s control over data. But the comparison of data to oil is inapt. If Exxon drills and extracts oil from the ground, that oil is no longer available to BP. Data is not finite in the same way. To use an earlier example, Google knowing my birthday doesn’t limit the ability of Facebook to know my birthday as well. While databases may be proprietary, the underlying data is not. And what matters more than the data itself is how well it is analyzed.
This is especially important when discussing data online, where multi-homing is ubiquitous, meaning many competitors end up voluntarily sharing access to data. For instance, I can use the friend-finder feature on WordPress to find Facebook friends, Google connections, and people I’m following on Twitter who also use the site for blogging. Using this feature allows WordPress to access my contact lists on each of these major online platforms.
Further, it is not apparent that Google’s competitors have less data available to them. Microsoft, for instance, has admitted that it may actually have more data. And, importantly for this discussion, Microsoft may have actually garnered some of its data for Bing from Google.
If Google has a high cost per click, then perhaps it’s because it is worth it to advertisers: There are more eyes on Google because of its superior search product. Contra Newman and Grunes, Google may just be more popular for consumers and advertisers alike because the algorithm makes it more useful, not because it has more data than everyone else.
Fifth, the data barrier to entry argument does not have workable antitrust remedies
The misguided logic of data barrier to entry arguments leaves a lot of questions unanswered. Perhaps most important among these is the question of remedies. What remedy would apply to a company found guilty of leveraging its market power with data?
It’s actually quite difficult to conceive of a practical means for a competition authority to craft remedies that would address the stated concerns without imposing enormous social costs. In the unilateral conduct context, the most obvious remedy would involve the forced sharing of data.
On the one hand, as we’ve noted, it’s not clear this would actually accomplish much. If competitors can’t actually make good use of data, simply having more of it isn’t going to change things. At the same time, such a result would reduce the incentive to build data networks to begin with. In their startup stage, companies like Uber and Facebook required several months and hundreds of thousands, if not millions, of dollars to design and develop just the first iteration of the products consumers love. Would any of them have done it if they had to share their insights? In fact, it may well be that access to these free insights is what competitors actually want; it’s not the data they’re lacking, but the vision or engineering acumen to use it.
Other remedies limiting collection and use of data are not only outside the normal scope of antitrust remedies; they would also involve extremely costly court supervision and may entail problematic “collisions between new technologies and privacy rights,” as last year’s White House Report on Big Data and Privacy put it.
It is equally unclear what an antitrust enforcer could do in the merger context. As Commissioner Ohlhausen has argued, blocking specific transactions does not necessarily stop data transfer or promote privacy interests. Parties could simply house data in a standalone entity and enter into licensing arrangements. And conditioning transactions with forced data sharing requirements would lead to the same problems described above.
If antitrust doesn’t provide a remedy, then it is not clear why it should apply at all. The absence of workable remedies is in fact a strong indication that data and privacy issues are not suitable for antitrust. Instead, such concerns would be better dealt with under consumer protection law or by targeted legislation.
In short, all of this hand-wringing over privacy is largely a tempest in a teapot — especially when one considers the extent to which the White House and other government bodies have studiously ignored the real threat: government misuse of data à la the NSA. It’s almost as if the White House is deliberately shifting the public’s gaze from the reality of extensive government spying by directing it toward a fantasy world of nefarious corporations abusing private information….
The White House’s proposed bill is emblematic of many government “fixes” to largely non-existent privacy issues, and it exhibits the same core defects that undermine both its claims and its proposed solutions. As a result, the proposed bill vastly overemphasizes regulation to the dangerous detriment of the innovative benefits of Big Data for consumers and society at large.