
Why Data Is Not the New Oil

Alec Stapp —  8 October 2019

“Data is the new oil,” said Jaron Lanier in a recent op-ed for The New York Times. Lanier’s use of this metaphor is only the latest instance of what has become the dumbest meme in tech policy. As the digital economy becomes more prominent in our lives, it is not unreasonable to seek to understand one of its most important inputs. But this analogy to the physical economy is fundamentally flawed. Worse, introducing regulations premised upon faulty assumptions like this will likely do far more harm than good. Here are seven reasons why “data is the new oil” misses the mark:

1. Oil is rivalrous; data is non-rivalrous

If someone uses a barrel of oil, it can’t be consumed again. But, as Alan McQuinn, a senior policy analyst at the Information Technology and Innovation Foundation, noted, “when consumers ‘pay with data’ to access a website, they still have the same amount of data after the transaction as before. As a result, users have an infinite resource available to them to access free online services.” Imposing restrictions on data collection makes this infinite resource finite. 

2. Oil is excludable; data is non-excludable

Oil is highly excludable because, as a physical commodity, it can be stored in ways that prevent use by non-authorized parties. However, as my colleagues pointed out in a recent comment to the FTC: “While databases may be proprietary, the underlying data usually is not.” They go on to argue that this can lead to under-investment in data collection:

[C]ompanies that have acquired a valuable piece of data will struggle both to prevent their rivals from obtaining the same data as well as to derive competitive advantage from the data. For these reasons, it also means that firms may well be more reluctant to invest in data generation than is socially optimal. In fact, to the extent this is true there is arguably more risk of companies under-investing in data generation than of firms over-investing in order to create data troves with which to monopolize a market. This contrasts with oil, where complete excludability is the norm.

3. Oil is fungible; data is non-fungible

Oil is a commodity, so, by definition, one barrel of oil of a given grade is equivalent to any other barrel of that grade. Data, on the other hand, is heterogeneous. Each person’s data is unique and may consist of a practically unlimited number of different attributes that can be collected into a profile. This means that oil will follow the law of one price, while a dataset’s value will be highly contingent on its particular properties and commercialization potential.

4. Oil has positive marginal costs; data has zero marginal costs

There is a significant expense to producing and distributing an additional barrel of oil (as low as $5.49 per barrel in Saudi Arabia; as high as $21.66 in the U.K.). Data is merely encoded information (bits of 1s and 0s), so gathering, storing, and transferring it is nearly costless (though, to be clear, setting up systems for collecting and processing can be a large fixed cost). Under perfect competition, the market-clearing price is equal to the marginal cost of production (which is why data is traded for free services and oil still requires cold, hard cash).

5. Oil is a search good; data is an experience good

Oil is a search good, meaning its value can be assessed prior to purchasing. By contrast, data tends to be an experience good because companies don’t know how much a new dataset is worth until it has been combined with pre-existing datasets and deployed using algorithms (from which value is derived). This is one reason why purpose limitation rules can have unintended consequences. If firms are unable to predict what data they will need in order to develop new products, then restricting what data they’re allowed to collect is per se anti-innovation.

6. Oil has constant returns to scale; data has rapidly diminishing returns

As an energy input into a mechanical process, oil has relatively constant returns to scale (e.g., when oil is used as the fuel source to power a machine). When data is used as an input for an algorithm, it shows rapidly diminishing returns, as the charts collected in a presentation by Google’s Hal Varian demonstrate. The initial training data is hugely valuable for increasing an algorithm’s accuracy. But as you increase the dataset by a fixed amount each time, the improvements steadily decline (because new data is only helpful insofar as it’s differentiated from the existing dataset).
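
To make the shape of that curve concrete, here is a minimal sketch. It assumes, purely for illustration, that an algorithm's error rate falls as a power law in the number of training examples; the constants A and B are hypothetical and are not taken from Varian's charts.

```python
# Illustrative only: assumes error falls as a power law in dataset size.
# A and B are hypothetical constants, not estimates from any real system.
A = 1.0   # error scale (assumed)
B = 0.5   # decay exponent (assumed)


def error(n):
    """Assumed error rate for a model trained on n examples."""
    return A * n ** -B


def marginal_gains(step, rounds):
    """Error reduction from each successive addition of `step` examples."""
    sizes = [step * k for k in range(1, rounds + 2)]
    return [error(a) - error(b) for a, b in zip(sizes, sizes[1:])]


gains = marginal_gains(step=1000, rounds=5)
for k, gain in enumerate(gains, start=1):
    print(f"{k * 1000:>5} -> {(k + 1) * 1000:>5} examples: error falls by {gain:.5f}")
```

Under this assumed curve, every additional block of 1,000 examples still helps, but each block helps less than the one before it. A different functional form would change the numbers, not the general pattern described above.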

7. Oil is valuable; data is worthless

The features detailed above — rivalrousness, fungibility, marginal cost, returns to scale — all lead to perhaps the most important distinction between oil and data: The average barrel of oil is valuable (currently $56.49) and the average dataset is worthless (on the open market). As Will Rinehart showed, putting a price on data is a difficult task. But when data brokers and other intermediaries in the digital economy do try to value data, the prices are almost uniformly low. The Financial Times had the most detailed numbers on what personal data is sold for in the market:

  • “General information about a person, such as their age, gender and location is worth a mere $0.0005 per person, or $0.50 per 1,000 people.”
  • “A person who is shopping for a car, a financial product or a vacation is more valuable to companies eager to pitch those goods. Auto buyers, for instance, are worth about $0.0021 a pop, or $2.11 per 1,000 people.”
  • “Knowing that a woman is expecting a baby and is in her second trimester of pregnancy, for instance, sends the price tag for that information about her to $0.11.”
  • “For $0.26 per person, buyers can access lists of people with specific health conditions or taking certain prescriptions.”
  • “The company estimates that the value of a relatively high Klout score adds up to more than $3 in word-of-mouth marketing value.”
  • “[T]he sum total for most individuals often is less than a dollar.”
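
The per-person figures follow directly from the per-1,000 prices. The short sketch below redoes that arithmetic using the FT numbers quoted above, then adds a back-of-the-envelope comparison of my own: how many basic profiles would cost the same as one barrel of oil at the $56.49 price cited earlier. The variable names are just for illustration.

```python
# Prices per 1,000 people, as quoted from the Financial Times above.
PRICE_PER_THOUSAND = {
    "general demographics": 0.50,
    "auto buyers": 2.11,
}

# Average barrel price cited earlier in this post.
OIL_PRICE_PER_BARREL = 56.49


def per_person(price_per_thousand):
    """Convert a price per 1,000 people into a price per person."""
    return price_per_thousand / 1000


for segment, price in PRICE_PER_THOUSAND.items():
    print(f"{segment}: ${per_person(price):.4f} per person")

# Back-of-the-envelope comparison (not from the FT): basic profiles per barrel.
profiles_per_barrel = OIL_PRICE_PER_BARREL / per_person(
    PRICE_PER_THOUSAND["general demographics"]
)
print(f"Basic profiles costing as much as one barrel: {profiles_per_barrel:,.0f}")
```

At these prices, it takes on the order of a hundred thousand basic profiles to match the market value of a single barrel.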

Data is a specific asset, meaning it has “a significantly higher value within a particular transacting relationship than outside the relationship.” We only think data is so valuable because tech companies are so valuable. In reality, it is the combination of high-skilled labor, large capital expenditures, and cutting-edge technologies (e.g., machine learning) that makes those companies so valuable. Yes, data is an important component of these production functions. But to claim that data is responsible for all the value created by these businesses, as Lanier does in his NYT op-ed, is farcical (and reminiscent of the labor theory of value). 


People who analogize data to oil or gold may merely be trying to convey that data is as valuable in the 21st century as those commodities were in the 20th century (though, as argued, a dubious proposition). If the comparison stopped there, it would be relatively harmless. But there is a real risk that policymakers might take the analogy literally and regulate data in the same way they regulate commodities. As this article shows, data has many unique properties that are simply incompatible with 20th-century modes of regulation.

A better — though imperfect — analogy, as author Bernard Marr suggests, would be renewable energy. The sources of renewable energy are all around us — solar, wind, hydroelectric — and there is more available than we could ever use. We just need the right incentives and technology to capture it. The same is true for data. We leave our digital fingerprints everywhere — we just need to dust for them.

Michael Sykuta is Associate Professor, Agricultural and Applied Economics, and Director, Contracting Organizations Research Institute at the University of Missouri.

The US agriculture sector has been experiencing consolidation at all levels for decades, even as the global ag economy has been growing and becoming more diverse. Much of this consolidation has been driven by technological changes that created economies of scale, both at the farm level and beyond.

Likewise, the role of technology has changed the face of agriculture, particularly in the past 20 years since the commercial introduction of the first genetically modified (GMO) crops. However, biotechnology itself comprises only a portion of the technology change. The development of global positioning systems (GPS) and GPS-enabled equipment has created new opportunities for precision agriculture, whether for the application of crop inputs, crop management, or yield monitoring. The development of unmanned and autonomous vehicles and remote sensing technologies, particularly unmanned aerial vehicles (i.e., UAVs, or “drones”), has created new opportunities for field scouting, crop monitoring, and real-time field management. And currently, the development of Big Data analytics is promising to combine all of the different types of data associated with agricultural production in ways intended to improve the application of all the various technologies and to guide production decisions.

Now, with the pending mergers of several major agricultural input and life sciences companies, regulators are faced with a challenge: How to evaluate the competitive effects of such mergers in the face of such a complex and dynamic technology environment—particularly when these technologies are not independent of one another? What is the relevant market for considering competitive effects and what are the implications for technology development? And how does the nature of the technology itself implicate the economic efficiencies underlying these mergers?

Before going too far, it is important to note that while the three cases currently under review (i.e., ChemChina/Syngenta, Dow/DuPont, and Bayer/Monsanto) are frequently lumped together in discussions, the three present rather different competitive cases—particularly within the US. For instance, ChemChina’s acquisition of Syngenta will not, in itself, meaningfully change market concentration. However, financial backing from ChemChina may allow Syngenta to buy up the discards from other deals, such as the parts of DuPont that the EU Commission is requiring to be divested or the seed assets Bayer is reportedly looking to sell to preempt regulatory concerns, as well as other smaller competitors.

Dow-DuPont is perhaps the most head-to-head of the three mergers in terms of R&D and product lines. Both firms are in the top five in the US for pesticide manufacturing and for seeds. However, the Dow-DuPont merger is about much more than combining agricultural businesses. The Dow-DuPont deal specifically aims to create and spin off three different companies specializing in agriculture, material science, and specialty products. Although agriculture may be the business line in which the companies most overlap, it represents just over 21% of the combined businesses’ annual revenues.

Bayer-Monsanto is yet a different sort of pairing. While both companies are among the top five in US pesticide manufacturing (with combined sales less than Syngenta and about equal to Dow without DuPont), Bayer is a relatively minor player in the seed industry. Likewise, Monsanto is focused almost exclusively on crop production and digital farming technologies, offering little overlap with Bayer’s human health or animal nutrition businesses.

Despite the differences in these deals, they tend to be lumped together and discussed almost exclusively in the context of pesticide manufacturing or crop protection more generally. In so doing, the discussion misses some important aspects of these deals that may mitigate traditional competitive concerns within the pesticide industry.

Mergers as the Key to Unlocking Innovation and Value

First, as the Dow-DuPont merger suggests, mergers may be the least-cost way of (re)organizing assets in ways that maximize value. This is especially true for R&D-intensive industries where intellectual property and innovation are at the core of competitive advantage. Absent the protection of common ownership, neither party would have an incentive to fully disclose the nature of its IP and innovation pipeline. In this case, merging interests increases the efficiency of information sharing so that managers can effectively evaluate and reorganize assets in ways that maximize innovation and return on investment.

Dow and DuPont each have a wide range of areas of application. Both groups of managers recognize that each of their business lines would be stronger as focused, independent entities; but also recognize that the individual elements of their portfolios would be stronger if combined with those of the other company. While the EU Commission argues that Dow-DuPont would reduce the incentive to innovate in the pesticide industry—a dubious claim in itself—the commission seems to ignore the potential increases in efficiency, innovation and ability to serve customer interests across all three of the proposed new businesses. At a minimum, gains in those industries should be weighed against any alleged losses in the agriculture industry.

This is not the first such agricultural and life sciences “reorganization through merger”. The current manifestation of Monsanto is the spin-off of a previous merger between Monsanto and Pharmacia & Upjohn in 2000 that created today’s Pharmacia. At the time of the Pharmacia transaction, Monsanto had portfolios in agricultural products, chemicals, and pharmaceuticals. After reorganizing assets within Pharmacia, three business lines were created: agricultural products (the current Monsanto), pharmaceuticals (now Pharmacia, a subsidiary of Pfizer), and chemicals (now Solutia, a subsidiary of Eastman Chemical Co.). Merging interests allowed Monsanto and Pharmacia & Upjohn to create more focused business lines that were better positioned to pursue innovations and serve customers in their respective industries.

In essence, Dow-DuPont is following the same playbook. Although such intentions have not been announced, Bayer’s broad product portfolio suggests a similar long-term play with Monsanto is likely.

Interconnected Technologies, Innovation, and the Margins of Competition

As noted above, regulatory scrutiny of these three mergers focuses on them in the context of pesticide or agricultural chemical manufacturing. However, innovation in the ag chemicals industry is intricately interwoven with developments in other areas of agricultural technology that have rather different competition and innovation dynamics. The current technological wave in agriculture involves the use of Big Data to create value using the myriad data now available through GPS-enabled precision farming equipment. Monsanto and DuPont, through its Pioneer subsidiary, are both players in this developing space, sometimes referred to as “digital farming”.

Digital farming services are intended to assist farmers’ production decision making and increase farm productivity. Using GPS-coded field maps that include assessments of soil conditions, combined with climate data for the particular field, farm input companies can recommend the types and rates of applications for soil conditioning pre-harvest, seed types for planting, and crop protection products during the growing season. Yield monitors at harvest provide outcomes data for feedback to refine and improve the algorithms that are used in subsequent growing seasons.

The integration of digital farming services with seed and chemical manufacturing offers obvious economic benefits for farmers and competitive benefits for service providers. Input manufacturers have incentive to conduct data analytics that individual farmers do not. Farmers have limited analytic resources and relatively small returns to investing in such resources, while input manufacturers have broad market potential for their analytic services. Moreover, by combining data from a broad cross-section of farms, digital farming service companies have access to the data necessary to identify generalizable correlations between farm plot characteristics, input use, and yield rates.

But the value of the information developed through these analytics is not unidirectional in its application and value creation. While input manufacturers may be able to help improve farmers’ operations given the current stock of products, feedback about crop traits and performance also enhances R&D for new product development by identifying potential product attributes with greater market potential. By combining product portfolios, agricultural companies can not only increase the value of their data-driven services for farmers, but more efficiently target R&D resources to their highest potential use.

The synergy between input manufacturing and digital farming notwithstanding, seed and chemical input companies are not the only players in the digital farming space. Equipment manufacturer John Deere was an early entrant in exploiting the information value of data collected by sensors on its equipment. Other remote sensing technology companies have incentive to develop data analytic tools to create value for their data-generating products. Even downstream companies, like ADM, have expressed interest in investing in digital farming assets that might provide new revenue streams with their farmer-suppliers as well as facilitate more efficient specialty crop and identity-preserved commodity-based value chains.

The development of digital farming is still in its early stages and is far from a sure bet for any particular player. Even Monsanto has pulled back from its initial foray into prescriptive digital farming (called FieldScripts). These competitive forces will affect the dynamics of competition at all stages of farm production, including seed and chemicals. Failure to account for those dynamics, and the potential competitive benefits input manufacturers may provide, could lead regulators to overestimate any concerns of competitive harm from the proposed mergers.


Farmers are concerned about the effects of these big-name tie-ups. Farmers may be rightly concerned, but for the wrong reasons. Ultimately, the role of the farmer continues to be diminished in the agricultural value chain. As precision agriculture tools and Big Data analytics reduce the value of idiosyncratic or tacit knowledge at the farm level, the managerial human capital of farmers becomes relatively less important in terms of value-added. It would be unwise to confuse farmers’ concerns regarding the competitive effects of the kinds of mergers we’re seeing now with the actual drivers of change in the agricultural value chain.

The CPI Antitrust Chronicle published Geoffrey Manne’s and my recent paper, The Problems and Perils of Bootstrapping Privacy and Data into an Antitrust Framework, as part of a symposium on Big Data in the May 2015 issue. All of the papers are worth reading and pondering, but of course ours is the best ;).

In it, we analyze two of the most prominent theories of antitrust harm arising from data collection: privacy as a factor of non-price competition, and price discrimination facilitated by data collection. We also analyze whether data is serving as a barrier to entry and effectively preventing competition. We argue that, in the current marketplace, there are no plausible harms to competition arising from either non-price effects or price discrimination due to data collection online and that there is no data barrier to entry preventing effective competition.

The questions of how to regulate privacy and what role competition authorities should play in that are only likely to increase in importance as the Internet marketplace continues to grow and evolve. The European Commission and the FTC have been called on by scholars and advocates to take greater consideration of privacy concerns during merger review, and have even been encouraged to bring monopolization claims based upon data dominance. These calls should be rejected unless these theories can satisfy the rigorous economic review of antitrust law. In our humble opinion, they cannot do so at this time.



The Horizontal Merger Guidelines have long recognized that anticompetitive effects may “be manifested in non-price terms and conditions that adversely affect customers.” But this notion, while largely unobjectionable in the abstract, still presents significant problems in actual application.

First, product quality effects can be extremely difficult to distinguish from price effects. Quality-adjusted price is usually the touchstone by which antitrust regulators assess prices for competitive effects analysis. Disentangling (allegedly) anticompetitive quality effects from simultaneous (neutral or pro-competitive) price effects is an imprecise exercise, at best. For this reason, proving a product-quality case alone is very difficult and requires connecting the degradation of a particular element of product quality to a net gain in advantage for the monopolist.

Second, invariably product quality can be measured on more than one dimension. For instance, product quality could include both function and aesthetics: A watch’s quality lies in both its ability to tell time as well as how nice it looks on your wrist. A non-price effects analysis involving product quality across multiple dimensions becomes exceedingly difficult if there is a tradeoff in consumer welfare between the dimensions. Thus, for example, a smaller watch battery may improve its aesthetics, but also reduce its reliability. Any such analysis would necessarily involve a complex and imprecise comparison of the relative magnitudes of harm/benefit to consumers who prefer one type of quality to another.


If non-price effects cannot be relied upon to establish competitive injury (as explained above), then what can be the basis for incorporating privacy concerns into antitrust? One argument is that major data collectors (e.g., Google and Facebook) facilitate price discrimination.

The argument can be summed up as follows: Price discrimination could be a harm to consumers that antitrust law takes into consideration. Because companies like Google and Facebook are able to collect a great deal of data about their users for analysis, businesses could segment groups based on certain characteristics and offer them different deals. The resulting price discrimination could lead to many consumers paying more than they would in the absence of the data collection. Therefore, the data collection by these major online companies facilitates price discrimination that harms consumer welfare.

This argument misses a large part of the story, however. The flip side is that price discrimination could have benefits to those who receive lower prices from the scheme than they would have in the absence of the data collection, a possibility explored by the recent White House Report on Big Data and Differential Pricing.

While privacy advocates have focused on the possible negative effects of price discrimination to one subset of consumers, they generally ignore the positive effects of businesses being able to expand output by serving previously underserved consumers. It is inconsistent with basic economic logic to suggest that a business relying on metrics would want to serve only those who can pay more by charging them a lower price, while charging those who cannot afford it a larger one. If anything, price discrimination would likely promote more egalitarian outcomes by allowing companies to offer lower prices to poorer segments of the population—segments that can be identified by data collection and analysis.

If this group favored by “personalized pricing” is as big as—or bigger than—the group that pays higher prices, then it is difficult to state that the practice leads to a reduction in consumer welfare, even if this can be divorced from total welfare. Again, the question becomes one of magnitudes that has yet to be considered in detail by privacy advocates.
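
A stylized example may help show why magnitudes matter. The sketch below uses entirely hypothetical willingness-to-pay figures for two small groups of consumers, assumes each consumer buys at most one unit, and assumes zero marginal cost. It compares a single uniform price with separate prices for each group; it is not drawn from our paper or from the White House report.

```python
# Hypothetical willingness to pay (WTP), one unit per consumer, zero cost.
RICHER = [10, 9, 8]
POORER = [5, 4, 2]


def best_price(wtps):
    """Profit-maximizing single price, searching over the observed WTP values."""
    candidates = sorted(set(wtps))
    return max(candidates, key=lambda p: p * sum(w >= p for w in wtps))


def outcome(wtps, price):
    """Return (number of buyers served, total consumer surplus) at a price."""
    buyers = [w for w in wtps if w >= price]
    return len(buyers), sum(w - price for w in buyers)


# One price for everyone.
uniform_price = best_price(RICHER + POORER)
uniform_served, uniform_surplus = outcome(RICHER + POORER, uniform_price)

# A separate price for each group, made possible by data on group membership.
richer_price, poorer_price = best_price(RICHER), best_price(POORER)
richer_served, richer_surplus = outcome(RICHER, richer_price)
poorer_served, poorer_surplus = outcome(POORER, poorer_price)

print(f"Uniform price {uniform_price}: {uniform_served} served, "
      f"consumer surplus {uniform_surplus}")
print(f"Group prices {richer_price} and {poorer_price}: "
      f"{richer_served + poorer_served} served, "
      f"consumer surplus {richer_surplus + poorer_surplus}")
```

With these particular numbers, the uniform price shuts the poorer group out of the market entirely, while group-specific pricing serves more people without raising the price anyone else pays. Other numbers could produce the opposite result, which is exactly why the question is one of magnitudes.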


Either of these theories of harm is predicated on the inability or difficulty of competitors to develop alternative products in the marketplace, the so-called “data barrier to entry.” The argument is that upstarts do not have sufficient data to compete with established players like Google and Facebook, which in turn employ their data both to attract online advertisers and to foreclose their competitors from this crucial source of revenue. There are at least four reasons to be dubious of such arguments:

  1. Data is useful to all industries, not just online companies;
  2. It’s not the amount of data, but how you use it;
  3. Competition online is one click or swipe away; and
  4. Access to data is not exclusive.


Privacy advocates have thus far failed to make their case. Even in their most plausible forms, the arguments for incorporating privacy and data concerns into antitrust analysis do not survive legal and economic scrutiny. In the absence of strong arguments suggesting likely anticompetitive effects, and in the face of enormous analytical problems (and thus a high risk of error cost), privacy should remain a matter of consumer protection, not of antitrust.

On Wednesday, March 18, our fellow law-and-economics-focused brethren at George Mason’s Law and Economics Center will host a very interesting morning briefing on the intersection of privacy, big data, consumer protection, and antitrust. FTC Commissioner Maureen Ohlhausen will keynote, and she will be followed by what looks to be a lively panel discussion. If you are in DC you can join in person, but you can also watch online. More details below.
Please join the LEC in person or online for a morning of lively discussion on this topic. FTC Commissioner Maureen K. Ohlhausen will set the stage by discussing her Antitrust Law Journal article, “Competition, Consumer Protection and The Right [Approach] To Privacy”. A panel discussion on big data and antitrust, which includes some of the leading thinkers on the subject, will follow.
Other featured speakers include:

Allen P. Grunes
Founder, The Konkurrenz Group and Data Competition Institute

Andres Lerner
Executive Vice President, Compass Lexecon

Darren S. Tucker
Partner, Morgan Lewis

Nathan Newman
Director, Economic and Technology Strategies LLC

Moderator: James C. Cooper
Director, Research and Policy, Law & Economics Center

A full agenda is available here.