Data Justice Blog

Why Data Tech is Driving Income Disparity via Vinod Khosla

Coming across this piece, The Next Technology Revolution Will Drive Abundance And Income Disparity by Vinod Khosla, I thought it added to the discussion in our blog post, 

The Emerging Corporate Control of Social Science Knowledge

Big data is shifting power over our idea of the social not just from theorists and qualitative researchers to data scientists, but also from academic social scientists to research labs controlled by corporations.  

This, at least, is the clear implication of a piece by Ben Williamson at the London School of Economics entitled The death of the theorist and the emergence of data and algorithms in digital social research.

Wired’s Chris Anderson argued a few years ago that the social sciences are being replaced by big data and algorithmic techniques, with associations emerging from data replacing the model of theoretical hypotheses tested in a deliberate manner by social scientists.  Williamson dismisses such positions as “exaggerated obituaries for the social sciences,” but what is clear is that big data is challenging how, and by whom, the institutional practices of social science “knowledge production” are controlled. 

New players, often associated with corporate research labs like Facebook’s Data Science Team, are using internally generated corporate data to promote their understanding of how people behave.  Social media is seen as an especially rich source of behavioral data, with the hope that analyzing how people use Twitter and blogs to document everyday activities will reveal massive population trends and social patterns.  

However, the dark side of this shift is that such data analysis tends to focus on graphical visualization, creating, in the words of researcher Lisa Gitelman, a “database aesthetics” that amplifies the rhetorical function of data at the expense of deeper qualitative and mathematical analysis. 

Ashlin Lee's Why Big Data Is Scarier Than Metadata Retention: a Nice Bibliography of Sources

Ashlin Lee at Australia's Lifehacker site argues that issues like data retention and government surveillance are just the tip of the iceberg of the worries the public should have about Big Data, and he helpfully creates a linked bibliography of resources for further exploring the problems he highlights:

FTC Report Revealed Clash Between Google's Public Statements and Internal Documents

Harvard's Ben Edelman responded to the leak of the FTC Google staff report with an in-depth memo comparing available materials (particularly the staff memorandum's primary source quotations from internal Google emails) with Google's public statements.

Media Focuses on Silicon Valley Political Power in Wake of FTC Pass for Google

A series of articles across the political spectrum spotlighted the rising political power of Silicon Valley, and Google in particular, in the wake of revelations that the Federal Trade Commission ignored most of its staff's recommendations and analysis arguing that antitrust action was needed against Google.   

A sampling of headlines:

The Monsanto Privacy Model? Monsanto and Farmer Organizations May Be Forging a New Path to Data Cooperation between Users and Corporations

Monsanto is usually known (and sometimes reviled) for its sale of genetically modified seeds, but the company is increasingly becoming the key big data technology firm providing real-time data to farmers as they plant their fields.  Since it purchased the technology firm Climate Corp. in 2013 for $930 million, Monsanto has begun providing real-time data through a cell phone app to farmers cultivating 60 million of the 161 million acres of U.S. farmland, meaning more than a third of U.S. farmland is under the guidance of Monsanto’s climate and cultivation data. Basic data is provided free, and farmers pay a premium for more specialized data and help.
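As a quick sanity check on that "more than a third" claim, here is a minimal sketch using only the acreage figures cited above:

```python
# Share of U.S. farmland covered by Monsanto's Climate Corp. app,
# using the acreage figures cited in this post.
covered_acres = 60_000_000
total_acres = 161_000_000

share = covered_acres / total_acres
print(f"{share:.1%}")  # prints 37.3%

# Confirms the claim: more than a third of U.S. farmland.
assert share > 1 / 3
```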
There are 30 million agricultural fields in America, and Monsanto has mapped them all with soil and climate data at a 10-meter by 10-meter resolution.  It provides real-time temperature, weather, soil moisture, and other metrics to guide farmers on what to expect, even telling them the best days to work their fields and, with the premium version, how much water and fertilizer to use. 
The company uses satellite data to show farmers trouble spots in their fields and, with auto-steering technologies in place, helps farmers drive equipment in straight rows. Monsanto is, in the words of Robb Fraley, the company's chief technology officer, "modeling microclimatic conditions, so you can become predictive on not only which field, but which part of a field should someone be looking at.” Another product, acquired with 640 Labs, grabs geo-tagged data from tractors, combines, and other equipment and lets farmers store it for real-time and future analysis. 
Of course, Monsanto also gains as its reach expands, since every new farmer using its Climate Corp. software means new information about its customers for Monsanto, detailing what products they use, what they are farming, and how much money they are making.  This puts Monsanto in the position to control more real-time data about farming in the nation than anybody else, by far.
All of which is frightening, given Monsanto’s track record, but farmer organizations have already recognized the danger of losing control of such vital data to a single company and have organized to negotiate a set of principles on data sharing that could be a model for many other sectors.  Last November, led by the American Farm Bureau, Monsanto, the American Soybean Association, Beck’s Hybrids, Dow AgroSciences LLC, DuPont Pioneer, John Deere, the National Association of Wheat Growers, the National Corn Growers Association, the National Farmers Union, Raven Industries, and the USA Rice Federation agreed to a set of principles to help protect farmers' control of their own data as they negotiate with large agribusiness data companies like Monsanto. 
The principles of the agreement, which could be a model for other groups and legislation, include:

Scandal of FTC Burying Staff Analysis of Google's Search Advertising Monopoly Power

When the Wall Street Journal accidentally got its hands on an original staff report from the Federal Trade Commission's antitrust investigation of Google, many in the media focused on where the staff report differed from the conclusions of the FTC Commissioners on whether Google’s favoring of its own properties in its search engine unfairly hurt rival “vertical” content providers.

Now, politically accountable officials can and should disagree with often-overeager staff about the conclusions of their work, but it is problematic that the shockingly short original FTC decision ignored so much of the evidence supporting the staffers' argument for action against Google.

But the real scandal out of the revelations of the staff report is that the FTC Commissioners didn’t even address the staff arguments that Google was not just undermining competition in search engines, but that it so dominated search advertising that no rival could viably challenge it.

Originally I, like many others, thought the FTC Commissioners' decision was weak because they had ordered too narrow an investigation (looking only at search dominance versus competing verticals and ignoring the advertising side of the business model).   But no: the staff report contains extensive analysis of the advertising side and of how Google's exclusive contracts, higher monetization rates, and other actions make it impossible for competitors to challenge it. 

The FTC Commissioners Barely Acknowledged Google Was in the Advertising Business

Reading the original FTC decision, you’d barely know Google made all its money from advertising customers, not from users of its search engine.  Aside from condemning Google's practice of barring third-party software from managing AdWords campaigns alongside campaigns on rival platforms, the Commissioners offered essentially no discussion of anticompetitive actions on the advertising side of Google's operations.

But then you read the staff report, and a gusher of analysis of the search advertising marketplace emerges.  Questions that antitrust scholars like myself had asked fruitlessly, for lack of data, were addressed by FTC research staff, yet the data was neither made public nor even discussed by the Commissioners.

In fact, producing this kind of research for the public is the least of what the FTC should be doing, even when it doesn’t take direct action; instead, the Commissioners buried the report and its data.

That is the real outrage.

Panopticon to Cryptopticon: How Scary Surveillance Gave Way to Invisible Data-Driven Exploitation

Siva Vaidhyanathan has a must-read analysis in the Hedgehog Review of how to think of “privacy” in the modern era of big data. He starts by invoking an older trope, the Panopticon, Michel Foucault’s model of how the rise of visible, threatening surveillance by government and corporate actors changed the patterns of modern life.  Prison towers tracked prisoners across the prison yard, foremen tracked workers with stopwatches, and governments induced obedience with cops on the beat.  
Privacy in that older mode of existence was about hiding from that surveillance and protecting the freedom to act without retaliatory discipline.  One feature of that older, panoptic surveillance was that it didn’t matter whether the employer or government was actually watching: people either changed their behavior and obeyed for fear of being caught, or hid in whatever margins of freedom the surveillance left uncovered.  
The Stasi police state is the prototypical Panopticon of obedience induced by fear of surveillance.  However, in the modern, democratic world of overlapping data-driven worlds of family, work, finance, and public life, we are now less fleeing specific centralized surveillance than, in Vaidhyanathan’s words, trying to “manage our various reputations within and among various contexts.”  
Visible surveillance gives way to endless data collection, but he argues that this rise of Big Data is less driven by specific technological opportunities than economic imperatives which have increased "incentives to target, trace, and sift” for profit opportunities.  
Instead of the scary Panopticon, Vaidhyanathan argues for the emergence of an almost invisible “Cryptopticon,” in which, instead of being frightened by surveillance, we are induced to share as much of ourselves, our data, and our relationships as possible: “the Cryptopticon is not supposed to be intrusive or obvious. Its scale, its ubiquity, even its very existence, are supposed to go unnoticed.”  We may know we are being tracked, but we cease to care.  
Where the Panopticon depended on forcing people to act in ever more uniform ways to induce obedience along particular lines demanded by those in economic and political power, the goal of the Cryptopticon is to encourage people to voluntarily “sort themselves into ‘niches’ that will enable more effective profiling and behavioral prediction.”

Understanding Value in a Black Box Society

A “big data revolution” is afoot in the social sciences. The increasing volume, variety, and velocity of data are irresistible raw material for inquiry. For its most optimistic exponents, the “datistic turn” renews social science by focusing inquiry on objective, verifiable, and measurable facts. Explicit models of behavior premised on (quasi-)experimental evidence may render once-soft fields as hard as biology, chemistry, or physics. On this account, economics has led the way, and the rest—ranging from philosophy to anthropology—must follow.

The datistic turn should revive interest in a neglected meta-field: the philosophy of social science. Lively debates raged in mid-20th century between some forerunners of today’s big data devotees (behaviorists), and interpretive social scientists committed to more narrative, normative, and holistic inquiry. The behaviorists’ tendency to treat mental processes as a “black box” is uncannily echoed in many current researchers’ uncritical acceptance of extant corporate data sets (and limits imposed on their use) as objective records.

Given firms’ triple layers of real and legal secrecy, and obfuscation, journals should be wary of such research until it is truly reproducible. Moreover, given the importance of key firms themselves to understanding our society, their internal decisionmaking should be archived for eventual release (even if it is decades in the future).  Social scientists might consider going beyond analysis of extant data, and joining coalitions of activists, to assure a more expansive, comprehensible, and balanced set of “raw materials” for analysis, synthesis, and critique. In short, rather than solely watching society, social science must now commit to assuring the representativeness and relevance of what is watched. The only alternative to “future-forming” research is to let the most powerful pull the strings in comfortable obscurity, while scholars’ agendas are dictated by the information that, by happenstance or design, is readily available.

The same cautions should govern legal scholarship on the platform economy. Digital labor remains highly controversial. For example, Uber has very creatively orchestrated a series of studies and alliances purporting to demonstrate the value and importance of its services. However, in order to truly understand its social costs, as Brishen Rogers shows, we would need to have access to far more information, which is now proprietary and hidden. For example, who approved its fake ride requests to undermine its competitor, Lyft? What types of returns are investors being promised? How much of the firm’s success is due to real, productive innovation, and how much simply reflects regulatory arbitrage (akin to Amazon’s famous tax advantages over brick-and-mortar retailers)?

Similarly, the extraordinary controversy over the only partially-available FTC staff report on the agency’s antitrust investigation of Google shows how even innovation policy itself can remain “in the dark” when it is politically convenient for it to remain so. I called for release of the report in 2013, only to be met with stony silence by the agency. Now, every other page of the report has been inadvertently released, and even this partial disclosure has several damning allegations and pieces of evidence. Until the full report is released (as well as some indication of the scope and nature of the controversy between the enforcement and economics divisions over the case), competition policy in the US remains opaque. Given what we have now, it’s hard to resist the conclusion that brute political calculations overrode the agency’s expert judgment.

When state and trade secrecy impose severe limits on the availability and use of sources, we must be very cautious about drawing conclusions too quickly about the nature of the digital economy. Leading firms have an agenda, which researchers can unwittingly advance when they focus inquiry on data which (executives have decided) are innocuous enough to be disclosed. A diverse coalition of watchdog groups, archivists, open data activists, and public interest attorneys are now working to assure a more balanced and representative set of “raw materials” for analysis. The critical and emancipatory potentials of social science and legal scholarship depend on the success of such efforts. 

This Time is Different: How Big Data Has Left the Middle Class Behind

What if innovation is driving economic stagnation and inequality?  That’s the question Charles Leadbetter analyzes in “The Whirlpool Economy” over at the UK Innovation Foundation’s Long+Short site.  He makes key points about the current relationships between innovation and the economy, but partly misses what may make the new technology of big data a particularly toxic driver of current economic inequality. That stagnation haunts the U.S. and especially Europe is a common observation, but as Leadbetter notes, it’s "a very strange one, for it comes at a time when our lives are in the midst of incessant change, much of it brought on by what claim to be radical innovations."

In past periods of stagnation, he argues, "the economy stagnated because there was little underlying dynamism, few new ideas and limited opportunities for entrepreneurship."  He nods in the direction of basic Keynesian analyses of the problem, such as from Larry Summers, who sees a deficit in demand driven by lower wages and austerity public policy. But Leadbetter argues that current innovation itself is a key driver of stagnation, since so much new innovation “is aimed at eliminating jobs and lowering costs”:

"The economy is creating jobs but many of them are low-productivity, low-pay service jobs. The result is that many young people find themselves doing work for which they are over-qualified: a quarter of all ‘entry level’ jobs in London are filled by someone with a degree, quite possibly one they have paid for themselves with debts they may never pay off."

He argues that this problem of the automation of jobs and deskilling of middle class households needs to be addressed by policy that raises wages and kickstarts the virtuous cycle of higher incomes, higher demand and higher production.  And we need less “disruptive” innovation and more innovation that "generates new jobs and augments existing ones; while addressing the spiralling costs of things like energy, health and social care that matter to median-income households."

Leadbetter’s argument is on point as far as it goes, but what he doesn’t fully address is why automation now is so different from past cycles of boom-and-bust.   Analysts have been worrying about mass unemployment and the impoverishment of the working classes at least since the Luddites in the early 19th century saw new textile technology endangering skilled textile jobs.  The rise of mass production and the assembly line was seen as replacing skilled craft workers with semi-skilled automatons working at the behest of the production line machinery.  Yes, robotics threatens to add to the cycle of displacement, but new jobs not even imagined before were created in the past and will likely be created in the future; heck, IBM just announced that it intends to train 10,000 engineers in analyzing Twitter data as part of its business services division, a kind of job that didn’t even exist in the past. 

But the kind of “big data” jobs IBM is developing as part of this cycle of job destruction/creation may highlight what IS different about this technological cycle and why new innovation is not being channeled into new income for middle-income families.   In past rounds of technological job destruction, after the initial pain of unemployment and skills redeployment, workers would organize to demand a share of the new wealth created by the new machines, and consumers would benefit from lower prices. However, with new “disruptive” technology today designed to help corporate America profile workers and consumers in order to increase corporate profits, the “wealth” being created is by its nature more of a zero-sum game.  The industrial age created at least some degree of shared wealth, where Henry Ford could argue that paying workers higher wages would in turn create demand for his cars; but subprime mortgage companies profiling consumers to sell them bad loans depend on the immiseration of working families as their profit source.

