Data Justice Blog

A/B testing in politics- or are we just optimizing war chests?

As research for my book I’m studying the way people use big data techniques, mostly from the marketing world, in politics. So naturally I was intrigued by Kyle Rush’s blogpost about A/B testing on the Obama campaign. Kyle was the Deputy Director of Frontend Web Development at Obama for America.

In case you don’t know the lingo, A/B testing is a test done by marketers to decide which of two ad designs is more effective – the ad with the dark blue background or the ad with the dark red background, for example. But in this case it was more like, the ad with Obama’s family or the ad with Obama’s family and the American flag in the background.

The idea is, as a marketer, you offer your target audience both ads – actually, any individual in the target audience either sees ad A or ad B, randomly – and then, after enough people have seen the ads, you see which population responds more, and you go with that version. Then you move on to the next test, where you keep the characteristic that just won and you test some other aspect of the ad, like the font.
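To make the mechanics concrete, here is a minimal sketch of one such test in Python. It is not the Obama campaign's tooling: the function names, the 50/50 split, and the simulated response rates are all assumptions made for illustration, and the comparison uses a standard pooled two-proportion z statistic.

```python
import math
import random


def assign_variant(rng):
    """Randomly show each visitor ad A or ad B with equal probability."""
    return "A" if rng.random() < 0.5 else "B"


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled z statistic for the difference in response rates (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


def run_test(true_rates, visitors, seed=0):
    """Simulate one A/B test; true_rates are hypothetical, e.g. {"A": 0.10, "B": 0.13}."""
    rng = random.Random(seed)
    shown = {"A": 0, "B": 0}
    responded = {"A": 0, "B": 0}
    for _ in range(visitors):
        variant = assign_variant(rng)
        shown[variant] += 1
        # Each visitor responds with the (made-up) true rate of the ad they saw.
        if rng.random() < true_rates[variant]:
            responded[variant] += 1
    z = two_proportion_z(responded["A"], shown["A"], responded["B"], shown["B"])
    return shown, responded, z
```

A z value well above roughly 2 in absolute terms is the usual rule of thumb for saying one ad really did draw more responses, rather than the gap being noise from who happened to see which version.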

As a mathematical testing framework, A/B testing is interesting and has structural complications – how do you know you’re getting a global maximum instead of a local maximum? In other words, if you’d first tested the font, and then the background color, would you have ended up with a “better ad”? What if there are 50 things you’d like to test, how do you decide which order to test them in?
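The order question can be seen in a toy example. The sketch below uses an invented table of "true" response rates for four ads, in which font and background color interact; every number and name in it is hypothetical, and it assumes each test picks the true winner with no noise.

```python
from itertools import product

# Hypothetical true response rates, invented for illustration only.
RATES = {
    ("blue", "serif"): 0.10,
    ("blue", "sans"): 0.11,
    ("red", "serif"): 0.12,
    ("red", "sans"): 0.08,
}
OPTIONS = {"background": ["blue", "red"], "font": ["serif", "sans"]}


def rate(ad):
    """Look up the assumed response rate for an ad design."""
    return RATES[(ad["background"], ad["font"])]


def sequential_tests(start, order):
    """Test one feature at a time, keeping the winner before moving on."""
    ad = dict(start)
    for feature in order:
        candidates = [{**ad, feature: value} for value in OPTIONS[feature]]
        ad = max(candidates, key=rate)
    return ad


def best_overall():
    """Try every combination: the global maximum, affordable only for tiny cases."""
    ads = [dict(zip(OPTIONS, combo)) for combo in product(*OPTIONS.values())]
    return max(ads, key=rate)


start = {"background": "blue", "font": "serif"}
color_first = sequential_tests(start, ["background", "font"])
font_first = sequential_tests(start, ["font", "background"])
```

With these made-up numbers, testing color first lands on the red serif ad, which is also the best of all four, while testing font first gets stuck on the blue sans ad. With 50 features the exhaustive search is out of reach, which is exactly why the ordering problem matters.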

But that’s not what interests me about Kyle’s Obama A/B testing blogpost. Rather, I’m fascinated by the definition of success that was chosen.

After all, an A/B test is all about which ad “works better,” so there has to be some way to measure success, and it has to be measured in real time if you want to go through many iterations of your ad.

In the case of the Obama campaign, there were two definitions of success, or maybe three: how often people signed up to be on Obama’s newsletter, how often they gave money, and how much money they gave. I infer this from Kyle’s braggy second sentence, “Overall we executed about 500 a/b tests on our web pages in a 20 month period which increased donation conversions by 49% and sign up conversions by 161%.” Those were the measures Kyle and his team were optimizing on.

FTC Dropped Google Antitrust Action Staff Wanted - But Why Should That be Surprising?

Yesterday, the Wall Street Journal reported they had obtained the internal staff report from the Federal Trade Commission, where top staff urged action against Google for abusing its monopoly position.  

On four key questions, (1) Did Google illegally favor its own content over rivals, (2) Did Google illegally copy content from rival sites, (3) Did Google illegally restrict advertisers' ability to run campaigns on rival search engines, and (4) Did Google illegally restrict other websites that publish its search results from working with rival search engines, the staff report found that Google had violated the law and that the FTC should take action.

However, when the FTC commissioners made their ruling, while a few commissioners expressed concern, they argued in the end that Google had done nothing illegal and no action was warranted.  Notably, the commissioners did not even reference most of the evidence outlined in this unredacted report, which was based on nine million pages of documents from Google and other parties.  Factually, one important detail that the FTC staff had found was that Google's dominance of search was not 65% as widely reported but potentially as high as 84% of U.S. search queries.

The FTC Commissioners may well have just had a different view of the evidence than their staff -- although I obviously think the staff had the better arguments (see my law review articles here and here on other approaches I argue the government should be taking against Google on antitrust) -- but what's surprising is that anyone would expect the political appointees at the FTC to take any other action.  The Republican-chosen members had an ideological bias against antitrust action of almost any kind to begin with, while the Democratic nominees were just very unlikely to go after one of the most important political friends of President Obama.

Eric Schmidt, Chairman and previously CEO of Google, was not any ordinary political funder of Obama.   He was a core political operative for the campaign who played a key role in establishing the digital operations that wowed the nation as Obama built a campaign network that took first the nomination and then the Presidency.   As David Plouffe wrote in his campaign autobiography, The Audacity to Win: The Inside Story and Lessons of Barack Obama's Historic Victory:

“With the help of supporters like Eric Schmidt of Google, we dramatically improved our digital strategy and execution, and I’d say we were competitive digitally with any business-world start-up.”

It is not overly cynical to note that while Obama has taken on corporate interests in a number of fields from health care to the for-profit health care industry to the energy sector, he has been notably softer on a few of the sectors where he got the most core initial campaign support, notably Wall Street finance folks clustered around Goldman Sachs and Silicon Valley.   And Google, through Eric Schmidt, was one of the key conduits for that Silicon Valley support, so a groundbreaking antitrust action by Obama's FTC against Obama's key political supporter was unlikely ever to be in the cards.

15 Years of FTC Failure to Factor Privacy into Merger Reviews

Under both Democratic and Republican administrations, over more than fifteen years, the Federal Trade Commission has ignored privacy concerns in approving merger after merger. The Electronic Privacy Information Center (EPIC) details that history in extensive comments submitted to the FTC as part of its review of its own merger remedy process. 

Just a sampling of the mergers detailed where privacy concerns were ignored:

  • 1999-  EPIC first raised concerns when the Internet advertising firm Doubleclick proposed acquiring the catalog database firm Abacus.
  • 2000- EPIC and a coalition of consumer groups highlighted the danger to privacy in the proposed merger of Time Warner and AOL.
  • 2007- EPIC highlighted the clear danger to consumer privacy of combining Google’s own extensive profiling of consumers with Doubleclick’s database in a corporate merger.
  • 2014- As recently as last year, consumer groups asked that privacy concerns be taken into account to block the merger of Facebook with WhatsApp.

As EPIC argues, in each case:

[T]he practical consequence of the merger would be to reduce the privacy protections for consumers and expose individuals to enhanced tracking and profiling. The failure of the Federal Trade Commission to take this into account during merger review is one of the main reasons consumer privacy in the United States has diminished significantly over the last 15 years.

In many cases, companies that had built their businesses on promises not to collect or share personal data were then absorbed by companies without such commitments, betraying the trust users had placed in the original companies. Notably, after the DoubleClick merger, “Google has continued to expand the tracking and profiling Internet users, often ignoring prior commitments it had made to protect the privacy of these same users.”

Notably, European competition regulators are increasingly seeing protecting personal data from corporate control as an integral part of their responsibility.  Recently appointed European Union Competition Commissioner Margrethe Vestager argued recently:

A critique of a review of a book by Bruce Schneier

I haven’t yet read Bruce Schneier’s new book, Data and Goliath: The Hidden Battles To Collect Your Data and Control Your World. I plan to in the coming days, while I’m traveling with my kids for spring break.

Even so, I already feel capable of critiquing this review of his book (hat tip Jordan Ellenberg), written by Columbia Business School Professor and Investment Banker Jonathan Knee. You see, I’m writing a book myself on big data, so I feel like I understand many of the issues intimately.

The review starts out flattering, but then it hits this turn:

When it comes to his specific policy recommendations, however, Mr. Schneier becomes significantly less compelling. And the underlying philosophy that emerges — once he has dispensed with all pretense of an evenhanded presentation of the issues — seems actually subversive of the very democratic principles that he claims animates his mission.

That’s a pretty hefty charge. Let’s take a look into Knee’s evidence that Schneier wants to subvert democratic principles.

NSA

First, he complains that Schneier wants the government to stop collecting and mining massive amounts of data in its search for terrorists. Knee thinks this is dumb because it would be great to have lots of data on the “bad guys” once we catch them.

Any time someone uses the phrase “bad guys,” it makes me wince.

But putting that aside, Knee is either ignorant of or is completely ignoring what mass surveillance and data dredging actually creates: the false positives, the time and money and attention, not to mention the potential for misuse and hacking. Knee’s opinion on that is simply that we normal citizens just don’t know enough to have an opinion on whether it works, including Schneier, and in spite of Schneier knowing Snowden pretty well.

It’s just like waterboarding – Knee says – we can’t be sure it isn’t a great fucking idea.

Wait, before we move on, who is more pro-democracy, the guy who wants to stop totalitarian social control methods, or the guy who wants to leave it to the opaque authorities?

Data Justice Report: Taking on Big Data as an Economic Justice Issue

For Release: March 16, 2015
Contact:  Nathan@datajustice.org  (917) 854-0279

The control of personal data by “big data” companies is not just an issue of privacy but is becoming a critical issue of economic justice, argues a new report issued by the organization Data Justice (www.datajustice.org), which itself is being publicly launched in conjunction with the report.

Report: Data Justice
Taking on Big Data as an Economic Justice Issue

“This steady loss of data by individuals into the hands of increasingly centralized corporate hands is helping to drive a large portion of the economic inequality that has become central to political debate in our nation,” said Data Justice Director Nathan Newman, the report’s author.

Big data platforms collect so much information about so many people, details the report, that correlations emerge that allow individuals to be slotted into hiring and marketing categories in unexpected and often unwelcome ways that usually leave them at a distinct disadvantage in negotiations.  This enables advertisers to offer goods at different prices to different people, what economists call price discrimination, to extract the maximum price from each individual consumer.  Such online price discrimination raises prices overall for consumers, while often hurting lower-income and less technologically savvy households.

Data crunchers were key to manipulating financial markets and securities throughout the financial industry and big data platforms were critical parts of the marketing machine that used various forms of consumer profiling and price discrimination to push subprime financial products out to the most vulnerable members of the American public.  Notably, by the mid-2000s, the lion’s share of the online advertising economy was being driven by subprime and related mortgage lenders, highlighting the ways the profits of big data platforms have often come at the expense of consumer welfare.

“At the same time, big data is fueling economic concentration across our economy,” argues Newman.  As a handful of data platforms generate massive amounts of user data, the barriers to entry rise since potential competitors have little data themselves to entice advertisers compared to the incumbents who have both the concentrated processing power and supply of user data to dominate particular sectors.  With little competition, companies end up with little incentive to either protect user privacy or share the economic value of that user data with the consumers generating those profits.

The report argues for a threefold approach to making big data work for everyone in the economy, not just the big data platforms’ shareholders:

  • First, regulators need to strengthen user control of their own data by both requiring explicit consent for all uses of the data and better informing users of how it’s being used and how companies profit from that data. 
  • Second, regulators need to factor control of data into merger review and to initiate antitrust actions against companies like Google where monopoly control of a sector like search advertising has been established.  
  • Third, policymakers should restrict practices that harm consumers, including banning price discrimination where consumers are not informed of all discount options available and bringing the participation of big data platforms in marketing financial services under the regulation of the Consumer Financial Protection Bureau.

Data Justice itself has been founded as an organization “to promote public education and new alliances to challenge the danger of big data to workers, consumers and the public.”  It will work to educate the public, policymakers and organizational allies on how big data is contributing to economic inequality in the economy.  Its new website at

http://www.datajustice.org

is intended to bring together a wide range of resources highlighting the economic justice aspects of big data.

Big Data Is The New Phrenology

Have you ever heard of phrenology? It was, once upon a time, the “science” of measuring someone’s skull to understand their intellectual capabilities.

This sounds totally idiotic but was a huge fucking deal in the mid-1800’s, and really didn’t stop getting some credit until much later. I know that because I happen to own the 1911 edition of the Encyclopedia Britannica, which was written by the top scholars of the time but is now horribly and fascinatingly outdated.

For example, the entry for “Negro” is famously racist. Wikipedia has an excerpt: “Mentally the negro is inferior to the white… the arrest or even deterioration of mental development [after adolescence] is no doubt very largely due to the fact that after puberty sexual matters take the first place in the negro’s life and thoughts.”

But really that one line doesn’t tell the whole story. Here’s the whole thing, it’s long:

Pages 1 and 2

Industrial Policy for Big Data

 

If you are childless, shop for clothing online, spend a lot on cable TV, and drive a minivan, data brokers are probably going to assume you’re heavier than average. We know that drug companies may use that data to recruit research subjects.  Marketers could utilize the data to target ads for diet aids, or for types of food that research reveals to be particularly favored by people who are childless, shop for clothing online, spend a lot on cable TV, and drive a minivan.

We may also reasonably assume that the data can be put to darker purposes: for example, to offer credit on worse terms to the obese (stereotype-driven assessment of looks and abilities reigns from Silicon Valley to experimental labs).  And perhaps some day it will be put to higher purposes: for example, identifying “obesity clusters” that might be linked to overexposure to some contaminant.

To summarize: let’s roughly rank these biosurveillance goals as: 

1) Curing illness or precursors to illness (identifying the obesity cluster; clinical trial recruitment)

2) Helping match those offering products to those wanting them (food marketing)

3) Promoting the classification and de facto punishment of certain groups (identifying a certain class as worse credit risks)

At present, law does not do enough to recognize how valuable goals like 1) are, and how destructive 3) could become. In fact, to the extent 1 is highly regulated, and 3 is unregulated, law may perversely help channel capital into discriminatory ventures and away from socially productive ones. 

Launching Data Justice

As a handful of data platforms generate massive amounts of user data, the barriers to entry rise since potential competitors have little data themselves to entice advertisers compared to the incumbents who have both the concentrated processing power and supply of user data to dominate particular sectors.  The upshot of this market power by big data platforms is that the marketplace is doing little to create options for consumers that might alleviate the misuse of consumer data or encourage big data platforms to better compensate users who are willing to share their data. 

Data Justice has been launched as a project to promote public education and new alliances to challenge the danger of big data to workers, consumers and the public.  Our work will include:

  • A Focus on Financial Exploitation: Data Justice will educate key stakeholders, allies, and the public on approaches to prevent big data platforms from using that data in ways that harm consumers.  We will highlight the way big data platforms facilitate the exploitation of employees, consumers and citizens by abusive financial services companies and thereby increase economic discrimination and inequality in the economy.
  • Outreach to Allies: We will work to expand the coalition of organizations focused on the problem of big data platforms by bringing in the consumer, civil rights, union and other organizations currently mobilized around financial reform in the wake of the recent financial crisis.  Data Justice will work with a range of organizations to highlight the problem of big data platforms and how a focus on their role in promoting economic exploitation in financial services fits within those groups' current work. 
  • Public Education Campaign:  Data Justice is engaged in a broad public education campaign, including developing a public website, social media campaign, public policy documents and placing individual articles, blog posts and other media pieces to support the effort.  Our goal is to focus media and public attention on the issue of online price discrimination and the power of big data platforms, as well as the way concentrated control of user data feeds increasing economic inequality in the economy.
  • Develop Policy Research:  On an ongoing basis, Data Justice will produce policy reports, articles and policy briefs that outline how big data impacts different sectors of consumer and workers' rights and policy options for reducing those harms to the public.
  • Educate Government Officials and Other Targets of Campaign:  We will work to educate key members of government agencies, elected officials and other non-government institutions about what we see as the danger of data platforms' impact on economic justice issues and why regulation is warranted.

We hope you will follow our work in the months to come!
