A/B testing may seem harmless, but many consumers don’t like how easily companies can test them without their knowledge. Should marketers change how they test?
What ethical questions could a simple A/B test raise? What could be wrong with testing how people react to two different campaigns or website colors?
Michelle Meyer would tell you: not much. She’s an assistant professor and associate director of research ethics at and says that she doesn’t see the bounds of A/B testing so differently from the bounds of business ethics. “Selling? Yes. Upselling? Hmm. Advertising? Yes. False advertising? No,” she says.
But as the amount of data online grows, the line between research and business gets thinner. Companies can now A/B test large groups of consumers, and social media platforms can test even larger numbers of users. Large corporations, including Google, Amazon and Netflix, undertake many A/B tests every day, unknown and unseen by the users being tested. This is unsettling to some; many consumers have loudly, sometimes angrily, spoken out when they realized that they were part of A/B tests undertaken by companies that wield large pools of data.
In March 2014, Facebook tested about 700,000 of its users without telling them. The social platform gave some users a positive newsfeed and others a negative newsfeed, publishing the results of its study in collaboration with researchers from Cornell University in the journal Proceedings of the National Academy of Sciences. The resulting paper, titled “,” found that users with negative newsfeeds posted more negative words, and those with positive newsfeeds posted more positive words. Many users decried Facebook’s use of “human experimentation.” The chorus of complaints grew so loud that Adam D.I. Kramer, one of the Facebook researchers who worked on the study, apologized.
“The reason we did this research is because we care about the emotional impact of Facebook and the people that use our product,” Kramer wrote in a Facebook blog post. “I can understand why some people have concerns about it, and my co-authors and I are very sorry for the way the paper described the research and any anxiety it caused.”
Meyer found the kerfuffle to be distasteful, not for what Facebook and Cornell researchers studied, but for what she thought was a misled, overheated response by many angry media members and Facebook users. The sample size of the study was huge, but the results were minimal: viewers who saw negative newsfeeds, for example, posted only four more negative words for every 10,000 words they wrote. What was reported by many as a giant wave of emotion was more of a minor blip. “People were wrong on the internet and it was annoying,” Meyer says. , an administrative body that regulates research performed on human subjects, if Facebook had practiced slightly better , notifying potential subjects of active tests and alerting them to potential side effects. Her post was quickly picked up by Wired and shared thousands of times. “We can certainly have a conversation about the appropriateness of Facebook-like manipulations, data mining and other 21st-century practices,” Meyer wrote in the post. “But so long as we allow private entities freely to engage in these practices, we ought not unduly restrain academics trying to determine their effects.”
A year later, Meyer wrote a column for The New York Times with Christopher Chabris, an associate professor of psychology at Union College, which editors provocatively titled: “” Meyer and Chabris wrote that the outrage against Facebook’s testing was a “moral illusion,” a false choice between releasing a product or atmosphere and experimenting with different products or atmospheres.
“Companies (and other powerful actors, including lawmakers, educators and doctors) ‘experiment’ on us without our consent every time they implement a new policy, practice or product without knowing its consequences,” they wrote. “We aren’t saying that every innovation requires A/B testing. Nor are we advocating nonconsensual experiments involving significant risk. But as long as we permit those in power to make unilateral choices that affect us, we shouldn’t thwart low-risk efforts … to rigorously determine the effects of those choices. Instead, we should cast off the A/B illusion and applaud them.”
But others disagree and see A/B testing as an ethical risk, no matter if researchers run tests in a lab or corporations run tests atop shared desks. Reiter, a professor of computing science at the University of Aberdeen and chief scientist of Arria NLG, teaches the Facebook study to his students as something to avoid. It was unethical, he says. “I would certainly never accept that as an academic research project,” Reiter says. “It’s not acceptable to me to manipulate people’s emotion. That is not acceptable without informed consent.”
Informed consent is the heart of modern research ethics, Reiter says. In 2017, he struggled with an A/B test proposal for this reason. He was asked to approve a project by a researcher who was working with a real-world service provider; they wanted to test different strategies on different clients, then evaluate the results. He knew that these real-world tests can be helpful, but he also knew that they exist in ethical gray areas. He wasn’t quite sure how to ask participants for informed consent. If researchers were transparent, they might bias the participants and ruin the test; if the researchers weren’t transparent, he didn’t believe that the testing would be ethical.
“We decided to go with allowing [participants] to opt out and be transparent after the fact,” he says, meaning the study was explained to participants after they were tested. “Academically, I think that it’s the right thing to do, and I think that companies might also consider going down that path because, otherwise, it’s a danger to blow up in their face.”
Most companies solve this issue by avoiding it. They don’t ask for informed consent from the users whom they A/B test; at most, they inform users of potential tests in agreements users sign when they join. These agreements are filled with fine print and legalese; no one reads them, Reiter says. Users click away without even glancing at the fine print, thereby allowing companies to test their data and send it along to third-party aggregators. Reiter says that researchers need to hold themselves to a higher standard than this, but he believes that businesses should, too. Most companies decide their testing policies in terms of what will or won’t get them sued, he says, but to not account for ethics is to activate a ticking time bomb that he believes will destroy a company’s reputation.
Ethics don’t lend themselves to binary choices, Meyer says. Ethical marketers, like researchers, will always be faced with tough choices. The Common Rule (the U.S. rule of ethics that oversees biomedical and behavioral research involving human subjects) guides researchers toward more ethical choices, but situational standards may be applied differently on each test. And even so, the Common Rule doesn’t apply to corporations. Companies that can afford ethical consultants might hire them for tough decisions, but those without the budget should also consider the ramifications of their testing.
There are obvious fault lines of A/B testing that marketers will encounter: A/B testing an alcohol brand on the Facebook fans of Alcoholics Anonymous is unethical to its core, whereas A/B testing two different headlines on the same marketing email is as benign as a single drop of rain. As Meyer noted, this is Business Ethics 101.
There are many reasons why consumers might object to A/B testing, Meyer says: objections to randomization, a feeling of unfairness or inequality, an assumption that businesses already know what will work. But she doesn’t believe that it’s logically consistent to be against A/B testing just because it tests two different things. CEOs often decide to launch one product to market without testing, essentially a “B test.” Meyer says that she doesn’t believe anyone would be angry at the CEO of that company for giving consumers a B test without consent.
“So why is the moral world shook upside down if half of people get A and half of people get B?” she asks. “That’s a bit of a mystery.”
What would be a change is more transparency, similar to what Reiter opted for in his academic A/B test. Meyer, like Reiter, says that companies should assuage consumer concerns by creating a landing page that explains their ongoing A/B tests. This page would simply explain why the company is showing users multiple versions of the same page. Reiter says that users should also be given the right to opt out if they don’t want their data to be used as part of an A/B test, and he says that they should be informed as to how their data is being protected.
The alternative, of course, is to do nothing. This is what most companies do, save for a note when users sign up for their services. If users find out they’re being tested, they may be angry or they may not care. “It’s a dilemma,” Meyer says.
Reiter says that much of this dilemma can be solved with transparency: informing users of the test. If users complain, then he says that the company should change its future tests.
“If you don’t do it then it’s going to blow up in your face sooner or later,” he says. “As an academic, we tell people, ‘You’ve got to [test] properly. And if you try to hide that you are doing it, you’ll eventually get found out, and it will be a lot worse.’”