How an IRB Could Have Legitimately Approved the Facebook Experiment—and Why that May Be a Good Thing

Image courtesy Flickr

By now, most of you have probably heard—perhaps via your Facebook feed itself—that for one week in January of 2012, Facebook altered the algorithms it uses to determine which status updates appeared in the News Feed of 689,003 randomly-selected users (about 1 of every 2500 Facebook users). The results of this study—conducted by Adam Kramer of Facebook, Jamie Guillory of the University of California, San Francisco, and Jeffrey Hancock of Cornell—were just published in the Proceedings of the National Academy of Sciences (PNAS).

Although some have defended the study, most have criticized it as unethical, primarily because the closest that these 689,003 users came to giving voluntary, informed consent to participate was when they—and the rest of us—created a Facebook account and thereby agreed to Facebook’s Data Use Policy, which in its current iteration warns users that Facebook “may use the information we receive about you . . . for internal operations, including troubleshooting, data analysis, testing, research and service improvement.”

Some of the discussion has reflected quite a bit of misunderstanding about the applicability of federal research regulations and IRB review to various kinds of actors, about when informed consent is and isn’t required under those regulations, and about what the study itself entailed. In this post, after going over the details of the study, I explain (more or less in order):

How the federal regulations define “human subjects research” (HSR)
Why HSR conducted and funded solely by an entity like Facebook is not subject to the federal regulations
Why HSR conducted by academics at some institutions (like Cornell and UCSF) may be subject to IRB review, even when that research is not federally funded
Why involvement in the Facebook study by two academics nevertheless probably did not trigger Cornell’s and UCSF’s requirements of IRB review
Why an IRB—had one reviewed the study—might plausibly have approved the study with reduced (though not waived) informed consent requirements
And why we should think twice before holding academics to a higher standard than corporations

The Study

As the authors explain, “[b]ecause people’s friends frequently produce much more content than one person can view,” Facebook ordinarily filters News Feed content “via a ranking algorithm that Facebook continually develops and tests in the interest of showing viewers the content they will find most relevant and engaging.” In this study, the algorithm filtered content based on its emotional content. A post was identified as “positive” or “negative” if it used at least one word identified as positive or negative by software (run automatically without researchers accessing users’ text).

Some critics of the experiment have characterized it as one in which the researchers intentionally tried “to make users sad.” With the benefit of hindsight, they claim that the study merely tested the perfectly obvious proposition that reducing the amount of positive content in a user’s News Feed would cause that user to use more negative words and fewer positive words themselves and/or to become less happy (more on the gap between these effects in a minute). But that’s not what some prior studies would have predicted. Previous studies both in the U.S. and in Germany had found that the largely positive, often self-promotional content that Facebook tends to feature has made users feel bitter and resentful, a phenomenon the German researchers memorably call “the self-promotion-envy spiral.” Those studies would have predicted that reducing the positive content in a user’s feed might actually make users less sad. And it makes sense that Facebook would want to determine what will make users spend more time on its site rather than close that tab in disgust or despair. [Update as I’m wrapping up this post: the first author, Adam Kramer of Facebook, confirms (on Facebook, of course) that they did indeed want to investigate the theory that seeing friends’ positive content makes users sad.]

In any case, the researchers conducted two experiments, with a total of four groups of users (about 155,000 each). In the first experiment, Facebook reduced the positive content of News Feeds. Each positive post “had between a 10% and 90% chance (based on their User ID) of being omitted from their News Feed for that specific viewing.” In the second experiment, Facebook reduced the negative content of News Feeds in the same manner. In both experiments, these treatment conditions were compared to control conditions in which a similar portion of posts were randomly filtered out (i.e., without regard to emotional content). Note that whatever negativity users were exposed to came from their own friends, not, somehow, from Facebook engineers. In the first, presumably most objectionable, experiment, the researchers merely chose to filter out varying amounts (10% to 90%) of friends’ positive content, thereby leaving a News Feed more concentrated with posts in which a user’s friend had written at least one negative word.
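For readers who find it easier to see the manipulation as a procedure, here is a minimal Python sketch of the filtering described above. It is an illustration under stated assumptions, not Facebook's code: the word lists are hypothetical stand-ins for the word-counting software, and the hash-based mapping from user ID to omission chance is my own guess at what “based on their User ID” might look like. The function names are invented for this example.

```python
import hashlib
import random

# Hypothetical stand-in word lists; the study used word-counting software,
# not these sets.
POSITIVE_WORDS = {"happy", "great", "love"}
NEGATIVE_WORDS = {"sad", "awful", "hate"}


def emotional_labels(post: str) -> set[str]:
    """Label a post by whether it contains at least one listed word."""
    words = set(post.lower().split())
    labels = set()
    if words & POSITIVE_WORDS:
        labels.add("positive")
    if words & NEGATIVE_WORDS:
        labels.add("negative")
    return labels


def omission_probability(user_id: int) -> float:
    """Map a user ID to a fixed omission chance between 10% and 90%.

    Hashing is an assumption; the paper says only "based on their User ID".
    """
    digest = hashlib.sha256(str(user_id).encode()).digest()
    fraction = int.from_bytes(digest[:8], "big") / (2**64 - 1)
    return 0.10 + 0.80 * fraction


def filter_feed(posts, user_id, target, rng):
    """Return the posts shown for one viewing, omitting some `target` posts."""
    p = omission_probability(user_id)
    return [
        post for post in posts
        if not (target in emotional_labels(post) and rng.random() < p)
    ]


# One simulated viewing in the reduced-positivity condition.
feed = ["I love this", "so sad today", "meeting at noon"]
shown = filter_feed(feed, user_id=42, target="positive", rng=random.Random(0))
```

Note that nothing is ever added to the feed in this sketch: posts without a target-emotion word always pass through, and the only thing that changes is how many of a friend's positive (or negative) posts survive a given viewing.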

The results? Whether they hypothesized these results in advance or not, the investigators found that

[f]or people who had positive content reduced in their News Feed, a larger percentage of words in people’s status updates were negative and a smaller percentage were positive. When negativity was reduced, the opposite pattern occurred. These results suggest that the emotions expressed by friends, via online social networks, influence our own moods, constituting, to our knowledge, the first experimental evidence for massive-scale emotional contagion via social networks.

Note two things. First, while statistically significant, these effect sizes are, as the authors acknowledge, quite small. The largest effect size was a mere two hundredths of a standard deviation (d = .02). The smallest was one thousandth of a standard deviation (d = .001). The authors suggest that their findings are primarily significant for public health purposes, because, when aggregated, even small individual effects can have large social consequences.
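To see how an effect this small can still be statistically significant, a rough back-of-the-envelope calculation helps. The sketch below uses the textbook large-sample approximation for the standard error of a standardized mean difference between two equal-sized groups, sqrt(2/n). That approximation is my assumption for illustration; it is not the analysis the authors ran, and the function name is invented.

```python
import math


def approx_z_for_effect(d: float, n_per_group: int) -> float:
    """Approximate z-statistic for a standardized effect size d.

    Assumes two equal-sized independent groups and a small d, so that the
    standard error of d is roughly sqrt(2 / n_per_group).
    """
    standard_error = math.sqrt(2.0 / n_per_group)
    return d / standard_error


# Group size of about 155,000 is taken from the study description above.
z_large_sample = approx_z_for_effect(0.02, 155_000)

# The same effect size in a hypothetical lab-scale study of 1,000 per group.
z_small_sample = approx_z_for_effect(0.02, 1_000)
```

Under these assumptions, the same tiny effect clears the conventional 1.96 threshold comfortably with groups of roughly 155,000, and falls well short of it with groups of 1,000. Sample size, not the magnitude of the effect, is doing the work.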

Second, although the researchers conclude that their experiments constitute evidence of “social contagion” — that is, that “emotional states can be transferred to others” — this overstates what they could possibly know from this study. The fact that someone exposed to positive words very slightly increased the amount of positive words that she then used in her Facebook posts does not necessarily mean that this change in her News Feed content caused any change in her mood. The very slight increase in the use of positive words could simply be a matter of keeping up (or down, in the case of the reduced positivity experiment) with the Joneses. It seems highly likely that Facebook users experience (varying degrees of) pressure to conform to social norms about acceptable levels of snark and kvetching—and of bragging and pollyannaisms. Someone who is already internally grumbling about how United Statesians are, like, such total posers during the World Cup may feel more free to voice that complaint on Facebook than when his feed was more densely concentrated with posts of the “human beings are so great and I feel so lucky to know you all—group hug! <3 <3 <3” variety.

It is not at all clear, then, that the experiment caused the 155,000 or so users who were in the reduced positivity arm to feel any worse than they otherwise would have, much less that any increase in negative affect that may have occurred rose to the level of a mental health crisis, as some have suggested.

Was the Facebook Study “Human Subjects Research”?

One threshold question in determining whether this study required IRB approval is whether it constituted “human subjects research.” An activity is “research” under the federal regulations if it is “a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge.” The study was plenty systematic, and it was designed to investigate the “self-promotion-envy spiral” theory of social networks. Check.

As defined by the regulations, a “human subject” is “a living individual about whom an investigator . . . obtains,” inter alia, “data through intervention.” Intervention, in turn, includes “manipulations of the subject or the subject’s environment that are performed for research purposes.” According to guidance issued by the Office for Human Research Protections (OHRP), the federal agency tasked with overseeing application of the regulations to HHS-conducted and -funded human subjects research, “orchestrating environmental events or social interactions” constitutes manipulation.

I suppose one could argue—in the tradition of choice architecture—that to say that Facebook manipulated its users’ environment is a near tautology. Unless it indiscriminately dumps every friend’s status update into a user’s feed in chronological order, there is no way for Facebook not to manipulate—via filter—users’ feeds. Of course, Facebook could do just that, but in fact it does not; it employs algorithms to filter News Feeds, which it apparently regularly tweaks in an effort to maximize user satisfaction, ideal ad placement, and so on. Given this baseline of constant manipulation, you could say that this study did not involve any incremental additional manipulation. No manipulation, no intervention. No intervention, no human subjects. No human subjects, no federal regulations requiring IRB approval.

It may be that Facebook regularly changes the algorithms that determine how a user experiences her News Feed. But that doesn’t mean that moving from one algorithm to the next doesn’t constitute a manipulation of the user’s environment. In any case, I’ll assume that this study meets the federal definition of “human subjects research” (HSR).

Was the Facebook Study Subject to Federal Research Regulations?

Contrary to the apparent beliefs of some commentators, not all HSR is subject to the federal regulations, including IRB review. By the terms of the regulations themselves, HSR is subject to IRB review only when it is conducted or funded by any of several federal departments and agencies (so-called Common Rule agencies), or when it will form the basis of an FDA marketing application. HSR conducted and funded solely by entities like Facebook is not subject to federal research regulations.

But this study was not conducted by Facebook alone; the second and third authors on the paper have appointments at the University of California, San Francisco, and Cornell, respectively. Although some commentators assume that university research is only subject to the federal regulations when that research is funded by the government, this, too, is incorrect. Any college or university that accepts any research funds from any Common Rule agency must sign a Federalwide Assurance (FWA), a boilerplate contract between the institution and OHRP in which the institution identifies the duly-formed and registered IRB that will review the funded research. The FWA invites institutions to voluntarily commit to extend the requirement of IRB review from funded projects to all HSR in which the institution is engaged, regardless of the source of funding, if any (see item 4b)—and historically, the vast majority of colleges and universities have agreed to “check the box,” as it’s called. If you are a student or a faculty member at an institution that has checked the box, then any HSR you conduct must be approved by an IRB.

As I recently had occasion to discover, Cornell has indeed checked the box (see #5 here). UCSF appears to have done so, as well, although it’s possible that it simply requires IRB review of all HSR by institutional policy, rather than FWA contract.

But these FWAs only require IRB review if the two authors’ participation in the Facebook study meant that Cornell and UCSF were “engaged” in research. When an institution is “engaged in research” turns out to be an important legal question in much collaborative research, and one the Common Rule itself doesn’t address. OHRP, however, has issued (non-binding, of course) guidance on the matter. The general rule is that an institution is engaged in research when its employee or agent obtains data about subjects through intervention or interaction, identifiable private information about subjects, or subjects’ informed consent.

According to the author contributions section of the PNAS paper, the Facebook-affiliated author “performed [the] research” and “analyzed [the] data.” The two academic authors merely helped him design the research and write the paper. They would not seem to have been involved, then, in obtaining either data or informed consent. (And even if the academic authors had gotten their hands on individualized data, so long as that data remained coded by Facebook user ID numbers that did not allow them to readily ascertain subjects’ identities, OHRP would not consider them to have been engaged in research.)

It would seem, then, that neither UCSF nor Cornell was “engaged in research” and, since Facebook was engaged in HSR but is not subject to the federal regulations, that IRB approval was not required. Whether that’s a good or a bad thing is a separate question, of course. (A previous report that the Cornell researcher had received funding from the Army Research Office has been retracted. Because that office is part of the Department of Defense, a Common Rule agency, such funding would have triggered IRB review.)
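The chain of reasoning in this section can be laid out as a small decision procedure. The Python sketch below is a deliberately simplified rendering of my argument, not a statement of the regulations or of OHRP guidance. All class, field, and function names are hypothetical, and the facts plugged in for each institution are the ones I infer above from the PNAS author contributions section.

```python
from dataclasses import dataclass


@dataclass
class Institution:
    name: str
    # True if the institution "checked the box" on its FWA, or otherwise
    # requires IRB review of all HSR by policy.
    extends_irb_review_to_all_hsr: bool
    # The three routes to being "engaged in research" described above.
    obtains_data_via_intervention_or_interaction: bool = False
    obtains_identifiable_private_information: bool = False
    obtains_informed_consent: bool = False

    def engaged_in_research(self) -> bool:
        return (
            self.obtains_data_via_intervention_or_interaction
            or self.obtains_identifiable_private_information
            or self.obtains_informed_consent
        )


def irb_review_required(
    is_hsr: bool,
    conducted_or_funded_by_common_rule_agency: bool,
    supports_fda_marketing_application: bool,
    institutions: list[Institution],
) -> bool:
    """Simplified version of the analysis in this section."""
    if not is_hsr:
        return False
    if conducted_or_funded_by_common_rule_agency or supports_fda_marketing_application:
        return True
    # Otherwise review is required only if some institution has voluntarily
    # extended IRB review to all its HSR AND is itself engaged in the research.
    return any(
        inst.extends_irb_review_to_all_hsr and inst.engaged_in_research()
        for inst in institutions
    )


# The study as I read it: Facebook ran the experiment and analyzed the data;
# the academics helped design the research and write the paper.
facebook = Institution("Facebook", False, obtains_data_via_intervention_or_interaction=True)
cornell = Institution("Cornell", True)
ucsf = Institution("UCSF", True)

required = irb_review_required(True, False, False, [facebook, cornell, ucsf])
```

On these inputs `required` comes out `False`. Flip a single fact, such as a Cornell employee obtaining identifiable private information about subjects, and the answer changes, which is why the “engaged in research” question carries so much weight in collaborative projects.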

Did an IRB Review the Facebook Experiment, and If So, How Could It Have Approved It?

Princeton psychologist Susan Fiske, who edited the PNAS article, told a Los Angeles Times reporter the following:

Then Forbes reported that Fiske “misunderstood the nature of the approval. A source familiar with the matter says the study was approved only through an internal review process at Facebook, not through a university Institutional Review Board.”

Most recently, Fiske told the Atlantic that Cornell’s IRB did indeed review the study, and approved it as having involved a “pre-existing dataset.” Given that, according to the PNAS paper, the two academic researchers collaborated with the Facebook researcher in designing the research, it strikes me as disingenuous to claim that the dataset preexisted the academic researchers’ involvement. As I suggested above, however, it does strike me as correct to conclude that, given the academic researchers’ particular contributions to the study, neither UCSF nor Cornell was engaged in research, and hence that IRB review was not required at all.

But if an IRB had reviewed it, could it have approved it, consistent with a plausible interpretation of the Common Rule? The answer, I think, is Yes, although under the federal regulations, the study ought to have required a bit more informed consent than was present here (about which more below).

Many have expressed outrage that any IRB could approve this study, and there has been speculation about the possible grounds the IRB might have given. The Atlantic suggests that the “experiment is almost certainly legal. In the company’s current terms of service, Facebook users relinquish the use of their data for ‘data analysis, testing, [and] research.’” But once a study is under an IRB’s jurisdiction, the IRB is obligated to apply the standards of informed consent set out in the federal regulations, which go well, well beyond a one-time click-through consent to unspecified “research.” Facebook’s own terms of service are simply not relevant. Not directly, anyway.

According to Prof. Fiske’s now-uncertain report of her conversation with the authors, by contrast, the local IRB approved the study “on the grounds that Facebook apparently manipulates people’s News Feeds all the time.” This fact actually is relevant to a proper application of the Common Rule to the study.

Here’s how. Section 46.116(d) of the regulations provides:

An IRB may approve a consent procedure which does not include, or which alters, some or all of the elements of informed consent set forth in this section, or waive the requirements to obtain informed consent provided the IRB finds and documents that:

The research involves no more than minimal risk to the subjects;

The waiver or alteration will not adversely affect the rights and welfare of the subjects;

The research could not practicably be carried out without the waiver or alteration; and

Whenever appropriate, the subjects will be provided with additional pertinent information after participation.

The Common Rule defines “minimal risk” to mean “that the probability and magnitude of harm or discomfort anticipated in the research are not greater in and of themselves than those ordinarily encountered in daily life . . . .” The IRB might plausibly have decided that since the subjects’ environments, like those of all Facebook users, are constantly being manipulated by Facebook, the study’s risks were no greater than what the subjects experience in daily life as regular Facebook users, and so the study posed no more than “minimal risk” to them.

That strikes me as a winning argument, unless there’s something about this manipulation of users’ News Feeds that was significantly riskier than other Facebook manipulations. It’s hard to say, since we don’t know all the ways the company adjusts its algorithms—or the effects of most of these unpublicized manipulations. We know that one News Feed tweak “directly influenced political self-expression, information seeking and real-world voting behaviour of millions of people” during the 2010 congressional elections. That tweak may have been designed to contribute to generalizable knowledge, so perhaps it shouldn’t count in the risks “ordinarily encountered in daily life” analysis. But another tweak to the Facebook interface designed to affect not only users’ word choice or even mood but their behavior—Mark Zuckerberg’s decision to give users a formal way of telling their friends that they had registered as an organ donor—was motivated by altruism after conversations with liver transplant recipient Steve Jobs, although the dramatic effects of that policy change have been studied by academics.

Even if you don’t buy that Facebook regularly manipulates users’ emotions (and recall, again, that it’s not clear that the experiment in fact did alter users’ emotions), other actors intentionally manipulate our emotions every day. Consider “fear appeals”—ads and other messages intended to shape the recipient’s behavior by making her feel a negative emotion (usually fear, but also sadness or distress). Examples include “scared straight” programs for youth warning of the dangers of alcohol, smoking, and drugs, and singer-songwriter Sarah McLachlan’s ASPCA animal cruelty donation appeal (which I cannot watch without becoming upset—YMMV—and there’s no way on earth I’m being dragged to the “emotional manipulation” that is, according to one critic, The Fault in Our Stars).

Continuing with the rest of the § 46.116(d) criteria, the IRB might also plausibly have found that participating in the study without Common Rule-type informed consent would not “adversely affect the rights and welfare of the subjects,” since Facebook has limited users’ rights by requiring them to agree that their information may be used “for internal operations, including troubleshooting, data analysis, testing, research and service improvement.”

Finally, the study couldn’t feasibly have been conducted with full Common Rule-style informed consent—which requires a statement of the purpose of the research and the specific risks that are foreseen—without biasing the entire study. Of course, surely the IRB, without biasing the study, could have required researchers to provide subjects with some information about this specific study beyond the single word “research” that appears in the general Data Use Policy, as well as the opportunity to decline to participate in this particular study, and these things should have been required on a plain reading of § 46.116(d). In other words, the study was probably eligible for “alteration” in some of the elements of informed consent otherwise required by the regulations, but not for a blanket waiver.

Moreover, subjects should have been debriefed by Facebook and the other researchers, not left to read media accounts of the study and wonder whether they were among the randomly-selected subjects studied.

Still, the bottom line is that—assuming the experiment required IRB approval at all—it was probably approvable in some form that involved much less than 100% disclosure about exactly what Facebook planned to do and why.

Two Ways of Viewing “Minimal Risk” Bootstrapping

There are (at least) two ways of thinking about this feedback loop between the risks we encounter in daily life and what counts as “minimal risk” research for purposes of the federal regulations.

One view is that this is a lamentable bootstrapping of increasingly common business practices of data mining and behavioral manipulation by marketers and others into federal risk-based regulation of human subjects research, where risk is accordingly defined down. Human beings attempt to manipulate one another all the time, of course, and have done so since time immemorial. But once upon a time, the primary sources of emotional manipulation in a person’s life were called “toxic people,” and once you figured out who those people were, you would avoid them as much as possible. Now, everyone’s trying to nudge, data mine, or manipulate you into doing or feeling or not doing or not feeling something, and they have access to you 24/7 through targeted ads, sophisticated algorithms, and so on, and the ubiquity is—on this view—being further used against us by watering down human subjects research protections.

There’s something to that lament.

But there’s something to the other view, as well. This other view is that this bootstrapping is entirely appropriate. If Facebook had acted on its own, it could have tweaked its algorithms to cause more or fewer positive posts in users’ News Feeds even without obtaining users’ click-through consent (it’s not as if Facebook promises its users that it will feed them their friends’ status updates in any particular way), and certainly without going through the IRB approval process. It’s only once someone tries to learn something about the effects of that activity and share that knowledge with the world that we throw up obstacles.

Academic researchers’ status as academics already makes it more burdensome for them to engage in exactly the same kinds of studies that corporations like Facebook can engage in at will. If, on top of that, IRBs didn’t recognize our society’s shifting expectations of privacy (and manipulation) and incorporate those evolving expectations into their minimal risk analysis, that would make academic research still harder, and would only serve to help ensure that those who are most likely to study the effects of a manipulative practice and share those results with the rest of us have reduced incentives to do so. Would we have ever known the extent to which Facebook manipulates its News Feed algorithms had Facebook not collaborated with academics incentivized to publish their findings?

We can certainly have a conversation about the appropriateness of Facebook-like manipulations, data mining, and other 21st-century practices. But so long as we allow private entities freely to engage in these practices, we ought not unduly restrain academics trying to determine their effects. Recall those fear appeals I mentioned above. As one social psychology doctoral candidate noted on Twitter, IRBs make it impossible to study the effects of appeals that carry the same intensity of fear as real-world appeals to which people are exposed routinely, and on a mass scale, with unknown consequences. That doesn’t make a lot of sense. What corporations can do at will to serve their bottom line, and non-profits can do to serve their cause, we shouldn’t make (even) harder, or impossible, for those seeking to produce generalizable knowledge to do.

[Cross-posted at The Faculty Lounge and The Bioethics Program Blog.]