Online Terms of Use for Genealogy Websites – What’s in the Fine Print?

By Jorge L. Contreras

Since genealogy websites first went online, researchers have been using the data that they contain in large-scale epidemiological and population health studies. In many cases, data is collected using automated tools and analyzed using sophisticated algorithms.

These techniques have supported a growing number of discoveries and scientific papers. For example, researchers have used this data to identify genetic markers for Alzheimer’s Disease, to trace an inherited cancer syndrome back to a single German couple born in the 1700s, and to gain a better understanding of longevity and family dispersion.  In the last of these studies, researchers analyzed family trees from 86 million individual genealogy website profiles.

Despite the scientific value of publicly available genealogy website information, and its free accessibility via the Internet, this data cannot always be used for research without the permission of the site operator or the individual data subjects.

In fact, the online terms of use (TOU) for genealogy websites may restrict or prohibit the types of uses for data found on those sites.

Though online TOUs can seem to be nothing more than routine annoyances, they have been found to be binding contracts by more than a few courts in the U.S. and elsewhere.  Violating such contractual terms could give rise to various legal remedies, including monetary damages and orders to cease using data obtained without permission.

In order to understand the degree to which website TOUs limit research use of public genealogy data, a group of collaborators and I analyzed the TOUs of seventeen leading genealogy websites.

Of the seventeen websites, thirteen had terms that effectively prohibited the use of data for scientific research purposes, whether by limiting use to genealogical purposes only, prohibiting “commercial” uses, or prohibiting technically necessary steps such as downloading or automatically scraping, crawling or harvesting data.

These restrictive terms of use came about largely in response to questions of individual privacy.

Genealogy, the study of our ancestry and family histories, has become one of America’s favorite pastimes.  There are now thousands of websites that provide tips and information to amateur genealogists, link to public records, offer forums for conversation, and allow users to upload photos, family trees, and other information (many of these are cataloged at Cyndi’s List).

In recent years, sites like GEDmatch, AncestryDNA and FamilyTreeDNA have begun to allow the uploading and sharing of genetic information (just data, no actual biospecimens).  This information allows users to match DNA data to locate and verify long-lost relatives and, in some cases, siblings, parents and offspring. When someone who was allegedly located in Germany contacted me claiming that she was my father’s hitherto unknown half-sister, the first thing I did was suggest that she submit a DNA sample to Ancestry so that we could see whether we were DNA matches (we were!).

But online genealogy sites are not just about family reunions and finding out whether your ancestors really came over on the Mayflower.

In 2018, investigators revealed that they had identified and arrested the infamous “Golden State Killer” by comparing crime scene DNA to the data contained in the public GEDmatch database.  Numerous other “cold cases” have been solved in a similar manner.  While these arrests have generally been applauded, they also raise questions of individual privacy, particularly with respect to data accessible from public genealogy websites.

Subsequently, GEDmatch and FamilyTreeDNA reportedly amended their online terms of use to prohibit the unauthorized use of their data for law enforcement purposes, and Ancestry has publicly announced that it will not voluntarily make consumer-uploaded data available to law enforcement authorities or other third parties.

These restrictions, however, might inadvertently expose scientific researchers to legal risks. Per our findings, scientific research is now effectively prohibited by the majority of these websites’ terms of use.

While we are not aware of any lawsuits that have been brought against scientific researchers using public genealogy data without permission, such researchers would be well advised to proceed with caution.

Based on our findings, we recommend that genealogy site operators consider granting researchers permission to use publicly-available data for legitimate scientific research purposes, even if they wish to prohibit more controversial data uses such as law enforcement, surveillance, racial profiling, insurance underwriting, and direct marketing.

Jorge L. Contreras is a Presidential Scholar and Professor of Law at the University of Utah with an adjunct appointment in the Department of Human Genetics. His research focuses on intellectual property, technical standards and science policy, and he is one of the co-founders of the Open COVID Pledge, a framework for contributing intellectual property to the COVID-19 response. Professor Contreras is the editor of six books and the author of more than 100 scholarly articles and chapters appearing in scientific, legal and policy journals including Science, Nature, Georgetown Law Journal, NYU Law Review, Iowa Law Review, Harvard Journal of Law and Technology and Antitrust Law Journal. He has served as a member of the National Institutes of Health (NIH) Council of Councils, the Advisory Councils of the National Human Genome Research Institute (NHGRI) and the National Center for Advancing Translational Sciences (NCATS), and as the Co-Chair of the National Conference of Lawyers and Scientists. He is a graduate of Harvard Law School (JD) and Rice University (BSEE, BA).
