
Privacy Regulation in the Age of Machine Learning

By Adrian Gropper

Will the machines use our own personal information against us? The answer depends on privacy regulations that are yet to be written.

I know that the current approach to privacy regulation, be it general as in GDPR or sectoral as in HIPAA, is not readily extensible to a world where the principal value of personal data is machine learning. It’s easier to follow my logic if you agree that technology costs are already low compared to the value of personal data. Buying your own AI is increasingly sensible. Then, who will teach your personal AI in school, at work, in your community? You and your doctor will both have personal AI. What is the role of intellectual property when the cost of personal data dominates the cost of your AI? How do you use your AI to license the use of your personal data by others? Standards will be essential to maximize the market for your personal data, whether it’s sold or donated for the public good. These personal data standards are less-than-welcome in a process dominated by enterprises. Nonetheless, thoughtful privacy regulation designed around machine learning will keep humans responsible for the machines.

Data Aggregation

Snippets of personal data are generated by pretty much every activity and every interaction we have with a service provider. Each individual snippet is not worth much to anyone but when the snippets are aggregated across multiple service providers and over time, the value of the personal data grows exponentially. For example, Roche recently paid $2,000 per patient record in a $2 billion deal with a startup that aggregated the personal data of a million cancer patients. Neither the patients nor the doctors that treated them got a penny for training that machine. The investors in that startup, Flatiron Health, would maintain that their $300 million investment was risky and essential over the five or so years it took to assemble their intellectual property. We are left to speculate how much the next million cancer patients will have to spend to buy health records software equivalent to Flatiron Health, but we should also consider that Roche will likely be charging those future patients $20 billion in order to justify their $2B investment in Flatiron Health. Privatization of medical knowledge under the guise of AI comes at a huge cost to society.

Teachers for Your Personal AI

Patients and doctors teach each other directly. A hospital or school may be involved but, before AI, these intermediaries did not take an ownership interest in medicine. The recent scandal at Memorial Sloan Kettering (MSK), which sold patient records for a large equity stake in a private AI business, Paige.AI, illustrates today’s reality. Neither the hospital nor the AI enterprise had an interest in sharing this boondoggle with the patients or their treating physicians. Looking ahead to a time when patients and treating physicians have their own AI, why would they, or should they, share the data with MSK or Paige.AI at all? The investments in machine learning technology for personal AI will likely be different from those in Flatiron Health or Paige.AI. Our privacy regulations must recognize this and avoid stacking the deck against investors in personal AI. Personal AI continues our tradition of open medicine practiced by learned intermediaries and should be protected.

Learned Intermediaries

As learned intermediaries, doctors play an important role in protecting manufacturers of pharmaceuticals, medical devices, and other therapies from liability as long as the manufacturers provide all of the necessary information. In order for the doctor to take responsibility for their AI, be it hospital or personally owned, the doctor should have all necessary information about that AI. The potential for hidden intentional or unintentional bias in AI and the importance of the data sets used to implement algorithms and train AI are the subject of much discussion these days and considered in current privacy regulation including GDPR. The use of open source software goes a long way to keeping the learned intermediary informed but it also changes the methods used to finance algorithm development. Open access to the patient records used as training data also contributes to informing the doctor. In this respect, the design of the machine learning algorithms will be quite different when training data is managed by the patients and doctors than when the data is managed by hospital intermediaries like MSK. Privacy regulations must consider this difference and allow for safe and effective personal AI.

Algorithms as Secret Medicine

As machines take over diagnostic tasks that were traditionally done by doctors, patients will have new choices for clinical advice. AI will be offered directly to patients, directly to doctors, as well as indirectly through hospital systems. As such, AI will both collaborate and compete with the doctor for the role of learned intermediary. Secret algorithms benefit for-profit investment in AI but they also endanger the role of the physician as a learned intermediary. As the MSK scandal and others show, it’s been very hard for medical schools and medical research in hospitals to resist the lure of for-profit intellectual property. As these scandals become more frequent and the cost of AI technology continues to shrink relative to its value to society, society will rebalance the financing of open medical knowledge as a public good.

From Consent to Capabilities

Consent is too much of an ethical construct to be practical as applied to privacy engineering. Algorithmic and autonomous systems that control the uses of our personal data are delegated that authority by us. When these digital agents or trustees are owned by the person (personal AI), consent loses meaning in favor of delegation. As individuals, we delegate limited capabilities to our trustees and they, in turn, delegate more limited capabilities to entities that seek to use our personal data. A human right to personal AI means that consent is replaced by two or more stages of delegation, each stage allowed to exercise only a subset of the capabilities that it was granted by the stage before. In an age of ubiquitous network connectivity and almost cost-free storage, our privacy regulations must recognize that the human right to delegation includes machines and that the services that hold and use our personal data must be standards-based for the benefit of our machine agents as a matter of accessibility.
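The staged delegation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference to any existing standard or library: the `Grant` class, the capability names (`read_labs`, `read_notes`, `share_for_research`), and the party names are all hypothetical. The one rule it tries to capture is that each stage may pass on only a subset of what it received.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Grant:
    """Hypothetical record of the capabilities one party holds."""
    holder: str
    capabilities: FrozenSet[str]
    parent: Optional["Grant"] = None

    def delegate(self, holder: str, capabilities: set) -> "Grant":
        """Pass a subset of this grant's capabilities to another party."""
        requested = frozenset(capabilities)
        # A stage can never hand out more than it was given.
        if not requested <= self.capabilities:
            extra = sorted(requested - self.capabilities)
            raise PermissionError(f"{self.holder} cannot delegate {extra}")
        return Grant(holder, requested, parent=self)

    def allows(self, capability: str) -> bool:
        return capability in self.capabilities

    def chain(self) -> List[str]:
        """Holders from the data subject down to this grant."""
        names, node = [], self
        while node is not None:
            names.append(node.holder)
            node = node.parent
        return list(reversed(names))


# Stage 0: the person holds every capability over their own data.
person = Grant("patient", frozenset({"read_labs", "read_notes", "share_for_research"}))
# Stage 1: the person delegates a limited set to their personal AI.
personal_ai = person.delegate("personal_ai", {"read_labs", "share_for_research"})
# Stage 2: the personal AI delegates a narrower set to an outside requester.
researcher = personal_ai.delegate("research_lab", {"read_labs"})
```

In this sketch the research lab can read lab results but not clinical notes, and the personal AI cannot hand out `read_notes` because it never received that capability. The `chain()` method keeps the path back to the person, which is one simple way to keep a human accountable for what the machines do.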

Standards for Human Agency

Handicap accessibility is regulated to ensure that our infrastructure, be it public or private, is accessible to a person using standard wheelchair technology. The same ethical principles of accessibility apply to a person using personal AI as an autonomous agent in decisions about their personal data. It is not sufficient for privacy regulations such as GDPR Article 15 to specify a right of access by the data subject. Privacy regulation must make it practical, safe, and cost-effective for a data subject to delegate reuse of their personal data by others under the control of their personal AI. Standards for access and reuse of personal data held by others already exist but their adoption will be very slow without regulation.

Transparency and Accountability

Current work on regulating algorithms that impact humans does not consider personal AI. Work on institutional AI is important but it is also highly subjective and difficult to translate into effective privacy regulation because of a lack of transparency and individual accountability. Personal AI, on the other hand, tends toward transparency (as in open source and open training data), and accountability is clear because only one person is responsible. AI that integrates deeply with a human and enhances the capabilities of the learned intermediary can vastly accelerate the kind of progress we seek from Big Data and ubiquitous networking, without compromising privacy. It is not a panacea because selection bias and decentralization can make certain use-cases more difficult, but it is also possible that on balance a personal AI perspective is the best way to protect human dignity as well as the interests of society. Some of this work is already underway.

Each of us, patient, doctor, citizen, or lawyer, must be able to own our personal AI. A new generation of privacy regulations must start with this premise and deal with surveillance capitalism in the 21st Century the way we dealt with slavery in the 19th. Algorithms built from personal data are no more the property of corporations than are goods produced by slave labor the property of corporations. Thoughtful privacy regulation designed around machine learning will lead to more sharing and more benefits from personal data by keeping humans responsible for our machines.

Adrian Gropper

Adrian Gropper, MD, is the CTO of Patient Privacy Rights, a national organization representing 10.3 million patients and among the foremost open data advocates in the country.
