Data Trusts Forming As Alternate Model of Protecting Data Privacy
By John P. Desmond, AI Trends Editor
To meet the challenge of providing the vast amount of data required for AI applications, made more challenging by regulation and privacy issues, innovative firms are turning toward “data trusts” or “data cooperatives.”
A data trust is a structure in which data is placed under the control of a board of trustees, with a responsibility to look after the interests of the beneficiaries, to give them a greater say in how the data is collected, accessed, and used by others.
“They involve one party authorizing another to make decisions about data on their behalf, for the benefit of a wider group of stakeholders,” states the blog of the Open Data Institute, a non-profit founded in 2012 by Tim Berners-Lee and Nigel Shadbolt, to encourage people to innovate with data. “Data trusts are a fairly new concept and a global community-of-practice is still growing around them,” the blog states, citing several examples.
Reasons to share data include fraud detection in financial services, greater speed and visibility across supply chains, and combining genetics, insurance, and patient data to develop new digital health solutions, according to a recent account in Harvard Business Review. The account cited research showing that 66% of companies are willing to share data, including personal customer data. However, strict regulatory oversight applies to certain private data, and violations risk significant financial and reputational costs.
The author of the HBR article, George Zarkadakis, recently piloted a data trust with several clients of his firm, Willis Towers Watson, a provider of consulting and technology services for insurance companies. Zarkadakis is the digital lead at Willis Towers Watson, a senior fellow at the Atlantic Council, and the author of several books.
A data trust that adopts leading-edge technologies such as federated machine learning, homomorphic encryption (which allows calculations to be done on data without decrypting it), and distributed ledger technology can guarantee transparency in data sharing and an audit trail of who is using the data at any time and for any purpose. “Thus removing the considerable legal and technological friction that currently exists in data sharing,” Zarkadakis stated.
The objectives of the Towers Watson data trust pilot were to: identify a business case; form a successful “minimal viable consortia” (MVC), in which data providers and consumers agree to share resources and talent to focus on a specific business case; agree on a legal and ethical governance framework to enable data sharing; and understand what technologies were needed to promote transparency and trust in the MVC.
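To illustrate the audit-trail idea, here is a minimal sketch of a hash-chained access log, the tamper-evidence mechanism that underlies distributed ledgers. It is not the pilot's implementation: the function names, record fields, and sample users are hypothetical, and a real ledger would add replication and consensus among the parties.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def entry_hash(body: dict) -> str:
    """Hash a record body deterministically (keys sorted before encoding)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_access(log: list, user: str, dataset: str, purpose: str) -> dict:
    """Append a record of who used which data and why, linked to the previous record."""
    body = {
        "index": len(log),
        "user": user,
        "dataset": dataset,
        "purpose": purpose,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record = dict(body, hash=entry_hash(body))
    log.append(record)
    return record


def verify(log: list) -> bool:
    """Return True only if no record has been altered, removed, or reordered."""
    prev_hash = GENESIS
    for i, record in enumerate(log):
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["index"] != i or record["prev_hash"] != prev_hash:
            return False
        if entry_hash(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical usage: two accesses are logged, then one is edited after the fact.
log = []
append_access(log, "analyst_a", "claims_2020", "fraud model training")
append_access(log, "analyst_b", "claims_2020", "pricing study")
print(verify(log))
log[0]["purpose"] = "marketing"
print(verify(log))
```

Because each record stores the hash of the one before it, changing any earlier entry invalidates the chain, which is what lets members of a trust check the log rather than take it on faith.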
Lessons learned included:
Develop an ethical and legal framework for data sharing.
The team found it was important to set this foundation at the start. They worked to ensure compliance with the European Union’s General Data Protection Regulation (GDPR), which spells out a range of privacy protections. For the MVC to go beyond pilot to a commercial stage, it would need to be audited by an independent “ethics council” that would explore the ethical and other implications of the use of data and related AI algorithms.
Employ a federated/distributed architecture.
In a federated approach, data remains where it is and algorithms are distributed to the data, helping to allay fears about transferring sensitive data to an external environment. The team explored privacy-preserving technologies including differential privacy (which describes patterns in a dataset while withholding information about individuals) and homomorphic encryption. The team also explored distributed ledger technology, including blockchain, as part of the technology stack.
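The federated idea can be sketched with federated averaging, one common way of sending the algorithm to the data. The example below is an illustration only and not the pilot's code: the two parties, their data, and all function names are hypothetical, and the model is a simple one-variable linear fit.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """Run gradient descent on one party's private data.

    Only the updated weights are returned; the raw records never leave the party.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in data) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Combine the parties' weights, weighting each by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


def train(parties, rounds=50):
    """Coordinator loop: send out the current weights, collect and average updates."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in parties]
        weights = federated_average(updates, [len(d) for d in parties])
    return weights


# Two hypothetical data holders whose records both follow y = 2x + 1.
party_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
party_b = [(0.5, 2.0), (1.5, 4.0)]
w, b = train([party_a, party_b])
print(round(w, 3), round(b, 3))
```

The coordinator sees only model weights, never the underlying records. Weights can still leak information about the data, which is one reason federated learning is often paired with techniques such as differential privacy or homomorphic encryption.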
“We architected the data trust as a cloud-native peer-to-peer application that would achieve data interoperability, share computational resources, and provide data scientists with a common workspace to train and test AI algorithms,” stated Zarkadakis.
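Differential privacy, one of the privacy-preserving techniques the team explored, can be illustrated with the Laplace mechanism applied to a counting query. This is a generic sketch rather than anything taken from the pilot: the records, the predicate, and the epsilon value are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match the predicate.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon. A smaller epsilon means more noise and
    stronger privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical patient records: (age, has_condition)
records = [(34, True), (51, False), (47, True), (29, False), (62, True)]
rng = random.Random()
noisy = private_count(records, lambda r: r[1], epsilon=0.5, rng=rng)
print(noisy)
```

An analyst receives a count that is accurate on average across many queries of this kind, yet no single answer reveals whether a particular individual is in the dataset.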
Savvy Cooperative Aims to Compensate for Use of Medical Data
One entrepreneur saw an opportunity to set up a data trust around personal medical information, one that aims to have companies pay participants for the use of their data. Jen Horonjeff, founder and CEO of the Savvy Cooperative, uses puppets in a video posted on the company’s website to explain the model. The company uses surveys, interviews, and focus groups to gather data, which is made available to healthcare companies and other providers.
Savvy raised an undisclosed amount of funding from Indie.vc last year, according to an account in TechCrunch. “The financing will allow us to expand our offerings, support more companies and in turn, improve the lives of countless more patients,” stated Horonjeff.
Indie.vc takes a non-traditional approach to venture capital and is geared towards startups. “Savvy represents everything we’d like to see in the future of impact business—shared ownership, diverse perspectives and aligned incentives—tackling one of the largest industries on the planet,” stated Indie.vc founder Bryce Roberts.
At the other end of the spectrum of data trust examples, Facebook in 2018 announced plans for an Oversight Board, with the promise to “uphold the principle of giving people a voice while also recognizing the reality of keeping people safe,” according to a recent account in Slate.
The board was formed six months later as a body of 20 experts from all over the world and a variety of fields, including journalists and judges. Early critics worried it would be nothing more than a PR stunt. Out of more than 150,000 cases submitted, six were chosen last December. They represented issues around content moderation, censorship of hate speech and Covid-19 misinformation. The board’s first five decisions were announced in late January.
The cases were debated by five-member panels, each including a representative from the place where the post in question was written. Panels sometimes requested public comments and integrated them into their decisions. A majority of the board had to agree before a decision was finalized.
“The real decisions about what people can say and how they can say it in our world are no longer based on Supreme Court decisions,” but are made by companies like Facebook, stated Michael McConnell, a former federal judge who is now director of the Constitutional Law Center at Stanford Law School and a member of the Facebook board. The board tries to uphold freedom of expression while acknowledging the tension with the “harm that can take place as a result of social media activity,” McConnell stated.
Read the source articles on the blog of the Open Data Institute, in the Harvard Business Review, in TechCrunch and in Slate.