phas3 has been exploring Web3 solutions for data privacy with LYNX – a phas3-led R&D project. The aim of LYNX is to enable privacy-preserving storage, sharing and analysis of sensitive data – with a focus on biometric data from wearable devices – to amplify user insights and catalyse science and innovation.
The idea originated from a need expressed by a previous employer: to gather insights on employee well-being without collecting and storing the raw data. LYNX came about as an answer to this problem. It also emerged as a way to empower individuals to keep ownership and control of their data while permitting specific uses of it and maintaining data privacy. Originally the solution did not use any Web3 tooling, but as we thought through the problem it became clear that Web3 tooling could prove useful for tracking data provenance and implementing incentive mechanisms.
Initially, the team carried out market and user research on version 1 of the ‘Lite Paper’ to inform the solution. The research showed that users felt a lack of ownership over their data and had concerns about data privacy and anonymity. Participants stressed that any solution should guard against future threats, and they wanted information on pricing and user experience. The research also highlighted that communication needed to be clearer around:
How Web3 tools offer advanced protections/security and AI analysis capabilities
What the benefits of data ownership are
What happens to data once it is shared
This research informed version 2 of the ‘Lite Paper’ and FAQs. Following this, LYNX partnered with Algovera to host a joint hackathon to test the tech, test the human element, and kickstart the community (learn more here).
We have also been incredibly lucky to partner with DataUnion, which has helped us think through the data journey from the perspective of a data union or MID (mediator of individual data). Their weekly DAO calls explored topics that were “bubbling”, things that were “hot” at the time, and updates from the different DAO members.
An interested and engaged community has emerged around LYNX, starting with a Telegram group that grew to over 94 members. After the AlgoLYNXathon (the joint hackathon hosted with Algovera), a dedicated Discord was established, with weekly community calls focused on three key themes: engineering, data ethics and regulation, and value proposition.
Throughout our work on LYNX, we have examined the problem in great detail. This has surfaced a range of open questions about the problem and its possible solutions. These include (but are not limited to):
The overarching problem
Users want to have more autonomy over their data
Existing data silos hinder the ability to connect data in a useful way
Data and insights from the data are difficult to share between different parties - for example, results from a sleep study conducted by company A cannot be easily shared with company B due to data privacy regulations such as GDPR
The questions posed during the working groups can be distilled into the following:
How could unlocking data from silos go wrong?
What is personal data and when does data no longer belong to you?
How can anonymised data from multiple sources be linked to an individual?
How do we enable people to retain their right to withdraw and the right to be forgotten?
How can we identify and understand all the risks involved in achieving informed consent?
How can we make the solution understandable to a general audience for achieving informed consent?
How can we ensure vulnerable users are not exploited by incentives for data sharing?
In a decentralised ecosystem, how can research studies obtain ethics board approval?
In a decentralised ecosystem, who is accountable for data breaches and ensuring they are resolved?
Ethics related to science, data and AI are culturally defined. In a decentralised ecosystem, what are the global red lines for human data?
What functions in a data union can help with data ethics governance?
How do we architect a data ecosystem that has longevity?
What good examples of Web2/Web3 hybrids already exist?
How would a decentralised solution be better, given that DLTs are currently slower than centralised databases and the data is less secure because its location(s) cannot be established with certainty?
How do we react to data breaches in Web3?
Can we ensure security when moving from on-chain to off-chain, i.e. how would data oracles work with personal data?
What are the encryption failure modes?
How would data unions or MIDs (mediators of individual data) organise themselves? Would this require some type of centralisation?
Identified major trade-offs between storing data on-chain and off-chain:
Biometric data from wearable devices is extremely sensitive (for example, a woman’s body temperature could indicate pregnancy or abortion, which could lead to murder charges in some jurisdictions). It is therefore imperative that any new solution is more secure and ethical than current solutions.
As a consequence of the above unknowns and concerns, we have transitioned LYNX from an R&D project to an interest group. To get involved in the LYNX biometric data interest group, please join the LYNX Discord and introduce yourself.
We have identified a need in the Web3 ecosystem for global ethical guidelines for human data – from data collection (i.e. informed consent), through the data pipeline, to analysis (e.g. using decentralised AI protocols). To help achieve this, we have created a Web3 human data ethics community. We encourage anyone who is interested in this topic to please join (Telegram link).