Interview with Bastian Greshake Tzovaras of Open Humans

Bastian Greshake Tzovaras is the Director of Research at Open Humans, a non-profit organization that provides an open participatory science ecosystem. His work is at the intersection of citizen science, the quantified self and traditional research. With Open Humans he brings together individuals around sharing and re-using their personal data to perform new kinds of research across disciplines. Before joining Open Humans, Bastian founded openSNP, a crowdsourced personal genomics data repository that puts genomes into the public domain, and completed a PhD in Bioinformatics.

Q: Patient healthcare data aggregation and analysis is seen as both the panacea for tremendous breakthroughs in precision medicine and as one of its biggest challenges. Are both true and how so?

A: The promises and challenges around personal health data aggregation are two sides of the same coin: There is definitely a lot of potential in aggregating all these data, especially as we are now collecting so much data about ourselves. In the age of personal data, we are individually collecting more data about ourselves than humanity collected in total before the digital age. But this deluge of data brings many challenges: How can we collect and aggregate these large amounts of data efficiently? And how can we make sense of this unprecedented wealth of data?

Q: What are the biggest hurdles today in getting people to share their health data?

A: One of the biggest roadblocks in getting people to share their data is the lack of access that people have to their own data for numerous reasons: So much of our health and health-related data is stored in various data silos across companies and organizations.

This is not only the case for the traditional medical establishment but also commercial players, be it wearable manufacturers or personal genome testing companies. This makes it hard for people to get any idea of what kinds of health data they even have and where! Without knowing which data there are, how could people share?

But even if people can get a picture of which health data about them exists, exporting and aggregating these data is problematic. Not only because some players in the area like to keep their data under lock and key, but also because technology shifts quickly. Data formats and ways to export and access them can change quickly. This makes it hard to create sustainable infrastructure for sharing.

Q: How can they be overcome? What is needed?

A: It takes a village to overcome these barriers. Building a sustainable health data sharing ecosystem can’t be achieved by a single organization or company; it can only be a team effort. We need the collaboration and interaction of all the players in the field. To achieve this we need to make sure that our data is FAIR. The concept of FAIR data (data that are Findable, Accessible, Interoperable and Re-usable) is often used in a general research data context, but it applies just as well to personal health data.

For example, we need a commitment of data controllers to ensure data portability. The European Union has recognized that data portability is a key issue in the digital age and enshrined this as a key element in the General Data Protection Regulation that became law this year. Such policies can go a long way in helping make data accessible and re-usable, but need to be enforced and supported by data aggregators.

Furthermore, we need data aggregators that can act as trusted parties for individuals willing to aggregate and share their data. Unlike many other research data, health data is personal and often identifiable by design (e.g. genetic data). As such the hosting and sharing of such data requires data controllers who are trusted by the people sharing their data. Such aggregators need to enable ethical data sharing practices and have to put the individuals who want to share in the driver’s seat to make informed decisions.

Last but not least we need a cultural shift, away from data sharing that is only done between researchers in academia and industry. To enable the full potential of personal health data we need to have more participatory approaches, which include the individual data owners more in the complete research cycle.

Q: What has worked? Can you provide some examples that demonstrate that patients and healthy people can successfully share their data where everyone benefits?

A: At my work with Open Humans we are trying to play our part in addressing these hurdles by implementing our ideas of how to get around them as outlined earlier in a participant-centered way. And I think we are seeing some great examples of this. Probably the most inspiring example comes from the Type-I Diabetes community and their efforts in sharing data. Through Open Humans they have organized their own data commons to share continuous glucose monitoring data not only with academic researchers, but also citizen scientists around the globe. Through this they facilitate new kinds of research, to a point where they themselves even become lead investigators with their own research grants!

Q: We have a long way to go, with clinical trials enrolling at 2-3% today and that number falling. What type and level of shift in culture, laws, collection methods, or other areas is going to be needed to accomplish widespread data sharing?

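A: I think this is basically covered by my answers to the third question, on how these hurdles can be overcome, and the sixth, on incentives for sharing.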

Q: How can participants be incentivized to share their health data and other data that researchers need to improve prevention and treatment and develop new therapies and health practices?

A: In my experience people are more willing to share data than we give them credit for. I think the problem is often less one of incentives and more one of a lack of opportunities to share data, combined with how hard it is to share these data. We all have busy lives with little time and lots of priorities. For data sharing to be successful we need to make it easy enough that it doesn’t feel like yet another chore we have to fit into an already overly full schedule. This means we need to be mindful of the design challenges that we face when we develop systems for sharing.

Q: Will there always be certain communities or populations that will not participate in research because of history or privacy issues?

A: In past efforts we have seen that only a very small subset of the population is willing to share their data (cf. https://peerj.com/preprints/27079/), with all the problems that brings for how generalizable precision medicine can be if research is more or less exclusively done on a well-educated, well-earning, largely Northern European population. But we need to be very mindful of the many reasons why people are not willing to share their data. These reasons are not only rooted in history, but also reflect our current society, in which a large fraction of the population is still vulnerable to discrimination of all sorts.

The only way to overcome these barriers that rightfully prevent people from sharing is creating a more equitable and just society. To achieve this in the context of personal health data sharing, we need to empower individuals to make informed decisions and turn them into active agents around their own data instead of treating them as passive participants.

Q: What role will personal technology play in scaling health data sharing and collection?

A: While collecting data through doctors and medical professionals will always remain useful, there is a big issue of temporal resolution. If we are lucky as individuals, we have only very infrequent interactions with the health care system, which leads to very sparse data collection for much of our lives. Personal technology can help a great deal to overcome this limitation.

Our phones and wearables are already collecting data about us nearly 24/7. These data can be useful down the road to see historical trends in our own health. Additionally, connected devices can facilitate and automate the sharing of these data, which is great.

The big question is if and how we can make the best use of these data. Our current healthcare system is not yet prepared to do so. Unless we can get all healthcare players to adapt to the realities of these new data, they will remain unused.

Q: What do you predict the landscape will look like in 10 years in terms of people sharing their health data? What are the determinants to making your vision a reality?

A: As the saying goes: Predictions are hard, especially about the future. I think there’s a broad spectrum of possible futures for how the data sharing landscape can evolve. It seems to me that most of the technical challenges of doing efficient data sharing are solvable in the long run and maybe even in the next 10 years. The social and societal frameworks around data sharing are the ones that are much harder to foresee.

I would hope that we can get to a point in which data sharing is not done exclusively between the healthcare system and commercial players. This will need a cultural norm of empowering individuals to use their data as well as laws that support this. And of course there’s the big uncertainty of who will be able to share data.

Will sharing remain for the privileged, as it is these days? Or will we be able to open up this option for more people without them having to fear negative repercussions? Especially in the US this might largely rest on the question of whether universal healthcare will be a thing here in the next 10 years.