Questions for UCSC Genomics Institute Director David Haussler on “Data Solutions in Clinical Genomics”

UCSC Genomics Institute director David Haussler will chair a session on “Data Solutions in Clinical Genomics” at the upcoming Personalized Medicine World Conference (PMWC) 2016 Silicon Valley, co-hosted with Stanford Health Care and UCSF on January 24-27. Professor Haussler shared some of his thoughts with the PMWC team about his work and the dynamic field of genomics.

PMWC: What are some of the biggest challenges when it comes to working with genomics data in the clinical setting and what are suggested solutions to address these challenges?

DH: As we start sequencing genomes, individual companies, institutions, and laboratories find genetic variants that they have rarely seen before and are not sure how to interpret: they could have important clinical implications, or they could be totally harmless. Because there is significant variation in the human genome from one individual to the next, and because we have not yet sequenced enough genomes to be able to assign risk to the majority of variants, a significant fraction of the cases seen by typical institutions will involve so-called “variants of uncertain significance.” Many of these are very similar to, but not exactly like, those that are known to have health consequences, so clinicians often take conservative preventative measures to be safe. This can lead, in some cases, to unnecessary medical procedures and responses and cause undue anguish. Thus, a fully genetically informed clinical decision process is profoundly important for the patient.

PMWC: Why are large-scale genomics efforts needed? How are you and your team involved?

DH: The only way to deal with variants of uncertain significance is to create a large-scale genomic knowledge network that includes all the genetic variants observed worldwide, and their associated health observations. This requires a global network of shared data. The vast majority of human genetic variants are individually rare but collectively common, so this effort will have a health impact on almost everybody. My team is helping to build the tools and approaches necessary to share data responsibly and effectively.

PMWC: There are numerous large-scale genomics efforts under way or in the process of getting started.

  1. Will the size, in terms of number of samples, of these sequencing projects be sufficient to fully understand most rare diseases?
  2. If not sufficient, how can this be addressed better – algorithmically, via data sharing platforms, or other?

DH: It is wonderful that several countries are launching very large case cohort studies, including the Precision Medicine Initiative in the US and the 100,000 Genomes Project in England. These will provide consistent and controlled data sets for research that are larger than any available before. However, we must also move very aggressively to begin sharing genetic data from routine clinical practice, which will very soon be generating orders of magnitude more data than all the controlled cohort studies combined. The tyranny of numbers in statistics points to this as the information source of greatest potential power. But to access that potential and turn routine clinical practice into revolutionary learning for health and medicine, we need to create widely used data access platforms that bring all the knowledge together.

PMWC: What are the promises and challenges of sharing clinical and research sequencing data?

DH: We will ultimately attain a whole new level of understanding of how our individual detailed genetic variation influences our health and learn to exploit this knowledge, either by adjusting our diet, behavior and lifestyle or through direct medical intervention. We’ll be able to monitor and control diseases of aging such as cancer, diabetes and Alzheimer’s at their molecular basis. We’ll gain a detailed understanding of the current state of our immune system, what it needs to do to make us healthier, and how we can help it along to work for us.

The most difficult challenges to the controlled sharing of genetic data are social challenges, e.g. privacy issues and competition between medical institutions, but there are also significant technical challenges, and there is much interplay between the two types of challenge.

PMWC: You are a co-founder and co-chair of the Data Working Group of the Global Alliance for Genomics and Health. Can you tell us more about the initiative and efforts under way and why this working group is needed?

DH: The Global Alliance for Genomics and Health (GA4GH) is an international network of researchers, clinicians, companies, and patient and disease advocates who are working together to create the harmonized tools and approaches necessary for responsible and effective data sharing. The Data Working Group of the GA4GH is addressing the technical challenges of accessing data distributed globally. We are the geeks who have come together from hundreds of companies, nonprofits, and universities to get the technical data-sharing job done. Our sister Working Groups (Regulatory and Ethics, Security, and Clinical) are handling the complementary issues that put our work into context.

PMWC: What is the ideal data storage, analysis, and sharing platform for genomics data?

DH: We are working on that! It is not an easy question. One thing we have learned is that global social realities dictate that there will not be a single centralized database of all personal genome data. We simply cannot create enough global trust in any single government or privately funded institution for that. However, the Internet itself is an example of a very successful decentralized platform that became sufficiently trusted to attain global use, and that forms the basis for many government and privately funded “apps” that make beneficial use of all kinds of data, including personal data. The GA4GH Data Working Group is essentially trying to extend the Internet a bit further so that it includes a little more specialized infrastructure that understands and responds to the need for accessing global genetic data, while paying careful attention to very important privacy issues.

PMWC: What can other organizations or the commercial sector working on delivering these types of platforms learn from organizations like the UC Santa Cruz Genomics Institute?

DH: One thing that is evident is that it is valuable to have a neutral third party — an entity that does not itself have a competing medical institute, a corporate board of directors with an overriding profit motive, or a political mandate — to convene disparate groups around global standards and shared infrastructure. This is what the UC Santa Cruz Genomics Institute does. We began by posting the first draft of the human genome sequence on the Internet in July of 2000 in collaboration with the International Human Genome Sequencing Consortium, followed through with the UCSC Genome Browser and the CGHub Cancer Genomics database, and are now devoting considerable energy to the GA4GH. We would like to reach out to and encourage everyone who wants to make an impact on human health to join us and contribute to a growing worldwide data sharing movement.

Interested in hearing more? Join Dr. Haussler, 200 speakers, and 1,100 attendees at PMWC 2016 Silicon Valley. The full program and registration details can be found at: http://2016sv.pmwcintl.com/