Is Google's Genome Project A Diagnostic Delusion?
By Chuck Seegert, Ph.D.
Google Genomics, a cloud-based technology designed to archive individual human genomes, could hold promise for future diagnostics and medical breakthroughs, especially as computing costs continue to decrease. Significant technological and regulatory hurdles, however, stand in the way of this new era of diagnostics.
The DNA in every cell of the living body has long been thought of as the instruction manual for life. Encoded in its double-helical string of molecules, this manual influences every facet of our biology from the moment we are conceived. From a healthcare perspective, the wealth of knowledge stored in DNA has driven a desire to understand more, which led to the decoding of the first human genome. The Human Genome Project was a monumental task that took more than a decade and cost several billion dollars.
To advance this area, Google has been developing a cloud-based platform called Google Genomics. The vision behind this ambitious endeavor is to archive genomic data and provide a virtual experimental space where researchers can share information and run experiments efficiently, according to a recent story from MIT Technology Review. The ability to compare genomes in large numbers promises to drive future medical discoveries and improve diagnostics.
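To make that idea concrete, here is a minimal sketch of what querying such a shared archive might look like, assuming a GA4GH-style variants/search endpoint of the kind the platform is modeled on. The URL, variant-set ID, and API key below are illustrative placeholders, not confirmed details of Google's service:

```python
# Illustrative sketch of querying a shared genomic archive through a
# GA4GH-style REST endpoint. The URL, variant-set ID, and API key are
# hypothetical placeholders, not confirmed details of Google's service.
import requests

API_URL = "https://genomics.example.com/v1/variants/search"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"                                     # hypothetical credential

query = {
    "variantSetIds": ["example-variant-set"],  # which shared dataset to search
    "referenceName": "17",                     # chromosome 17
    "start": 41196312,                         # BRCA1 gene region (GRCh37 coordinates)
    "end": 41277500,
    "pageSize": 100,
}

response = requests.post(API_URL, json=query, params={"key": API_KEY})
response.raise_for_status()

# Print each variant found in the queried region of the shared archive.
for variant in response.json().get("variants", []):
    print(variant["referenceName"], variant["start"], variant.get("names", []))
```

The appeal of such a shared interface is that a researcher can ask the same question of many genomes at once without ever downloading the underlying data.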
“Our bird’s eye view is that if I were to get lung cancer in the future, doctors are going to sequence my genome and my tumor’s genome, and then query them against a database of 50 million other genomes,” said Deniz Kural, CEO of Seven Bridges, a company that stores genome data on behalf of 1,600 researchers in Amazon’s cloud, according to the MIT story. “The result will be ‘Hey, here’s the drug that will work best for you.’”
While the potential for personalized medicine is huge, the challenges involved are also significant. For example, each genome contains about 6 billion nucleotides, the building blocks of the DNA chain. Storing this information takes about 100 gigabytes, according to a recent story from RT.com. On an individual basis, this isn't too daunting a task, but consider archiving a major city like Moscow: the genomes of its roughly 12 million inhabitants alone would fill about 1.2 million one-terabyte hard drives.
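The arithmetic behind that figure is easy to check; a quick sketch, assuming a round population of 12 million for Moscow:

```python
# Back-of-the-envelope storage estimate for archiving a whole city's genomes.
GENOME_SIZE_GB = 100            # approximate raw storage per genome, per the article
MOSCOW_POPULATION = 12_000_000  # rough population of Moscow (assumption)

total_gb = GENOME_SIZE_GB * MOSCOW_POPULATION
total_tb = total_gb / 1_000     # decimal units: 1 TB = 1,000 GB

print(f"Total: {total_gb:,} GB, i.e. {total_tb:,.0f} one-terabyte drives")
# Output: Total: 1,200,000,000 GB, i.e. 1,200,000 one-terabyte drives
```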
Currently, Google’s search index is 100 petabytes, or about 100,000 terabytes. According to RT.com, that dataset takes about 0.25 seconds to search. Google proposes to apply this existing search technology to genomic data, despite the far greater size and complexity of the material being studied.
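Combining the figures cited here with the 50-million-genome database envisioned above gives a rough sense of scale (a crude comparison that ignores compression and indexing overhead):

```python
# Rough scale comparison using the figures cited in this article.
GENOME_SIZE_GB = 100         # per-genome storage estimate
GENOME_COUNT = 50_000_000    # the 50-million-genome database envisioned above
SEARCH_INDEX_PB = 100        # Google's current search index, per the article

genomes_pb = GENOME_SIZE_GB * GENOME_COUNT / 1_000_000  # GB -> PB (decimal)
ratio = genomes_pb / SEARCH_INDEX_PB

print(f"50M genomes: {genomes_pb:,.0f} PB, about {ratio:.0f}x the search index")
# Output: 50M genomes: 5,000 PB, about 50x the search index
```

Even by this crude measure, the envisioned genomic archive would be roughly fifty times larger than the index behind Google's quarter-second searches.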
From a regulatory standpoint, government-funded research from the National Institutes of Health was recently brought under a new set of guidelines designed to protect patient privacy, according to a story published on Med Device Online. Even if the technological difficulties can be overcome, how Google handles patient privacy in the genomics project may prove an equally interesting challenge.