Using Big Data to Find Cause and Effect

Results of a cancer research study conducted by the University of California, San Francisco (UCSF) and GNS Healthcare were presented at the American Association for Cancer Research Annual Meeting in Chicago this week. GNS is a Big Data analytics company that uses its Reverse Engineering and Forward Simulation (REFS) platform to learn cause-and-effect relationships from large-scale clinical data.

UCSF researcher Rina Gendelman, PhD, presented the results of one of UCSF's collaborations with GNS at the meeting. The study identified novel signaling networks that slow the proliferation of breast cancer cells.

The study paves the way for ongoing research by Gendelman’s team and others. It is also significant because the procedure it involved was only recently made possible by advances in Big Data technology.

UCSF gave GNS terabytes of data. GNS ran the data through its REFS platform to build models that the researchers could then simulate.

“The advantage of the GNS engine is they can actually extract causal relationships from gene expression data,” Gendelman said, contrasting these with relationships based only on correlation. “And correlation analysis is really not sufficient to build models that can be further simulated.”

GNS gave the researchers a short list of 12 predictions based on the crunched data. With this focus, Gendelman’s team could then go in and validate the results. What Gendelman found amazed her.

Eleven out of 12 predictions were correct.

“You know, as far as models go, I was happy that it wasn’t all 12 because no one would believe me,” Gendelman joked. “I’m happy to say that there is a little bit of a false positive because no model is that good.”

The UCSF and GNS collaboration shows how Big Data can focus research experiments so that researchers aren’t searching for a needle in a haystack. At the Health 2.0 Fall Conference in 2011, GNS Healthcare CEO Colin Hill talked about this kind of human-computer collaboration.

“The machines are able to sift through and evaluate large quantities of hypotheses and even generate the hypotheses,” Hill said on a discussion panel. “But it does need humans on the other end to do some interpretation and some validation of the findings.”