The complexity of data-protection laws and the differences among them are making it difficult for researchers to use massive volumes of individual-level data, so-called big data, to improve public health and clinical outcomes, according to Cason Schmit, JD, an assistant professor at the Texas A&M University School of Public Health in College Station, Texas, where he is director of the Program in Health Law and Policy. If public health and health care in general in the US are to benefit optimally from big data, the country needs to abandon its current approach to protecting data.
“A comprehensive data protection law that permits data to be used for public health and research is needed to truly understand the impact of social determinants of health because these data are scattered and protected by different laws with different standards,” Schmit said.
Without a comprehensive data protection law, application of big data to medical research and public health will face substantial barriers, according to Schmit. “HIPAA is often unfairly targeted as a barrier to science and public health. In fact, HIPAA is among the few federal data protection laws with robust provisions allowing disclosures of identifiable information for both public health and research purposes,” Schmit said.
Overly Protective Policies
Impediments to big data use include adoption of highly conservative policies that restrict otherwise completely legal data uses. Organizations might adopt such policies because they do not fully understand HIPAA provisions or wish to simplify compliance with a complex law with overly broad restrictions. Another reason may be overprotection of commercially valuable and personally sensitive information. “While some organizations might publicly justify these protections as serving their patients, our research shows that the public is comfortable with their data being used to promote social good, like research and public health,” Schmit said.
Recently, Schmit led a study that surveyed 504 nationally representative participants who were asked how comfortable they were with different big data use scenarios. The results of the study, published in the Journal of Medical Internet Research, suggested that the public strongly prefers that big data be used for public health and research purposes over profit-driven, marketing, or crime-detection activities.
What the Public Prefers
The existing patchwork of US data protection laws does not reflect public preference for individual data use. The survey results suggest that the public is more concerned about who is using the data and for what purposes and that individuals have a strong preference for data uses that promote the common good as opposed to commercial interests, Schmit said.
HIPAA is widely criticized for its incredibly broad and vague definition of “identifiable” information, according to Schmit. As a result, many organizations rely heavily on the HIPAA safe harbor rule for legal de-identification, which lists 18 types of identifiers. Some organizations insist on removing all of these identifiers before releasing information for research or public health, which is not required by law and can dramatically reduce the usefulness of the information.
Michael Greenberger, JD, founder and director of the University of Maryland Center for Health and Homeland Security in Baltimore, said big data should flow freely, but policies to enable this have to be developed thoughtfully, with consideration of all interested parties. “My view is that all interests can be accommodated,” Greenberger said. “You can scrub the person’s identity. I think big data is needed for research. I think it is a helpful part of our developing advancements in science and for medical researchers.”
Big Data’s Promise
Stefano Piotto, PhD, an associate professor in the department of pharmacy at the University of Salerno in Italy, said big data can revolutionize health policies by improving the health and safety of citizens and reducing the costs of national health systems. “There is plenty of research that has shown this, and we, too, have been able to verify how behavioral data, the frequency and type of contacts between people, can be used to anticipate outbreaks of a virus,” Dr Piotto said. “If properly collected and used under the control of international health organizations, big data could become the first in silico medicine of our time. It could become the first line of intervention for future pandemics.” In silico refers to methods or predictions that use computational approaches.
Shelley Tworoger, PhD, Associate Center Director of Population Science at the Moffitt Cancer Center, Tampa, Florida, said big data has been changing cancer care, especially during the pandemic. The opportunity to better understand the development and treatment of cancer has increased exponentially with the advent of big data. For example, using machine learning and artificial intelligence methods to predict which individuals have a cancer recurrence or a poor response to treatment has high potential to improve targeted treatment and earlier interventions if treatment has failed. “We also have a unique opportunity to leverage new electronic tools, like cellphone apps, to provide information and interventions to patients to help improve their outcomes and treat potential side effects of treatment,” Dr Tworoger said.
The appearance of the Delta variant of SARS-CoV-2 has ushered in an urgent need for a rapid response. By helping to track rates of infection, hospitalization, and deaths, public health surveillance could be among the most important ways big data can help combat the COVID-19 pandemic. “This can inform public health recommendations for prevention,” Dr Tworoger said. “There is a great opportunity for more research to be conducted on the impact of the pandemic on cancer patients. For example, we are currently conducting a study where we are asking cancer patients about their experiences with COVID-19 exposure, infection, and how they are trying to prevent getting exposed.”