Training and Retaining Home-grown Data Scientists
The Data Science Fellows program is developing a Health Sciences workforce to meet the growing data needs of researchers.
Jonathan Lifshitz, PhD, is renowned for his research into traumatic brain injuries. He leads a neurotrauma and social impact research team that advances the national conversation on intimate partner violence-induced traumatic brain injuries and related topics. The success of the team’s research projects, and the data obtained for those studies, has resulted in an unexpected problem.
“I have a couple of projects that have been mothballed,” admitted Dr. Lifshitz, a child health professor at the University of Arizona College of Medicine – Phoenix. “I’ve had to rely on my colleagues’ goodwill to handle the level of data science these projects require. But sometimes it is hard to find people that have the time to help, even if they want to.”
Dr. Lifshitz’s experience is not unlike that of many researchers across the university.
“We had more people sitting with data ready to be processed than we had people to help them process their data,” explained Nirav Merchant, MS, director of the UArizona Data Science Institute and leader of Health Analytics Powerhouse, an initiative of UArizona Health Sciences.
A key part of the Health Analytics Powerhouse initiative is the Data Science Fellows program, which provides training and mentorship to UArizona postdoctoral scientists and doctoral candidates with a health sciences focus.
“The goal of the program is workforce development,” said Merchant, who is a member of the university’s BIO5 Institute. “We want a workforce to be literate in software, data and machine learning so as we grow, we can reduce the strain on researchers.”
What is data science?
Data science is more than just dealing with large amounts of data.
“If you’re working with a lot of data, you’re just doing science,” Merchant explained. “Data science combines data sets and methods that allow you to connect things that are otherwise not easily connected.”
For example, a researcher may want to develop a software program to evaluate and train health care providers on their interactions with consenting patients. One way to design the software is by analyzing audio recordings from the patient interactions, Merchant said.
Researchers with medical and clinical expertise could collaborate with speech and hearing scientists, who can help with natural language patterns, but it is unlikely either group will have the technical expertise to work with a software developer to create the program.
Someone trained in data science, however, can bridge these gaps in expertise. A data scientist has the tools to organize and analyze large quantities of data, such as the thousands of hours of audio recordings such a project might collect. They can also teach the researcher how to better collect and organize data on the front end and how to interpret the data that are produced.
“A lot of what happens in data science is team science,” Merchant said. “Expertise is so broad that you cannot expect one person to know everything. Having the diversity of expertise really elevates the kind of science we can do.”
Teaching the teacher
In Luisa Rojas, a doctoral candidate in the UArizona Health Sciences Clinical Translational Sciences graduate program, Dr. Lifshitz has found someone capable and eager to get started on his “mothballed” projects.
Rojas, a native of Colombia, is in the second cohort of data science fellows, which began in January. She is quickly gaining expertise and sharing that knowledge with Dr. Lifshitz and fellow researchers. Rojas even created a Slack channel called “Data Science” to share what she is learning with her peers.
“She is inspiring others to embrace data science,” Dr. Lifshitz said. “Perhaps now we can make strides on the projects that stalled because we didn’t have an analyst able or available to do the required work.”
Dr. Lifshitz includes Rojas in meetings with his research teams so she can listen and provide relevant input when she sees opportunities to implement data science principles. Rojas is applying data science techniques to her own research, as well. For her translational dissertation, she is conducting research using the fecal microbiome to track the effects of therapeutic drugs for traumatic brain injury.
“I am grateful for the fellowship because now I have a different mindset about data science, the need to think about all the tools, and how we apply these to research projects,” Rojas said.
An eye to the future
The second class of fellows also includes Lydia Jennings, PhD, a postdoctoral researcher in the Department of Community, Environment and Policy at the UArizona Mel and Enid Zuckerman College of Public Health. Dr. Jennings’ research focuses on data policy and governance of environmental databases in relation to Indigenous communities.
Only a few months into the fellowship, Dr. Jennings is already seeing the benefits of the data science toolsets she is acquiring, which go far beyond organizing data.
“I did not learn these skills in my doctoral program, but I really feel like this is the direction research is going,” Dr. Jennings said. “If you’re going to be at the forefront of research, I think it is important to have these skills because they are going to be required for most of the funding opportunities going forward.”
Dr. Jennings also offers a unique perspective for her peers in the Data Science Fellows program, according to her postdoctoral advisor, Stephanie Russo Carroll, DrPH, MPH, assistant professor of public health policy and management at the Zuckerman College of Public Health.
“I think she is contributing more than most students are able to, in terms of thinking about the broader sets of principles that need to affect not only how we behave as data scientists, but how we create cyber infrastructure and how we set policies for interacting with that infrastructure,” Dr. Carroll said.
Building a network
Data science fellows are trained and mentored for one year, during which they meet for formal lecture and lab time twice weekly.
They are expected to spend some of their time in the Bioscience Research Laboratories Data Science Learning Space, where they work with other fellows, assist on special projects, build their data science and domain expertise, develop training material, and deliver workshops and webinars. They spend the rest of their time applying the tools and concepts in their laboratories and research projects.
The program already has its first success story. Gustavo de Oliveira Almeida, PhD, coordinator of the University of Arizona Health Sciences Sensor Lab, was among the first cohort of data science fellows. His fellowship focused on how multiple streams of data can be collected in real time to drive action. The experience prepared Dr. Almeida for his new position with the UArizona Sensor Lab, and now he is helping others across campus.
“We did not lose this talent,” Merchant said. “That is our goal with each cohort. We want to find them a home at the university, where they can help multiple people. We are hoping to build a network of people across campus to collectively elevate data science and the best practices for research analysis.”
Contact
Blair Willis
520-626-2101
bmw23@arizona.edu