Meet Dr. Ryan Urbanowicz, Co-Lead of the Technology Identification and Training Core at PennAITech. In his day job, he serves as an Assistant Professor of Computational Biology at Cedars-Sinai Medical Center and holds an adjunct position at the University of Pennsylvania. Dr. Urbanowicz's research blends machine learning, artificial intelligence, and data science to solve biomedical and clinical problems. His work has led to the development of several software packages, such as ReBATE and STREAMLINE, that focus on complex data patterns, interpretability, and scalability. Highly collaborative in nature, his research engages clinicians and investigators across various medical fields. Dr. Urbanowicz is also a dedicated educator, sharing his knowledge through courses, tutorials, and a YouTube channel. Follow along as we delve into his perspectives on the challenges and opportunities in aging and dementia research—and how AI can be a game-changer in addressing them.
My research focuses on the development and application of machine learning, artificial intelligence, informatics, and data science methods to biomedical and clinical problems. More specifically, our research group has developed a number of software packages including ReBATE (for interaction-sensitive feature selection), GAMETES (genomic data simulation), ExSTraCS (interpretable rule-based machine learning modeling), FIBERS (feature learning in survival analysis data), and STREAMLINE (an end-to-end automated machine learning analysis pipeline). Our work specializes in the development of tools that (1) tackle complex patterns of association in data, including epistatic interactions and heterogeneity, (2) yield interpretable and/or explainable models and outputs, and (3) scale to larger data analyses. Our work is highly collaborative, relying on partnerships with clinicians and investigators studying biomedical outcomes such as obstructive sleep apnea, pancreatic cancer, congenital heart disease, Alzheimer's disease, bladder cancer, patient readmission, and others. I'm also a passionate educator and mentor, teaching courses, tutorials, and workshops in person and on YouTube (https://www.youtube.com/channel/UCHIKWNLhglKHJpkxw993sKQ).
I've always loved a good puzzle and solving problems that require thinking outside the box. I'd attribute my general involvement and commitment to biomedical research to my father, who battled lymphoma throughout my childhood, and my motivation to work in agetech, aging, and dementia to the simple acknowledgement that this is an area that impacts almost everyone at one point or another in their lives. My educational background originally focused on biomedical engineering, which laid a natural foundation for a career that would integrate biomedical research with methodological design and computer science. My start in computational biology was in many ways the random product of circumstance; i.e., choosing to go to graduate school at Dartmouth College and rotating in the research lab of [PennAITech Co-Principal Investigator] Dr. Jason H. Moore, who introduced me to this field as a possible career. I was immediately drawn to this field because it seemed to have incredible potential for broad impact and could expose me to a wide variety of research opportunities. With respect to AI, it's extremely exciting to be working in such a diverse and cutting-edge area of study, home to some of the most interesting and challenging puzzles around.
Given that my expertise is more focused on the AI and data science side of research, from my perspective the biggest gaps in the current landscape appear in how data is collected, shared, integrated, analyzed, and translated into better research and care. The collection of data is often a critical bottleneck. What information is most valuable to collect? How do we collect it? How do we avoid bias in collecting it? Data sharing is often a critical element of the process in terms of having enough of it to do something meaningful, knowing that our findings generalize to the broader population, or simply making research more cost effective and efficient. In terms of data analysis, how do we know we are using the right or best tool for the job? What biases do the tools we apply have? Are there better ways to conduct analyses than what we've used in the past? And lastly, what are the best ways to translate our findings to benefit patients and caregivers? Are the strategies or technologies we implement fair, accessible, effective, and cost-effective, and can we monitor their performance to identify limitations and failings?
The field of AI is obviously extremely exciting and full of opportunity. In my experience, coming up with ideas is the easy part. The real challenge is separating hype from substance in developing ideas that are built on a solid foundation, which requires (1) identifying an impactful problem; (2) understanding that problem from the perspective of patients, caregivers, and other stakeholders; (3) understanding the data needs, quality, and accessibility; (4) understanding the current availability and limitations of technologies and tools; and (5) finding the right team that brings together whatever interdisciplinary expertise is needed.
I think the most useful advice I've gotten over the years (from multiple mentors) is to know your audience and share your ideas and research in a simple, honest, and accessible manner. If you can't explain something clearly, how well do you truly understand it, and how can you expect others to trust it or provide useful feedback on it? This advice has certainly had a positive impact on my teaching, scientific communication, grant writing, and research. Following it has forced me to ask more questions, put myself in other people's shoes, and deepen my own understanding.