New tools for winnowing and learning
While the application-oriented side of big data is a common theme in the College of Engineering, UW-Madison engineers also want to make transformative contributions to the mathematical and theoretical underpinnings of analyzing massive data sets. For Electrical and Computer Engineering Associate Professor Rebecca Willett and McFarland-Bascom Professor of Electrical and Computer Engineering Rob Nowak, both Discovery Fellows, working with colleagues such as Industrial and Systems Engineering and Computer Sciences Professor Stephen Wright (also a Discovery Fellow), that effort centers on the intersection of human expertise and computing power.

They focus primarily on developing algorithms that winnow big data sets down to their most important elements for human analysis, and on creating algorithms and theory that enable machines to learn from human experts with a minimum of human interaction. This research means confronting the limitations of current data-analysis tools from a computing angle, and also confronting the bottlenecks that human experts create when carrying out their role in the data-analysis process. “There are certain big-data problems that we don’t yet know how to solve, where we need new tools,” Willett says.
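The researchers’ actual algorithms are far more sophisticated, but the general idea of letting a machine decide which data points most need an expert’s eye, an approach often called active learning, can be sketched in a few lines. The synthetic data, the logistic-regression model, and the uncertainty-sampling rule below are illustrative assumptions for the sketch, not the researchers’ own methods.

```python
# Toy sketch of active learning: the machine asks a simulated human expert to
# label only the examples it is most uncertain about, instead of all of them.
# Generic illustration only, not the UW-Madison researchers' actual algorithms.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "big" data set: two overlapping clusters (stand-ins for, say, two
# galaxy types). True labels exist but are hidden behind the "expert" below.
n = 5000
X = np.vstack([rng.normal(-1.0, 1.5, (n // 2, 2)),
               rng.normal(+1.0, 1.5, (n // 2, 2))])
true_labels = np.repeat([0, 1], n // 2)

def ask_expert(indices):
    """Simulate a human expert labeling a handful of examples."""
    return true_labels[indices]

# Start with a tiny labeled seed set containing a few points from each cluster.
labeled = list(rng.choice(n // 2, size=5, replace=False)) + \
          list(n // 2 + rng.choice(n // 2, size=5, replace=False))
model = LogisticRegression()

for _ in range(5):
    model.fit(X[labeled], ask_expert(labeled))

    # Uncertainty sampling: find the unlabeled points the model is least sure
    # about (predicted probability closest to 0.5) and send only those few to
    # the expert for labeling.
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    proba = model.predict_proba(X[unlabeled])[:, 1]
    most_uncertain = unlabeled[np.argsort(np.abs(proba - 0.5))[:10]]
    labeled.extend(most_uncertain.tolist())

accuracy = model.score(X, true_labels)
print(f"Expert labeled {len(labeled)} of {n} points; accuracy = {accuracy:.2f}")
```

Even in this toy setting, the expert looks at only a few dozen points out of thousands: the machine does the winnowing, and the human supplies the judgment.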
While Willett and Nowak focus on advancing the fundamental underpinnings of big-data research, they also think about the myriad opportunities for applications in data sets that emerge in settings ranging from astronomers’ satellites to neuroimaging studies. Some of the fundamental questions they tackle are informed by collaborations, including one with UW-Madison astronomy professor Christy Tremonti, and by their roles with the newly established UW-Madison Center for Predictive Computational Phenotyping, which aims to turn massive amounts of patient data into useful information that could inform treatments and health risk assessments.
On the importance of human input: “Methods that Google and Facebook use to automatically analyze images do not always translate to scientific data,” Willett says. “The features that an astronomer may use to distinguish between two galaxies may be totally different from the features that allow me to distinguish between a cat and a dog. We need easy-to-use methods that incorporate that astronomer’s expertise without it being a burden.”
On the promise and challenges of big data: “Any time you’re trying to do data analysis, there are costs,” Willett says. “But computational costs are decreasing, so we can do more computation with fewer resources—this is not the bottleneck. Rather, the bottleneck is the human expert that can only examine a small fraction of the available data.”