Nobody knows funny like Bob Mankoff, cartoon editor of the New Yorker magazine. But when it comes to the magazine’s weekly caption contest, Mankoff can’t do it alone. Each week, the New Yorker takes a cartoon that did not quite make the cut for publication, removes the artist’s original caption, and gives readers a chance to take their best shot at writing the funniest caption to accompany the drawing. And, each week, Mankoff and his staff receive thousands of entries. Sorting through those entries — the good, the bad, and the horrible — is a monumental task. Deciding which are the funniest is more than a one-man job.
But just because cartoon captioning is an art doesn’t mean it can’t be helped by science. To facilitate the caption contest selection process and further engage their audience, Mankoff and the New Yorker have enlisted the help of a machine learning system called NEXT, developed by current and former WID Optimization researchers, including Discovery Fellow Rob Nowak, graduate students Lalit Jain and Scott Sievert, and WID alumnus and current UC Berkeley postdoc Kevin Jamieson. Nowak is a core faculty member of a new graduate training program at UW called LUCID, which aims to combine research in machine learning, cognition, and education in ways that can help solve a wide range of problems in science and industry.
Nowak was introduced to Mankoff through Paula Niedenthal, professor of psychology at UW-Madison, and herself an amateur cartoonist interested in the psychology of humor. Nowak quickly became interested in the caption contest as a machine learning problem. Traditional machine learning uses algorithms to make predictions based on data, but is passive in that the reference data is static. To accelerate machine learning, researchers have developed active learning methods that collect new data adaptively, choosing what to ask next based on the answers gathered so far. NEXT, supported by grants from the National Science Foundation, automatically focuses and optimizes crowdsourcing to allow algorithms to operate on the most informative samples and find answers faster.
NEXT uses cloud computing to help researchers in a variety of data-heavy fields, from psychology to biology to the social sciences and more, move away from passive data collection and toward active machine learning. Because the algorithms and parameters are publicly available, experiments can be replicated quickly and easily.
Mankoff and the staff at the New Yorker are keen on the possibilities afforded by machine learning, not to replace human judgment, but to augment it. “They’re very interested in machine learning and how it can be used to understand human reasoning about humor,” says Nowak. The collaboration with the New Yorker has been under way for a few months; some of the New Yorker’s caption contest cartoons and their captions even appeared in a paper from Nowak’s lab.
The cartoon caption contest, from the perspectives of psychologists and data scientists, can itself present an opportunity for a social experiment relying on crowd-sourced data. Participants in an ongoing study from Nowak’s lab see a drawing accompanied by a caption. The viewer decides whether the caption is unfunny, somewhat funny, or funny. Then, the state-of-the-art active learning algorithm decides which caption the participant sees next. Very quickly, NEXT can generate informative infographics and in-depth statistical analyses and reports. And from the New Yorker’s perspective, machine learning can be a powerful tool that can help sift through thousands of captions and help pick the ones readers find funniest.
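For readers curious how an algorithm might decide which caption a participant sees next, here is a minimal sketch in Python. It is not NEXT's actual method, which this article does not describe. It assumes a standard upper-confidence-bound (UCB) rule as a stand-in, scores the three ratings as 1, 2, and 3, and uses hypothetical names (`CaptionSampler`, `simulate`) invented purely for illustration.

```python
import math
import random

# Assumed numeric scores for the three ratings described in the article.
RATINGS = {"unfunny": 1, "somewhat funny": 2, "funny": 3}


class CaptionSampler:
    """Illustrative adaptive sampler (UCB rule); not NEXT's real algorithm."""

    def __init__(self, captions):
        self.captions = list(captions)
        self.counts = [0] * len(self.captions)    # ratings collected per caption
        self.totals = [0.0] * len(self.captions)  # sum of scores per caption

    def mean(self, i):
        """Average score of caption i so far (0.0 if it is still unrated)."""
        return self.totals[i] / self.counts[i] if self.counts[i] else 0.0

    def next_caption(self):
        """Return the index of the caption to show next."""
        # Show every caption once before comparing them.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        total = sum(self.counts)

        def upper_bound(i):
            # Average score plus a bonus that shrinks as a caption gets
            # more ratings, so uncertain captions still get shown.
            return self.mean(i) + math.sqrt(2.0 * math.log(total) / self.counts[i])

        return max(range(len(self.captions)), key=upper_bound)

    def record(self, i, rating):
        """Store one participant's rating for caption i."""
        self.counts[i] += 1
        self.totals[i] += RATINGS[rating]

    def ranking(self):
        """Caption indices sorted from funniest to least funny so far."""
        return sorted(range(len(self.captions)), key=self.mean, reverse=True)


def simulate(probabilities, rounds=2000, seed=0):
    """Simulate participants; probabilities[i] gives the chance of each
    rating (unfunny, somewhat funny, funny) for caption i."""
    rng = random.Random(seed)
    sampler = CaptionSampler(range(len(probabilities)))
    labels = list(RATINGS)
    for _ in range(rounds):
        i = sampler.next_caption()
        rating = rng.choices(labels, weights=probabilities[i])[0]
        sampler.record(i, rating)
    return sampler
```

In a sketch like this, captions that keep earning low ratings are shown less and less often, so most of the crowd's attention goes to telling the strongest contenders apart. That is the general idea behind collecting data adaptively rather than showing every caption equally often.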
See Mankoff talk about his collaboration with UW-Madison data scientists here.
To see Nowak talk about machine learning, check out his entry in the Morgridge Institute for Research’s Blue Sky Science series.
Image credit: the featured image is taken from a cartoon by Paula Niedenthal. See the full cartoon here.