Tools for Discovery is a monthly profile series that inspects the computer programs, gadgets and methods behind WID’s ideas and discoveries.
As a Frontier Fellow at WID, Sreedevi Nair examines U.S. exoneration cases and searches for patterns to better understand how criminal convictions are overturned. Working as a fellow and mentee under WID faculty mentor Jordan Ellenberg, Nair is creating an algorithm to better characterize and maybe even predict false convictions in the U.S. legal system. Now a sophomore planning to double major in genetics and statistics, she spent her freshman year at UW-Madison as a math major.
What do you work on at WID?
The project I’m working on as a Frontier Fellow deals with exoneration case data. Basically, my project aims to create an algorithm that will let lawyers and others in the legal field calculate the probability of any given conviction being overturned based on certain criteria. The National Registry of Exonerations, which the University of Michigan Law School and the Northwestern University Law School manage together, has a summary of every known exoneration case that has occurred in the United States since 1980.
After reading the case summary provided with each exoneration listed, I have identified useful variables and compiled all the information into my own spreadsheet. Some of the variables I use are fairly generic such as first/last name, age, gender and race. The second type of variable is qualitative, which on the surface does not seem compatible with any kind of data analysis. These are factors that may or may not have played a role in any given case such as inadequate legal defense and official misconduct. These factors often go hand in hand with other ones, such as perjury, false accusation and inadequate evidence.
With this data, I’d like to know what chance there is that a case with a particular factor in it will be overturned. How important was that factor in the exoneration? I look only for the presence of the variable and account for it as a “yes/no” value. When programming, these translate as binary values. Currently, I have no means to evaluate the influence of these variables on each case, which is why this is the best option for this phase of my research. That’s where the algorithm comes in. So in my utopia, lawyers will be able to use this. It’s reducing human behavior to patterns and analysis, but it’s really one of the best indicators because race, age and geographic location play a really huge role.
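To illustrate the yes/no coding described here, the following is a minimal Python sketch using NumPy. It is not Nair's actual code: the factor names, the sample rows and the `encode` helper are hypothetical and made up for illustration. It turns "yes/no" entries into 0/1 values, then counts how often each factor appears and how often two factors appear together.

```python
import numpy as np

# Hypothetical factor names, loosely based on those mentioned in the interview.
FACTORS = ["perjury", "official_misconduct", "inadequate_defense"]

# Made-up example rows; these are NOT real Registry data.
cases = [
    {"perjury": "yes", "official_misconduct": "no", "inadequate_defense": "yes"},
    {"perjury": "yes", "official_misconduct": "yes", "inadequate_defense": "no"},
    {"perjury": "no", "official_misconduct": "yes", "inadequate_defense": "no"},
    {"perjury": "yes", "official_misconduct": "yes", "inadequate_defense": "no"},
]


def encode(rows, factors):
    """Map each 'yes'/'no' entry to 1/0, one row per case, one column per factor."""
    return np.array(
        [[1 if row[f].strip().lower() == "yes" else 0 for f in factors] for row in rows],
        dtype=int,
    )


X = encode(cases, FACTORS)

# Fraction of cases in which each factor is present.
share = X.mean(axis=0)

# Entry [i, j] counts the cases where factors i and j both appear.
cooccurrence = X.T @ X

for name, value in zip(FACTORS, share):
    print(f"{name}: present in {value:.0%} of cases")
```

Because every row in such a table is already an exoneration, these shares describe how common a factor is among overturned convictions. On their own they do not give the probability that a conviction will be overturned.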
We usually think of doctors and teachers as people who are really contributing to society, but mathematicians and statisticians can really change the world. Our world is so Big Data driven that we can tell so much about ourselves — our inclinations, our biases and our habits — with numbers. But without anyone who can interpret them, they are just numbers.
What are your tools for analysis?
I use a whiteboard or just pen and paper. I have two notebooks for the project. I have a small one that I can fit in my backpack or in my jeans pocket. I also have a huge college-ruled notebook that I keep with me whenever I’m working on the project on a computer. It’s been especially helpful when I have questions about code. I’m using Excel for inputting the data on the project. A lot of statisticians and mathematicians prefer to have their data in CSV, and that’s something I’m still getting used to. I’ve used Excel so much; it’s hard to break away from that. For statistical analysis I’ve been using Python and a little bit of R, a programming language. In Python, I’ve been using NumPy and SciPy.
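As a rough illustration of the Excel-to-CSV workflow mentioned here, the sketch below reads CSV text with Python's standard `csv` module and converts it to NumPy arrays. The column names, the sample rows and the `load_cases` helper are assumptions made for this example, not the layout of Nair's spreadsheet.

```python
import csv
import io

import numpy as np

# Made-up CSV text standing in for a spreadsheet exported from Excel.
CSV_TEXT = """case_id,age,perjury,official_misconduct
1,24,yes,no
2,31,no,yes
3,19,yes,yes
"""

# Hypothetical yes/no columns to convert to binary values.
FLAG_COLUMNS = ["perjury", "official_misconduct"]


def load_cases(handle):
    """Read CSV rows and return (ages, flags) as NumPy arrays."""
    rows = list(csv.DictReader(handle))
    ages = np.array([int(row["age"]) for row in rows])
    flags = np.array(
        [[1 if row[col] == "yes" else 0 for col in FLAG_COLUMNS] for row in rows],
        dtype=int,
    )
    return ages, flags


# io.StringIO keeps the example self-contained; a real file would be opened
# with open(path, newline="") as the csv module documentation recommends.
ages, flags = load_cases(io.StringIO(CSV_TEXT))

print("mean age:", ages.mean())
print("cases per factor:", dict(zip(FLAG_COLUMNS, flags.sum(axis=0).tolist())))
```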
Tools for writing?
I don’t actually write that much because I’m still very much in the statistical world in this project. But when I do write a paper about this research, I would write it in Microsoft Word, and I know I’d use graphics. I think in science, especially, “a picture is worth a thousand words,” so I would try to minimize word count and add as many tables, charts and pictures as possible. When I write, I use pen and paper and then type it.
Tools for collaboration?
I collaborate with Professor Ellenberg on this project. Usually we meet face-to-face in his office. This works really well for me. He has a chalkboard, and I understand things very clearly if I can see them. So if I have a question about the data or about programming, it’s incredibly useful to be able to see it on the board. By and large, I prefer face-to-face discussions, so Skype is a really good second option. My other main collaborator is actually my dad. He is very interested in my project. We just bounce ideas back and forth. He has this uncanny ability to just remove all of the background noise in a problem and stare solely into the heart of the problem. That’s really helpful. If he can just give me one direct route, it’s much easier to approach.
Your ultimate tool for discovery?
Your own curiosity, motivation, desire and perseverance are much more useful for discovery than anything else. If you are genuinely curious and passionate enough about it, you can go off on your own and try to discover whatever it is, especially in this day and age where research is so accessible and data of any kind is so easy to find.
— Interview conducted and edited by Mary Sussman