(adapted from a Zoom interview between Samantha Rydzewski and Sarah Lapean)
Samantha Rydzewski, a computer science and math double major, took some time out of her busy schedule on March 22 to tell us about her thesis with the computer science department. She is looking at suboptimal attacks against machine learning models, and investigating how they compare to optimal attacks.
The journey towards her thesis began the summer after her first year, when she participated in the SURF program with Professor Scott Alfeld, who would later become her advisor. It was a machine learning project, and it got her hooked. She also discovered that she enjoys working on long-term projects, and that she learns more from their longevity. After that summer, she took more computer science classes and attended thesis talks. She saw a computer science thesis as an exciting way to challenge herself in her senior year and keep herself engaged.
The topic of her thesis is adversarial machine learning. Machine learning finds patterns in data to build a model that then performs a set task, such as a spam filter learning to differentiate between spam and legitimate mail. Adversarial machine learning, therefore, “focuses on the idea that there is someone who wants to manipulate the process of machine learning.” Imagine there is someone trying to fool the spam filter. That’s where Samantha comes in.
Most literature focuses on how to construct the optimal attack. Samantha works on the opposite end of the spectrum: an attacker in the real world who does not know machine learning, and is “just some guy typing away at his computer, seeing what will happen.” How would you model these kinds of suboptimal attacks?
Her research consists of two parts. First, she generates fake (synthetic) data that is “nice, pretty, easy to play with.” She trains a model on this data, producing a learner. Next, Samantha applies her suboptimal attack, followed by the optimal attack. She can then play around with parameters, increasing and decreasing them to see how each change affects the process. Every time she runs a test, she gets a score that reflects how well she fooled the learner. The second part of her research will involve taking real-world data (such as image recognition and spam filter data sets) and running her attacks to see what happens.
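To make the first part of that workflow concrete, here is a minimal Python sketch. It is not Samantha's code: the interview does not say which model, attacks, or score she uses, so every choice below is an assumption made for illustration. It assumes a least-squares linear classifier as the learner, an attacker who nudges a single input by a fixed amount (the `budget`), a "suboptimal" attacker who picks the direction at random, and a score equal to how far the learner's output drops. All function names are hypothetical.

```python
import numpy as np

def make_synthetic_data(n=200, d=5, seed=0):
    """Generate easy, well-behaved fake data labeled by a hidden linear rule."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = np.where(X @ w_true >= 0, 1.0, -1.0)
    return X, y

def train_learner(X, y):
    """Fit a least-squares linear classifier (assumed learner, not from the source)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def optimal_attack(w, x, budget):
    """Move x directly against the weight vector: for a linear model this gives
    the largest possible drop in w @ x for a perturbation of L2 norm `budget`."""
    return x - budget * w / np.linalg.norm(w)

def suboptimal_attack(x, budget, rng):
    """Move x the same distance, but in a random direction ("just some guy typing")."""
    direction = rng.normal(size=x.shape)
    return x + budget * direction / np.linalg.norm(direction)

def score(w, x, x_attacked):
    """Assumed score: how much the learner's output fell after the attack."""
    return float(w @ x - w @ x_attacked)

def run_experiment(budget=1.0, trials=100, seed=0):
    """Train on fake data, then compare one optimal attack to many random ones."""
    X, y = make_synthetic_data(seed=seed)
    w = train_learner(X, y)
    rng = np.random.default_rng(seed + 1)
    x = X[0]  # the input the attacker tries to manipulate
    opt = score(w, x, optimal_attack(w, x, budget))
    sub = [score(w, x, suboptimal_attack(x, budget, rng)) for _ in range(trials)]
    return opt, float(np.mean(sub)), float(np.max(sub))
```

Calling `run_experiment()` returns the optimal attack's score alongside the average and best scores of the random attacks. In this simplified setting, no random perturbation of the same size can beat the optimal one, and changing `budget` or `trials` is a small-scale analogue of adjusting parameters to see how the scores respond.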
Samantha typically devotes time to her thesis on Tuesdays. She goes to the comp-sci thesis room in the Science Center and sets herself up amongst the giant monitors and whiteboards. She either works on her thesis draft or on coding some new part of her experiment. For a while, her typical day involved adapting code for specific attacks and reading literature on the many different kinds of optimal attacks. She would look through the citations to see what their implementations looked like, and then bring that into her research. Recently, she’s been working on justifying her parameters. This requires coming up with new experiments and running tests. She has a weekly meeting with her advisor to map out where she is, what her next task is, and how she wants to get there.
Samantha believes that machine learning in tech is a hot topic right now, and people are rushing to incorporate it into daily life. However, she reminds us that machine learning, like any computer system, has vulnerabilities. Seemingly minor problems in machine learning can result in significant consequences in areas such as medicine and finance. You have to make sure it cannot be influenced, and that it’s working the way it’s supposed to.
As for advice for people considering writing a thesis, Samantha says don’t be afraid to ask questions. She advises seriously considering what you want to research, and taking new classes until you find a professor that you click with. It is important to find an advisor you like, and who will give you the structure you need. Be proactive. Give yourself deadlines. Start reading over the summer. Don’t procrastinate (even though everybody does).
P.S. She really is obsessed with the CS thesis lounge.