AI Security Pals
Cryptopals for AI security

UPDATE: I'm currently developing this idea with Jessica Cussins Newman and Eddie Melcer.


While the ability to deploy AI security threats (e.g., deepfakes) is increasingly "democratized," the strategies for defending against these attacks are not. Current educational resources for defending against AI/ML attacks require advanced theoretical knowledge of both machine learning and security. The result is that barriers to entry for would-be defenders are higher than for would-be attackers.

This project develops a hands-on, interactive introduction to defensive machine learning security built around coding puzzles. Our approach, modeled on successful examples in cryptography, will presume no theoretical knowledge—only a workable, practitioner's understanding of machine learning. We will evaluate this work with undergraduate students, using quantitative and qualitative methods to gauge their experience with the teaching materials.

Prior work

The Cryptopals Crypto Challenges provide an amazing service: a walk through practical attacks in cryptography.1 The on-the-ground practicality of Cryptopals makes it an incredible teaching tool: it assumes no formal mathematical background, but has you implementing attacks that all-too-frequently work in the real world. It has taught me, and a few other people I know, how to apply cryptography effectively.

Cryptopals has three key design features (to my mind):

  • Cryptopals is a puzzle. It's clear when you've gotten the answer.
  • Cryptopals asks users to play both red team and blue team. As the player, you break cryptography, then make it stronger, then break it again.
  • Cryptopals is self-contained. It doesn't rely on any third-party services or packages, and it doesn't care what language you use. You'll be building the tools you need from scratch.

Meanwhile, the idea of deepfake detection as a game has been posed here. This is more of a 'formal game,' but it is still useful in justifying the gamified approach, I think.

An analogy for ML

Here are some priors about the content. This is probably not an exhaustive list, but I want to check my most important priors.

  • Deepfakes matter. Everyone is talking about them. At least one set should focus on GANs and generative attacks. One idea might be to do a low-tech, Nancy Pelosi-style fake, have the player detect it, implement a proper deepfake to do better, try to detect that using whatever Hany Farid knows how to do, etc.
  • Dataset poisoning matters. Datasets get poisoned. Can we ask the user to poison a dataset, then detect that the dataset has been poisoned, then ask them to poison it more effectively?
  • Adversarial examples matter. We should ask users to build discriminators, generate adversarial examples for those discriminators, then improve their discriminators.
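The poisoning loop in the second bullet can be sketched end-to-end on toy data. Everything below (the two-cluster dataset, the label-flipping attack, and the nearest-neighbor disagreement detector) is an illustrative choice of mine, not an existing challenge:

```python
import random

random.seed(0)

# Toy dataset: two well-separated 1-D clusters, labeled 0 and 1.
data = [(random.gauss(0.0, 0.5), 0) for _ in range(20)] + \
       [(random.gauss(5.0, 0.5), 1) for _ in range(20)]

def poison(dataset, n_flips):
    """Label-flipping attack: flip the labels of n_flips random points."""
    poisoned = list(dataset)
    idx = random.sample(range(len(poisoned)), n_flips)
    for i in idx:
        x, y = poisoned[i]
        poisoned[i] = (x, 1 - y)
    return poisoned, set(idx)

def flag_suspects(dataset, k=5):
    """Crude poisoning detector: flag points whose label disagrees
    with the majority label of their k nearest neighbors."""
    suspects = set()
    for i, (x, y) in enumerate(dataset):
        neighbors = sorted(
            (j for j in range(len(dataset)) if j != i),
            key=lambda j: abs(dataset[j][0] - x))[:k]
        votes = sum(dataset[j][1] for j in neighbors)
        majority = 1 if votes > k // 2 else 0
        if majority != y:
            suspects.add(i)
    return suspects

poisoned, flipped = poison(data, n_flips=4)
suspects = flag_suspects(poisoned)
print("flipped:", sorted(flipped))
print("flagged:", sorted(suspects))
```

A follow-up round of the puzzle would then ask the player to poison more subtly—flipping only points near the cluster boundary, say—so that this detector starts to miss.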
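The adversarial-examples bullet can likewise be played out in miniature: train a tiny logistic-regression "discriminator," then flip its prediction with an FGSM-style perturbation (step the input along the sign of the loss gradient). The dataset, model, and step size below are all invented for illustration; a real challenge set would scale this up.

```python
import math, random

random.seed(1)

# Toy 2-D data: class 0 near (0,0), class 1 near (3,3).
X = [(random.gauss(0, 0.6), random.gauss(0, 0.6)) for _ in range(30)] + \
    [(random.gauss(3, 0.6), random.gauss(3, 0.6)) for _ in range(30)]
Y = [0] * 30 + [1] * 30

w = [0.0, 0.0]
b = 0.0

def predict(x):
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# Train the logistic-regression discriminator with plain gradient descent.
for _ in range(2000):
    gw = [0.0, 0.0]; gb = 0.0
    for x, y in zip(X, Y):
        err = predict(x) - y
        gw[0] += err * x[0]; gw[1] += err * x[1]; gb += err
    lr = 0.01 / len(X)
    w[0] -= lr * gw[0]; w[1] -= lr * gw[1]; b -= lr * gb

# FGSM-style attack on a clearly class-1 input:
# x' = x + eps * sign(d loss / d x), where d loss / d x = (p - y) * w.
x = (3.0, 3.0)
err = predict(x) - 1            # gradient factor for true label 1
eps = 2.0
x_adv = (x[0] + eps * math.copysign(1, err * w[0]),
         x[1] + eps * math.copysign(1, err * w[1]))
print(predict(x), predict(x_adv))
```

The "blue team" half of the puzzle would then ask the player to retrain on adversarially perturbed inputs and check whether the same attack still works.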

Here are some priors about the design procedure.

  • Domain experts should take ownership over domains. For example, ideally, someone like Hany Farid (or students) would take ownership over a deepfakes challenge. CLTC's goal should be to provide design research and executive direction, in that order.
  • Deepfakes may be the best place to start. They're the most topical-seeming, and it's obvious enough how the Cryptopals model might apply.

TODO Known unknowns

  1. I do not know enough about current AI safety/adversarial ML courses. I would need to collect some and read up to see what void this really fills.
  2. I have not yet articulated the "market(s)" for these challenges, in part because I do not know the "market(s)" for the Cryptopals challenges. Cryptopals requires programming experience, but not domain expertise in cryptography. Do we expect folks to have some expertise in machine learning?

Design notes

Existing resources

From Jess [2020-01-09 Thu]

Is there a policy angle?

Maybe informing educational policy, by condensing policymakers' concerns into a finite list of things we can teach. We want to meet society's needs, not just what academics think those needs are. Computational propaganda is one example.

Design inspirations and starting points

on [2019-12-26 Thu], Nitin says

I did an activity this past semester with my students on getting them to trick Google's Teachable Machine that is related to this. I had them create a classifier and try to get the machine to classify "how many fingers am I holding up," but with confounding features instead (e.g., have the machine pick up on left hand vs. right hand, or some portrait in the background, etc.)

if we could build a self-contained version of that exercise, it could be a great PoC
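A self-contained version might start from a toy sketch like the one below, in which a deliberately naive learner latches onto a confounding feature that is perfectly predictive in training but uninformative at test time. The synthetic dataset and the one-rule learner here are invented for illustration, standing in for "finger count" (true feature) and "which hand is raised" (confound):

```python
import random

random.seed(2)

def make_data(n, confound_matches_label):
    """Each example: (true_feature, confound_feature, label).
    The true feature predicts the label with 10% noise; the confound
    matches the label exactly only when confound_matches_label is True."""
    data = []
    for _ in range(n):
        label = random.randint(0, 1)
        true_f = label if random.random() > 0.1 else 1 - label
        conf_f = label if confound_matches_label else random.randint(0, 1)
        data.append((true_f, conf_f, label))
    return data

def train_one_rule(data):
    """Naive learner: predict with whichever single feature has the
    higher training accuracy."""
    accs = [sum(ex[feat] == ex[2] for ex in data) / len(data)
            for feat in (0, 1)]
    return max((0, 1), key=lambda f: accs[f])

def accuracy(feat, data):
    return sum(ex[feat] == ex[2] for ex in data) / len(data)

train = make_data(200, confound_matches_label=True)
test = make_data(200, confound_matches_label=False)

chosen = train_one_rule(train)   # picks the confound: it is "perfect" in training
print("chosen feature:", chosen)
print("train acc:", accuracy(chosen, train))
print("test acc:", accuracy(chosen, test))
```

The player's job, as in Nitin's classroom version, would be to notice the collapse from training to test accuracy and diagnose which feature the model actually learned.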

on [2019-12-27 Fri] richmond warns against using the term "bias"

to me that's a much more socially situated/constructed thing than what this sounds like. e.g., bias from systematic social discrimination which affects data collection, etc. although if you want to do a set of problems that's a more socially situated version, that could be cool too. like do some adversarial back and forth stuff around inferring gender using ML, getting to the ultimate point that you shouldn't really use ML to infer gender as a default action (because of the harms it can produce), unless you have a really good reason to use gender as a category. (which was a recommendation from morgan klaus scheuerman's cscw paper this year on gender harms)

Some design observations on Cryptopals

These may or may not apply to our domain. They're meant as descriptive characteristics of Cryptopals.

  • Cryptopals doesn't check your answer. They run a static webpage. That's easy to maintain—and it's up to you to verify your results.
  • Cryptopals has no theme or domain. While adding one could be fun (cf. adventofcode), it can also be tacky, uncool, or downright offensive to some. It can also go stale over time.
  • Cryptopals has a definite hazing vibe. Like many things in infosec, the implicit challenge seems to be that the user isn't smart or cool enough to solve the challenge… until they prove themselves. The downside of this approach is that it's obviously exclusionary. The upside is that it does seem to get a certain type of person really invested in the issue. Whether we want that type of person invested versus how much we need that type of person invested is a sort of utility/culture tradeoff.

Funding leads

The CITRIS seed grant is one route—potentially collaborating with Joe Dumit at UC Davis.

Another is the NSF SaTC EDU track. I inquired with the program officer, and Li Yang confirmed that "your topic fits in the scope of SaTC-EDU designation." They noted that "proposals must have clear and specific plans for assessment and evaluation," and pointed to the solicitation, which outlines specific criteria for intellectual merit and broader impacts.



To quote its designer, Thomas H. Ptacek, "I work on crypto the way other vulnerability researchers work on iOS, or on Windows kernel vulnerabilities."

Date: 2020-02-21 Fri 00:00


Created: 2020-11-17 Tue 16:15