How we prevent Code.org from becoming a Botnet
For most of Code.org's history, teachers and students could access nearly all Code.org courses without being signed in or having an account. A free account is required only to access CS Principles (CSP) or CS Discoveries (CSD) content. Furthermore, teacher verification is required for a small set of things (namely answer keys for CSP and CSD). We also require teacher verification to access our CSA course and Java Lab content.
Note: For CSA, teachers must assign the CSA course to their class section for students to be able to run code in Java Lab.
Why is teacher verification required in order to run code in CSA and Java Lab? Why does becoming verified unlock the same action for a teacher’s students? The short answer: We wanted to keep Code.org from becoming a botnet!
Let’s break that down. What’s a botnet? How could Code.org become one? How does teacher verification prevent this?
Botnets
In short, a botnet is a collection of computers, phones, smart TVs, smart lights, and other internet-connected devices that have been compromised by an attacker and directed to execute the same program. These are usually devices that have been infected with a virus and are controlled by a hacker. The programs are generally sophisticated enough to run in the background so the actual owner of the device doesn’t know what’s happening.
These botnets then get used for a variety of actions that range from legal (except that they are running on stolen resources) to very illegal. These actions could include mining cryptocurrency, executing DDoS attacks, spam advertising, and many others. Botnets can also be rented for prices that scale with the size and duration of the attack. (Wallarm has a great in-depth article on botnets if you're interested in learning more.)
How does this relate to Code.org? Read on.
Code.org's CSA curriculum
In 2022, Code.org launched our first CSA curriculum, which can be used to teach the skills necessary for passing the AP CSA College Board exam. In CSA, students learn Java, a language that needs to be compiled and then run in an environment that has a Java Virtual Machine (JVM) installed. Up until Code.org added this CSA curriculum, all student and teacher code on Code.org was run in each user’s browser, on each user’s computer. This was possible because all Code.org coding environments, such as App Lab or Sprite Lab, used some variation of JavaScript, a language that browsers understand.
Not so for Java.
In order to run Java code, we needed a new approach. The full details could fit in a blog post of their own, but in short, student and teacher code needed to run on an operating system that has Java and a JVM installed. In order to keep from asking our teachers and students to install this on their own computer (a lot of work for each school’s local IT department), we instead compile and run the user-created code on our own servers.
For classroom use, this is innocent enough. This enables students to create street art in The Neighborhood, create image filters in The Theater, and make interactive console projects.
For an attacker, this presents an opportunity. In order to create a botnet, an attacker usually has to create a virus that circumvents each device’s built-in security settings in order to run the attacker’s code. In Java Lab, an attacker has direct access to our servers and can run arbitrary code, no circumvention needed.
To mitigate this, we needed to do a few things. First, we set time limits on how long a user’s program can run (we also did a couple of other things, such as preventing network requests, but that’s less relevant for this story). That first step was a huge help, since most uses of a botnet require open access to the internet or a lot of time. But an attacker can get around the time limit. They would just need to split their program into small pieces and run it thousands of times in Java Lab to get the same effect. The next step would be to limit the number of times each user can run a program. But that leads us to our next challenge…
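The time limit described above can be sketched in Java roughly as follows. This is a minimal illustration, not Code.org's actual implementation: the class name `TimeLimitSketch`, the method `runWithLimit`, and the limits used are all hypothetical. Cancelling a thread also only works if the running code responds to interruption, so a real service running untrusted code would need stronger isolation, such as a separate process that can be killed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: these names are illustrative, not from Code.org's codebase.
class TimeLimitSketch {
    /** Runs the task and returns its result, or "TIMED_OUT" if it exceeds limitMillis. */
    static String runWithLimit(Callable<String> task, long limitMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(task);
        try {
            // Wait for the result, but only up to the time limit.
            return future.get(limitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return "TIMED_OUT";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return "FAILED";
        } catch (ExecutionException e) {
            return "FAILED"; // the task itself threw an exception
        } finally {
            executor.shutdownNow(); // never leave the worker thread running
        }
    }

    public static void main(String[] args) {
        // A quick program finishes within the limit.
        System.out.println(runWithLimit(() -> "done", 1000)); // prints "done"
        // A long-running program is cut off.
        System.out.println(runWithLimit(() -> {
            Thread.sleep(5000);
            return "done";
        }, 100)); // prints "TIMED_OUT"
    }
}
```

A limit like this bounds each individual run, which is why it does nothing against an attacker who splits the work into many short runs.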
User Accounts
At Code.org, we want to keep the barrier to entry as low as possible. This is why many of our activities and courses can be accessed without even making an account on our website. But in order to do certain things, such as record a user’s progress, we need to know which user is accessing the site. That means we need to encourage students and teachers to create accounts. But again, to keep the barrier to entry low, we set up the ability for teachers to make student accounts in bulk and for students to make accounts without an associated email address. After all, how many 2nd grade students have an email address that they can reliably access?
For our users, this is great. But for an attacker, it presents yet another opportunity.
That limit we set on the number of times a user can run a program? The attacker can get around it by just making thousands of accounts that represent fake people.
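To make the problem concrete, here is a minimal sketch of a per-account run limit and of how extra accounts defeat it. Everything in it is an assumption for illustration: the class `RunLimiterSketch`, its methods, and the limit of 10 runs are hypothetical and not Code.org's real values or code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names and numbers are illustrative only.
class RunLimiterSketch {
    private final int maxRunsPerAccount;
    private final Map<String, Integer> runsByAccount = new HashMap<>();

    RunLimiterSketch(int maxRunsPerAccount) {
        this.maxRunsPerAccount = maxRunsPerAccount;
    }

    /** Records a run and returns true if the account is still under its limit. */
    boolean tryRun(String accountId) {
        int used = runsByAccount.getOrDefault(accountId, 0);
        if (used >= maxRunsPerAccount) {
            return false;
        }
        runsByAccount.put(accountId, used + 1);
        return true;
    }

    /** Counts how many runs are allowed when each account attempts the given number of runs. */
    static int totalAllowedRuns(int accounts, int attemptsPerAccount, int limit) {
        RunLimiterSketch limiter = new RunLimiterSketch(limit);
        int allowed = 0;
        for (int a = 0; a < accounts; a++) {
            for (int i = 0; i < attemptsPerAccount; i++) {
                if (limiter.tryRun("account-" + a)) {
                    allowed++;
                }
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // One account hammering the service is capped at the limit.
        System.out.println(totalAllowedRuns(1, 1000, 10)); // prints 10
        // A thousand fake accounts each stay under the limit, so every run goes through.
        System.out.println(totalAllowedRuns(1000, 10, 10)); // prints 10000
    }
}
```

The limit only means something if creating another account is costly, which is exactly what free, email-less accounts are designed to avoid.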
Enter: Teacher Verification
Back in 2015, we needed to create a way to ensure that only teachers were given access to CSP answer keys, example solutions, and specific features such as the ability to give feedback to students. We had to be reasonably sure that each individual given this ability was actually a teacher in real life (instead of just claiming to be one on the internet). The process of teacher verification was born.
A side effect of this process is that each user account that is a verified teacher account is also directly associated with a real person. This was the solution we needed to prevent these kinds of botnet-generating attacks on Java Lab.
With verified teacher accounts, we could ensure that a single individual couldn’t make thousands of fake accounts to get around our usage limits. Additionally, student accounts assigned to a section are directly tied to their teacher’s account. By checking whether a user is in a verified teacher’s section, we can extend usage limits to the students in that teacher’s sections as well.
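One way to picture this check is the sketch below. It is an assumption-heavy illustration, not Code.org's implementation: the class `VerifiedAccessSketch`, its fields, and the idea of a single run budget shared by a teacher and their students are all hypothetical choices made to keep the example small.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: names, data shapes, and the shared budget are assumptions.
class VerifiedAccessSketch {
    private final Set<String> verifiedTeachers;
    private final Map<String, String> teacherByStudent; // student id -> section teacher id
    private final int maxRunsPerTeacher;
    private final Map<String, Integer> runsByTeacher = new HashMap<>();

    VerifiedAccessSketch(Set<String> verifiedTeachers,
                         Map<String, String> teacherByStudent,
                         int maxRunsPerTeacher) {
        this.verifiedTeachers = verifiedTeachers;
        this.teacherByStudent = teacherByStudent;
        this.maxRunsPerTeacher = maxRunsPerTeacher;
    }

    /** Finds the verified teacher who vouches for this user, if there is one. */
    Optional<String> responsibleTeacher(String userId) {
        if (verifiedTeachers.contains(userId)) {
            return Optional.of(userId);
        }
        String teacher = teacherByStudent.get(userId);
        if (teacher != null && verifiedTeachers.contains(teacher)) {
            return Optional.of(teacher);
        }
        return Optional.empty();
    }

    /** Allows a run only for vouched-for users, charging it to the teacher's budget. */
    boolean tryRun(String userId) {
        Optional<String> teacher = responsibleTeacher(userId);
        if (!teacher.isPresent()) {
            return false; // no verified teacher behind this account
        }
        int used = runsByTeacher.getOrDefault(teacher.get(), 0);
        if (used >= maxRunsPerTeacher) {
            return false; // the teacher's shared budget is spent
        }
        runsByTeacher.put(teacher.get(), used + 1);
        return true;
    }

    public static void main(String[] args) {
        VerifiedAccessSketch access = new VerifiedAccessSketch(
                Set.of("teacher1"),
                Map.of("studentA", "teacher1", "studentB", "teacher2"),
                2);
        System.out.println(access.tryRun("studentA")); // prints true
        System.out.println(access.tryRun("teacher1")); // prints true
        System.out.println(access.tryRun("studentA")); // prints false (budget of 2 is spent)
        System.out.println(access.tryRun("studentB")); // prints false (teacher2 is not verified)
    }
}
```

Because every allowed run traces back to one verified person, creating more student accounts no longer buys an attacker more capacity.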
Summing It Up
Running a user’s code on your servers, instead of in their browser, opens the door to all kinds of malicious activity. The worst of this can be prevented by limiting how long, and how often, the code can run on the server. But to apply those limits, we needed to be able to reliably detect who was running the code. And the best way for us to do that right now is through teacher verification.
We apologize for the extra steps needed here - and we appreciate you taking the time to become verified! Please contact us at verification@code.org with any questions on verification or otherwise!