Originally published at The Renegade Coder on March 11, 2019
Since I started teaching, I’ve been trying to find ways to automate my grading responsibilities. After all, it’s super time consuming, and I don’t find it to be extremely helpful to the students. Every minute I save due to automation goes back to providing quality feedback, focusing on my teaching, and improving my mental health. Who could say no to that?
Grading Responsibilities
I’ve mentioned my grading responsibilities a few times in this series, but I figured it wouldn’t hurt to outline them once again.
In addition to teaching, I’m responsible for grading 12 projects, 14 homework assignments, 14 labs, and 3 exams a semester. Multiply all those numbers by 40, and that’s the total number of assignments I grade over the course of a semester: 43 assignments for each of 40 students, or roughly 1,720 submissions. As you can probably imagine, it’s a huge time sink outside of the classroom.
To speed things up, I’ve tried to find ways to automate grading. Perhaps the largest time-saving opportunity is the projects, which can take about 6 hours a week to grade. Unfortunately, that long duration is due to a handful of problems:
- Projects are worth the most points, so they require the most feedback.
- Projects have to be tested, which can take some time depending on the complexity of the program.
- Projects are organized in packages, so they have to be transferred in archive formats like zip.
- Projects are written by students, so style varies wildly, making code difficult to read.
As you can see, there are a lot of stringent requirements for projects which can make grading a very time consuming task. To add insult to injury, students have a tendency to not follow directions, so files sometimes have to be edited before they can be executed. Worst case scenario: I have to contact students because they didn’t submit everything.
Grading Automation
As someone who is always trying to milk every little bit of efficiency out of everyday tasks, I quickly took it upon myself to automate the project grading. To be honest, I just couldn’t imagine completing the following procedure for 40 students without going insane:
- Download student solution.
- Unzip student solution.
- Load file(s) into IDE.
- Run file(s) (repeat for various test cases).
- Gauge solution style.
- Assess solution based on testing and style.
- Give feedback.
After looking at this list, I feel I’ve made the right choice to automate my grading, but what exactly does automation entail? Let’s take a look.
Introducing JUnit
During my first semester, the best option I had at the time for automation was JUnit testing. In any given week, it would take me about 90 minutes to write up a JUnit solution to the project and another 2 hours to complete grading. In other words, I managed to reduce a 6 hour process down to about 4 hours. I’ll take that any day!
Of course, JUnit probably wasn’t the ideal choice. After all, we don’t teach methods until the 6th week, so most of the projects are massive main methods. In addition, students don’t always follow the same naming conventions for classes, so I have to be clever in how I call the main method.
As a result, I ended up writing a pretty complex set of methods to guess at the class names using reflection. For instance, the following method generates a list of class names for brute force reflection:
private static ArrayList<String> getTestClasses(int project) {
    // Candidate class name templates; %1$s is replaced with the project number.
    ArrayList<String> toTest = new ArrayList<String>();
    toTest.add("osu.cse1223.Project%1$s");
    toTest.add("osu.cse1223.Project%1$sa");
    toTest.add("osu.cse1223.CSEProject%1$s");
    toTest.add("cse1223.Project%1$sa");
    toTest.add("cse1223.Project%1$s");
    toTest.add("project%1$s.Project%1$s");
    toTest.add("Project%1$s");
    toTest.add("Project%1$sA");
    toTest.add("osu.cse1223.DragonsGame");
    toTest.add("Project04.DragonTrainers");
    toTest.add("Main");
    String projectNumberWhole = Integer.toString(project);
    String projectNumberPad = "0" + projectNumberWhole;
    int originalSize = toTest.size();
    // Expand each template into padded, unpadded, and lowercase variants.
    for (int i = 0; i < originalSize; i++) {
        String test = toTest.get(i);
        toTest.set(i, String.format(test, projectNumberPad));
        toTest.add(String.format(test, projectNumberWhole));
        toTest.add(String.format(test, projectNumberPad).toLowerCase());
        toTest.add(String.format(test, projectNumberWhole).toLowerCase());
    }
    return toTest;
}
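To make the expansion easier to picture, here is a rough Python sketch of the same idea. It is not a port of the Java method: the function name `candidate_class_names` and the `{0}` placeholder style are my own choices for illustration, and the ordering of the results differs from the Java version.

```python
def candidate_class_names(project: int, templates: list[str]) -> list[str]:
    """Expand each template into padded, unpadded, and lowercase variants."""
    whole = str(project)
    padded = "0" + whole  # mirrors the Java method, which always prepends one zero
    names = []
    for template in templates:
        for number in (padded, whole):
            name = template.format(number)
            names.append(name)
            names.append(name.lower())
    return names


# Two of the templates from the Java list, rewritten with Python-style placeholders.
for name in candidate_class_names(4, ["osu.cse1223.Project{0}", "Main"]):
    print(name)
```

Two things stand out when the list is printed: templates without a placeholder (like Main) produce duplicate entries, and a two-digit project number would be padded to three digits (for example, 010). Neither is harmful for brute-force lookup, but both are worth knowing about.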
In addition, since many of the projects leverage the main method and text formatting, I spent a lot of time capturing standard output and writing to standard input. Check out my setup and teardown methods:
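// Keep references to the real streams so they can be restored after each test.
private final InputStream originalIn = System.in;
private final PrintStream originalOut = System.out;
private final PrintStream originalErr = System.err;

@Before
public void setUp() {
    System.setOut(new PrintStream(outContent));
    System.setErr(new PrintStream(errContent));
}

@After
public void tearDown() {
    // Passing System.in or System.out here would be a no-op, since they already point at the replacements.
    System.setIn(originalIn);
    System.setOut(originalOut);
    System.setErr(originalErr);
}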
Overall, the JUnit solution is pretty clunky, but it got the job done.
Unzip Script
While JUnit did save me a lot of time, there were still ways to cut wasted time. In particular, I found that I was wasting a lot of time manually unzipping folders.
To put things into perspective a bit, we use Canvas for uploading solutions, and it does a bit of file name mangling. As a result, standalone Java submissions end up with their file names ruined. To combat this issue, we ask students to export their solutions from Eclipse as zip files. This helps in two ways:
- It protects the underlying Java file names.
- It preserves package structure when needed.
Unfortunately, I was stuck unzipping 41 files every week. Granted, I did speed things up with 7-zip, but I still had to do that all by hand.
Eventually, I decided to automate this unpacking process using Python and the zipfile library:
import os
import zipfile
from tkinter import filedialog

ARCHIVE = "Archives"  # assumed value, based on the folder name mentioned in the text


def extract_main_zip() -> str:
    """
    Extracts an archive given by the user.
    :return: the path to the unzipped archive
    """
    archive_name = filedialog.askopenfilename(
        title="Select Zip File",
        filetypes=(("zip files", "*.zip"), ("all files", "*.*"))
    )
    archive_path = os.path.join(os.path.dirname(archive_name), ARCHIVE)
    # The context manager closes the archive even if extraction fails.
    with zipfile.ZipFile(archive_name) as archive:
        archive.extractall(archive_path)
    return archive_path
In this function, I use tk to open a file selection GUI. From there, I unpack the selected zip file and return the path to the extraction site.
Since the zip file contains zip files, I decided to automate that unpacking process as well:
import pathlib

DUMP = "Dump"  # assumed value, based on the folder name mentioned in the text


def extract_solutions() -> str:
    """
    Extracts user folders.
    :return: the path to the extraction site
    """
    unzipped_archive = extract_main_zip()
    dump = os.path.join(os.path.dirname(unzipped_archive), DUMP)
    pathlib.Path(dump).mkdir(parents=True, exist_ok=True)
    for file in os.listdir(unzipped_archive):
        file_name = os.fsdecode(file)
        file_path = os.path.join(unzipped_archive, file_name)
        # The text before the first "_" is used as the student's folder name.
        file_path_plus_name = os.path.join(dump, file_name.split("_")[0])
        if file_name.endswith(".zip"):
            with zipfile.ZipFile(file_path, "r") as zip_file:
                zip_file.extractall(file_path_plus_name)
        else:
            # Loose file: keep only the last "_"-separated piece as its name.
            name = file_name.split("_")[0]
            project = file_name.split("_")[-1]
            pathlib.Path(os.path.join(dump, name)).mkdir(parents=True, exist_ok=True)
            new_file_path = os.path.join(dump, name, project)
            os.rename(file_path, new_file_path)
    return dump
As we can see, this function calls the previous function, and stores the path to the extraction site. From there, the function generates a new extraction site called Dump.
After that, we iterate over all the zip files, extract them, and place them in a new folder with the student’s name as the directory name. If we encounter a file that isn’t a zip file, we attempt to fix the name mangling issue before placing the file in a folder alongside all the extracted zip files.
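To see what that name fix does in isolation, here is a small sketch that applies the same split to a file name. The helper `split_submission_name` and the sample file names are hypothetical; the real Canvas naming pattern may differ.

```python
def split_submission_name(file_name: str) -> tuple[str, str]:
    """Return (student, original_file) using the same "_" split as the script."""
    parts = file_name.split("_")
    return parts[0], parts[-1]


# Hypothetical mangled names, made up for illustration.
print(split_submission_name("doejane_12345_67890_Project04.java"))
# An underscore in the student's own file name drops everything before it.
print(split_submission_name("doejane_12345_67890_My_Project.java"))
```

The second call shows a limitation of splitting on the last underscore: a file the student named My_Project.java would come back as Project.java.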
When we’re done, we return the path to the new extraction site. In total, we’ll have two new folders. One which contains all the zip files (Archives), and one which contains all the unzipped files (Dump). At this point, the Archives directory is useless, so we could delete it.
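If you wanted the script to clean up after itself, a minimal sketch might look like the following. The helper name `remove_archive_dir` is my own, and it assumes nothing else lives in that folder, since `shutil.rmtree` deletes everything under it.

```python
import shutil
from pathlib import Path


def remove_archive_dir(archive_path: str) -> bool:
    """Delete the intermediate extraction folder; return True if something was removed."""
    path = Path(archive_path)
    if not path.is_dir():
        return False
    shutil.rmtree(path)  # removes the folder and every file inside it
    return True
```

Calling it with the path returned by the first extraction step, once the second step has finished, would leave only the Dump folder behind.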
Testing Automation
With the extraction process automated, I probably saved myself about 30 seconds a file which amounts to a gain of about 20 minutes. Of course, I’d take that any day.
That said, I felt like there was still more to be done. In particular, I found it really time consuming to do the following:
- Download all student submissions.
- Run the Python extraction script.
- Load up Dr. Java.
- Drag and drop the test file into the IDE.
- Grade student submission (repeat 40 times).
  - Retrieve a student submission and drop it into the IDE.
  - Hit test.
  - Analyze test results.
  - Assess submission style.
  - Give feedback.
As annoying as this new process was, it was an incredible improvement over grading by hand. In any given week, I might spend only 2 to 3 hours grading projects. It would be silly to say that all the automation up to this point wasn’t worth it.
However, there are still a lot of manual steps in the process above, so I took it upon myself to reduce the steps once again:
- Download all student submissions.
- Run Python extraction and testing script.
- Assess submission style (repeat 40 times).
- Give feedback (repeat 40 times).
To do this, I extended my Python script to support JUnit testing. At a high level, each solution is graded as follows:
def grade_file(classes: str, build_file: str, test_class: str, results):
    """
    Grades a file.
    :param classes: a directory containing files under test
    :param build_file: a file to test
    :param test_class: the path to the test file
    :param results: the results file
    :return: None
    """
    classpath = "C:\\Program Files\\JUnit\\junit-4.13-beta-2.jar;C:\\Program Files\\JUnit\\hamcrest-all-1.3.jar;"
    # Compile the student solution first, then the JUnit test that depends on it.
    compile_junit(classes, classpath, build_file)
    compilation_results = compile_junit(classes, classpath, test_class)
    execution_results = test_junit(classes, classpath, get_test_name(test_class))
    write_to_file(results, compilation_results, execution_results, build_file)
Beyond the hardcoded classpath, this solution will automatically compile the student solution and my JUnit test code, execute the test, and print the results to a file. At that point, all I have to do is scan through the file for student names and their testing report before I can assess a grade.
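The helpers called above aren't shown here, so the following is only a guess at the general shape of the compile and test steps, not the actual script. The function names `javac_command`, `junit_command`, and `run` are hypothetical; `org.junit.runner.JUnitCore` is the JUnit 4 console entry point, and the sketch assumes `javac` and `java` are on the PATH and that the output directory already exists.

```python
import os
import subprocess


def javac_command(classes: str, classpath: str, source_file: str) -> list[str]:
    """Build a javac call that writes .class files into the classes directory."""
    full_classpath = os.pathsep.join([classpath.rstrip(os.pathsep), classes])
    return ["javac", "-cp", full_classpath, "-d", classes, source_file]


def junit_command(classes: str, classpath: str, test_name: str) -> list[str]:
    """Build a JUnit 4 console run for a test class name."""
    full_classpath = os.pathsep.join([classpath.rstrip(os.pathsep), classes])
    return ["java", "-cp", full_classpath, "org.junit.runner.JUnitCore", test_name]


def run(command: list[str]) -> subprocess.CompletedProcess:
    """Run a command and capture stdout and stderr as text, without raising on failure."""
    return subprocess.run(command, capture_output=True, text=True)
```

Building the command as a list keeps paths with spaces (like Program Files) intact, and `os.pathsep` picks the right classpath separator for the operating system instead of hardcoding a semicolon.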
Future Extensions
While the new process is light years faster than any grading I’d been doing last semester, there are still improvements that can be made. For instance, it’s possible to automate the downloading of student solutions. Hell, it’s probably even possible to schedule that process on a server which emails me the testing results at the deadline.
On the other end, it might be nice to create a testing report that just tells me grades, so I don’t have to spend any mental effort translating test cases into grades. If that’s possible, it’s probably possible to automate grade uploading as well.
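As a rough sketch of what that translation could look like, the snippet below pulls the test counts out of JUnit 4 console output and turns them into points. The summary formats it looks for ("OK (N tests)" and "Tests run: N, Failures: M") are what I'd expect from the JUnit 4 text runner, so check them against real output first. The proportional scoring rule and the function names are assumptions, not a grading policy.

```python
import re


def parse_junit_summary(output: str) -> tuple[int, int]:
    """Return (tests_run, failures) from console output, or (0, 0) if no summary is found."""
    ok = re.search(r"OK \((\d+) tests?\)", output)
    if ok:
        return int(ok.group(1)), 0
    failed = re.search(r"Tests run:\s*(\d+),\s*Failures:\s*(\d+)", output)
    if failed:
        return int(failed.group(1)), int(failed.group(2))
    return 0, 0


def score(output: str, points: float) -> float:
    """Award points in proportion to the share of passing tests."""
    tests_run, failures = parse_junit_summary(output)
    if tests_run == 0:
        return 0.0  # nothing ran, for example because compilation failed
    return round(points * (tests_run - failures) / tests_run, 2)
```

A submission that fails to compile produces no summary at all, so it scores zero here. In practice that case probably deserves a manual look instead.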
From end to end, we’d have a system which would completely automate student grades. There would be no need for me to take time to assess grades. Instead, I could focus on what I care about which is student feedback. After all, grades are sort of arbitrary metrics. Feedback is what helps students grow.
Also, without the cognitive load from grading, I’d probably be able to build better lecture material, hold better office hours, and give better support over email. That would be the dream!
Drawbacks
Recently, I was telling a friend about what I had done to automate my grading, and they had a great question for me:
If you automate everything, how are you going to detect plagiarism?
- Amigo, 2019
And to be honest, that’s not something I had thought about. Of course, at this point, it’s not something I have to worry about. After all, I do look at every solution for feedback purposes, so I should be able to detect plagiarism.
But it may be fun to extend the current solution to detect plagiarism locally. In other words, I could save all the solutions and diff them against each other as I go. That could be fun!
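A minimal sketch of that idea, using Python's standard `difflib`, might compare every pair of solutions and flag the ones that are nearly identical. The function name `similar_pairs` and the 0.9 threshold are arbitrary choices for illustration, and character-level similarity is easy to fool by renaming variables, so this is a curiosity and not a real plagiarism detector.

```python
import difflib
from itertools import combinations


def similar_pairs(solutions: dict[str, str], threshold: float = 0.9) -> list[tuple[str, str, float]]:
    """Return (student_a, student_b, ratio) for every pair at or above the threshold."""
    flagged = []
    for (name_a, code_a), (name_b, code_b) in combinations(sorted(solutions.items()), 2):
        ratio = difflib.SequenceMatcher(None, code_a, code_b).ratio()
        if ratio >= threshold:
            flagged.append((name_a, name_b, round(ratio, 3)))
    return flagged
```

With 40 students, that works out to 780 pairwise comparisons per project, which should be manageable for short intro-level programs.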
That said, I’ve never been a glutton for punishment. My core values are based on trust, so I tend to offer those same values to students. If I don’t suspect any cheating, I’m not going to go hunting for it. My trust is theirs to lose.
The Power of Automation
Every once in a while I’ll see a meme making fun of developers who would rather take an hour to write a script to automate a task than spend five minutes doing that task, and I’m so very guilty of that. That said, I don’t think my quest for automation is a bad thing. After all, I always share my solutions with the public.
For example, you’re free to check out all the JUnit test code I use to automate grading for my CSE 1223 course. The Projects folder contains all of the JUnit testing scripts. Meanwhile, I recently moved the Python script to its own repo. Feel free to look around and borrow some of my work for your own benefit. That’s why I do what I do!
Also, I should mention that the Python grading script has gone through a lot of changes since I wrote this article. For example, it now dumps all the grades to a JSON file which allows me to nest parts of the file in an IDE, so it’s easier to scan. With the JSON improvement, I’m able to get a high-level idea of who did well and who didn’t which I use to grade similar scoring assignments in succession.
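For a sense of what a JSON dump like that could look like, here is a small sketch. The structure (a student name mapped to a dictionary with a `score` field) and the function name `dump_grades` are hypothetical and not taken from the actual script.

```python
import json


def dump_grades(results: dict[str, dict], path: str) -> None:
    """Write per-student results to a JSON file, lowest score first."""
    ordered = dict(sorted(results.items(), key=lambda item: item[1]["score"]))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(ordered, handle, indent=2)
```

Sorting by score before writing keeps similar submissions next to each other in the file, which fits the idea of grading similarly scored assignments in succession.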
If you know any teachers who might be interested in automated grading, why not forward this article to them? I’m sure they’d appreciate it! At any rate, thanks for taking the time to read this article.
Top comments (8)
Why not teach them JUnit as a way of learning the language and doing TDD? While their tests may not be correct, it will help them to think about functionality ahead of time.
This will also change your testing requirement to behavior of the solution-- that is, blackbox testing. To your point, at this level, the guts of the projects may not be as important as a functioning project so maybe this is enough.
We actually do teach JUnit in the second CS course. But, there are a lot of complications that make it hard to teach in the first place. In particular, it’s a 3-credit intro course (2 hours of lecture and 1 hour of lab a week), and a lot of the students have no interest in coding. Of course, the main issue was I didn’t have any control over the curriculum, though I suppose I could have snuck a lesson or two in.
They piloted a new way of doing projects last semester which had an online textbook where you could submit code for testing. It would show all the test cases you missed and how you missed them which I thought was pretty cool. Granted, it doesn’t really teach them how to test code themselves.
Oddly enough, I learned JUnit when I was in my first Java class, so I get where you’re coming from. It’s possible, and it would be great if the course moved in that direction in the future.
As a former adjunct professor, I can appreciate this. However, how do the students get feedback on ways to improve their code?
So, that’s kind of the cool part of a script like this. It handles the heavy lifting in terms of making sure that each solution is correct. Then, I have more time to actually comb through each solution and provide feedback.
At the end of the day, I still have to manually enter grades, so I take that opportunity for a little compliment sandwich.
nice!
🔥🔥🔥🔥🔥
Maybe incorporate linters, a style guide, and a CI/CD process that emails students the results of their program if it fails the linter check?
I like the idea of using a linter! I might try to add that as a step to the Python script. That said, we don't really enforce any sort of style as students are learning the basics. I encourage good style practices, but we don't teach anything formal at that level.
Unfortunately, the project just has too many constraints to be able to use existing tech like CI/CD, so I had to roll my own project. I'd also love to get students using version control, but, again, there's just too much they don't understand at that point.