This is a tool for grading Coq homework for CS565, Purdue's graduate programming languages course. To use this tool, we also need to develop a grading script (in Coq, mostly Ltac) for each homework.
It currently supports only BrightSpace (hard-coded), the learning platform Purdue uses.
Unfortunately, the code has very few comments (my bad) at the moment.
This section outlines the workflow of using this tool for CS565. I assume you
have installed `stack` and know how to use it to build and run Haskell
programs. I will not cover how to write grading scripts in this section; that
is explained in detail in the Grading scripts section.
- Distribute homework: a homework typically consists of a homework file with
  exercises, a test file for sanity checks, a `Makefile` and a `_CoqProject`
  file for building, and auxiliary files that the homework imports. Although
  this test file is somewhat similar to the test file used for grading, make
  sure it does not reveal any answers. See an example homework.
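
  For reference, a homework bundle might look like the following (file names
  are illustrative; see the example homework for the real layout):

  ```text
  hw1.v          # exercises for students to fill in
  test_hw1.v     # sanity checks only; must not reveal answers
  Makefile       # build script
  _CoqProject    # Coq project configuration
  aux.v          # auxiliary definitions imported by hw1.v
  ```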
- Export grade book: go to BrightSpace, select the `Grades` tab, and click the
  `Export` button. Make sure you select the following options:
  - Key field: Org Defined ID
  - Grade values: points grade
  - User details: last name and first name

  Then select the homework you want to grade. Do not select more than one.
  Finally, click `Export to CSV` to download a CSV file. We will call this the
  gradebook file.
- Download submitted homework: go to BrightSpace, select `Course Tools` and
  then `Assignments`. Click the homework you want to grade. In the submission
  page, go to the bottom of the page and select `200 per page`. Blame
  BrightSpace for not having a `download all` or `select all` button, and pray
  that you don't have more than 200 submissions. Now click the
  `select all rows` checkbox and then click the `Download` button to download
  a zip file of all submissions.
- Grader file structure: see an example grader directory. This directory,
  called the top directory, consists of 3 directories: `input`, `output` and
  `build`. The `input` directory contains students' homework, the gradebook
  and our grading scripts. The `build` and `output` directories are generated
  by the grader. The `build` directory is the working directory where we build
  and test homework. The `output` directory will host the updated gradebook
  and feedback files for students after we finish grading.
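
  Schematically:

  ```text
  <top directory>/
  ├── input/    # students' homework, gradebook, grading scripts (you create this)
  ├── build/    # per-student working directories (generated by the grader)
  └── output/   # updated gradebook and feedback files (generated by the grader)
  ```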
- Create the `input` directory: copy the gradebook file to
  `input/gradebook.csv`. Unzip the submission zip file to a subdirectory
  called `hw`. This `hw` directory should have:
  - `index.html`, for whatever reason;
  - one directory per submission: each directory should contain only one file,
    namely the homework file, usually called `hw<number>.v`. The directory
    name should have the format
    `<some id> - <last name> <first name> - <timestamp>`. There might be
    multiple submissions from the same student, and the grader will pick the
    latest one according to the timestamp. However, BrightSpace constantly
    changes its timestamp format (and the format of this directory name) for
    no reason (it changed 3 times in one semester!). When that happens, the
    grader will (hopefully!) throw an error, and you will have to update the
    corresponding parser (`loadStudentDirs` in `Lib.hs`).

  We also need to create a directory `aux` containing the grading scripts. I
  will explain these in detail later; for now, you just need to know that the
  grading scripts produce a log file. The example grader directory contains an
  `input` directory with synthetic student homework and grading scripts.
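
  Putting it together, a populated `input` directory might look like this (the
  submission directory name is illustrative, and its exact format depends on
  BrightSpace's whims):

  ```text
  input/
  ├── gradebook.csv
  ├── hw/
  │   ├── index.html
  │   └── 1234567 - Doe Jane - Dec 1, 2023 1013 PM/
  │       └── hw1.v
  └── aux/
      ├── Makefile
      ├── _CoqProject
      ├── graderlib.v
      ├── local.v
      └── test.v
  ```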
- Prepare the build directory: run
  `stack run -- prepare <top directory> <homework file>`, e.g.,
  `stack run -- prepare for_grader hw.v` if you try this on the example. Check
  the output of this command for anything weird; for example, you may check
  that it indeed copies the latest submission when a student has multiple
  submissions. At the end of the output, it should say "All done!!".

  Also check the generated `build` directory (e.g., `for_grader/build`). It
  should consist of directories indexed by PUID, each of which contains the
  student's homework and the grading scripts from `aux`. Try not to mentally
  map these PUIDs to students' names, to minimize bias.
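
  The generated layout should look roughly like:

  ```text
  build/
  └── <PUID>/
      ├── hw1.v          # the student's submission
      ├── Makefile       # ┐
      ├── _CoqProject    # │ copied from aux
      ├── graderlib.v    # │
      ├── test.v         # ┘
      └── local.v        # per-student adjustments (never overwritten)
  ```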
- Build and grade homework: run `stack run -- grade <top directory>`, e.g.,
  `stack run -- grade for_grader`. It is possible that compilation fails for
  some submissions (or even loops forever if they submitted bad proof
  scripts!). Go to that student's build directory and manually fix the issues.
  If it's their fault, edit the file `local.v` and modify `Comment` and
  `Penalty` to reflect that. More on that in the Grading scripts section.
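
  As a rough illustration (the exact shape of these definitions is fixed by
  the `local.v` template, so treat this as a sketch):

  ```coq
  (* Sketch of a manual adjustment in local.v; adapt to the template. *)
  Require Import String.
  Open Scope string_scope.

  Definition Comment := "Fixed a broken tactic so the file compiles; -2.".
  Definition Penalty := 2.
  ```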
- Manually grade some exercises: ideally, the `grade` command finishes all the
  grading, but in most cases we need to manually grade some exercises, either
  because they are open questions or because we need to give partial credit.
  The grader allows us to write automation (called sound verifiers and sound
  falsifiers) that soundly, but incompletely, decides whether a student's
  answer is correct (or incorrect). Although good automation can greatly
  reduce the number of exercises that require manual grading, we still need to
  grade some answers by hand when the automation fails.

  After running the `grade` command, it should output a list of exercises and
  student IDs that require manual grading (in the form
  `<exercise name>: <student IDs>`). Manually process those exercises and
  update the scores in the corresponding `local.v`. You can also update the
  automation in the test scripts to cover more cases. I will talk more about
  this later, but the key is to balance the effort spent on automation against
  the effort spent on manual grading. Strengthening the automation might take
  more work than manual grading if we overdo it, but it can save us a lot of
  effort when done right.

  Re-run the `grade` command until all manual grading is finished and it
  outputs "All done!!". I usually like to run
  `stack run -- grade <top directory> -f` one last time just to make sure
  nothing weird happens (`-f` will run `make clean` first).

  The example grader directory contains a `build_completed` directory with the
  completed `local.v` files. You can inspect these files, copy them to the
  `build` directory, and see how it works.
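
  To make the verifier/falsifier idea concrete, here is a standalone sketch
  (names and shapes are made up; the real checks live in the test scripts
  described in the Grading scripts section):

  ```coq
  (* Suppose an exercise asks students to define a doubling function. *)
  Require Import Lia.

  Definition student_double (n : nat) : nat := n + n.  (* stand-in answer *)
  Definition buggy_double (n : nat) : nat := n + 1.    (* a wrong answer *)

  (* Sound verifier: if this lemma is provable, the answer is certainly
     correct; if the proof fails, we learn nothing and grade manually. *)
  Lemma verify_double : forall n, student_double n = 2 * n.
  Proof. intros; unfold student_double; lia. Qed.

  (* Sound falsifier: a concrete counterexample proves an answer wrong. *)
  Lemma falsify_buggy : buggy_double 2 <> 2 * 2.
  Proof. unfold buggy_double; lia. Qed.
  ```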
- Generate the `output` directory: run
  `stack run -- publish <top directory>`, e.g.,
  `stack run -- publish for_grader`. The generated `output` directory should
  contain an updated `gradebook.csv` and a `feedback` directory with a
  `feedback.txt` for each student. Check some of these files just to make sure
  nothing weird happened. Note that some students might get more points than
  the maximum due to bonus questions.
- Publish feedback and grades: go to `output/feedback` and run
  `zip -r feedback.zip . -x ".*"` (on Mac, you may run
  `zip -r feedback.zip . -x ".*" -x "__MACOSX"`). This command is part of
  Info-ZIP, which should be available by default on most UNIX-like systems
  (e.g., macOS and Linux). DO NOT use other tools, whether or not they come
  with your system. I know it's easy to create a zip file from the GUI, but it
  will most likely not work! Blame BrightSpace for that.
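
  If you want to double-check the archive before uploading, Info-ZIP's `unzip`
  can list its contents; you should see only the feedback files, with no dot
  files or `__MACOSX` entries:

  ```sh
  unzip -l feedback.zip
  ```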

  Go to BrightSpace and select the `Grades` tab. Click the drop-down menu of
  the grade category (the first line of the table header) and select `Edit`.
  Find the option `Allow category grade to exceed category weight` in the
  `Edit` page and consider enabling it if you want the bonus points of this
  homework/exam to compensate the students' overall grade (I enabled it when I
  taught the course, but please discuss it with the instructor!). Go back to
  the `Grades` page. Click the drop-down menu of the particular homework/exam
  (the second line of the table header) and select `Edit`. Find the option
  `Can Exceed` and consider enabling it (to compensate the grade in this
  category with bonus points).

  Go back to the `Grades` page, click the `Import` button, and import the
  grades using the `gradebook.csv` in the `output` directory.

  Go to the `Course Tools` tab and click `Assignments`. Select the
  corresponding homework. In the homework page, click `Add Feedback Files` at
  the top, and upload `feedback.zip`. Check a few students to see whether
  `feedback.txt` appears as an attachment. Finally, click the
  `select all rows` checkbox and then click `Publish Feedback`.

  Be careful that the homework category and the homework name are not the
  same! You can check this in the `Grades` tab: the first line of the header
  is the category and the second line is the homework names. If they use the
  same name, importing grades or uploading feedback files could trigger yet
  another bug that prevents you from doing so. Again, blame BrightSpace.
- Inform the students: we have finished the grading! Announce it on whatever
  platform you use (e.g., Piazza).
This section documents the commands this grader provides. Commands are invoked
as `stack run -- <command> <arguments>`.
- `prepare <top directory> <homework file>`: create the `build` directory from
  the `input` directory, e.g., `prepare for_grader hw.v`. Importantly, this
  command does not overwrite the homework file or `local.v` (actually, any
  file whose name ends with `local`) if they are already present in the
  `build` directory, so we can safely rerun this command when we update the
  grading scripts.
- `grade <top directory> [-f] [student IDs]`: build and grade homework. It
  optionally takes a list of student IDs and only builds those. If `-f` is
  specified, run `make clean` first.
- `publish <top directory>`: create the `output` directory with feedback and
  the updated gradebook.
- `parse <log file>`: parse a log file, grade it and print out the result. The
  log file is generated by making the grading script (e.g., `make -s > log`),
  and it is already in the `build` directory if you ran the `grade` command.
  This `parse` command is mostly for debugging.
- `one <homework directory>`: grade a single directory outside of the top
  directory. This homework directory should follow the same structure as
  `<top directory>/build/<student ID>`, with the scripts from `aux` and the
  student's homework file. This bypasses the gradebook and other
  BrightSpace-related processing. Use this command when you have to deal with
  a late submission (e.g., the student was sick) or other situations where you
  get the homework file in a different way (e.g., by email). After running
  this command, you can send the student the generated `feedback` file; the
  score is also in that file (you need to scale it manually, though).
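
A typical grading session on the example directory then looks something like
this:

```sh
stack run -- prepare for_grader hw.v   # populate build/ from input/
stack run -- grade for_grader          # build and grade all submissions
stack run -- grade for_grader -f       # optional final pass (make clean first)
stack run -- publish for_grader        # write output/ (gradebook + feedback)
```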
While the grader provides a framework for grading CS565 homework, it does not
know how to do the actual grading. We need to develop grading scripts for each
homework. Grading scripts reside in `<top directory>/input/aux` and will be
copied to the `build` directory. See an example and its inline comments to
understand how to write them.
This repository also provides templates of the grading scripts.
The grading scripts (the `aux` directory) consist of the following files:

- `Makefile` and `_CoqProject`: you can use the `Makefile` from the templates
  as it is, but you need to edit `_CoqProject` to modify or add file names.
- `graderlib.v`: a small library for writing the test script. You should use
  the one from the template.
- Auxiliary files that the homework imports.
- `local.v`: this file contains comments for students, the penalty, manually
  graded scores and other code that is specific to a student. See the inline
  comments in the example for usage. This file will not be overwritten even
  when you rerun `prepare`, so we don't have to worry about this information,
  like manually graded scores, getting lost.
- `test.v`: the test script in Coq (mostly Ltac). This is the most important
  file and the one we need to take time developing. Its output is redirected
  to a log file and consumed by the grader. It contains automation checking
  students' answers, sanity checks and options like which axioms students are
  allowed to use. See the inline comments in the example for more
  instructions.
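
To give a flavor of what `test.v` does, here is a standalone sketch (the real
script is built on `graderlib.v`; names here are made up):

```coq
(* Suppose the homework asks for a proof of plus_comm'. *)
Require Import Arith.

(* Stand-in for a student's proof. *)
Lemma plus_comm' : forall n m, n + m = m + n.
Proof. intros; apply Nat.add_comm. Qed.

(* Emit the axioms the proof depends on: an Admitted proof or a sneaky
   extra Axiom would show up here, and the grader can flag it in the log. *)
Print Assumptions plus_comm'.
```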
Here I share my workflow of writing test scripts. You may have your own.

- First, I use the solution file (with all correct answers, provided by the
  instructor) to develop the first version of the test script. In this
  iteration, I make sure the automation is good enough that this perfect
  homework requires no manual grading.
- Then I improve the test script against a blank homework file. It should
  automatically give the blank homework a zero (sometimes this is not
  possible, because we may need to give partial credit).
- When we start grading real students' homework, there will be a lot of
  exercises requiring manual grading. I check one such answer and ask myself:
  is this answer very exotic, or might many other students have similar
  answers? If it is unlikely to recur, I give it a manual score and check the
  next submission. But if it looks like a common answer, I update `test.v`
  (interactively) so that it can automatically grade this answer too, copy the
  updated `test.v` to `input/aux`, and rerun `prepare` and `grade`. It feels
  great when you add a snippet of automation and knock out half of the
  exercises you would otherwise have to grade by hand! (See the sketch below
  for what such a snippet might look like.)
- Repeat this process of manual grading and script enhancement until all
  exercises are graded.
- The design of the `dependencies` option is flawed. See the inline comments
  in the example. I know how to fix it, but haven't had the chance to do it
  yet.
- It seems the directory names of the student submissions that BrightSpace
  generates now contain the student's username (email name). You need to
  update this grader to parse that. On the other hand, I originally used the
  first name and last name as the key to connect the homework directories and
  the entries in `gradebook.csv`. Now that the username is available, perhaps
  we should use it as the key instead. In that case, when we export the
  gradebook from BrightSpace, we also need to select `email` in the
  `User details` option.