Painless (almost) multiple-choice exams in LaTeX

24 Feb 2018

Requirements
LaTeX to the rescue?
Some options
The solution


Requirements

If you, like me, write exams, you will find a serious lack of free, quality tools for the job. In my experience, most instructors I’ve worked with basically slap something together in Microsoft Word and call it a day. While this is fine for exams and quizzes in courses with fewer than 80 students or so, it rapidly falls apart once you’re administering a multiple-choice exam to 600 students. Some of the pitfalls of the Word approach:

Cannot easily scramble question order, to create multiple versions of an exam.
Cannot easily scramble answer order, again for multiple versions of an exam.
Cannot easily generate an “answer” version where correct answers are bolded or circled or whatever. I need this to refer back to when writing the exam, and when students have questions about the exam.
Cannot easily generate a “short” answer key where question numbers and answers are just listed together. I need this to fill in scantron answer keys.
Cannot easily generate a large-print version for students that need this type of accommodation.

I also have an aversion to commercial, GUI-only software using formats that aren’t easily accessible to humans to store questions. If I want to remix an exam or use old questions, I have to laboriously copy and paste questions, make sure the numbering is right, etc. Computers should be able to do this for me.

LaTeX to the rescue?

For writing text, LaTeX is basically what I turn to. There are many packages that purport to do all of the above. I also have a few more requirements that are specific to LaTeX:

I need to be able to say that some questions are “grouped” together, so I can create sets of problems that are all asking about a common prompt.
Must have an easy syntax for both multiple-choice and true/false questions. If I have to write \begin{enumerate} \item[A] etc. for each question, I will lose it.
When scrambling answer order, I need to be able to specify that some questions shouldn’t have their answers scrambled. For example, questions with “none of the above” or “all of the above” as a possible response should not have the answers scrambled since those options should appear last.

Well, which packages are good? As you can tell from the title, the situation is not great. I feel the pain of this poster on TeX Stack Exchange; it seems like everything just doesn’t quite fit right. My requirements are basically the same as theirs, after all.

Some options

Here’s a listing of everything I’ve tried so far:

exam - Does not permit randomization. But it has an incredible syntax for multiple-choice, true/false, matching, and short answer/essay questions. I definitely use this one for shorter in-class quizzes that students complete in 20 minutes or so.

examdesign - It almost does everything I need. Randomization, great syntax for multiple choice questions, except it doesn’t have good ergonomics for True/False questions, as True/False questions MUST be in a separate “section” from regular MC questions. Which doesn’t make sense since True/False questions are basically multiple-choice questions, except they have a fixed set of answers. Forcing True/False questions into a separate section has the downside that (1) you can’t randomize the order of MC and TF questions, (2) the TF questions don’t look like the MC questions (bad if you are writing a scantron exam), and (3) if you just write TF questions as multiple-choice questions, you MUST include answer choices A. True and B. False every time even though you know it’s a True/False question.

esami - The documentation is terrible. It’s written by Italians and they have not bothered to translate their macros to English, instead supplying brief lessons about the Italian language. Ok, normally I’m fine with bad documentation—scientists are awful at it too and usually I can figure it out if they provide working examples. But I can’t get their examples to work either, because the error messages are in Italian and also talk about macros that are not defined. Great.

probsoln - This one is focused on math, and also has no built-in syntax for MC questions.

automultiplechoice - Not strictly LaTeX only, but it has such a horrific syntax for multiple choice questions that I just ran away screaming.

The solution

Is everything terrible?

No!! I found something that almost, almost works just right: mcexam. It does it all: permutation of both question order and answer order, question grouping, answer permutation customization, and a flexible enough syntax that I can include “image” and “table” answers, and also write a few macros to get True/False questions looking and working correctly. It can generate answer key versions (where answers are placed next to question text), short answer keys, as well as an instructor “concept” version that shows in one document how questions and answers are permuted. It also permits some nifty item analysis using an external R script. Sweet!

There are still a few pain points though, so I’m going to document how I brutally hacked at mcexam to get it to do what I want. You can also download an example file that includes everything below and can be compiled after you make one change to mcexam.sty as detailed below.
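For orientation, a bare-bones mcexam document might look like the sketch below. The question text is made up, the option values are the ones used later in this post, and I am assuming the package's documented environments (mcquestions, mcanswerslist) and the permuteall option behave as described in its manual.

```latex
% Minimal mcexam skeleton (a sketch, not the downloadable example file).
% output is one of: concept, exam, key, answers, analysis
\documentclass{article}
\usepackage[output=exam
           ,numberofversions=2
           ,version=1
           ,seed=4
           ,randomizequestions=true
           ,randomizeanswers=true
           ]{mcexam}

\begin{document}
\begin{mcquestions}

% One multiple-choice question; the correct answer is flagged with [correct].
\question Which organelle produces most of a cell's ATP?
\begin{mcanswerslist}[permuteall]
    \answer Nucleus
    \answer[correct] Mitochondrion
    \answer Ribosome
    \answer Golgi apparatus
\end{mcanswerslist}

\end{mcquestions}
\end{document}
```

Changing output to key or answers should then produce the corresponding answer documents from the same source.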

Useful tips for writing questions

A macro for True/False questions

% True/False answer blocks with empty answer text: answer 1 is A (True), answer 2 is B (False).
\global\def\qtrue{\begin{mcanswers}[permutenone]\answer[correct]{1}{}\answer{2}{}\end{mcanswers}}
\global\def\qfalse{\begin{mcanswers}[permutenone]\answer{1}{}\answer[correct]{2}{}\end{mcanswers}}

Note that to get this to work, you have to hack at mcexam.sty. Copy that file from the CTAN archive into the same folder as your exam .tex file, then comment out this line with a %:

            \xifinlist{\a}{\mc@answernumVals}{}{\PackageError{mcexam}{Question \q: answernum \a\space is not specified.}{}}

          

You can also just download a pre-modified version of this file. Drop it in the same folder that you have your .tex file.

A macro to show how many questions are in a question group

            \global\def\numq[#1]{[Questions \the\numexpr\value{setmcquestionsi}+1\relax--\the\numexpr\value{setmcquestionsi}+#1\relax]}

          

With the above two macros, you can easily make a series of True/False questions like this:

\begin{mcquestioninstruction}
\numq[5] Which of the following foods can be eaten on a
         ketogenic diet? Mark A for True and B for False:
\end{mcquestioninstruction}

\question         Blueberries \qtrue
\question[follow] Eggs        \qtrue
\question[follow] Steak       \qtrue
\question[follow] Bread       \qfalse
\question[follow] Cupcakes    \qfalse

          

And there won’t be any extraneous A. True B. False options thanks to that macro!

Save space with `multicol`

If you have a multiple choice question with really short options, use the multicol environment to save some vertical space by aligning the options all on one line.

First, add \usepackage{multicol} in your preamble, then in the question:

\question Which of the following numbers is prime?

\begin{multicols}{4}
\begin{mcanswerslist}[ordinal]
    \answer 12
    \answer 16
    \answer[correct] 17
    \answer 20
\end{mcanswerslist}
\end{multicols}

          

This example also shows off one of the neat ergonomic features for answer scrambling, option ordinal, which permutes answer options forward and backward but keeps their relative order. So for each version you’ll see 12, 16, 17, 20; or 20, 17, 16, 12. This works well when options have a natural ordering. There’s also fixlast for “None of the above” type options, and also one where you manually specify permissible permutations.
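To illustrate fixlast, a question with a “None of the above” option might be written as in this sketch. The question text is made up, and the comment reflects my understanding of the option: the final answer stays put while the others are shuffled.

```latex
\question Which of the following is a noble gas?

% fixlast: the last option stays last in every version;
% the options before it can still be permuted.
\begin{mcanswerslist}[fixlast]
    \answer Oxygen
    \answer[correct] Argon
    \answer Nitrogen
    \answer None of the above
\end{mcanswerslist}
```

It is worth checking the concept output after adding a question like this, since that document shows the permutation used in each version.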

Other formatting stuff

By default, mcexam will permit questions and their associated answer choices to be split across multiple pages, which is poor test ergonomics. Ensure that questions and their associated answer choices are printed on the same page:

\usepackage{calc}
\renewenvironment{setmcquestion}
{\begin{minipage}[t]{\linewidth-\labelwidth}}
{\end{minipage}\par}

          

Similar to the above, force question “instructions” to be printed on the same page:

\renewenvironment{setmcquestioninstruction}
{\begin{minipage}{\textwidth}}
{\end{minipage}}

          

This saves some space in the answer formatting by reducing the space between options:

\renewenvironment{setmcanswers}{}{}
\setlist[setmcquestions]{label=\mcquestionlabelfmt{*}.
                        ,ref=\mcquestionlabelfmt{*}
                        ,itemsep=0.5\baselineskip
                        ,topsep=1\baselineskip
                        }

          

Use Arabic numerals instead of Roman numerals for test form versioning. Our scantrons don’t use Roman numerals on the version field, so this avoids some headaches when people misinterpret Roman numerals.

            \renewcommand\mcversionlabelfmt[1]{\arabic{#1}}

          

Ensure there’s always one blank page at the end of the exam (so if students turn over their exam early they don’t see question text!). You will also need to pass the twoside option to the document class.

\mcifoutput{exam}{
    \AtEndDocument{\ifodd\value{page}
            \newpage\thispagestyle{empty}\hbox{This page intentionally left blank.}\newpage\thispagestyle{empty}\hbox{}
        \else
            \newpage\thispagestyle{empty}\hbox{}
        \fi}
}

          

Set fancy headers and footers on each page that show the page number and exam version. This ensures that students and proctors can see whether a printing error caused them to miss a few pages.

\usepackage{fancyhdr,lastpage}
\pagestyle{fancy}
\fancyhf{}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{1pt}
\lfoot{\mctheversion}
\rfoot{Page \thepage\ of \pageref{LastPage}}

          

Fix question spacing wackiness due to grouped questions.

\raggedbottom

Generating everything at once

I’m too lazy to manually go in and generate the concept version, the answers version, etc. by renaming files. So I wrote a simple shell script to do the same with the help of some clever macros. Place this in your preamble:

\usepackage{etoolbox}

\ifdef{\myoutput}{}{\def\myoutput{concept}}
\ifdef{\myversion}{}{\def\myversion{1}}

\usepackage[output=\myoutput
           ,numberofversions=2
           ,version=\myversion
           ,seed=4
           ,randomizequestions=true
           ,randomizeanswers=true
           ,writeRfile=true
           ]{mcexam}


          

Basically, what this says is: if the macro \myoutput is not defined, set it to “concept”, and if the macro \myversion isn’t defined, set it to “1”. But we can actually define macros on the command line, using some clever trickery:

xelatex '\def\myversion{1} \def\myoutput{exam} \input{example.tex}'

This pre-defines those macros, then puts in the rest of your latex file afterwards. Combined with the -jobname option you can specify what file name each run should have. Here’s my full shell script:

#!/bin/bash

# Build every output in the background; -jobname sets each output file's name.
xelatex -jobname=EXAM1 '\def\myversion{1} \def\myoutput{exam} \input{example.tex}' &
xelatex -jobname=EXAM2 '\def\myversion{2} \def\myoutput{exam} \input{example.tex}' &
xelatex -jobname=CONCEPT '\def\myoutput{concept} \input{example.tex}' &
xelatex -jobname=KEY '\def\myoutput{key} \input{example.tex}' &
xelatex -jobname=ANSWERS1 '\def\myversion{1} \def\myoutput{answers} \input{example.tex}' &
xelatex -jobname=ANSWERS2 '\def\myversion{2} \def\myoutput{answers} \input{example.tex}' &

# Block until all background jobs have finished.
wait

          

All six of these jobs will run in parallel. You might have to run the script more than once if you use \pageref{LastPage} or other cross-references. You can then clean up the intermediate files with rm *.log *.aux.

Randomization seeds

If you have a lot of grouped questions, it’s possible that your questions will be randomized in such a way that one version of the exam is significantly longer than the others. To that end, you should iterate over a few seeds and check the results by hand to ensure that there isn’t anything weird going on. This also helps you see if you forgot to group a set of questions (common when copying question sets from Word or text documents). Note that if you add or remove questions, or change the grouping of questions, the previous seed will no longer generate the same exam. It is thus critical that you don’t modify the test or change the seed until after you’ve conducted item analysis.
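One way to try several seeds without editing the file each time is to give the seed the same command-line fallback treatment as the output and version. This is an untested sketch: \myseed is a hypothetical macro name of my own, and it assumes mcexam accepts a macro for seed the same way it does for output and version.

```latex
% Preamble: fall back to defaults unless a macro was defined on the command line.
% \myseed is a hypothetical name, not part of mcexam.
\usepackage{etoolbox}
\ifdef{\myoutput}{}{\def\myoutput{concept}}
\ifdef{\myversion}{}{\def\myversion{1}}
\ifdef{\myseed}{}{\def\myseed{4}}

\usepackage[output=\myoutput
           ,numberofversions=2
           ,version=\myversion
           ,seed=\myseed
           ,randomizequestions=true
           ,randomizeanswers=true
           ]{mcexam}

% Then build the concept version once per candidate seed, for example:
%   xelatex -jobname=CONCEPT-seed7 '\def\myseed{7} \def\myoutput{concept} \input{example.tex}'
```

Comparing the page counts of the resulting documents is a quick first check before reading them by hand. Once a seed is chosen, fix it and leave it alone.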

Item analysis

Let’s go over how to use the R script for item analysis. When the option writeRfile is set to true, mcexam will also write an R file with the same name as your .tex file. This R file provides a single function, mcprocessanswers, that un-permutes the questions and writes an .ana file of the same name, which is used by the mcexam package in LaTeX. This function takes three arguments: a student ID, the exam version they took, and an answers matrix, where 1 = A, 2 = B, etc.

First we load up all of the packages and the mcexam analysis code:

library(tidyverse)
library(readxl)
library(magrittr)
source("exam.r")

          

Our exam scoring service at UCLA provides the student responses as an Excel sheet, with one student per row, and their answer choices as the columns Q001 through Q200 or so.

version1 <- read_xlsx("version1.xlsx")
version2 <- read_xlsx("version2.xlsx")

Here I’m extracting student identifiers from the sheet. The ID number doesn’t have to be an ID number, it just has to be some kind of unique identifier.

id1 <- version1 %>% transmute(id = str_c(IDNum, LastName, FirstName, sep = " ")) %>% extract2("id")
id2 <- version2 %>% transmute(id = str_c(IDNum, LastName, FirstName, sep = " ")) %>% extract2("id")
id <- c(id1, id2)

          

Generate the version numbers based on which spreadsheet the score came from.

            version <- c(rep_len(1, length(id1)), rep_len(2, length(id2)))

          

This line extracts the columns that correspond to the questions in the exam and combines them into a single data frame.

            answers <- bind_rows(select(version1, Q001:Q062), select(version2, Q001:Q062))

          

This line converts the answers A, B, C, etc. into 1, 2, 3.

            answers <- apply(answers, 2, match, LETTERS)

          

If the student did not answer a question, recode it to an invalid, dummy value “9”.

            answers[is.na(answers)] <- 9

          

Generate the .ana analysis file.

            mcprocessanswers(id, version, answers)

          

Once you have that analysis file, set the output to analysis and check out the item analysis! The best statistics to look at are proportion correct, which indicates question difficulty, and item-rest correlation, which tells you whether people who scored high on the exam also scored high on that question (you want this to be positive).
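For reference, the usual textbook definitions of these two statistics are sketched below. The notation is mine, and the exact computation mcexam performs may differ in details. Let $x_{ij}$ be 1 if student $i$ answered question $j$ correctly and 0 otherwise, with $N$ students in total.

```latex
% Proportion correct (difficulty) of question j:
\[ p_j = \frac{1}{N} \sum_{i=1}^{N} x_{ij} \]

% Rest score of student i for question j: total score with question j left out.
\[ R_{ij} = \sum_{k \neq j} x_{ik} \]

% Item-rest correlation: Pearson correlation, taken across students,
% between the score on question j and the rest score.
\[ r_j = \mathrm{corr}\bigl( x_{\cdot j},\; R_{\cdot j} \bigr) \]
```

A $p_j$ close to 1 means nearly everyone got the question right. A negative $r_j$ means students who did well on the rest of the exam tended to miss this question, which is often a sign of a miskeyed or ambiguous item.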


Jonathan Chang