
Unified Assessment Pipeline for Learning-by-Doing


Learning-by-doing is a paradigm we follow at the University of Twente, among other things, to teach programming to students of various programmes. In this view, programming is a skill of expressing oneself in a rather formal way while embracing the problem and eliciting a solution, learning from both successes and failures along the way. Yet it requires a constant stream of normative feedback to be successful. Technical Computer Science has long passed the point where such feedback can be provided by professors to each student directly, and relies heavily on teaching assistants. This is an expensive and unreliable method: TAs need to be trained explicitly for their tasks, and still deliver work of varying quality, as repeatedly indicated in student evaluations. However, there is research evidence that feedback provided by automated tools can reach the same level of quality and usefulness.

Student deliverables also need to be summatively assessed, which is done differently across programmes and modules; the two main categories of approaches are relying on TAs for grading and using external tools. Relying on student assessors suffers from the same complaints as indicated above, as well as from limitations imposed by the Examination Board. External grading assistance tools vary greatly per teacher and are neither integrated into the assessment platform nor linked (even implicitly) to the learning goals. Much research is needed to investigate existing and possible techniques for automatic feedback and grading, taking into account recent developments in AI such as large language models. Some programmes also have learning goals which cannot be tested in a simple way at all due to their inherent ill-definedness: multiple incomparable solutions, lack of formal theories, heavy design focus, no natural decomposition, abstract concepts.

The project aims to create one integrated pipeline featuring a platform on which students perform and submit their work, which can then be automatically assessed in some appropriate form. The testing part can be based on existing systems like WebLab from Delft or Anubis from UT, or at least inspired by them. The grading and feedback-generating part will depend on the nature of the assignments: mathematical computations, symbolic evaluations, static code analysis (cf. Ask-Elle from Utrecht, Apollo++ from UT, and numerous other examples), artefact quality metrics, code smells, etc. The unique point is that these assessments can be explicitly linked to the learning outcomes of the corresponding units, a goal shared with prior projects like Atelier, Apollo and Apollo++. A sketch of how such a linkage could look is given below.
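
To make the intended linkage concrete, here is a minimal sketch in Python; all names are hypothetical and do not reflect the actual internals of WebLab, Anubis, Apollo or any other existing tool. It shows how an automated check (a unit test, a static-analysis rule, a smell detector) could carry explicit references to the learning outcomes it evidences, so that generated feedback can be aggregated per outcome:

# A minimal sketch of the pipeline structure; all names are hypothetical
# and do not reflect any actual WebLab/Anubis/Apollo API.
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class LearningOutcome:
    """A learning outcome of a unit, e.g. 'LO2: avoids common code smells'."""
    code: str
    description: str

@dataclass
class CheckResult:
    passed: bool
    feedback: str

@dataclass
class Check:
    """Any analysis of a submitted artefact: a test, a static-analysis
    rule, a smell detector, a quality metric, ..."""
    name: str
    outcomes: list[LearningOutcome]    # the explicit link to learning goals
    run: Callable[[str], CheckResult]  # takes the submission source text

def assess(submission: str, checks: list[Check]) -> dict[str, list[str]]:
    """Run all checks and group the generated feedback by learning outcome."""
    per_outcome: dict[str, list[str]] = {}
    for check in checks:
        result = check.run(submission)
        for lo in check.outcomes:
            verdict = "ok" if result.passed else "needs work"
            per_outcome.setdefault(lo.code, []).append(
                f"{check.name}: {verdict} - {result.feedback}")
    return per_outcome

# Example: a trivial smell detector linked to a (made-up) outcome.
lo2 = LearningOutcome("LO2", "avoids common code smells")
no_long_lines = Check(
    name="no-long-lines",
    outcomes=[lo2],
    run=lambda src: CheckResult(
        passed=all(len(line) <= 100 for line in src.splitlines()),
        feedback="lines should stay under 100 characters"),
)

print(assess("x = 1\n", [no_long_lines]))
# {'LO2': ['no-long-lines: ok - lines should stay under 100 characters']}

Keeping the outcome links on the checks themselves, rather than in a separate mapping maintained per course, would make every piece of generated feedback traceable to a learning goal by construction.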

Participants (alphabetically by surname)

About us (in the media)



The page is maintained by Dr. Vadim Zaytsev a.k.a. @grammarware. Last updated: September 2024.