Out-Law News
01 Jul 2024, 9:10 am
Research has highlighted the significant risks AI use poses to the integrity of exams and assessment at universities and other higher education institutions, an expert has said.
Julian Sladdin of Pinsent Masons was commenting after the results of a study conducted at the University of Reading in the UK were published.
The study, described by the institution as “the largest and most robust blind study of its kind, to date”, found that a popular generative AI tool was not only able to produce exam answers for psychology modules that largely went unrecognised as AI-generated, but that those answers typically outscored real student answers when marked.
In a statement, the University of Reading said the ChatGPT-generated answers “went undetected in 94% of cases” and that those answers “on average, attained higher grades than real student submissions”.
The study was published after a recent UNESCO study highlighted that fewer than 10% of the 450 academic institutions it surveyed globally had developed policies or guidance on the use of generative AI tools.
One of the academics behind the University of Reading study, associate professor Peter Scarfe, said that while education institutions will not revert “fully to hand-written exams”, the education sector “will need to evolve in the face of AI”. He said “many institutions have moved away from traditional exams to make assessment more inclusive” already.
Sladdin said that while more research into the potential impact of AI on academic integrity is needed, the University of Reading study highlights the need for universities to implement robust policies on AI use and to carefully consider their modes of assessment. Such measures are necessary to ensure that AI can be harnessed by students in their studies but not misused to gain unfair academic advantage, he said.
Sladdin said: “The academics who led this study have described its findings as a ‘wake-up call’ for academic institutions, not only in the UK but also worldwide. I agree. It shows that significant recent advances in generative AI have made its use much harder to identify, because previous weaknesses with the technology, such as hallucination effects – the creation of plausible responses which are false or not faithful to the original data – have been reduced. In addition, the research may also suggest that while many institutions have already changed their assessment processes to meet this challenge, more work is needed given the continuing developments in generative AI and its use by students.”
“It will be key for more research to be carried out in this field – to inform institutional responses and help universities set clearer boundaries about the ways in which AI can be used ethically by their students, as well as the redlines which need to be drawn around academic misconduct. This will only work, however, if investment is also put into developing inclusive assessment processes which account for the need for students to be able to access generative AI as part of their learning and development. Competence standards will need to be tested in ways that meet the needs of the curriculum and educate students about effective use of the technology available, whilst mitigating the increased risk of these new technologies being inappropriately exploited by students to gain unfair advantage in assessed work, and the consequent impact on perceived academic integrity and standards,” he said.