Written exam
A “written exam” refers to an assessment or test in which individuals are required to answer questions or complete tasks by providing written responses. This type of examination is commonly used in education to evaluate a person's knowledge, understanding, and ability to express ideas on paper. Written exams can cover various subjects and may include essay questions, short-answer questions, multiple-choice questions, or other formats.
Key features of a written exam include:
Written Responses: Participants are typically required to write out their answers or responses on paper or in a digital format.
Time Constraints: Written exams often have time limits to assess not only knowledge but also the ability to manage time effectively under pressure.
Objective and Subjective Components: Questions can be objective, such as multiple-choice or true/false, or subjective, requiring longer written responses, essays, or problem-solving.
Evaluation of Skills: Besides testing knowledge, written exams may also evaluate critical thinking, analytical skills, communication skills, and the ability to organize and articulate ideas.
Formal Setting: Written exams are usually administered in a controlled environment to ensure fairness and prevent cheating.
Grading: Grading may be done manually by teachers or through automated systems, depending on the type of questions and the examination format.
Written exams are widely used in various educational levels, from primary school to higher education, and are also common in professional certifications and licensing exams. They serve as a tool for assessing individuals' understanding of a subject and their ability to communicate that understanding in a written format.
A written exam included 50 questions: 46 were generated by humans (senior staff members) and 4 by ChatGPT. 11 participants took the exam (ChatGPT and 10 residents). Questions were both open-ended and multiple-choice. 8 questions were not submitted to ChatGPT because they contained images or schematic drawings to interpret.
Formulating requests to ChatGPT required an iterative process to refine both questions and answers. ChatGPT scored among the lowest ranks (9/11) of all participants. There was no difference in the residents' response rate between human-generated and AI-generated questions that could have been attributed to lesser clarity of the questions. ChatGPT answered all of its self-generated questions correctly.
AI is a promising and powerful tool for medical education and for specific medical purposes that remain to be determined. For AI to generate logical and sound questions, the request must be formulated as precisely as possible, framing the content, the type of question, and its correct answers 1).