Student-AI Interactions Pilot Study in co-creation with first year Science students
In this pilot, the team aimed to co-create, with students as collaborators, an investigation into the ways in which GenAIs might be implemented in the research and education lifecycle. They explored the benefits of a co-creation approach, namely openness and transparency in dialogue, while also developing a better understanding of how students thought and felt about the integration of GenAIs into their workflows. The team observed varying levels of distrust, resentment and optimism. Ultimately, both instructors and students were able to identify perils and potentials.
Background information
In the wake of the popularization of ChatGPT, individual researchers, lecturers, institutes and research centers across Utrecht University were tasked with devising their own guidelines for how products of GenAIs were going to be treated in their courses and research outputs. At the Freudenthal Institute, the team was tasked with constructing such regulations. However, they did not want to create a set of prescriptivist, punitive regulations that failed to capture and adequately respond to the complexity of the challenge facing instructors and students alike. Therefore, they decided to create a set of guiding questions that instructors could use either directly in their classes or in the construction of their syllabi. These questions considered the potential applications of a GenAI, the possible benefits of using it and the possible detriments, and were associated with a variety of topics relevant to the research lifecycle, such as performing a literature review or brainstorming ideas for chapter topics. This initial document served its purpose of helping to guide discussions between instructors and students, but the team wanted to understand whether the topics they had identified as relevant to performing research accurately captured the ways in which students were thinking about and engaging with GenAI on their own. The primary goal in performing this co-creative research with students was to create an environment where students felt comfortable relying on their instructors to explore these new technologies without fear of punishment. The team believed that in doing so they would strengthen the relationship between teacher and student, encourage future transparency, and gain direct access to the perspective of the students on GenAI, some of whom can be considered AI-natives and all of whom are the direct targets of university policies on GenAI usage.
Project description
The team decided to organize a short (1-hour) session with first-year MSc History and Philosophy of Science students to explore how ChatGPT could perform a basic academic task: writing a summary of a text. Being able to write a meaningful and useful summary of a text is a major learning goal for these students, but it is often laborious and time-consuming. Because GenAIs such as ChatGPT had been lauded for their ability to quickly summarize large amounts of text, the team wanted to explore a task that students might be most tempted to offload to GenAIs, to see how doing so impacted their educational experience.
Students came to one of their mandatory introductory classes with their own written summaries (produced without the use of GenAI), and they had been instructed to read two of their peers' summaries beforehand, all of which were available on Blackboard. In class, a series of iterative prompts was provided that students could put into ChatGPT. These prompts are all contained in the attached document; each of them, in one way or another, asked ChatGPT to generate a summary of a particular text. After each prompt, the class considered the extent to which ChatGPT had adequately summarized the text, which required them to think critically about what a good summary entailed. This was a positive didactic moment in its own right. The class ended up agreeing that they were most interested in understanding whether ChatGPT could identify the important themes of the text, situate the text adequately in its social context, and determine how and why the text was meaningful for the discipline. They concluded with some general reflections on their experiences. By and large, students exhibited a very critical attitude toward the GenAI. They did not find the summaries to be meaningful or useful, and concluded that, had they relied on the generated summaries, they would have failed to grasp the content of the text in its entirety.
The students expressed that they were quite enthusiastic about the pilot, and the team thus organized a second session exploring how ChatGPT could construct a literature review. Students came prepared with the research topic they would be investigating for the remainder of the period, and again were provided with a series of iterative prompts, followed by a period of collaborative assessment and reflection on the outputs. Again, these prompts are contained in the attached document. The team observed that the students themselves did not have a clear understanding of what a literature review actually is. Once instructed on what makes a good literature review, they considered the extent to which ChatGPT was able to find sources and relate research questions, and how skeptical they felt about its performance of these tasks.
Results
During the sessions, the team focused on exploring the emotional state of the students with respect to the outputs of the GenAI by asking questions such as: do you feel like ChatGPT is an authority in your topic/you can defer to ChatGPT’s judgement/that you have received high quality work? There was a general sense that ChatGPT’s performance was too low to be reliable, particularly in the summarizing task. Many students reflected that the summaries ChatGPT made completely missed important themes in the texts.
Anecdotally, the team would say that a larger percentage of students did not know what a good literature review was, as opposed to the group that did not know what a good summary was. They noticed a distinctly different emotional relationship to ChatGPT in the second session, where more students were unsure of themselves. They were faster to doubt themselves and think of the GenAI as an authority. They reflected on this emotional response, noting that while it might be possible to use GenAIs in domains in which they already had specialization, it might be far more difficult and dangerous to defer to the tool when you do not have such specialization.
There were also experiences of frustration: some students were disappointed that ChatGPT had conceived of a connection they themselves had not thought of. In the second session, one student asked ChatGPT to come up with a framework for a literature review on his topic, and he was saddened to see that the tool had come up with a way of framing the research and relating it to other topics that had never occurred to him. It made him feel inadequate, and as if perhaps he needed to use this tool in order to perform at a higher level. There was also a mix of frustration and skepticism about the variability of the generated output and its high sensitivity to the formulation of the prompt.
The team also saw an interesting opportunity to strengthen traditional learning goals through instruction on the utilization of GenAIs. In both sessions, students expressed doubts about their own understanding of very basic academic tasks, such as writing a summary or conducting a literature review. While some students felt that they understood what made a good summary, or what demarcates a literature review from other types of research, many did not. In both sessions, the instructors ended up spending time teaching these traditional learning goals, thus strengthening the students' understanding.
Generally, the students gained a more robust understanding of how they ought to navigate their personal relationship to GenAIs and felt emboldened to make responsible decisions about how to approach the utilization of the technology and its products.
The team felt that they had adequately captured the types of academic work that students might be interested in offloading to GenAIs, as the students did not have any suggestions for additional areas of utilization. However, they do expect this to change as the availability of AI tools increases in the coming years. They all, students and teachers alike, walked away from the sessions feeling that they had learned something about how GenAI usage will impact our academic lives, and the students reported feeling grateful that we had taken the time to investigate these issues in situ.
The positive feedback from the students has encouraged the team to extend the pilot to the other MSc program in our institute. Additionally, they have decided to provide more tailored instruction to our students in the form of workshops on prompt engineering, as the sensitivity to prompts was something that was discussed quite a bit during our pilots.
The team plans to deliver the same pilot in the next academic year, with a few changes to the format:
- They will deliver pre- and post-questionnaires to collect more data about students’ experiences
- They will pay more attention to classroom design
- They will re-evaluate the strength of the prompts provided to the students, making sure that these are in line with the instruction given in the aforementioned workshops on prompt engineering.
Lessons learned & tips
- Establishing an open, exploratory environment between the students and instructors was critical for the success of this pilot. Students felt encouraged to fully explore how this technology might work for them while being supported by instructors who were there to not only provide insights and information, but also to legitimately engage in the co-creation of standards for an emerging technological tool.
- Due to the discursive and dialogue-based nature of this pilot, structure is welcome wherever it can be introduced. Extra attention should be paid to classroom design and to the ease of access of longer prompts. This means ensuring that tables are already put together in groups, and that the prompts are made available in a shared place, such as classroom management software, so that students can simply copy and paste prompts instead of having to type them out themselves. We noticed that we could have benefitted from better preparing the classroom and the accessibility of the prompts.
- Think about how you want to record the qualitative data. We did not consider formally collecting qualitative data, as we really just wanted to explore our students' relationship to GenAI without introducing any sort of external pressure on them. However, this means that we can only share our own recounting of the experience. Data collection has both positives and negatives, and every instructor must make their own cost-benefit analysis.
- Ensuring that our students have a high degree of AI-literacy should be a shared essential learning goal across departments. This requires instructors to engage openly and critically with the new technologies, while also recognizing that educating on GenAIs provides an opportunity for us to strengthen our students' understanding of traditional learning goals, as we experienced with the learning goals on summarizing texts and constructing literature reviews.
- Focus on working in small groups of fewer than 30 students, and be explicit that the work done in the session is not going to be used to punitively criticize the ways in which students think about using GenAI in their studies.
- Think critically about which tasks students in your program might be interested in using GenAIs for (e.g. coding, writing, research, etc.), and tailor sessions to focus on these.
- Approach your students as co-creators in the establishment of novel knowledge relating to student-AI interactions. We are in a unique time in which there are simply no experts in GenAI education, so be sure to appreciate the value added by involving students explicitly in the creation of this new sort of expertise.
More information:
FI Guidelines on GenAI: https://www.uu.nl/onderzoek/freudenthal-instituut/onderwijs/generative-ai