Human-AI collaboration for Inclusive Science Communication
The project explored generative AI as a support tool for students identifying and revising non-inclusive language in written text. Master’s students in the Science Communication and Education Programme, who had chosen the Inclusive Science Communication elective, were first tasked with rewriting dissemination articles to address biased or non-inclusive language.
Students then received (human) peer feedback on their work and used generative AI to further analyze their articles. Individual interviews were conducted to better understand students’ experiences. Across the five interviews, students viewed AI as helpful, mainly for refining language: improving clarity, reducing jargon, adjusting tone, and making texts more accessible. It rarely contributed deeper inclusivity insights, which were instead shaped by the course and peer feedback. Participants often double-checked AI suggestions and ignored those that felt inaccurate or irrelevant. Challenges included ethical concerns, hallucinations, and unwanted changes. Overall, AI supported micro-level editing, while substantive inclusive reflection remained human-driven.
Background information
This project explored how generative AI may support university students in recognizing and addressing bias in written science dissemination texts. Research in this area is still in its infancy, as higher education is only beginning to experiment with ethical and efficient uses of AI for scientific research and teaching. Equity, diversity and inclusion (EDI) values offer a prominent case study for understanding AI’s contributions and hindrances to science writing and knowledge production, especially for the next generation of scientists and researchers. This matters because the ability to recognize and address bias directly shapes the fairness, accuracy, and societal relevance of future scientific communication and research, and because all people’s perspectives are limited by their own lived experiences and backgrounds (McGill et al., 2023). The project investigated human-AI collaboration, specifically how tools like ChatGPT or Copilot can help students detect and mitigate bias in science dissemination texts, alongside human feedback. It also aimed to gain insight into how students perceive AI feedback and whether this process deepens their understanding of inclusive science communication.
For university teachers, the interviews show that AI can help students clean up their writing, clarify wording, and reduce jargon, but it does not teach them how to think inclusively. Students still look to teachers to learn how to identify bias, understand social context, and apply inclusive strategies in a meaningful way. AI can support writing, but genuine inclusive awareness remains something only teaching and discussion can develop.
The assignment
As part of the course assignment, students selected a 500-word excerpt from an online science article of their choice, taken from major media outlets. Using an assessment tool (see appendix), they evaluated and improved the text’s inclusivity in terms of language use, representation, stereotype awareness, recognition of diverse perspectives in science, and accessibility for varied audiences. They then received feedback on their writing from a human peer, which they could incorporate into their revised text.
After receiving this peer feedback, students were given the option to use GenAI as an additional diagnostic tool to further improve their texts. Students uploaded the original text to a generative AI tool of their choice using the prompt: “Please review this text and suggest revisions to make the language more inclusive”. They were encouraged to use additional prompts to seek feedback specific to their article (e.g. students who chose an article about women’s health were asked to seek guidance on gender-inclusive language). They then critically evaluated the AI-generated feedback, accepting or declining suggested changes and justifying their choices. In a concluding reflective paragraph, students discussed the use of AI as a diagnostic tool for inclusive (science) writing.
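Students in the project interacted with chatbots such as ChatGPT or Copilot through their regular chat interfaces. For readers who would like to reproduce the prompting step programmatically, the sketch below shows what it might look like, assuming the OpenAI Python SDK; the base prompt is the one from the assignment, while the function name, model choice, and follow-up handling are illustrative assumptions rather than part of the original setup.

```python
# Minimal sketch of the assignment's AI review step, assuming the OpenAI
# Python SDK (pip install openai) and an OPENAI_API_KEY in the environment.
# The base prompt is the one used in the course; everything else here
# (function name, model, follow-up handling) is a hypothetical illustration.
from openai import OpenAI

client = OpenAI()

# The exact prompt students were instructed to use.
BASE_PROMPT = ("Please review this text and suggest revisions to make "
               "the language more inclusive.")

def request_inclusivity_review(article_text: str, follow_up: str | None = None) -> str:
    """Send the article with the course's base prompt, optionally adding a
    topic-specific follow-up prompt (e.g. on gender-inclusive language)."""
    messages = [{"role": "user", "content": f"{BASE_PROMPT}\n\n{article_text}"}]
    if follow_up:
        messages.append({"role": "user", "content": follow_up})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=messages,
    )
    return response.choices[0].message.content

# Example: an article on women's health, with a gender-specific follow-up.
# feedback = request_inclusivity_review(
#     article_text,
#     follow_up="Please also give guidance on gender-inclusive language.",
# )
```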
Finally, students reflected on the usefulness of the process during individual interviews, comparing how peer and AI feedback influenced their decisions and discussing their experiences with AI for this task. This approach was chosen to assess inclusivity in science writing systematically across language, representation, perspective, and accessibility. These design choices were informed by EDI principles and by common sources of bias in science communication identified in education and science communication research (Canfield & Menezes, 2020).
Importantly, the majority of the students in the course chose not to participate or dropped out of the project, citing concerns about the environmental impact of AI, doubts about whether AI tools should be part of conversations about inclusivity at all, and the time constraints of the extra assignment.
The results
The five students who decided to use GenAI as a diagnostic tool and fully completed the assignment expressed mixed views on using AI for inclusive scientific writing. They were interested in using chatbots to rephrase text and carry out editing work for inclusive science communication. Nonetheless, their trust in AI’s suggestions about the actual content of their assignment varied. Some participants said they had used AI to brainstorm and to check whether relevant elements were missing from their analysis. In most cases, students reported that chatbots had not provided them with new insights: some felt this reassured them about the thoroughness of their own reasoning, while others were somewhat disappointed by the lack of new suggestions, framing it as an inherent limitation of AI. Students appreciated that, at first glance, chatbots did generate inclusive written output. Nonetheless, their concerns about bias embedded in AI, AI’s environmental impact, and the loss of skills due to excessive use of AI persisted.
AI was helpful for rewriting at the level of wording, clarity, and jargon reduction, but less so for identifying deeper bias. Compared to peer feedback, AI feedback was seen as faster and more generic, while peers provided more contextual and critical insights.
In reflections and interviews, students described AI as a diagnostic aid that confirmed existing concerns rather than generating new ones, and most felt it did not substantially deepen their understanding of inclusive science communication. We found it surprising how consistently students described AI as useful only for surface-level language changes, despite expectations that it might support deeper bias recognition. The successes can be explained by AI’s strength in pattern-based language editing, while the challenges reflect that inclusivity requires contextual judgement, ethical awareness, and lived experience, which AI cannot meaningfully provide.
Lessons learned and tips
Central AI policy
All AI-related activities on this page must be implemented in line with Utrecht University’s central AI policy and ethical code.
Responsibility for appropriate tool choice, data protection, transparency, and assessment use remains with the instructor.