Gender biases in GPT-4 generated biographies. A corpus study on Italian and French anthroponyms

Research output: Contribution to journal › Research article › Contributed › peer-review

Abstract

As various studies across different languages have shown, women tend to be referred to differently from men in professional contexts. While men are typically referred to by their surname alone (e.g., Fermi), women are more often referenced by their full name (e.g., Samantha Cristoforetti) or by their first name alone (e.g., Samantha). The present study proposes an empirical case study investigating whether this gender-indexing bias is also present in texts generated by large language models (LLMs). Based on the analysis of a self-assembled data collection comprising 420 biographies produced by GPT-4 for 140 eminent Italian and French female and male personalities, our study reveals that the synthetic texts investigated not only reflect the gender biases found in human-authored texts but, in some cases, even amplify them.

Details

Original language: English
Pages (from-to): 227-245
Journal: Linguistik Online
Volume: 144
Issue number: 3
Publication status: Published - 10 Apr 2026
Peer-reviewed: Yes