AI Democracy Projects: chatbots give bad election answers (2024)

In February 2024 the AI Democracy Projects published a study of how leading AI chatbots answer voters’ questions. The project is a collaboration between the nonprofit newsroom Proof News, led by investigative journalist Julia Angwin, and the Science, Technology and Social Values Lab at the Institute for Advanced Study in Princeton. The team posed questions that a voter might plausibly ask to five models: Anthropic’s Claude, Google’s Gemini, OpenAI’s GPT-4, Meta’s Llama 2, and Mistral’s Mixtral.

More than 40 state and local election officials and AI experts rated the responses. About half of the answers were judged inaccurate, around 40 percent were rated harmful, more than a third were incomplete, and about an eighth were biased. The errors were concrete and consequential: models invented polling places and gave wrong registration deadlines. In Nevada, where same-day registration has been allowed since 2019, four of the five models wrongly told voters they would be blocked from registering close to Election Day. GPT-4 performed best but still produced inaccurate answers about 19 percent of the time. The authors concluded that the models were not reliable enough to be trusted with voters’ questions.

The study put hard numbers on a risk that had been mostly hypothetical, and it added to the pressure on AI companies to restrict election-related queries.

For a business reader, the study is a caution against deploying general-purpose chatbots for high-stakes factual lookups: fluency is not accuracy, and confident wrong answers can do real harm.