Despite their impressive ability to generate high-quality, fluent text, generative large language models (LLMs) also produce hallucinations: fabricated statements that contain false information or that deviate from provided context (for example, when a model asked to summarize a document inserts facts into the summary that never appeared in that document). AI hallucinations pose serious risks: they can propagate false information, erode user trust, and render systems unsuitable for critical applications. Understanding how often these hallucinations occur and what causes them remains a fundamental challenge in developing trustworthy AI systems.
We release 🔦HALoGEN, a diverse, multi-domain benchmark for measuring LLM hallucinations. It consists of:

- 🔎 10,923 prompts for generative models spanning nine domains, including programming, scientific attribution, and summarization.
- 🔧 Automatic, high-precision verifiers for each use case that decompose LLM generations into individual atomic facts (pieces of information) and check each fact against a high-quality knowledge source to identify hallucinations (a sketch of this flow appears below).

Building on this framework, we provide:

- 🤖 150,000 generations from 14 large language models with automated factuality annotations.
- ⭐ A novel taxonomy that classifies model hallucinations by tracing them back to their origin in the training data, enabling systematic identification of potential reasons why language models generate false or unsupported outputs.
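The decompose-and-verify flow can be pictured roughly as follows. This is a minimal, hypothetical sketch: the helper names (`decompose_into_atomic_facts`, `verify_fact`) and the string-based knowledge source are illustrative placeholders, not the actual HALoGEN verifiers, which are domain-specific.

```python
# Hypothetical sketch of the decompose-and-verify flow described above.
# Function names and the toy knowledge source are illustrative only.
from dataclasses import dataclass


@dataclass
class VerifiedFact:
    text: str
    supported: bool  # True if the knowledge source supports this atomic fact


def decompose_into_atomic_facts(generation: str) -> list[str]:
    """Split a model generation into atomic facts (placeholder heuristic)."""
    # Real verifiers are domain-specific, e.g. parsing cited references,
    # imported packages, or individual summary claims.
    return [s.strip() for s in generation.split(".") if s.strip()]


def verify_fact(fact: str, knowledge_source: set[str]) -> bool:
    """Check one atomic fact against a high-quality knowledge source (placeholder)."""
    return fact in knowledge_source


def hallucination_rate(generation: str, knowledge_source: set[str]) -> float:
    """Fraction of atomic facts in a generation that are NOT supported."""
    facts = decompose_into_atomic_facts(generation)
    if not facts:
        return 0.0
    verified = [VerifiedFact(f, verify_fact(f, knowledge_source)) for f in facts]
    return sum(not v.supported for v in verified) / len(verified)
```

Each domain's verifier replaces these placeholders with its own decomposition logic and knowledge source; aggregating the unsupported fraction per generation mirrors the kind of factuality annotation released with the benchmark. We categorize hallucinations into three distinct types based on their training-data origins: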
| Hallucination Type | Description |
|---|---|
| Type A | The correct fact was present in the pretraining data, but the model still hallucinated. |
| Type B | An incorrect fact was present in the pretraining data, or the fact is taken out of context, i.e., it appeared within a specific setting in a training document but loses its original meaning when taken in isolation. |
| Type C | Neither a correct nor an incorrect version of the fact was present in the pretraining data, and the model over-generalized when making predictions. |
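As a reading aid (not code from this repository), the taxonomy can be viewed as answering two questions about the pretraining data for each hallucinated atomic fact:

```python
# Illustrative sketch only; not part of the HALoGEN codebase.
# The taxonomy hinges on whether the pretraining data contained a correct
# version of the fact, an incorrect/decontextualized version, or neither.
def classify_hallucination(correct_fact_in_pretraining: bool,
                           incorrect_or_out_of_context_fact_in_pretraining: bool) -> str:
    """Map training-data evidence for a hallucinated fact to its HALoGEN type."""
    if correct_fact_in_pretraining:
        # Type A: the correct fact was seen, yet the model still got it wrong.
        return "Type A"
    if incorrect_or_out_of_context_fact_in_pretraining:
        # Type B: the model reproduced a wrong or decontextualized fact from its data.
        return "Type B"
    # Type C: neither version was present; the model over-generalized.
    return "Type C"
```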
@misc{ravichander2025halogenfantasticllmhallucinations,
title={HALoGEN: Fantastic LLM Hallucinations and Where to Find Them},
author={Abhilasha Ravichander and Shrusti Ghela and David Wadden and Yejin Choi},
year={2025},
eprint={2501.08292},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.08292},
}