WebGPT: Browser-assisted Question-Answering with Human Feedback

“WebGPT: Browser-assisted question-answering with human feedback,” a paper posted to arXiv on December 17, 2021 by Reiichiro Nakano and seventeen co-authors at OpenAI, describes a version of GPT-3 fine-tuned to answer long-form questions using a text-based web browser. Instead of relying only on what the model had memorized, WebGPT could issue search queries, follow links, scroll, read pages, and collect quotes, then write an answer with references to the sources it used.
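The browsing loop described above can be pictured with a small sketch. This is only an illustration, not the environment used in the paper: the `BrowserSession` class, its method names, the in-memory "web" of two made-up pages, and the `compose_answer` helper are all hypothetical, and the search is a naive keyword match standing in for a real search engine.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Quote:
    """A passage saved while browsing, with the page it came from."""
    title: str
    url: str
    text: str


@dataclass
class BrowserSession:
    """Toy text browser over an in-memory 'web' (hypothetical, for illustration)."""
    pages: dict                      # url -> (title, list of text lines)
    current_url: Optional[str] = None
    scroll: int = 0                  # index of the first visible line
    window: int = 2                  # number of lines visible at once
    quotes: list = field(default_factory=list)

    def search(self, query: str) -> list:
        """Return URLs of pages whose text contains every query word."""
        words = query.lower().split()
        return [
            url for url, (_, lines) in self.pages.items()
            if all(w in " ".join(lines).lower() for w in words)
        ]

    def click(self, url: str) -> None:
        """Open a page and reset the scroll position."""
        self.current_url, self.scroll = url, 0

    def scroll_down(self) -> None:
        """Move the visible window down, stopping at the end of the page."""
        _, lines = self.pages[self.current_url]
        self.scroll = min(self.scroll + self.window, max(len(lines) - self.window, 0))

    def visible(self) -> list:
        """Lines currently on screen."""
        _, lines = self.pages[self.current_url]
        return lines[self.scroll:self.scroll + self.window]

    def quote(self, text: str) -> None:
        """Save a passage, but only if it is actually on screen."""
        if not any(text in line for line in self.visible()):
            raise ValueError("quoted text is not visible on the current page")
        title, _ = self.pages[self.current_url]
        self.quotes.append(Quote(title, self.current_url, text))


def compose_answer(claims: list, quotes: list) -> str:
    """Join (sentence, quote_index) pairs into text with numbered references."""
    body = " ".join(f"{sentence} [{i + 1}]" for sentence, i in claims)
    refs = "\n".join(f"[{n}] {q.title} ({q.url})" for n, q in enumerate(quotes, 1))
    return f"{body}\n\n{refs}"


# Made-up pages standing in for the open web.
web = {
    "https://example.org/tides": ("Tides", [
        "Tides are caused mainly by the Moon's gravity.",
        "The Sun also contributes.",
        "Spring tides occur at new and full moon.",
    ]),
    "https://example.org/moon": ("Moon", [
        "The Moon orbits Earth.",
        "It is Earth's only natural satellite.",
    ]),
}

session = BrowserSession(web)
results = session.search("spring tides")   # search
session.click(results[0])                  # follow a link
session.scroll_down()                      # scroll to bring the passage on screen
session.quote("Spring tides occur at new and full moon.")  # collect a quote
answer = compose_answer(
    [("Spring tides happen at new and full moon.", 0)], session.quotes
)
print(answer)
```

The point of the design is that every sentence in the final answer carries a marker pointing at a saved quote, so a reader can check each claim against the page it came from.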

The model was first trained by imitation learning on demonstrations of humans performing the same browsing task, then refined using human feedback on which answers people preferred. Human evaluators preferred answers from the best WebGPT system 56 percent of the time over answers written by the human demonstrators, and 69 percent of the time over the highest-voted Reddit answers in the ELI5 dataset. Because the system cited the passages it relied on, its answers were also easier to fact-check.
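One way preference feedback can be put to work, and the approach the paper reports for its best model, is rejection sampling: generate several candidate answers and keep the one a reward model scores highest. The sketch below shows only that selection step. The `best_of_n` and `toy_reward` names are hypothetical, and the hand-written scoring rule is a placeholder for a reward model that would in practice be learned from human comparisons.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], reward: Callable[[str], float]) -> str:
    """Return the candidate answer that the reward function scores highest."""
    if not candidates:
        raise ValueError("need at least one candidate answer")
    return max(candidates, key=reward)


def toy_reward(answer: str) -> float:
    """Placeholder for a learned reward model (assumption, not the paper's model).

    Rewards each citation marker and applies a small length penalty.
    """
    return answer.count("[") - 0.01 * len(answer)


# Made-up candidate answers to the same question.
candidates = [
    "Tides are caused by the Moon.",
    "Tides are caused mainly by the Moon's gravity [1], with a smaller pull from the Sun [2].",
    "Nobody knows.",
]

chosen = best_of_n(candidates, toy_reward)
print(chosen)
```

Selecting among samples leaves the underlying model unchanged, so the quality of the result depends on how well the reward function tracks what people actually prefer.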

WebGPT is an important early link between language models and the open web. It built on earlier retrieval-augmented generation work and anticipated the later wave of browser-using agents, and its emphasis on citing sources prefigured how search-and-answer products would later try to ground their responses. For a general reader, it marks an early moment when a language model stopped being a closed box and started reaching out to look things up.
