Experimental Evidence on the Productivity Effects of Generative AI (Noy and Zhang)

In a study first circulated in early 2023 and published in Science in July 2023, MIT economics doctoral students Shakked Noy and Whitney Zhang ran a controlled experiment on how ChatGPT affects mid-level professional writing. They assigned occupation-specific, incentivized writing tasks - the kinds of memos, press releases, and reports written by marketers, grant writers, consultants, data analysts, and HR professionals - to a group of college-educated professionals, and randomly gave half of them access to ChatGPT.

The results, as reported by the authors and summarized by MIT’s economics department, were large. Average time to complete a task fell by about 40 percent, and the quality of the output, graded by independent evaluators, rose by about 18 percent. The study also found that performance inequality between workers shrank: the people who had scored lowest without the tool benefited the most from it, while strong writers gained less.
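
To put the time figure in throughput terms, the short Python sketch below converts a fractional cut in time per task into the implied change in tasks completed per hour. It is illustrative arithmetic only, not a calculation from the paper: the function name `throughput_multiplier` is hypothetical, and it assumes every task shrinks by the same fraction and that the saved time is spent on more tasks of the same kind.

```python
def throughput_multiplier(time_reduction: float) -> float:
    """Tasks-per-hour multiplier implied by a fractional cut in time per task.

    Assumption (not from the study): every task shrinks by the same fraction,
    so a task that took 1 unit of time now takes (1 - time_reduction) units.
    """
    if not 0.0 <= time_reduction < 1.0:
        raise ValueError("time_reduction must be in [0, 1)")
    return 1.0 / (1.0 - time_reduction)


# Approximate headline figure quoted above: time per task fell by about 40 percent.
TIME_REDUCTION = 0.40

speedup = throughput_multiplier(TIME_REDUCTION)
print(f"Implied tasks-per-hour multiplier: {speedup:.2f}x")
```

Under those assumptions, a 40 percent cut in time per task corresponds to roughly two-thirds more tasks per hour, since 1 / 0.6 is about 1.67. The quality improvement is a separate measure and is deliberately left out of this calculation.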

The authors also reported that ChatGPT tended to substitute for worker effort rather than complement skill, shifting how people spent their time toward generating ideas and editing and away from drafting from scratch. Participants who used the tool in the experiment were more likely to keep using it on the job weeks later. Together with the Brynjolfsson, Li, and Raymond support-agent study, this paper is one of the most-cited early pieces of evidence that generative AI raises output most for less experienced workers - a picture later complicated by the METR developer experiment, which measured a slowdown rather than a speedup among experienced open-source developers.