Procedural Content Generation via Machine Learning (PCGML)

“Procedural Content Generation via Machine Learning (PCGML),” by Adam Summerville, Julian Togelius, and six co-authors (posted to arXiv in February 2017 and later published in IEEE Transactions on Games), defined and surveyed a then-new subfield. It distinguishes machine-learning-based content generation from older rule-based and search-based methods by defining PCGML as “the generation of game content using machine learning models trained on existing content.”

The survey focuses on functional content - “platformer levels, game maps, interactive fiction stories, and cards in collectible card games” - rather than cosmetic assets like sprites or sound effects, because functional content has to be playable and valid: a flawed level or card can break the game rather than merely look wrong. It catalogs the methods being applied, from Markov models and clustering to “neural networks, long short-term memory (LSTM) networks, autoencoders, and deep convolutional networks.” Beyond autonomous generation and co-creative or mixed-initiative design tools, the authors argue that because PCGML models the existing corpus, it is also suited to “repair, critique, and content analysis.”
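To make the idea of a model “trained on existing content” concrete, here is a minimal sketch of the simplest family the survey lists, a first-order Markov chain over level columns. It is an illustration only, not code from the paper: the toy corpus, the tile symbols, and the function names `train` and `generate` are all assumptions made for this example.

```python
import random
from collections import defaultdict

# Hypothetical toy corpus (not from the paper). Each level is a list of
# columns; each column is a string read top to bottom:
# '-' empty, 'X' ground, 'E' enemy, '?' item block.
CORPUS = [
    ["--X", "--X", "-EX", "--X", "---", "--X", "?-X", "--X"],
    ["--X", "?-X", "--X", "-EX", "--X", "---", "--X", "--X"],
]


def train(levels):
    """Count how often each column is followed by each other column."""
    counts = defaultdict(lambda: defaultdict(int))
    for level in levels:
        for prev, nxt in zip(level, level[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, start, length, rng):
    """Sample a new level by walking the learned transition counts."""
    level = [start]
    while len(level) < length:
        options = counts.get(level[-1])
        if not options:  # column never had a successor in the corpus
            break
        columns = list(options)
        weights = [options[c] for c in columns]
        level.append(rng.choices(columns, weights=weights, k=1)[0])
    return level


model = train(CORPUS)
level = generate(model, start="--X", length=12, rng=random.Random(0))

# Print the level as rows (transpose the list of columns).
for row in zip(*level):
    print("".join(row))
```

Every adjacent pair of columns in the output was observed somewhere in the training levels, which is what ties the result to the corpus. The model only captures local transitions, though, so nothing in it checks global properties such as whether the level can be completed.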

The paper became a standard reference for treating game content as something to be learned from data rather than authored by rules. It also laid out open problems (learning from limited datasets, style transfer between games, and guaranteeing playability) that much of the later work on AI-generated levels, and eventually on generative models for games, went on to address.

Last verified June 7, 2026