In “Temporal Difference Learning and TD-Gammon” (Communications of the ACM, March 1995), Gerald Tesauro reported that his neural network, TD-Gammon, learned backgammon almost entirely by playing against itself and updating its position evaluations with temporal-difference learning, without human coaching. Former world champion Bill Robertie judged that the program “plays at a strong master level,” and elite player Kit Woolsey said “its positional judgment is far better than mine.” This self-play approach is a precursor of later systems such as AlphaGo.
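
To make the idea of temporal-difference learning concrete, here is a minimal sketch. It is not Tesauro's implementation: it assumes a toy five-state random walk instead of backgammon, a lookup table instead of a neural network, and the simplest one-step variant, TD(0). The function name `td0_random_walk` and all of its parameters are hypothetical, chosen only for illustration. The point it shows is that each value estimate is nudged toward the estimate of the position that follows it, so the learner improves from its own simulated games without labeled examples.

```python
import random


def td0_random_walk(num_states=5, episodes=20000, alpha=0.01, seed=0):
    """Estimate the chance of 'winning' from each state with tabular TD(0).

    Toy setting (an assumption for illustration): states 1..num_states lie on
    a line; stepping off the right end is a win (reward 1), stepping off the
    left end is a loss (reward 0). Each move goes left or right at random.
    """
    rng = random.Random(seed)
    left_end, right_end = 0, num_states + 1
    # One value per position; the two terminal positions stay at 0.
    values = [0.0] * (num_states + 2)
    for s in range(1, num_states + 1):
        values[s] = 0.5  # neutral initial guess

    for _ in range(episodes):
        state = (num_states + 1) // 2  # start in the middle
        while left_end < state < right_end:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == right_end else 0.0
            # TD(0) target: immediate reward plus the current estimate of
            # the next position (terminal positions contribute 0).
            target = reward + values[next_state]
            # Move the estimate a small step toward the target.
            values[state] += alpha * (target - values[state])
            state = next_state

    return values[1:-1]
```

Calling `td0_random_walk()` returns one estimate per non-terminal state. In this toy problem the true win probability from state i is i/6, so the estimates can be checked against those values. TD-Gammon, by contrast, applied temporal-difference updates to the weights of a neural network that evaluated backgammon positions.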