In October 2019, OpenAI showed a five-fingered robot hand solving a Rubik’s Cube one-handed, a result described in the paper “Solving Rubik’s Cube with a Robot Hand,” posted to arXiv on October 16, 2019. The system, part of OpenAI’s Dactyl project, combined a control policy that learned dexterous in-hand manipulation with vision-based estimation of the cube’s state. The sequence of moves was not learned: it came from an existing solver, Kociemba’s algorithm. The hard part was the physical manipulation needed to carry out those moves.
The headline technical contribution was Automatic Domain Randomization (ADR). The policy was trained entirely in simulation, never on the real hand, but a policy trained in a single fixed simulated world tends to transfer poorly to reality, because no simulator matches real physics exactly. ADR instead generates an ever-widening distribution of simulated environments, varying parameters such as friction, mass, object size, and visual appearance, and widens those ranges automatically as the policy improves, so that training becomes progressively harder. A policy that copes with that whole range of simulated conditions is much better placed to cope with the messiness of the real world.
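The widening loop described above can be sketched in a few lines of Python. This is a minimal illustration, loosely modeled on the idea of probing the edge of each randomization range and widening it when the policy performs well there. It is not OpenAI's implementation: the names `Range`, `sample_env`, `adr_step`, and `toy_evaluate`, the thresholds, the buffer size, and the toy success rule are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Range:
    """Randomization range for one simulator parameter (hypothetical helper)."""
    low: float
    high: float
    step: float  # how far a boundary moves per update
    buffers: dict = field(default_factory=lambda: {"low": [], "high": []})


def sample_env(ranges, rng, pinned=None):
    """Draw one environment; optionally pin one parameter to a boundary."""
    env = {name: rng.uniform(r.low, r.high) for name, r in ranges.items()}
    if pinned is not None:
        name, side = pinned
        env[name] = getattr(ranges[name], side)
    return env


def adr_step(ranges, evaluate, rng, buffer_size=4, t_low=0.2, t_high=0.8):
    """Probe one randomly chosen boundary, then widen or narrow it."""
    name = rng.choice(sorted(ranges))
    side = rng.choice(["low", "high"])
    r = ranges[name]
    buf = r.buffers[side]
    buf.append(evaluate(sample_env(ranges, rng, pinned=(name, side))))
    if len(buf) < buffer_size:
        return  # not enough evidence about this boundary yet
    mean = sum(buf) / len(buf)
    buf.clear()
    if side == "low":
        if mean >= t_high:
            r.low -= r.step                      # policy copes: widen downward
        elif mean <= t_low:
            r.low = min(r.low + r.step, r.high)  # too hard: narrow
    else:
        if mean >= t_high:
            r.high += r.step                     # policy copes: widen upward
        elif mean <= t_low:
            r.high = max(r.high - r.step, r.low)


def toy_evaluate(env):
    """Stand-in for a policy rollout (assumed): 1.0 on success, 0.0 on failure."""
    ok = 0.5 <= env["friction"] <= 1.5 and 0.2 <= env["mass"] <= 3.0
    return 1.0 if ok else 0.0


rng = random.Random(0)
# Start with no randomization: every range is a single nominal value.
ranges = {"friction": Range(1.0, 1.0, 0.1), "mass": Range(1.0, 1.0, 0.1)}
for _ in range(2000):
    adr_step(ranges, toy_evaluate, rng)
for name, r in ranges.items():
    print(f"{name}: [{r.low:.2f}, {r.high:.2f}]")
```

In this toy setup the ranges grow until they reach the limits of what the stand-in "policy" can handle and then hover there. In real training the policy keeps improving between ADR updates, so the ranges keep widening. A practical version would also clamp physical parameters to sensible limits, for example keeping mass positive.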
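The researchers reported signs of emergent meta-learning: the policy adapted on the fly to perturbations it had not been trained on, such as a rubber glove on the hand or fingers tied together. The work was a prominent demonstration of sim-to-real transfer for hard manipulation tasks. Critics noted, however, that the solution sequence was computed by a conventional solver rather than learned, and that reliability remained limited. OpenAI itself reported that the hand solved the cube about 60% of the time, and only about 20% of the time for the most difficult scrambles.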