Cognition claimed Devin resolved 13.86% of SWE-bench issues end-to-end
Cognition's March 2024 Devin launch post stated that Devin resolved 13.86% of SWE-bench issues end-to-end, up from a prior state of the art of 1.96%.
Atomic, verifiable facts - every one tied to a primary source.
Cognition's March 2024 Devin launch post stated that Devin resolved 13.86% of SWE-bench issues end-to-end, up from a prior state of the art of 1.96%.
Common Crawl describes its openly available web corpus as petabytes of data that it has regularly collected since 2008.
DARPA's Strategic Computing program spent about a billion dollars from 1983 to 1993 chasing military machine intelligence.
ETS's e-rater engine uses natural language processing to grade essays alongside human raters on the GRE and TOEFL.
The AIME is a 15-question, 3-hour exam in which every answer is an integer between 0 and 999, making it trivial to score automatically.
F.E.A.R.'s acclaimed combat AI ran on a finite state machine with only three states.
Ford booked a $2.7 billion impairment when it and Volkswagen wound down their self-driving venture Argo AI in 2022.
Forza's Drivatars learn real players' driving styles in the cloud and race as AI stand-ins for them.
Perceptron inventor Frank Rosenblatt drowned in 1971 in a Chesapeake Bay boating accident on his 43rd birthday, his ideas decades ahead of the hardware.
Gottlob Frege's 1879 Begriffsschrift introduced quantifiers - the 'for all' and 'there exists' notation - founding the predicate logic still used today.
At launch, Google said Gemini Ultra was the first model to outperform human experts on the MMLU benchmark of multitask knowledge.
The 2018 Gender Shades study reported gender-classification error rates up to 34.7% for darker-skinned females versus 0.8% for lighter-skinned males.
GGUF stores a model and all its metadata in one self-contained, memory-mappable file, the standard format for running LLMs locally with llama.cpp.
On November 9, 2015, Google released TensorFlow as free open-source software under the Apache 2.0 license, announced by Jeff Dean and Rajat Monga.
After Gemini produced historically inaccurate images, Google said in February 2024 it had turned off image generation of people while it worked on improvements.
When Google revealed the TPU in May 2016, it had already run the custom AI chips in its data centers since 2015, powering Search and AlphaGo.
GPT-4o can respond to audio inputs in as little as 232 milliseconds, averaging 320 milliseconds - close to human conversational response time.
OpenAI's GSM8K benchmark is a dataset of roughly 8,500 high-quality, linguistically diverse grade-school math word problems.
Herbert Simon received the 1975 ACM Turing Award in computing and the 1978 Nobel Memorial Prize in Economic Sciences.
The 2020 HiFi-GAN vocoder produced 22kHz audio 167.9 times faster than real time on a single GPU at near-human quality.
In The Sims, smart objects advertise how much they will satisfy a Sim's needs, and Sims pick the best.
OpenAI trained InstructGPT using a team of about 40 contractors hired on Upwork and through Scale AI to write demonstrations and rank outputs.
In 2025 Italy's Garante fined Luka Inc., maker of the Replika companion chatbot, five million euros over its data processing.
The Fifth Generation Computer Systems project, launched by Japan in 1982, concluded around 1992, capped by ICOT's final international conference that year.
William Stanley Jevons's keyboard-operated Logic Piano, completed around 1869, was the first device to solve a logic problem faster than a person without it.
Andrej Karpathy's 2025 explainer 'Deep Dive into LLMs like ChatGPT' runs about three and a half hours, covering the training stack from raw text to assistant.
Kathleen Booth wrote the first assembly language at Birkbeck College, the readable notation that freed programmers from coding in raw binary.
The LAION-5B dataset comprises 5.85 billion CLIP-filtered image-text pairs, of which 2.32 billion are in English.
Over 120,000 hands of heads-up no-limit poker, Libratus finished ahead of four top professionals by a collective 1,766,250 dollars in chips.
In Note G of her 1843 Notes, Ada Lovelace wrote that the Analytical Engine has no pretensions whatever to originate anything.
Meta's Galactica science model went public on November 15, 2022 and its demo was paused by November 17, 2022.
The DFDC dataset released by Meta (then Facebook) AI contained over 100,000 face-swap video clips sourced from 3,426 paid actors who agreed to participate.
The Llama community license requires companies with over 700 million monthly active users to obtain a separate license from Meta, so it is not OSI-approved.
Microsoft took its Tay Twitter chatbot offline less than a day after launching it, after users provoked it into offensive posts.
Microsoft's Immersive Reader reads text aloud, splits words into syllables, and translates, to help readers with dyslexia.
The Model Card, now a standard way to document an AI model's behavior across groups, was proposed in a 2019 paper led by Margaret Mitchell.
What became known as Moore's Law began as a four-page article Gordon Moore wrote for Electronics magazine on April 19, 1965, not a law of physics.
In 1998 Hans Moravec estimated human-level computation at about 100 million MIPS and predicted cheap machines would match it before 2030.
The physical symbol system hypothesis states that a physical symbol system has the necessary and sufficient means for general intelligent action.
Noam Shazeer is the second author of 'Attention Is All You Need' (the Transformer) and first author of the sparsely-gated mixture-of-experts paper.
Numerai obfuscates its financial dataset so contributors cannot tell which stocks the features describe, preventing trading and overfitting.
The 2020 Nature paper by NumPy's core developers describes the library as 'the foundation upon which the entire scientific Python universe is constructed.'
On April 13, 2019, OpenAI Five defeated Dota 2 world champions OG, becoming the first AI system to beat the reigning world champions at an esport.
OpenAI curated a dataset of 1.2 million songs, paired with lyrics and metadata, to train its raw-audio music model Jukebox.
Hugging Face's documentation states there are over 1M Transformers model checkpoints on the Hugging Face Hub, each loadable by name with a single install.
The PyTorch team's own retrospective fixes the framework's public debut to January 2017, one year before their January 19, 2018 'a year in' post.
reCAPTCHA used everyday CAPTCHA solutions to transcribe scanned books and newspapers with word accuracy exceeding 99 percent.
Refik Anadol trained the AI behind MoMA's Unsupervised on 138,151 pieces from the museum's collection using StyleGAN2.