AI and Compute

We’re releasing an analysis showing that since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5 month-doubling time (by comparison, Moore’s Law had an 18-month doubling period). Since 2012, this metric has grown by more…

AI Safety via Debate

We’re proposing an AI safety technique which trains agents to debate topics with one another, using a human to judge who wins. We believe that this or a similar approach could eventually help us train AI systems to perform far more cognitively advanced tasks than…

Evolved Policy Gradients

We’re releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test time that were outside their training…

OpenAI Charter

We’re releasing a charter that describes the principles we use to execute on OpenAI’s mission. This document reflects the strategy we’ve refined over the past two years, including feedback from many people internal and external to OpenAI. The timeline to AGI remains uncertain, but our…

Retro Contest

We’re launching a transfer learning contest that measures a reinforcement learning algorithm’s ability to generalize from previous experience. In typical RL research, algorithms are tested in the same environment where they were trained, which favors algorithms which are good at memorization and have many hyperparameters….

Report from the OpenAI Hackathon

On March 3rd, we hosted our first hackathon with 100 members of the artificial intelligence community. We had over 500 RSVPs arrive within two days of announcing the event — if you didn’t make it this time, please RSVP again in the future! Thank you…

Reptile: A Scalable Meta-Learning Algorithm

We’ve developed a simple meta-learning algorithm called Reptile which works by repeatedly sampling a task, performing stochastic gradient descent on it, and updating the initial parameters towards the final parameters learned on that task. Reptile is the application of the Shortest Descent algorithm to the…