During my first two and a half years at OpenAI, I worked on the robotics team on an ambitious idea: teaching a single human-like robotic hand to solve a Rubik's cube. It was a tremendously exciting, challenging, and emotional experience. We solved the challenge with deep reinforcement learning (RL), an incredible amount of domain randomization, and no real-world training data. More importantly, we conquered the challenge as a team.
From RL simulation and training to vision perception and hardware firmware, we collaborated closely and cohesively. It was an amazing experience, and during that time I often thought of Steve Jobs' "reality distortion field": when you believe in something so strongly and keep pushing for it so persistently, you can somehow make the impossible possible.
Since the beginning of 2021, I have been leading the Applied AI Research team. Leading a team presents a different set of challenges and requires a change in work style. I am very proud of several projects related to language model safety within Applied AI:
- We designed and built a dataset and evaluation tasks to assess the tendency of pretrained language models to generate hateful, sexual, or violent content.
- We created a detailed taxonomy and built a robust classifier to detect unwanted content, along with the reason the content is inappropriate (a minimal sketch of this kind of classifier follows this list).
- We are working on various techniques to make the model less likely to generate unsafe outputs.
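To make the second bullet concrete, here is a minimal sketch of a multi-label classifier over a content taxonomy: one binary decision per category, so a flagged text also comes with the reason it was flagged. The category names, the placeholder training texts, and the model choice (TF-IDF plus logistic regression) are all illustrative assumptions, not the actual system we built.

```python
# A minimal sketch of a multi-label unwanted-content classifier.
# Categories, texts, and model choice are illustrative assumptions only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

CATEGORIES = ["hate", "sexual", "violence"]  # hypothetical taxonomy labels

# Placeholder corpus just to make the snippet runnable; a real system
# would be trained on a large, carefully labeled dataset.
texts = [
    "placeholder example annotated as hateful",
    "placeholder example annotated as sexual",
    "placeholder example annotated as violent",
    "placeholder example annotated as benign",
]
labels = [["hate"], ["sexual"], ["violence"], []]  # empty list = safe

binarizer = MultiLabelBinarizer(classes=CATEGORIES)
y = binarizer.fit_transform(labels)  # one binary column per category

# One binary classifier per category: a prediction is the set of
# categories that fired, which doubles as the "why" for a flag.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(class_weight="balanced")),
)
model.fit(texts, y)

def moderate(text: str) -> list[str]:
    """Return the taxonomy categories predicted for `text` (empty = safe)."""
    pred = model.predict([text])[0]
    return [cat for cat, hit in zip(CATEGORIES, pred) if hit]

# Output is only illustrative given the tiny toy corpus.
print(moderate("placeholder example annotated as hateful"))
```

The one-vs-rest structure is what lets a single pass return both the flag and its explanation; a production system would swap the TF-IDF features for a fine-tuned language model, but the interface would look much the same.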
As the Applied AI team explores how best to deploy cutting-edge AI techniques, such as large pre-trained language models, we are seeing how powerful and useful they are for real-world tasks. We are also aware of how important it is to deploy these techniques safely, as emphasized in our Charter.