💻 OpenAI introduced reinforcement fine-tuning for o1
OpenAI showed off a reinforcement fine-tuning (RFT) feature for its "reasoning" models o1 and o1-mini. With RFT, you will be able to create your own narrow-domain expert from o1 by showing the model just a few dozen examples.
🧬 The technology was demonstrated with bioinformatician Justin Reese from Berkeley Lab. Using o1-mini, the developers tackled the challenge of identifying genes linked to specific diseases. The system analyzed a dataset pairing patients' symptoms with the genes whose abnormalities were associated with those conditions.
After the additional training, o1-mini could predict which gene is "broken" from a description of symptoms far better than the base o1-mini, and it even surpassed the full o1 presented yesterday (see the chart).
Moreover, testing on a held-out control dataset showed that o1-mini had learned to reason competently about the links between symptoms and gene pathologies rather than simply memorizing them.
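To make the setup concrete, here is a hypothetical sketch of what a few-dozen-example training set for the symptom-to-gene task might look like as JSONL. The field names (`case_report`, `correct_gene`) and the exact schema are my assumptions for illustration, not OpenAI's actual RFT format, which had not been published at the time of the announcement:

```python
import json

# Illustrative symptom-to-gene examples; each record pairs a case
# description with the reference answer the model would be graded against.
# (DMD and CFTR are well-known disease genes, used here only as examples.)
examples = [
    {
        "case_report": "Progressive muscle weakness, elevated creatine "
                       "kinase, calf pseudohypertrophy.",
        "correct_gene": "DMD",
    },
    {
        "case_report": "Recurrent lung infections, pancreatic "
                       "insufficiency, elevated sweat chloride.",
        "correct_gene": "CFTR",
    },
]

def to_jsonl(records):
    """Serialize records as JSONL: one training example per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
```

The point of the demo is that a set this small, scored against reference answers during reinforcement fine-tuning, was enough to specialize the model.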
This method allows the training of expert models in economics, law, medicine, and other professional fields.
📆 Reinforcement fine-tuning is currently in beta testing; access will open to users in early 2025.
This was only the 2nd of OpenAI's 12 "shipmas" announcement streams. Looking forward to tomorrow.

