How to Fine-tune LLMs with RLVR (OpenAI’s RFT API)

2.3K views · 83 likes · 26:00 · Feb 8, 2026



🤝 Want your team maximizing Claude? I run 1:1 and team AI workshops for companies doing $1M+ per year: https://aibuilder.academy/yt/k-94oCJ_WJo

This is the 3rd video in a larger series on reinforcement learning (RL) with LLMs. Here, I walk through a concrete example of fine-tuning OpenAI's o4-mini to detect HDFS anomalies using RLVR.

💻 GitHub Repo: https://github.com/ShawhinT/rlvr-hdfs-classification
🤗 Dataset: https://huggingface.co/datasets/shawhin/HDFS_v1_blocks
▶️ Series Playlist: https://www.youtube.com/playlist?list=PLz-ep5RbHosU_UY8NtZAMaraz74sMHo2W

References
[1] arXiv:2509.16679 [cs.CL]
[2] arXiv:2509.04501 [cs.CL]
[3] arXiv:2501.12948 [cs.CL]
[4] https://platform.openai.com/docs/guides/reinforcement-fine-tuning

Introduction - 0:00
RL with LLMs - 0:15
RLVR - 1:42
SFT vs RLVR - 2:23
Example: HDFS Classification with RLVR - 4:09
Step 0: Imports - 6:37
Step 1: Train-Validation Split - 7:40
Step 2: Format Data - 10:23
Step 3: Create Grader - 12:27
Step 4: Fine-tune Model - 15:38
Step 5: Evaluate Model - 19:07
Limitations - 22:40
What's Next? - 25:00
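
For readers skimming the chapter list, the idea behind the grader step can be sketched in a few lines of plain Python. In RLVR (reinforcement learning with verifiable rewards), a model's answer is scored by a deterministic check against a known label rather than by a learned reward model. The sketch below is not the code from the video or the repo, and it does not use OpenAI's grader configuration format. The `grade` function, the "normal" / "anomaly" label strings, and the sample outputs are all assumptions made for illustration.

```python
# Illustrative only: a verifiable reward for binary log classification.
# The label names and the grade() helper are hypothetical, not taken from
# the video, the repo, or OpenAI's RFT API.

VALID_LABELS = {"normal", "anomaly"}  # assumed label vocabulary


def grade(model_output: str, reference_label: str) -> float:
    """Return 1.0 for an exact label match, 0.0 otherwise."""
    prediction = model_output.strip().lower()
    # Anything outside the allowed labels earns no reward, so free-form
    # answers are scored the same as wrong answers.
    if prediction not in VALID_LABELS:
        return 0.0
    return 1.0 if prediction == reference_label.strip().lower() else 0.0


# Made-up (model output, reference label) pairs.
samples = [
    ("Anomaly", "anomaly"),
    ("normal", "normal"),
    ("normal", "anomaly"),
    ("I think it's fine", "normal"),
]

rewards = [grade(output, label) for output, label in samples]
mean_reward = sum(rewards) / len(rewards)
print(rewards, mean_reward)
```

An exact-match check like this keeps the reward unambiguous, which is what makes it "verifiable". A real setup may prefer a softer score, such as partial credit for well-formed but wrong answers.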
