🤝 Want your team maximizing Claude? I run 1:1 and team AI workshops for companies doing $1M+ per year: https://aibuilder.academy/yt/k-94oCJ_WJo

This is the 3rd video in a larger series on reinforcement learning (RL) with LLMs. Here, I walk through a concrete example of fine-tuning OpenAI's o4-mini to detect HDFS anomalies using reinforcement learning with verifiable rewards (RLVR).

💻 GitHub Repo: https://github.com/ShawhinT/rlvr-hdfs-classification
🤗 Dataset: https://huggingface.co/datasets/shawhin/HDFS_v1_blocks
▶️ Series Playlist: https://www.youtube.com/playlist?list=PLz-ep5RbHosU_UY8NtZAMaraz74sMHo2W

References
[1] arXiv:2509.16679 [cs.CL]
[2] arXiv:2509.04501 [cs.CL]
[3] arXiv:2501.12948 [cs.CL]
[4] https://platform.openai.com/docs/guides/reinforcement-fine-tuning

Introduction - 0:00
RL with LLMs - 0:15
RLVR - 1:42
SFT vs RLVR - 2:23
Example: HDFS Classification with RLVR - 4:09
Step 0: Imports - 6:37
Step 1: Train-Validation Split - 7:40
Step 2: Format Data - 10:23
Step 3: Create Grader - 12:27
Step 4: Fine-tune Model - 15:38
Step 5: Evaluate Model - 19:07
Limitations - 22:40
What's Next? - 25:00
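
For readers skimming the chapters, the "verifiable reward" idea behind Step 3 can be sketched in a few lines of Python. This is only an illustration, not the grader used in the video or the repo: the function name `grade` and the label strings "normal" and "anomaly" are assumptions made for the example. The point is that each model output is scored by a programmatic check against a known label, with no learned reward model involved.

```python
def grade(predicted: str, expected: str) -> float:
    """Hypothetical verifiable reward: 1.0 on an exact label match, else 0.0."""
    # Normalize case and whitespace so formatting differences are not penalized.
    def normalize(label: str) -> str:
        return label.strip().lower()

    return 1.0 if normalize(predicted) == normalize(expected) else 0.0


# Illustrative (predicted, expected) pairs; the label names are assumed.
samples = [("Anomaly", "anomaly"), ("normal", "anomaly"), (" normal ", "normal")]
rewards = [grade(predicted, expected) for predicted, expected in samples]

# The average reward over a batch doubles as plain accuracy for this grader.
mean_reward = sum(rewards) / len(rewards)
print(rewards, mean_reward)
```

Because the reward is binary and checked against ground truth, the same function can serve as an accuracy metric when evaluating the fine-tuned model.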
