Vigyata.AI

Deepseek R1 Fine Tuning [ How to Fine Tune LLM ] Parameter Efficient Fine Tuning LORA Unsloth Ollama

18.1K views · 353 likes · 10:24 · Feb 12, 2025


Do you know how to fine-tune DeepSeek R1 on custom data and save the fine-tuned model locally using Ollama? In this video, I show you the entire process step by step.

What You Will Learn:
1. How to fine-tune the DeepSeek R1 distilled Meta LLaMA model
2. How to load and train it with Unsloth on a GPT-4-generated dataset
3. How to save the fine-tuned model locally using Ollama
4. How to chat with the fine-tuned model

What is DeepSeek R1?
DeepSeek R1 is a logical reasoning model derived from DeepSeek V3. It is trained on Chain of Thought (CoT) datasets to improve its problem-solving skills: it first thinks through a problem before generating an answer. If you want a deep dive into DeepSeek R1's architecture, check out my previous video, where I explain the DeepSeek research paper in simple language. The link is in the description.

Tools Used in Fine-Tuning:
1. Unsloth: Optimises fine-tuning by performing efficient matrix multiplications.
2. Ollama: A local LLM runtime for loading and interacting with models easily.

Fine-Tuning Process:
1. Load the Model and Tokeniser: Using Unsloth, we load the DeepSeek R1 distilled LLaMA-8B model along with its tokeniser.
2. Apply Parameter-Efficient Fine-Tuning (PEFT): We use LoRA (Low-Rank Adaptation) to fine-tune the model efficiently without modifying all of its parameters.
3. Prepare the Training Dataset: The vicgalle/alpaca-gpt4 dataset is formatted into a conversation-friendly structure for training.
4. Train the Model with SFTTrainer: Supervised fine-tuning teaches the model to respond with human-like conversations.
5. Save the Fine-Tuned Model with Ollama: We package the trained model and save it locally for offline AI interactions.
6. Chat with the Fine-Tuned Model: We use Ollama's chat function to interact with the model and test its reasoning capabilities.

If you found this video helpful, like and subscribe for more AI and machine learning tutorials. Let me know in the comments if you have any questions!
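To make step 2 concrete, here is a back-of-the-envelope calculation of how many parameters LoRA actually trains for a single weight matrix compared to full fine-tuning. This is a minimal plain-Python sketch: the layer size and rank below are typical illustrative values for an 8B-class model, not numbers taken from the video's notebook.

```python
# LoRA keeps the original d_out x d_in weight matrix W frozen and learns
# two small matrices B (d_out x r) and A (r x d_in), so the effective
# weight becomes W + B @ A. Only A and B are trained.

def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters LoRA adds for one weight matrix."""
    return d_out * r + r * d_in

d_out = d_in = 4096   # illustrative: one attention projection in an 8B-class model
r = 16                # LoRA rank, a common default

full = d_out * d_in                           # 16,777,216 params if fully tuned
lora = lora_trainable_params(d_out, d_in, r)  # 131,072 params with LoRA

print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x fewer")
```

With these numbers, LoRA trains 128x fewer parameters for this one matrix, which is why the video can fine-tune an 8B model on a single GPU.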
Source Code: https://github.com/simranjeet97/DeepSeekR1_GoogleGeminiPro_RAG_Streamlit/blob/main/deepseekv3-finetuning-ipynb%20(1).ipynb

Join this channel to get access to perks: https://www.youtube.com/channel/UC4RZP6hNT5gMlWCm0NDzUWg/join

Don't forget to like this video, subscribe to the channel, and comment your thoughts or questions.

To get the source code, follow me on GitHub: https://github.com/simranjeet97/

Book a call with me on topmate.io and learn how to harness the power of the latest technologies and speed up your learning. Book your call at https://bit.ly/43TLDCD

Follow me on Medium for the latest blogs and projects: https://bit.ly/3JGXqwc

Playlists that skill you up:
1. GenAI Full Course with LLM Fine-Tuning and Evaluation: https://bit.ly/4bJwZla
2. Learn RAG from Scratch with GenAI Projects: https://bit.ly/3Zl47KD
3. Latest AI/GenAI Research Papers Explained: https://bit.ly/4huqEMT
4. RAG and LLM Use Cases in Finance Domain Projects: https://bit.ly/3AGSRQm
5. Prompt Engineering: https://bit.ly/42v376M
6. Financial Data Analysis and Financial Modelling: https://bit.ly/3OCWI5O
7. Artificial Intelligence Projects: https://bit.ly/3L8lhEi
8. Predict IPL 2023 Winner (End-to-End Data Science Project): https://bit.ly/3BfC3N9
9. Explainable AI (XAI) Machine Learning: https://bit.ly/3gsuIxb
10. Face Recognition: https://bit.ly/2YphpHm

About This Video

In this video, I walk you through a full, practical fine-tuning pipeline for DeepSeek R1 (the distilled LLaMA-8B variant) on custom-style instruction data, and then show you how to save that tuned model locally and actually use it. The key idea is simple: you don't need to touch every parameter to get strong task adaptation. I use parameter-efficient fine-tuning (LoRA/PEFT) so you can train faster, cheaper, and with less GPU pain, while still getting meaningful behavior changes.

I start by loading the DeepSeek R1 distilled model and tokenizer using Unsloth, because it is built to make the fine-tuning loop efficient (better matrix ops, less overhead). Then I prep a GPT-4-generated instruction dataset (vicgalle/alpaca-gpt4) into a conversation-friendly format and run supervised fine-tuning with SFTTrainer.

After training, the important system-design step is deployment: I package and save the fine-tuned model into Ollama so it lives on your machine and you can chat with it offline. By the end, you'll have a local workflow: fine-tune → export → run in Ollama → test reasoning and responses end-to-end.
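The dataset-prep step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the exact code from the video's notebook: the field names (`instruction`, `input`, `output`) follow the standard Alpaca schema used by vicgalle/alpaca-gpt4, and the prompt template is the common Alpaca layout, which may differ in detail from the one used on screen.

```python
# Sketch: render one Alpaca-style dataset row into the single prompt
# string that a supervised fine-tuning trainer (e.g. SFTTrainer) consumes.
# The EOS token is appended so the model learns where responses end;
# "</s>" is a placeholder and should come from the real tokenizer.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def format_example(example: dict, eos_token: str = "</s>") -> str:
    """Render one dataset row into fine-tuning text."""
    return ALPACA_TEMPLATE.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        output=example.get("output", ""),
    ) + eos_token

row = {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
print(format_example(row))
```

In the real pipeline you would map a function like this over the whole dataset (e.g. with `datasets.Dataset.map`) before handing it to the trainer.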
