Base Model vs Instruction-Tuned Model | robert@barcik.training

Key insight: The base model is an internet text simulator — it autocompletes based on patterns in training data. The instruction-tuned model (SFT → RLHF) learns to follow instructions and be helpful. Same underlying knowledge, fundamentally different behavior. InstructGPT (Jan 2022) demonstrated this; ChatGPT (Nov 2022) brought it to the public.

Base / Foundation Model pre-training only

Click "Generate" to see output...

Instruction-Tuned Model SFT + RLHF

Click "Generate" to see output...