Week 2: Data, ML, and How Models Learn
Classification, Regression, and Model Choice
Know when a classical model is the correct tool.
Objective
Differentiate classification from regression and choose a simple baseline model appropriately.
The lesson is public. The pressure loop lives inside the app, where submissions, revision, and review happen.
Deliverable
A simple ML pipeline with evaluation and a leakage audit.
Each lesson contributes to a week-level artifact and eventually to the shipped AI-native SaaS.
What This Is
This lesson teaches how to tell the common learning tasks apart (classification, regression, and clustering) and the discipline of choosing the simplest model that fits the job.
Why This Matters in Production
Most expensive AI mistakes begin with using the wrong class of tool. Choosing an LLM for structured prediction or choosing a complex model when a baseline would do creates operational waste.
Mental Model
Start with the business question, map it to target type, then choose the least complex model that gives acceptable signal and explainability.
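The mapping from target type to task family can be sketched in a few lines. This is a rough heuristic under stated assumptions, not a definitive rule: the function name `infer_task`, the `max_classes` threshold, and the suggested starting points in `BASELINES` are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence

# Hypothetical starting points per task family; adjust to your own constraints.
BASELINES = {
    "classification": "majority-class predictor, then logistic regression",
    "regression": "mean predictor, then linear regression",
    "clustering": "k-means with a small number of clusters",
}


def infer_task(target: Optional[Sequence[object]], max_classes: int = 20) -> str:
    """Guess the task family from the target column (heuristic, not a rule)."""
    # No labels at all: there is nothing to supervise, so group instead.
    if not target:
        return "clustering"

    distinct = set(target)
    # bool is a subclass of int in Python, so exclude it explicitly.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    if not numeric:
        return "classification"

    # Fractional values or many distinct values suggest a continuous target.
    has_fraction = any(isinstance(v, float) and not v.is_integer() for v in distinct)
    if has_fraction or len(distinct) > max_classes:
        return "regression"
    return "classification"


if __name__ == "__main__":
    for name, column in [
        ("churned_next_month", [0, 1, 1, 0]),
        ("next_month_spend", [12.5, 40.0, 7.25]),
        ("no_labels", None),
    ]:
        task = infer_task(column)
        print(f"{name}: {task} -> start with {BASELINES[task]}")
```

A heuristic like this is only a prompt for the framing conversation. An integer target such as a star rating can reasonably be treated either way, which is why the business question comes first.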
Deep Dive
Classification predicts categories. Regression predicts continuous values. Clustering groups examples without labels, which makes it unsupervised rather than supervised. The deeper lesson is that problem framing governs everything after it: if you frame the problem badly, the metrics, model family, data prep, and deployment expectations all drift. A serious AI engineer slows down at framing time, because that is where cost and interpretability are decided.
Worked Example
Predicting whether a user will churn next month is classification. Predicting next-month spend is regression. Segmenting learners into latent behavior groups is clustering. If you misuse one for another, your evaluation metric stops telling the truth.
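To make the last point concrete, here is a minimal sketch using made-up toy data (the numbers are illustrative, not real results). It computes the trivial baseline for each task, then shows what happens when a classification metric is applied to a regression target: predictions that are off by only 0.5 score zero on exact-match accuracy, while mean absolute error reports the true size of the miss. All function names are assumptions chosen for this example.

```python
from collections import Counter
from statistics import mean

# Made-up toy data, for illustration only.
churned = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]  # 1 = user churned next month
spend = [20.0, 35.5, 18.0, 50.0, 42.25, 27.0, 31.0, 60.0]  # next-month spend


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common class."""
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_baseline_mae(values):
    """MAE of always predicting the mean of the observed values."""
    center = mean(values)
    return mean_absolute_error(values, [center] * len(values))


def exact_match_accuracy(y_true, y_pred):
    """Classification-style metric: fraction of exact matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    print("churn majority-class accuracy:", majority_baseline_accuracy(churned))
    print("spend mean-predictor MAE:", mean_baseline_mae(spend))

    # A regressor that is consistently off by 0.5 is quite good here,
    # but a classification metric cannot see that.
    close_predictions = [v + 0.5 for v in spend]
    print("MAE of close predictions:", mean_absolute_error(spend, close_predictions))
    print("exact-match accuracy:", exact_match_accuracy(spend, close_predictions))
```

The churn baseline also shows why a headline accuracy number needs context: with imbalanced classes, always predicting "no churn" already scores well without learning anything.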
Common Failure Modes
Common failures include optimizing for novelty, skipping baselines, and choosing a model because it sounds advanced instead of because it matches the target and operating constraints.
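One way to guard against these failures is to make the baseline comparison an explicit step. The sketch below is a hypothetical selection rule, not a standard algorithm: it walks the candidates from simplest to most complex and returns the first one that beats the baseline by a minimum margin. The `Candidate` fields, the complexity ranks, the `min_lift` default, and the scores in the usage example are all invented for illustration.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str
    complexity: int  # hypothetical rank: lower = cheaper to run and explain
    score: float     # validation score where higher is better


def pick_simplest_acceptable(
    candidates: Sequence[Candidate],
    baseline_score: float,
    min_lift: float = 0.02,
) -> str:
    """Return the least complex candidate that clears the baseline by min_lift.

    Falls back to "baseline" when no candidate earns its extra complexity.
    """
    for candidate in sorted(candidates, key=lambda c: c.complexity):
        if candidate.score - baseline_score >= min_lift:
            return candidate.name
    return "baseline"


if __name__ == "__main__":
    # Invented scores, for illustration only.
    options = [
        Candidate("gradient boosting", complexity=3, score=0.90),
        Candidate("logistic regression", complexity=1, score=0.86),
        Candidate("LLM classifier", complexity=5, score=0.91),
    ]
    print(pick_simplest_acceptable(options, baseline_score=0.80))
```

The rule deliberately ignores which candidate scores highest. Whether a few extra points justify a more complex model is a judgment about operating constraints, and the function only makes sure that judgment starts from a measured baseline.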