Core Concepts of Machine Learning (w/ Elaboration & Examples) 🐶

Sherlynn
3 min read · Apr 25, 2025

Here are the essentials you should understand before going deeper into advanced topics:

🧩 1. Features

Features are the input variables that the model uses to make predictions.

Think of features like:

  • The clues in a mystery novel
  • The ingredients in a recipe 🍳

Examples:

🏠 House price prediction:

  • Features: square footage, number of bedrooms, location, year built

📧 Spam email detection:

  • Features: number of links, presence of certain words (“free”, “win”), email length

🏃‍♂️ Predicting running performance:

  • Features: age, weekly mileage, VO2 max, sleep hours

📝 Note: Features can be numerical (e.g., age), categorical (e.g., city = “Seattle”), or even text/images/audio, which are pre-processed into a form the model can handle.
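
To make this concrete, here is a minimal Python sketch (the column names and values are made up purely for illustration) showing a tiny feature table for house price prediction, with the categorical "city" column one-hot encoded so a model can use it:

```python
import pandas as pd

# A tiny, made-up feature table for house price prediction
houses = pd.DataFrame({
    "square_footage": [1400, 2100, 900],                    # numerical feature
    "bedrooms":       [3, 4, 2],                            # numerical feature
    "city":           ["Seattle", "Bellevue", "Seattle"],   # categorical feature
    "year_built":     [1995, 2010, 1978],
})

# Categorical features are usually converted to numbers before modeling,
# e.g., with one-hot encoding:
X = pd.get_dummies(houses, columns=["city"])
print(X)
```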

🎯 2. Labels

Labels are the correct answers that the model is trying to learn to predict during training.

If features are the “question,” the label is the “answer.”

Examples:

House price prediction:

  • Label = price in dollars

Email classification:

  • Label = spam or not spam

Disease diagnosis:

  • Label = positive or negative

During training, the model sees both features and labels. During testing or real-world use, it sees only the features and must predict the label itself.
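
Here is a tiny Python sketch of that idea (all numbers are invented for illustration): each training example pairs features with a label, while at prediction time only the features are available.

```python
# Each training example pairs features with a label (price in dollars).
# All values are made up for illustration.
training_data = [
    # (square_footage, bedrooms) -> label
    ((1400, 3), 300_000),
    ((2100, 4), 480_000),
    ((900,  2), 210_000),
]

for features, label in training_data:
    print(f"features={features} -> label=${label:,}")

# At prediction time the model receives only the features, e.g. (1600, 3),
# and must produce the label on its own.
```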

🧠 3. Model

The model is the mathematical structure that tries to learn the relationship between features and labels.

Think of it as:

  • A black box that turns input (features) into output (predicted labels)
  • A recipe your model “cooks up” during training

Examples:

  • A linear regression model might look like this (see the code sketch below):
price = 50,000 + (100 * square footage) + (10,000 * bedrooms)
  • A decision tree model might decide:
If income > 50k and age < 40, then buy = Yes
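
To make the linear regression example concrete, here is that formula written as a small Python function. The coefficients (50,000; 100; 10,000) are simply the numbers from the example above, standing in for values a real model would learn during training:

```python
def predict_price(square_footage, bedrooms):
    # Coefficients taken from the example formula above; a trained
    # linear regression model would learn these from data.
    return 50_000 + 100 * square_footage + 10_000 * bedrooms

print(predict_price(square_footage=1500, bedrooms=3))  # 230000
```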

There are many types of models (linear, tree-based, neural nets, etc.); boosting, for example, combines many tree-based models sequentially.

❌ 4. Loss Function

The loss function is a score that tells the model how wrong its predictions are.

The goal of training is to minimize this loss.

Think of it like:

  • A teacher grading your paper and telling you how off you were 📉
  • A compass pointing the model toward better performance 🧭

Common Loss Functions:

Mean Squared Error (MSE) – for regression

  • Measures the average squared difference between the prediction and the actual value
  • E.g., predicted price = $200k, actual = $250k → squared error = (200 − 250)² = 2,500, working in thousands of dollars

Cross Entropy – for classification

  • Punishes confident wrong predictions (e.g., predicted 0.99 for “cat” but it was “dog”)
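
As a rough sketch, here is how both losses can be computed by hand with NumPy; the numbers are made up, and in practice ML libraries provide these loss functions built in:

```python
import numpy as np

# Mean Squared Error (regression), working in thousands of dollars
y_true = np.array([250, 400])        # actual prices
y_pred = np.array([200, 410])        # predicted prices
mse = np.mean((y_pred - y_true) ** 2)
print(mse)                           # ((-50)**2 + 10**2) / 2 = 1300.0

# Cross entropy for one binary prediction: a confident wrong answer hurts a lot
p_cat = 0.99                         # model says "99% sure it's a cat"
actual_is_cat = 0                    # ...but it was a dog
loss = -(actual_is_cat * np.log(p_cat) + (1 - actual_is_cat) * np.log(1 - p_cat))
print(loss)                          # ~4.6, a large penalty for a confident mistake
```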

🏋️‍♀️ 5. Training

Training is the process by which the model learns from data by adjusting internal parameters to minimize the loss.

Think of training like:

  • A student working through practice problems and adjusting their approach after each one
  • A muscle growing stronger in response to resistance

What happens during training (sketched in code after this list):

1. The model makes a prediction

2. The loss function evaluates the prediction

3. The optimizer adjusts the model's parameters to improve next time

4. Repeat until the model improves or plateaus
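
Below is a minimal sketch of that loop for a one-feature linear model, using plain gradient descent as the optimizer and MSE as the loss. The data and learning rate are made up for illustration:

```python
import numpy as np

# Toy data (made up): square footage in 1000s of sq ft -> price in $100k
X = np.array([1.0, 1.5, 2.0, 2.5])
y = np.array([2.1, 3.0, 4.2, 4.9])

w, b = 0.0, 0.0                              # model parameters, start with a bad guess
learning_rate = 0.05

for step in range(1000):
    y_pred = w * X + b                       # 1. the model makes a prediction
    loss = np.mean((y_pred - y) ** 2)        # 2. the loss function scores it (MSE)
    grad_w = 2 * np.mean((y_pred - y) * X)   # 3. the optimizer computes gradients...
    grad_b = 2 * np.mean(y_pred - y)
    w -= learning_rate * grad_w              #    ...and nudges the parameters
    b -= learning_rate * grad_b              # 4. repeat until the loss plateaus

print(f"w={w:.2f}, b={b:.2f}, final loss={loss:.4f}")
```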

😬 6. Overfitting vs. Underfitting

Overfitting: The model has essentially memorized the training data

  • Learns noise instead of the true underlying patterns
  • Performs well on training data, poorly on new/unseen data
  • Like a student who memorizes practice test answers but can’t solve similar questions on the actual test

Example:

  • A decision tree that splits on every tiny detail until it classifies the training data perfectly, but can’t generalize.

Underfitting: Model is too simple

  • Misses important patterns
  • Performs poorly on both training and test data

Example:

  • Trying to fit a straight line to data that clearly curves
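
One way to see both failure modes is to fit polynomials of different degrees to curved data and compare training vs. test scores. Here is a hedged scikit-learn sketch (synthetic data and a made-up seed, so exact numbers will vary): degree 1 underfits, degree 2 fits well, and a very high degree tends to score near-perfectly on training data but worse on held-out data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic curved data: y = x^2 plus noise (made up for illustration)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, size=60)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 2, 15):    # underfit, good fit, likely overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree={degree:2d}"
          f"  train R^2={model.score(X_train, y_train):.2f}"
          f"  test R^2={model.score(X_test, y_test):.2f}")
```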

🧪 Generalization

The fundamental goal of ML isn’t to perform well on training data, but to generalize well to new, unseen data.

  • If your model only memorizes and can’t apply patterns broadly, it’s not truly “learning.”
