I taught myself machine learning more than 10 years ago.
If I had to start again today, I wouldn't start with models, LLMs, or agents, even though many AI experts suggest doing exactly that.
I'd start with the math and the code.
Ugly truth: most people skip the foundations, then wonder why everything feels like magic or falls apart in production.
If you want to be different and actually understand ML instead of just copy-pasting, this is the roadmap I'd follow:
Start with fundamentals, because no matter how fast LLMs or GenAI evolve, your math, code, and logic will keep you relevant.
Here's what you should focus on:
📐 1. Linear Algebra
Learn these core ideas:
Vectors, matrices, tensors
Matrix multiplication (dot products, broadcasting)
Transpose, inverse, rank, determinants
Eigenvalues & eigenvectors (especially for PCA & embeddings)
Projections and orthogonality
✅ Use NumPy to implement everything yourself
→ Practice matrix ops, dot products, and visualizing transformations with Matplotlib
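Here is a minimal sketch of what that practice could look like, assuming NumPy is installed. The matrix `A` and the vectors `v` and `u` are made-up values chosen only for illustration. It covers a matrix-vector product, broadcasting, inverse, rank, determinant, eigendecomposition, and an orthogonal projection.

```python
import numpy as np

# A small symmetric matrix and two vectors (illustrative values only)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 0.0])
u = np.array([1.0, 1.0])

# Matrix-vector product: each output entry is a dot product of a row of A with v
Av = A @ v

# Broadcasting: v is added to every row of A without an explicit loop
shifted = A + v

# Transpose, inverse, rank, determinant
A_T = A.T
A_inv = np.linalg.inv(A)
rank = np.linalg.matrix_rank(A)
det = np.linalg.det(A)

# Eigendecomposition; eigh is meant for symmetric matrices
# and returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(A)

# Projection of v onto u, and the leftover part that is orthogonal to u
proj = (v @ u) / (u @ u) * u
residual = v - proj

print("A @ v       =", Av)
print("eigenvalues =", eigvals)
print("projection  =", proj)
print("residual·u  =", residual @ u)
```

Checking that `A @ A_inv` gives the identity and that `residual @ u` is zero is a quick way to test your own from-scratch implementations against NumPy's.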
🔁 2. Calculus
Focus on:
Derivatives & partial derivatives
Chain rule (for backpropagation in neural nets)
Gradient descent
Convex functions, minima/maxima
✅ Use SymPy or JAX to compute derivatives, and Matplotlib to visualize them
→ Plot functions and their gradients to develop deep intuition
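A minimal sketch of the calculus items above, assuming SymPy is installed. The functions `f` and `g`, the starting point, and the learning rate are illustrative choices, not a recommendation. It computes a derivative symbolically, checks the chain rule on a composed function, and runs plain gradient descent on a convex function.

```python
import sympy as sp

x = sp.symbols("x")

# A convex function with its minimum at x = 3 (illustrative choice)
f = (x - 3) ** 2 + 1
df = sp.diff(f, x)             # symbolic derivative of f

# Chain rule on a composition: d/dx sin(x**2) = cos(x**2) * 2x
g = sp.sin(x ** 2)
dg = sp.diff(g, x)

# Turn the symbolic derivative into a regular numeric function
grad = sp.lambdify(x, df)

# Plain gradient descent: step against the gradient repeatedly
x_val = 0.0                    # arbitrary starting point
lr = 0.1                       # learning rate (assumed value)
for _ in range(100):
    x_val -= lr * grad(x_val)

print("f'(x) =", df)
print("g'(x) =", dg)
print("x after gradient descent =", x_val)
```

From here, evaluating `f` and `grad` on a grid of points and plotting both with Matplotlib shows how the gradient changes sign at the minimum.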
🎲 3. Probability
You need a solid grip on:
Random variables (discrete & continuous)
Conditional probability & Bayes' rule
Joint & marginal probability
The chain rule of probability
Expectation, variance, entropy
Common distributions: Bernoulli, Binomial, Gaussian, Poisson
Central limit theorem
The law of large numbers
✅ Simulate simple probability experiments in Python with NumPy
→ E.g. simulate sampling from distributions
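A minimal sketch of such experiments, assuming NumPy is installed. Every number here (the Bernoulli parameter, the sample sizes, and the test accuracy figures in the Bayes example) is hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Law of large numbers: the sample mean of many Bernoulli(p) draws approaches p
p = 0.3
flips = rng.binomial(n=1, p=p, size=100_000)
sample_mean = flips.mean()

# Central limit theorem: means of n exponential draws look roughly Gaussian,
# centered at 1 with standard deviation 1 / sqrt(n)
n = 50
means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)

# Bayes' rule with hypothetical numbers: P(disease | positive test)
prior = 0.01            # P(disease)
sensitivity = 0.95      # P(positive | disease)
false_positive = 0.05   # P(positive | no disease)
evidence = prior * sensitivity + (1 - prior) * false_positive
posterior = prior * sensitivity / evidence   # about 0.16

print("Bernoulli sample mean:", sample_mean)
print("mean of sample means :", means.mean())
print("std of sample means  :", means.std(), "vs theory", 1 / np.sqrt(n))
print("P(disease | positive):", posterior)
```

Plotting a histogram of `means` is a good follow-up: the bell shape appears even though the underlying exponential distribution is heavily skewed.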
📊 4. Statistics
These are must-know topics:
Descriptive stats: mean, median, mode, standard deviation
Hypothesis testing: p-values, confidence intervals, t-tests
Correlation vs. causation
Sampling, bias, and variance
Overfitting/underfitting
A/B testing basics
✅ Use Pandas & SciPy to explore real datasets
→ Calculate descriptive stats, create histograms/box plots, run t-tests
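A minimal sketch of that workflow, assuming NumPy, Pandas, and SciPy are installed. It uses synthetic data as a stand-in for a real dataset: the column names, group labels, and distribution parameters are all made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(seed=42)

# Synthetic A/B-style data: two groups whose true means differ by 1.0
df = pd.DataFrame({
    "group": ["A"] * 200 + ["B"] * 200,
    "value": np.concatenate([
        rng.normal(loc=10.0, scale=2.0, size=200),
        rng.normal(loc=11.0, scale=2.0, size=200),
    ]),
})

# Descriptive statistics per group
summary = df.groupby("group")["value"].agg(["mean", "median", "std"])

# Welch's t-test (does not assume equal variances between groups)
a = df.loc[df["group"] == "A", "value"]
b = df.loc[df["group"] == "B", "value"]
t_stat, p_value = stats.ttest_ind(a, b, equal_var=False)

print(summary)
print("t statistic:", t_stat)
print("p-value    :", p_value)
```

Swapping the synthetic frame for `pd.read_csv(...)` on a real dataset keeps the rest of the code the same, and `df["value"].hist()` or `df.boxplot(column="value", by="group")` covers the plotting step.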
🔧 Essential Python libraries to learn early
NumPy – for vectorized math and fast array ops
Pandas – for loading, cleaning, and analyzing tabular data
Matplotlib / Seaborn – for plotting and visualizing distributions, relationships, and trends
SymPy – for symbolic math and calculus
SciPy – for stats, optimization, and numerical methods
Jupyter Notebooks – for combining math, code, and visuals in one place
📚 Best resources to nail the fundamentals:
✅ Machine Learning Foundations math series (ML Foundations: Linear Algebra, Calculus, Probability, and Statistics): a series of 4 courses that I created together with LinkedIn Learning
✅ Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, a book by Aurélien Géron
✅ The Hundred-Page Machine Learning Book by Andriy Burkov
If you want to become an actual ML engineer, not just someone who watches and copies demos, start here.
♻️ Repost to help others💚