ML Questions to answer
Use these as revision prompts after reading ML Key concepts, ML Equations and definitions, and ML Models from cheatsheet.
Foundations
- What makes a learning problem supervised?
- What is the difference between a feature vector, target, parameter, and hyperparameter?
- Why is empirical risk only a proxy for real-world performance?
- Why can a model with more parameters generalise well in some settings but overfit badly in others?
Classification vs regression
- What is the difference between classification and regression?
- Give three examples of categorical targets and three examples of continuous targets.
- Why is logistic regression a classification model despite its name?
- Why can accuracy be a bad metric for classification?
- How would you choose a classification threshold when false positives and false negatives have different costs?
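One way to explore the threshold question is the minimal sketch below. It assumes you already have predicted probabilities on a validation set and can put a number on each error type. The function names, the costs, and the scores are all made up for illustration, and the search simply tries every observed score as a candidate threshold.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Sum the misclassification costs when predicting 1 for score >= threshold."""
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            cost += cost_fp  # false positive
        elif predicted == 0 and label == 1:
            cost += cost_fn  # false negative
    return cost


def best_threshold(scores, labels, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost on this data."""
    # Candidates: every observed score, plus +inf for "never predict positive".
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates, key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))


# Made-up validation scores (predicted probability of class 1) and true labels.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 1]

missed_positive_is_costly = best_threshold(scores, labels, cost_fp=1, cost_fn=10)
false_alarm_is_costly = best_threshold(scores, labels, cost_fp=10, cost_fn=1)
print(missed_positive_is_costly, false_alarm_is_costly)
```

On this toy data the chosen threshold is 0.35 when false negatives cost ten times more, and 0.70 when false positives do. The threshold is picked on validation data here; picking it on the test set would be a form of leakage.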
SVMs and linear boundaries
- What is a hyperplane?
- What does it mean for an SVM to maximise the margin?
- What are support vectors and why do they matter?
- Why is the assumption of linear separability often too strong?
KNN
- How does KNN classification differ from KNN regression?
- What happens when k is too small? What happens when k is too large?
- Why does feature scaling matter for KNN?
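The scaling question can be checked with a small experiment. The sketch below is a bare-bones KNN classifier in plain Python, not a library implementation. The data are invented so that one feature has a large numeric range but carries no class information, and min-max scaling is only one of several reasonable choices.

```python
import math


def knn_classify(train, query, k):
    """Majority vote among the k training points closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def min_max_scale(point, lows, highs):
    """Rescale each feature to [0, 1] using the training minimum and maximum."""
    return tuple((v - lo) / (hi - lo) for v, lo, hi in zip(point, lows, highs))


# Made-up data: feature 0 is large but uninformative, feature 1 separates the classes.
train = [
    ((1000.0, 0.10), "a"), ((3000.0, 0.20), "a"), ((5000.0, 0.15), "a"),
    ((1200.0, 0.90), "b"), ((3100.0, 0.80), "b"), ((4900.0, 0.85), "b"),
]
query = (1050.0, 0.85)

lows = [min(x[j] for x, _ in train) for j in range(2)]
highs = [max(x[j] for x, _ in train) for j in range(2)]
scaled_train = [(min_max_scale(x, lows, highs), label) for x, label in train]
scaled_query = min_max_scale(query, lows, highs)

raw_prediction = knn_classify(train, query, k=3)
scaled_prediction = knn_classify(scaled_train, scaled_query, k=3)
print(raw_prediction, scaled_prediction)
```

Without scaling, distances are dominated by feature 0 and the query is labelled "a"; after scaling, feature 1 contributes comparably and the query is labelled "b".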
Trees and ensembles
- How does a decision tree partition the input space?
- What does a classification tree store in a leaf? What does a regression tree store?
- Why does a random forest reduce the instability of a single tree?
- How does gradient boosting use residuals/errors from previous models?
- Why can boosting overfit even though it is built from weak learners?
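For the residuals question, the following is a rough sketch of boosting with squared loss, where each new weak learner is fitted to the residuals of the current ensemble. It assumes 1-D inputs and uses single-split stumps; `fit_stump`, `boost`, and the data are hypothetical illustrations, and real gradient boosting libraries add many refinements not shown here.

```python
def mse(ys, predictions):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


def fit_stump(xs, targets):
    """Least-squares regression stump on 1-D inputs: one split, two leaf means."""
    best = None
    for split in sorted(set(xs))[1:]:  # skip the smallest value so both sides are non-empty
        left = [t for x, t in zip(xs, targets) if x < split]
        right = [t for x, t in zip(xs, targets) if x >= split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x < split else right_mean


def boost(xs, ys, rounds, learning_rate):
    """Squared-loss boosting: each stump is fitted to the current residuals."""
    predictions = [sum(ys) / len(ys)] * len(xs)  # start from the mean target
    mse_history = [mse(ys, predictions)]
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
        mse_history.append(mse(ys, predictions))
    return predictions, mse_history


# Made-up 1-D data with three rough plateaus.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 5.2, 4.8]
predictions, mse_history = boost(xs, ys, rounds=20, learning_rate=0.5)
print(mse_history[0], mse_history[-1])
```

Training error keeps falling as rounds are added, which is exactly why boosting can overfit: a validation curve, not the training curve, should decide when to stop.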
Regularisation
- What is the difference between Lasso and Ridge penalties?
- Why can Lasso perform feature selection?
- Why is Ridge useful with noisy or correlated features?
- How does the regularisation strength affect bias and variance?
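The Lasso versus Ridge questions have a compact illustration in a simplified setting. The sketch assumes an orthonormal design, so each coefficient can be solved on its own from its least-squares estimate `b_ols`, with per-coefficient objectives ½(b − b_ols)² + λ|b| for Lasso and ½(b − b_ols)² + (λ/2)b² for Ridge. These scalings are an assumption chosen for the example; libraries may scale the penalty differently.

```python
def ridge_shrink(b_ols, lam):
    """Ridge under an orthonormal design: proportional shrinkage, never exactly zero."""
    return b_ols / (1.0 + lam)


def lasso_shrink(b_ols, lam):
    """Lasso under an orthonormal design: soft-thresholding at lam."""
    if b_ols > lam:
        return b_ols - lam
    if b_ols < -lam:
        return b_ols + lam
    return 0.0  # small coefficients are set exactly to zero


# Made-up least-squares coefficients and penalty strength.
ols_coefficients = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

ridge_coefficients = [ridge_shrink(b, lam) for b in ols_coefficients]
lasso_coefficients = [lasso_shrink(b, lam) for b in ols_coefficients]
print(ridge_coefficients)
print(lasso_coefficients)
```

Lasso zeroes the two small coefficients, which is the feature-selection effect, while Ridge shrinks all four but keeps them nonzero. Increasing `lam` shrinks harder in both cases, trading variance for bias.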
Classification-specific models
- What probability does logistic regression model?
- What does LDA try to find geometrically?
- State Bayes’ theorem and explain how Naive Bayes uses it.
- What is the naive assumption in Naive Bayes?
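As a reference point for the two Naive Bayes questions, the standard statement can be written as below. The symbols are assumptions for this note: x = (x_1, ..., x_d) is the feature vector, y the class label, and d the number of features.

```latex
P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)},
\qquad
P(x \mid y) \approx \prod_{j=1}^{d} P(x_j \mid y)
\quad\Longrightarrow\quad
\hat{y} = \arg\max_{y} \; P(y) \prod_{j=1}^{d} P(x_j \mid y)
```

The first equality is Bayes' theorem; the product is the naive step, which treats features as conditionally independent given the class. The denominator P(x) is the same for every class, so it can be dropped when only the most probable class is needed.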
Regression-specific models
- What does the slope mean in simple linear regression?
- Why can polynomial regression fit nonlinear curves even though it is linear in coefficients?
- What does Gaussian process regression output besides a mean prediction?
- What does a 95% uncertainty interval mean, and what assumptions does it depend on?
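For the polynomial regression question, the sketch below shows the sense in which the model is linear in its coefficients: the inputs are expanded into powers of x, and the fit is then ordinary least squares on those columns. It is a plain-Python illustration via the normal equations, with hypothetical helper names and noise-free made-up data; a numerical library would normally be used instead.

```python
def design_matrix(xs, degree):
    """Rows [1, x, x^2, ..., x^degree]: nonlinear in x, but fixed features."""
    return [[x ** p for p in range(degree + 1)] for x in xs]


def solve(A, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    X = design_matrix(xs, degree)
    m = degree + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(m)]
    return solve(XtX, Xty)


# Made-up noise-free data from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
coefficients = fit_polynomial(xs, ys, degree=2)
print(coefficients)
```

The fitted curve is a parabola in x, yet the unknowns enter only through a linear system, which is why the same least-squares machinery as simple linear regression applies.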
Optimisation and evaluation
- What does the gradient of the loss point toward, and why does gradient descent move in the negative-gradient direction?
- What does it mean if training loss decreases but validation loss increases?
- How would you detect data leakage?
- Which metric would you choose for a rare-event detector?
- In a physics-informed model (see Residuals in NF2 and PINNs), which residuals should be checked besides data fit?
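For the gradient descent question, a one-dimensional sketch makes the direction argument concrete. The loss (w − 3)², the starting point, and the learning rate are arbitrary choices for illustration.

```python
def loss(w):
    """Toy quadratic loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the loss; positive when w is to the right of the minimum."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate, steps):
    """Repeatedly step against the gradient and record the loss after each step."""
    w = w0
    history = [loss(w)]
    for _ in range(steps):
        w -= learning_rate * grad(w)  # move in the negative-gradient direction
        history.append(loss(w))
    return w, history


w_final, history = gradient_descent(w0=0.0, learning_rate=0.1, steps=50)
print(w_final, history[0], history[-1])
```

The gradient points toward increasing loss, so subtracting it moves w toward the minimum and the recorded loss falls at every step. With a learning rate that is too large for this loss (above 1.0 here), the same update overshoots and the loss grows instead.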
Good answers should connect Stats Key concepts, Vector Calculus Key concepts, validation hygiene, and domain assumptions.