Machine Learning With Python

Course 1: Introduction to Machine Learning

Basics of Machine Learning and Motivation: A first approach to machine learning. We’ll go over the main motivations, the main kinds of algorithms, what they can be used for…

Metrics in Machine Learning: Todo

Why does learning work? An overview of Vapnik–Chervonenkis theory: Todo

Prepare data for Machine Learning: Todo

Overfitting and regularization: Todo

Course 2: Supervised Machine Learning

a. Statistical inference

Linear Regression (Part 1): We’ll explore the simple framework of OLS and multi-dimensional regression.
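As a taste of the OLS framework mentioned above, here is a minimal sketch using NumPy's least-squares solver. The toy data (`x`, `y`) are hypothetical and noise-free, chosen only so the fitted coefficients are easy to check; the article itself may take a different route.

```python
import numpy as np

# Hypothetical toy data: y = 2 + 3*x exactly (no noise), for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

# Design matrix with an intercept column of ones.
X = np.column_stack([np.ones_like(x), x])

# OLS estimate: the coefficients minimizing ||y - X @ beta||^2.
beta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)

intercept, slope = beta
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Adding more columns to `X` gives the multi-dimensional case with no other change to the code.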

Linear Regression (Part 2): Random design matrix, Normal regression, Pseudo Least Squares and other extensions…

The Logistic Regression: One of the fundamental algorithms for classification.

b. Core algorithms

The Bayes Classifier: A core concept in classification, the Bayes Classifier is considered one of the first algorithms to master.

Support Vector Machine: Todo

Linear Discriminant Analysis (LDA) and QDA: Intuition behind LDA, when it should be used, and the maths behind it. We’ll also quickly cover the quadratic version of LDA.

Tree-based methods with CART: Todo

c. Bagging Methods

Random Forest and Extra Trees: Todo

d. Boosting Methods

Adaptive Boosting (AdaBoost): A clear introduction to boosting algorithms and adaptive boosting, with illustrations. When should we use boosting? What are the foundations of the algorithm?

Gradient Boosting (Regression): The basics of gradient boosting regression, and a high-level implementation in Python.

Gradient Boosting (Classification): Gradient boosting classification as an extension of the regression case.

e. Time Series

Introduction to Time Series: A first approach to exploring a time series in Python with open data.

Key Concepts in Time Series: Stationarity, ergodicity… We’ll cover the key concepts of time series.

Basics of Time Series Forecasting: How do we make a series stationary? How do we forecast?

Time Series Forecasting with Facebook Prophet: Explore time series forecasting using the Prophet open-source package.

Handle missing values in Time Series: A quick illustration of backward filling and forward filling.
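For a quick preview of the two filling strategies, the sketch below uses pandas on a made-up daily series. The dates and values are hypothetical and only serve to show which direction each method propagates values.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with gaps (NaN marks a missing observation).
idx = pd.date_range("2021-01-01", periods=6, freq="D")
series = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0], index=idx)

# Forward fill: carry the last known value forward into the gap.
forward = series.ffill()

# Backward fill: pull the next known value backward into the gap.
backward = series.bfill()

print(pd.DataFrame({"raw": series, "ffill": forward, "bfill": backward}))
```

Forward filling only uses past observations, which usually makes it the safer choice when the filled series feeds a forecasting model; backward filling uses future values and can leak information.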

f. Recommendation Systems

Content-based Filtering: Todo

Collaborative Filtering: Todo

Course 3: Optimization and tuning

GridSearch vs. RandomizedSearch: When it comes to parameter selection, you usually encounter two main solutions: GridSearch and RandomizedSearch. What is the main difference between these two techniques? What are the pros and cons of each?
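To make the contrast concrete, here is a minimal sketch with scikit-learn's `GridSearchCV` and `RandomizedSearchCV`. The dataset (iris), the model (logistic regression), and the search range for `C` are assumptions picked for brevity, not the setup used in the article.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Grid search: evaluates every value in a fixed, hand-picked list.
grid = GridSearchCV(model, param_grid={"C": [0.01, 0.1, 1.0, 10.0]}, cv=3)
grid.fit(X, y)

# Randomized search: draws n_iter candidates from a distribution instead.
rand = RandomizedSearchCV(
    model,
    param_distributions={"C": loguniform(1e-2, 1e1)},
    n_iter=4,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print("grid best:", grid.best_params_)
print("random best:", rand.best_params_)
```

Both searches fit the same number of candidates here. The difference is that the grid is limited to the listed values, while the randomized search can land anywhere in the range and its budget (`n_iter`) does not grow with the number of parameters.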

Bayesian Hyperparameter Optimization (HyperOpt): Bayesian Hyperparameter Optimization is a great alternative to GridSearch and RandomizedSearch. How does it work? How do you implement it in Python?

AutoML with h2o: Interest in AutoML is rising, and AutoML algorithms are reaching very good rankings in data science competitions. But what is AutoML? How does it work? When should you use it? And how can you implement an AutoML pipeline in Python?

Machine Learning Explainability: We’ll cover permutation importance, partial dependence plots and SHAP values to better explain the outputs of an ML model.

Course 4: Unsupervised Machine Learning

Clustering algorithms: Todo

Unsupervised Anomaly detection: Todo

Reducing dimension: Todo

All code and exercises are available in this repo. Don’t hesitate to show your support and star the repo: