Let be labeled training data. The data is said to be linearly separable if there exists a hyperplane that correctly classifies all the examples :

In general, finding is impossible, but we search for some that separates the 2 classes. The objective function to optimize is :

This is called a batch objective since it relies on a cumulative fit to data. By our assumption : . There are however many solution hyperplanes if we consider scaling of . To solve this, we fix to be the smallest-norm vector that guarantees :

How can we solve this? Using the Perceptron recursive update. It has been shown that the perceptron converges to a solution in a finite number of steps.

The algorithm of the Perceptron is the following :


Conclusion : That’s it ! I hope this introduction to Online Learning was clear. Don’t hesitate to drop a comment if you have any question.

Like it? Buy me a coffeeLike it? Buy me a coffee

Leave a comment