Support Vector Machines (SVMs) are a family of supervised learning algorithms that can be used for classification, regression and outlier detection.
Their optimisation objective is to maximise the margin, i.e. the distance between the separating decision boundary (hyperplane) and the training samples closest to that boundary. A larger margin tends to give a lower generalisation error, i.e. less overfitting.
For this reason, these algorithms are also called Maximal Margin Classifiers.
In the case of support-vector machines, each data point is viewed as a p-dimensional vector (a list of p numbers); the training points lying closest to the decision boundary are the so-called support vectors.
We want to know whether we can separate such points with a (p-1)-dimensional hyperplane. For p=2, the hyperplane is simply a line. There are many hyperplanes that might classify the data.
The figure shows a p=2 example:
H3 does not separate the classes.
H1 does, but only with a small margin.
H2 separates them with the maximal margin.
H2 therefore represents the largest separation, or margin, between the two classes. By choosing, among all separating hyperplanes, the one for which the margin between the two classes is biggest, we aim to lower the generalisation error and avoid overfitting.
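To make this concrete: writing the training samples as $(\mathbf{x}_i, y_i)$ with labels $y_i \in \{-1, +1\}$, margin maximisation in the linearly separable case can be stated as the standard hard-margin problem (a textbook formulation, added here for reference):

$$
\min_{\mathbf{w},\, b} \ \frac{1}{2}\lVert \mathbf{w} \rVert^2
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^\top \mathbf{x}_i + b\right) \ge 1, \quad i = 1, \dots, n.
$$

Since the margin equals $2/\lVert \mathbf{w} \rVert$, minimising $\lVert \mathbf{w} \rVert$ maximises the margin.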
This is a convex quadratic program and can be solved efficiently, for example with the SVM module of scikit-learn.
Let’s see a couple of examples of classifying data via SVM.
You can also follow along with the code in my notebook on GitHub.
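As a starting point, here is a minimal, self-contained sketch of the workflow with scikit-learn's SVC (the iris dataset, the train/test split and the hyperparameters are illustrative choices of mine, not necessarily what the notebook uses):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the classic iris dataset (three classes; SVC handles
# the multi-class case via a one-vs-one scheme)
X, y = datasets.load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# SVMs are sensitive to feature scales, so standardise first
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Linear kernel: fit the maximal-margin separating hyperplane
clf = SVC(kernel="linear", C=1.0)
clf.fit(X_train, y_train)

print("Support vectors per class:", clf.n_support_)
print("Test accuracy:", clf.score(X_test, y_test))
```

Note that only the support vectors (reported by `n_support_`) determine the decision boundary; the remaining training points could be removed without changing the fitted model.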