Support Vector Machine | Machine Learning


In this tutorial, we will learn the Support Vector Machine (SVM) algorithm and implement it in Python.

Support Vector Machine: A Support Vector Machine is a discriminative classifier that finds the optimal hyperplane to distinctly separate the data points in an N-dimensional space (N being the number of features). In a two-dimensional space, a hyperplane is simply a line that divides the data points into two different classes.


How the Algorithm Works:

Let's say you need to classify two different classes of data points in a two-dimensional space. Look at the following illustration.

[Figure: two classes of data points plotted in a two-dimensional plane]

Here we see two classes of data points, one in red and the other in green. Now, what can we do to separate these classes? We can simply draw a line between them, and this line could be drawn anywhere in the plane.

[Figure: several candidate lines, each separating the two classes]


Here, any of these lines can separate the classes. But our task is to find the optimal line, the one that classifies the data points most reliably. This is where the Support Vector Machine helps us: the algorithm finds the optimal line/hyperplane by maximizing the margin, i.e. the distance between the hyperplane and the nearest data points of each class.

[Figure: the maximum-margin hyperplane with the support vectors of both classes highlighted]

Here, the support vectors are the data points closest to the decision boundary; they are the points that "support" the hyperplane, because they alone determine its position and the width of the margin. That's why this algorithm is called a Support Vector Machine.
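For readers who want the underlying math, here is the standard hard-margin formulation (a textbook sketch, not specific to this tutorial's code). Writing the hyperplane as w · x + b = 0 and the class labels as y_i ∈ {-1, +1}, the SVM solves:

minimize (1/2) ||w||^2 over w and b,   subject to   y_i (w · x_i + b) ≥ 1 for all i

Minimizing ||w|| is the same as maximizing the margin width 2 / ||w||, and the constraints hold with equality exactly at the support vectors.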

Note: In higher-dimensional spaces (more than two dimensions), each data point is treated as a vector of feature values, and the separating boundary becomes a hyperplane rather than a line.

This is one of the simplest yet most powerful algorithms for solving classification problems.
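Before we move to a real dataset, here is a minimal sketch (with made-up toy points) showing how scikit-learn exposes the support vectors of a fitted linear SVM:

import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters (made-up toy values)
X_toy = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel = 'linear')
clf.fit(X_toy, y_toy)

# The points that define the maximum-margin boundary
print(clf.support_vectors_)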

SVM in Python: Now we will implement this algorithm in Python. For this task, we will use the dataset Social_Network_Ads.csv. Let's have a glimpse of that dataset.

[Image: preview of the first few rows of Social_Network_Ads.csv]

This dataset records a customer's buying decision along with their gender, age and estimated salary. Now, using SVM, we will build a classifier on this dataset to predict the decision for unseen data points.

You can download the whole dataset from here.


First of all, we need to import the essential libraries into our program.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Now, let's import the dataset.

dataset = pd.read_csv('Social_Network_Ads.csv')
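To take a quick look at the data ourselves, we can print the first few rows and the overall shape (this assumes the CSV file sits in the working directory):

# Peek at the first five rows and the number of rows/columns
print(dataset.head())
print(dataset.shape)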

In the dataset, the Age and EstimatedSalary columns are the independent variables and the Purchased column is the dependent variable. So we will take both Age and EstimatedSalary into our feature matrix and the Purchased column into the dependent variable vector.

X = dataset.iloc[:, [2, 3]].values   # Age, EstimatedSalary
y = dataset.iloc[:, 4].values        # Purchased

Now, we will split our dataset into training and test sets.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
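With test_size = 0.25, one quarter of the rows goes to the test set and the remaining three quarters to the training set. A quick sanity check (the exact numbers depend on the size of your copy of the dataset):

# Verify the 75/25 split
print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)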


We need to scale our features to get a more accurate prediction. Since SVM relies on distances between data points, a feature with a large range (such as salary) would otherwise dominate a feature with a small range (such as age).

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
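Note that the scaler is fitted on the training set only and then reused on the test set, so no information from the test set leaks into training. A quick way to verify the scaling (a sketch; the exact values will vary slightly):

# Training features should now have roughly zero mean and unit variance
print(X_train.mean(axis = 0), X_train.std(axis = 0))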

Well, it's time to fit the SVM algorithm to our training set. For this, we use the SVC class from the scikit-learn library.

from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)

Note: Here kernel specifies the type of decision boundary the algorithm learns. You will learn about it in detail in our Kernel SVM tutorial. For simplicity, here we choose the linear kernel.
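As a quick preview, switching to a non-linear kernel only changes this one argument (a sketch; 'rbf' is scikit-learn's default kernel and is covered in the Kernel SVM tutorial):

# Same workflow, non-linear (RBF) decision boundary
classifier_rbf = SVC(kernel = 'rbf', random_state = 0)
classifier_rbf.fit(X_train, y_train)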

Our model is ready. Now, let's see what it predicts for our test set.

y_pred = classifier.predict(X_test)
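We can also query the model for a single, unseen customer. The values below are made up purely for illustration; note that new inputs must be scaled with the same StandardScaler before prediction:

# Hypothetical customer: 30 years old, estimated salary of 87,000
new_customer = sc.transform([[30, 87000]])
print(classifier.predict(new_customer))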

To see how good our SVM model is, let's evaluate the predictions it made using a confusion matrix.

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred) 

Printing cm gives a 2×2 array: the diagonal entries count the correct predictions for each class, and the off-diagonal entries count the misclassifications.
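From the confusion matrix we can also compute the overall accuracy, either by hand or with scikit-learn's helper:

from sklearn.metrics import accuracy_score

print(cm)
# Accuracy = correct predictions / all predictions
print((cm[0, 0] + cm[1, 1]) / cm.sum())
print(accuracy_score(y_test, y_pred))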

Now, let's visualize our test set results.

# Visualising the Test set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
# Build a fine grid covering the (scaled) feature space
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by the class the classifier predicts for it
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# Overlay the actual test points on top of the decision regions
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('SVM (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

The graph will look like the following:

[Figure: SVM decision regions on the test set, Age vs. Estimated Salary]

From the above graph, we can see that our model has found a straight decision boundary (the optimal line for a linear kernel) that separates the two classes of data points quite accurately.

This tutorial only explains SVM in two-dimensional space. In the next tutorial, we will see SVM in higher-dimensional spaces.