How to Create a Support Vector Machine in Python using sklearn

In this article, we show how to create a support vector machine in Python using the sklearn module.

A support vector machine (SVM) is a supervised machine learning algorithm that classifies data by finding a hyperplane that divides the data points into their classes. An SVM seeks the optimal hyperplane: the one that separates the distinct classes with the widest possible margin, based on the features of the data.

In other words, a support vector machine is a classifier. It assigns each data point to a class according to which side of the optimal hyperplane the point falls on.
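To make the idea of a separating hyperplane concrete, here is a minimal sketch on a tiny hand-made 2D dataset. The toy points, the variable names, and the choice of kernel="linear" are assumptions for illustration only and are not part of the breast cancer example in this article. With a linear kernel, sklearn exposes the hyperplane's weights and intercept directly.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated clusters in 2D
X_toy = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                  [5.0, 5.0], [5.5, 6.0], [6.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel gives an explicit hyperplane of the form w . x + b = 0
toy_model = SVC(kernel="linear")
toy_model.fit(X_toy, y_toy)

w = toy_model.coef_[0]        # weight vector of the hyperplane
b = toy_model.intercept_[0]   # intercept of the hyperplane
print("weights:", w, "intercept:", b)

# New points are classified by which side of the hyperplane they fall on
new_points = np.array([[1.0, 2.0], [6.0, 6.0]])
toy_predictions = toy_model.predict(new_points)
toy_scores = toy_model.decision_function(new_points)
print(toy_predictions)  # class label for each new point
print(toy_scores)       # negative on the class 0 side, positive on the class 1 side
```

The sign of decision_function tells us the side of the hyperplane, and its magnitude grows as the point moves farther from the hyperplane.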

We will use the breast cancer dataset from sklearn datasets.

Below is the Python code that uses a support vector machine to classify results from the breast cancer dataset as either malignant or benign.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    from sklearn import datasets
    cancer = datasets.load_breast_cancer()

    X = pd.DataFrame(cancer['data'], columns=cancer['feature_names'])
    y = cancer['target']

    from sklearn.svm import SVC
    model = SVC()

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)

    model.fit(X_train, y_train)
    predictions = model.predict(X_test)

    from sklearn.metrics import classification_report, confusion_matrix
    print(confusion_matrix(y_test, predictions))
    print('\n')
    print(classification_report(y_test, predictions))

The first thing we have to do is import our modules, including pandas, numpy, matplotlib, seaborn, and sklearn. Of these, only pandas and sklearn are actually needed by the code shown here.

Since we are going to use a dataset from the sklearn module, we import datasets.

We create a variable, cancer, and set it equal to datasets.load_breast_cancer()

To construct a pandas DataFrame object holding all the features of the dataset, we create a variable, X, and set it equal to pd.DataFrame(cancer['data'], columns= cancer['feature_names'])

We then create the target variable, y, and set it equal to cancer['target']

The y variable represents the outcome of each instance (whether the tumor is benign or malignant).
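As a quick check of what the integer labels in y mean, the following sketch inspects the dataset's target_names and counts the samples in each class. The variable name counts is an assumption introduced for illustration.

```python
import numpy as np
from sklearn import datasets

cancer = datasets.load_breast_cancer()

# target_names maps each integer label in cancer['target'] to a class name:
# label 0 is 'malignant' and label 1 is 'benign'
print(cancer['target_names'])

# The feature matrix has one row per sample and one column per feature
print(cancer['data'].shape)

# Count how many samples belong to each class
counts = np.bincount(cancer['target'])
for label, name in enumerate(cancer['target_names']):
    print(label, name, counts[label])
```

Knowing that 0 means malignant and 1 means benign matters later, when we read the confusion matrix and the classification report.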

We import SVC, sklearn's support vector classifier, from sklearn.svm

The model variable is set equal to SVC()
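Calling SVC() with no arguments uses sklearn's default settings. The sketch below simply prints a few of them with get_params() so you can see what the model is configured with. The default for gamma has changed between sklearn versions, so the value you see may differ from one installation to another.

```python
from sklearn.svm import SVC

model = SVC()

# get_params() returns the model's settings as a dictionary
params = model.get_params()
print("kernel:", params['kernel'])  # the default kernel is 'rbf'
print("C:", params['C'])            # the default regularization strength is 1.0
print("gamma:", params['gamma'])    # version dependent, so no value is assumed here
```

Any of these can be overridden when the model is created, for example SVC(kernel='linear', C=10).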

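Before we train our support vector machine, we need to split the data into a training set and a test set.

So we import train_test_split from sklearn.model_selection

The line, X_train, X_test, y_train, y_test= train_test_split(X,y, test_size= 0.3, random_state=101), creates all of our training data and test data. The argument test_size= 0.3 reserves 30% of the data for testing, and random_state=101 fixes the random shuffle so that the split is the same every time the program runs. The training data is the data we use to train our ML model. The test data is the data used to test the model once it is trained, so that we can see how effective the model is.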
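To see what the split produces, this short sketch repeats the same call and prints the sizes of the resulting pieces. It also repeats the split a second time to illustrate that a fixed random_state gives the same rows each time. The names X_train2 and X_test2 are assumptions added for illustration.

```python
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

cancer = datasets.load_breast_cancer()
X = pd.DataFrame(cancer['data'], columns=cancer['feature_names'])
y = cancer['target']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101)

# 569 samples in total: 398 go to training and 171 to testing
print(X_train.shape, X_test.shape)
print(len(y_train), len(y_test))

# Repeating the split with the same random_state selects the same rows
X_train2, X_test2, _, _ = train_test_split(
    X, y, test_size=0.3, random_state=101)
print(X_test.index.equals(X_test2.index))
```

The 171 test samples are the same 171 that appear in the support column of the classification report.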

The line, model.fit(X_train, y_train), trains our model.

Once the model is trained, we can now test it with the test data. This is done by the line, predictions= model.predict(X_test)

We then evaluate the predictions against the true labels, y_test, using classification_report and confusion_matrix from sklearn.metrics.

After running this program, we get the results shown below.

    [[ 56  10]
     [  3 102]]


                  precision    recall  f1-score   support

               0       0.95      0.85      0.90        66
               1       0.91      0.97      0.94       105

        accuracy                           0.92       171
       macro avg       0.93      0.91      0.92       171
    weighted avg       0.93      0.92      0.92       171

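In sklearn's confusion matrix, the rows are the actual classes and the columns are the predicted classes, with class 0 (malignant) first and class 1 (benign) second. So the results show us that 56 malignant tumors were correctly classified as malignant, 10 malignant tumors were misclassified as benign, 3 benign tumors were misclassified as malignant, and 102 benign tumors were correctly classified as benign. If we treat malignant as the positive class, that is 56 true positives, 10 false negatives, 3 false positives, and 102 true negatives.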
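Reading a bare 2x2 array is error-prone, so it can help to label it. In sklearn's confusion_matrix, each row is an actual class and each column is a predicted class. The sketch below copies the matrix printed above into an array and attaches the class names. The names cm_table, mal_correct, mal_as_benign, ben_as_malignant, and ben_correct are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# Confusion matrix copied from the program output above
cm = np.array([[56, 10],
               [3, 102]])

# In this dataset, label 0 is malignant and label 1 is benign
class_names = ['malignant', 'benign']

# Rows are the actual classes, columns are the predicted classes
cm_table = pd.DataFrame(
    cm,
    index=['actual ' + name for name in class_names],
    columns=['predicted ' + name for name in class_names])
print(cm_table)

# ravel() flattens the matrix row by row
mal_correct, mal_as_benign, ben_as_malignant, ben_correct = cm.ravel()
print("malignant tumors predicted as benign:", mal_as_benign)
print("benign tumors predicted as malignant:", ben_as_malignant)
```

In a medical setting, the 10 malignant tumors predicted as benign are usually the more costly kind of error, which is why it is worth knowing exactly which cell is which.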

The results give us a weighted average precision of 93% and an overall accuracy of 92%.

So our support vector machine model performs well on this dataset.
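The figures in the classification report can be reproduced by hand from the confusion matrix. The following sketch, which uses the matrix printed above and variable names chosen for illustration, computes the per-class precision and recall, the accuracy, and the averaged precision values.

```python
import numpy as np

# Confusion matrix from the output above: rows are actual, columns are predicted
cm = np.array([[56, 10],
               [3, 102]])

correct = cm.diagonal()        # correctly classified samples per class
predicted = cm.sum(axis=0)     # column sums: how often each class was predicted
support = cm.sum(axis=1)       # row sums: how many samples each class really has

precision = correct / predicted   # of the samples predicted as a class, the share that were right
recall = correct / support        # of the samples in a class, the share that were found

accuracy = correct.sum() / cm.sum()
macro_precision = precision.mean()
weighted_precision = (precision * support).sum() / support.sum()

print("precision per class:", precision.round(2))
print("recall per class:", recall.round(2))
print("accuracy:", round(accuracy, 2))
print("macro avg precision:", round(macro_precision, 2))
print("weighted avg precision:", round(weighted_precision, 2))
```

The weighted average gives each class's precision a weight proportional to its support, while the macro average treats both classes equally.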

And this is how to create a support vector machine in Python using the sklearn module.

