How to Import Datasets in Python using the sklearn Module





In this article, we show how to import datasets in Python using the sklearn module.

Many Python modules come with built-in datasets.

These datasets let us practice without having to create our own data.

The sklearn module includes several datasets that we can use.

In the example below, we import the diabetes dataset from the sklearn module.
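A minimal sketch of that import, assuming scikit-learn is installed. The variable name `diabetes` is only illustrative.

```python
from sklearn.datasets import load_diabetes

# Load the dataset; the returned object bundles the data, target and metadata
diabetes = load_diabetes()

print(diabetes.data.shape)     # (442, 10): 442 instances, 10 variables
print(diabetes.target.shape)   # (442,): one progression value per instance
print(diabetes.feature_names)  # short names of the 10 variables
print(diabetes.DESCR)          # full text description of the dataset
```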



The example imports data related to diabetes.

The dataset records 10 variables for each patient: age, sex, body mass index (bmi), average blood pressure, and six blood serum measurements (total serum cholesterol, LDL, HDL, the ratio of total cholesterol to HDL, a value that is possibly the log of serum triglycerides, and blood sugar level).

The target, column 11, is a quantitative measure of disease progression one year after baseline.

There are 442 instances in this dataset.

Below are more examples of datasets from the sklearn module.

Datasets from the sklearn module

load_boston: Load and return the Boston house-prices dataset
load_iris: Load and return the iris dataset
load_digits: Load and return the digits dataset
load_linnerud: Load and return the physical exercise linnerud dataset
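
The other loaders in this list are called the same way. The loop below is a small sketch, assuming scikit-learn is installed, that loads three of them and prints the size of each one. The `shapes` dictionary is only an illustrative helper.

```python
from sklearn.datasets import load_iris, load_digits, load_linnerud

shapes = {}
for loader in (load_iris, load_digits, load_linnerud):
    dataset = loader()
    # Record (shape of the data array, shape of the target array)
    shapes[loader.__name__] = (dataset.data.shape, dataset.target.shape)
    print(loader.__name__, dataset.data.shape, dataset.target.shape)
```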


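Another example uses the Boston house-prices dataset, load_boston. Note that load_boston was deprecated in scikit-learn 1.0 and removed in version 1.2, so it is only available in older releases.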
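
A hedged sketch of loading this dataset follows. Because load_boston is missing from scikit-learn 1.2 and later, the import is guarded, and the fallback message and the `boston_shape` variable are assumptions added for illustration.

```python
# load_boston only exists in scikit-learn releases before 1.2,
# so guard the import instead of assuming it is available.
try:
    from sklearn.datasets import load_boston
except ImportError:
    load_boston = None

boston_shape = None
if load_boston is not None:
    boston = load_boston()
    boston_shape = boston.data.shape
    print(boston_shape)          # (506, 13) on releases that ship the dataset
    print(boston.feature_names)  # names of the 13 attributes
else:
    print("load_boston is not available in this scikit-learn version")
```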



The load_boston dataset has 506 instances.

It has 13 attributes (variables), and the target is the median value of the homes.

These are example datasets that you can use when you want to work with data, for instance to test different machine learning algorithms and see which one predicts most effectively.
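
As a rough illustration of that kind of comparison, the sketch below scores two regressors on the diabetes dataset with 5-fold cross-validation. The choice of models, the `random_state` value, and the `scores` dictionary are assumptions made for this example, not a recommended setup.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# return_X_y=True gives the data and target arrays directly
X, y = load_diabetes(return_X_y=True)

models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(random_state=0),
}

scores = {}
for name, model in models.items():
    # For regressors, cross_val_score reports R^2 by default (higher is better)
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean R^2 = {scores[name]:.3f}")
```

The model with the higher mean score predicted the held-out folds better on this particular dataset.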

This is how to import datasets in Python using the sklearn module.




