How to Import Datasets in Python using the sklearn Module
In this article, we show how to import datasets in Python using the sklearn module.
Many Python modules include built-in datasets.
We can practice on these datasets without having to create our own data.
The sklearn module has several datasets that we can use.
In the example below, we import the diabetes dataset from the sklearn module.
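A minimal sketch of this import, assuming scikit-learn is installed. The variable name `diabetes` is our own choice, not something the library requires.

```python
from sklearn.datasets import load_diabetes

# load_diabetes() returns a dictionary-like object holding the data and metadata
diabetes = load_diabetes()

print(diabetes.data.shape)     # (442, 10): 442 instances, 10 variables
print(diabetes.target.shape)   # (442,): one outcome value per instance
print(diabetes.feature_names)  # short names of the 10 variables
```

The `data` attribute holds the 10 input variables and `target` holds the outcome value for each instance.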
This dataset holds data related to diabetes.
It has 10 baseline variables: age, sex, body mass index (bmi), average blood pressure, and six blood serum measurements (total serum cholesterol, LDL, HDL, the ratio of total cholesterol to HDL, a value that is possibly the log of the serum triglycerides level, and blood sugar level).
The target, column 11, is a quantitative measure of disease progression one year after baseline.
There are 442 instances of data within this dataset.
That is one example.
More datasets from the sklearn module are listed below.
Datasets from the sklearn module

Function | Description
load_boston | Load and return the Boston house-prices dataset
load_iris | Load and return the iris dataset
load_digits | Load and return the digits dataset
load_linnerud | Load and return the physical exercise Linnerud dataset
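The loaders listed above are all called the same way. As a rough sketch, assuming scikit-learn is installed, the loop below loads three of them and prints the shape of each. The `shapes` dictionary is only an illustrative helper of our own.

```python
from sklearn.datasets import load_iris, load_digits, load_linnerud

shapes = {}
for loader in (load_iris, load_digits, load_linnerud):
    dataset = loader()
    # Record (rows, columns) of the data array for each dataset
    shapes[loader.__name__] = dataset.data.shape
    print(loader.__name__, dataset.data.shape, dataset.target.shape)
```

Each loader returns the same kind of object, so `data`, `target`, and `feature_names` are available on all of them.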
Below we load another of these datasets, the Boston house-prices dataset.
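A hedged sketch of loading this dataset follows. It assumes an older scikit-learn release, since load_boston was removed in version 1.2. The import is guarded so that the code still runs on newer versions, where it only prints a message.

```python
# load_boston exists only in scikit-learn versions before 1.2,
# so guard the import instead of assuming it is available.
try:
    from sklearn.datasets import load_boston
except ImportError:
    load_boston = None

if load_boston is not None:
    boston = load_boston()
    print(boston.data.shape)     # number of instances and attributes
    print(boston.feature_names)  # names of the attributes
else:
    boston = None
    print("load_boston is not available in this scikit-learn version")
```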
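The dataset returned by load_boston has 506 instances.
It has 13 attributes (variables), and the target is the median value of the home prices.
Note that load_boston was deprecated in scikit-learn 1.0 and removed in version 1.2, so it is only available in older releases.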
These are example datasets that you can use when you want to work with data, for instance to test different machine learning algorithms and see which one predicts the target most accurately.
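As an illustration of that kind of comparison, here is a minimal sketch that scores two regressors on the diabetes dataset with 5-fold cross-validation. The choice of models, the number of folds, and the `scores` dictionary are our own assumptions, not something prescribed by the datasets themselves.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# return_X_y=True returns the data and target arrays directly
X, y = load_diabetes(return_X_y=True)

models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(random_state=0),
}

scores = {}
for name, model in models.items():
    # For regressors, cross_val_score uses the R^2 score by default
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean R^2 = {scores[name]:.3f}")
```

A higher mean R^2 suggests a better fit on this dataset, although a real comparison would also tune each model.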
And this is how to import datasets in Python using the sklearn module.