Let’s break down the code step-by-step.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
matplotlib.pyplot is used for plotting graphs.
numpy is a library for numerical computations.
pandas is used for data manipulation and analysis.
sklearn.linear_model contains scikit-learn's linear models, including LinearRegression.
data_root = "https://github.com/ageron/data/raw/main/"
lifesat = pd.read_csv(data_root + "lifesat/lifesat.csv")
X = lifesat[["GDP per capita (USD)"]].values
y = lifesat[["Life satisfaction"]].values
data_root holds the base URL for the dataset.
lifesat = pd.read_csv(data_root + "lifesat/lifesat.csv") reads the CSV file from the URL into a pandas DataFrame.
X = lifesat[["GDP per capita (USD)"]].values extracts the GDP per capita column and converts it into a NumPy array.
y = lifesat[["Life satisfaction"]].values extracts the Life satisfaction column and converts it into a NumPy array.
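The double brackets in the two extraction lines matter: they select a one-column DataFrame, so .values returns a 2D array of shape (n_samples, 1), which is the layout scikit-learn expects for its inputs. Here is a minimal sketch of the difference, using a small made-up DataFrame (the name toy and its numbers are hypothetical and are not taken from lifesat.csv):

```python
import pandas as pd

# Hypothetical toy data, NOT values from lifesat.csv
toy = pd.DataFrame({
    "GDP per capita (USD)": [30_000.0, 40_000.0, 50_000.0],
    "Life satisfaction": [6.0, 6.5, 7.0],
})

# Double brackets select a one-column DataFrame -> 2D array
X_2d = toy[["GDP per capita (USD)"]].values

# Single brackets select a Series -> 1D array
x_1d = toy["GDP per capita (USD)"].values

print(X_2d.shape)  # (3, 1)
print(x_1d.shape)  # (3,)
```

Passing the 1D version as X to a scikit-learn estimator would raise an error asking for a 2D array, which is why the walkthrough uses double brackets.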
lifesat.plot(kind='scatter', grid=True, x="GDP per capita (USD)", y="Life satisfaction")
plt.axis([23_500, 62_500, 4, 9])
plt.show()
lifesat.plot(kind='scatter', grid=True, x="GDP per capita (USD)", y="Life satisfaction") creates a scatter plot with GDP per capita on the x-axis and Life satisfaction on the y-axis. The grid=True argument adds a grid to the plot.
plt.axis([23_500, 62_500, 4, 9]) sets the x-axis limits from 23,500 to 62,500 and the y-axis limits from 4 to 9.
plt.show() displays the plot.
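To see what the axis call does, the limits can be read back from the Axes object that the plot call returns. The sketch below is only an illustration under a few assumptions: it uses a hypothetical toy DataFrame instead of the real dataset, and it selects the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, NOT values from lifesat.csv
toy = pd.DataFrame({
    "GDP per capita (USD)": [30_000.0, 40_000.0, 50_000.0],
    "Life satisfaction": [6.0, 6.5, 7.0],
})

# DataFrame.plot returns the matplotlib Axes it drew on
ax = toy.plot(kind="scatter", grid=True,
              x="GDP per capita (USD)", y="Life satisfaction")

# plt.axis takes [xmin, xmax, ymin, ymax] and applies to the current Axes
plt.axis([23_500, 62_500, 4, 9])

x_limits = ax.get_xlim()
y_limits = ax.get_ylim()
print(x_limits, y_limits)

plt.close("all")  # release the figure
```

In an interactive session you would keep the default backend and call plt.show() as in the walkthrough.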
model = LinearRegression()
model = LinearRegression() initializes a linear regression model.
model.fit(X, y)
model.fit(X, y) trains the linear regression model using GDP per capita (X) as the input and Life satisfaction (y) as the output.
X_new = [[37_655.2]] # Cyprus' GDP per capita in 2020
print(model.predict(X_new)) # output: [[6.30165767]]
X_new = [[37_655.2]] creates a new data point representing Cyprus’ GDP per capita in 2020.
print(model.predict(X_new)) predicts the Life satisfaction for Cyprus using the trained model and prints the result. The output [[6.30165767]] is the predicted Life satisfaction score for Cyprus.
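Under the hood, the prediction of a fitted linear regression is just intercept + slope * x. The sketch below checks this on hypothetical toy data that lies exactly on a known line (the names X_toy, y_toy, and toy_model and the chosen line are assumptions for illustration, not the book's dataset). It also shows why the output is nested as [[...]]: because y is a 2D array, the fitted attributes and the predictions are 2D as well.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data lying exactly on y = 4.0 + 0.00005 * x
X_toy = np.array([[30_000.0], [40_000.0], [50_000.0]])
y_toy = 4.0 + 0.00005 * X_toy  # shape (3, 1), like y in the walkthrough

toy_model = LinearRegression()
toy_model.fit(X_toy, y_toy)

# With a 2D target, intercept_ has shape (1,) and coef_ has shape (1, 1)
intercept = toy_model.intercept_[0]
slope = toy_model.coef_[0][0]

X_new = [[37_655.2]]
manual = intercept + slope * X_new[0][0]    # prediction computed by hand
predicted = toy_model.predict(X_new)[0][0]  # prediction from scikit-learn

print(np.isclose(manual, predicted))
```

The same check works on the model trained on the real data: reading model.intercept_ and model.coef_ after fitting tells you the line the model has learned.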
Each step corresponds to a key part of the process: data loading, data visualization, model selection, model training, and making predictions.
Reference:
Book: Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, by Aurélien Géron