Ridge regression and Lasso regression are two regularization techniques that reduce model complexity and help avoid the over-fitting that can occur with ordinary linear regression.

Ridge regression

Ridge regression is used when the independent variables are highly correlated with one another (multicollinearity). In that situation the least-squares estimates are still unbiased, but their variances are large, so the fitted coefficients can be far from the true values. Ridge regression accepts a small amount of bias in exchange for lower variance by adding a penalty controlled by a lambda parameter. Lambda is called a shrinkage parameter because it shrinks the coefficients toward zero, which reduces the effect of multicollinearity. Ridge regression minimizes the following function,

$$\sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$

In the above formula, the lower the value of lambda, the more the model resembles ordinary linear regression; at lambda = 0 the penalty vanishes entirely. The lambda parameter penalizes large regression coefficients, so it can be tuned to improve the model's accuracy on unseen data.
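To make the role of lambda concrete, here is a minimal sketch that solves the ridge problem in closed form with NumPy. The helper `ridge_coefficients`, the synthetic data and the lambda grid are assumptions made for illustration, and the intercept is left out to keep the algebra short.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution without intercept: (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data with two strongly correlated features (hypothetical values).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)   # nearly a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + 2.0 * x2 + rng.normal(scale=0.5, size=200)

# Ordinary least squares for comparison; lambda = 0 should reproduce it.
ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("OLS coefficients:", ols.round(3))

for lam in (0.0, 0.1, 1.0, 10.0, 100.0):
    beta = ridge_coefficients(X, y, lam)
    print(f"lambda={lam:>6}: coefficients={beta.round(3)}, norm={np.linalg.norm(beta):.3f}")
```

With lambda = 0 the result matches ordinary least squares, and the length of the coefficient vector shrinks as lambda grows.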

Multiple regression lines are drawn for various values of lambda (called alpha in scikit-learn) in the graph below. The best-fit line is then chosen based on accuracy, and the corresponding alpha value is used to penalize the coefficients.

Graph showing multiple regression lines for different alpha values
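Choosing alpha by comparing accuracy across several candidate values can be sketched as follows. This is only an illustration on synthetic data: the helpers `fit_ridge` and `mse`, the candidate list and the 90/30 split are assumptions, not part of the walkthrough below.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge fit on centered data; returns (intercept, coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ coef, coef

def mse(X, y, intercept, coef):
    """Mean squared error of the linear model on (X, y)."""
    return float(np.mean((y - (intercept + X @ coef)) ** 2))

# Synthetic regression problem (hypothetical values).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=1.0, size=120)

# Hold out the last 30 rows for validation.
X_train, X_val = X[:90], X[90:]
y_train, y_val = y[:90], y[90:]

# Fit one model per candidate and keep the one with the lowest validation error.
candidates = [0.01, 0.1, 0.5, 1.0, 10.0, 100.0]
val_errors = {lam: mse(X_val, y_val, *fit_ridge(X_train, y_train, lam)) for lam in candidates}
best_lam = min(val_errors, key=val_errors.get)

for lam, err in val_errors.items():
    print(f"alpha={lam:>6}: validation MSE={err:.4f}")
print("selected alpha:", best_lam)
```

Centering the data before solving leaves the intercept unpenalized, which is also how scikit-learn's `Ridge` behaves when it fits an intercept.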

Let’s predict Boston housing rates using ridge regression.

  1. Import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tabulate import tabulate

import warnings
warnings.filterwarnings('ignore')
  1. Load the dataset

The sklearn library comes with a few toy datasets. We will use the Boston Housing dataset to predict the housing rates using ridge regression.

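# Load dataset
# Note: load_boston was removed in scikit-learn 1.2, so this step needs an older version
from sklearn.datasets import load_boston

boston = load_boston()

# Features go into the dataframe first; the target (median house value) is added as the last column
df_boston = pd.DataFrame(data = boston.data, columns = boston.feature_names)
df_boston['MEDV'] = boston.target
df_boston.head()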

Output:

  1. Split dataset into train and test set

In this step, the feature and the target values are separated and divided into train and test datasets.

# Slice the dataframe into features and target
df_boston_features = df_boston.iloc[:, :-1]
df_boston_target = df_boston.iloc[:,-1:]

# Splitting dataset into training and testing data (fixed seed so the split is reproducible)
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(df_boston_features, df_boston_target, train_size = 0.8, random_state = 42)
  1. Building Ridge Regression model for alpha = 1.0

Here, the model is instantiated and then trained using the train set.

# Importing library
from sklearn.linear_model import Ridge

# Initialize the regressor with default alpha value (alpha = 1.0)
regressor_ridge = Ridge(alpha = 1.0)

# Fit the model to train set
regressor_ridge.fit(X_train, Y_train)
  1. Predicting the target values and comparing with actual values
# Predict values
ridge_pred = regressor_ridge.predict(X_test)

ridge_prediction_df = pd.DataFrame(data = ridge_pred, columns=['Predicted rates'])

ridge_prediction_df['Actual rates'] = Y_test.values

ridge_prediction_df.head()

Output:

  1. Retrieving the intercept
# Printing intercept
print(regressor_ridge.intercept_)

Output:

  1. Retrieving the slopes
# Printing coefficients
print(regressor_ridge.coef_)

Output:

These coefficients are for each of the features in the dataset.
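
To see how the intercept and the slopes combine into a prediction, here is a small sketch with made-up numbers. The intercept, coefficients and samples are hypothetical and are not the values fitted above.

```python
import numpy as np

# Hypothetical fitted values for a model with three features.
intercept = 2.0
coef = np.array([0.5, -1.0, 3.0])

# Two hypothetical samples, one row each.
X_new = np.array([
    [1.0, 2.0, 0.0],
    [4.0, 0.0, 1.0],
])

# A linear model's prediction is the intercept plus the weighted sum of the features.
predictions = intercept + X_new @ coef
print(predictions)  # values: 0.5 and 7.0
```

This is the same arithmetic that `regressor_ridge.predict` performs with `intercept_` and `coef_`.
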
  1. Visualizing the best fit line for alpha = 1.0
#Visualize the ridge regression on testing dataset
plt.figure(figsize=(12,6))
plt.scatter(Y_test, ridge_pred, color = 'r', alpha = 0.5)
plt.plot(Y_test, Y_test, color = 'r')
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()

Output:

  1. Building Ridge Regression model for alpha = 0.5 and predicting the values
# Initialize the regressor with a non-default alpha value (alpha = 0.5)
reg_ridge = Ridge(alpha =0.5)

# Fit the model to train set
reg_ridge.fit(X_train, Y_train)

# Predict values
ridge_pred_alpha_mod = reg_ridge.predict(X_test)

ridge_prediction_df_alpha_mod = pd.DataFrame(data = ridge_pred_alpha_mod, columns=['Predicted rates'])

ridge_prediction_df_alpha_mod['Actual rates'] = Y_test.values

ridge_prediction_df_alpha_mod.head()

Output:

  1. Visualizing the best fit line for alpha = 0.5
# Visualize the ridge regression (alpha = 0.5) on testing dataset
plt.figure(figsize=(12,6))
plt.scatter(Y_test, ridge_pred_alpha_mod, color = 'b', alpha = 0.5)
plt.plot(Y_test, Y_test, color = 'b')
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()

Output:

  1. Comparing the evaluation metrics for both ridge models
# Evaluating the prediction with metrics
# Importing the libraries for evaluating the metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Metrics for ridge regression model with alpha = 1.0
MSE = mean_squared_error(Y_test, ridge_pred)
MAE = mean_absolute_error(Y_test, ridge_pred)
RMSE = mean_squared_error(Y_test, ridge_pred, squared=False)

# Metrics for ridge regression model with alpha = 0.5
MSE_alpha_mod = mean_squared_error(Y_test, ridge_pred_alpha_mod)
MAE_alpha_mod = mean_absolute_error(Y_test, ridge_pred_alpha_mod)
RMSE_alpha_mod = mean_squared_error(Y_test, ridge_pred_alpha_mod, squared=False)

# Tabulating the values of both the models
ridge_metrics = ['Ridge', regressor_ridge.alpha, MSE, MAE, RMSE]
ridge_metrics_alpha_mod = ['Ridge', reg_ridge.alpha, MSE_alpha_mod, MAE_alpha_mod, RMSE_alpha_mod]

ridge_table = [ridge_metrics, ridge_metrics_alpha_mod]
print(tabulate(ridge_table, headers=('Model', 'Alpha', 'MSE', 'MAE', 'RMSE')))

Output:

In the table above, the error values of the ridge model with alpha 0.5 are lower than those of the ridge model with alpha 1.0. The model with an alpha value of 0.5 therefore predicts more accurately on this test set than the model with an alpha value of 1.0.
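For reference, the three metrics in the table can be computed by hand as in the sketch below. The helper `regression_metrics` and the sample numbers are assumptions for illustration.

```python
import numpy as np

def regression_metrics(actual, predicted):
    """Return MSE, MAE and RMSE for two equally long sequences of numbers."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mse = float(np.mean(errors ** 2))      # mean of squared errors
    mae = float(np.mean(np.abs(errors)))   # mean of absolute errors
    rmse = float(np.sqrt(mse))             # square root of MSE, in the target's units
    return {"MSE": mse, "MAE": mae, "RMSE": rmse}

# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

metrics = regression_metrics(actual, predicted)
print(metrics)  # MSE 1.5, MAE 1.0, RMSE about 1.2247
```

Lower values mean the predictions sit closer to the actual values. MSE and RMSE weigh large errors more heavily than MAE does.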

Lasso regression

Lasso regression is similar to ridge regression. The difference is that the lasso penalty uses the absolute values of the regression coefficients rather than their squares. Because of this, lasso performs feature selection as well as regularization: it keeps the features that matter for the model and shrinks the coefficients of the remaining features exactly to zero. When variables are highly collinear, lasso tends to keep one of them and reduce the others to zero. Lasso regression minimizes the following function,

$$\sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$

In lasso regression as well, numerous regression lines can be generated for varying values of alpha or lambda. After which, the best-fit line is picked based on accuracy, and the coefficient is penalized using the alpha value. Multiple regression lines for various alpha values are shown in the graph below.

Graph showing multiple regression lines for different alpha values
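The way an absolute-value penalty pushes coefficients exactly to zero can be sketched with coordinate descent and soft-thresholding. This is a simplified illustration, not scikit-learn's implementation: the helper names, the synthetic data and the omission of an intercept are assumptions. The objective is scaled as (1 / (2n)) * squared error + alpha * L1 penalty, which is the scaling scikit-learn documents for `Lasso`.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly 0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with feature j's current contribution removed.
            partial_residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ partial_residual / n_samples
            z = X[:, j] @ X[:, j] / n_samples
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Synthetic data in which only features 0 and 2 influence the target.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
true_w = np.array([4.0, 0.0, -3.0, 0.0, 0.0])
y = X @ true_w + rng.normal(scale=0.5, size=200)

w_lasso = lasso_coordinate_descent(X, y, alpha=0.5)
print("lasso coefficients:", w_lasso.round(3))
print("features kept:", np.flatnonzero(w_lasso))
```

The soft-threshold step returns exactly zero whenever a feature's correlation with the residual falls below alpha, which is why uninformative features drop out of the model.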

Let’s predict Boston housing rates using lasso regression.

  1. Import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tabulate import tabulate

import warnings
warnings.filterwarnings('ignore')
  1. Load the dataset

The sklearn library comes with a few toy datasets. We will use the Boston Housing dataset to predict the housing rates using lasso regression.

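# Load dataset
# Note: load_boston was removed in scikit-learn 1.2, so this step needs an older version
from sklearn.datasets import load_boston

boston = load_boston()

# Features go into the dataframe first; the target (median house value) is added as the last column
df_boston = pd.DataFrame(data = boston.data, columns = boston.feature_names)
df_boston['MEDV'] = boston.target
df_boston.head()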

Output:

  1. Split dataset into train and test set

In this step, the feature and the target values are separated and divided into train and test datasets.

# Slice the dataframe into features and target
df_boston_features = df_boston.iloc[:, :-1]
df_boston_target = df_boston.iloc[:,-1:]

# Splitting dataset into training and testing data (fixed seed so the split is reproducible)
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(df_boston_features, df_boston_target, train_size = 0.8, random_state = 42)
  1. Building Lasso Regression model for alpha = 1.0

Here, the model is instantiated and then trained using the train set.

# Import library
from sklearn.linear_model import Lasso

# Initialize the regressor with alpha = 1.0
regressor_lasso = Lasso(alpha = 1.0)

# Fit the regressor
regressor_lasso.fit(X_train, Y_train)
  1. Predicting the target values and comparing with actual values
# Predicting rates for testing data
lasso_pred = regressor_lasso.predict(X_test)

lasso_prediction_df = pd.DataFrame(data = lasso_pred, columns=['Predicted rates'])

lasso_prediction_df['Actual rates'] = Y_test.values

lasso_prediction_df.head()

Output:

  1. Retrieving the intercept
# Printing intercept
print(regressor_lasso.intercept_)

Output:

  1. Retrieving the slopes
# Printing coefficients
print(regressor_lasso.coef_)

Output:

These coefficients are for each of the features in the dataset. Here we can observe that the model has reduced the coefficients of a few of the least important features to zero and kept only the features that help the model's performance.
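
Reading off which features survived can be sketched in a few lines. The feature names and coefficient values here are hypothetical; with the objects from this walkthrough, the same idea would pair `X_train.columns` with `regressor_lasso.coef_`.

```python
# Hypothetical feature names and lasso coefficients, for illustration only.
feature_names = ["feature_a", "feature_b", "feature_c", "feature_d"]
coefficients = [0.0, -0.42, 0.0, 1.37]

# A feature is selected when its coefficient is non-zero.
selected = [name for name, c in zip(feature_names, coefficients) if c != 0.0]
dropped = [name for name, c in zip(feature_names, coefficients) if c == 0.0]

print("selected:", selected)  # ['feature_b', 'feature_d']
print("dropped:", dropped)    # ['feature_a', 'feature_c']
```
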
  1. Visualizing the best fit line for alpha = 1.0
# Visualize the lasso regression on testing dataset
plt.figure(figsize=(12,6))
plt.scatter(Y_test, lasso_pred, color = 'r', alpha = 0.5)
plt.plot(Y_test, Y_test, color = 'r')
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()

Output:

  1. Building Lasso Regression model for alpha = 0.5 and predicting the values
# Initialize the regressor with alpha = 0.5
regressor_lasso_alpha_mod = Lasso(alpha = 0.5)

# Fit the regressor
regressor_lasso_alpha_mod.fit(X_train, Y_train)

# Predicting rates for testing data
lasso_pred_alpha_mod = regressor_lasso_alpha_mod.predict(X_test)

lasso_prediction_df_alpha_mod = pd.DataFrame(data = lasso_pred_alpha_mod, columns=['Predicted rates'])

lasso_prediction_df_alpha_mod['Actual rates'] = Y_test.values

lasso_prediction_df_alpha_mod.head()

Output:

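  1. Visualizing the best fit line for alpha = 0.5
# Visualize the lasso regression (alpha = 0.5) on testing dataset
plt.figure(figsize=(12,6))
plt.scatter(Y_test, lasso_pred_alpha_mod, color = 'b', alpha = 0.5)
plt.plot(Y_test, Y_test, color = 'b')
plt.ylabel('Predicted House Rate')
plt.xlabel('Actual House Rate')
plt.show()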
  1. Comparing the evaluation metrics for both lasso models
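# Evaluating the prediction with metrics
# Importing the libraries for evaluating the metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error
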
# Metrics for lasso regression model with alpha = 1.0
lasso_MSE = mean_squared_error(Y_test, lasso_pred)
lasso_MAE = mean_absolute_error(Y_test, lasso_pred)
lasso_RMSE = mean_squared_error(Y_test, lasso_pred, squared=False)

# Metrics for lasso regression model with alpha = 0.5
lasso_MSE_alpha_mod = mean_squared_error(Y_test, lasso_pred_alpha_mod)
lasso_MAE_alpha_mod = mean_absolute_error(Y_test, lasso_pred_alpha_mod)
lasso_RMSE_alpha_mod = mean_squared_error(Y_test, lasso_pred_alpha_mod, squared=False)

# Tabulating the values of both the models
lasso_metrics = ['Lasso', regressor_lasso.alpha, lasso_MSE, lasso_MAE, lasso_RMSE]
lasso_metrics_alpha_mod = ['Lasso', regressor_lasso_alpha_mod.alpha, lasso_MSE_alpha_mod, lasso_MAE_alpha_mod, lasso_RMSE_alpha_mod]

lasso_table = [lasso_metrics, lasso_metrics_alpha_mod]

print(tabulate(lasso_table, headers=('Model', 'Alpha', 'MSE', 'MAE', 'RMSE')))

Output:

In the table above, the error values of the lasso model with alpha 0.5 are lower than those of the lasso model with alpha 1.0. The model with an alpha value of 0.5 therefore predicts more accurately on this test set than the model with an alpha value of 1.0.

