Cross Validation: Calculating R² for LOOCV

Natalie Olivo
Published in codeburst
4 min read · Nov 28, 2017

Hello! This blog post aims to solve a cross validation woe: calculating metrics (R² in particular) after Leave One Out Cross Validation (LOOCV), something that cannot be done using the typical

sklearn.model_selection.cross_val_score(model, X, y, scoring = 'r2')

Very brief primer on cross validation and LOOCV:

Leave One Out Cross Validation or LOOCV is similar to k-fold cross validation, but k = n. If that explanation isn’t clear, allow me to explain further.

The goal of cross validation is to get a generalized score of your model. The reason for this generalization is to, hopefully, improve your model’s effectiveness in predicting on future data inputs.

You won’t always have access to new data, so cross validation lets you work with the data you already have: you hold out a portion of it to test on (the ‘test’ data) while using the rest to build your model (the ‘train’ data). From there you can calculate an R², MSE, accuracy score, or whatever metrics make the most sense for your model. See the complete list here.

There are a number of ways to cross validate. Leave One Out Cross Validation uses all but one data point to ‘train’ your model and aims to predict that one held out data point. This process is done n times.
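To make this concrete, here is a tiny illustration (not from the original post) of how scikit-learn's LeaveOneOut splitter produces one train/test split per observation, each holding out a single point:

```python
# Small illustration: LeaveOneOut yields n splits, each with one held-out point.
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(10).reshape(5, 2)  # 5 observations, 2 features
splits = list(LeaveOneOut().split(X))

print(len(splits))   # n = 5 splits, one per observation
print(splits[0][1])  # the first test fold holds exactly one index: [0]
```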

Typically, to calculate the cross validated R² for an iterative cross validation process, you calculate an R² score for each iteration and take the average. In the code below we will see how cross_val_score does this for LOOCV. (Hint: it doesn’t, exactly.)

Cross validation woe: What happens when we use cross_val_score?

Relevant Libraries:

import numpy as np 
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score, LeaveOneOut
from sklearn import metrics

Data used: admissions.csv, found on my GitHub (or via a quick search).
Model used: Linear Regression

admit = pd.read_csv('assets/admissions.csv')
admit = admit.dropna()
Xr = admit[['admit', 'gre', 'prestige']]  # 'r' stands for 'regression'
yr = admit['gpa']
X_array = np.array(Xr)
y_array = np.array(yr)

Here’s the documentation for cross_val_score.

The cv parameter sets the number of cross validation folds, so to do LOOCV, we set cv equal to the total number of observations (n).

In our case, n is 397.
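The code that produced the output below was originally shown as a screenshot; here is a hedged reconstruction of the call, with small synthetic data standing in for X_array and y_array from admissions.csv. Depending on your sklearn version, each per-fold score comes back as 0 or NaN:

```python
# Hedged reconstruction: LOOCV via cross_val_score (synthetic stand-in data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.3, size=50)

# cv = n -> one fold per observation; each fold's R^2 is computed on one point
scores = cross_val_score(LinearRegression(), X, y, cv=len(y), scoring='r2')
print("Cross-validated scores:", scores)
```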

[output]
Cross-validated scores: [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Average: 0.0
Variance: 0.0

Wow, all zeros. Is this what we expected? Actually, yes, because each iteration aims to predict only one point.

For a more detailed look, check the source code of sklearn.metrics.r2_score, particularly lines 540–542, where the function defaults the denominator of the R² (the total variance of the true y values) to 0 when there is only one predicted y. With a single point per fold there is no variance to explain, so r2_score returns 0 for every iteration.
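A quick way to see this degenerate case (a hypothetical demo, not from the original post) is to call r2_score on a single true/predicted pair; older sklearn versions return 0, while newer ones warn and return NaN:

```python
# Demo: R^2 is undefined for a single observation.
import math
import warnings

from sklearn.metrics import r2_score

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # newer sklearn warns about < 2 samples
    score = r2_score([3.0], [2.5])   # one observation -> total variance is 0

print(score)  # 0.0 on older sklearn versions, nan on newer ones
```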

So how can LOOCV be evaluated? After all, it’s one of the least biased, highest-variance, and most computationally expensive cross validation methods.

Proposed Solution:

How can we calculate R² for LOOCV? I propose collecting every held-out prediction into a single array and scoring that array against all of the true y values at once.
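The implementation was originally shown as a screenshot; the sketch below reproduces the idea with synthetic stand-in data (the numbers in the output below come from the real admissions data, so they won’t match). The key move: fill one y_pred array across all n folds, then score it once.

```python
# Sketch of the proposed fix: score all held-out LOOCV predictions together.
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.3, size=50)

loo = LeaveOneOut()
y_pred = np.empty_like(y)
for train_idx, test_idx in loo.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    y_pred[test_idx] = model.predict(X[test_idx])

# One R^2 / MSE over all n held-out predictions, instead of n per-fold scores
r2 = metrics.r2_score(y, y_pred)
mse = metrics.mean_squared_error(y, y_pred)
print(f"R^2: {r2:.5%}, MSE: {mse:.5f}")
```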

[output]
Leave One Out Cross Validation
R^2: 14.08407%, MSE: 0.12389

Whew, that is much more in line with the R² returned by other cross validation methods (train/test split cross validation gives roughly 13–15%, depending on the random state). A 14% R² is not awesome; linear regression is not the best model for the admissions data.

Happy cross validation!

Thank you to Jingfei Cai and Evann Smith for their feedback and reviews of this blog post.

Link to my github, the code behind this post in particular: https://github.com/nmolivo/Blogs/blob/master/001_LOOCV/blog_001-LOOCV.ipynb

If you found my ‘Brief Primer on Cross Validation and LOOCV’ too brief, please check out the following resources:
https://www.cs.cmu.edu/~schneide/tut5/node42.html
https://en.wikipedia.org/wiki/Cross-validation_(statistics)

Do you have other methods or resources to calculate metrics for LOOCV? Please share them in the comments. I was surprised how few resources I could find on how to perform this particular task in Python, so I’d love to hear about any I may have missed!
