Ridge regression for an approximate solution of linear equations
In the previous example, I showed you how to use linear regression to find the relation between two data sets. In this example, I will show you a slightly more complicated case, where we have two sets of input data and one output, and, furthermore, the two sets of input data are highly correlated (the correlation coefficient is close to 1 in absolute value).
In this case we have an unlimited number of solutions and we need to choose the best one.
Task for ridge regression
We have a data relation of the form Y = W1*X1 + W2*X2 + W0, where X1 and X2 are correlated.
Here is the “real” data:
X1 | 0 | 1 | 2 | 3 |
X2 | 3 | 2 | 1 | 0 |
Y | 0 | 1 | 0 | 3 |
It is easy to see that X2 = 3 - X1, so the correlation between X1 and X2 is exactly -1 and the problem has an unlimited number of solutions.
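The correlation claim can be checked numerically. This is a minimal sketch with NumPy; the variable names `x1`, `x2` and `r` are my own and are not part of the original example.

```python
import numpy as np

x1 = np.array([0, 1, 2, 3])
x2 = np.array([3, 2, 1, 0])

# Pearson correlation coefficient between the two inputs
r = np.corrcoef(x1, x2)[0, 1]
print("correlation:", r)

# The inputs are tied by an exact linear relation: x2 = 3 - x1
print("x2 == 3 - x1:", np.array_equal(x2, 3 - x1))
```

Because one input is an exact linear function of the other, the second input adds no new information to the model.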
Traditional way of solving
Speaking mathematically, we need to find the minimum of the function
L(W2, W1, W0) = Σ(W2*x2_i + W1*x1_i + W0 - y_i)^2
To find the minimum, we need to solve three equations in partial derivatives
∂L/∂W2 = 0; ∂L/∂W1 = 0; ∂L/∂W0 = 0
but this system has an unlimited number of solutions, so we need some mechanism to choose between them.
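To see why the solution is not unique, here is a small sketch. It takes one least-squares solution and shifts it along a direction that does not change any prediction. The names `loss`, `null_dir`, `w_a` and `w_b` are illustrative, and the weights are stored in the order (W0, W1, W2).

```python
import numpy as np

# Columns: 1 (for W0), x1, x2
X = np.array([[1, 0, 3], [1, 1, 2], [1, 2, 1], [1, 3, 0]], dtype=float)
y = np.array([0, 1, 0, 3], dtype=float)

def loss(w):
    """Sum of squared errors L for weights w = (W0, W1, W2)."""
    return np.sum((X @ w - y) ** 2)

# One particular least-squares solution
w_a, *_ = np.linalg.lstsq(X, y, rcond=None)

# Since x1 + x2 - 3 = 0 for every row, adding any multiple of (-3, 1, 1)
# to the weights leaves all predictions unchanged.
null_dir = np.array([-3.0, 1.0, 1.0])
w_b = w_a + 5.0 * null_dir

print("X @ null_dir:", X @ null_dir)
print("loss(w_a):", loss(w_a))
print("loss(w_b):", loss(w_b))
```

Both weight vectors give the same value of L, and the multiplier 5.0 can be replaced by any other number.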
We can introduce a requirement that the coefficients W2, W1 and W0 should be as small as possible in absolute value. To do this, we introduce a new function:
R = L(W2, W1, W0) + C*(W2^2 + W1^2 + W0^2)
or
R = Σ(W2*x2_i + W1*x1_i + W0 - y_i)^2 + C*(W2^2 + W1^2 + W0^2)
Now we need to find the minimum of this new function. It is easy to see that as C approaches 0, we get back our previous equation, while any C > 0 makes the minimum unique. C = 1 is a reasonable value to start with.
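The role of C can be illustrated with a short sketch. It assumes the standard closed form of the ridge minimum, (XᵀX + C·I)·W = Xᵀy, and compares the result with the minimum-norm least-squares solution from `np.linalg.pinv`. The helper `ridge_weights` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Columns: 1 (for W0), x1, x2
X = np.array([[1, 0, 3], [1, 1, 2], [1, 2, 1], [1, 3, 0]], dtype=float)
y = np.array([0, 1, 0, 3], dtype=float)

def ridge_weights(C):
    """Weights (W0, W1, W2) that minimize R for a given C > 0."""
    A = X.T @ X + C * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# Least-squares solution with the smallest norm of the weights
w_min_norm = np.linalg.pinv(X) @ y

for C in (1.0, 0.1, 0.001):
    w = ridge_weights(C)
    print("C =", C, "weights:", w,
          "distance to min-norm solution:", np.linalg.norm(w - w_min_norm))
```

As C gets smaller, the ridge weights move towards the minimum-norm solution of the original problem. A larger C pulls the weights closer to zero at the cost of a slightly worse fit.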
Now we need to solve a system of three equations
∂R/∂W2 = 0;
∂R/∂W1 = 0;
∂R/∂W0 = 0;
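For this small example the three equations can be solved directly. Each derivative is linear in the weights, so the system can be written as (XᵀX + C·I)·W = Xᵀy, where X has a column of ones for W0. The following is a sketch with NumPy; the matrix names `A` and `b` are my own.

```python
import numpy as np

C = 1.0
# Columns: 1 (for W0), x1, x2
X = np.array([[1, 0, 3], [1, 1, 2], [1, 2, 1], [1, 3, 0]], dtype=float)
y = np.array([0, 1, 0, 3], dtype=float)

# dR/dWj = 2 * sum((prediction_i - y_i) * x_ji) + 2 * C * Wj = 0
# collected for all j gives the linear system A @ W = b
A = X.T @ X + C * np.eye(3)
b = X.T @ y

W0, W1, W2 = np.linalg.solve(A, b)
print(W0, W1, W2)  # approximately 0.174 0.625 -0.103
```

Note that W0 is penalized together with W1 and W2 here, exactly as in the function R above.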
Solving ridge regression with scikit-learn
This problem is already implemented in the scikit-learn Python library.
from sklearn.linear_model import Ridge
import numpy as np
C = 1.0
X = np.array([[0, 3], [1, 2], [2, 1], [3, 0]])
y = np.array([0, 1, 0, 3])
# prepend a column of ones, so that W0 is fitted and penalized like W1 and W2
X = np.insert(X, 0, values=1, axis=1)
# fit_intercept=False: the intercept is already represented by the column of ones
model = Ridge(alpha=C, fit_intercept=False)
model.fit(X, y)
print("coef:", model.coef_)  # coef: [ 0.17391304 0.62450593 -0.1027668 ]
The coefficients are returned in column order (ones, X1, X2), so we need to read them in the correct order:
W0 = 0.17; W1 = 0.62; W2 = -0.10
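As a quick check, the fitted coefficients can be put back into Y = W1*X1 + W2*X2 + W0 and compared with the original data. This is only a sketch; `y_hat` and `rss` are names I introduce here, and the coefficient values are copied from the output above.

```python
import numpy as np

# Coefficients copied from the scikit-learn output
W0, W1, W2 = 0.17391304, 0.62450593, -0.1027668

x1 = np.array([0, 1, 2, 3])
x2 = np.array([3, 2, 1, 0])
y = np.array([0, 1, 0, 3])

# Predictions of the fitted model for the training inputs
y_hat = W0 + W1 * x1 + W2 * x2
print("predicted:", np.round(y_hat, 2))
print("actual:   ", y)

# Sum of squared errors of the ridge solution
rss = np.sum((y_hat - y) ** 2)
print("sum of squared errors:", rss)
```

The fit is not exact, which is expected: four points with this Y cannot lie on one plane when X2 is fully determined by X1.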
Published: 2022-06-18 15:23:34
Updated: 2022-06-18 15:25:57