Linear regression for approximate solution of linear equation
This is a practical task that comes up in many real experiments. Suppose we have two linearly related values and need to find the relation between them. From the experiment we know only a few pairs of values, and these pairs are not measured precisely but with some error. We need to recover the original linear relation between the two values. For example, consider these data from the equation y = k*x + b:
x: 0, 1, 2, 3
y: 0, 1, 0, 3
As you can see, these data are not ideal and can't be fitted exactly by a single line, but we still need to find the best one. The easiest way is to minimise the sum of squared distances between the line and the data points.
Manual solution of approximate linear equation
So, we need to find W_{1} and W_{0} in the equation y' = W_{1}*x + W_{0} with the minimal squared distance between y' from our calculations and y from the experiment. Mathematically speaking, we need to find the minimum of the function
L(W_{1}, W_{0}) = Σ(W_{1}*x_{i} + W_{0} - y_{i})^{2}
To find the minimum, we need to solve two equations in partial derivatives:
∂L/∂W_{1} = 0; ∂L/∂W_{0} = 0
or
∂/∂W_{1} Σ(W_{1}*x_{i} + W_{0} - y_{i})^{2} = 0; ∂/∂W_{0} Σ(W_{1}*x_{i} + W_{0} - y_{i})^{2} = 0
To understand all the equations, let's do it manually here, substituting our table data directly:
L = (W_{0})^{2} + (W_{1} + W_{0} - 1)^{2} + (2*W_{1} + W_{0})^{2} + (3*W_{1} + W_{0} - 3)^{2} = 14*W_{1}^{2} + 4*W_{0}^{2} + 10 + 12*W_{1}*W_{0} - 20*W_{1} - 8*W_{0}
Now we can take the partial derivatives:
∂L/∂W_{1} = 28*W_{1} + 12*W_{0} - 20 = 0
∂L/∂W_{0} = 12*W_{1} + 8*W_{0} - 8 = 0
and solve this system of equations:
W_{1} = 0.8; W_{0} = -0.2
You can read how to solve a system of linear equations here.
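The two derivative equations above form a 2x2 linear system, so the manual result can be checked numerically. The following is a minimal sketch, assuming the four data points x = 0, 1, 2, 3 and y = 0, 1, 0, 3; the names M and rhs are illustrative and not part of the original article.

```python
import numpy as np

# Data points used in the manual derivation
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, 3.0])
n = len(x)

# Setting dL/dW1 = 0 and dL/dW0 = 0 gives:
#   2*sum(x^2)*W1 + 2*sum(x)*W0 = 2*sum(x*y)
#   2*sum(x)*W1   + 2*n*W0      = 2*sum(y)
M = np.array([[2 * np.sum(x**2), 2 * np.sum(x)],
              [2 * np.sum(x),    2 * n]])
rhs = np.array([2 * np.sum(x * y), 2 * np.sum(y)])

# Solve the 2x2 system for the slope W1 and the intercept W0
W1, W0 = np.linalg.solve(M, rhs)
print(W1, W0)  # approximately 0.8 and -0.2
```

Here M reproduces the coefficients 28, 12, 12, 8 and rhs reproduces the constants 20 and 8 from the hand calculation.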
Now that we understand the mathematics behind this procedure, we can use some Python tools to solve the task.
NumPy method of solving approximate systems: numpy.linalg.lstsq
We need to rewrite our line equation y = W_{1}*x + W_{0} as y = Ap, where A = [[x 1]] (a column of x values next to a column of ones) and p = [[W_{1}], [W_{0}]], and then solve it for p with numpy.linalg.lstsq.
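The original listing for this step is not preserved here, so the following is only a sketch of how the y = Ap form can be passed to numpy.linalg.lstsq, assuming the same four data points as in the manual solution. The variable names are illustrative.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, 3.0])

# Matrix A: first column is x, second column is ones (multiplies W0)
A = np.vstack([x, np.ones(len(x))]).T

# rcond=None selects NumPy's current default cutoff for small singular values
p, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
W1, W0 = p
print(W1, W0)  # approximately 0.8 and -0.2
```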
The result of numpy.linalg.lstsq is very close to our manual solution.
Sklearn method of solving approximate systems
Basically, sklearn does the same procedure. The only difference is that you need to prepare the data in a slightly different format. It is better to prepare the data as NumPy arrays again for easy manipulation. For the linear_model fit procedure, you need to prepare the data in the following format: x = [[0],[1],[2],[3]], y = [0,1,0,3]. The easiest way is to reshape x with parameters (-1, 1).
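As a sketch of the data preparation described above, the snippet below reshapes x into a column and fits sklearn's LinearRegression. It assumes scikit-learn is installed and uses the same four data points; it is not the article's original listing.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, 3.0])

# fit() expects a 2-D feature array of shape (n_samples, n_features);
# reshape(-1, 1) turns [0, 1, 2, 3] into [[0], [1], [2], [3]]
X = x.reshape(-1, 1)

model = LinearRegression()
model.fit(X, y)

W1 = model.coef_[0]      # slope
W0 = model.intercept_    # intercept
print(W1, W0)  # approximately 0.8 and -0.2
```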
Performance comparison between sklearn and NumPy
To measure the time difference between these two functions, I created a dataset of 5000 x, y points (only 500 are shown in the picture) and solved it 10000 times with NumPy and with sklearn.
The result is pretty predictable: NumPy is almost 3 times faster than sklearn.
numpy time = 0:00:02.199247
sklearn time = 0:00:06.124603
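A comparison of this kind can be sketched as follows. The benchmark code from the article is not preserved, so everything here is an assumption: the synthetic noisy line, the random seed, building the input arrays outside the timed loops, and the reduced repeat count (100 instead of 10000) to keep the run short. Timings depend on the machine and library versions.

```python
import time
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)      # fixed seed, an arbitrary choice
n_points, n_repeats = 5000, 100     # the article used 10000 repeats

# Hypothetical noisy line; slope 0.8 and intercept -0.2 are placeholders
x = rng.uniform(0.0, 10.0, n_points)
y = 0.8 * x - 0.2 + rng.normal(0.0, 1.0, n_points)

A = np.vstack([x, np.ones(n_points)]).T  # input format for lstsq
X = x.reshape(-1, 1)                     # input format for sklearn

# Time repeated least-squares solves with NumPy
start = time.perf_counter()
for _ in range(n_repeats):
    p = np.linalg.lstsq(A, y, rcond=None)[0]
numpy_time = time.perf_counter() - start

# Time repeated fits with sklearn
model = LinearRegression()
start = time.perf_counter()
for _ in range(n_repeats):
    model.fit(X, y)
sklearn_time = time.perf_counter() - start

print("numpy time =", numpy_time)
print("sklearn time =", sklearn_time)
```

Both approaches should return the same slope and intercept up to floating-point error, so only the elapsed time differs.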



© 2020 MyCoding.uk My blog about coding and further learning. This blog was written in pure Perl, and the frontend output is rendered with TemplateToolkit.