Polynomial fit with Numpy polyfit
Why we need Polynomial fit?

Suppose you have some experimental dependency of Y on X in a reasonably limited range, and this function is not indefinitely periodic. In that case, you can model, or approximate, it with a polynomial function with the general formula:

\[P(x) = \sum_{k=0}^{n} a_k x^k = a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \ldots + a_1 x + a_0\]

Strictly speaking, every continuous function on a closed interval can be approximated by a polynomial to any accuracy, but for some functions the coefficients can grow very large. In this work, we will consider a well-behaved function: continuous, defined analytically, and taken only on a reasonably short interval. I will show how to approximate this function with a polynomial function in Numpy. For the test, I will use the following function:

\[y = \begin{cases} x, & 0 \leq x < 1 \\ \frac{1}{x}, & 1 \leq x \leq 5 \end{cases}\]

As you can see, this function is piecewise and continuous, but its derivatives are not: at x = 1 the first derivative jumps from 1 to -1, and the second derivative is discontinuous there as well, which can be unsuitable in some cases. Therefore, we may want to model it as a polynomial function. Alternatively, we may have a set of experimental values and need to approximate them with some function for further simple calculations. In this case, polynomial approximation can also be reasonable.

To estimate the quality of this approximation we can calculate the RMSD (Root Mean Square Deviation), a standard value for estimating the difference between models. It can be calculated by the equation:

\[\text{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}\]

where \(y_i\) are the original values, \(\hat{y}_i\) are the values given by the model, and N is the number of points.

Numpy tools for polynomial fit

Numpy offers a range of different methods which can be used for manipulations with polynomial data. The most important are polyfit, polyval and poly1d.

Numpy polyval

The parameters are very simple: numpy.polyval(p, x) calculates the value of

\[p_0 x^{N-1} + p_1 x^{N-2} + \ldots + p_{N-2} x + p_{N-1}\]

for every x given, where N is the number of coefficients in p. Note that the coefficients are ordered from the highest power to the lowest, so \(p_0\) corresponds to \(a_n\) in the general formula above.
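To make the coefficient order concrete, here is a minimal sketch. The polynomial 2x^2 - 3x + 1 and the sample points are arbitrary choices for illustration, not values from this article.

```python
import numpy as np

# Coefficients are listed from the highest power down:
# p = [2, -3, 1] represents 2*x^2 - 3*x + 1.
p = [2.0, -3.0, 1.0]
x = np.array([0.0, 1.0, 2.0, 3.0])

# polyval evaluates the polynomial at every element of x in one call
values = np.polyval(p, x)

# The same evaluation written out by hand, for comparison
by_hand = 2.0 * x**2 - 3.0 * x + 1.0

print(values)    # the values 1, 0, 3 and 10
print(by_hand)   # identical to the line above
```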
Generally speaking, this value is easy to calculate without polyval, but when you already have the polynomial coefficients, this function is faster than an explicit Python loop.

Numpy poly1d

poly1d is a class in NumPy that creates a polynomial object. Once you create a poly1d object, you can easily evaluate it, perform arithmetic operations, and even access its derivative or integral. Generally speaking, if you are planning to do some manipulations with a polynomial, it is better and more convenient to use this class. For example:
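The article's own listing is not reproduced here, so the following is only a small sketch of typical poly1d operations. The polynomial x^2 - 3x + 2 is an arbitrary choice for illustration.

```python
import numpy as np

# poly1d takes coefficients from the highest power down: x^2 - 3x + 2
p = np.poly1d([1, -3, 2])

print(p(3))        # evaluate at x = 3, which gives 2
print(p.roots)     # roots of the polynomial (1 and 2, in some order)
print(p.deriv())   # derivative: 2x - 3
print(p.integ())   # antiderivative with a zero integration constant

# Arithmetic works directly on poly1d objects: multiply by (x + 1)
q = p * np.poly1d([1, 1])
print(q.coeffs)    # coefficients of x^3 - 2x^2 - x + 2
```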
As you can see, this class is very useful for many manipulations of polynomial functions.

Numpy polyfit

polyfit performs a least-squares polynomial fit. Its basic usage is numpy.polyfit(x, y, deg), where x and y are the lists of X and Y coordinates of the points to be fitted with a polynomial function of power deg. The function returns the coefficients ordered from the highest power to the lowest. An example of its usage can be found below.

Polynomial fit for the given function

Simple fit of the polynomial function

Let's consider the simple usage of the polynomial fit function in the following code. The comments in the code describe the idea of this usage.
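The original listing is not available in this copy of the article, so what follows is a minimal sketch of such a fit. The helper name target, the number of sample points (100) and the power (4) are assumptions, so the printed numbers may differ slightly from any figures quoted in the text.

```python
import numpy as np

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    # np.maximum keeps the unused branch from dividing by zero at x = 0
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

deg = 4                              # assumed polynomial power
x = np.linspace(0.0, 5.0, 100)       # assumed sampling of the interval
y = target(x)

coeffs = np.polyfit(x, y, deg)       # least-squares fit, highest power first
y_fit = np.polyval(coeffs, x)        # evaluate the fitted polynomial

# Root Mean Square Deviation between the data and the model
rmsd = np.sqrt(np.mean((y - y_fit) ** 2))

# Print the polynomial as a sum of terms, then the fit quality
terms = [f"{c:.2f}*x^{deg - i}" for i, c in enumerate(coeffs)]
print("y = " + " + ".join(terms))
print(f"RMSD={rmsd}")
```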
This code prints the polynomial function and the RMSD: y = -0.03*x^4 + 0.35*x^3 + -1.38*x^2 + 1.93*x^1 + -0.09*x^0 and RMSD=0.059084821745851446, which is accurate enough for further work. As you can see from the code above, the general usage of this function is simple and straightforward. To study the function and its behaviour, it is more informative to present everything as a graph, which is done in the next section.

Graphical representation

Only a small change to the code is needed: we have to add graphical output. The first part of the program is almost identical to the previous code. The only difference is that one more array is created to evaluate the polynomial function point by point. Ideally, these points should differ from the experimental set so that any difference is easier to see.
Once everything is calculated, we can plot the data as two graphs:
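The article's listings for this section are not reproduced here, so the following is only a sketch of both parts of the program. Which two graphs were drawn originally is not known. As an assumption, the first panel shows the sampled function with the fitted polynomial and the second shows the residual on a denser grid. The helper target, the power and the point counts are likewise illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

deg = 4                                  # assumed polynomial power
x = np.linspace(0.0, 5.0, 50)            # "experimental" points
y = target(x)
coeffs = np.polyfit(x, y, deg)

# Denser grid, different from the fitted points, for drawing the polynomial
x_dense = np.linspace(0.0, 5.0, 501)
y_poly = np.polyval(coeffs, x_dense)

fig, (ax_fit, ax_err) = plt.subplots(2, 1, sharex=True, figsize=(7, 6))

# Panel 1: sampled function and fitted polynomial
ax_fit.plot(x, y, "o", ms=3, label="target function")
ax_fit.plot(x_dense, y_poly, "-", label=f"polynomial, power {deg}")
ax_fit.set_ylabel("y")
ax_fit.legend()

# Panel 2: residual between the target and the polynomial
ax_err.plot(x_dense, target(x_dense) - y_poly)
ax_err.axhline(0.0, color="gray", lw=0.5)
ax_err.set_xlabel("x")
ax_err.set_ylabel("residual")

plt.show()
```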
Interactive study of polyfit

It is possible to plot and calculate the data for different powers one at a time, but it can be very useful to observe dynamically how changing the polynomial power affects the quality of the solution. Let's do it with matplotlib widgets. The idea of the code is the same, but in this case we calculate the coefficients for all powers from 1 to 100 and then use a widget to plot the required polynomial function. First of all, we need to define all the functions and remember to import the widgets library. Also, if you want to use this in a Jupyter notebook, you need the appropriate magic command: %matplotlib widget.

First comes the function that calculates our target function, which we need to approximate.
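A possible vectorised implementation of the test function defined earlier is sketched below. The name target is an assumption, since the original listing is not available.

```python
import numpy as np

def target(x):
    """Return x for 0 <= x < 1 and 1/x for 1 <= x <= 5, element-wise."""
    x = np.asarray(x, dtype=float)
    # np.where evaluates both branches, so np.maximum keeps the 1/x branch
    # from dividing by zero where x < 1 (those values are discarded anyway)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

# Quick check at a few points on both sides of x = 1
print(target([0.0, 0.5, 1.0, 2.0, 5.0]))   # the values 0, 0.5, 1, 0.5 and 0.2
```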
Then comes the function for plotting our graphs.
Next we need a function that handles the widget events. It catches any change of the slider position and calls the plot function with the new parameters.
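The two functions described above could look roughly like the sketch below. It uses matplotlib.widgets.Slider, and the names plot_fit, on_change and MAX_DEG are hypothetical. For brevity the coefficients are computed on demand here, whereas the article precomputes them for all powers.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

x = np.linspace(0.0, 5.0, 200)
y = target(x)
MAX_DEG = 15                             # assumed upper limit for this sketch

fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.2)          # leave room for the slider
ax.plot(x, y, label="target function")
(fit_line,) = ax.plot(x, np.polyval(np.polyfit(x, y, 1), x), label="fit")
ax.legend()

def plot_fit(deg):
    """Redraw the fitted curve for the given polynomial power."""
    coeffs = np.polyfit(x, y, deg)
    fit_line.set_ydata(np.polyval(coeffs, x))
    ax.set_title(f"Polynomial power {deg}")
    fig.canvas.draw_idle()

def on_change(value):
    """Slider callback: the slider passes its new value as a float."""
    plot_fit(int(value))

# Keep a reference to the slider, otherwise it may stop responding
slider_ax = fig.add_axes([0.15, 0.05, 0.7, 0.04])
slider = Slider(slider_ax, "power", 1, MAX_DEG, valinit=1, valstep=1)
slider.on_changed(on_change)

plt.show()
```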
Now we need to prepare all the data.
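A minimal sketch of this preparation step follows. The sampling (200 points) and the names coeffs_by_deg and MAX_DEG are assumptions, while the range of powers from 1 to 100 follows the text above.

```python
import numpy as np

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

x = np.linspace(0.0, 5.0, 200)       # assumed sampling of the interval
y = target(x)

MAX_DEG = 100                        # powers from 1 to 100, as in the text

# Fit once per power and keep the coefficients for later plotting
coeffs_by_deg = {deg: np.polyfit(x, y, deg) for deg in range(1, MAX_DEG + 1)}

print(len(coeffs_by_deg), "polynomials fitted")
```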
When you run this code, you will get a warning (not an error):
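The original console output is not reproduced here. One way to observe the warning programmatically is sketched below, with a hypothetical helper fit_and_check. The warning class is matched by name because it lives in different namespaces depending on the NumPy version.

```python
import warnings
import numpy as np

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

def fit_and_check(x, y, deg):
    """Fit a polynomial and report whether polyfit issued a RankWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")      # record every warning
        coeffs = np.polyfit(x, y, deg)
    messages = [str(w.message) for w in caught
                if w.category.__name__ == "RankWarning"]
    return coeffs, messages

x = np.linspace(0.0, 5.0, 200)
y = target(x)

for deg in (4, 60):
    _, messages = fit_and_check(x, y, deg)
    print(f"power {deg}: {messages if messages else 'no warning'}")
```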
The warning message RankWarning: Polyfit may be poorly conditioned is triggered when NumPy's polyfit function fits a polynomial to data points and the degree of the polynomial is too high relative to the data provided. Poorly conditioned means that the system of equations generated to fit the polynomial is numerically unstable, which leads to large errors in the polynomial coefficients. This happens because fitting a high-degree polynomial runs into the limits of floating-point precision: the terms of the system differ hugely in magnitude, which causes instability. In practice, a poorly conditioned fit implies that the polynomial may not accurately represent the underlying data and might exhibit extreme oscillations or overfitting.

Overfitting

A high-degree polynomial will try to pass through or near every data point, which may cause the curve to oscillate wildly, especially if the data is noisy or sparse.

Numerical instability

Higher-degree polynomials involve very large and very small powers of x, which can cause numerical precision problems. Small rounding errors in the coefficients can lead to large deviations in the results.

Now we can plot our results.
This will produce the widget:
To check the RMSD it is possible to use a very simple piece of code:
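The article's listing is not available, so this is a sketch under assumptions: powers 1 to 30, 200 sample points, and a hypothetical helper rmsd. It prints a few values instead of plotting them.

```python
import warnings
import numpy as np

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

def rmsd(y_true, y_model):
    """Root Mean Square Deviation between two equally sized arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_model, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

x = np.linspace(0.0, 5.0, 200)
y = target(x)

rmsd_by_deg = {}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")      # silence warnings for high powers
    for deg in range(1, 31):
        coeffs = np.polyfit(x, y, deg)
        rmsd_by_deg[deg] = rmsd(y, np.polyval(coeffs, x))

for deg in (1, 5, 10, 15, 30):
    print(f"power {deg:2d}: RMSD = {rmsd_by_deg[deg]:.6f}")
```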
which will produce
As you can see, the RMSD drops dramatically in the range of powers from 1 to 15 and then starts to oscillate. This oscillation is related to the alternation between even-like and odd-like behaviour of the polynomial as the power changes. It is better to stop at a maximal power of 15. Let's plot the largest coefficient by absolute value for each power.
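A sketch of how these values could be collected is given below. The range of powers (1 to 30), the sampling and the name max_coeff_by_deg are assumptions, and the values are printed rather than plotted.

```python
import warnings
import numpy as np

def target(x):
    """Piecewise test function: x on [0, 1), 1/x on [1, 5]."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 1.0, x, 1.0 / np.maximum(x, 1.0))

x = np.linspace(0.0, 5.0, 200)
y = target(x)

max_coeff_by_deg = {}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")      # silence warnings for high powers
    for deg in range(1, 31):
        coeffs = np.polyfit(x, y, deg)
        # Largest coefficient of this fit by absolute value
        max_coeff_by_deg[deg] = float(np.max(np.abs(coeffs)))

for deg in (1, 5, 10, 15, 20, 30):
    print(f"power {deg:2d}: max |coefficient| = {max_coeff_by_deg[deg]:.3e}")
```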
As you can see, up to a power of 15 the coefficients stay small, and then they start to increase significantly, which indicates overfitting. The oscillation of these large values indicates overfitting as well.
© 2020 MyCoding.uk - My blog about coding and further learning. This blog was written with pure Perl and front-end output was performed with TemplateToolkit.