Preface
Goal: Solving third order polynomial equation, from data modelling in excel to python.
This is a practical problem solving in math. To be specific, finding polynomial coefficient, using known value. Instead of theory, we use spreadsheet calc/excel, and also coding such as python.
The difference with previous interpolation is,
the approach of getting the matrix solution.
Since inverse can only work with square nxn
matrix.
We need to alter the equation a bit with transpose.
Instead of linset
,
we are going to calculate the solution manually.
The step by step approach is
required to get deeper understanding,
of how things works.
Then we are going to use this matrix approach
to replace polyfit
with manual calculation.
Even when you have library to solve the equation,
fundamental understanding is still required,
before stepping-up to further knowledge.
Shorter
In spreadsheet, there is a formula called linset
,
that you can just, use it right away.
My intention is deriving equation using spreadsheet. So I won’t use this formula.
Step by Step
We would pick third order polynomial as a case. Then we would explain in systematic way.
- Start with cubic equation with know coefficient, to produce Y-value, using spreadsheet.
- Solve the Y-Value, using spreadsheet, but this time with the help of matrix.
- Inverse using spreadsheet, find the coefficient, with know Y-Value.
- Write down the matirx equation in Python.
Worksheet Source
Playsheet Artefact
The Excel file is available, so you can have fun, modify as you want.
Polynomial Case
Task: Find the coefficient.
Simple numerical method for statistical problem.
Polynomial Expression
A cubic polynomial equation can be describe as this expression below:
This third degree polynomial
have four coefficient: a
, b
, c
and d
.
Known Data
In solving polynomial, we have known data series, and unknown coefficient. Our task is to find the coeffient, based on known data.
For example, we have 13 pairs of value.
X | Y-Value |
---|---|
0 | 5 |
1 | 10 |
2 | 410 |
3 | 90 |
4 | 190 |
5 | 350 |
6 | 460 |
7 | 960 |
8 | 1050 |
9 | 1740 |
10 | 1340 |
11 | 3270 |
12 | 3540 |
Cubic Equation
Consider reverse our task, find the data for known coefficient.
Coeff. | Value |
---|---|
a | 5 |
b | 4 |
c | 3 |
d | 2 |
Equation
The equation would be:
LaTeX user can obtain the equation here:
y = 5 + 4{x} + 3{x}^2 + 2{x}^3
Applying Equation
Applying equation for each pairs can be written as below:
Spreadsheet
In Excel/Calc this can be represented as this image below:
In the same sheet, we can also draw the XY line in a chart as below figure:
We are going to solve this formula with different method.
Matrix Equation
Getting unknown Y-Value
Spreadsheet
We can represent our formula as 13x4
matrix.
If we cross multiply this 13x4
matrix with
4x1
coefficient matrix,
we are going to have the result as 13x1
matrix.
The excel formula is MMULT
.
We require curly bracket to work with array.
To calculate array in Calc,
you can use ctrl+shift+enter,
for example for this {=MMULT(F6:I18;K6:K9)}
formula.
Now we are ready to solve our real problem.
Reverse back to unknown coefficient for known Y-Value
.
Equation
Vandermonde Matrix
The formula representation of the 13x4
matrix is as below:
LaTeX user can obtain the equation here:
\begin{bmatrix}
1 & x_0 & {x_0}^2 & {x_0}^3 \\
1 & x_1 & {x_1}^2 & {x_1}^3 \\
1 & x_2 & {x_2}^2 & {x_2}^3 \\
1 & x_3 & {x_3}^2 & {x_3}^3 \\
1 & x_4 & {x_4}^2 & {x_4}^3 \\
1 & x_5 & {x_5}^2 & {x_5}^3 \\
1 & x_6 & {x_6}^2 & {x_6}^3 \\
1 & x_7 & {x_7}^2 & {x_7}^3 \\
1 & x_8 & {x_8}^2 & {x_8}^3 \\
1 & x_9 & {x_9}^2 & {x_9}^3 \\
1 & x_{10} & {x_{10}}^2 & {x_{10}}^3 \\
1 & x_{11} & {x_{11}}^2 & {x_{11}}^3 \\
1 & x_{12} & {x_{12}}^2 & {x_{12}}^3 \\
\end{bmatrix}
\begin{bmatrix}
a \\
b \\
c \\
d \\
\end{bmatrix}
=
\begin{bmatrix}
y_0 \\
y_1 \\
y_2 \\
y_3 \\
y_4 \\
y_5 \\
y_6 \\
y_7 \\
y_8 \\
y_9 \\
y_{10} \\
y_{11} \\
y_{12} \\
\end{bmatrix}
The 13x4
size matrix can be described as:
- 13 number of data.
- 4 coefficient, for third order.
Inverse Matrix
Getting unknown coefficient
Problem Domain
The problem domain can be shown as below spreadsheeet:
Equation
We are going to solve the formula above using this equation
Inverse can only work with square nxn
matrix.
Since our 13x4
matrix is definitely not a square matrix.
We need to alter the equation a bit with transpose.
LaTeX
If you are curious, the LaTeX code can be obtained here:
\begin{align*}
&& A\mathbf{C} &= \mathbf{B} \\
\Rightarrow && A^T A\mathbf{C} &= A^T \mathbf{B} \\
\Rightarrow && \mathbf{C} &= (A^T A)^{-1} A^T \mathbf{B}
\end{align*}
You can use quicklatex to generate LaTeX.
![Polynomial: Latex: Matrix Equation][13-latex-solving]
Transpose
With transpose
formula such as: =TRANSPOSE(E9:H21)
,
we can have the matrix transposed as below:
Cross Multiply
Again, we can cross multiply both, with MMULT
.
Inverse
You can use minverse
formula,
such as {=MINVERSE(E32:H35)}
.
You can examine the excel formula here.
Coeffecient
You can examine the excel formula here.
Yes we have the result.
Big Picture
You can examine the big picture as below:
Matrix in Python
Consider represent above equation into python code.
Source Code
You can obtain the source code in below links:
Setup Matrix A
We can utilize numpy to setup matrix,
import numpy as np
order = 3
mx = np.array([
0, 1, 2, 3,
4, 5, 6, 7,
8, 9, 10, 11, 12])
And also list comprehension to setup the Matrix.
mA = np.array([
[(x ** i) for i in range(order+1)]
for x in mx])
print(mA)
With numpy, the matrix result is nicer, compared with regular array.
$ python 01-poly.py
[[ 1 0 0 0]
[ 1 1 1 1]
[ 1 2 4 8]
[ 1 3 9 27]
[ 1 4 16 64]
[ 1 5 25 125]
[ 1 6 36 216]
[ 1 7 49 343]
[ 1 8 64 512]
[ 1 9 81 729]
[ 1 10 100 1000]
[ 1 11 121 1331]
[ 1 12 144 1728]]
You can see the result here:
Numpy’s Vander
This is a Vandermonde Matrix,
and python numpy
has built in vander
function.
mV = np.vander(mx, 4)
But the arrangement for the built in vandermonde
While our arrangement is
So what we have to do is flip
the result horizontally.
mV = np.flip(np.vander(mx, 4), axis=1)
We can replace our list comprehension with vander.
Matrix Operation
Getting unknown coefficient
Matrix operation in python is incredibly simple. Let’s say we have this setup. A know XY value pairs.
import numpy as np
order = 3
mx = np.array([
0, 1, 2, 3,
4, 5, 6, 7,
8, 9, 10, 11, 12])
mB = np.array([
5, 14, 41, 98,
197, 350, 569, 866,
1253, 1742, 2345, 3074, 3941])
mA = np.array([
[(x ** i) for i in range(order+1)]
for x in mx])
We can prepare to compute the previous matrix equation.
mAt = np.transpose(mA)
mAt_A = np.matmul(mAt, mA)
mAt_B = np.matmul(mAt, mB)
And finally get the result:
mAt_A_i = np.linalg.inv(mAt_A)
mC = np.matmul(mAt_A_i, mAt_B)
print("Coefficients (a, b, c, d):", mC)
Or even shorter
mC = np.linalg.solve(mAt_A, mAt_B)
print("Coefficients (a, b, c, d):", mC)
@ (at) Operator
If you wish you can use @
(at) operator.
# Calculated Matrix Variable
mA = np.flip(np.vander(mx, 4), axis=1)
mAt = np.transpose(mA)
mAt_A = mAt @ mA
mAt_B = mAt @ mB
# First Method
mAt_A_i = np.linalg.inv(mAt_A)
mC = mAt_A_i @ mAt_B
print("Coefficients (a, b, c, d):", mC)
# Second Method
mC = np.linalg.solve(mAt_A, mAt_B)
print("Coefficients (a, b, c, d):", mC)
# Draw Plot
[a, b, c, d] = mC
Matplotlib
We can even plot the result.
Source Code
You can obtain the source code in below link:
Basic Plot
Consider plot rewrite the previous calculation.
And plot the result.
x_plot = np.linspace(min(mx), max(mx), 100)
y_plot = a + b * x_plot + \
c * x_plot**2 + d * x_plot**3
plt.scatter(mx, mB, label='Data points')
plt.plot(x_plot, y_plot, color='red',
label='Fitted third-order polynomial')
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.title(
'Third-order polynomial curve fitting')
plt.show()
With the result as.
Test Other Data
Consider change the initial value, so we can test how valid our equation is.
mB = np.array([
5, 10, 410, 90,
190, 350, 460, 960,
1050, 1740, 1340, 3270, 3540])
The matrix equation should remain the same.
Append additional title:
# Draw Plot
[a, b, c, d] = mC
...
plt.suptitle(
'Third-order polynomial curve fitting')
subfmt = "a = %.2f, b = %.2f c = %.2f, d = %.2f"
plt.title(subfmt % (a, b, c, d), y=-0.01)
plt.show()
Now we can see, how the line fit the dots.
You can do it yourself.
Enhance Script
You can make your life easier with these few tricks:
Source Code
You can obtain the source code in below link:
Setup Numpy
Instead of formatting data all the time,
we can set the numpy
output globally.
# Setup numpy
np.set_printoptions(
precision=2,
formatter={
'int': '{:30,d}'.format,
'float': '{:10,.8f}'.format
},
linewidth=np.inf,
suppress=True)
Data Source with CSV
Instead of array, you can have your data source from anywhere, from database, spreadsheet, or simple CSV.
# Getting Matrix Values
mCSV = np.genfromtxt("poly.csv",
skip_header=1, delimiter=",", dtype=float)
mCSVt = np.transpose(mCSV)
mx = mCSVt[0]
mB = mCSVt[1]
You can get the example CSV here.
The CSV looks like this file below:
x,y
0,5
1,14
2,41
3,98
4,197
5,350
6,569
7,866
8,1253
9,1742
10,2345
11,3074
12,3941
Polyfit
Instead of doing calculation manually,
you can utilize polyfit
.
This way make it easy to choose any polynomial degree.
# Initial Matrix Value
order = 3
mCSVt = np.transpose(mCSV)
x_values = mx = mCSVt[0]
y_values = mB = mCSVt[1]
# Perform cubic regression using polyfit
mC = np.polyfit(x_values, y_values, deg=order)
print('Using polyfit')
print(f'Coefficients (a, b, c, d):\n\t{np.flip(mC)}\n')
Now you can have output as below:
$ python 05-poly.py
Using polyfit
Coefficients (a, b, c, d):
[5.00000000 4.00000000 3.00000000 2.00000000]
Output Result
For debugging purpose you can show each preciction value line by line.
for x in mx:
y = a + b*x + c*x**2 + d*x**3
print(f'x = {x:5} => y = {y:10,.2f} ')
$ python 05-poly.py
x = 0.0 => y = 5.00
x = 1.0 => y = 14.00
x = 2.0 => y = 41.00
x = 3.0 => y = 98.00
x = 4.0 => y = 197.00
x = 5.0 => y = 350.00
x = 6.0 => y = 569.00
x = 7.0 => y = 866.00
x = 8.0 => y = 1,253.00
x = 9.0 => y = 1,742.00
x = 10.0 => y = 2,345.00
x = 11.0 => y = 3,074.00
x = 12.0 => y = 3,941.00
Editable Output
This not enhanced script, but rather a trick.
Instead of exporting .png
from matplotlib,
you can export to .svg
.
Sometimes you need to alter the result for further processing,
such as report, or removing annoying lines quickly.
You can have the SVG source from here
Choosing Order
In my experience, higher order tend to have further precision. If you have issue with precision, you might consider lower order.
# Initial Matrix Value
order = 2
Interactive JupyterLab
You can obtain the interactive JupyterLab
in this following link:
What’s Our Next Endeavor 🤔?
It is fun, right?
We can describe mathematical equation, in practical way. What do you think?
But what good is it a theory without a ready to used material fro daily basis. I present you simple tools, the spreadsheet, and the python script, covering linear, quadratic and cubic.
Consider continuing your exploration with [ Trend - Polynomial Examples ].