Preface
Goal: Solving trend with built-in method available in excel and python.
Before you dive into the realm of manual calculating of curve fitting, you should know that there is already an easy to use method.
linset
in Excel/Calcpolyfit
in python
Worksheet Source
Playsheet Artefact
The workbook containing multiple sheet.
The Excel file is available, so you can have fun, modify as you want.
Linest
The easiest way to get curve fitting coefficient,
in spreadsheet in Excel or Calc is by using linest
formula.
- y = a + bx
- y = a + bx + cx²
- y = a + bx + cx² + dx³
Order
Considering the order we can arrange the equation in tabular manner.
Linear Linest
Consider this linear equation:
The formula is simply as below.
Be aware that the first argument is y
column first,
then x
column.
=LINEST(C6:C18;B6:B18)
In order to save space,
I usually put the result in one column using transpose
.
=TRANSPOSE(LINEST(C6:C18;B6:B18))
With this result, we get the linear equation as
Quadratic Linest
Consider this quadratic equation:
The formula is as below:
=LINEST(F6:F18;E6:E18^{1.2})
And the transposed version is:
=TRANSPOSE(LINEST(F6:F18;E6:E18^{1.2}))
With this result, we get the quadratic equation as
Since the formula contain array such as {1.2}
,
Do not forget to use ctrl+shift+enter
after entering the formula.
Cubic Linest
With the same mtehod we can get cubic equation:
The formula is as below:
=LINEST(I6:I18;H6:H18^{1.2.3})
And the transposed version is:
=TRANSPOSE(LINEST(I6:I18;H6:H18^{1.2.3}))
With this result, we get the quadratic equation as
Since the formula contain array such as {1.2.3}
,
Do not forget to use ctrl+shift+enter
after entering the formula.
Comparing Linest
Three Linest using The Same Series
Now we can compare three different results form one source series.
Polyfit
The easiest way to get curve fitting coefficient,
in python script is by using polyfit
method in numpy
.
The library that we need is just numpy
and matplotplib
:
import numpy as np
import matplotlib.pyplot as plt
Linear Polyfit
You can obtain the source code in below link:
We can start with given data series and order.
x_values = np.array([
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
y_values = np.array([
5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53])
Then getting the polynomial
coefficient using polyfit
method:
# Curve Fitting Order
order = 1
# Perform linear regression using polyfit
mC = np.polyfit(x_values, y_values, deg=order)
print('Using polyfit')
print(f'Coefficients (a, b):\n\t{np.flip(mC)}\n')
With the result as below coefficient:
Using polyfit
Coefficients (a, b):
[5. 4.]
With this result, we get the linear equation as
Matplotlib
We can plot the result with matplotlib.
First calculate the x_plot
and y_plot
:
# Draw Plot
[a, b] = np.flip(mC)
x_plot = np.linspace(min(x_values), max(x_values), 100)
y_plot = a + b * x_plot
Then draw two chart in the same plt
object.
plt.scatter(x_values, y_values, label='Data points')
plt.plot(x_plot, y_plot, color='red',
label='Linear Equation')
Add some accesories
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.suptitle(
'Straight line fitting')
And finally show the plot.
subfmt = "a = %.2f, b = %.2f"
plt.title(subfmt % (a, b), y=-0.01)
plt.show()
Now we can enjoy the simple result as below:
You can obtain the interactive JupyterLab
in following link:
Quadratic Polyfit
You can obtain the source code in below link:
We can start with given data series and order.
x_values = np.array([
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
y_values = np.array([
5, 12, 25, 44, 69, 100, 137,
180, 229, 284, 345, 412, 485])
With similar script but different order, we can get the coefficient of quadratic equation.
With the result as below:
Using polyfit
Coefficients (a, b, c):
[5. 4. 3.]
With this result, we get the quadratic equation as
With data series above we can get the result as this perfect fit below:
You can obtain the interactive JupyterLab
in following link:
Cubic Polyfit
We can start with given data series and order.
x_values = np.array([
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
y_values = np.array([
5, 14, 41, 98, 197, 350, 569, 866,
1253, 1742, 2345, 3074, 3941])
You can obtain the source code in below link:
With similar script but different order, we can get the coefficient of cubic equation.
With the result as below:
Using polyfit
Coefficients (a, b, c):
[5. 4. 3.]
With this result, we get the cubic equation as
With data series above we can get the result as this perfect fit below:
You can obtain the interactive JupyterLab
in following link:
Playing with data
Now we can play with various order [1, 2 or 3], and also various custom series [ys1, ys2 or ys3]:
Custom Series
def main() -> int:
order = 2
xs = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
ys1 = [ 5, 9, 13, 17, 21, 25, 29,
33, 37, 41, 45, 49, 53]
ys2 = [ 5, 12, 25, 44, 69, 100, 137,
180, 229, 284, 345, 412, 485]
ys3 = [ 5, 14, 41, 98, 197, 350, 569,
866, 1253, 1742, 2345, 3074, 3941]
example = CurveFitting(xs, ys3)
example.process()
return 0
if __name__ == "__main__":
raise SystemExit(main())
Class Skeleton
This utilized custom CurveFitting
class with skeleton as below:
class CurveFitting:
def __init__(self, order: int,
def calc_coeff(self) -> None:
def calc_plot_1st(self) -> None:
def calc_plot_2nd(self) -> None:
def calc_plot_3rd(self) -> None:
def draw_plot(self) -> None:
def process(self) -> None:
def main() -> int:
The link to the source code is given above.
Example Curve Fitting
For example, the third series, with polyfit of second order polynomial:
ys3 = [ 5, 14, 41, 98, 197, 350, 569,
866, 1253, 1742, 2345, 3074, 3941]
With the result as below coefficient:
Using polyfit
Coefficients (a, b, c):
[ 137. -162. 39.]
We get the quadratic equation as
With data series above we can get the curve fitting as below:
Comparing Plot
We can merge also all plot at once, for easy comparation
Data Series
For example using these series,
you may choose ys1
, ys2
, or ys3
:
xs, ys1, ys2, ys3
0, 5, 5, 5
1, 9, 12, 14
2, 13, 25, 41
3, 17, 44, 98
4, 21, 69, 197
5, 25, 100, 350
6, 29, 137, 569
7, 33, 180, 866
8, 37, 229, 1253
9, 41, 284, 1742
10, 45, 345, 2345
11, 49, 412, 3074
12, 53, 485, 3941
We can manage the main method.
We start with the main()
,
store array in numpy
.
def main() -> int:
# Getting Matrix Values
mCSV = np.genfromtxt("series.csv",
skip_header=1, delimiter=",", dtype=float)
mCSVt = np.transpose(mCSV)
example = CurveFitting(mCSVt[0], mCSVt[3])
example.process()
return 0
Let’s say that we pick ys3
,
then we choose mCSVt[3]
as curve fitting parameter.
Class Skeleton
This utilized custom CurveFitting
class with skeleton as below:
class CurveFitting:
def __init__(self, xs, ys : List[int]) -> None:
def calc_coeff(self, order) -> np.ndarray:
def calc_plot_all(self) -> None:
def draw_plot(self) -> None:
def process(self) -> None:
def main() -> int:
The link to the source code is given above.
Class
class CurveFitting:
def __init__(self, xs, ys : List[int]) -> None:
# Given data
self.xs = np.array(xs)
self.ys = np.array(ys)
Calculate Coefficient
Prepare a method to calculate coefficient based on order:
def calc_coeff(self, order) -> np.ndarray:
# Perform regression using polyfit,
mC = np.polyfit(self.xs, self.ys, deg=order)
# Display
coeff_text = {
1: '(a, b)', 2: '(a, b, c)', 3: '(a, b, c, d)'}
order_text = {
1: 'Linear', 2: 'Quadratic ', 3: 'Cubic'}
print(f'Using polyfit : {order_text[order]}')
print(f'Coefficients : {coeff_text[order]}:'
+ f'\n\t{np.flip(mC)}\n')
# Get coefficient matrix
return np.flip(mC)
Get all the coefficient:
def process(self) -> None:
self.mCoeff_1st = self.calc_coeff(1)
self.mCoeff_2nd = self.calc_coeff(2)
self.mCoeff_3rd = self.calc_coeff(3)
With the coefficient result as:
❯ python 17-polyfit-merge.py
Using polyfit : Linear
Coefficients : (a, b):
[-721. 306.]
Using polyfit : Quadratic
Coefficients : (a, b, c):
[ 137. -162. 39.]
Using polyfit : Cubic
Coefficients : (a, b, c, d):
[5. 4. 3. 2.]
Now we have all the equations:
Calculate The Plot Value
With that coefficient above,
we can calculate all the x_plot
and y_plot
values.
def calc_plot_all(self) -> None:
self.x_plot = xp = np.linspace(
min(self.xs), max(self.xs), 100)
[a1, b1] = self.mCoeff_1st
self.y1_plot = a1 + b1 * xp
[a2, b2, c2] = self.mCoeff_2nd
self.y2_plot = a2 + b2 * xp + c2 * xp**2
[a3, b3, c3, d3] = self.mCoeff_3rd
self.y3_plot = a3 + b3 * xp + c3 * xp**2 + d3 * xp**3
Then plot all the different y_plot
in one plt
object.
def draw_plot(self) -> None:
plt.scatter(self.xs, self.ys, label='Data points', color='teal')
plt.plot(self.x_plot, self.y1_plot, color='red',
label='Linear Equation')
plt.plot(self.x_plot, self.y2_plot, color='green',
label='Fitted second-order polynomial')
plt.plot(self.x_plot, self.y3_plot, color='blue',
label='Fitted third-order polynomial')
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.title('Polynomial Curve Fitting')
plt.show()
Plot
After finishing all the process:
def process(self) -> None:
self.mCoeff_1st = self.calc_coeff(1)
self.mCoeff_2nd = self.calc_coeff(2)
self.mCoeff_3rd = self.calc_coeff(3)
self.calc_plot_all()
self.draw_plot()
With data series above we can get all curve fitting as below:
You can obtain the interactive JupyterLab
in this following link:
What’s the Next Exciting Step 🤔?
After built-in formula and method, we can start the discussion by solving simple equations. A system of linear equation, followed by system of quadratic and cubic equation.
Consider continuing your exploration in the next section: [ Trend - Polynomial Interpolation ].