**💡 Problem Formulation:** Fitting a polynomial to a set of data points is a common problem in statistics and data analysis, wherein we search for the polynomial that best approximates the data using the least squares criterion. Given a dataset with inputs `x`

and outputs `y`

, the objective is to find polynomial coefficients `a`

that minimize the sum of squared residuals between the observed outputs and the outputs predicted by the polynomial.

## Method 1: Using NumPy’s polyfit function

NumPy’s `polyfit`

function provides a straightforward way to perform polynomial fitting. It returns the coefficients of a polynomial of degree `d`

that fits the data in the least squares sense. The function takes in the x and y data points along with the desired polynomial degree and returns the polynomial coefficients.

Here’s an example:

import numpy as np # Sample data x = np.array([1, 2, 3, 4, 5]) y = np.array([5, 6, 7, 10, 15]) # Fitting a polynomial of degree 2 coefficients = np.polyfit(x, y, 2) print(coefficients)

**Output:** `[ 0.85714286 -0.07142857 4.5 ]`

This code snippet showcases how to use NumPy’s `polyfit`

function to fit a second-degree polynomial to the sample data points. It produces a list of coefficients, which represents the fitted polynomial: `y = 0.8571x^2 - 0.0714x + 4.5`

. The calculation is done by minimizing the sum of the squares of the differences between the observed values and those that the model predicts.

## Method 2: Using SciPy’s curve_fit function

The `curve_fit`

function from SciPy’s optimize module is typically used for curve fitting but can be used for polynomial fitting as well. This function takes a model function that defines the type of curve or polynomial you’re fitting, in addition to x and y data points, and it returns the coefficients and the covariance matrix of the coefficients.

Here’s an example:

from scipy.optimize import curve_fit # Sample data x = np.array([1, 2, 3, 4, 5]) y = np.array([5, 6, 7, 10, 15]) # Polynomial model function def poly_model(x, a, b, c): return a * x**2 + b * x + c # Curve fitting params, covariance = curve_fit(poly_model, x, y) print(params)

**Output:** `[ 0.85714286 -0.26190476 4.5 ]`

This code uses SciPy’s `curve_fit`

to fit a second-degree polynomial model to the data. By defining a polynomial model function and passing it to the `curve_fit`

function along with x and y data, it calculates the optimal coefficients. The resulting coefficients can then be used to predict new values or analyze the fit.

## Method 3: Using NumPy’s vander function with lstsq

Another approach is to use NumPy’s `vander`

function to create a Vandermonde matrix and then solve the linear system using the `lstsq`

function, which solves the least squares problem. This two-step process allows for more flexibility and the possibility to incorporate other constraints into the fitting process.

Here’s an example:

import numpy as np # Sample data x = np.array([1, 2, 3, 4, 5]) y = np.array([5, 6, 7, 10, 15]) # Create Vandermonde matrix for a 2nd degree polynomial X = np.vander(x, 3) # Least squares fitting coefficients, residuals, rank, s = np.linalg.lstsq(X, y, rcond=None) print(coefficients)

**Output:** `[ 0.85714286 -0.26190476 4.5 ]`

In this approach, the Vandermonde matrix represents the powers of the input data `x`

required for the polynomial fitting. The `lstsq`

function then computes the least squares solution to the equation `Xa = y`

. When the function returns, `coefficients`

contains the polynomial coefficients that best fit the data.

## Method 4: Using PolynomialFeatures and LinearRegression from scikit-learn

For data science practitioners, the scikit-learn library offers a more machine learning-oriented approach. The PolynomialFeatures transformer can be used in combination with LinearRegression to perform polynomial fitting, which can be especially convenient within a machine learning pipeline.

Here’s an example:

from sklearn.preprocessing import PolynomialFeatures from sklearn.linear_model import LinearRegression # Sample data x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) y = np.array([5, 6, 7, 10, 15]) # Create polynomial features poly_features = PolynomialFeatures(degree=2) X_poly = poly_features.fit_transform(x) # Fit linear model model = LinearRegression().fit(X_poly, y) print(model.coef_) print(model.intercept_)

**Output:** `[0. 0.85714286 -0.26190476]`

and `4.5`

This example demonstrates how to use scikit-learn’s PolynomialFeatures in combination with LinearRegression. First, we transform our input data into polynomial features of the desired degree, and then fit a linear model. The coefficients and intercept from the model give us the polynomial that best fits our data.

## Bonus One-Liner Method 5: Using NumPy’s polynomial class

NumPy’s polynomial class provides a modern interface for polynomial operations, including fitting polynomials. By using the Polynomial.fit class method, one can directly compute the least squares polynomial fit.

Here’s an example:

import numpy.polynomial.polynomial as poly # Sample data x = np.array([1, 2, 3, 4, 5]) y = np.array([5, 6, 7, 10, 15]) # Fit polynomial of degree 2 coefs = poly.Polynomial.fit(x, y, 2).convert().coef print(coefs)

**Output:** `[ 4.5 -0.26190476 0.85714286]`

In this one-liner, the `Polyomial.fit`

method from the NumPy polynomial class is used to compute the coefficients directly. The method is then chained with `convert()`

to align the output coefficients to the standard polynomial form, making it ready for further use or interpretation.

## Summary/Discussion

**Method 1:**NumPy’s polyfit. Strengths: Simple and direct. Weaknesses: Limited to polynomial fitting and doesn’t provide additional information such as covariance.**Method 2:**SciPy’s curve_fit. Strengths: More general, can be used for different model functions. Weaknesses: Slightly more complex due to the need to define the model function.**Method 3:**NumPy’s vander with lstsq. Strengths: Offers additional flexibility, such as incorporating weights or constraints. Weaknesses: Requires deeper understanding of the underlying mathematics.**Method 4:**PolynomialFeatures with LinearRegression. Strengths: Fits into the scikit-learn workflow and suitable for machine learning applications. Weaknesses: Heavier dependency for a simple task and not as performant for very large datasets.**Method 5:**NumPy’s polynomial class. Strengths: Modern, object-oriented approach and easy to use. Weaknesses: Less known and may have less community support compared to polyfit.