Recall that linear regression lets us model the relationship between an independent variable and a dependent variable using a straight line. However, if our data do not follow a linear relationship, we should consider modelling the relationship with a quadratic model instead. This is a simple extension of the linear model; we just add an extra term! See below:
\[ y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \epsilon_i, \]
where \(\beta_0, \beta_1, \beta_2 \in \mathbb{R}\) are coefficients that determine the shape of our quadratic model. As in the linear case, we also have an error term \(\epsilon_i \sim \mathcal{N}(0, \sigma^2)\), which accounts for the dispersion of the observed values about the quadratic curve.
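As before, we can also represent the model using matrices. For \(n\) observations:
\[
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}
+
\begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}
\]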
Quadratic regression is used when a scatter graph of the data shows a non-linear relationship that can be well approximated by a parabolic curve. To illustrate this, consider the following example:
Example 1: Non-linear relationships
I wish to see how increasing the amount of fertiliser alters the yield I get from my crops. It’s known that small amounts of fertiliser can improve yield significantly, but after a certain point, adding more fertiliser doesn’t help much and might even decrease the yield due to over-fertilisation. To test this, I have collected the following data:
Just like we did with the linear example, I have plotted this as a scatter graph below:
Figure 1: A scatter plot showing the relationship between amount of fertiliser and crop yield.
Notice that we have a non-linear relationship that looks quadratic. In the next section, we shall develop the theory of least squares estimation for the quadratic case.
Fortunately, the least squares estimation theory which we discussed in our previous post (about the linear case) holds true in the quadratic case. We just need to remember that the design matrix has an extra column!
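As a reminder, and assuming the previous post arrived at the usual normal-equations solution, the estimate takes exactly the same form as in the linear case; only the design matrix changes. The symbols \(X\) (design matrix) and \(\mathbf{y}\) (vector of observed responses) are assumed notation here and may differ from the earlier post:
\[
\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y},
\qquad
X = \begin{pmatrix} 1 & x_1 & x_1^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{pmatrix}.
\]
The inverse exists provided the data contain at least three distinct \(x\) values, since that is what gives \(X\) three linearly independent columns.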
Using the least squares theory discussed in the previous post, we shall now seek the value of \(\boldsymbol{\beta}\) that best models this (presumably) quadratic relationship.
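Before returning to the fertiliser example, here is a minimal sketch of how this computation might look in Python with NumPy. The values below are made up and noise-free, purely for illustration; they are not the data from this post, and the variable names (`x`, `y`, `X`, `beta_hat`, `true_beta`) are my own choices.

```python
import numpy as np

# Hypothetical, noise-free values for illustration only
# (NOT the fertiliser data from this post).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
true_beta = np.array([2.0, 3.0, -0.5])  # assumed beta_0, beta_1, beta_2
y = true_beta[0] + true_beta[1] * x + true_beta[2] * x**2

# Design matrix: a column of ones, a column of x, and the extra column of x^2.
X = np.column_stack([np.ones_like(x), x, x**2])

# Least squares estimate of beta. lstsq minimises ||y - X beta||^2
# without explicitly forming the inverse of X^T X.
beta_hat, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)
```

Because these made-up values lie exactly on a parabola, the estimate should recover the assumed coefficients up to floating-point rounding. With real, noisy data the estimate will only approximate the underlying coefficients.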
Example 1 (continued): Least squares estimation for quadratic regression
Let us return to the example from earlier in this post. Recall that the data we collected were as follows: