Recall the linear regression model \[Y_i = \mathbf{X}_{i,*}\boldsymbol{\beta} + \varepsilon_i = \beta_1 X_{i,1} + \dots + \beta_p X_{i,p} + \varepsilon_i.\]
In this model the values of \(Y_i\) and \(\mathbf{X}_{i,*}\) are known (they are observed). What remains unknown are the parameters \(\boldsymbol{\beta}\) and \(\sigma_\varepsilon^2\).
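To make the setup concrete, here is a minimal simulation sketch in Python; the sample size, number of covariates, coefficients, and noise level below are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

n, p = 100, 3                      # sample size and number of covariates (assumed)
beta = np.array([0.5, -1.0, 2.0])  # "true" coefficients, unknown in practice (assumed)
sigma = 0.7                        # "true" noise standard deviation (assumed)

X = rng.normal(size=(n, p))        # design matrix; row i is X_{i,*}
eps = rng.normal(0.0, sigma, n)    # errors eps_i ~ N(0, sigma^2)
Y = X @ beta + eps                 # responses Y_i = X_{i,*} beta + eps_i
```

In practice only `X` and `Y` would be available; `beta` and `sigma` are exactly what we want to estimate.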
We know that \(Y_i \sim \mathcal{N}(\mathbf{X}_{i,*}\boldsymbol{\beta}, \sigma_\varepsilon^2)\), so the density of \(Y_i\) is
\[f_{Y_i}(y_i) = (2\pi\sigma_\varepsilon^2)^{-1/2}\exp[-(y_i - \mathbf{X}_{i,*}\boldsymbol{\beta})^2 / (2\sigma_\varepsilon^2)].\]
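Continuing the simulation above, a quick sketch that evaluates this density directly and cross-checks it against the normal pdf from `scipy.stats`; the two should agree to floating-point precision.

```python
from scipy.stats import norm

# Density of Y_i at observed y_i, for given beta and sigma^2,
# written exactly as in the formula above.
def density(y_i, x_i, beta, sigma2):
    mean = x_i @ beta  # X_{i,*} beta
    return (2 * np.pi * sigma2) ** -0.5 * np.exp(-(y_i - mean) ** 2 / (2 * sigma2))

i = 0
print(density(Y[i], X[i], beta, sigma ** 2))
print(norm.pdf(Y[i], loc=X[i] @ beta, scale=sigma))  # should match
```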
So far we have worked in the least-squares framework (i.e., minimizing the sum of squared residuals), but we can also estimate the parameters of the linear model by maximizing the likelihood. The likelihood is:
\[L(\mathbf{Y}, \mathbf{X}; \boldsymbol{\beta}, \sigma_\varepsilon^2) = \prod_{i=1}^{n} (\sigma_\varepsilon\sqrt{2\pi})^{-1}\exp[-(Y_i - \mathbf{X}_{i,*}\boldsymbol{\beta})^2 / (2\sigma_\varepsilon^2)].\]
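Under these Gaussian errors, the maximizer of the likelihood in \(\boldsymbol{\beta}\) coincides with the least-squares estimator. As a sanity check, here is a sketch (continuing the simulation above) that numerically maximizes the log-likelihood and compares the result with the least-squares solution; parametrizing \(\sigma_\varepsilon\) on the log scale is just a convenience to keep it positive.

```python
from scipy.optimize import minimize

# Negative log-likelihood in theta = (beta, log sigma).
def neg_log_lik(theta):
    b, log_s = theta[:p], theta[p]
    s2 = np.exp(2 * log_s)
    resid = Y - X @ b
    return 0.5 * n * np.log(2 * np.pi * s2) + resid @ resid / (2 * s2)

fit = minimize(neg_log_lik, np.zeros(p + 1), method="BFGS")

beta_mle = fit.x[:p]                                  # maximum-likelihood estimate
beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]       # least-squares estimate
print(np.allclose(beta_mle, beta_ols, atol=1e-4))     # True: the two agree
```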