U_linear model assumptions.py
Linear regression assumptions are:
(1) Linearity: The mean values of the outcome variable for each increment of the predictor(s) lie along a straight line.
In other words, there is a linear relationship between predictors and target.
(2) No perfect multicollinearity: There should be no perfect linear relationship between two or more of the predictors.
(3) Normally distributed errors: The residuals are random and normally distributed with a mean of 0.
(4) Homoscedasticity: At each level of the predictor variable(s), the variance of the residual terms should be constant.
To determine if a linear model fits the data well:
(1) The residuals should be normally distributed with a mean of zero, and should be homoscedastic.
If both hold, we can be fairly confident that the linear model is an adequate description of the data.
(2) The normal distribution can be assessed by Q-Q plots. Homoscedasticity can be assessed by residual plots.
(3) We can also check for a linear relationship between the predictors and the target with scatter plots and residual plots,
and assess multicollinearity with correlation matrices (these reveal pairwise relationships between predictors).