Regression analysis is a statistical method used to model the mathematical relationship between a dependent variable (Y) and one or more independent variables (X). The fitted relationship is then used for prediction and forecasting.
| Regression | Correlation |
|---|---|
| Models how the dependent variable changes with the independent variable (does not by itself prove cause and effect) | Measures degree of relationship |
| Predicts value of dependent variable | Does not predict values |
| Has a direction (Y depends on X) | Symmetric (no directional dependency) |
| Regression coefficients are not symmetric | Correlation coefficient is symmetric |
| Used for prediction/forecasting | Used to measure association |
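
The asymmetry noted in the table can be checked numerically. The sketch below is illustrative only: the data values and the helper names `slope` and `correlation` are assumptions made for this example, not part of the notes. It computes the least-squares slope of Y on X, the slope of X on Y, and the correlation coefficient in both orders.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope for the regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Made-up sample data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b_yx = slope(x, y)        # regression of Y on X
b_xy = slope(y, x)        # regression of X on Y
r_xy = correlation(x, y)  # same value whichever variable comes first
r_yx = correlation(y, x)

print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}")
print(f"r_xy = {r_xy:.4f}, r_yx = {r_yx:.4f}")
```

For this data the two regression slopes differ while the correlation is the same in both orders. The product of the two slopes equals the square of the correlation coefficient.
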

Simple Linear Regression
The mathematical equation of a simple linear regression line is:
Y = a + bX + e
Where:
Y = dependent variable (response variable)
X = independent variable (predictor variable)
a = intercept (value of Y when X = 0)
b = slope (change in Y per unit change in X)
e = error term (the part of Y not explained by X; its estimate from a fitted line is called the residual)
Intercept (a): The value of the dependent variable (Y) when the independent variable (X) is zero. It is the point where the regression line crosses the Y-axis.
Slope (b): The expected change in Y for a one-unit increase in X. A positive slope means Y increases as X increases; a negative slope means Y decreases as X increases.
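
As a small illustration of these two interpretations, the sketch below evaluates the line Y = a + bX (ignoring the error term) for hypothetical coefficients. The values a = 2.0 and b = 0.5 and the function name `predict` are assumptions chosen for the example.

```python
def predict(x, a, b):
    """Value of Y on the regression line Y = a + bX (error term omitted)."""
    return a + b * x

# Hypothetical coefficients, for illustration only.
a, b = 2.0, 0.5

at_zero = predict(0, a, b)                    # equals the intercept a
step = predict(11, a, b) - predict(10, a, b)  # equals the slope b

print(at_zero, step)
```

The prediction at X = 0 returns the intercept, and moving X up by one unit changes the prediction by exactly the slope, wherever on the line the step is taken.
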
Least Squares Regression is a statistical method used to find the best-fitting line (or curve) that describes the relationship between two variables. It chooses the coefficients of the line so that the sum of the squared errors (the differences between the actual values of Y and the values estimated by the line) is as small as possible.
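
For the simple linear case, the minimizing coefficients have a standard closed form: b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² and a = Ȳ − bX̄. The sketch below applies these formulas to made-up data and compares the sum of squared errors of the fitted line with that of slightly shifted lines. The data and the function names `fit_least_squares` and `sse` are assumptions for illustration.

```python
from statistics import mean

def fit_least_squares(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx    # slope
    a = my - b * mx  # intercept: the line passes through (mean of X, mean of Y)
    return a, b

def sse(x, y, a, b):
    """Sum of squared differences between actual and estimated Y values."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up sample data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_least_squares(x, y)
best = sse(x, y, a, b)

print(f"a = {a:.4f}, b = {b:.4f}, SSE = {best:.4f}")
print(f"SSE with shifted intercept: {sse(x, y, a + 0.1, b):.4f}")
print(f"SSE with shifted slope:     {sse(x, y, a, b - 0.1):.4f}")
```

Any line other than the fitted one gives a larger sum of squared errors on the same data, which is what "best-fitting" means under this criterion.
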