Linear Regression

Linear Regression is one of the simplest and most widely used algorithms in machine learning.
It is used to predict a continuous target variable based on one or more input features.

The key idea is to fit a linear equation to the observed data:

\[y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon\]

Where:

  • $y$ is the target variable
  • $x_1, x_2, \dots, x_n$ are input features
  • $\beta_0$ is the intercept
  • $\beta_1, \dots, \beta_n$ are coefficients
  • $\epsilon$ is the error term
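Given these definitions, a prediction is just the intercept plus the dot product of the coefficients with the feature vector. A minimal sketch (the coefficient and feature values are illustrative, not from the text):

```python
import numpy as np

# Illustrative values: beta_0 is the intercept, betas holds beta_1, beta_2
beta_0 = 1.0
betas = np.array([2.0, -0.5])
x = np.array([3.0, 4.0])  # one observation with two features

# y_hat = beta_0 + beta_1 * x_1 + beta_2 * x_2
y_hat = beta_0 + betas @ x
print(y_hat)  # 1.0 + 2.0*3.0 - 0.5*4.0 = 5.0
```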

How It Works

  1. Fit a line that minimizes the difference between predicted and actual values.
     This is usually done by minimizing the Mean Squared Error (MSE):
\[\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2\]
  2. Estimate the coefficients, typically via Ordinary Least Squares (OLS).
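For OLS the estimate has a closed form, $\hat{\beta} = (X^T X)^{-1} X^T y$. A minimal NumPy sketch of both steps, reusing the small dataset from the example below (solving via `lstsq` rather than an explicit inverse, which is the numerically stable choice):

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Prepend a column of ones so the intercept beta_0 is estimated too
X_design = np.hstack([np.ones((X.shape[0], 1)), X])

# Solve the least-squares problem (equivalent to the normal equation)
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta)  # [intercept, slope] = [2.2, 0.6]

# MSE of the fitted line
y_hat = X_design @ beta
mse = np.mean((y - y_hat) ** 2)
print(mse)  # 0.48
```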

Assumptions

Linear Regression works best when the following assumptions hold:

  1. Linearity: Relationship between features and target is linear
  2. Independence: Observations are independent
  3. Homoscedasticity: Constant variance of errors
  4. Normality: Errors are normally distributed
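The error-related assumptions (homoscedasticity and normality) are usually checked by inspecting the residuals after fitting. A minimal sketch on synthetic data (the data-generating values here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 + 0.6 * X[:, 0] + rng.normal(0, 1.0, size=100)  # true noise std = 1.0

# Fit via least squares, then examine residuals
X_design = np.hstack([np.ones((100, 1)), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
residuals = y - X_design @ beta

# With an intercept, OLS residuals have mean exactly 0 (up to rounding).
# Roughly constant spread across fitted values suggests homoscedasticity;
# a symmetric, bell-shaped residual histogram suggests normality.
print(residuals.mean())  # ~0
print(residuals.std())   # should be near the true noise std of 1.0
```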

Simple Example in Python

from sklearn.linear_model import LinearRegression
import numpy as np

# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 5])

# Fit linear regression model
model = LinearRegression()
model.fit(X, y)

print("Coefficient:", model.coef_)
print("Intercept:", model.intercept_)