Regression Calculator

Enter your X values and Y values (one per line or comma-separated) into the Regression Calculator to compute the line of best fit. You'll get the slope (b), intercept (a), R² (coefficient of determination), Pearson correlation coefficient (r), plus MSE, RMSE, and MAE — everything you need to interpret the linear relationship between two variables.

Enter the predictor (independent) variable values.

Enter the response (dependent) variable values. Must match the count of X values.

Most regressions include the intercept. Only force through origin if your theory requires it.

Results

R² (Coefficient of Determination)

--

Slope (b)

--

Intercept (a)

--

Pearson Correlation (r)

--

Mean Squared Error (MSE)

--

Root Mean Squared Error (RMSE)

--

Mean Absolute Error (MAE)

--

Number of Data Points (n)

--

Regression Equation

--

Regression Fit Quality

Results Table

Frequently Asked Questions

What is a linear regression model?

A linear regression model describes the relationship between a predictor variable (X) and a response variable (Y) using a straight line: Y = a + bX. The slope (b) shows how much Y changes for each unit increase in X, while the intercept (a) is the expected value of Y when X equals zero. It's one of the most widely used statistical techniques for prediction and explanation.
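The least-squares formulas behind this fit are short enough to write out directly. The sketch below is a minimal plain-Python illustration (not the calculator's actual implementation): the slope is the covariance-style sum divided by the sum of squared X deviations, and the intercept follows from the fact that the fitted line passes through the point of means.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a + bX; returns (intercept a, slope b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = sum((x - x̄)(y - ȳ)) / sum((x - x̄)²)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the line passes through (x̄, ȳ)
    a = mean_y - b * mean_x
    return a, b

a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # perfectly linear data: Y = 1 + 2X
```

For the perfectly linear data above, the fit recovers a = 1 and b = 2 exactly.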

What does R² mean and what is a good R² value?

R² (the coefficient of determination) measures the proportion of variance in Y that is explained by X, ranging from 0 to 1. An R² of 0.90 means 90% of the variation in Y is explained by the model. What counts as 'good' depends on the field — social sciences may accept 0.3–0.5, while engineering or physics may require 0.99+.
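R² can be computed directly from observed and predicted Y values using the standard definition R² = 1 − SS_res / SS_tot. A minimal sketch (illustrative only, with made-up predictions):

```python
def r_squared(ys, preds):
    """R² = 1 - SS_res / SS_tot for observed ys and predicted preds."""
    mean_y = sum(ys) / len(ys)
    # Residual sum of squares: unexplained variation
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    # Total sum of squares: variation of Y about its mean
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

r2 = r_squared([3, 5, 7, 9], [2.9, 5.1, 6.9, 9.1])  # near-perfect predictions
```

Here SS_res is tiny relative to SS_tot, so R² lands very close to 1.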

What is the difference between r and R²?

Pearson's r is the correlation coefficient and ranges from −1 to +1, indicating both the strength and direction of the linear relationship. R² is simply r squared and ranges from 0 to 1, representing the proportion of variance explained. For example, r = 0.9 gives R² = 0.81, meaning 81% of Y's variance is explained by X.

What are the key assumptions of linear regression?

Simple linear regression assumes: (1) a linear relationship between X and Y, (2) independence of observations, (3) homoscedasticity — residuals have constant variance across all values of X, (4) normally distributed residuals, and (5) no significant outliers. Violating these assumptions can lead to biased or unreliable estimates.

What are MSE and RMSE, and why do they matter?

MSE (Mean Squared Error) is the average of the squared differences between observed and predicted Y values — it penalises large errors heavily. RMSE (Root Mean Squared Error) is the square root of MSE, putting the error back in the same units as Y, making it easier to interpret. Lower MSE and RMSE indicate a better-fitting model.
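Both metrics fall out of the residuals in a couple of lines. The sketch below (plain Python, illustrative data) computes MSE as the mean of squared residuals and RMSE as its square root:

```python
import math

def mse_rmse(ys, preds):
    """Return (MSE, RMSE) for observed ys and predicted preds."""
    # Squared residuals: large errors are penalised heavily
    sq_errors = [(y - p) ** 2 for y, p in zip(ys, preds)]
    mse = sum(sq_errors) / len(sq_errors)
    # RMSE brings the error back into Y's own units
    return mse, math.sqrt(mse)

mse, rmse = mse_rmse([2.0, 4.0, 6.0], [1.0, 4.0, 7.0])  # residuals 1, 0, -1
```

With residuals of 1, 0, and −1, MSE is 2/3 and RMSE is its square root, about 0.816.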

When should I force the regression through the origin (no intercept)?

You should only force the line through the origin (set intercept = 0) when your theory or physical law explicitly requires that Y must equal zero when X equals zero. In most practical situations, including the intercept produces a more accurate and unbiased fit, even if the intercept value seems unintuitive.
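When the intercept is fixed at zero, the least-squares slope simplifies to b = Σxy / Σx². A minimal sketch of that special case:

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for Y = bX (intercept forced to 0): b = Σxy / Σx²."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

b = fit_through_origin([1, 2, 3], [2, 4, 6])  # data on the line Y = 2X
```

One caveat: with no intercept, many statistics tools compute R² against the total sum of squares about zero rather than about the mean, so the R² of a through-origin fit is not directly comparable to the R² of a fit with an intercept.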

How many data points do I need for linear regression?

You need at least 3 data points (2 points define a line exactly, leaving no way to assess the fit), but meaningful regression typically requires 10 or more. A common rule of thumb is at least 10–20 observations per predictor variable. With too few points, estimates of slope, intercept, and R² can be highly unstable and unreliable.

How do I interpret the regression equation output?

The regression equation Ŷ = a + bX lets you predict Y for any new X value. For example, if a = 1.5 and b = 2.0, then for X = 5 you would predict Ŷ = 1.5 + (2.0 × 5) = 11.5. The residuals shown in the table (Y − Ŷ) tell you how far each actual observation is from the model's prediction.
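The worked example above can be checked in a couple of lines. The observed value here (11.0) is hypothetical, chosen only to illustrate a residual:

```python
# Example coefficients from above: a = 1.5, b = 2.0
a, b = 1.5, 2.0

def predict(x):
    """Predicted value Ŷ = a + bX."""
    return a + b * x

y_hat = predict(5)            # 1.5 + 2.0 * 5 = 11.5
observed = 11.0               # hypothetical observation, for illustration
residual = observed - y_hat   # Y - Ŷ: the point sits 0.5 below the fitted line
```

A negative residual means the actual observation fell below the line's prediction; a positive one means it fell above.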

More Math Tools