Logistic Regression Calculator

Enter your binary outcome data (0s and 1s) and a corresponding predictor variable (X values) to fit a logistic regression model. This calculator estimates the intercept (β₀) and coefficient (β₁), computes the log-likelihood, and plots the S-curve probability model — P = 1 / (1 + e^−(β₀ + β₁x)). Paste your data below, one value per line, to see your model results.

Enter the independent variable values, one per line. These can be continuous or binary.

Enter binary outcome values (0 = failure, 1 = success), one per line. Must match the number of X values.

Optional: enter a specific X value to compute the predicted probability.

Results

Predicted P(Y=1) at X

--

Intercept (β₀)

--

Coefficient (β₁)

--

Log-Likelihood

--

Null Deviance

--

Residual Deviance

--

AIC

--

Number of Observations

--

Successes (Y=1)

--

Failures (Y=0)

--

Logistic Regression S-Curve

Results Table

Frequently Asked Questions

What is logistic regression and when should I use it?

Logistic regression is a statistical method used to model the probability of a binary outcome (0 or 1) based on one or more predictor variables. You should use it when your dependent variable has exactly two categories — such as pass/fail, disease/no disease, or purchase/no purchase — and you want to understand how predictors influence the likelihood of each outcome.

What do the β₀ and β₁ coefficients mean?

β₀ (the intercept) represents the log-odds of the outcome when the predictor X equals zero. β₁ (the slope coefficient) represents the change in log-odds for each one-unit increase in X. A positive β₁ means the probability of Y=1 increases as X increases; a negative β₁ means the opposite.
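Because the coefficients live on the log-odds scale, exponentiating them gives more interpretable quantities. A minimal sketch, using hypothetical coefficient values (β₀ = −3, β₁ = 0.1) rather than output from any particular dataset:

```python
import math

# Hypothetical fitted coefficients (for illustration only)
beta0 = -3.0   # intercept: log-odds of Y=1 when X = 0
beta1 = 0.1    # slope: change in log-odds per one-unit increase in X

# The odds of Y=1 at X = 0 are e^beta0
odds_at_zero = math.exp(beta0)

# Each one-unit increase in X multiplies the odds by e^beta1 (the odds ratio)
odds_ratio = math.exp(beta1)

print(f"odds at X=0: {odds_at_zero:.4f}")
print(f"odds ratio per unit of X: {odds_ratio:.4f}")
```

Here the positive β₁ yields an odds ratio greater than 1, matching the statement that the probability of Y=1 rises with X.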

How is the logistic regression probability formula calculated?

The model estimates P(Y=1) using the formula P = 1 / (1 + e^−(β₀ + β₁·x)). This S-shaped (sigmoid) curve maps any real-valued linear combination of predictors to a probability between 0 and 1. The coefficients are estimated using maximum likelihood estimation (MLE), which iteratively finds the β values that make the observed data most probable.
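As a sketch of how MLE works under the hood (an illustration of the standard Newton-Raphson approach, not necessarily this calculator's exact implementation), the β values can be found by repeatedly stepping along the gradient of the log-likelihood, scaled by the Fisher information:

```python
import math

def sigmoid(z):
    """The S-curve: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, iters=25):
    """Fit P(Y=1) = sigmoid(b0 + b1*x) by Newton-Raphson MLE."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # Fisher information (negative Hessian)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            g0 += y - p
            g1 += (y - p) * x
            w = p * (1 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: beta += inverse(information) @ gradient (2x2 inverse)
        det = h00 * h11 - h01 * h01
        b0 += ( h11 * g0 - h01 * g1) / det
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1
```

At convergence the gradient is (numerically) zero, which is exactly the condition that the observed data are most probable under the fitted β values.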

What is log-likelihood and why does it matter?

Log-likelihood measures how well the fitted model explains the observed data. A higher (less negative) log-likelihood indicates a better fit. It is used to compute other diagnostics like the residual deviance (−2 × log-likelihood) and AIC, which help compare models and assess goodness of fit.
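The log-likelihood itself is just a sum over the observations. A minimal sketch, with made-up data, showing the formula each row contributes:

```python
import math

def log_likelihood(xs, ys, b0, b1):
    """Sum of y*log(p) + (1-y)*log(1-p) over all observations."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll

# With b0 = b1 = 0 every fitted probability is 0.5, so ll = n * log(0.5)
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
print(log_likelihood(xs, ys, 0.0, 0.0))  # 6 * log(0.5) ≈ -4.1589
```

Better-fitting coefficients push each fitted p toward its observed y, making each log term less negative and raising the total.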

What is the difference between null deviance and residual deviance?

Null deviance measures how well a model with no predictors (only an intercept) fits the data. Residual deviance measures how well your fitted model (with predictors) fits. A large reduction from null to residual deviance indicates your predictor significantly improves the model. Ideally, residual deviance should be much lower than null deviance.
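To make the comparison concrete, here is a sketch of the null deviance for a hypothetical outcome vector. The intercept-only model fits every observation with the same probability, the overall success rate:

```python
import math

# Hypothetical outcomes (for illustration only)
ys = [0, 0, 1, 0, 1, 1]

# Null model: every fitted probability equals p_bar = (number of 1s) / n
p_bar = sum(ys) / len(ys)
null_ll = sum(y * math.log(p_bar) + (1 - y) * math.log(1 - p_bar) for y in ys)
null_deviance = -2.0 * null_ll

print(f"null deviance: {null_deviance:.4f}")
```

The residual deviance is computed the same way, except each observation uses its own fitted probability from the model with predictors; the drop from null to residual deviance is the improvement the predictor buys.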

What is AIC and how do I use it to compare models?

AIC (Akaike Information Criterion) balances model fit against complexity. It is calculated as AIC = −2 × log-likelihood + 2 × number of parameters. Lower AIC values indicate a better model. AIC is most useful when comparing two or more competing models on the same dataset — choose the model with the lowest AIC.
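The AIC arithmetic is simple enough to show directly; the log-likelihood value below is made up for illustration:

```python
def aic(log_lik, n_params):
    """AIC = -2 * log-likelihood + 2 * number of estimated parameters."""
    return -2.0 * log_lik + 2.0 * n_params

# A single-predictor logistic model estimates 2 parameters: b0 and b1.
# Hypothetical fitted log-likelihood of -3.5:
print(aic(-3.5, 2))  # -2 * (-3.5) + 2 * 2 = 11.0
```

Note that the penalty term means a model with extra predictors must improve the log-likelihood by more than 1 unit per parameter to lower its AIC.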

How many data points do I need for logistic regression?

A common rule of thumb is to have at least 10 events (occurrences of the less frequent outcome) per predictor variable. For a single predictor model, this means at least 10 cases of both Y=0 and Y=1. Small samples can lead to unstable coefficient estimates and overfitting, so larger datasets generally yield more reliable results.

What does the predicted probability output mean?

When you enter a specific X value in the 'Predict Probability' field, the calculator uses your fitted model to estimate P(Y=1) at that value. For example, if X represents patient age and Y represents disease presence, entering X=50 gives the estimated probability that a 50-year-old has the disease, based on your data.
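The prediction step is a single evaluation of the sigmoid formula. A sketch using hypothetical coefficients (β₀ = −5, β₁ = 0.1), chosen so the math is easy to check by hand:

```python
import math

def predict_prob(x, b0, b1):
    """P(Y=1) at X = x under the fitted model: 1 / (1 + e^-(b0 + b1*x))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical age/disease coefficients (not from real data)
b0, b1 = -5.0, 0.1

# At X = 50 the linear predictor is -5 + 0.1 * 50 = 0, so P = 0.5
print(predict_prob(50, b0, b1))  # 0.5
```

Any other X value plugs into the same formula; the result is always between 0 and 1.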
