Winsorized Mean Calculator

Enter a comma-separated list of numbers and a winsorization percentage to calculate the Winsorized Mean. Extreme values at both ends are replaced by the nearest boundary values rather than removed, giving you a robust average that resists outliers. You'll also see the original mean, the winsorized dataset, and how many values were replaced on each tail.

Percentage of values to replace at each tail (1–49%). E.g. 10% replaces the bottom 10% and top 10%.

Frequently Asked Questions

What is the Winsorized Mean?

The Winsorized Mean is a robust statistical measure that reduces the influence of outliers by replacing extreme values at both ends of a dataset with the nearest non-extreme boundary values. Unlike trimming (which removes outliers), Winsorization substitutes them, preserving the original sample size while limiting the distortion caused by extreme observations.

How is the Winsorized Mean calculated?

First, sort the dataset in ascending order. Then determine g = round(p × N), where p is the winsorization percentage expressed as a fraction (10% is 0.10) and N is the sample size; g is the number of values to replace on each tail. Replace the lowest g values with the (g+1)th smallest value and the highest g values with the (N−g)th smallest value. Finally, compute the arithmetic mean of this modified dataset.
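The steps above can be sketched in a few lines of Python. This is an illustration only, not the calculator's actual code: the function name `winsorized_mean` is hypothetical, and rounding halves upward when computing g is an assumption, since the rounding rule is not spelled out.

```python
def winsorized_mean(values, percent):
    """Return (winsorized mean, winsorized sorted data, g) for a percent per tail."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    # g = number of values replaced on each tail.
    # Assumption: halves round upward (the source only says "round").
    g = int(percent * n / 100 + 0.5)
    if 2 * g >= n:
        raise ValueError("percentage too large for this sample size")
    lower = data[g]          # the (g+1)th smallest value
    upper = data[n - g - 1]  # the (N-g)th smallest value
    winsorized = [lower] * g + data[g:n - g] + [upper] * g
    return sum(winsorized) / n, winsorized, g


sample = [2, 4, 5, 6, 7, 8, 9, 10, 11, 100]
mean_w, data_w, g = winsorized_mean(sample, 10)
print(sum(sample) / len(sample))  # original mean: 16.2
print(mean_w)                     # winsorized mean: 7.5
print(g, data_w)                  # 1 [4, 4, 5, 6, 7, 8, 9, 10, 11, 11]
```

With 10 values and 10% per tail, one value is replaced on each side: 2 becomes 4 and 100 becomes 11, which pulls the mean from 16.2 down to 7.5.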

What is the difference between Winsorized Mean and Trimmed Mean?

A trimmed mean removes the extreme values from both tails entirely and averages the remaining data, reducing sample size. A Winsorized mean replaces those extreme values with the nearest boundary values instead of discarding them, keeping sample size constant. Winsorization is generally preferred when you want to retain all observations but limit outlier influence.
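To make the contrast concrete, here is a small sketch that computes both means on the same data. The helper name `trimmed_and_winsorized` and the sample values are made up for illustration, and g (the number of values handled per tail) is passed in directly rather than derived from a percentage.

```python
def trimmed_and_winsorized(values, g):
    """Return (trimmed mean, trimmed size, winsorized mean, winsorized size)."""
    data = sorted(values)
    n = len(data)
    if g < 0 or 2 * g >= n:
        raise ValueError("g must satisfy 0 <= g < n / 2")
    core = data[g:n - g]                  # values left after dropping g per tail
    trimmed_mean = sum(core) / len(core)  # averages only the remaining values
    # Winsorizing keeps all n slots and fills the tails with the boundary values.
    winsorized = [core[0]] * g + core + [core[-1]] * g
    winsorized_mean = sum(winsorized) / n
    return trimmed_mean, len(core), winsorized_mean, n


sample = [1, 2, 3, 4, 5, 6, 7, 8, 20, 100]
t_mean, t_n, w_mean, w_n = trimmed_and_winsorized(sample, 1)
print(t_mean, t_n)  # 6.875 8
print(w_mean, w_n)  # 7.7 10
```

The trimmed mean averages 8 values, while the Winsorized mean still averages 10, with the boundary values 2 and 20 each counted twice.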

Can the Winsorized Mean handle multiple outliers?

Yes, but with limitations. The Winsorization percentage controls how many extreme values are replaced on each tail, so you can account for multiple outliers by increasing the percentage. However, as the percentage approaches 50%, almost every value is replaced and the Winsorized Mean collapses toward the median, so a very high setting (above 40–45%) distorts legitimate data rather than just limiting outliers.
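The effect of raising the percentage can be seen by stepping the number of replaced values per tail (g) upward on a small sample. The sketch below uses a hypothetical helper, `winsorized_mean_by_count`, and invented data; it takes g directly so that no rounding rule has to be assumed.

```python
import statistics


def winsorized_mean_by_count(values, g):
    """Winsorized mean when g values are replaced on each tail."""
    data = sorted(values)
    n = len(data)
    if g < 0 or 2 * g >= n:
        raise ValueError("g must satisfy 0 <= g < n / 2")
    lower, upper = data[g], data[n - g - 1]
    # Clamping to [lower, upper] is equivalent to replacing the g extremes per tail.
    clamped = [min(max(x, lower), upper) for x in data]
    return sum(clamped) / n


sample = [1, 2, 3, 4, 5, 6, 7, 8, 20, 100]  # two high outliers
means = [winsorized_mean_by_count(sample, g) for g in range(5)]
median = statistics.median(sample)
for g, m in enumerate(means):
    print(g, m)
print("median:", median)
```

With g = 0 the result is the ordinary mean; at the largest allowed g for this sample (4 of 10 values per tail) the Winsorized mean equals the median.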

Can the Winsorized Mean be used with non-numeric data?

No. The Winsorized Mean requires numeric data that can be sorted and averaged. It cannot be applied to categorical, nominal, or non-numeric datasets. For such data, other measures like mode or frequency-based statistics are more appropriate.

Does the Winsorized Mean preserve data variability?

Partially. Because outliers are replaced rather than removed, the dataset retains its original size. However, pulling the tail values in to the boundary values shrinks the spread, so the variance and standard deviation of the Winsorized dataset will be lower than those of the original data whenever a value is actually changed.
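A quick check of this effect, using Python's standard `statistics` module. The `winsorize` helper and the sample data are illustrative assumptions, not part of the calculator.

```python
import statistics


def winsorize(values, g):
    """Return a copy of values with the g lowest and g highest values replaced."""
    data = sorted(values)
    n = len(data)
    if g < 0 or 2 * g >= n:
        raise ValueError("g must satisfy 0 <= g < n / 2")
    lower, upper = data[g], data[n - g - 1]
    # Clamp each value to the boundaries; the original order is kept.
    return [min(max(x, lower), upper) for x in values]


sample = [2, 4, 5, 6, 7, 8, 9, 10, 11, 100]
wins = winsorize(sample, 1)

print(len(sample), len(wins))  # sample size is unchanged
print(statistics.variance(sample), statistics.variance(wins))
print(statistics.stdev(sample), statistics.stdev(wins))
```

The sample size stays at 10, while both the sample variance and the standard deviation drop once 2 and 100 are pulled in to 4 and 11.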

How does the Winsorized Mean impact hypothesis testing?

The Winsorized Mean can improve hypothesis testing robustness when data contains outliers, as it reduces the distorting effect of extreme values on test statistics. Tests based on Winsorized means (such as Winsorized t-tests) tend to have better Type I error control and power under heavy-tailed or skewed distributions compared to standard tests.

What winsorization percentage should I choose?

Common choices are 5%, 10%, or 20%, depending on how extreme the outliers are and how many are present. A 10% Winsorization (replacing the bottom 10% and top 10%) is a widely used default. Avoid going beyond roughly 25% unless your data has severe contamination, as over-Winsorization replaces legitimate values and pulls the mean toward the median.
