Shannon Entropy Calculator

Enter a text string or a set of event probabilities to compute Shannon entropy. Choose your input mode — text analysis or manual probabilities — and the calculator returns entropy in bits per symbol, total metric entropy, and a breakdown of each symbol's contribution. Perfect for information theory coursework, data compression analysis, and randomness testing.

Choose whether to analyse a text string or enter custom probabilities.

Paste any string — binary, ASCII, or natural language.

Treat uppercase and lowercase letters as the same symbol.

Exclude space characters from the entropy calculation.

How many distinct events / symbols do you have?

Probability of event 1 (0 to 1).

Probability of event 2.

Probability of event 3.

Probability of event 4.

Probability of event 5.

Probability of event 6.

Results

Shannon Entropy

--

Total Metric Entropy

--

Alphabet Size (Unique Symbols)

--

Maximum Possible Entropy

--

Relative Entropy (Efficiency)

--

Symbol Probability Distribution

Results Table

Frequently Asked Questions

What is Shannon entropy?

Shannon entropy, introduced by Claude E. Shannon in his 1948 paper 'A Mathematical Theory of Communication', is a measure of the uncertainty or randomness in a set of data. It quantifies the average minimum number of bits needed to encode a sequence of symbols, given their probability distribution. Higher entropy means more unpredictability; lower entropy means more structure or redundancy.
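For text input, this means counting how often each symbol occurs and treating the relative frequencies as probabilities. A minimal sketch in Python (the function name is illustrative, not the calculator's actual implementation):

```python
from collections import Counter
from math import log2

def shannon_entropy(text: str) -> float:
    """Entropy in bits per symbol, using symbol frequencies as probabilities."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Two symbols, equally likely -> maximum entropy for a binary alphabet:
print(shannon_entropy("01010101"))  # 1.0 bit per symbol
```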

What is the Shannon entropy formula?

The formula is H(X) = −∑ P(xᵢ) · log₂ P(xᵢ), summed over all possible symbols or events. Each term P(xᵢ) · log₂ P(xᵢ) represents the contribution of symbol xᵢ to the total entropy. The negative sign is needed because log₂ of a probability between 0 and 1 is negative (and zero when the probability is exactly 1), so negating the sum makes the entropy non-negative, measured in bits.
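The formula translates directly into code. One detail worth noting: terms with P(xᵢ) = 0 are skipped, since the limit of p · log₂ p as p → 0 is 0. A sketch (function name illustrative):

```python
from math import log2

def entropy_from_probs(probs):
    """H(X) = -sum of p * log2(p); zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy_from_probs([0.5, 0.5]))  # fair coin: 1.0 bit
print(entropy_from_probs([0.9, 0.1]))  # biased coin: about 0.469 bits
```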

How is Shannon entropy used in information theory?

Shannon entropy is fundamental to data compression, cryptography, and communications. It sets a theoretical lower bound on lossless data compression — you cannot compress a source below its entropy without losing information. It is also used to measure the strength of passwords, evaluate randomness in random number generators, and assess feature importance in machine learning algorithms like decision trees.

What does maximum entropy mean?

Maximum entropy occurs when all symbols in the alphabet are equally probable. For an alphabet of n symbols, the maximum entropy is log₂(n) bits per symbol. Any deviation from a uniform distribution reduces entropy. The relative entropy (efficiency) shown by this calculator tells you what fraction of the theoretical maximum your distribution achieves.
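The efficiency figure is just the ratio H / log₂(n). A short sketch of that calculation (names are illustrative):

```python
from math import log2

def efficiency(probs):
    """Relative entropy: actual H divided by the maximum log2(n)."""
    h = -sum(p * log2(p) for p in probs if p > 0)
    return h / log2(len(probs))

print(efficiency([0.25, 0.25, 0.25, 0.25]))  # uniform -> 1.0
print(efficiency([0.7, 0.1, 0.1, 0.1]))     # skewed -> below 1.0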

What is the difference between Shannon entropy and metric entropy?

Shannon entropy (bits per symbol) describes the average information content of a single symbol. Metric entropy (total bits) is simply Shannon entropy multiplied by the total number of symbols in the string — it represents the total information content of the entire sequence.
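Using the definition above, the total figure is the per-symbol entropy scaled by the string length. A sketch, assuming the frequency-based entropy from text mode:

```python
from collections import Counter
from math import log2

def total_metric_entropy(text: str) -> float:
    n = len(text)
    h = -sum((c / n) * log2(c / n) for c in Counter(text).values())
    return h * n  # bits per symbol times number of symbols

print(total_metric_entropy("ABAB"))  # 1 bit/symbol * 4 symbols = 4.0 bits
```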

Why do probabilities need to sum to 1?

Shannon entropy is defined over a valid probability distribution, where all probabilities are non-negative and sum to exactly 1. If your probabilities don't sum to 1, they don't form a valid distribution, and the resulting entropy value would be meaningless. This calculator automatically normalises your entered probabilities so they sum to 1 before computing entropy.
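Normalisation simply divides each entered value by the total, so the proportions are preserved. A sketch of that step (function name illustrative):

```python
def normalise(weights):
    """Scale raw non-negative weights so they sum to exactly 1."""
    total = sum(weights)
    return [w / total for w in weights]

print(normalise([2, 1, 1]))  # [0.5, 0.25, 0.25]
```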

How does entropy relate to password strength?

Password entropy estimates how difficult a password is to guess by brute force. A password drawn from a larger, more varied character set has higher entropy. For a password of length L using an alphabet of N characters (assuming uniform distribution), the entropy is L · log₂(N) bits. Higher entropy passwords are exponentially harder to crack.
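The L · log₂(N) estimate is easy to compute directly. A sketch, assuming each character is drawn uniformly and independently (the 94-symbol figure below is the printable ASCII set, used only as an example):

```python
from math import log2

def password_entropy(length: int, alphabet_size: int) -> float:
    """Bits of entropy for a uniformly random password."""
    return length * log2(alphabet_size)

# 12 characters from the 94 printable ASCII symbols:
print(password_entropy(12, 94))  # about 78.7 bits
```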

What does an entropy of 0 mean?

An entropy of 0 bits per symbol means there is absolutely no uncertainty — the same symbol always appears (probability = 1 for one symbol, 0 for all others). For example, the string 'AAAAAAA' has zero entropy because you can perfectly predict every character. Real-world strings almost always have entropy greater than 0.

More Statistics Tools