Why is the Exponential Distribution called Memoryless?

4 min read · Apr 27, 2021

The Exponential distribution is the only memoryless continuous distribution. Its cousin, the Geometric distribution, is also memoryless, but it is a discrete distribution.

Image Credits: https://dlsun.github.io/

Memorylessness is a very powerful property of the Exponential distribution. The distribution is memoryless because the past has no bearing on its future behavior: every instant is like the beginning of a new random period, which has the same distribution regardless of how much time has already elapsed. In other words, the distribution “forgets” what has come before it. If we have already been waiting for an event, the time we have waited neither increases nor decreases the probability that the event happens in the next interval. Any time may be marked down as time zero. Let’s say hurricanes hit our island with exponentially distributed gaps between them. The probability that another hurricane hits within the next month is the same whether the last one hit one week, one month, or ten years ago.

Proof

Let X be exponentially distributed with rate parameter λ, and let s and t be non-negative. Suppose we know X > t. What is the probability that X is also greater than some value s + t? That is, we want to know: P(X > s + t | X > t)

This type of problem shows up frequently in queueing systems where we’re interested in the time between events. For example, suppose that jobs in our system have exponentially distributed service times. If we have a job that’s been running for one hour, what’s the probability that its total running time will exceed two hours?

Using the definition of conditional probability, we have

P(X > s + t | X > t) = P(X > s + t and X > t) / P(X > t)

If X > s + t, then X > t is redundant, so we can simplify the numerator.

P(X > s + t | X > t) = P(X > s + t) / P(X > t)

Using the CDF of the exponential distribution, F(x) = 1 - e^(-λx), we have P(X > x) = e^(-λx), so

P(X > s + t | X > t) = e^(-λ(s + t)) / e^(-λt)

The e^(-λt) terms cancel, giving the surprising result

P(X > s + t | X > t) = e^(-λs) = P(X > s)

It turns out that the conditional probability does not depend on t! The probability that an exponential random variable exceeds s + t, given that it has already exceeded t, is the same as the probability that it exceeds s in the first place. In our job example, the probability that a job runs for more than one additional hour is the same as the probability that a brand-new job runs for more than one hour, regardless of how long the job has already been running.
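The identity P(X > s + t | X > t) = P(X > s) is easy to check numerically. The snippet below is a minimal sketch, not a definitive implementation: the rate lam = 0.5, the horizon s = 1.0, the sample size, and the function names are all illustrative assumptions. It evaluates the conditional probability both from the closed form and from a small simulation, for several values of t.

```python
import math
import random


def survival(x, lam):
    """P(X > x) for an exponential random variable with rate lam."""
    return math.exp(-lam * x)


def conditional_survival(s, t, lam):
    """P(X > s + t | X > t), straight from the definition of conditional probability."""
    return survival(s + t, lam) / survival(t, lam)


def simulated_conditional_survival(s, t, lam, n=200_000, seed=0):
    """Monte Carlo estimate of P(X > s + t | X > t)."""
    rng = random.Random(seed)
    samples = [rng.expovariate(lam) for _ in range(n)]
    # Keep only the draws that are known to have exceeded t.
    survivors = [x for x in samples if x > t]
    return sum(x > s + t for x in survivors) / len(survivors)


lam = 0.5  # illustrative rate (assumption)
s = 1.0    # illustrative horizon (assumption)

print("P(X > s) =", survival(s, lam))
for t in (0.0, 1.0, 5.0):
    exact = conditional_survival(s, t, lam)
    estimate = simulated_conditional_survival(s, t, lam)
    print(f"t = {t}: exact = {exact:.4f}, simulated = {estimate:.4f}")
```

The exact column should be identical for every t, and the simulated column should agree with it up to sampling noise.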

Implications of the Memoryless Property

The memoryless property makes it easy to reason about the average behavior of exponentially distributed items in queueing systems. Suppose we’re observing a stream of events with exponentially distributed interarrival times. Because of the memoryless property, the expected time until the next event is always 1/λ, no matter how long we’ve been waiting for a new arrival to occur.
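To see the 1/λ claim in action, here is a small simulation sketch. The rate lam = 0.1 (one event per 10 time units on average), the elapsed times, and the function name are assumptions chosen for illustration. It draws exponential interarrival times, keeps only the ones longer than the time already waited, and averages what is left of the wait.

```python
import random


def mean_remaining_wait(lam, elapsed, n=200_000, seed=1):
    """Estimate E[X - elapsed | X > elapsed] for exponential X with rate lam."""
    rng = random.Random(seed)
    remaining = []
    for _ in range(n):
        x = rng.expovariate(lam)
        if x > elapsed:
            # We have already waited `elapsed`; record what is still to come.
            remaining.append(x - elapsed)
    return sum(remaining) / len(remaining)


lam = 0.1  # illustrative rate, so 1/lam = 10 (assumption)
for elapsed in (0, 5, 20):
    print(f"waited {elapsed}: mean remaining wait = {mean_remaining_wait(lam, elapsed):.2f}")
```

Each printed average should sit close to 1/lam = 10, whether we have waited 0, 5, or 20 time units.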

This behavior is a bit counterintuitive. We might expect that arrivals get more likely the longer we wait. For example, if the bus is supposed to come every ten minutes, and we have been waiting for nine minutes without seeing a bus, we expect that the next bus should be along very soon. If the time between bus arrivals is exponentially distributed, however, the memoryless property tells us that our waiting time, no matter how long it’s been, is of no use in predicting when the next bus will arrive.

Suppose we have a queue with exponentially distributed service times. If a new customer arrives at the queue to find someone in service, the residual service time is the time until the customer currently in service finishes and departs the queue. Because of the memoryless property, the distribution of the residual service time does not depend on how long that customer has been in service. The probability that the current customer needs more than one additional minute of service is the same as the probability that a new customer just entering service needs more than one minute. Likewise, the average remaining service time is simply s̄ (s-bar), the expected service time of a new customer just entering service.
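Going back to the bus example, the contrast with our intuition can be made concrete. The sketch below compares the chance that the bus shows up within the next minute, given that we have already waited w minutes, under two models with the same 10-minute mean gap: the exponential, and a hypothetical alternative where the gap is uniform on [0, 20] minutes. The uniform model is purely an assumption picked for comparison; it is not something the bus example itself specifies.

```python
import math

MEAN_GAP = 10.0  # mean minutes between buses (illustrative assumption)


def exp_prob_next(w, window=1.0, mean=MEAN_GAP):
    """P(X <= w + window | X > w) when X is exponential with the given mean."""
    lam = 1.0 / mean

    def survival(x):
        return math.exp(-lam * x)

    return (survival(w) - survival(w + window)) / survival(w)


def uniform_prob_next(w, window=1.0, upper=2 * MEAN_GAP):
    """P(X <= w + window | X > w) when X is uniform on [0, upper] (hypothetical model)."""
    if w >= upper:
        raise ValueError("under this model the bus has already arrived by time `upper`")

    def survival(x):
        return max(0.0, 1.0 - x / upper)

    return (survival(w) - survival(w + window)) / survival(w)


for w in (0, 9, 18):
    print(f"waited {w} min: exponential = {exp_prob_next(w):.4f}, "
          f"uniform = {uniform_prob_next(w):.4f}")
```

Under the uniform model the probability climbs as the wait gets longer, which matches our intuition. Under the exponential model it stays the same no matter how long we have waited.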

Another Perspective

While many authors consider the memoryless property to be one of the most useful aspects of the distribution, it’s also the reason why the exponential may not make sense for modeling certain situations. For example, let’s say we wanted to model deaths of family dogs. Because of the memoryless property, a 15-year-old dog would be no more likely to die in the coming year than a 1-year-old dog, which is obviously nonsensical. Therefore, we should think about whether the exponential makes sense logically for our particular area of interest. For lifetime studies, the exponential is usually used only as a first rough model for the process.
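The dog example can be put into numbers. The sketch below computes the probability of dying within the next year, given survival to a certain age, for a Weibull lifetime with survival function exp(-(t/scale)^shape). With shape = 1 the Weibull is exactly the exponential; shape = 3 stands in for a hypothetical model in which risk grows with age. The scale and shape values are made up for illustration and are not fitted to any real data.

```python
import math


def prob_death_within_year(age, scale, shape):
    """P(T <= age + 1 | T > age) for a Weibull lifetime T.

    Survival function: P(T > t) = exp(-(t / scale) ** shape).
    shape == 1 gives the exponential distribution with mean `scale`.
    """
    def survival(t):
        return math.exp(-((t / scale) ** shape))

    return 1.0 - survival(age + 1) / survival(age)


for age in (1, 8, 15):
    memoryless = prob_death_within_year(age, scale=12.0, shape=1.0)  # exponential
    aging = prob_death_within_year(age, scale=13.0, shape=3.0)       # hypothetical aging model
    print(f"age {age}: exponential = {memoryless:.4f}, aging model = {aging:.4f}")
```

The exponential column should be the same at every age, while the aging column rises steeply, which is closer to what we would expect for real pets.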

I hope I did justice to this intriguing question.
Thanks for reading! Stay Safe!

References:
1. https://www.statisticshowto.com/
2. Conditional Probabilities and the Memoryless Property — Daniel Myers.

Written by Aman Gupta
