So, you want to calculate the probability of an event knowing that another has happened. There is a formula for that, called conditional probability, but why is it the way it is? Let’s first write down the definition of conditional probability:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
We need to wonder: what does the occurrence of event $B$ tell us about the odds of event $A$ occurring? How much more likely does $A$ become if $B$ happens? Think in terms of how $B$ affects $A$.
If $A$ and $B$ are independent, then knowing something about $B$ will not tell us anything at all about $A$, at least nothing that we did not know already. In this case $P(A \cap B) = P(A)\,P(B)$ and thus $P(A \mid B) = P(A)$. This makes sense! In fact, consider this example: how does me buying a copybook affect the likelihood that your grandma is going to buy a frying pan? It does not: the first event has no influence on the second, thus the conditional probability is just the same as the plain probability of the second event.
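To check the independent case with actual numbers, here is a minimal sketch that enumerates two fair dice. The events (first die is even, second die shows a six) and the helper `prob` are my own choices for illustration, not part of the argument above.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))

# Hypothetical events, chosen only for illustration.
A = {o for o in omega if o[0] % 2 == 0}  # first die shows an even number
B = {o for o in omega if o[1] == 6}      # second die shows a six

p_a = prob(A)
p_a_given_b = prob(A & B) / prob(B)      # definition of conditional probability

print(p_a, p_a_given_b)                  # both values are the same fraction
print(prob(A & B) == prob(A) * prob(B))  # product rule for independent events
```

Note that the two events overlap (there are outcomes where the first die is even and the second is a six), yet knowing that one happened does not change the probability of the other.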
If $A$ and $B$ are not independent, several things can happen, and that is where things get interesting. We know that $B$ happened, and we should now think as if $B$ were our whole universe. The idea is: we already know the odds of $A$, right? They are just $P(A)$. But how do they change if we know that we do not really have to consider all possible events, but just a subset of them? As an example, think of the probability of drawing a red ball versus the probability of drawing a red ball knowing that all balls are red. This makes a huge difference, right? (As an aside, that is what we mean when we say that probability is a measure of our ignorance.)
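The effect of shrinking the universe can be seen by simple counting. The sketch below uses a single fair die rather than an urn (my own assumption, picked because it is easy to enumerate) and compares the chance of a six among all outcomes with the chance of a six when only the even outcomes are still possible.

```python
from fractions import Fraction

universe = [1, 2, 3, 4, 5, 6]  # one fair die, all faces equally likely

def chance_of_six(outcomes):
    """Fraction of the given equally likely outcomes that are a six."""
    return Fraction(sum(1 for o in outcomes if o == 6), len(outcomes))

# Suppose we learn the roll is even: only these outcomes remain possible.
evens_only = [o for o in universe if o % 2 == 0]

p_six = chance_of_six(universe)                # counted over all six faces
p_six_among_evens = chance_of_six(evens_only)  # counted over the three even faces

print(p_six, p_six_among_evens)
```

The event itself has not changed, only the set of outcomes we still consider possible.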
So anyway, now we ask: what is the probability of $A$? Well, it would just be $P(A)$, but we must account for the fact that we now live inside $B$, and everything outside it is as if it did not exist. So $P(A)$ actually becomes $P(A \cap B)$: we only care about the part of $A$ that is inside $B$, because that is where we live now.
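To make "the part of $A$ that is inside $B$" concrete, here is a small sketch with a standard 52-card deck. The choice of events (drawing a king, drawing a heart) is hypothetical; the code only computes the two quantities named in the paragraph above, both measured against the full deck.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))          # 52 equally likely cards

# Hypothetical events, chosen only for illustration.
A = {card for card in deck if card[0] == "K"}       # the card is a king
B = {card for card in deck if card[1] == "hearts"}  # the card is a heart

inside = A & B                             # the part of A that lies inside B

p_a = Fraction(len(A), len(deck))          # probability of A in the full deck
p_a_and_b = Fraction(len(inside), len(deck))  # probability of the part of A inside B

print(sorted(inside), p_a, p_a_and_b)
```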
But there is a caveat.