Bayes’ rule for probability densities is a somewhat treacherous area, since it is derived differently from Bayes’ rule for probabilities even though the two rules have the same form. Let’s start off with the latter.
Bayes’ rule for probabilities
Given the (nonzero) probability $latex P(B)$ for the event $latex B$ and the probability $latex P(A \cap B)$ for the event $latex A \cap B$ (that is, the events $latex A$ and $latex B$ occurring simultaneously), one can define the conditional probability of $latex A$ given $latex B$:

(1) $latex P(A|B) = \frac{P(A \cap B)}{P(B)}$
This implies by symmetry that $latex P(A|B)\,P(B) = P(A \cap B) = P(B|A)\,P(A)$, which gives Bayes’ rule for probabilities

(2) $latex P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$
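As a quick numerical illustration of rule (2), here is a Python sketch of the classic diagnostic-test example. All the rates below (`p_disease`, `p_pos_given_disease`, `p_pos_given_healthy`) are made-up numbers chosen purely for illustration, not anything from the text above:

```python
# Hypothetical rates for a diagnostic test (illustration only)
p_disease = 0.01              # P(A): prior probability of having the disease
p_pos_given_disease = 0.95    # P(B|A): test sensitivity
p_pos_given_healthy = 0.05    # P(B|not A): false positive rate

# Total probability of the evidence B (a positive test),
# via the law of total probability
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule (2): P(A|B) = P(B|A) P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # ≈ 0.16
```

With these (hypothetical) numbers, even a positive test leaves the disease fairly unlikely, because the prior $latex P(A)$ is so small.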
Bayes’ rule for probability densities
A probability density (or PDF, for probability density function) is a function $latex p(x)$ defined over some set $latex X$ so that

(3) $latex P(A) = \int_A p(x)\, \mathrm{d}\mu(x)$

for any measurable subset (event) $latex A \subseteq X$, where $latex \mu$ is some measure on the set $latex X$ (typically, but not necessarily, the Lebesgue measure of the coordinates used to span $latex X$).
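Definition (3) is easy to check numerically for a concrete density. The sketch below takes $latex p(x)$ to be the standard normal density on the real line with the Lebesgue measure, and approximates the integral with a simple trapezoidal rule; the helper `integrate` is my own, introduced just for this illustration:

```python
import math

def p(x):
    # standard normal density with respect to Lebesgue measure on the real line
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(f, a, b, n=10_000):
    # trapezoidal approximation of the integral in (3) over the event A = [a, b]
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + i * h) for i in range(1, n)) + f(b) / 2)

# P(A) for the event A = [-1, 1]
print(integrate(p, -1.0, 1.0))   # ≈ 0.683
# integrating over (effectively) the whole set X gives probability 1
print(integrate(p, -8.0, 8.0))   # ≈ 1.0
```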
A conditional probability density $latex p(x|y)$ is a probability density over $latex X$ that depends on an additional variable $latex y$ from some set $latex Y$, where $latex Y$ is generally not the same as $latex X$ (as is the case for the conditional probability defined above). It is also a probability density in the same sense as $latex p(x)$:

(4) $latex \int_X p(x|y)\, \mathrm{d}\mu(x) = 1 \quad \text{for every } y \in Y$
If there is a probability density $latex p(y)$ associated with the elements $latex y \in Y$ with respect to a measure $latex \nu$ on $latex Y$, one can define a joint probability density

(5) $latex p(x, y) = p(x|y)\, p(y)$

which is a probability density over the joint set $latex X \times Y$ with respect to the product measure $latex \mu \times \nu$, since

(6) $latex \int_{X \times Y} p(x, y)\, \mathrm{d}(\mu \times \nu) = \int_Y \left( \int_X p(x|y)\, \mathrm{d}\mu(x) \right) p(y)\, \mathrm{d}\nu(y) = \int_Y p(y)\, \mathrm{d}\nu(y) = 1$
(easily verified by insertion).
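The normalization in (6) can also be verified numerically on a truncated grid. The two densities below are hypothetical choices made only for this sketch: $latex y$ standard normal, and $latex x$ given $latex y$ normal with mean $latex y$ and unit variance:

```python
import math

# Hypothetical densities for illustration: y ~ N(0, 1), and x | y ~ N(y, 1)
def p_y(y):
    return math.exp(-y * y / 2) / math.sqrt(2 * math.pi)

def p_x_given_y(x, y):
    return math.exp(-(x - y) ** 2 / 2) / math.sqrt(2 * math.pi)

# Riemann sum of the joint density p(x, y) = p(x|y) p(y) over a grid
# covering [-10, 10] x [-10, 10], approximating the double integral in (6)
h = 0.05
total = sum(
    p_x_given_y(i * h, j * h) * p_y(j * h) * h * h
    for i in range(-200, 201)
    for j in range(-200, 201)
)
print(total)  # ≈ 1
```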
By symmetry, we have again that $latex p(x|y)\,p(y) = p(x, y) = p(y|x)\,p(x)$, which gives Bayes’ rule for probability densities

(7) $latex p(x|y) = \frac{p(y|x)\, p(x)}{p(y)}$
Bayes’ rule for probability densities is thus derived from the definition of the joint probability density (5), and not from the definition of conditional probability (1). It is also more limited than the probability formulation, applying only to joint probability spaces in which two sets of coordinates can be separated out.
© 2008 Emanuel Winterfors