The learning approaches we have discussed so far are based on the principle of maximum likelihood estimation. While being extremely general, there are limitations of this approach as illustrated in the two examples below. Example 1 Let’s suppose we are interested in modeling the outcome of a biased coin, (X in {heads, tails}). We toss