In my last post, Cyan brought up the issue that many practitioners of statistics might object to using prior information in Bayesian statistics. The philosophical case for using prior information is very strong, and I think most people intuitively agree that using prior information is legitimate, at the very least in selecting what kinds of models to consider. I think most statistics users would be OK with using prior information when there is some kind of objective prior distribution. However, people justifiably worry about bias or overconfidence on the part of the statistician; people don’t want the results of statistics to depend much on the identity of the statistician.
In practice, this problem is not too hard to sidestep. There are at least two approaches:
The first is to include significantly less prior information than is available, to make statistical inference robust to bias and overconfidence. Two common ways to do this are to use weakly informative priors or non-informative/maximum entropy priors. Weakly informative priors are very broad distributions that still include some prior information that almost no one would object to. For example, if you’re estimating the strength of a metal alloy, you might choose a prior distribution that expresses your belief that the alloy is probably stronger than tissue paper but less than a hundred times as strong as the strongest known material. Maximum entropy priors encode as little information about the parameters of interest as possible, given whatever constraints you impose on them.
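To make the alloy example a bit more concrete, here is a minimal sketch of one way to turn that kind of "almost no one would object" range into a weakly informative prior. It assumes a log-normal prior on strength, which is my choice for illustration and only one of many reasonable options. The two bounds are made-up placeholders standing in for "tissue paper" and "a hundred times the strongest known material", not measured values, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_prior_from_bounds(lower, upper, mass=0.95):
    """Return (mu, sigma) of log-strength so that a log-normal prior
    puts `mass` of its probability between `lower` and `upper`."""
    # z-score that leaves (1 - mass) / 2 in each tail of a standard normal
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    # Centre the prior on the midpoint of the bounds on the log scale
    mu = (math.log(lower) + math.log(upper)) / 2.0
    # Spread it so the bounds sit at -z and +z standard deviations
    sigma = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return mu, sigma


# Placeholder bounds in MPa (assumptions for illustration only)
LOWER_MPA = 1.0      # stand-in for "about as strong as tissue paper"
UPPER_MPA = 1.0e7    # stand-in for "100x the strongest known material"

mu, sigma = lognormal_prior_from_bounds(LOWER_MPA, UPPER_MPA)

# Check how much prior probability actually falls inside the bounds
log_strength = NormalDist(mu, sigma)
mass_inside = (log_strength.cdf(math.log(UPPER_MPA))
               - log_strength.cdf(math.log(LOWER_MPA)))

print(f"log-scale mean = {mu:.3f}, log-scale sd = {sigma:.3f}")
print(f"prior mass inside the bounds = {mass_inside:.3f}")
```

The prior spans many orders of magnitude, so it rules out almost nothing that a reasonable person would consider plausible, while still keeping the inference away from absurd values.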
The second is to do the calculations using several different prior distributions that different consumers of the statistics might think are relevant. This accomplishes something like a sensitivity analysis for the prior distribution. For example, you might include a non-informative distribution, a weakly informative distribution and a very concentrated prior distribution. This allows people with different prior opinions to choose the result that makes the most sense to them.
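Here is a rough sketch of what that kind of prior sensitivity analysis could look like in the simplest case I can think of: estimating a mean from normally distributed measurements with a known noise level, where the posterior has a closed form. The data, the noise level, and the three priors are all invented for illustration, and the flat prior is handled as the limit of an infinitely wide normal prior.

```python
import math


def normal_posterior(data, noise_sd, prior_mean, prior_sd):
    """Posterior (mean, sd) for a normal mean with known noise sd.

    Uses the standard conjugate normal-normal update. Passing
    prior_sd = math.inf gives the flat (non-informative) limit.
    """
    n = len(data)
    sample_mean = sum(data) / n
    data_precision = n / noise_sd ** 2
    prior_precision = 0.0 if math.isinf(prior_sd) else 1.0 / prior_sd ** 2
    post_precision = data_precision + prior_precision
    post_mean = (data_precision * sample_mean
                 + prior_precision * prior_mean) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


# Made-up strength measurements in MPa and an assumed known noise sd
measurements = [512.0, 498.0, 505.0, 521.0, 494.0]
NOISE_SD = 10.0

# Three hypothetical priors: (prior mean, prior sd)
priors = {
    "non-informative": (0.0, math.inf),
    "weakly informative": (500.0, 1000.0),
    "concentrated": (450.0, 5.0),
}

# Run the same analysis once per prior
results = {
    name: normal_posterior(measurements, NOISE_SD, mean, sd)
    for name, (mean, sd) in priors.items()
}

for name, (post_mean, post_sd) in results.items():
    print(f"{name:>20}: posterior mean = {post_mean:.2f}, sd = {post_sd:.2f}")
```

If the results under the different priors are close to each other, the data are doing most of the work and the choice of prior matters little. If they differ a lot, as they do here for the concentrated prior, that is worth reporting alongside the results.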