Cox showed that if you want to represent ‘degrees of belief’ using real numbers in a way consistent with classical logic, you must use probability theory (link). Bayes’ theorem is the theoretically correct way to update probabilities in light of evidence. Bayesian statistics is the natural combination of these two facts.
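As a minimal sketch of the update rule, here is Bayes’ theorem applied to a single hypothesis; all the numbers (prior, likelihoods) are made-up assumptions for the example:

```python
# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)
# Updating belief in a hypothesis H after seeing evidence E.
# All numbers below are illustrative assumptions, not from the text.

prior = 0.01            # P(H): belief before seeing the evidence
p_e_given_h = 0.95      # P(E|H): how likely the evidence is if H is true
p_e_given_not_h = 0.05  # P(E|not H): how likely the evidence is otherwise

# Law of total probability gives the normalising constant P(E)
p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)

posterior = p_e_given_h * prior / p_e  # P(H|E)
print(f"posterior = {posterior:.3f}")  # roughly 0.161
```

Even with strongly diagnostic evidence, the low prior keeps the posterior modest; the update is a single multiplication and normalisation.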

Bayesian statistics has two chief advantages over other kinds of statistics:

1. Bayesian statistics is conceptually simple.
• This excellent book introduces statistics, some history, and the entire theoretical foundation of Bayesian statistics in a mere 12 pages; the rest of the book is examples and methods.
• Users of classical statistics very frequently misunderstand what p-values and confidence intervals mean. In contrast, a posterior distribution is exactly what you’d expect it to be: the probability of the hypothesis given the data.
• After learning the basics, students can easily derive their own methods and set up their own problems. This is not at all true in classical statistics.
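As an illustration of how directly a student can set up a problem from scratch, here is the classic coin-bias example: put a uniform prior on the bias θ, multiply by the likelihood, and normalise over a grid. The data (7 heads in 10 flips) and grid size are assumptions made up for the example:

```python
# Posterior for a coin's bias theta after observing 7 heads in 10 flips,
# with a uniform prior, computed by simple grid approximation.
# The data (7 heads, 3 tails) are illustrative assumptions.

n_points = 10001
grid = [i / (n_points - 1) for i in range(n_points)]

heads, tails = 7, 3
# Unnormalised posterior = likelihood * prior (the prior is flat, so constant)
unnorm = [t**heads * (1 - t)**tails for t in grid]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# The posterior is a genuine distribution over theta: summarise it directly.
post_mean = sum(t * p for t, p in zip(grid, posterior))
p_biased = sum(p for t, p in zip(grid, posterior) if t > 0.5)
print(f"posterior mean  = {post_mean:.3f}")   # close to 8/12, about 0.667
print(f"P(theta > 0.5)  = {p_biased:.3f}")
```

Note that quantities like “the probability the coin is biased towards heads” fall straight out of the posterior; there is no separate procedure to learn.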
2. Bayesian statistics is almost always explicitly model-centric: it requires you to write down a model that describes your problem. This has several advantages:
• It’s often easy to build a model closely tailored to your problem, and once you have the model, you immediately know how to solve it conceptually, even if not always practically.
• It makes it harder to be confused about what you’re doing.
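As a sketch of tailoring a model to the data at hand: if your observations are event counts rather than coin flips, a natural choice is a Poisson likelihood with a Gamma prior, and the posterior then follows in closed form. The prior parameters and data here are made-up assumptions for the example:

```python
# Tailoring the model to the problem: for count data, a Poisson likelihood
# with a Gamma(a, b) prior is conjugate, so the posterior is simply
# Gamma(a + sum(data), b + len(data)) in closed form.
# The prior parameters and the data below are illustrative assumptions.

a, b = 2.0, 1.0        # Gamma prior: shape a, rate b
data = [3, 4, 5]       # observed event counts

a_post = a + sum(data)  # conjugate update of the shape
b_post = b + len(data)  # conjugate update of the rate

post_mean = a_post / b_post  # posterior mean of the Poisson rate
print(f"posterior: Gamma({a_post}, {b_post}), mean = {post_mean:.2f}")
```

Changing the model to fit the problem changed only the likelihood and prior; the recipe, multiply and normalise, stayed the same.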