Crash-Course: P-Value, Statistical Significance, and Confidence Level

Understanding the P-value and Statistical Significance is critical if you want to become a data scientist or Machine-Learning Engineer. These terms are quite difficult to understand in the beginning and may give you a headache. I hope to bring a simpler-to-understand way of learning about P-values and Statistical Significance in this tutorial. As a tip, I recommend reading the article once without straining to understand every detail, and then reading it a second time more carefully. This will help you make connections much faster and with less ache.

What is Statistical Significance?

As is well known, Statistics is not an exact science, even if it has a strong basis in mathematics. Statistics can be described as extremely well-thought-out and carefully processed guesswork. But if your guesswork makes you win with a 90% chance instead of a 10% chance, then it is worth knowing statistics.

The threshold used to decide whether a result is statistically significant is called the significance level. It usually has the notation α and is also called the Alpha Level.

Statistical Significance, simply put, tells you how much you can trust the results you base a decision on. It means checking whether the relationships you assume in your data are real, or significant, as the name implies. To make sure your research or statistical model is actually meaningful, you need to make sure the results that you get out of it did not appear by pure chance due to random factors. “A high degree of statistical significance indicates that an observed relationship is unlikely to be due to chance.”

“Statistical significance helps quantify whether a result is likely due to chance or to some factor of interest.”, Thomas Redman

Statistical Significance is extremely important in many cases, such as experiments, surveys, polls, and analyzing sets of data. For all those types of datasets, you need to keep something very important in mind when checking whether your results are statistically significant. The bigger your sample size is, the smaller the chance that a real relationship in your data gets mistaken for random noise. So be sure you use datasets that are big enough and that correctly capture the crucial information about what you are studying. If your dataset or sample size is too small to reflect reality, you have a high chance of getting wrong and misleading results.
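To make the effect of sample size concrete, here is a minimal sketch that asks how often a perfectly fair coin would show at least 60% heads purely by chance, for different numbers of tosses. The 60% figure, the toss counts, and the function name chance_of_at_least are assumptions chosen for illustration.

```python
from math import comb

def chance_of_at_least(heads: int, tosses: int) -> float:
    """Probability that a fair coin shows `heads` or more heads in `tosses` tosses."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses

# The same observed share of heads (60%) at three different sample sizes.
for tosses in (10, 100, 1000):
    heads = tosses * 6 // 10
    chance = chance_of_at_least(heads, tosses)
    print(f"{heads} heads in {tosses} tosses: chance with a fair coin = {chance:.2g}")
```

With only 10 tosses, 60% heads is something a fair coin produces all the time, so it tells you very little. With 1000 tosses, the same 60% is almost impossible to explain by chance alone.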

But how is Statistical Significance calculated after all? This is where the P-value comes into the spotlight.

P-value, you little devil

In research and data science, to find out how significant results are, everyone usually uses the P-value. This value is the probability (hence the P) of getting a result at least as extreme as the one you observed if chance alone were responsible for it. If that probability (the P-value) is small, chance alone is an unlikely explanation, so we have good reason to believe that some real factor influences the result. In a predictive model, this suggests that the given independent variable has some relationship with the dependent variable, and that it can be useful in predicting the dependent variable.

Make sure you don’t confuse the P-value with the Alpha value (α).

“The alpha value of a hypothesis is the threshold we use to determine whether or not our p-value is low enough to reject the null hypothesis.”

So if the alpha value is 0.05 and the p-value is 0.03, then we conclude that the relationship between the 2 variables is statistically significant. If the p-value is higher than 0.05 (e.g. 0.15), we conclude that the result is not statistically significant. Note that this does not prove the variables, or causes and effects, are unrelated. It only means the data does not give us enough evidence to claim a relationship.
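The comparison between the two values can be written as a tiny helper. This is only a sketch: the function name is_significant is made up for illustration, and some textbooks use "less than or equal to" instead of a strict "less than".

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the P-value falls below the chosen Alpha Level."""
    return p_value < alpha

# The two P-values from the example in the text, checked against alpha = 0.05.
for p_value in (0.03, 0.15):
    verdict = "significant" if is_significant(p_value) else "not significant"
    print(f"p-value {p_value} with alpha 0.05: {verdict}")
```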

An easy way to understand the P-value is to use the classical example in statistics: the coin toss. In a coin toss, you can get either heads or tails, each with an approximately 50% chance (there is a very small chance that the coin will land on its edge, but that chance is so small that we ignore it). We will assume something called a Null Hypothesis, which here means that the probability of getting heads or tails is exactly 50% and that the coin is not rigged (or the guy or girl tossing it is not altering it). Simply put, if you get heads 7 times in a row, then you can probably conclude that the coin toss is rigged (that the guy or girl tossing that coin is altering it somehow), because the chance of this happening with a fair coin is under 1%. This can make you decide (if you are tossing coins for money, for example) if you should beat that guy to a pulp for cheating or not. So it is very important to make the right decision.

In the above case, the probability of getting heads 7 times in a row with a fair coin is under 1%, which means that the P-value is under 1%. Strictly speaking, this does not mean there is a 99% chance that someone is modifying the coin toss. It means that a fair coin would produce such a streak less than 1% of the time. Because that is below an Alpha Level of 0.01, you can reject the Null Hypothesis at the 99% confidence level (more on that term further down). So, you have solid statistical grounds to beat that guy up for cheating you!
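The number behind the coin example is easy to check yourself. Here is a minimal sketch, assuming a fair coin and independent tosses; the helper name streak_p_value is made up for illustration.

```python
def streak_p_value(heads_in_a_row: int) -> float:
    """Probability that a fair coin lands heads this many times in a row."""
    return 0.5 ** heads_in_a_row

p_value = streak_p_value(7)
print(f"P-value for 7 heads in a row: {p_value:.4f}")  # 0.0078, under 1%

# Shortest streak of heads whose P-value drops below each common threshold.
for alpha in (0.05, 0.01):
    shortest = next(n for n in range(1, 100) if streak_p_value(n) < alpha)
    print(f"alpha = {alpha}: {shortest} heads in a row are enough")
```

So 5 heads in a row are already suspicious at the 0.05 threshold, while the stricter 0.01 threshold needs the full 7.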

The threshold (Alpha) values usually used in data science, machine learning, and statistics are 0.05 or 0.01, depending on the problem at hand. That means that if the P-value for an independent variable is under the chosen threshold, that independent variable is considered statistically significant in determining the dependent variable (the variable that needs to be predicted). If the P-value for that independent variable is bigger than the threshold, that variable is often a candidate for removal from the model.
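Here is a rough sketch of how a P-value for an independent variable could be estimated without any statistics library. It uses a permutation test on the correlation, which is just one of several possible techniques and not necessarily what your modeling tool does internally. The dataset (hours studied, shoe size, exam score) is invented purely for illustration, and so are the function names.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def permutation_p_value(xs, ys, rounds=5000, seed=0):
    """Share of random shuffles whose correlation is at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)  # shuffling breaks any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)

# Hypothetical data: two candidate independent variables and one dependent variable.
hours_studied = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
shoe_size = [42, 38, 44, 39, 41, 43, 37, 40, 45, 36]
exam_score = [52, 55, 61, 60, 68, 70, 75, 79, 83, 90]

alpha = 0.05
for name, values in (("hours_studied", hours_studied), ("shoe_size", shoe_size)):
    p_value = permutation_p_value(values, exam_score)
    decision = "keep" if p_value < alpha else "candidate for removal"
    print(f"{name}: p-value = {p_value:.4f} -> {decision}")
```

In this made-up dataset, hours studied comes out as statistically significant for predicting the exam score, while shoe size does not.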

Confidence level

The Alpha Level also determines something called the confidence level, which I briefly mentioned above. The confidence level is the complement of the Alpha Level: if α is 0.03, it means that the confidence level is 97%. Confidence levels are usually described as a percentage, where 0% means that you have no faith at all in something, and 100% means you are absolutely sure of something. However, if you get a 0% confidence level, it means you did something terribly wrong in your statistics, and 100% simply doesn’t exist as a confidence level in statistics (it can only happen with a dataset where absolutely all cases are encompassed, which is impossible for most practical datasets).

To calculate the confidence level, you need to use the formula 1 - α. As stated above, the α here represents the significance threshold (the Alpha Level).

In practice, a confidence coefficient is used, which is basically the percentage as a real value. For example, if you have a confidence level of 99%, the confidence coefficient will be 0.99.
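The 1 - α formula and the confidence coefficient can be sketched in a couple of lines. The function name confidence_coefficient is just an illustrative choice.

```python
def confidence_coefficient(alpha: float) -> float:
    """Confidence level as a real value between 0 and 1, computed as 1 - alpha."""
    return 1 - alpha

# Print each Alpha Level next to its confidence coefficient and percentage.
for alpha in (0.05, 0.03, 0.01):
    coefficient = confidence_coefficient(alpha)
    print(f"alpha = {alpha}: coefficient = {coefficient:.2f}, level = {coefficient:.0%}")
```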

Conclusion

I truly hope I made your life easier with this article. I remember how much of a headache I had when trying to understand these terms the first time. They are hard to grasp when they are explained with other mathematics and statistics jargon around them, so I tried to keep this article as free of those as possible.

If you are wondering how to calculate the P-value there are plenty of resources online that you can choose from. The scope of this article is to only give you an intuitive understanding of what those statistical terms are.

Good health and happy learning!

Stefan Silver
