What is Hypothesis testing?
Ans: The process of using probability and statistics to set up an experimental situation and decide whether or not to reject the “status quo” hypothesis based on sample data is called hypothesis testing. Before getting into the details of how hypothesis testing works, let us get familiar with some of its terminology.
Example: Suppose we want to determine whether a coin is fair (unbiased). The null hypothesis states that half the flips result in Heads and half in Tails. Mathematically, this can be written as \(H_{0}: P(\text{Head}) = 0.5\).
Now suppose the coin is tossed 10 times and 7 Heads and 3 Tails are observed. The alternative hypothesis states that the coin is biased, which would explain why we didn’t observe an equal number of Heads and Tails in our experiment.
Now let’s go back and revisit our definition of \(H_{0}\): "It assumes that the observation is due to a chance factor". A chance factor is an influence that contributes randomly to each observation and is unpredictable.
In simple terms, the null hypothesis argues that the prior belief (P(Head) = 0.5 in this case) is true and that the observations from the experiment are due to randomness and hence can be ignored.
The definition of \(H_1\) says: "Contrary to the null hypothesis, the alternative hypothesis shows that observations are the result of a real effect".
Therefore, the alternative hypothesis argues that the observations of the experiment are skewed because the coin itself is biased, not because of some chance factor.
Now that we are done with the null hypothesis and the alternative hypothesis, let’s move on to the next terms.
The two types of hypothesis tests, based on the alternative hypothesis \(H_1\), are:
Note: Although the plots show a normal distribution, hypothesis tests are not limited to normally distributed data.
When a hypothesis test is performed, we either reject the null hypothesis or fail to reject it. The possible errors that may occur are
Note: We would like the probability of committing either of these errors to be as small as possible. Unfortunately, for a fixed sample size, decreasing the probability of committing one type of error increases the probability of committing the other. So our main focus will be the Type I error, whose probability is denoted \(\alpha\).
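The trade-off between the two error types can be illustrated numerically. The following is a minimal sketch, assuming a one-sided (upper-tail) z-test with a known standard deviation; the helper `type_two_error` and the values `mu0`, `mu1`, `sigma` and `n` are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_two_error(alpha, mu0=0.0, mu1=0.5, sigma=1.0, n=25):
    """Probability of a Type II error for an upper-tail z-test of H0: mu = mu0.

    mu1 is the assumed true mean; all default values are illustrative.
    """
    std_error = sigma / sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # Type II error: the sample mean stays below the cutoff although the true mean is mu1.
    return NormalDist(mu1, std_error).cdf(cutoff)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> beta = {type_two_error(alpha):.3f}")
```

With the sample size held fixed, each smaller value of alpha yields a larger probability of a Type II error in this setup.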
The significance level determines how far out from the null hypothesis value we draw that line on the graph. For a significance level of 0.05, we shade the 5% of the distribution that is furthest from the null hypothesis value.
The shaded region is also called the critical region; if the sample mean falls into that region, we reject the null hypothesis \(H_0\).
It means that if \(H_0\) is actually true and the hypothesis test is repeated on different random samples of data from the same population, then we would expect \(H_0\) to be incorrectly rejected 5% of the time.
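This long-run interpretation can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original analysis: it draws repeated samples from a standard normal population (so \(H_0\): mean = 0 is true), runs a two-sided z-test with known standard deviation 1 on each, and counts how often \(H_0\) is rejected. The function name, sample size, number of trials and seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_type_one_rate(alpha=0.05, n=30, trials=10_000, seed=0):
    """Fraction of trials in which a true H0 (mean = 0) is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z-statistic for the sample mean with mu = 0 and sigma = 1.
        z = fmean(sample) / (1.0 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


print(f"Observed rejection rate: {simulate_type_one_rate():.3f}")
```

The observed rate should land close to 0.05, varying slightly with the seed and the number of trials.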
The p-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the null hypothesis is true.
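To make the definition concrete, here is a small sketch that applies it to the coin example from earlier (7 Heads in 10 tosses, with \(H_0\): P(Head) = 0.5). It assumes "at least as extreme" means "at least as far from the expected number of Heads", which is a common two-sided convention for a fair-coin null; the helper name `coin_p_value` is hypothetical.

```python
from math import comb


def coin_p_value(heads, tosses, p_null=0.5):
    """Exact two-sided p-value for the number of heads under H0: P(Head) = p_null."""
    expected = tosses * p_null
    observed_distance = abs(heads - expected)
    p_value = 0.0
    for k in range(tosses + 1):
        # Sum the binomial probability of every outcome at least as far
        # from the expected count as the observed one.
        if abs(k - expected) >= observed_distance:
            p_value += comb(tosses, k) * p_null**k * (1 - p_null) ** (tosses - k)
    return p_value


print(coin_p_value(heads=7, tosses=10))  # 0.34375
```

A result of 7 or more Heads, or 3 or fewer, occurs about 34% of the time with a fair coin, so 10 tosses alone give little evidence against \(H_0\).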
<font color='orange' size=5>The Misunderstood p Value </font> The p value is one of the most misunderstood quantities in psychological research. Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks! The most common misinterpretation is that the p value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the p value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect. The p value is really the probability of a result at least as extreme as the sample result if the null hypothesis were true. So a p value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time. You can avoid this misunderstanding by remembering that the p value is not the probability that any particular hypothesis is true or false. Instead, it is the probability of obtaining the sample result if the null hypothesis were true. <font size=2>Credit: https://opentextbc.ca/researchmethods/chapter/understanding-null-hypothesis-testing/</font>
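A z-score (also known as a standard score) indicates how many standard deviations a value is from the mean. For a sample mean \(\overline{x}\) computed from \(n\) observations, where \(\mu\) is the population mean under the null hypothesis and \(\sigma\) is the population standard deviation, the z-score can be calculated from the following formula. $$z =\frac{(\overline{x}-\mu)}{\frac{\sigma}{\sqrt{n}}}$$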
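As a quick worked example of the formula, the sketch below computes a z-score and converts it to a two-sided p-value using the standard normal distribution. The numbers (sample mean 52, population mean 50, standard deviation 10, sample size 100) are made up for illustration, and `z_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu, sigma, n):
    """z-score of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / sqrt(n))


z = z_statistic(sample_mean=52.0, mu=50.0, sigma=10.0, n=100)
# Two-sided p-value: probability of a z-score at least this far from 0 under H0.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-sided p-value = {p_two_sided:.4f}")
# z = 2.00, two-sided p-value = 0.0455
```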
Here is how to interpret z-scores.
Enough theory; now let’s jump into implementing a hypothesis test with an example.
<font size=2>credit: https://xkcd.com/882/</font>
import random

import numpy as np


def calculate_p_value(sample1, sample2, alpha):
    # Step 1 - observed difference between the two sample means
    difference_between_sample_means = np.mean(sample1) - np.mean(sample2)
    # Step 2 - permutation test
    # Step 2.1 - merge the samples into one pooled array
    difference = []
    total_sample = list(sample1)
    total_sample.extend(sample2)
    total_sample = np.array(total_sample)
    # Step 2.2 - resample the pooled data 1000 times
    for _ in range(1000):
        # Step 2.3 - pick 100 distinct random positions (the pool must hold at least 100 values)
        samples = random.sample(range(len(total_sample)), 100)
        # Step 2.4 - the first 50 positions form set 1
        set1 = total_sample[samples[:50]].mean()
        # Step 2.5 - the next 50 positions form set 2
        set2 = total_sample[samples[50:]].mean()
        # Step 2.6 - record the difference between the two set means
        difference.append(set1 - set2)
    # Step 3 - sort, then count the resampled differences that are positive
    # and greater than the observed difference (a one-sided comparison)
    difference.sort()
    count = sum((d > difference_between_sample_means) and (d > 0) for d in difference)
    pValue = count / len(difference)
    print("% of values > than the difference", difference_between_sample_means, " =", pValue * 100, "%")
    print("The pValue=", pValue, "and P(Reject H0 when H0 is true)=", alpha)
    if pValue > alpha:
        print("We fail to reject the null hypothesis")
    else:
        print("We can reject the null hypothesis")
    print('_' * 50)
    # The resampled differences are returned so they can be plotted
    return difference
========================
For Sample Size: 200
The average spendings 100 male = 9881.62
The average spendings 100 female= 7703.87
The difference between mean of male and female spendings = 2177.750
Percentage of values greater than the difference 2177.750 = 0.8 %
The pValue = 0.008 and the P(Reject H0 when H0 is true)= 0.15
We can reject the null hypothesis
In the above plot, for a sample size of 100 we get pValue = 0.008, for a sample size of 500 we get pValue = 0.161, and for a sample size of 1000 we get pValue = 0.289.
If we reject the null hypothesis, we do not prove the alternative hypothesis is true. We merely state there is sufficient evidence to reject the null hypothesis. If we fail to reject the null hypothesis, we do not prove the null hypothesis is true. We merely state there is not sufficient evidence to reject the null hypothesis. Unfortunately, whatever the decision, there is always a chance we made an error!