# PUH 5302, Applied Biostatistics 1


Course Learning Outcomes for Unit V

Upon completion of this unit, students should be able to:

1. Explain the basic concepts of biostatistical analysis.
   1.1 Explain how to prepare data for hypothesis testing.
   1.2 Define null hypothesis and research hypothesis.
   1.3 Discuss the five-step process involved in hypothesis testing.

Course/Unit Learning Outcomes and Learning Activities

1.1: Unit Lesson; Chapter 7; Unit V Essay

1.2: Unit Lesson; Chapter 7; Unit V Essay

1.3: Unit Lesson; Chapter 7; Unit V Essay

Reading Assignment

Chapter 7: Hypothesis Testing Procedures

Unit Lesson

Welcome to Unit V. In the previous unit, we had an overview of sample size determination procedures and why it is important to have the appropriate sample sizes for various populations under study. We also identified common sample size calculation wizards available on the Internet.

In this lesson, we will examine how data are prepared for hypothesis testing, define the null and research hypotheses, and describe the five-step process involved in hypothesis testing.

Hypothesis Testing

Hypothesis testing is a crucial part of a research study. The process starts with the researcher making two specific statements about the population based on available information. The two statements oppose each other: one indicates an association between the variables, and the other indicates no association. The test determines whether the sample data provide enough evidence to reject the statement of no association; if they do, the researcher rejects it in favor of the opposing statement. The statement that indicates no association or no difference is referred to as the null hypothesis, represented by H0. The other statement, which reflects the researcher’s belief, is called the alternative hypothesis and is represented by Ha or H1 (Sullivan, 2018).

For example, let’s look at a research question and possible alternative and null hypotheses.

Research question: What is the relationship between leadership styles and turnover intent among hospital employees?

UNIT V STUDY GUIDE

Hypothesis Testing


Alternative hypothesis (Ha or H1): There is a statistically significant relationship between leadership styles and turnover intent among hospital employees.

Null hypothesis (H0): There is no statistically significant relationship between leadership styles and turnover intent among hospital employees.

Before any hypothesis testing is done, the researcher first prepares the data by conducting preliminary analyses, including the following.

Reliability analysis informs the researcher whether the research instrument measured consistently. That is, do the items that are meant to measure the same construct produce consistent scores? The result is summarized by Cronbach’s alpha (α), a standard measure of internal consistency. The accepted Cronbach’s alpha value for many research studies is α = .75 and above, and the closer the value is to 1, the more consistently the items measure the same construct.
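As an illustration of how the alpha value is obtained, the following is a minimal sketch that uses only the Python standard library. It applies the usual variance-based formula, α = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the course text.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns.

    `items` holds one list per questionnaire item; position i in every
    list belongs to the same respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("every item must have one score per respondent")

    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses: 3 items answered by 5 respondents.
responses = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

The computed value would then be compared with the threshold the study has adopted for acceptable reliability.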

Exploratory data analysis (EDA) is done to check for outliers. Outliers are values that lie unusually far from the rest of the scores (well below or well above the mean). Researchers usually want to resolve outliers because they can skew the results, pushing them in one direction and biasing the study. Bias is a serious concern in any research study. To guard against it and to check whether the scores are normally or approximately normally distributed, researchers also run a test of normality, which serves two purposes: it shows whether the scores are normally distributed, and, based on that result, it guides the choice of test for the hypothesis.

For example, if the study is designed to investigate the correlation between variables, the test used is either the Pearson correlation or Spearman’s rank correlation. If the outlier and normality checks indicate that the data are approximately normal, the researcher uses the Pearson correlation. If the data are not normal, the researcher uses Spearman’s rank correlation because it is based on ranks and does not assume normally distributed data.
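The sketch below illustrates two of the ideas in this passage using only the Python standard library. It flags outliers with the 1.5 × IQR rule and computes both the Pearson correlation and Spearman’s rank correlation (the Pearson correlation of the ranks). The 1.5 × IQR rule is an assumption, since the lesson does not name a specific outlier method, and all function names and data are hypothetical. The normality test itself is not implemented here; a test such as Shapiro-Wilk would typically come from a statistics package.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical survey scores with one extreme value.
scores = [10, 12, 11, 13, 12, 11, 10, 95]
print("Flagged outliers:", iqr_outliers(scores))

# Hypothetical paired scores with a monotonic but non-linear relationship.
leadership = [1, 2, 3, 4, 5]
turnover = [1, 4, 9, 16, 25]
print(f"Pearson r = {pearson_r(leadership, turnover):.3f}")
print(f"Spearman rho = {spearman_rho(leadership, turnover):.3f}")
```

Because Spearman’s coefficient depends only on the ordering of the scores, it is less sensitive to extreme values and skewed distributions than the Pearson coefficient.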

Steps in Hypothesis Testing

After the data are prepared via exploratory data analysis, the researcher is ready to run the hypothesis test. Some researchers identify seven steps, which include data collection and preparation, while others count only the steps of the test itself. This lesson follows the five-step process.

Step 1: Set up hypotheses and determine the level of significance: The null hypothesis (H0) is a statement that two variables or groups have no effect on each other (Sullivan, 2018). Typically, when conducting a study, a researcher wants to disprove the null hypothesis. To do this, the researcher first comes up with the two opposing statements based on the data at hand. For example, recall the hypotheses discussed earlier in the unit:

Null hypothesis (H0): There is no statistically significant relationship between leadership styles and turnover intent among hospital employees.

Research hypothesis or alternative hypothesis (H1): There is a statistically significant relationship between leadership styles and turnover intent among hospital employees.

The alternative hypothesis (H1) is the statement that two variables do have an effect on each other. Typically, the researcher wants to find support for this hypothesis. In our example, the investigator believes that there is a statistically significant relationship between leadership styles and turnover intentions among hospital employees. To test this belief, the investigator sets a significance level of α = 0.05 (5%). That means, all other things remaining constant, there is a 5% risk of concluding that there is a difference when in actuality there is not a difference. If a significance level of α = 0.01 (1%) is selected, there is a 1% risk of concluding that a difference exists when there is no actual difference.

Step 2: Select the appropriate test statistic: The next step involves the researcher collecting sample data and analyzing the data to determine whether the sample data support the research hypothesis by finding the value of the test statistic (for example, a mean score, proportion, t statistic, or z-score), as described in the analysis plan (NEDARC, n.d.).

Step 3: Set the decision rule: Identify the acceptance or rejection statements by setting up decision rules. An analysis plan will include rules for rejecting the null hypothesis. Statisticians describe these decision rules in two ways, either by referencing a p-value or a region of acceptance (Sullivan, 2018). The p-value measures the strength of the evidence against the null hypothesis: the smaller the p-value, the stronger the evidence. For example, if the test statistic is equal to a certain value, say Y, then the p-value is the probability of observing a test statistic at least as extreme as Y, assuming the null hypothesis is true. Therefore, if the p-value is less than the significance level (α = 0.05), the null hypothesis can be rejected in favor of the alternative hypothesis (NEDARC, n.d.).

You should be familiar with the following terms that fall under this step. The region of acceptance is a range of values for the test statistic for which the null hypothesis is not rejected. It is defined so that the chance of making a Type I error is equal to the significance level. The opposite of the region of acceptance is the region of rejection, the set of values that is not in the region of acceptance. The null hypothesis is rejected if the test statistic falls within this region; in that case, the null hypothesis is said to have been rejected at the α level of significance.

Step 4: Compute the test statistic: In this step, the researcher summarizes the sample information in the test statistic (e.g., z-value) and draws conclusions by comparing the test statistic to the decision rule. The researcher then assesses whether the observed data support H1 (NEDARC, n.d.).

Step 5: Conclusion: The researchers make a final conclusive statement about the test based on the results. For example:

1. If the p-value is less than the significance level (α), reject the null hypothesis in favor of the alternative hypothesis; the result is statistically significant.

2. If the p-value is greater than the significance level (α), fail to reject the null hypothesis; the result is not statistically significant (NEDARC, n.d.).
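The five steps can be sketched end to end with a small example. The code below is a minimal illustration, assuming a two-sided one-sample z-test with a known population standard deviation; the lesson does not prescribe this particular test, and the function name and all input numbers are hypothetical. It shows both forms of the decision rule: comparing the p-value with α, and checking whether the test statistic falls in the region of rejection.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test with known population SD (sigma)."""
    std_normal = NormalDist()  # standard normal: mean 0, SD 1

    # Step 4: compute the test statistic.
    z = (sample_mean - null_mean) / (sigma / sqrt(n))

    # p-value: probability of a statistic at least as extreme as z,
    # in either direction, assuming the null hypothesis is true.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Step 3: the region of rejection is |z| > critical value.
    critical_value = std_normal.inv_cdf(1 - alpha / 2)

    return {
        "z": z,
        "p_value": p_value,
        "critical_value": critical_value,
        # Step 5: reject H0 when the p-value is below alpha.
        "reject_null": p_value < alpha,
    }


# Hypothetical study: H0 says the population mean is 50.
result = one_sample_z_test(sample_mean=52, null_mean=50, sigma=10, n=100)
print(f"z = {result['z']:.2f}, p = {result['p_value']:.4f}")
print(f"critical value = ±{result['critical_value']:.2f}")
print("Reject H0" if result["reject_null"] else "Fail to reject H0")
```

With the same data, a stricter significance level such as α = 0.01 widens the region of acceptance, so a result that is significant at 0.05 may not be significant at 0.01.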

Note that in hypothesis testing, the researcher never proves the null hypothesis. When there is not enough evidence against the null hypothesis, the researcher fails to reject it. On the other hand, when there is strong enough evidence against the null hypothesis, the researcher rejects it (Sullivan, 2018). The conclusion must also include a statement about the alternative hypothesis. In addition, descriptive statistics should be included when presenting the results of the hypothesis test, and the researcher should report the exact p-value instead of just a range (Sullivan, 2018). For example, say that a hypothesis states that the patient admission rate differed significantly by socioeconomic class, with patients higher up in the socioeconomic class having a lower rate of patient admission (p = 0.002). The following is an example of reporting a research conclusion.

H0: There is no difference in patient admission rate between patients of a lower socioeconomic class and patients of higher socioeconomic class.

H1: There is a difference in patient admission rate between patients of a lower socioeconomic class and patients of higher socioeconomic class.

α = 0.05; 30% increase in admission rate for patients in low socioeconomic class; p-value = 0.002.

The following conclusion could be drawn.

The null hypothesis should be rejected and the alternative hypothesis accepted.

The difference in patient admission rate between patients from low socioeconomic class and those from higher socioeconomic class was statistically significant.


There was a 30% increase in admission for patients from low socioeconomic class (p = 0.002).
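The reporting pattern in this example (decision, effect, exact p-value) can be captured in a small helper. This is only a sketch: the function name `report_conclusion` and the wording of the sentence are hypothetical, and the convention of writing very small p-values as "p < 0.001" is an assumption that is not stated in the lesson. The inputs are the figures given in the example (α = 0.05, p-value = 0.002).

```python
def report_conclusion(p_value, alpha, effect):
    """Build a one-sentence conclusion with the decision and the p-value."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")

    if p_value < alpha:
        decision = "Reject the null hypothesis"
    else:
        decision = "Fail to reject the null hypothesis"

    # Assumed convention: very small p-values are written as "p < 0.001";
    # otherwise the exact value is reported to three decimals.
    p_text = "p < 0.001" if p_value < 0.001 else f"p = {p_value:.3f}"
    return f"{decision} at α = {alpha}: {effect} ({p_text})."


print(report_conclusion(
    p_value=0.002,
    alpha=0.05,
    effect="30% increase in admission rate for patients in low socioeconomic class",
))
```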

In summary, hypothesis testing is a crucial aspect of scientific research, especially for quantitative research. There are normally five main steps in hypothesis testing. Researchers first run an exploratory data analysis for outliers and normality tests before pursuing the actual hypothesis testing. The various steps include setting up the hypotheses and level of significance at which the test is to be conducted, selecting the appropriate test statistics, setting the decision rule, computing the test statistics, and drawing conclusions.

References

NEDARC. (n.d.). Hypothesis testing. http://www.nedarc.org/statisticalHelp/advancedStatisticalTopics/hypothesisTesting.html#PV

Sullivan, L. M. (2018). Essentials of biostatistics in public health (3rd ed.). Burlington, MA: Jones & Bartlett Learning.