Which hypothesis test should I use?
Don't get me wrong: I love to analyze data and see what it means, especially when it comes to statistics. And in statistics, as in sports, if you don't use it, you lose it. If you haven't done an analysis in months, it's not unreasonable to imagine you might need a little help. In that case, you might seek out the Assistant, specifically the Assistant menu in Minitab Statistical Software. The Assistant is always ready to guide you through a difficult statistical task if you're not quite sure what to do.
For example, suppose you want to compare two different materials for making backpacks, Cloth A and Cloth B, to determine which would make a more durable product.
You sample materials from both suppliers and measure the mean amount of force needed to tear them. If you're already up on your statistics, you know right away that you want to use a 2-sample t-test, which analyzes the difference between the means of your samples to determine whether that difference is statistically significant.
You'll also know that the hypotheses of this two-tailed test would be H0: μ1 = μ2 versus HA: μ1 ≠ μ2, and that if the test's p-value is less than your chosen significance level, you should reject the null hypothesis. These formulas and tables can easily be found in statistics textbooks. To estimate the p-value, the value of the test statistic needs to be combined with the degrees of freedom.
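If you'd rather run the numbers yourself than use the Assistant, the sketch below shows what this comparison could look like in Python with SciPy. The tear-strength values, variable names, and significance level are all hypothetical, chosen only to illustrate the two-sided 2-sample t-test.

```python
# Minimal sketch of a two-sided 2-sample t-test; the data are invented.
import numpy as np
from scipy import stats

cloth_a = np.array([74.2, 78.1, 75.5, 80.3, 77.0, 76.4, 79.2, 74.8])  # hypothetical tear force
cloth_b = np.array([70.1, 72.5, 69.8, 73.4, 71.2, 68.9, 72.0, 70.6])

# H0: mu_A = mu_B   vs   HA: mu_A != mu_B
t_stat, p_value = stats.ttest_ind(cloth_a, cloth_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

alpha = 0.05  # chosen significance level
if p_value < alpha:
    print("Reject H0: the mean tear strengths appear to differ.")
else:
    print("Fail to reject H0: no significant difference detected.")
```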
There are several aspects the researcher needs to be aware of when selecting the statistical test to be used. The first relates to the type of data at hand, which may be "independent" or "paired". Data are considered independent when the values observed for one individual do not depend on the values observed for other respondents in the sample. For example, in the RCT by Bagatin et al., the authors assume that each individual shows a certain response to photoaging and that, in theory, this response is not influenced by the responses of other participants in the study.
The same applies to cross-sectional and cohort studies, as well as to unmatched case-control investigations; this is the case of the cross-sectional study by Duquia et al. However, when the results or the selection of an individual are related to the results or the selection of one or more other participants in the same study, the data are considered "paired", as in the case of matched case-control studies.
The next step in selecting the statistical test is to consider the type of variable with which the outcome and the exposure were measured. At this stage, several requirements for the choice of the statistical test have to be weighed. The main criterion used to determine the type of test for analyzing numeric outcomes is the symmetry of the variable's distribution.
When the outcome is symmetric, "parametric" tests are the most appropriate ones. Figure 2 shows parametric and nonparametric test options for numeric outcomes, both for the analysis of independent and of paired data.
Both the t-test and the ANOVA have an additional prerequisite: the variance (or standard deviation) of the outcome should be homogeneous between the groups being compared. If the variances are not homogeneous, the use of a nonparametric test is recommended even if the outcome is symmetric, as the p-value resulting from the t-test or the ANOVA may be biased.
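As a rough illustration of this decision, the sketch below (with invented group data) first checks homogeneity of variances with Levene's test and then falls back to a nonparametric Mann-Whitney U test when the variances look heterogeneous; the 0.05 cutoff is only a conventional choice, not a rule from the text.

```python
# Hypothetical sketch: check variance homogeneity before picking the test.
from scipy import stats

exposed = [12.1, 13.4, 11.8, 14.2, 12.9, 13.7, 12.4]
unexposed = [10.2, 11.5, 10.8, 11.1, 10.6, 11.9, 10.4]

levene_stat, levene_p = stats.levene(exposed, unexposed)

if levene_p >= 0.05:
    # Variances look homogeneous: use the (parametric) t-test
    stat, p = stats.ttest_ind(exposed, unexposed)
    chosen = "t-test"
else:
    # Variances look heterogeneous: use a nonparametric alternative
    stat, p = stats.mannwhitneyu(exposed, unexposed, alternative="two-sided")
    chosen = "Mann-Whitney U test"

print(f"Levene p = {levene_p:.3f}; {chosen}: statistic = {stat:.3f}, p = {p:.4f}")
```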
However, the t-test is considered "robust": if the sample is sufficiently large, the subjects are evenly distributed between the exposed and unexposed groups, and the outcome is symmetric, the p-value will be reliable even if the variance is heterogeneous between groups. In both cases the HA tested is that the mean outcome differs between the categories of the exposure variable. In the case of polytomous exposures, the HA of the ANOVA and Kruskal-Wallis tests is that there is a significant difference between the mean outcomes of at least two exposure categories (a heterogeneity test).
For example, in the RCT by Bagatin et al., such a test could show whether the interventions differ; however, the resulting p-value would not indicate which of the interventions is the best one. Post-hoc tests would then compare the mean type I collagen in all possible pairwise combinations (ISO vs. each of the other interventions, and so on). When the independent variable is ordinal and the HA postulates the existence of a trend regarding the outcome (an increasing or decreasing mean outcome across the categories of the exposure variable), the researcher might decide to use a test of linear trend instead of a heterogeneity test.
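To make the heterogeneity-then-post-hoc sequence concrete, here is a hedged sketch using one-way ANOVA followed by Tukey's HSD pairwise comparisons (a common post-hoc choice, not necessarily the one used in the cited RCT). The three groups and their values are entirely made up.

```python
# Hypothetical one-way ANOVA (heterogeneity test) plus Tukey post-hoc comparisons.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

group_a = [5.1, 5.6, 5.3, 5.9, 5.4]
group_b = [6.2, 6.8, 6.5, 6.1, 6.7]
group_c = [5.2, 5.5, 5.0, 5.8, 5.3]

# Heterogeneity test: is at least one group mean different?
f_stat, p = stats.f_oneway(group_a, group_b, group_c)
print(f"ANOVA: F = {f_stat:.2f}, p = {p:.4f}")

# Post-hoc test: which specific pairs of groups differ?
values = np.concatenate([group_a, group_b, group_c])
labels = ["A"] * 5 + ["B"] * 5 + ["C"] * 5
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```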
In these cases (Pearson's correlation and linear regression), the exposure variable must also be symmetrical. Spearman's correlation should be used when the exposure and outcome variables are not symmetrical, or when they are symmetrical but the relationship between them is not linear. Figure 3 clarifies what is estimated by Pearson's correlation and linear regression. Each point in this graph is called an "observed" value for that subject.
Based on the set of observed values, it is possible to estimate a "prediction line": the expected outcome values for each value of the exposure variable.
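A small sketch of these ideas, on invented exposure/outcome pairs, is shown below: it computes Pearson's and Spearman's correlations, fits the prediction line, and derives the residuals discussed next (observed minus predicted values).

```python
# Illustrative sketch: correlation, a prediction line, and residuals (data invented).
import numpy as np
from scipy import stats

exposure = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
outcome = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.7])

pearson_r, _ = stats.pearsonr(exposure, outcome)    # linear association
spearman_r, _ = stats.spearmanr(exposure, outcome)  # rank-based alternative

# "Prediction line": expected outcome for each exposure value
fit = stats.linregress(exposure, outcome)
predicted = fit.intercept + fit.slope * exposure
residuals = outcome - predicted  # observed minus predicted

print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_r:.3f}")
print(f"Prediction line: outcome = {fit.intercept:.2f} + {fit.slope:.2f} * exposure")
print("Residuals:", np.round(residuals, 2))
```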
The difference between the "observed" value and the "predicted" value is called the "residual". Consider now a paired example in which each patient's total cholesterol is measured at baseline and again after 6 weeks on a new medication. Because the differences are computed by subtracting the cholesterol measured at 6 weeks from the baseline value, positive differences indicate reductions and negative differences indicate increases (e.g., a positive difference means cholesterol was lower at 6 weeks than at baseline). The goal here is to test whether there is a statistically significant reduction in cholesterol. Because of the way in which we computed the differences, we want to look for an increase in the mean difference (i.e., a positive mean difference).
In order to conduct the test, we need to summarize the differences; in this sample we compute their mean and standard deviation. Is there statistical evidence of a reduction in mean total cholesterol in patients after using the new medication for 6 weeks? We reject H0 because the computed test statistic exceeds the critical value from the t distribution. Here we illustrate the use of a matched design to test the efficacy of a new drug to lower total cholesterol.
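The cholesterol data themselves are not reproduced here, so the sketch below uses made-up baseline and 6-week values simply to show how the upper-tailed paired t-test on the differences would be run in Python.

```python
# Hypothetical paired (matched) analysis of cholesterol reduction.
import numpy as np
from scipy import stats

baseline = np.array([215, 190, 230, 220, 205, 198, 240, 212])  # invented values
week6    = np.array([200, 185, 210, 215, 195, 190, 225, 205])

differences = baseline - week6  # positive differences indicate reductions

# H0: mean difference = 0   vs   HA: mean difference > 0 (upper-tailed)
t_stat, p_value = stats.ttest_rel(baseline, week6, alternative="greater")
print(f"mean difference = {differences.mean():.1f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```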
We also considered a parallel design randomized clinical trial and a study using a historical comparator. It is extremely important to design studies that are best suited to detect a meaningful difference when one exists. There are often several alternatives and investigators work with biostatisticians to determine the best design for each application. It is worth noting that the matched design used here can be problematic in that observed differences may only reflect a "placebo" effect.
All participants took the assigned medication, but is the observed reduction attributable to the medication or a result of their participation in the study?
Here we consider the situation where there are two independent comparison groups and the outcome of interest is dichotomous (e.g., success/failure). The goal of the analysis is to compare the proportions of successes between the two groups. The relevant sample data are the sample sizes in each comparison group (n1 and n2) and the sample proportions, which are computed by taking the ratios of the numbers of successes to the sample sizes in each group, i.e., p1 = x1/n1 and p2 = x2/n2. There are several approaches that can be used to test hypotheses concerning two independent proportions.
Here we present one approach; the chi-square test of independence is an alternative, equivalent, and perhaps more popular approach to the same analysis. In tests of hypothesis comparing proportions between two independent groups, one test is performed, and the results can be interpreted in terms of a risk difference, a relative risk, or an odds ratio.
As a reminder, the risk difference is computed by taking the difference in proportions between comparison groups, the risk ratio is computed by taking the ratio of proportions, and the odds ratio is computed by taking the ratio of the odds of success in the comparison groups.
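As a quick worked illustration of these three measures, the snippet below computes them from hypothetical counts (30 successes out of 100 in the treated group, 20 out of 100 in the control group).

```python
# Hedged sketch: risk difference, risk ratio, and odds ratio from invented counts.
x1, n1 = 30, 100  # successes and sample size in group 1 (exposed/treated)
x2, n2 = 20, 100  # successes and sample size in group 2 (unexposed/control)

p1, p2 = x1 / n1, x2 / n2

risk_difference = p1 - p2                       # null value 0
risk_ratio = p1 / p2                            # null value 1
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))  # null value 1

print(f"RD = {risk_difference:.2f}, RR = {risk_ratio:.2f}, OR = {odds_ratio:.2f}")
```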
Because the null values for the risk difference, the risk ratio, and the odds ratio are different, the hypotheses in tests of hypothesis look slightly different depending on which measure is used. When performing tests of hypothesis for the risk difference, relative risk, or odds ratio, the convention is to label the exposed or treated group as group 1 and the unexposed or control group as group 2.
For example, suppose a study is designed to assess whether there is a significant difference in proportions between two independent comparison groups. The hypotheses can be stated in terms of the risk difference (H0: RD = 0), the risk ratio (H0: RR = 1), or the odds ratio (H0: OR = 1), with the alternative in each case being that the measure is greater than, less than, or different from its null value.
First, the hypotheses above are equivalent to H0: p1 = p2 against HA: p1 > p2, p1 < p2, or p1 ≠ p2; the risk difference is analogous to the difference in means when the outcome is continuous, and the three different alternatives represent upper-, lower-, and two-tailed tests, respectively. The test statistic is z = (p1 − p2) / sqrt( p(1 − p)(1/n1 + 1/n2) ), where p1 is the proportion of successes in sample 1, p2 is the proportion of successes in sample 2, and p is the proportion of successes in the pooled sample. If there are fewer than 5 successes or failures in either comparison group, then alternative procedures, called exact methods, must be used to estimate the difference in population proportions.
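The following sketch implements that pooled two-proportion z statistic directly, again on invented counts, and converts it to a two-sided p-value using the standard normal distribution.

```python
# Pooled two-proportion z-test (counts are hypothetical).
import math
from scipy import stats

x1, n1 = 30, 100  # successes and sample size, group 1
x2, n2 = 20, 100  # successes and sample size, group 2

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)

# z = (p1 - p2) / sqrt( p(1 - p)(1/n1 + 1/n2) )
z = (p1 - p2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # two-sided

print(f"z = {z:.3f}, p = {p_value:.4f}")
```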
The outcome of interest is prevalent CVD, and we want to test whether the prevalence of CVD is significantly higher in smokers as compared to non-smokers. Here smoking status defines the comparison groups: we will call the current smokers group 1 (exposed) and the non-smokers group 2 (unexposed). The test of hypothesis is conducted below using the five-step approach.
Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group. In this example, we have more than enough successes (cases of prevalent CVD) and failures (persons free of CVD) in each comparison group.
We first compute the overall proportion of successes by pooling the two groups: p = (x1 + x2)/(n1 + n2). Smoking has been shown over and over to be a risk factor for cardiovascular disease.
What might explain the fact that we did not observe a statistically significant difference using data from the Framingham Heart Study? As a further example, a randomized trial is designed to evaluate the effectiveness of a newly developed pain reliever intended to reduce pain in patients following joint replacement surgery.
The trial compares the new pain reliever to the pain reliever currently in use (called the standard of care). Patients undergoing joint replacement surgery agreed to participate in the trial and were randomly assigned to receive either the new pain reliever or the standard pain reliever following surgery; they were blind to the treatment assignment. Before receiving the assigned treatment, patients were asked to rate their pain on a numeric scale, with higher scores indicative of more pain.
Each patient was then given the assigned treatment and, after 30 minutes, was again asked to rate their pain on the same scale. The primary outcome was a reduction in pain of 3 or more scale points, defined by clinicians as a clinically meaningful reduction. The trial data were summarized as the number and proportion of patients with such a reduction in each treatment group. We now test whether there is a statistically significant difference between the groups in the proportions of patients reporting a meaningful reduction (i.e., a reduction of 3 or more scale points).
Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group, i. In this example, we have min 50 0. The sample size is adequate so the following formula can be used. Again, the procedures discussed here apply to applications where there are two independent comparison groups and a dichotomous outcome.
There are other applications in which it is of interest to compare a dichotomous outcome in matched or paired samples. For example, in a clinical trial we might wish to test the effectiveness of a new antibiotic eye drop for the treatment of bacterial conjunctivitis. Participants use the new antibiotic eye drop in one eye and a comparator treatment (placebo or active control) in the other.
Because the two assessments (success or failure) are paired within each participant, we cannot use the procedures discussed here. The appropriate test is called McNemar's test (sometimes called McNemar's test for dependent proportions). Here we presented hypothesis testing techniques for means and proportions in one- and two-sample situations. Tests of hypothesis involve several steps, including specifying the null and alternative (or research) hypothesis, selecting and computing an appropriate test statistic, setting up a decision rule, and drawing a conclusion.
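For completeness, here is a hedged sketch of McNemar's test using statsmodels; the 2x2 table of paired eye-level outcomes is invented, and the exact binomial version is used because it is conservative for small counts.

```python
# Hypothetical McNemar's test for paired dichotomous outcomes (e.g., two eyes per patient).
from statsmodels.stats.contingency_tables import mcnemar

# Rows: treated eye (success, failure); columns: comparator eye (success, failure)
table = [[40, 15],
         [ 5, 20]]

result = mcnemar(table, exact=True)  # exact binomial test on the discordant pairs
print(f"statistic = {result.statistic}, p = {result.pvalue:.4f}")
```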
There are many details to consider in hypothesis testing. The first is to determine the appropriate test. We discussed Z and t tests here for different applications. The appropriate test depends on the distribution of the outcome variable (continuous or dichotomous), the number of comparison groups (one or two), and whether the comparison groups are independent or dependent.
The following table summarizes the different tests of hypothesis discussed here. Once the type of test is determined, the details of the test must be specified. Specifically, the null and alternative hypotheses must be clearly stated. The null hypothesis always reflects the "no change" or "no difference" situation. The alternative or research hypothesis reflects the investigator's belief. The investigator might hypothesize that a parameter (e.g., a mean or a proportion) is larger than, smaller than, or different from its null value. Once the hypotheses are specified, data are collected and summarized.
The appropriate test is then conducted according to the five step approach. If the test leads to rejection of the null hypothesis, an approximate p-value is computed to summarize the significance of the findings. When tests of hypothesis are conducted using statistical computing packages, exact p-values are computed. Because the statistical tables in this textbook are limited, we can only approximate p-values.
If the test fails to reject the null hypothesis, then a weaker concluding statement is made for the following reason. In hypothesis testing, there are two types of errors that can be committed. A Type I error occurs when a test incorrectly rejects the null hypothesis.
A Type II error occurs when a test fails to reject the null hypothesis when in fact it is false. We noted in several examples in this chapter the relationship between confidence intervals and tests of hypothesis.
The approaches are different, yet related. It is possible to draw a conclusion about statistical significance by examining a confidence interval. It is important to note that the correspondence between a confidence interval and a test of hypothesis relates to a two-sided test, and that the confidence level corresponds to a specific level of significance (e.g., a 95% confidence interval corresponds to a 5% significance level).
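The correspondence can be seen in a small sketch (with invented one-sample data): a two-sided one-sample t-test at the 5% level rejects the null value exactly when that value falls outside the 95% confidence interval for the mean.

```python
# Illustrative sketch of the CI / two-sided test correspondence (data invented).
import numpy as np
from scipy import stats

data = np.array([5.2, 4.8, 6.1, 5.5, 5.9, 4.7, 5.3, 6.0])
null_value = 5.0

t_stat, p_value = stats.ttest_1samp(data, popmean=null_value)

# 95% confidence interval for the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(data) - 1,
                                   loc=data.mean(), scale=stats.sem(data))

print(f"p = {p_value:.4f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print("Reject H0 at the 5% level" if p_value < 0.05 else "Fail to reject H0 at the 5% level")
# The null value lies outside the CI exactly when p < 0.05.
```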
The exact significance of the test, the p-value, can only be determined using the hypothesis testing approach, and the p-value provides an assessment of the strength of the evidence, not an estimate of the effect. For instance, in a test comparing the proportion of children in Boston using dental services to the national proportion, we would reject the null hypothesis because the test statistic exceeds the critical value, and conclude that there is a statistically significant difference between the Boston proportion and the national proportion.
What is the difference between discrete and continuous variables? Discrete and continuous variables are two types of quantitative variables: discrete variables represent counts (e.g., the number of objects in a collection), while continuous variables represent measurable amounts (e.g., water volume or weight).
Different research questions call for different statistical tests. For example:
What is the effect of income on longevity?
What is the effect of income and minutes of exercise per day on longevity?
What is the effect of drug dosage on the survival of a test subject?
What is the effect of two different test prep programs on the average exam scores of students from the same class?
What is the difference in average exam scores between students from two different schools?
What is the difference in average pain levels among post-surgical patients given three different painkillers?
What is the effect of flower species on petal length, petal width, and stem length?
How are latitude and temperature related?