Hypothesis Testing and Regression
In Week Three you became familiar with your data set by completing some basic data analysis on three research questions. That assignment also gave you a first look at the preliminary results your data may support. In Week Four you will take the data one step further and do some hypothesis testing using linear or logistic regression. After completing this assignment, you will be able to apply these skills to the topic you choose for your Final Paper.
Assignment Instructions:
Use the Epi Info™ 7 Quick Start Guide, v.0.2.2, as a resource and complete the tasks below.
First, you will need to open your saved canvas file:
Launch the Epi Info™ 7 program that you saved to your computer at the beginning of the Week Three assignment and, from the Menu screen, select Visual Dashboard.
Retrieve the Epi Info™ 7 canvas file that you saved for the Week Three assignment. The dataset should already be saved within this file.
Next, you will need to recode your dependent variable:
When doing advanced statistics, you sometimes have to recode your data so that your statistical software can recognize which values make up the reference group. Epi Info™ 7 requires that, for a logistic regression, your outcome (dependent) variable be coded as 1 or 0 or as yes/no. For the purposes of this assignment, we are going to recode the dependent variable as 1 and 0.
Locate the “Defined Variables” feature on the left-hand side of the Visual Dashboard canvas, and use the 2011 YRBS Data User’s Guide (codebook) to recode your dependent variable, “marijuana use” (qn48). Recode as 1 those who answered yes, which is represented by 1 in the codebook. Recode as 0 those who answered no, which is represented by 2 in the codebook.
Then, you will use advanced statistics to obtain the logistic regression output for your outcome variable:
Right-click on the blank center canvas screen, hover over “Add Analysis Gadget,” hover over “Advanced Statistics,” and select Logistic Regression from the menu.
Using your recoded outcome variable of “marijuana use” (qn48), determine which demographic variables are predictors of marijuana use. Use the following variables as your demographic variables:
Age (q1)
Gender (q2)
Grade (q3)
Race (RACEETH)
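To make the recode and the logistic regression output more concrete, the following Python sketch mirrors both steps outside Epi Info™ 7. It is an illustration under stated assumptions, not part of the required workflow. The records are made-up values rather than YRBS data, and `recode_yes_no` and `fit_logistic_one_predictor` are hypothetical helper names. The model uses a single predictor coded 0/1 instead of the four demographic variables so that the arithmetic stays readable. The point it shows is that the exponentiated coefficient of a logistic model is an odds ratio.

```python
import math

def recode_yes_no(raw):
    """Recode a codebook response: 1 (yes) -> 1, 2 (no) -> 0, anything else -> None."""
    return {1: 1, 2: 0}.get(raw)

def fit_logistic_one_predictor(x, y, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson and return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score vector.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical records (NOT YRBS data): (predictor coded 0/1, raw qn48 response).
records = (
    [(1, 1)] * 30 + [(1, 2)] * 20 + [(0, 1)] * 10 + [(0, 2)] * 40 + [(1, None)] * 3
)

# Recode the outcome to 1/0 and drop records whose outcome is missing.
recoded = [(x, recode_yes_no(raw)) for x, raw in records]
complete = [(x, y) for x, y in recoded if y is not None]
xs = [x for x, _ in complete]
ys = [y for _, y in complete]

b0, b1 = fit_logistic_one_predictor(xs, ys)
odds_ratio = math.exp(b1)
print(f"coefficient = {b1:.4f}, odds ratio = {odds_ratio:.4f}")
```

With these made-up counts, the fitted odds ratio equals the cross-product ratio of the 2×2 table, (30 × 40) / (20 × 10) = 6. This is why an odds ratio above 1 in the output is read as higher odds of the outcome in the group coded 1.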
Once you have a logistic regression table, you can export the output to Excel by right-clicking on the table itself. From Excel, copy and paste each table output that you have created into one tab within the same Excel file. The tab should be labeled Wk4_LogisticRegression. Save the entire Excel file as Firstname_Lastname_Week 4_Assignment.
Additionally, you will analyze risk factors in conjunction with your outcome variable:
Using gender (q2), grade (q3), and race (RACEETH) as controls, determine whether cigarette smoking (qn31) is a risk factor for marijuana use.
Use the following additional variables for other risky behaviors: sexual behavior (qn60), alcohol use (qn43), and ecstasy (qn55). Determine whether they are associated with marijuana use (qn48). Use the same controls as listed above (gender, grade, and race).
To start, run three separate models, one each for sexual behavior (qn60), alcohol use (qn43), and ecstasy (qn55), while controlling for the demographic variables (gender, grade, and race).
Then, run one model with all three variables (sexual behavior, alcohol use, and ecstasy) and the demographic variables (gender, grade, and race).
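The difference between separate models and one combined model can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Epi Info™ 7 procedure. The cell counts are invented and are not YRBS results, only two risk behaviors are used, the demographic controls are left out for brevity, and `solve` and `fit_logistic` are hypothetical helpers written for this example.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson logistic fit; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]   # prepend the intercept column
    k = len(design[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(design, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    info[i][j] += w * xi[i] * xi[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta

# Hypothetical counts (NOT YRBS results):
# (alcohol, ecstasy) -> (number of students, number reporting marijuana use)
cells = {(0, 0): (40, 4), (0, 1): (10, 2), (1, 0): (10, 4), (1, 1): (40, 24)}
rows, y = [], []
for (alcohol, ecstasy), (n, users) in cells.items():
    for i in range(n):
        rows.append([alcohol, ecstasy])
        y.append(1 if i < users else 0)

separate_alcohol = fit_logistic([[r[0]] for r in rows], y)  # alcohol only
combined = fit_logistic(rows, y)                            # alcohol and ecstasy

or_alcohol_separate = math.exp(separate_alcohol[1])
or_alcohol_combined = math.exp(combined[1])
print(f"alcohol OR, separate model: {or_alcohol_separate:.2f}")
print(f"alcohol OR, combined model: {or_alcohol_combined:.2f}")
```

In this invented data the two behaviors overlap heavily, so the alcohol odds ratio falls from about 9.33 in the separate model to 6.00 once ecstasy use is in the same model. A shift of this kind is one thing to look for when you compare your own separate and combined outputs.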
Once you have your data tables, you can export the output to Excel by right-clicking on the table itself. From Excel, copy and paste each table output that you have created into one tab within the same Excel file. The tab should be labeled Wk4_RiskFactors. Save the entire Excel file as Firstname_Lastname_Week 4_Assignment.
Finally, address the following as you summarize your results:
What were your null and alternative hypotheses for each research question? The research questions were:
“What demographic variables are predictors of marijuana use?” (Step #6)
“Is cigarette smoking a risk factor for marijuana use?” (Step #8)
“Are other risky behaviors associated with marijuana use?” (Steps #10-11)
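As a hedged illustration of how hypotheses for a logistic model are commonly written, suppose the log-odds of marijuana use are modeled as below. The symbols p, β and x are notation introduced here for the example and do not come from the assignment.

```latex
\[
\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k,
\qquad
H_0:\ \beta_j = 0
\quad\text{versus}\quad
H_A:\ \beta_j \neq 0 .
\]
```

Here p is the probability of reporting marijuana use and x_j is one predictor, such as cigarette smoking. Because the odds ratio for x_j is e^{β_j}, the null hypothesis is equivalent to an odds ratio of 1, meaning no association after accounting for the other variables in the model.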
Summarize the results of your logistic model.
Based on these results, should you reject or fail to reject your null hypothesis?
From Steps #10-11, compare your three separate models to the model that included all of the variables. If any differences were found between the models, summarize the differences you saw in the results.
Is it best to use separate models or one model with all the variables included? Justify your answer.
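When answering the questions above, the decision to reject or fail to reject a null hypothesis usually rests on the coefficient, its standard error, and the resulting p-value. The Python sketch below shows that arithmetic for a single term. It assumes that a coefficient and a standard error are available for the term, it uses a two-sided Wald test with a significance level of 0.05, and the input numbers are invented for illustration. `wald_summary` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def wald_summary(coefficient, standard_error, alpha=0.05):
    """Two-sided Wald test for H0: coefficient = 0, with an odds-ratio confidence interval."""
    z = coefficient / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # Critical value for a (1 - alpha) confidence interval, about 1.96 when alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return {
        "z": z,
        "p_value": p_value,
        "odds_ratio": math.exp(coefficient),
        "ci_lower": math.exp(coefficient - z_crit * standard_error),
        "ci_upper": math.exp(coefficient + z_crit * standard_error),
        "decision": "reject H0" if p_value < alpha else "fail to reject H0",
    }

# Invented inputs for illustration only (not taken from any real output).
strong = wald_summary(coefficient=0.9, standard_error=0.3)
weak = wald_summary(coefficient=0.2, standard_error=0.3)
for label, result in (("strong", strong), ("weak", weak)):
    print(label, result["decision"], f"p = {result['p_value']:.4f}")
```

A confidence interval for the odds ratio that excludes 1 leads to the same decision as a p-value below 0.05, so either can be cited in the summary.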
Your assignment must be two to three pages (excluding title, reference, and analysis output pages) and formatted according to APA style as outlined in the Ashford Writing Center. In addition to your summary document, upload the Firstname_Lastname_Week 4_Assignment Excel file with all your statistical data.
Carefully review the Grading Rubric for the criteria that will be used to evaluate your assignment.
The following screencast demonstration will assist you with this assignment. It is presented by Dr. James Koziol, a full-time faculty member in the Department of Health Administration: