
Lesson 6: Database research: What can we learn from the smoking behavior data?


Students will use the database to investigate specific hypotheses related to their team’s overarching hypothesis. They will interpret information and artifacts gathered throughout the unit to complete their team research investigation. Students are given the opportunity to reflect on what they know and seek more information to strengthen their knowledge and final product.

Class Time: 100 minutes (2 days)

Learning Objectives and Evidence

Learning Objective: Students will demonstrate that they can propose and test specific hypotheses related to their overarching hypothesis.
Evidence: Students develop specific hypotheses that are based on preliminary evidence from smoker profiles and other resources, justify why they proposed the hypotheses they did, and perform an odds ratio calculation.

Learning Objective: Students will evaluate whether an association is statistically significant.
Evidence: Students demonstrate their ability to interpret the meaning of the odds ratio and 95% confidence interval for specific queries they make.

Learning Objective: Students will apply the criteria for causality to determine whether an exposure might have caused the outcome.
Evidence: Students discuss whether an observed association between an exposure and the outcome is consistent with one or more of the criteria for causality.

Learning Objective: Students will evaluate whether their evidence supports or refutes each specific hypothesis.
Evidence: Students evaluate evidence from different sources and discuss whether this evidence supports or refutes each specific hypothesis.

Learning Objective: Students will evaluate whether their results support or refute their overarching hypothesis.
Evidence: Students will use logical reasoning to relate their specific hypotheses to their overarching hypothesis.



Day 1: Question and exposure selection and hypothesis testing

Section A. Comparison of Hypothesis Testing and Hypothesis Generation

1. Review Figure 6.1- Comparison of Hypothesis Testing and Hypothesis Generation to go over the difference between the two approaches.

Video- Epidemiologist Discussion on the Multiple Comparison Problem by Noel Weiss, UW Professor of


Video- Classroom Discussion on the Multiple Comparison Problem


How to describe the difference between Hypothesis Testing and Hypothesis Generation through analogy- Super Bowl analogy

Imagine that a group of friends all have to miss the Super Bowl game because they’re going to be out hiking in the mountains. They decide that they’ll watch the late-night replay and pretend that the game hasn’t happened yet. They all promise not to listen to any reports of the game or talk to other people about who won. Before watching the game, they set up a friendly wager about who will win and what the point spread will be.

This is analogous to hypothesis testing, where the researcher forms a hypothesis before accessing the data.

Now, imagine that one of the friends wants to hedge his bets, so he proposes that he’ll enter 20 different wagers, with different combinations of who wins and what the point spread will be. His buddies say, “You must be kidding. You’ll increase your odds of winning 20X if you do that!”

This is like doing lots of queries during hypothesis testing.

Another person just can’t help herself, and she checks the final score on her cell phone before the game begins. And lo and behold, she selects the winning team and point spread.

This is analogous to forming a hypothesis after looking at the data, as we do in hypothesis generation.

Finally, after watching the game and knowing the outcome, the friends go back and look at it play by play, and analyze which plays contributed to each team doing well or poorly. They propose strategies for each coach and team to improve their playing in the next season.

This is kind of like hypothesis generation.
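The concern behind the 20-wager example can be put in numbers. The sketch below is an illustration only and is not part of the lesson materials. It assumes that each query is an independent test with a 5% chance of looking significant purely by chance (the threshold that matches a 95% confidence interval), and the function name is hypothetical.

```python
def chance_of_false_positive(num_queries: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_queries` independent tests
    appears significant by chance alone when no real association exists.

    Assumption: the tests are independent and each has false-positive
    rate `alpha` (0.05 here, matching a 95% confidence interval).
    """
    if num_queries < 0:
        raise ValueError("num_queries must be non-negative")
    # P(at least one false positive) = 1 - P(no false positives in any test)
    return 1.0 - (1.0 - alpha) ** num_queries


if __name__ == "__main__":
    # Compare one query, a team's four planned queries, and 20 queries.
    for n in (1, 4, 20):
        print(f"{n:2d} queries: {chance_of_false_positive(n):.1%}")
```

Under these assumptions, a team that tests its four planned questions has a much smaller chance of a spurious "hit" than someone who runs 20 queries, where the chance of at least one false positive is about 64%. This is one way to explain why students commit to their questions and exposure definitions before opening the database.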


Section B. Hypothesis Testing

1. Follow or hand out Lesson 6- Directions for Research Project Pages to complete Blank_Research Project Pages Lesson 6

2. Ask students to work with their teams to fill in the first sections of RPP-7-10 with their research topic, their overarching hypothesis, the four questions from the smoking behavior questionnaire that they plan to analyze, and the specific hypothesis for each question. Additionally, students should fill in the responses they will use to define exposed and not-exposed for each question before working with the smoking behavior database.

Video- Students argumentation over how to define exposed


3. Once the first sections of RPP-7-10 have been completed, ask each pair of students in each team to share a computer and analyze half of the group’s questions.

4. Have each student pair navigate to:

5. Ask students to click on “Hypothesis Testing”. Note that putting the mouse cursor over the blue subtitle icon gives helpful hints.

6. When students are ready to begin odds ratio calculations, you may want to lead them through the process of calculating and interpreting an odds ratio for one or more questions, as you did for Question 19 in Lesson 5. Students may need additional help defining exposed versus not-exposed, stating what the odds ratio and 95% CI mean, and applying the criteria for causality.

7. Have students complete the rest of RPP-7-10 using the Hypothesis Testing step in the database. You may want to give them the option of entering their data interpretation into Tasks a, b, and c of the database and then saving and printing their results (instead of writing their responses in the bottom half of RPP-7-10).
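For teachers who want to check a result by hand, the odds ratio and 95% confidence interval from Step 6 can be sketched as follows. This is a minimal illustration and not the database's own code: it assumes a standard 2x2 table and the common log-based (Woolf) approximation for the interval, which may differ from the method the database uses. The function names and the counts in the example are hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_smokers: int, exposed_nonsmokers: int,
                       unexposed_smokers: int, unexposed_nonsmokers: int,
                       z: float = 1.96) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Assumption: the interval uses the log-based (Woolf) approximation,
    with z = 1.96 for a 95% confidence interval.
    """
    a, b = exposed_smokers, exposed_nonsmokers
    c, d = unexposed_smokers, unexposed_nonsmokers
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(odds ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def interpret(lower: float, upper: float) -> str:
    """Classify the association by whether the interval includes 1."""
    if lower > 1:
        return "positive association, statistically significant"
    if upper < 1:
        return "negative association, statistically significant"
    return "not statistically significant"


if __name__ == "__main__":
    # Made-up counts for illustration only, not taken from the database.
    or_value, lo, hi = odds_ratio_with_ci(40, 60, 20, 80)
    print(f"OR = {or_value:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
    print(interpret(lo, hi))
```

With these made-up counts, the odds ratio is about 2.67 and the interval runs from roughly 1.4 to 5.0. Because the whole interval lies above 1, students would describe the association as positive and statistically significant. An interval that includes 1 would mean the data are consistent with no association.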

Video- Student argument over how to interpret association


Day 2: Analyzing hypothesis testing results

Section C. Mapping Activity and Drawing Conclusions

1. Using the Mapping Activity in RPP-11, each group should fill in their overarching hypothesis in the oval. In each of the four squares, they should write the number for each of the four questionnaire questions and the specific hypothesis for that question.

2. Students should then draw the relationship between each question and becoming a regular smoker, using the key that is provided. Their decision for each question should be based on the results recorded in RPP-7-10 (or the pages they printed from their database queries).

3. The questions in RPP-12, Drawing Conclusions, are intended to guide students’ syntheses of their results. For Question 1, students should address how their results for each of the four questionnaire questions support their overarching hypothesis. Their four questions may not all support their overarching hypothesis to the same extent, so they should provide justification for their observations. In Question 2, they should discuss the two specific hypotheses that best support their claim and why. In Question 3, they should discuss the strength of their evidence supporting their overarching hypothesis and whether other factors, such as confounding factors or selection bias, might have a significant impact on their results. Question 4 asks about the broader implications of their findings for human health and the community. They may consider discussing how their results might impact public policy.

Evidence of Student Understanding- Students should be able to discuss how their evidence from each of their four specific hypotheses supports or refutes their overarching hypothesis. The strength of association for the four specific questions may be different, and if so, they should discuss why they are different.

For each specific hypothesis, students should be able to apply the criteria for causality. Not all of the criteria may apply. In addition, there may be contradiction among the different criteria. For example, there may be good evidence for a causal relationship, except their results may not be consistent with other studies. If so, students should weigh their results against previous findings and discuss why there might be differences (e.g. different experimental design or study population among the studies they are comparing). As a second example, they may think that there is good evidence for causality, but also be concerned about the possibility of a confounder or significant bias. In this case they would have to weigh the influence of the confounder or bias on their overall results.

Students should demonstrate an understanding of the implications of their findings for human health and their community. They may want to suggest how their results might be used to influence public policy. For example, if their research shows that smokers are more likely to have had parents who smoke, then they may propose that family doctors counsel parents who smoke about the risk of their children also becoming smokers.


4. Make sure that students have completed RPP-7-12 prior to continuing with Lesson 7. They will need to have RPP-7-12 finished to complete their final project.
