Replicability, also referred to as repeatability or reproducibility, is key to scientific progress. While true replication would involve new data collection to determine whether the same effect or phenomenon is observed in a new sample or population, another element of replication involves simply being able to take the original dataset and reproduce the reported results, following the reported methods used for data analyses. For your final project, you will reproduce and extend analyses from a published research article. There are several goals for this analysis replication project:
The final project will be turned in and graded in 6 phases (P0 through P5), worth a total of 100 points toward your final grade.
Activity | Total points |
---|---|
P0: Pick a partner and a paper | 5 |
P1: Data quality review | 10 |
P2: Data delivery | 15 |
P3: EDA report | 20 |
P4: Replication/extension report | 25 |
P5: Presentation | 25 |
The goal of this project piece is to explore the dataset for your final project. You will use the Quartz Bad Data Guide to guide your data quality review. Indicate which variable(s) in your dataset, if any, have any of the issues listed. For each variable you flag, please include a brief paragraph (2-3 sentences at most) describing the issue further.
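As a starting point for the review, a first pass over every variable can be scripted. The sketch below is only an illustration: it uses R's built-in `airquality` data as a stand-in for your own dataset, and the helper name `quality_screen()` is hypothetical. It counts missing values, exact zeros (which can hide missing values), and distinct values per variable, plus fully duplicated rows. It covers only a few of the issues in the guide and does not replace reading through the data yourself.

```r
# Minimal data quality screen (hypothetical helper, for illustration only).
# airquality is a built-in dataset standing in for your project data.
quality_screen <- function(df) {
  data.frame(
    variable  = names(df),
    # Number of NA values in each variable
    n_missing = vapply(df, function(x) sum(is.na(x)), integer(1)),
    # Number of exact zeros in numeric variables (NA for other types)
    n_zero    = vapply(df, function(x) {
      if (is.numeric(x)) sum(x == 0, na.rm = TRUE) else NA_integer_
    }, integer(1)),
    # Number of distinct values, useful for spotting inconsistent categories
    n_unique  = vapply(df, function(x) length(unique(x)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

screen <- quality_screen(airquality)
n_duplicate_rows <- sum(duplicated(airquality))  # fully duplicated rows

print(screen)
print(n_duplicate_rows)
```

Variables with non-zero counts in `n_missing` or suspicious counts in `n_zero` are candidates for the short written descriptions requested above.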
Please upload 4 things, as detailed in the Ellis & Leek paper:
This report should include:
Each team has already discussed the scope of your replication in a one-on-one meeting with Alison. The scope was intentionally limited to those analyses/results that are based on the general linear model (t-tests, linear regression, Analysis of Variance).
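To illustrate why t-tests, linear regression, and analysis of variance fall under one scope, the sketch below shows that a pooled-variance t-test and a one-way ANOVA can both be recovered from `lm()`. It uses the built-in `mtcars` data purely as an example; the variables chosen are assumptions for illustration and have nothing to do with any team's paper.

```r
# A pooled-variance two-sample t-test is a linear model with one binary
# predictor. mtcars is a built-in dataset used only for illustration.
t_res  <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)
lm_two <- lm(mpg ~ am, data = mtcars)
t_from_lm <- coef(summary(lm_two))["am", "t value"]

# The statistics match in magnitude (the sign depends on group ordering).
print(all.equal(abs(unname(t_res$statistic)), abs(t_from_lm)))

# A one-way ANOVA F test is the overall F test of the equivalent lm().
aov_fit <- aov(mpg ~ factor(cyl), data = mtcars)
lm_aov  <- lm(mpg ~ factor(cyl), data = mtcars)
f_from_aov <- summary(aov_fit)[[1]][["F value"]][1]
f_from_lm  <- unname(summary(lm_aov)$fstatistic["value"])

print(all.equal(f_from_aov, f_from_lm))
```

Note that the equivalence for the t-test holds only with `var.equal = TRUE`; R's default Welch test does not assume equal variances and will generally give a different statistic.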
You should attempt to replicate every result, table, and figure in the original paper that is relevant to that analysis, as well as any analyses reported in text.
Do not spend too much time on layout or formatting. If your figure has its legend in a different place or uses different colors, or if a table in the paper has particular formatting, don’t worry about replicating every last detail. What is important is that you attempt to replicate the content.
Your extension involves going beyond the original published article, using the same data. Your extension must be well-reasoned, with a sound research question that is clearly stated, and must include at least 2 of the following 3 possible extension pieces:
Your final report should be a document that summarizes your replication/extension project with code in R Markdown. This document will be highly structured. You will show all of your R code in chunks (you can do this easily by doing nothing at all! This keeps the default knitr global chunk setting of `echo = TRUE` for all chunks). Scripts that were used to import, clean, and tidy your data should be referenced in your R Markdown document using `source()` in a chunk, as in:
source("01-import.R") # part of your "data delivery"
source("02-clean.R") # part of your "data delivery"
source("03-tidy.R") # part of your "data delivery"
# etc as needed
We should be able to knit your R Markdown file with no errors after you upload a zip file for your project directory folder.
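One way to catch a common cause of knitting errors before you zip the folder is to confirm that every sourced script is actually present in the project directory. The sketch below is a hedged example: `find_missing_scripts()` is a hypothetical helper, and the file names simply mirror the `source()` example above, so adjust them to match your own project.

```r
# Hypothetical helper: return the expected scripts that are absent from a
# project directory. Paths are resolved relative to project_dir.
find_missing_scripts <- function(scripts, project_dir = ".") {
  scripts[!file.exists(file.path(project_dir, scripts))]
}

# File names taken from the source() example; replace with your own.
scripts <- c("01-import.R", "02-clean.R", "03-tidy.R")
missing_scripts <- find_missing_scripts(scripts)

if (length(missing_scripts) > 0) {
  message("Missing from project folder: ",
          paste(missing_scripts, collapse = ", "))
}
```

Keeping the paths relative to the project folder, rather than absolute paths on your own machine, is what allows the same check (and the same `source()` calls) to work after the zip file is unpacked elsewhere.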
Your final report should have three sections:
You will submit a zip file that contains:
Although the structure of the reports differs this year from years past, here are links to 2 sample reports:
At the end of the quarter, you and your partner(s) will present your project to the class. The goal of your presentation is to teach the class about your replication project, explain what you have learned, and reflect on the process. Each group will do a 20-minute presentation; here’s a rough outline (but you can tailor it to your specific project):
You are welcome to use slides.