Milestone 2

Due date: Tuesday, March 11, 2025 (with HW6)

Set up a GitHub Repository

Purpose: The goal of Milestone 2 is to make progress on the goals you set out earlier and get tailored feedback on next steps to make the final product as high quality as possible.

Task (requirements for passing this Milestone):

  1. Open https://github.com/orgs/Mac-Stat-212/repositories and create a team GitHub repository (indicate you want a private repo and want a README file when you create it). Name it 212Project_FirstName1-FirstName2-FirstName3 (replace FirstName with your group members’ first names).

  2. Add your teammates as collaborators to your project GitHub repository.

  3. Clone this repository to your local machine and create a folder structure as follows:

    • 212Project_FirstName1-FirstName2-FirstName3
      • code
        • clean (add a Milestone2.qmd here that implements your plan)
      • data
        • raw (add your original source data files in here)
        • clean (codebook.md should go in here)
      • results (nothing in here yet)
      • planning (move your Milestone1.qmd planning file in here)
  4. Create a .gitignore file.

    • What is this? .gitignore is a special file that tells Git what files and folders to ignore in version control.
    • Steps to create this file:
      1. Open the Terminal in RStudio. Enter the command pwd. Make sure that a path to your project folder is displayed. If not, use the command cd RELATIVE/PATH/TO/YOUR/PROJECT/FOLDER to change the directory to your project folder. The part that comes after cd is a relative path from your current location to your project folder. If you need to use cd, use pwd again afterward to confirm that you are in your project folder.

      2. Enter the command touch .gitignore (if on Mac) or the command type nul > .gitignore (if on Windows).

      3. Enter the command ls -a. You should see all files in this directory (including hidden files that start with .). You should see the .gitignore file.

      4. Enter the command open .gitignore (if on Mac) or start .gitignore (if on Windows). On a Mac, note that you can use tab completion in the Terminal as a typing shortcut: after you type open .giti, press Tab to auto-complete the rest of the .gitignore file name. This will open the .gitignore file in RStudio or your computer’s plain text editor.

      5. Add the following lines to your .gitignore file, and save the file.

        data/raw/
        .DS_Store
        .Rhistory
        .Rproj.user
        *.Rproj
        .quarto/
    • Add, commit, and push your changes to GitHub using GitHub Desktop.
  5. Add a code chunk that calls sessionInfo() to the end of each of your .Rmd/.qmd documents.

    • When rendering your markdown file to HTML, this adds information about all packages used in the file as well as their package versions. As packages get updated over time, old code may break, so it is good to know what version of a package was used to complete your work so that you can restore that particular package version.
  6. Create a codebook.md file (put it in data/clean/). Write up a data codebook. That is, describe the type and meaning of the variables in your dataset. Group your variables into categories (e.g., demographic variables, neighborhood variables).

    • If you have a lot of variables, it may not be necessary/feasible to describe every variable individually. Rather, you can describe groups of similar variables.
  7. In a Milestone2.qmd file, complete the steps in your plan from Milestone 1 (the plan with feedback from the instructional team).

  8. At the bottom of Milestone2.qmd, write a plan for further pursuing your 2-3 broad questions. Make sure that the steps in this plan are reasonable to complete in the next few weeks for Milestone 3, which involves writing a short blog post with an initial data story (your results so far). You will receive feedback on this plan and will be expected to integrate this feedback for Milestone 3. Questions to think about as you develop this plan:

    • Do your 2-3 original broad questions need to be revised?
    • What additional information and context (news articles, journal articles, expert interviews, etc.) do you need to understand the data?
    • Is it time to start looking for additional datasets?
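
For teams comfortable in the Terminal, the folder structure from step 3 and the .gitignore from step 4 can also be created in one pass. The following is only a minimal sketch, not a required part of the Milestone. It assumes a Unix-style shell (the Mac Terminal, or Git Bash on Windows) and that you run it from the root of your cloned repository. The PROJECT_DIR variable is a hypothetical convenience that is not part of the assignment.

```shell
#!/bin/sh
# Stop at the first error or unset variable.
set -eu

# Hypothetical variable: the project root. Defaults to the current directory,
# so run this from inside your cloned repository (check with pwd first).
PROJECT_DIR="${PROJECT_DIR:-.}"

# Step 3: create the folder structure. The -p flag creates missing parent
# folders and does not complain if a folder already exists.
mkdir -p "$PROJECT_DIR/code/clean" \
         "$PROJECT_DIR/data/raw" \
         "$PROJECT_DIR/data/clean" \
         "$PROJECT_DIR/results" \
         "$PROJECT_DIR/planning"

# Step 4: write the .gitignore entries listed above.
# Note: this overwrites any existing .gitignore in PROJECT_DIR.
cat > "$PROJECT_DIR/.gitignore" <<'EOF'
data/raw/
.DS_Store
.Rhistory
.Rproj.user
*.Rproj
.quarto/
EOF

# List all files, including hidden ones, to confirm that .gitignore exists.
ls -a "$PROJECT_DIR"
```

You still need to move your own files (Milestone2.qmd, codebook.md, Milestone1.qmd, and the raw data) into place by hand. Also note that Git does not track empty folders, so results will not appear on GitHub until it contains at least one file.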

To turn in your work:

  • Commit and push your changes to GitHub using GitHub Desktop.