Learning Goals
Course catalog description
This second course in the data science curriculum emphasizes advanced data wrangling and manipulation, interactive visualization, writing functions, working with data in databases, version control, and data ethics. Through open-ended and interdisciplinary projects, students practice the constant feedback loop of asking questions of the data, manipulating the data to help answer those questions, and then returning to more questions. Prerequisite(s): COMP 112, COMP 123, and STAT 155; STAT 253 recommended but not required.
By the end of this course you should be able to:
- Sustain a reflection practice
  - Reflect on your learning process so that you are equipped for independent learning
  - Reflect on your collaborative work so that you can develop community no matter where you go
- Create effective visualizations and interactive applications
  - Create a variety of visualizations in ggplot2 that go beyond the plot types that you learned in STAT/COMP 112
  - Wrangle and visualize spatial data
  - Create interactive web applications and visualizations that adapt to user input
- Wrangle arbitrarily messy data
  - Use appropriate R tools to manage numeric, logical, date, string, and factor data
  - Use appropriate R tools to write functions and loops
  - Use appropriate methods when working with missing data
  - Double-check your data cleaning steps to ensure accuracy
- Acquire data from a variety of sources
  - Write queries in Structured Query Language (SQL) to access data from databases
  - Write code to access data from application programming interfaces (APIs)
  - Write code to scrape data from websites and evaluate the ethics of collecting such data
- Craft high-quality data stories
  - Iterate on the question-explore-question cycle to craft compelling data stories with attention to data context and ethical considerations
  - Use a combination of data acquisition, data wrangling, static and interactive visualization, and statistical modeling to further a data science investigation
- Use AI and search tools to figure out difficult tasks
  - Use appropriate coding jargon to construct effective search queries (e.g., Google) and evaluate the accuracy of results that you find
  - Construct effective AI prompts (e.g., ChatGPT, Google Bard) and evaluate the accuracy of generated results
  - Articulate the ethical considerations in using AI and search tools
- Use professional data science tools
  - Use Git as a version control system
  - Use GitHub as a platform for sharing code and collaborating with others
  - Maintain a digital portfolio of your data science projects on your personal website
- Work in a collaborative team
  - Understand and demonstrate characteristics of effective collaboration (team roles, interpersonal communication, self-reflection, awareness of social dynamics, advocating for yourself and others)
  - Develop a common purpose and agreement on goals
  - Contribute questions or concerns in a respectful way
  - Share and contribute to the group’s learning in an equitable manner
  - Develop familiarity and comfort with collaboration tools such as Git and GitHub