Course Description

This course teaches the skills, concepts, and theories behind data analysis, visualization, and basic statistics. It combines these topics to give students in-demand job skills, with a focus on real-world applications. Students will learn about ethical issues and the impact of bias in data, applications of statistics in Excel, data visualization tools in Excel, linear regression, time series, and classification algorithms.

Learning Objectives

At the completion of the course, students will be able to:

• Understand implications of machine bias in case studies
• Know the differences between common data types in Excel
• Demonstrate ability to calculate standard statistical measures in Excel
• Demonstrate proficiency with common data analysis techniques in Excel, e.g., Pivot Tables and Joins
• Create and interpret a Linear Regression Model in Excel
• Create and interpret a Time Series Model in Excel
• Understand the difference between Supervised and Unsupervised Learning
• Implement Logistic Regression in Excel
• Use basic plotting tools in Excel
• Explain basic concepts of Hypothesis Testing

Statistical Analysis with Excel

Schedule

Week 1

• Impact of Analytics & Ethics in the Real World
• Ethics Case Study: Criminal Recidivism
• Quiz

Week 2

• Impact of Analytics & Ethics in the Real World
• Ethics Case Study: Home Mortgage Disclosure Act
• Quiz

Week 3

• Introduction to Analysis with Excel
    • Data types
    • Excel Tables
    • Filtering
    • Selecting
• Summary Statistics
    • Mean, Median, Mode
    • Variance
    • Standard Deviation
• Quiz
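
For students who want to check their spreadsheet results outside Excel, the Week 3 summary statistics can be sketched in Python. This is only an illustration and not part of the course materials: the `summarize` helper and the scores are made up, and the Excel function names in the comments are the usual counterparts rather than the ones the course necessarily uses.

```python
import statistics


def summarize(values):
    """Return the Week 3 summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),          # Excel: AVERAGE
        "median": statistics.median(values),      # Excel: MEDIAN
        "mode": statistics.mode(values),          # Excel: MODE.SNGL
        "variance": statistics.variance(values),  # sample variance, Excel: VAR.S
        "std_dev": statistics.stdev(values),      # sample std dev, Excel: STDEV.S
    }


# Made-up quiz scores, used only for illustration.
scores = [4, 8, 8, 5, 10]
summary = summarize(scores)
print(summary)
```

The sketch uses the sample versions of variance and standard deviation (dividing by n - 1). The population versions are `statistics.pvariance` and `statistics.pstdev` in Python, and VAR.P and STDEV.P in Excel.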

Week 4

• Pivot Tables and Pivot Charts in Excel
• Guided hands-on exercises using custom datasets
• Test (up to pivot tables)

Week 5

• Basic Probability Theory
• Combinatorics & Permutations
• Independence
• Conditional Probability
• Quiz

Week 6

• Distributions
• Visual Summary Statistics
    • Histograms
    • Boxplots & interquartile ranges
    • Outliers
• Quiz

Week 7

• Applications of Probability in Industry
• Quiz

Week 8

• Bayes’ Theorem
• Midterm (up to week 7)
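
As a preview of the Week 8 topic, Bayes' Theorem states that P(A | B) = P(B | A) P(A) / P(B). The sketch below works one example in Python. It is an illustration only: the `bayes_posterior` helper and the screening-test numbers are hypothetical and do not come from the course materials.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A), and P(B | not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% base rate, 95% sensitivity, 5% false positives.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.161
```

Even with a fairly accurate test, a positive result here implies only about a 16% chance of the condition, because the base rate is so low.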

Week 9

• Law of Large Numbers
• Central Limit Theorem
• Z-scores and applications of CLT
• Quiz

Week 10

• Applications of Statistics in Excel
• Biases
    • Confirmation
    • Selection
    • Small sample sizes
• Exploratory Data Analysis in Excel
    • Scatterplots
    • Barplots
• Quiz

Week 11

• Hypothesis Testing
• Type I & Type II errors
• Confidence Intervals
• Quiz

Week 12

• More applications of Hypothesis Testing
• Test 3 (up to week 11)

Week 13

• Introduction to Supervised and Unsupervised Learning
• Relationships between variables
• Linear Regression in Excel
• Assumptions of Linear regression
• Feature Engineering
• Normalizing features
• Dealing with count data
• Interpreting residual plots

Week 14

• Time Series in Excel
• Moving Average Models
• Weighted Moving Average Models
• Quiz
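
The two Week 14 models differ only in how they weight recent observations. A minimal Python sketch is shown below, assuming the last weight applies to the most recent value. The function names and sales figures are made up for illustration, and the course itself builds these models in Excel.

```python
def moving_average(values, window):
    """Simple moving average: unweighted mean of each run of `window` values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def weighted_moving_average(values, weights):
    """Weighted moving average; weights[-1] applies to the most recent value."""
    window = len(weights)
    total = sum(weights)
    return [
        sum(v * w for v, w in zip(values[i:i + window], weights)) / total
        for i in range(len(values) - window + 1)
    ]


# Made-up monthly sales figures, used only for illustration.
sales = [10, 12, 14, 16]
sma = moving_average(sales, window=3)
wma = weighted_moving_average(sales, weights=[1, 2, 3])
print(sma, wma)
```

With a rising series like this one, the weighted version tracks the latest values more closely than the simple average because the most recent month carries the largest weight.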

Week 15

• Logistic regression
• Case Study
• Implementing Logistic Regression in Excel
• SF vs NY housing dataset
• Theory & Assumptions
• Quiz

Week 16

• Review
• Putting it all together
• What’s to come in courses 2 & 3
• Final Exam