Course Description

The objective of this course is to teach skills, concepts, and theories relevant to data analysis, visualization, and basic statistics. These topics are combined to give students in-demand job skills, with a focus on real-world applications. Students will learn about ethical issues and the impact of bias in data, applications of statistics in Excel, data visualization tools in Excel, linear regression, time series, and classification algorithms.

Learning Objectives

At the completion of the course, students will be able to:

  • Understand implications of machine bias in case studies
  • Know the differences between common data types in Excel
  • Demonstrate ability to calculate standard statistical measures in Excel
  • Demonstrate proficiency with common data analysis techniques in Excel, e.g., Pivot Tables and Joins
  • Create and interpret a Linear Regression Model in Excel
  • Create and interpret a Time Series Model in Excel
  • Understand the difference between Supervised and Unsupervised Learning
  • Implement Logistic Regression in Excel
  • Use basic plotting tools in Excel
  • Explain basic concepts of Hypothesis Testing

Recommended Reading

Statistical Analysis with Excel

Schedule

Week 1

  • Impact of Analytics & Ethics in the Real World
    • Business Case Study: Spotify
    • Ethics Case Study: Criminal Recidivism
  • Quiz

Week 2

  • Impact of Analytics & Ethics in the Real World
    • Business Case Study: Airbnb
    • Ethics Case Study: Home Mortgage Disclosure Act
  • Quiz

Week 3

  • Introduction to Analysis with Excel
    • Data types
  • Excel Tables
    • Filtering
    • Selecting
  • Summary Statistics
    • Mean, Median, Mode
    • Variance
    • Standard Deviation
  • Quiz

Week 4

  • Pivot Tables and Pivot Charts in Excel
    • Guided hands-on exercises using custom datasets
  • Test (up to pivot tables)

Week 5

  • Basic Probability Theory
    • Combinations & Permutations
    • Independence
    • Conditional Probability
  • Quiz

Week 6

  • Distributions
  • Visual Summary Statistics
    • Histograms
    • Boxplots & interquartile ranges
    • Outliers
  • Statistical Paradoxes
  • Quiz

Week 7

  • Applications of Probability in Industry
  • Quiz

Week 8

  • Bayes’ Theorem
  • Midterm (up to week 7)

Week 9

  • Law of Large Numbers
  • Central Limit Theorem
  • Z-scores and applications of CLT
  • Quiz

Week 10

  • Applications of Statistics in Excel
  • Biases
    • Confirmation
    • Selection
    • Small sample sizes
  • Exploratory Data Analysis in Excel
    • Scatterplots
    • Barplots
  • Quiz

Week 11

  • Hypothesis Testing
  • Type I & Type II errors
  • Confidence Intervals
  • Quiz

Week 12

  • More applications of Hypothesis Testing
  • Test 3 (up to week 11)

Week 13

  • Introduction to Supervised and Unsupervised Learning
  • Relationships between variables
  • Linear Regression in Excel
    • Business Case Study: Zillow
    • Assumptions of Linear Regression
  • Feature Engineering
  • Normalizing features
  • Dealing with count data
  • Interpreting residual plots

Week 14

  • Time Series in Excel
    • Moving Average Models
    • Weighted Moving Average Models
  • Quiz

Week 15

  • Logistic Regression
    • Case Study
    • Implementing Logistic Regression in Excel
      • SF vs NY housing dataset
    • Theory & Assumptions
  • Quiz

Week 16

  • Review
  • Putting it all together
  • What’s to come in courses 2 & 3
  • Final Exam