Data Science Principles


Develop a Data Mindset

Data Science Principles is a Harvard Online course in collaboration with Harvard Business School Online that gives you an overview of data science with a code- and math-free introduction to prediction, causality, data wrangling, privacy, and ethics.





What You'll Learn

The course will be delivered via HBS Online’s course platform and immerse learners in real-world examples from experts at industry-leading organizations. By the end of the course, participants will be able to:

  • Understand the modern data science landscape and technical terminology for a data-driven world
  • Recognize major concepts and tools in the field of data science and determine where they can be appropriately applied
  • Appreciate the importance of curating, organizing, and wrangling data
  • Explain uncertainty, causality, and data quality—and the ways they relate to each other
  • Predict the consequences of data use and misuse and know when more data may be needed or when to change approaches



Who Will Benefit

Students and Recent Graduates

Prepare for your career by building a foundation of the essential concepts, vocabulary, skills, and intuition necessary for business.

Early- and Mid-Career Professionals

Recognize how data is changing industries and think critically about how to develop a data-driven mindset to prepare you for your next opportunity.

Marketing & Project Management Professionals

Learn how data science techniques can be essential to your industry and how to contribute to cross-functional, data-oriented discussions.


Meet Your Instructor

Dustin Tingley

Dustin Tingley is a data scientist at Harvard University. He is Professor of Government and Deputy Vice Provost for Advances in Learning, and he helps direct Harvard's education-focused data science and technology team. Professor Tingley has helped a variety of organizations use the tools of data science, and he has helped develop machine learning algorithms and accompanying software for the social sciences. He has written on a variety of topics using data science techniques, including education, politics, and economics.


Data Science Principles makes the fundamental topics in data science approachable and relevant through real-world examples, and it prompts learners to think critically about applying what they learn in their own workplaces.

Learning requirements: In order to earn a Certificate of Completion from Harvard Online and Harvard Business School Online, participants must thoughtfully complete all 7 modules, including associated quizzes, by stated deadlines.




Modules, Case Studies, Takeaways, and Key Exercises

Module 1: Data 101
  Case study: Flu Detection
  Takeaways and key exercises:
  • Explain why data collection is important
  • Identify factors that may affect data quality
  • Recognize that not all data is numerical
  • Explain how the organization of data can affect the information you are able to extract from it
  • List sources of data
  • Discuss what can be done with data
  • Categorize data by various factors
  • Determine whether data is high-quality or not

Module 2: Predictions and Recommendations
  Case study: Predicting Sepsis
  Takeaways and key exercises:
  • Understand the basic structure of a predictive algorithm
  • Identify where human decisions shape predictive systems
  • Evaluate the success of a predictive system
  • Examine how weather forecasts work
  • Use data to create a prediction
  • Sort types of training data
  • Simulate a predictive system

Module 3: Cause and Effect
  Case study: The Google Tax
  Takeaways and key exercises:
  • Explain why it is important to establish causal relationships
  • Identify barriers to establishing causal relationships in a variety of settings
  • Identify why randomization can help establish a causal relationship but also create other problems
  • Classify relationships based on correlation or causation
  • Examine the relationship between variables
  • Identify potential common causes for correlated events

Module 4: Data Governance and Privacy
  Case study: Privacy and Facial Recognition
  Takeaways and key exercises:
  • Explain why data privacy is important
  • Describe what can constitute a violation of privacy
  • Critique existing privacy policies
  • Create a set of ethical tenets to guide data work at their own organizations
  • Formulate data privacy guidelines
  • Discuss the risks of data re-identification
  • Evaluate existing data privacy policies for ethics

Module 5: Beyond the Spreadsheet
  Case study: Burning Glass and Text Data
  Takeaways and key exercises:
  • Identify sources of non-numerical data
  • Explain why it would be useful to use non-numerical data
  • Describe the differences in approach for supervised and unsupervised learning
  • Identify use cases for neural networks
  • Perform a sentiment analysis
  • Determine what types of data an algorithm cannot read
  • Examine how computers intake visual and audio data
  • Experiment with facial recognition

Module 6: Data Science Ecosystems
  Case study: Harvard Link
  Takeaways and key exercises:
  • Explain the importance of data transformation and wrangling
  • List the common technologies used within data science ecosystems
  • Describe the connection between data science tasks, software tools, and hardware tools
  • Identify potential sources of bottlenecks in the data science process
  • Identify and order the lifecycle of data
  • Define what "the cloud" is
  • Estimate the size of various data streams

Module 7: The Road Ahead
  Case study: Healthcare Prioritization
  Takeaways and key exercises:
  • Recognize a problem that an algorithm might be able to solve
  • Recognize the challenges created by using data science tools in ways outside their intended use
  • Identify steps within the data science process that need auditing
  • Choose types of data to ingest into an algorithm
  • Evaluate the risks of solely using an algorithm to make decisions
  • Discuss how algorithms can reinforce biases
  • Create a set of guidelines to evaluate projects


What past participants are saying about Data Science Principles

  • “Data Science Principles applies to many aspects of our daily lives. The course helps guide people in everyday life through decision making and process thinking.”

    - Jared B. on Data Science Principles
    Senior Director of Sales

  • “I found value in the real-world examples in Data Science Principles. With complicated topics and new terms, it's especially beneficial for learnings to be able to tie back new or abstract concepts to ideas that we understand. This course helped me understand data in this context and what algorithms are actually trying to solve.”

    - Alejandro D. on Data Science Principles
    Financial Services Analyst

  • “This is a topic that people in any industry should have at least basic knowledge of in order to create more efficient and competitive businesses, tools, and resources.”

    - Carlos E. Sapene on Data Science Principles
    CEO, Chief Strategy Officer

  • “This course was impactful especially using case studies of real-life situations to solve complex and confusing problems. The results of this will help improve my managerial decisions within and outside the organizations to minimize risks and increase profits.”

    - Bamidele Ajisogun on Data Science for Business
    Sr. Project Analyst Business Intelligence, Strategy, Product Development & Innovation UPMC Workpartners

  • “This course had an amazing instructor, amazing examples, and an amazing user interface that made it easy for me to grasp the material and learn simultaneously with others around the world.”

    - Shawn Carrington, Jr. on Data Science for Business
    Senior Executive Officer Perspecta, Inc.

  • “This course is very well structured. Learning goals are well set up and in line with my expectations. I found the course to be just as entertaining as educational and presented in a very attractive manner.”

    - Data Science Principles Participant