[Johns Hopkins University] Data Science Specialization

Last Updated on 29 April 2023

The Data Science Specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills you have learned by building a data product using real-world data. On completion, you will have a portfolio demonstrating your mastery of the material.

Course Duration

Approx. 11 months to complete, suggested 7 hours/week, flexible schedule.

Course Content

The Data Science Specialization has 10 courses. You should have beginner-level experience in Python. Familiarity with regression is recommended.

Begin by taking The Data Scientist’s Toolbox and R Programming, in that order. The other courses may be taken in any order, and in parallel if desired.

Course 1: The Data Scientist’s Toolbox
There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.

Course 2: R Programming
This course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis provide the working examples.

Course 3: Getting and Cleaning Data
This course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”.

Course 4: Exploratory Data Analysis
This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models.

Course 5: Reproducible Research
This course will focus on literate statistical analysis tools, which let you publish a data analysis as a single document that others can easily execute to obtain the same results.

Course 6: Statistical Inference
This course presents the fundamentals of inference in a practical approach for getting things done. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data.

Course 7: Regression Models
This course covers regression analysis, least squares, and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well.

Course 8: Practical Machine Learning
This course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.

Course 9: Developing Data Products
This course covers the basics of creating data products using Shiny, R packages, and interactive graphics.

Course 10: Data Science Capstone
The capstone project class will allow students to create a usable, public data product that they can use to show their skills to potential employers. Projects will be drawn from real-world problems and will be conducted with industry, government, and academic partners.

Can I just enroll in a single course?

Yes! To get started, click the course title that interests you and enroll. You can enroll and complete the course to earn a shareable certificate, or you can audit it to view the course materials for free. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. Visit your learner dashboard to track your progress.
