DataTrail Courses

DataTrail scholars complete the DataTrail courses, a free online curriculum designed to help people with limited computer experience move into data science. This set of courses was created by faculty members and researchers at the Johns Hopkins Bloomberg School of Public Health.

This set of courses teaches skills ranging from word processing to basic data analysis, including how to network and get a job in data science. All courses are free and open to all through the Leanpub course platform. You can access the entire set of courses here.

Course material is also available without login (but the courses can't be taken for certification unless you use Leanpub).

This is the first class in the DataTrail series. Data science is one of the most exciting and fastest-growing careers in the world. The goal of this series is to help people with no background and limited resources transition into data science. The only prerequisites are a computer with a web browser and the ability to type and follow instructions. We'll guide you through the rest.

This course will introduce you to using a Chromebook. The Introduction and Setup course might sound simple, but it will set up the infrastructure for success with the later, more challenging courses.

The Google and the Cloud course introduces Google’s built-in apps, which form the backbone of a Chromebook. We’ll go step by step through the process of integrating these apps into your productivity workflow.

Projects are central to the role of any data scientist. These lessons will discuss how to organize projects and the files that belong to each project, and will introduce you to Markdown, a simple syntax for formatting plain-text documents in a standard way.

GitHub is the world’s most popular version control website. Together, GitHub and Markdown provide a powerful way for you to get your code out to the world. In this course, we will tour GitHub, discussing the basic features of the website, what a repository is, and how to work with repositories on GitHub.

R is a simple-to-learn programming language that is powerful for data analysis. The R Basics course will teach you how to get started from scratch. We will discuss what objects and packages are, introduce some basic R commands, and discuss RMarkdown, which you will use to write all your reports and to develop a personal website.

This course will focus on how to organize and tidy data sets in R, which is the first step most data scientists take before analyzing data.

This course will cover the different types of visualization most commonly used by data scientists as well as how to make these different plots in R. We will cover how to make basic tables and figures as well as how to make interactive graphics.

Data is often misunderstood in both subject and application. The Data course will focus on understanding what data is, what the data you’ll encounter will look like, and how to analyze and use data. Additionally, we’ll start to discuss important ethical and legal considerations when working in data science, where to find data, and how to work with these data in RStudio.

This course will discuss the various types of data analysis, what to consider when carrying out an analysis, and how to approach a data analysis project.

This course will discuss better practices for oral and written communication in data science.

After you learn all of these skills, it is still crucial that you learn the best ways to network and get a job in data science. This course will focus on so-called soft skills: how to give presentations, how to present yourself in the online community, how to network, and how to approach data science interviews.

This course is designed for people with no background in package or software development. It will be helpful if you have already taken our Introduction to R and Version Control courses. This should be a great introduction to R package development for individuals who have not previously developed software.