Data Science: Wrangling

In this course, part of our Professional Certificate Program in Data Science, we cover several standard steps of the data wrangling process: importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining. A single analysis rarely requires all of these steps, but a data scientist will likely face each of them at some point. Data in a data science project is very rarely easy to access. It is more likely to sit in a file or a database, or to need extracting from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy it using the tidyverse package. The steps that convert data from its raw form to the tidy form are called data wrangling. This process is critical for any data scientist. Knowing how to wrangle and clean data will enable you to uncover critical insights that would otherwise stay hidden.
Created by: Harvard University
Level: Introductory
Related Online Course Listings
Darío, who works at an NGO in Colombia and took part in the course, notes that he applied what he learned to create DATASIMUS, an urban mobility data portal that compares more than 400 variables across the region's transport systems. Read other testimonials below about the impa...
If you have specific questions about this course, please contact us at sds-mm@mit.edu. Data science requires multidisciplinary skills ranging from mathematics, statistics, machine learning, and problem solving to programming, visualization, and communication. In this course, learners will...
In data science, data is called "big" if it cannot fit into the memory of a single standard laptop or workstation. Analyzing big datasets requires a cluster of tens, hundreds, or thousands of computers. Using such clusters effectively requires distributed file systems, such...
We will explain how to perform the standard processing and normalization steps, starting with raw data, to get to the point where one can investigate relevant biological questions. Throughout the case studies, we will make use of exploratory plots to get a general overview of the shape of the...
In this course, we begin with approaches to visualization of genome-scale data, and provide tools to build interactive graphical interfaces to speed discovery and interpretation. Using knitr and rmarkdown as basic authoring tools, the concept of reproducible research is developed, and the concept...