Concept B.1.1

Data cleaning

Identify and address data quality issues to ensure accuracy and reliability, progressing from simple error identification to using systematic approaches.

K–2 Competencies

Recognize and explain any missing data (e.g., a student was absent when data was collected) or data recording errors (e.g., "10" recorded as a "1").

K-2.B.1.1a

Record responses so that you can tell if everyone has been asked.

K-2.B.1.1b

Classroom resources

Classroom Tip
Getting Started

Data Science Starter Kit Module 2: Getting Data Ready - Creation and Curation

Welcome to the hands-on world of data collection and organization! This module focuses on where data comes from and how to make it useful for investigation.

Creation and Curation isn’t about turning your students into professional researchers. It’s about helping them understand that data doesn’t just appear—it’s collected by people making decisions about what to measure and how. Whether students are conducting their own surveys or using existing datasets, they need to understand how data gets from the messy real world into organized, analyzable formats.

3–5 Competencies

Look through data to identify missing data, and add additional cases or values for variables if needed.

3-5.B.1.1a

Look through data to identify unreasonable values or recording errors in data values, and correct these if the correct values are known.

3-5.B.1.1b

6–8 Competencies

Informally identify anomalies and outliers in a distribution of data and make an informed decision as to whether those observations should be removed or filtered out for analysis.

6-8.B.1.1a
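
For teachers who want a concrete check to set beside students' informal judgments, the sketch below flags values that sit far from the middle of a distribution. It uses the 1.5 × IQR rule of thumb, which is one common convention and not something the progression prescribes. The function name and the reading-time data are hypothetical. The code only flags candidates; deciding whether to remove them remains the student's informed decision.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical class data: minutes each student spent reading last night.
reading_minutes = [20, 25, 30, 22, 28, 24, 26, 240]
suspects = flag_outliers(reading_minutes)
print(suspects)
```

A flagged value such as 240 minutes could be a typing error for 24 or a student who really did read for four hours, which is why the decision should rest on context and not on the rule alone.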

9–10 Competencies

Use data dictionaries to identify codes for missing or incomplete data (e.g., NA, 99999, 0, " "), and either recode or filter data to remove those observations.

9-10.B.1.1a
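
As an illustration of this competency, the sketch below recodes missing-value codes to a single marker and then filters out incomplete observations. The data dictionary, column names, and rows are hypothetical, and values are assumed to arrive as text, as they would from a CSV file. Note that a code such as 0 counts as missing only when the data dictionary says so for that variable.

```python
# Hypothetical data dictionary: each variable lists the codes that mean "missing".
MISSING_CODES = {
    "age": {"NA", "99999"},
    "siblings": {"NA", ""},  # 0 is a real value here, so it is not listed
}

def recode_missing(rows, missing_codes):
    """Return copies of rows with each listed missing code replaced by None."""
    cleaned = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            codes = missing_codes.get(column, set())
            # Strip whitespace so that " " is treated like an empty entry.
            new_row[column] = None if value.strip() in codes else value
        cleaned.append(new_row)
    return cleaned

def drop_incomplete(rows):
    """Keep only the rows that have no missing values."""
    return [row for row in rows if None not in row.values()]

raw_rows = [
    {"age": "15", "siblings": "0"},
    {"age": "99999", "siblings": "2"},
    {"age": "16", "siblings": " "},
]
recoded = recode_missing(raw_rows, MISSING_CODES)
complete = drop_incomplete(recoded)
print(len(raw_rows), "rows read,", len(complete), "complete")
```

Recoding keeps every observation and makes the gaps visible, while filtering shrinks the dataset; which one is appropriate depends on the question being investigated.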

Apply basic cross-validation techniques to verify data quality across multiple sources, including source comparison, split sampling, internal consistency checks, and domain range validation.

9-10.B.1.1b
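
Three of the techniques named above can be sketched in a few lines each: domain range validation, an internal consistency check, and source comparison. Split sampling is left out for brevity. The record layout, field names, and the plausible class-size range of 1 to 40 are assumptions made for this example.

```python
def out_of_range(records, field, low, high):
    """Domain range validation: ids whose field falls outside [low, high]."""
    return [r["id"] for r in records if not (low <= r[field] <= high)]

def inconsistent_counts(records):
    """Internal consistency: ids where present + absent does not equal enrolled."""
    return [r["id"] for r in records
            if r["present"] + r["absent"] != r["enrolled"]]

def source_mismatches(source_a, source_b, field):
    """Source comparison: ids found in both sources whose field values differ."""
    b_by_id = {r["id"]: r for r in source_b}
    return [r["id"] for r in source_a
            if r["id"] in b_by_id and r[field] != b_by_id[r["id"]][field]]

# Hypothetical attendance data recorded by two different sources.
teacher_log = [
    {"id": "A", "enrolled": 25, "present": 23, "absent": 2},
    {"id": "B", "enrolled": 24, "present": 20, "absent": 3},
    {"id": "C", "enrolled": 250, "present": 240, "absent": 10},
]
office_log = [
    {"id": "A", "enrolled": 25},
    {"id": "B", "enrolled": 24},
    {"id": "C", "enrolled": 25},
]

range_problems = out_of_range(teacher_log, "enrolled", 1, 40)
count_problems = inconsistent_counts(teacher_log)
source_problems = source_mismatches(teacher_log, office_log, "enrolled")
print(range_problems, count_problems, source_problems)
```

Each check catches a different kind of problem: class C is internally consistent but implausibly large and disagrees with the office record, while class B looks plausible but its counts do not add up.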

11–12 Competencies

Develop comprehensive data validation procedures, including automated checks.

11-12.B.1.1a
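
One possible shape for an automated check is a list of named rules that a single function applies to every row, returning a report of failures. The sketch below is a minimal illustration of that idea, not a prescribed procedure; the rule names, thresholds, and weather readings are hypothetical.

```python
def validate(rows, rules):
    """Run every rule on every row; return (row_index, rule_name) per failure."""
    failures = []
    for index, row in enumerate(rows):
        for name, check in rules:
            if not check(row):
                failures.append((index, name))
    return failures

# Hypothetical rules for a weather-station dataset.
RULES = [
    ("temperature present",
     lambda row: row.get("temp_c") is not None),
    ("temperature plausible",
     lambda row: row.get("temp_c") is None or -90 <= row["temp_c"] <= 60),
    ("humidity is a percentage",
     lambda row: 0 <= row["humidity"] <= 100),
]

readings = [
    {"temp_c": 21.5, "humidity": 40},
    {"temp_c": None, "humidity": 55},
    {"temp_c": 210.0, "humidity": 130},
]
report = validate(readings, RULES)
for row_index, rule_name in report:
    print(f"row {row_index}: failed '{rule_name}'")
```

Keeping the rules in a list separates what counts as valid from the code that does the checking, so students can add or revise rules as they learn more about the dataset.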

Implement verification protocols for complex datasets with multiple dependencies.

11-12.B.1.1b
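
One reading of "multiple dependencies" is a dataset split across tables that refer to one another, where a row in one table is only valid if the row it points to exists. Under that assumption, the sketch below finds readings that cite a station missing from the station list. The table and field names are hypothetical.

```python
def orphaned_rows(child_rows, parent_rows, key):
    """Return child rows whose key has no matching row in the parent table."""
    known = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in known]

# Hypothetical tables: every reading should belong to a listed station.
stations = [{"station_id": "S1"}, {"station_id": "S2"}]
station_readings = [
    {"station_id": "S1", "temp_c": 20.0},
    {"station_id": "S9", "temp_c": 18.5},
]
orphans = orphaned_rows(station_readings, stations, "station_id")
print(orphans)
```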
