15 Online Data Science Courses for High School Students
For high school students curious about emerging tech fields, online courses offer a structured way to explore interests beyond standard curricula. These courses allow you to learn about and build tech skills, explore industry-relevant tools, and gain exposure to how professionals work in real settings. Online learning also connects you to instructors and peers from around the world. A major advantage of fully virtual courses is flexibility, as you can access university-level content from home without worrying about travel or temporary relocation costs. If your interests lie in data science, an online course in the field can help you study the basics as well as advanced programming, machine learning, and data visualization concepts from anywhere in the world.
How are data science courses different from other programs in high school?
Unlike traditional classes and theory-focused programs, data science courses emphasize applied learning through coding, data analysis, and problem-solving projects while helping you track your progress through assessments. You will engage with topics such as statistics, machine learning, and data interpretation as you work through structured modules that progress from basics to advanced concepts. Many courses are designed to be self-paced, allowing you to balance coursework with school and extracurriculars.
To help you find the right fit, we have identified 15 online data science courses for high school students.
If you’re looking for free online programs, check out our blog here.
1. Stanford Pre-Collegiate Studies: Introduction to Data Science
Location: Virtual
Cost: $3,200
Application deadline: March 13
Dates: June 15 – 26 | July 6 – 17
Eligibility: Students in grades 9 – 11 with a working knowledge of statistics and exposure to a computer programming language
This Stanford University course introduces you to data science as a method for understanding and analyzing complex information using computational tools. You will explore how algorithms generate different models and how each model entails trade-offs among accuracy, interpretability, and applicability. You will analyze datasets from multiple disciplines, exploring how data science methods adapt across domains. You will also practice machine learning techniques through R-based coding exercises that reinforce conceptual understanding. Finally, you will apply ethical and methodological considerations while working with data.
2. Veritas AI: AI Scholars & AI Fellowship
Location: Virtual
Cost: Varies; financial aid available
Application deadline: Varies by cohort. You can apply to the program here.
Dates: 10 – 15-week cohorts run several times each year
Eligibility: High school students; the AI Fellowship with Publication and Showcase accepts previous AI Scholars participants or those with some experience working with AI or Python.
Veritas AI offers multiple focused, short learning opportunities designed for hands-on exploration of artificial intelligence. These programs have been developed and executed by Harvard graduate students and alumni. If applying to the beginner-friendly AI Scholars program, you will attend 10 sessions that introduce you to data science, Python, machine learning, and AI concepts. If you have prior experience with AI/coding, you can also opt for the AI Fellowship with Publication and Showcase program, which offers mentorship from AI practitioners or researchers to help you develop your own unique project. Here, you will also receive support from an internal publication team that helps prepare your work for high school research journals. You can check out past projects here and read about a student’s experience in the program here.
3. HarvardX: Data Science: Building Machine Learning Models
Location: Virtual via edX
Cost: Free to enroll; $149 to get a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about eight weeks
Eligibility: Open to all
This course from HarvardX focuses on how machine learning models are constructed and evaluated in data science. You will explore how machine learning differs from other computer-guided decision systems by relying on data-driven prediction. You will also learn about foundational algorithms and dimensionality-reduction techniques, such as principal component analysis. The course allows you to practice what you learn by building a movie recommendation system to apply regularization and model training concepts. Additionally, you will work with training data to generate predictions for unseen datasets. Modules also cover overfitting and validation techniques used to assess model performance.
4. Lumiere Research Scholar Program: AI Track
Location: Virtual
Cost: Varies; financial assistance offered
Application deadline: Varies by cohort
Dates: Multiple sessions, including summer, spring, fall, and winter cohorts, are scheduled each year
Eligibility: High school students; accepted students typically have an unweighted GPA of 3.3 out of 4.0
Lumiere’s Research Scholar Program is a rigorous opportunity for high school students who want to explore an area or topic of interest in depth. As a participant, you will work one-on-one with a Ph.D.-level mentor on an independent research project. You can choose research topics from a wide range of subjects, including AI/machine learning, data science, and computer science. You will finalize a research question with support from your mentor and also work with a writing coach to present your findings. You can find more details about the application and available program formats here, and check out students’ reviews of the program here and here.
5. HarvardX: Introduction to Data Science with Python
Location: Virtual via edX
Cost: Free to enroll; $299 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about eight weeks
Eligibility: Open to anyone who has some knowledge of programming (preferably in Python) and statistics
This Harvard University and edX course introduces you to data science through Python-based machine learning workflows. You will learn how data scientists use algorithms to make sense of large datasets. You will study regression and classification models that form the basis of many machine learning systems. You will then apply these models using libraries such as Pandas, NumPy, matplotlib, and sklearn. The course allows you to explore techniques for preventing overfitting and evaluating model performance. In the process, you will develop familiarity with core machine learning and artificial intelligence concepts.
6. IBM: Python Basics for Data Science
Location: Virtual via edX
Cost: Free to audit; $99 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about three weeks
Eligibility: Open to all
This IBM course provides a foundational overview of Python programming for data science. You will start by studying beginner-level programming concepts implemented in Python. You will then write Python scripts to practice the coding fundamentals you learned, and also use Python to explore basic data analysis techniques. Throughout the course, you will complete practical exercises in a Jupyter-based lab environment. The experience can help you build familiarity with Python as a data science tool.
7. HarvardX: Data Science: R Basics
Location: Virtual via edX
Cost: Free to audit; $219 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about eight weeks
Eligibility: Open to all
This HarvardX course introduces you to the fundamentals of R programming through applied data analysis. You will work with a real-world dataset on crime in the United States to learn how R is used to answer comparative questions across states. The course covers core R functions, data types, and vector operations to help you build a foundation in programming. You will also learn how to use control structures such as if-else statements and for loops. Additionally, you will apply data wrangling, analysis, and visualization techniques in R and build a foundational skill set that supports later study of advanced data science topics.
8. IBM: Introduction to Data Science
Location: Virtual via edX
Cost: Free to audit; $99 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about six weeks
Eligibility: Open to all
This IBM and edX course presents data science as a field that has developed from early data analysis practices into a modern discipline. You will learn about historical examples of data use, including population tracking and environmental prediction. You will also explore how these practices connect to contemporary data science approaches. The course offers a structured overview of what data science involves in current settings. You will also hear from professionals working in data science to understand how the field is applied. The curriculum is designed to help you develop a clear understanding of the scope and nature of data science today.
9. UCSanDiegoX: Python for Data Science
Location: Virtual via edX
Cost: Free to audit; $350 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about 10 weeks
Eligibility: Advanced learners with prior data science/stats experience
This edX course, offered by UC San Diego, introduces you to Python-based tools used in data science. You will learn how open-source technologies support data import, exploration, analysis, and visualization. You will work with Python, Jupyter notebooks, pandas, NumPy, matplotlib, and git, which make up the course environment. You will apply these tools while solving structured data science problems, and while generating visualizations and shareable reports based on your analysis. You will gain experience working with large datasets using standard data science workflows.
10. UCSanDiegoX: Probability and Statistics in Data Science using Python
Location: Virtual via edX
Cost: Free to audit; $350 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about 10 weeks
Eligibility: Advanced learners with prior data science/stats experience
This course focuses on teaching you probability and statistics within a data science context. You will examine how uncertainty affects the interpretation of real-world datasets. You will also study mathematical foundations, including random variables and statistical dependence. Modules cover concepts such as correlation and regression and how they support data analysis. Information-theoretic ideas, including entropy and minimum description length (MDL), are also part of the curriculum. You will apply these theoretical concepts through hands-on Python-based notebook exercises.
11. Hong Kong University of Science and Technology’s Mathematical Methods for Data Analysis
Location: Virtual via edX
Cost: Free to audit; $439 for a certificate
Application deadline: Not specified
Dates: Self-paced course; commitment of about eight weeks
Eligibility: Open to anyone with prior math and data analysis knowledge
This course focuses on teaching you the key mathematical techniques applicable to data analysis. You will examine how mathematical formulations are used to represent data relationships. You will also study computational methods that leverage inherent data structures. The course additionally focuses on Fourier analysis through examples in signal and image data. You will review selected machine learning algorithms using case-based learning. Throughout the course, you will get to connect mathematical concepts to practical data analysis scenarios.
12. LinuxFoundationX: Ethics in AI and Data Science
Location: Virtual via edX
Cost: Free course materials
Application deadline: Not specified
Dates: Self-paced course; commitment of about six weeks
Eligibility: Open to all
This course focuses on ethical responsibility in AI and data science practices, helping you explore how modern AI systems affect society, institutions, and economic structures. You will examine concerns such as user privacy, data misuse, and surveillance. You will study principles and frameworks designed to introduce accountability and transparency into AI systems. You will also explore how ethics can be operationalized in business and technology initiatives. The course additionally covers practical approaches to resolving ethical dilemmas in data analytics roles.
13. MITx: Introduction to Computational Thinking and Data Science
Location: Virtual via edX
Cost: Free to enroll; $149 for a certificate
Application deadline: Not specified
Dates: Nine-week instructor-led course; various start dates available
Eligibility: Open to all
This MIT instructor-led course introduces you to data science through the principles of computational thinking. Through lecture videos, you will examine how computational structures shape the way data problems are approached. You will also explore how data analysis connects to algorithmic reasoning. The curriculum covers introductory concepts that define data science and focuses on the development of computational thinking while working with data. Programming assignments and quizzes are part of the course to help you track your progress.
14. UMBC: Data Preprocessing for Data Science
Location: Virtual via edX
Cost: Free to audit; $98.99 for a certificate
Application deadline: Not specified
Dates: Four weeks of self-paced learning
Eligibility: Open to all
This course focuses on the essential steps involved in preparing data for data science applications. You will learn how preprocessing techniques improve the usability of raw datasets. You will study data cleaning, transformation, and reduction methods, and gain hands-on experience implementing these techniques in Python. You will also use NumPy and scikit-learn to support preprocessing workflows. In the process, you will develop an understanding of how prepared data contributes to effective machine learning models.
15. IIMBx: Foundations of Data Science
Location: Virtual via edX
Cost: Free course materials
Application deadline: Not specified
Dates: Seven weeks of self-paced learning
Eligibility: Open to all
This course provides you with a structured introduction to data science fundamentals. You will explore probability theory and its application in machine learning algorithms. You will also study use cases such as market basket analysis and recommender systems. The course covers statistical concepts, including estimation, sampling, and the central limit theorem. It also focuses on hypothesis testing techniques used in regression and logistic regression models. You will additionally study optimization methods and linear algebra concepts used in AI and ML workflows.
