code-for-a-living March 2, 2021

Level Up: Mastering statistics with Python – part 4


Welcome back! This is the fourth class in our Level Up series on statistics with Python. If you’re just tuning in, you can catch up on what we’re doing and review the first lesson here.

In this lesson we’ll learn about the Central Limit Theorem (CLT) by simulating it in Python. 

The CLT is the basis for a few common statistical hypothesis tests, like Z-tests and t-tests. While many introductory statistics classes teach the CLT, very few actually attempt to prove it because that requires some complex math. In this session, we’ll bypass all that math by using Python loops to simulate the CLT. This helps build intuition for how hypothesis testing works, while also practicing our Python programming skills and avoiding math-y equations!
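The loop-based simulation described above can be sketched roughly as follows. This is a minimal illustration and not the code from the stream: the exponential population, the sample size of 100, the 2,000 repetitions, and the helper name `simulate_sample_means` are all assumptions chosen for the example. It uses only the standard library so it runs without extra installs.

```python
import random
import statistics

random.seed(42)  # fixed seed so repeated runs give the same numbers

# Hypothetical population: 100,000 values from a right-skewed (exponential)
# distribution with a mean of about 50. Deliberately not bell-shaped.
population = [random.expovariate(1 / 50) for _ in range(100_000)]


def simulate_sample_means(population, sample_size, num_samples):
    """Draw num_samples random samples and return the mean of each one."""
    sample_means = []
    for _ in range(num_samples):
        sample = random.sample(population, sample_size)  # without replacement
        sample_means.append(statistics.mean(sample))
    return sample_means


sample_size = 100
sample_means = simulate_sample_means(population, sample_size, num_samples=2_000)

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# The CLT says the sample means should center on the population mean,
# with a spread close to (population SD) / sqrt(sample size).
print("population mean:     ", round(pop_mean, 2))
print("mean of sample means:", round(mean_of_means, 2))
print("SD of sample means:  ", round(sd_of_means, 2))
print("pop SD / sqrt(n):    ", round(pop_sd / sample_size ** 0.5, 2))
```

Even though the population here is heavily skewed, a histogram of `sample_means` should look roughly bell-shaped. Try changing `sample_size` to see the spread of the sample means shrink or grow.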

This is a fun stream because it is our first step into the world of inferential statistics. We're no longer interested in simply looking at a sample of data by itself; we're now starting to think about how to use a sample to learn about a population we cannot observe.
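To make the sample-versus-population idea concrete, here is a small sketch under stated assumptions: the population of 50,000 normally distributed values is invented for the example, and in practice we would never have access to it. We look at a single sample of 200 and use the CLT's normal approximation to put a rough range around our estimate of the population mean.

```python
import random
import statistics

random.seed(7)  # fixed seed so repeated runs give the same numbers

# Hypothetical population that we pretend we cannot observe in full.
population = [random.gauss(170, 10) for _ in range(50_000)]

# The one sample we actually get to look at.
sample = random.sample(population, 200)

sample_mean = statistics.mean(sample)
# Estimated standard error: sample SD divided by sqrt(sample size).
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

# Rough 95% range for the population mean, relying on the CLT's
# normal approximation (about 1.96 standard errors either side).
lower = sample_mean - 1.96 * standard_error
upper = sample_mean + 1.96 * standard_error

print(f"sample mean: {sample_mean:.2f}")
print(f"rough 95% range for the population mean: ({lower:.2f}, {upper:.2f})")
```

The range comes entirely from the sample; the population list is only there so the example has something to draw from.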

Here are some Stack Overflow questions related to the work we did in today’s session:

Finding distribution of sample mean by Central Limit Theorem

Implementing the Central Limit Theorem – Which Random Number Generator?

For loops in Python

If you enjoyed this lesson, you can catch up on the rest of the series on YouTube. If you’d like to watch a session live, follow the Codecademy YouTube channel.

Every Tuesday from now until March 2nd, we’ll be streaming a new session at 4PM EST. You can set a reminder for the stream for March 2nd here.

Finally, if you want even more stats content, you can sign up for the Master Statistics with Python interactive course this series was based on. This course was developed by Sophie and has many more quizzes, projects, and helpful nuggets that we can’t fit into our streams!
