Hi. It’s Mr. Andersen and in

this video I’m going to talk about statistics for science. And the reason why statistics

is important in science is that science depends on data. Statistics is basically collecting, organizing,

analyzing, interpreting and then presenting data so that people can use it. And so we

don’t usually do a lot of statistics in high school science and that’s too bad because

it’s very important we understand what to do with data once we’ve collected it. And

there’s a big push right now to improve statistics knowledge of high school students. And the

reason why is that they become college students and they’re eventually going to have to start

working with what’s called Big Data. And so what is Big Data? Well, the days of a scientist

just sitting alone by themselves collecting data are gone. Most science now is done by

huge teams or groups or centers. And a lot of it is crowd sourced and we’re generating

so much data right now that we actually have to go through that. So what’s an example?

Well meteorology or the study of the weather and climate on our planet, we just get more

and more data, better data, but we have to pore through this to make models and make

predictions. Or genomics, which is the sequencing of genomes. So looking at the actual letters

in the nucleotides in DNA and RNA. And we sequenced the human genome but now we’re sequencing

all these different organisms and so all that data is pouring in and we have to look through

it. Or this is a term I hadn’t even heard of before, connectomics, which is basically

using intense magnetic resonance imaging to look at neurons. Looking at the brain. And

then modeling that, using computers to model individual neurons and then grow that into

like a virtual brain. So you can see over the next 20 or 30 years we’re going to need

scientists who understand what to do with big data. And to understand big data, let’s

start by going through kind of the basics of statistics. And so when you’re dealing

with statistics, one big thing when we’re talking broadly is the idea of what a population

is. And so a population is big. And so a population is going to be everything. So it could be

like all of the students in a class. So that could be a population. But it can get much

bigger than that. And so when we’re studying the population, not to be confused with like

a population that we study in ecology, all of the characteristics of

that population are going to be called parameters. And so an example of one that we’ll actually use

in science is N. That’s the population size. But I said that the population is everything.

And so it could also be like all of the stars in the universe. Or it could be all of the

planets in the universe. Or it could be not only one scientific experiment but an infinite

number of scientific experiments that you could do. And so it really is everything when

we’re talking about the population. And so if we go back to an example of a population,

well in science what we can do is take a sample of that. So this is the population and then

this is a sample of the population. And as we move from the population, where we study parameters,

to the sample, we have what are called statistics. And so statistics are going

to be characteristics of a sample. And hopefully that’s a random sample. And so a question,

a really good question at this point might be, which is more important? Is the population

important or is the sample important? In other words, which one do we use more? And I used

to think, you know, the population has to be the most important thing. We want to know

everything. We want to know all the outcomes. We want to know what the universe looks like

and in fact it’s the wrong answer. The right answer and the most important thing is the

sample, because you can never know everything, but you can know a sample of that. And if

you have a good understanding of the statistics, you can make predictions about everything.

Predictions about the population. And so everything I’m going to talk about, I’m talking about

the sample because that’s what scientists do. We can’t do every conceivable experiment.

We can’t gather every conceivable piece of data. We just have to work with what’s called

the sample and make sense of that. So let me give you an example of this. This right

here is, I remember reading there was a survey and they asked scientists like what’s the

greatest scientific discovery of the last 100 years. So from 1900-2000. And I thought

maybe it was going to be Einstein, relativity, or quantum physics or all of those things.

Actually the right answer, or the winner we’ll say was this guy. His name is Edwin Hubble.

And you’ve probably heard of his name because they named the Hubble space telescope after

him. But you might not know what he did. And so he sat here at the Mt. Wilson observatory

and he looked at galaxies in the universe. And what he found is that no matter where

he looked in the universe, their light seemed to be shifted towards the red. So they were more

red in color. What does that tell us? Well, as objects move away from us, their light gets red-shifted.

And so it told him that all of these galaxies were moving away from us. In other words,

everything in the universe is moving away. And you can see that he just plotted that

in a nice little scatter plot and then we have a line of fit. And so did he measure

all of the galaxies in the universe? No. But he sampled, or he had a sample set of those.

And from that we can make predictions and what’s the prediction that we make based on

this? It’s the idea of an expanding universe. And the idea that, since everything is

expanding, everything was together at one point. And so this is the Big Bang

theory, the idea that all of the universe began at one singularity. And so let’s get

to some statistics. Let’s actually get to some numbers of the sample. And so let’s go

through these. The first one is going to be the sample size. That’s going to be the number

of observations that you make. So that could be the number in your sample group. In your

random sample. Next we have what’s called x-bar, or the mean. The mean and the average

are going to be the exact same thing. So if you know what an average is and how to figure

it out, that’s going to be the mean. Next is the median. The median is simply going to be

the midpoint of all of our data points. And then finally we have the range. And so this

is a sample set over here. So let’s say in the science lab this is some data that you

collect. And so could you figure out these four things: sample size, median, mean and

range? Well let me walk you through it. So the first thing we could do is the sample

size. And so sample size or n, get used to that letter n right here, sample size is just

going to be the number of observations that we made. And so in this set we have 1, 2, 3,

4, 5, 6, 7. And so our n value is going to be 7. Let’s go to the next one. What’s the

mean or what’s the average? Well to figure that out all you do is add up all of these

quantities and you’re going to divide it by the number of quantities. And so if I add

all these up together I get 35. If I divide that by 7, which is my

sample size, I’m going to get a mean or average of 5. How do you do the median? Or how do

you find the midpoint? Well, you have to line them up in order. So when I line it up in

order basically what I can do is I can cross it out from the sides. So I’ll cross one out

from the sides. I’ll cross another one out from the side and then we have the midpoint

which is right here. So the median and the mean in this case are both going to equal 5. But

you might think to yourself, what do I do if the number of values is even instead of odd? In other

words, what do I do here? Well I could knock off 2 from each side and let me knock off

another one from each side and now I have 5 and 6. So if this is our sample set, then

our median is going to be the average of 5 and 6, so the median is going to be 5.5. Let’s

get to the range then. What’s the range? The range is going to be the difference between

the extremes. And so this is the number 2 and this is the number 13. So if this is my low

and this is my high, then my range is simply going to be 13 minus 2, or 11, in this case. And so what are

these? These are all simple statistics that we can gather from a sample set. And again,

it’s a random sample from everything from this big population. Last thing I want to

leave you with is an idea that is sometimes confusing to students, and that’s called

degrees of freedom. And we refer to that as n minus 1. And so what is n? Well, n remember

is going to be the sample size. And where does the freedom come from? Well I drew a, I drew

a flag right here to help you remember that. So what does it mean? What is a degree of

freedom? Well think of it like this. This is the best way to understand it. So imagine

I have these three numbers and they are going to add up to ten. And so this is A + B + C

equals 10. And I say choose a random number. And let’s make it easy by just choosing a

whole number. Well you might say that this is 3. And so I’m going to choose this to be

3. And did you notice I had total freedom? I had a freedom in my choice as to what number

I was going to choose to represent A. I had total freedom here. That was fun. Let me get

a little more freedom. So let’s say I’ve got to choose the next number B. I want to go

crazy. Maybe I want the next one to be 13. I could choose any number in the world. In

other words I have freedom to choose what that is. And so now this is fun. I have a

lot of freedom. So let’s go to the last one then. So we’re going to say that this plus

this plus this equals 10. So I’ve got a constraint here. It’s got to equal 10. Well now we’ve

got to choose C. Well what can C be? Well all of a sudden I’ve lost my freedom here.
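As a rough illustration of the constraint just described, here is a minimal Python sketch. The function name `forced_last_value` is made up for this example; the values 3 and 13 and the total of 10 are the ones used in the video.

```python
def forced_last_value(free_choices, total):
    """Return the one value that is left once the others are chosen.

    free_choices: the values picked freely (A and B in the example).
    total: the sum that all of the values must add up to.
    """
    return total - sum(free_choices)


# A and B are free choices; C is forced by the constraint A + B + C = 10.
a, b = 3, 13
c = forced_last_value([a, b], total=10)

n = 3                       # number of values in the set
degrees_of_freedom = n - 1  # only two of the three could be chosen freely

print(c)                    # -6
print(degrees_of_freedom)   # 2
```

Whatever A and B are, the last value is never a choice, which is why the count of free choices is n minus 1.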

In other words if this is 3, this is 13, this has to be negative 6 if I want this to be

10 because this is 16 minus 6. That’s got to be 10. And so all of a sudden I lost my

freedom. And so when we’re talking about degrees of freedom how do you figure that out? Well

you take the number of values in your data set, in this case it’s going to be 3, and you’re going to

subtract one from that. And so in this case I have 2 degrees of freedom: there were

two numbers for which I had a choice as to what I was going to choose. And so this will

be important in a couple of different ways. Number 1 when we’re figuring out standard

deviation using n minus 1, or degrees of freedom, we’re going to get a less biased estimate of the spread

in the whole population. And so you’ll see this again when we calculate standard deviation.
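To make the n minus 1 idea concrete, here is a small Python sketch using the standard library `statistics` module. The data values are hypothetical, made up only for illustration. `statistics.stdev` divides by n minus 1, while `statistics.pstdev` divides by n.

```python
import math
import statistics

# Hypothetical sample, made up for illustration (not the data from the video).
sample = [2, 2, 3, 5, 5, 5, 13]

n = len(sample)
mean = statistics.mean(sample)

# Sum of squared deviations from the sample mean.
squared_deviations = sum((x - mean) ** 2 for x in sample)

# Dividing by n - 1 (the degrees of freedom) gives the sample standard deviation.
sd_n_minus_1 = math.sqrt(squared_deviations / (n - 1))

# Dividing by n treats the data as if it were the whole population.
sd_n = math.sqrt(squared_deviations / n)

# The standard library implements both versions.
assert math.isclose(sd_n_minus_1, statistics.stdev(sample))
assert math.isclose(sd_n, statistics.pstdev(sample))

print(sd_n_minus_1 > sd_n)  # True here: dividing by the smaller n - 1 gives the larger value
```

With only 7 values the two results differ noticeably; as n grows, the gap between them shrinks.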

And then when we start comparing data sets, when we do a chi-squared test, it’s important

that you understand what a degree of freedom is. So if we have two different groups then

we’d only have 1 degree of freedom. Or if we have eight different groups or eight different

choices then we have seven degrees of freedom. And so those are all statistics. Again they’re

parts of the sample set, which is part of everything, and they allow us to give meaning to math.

And what I mean is, I learned so much math in high school, especially in Algebra 2, but

I didn’t always know like when am I going to use this? Statistics is something I promise

you that you are going to use. If you move on to college and hopefully get some kind

of advanced degree or find an awesome job, statistics will come back and it will find

you at some point so you might as well learn it now. So this is an intro on statistics

and I hope that was helpful.
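The four sample statistics worked through in the video (sample size, mean, median and range) can be sketched in Python as follows. The data values themselves are not given in the transcript, so the two lists below are hypothetical, chosen only so that they reproduce the numbers mentioned: n = 7, a sum of 35, a mean and median of 5, a low of 2 and a high of 13, and a median of 5.5 for the even-sized set. The helper name `describe` is also just an assumption for this sketch.

```python
import statistics

# Hypothetical data: the values are assumptions picked to match the
# results quoted in the transcript, not the actual numbers from the video.
odd_sample = [5, 2, 13, 3, 5, 2, 5]
even_sample = [9, 2, 6, 13, 4, 5, 3, 8]


def describe(sample):
    """Return sample size, mean, median and range for a list of numbers."""
    ordered = sorted(sample)                # line the values up in order
    return {
        "n": len(sample),                   # sample size
        "mean": sum(sample) / len(sample),  # add everything up, divide by n
        "median": statistics.median(ordered),
        "range": ordered[-1] - ordered[0],  # high minus low
    }


print(describe(odd_sample))             # {'n': 7, 'mean': 5.0, 'median': 5, 'range': 11}
print(describe(even_sample)["median"])  # 5.5
```

For an even number of values, `statistics.median` averages the two middle values, which matches the "knock off from each side" approach described in the video.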

Cool! great work!

Good work again Paul. Your videos always help to de-mystify complex phenomena.

What is the purpose of a degree of freedom?

It improves the precision of standard deviation and will become very important as we start to compare data sets using tests like chi-squared.

When I'm looking to see if there is significant difference between two groups , I can use a chi-square, or a (1 or 2 tailed) t-test. I understand how to do them, but I'm not sure which to use depending on the circumstances. Any tips?

BTW- I'm a graduate student in biology, and teach undergrad labs. Your videos really help me prep for when I teach my students. Thanks!

Hello teacher,

I'm a medical student from Damascus University, and I would like to thank you for this video.

Today we were taking a lecture in medical statistics, and our professor was talking about the degrees of freedom. She confused us while she was talking, but I understood everything she said, and my friends didn't. Do you know why?

Because I watched this video. You made the hard things simple and easy, esp. your example about the degrees of freedom, A + B + C = 10

Thank you.

I like your channel.

What does it mean when people refer to 'overlap' in regard to standard deviation and/or standard error? Also, what are true means and sample means? Thank you very much!

yOU'RE THE BOSS MAN

BIG DATA

His video (including some other) helped me pass the CLEP Biology, Thanks

Great video as usual ! 🙂

Do you have a video for student's t test? if so can you give me the link please and if not can you make a video ASAP please???

thanks!!

The degrees of freedom explanation was great, why didn't my college stats teacher explain it like that? He just said memorize this stuff and then tested us on it.

Excellent, love the way you explained this. Thanks. Especially degrees of freedom.

An ANOVA video would be wonderful

Great explanation and video of course!

even though i think statistics is one of the worst math classes (i love math like algebra and physics), you made seem clear about the concept of statistics.

Great videos, but sometimes odd – half the video is explaining that sample<population

can you do a video on ttests please. Thanks.

Man I love you..SUBSCRIBE

Fast becoming my 'go-too-guy' for all my interests in science! Keep it up, loads of us are loving it!

best explanation of degree of freedom ever!

best explanation ..Thank you

I'm being forced to watch this .Thx alot

Another brilliant and educating video. I thoroughly enjoyed it.

10:55 it will find you. and IT WILL HUNT YOU DOWN AND KILL YOU

I almost cried , because finally i know that i am not stupid , you clear most of the questions i had flying in my head with all this language and terminology of statistics language that i needed to understand , i just found your video and i,'ll watch all of them

great video

Great video!! Thanks for the explanation!

You good Mr anderson

Yr teaching is amazing

thanks teacher, please can anyone guide me to a website where i can find some raw spreadsheets on biology surveys in order to practice some statistic tests

good clips. Don't stop making and continue with rest of stats topics… please

You explain statistics clearly !!! Thank you …

I enjected mariweed and I dieded

This is awesome. Thanks so much.

First time knowing what degrees of freedom actually are.

this man is amazing. i learn a lot fro his videos. thanks sir

pangu not new

Why there is no video on null hypothesis???pls

lol anyone here from AP Biology

I love you life sciences presentations. I didn't like your statistics presentations, and I have an idea why. It's not a good idea to combine concepts, computations and software in a single presentation, UNLESS you highlight one of these three and downplay the other two. For instance, emphasize the key concepts of probability and statistics that are critical to measurement data. You did this with mean, stdev and stderr. But downplay the math by only briefly showing/naming the formulas and avoiding the laborious crank-n-grind calculations. Also, downplay the software by showing excel, but already having all the data and macros set up and running properly. I like your good presentations, because you see (and describe) the forest, but you are also very knowledgeable about the trees (without getting lost in them). You are a great explainer of concepts, but then you show how those concepts are applied in practice. Many instructors are strong with crank-n-grind, but they don't do such a good job of linking the procedures to anything understandable or relevant. I prefer your approach, since you take the time to explain what the thing is and why it is important and how it gets used. I would also suggest that you avoid the mechanics of using software completely. Most students will learn to use Excel on their own (and from one another). Don't waste your time teaching that stuff. Also, to teach probability and statistics properly will require a LOT of videos. A lot of other people have already done that. Concentrate on applying statistics (and stderr) to scientific data.

Hii this is awesome! I'm taking Biology and I'm so confuseeee with all the date. Thank you so much for doing this.

My like statistic…. very mutch

Really helpful.. thank u sir.

Great … Easy is better

Great explanation – thank you.

Today's science is mainly focus on generating data…. Some time in some country it's only generating data not more than this.

Good videos. But please expand the statistics section more

Thanks a lot!

Neat

Cool

Thank you so much

What is meant by objects get "red-shifted" as they move away?

Great brother

It’s lazy science. Statistics change over a population in a given location over a given time period. But big scientific consensus comes often, and it’s frequently from a study that was done from somewhere else other than the inquiry. You can’t have a scientific law that’s got a million variables that may change over miles or social classes or time periods, so basically anything. Even a basic group you could call normal people could cause a different outcome on 50% of outcomes of many experiments 10 years from then because knowledge and about 1000 other basic intelligence related categories change in such a short period of time. We are in such a great change in knowledge and teachings because of government pressure on education. People are getting so much more intelligent, even though they are taking more risks because you know, human, that I remember being so much more advanced than my parents in 11th grade at every possible subject. Able to analyze and see an easier and more probable outcome. Now at age 22, my brothers and sisters who are in 8th and 10th grade are doing things with math, science and literature that I didn’t fathom until my last year in high school or second year in college. The education is at such an uptick and science uses data to try to predict our future but you’d need an algorithm for an algorithm to actually achieve anything when you consider how fast our society is becoming an amazing behemoth of intelligent being. Like alien brainiacs from the movies. It’s amazing.

AP Bio Grind Time

5:35 all the important stuff was here, at least for my note packet.