
Basic Terms

Population The set of all people, items, events, objects, etc. that are of interest.
Examples: everyone in the US, the results of 10000 coin tosses, every car built in 2010, etc.
Sample A subset (smaller set) of the population. This is the part of the population for which data is gathered.
Examples: every thousandth person in the US, the results of 100 coin tosses, etc.
Population vs Sample The population refers to all the individuals or events of interest. However, it is usually too difficult, time-consuming, or expensive to collect data on an entire population, such as everyone in the United States or every car built in 2010. So a smaller subset of the population, called the sample, is chosen on which to gather data. The sample is assumed to be representative of the population. One way to help ensure this is to choose a simple random sample.
Sampling with Replacement After choosing an element from a population, the element is placed back into the population and can be drawn again.
Sampling without Replacement After choosing an element from a population, it is withheld from the population and cannot be drawn again.
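The difference between the two sampling schemes can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the population of ten numbered elements, the sample sizes, and the fixed seed are all assumptions made for the example. It uses `random.choices` for drawing with replacement and `random.sample` for drawing without replacement.

```python
import random

rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
population = list(range(1, 11))  # hypothetical population: elements labeled 1..10

# With replacement: each drawn element goes back, so repeats are possible
# and the sample can even be larger than the population.
with_replacement = rng.choices(population, k=15)

# Without replacement: a drawn element is withheld, so it appears at most once.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```

Because 15 draws are taken from only 10 elements, the first sample must contain at least one repeat, while the second never can.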
Simple Random Sample A sample chosen so that each element in the population has the same chance of being selected.
Note that random does not mean chaotic or without order, but that each element has the same chance of being drawn. The elements in the sample need to be chosen independently. That means that drawing an element has no effect on which element is chosen next.
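The "same chance" property can be checked with a small simulation. The sketch below assumes a hypothetical population of five labeled elements and repeatedly draws samples of size 2 with `random.sample` (drawing without replacement, which is one common convention). Over many repetitions, each element should appear in roughly the same share of samples, here about 2/5.

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed (an assumption) for repeatability
population = ["A", "B", "C", "D", "E"]  # hypothetical elements
sample_size = 2
trials = 10_000

# Count how often each element lands in a sample.
counts = Counter()
for _ in range(trials):
    counts.update(rng.sample(population, k=sample_size))

# Each element's expected share is sample_size / len(population) = 0.4.
shares = {element: counts[element] / trials for element in population}
print(shares)
```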
Element One of the individuals, objects, etc. in the population or sample.
Variable A characteristic of interest of the population or sample. This is what is measured.
For example, the height of each person in the US, the number of heads in 100 coin tosses, etc.
Discrete Variable A quantitative variable that can assume a countable number of values. There is a gap between any two possible values.
An example is how many states a person can live in during his life. He could have lived in 1, 2, 3, 4, 5, 6, ..., 49, or 50 states. He couldn’t have lived in 3.68797869 states. Notice that there is a difference of at least 1 between any two values.
Figure skating judges give fractional scores of 9.9, 7.5, etc. Yet the scores are still discrete because possible scores differ in increments of 0.1. No one will get a score such as 9.8473.
Continuous Variable A quantitative variable that can assume an uncountable number of values. A continuous variable can assume any value in an interval.
An example is how much gas can be used to fill a car. It could take 13, 13.423425, 13.746456 gallons, or any fractional amount to fill the tank.
Discrete vs Continuous A variable could be either discrete or continuous depending on how the variable is measured.
When a person is asked his age, it is typical to give an answer in terms of years. If a person turned 25 two months ago, he would simply say he is 25 years old. With this reasoning, age would be a discrete variable.
However, he isn’t exactly 25 years old. He is 25 years and 2 months old. Of course, he could also measure his age in weeks, days, minutes, seconds, or nanoseconds. In this context, age would be a continuous variable.
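The age example can be made concrete with a short Python sketch. The birth date and the "today" date are hypothetical values chosen so that the person turned 25 two months ago, and dividing by 365.25 days per year is an approximation used only for illustration.

```python
from datetime import date

birth = date(2000, 1, 15)   # hypothetical birth date
today = date(2025, 3, 15)   # hypothetical "today", two months after the 25th birthday

# Discrete measurement: count completed whole years only.
had_birthday = (today.month, today.day) >= (birth.month, birth.day)
age_years = today.year - birth.year - (0 if had_birthday else 1)

# Finer measurement: elapsed days over an average year length (an approximation).
age_fractional = (today - birth).days / 365.25

print(age_years)                   # 25
print(round(age_fractional, 3))    # a value between 25 and 26
```

The same person yields a whole number under the first measurement and a fractional value under the second, which is the distinction the text draws.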
Data The set of values collected for the variables.
Observation The measurement for a specific element or one data value.
Parameter A numerical value summarizing all the data of the population.
Symbols for parameters are usually Greek letters such as μ, σ, or τ.
Examples: If the population is every person in the US, a parameter could be the mean of the height of everyone in the US or the standard deviation of the height of everyone in the US.
Statistic A numerical value summarizing all the data of the sample.
Symbols for statistics are usually letters from the English alphabet such as x, s, or T.
For example, if the sample is 1000 randomly chosen people in the US, a statistic could be the mean of the height of the 1000 randomly chosen people.
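The contrast between a parameter and a statistic can be sketched as follows. The population here is synthetic (100,000 made-up heights in centimeters, not real measurements), and the names `mu`, `sigma`, `x_bar`, and `s` are chosen for the example. The parameters are computed from every element of the population, while the statistics are computed from a random sample of 1000.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed (an assumption) for repeatability

# Hypothetical population: 100,000 synthetic heights in cm.
population = [rng.gauss(170, 10) for _ in range(100_000)]

# Parameters summarize the whole population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics summarize only the sample.
sample = rng.sample(population, k=1000)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)

print(f"mu={mu:.2f} sigma={sigma:.2f}")
print(f"x_bar={x_bar:.2f} s={s:.2f}")
```

In practice the parameters are usually unknown, and the statistics serve as estimates of them. Here both are computable only because the population is simulated.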
Random Variable A variable that assigns a numerical value to each of the outcomes of an experiment. It is called a random variable because it represents the unknown outcome of an experiment. The outcomes are uncertain until the experiment is actually carried out. Examples:
A random variable could assign a 1 to every head result in a coin toss and a 0 to every tail.
A random variable could assign the number 1 to every blond, 2 to every brunette, etc.
A random variable could assign the number 23 to a 23 year old, 50 to a 50 year old, etc.
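The coin-toss example above can be written as a small Python sketch, treating the random variable as a function from outcomes to numbers. The function name `X`, the outcome labels `"H"` and `"T"`, and the 100 simulated tosses are assumptions made for illustration.

```python
import random

rng = random.Random(7)  # fixed seed (an assumption) for repeatability

def X(outcome):
    """Hypothetical random variable: 1 for a head, 0 for a tail."""
    return {"H": 1, "T": 0}[outcome]

# Carry out the experiment: 100 simulated coin tosses.
outcomes = [rng.choice("HT") for _ in range(100)]

# Apply the random variable to each outcome.
values = [X(o) for o in outcomes]

# With this coding, the sum of the values is the number of heads.
number_of_heads = sum(values)
print(number_of_heads)
```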