Data Collection and Analysis

The previous page told you a bit about how to describe a group of numbers. Let's now look at how these numbers should be collected during an experiment. For the results of an experiment to be valid, there are several important steps that must be followed.

## The Sample

To collect data, you need to have something to measure. Your source of data could be people, animals, plants...anything that will provide the data you need. It is essential that the data come from a representative sample of the overall population that you want to describe. For example, if you wanted to report on the "average American", it would be impossible to test every person in the United States. So, instead, you take a "sample" of all of these people. It is important that the sample include people from all over the country, of all ages, male and female, with high and low incomes, and so on. The results of an experiment or survey that sampled only women may differ from the results if men were included.

Because it is usually impossible to test every subject in a whole population, only a small portion (a "sample") of the entire population is tested. For the most accurate estimation of the whole population, it is best if the subjects in an experiment are selected at random. This means that everyone in the population has an equal chance of being in the experiment. In this way the results from the sample group are used to estimate what would happen within the whole population. Example populations that researchers might sample include: middle school students, parents, cats, or computer users.
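Random selection can be sketched in a few lines of Python. This is only an illustration: the population of numbered students and the helper name `draw_random_sample` are made up for the example, and the seed is fixed only so the result is repeatable.

```python
import random

def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size members so that each one has an equal chance."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: 1,000 middle school students identified by number.
population = [f"student_{i}" for i in range(1, 1001)]

sample = draw_random_sample(population, 50, seed=1)
print(len(sample), "students chosen, for example:", sample[:3])
```

Because `random.sample` draws without replacement, no student can be picked twice.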

It is also important that there are enough subjects in the sample group to make meaningful statements about the results. Do you think that an average coming from 3 people is better or worse than an average coming from 100 people? The accuracy of the sample data is dependent on the size of the sample: in general, a larger sample group will provide a more accurate "picture" of the population.
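One way to explore the question about 3 people versus 100 people is a small simulation. The sketch below invents a population of 10,000 heights (the numbers are made up, not real measurements), draws many random samples of each size, and measures how far the sample average typically lands from the true population average.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical population: 10,000 made-up heights in centimeters.
population = [rng.gauss(165, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

def typical_error(sample_size, trials=500):
    """Average distance between a sample mean and the population mean."""
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)

error_small = typical_error(3)
error_large = typical_error(100)
print(f"typical error with 3 people:   {error_small:.2f} cm")
print(f"typical error with 100 people: {error_large:.2f} cm")
```

Under these assumptions the average from 100 people should land much closer to the true value than the average from 3 people.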

## "Blind" Testing

Both the people in an experiment and the researchers may consciously or unconsciously influence ("bias") the results when they know too much about the subjects in their groups. People may try to please the researcher with a particular type of response if they know what treatment they have received. A researcher may unconsciously treat subjects differently if he or she knows which treatment a subject has received.

What would happen if a researcher knew which people received a drug for pain and which people got a fake pill filled only with sugar? It is possible that the researcher may influence the subjects to respond in different ways. Perhaps the researcher would treat the subjects differently knowing what treatment each subject received. If a person knew he or she was only getting a sugar pill and not a pain killer, that person might think, "Hey, of course this will not get rid of my pain." On the other hand, if the person knew he or she was receiving a drug, that person might think, "Hey, the Doc is giving me a drug that is sure to cure me." Knowing what "should happen" may change the results of the experiment.

To guard against this possibility, it is important that experiments be performed "blind". This means that the subject will not know the treatment that he or she is receiving. A "double blind" experiment is one in which neither the researcher nor the subject knows what group the subject is in. In this way, the subjects and researchers cannot influence the results because they do not have any expectations about how each subject should perform. When all of the data have been collected, the researchers and subjects can be told which group each subject was in.
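A double blind assignment can be sketched as follows. This is one possible approach, not a standard procedure: the function name `assign_blinded` and the code letters "A" and "B" are assumptions made for the example. Subjects are split at random into two coded groups, and the key that says which letter means "drug" is kept separate until the data are in.

```python
import random

def assign_blinded(subject_ids, seed=None):
    """Randomly split subjects into two groups hidden behind code letters.

    Returns (schedule, key). The schedule shows only "A" or "B" and is what
    researchers and subjects work from. The key says which letter is the
    drug and stays sealed until all data have been collected.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    schedule = {sid: "A" for sid in shuffled[:half]}
    schedule.update({sid: "B" for sid in shuffled[half:]})

    treatments = ["drug", "placebo"]
    rng.shuffle(treatments)  # even the letter-to-treatment mapping is random
    key = {"A": treatments[0], "B": treatments[1]}
    return schedule, key

schedule, key = assign_blinded(range(1, 21), seed=7)
print(schedule)  # visible during the experiment
# print(key)     # opened only after data collection ends
```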

Let's go back to that "sugar pill" for a minute. It is possible that just the thought of getting a real treatment can create the same effect as a real treatment. This is called the "placebo effect". A placebo is a drug or treatment that really has no "active ingredient." It is important to have some subjects in every experiment receive the placebo treatment. This allows the researcher to separate the "real" effects of a drug or treatment from the effects of merely being in the experiment. Also, some illnesses will cure themselves. For example, the common cold will get better in about 7-10 days without any treatment. A placebo treatment will allow a researcher to measure this spontaneous recovery. Sometimes placebos can have very strong effects, but no one is really sure how placebos work.
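The idea of separating the "real" effect from the placebo effect comes down to a subtraction. The sketch below uses made-up improvement scores for two small groups. A real study would also need a statistical test to decide whether the difference could be due to chance.

```python
import statistics

# Made-up pain-relief scores (points of improvement on a 0-10 scale).
drug_group = [4, 5, 3, 6, 4, 5]
placebo_group = [2, 1, 3, 2, 2, 2]

# Improvement in the drug group includes the drug's effect plus the
# placebo effect and any spontaneous recovery.
drug_mean = statistics.mean(drug_group)

# Improvement in the placebo group reflects only the placebo effect
# and spontaneous recovery.
placebo_mean = statistics.mean(placebo_group)

# What is left over can be credited to the drug itself.
drug_effect = drug_mean - placebo_mean
print("average improvement with drug:   ", drug_mean)
print("average improvement with placebo:", placebo_mean)
print("estimated effect of the drug:    ", drug_effect)
```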

## Measurement

Experiments need data. To get data, a researcher must measure something. Measurements come in many different varieties. For example, it is possible to measure time, weight, length, number of responses, height, pleasantness and brightness. The way numbers represent a particular measurement is called the "scale" of measurement. Four common scales are the nominal, ordinal, interval and ratio scales.


Nominal Scale

A nominal scale classifies data according to a category only. For example, an experiment may examine which color people select. No assumptions are made that any color has more or less value than any other color. Colors differ qualitatively from one another, but they do not differ quantitatively. A number could be assigned to each color, but it would not have any value. The number serves only to identify the color.
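As a small illustration with made-up responses, the sketch below treats the numbers purely as labels: the only sensible summaries are counts and the most common category.

```python
from collections import Counter

# Arbitrary code numbers for each color; they identify, they do not measure.
color_codes = {1: "red", 2: "green", 3: "blue"}
choices = [1, 3, 3, 2, 3, 1, 3]  # made-up selections from seven people

counts = Counter(color_codes[c] for c in choices)
most_common_color, times_chosen = counts.most_common(1)[0]
print(counts)
print("most popular color:", most_common_color)

# Averaging the code numbers would run without error but mean nothing:
# the answer would change if the same colors were numbered differently.
```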

Ordinal Scale

An ordinal scale classifies data according to rank. With ordinal data, it is fair to say that one response is greater or less than another. For example, if people were asked to rate the hotness of three chili peppers, a scale of "hot", "hotter" and "hottest" could be used. Values of "1" for "hot", "2" for "hotter" and "3" for "hottest" could be assigned. However, and this is important, you cannot say that the difference between the hot pepper and the hotter pepper is the same as the difference between the hotter pepper and the hottest pepper. It may be that you can eat a hot pepper without feeling any pain. You may also be able to eat the hotter pepper, but your mouth just tingles a bit. However, the hottest pepper is really, really hot...so hot your whole mouth burns.
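With ranks, order-based summaries such as the median are reasonable, while statements about the size of the gaps are not. A short sketch using made-up ratings on the chili scale:

```python
import statistics

# Ranks from the chili example: 1 = hot, 2 = hotter, 3 = hottest.
rank_names = {1: "hot", 2: "hotter", 3: "hottest"}
ratings = [1, 3, 2, 3, 3, 2, 1]  # made-up ratings from seven tasters

# Order is meaningful, so the middle (median) rating makes sense.
middle_rating = statistics.median(ratings)
print("median rating:", rank_names[middle_rating])

# The gaps between ranks are not known to be equal, so the data do not
# support a claim such as "hottest is three times as hot as hot".
```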

Interval Scale

An interval scale assumes that the measurements are made in equal units. However, an interval scale does not have to have a true zero. Good examples of interval scales are the Fahrenheit and Celsius temperature scales. A temperature of "zero" does not mean that there is no temperature...it is just an arbitrary zero point.
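One consequence of an arbitrary zero is that differences are meaningful but ratios are not. The sketch below, using three example temperatures, shows that equal steps in Celsius remain equal steps in Fahrenheit, while "twice as hot" in Celsius is no longer "twice" after converting.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion: multiply by 9/5 and add 32."""
    return celsius * 9 / 5 + 32

morning_c, noon_c, afternoon_c = 10.0, 20.0, 30.0
morning_f = celsius_to_fahrenheit(morning_c)
noon_f = celsius_to_fahrenheit(noon_c)
afternoon_f = celsius_to_fahrenheit(afternoon_c)

# Equal steps in Celsius stay equal steps in Fahrenheit.
step1_f = noon_f - morning_f
step2_f = afternoon_f - noon_f
print("Fahrenheit steps:", step1_f, step2_f)

# Ratios depend on where zero happens to be, so they do not carry over.
ratio_c = noon_c / morning_c
ratio_f = noon_f / morning_f
print("ratio in Celsius:   ", ratio_c)
print("ratio in Fahrenheit:", ratio_f)
```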

Ratio Scale

Ratio scales are similar to interval scales in that measurements are made in equal units. Unlike the interval scale, however, the ratio scale has a true zero value, so you can compare numbers as ratios as well as differences. For example, if you measured the time it takes 3 people to run a race, their times may be 10 seconds (Racer A), 15 seconds (Racer B) and 20 seconds (Racer C). You can say accurately that it took Racer C twice as long as Racer A.
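Because zero seconds really means "no time", a ratio of race times stays the same no matter which unit is used. A quick check with the three racers, converting seconds to milliseconds:

```python
times_seconds = {"Racer A": 10, "Racer B": 15, "Racer C": 20}

# The same times expressed in milliseconds.
times_ms = {name: t * 1000 for name, t in times_seconds.items()}

ratio_seconds = times_seconds["Racer C"] / times_seconds["Racer A"]
ratio_ms = times_ms["Racer C"] / times_ms["Racer A"]

print("Racer C / Racer A in seconds:     ", ratio_seconds)
print("Racer C / Racer A in milliseconds:", ratio_ms)
```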

Did you know? The word "placebo" comes from a Latin word that means "I will please."