Slide 1
LIS 570
Bivariate analysis (continued)
Slide 2
Summary
Prediction
regression analysis
comparing means
Statistical inference
significance
Slide 3
Regression
Slide 4
Regression
Slide 5
Regression equation
Slide 6
Regression
Prediction using regression is
most secure when the independent variable x takes a value within the range
of the x values in your data
not about cause and effect
extrapolation
Using the regression equation
for prediction outside the range of the original data
less secure
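A minimal sketch of the point above, assuming the usual straight-line form y = a + b*x (the slides do not spell the equation out). The data and the helper names fit_line and predict are made up for illustration; the sketch fits a least-squares line and flags any prediction that falls outside the observed x range.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict(x, xs, intercept, slope):
    """Return (predicted y, True if x lies outside the observed x range)."""
    extrapolated = x < min(xs) or x > max(xs)
    return intercept + slope * x, extrapolated

# Made-up illustrative data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
a, b = fit_line(xs, ys)

inside = predict(3.5, xs, a, b)   # x within the data range
outside = predict(9, xs, a, b)    # x beyond the data range: extrapolation
```

The flag does not make the extrapolated prediction wrong; it only marks it as less secure.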
Slide 7
R²:
the Coefficient of Determination
Link between correlation and
regression
tells us the proportion of the
variance of one variable that can be explained by straight line dependence
on the other variable
How much can we rely on the regression estimates?
Slide 8
R²:
the Coefficient of Determination
.89² = .79
79% of the variance in first
year uni marks can be accounted for by the variance in the sample’s SAT
scores
21% of the variance in first
year marks is accounted for by other unknown variables
Eg 2. The correlation between
length of car and mpg/l is -.7
Interpret in terms of r²
r² is the percent of variance in the Y scores that is associated with the variance in the X scores.
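A short sketch of the arithmetic on this slide. The two correlations (.89 and -.7) come from the slide; the small data set and the helper names pearson_r and r_squared_from_fit are assumptions for illustration. The second half checks, on made-up data, that squaring r gives the same number as the share of variance explained by the fitted line.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def r_squared_from_fit(xs, ys):
    """1 - SS_residual / SS_total for the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Correlations quoted on the slide.
sat_r2 = 0.89 ** 2      # about .79: SAT scores and first-year marks
car_r2 = (-0.7) ** 2    # about .49: the sign of r drops out when squared

# Made-up data to show the link between correlation and regression.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
fit_r2 = r_squared_from_fit(xs, ys)   # matches r ** 2 up to rounding
```

For Eg 2 this gives about 49% of the variance in mpg/l associated with car length, leaving about 51% to other variables.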
Slide 9
Regression
Use when...
1.
both the variables are interval
2.
for prediction about the scores of individual cases or groups
3.
to measure the amount of impact or change that one variable produces in
another
Slide 10
Comparison of means
Focus on comparison of data distributions
Slide 11
Comparison of means
Appropriate when...
Dependent variable is interval
independent variable has few
categories (2 or 3)
initial analysis
look for patterns then use tables
Slide 12
Statistical
significance
“Real” or “Chance”?
Significance
judgements that are made
according to agreed on mathematical rules of probability
used to generalize observed differences or relationships from the sample to the population studied
Slide 13
Statistical
significance
If we drew 100 samples, how
likely is it that we would get a faulty one?
Probability theory
provides us with an estimate of how
likely it is that sampling error is the real explanation for the
association that we are observing
Tests of significance
a figure from 0.000 to 1.000
the probability of error
Slide 14
P - value
P = 0.04
if there were no association in the population, only about 4 out of every 100
samples would be expected to show an association at least this strong purely by
chance.
Sampling error is therefore an unlikely explanation; the association is probably real
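One way to make the "how many samples out of 100" idea concrete is a permutation test, sketched below. This is a swapped-in technique for illustration, not the method used in the slides, and the data, the helper names and the default of 2000 shuffles are all assumptions. Shuffling the y values breaks any real link with x, so the share of shuffles that produce a correlation at least as strong as the observed one estimates the p-value.

```python
import random
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often chance alone gives |r| at least as large as observed."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)          # break any real x-y association
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)

# Made-up illustrative data with a strong positive association.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 1, 4, 3, 6, 5, 8, 7]
p = permutation_p_value(xs, ys)
significant = p < 0.05                 # 0.05 is the common cut-off
```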
Slide 15
Statistical
significance
Every finding derived from a
sample is associated with some probability of error
How much probability of error
should be tolerated?
Researcher decides
sometimes referred to as
tolerance limits
0.05 common
Slide 16
Presenting data
Slide 17
Another example
Slide 18
Means and
proportions
Two means
T-test
Several means
Analysis of variance (ANOVA)
Proportions
Chi-square
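A rough sketch of the three test statistics named on this slide, using only the Python standard library. The data and the function names pooled_t, one_way_f and chi_square are made up for illustration, and the t statistic assumes equal variances in the two groups. The sketch stops at the statistic; turning it into a p-value needs the matching t, F or chi-square distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic, assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def chi_square(table):
    """Chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Made-up illustrative data.
t_stat = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])           # two means
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])         # several means
chi2_stat = chi_square([[10, 20], [20, 10]])                  # proportions
```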
Slide 19
Conclusion
Univariate analysis
Describing frequency
distributions
shape; central tendency;
dispersion
Inferential statistics
Interval estimates
Bivariate analysis
cross tabulation; correlation
(strength, direction, nature)
scattergram; regression
(prediction)
statistical significance
comparison of means (t-test;
ANOVA) and proportions (chi-square)