Glossary – Key Terms

Alisa Beyer

Terms are organized by sections of the textbook

Research & Variable Terminology

data	A collection of measurements or observations.
datum	A single measurement or observation; commonly called a score or raw score.
dependent variable	In an experiment, the variable that is observed for changes. (the scores)
descriptive statistics	Techniques that organize and summarize a set of data.
discrete variable	A variable that exists in indivisible units.
experimental condition	A condition where the treatment is administered.
experimental method	A research method that manipulates one variable, observes a second variable for changes, and controls all other variables. The goal is to establish a cause-and-effect relationship.
independent variable	In an experiment, the variable that is manipulated by the researcher. (the treatment conditions)
inferential statistics	Techniques that use sample data to draw general conclusions about populations.
integer	whole numbers (no decimal or fraction)
interval scale	An ordinal scale where all the categories are intervals with exactly the same width.
lower real limit	The boundary that separates an interval from the next lower interval.
nominal scale	A measurement scale where the categories are differentiated only by qualitative names.
nonequivalent groups study	A research study in which the different groups of participants are formed under circumstances that do not permit the researcher to control the assignment of individuals to groups and the groups of participants are, therefore, considered nonequivalent.
operational definition	A procedure for measuring and defining a construct.
ordinal scale	A measurement scale consisting of a series of ordered categories.
parameter	A characteristic that describes a population.
population	The entire group of individuals that a researcher wishes to study.
pre–post study	Quasi-experimental and nonexperimental designs consisting of a series of observations made over time. The goal is to evaluate the effect of an intervening treatment or event by comparing observations made before versus after the treatment.
quasi-independent variable	In a quasi-experimental or nonexperimental research study, the variable that differentiates the groups or conditions being compared. Similar to the independent variable in an experiment.
ratio scale	An interval scale where a value of zero corresponds to none.
raw score	An original, unaltered measurement.
real limits	The boundaries separating the intervals that define the scores for a continuous variable.
real number	A number that can be written in fraction or decimal form, as real data often are.
reliability	Consistency of measure
sample	A group selected from a population to participate in a research study.
sampling error	The discrepancy between a statistic and a parameter.
statistic	A characteristic that describes a sample.
statistics	Values, usually numerical, that describe a sample (the plural of statistic). A statistic is usually derived from measurements of the individuals in the sample.
upper real limit	The boundary that separates an interval from the next higher interval.
validity	Authenticity of measure (face validity, construct validity, predictive validity)
variable	A characteristic that can change or take on different values.
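
The real limits and sampling error entries above can be illustrated with a short sketch. It assumes the common convention that real limits sit half a measurement unit on either side of a score. The function names and the small data set are hypothetical and are not taken from the textbook.

```python
def real_limits(score, unit=1):
    """Return (lower, upper) real limits, assumed to sit half a unit either side."""
    half = unit / 2
    return (score - half, score + half)


def sampling_error(statistic, parameter):
    """Discrepancy between a sample statistic and the population parameter."""
    return statistic - parameter


# Hypothetical population and one sample selected from it.
population = [2, 4, 6, 8, 10]
sample = [2, 4, 6]

parameter = sum(population) / len(population)  # population mean (a parameter)
statistic = sum(sample) / len(sample)          # sample mean (a statistic)

print(real_limits(8))                        # (7.5, 8.5)
print(sampling_error(statistic, parameter))  # -2.0
```

Here the sample mean (4) underestimates the population mean (6), so the sampling error is negative.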

Graph, Tables, & Distribution Vocabulary

apparent limits	The score values that appear as the lowest score and the highest score in an interval.
axes	The two perpendicular lines (the horizontal X-axis and the vertical Y-axis) that frame a graph such as a bar graph.
bar graph	A graph showing a bar above each score or interval so that the height of the bar corresponds to the frequency. A space is left between adjacent bars. Typically used for nominal and ordinal data
class interval/class limit	A group of scores in a grouped frequency distribution. Groups of scores have same range (e.g., grouped by 10s)
cumulative frequency	The number of individuals with scores at or below a particular point in the distribution. Expressed as a percentage of the total, it is the cumulative percentage.
frequency distribution	A tabulation of the number of individuals in each category on the scale of measurement.
grouped frequency distribution	A frequency distribution where scores are grouped into intervals rather than listed as individual values. Uses class intervals.
hinges	The 25th and 75th percentiles in a box plot; the bottom and top of the “box.”
histogram	A graph showing a bar above each score or interval so that the height of the bar corresponds to the frequency and width extends to the real limits.
negatively skewed distribution	A distribution where the scores pile up on the right side and taper off to the left. (think your left foot)
normal	A specific shape that can be precisely defined by an equation.
outlier	An extreme score. In a box plot, it is shown as a point outside the whiskers.
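percentile	A score identified by its percentile rank; a transformation of raw scores indicating placement in the distribution.
percentile rank	The percentage of individuals with scores at or below a particular value (a cumulative percentage).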
polygon	A graph consisting of a line that connects a series of dots. A dot is placed above each score or interval so that the height of the dot corresponds to the frequency.
positively skewed distribution	A distribution where the scores pile up on the left side and taper off to the right. (think your right foot)
range	The distance from the upper real limit of the highest score to the lower real limit of the lowest score; the total distance from the absolute highest point to the lowest point in the distribution.
relative frequency	The proportion of the total distribution rather than the absolute frequency. Used for population distributions for which the absolute number of individuals is not known for each category.
stem and leaf graph (stem plot)	Way to share specific data points and spread based on base unit (stem) and final significant digit (leaf)
symmetrical distribution	A distribution where the left-hand side is a mirror image of the right-hand side.
tail(s) of a distribution	A section on either side of a distribution where the frequency tapers down toward zero as the X values become more extreme.
whiskers	Lines extending from the box in a box plot that designate the spread of the data points.
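
As a rough illustration of the frequency distribution, relative frequency, and cumulative frequency entries above, the following sketch tabulates a small set of scores. The scores and the function name `frequency_table` are made up for the example.

```python
from collections import Counter


def frequency_table(scores):
    """Build one row per distinct score, in ascending order."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    running = 0  # cumulative frequency so far
    for x in sorted(counts):
        running += counts[x]
        rows.append({
            "X": x,
            "f": counts[x],                # frequency
            "proportion": counts[x] / n,   # relative frequency
            "cf": running,                 # cumulative frequency
            "c%": 100 * running / n,       # cumulative percentage
        })
    return rows


# Hypothetical scores for ten individuals.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
table = frequency_table(scores)
for row in table:
    print(row)
```

In this table, a score of X = 3 has a cumulative percentage of 60, meaning 60% of the individuals scored 3 or lower.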

Descriptive Statistics Terminology

bimodal	A distribution with two modes.
central tendency	A statistical measure that identifies a single score (usually a central value) to serve as a representative for the entire group.
line graph	A display in which points connected by straight lines show several different means obtained from different groups or treatment conditions. Also used to show different medians, proportions, or other sample statistics.
major mode	The taller peak of two modes with unequal frequencies.
median	The score that divides a distribution exactly in half.
minor mode	The shorter peak of two modes with unequal frequencies.
mode	The score with the greatest frequency overall (major), or the greatest frequency within the set of neighboring scores (minor).
multimodal	A distribution with more than two modes.
(normal) symmetrical distribution	A distribution where the left-hand side is a mirror image of the right-hand side.
weighted mean	The average of two means, calculated so that each mean is weighted by the number of scores it represents.
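
The measures of central tendency above can be sketched with Python's standard `statistics` module. The helper `weighted_mean` and the example numbers are hypothetical; the helper simply weights each mean by the number of scores it represents, as the weighted mean entry describes.

```python
import statistics


def weighted_mean(mean1, n1, mean2, n2):
    """Combine two means, weighting each by the number of scores behind it."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical scores.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

mean = statistics.mean(scores)        # 3.2
median = statistics.median(scores)    # 3.0, the midpoint of the distribution
modes = statistics.multimode(scores)  # [3], the most frequent score

# A group of 12 with M = 6 combined with a group of 8 with M = 7.
combined = weighted_mean(6, 12, 7, 8)

print(mean, median, modes, combined)
```

The combined mean is about 6.4, closer to 6 than to 7 because the first group contributes more scores.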

Dispersion Measures Vocabulary

biased statistic	A statistic that, on average, consistently tends to overestimate (or underestimate) the corresponding population parameter.
degrees of freedom (df)	For a sample, df = n – 1. Degrees of freedom measure the number of scores that are free to vary when computing SS for sample data. The value of df also describes how well a t statistic estimates a z-score.
deviation score	The distance (and direction) from the mean to a specific score. Deviation = X – μ.
error variance	Unexplained, unsystematic differences that are not caused by any known factor.
mean squared deviation	The mean squared deviation equals the population variance. Variance is the average squared distance from the mean.
population standard deviation (σ)	The square root of the population variance; a measure of the standard distance from the mean.
population variance (σ²)	The average squared distance from the mean; the mean of the squared deviations.
range	The distance from the upper real limit of the highest score to the lower real limit of the lowest score; the total distance from the absolute highest point to the lowest point in the distribution.
sample standard deviation (s)	The square root of the sample variance.
sample variance (s²)	The sum of the squared deviations divided by df = n – 1. An unbiased estimate of the population variance.
sum of squares (SS)	The sum of the squared deviation scores.
unbiased statistic	A statistic that, on average, provides an accurate estimate of the corresponding population parameter. The sample mean and sample variance are unbiased statistics.
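
A minimal sketch of how SS, variance, and standard deviation relate, following the definitions above. The five scores are hypothetical. The same scores are treated first as a whole population (divide SS by N) and then as a sample (divide SS by df = n – 1), purely to show the difference.

```python
import math


def sum_of_squares(scores):
    """SS: the sum of the squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


# Hypothetical scores; the mean is 6.
scores = [1, 9, 5, 8, 7]
n = len(scores)

ss = sum_of_squares(scores)            # 40.0
population_variance = ss / n           # 8.0  (mean squared deviation)
sample_variance = ss / (n - 1)         # 10.0 (divides by df = n - 1)
sample_sd = math.sqrt(sample_variance)

print(ss, population_variance, sample_variance, round(sample_sd, 3))
```

Dividing by n – 1 rather than n makes the sample variance larger, which is what corrects the tendency of sample data to underestimate population variability.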

Z-score & more Terminology

z-score	A standardized version of a raw score, calculated from the X value, the mean, and the standard deviation: z = (X – μ) / σ.
Empirical Rule	68-95-99.7: about 68% of all scores fall within 1 standard deviation of the mean; about 95% fall within 2 standard deviations of the mean; about 99.7% fall within 3 standard deviations of the mean.
(normal) symmetrical distribution	A distribution where the left-hand side is a mirror image of the right-hand side. “Gaussian curve”
probability	expected relative frequency value of a particular outcome
relative frequency	number of times an event/outcome takes place relative to the number of times it could have taken place
probability in normal distributions	connected to area under the normal curve, can also interpret as percentage of total who fall into that event/outcome
probability distribution	describes probability of all possible outcomes for an activity
percentile	Based on a proportion converted to a percentage: a proportion is a decimal, and multiplying it by 100 gives the percentage (%).
z-distribution	The standardized normal distribution. The unit normal table (also called the standardized unit normal table) gives the area associated with each z-score.
body of distribution	typically the larger shaded area, which includes the mean
tail of distribution	typically the smaller region shaded
gambler’s fallacy
random sampling	every person in population/group has equal chance of being selected
statistical independence	two outcomes/variables are unrelated: the occurrence of one does not change the probability of the other
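odds ratio	compares how likely an event/outcome is to occur in one group with how likely it is to occur in another, expressed as a ratio of the two odds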
law of large numbers	as a sample size grows, its mean gets closer to the average of the whole population
Central Limit Theorem	The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed. This will hold true regardless of whether the source population is normal or skewed, provided the sample size is sufficiently large (usually n > 30). If the population is normal, then the theorem holds true even for samples smaller than 30.
Distribution of Sample Means	The distribution of sample means is defined as the set of means from all the possible random samples of a specific size (n) selected from a specific population.
sampling error	A sampling error is a statistical error that occurs when an analyst does not select a sample that represents the entire population of data. As a result, the results found in the sample do not represent the results that would be obtained from the entire population.
standard error of the mean	The standard deviation of the distribution of sample means, σM = σ/√n. The smaller the standard error, the more representative a sample mean is likely to be of the overall population.
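
The z-score, Empirical Rule, and standard error entries above can be checked numerically. This is only a sketch: the function names are hypothetical, and it uses `statistics.NormalDist` from the Python standard library in place of a printed unit normal table.

```python
import math
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize a raw score: z = (X - mu) / sigma."""
    return (x - mu) / sigma


def proportion_within(k):
    """Area under the unit normal curve between z = -k and z = +k."""
    unit_normal = NormalDist()  # mean 0, standard deviation 1
    return unit_normal.cdf(k) - unit_normal.cdf(-k)


def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means."""
    return sigma / math.sqrt(n)


print(z_score(130, 100, 15))  # 2.0
for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))  # 0.6827, 0.9545, 0.9973
print(standard_error(20, 25))  # 4.0
```

The three areas are where the 68-95-99.7 shorthand comes from. The last line shows that with σ = 20, samples of n = 25 produce means that vary with a standard error of 4.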

t-test terminology

between-subjects research design	An alternative term for an independent-measures design.
dependent t-test	In a within-subjects design, a hypothesis test that evaluates the statistical significance of the mean difference between two sets of scores from the same set of participants. Also known as a paired t-test.
difference scores	The difference between two measurements obtained for a single subject. D = X2 – X1
homogeneity of variance	An assumption that the two populations from which the samples were obtained have equal variances.
independent-measures t statistic	In a between-subjects design, a hypothesis test that evaluates the statistical significance of the mean difference between two separate groups of participants.
independent-measures research design	A research design that uses a separate sample for each treatment condition or each population being compared.
individual differences	The naturally occurring differences from one individual to another that may cause the individuals to have different scores.
matched-subjects design	A research study where the individuals in one sample are matched one-to-one with the individuals in a second sample. The matching is based on a variable considered relevant to the study.
order effects	The effects of participating in one treatment that may influence the scores in the following treatment.
pooled variance	A single measure of sample variance that is obtained by averaging two sample variances. It is a weighted mean of the two variances.
related-samples designs	Two research designs (repeated-measures and matched-subjects) that are statistically equivalent. The scores in one set are directly related, one-to-one, with the scores in the second set.
repeated-measures research design	A research design in which the different groups of scores are all obtained from the same group of participants. Also known as within-subjects design.
within-subjects research design	A research design in which the different groups of scores are all obtained from the same group of participants. Also known as repeated-measures design.
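
To make the pooled variance, difference score, and t statistic entries concrete, here is a minimal sketch of both t statistics. The data and function names are hypothetical, and both tests assume a null hypothesis of zero mean difference. It computes only the t value, not the critical value or p-value.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def ss(xs):
    """Sum of squared deviations from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_variance(a, b):
    """Weighted mean of two sample variances: (SS1 + SS2) / (df1 + df2)."""
    return (ss(a) + ss(b)) / ((len(a) - 1) + (len(b) - 1))


def independent_t(a, b):
    """Independent-measures t for two separate groups."""
    sp2 = pooled_variance(a, b)
    standard_error = math.sqrt(sp2 / len(a) + sp2 / len(b))
    return (mean(a) - mean(b)) / standard_error


def dependent_t(first, second):
    """Dependent (paired) t computed on difference scores D = X2 - X1."""
    d = [x2 - x1 for x1, x2 in zip(first, second)]
    n = len(d)
    variance_d = ss(d) / (n - 1)
    return mean(d) / math.sqrt(variance_d / n)


# Hypothetical between-subjects data: two separate groups.
group_a = [1, 9, 5, 8, 7]
group_b = [2, 4, 3, 5, 1]
print(pooled_variance(group_a, group_b))          # 6.25
print(round(independent_t(group_a, group_b), 3))  # 1.897

# Hypothetical within-subjects data: the same four people measured twice.
before = [10, 12, 9, 11]
after = [13, 14, 10, 15]
print(round(dependent_t(before, after), 3))       # 3.873
```

The independent-measures t has df = (n1 – 1) + (n2 – 1) = 8 here, while the dependent t has df = n – 1 = 3 because it is based on four difference scores.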

ANOVA vocabulary

F-ratio	The test statistic for analysis of variance is called an F-ratio and compares the differences (variance) between treatments with the differences (variance) that are expected by chance.
analysis of variance (ANOVA)	A hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments (or populations).
ANOVA summary table	A table that shows the source of variability (between treatments, within treatments, and total variability), SS, df, MS, and F.
between-treatments variance	Values used to measure and describe the differences between treatments (mean differences).
distribution of F-ratios	All of the possible F values when H0 is true.
error term (within-subjects variance)	For ANOVA, the denominator of the F-ratio is called the error term. The error term provides a measure of the variance caused by random, unsystematic differences. When the treatment effect is zero (H0 is true), the error term measures the same sources of variance as the numerator of the F-ratio, so the value of the F-ratio is expected to be nearly equal to 1.00.
eta squared	A measure of effect size based on the percentage of variance accounted for by the sample mean differences.
experimentwise alpha level	The risk of a Type I error that accumulates as you do more and more separate tests.
factor	In analysis of variance, an independent variable (or quasi-independent variable) is called a factor.
G	Grand Mean (mean of all scores)
interaction	unique relationship between 2 factors in a two-way ANOVA: the effect of one factor depends on the level of the other factor
interactive plot	graph of interaction for 2-way ANOVA
k	number of groups/levels/conditions/treatments
levels	In an experiment, the different values of the independent variable selected to create and define the treatment conditions. In other research studies, the different values of a factor. AKA groups/conditions/treatments
mean square (MS)	In analysis of variance, a sample variance is called a mean square or MS, indicating that variance measures the mean of the squared deviations.
mixed design	Factorial ANOVA – 1 factor is between group and 1 factor is within group/repeated measure
n	sample size for each level/condition/treatment/group
N	total sample size (all scores across all groups)
pairwise comparisons	To go back through the data and compare the individual treatments two at a time.
post hoc tests	A test that is conducted after an ANOVA with more than two treatment conditions where the null hypothesis was rejected. The purpose of post hoc tests is to determine exactly which treatment conditions are significantly different.
Scheffe test	A test that uses an F-ratio to evaluate the significance of the difference between any two treatment conditions. One of the safest of all possible post hoc tests.
T	sum of scores for each level/condition/treatment/group
testwise alpha level	The risk of a Type I error, or alpha level, for an individual hypothesis test.
Tukey’s HSD test	A test that allows you to compute a single value that determines the minimum difference between treatment means that is necessary for significance. A commonly used post hoc test.
within-treatments variance	The differences that exist inside each treatment condition.
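
The pieces of an ANOVA summary table (SS, df, MS, F) and eta squared can be sketched for a one-way, independent-measures design as follows. The three treatment groups are hypothetical data, and `one_way_anova` is an illustrative name, not a function from the textbook or any library.

```python
def one_way_anova(groups):
    """Return the summary-table values for a one-way independent-measures ANOVA."""
    k = len(groups)                        # number of treatment conditions
    total_n = sum(len(g) for g in groups)  # N, total number of scores
    grand_mean = sum(sum(g) for g in groups) / total_n

    # Between-treatments: how far each treatment mean is from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-treatments: variability inside each treatment condition.
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )

    df_between = k - 1
    df_within = total_n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within      # the error term
    return {
        "SS_between": ss_between,
        "SS_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
        "eta_squared": ss_between / (ss_between + ss_within),
    }


# Hypothetical scores for k = 3 treatments with n = 5 each.
groups = [
    [0, 1, 3, 1, 0],
    [4, 3, 6, 3, 4],
    [1, 2, 2, 0, 0],
]
result = one_way_anova(groups)
print(result)
```

For these data SS between is 30 and SS within is 16, giving F = 15 / 1.33, or about 11.25, with df = 2, 12. Eta squared is 30 / 46, or about 0.65.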

Correlation and Regression Terminology

Y-intercept	The value of Y when X = 0. In the linear equation, the value of a.
analysis of regression	Evaluating the significance of a regression equation by computing an F-ratio comparing the predicted variance (MS) in the numerator and the unpredicted variance (MS) in the denominator.
coefficient of determination	The degree to which the variability in one variable can be predicted by its relationship with another variable; measured by r².
correlation	A statistical value that measures and describes the direction and degree of relationship between two variables. The sign (+/–) indicates the direction of the relationship. The numerical value (0.0 to 1.0) indicates the strength or consistency of the relationship. The type (Pearson or Spearman) indicates the form of the relationship. Also known as correlation coefficient.
correlation matrix	A table that shows the results from multiple correlations and uses footnotes to indicate which correlations are significant.
dichotomous variable	A variable with only two values. Also called a binomial variable.
linear equation	An equation of the form Y = bX + a expressing the relationship between two variables X and Y.
linear relationship	A relationship between two variables where a specific increase in one variable is always accompanied by a specific increase (or decrease) in the other variable.
negative correlation	A relationship between two variables where increases in one variable tend to be accompanied by decreases in the other variable.
partial correlation	A partial correlation measures the relationship between two variables while controlling the influence of a third variable by holding it constant.
Pearson correlation	A measure of the direction and degree of linear relationship between two variables.
perfect correlation	A relationship where the actual data points perfectly fit the specific form being measured. For a Pearson correlation, the data points fit perfectly on a straight line.
phi-coefficient	A correlation between two variables both of which are dichotomous
point-biserial correlation	A correlation between two variables where one of the variables is dichotomous.
positive correlation	A relationship between two variables where increases in one variable tend to be accompanied by increases in the other variable.
residual	The difference between the actual and predicted Y values. Least-squares regression builds the regression line by minimizing the sum of the squared residuals.
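
A small sketch tying together the Pearson correlation, the linear equation Y = bX + a, the coefficient of determination, and residuals. The five (X, Y) pairs are hypothetical, and SP (sum of products of deviations) is a helper quantity introduced here for the computation.

```python
import math


def pearson_and_regression(xs, ys):
    """Return r, slope b, and Y-intercept a for the least-squares line Y = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    r = sp / math.sqrt(ss_x * ss_y)
    b = sp / ss_x            # slope
    a = mean_y - b * mean_x  # Y-intercept: the value of Y when X = 0
    return r, b, a


# Hypothetical paired scores.
xs = [0, 10, 4, 8, 8]
ys = [2, 6, 2, 4, 6]

r, b, a = pearson_and_regression(xs, ys)
residuals = [y - (b * x + a) for x, y in zip(xs, ys)]

print(r, b, a)  # 0.875 0.4375 1.375
print(r ** 2)   # 0.765625
print(residuals)
```

With r = 0.875, the coefficient of determination r² is about 0.77, so roughly 77% of the variability in Y is predicted by its relationship with X. The residuals from a least-squares line sum to zero.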

Chi-Square Terminology

chi-square distribution	The theoretical distribution of chi-square values that would be obtained if the null hypothesis was true.
chi-square statistic	A test statistic that evaluates the discrepancy between a set of observed frequencies and a set of expected frequencies.
chi-square test for goodness-of-fit	A test that uses the proportions found in sample data to test a hypothesis about the corresponding proportions in the general population.
chi-square test for independence	A test that uses the frequencies found in sample data to test a hypothesis about the relationship between two variables in the population.
Cramér’s V	A modification of the phi-coefficient to be used when one or both variables consist of more than two categories
contingency table	Also called crosstabs. Identifies frequencies for each level of the variable(s)
distribution-free test	Also called a nonparametric test. A test that does not test hypotheses about parameters or make assumptions about parameters. The data usually consist of frequencies.
expected frequencies	Hypothetical, ideal frequencies that are predicted from the null hypothesis.
nonparametric test	A test that does not test hypotheses about parameters or make assumptions about parameters. The data usually consist of frequencies.
observed frequencies	The actual frequencies that are found in the sample data.
parametric test	A test evaluating hypotheses about population parameters and making assumptions about parameters. Also, a test requiring numerical scores.
phi-coefficient	A correlational measure of relationship when both variables consist of exactly two categories. A measure of effect size for the test for independence.
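
The chi-square statistic, expected frequencies, and Cramér’s V entries can be sketched as below. The observed frequencies are hypothetical, the function names are illustrative, and the goodness-of-fit example assumes a null hypothesis of equal proportions across four categories.

```python
import math


def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories or cells."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def expected_frequencies(table):
    """Expected cell frequencies for a test for independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def cramers_v(chi2, n, n_rows, n_cols):
    """Effect size; equals the phi-coefficient for a 2 x 2 table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Goodness of fit: 50 people choose among 4 categories; H0 says 12.5 each.
observed = [18, 17, 7, 8]
expected = [12.5, 12.5, 12.5, 12.5]
chi2_fit = chi_square(observed, expected)
print(round(chi2_fit, 2))  # 8.08

# Independence: a hypothetical 2 x 2 contingency table with N = 100.
table = [[10, 20], [30, 40]]
fe = expected_frequencies(table)  # [[12.0, 18.0], [28.0, 42.0]]
flat_observed = [cell for row in table for cell in row]
flat_expected = [cell for row in fe for cell in row]
chi2_ind = chi_square(flat_observed, flat_expected)
print(round(chi2_ind, 4))                        # 0.7937
print(round(cramers_v(chi2_ind, 100, 2, 2), 4))  # 0.0891
```

The goodness-of-fit test here has df = 4 – 1 = 3, and the test for independence has df = (2 – 1)(2 – 1) = 1.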

License

Introduction to Statistics for Psychology Copyright © 2021 by Alisa Beyer is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.