Standard deviation line

From Wikipedia, the free encyclopedia
Plot of the standard deviation line (SD line), dashed, and the regression line, solid, for a scatter diagram of 20 points.

In statistics, the standard deviation line (or SD line) marks points on a scatter plot that are an equal number of standard deviations away from the average in each dimension. For example, in a 2-dimensional scatter diagram with variables $x$ and $y$, points that are 1 standard deviation away from the mean of $x$ and also 1 standard deviation away from the mean of $y$ are on the SD line.[1] The SD line is a useful visual tool since points in a scatter diagram tend to cluster around it,[1] more or less tightly depending on their correlation.

Properties

Relation to regression line

The SD line goes through the point of averages and has a slope of $\frac{\sigma_y}{\sigma_x}$ when the correlation between $x$ and $y$ is positive, and $-\frac{\sigma_y}{\sigma_x}$ when the correlation is negative.[1][2] Unlike the regression line, the SD line does not take into account the relationship between $x$ and $y$.[3] The slope of the SD line is related to that of the regression line by $a = r \frac{\sigma_y}{\sigma_x}$, where $a$ is the slope of the regression line, $r$ is the correlation coefficient, and $\frac{\sigma_y}{\sigma_x}$ is the magnitude of the slope of the SD line.[2]
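As an illustration of the relationship above, the following minimal sketch computes both slopes for a small made-up data set. The function name `sd_line_and_regression`, the sample values, and the use of population standard deviations (dividing by n) are assumptions made for this example, not part of the cited sources.

```python
import math

def sd_line_and_regression(xs, ys):
    """Return (mean_x, mean_y, sd_slope, regression_slope, r) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n): an assumption of this sketch.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    # The SD line slope has magnitude sd_y / sd_x and takes the sign of r.
    # For r == 0 the sign is not defined by the text; copysign yields "+" here.
    sd_slope = math.copysign(sd_y / sd_x, r)
    # The regression slope is r times the magnitude of the SD line slope.
    regression_slope = r * sd_y / sd_x
    return mean_x, mean_y, sd_slope, regression_slope, r

# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
mean_x, mean_y, sd_slope, reg_slope, r = sd_line_and_regression(xs, ys)
print(f"point of averages: ({mean_x}, {mean_y})")
print(f"SD line slope: {sd_slope:.3f}, regression slope: {reg_slope:.3f}, r: {r:.3f}")
```

Both lines pass through the point of averages. Because the magnitude of r is at most 1, the regression line is never steeper than the SD line; in this toy data the two standard deviations are equal and r is 0.8, so the SD line has slope 1 and the regression line has slope 0.8.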

Typical distance of points to SD line

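The root mean square vertical distance of points from the SD line is $\sqrt{2(1 - |r|)} \times \sigma_y$, where $r$ is the correlation coefficient and $\sigma_y$ is the standard deviation of $y$.[1] This gives an idea of the spread of points around the SD line.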
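The root mean square distance can be checked numerically. The sketch below measures the vertical residuals from the SD line directly and compares their root mean square with the value given by the square root of 2(1 − |r|) times the standard deviation of y. The function name `rms_distance_to_sd_line`, the negatively correlated sample data, and the use of population standard deviations are assumptions made for illustration.

```python
import math

def rms_distance_to_sd_line(xs, ys):
    """Return (measured_rms, formula_rms) of vertical distances to the SD line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n): an assumption of this sketch.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    # SD line through the point of averages, sloped with the sign of r.
    slope = math.copysign(sd_y / sd_x, r)
    residuals = [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]
    measured = math.sqrt(sum(e ** 2 for e in residuals) / n)
    # Closed form; max() guards against tiny negative values from rounding.
    formula = math.sqrt(max(0.0, 2.0 * (1.0 - abs(r)))) * sd_y
    return measured, formula

# Hypothetical negatively correlated data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 3.0, 4.0, 1.0, 2.0]
measured, formula = rms_distance_to_sd_line(xs, ys)
print(f"measured RMS: {measured:.4f}, formula RMS: {formula:.4f}")
```

The two values agree up to floating-point rounding. The closer |r| is to 1, the smaller the result, which matches the statement that points cluster more tightly around the SD line when the correlation is strong.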

  1. Freedman, David; Pisani, Robert; Purves, Roger (1998). Statistics (3rd ed.). New York: W.W. Norton. ISBN 0-393-97083-3. OCLC 36922529.
  2. Stark. "Regression". www.stat.berkeley.edu. Retrieved 2022-11-12.
  3. Cochran. "Regression". www.stat.ucla.edu. Retrieved 2022-11-12.