Correlation coefficient

Most math books define the correlation coefficient as a measure of the linear association between two variables.  Bottom line: the correlation coefficient shows the relationship (or association) between two things.

The correlation coefficient NEVER shows or proves causation.  A high-yield USMLE question may include answer choices containing the word “cause.”  Never select an answer choice with the word “cause” in it.

Core concepts of the correlation coefficient

The correlation coefficient is a number between -1 and 1 (including -1 and 1) and is denoted by the letter “r” (e.g., r = 0.78).  r gives the direction and strength of a correlation between two things: the “linear association between two variables.”
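For readers who want to see where r comes from, here is a minimal sketch of the usual (Pearson) calculation in Python. The function name pearson_r and the sample numbers are made up for illustration, and the exam does not ask for this arithmetic; the point is only that the result always lands between -1 and 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each observation from its own mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of paired products (numerator) and sums of squares (denominator)
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: y rises in lockstep with x, then falls in lockstep with x
r_up = pearson_r([1, 2, 3], [2, 4, 6])
r_down = pearson_r([1, 2, 3], [6, 4, 2])
print(r_up, r_down)
```

A perfectly straight rising line gives r = 1 and a perfectly straight falling line gives r = -1; real data fall somewhere in between.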

Negative Correlation (lifespan ↓ as smoking ↑)

If r is between -1 and -0.1, then the direction is negative.
-1 = perfectly negative correlation (maximum strength), so -0.6 represents a moderately strong association.

Here's a general scale for the correlation coefficient's strength:

Positive Correlation (height ↑ as weight ↑)

If r is between +0.1 and +1, then the direction is positive.
+1 = perfectly positive correlation (maximum strength), so +0.8 represents a strong association.
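The sign rules for the negative and positive ranges can be written as a tiny lookup. This is only a sketch: describe_direction is a hypothetical helper, and the label for values between -0.1 and +0.1 is my own assumption, since the ranges above do not cover that gap.

```python
def describe_direction(r):
    """Return the direction of a correlation from the sign and size of r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1, inclusive")
    if r <= -0.1:
        return "negative"   # one variable goes up as the other goes down
    if r >= 0.1:
        return "positive"   # both variables move in the same direction
    # Assumption: the -0.1 to +0.1 gap is treated as no meaningful association
    return "little or no linear association"

print(describe_direction(-0.6))  # negative
print(describe_direction(0.8))   # positive
```

Direction comes from the sign alone; strength comes from how close r is to -1 or +1.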

Solving / answering high yield correlation coefficient questions

Questions may provide scatter plots (the easiest) or be presented entirely as a word problem (the trickiest).  Calculations are not required; the questions are entirely conceptual.

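A high-yield question designed to confuse the examinee could give a positive r with p = 0.02, but frame the answers so that the correct choice describes two variables that both decrease.  A positive r means only that the two variables move in the same direction, so "both go up" and "both go down" fit equally well.  I provide practice questions that address this.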
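To see why two decreasing variables still go with a positive r, here is a small sketch with made-up numbers in which both variables fall together. The data and the helper name pearson_r are hypothetical and are not taken from any real study or exam question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired measurements: both variables DEcrease from left to right
variable_a = [10, 8, 6, 4, 2]
variable_b = [50, 41, 33, 19, 12]

r_both_falling = pearson_r(variable_a, variable_b)
print(r_both_falling > 0)  # True: falling together is still a positive correlation
```

The sign of r only says whether the two variables move the same way (positive) or opposite ways (negative), not whether they are going up or down.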

