# Why do we need correlation?

## Pearson Product Moment Correlation: Requirements

**Product-moment correlation**

The Pearson product-moment correlation is a **parametric method**. This means that certain requirements must be met for the results to be correct and interpretable.

**Scale level.** The correlation coefficient provides reliable results only if the variables are at least interval scaled, or dichotomous (dichotomous data are by definition metrically scaled).

**Linearity.** The relationship between the two variables must be linear. If the relationship is not linear, the Pearson product-moment correlation will underestimate the strength of the relationship.

**There are no outliers in the groups.** Most parametric statistics are not very robust against outliers, i.e. values that lie far from the mass of the other values. A single outlier can make an otherwise significant result non-significant. It is therefore particularly important to check the data for outliers.

**Finite variance and covariance.** If the variance of one or both variables is not finite, the product-moment correlation will not provide reliable results. The same is true of the covariance.
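The effect of a single outlier can be seen in a short sketch. The following Python example (illustrative only; the tutorial itself works in SPSS) computes Pearson's *r* directly from its definition and shows how one extreme point can destroy an otherwise strong correlation:

```python
import random

def pearson_r(x, y):
    # Pearson's r: covariance divided by the product of the standard deviations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

random.seed(1)
x = [random.gauss(0, 1) for _ in range(50)]
y = [xi + random.gauss(0, 0.5) for xi in x]   # strong linear relationship
r_clean = pearson_r(x, y)

# Add a single extreme outlier far from the mass of the data.
x_out, y_out = x + [10.0], y + [-10.0]
r_outlier = pearson_r(x_out, y_out)
```

With the clean data, *r* is close to .9; after adding one outlier it drops dramatically, which is why screening for outliers matters so much.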

SPSS also automatically checks whether the correlations differ significantly from zero. To interpret this significance test, both variables must be **bivariate normally distributed**.

### Finite variance and covariance

The formula for calculating *r* is based on the variance and covariance of the two random variables. Finite (co-)variance means that if we take a sample of, say, *N* = 100, the variance stabilizes at a value similar to the one we would obtain with a larger *N*. If the variance were not finite, it would keep growing as *N* increases.
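This stabilization can be simulated. The Python sketch below (an illustration, not part of the SPSS workflow) compares a normal variable, whose variance is finite, with a Cauchy variable, a textbook example of a distribution with no finite variance:

```python
import random

def sample_variance(xs):
    # Unbiased sample variance (denominator n - 1).
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

random.seed(42)
normal = [random.gauss(0, 1) for _ in range(100_000)]
# The ratio of two independent standard normals is Cauchy distributed,
# which has no finite variance.
cauchy = [random.gauss(0, 1) / random.gauss(0, 1) for _ in range(100_000)]

var_normal_small = sample_variance(normal[:1_000])   # stabilizes near 1
var_normal_large = sample_variance(normal)           # still near 1
var_cauchy_small = sample_variance(cauchy[:1_000])
var_cauchy_large = sample_variance(cauchy)           # keeps blowing up
```

For the normal variable, the sample variance at *N* = 1,000 and *N* = 100,000 is nearly identical; for the Cauchy variable it never settles down, because extreme values dominate the sum no matter how large *N* gets.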

If both variables follow a bivariate normal distribution (like the variables in the figure on the right), finite variance is automatically given. In this case, the sample correlation coefficient is also the **maximum likelihood estimator** of the population correlation coefficient. It is therefore **asymptotically unbiased** and **efficient**. In simple terms, this means that no estimator of the correlation can be more accurate than the correlation coefficient. For samples that are not normally distributed, the correlation coefficient remains **approximately unbiased**, but it may no longer be efficient. The sample correlation coefficient is therefore a consistent estimator of the population correlation coefficient as long as the variance and covariance are finite (which is guaranteed by the law of large numbers).

That is why one often reads in some books that the **bivariate normal distribution** of the variables is a requirement of the correlation coefficient. This is **not** the case. Normally distributed variables do matter, however, when the significance is to be tested with a *t*-test. Requirements similar to those of the *t*-test as a hypothesis test then apply.

If there is no finite variance, a non-parametric method should be used, such as Spearman’s Rho or Kendall’s Tau.

Finite variance and covariance is an important requirement, but it cannot be checked with SPSS. We will therefore assume finite variance and covariance throughout this tutorial (and don't worry: it is very unlikely that a data set fails to meet it).

### Linearity

Correlation is a measure of *linear* dependence. If one variable cannot be written as a linear function of the other, a perfect correlation of −1 or +1 cannot be achieved. There are ways to change the distributional properties of the variables through transformations, but these should be used with caution. Applying them too aggressively may improve the correlation, but at the expense of the actual applicability and interpretability of the findings. If linearity is lacking, non-parametric methods such as Spearman's rho or Kendall's tau should be considered.
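The difference between linear and merely monotone dependence can be made concrete. In the Python sketch below (illustrative only), *y* is a perfectly monotone but non-linear function of *x*: Pearson's *r* stays below 1, while Spearman's rho, which is Pearson's *r* computed on the ranks, reaches exactly 1:

```python
def pearson_r(x, y):
    # Pearson's r from its definition.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    # Spearman's rho: Pearson's r on the ranks (no ties in this example).
    rank = lambda v: [sorted(v).index(e) + 1 for e in v]
    return pearson_r(rank(x), rank(y))

x = list(range(1, 21))
y = [xi ** 3 for xi in x]   # perfectly monotone, but not linear

r_p = pearson_r(x, y)       # below 1: the relationship is not linear
rho = spearman_rho(x, y)    # exactly 1: the relationship is perfectly monotone
```

This is exactly the underestimation described above: Pearson's *r* only reaches ±1 for an exactly linear relationship.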

The easiest way to check linearity is visually, with a **scatter plot**, as we shall see later in this guide.

### Normal distribution

It is true that the two correlated variables themselves do not have to follow a bivariate normal distribution for Pearson's *r* to be calculated. However, if you want to test the **significance** of the correlation, further requirements must be met. These correspond to those of the *t*-test, since a corresponding *t*-statistic is used to test significance.
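The *t*-statistic in question has the standard form *t* = *r*·√(*n* − 2) / √(1 − *r*²) with *n* − 2 degrees of freedom, which can be worked through in a few lines (a sketch for illustration; SPSS reports the resulting *p*-value automatically):

```python
import math

def t_from_r(r, n):
    # t statistic for testing H0: rho = 0; it has n - 2 degrees of freedom.
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Example: r = .50 observed in a sample of n = 30
t = t_from_r(0.5, 30)
# t is about 3.06; compared with the two-tailed 5% critical value of
# t(28), roughly 2.05, this correlation would be judged significant.
```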

Unfortunately, SPSS does not have a method for checking the bivariate normal distribution. We shall therefore have to resort to a simpler (though not always accurate) method.
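One such simpler check, sketched here in Python as an assumption about what it might look like, is to test each variable separately for univariate normality, for example via its skewness. Univariate normality of both variables is necessary but not sufficient for bivariate normality, which is why the method is not always accurate:

```python
import random

def skewness(xs):
    # Sample skewness: third standardized moment. Near 0 for normal data.
    n = len(xs)
    m = sum(xs) / n
    s = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sum(((x - m) / s) ** 3 for x in xs) / n

random.seed(0)
normal_sample = [random.gauss(0, 1) for _ in range(5_000)]
skewed_sample = [random.expovariate(1.0) for _ in range(5_000)]

skew_normal = skewness(normal_sample)   # close to 0
skew_exp = skewness(skewed_sample)      # clearly positive: not normal
```

A strongly non-zero skewness flags a variable as non-normal (and hence the pair as non-bivariate-normal); a skewness near zero, however, does not prove normality.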
