Math In Machine Learning


IQR (Interquartile Range)

The IQR is the distance between the first quartile (Q1) and the third quartile (Q3): IQR = Q3 - Q1. It is commonly used to detect outliers in a dataset.

A value is treated as an outlier if it is below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR.
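
A minimal sketch of the outlier rule above, using only the Python standard library. The function name `iqr_outliers`, the parameter `k`, and the sample data are made up for illustration. Quartiles can be computed under several conventions; this sketch assumes the "inclusive" method of `statistics.quantiles`, and other conventions may give slightly different Q1 and Q3 on small samples.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (q1, q3, outliers) using the fences Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: "inclusive" quartiles; other quartile conventions exist.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return q1, q3, outliers


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # made-up sample with one extreme value
q1, q3, outliers = iqr_outliers(data)
print("Q1:", q1, "Q3:", q3, "outliers:", outliers)
```

With this sample, only the value 100 falls outside the fences.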


Variance and Standard Deviation

Variance

Variance describes how far a set of values is spread out from its mean. It measures the degree of dispersion.

The larger the variance, the more widely the data are dispersed.

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2$$

where $\mu$ is the mean and $n$ is the number of values.
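
As a rough illustration, the population variance (dividing by n, as in the formula above) can be computed directly. The function name `population_variance` and the data are illustrative only.

```python
def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample, mean is 5
print(population_variance(data))  # 4.0
```

Note that the sample variance, which divides by n - 1 instead of n, is a different estimator and would give a slightly larger value here.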

Standard Deviation

The standard deviation is the square root of the variance. It is usually easier to interpret than the variance when judging how dispersed the data are.

This is because the standard deviation has the same units as the original data, whereas the variance is in squared units.
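
The point about units can be checked with a small sketch: converting made-up heights from metres to centimetres scales the standard deviation by 100, just like the data. The function name `population_std` is an illustrative choice, and the population form (dividing by n) is assumed.

```python
import math


def population_std(values):
    """Square root of the population variance (divides by n)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]       # made-up heights in metres
heights_cm = [h * 100 for h in heights_m]   # same heights in centimetres

print(population_std(heights_m))   # spread in metres
print(population_std(heights_cm))  # spread in centimetres, about 100x larger
```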


Covariance and Correlation

Covariance

Covariance describes how one variable changes when another one changes. It measures the direction of the linear relationship between two variables: it is positive when they tend to increase together and negative when one tends to decrease as the other increases.
$$\mathrm{Covariance}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n}$$

The value of covariance can be anywhere between $-\infty$ and $+\infty$.
When it equals 0, there is no linear relationship between the two variables (a nonlinear relationship may still exist).
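
A small sketch of the covariance formula, dividing by n as above. The function name `covariance` and the data are made up for illustration.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length sequences (divides by n)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


xs = [1, 2, 3, 4, 5]
print(covariance(xs, [2, 4, 6, 8, 10]))  # 4.0: y rises with x
print(covariance(xs, [10, 8, 6, 4, 2]))  # -4.0: y falls as x rises
```

The magnitude depends on the scale of the data, which is why covariance alone does not say how strong the relationship is.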

Correlation

The correlation between two variables describes how strong the linear relationship between them is.
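$$\mathrm{Correlation}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{\mathrm{Covariance}(x, y)}{\mathrm{SD}(x)\,\mathrm{SD}(y)}$$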

The value of correlation is between -1 and +1.
Correlation is the ‘scaled version’ of covariance.
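
To illustrate the "scaled version" idea, here is a minimal sketch of the Pearson correlation coefficient. The function name `correlation` and the data are illustrative; the sketch assumes neither variable is constant, since that would make the denominator zero.

```python
import math


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Assumption: sxx and syy are non-zero (neither variable is constant).
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(correlation(xs, ys))                      # perfectly linear, increasing
print(correlation(xs, [100 * y for y in ys]))   # rescaling y does not change it
print(correlation(xs, list(reversed(ys))))      # perfectly linear, decreasing
```

Unlike covariance, the result does not change when either variable is rescaled by a positive factor.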

Cross-Covariance

Gram Matrix

A Gram matrix is built from a set of vectors: each entry $G_{ij}$ is the inner product of the vectors $v_i$ and $v_j$.
> Can be used in neural style transfer.
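
A minimal sketch of building a Gram matrix from a list of vectors, where entry `[i][j]` is the dot product of vector `i` and vector `j`. The function names `dot` and `gram_matrix` and the vectors are made up for illustration.

```python
def dot(u, v):
    """Inner (dot) product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(vectors):
    """Matrix G with G[i][j] = dot(vectors[i], vectors[j])."""
    return [[dot(u, v) for v in vectors] for u in vectors]


vectors = [[1, 0], [1, 1], [0, 2]]  # made-up 2-D vectors
for row in gram_matrix(vectors):
    print(row)
# [1, 1, 0]
# [1, 2, 2]
# [0, 2, 4]
```

The result is symmetric, and its diagonal holds the squared lengths of the vectors.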

Distance Metrics

Distance metrics quantify how close two data points are, which is the basis for deciding whether they can be clustered together.

Euclidean

The straight-line distance between two points, which is the shortest path between them.
$$d_{\text{Euclidean}} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
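
The two-dimensional formula generalises to any number of dimensions by summing the squared differences over all coordinates. A small sketch, with the function name `euclidean_distance` chosen for illustration:

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))        # 5.0
print(euclidean_distance((1, 2, 3), (1, 2, 3)))  # 0.0
```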

Manhattan

The sum of the absolute differences between two points across all dimensions.
$$d_{\text{Manhattan}} = |x_2 - x_1| + |y_2 - y_1|$$
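
A minimal sketch of the same idea for points with any number of dimensions. The function name `manhattan_distance` is illustrative.

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences between two points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(b - a) for a, b in zip(p, q))


print(manhattan_distance((1, 2), (4, 6)))  # 7, i.e. |4 - 1| + |6 - 2|
```

For the same pair of points, the Euclidean distance is 5, so the Manhattan distance is never smaller than the straight-line distance.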

Mahalanobis

The distance between a point $x$ and a distribution, measured relative to the distribution's mean $m$ and covariance matrix $C$.
$$d_{\text{Mahalanobis}}^2 = (x - m)^T C^{-1} (x - m)$$
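
A sketch of the squared Mahalanobis distance, restricted to two dimensions so the matrix inverse can be written out by hand without extra libraries. The function name `mahalanobis_squared_2d` and all inputs are made up; the sketch assumes the mean and an invertible 2x2 covariance matrix are already known.

```python
def mahalanobis_squared_2d(x, mean, cov):
    """Squared Mahalanobis distance (x - m)^T C^-1 (x - m) in 2-D.

    cov is a 2x2 covariance matrix given as [[a, b], [c, d]].
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is not invertible")
    # Inverse of a 2x2 matrix: 1/det * [[d, -b], [-c, a]]
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


# With an identity covariance this reduces to squared Euclidean distance.
print(mahalanobis_squared_2d((3, 4), (0, 0), [[1, 0], [0, 1]]))  # 25.0
# A variance of 4 along the first axis makes a step of 2 count as 1 std dev.
print(mahalanobis_squared_2d((2, 0), (0, 0), [[4, 0], [0, 1]]))  # 1.0
```

The second call shows the main design point: distances are measured in units of the distribution's own spread, so the same step counts for less along a direction with high variance.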

Hamming

Measures the dissimilarity between two pieces of data of equal length, typically strings or integers.

For string data, the Hamming distance is the number of positions at which the characters differ.
For numerical data, the Hamming distance is the number of bit positions at which the binary representations differ.
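
Both cases can be sketched in a few lines. The function names `hamming_strings` and `hamming_ints` and the example inputs are illustrative; the integer version assumes non-negative integers.

```python
def hamming_strings(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def hamming_ints(a, b):
    """Number of differing bits between two non-negative integers."""
    # XOR leaves a 1 bit exactly where the two numbers differ.
    return bin(a ^ b).count("1")


print(hamming_strings("karolin", "kathrin"))  # 3
print(hamming_ints(0b1011101, 0b1001001))     # 2
```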

Data Distribution

Normal Distribution

  1. In a normal distribution, the mean, median, and mode are equal.
  2. About 68.3% of values lie within 1 standard deviation of the mean, about 95.4% within 2 standard deviations, and about 99.7% within 3 standard deviations.
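
The percentages in the second point can be checked with the standard library's `statistics.NormalDist` (Python 3.8+). The helper name `fraction_within` is made up for illustration.

```python
from statistics import NormalDist


def fraction_within(k, mu=0.0, sigma=1.0):
    """Probability that a normal variable lies within k std devs of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(k, round(fraction_within(k) * 100, 1), "%")
```

The result does not depend on the particular mean or standard deviation chosen, only on `k`.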

Skewness

If the tail of the distribution extends further to the right, the data are right-skewed (positive skew); if the tail extends further to the left, they are left-skewed (negative skew).
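
One common way to put a number on skewness is the moment coefficient, the third central moment divided by the 1.5 power of the second. The sketch below assumes that definition (other definitions exist), and the function name `skewness` and the data are made up.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    # Assumption: the data are not all identical, so m2 > 0.
    return m3 / m2 ** 1.5


right_tailed = [1, 2, 2, 3, 3, 3, 10]   # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive: right-skewed
print(skewness(left_tailed))   # negative: left-skewed
print(skewness(symmetric))     # zero: symmetric
```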

Entropy

Entropy measures how disordered (uncertain) the data are. The larger the entropy, the less predictable the data.

Calculation

$$H = -\sum_{i=1}^{n} p_i \log_2 p_i$$
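
A small sketch of this calculation for a discrete probability distribution, in bits. The function name `entropy` is illustrative; terms with zero probability are skipped, following the usual convention that they contribute nothing.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    # Terms with p == 0 are skipped (convention: 0 * log 0 = 0).
    return sum(-p * math.log2(p) for p in probs if p > 0)


print(entropy([0.5, 0.5]))                # 1.0: a fair coin
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0: four equally likely outcomes
print(entropy([0.9, 0.1]))                # lower: the outcome is more predictable
```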

Cross-Entropy

Measures the difference between two probability distributions, typically a true distribution $p$ and an estimated distribution $q$.

Calculation

$$H(p, q) = -\sum_{i=1}^{n} p_i \log_2 q_i$$
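
A minimal sketch of the cross-entropy calculation in bits. The function name `cross_entropy` and the distributions are made up; the sketch returns infinity when `q` assigns zero probability to an outcome that `p` considers possible.

```python
import math


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # outcomes that never occur under p contribute nothing
        if q_i == 0:
            return math.inf  # q rules out an outcome that p allows
        total += -p_i * math.log2(q_i)
    return total


p = [0.5, 0.5]    # "true" distribution (made up)
q = [0.25, 0.75]  # estimated distribution (made up)

print(cross_entropy(p, p))  # 1.0, equal to the entropy of p
print(cross_entropy(p, q))  # larger, because q does not match p
```

When `q` equals `p`, the cross-entropy reduces to the entropy of `p`; any mismatch makes it larger.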
