
[R] Classification Problem: LDA (Linear Discriminant Analysis)

1. Discriminant Analysis

  • Bayes Theorem : $\Pr(Y=k \mid X=x) = \dfrac{\pi_k f_k(x)}{\sum_{l=1}^{K} \pi_l f_l(x)}$ (a numeric sketch follows this list)
  • The density for $X$ in class $k$ : $f_k(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma_k} \exp\!\left(-\dfrac{(x-\mu_k)^2}{2\sigma_k^2}\right)$
  • The prior probability for class $k$ : $\pi_k = \Pr(Y=k)$
  • If the distribution of $X$ is approximately normal, LDA and QDA are more stable (than logistic regression).
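
To make Bayes' theorem concrete, here is a minimal numeric sketch; the class means, standard deviations, and priors below are illustrative assumptions, not values from this post.

# Bayes' theorem with Gaussian class densities f_k(x) (assumed parameters)
mu    <- c(0, 2)        # class means (illustrative)
sigma <- c(1, 1)        # class standard deviations (illustrative)
prior <- c(0.6, 0.4)    # priors pi_k (illustrative)
x <- 1.2                # observation to classify

f <- dnorm(x, mean=mu, sd=sigma)          # f_k(x) for each class
posterior <- prior * f / sum(prior * f)   # Pr(Y = k | X = x)
posterior                                 # the largest entry gives the Bayes class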

 

1.1 LDA (Linear Discriminant Analysis)

  • Assume that $\sigma = \sigma_1 = \sigma_2 = \dots = \sigma_K$
  • $f_k(x)$
    • p = 1 : $f_k(x) \sim N(\mu_k, \sigma^2)$
    • p > 1 : $f_k(x) \sim N(\mu_k, \Sigma)$
  • $p_k(x) = \dfrac{\pi_k \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu_k)^2}{2\sigma^2}\right)}{\sum_{l=1}^{K} \pi_l \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu_l)^2}{2\sigma^2}\right)}$
  • To classify at the value $X = x$, see which of the $p_k(x)$ is largest.
  • Because the denominator of $p_k(x)$ is the same for every class, it suffices to compare the numerators (equivalently, their logs):
    • Discriminant score (see the sketch after this list) :
      • p = 1 : $\delta_k(x) = x \cdot \dfrac{\mu_k}{\sigma^2} - \dfrac{\mu_k^2}{2\sigma^2} + \log(\pi_k)$
      • p > 1 : $\delta_k(x) = x^T \Sigma^{-1} \mu_k - \dfrac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log(\pi_k)$
    • $\delta_k(x)$ is a linear function of $x$
  • For $K = 2$ with $\hat{\pi}_1 = \hat{\pi}_2$, the decision boundary is at $x = \dfrac{\hat{\mu}_1 + \hat{\mu}_2}{2}$
  • Need to estimate : $(\mu_1, \dots, \mu_K)$, $(\pi_1, \dots, \pi_K)$, and $\Sigma$
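
A minimal sketch of the p = 1 discriminant-score rule; all parameters below are illustrative assumptions, not values from this post:

# delta_k(x) = x * mu_k / sigma^2 - mu_k^2 / (2 * sigma^2) + log(pi_k)
mu    <- c(0, 2)       # assumed class means
sigma <- 1             # assumed shared standard deviation
prior <- c(0.5, 0.5)   # assumed equal priors
x <- 0.7               # observation to classify

delta <- x * mu / sigma^2 - mu^2 / (2 * sigma^2) + log(prior)
which.max(delta)       # class with the largest discriminant score
(mu[1] + mu[2]) / 2    # decision boundary when the priors are equal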

 

1.2 [Ex] Iris Data

Training a model with LDA

 

# Open the iris dataset and inspect it
data(iris)
str(iris)
summary(iris)
plot(iris[, -5], col=as.numeric(iris$Species) + 1)   # pairwise scatterplots colored by species

# Apply LDA to the iris data
library(MASS)
g <- lda(Species ~ ., data=iris)
plot(g)            # observations projected onto the discriminants LD1 and LD2
plot(g, dimen=1)   # histograms along the first discriminant only

 

 

Compute the misclassification rate on the training set

 

# Compute the misclassification error on the training set
pred <- predict(g)                 # predictions on the training data
table(pred$class, iris$Species)    # confusion matrix (predicted vs. true)
mean(pred$class != iris$Species)   # training misclassification rate

 

 

We can obtain the estimated posterior probabilities $\hat{p}_k(x)$ from predict(g, iris)$posterior; with a logical training indicator tran (as constructed in the comparison below), predict(g, iris)$posterior[!tran, ] returns the rows for the test set.
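
For example, using the model g fitted on the full iris data above:

# Posterior class probabilities, one row per observation
post <- predict(g, iris)$posterior   # 150 x 3 matrix of estimated p_k(x)
head(round(post, 3))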

 

Performance comparison: LDA vs. multinomial regression

 

library(nnet)
set.seed(1234)
K <- 100                   # number of random train/test splits
RES <- array(0, c(K, 2))   # column 1: LDA error, column 2: multinomial error
for (i in 1:K) {
    # Randomly assign 2/3 of the rows to the training set
    tran.num <- sample(nrow(iris), size=floor(nrow(iris)*2/3))
    tran <- as.logical(rep(0, nrow(iris)))
    tran[tran.num] <- TRUE
    # Fit both models on the training rows only
    g1 <- lda(Species ~ ., data=iris, subset=tran)
    g2 <- multinom(Species ~ ., data=iris, subset=tran, trace=FALSE)
    # Test errors on the held-out third
    pred1 <- predict(g1, iris[!tran, ])$class
    pred2 <- predict(g2, iris[!tran, ])
    RES[i, 1] <- mean(pred1 != iris$Species[!tran])
    RES[i, 2] <- mean(pred2 != iris$Species[!tran])
}
apply(RES, 2, mean)        # average test error of each method

# 0.0254 0.0436 : mean test errors for LDA and multinomial regression; LDA does slightly better here
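
Beyond the two means, it can help to look at the spread of the 100 error estimates; a small optional addition (not in the original post), assuming RES from the loop above:

# Distribution of test error rates across the 100 random splits
boxplot(RES, names=c("LDA", "Multinomial"), ylab="Test error rate")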

 

1.3 [Ex] Default Dataset

# Import the libraries and the data
library(ISLR)
data(Default)
attach(Default)
library(MASS)

# Train the model
g <- lda(default ~ ., data=Default)
pred <- predict(g, Default)     # predict on the data frame, not the attached vector
table(pred$class, default)      # confusion matrix (predicted vs. true)
mean(pred$class != default)     # training misclassification rate

 

 

  • False positive rate : the fraction of negative examples that are classified as positive (0.22%)
  • False negative rate : the fraction of positive examples that are classified as negative (76.2%)
  • If we always classified to the prior (majority) class "No", we would make 333/10000 errors, which is only 3.33%; the high false negative rate above shows why the overall error rate alone can be misleading. (A sketch recovering these rates follows this list.)
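
A minimal sketch recovering these rates from the confusion matrix computed above (rows are predicted classes, columns are true classes):

# False positive / false negative rates from the LDA confusion matrix
tab <- table(pred$class, default)
fp <- tab["Yes", "No"] / sum(tab[, "No"])    # negatives classified as positive
fn <- tab["No", "Yes"] / sum(tab[, "Yes"])   # positives classified as negative
c(FP = fp, FN = fn)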