

[R] Non-Linear Models : Local Regression, GAM

1. What is Local Regression?

  • Local regression, or local polynomial regression (also known as moving regression), is a generalization of the moving average and polynomial regression.
  • Local regression computes the fit at a target point \(x_0\) using only the nearby training observations.
  • With a sliding weight function, we fit separate linear fits over the range of \(x\) by weighted least squares; a minimal sketch of one such fit follows this list.
    • OLS : \(\hat{\beta}^{OLS} = (X^{'}X)^{-1}X^{'}y\)
    • WLS : \(\hat{\beta}^{WLS} = (X^{'}WX)^{-1}X^{'}Wy\)
    • GLS : \(\hat{\beta}^{GLS} = (X^{'}\Omega^{-1} X)^{-1}X^{'}\Omega^{-1}y\)
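To make the weighting concrete, here is a minimal sketch of a single local-linear WLS fit at one target point. The toy data, span, and tricube kernel (the default kernel in loess) are illustrative assumptions, not part of the Wage example below.

# Minimal sketch: one local-linear WLS fit at a target point x0
# (toy data and tricube kernel are assumptions for illustration)
set.seed(1)
x <- sort(runif(100, 0, 10))
y <- sin(x) + rnorm(100, sd=0.3)
x0 <- 5                                   # target point
span <- 0.3
k <- floor(span * length(x))              # neighbors inside the sliding window
d <- abs(x - x0)
h <- sort(d)[k]                           # bandwidth = distance to k-th neighbor
w <- ifelse(d < h, (1 - (d/h)^3)^3, 0)    # tricube weights, zero outside window
fit <- lm(y ~ x, weights=w)               # WLS: (X'WX)^{-1} X'W y
predict(fit, data.frame(x=x0))            # fitted value at x0

Sliding x0 across the range of x and collecting these fitted values traces out the loess curve.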

 

 

2. [Ex] Local Regression

# Prepare library and datasets
library(ISLR)
data(Wage)
age <- Wage$age
wage <- Wage$wage
age.grid <- seq(min(age), max(age))

# Train local regression models with two different spans
fit1 <- loess(wage ~ age, span=.2, data=Wage)
fit2 <- loess(wage ~ age, span=.7, data=Wage)

# Visualize fitted local regression models
plot(age, wage, cex=.5, col="darkgrey")
title("Local Linear Regression")
lines(age.grid, predict(fit1, data.frame(age=age.grid)),
      col="red", lwd=2)
lines(age.grid, predict(fit2, data.frame(age=age.grid)),
      col="blue", lwd=2)
legend("topright", legend=c("Span = 0.2", "Span = 0.7"),
       col=c("red", "blue"), lty=1, lwd=2)

 

 

3. [Ex] Comparison of Smoothing Splines, Natural Cubic Splines, Local Regression

set.seed(1357)
MSE2 <- matrix(0, 100, 2)
for (i in 1:100) {
    # Train-Test split
    tran <- sample(nrow(Wage), size=floor(nrow(Wage)*2/3))
    test <- setdiff(1:nrow(Wage), tran)
    # Train local regression models on the training set, varying span
    g1 <- loess(wage ~ age, span=.2, data=Wage, subset=tran)
    g2 <- loess(wage ~ age, span=.7, data=Wage, subset=tran)
    # Calculate test-set MSE (na.rm guards against loess extrapolation NAs)
    mse1 <- (wage - predict(g1, Wage))[test]^2
    mse2 <- (wage - predict(g2, Wage))[test]^2
    MSE2[i,] <- c(mean(mse1, na.rm=TRUE), mean(mse2, na.rm=TRUE))
}
# MSE1 holds the smoothing-spline and natural-cubic-spline results
# from the previous section
MSE <- cbind(MSE1, MSE2)
apply(MSE, 2, mean)
apply(MSE, 2, sd)

# Visualize results 
boxplot(MSE, boxwex=0.5, col=4:7, ylim=c(1200, 2000), 
        names= c("Smoothing Spline", "Natural Cubic", "Local (S=0.2)", "Local (S=0.7)"), 
        ylab="Mean Squared Errors")

 

 

4. Generalized Additive Models

  • Generalized additive models (GAMs) allow non-linear functions of each of the variables, while maintaining additivity.
  • \(y_i = \beta_0 + f_1(x_{i1}) + f_2(x_{i2}) + \cdots + f_p(x_{ip}) + \epsilon_i\)
  • It is called an additive model because we calculate a separate \(f_j\) for each \(x_j\), and then add together all of their contributions (see the backfitting sketch after this list).
  • GAMs provide a useful compromise between linear and fully nonparametric models.
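The additive structure is what makes GAMs fittable by backfitting: each \(f_j\) is re-estimated on the partial residuals left over from the other components. Below is a minimal sketch on toy data, using smooth.spline as the per-variable smoother; it is an illustrative simplification, not the exact algorithm inside the gam package.

# Minimal backfitting sketch for y = b0 + f1(x1) + f2(x2) + e
# (toy data; an illustrative simplification of GAM fitting)
set.seed(1)
n <- 200
x1 <- runif(n); x2 <- runif(n)
y <- sin(2*pi*x1) + (x2 - 0.5)^2 + rnorm(n, sd=0.1)
b0 <- mean(y)
f1 <- rep(0, n); f2 <- rep(0, n)
for (it in 1:20) {
    # update f1 on partial residuals holding f2 fixed, then vice versa
    f1 <- predict(smooth.spline(x1, y - b0 - f2), x1)$y
    f1 <- f1 - mean(f1)                   # center each fj for identifiability
    f2 <- predict(smooth.spline(x2, y - b0 - f1), x2)$y
    f2 <- f2 - mean(f2)
}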

 

4.1 [Ex] GAMs using gam package

# Prepare libraries and dataset
library(ISLR)
library(splines)
library(gam)
data(Wage)

# A GAM built from natural splines can be fit with lm()
gam1 <- lm(wage ~ ns(year, 4) + ns(age, 5) + education, data=Wage)
summary(gam1)

# s() : smoothing spline (gam package)
# named gam.m3 to avoid shadowing the gam() function
gam.m3 <- gam(wage ~ s(year, 4) + s(age, 5) + education, data=Wage)

# Visualize fitted models
par(mfrow=c(1,3))
plot(gam.m3, se=TRUE, col="blue", scale=70)
plot.Gam(gam1, se=TRUE, col="red")

 

 

# Significance test between nested GAMs
gam.m1 <- gam(wage ~ s(age, 5) + education, data=Wage)
gam.m2 <- gam(wage ~ year + s(age, 5) + education, data=Wage)
anova(gam.m1, gam.m2, gam.m3, test="F")
summary(gam.m3)

 

  • One way to select features in a GAM is the ANOVA F-test between nested models, as above.
  • Alternatively, using the group lasso is recommended; a sketch follows below.
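A hedged sketch of the group-lasso idea, assuming the grpreg package (not used elsewhere in this post): expand each variable into a spline basis, treat each basis as one group, and let the group lasso zero out whole variables at once. The factor education is omitted here for simplicity.

# Hedged sketch: group lasso over spline bases (assumes the grpreg package)
library(ISLR); library(splines)
library(grpreg)                            # install.packages("grpreg") if needed
data(Wage)
X <- cbind(ns(Wage$year, 4), ns(Wage$age, 5))
grp <- c(rep(1, 4), rep(2, 5))             # one group per original variable
cvfit <- cv.grpreg(X, Wage$wage, group=grp, penalty="grLasso")
coef(cvfit)                                # a variable whose whole group is zeroed is dropped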

 

4.2 Simulation Study

set.seed(1357)
MSE3 <- matrix(0, 100, 3)
for (i in 1:100) {
    # Train-Test split
    tran <- sample(nrow(Wage), size=floor(nrow(Wage)*2/3))
    test <- setdiff(1:nrow(Wage), tran)
    # Three nested GAMs: year excluded, linear in year, smooth in year
    g1 <- gam(wage ~ s(age, 5) + education, data=Wage, subset=tran)
    g2 <- gam(wage ~ year + s(age, 5) + education, data=Wage, subset=tran)
    g3 <- gam(wage ~ s(year, 4) + s(age, 5) + education, data=Wage, subset=tran)
    # Calculate test-set MSE for each model
    mse1 <- (wage - predict(g1, Wage))[test]^2
    mse2 <- (wage - predict(g2, Wage))[test]^2
    mse3 <- (wage - predict(g3, Wage))[test]^2
    MSE3[i,] <- c(mean(mse1), mean(mse2), mean(mse3))
}
apply(MSE3, 2, mean)

 

  • The resulting mean test MSEs (MSE3) : 1265.564, 1261.940, 1262.823
  • The model gam(wage ~ s(age, 5) + education), which drops year entirely, performs worst (highest test MSE).