[R] Elastic-Net Regression 1. Regularization Methods : Elastic-Net Lasso penalty function (\(l_1\)-norm) If \(p > n\), lasso selects at most \(n\) variables. Lasso is indifferent among highly correlated variables and tends to pick only one of them. Ridge penalty function (\(l_2\)-norm) It cannot perform variable selection. It shrinks the coefficients of correlated features toward each other. Elastic-net regularization Combines Lasso and Ridge \(p_{\lambda..
[LeetCode] 13. Roman to Integer 1. Description Roman numerals are represented by seven different symbols : I, V, X, L, C, D, and M. Symbol Value I 1 V 5 X 10 L 50 C 100 D 500 M 1000 For example, 2 is written as II in Roman numerals, just two ones added together. 12 is written as XII, which is simply X + II. The number 27 is written as XXVII, which is XX + V + II. Roman numerals are usually written largest to smallest from left t..
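A minimal sketch of the conversion described in the excerpt above, using the symbol table it lists. The excerpt is cut off before it reaches the subtractive cases, so the rule used here (a symbol placed before a larger one is subtracted, as in IV = 4) is the standard Roman-numeral convention rather than the post's own wording. The names `ROMAN_VALUES` and `roman_to_int` are illustrative.

```python
# Symbol values taken from the table in the excerpt.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(s: str) -> int:
    """Convert a valid Roman numeral string to an integer."""
    total = 0
    for i, symbol in enumerate(s):
        value = ROMAN_VALUES[symbol]
        # Assumption: standard subtractive notation. A symbol that is
        # smaller than its right-hand neighbour is subtracted (IV, IX, XL, ...).
        if i + 1 < len(s) and value < ROMAN_VALUES[s[i + 1]]:
            total -= value
        else:
            total += value
    return total
```

With this sketch, the examples from the excerpt (II, XII, XXVII) come out as 2, 12, and 27.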
[LeetCode] 9. Palindrome Number 1. Description Given an integer x, return true if x is a palindrome integer. An integer is a palindrome when it reads the same backward as forward. For example, 121 is a palindrome while 123 is not. constraints : \(-2^{31} \le x \le 2^{31} - 1\) [...] iteration1 : - x % 10 -> 1 - x // 10 -> 12 - backward += x % 10 -> 1 - backward = backward * 10 -> 10 iteration2 : - x % 10 -> 2 - x // 10 -> 1 - backward += x % 10 -> 12 Start If x < 0 or (x > 0 and x % 10 == 0), ret..
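The digit-reversal steps traced in the excerpt above (take `x % 10`, append it to `backward`, shrink `x` with `x // 10`) can be sketched as follows. This is one possible arrangement of those steps, not necessarily the post's exact code; the early return relies on the fact that negative numbers and nonzero multiples of 10 can never be palindromes. The name `is_palindrome` is illustrative.

```python
def is_palindrome(x: int) -> bool:
    """Return True if x reads the same backward as forward."""
    # Negative numbers and nonzero numbers ending in 0 cannot be palindromes.
    if x < 0 or (x > 0 and x % 10 == 0):
        return False

    backward = 0
    remaining = x
    while remaining > 0:
        # Shift the reversed digits left and append the last digit.
        backward = backward * 10 + remaining % 10
        # Drop the last digit from what is left to process.
        remaining //= 10
    return backward == x
```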
[LeetCode] 1. Two Sum 1. Description Given an array of integers nums and an integer target, return indices of two numbers such that they add up to target. You may assume that each input would have exactly one solution, and you may not use the same element twice. You can return the answer in any order. constraints : 2
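One common way to solve the problem stated in the excerpt above is a single pass with a dictionary that maps each value already seen to its index. This is a sketch of that approach, which may differ from the one the post goes on to describe; `two_sum` and `seen` are illustrative names.

```python
from typing import Dict, List


def two_sum(nums: List[int], target: int) -> List[int]:
    """Return indices of the two numbers in nums that add up to target."""
    seen: Dict[int, int] = {}  # value -> index where it was seen
    for i, number in enumerate(nums):
        j = seen.get(target - number)
        if j is not None:
            # The complement appeared earlier, so (j, i) is the answer.
            return [j, i]
        seen[number] = i
    # The problem guarantees exactly one solution, so this is a safeguard only.
    raise ValueError("no two numbers add up to target")
```

Storing the index only after the lookup ensures the same element is never used twice, which matches the constraint in the description.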
[pandas] Augmenting Pandas with SQLite 1. SQLite with Pandas The Pandas library works by storing data in memory. However, a DBMS such as SQLite processes data on disk. That is, while Pandas can only process data that fits in available memory, SQLite can process data that fits in available disk space. Extending this idea, processing data on a cloud server is attractive because disk space is much cheaper than memory. In most cases, data is extracted from a t..
[pandas] Processing Dataframes in Chunks 1. What are Chunks? Even after optimizing the data types of the data frame and selecting only the appropriate columns, the data set may still be too large to fit in memory. In that case, it is more efficient to process the data frame in chunks than to load all of it into memory. Only a portion of the rows should be held in memory at any given time. In other words, we need to process tasks ..
[R] Regularization Methods 1. Formula of Regularization Methods $$ Q_{\lambda}(\beta_0, \beta) = -l(\beta_0, \beta) + p_{\lambda}(\beta)$$ 2. The negative log-likelihood function Quantitative outcome: least squares loss function Binary outcome: logistic likelihood Matched case-control outcome: conditional logistic likelihood Count outcome: Poisson likelihood Qualitative outcome: Multinomial likelihood Survival outcome: Co..
[R] Simulation Study : Prediction Performance 1. How do we run a simulation study of prediction performance? If \(X\) is an \(n \times p\) (\(200 \times 2000\)) matrix, we need to find which variables are selected and which variables affect the target most (coefficients). So, we need to consider various models that predict well. - M1 : \(\hat{\beta}^{lasso} + \lambda_{min}\) - M2 : \(\hat{\beta}^{lasso} + \lambda_{1se}\) - M3 : \(\hat{\beta}^{lasso} + \lambda_{mi..
[pandas] Optimizing DataFrame's Memory 1. Estimating the amount of memory The Pandas DataFrame.info() method reports the non-null counts, dtypes, and memory usage of a data frame. Passing memory_usage='deep' gives a more accurate memory usage figure. import pandas as pd df = pd.read_csv('file.csv') df.info(memory_usage='deep') 1.1 Pandas BlockManager The Pandas BlockManager class optimizes data by type and stores it sepa..
[R] Useful Functions for Regression Problems 1. model.matrix() Make dummy variables from categories, with intercept : model.matrix(~., x) Make dummy variables from categories, without intercept : model.matrix(~., x)[, -1] Make predictions from a regsubsets, glm, or lm fit : model.matrix(~., x) %*% coef(g, id=i) Make predictions from glmnet : model.matrix(~., x) %*% coef(g, s=g$lambda[i]) The model.matrix function converts original data into categorical..
[R] Regularization Methods : Binary 1. Regularization Methods Regularization methods are based on a penalized likelihood : \(Q_{\lambda}(\beta_0, \beta) = -l(\beta_0, \beta) + p_{\lambda}(\beta)\) \((\hat{\beta_0}, \hat{\beta}) = \arg\min Q_{\lambda}(\beta_0, \beta)\) Penalized likelihood for a quantitative outcome Linear regression model : \(y_i = \beta_0 + x_i^T \beta + \epsilon_i\) l1-norm : \(\lambda \sum |\beta_j|\) l2-norm : \(\lambd..
[R] Variable Selection Methods : Lasso 1. Lasso Regression Ridge has the disadvantage of including all p predictors in the final model. What we want to do is variable selection. Lasso shrinks \(\hat{\beta}\) towards zero. \(RSS + \lambda\sum_{j=1}^{p}|\beta_j|\) The \(l_1\)-norm of \(\hat{\beta}\) : \(df(\hat{\beta}_{\lambda_1}) = 0
[R] Variable Selection Methods : Ridge 1. Variable Selection Methods We cannot use a subset selection model when \(n\) [...] \(> Var(\hat{\beta}^{sh})\) Examples Ridge Lasso Elastic Net : Ridge + Lasso 3. Ridge Regression \(RSS + \lambda\sum_{j=1}^{p}\beta_j^2\) where \(\lambda \ge 0\) is a tuning parameter. For a grid of \(\lambda\) : \(\lambda_{max} = \lambda_1 > ... > \lambda_m = \lambda_{min}\). The \(l_2\)-norm of \(\hat{\beta}\) : \(||\hat{\b..
[R] Best Subset Selection 1. Three classes of solving problems To solve the problem (variance becomes higher as the number of features grows), we need to make p lower than n. Subset Selection : Identify a subset of the p predictors that we believe to be related to the response. Shrinkage : Fit a model involving all p predictors, but the estimated coefficients are shrunk towards zero relative to the OLS estimates. Di..
[R] Linear Model 1. OLS (Ordinary Least Squares) model The linear regression model : \(Y = \beta_0 + \beta_1 X_1 + ... + \beta_p X_p + \epsilon\) OLS Ordinary least squares (OLS) is a type of linear least squares method for estimating the unknown parameters in a linear regression. All OLS coefficient estimates are unbiased. \(E(\hat{\beta}^{OLS}) = \beta\) \(Var(\hat{\beta}^{OLS}) ↓\) Problems in multiple li..
[R] Cross Validation 1. What is Cross Validation? In the real world, we can't get test data for \(MSE_{test}\). So we should divide the training data into a train set and a test set. Test-set error estimation Mathematical Adjustment : \(C_p\), \(AIC\), \(BIC\), Adjusted \(R^2\) Hold out : holding out a subset of the training set. Validation set approach K-fold Cross Validation LOOCV, LpOCV 2. Validation Set Approach Divide training set..
[R] Assessing Model Accuracy 1. How do we assess model accuracy? Quantitative : MSE (mean squared error) Qualitative : Classification error rate Types of dataset Training set : To fit statistical learning models Validation set : To select optimal tuning parameter Test set : To select the best model 2. MSE (Mean Squared Error) Suppose we fit a model \(\hat{f}(x)\) on the training dataset \((x_i, y_i)\). \(MSE_{train} = \frac{1..
[R] Flexibility and Interpretability 1. Parametric and Non-Parametric Methods Parametric methods : Make an assumption about the functional form or shape of \(f\). Non-Parametric methods : Do not make explicit assumptions about the functional form of \(f\). 2. Flexibility and Interpretability Flexibility : The flexibility of a model can be described as how strongly the model's behavior is influenced by characteristics of the data. So, if fl..
[R] Supervised Learning 1. Model based on Supervised Learning Ideal model : \(Y = f(X) + \epsilon\) Good \(f(X)\) can make predictions of \(Y\) at new points \(X = x\). Statistical Learning refers to a set of approaches for estimating the function \(f(X)\). # Indexing without index
[R] Introduction to Statistical Learning 1. Definitions of Statistical Learning Statistical Learning is a set of tools for modeling and understanding complex datasets. Supervised Statistical Learning builds a statistical model for predicting or estimating an output based on one or more inputs. Unsupervised Statistical Learning learns relationships and structure from data that has inputs but no supervising output. 2. Supervis..