1. What is a Classification Problem?
A binary classification problem has two discrete outputs, zero and one, where zero is the 'negative' output and one is the 'positive' output. For example, in spam mail classification, zero means the mail is not spam and one means the mail is spam.
$$ y \in \{0, 1\} $$
Multiclass classification has multiple discrete outputs.
$$ y \in \{0, 1, 2, \dots\} $$
2. Logistic Regression
We use the logistic function, also called the 'sigmoid function'.
$$ h_{\theta }(x)=g(\theta ^Tx) $$
$$ z=\theta ^Tx $$
$$ g(z)=\frac{1}{1+e^{-z}} $$
Function \(g(z)\) maps any real number to the \((0, 1)\) interval, making it useful for transforming an arbitrary-valued function into a function better suited for classification.
Using logistic regression, \(h_{\theta}(x)\) gives us the probability that our output is 1.
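Here is a minimal sketch of this hypothesis in Python with NumPy (the function and variable names are mine, not from the original post):

```python
import numpy as np

def sigmoid(z):
    """Map any real value z to the (0, 1) interval."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): estimated probability that y = 1."""
    return sigmoid(theta @ x)
```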
3. Predicting Discrete Output
Thinking about logistic regression's sigmoid function, we can see that \(g(z) \ge 0.5\) when \(z \ge 0\), and \(g(z) < 0.5\) when \(z < 0\).
Recall what we know about the sigmoid function:
$$ h_{\theta }(x)=g(\theta ^Tx) $$
$$ z=\theta ^Tx $$
$$ g(z)=\frac{1}{1+e^{-z}} $$
From this we get the following decision rule (a small code sketch follows the list):
- \(y=1\) when \(h_{\theta}(x) \ge 0.5\), i.e. when \(\theta^T x \ge 0\)
- \(y=0\) when \(h_{\theta}(x) < 0.5\), i.e. when \(\theta^T x < 0\)
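A small illustrative helper, assuming the `hypothesis` sketch above, that turns the probability into a discrete prediction using the 0.5 threshold:

```python
def predict(theta, x):
    """Return 1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0, else 0."""
    return 1 if hypothesis(theta, x) >= 0.5 else 0
```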
4. Decision Boundary
For example, suppose we are given the parameter vector \(\theta\) below; logistic regression then gives the following insight.
$$ \theta =\begin{bmatrix}-3\\1\\1\end{bmatrix} $$
$$ y=1 \quad \text{when} \quad \theta^T x = -3 + x_1 + x_2 \ge 0 $$
So we can predict whether an example is zero or one by checking which side of the line \(x_1 + x_2 = 3\) it falls on. This line is the decision boundary separating the \(y=0\) and \(y=1\) predictions, as sketched below. If we face a harder problem that needs a polynomial (non-linear) decision boundary, we can add polynomial features and apply exactly the same logistic regression insight.
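A toy numerical check of this boundary for the example \(\theta\) above (assuming a bias feature \(x_0 = 1\) is prepended to each example):

```python
import numpy as np

theta = np.array([-3.0, 1.0, 1.0])       # [theta_0, theta_1, theta_2]

for x1, x2 in [(1.0, 1.0), (2.0, 2.0)]:  # one point on each side of x1 + x2 = 3
    x = np.array([1.0, x1, x2])          # prepend x_0 = 1
    z = theta @ x                        # -3 + x1 + x2
    print((x1, x2), "-> y = 1" if z >= 0 else "-> y = 0")
```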
5. Logistic Regression Cost Function
Logistic regression has a training set with labels \(y\), and each \(x\) is a feature vector with \(n\) features.
- Training set : \((x^{(1)},y^{(1)}),\ (x^{(2)},y^{(2)}),\ \dots,\ (x^{(m)},y^{(m)})\)
- \(m\) examples, each with \(x=\begin{bmatrix}x_0\\x_1\\\vdots\\x_n\end{bmatrix}\)
- \(h_{\theta}(x)=\dfrac{1}{1+e^{-\theta^T x}}\)
With the sigmoid hypothesis, the squared-error cost from linear regression is no longer convex, so we need a new cost model with a convex shape.
$$ \left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2 \to \mathrm{Cost}(h_{\theta}(x),y) $$
The cost function has a convex shape, so we can find the optimal \(h_{\theta}(x)\), when the model is as below:
$$ \mathrm{Cost}(h_{\theta}(x),y)=\begin{cases}-\log\left(h_{\theta}(x)\right) & \text{if } y=1\\-\log\left(1-h_{\theta}(x)\right) & \text{if } y=0\end{cases} $$
Combining the two cases into one expression gives the final logistic regression cost function, which is derived from maximum likelihood estimation.
$$ J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_{\theta}(x^{(i)})+(1-y^{(i)})\log\left(1-h_{\theta}(x^{(i)})\right)\right] $$
To fit the parameter \(\theta\), we find the minimum of \(J(\theta)\). After that calculation, we can make a prediction for a new \(x\), which is an estimated probability that \(y = 1\). A vectorized sketch of this cost is below.
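A minimal NumPy sketch of \(J(\theta)\), assuming `X` is the \(m \times (n+1)\) design matrix with a leading column of ones and `y` holds labels in {0, 1} (all names are illustrative):

```python
import numpy as np

def cost(theta, X, y):
    """Logistic regression cost J(theta) for design matrix X and binary labels y."""
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))   # h_theta(x^(i)) for every example
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```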
6. Gradient Descent
To find the parameter \(\theta\) that minimizes \(J(\theta)\), we use gradient descent, whose update rule has the same form as in linear regression.
$$ \theta_j := \theta_j-\alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1,\theta_2,\dots,\theta_n) $$
$$ \theta_j := \theta_j-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_j^{(i)} $$
$$ j=0,\dots,n $$
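A minimal gradient descent loop implementing the update above (a sketch with assumed names; `alpha` is the learning rate, `X` and `y` as in the cost sketch):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Repeat theta_j := theta_j - alpha * (1/m) * sum((h - y) * x_j) for all j simultaneously."""
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(iterations):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))   # predictions for all m examples
        theta -= alpha * (X.T @ (h - y)) / m     # vectorized gradient step
    return theta
```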
7. Multiclass Classification
Let's consider the case where \(y\) has more than two categories. Since \(y \in \{0, 1, 2, \dots, n\}\), we divide our problem into \(n + 1\) binary classification problems, each of the same form as \(y \in \{0, 1\}\).
$$ y\in\{0,1,\dots,n\} $$
$$ h_{\theta}^{(0)}(x)=P(y=0\mid x;\theta) $$
$$ h_{\theta}^{(1)}(x)=P(y=1\mid x;\theta) $$
$$ \vdots $$
$$ h_{\theta}^{(n)}(x)=P(y=n\mid x;\theta) $$
$$ \text{prediction}=\max_i\left(h_{\theta}^{(i)}(x)\right) $$
Applying a binary decision for each class, we can treat the multiclass problem as "one vs all": train one classifier per class and predict the class with the highest probability. A small sketch of the prediction step follows.
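A sketch of the one-vs-all prediction step, assuming `all_theta` stacks one trained parameter vector per class (row `i` is the classifier for \(y = i\)); the names are illustrative:

```python
import numpy as np

def predict_one_vs_all(all_theta, x):
    """Evaluate every binary classifier h^(i)(x) and return the class with the highest probability."""
    probs = 1.0 / (1.0 + np.exp(-(all_theta @ x)))   # one probability per class
    return int(np.argmax(probs))
```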