## Support Vector Machines

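SVM defines its hyperplane in the same form as logistic regression. In logistic regression the decision boundary is given by $\theta^TX=0$, while SVM writes it as $w^TX+b=0$, where $b$ plays the role of $\theta_0$ in logistic regression. In form the two are no different. As noted earlier, though, their objectives differ: logistic regression looks at all the data points, whereas SVM focuses on the support vectors. Every supervised algorithm has a label variable $y$; logistic regression uses values in {0,1}, while SVM uses {-1,1} to make distance computations convenient.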

$w^T x_i + b \ge +1$ when $y_i = +1$

$w^T x_i + b \le -1$ when $y_i = -1$
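
Because the labels are in {-1,1}, the constraint for both classes can be written as the single condition $y_i(w^T x_i + b) \ge 1$. A minimal R sketch of checking that condition for a candidate hyperplane follows; the function name `satisfies_margin` and the toy data are hypothetical, introduced only for illustration.

```r
# Check the hard-margin constraint y_i * (w^T x_i + b) >= 1 for every sample.
# `satisfies_margin`, `X`, `w`, and `b` are illustrative names and values.
satisfies_margin <- function(X, y, w, b) {
  margins <- y * (as.vector(X %*% w) + b)  # functional margin of each sample
  all(margins >= 1)
}

# Toy 2-D data: two points per class, labels in {-1, +1}.
X <- matrix(c( 2,  2,
               3,  3,
              -2, -2,
              -3, -1), ncol = 2, byrow = TRUE)
y <- c(1, 1, -1, -1)

w <- c(0.5, 0.5)
b <- 0
ok <- satisfies_margin(X, y, w, b)
```

Scaling `w` down (for example to `c(0.1, 0.1)`) keeps the same separating line but violates the constraint, which is why the margin condition fixes the scale of `w` and `b`.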

## Machine Learning Ex 5.2 - Regularized Logistic Regression


Now we move on to the second part of Exercise 5.2, which requires implementing regularized logistic regression using Newton's Method.

Plot the data:

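    # Load the features and labels.
    x <- read.csv("ex5Logx.dat", header=FALSE)
    # Assumption: the labels are in "ex5Logy.dat", by analogy with the other
    # exercises; the original listing omits the line that reads y.
    y <- read.csv("ex5Logy.dat", header=FALSE)
    y <- y[,1]

    d <- data.frame(x1=x[,1], x2=x[,2], y=factor(y))
    require(ggplot2)
    p <- ggplot(d, aes(x=x1, y=x2)) +
        geom_point(aes(colour=y, shape=y))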
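
The Newton update itself is not shown in this excerpt. As a minimal sketch, assuming a design matrix whose first column is the intercept and an intercept term that is not penalized, the iteration could look like the following. The function name `newton_logistic_reg`, its default arguments, and the synthetic data are illustrative assumptions, not the author's code, and the polynomial feature mapping used in the exercise is left out.

```r
# Sigmoid function.
sigmoid <- function(z) 1 / (1 + exp(-z))

# Newton's method for L2-regularized logistic regression (illustrative sketch).
# X: m x n design matrix whose first column is all ones (intercept).
# y: 0/1 label vector. lambda: regularization strength.
# The intercept theta[1] is not penalized.
newton_logistic_reg <- function(X, y, lambda = 1, niter = 15) {
  m <- nrow(X)
  n <- ncol(X)
  theta <- rep(0, n)
  reg <- diag(c(0, rep(1, n - 1)), nrow = n)  # identity with 0 for the intercept
  for (i in seq_len(niter)) {
    h <- as.vector(sigmoid(X %*% theta))
    # Gradient of the regularized cost.
    grad <- as.vector(t(X) %*% (h - y)) / m + (lambda / m) * as.vector(reg %*% theta)
    # Hessian: X' diag(h * (1 - h)) X / m plus the regularization term.
    H <- t(X) %*% (X * (h * (1 - h))) / m + (lambda / m) * reg
    theta <- theta - as.vector(solve(H, grad))
  }
  theta
}

# Synthetic stand-in data (not the exercise files).
set.seed(1)
X <- cbind(1, matrix(rnorm(200), ncol = 2))
y <- as.numeric(X[, 2] + X[, 3] + rnorm(100) > 0)
theta <- newton_logistic_reg(X, y, lambda = 1)
```

Setting `lambda = 0` reduces this to plain Newton's method for logistic regression; larger values shrink every coefficient except the intercept.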


## Machine Learning Ex 5.1 - Regularized Linear Regression

The first part of Exercise 5.1 requires implementing a regularized version of linear regression.

Adding a regularization parameter can prevent over-fitting when fitting a high-order polynomial.

Plot the data:


    x <- read.table("ex5Linx.dat")
    y <- read.table("ex5Liny.dat")

    x <- x[,1]
    y <- y[,1]

    require(ggplot2)
    d <- data.frame(x=x, y=y)
    p <- ggplot(d, aes(x,y)) + geom_point(colour="red", size=3)
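
The excerpt does not show how the regularized fit is computed. One common option is the closed-form normal equation with a penalty on every coefficient except the intercept. The sketch below assumes that approach and a fifth-order polynomial; the function name `ridge_normal_eq`, the `degree` argument, and the synthetic data are hypothetical.

```r
# Closed-form regularized least squares (illustrative sketch):
#   theta = (X'X + lambda * L)^(-1) X'y
# where L is the identity matrix with a zero in the intercept position.
ridge_normal_eq <- function(x, y, lambda = 1, degree = 5) {
  X <- outer(x, 0:degree, `^`)                        # columns: 1, x, ..., x^degree
  L <- diag(c(0, rep(1, degree)), nrow = degree + 1)  # do not penalize the intercept
  as.vector(solve(t(X) %*% X + lambda * L, t(X) %*% y))
}

# Synthetic stand-in data (not the exercise files).
set.seed(42)
x <- seq(-1, 1, length.out = 7)
y <- sin(pi * x) + rnorm(7, sd = 0.1)

theta0 <- ridge_normal_eq(x, y, lambda = 0)   # unregularized polynomial fit
theta1 <- ridge_normal_eq(x, y, lambda = 1)   # regularized fit
```

Comparing `sum(theta0[-1]^2)` with `sum(theta1[-1]^2)` shows the effect of the penalty: the regularized coefficients are smaller, which gives a smoother curve.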

## Machine Learning Ex4 - Logistic Regression

Exercise 4 required implementing Logistic Regression using Newton's Method.

The dataset covers 80 students and their grades on two exams; 40 students were admitted to college and the other 40 were not. We need to implement a binary classification model that estimates college admission based on a student's scores on these two exams.

Plot the data:


    x <- read.table("ex4x.dat", header=F, stringsAsFactors=F)
    x <- cbind(rep(1, nrow(x)), x)
    colnames(x) <- c("X0", "Exam1", "Exam2")
    x <- as.matrix(x)

    y <- read.table("ex4y.dat", header=F, stringsAsFactors=F)
    y <- y[,1]

    ## plotting data
    d <- data.frame(x,
                    y = factor(y,
                               levels=c(0,1),
                               labels=c("Not admitted","Admitted")
                               )
                    )

    require(ggplot2)
    p <- ggplot(d, aes(x=Exam1, y=Exam2)) +
        geom_point(aes(shape=y, colour=y)) +
        xlab("Exam 1 score") +
        ylab("Exam 2 score")
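
A hand-written Newton's Method fit can be cross-checked against R's built-in `glm()`, which fits logistic regression by iteratively reweighted least squares. The sketch below uses that swapped-in technique rather than the exercise's own implementation, and it runs on made-up data, since the exam files are not reproduced here; the helper `boundary_exam2` is a hypothetical name.

```r
# Synthetic stand-in for the exam data (values are made up for illustration).
set.seed(123)
n <- 80
d <- data.frame(Exam1 = runif(n, 20, 80), Exam2 = runif(n, 20, 80))
d$y <- as.numeric(d$Exam1 + d$Exam2 + rnorm(n, sd = 15) > 100)

# Logistic regression via glm() with a binomial family.
fit <- glm(y ~ Exam1 + Exam2, data = d, family = binomial)
theta <- coef(fit)

# Predicted admission probability for a hypothetical student.
p_admit <- predict(fit, newdata = data.frame(Exam1 = 60, Exam2 = 60),
                   type = "response")

# Decision boundary: theta0 + theta1 * Exam1 + theta2 * Exam2 = 0,
# solved for Exam2 so it can be drawn as a line on the scatter plot.
boundary_exam2 <- function(exam1) unname(-(theta[1] + theta[2] * exam1) / theta[3])
```

On the real data, the coefficients from `glm()` should agree with a converged Newton's Method fit up to numerical tolerance, because both maximize the same likelihood.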

## Machine Learning Ex3 - Multivariate Linear Regression

Part 1. Finding alpha.
The first question to resolve in Exercise 3 is to pick a good learning rate alpha.

This requires making an initial selection, running gradient descent, and observing the cost function.

I tested alpha values ranging from 0.01 to 1.


    ## preparing data input.
    x <- read.table("ex3x.dat", header=F)
    y <- read.table("ex3y.dat", header=F)

    # normalize features using Z-score.
    x[,1] <- (x[,1] - mean(x[,1]))/sd(x[,1])
    x[,2] <- (x[,2] - mean(x[,2]))/sd(x[,2])

    x <- cbind(x0=rep(1, nrow(x)), x)
    x <- as.matrix(x)

    ## gradient descent algorithm.
    gradDescent_internal <- function(theta, x, y, m, alpha) {
        h <- sapply(1:nrow(x), function(i) t(theta) %*% x[i,])
        j <- t(h-y) %*% x
        grad <- 1/m * j
        theta <- t(theta) - alpha * grad
        theta <- t(theta)
        return(theta)
    }

    ## cost function: J = 1/(2m) * sum((h - y)^2).
    J <- function(theta, x, y, m) {
        h <- sapply(1:nrow(x), function(i) t(theta) %*% x[i,])
        j <- sum((h-y)^2)/(2*m)
        return(j)
    }

    ## calculate cost function J for every iteration at specific alpha value.
    testLearningRate <- function(x, y, alpha, niter=50) {
        j <- rep(0, niter)
        m <- nrow(x)
        theta <- matrix(rep(0, ncol(x)), ncol=1)
        for (i in 1:niter) {
            theta <- gradDescent_internal(theta, x, y, m, alpha)
            j[i] <- J(theta, x, y, m)
        }
        return(j)
    }

    ## test learning rate.
    alpha <- c(0.01, 0.03, 0.1, 0.3, 1)
    xxx <- sapply(alpha, testLearningRate, x=x, y=y)
    colnames(xxx) <- as.character(alpha)

    require(ggplot2)
    require(reshape2)  # provides melt()
    xxx <- melt(xxx)
    names(xxx) <- c("niter", "alpha", "J")
    p <- ggplot(xxx, aes(x=niter, y=J))
    p + geom_line(aes(colour=factor(alpha))) +
        xlab("Number of iterations") +
        ylab("Cost J")
