Thanks to the post by al3xandr3, I found OpenClassroom, and thanks to Andrew Ng and his lectures, I took my first course in machine learning. The videos are quite easy to follow. Exercise 2 requires implementing the gradient descent algorithm to fit a linear regression model to data.
The algorithm was shown in the following figure:
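In standard notation, and assuming the usual batch formulation for linear regression (which is what the code in this post computes), the hypothesis, the cost function, and the update rule can be written as:

```latex
h_\theta(x) = \theta^{T} x, \qquad
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m}
    \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\quad \text{(simultaneously for all } j \text{)}
```

Here m is the number of training examples and alpha is the learning rate; both correspond to the `m` and `alpha` arguments in the R code.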

gradDescent <- function(x, y, alpha=0.07, niter=1500, eps=1e-9) {
    ## prepend a column of ones for the intercept term
    x <- cbind(rep(1, length(x)), x)
    theta.old <- rep(0, ncol(x))
    m <- length(y)
    for (i in 1:niter) {
        theta <- gradDescent_internal(theta.old, x, y, m, alpha)
        ## stop once no component of theta changes by more than eps
        if (all(abs(theta - theta.old) <= eps)) {
            break
        } else {
            theta.old <- theta
        }
    }
    return(theta)
}

gradDescent_internal <- function(theta, x, y, m, alpha) {
    ## hypothesis h = theta' x for every training example
    h <- sapply(1:nrow(x), function(i) theta %*% x[i,])
    j <- (h-y) %*% x
    grad <- 1/m * j
    theta <- theta - alpha * grad
    return(theta)
}

require(ggplot2)

x <- read.table("ex2x.dat", header=F)
y <- read.table("ex2y.dat", header=F)
x <- x[,1]
y <- y[,1]

p <- ggplot() + aes(x, y) + geom_point() +
    xlab("Age in years") + ylab("Height in meters")

theta <- gradDescent(x,y)
yy <- theta[1] + theta[-1] %*% t(x)
yy <- as.vector(yy)
predicted <- data.frame(x=x, y=yy)
p+geom_line(data=predicted, aes(x=x,y=y))

Finally, I explored how the cost function converges under the gradient descent algorithm.
## generate 100 values for each parameter
theta0 <- seq(-3,3, 6/99)
theta1 <- seq(-1,1, 2/99)
m <- length(y)
## cost J(theta0, theta1) evaluated on the grid
j <- lapply(theta0, function(i)
            sapply(theta1, function(p)
                   1/(2*m) * sum(((i+p*x)-y)^2)
                   )
            )

require(plyr)
## melt() is assumed to come from the reshape package,
## which names the index columns X1 and X2
require(reshape)
J <- ldply(j)
J <- as.matrix(J)
rownames(J) <- as.character(theta0)
colnames(J) <- as.character(theta1)
j3d <- melt(J)

p <- ggplot(j3d, aes(x=X1,y=X2,z=value, fill=value)) +
    geom_tile()+geom_contour(bins=15) +
    xlab(expression(Theta[0])) + ylab(expression(Theta[1]))
print(p)
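
The contour plot shows the shape of the cost surface. To watch the cost actually fall during the descent, one option is to record J after every update. The following is only a sketch: the data are synthetic (an assumption, since ex2x.dat and ex2y.dat are not included here), and `cost` and `J.hist` are names made up for the illustration.

```r
# Synthetic data (an assumption; the post reads ex2x.dat and ex2y.dat)
set.seed(1)
x <- runif(50, 0, 1)
y <- 1 + 2 * x + rnorm(50, sd = 0.1)

X <- cbind(1, x)           # design matrix with an intercept column
m <- length(y)
alpha <- 0.07              # same default learning rate as gradDescent
niter <- 1500

# cost J(theta) = 1/(2m) * sum((X theta - y)^2)
cost <- function(theta) sum((X %*% theta - y)^2) / (2 * m)

theta <- c(0, 0)
J.hist <- numeric(niter)   # hypothetical name: cost after each update
for (i in seq_len(niter)) {
    theta <- theta - alpha / m * as.vector(t(X) %*% (X %*% theta - y))
    J.hist[i] <- cost(theta)
}

# the curve should fall steeply at first and then flatten out
plot(J.hist, type = "l", xlab = "Iteration", ylab = "Cost J")
```

If the curve rises instead of falling, the learning rate is probably too large for the scale of the data.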


OpenClassroom seems great!
Reply
I got the following error message:
“Fehler in alpha * grad : nicht-numerisches Argument für binären Operator” or, in English:
“error in alpha * grad : non-numeric argument for binary operator”
Reply
ygc
Reply:
October 13th, 2011 at 1:02 pm
The typo has been corrected.
Reply
I get this error message:
> theta <- gradDescent(x,y)
Error in if (all(abs(theta - theta.old) <= eps)) { :
missing value where TRUE/FALSE needed
And when I comment out the convergence section, the values of theta explode.
Reply
ygc
Reply:
October 16th, 2011 at 8:30 pm
Did you load the data properly?
Reply
ygc
Reply:
October 16th, 2011 at 9:04 pm
You can also use the normal equation to solve this problem.
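A minimal sketch of that approach, assuming synthetic data in place of ex2x.dat and ex2y.dat; `theta.ne` and `theta.lm` are names chosen for the illustration. It solves the linear system (X'X) theta = X'y directly and compares the result with R's built-in `lm()`:

```r
# Synthetic data (an assumption; the post reads ex2x.dat and ex2y.dat)
set.seed(42)
x <- runif(50, 2, 8)
y <- 0.75 + 0.06 * x + rnorm(50, sd = 0.02)

X <- cbind(1, x)   # add the intercept column

# Normal equation: solve (X'X) theta = X'y without forming the inverse
theta.ne <- solve(t(X) %*% X, t(X) %*% y)

# Cross-check against R's built-in least-squares fit
theta.lm <- coef(lm(y ~ x))
print(cbind(normal.eq = as.vector(theta.ne), lm = theta.lm))
```

No learning rate or iteration count is needed, which makes this a handy check on the gradient descent result for small problems.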
Reply
You can compute h more directly in gradDescent_internal using matrix*vector multiplication. The entire calculation of the new theta can be written as
theta = theta - alpha/m * t(x) %*% (x %*% theta - y);
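As a rough sketch of how that one-liner relates to the row-by-row version in the post (the function names `gradStep` and `gradStepLoop` and the synthetic data are assumptions made for the illustration), the two updates can be compared directly:

```r
# Vectorized update, following the one-line formula above
gradStep <- function(theta, X, y, alpha) {       # hypothetical name
    m <- length(y)
    as.vector(theta - alpha / m * t(X) %*% (X %*% theta - y))
}

# Row-by-row update in the style of gradDescent_internal
gradStepLoop <- function(theta, X, y, alpha) {   # hypothetical name
    m <- length(y)
    h <- sapply(1:nrow(X), function(i) theta %*% X[i, ])
    as.vector(theta - alpha / m * ((h - y) %*% X))
}

# Synthetic data (an assumption, not the exercise data)
set.seed(1)
x <- runif(20)
y <- 1 + 2 * x + rnorm(20, sd = 0.1)
X <- cbind(1, x)
theta0 <- c(0.5, -0.5)

# compare one update step from the same starting point
all.equal(gradStep(theta0, X, y, 0.07), gradStepLoop(theta0, X, y, 0.07))
```

Both functions return a plain numeric vector, which avoids carrying a 1 x n matrix from one iteration to the next.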
Reply
ygc
Reply:
October 31st, 2011 at 9:59 am
Yes, you are right.
I found that too, as defined in
http://ygc.name/2011/10/25/machine-learning-5-1-regularized-linear-regression/
## hypothesis function
h <- function(theta, x) {
    #sapply(1:m, function(i) theta %*% x[i,])
    toReturn <- x %*% t(theta)
    return(toReturn)
}