## Machine Learning Ex2 - Linear Regression

Thanks to the post by al3xandr3, I found OpenClassroom, and thanks to Andrew Ng and his lectures, I took my first course in machine learning. The videos are quite easy to follow. Exercise 2 requires implementing the gradient descent algorithm to fit a linear regression model to the data.

The algorithm is shown in the following figure:
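In case the figure does not render, here is the batch update rule as I reconstruct it from the R code that follows (so treat it as a restatement of the code, not a copy of the figure). The hypothesis is assumed to be $h_\theta(x) = \theta^T x$, with $\alpha$ the learning rate and $m$ the number of training examples:

$$
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
$$

All $\theta_j$ are updated simultaneously in each iteration, which is what `gradDescent_internal` does with a single vector expression.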

```
gradDescent <- function(x, y, alpha=0.07, niter=1500, eps=1e-9) {
    x <- cbind(rep(1, length(x)), x)
    theta.old <- rep(0, ncol(x))
    m <- length(y)
    for (i in 1:niter) {
        theta <- gradDescent_internal(theta.old, x, y, m, alpha)
        if (all(abs(theta - theta.old) <= eps)) {
            break
        } else {
            theta.old <- theta
        }
    }
    return(theta)
}

gradDescent_internal <- function(theta, x, y, m, alpha) {
    h <- sapply(1:nrow(x), function(i) theta %*% x[i,])
    j <- (h-y) %*% x
    grad <- 1/m * j
    theta <- theta - alpha * grad
    return(theta)
}
```

```
require(ggplot2)
x <- read.table("ex2x.dat", header=F)
y <- read.table("ex2y.dat", header=F)
x <- x[,1]
y <- y[,1]
p <- ggplot() + aes(x, y) + geom_point() + xlab("Age in years") + ylab("Height in meters")

theta <- gradDescent(x,y)

yy <- theta[1] + theta[-1] %*% t(x)
yy <- as.vector(yy)
predicted <- data.frame(x=x, y=yy)
p+geom_line(data=predicted, aes(x=x,y=y))
```

Finally, I explored how the cost function converges under gradient descent by evaluating it over a grid of parameter values and plotting the result.

```
## generate 100 values for each parameter
theta0 <- seq(-3,3, 6/99)
theta1 <- seq(-1,1, 2/99)
m <- length(y)

## cost J(theta0, theta1) at every grid point
j <- lapply(theta0, function(i)
    sapply(theta1, function(p)
        1/(2*m) * sum(((i+p*x)-y)^2)
    )
)

require(plyr)
require(reshape)  # provides melt(); its output columns are named X1, X2, value
J <- ldply(j)
J <- as.matrix(J)

rownames(J) <- as.character(theta0)
colnames(J) <- as.character(theta1)
j3d <- melt(J)

p <- ggplot(j3d, aes(x=X1,y=X2,z=value, fill=value)) +
    geom_tile()+geom_contour(bins=15) +
    xlab(expression(Theta[0])) +
    ylab(expression(Theta[1]))
print(p)
```
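The grid above shows the shape of the cost surface. To watch convergence itself, one option is to record the cost at every iteration. This is only a sketch: `cost_history`, `x_demo`, and `y_demo` are hypothetical names, the data are synthetic stand-ins for the exercise files, and the learning rate of 0.03 is an assumption chosen to be safe for this synthetic data.

```r
# Record J(theta) = 1/(2m) * sum((X theta - y)^2) at every iteration.
# cost_history is a hypothetical helper, not part of the original post.
cost_history <- function(x, y, alpha = 0.03, niter = 200) {
    X <- cbind(1, x)                 # prepend the intercept column
    m <- length(y)
    theta <- rep(0, ncol(X))
    J <- numeric(niter)
    for (i in seq_len(niter)) {
        resid <- as.vector(X %*% theta) - y
        J[i] <- sum(resid^2) / (2 * m)                           # cost before the update
        theta <- theta - alpha / m * as.vector(t(X) %*% resid)   # gradient step
    }
    J
}

# Synthetic data (assumed; not the ex2x.dat / ex2y.dat files)
set.seed(42)
x_demo <- runif(50, 2, 8)
y_demo <- 0.75 + 0.06 * x_demo + rnorm(50, sd = 0.01)
J_hist <- cost_history(x_demo, y_demo)
```

Plotting the result with `plot(J_hist, type = "l")` should give a curve that drops quickly and then flattens out as the parameters settle.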

1. OpenClassroom seems great!

2. I got the following error message:
"Fehler in alpha * grad : nicht-numerisches Argument für binären Operator" or, in English,
"Error in alpha * grad : non-numeric argument to binary operator"

ygc Reply:

The typo has been corrected.

3. I get this error message:

> theta <- gradDescent(x,y)
Error in if (all(abs(theta - theta.old) <= eps)) { :
missing value where TRUE/FALSE needed

And when I comment out the convergence section, the values of theta explode.

ygc Reply:

Did you load the data properly?
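For anyone hitting the same message: `if` raises it whenever its condition evaluates to `NA`, which happens here if `theta` contains `NA`, `NaN`, or infinite values (for example after the parameters have diverged, or if the data contain missing values). A small sketch of the mechanism, with made-up values and the hypothetical names `theta_old_demo`, `theta_new_demo`, and `check`:

```r
# Inf - Inf is NaN, and comparing NaN gives NA, which `if` cannot handle.
theta_old_demo <- c(Inf, -Inf)   # made-up values standing in for diverged parameters
theta_new_demo <- c(Inf, -Inf)
check <- all(abs(theta_new_demo - theta_old_demo) <= 1e-9)
is.na(check)     # the condition is NA, so `if (check)` would raise the error

# isTRUE() maps NA to FALSE, so a guarded test does not stop with an error
isTRUE(check)
```

Checking the inputs first with something like `stopifnot(is.numeric(x), !anyNA(x))` helps separate a data-loading problem from a diverging learning rate.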

ygc Reply:

You can also use the normal equation to solve this problem.

```
x0 = rep(1, length(x))
xx=matrix(c(x0,x), byrow=F, ncol=2)
solve(t(xx) %*% xx) %*% t(xx) %*% y
```
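As a sanity check, the normal-equation solution should agree with R's built-in `lm()`. A small sketch on synthetic data (the data and the names `x_ne`, `y_ne`, `X_ne`, `theta_ne`, and `theta_lm` are made up for illustration):

```r
# Synthetic data (assumed; not the exercise files)
set.seed(7)
x_ne <- runif(50, 2, 8)
y_ne <- 0.75 + 0.06 * x_ne + rnorm(50, sd = 0.01)

X_ne <- cbind(1, x_ne)                                    # design matrix with intercept column
theta_ne <- solve(crossprod(X_ne), crossprod(X_ne, y_ne)) # solves (X'X) theta = X'y
theta_lm <- coef(lm(y_ne ~ x_ne))                         # least-squares fit from lm()

print(cbind(normal_eq = as.vector(theta_ne), lm = unname(theta_lm)))
```

Passing the right-hand side to `solve()` avoids forming the explicit inverse of `t(X) %*% X`, which is usually the more numerically stable choice.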

4. You can compute h more directly in gradDescent_internal using matrix*vector multiplication. The entire calculation of the new theta can be written as
`theta = theta - alpha/m * t(x) %*% (x %*% theta - y);`
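That one-line update can be dropped into a complete loop. A minimal sketch, assuming synthetic data in place of the exercise files; `grad_descent_vec`, `x_gd`, `y_gd`, and `theta_gd` are hypothetical names, and the learning rate and iteration cap are assumptions chosen for this synthetic data:

```r
# Vectorized batch gradient descent (sketch, not the post's original code).
grad_descent_vec <- function(x, y, alpha = 0.03, niter = 50000, eps = 1e-9) {
    X <- cbind(1, x)                              # prepend the intercept column
    m <- length(y)
    theta <- matrix(0, nrow = ncol(X), ncol = 1)  # column vector of parameters
    for (i in seq_len(niter)) {
        theta_new <- theta - alpha / m * t(X) %*% (X %*% theta - y)
        converged <- all(abs(theta_new - theta) <= eps)
        theta <- theta_new
        if (converged) break                      # stop once the step is tiny
    }
    drop(theta)                                   # return a plain vector
}

# Synthetic data (assumed; not ex2x.dat / ex2y.dat)
set.seed(1)
x_gd <- runif(50, 2, 8)
y_gd <- 0.75 + 0.06 * x_gd + rnorm(50, sd = 0.01)
theta_gd <- grad_descent_vec(x_gd, y_gd)
print(theta_gd)
```

Compared with the `sapply` version, the whole hypothesis vector comes from a single `X %*% theta`, so there is no per-row loop.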

ygc Reply:

Yes, you are right.

I found that too, as defined in
http://ygc.name/2011/10/25/machine-learning-5-1-regularized-linear-regression/

```
## hypothesis function
h <- function(theta, x) {
    # sapply(1:m, function(i) theta %*% x[i,])
    toReturn <- x %*% t(theta)
    return(toReturn)
}
```

