Tag Archives: R

ChIPseeker for ChIP peak annotation

ChIPpeakAnno WAS the only R package for ChIP peak annotation. I used it for annotating peaks in my recent study.

I found that it does not consider the strand information of genes. I reported the bug to the authors, but they were reluctant to change it.

So I decided to develop my own package, ChIPseeker, and it's now available in Bioconductor.

Bug of R package ChIPpeakAnno

I used the R package ChIPpeakAnno for annotating peaks, and found that it handles the DNA strand in the wrong way. Maybe the developers come from a computer science background rather than a biology background.

> require(ChIPpeakAnno)
> packageVersion("ChIPpeakAnno")
[1] '2.10.0'
> peak <- RangedData(space="chr1", IRanges(24736757, 24737528))
> data(TSS.human.GRCh37)
> ap <- annotatePeakInBatch(peak, Annotation=TSS.human.GRCh37)
> ap
RangedData with 1 row and 9 value columns across 1 space
                     space               ranges |        peak      strand
                  <factor>            <IRanges> | <character> <character>
1 ENSG00000001461        1 [24736757, 24737528] |           1           +
                          feature start_position end_position insideFeature
                      <character>      <numeric>    <numeric>   <character>
1 ENSG00000001461 ENSG00000001461       24742284     24799466      upstream
                  distancetoFeature shortestDistance fromOverlappingOrNearest
                          <numeric>        <numeric>              <character>
1 ENSG00000001461             -5527             4756             NearestStart

In this example, I defined a peak ranging from chr1:24736757 to chr1:24737528 and annotated it using the ChIPpeakAnno package.

It reports that the nearest gene is ENSG00000001461, whose gene symbol is NIPAL3.

> require(org.Hs.eg.db)
> gene.ChIPpeakAnno <- select(org.Hs.eg.db, keys=ap$feature, keytype="ENSEMBL", columns=c("ENSEMBL", "ENTREZID", "SYMBOL"))
> gene.ChIPpeakAnno
          ENSEMBL ENTREZID SYMBOL
1 ENSG00000001461    57185 NIPAL3

However, when I looked at the peak in the Genome Browser, I found that the nearest gene is actually STPG1.
Screenshot 2014-01-13 22.00.46
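To illustrate why strand matters for this kind of annotation, here is a minimal base-R sketch. It is not ChIPpeakAnno's or ChIPseeker's code, and it makes no claim about how either package works internally. It only assumes the usual convention that the transcription start site (TSS) is the start coordinate of a "+" strand gene and the end coordinate of a "-" strand gene. The NIPAL3 coordinates and the peak come from the output above; the second gene and its coordinates are hypothetical, made up purely for illustration.

```r
# TSS by strand: start of a "+" gene, end of a "-" gene.
tss <- function(start, end, strand) ifelse(strand == "+", start, end)

genes <- data.frame(
  gene   = c("NIPAL3", "hypothetical_minus_gene"),  # second gene is made up
  start  = c(24742284, 24700000),                   # second row: hypothetical
  end    = c(24799466, 24740000),                   # second row: hypothetical
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

peak_start <- 24736757
peak_end   <- 24737528

# Gap between the peak and a position (0 if the position lies inside the peak).
gap_to_peak <- function(pos) pmax(peak_start - pos, pos - peak_end, 0)

genes$dist_strand_aware <- gap_to_peak(tss(genes$start, genes$end, genes$strand))
genes$dist_start_only   <- gap_to_peak(genes$start)  # ignores strand

nearest_aware <- genes$gene[which.min(genes$dist_strand_aware)]
nearest_naive <- genes$gene[which.min(genes$dist_start_only)]
print(genes)
```

With these made-up coordinates, the start-only distance picks NIPAL3, while the strand-aware distance picks the minus-strand gene whose TSS (its end coordinate) sits closer to the peak.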

Run remote R in Emacs with ESS

Emacs is a great front end for most command-line tools. Although RStudio is pretty good, I think Emacs/ESS is better. I have used Emacs/ESS to run R since 2007, on Ubuntu, on Windows, and on my MacBook Pro, and it gives me the same experience on every platform. I love the way Emacs formats source code and its support for literate programming with Roxygen. Unfortunately, ESS does not support displaying plots inside an Emacs buffer, a feature that imaxima does provide.

I need to log into a remote server to run some computationally intensive tasks. I always write and test code on my MacBook and copy the Rscript file to the server with the scp command. The Rscript file can then be run in a screen session or with the nohup command.

I wondered whether it is possible to write an R script on my MacBook and send commands directly to R running on the server from within Emacs/ESS. After reading the ESS manual, I figured out that it is very easy.

project euler -- problem 68

Consider the following "magic" 3-gon ring, filled with the numbers 1 to 6, and each line adding to nine.

Working clockwise, and starting from the group of three with the numerically lowest external node (4,3,2 in this example), each solution can be described uniquely. For example, the above solution can be described by the set: 4,3,2; 6,2,1; 5,1,3.

It is possible to complete the ring with four different totals: 9, 10, 11, and 12. There are eight solutions in total.

Total   Solution Set
9       4,2,3; 5,3,1; 6,1,2
9       4,3,2; 6,2,1; 5,1,3
10      2,3,5; 4,5,1; 6,1,3
10      2,5,3; 6,3,1; 4,1,5
11      1,4,6; 3,6,2; 5,2,4
11      1,6,4; 5,4,2; 3,2,6
12      1,5,6; 2,6,4; 3,4,5
12      1,6,5; 3,5,4; 2,4,6

By concatenating each group it is possible to form 9-digit strings; the maximum string for a 3-gon ring is 432621513.

Using the numbers 1 to 10, and depending on arrangements, it is possible to form 16- and 17-digit strings. What is the maximum 16-digit string for a "magic" 5-gon ring?
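As a sanity check on the 3-gon example, here is a small brute-force sketch in base R. It is not the solution to the 5-gon question, just one way to reproduce the table above. It tries every arrangement of 1 to 6, keeps the rings whose three lines share a total, and starts each description from the lowest external node. The helper names (`permutations`, `magic_3gon`) are my own, not from any package.

```r
# All permutations of a vector, one per row (base R only).
permutations <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], permutations(v[-i]))
  }))
}

magic_3gon <- function() {
  perms <- permutations(1:6)
  found <- list()
  for (k in seq_len(nrow(perms))) {
    ext <- perms[k, 1:3]  # external nodes, clockwise
    int <- perms[k, 4:6]  # internal nodes, clockwise
    # Each line is one external node followed by two adjacent internal nodes.
    groups <- rbind(c(ext[1], int[1], int[2]),
                    c(ext[2], int[2], int[3]),
                    c(ext[3], int[3], int[1]))
    totals <- rowSums(groups)
    # Keep magic rings only; requiring the first external node to be the
    # lowest removes rotations of the same ring.
    if (length(unique(totals)) == 1 && ext[1] == min(ext)) {
      found[[length(found) + 1]] <- data.frame(
        total  = totals[1],
        string = paste(t(groups), collapse = ""),
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, found)
}

sol <- magic_3gon()
print(sol)
best <- sol$string[which.max(as.numeric(sol$string))]
print(best)
```

The result should list the eight rings from the table and report 432621513 as the largest string. The same idea extends to the 5-gon ring, although enumerating all arrangements of 1 to 10 this way is much slower.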


modified wp-codebox to highlight R code as in Pretty-R

Two years ago I found that wp-codebox could highlight R code. This plugin uses GeSHi internally to highlight source code.

Now there are many ways to highlight R syntax on a website. Pretty-R, provided by Inside-R, is a popular tool in the community.

I like the color style of Pretty-R more than the one provided by GeSHi. GeSHi also links functions to the online manuals; this feature is very helpful for those not familiar with R. But I found that many of the keywords are not linked properly.

I modified wp-codebox to color functions as Pretty-R does and to link them to the documentation on Inside-R. The external links work fine, and the syntax is now highlighted exactly as in Pretty-R, as you can see in my previous post.

The modified file can be downloaded from github.