Tag Archives: R

再不相信预编译的R

去年ubuntu下apt-get了R-3.0.2, 用了没多久就发现了system命令有问题,通常情况下调用系统命令是正常的,但是我调用bowtie的时候,就会报错:

Warning: Could not open read file "/tmp/8156.inpipe1" for reading; skipping...
Error: Encountered internal Bowtie 2 exception (#1)
Command: /usr/bin/bowtie2-align --wrapper basic-0 -p 12 -x /ssd/genomes/hg19 -S tmp.sam -1 /tmp/8156.inpipe1 -2 /tmp/8156.inpipe2

这很明显是因为fasta.gz文件,bowtie需要调用zcat来读的,在R里调用bowtie就找不到好基友zcat的输出管道。当时没在意,R不干,那就找shell。
Read more »

visualization methods in ChIPseeker

After two weeks developed, I have added/updated some plot functions in ChIPseeker (version >=1.0.1).

ChIP peaks over Chromosomes

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
> files=getSampleFiles()
> peak=readPeakFile(files[[1]])
> peak
GRanges with 1331 ranges and 2 metadata columns:
         seqnames                 ranges strand   |             V4        V5
            <rle>              <iranges>  <rle>   |       <factor> <numeric>
     [1]     chr1     [ 815092,  817883]      *   |    MACS_peak_1    295.76
     [2]     chr1     [1243287, 1244338]      *   |    MACS_peak_2     63.19
     [3]     chr1     [2979976, 2981228]      *   |    MACS_peak_3    100.16
     [4]     chr1     [3566181, 3567876]      *   |    MACS_peak_4    558.89
     [5]     chr1     [3816545, 3818111]      *   |    MACS_peak_5     57.57
     ...      ...                    ...    ... ...            ...       ...
  [1327]     chrX [135244782, 135245821]      *   | MACS_peak_1327     55.54
  [1328]     chrX [139171963, 139173506]      *   | MACS_peak_1328    270.19
  [1329]     chrX [139583953, 139586126]      *   | MACS_peak_1329    918.73
  [1330]     chrX [139592001, 139593238]      *   | MACS_peak_1330    210.88
  [1331]     chrY [ 13845133,  13845777]      *   | MACS_peak_1331     58.39
  ---
  seqlengths:
    chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
      NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
> plotChrCov(peak, weightCol="V5")
</numeric></factor></rle></iranges></rle>

chrCoverage
Read more »

ChIPseeker for ChIP peak annotation

ChIPpeakAnno WAS the only R package for ChIP peak annotation. I used it for annotating peak in my recent study.

I found it does not consider the strand information of genes. I reported the bug to the authors, but they are reluctant to change.

So I decided to develop my own package, ChIPseeker, and it's now available in Bioconductor.
Read more »

Bug of R package ChIPpeakAnno

I used R package ChIPpeakAnno for annotating peaks, and found that it handle the DNA strand in the wrong way. Maybe the developers were from the computer science but not biology background.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
> require(ChIPpeakAnno)
> packageVersion("ChIPpeakAnno")
[1] '2.10.0'
> peak < - RangedData(space="chr1", IRanges(24736757, 24737528))
> data(TSS.human.GRCh37)
> ap < - annotatePeakInBatch(peak, Annotation=TSS.human.GRCh37)
> ap
RangedData with 1 row and 9 value columns across 1 space
                     space               ranges |        peak      strand
                  <factor>            <iranges> | <character> </character><character>
1 ENSG00000001461        1 [24736757, 24737528] |           1           +
                          feature start_position end_position insideFeature
                      </character><character>      <numeric>    </numeric><numeric>   <character>
1 ENSG00000001461 ENSG00000001461       24742284     24799466      upstream
                  distancetoFeature shortestDistance fromOverlappingOrNearest
                          <numeric>        </numeric><numeric>              <character>
1 ENSG00000001461             -5527             4756             NearestStart
</character></numeric></character></numeric></character></iranges></factor>

In this example, I defined a peak ranging from chr1:24736757 to chr1:24737528 and annotated the peak using ChIPpeakAnno package.

It returns that the nearest gene is ENSG00000001461, whose gene symbol is NIPAL3.

?View Code RSPLUS
1
2
3
4
5
> require(org.Hs.eg.db)
> gene.ChIPpeakAnno < - select(org.Hs.eg.db, key=ap$feature, keytype="ENSEMBL", columns=c("ENSEMBL", "ENTREZID", "SYMBOL"))
> gene.ChIPpeakAnno
          ENSEMBL ENTREZID SYMBOL
1 ENSG00000001461    57185 NIPAL3

When looking at the peak in Genome Browser, I found the nearest gene is STPG1.
Screenshot 2014-01-13 22.00.46
Read more »

Run remote R in Emacs with ESS

Emacs is a great front-end for most of the command line tools. Although R-Studio is pretty good, I think Emacs/ESS is better. I’ve always used Emacs/ESS to run R, since 2007 on Ubuntu, on Windows, and on my MacBook Pro. It gives me the same experiences across all platforms. I love the way Emacs formatting source codes, and literate programming with Roxygen supported. Unfortunately, ESS does not suport displaying plots in Emacs buffer, which has been supported by imaxima.

As I need to log into the server remotely to run some computationally intensive tasks. I always write and test codes on my MacBook and copy the Rscript file to server by scp command. The Rscript file can be run through screen terminal or using nohup command.

I am wondering is it possible to write R script on my MacBook and send the command to the R running on server directly in Emacs/ESS. After reading the ESS manual, I figure out it is very easy.
Read more »