clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters

Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler, that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species: humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under the Artistic-2.0 License within the Bioconductor project. The source code and vignette are freely available at http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html

Yu, Guangchuang, Li-Gen Wang, Yanyan Han, and Qing-Yu He. clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters. OMICS: A Journal of Integrative Biology 16, no. 5 (May 2012): 284–287.

28 Comments.

  1. Hello! When I install Reactome, I get:
    Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
    cannot open the connection
    In addition: Warning messages:
    1: In download.file(url, destfile, method, mode = "wb", ...) :
    downloaded length 70724 != reported length 265171851
    2: In download.file(url, destfile, method, mode = "wb", ...) :
    downloaded length 121849 != reported length 124549
    3: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
    4: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
    cannot open compressed file 'reactome.db/DESCRIPTION', probable reason 'No such file or directory'

    How can I fix this? Thanks!

    ygc replied:

    This is a problem with reactome.db. Did you install it through biocLite?

  2. Hello, I'm trying to use clusterProfiler to analyze a dataset but it is returning an error I don't understand:

    GOtest <- groupGO(gene = test[[1]], organism = "yeast", ont = "CC", level = 2, readable = TRUE)
    Error in .testIfKeysAreOfProposedKeytype(x, keys, keytype) :
    None of the keys entered are valid keys for the keytype specified.

    Any help with how to address this error would be greatly appreciated.

    ygc replied:

    Yeast is supported only in the current devel version (Bioconductor 2.13), and the gene IDs should be ORF IDs.

  3. Hi, thank you for getting back to me. I am using the current devel version of bioc. Could you be more specific about what ORF ID I should use?

    When I use the entrez gene id I get the same error:
    test12 <- groupGO(gene=test12entrez[[1]], organism = "yeast", ont="CC", level = 2, readable = TRUE)
    Error in .testIfKeysAreOfProposedKeytype(x, keys, keytype) :
    None of the keys entered are valid keys for the keytype specified.

    When I use either the SGD id or the systematic ORF id I get this error:
    Warning message:
    In EXTID2NAME(genes, organism = organism) :
    the input geneID is not entrezgeneID, and cannot be mapped

    Thank you again for your help

    ygc replied:

    Please send me a few of your sample IDs.

    Rachel replied:

    ORF ID: "YAL015C" "YAL018C" "YAL037W" "YAL047C" "YAL059W" "YAL067C"

    SGD ID: "S000003725" "S000006064" "S000006435" "S000004592" "S000000067" "S000007362"

    entrez gene ID: "13393612" "854834" "853357" "853479" "1466473" "854944"

    Thank you!!

    ygc replied:

    Set readable=FALSE. Mapping from yeast ORF IDs to gene symbols is not supported yet, but will be soon.
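
Since the errors in this exchange come down to which kind of identifier is passed in, a quick format check before calling groupGO can help. The following is a minimal base-R sketch, not part of clusterProfiler: the helper name `classify_yeast_id` is hypothetical, and the patterns are assumptions based on the sample IDs above (nuclear systematic ORF names such as YAL015C, SGD IDs as "S" plus nine digits, Entrez IDs as plain digits). It checks the shape of an ID only, not whether the ID exists in any annotation database.

```r
# Hypothetical helper: guess the type of a yeast gene identifier from its format.
# Patterns are assumptions drawn from the sample IDs in this thread.
classify_yeast_id <- function(ids) {
  ifelse(grepl("^Y[A-P][LR][0-9]{3}[WC](-[A-Z])?$", ids), "ORF",    # e.g. YAL015C
  ifelse(grepl("^S[0-9]{9}$", ids), "SGD",                          # e.g. S000003725
  ifelse(grepl("^[0-9]+$", ids), "ENTREZ",                          # e.g. 854834
         "unknown")))
}

ids <- c("YAL015C", "S000003725", "854834", "foo")
print(data.frame(id = ids, type = classify_yeast_id(ids)))
```

If most of a gene list comes back as anything other than "ORF", the list probably needs converting before it is used with organism = "yeast", going by the advice above.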

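  4. Hello, when I run ck<-compareCluster(geneClusters = clslist,fun='enrichKEGG') I get the error "The following `from` values were not present in `x`: .id", whereas running cg<-compareCluster(geneClusters = clslist,fun='enrichGO') for GO analysis works fine. Is this because the gene clusters contain too few genes?

    ygc replied:

    How few? Judging from the error message, no genes at all were annotated. The KEGG data is fairly old, because KEGG now charges for access, so annotations will be missing for some genes.

    zkwabm replied:

    The cluster has 130 genes in total, split into 8 subclusters. When all of these genes are analyzed together for KEGG enrichment there is no problem, and 3 pathways are enriched.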

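  5. My previous reply got lost, so I'll write it again.
    When I run enrichKEGG on all 130 genes together, it works and three pathways are enriched;
    after splitting them into subclusters, running enrichKEGG on each subcluster separately returns NULL every time;
    running compareCluster with enrichKEGG gives the error I described the first time. Each subcluster has as few as eight or nine genes and at most twenty-odd.

    ygc replied:

    That really is too few. Also, by default the function only tests pathways whose gene set size is greater than 5; with only 8 or 9 genes, NULL is to be expected.

    You can set the parameter minGSSize = 1. If the result is still NULL, raise pvalueCutoff and qvalueCutoff; at the very least you will get the gene-to-pathway annotation.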
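
Why a pooled gene list can show enriched pathways while its small subclusters show none can be sketched with a one-sided hypergeometric test, the kind of test commonly used for over-representation analysis. This is a base-R illustration only, not the package's own code, and the background size, pathway size, and hit counts are made-up numbers chosen so that the small cluster and the pooled list have a similar share of pathway genes.

```r
# Illustrative numbers only (assumptions, not taken from the thread's data)
N <- 6000   # annotated background genes
M <- 60     # genes annotated to one pathway

# P(X >= k): probability of seeing at least k pathway genes among n drawn genes
enrich_p <- function(k, n, M, N) {
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

p_small  <- enrich_p(k = 1, n = 8,   M = M, N = N)  # one subcluster of 8 genes
p_pooled <- enrich_p(k = 6, n = 130, M = M, N = N)  # all 130 genes together

print(c(small = p_small, pooled = p_pooled))
```

With so few genes, a single hit carries little evidence, so the small cluster's p-value stays well above that of the pooled list even before any multiple-testing adjustment or minimum-size filter is applied.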

    zkwabm replied:

    Thanks for the patient explanation!

    grace replied:

    I'm using enrichGO. Each of my subclusters has more than 500 genes, yet every one returns 0 rows, whereas putting all 8000+ genes together yields quite a few enriched pathways. Why is that?

  6. why clusterProfiler fails | YGC - pingback on August 7, 2014 at 1:46 pm
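  7. Why does enrichKEGG give me the error max(p)>1 min(P)<0? My data are all Entrez gene IDs, and there are no NA values.

    Also, why does enrichGO fail when the input is small but work when the input is large? It's so strange. I've been working on this for two days and still don't understand it. Please help.

    ygc replied:

    In most cases the cause is incorrect usage. I don't have your data, so I can't reproduce your error. First, make sure you are using the latest release version. Second, as I suggested to another commenter above, set minGSSize=1, pvalueCutoff=1, qvalueCutoff=1, run it once, and look at the result.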

    grace replied:

    Using Bioconductor version 2.12 (BiocInstaller 1.10.4), R version 3.0.2.

    It doesn't work either. Can you add my QQ:491805841 for discussion? Thanks a lot.

    ygc replied:

    You should update R and Bioconductor. The current release is version 2.14.

    Even if I find a bug in an old release, it won't be fixed, since Bioconductor only builds the current release and devel branches. Old releases stay untouched.

    I will fix the issue if it still exists after you upgrade R and Bioconductor.

  8. Is qvalueCutoff equivalent to the FDR? Does it use multiple testing correction with the BH method?

    ygc replied:

    The q-value can be understood as the FDR. Multiple-testing adjustment is also computed, and can be set via pAdjustMethod="BH".
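
For readers unfamiliar with the BH (Benjamini-Hochberg) adjustment named above, base R provides it through p.adjust. The sketch below uses made-up p-values and recomputes the adjustment by hand for comparison. It assumes that pAdjustMethod selects a p.adjust-style correction; the thread itself does not confirm the package internals.

```r
# Made-up p-values for five tested terms (illustration only)
pvals <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Built-in Benjamini-Hochberg adjustment
padj <- p.adjust(pvals, method = "BH")

# Manual version: p * m / rank on the sorted values, then enforce
# monotonicity by taking the running minimum from the largest rank down
m <- length(pvals)
o <- order(pvals)
manual <- numeric(m)
manual[o] <- rev(cummin(rev(pvals[o] * m / seq_len(m))))
manual <- pmin(manual, 1)

print(data.frame(p = pvals, BH = padj, manual = manual))
```

Adjusted values are never smaller than the raw p-values, which is why terms that pass a raw pvalueCutoff can still be dropped after adjustment.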

  9. But I can't set pAdjustMethod; I get: unused arguments (pAdjustMethod = "BH")

    ygc replied:

    As I just said, try the new version before asking again. Even if the old version has problems, I won't deal with them; it is no longer supported.

  10. Listing gene IDs from hyperGTest | YGC - pingback on September 4, 2014 at 6:09 pm
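  11. Hello! After using clusterProfiler's groupGO for GO classification, I want the barplot to show only the GO terms with Count greater than 100. How should I set the barplot parameters for that? Or how can I extract the subset I need from the groupGO result to make the barplot?
    Thank you very much!

  12. I haven't given barplot a parameter for this, but a simple workaround does the job:

    > yy@result = yy@result[yy@result$Count > 100,]
    > barplot(yy, showCategory=nrow(yy@result))

    HuailongXu replied:

    Thank you very much, I'll give it a try.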
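
The row filter in the workaround above is ordinary data-frame subsetting, so it can be tried without clusterProfiler installed. Below is a minimal base-R sketch on a mock table: the data frame `res`, its term labels, and its counts are hypothetical stand-ins for the result slot, and the plot uses base graphics rather than the package's barplot method.

```r
# Mock stand-in for a result table (labels and counts are made up)
res <- data.frame(
  ID    = c("term_A", "term_B", "term_C", "term_D"),
  Count = c(250, 80, 130, 12),
  stringsAsFactors = FALSE
)

# Keep only rows whose Count exceeds the threshold, as in the workaround
keep <- res[res$Count > 100, ]
print(keep)

# Plot every remaining row with base graphics
barplot(keep$Count, names.arg = keep$ID, horiz = TRUE, las = 1, xlab = "Count")
```

Passing the number of remaining rows as showCategory in the original workaround serves the same purpose as plotting all of `keep` here: nothing that survived the filter is left out.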
