| Title: | High Throughput 'GoMiner' |
|---|---|
| Description: | Two papers published in the early 2000s (Zeeberg, B.R., Feng, W., Wang, G. et al. (2003) <doi:10.1186/gb-2003-4-4-r28>) and (Zeeberg, B.R., Qin, H., Narasimhan, S., et al. (2005) <doi:10.1186/1471-2105-6-168>) implement 'GoMiner' and 'High Throughput GoMiner' ('HTGM') to map lists of genes to the Gene Ontology (GO) <https://geneontology.org>. Until recently, these were hosted on a server at The National Cancer Institute (NCI). In order to continue providing these services to the biomedical community, I have developed stand-alone versions. The current package 'HTGM' builds upon my recent package 'GoMiner'. The output of 'GoMiner' is a heatmap showing the relationship between a single list of genes and the significant categories into which they map. 'High Throughput GoMiner' ('HTGM') integrates the results of the individual 'GoMiner' analyses. The output of 'HTGM' is a heatmap showing the relationship between the gene lists and the significant categories derived from each of them. The heatmap has only 2 axes, so the identities of the genes are unfortunately "integrated out of the equation." Because the graphic for the heatmap is implemented in Scalable Vector Graphics (SVG) technology, it is relatively easy to hyperlink each picture element to the relevant list of genes. By clicking on the desired picture element, the user can recover the "lost" genes. |
| Authors: | Barry Zeeberg [aut, cre] |
| Maintainer: | Barry Zeeberg <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.2 |
| Built: | 2026-05-11 08:53:17 UTC |
| Source: | https://github.com/cran/HTGM |
driver to invoke GoMiner for multiple studies, and integrate the results in a categories versus study hyperlinked heatmap

    HTGM(title = NULL, dir = tempdir(), sampleLists, GOGOA3, ONT,
         enrichThresh = 2, countThresh = 5, fdrThresh = 0.1, nrand = 100,
         mn = 2, mx = 200, opt = 0, verbose = 1)

| Argument | Description |
|---|---|
| title | character string descriptive title |
| dir | character string full pathname to the directory acting as result repository |
| sampleLists | list of character vectors of user-supplied genes of interest |
| GOGOA3 | return value of subsetGOGOA() |
| ONT | c("molecular_function","cellular_component","biological_process") |
| enrichThresh | numerical acceptance threshold for enrichment passed to GoMiner |
| countThresh | numerical acceptance threshold for gene count passed to GoMiner |
| fdrThresh | numerical acceptance threshold for fdr passed to GoMiner |
| nrand | integer number of randomizations passed to GoMiner |
| mn | integer param passed to trimGOGOA3, min size threshold for a category |
| mx | integer param passed to trimGOGOA3, max size threshold for a category |
| opt | integer 0:1 parameter used to select randomization method |
| verbose | integer parameter passed to vprint() |
returns the matrix of significant categories versus study

    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    # load("data/Housekeeping_Genes.RData")
    sampleList<-unique(as.matrix(Housekeeping_Genes[,"Gene.name"]))
    n<-nrow(sampleList)
    sampleLists<-list()
    # test the effect of random sampling of the entire gene set
    # this can give an idea of the quality of the GoMiner results
    # when the complete gene set is yet to be determined
    sampleLists[["1"]]<-sampleList[sample(n,n/2)]
    sampleLists[["2"]]<-sampleList[sample(n,n/2)]
    sampleLists[["3"]]<-sampleList[sample(n,n/2)]
    sampleLists[["4"]]<-sampleList[sample(n,n/2)]
    sampleLists[["5"]]<-sampleList[sample(n,n/2)]
    sampleLists[["ALL"]]<-sampleList
    m<-HTGM(title=NULL,dir=tempdir(),sampleLists,GOGOA3,ONT="biological_process",
            enrichThresh=2,countThresh=5,fdrThresh=0.10,nrand=100)
    ## End(Not run)

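The `sampleLists` argument is a named list of character vectors, one per study. The following is a minimal sketch of building such a list from random half-samples in a loop rather than by repeated assignment. The gene symbols are toy stand-ins for `Housekeeping_Genes[,"Gene.name"]`, and the `HTGM()` call is left commented out because it needs a real `GOGOA3` object.

```r
# Toy gene symbols standing in for the housekeeping gene list (illustrative only)
set.seed(1)
genes <- unique(c("ACTB", "GAPDH", "RPLP0", "PGK1", "TBP", "HPRT1", "B2M", "UBC"))
n <- length(genes)

# One random half-sample per replicate, named "1".."5", plus the full list
sampleLists <- lapply(1:5, function(i) genes[sample(n, n %/% 2)])
names(sampleLists) <- as.character(1:5)
sampleLists[["ALL"]] <- genes

str(sampleLists)

# With a real GOGOA3 loaded, the call would follow the package example:
# m <- HTGM(title = NULL, dir = tempdir(), sampleLists, GOGOA3,
#           ONT = "biological_process")
```

Using `n %/% 2` keeps the sample size an integer when `n` is odd.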
generate FDR matrix of id versus cat

    htgmM(l, fdrThresh)

| Argument | Description |
|---|---|
| l | list of return values of GoMiner() |
| fdrThresh | numerical acceptance threshold for fdr |
returns numeric matrix m containing FDR values

    # load("data/x_l.RData")
    m<-htgmM(x_l,.1)

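To illustrate the idea of an FDR matrix of categories versus studies, here is a rough sketch that does not use the package. It assumes each study's result can be reduced to a data frame with a category column and an FDR column. That structure, the column names `cat` and `FDR`, the row/column orientation, and the use of `NA` for non-significant cells are all assumptions made for illustration, and the actual return value of `GoMiner()` and the internals of `htgmM()` may differ.

```r
# Hypothetical per-study results (NOT the real GoMiner() return structure)
l <- list(
  study1 = data.frame(cat = c("GO:A", "GO:B", "GO:C"),
                      FDR = c(0.01, 0.20, 0.05),
                      stringsAsFactors = FALSE),
  study2 = data.frame(cat = c("GO:B", "GO:C"),
                      FDR = c(0.03, 0.50),
                      stringsAsFactors = FALSE)
)
fdrThresh <- 0.1

# Keep only the categories passing the FDR threshold in each study
sig <- lapply(l, function(d) d[d$FDR <= fdrThresh, , drop = FALSE])

# Union of significant categories gives the rows; studies give the columns
cats <- sort(unique(unlist(lapply(sig, function(d) d$cat))))
m <- matrix(NA_real_, nrow = length(cats), ncol = length(sig),
            dimnames = list(cats, names(sig)))

# Fill in the FDR of each significant (category, study) pair
for (id in names(sig)) {
  m[sig[[id]]$cat, id] <- sig[[id]]$FDR
}
print(m)
```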
populate subdirectory of hyperlinked gene lists

    hyperGenes(l, dir)

| Argument | Description |
|---|---|
| l | return value of GoMiner() |
| dir | character string containing path name of results directory |
returns no value but has side effect of populating subdirectory of hyperlinked gene lists

    # load("data/x_l.RData")
    dir<-tempdir()
    print(dir)
    hyperGenes(x_l,dir)

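The side effect described above, a subdirectory holding one gene list per heatmap cell, can be sketched in base R as follows. The `genesByCat` mapping, the subdirectory name `hyperGenes_sketch` and the `.txt` file naming are hypothetical choices for illustration, not the layout that `hyperGenes()` actually produces.

```r
# Hypothetical mapping from category name to the genes that map to it
genesByCat <- list(
  catA = c("RPLP0", "RPS3"),
  catB = c("GAPDH", "PGK1")
)

# Subdirectory of the results directory (name is an assumption)
subdir <- file.path(tempdir(), "hyperGenes_sketch")
dir.create(subdir, showWarnings = FALSE)

# Write one plain-text file per category, one gene per line
for (catName in names(genesByCat)) {
  writeLines(genesByCat[[catName]],
             file.path(subdir, paste0(catName, ".txt")))
}
print(list.files(subdir))
```

Writing under `tempdir()` follows the package examples, which avoid writing into the installed package directories.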
driver to add gene list hyperlinks to the HTGM heatmap

    hyperlinks(s, rownames, colnames)

| Argument | Description |
|---|---|
| s | character path name of the file containing the HTGM svg |
| rownames | character vector of row names |
| colnames | character vector of column names |
returns the path name of the file containing the hyperlinked HTGM svg

    #load("data/x_rn.RData")
    #load("data/x_cn.RData")
    #load("data/x_svg.RData")
    s<-system.file("extdata","x_htgm.svg",package="HTGM")
    # need to avoid writing to "extdata"
    dir<-tempdir()
    file.copy(from=s, to=dir)
    hyperlinkedFileName<-hyperlinks(sprintf("%s/%s",dir,"x_htgm.svg"),x_rn,x_cn)
    print("hyperlinkedFileName")
    print(hyperlinkedFileName)

add gene list hyperlinks to the HTGM heatmap

    pasteHyperlinks(str, c1, c2)

| Argument | Description |
|---|---|
| str | character a line from the svg that is to have a hyperlink inserted |
| c1 | character list of row names |
| c2 | character list of column names |
returns a line of code to insert into svg

    #load("data/x_svgr.RData")
    #load("data/x_rnr.RData")
    #load("data/x_cnc.RData")
    hl<-pasteHyperlinks(x_svgr,x_rnr,x_cnc)

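For readers unfamiliar with SVG hyperlinks, the general technique is to wrap a picture element in an `<a>` element. The sketch below shows that idea only: the helper `wrapWithLink()`, the `hyperGenes` link directory and the `study__category.txt` naming scheme are hypothetical and are not taken from `pasteHyperlinks()`. It uses the plain `href` attribute, and older SVG 1.1 viewers may expect `xlink:href` instead.

```r
# Wrap one SVG element line in an <a> element pointing at a gene-list file.
# wrapWithLink() and its file-naming scheme are illustrative assumptions.
wrapWithLink <- function(svgLine, rowName, colName, linkDir = "hyperGenes") {
  target <- sprintf("%s/%s__%s.txt", linkDir, colName, rowName)
  sprintf('<a href="%s" target="_blank">%s</a>', target, svgLine)
}

# A made-up heatmap cell; real lines come from the HTGM svg file
rectLine <- '<rect x="10" y="20" width="5" height="5" fill="#FF0000"/>'
hl <- wrapWithLink(rectLine, rowName = "catA", colName = "study1")
cat(hl, "\n")
```

Row or column names containing spaces or other special characters would need encoding (for example with `utils::URLencode()`) before being used in a link target.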