| Title: | Automate the Mapping Between a List of Genes and Gene Ontology Categories |
|---|---|
| Description: | In gene-expression microarray studies, for example, one generally obtains a list of dozens or hundreds of genes that differ in expression between samples and then asks 'What does all of this mean biologically?' Alternatively, gene lists can be derived conceptually in addition to experimentally. For instance, one might want to analyze a group of genes known as housekeeping genes. The work of the Gene Ontology (GO) Consortium <geneontology.org> provides a way to address that question. GO organizes genes into hierarchical categories based on biological process, molecular function and subcellular localization. The role of 'GoMiner' is to automate the mapping between a list of genes and GO, and to provide a statistical summary of the results as well as a visualization. |
| Authors: | Barry Zeeberg [aut, cre] |
| Maintainer: | Barry Zeeberg <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.3 |
| Built: | 2026-05-10 08:29:15 UTC |
| Source: | https://github.com/cran/GoMiner |
determine if gene list and database contain compatible identifiers
checkGeneListVsDB(geneList, ontology, GOGOA3, thresh = 0.5, verbose = FALSE)
geneList |
character list of gene names |
ontology |
character string, one of c("molecular_function", "cellular_component", "biological_process") |
GOGOA3 |
return value of subsetGOGOA() |
thresh |
numeric acceptance threshold for fraction of gene list matching database identifiers |
verbose |
logical (or integer) verbosity setting that controls printing of diagnostic information |
returns no value, but may have side effect of aborting the computation
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    checkGeneListVsDB(geneList=cluster52,ontology="biological_process",
      GOGOA3,thresh=0.5,verbose=TRUE)
    # supposed to generate error message
    load("/Users/barryzeeberg/personal/GODB_RDATA/sgd/GOGOA3_sgd.RData")
    checkGeneListVsDB(geneList=xenopusGenes,ontology="biological_process",
      GOGOA3,thresh=0.5,verbose=TRUE)
    ## End(Not run)
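The compatibility check described above can be illustrated with a small base-R sketch. This is not the package's implementation: the helper names `fractionMatching` and `checkCompatible`, the toy identifiers, and the choice of a strict `<` comparison against `thresh` are all assumptions made for illustration.

```r
# Hypothetical helpers; GoMiner's own checkGeneListVsDB() may differ in detail.

# Fraction of the (unique) gene list found among the database identifiers
fractionMatching <- function(geneList, dbGenes) {
  genes <- unique(geneList)
  mean(genes %in% dbGenes)
}

# Abort when too few identifiers match (assumed rule: fraction < thresh)
checkCompatible <- function(geneList, dbGenes, thresh = 0.5) {
  frac <- fractionMatching(geneList, dbGenes)
  if (frac < thresh) {
    stop(sprintf("only %.0f%% of the gene list matches the database identifiers",
                 100 * frac))
  }
  invisible(NULL)
}

dbGenes <- c("TP53", "FN1", "BRCA1", "EGFR")   # made-up database identifiers
okList  <- c("TP53", "FN1", "EGFR")            # all identifiers match
badList <- c("tp53.L", "fn1.S", "TP53")        # only one of three matches

checkCompatible(okList, dbGenes)               # passes silently
res <- tryCatch(checkCompatible(badList, dbGenes),
                error = function(e) "aborted")
```

Wrapping the call in `tryCatch()` is one way to keep a script running when the check aborts.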
compute the false discovery rate (FDR) of the hypergeometric p values of genes mapping to gene ontology (GO) categories
FDR(sampleList, tablePop3, hyper, GOGOA3, nrand, ontology, subd, opt = 0)
sampleList |
character vector of user-supplied genes of interest |
tablePop3 |
return value of GOtable3() |
hyper |
return value of GOhypergeometric3() |
GOGOA3 |
return value of subsetGOGOA() |
nrand |
integer number of randomizations |
ontology |
character string, one of c("molecular_function","cellular_component","biological_process") |
subd |
character string pathname for directory containing sink.txt |
opt |
integer 0:1 parameter used to determine randomization method |
returns a list with FDR information
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    fdr<-FDR(x_sampleList1,x_tablePop31,x_hyper1,GOGOA3,3,"biological_process",tempdir(),0)
    ## End(Not run)
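The idea of estimating an FDR from randomized gene lists can be sketched in base R. The estimator below (average number of randomized categories at or below a p value, divided by the number of real categories at or below it) is an assumption chosen for illustration; the source does not state the exact estimator used by FDR(). The toy database, gene names, and the helper `categoryP` are invented.

```r
set.seed(1)
genes <- sprintf("G%03d", 1:200)

# Toy database: each of 10 invented categories holds 20 randomly chosen genes
db <- do.call(rbind, lapply(1:10, function(i) {
  data.frame(gene = sample(genes, 20), category = sprintf("cat%02d", i),
             stringsAsFactors = FALSE)
}))

# Upper-tail hypergeometric p value for every category (hypothetical helper)
categoryP <- function(geneList, db, nPop) {
  sapply(split(db$gene, db$category), function(members) {
    x <- sum(geneList %in% members)
    phyper(x - 1, length(members), nPop - length(members),
           length(geneList), lower.tail = FALSE)
  })
}

# "Real" list deliberately enriched in cat01
realList <- unique(c(db$gene[db$category == "cat01"][1:10], sample(genes, 10)))
realP <- categoryP(realList, db, length(genes))

# Randomized gene lists of the same size (analogous in spirit to opt = 0)
nrand <- 50
randP <- replicate(nrand,
                   categoryP(sample(genes, length(realList)), db, length(genes)))

# Assumed estimator: expected false hits per randomization / observed hits
empiricalFDR <- sapply(realP, function(p) {
  expectedFalse <- sum(randP <= p) / nrand
  min(1, expectedFalse / sum(realP <= p))
})
```

With only a handful of randomizations the estimate is coarse; larger values of `nrand` give a smoother estimate at the cost of run time.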
compute the gene enrichment in a GO category
GOenrich3(tableSample3, tablePop3)
tableSample3 |
sample return value of GOtable3() |
tablePop3 |
population return value of GOtable3() |
returns a matrix with columns c("SAMPLE","POP","ENRICHMENT")
m<-GOenrich3(x_tableSample3,x_tablePop3)
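As a rough illustration of what an enrichment matrix with these three columns could contain, the sketch below assumes that enrichment is the fraction of sample genes in a category divided by the fraction of population genes in that category, and that SAMPLE and POP hold raw counts. Both points are assumptions; the category names and counts are invented.

```r
# Hypothetical per-category gene counts
sampleCounts <- c(catA = 6, catB = 2, catC = 1)     # sample genes per category
popCounts    <- c(catA = 30, catB = 100, catC = 10) # population genes per category
nSample <- 20     # total sample genes mapping to the database
nPop    <- 1000   # total population genes in the database

# Assumed definition: ratio of the two fractions
enrichment <- (sampleCounts / nSample) / (popCounts / nPop)

m <- cbind(SAMPLE = sampleCounts, POP = popCounts, ENRICHMENT = enrichment)
```

An enrichment near 1 means the category is represented in the sample about as often as in the population; values well above 1 indicate over-representation.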
generate a matrix to be used as input to a heat map
GOheatmap(sampleList, x, thresh, fdrThresh = 0.105, verbose)
sampleList |
character list of gene names |
x |
DB component of return value of GOtable3() |
thresh |
output of GOthresh() |
fdrThresh |
numeric value of FDR acceptance threshold |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
returns a matrix to be used as input to a heat map
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    heatmap<-GOheatmap(cluster52,GOGOA3$ontologies[["biological_process"]],x_thresh,verbose=1)
    ## End(Not run)
compute the hypergeometric p value for gene enrichment in a GO category
GOhypergeometric3(tableSample3, tablePop3)
tableSample3 |
sample return value of GOtable3() |
tablePop3 |
population return value of GOtable3() |
returns a matrix with columns c("x","m","n","k","p")
hyper<-GOhypergeometric3(x_tableSample3,x_tablePop3)
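The column names "x", "m", "n", "k" coincide with the argument names of base R's `phyper()`, so one plausible reading is sketched below: x is the number of sample genes in the category, m the number of population genes in the category, n the number of population genes outside it, and k the sample size. That interpretation, the use of the upper tail, and the helper name `hyperP` are assumptions, not statements about the package internals.

```r
# Hypothetical helper: probability of observing x or more category genes
# in a sample of size k drawn without replacement
hyperP <- function(x, m, n, k) {
  phyper(x - 1, m, n, k, lower.tail = FALSE)
}

# Invented counts: 6 of 20 sample genes fall in a category that
# contains 30 of the 1000 population genes
x <- 6; m <- 30; n <- 970; k <- 20
p <- hyperP(x, m, n, k)

# One row in the same column layout as the documented return value
hyper <- cbind(x = x, m = m, n = n, k = k, p = p)
```

Passing `x - 1` with `lower.tail = FALSE` gives P(X >= x) rather than P(X > x), which is the usual choice for an over-representation test.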
driver to generate heatmap
GoMiner( title = NULL, dir, sampleList, GOGOA3, ontology, enrichThresh = 2, countThresh = 5, pvalThresh = 0.1, fdrThresh = 0.1, nrand = 100, mn = 2, mx = 200, opt, verbose = 1 )
title |
character string descriptive title |
dir |
character string full pathname to the directory acting as result repository |
sampleList |
character list of gene names |
GOGOA3 |
return value of subsetGOGOA() |
ontology |
character string, one of c("molecular_function", "cellular_component", "biological_process") |
enrichThresh |
numerical acceptance threshold for enrichment |
countThresh |
numerical acceptance threshold for gene count |
pvalThresh |
numerical acceptance threshold for pval |
fdrThresh |
numerical acceptance threshold for fdr |
nrand |
numeric number of randomizations to compute FDR |
mn |
integer param passed to trimGOGOA3, min size threshold for a category |
mx |
integer param passed to trimGOGOA3, max size threshold for a category |
opt |
integer 0:1 parameter used to select randomization method |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
modes of FDR estimation:
opt=0: use the original database with randomized geneLists
opt=1: use the original geneList with internally scrambled gene databases (uses randomGODB())
databases that can be used with the real geneList (each is passed explicitly as a parameter to GoMiner()):
(1) the original GOGOA3
(2) a randomized version of GOGOA3: GOGOA3R<-randomGODB(GOGOA3)
(3) a database containing a subset of the mappings of the big-hitter genes (randomGODB2driver()). This attempts to compensate for the over-annotation of some genes, which might lead to false positives. If gene G has a lot of mappings to categories, randomly sample the G/category pairs to retain a reasonable number of them; e.g., reduce G from 100 category mappings to 7 category mappings by omitting 93 of the G/category mappings.
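The down-sampling of over-annotated genes described above can be sketched as follows. This is only an illustration of the idea, not the code behind randomGODB2driver(): the helper `capMappings`, the cap of 7, and the toy database are assumptions, while the column names "HGNC" and "GO_NAME" follow the examples in this manual.

```r
set.seed(42)

# Hypothetical helper: keep at most maxPerGene randomly chosen
# gene/category rows for each gene
capMappings <- function(db, maxPerGene) {
  rowsByGene <- split(seq_len(nrow(db)), db[, "HGNC"])
  keep <- unlist(lapply(rowsByGene, function(rows) {
    if (length(rows) > maxPerGene) {
      rows[sample.int(length(rows), maxPerGene)]
    } else {
      rows
    }
  }), use.names = FALSE)
  db[sort(keep), , drop = FALSE]
}

# Toy database: gene "G" maps to 100 categories, gene "H" to 3
db <- cbind(HGNC = c(rep("G", 100), rep("H", 3)),
            GO_NAME = sprintf("cat%03d", 1:103))

capped <- capMappings(db, maxPerGene = 7)
counts <- table(capped[, "HGNC"])
```

Genes already at or below the cap are left untouched, so only the heavily annotated genes lose mappings.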
returns a matrix suitable to generate a heatmap
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    l<-GoMiner("Cluster52",tempdir(),cluster52,
      GOGOA3=GOGOA3,ontology="biological_process",enrichThresh=2,
      countThresh=5,pvalThresh=0.10,fdrThresh=0.10,nrand=2,mn=2,mx=200,opt=0,verbose=1)
    # try out yeast database!
    load("/Users/barryzeeberg/personal/GODB_RDATA/sgd/GOGOA3_sgd.RData")
    # make sure this is in fact the database for the desired species
    GOGOA3$species
    # use database to find genes mapping to an interesting category
    cat<-"GO_0042149__cellular_response_to_glucose_starvation"
    w<-which(GOGOA3$ontologies[["biological_process"]][,"GO_NAME"]==cat)
    geneList<-GOGOA3$ontologies[["biological_process"]][w,"HGNC"]
    l<-GoMiner("YEAST",tempdir(),geneList,
      GOGOA3,ontology="biological_process",enrichThresh=2,
      countThresh=3,pvalThresh=0.10,fdrThresh=0.10,nrand=2,mn=2,mx=200,opt=0)
    ## End(Not run)
tabulate number of geneList mappings to GO categories
GOtable3(hgncList, DB)
hgncList |
character list of gene names |
DB |
selected ontology branch of return value of subsetGOGOA |
returns a list whose components are c("DB","table","ngenes"), where 'DB' is the GO DB subsetted to the desired ONTOLOGY, 'table' is a tabulation of the number of occurrences of each GO category name within the desired ONTOLOGY, and 'ngenes' is the total number of hgncList genes mapping to GOGOA
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    x<-GOtable3(cluster52,GOGOA3$ontologies[["biological_process"]])
    ## End(Not run)
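The kind of tabulation described above can be illustrated with base R's `table()`. The sketch uses a tiny invented matrix with the "HGNC" and "GO_NAME" columns seen in the examples of this manual; the gene and category names, and the exact way the package counts genes, are assumptions.

```r
# Toy ontology branch: one row per gene/category mapping
DB <- cbind(
  HGNC    = c("TP53", "TP53", "FN1", "EGFR", "BRCA1", "FN1"),
  GO_NAME = c("GO_1__a", "GO_2__b", "GO_1__a", "GO_3__c", "GO_2__b", "GO_3__c")
)

geneList <- c("TP53", "FN1", "NOTINDB")   # the last gene has no mapping

# Rows of the database that belong to genes in the list
hits <- DB[DB[, "HGNC"] %in% geneList, , drop = FALSE]

# Number of gene-list mappings per category, largest first
tab <- sort(table(hits[, "GO_NAME"]), decreasing = TRUE)

# Number of gene-list genes that map to the database at all
ngenes <- length(unique(hits[, "HGNC"]))

result <- list(DB = DB, table = tab, ngenes = ngenes)
```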
retrieve lines of m that meet the enrichThresh, countThresh, pvalThresh, and fdrThresh thresholds
GOthresh(m, sampleFDR, enrichThresh, countThresh, pvalThresh, fdrThresh)
m |
return value of GOenrich3() |
sampleFDR |
component of return value of RCPD() |
enrichThresh |
numerical acceptance threshold for enrichment |
countThresh |
numerical acceptance threshold for gene count |
pvalThresh |
numerical acceptance threshold for pval |
fdrThresh |
numerical acceptance threshold for fdr |
returns a subset of matrix (m joined with fdr$sampleFDR) with entries meeting all thresholds
thresh<-GOthresh(x_m,x_fdr$sampleFDR,enrichThresh=2,countThresh=2,pvalThresh=0.1,fdrThresh=0.100)
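Filtering a results matrix against several thresholds at once can be sketched with logical indexing. The matrix below is invented; the column names "P" and "FDR" and the direction of each comparison (at least the threshold for enrichment and count, at most the threshold for p value and FDR) are assumptions rather than documented behaviour.

```r
# Invented results matrix, one row per category
m <- cbind(SAMPLE     = c(6, 2, 1, 8),
           POP        = c(30, 100, 10, 400),
           ENRICHMENT = c(10, 1, 5, 1),
           P          = c(1e-4, 0.6, 0.04, 0.5),
           FDR        = c(0.01, 0.9, 0.08, 0.8))
rownames(m) <- c("catA", "catB", "catC", "catD")

enrichThresh <- 2; countThresh <- 2; pvalThresh <- 0.1; fdrThresh <- 0.1

# A row is kept only if it passes every threshold
keep <- m[, "ENRICHMENT"] >= enrichThresh &
        m[, "SAMPLE"]     >= countThresh &
        m[, "P"]          <= pvalThresh &
        m[, "FDR"]        <= fdrThresh

passing <- m[keep, , drop = FALSE]   # drop = FALSE keeps a matrix even for one row
```

Here catC is rejected on gene count alone, even though its enrichment, p value, and FDR would pass.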
driver to invoke hitters2() and trimGOGOA3()
hitterBeforeAfterDriver(GOGOA3, mn = 20, mx = 200, verbose)
GOGOA3 |
return value of minimalistGODB::buildGODatabase() |
mn |
integer minimum category size |
mx |
integer maximum category size |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
returns the return value of trimGOGOA3()
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # This example is given in full detail in the package vignette.
    # You can generate GOGOA3.RData using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO
    dir<-"/Users/barryzeeberg/personal/GODB_RDATA/goa_human/"
    load(sprintf("%s/%s",dir,"GOGOA3_goa_human.RData"))
    geneList<-GOGOA3$ontologies[["biological_process"]][1:10,"HGNC"]
    GOGOA3tr<-hitterBeforeAfterDriver(GOGOA3,mn=20,mx=200,1)
    ## End(Not run)
determine the number of mappings for the top several genes
hitters2(GOGOA3, verbose = 1)
GOGOA3 |
return value of minimalistGODB::buildGODatabase() |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
returns no value, but has side effect of printing information
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # This example is given in full detail in the package vignette.
    # You can generate GOGOA3.RData using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO
    dir<-"/Users/barryzeeberg/personal/GODB_RDATA/goa_human/"
    load(sprintf("%s/%s",dir,"GOGOA3_goa_human.RData"))
    geneList<-GOGOA3$ontologies[["biological_process"]][1:10,"HGNC"]
    hitters2(GOGOA3,1)
    ## End(Not run)
GoMiner data set
data(Housekeeping_Genes)
determine if database represents human species
human(GOGOA3, verbose = TRUE)
GOGOA3 |
return value of subsetGOGOA() |
verbose |
logical (or integer) verbosity setting that controls printing of diagnostic information |
returns Boolean TRUE if species is human
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    hum<-human(GOGOA3)
    load("/Users/barryzeeberg/personal/GODB_RDATA/sgd/GOGOA3_sgd.RData")
    hum<-human(XENOPUS,1)
    ## End(Not run)
driver to perform several preprocessing steps: quick peek; trim small and large categories; determine whether the database is for the human species; validate the HGNC symbols in sampleList; determine whether the human database is an up-to-date (i.e., contains GOGOA3$species) or legacy version
preprocessDB(sampleList, GOGOA3, ontology, mn, mx, thresh, verbose)
sampleList |
character list of gene names |
GOGOA3 |
return value of subsetGOGOA() |
ontology |
character string, one of c("molecular_function", "cellular_component", "biological_process") |
mn |
integer param passed to trimGOGOA3, min size threshold for a category |
mx |
integer param passed to trimGOGOA3, max size threshold for a category |
thresh |
numerical parameter passed to checkGeneListVsDB() |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
returns a list whose components are a trimmed version of GOGOA3 and (for human) a sampleList with validated HGNC symbols
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    pp<-preprocessDB(cluster52,GOGOA3,"biological_process",20,200,0.5,3)
    ## End(Not run)
retrieve n unique random genes
randSubsetGeneList(geneList, ngenes)
geneList |
character vector geneList |
ngenes |
integer desired number of random genes |
returns a character vector of genes
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    genes<-randSubsetGeneList(GOGOA3$genes[["biological_process"]],20)
    ## End(Not run)
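Drawing n unique random genes can be done with base R's `sample()` after de-duplicating the input. The helper `randSubset` below is a hypothetical stand-in written for illustration; the gene names are invented and the package's own function may handle edge cases differently.

```r
# Hypothetical helper: n distinct genes drawn at random without replacement
randSubset <- function(geneList, ngenes) {
  pool <- unique(geneList)
  stopifnot(ngenes <= length(pool))   # cannot draw more unique genes than exist
  sample(pool, ngenes)
}

set.seed(3)
pool  <- c("A1", "A1", "B2", "C3", "D4", "E5", "F6", "F6", "G7")
genes <- randSubset(pool, 5)
```

De-duplicating first matters: sampling directly from a list with repeated entries would give frequently listed genes a higher chance of selection.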
prepare a cpd of p values from randomized gene sets
RCPD(GOGOA3, tablePop, geneList, nrand, ontology, hyper, subd, opt)
GOGOA3 |
return value of subsetGOGOA() |
tablePop |
return value of GOtable3() |
geneList |
character vector list of genes to randomize |
nrand |
integer number of randomizations |
ontology |
character string, one of c("molecular_function","cellular_component","biological_process") |
hyper |
return value of GOhypergeometric3() from real (nonrandom) data |
subd |
character string pathname for directory containing sink.txt |
opt |
integer 0:1 parameter used to select randomization method |
the cpd of the randomizations is to be used for estimating the false discovery rate (FDR) of the real sampled genes
returns a histogram of log10(p)
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    rcpd<-RCPD(GOGOA3,x_tablePop31,10,3,"biological_process",x_hyper1,tempdir(),0)
    ## End(Not run)
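A histogram of log10(p), and a cumulative distribution derived from it, can be built in base R as sketched below. Reading "cpd" as a cumulative probability distribution is an assumption, and the p values here are simply uniform random numbers standing in for the p values of randomized gene sets.

```r
set.seed(7)
p <- runif(500)              # stand-in for p values from randomized gene sets
logp <- log10(p)

# Histogram computed without drawing it
h <- hist(logp, breaks = 20, plot = FALSE)

# Cumulative fraction of p values up to the right edge of each bin
cpd <- cumsum(h$counts) / sum(h$counts)

# Pair each right bin edge with its cumulative fraction
cpdTable <- cbind(log10p = h$breaks[-1], cumulative = cpd)
```

Working on the log10 scale spreads out the small p values, which are the ones that matter when estimating an FDR.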
driver to run GoMiner under several randomization procedures
runGoMinerExamples( title = NULL, dir, sampleList, GOGOA3, ontology, enrichThresh = 2, countThresh = 5, pvalThresh = 0.1, fdrThresh = 0.1, nrand = 2, mn = 2, mx = 200, verbose = 1 )
title |
character string descriptive title |
dir |
character string full pathname to the directory acting as result repository |
sampleList |
character list of gene names |
GOGOA3 |
return value of subsetGOGOA() |
ontology |
character string, one of c("molecular_function", "cellular_component", "biological_process") |
enrichThresh |
numerical acceptance threshold for enrichment |
countThresh |
numerical acceptance threshold for gene count |
pvalThresh |
numerical acceptance threshold for pval |
fdrThresh |
numerical acceptance threshold for fdr |
nrand |
numeric number of randomizations to compute FDR |
mn |
integer param passed to trimGOGOA3, min size threshold for a category |
mx |
integer param passed to trimGOGOA3, max size threshold for a category |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
returns a list containing the return value of GoMiner()
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # you can generate it using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    load("/Users/barryzeeberg/personal/GODB_RDATA/goa_human/GOGOA3_goa_human.RData")
    ontology<-"biological_process"
    t<-sort(table(GOGOA3$ontologies[[ontology]][,"HGNC"]),decreasing=TRUE)
    dir<-tempdir()
    sampleList<-names(t)[1:50]
    title<-"hi_hitters"
    hh<-runGoMinerExamples(title,dir,sampleList,GOGOA3,ontology,nrand=5)
    sampleList<-names(t)[1001:1050]
    title<-"hi_hitters5"
    hh<-runGoMinerExamples(title,dir,sampleList,GOGOA3,ontology,nrand=5)
    sampleList<-cluster52
    title<-"cluster52"
    hh<-runGoMinerExamples(title,dir,sampleList,GOGOA3,ontology,nrand=5)
    ## End(Not run)
remove categories from GOGOA3 that are too small or too large
trimGOGOA3(GOGOA3, mn, mx, verbose)
GOGOA3 |
return value of subsetGOGOA() |
mn |
integer min size threshold for a category |
mx |
integer max size threshold for a category |
verbose |
integer (or logical) verbosity setting that controls printing of diagnostic information |
If a category is too small, it is unreliable for statistical evaluation. Also, in the extreme case of size = 1, the category is essentially equivalent to a gene rather than a category. The same is partially true for size = 2. If a category is too large, it is too generic to be useful for categorization. Finally, by trimming the database, analyses will run faster.
returns trimmed version of GOGOA3
    ## Not run:
    # GOGOA3.RData is too large to include in the R package
    # so I need to load it from a file that is not in the package.
    # Since this is in a file in my own file system, I could not
    # include this as a regular example in the package.
    # This example is given in full detail in the package vignette.
    # You can generate GOGOA3.RData using the package 'minimalistGODB'
    # or you can retrieve it from https://github.com/barryzee/GO/tree/main/databases
    GOGO3tr<-trimGOGOA3(GOGOA3,mn=2,mx=200,1)
    ## End(Not run)
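Trimming categories by size can be illustrated on a toy gene/category table. The sketch assumes that category size means the number of genes mapped to the category and that the bounds mn and mx are inclusive; the helper `trimBySize` and the data are invented, and the real GOGOA3 object has more structure than this single matrix.

```r
# Hypothetical helper: drop categories with fewer than mn or more than mx genes
trimBySize <- function(db, mn, mx) {
  sizes <- table(db[, "GO_NAME"])
  keepCats <- names(sizes)[sizes >= mn & sizes <= mx]
  db[db[, "GO_NAME"] %in% keepCats, , drop = FALSE]
}

# Toy database: category sizes are 1, 3 and 6 genes
db <- cbind(
  HGNC    = c("G1", "G2", "G3", "G4", sprintf("G%d", 5:10)),
  GO_NAME = c("catSmall", rep("catMid", 3), rep("catBig", 6))
)

trimmed <- trimBySize(db, mn = 2, mx = 5)
remaining <- unique(trimmed[, "GO_NAME"])
```

Only the mid-sized category survives: the single-gene category falls below mn and the six-gene category exceeds mx.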
convert outdated HGNC symbols to current HGNC symbols
validHGNCSymbols(geneList)
geneList |
character vector of HGNC symbols |
removes NA and /// from output of checkGeneSymbols()
returns list of mapping table and vector of current HGNC symbols
    geneList<-c("FN1", "tp53", "UNKNOWNGENE","7-Sep", "9/7", "1-Mar", "Oct4", "4-Oct","OCT4-PG4", "C19ORF71", "C19orf71")
    l<-validHGNCSymbols(geneList)
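The clean-up step mentioned above (removing NA and "///" entries) can be sketched on a hand-made table. The table only imitates the shape of a symbol-checking result: the column names `x`, `Approved`, and `Suggested.Symbol` are assumed to follow the HGNChelper convention, the symbols are invented placeholders, and dropping ambiguous "///" entries entirely (rather than splitting them) is also an assumption.

```r
# Invented mapping table; not real output of checkGeneSymbols()
tab <- data.frame(
  x                = c("GENE1", "gene2", "UNKNOWN", "AMBIG"),
  Approved         = c(TRUE, FALSE, FALSE, FALSE),
  Suggested.Symbol = c("GENE1", "GENE2", NA, "GENE3 /// GENE4"),
  stringsAsFactors = FALSE
)

# Keep rows with a single, non-missing suggested symbol
ok <- !is.na(tab$Suggested.Symbol) &
      !grepl("///", tab$Suggested.Symbol, fixed = TRUE)

current <- unique(tab$Suggested.Symbol[ok])

# Same shape as the documented return: mapping table plus current symbols
result <- list(table = tab[ok, ], symbols = current)
```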