Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Text-based over-representation analysis of microarray gene lists with annotation bias

Leong, Hui Sun and Kipling, David Glyn 2009. Text-based over-representation analysis of microarray gene lists with annotation bias. Nucleic Acids Research 37 (11), e79-e79. 10.1093/nar/gkp310

Full text not available from this repository.


A major challenge in microarray data analysis is the functional interpretation of gene lists. A common approach is over-representation analysis (ORA), which uses the hypergeometric test (or its variants) to evaluate whether a particular functionally defined group of genes is represented more often than expected by chance within a gene list. Existing applications of ORA have been largely limited to pre-defined terminologies such as GO and KEGG. We report our explorations of whether ORA can be applied to a wider mining of free text. We found that a hitherto underappreciated feature of experimentally derived gene lists is that their constituents have substantially more annotation associated with them, because they have been studied for a longer period of time. This bias, a result of patterns of research activity within the biomedical community, is a major problem for classical hypergeometric test-based ORA approaches, which cannot account for it. We have therefore developed three approaches to overcome this bias, and demonstrate their usability on a wide range of published datasets covering different species. A comparison with existing tools that use GO terms suggests that mining PubMed abstracts can reveal additional biological insight that may not be obtainable by mining pre-defined ontologies alone.
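
As an illustration of the classical test the abstract refers to, the following is a minimal sketch of hypergeometric ORA for a single term. It is not the authors' implementation, and it does not include any of their three bias-correction approaches, which the abstract does not describe. The function names, gene identifiers, and toy data are hypothetical. The p-value is the probability of seeing at least k annotated genes in a list of n genes drawn from a background of N genes, of which K carry the annotation.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background size, K: annotated genes in the background,
    n: gene list size, k: annotated genes observed in the list.
    """
    lo = max(k, 0)
    hi = min(K, n)
    # Count the draws of n genes that contain i annotated genes, for i >= k.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return favourable / comb(N, n)


def ora_p_value(gene_list, background, annotated):
    """Classical ORA p-value for one term (hypothetical helper)."""
    background = set(background)
    # Restrict both sets to the background so the counts are consistent.
    genes = set(gene_list) & background
    hits = set(annotated) & background
    return hypergeom_upper_tail(
        len(background), len(hits), len(genes), len(genes & hits)
    )


# Toy data (made up): 20 background genes, 5 of which mention a given term.
background = [f"g{i}" for i in range(20)]
annotated = {"g0", "g1", "g2", "g3", "g4"}
gene_list = ["g0", "g1", "g10", "g11"]

p = ora_p_value(gene_list, background, annotated)
print(round(p, 4))  # 1205/4845, about 0.2487
```

The test treats every gene in the background as equally likely to carry the term. If genes in the list are more heavily annotated overall, as the abstract reports for experimentally derived lists, this assumption fails and the p-values become too small.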

Item Type: Article
Date Type: Publication
Status: Published
Schools: Medicine
Subjects: Q Science > QH Natural history > QH426 Genetics; R Medicine > R Medicine (General)
Publisher: Oxford University Press
ISSN: 0305-1048
Last Modified: 04 Jun 2017 03:50

Citation Data

Cited 15 times in Google Scholar.

Cited 15 times in Scopus.
