Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Investigating “Gene Ontology”-based semantic similarity in the context of functional genomics

Welter, Danielle ORCID: https://orcid.org/0000-0003-1058-2668 2011. Investigating “Gene Ontology”-based semantic similarity in the context of functional genomics. PhD Thesis, Cardiff University.
Item availability restricted.

2011welterdphd.pdf: PDF - Accepted Post-Print Version (3MB)
welterd.pdf: PDF - Supplemental Material (308kB), restricted to repository staff only

Abstract

Gene functional annotations are an essential part of knowledge discovery in the analysis of large datasets, with the Gene Ontology [Ashburner et al., 2000] as the de facto standard for such annotations. A considerable number of approaches have been developed for quantifying functional similarity between gene products based on the semantic similarity between their annotations, but little guidance exists as to which of these measures is most appropriate for which purpose. This was addressed here by comparing the performance of a number of similarity measures and associated parameters. The comparison provided new insights and confirmed emerging trends from the literature. There is also a pressing need for novel ways of applying these measures to facilitate the functional analysis of lists of gene products. We developed a novel algorithm, FuSiGroups, that groups GO terms based on their semantic similarity and genes based on their functional similarity. This two-fold grouping yields not only groups of functionally similar genes but also, for each group, an associated set of related GO terms that characterises a single functional aspect shared by the genes in the group, which facilitates analysis by creating more coherent groups. Each gene can belong to multiple groups, so the groups reflect the complexity of biological reality more accurately than clusters generated using traditional approaches. FuSiGroups was tested on a number of scenarios and in each case successfully generated biologically relevant groups, identifying the key functional aspects of the dataset. The algorithm also managed to eliminate genes that were functionally unrelated to the bulk of the dataset and to distinguish between different biological pathways. Although dataset size is currently a limiting factor, with the algorithm performing best on smaller datasets, FuSiGroups has been demonstrated to be a promising approach for the functional analysis of gene products.
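
The abstract does not say which similarity measures were compared, and the sketch below is not FuSiGroups. As a general illustration of how functional similarity between gene products can be derived from the semantic similarity of their annotations, it assumes one well-known combination: Resnik's information-content measure between terms, aggregated per gene pair with a best-match average. The ontology, the gene names and the annotations are hypothetical toy data, not real GO content.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms (not real GO data).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signalling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "sugar_metabolism": {"metabolism"},
}

# Hypothetical annotations: gene -> directly annotated terms.
ANNOTATIONS = {
    "geneA": {"lipid_metabolism"},
    "geneB": {"sugar_metabolism"},
    "geneC": {"signalling"},
    "geneD": {"lipid_metabolism", "signalling"},
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    found = {term}
    for parent in PARENTS[term]:
        found |= ancestors(parent)
    return found


def information_content():
    """IC(t) = -log p(t), with p(t) the fraction of genes annotated to t or a descendant."""
    counts = {term: 0 for term in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set()
        for term in terms:
            covered |= ancestors(term)  # annotations propagate up to the root
        for term in covered:
            counts[term] += 1
    total = len(ANNOTATIONS)
    return {t: -math.log(c / total) for t, c in counts.items() if c > 0}


IC = information_content()


def resnik(term_a, term_b):
    """Semantic similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(IC[t] for t in common)


def best_match_average(gene_a, gene_b):
    """Functional similarity: average of each term's best match in the other gene."""
    terms_a, terms_b = ANNOTATIONS[gene_a], ANNOTATIONS[gene_b]
    best_a = [max(resnik(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(resnik(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


if __name__ == "__main__":
    for pair in [("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD")]:
        print(pair, round(best_match_average(*pair), 3))
```

In this toy setting, two genes annotated to different kinds of metabolism share only the fairly general term "metabolism", so they score lower than two genes that share a specific annotation. Genes whose only common ancestor is the root score zero.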

Item Type: Thesis (PhD)
Status: Unpublished
Schools: Computer Science & Informatics
Uncontrolled Keywords: Functional genomics; Semantic similarity; Gene ontology
Date of First Compliant Deposit: 30 March 2016
Last Modified: 18 Oct 2022 13:33
URI: https://orca.cardiff.ac.uk/id/eprint/14292
