Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Development of new knowledge discovery tools to explore biomedical datasets in breast cancer

Hill, Nathan Stuart 2009. Development of new knowledge discovery tools to explore biomedical datasets in breast cancer. PhD Thesis, Cardiff University.

PDF (U584574.pdf) - Accepted Post-Print Version (30MB)

Abstract

The explorative power of high-throughput technologies in cancer research has become well established in recent years, exemplified by diverse gene microarray studies. However, development of the necessary biomedical data analysis tools has historically been confined to a commercial environment, while comprehensive, user-friendly analysis approaches are still needed. Freely available software, notably the 'R' project statistical programming language, allowed development of a user-friendly multivariate statistics application - Informatics Tenovus (I-10) - in this project. I-10 provides a platform through which powerful existing and future 'R' project statistical analysis methodologies can be applied without prior programming knowledge. The new system was tested in the context of exploring antihormone resistance in breast cancer, analysing microarray datasets from in vitro models of acquired Tamoxifen resistance (TAMR) or Faslodex resistance (FASR) versus endocrine-responsive MCF-7 cells. The analysis revealed not only known deregulated genes but also further potential markers and targets for endocrine response and resistance. The advantages of combining the 'R' programming environment with Microsoft Visual Basic .NET technology for producing user-friendly biomedical analysis tools facilitated subsequent development of a tool to explore SEER cancer patient datasets. This new cancer query survival tool - Superstes - allows detailed statistical modelling of the impact that multiple patient attributes (in this instance derived from the SEER breast and colorectal cancer datasets) have on patient survival. The versatility of 'R' was additionally demonstrated in further exploring classifiers, where it was able to interface with the sophisticated, freely available machine learning application 'Weka'. With 'R' and Weka, breast cancer patient survival was modelled using patient attributes equivalent to those of the Nottingham Prognostic Index and a 10-year survival subset of the SEER breast cancer dataset. Several machine learning methodologies were compared for their ability to model survival accurately, and their value in routine clinical use for predicting patient survival was then critically evaluated.

Item Type: Thesis (PhD)
Status: Unpublished
Schools: Pharmacy
Subjects: R Medicine > R Medicine (General)
ISBN: 9781303196621
Date of First Compliant Deposit: 30 March 2016
Last Modified: 10 Jan 2018 03:00
URI: https://orca.cardiff.ac.uk/id/eprint/54473
