Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

The experimentally obtained functional impact assessments of 5' splice site GT>GC variants differ markedly from those predicted

Chen, Jian-Min, Lin, Jin-Huan, Masson, Emmanuelle, Liao, Zhuan, Férec, Claude, Cooper, David N. ORCID: https://orcid.org/0000-0002-8943-8484 and Hayden, Matthew 2020. The experimentally obtained functional impact assessments of 5' splice site GT>GC variants differ markedly from those predicted. Current Genomics 21 (1), pp. 56-66. doi: 10.2174/1389202921666200210141701

PDF - Accepted Post-Print Version

Abstract

Introduction: 5' splice site GT>GC or +2T>C variants have been frequently reported to cause human genetic disease and are routinely scored as pathogenic splicing mutations. However, we have recently demonstrated that such variants in human disease genes may not invariably be pathogenic. Moreover, we found that no splicing prediction tools appear to be capable of reliably distinguishing those +2T>C variants that generate wild-type transcripts from those that do not.

Methodology: Herein, we evaluated the performance of a novel deep learning-based tool, SpliceAI, in the context of three datasets of +2T>C variants, all of which had been characterized functionally in terms of their impact on pre-mRNA splicing. The first two datasets refer to our recently described “in vivo” dataset of 45 known disease-causing +2T>C variants and the “in vitro” dataset of 103 +2T>C substitutions subjected to full-length gene splicing assay. The third dataset comprised 12 BRCA1 +2T>C variants that were recently analyzed by saturation genome editing.

Results: Comparison of the SpliceAI-predicted and experimentally obtained functional impact assessments of these variants (and smaller datasets of +2T>A and +2T>G variants) revealed that although SpliceAI performed rather better than other prediction tools, it was still far from perfect. A key issue was that the impact of those +2T>C (and +2T>A) variants that generated wild-type transcripts represents a quantitative change that can vary from barely detectable to an almost full expression of wild-type transcripts, with wild-type transcripts often co-existing with aberrantly spliced transcripts.

Conclusion: Our findings highlight the challenges that we still face in attempting to accurately identify splice-altering variants.
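The comparison described in the Results, in which a predictor's call is set against the experimentally observed outcome for each variant, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' analysis: the variant names, scores, wild-type fractions, and the 0.5 cutoff are all made-up assumptions, and the record above does not state which thresholds or metrics the paper used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Variant:
    name: str                  # hypothetical label, not a real variant
    predicted_score: float     # assumed predictor score in [0, 1]; higher = more likely splice-altering
    wild_type_fraction: float  # assumed measured share of wild-type transcript, 0.0 to 1.0


def predicted_disruptive(v: Variant, cutoff: float = 0.5) -> bool:
    """Binary call from the predictor; the cutoff is an assumption."""
    return v.predicted_score >= cutoff


def observed_wild_type(v: Variant) -> bool:
    """True if any wild-type transcript was detected experimentally."""
    return v.wild_type_fraction > 0.0


def concordance(variants: Sequence[Variant], cutoff: float = 0.5) -> float:
    """Fraction of variants where prediction and experiment agree.

    Agreement means: predicted disruptive and no wild-type transcript seen,
    or predicted non-disruptive and some wild-type transcript seen.
    """
    if not variants:
        raise ValueError("no variants to compare")
    agree = sum(
        1 for v in variants
        if predicted_disruptive(v, cutoff) != observed_wild_type(v)
    )
    return agree / len(variants)


# Made-up values for illustration only; none come from the paper's datasets.
EXAMPLE = [
    Variant("hypothetical_1", predicted_score=0.9, wild_type_fraction=0.0),
    Variant("hypothetical_2", predicted_score=0.9, wild_type_fraction=0.05),
    Variant("hypothetical_3", predicted_score=0.2, wild_type_fraction=0.8),
    Variant("hypothetical_4", predicted_score=0.6, wild_type_fraction=0.5),
]

if __name__ == "__main__":
    print(f"concordance at cutoff 0.5: {concordance(EXAMPLE):.2f}")
```

Reducing the experimental result to a yes/no call hides the quantitative point made in the Results: in this toy data, hypothetical_2 (barely detectable wild-type transcript) and hypothetical_3 (near-full wild-type expression) both count simply as "wild-type observed".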

Item Type: Article
Date Type: Publication
Status: Published
Schools: Medicine
Publisher: Bentham Science Publishers
ISSN: 1389-2029
Date of First Compliant Deposit: 7 August 2020
Date of Acceptance: 3 February 2020
Last Modified: 13 Nov 2023 11:29
URI: https://orca.cardiff.ac.uk/id/eprint/134041

Citation Data

Cited 9 times in Scopus.