APPRIS principal isoforms and MANE Select transcripts in clinical variant interpretation

Abstract

Most coding genes are able to generate multiple alternatively spliced transcripts. Determining which of these transcript variants produces the main protein isoform, and which splice variants are functionally important, is crucial in comparative genomics and essential for clinical variant interpretation. Here we show that the principal isoforms chosen by APPRIS and the MANE Select variants provide the best approximations of the main cellular protein isoforms. Principal isoforms are predicted from conservation and from protein features, and MANE transcripts are chosen from the consensus between teams of expert manual curators. APPRIS principal isoforms coincide in over 94% of coding genes with MANE Select transcripts and the two methods are particularly discriminating when they agree on the main splice variant. Where the two methods agree, the splice variants coincide with the main isoform detected in proteomics experiments in 98.2% of genes with multiple protein isoforms. We also find that almost all ClinVar pathogenic mutations map to MANE Select or APPRIS principal isoforms. Where APPRIS and MANE agree on the main isoform, 99.93% of validated pathogenic variants map to principal rather than alternative exons. MANE Plus Clinical transcripts cover most validated pathogenic mutations in alternative coding exons. TRIFID functional importance scores are particularly useful for distinguishing clinically important alternative isoforms: the highest scoring TRIFID isoforms are more than 300 times more likely to have validated pathogenic mutations. We find that APPRIS, MANE and TRIFID are important for determining the biological relevance of splice isoforms and should be an essential part of clinical variant interpretation.

Competing Interest Statement

The authors have declared no competing interest.

Publication
bioRxiv
