Universiteit Hasselt - Knowledge in action

Centre for Statistics (CENSTAT) : project R-7699

Abstract: Next-generation sequencing (NGS) technology produces millions of short reads. One NGS-based application is RNA sequencing (RNA-seq), which is widely used to study gene (transcript or exon) expression. To quantify the gene expression level, the short sequenced reads first need to be identified. After that, expression summaries, i.e., read counts, are generated. Mapping the short reads is therefore a key step in RNA-seq data processing. Read mapping makes it possible to find a region where a short read is identical or similar to a genomic or transcriptomic location. However, such matching may not be accurate: a sequenced read may be matched to multiple locations. In practice, ambiguously mapped reads make it difficult to find the region from which they truly originate and, as a consequence, to estimate its abundance. The existing methods for assigning ambiguous reads produce biased abundance estimates. In this project, we will develop two novel approaches for allocating multiple-mapped reads, a theoretical framework and a weighted approach, which alleviate this bias. Moreover, the ambiguity problem also occurs in gene-isoform quantification, because different transcripts can share the same exon. To estimate the gene-isoform expression level, shared exons should be incorporated into the statistical model.
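The abstract does not specify how the weighted approach works. As a rough illustration of what allocating multiple-mapped reads can look like, the sketch below uses a common heuristic that is not the project's method: each ambiguous read is split across its candidate genes in proportion to the number of uniquely mapped reads those genes already have. The function name `allocate_reads`, the input layout, and the equal-split fallback are all assumptions made for this example.

```python
from collections import defaultdict


def allocate_reads(alignments):
    """Split multi-mapped reads across genes in proportion to unique counts.

    alignments: dict mapping a read id to the list of distinct gene ids
    it aligns to (hypothetical input layout, assumed for this sketch).
    Returns a dict mapping gene id to a fractional read count.
    """
    # Pass 1: count reads that map to exactly one gene.
    unique = defaultdict(float)
    for genes in alignments.values():
        if len(genes) == 1:
            unique[genes[0]] += 1.0

    counts = defaultdict(float, unique)

    # Pass 2: distribute each ambiguous read by unique-count weights.
    for genes in alignments.values():
        if len(genes) > 1:
            total = sum(unique.get(g, 0.0) for g in genes)
            for g in genes:
                if total > 0:
                    weight = unique.get(g, 0.0) / total
                else:
                    # No unique evidence for any candidate: split equally.
                    weight = 1.0 / len(genes)
                counts[g] += weight

    return dict(counts)


example = {
    "r1": ["A"], "r2": ["A"], "r3": ["A"],
    "r4": ["B"],
    "r5": ["A", "B"],   # ambiguous between A and B
    "r6": ["C", "D"],   # ambiguous, no unique evidence
}
result = allocate_reads(example)
```

In this toy input, read `r5` contributes 0.75 to gene A and 0.25 to gene B, and the fractional counts still sum to the total number of reads. Heuristics of this kind are a typical source of the bias the abstract mentions, since genes with few or no unique reads receive little weight.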
Period of project : 1/01/2017 to 31/12/2017
