Research report 2020 - Max Planck Institute for Developmental Biology
Tiny intron prediction and the current limits of machine learning
Swart, Estienne Carl
Forschergruppe Genomik und Molekularbiologie der Ciliaten
Summary
Though we are in the era of thousand-genome projects, the genes predicted within these genomes still leave much to be desired. In particular, some of the simplifying assumptions built into gene prediction result in errors as soon as the peculiarities of molecular biology come into play. Thus, there is a continued need to improve the machine learning and other algorithms used in gene prediction. In the course of assembling and annotating new genomes, we developed a program, Intronarrator, to overcome the gene prediction inaccuracies caused by tiny introns by predicting introns directly from deep RNA sequencing data.