CSI 4900 Projects
- My workload for 2022-2023 and 2023-2024 does not leave me enough time to supervise Honours projects.
- Evaluating space-filling curve representations of protein sequences with convolutional and recurrent networks.
-
Splice site prediction in mitochondrial genomes
-
End goal (long term, possibly beyond the scope of a CSI 4900 project):
-
Creating a tool to predict splice sites specifically for mitochondrial (mt) genomes.
- There are many splice site prediction tools, but I don’t know of any tool specific to organelle genomes.
- By splice site prediction, we mean predicting the boundaries between exons and introns in eukaryotic genomes.
-
Steps:
-
Creating a comprehensive, large-scale data set of splice site junctions.
-
This would be done using RNA-Seq data.
- RNA-Seq is the high-throughput sequencing of the RNA transcribed from expressed genes.
- This involves mapping the RNA-Seq reads to the reference genome using existing tools.
-
Creating a tool
- Using deep learning (CNN, BLSTM, etc.) to predict splice site junctions.
-
Future work:
- Adding expression level information to quantify the expression of the transcripts.
- Adding exon/intron predictors to possibly improve accuracy.
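The first two steps above can be illustrated with a minimal sketch. It assumes that the RNA-Seq reads have already been mapped by an existing splice-aware aligner that reports alignments with SAM-style CIGAR strings, in which an `N` operation marks a skipped region of the reference (a candidate intron). The function names, the 0-based coordinates, and the window size are hypothetical choices made for illustration only; they are not part of any existing tool. The last function shows one common way to turn the sequence around a candidate junction into a numeric input for a CNN or BLSTM.

```python
import re
from collections import Counter

# Matches one CIGAR operation, e.g. "50M" or "1200N".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference genome.
CONSUMES_REFERENCE = set("MDN=X")


def junctions_from_cigar(ref_start, cigar):
    """Return (intron_start, intron_end) pairs for one spliced alignment.

    Assumption: ref_start is the 0-based reference position of the first
    aligned base, and intron_end is exclusive. Each N operation is
    treated as one candidate intron.
    """
    junctions = []
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((pos, pos + length))
        if op in CONSUMES_REFERENCE:
            pos += length
    return junctions


def count_junctions(alignments):
    """Count how many reads support each junction.

    alignments is an iterable of (ref_start, cigar) pairs.
    """
    support = Counter()
    for ref_start, cigar in alignments:
        support.update(junctions_from_cigar(ref_start, cigar))
    return support


def one_hot_window(genome, site, flank):
    """One-hot encode the 2 * flank bases centred on a candidate site.

    Rows follow the column order A, C, G, T. Any other symbol (for
    example N) becomes an all-zero row.
    """
    if site - flank < 0 or site + flank > len(genome):
        raise ValueError("window extends past the end of the sequence")
    window = genome[site - flank:site + flank].upper()
    return [[1 if base == letter else 0 for letter in "ACGT"] for base in window]


# Two spliced reads supporting the same junction, and one unspliced read.
example_alignments = [(100, "50M1200N50M"), (120, "30M1200N70M"), (10, "100M")]
example_support = count_junctions(example_alignments)
example_window = one_hot_window("ACGTNACG", 4, 2)
```

Here `example_support` maps the junction `(150, 1350)` to a read count of 2. Requiring a minimum number of supporting reads is one simple way to filter out spurious junctions before they are used as training labels.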
Past Projects:
-
D3.js visualisation of sequence and structure RNA motifs
2014 F, Joseph Sleiman
-
An Efficient and Effective Algorithm to Evolve Regular Expressions
2012 F, Manuel Belmadani
-
Application Web pour la gestion des ressources du 412e Escadron
2011, Jean-Philippe Pellerin
-
Iterative Maximum Parsimony Multiple Sequence Alignment (ParAli)
2010 W, Derek O'Brien
-
A Genetic Programming Approach to RNA-RNA Interaction Motif Discovery (GP-RNA^2)
2009 F, Christopher Saunders
-
Approximate Matching of RNA Secondary Structure Expressions Containing Pseudoknots (pkSeed)
2006 F, Penny J.X. Pan
-
Progressive Simultaneous Alignment and Structure Prediction of Multiple RNA Sequences (hD)
2006 W, Luke Cen
-
Implementation of Range Minimum Query Algorithm (RMQ)
2006 W, Ayse Abacioglu
-
Approximate Matching of RNA Secondary Structure Expressions (RNA Matching)
2005 W, Sol Ackerman
-
Implementing a Parallel Version of Dynalign for the SunFire V880 architecture (pD)
2004 W, Philippe Desjardins
-
A Genetic Programming Approach to RNA Secondary Structure Motif Discovery (GP)
2003 F, Robert Collier
-
Simulating Genetic Drift (Sim)
2003 F, Alain Gagnon
-
RNA Secondary Structure Viewer (RS2V)
2003 F, Dina Bilenkis
-
String Algorithms in Java (Suffix Trees)
2003 W, Daniela Cernea
-
Learning Representations of Protein Inter-Domain Linkers Using Inductive Logic Programming (Linkers)
2003 W, Patrick Wisking
-
Intelligent Agents for Updating Biological Databases (AgentDB)
2003 W, Navneet Bhalla
-
Simulator for the TC-1101 Computer (VM)
2002 W, Yvgeniya Lozdernik
-
Protein Viewer/Modeler Written with Java 3D (Java Protein 3D Viewer)
2002 W, Andrew Henry, Elton Lum and Devin Kennedy