During my first year in the Computational Biology and Bioinformatics PhD program, I rotated with three labs before choosing one for my dissertation. I worked with (a) Alex Hartemink, (b) Sandeep Dave, and (c) Ed Iversen (with Ravi Karra). After working with these excellent people, I decided to return to the Dave Lab for my dissertation work. You can find my CV here. Below are brief descriptions of projects I've recently been a part of.

Clonal analysis of cardiomyocyte growth and regeneration

Advisor: Ed Iversen (with Ravi Karra)
In my third rotation, I worked on building a statistical model for the analysis of cardiomyocyte proliferation. The biological questions were how the distribution of proliferating cardiomyocytes differs from the background growth rate and how the location of cardiomyocytes correlates with the vasculature. We worked with three-channel fluorescence microscopy images of heart slices that had been injured. I'm currently wrapping up this project. Hit me up to know more!

Reconstructing Immunoglobulin sequences from bulk sequencing data

Advisor: Sandeep Dave
In my second rotation, I worked on immunoglobulin reconstruction from tumor bulk sequencing data. The immunoglobulin locus is highly variable: it is mutated to the extent that almost every B cell has a different sequence at this locus. This extreme variability is the secret behind our body's excellent immune system. In lymphomas, one of these B cells is clonally replicated, so there exists a population of cells that share a particular clonotypic sequence at this locus. Single-cell methods have recently been used to identify these clonotypes. I used bulk sequencing data to accomplish the same task. I presented my work on this project at the CBB 2019 retreat poster session; you can find my poster here. Hit me up to know more!

Understanding non-coding transcripts

Advisor: Alex Hartemink
In my first rotation, I had the pleasure of working with Alex Hartemink on RNA-seq expression data for non-coding transcripts. We know that non-coding transcription is pervasive in the yeast genome. I developed and implemented a systematic classification of pertinent transcripts based on their location with respect to coding genes. The next step was to identify the relationship between changes in the transcription of genes and of their adjacent non-coding transcripts. I ended the project by looking at the gain or loss of nucleosomal structure in these adjacent gene-transcript pairs. Hit me up to know more!
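To give a flavor of what a location-based classification can look like, here is a minimal sketch. It is not the project's actual code: the category names ("intergenic", "antisense", "sense_overlapping"), the `Feature` record, and the toy coordinates are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval; coordinates are 1-based and inclusive (assumed convention)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a: Feature, b: Feature) -> bool:
    """True when two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def classify(transcript: Feature, genes: list[Feature]) -> str:
    """Assign a hypothetical location class to one non-coding transcript."""
    hits = [g for g in genes if overlaps(transcript, g)]
    if not hits:
        return "intergenic"
    if any(g.strand != transcript.strand for g in hits):
        return "antisense"
    return "sense_overlapping"


# Toy annotation, invented for this example.
genes = [
    Feature("GENE_A", "chrI", 100, 500, "+"),
    Feature("GENE_B", "chrI", 900, 1500, "-"),
]
transcripts = [
    Feature("nc1", "chrI", 400, 600, "-"),    # overlaps GENE_A on the opposite strand
    Feature("nc2", "chrI", 600, 800, "+"),    # lies between the two genes
    Feature("nc3", "chrI", 1000, 1200, "-"),  # inside GENE_B on the same strand
]

labels = {t.name: classify(t, genes) for t in transcripts}
print(labels)
```

A real classification would likely also distinguish upstream from downstream transcripts and use an interval index rather than a linear scan over genes.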

Selecting Features from Sample Specific Coexpression Networks using Random Forests

Advisor: Chloé-Agathe Azencott
Sample-specific coexpression networks may be derived from the aggregate network by estimating the effect of each sample on the network. The proposition here is that there might be other dissimilarity measures for calculating the edge weights that are at least as expressive. I used random forests to predict the edge weights using measures including L1, L2, and Mahalanobis distances, and identified important features using permutation feature importance. (Code) Hit me up to know more!
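For readers unfamiliar with permutation feature importance, here is a small, model-agnostic sketch of the idea. It is not the project's code: a fixed toy function stands in for a fitted random forest, and the three synthetic features stand in for dissimilarity measures, so every name and number below is an assumption for illustration.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions against targets."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: 200 samples, 3 features (placeholders for distance measures).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]


def toy_model(row):
    """Stand-in for a trained model: uses feature 0 strongly, feature 1 weakly, ignores feature 2."""
    return 2.0 * row[0] + 0.5 * row[1]


y = [toy_model(row) for row in X]  # targets the stand-in model predicts perfectly

scores = permutation_importance(toy_model, X, y)
print(scores)
```

Shuffling a feature the model relies on degrades its predictions, so a larger score means a more important feature; the ignored feature scores zero. With a real random forest, the same procedure would be run on held-out data.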

Modeling the optimal propensity of lysogeny for coexisting populations

Advisor: Supreet Saini
Temperate phages make a developmental decision between lysogeny and lysis in order to avoid the extinction of not only their own species but also their bacterial hosts. I worked on estimating the optimal lysogenic propensity, as a function of the environmental stresses on individual species and the multiplicity of infection, that maximizes coexistence (bioRxiv). Hit me up to know more!