Chemistry and Biochemistry,
Biochemistry, Molecular Biology
Bioinformatics GPB Home Area,
Graduate Program in Biochemistry, Molecular and Structural Biology,
Molecular Biology Institute
Biophysics and Structural Biology,
Proteomics and Bioinformatics
My lab works in several areas of bioinformatics:
1. alternative splicing and genome evolution: alternative splicing is the process by which a single gene can produce multiple gene products with different specific functions (by controlling which exons are spliced into the final product). My lab has done extensive genome-wide analysis of both the types and functions of alternative splicing and its apparent role in the evolution of mammalian genomes. Alternative splicing appears to have greatly accelerated major evolutionary events such as exon creation, and is now an exciting new area of research in genome evolution.
2. protein evolutionary pathways. Using a massive dataset of clinical HIV sequences, we have developed new methods to decode the evolutionary pathways by which HIV evolves drug resistance, a major clinical problem for the treatment of AIDS. We have just shown that our methods can measure the detailed "fitness landscape" describing how HIV proteins can evolve, as a kinetic network showing the actual rate of evolution along every possible evolutionary pathway. This work aims both at a new level of understanding of pathogen evolution and at the ability to predict the detailed evolutionary pathways that lead to drug resistance.
3. graph databases for bioinformatics and genomics. Graph structures are becoming increasingly important and universal in bioinformatics, as a flexible way of describing and querying genomic data. We have developed a general framework for working with genomic data as an abstract graph database, with very high performance for fundamental problems such as multiple genome alignment query and protein interaction network analysis. These problems also pose interesting computer science questions. For more information see http://www.bioinformatics.ucla.edu/pygr.
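To make the link between areas 1 and 3 concrete, here is a minimal sketch of alternative splicing viewed as a graph: exons are nodes, splice junctions are directed edges, and each path from the first exon to the last is one possible transcript form. It is written in plain Python and does not use pygr or its API; the exon names (E1, E2, E3) and the function `enumerate_isoforms` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical splice graph: nodes are exons, edges are splice junctions.
# E2 is a cassette exon, so E1 can splice to E2 or skip directly to E3.
SPLICE_GRAPH: Dict[str, List[str]] = {
    "E1": ["E2", "E3"],
    "E2": ["E3"],
    "E3": [],
}


def enumerate_isoforms(graph: Dict[str, List[str]], start: str, end: str) -> List[List[str]]:
    """Return every exon path from start to end, found by depth-first search."""
    if start == end:
        return [[end]]
    paths: List[List[str]] = []
    for next_exon in graph.get(start, []):
        for tail in enumerate_isoforms(graph, next_exon, end):
            paths.append([start] + tail)
    return paths


isoforms = enumerate_isoforms(SPLICE_GRAPH, "E1", "E3")
for path in isoforms:
    print("-".join(path))
```

In this toy graph the two paths correspond to the exon-inclusion and exon-skipping forms of a single gene. A real splice graph would be built from experimental evidence such as EST alignments rather than written by hand.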
Focus on Alternative splicing: Annotation of the human genome by gene prediction methods has turned out to be very error-prone; fortunately, there is abundant experimental data that we can use instead. In particular, high-throughput shotgun sequencing of mRNA fragments (Expressed Sequence Tags, or ESTs) provides a massive dataset for seeing exactly what's expressed, discovering gene structures, and identifying alternative splice forms. Whereas alternative splicing was previously considered to be a relatively rare form of functional regulation (perhaps present in 5-15% of genes), EST analyses have indicated that it is ubiquitous, observable in 40-60% of human genes. Using this combination of experimental data and bioinformatics methods, our lab has identified over 30,000 alternative splicing events in the human genome (effectively doubling the number of transcript forms relative to the consensus estimate of approximately 32,000 human genes). These data are used by researchers around the world via our online ASAP database (www.bioinformatics.ucla.edu/ASAP).
These data provide many fascinating windows into the regulation of biological function, when careful statistics are employed to assess the significance of apparent patterns and shifts in alternative splicing within the data. For example, we have identified a large subset of alternative splice forms that display strong tissue-specificity, indicating functional regulation of the transcript and protein product in an individual tissue or developmental stage. Similarly, we have identified a large set of genes whose splicing is altered dramatically in tumors relative to normal tissue, suggesting that alternative splicing may play a significant role in cancer, for example, by contributing to maintenance of the transformed state. We have also obtained very interesting results from analysis of how alternative splicing changes protein domain architecture and function.
Recently, we analyzed the comparative genomics of alternative splicing by comparing alternative splicing patterns in orthologous genes from a number of vertebrate genomes. Surprisingly, whereas 98% of exons from the human genome are also found in the orthologous mouse and rat genes, alternatively spliced exons showed a 30-fold increase in newly created exons (that is, exons that were found in the human genome but not mouse, indicating that they were created subsequent to the split of these two genomes from their common ancestor). These data suggest that alternative splicing may play an important role in accelerating gene evolution by enabling much more rapid exon creation than is possible without alternative splicing. Our lab is looking at many aspects of the comparative genomics of alternative splicing and its role in genome evolution.
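As a rough illustration of what a fold increase of this kind means, the sketch below compares the fraction of newly created exons among alternatively spliced exons with the fraction among constitutive exons. All counts are hypothetical placeholders picked so that the ratio comes out to 30; they are not the lab's data, and the function name `fold_increase` is likewise an assumption made for this example.

```python
from fractions import Fraction


def fold_increase(new_alt: int, total_alt: int, new_const: int, total_const: int) -> Fraction:
    """Ratio of the newly-created-exon fraction in alternative vs. constitutive exons."""
    alt_fraction = Fraction(new_alt, total_alt)
    const_fraction = Fraction(new_const, total_const)
    return alt_fraction / const_fraction


# Hypothetical counts, for illustration only (not measured values).
fold = fold_increase(new_alt=60, total_alt=1000, new_const=20, total_const=10000)
print(f"newly created exons are {float(fold):.0f}-fold more frequent among alternative exons")
```

Exact fractions are used here to avoid floating-point rounding in the ratio. A real analysis would also need a statistical test on the underlying counts before drawing any conclusion.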
Prof. Lee has been a faculty member in the Department of Chemistry and Biochemistry since 1998. His training provided an unusual combination of experimental cell biology, biophysics, and algorithm development, which he has applied at UCLA to bioinformatics analysis of genome evolution. He has led efforts to establish a bioinformatics Ph.D. program at UCLA. He has served on the Board of Scientific Counselors, NIH National Center for Biotechnology Information, and serves on the editorial board of Biology Direct. His current research focuses on alternative splicing and its role in genome evolution.
A selected list of publications: