A NEW COMPARATIVE-GENOMICS APPROACH FOR DEFINING PHENOTYPE-SPECIFIC INDICATORS: APPLICATIONS IN ECOLOGY AND IN DISEASE ETIOLOGY



Zohar Pasternak1, Tom Ben Sasson2, Yossi Cohen1, Elad Segev2, Edouard Jurkevitch1
1 Department of Plant Pathology and Microbiology, The Hebrew University of Jerusalem, Rehovot, Israel
2 Department of Applied Mathematics, Holon Institute of Technology, Holon, Israel

Understanding how genotype links to phenotype is a fundamental problem in microbiology, with practical implications that include the precise identification of disease-causing agents and the characterization of the ecological functions of specific populations. We developed a bioinformatic tool, DiffGene, that automatically identifies marker genes specific to groups of genomes. It takes as input the complete gene content of all available fully sequenced genomes in any two groups and maps the presence or absence of each gene in each genome to find genes that are unique to one group. The ability of this approach to link genes to specific phenotypes was tested on predatory bacteria. Predatory bacteria, which seek out and consume other live bacteria, are ubiquitous in nature. Currently there are no known genetic markers distinguishing them from non-predatory bacteria, so estimating the effects of microbial predation in the environment is a daunting task. Using DiffGene, we found a predator-specific marker of 60 amino acids in the tryptophan 2,3-dioxygenase protein that was present in all known obligate predator genomes and absent from all non-predatory bacteria. PCR primers that amplify this marker were designed for high-throughput sequencing and applied to environmental samples; the bacterial predators known to be present in the tested environment were detected, along with a wealth of putative novel predatory bacteria. DiffGene was also applied to 60 E. coli genomes, correctly mapping the various pathogroups. This new tool is thus useful in both medical microbiology and microbial ecology. It can further help to elucidate the role of bacterial assemblages in ecology and disease, and to uncover the key genetic determinants underlying the changing bacterial milieu.
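
The presence/absence comparison described above can be illustrated with a minimal Python sketch. It is not DiffGene's actual implementation, which the abstract does not specify. It assumes that each genome has already been reduced to a set of gene-family identifiers (for example, by orthology clustering), and the function name `group_specific_genes`, the genome names, and the family labels are all hypothetical.

```python
from typing import Dict, List, Set


def group_specific_genes(
    target: Dict[str, Set[str]],
    background: Dict[str, Set[str]],
) -> List[str]:
    """Return gene families found in every target genome and in no background genome.

    Both arguments map a genome name to the set of gene-family identifiers
    it contains (a hypothetical, pre-clustered representation).
    """
    if not target:
        return []
    # Families shared by all genomes of the target group.
    core = set.intersection(*(set(genes) for genes in target.values()))
    # Families seen in at least one genome of the background group.
    seen_in_background: Set[str] = set()
    for genes in background.values():
        seen_in_background |= set(genes)
    return sorted(core - seen_in_background)


# Toy data only; these are not real genomes or gene families.
predators = {
    "predator_1": {"famA", "famB", "famC"},
    "predator_2": {"famA", "famC", "famD"},
}
non_predators = {
    "non_predator_1": {"famB", "famE"},
    "non_predator_2": {"famD", "famE"},
}

markers = group_specific_genes(predators, non_predators)
print(markers)
```

In this toy input, `famA` and `famC` occur in both hypothetical predator genomes and in neither non-predator genome, so they are the only candidates returned. A strict all-versus-none criterion is used here for simplicity; a practical tool would likely need to tolerate incomplete annotation and sequence divergence.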