Abstract
Intragenomic gene conversion (IGC) is important in the evolution of bacteria but has been analyzed computationally in only a few strains of Escherichia coli. This paper describes a scientific workflow approach to analyzing IGC in all NCBI bacterial genomes. RECOMBFLOW, an extension to Taverna, automates this complex procedure. It incorporates new, reusable, generic processors for web and file access and navigation, which automate the integration of protein and genomic data from the Web and invoke sequence analysis tools. RECOMBFLOW analyzed >400 bacterial genomes, with a median analysis time per genome of <5 min. Results show that IGC varies greatly both between different species and among multiple genomes of the same species. We analyze for the first time the large variation of IGC in the pathogen Streptococcus pyogenes, and also in non-pathogenic bacteria. The workflow system approach enables the organization of large-scale computational analyses of multiple genomes and will facilitate future comparative studies of genome organization.
Original language | American English |
---|---|
Journal | International J. Bioinformatic Research and Applications |
Volume | 5 |
State | Published - 2009 |
Keywords
- Bacteria
- DNA recombination
- Intragenomic recombination
- Gene conversion
- Genomes
- Scientific workflow
- Streptococcus pyogenes
Disciplines
- Life Sciences
- Biology
- Genetics and Genomics