System. Appl. Microbiol. 26, 483–494 (2003)
© Urban & Fischer Verlag
http://www.urbanfischer.de/journals/sam

The Variable Part of the dnaK Gene as an Alternative Marker for Phylogenetic Studies of Rhizobia and Related Alpha Proteobacteria

Tomasz Stępkowski1, Magdalena Czaplińska1, Katarzyna Miedzinska1, and Lionel Moulin2,3

1 Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznań, Poland
2 Laboratoire des Symbioses Tropicales et Méditerranéennes, IRD-CIRAD-INRA-ENSAM, Montpellier cedex 5, France
3 Department of Biology 3, University of York, York, UK

Received: June 26, 2003

Summary

DnaK is the 70 kDa chaperone that prevents protein aggregation and supports the refolding of damaged proteins. Because of its sequence conservation and ubiquity, this chaperone has been widely used in phylogenetic studies. In this study, we applied the less conserved part that encodes the so-called α-subdomain of the substrate-binding domain of DnaK for phylogenetic analysis of rhizobia and related non-symbiotic alpha-Proteobacteria. A single 330 bp DNA fragment was routinely amplified from DNA templates isolated from species of the genera Azorhizobium, Bradyrhizobium, Mesorhizobium, Rhizobium and Sinorhizobium, and also from some non-symbiotic alpha-Proteobacteria such as Blastochloris, Chelatobacter and Chelatococcus. Phylogenetic analyses revealed high congruence between dnaK and 16S rDNA trees, although the trees were not identical. In contrast, the partition homogeneity tests revealed that dnaK sequence data could be combined with other housekeeping genes such as recA, atpD or glnA. The dnaK trees exhibited good resolution in the cases of the genera Mesorhizobium, Sinorhizobium and Rhizobium, even better than usually shown by 16S rDNA phylogeny. The dnaK phylogeny supported the close phylogenetic relationship of Rhizobium galegae and Agrobacterium tumefaciens (R.
radiobacter) C58, which together formed a separate branch within the fast-growing rhizobia, albeit closer to the genus Sinorhizobium. The Rhizobium and Sinorhizobium genera carried an insertion composed of two amino acids, which additionally supported the phylogenetic affinity of these two genera, as well as their distinctness from the Mesorhizobium genus. Consistent with the phylogeny shown by 16S–23S rDNA intergenic region sequences [62], the dnaK trees divided the genus Bradyrhizobium into three main lineages, corresponding to B. japonicum, B. elkanii, and photosynthetic Bradyrhizobium strains that infect Aeschynomene plants. Our results suggest that the 330 bp dnaK sequences could be used as an additional taxonomic marker for rhizobia and related species (as an alternative to the 16S rRNA gene phylogeny).

Key words: alpha Proteobacteria – dnaK – gene marker – phylogeny – rhizobium – symbiosis

Nucleotide sequence data reported are available in the EMBL database under the accession numbers: AJ431131, AJ431133, AJ431134, AJ431135, AJ431136, AJ431137, AJ431138, AJ431139, AJ431140, AJ431141, AJ431142, AJ431143, AJ431144, AJ431145, AJ431146, AJ431147, AJ431148, AJ431149, AJ431150, AJ431151, AJ431152, AJ431153, AJ431154, AJ431155, AJ431156, AJ431157, AJ431158, AJ431159, AJ431160, AJ431161, AJ431162, AJ431163, AJ431164, AJ431165, AJ431166, AJ431167, AJ431168, AJ431169, AJ431170, AJ431171, AJ431172, AJ431173, AJ431174, AJ493254, AJ510113, AJ544178, AJ544179.

0723-2020/03/26/04-483 $ 15.00/0

Introduction

Rhizobia have the rare ability to form a nitrogen-fixing symbiosis with leguminous plants. This ability is conferred by a unique class of nodulation genes responsible for the synthesis of Nod factors – specific morphogens responsible for rhizobia recognition and induction of root cortical cell divisions that result in nodule formation [10]. Classification of legume root-nodule bacteria is based on the analysis of numerous phenetic and genetic data in a process that is termed a polyphasic approach [53]. This combination of various methods allowed the description of ten genera and around 40 species. Until recently, rhizobia have been classified in the genera Allorhizobium, Azorhizobium, Bradyrhizobium, Mesorhizobium, Rhizobium and Sinorhizobium [6, 7, 11, 20, 21, 44]. However, four new lineages of nodule bacteria related to the genera Devosia, Methylobacterium, Burkholderia and Ralstonia have been described in the last two years. Interestingly, the latter two belong to the beta subclass of Proteobacteria, being the most phylogenetically distant with respect to all rhizobial species [5, 32, 37, 43].

While the genera Azorhizobium and Bradyrhizobium, as well as nodulating Methylobacterium and Burkholderia spp., belong to distinct phylogenetic lineages, the taxonomic status of the fast-growing Rhizobium, Mesorhizobium and Sinorhizobium genera is less certain. For instance, Rhizobium galegae is phylogenetically closer to pathogenic Agrobacterium spp., whereas the delineation of Rhizobium and Sinorhizobium spp. as discrete genera is disputed [50]. These controversies have been at least partially resolved by the inclusion of the genera Agrobacterium and Allorhizobium in the genus Rhizobium [71].

Among all approaches used, the most decisive step in the classification process has been the analysis of rRNA gene sequences [38, 60, 69]. However, the application of 16S rRNA gene markers is limited by low resolution of closely related species. Therefore, a procedure based on DNA-DNA reassociation becomes decisive in species delineation. In principle, strains sharing 70% or more DNA similarity are classified as a single species. However, DNA-DNA reassociation is a relatively expensive method and not always repeatable, which limits the application of this technique [53]. These difficulties resulted in the search for other macromolecules as potential phylogenetic markers.
In principle, housekeeping protein-coding genes accumulate more substitutions than 16S or 23S rRNA genes. Among the proteins examined as alternative phylogenetic markers, the most consistent with 16S rRNA phylogeny were the phylogenies obtained with GroEL, RecA, ATPase β-subunit, elongation factor Tu, and RNA polymerases [12, 14, 30, 54].

Hsp70 is a molecular chaperone responsible for various cellular processes, including the folding of nascent polypeptides, assembly and disassembly of protein complexes, protein degradation and membrane translocation of secreted proteins [3]. In prokaryotes, Hsp70 is better known as DnaK protein. A large number of dnaK sequences in the databases, representing various groups of organisms, enabled the use of this gene in phylogenetic studies. However, DnaK phylogeny contradicts the three-domain dogma of all living organisms, and predicts a close and specific relationship between Archaea and gram-positive bacteria, as well as between eukaryota and gram-negative bacteria [17, 18]. Thus, in Archaea the dnaK gene has apparently been acquired either from gram-positive bacteria or from the Thermotoga maritima cluster [16].

The product of dnaK is a 70 kDa protein that consists of two domains: the ATPase and the substrate-binding domain. In Escherichia coli, the ATPase and substrate-binding domains correspond to residues 1–385 and 386–605, respectively. The 3-D structures of the ATPase and the substrate-binding domain of DnaK protein have been determined [73], while the structure of the remaining C-terminal fragment (about 30 amino acids) remains unresolved. In E. coli, the loss of the fragment corresponding to residues 540–637 did not affect growth at higher temperature or λ phage propagation, although it abolished σ32 subunit degradation [29]. The substrate-binding domain is composed of two subdomains: the highly structured β-sandwich (amino acids 396–501) and the more variable α-subdomain.
The α-subdomain is composed mainly of α-helices that are termed αA, αB, αC, αD and αE [3, 73]. The α-subdomain is assumed not to interact directly with the peptide substrate. Instead, it forms a lid covering the peptide-binding groove that closes or opens depending on the ATP/ADP-dependent conformational status. This part is also less conserved than the ATPase or β-sandwich parts of the substrate-binding domain [41, 73].

The divergence of nucleotide sequences corresponding to the α-subdomain encouraged us to design a pair of specific primers homologous to the conserved parts of the 3′ region of the dnaK gene for PCR amplification and to use the resulting sequences in phylogenetic studies of root-nodule bacteria. The results suggest that this variable part of dnaK could be used as an alternative or additional taxonomic marker of rhizobia and related species.

Materials and Methods

Bacterial strains

All strains are listed in Table 1. Yeast-extract mannitol YMB medium [55] was used for growth and maintenance of the strains.

PCR amplification

PCR reactions were carried out using either total genomic DNA isolated as described elsewhere [1] or, for the majority of strains, boiled bacterial suspensions. Genomic DNA and boiled-cell suspensions were prepared from single colonies obtained after serial dilutions in 0.01% Tween 20–10 mM MgSO4. The 3′ region of dnaK genes was amplified using TSdnaK3 (5′-AAGGAGCAGCAGATCCGCATCCA-3′; position 1468–1490 bp within the 1902 bp dnaK gene of B. japonicum USDA110) and TSdnaK2 (5′-GTACATGGCCTCGCCGAGCTTCA-3′; position 1794–1772 bp), and PCR reactions were carried out using the Expand™ High Fidelity PCR System (Boehringer Mannheim GmbH, Germany) according to the manufacturer’s recommendations. The PCR protocol was as follows: 1 min of initial denaturation at 94 °C followed by 35 cycles of 1 min at 94 °C, 1 min at 62 °C and 40 sec at 72 °C.
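As a quick consistency check on the primer coordinates quoted above, the following Python sketch derives the span between the two primers on the B. japonicum USDA110 dnaK gene. The variable and function names are illustrative assumptions and not part of the original protocol; only the primer sequences and coordinates come from the text. The span works out to 327 bp, in line with the approximately 330 bp product reported.

```python
# Primer sequences and gene coordinates as quoted in the text (1-based, inclusive).
FORWARD = "AAGGAGCAGCAGATCCGCATCCA"   # TSdnaK3, positions 1468-1490
REVERSE = "GTACATGGCCTCGCCGAGCTTCA"   # TSdnaK2, positions 1794-1772 (opposite strand)
FWD_START, FWD_END = 1468, 1490
REV_START, REV_END = 1794, 1772


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Span from the 5' end of the forward primer to the 5' end of the reverse primer.
amplicon_len = REV_START - FWD_START + 1

# Portion of the product that lies between the two primers.
internal_len = amplicon_len - len(FORWARD) - len(REVERSE)

# Sequence the reverse primer corresponds to when read on the coding strand.
reverse_site_on_coding_strand = reverse_complement(REVERSE)

print(f"expected product: {amplicon_len} bp, of which {internal_len} bp lie between the primers")
```

Both primers are 23 nt long, which matches the coordinate ranges given for them.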
The 330 bp PCR products obtained were purified directly with QIAquick Gel Extraction-PCR purification columns (QIAGEN, Germany) and sequenced directly using the ABI Prism 310 capillary apparatus (Applied Biosystems, Foster City, California). The accession numbers for dnaK sequences are listed in Table 1.

Phylogenetic analyses

All sequence data sources are indicated in Table 1 and in the legends of Figs. 1 and 2. The nucleotide and protein sequences were aligned using Clustal X [46] and alignments were optimised manually. All phylogenetic analyses were performed using PAUP 4.0b10 [42]. 16S rRNA and dnaK gene phylogenies were

Table 1. List of bacterial strains used in this study.

Columns: Strains; Legume host; AN for 16S rDNA; AN for dnaK; Source or reference

Azorhizobium caulinodans ORS571 Blastochloris sulfoviridis DSM729 Bradyrhizobium elkanii USDA76 Bradyrhizobium japonicum USDA110 Bradyrhizobium sp. ANU289 Bradyrhizobium sp. ARC403 Bradyrhizobium sp. BC-C2 Bradyrhizobium sp. CBP55 Bradyrhizobium sp. CBP70 Bradyrhizobium sp. CBP90 Bradyrhizobium sp. CCT6186 Bradyrhizobium sp. CCT6187 Bradyrhizobium sp. CCT6194 Bradyrhizobium sp. CCT6205 Bradyrhizobium sp. CCT6212 Bradyrhizobium sp. CCT6281 Bradyrhizobium sp. Jan2 Bradyrhizobium sp. NC92 Photosynthetic Bradyrhizobium sp. ORS278 Bradyrhizobium sp. Os2 Bradyrhizobium sp. Os6 Bradyrhizobium sp. USDA3042 Bradyrhizobium sp. USDA3259 Bradyrhizobium sp. USDA3505 Bradyrhizobium sp. USDA3517 Bradyrhizobium sp. WM9 Bradyrhizobium sp. Zarn2 Bradyrhizobium sp. Jan10 Brucella melitensis biovar ovis 63/290T ATCC25840 Brucella melitensis biovar suis 1330 Caulobacter crescentus CB15 Chelatobacter heintzii ATCC29600T DSM6450 Chelatococcus asaccharovorans TE2T Escherichia coli K12 (γ) Mesorhizobium loti NZP2037 Mesorhizobium sp. (huakuii bv. loti) MAFF303099 Mesorhizobium sp. A21 Mesorhizobium sp. AM18 Mesorhizobium sp. USDA3717 Mesorhizobium sp. WM5 M. tianshanense A-1BST (USDA3529) R. leguminosarum bv. phaseoli 8401 R. leguminosarum bv.
trifolii ANU843 R. leguminosarum bv. trifolii T24 R. leguminosarum bv. trifolii USDA7001 USDA7102 R. leguminosarum bv. viciae 3841 Sesbania rostrata non-symbiotic Glycine max Glycine max Parasponia andersonii Lupinus albus Chamaecytisus sp. Lupinus campestris Lupinus campestris Lupinus campestris Cajanus cajan Cajanus cajan Cajanus cajan Cajanus cajan Cajanus cajan Cajanus cajan Genista sp. Arachis hypogaea Aeschynomene indica Sarothamnus sp. Sarothamnus sp. Lupinus albus Phaseolus lunatus Lupinus montanus Faidherbia albida Lupinus luteus Sarothamnus scoparius Genista sp. non-symbiotic non-symbiotic non-symbiotic non-symbiotic non-symbiotic X67221 D86514 U35000 D13430 – – AF000551 AJ431131* AJ493254* AJ431152* Y09633 AJ431133* AJ431134* AJ431135* AJ431136* AJ431137* AJ431138* AJ431139* AJ431140* AJ431141* AJ544178* AJ544179* AJ510113* AJ431142* AJ431144* AJ431145* AJ431146* AJ431147* AJ431148* AJ431149* AJ431150* AJ431151* AF222752* AJ431153* AJ431143* M95799 [11] [19] P. van Berkum, USDA NC_002969 AE005675 AJ431154* genome genome T Egli, Switzerland AJ431155* AE000112 AJ431156* AP003004 T. Egli, Switzerland genome C. Ronson, New Zealand genome; [48] AJ431157* AJ431158* AJ431159* AJ431160* AJ431161* Y14649 AJ431163* AJ431164* AJ431166* AJ431167* Blast on genome Rhizobium galegae USDA4128 Rhizobium mongolense USDA1844T Rhizobium (Agrobacterium) radiobacter C58 Rhizobium tropici CFN299 (Type A) Rhizobium tropici CIAT899 (Type B) Rhizobium sp. USDA2163 Rhizobium sp. ARC402 Rhodopseudomonas palustris No7 Sinorhizobium meliloti 1021 Sinorhizobium sp. CCT6189 Sinorhizobium sp. NGR234 Sinorhizobium terangae USDA4894 Galega officinalis Medicago ruthenica non-symbiotic Phaseolus vulgaris Phaseolus vulgaris Trifolium sp. Lupinus luteus non-symbiotic Medicago sativa Cajanus cajan Lablab purpureum Acacia laeta W. Malek, Poland W. Malek, Poland P. van Berkum, USDA W. Malek, Poland [44] [8] B. Rolfe, Australia E. Triplett, USA P. van Berkum, USDA P. 
van Berkum, USDA http://www.sanger.ac.uk/ Projects/R_leguminosarum/ P. van Berkum, USDA P. van Berkum, USDA genome E. Martinez-Romero, Mexico E. Martinez-Romero, Mexico P. van Berkum, USDA S Raza non-symbiotic non-symbiotic Lotus sp. Lotus sp. Astragalus sp. Astragalus sp. Lupinus succulentus Lupinus luteus Glycyrrhiza pallidiflora Phaseolus vulgaris Trifolium subterraneum Trifolium sp. Trifolium isodon Trifolium sp. Pisum sativum AF239255 U69636 AF222751 L26168 NC_002969 AE006011 AJ011762 AJ294349 AE000474 AP003001 AF041447 U76341 U31074 Blast on genome X67226 U89817 AJ012209 X67233 U89832 AF184625 AL591782 AJ301628 X68387 AJ431162* AJ431168* NC_003304 AJ431169* AJ431170* AJ431165* AJ431171* D78133 NC_003047 AJ431172* AJ431173* AJ431174* B. Rolfe, Australia S. Raza, Egypt [56] E. Martinez-Romero, Mexico E. Martinez-Romero, Mexico E. Martinez-Romero, Mexico M. de Oliveira, Brasil M. de Oliveira, Brasil M. de Oliveira, Brasil M. de Oliveira, Brasil M. de Oliveira, Brasil M. de Oliveira, Brasil W. Malek, Poland [15] E. Giraud, France W. Malek, Poland W. Malek, Poland P. van Berkum, USDA P. van Berkum, USDA P. Van Berkum, USDA P. van Berkum, USDA W. Malek, Poland W. Malek, Poland W. Malek, Poland J. Denarie, France M. de Oliveira, Brasil W. Broughton, Switzerland P. van Berkum, USDA The table includes strains received from the USDA collection, as well as from other sources. All strains bearing the CCT acronym were isolated from nodules of pigeon pea (Cajanus cajan) used for trapping the rhizobia present in Cerrado soils in Brazil. The sequences obtained in this work are marked with an asterisk (*). AN: Accession Numbers. γ: gamma-proteobacteria. 
“genome” indicates that the sequences are available from whole genome sequencing (at http://www.jgi.doe.gov/JGI_microbial/html/index.html; or http://www.ncbi.nlm.nih.gov/genomes/static/eub_g.html); “Blast on genome” indicates that the sequences were obtained by Blast on primary sequences obtained from a whole genome sequencing project.

assessed by distance, parsimony and maximum likelihood (ML) methods, while DnaK protein phylogenies were assessed by distance and parsimony methods. For gene distance analyses, several different evolutionary models were tested: Jukes-Cantor (equal base frequencies, one substitution type [22]), Kimura two-parameter (equal base frequencies, unequal transition (Ti):transversion (Tv) ratio [23]) and F84 (unequal base frequencies, unequal Ti:Tv [13]). The trees were constructed with the neighbour joining method. A bootstrap analysis with 1000 replicates was performed to evaluate the confidence of the nodes, and nodes not supported by bootstrap values greater than 50% were kept unresolved. All parsimony analyses were performed using the heuristic search algorithm of PAUP with gaps treated as informative positions. The strict consensus method included in PAUP was used to obtain a single consensus tree.

For dnaK and 16S rDNA ML analyses, multiple heuristic searches using the tree bisection-reconnection (TBR) method were performed under PAUP with a model of two types of substitutions (HKY85 variant) and ML estimation of the Ti/Tv ratio and nucleotide frequencies. For the dnaK phylogeny, specific substitution rates were evaluated following the codon structure of the DNA sequence. This model allows a different substitution rate for each position in the codon. The third codon position usually evolves at high rates and may reach saturation, hiding phylogenetic signal. The ML starting trees were constructed either by neighbour joining or by stepwise addition of sequences.
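To make the difference between the first two distance models concrete, the sketch below computes Jukes-Cantor and Kimura two-parameter distances for a pair of aligned sequences using their standard closed-form expressions. It is a minimal illustration and not the PAUP implementation used in the study; the function names and the toy sequences are assumptions introduced here.

```python
import math

PURINES = {"A", "G"}


def substitution_fractions(seq_a: str, seq_b: str) -> tuple[float, float]:
    """Return (P, Q): fractions of transitions and transversions between two
    aligned sequences. Columns with gaps or ambiguous bases are skipped."""
    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return transitions / compared, transversions / compared


def jukes_cantor(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance: one substitution type, equal base frequencies."""
    p_ts, q_tv = substitution_fractions(seq_a, seq_b)
    p = p_ts + q_tv
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance: transitions and transversions are
    allowed different rates, base frequencies are still assumed equal."""
    p_ts, q_tv = substitution_fractions(seq_a, seq_b)
    return -0.5 * math.log(1.0 - 2.0 * p_ts - q_tv) - 0.25 * math.log(1.0 - 2.0 * q_tv)


# Toy alignment (hypothetical data, not from the study): one transition in ten sites.
toy_a = "AAAAAAAAAA"
toy_b = "AAAAAAAAAG"
print(jukes_cantor(toy_a, toy_b), kimura_2p(toy_a, toy_b))
```

When transitions dominate the observed differences, the Kimura distance exceeds the Jukes-Cantor distance for the same pair, which is one reason tree topologies can depend on the model chosen.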
The partition homogeneity tests (100 random trees; 1000 replicates) and Shimodaira-Hasegawa tests of congruence of tree topologies were performed using PAUP 4.0b10.

16S rRNA gene sequences

Accession Numbers (AN) of strains are listed in Table 1 and Fig. 1. The sequences used in the partition alignment (3366 bp length) correspond to the 16S rRNA gene (from 1 to 1507 bp), atpD (1508–1962), dnaK (1963–2236), glnA (2237–2886) and recA (2887–3366). AN for atpD, glnA and recA sequences used in the partition are as follows (atpD, recA, glnA): R. galegae USDA4128 (AJ294406, AJ294378, AF169575), M. tianshanense USDA3529 (AJ294393, AJ294368, AF169577), S. terangae HAMBI220 (AJ294403, AJ294383, AF169570), R. tropici USDA9039 type A (AJ294396, AJ294372, AF169569), R. tropici USDA9030 type B (AJ294397, AJ294373, AF169568), Azorhizobium caulinodans ORS571 (AJ294389, AJ294363, Y10213). Sequences were obtained from genome records for Agrobacterium tumefaciens C58 (NC_003304), Sinorhizobium meliloti 1021 (AL591688), Mesorhizobium sp. MAFF303099 (NC_002678), Escherichia coli K12 (NC_000913), R. leguminosarum 3841 (sequences obtained by Blast on genome shotgun available at http://www.sanger.ac.uk/Projects/R.leguminosarum), Rhodopseudomonas palustris (Blast on genome at http://bahama.jgi-psf.org/prod/bin/microbes/rpal/home.rpal.cgi). The alignment of the partition used (16SrDNA+atpD+dnaK+recA+glnA) is available upon request at sttommic@ibch.poznan.pl.

Fig. 1. (A, B). Maximum likelihood phylogenetic trees based upon partial dnaK gene (A) and full 16S rDNA (B) sequences. The ML model used for each tree is described in Materials & Methods. Bootstrap values (only those greater than 50%), shown as percentages of 1000 replicates, are indicated at tree nodes (sampling performed under distance criterion, with trees constructed by NJ). I, II, III, IV, V and VI indicate the clusters we interpreted from the dnaK phylogeny. The bar represents 10% substitutions per site. The accession numbers for dnaK and 16S rDNA sequences are listed in Table 1; remaining sequences are as follows: B. elkanii USDA94 (D13429), Bradyrhizobium sp. ORS285 (AF230722), Bradyrhizobium sp. BtAi1 (AB079633), Mesorhizobium ciceri UPM Ca-7T (U07934), M. huakuii CCBAU2609T (D13431), M. loti NZP2213T (X67229), M. mediterraneum UPM Ca-36T (L38825), M. amorphae (AF041442); S. fredii USDA205T (X67231); S. saheli ORS609T (X68390), S. medicae (L39882); S. kostiense (Z78203), Phyllobacterium rubiacearum (D12790); Mycoplana dimorpha (D12786).

Results and Discussion

dnaK as a single copy phylogenetic marker

For over twenty years, bacterial phylogeny has relied primarily upon the analysis of ribosomal RNA sequences, in particular on the 16S rRNA gene, which has become universally used and widely accepted as a taxonomic molecular marker [27, 65]. However, no single molecular marker will always lead to a phylogeny that is completely consistent with organismal evolution [66]. In the case of the 16S rRNA gene, phylogenetic inference may be blurred by the presence of multiple and in some cases divergent copies. Indeed, except for obligate intracellular bacteria and the majority of Archaea, most bacterial genomes harbour multiple 16S rRNA gene copies. In some species, e.g. Bacillus subtilis, Deinococcus radiodurans, Escherichia coli and Vibrio cholerae, these copies are divergent (not shown). Although the diversity among copies usually has a limited effect on phylogenetic inference, in some cases the inference may be affected. Thus far, the largest diversity has been found in Thermobispora bispora and Thermomonospora chromogena (6.4% divergence between 16S rDNA copies in each genome). It has been postulated that the distinct copy of T. chromogena was acquired from T. bispora [57, 70]. Unlike protein-coding genes, whose duplication usually results in sequence diversification, multiple 16S rRNA gene copies tend to undergo concerted evolution, which results in homogeneity of their sequences [26]. On the other hand, the significance of gene conversion as the cause of 16S rDNA sequence divergence in rhizobia has recently been indicated by van Berkum et al. [51].

Unlike rRNA genes, dnaK is usually a single-copy gene. Among over 100 prokaryotic genomes available, only a few species harbour two or more copies. Borrelia burgdorferi, which harbours two divergent dnaK copies, belongs to this group. Similarly, three dnaK genes have been reported in both cyanobacteria Synechocystis sp. PCC6803 and Synechococcus sp. PCC7942. These divergent copies are differentially expressed, and only two have been found necessary for growth in laboratory conditions [33]. Multiple and divergent dnaK copies are also carried by Anabaena sp. PCC7120, Nostoc punctiforme and Prochlorococcus marinus, indicating a possible general feature in cyanobacteria. Two dnaK genes have also been reported in Verrucomicrobium [58]. In contrast, some Euryarchaeota (Archaeoglobales; Thermococcales) and probably all Crenarchaeota are devoid of dnaK (data not shown).

Most genomes carry genes encoding proteins with some similarity to the DnaK chaperone. Usually, these proteins show little or partial similarity to DnaK sequences; however, in the case of the HscA protein this similarity is large and extends over the entire amino acid sequence. HscA proteins are present in the genomes of gamma-Proteobacteria, and appear to be involved in the assembly of Fe/S proteins [9]. A search of genome databases revealed the lack of HscA proteins in alpha-Proteobacteria (not shown).

PCR amplification of dnaK

PCR reactions yielded a single, 330 bp DNA fragment on the template DNAs from strains belonging to the genera Azorhizobium, Bradyrhizobium, Mesorhizobium, Rhizobium, and Sinorhizobium, and additionally from non-symbiotic bacteria such as Blastochloris sulfoviridis, Chelatobacter heintzii and Chelatococcus asaccharovorans. We found that the primers we designed could be applied to many rhizobia and, additionally, to some species belonging to other groups, especially to Caulobacter spp. Nucleotide sequence identity of dnaK PCR fragments ranged from 64% to 87% between genera of the alpha-Proteobacteria subclass (Table 2). For comparison, 16S rDNA identity between these genera ranges from 88% to 97% [50]. At the genus level, dnaK gene identity was greater than 83%, while at the within-species level it ranged from 90% to 100% (usually higher than 95%) (not shown). It is noteworthy that the 91% dnaK sequence identity between Rhizobium tropici type A (CFN299) and type B (CIAT899) strains corresponds well with the low 35% similarity value obtained in DNA-DNA reassociation assays.

Table 2. Nucleotide sequence identities of 3′ end fragment of dnaK (in %).

Strain            DSM729     USDA76     USDA110    ATCC29600  TE2        USDA3529   CFN299     Rhodops. No7  S. meliloti 1021  Brucella suis 1330
ORS571 (A)        73         73         74         65         76         68         68         70            66                67
DSM729 (Bl)                  78         78         69         70         72         73         76            72                75
USDA76 (B)                              89         67         70         69         70         87            69                70
USDA110 (B)                                        70         70         71         69         82            70                71
ATCC29600 (Chb)                                               71         82         73         65            73                75
TE2 (Chc)                                                                70         71         67            65                67
USDA3529 (M)                                                                        71         67            75                75
CFN299 (R)                                                                                     67            80                73
Rhodo. No7                                                                                                   65                66
1021 (S)                                                                                                                       76

A: Azorhizobium, Bl: Blastochloris, B: Bradyrhizobium, Chb: Chelatobacter, Chc: Chelatococcus, R: Rhizobium, S: Sinorhizobium, M: Mesorhizobium, Rhodo: Rhodopseudomonas.

dnaK base frequencies and data saturation

Base frequencies of all dnaK sequences were calculated and a χ2 homogeneity test across taxa was applied. A high P value (0.9868) in the homogeneity test was obtained only when excluding Ehrlichia, Wolbachia and Rickettsia sequences from the dataset.
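The χ2 homogeneity test of base frequencies mentioned above can be sketched as a contingency-table statistic over taxa and bases. The Python below is an illustrative assumption about the calculation, not the PAUP routine used in the study: the function names are hypothetical, the toy sequences are invented, and only the statistic and its degrees of freedom are computed.

```python
from collections import Counter

BASES = "ACGT"


def base_counts(seq: str) -> list[int]:
    """Count A, C, G and T in one sequence, ignoring gaps and ambiguity codes."""
    counts = Counter(seq.upper())
    return [counts[base] for base in BASES]


def chi_square_homogeneity(sequences: dict[str, str]) -> tuple[float, int]:
    """Chi-square statistic and degrees of freedom for the hypothesis that all
    taxa share the same base composition (taxa x bases contingency table)."""
    table = [base_counts(seq) for seq in sequences.values()]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            if expected > 0:  # bases absent from every taxon contribute nothing
                statistic += (observed - expected) ** 2 / expected

    used_columns = sum(1 for total in col_totals if total > 0)
    dof = (len(table) - 1) * (used_columns - 1)
    return statistic, dof


# Toy data (hypothetical): two GC-rich sequences and one AT-rich sequence.
toy = {
    "taxon_GC_1": "GCGCGCCGGCATGCGC",
    "taxon_GC_2": "CGGCGCGCCGTAGCGC",
    "taxon_AT": "ATATTAATGCATATTA",
}
stat, dof = chi_square_homogeneity(toy)
print(f"chi2 = {stat:.3f} with {dof} degrees of freedom")
```

A P value would follow from the chi-square distribution with the returned degrees of freedom, for example via `scipy.stats.chi2.sf(stat, dof)` if SciPy is available. A taxon with strongly deviating composition inflates the statistic and lowers the P value.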
These three sequences harbour low GC base frequencies (around 42% for Ehrlichia and Wolbachia, 36% for Rickettsia) in comparison with other strains (mean GC at 63%), which could explain their base frequency divergence. As these strains harbour extremely low GC contents, they were removed from DNA phylogenetic analyses. The phylogenies constructed by using only the third codon position, or only the first and the second positions, showed congruent but less resolved trees than when using all three codon positions. Relative substitution rates for each position in the character codon partition were estimated by the ML approach: position one: 0.4373, two: 0.2860, three: 2.0505. These rates show low data saturation for position 3 in the codon. Among 310 characters in the dnaK sequence, 200 were found informative under PAUP heuristic search. All these results indicate a strong phylogenetic signal in the short dnaK sequences studied.

Phylogenetic analyses of dnaK sequences

Phylogenetic analyses of nucleotide and protein dnaK sequences were performed using the several methods and ML models defined in Materials and Methods. The estimated base frequencies were as follows: A = 0.1593; C = 0.3551; G = 0.3571; T = 0.1283. The Ti/Tv ratio was 1.020. The best ML tree obtained (score = 4700.579) is shown in Fig. 1A. We also performed parsimony and neighbour joining (NJ) analyses using different evolutionary models and bootstrapping analyses. All trees were generated without an outgroup and thus they can be considered as unrooted. ML trees were rooted on Escherichia coli K12 for graphical view, while Rhodobacter capsulatus was used for NJ trees. The parsimony trees (not shown) and NJ trees based on amino acid DnaK sequences (shown in Fig. 2A) were identical to the best ML nucleotide tree (shown in Fig. 1A), except for the Brucella clade.
NJ trees based on dnaK nucleotide data (not shown) shared similar topology with the best ML tree obtained previously, but with minor differences concerning the relative position of Sinorhizobium strains in secondary branches, or the position of Caulobacter crescentus within the Azorhizobium caulinodans-Chelatococcus asaccharovorans clade. Additionally, photosynthetic Bradyrhizobium ORS278 formed a common branch with the Os2 and Os6 strains in the ML tree (Fig. 1A) and the NJ amino acid tree (Fig. 2A), while on the NJ nucleotide tree the position of ORS278 was unresolved (not shown). The Brucella sequences fell outside the fast-growing rhizobia group in DNA ML and distance analyses, but inside this group in parsimony analyses. These differences could be due to their relatively low GC content in comparison to other strains, since base composition bias can affect phylogenetic analyses. To test this possibility, NJ phylogenetic analyses of partial (Fig. 2A) and complete (Fig. 2B) DnaK amino acid sequences were performed. On both trees Brucella sequences were placed close to the fast-growing rhizobia, this position being congruent with the taxonomic status of the Brucella genus.

Based on similarity levels and phylogenetic approaches, dnaK sequences can be grouped into 6 distinct branches (we interpreted branches according to the NJ, parsimony and ML analyses, relying on amino acid analyses for primary branches when differences between trees were found): the Sinorhizobium-Rhizobium (including ‘Agrobacterium’) clade (1), the Mesorhizobium-Chelatobacter clade (2), the Brucella clade (3), the Bradyrhizobium-Rhodopseudomonas clade (4), the Blastochloris clade (5), and the Azorhizobium-Chelatococcus-Caulobacter clade (6). These clusters were found by NJ, parsimony and ML approaches.

Comparison of dnaK and 16S rDNA trees

In order to compare phylogenies of dnaK and 16S rRNA genes, we constructed 16S rDNA phylogenies.
All 16S rDNA sequences used were obtained from GenBank and their accession numbers are given in Table 1 and the Fig. 1 legend. 16S rRNA gene trees obtained by neighbour joining (NJ) differed depending on the evolutionary model used (see Materials & Methods for description of the models). In contrast, the best ML tree obtained (shown in Fig. 1B) was identical to the most parsimonious consensus tree (not shown). We thus decided to use the 16S rDNA ML tree for comparison with the dnaK ML phylogeny. Relationships among genera generally agreed with the published phylogenies [14, 28, 59, 60]. Bootstraps (percentage of 1000 replicates, trees constructed by NJ) are shown at the nodes of the ML 16S rDNA tree (Fig. 1B).

The 16S rDNA and dnaK ML trees share very similar topologies, with some exceptions. Chelatococcus is grouped together with Azorhizobium in dnaK trees, but is closer to the Bradyrhizobium group in the 16S rDNA tree (Fig. 1A,B). Moreover, Rhodopseudomonas palustris was closer to Bradyrhizobium elkanii in the dnaK tree, but closer to B. japonicum in the 16S rDNA phylogeny. The dnaK phylogeny does not support the proposition of Young et al. [71] to include Agrobacterium in the Rhizobium genus, since the Rhizobium galegae-Rhizobium radiobacter (Agrobacterium) clade is closer to the Sinorhizobium clade in dnaK trees. The dnaK ML tree also contradicts the recent report of Turner et al. [48] that shows rather distant positions of the M. loti type strain and the MAFF303099 strain on rrn, glnA, glnII and recA trees. Thanks to these analyses the latter strain has been assigned to Mesorhizobium huakuii as M. huakuii bv. loti. However, we did not use the M. loti or M. huakuii type strains in our study. As no taxonomic data are available for the M. loti NZP2037 strain we used, this strain is probably mis-assigned and further studies are needed to confirm its affiliation to M. huakuii.

Fig. 2. (A, B). Comparison of neighbour joining trees of partial (A) (94 amino acids) and complete (B) (641 amino acids) DnaK sequences. Strains common to both trees are indicated in bold in tree A. Bootstrap values (% of 1000 replicates) are indicated at the nodes. I, II, III, IV, V and VI indicate the clusters we interpreted from the dnaK phylogeny. Scale bar indicates number of substitutions per site. Trees were rooted on Rhodobacter capsulatus (U57637) for graphical view.

The topology tests with dnaK and 16S rRNA gene ML trees (data only from strains used in partition homogeneity tests) were performed with an ML approach (Shimodaira-Hasegawa test under PAUP). Despite similar topologies, dnaK and 16S rRNA gene trees were estimated to be statistically different (P < 0.05, ML SSU tree: –ln L = 5969.439; ML dnaK: –ln L = 5985.71) when 16S rDNA sequences were used as input data, but not statistically different when dnaK sequences were used as input data (P = 0.121, ML dnaK: –ln L = 2267.344, ML SSU: –ln L = 2274.45).

As for dnaK NJ trees, some internal nodes between genera in 16S rRNA gene trees were not fully resolved. However, the dnaK phylogeny showed better resolution of internal branches, and linked the Sinorhizobium and Rhizobium genera with significantly higher bootstraps than the 16S rRNA gene phylogeny (Fig. 1). Consistent with the 16S and 23S rRNA gene phylogenies [45], Rhizobium galegae grouped closer to Agrobacterium tumefaciens strain C58 than to other Rhizobium species. Interestingly, in neither phylogeny is the latter clade clearly placed within either the Rhizobium or the Sinorhizobium genus.

Partition homogeneity tests

dnaK and 16S rDNA sequence data from the same bacterial species were combined and analysed using a two-partition test, each partition corresponding to one gene.
Parsimony step-length homogeneity was not supported for the dnaK+16S rDNA test (P = 0.001), indicating that these data must be treated separately in phylogenetic analyses. As previously described by Gaunt et al. [14], the phylogenies of two housekeeping genes, recA and atpD, support the 16S rDNA phylogeny even though parsimony step-length homogeneity was not supported for an atpD+recA+16S rDNA test. However, homogeneity was supported by a two-partition recA+atpD test [14]. Another housekeeping gene, glnA, encoding glutamine synthetase, has been shown to correlate with the 16S rDNA phylogeny of rhizobia [47]. As several sequences of all these housekeeping genes were available, we tested homogeneity in partitions combining dnaK, recA, atpD and glnA sequences of 12 strains (rhizobia and other species, see Materials and Methods for details). Step-length homogeneity was always supported when combining dnaK with any other single housekeeping gene, indicating a similar phylogenetic signal in the genes assessed. Moreover, we tried a four-partition test combining dnaK, recA, atpD and glnA sequences. Parsimony step-length homogeneity was supported by the dnaK+atpD+recA test (P = 0.23), showing that the data from recA, atpD and dnaK can be combined, but not when glnA sequences were added to this partition (P = 0.01). Thus, our results suggest that the 330 bp dnaK sequence contains a phylogenetic signal similar to that of the other molecular markers tested, excluding 16S rDNA sequences. Further analysis with more species will be needed to evaluate whether these data should be combined for phylogenies.

An insertion of two amino acids characterises the genera Rhizobium and Sinorhizobium

Rhizobia belong to the alpha subdivision of Proteobacteria. Phylogenetic analyses, as well as estimation of substitution rates of 16S rRNA gene and GSI sequences, suggest that most (if not all) currently recognised rhizobial genera diverged prior to the emergence of legumes. It seems likely that the most distant genus, Bradyrhizobium, had diverged from the last common ancestor of all rhizobia prior to the emergence of land plants [34, 47]. Although the genera Mesorhizobium, Rhizobium and Sinorhizobium were well resolved in the latter study, their divergence was placed around the same time, some 200 MYA. Our study supports a strong association between the Rhizobium and Sinorhizobium genera (85% bootstrap with large data sets included in Fig. 1). However, the dnaK trees revealed an earlier divergence of the genus Mesorhizobium from the remaining fast-growing rhizobial genera. Although this conclusion is based on phylogenetic analyses, it is also strengthened by the presence of an insertion of two amino acids in the C-terminal DnaK domain in the Rhizobium and Sinorhizobium genera, but not in the related Brucella or Mesorhizobium (Fig. 3). This insertion is composed of glutamic (or aspartic) acid and proline (or alanine), corresponding to residues 580 and 581 in the 638-amino-acid chain of DnaK of Escherichia coli strain K12. This insertion could be regarded as a signature sequence specific for these two fast-growing rhizobial genera.

A secondary structure prediction analysis carried out for all sequences obtained in this study revealed that, in all cases, the amplified fragment might form α-helices corresponding to helices αA, αB, αC, αD and a part of the αE helix of the α-subdomain. The two-amino-acid insertion is located within a turn that separates the αC and αD helices (Fig. 3). This turn links two antiparallel α-helices, which is probably why it accommodates the insertion without distorting the three-dimensional configuration of the substrate-binding domain (not shown). This observation is further supported by the presence of identically located insertions, also composed of two amino acids, in Ehrlichia sp. USG3 (NC, accession number AF029321), as well as in the genus Burkholderia (SA for B. cepacia, L36603; SS for B. pseudomallei, AF016711) of the β-subdivision of Proteobacteria. This suggests that these insertions have been independently acquired by various lineages of the Proteobacteria during evolution.

The divergence of all fast-growing rhizobia has been estimated to date back more than 200 MY [47]. This was well before the emergence of legumes, given that the oldest paleobotanical records of the Leguminosae have been dated at 60 MY. The paleobotanical data are essentially congruent with recent estimates of the divergence time of the Leguminosae based on molecular markers [67]. As rhizobia are polyphyletic and evolved for some 150 MY before acquiring symbiotic properties, it would not be unexpected to find other legume-nodulating bacteria, as well as non-symbiotic strains, with a similar insertion in their dnaK sequences.

The Bradyrhizobium-Rhodopseudomonas branch

This branch consisted of the genus Bradyrhizobium (26 strains) and one Rhodopseudomonas palustris strain. The entire branch could be divided into at least three, and possibly five, main lineages that further split into several smaller clusters. One of the lineages contained a tightly grouped cluster comprising B. japonicum USDA110, several strains from lupine (or other Genisteae), two strains from Cajanus cajan, one from Chamaecytisus proliferus, and one from peanut (Arachis hypogaea L.). We assume that this group corresponds to the species B. japonicum, also referred to as Bradyrhizobium groups I and Ia [50]. These strains originated both from temperate (Europe and the US) and (sub)tropical regions (Brazil, Egypt, Canary Islands, Mexico), consistent with the worldwide distribution of B. japonicum.
The second lineage comprised several Bradyrhizobium isolates, including the B. elkanii type strain USDA76. On this basis, we concluded that these strains correspond to the species B. elkanii [24]. This indicates that the delineation of B. elkanii as a distinct species is fully warranted, even though the 16S rDNA sequence divergence found in this species could be due to gene conversion rather than point mutations [51]. All these strains were collected in (sub)tropical areas, or infected plants that are typical of such regions. The majority of B. elkanii strains had similar or even identical sequences. The only exception was the sequence of USDA3517, which had 91–92% identity with the remaining ones.

Fig. 3. Alignment of deduced amino acid sequences of the C-terminal part of the DnaK protein. Only differences relative to the top sequence (E. coli strain K12) are shown. The positions of the two-amino-acid insert specific for the Rhizobium and Sinorhizobium genera are outlined (see arrow). A consensus sequence is shown at the bottom. The top scale follows the E. coli full-length DnaK protein. The location of the α-helices is shaded. The secondary structure prediction was performed via http://www.embl-heidelberg.de/predictprotein/submit_meta.html. Abbreviations are as follows: -: gap; Sino: Sinorhizobium; Rhizo: Rhizobium; lp/lt: leguminosarum biovar phaseoli/trifolii; Meso: Mesorhizobium; Chelatob: Chelatobacter; Brady: Bradyrhizobium; Rhodo: Rhodopseudomonas; Rhodob: Rhodobacter; Blasto: Blastobacter; Azo: Azorhizobium; Chelatoc: Chelatococcus; Caulo: Caulobacter; Burk: Burkholderia.

One Rhodopseudomonas palustris strain formed the third group, placed as the sister group of the B. elkanii cluster. The fourth cluster comprised ORS278, a photosynthetic strain isolated in Senegal from Aeschynomene sensitiva, together with the Os2 and Os6 strains collected in Japan from the nodules of broom (Sarothamnus sp.).
However, the Os2 and Os6 strains remained unresolved from ORS278 on the NJ tree (not shown). The sequence identity between these five major Bradyrhizobium lineages ranged from 84%, one of the lowest values for strains belonging to the same genus, to 100%. Within the same lineage, sequence identity was over 90%, and usually higher than 95%. Three strains from the B. elkanii group (USDA3259, CCT6205 and CCT6212) had identical nucleotide sequences, as did strains CCT6187 and WM9, and CBP70 and Jan2, in the B. japonicum cluster. Some of these strains, such as CCT6205 and CCT6212, originated from the Cerrado region, whereas the others were collected in geographically remote areas. However, complete identity of nucleotide sequences is not unusual among highly conserved genes. Most of the Bradyrhizobium strains belonged to either B. japonicum or B. elkanii. We did not include Bradyrhizobium liaoningense, the third species described in this genus [68], in this study, but our finding is consistent with other reports showing that these two species are the most widespread lineages in the genus Bradyrhizobium [35, 36, 49, 52, 56].

The genus Bradyrhizobium is characterised by a very high diversity of RAPD patterns, cellular protein profiles and nod gene sequences, which often contrasts with nearly identical 16S rDNA sequences [2, 52, 72]. Nevertheless, Lafay and Burdon [25], upon analysis of 16S rDNA sequences, defined 16 genospecies among the strains isolated in Australia from native shrubby legumes. Likewise, 34 AFLP clusters have been obtained among Bradyrhizobium strains collected in Africa from Aeschynomene spp. and Faidherbia albida. Later, these strains were grouped into 11 genospecies (I–XI) according to DNA-DNA reassociation values and similarity of 16S–23S rDNA (ITS) spacer sequences [62, 63, 64].

The strains isolated from Aeschynomene plants deserve special attention. These strains comprise a large and heterogeneous group composed of several clusters.
Some of these clusters are classified as B. japonicum and B. elkanii; however, two clearly distinct clusters, referred to as ARDRA groups A (or genospecies VI and VIII) and D, comprise strains nodulating only Aeschynomene spp. [8, 31]. These narrow-host-range bradyrhizobia form stem and root nodules on Aeschynomene plants and possess a photosynthetic ability that is unique among rhizobia. Due to their phenetic and genetic distinctness, the ARDRA group A strains may represent a separate species [8, 31, 62]. Thus, our finding that dnaK of strain ORS278, which belongs to ARDRA group A, occupies a separate position on the tree is entirely consistent with these reports, and also with the 16S rRNA gene phylogeny [4]. The dnaK sequence of ORS278 is most similar to sequences obtained from two Bradyrhizobium strains isolated in Japan from Sarothamnus plants. However, these two sequences differ from that of ORS278 at 18 positions, of which two are non-synonymous replacements (Fig. 3), and therefore further work is needed to determine to which genospecies (VI or VIII) these two strains belong [39, 40].

Conclusion

We found that phylogenetic analyses of the variable 330 bp dnaK region are essentially congruent with the 16S rRNA gene classification of rhizobia and related species, and are even better resolved in some cases. The phylogenetic information contained in this gene fragment was compatible with that from other housekeeping genes, and the fragment could therefore be used as an alternative taxonomic marker when 16S rDNA analysis is limited by low sequence divergence. Thus, the dnaK sequence fragment could become a commonly used marker because of its short length for sequencing and its frequent representation in databanks.

Acknowledgements

We are greatly indebted to Prof. Andrzej B. Legocki for his help throughout the entire work. We are also grateful to Dr J. Peter W.
Young for his comments, as well as to Drs Wanda Małek, Valeria de Oliveira, Peter van Berkum, Desta Beyene, Esperanza Martinez, Thomas Egli, Akira Hiraishi, Pablo Vinuesa and S. Raza for bacterial strains used in this study. We also thank Dr Zbigniew Michalski for his help during the preparation of this manuscript. Preliminary sequence data were obtained from the Joint Genome Institute (JGI) at http://www.jgi.doe.gov/tempweb/JGI_microbial/html/index.html. Computer work was partially carried out in co-operation with the Poznań Supercomputing and Networking Center. This work was financed by grant 6P04C 070 11 (TS) from the Polish Science Ministry (KBN), and partially by a minigrant (MC) awarded by Sieć Biologii Komórki UNESCO.

References

1. Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A. and Struhl, K.: Current protocols in molecular biology. (Wiley, J. S.) New York, 1988.
2. Barrera, L. L., Trujillo, M. E., Goodfellow, M., Garcia, F. J., Hernandez-Lucas, I., Davila, G., van Berkum, P. and Martinez-Romero, E.: Biodiversity of bradyrhizobia nodulating Lupinus spp. Int. J. Syst. Bacteriol. 47, 1086–1091 (1997).
3. Bukau, B., Horwich, A. L.: The Hsp70 and Hsp60 chaperone machines. Cell 92, 351–366 (1998).
4. Chaintreuil, C., Giraud, E., Prin, Y., Lorquin, J., Ba, A., Gillis, M., de Lajudie, P., Dreyfus, B.: Photosynthetic bradyrhizobia are natural endophytes of the African wild rice Oryza breviligulata. Appl. Environ. Microbiol. 66, 5437–5447 (2000).
5. Chen, W. M., Laevens, S., Lee, T. M., Coenye, T., De Vos, P., Mergeay, M., Vandamme, P.: Ralstonia taiwanensis sp. nov., isolated from root nodules of Mimosa species and sputum of a cystic fibrosis patient. Int. J. Syst. Evol. Microbiol. 51, 1729–1735 (2001).
6. De Lajudie, P., Laurent-Fulele, E., Willems, A., Torck, U., Coopman, R., Collins, M. D., Kersters, K., Dreyfus, B., Gillis, M.: Allorhizobium undicola gen. nov., sp.
nov., nitrogen-fixing bacteria that efficiently nodulate Neptunia natans in Senegal. Int. J. Syst. Bacteriol. 48, 1277–1290 (1998).
7. De Lajudie, P., Willems, A., Nick, G., Moreira, F., Molouba, F., Hoste, B., Torck, U., Neyra, M., Collins, M. T., Lindstrom, K., Dreyfus, B., Gillis, M.: Characterization of tropical tree rhizobia and description of Mesorhizobium plurifarium sp. nov. Int. J. Syst. Bacteriol. 48, 369–382 (1998).
8. Doignon-Bourcier, F., Sy, A., Willems, A., Torck, U., Dreyfus, B., Gillis, M., de Lajudie, P.: Diversity of bradyrhizobia from 27 tropical Leguminosae species native of Senegal. Syst. Appl. Microbiol. 22, 647–661 (1999).
9. Dougan, D. A., Mogk, A., Bukau, B.: Protein folding and degradation in bacteria: to degrade or not to degrade? That is the question. Cell. Mol. Life Sci. 59, 1607–1616 (2002).
10. Downie, J.: Functions of rhizobial nodulation genes, pp. 387–402. In: The Rhizobiaceae. (Spaink, H. P., Kondorosi, A. and Hooykaas, P. J. J.) Dordrecht, Kluwer academics 1998.
11. Dreyfus, B., Garcia, J. L., Gillis, M.: Characterization of Azorhizobium caulinodans gen. nov., sp. nov., a stem nodulating nitrogen fixing bacterium isolated from Sesbania rostrata. Int. J. Syst. Bacteriol. 38, 89–98 (1988).
12. Eisen, J. A.: The RecA protein as a model molecule for molecular systematics studies of bacteria: comparison of trees of RecA’s and 16S rRNA’s from the same species. J. Mol. Evol. 41, 1105–1123 (1995).
13. Felsenstein, J.: Distance methods for inferring phylogenies: a justification. Evolution 38, 16–24 (1984).
14. Gaunt, M. W., Turner, S. L., Rigottier-Gois, L., Lloyd-Macgilp, S. A., Young, J. P. W.: Phylogenies of atpD and recA support the small subunit rRNA-based classification of rhizobia. Int. J. Syst. Evol. Microbiol. 51, 2037–2048 (2001).
15. Gillette, W. K., Elkan, G. H.: Bradyrhizobium (Arachis) sp.
strain NC92 contains two nodD genes involved in the repression of nodA and a nolA gene required for the efficient nodulation of host plants. J. Bacteriol. 178, 2757–2766 (1996).
16. Gribaldo, S., Lumia, V., Creti, R., de Macario, E. C., Sanangelantoni, A., Cammarano, P.: Discontinuous occurrence of the hsp70 (dnaK) gene among Archaea and sequence features of HSP70 suggest a novel outlook on phylogenies inferred from this protein. J. Bacteriol. 181, 434–443 (1999).
17. Gupta, R. S., Golding, G. B.: Evolution of HSP70 gene and its implications regarding relationships between archaebacteria, eubacteria, and eukaryotes. J. Mol. Evol. 37, 573–582 (1993).
18. Gupta, R. S.: Protein phylogenies and signature sequences: A reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol. Mol. Biol. Rev. 62, 1435–1491 (1998).
19. Hiraishi, A.: Transfer of the bacteriochlorophyll b-containing phototrophic bacteria Rhodopseudomonas viridis and Rhodopseudomonas sulfoviridis to the genus Blastochloris gen. nov. Int. J. Syst. Bacteriol. 47, 217–219 (1997).
20. Jarvis, B. D., Van Berkum, P., Chen, W. X., Nour, S. M., Fernandez, M. P., Cleyet-Marel, J. C., Gillis, M.: Transfer of Rhizobium loti, Rhizobium huakuii, Rhizobium ciceri, Rhizobium mediterraneum, and Rhizobium tianshanense to Mesorhizobium gen. nov. Int. J. Syst. Bacteriol. 47, 895–898 (1997).
21. Jordan, J. C.: Transfer of Rhizobium japonicum Buchanan 1980 to Bradyrhizobium gen. nov., a genus of slow growing root nodule bacteria from leguminous plants. Int. J. Syst. Bacteriol. 32, 136–139 (1982).
22. Jukes, T. H., Cantor, C. R.: Evolution of protein molecules, pp. 21–132. In: Mammalian protein metabolism. (Munro, H. N.) New York, Academic Press 1969.
23. Kimura, M.: A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120 (1980).
24. Kuykendall, L. M., Saxena, B., Devine, T. E., Udell, S.
E.: Genetic diversity in Bradyrhizobium japonicum Jordan 1982 and a proposal for Bradyrhizobium elkanii sp. nov. Can. J. Microbiol. 38, 501–503 (1992).
25. Lafay, B., Burdon, J. J.: Molecular diversity of rhizobia occurring on native shrubby legumes in southeastern Australia. Appl. Environ. Microbiol. 64, 3989–3997 (1998).
26. Liao, D.: Gene conversion drives within genic sequences: concerted evolution of ribosomal RNA genes in bacteria and archaea. J. Mol. Evol. 51, 305–317 (2000).
27. Maidak, B. L., Cole, J. R., Lilburn, T. G., Parker, C. T., Jr., Saxman, P. R., Farris, R. J., Garrity, G. M., Olsen, G. J., Schmidt, T. M. and Tiedje, J. M.: The RDP-II (Ribosomal Database Project). Nucleic Acids Res. 29, 173–174 (2001).
28. Maidak, B. L., Olsen, G. J., Larsen, N., Overbeek, R., McCaughey, M. J. and Woese, C. R.: The RDP (Ribosomal Database Project). Nucleic Acids Res. 25, 109–110 (1997).
29. Mogk, A., Bukau, B., Lutz, R., Schumann, W.: Construction and analysis of hybrid Escherichia coli-Bacillus subtilis dnaK genes. J. Bacteriol. 181, 1971–1974 (1999).
30. Mollet, C., Drancourt, M., Raoult, D.: Determination of Coxiella burnetii rpoB sequence and its use for phylogenetic analysis. Gene 207, 97–103 (1998).
31. Molouba, F., Lorquin, J., Willems, A., Hoste, B., Giraud, E., Dreyfus, B., Gillis, M., De Lajudie, P., Masson-Boivin, C.: Photosynthetic bradyrhizobia from Aeschynomene spp. are specific to stem-nodulated species and form a separate 16S ribosomal DNA restriction fragment length polymorphism group. Appl. Environ. Microbiol. 65, 3084–3094 (1999).
32. Moulin, L., Munive, A., Dreyfus, B., Boivin-Masson, C.: Nodulation of legumes by members of the Beta-subclass of Proteobacteria. Nature 411, 948–950 (2001).
33. Nimura, K., Takahashi, H., Yoshikawa, H.: Characterization of the dnaK multigene family in the Cyanobacterium Synechococcus sp. strain PCC7942. J. Bacteriol. 183, 1320–1328 (2001).
34. Ochman, H., Wilson, A.
C.: Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74–86 (1987).
35. Parker, M. A., Lunk, A.: Relationships of bradyrhizobia from Platypodium and Machaerium (Papilionoideae: tribe Dalbergieae) on Barro Colorado Island, Panama. Int. J. Syst. Evol. Microbiol. 50, 1179–1186 (2000).
36. Parker, M. A.: Relationships of bradyrhizobia from the legumes Apios americana and Desmodium glutinosum. Appl. Environ. Microbiol. 65, 4914–4920 (1999).
37. Rivas, R., Velazquez, E., Willems, A., Vizcaino, N., Subba-Rao, N. S., Mateos, P. F., Gillis, M., Dazzo, F. B., Martinez-Molina, E.: A new species of Devosia that forms a unique nitrogen-fixing root-nodule symbiosis with the aquatic legume Neptunia natans (L.f.) Druce. Appl. Environ. Microbiol. 68, 5217–5222 (2002).
38. Saito, A., Mitsui, H., Hattori, R., Minamisawa, K., Hattori, T.: Slow-growing and oligotrophic soil bacteria phylogenetically close to Bradyrhizobium japonicum. Microb. Ecol. 25, 277–286 (1998).
39. Sajnaga, E., Małek, W.: Numerical taxonomy of Sarothamnus scoparius rhizobia. Curr. Microbiol. 42, 26–31 (2001).
40. Sajnaga, E., Małek, W., Lotocka, B., Stępkowski, T., Legocki, A.: The root-nodule symbiosis between Sarothamnus scoparius L. and its microsymbionts. Antonie Van Leeuwenhoek 79, 385–391 (2001).
41. Suh, W. C., Burkholder, W. F., Lu, C. Z., Zhao, X., Gottesman, M. E., Gross, C. A.: Interaction of the Hsp70 molecular chaperone, DnaK, with its cochaperone DnaJ. Proc. Natl. Acad. Sci. USA 95, 15223–15228 (1998).
42. Swofford, D. L.: PAUP. Phylogenetic Analysis Using Parsimony (and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts. 1998.
43. Sy, A., Giraud, E., Jourand, P., Garcia, N., Willems, A., de Lajudie, P., Prin, Y., Neyra, M., Gillis, M., Boivin-Masson, C., Dreyfus, B.: Methylotrophic Methylobacterium bacteria nodulate and fix nitrogen in symbiosis with legumes. J. Bacteriol. 183, 214–220 (2001).
44. Tan, Z. Y., Xu, X.
D., Wang, E. T., Gao, J. L., Martinez-Romero, E., Chen, W. X.: Phylogenetic and genetic relationships of Mesorhizobium tianshanense and related rhizobia. Int. J. Syst. Bacteriol. 47, 874–879 (1997).
45. Terefework, Z., Nick, G., Suomalainen, S., Paulin, L., Lindstrom, K.: Phylogeny of Rhizobium galegae with respect to other rhizobia and agrobacteria. Int. J. Syst. Bacteriol. 48, 349–356 (1998).
46. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., Higgins, D. G.: The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882 (1997).
47. Turner, S. L. and Young, J. P. W.: The glutamine synthetase of Rhizobia: Phylogenetics and evolutionary implications. Mol. Biol. Evol. 17, 309–319 (2000).
48. Turner, S. L., Zhang, X. X., Li, F. D., Young, J. P.: What does a bacterial genome sequence represent? Mis-assignment of MAFF303099 to the genospecies Mesorhizobium loti. Microbiology 148, 3330–3331 (2002).
49. Urtz, B. E., Elkan, G. H.: Genetic diversity among Bradyrhizobium isolates that effectively nodulate peanut (Arachis hypogaea). Can. J. Microbiol. 42, 1121–1130 (1996).
50. van Berkum, P., Eardly, B. D.: Molecular and evolutionary systematics of the Rhizobiaceae, pp. 1–24. In: The Rhizobiaceae. (Spaink, H. P., Kondorosi, A. and Hooykaas, P. J. J.) Dordrecht, Kluwer academics 1998.
51. van Berkum, P., Terefework, Z., Paulin, L., Suomalainen, S., Lindstrom, K., Eardly, B. D.: Discordant phylogenies within the rrn loci of Rhizobia. J. Bacteriol. 185, 2988–2998 (2003).
52. Van Rossum, D., Schuurmans, F. P., Gillis, M., Muyotcha, A., van Verseveld, H. W., Stouthamer, A. H., Bogerd, F. C.: Genetic and phenetic analyses of Bradyrhizobium strains nodulating peanut (Arachis hypogea L.) roots. Appl. Environ. Microbiol. 61, 1599–1609 (1995).
53. Vandamme, P., Pot, B., Gillis, M., de Vos, P., Kersters, K.,
Swings, J.: Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol. Rev. 60, 407–438 (1996).
54. Viale, A. M., Arakaki, A. K., Soncini, F. C., Ferreyra, R. G.: Evolutionary relationships among eubacterial groups as inferred from GroEL (chaperonin) sequence comparisons. Int. J. Syst. Bacteriol. 44, 527–533 (1994).
55. Vincent, J. M.: A manual for the practical study of root-nodule bacteria. (IBP Handbook) Blackwell Scientific Publications, Ltd., Oxford. 1970.
56. Vinuesa, P., Rademaker, J. L., de Bruijn, F. J., Werner, D.: Genotypic characterization of Bradyrhizobium strains nodulating endemic woody legumes of the Canary Islands by PCR-restriction fragment length polymorphism analysis of genes encoding 16S rRNA (16S rDNA) and 16S–23S rDNA intergenic spacers, repetitive extragenic palindromic PCR genomic fingerprinting, and partial 16S rDNA sequencing. Appl. Environ. Microbiol. 64, 2096–2104 (1998).
57. Wang, Y., Zhang, Z., Ramanan, N.: The actinomycete Thermobispora bispora contains two distinct types of transcriptionally active 16S rRNA genes. J. Bacteriol. 179, 3270–3276 (1997).
58. Ward-Rainey, N., Rainey, F. A., Stackebrandt, E.: The presence of a dnaK (HSP70) multigene family in members of the orders Planctomycetales and Verrucomicrobiales. J. Bacteriol. 179, 6360–6366 (1997).
59. Wernegreen, J. J., Riley, M. A.: Comparison of the evolutionary dynamics of symbiotic and housekeeping loci: A case for the genetic coherence of rhizobial lineages. Mol. Biol. Evol. 16, 98–113 (1999).
60. Willems, A., Collins, M. D.: Phylogenetic analysis of rhizobia and agrobacteria based on 16S rRNA gene sequences. Int. J. Syst. Bacteriol. 43, 305–313 (1993).
61. Willems, A., Coopman, R., Gillis, M.: Phylogenetic and DNA-DNA hybridization analyses of Bradyrhizobium species. Int. J. Syst. Evol. Microbiol. 51, 111–117 (2001a).
62. Willems, A., Coopman, R., Gillis, M.: Comparison of sequence analysis of 16S–23S rDNA spacer regions, AFLP analysis and DNA-DNA hybridizations in Bradyrhizobium.
Int. J. Syst. Evol. Microbiol. 51, 623–632 (2001b).
63. Willems, A., Doignon-Bourcier, F., Coopman, R., Hoste, B., de Lajudie, P., Gillis, M.: AFLP fingerprint analysis of Bradyrhizobium strains isolated from Faidherbia albida and Aeschynomene species. Syst. Appl. Microbiol. 23, 137–147 (2000).
64. Willems, A., Doignon-Bourcier, F., Goris, J., Coopman, R., de Lajudie, P., De Vos, P., Gillis, M.: DNA-DNA hybridization study of Bradyrhizobium strains. Int. J. Syst. Evol. Microbiol. 51, 1315–1322 (2001c).
65. Woese, C. R.: Bacterial evolution. Microbiol. Rev. 51, 221–271 (1987).
66. Woese, C. R.: Interpreting the universal phylogenetic tree. Proc. Natl. Acad. Sci. USA 97, 8392–8396 (2000).
67. Wojciechowski, M. F.: Advances in Legume Systematics, part 10, Higher level systematics, In: Advances in Legume Systematics, part 10, Higher level systematics. (Klitgaard, B. and Bruneau, A.) Kew, The Royal Botanic Gardens 2003 (in press).
68. Xu, L. M., Ge, C., Cui, Z., Li, J. L., Fan, H.: Bradyrhizobium liaoningensis sp. nov. isolated from the root nodules of soybean. Int. J. Syst. Bacteriol. 45, 706–711 (1995).
69. Yanagi, M., Yamasato, K.: Phylogenetic analysis of the family Rhizobiaceae and related bacteria by sequencing of 16S rRNA gene using PCR and DNA sequencer. FEMS Microbiol. Lett. 107, 115–120 (1993).
70. Yap, W. H., Zhang, Z., Wang, Y.: Distinct types of rRNA operons exist in the genome of the actinomycete Thermomonospora chromogena and evidence for horizontal transfer of an entire rRNA operon. J. Bacteriol. 181, 5201–5209 (1999).
71. Young, J. M., Kuykendall, L. D., Martinez-Romero, E., Kerr, A., Sawada, H.: A revision of Rhizobium Frank 1889, with an emended description of the genus, and the inclusion of all species of Agrobacterium Conn 1942 and Allorhizobium undicola de Lajudie et al. 1998 as new combinations: Rhizobium radiobacter, R. rhizogenes, R. rubi, R. undicola and R. vitis. Int. J. Syst. Evol. Microbiol. 51, 89–103 (2001).
72.
Zhang, X., Nick, G., Kaijalainen, S., Terefework, Z., Paulin, L., Tighe, S. W., Graham, P. H., Lindstrom, K.: Phylogeny and diversity of Bradyrhizobium strains isolated from the root nodules of peanut (Arachis hypogaea) in Sichuan, China. Syst. Appl. Microbiol. 22, 378–386 (1999).
73. Zhu, X., Zhao, X., Burkholder, W. F., Gragerov, A., Ogata, C. M., Gottesman, M. E., Hendrickson, W. A.: Structural analysis of substrate binding by the molecular chaperone DnaK. Science 272, 1606–1614 (1996).

Corresponding author: Tomasz Stępkowski, Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznań, Noskowskiego 12/14, Poland. Tel.: ++48 61 852 8503; Fax: ++48 61 852 0532; e-mail: sttommic@ibch.poznan.pl