HOMD 16S rRNA Gene Reference Sequence Version History
2023-04-10
Version 15.23 Release Notes
16S rRNA RefSeq Version 15.23 changed the taxonomy of each sequence based on the latest HOMD Taxonomy database.
The latest changes to the taxonomy nomenclature were based on this Excel spreadsheet dated 2022-01-06.
2020-02-09
Version 15.22 Release Notes
This is a minor update of the sequence names based on the current HOMD taxonomy. No sequence changes in this minor version update.
2020-08-29
Version 15.21 Release Notes
This is a minor typo correction for those genus names containing special characters such as "()" or "[]"; the previous version had the special-character portion of the text repeated in the genus name of the sequence. These errors have now been corrected. No sequence changes in this minor version update.
2019-09-02
Version 15.2 Release Notes
A total of 17 full-length 16S rRNA gene sequences were added: [These sequences have the PROKKA: 'XXXX_XXXX' ID instead of GB: 'XXXXXXX' in the Fasta header line.]
1. 81526711 HMT-815 Methanobrevibacter oralis Strain: JMR01
2. 81531021 HMT-815 Methanobrevibacter oralis Strain: DSM 7256
3. 81531031 HMT-815 Methanobrevibacter oralis Strain: M2 CSUR P5920
4. 95731001 HMT-957 Saccharibacteria (TM7) [G-1] bacterium HMT 957 Strain: BB001
5. 98227191 HMT-982 Ignavibacterium album Strain: JCM 16511
6. 98327201 HMT-983 Melioribacter roseus Strain: P3M-2
7. 98427181 HMT-984 Chlorobium limicola Strain: DSM 245
8. 98427182 HMT-984 Chlorobium limicola Strain: DSM 245
9. 98627231 HMT-986 Saccharibacteria (TM7) [G-7] bacterium HMT 986 Strain: JGI 0001002-L20
10. 98727221 HMT-987 Saccharibacteria (TM7) [G-7] bacterium HMT 987 Strain: GTL1
11. 98827211 HMT-988 Candidatus Saccharimonas aalborgensis
12. 98927371 HMT-989 Saccharibacteria (TM7) [G-7] bacterium HMT 989 Strain: RAAC3_TM7_1
13. 99427171 HMT-994 Caldilinea aerophila Strain: DSM 14535 = NBRC 104270
14. 99427172 HMT-994 Caldilinea aerophila Strain: DSM 14535 = NBRC 104270
15. 99527151 HMT-995 Anaerolinea thermophila Strain: UNI-1
16. 99527152 HMT-995 Anaerolinea thermophila Strain: UNI-1
17. 99627241 HMT-996 Gracilibacteria (GN02) [G-3] bacterium HMT 996 Strain: JGI 0000069-P22
They represent 12 taxa which previously did not have any reference sequence in V15.1. These sequences were identified from the genomic sequences based on the PROKKA annotation. Hence they do not yet have a GenBank accession number in NCBI. For the first time, there are 3 16S rRNA gene sequences representing the Archaea domain (HMT-815 Methanobrevibacter oralis). HMT-957 is a newly cultured but unnamed Saccharibacteria (TM7) species. Sequences of other taxa (HMT-984 - HMT-996) are the 16S rRNA genes for 13 of the 14 Non-oral/Non-nasal taxa that were added for anchoring purposes.
Altogether there are 1,015 16S rRNA gene sequences in this version.
Sequences in several formats as well as the taxonomy information can be downloaded from this link.
We also provide the BLASTN tool for searching sequences provided by the users against this latest version of the 16S rRNA RefSeq.
2018-01-29
Version 15.1 Release Notes
HOMD 16S rRNA RefSeq Version 15.1 is a major update and an expansion of the microbial taxa to include species identified in human sinonasal cavities.
The expanded version of the HOMD (eHOMD) contains a total of 772 human oral/nasal taxa, of which 693 are oral, 89 nasal (10 are both oral and nasal taxa).
This version of the 16S rRNA RefSeq contains a total of 998 full-length 16S rDNA sequences representing 769 taxa (sequences of 3 taxa are not yet available).
The reference sequences are available for search with the "Identity 16S rRNA Sequence" BLAST tool, and are also available for download in the following formats:
HOMD_16S_rRNA_RefSeq_V15.1.fasta - unaligned sequences starting from position 28
HOMD_16S_rRNA_RefSeq_V15.1.p9.fasta - unaligned sequences starting from position 9
HOMD_16S_rRNA_RefSeq_V15.1.aligned.fasta - aligned sequences starting from position 9
The taxonomy information is provided in the following two formats:
HOMD_16S_rRNA_RefSeq_V15.1.qiime.taxonomy - QIIME taxonomy format to be included in the QIIME pipeline
HOMD_16S_rRNA_RefSeq_V15.1.mothur.taxonomy - MOTHUR format for use with the MOTHUR package
A phylogenetic tree of all the sequences is also available for viewing and download in both Newick and SVG (scalable vector graphics) formats:
HOMD_16S_rRNA_RefSeq_V15.1.tre
HOMD_16S_rRNA_RefSeq_V15.1.svg
2017-01-03
Version 14.51 Release Notes
Version 14.51 is a minor update of version 14.5 with only naming modifications for two taxa.
1. HOT-279 has been formally named Porphyromonas pasteri. This taxon was previously the unnamed Porphyromonas sp. oral taxon 279. A total of 6 reference sequences representing this taxon were affected. Their IDs are 279CW034, 279DP023, 279F450a, 279F450b, 279F450c, and 279F450d.
2. HOT-659 was renamed from Rhizobium loti to Mesorhizobium loti. Only one sequence (ID: 659_0166) was affected.
There is no modification of the sequences.
2016-03-29
Version 14.5 Release Notes
Version 14.5 is a major update (versions 14.1 to 14.4 were used internally and not publicly released).
The revisions are not just to 16S rRNA Reference Sequences, but to HOMD and its provisional taxonomic structure.
The accompanying Excel file (click to download) details the many added and deleted taxa and changes in status (Named, Unnamed, Phylotype). Twenty-seven previously overlooked or newly named oral taxa were added. Thirty taxa were deleted because their sequences were chimeric or damaged. In most cases, the deleted chimeras were derived from crossover of sequences within a genus. Fourteen non-oral taxa were added so that there would be reference genomes in phyla with no or few oral representatives (Chlorobi, Chloroflexi, GN02, TM7, SR1, WPS-2). Genomes are critical for metagenomic, transcriptomic and proteomic studies, where only sequences that can be mapped back to reference genomes are seen.
In this update, reference sequences were compared to all those in GenBank for the same taxa. Reference sequences that had several mismatches to other sequences in a taxon were replaced by a higher quality sequence that was representative of the taxon. When sequences within a taxon fell into multiple subgroups, RefSeqs representing divergent subgroups were added.
Some changes to HOMD taxonomy are not included in the Excel file, such as changes in several class names which now end in "ia" (previously phylum and class names were the same for many taxa, but now we have Spirochaetes and Spirochaetia). A number of taxa which were previously uncultured Phylotypes have been cultured and are now designated as Unnamed. A number of Unnamed taxa have been named and their status has been changed to Named.
The headers for the RefSeqs contain the following information: File identifier; Name; HOT-ID; Clone or Strain #; GenBank accession #; Status (Named/Unnamed/Phylotype/Lost); Genome status (G = yes, X = no); log +1 of sequences seen in a study of 27 subjects at 9 oral sites (14 million sequences).
Previous versions:
HOMD provides two different sets of 16S rRNA Gene Reference Sequences (RefSeq) for download and BLAST search:
1. HOMD 16S rRNA RefSeq: This set contains sequences representing all currently named and unnamed oral taxa.
2. HOMD 16S rRNA Extended RefSeq: This set contains additional 16S rRNA reference gene sequences that are distinctively different from existing taxa but have not yet been assigned a taxon ID.
These sequences are corrected consensus sequences. Many have been corrected and extended based on alignment with other sequences for that taxon, with Ns and indels removed. Therefore, for many sequences, there will be differences between the Reference Sequence and the GenBank sequence listed in the header information. We have not yet updated our own GenBank sequences, and cannot update those from other depositors. We believe these are currently the best reference sequences available, and for the purposes of BLAST analysis, they have the advantage of being of a uniform length.
2013-05-08:
Version 13.2 Release Notes
Version 13.2 is a minor correction of 13.0 due to two duplicated reference sequences.
Detailed changes are shown in the Excel file HOMD version 13.2 [Download the Excel File Here].
We have also added an option to search against reference sequences that include the forward (5') primer sequences:
1. HOMD 16S rRNA RefSeq Version 13.2: This is the default version selected for search. The sequences in this set start at 16S rRNA position 28 (thus without the forward primer sequences).
2. HOMD 16S rRNA RefSeq Version 13.2 (Pos 9): The sequences in this set start at position 9 and thus include the forward primer sequences.
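The relationship between the two sets can be sketched in a few lines of Python. This is only an illustration under a stated assumption: that each sequence carries exactly one base per numbered position between 9 and 28 (no insertions or deletions in that stretch), so the position-28 form is the position-9 form with its first 19 bases removed. The function name and the test sequence are hypothetical and are not part of HOMD.

```python
# Assumption (not confirmed by the release notes): one base per numbered
# position between 9 and 28, so the two sets differ by a fixed offset.
POS9_START = 9
POS28_START = 28


def trim_pos9_to_pos28(sequence: str) -> str:
    """Drop the leading bases so a position-9 sequence starts at position 28."""
    offset = POS28_START - POS9_START  # 19 bases covering the primer region
    if len(sequence) <= offset:
        raise ValueError("sequence is too short to contain position 28")
    return sequence[offset:]


# Synthetic placeholder data, not a real HOMD sequence.
primer_region = "N" * 19
body = "GATTACA"
print(trim_pos9_to_pos28(primer_region + body))
```

For real data the downloadable position-28 file should be preferred over trimming, since the fixed-offset assumption may not hold for every sequence.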
In addition, we are also providing the following additional files:
1. Aligned sequences arranged in phylogenetic order. [Download].
2. FASTA sequences in mothur format. [Download].
3. Taxonomy file required by mothur. [Download].
All of these files can also be downloaded from the web download page or HOMD FTP site.
2013-03-25:
Version 13.0 Release Notes
Cumulative additions and corrections to the HOMD Taxon Table and Taxon Description pages have been made since the last update.
These changes are shown in the Excel file HOMD version 13.0 [Download the Excel File Here].
The default 16S rRNA reference set has been updated to version 13.0. The Extended set, which includes the provisional taxa A00-H067, has not yet been updated and therefore should be used with caution.
2013-01-16:
Version 12.0 Release Notes
Cumulative additions and corrections to the HOMD Taxon Table and Taxon Description pages have been made since the last update.
These changes are shown in the Excel file HOMD version 12.0 [Download the Excel File Here].
The default 16S rRNA reference set has been updated to version 12.0. The Extended set, which includes the provisional taxa A00-H067, has not yet been updated and therefore should be used with caution.
2011-02-16:
Version 11.0 Release Notes
Over the past year, several additions and corrections to the HOMD Taxon Table and Taxon Description pages have been made.
These changes are shown in the Excel file HOMD version 11.0.
In addition, the default 16S rRNA reference set has been updated to version 11.0. The Extended set, which includes the provisional taxa A00-H067, has not yet been updated and therefore should be used with caution.
2010-02-17:
Version 10.1
Corrected the following two sequence headers from
>357_8615| Synergistetes [G-2] sp. | Oral Taxon 357 | Clone C2ALM009 | AY278615 | 21 | N
>357W5455| Synergistetes [G-2] sp. | Oral Taxon 357 | Strain W5455 | EU309492 | 21 | N
to:
>357_8615| Pyramidobacter piscolens | Oral Taxon 357 | Clone C2ALM009 | AY278615 | 21 | N
>357W5455| Pyramidobacter piscolens | Oral Taxon 357 | Strain W5455 | EU309492 | 21 | N
2010-02-08:
Version 10.1: Minor modification of HOMD 16S rRNA RefSeq version 10
Contains 755 reference sequences.
1. Change of sequence header format (first line of the FASTA sequence) to become:
>Sequence ID| Species Name | Oral Taxon Number | Clone Name | Genbank Access Number | Number of Clones Identified | N/P/U
where N: named species; P: phylotype; U: unnamed species
For example:
>524_3631| Veillonella atypica | Oral Taxon 524 | Clone MB5_P17 | DQ003631 | 208 | N
2. Two additional reference sequences were added:
>357W5455| Synergistetes [G-2] sp. | Oral Taxon 357 | Strain W5455 | EU309492 | 21 | N
>678_4915| Solobacterium moorei | Oral Taxon 678 | Strain AHP 13983 | AY044915 | 20 | N
3. Changes of two sequence IDs:
660_2378 -> 660_5312
655_8120 -> 655_9120
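The Version 10.1 header format described in item 1 can be split into its seven pipe-separated fields with a short Python function. This is a minimal sketch rather than an official HOMD parser: the class name, field names, and function name are hypothetical, and it assumes that no field itself contains a "|" character.

```python
from typing import NamedTuple

# Status codes as defined in the header format description.
STATUS_LABELS = {"N": "named species", "P": "phylotype", "U": "unnamed species"}


class RefSeqHeader(NamedTuple):
    # Field names are illustrative; the release notes only give their order.
    sequence_id: str
    species_name: str
    oral_taxon: int
    clone_name: str
    genbank_accession: str
    clone_count: int
    status: str


def parse_refseq_header(line: str) -> RefSeqHeader:
    """Parse one '>ID| Name | Oral Taxon N | Clone | Accession | Count | N/P/U' line."""
    fields = [field.strip() for field in line.strip().lstrip(">").split("|")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, found {len(fields)}")
    seq_id, species, taxon, clone, accession, count, status = fields
    if status not in STATUS_LABELS:
        raise ValueError(f"unknown status code: {status!r}")
    # "Oral Taxon 524" -> 524
    taxon_number = int(taxon.rsplit(" ", 1)[-1])
    return RefSeqHeader(seq_id, species, taxon_number, clone,
                        accession, int(count), status)


example = parse_refseq_header(
    ">524_3631| Veillonella atypica | Oral Taxon 524 | Clone MB5_P17 | DQ003631 | 208 | N"
)
print(example.species_name, example.oral_taxon, STATUS_LABELS[example.status])
```

The clone or strain field is kept as written (for example "Clone MB5_P17" or "Strain W5455"), because the header uses both prefixes.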
2009-02-03:
Version 10:
First public release of HOMD 16S rRNA RefSeq, containing 753 reference sequences.
HOMD 16S rRNA Extended RefSeq version history:
2010-02-19:
Version 1.1
- A total of 1647 reference sequences, including all 755 sequences from the non-extended version 10.1 and an additional 892 new sequences that have yet to be assigned a formal oral taxon. The header lines have been changed to a format consistent with the non-extended version 10.1.
2010-02-07:
Version 1.1
- A total of 34,879 cloned rDNA sequences which were collected over the years by the HOMD research group. These sequences have recently been deposited in NCBI GenBank. Here we provide the complete collection retrieved from GenBank in FASTA format. In addition to the GI and GenBank accession number, an internal HOMD sequence ID has also been added to the header line for each sequence (AW149W in the example header line below):
>gi|285159138|gb|GU397556.1|AW149W| Caulobacter sp. oral taxon 002 clone AW149 16S ribosomal RNA gene, partial sequence
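The internal HOMD sequence ID can be pulled out of such a header by splitting on the "|" separator. The sketch below assumes that every header in the collection follows the layout of the single example above (gi, GI number, gb, accession, HOMD ID, description); the function name and the returned keys are hypothetical.

```python
def parse_clone_header(header: str) -> dict:
    """Split a '>gi|GI|gb|ACCESSION|HOMD_ID| description' header into parts."""
    # Limit the split so any '|' inside the free-text description is preserved.
    fields = header.strip().lstrip(">").split("|", 5)
    if len(fields) != 6 or fields[0] != "gi" or fields[2] != "gb":
        raise ValueError("header does not match the assumed gi|...|gb|... layout")
    return {
        "gi": fields[1],
        "accession": fields[3],
        "homd_id": fields[4].strip(),
        "description": fields[5].strip(),
    }


example = parse_clone_header(
    ">gi|285159138|gb|GU397556.1|AW149W| Caulobacter sp. oral taxon 002 "
    "clone AW149 16S ribosomal RNA gene, partial sequence"
)
print(example["homd_id"], example["accession"])
```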
2009-02-25:
Version 1:
First public release of HOMD 16S rRNA Extended RefSeq, containing 1726 reference sequences, including all 753 in the non-extended version above.
HOMD 16S rRNA gene sequence clonal collection version history: