HMDB loader

What is the HMDB?

Welcome to HMDB Version 5.0
The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery and general education. The database is designed to contain or link three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data. The database contains 220,945 metabolite entries including both water-soluble and lipid-soluble metabolites. Additionally, 8,610 protein sequences (enzymes and transporters) are linked to these metabolite entries. Each MetaboCard entry contains 130 data fields with 2/3 of the information being devoted to chemical/clinical data and the other 1/3 devoted to enzymatic or biochemical data. Many data fields are hyperlinked to other databases (KEGG, PubChem, MetaCyc, ChEBI, PDB, UniProt, and GenBank) and a variety of structure and pathway viewing applets. The HMDB supports extensive text, sequence, chemical structure, MS and NMR spectral query searches. Four additional databases, DrugBank, T3DB, SMPDB and FooDB, are also part of the HMDB suite of databases. DrugBank contains equivalent information on ~2832 drugs and 800 drug metabolites, T3DB contains information on ~3670 common toxins and environmental pollutants, SMPDB contains pathway diagrams for ~132,335 human metabolic, drug and disease pathways as well as ~60,628 pathways for other organisms, while FooDB contains equivalent information on ~70,000 food components and food additives.

The simple text query (above) supports general text queries of the entire textual component of the database. Clicking on the Browse menu header (on the HMDB navigation panel above) generates a tabular synopsis of the HMDB's content. This browse view allows users to casually scroll through the database or re-sort its contents. Clicking on a given HMDB ID brings up the full data content for the corresponding metabolite. The Biospecimen link in the Browse menu generates hyperlinked tables listing normal and abnormal concentrations of different metabolites for 23 different biospecimens. In the Search menu, the ChemQuery Structure Search link allows users to draw (using a ChemSketch applet) or write (using a SMILES string) a chemical compound and to search HMDB for chemicals similar or identical to the query compound. The Text Query supports a more sophisticated text search (partial word matches, case-sensitive, misspellings, etc.) of the text portion of HMDB. The Sequence Search allows users to conduct BLAST sequence searches of the more than 8,299 gene and protein sequences contained in HMDB. Both single and multiple sequence BLAST queries are supported. The Advanced Search link opens an easy-to-use relational query search tool that allows users to select or search over various combinations of subfields. The Advanced Search is the most sophisticated search tool for HMDB. The MS Search allows users to submit mass spectral files (MoverZ format) that will be searched against the HMDB's library of LC-MS/MS spectra. This allows the identification of metabolites from mixtures via LC-MS/MS spectroscopy. The 1D and 2D NMR Searches allow users to submit peak lists from 1H or 13C NMR spectra (both pure and mixtures) or 2D TOCSY or 13C HSQC spectra, respectively, and to have these spectra compared to the NMR libraries contained in the HMDB. This allows the identification of metabolites from mixtures via NMR spectroscopy.
The Downloads link provides links to collected sequence, image and text files associated with the HMDB. The HML Home button links to the Human Metabolome Library (HML) home page. The HML lists metabolites that can be ordered for a fee by researchers around the world. Finally, under the About menu there are links for HMDB Statistics, and information about the Data Sources used to assemble the HMDB.
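
The 1D NMR search described above compares a submitted peak list against library spectra. The sketch below illustrates the general idea of tolerance-based peak matching only; it is not the HMDB's actual algorithm. The function names, the 0.02 ppm tolerance and the library contents are all hypothetical.

```python
def match_fraction(query_ppm, reference_ppm, tolerance=0.02):
    """Fraction of reference peaks that have a query peak within `tolerance` ppm."""
    if not reference_ppm:
        return 0.0
    hits = sum(
        1
        for ref in reference_ppm
        if any(abs(ref - q) <= tolerance for q in query_ppm)
    )
    return hits / len(reference_ppm)


def rank_candidates(query_ppm, library, tolerance=0.02):
    """Rank library entries (name -> peak list) by descending match fraction."""
    scores = {
        name: match_fraction(query_ppm, peaks, tolerance)
        for name, peaks in library.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical library and query; the shifts are made up for illustration.
library = {
    "compound_A": [1.20, 3.50],
    "compound_B": [2.00, 7.10, 8.30],
}
ranked = rank_candidates([1.21, 3.49, 7.11], library)
```

Scoring by the fraction of reference peaks found keeps a compound from being penalised for extra peaks contributed by other components of a mixture.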

FAIR Compliance

The HMDB is FAIR. Specifically, it is:

FINDABLE:
F1. (meta)data are assigned a globally unique and eternally persistent identifier.
All data and metadata in the HMDB are assigned a 7-digit globally unique HMDB identifier. This identifier is searchable within the database and associated with all data in the online database. Furthermore, the identifier is associated with all data downloadable from the database.
F2. data are described with rich metadata.
All data in the HMDB are described with rich metadata. Every compound, physical property, concentration, pathway diagram and spectrum is described in detail. Structures, names, synonyms, biological sources (if available), descriptions, pathways, concentrations, experimental details and other relevant data are all attached to the referential data. Additionally, scientific references are provided for each entry.
F3. (meta)data are registered or indexed in a searchable resource.
All the data and metadata in the HMDB are indexed, viewable and registered through the HMDB online database at https://hmdb.ca.
F4. metadata specify the data identifier.
All metadata entries in the HMDB are directly linked to the HMDB identifier and associated concentration, disease, pathway and spectral data.
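
Because every record is keyed by a 7-digit HMDB identifier, a loader will typically want to validate and normalize identifiers before using them as keys. The following is a minimal sketch, assuming accessions are written as "HMDB" followed by 7 digits and that older 5-digit forms may still appear in legacy data. The function name and the padding rule are assumptions, not something specified above.

```python
import re

# Assumption: accessions look like "HMDB" + 7 digits (e.g. "HMDB0000001").
# Assumption: legacy 5-digit forms (e.g. "HMDB00001") are zero-padded to 7 digits.
_HMDB_ID = re.compile(r"HMDB(\d{5}|\d{7})", re.IGNORECASE)


def normalize_hmdb_id(raw: str) -> str:
    """Return the 7-digit form of an HMDB accession, or raise ValueError."""
    match = _HMDB_ID.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not an HMDB identifier: {raw!r}")
    return "HMDB" + match.group(1).zfill(7)
```

For example, `normalize_hmdb_id(" hmdb00001 ")` returns `"HMDB0000001"`, while a malformed value such as `"HMDB123"` raises `ValueError`.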

ACCESSIBLE:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
All data and metadata in the HMDB are retrievable from their HMDB identifier through the HMDB website. Similarly, all data in the HMDB may be downloaded in JSON, CSV, TXT, PNG, XML, mzML and/or nmrML format via the HMDB’s download section through a standard internet communications protocol. All MS spectra in the HMDB are also accessible via the internet through their SPLASH identifier. All chemical structure data are available in universally readable formats: SMILES, SDF, MOL, InChI, InChIKey and PDB.
A1.1. the protocol is open, free, and universally implementable.
The HMDB website is open and free and its data download operation is compatible with all modern web browsers. The HMDB’s downloadable data is in multiple formats (JSON, CSV, TXT, PNG, XML, mzML and/or nmrML) that are universally readable.
A1.2. the protocol allows for an authentication and authorization procedure, where necessary.
No authentication or authorization is required to access or download the HMDB’s data.
A2. metadata are accessible, even when the data are no longer available.
All of the HMDB’s metadata are linked or linkable to more permanent data sources (PubMed, PubChem, ChEBI, MetaboLights, NP-MRD, MarkerDB, etc). The availability of freely downloadable data (and metadata) for the HMDB ensures that its metadata will exist and be accessible beyond the lifetime of the database.
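
Retrieval by identifier and the XML download format described above can be sketched as follows. This is an illustration under stated assumptions rather than a documented API. The `/metabolites/<id>` URL layout, the element names (`metabolite`, `accession`, `name`) and the `http://www.hmdb.ca` namespace are assumptions about the site and its XML export, and `metabolite_url` and `iter_metabolites` are hypothetical helper names.

```python
import io
import xml.etree.ElementTree as ET

HMDB_BASE = "https://hmdb.ca"  # site URL given in the text


def metabolite_url(hmdb_id: str, fmt: str = "") -> str:
    """Build a record URL. Assumed layout: <base>/metabolites/<id>[.<fmt>]."""
    suffix = f".{fmt}" if fmt else ""
    return f"{HMDB_BASE}/metabolites/{hmdb_id}{suffix}"


def _local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://www.hmdb.ca}'."""
    return tag.rsplit("}", 1)[-1]


def iter_metabolites(xml_file):
    """Yield (accession, name) pairs from a metabolite XML stream."""
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        if _local(elem.tag) != "metabolite":
            continue
        # Only direct children are read, so nested accessions are ignored.
        fields = {_local(child.tag): (child.text or "").strip() for child in elem}
        yield fields.get("accession", ""), fields.get("name", "")
        elem.clear()  # free the parsed subtree; a full download can be large


# Tiny inline sample standing in for a downloaded file (assumed structure).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<hmdb xmlns="http://www.hmdb.ca">
  <metabolite>
    <accession>HMDB0000001</accession>
    <name>1-Methylhistidine</name>
  </metabolite>
</hmdb>"""

records = list(iter_metabolites(io.BytesIO(SAMPLE)))
```

Streaming with `iterparse` and clearing each element after use avoids holding the whole document in memory, which matters for a file covering every metabolite entry.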

INTEROPERABLE:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
All textual data and metadata in the HMDB are written in English. All spectral data are in the nmrML and mzML exchange formats, and all MS spectra can be referenced via SPLASH identifiers. All chemical structure data are available in SMILES, SDF, MOL, InChI, InChIKey and PDB formats, and all images are stored in PNG format. All nomenclature for compounds and spectral data follows the standard IUPAC nomenclature, ontologies or vocabularies used to describe these entities.
I2. (meta)data use vocabularies that follow FAIR principles.
All data and metadata in the HMDB use ontologies, nomenclature and vocabularies that are open, findable, accessible, interoperable and re-usable.
I3. (meta)data include qualified references to other (meta)data.
All data and metadata in the HMDB are fully referenced with detailed descriptions of their provenance and sources.

RE-USABLE:
R1. (meta)data have a plurality of accurate and relevant attributes.
All data and metadata in the HMDB deposited by its curation team have been carefully curated and vetted by multiple skilled curators. All data deposited into the HMDB have been automatically checked for accuracy and consistency using comprehensive data checking software. All data in the HMDB have attributes that are relevant, up-to-date and accurate to the best of the HMDB curation team’s knowledge.
R1.1. (meta)data are released with a clear and accessible data usage license.
All data and metadata in the HMDB are released under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
R1.2. (meta)data are associated with their provenance.
All data and metadata in the HMDB have detailed descriptions of their provenance and sources.
R1.3. (meta)data meet domain-relevant community standards.
The data and metadata in the HMDB have undergone extensive peer review by members of the metabolomics community. The data in the HMDB have met the standards for publication in peer-reviewed scientific journals and international scientific conferences.

Citing the HMDB

HMDB is offered to the public as a freely available resource. Use and re-distribution of the data, in whole or in part, for commercial purposes requires explicit permission of the authors and explicit acknowledgment of the source material (HMDB) and the original publication (see below). We ask that users who download significant portions of the database cite the HMDB paper in any resulting publications. For commercial licences, please consult with [email protected] (Scott).


Please cite:

  1. Wishart DS, Tzur D, Knox C, et al., HMDB: the Human Metabolome Database. Nucleic Acids Res. 2007 Jan;35(Database issue):D521-6. 17202168
  2. Wishart DS, Knox C, Guo AC, et al., HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res. 2009;37(Database issue):D603-610. 18953024
  3. Wishart DS, Jewison T, Guo AC, Wilson M, Knox C, et al., HMDB 3.0 — The Human Metabolome Database in 2013. Nucleic Acids Res. 2013. Jan 1;41(D1):D801-7. 23161693
  4. Wishart DS, Feunang YD, Marcu A, Guo AC, Liang K, et al., HMDB 4.0 — The Human Metabolome Database for 2018. Nucleic Acids Res. 2018. Jan 4;46(D1):D608-17. 29140435
  5. Wishart DS, Guo AC, Oler E, et al., HMDB 5.0: the Human Metabolome Database for 2022. Nucleic Acids Res. 2022. Jan 7;50(D1):D622-31. 34986597