mzTab-M 2.1 Specification

Preface

Status of This Document

This document presents the specification of the mzTab data format for metabolomics developed by members of the Human Proteome Organisation (HUPO) Proteomics Standards Initiative (PSI) Proteomics Informatics (PI) Working Group, in collaboration with the Metabolomics Standards Initiative (MSI). Distribution is unlimited.

Version of This Document

Date created: June 1st, 2012

Last updated: 2026-06-12T07:06:37Z

The current version of this document is: version 2.1 (In Development)

The latest (draft) version of this document may be found at https://github.com/HUPO-PSI/mzTab-M.

Type of This Document

This document is a recommendation for a common, community-driven standard data exchange format in metabolomics.

Authors

Please see Section 12 for details on the authors and editors of this document.

Abstract

The Metabolomics Standards Initiative (MSI) and the Human Proteome Organisation (HUPO) Proteomics Standards Initiative (PSI) define community standards for data representation in proteomics/metabolomics to facilitate data comparison, exchange and verification. In this context, the two organizations are working together on a shared standard for downstream results, following mass spectrometry (MS) analysis. This document defines a tab-delimited text file format to report metabolomics results, based on a shared core mzTab format, which was primarily used in the proteomics context before (mzTab v 1.0).

The intention of this specification, mzTab for Metabolomics (mzTab-M), is to extend the concepts established in the previous specification, so that more detail can be captured about the evidence trail for quantification, including MS features (different charge states or adducts) and the evidence trail for identifications, specifically for MS-based experiments on small molecules (metabolites, lipids, contaminants, etc.). mzTab-M is not formally backwards compatible, but follows a similar design pattern to simplify adaptation of existing software and to facilitate its integration into bioinformatics processing and submission workflows.

1. Introduction

1.1. Background

This document addresses the systematic description of small molecule identification and quantification data retrieved from mass spectrometry (MS)-based experiments. A large number of software tools are available that analyze MS data and produce a variety of different output data formats.

mzTab-M is intended as a reporting standard for quantitative results from metabolomics/lipidomics approaches. This format is further intended to provide local LIMS systems as well as MS metabolomics repositories with a simple way to share and combine basic information.

mzTab has been developed with a view to support the following general tasks (more specific use cases are provided in Section 2):

  1. Facilitate the sharing of final experimental results, especially with researchers outside the field of metabolomics.

  2. Export of results to external software, including programs such as Microsoft Excel® and Open Office Spreadsheet and statistical software / coding languages such as R.

  3. Act as an output format of (web-) services that report MS-based results and thus can produce standardized result pages.

  4. Be able to link to the external experimental evidence e.g. by referencing back to mzML files.

This document presents a specification, not a tutorial. As such, the presentation of technical details is deliberately direct. The role of the text is to describe the model and justify design decisions made. The document does not discuss how the models should be used in practice, consider tool support for data capture or storage, or provide comprehensive examples of the models in use. It is anticipated that tutorial material will be developed independently of this specification.

1.2. Document Structure

The remainder of this document is structured as follows.

Section 2 lists requirements that mzTab-M is designed to support.

Section 3 describes the terminology used.

Section 4 describes how the specification presented in Section 6 relates to other specifications, both those that it extends and those that it is intended to complement.

Section 5 discusses the reasoning behind several design decisions taken.

Section 6 contains the documentation of the file.

Section 9 lists use cases that are currently not supported.

Section 10 presents the conclusions.

2. Requirements for mzTab

The following requirements have driven the development of the mzTab data model, and are used to define the scope of the format in version 2.0.0.

  1. mzTab-M files should be simple enough to make metabolomics results accessible to people outside the respective fields. This should facilitate the sharing of data beyond the borders of the fields and make it accessible to non-experts.

  2. mzTab-M files should contain sufficient information to provide an electronic summary of all findings in a metabolomics study to permit its use as a standard documentation format for ‘supplementary material’ sections of publications in metabolomics. It should thus be able to replace PDF tables as a way of reporting small molecules and make published identification and quantification information more accessible.

  3. mzTab-M files should enable reporting at different levels of detail: ranging from a simple summary of the final results to a detailed reporting including the experimental design.

  4. It should be possible to open mzTab-M files with “standard” software such as Microsoft Excel® or Open Office Spreadsheet. This should furthermore improve the usability of the format to people outside the fields of metabolomics.

  5. mzTab-M files should make MS-derived results easily accessible to scripting languages allowing bioinformaticians to develop software without the overhead of developing sophisticated parsing code. Since mzTab-M files will be comparatively small, the data from multiple experiments can be processed at once without requiring special resource management techniques.

  6. It should be possible to contain the complete final results of an MS-based metabolomics experiment in a single file, with the exception that different ionisation modes SHOULD be captured in different files (see Section 5.6). This should furthermore reduce the complexity of sharing and processing an experiment’s final results.

  7. It should be useful as an output format by web-services that can then be readily accessed by tools supporting mzTab-M.

  8. It should be possible to directly link a small molecule record to its source spectrum in an external MS data file.

3. Notational Conventions

The key words “MUST,” “MUST NOT,” “REQUIRED,” “SHALL,” “SHALL NOT,” “SHOULD,” “SHOULD NOT,” “RECOMMENDED,” “MAY,” and “OPTIONAL” are to be interpreted as described in RFC-2119 (Bradner 1997).

4. Relationship to Other Specifications

The specification described in this document has not been developed in isolation; indeed, it is designed to be complementary to, and thus used in conjunction with, several existing and emerging models. Related specifications include the following:

  1. mzML (http://www.psidev.info/mzml). mzML is the PSI standard for capturing mass spectra / peak lists resulting from mass spectrometry in proteomics (Martens et al. 2011). mzTab files MAY be used in conjunction with mzML, although it will be possible to use mzTab with other formats of mass spectra. This document does not assume familiarity with mzML.

  2. ISA-TAB (http://isa-tools.org/). The ISA framework allows for reporting experimental metadata and study designs in considerable detail, and is already used for describing metabolomics experiments. It is expected that mzTab files may be linked to ISA-TAB formatted files, for cases where a rich experimental design is to be captured. The linkage between mzTab-M and ISA-TAB is further exemplified in Section 5.13.

4.1. Relationship to previous mzTab versions

4.1.1. Relationship to mzTab 1.0

The first stable version of mzTab (version 1.0) was developed primarily by the PSI as a format for the final results (identification or quantification) of a proteomics experiment, using MS. In mzTab version 1.0 limited support was included for metabolomics, through a small molecule table, in which end results could be encoded at the level of quantified metabolites. The intention of mzTab-M is to extend these concepts, so that more detail can be captured about the evidence trail for quantification, including MS features (different charge states or adducts) and the evidence trail for identifications - both of which could not be easily supported in mzTab v 1.0. mzTab-M is not formally backwards compatible, but follows a similar design pattern. It has not been designed to support proteomics. However, design decisions made in mzTab-M may in the future be adopted for a version of mzTab specifically intended for proteomics only (mzTab-P). At the time of writing, mzTab version 1.0 remains in active use for proteomics, but is deprecated for use in metabolomics.

4.1.2. Relationship to mzTab-M 2.0.0

The first stable version of mzTab-M for Metabolomics (version 2.0.0) was developed to support the reporting of results from MS-based metabolomics experiments. mzTab-M 2.1.0 clarifies several aspects, and introduces the study_variable_group element to support multi-factorial experimental designs. As a consequence, the study_variable[1-n]-factors field has been removed: the experimental factors it was intended to capture are now represented explicitly through study_variable_group[1-n] and its sub-fields (-description, -type, -datatype, -unit), together with the study_variable[1-n]-group_ref reference from each study variable to its group. The -datatype sub-field supports the standard XSD primitive types as well as Parameter, allowing study variable values to be reported as user-defined or CV Parameters when categorical factors are best expressed by ontology terms.

Some elements have been altered, some have been removed. We have tried to maintain backwards-compatibility with mzTab-M 2.0.0 as much as possible.

4.2. The PSI Mass Spectrometry Controlled Vocabulary (CV)

The PSI-MS controlled vocabulary is intended to provide terms for annotation of mass spectrometry-related file formats. The CV has been generated with a collection of terms from software vendors and academic groups working in the area of mass spectrometry and MS informatics. Some terms describe attributes that must be coupled with a numerical value attribute in the cvParam element (e.g. MS:1000028 “detector resolution”) and optionally a unit for that value (e.g. MS:1001117, “theoretical mass”, units = “dalton”). The terms that require a value are denoted by having a “datatype” key-value pair in the CV itself: MS:1000511 "ms level" value-type:xsd:int. Terms that need to be qualified with units are denoted with a “has_units” key in the CV itself (relationship: has_units: UO:0000221 ! dalton).

As recommended by the PSI CV guidelines, psi-ms.obo should be dynamically maintained via the psidev-ms-vocab@lists.sourceforge.net mailing list that allows any user to request new terms in agreement with the community involved. Once a consensus is reached among the community the new terms are added within a few business days. If there is no obvious consensus, the CV coordinators committee should vote and make a decision. A new psi-ms.obo should then be released by updating the file on the GitHub server without changing the name of the file.

The following ontologies or controlled vocabularies specified below may also be recommended or required in certain instances, as specified within the CV mapping file:

Most of these ontologies are also indexed and searchable at EBI’s Ontology Lookup Service (OLS) at https://www.ebi.ac.uk/ols/index

5. Resolved Design and scope issues

Several issues regarding the design of the format were not clear cut, and in each case a design choice was made that was not completely agreeable to everyone. So that these issues are not continuously revisited, we document here the issues and the reasons for the decisions that were implemented.

Small molecules MUST be linked to an identifier of the source spectrum (in an external file) from which the identifications are made by way of a reference in the spectra_ref attribute and via the ms_run element which stores the URI of the file in the location attribute.

It is advantageous if there is a consistent system for identifying spectra in different file formats. The following table is implemented in the PSI-MS CV for providing consistent identifiers for different spectrum file formats.

This table shows examples from the CV but MAY be extended. The CV holds the definitive specification for legal encodings of spectrum identifier values.
Table 1. Controlled vocabulary terms and rules implemented in the PSI-MS CV for formulating the “nativeID” to identify spectra in different file formats.

ID | Term | Data type | Comment
MS:1000768 | Thermo nativeID format | controllerType=xsd:nonNegativeInteger controllerNumber=xsd:positiveInteger scan=xsd:positiveInteger | controller=0 is usually the mass spectrometer
MS:1000769 | Waters nativeID format | function=xsd:positiveInteger process=xsd:nonNegativeInteger scan=xsd:nonNegativeInteger |
MS:1000770 | WIFF nativeID format | sample=xsd:nonNegativeInteger period=xsd:nonNegativeInteger cycle=xsd:nonNegativeInteger experiment=xsd:nonNegativeInteger |
MS:1000771 | Bruker/Agilent YEP nativeID format | scan=xsd:nonNegativeInteger |
MS:1000772 | Bruker BAF nativeID format | scan=xsd:nonNegativeInteger |
MS:1000773 | Bruker FID nativeID format | file=xsd:IDREF | The nativeID must be the same as the source file ID
MS:1000774 | multiple peak list nativeID format | index=xsd:nonNegativeInteger | Used for referencing peak list files with multiple spectra, i.e. MGF, PKL, merged DTA files. Index is the spectrum number in the file, starting from 0.
MS:1000775 | single peak list nativeID format | file=xsd:IDREF | The nativeID must be the same as the source file ID. Used for referencing peak list files with one spectrum per file, typically in a folder of PKL or DTAs, where each sourceFileRef is different
MS:1000776 | scan number only nativeID format | scan=xsd:nonNegativeInteger | Used for conversion from mzXML, or a DTA folder where native scan numbers can be derived.
MS:1000777 | spectrum identifier nativeID format | spectrum=xsd:nonNegativeInteger | Used for conversion from mzData. The spectrum id attribute is referenced.
MS:1001530 | mzML unique identifier | xsd:string | Used for referencing mzML. The value of the spectrum id attribute is referenced directly.

In mzTab, the spectra_ref attribute should be constructed following the data type specification in CV Terms and Rules. As an example, to reference the third spectrum (index = 2) in an MGF (Mascot Generic Format) file:

MTD ms_run[1]-format [MS, MS:1001062, Mascot MGF file, ]

MTD ms_run[1]-id_format [MS, MS:1000774, multiple peak list nativeID format, ]

...

SEH ... spectra_ref ...

SME ... ms_run[1]:index=2 ...

Example: Reference the spectrum with identifier “scan=11665” in an mzML file.

MTD ms_run[1]-format [MS, MS:1000584, mzML file, ]

MTD ms_run[1]-id_format [MS, MS:1001530, mzML unique identifier, ]

...

SEH ... spectra_ref ...

SME ... ms_run[1]:scan=11665 ...
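The two examples above share one layout: the ms_run element, a colon, and the nativeID of the spectrum. The following Python sketch illustrates how such a reference might be composed and split again. The function names are hypothetical and not part of this specification, and the sketch does not check the nativeID against the rules of the declared id_format.

```python
import re

# Layout taken from the examples above: ms_run[n]:<nativeID>
SPECTRA_REF = re.compile(r"ms_run\[(\d+)\]:(.+)")


def make_spectra_ref(ms_run_index: int, native_id: str) -> str:
    """Compose a spectra_ref value (hypothetical helper)."""
    if ms_run_index < 1:
        raise ValueError("ms_run indices start at 1")
    return f"ms_run[{ms_run_index}]:{native_id}"


def parse_spectra_ref(ref: str) -> tuple:
    """Split a spectra_ref into (ms_run index, nativeID)."""
    match = SPECTRA_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a spectra_ref: {ref!r}")
    return int(match.group(1)), match.group(2)


print(make_spectra_ref(1, "index=2"))
print(parse_spectra_ref("ms_run[1]:scan=11665"))
```

A consumer would typically use the returned ms_run index to look up the file location and id_format in the metadata section before resolving the nativeID.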

5.2. Recommendations for reporting replicates within experimental designs

Modeling the correct reporting of technical/biological replicates within experimental designs is supported in mzTab as shown in Figure 1. The following components have various cross-references and MUST be used in different types of mzTab files as follows:

  • study_variable – The variables about which the final results of a study are reported, which may have been derived following averaging across a group of replicate measurements (assays). The same concept has been defined by others as “experimental factor”.

  • ms_run – An MS run is effectively one run on an MS instrument, and is referenced from assay in different contexts. In the case of pre-fractionation into n fractions, an assay SHOULD reference n ms_runs.

  • assay – The application of a measurement about the sample (in this case through MS) – producing values about small molecules or lipids. One assay is typically mapped to one MS run in the case of label-free MS analysis (with no pre-fractionation). At the present time, multiplexing within an ms_run is not supported in mzTab-M, thus there would typically be a one:one relationship between assay and ms_run.

  • sample – a biological material that has been analyzed, to which descriptors of species, cell/tissue type etc. can be attached. In all types of mzTab file, these MAY be reported in the metadata section as sample[1-n]-description. Samples are NOT MANDATORY in mzTab, since many software packages cannot determine what type of sample was analyzed (e.g. whether biological or technical replication was performed). If the file producer wishes to describe whether biological or technical replication has been performed, then sample elements SHOULD be provided.

Clear definitions of biological and technical replicates are difficult to provide as these are somewhat dependent upon the biological domain. However, we use the following general definitions in mzTab.

  • Biological replicates are where different samples have been analyzed by MS.

  • Technical replicates are where the same samples are analyzed multiple times by MS.

There is deliberately no attempt to define the boundary of the term “sample”.

If sample level information is provided optimally, it is expected that:

  • n biological replicates can be mapped to sample[1-n]

  • m technical replicate measurements of sample 1 SHOULD be mapped to assay[1-m] referencing sample[1] (for example).

However, an open challenge remains, since analysis software is often not aware of whether replicates (multiple MS runs) are originally biological or technical in nature. As such, the default behavior for mzTab exporters from quantitative software is to exclude sample level information and report quantitative data for assay[1-n] and study_variable[1-n].

Additional annotation software would typically be required to add the sample-level information, as provided (often manually) by the user.

Figure 1. Simple experimental designs in mzTab-M can be represented using a combination of the elements study_variable (SV), assay, ms_run and sample. Quantitative values can be reported in files for SVs and assays. A) SV is intended to capture different groups of replicates, which might have resulted from different levels of a given variable e.g. control versus treated (represented as 2 SVs), n time points over a treatment course (as n SVs). B) assay captures a measurement made about a molecule (small molecule/lipid) where multiple assays within the same group are taken to be replicates of some kind (biological or technical). C) ms_run captures a single run on an MS instrument. D) samples are optional in mzTab since the quantitative software may often be unaware of the biological samples that have been analyzed. If that information is available, references from assay to the same (technical, upper half) or different (biological, bottom half) samples are used to describe the type of replication performed.

Starting with mzTab-M 2.1, more complex experimental designs involving multiple factors (e.g. different treatments at different time points) can be represented by grouping multiple study variables. Such a group of study variable instances can be considered a column in a wide-layout experimental design matrix.

5.3. Recommendations for reporting multifactorial experimental designs

Multifactorial experimental designs are common in metabolomics, and their reporting requires some explanation. In mzTab-M 2.1, the study_variable_group element has been introduced to support the grouping of multiple study variables (SVs) that together represent a column in a wide-layout experimental design matrix. For example, if there are two factors (e.g. treatment and time point), each factor can be represented by its own study_variable_group. Each study_variable then references the relevant assay(s) and the study_variable_group. At least one study_variable_group MUST be defined.

5.4. Reporting derivatization approaches

For GC and HPLC, derivatization is often applied in order to specifically target compounds that are otherwise hard to measure at all, being non-volatile or otherwise chemically / physically poorly suited for the separation method and to increase ionization efficiency and selectivity for subsequent MS analysis. For GC, the primary derivatization methods are:

  • acylation

  • alkylation and esterification

  • silylation

In mzTab-M, any derivatization agents used should be reported in the metadata section under derivatization_agent[1-n]. Where matches in the small molecule evidence table are made to database entries that include the derivatized form, that form SHOULD be reported in the evidence row. In the small molecule (summary) table, it MAY be appropriate to reference a database entry for the actual molecule inferred without the derivatization addition, although this is context dependent and in some cases it may be more appropriate to reference a database entry for the derivatized form.

5.5. Encoding missing values, zeroes, nulls, infinity and calculation errors

In the table-based sections there MUST NOT be any empty cells. In case a given property is not available “null” MUST be used, but this is only allowed for parameters with "is nullable=True".

Numerical values MUST be encoded following the specifications of xs:decimal. This does not natively support NaN, INF, scientific notation or null. As such, it is allowed in mzTab to include "NaN" for incalculable numbers and "null" for no data. In some cases, there is ambiguity with respect to the use of "0" versus "null": e.g. if there are alignment issues and it is unclear whether a molecule has been quantified with zero abundance or the feature was potentially present in the data but was not found. Export software would be expected to make a decision on these cases, based on the best understanding of the case in hand.

Scientific notation and infinity are explicitly not supported.
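As an illustration of the rules in this section, the following Python sketch checks a single numeric cell. The function name and the nullable flag are assumptions made for the example. The regular expression approximates the lexical form of xs:decimal (optional sign, digits, optional fractional part) and is not a substitute for a schema validator.

```python
import re

# Plain decimal notation only: optional sign, digits, optional fraction.
# Exponents ("1e5") and infinity ("INF") do not match.
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_valid_numeric_cell(text: str, nullable: bool = True) -> bool:
    """Return True if text is an acceptable numeric cell (hypothetical helper)."""
    if text == "null":
        # "null" is only allowed where the column is nullable
        return nullable
    if text == "NaN":
        # permitted for incalculable numbers
        return True
    return DECIMAL.fullmatch(text) is not None


for cell in ["24.55", "-0.5", "NaN", "null", "1e5", "INF", ""]:
    print(repr(cell), is_valid_numeric_cell(cell))
```

An empty string is rejected, in line with the rule that table-based sections MUST NOT contain empty cells.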

5.6. Support for positive and negative modes

It is common in metabolomics workflows to use both positive and negative ionisation modes to increase coverage of molecules quantified. In general, an mzTab-M file is intended to capture a data set generated from assays which have been aligned (e.g. in the retention time dimension) to produce a coherent data matrix with few missing values. To our knowledge, it is not common to directly compare the results from positive and negative modes in the same data matrix. As such, we anticipate that such results (i.e. positive mode and negative mode) should be encoded in two different mzTab-M files.

5.7. Referencing evidence for small molecule identifications

Evidence for small molecule identification is captured by reference from the SML table via features (SMFs) down to the final table - Small Molecule Evidence (SME) elements. It is possible to have a legal mzTab-M file that does not contain any features (SML summary level only). In this case, detailed information about small molecule evidence cannot be provided. It is generally RECOMMENDED to include data at the SML, SMF and SME levels.

SMF elements should reference down to all evidence elements (SME rows) that support the identification of that particular feature.

If features (SMF elements) have been grouped prior to evidence collation, then different groups of SMF elements SHOULD reference the same SME elements redundantly.

Figure 2. A) The summary level (SML) reports the final assumed identification, allowing for ambiguity by “|” separated results in the relevant columns. B) The feature level (SMF) does not explicitly report identifications but references down to the SME level. Ambiguity is propagated via referencing multiple SME elements (rows) with different identification results. C) One SME element (one row) represents a single possible identification from some input evidence. Multiple identifications from the same input data share the same value for evidence_input_id. Ambiguity is captured by different rows for the same input data.

5.8. Ambiguity in identification

It is common in metabolomics and lipidomics for significant ambiguity to remain after data processing in the identification of molecules. In the top level (SML) table, multiple identifiers MAY be provided in several columns: database_identifier, chemical_formula, smiles, inchi, chemical_name and uri. If there is ambiguity in the actual identity of the molecule, multiple identifiers SHOULD be reported separated by the "|" character. The number of elements separated by | characters MUST be identical in all columns where data is reported to emphasize the correspondence across columns.
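The requirement that all reported columns carry the same number of "|" separated elements can be checked mechanically. The following Python sketch shows one possible check. It assumes that an SML row is available as a dictionary keyed by column name, that "null" marks a column with no data, and that the individual values do not themselves contain the "|" character. The function names and the identifiers in the example row are hypothetical.

```python
# Columns named in this section as allowing multiple identifiers.
AMBIGUITY_COLUMNS = (
    "database_identifier", "chemical_formula", "smiles",
    "inchi", "chemical_name", "uri",
)


def pipe_counts(row: dict) -> dict:
    """Count the '|' separated elements in every reported column."""
    counts = {}
    for column in AMBIGUITY_COLUMNS:
        value = row.get(column)
        if value is None or value == "null":
            continue  # no data reported in this column
        counts[column] = len(value.split("|"))
    return counts


def has_consistent_ambiguity(row: dict) -> bool:
    """True if all reported columns hold the same number of elements."""
    return len(set(pipe_counts(row).values())) <= 1


row = {
    "database_identifier": "DB:0001 | DB:0002",  # made-up identifiers
    "chemical_formula": "C6H12O6 | C6H12O6",
    "smiles": "null",
}
print(pipe_counts(row), has_consistent_ambiguity(row))
```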

The SML element [reliability] MUST be assigned a value to indicate the confidence or ambiguity of the overall assignment. By default, mzTab-M assumes the MSI 4 level system (see [reliability]). A different system of confidence levels MAY be defined in the metadata section (see Section 7.2.71 for details and examples). New systems can be supported in the future by extending the PSI MS controlled vocabulary.

When referencing from the features (SMF) elements to evidence (SME) elements, it is possible for a SMF element to reference multiple SME elements. However, there are potentially several reasons for a 1 to many relationship. A different code MUST be provided in the SME_ID_REF_ambiguity_code element to clarify the case:

  • The same input data (e.g. fragment spectrum or isotopic profile) has multiple results, supporting different potential identifications i.e. where ambiguity remains (code=1)

  • Different input data (or different searches of the same data) have returned evidence supporting the same identification i.e. no ambiguity remains (code=2).

  • Different input data has been used to support identification and ambiguity still remains (code=3).
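The three cases above can be expressed as a small decision function. The following Python sketch is one possible reading, not a normative algorithm. It assumes that each referenced SME row is reduced to a pair of its evidence_input_id and some value that stands for the identification (for example the database identifier), and that no code is needed when only a single SME row is referenced. The function name is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def ambiguity_code(evidence: Sequence[Tuple[str, str]]) -> Optional[int]:
    """Derive an ambiguity code from (evidence_input_id, identification) pairs."""
    if len(evidence) < 2:
        return None  # assumption: a single reference carries no code
    inputs = {input_id for input_id, _ in evidence}
    identifications = {identification for _, identification in evidence}
    if len(identifications) == 1:
        return 2  # all evidence supports the same identification
    if len(inputs) == 1:
        return 1  # same input data, different identifications
    return 3      # different input data, ambiguity remains


print(ambiguity_code([("in_1", "A"), ("in_1", "B")]))
print(ambiguity_code([("in_1", "A"), ("in_2", "A")]))
print(ambiguity_code([("in_1", "A"), ("in_2", "B")]))
```

Code 2 is also returned when several rows share the same input and the same identification, which corresponds to the "different searches of the same data" case.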

5.9. Ambiguity in lipidomics identification

The mzTab-M 2.0.0 release is intended to be used for capturing profiling studies from both metabolomics and lipidomics. However, it is acknowledged that representing ambiguity in the identification of lipid molecules, based on the available evidence from MS, is potentially more complicated than for small molecules. As such, mzTab-M 2.0.0 SHOULD be used on release for representing lipid-based data, but a working group will continue to improve on the mechanism for representing lipid identification data, for example by defining particular CV terms to be used in the appropriate places of the standard. These artefacts will be reported in due course and should plug in to this version in a backwards-compatible manner.

5.10. Guidelines for reporting results prior to or with no alignment step across features

The most common intended use for mzTab-M is to encode MS results that have been aligned across multiple analyses (assays), for example by retention time alignment in LC-MS or GC-MS approaches. However, it is possible to use mzTab-M as part of internal pipelines to represent small molecules quantified by MS (features) before alignment. The RECOMMENDED encoding for doing this would be to represent the features from n MS analyses in n mzTab files, rather than attempting to create an SMF table including a sparse matrix filled with nulls for all but one of the assay columns.

5.11. Guidelines for workflows involving pre-fractionation

It is possible that a single analysis of a sample is split offline via some fractionation technology prior to LC/GC-MS into n MS analyses to limit the complexity of the molecules arriving at the detector. Such workflows, while relatively rare in metabolomics, can be encoded in mzTab-M via an assay referencing n ms_runs. It may be desirable to maintain the link from a feature (SMF row) to the ms_run from which it was obtained. This SHOULD be achieved through the use of an optional column called "opt_global_ms_run_refs", which lists the identifiers of the ms_runs from which the feature has been quantified.

5.12. Adding optional columns

Additional columns MAY be added to the end of rows in all the table-based sections. The information stored within an optional column is completely up to the resource that generates the file. It MUST NOT be assumed that optional columns having the same name in different mzTab files contain the same type of information.

These column headers MUST start with the prefix “opt_” followed by the identifier of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: ‘A’-‘Z’, ‘a’-‘z’, ‘0’-‘9’, ‘_’, ‘-’, ‘[’, ‘]’, and ‘:’. CV parameter accessions MAY be used for optional columns following the format: opt_{OBJECT_ID}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by ‘_’.

COM Example showing a global aligned 2D feature retention time for GCxGC-MS

…
SFH SMF_ID … opt_global_retention_time_nd
SMF 1 … 1562 | 2.47
COM Example showing how drift time values are reported in an additional column from MS run 1 using
COM MS CV parameter “ion mobility drift time” (MS:1002476)

…
SFH SMF_ID … opt_ms_run[1]_cv_MS:1002476_ion_mobility_drift_time
SMF 1 … 24.55
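The naming rules for CV-based optional columns can be applied programmatically. The following Python sketch builds a column name following the format opt_{OBJECT_ID}_cv_{accession}_{parameter name} and rejects names containing characters outside the permitted set. It assumes that the underscore belongs to the permitted characters, as the prefix “opt_” and the examples above imply. The function name and the accepted forms of the object identifier are assumptions made for the example.

```python
import re

# Permitted characters: letters, digits, '_', '-', '[', ']' and ':'.
ALLOWED = re.compile(r"[A-Za-z0-9_\-\[\]:]+")
# Assumed object identifiers: "global" or an indexed assay, study_variable or ms_run.
OBJECT_ID = re.compile(r"global|(assay|study_variable|ms_run)\[[1-9][0-9]*\]")


def cv_column_name(object_id: str, accession: str, parameter_name: str) -> str:
    """Build an optional column name from a CV parameter (hypothetical helper)."""
    if OBJECT_ID.fullmatch(object_id) is None:
        raise ValueError(f"unsupported object identifier: {object_id!r}")
    # Spaces in the parameter name are replaced by underscores.
    name = f"opt_{object_id}_cv_{accession}_{parameter_name.replace(' ', '_')}"
    if ALLOWED.fullmatch(name) is None:
        raise ValueError(f"illegal character in column name: {name!r}")
    return name


print(cv_column_name("ms_run[1]", "MS:1002476", "ion mobility drift time"))
```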

5.13. Referencing external resources

The ISA-TAB format (Sansone 2012) is designed for capturing rich experimental designs in terms of a workflow of protocol steps, covering sample processing, data collection and data analysis for any type of high-throughput study. mzTab-M does not aim for a rich description of protocols, but is instead focused on tightly defining the data output from a metabolomics study. Users may wish to use ISA-TAB to record more details about these aspects. The ISA-TAB file can be referenced by the external_study_uri attribute.

Generally, any external resource reference (suffixed -uri, or -location) must be provided as a valid URI string. This allows local as well as remote resource links (URLs) and uniform resource names (URNs) to be reported.

Database identifiers SHOULD be reported in a form compatible with http://identifiers.org/, as demonstrated in the [database_identifier] examples, where the database identifier must be preceded by the resource description (prefix) followed by a colon, as specified in the metadata section (Section 7.2.64). The possible use of the full identifiers.org URI is shown in the example for the uri attribute within the SML section ([uri]).

5.14. Other supporting materials

Example files are located at GitHub.

6. Format specification

This section describes the structure of an mzTab file.

  • Field separator
    The column delimiter is the Unicode Horizontal Tab character (Unicode codepoint 0009).

  • File encoding
    The UTF-8 encoding of the Unicode character set is the preferred encoding for mzTab files. However, parsers should be able to recognize commonly used encodings.

  • Case sensitivity
    All column labels and field names are case-sensitive.

  • Line prefix
    Every line in an mzTab file MUST start with a three letter code identifying the type of line delimited by a Tab character. The three letter codes are as follows:

    • MTD for metadata

    • SMH for small molecule table header line (the column labels)

    • SML for rows of the small molecule table

    • SFH for small molecule feature header line

    • SMF for rows of the small molecule feature table

    • SEH for small molecule evidence header line

    • SME for rows of the small molecule evidence table

    • COM for comment lines

  • Header lines
    Each table based section (small molecule, small molecule feature and small molecule evidence) MUST start with the corresponding header line. These header lines MUST only occur once in the document since each section also MUST only occur once.

  • Dates and times
    Dates and times MUST be encoded using the ISO 8601 format:

    • Dates MUST use the form YYYY-MM-DD (e.g. 2025-03-14).

    • Times MUST use the form hh:mm:ss (e.g. 09:45:30). Fractional seconds and a timezone designator (Z for UTC, or ±hh:mm) MAY be appended (e.g. 09:45:30.250Z, 09:45:30+02:00).

    • Combined date and time values MUST use the form YYYY-MM-DDThh:mm:ss (e.g. 2025-03-14T09:45:30). Fractional seconds and a timezone designator MAY be appended (e.g. 2025-03-14T09:45:30Z, 2025-03-14T09:45:30+02:00).

  • Decimal separator
    In mzTab files the dot (“.”) MUST be used as decimal separator. Thousand separators MUST NOT be used in mzTab files.

  • Comment lines and empty lines
    Comment lines can be placed anywhere in an mzTab file. These lines must start with the three-letter code COM and are ignored by most parsers. Empty lines can also occur anywhere in an mzTab file and are ignored.

  • Params
    mzTab makes use of CV parameters. Because mzTab is expected to be used in experimental environments where CV terms may not yet be available (for example, for newly generated scores), every parameter can be reported either as a CV parameter or as a user parameter that only contains a name and a value.
    Parameters are always reported as [CV label, accession, name, value]. Any field that is not available MUST be left empty.

[MS, MS:1001477, SpectraST,]
[,,A user parameter, The value]

Should the name of the param contain commas, quotes MUST be added to avoid problems with the parsing: [label, accession, "first part of the param name, second part of the name", value].

[MOD, MOD:00648, "N,O-diacetylated L-serine",]

A CV parameter mapping file for mzTab following the mzML mapping file XML Schema is available at GitHub as part of the specification for semantic validation. The mapping file defines recommended controlled vocabularies and defines restrictions for the use of CV terms on particular elements of the mzTab document. Unlike other PSI standards, the model description of mzTab-M 2.0 is not based on an XML schema, but instead on a Swagger / OpenAPI 2.0 compatible specification that is used to generate a corresponding object structure that can be represented in XML, JSON or as an object hierarchy in arbitrary programming languages.
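The Param syntax described above ([CV label, accession, name, value], with optional double quotes around names that contain commas) can be read with a few lines of code. The following is a minimal Python sketch under stated assumptions, not a reference implementation: the names Param and parse_param are hypothetical, and nested parameters (a Param inside the value field) are deliberately not handled.

```python
import csv
from typing import NamedTuple


class Param(NamedTuple):
    """Hypothetical container for one mzTab parameter."""
    cv_label: str
    accession: str
    name: str
    value: str


def parse_param(text: str) -> Param:
    """Parse '[CV label, accession, name, value]' into its four fields.

    Unavailable fields are returned as empty strings. Double-quoted
    names may contain commas. Nested parameters are not supported.
    """
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a parameter: {text!r}")
    # The csv module handles the quoting rule for names with commas.
    fields = next(csv.reader([text[1:-1]], skipinitialspace=True), [])
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, found {len(fields)}: {text!r}")
    return Param(*(field.strip() for field in fields))


if __name__ == "__main__":
    print(parse_param("[MS, MS:1001477, SpectraST,]"))
    print(parse_param('[MOD, MOD:00648, "N,O-diacetylated L-serine",]'))
```

Delegating the comma and quote handling to a CSV reader avoids a hand-written tokenizer; a production parser would additionally need to decide how to treat nested parameters.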

  • Sample IDs
    To be able to supply metadata specific to each sample, ids in the format sample[1-n] are used.

MTD sample[1]-species[1] [NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
  • Assay IDs
    To be able to supply metadata specific to each assay, ids in the format assay[1-n] are used.

MTD assay[1] first assay description
  • Study variable IDs
    To be able to supply metadata specific to each study variable (grouping of assays), ids in the format study_variable[1-n] are used.

MTD study_variable[1] Group B (spike-in 0.74 fmol/uL)
  • URIs
    URIs MUST follow the format defined in RFC 3986 and RFC 8089 ('file' URIs).

  • Versioning
    To support a future evolution of the format, an mzTab file MUST report its version. From version 2.0.0-M onwards, we intend to use semantic versioning. This means that increasing the last digit of the version (the patch level) indicates backwards compatible fixes to the specification that require no adaptation of consumers or producers of the format. A change in the middle digit of the version (the minor level) indicates new features that are backwards compatible to existing software but will require updates for new producers and consumers to make use of those features. Finally, a change in the first digit of the version (the major level) indicates breaking changes in the format that require changes in any producing or consuming software to support features of that version.
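The line-level rules of this section (tab-separated fields, a three-letter line prefix, and comment and empty lines that can be skipped) can be illustrated with a short Python sketch. The function name split_by_prefix and the returned dictionary layout are assumptions made for this illustration only; the sketch performs no validation beyond checking the prefix.

```python
from collections import defaultdict

# Three-letter line prefixes listed in this section.
KNOWN_PREFIXES = {"MTD", "SMH", "SML", "SFH", "SMF", "SEH", "SME", "COM"}


def split_by_prefix(lines):
    """Group the tab-separated cells of each line by its line prefix.

    Empty lines and COM lines are skipped. Field names and prefixes are
    compared case-sensitively, as required by the format.
    """
    sections = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # empty lines may occur anywhere
        prefix, *cells = line.split("\t")
        if prefix not in KNOWN_PREFIXES:
            raise ValueError(f"line {number}: unknown line prefix {prefix!r}")
        if prefix == "COM":
            continue  # comment lines carry no data
        sections[prefix].append(cells)
    return dict(sections)


if __name__ == "__main__":
    example = [
        "COM\tExample file\n",
        "MTD\tmzTab-version\t2.1.0-M\n",
        "\n",
        "SMH\tSML_ID\n",
        "SML\t1\n",
    ]
    print(split_by_prefix(example))
```
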

7. Field Reference

This section provides a reference for all fields defined in the mzTab-M format, organised by section and ordered by mzTab-M section hierarchy. Each field entry includes a description, type, mandatory status, and example usage.

7.1. Sections

The mzTab-M format consists of four cross-referenced data tables: metadata (MTD), Small Molecule (SML), Small Molecule Feature (SMF), and Small Molecule Evidence (SME). The MTD and SML tables are mandatory. SMF and SME sections SHOULD also be included to capture full identification evidence.

7.2. Metadata Section

The metadata section provides additional information about the dataset(s) reported in the mzTab file. All fields in the metadata section are optional apart from those noted as mandatory. The fields in the metadata section MUST be reported in the order listed below. The field name and value MUST be separated by a tab character.

7.2.1. mzTab-version

Description

Version number of the mzTab format used.

Format: major.minor.patch-variant Must end with "-M" suffix for metabolomics variant.

Used to ensure compatibility and processing correctness.

Type

1. Regex

^\d{1}\.\d{1}\.\d{1}-[A-Z]{1}$


Mandatory

True

Example

MTD	mzTab-version	2.0.0-M
MTD	mzTab-version	2.1.0-M
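As an illustration of how a consumer might check this field, the following Python sketch applies the regular expression given above and splits the value into its semantic-versioning components. The function name parse_mztab_version is hypothetical. Note that the pattern, as written, only admits single-digit version components.

```python
import re

# Pattern copied from the Type definition of mzTab-version.
VERSION_PATTERN = re.compile(r"^\d{1}\.\d{1}\.\d{1}-[A-Z]{1}$")


def parse_mztab_version(value: str):
    """Return (major, minor, patch, variant) for a valid version string."""
    if not VERSION_PATTERN.match(value):
        raise ValueError(f"invalid mzTab-version: {value!r}")
    numbers, variant = value.split("-")
    major, minor, patch = (int(part) for part in numbers.split("."))
    return major, minor, patch, variant


if __name__ == "__main__":
    print(parse_mztab_version("2.1.0-M"))
```

A consumer could compare the major component against the version it supports and treat differing minor or patch components as compatible, following the versioning rules in Section 6.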

7.2.2. mzTab-ID

Description

Unique identifier for the mzTab-M document. REQUIRED. Can be:

  • Repository accession number (e.g., MTBLS214)

  • Laboratory internal identifier

  • Study-specific identifier

NOT intended as a globally unique identifier, but SHOULD have local meaning within its context.

Type

String

Mandatory

True

Example

MTD	mzTab-ID	MTBLS214
MTD	mzTab-ID	LAB001_2023

7.2.3. title

Description

Human-readable title of the experiment or study. OPTIONAL. The title SHOULD:

  • Be concise but informative

  • Reflect the main focus of the study

  • Be unique within a collection of related studies

Type

String

Mandatory

False

Example

MTD	title	Metabolomic Analysis of Human Plasma in Diabetes Type 2
MTD	title	Lipidomics Study of Brain Tissue in Alzheimer's Disease

7.2.4. description

Description

Detailed description of the experiment or study. OPTIONAL. SHOULD include:

  • Study objectives

  • Experimental design overview

  • Key methodological approaches

  • Any unique aspects of the study

Provides context for understanding the data and its significance.

Type

String

Mandatory

False

Example

MTD	description	Investigation of metabolic changes in human plasma samples from type 2 diabetes patients compared to healthy controls. Study includes both fasting and post-prandial measurements.
MTD	description	Analysis of lipid profiles in brain tissue samples examining the relationship between specific lipid species and Alzheimer's disease progression.

7.2.5. sample_processing[1-n]

Description

Parameters specifying sample processing that was applied within one step.

Type

Parameter List

Mandatory

False

Example

MTD	sample_processing[1]	[MSIO, MSIO:0000107, metabolism quenching using precooled 60 percent methanol ammonium bicarbonate buffer,]
MTD	sample_processing[2]	[MSIO, MSIO:0000146, centrifugation,]
MTD	sample_processing[3]	[MSIO, MSIO:0000141, metabolite extraction,]
MTD	sample_processing[4]	[MSIO, MSIO:0000141, silylation,]

The name, source, analyzer and detector of the instruments used in the experiment. Multiple instruments are numbered [1-n].

7.2.6. instrument[1-n]-name

Description

The instrument’s name.

Type

Parameter

Mandatory

False

7.2.7. instrument[1-n]-source

Description

The instrument’s ion source.

Type

Parameter

Mandatory

False

7.2.8. instrument[1-n]-analyzer[1-n]

Description

The instrument’s mass analyzer, as defined by the parameter.

Type

Parameter List

Mandatory

False

7.2.9. instrument[1-n]-detector

Description

The instrument’s detector, as defined by the parameter.

Type

Parameter

Mandatory

False

7.2.10. software[1-n]

Description

The software utilized.

Type

Parameter

Mandatory

False

Example

MTD	software[1]	[MS, MS:1002879, Progenesis QI, 3.0]
MTD	software[1]-setting[1]	Fragment tolerance = 0.1 Da
…
MTD	software[2]-setting[1]	Parent tolerance = 0.5 Da

7.2.11. software[1-n]-setting[1-n]

Description

A software setting used. This field MAY occur multiple times for a single software. The value of this field is deliberately set as a String, since there currently do not exist cvParams for every possible setting.

Type

String List

Mandatory

False

7.2.12. publication[1-n]

Description

The publication item ids referenced by this publication.

Type

String List

Mandatory

True

Example

MTD	publication[1]	pubmed:21063943|doi:10.1007/978-1-60761-987-1_6
MTD	publication[2]	pubmed:20615486|doi:10.1016/j.jprot.2010.06.008

The contact’s name, affiliation and e-mail. Several contacts can be given by indicating the number in the square brackets after "contact". A contact has to be supplied in the format [first name] [initials] [last name].

7.2.13. contact[1-n]-name

Description

The contact’s name.

Type

String

Mandatory

False

7.2.14. contact[1-n]-affiliation

Description

The contact’s affiliation.

Type

String

Mandatory

False

7.2.15. contact[1-n]-email

Description

The contact’s e-mail address.

Type

String

Mandatory

False

7.2.16. contact[1-n]-orcid

Description

The contact’s ORCID iD, without the https://orcid.org/ prefix.

Type

1. Regex

^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]{1}$


Mandatory

False
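A possible way to check this field is sketched below in Python. The regular expression is the one given above. The optional checksum test is an addition that is not part of this specification: it assumes the ISO 7064 MOD 11-2 check character that ORCID documents for its identifiers. The function names are hypothetical.

```python
import re

# Pattern copied from the Type definition of contact[1-n]-orcid.
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]{1}$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits.

    Assumption: this checksum comes from the ORCID documentation,
    not from the mzTab-M specification.
    """
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str, verify_checksum: bool = True) -> bool:
    """Check the format and, optionally, the check character."""
    if not ORCID_PATTERN.match(value):
        return False
    if not verify_checksum:
        return True
    digits = value.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))
```
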

7.2.17. uri[1-n]

Description

The URI pointing to the external resource.

Type

URI

Mandatory

False

Example

MTD	uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517
…
MTD	external_study_uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt

7.2.18. external_study_uri[1-n]

Description

The URI pointing to the external resource.

Type

URI

Mandatory

False

Example

MTD	uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517
…
MTD	external_study_uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt

7.2.19. quantification_method

Description

The quantification method used in the experiment reported in the file.

Type

Parameter

Mandatory

True

7.2.20. sample[1-n]

Description

The sample’s name.

Type

String

Mandatory

False

Example

COM	Experiment where all samples consisted of the same two species
MTD	sample[1]	individual number 1
MTD	sample[1]-species[1]	[NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD	sample[1]-tissue[1]	[BTO, BTO:0000759, liver, ]
MTD	sample[1]-cell_type[1]	[CL, CL:0000182, hepatocyte, ]
MTD	sample[1]-disease[1]	[DOID, DOID:684, hepatocellular carcinoma, ]
MTD	sample[1]-disease[2]	[DOID, DOID:9451, alcoholic fatty liver, ]
MTD	sample[1]-description	Hepatocellular carcinoma samples.
MTD	sample[1]-custom[1]	[,,Extraction date, 2011-12-21]
MTD	sample[1]-custom[2]	[,,Extraction reason, liver biopsy]
MTD	sample[2]	individual number 2
MTD	sample[2]-species[1]	[NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD	sample[2]-tissue[1]	[BTO, BTO:0000759, liver, ]
MTD	sample[2]-cell_type[1]	[CL, CL:0000182, hepatocyte, ]
MTD	sample[2]-description	Healthy control samples.

7.2.21. sample[1-n]-species[1-n]

Description

Biological species information on the sample.

Type

Parameter List

Mandatory

False

7.2.22. sample[1-n]-tissue[1-n]

Description

Biological tissue information on the sample.

Type

Parameter List

Mandatory

False

7.2.23. sample[1-n]-cell_type[1-n]

Description

Biological cell type information on the sample.

Type

Parameter List

Mandatory

False

7.2.24. sample[1-n]-disease[1-n]

Description

Disease information on the sample.

Type

Parameter List

Mandatory

False

7.2.25. sample[1-n]-description

Description

A free form description of the sample.

Type

String

Mandatory

False

7.2.26. sample[1-n]-custom[1-n]

Description

Additional user or cv parameters.

Type

Parameter List

Mandatory

False

Specification of ms_run.

  • location: Location of the external data file, e.g. raw files on which analysis has been performed. If the actual location of the MS run is unknown, “null” MUST be used as a placeholder value, since the [1-n] cardinality is referenced elsewhere. If pre-fractionation has been performed, then [1-n] ms_runs SHOULD be created per assay.

  • instrument_ref: If different instruments are used in different runs, instrument_ref can be used to link a specific instrument to a specific run.

  • format: Parameter specifying the data format of the external MS data file. If ms_run[1-n]-format is present, ms_run[1-n]-id_format SHOULD also be present, following the parameters specified in Table 1.

  • id_format: Parameter specifying the id format used in the external data file. If ms_run[1-n]-id_format is present, ms_run[1-n]-format SHOULD also be present.

  • fragmentation_method: The type(s) of fragmentation used in a given ms run.

  • scan_polarity: The polarity mode of a given run. Usually only one value SHOULD be given here, except for the case of mixed polarity runs.

  • hash: Hash value of the corresponding external MS data file defined in ms_run[1-n]-location. If ms_run[1-n]-hash is present, ms_run[1-n]-hash_method SHOULD also be present.

  • hash_method: A parameter specifying the hash method used to generate the String in ms_run[1-n]-hash. Specifics of the hash method used MAY follow the definitions of the mzML format. If ms_run[1-n]-hash is present, ms_run[1-n]-hash_method SHOULD also be present.

7.2.27. ms_run[1-n]-location

Description

The msRun’s location URI.

Type

URI

Mandatory

True

7.2.28. ms_run[1-n]-instrument_ref

Description

Instrument reference.

Type

Integer

Mandatory

False

7.2.29. ms_run[1-n]-format

Description

The format of the MS run file.

Type

Parameter

Mandatory

False

7.2.30. ms_run[1-n]-id_format

Description

The format of the IDs in the MS run file.

Type

Parameter

Mandatory

False

7.2.31. ms_run[1-n]-fragmentation_method[1-n]

Description

The fragmentation methods applied during this msRun.

Type

Parameter List

Mandatory

False

7.2.32. ms_run[1-n]-scan_polarity[1-n]

Description

The scan polarity/polarities used during this msRun.

Type

Parameter List

Mandatory

False

7.2.33. ms_run[1-n]-hash

Description

The file hash value of this msRun’s data file.

Type

String

Mandatory

False

7.2.34. ms_run[1-n]-hash_method

Description

The method used to calculate the hash.

Type

Parameter

Mandatory

False

7.2.35. ms_run[1-n]-parameters[1-n]

Description

Additional parameters of the MS run, separated by bars.

Type

Parameter List

Mandatory

False

Example

MTD	ms_run[1]-parameter[1]	[MS, MS:1000031, instrument model, [MS, MS:1000449, LTQ Orbitrap,]]

7.2.36. assay[1-n]

Description

The assay name.

Type

String

Mandatory

True

Example

MTD	assay[1]	first assay
MTD	assay[1]-custom[1]	[MS, , Assay operator, Fred Blogs]
MTD	assay[1]-external_uri	https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt?STUDYASSAY=a_e04_c18pos.txt
MTD	assay[1]-sample_ref	sample[1]
MTD	assay[1]-ms_run_ref	ms_run[1]

7.2.37. assay[1-n]-custom[1-n]

Description

Additional user or cv parameters.

Type

Parameter List

Mandatory

False

7.2.38. assay[1-n]-external_uri

Description

An external URI to further information about this assay.

Type

URI

Mandatory

False

7.2.39. assay[1-n]-sample_ref

Description

Sample reference.

Type

Integer

Mandatory

False

7.2.40. assay[1-n]-ms_run_ref[1-n]

Description

The ms run(s) referenced by this assay.

Type

Integer List

Mandatory

True

7.2.41. assay[1-n]-protocol_refs[1-n]

Description

The protocol(s) referenced by this assay.

Type

Integer List

Mandatory

False

Example

MTD	assay[1]-protocol_ref	protocol[1]| protocol[2]

7.2.42. assay[1-n]-parameters[1-n]

Description

Additional parameters of the assay, separated by bars.

Type

Parameter List

Mandatory

False

Example

MTD	assay[1]-parameter[1]	[MS, MS:1000031, instrument model, [MS, MS:1000449, LTQ Orbitrap,]]

7.2.43. study_variable[1-n]

Description

The study variable value. Encoded according to the datatype declared on the referenced study_variable_group: either a literal value (for xsd:* datatypes) or a Parameter (for the Parameter datatype, e.g. [NO, NO:12345, Male,] or [,,Male,]).

Type

Study Variable List

Mandatory

True

Example

MTD	study_variable[1]	control
MTD	study_variable[1]-assay_refs	assay[1]| assay[2]| assay[3]
MTD	study_variable[1]-average_function	[MS, MS:1002883, median, ]
MTD	study_variable[1]-variation_function	[MS, MS:1002885, standard error, ]
MTD	study_variable[1]-description	Group B (spike-in 0.74 fmol/uL)
MTD	study_variable[2]	1 minute 0.5mg rapamycin

7.2.44. study_variable[1-n]-assay_refs[1-n]

Description

The assays referenced by this study variable.

Type

Integer List

Mandatory

False

7.2.45. study_variable[1-n]-ms_run_refs[1-n]

Description

The ms run(s) referenced by this study variable.

Type

Integer List

Mandatory

False

Example

MTD	study_variable[1]-ms_run_ref	ms_run[1]| ms_run[2]

7.2.46. study_variable[1-n]-description

Description

A free-form description of this study variable.

Type

String

Mandatory

False

7.2.47. study_variable[1-n]-group_refs[1-n]

Description

The study variable group this study variable belongs to.

Type

Integer List

Mandatory

False

Example

MTD	study_variable[1]-group_ref	study_variable_group[1]| study_variable_group[2]

7.2.48. study_variable[1-n]-average_function

Description

The function used to calculate the study variable quantification value, to be reported if the operation used is not the arithmetic mean (default), e.g. geometric mean or median.

Type

Parameter

Mandatory

False

7.2.49. study_variable[1-n]-variation_function

Description

The function used to calculate the study variable quantification variation value, to be reported if a variation value is given and the operation used is not the coefficient of variation (default), e.g. standard error.

Type

Parameter

Mandatory

False

7.2.50. study_variable_group[1-n]

Description

The study variable group name.

Type

Parameter

Mandatory

True

Example

MTD	study_variable_group[1]	[PATO, PATO:0000383, sex, ]
MTD	study_variable_group[1]-description	Sex of the individual
MTD	study_variable_group[1]-type	[STATO, STATO:0000252, categorical variable, ]
MTD	study_variable_group[1]-datatype	xsd:string
MTD	study_variable_group[2]	[PATO, PATO:0000384, timepoint, ]
MTD	study_variable_group[2]-description	Time after treatment
MTD	study_variable_group[2]-type	[STATO, STATO:0000228, ordinal variable, ]
MTD	study_variable_group[2]-datatype	xsd:integer
MTD	study_variable_group[2]-unit	[UO, UO:0000033, day, ]
MTD	study_variable[1]	Female_0day
MTD	study_variable[1]-group_ref	study_variable_group[1]
MTD	study_variable[1]-assay_refs	assay[1]|assay[2]|assay[3]
MTD	study_variable[2]	Female_1day
MTD	study_variable[2]-group_ref	study_variable_group[2]
MTD	study_variable[2]-assay_refs	assay[4]|assay[5]|assay[6]

7.2.51. study_variable_group[1-n]-description

Description

Description of the study variable group.

Type

String

Mandatory

False

7.2.52. study_variable_group[1-n]-type

Description

The study variable group type, as defined by the parameter.

Type

Parameter

Mandatory

False

7.2.53. study_variable_group[1-n]-datatype

Description

The datatype of the group variable, which determines how its values can be encoded and parsed in mzTab-M files, and how they could be handled in programming languages.

Producers of mzTab-M 2.1.0 SHOULD provide a datatype for each study_variable_group to simplify interpretation by downstream consumers of the format. The field is not mandatory, but its presence removes ambiguity about how the associated values are encoded and should be parsed.

The following datatypes are supported:

  • xsd:string – Character string.

  • xsd:integer – Arbitrary‑size integer.

  • xsd:decimal – Arbitrary‑precision decimal number.

  • xsd:boolean – Boolean value (true / false).

  • xsd:date – Calendar date, encoded in ISO 8601 format (YYYY-MM-DD).

  • xsd:time – Time of day, encoded in ISO 8601 format (hh:mm:ss, optional fractional seconds and timezone).

  • xsd:dateTime – Combined date and time, encoded in ISO 8601 format (YYYY-MM-DDThh:mm:ss, with optional fractional seconds and timezone, e.g. YYYY-MM-DDThh:mm:ssZ).

  • xsd:anyURI – A Uniform Resource Identifier reference.

  • Parameter – The values of the linked study_variables are reported as a Parameter (user-defined or CV Parameter), using the standard mzTab-M Parameter syntax [CV, accession, name, value].

Writers MUST ensure that the values of the study_variable entries belonging to the same study_variable_group all have the same type (e.g. string, number, …​) and use the same convention of reporting the value directly or as a Parameter, consistent with the datatype declared on the group. If the study_variable_group defines the datatype as Parameter, the CVParam qualifying the study_variable_group itself can be of a different CV origin than the CVParams used in the linked study_variable values.

Tools and parsers implementing mzTab-M SHOULD apply the appropriate data type interpretation when reading study_variable_group data and constructing analysis data structures (e.g., data frames, matrices, or tables).

Type

Parameter

Mandatory

False

Example

MTD	study_variable_group[1]-datatype	xsd:string
MTD	study_variable_group[2]-datatype	xsd:decimal
MTD	study_variable_group[3]-datatype	xsd:date
MTD	study_variable_group[4]-datatype	Parameter

COM	plain string value:
MTD	study_variable[1]	Male
MTD	study_variable[1]-group_ref	study_variable_group[1]

COM	user-defined Parameter value:
MTD	study_variable[2]	[,,Male,]
MTD	study_variable[2]-group_ref	study_variable_group[4]

COM	CV Parameter value:
MTD	study_variable[3]	[NCIT, NCIT:C20197, Male, ]
MTD	study_variable[3]-group_ref	study_variable_group[4]
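The datatype-driven interpretation recommended above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names CONVERTERS and convert_study_variable_value are hypothetical, the mapping of xsd datatypes to Python types is one possible choice, and values of the Parameter datatype are returned unparsed.

```python
from datetime import date, datetime, time
from decimal import Decimal


def _utc_suffix(text: str) -> str:
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return text[:-1] + "+00:00" if text.endswith("Z") else text


def _parse_boolean(text: str) -> bool:
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean value: {text!r}")


# Assumed mapping from the supported datatypes to Python converters.
CONVERTERS = {
    "xsd:string": str,
    "xsd:integer": int,
    "xsd:decimal": Decimal,
    "xsd:boolean": _parse_boolean,
    "xsd:date": date.fromisoformat,
    "xsd:time": lambda text: time.fromisoformat(_utc_suffix(text)),
    "xsd:dateTime": lambda text: datetime.fromisoformat(_utc_suffix(text)),
    "xsd:anyURI": str,
}


def convert_study_variable_value(datatype: str, raw: str):
    """Convert a raw study_variable value according to its group datatype."""
    if datatype == "Parameter":
        return raw  # left for a dedicated Param parser
    if datatype not in CONVERTERS:
        raise ValueError(f"unsupported datatype: {datatype!r}")
    return CONVERTERS[datatype](raw)


if __name__ == "__main__":
    print(convert_study_variable_value("xsd:decimal", "0.74"))
    print(convert_study_variable_value("xsd:dateTime", "2025-03-14T09:45:30Z"))
```

Using Decimal for xsd:decimal keeps arbitrary precision; a consumer that builds numeric data frames may prefer float instead.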

7.2.54. study_variable_group[1-n]-unit

Description

The study variable group unit, as defined by the parameter.

Type

Parameter

Mandatory

False

A protocol describing one or more steps of an experimental procedure, such as sample preparation, data acquisition or data processing. Protocols are referenced from Assay objects. Added in mzTab-M 2.1.

7.2.55. protocol[1-n]-name

Description

The protocol name.

Type

String

Mandatory

True

7.2.56. protocol[1-n]-type

Description

The protocol type, as defined by the parameter.

Type

Parameter

Mandatory

True

7.2.57. protocol[1-n]-description

Description

Description of the protocol.

Type

String

Mandatory

False

7.2.58. protocol[1-n]-parameters[1-n]

Description

The protocol parameters.

Type

Parameter List

Mandatory

False

7.2.59. custom[1-n]

Description

Any additional parameters describing the analysis reported.

Type

Parameter List

Mandatory

False

Example

MTD	custom[1]	[MS, MS:1000001, custom param, value]

Specification of controlled vocabularies.

  • label: A string describing the labels of the controlled vocabularies/ontologies used in the mzTab file as a short-hand, e.g. "MS" for PSI-MS.

  • full_name: A string describing the full names of the controlled vocabularies/ontologies used in the mzTab file.

  • version: A string describing the version of the controlled vocabularies/ontologies used in the mzTab file.

  • uri: A string containing the URIs of the controlled vocabularies/ontologies used in the mzTab file.

7.2.60. cv[1-n]-label

Description

The abbreviated CV label.

Type

String

Mandatory

True

7.2.61. cv[1-n]-full_name

Description

The full name of this CV, for humans.

Type

String

Mandatory

True

7.2.62. cv[1-n]-version

Description

The CV version used when the file was generated.

Type

String

Mandatory

True

7.2.63. cv[1-n]-uri

Description

A URI to the CV definition.

Type

URI

Mandatory

True

7.2.64. database[1-n]

Description

The database name.

Type

Database List

Mandatory

True

Example

MTD	database[1]	[MIRIAM, MIR:00100079, HMDB, ]
MTD	database[1]-prefix	hmdb
MTD	database[1]-version	3.6
MTD	database[1]-uri	https://www.hmdb.ca
MTD	database[2]	[,, "de novo", ]
MTD	database[2]-prefix	dn
MTD	database[2]-version	Unknown
MTD	database[2]-uri	null
MTD	database[3]	[,, "no database", null ]
MTD	database[3]-prefix	null
MTD	database[3]-version	Unknown
MTD	database[3]-uri	null

7.2.65. database[1-n]-prefix

Description

The prefix used in the “identifier” column of data tables. For the 'no database' case 'null' must be used.

Type

String

Mandatory

True

7.2.66. database[1-n]-version

Description

The database version is mandatory where identification has been performed. This may be a formal version number e.g. “1.4.1”, a date of access “2016-10-27” (ISO-8601 format) or “Unknown” if there is no suitable version that can be annotated.

Type

String

Mandatory

True

7.2.67. database[1-n]-uri

Description

The URI to the database. For the “no database” case, 'null' must be reported.

Type

String

Mandatory

True

7.2.68. derivatization_agent[1-n]

Description

A description of derivatization agents applied to small molecules, using userParams or CV terms where possible.

Type

Parameter List

Mandatory

False

Example

MTD	derivatization_agent[1]	[XLMOD, XLMOD:07014, N-methyl-N-t-butyldimethylsilyltrifluoroacetamide, ]

7.2.69. small_molecule-quantification_unit

Description

Defines what type of units are reported in the small molecule summary quantification / abundance fields.

Type

Parameter

Mandatory

True

Example

MTD	small_molecule-quantification_unit	[MS, MS:1001113, peak area, ]

7.2.70. small_molecule_feature-quantification_unit

Description

Defines what type of units are reported in the small molecule feature quantification / abundance fields.

Type

Parameter

Mandatory

False

Example

MTD	small_molecule_feature-quantification_unit	[MS, MS:1001113, peak area, ]

7.2.71. small_molecule-identification_reliability

Description

The system used for giving reliability / confidence codes to small molecule identifications MUST be specified if not using the default codes.

Type

Parameter

Mandatory

False

Example

MTD	small_molecule-identification_reliability	[MS, MS:1000932, identification reliability, ]

7.2.72. id_confidence_measure[1-n]

Description

Small molecule identification confidence metrics.

Scoring system:

  • Use CV parameters numbered [1-n]

  • Define score direction (high-to-low or low-to-high)

  • Order by importance for identification ranking

Scores determine confidence in molecular identifications.

Type

Parameter List

Mandatory

True

Example

MTD	id_confidence_measure[1]	[MS,MS:1002890,fragmentation score,]
MTD	id_confidence_measure[2]	[MS,MS:1002891,retention time score,]

7.2.73. colunit-small_molecule

Description

Unit definitions for small molecule data columns.

Format:

  • Pattern: {column_name}={unit_parameter}

  • Use CV parameters for units when possible

Important notes:

  • Not for quantification columns

  • Use small_molecule-quantification_unit for quantification values

Type

Column Parameter Mapping List

Mandatory

False

Example

MTD	colunit-small_molecule	retention_time=[UO,UO:0000031,minute,]
MTD	colunit-small_molecule	mass=[UO,UO:0000221,dalton,]

7.2.74. colunit-small_molecule_feature

Description

Defines the unit used for a column in the small molecule feature section. The format of the value has to be {column name}={Parameter defining the unit}. This field MUST NOT be used to define a unit for quantification columns. The unit used for small molecule feature quantification values MUST be set in small_molecule_feature-quantification_unit.

Type

Column Parameter Mapping List

Mandatory

False

Example

MTD	colunit-small_molecule_feature	retention_time=[UO, UO:0000031, minute, ]

7.2.75. colunit-small_molecule_evidence

Description

Defines the unit used for a column in the small molecule evidence section. The format of the value has to be {column name}={Parameter defining the unit}.

Type

Column Parameter Mapping List

Mandatory

False

Example

MTD	colunit-small_molecule_evidence	retention_time=[UO, UO:0000031, minute, ]

7.3. Small Molecule (SML) Section

The small molecule section is table-based. It MUST always come after the metadata section. Each row reports one final quantified molecule result. All columns are MANDATORY except for "opt_" columns.

The order of columns MUST follow the order specified below. All table columns MUST be Tab separated. There MUST NOT be any empty cells. Missing values MUST be reported using "null".
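The table rules above (one header line, tab-separated cells, no empty cells, "null" for missing values) can be illustrated with the following Python sketch. The function name read_small_molecule_rows is hypothetical, all values are kept as strings, and the column order is not checked.

```python
def read_small_molecule_rows(lines):
    """Pair each SML row with the SMH column labels.

    Cells equal to "null" are returned as None. Empty cells and a
    missing or repeated header are reported as errors.
    """
    header = None
    rows = []
    for number, raw in enumerate(lines, start=1):
        cells = raw.rstrip("\r\n").split("\t")
        if cells[0] == "SMH":
            if header is not None:
                raise ValueError(f"line {number}: SMH header occurs twice")
            header = cells[1:]
        elif cells[0] == "SML":
            if header is None:
                raise ValueError(f"line {number}: SML row before SMH header")
            values = cells[1:]
            if len(values) != len(header):
                raise ValueError(f"line {number}: expected {len(header)} cells")
            if any(value == "" for value in values):
                raise ValueError(f"line {number}: empty cell, use null instead")
            rows.append(
                {label: (None if value == "null" else value)
                 for label, value in zip(header, values)}
            )
    return rows


if __name__ == "__main__":
    example = [
        "SMH\tSML_ID\tchemical_formula\n",
        "SML\t1\tC17H20N4O2\n",
        "SML\t2\tnull\n",
    ]
    print(read_small_molecule_rows(example))
```
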

7.3.1. SML_ID

Description

A within file unique identifier for the small molecule summary.

Type

Integer

Mandatory

True

Is Nullable:

FALSE

Example

SMH	...	SML_ID	...
SML	...	1	...

7.3.2. SMF_ID_REFS

Description

References to the small molecule features (SMF elements) via referencing SMF_ID values. Multiple values MAY be provided as a | separated list to indicate which features were used to aggregate the SML row.

Type

Integer List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	SMF_ID_REFS	...
SML	...	2|3|11	...

7.3.3. database_identifier

Description

A list of | separated possible identifiers for the small molecule; multiple values MUST only be provided to indicate ambiguity in the identification of the molecule and not to demonstrate different identifier types for the same molecule. Alternative identifiers for the same molecule MAY be provided as optional columns. The database identifier must be preceded by the resource description (prefix) followed by a colon, as specified in the metadata section. A null value MAY be provided if the identification is sufficiently ambiguous as to be meaningless for reporting or the small molecule has not been identified.

Type

String List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	database_identifier	...
SML	...	CID:00027395|HMDB:HMDB0001847	...
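A reader of this column might split each entry into its prefix and local identifier and check the prefix against the values declared in database[1-n]-prefix. The Python sketch below illustrates this; the function name split_database_identifiers is hypothetical, and the set of known prefixes passed in the example is an assumption made for illustration.

```python
def split_database_identifiers(cell, known_prefixes):
    """Split a database_identifier cell into (prefix, identifier) pairs.

    A "null" cell yields an empty list. Prefixes are compared
    case-sensitively against the prefixes declared in the metadata.
    """
    if cell == "null":
        return []
    pairs = []
    for entry in cell.split("|"):
        prefix, separator, identifier = entry.partition(":")
        if not separator or not identifier:
            raise ValueError(f"missing prefix or identifier: {entry!r}")
        if prefix not in known_prefixes:
            raise ValueError(f"undeclared database prefix: {prefix!r}")
        pairs.append((prefix, identifier))
    return pairs


if __name__ == "__main__":
    # Assumed prefixes, as if collected from database[1-n]-prefix.
    declared = {"CID", "HMDB"}
    print(split_database_identifiers("CID:00027395|HMDB:HMDB0001847", declared))
```

Splitting at the first colon only keeps identifiers that themselves contain a colon intact.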

7.3.4. chemical_formula

Description

The chemical formula of the identified compound, e.g. as given in a database, assumed to match the theoretical mass to charge (in some cases this will be the derivatized form, including adducts and protons). This should be specified in Hill notation (EA Hill 1900), i.e. elements in the order C, H and then alphabetically all other elements. Counts of one may be omitted. Elements should be capitalized properly to avoid confusion (e.g., “CO” vs. “Co”). The chemical formula reported should refer to the neutral form. Charge state is reported by the charge field in the SME and SMF sections. For example, N-acetylglucosamine would be encoded by the string “C8H15NO6”.

Type

String List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	chemical_formula	...
SML	...	C17H20N4O2	...
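The Hill ordering described above can be produced from element counts with a short Python sketch. The function name hill_formula is hypothetical. The handling of carbon-free formulas (all elements alphabetical, including hydrogen) follows the usual Hill convention and is an assumption here, since the text above only describes the carbon-containing case.

```python
def hill_formula(counts):
    """Format a mapping of element symbol to count in Hill notation.

    With carbon present: C first, then H, then the rest alphabetically.
    Without carbon (assumed convention): all elements alphabetically.
    Counts of one are omitted.
    """
    if any(count < 1 for count in counts.values()):
        raise ValueError("element counts must be positive")
    if "C" in counts:
        leading = [symbol for symbol in ("C", "H") if symbol in counts]
        rest = sorted(symbol for symbol in counts if symbol not in ("C", "H"))
        order = leading + rest
    else:
        order = sorted(counts)
    return "".join(
        symbol + (str(counts[symbol]) if counts[symbol] != 1 else "")
        for symbol in order
    )


if __name__ == "__main__":
    # N-acetylglucosamine, as in the description above.
    print(hill_formula({"O": 6, "N": 1, "H": 15, "C": 8}))
```
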

7.3.5. smiles

Description

The structure of the potential small molecule in the simplified molecular-input line-entry system (SMILES).

Type

String List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	smiles	...
SML	...	C1=CC=C(C=C1)CCNC(=O)CCNNC(=O)C2=CC=NC=C2	...

7.3.6. inchi

Description

A standard IUPAC International Chemical Identifier (InChI) for the given substance.

Type

String List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	inchi	...
SML	...	InChI=1S/C17H20N4O2/c22-16(19-12-6-14-4-2-1-3-5-14)9-13-20-21-17(23)15-7-10-18-11-8-15/h1-5,7-8,10-11,20H,6,9,12-13H2,(H,19,22)(H,21,23)	...

7.3.7. chemical_name

Description

The small molecule’s chemical/common name, or general description if a chemical name is unavailable.

Type

String List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	chemical_name	...
SML	...	N-(2-phenylethyl)-3-[2-(pyridine-4-carbonyl)hydrazinyl]propanamide	...

7.3.8. uri

Description

A URI pointing to the small molecule’s entry in a database (e.g., the small molecule’s HMDB, Chebi or KEGG entry).

Type

String List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	uri	...
SML	...	http://www.genome.jp/dbget-bin/www_bget?cpd:C00031	...
SML	...	http://www.hmdb.ca/metabolites/HMDB0001847	...

7.3.9. theoretical_neutral_mass

Description

The theoretical neutral mass of the small molecule. This should be calculated from the chemical formula.

Type

Double List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	theoretical_neutral_mass	...
SML	...	1234.5	...

7.3.10. adduct_ions

Description

A | separated list of the detected adduct ion forms for this small molecule. The terms should follow the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-.

Type

Regex List

^\[\d*M([+-][\w\d]+)*\]\d*[+-]$

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	adduct_ions	...
SML	...	[M+H]1+|[M+Na]1+	...
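
A cell in this column could be checked against the regular expression above as follows. This is a minimal sketch; the function name `invalid_adducts` is hypothetical, and treating the literal string "null" as an empty list is an assumption based on the column being nullable.

```python
import re

# The pattern given for adduct_ions, applied to each '|'-separated entry.
ADDUCT_RE = re.compile(r"^\[\d*M([+-][\w\d]+)*\]\d*[+-]$")


def invalid_adducts(cell: str) -> list[str]:
    """Return the entries of an adduct_ions cell that do not match the pattern."""
    if cell == "null":
        return []
    return [entry for entry in cell.split("|") if not ADDUCT_RE.match(entry)]


print(invalid_adducts("[M+H]1+|[M+Na]1+"))  # []
```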

7.3.11. reliability

Description

The reliability of the given small molecule identification. This must be supplied by the resource and should be reported as an integer between 1-4:

1: identified, rigorous. …​ 2: identified. …​ 3: putatively characterized class. …​ 4: unknown. …​

Type

String

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	reliability	...
SML	...	3	...
SML	...	0	...

7.3.12. best_id_confidence_measure

Description

The small molecule confidence measure/score of the best identification for this small molecule summary. The type of the value is defined by the best_id_confidence_measure CV parameter. The value is reported in the best_id_confidence_value column.

Type

Parameter

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	best_id_confidence_measure	...
SML	...	[MS, MS:1001477, SpectraST,,]	...

7.3.13. best_id_confidence_value

Description

The small molecule confidence measure/score value of the best identification for this small molecule summary.

Type

Double

Mandatory

True

Is Nullable:

FALSE

Example

SMH	...	best_id_confidence_value	...
SML	...	0.85	...

7.3.14. abundance_assay

Description

The small molecule’s abundance in every assay described in the metadata section MUST be reported. Null or zero values may be reported as appropriate.

Type

Double List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	abundance_assay	...
SML	...	12340	...

7.3.15. abundance_study_variable

Description

The small molecule’s abundance in every study variable described in the metadata section. Null or zero values may be reported as appropriate.

Type

Double List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	abundance_study_variable	...
SML	...	1230	...

7.3.16. abundance_variation_study_variable

Description

The small molecule’s abundance variation in every study variable described in the metadata section. Null or zero values may be reported as appropriate.

Type

Double List

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	abundance_variation_study_variable	...
SML	...	0.2	...

7.3.17. opt_{identifier}_*

Description

Additional columns can be added to the end of the small molecule table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: 'A'-'Z', 'a'-'z', '0'-'9', '_', '-', '[', ']', and ':'. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by '_'.

Type

Optional Column

Mandatory

False

Is Nullable:

TRUE

Example

SMH	...	opt_global_cv_value	...
SML	...	opt_global_cv_MS:1002217_decoy_peptide=null	...
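
The naming rule for CV-based optional columns can be illustrated with a small builder. This is a sketch under assumptions: the helper name `opt_cv_column` is hypothetical, and the allowed character set is taken to include the underscore, as the "opt_" prefix itself requires.

```python
import re

# Characters permitted in optional column names: letters, digits, '_', '-', '[', ']', ':'.
ALLOWED = re.compile(r"^[A-Za-z0-9_\-\[\]:]+$")


def opt_cv_column(identifier: str, accession: str, parameter_name: str) -> str:
    """Build an opt_{identifier}_cv_{accession}_{parameter name} header."""
    # Spaces in the parameter name are replaced by underscores.
    name = f"opt_{identifier}_cv_{accession}_{parameter_name.replace(' ', '_')}"
    if not ALLOWED.match(name):
        raise ValueError(f"illegal character in column name: {name!r}")
    return name


print(opt_cv_column("global", "MS:1002217", "decoy peptide"))
# opt_global_cv_MS:1002217_decoy_peptide
```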

7.4. Small Molecule Feature (SMF) Section

The small molecule feature section is table-based, representing individual MS regions (generally the elution profile for all isotopomers from a single charge state). It MUST always come after the Small Molecule Section. All columns are MANDATORY except for "opt_" columns.

The order of columns MUST follow the order specified below. All table columns MUST be Tab separated. There MUST NOT be any empty cells. Missing values MUST be reported using "null".
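
A reader for such rows might enforce the tab separation, the ban on empty cells, and the "null" convention as sketched below. The function name `parse_row` and the choice to return a dictionary keyed by column header are illustrative assumptions, not requirements of the format.

```python
def parse_row(header_line: str, data_line: str) -> dict:
    """Pair a header line with a data line, mapping the literal "null" to None."""
    header = header_line.rstrip("\n").split("\t")
    cells = data_line.rstrip("\n").split("\t")
    if len(header) != len(cells):
        raise ValueError("header and row have different column counts")
    if any(cell == "" for cell in cells):
        raise ValueError("empty cells are not allowed; use 'null'")
    # Drop the line prefix (e.g. SFH/SMF) and keep the named columns.
    return {h: (None if c == "null" else c) for h, c in zip(header[1:], cells[1:])}


row = parse_row(
    "SFH\tSMF_ID\tcharge\tretention_time_in_seconds",
    "SMF\t1\t1\tnull",
)
print(row)
```

Values are kept as strings here; converting them to the column's declared type would be a separate step.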

7.4.1. SMF_ID

Description

A within file unique identifier for the small molecule feature.

Type

Integer

Mandatory

True

Is Nullable:

FALSE

Example

SFH	...	SMF_ID	...
SMF	...	1	...

7.4.2. SME_ID_REFS

Description

References to the identification evidence (SME elements) via referencing SME_ID values. Multiple values MAY be provided as a | separated list to indicate ambiguity in the identification or to indicate that different types of data supported the identification (see SME_ID_REF_ambiguity_code). For the case of a consensus approach where multiple adduct forms are used to infer the SML ID, different features should just reference the same SME_ID value(s).

Type

Integer List

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	SME_ID_REFS	...
SMF	...	5|6|12	...

7.4.3. SME_ID_REF_ambiguity_code

Description

If multiple values are given under SME_ID_REFS, one of the following codes MUST be provided. 1=Ambiguous identification; 2=Only different evidence streams for the same molecule with no ambiguity; 3=Both ambiguous identification and multiple evidence streams. If there is no value or only one value under SME_ID_REFS, this MUST be reported as null.

Type

Integer

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	SME_ID_REF_ambiguity_code	...
SMF	...	1	...
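
The dependency between SME_ID_REFS and SME_ID_REF_ambiguity_code can be expressed as a simple consistency check. This is a sketch operating on raw cell strings; the function name `ambiguity_code_ok` is hypothetical.

```python
def ambiguity_code_ok(sme_id_refs: str, ambiguity_code: str) -> bool:
    """Check an SMF row: a code of 1-3 is required only for multiple SME references."""
    refs = [] if sme_id_refs == "null" else sme_id_refs.split("|")
    if len(refs) > 1:
        return ambiguity_code in {"1", "2", "3"}
    # Zero or one reference: the code must be reported as null.
    return ambiguity_code == "null"


print(ambiguity_code_ok("5|6|12", "1"))  # True
print(ambiguity_code_ok("5", "1"))       # False
```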

7.4.4. adduct_ion

Description

The assumed classification of this molecule’s adduct ion after detection, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-.

Type

String

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	adduct_ion	...
SMF	...	[M+H]1+	...
SMF	...	[M+2Na]2+	...

7.4.5. isotopomer

Description

If de-isotoping has not been performed, then the isotopomer quantified MUST be reported here e.g. “+1”, “+2”, “13C peak” using CV terms, otherwise (i.e. for approaches where SMF rows are de-isotoped features) this MUST be null.

Type

Parameter

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	isotopomer	...
SMF	...	[MS,MS:1002957,"isotopomer MS peak","13C peak"]	...

7.4.6. exp_mass_to_charge

Description

The experimental mass/charge value for the feature, by default assumed to be the mean across assays or a representative value. For approaches that report isotopomers as SMF rows, then the m/z of the isotopomer MUST be reported here.

Type

Double

Mandatory

True

Is Nullable:

FALSE

Example

SFH	...	exp_mass_to_charge	...
SMF	...	1234.5	...

7.4.7. charge

Description

The feature’s charge value using positive integers both for positive and negative polarity modes.

Type

Integer

Mandatory

False

Is Nullable:

FALSE

Example

SFH	...	charge	...
SMF	...	1	...

7.4.8. retention_time_in_seconds

Description

The apex of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time values for individual MS runs (i.e. before alignment) MAY be reported as optional columns. Retention time SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown. Relative retention time or retention time index values MAY be reported as optional columns, and could be considered for inclusion in future versions of mzTab as appropriate.

Type

Double

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	retention_time_in_seconds	...
SMF	...	1345.7	...

7.4.9. retention_time_in_seconds_start

Description

The start time of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time start and end SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown and MAY be reported in optional columns.

Type

Double

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	retention_time_in_seconds_start	...
SMF	...	1327	...

7.4.10. retention_time_in_seconds_end

Description

The end time of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time start and end SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown and MAY be reported in optional columns.

Type

Double

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	retention_time_in_seconds_end	...
SMF	...	1327.8	...

7.4.11. abundance_assay

Description

The feature’s abundance in every assay described in the metadata section MUST be reported. Null or zero values may be reported as appropriate.

Type

Double List

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	abundance_assay	...
SMF	...	38648	...

7.4.12. opt_{identifier}_*

Description

Additional columns can be added to the end of the small molecule feature table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: 'A'-'Z', 'a'-'z', '0'-'9', '_', '-', '[', ']', and ':'. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by '_'.

Type

Optional Column

Mandatory

False

Is Nullable:

TRUE

Example

SFH	...	opt_global_cv_value	...
SMF	...	opt_assay[1]_my_value=My value	...
SMF	...	opt_global_another_value=some other value	...

7.5. Small Molecule Evidence (SME) Section

The small molecule evidence section is table-based, representing identification evidence for small molecules or features (e.g., database search results). It MUST always come after the Small Molecule Feature Section. All columns are MANDATORY except for "opt_" columns.

The order of columns MUST follow the order specified below. All table columns MUST be Tab separated. There MUST NOT be any empty cells. Missing values MUST be reported using "null".

7.5.1. SME_ID

Description

A within file unique identifier for the small molecule evidence result.

Type

Integer

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	SME_ID	...
SME	...	1	...

7.5.2. evidence_input_id

Description

A within file unique identifier for the input data used to support this identification e.g. fragment spectrum, RT and m/z pair, isotope profile that was used for the identification process, to serve as a grouping mechanism, whereby multiple rows of results from the same input data share the same ID. The identifiers may be human readable but should not be assumed to be interpretable. For example, if fragmentation spectra have been searched then the ID may be the spectrum reference, or for accurate mass search, the ms_run[2]:458.75.

Type

String

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	evidence_input_id	...
SME	...	ms_run[1]:mass=278.65;rt=376.5	...
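
The grouping role of evidence_input_id can be illustrated as follows: rows that share the same value are collected together. The row dictionaries and the function name `group_by_input` are hypothetical and only show the idea.

```python
from collections import defaultdict


def group_by_input(rows: list[dict]) -> dict[str, list[int]]:
    """Map each evidence_input_id to the SME_IDs that were derived from it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["evidence_input_id"]].append(row["SME_ID"])
    return dict(groups)


# Illustrative rows: two candidate identifications from the same input, one from another.
rows = [
    {"SME_ID": 1, "evidence_input_id": "ms_run[1]:mass=278.65;rt=376.5"},
    {"SME_ID": 2, "evidence_input_id": "ms_run[1]:mass=278.65;rt=376.5"},
    {"SME_ID": 3, "evidence_input_id": "ms_run[2]:458.75"},
]
print(group_by_input(rows))
```

The identifiers are treated as opaque strings, in line with the note that they should not be assumed to be interpretable.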

7.5.3. database_identifier

Description

The putative identification for the small molecule sourced from an external database, using the same prefix specified in database[1-n]-prefix. This could include additionally a chemical class or an identifier to a spectral library entity, even if its actual identity is unknown. For the “no database” case, 'null' must be used. The unprefixed use of 'null' is prohibited for any other case. If no putative identification can be reported for a particular database, it MUST be reported as the database prefix followed by null.

Type

String

Mandatory

True

Is Nullable:

TRUE

Example

SEH	...	database_identifier	...
SME	...	CID:00027395	...
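
The three cases described above (a prefixed identifier, a prefixed null, and the unprefixed null for "no database") could be distinguished as sketched here. The sketch assumes the prefix is separated from the identifier by the first colon, as in the example CID:00027395; the function name `split_database_identifier` is hypothetical.

```python
def split_database_identifier(value: str) -> tuple:
    """Return (prefix, identifier); a part is None when it is reported as null."""
    if value == "null":
        # Unprefixed null: the "no database" case.
        return (None, None)
    prefix, sep, identifier = value.partition(":")
    if not sep:
        raise ValueError(f"missing database prefix: {value!r}")
    # Prefixed null: the database was searched but gave no putative identification.
    return (prefix, None if identifier == "null" else identifier)


print(split_database_identifier("CID:00027395"))  # ('CID', '00027395')
print(split_database_identifier("CID:null"))      # ('CID', None)
```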

7.5.4. chemical_formula

Description

The chemical formula of the identified compound, e.g. as given in a database, assumed to match the theoretical mass to charge (in some cases this will be the derivatized form, including adducts and protons). This should be specified in Hill notation (EA Hill 1900), i.e. elements in the order C, H and then alphabetically all other elements. Counts of one may be omitted. Elements should be capitalized properly to avoid confusion (e.g., “CO” vs. “Co”). The chemical formula reported should refer to the neutral form. Charge state is reported by the charge field. Example: N-acetylglucosamine would be encoded by the string “C8H15NO6”.

Type

String

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	chemical_formula	...
SME	...	C17H20N4O2	...

7.5.5. smiles

Description

The potential molecule’s structure in the simplified molecular-input line-entry system (SMILES) for the small molecule.

Type

String

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	smiles	...
SME	...	C1=CC=C(C=C1)CCNC(=O)CCNNC(=O)C2=CC=NC=C2	...

7.5.6. inchi

Description

A standard IUPAC International Chemical Identifier (InChI) for the given substance.

Type

String

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	inchi	...
SME	...	InChI=1S/C17H20N4O2/c22-16(19-12-6-14-4-2-1-3-5-14)9-13-20-21-17(23)15-7-10-18-11-8-15/h1-5,7-8,10-11,20H,6,9,12-13H2,(H,19,22)(H,21,23)	...

7.5.7. chemical_name

Description

The small molecule’s chemical/common name, or general description if a chemical name is unavailable.

Type

String

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	chemical_name	...
SME	...	N-(2-phenylethyl)-3-[2-(pyridine-4-carbonyl)hydrazinyl]propanamide	...

7.5.8. uri

Description

A URI pointing to the small molecule’s entry in a database (e.g., the small molecule’s HMDB, Chebi or KEGG entry).

Type

URI

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	uri	...
SME	...	http://www.hmdb.ca/metabolites/HMDB00054	...

7.5.9. derivatized_form

Description

The derivatized form of the small molecule, if the identification was based on a specific derivative (e.g. 2 TMS). This MUST be specified using CV terms (where possible) otherwise “null”.

Type

Parameter

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	derivatized_form	...
SME	...	[CHEBI, CHEBI:51088, trimethylsilyl group, 3]	...

7.5.10. adduct_ion

Description

The assumed classification of this molecule’s adduct ion after detection, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-. If the adduct classification is ambiguous with regards to identification evidence it MAY be null.

Type

Regex

^\[\d*M([+-][\w\d]+)*\]\d*[+-]$


Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	adduct_ion	...
SME	...	[M+H]+	...

7.5.11. exp_mass_to_charge

Description

The experimental mass/charge value for the precursor ion. If multiple adduct forms have been combined into a single identification event/search, then a single value e.g. for the protonated form SHOULD be reported here.

Type

Double

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	exp_mass_to_charge	...
SME	...	1234.5	...

7.5.12. charge

Description

The small molecule evidence’s charge value using positive integers both for positive and negative polarity modes.

Type

Integer

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	charge	...
SME	...	1	...

7.5.13. theoretical_mass_to_charge

Description

The theoretical mass/charge value for the small molecule or the database mass/charge value (for a spectral library match).

Type

Double

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	theoretical_mass_to_charge	...
SME	...	1234.71	...
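
The specification does not define a mass error column, but exp_mass_to_charge and theoretical_mass_to_charge are commonly compared as a relative error in parts per million. The sketch below shows that calculation; the function name `ppm_error` is hypothetical.

```python
def ppm_error(exp_mz: float, theoretical_mz: float) -> float:
    """Relative m/z deviation in parts per million (negative when measured too low)."""
    return (exp_mz - theoretical_mz) / theoretical_mz * 1e6


# Using the example values from the two columns above.
print(round(ppm_error(1234.5, 1234.71), 2))
```

Such a value could be reported in an optional column if a producer wishes to include it.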

7.5.14. spectra_ref

Description

Reference to a spectrum in a spectrum file, for example where a fragmentation spectrum has been used to support the identification. If a separate spectrum file has been used for the fragmentation spectrum, this MUST be reported in the metadata section as additional ms_runs. The reference must be in the format ms_run[1-n]:{SPECTRA_REF} where SPECTRA_REF MUST follow the format defined in 5.2 (including references to chromatograms where these are used to inform identification). Multiple spectra MUST be referenced using a | delimited list for the (rare) cases in which search engines have combined or aggregated multiple spectra in advance of the search to make identifications. If a fragmentation spectrum has not been used, the value should indicate the ms_run to which the identification is mapped e.g. “ms_run[1]”.

Type

String List

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	spectra_ref	...
SME	...	ms_run[1]:index=5|ms_run[2]:index=3	...
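
A spectra_ref cell could be split into its ms_run index and spectrum reference as sketched below. The function name `parse_spectra_ref` is hypothetical, and the spectrum reference part is kept as an opaque string rather than validated against the formats of Section 5.2.

```python
import re

# ms_run[n], optionally followed by ':' and a spectrum reference.
SPECTRA_REF_RE = re.compile(r"^ms_run\[(\d+)\](?::(.+))?$")


def parse_spectra_ref(cell: str) -> list[tuple]:
    """Return (ms_run index, spectrum reference or None) for each '|'-separated entry."""
    refs = []
    for item in cell.split("|"):
        match = SPECTRA_REF_RE.match(item)
        if match is None:
            raise ValueError(f"not a valid spectra_ref entry: {item!r}")
        refs.append((int(match.group(1)), match.group(2)))
    return refs


print(parse_spectra_ref("ms_run[1]:index=5|ms_run[2]:index=3"))
# [(1, 'index=5'), (2, 'index=3')]
```

An entry such as `ms_run[1]`, used when no fragmentation spectrum supports the identification, yields `(1, None)`.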

7.5.15. identification_method

Description

The search engine or algorithm used for the identification. This SHOULD be specified using CV terms.

Type

Parameter

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	identification_method	...
SME	...	[MS, MS:1001477, SpectraST,]	...

7.5.16. ms_level

Description

The MS level of the spectrum used for the identification. This SHOULD be specified using CV terms.

Type

Parameter

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	ms_level	...
SME	...	[MS, MS:1000511, ms level, 2]	...
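
Parameter values such as the one above could be read into their four components as sketched here. The field names (CV label, accession, name, value) are inferred from the examples in this section, the `Param` type and `parse_param` function are hypothetical, and names containing commas (which would need quoting) are not handled.

```python
from typing import NamedTuple, Optional


class Param(NamedTuple):
    cv_label: Optional[str]
    accession: Optional[str]
    name: str
    value: Optional[str]


def parse_param(text: str) -> Param:
    """Split '[label, accession, name, value]' into its four fields."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError(f"not a parameter: {text!r}")
    parts = [part.strip() for part in inner[1:-1].split(",", 3)]
    if len(parts) != 4:
        raise ValueError(f"expected four fields: {text!r}")
    cv_label, accession, name, value = parts
    # Empty fields are reported as None.
    return Param(cv_label or None, accession or None, name, value or None)


print(parse_param("[MS, MS:1000511, ms level, 2]"))
```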

7.5.17. id_confidence_measure

Description

Any statistical value or score for the identification. The metadata section reports the type of score used, as id_confidence_measure[1-n] of type Param.

Type

Double List

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	id_confidence_measure	...
SME	...	0.7	...

7.5.18. rank

Description

The rank of this identification from this approach as increasing integers from 1 (best ranked identification). Ties (equal score) are represented by using the same rank - defaults to 1 if there is no ranking system used.

Type

Integer

Mandatory

True

Is Nullable:

FALSE

Example

SEH	...	rank	...
SME	...	1	...
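
One way to derive rank values with shared ranks for ties is dense ranking, sketched below. Two assumptions are made that the description leaves open: tied scores share a rank and the next distinct score receives the next integer, and a higher score is better unless stated otherwise. The function name `dense_ranks` is hypothetical.

```python
def dense_ranks(scores: list[float], higher_is_better: bool = True) -> list[int]:
    """Assign ranks starting at 1; equal scores share the same rank."""
    ordered = sorted(set(scores), reverse=higher_is_better)
    rank_of = {score: rank for rank, score in enumerate(ordered, start=1)}
    return [rank_of[score] for score in scores]


print(dense_ranks([0.7, 0.9, 0.7, 0.2]))  # [2, 1, 2, 3]
```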

7.5.19. opt_{identifier}_*

Description

Additional columns can be added to the end of the small molecule evidence table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: 'A'-'Z', 'a'-'z', '0'-'9', '_', '-', '[', ']', and ':'. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by '_'.

Type

Optional Column

Mandatory

False

Is Nullable:

TRUE

Example

SEH	...	opt_global_cv_value	...
SME	...	opt_assay[1]_my_value=My value	...
SME	...	opt_global_another_value=some other value	...

8. Changes from mzTab-M version 2.0 to 2.1

8.1. Summary

|===
|Section |v2.0 elements |v2.1 elements |Added |Removed |Changed

|Metadata (MTD) Section |67 |75 |10 |2 |65
|Small Molecule (SML) Section |17 |17 |0 |0 |17
|Small Molecule Feature (SMF) Section |12 |12 |0 |0 |12
|Small Molecule Evidence (SME) Section |19 |19 |0 |0 |19
|===

8.2. Metadata (MTD) Section

8.2.1. 🟢 Added in v2.1

|===
|Column/Field |Description |Type |Mandatory |Nullable

|contact[1-n]-orcid |The contact’s orcid id, without https prefix. |Regex ^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]{1}$ |False |
|ms_run[1-n]-parameters[1-n] |Additional parameters of the assay, separated by bars. |Parameter List |False |
|assay[1-n]-protocol_refs[1-n] |The protocol(s) referenced by this assay. |Integer List |False |
|assay[1-n]-parameters[1-n] |Additional parameters of the assay, separated by bars. |Parameter List |False |
|study_variable[1-n]-ms_run_refs[1-n] |The ms run(s) referenced by this study variable. |Integer List |False |
|study_variable[1-n]-group_refs[1-n] |The study variable group this study variable belongs to. |Integer List |False |
|protocol[1-n]-name |The protocol name. |String |True |
|protocol[1-n]-type |The protocol type, as defined by the parameter. |Parameter |True |
|protocol[1-n]-description |Description of the protocol. |String |False |
|protocol[1-n]-parameters[1-n] |The protocol parameters. |Parameter List |False |
|===
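
The pattern listed for contact[1-n]-orcid can be applied as sketched below. Stripping a leading `https://orcid.org/` before validation is an illustrative convenience based on the "without https prefix" wording; the function name `normalize_orcid` is hypothetical, and the check digit is not verified.

```python
import re

# The pattern given for contact[1-n]-orcid.
ORCID_RE = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]{1}$")


def normalize_orcid(value: str) -> str:
    """Return the bare identifier, or raise if it does not match the pattern."""
    bare = value.removeprefix("https://orcid.org/")  # requires Python 3.9+
    if not ORCID_RE.match(bare):
        raise ValueError(f"not a valid ORCID identifier: {value!r}")
    return bare


print(normalize_orcid("https://orcid.org/0000-0002-1825-0097"))
# 0000-0002-1825-0097
```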

8.2.2. 🔴 Removed in v2.1

|===
|Column/Field |Description |Type |Mandatory |Nullable

|ms_run[1-n]-usi_identifier
|An identifier for the MS run based on the Universal Spectrum Identifier (USI) specification.
Implied within the USI is an MS run identifier. Every deposited MS run can be referenced with a shortened form: mzspec:<collection>:<msRun>. More info on the standard can be found: https://www.psidev.info/usi.
Since an MS run may be represented in several formats (with potentially slightly different data associated with the spectrum), a format suffix MAY be specified (e.g. mzspec:<collection>:<msRun>.RAW) to signify that a specific file type is meant.
If specified, the format suffix should align with the ms_run[1-n]-format parameter and the ms_run[1-n]-location suffix defined in the metadata section.
|String
|False
|

|study_variable[1-n]-group_ref
|A reference to the study_variable_group that this study variable belongs to, allowing study variables to be linked to the experimental design factors they represent.
|{STUDY_VARIABLE_GROUP_ID}
|True
|
|===

8.2.3. Element Details

mzTab-version
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The version of the mzTab file. The suffix MUST be "-M" for mzTab for metabolomics (mzTab-M).

Version number of the mzTab format used.

Format: major.minor.patch-variant
Must end with "-M" suffix for metabolomics variant.

Used to ensure compatibility and processing correctness.

Type

Regex
----
\d{2}\.\d{0}\.\d{0}-M
----

Regex
----
^\d{1}\.\d{1}\.\d{1}-[A-Z]{1}$
----

Mandatory

True

True

Nullable

Example

MTD mzTab-version  2.0.0-M
----
MTD	mzTab-version	2.0.0-M
MTD	mzTab-version	2.1.0-M
----
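
The v2.1 pattern for mzTab-version can be exercised as follows. This is a sketch: the pattern accepts any single uppercase variant letter, so the additional "-M" check reflects the description's requirement for the metabolomics variant, and the function name `is_metabolomics_version` is hypothetical.

```python
import re

# The v2.1 pattern: single-digit major.minor.patch followed by a variant letter.
VERSION_RE = re.compile(r"^\d{1}\.\d{1}\.\d{1}-[A-Z]{1}$")


def is_metabolomics_version(version: str) -> bool:
    """True if the string matches the v2.1 pattern and ends with the -M suffix."""
    return bool(VERSION_RE.match(version)) and version.endswith("-M")


print(is_metabolomics_version("2.1.0-M"))  # True
print(is_metabolomics_version("2.1-M"))    # False
```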
mzTab-ID
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The ID of the mzTab file, this could be supplied by the repository from which it is downloaded or a local identifier from the lab producing the file. It is not intended to be a globally unique ID but carry some locally useful meaning.

Unique identifier for the mzTab-M document.
REQUIRED. Can be:
- Repository accession number (e.g., MTBLS214)
- Laboratory internal identifier
- Study-specific identifier
NOT intended as a globally unique identifier,
but SHOULD have local meaning within its context.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD mzTab-ID MTBL1234
----
MTD	mzTab-ID	MTBLS214
MTD	mzTab-ID	LAB001_2023
----
title
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The file’s human readable title.

Human-readable title of the experiment or study.
OPTIONAL. SHOULD be:
- Concise but informative
- Reflect the main focus of the study
- Unique within a collection of related studies

Type

String

String

Mandatory

False

False

Nullable

Example

MTD title Effects of Rapamycin on metabolite profile
----
MTD	title	Metabolomic Analysis of Human Plasma in Diabetes Type 2
MTD	title	Lipidomics Study of Brain Tissue in Alzheimer's Disease
----
description
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The file’s human readable description.

Detailed description of the experiment or study.
OPTIONAL. SHOULD include:
- Study objectives
- Experimental design overview
- Key methodological approaches
- Any unique aspects of the study
Provides context for understanding the data and its significance.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD description An experiment investigating the effects of Il-6.
----
MTD	description	Investigation of metabolic changes in human plasma samples from type 2 diabetes patients compared to healthy controls. Study includes both fasting and post-prandial measurements.
MTD	description	Analysis of lipid profiles in brain tissue samples examining the relationship between specific lipid species and Alzheimer's disease progression.
----
sample_processing[1-n]
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A list of parameters describing a sample processing, preparation or handling step similar to a biological or analytical methods report. The order of the sample_processing items should reflect the order these processing steps were performed in. If multiple parameters are given for a step these MUST be separated by a “|”. If derivatization was performed, it MUST be reported here as a general step, e.g. 'silylation' and the actual derivatization agent MUST be specified in Section 7.2.68.

Parameters specifying sample processing that was applied within one step.

Type

Parameter List

Parameter List

Mandatory

False

False

Nullable

Example

MTD sample_processing[1] [MSIO, MSIO:0000107, metabolism quenching using precooled 60 percent methanol ammonium bicarbonate buffer,]
MTD sample_processing[2] [MSIO, MSIO:0000146, centrifugation,]
MTD sample_processing[3] [MSIO, MSIO:0000141, metabolite extraction,]
MTD sample_processing[4] [MSIO, MSIO:0000141, silylation,]|[MSIO, MSIO:0000116, oximation,]
----
MTD	sample_processing[1]	[MSIO, MSIO:0000107, metabolism quenching using precooled 60 percent methanol ammonium bicarbonate buffer,]
MTD	sample_processing[2]	[MSIO, MSIO:0000146, centrifugation,]
MTD	sample_processing[3]	[MSIO, MSIO:0000141, metabolite extraction,]
MTD	sample_processing[4]	[MSIO, MSIO:0000141, silylation,]
----
instrument[1-n]-name
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The name of the instrument used in the experiment. Multiple instruments are numbered 1..n.

The instrument’s name.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD instrument[1]-name [MS, MS:1000449, LTQ Orbitrap,]

instrument[1-n]-source
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The instrument’s source used in the experiment. Multiple instruments are numbered [1-n].

The instrument’s ion source.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD instrument[1]-source [MS, MS:1000073, ESI,]
…
MTD instrument[2]-source [MS, MS:1000598, ETD,]

instrument[1-n]-analyzer[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The instrument’s analyzer type used in the experiment. Multiple instruments are numbered [1-n].

The instrument’s mass analyzer, as defined by the parameter.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD instrument[1]-analyzer[1] [MS, MS:1000291, linear ion trap,]
…
MTD instrument[2]-analyzer[1] [MS, MS:1000484, orbitrap,]

instrument[1-n]-detector
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The instrument’s detector type used in the experiment. Multiple instruments are numbered [1-n].

The instrument’s detector, as defined by the parameter.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD instrument[1]-detector [MS, MS:1000253, electron multiplier,]
…
MTD instrument[2]-detector [MS, MS:1000348, focal plane collector,]

software[1-n]
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

Software used to analyze the data and obtain the reported results. The parameter’s value SHOULD contain the software’s version. The order (numbering) should reflect the order in which the tools were used.

The software utilized.

Type

Parameter

Parameter

Mandatory

True

False

Nullable

Example

MTD software[1] [MS, MS:1002879, Progenesis QI, 3.0]
----
MTD	software[1]	[MS, MS:1002879, Progenesis QI, 3.0]
MTD	software[1]-setting	Fragment tolerance = 0.1 Da
…
MTD	software[2]-setting	Parent tolerance = 0.5 Da
----
software[1-n]-setting[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

A software setting used. This field MAY occur multiple times for a single software. The value of this field is deliberately set as a String, since there currently do not exist CV terms for every possible setting.

A software setting used. This field MAY occur multiple times for a single software. The value of this field is deliberately set as a String, since there currently do not exist cvParams for every possible setting.

Type

String

String List

Mandatory

False

False

Nullable

Example

MTD software[1]-setting Fragment tolerance = 0.1 Da
…
MTD software[2]-setting Parent tolerance = 0.5 Da

publication[1-n]
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

A publication associated with this file. Several publications can be given by indicating the number in the square brackets after “publication”. PubMed ids must be prefixed by “pubmed:”, DOIs by “doi:”. Multiple identifiers MUST be separated by “|”.

The publication item ids referenced by this publication.

Type

String

String List

Mandatory

False

True

Nullable

Example

MTD publication[1] pubmed:21063943|doi:10.1007/978-1-60761-987-1_6
MTD publication[2] pubmed:20615486|doi:10.1016/j.jprot.2010.06.008
----
MTD	publication[1]	pubmed:21063943|doi:10.1007/978-1-60761-987-1_6
MTD	publication[2]	pubmed:20615486|doi:10.1016/j.jprot.2010.06.008
----
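
A publication value could be split into its identifiers as sketched below, following the v2.0 wording on the “pubmed:” and “doi:” prefixes and assuming the separator is the vertical bar character. The function name `parse_publication` is hypothetical.

```python
def parse_publication(cell: str) -> dict[str, list[str]]:
    """Group the identifiers of one publication entry by their prefix."""
    identifiers: dict[str, list[str]] = {}
    for item in cell.split("|"):
        scheme, sep, ident = item.partition(":")
        if not sep or scheme not in ("pubmed", "doi"):
            raise ValueError(f"identifier needs a pubmed: or doi: prefix: {item!r}")
        identifiers.setdefault(scheme, []).append(ident)
    return identifiers


print(parse_publication("pubmed:21063943|doi:10.1007/978-1-60761-987-1_6"))
# {'pubmed': ['21063943'], 'doi': ['10.1007/978-1-60761-987-1_6']}
```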
contact[1-n]-name
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The contact’s name. Several contacts can be given by indicating the number in the square brackets after "contact". A contact has to be supplied in the format [first name] [initials] [last name] (see example).

The contact’s name.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD contact[1]-name James D. Watson
…
MTD contact[2]-name Francis Crick

contact[1-n]-affiliation
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The contact’s affiliation.

The contact’s affiliation.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD contact[1]-affiliation Cambridge University, UK
MTD contact[2]-affiliation Cambridge University, UK

contact[1-n]-email
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The contact’s e-mail address.

The contact’s e-mail address.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD contact[1]-email watson@cam.ac.uk
…
MTD contact[2]-email crick@cam.ac.uk

uri[1-n]
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A URI pointing to the file’s source data (e.g., a MetaboLights record).

The URI pointing to the external resource.

Type

URI

URI

Mandatory

False

False

Nullable

Example

MTD uri[1] https://www.ebi.ac.uk/metabolights/MTBLS517
----
MTD	uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517
…
MTD	external_study_uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt
----
external_study_uri[1-n]
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A URI pointing to an external file with more details about the study design (e.g., an ISA-TAB file).

The URI pointing to the external resource.

Type

URI

URI

Mandatory

False

False

Nullable

Example

MTD external_study_uri[1] https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt
----
MTD	uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517
…
MTD	external_study_uri[1]	https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt
----
quantification_method
⚠️ Changed fields: Example
Field v2.0 v2.1

Description

The quantification method used in the experiment reported in the file.

The quantification method used in the experiment reported in the file.

Type

Parameter

Parameter

Mandatory

True

True

Nullable

Example

MTD quantification_method [MS, MS:1001834, LC-MS label-free quantitation analysis, ]
MTD quantification_method [MS, MS:1001838, SRM quantitation analysis, ]

sample[1-n]
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A name for each sample to serve as a list of the samples that MUST be reported in the following tables. Samples MUST be reported if a statistical design is being captured (i.e. bio or tech replicates). If the type of replicates is not known, samples SHOULD NOT be reported.

The sample’s name.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD sample[1] individual number 1
MTD sample[2] individual number 2
----
COM	Experiment where all samples consisted of the same two species
MTD	sample[1]	individual number 1
MTD	sample[1]-species[1]	[NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD	sample[1]-tissue[1]	[BTO, BTO:0000759, liver, ]
MTD	sample[1]-cell_type[1]	[CL, CL:0000182, hepatocyte, ]
MTD	sample[1]-disease[1]	[DOID, DOID:684, hepatocellular carcinoma, ]
MTD	sample[1]-disease[2]	[DOID, DOID:9451, alcoholic fatty liver, ]
MTD	sample[1]-description	Hepatocellular carcinoma samples.
MTD	sample[1]-custom[1]	[,,Extraction date, 2011-12-21]
MTD	sample[1]-custom[2]	[,,Extraction reason, liver biopsy]
MTD	sample[2]	individual number 2
MTD	sample[2]-species[1]	[NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD	sample[2]-tissue[1]	[BTO, BTO:0000759, liver, ]
MTD	sample[2]-cell_type[1]	[CL, CL:0000182, hepatocyte, ]
MTD	sample[2]-description	Healthy control samples.
----
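The sample metadata lines above share one shape: the MTD prefix, a key such as sample[1]-species[1], and a value, separated by tabs. The following Python sketch shows one way such keys could be split into their indexed parts and grouped per sample. The function names parse_mtd_key and group_samples are hypothetical and not part of the specification, and the sketch assumes tab-separated input with exactly three fields per metadata line.

```python
import re
from collections import defaultdict

# One key element: a name with an optional [index], e.g. "sample[1]" or "description".
_ELEMENT = re.compile(r"^([A-Za-z_]+)(?:\[(\d+)\])?$")


def parse_mtd_key(key):
    """Split 'sample[1]-species[2]' into [('sample', 1), ('species', 2)]."""
    parts = []
    for element in key.split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError(f"unexpected key element: {element!r}")
        name, index = match.groups()
        parts.append((name, int(index) if index is not None else None))
    return parts


def group_samples(lines):
    """Collect sample[n] metadata from tab-separated MTD lines into a dict keyed by n."""
    samples = defaultdict(dict)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or fields[0] != "MTD":
            continue  # skip COM lines and anything that is not a metadata line
        parts = parse_mtd_key(fields[1])
        if parts[0][0] != "sample":
            continue
        index = parts[0][1]
        # The bare 'sample[n]' line carries the name; sub-keys keep their own text.
        sub_key = fields[1].split("-", 1)[1] if len(parts) > 1 else "name"
        samples[index][sub_key] = fields[2]
    return dict(samples)
```

Keeping the sub-key text (for example species[1]) as the dictionary key is a simplification that preserves the original index without introducing a nested structure.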
sample[1-n]-species[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The respective species of the samples analysed. For more complex cases, such as metagenomics, optional columns and userParams should be used.

Biological species information on the sample.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

COM Experiment where all samples consisted of the same two species
MTD sample[1]-species[1] [NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD sample[2]-species[1] [NCBITaxon, NCBITaxon:39767, Human rhinovirus 11, ]
COM Experiment where two samples from different species (combinations)
COM were analysed as biological replicates.
MTD sample[1]-species[1] [NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD sample[1]-species[2] [NCBITaxon, NCBITaxon:39767, Human rhinovirus 11, ]
MTD sample[2]-species[1] [NCBITaxon, NCBITaxon:9606, Homo sapiens, ]
MTD sample[2]-species[2] [NCBITaxon, NCBITaxon:12130, Human rhinovirus 2, ]

sample[1-n]-tissue[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The respective tissue(s) of the sample.

Biological tissue information on the sample.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD sample[1]-tissue[1] [BTO, BTO:0000759, liver, ]

sample[1-n]-cell_type[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The respective cell type(s) of the sample.

Biological cell type information on the sample.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD sample[1]-cell_type[1] [CL, CL:0000182, hepatocyte, ]

sample[1-n]-disease[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The respective disease(s) of the sample.

Disease information on the sample.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD sample[1]-disease[1] [DOID, DOID:684, hepatocellular carcinoma, ]
MTD sample[1]-disease[2] [DOID, DOID:9451, alcoholic fatty liver, ]

sample[1-n]-description
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A human readable description of the sample.

A free form description of the sample.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD sample[1]-description Hepatocellular carcinoma samples.
MTD sample[2]-description Healthy control samples.

sample[1-n]-custom[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

Parameters describing the sample’s additional properties. Dates MUST be provided in ISO-8601 format.

Additional user or cv parameters.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD sample[1]-custom[1] [,,Extraction date, 2011-12-21]
MTD sample[1]-custom[2] [,,Extraction reason, liver biopsy]

ms_run[1-n]-location
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

Location of the external data file, e.g. the raw files on which the analysis has been performed. If the actual location of the MS run is unknown, “null” MUST be used as a placeholder value, since the [1-n] cardinality is referenced elsewhere. If pre-fractionation has been performed, then [1-n] ms_runs SHOULD be created per assay.

The msRun’s location URI.

Type

URI

URI

Mandatory

True

True

Nullable

Example

MTD ms_run[1]-location file:///C:/path/to/my/file
…
MTD ms_run[1]-location ftp://ftp.ebi.ac.uk/path/to/file
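A reader of this field has to distinguish the “null” placeholder from a real URI. The sketch below illustrates one possible check using Python's standard urllib.parse module. The function name check_ms_run_location is hypothetical, and rejecting single-letter schemes is an assumption made here so that a bare Windows path such as C:\data\run.raw is not mistaken for a URI.

```python
from urllib.parse import urlparse


def check_ms_run_location(value):
    """Return 'null' for the placeholder, otherwise the URI scheme of the location."""
    if value == "null":
        return "null"
    scheme = urlparse(value).scheme
    # Assumption: a one-letter "scheme" is really a Windows drive letter.
    if len(scheme) < 2:
        raise ValueError(f"not a URI with a scheme: {value!r}")
    return scheme
```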

ms_run[1-n]-instrument_ref
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

If different instruments are used in different runs, this attribute can be used to link a specific instrument to a specific run.

Instrument reference.

Type

Integer

Integer

Mandatory

False

False

Nullable

Example

MTD ms_run[1]-instrument_ref instrument[1]

ms_run[1-n]-format
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A parameter specifying the data format of the external MS data file. If ms_run[1-n]-format is present, ms_run[1-n]-id_format SHOULD also be present, following the parameters specified in Table 1.

The format of the MS run file.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD ms_run[1]-format [MS, MS:1000584, mzML file, ]
MTD ms_run[1]-id_format [MS, MS:1000530, mzML unique identifier, ]
…
MTD ms_run[2]-format [MS, MS:1001062, Mascot MGF file, ]
MTD ms_run[2]-id_format [MS, MS:1000774, multiple peak list nativeID format, ]

ms_run[1-n]-id_format
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

Parameter specifying the id format used in the external data file. If ms_run[1-n]-id_format is present, ms_run[1-n]-format SHOULD also be present.

The format of the IDs in the MS run file.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD ms_run[1]-format [MS, MS:1000584, mzML file, ]
MTD ms_run[1]-id_format [MS, MS:1000530, mzML unique identifier, ]
…
MTD ms_run[2]-format [MS, MS:1001062, Mascot MGF file, ]
MTD ms_run[2]-id_format [MS, MS:1000774, multiple peak list nativeID format, ]

ms_run[1-n]-usi_identifier 🔴 (Removed in v2.1)
Field v2.0 v2.1

Description

An identifier for the MS run based on the Universal Spectrum Identifier (USI) specification.
Implied within the USI is an MS run identifier. Every deposited MS run can be referenced with a shortened form: mzspec:<collection>:<msRun>. More information on the standard can be found at https://www.psidev.info/usi.
Since an MS run may be represented in several formats (with potentially slightly different data associated with the spectrum), a format suffix MAY be specified (e.g. mzspec:<collection>:<msRun>.RAW) to signify that a specific file type is meant.
If specified, the format suffix should align with the ms_run[1-n]-format parameter and the ms_run[1-n]-location suffix defined in the metadata section.

(not present)

Type

String

(not present)

Mandatory

False

(not present)

Nullable

(not present)

Example

MTD ms_run[1]-usi_identifier mzspec:PXD000561:Adult_Frontalcortex_bRP_Elite_85_f09

(not present)
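The shortened form mzspec:<collection>:<msRun> described for this field can be taken apart with a simple split. The following Python sketch is illustrative only: the function name parse_run_usi is hypothetical, and the sketch deliberately does not try to separate an optional format suffix (such as .RAW), because a run name may itself contain dots.

```python
def parse_run_usi(usi):
    """Split a shortened run-level USI 'mzspec:<collection>:<msRun>' into its parts."""
    # maxsplit=2 keeps any further colons inside the run name.
    prefix, collection, ms_run = usi.split(":", 2)
    if prefix != "mzspec" or not collection or not ms_run:
        raise ValueError(f"not a run-level USI: {usi!r}")
    return collection, ms_run
```

A value with fewer than three colon-separated parts also raises ValueError, because the tuple unpacking fails.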

ms_run[1-n]-fragmentation_method[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The type(s) of fragmentation used in a given ms run.

The fragmentation methods applied during this msRun.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD ms_run[1]-fragmentation_method[1] [MS, MS:1000133, CID, ]
…
MTD ms_run[1]-fragmentation_method[2] [MS, MS:1000422, HCD, ]

ms_run[1-n]-scan_polarity[1-n]
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

The polarity mode of a given run. Usually only one value SHOULD be given here except for the case of mixed polarity runs.

The scan polarity/polarities used during this msRun.

Type

Parameter

Parameter List

Mandatory

True

False

Nullable

Example

MTD ms_run[1]-scan_polarity[1] [MS, MS:1000130, positive scan, ]
OR
MTD ms_run[1]-scan_polarity[1] [MS, MS:1000129, negative scan, ]
OR (For mixed polarity in one run)
MTD ms_run[1]-scan_polarity[1] [MS, MS:1000130, positive scan, ]
MTD ms_run[1]-scan_polarity[2] [MS, MS:1000129, negative scan, ]
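The three cases shown in the example (positive, negative, or both for a mixed polarity run) can be told apart from the accessions alone. The sketch below is a minimal Python illustration. The function name polarity_mode and the returned labels are hypothetical, and only the two accessions from the example above are recognised.

```python
POSITIVE_SCAN = "MS:1000130"  # positive scan, as in the example above
NEGATIVE_SCAN = "MS:1000129"  # negative scan, as in the example above


def polarity_mode(accessions):
    """Classify a run as 'positive', 'negative' or 'mixed' from its scan_polarity accessions."""
    found = set(accessions)
    unknown = found - {POSITIVE_SCAN, NEGATIVE_SCAN}
    if unknown:
        raise ValueError(f"unexpected polarity accession(s): {sorted(unknown)}")
    if found == {POSITIVE_SCAN}:
        return "positive"
    if found == {NEGATIVE_SCAN}:
        return "negative"
    if found == {POSITIVE_SCAN, NEGATIVE_SCAN}:
        return "mixed"
    raise ValueError("no scan polarity given")
```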

ms_run[1-n]-hash
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

Hash value of the corresponding external MS data file defined in ms_run[1-n]-location. If ms_run[1-n]-hash is present, ms_run[1-n]-hash_method SHOULD also be present.

The file hash value of this msRun’s data file.

Type

String

String

Mandatory

False

False

Nullable

Example

MTD ms_run[1]-hash_method [MS, MS:1000569, SHA-1, ]
MTD ms_run[1]-hash de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3

ms_run[1-n]-hash_method
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A parameter specifying the hash method used to generate the String in ms_run[1-n]-hash. Specifics of the hash method used MAY follow the definitions of the mzML format. If ms_run[1-n]-hash is present, ms_run[1-n]-hash_method SHOULD also be present.

The method used to calculate the hash.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD ms_run[1]-hash_method [MS, MS:1000569, SHA-1, ]
MTD ms_run[1]-hash de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3
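A value for ms_run[1-n]-hash that matches the SHA-1 hash method from the example can be produced with Python's standard hashlib module. The sketch below is one way to do this. The function name sha1_of_file and the chunk size are illustrative choices, and the lowercase hexadecimal output is assumed to be the intended encoding because it matches the 40-character example value above.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large raw files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```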

assay[1-n]
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A name for each assay, to serve as a list of the assays that MUST be reported in the following tables.

The assay name.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD assay[1] first assay
MTD assay[2] second assay
----
MTD	assay[1]	first assay
MTD	assay[1]-custom[1]	[MS, , Assay operator, Fred Blogs]
MTD	assay[1]-external_uri	https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt?STUDYASSAY=a_e04_c18pos.txt
MTD	assay[1]-sample_ref	sample[1]
MTD	assay[1]-ms_run_ref	ms_run[1]
----
assay[1-n]-custom[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

Additional parameters or values for a given assay.

Additional user or cv parameters.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD assay[1]-custom[1] [MS, , Assay operator, Fred Blogs]

assay[1-n]-external_uri
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A reference to further information about the assay, for example via a reference to an object within an ISA-TAB file.

An external URI to further information about this assay.

Type

URI

URI

Mandatory

False

False

Nullable

Example

MTD assay[1]-external_uri https://www.ebi.ac.uk/metabolights/MTBLS517/files/i_Investigation.txt?STUDYASSAY=a_e04_c18pos.txt

assay[1-n]-sample_ref
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

An association from a given assay to the sample analysed.

Sample reference.

Type

{SAMPLE_ID}

Integer

Mandatory

False

False

Nullable

Example

MTD assay[1]-sample_ref sample[1]
MTD assay[2]-sample_ref sample[2]

assay[1-n]-ms_run_ref
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

An association from a given assay to the source MS run. All assays MUST reference exactly one ms_run unless a workflow with pre-fractionation is being encoded, in which case each assay MUST reference n ms_runs where n fractions have been collected.

Multiple assays SHOULD reference the same ms_run to capture multiplexed experimental designs.

The ms run(s) referenced by this assay.

Type

{MS_RUN_ID}

Integer List

Mandatory

True

True

Nullable

Example

MTD assay[1]-ms_run_ref ms_run[1]
MTD assay[1]-ms_run_ref ms_run[2]

study_variable[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

A name for each study variable (experimental condition or factor), to serve as a list of the study variables that MUST be reported in the following tables. For software that does not capture study variables, a single study variable MUST be reported, linking to all assays. This single study variable MUST have the identifier “undefined”.

The study variable value. Encoded according to the datatype declared on the referenced study_variable_group: either a literal value (for xsd: datatypes) or a Parameter (for the Parameter datatype, e.g. [NO, NO:12345, Male,] or [,,Male,]).

Type

String

Study Variable List

Mandatory

True

True

Nullable

Example

MTD study_variable[1] “control”
MTD study_variable[2] “1 minute”
MTD study_variable[13] “Wildtype”
----
MTD	study_variable[1]	control
MTD	study_variable[1]-assay_refs	assay[1]| assay[2]| assay[3]
MTD	study_variable-average_function	[MS, MS:1002883, median, ]
MTD	study_variable-variation_function	[MS, MS:1002885, standard error, ]
MTD	study_variable[1]-description	Group B (spike-in 0.74 fmol/uL)
MTD	study_variable[2]	1 minute 0.5mg rapamycin
----
study_variable[1-n]-assay_refs
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

Bar-separated references to the IDs of assays grouped in the study variable.

The assays referenced by this study variable.

Type

{ASSAY_ID}, …​

Integer List

Mandatory

True

False

Nullable

Example

MTD study_variable[1]-assay_refs assay[1]| assay[2]| assay[3]
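The bar-separated assay references can be reduced to a list of assay indices. The following Python sketch shows one way to do so, assuming the separator is the vertical bar character (|) and that whitespace around each reference may be ignored. The function name parse_assay_refs is hypothetical.

```python
import re

_ASSAY_REF = re.compile(r"^assay\[(\d+)\]$")


def parse_assay_refs(value):
    """Turn 'assay[1]| assay[2]| assay[3]' into [1, 2, 3]."""
    indices = []
    for token in value.split("|"):
        match = _ASSAY_REF.match(token.strip())
        if match is None:
            raise ValueError(f"not an assay reference: {token!r}")
        indices.append(int(match.group(1)))
    return indices
```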

study_variable[1-n]-description
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A textual description of the study variable.

A free-form description of this study variable.

Type

String

String

Mandatory

True

False

Nullable

Example

MTD study_variable[1]-description Group B (spike-in 0.74 fmol/uL)

study_variable[1-n]-group_ref 🔴 (Removed in v2.1)
Field v2.0 v2.1

Description

A reference to the study_variable_group that this study variable belongs to, allowing study variables to be linked to the experimental design factors they represent.

(not present)

Type

{STUDY_VARIABLE_GROUP_ID}

(not present)

Mandatory

True

(not present)

Nullable

(not present)

Example

MTD study_variable[1]-group_ref study_variable_group[1]
MTD study_variable[2]-group_ref study_variable_group[1]
MTD study_variable[3]-group_ref study_variable_group[2]

(not present)

study_variable_group[1-n]
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A parameter defining the group to which the study variable belongs. This allows grouping of related study variables that belong to the same experimental design factor in multi-factorial designs. The parameter can be either a CV Parameter or a user-defined parameter. At least one study_variable_group MUST be defined. For software that does not capture study variables, a single study_variable_group MUST be reported, linking to the single study variable. This single study_variable_group MUST have the identifier “undefined”. Added in mzTab-M 2.1.

The study variable group name.

Type

Parameter

Parameter

Mandatory

True

True

Nullable

Example

MTD study_variable_group[1] [,,sex,]
MTD study_variable_group[2] [EFO, EFO:0004953, date of diagnosis, ]
----
MTD	study_variable_group[1]	[PATO, PATO:0000383, sex, ]
MTD	study_variable_group[1]-description	Sex of the individual
MTD	study_variable_group[1]-type	[STATO, STATO:0000252, categorical variable, ]
MTD	study_variable_group[1]-datatype	xsd:string
MTD	study_variable_group[2]	[PATO, PATO:0000384, timepoint, ]
MTD	study_variable_group[2]-description	Time after treatment
MTD	study_variable_group[2]-type	[STATO, STATO:0000228, ordinal variable, ]
MTD	study_variable_group[2]-datatype	xsd:integer
MTD	study_variable_group[2]-unit	[UO, UO:0000033, day, ]
MTD	study_variable[1]	Female_0day
MTD	study_variable[1]-group_ref	study_variable_group[1]
MTD	study_variable[1]-assay_refs	assay[1]|assay[2]|assay[3]
MTD	study_variable[2]	Female_1day
MTD	study_variable[2]-group_ref	study_variable_group[2]
MTD	study_variable[2]-assay_refs	assay[4]|assay[5]|assay[6]
----
study_variable_group[1-n]-description
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A textual description of the study variable group.

Description of the study variable group.

Type

String

String

Mandatory

True (if study_variable_group is defined)

False

Nullable

Example

MTD study_variable_group[1]-description Sex of the individual

study_variable_group[1-n]-type
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The statistical type of the group variable, which determines how the values should be interpreted in a statistical analysis context. The type MUST be a term from the STATO ontology, and SHOULD be one of the examples below. While the type also has implications for how the variables are encoded/decoded, that information goes into the study_variable_group[1-n]-datatype.

The following statistical types are supported:

* categorical [STATO, STATO:0000252, categorical variable, ]: a variable that can only assume a finite number of values without inherent ordering. In programming languages it is often represented as an enum, a factor (in R), or a string.

* ordinal [STATO, STATO:0000228, ordinal variable, ]: an ordered variable where the sequence matters but the intervals between levels are undefined. The values represent categories with a defined sequence or ranking. Example: disease severity (mild, moderate, severe), confidence levels (low, medium, high).

* continuous [STATO, STATO:0000251, continuous variable, ]: a variable that can take any value within a range on a continuous scale. Example: temperature, height, concentration, time.

Tools and parsers implementing mzTab-M SHOULD apply the appropriate data type interpretation when reading study_variable_group data and constructing analysis data structures (e.g., data frames, matrices, or tables).

The study variable group type, as defined by the parameter.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD study_variable_group[1]-type [STATO, STATO:0000252, categorical variable, ]
MTD study_variable_group[2]-type [STATO, STATO:0000228, ordinal variable, ]
MTD study_variable_group[3]-type [STATO, STATO:0000251, continuous variable, ]

study_variable_group[1-n]-datatype
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The datatype of the group variable, which determines how its values can be encoded and parsed in mzTab-M files, and how they could be handled in programming languages.

Producers of mzTab-M 2.1.0 SHOULD provide a datatype for each study_variable_group to simplify interpretation by downstream consumers of the format. The field is not mandatory, but its presence removes ambiguity about how the associated values are encoded and should be parsed.

The following datatypes are supported:

* xsd:string – Character string.
* xsd:integer – Arbitrary‑size integer.
* xsd:decimal – Arbitrary‑precision decimal number.
* xsd:boolean – Boolean value (true / false).
* xsd:date – Calendar date, encoded in ISO 8601 format (YYYY-MM-DD).
* xsd:time – Time of day, encoded in ISO 8601 format (hh:mm:ss, optional fractional seconds and timezone).
* xsd:dateTime – Combined date and time, encoded in ISO 8601 format (YYYY-MM-DDThh:mm:ss, with optional fractional seconds and timezone, e.g. YYYY-MM-DDThh:mm:ssZ).
* xsd:anyURI – A Uniform Resource Identifier reference.

Tools and parsers implementing mzTab-M SHOULD apply the appropriate data type interpretation when reading study_variable_group data and constructing analysis data structures (e.g., data frames, matrices, or tables).

The datatype of the group variable, which determines how its values can be encoded and parsed in mzTab-M files, and how they could be handled in programming languages.

Producers of mzTab-M 2.1.0 SHOULD provide a datatype for each study_variable_group to simplify interpretation by downstream consumers of the format. The field is not mandatory, but its presence removes ambiguity about how the associated values are encoded and should be parsed.

The following datatypes are supported:

* xsd:string – Character string.
* xsd:integer – Arbitrary‑size integer.
* xsd:decimal – Arbitrary‑precision decimal number.
* xsd:boolean – Boolean value (true / false).
* xsd:date – Calendar date, encoded in ISO 8601 format (YYYY-MM-DD).
* xsd:time – Time of day, encoded in ISO 8601 format (hh:mm:ss, optional fractional seconds and timezone).
* xsd:dateTime – Combined date and time, encoded in ISO 8601 format (YYYY-MM-DDThh:mm:ss, with optional fractional seconds and timezone, e.g. YYYY-MM-DDThh:mm:ssZ).
* xsd:anyURI – A Uniform Resource Identifier reference.
* Parameter – The values of the linked study_variables are reported as a Parameter (user-defined or CV Parameter), using the standard mzTab-M Parameter syntax [CV, accession, name, value].

Writers MUST ensure that the values of the study_variable entries belonging to the same study_variable_group all have the same type (e.g. string, number, …​) and use the same convention of reporting the value directly or as a Parameter, consistent with the datatype declared on the group. If the study_variable_group defines the datatype as Parameter, the CVParam qualifying the study_variable_group itself can be of a different CV origin than the CVParams used in the linked study_variable values.

Tools and parsers implementing mzTab-M SHOULD apply the appropriate data type interpretation when reading study_variable_group data and constructing analysis data structures (e.g., data frames, matrices, or tables).

Type

String (constrained)

Parameter

Mandatory

False

False

Nullable

Example

MTD study_variable_group[1]-datatype xsd:string
MTD study_variable_group[2]-datatype xsd:decimal
MTD study_variable_group[3]-datatype xsd:date
----
MTD	study_variable_group[1]-datatype	....
MTD	study_variable_group[1]-datatype	xsd:string
MTD	study_variable_group[2]-datatype	xsd:decimal
MTD	study_variable_group[3]-datatype	xsd:date
MTD	study_variable_group[4]-datatype	Parameter
COM	plain string value:
MTD	study_variable[1]	Male
MTD	study_variable[1]-group_ref	study_variable_group[1]
COM	user-defined Parameter value:
MTD	study_variable[2]	[,,Male,]
MTD	study_variable[2]-group_ref	study_variable_group[4]
COM	CV Parameter value:
MTD	study_variable[3]	[NCIT, NCIT:C20197, Male, ]
MTD	study_variable[3]-group_ref	study_variable_group[4]
----
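The datatype declared on a study_variable_group tells a reader how to decode the values of the linked study variables. The Python sketch below maps each supported datatype to a standard-library decoder. All function names are hypothetical, and the sketch makes several assumptions: the Parameter splitter does not handle commas inside quoted names, only the literals true and false are accepted as booleans, and on Python versions before 3.11 the fromisoformat functions do not accept a trailing Z timezone designator.

```python
from datetime import date, datetime, time
from decimal import Decimal


def parse_parameter(text):
    """Split '[CV, accession, name, value]' into a 4-tuple of stripped strings."""
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a Parameter: {text!r}")
    # Simplification: a plain comma split, so quoted commas are not supported.
    fields = [field.strip() for field in text[1:-1].split(",")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {text!r}")
    return tuple(fields)


def parse_boolean(text):
    """Decode the literals 'true' and 'false'."""
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Decoder per datatype name, following the list of supported datatypes above.
_DECODERS = {
    "xsd:string": str,
    "xsd:integer": int,
    "xsd:decimal": Decimal,
    "xsd:boolean": parse_boolean,
    "xsd:date": date.fromisoformat,
    "xsd:time": time.fromisoformat,
    "xsd:dateTime": datetime.fromisoformat,
    "xsd:anyURI": str,
    "Parameter": parse_parameter,
}


def decode_study_variable(value, datatype):
    """Decode one study_variable value according to its group's declared datatype."""
    try:
        decoder = _DECODERS[datatype]
    except KeyError:
        raise ValueError(f"unsupported datatype: {datatype!r}") from None
    return decoder(value)
```

Using Decimal rather than float for xsd:decimal keeps the arbitrary precision that the datatype promises.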
study_variable_group[1-n]-unit
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

An optional parameter specifying the unit of the study variable group (e.g., day, hour, concentration, etc.). This SHOULD only be used for numeric or scalar values.

The study variable group unit, as defined by the parameter.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD study_variable_group[1]-unit [UO, UO:0000033, day, ]
MTD study_variable_group[2]-unit [UO, UO:0000010, second, ]

study_variable[1-n]-average_function
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The function used to calculate the study variable quantification value, if the operation used is not the arithmetic mean (default), e.g. “geometric mean” or “median”. The 1-n refers to different study variables.

The function used to calculate the study variable quantification value, if the operation used is not the arithmetic mean (default), e.g. geometric mean or median.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD study_variable-average_function [MS, MS:1002883, median, ]

study_variable[1-n]-variation_function
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The function used to calculate the study variable quantification variation value if it is reported and the operation used is not coefficient of variation (default) e.g. “standard error”.

The function used to calculate the study variable quantification variation value if it is reported and the operation used is not coefficient of variation (default). e.g. standard error.

Type

Parameter

Parameter

Mandatory

False

False

Nullable

Example

MTD study_variable-variation_function [MS, MS:1002885, standard error, ]

custom[1-n]
⚠️ Changed fields: Type, Example
Field v2.0 v2.1

Description

Any additional parameters describing the analysis reported.

Any additional parameters describing the analysis reported.

Type

Parameter

Parameter List

Mandatory

False

False

Nullable

Example

MTD custom[1] [,,MS operator, Florian]
----
MTD	custom	[MS, MS:1000001, custom param, value]
----
cv[1-n]-label
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A string describing the labels of the controlled vocabularies/ontologies used in the mzTab file as a short-hand e.g. "MS" for PSI-MS.

The abbreviated CV label.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD cv[1]-label MS

cv[1-n]-full_name
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A string describing the full names of the controlled vocabularies/ontologies used in the mzTab file.

The full name of this CV, for humans.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD cv[1]-full_name PSI-MS controlled vocabulary

cv[1-n]-version
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

A string describing the version of the controlled vocabularies/ontologies used in the mzTab file.

The CV version used when the file was generated.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD cv[1]-version 4.1.11

cv[1-n]-uri
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

A string containing the URIs of the controlled vocabularies/ontologies used in the mzTab file. Note: For OBO ontologies, always use an OBO PURL rather than raw repository links to ensure long-term stability. For other ontology formats, please use the fully qualified PURL pointing to the ontology file.

A URI to the CV definition.

Type

String

URI

Mandatory

True

True

Nullable

Example

MTD cv[1]-uri https://purl.obolibrary.org/obo/ms.obo

database[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The description of the databases used. For cases where a known database has not been used for identification, a userParam SHOULD be inserted to describe any identification performed, e.g. de novo.

If no identification has been performed at all then "no database" should be inserted followed by null.

The database name.

Type

Param

Database List

Mandatory

True

True

Nullable

Example

MTD database[1] [MIRIAM, MIR:00100079, HMDB, ]
MTD database[2] [,, "de novo", ]
MTD database[3] [MIRIAM, MIR:00000002, CHEBI, ]
MTD database[4] [,, "customDB", ]
OR
MTD database[5] [,, "no database", null ]
----
MTD	database[1]	[MIRIAM, MIR:00100079, HMDB, ]
MTD	database[1]-prefix	hmdb
MTD	database[1]-version	3.6
MTD	database[1]-uri	https://www.hmdb.ca
MTD	database[2]	[,, "de novo", ]
MTD	database[2]-prefix	dn
MTD	database[2]-version	Unknown
MTD	database[2]-uri	null
MTD	database[3]	[,, "no database", null ]
MTD	database[3]-prefix	null
MTD	database[3]-version	Unknown
MTD	database[3]-uri	null
----
database[1-n]-prefix
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

The prefix used in the “identifier” column of data tables. For the “no database” case "null" must be used.

The prefix used in the “identifier” column of data tables. For the 'no database' case 'null' must be used.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD database[1]-prefix hmdb
MTD database[2]-prefix dn
MTD database[3]-prefix chebi
MTD database[4]-prefix cust
OR
MTD database[5]-prefix null

database[1-n]-version
⚠️ Changed fields: Example
Field v2.0 v2.1

Description

The database version is mandatory where identification has been performed. This may be a formal version number e.g. “1.4.1”, a date of access “2016-10-27” (ISO-8601 format) or “Unknown” if there is no suitable version that can be annotated.

The database version is mandatory where identification has been performed. This may be a formal version number e.g. “1.4.1”, a date of access “2016-10-27” (ISO-8601 format) or “Unknown” if there is no suitable version that can be annotated.

Type

String

String

Mandatory

True

True

Nullable

Example

MTD database[1]-version 3.6
OR
MTD database[2]-version Unknown

database[1-n]-uri
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The URI to the database. For the “no database” case, "null" must be reported.

The URI to the database. For the “no database” case, 'null' must be reported.

Type

URI

String

Mandatory

True

True

Nullable

Example

MTD database[1]-uri http://www.hmdb.ca/
OR
MTD database[5]-uri null
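The rules for the “no database” case (prefix and URI both reported as null) lend themselves to a small consistency check. The Python sketch below is illustrative only. The function name check_no_database_entry and the wording of the messages are hypothetical, and the name is assumed to have been extracted from the database[n] Parameter with its surrounding quotes removed.

```python
def check_no_database_entry(name, prefix, uri):
    """Return a list of problems for a database[n] entry; empty if none were found."""
    problems = []
    if name == "no database":
        if prefix != "null":
            problems.append("prefix must be null for the 'no database' case")
        if uri != "null":
            problems.append("uri must be null for the 'no database' case")
    return problems
```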

derivatization_agent[1-n]
⚠️ Changed fields: Type, Example
Field v2.0 v2.1

Description

A description of derivatization agents applied to small molecules, using userParams or CV terms where possible.

A description of derivatization agents applied to small molecules, using userParams or CV terms where possible.

Type

Param

Parameter List

Mandatory

False

False

Nullable

Example

MTD derivatization_agent[1] [XLMOD, XLMOD:07014, N-methyl-N-t-butyldimethylsilyltrifluoroacetamide, ]
----
MTD	derivatization_agent[1]	[XLMOD, XLMOD:07014, N-methyl-N-t-butyldimethylsilyltrifluoroacetamide, ]
----
small_molecule-quantification_unit
⚠️ Changed fields: Description, Example
Field v2.0 v2.1

Description

Defines what type of units are reported in the small molecule summary quantification / abundance fields.

Defines what type of units are reported in the small molecule summary quantification / abundance fields

Type

Parameter

Parameter

Mandatory

True

True

Nullable

Example

MTD small_molecule-quantification_unit [MS, MS:1002887, Progenesis QI normalised abundance, ]
----
MTD	small_molecule-quantification_unit	[MS, MS:1001113, peak area, ]
----
small_molecule_feature-quantification_unit
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

Defines what type of units are reported in the small molecule feature quantification / abundance fields.

Defines what type of units are reported in the small molecule feature quantification / abundance fields.

Type

Parameter

Parameter

Mandatory

True (if SMF section is being reported)

False

Nullable

Example

MTD small_molecule_feature-quantification_unit [MS, MS:1002887, Progenesis QI normalised abundance, ]
----
MTD	small_molecule_feature-quantification_unit	[MS, MS:1001113, peak area, ]
----
small_molecule-identification_reliability
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The system used for giving reliability / confidence codes to small molecule identifications MUST be specified if not using the default codes (see [reliability] for details).

The system used for giving reliability / confidence codes to small molecule identifications MUST be specified if not using the default codes.

Type

Param

Parameter

Mandatory

False

False

Nullable

Example

MTD small_molecule-identification_reliability [MS, MS:1002896, compound identification confidence level, ]
or
MTD small_molecule-identification_reliability [MS, MS:1002955, hr-ms compound identification confidence level, ]
----
MTD	small_molecule-identification_reliability	[MS, MS:1000932, identification reliability, ]
----
id_confidence_measure[1-n]
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

The type of small molecule confidence measures or scores MUST be reported as a CV parameter [1-n]. The CV parameter definition should formally state whether the ordering is high to low or vice versa. The order of the scores SHOULD reflect their importance for the identification and be used to determine the identification’s rank.

Small molecule identification confidence metrics.

Scoring System
- Use CV parameters numbered [1-n]
- Define score direction (high-to-low or low-to-high)
- Order by importance for identification ranking

Scores determine confidence in molecular identifications

Type

Parameter

Parameter List

Mandatory

True

True

Nullable

Example

MTD	id_confidence_measure[1]	[MS,MS:1002889,Progenesis MetaScope Score,]
MTD	id_confidence_measure[2]	[MS,MS:1002890,fragmentation score,]
MTD	id_confidence_measure[3]	[MS,MS:1002891,isotopic fit score,]
----
MTD	id_confidence_measure[1]	[MS,MS:1002890,fragmentation score,]
MTD	id_confidence_measure[2]	[MS,MS:1002891,retention time score,]
----
colunit-small_molecule
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

Defines the used unit for a column in the small molecule section. The format of the value has to be {column name}={Parameter defining the unit}.

This field MUST NOT be used to define a unit for quantification columns. The unit used for small molecule quantification values MUST be set in small_molecule-quantification_unit.

Unit definitions for small molecule data columns.

Format
- Pattern: {column_name}={unit_parameter}
- Use CV parameters for units when possible

Important Notes
- Not for quantification columns
- Use small_molecule-quantification_unit for quantification values

Type

String

Column Parameter Mapping List

Mandatory

False

False

Nullable

Example

MTD colunit-small_molecule opt_global_cv_MS:MS:1002954_collisional_cross_sectional_area=[UO,UO:00003241, square angstrom,]
----
MTD	colunit-small_molecule	retention_time=[UO,UO:0000031,minute,]
MTD	colunit-small_molecule	mass=[UO,UO:0000221,dalton,]
----
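The `{column_name}={unit_parameter}` pattern can be split at the first equals sign. The following is a sketch only: `parse_colunit` is a hypothetical helper, and treating every column whose name starts with `abundance_` as a quantification column is an assumption made for illustration.

```python
from typing import Tuple


def parse_colunit(value: str) -> Tuple[str, Tuple[str, ...]]:
    """Split 'column=[CV, accession, name, value]' into column and unit fields."""
    column, sep, unit = value.partition("=")
    column, unit = column.strip(), unit.strip()
    if not sep or not column:
        raise ValueError(f"expected column=unit, got {value!r}")
    # Assumption: quantification columns are those prefixed with 'abundance_'.
    if column.startswith("abundance_"):
        raise ValueError("units for quantification columns belong in "
                         "small_molecule-quantification_unit")
    if not (unit.startswith("[") and unit.endswith("]")):
        raise ValueError(f"unit is not a parameter: {unit!r}")
    fields = tuple(f.strip() for f in unit[1:-1].split(",", 3))
    if len(fields) != 4:
        raise ValueError(f"expected 4 parameter fields: {unit!r}")
    return column, fields
```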
colunit-small_molecule_feature
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

Defines the used unit for a column in the small molecule feature section. The format of the value has to be \{column name}=\{Parameter defining the unit}

This field MUST NOT be used to define a unit for quantification columns. The unit used for small molecule quantification values MUST be set in small_molecule_feature-quantification_unit.

Defines the unit used for a column in the small molecule feature section. The format of the value has to be {column name}={Parameter defining the unit}. This field MUST NOT be used to define a unit for quantification columns. The unit used for small molecule quantification values MUST be set in small_molecule_feature-quantification_unit.

Type

String

Column Parameter Mapping List

Mandatory

False

False

Nullable

Example

MTD colunit-small_molecule_feature opt_ms_run[1]_cv_MS:MS:1002476_ion_mobility_drift_time=[UO,UO:0000031, minute,]
----
MTD	colunit-small_molecule_feature	retention_time=[UO, UO:0000031, minute, ]
----
colunit-small_molecule_evidence
⚠️ Changed fields: Description, Type, Example
Field v2.0 v2.1

Description

Defines the used unit for a column in the small molecule evidence section. The format of the value has to be \{column name}=\{Parameter defining the unit}.

Defines the unit used for a column in the small molecule evidence section. The format of the value has to be {column name}={Parameter defining the unit}.

Type

String

Column Parameter Mapping List

Mandatory

False

False

Nullable

Example

MTD colunit-small_molecule_evidence opt_global_mass_error=[UO, UO:0000169, parts per million, ]
----
MTD	colunit-small_molecule_evidence	retention_time=[UO, UO:0000031, minute, ]
----
contact[1-n]-orcid 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The contact’s ORCID iD, given as the bare identifier without the https:// URL prefix.

Type

(not present)

Regex
----
^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]{1}$
----


Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)
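A value for this field can be checked against the regular expression given under Type. The sketch below is illustrative: `normalize_orcid` is a hypothetical helper, and stripping a leading `https://orcid.org/` before validating is an assumption about how a writer might clean its input. The ORCID check digit is not verified.

```python
import re

# Pattern copied from the Type definition of contact[1-n]-orcid.
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]{1}$")


def normalize_orcid(value: str) -> str:
    """Return the bare ORCID iD, or raise ValueError if it is malformed."""
    value = value.strip()
    # Assumption: inputs may carry the resolver URL, which the field must not.
    for prefix in ("https://orcid.org/", "http://orcid.org/"):
        if value.startswith(prefix):
            value = value[len(prefix):]
            break
    if not ORCID_PATTERN.fullmatch(value):
        raise ValueError(f"not a valid ORCID iD: {value!r}")
    return value
```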

ms_run[1-n]-parameters[1-n] 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

Additional parameters of the MS run, separated by bars.

Type

(not present)

Parameter List

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

----
MTD	ms_run[1]-parameter[1]	[MS, MS:1000031, instrument model, [MS, MS:1000449, LTQ Orbitrap,]]
----
assay[1-n]-protocol_refs[1-n] 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The protocol(s) referenced by this assay.

Type

(not present)

Integer List

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

----
MTD	assay[1]-protocol_ref	protocol[1]|protocol[2]
----
assay[1-n]-parameters[1-n] 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

Additional parameters of the assay, separated by bars.

Type

(not present)

Parameter List

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

----
MTD	assay[1]-parameter[1]	[MS, MS:1000031, instrument model, [MS, MS:1000449, LTQ Orbitrap,]]
----
study_variable[1-n]-ms_run_refs[1-n] 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The ms run(s) referenced by this study variable.

Type

(not present)

Integer List

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

----
MTD	study_variable[1]-ms_run_ref	ms_run[1]|ms_run[2]
----
study_variable[1-n]-group_refs[1-n] 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The study variable group this study variable belongs to.

Type

(not present)

Integer List

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

----
MTD	study_variable[1]-group_ref	study_variable_group[1]|study_variable_group[2]
----
protocol[1-n]-name 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The protocol name.

Type

(not present)

String

Mandatory

(not present)

True

Nullable

(not present)

Example

(not present)

protocol[1-n]-type 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The protocol type, as defined by the parameter.

Type

(not present)

Parameter

Mandatory

(not present)

True

Nullable

(not present)

Example

(not present)

protocol[1-n]-description 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

Description of the protocol.

Type

(not present)

String

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

protocol[1-n]-parameters[1-n] 🟢 (Added in v2.1)
Field v2.0 v2.1

Description

(not present)

The protocol parameters.

Type

(not present)

Parameter List

Mandatory

(not present)

False

Nullable

(not present)

Example

(not present)

8.3. Small Molecule (SML) Section

8.3.1. Element Details

SML_ID
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A within file unique identifier for the small molecule.

A within file unique identifier for the small molecule summary.

Type

Integer

Integer

Mandatory

True

Nullable

False

False

Example

SMH SML_ID …
SML 1 …
SML 2 …
----
SMH	...	SML_ID	...
SML	...	1	...
----
SMF_ID_REFS
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

References to all the features on which quantitation has been based (SMF elements) via referencing SMF_ID values. Multiple values SHOULD be provided as a “{vbar}” separated list. This MAY be null only if this is a Summary file.

References to the small molecule features (SMF elements) via referencing SMF_ID values. Multiple values MAY be provided as a {vbar} separated list to indicate which features were used to aggregate the SML row.

Type

{SMF_ID} list

Integer List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID SMF_ID_REFS
SML 1 2|3|11…
----
SMH	...	SMF_ID_REFS	...
SML	...	2|3|11	...
----
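A reader might resolve these references as sketched below. This assumes the `{vbar}` placeholder stands for the vertical bar character `|`; `parse_id_refs` and `dangling_refs` are hypothetical helper names.

```python
from typing import Iterable, List


def parse_id_refs(cell: str) -> List[int]:
    """Turn a bar-separated reference cell into a list of integer IDs."""
    cell = cell.strip()
    if cell in ("", "null"):
        return []
    return [int(part) for part in cell.split("|")]


def dangling_refs(refs: Iterable[int], known_ids: Iterable[int]) -> List[int]:
    """Return the references that do not point at any known SMF_ID."""
    known = set(known_ids)
    return [ref for ref in refs if ref not in known]
```

Checking for dangling references after parsing catches SML rows that point at feature rows missing from the SMF table.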
database_identifier
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A list of “{vbar}” separated possible identifiers for the small molecule; multiple values MUST only be provided to indicate ambiguity in the identification of the molecule and not to demonstrate different identifier types for the same molecule. Alternative identifiers for the same molecule MAY be provided as optional columns.

The database identifier must be preceded by the resource description (prefix) followed by a colon, as specified in the metadata section.

A null value MAY be provided if the identification is sufficiently ambiguous as to be meaningless for reporting or the small molecule has not been identified.

A list of {vbar} separated possible identifiers for the small molecule; multiple values MUST only be provided to indicate ambiguity in the identification of the molecule and not to demonstrate different identifier types for the same molecule. Alternative identifiers for the same molecule MAY be provided as optional columns. The database identifier must be preceded by the resource description (prefix) followed by a colon, as specified in the metadata section. A null value MAY be provided if the identification is sufficiently ambiguous as to be meaningless for reporting or the small molecule has not been identified.

Type

String List

String List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID database_identifier …
SML 1 CID:00027395 …
SML 2 HMDB:HMDB0001847
SML 3 null
----
SMH	...	database_identifier	...
SML	...	CID:00027395|HMDB:HMDB0001847	...
----
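The prefix rule can be illustrated with a short sketch. It assumes the `{vbar}` placeholder stands for `|` and that the set of prefixes declared in the metadata section has already been collected by the caller; `parse_database_identifiers` is a hypothetical name.

```python
from typing import Iterable, List, Tuple


def parse_database_identifiers(
    cell: str, known_prefixes: Iterable[str]
) -> List[Tuple[str, str]]:
    """Split a database_identifier cell into (prefix, identifier) pairs."""
    cell = cell.strip()
    if cell == "null":
        return []
    allowed = set(known_prefixes)
    pairs = []
    for entry in cell.split("|"):
        prefix, sep, identifier = entry.strip().partition(":")
        if not sep or not identifier:
            raise ValueError(f"missing prefix or identifier: {entry!r}")
        if prefix not in allowed:
            raise ValueError(f"prefix {prefix!r} is not declared in the metadata")
        pairs.append((prefix, identifier))
    return pairs
```

More than one returned pair signals an ambiguous identification, not alternative identifiers for the same molecule.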
chemical_formula
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A list of “{vbar}” separated potential chemical formulae of the reported compound. The number of values provided MUST match the number of entities reported under “database_identifier”, even if this leads to redundant reporting of information (i.e. if ambiguity can be resolved in the chemical formula), and the validation software will throw an error if the number of “{vbar}” symbols does not match. “null” values between bars are allowed.

This should be specified in Hill notation (EA Hill 1900), i.e. elements in the order C, H and then alphabetically all other elements. Counts of one may be omitted. Elements should be capitalized properly to avoid confusion (e.g., “CO” vs. “Co”). The chemical formula reported should refer to the neutral form.

Example: N-acetylglucosamine would be encoded by the string “C8H15NO6”.

The chemical formula of the identified compound e.g. in a database, assumed to match the theoretical mass to charge (in some cases this will be the derivatized form, including adducts and protons). This should be specified in Hill notation (EA Hill 1900), i.e. elements in the order C, H and then alphabetically all other elements. Counts of one may be omitted. Elements should be capitalized properly to avoid confusion (e.g., “CO” vs. “Co”). The chemical formula reported should refer to the neutral form. Charge state is reported by the charge field in the SME and SMF section.
Example: N-acetylglucosamine would be encoded by the string “C8H15NO6”.

Type

String List

String List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … chemical_formula …
SML 1 … C17H20N4O2 …
----
SMH	...	chemical_formula	...
SML	...	C17H20N4O2	...
----
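The Hill ordering described above can be sketched as follows. `hill_formula` is a hypothetical helper that takes element counts. For carbon-free formulae it falls back to purely alphabetical order, which is the usual Hill convention but is not stated in this section.

```python
from typing import Dict


def hill_formula(counts: Dict[str, int]) -> str:
    """Format element counts in Hill notation, omitting counts of one."""
    present = sorted(el for el, n in counts.items() if n > 0)
    if "C" in present:
        # Carbon first, then hydrogen, then the remaining elements A-Z.
        ordered = ["C"]
        if "H" in present:
            ordered.append("H")
        ordered += [el for el in present if el not in ("C", "H")]
    else:
        # Usual Hill convention without carbon: strictly alphabetical.
        ordered = present
    return "".join(
        el + (str(counts[el]) if counts[el] != 1 else "") for el in ordered
    )
```

For example, `hill_formula({"O": 6, "N": 1, "H": 15, "C": 8})` yields the N-acetylglucosamine string quoted in the description.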
smiles
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A list of “{vbar}” separated potential molecule structures in the simplified molecular-input line-entry system (SMILES) for the small molecule. The number of values provided MUST match the number of entities reported under “database_identifier”, and the validation software will throw an error if the number of “{vbar}” symbols does not match. “null” values between bars are allowed.

The potential molecule’s structure in the simplified molecular-input line-entry system (SMILES) for the small molecule.

Type

String List

String List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … chemical_formula smiles …
SML 1 … C17H20N4O2 C1=CC=C(C=C1)CCNC(=O)CCNNC(=O)C2=CC=NC=C2 …
----
SMH	...	smiles	...
SML	...	C1=CC=C(C=C1)CCNC(=O)CCNNC(=O)C2=CC=NC=C2	...
----
inchi
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A list of “{vbar}” separated potential standard IUPAC International Chemical Identifier (InChI) of the given substance.

The number of values provided MUST match the number of entities reported under “database_identifier”, even if this leads to redundant information being reported (i.e. if ambiguity can be resolved in the InChI), and the validation software will throw an error if the number of “{vbar}” symbols does not match. “null” values between bars are allowed.

A standard IUPAC International Chemical Identifier (InChI) for the given substance.

Type

String List

String List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … chemical_formula … inchi …
SML 1 … C17H20N4O2 … InChI=1S/C17H20N4O2/c22-16(19-12-6-14-4-2-1-3-5-14)9-13-20-21-17(23)15-7-10-18-11-8-15/h1-5,7-8,10-11,20H,6,9,12-13H2,(H,19,22)(H,21,23) …
----
SMH	...	inchi	...
SML	...	InChI=1S/C17H20N4O2/c22-16(19-12-6-14-4-2-1-3-5-14)9-13-20-21-17(23)15-7-10-18-11-8-15/h1-5,7-8,10-11,20H,6,9,12-13H2,(H,19,22)(H,21,23)	...
----
chemical_name
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A list of “{vbar}” separated possible chemical/common names for the small molecule, or general description if a chemical name is unavailable. Multiple names are only to demonstrate ambiguity in the identification. The number of values provided MUST match the number of entities reported under “database_identifier”, and the validation software will throw an error if the number of “{vbar}” symbols does not match. “null” values between bars are allowed.

The small molecule’s chemical/common name, or general description if a chemical name is unavailable.

Type

String List

String List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … chemical_name …
SML 1 … N-(2-phenylethyl)-3-[2-(pyridine-4-carbonyl)hydrazinyl]propanamide …
----
SMH	...	chemical_name	...
SML	...	N-(2-phenylethyl)-3-[2-(pyridine-4-carbonyl)hydrazinyl]propanamide	...
----
uri
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

A URI pointing to the small molecule’s entry in a reference database (e.g., the small molecule’s HMDB or KEGG entry). The number of values provided MUST match the number of entities reported under “database_identifier”, and the validation software will throw an error if the number of “{vbar}” symbols does not match. “null” values between bars are allowed.

A URI pointing to the small molecule’s entry in a database (e.g., the small molecule’s HMDB, ChEBI or KEGG entry).

Type

URI List

String List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … uri …
SML 1 … http://www.genome.jp/dbget-bin/www_bget?cpd:C00031 …
SML 2 … http://www.hmdb.ca/metabolites/HMDB0001847 …
SML 3 … http://identifiers.org/hmdb/HMDB0001847 …
----
SMH	...	uri	...
SML	...	http://www.genome.jp/dbget-bin/www_bget?cpd:C00031	...
SML	...	http://www.hmdb.ca/metabolites/HMDB0001847	...
----
theoretical_neutral_mass
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The small molecule’s precursor’s theoretical neutral mass.

The number of values provided MUST match the number of entities reported under “database_identifier”, and the validation software will throw an error if the number of “{vbar}” symbols does not match. “null” values (in general and between bars) are allowed for molecules that have not been identified only, or for molecules where the neutral mass cannot be calculated. In these cases, the SML entry SHOULD reference features in which exp_mass_to_charge values are captured.

The theoretical neutral mass of the small molecule. This should be calculated from the chemical formula.

Type

Double List

Double List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … theoretical_neutral_mass …
SML 1 … 1234.5 …
----
SMH	...	theoretical_neutral_mass	...
SML	...	1234.5	...
----
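Calculating the value from the chemical formula can be sketched as below. This assumes the monoisotopic mass is intended, which the description does not state. `neutral_mass` is a hypothetical helper and the element table is a small illustrative subset, so unknown elements are rejected and not guessed.

```python
import re

# Monoisotopic masses in dalton for a few common elements (illustrative subset).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "P": 30.97376199842,
    "S": 31.9720711744,
}

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def neutral_mass(formula: str) -> float:
    """Sum element masses for a simple formula such as 'C8H15NO6'."""
    tokens = _TOKEN.findall(formula)
    # Reject anything that is not a plain sequence of element/count tokens.
    if not formula or "".join(el + n for el, n in tokens) != formula:
        raise ValueError(f"cannot parse formula: {formula!r}")
    total = 0.0
    for element, count in tokens:
        if element not in MONOISOTOPIC_MASS:
            raise ValueError(f"no mass available for element {element!r}")
        total += MONOISOTOPIC_MASS[element] * (int(count) if count else 1)
    return total
```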
adduct_ions
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

A “{vbar}” separated list of detected adducts for this molecule, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-. If the adduct classification is ambiguous with regard to identification evidence it MAY be null.

A {vbar} separated list of the detected adduct ion forms for this small molecule. The terms should follow the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-.

Type

Regex List
----
\[\d*M([+-][\w\d]+)*\]\d*[+-]
----

Regex List
----
^\[\d*M([+-][\w\d]+)*\]\d*[+-]$
----

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … adduct_ions …
SML 1 … [M+H]1+ | [M+Na]1+ …
----
SMH	...	adduct_ions	...
SML	...	[M+H]1+|[M+Na]1+	...
----
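A validation sketch for this column follows. It assumes the intended pattern is `^\[\d*M([+-][\w\d]+)*\]\d*[+-]$`, a reading that is consistent with the examples such as `[M+H]1+` and `[M+2Na]2+`, and that `{vbar}` stands for `|`. `parse_adduct_ions` is a hypothetical name.

```python
import re
from typing import List

# Assumed reading of the adduct pattern; see the lead-in for the caveat.
ADDUCT_PATTERN = re.compile(r"^\[\d*M([+-][\w\d]+)*\]\d*[+-]$")


def parse_adduct_ions(cell: str) -> List[str]:
    """Split an adduct_ions cell and check every entry against the pattern."""
    cell = cell.strip()
    if cell == "null":
        return []
    adducts = [part.strip() for part in cell.split("|")]
    for adduct in adducts:
        if not ADDUCT_PATTERN.fullmatch(adduct):
            raise ValueError(f"malformed adduct ion: {adduct!r}")
    return adducts
```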
reliability
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The reliability of the given small molecule identification. This must be supplied by the resource and MUST be reported as an integer between 1-4:

. identified metabolite (1)
. putatively annotated compound (2)
. putatively characterized compound class (3)
. unknown compound (4)

These MAY be replaced using a suitable CV term in the metadata section e.g. to use MSI recommendation levels (see Section 7.2.71 for details).

The following CV terms are already available within the PSI MS CV. Future schemes may be implemented by extending the PSI MS CV with new terms and associated levels.

The MSI has recently discussed an extension of the original four level scheme into a five level scheme MS:1002896 (compound identification confidence level) with levels

["arabic", start=0]
. isolated, pure compound, full stereochemistry (0)
. reference standard match or full 2D structure (1)
. unambiguous diagnostic evidence (literature, database) (2)
. most likely structure, including isomers, substance class or substructure match (3)
. unknown compound (4)

For high-resolution MS, the following term and its levels may be used: MS:1002955 (hr-ms compound identification confidence level) with levels

["arabic", start=1]
. confirmed structure (1)
. probable structure (2)
[loweralpha]
.. unambiguous ms library match (2a)
.. diagnostic evidence (2b)
. tentative candidates (3)
. unequivocal molecular formula (4)
. exact mass (5)

A String data type is set to allow for different systems to be specified in the metadata section.

The reliability of the given small molecule identification. This must be supplied by the resource and should be reported as an integer between 1-4:

1: identified, rigorous. …​
2: identified. …​
3: putatively characterized class. …​
4: unknown. …​

Type

String

String

Mandatory

False

Nullable

True

True

Example

SMH identifier … reliability …
SML 1 … 3 …
or
MTD small_molecule-identification_reliability [MS, MS:1002896, compound identification confidence level,]
…
SMH identifier … reliability …
SML 1 … 0 …
or
MTD small_molecule-identification_reliability [MS, MS:1002955, hr-ms compound identification confidence level,]
…
SMH identifier … reliability …
SML 1 … 2a …
----
SMH	...	reliability	...
SML	...	3	...
SML	...	0	...
----
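Because the allowed values depend on the scheme declared in `small_molecule-identification_reliability`, a validator needs one level set per scheme. The sketch below takes its level sets from the lists in the description. The name `is_valid_reliability` and the decision to accept a bare `2` alongside `2a` and `2b` for MS:1002955 are assumptions.

```python
from typing import Optional

# Level sets transcribed from the description; None is the default 1-4 scheme.
_LEVELS = {
    None: {"1", "2", "3", "4"},
    "MS:1002896": {"0", "1", "2", "3", "4"},
    "MS:1002955": {"1", "2", "2a", "2b", "3", "4", "5"},
}


def is_valid_reliability(value: str, scheme_accession: Optional[str] = None) -> bool:
    """Check a reliability cell against the scheme declared in the metadata."""
    value = value.strip()
    if value == "null":
        return True  # the column is nullable
    if scheme_accession not in _LEVELS:
        raise ValueError(f"unknown reliability scheme: {scheme_accession!r}")
    return value in _LEVELS[scheme_accession]
```

Keeping the value as a string, as the String type requires, allows sub-levels such as `2a` to be represented.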
best_id_confidence_measure
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The approach or database search that identified this small molecule with highest confidence.

The small molecule confidence measure/score of the best identification for this small molecule summary. The type of the value is defined by the best_id_confidence_measure CV parameter. The value is reported in the best_id_confidence_value column.

Type

Parameter

Parameter

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … best_id_confidence_measure …
SML 1 … [MS, MS:1001477, SpectraST,] …
----
SMH	...	best_id_confidence_measure	...
SML	...	[MS, MS:1001477, SpectraST,]	...
----
best_id_confidence_value
⚠️ Changed fields: Description, Mandatory, Nullable, Example
Field v2.0 v2.1

Description

The best confidence measure in identification (for this type of score) for the given small molecule across all assays. The type of score MUST be defined in the metadata section. If the small molecule was not identified by the specified search engine, “null” MUST be reported. If the confidence measure does not report a numerical confidence value, “null” SHOULD be reported.

The small molecule confidence measure/score value of the best identification for this small molecule summary.

Type

Double

Double

Mandatory

True

Nullable

True

False

Example

SMH SML_ID … best_id_confidence_value …
SML 1 … 0.7 …
----
SMH	...	best_id_confidence_value	...
SML	...	0.85	...
----
abundance_assay[1-n]
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

The small molecule’s abundance in every assay described in the metadata section MUST be reported. Null or zero values may be reported as appropriate. "null" SHOULD be used to report missing quantities, while zero SHOULD be used to indicate a present but not reliably quantifiable value (e.g. below a minimum noise threshold).

The small molecule’s abundance in every assay described in the metadata section MUST be reported. Null or zero values may be reported as appropriate.

Type

Double

Double List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … abundance_assay[1] …
SML 1 … 0.3 …
----
SMH	...	abundance_assay[1]	...
SML	...	12340	...
----
abundance_study_variable[1-n]
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

The small molecule’s abundance in all the study variables described in the metadata section (study_variable[1-n]_average_function), calculated using the method as described in the Metadata section (default = arithmetic mean across assays). Null or zero values may be reported as appropriate. "null" SHOULD be used to report missing quantities, while zero SHOULD be used to indicate a present but not reliably quantifiable value (e.g. below a minimum noise threshold).

The small molecule’s abundance in every study variable described in the metadata section. Null or zero values may be reported as appropriate.

Type

Double

Double List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … abundance_study_variable[1] …
SML 1 … 0.3 …
----
SMH	...	abundance_study_variable[1]	...
SML	...	1230	...
----
abundance_variation_study_variable[1-n]
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

A measure of the variability of the study variable abundance measurement, calculated using the method as described in the metadata section (study_variable[1-n]_average_function), with a default = arithmetic coefficient of variation of the small molecule’s abundance in the given study variable.

The small molecule’s abundance variation in every study variable described in the metadata section. Null or zero values may be reported as appropriate.

Type

Double

Double List

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … abundance_study_variable[1] abundance_variation_study_variable[1]…
SML 1 … 0.3 0.04 …
----
SMH	...	abundance_variation_study_variable[1]	...
SML	...	0.2	...
----
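The v2.0 defaults (arithmetic mean across assays and coefficient of variation) can be sketched as below. The helper names are hypothetical. Skipping "null" cells and using the sample standard deviation for the coefficient of variation are assumptions, since the text does not fix either choice.

```python
import statistics
from typing import Optional, Sequence, Tuple


def parse_abundance(cell: str) -> Optional[float]:
    """Map 'null' to None (missing); keep zero as a real, reported value."""
    cell = cell.strip()
    return None if cell == "null" else float(cell)


def summarise_study_variable(
    assay_cells: Sequence[str],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (mean abundance, coefficient of variation) over non-null assays."""
    values = [v for v in map(parse_abundance, assay_cells) if v is not None]
    if not values:
        return None, None
    mean = statistics.mean(values)
    if len(values) < 2 or mean == 0:
        return mean, None  # variation is undefined in these cases
    # Assumption: sample standard deviation divided by the mean.
    return mean, statistics.stdev(values) / mean
```

Keeping `None` distinct from `0.0` preserves the difference between a missing quantity and a value that is present but not reliably quantifiable.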
opt_{identifier}_*
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

Additional columns can be added to the end of the small molecule table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: ‘A’-‘Z’, ‘a’-‘z’, ‘0’-‘9’, ‘_’, ‘-’, ‘[’, ‘]’, and ‘:’. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by ‘_’.

Additional columns can be added to the end of the small molecule table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: 'A'-'Z', 'a'-'z', '0'-'9', '_', '-', '[', ']', and ':'. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by '_'.

Type

Column

Optional Column

Mandatory

False

Nullable

True

True

Example

SMH SML_ID … opt_assay[1]_my_value … opt_global_another_value
SML 1 … My value … some other value
----
SMH	...	opt_global_cv_MS:1002217_decoy_peptide	...
SML	...	null	...
----
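The naming rules for optional columns lend themselves to a small check. In this sketch the identifier spellings `assay[n]`, `study_variable[n]` and `ms_run[n]` are assumed from the metadata element names, and both helper names are hypothetical.

```python
import re

# 'opt_' + object identifier + '_' + name built from the permitted characters.
OPT_COLUMN = re.compile(
    r"^opt_(global|assay\[\d+\]|study_variable\[\d+\]|ms_run\[\d+\])"
    r"_[A-Za-z0-9_\-\[\]:]+$"
)


def is_valid_opt_column(name: str) -> bool:
    """Check an optional column header against the naming rules."""
    return OPT_COLUMN.fullmatch(name) is not None


def build_cv_column(identifier: str, accession: str, parameter_name: str) -> str:
    """Build opt_{identifier}_cv_{accession}_{parameter name} with spaces replaced."""
    name = f"opt_{identifier}_cv_{accession}_{parameter_name.replace(' ', '_')}"
    if not is_valid_opt_column(name):
        raise ValueError(f"resulting column name is not valid: {name!r}")
    return name
```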

8.4. Small Molecule Feature (SMF) Section

8.4.1. Element Details

SMF_ID
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

A within file unique identifier for the small molecule feature.

A within file unique identifier for the small molecule feature.

Type

Integer

Integer

Mandatory

True

Nullable

False

False

Example

SFH SMF_ID …
SMF 1 …
SMF 2 …
----
SFH	...	SMF_ID	...
SMF	...	1	...
----
SME_ID_REFS
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

References to the identification evidence (SME elements) via referencing SME_ID values. Multiple values MAY be provided as a “{vbar}” separated list to indicate ambiguity in the identification or to indicate that different types of data supported the identification (see SME_ID_REF_ambiguity_code). For the case of a consensus approach where multiple adduct forms are used to infer the SML ID, different features should just reference the same SME_ID value(s).

References to the identification evidence (SME elements) via referencing SME_ID values. Multiple values MAY be provided as a {vbar} separated list to indicate ambiguity in the identification or to indicate that different types of data supported the identification (see SME_ID_REF_ambiguity_code). For the case of a consensus approach where multiple adduct forms are used to infer the SML ID, different features should just reference the same SME_ID value(s).

Type

{SME_ID} list

Integer List

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID SME_ID_REFS
SMF 1 5|6|12…
----
SFH	...	SME_ID_REFS	...
SMF	...	5|6|12	...
----
SME_ID_REF_ambiguity_code
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

If multiple values are given under SME_ID_REFS, one of the following codes MUST be provided. 1=Ambiguous identification; 2=Only different evidence streams for the same molecule with no ambiguity; 3=Both ambiguous identification and multiple evidence streams. If there is no value or only one value under SME_ID_REFS, this MUST be reported as null.

If multiple values are given under SME_ID_REFS, one of the following codes MUST be provided. 1=Ambiguous identification; 2=Only different evidence streams for the same molecule with no ambiguity; 3=Both ambiguous identification and multiple evidence streams. If there is no value or only one value under SME_ID_REFS, this MUST be reported as null.

Type

Integer

Integer

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID SME_ID_REFS SME_ID_REF_ambiguity_code
SMF 1 5|6|12… 1
----
SFH	...	SME_ID_REF_ambiguity_code	...
SMF	...	1	...
----
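The dependency between SME_ID_REFS and the ambiguity code can be expressed as a row-level check. This is a sketch assuming `{vbar}` stands for `|`; `ambiguity_code_is_consistent` is a hypothetical name and it works on the raw cell strings.

```python
def ambiguity_code_is_consistent(sme_id_refs: str, ambiguity_code: str) -> bool:
    """Check SME_ID_REF_ambiguity_code against the number of SME_ID_REFS."""
    refs = sme_id_refs.strip()
    code = ambiguity_code.strip()
    count = 0 if refs in ("", "null") else len(refs.split("|"))
    if count > 1:
        # Multiple references: one of the three codes is required.
        return code in ("1", "2", "3")
    # Zero or one reference: the code must be null.
    return code == "null"
```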
adduct_ion
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

The assumed classification of this molecule’s adduct ion after detection, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-, [M+H]1+.

The assumed classification of this molecule’s adduct ion after detection, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-.

Type

Regex
----
\[\d*M([+-][\w\d]+)*\]\d*[+-]
----

String

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … adduct_ion …
SMF 1 … [M+H]+ …
SMF 2 … [M+2Na]2+ …
----
SFH	...	adduct_ion	...
SMF	...	[M+H]1+	...
SMF	...	[M+2Na]2+	...
----
isotopomer
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

If de-isotoping has not been performed, then the isotopomer quantified MUST be reported here e.g. “+1”, “+2”, “13C peak” using CV terms, otherwise (i.e. for approaches where SMF rows are de-isotoped features) this MUST be null.

If de-isotoping has not been performed, then the isotopomer quantified MUST be reported here e.g. “+1”, “+2”, “13C peak” using CV terms, otherwise (i.e. for approaches where SMF rows are de-isotoped features) this MUST be null.

Type

Parameter

Parameter

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … isotopomer …
SMF 1 … [MS,MS:1002957,"isotopomer MS peak","13C peak"] …
----
SFH	...	isotopomer	...
SMF	...	[MS,MS:1002957,"isotopomer MS peak","13C peak"]	...
----
exp_mass_to_charge
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The experimental mass/charge value for the feature, by default assumed to be the mean across assays or a representative value. For approaches that report isotopomers as SMF rows, then the m/z of the isotopomer MUST be reported here.

The experimental mass/charge value for the feature, by default assumed to be the mean across assays or a representative value. For approaches that report isotopomers as SMF rows, then the m/z of the isotopomer MUST be reported here.

Type

Double

Double

Mandatory

True

Nullable

False

False

Example

SFH SMF_ID … exp_mass_to_charge …
SMF 1 … 1234.5 …
----
SFH	...	exp_mass_to_charge	...
SMF	...	1234.5	...
----
charge
⚠️ Changed fields: Description, Mandatory, Nullable, Example
Field v2.0 v2.1

Description

The feature’s charge value, using positive integers for both positive and negative polarity modes. The value is nullable in the SMF table, but if an identification is specified the charge is expected to be known.

The feature’s charge value using positive integers both for positive and negative polarity modes.

Type

Integer

Integer

Mandatory

False

Nullable

True

False

Example

SFH SMF_ID … charge …
SMF 1 … 1 …
----
SFH	...	charge	...
SMF	...	1	...
----
retention_time_in_seconds
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

The apex of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time values for individual MS runs (i.e. before alignment) MAY be reported as optional columns. Retention time SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown. Relative retention time or retention time index values MAY be reported as optional columns, and could be considered for inclusion in future versions of mzTab as appropriate.

The apex of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time values for individual MS runs (i.e. before alignment) MAY be reported as optional columns. Retention time SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown. Relative retention time or retention time index values MAY be reported as optional columns, and could be considered for inclusion in future versions of mzTab as appropriate.

Type

Double

Double

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … retention_time_in_seconds …
SMF 1 … 1345.7 …
----
SFH	...	retention_time_in_seconds	...
SMF	...	1345.7	...
----
retention_time_in_seconds_start
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

The start time of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time start and end SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown and MAY be reported in optional columns.

The start time of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time start and end SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown and MAY be reported in optional columns.

Type

Double

Double

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … retention_time_in_seconds_start …
SMF 1 … 1327.0 …
----
SFH	...	retention_time_in_seconds_start	...
SMF	...	1327	...
----
retention_time_in_seconds_end
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The end time of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time start and end SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown and MAY be reported in optional columns..

The end time of the feature on the retention time axis, in a Master or aggregate MS run. Retention time MUST be reported in seconds. Retention time start and end SHOULD only be null in the case of direct infusion MS or other techniques where a retention time value is absent or unknown and MAY be reported in optional columns.

Type

Double

Double

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … retention_time_in_seconds_end …
SMF 1 … 1327.8 …
----
SFH	...	retention_time_in_seconds_end	...
SMF	...	1327.8	...
----
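The handling of the three retention time columns can be sketched in Python as follows. This is a minimal illustration, assuming the row has already been split on tabs; the helper names `parse_nullable_double` and `minutes_to_seconds` are hypothetical and not part of the specification.

```python
from typing import Optional


def parse_nullable_double(cell: str) -> Optional[float]:
    """Return None for the mzTab 'null' token, otherwise the cell as a float."""
    text = cell.strip()
    if text == "null":
        return None
    return float(text)


def minutes_to_seconds(minutes: float) -> float:
    """Convert a retention time in minutes to the seconds mzTab-M requires."""
    return minutes * 60.0


# Values taken from the example rows above.
apex = parse_nullable_double("1345.7")
start = parse_nullable_double("1327.0")
# Direct infusion MS: no retention time is available, so the cell is null.
infusion = parse_nullable_double("null")
```

A writer whose upstream tool reports minutes would call `minutes_to_seconds` before emitting the value, since the columns MUST hold seconds.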
abundance_assay[1-n]
⚠️ Changed fields: Type, Mandatory, Example
Field v2.0 v2.1

Description

The feature’s abundance in every assay described in the metadata section MUST be reported. Null or zero values may be reported as appropriate.

The feature’s abundance in every assay described in the metadata section MUST be reported. Null or zero values may be reported as appropriate.

Type

Double

Double List

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … abundance_assay[1] …
SMF 1 … 38648 …
----
SFH	...	abundance_assay[1]	...
SMF	...	38648	...
----
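The requirement that every assay has an abundance column can be checked with a short Python sketch. It assumes the SFH header and one SMF row are available as lists of strings, and that the number of assays is known from the metadata section; the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Matches column names such as abundance_assay[1].
ASSAY_COLUMN = re.compile(r"^abundance_assay\[(\d+)\]$")


def abundances_by_assay(header: List[str], row: List[str]) -> Dict[int, Optional[float]]:
    """Map each assay index to its abundance; 'null' cells become None."""
    result: Dict[int, Optional[float]] = {}
    for name, cell in zip(header, row):
        match = ASSAY_COLUMN.match(name)
        if match:
            result[int(match.group(1))] = None if cell == "null" else float(cell)
    return result


def missing_assays(header: List[str], assay_count: int) -> List[int]:
    """Return the assay indices (1..assay_count) that have no abundance column."""
    present = {int(m.group(1)) for m in map(ASSAY_COLUMN.match, header) if m}
    return [i for i in range(1, assay_count + 1) if i not in present]
```

For a file declaring three assays, a non-empty result from `missing_assays(header, 3)` would indicate that the table does not report every assay.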
opt_{identifier}_*
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

Additional columns can be added to the end of the small molecule feature table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: ‘A’-‘Z’, ‘a’-‘z’, ‘0’-‘9’, ‘_’, ‘-’, ‘[’, ‘]’, and ‘:’. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by ‘_’.

Additional columns can be added to the end of the small molecule feature table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: 'A'-'Z', 'a'-'z', '0'-'9', '_', '-', '[', ']', and ':'. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by '_'.

Type

Column

Optional Column

Mandatory

False

Nullable

True

True

Example

SFH SMF_ID … opt_assay[1]_my_value … opt_global_another_value
SMF 1 … My value … some other value
----
SFH	...	opt_assay[1]_my_value	...	opt_global_another_value
SMF	...	My value	...	some other value
----
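The naming rules for optional columns lend themselves to a small validation sketch in Python. The character class below is taken from the description above; the function names are hypothetical, and the accession used in the usage note is borrowed from the id_confidence_measure example elsewhere in this document purely for illustration.

```python
import re

# Characters permitted in column names: A-Z, a-z, 0-9, '_', '-', '[', ']' and ':'.
ALLOWED_NAME = re.compile(r"^[A-Za-z0-9_\-\[\]:]+$")


def is_valid_optional_column(name: str) -> bool:
    """Check the 'opt_' prefix and the permitted character set."""
    return name.startswith("opt_") and ALLOWED_NAME.match(name) is not None


def cv_optional_column(identifier: str, accession: str, parameter_name: str) -> str:
    """Build opt_{identifier}_cv_{accession}_{parameter name}, replacing spaces by '_'."""
    return f"opt_{identifier}_cv_{accession}_{parameter_name.replace(' ', '_')}"
```

For instance, `cv_optional_column("global", "MS:1001419", "SpectraST:discriminant score F")` yields `opt_global_cv_MS:1001419_SpectraST:discriminant_score_F`, which passes `is_valid_optional_column`.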

8.5. Small Molecule Evidence (SME) Section

8.5.1. Element Details

SME_ID
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

A within file unique identifier for the small molecule evidence result.

A within file unique identifier for the small molecule evidence result.

Type

Integer

Integer

Mandatory

True

Nullable

False

False

Example

SEH SME_ID …
SME 1 …
----
SEH	...	SME_ID	...
SME	...	1	...
----
evidence_input_id
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

A within file unique identifier for the input data used to support this identification e.g. fragment spectrum, RT and m/z pair, isotope profile that was used for the identification process, to serve as a grouping mechanism, whereby multiple rows of results from the same input data share the same ID. The identifiers may be human readable but should not be assumed to be interpretable. For example, if fragmentation spectra have been searched then the ID may be the spectrum reference, or for accurate mass search, the ms_run[2]:458.75.

A within file unique identifier for the input data used to support this identification e.g. fragment spectrum, RT and m/z pair, isotope profile that was used for the identification process, to serve as a grouping mechanism, whereby multiple rows of results from the same input data share the same ID. The identifiers may be human readable but should not be assumed to be interpretable. For example, if fragmentation spectra have been searched then the ID may be the spectrum reference, or for accurate mass search, the ms_run[2]:458.75.

Type

String

String

Mandatory

True

Nullable

False

False

Example

SEH SME_ID evidence_input_id …
SME 1 ms_run[1]:mass=278.65;rt=376.5
SME 2 ms_run[1]:mass=278.65;rt=376.5
SME 3 ms_run[1]:mass=278.65;rt=376.5
(in this example three identifications were made from the same accurate mass/RT library search)
----
SEH	...	evidence_input_id	...
SME	...	ms_run[1]:mass=278.65;rt=376.5	...
----
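The grouping role of evidence_input_id can be illustrated with a minimal Python sketch. It assumes each SME row has been reduced to an (SME_ID, evidence_input_id) pair, and it treats the identifier as an opaque string, since its content should not be assumed to be interpretable. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_evidence_input(rows: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Collect the SME_ID values that share the same evidence_input_id."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for sme_id, evidence_input_id in rows:
        # The identifier is compared verbatim and never parsed.
        groups[evidence_input_id].append(sme_id)
    return dict(groups)
```

Applied to the three example rows above, all SME_ID values fall into a single group, reflecting that they originate from the same accurate mass/RT library search.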
database_identifier
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The putative identification for the small molecule sourced from an external database, using the same prefix specified in database[1-n]-prefix.

This could include additionally a chemical class or an identifier to a spectral library entity, even if its actual identity is unknown.

For the “no database” case, "null" must be used. The unprefixed use of "null" is prohibited for any other case.
If no putative identification can be reported for a particular database, it MUST be reported as the database prefix followed by null.

The putative identification for the small molecule sourced from an external database, using the same prefix specified in database[1-n]-prefix. This could include additionally a chemical class or an identifier to a spectral library entity, even if its actual identity is unknown. For the “no database” case, 'null' must be used. The unprefixed use of 'null' is prohibited for any other case. If no putative identification can be reported for a particular database, it MUST be reported as the database prefix followed by null.

Type

String

String

Mandatory

True

Nullable

True

True

Example

SEH SME_ID database_identifier …
SME 1 CID:00027395 …
SME 2 HMDB:HMDB12345 …
SME 3 CID:null …
----
SEH	...	database_identifier	...
SME	...	CID:00027395	...
----
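The rules for prefixed and unprefixed null values can be sketched in Python as below. This is one possible reading of the description, assuming the set of prefixes declared in database[1-n]-prefix has already been collected from the metadata section; the function name and the returned labels are hypothetical.

```python
from typing import Set


def classify_database_identifier(value: str, prefixes: Set[str]) -> str:
    """Classify a database_identifier cell against the declared prefixes."""
    if value == "null":
        # Unprefixed null is reserved for the "no database" case.
        return "no database"
    prefix, separator, accession = value.partition(":")
    if not separator or prefix not in prefixes:
        return "invalid"
    if accession == "null":
        # Prefix followed by null: no putative identification in that database.
        return "no identification"
    return "identified" if accession else "invalid"
```

With the prefixes `{"CID", "HMDB"}`, the cell `CID:null` is classified as "no identification", whereas a bare `null` is classified as "no database".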
chemical_formula
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The chemical formula of the identified compound e.g. in a database, assumed to match the theoretical mass to charge (in some cases this will be the derivatized form, including adducts and protons).

This should be specified in Hill notation (EA Hill 1900), i.e. elements in the order C, H and then alphabetically all other elements. Counts of one may be omitted. Elements should be capitalized properly to avoid confusion (e.g., “CO” vs. “Co”). The chemical formula reported should refer to the neutral form. Charge state is reported by the charge field.

Example: N-acetylglucosamine would be encoded by the string “C8H15NO6”.

The chemical formula of the identified compound e.g. in a database, assumed to match the theoretical mass to charge (in some cases this will be the derivatized form, including adducts and protons). This should be specified in Hill notation (EA Hill 1900), i.e. elements in the order C, H and then alphabetically all other elements. Counts of one may be omitted. Elements should be capitalized properly to avoid confusion (e.g., “CO” vs. “Co”). The chemical formula reported should refer to the neutral form. Charge state is reported by the charge field. Example: N-acetylglucosamine would be encoded by the string “C8H15NO6”.

Type

String

String

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … chemical_formula …
SME 1 … C17H20N4O2 …
----
SEH	...	chemical_formula	...
SME	...	C17H20N4O2	...
----
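The Hill ordering described above can be sketched in Python as follows. The function name `hill_formula` is hypothetical. The sketch also applies the usual Hill convention that formulas without carbon are ordered purely alphabetically; this is an assumption that goes beyond the wording of this section.

```python
from typing import Dict


def hill_formula(counts: Dict[str, int]) -> str:
    """Render element counts as a formula string in Hill notation."""
    elements = sorted(counts)
    if "C" in counts:
        # Carbon first, then hydrogen, then all other elements alphabetically.
        ordered = ["C"] + (["H"] if "H" in counts else [])
        ordered += [e for e in elements if e not in ("C", "H")]
    else:
        # Assumption: without carbon, all elements (including H) are alphabetical.
        ordered = elements
    # Counts of one are omitted.
    return "".join(e + (str(counts[e]) if counts[e] != 1 else "") for e in ordered)
```

Calling `hill_formula({"O": 6, "N": 1, "H": 15, "C": 8})` gives `C8H15NO6`, the N-acetylglucosamine string quoted in the description.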
smiles
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The potential molecule’s structure in the simplified molecular-input line-entry system (SMILES) for the small molecule.

The potential molecule’s structure in the simplified molecular-input line-entry system (SMILES) for the small molecule.

Type

String

String

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … chemical_formula smiles …
SME 1 … C17H20N4O2 C1=CC=C(C=C1)CCNC(=O)CCNNC(=O)C2=CC=NC=C2 …
----
SEH	...	smiles	...
SME	...	C1=CC=C(C=C1)CCNC(=O)CCNNC(=O)C2=CC=NC=C2	...
----
inchi
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

A standard IUPAC International Chemical Identifier (InChI) for the given substance.

A standard IUPAC International Chemical Identifier (InChI) for the given substance.

Type

String

String

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … chemical_formula … inchi …
SME 1 … C17H20N4O2 … InChI=1S/C17H20N4O2/c22-16(19-12-6-14-4-2-1-3-5-14)9-13-20-21-17(23)15-7-10-18-11-8-15/h1-5,7-8,10-11,20H,6,9,12-13H2,(H,19,22)(H,21,23) …
----
SEH	...	inchi	...
SME	...	InChI=1S/C17H20N4O2/c22-16(19-12-6-14-4-2-1-3-5-14)9-13-20-21-17(23)15-7-10-18-11-8-15/h1-5,7-8,10-11,20H,6,9,12-13H2,(H,19,22)(H,21,23)	...
----
chemical_name
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The small molecule’s chemical/common name, or general description if a chemical name is unavailable.

The small molecule’s chemical/common name, or general description if a chemical name is unavailable.

Type

String

String

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … chemical_name …
SME 1 … N-(2-phenylethyl)-3-[2-(pyridine-4-carbonyl)hydrazinyl]propanamide …
----
SEH	...	chemical_name	...
SME	...	N-(2-phenylethyl)-3-[2-(pyridine-4-carbonyl)hydrazinyl]propanamide	...
----
uri
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

A URI pointing to the small molecule’s entry in a database (e.g., the small molecule’s HMDB, ChEBI or KEGG entry).

A URI pointing to the small molecule’s entry in a database (e.g., the small molecule’s HMDB, ChEBI or KEGG entry).

Type

URI

URI

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … uri …
SME 1 … http://www.hmdb.ca/metabolites/HMDB00054
----
SEH	...	uri	...
SME	...	http://www.hmdb.ca/metabolites/HMDB00054	...
----
derivatized_form
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

If a derivatized form has been analysed by MS, then the functional group attached to the molecule should be reported here using suitable userParam or CV terms as appropriate.

The derivatized form of the small molecule, if the identification was based on a specific derivative (e.g. 2 TMS). This MUST be specified using CV terms (where possible) otherwise “null”.

Type

Parameter

Parameter

Mandatory

False

Nullable

True

True

Example

COM This example shows a triple substitution with a TMS group (3TMS)
SEH database_identifier … derivatized_form …
SME CID:00027395 … [CHEBI, CHEBI:51088, trimethylsilyl group, 3] …
----
SEH	...	derivatized_form	...
SME	...	[CHEBI, CHEBI:51088, trimethylsilyl group, 3]	...
----
adduct_ion
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

The assumed classification of this molecule’s adduct ion after detection, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-. If the adduct classification is ambiguous with regards to identification evidence it MAY be null.

The assumed classification of this molecule’s adduct ion after detection, following the general style in the 2013 IUPAC recommendations on terms relating to MS e.g. [M+H]1+, [M+Na]1+, [M+NH4]1+, [M-H]1-, [M+Cl]1-. If the adduct classification is ambiguous with regards to identification evidence it MAY be null.

Type

Regex
----
\[\d*M([+-][\w\d]+)*\]\d*[+-]
----

Regex
----
^\[\d*M([+-][\w\d]+)*\]\d*[+-]$
----

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … adduct_ion …
SME 1 … [M+H]+ …
SME 2 … [M+2Na]2+ …
OR (for negative mode):
SME 1 … [M-H]- …
SME 2 … [M+Cl]- …
----
SEH	...	adduct_ion	...
SME	...	[M+H]+	...
----
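The adduct notation can be checked programmatically, as in the Python sketch below. The pattern used is an assumed reading of the anchored regular expression given under Type, namely `^\[\d*M([+-][\w\d]+)*\]\d*[+-]$`, and should be verified against the official schema before being relied upon. The function name is hypothetical.

```python
import re

# Assumed reading of the adduct_ion pattern; not authoritative.
ADDUCT_ION = re.compile(r"^\[\d*M([+-][\w\d]+)*\]\d*[+-]$")


def is_valid_adduct(value: str) -> bool:
    """Return True if the value has the general form [nM+X]z+ or [nM-X]z-."""
    return ADDUCT_ION.match(value) is not None
```

Under this reading, the charge count before the final sign is optional, so both `[M+H]+` and `[M+H]1+` are accepted, while a value lacking the trailing sign, such as `[M+H]`, is rejected.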
exp_mass_to_charge
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The experimental mass/charge value for the precursor ion. If multiple adduct forms have been combined into a single identification event/search, then a single value e.g. for the protonated form SHOULD be reported here.

The experimental mass/charge value for the precursor ion. If multiple adduct forms have been combined into a single identification event/search, then a single value e.g. for the protonated form SHOULD be reported here.

Type

Double

Double

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … exp_mass_to_charge …
SME 1 … 1234.5 …
----
SEH	...	exp_mass_to_charge	...
SME	...	1234.5	...
----
charge
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

The small molecule evidence’s charge value using positive integers both for positive and negative polarity modes.

The small molecule evidence’s charge value using positive integers both for positive and negative polarity modes.

Type

Integer

Integer

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … charge …
SME 1 … 1 …
----
SEH	...	charge	...
SME	...	1	...
----
theoretical_mass_to_charge
⚠️ Changed fields: Mandatory, Example
Field v2.0 v2.1

Description

The theoretical mass/charge value for the small molecule or the database mass/charge value (for a spectral library match).

The theoretical mass/charge value for the small molecule or the database mass/charge value (for a spectral library match).

Type

Double

Double

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … theoretical_mass_to_charge …
SME 1 … 1234.71 …
----
SEH	...	theoretical_mass_to_charge	...
SME	...	1234.71	...
----
spectra_ref
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

Reference to a spectrum in a spectrum file, for example a fragmentation spectrum has been used to support the identification. If a separate spectrum file has been used for fragmentation spectrum, this MUST be reported in the metadata section as additional ms_runs. The reference must be in the format ms_run[1-n]:{SPECTRA_REF} where SPECTRA_REF MUST follow the format defined in 5.2 (including references to chromatograms where these are used to inform identification). Multiple spectra MUST be referenced using a “|” delimited list for the (rare) cases in which search engines have combined or aggregated multiple spectra in advance of the search to make identifications.

If a fragmentation spectrum has not been used, the value should indicate the ms_run to which the identification is mapped e.g. “ms_run[1]”.

Reference to a spectrum in a spectrum file, for example a fragmentation spectrum has been used to support the identification. If a separate spectrum file has been used for fragmentation spectrum, this MUST be reported in the metadata section as additional ms_runs. The reference must be in the format ms_run[1-n]:{SPECTRA_REF} where SPECTRA_REF MUST follow the format defined in 5.2 (including references to chromatograms where these are used to inform identification). Multiple spectra MUST be referenced using a | delimited list for the (rare) cases in which search engines have combined or aggregated multiple spectra in advance of the search to make identifications. If a fragmentation spectrum has not been used, the value should indicate the ms_run to which the identification is mapped e.g. “ms_run[1]”.

Type

String List

String List

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … spectra_ref …
SME 1 … ms_run[1]:index=5 …
----
SEH	...	spectra_ref	...
SME	...	ms_run[1]:index=5|ms_run[2]:index=3	...
----
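A reader for spectra_ref cells could be sketched in Python as follows, assuming entries are separated by a vertical bar and each entry is either a bare ms_run[n] or ms_run[n] followed by a colon and a spectrum locator. The locator is kept as an opaque string, because its format is defined elsewhere (section 5.2). The function name is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# ms_run[n], optionally followed by ':' and a spectrum locator.
REFERENCE = re.compile(r"^ms_run\[(\d+)\](?::(.+))?$")


def parse_spectra_ref(cell: str) -> List[Tuple[int, Optional[str]]]:
    """Split a spectra_ref cell on '|' and return (ms_run index, locator) pairs."""
    references: List[Tuple[int, Optional[str]]] = []
    for item in cell.split("|"):
        match = REFERENCE.match(item.strip())
        if match is None:
            raise ValueError(f"malformed spectra_ref entry: {item!r}")
        # The locator is None when only the ms_run is referenced.
        references.append((int(match.group(1)), match.group(2)))
    return references
```

A cell such as `ms_run[1]` (no fragmentation spectrum used) yields a single pair whose locator is `None`.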
identification_method
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The database search, search engine or process that was used to identify this small molecule e.g. the name of software, database or manual curation etc. If manual validation has been performed, the following CV term SHOULD be used: "quality estimation by manual validation" MS:1001058.

The search engine or algorithm used for the identification. This SHOULD be specified using CV terms.

Type

Parameter

Parameter

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … identification_method …
SME 1 … [MS, MS:1001477, SpectraST,] …
----
SEH	...	identification_method	...
SME	...	[MS, MS:1001477, SpectraST,]	...
----
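Cells of type Parameter, such as the identification_method value above, can be split into their four components with a short Python sketch. It assumes the form [CV label, accession, name, value] and does not handle names that themselves contain commas, which would need additional quoting logic. The class and function names are hypothetical.

```python
from typing import NamedTuple


class Parameter(NamedTuple):
    cv_label: str
    accession: str
    name: str
    value: str


def parse_parameter(cell: str) -> Parameter:
    """Parse '[label, accession, name, value]' into its four fields."""
    text = cell.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a parameter: {cell!r}")
    # Split on the first three commas only, so the value may contain commas.
    parts = [part.strip() for part in text[1:-1].split(",", 3)]
    if len(parts) != 4:
        raise ValueError(f"expected four fields: {cell!r}")
    return Parameter(*parts)
```

For the SpectraST example, the parsed value field is the empty string, since nothing follows the final comma.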
ms_level
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The highest MS level used to inform identification e.g. MS1 (accurate mass only) = “ms level=1” or from an MS2 fragmentation spectrum = “ms level=2”. For direct fragmentation or data independent approaches where fragmentation data is used, appropriate CV terms SHOULD be used.

The MS level of the spectrum used for the identification. This SHOULD be specified using CV terms.

Type

Parameter

Parameter

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … ms_level …
SME 1 … [MS, MS:1000511, ms level, 2] …
----
SEH	...	ms_level	...
SME	...	[MS, MS:1000511, ms level, 2]	...
----
id_confidence_measure[1-n]
⚠️ Changed fields: Type, Mandatory, Example
Field v2.0 v2.1

Description

Any statistical value or score for the identification. The metadata section reports the type of score used, as id_confidence_measure[1-n] of type Param.

Any statistical value or score for the identification. The metadata section reports the type of score used, as id_confidence_measure[1-n] of type Param.

Type

Double

Double List

Mandatory

False

Nullable

True

True

Example

MTD id_confidence_measure[1] [MS, MS:1001419, SpectraST:discriminant score F,]
…
SEH SME_ID … id_confidence_measure[1] …
SME 1 … 0.7 …
----
SEH	...	id_confidence_measure[1]	...
SME	...	0.7	...
----
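The link between the metadata declaration and the SME column can be sketched in Python as below. The sketch assumes tab-separated MTD lines and takes the third field of the parameter as the score name; the function name is hypothetical, and names containing commas are not handled.

```python
import re
from typing import Dict, List

# Matches the metadata key id_confidence_measure[n].
MTD_MEASURE = re.compile(r"^id_confidence_measure\[(\d+)\]$")


def confidence_measure_names(mtd_lines: List[str]) -> Dict[int, str]:
    """Map each id_confidence_measure index to the score name declared in MTD."""
    names: Dict[int, str] = {}
    for line in mtd_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "MTD":
            continue
        match = MTD_MEASURE.match(fields[1])
        if match:
            # Parameter layout: [CV label, accession, name, value].
            parts = fields[2].strip("[]").split(",")
            names[int(match.group(1))] = parts[2].strip()
    return names
```

The resulting dictionary lets a reader label the value in column id_confidence_measure[1] with the score type declared for index 1.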
rank
⚠️ Changed fields: Description, Mandatory, Example
Field v2.0 v2.1

Description

The rank of this identification from this approach as increasing integers from 1 (best ranked identification). Ties (equal score) are represented by using the same rank – defaults to 1 if there is no ranking system used.

The rank of this identification from this approach as increasing integers from 1 (best ranked identification). Ties (equal score) are represented by using the same rank - defaults to 1 if there is no ranking system used.

Type

Integer

Integer

Mandatory

True

Nullable

False

False

Example

SEH SME_ID … rank …
SME 1 … 1 …
----
SEH	...	rank	...
SME	...	1	...
----
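One way to derive rank values from scores is sketched below in Python. It assumes dense ranking (tied scores share a rank and the next distinct score receives the next integer) and that a higher score is better by default; neither choice is mandated by the description above, and the function name is hypothetical.

```python
from typing import Dict, List


def dense_ranks(scores: List[float], higher_is_better: bool = True) -> List[int]:
    """Assign ranks starting at 1 (best); equal scores receive the same rank."""
    distinct = sorted(set(scores), reverse=higher_is_better)
    rank_of: Dict[float, int] = {score: i for i, score in enumerate(distinct, start=1)}
    return [rank_of[score] for score in scores]
```

If no ranking system is used at all, every row would simply carry the default rank of 1.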
opt_{identifier}_*
⚠️ Changed fields: Description, Type, Mandatory, Example
Field v2.0 v2.1

Description

Additional columns can be added to the end of the small molecule evidence table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: ‘A’-‘Z’, ‘a’-‘z’, ‘0’-‘9’, ‘_’, ‘-’, ‘[’, ‘]’, and ‘:’. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by ‘_’.

Additional columns can be added to the end of the small molecule evidence table. These column headers MUST start with the prefix “opt_” followed by the {identifier} of the object they reference: assay, study variable, MS run or “global” (if the value relates to all replicates). Column names MUST only contain the following characters: 'A'-'Z', 'a'-'z', '0'-'9', '_', '-', '[', ']', and ':'. CV parameter accessions MAY be used for optional columns following the format: opt_{identifier}_cv_{accession}_{parameter name}. Spaces within the parameter’s name MUST be replaced by '_'.

Type

Column

Optional Column

Mandatory

False

Nullable

True

True

Example

SEH SME_ID … opt_assay[1]_my_value … opt_global_another_value
SME 1 … My value … some other value
----
SEH	...	opt_assay[1]_my_value	...	opt_global_another_value
SME	...	My value	...	some other value
----

9. Non-supported use cases

There are a number of use cases that were discussed during the development process and it was decided that they are not explicitly supported in mzTab version 2.0.0-M. They may be implemented in future versions of the standard.

Examples include:

  • Multiplexing technologies

  • Including the results from different technologies in one mzTab file e.g. DIMS and LC/MS

  • Merging of results from different omics experiments, e.g. proteomics, metabolomics and lipidomics

10. Conclusions

This document contains the specifications for using the mzTab format to represent results from small molecule pipelines, in the context of a metabolomics or lipidomics investigation. This specification constitutes a proposal for a standard from the Proteomics Standards Initiative and Metabolomics Standards Initiative. These artefacts are currently undergoing the PSI document process, which will result in a standard officially sanctioned by PSI/MSI.

11. Reference Implementation

A reference implementation in Java is available at https://github.com/lifs-tools/jmzTab-m. The reference implementation provides a parser, a validator, a CV-mapping validator and a writer for mzTab-M. It furthermore supports transcoding from a JSON representation of the object model into the tab-separated output format and vice versa. A user-friendly web application that uses the validator reference implementation is available at https://apps.lifs.isas.de/mztabvalidator/.

12. Authors

  • Nils Hoffmann, Leibniz-Institut für Analytische Wissenschaften – ISAS – e.V., Dortmund, Germany. nils.hoffmann@isas.de

  • Joel Rein, Wellcome Sanger Institute, Cambridge, United Kingdom. joel.rein@sanger.ac.uk

  • Timo Sachsenberg, Applied Bioinformatics Group, Center for Bioinformatics, University of Tübingen, Germany. sachsenb@informatik.uni-tuebingen.de

  • Jürgen Hartler, Institute of Computational Biotechnology at Graz University of Technology and Center for Explorative Lipidomics, Graz, Austria. juergen.hartler@tugraz.at

  • Kenneth Haug, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom. kenneth@ebi.ac.uk

  • Gerhard Mayer, Medizinisches Proteom-Center, Ruhr-Universität Bochum, Germany. gerhard.mayer@rub.de

  • Oliver Alka, Applied Bioinformatics Group, Center for Bioinformatics, University of Tübingen, Germany. alka@informatik.uni-tuebingen.de

  • Saravanan Dayalan, Metabolomics Australia, The University of Melbourne, Parkville, Australia. sdayalan@unimelb.edu.au

  • Jake TM Pearce, MRC-NIHR National Phenome Center, Imperial College London, London, United Kingdom. jake.pearce@imperial.ac.uk

  • Philippe Rocca-Serra, Oxford e-Research Centre, University of Oxford, United Kingdom. philippe.rocca-serra@oerc.ox.ac.uk

  • Da Qi, Institute of Integrative Biology, University of Liverpool, United Kingdom and BGI-Shenzhen, Shenzhen, China. qida@genomics.cn

  • Martin Eisenacher, Medizinisches Proteom-Center, Ruhr-Universität Bochum, Germany. martin.eisenacher@ruhr-uni-bochum.de

  • Yasset Perez-Riverol, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom. yperez@ebi.ac.uk

  • Juan Antonio Vizcaíno, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom. juan@ebi.ac.uk

  • Reza M Salek, International Agency for Research on Cancer, Lyon, France. r7salek@gmail.com

  • Steffen Neumann, Leibniz Institute of Plant Biochemistry, Halle and German Centre for Integrative Biodiversity Research, Halle-Jena-Leipzig, Germany. sneumann@ipb-halle.de

  • Andrew R Jones, Institute of Integrative Biology, University of Liverpool, United Kingdom. (Editor) Andrew.Jones@liverpool.ac.uk

References

  • [bradner-1997] Bradner, S. (1997). Key words for use in RFCs to Indicate Requirement Levels, Internet Engineering Task Force. RFC 2119.

  • [martens-2011] Martens, L., et al. (2011). "mzML—​a community standard for mass spectrometry data." Mol Cell Proteomics 10(1): R110 000133.

  • [hill-1900] EA Hill (1900). “ON A SYSTEM OF INDEXING CHEMICAL LITERATURE; ADOPTED BY THE CLASSIFICATION DIVISION OF THE U. S. PATENT OFFICE.” J. Am. Chem. Soc. 22 (8): 478–494. doi:10.1021/ja02046a005.

  • [griss-2014] Griss et al. (2014) "The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience." Mol Cell Proteomics doi: 10.1074/mcp.O113.036681.

  • [sansone-2012] Sansone et al. (2012) "Toward interoperable bioscience data." Nature Genetics 44: 121–126. doi:10.1038/ng.1054.

13. Intellectual Property Statement

The PSI/MSI takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the PSI Chair.

The PSI/MSI invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to practice this recommendation. Please address the information to the PSI Chair (see the contact information on the PSI website).

TradeMark Section

Microsoft Excel®

Copyright © Proteomics Standards Initiative (2018). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the PSI or other organizations, except as needed for the purpose of developing Proteomics Recommendations in which case the procedures for copyrights defined in the PSI Document process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the PSI or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE PROTEOMICS STANDARDS INITIATIVE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.