Metadata attributes
Metadata specific to the library
The following attributes describe an entire collection of spectra, the library itself. They should be listed under the tag <mzSpecLib 1.0> and nowhere else in the library.
MUST
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
library format version (MS:1003186) |
“Version number of the [PSI] library format specification” [PSI:PI] |
xsd:string |
/ |
False |
library name (MS:1003188) |
“A short name identifying the library to potential users. The same name may refer to multiple versions of the same continually updated library.” [PSI:PI] |
xsd:string |
/ |
False |
SHOULD
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
library identifier (MS:1003187) |
“Short identifier for the library for easy reference, preferably but not necessarily globally unique” [PSI:PI] |
xsd:string |
/ |
False |
MAY
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
library description (MS:1003189) |
“Extended free-text description of the library” [PSI:PI] |
xsd:string |
/ |
False |
library version (MS:1003190) |
“Version number of the library, usually referring to a certain release of a continually updated library” [PSI:PI] |
xsd:string |
/ |
False |
library URI (MS:1003191) |
“URI or URL that uniquely identifies the library” [PSI:PI] |
xsd:string |
/ |
False |
release date (MS:1001017) |
“Date and time at which a product was publicly released. For mzIdentML, use the database release date XML attribute instead of this term.” [PSI:PI] |
xsd:dateTime |
/ |
False |
Reference (MS:1002866) |
“Literature reference associated with one dataset (including the authors, title, year and journal details). The value field can be used for the PubMedID, or to specify if one manuscript is just submitted or accepted, but it does not have a PubMedID yet.” [PSI:PI] |
xsd:string |
/ |
True |
Digital Object Identifier (DOI) (MS:1001922) |
“DOI unique identifier of a publication.” [PSI:PI, http://dx.doi.org] |
xsd:string |
/ |
True |
PubMed identifier (MS:1000879) |
“A unique identifier for a publication in the PubMed database (MIR:00000015).” [PSI:MS] |
xsd:integer |
/ |
True |
license URI (MS:1003197) |
“URI of the license controlling use of the library (e.g. https://creativecommons.org/publicdomain/zero/1.0/)” [PSI:PI] |
xsd:string |
/ |
False |
copyright notice (MS:1003198) |
“Notice of statutorily prescribed form that informs users of the underlying claim to copyright ownership in a published work” [PSI:PI] |
xsd:string |
/ |
False |
change log (MS:1003199) |
“Extended free-text description of the difference from the previous version” [PSI:PI] |
xsd:string |
/ |
False |
contact name (MS:1000586) |
“Name of the contact person or organization.” [PSI:MS] |
xsd:string |
/ |
True |
contact address (MS:1000587) |
“Postal address of the contact person or organization.” [PSI:MS] |
xsd:string |
/ |
True |
contact URL (MS:1000588) |
“Uniform Resource Locator related to the contact person or organization.” [PSI:MS] |
xsd:string |
/ |
True |
contact affiliation (MS:1000590) |
“Home institution of the contact person.” [PSI:MS] |
xsd:string |
/ |
True |
contact email (MS:1000589) |
“Email address of the contact person or organization.” [PSI:MS] |
xsd:string |
/ |
True |
contact role (MS:1002033) |
“Role of the contact person.” [PSI:PI] |
Children of MS:1002033 |
/ |
True |
library creation software (MS:1003207) |
“Library creation software” [PSI:MS] |
Children of MS:1003207 |
/ |
False |
software version (MS:1003200) |
“Version number of the software package used for library creation” [PSI:PI] |
xsd:string |
/ |
False |
library creation log (MS:1003206) |
“String of logging information generated when the library was constructed from its constituent files. Multiple lines should be separated with escaped \n” [PSI:PI] |
xsd:string |
/ |
False |
command-line parameters (MS:1001885) |
“Parameters string passed to a command-line interface software application, omitting the executable name.” [PSI:MS] |
xsd:string |
/ |
False |
constituent spectrum file (MS:1003203) |
“Spectrum data file from which (at least) a subset of spectra were extracted. Should use USI notation mzspec:PXDxxxx:msRunName if possible, or a URI if USI notation is not possible.” [PSI:PI] |
xsd:string |
/ |
True |
constituent identification file (MS:1003204) |
Identification file from which (at least) a subset of the library's identifications were taken. |
xsd:string |
/ |
True |
constituent library file (MS:1003205) |
“Source library URI from which (at least) a subset of spectra were extracted.” [PSI:PI] |
xsd:string |
/ |
True |
library attribute set name (MS:1003212) |
“A name to refer to a library attribute set” [PSI:PI] |
xsd:string |
/ |
True |
spectrum cluster key (MS:1003267) |
“An ordinal number uniquely identifying a spectrum cluster. It should start with 1.” [PSI:PI] |
xsd:integer |
/ |
True |
spectrum cluster member spectrum keys (MS:1003268) |
“A list of integers corresponding to the library spectrum keys of the members of this cluster. These members must be in the same library.” [PSI:PI] |
Undefined |
/ |
True |
spectrum cluster member USI (MS:1003269) |
“A member of this cluster external to the library, specified using a PSI Universal Spectrum Identifier.” [PSI:PI] |
xsd:string |
/ |
True |
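The requirement levels and the Repeatable column above lend themselves to a simple header check. The following is a minimal sketch, not a reference validator: the function name `check_library_header`, the `(accession, value)` pair representation, and the example values are assumptions made for illustration, and only a handful of the attributes listed above are included.

```python
# Requirement level and repeatability, transcribed from the tables above.
# Only a subset of the attributes is listed to keep the sketch short.
LIBRARY_ATTRIBUTES = {
    "MS:1003186": ("library format version", "MUST", False),
    "MS:1003188": ("library name", "MUST", False),
    "MS:1003187": ("library identifier", "SHOULD", False),
    "MS:1003190": ("library version", "MAY", False),
    "MS:1002866": ("Reference", "MAY", True),
    "MS:1000589": ("contact email", "MAY", True),
}


def check_library_header(attributes):
    """Return (errors, warnings) for a list of (accession, value) pairs.

    The pair representation is a hypothetical in-memory form, not the
    serialization defined by the specification.
    """
    errors, warnings = [], []
    counts = {}
    for accession, _value in attributes:
        counts[accession] = counts.get(accession, 0) + 1

    for accession, (name, level, repeatable) in LIBRARY_ATTRIBUTES.items():
        seen = counts.get(accession, 0)
        if seen == 0 and level == "MUST":
            errors.append(f"missing required attribute: {name} ({accession})")
        elif seen == 0 and level == "SHOULD":
            warnings.append(f"missing recommended attribute: {name} ({accession})")
        elif seen > 1 and not repeatable:
            errors.append(f"attribute is not repeatable: {name} ({accession})")
    return errors, warnings


# Hypothetical header values, used only to exercise the check.
header = [
    ("MS:1003186", "1.0"),
    ("MS:1003188", "Example library"),
    ("MS:1002866", "first reference"),
    ("MS:1002866", "second reference"),  # repeatable, so two entries are fine
]
errors, warnings = check_library_header(header)
```

Treating a missing SHOULD attribute as a warning rather than an error is a design choice of this sketch; a stricter tool could promote it.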
Metadata specific to library spectra
The second level of metadata provides information specific to each library spectrum. The attributes SHOULD be organized in subcategories: those that pertain to the library spectrum and the origin of the spectrum, those that pertain to the spectrum itself, those that pertain to the interpretation of the spectrum, and those that pertain to the analyte(s) to which the spectrum is identified. The following attributes pertain to the library spectrum and its origin, including the acquisition method and information about the precursor ion. They MUST be listed under <Spectrum=_> or in library spectrum attribute sets.
MUST
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
library spectrum key (MS:1003237) |
“An ordinal number uniquely identifying a spectrum in a library. Library spectrum keys should start at 1. Library spectrum keys SHOULD NOT change if entries are re-ordered or removed from a library.” [PSI:PI] |
xsd:integer |
/ |
False |
library spectrum index (MS:1003062) |
“Integer index value that indicates the spectrum’s ordered position within a spectral library. By custom, index counters should begin with 0.” [PSI:PI] |
xsd:integer |
/ |
False |
experimental precursor monoisotopic m/z (MS:1003208) |
“The measured or inferred m/z (as reported by the mass spectrometer acquisition software or post-processing software) of the monoisotopic peak of the precursor ion based on the MSn-1 spectrum.” [PSI:MS] |
xsd:float |
m/z |
False |
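The difference between the library spectrum key and the library spectrum index is easiest to see when an entry is removed. The sketch below is illustrative only: the `LibrarySpectrum` class, its `name` field, and `assign_indices` are hypothetical names, not part of the format. It assumes, as the definitions above state, that keys start at 1 and stay fixed, while indices start at 0 and reflect the current position.

```python
from dataclasses import dataclass


@dataclass
class LibrarySpectrum:
    key: int   # library spectrum key (MS:1003237): starts at 1, stays fixed
    name: str  # hypothetical label used only for this illustration


def assign_indices(spectra):
    """Pair each spectrum with its library spectrum index (MS:1003062).

    The index is simply the 0-based position in the current ordering.
    """
    return list(enumerate(spectra))


library = [
    LibrarySpectrum(1, "A"),
    LibrarySpectrum(2, "B"),
    LibrarySpectrum(3, "C"),
]

# Remove the second entry: survivors keep their keys, indices are recomputed.
library = [s for s in library if s.key != 2]
indexed = assign_indices(library)

keys = [spectrum.key for _, spectrum in indexed]
indices = [index for index, _ in indexed]
```

After the removal, the keys are no longer contiguous, but the indices still run from 0 without gaps.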
MAY
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
library spectrum name (MS:1003061) |
“Label attached to a spectrum uniquely naming it within a collection of spectra, often in a spectral library. It is often a string combination of peptide sequence, charge, mass modifications, collision energy, but will obviously be different for small molecules or unidentified spectra. It must be unique within a collection.” [PSI:PI] |
xsd:string |
/ |
False |
universal spectrum identifier (MS:1003063) |
“PSI universal spectrum identifier (USI) multipart key that uniquely identifies a spectrum available in a ProteomeXchange dataset or spectral library.” [PSI:PI] |
xsd:string |
/ |
False |
splash key (MS:1002599) |
“Spectral Hash key, a unique identifier for spectra.” [PMID:27824832] |
xsd:string |
/ |
False |
library spectrum comment (MS:1003168) |
“A free-text string providing additional information of the library spectrum not encoded otherwise, usually for human use and not parsed by software tools.” [PSI:PI] |
xsd:string |
/ |
False |
ms level (MS:1000511) |
“Stage number achieved in a multi stage mass spectrometry acquisition.” [PSI:MS] |
xsd:integer |
/ |
False |
ionization type (MS:1000008) |
“The method by which gas phase ions are generated from the sample.” [PSI:MS] |
Children of MS:1000008 |
/ |
False |
instrument model (MS:1000031) |
“Instrument model name not including the vendor’s name.” [PSI:MS] |
Children of MS:1000031 |
/ |
False |
mass analyzer type (MS:1000443) |
“Mass analyzer separates the ions according to their mass-to-charge ratio.” [PSI:MS] |
Children of MS:1000443 |
/ |
False |
mass spectrometry acquisition method (MS:1003213) |
“Mode of running a mass spectrometer method by which mass ranges are selected and possibly dissociated.” [PSI:MS] |
Children of MS:1003213 |
/ |
True |
collision gas (MS:1000419) |
“An inert gas used for collisional excitation. The term target gas is not recommended.” [PSI:MS] |
xsd:string |
/ |
True |
normalized collision energy (MS:1000138) |
“Instrument setting, expressed in percent, for adjusting collisional energies of ions in an effort to provide equivalent excitation of all ions.” [PSI:PI] |
xsd:float |
percent |
False |
isolation window target m/z (MS:1000827) |
“The primary or reference m/z about which the isolation window is defined.” [PSI:MS] |
xsd:float |
m/z |
False |
isolation window lower offset (MS:1000828) |
“The extent of the isolation window in m/z below the isolation window target m/z. The lower and upper offsets may be asymmetric about the target m/z.” [PSI:MS] |
xsd:float |
m/z |
False |
isolation window upper offset (MS:1000829) |
“The extent of the isolation window in m/z above the isolation window target m/z. The lower and upper offsets may be asymmetric about the target m/z.” [PSI:MS] |
xsd:float |
m/z |
False |
retention time (MS:1000894) |
“A time interval from the start of chromatography when an analyte exits a chromatographic column.” [PSI:MS] |
xsd:float |
second, minute |
False |
normalized retention time (MS:1000896) |
“A time interval from the start of chromatography when an analyte exits a standardized reference chromatographic column and instrumental setup.” [PSI:MS] |
xsd:float |
second, minute |
False |
retention time normalization standard (MS:1000901) |
“A standard providing the retention times at which a set of reference compounds exit the reference chromatographic column.” [PSI:MS] |
Children of MS:1000901 |
/ |
False |
ion mobility drift time (MS:1002476) |
“Drift time of an ion or spectrum of ions as measured in an ion mobility mass spectrometer. This time might refer to the central value of a bin into which all ions within a narrow range of drift time have been aggregated.” [PSI:MS] |
xsd:float |
millisecond |
False |
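inverse reduced ion mobility (MS:1002815) |
“Ion mobility measurement for an ion or spectrum of ions as measured in an ion mobility mass spectrometer. This might refer to the central value of a bin into which all ions within a narrow range of mobilities have been aggregated.” [PSI:MS] |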
xsd:float |
/ |
False |
collision cross sectional area (MS:1002954) |
“Structural molecular descriptor for the effective interaction area between the ion and neutral gas measured in ion mobility mass spectrometry.” [PSI:PI] |
xsd:double |
square angstrom |
False |
FAIMS compensation voltage (MS:1001581) |
“The DC potential applied to the asymmetric waveform in FAIMS that compensates for the difference between high and low field mobility of an ion.” [PSI:MS] |
xsd:double |
volt |
False |
scan polarity (MS:1000465) |
“Relative orientation of the electromagnetic field during the selection and detection of ions in the mass spectrometer.” [PSI:MS] |
Children of MS:1000465 |
/ |
False |
filter string (MS:1000512) |
“A string unique to Thermo instrument describing instrument settings for the scan.” [PSI:MS] |
xsd:string |
/ |
False |
mass resolution (MS:1000011) |
“Smallest mass difference between two equal magnitude peaks so that the valley between them is a specified fraction of the peak height.” [PSI:MS] |
xsd:string |
/ |
False |
mass resolving power (MS:1000800) |
“The observed mass divided by the difference between two masses that can be separated: m/dm. The procedure by which dm was obtained and the mass at which the measurement was made should be reported.” [PSI:MS] |
xsd:string |
/ |
False |
precursor apex intensity (MS:1003086) |
“Intensity of the precursor ion current as measured by its apex point over time and m/z. It is unspecified whether this is the intensity of the selected isotope or the most intense isotope.” [PSI:MS] |
xsd:float |
/ |
False |
previous MSn-1 scan precursor intensity (MS:1003085) |
“Intensity of the precursor ion in the previous MSn-1 scan (prior in time to the referencing MSn scan). For an MS2 scan, this means the MS1 precursor intensity. It is unspecified whether this is an apex (across m/z) intensity, integrated (across m/z) intensity, a centroided peak intensity of unknown origin, or even summed across several isotopes.” [PSI:MS] |
xsd:double |
/ |
False |
possible charge state (MS:1000633) |
“A possible charge state of the ion in a situation where the charge of an ion is known to be one of several possible values rather than a completely unknown value or determined to be a specific charge with reasonable certainty.” [PSI:MS] |
xsd:integer |
/ |
False |
charge state (MS:1000041) |
“Number of net charges, positive or negative, on an ion.” [PSI:MS] |
xsd:integer |
/ |
False |
number of peaks (MS:1003059) |
“Number of peaks or features in a spectrum. For a peak-picked spectrum, this will correspond to the number of data points. For a non-peak-picked spectrum, this corresponds to the number of features discernable in the spectrum, which will be fewer than the number of data points.” [PSI:PI] |
xsd:integer |
/ |
False |
number of data points (MS:1003060) |
“Number of data points in a spectrum. For a peak-picked spectrum, this will correspond to the number of peaks. For a non-peak-picked spectrum, this corresponds to the number of values in the data array, which are not all peaks.” [PSI:PI] |
xsd:integer |
/ |
False |
intensity unit (MS:1000043) |
“Intensity units are commonly arbitrary. Detected in counts per second (cps) when using counting detectors, but measured in volts when using analog detectors.” [PSI:MS] |
Children of MS:1000043 |
/ |
False |
base peak intensity (MS:1000505) |
“The intensity of the greatest peak in the mass spectrum.” [PSI:MS] |
xsd:float |
number of detector counts, percent of base peak, counts per second, percent of base peak times 100, absorbance unit |
False |
signal-to-noise ratio (MS:1001884) |
“Unitless number providing the ratio of the total measured intensity of a signal relative to the estimated noise level for that signal.” [PSI:MS] |
xsd:float |
/ |
False |
total ion current (MS:1000285) |
“The sum of all the separate ion currents carried by the ions of different m/z contributing to a complete mass spectrum or in a specified m/z range of a mass spectrum.” [PSI:MS] |
xsd:float |
/ |
False |
spectrum representation (MS:1000525) |
“Way in which the spectrum is represented, either with regularly spaced data points or with a list of centroided peaks.” [PSI:MS] |
Children of MS:1000525 |
/ |
False |
data processing action (MS:1000543) |
“Data processing attribute used to describe the type of data processing performed on the data file.” [PSI:MS] |
Children of MS:1000543 |
/ |
True |
ion annotation format (MS:1003103) |
“Annotation format used for annotating individual spectrum ion peaks.” [PSI:MS] |
Children of MS:1003103 |
/ |
False |
number of replicate spectra used (MS:1003070) |
“Number of replicate spectra used during the aggregation process. This is generally applicable when there are many replicates available, but some are discarded as being low S/N, blended, or otherwise unsuitable, and the remaining set is then used for merging via a consensus algorithm.” [PSI:PI] |
xsd:integer |
/ |
False |
number of replicate spectra available (MS:1003069) |
“Number of replicate spectra available for use during the aggregation process.” [PSI:PI] |
xsd:integer |
/ |
False |
summary statistics of replicate spectra (MS:1003295) |
“Summary statistics of an attribute among all replicates that are aggregated to generate this spectrum.” [PSI:PI] |
Children of MS:1003295 |
/ |
True |
number of replicate spectra from source (MS:1003296) |
“The number of replicate spectra used during the aggregation process that originate from a specified source (e.g. a file, a dataset, a sample type, etc.). This term must appear in a group alongside a source term, which is one of: ‘ProteomeXchange accession number’, ‘constituent spectrum file’, or ‘sample name’” [PSI:PI] |
xsd:string |
/ |
False |
contributing replicate spectrum keys (MS:1003298) |
“A list of cross references to contributing replicate spectra in the same library, in the form of library spectrum keys.” [PSI:PI] |
Undefined |
/ |
False |
contributing replicate spectrum USI (MS:1003299) |
“A list of cross references to contributing replicate spectra, in the form of PSI Universal Spectrum Identifiers.” [PSI:PI] |
Undefined |
/ |
False |
related spectrum keys (MS:1003259) |
“A list of cross references to a related spectrum in the same library, in the form of the library spectrum key.” [PSI:PI] |
Undefined |
/ |
False |
related spectrum USI (MS:1003260) |
“A cross reference to a related spectrum in the form of a PSI Universal Spectrum Identifier” [PSI:PI] |
xsd:string |
/ |
False |
related spectrum description (MS:1003261) |
“A free-text string describing the related spectrum and/or its relationship to this spectrum” [PSI:PI] |
xsd:string |
/ |
False |
similar spectrum keys (MS:1003263) |
“A list of cross references to a similar spectrum in the same library, in the form of the library spectrum key.” [PSI:PI] |
Undefined |
/ |
False |
similar spectrum USI (MS:1003264) |
“A cross reference to a similar spectrum in the form of a PSI Universal Spectrum Identifier” [PSI:PI] |
xsd:string |
/ |
False |
peak attribute (MS:1003254) |
“An attribute of a peak in a mass spectrum other than its m/z, intensity, and annotation.” [PSI:PI] |
Children of MS:1003254 |
/ |
True |
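The three isolation window attributes in the table above combine into an m/z interval: the lower offset is subtracted from the target m/z and the upper offset is added to it. The following is a minimal sketch of that arithmetic. The function name `isolation_window_bounds` is hypothetical, and the check that offsets are non-negative is an assumption of this sketch (the definitions describe them as extents), not a rule stated in the table.

```python
def isolation_window_bounds(target_mz, lower_offset, upper_offset):
    """Return the (low, high) m/z bounds of an isolation window.

    target_mz    -- isolation window target m/z (MS:1000827)
    lower_offset -- isolation window lower offset (MS:1000828)
    upper_offset -- isolation window upper offset (MS:1000829)
    """
    # Assumption of this sketch: offsets are extents, so negative values
    # are rejected rather than silently flipping the window.
    if lower_offset < 0 or upper_offset < 0:
        raise ValueError("offsets are expected to be non-negative")
    return target_mz - lower_offset, target_mz + upper_offset


# The offsets may be asymmetric about the target m/z.
low, high = isolation_window_bounds(500.0, 1.0, 1.5)
```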
SHOULD
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
dissociation method (MS:1000044) |
“Fragmentation method used for dissociation or fragmentation.” [PSI:MS] |
Children of MS:1000044 |
/ |
False |
collision energy (MS:1000045) |
“Energy for an ion experiencing collision with a stationary gas particle resulting in dissociation of the ion.” [PSI:MS] |
xsd:float |
electronvolt |
False |
spectrum origin type (MS:1003072) |
“Categorization of a spectrum based on its origin (e.g., observed spectrum, predicted spectrum, demultiplexed spectrum, etc.).” [PSI:PI] |
Children of MS:1003072 |
/ |
False |
spectrum aggregation type (MS:1003065) |
“Categorization of a spectrum based on its type of aggregation (e.g., individual spectrum, consensus spectrum, best replicate spectrum, etc.).” [PSI:PI] |
Children of MS:1003065 |
/ |
False |
Metadata specific to the analyte
These attributes refer to the analyte(s) to which the spectrum is identified. In the present version of the library format, only peptide analytes are supported, but support for other kinds of analytes (e.g. small molecule metabolites) is expected in the future. They MUST be listed under <Analyte=_>.
MAY
Name |
Info |
Value |
Allowed units |
Repeatable |
---|---|---|---|---|
charge state (MS:1000041) |
“Number of net charges, positive or negative, on an ion.” [PSI:MS] |
xsd:integer |
/ |
False |
theoretical mass (MS:1001117) |
“The theoretical neutral mass of the molecule (e.g. the peptide sequence and its modifications) not including its charge carrier.” [PSI:PI] |
xsd:double |
dalton |
False |
adduct ion mass (MS:1003243) |
“The theoretical mass of the adduct ion (e.g. for a singly-charged protonated peptide ion, this value would be the neutral peptide molecule’s mass plus the mass of a proton)” [PSI:PI] |
xsd:float |
/ |
False |
adduct ion formula (MS:1002813) |
“Adduct formation formula of the form M+X or M-X, as constrained by the provided regular expression.” [PSI:MS] |
xsd:string |
/ |
False |
theoretical monoisotopic m/z (MS:1003053) |
“Mass-to-charge ratio of a peptidoform ion composed of the most common isotope of each atom computed from the putative knowledge of its molecular constituents.” [PSI:PI] |
xsd:float |
/ |
False |
theoretical average m/z (MS:1003054) |
“Mass-to-charge ratio of a peptidoform ion computed from the putative knowledge of its molecular constituents, averaged over the distribution of naturally occurring isotopes.” [PSI:PI] |
xsd:float |
/ |
False |
taxonomy: scientific name (MS:1001469) |
“This term is used if a scientific name is specified, e.g. Homo sapiens. Recommend using MS:1001467 (taxonomy: NCBI TaxID) where possible.” [PSI:PI] |
xsd:string |
/ |
False |
taxonomy: common name (MS:1001468) |
“This term is used if a common name is specified, e.g. human. Recommend using MS:1001467 (taxonomy: NCBI TaxID) where possible.” [PSI:PI] |
xsd:string |
/ |
False |
taxonomy: NCBI TaxID (MS:1001467) |
“This term is used if an NCBI TaxID is specified, e.g. 9606 for Homo sapiens.” [PSI:PI] |
xsd:string |
/ |
False |
sample label (MS:1002602) |
“Reagent used in labeled quantification methods.” [PSI:PI] |
Children of MS:1002602 |
/ |
False |
sample name (MS:1000002) |
“A reference string relevant to the sample under study.” [PSI:MS] |
xsd:string |
/ |
False |
SMILES formula (MS:1000868) |
“The simplified molecular input line entry specification or SMILES is a specification for unambiguously describing the structure of a chemical compound using a short ASCII string.” [EDAM:2301] |
xsd:string |
/ |
False |
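The relationship between theoretical mass, adduct ion mass, and theoretical m/z in the table above can be illustrated for the common case of protonated peptide ions. This is a sketch under stated assumptions: it handles only ions whose charge carriers are all protons, the function names are hypothetical, and the proton mass constant is supplied here rather than taken from this document.

```python
# Proton mass in dalton (rounded); supplied for this sketch, not by the tables.
PROTON_MASS = 1.007276466


def adduct_ion_mass(theoretical_mass, charge):
    """Mass of a protonated ion: neutral mass plus one proton per charge.

    theoretical_mass -- theoretical mass (MS:1001117), neutral, in dalton
    charge           -- charge state (MS:1000041), positive integer here
    """
    # Assumption of this sketch: only positive, proton-carried charges.
    if charge < 1:
        raise ValueError("this sketch only covers positively charged ions")
    return theoretical_mass + charge * PROTON_MASS


def theoretical_mz(theoretical_mass, charge):
    """Mass-to-charge ratio of the protonated ion."""
    return adduct_ion_mass(theoretical_mass, charge) / charge


# Hypothetical neutral mass, chosen for round numbers.
singly_charged_mass = adduct_ion_mass(1000.0, 1)
doubly_charged_mz = theoretical_mz(1000.0, 2)
```

Whether the result is a monoisotopic or an average m/z depends only on which kind of neutral mass is passed in.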