<?xml version="1.0" encoding="UTF-8"?>
<?asciidoc-toc maxdepth="3"?>
<?asciidoc-numbered?>
<book xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
<info>
<title>mzIdentML extension for crosslinking data</title>
<date>2026-04-24</date>
</info>
<preface xml:id="preface">
<title>Preface</title>
<simpara><emphasis role="strong">mzIdentML: exchange format for peptides and proteins identified from mass spectra</emphasis></simpara>
<simpara><emphasis role="strong">Extension for crosslinking approaches</emphasis></simpara>
<simpara>(This extension is an <emphasis>addendum</emphasis> to mzIdentML version 1.3.0)</simpara>
<simpara><emphasis>Status of This Document</emphasis></simpara>
<simpara>This document presents a final specification for the mzIdentML data format developed by the HUPO Proteomics Standards Initiative.
Distribution is unlimited.</simpara>
<simpara><emphasis>Version of This Document</emphasis></simpara>
<simpara>Date created: June 24, 2024</simpara>
<simpara>Last updated: Fri Apr 24 15:41:03 UTC 2026</simpara>
<simpara>Based on commit: <link xl:href="https://github.com/HUPO-PSI/mzIdentML/commit/e537e39e404653ca0fcbd1d7c07280123a6ce8fa">e537e39e404653ca0fcbd1d7c07280123a6ce8fa</link> - <link xl:href="https://github.com/HUPO-PSI/mzIdentML/commits/master/specification_document/specdoc1_3/asciidoc/crosslinking_ext.adoc">Commit History</link></simpara>
<simpara>The current version of this document is: version 1.0.0 final June 24, 2024</simpara>
<simpara>The latest (draft) version of this document may be found at <link xl:href="https://github.com/HUPO-PSI/mzIdentML">https://github.com/HUPO-PSI/mzIdentML</link>.</simpara>
<simpara><emphasis>Type of This Document</emphasis></simpara>
<simpara>This document is a <emphasis>recommendation</emphasis> for a common, community-driven standard data exchange format in proteomics.</simpara>
<simpara><emphasis>Authors</emphasis></simpara>
<simpara>Please see <xref linkend="authors"/> for details on the authors and editors of this document.</simpara>
</preface>
<chapter xml:id="abstract">
<title>Abstract</title>
<simpara>The Human Proteome Organisation (HUPO) Proteomics Standards Initiative (PSI) defines community standards for data representation in proteomics to facilitate data comparison, exchange and verification.
This document defines the extension of the mzIdentML data standard to describe the outputs of proteomics search engines or similar software for the identification of crosslinked peptides.</simpara>
</chapter>
<chapter xml:id="introduction">
<title>Introduction</title>
<section xml:id="background">
<title>Background</title>
<simpara>A large number of proteomics search engines are available, each outputting results in a variety of file formats. mzIdentML <xref linkend="viz2017"/> is a HUPO-PSI endorsed community standard that provides a common file format for identification results.
This extension document has been released in parallel with mzIdentML version 1.3. mzIdentML version 1.3 supports extensions for additional features or use cases which can be described in additional documents, rather than editing the original specification document.</simpara>
<simpara>This mzIdentML 1.3.0 extension document provides further information on the encoding of crosslinking Mass Spectrometry (MS) results in mzIdentML.
It has two aims:</simpara>
<simpara>1- Extend the crosslinking use cases supported by mzIdentML version 1.2.0 (2017) to cover what is currently required by the state-of-the-art in the field.
Particular attention is paid to workflows using cleavable crosslinkers.</simpara>
<simpara>2- Provide further clarification and documentation on how to encode crosslinking data in an mzIdentML file.</simpara>
</section>
<section xml:id="supported-crosslinking-use-cases">
<title>Supported Crosslinking Use Cases</title>
<simpara>Use case already supported in mzIdentML 1.2.0:</simpara>
<itemizedlist>
<listitem>
<simpara>Two crosslinked peptides (the crosslinking product that is typically of most interest).</simpara>
</listitem>
</itemizedlist>
<simpara>New use cases supported in this extension (part of mzIdentML 1.3.0):</simpara>
<simpara>1- Reporting cleavable crosslinkers.
MS-cleavable crosslinkers can cleave upon activation in the mass spectrometer, releasing the individual peptides and thus enabling their individual analysis.
Section 7.11 of the main mzIdentML 1.3.0 specification gives a new mechanism for encoding identifications based on multiple spectra; using this mechanism is essential for some cleavable crosslinker workflows.
Such a workflow is used as an example in Section 7.11 of the main specification document.</simpara>
<simpara>2- Internally linked peptides (commonly known as “looplinks”).
Internally linked peptides are cases where both ends of the crosslinker are within a single peptide, not between two copies of the same peptide.
This type of crosslinking product is therefore necessarily intramolecular.</simpara>
<simpara>3- Noncovalently associated peptides.
Some spectra may show the fragmentation of two different peptides which were not crosslinked but stayed associated due to noncovalent interactions <xref linkend="giese2019"/>.
Both peptides together appear as a single precursor species in the instrument, as opposed to ‘chimeric’ spectra where a single peptide is selected as precursor but additional peptide(s) fall within the same selection window.
Identifying these noncovalently associated peptides may improve the accuracy of the results as it can prevent them from being misidentified as crosslinked peptides.</simpara>
<simpara>4- Additionally, the encoding of scores applicable to crosslinking MS results, and their corresponding thresholds, has been clarified and improved.</simpara>
<simpara>An overview of the different crosslinking product types and their support in mzIdentML is given in <xref linkend="summary-of-mzidentml-support-for-crosslinking-product-types"/>. For discussion of the product types that are not supported in this version of mzIdentML (crosslinkers with more than two reactive groups, higher order crosslinked peptides) see <xref linkend="unsupported-use-cases-and-future-directions"/>.</simpara>
<table xml:id="summary-of-mzidentml-support-for-crosslinking-product-types" frame="all" rowsep="1" colsep="1">
<title>Summary of mzIdentML support for crosslinking product types.</title>
<tgroup cols="9">
<colspec colname="col_1" colwidth="10.7142*"/>
<colspec colname="col_2" colwidth="10.7142*"/>
<colspec colname="col_3" colwidth="10.7142*"/>
<colspec colname="col_4" colwidth="10.7142*"/>
<colspec colname="col_5" colwidth="14.2857*"/>
<colspec colname="col_6" colwidth="10.7142*"/>
<colspec colname="col_7" colwidth="10.7142*"/>
<colspec colname="col_8" colwidth="10.7142*"/>
<colspec colname="col_9" colwidth="10.7149*"/>
<tbody>
<row>
<entry align="center" valign="middle"><simpara><emphasis role="strong"><phrase role="small">no crosslinker reaction</phrase></emphasis></simpara></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image9.jpg" contentwidth="42" contentdepth="20"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">linear peptide / free peptide</phrase></simpara></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image9.jpg" contentwidth="43" contentdepth="15"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image9.jpg" contentwidth="43" contentdepth="14"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">non-covalently associated peptides</phrase></simpara></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"></entry>
</row>
<row>
<entry align="center" valign="middle"><simpara><emphasis role="strong"><phrase role="small">crosslinker reaction</phrase></emphasis></simpara></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image5.jpg" contentwidth="56" contentdepth="33"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">crosslinker modified peptide (monolink or dead-end link)</phrase></simpara></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image8.jpg" contentwidth="63" contentdepth="41"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">crosslinked peptides</phrase></simpara></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image4.png" contentwidth="46" contentdepth="34"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">cleavable crosslinker</phrase></simpara></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image6.jpg" contentwidth="66" contentdepth="26"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">internally linked peptide (looplink)</phrase></simpara></entry>
<entry align="center" valign="middle"></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image7.jpg" contentwidth="63" contentdepth="45"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">crosslinked peptides from crosslinkers with more than two reactive groups</phrase></simpara></entry>
<entry align="center" valign="middle"><informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image10.jpg" contentwidth="63" contentdepth="41"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</informalfigure>
<simpara><phrase role="small">higher order crosslinked peptides</phrase></simpara></entry>
</row>
<row>
<entry align="center" valign="middle"><simpara><emphasis role="strong"><phrase role="small">mzIdentML version supporting</phrase></emphasis></simpara></entry>
<entry align="center" valign="middle" namest="col_2" nameend="col_3"><simpara><emphasis role="strong"><phrase role="small">1.1.0</phrase></emphasis></simpara></entry>
<entry align="center" valign="middle"><simpara><emphasis role="strong"><phrase role="small">1.2.0</phrase></emphasis></simpara></entry>
<entry align="center" valign="middle" namest="col_5" nameend="col_7"><simpara><emphasis role="strong"><phrase role="small">1.3.0</phrase></emphasis></simpara></entry>
<entry align="center" valign="middle" namest="col_8" nameend="col_9"><simpara><emphasis role="strong"><phrase role="small">Unsupported</phrase></emphasis></simpara></entry>
</row>
</tbody>
</tgroup>
</table>
</section>
<section xml:id="document-structure">
<title>Document Structure and Changes from mzIdentML Version 1.2.0 to 1.3.0 to Support Crosslinking Results</title>
<simpara>mzIdentML version 1.3.0 makes two significant changes:</simpara>
<orderedlist numeration="lowerroman">
<listitem>
<simpara>a new mechanism for encoding identifications based on multiple spectra, including the retirement of the old method for doing this;</simpara>
</listitem>
<listitem>
<simpara>the ability to supplement the specification with extension documents covering specific use cases.
The general guidance on the mzIdentML file format given in the main specification document all applies here, with this extension document giving guidance on the use of the controlled vocabulary (CV) terms specific to crosslinking.</simpara>
</listitem>
</orderedlist>
<simpara>The previously supported crosslinking use case was described in the main mzIdentML 1.2.0 specification document.
In mzIdentML 1.3.0 this information has been moved to this extension document but it remains unchanged.
The only change to the previous version 1.2.0 support for crosslinking concerns identifications based on multiple spectra; this change is covered in Section 7.11 of the main mzIdentML 1.3.0 specification document.</simpara>
<simpara>The new use cases supported in this extension (version 1.0.0, compatible with mzIdentML version 1.3.0) are explained in detail in the following sections of this document.
All of them have new controlled vocabulary terms associated with them.</simpara>
<simpara><xref linkend="encoding-the-results-of-crosslinking-searches"/> of this extension document is organised on the basis of elements in the mzIdentML schema; see <xref linkend="overview-of-the-mzidentml-1-3-0-format-crosslinking-extension"/>. For each of the new use cases, the list below states the relevant sections of this document and the new CV terms.</simpara>
<simpara>1- Reporting cleavable crosslinkers.
See <xref linkend="modification-parameters"/>. Three new CV terms have been created related to encoding the derivatives of cleavable crosslinkers:</simpara>
<itemizedlist>
<listitem>
<simpara>“cleavable crosslinker stub” (MS:1003346),</simpara>
</listitem>
<listitem>
<simpara>“Unimod derivative code” (MS:1003347),</simpara>
</listitem>
<listitem>
<simpara>“crosslinker cleavage characteristics” (MS:1003390).</simpara>
</listitem>
</itemizedlist>
<simpara>2- Internally linked peptides (a.k.a. “looplinks”).
See Sections <xref linkend="encoding-crosslinked-peptides-in-the-element-sequencecollection"/> and <xref linkend="identifications-of-an-internally-linked-peptide"/>. One new CV term has been created to allow the encoding:</simpara>
<itemizedlist>
<listitem>
<simpara>“looplink spectrum identification item” (MS:1003329).</simpara>
</listitem>
</itemizedlist>
<simpara>3- Noncovalently associated peptides.
See <xref linkend="additional-search-parameters"/> and <xref linkend="identifications-of-noncovalently-associated-peptides"/>. Two new CV terms have been created related to noncovalently associated peptides:</simpara>
<itemizedlist>
<listitem>
<simpara>“noncovalently associated peptides search” (MS:1003330),</simpara>
</listitem>
<listitem>
<simpara>“noncovalently associated peptides spectrum identification item” (MS:1003331).</simpara>
</listitem>
</itemizedlist>
<simpara>4- Improvements in the encoding of scores and thresholds related to crosslinking results.
See <xref linkend="scores-and-thresholds"/>. Nine CV terms have been created:</simpara>
<itemizedlist>
<listitem>
<simpara>“crosslinked PSM-level global FDR” (MS:1003337),</simpara>
</listitem>
<listitem>
<simpara>“peptide-pair sequence-level global FDR” (MS:1003338),</simpara>
</listitem>
<listitem>
<simpara>“peptide-pair passes threshold” (MS:1003339),</simpara>
</listitem>
<listitem>
<simpara>“residue-pair passes threshold” (MS:1003340),</simpara>
</listitem>
<listitem>
<simpara>“protein-protein interaction passes threshold” (MS:1003341),</simpara>
</listitem>
<listitem>
<simpara>“regular expression for whether interaction score derived from crosslinking passes threshold” (MS:1003342),</simpara>
</listitem>
<listitem>
<simpara>“FDR applied separately to self crosslinks and protein heteromeric crosslinks” (MS:1003343),</simpara>
</listitem>
<listitem>
<simpara>“residue pair ref” (MS:1003344),</simpara>
</listitem>
<listitem>
<simpara>“regular expression for residue-pair ref” (MS:1003345).</simpara>
</listitem>
</itemizedlist>
</section>
<section xml:id="availability-of-documentation-and-example-files">
<title>Availability of Documentation and Example Files</title>
<simpara>All documents in their most recent form are available on the PSI website (<link xl:href="https://www.psidev.info/mzidentml">https://www.psidev.info/mzidentml</link>) and at the mzIdentML GitHub project (<link xl:href="https://github.com/HUPO-PSI/mzIdentML/tree/master/specification_document">https://github.com/HUPO-PSI/mzIdentML/tree/master/specification_document</link>).</simpara>
<simpara>The example files supporting this extension document are available at <link xl:href="https://github.com/HUPO-PSI/mzIdentML/tree/master/examples/1_3examples/crosslinking">https://github.com/HUPO-PSI/mzIdentML/tree/master/examples/1_3examples/crosslinking</link>.</simpara>
<simpara>The example files are:</simpara>
<itemizedlist>
<listitem>
<simpara>Xlink_EDC_mzIdentML_1_3_0_draft.mzid (internally linked peptides),</simpara>
</listitem>
<listitem>
<simpara>multiple_spectra_per_id_1_3_0_draft.mzid (identification based on multiple spectra),</simpara>
</listitem>
<listitem>
<simpara>noncovalently_assoc_1_3_0_draft.mzid (noncovalently associated peptides),</simpara>
</listitem>
<listitem>
<simpara>scores_and_thresholds_1_3_0_draft.mzid (scores and thresholds).</simpara>
</listitem>
</itemizedlist>
</section>
</chapter>
<chapter xml:id="controlled-vocabularies-for-encoding-crosslinks">
<title>Controlled Vocabularies for Encoding Crosslinks</title>
<simpara>A collection of terms for describing a certain domain is called a controlled vocabulary (CV) <xref linkend="mayer2014"/>.
Section 4.1 of the main mzIdentML 1.3.0 document describes the use of CVs in mzIdentML.
The PSI-MS CV (<link xl:href="https://github.com/HUPO-PSI/psi-ms-CV"><phrase role="underline">https://github.com/HUPO-PSI/psi-ms-CV</phrase></link>) can be used to encode many types of technical information in mzIdentML (e.g. statistical scores, mass spectrometers, etc.).
There are two other CVs that are relevant to encoding crosslinking data in mzIdentML: Unimod and XLMOD.
XLMOD (<link xl:href="https://raw.githubusercontent.com/HUPO-PSI/mzIdentML/master/cv/XLMOD.obo">https://raw.githubusercontent.com/HUPO-PSI/mzIdentML/master/cv/XLMOD.obo</link>) represents the crosslinker reagents.
Unimod (<link xl:href="http://www.unimod.org/obo/unimod.obo">http://www.unimod.org/obo/unimod.obo</link>) represents the resulting modifications in the crosslinked peptides/proteins.</simpara>
<simpara>At the time of writing (Unimod v2.1, XLMOD v1.1.12) both CVs have advantages and disadvantages when used for encoding crosslinking results in mzIdentML.
For example, the representation of heterobifunctional crosslinkers (crosslinkers with different reactive groups) is better in XLMOD.
However, the representation of the derivatives from a cleavable crosslinker is more complete in Unimod.
Which CV (XLMOD or Unimod) to use for encoding crosslinker modifications is left as the implementers’ choice.</simpara>
<simpara>There is also some overlap between the information stored in these CVs and the contents of the &lt;SearchModification&gt; elements in mzIdentML.
The &lt;SearchModification&gt; elements can encode: the derivatives of cleavable crosslinkers, namely the crosslinker stub as a peptide modification on the MS3 level and crosslinker cleavability as stub fragments on the MS2 level; and crosslinker specificity (including heterobifunctional crosslinkers).
Implementers SHOULD describe the crosslinker modifications searched for as &lt;SearchModification&gt; elements; this provides a consistent way of retrieving crosslinker modification information regardless of which CV has been used (see <xref linkend="modification-parameters"/>).</simpara>
</chapter>
<chapter xml:id="encoding-the-results-of-crosslinking-searches">
<title>Encoding the Results of Crosslinking Searches</title>
<section xml:id="encoding-the-results-of-crosslinking-searches-introduction">
<title>Introduction</title>
<simpara>mzIdentML documents MUST indicate that they are implementing the guidance in this extension document by including the following CV term inside the top-level &lt;MzIdentML&gt; element, immediately after the &lt;cvList&gt; element:</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;cvParam cvRef="PSI-MS" accession="MS:1003385" name="mzIdentML crosslinking extension document version" value="1.0.0"/&gt;</programlisting>
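<simpara>As a minimal sketch of this placement (not a complete or validated document), the term sits directly after the closing &lt;cvList&gt; tag. The &lt;cv&gt; locations are the ones cited in the chapter on controlled vocabularies; the id and fullName values, the use of the PSI-MS repository address as a uri, and the elided root attributes and remaining content are assumptions made for illustration.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;MzIdentML ...&gt;
    &lt;cvList&gt;
        &lt;!-- ids and fullNames are illustrative assumptions --&gt;
        &lt;cv id="PSI-MS" fullName="PSI-MS" uri="https://github.com/HUPO-PSI/psi-ms-CV"/&gt;
        &lt;cv id="UNIMOD" fullName="UNIMOD" uri="http://www.unimod.org/obo/unimod.obo"/&gt;
        &lt;cv id="XLMOD" fullName="XLMOD" uri="https://raw.githubusercontent.com/HUPO-PSI/mzIdentML/master/cv/XLMOD.obo"/&gt;
    &lt;/cvList&gt;
    &lt;!-- extension declaration: immediately after cvList --&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003385" name="mzIdentML crosslinking extension document version" value="1.0.0"/&gt;
    ...
&lt;/MzIdentML&gt;</programlisting>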
<simpara>Crosslinked peptides presented a challenge for mzIdentML 1.2.0, since more than one peptide can be identified from the same spectrum.</simpara>
<simpara>mzIdentML 1.2.0 solved this by:</simpara>
<itemizedlist>
<listitem>
<simpara>introducing the “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) CV terms – the values of these terms associate <emphasis role="strong">either</emphasis> &lt;SearchModification&gt; elements (see <xref linkend="modification-parameters"/>) or &lt;Modification&gt; elements (see <xref linkend="encoding-crosslinked-peptides-in-the-element-sequencecollection"/>);</simpara>
</listitem>
<listitem>
<simpara>introducing the “crosslink spectrum identification item” (MS:1002511) CV term – the value of this term groups &lt;SpectrumIdentificationItem&gt; elements within a &lt;SpectrumIdentificationResult&gt; (see <xref linkend="encoding-identified-crosslinks-in-spectrumidentificationitem-elements"/>).</simpara>
</listitem>
</itemizedlist>
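<simpara>The grouping described in the second item above can be sketched as follows. This fragment is illustrative only: all ids, charge states, m/z values and the grouping value “1” are hypothetical, and the complete rules are those given in <xref linkend="encoding-identified-crosslinks-in-spectrumidentificationitem-elements"/>.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationResult id="SIR_1" spectrumID="index=0" spectraData_ref="SD_1"&gt;
    &lt;!-- One item per peptide; the shared value of MS:1002511 groups them (hypothetical values) --&gt;
    &lt;SpectrumIdentificationItem id="SII_1_a" rank="1" chargeState="3" peptide_ref="peptide_1_a"
            experimentalMassToCharge="700.1234" calculatedMassToCharge="700.1230" passThreshold="true"&gt;
        &lt;cvParam accession="MS:1002511" cvRef="PSI-MS" name="crosslink spectrum identification item" value="1"/&gt;
    &lt;/SpectrumIdentificationItem&gt;
    &lt;SpectrumIdentificationItem id="SII_1_b" rank="1" chargeState="3" peptide_ref="peptide_1_b"
            experimentalMassToCharge="700.1234" calculatedMassToCharge="700.1230" passThreshold="true"&gt;
        &lt;cvParam accession="MS:1002511" cvRef="PSI-MS" name="crosslink spectrum identification item" value="1"/&gt;
    &lt;/SpectrumIdentificationItem&gt;
&lt;/SpectrumIdentificationResult&gt;</programlisting>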
<simpara>Note that “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) are used in two different contexts:</simpara>
<itemizedlist>
<listitem>
<simpara>/MzIdentML/AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams/SearchModification – encoding the modifications searched for (including the specificity, see <xref linkend="modification-parameters"/>);</simpara>
</listitem>
<listitem>
<simpara>/MzIdentML/SequenceCollection/Peptide/Modification – encoding the actual modifications present in the crosslinked peptides (see <xref linkend="encoding-crosslinked-peptides-in-the-element-sequencecollection"/>).</simpara>
</listitem>
</itemizedlist>
<simpara>The rules that govern their use differ in each context; the details of these rules are given in <xref linkend="modification-parameters"/> and <xref linkend="encoding-crosslinked-peptides-in-the-element-sequencecollection"/>. To emphasise that they differ, <xref linkend="comparison-of-rules-for-crosslink-donor-and-crosslink-acceptor-depending-on-context"/> compares them.
<xref linkend="comparison-of-rules-for-crosslink-donor-and-crosslink-acceptor-depending-on-context"/> presents no new information on how to encode crosslinking results in mzIdentML.</simpara>
<simpara><xref linkend="overview-of-the-mzidentml-1-3-0-format-crosslinking-extension"/> gives an overview of how the subsections here (<xref linkend="encoding-the-results-of-crosslinking-searches"/>) relate to the elements in an mzIdentML file.</simpara>
<figure xml:id="overview-of-the-mzidentml-1-3-0-format-crosslinking-extension">
<title>Overview of the mzIdentML 1.3.0 format (crosslinking extension). Elements are labelled with the section from this document that contains guidance on how to encode them.</title>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/overview13.svg" contentwidth="100%" align="center"/>
</imageobject>
<textobject><phrase>overview13</phrase></textobject>
</mediaobject>
</figure>
</section>
<section xml:id="spectrumidentificationprotocol-elements">
<title>&lt;SpectrumIdentificationProtocol&gt; Elements</title>
<simpara>A &lt;SpectrumIdentificationProtocol&gt; element describes the parameters and settings of a spectrum identification analysis.
There may be several of these protocols included in one mzIdentML file.
In the case of analysis workflows in which an identification is based on multiple spectra (see Section 7.11 of the main mzIdentML 1.3.0 specification document), these spectrum identifications may be included in different &lt;SpectrumIdentificationList&gt; elements, each associated with a different &lt;SpectrumIdentificationProtocol&gt;.</simpara>
<simpara>Section 2 of the main mzIdentML 1.3.0 specification document states that “all search parameters should be described in sufficient detail to enable a user to run the same or a similar search on the same or another search engine”.
As far as possible, the information that would be needed to reannotate the mass spectra SHOULD be included.
The &lt;FragmentTolerance&gt; and &lt;ParentTolerance&gt; subelements of &lt;SpectrumIdentificationProtocol&gt; SHOULD be completed.</simpara>
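<simpara>A completed pair of tolerance elements might look like the sketch below. The tolerance values (10 and 5 parts per million) are hypothetical, and the term names and accessions for the tolerance and unit terms are quoted from the PSI-MS and Unit Ontology vocabularies rather than from this document, so they should be checked against the current CV files.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;!-- Hypothetical values: fragment tolerance of +/- 10 ppm --&gt;
&lt;FragmentTolerance&gt;
    &lt;cvParam accession="MS:1001412" cvRef="PSI-MS" name="search tolerance plus value" value="10"
             unitAccession="UO:0000169" unitName="parts per million" unitCvRef="UO"/&gt;
    &lt;cvParam accession="MS:1001413" cvRef="PSI-MS" name="search tolerance minus value" value="10"
             unitAccession="UO:0000169" unitName="parts per million" unitCvRef="UO"/&gt;
&lt;/FragmentTolerance&gt;
&lt;!-- Hypothetical values: precursor tolerance of +/- 5 ppm --&gt;
&lt;ParentTolerance&gt;
    &lt;cvParam accession="MS:1001412" cvRef="PSI-MS" name="search tolerance plus value" value="5"
             unitAccession="UO:0000169" unitName="parts per million" unitCvRef="UO"/&gt;
    &lt;cvParam accession="MS:1001413" cvRef="PSI-MS" name="search tolerance minus value" value="5"
             unitAccession="UO:0000169" unitName="parts per million" unitCvRef="UO"/&gt;
&lt;/ParentTolerance&gt;</programlisting>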
<simpara>Two child elements of &lt;SpectrumIdentificationProtocol&gt; are covered in more detail here:</simpara>
<itemizedlist>
<listitem>
<simpara>&lt;AdditionalSearchParams&gt; (<xref linkend="additional-search-parameters"/>),</simpara>
</listitem>
<listitem>
<simpara>&lt;ModificationParams&gt; (<xref linkend="modification-parameters"/>).</simpara>
</listitem>
</itemizedlist>
<section xml:id="additional-search-parameters">
<title>Additional Search Parameters</title>
<simpara><emphasis role="strong">Path:</emphasis> <phrase role="underline">/MzIdentML/AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams</phrase></simpara>
<simpara>If a crosslinking search has been performed then the CV term “crosslinking search” (MS:1002494) MUST be present within the &lt;AdditionalSearchParams&gt; subelement of every &lt;SpectrumIdentificationProtocol&gt; associated with that search (see <xref linkend="xml-snippet-showing-crosslinking-related-cv-terms"/> <emphasis role="marked">(1)</emphasis>).</simpara>
<simpara>The ion series that were searched for SHOULD also be included here.</simpara>
<simpara><emphasis><phrase role="underline">New supported use case in this extension - noncovalently associated peptides:</phrase></emphasis> mzIdentML 1.3.0 introduces a new CV term – “noncovalently associated peptides search” (MS:1003330).
If pairs of noncovalently associated peptides were also searched for, then the &lt;SpectrumIdentificationProtocol&gt; elements MUST also contain this new CV term within their &lt;AdditionalSearchParams&gt; subelement, see <xref linkend="xml-snippet-showing-crosslinking-related-cv-terms"/> <emphasis role="marked">(2)</emphasis>.</simpara>
<simpara>The new CV term “FDR applied separately to self crosslinks and protein heteromeric crosslinks” (MS:1003343), which SHOULD be present, is also shown in <xref linkend="xml-snippet-showing-crosslinking-related-cv-terms"/> <emphasis role="marked">(3)</emphasis>; see <xref linkend="scores-and-thresholds"/>.</simpara>
<formalpara xml:id="xml-snippet-showing-crosslinking-related-cv-terms">
<title>XML snippet showing crosslinking related CV terms in &lt;AdditionalSearchParams&gt; element.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;AnalysisProtocolCollection&gt;
    &lt;SpectrumIdentificationProtocol analysisSoftware_ref="ID_software" id="SearchProtocol_1"&gt;
        &lt;SearchType&gt;
            &lt;cvParam accession="MS:1001083" cvRef="PSI-MS" name="ms-ms search"/&gt;
        &lt;/SearchType&gt;
        &lt;AdditionalSearchParams&gt;
            &lt;cvParam accession="MS:1001211" cvRef="PSI-MS" name="parent mass type mono"/&gt;
            &lt;cvParam accession="MS:1001256" cvRef="PSI-MS" name="fragment mass type mono"/&gt;
            &lt;cvParam accession="MS:1002494" cvRef="PSI-MS" name="crosslinking search"/&gt; <co xml:id="CO1-1"/>
            &lt;cvParam accession="MS:1003330" cvRef="PSI-MS" name="noncovalently associated peptides search"/&gt; <co xml:id="CO1-2"/>
            &lt;cvParam accession="MS:1003343" cvRef="PSI-MS" name="FDR applied separately to self crosslinks and protein heteromeric crosslinks" value="true"/&gt; <co xml:id="CO1-3"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001118" name="param: b ion"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001262" name="param: y ion"/&gt;
        &lt;/AdditionalSearchParams&gt;
        ...
    &lt;/SpectrumIdentificationProtocol&gt;
&lt;/AnalysisProtocolCollection&gt;</programlisting>
</para>
</formalpara>
</section>
<section xml:id="modification-parameters">
<title>Modification Parameters</title>
<simpara><emphasis role="strong">Path:</emphasis> <phrase role="underline">/MzIdentML/AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams/SearchModification</phrase></simpara>
<simpara>The &lt;SpectrumIdentificationProtocol&gt; element encodes the modifications that were searched for within its &lt;ModificationParams&gt; subelement.
These are encoded in &lt;SearchModification&gt; elements within &lt;ModificationParams&gt;.</simpara>
<simpara>mzIdentML version 1.3.0 introduces two new CV terms to link &lt;SearchModification&gt; elements and &lt;Modification&gt; elements - “search modification id” (MS:1003392) which goes inside &lt;SearchModification&gt; elements, and “search modification id ref” (MS:1003393) which goes inside &lt;Modification&gt; elements.
Making this link is optional but recommended where possible.
In the case of open modification searches, such a link cannot be made.
See Section 7.12 of the main mzIdentML specification document.</simpara>
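<simpara>The link between the two elements can be sketched as below, using an ordinary (non-crosslinker) modification for brevity. The identifier “search_mod_1”, the peptide id and the sequence are hypothetical, and the exact form of the value is an assumption; Section 7.12 of the main specification document remains the authoritative description.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;!-- In /MzIdentML/AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams --&gt;
&lt;SearchModification fixedMod="false" massDelta="15.994915" residues="M"&gt;
    &lt;cvParam accession="UNIMOD:35" cvRef="UNIMOD" name="Oxidation"/&gt;
    &lt;!-- hypothetical identifier, assumed to be unique among the search modifications --&gt;
    &lt;cvParam accession="MS:1003392" cvRef="PSI-MS" name="search modification id" value="search_mod_1"/&gt;
&lt;/SearchModification&gt;

&lt;!-- In /MzIdentML/SequenceCollection --&gt;
&lt;Peptide id="peptide_1"&gt;
    &lt;PeptideSequence&gt;LMEK&lt;/PeptideSequence&gt;
    &lt;Modification location="2" residues="M" monoisotopicMassDelta="15.994915"&gt;
        &lt;cvParam accession="UNIMOD:35" cvRef="UNIMOD" name="Oxidation"/&gt;
        &lt;!-- refers back to the search modification above --&gt;
        &lt;cvParam accession="MS:1003393" cvRef="PSI-MS" name="search modification id ref" value="search_mod_1"/&gt;
    &lt;/Modification&gt;
&lt;/Peptide&gt;</programlisting>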
<simpara>Each crosslinker reagent is defined by multiple &lt;SearchModification&gt; elements that contain either the “crosslink donor” (MS:1002509) or “crosslink acceptor” (MS:1002510) CV term.
An example is given in <xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/>.
The residue specificities of the crosslinkers used SHOULD be encoded here; examples are given in <xref linkend="example-encodings-of-crosslinker-reagents-as-searchmodification-elements"/>.</simpara>
<simpara>The value slot of the crosslink donor and acceptor CV terms is interpreted as a local identifier for the &lt;SearchModification&gt; elements describing a single reagent.
The rules governing the use of the crosslink donor and acceptor CV terms in &lt;SearchModification&gt; elements are given below (see <xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/>).</simpara>
<simpara>More than two &lt;SearchModification&gt; elements may be required, for example if the crosslinker reacts with the sidechains and also with the protein termini; see <xref linkend="example-encodings-of-crosslinker-reagents-as-searchmodification-elements"/> for examples.</simpara>
<simpara>&lt;SearchModification&gt; elements can contain one or more children of the CV term “peptide modification details” (MS:1001471).
These CV terms can encode information on neutral losses, see <xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/>.</simpara>
<simpara><emphasis><phrase role="underline">New supported use case in this extension - cleavable crosslinkers:</phrase></emphasis> mzIdentML 1.3.0 adds three new CV terms relating to modifications from cleavable crosslinkers – “cleavable crosslinker stub” (MS:1003346), “Unimod derivative code” (MS:1003347) and “crosslinker cleavage characteristics” (MS:1003390).</simpara>
<simpara>At the MS3 level, where single peptides and part of the cleaved crosslinker are identified, the crosslinker modifications SHOULD include the CV term “cleavable crosslinker stub” (MS:1003346).</simpara>
<simpara>The crosslink stub modification MUST also have a suitably sourced CV term for the reagent (see <xref linkend="example-encodings-of-crosslinker-reagents-as-searchmodification-elements"/>).
Additionally, if Unimod is being used as the CV, then the CV term “Unimod derivative code” (MS:1003347) MAY be used to state which derivative of the cleaved crosslinker is identified.
The single-letter derivative codes in Unimod are chosen arbitrarily when a linker definition is added to Unimod.
For instance, in <link xl:href="https://unimod.org/xlink.html"><phrase role="underline">https://unimod.org/xlink.html</phrase></link> one can find the examples "A for alkene, S for sulfenic acid, and T for thiol", and e.g. Xlink:DSS uses W for loss of water.
There is no formal vocabulary for the single-letter codes.
The value of “Unimod derivative code” must be one of the derivative codes defined in the corresponding Unimod entry, not an arbitrary character unrelated to that definition.
An example Unimod entry is at <link xl:href="https://www.unimod.org/modifications_view.php?editid1=1842"><phrase role="underline">https://www.unimod.org/modifications_view.php?editid1=1842</phrase></link>.</simpara>
<simpara>At the MS2 level, the new CV term “crosslinker cleavage characteristics” (MS:1003390) signifies that the crosslinker is cleavable and on cleavage can leave a given stub.
This can lead to additional stub fragments in the MS2 spectra that contain the crosslinker stub instead of the whole crosslinker plus the second peptide.
Each “crosslinker cleavage characteristics” CV term represents one possible crosslinker stub.
It has a structured value of the form:</simpara>
<simpara><emphasis>name</emphasis>:<emphasis>mass</emphasis>:<emphasis>pairs with</emphasis></simpara>
<simpara><emphasis>Name</emphasis> must be a single character that identifies this stub.
The scope of <emphasis>name</emphasis> is restricted to that crosslinker definition, i.e. names need only be unique within that crosslinker definition, not within the whole file or the &lt;SpectrumIdentification&gt; element.
<emphasis>Mass</emphasis> gives the monoisotopic mass delta of the resulting stub in Daltons.
<emphasis>Pairs with</emphasis> MUST be a sequence of one or more characters, giving the <emphasis>name(s)</emphasis> of the partner stub(s).
See Appendix II for examples.</simpara>
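<simpara>As a minimal sketch of this structured value (not taken from Appendix II, which remains the authoritative source of examples), a &lt;SearchModification&gt; for DSSO might carry one “crosslinker cleavage characteristics” term per possible stub. The stub names A, S and T reuse the Unimod derivative letters quoted above, and the masses for A and T are the DSSO stub masses used elsewhere in this document. The mass for S (158.003765 minus 54.010565), the <emphasis>pairs with</emphasis> values, the residue specificity and the choice to attach the terms to the donor element are assumptions made for illustration only.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;SearchModification fixedMod="false" massDelta="158.003765" residues="K"&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_donor"/&gt;
    &lt;cvParam cvRef="UNIMOD" accession="UNIMOD:1842" name="Xlink:DSSO"/&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="1"/&gt;
    &lt;!-- One term per stub, value = name:mass:pairs with --&gt;
    &lt;!-- Assumed pairing: alkene stub A may be partnered by stub S or stub T --&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003390" name="crosslinker cleavage characteristics" value="A:54.010565:ST"/&gt;
    &lt;!-- Assumed mass for S, derived as 158.003765 - 54.010565 --&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:103.993200:A"/&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003390" name="crosslinker cleavage characteristics" value="T:85.982635:A"/&gt;
&lt;/SearchModification&gt;</programlisting>
<simpara>In this sketch each <emphasis>name</emphasis> is a single character that is unique within the DSSO definition, and every character listed in a <emphasis>pairs with</emphasis> field is itself the <emphasis>name</emphasis> of another stub in the same definition.</simpara>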
<simpara>Note that the choice of which &lt;SearchModification&gt; is the donor and which one is the acceptor is arbitrary.</simpara>
<formalpara xml:id="crosslink-donor-and-crosslink-acceptor-searchmodification-elements">
<title>XML snippet showing the “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) CV terms used in &lt;SearchModification&gt; elements to encode the BS3 crosslinking reagent. It also shows a modification with a neutral loss.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol&gt;
    ...
    &lt;ModificationParams&gt;
        &lt;SearchModification fixedMod="false" massDelta="138.06808" residues="S T Y K"&gt;<co xml:id="CO1-4"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_donor"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;<co xml:id="CO1-5"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="0"/&gt;<co xml:id="CO1-6"/>
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="138.06808" residues="."&gt;<co xml:id="CO1-7"/>
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_donor_n_term"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;<co xml:id="CO1-8"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink donor" value="0"/&gt;<co xml:id="CO1-9"/>
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="S T Y K"&gt;<co xml:id="CO1-10"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_acceptor"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;<co xml:id="CO1-11"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="0"/&gt;<co xml:id="CO1-12"/>
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="."&gt;<co xml:id="CO1-13"/>
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002058" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_acceptor_n_term"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;<co xml:id="CO1-14"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="0"/&gt;<co xml:id="CO1-15"/>
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="15.994919" residues="M"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="Mox"/&gt;
            &lt;cvParam accession="UNIMOD:35" name="Oxidation" cvRef="UNIMOD"/&gt;
            &lt;cvParam accession="MS:1001524" name="fragment neutral loss" cvRef="PSI-MS" value="63.998291" unitAccession="UO:0000221" unitName="dalton" unitCvRef="UO"/&gt;
        &lt;/SearchModification&gt;
    &lt;/ModificationParams&gt;
    ...
&lt;/SpectrumIdentificationProtocol&gt;</programlisting>
</para>
</formalpara>
<simpara>The rules applying to the use of the “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) CV terms within &lt;SearchModification&gt;:</simpara>
<itemizedlist>
<listitem>
<simpara><emphasis role="strong">At least two</emphasis> &lt;SearchModification&gt; elements SHOULD be used to encode each crosslink reagent, to encode the site specificity of both the donor and acceptor termini of the reagent. (<xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/> <emphasis role="marked">(1)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara>The value slot of the crosslink donor and acceptor CV terms is interpreted as a local identifier for the &lt;SearchModification&gt; elements describing a single reagent. (<xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/> <emphasis role="marked">(1)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara>The choice of which reactive group is the donor and which is the acceptor is arbitrary.</simpara>
</listitem>
<listitem>
<simpara>The crosslink donor &lt;SearchModification&gt; element <emphasis role="strong">MUST</emphasis> have the attribute massDelta = the mass gain from the crosslink reagent. (<xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/> <emphasis role="marked">(2)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara>The crosslink acceptor peptide’s &lt;SearchModification&gt; element <emphasis role="strong">MUST</emphasis> have massDelta = 0. (<xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/> <emphasis role="marked">(3)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara><emphasis role="strong">Both</emphasis> acceptor and donor <emphasis role="strong">MUST</emphasis> have a suitably sourced &lt;cvParam&gt;. (<xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/> <emphasis role="marked">(4)</emphasis>)</simpara>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="encoding-crosslinked-peptides-in-the-element-sequencecollection">
<title>Encoding Crosslinked Peptides in the Element &lt;SequenceCollection&gt;</title>
<simpara><phrase role="underline"><emphasis role="strong">Path:</emphasis> /MzIdentML/SequenceCollection</phrase></simpara>
<simpara>The peptides that have been identified are encoded in the &lt;SequenceCollection&gt; element.
This will include both crosslinked and uncrosslinked peptides.</simpara>
<simpara>A word of warning about redundancy: it is not the intention of mzIdentML that every &lt;SpectrumIdentificationItem&gt; (<xref linkend="encoding-identified-crosslinks-in-spectrumidentificationitem-elements"/>) references a new &lt;Peptide&gt; in &lt;SequenceCollection&gt; – “the combination of &lt;Peptide&gt; sequence and modifications MUST be unique in the file” (main mzIdentML specification document, Section 6.68).
However, each distinct combination of crosslinked peptides will require a new pair of &lt;Peptide&gt; elements in &lt;SequenceCollection&gt;.</simpara>
<simpara>To represent the crosslinked peptides, mzIdentML 1.2.0 added a mechanism for linking two different &lt;Peptide&gt; elements together, using the CV terms “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510).
An identical value for these terms indicates that the two modifications, and therefore their peptides, are grouped together, see <xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/>.</simpara>
<simpara>The rules governing the use of the crosslink donor and acceptor CV terms in &lt;Modification&gt; elements are given below <xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/>.</simpara>
<simpara>As of mzIdentML 1.3.0, &lt;Modification&gt; elements MAY contain the CV term "search modification id ref" (MS:1003393) to link a &lt;Modification&gt; to a &lt;SearchModification&gt; element.
The value of this term is the unique id of the &lt;SearchModification&gt; as defined by its "search modification id" (MS:1003392) CV term.
It is recommended to use this approach for the encoding of modifications from crosslinkers, see <xref linkend="example-encodings-of-crosslinker-reagents-as-searchmodification-elements"/>.</simpara>
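<simpara>The link between the two elements can be sketched as follows, reusing the BS3 donor definition from <xref linkend="crosslink-donor-and-crosslink-acceptor-searchmodification-elements"/>. This is an illustrative fragment rather than a complete file: the peptide id, the modified position and the crosslink donor value in the &lt;Modification&gt; are hypothetical.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;!-- In /MzIdentML/AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams --&gt;
&lt;SearchModification fixedMod="false" massDelta="138.06808" residues="S T Y K"&gt;
    &lt;!-- "search modification id" gives this element its unique id --&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_donor"/&gt;
    &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="0"/&gt;
&lt;/SearchModification&gt;
...
&lt;!-- In /MzIdentML/SequenceCollection (hypothetical peptide) --&gt;
&lt;Peptide id="example_peptide_1"&gt;
    &lt;PeptideSequence&gt;AAFTKQAADK&lt;/PeptideSequence&gt;
    &lt;Modification monoisotopicMassDelta="138.06808" location="5"&gt;
        &lt;!-- "search modification id ref" repeats the id declared above --&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="BS3_donor"/&gt;
        &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="1"/&gt;
    &lt;/Modification&gt;
&lt;/Peptide&gt;</programlisting>
<simpara>Note that the value of “search modification id ref” matches the value of “search modification id”, whereas the values of the two “crosslink donor” terms are independent identifiers with different scopes.</simpara>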
<simpara><emphasis><phrase role="underline">New supported use case in this extension - internally linked peptide:</phrase></emphasis> An internally linked peptide has both ends of the crosslinker within it.
To encode an internally crosslinked peptide the &lt;Peptide&gt; can contain one &lt;Modification&gt; element with the “crosslink donor” CV term and one &lt;Modification&gt; element with the “crosslink acceptor” CV term.
The same rules apply to these CV terms when encoding internally linked peptides as when encoding crosslinked peptides (below <xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/>).
For an example of how to encode an internally linked peptide, see <xref linkend="encoding-internally-linked-peptide-in-sequencecollection-element"/>.</simpara>
<simpara>The accompanying example file <link xl:href="https://github.com/HUPO-PSI/mzIdentML/blob/master/examples/1_3examples/crosslinking/multiple_spectra_per_id_1_3_0_draft.mzid"><phrase role="underline">multiple_spectra_per_id_1_3_0_draft.mzid</phrase></link> illustrates a common cleavable crosslinker workflow <xref linkend="liu2017"/>.</simpara>
<simpara>Child CV terms of “peptide modification details” (MS:1001471) can be included in &lt;Modification&gt; elements to provide additional information about the modification, including the new cleavable crosslinker related CV terms, see <xref linkend="modification-parameters"/>. This is not recommended if the &lt;Modification&gt; elements have "search modification id ref" (MS:1003393) CV terms to link them to a &lt;SearchModification&gt; element, as it would add unnecessary duplication to the file.</simpara>
<simpara>The encoding for crosslinked peptides MAY be combined with the encoding for modification localisation scoring, using the same mechanism (main mzIdentML 1.3.0 document, Section 5.2.8).</simpara>
<formalpara xml:id="encoding-of-crosslinked-peptides-in-sequencecollection-element">
<title>XML snippet showing the encoding of crosslinked peptides in &lt;SequenceCollection&gt; element.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SequenceCollection&gt;
    &lt;Peptide id="30491856_30492180_2_4_p1"&gt;
        &lt;PeptideSequence&gt;AAFTKQAADK&lt;/PeptideSequence&gt;
        &lt;Modification monoisotopicMassDelta="138.0680796" location="5"&gt;<co xml:id="CO1-16"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSS_donor"/&gt;
            &lt;cvParam accession="XL:00002" cvRef="PSI-MS" name="Xlink:DSS"/&gt;<co xml:id="CO1-17"/>
            &lt;cvParam accession="MS:1002509" cvRef="PSI-MS" name="crosslink donor" value="*5448*"/&gt;<co xml:id="CO1-18"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;Peptide id="30491856_30492180_2_4_p2"&gt;
        &lt;PeptideSequence&gt;AMYPPKEDR&lt;/PeptideSequence&gt;
        &lt;Modification monoisotopicMassDelta="0.0" location="6"&gt;<co xml:id="CO1-19"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSS_acceptor"/&gt;
            &lt;cvParam accession="MS:1002510" cvRef="PSI-MS" name="crosslink acceptor" value="*5448*"/&gt;<co xml:id="CO1-20"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    ...
&lt;/SequenceCollection&gt;</programlisting>
</para>
</formalpara>
<simpara>If a pair of crosslinked peptides has been identified, the rules applying to the use of the “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) CV terms within &lt;Modification&gt; elements are:</simpara>
<itemizedlist>
<listitem>
<simpara>One peptide’s &lt;Modification&gt; element MUST be flagged as “crosslink donor” and one MUST be flagged as “crosslink acceptor”. (<xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/> <emphasis role="marked">(1)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara>A unique identifier linking exactly <emphasis role="strong">two</emphasis> &lt;Modification&gt; elements together <emphasis role="strong">MUST</emphasis> be in the value slot. (<xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/> <emphasis role="marked">(1)</emphasis>)
(Thereby excluding the representation of trimeric crosslinkers, see §6.)</simpara>
</listitem>
<listitem>
<simpara>If the CV term “search modification id ref” (MS:1003393) is used, the crosslink donor MUST be chosen to match the end marked as the donor in the corresponding &lt;SearchModification&gt; elements, see §3.2.2. If that CV term is not used, or if the preceding rule does not unambiguously define which end to mark as the donor (e.g. because the crosslinker is symmetrical), the export software SHOULD choose the crosslink donor by applying the following criteria in order: the longer peptide, then the peptide with the higher neutral mass, then alphabetical order.</simpara>
</listitem>
<listitem>
<simpara>The crosslink donor &lt;Modification&gt; element <emphasis role="strong">MUST</emphasis> have the attribute monoisotopicMassDelta = the mass gain from the crosslink reagent. (<xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/> <emphasis role="marked">(2)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara>The crosslink acceptor peptide’s &lt;Modification&gt; element <emphasis role="strong">MUST</emphasis> have monoisotopicMassDelta = 0. (<xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/> <emphasis role="marked">(3)</emphasis>)</simpara>
</listitem>
<listitem>
<simpara>The crosslink donor peptide’s &lt;Modification&gt; element <emphasis role="strong">MUST</emphasis> have a suitably sourced cvParam for the crosslink.
The crosslink acceptor peptide’s &lt;Modification&gt; element <emphasis role="strong">MUST</emphasis> <emphasis role="strong">NOT</emphasis> have a cvParam for the reagent. (<xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/> <emphasis role="marked">(4)</emphasis>)</simpara>
</listitem>
</itemizedlist>
<formalpara xml:id="encoding-internally-linked-peptide-in-sequencecollection-element">
<title>XML snippet showing the encoding of an internally linked peptide.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SequenceCollection&gt;
    &lt;Peptide id="peptide_7_1"&gt;
        &lt;PeptideSequence&gt;DVIQSLVDDDLVAK&lt;/PeptideSequence&gt;
        &lt;Modification location="10" residues="D" monoisotopicMassDelta="-18.010565"&gt;<co xml:id="CO1-21"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="EDC_donor"/&gt;
            &lt;cvParam accession="UNIMOD:2018" name="Xlink:EDC" cvRef="UNIMOD"/&gt;<co xml:id="CO1-22"/>
            &lt;cvParam accession="MS:1002509" cvRef="PSI-MS" name="crosslink donor" value="*100*"/&gt;<co xml:id="CO1-23"/>
        &lt;/Modification&gt;
        &lt;Modification location="14" residues="K" monoisotopicMassDelta="0.0"&gt;<co xml:id="CO1-24"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="EDC_acceptor"/&gt;<co xml:id="CO1-25"/>
            &lt;cvParam accession="MS:1002510" cvRef="PSI-MS" name="crosslink acceptor" value="*100*"/&gt;<co xml:id="CO1-26"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    ...
&lt;/SequenceCollection&gt;</programlisting>
</para>
</formalpara>
<formalpara xml:id="encoding-of-modifications-from-cleavable-crosslinkers">
<title><emphasis role="strong">XML snippet showing the encoding of modifications from cleavable crosslinkers.</emphasis> The new CV terms are shown: “cleavable crosslinker stub” (MS:1003346) <emphasis role="marked">(1)</emphasis> and “Unimod derivative code” (MS:1003347) <emphasis role="marked">(2)</emphasis>. This example also uses the new CV term "search modification id ref" (MS:1003393) to reference the corresponding &lt;SearchModification&gt; elements.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SequenceCollection&gt;
    &lt;!-- linear peptides--&gt;
    &lt;Peptide id="p1_linear"&gt;
        &lt;PeptideSequence&gt;PEPKR&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="176.01433"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_monolink_W"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="W" cvRef="PSI-MS"/&gt;<co xml:id="CO1-27"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;!-- crosslinked peptides --&gt;
    &lt;Peptide id="p1"&gt;
        &lt;PeptideSequence&gt;PEPKR&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="158.003765"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_donor"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="1"/&gt;
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;Peptide id="p2"&gt;
        &lt;PeptideSequence&gt;TIDYK&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="0"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_acceptor"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="1"/&gt;
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;!-- MS3 peptides are separately listed, as they are linear stub modified peptides --&gt;
    &lt;Peptide id="p1_a"&gt;
        &lt;PeptideSequence&gt;PEPKR&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="54.010565"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_crosslink_stub_a"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="A" cvRef="PSI-MS"/&gt;<co xml:id="CO1-28"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;<co xml:id="CO1-29"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;Peptide id="p1_t"&gt;
        &lt;PeptideSequence&gt;PEPKR&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="85.982635"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_crosslink_stub_t"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="T" cvRef="PSI-MS"/&gt;<co xml:id="CO1-30"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;<co xml:id="CO1-31"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;Peptide id="p2_a"&gt;
        &lt;PeptideSequence&gt;TIDYK&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="54.010565"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_crosslink_stub_a"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="A" cvRef="PSI-MS"/&gt;<co xml:id="CO1-32"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;<co xml:id="CO1-33"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
    &lt;Peptide id="p2_t"&gt;
        &lt;PeptideSequence&gt;TIDYK&lt;/PeptideSequence&gt;
        &lt;Modification location="4" monoisotopicMassDelta="85.982635"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003393" name="search modification id ref" value="DSSO_crosslink_stub_t"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="T" cvRef="PSI-MS"/&gt;<co xml:id="CO1-34"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;<co xml:id="CO1-35"/>
        &lt;/Modification&gt;
    &lt;/Peptide&gt;
&lt;/SequenceCollection&gt;</programlisting>
</para>
</formalpara>
</section>
<section xml:id="encoding-identified-crosslinks-in-spectrumidentificationitem-elements">
<title>Encoding Identified Crosslinks in &lt;SpectrumIdentificationItem&gt; Elements</title>
<section xml:id="identifications-of-crosslinked-peptides">
<title>Identifications of Crosslinked Peptides</title>
<simpara><phrase role="underline"><emphasis role="strong">Path:</emphasis> /MzIdentML/DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult</phrase></simpara>
<simpara>&lt;SpectrumIdentificationResult&gt; elements report the evidence associated with the identification of particular peptides.</simpara>
<simpara>A pair of crosslinked peptides within a given &lt;SpectrumIdentificationResult&gt; MUST be reported as two instances of &lt;SpectrumIdentificationItem&gt; having a shared local unique identifier as the value for the CV term “crosslink spectrum identification item” (MS:1002511).
Locally unique means unique within the containing &lt;SpectrumIdentificationResult&gt;.
See <xref linkend="encoding-the-identification-of-a-pair-of-crosslinked-peptides"/>.
The rules governing the use of the “crosslink spectrum identification item” CV term are given below <xref linkend="encoding-the-identification-of-a-pair-of-crosslinked-peptides"/>.</simpara>
<formalpara xml:id="encoding-the-identification-of-a-pair-of-crosslinked-peptides">
<title><emphasis role="strong">XML snippet showing the encoding of the identification of a pair of crosslinked peptides.</emphasis></title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationResult spectraData_ref="SID_1" spectrumID="index=2776" id="SIR_1"&gt;
    &lt;SpectrumIdentificationItem passThreshold="true" rank="*1*" peptide_ref="30491856_30492180_2_4_p1" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_1"&gt;<co xml:id="CO1-36"/>
        &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_psm121558473_pep30491845_protP02768-A_target_535"/&gt;
        &lt;cvParam accession="MS:1002511" cvRef="PSI-MS" value="*1*" name="crosslink spectrum identification item"/&gt;<co xml:id="CO1-37"/>
        &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;<co xml:id="CO1-38"/>
        &lt;cvParam accession="MS:1003344" cvRef="PSI-MS" value="54321.a" name="Residue pair ref"/&gt;
    &lt;/SpectrumIdentificationItem&gt;
    &lt;SpectrumIdentificationItem passThreshold="true" rank="*1*" peptide_ref="30491715_30491845_3_7_p0" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_2"&gt;<co xml:id="CO1-39"/>
        &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_psm121558473_pep30491715_protP02768-A_target_411"/&gt;
        &lt;cvParam accession="MS:1002511" cvRef="PSI-MS" value="*1*" name="crosslink spectrum identification item"/&gt;<co xml:id="CO1-40"/>
        &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;<co xml:id="CO1-41"/>
        &lt;cvParam accession="MS:1003344" cvRef="PSI-MS" value="54321.b" name="Residue pair ref"/&gt;
    &lt;/SpectrumIdentificationItem&gt;
&lt;/SpectrumIdentificationResult&gt;</programlisting>
</para>
</formalpara>
<simpara>If a crosslinked pair of peptides has been identified:</simpara>
<itemizedlist>
<listitem>
<simpara>There MUST be <emphasis role="strong">two</emphasis> &lt;SpectrumIdentificationItem&gt; elements with the same rank value.<emphasis role="marked">(1)</emphasis></simpara>
</listitem>
<listitem>
<simpara>Both MUST have the “crosslink spectrum identification item” cvParam, and the value acts as a <emphasis role="strong">local</emphasis> identifier within the &lt;SpectrumIdentificationResult&gt; to group these two elements together.<emphasis role="marked">(2)</emphasis></simpara>
</listitem>
<listitem>
<simpara>The experimentalMassToCharge, calculatedMassToCharge and chargeState MUST be identical over both SII elements, indicating the overall values for the pair.<emphasis role="marked">(1)</emphasis></simpara>
</listitem>
<listitem>
<simpara>If the search engine applies a score to the paired identification, both &lt;SpectrumIdentificationItem&gt; elements MUST have the same cvParam capturing the value.<emphasis role="marked">(3)</emphasis></simpara>
</listitem>
<listitem>
<simpara>The two &lt;SpectrumIdentificationItem&gt; elements MAY also have independent scores for the two chains (not shown).</simpara>
</listitem>
</itemizedlist>
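<simpara>The last rule is not illustrated in the snippet above. The following is a minimal sketch, assuming that the search engine has no dedicated CV term for per-chain scores and therefore reports them as &lt;userParam&gt; elements. The parameter name “chain score”, its values, and the peptide and peptide evidence ids are hypothetical.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationItem passThreshold="true" rank="1" peptide_ref="p1" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_1"&gt;
    &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_1"/&gt;
    &lt;cvParam accession="MS:1002511" cvRef="PSI-MS" value="1" name="crosslink spectrum identification item"/&gt;
    &lt;!-- Score for the pair: identical in both elements --&gt;
    &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;
    &lt;!-- Hypothetical score for this chain only: may differ between the two elements --&gt;
    &lt;userParam name="chain score" value="0.82"/&gt;
&lt;/SpectrumIdentificationItem&gt;
&lt;SpectrumIdentificationItem passThreshold="true" rank="1" peptide_ref="p2" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_2"&gt;
    &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_2"/&gt;
    &lt;cvParam accession="MS:1002511" cvRef="PSI-MS" value="1" name="crosslink spectrum identification item"/&gt;
    &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;
    &lt;userParam name="chain score" value="0.47"/&gt;
&lt;/SpectrumIdentificationItem&gt;</programlisting>
<simpara>Where a suitable CV term for a per-chain score exists, a &lt;cvParam&gt; would normally be preferable to a &lt;userParam&gt;.</simpara>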
</section>
<section xml:id="identifications-of-noncovalently-associated-peptides">
<title>Identifications of Noncovalently Associated Peptides</title>
<simpara><phrase role="underline"><emphasis role="strong">Path:</emphasis> /MzIdentML/DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult</phrase></simpara>
<simpara><emphasis><phrase role="underline">New supported use case in this extension - noncovalently associated peptides:</phrase></emphasis> mzIdentML 1.3.0 introduces a new CV term “noncovalently associated peptides spectrum identification item” (MS:1003331) to encode such identifications (see <xref linkend="supported-crosslinking-use-cases"/>).
It operates in the same way as “crosslink spectrum identification item”, by using the value of the CV term to group the identifications together, see <xref linkend="encoding-identification-of-noncovalently-associated-peptides"/>.</simpara>
<simpara>As indicated above, to use the “noncovalently associated peptides spectrum identification item” (MS:1003331), the element &lt;AdditionalSearchParams&gt; MUST contain the CV term “noncovalently associated peptides search” (MS:1003330), see <xref linkend="additional-search-parameters"/>.</simpara>
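<simpara>For orientation, the protocol-side declaration might look like the following sketch. Only the CV term is taken from the text above; any other search parameters are elided, and <xref linkend="additional-search-parameters"/> remains the authoritative description.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;AdditionalSearchParams&gt;
    ...
    &lt;!-- Declares that the search considered noncovalently associated peptides --&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1003330" name="noncovalently associated peptides search"/&gt;
&lt;/AdditionalSearchParams&gt;</programlisting>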
<simpara>The rules governing the use of the “noncovalently associated peptides spectrum identification item” CV term are given below <xref linkend="encoding-identification-of-noncovalently-associated-peptides"/> and are analogous to those governing the use of “crosslink spectrum identification item”.
The peptides referred to will be linear, uncrosslinked peptides.</simpara>
<formalpara xml:id="encoding-identification-of-noncovalently-associated-peptides">
<title><emphasis role="strong">Encoding the identification of a pair of noncovalently associated peptides.</emphasis></title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationResult spectraData_ref="SID_1" spectrumID="index=2776" id="SIR_1"&gt;
    &lt;SpectrumIdentificationItem passThreshold="true" rank="*1*" peptide_ref="p1" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_1"&gt;<co xml:id="CO1-42"/>
        &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_1"/&gt;
        &lt;cvParam accession="MS:1003331" cvRef="PSI-MS" value="*1*" name="noncovalently associated peptides spectrum identification item"/&gt;<co xml:id="CO1-43"/>
        &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;<co xml:id="CO1-44"/>
    &lt;/SpectrumIdentificationItem&gt;
    &lt;SpectrumIdentificationItem passThreshold="true" rank="*1*" peptide_ref="p2" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_2"&gt;<co xml:id="CO1-45"/>
        &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_2"/&gt;
        &lt;cvParam accession="MS:1003331" cvRef="PSI-MS" value="*1*" name="noncovalently associated peptides spectrum identification item"/&gt;<co xml:id="CO1-46"/>
        &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;<co xml:id="CO1-47"/>
    &lt;/SpectrumIdentificationItem&gt;
&lt;/SpectrumIdentificationResult&gt;</programlisting>
</para>
</formalpara>
<simpara>If a pair of <emphasis role="strong">noncovalently associated peptides</emphasis> has been identified:</simpara>
<itemizedlist>
<listitem>
<simpara>There MUST be <emphasis role="strong">two</emphasis> &lt;SpectrumIdentificationItem&gt; elements with the same rank value.<emphasis role="marked">(1)</emphasis></simpara>
</listitem>
<listitem>
<simpara>Both MUST have the “noncovalently associated peptides spectrum identification item” cvParam, and the value acts as a <emphasis role="strong">local</emphasis> identifier within the &lt;SpectrumIdentificationResult&gt; to group these two elements together.<emphasis role="marked">(2)</emphasis></simpara>
</listitem>
<listitem>
<simpara>The experimentalMassToCharge, calculatedMassToCharge and chargeState MUST be identical over both SII elements, indicating the overall values for the pair.<emphasis role="marked">(1)</emphasis></simpara>
</listitem>
<listitem>
<simpara>If the search engine applies a score to the paired identification, both &lt;SpectrumIdentificationItem&gt; elements MUST have the same cvParam capturing the value.<emphasis role="marked">(3)</emphasis></simpara>
</listitem>
<listitem>
<simpara>The two &lt;SpectrumIdentificationItem&gt; elements MAY also have independent scores for the two chains (not shown).</simpara>
</listitem>
</itemizedlist>
</section>
<section xml:id="identifications-of-an-internally-linked-peptide">
<title>Identifications of an Internally Linked Peptide</title>
<simpara><phrase role="underline"><emphasis role="strong">Path:</emphasis> /MzIdentML/DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult</phrase></simpara>
<simpara><emphasis><phrase role="underline">New supported use case in this extension - internally linked peptide:</phrase></emphasis> mzIdentML 1.3.0 introduces a new CV term – “looplink spectrum identification item” (MS:1003329) – to allow the encoding of internally linked peptides (a.k.a. “looplinks”), see <xref linkend="encoding-of-identification-of-internally-linked-peptide"/>. The &lt;SpectrumIdentificationItem&gt; element will refer to a &lt;Peptide&gt; containing both crosslink donor and crosslink acceptor modifications (as shown in <xref linkend="encoding-internally-linked-peptide-in-sequencecollection-element"/>).</simpara>
<formalpara xml:id="encoding-of-identification-of-internally-linked-peptide">
<title><emphasis role="strong">XML snippet including the encoding of an identification of an internally linked peptide.</emphasis> Within a &lt;SpectrumIdentificationResult&gt;, a &lt;SpectrumIdentificationItem&gt; element may be marked as referring to a looplink-containing peptide by including the CV term "looplink spectrum identification item" (MS:1003329) <emphasis role="marked">(1)</emphasis>. This &lt;SpectrumIdentificationItem&gt; <emphasis role="marked">(2)</emphasis> will refer to a &lt;Peptide&gt; containing both crosslink donor and crosslink acceptor modifications (as shown in <xref linkend="encoding-internally-linked-peptide-in-sequencecollection-element"/>).</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationResult spectraData_ref="SID_1" spectrumID="index=2776" id="SIR_1"&gt;
    &lt;SpectrumIdentificationItem passThreshold="true" rank="1" peptide_ref="looplink_p1" experimentalMassToCharge="569.7912" calculatedMassToCharge="569.79054" chargeState="4" id="SII_1_1"&gt;<co xml:id="CO1-48"/>
        &lt;PeptideEvidenceRef peptideEvidence_ref="looplink_p1_pep_evid"/&gt;
        &lt;cvParam accession="MS:1003329" cvRef="PSI-MS" name="looplink spectrum identification item"/&gt;<co xml:id="CO1-49"/>
        &lt;cvParam accession="MS:1002545" cvRef="PSI-MS" value="1.3111826921077734" name="xi:score"/&gt;
    &lt;/SpectrumIdentificationItem&gt;
&lt;/SpectrumIdentificationResult&gt;</programlisting>
</para>
</formalpara>
</section>
</section>
</chapter>
<chapter xml:id="scores-and-thresholds">
<title>Scores and Thresholds</title>
<section xml:id="scores-and-thresholds-introduction">
<title>Introduction</title>
<simpara>This section addresses the encoding of error control procedures.
This consists of encoding scores (<xref linkend="scores"/>) and the corresponding thresholds (<xref linkend="thresholds"/>) applied to those scores.
The contents of this section are all optional; at the PSM level, providing threshold information and identifications that fall below the given significance threshold is encouraged.</simpara>
<simpara><emphasis>“Depending on the intended purpose of the file, the file producer MAY wish to report a number of identifications that fall below the given significance threshold, for example to allow global statistical analyses to be performed which are not possible if only identifications passing the threshold are reported.”</emphasis> (Section 7.4 of the main mzIdentML 1.3.0 specification document)</simpara>
<simpara>mzIdentML also provides the option not to encode the peptide spectrum matches that fell below the threshold applied (see <xref linkend="thresholds"/>).</simpara>
<simpara>The correspondence between scores and the applied thresholds is indicated by using the same CV term for both.
That is, the same CV term will be used within the &lt;Threshold&gt; element and within either the related &lt;SpectrumIdentificationItem&gt; element or the related &lt;ProteinDetectionHypothesis&gt; element.</simpara>
<simpara>One specific type of score is an FDR (False Discovery Rate) score.
Comments specific to FDR are in <xref linkend="fdr-specific-comments"/>.</simpara>
<simpara>There are different points in the analysis at which thresholds may be applied <xref linkend="lenz2021"/> <xref linkend="fischer2017"/>.
These correspond to different levels of consolidation at which analyses may be performed.
Scores and thresholds are encoded differently in mzIdentML depending on the level of consolidation at which they were applied.
For crosslinking studies encoded in mzIdentML, the possible levels are:</simpara>
<itemizedlist>
<listitem>
<simpara>crosslink containing PSM (also known as Crosslink Spectrum Match, CSM), see <xref linkend="match-level-scores"/>,</simpara>
</listitem>
<listitem>
<simpara>unique peptide-pair, see <xref linkend="peptide-level-scores"/>,</simpara>
</listitem>
<listitem>
<simpara>unique residue-pair, see <xref linkend="interaction-level-scores-unique-residue-pairs-and-ppi"/>,</simpara>
</listitem>
<listitem>
<simpara>protein-protein interaction (PPI), see <xref linkend="interaction-level-scores-unique-residue-pairs-and-ppi"/>.</simpara>
</listitem>
</itemizedlist>
<simpara>Unique residue-pair and protein-protein interaction level scores are described in the same section as they are encoded using the same mechanism.</simpara>
<simpara>The example file <link xl:href="https://github.com/HUPO-PSI/mzIdentML/blob/master/examples/1_3examples/crosslinking/scores_and_thresholds_1_3_0_draft.mzid"><phrase role="underline">scores_and_thresholds_1_3_0_draft.mzid</phrase></link> gives a simplified example containing two crosslinks and shows scores and thresholds applied at all four levels.
<xref linkend="xml-showing-thresholds-applied-at-all-four-levels-of-consolidation"/>, <xref linkend="xml-encoding-of-scores-for-psm-level-matches-and-peptide-pairs"/>, <xref linkend="xml-encoding-of-scores-for-residue-pairs-and-ppis"/> and <xref linkend="xml-ambiguity-at-ppi-level"/> are XML-snippets from that example file.</simpara>
<simpara>mzIdentML allows peptide-level scores to be associated with “unique peptides” (not arbitrary groups of peptides).
There are three mutually exclusive definitions of “unique peptide”:</simpara>
<itemizedlist>
<listitem>
<simpara>“group PSMs by sequence” (MS:1002496);</simpara>
</listitem>
<listitem>
<simpara>“group PSMs by sequence with modifications” (MS:1002497);</simpara>
</listitem>
<listitem>
<simpara>“group PSMs by sequence with modifications and charge” (MS:1002498).</simpara>
</listitem>
</itemizedlist>
<simpara>If peptide level (re)scoring is used, exactly one of these CV terms must be placed in the &lt;AdditionalSearchParams&gt; element to state the definition of “unique peptide” in use (see Section 5.2.7 of the main specification document).
As these are mutually exclusive, an error control procedure which uses more than one definition of “unique peptide” cannot be fully captured by mzIdentML.</simpara>
</section>
<section xml:id="thresholds">
<title>Thresholds</title>
<simpara>Section 7.4 of the main mzIdentML specification document gives general guidance on the encoding of thresholds and what has passed them.
Note that thresholds are encoded in two different places: in the &lt;SpectrumIdentificationProtocol&gt; element and in the &lt;ProteinDetectionProtocol&gt; element.
In both cases, they are encoded using CV terms inside a &lt;Threshold&gt; element, see <xref linkend="xml-showing-thresholds-applied-at-all-four-levels-of-consolidation"/>.</simpara>
<simpara>The &lt;Threshold&gt; element inside &lt;SpectrumIdentificationProtocol&gt; gives the thresholds associated with &lt;SpectrumIdentificationItem&gt; elements.
These thresholds apply at the crosslinked PSM level and at a unique peptide level.</simpara>
<simpara>Analogously, the &lt;Threshold&gt; element inside &lt;ProteinDetectionProtocol&gt; includes the thresholds associated with &lt;ProteinDetectionHypothesis&gt; elements.
These thresholds apply at the unique residue-pair level and PPI level.</simpara>
<simpara>The elements &lt;SpectrumIdentificationItem&gt; and &lt;ProteinDetectionHypothesis&gt; have a mandatory Boolean attribute <emphasis>passThreshold</emphasis> that allows a file producer to indicate that an identification has passed the given thresholds or that it has been manually validated.</simpara>
<simpara>The <emphasis>passThreshold</emphasis> attribute of &lt;SpectrumIdentificationItem&gt; relates only to the passing of PSM-level thresholds (see Section 5.2.7 of the main specification document, final paragraph therein).</simpara>
<simpara>To enable additional thresholding at the peptide-pair level in the context of crosslinking, a new CV term is required for all PSMs (“peptide-pair passes threshold”, MS:1003339) as shown in <xref linkend="xml-encoding-of-scores-for-psm-level-matches-and-peptide-pairs"/>. This is similar to the general guidance on peptide level thresholds given in Section 5.2.7 of the main specification document.</simpara>
<simpara>The <emphasis>passThreshold</emphasis> attribute of &lt;ProteinDetectionHypothesis&gt; relates only to the presence or absence of proteins; it is not directly related to the identification of crosslinks.
Whether or not residue-pairs or PPIs have passed significance thresholds is encoded by including the new CV terms “residue-pair passes threshold” (MS:1003340) or “protein-protein interaction passes threshold” (MS:1003341) in the &lt;ProteinDetectionHypothesis&gt; element.
The values of these CV terms include an identifier that associates them with a specific residue pair or PPI, see <xref linkend="xml-encoding-of-scores-for-residue-pairs-and-ppis"/>.</simpara>
<simpara>At each level of consolidation there may be multiple scores.
Therefore, for each level there is a mechanism for encoding whether the identification passed when all scores are considered:</simpara>
<itemizedlist>
<listitem>
<simpara>for PSM-level identifications this is the <emphasis>passThreshold</emphasis> attribute of &lt;SpectrumIdentificationItem&gt;;</simpara>
</listitem>
<listitem>
<simpara>at peptide-pair level it is the “peptide-pair passes threshold” (MS:1003339) CV term;</simpara>
</listitem>
<listitem>
<simpara>at residue-pair level it is the “residue-pair passes threshold” (MS:1003340) CV term;</simpara>
</listitem>
<listitem>
<simpara>and for PPIs it is the “protein-protein interaction passes threshold” (MS:1003341) CV term.</simpara>
</listitem>
</itemizedlist>
<simpara>If the file producer does not want to indicate that thresholds have been set, all identification elements (&lt;SpectrumIdentificationItem&gt; and &lt;ProteinDetectionHypothesis&gt;) MUST have the attribute passThreshold = “true” and the “no threshold” CV term should be provided within the &lt;SpectrumIdentificationProtocol&gt; and &lt;ProteinDetectionProtocol&gt; (Section 7.4 of the main mzIdentML 1.3.0 specification document).
In this case, the new “residue-pair passes threshold” (MS:1003340) and “protein-protein interaction passes threshold” (MS:1003341) CV terms can be omitted.</simpara>
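<simpara>As an illustration of the no-threshold case, the two protocol elements might be sketched as follows. This sketch assumes that the “no threshold” term has the accession MS:1001494 in the PSI-MS CV (the accession is not stated in this document and should be checked against the current CV release). The id and analysisSoftware_ref values are hypothetical placeholders, and the other child elements of &lt;SpectrumIdentificationProtocol&gt; are omitted for brevity.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol analysisSoftware_ref="software_1" id="sip_1"&gt;
    &lt;!-- SearchType and other child elements omitted from this sketch --&gt;
    &lt;Threshold&gt;
        &lt;!-- accession assumed, see the note above --&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1001494" name="no threshold"/&gt;
    &lt;/Threshold&gt;
&lt;/SpectrumIdentificationProtocol&gt;
&lt;ProteinDetectionProtocol analysisSoftware_ref="software_1" id="pdp_1"&gt;
    &lt;Threshold&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1001494" name="no threshold"/&gt;
    &lt;/Threshold&gt;
&lt;/ProteinDetectionProtocol&gt;</programlisting>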
<formalpara xml:id="xml-showing-thresholds-applied-at-all-four-levels-of-consolidation">
<title><emphasis role="strong">XML snippet showing the thresholds applied at all four levels of consolidation.</emphasis> These are: PSM (<emphasis role="marked">1</emphasis>), peptide-pair (<emphasis role="marked">2</emphasis>), residue-pair (<emphasis role="marked">3</emphasis>) and PPI (<emphasis role="marked">4</emphasis>). The CV terms MS:1002490 and MS:1002496 (<emphasis role="marked">5</emphasis>) are required to enable peptide level rescoring (mzIdentML main specification Section 5.2.7) and to state the definition of ‘unique peptide’ being used.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;AnalysisProtocolCollection&gt;
    &lt;SpectrumIdentificationProtocol analysisSoftware_ref="xiFDR_id" id="SearchProtocol_1_17022"&gt;
        &lt;SearchType&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001083" name="ms-ms search"/&gt;
        &lt;/SearchType&gt;
        &lt;AdditionalSearchParams&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001211" name="parent mass type mono"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002494" name="crosslinking search"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001256" name="fragment mass type mono"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002490" name="peptide-level scoring"/&gt;<co xml:id="CO1-50"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002496" name="group PSMs by sequence"/&gt;<co xml:id="CO1-51"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003343" name="FDR applied separately to self crosslinks and protein heteromeric crosslinks"/&gt;
            &lt;cvParam accession="MS:1001118" name="param: b ion" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1001262" name="param: y ion" cvRef="PSI-MS"/&gt;
        &lt;/AdditionalSearchParams&gt;
        &lt;ModificationParams/&gt;
        &lt;Enzymes/&gt;
        &lt;FragmentTolerance/&gt;
        &lt;ParentTolerance/&gt;
        &lt;Threshold&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003337" name="crosslinked PSM-level global FDR" value="0.05"/&gt;<co xml:id="CO1-52"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003338" name="peptide-pair sequence-level global FDR" value="0.05"/&gt;<co xml:id="CO1-53"/>
        &lt;/Threshold&gt;
    &lt;/SpectrumIdentificationProtocol&gt;
    &lt;ProteinDetectionProtocol analysisSoftware_ref="xiFDR_id" id="pdp1"&gt;
        &lt;Threshold&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="0.05"/&gt;<co xml:id="CO1-54"/>
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="0.05"/&gt;<co xml:id="CO1-55"/>
        &lt;/Threshold&gt;
    &lt;/ProteinDetectionProtocol&gt;
&lt;/AnalysisProtocolCollection&gt;</programlisting>
</para>
</formalpara>
</section>
<section xml:id="scores">
<title>Scores</title>
<section xml:id="match-level-scores">
<title>Match Level Scores</title>
<simpara>Match level scores are stored in &lt;SpectrumIdentificationItem&gt; elements.</simpara>
<simpara>The CV mapping rules for &lt;SpectrumIdentificationItem&gt; are straightforward – there is only one, which states ‘MAY supply a child term of <link xl:href="https://www.ebi.ac.uk/ols4/ontologies/ms/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMS_1001405"><phrase role="underline">MS:1001405 (spectrum identification result details)</phrase></link> one or more times’.</simpara>
<simpara>CV terms to encode match level scores must therefore be children of <link xl:href="https://www.ebi.ac.uk/ols4/ontologies/ms/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMS_1001405"><phrase role="underline">MS:1001405</phrase></link> in the CV’s “is a” hierarchy.</simpara>
<simpara>Those which also meet the CV mapping rules for the &lt;Threshold&gt; element can also be used to encode the Threshold applied.</simpara>
<simpara>See Section 7.11 of the main mzIdentML document for guidance specific to PSM-level scores for identifications based on multiple spectra.</simpara>
</section>
<section xml:id="peptide-level-scores">
<title>Peptide Level Scores</title>
<simpara>Peptide level scores are also stored in &lt;SpectrumIdentificationItem&gt; elements and everything in <xref linkend="match-level-scores"/> also applies here.</simpara>
<simpara>Section 5.2.7 of the main mzIdentML specification document describes the encoding of peptide-level scores and statistical measures.
The encoding of crosslinking results MAY also be combined with the peptide-level re-scoring mechanism described there, but with specific CV terms for scores associated with crosslinked peptides rather than PSM-level terms (as stated in Section 5.2.7 of the main specification document).</simpara>
<simpara>Where needed, new CV terms for search-specific scores of crosslinked peptides should be added as children of (i.e. with an “is a” relationship to) the CV term “interaction score derived from crosslinking” (MS:1002664).</simpara>
<formalpara xml:id="xml-encoding-of-scores-for-psm-level-matches-and-peptide-pairs">
<title><emphasis role="strong">XML snippet including the encoding of scores for PSM-level matches and peptide pairs.</emphasis> These are encoded inside &lt;SpectrumIdentificationItem&gt; elements. “peptide-pair passes threshold” (MS:1003339) would become relevant if there was more than one score for that peptide pair (sharing the same “peptide group ID”); it states whether the peptide pair passed when all scores and thresholds are considered. This is analogous to the <emphasis>passThreshold</emphasis> attribute of &lt;SpectrumIdentificationItem&gt; elements for PSM-level scores.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationList id="SII_LIST_1_1"&gt;
    &lt;SpectrumIdentificationResult spectrumID="index=26630" spectraData_ref="SD_17022_recal_B210619_02_Lumos_ZC_CO_190_D2I_SDA-WT1.mgf" id="SIR_1"&gt;
        &lt;SpectrumIdentificationItem chargeState="5" experimentalMassToCharge="1135.3259479607323" calculatedMassToCharge="1135.3254335427703" peptide_ref="16734061838_ISDKRAPSQGGLENEGVFEELLR_16734063165_GAEDEEEEEDVGFEQNFEEMLESVTR_4_9_p1" rank="1" passThreshold="false" id="SII_1_1"&gt;
            &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_16734063165"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002511" name="crosslink spectrum identification item" value="1"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002545" name="xi:score" value="25.929927957127177"/&gt;
            &lt;!-- crosslinked PSM level global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003337" name="crosslinked PSM-level global FDR" value="0.06"/&gt;
            &lt;!-- peptide pair global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002520" value="GAEDEEEEEDVGFEQNFEEMLESVTR-ISDKRAPSQGGLENEGVFEELLR" name="peptide group ID"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003338" name="peptide-pair sequence-level global FDR" value="0.06"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003339" name="peptide-pair passes threshold" value="false"/&gt;
            &lt;!-- residue pair ref value="11.b" --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003344" value="11.b" name="Residue-pair ref"/&gt;
        &lt;/SpectrumIdentificationItem&gt;
        &lt;SpectrumIdentificationItem chargeState="5" experimentalMassToCharge="1135.3259479607323" calculatedMassToCharge="1135.3254335427703" peptide_ref="16734061838_ISDKRAPSQGGLENEGVFEELLR_16734063165_GAEDEEEEEDVGFEQNFEEMLESVTR_4_9_p0" rank="1" passThreshold="false" id="SII_1_2"&gt;
            &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_16734061838"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002511" name="crosslink spectrum identification item" value="1"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002545" name="xi:score" value="25.929927957127177"/&gt;
            &lt;!-- crosslinked PSM level global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003337" name="crosslinked PSM-level global FDR" value="0.06"/&gt;
            &lt;!-- peptide pair global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002520" value="GAEDEEEEEDVGFEQNFEEMLESVTR-ISDKRAPSQGGLENEGVFEELLR" name="peptide group ID"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003338" name="peptide-pair sequence-level global FDR" value="0.06"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003339" name="peptide-pair passes threshold" value="false"/&gt;
            &lt;!-- residue pair ref value="11.a" --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003344" value="11.a" name="Residue-pair ref"/&gt;
        &lt;/SpectrumIdentificationItem&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1000797" name="peak list scans" value="40560"/&gt;
    &lt;/SpectrumIdentificationResult&gt;
    &lt;SpectrumIdentificationResult spectrumID="index=23414" spectraData_ref="SD_17022_recal_B210619_04_Lumos_ZC_CO_190_D2I_SDA-WT3.mgf" id="SIR_2"&gt;
        &lt;SpectrumIdentificationItem chargeState="6" experimentalMassToCharge="752.7466713415814" calculatedMassToCharge="752.41371619677" peptide_ref="16734068348_TAAPTVCcmLLVLGQADKVLEEVDWLIKR_16734057553_SCcmKDLQILQASK_18_1_p1" rank="1" passThreshold="true" id="SII_2_1"&gt;
            &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_16734057553"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002511" name="crosslink spectrum identification item" value="2"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002545" name="xi:score" value="21.55734182309742"/&gt;
            &lt;!-- crosslinked PSM level global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003337" name="crosslinked PSM-level global FDR" value="0.03"/&gt;
            &lt;!-- peptide pair global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002520" value="SCKDLQILQASK-TAAPTVCLLVLGQADKVLEEVDWLIKR" name="peptide group ID"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003338" name="peptide-pair sequence-level global FDR" value="0.03"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003339" name="peptide-pair passes threshold" value="true"/&gt;
            &lt;!-- residue pair ref value="22.b" --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003344" value="22.b" name="Residue-pair ref"/&gt;
        &lt;/SpectrumIdentificationItem&gt;
        &lt;SpectrumIdentificationItem chargeState="6" experimentalMassToCharge="752.7466713415814" calculatedMassToCharge="752.41371619677" peptide_ref="16734068348_TAAPTVCcmLLVLGQADKVLEEVDWLIKR_16734057553_SCcmKDLQILQASK_18_1_p0" rank="1" passThreshold="true" id="SII_2_2"&gt;
            &lt;PeptideEvidenceRef peptideEvidence_ref="pepevid_pep_16734068348"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002511" name="crosslink spectrum identification item" value="2"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002545" name="xi:score" value="21.55734182309742"/&gt;
            &lt;!-- crosslinked PSM level global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003337" name="crosslinked PSM-level global FDR" value="0.03"/&gt;
            &lt;!-- peptide pair global FDR --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002520" value="SCKDLQILQASK-TAAPTVCLLVLGQADKVLEEVDWLIKR" name="peptide group ID"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003338" name="peptide-pair sequence-level global FDR" value="0.03"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003339" name="peptide-pair passes threshold" value="true"/&gt;
            &lt;!-- residue pair ref value="22.a" --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003344" value="22.a" name="Residue-pair ref"/&gt;
        &lt;/SpectrumIdentificationItem&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1000797" name="peak list scans" value="38065"/&gt;
    &lt;/SpectrumIdentificationResult&gt;
&lt;/SpectrumIdentificationList&gt;</programlisting>
</para>
</formalpara>
</section>
<section xml:id="interaction-level-scores-unique-residue-pairs-and-ppi">
<title>Interaction Level Scores (Unique Residue-Pairs and PPI)</title>
<simpara>mzIdentML uses the same mechanism to encode scores for interactions at both the unique residue-pair level and protein-protein interaction level.
This encoding was put forward in mzIdentML 1.2.0 and remains unchanged.
Where a residue-pair level score gives the position of the crosslinked residue, a protein-protein interaction (PPI) score will instead have the value ‘null’.</simpara>
<simpara>mzIdentML encodes these with the same mechanism it uses to address the protein inference problem, that is, within &lt;ProteinAmbiguityGroup&gt; elements.
More specifically, these scores go inside &lt;ProteinDetectionHypothesis&gt; elements.
All such scores must therefore meet the CV mapping rules of &lt;ProteinDetectionHypothesis&gt; elements.</simpara>
<simpara>As the encoding of interaction scores uses &lt;ProteinAmbiguityGroup&gt; elements, the guidance in Section 5.2.1 (Protein grouping encoding) of the main specification also applies here and MUST be followed.
This means that ambiguity about which protein a crosslinked peptide came from must be reflected in how the &lt;ProteinDetectionHypothesis&gt; elements containing the score are assigned to &lt;ProteinAmbiguityGroup&gt; elements, see <xref linkend="xml-ambiguity-at-ppi-level"/>.</simpara>
<figure xml:id="xml-ambiguity-at-ppi-level">
<title><emphasis role="strong">Ambiguity at PPI level.</emphasis> Ambiguity regarding which protein is crosslinked (protein inference problem) MUST be reflected in how the &lt;ProteinDetectionHypothesis&gt; elements containing interaction scores are assigned to &lt;ProteinAmbiguityGroup&gt; elements, see Section 5.2.1 (Protein grouping encoding) of the main specification. Shown here with PPI level scores.</title>
<mediaobject>
<imageobject>
<imagedata fileref="img/crosslinking_ext/image1.png" contentwidth="624" contentdepth="396"/>
</imageobject>
<textobject><phrase>image</phrase></textobject>
</mediaobject>
</figure>
<formalpara xml:id="xml-for-protein-pair-level-global-fdr">
<title><emphasis role="strong">XML snippet showing the CV terms "protein-pair-level global FDR" (MS:1002676) and "residue-pair-level global FDR" (MS:1002677).</emphasis></title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;ProteinAmbiguityGroup id="PAG_0"&gt;
    &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_P02771" passThreshold="true" id="PAG_0_PDH_0"&gt;
        &lt;PeptideHypothesis peptideEvidence_ref="pepevid_psm252637369_pep54601081"&gt;
            &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_1_1"/&gt;
        &lt;/PeptideHypothesis&gt;
    ...
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="100.b:null:0.001:true"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="106.b:146:0.0294:true"/&gt;
    &lt;/ProteinDetectionHypothesis&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1002415" name="protein group passes threshold" value="true"/&gt;
&lt;/ProteinAmbiguityGroup&gt;
&lt;ProteinAmbiguityGroup id="PAG_1"&gt;
    &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_P02768" passThreshold="true" id="PAG_1_PDH_0"&gt;
        &lt;PeptideHypothesis peptideEvidence_ref="pepevid_psm252637369_pep54600650"&gt;
            &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_1_2"/&gt;
        &lt;/PeptideHypothesis&gt;
        &lt;PeptideHypothesis peptideEvidence_ref="pepevid_psm252633422_pep54604445_protP02768-A_target_52"&gt;
            &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_2_1"/&gt;
        &lt;/PeptideHypothesis&gt;
        ....
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="100.a:null:0.001:true"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="106.a:436:0.0294:true"/&gt;
    &lt;/ProteinDetectionHypothesis&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1002415" name="protein group passes threshold" value="true"/&gt;
&lt;/ProteinAmbiguityGroup&gt;</programlisting>
</para>
</formalpara>
<simpara>The XML snippet in <xref linkend="xml-for-protein-pair-level-global-fdr"/> shows the "protein-pair-level global FDR" (MS:1002676) and "residue-pair-level global FDR" (MS:1002677) CV terms. Both have the parent CV term “interaction score derived from crosslinking” (MS:1002664).
Where needed, new CV terms for search-specific interaction scores should be added as children of the CV term “interaction score derived from crosslinking” (MS:1002664).</simpara>
<simpara>These CV terms must have a paired structure of <emphasis>int_ID.a|b:POS|null:SCORE_OR_VALUE:PASS_THRESHOLD</emphasis>, that is, a value made up of four colon-separated fields, numbered 1 to 4 from left to right and described below:</simpara>
<orderedlist numeration="arabic">
<listitem>
<simpara>The two partners in the interaction share the same integer value for ID followed by a or b.
If there is ambiguity in protein identification, two different ProteinDetectionHypothesis (PDH) elements, within the same ProteinAmbiguityGroup (PAG), MAY share the same ID and suffix (a or b).
A given identifier (integer and suffix) value MUST NOT be used in more than one PAG.</simpara>
</listitem>
<listitem>
<simpara>The export software MAY indicate the general position of the interaction (potentially taking on board multiple pairs of crosslinked peptides), with respect to the protein sequence – using a 1-based counting system.
A “null” MAY be used if the export software does not wish to include a value.</simpara>
</listitem>
<listitem>
<simpara>The score or statistical value for the interaction.</simpara>
</listitem>
<listitem>
<simpara>“true” or “false” to indicate whether the score or value has passed a reported threshold in the file.
If no threshold is defined, then PASS_THRESHOLD is always true.</simpara>
</listitem>
</orderedlist>
<simpara>The first “int_ID” part of the value MUST be identical/shared between interaction level scores if they refer to the same residue pair or PPI.</simpara>
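<simpara>As a worked reading of this structure, the two values from the first &lt;ProteinDetectionHypothesis&gt; in <xref linkend="xml-for-protein-pair-level-global-fdr"/> break down as shown below. The XML comments are added here purely for illustration and are not part of that snippet.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;!-- residue-pair score: identifier 106.b, position 146, score 0.0294, passes threshold: true --&gt;
&lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="106.b:146:0.0294:true"/&gt;
&lt;!-- PPI score: identifier 100.b, position null (none given), score 0.001, passes threshold: true --&gt;
&lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="100.b:null:0.001:true"/&gt;</programlisting>
<simpara>The partner entries (106.a and 100.a) appear in the other &lt;ProteinDetectionHypothesis&gt; of that snippet and share the integer part of the identifier.</simpara>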
<simpara>The new CV term “Residue pair ref” (MS:1003344) SHOULD be included within &lt;SpectrumIdentificationItem&gt; elements to indicate that these are the spectra which supported the linking of a specific residue pair.
The value of the new “Residue pair ref” CV term is the “<emphasis>int_ID.a|b</emphasis>” part of the values, see <xref linkend="xml-encoding-of-scores-for-psm-level-matches-and-peptide-pairs"/>. More than one “Residue pair ref” (MS:1003344) CV term (with different values) can be included in a single &lt;SpectrumIdentificationItem&gt; element if it has been taken as evidence for more than one linked residue pair.</simpara>
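<simpara>A minimal sketch of a single &lt;SpectrumIdentificationItem&gt; taken as evidence for two residue pairs is given below. The identifier values 11.a and 12.a are hypothetical and chosen only for illustration; each would need a matching interaction-level score with the same identifier in a &lt;ProteinDetectionHypothesis&gt; element.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;!-- inside one SpectrumIdentificationItem; attributes and other child elements omitted --&gt;
&lt;!-- hypothetical identifiers: this identification supports residue pairs 11 and 12 --&gt;
&lt;cvParam cvRef="PSI-MS" accession="MS:1003344" name="Residue-pair ref" value="11.a"/&gt;
&lt;cvParam cvRef="PSI-MS" accession="MS:1003344" name="Residue-pair ref" value="12.a"/&gt;</programlisting>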
<simpara>It is not a requirement that the &lt;SpectrumIdentificationItem&gt; elements containing “Residue pair ref” (MS:1003344) place the linkage sites at the same position in the peptide as the residue-pair they are claiming to support.
Hence, analyses which utilise link site reassignment can be encoded in mzIdentML.
(Some analyses may look at a collection of spectra to reach a conclusion about where the linkage site was, therefore some identifications may end up supporting a residue-pair that places the linkage site at a different position from where they themselves did).</simpara>
<simpara>References to supporting &lt;SpectrumIdentificationItem&gt; elements for PPIs are given by the &lt;SpectrumIdentificationItemRef&gt; elements inside &lt;PeptideHypothesis&gt; elements in &lt;ProteinDetectionHypothesis&gt;.
This performs the same role for PPIs as the “Residue pair ref” (MS:1003344) CV term does for residue-pair interactions.</simpara>
<simpara>See <xref linkend="xml-encoding-of-scores-for-residue-pairs-and-ppis"/> for an example of encoding residue-pair and PPI level scores.</simpara>
<simpara>Positional ambiguity of the residues linked can be encoded by repeating the score CV terms, keeping the same identifier (integer and suffix), for each of the positional alternatives, see <xref linkend="xml-of-positional-ambiguity-of-residue-pairs-het"/> and <xref linkend="xml-of-positional-ambiguity-of-residue-pairs-self"/>. This may be due to ambiguity regarding the position of the peptide in the protein sequences (protein inference problem) or ambiguity regarding the linkage site in the peptide.</simpara>
<formalpara xml:id="xml-encoding-of-scores-for-residue-pairs-and-ppis">
<title><emphasis role="strong">XML snippet including the encoding of scores for residue-pairs and PPIs.</emphasis> These are encoded inside &lt;ProteinDetectionHypothesis&gt; elements. The CV terms “residue-pair passes threshold” (MS:1003340) and “protein-protein interaction passes threshold” (MS:1003341) would become relevant if there was more than one score for those residue or protein pairs (sharing the same integer id part of their value). These are analogous to the <emphasis>passThreshold</emphasis> attribute of &lt;SpectrumIdentificationItem&gt; elements.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;ProteinDetectionList id="PDL_1"&gt;
    &lt;ProteinAmbiguityGroup id="PAG_0"&gt;
        &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_ggFANCI_target" passThreshold="true" id="PAG_0_PDH_0"&gt;
            &lt;PeptideHypothesis peptideEvidence_ref="pepevid_pep_16734063165"&gt;
                &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_1_1"/&gt;
            &lt;/PeptideHypothesis&gt;
            &lt;PeptideHypothesis peptideEvidence_ref="pepevid_pep_16734057553"&gt;
                &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_2_1"/&gt;
            &lt;/PeptideHypothesis&gt;
            &lt;PeptideHypothesis peptideEvidence_ref="pepevid_pep_16734068348"&gt;
                &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_2_2"/&gt;
            &lt;/PeptideHypothesis&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002403" name="group representative"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001593" name="group member with undefined relationship OR ortholog protein"/&gt;
            &lt;!-- forms a protein heteromeric PPI with its partner 10.a in PAG_1_PDH_0 --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="10.b:null:0.059:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003341" name="protein-protein interaction passes threshold" value="10:false"/&gt;
            &lt;!-- forms a self PPI with its partner 20.b in PAG_0_PDH_0 --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="20.a:null:0.030:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="20.b:null:0.030:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003341" name="protein-protein interaction passes threshold" value="20:true"/&gt;
            &lt;!-- forms a protein heteromeric crosslink with its partner 11.a in PAG_1_PDH_0 --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="11.b:697:0.06:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="11:false"/&gt;
            &lt;!-- forms a self crosslink with its partner 22.b in PAG_0_PDH_0 --&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:1095:0.01:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.b:339:0.01:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="22:true"/&gt;
        &lt;/ProteinDetectionHypothesis&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002415" name="protein group passes threshold" value="true"/&gt;
    &lt;/ProteinAmbiguityGroup&gt;
    &lt;ProteinAmbiguityGroup id="PAG_1"&gt;
        &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_ggFANCD2_target" passThreshold="true" id="PAG_1_PDH_0"&gt;
            &lt;PeptideHypothesis peptideEvidence_ref="pepevid_pep_16734061838"&gt;
                &lt;SpectrumIdentificationItemRef spectrumIdentificationItem_ref="SII_1_2"/&gt;
            &lt;/PeptideHypothesis&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002403" name="group representative"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1001593" name="group member with undefined relationship OR ortholog protein"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002676" name="protein-pair-level global FDR" value="10.a:null:0.059:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003341" name="protein-protein interaction passes threshold" value="10:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="11.a:36:0.06:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="11:false"/&gt;
        &lt;/ProteinDetectionHypothesis&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002415" name="protein group passes threshold" value="true"/&gt;
    &lt;/ProteinAmbiguityGroup&gt;
    &lt;cvParam cvRef="PSI-MS" accession="MS:1002404" name="count of identified proteins" value="2"/&gt;
&lt;/ProteinDetectionList&gt;</programlisting>
</para>
</formalpara>
<formalpara xml:id="xml-of-positional-ambiguity-of-residue-pairs-het">
<title><emphasis role="strong">XML snippet including the encoding of positional ambiguity of residue pairs (i).</emphasis> Residue-pair 22 is a protein heteromeric crosslink where the “a” end of the crosslink is ambiguous between two proteins and there are three possible positions of the crosslink in peptide “a”.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">    &lt;ProteinAmbiguityGroup id="PAG_0"&gt;
        &lt;!-- example of both peptide ambiguity (classical protein inference) and site ambiguity within a peptide --&gt;
        &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_A_target" passThreshold="true" id="PAG_0_PDH_0"&gt;
            ...
            &lt;!-- each possible linksite in the originating peptide is referenced here as a possible residue pair--&gt;
            &lt;!-- the first two have the same score as there is no fragmentation distinguishing the two neighbouring residues--&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:1095:0.01:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:1096:0.01:true"/&gt;
            &lt;!-- the third residue is also a possible linksite, but some fragments support the first two, so this one has a lower score and hence a worse FDR--&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:1091:0.09:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="22:true"/&gt;
        &lt;/ProteinDetectionHypothesis&gt;
        &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_B_target" passThreshold="true" id="PAG_0_PDH_1"&gt;
            ...
            &lt;!-- (all) peptide(s) for site a could also come from a different protein--&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:295:0.01:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:296:0.01:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.a:291:0.09:false"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="22:true"/&gt;
        &lt;/ProteinDetectionHypothesis&gt;
    &lt;/ProteinAmbiguityGroup&gt;
    &lt;ProteinAmbiguityGroup id="PAG_1"&gt;
        &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_C_target" passThreshold="true" id="PAG_1_PDH_0"&gt;
            ...
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="22.b:339:0.01:true"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="22:true"/&gt;
        &lt;/ProteinDetectionHypothesis&gt;
    &lt;/ProteinAmbiguityGroup&gt;</programlisting>
</para>
</formalpara>
<formalpara xml:id="xml-of-positional-ambiguity-of-residue-pairs-self">
<title><emphasis role="strong">XML snippet including the encoding of positional ambiguity of residue pairs (ii).</emphasis> Residue pair 23 is a self link but there is ambiguity about where peptide “a” came from within that protein (two possible positions) and two possible link sites in peptide “a”, giving a total of four possible residues.</title>
<para>
<programlisting language="xml" linenumbering="unnumbered">&lt;ProteinAmbiguityGroup id="PAG_0"&gt;
    &lt;ProteinDetectionHypothesis dBSequence_ref="dbseq_B_target" passThreshold="true" id="PAG_0_PDH_1"&gt;
        ...
        &lt;!-- peptide has two possible link sites and is present in two places in protein B--&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="23.a:1095:0.01:true"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="23.a:1091:0.09:false"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="23.a:295:0.01:true"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="23.a:291:0.09:false"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002677" name="residue-pair-level global FDR" value="23.b:339:0.01:true"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1003340" name="residue-pair passes threshold" value="23:true"/&gt;
    &lt;/ProteinDetectionHypothesis&gt;
&lt;/ProteinAmbiguityGroup&gt;</programlisting>
</para>
</formalpara>
</section>
</section>
<section xml:id="fdr-specific-comments">
<title>FDR Specific Comments</title>
<simpara>Section 7.5 of the main mzIdentML 1.3.0 specification document (‘Using decoy databases to set different thresholds of false discovery rate’) states that:</simpara>
<itemizedlist>
<listitem>
<simpara><emphasis>A &lt;SpectrumIdentificationItem&gt; can be marked as matching a decoy peptide using the isDecoy attribute of the referenced &lt;PeptideEvidence&gt; element, thus allowing the false discovery rate to be calculated across an entire file.</emphasis></simpara>
</listitem>
<listitem>
<simpara><emphasis>Implementers of the format SHOULD report the peptide identifications</emphasis> [including those of decoy peptides] <emphasis>that pass the threshold they wish to communicate to a consumer of the data.</emphasis></simpara>
</listitem>
<listitem>
<simpara><emphasis>It is not guaranteed that a consumer of an mzIdentML file will be able to calculate other results, or global false discovery rates, using different thresholds from the reported information, although in some circumstances they may be able to, for example, if a user reports the complete output of a search against a target and decoy search.</emphasis></simpara>
</listitem>
</itemizedlist>
<simpara>CV terms exist for FDR scores at each level of consolidation:</simpara>
<itemizedlist>
<listitem>
<simpara>“crosslinked PSM-level global FDR” (MS:1003337)</simpara>
</listitem>
<listitem>
<simpara>“peptide-pair sequence-level global FDR” (MS:1003339)</simpara>
</listitem>
<listitem>
<simpara>“residue-pair-level global FDR” (MS:1002677)</simpara>
</listitem>
<listitem>
<simpara>“protein-pair-level global FDR” (MS:1002676)</simpara>
</listitem>
</itemizedlist>
<simpara>A new CV term “FDR applied separately to self crosslinks and protein heteromeric crosslinks” (MS:1003343) has been introduced to encode whether self crosslinks (crosslinks between peptides within one protein sequence) and protein heteromeric crosslinks (crosslinks between distinct protein sequences) were grouped separately for FDR analysis <xref linkend="lenz2021"/>.
This CV term goes within the &lt;AdditionalSearchParams&gt; element (see <xref linkend="spectrumidentificationprotocol-elements"/>).</simpara>
<simpara>The value of “FDR applied separately to self crosslinks and protein heteromeric crosslinks” (MS:1003343) is a boolean, stating whether or not this happened.
This CV term SHOULD be supplied.
If it is omitted then it is unspecified whether self and heteromeric links were grouped separately for analysis (there is no default value).</simpara>
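<simpara>As a minimal sketch (not taken from a real export), the term might be written as shown below. The value “true” is only an assumed example, the surrounding content is elided, and the element is written as &lt;AdditionalSearchParams&gt;, the name used in the mzIdentML schema.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol&gt;
    ...
    &lt;AdditionalSearchParams&gt;
        ...
        &lt;!-- illustrative value: self and heteromeric crosslinks were grouped separately for FDR analysis --&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1003343" name="FDR applied separately to self crosslinks and protein heteromeric crosslinks" value="true"/&gt;
    &lt;/AdditionalSearchParams&gt;
    ...
&lt;/SpectrumIdentificationProtocol&gt;</programlisting>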
</section>
</chapter>
<chapter xml:id="unsupported-use-cases-and-future-directions">
<title>Unsupported Use Cases and Future Directions</title>
<simpara>The two unsupported crosslinking product types shown in <xref linkend="summary-of-mzidentml-support-for-crosslinking-product-types"/> are: crosslinkers with more than two reactive groups and higher order crosslinks (arbitrarily many peptides identified with many crosslinks between them).</simpara>
<simpara>Crosslinkers with more than two reactive groups <xref linkend="mohr2024"/> cannot be represented using the current model for two reasons.
First, the donor/acceptor mechanism for crosslinked Peptides in &lt;SequenceCollection&gt; elements (<xref linkend="spectrumidentificationprotocol-elements"/>) restricts the number of reactive groups to two.
Second, there can be at most two crosslinked &lt;SpectrumIdentificationItem&gt; elements, each of which references an identified peptide within a &lt;SpectrumIdentificationResult&gt; (<xref linkend="encoding-of-crosslinked-peptides-in-sequencecollection-element"/>).</simpara>
<simpara>In the case of higher order crosslinks, nothing in the specification forbids encoding them in the &lt;Peptide&gt; elements within &lt;SequenceCollection&gt;, see <xref linkend="spectrumidentificationprotocol-elements"/>. Only the restriction that at most two crosslinked &lt;SpectrumIdentificationItem&gt; elements within a &lt;SpectrumIdentificationResult&gt; may share the same value prevents the encoding of higher order crosslinks.</simpara>
<simpara>It would be possible to support higher order crosslinks by allowing <emphasis>n</emphasis> crosslinked &lt;SpectrumIdentificationItem&gt; elements within a &lt;SpectrumIdentificationResult&gt;.
This would pose some problems for the validation of documents.
These would not be insurmountable, because the number of crosslinked peptides could be derived from the &lt;Peptide&gt; elements in &lt;SequenceCollection&gt;.
However, validation would become significantly more complex. To alleviate this, an additional CV term could be introduced in &lt;Peptide&gt; that links peptides as part of a “crosslink-group”, independent of the crosslinker.</simpara>
<simpara>In the case of crosslinkers with more than two reactive groups and the identification of higher order crosslinks, there was no demand for supporting these use cases at this point in time and so, for the sake of simplicity and minimal changes, they are still not supported.</simpara>
<simpara>This remains an open question for future versions of the specification.
There are other use cases in which <emphasis>n</emphasis> &lt;SpectrumIdentificationItem&gt; elements need to be associated.
Characterisation of antibodies or other multi-chain proteins that contain complex patterns of disulfide bonds (representing endogenous crosslinks) by top-down mass spectrometry would be an example of this.</simpara>
</chapter>
<appendix xml:id="comparison-of-rules-for-crosslink-donor-and-crosslink-acceptor-depending-on-context">
<title>Comparison of rules for “crosslink donor” and “crosslink acceptor” depending on context</title>
<simpara>The CV terms “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) are used in two different contexts:</simpara>
<itemizedlist>
<listitem>
<simpara>/MzIdentML/AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams/SearchModification – encoding the modifications searched for (including specificity, see §3.1.2);</simpara>
</listitem>
<listitem>
<simpara>/MzIdentML/SequenceCollection/Peptide/Modification – encoding the modifications of crosslinked peptides (§3.2).</simpara>
</listitem>
</itemizedlist>
<simpara>Table 1 summarises the commonalities and differences between the rules governing their use in these two contexts.</simpara>
<informaltable frame="all" rowsep="1" colsep="1">
<tgroup cols="2">
<colspec colname="col_1" colwidth="50*"/>
<colspec colname="col_2" colwidth="50*"/>
<thead>
<row>
<entry align="left" valign="top"><emphasis role="strong">Element &lt;SearchModification&gt; (see §3.1.2)</emphasis></entry>
<entry align="left" valign="top"><emphasis role="strong">Element &lt;Peptide&gt;&lt;Modification&gt; (see §3.2)</emphasis></entry>
</row>
</thead>
<tbody>
<row>
<entry align="left" valign="top"><simpara><emphasis role="strong">Two or more</emphasis> &lt;SearchModification&gt; elements are needed to describe the specificity of a single crosslinker with two reactive groups. All of the donor and acceptor CV terms contained in these <emphasis role="strong">MUST</emphasis> share a unique identifier in their value slot.</simpara></entry>
<entry align="left" valign="top"><simpara>A unique identifier linking these <emphasis role="strong">two</emphasis> Modification elements together <emphasis role="strong">MUST</emphasis> be in the value slot. (Thereby excluding the representation of trimeric crosslinkers.)</simpara></entry>
</row>
<row>
<entry align="left" valign="top"><simpara>The choice of which end is the ‘donor’ and which end is the ‘acceptor’ is arbitrary.</simpara></entry>
<entry align="left" valign="top"><simpara>If the CV term “search modification id ref” (MS:1003393) is being used then the crosslink donor <emphasis role="strong">MUST</emphasis> be chosen to match the end marked as the donor in the corresponding &lt;SearchModification&gt; elements, see §3.2.2. If that CV term is not used, or if the preceding rule does not unambiguously define which end to mark as donor (e.g. because the crosslinker is symmetrical), then the export software <emphasis role="strong">SHOULD</emphasis> choose the crosslink donor by applying the following criteria in order: the longer peptide, then the higher peptide neutral mass, then alphabetical order.</simpara></entry>
</row>
<row>
<entry align="left" valign="top"><simpara>The element(s) containing the crosslink donor CV term <emphasis role="strong">MUST</emphasis> have their mass delta attribute = the mass gained from the crosslink reagent.</simpara></entry>
<entry align="left" valign="top"></entry>
</row>
<row>
<entry align="left" valign="top"><simpara>The element(s) containing the crosslink acceptor CV term <emphasis role="strong">MUST</emphasis> have their mass delta attribute = 0.</simpara></entry>
<entry align="left" valign="top"></entry>
</row>
<row>
<entry align="left" valign="top"><simpara><emphasis role="strong">Both</emphasis> crosslink donor and crosslink acceptor <emphasis role="strong">MUST</emphasis> have a suitably sourced cvParam for the crosslink.</simpara></entry>
<entry align="left" valign="top"><simpara>The crosslink donor peptide’s Modification element <emphasis role="strong">MUST</emphasis> have a suitably sourced cvParam for the crosslink.</simpara></entry>
</row>
<row>
<entry align="left" valign="top"></entry>
<entry align="left" valign="top"><simpara>The crosslink acceptor peptide’s Modification element <emphasis role="strong">MUST NOT</emphasis> have a cvParam for the reagent.</simpara></entry>
</row>
</tbody>
</tgroup>
</informaltable>
<simpara><emphasis role="strong">Table 1.</emphasis> The rules governing the use of “crosslink donor” (MS:1002509) and “crosslink acceptor” (MS:1002510) differ depending on the context.</simpara>
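<simpara>The right-hand column can be illustrated with a minimal sketch of two crosslinked &lt;Peptide&gt; elements. The peptide ids, sequences, locations and the shared identifier value “1” are hypothetical. BS3 (XLMOD:02000, mass 138.06808) is borrowed from the example encodings in this document. The placement of the reagent mass on the donor and a zero mass on the acceptor is an assumption that mirrors the &lt;SearchModification&gt; rules. Because “search modification id ref” is not used here, the longer peptide is marked as the donor.</simpara>
<programlisting language="xml" linenumbering="unnumbered">&lt;!-- donor: the longer peptide; carries the cvParam for the crosslink reagent --&gt;
&lt;Peptide id="peptide_donor_example"&gt;
    &lt;PeptideSequence&gt;ELVKSAGTR&lt;/PeptideSequence&gt;
    &lt;Modification location="4" residues="K" monoisotopicMassDelta="138.06808"&gt;
        &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="1"/&gt;
    &lt;/Modification&gt;
&lt;/Peptide&gt;
&lt;!-- acceptor: same identifier in the value slot, no cvParam for the reagent --&gt;
&lt;Peptide id="peptide_acceptor_example"&gt;
    &lt;PeptideSequence&gt;AKDR&lt;/PeptideSequence&gt;
    &lt;Modification location="2" residues="K" monoisotopicMassDelta="0.0"&gt;
        &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="1"/&gt;
    &lt;/Modification&gt;
&lt;/Peptide&gt;</programlisting>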
</appendix>
<appendix xml:id="example-encodings-of-crosslinker-reagents-as-searchmodification-elements">
<title>Example encodings of crosslinker reagents as &lt;SearchModification&gt; elements</title>
<section xml:id="bissulfosuccinimidyl-suberate-bs3">
<title>Bis(sulfosuccinimidyl) suberate (BS3)</title>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol&gt;
...
    &lt;ModificationParams&gt;
        &lt;SearchModification fixedMod="false" massDelta="138.06808" residues="S T Y K"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_donor"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="0"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="138.06808" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_donor_n_term"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="0"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="S T Y K"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_acceptor"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="0"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="BS3_acceptor_n_term"/&gt;
            &lt;cvParam cvRef="XLMOD" accession="XLMOD:02000" name="BS3"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="0"/&gt;
        &lt;/SearchModification&gt;
    &lt;/ModificationParams&gt;
...
&lt;/SpectrumIdentificationProtocol&gt;</programlisting>
</section>
<section xml:id="ethyl-3-3-dimethylaminopropylcarbodiimide-hydrochloride-edc">
<title>1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC)</title>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol&gt;
    ...
    &lt;ModificationParams&gt;
        &lt;SearchModification fixedMod="false" massDelta="-18.010565" residues="K"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="EDC_donor"/&gt;
            &lt;cvParam accession="UNIMOD:2018" name="Xlink:EDC" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="1"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="-18.010565" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="EDC_donor_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:2018" name="Xlink:EDC" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="1"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="D E"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="EDC_acceptor"/&gt;
            &lt;cvParam accession="UNIMOD:2018" name="Xlink:EDC" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="1"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002058" name="modification specificity protein C-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="EDC_acceptor_c_term"/&gt;
            &lt;cvParam accession="UNIMOD:2018" name="Xlink:EDC" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="1"/&gt;
        &lt;/SearchModification&gt;
    &lt;/ModificationParams&gt;
    ...
&lt;/SpectrumIdentificationProtocol&gt;</programlisting>
</section>
<section xml:id="nhs-diazirine-succinimidyl-44-azipentanoate-sda">
<title>(NHS-Diazirine) succinimidyl 4,4'-azipentanoate (SDA)</title>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol&gt;
    ...
    &lt;ModificationParams&gt;
        &lt;SearchModification fixedMod="false" massDelta="100.05243" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="SDA_monolink_W"/&gt;
            &lt;cvParam accession="UNIMOD:2000" cvRef="UNIMOD" name="Xlink:SDA"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="W" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="100.05243" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="SDA_monolink_W_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:2000" cvRef="UNIMOD" name="Xlink:SDA"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="W" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="82.041865" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="SDA_crosslink_donor"/&gt;
            &lt;cvParam accession="UNIMOD:2000" name="Xlink:SDA" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="2"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:82.041865:O" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="82.041865" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="SDA_crosslink_donor_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:2000" name="Xlink:SDA" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="2"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:82.041865:O" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="."&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="SDA_crosslink_acceptor"/&gt;
            &lt;cvParam accession="UNIMOD:2000" name="Xlink:SDA" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="2"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="O:0:S" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="SDA_crosslink_acceptor_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:2000" name="Xlink:SDA" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="2"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="O:0:S" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
    &lt;/ModificationParams&gt;
...
&lt;/SpectrumIdentificationProtocol&gt;</programlisting>
</section>
<section xml:id="disuccinimidyl-sulfoxide-dsso">
<title>Disuccinimidyl sulfoxide (DSSO)</title>
<programlisting language="xml" linenumbering="unnumbered">&lt;SpectrumIdentificationProtocol&gt;
    ...
    &lt;ModificationParams&gt;
        &lt;SearchModification fixedMod="false" massDelta="175.030314" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_monolink_M"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="M" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="176.01433" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_monolink_W"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="W" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="175.030314" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_monolink_M_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="M" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="176.01433" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_monolink_W_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:1842" cvRef="UNIMOD" name="Xlink:DSSO"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="W" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="158.003765" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_donor"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="3"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="A:54.0105647:ST" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:103.9932001:A" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="T:85.9826354:A" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="158.003765" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_donor_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002509" name="crosslink donor" value="3"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="A:54.0105647:ST" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:103.9932001:A" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="T:85.9826354:A" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_acceptor"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="3"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="A:54.0105647:ST" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:103.9932001:A" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="T:85.9826354:A" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="0.0" residues="."&gt;
            &lt;SpecificityRules&gt;
                &lt;cvParam cvRef="PSI-MS" accession="MS:1002057" name="modification specificity protein N-term"/&gt;
            &lt;/SpecificityRules&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_acceptor_n_term"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1002510" name="crosslink acceptor" value="3"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="A:54.0105647:ST" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="S:103.9932001:A" cvRef="PSI-MS"/&gt;
            &lt;cvParam accession="MS:1003390" name="crosslinker cleavage characteristics" value="T:85.9826354:A" cvRef="PSI-MS"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="54.010565" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_stub_a"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="A" cvRef="PSI-MS"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="85.982636" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_stub_t"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="T" cvRef="PSI-MS"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;
        &lt;/SearchModification&gt;
        &lt;SearchModification fixedMod="false" massDelta="103.9932" residues="K S T Y"&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003392" name="search modification id" value="DSSO_crosslink_stub_s"/&gt;
            &lt;cvParam accession="UNIMOD:1842" name="Xlink:DSSO" cvRef="UNIMOD"/&gt;
            &lt;cvParam accession="MS:1003347" name="UNIMOD derivative code" value="S" cvRef="PSI-MS"/&gt;
            &lt;cvParam cvRef="PSI-MS" accession="MS:1003346" name="cleavable crosslinker stub"/&gt;
        &lt;/SearchModification&gt;
    &lt;/ModificationParams&gt;
    ...
&lt;/SpectrumIdentificationProtocol&gt;</programlisting>
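<simpara>As a rough illustration of how the <literal>crosslinker cleavage characteristics</literal> values in the listing above might be consumed, the following Python sketch splits each value into three colon-separated fields and sums the masses of stubs that name each other as partners.
Reading the fields as stub name, stub mass and partner stub names is an interpretation inferred from the example values, and the names <literal>CleavageStub</literal>, <literal>parse_cleavage_value</literal> and <literal>pair_masses</literal> are hypothetical helpers, not part of mzIdentML or of any library.</simpara>
<programlisting language="python" linenumbering="unnumbered">from dataclasses import dataclass


@dataclass(frozen=True)
class CleavageStub:
    name: str      # stub label, e.g. "A"
    mass: float    # stub mass delta in Da
    partners: str  # labels of the stubs it may pair with, e.g. "ST"


def parse_cleavage_value(value: str) -> CleavageStub:
    """Split a value such as 'A:54.0105647:ST' into its three fields."""
    name, mass, partners = value.split(":")
    return CleavageStub(name, float(mass), partners)


def pair_masses(values):
    """Map each mutually listed stub pair to the sum of its two masses."""
    stubs = {s.name: s for s in map(parse_cleavage_value, values)}
    pairs = {}
    for stub in stubs.values():
        for label in stub.partners:
            other = stubs.get(label)
            # Keep a pair only if both stubs name each other (assumed rule).
            if other is not None and stub.name in other.partners:
                key = tuple(sorted((stub.name, other.name)))
                pairs[key] = stub.mass + other.mass
    return pairs


# Values copied from the DSSO listing above.
DSSO_VALUES = ["A:54.0105647:ST", "S:103.9932001:A", "T:85.9826354:A"]

if __name__ == "__main__":
    for (first, second), total in sorted(pair_masses(DSSO_VALUES).items()):
        print(f"{first}+{second}: {total:.6f}")</programlisting>
<simpara>For the values above, the A and S stub masses sum to 158.0037648, which agrees with the <literal>massDelta</literal> of 158.003765 declared for the intact crosslinker; the A and T stub masses sum to 139.9932001.</simpara>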
</section>
</appendix>
<chapter xml:id="authors">
<title>Authors Information</title>
<simpara>Authors of this extension:</simpara>
<simpara>Colin W. Combe, University of Edinburgh, <link xl:href="mailto:colin.combe@ed.ac.uk">colin.combe@ed.ac.uk</link></simpara>
<simpara>Lars Kolbowski, Technische Universität Berlin, <link xl:href="mailto:lars.kolbowski@tu-berlin.de">lars.kolbowski@tu-berlin.de</link></simpara>
<simpara>Lutz Fischer, Technische Universität Berlin, <link xl:href="mailto:lutz.fischer@tu-berlin.de">lutz.fischer@tu-berlin.de</link></simpara>
<simpara>Ville Koskinen, Matrix Science Ltd, <link xl:href="mailto:villek@matrixscience.com">villek@matrixscience.com</link></simpara>
<simpara>Joshua Klein, University of Boston, <link xl:href="mailto:mobiusklein@gmail.com">mobiusklein@gmail.com</link></simpara>
<simpara>Alexander Leitner, ETH Zurich, <link xl:href="mailto:leitner@imsb.biol.ethz.ch">leitner@imsb.biol.ethz.ch</link></simpara>
<simpara>Juan Antonio Vizcaíno, European Molecular Biology Laboratory, EMBL-EBI, <link xl:href="mailto:juan@ebi.ac.uk">juan@ebi.ac.uk</link></simpara>
<simpara>Andy Jones, University of Liverpool, <link xl:href="mailto:Andrew.Jones@liverpool.ac.uk">Andrew.Jones@liverpool.ac.uk</link></simpara>
<simpara>Juri Rappsilber, Technische Universität Berlin, <link xl:href="mailto:Juri.Rappsilber@tu-Berlin.de">Juri.Rappsilber@tu-Berlin.de</link></simpara>
</chapter>
<glossary xml:id="glossary">
<title>Glossary</title>
<variablelist>
<varlistentry>
<term>Cleavable Crosslinker</term>
<listitem>
<simpara>a crosslinker that can be broken in two to release the individual peptides (in a modified form), thus enabling their individual analysis.
In crosslinking studies this typically refers to <emphasis>MS-cleavable crosslinkers</emphasis>.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Controlled Vocabulary (CV)</term>
<listitem>
<simpara>A structured collection of terms describing a certain domain.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslink</term>
<listitem>
<simpara>The covalent bond formed by a <emphasis>crosslinker</emphasis>.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslink Acceptor</term>
<listitem>
<simpara>one end of a crosslinking reaction, the other is the <emphasis>crosslink donor</emphasis>.
(Assumes the crosslinker has only two reactive groups.)</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslink Donor</term>
<listitem>
<simpara>one end of a crosslinking reaction, the other is the <emphasis>crosslink acceptor</emphasis>.
(Assumes the crosslinker has only two reactive groups.)</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslinker</term>
<listitem>
<simpara>A chemical reagent.
A molecule which creates a covalent bond either between proteins or within the same protein chain.
This bond preserves proximity information which would otherwise be destroyed by the enzymatic digestion of the proteins.
The proximity information is then recovered by identifying the peptides via mass spectrometry.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslinker Modified Peptide</term>
<listitem>
<simpara>a peptide where one reactive group of a <emphasis>crosslinker</emphasis> has reacted with one of its amino acids, but the other reactive group has not reacted with any amino acid, so no crosslink is formed.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslinker Specificity</term>
<listitem>
<simpara>The specificity of a crosslinker gives the amino acids it will react with.
A <emphasis>crosslinker</emphasis> has reactive groups that react with amino acids or their side-chains; these reactive groups determine the specificity of the crosslinker.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslinking Product</term>
<listitem>
<simpara>the result of a crosslinking reaction.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Crosslink Spectrum Match (CSM)</term>
<listitem>
<simpara>the subset of PSM level matches that contain a crosslink.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Decoy (decoy databases, decoy peptide)</term>
<listitem>
<simpara>decoys are a set of artificially generated sequences used to assess the performance of an identification algorithm.
Decoys are typically created by randomising or reversing the sequences of the target proteins.
These new sequences are then added to the database used for analysis. <emphasis>False Discovery Rate</emphasis> calculations may use the identifications of decoys to estimate the error rate in the data.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>False Discovery Rate (FDR)</term>
<listitem>
<simpara>the fraction of identifications that are predicted to be incorrect (false positives) among the total number of identifications made.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Global FDR</term>
<listitem>
<simpara>the False Discovery Rate across the whole dataset, as opposed to ‘local FDR’ which calculates the error rate within a given score window.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Heterobifunctional Crosslinker</term>
<listitem>
<simpara>a <emphasis>crosslinker</emphasis> with two reactive groups in which the reactive groups, and hence the specificity of each end, are different.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Higher Order Crosslink</term>
<listitem>
<simpara>A <emphasis>crosslinking product</emphasis> in which there are arbitrarily many peptides with many crosslinks between them.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Internally Linked Peptide</term>
<listitem>
<simpara>A crosslinking product in which both ends of the crosslinker have reacted within a single peptide (that is, within the same identical peptide, not between two copies of the same peptide).
This type of crosslinking product is known to be intramolecular.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Looplink</term>
<listitem>
<simpara>colloquial name for an <emphasis>internally linked peptide</emphasis>.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Modification</term>
<listitem>
<simpara>a chemical change to an amino acid.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>MS-Cleavable Crosslinker</term>
<listitem>
<simpara>a type of <emphasis>cleavable crosslinker</emphasis> that can cleave upon activation in the mass spectrometer, releasing the individual peptides and thus enabling their individual analysis.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Noncovalently Associated Peptides</term>
<listitem>
<simpara>two different peptides which were not crosslinked but stayed associated with each other throughout the workflow, due to noncovalent interactions.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Monolink</term>
<listitem>
<simpara>colloquial name for a <emphasis>crosslinker modified peptide</emphasis>.
Unimod uses this term to describe some derivatives of crosslinkers (e.g. see <link xl:href="http://www.unimod.org/modifications_view.php?editid1=1842">http://www.unimod.org/modifications_view.php?editid1=1842</link>).</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Open Modification Search</term>
<listitem>
<simpara>a search strategy which allows for any type of mass shift to occur at any residue within a peptide sequence.
In contrast to a "closed" search, which is limited to a defined set of modifications, an open modification search allows for the identification of novel or unexpected modifications.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Posterior Error Probability</term>
<listitem>
<simpara>a statistical measure of the probability that an identification is incorrect.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Protein Heteromeric Crosslinks</term>
<listitem>
<simpara>crosslinks between distinct protein sequences.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Protein-Protein Interaction (PPI)</term>
<listitem>
<simpara>an interaction between proteins.
In the context of crosslinking, it is a level of consolidation at which crosslinks may be analysed.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>PSI-MS</term>
<listitem>
<simpara>the Human Proteome Organization (HUPO) Proteomics Standards Initiative’s (PSI) controlled vocabulary for Mass Spectrometry (MS).</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Peptide Spectrum Match (PSM)</term>
<listitem>
<simpara>a match to one or more peptides in a mass spectrum.
The lowest, unconsolidated level at which analysis of identifications can occur.
Because a single “match” may identify more than one peptide, the concept “PSM” does not correspond directly to “spectrum identification item” in mzIdentML; no element in mzIdentML corresponds directly to “a match”.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>PSM-level</term>
<listitem>
<simpara>the level of analysis that looks at individual <emphasis>peptide spectrum matches</emphasis>.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Residue-pair</term>
<listitem>
<simpara>a unique pair of crosslinked residues, irrespective of the peptides identified.
A level of consolidation higher than unique peptide but lower than PPI.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Self Crosslinks</term>
<listitem>
<simpara>crosslinks between peptides within one protein sequence.</simpara>
</listitem>
</varlistentry>
<varlistentry>
<term>Trimeric Crosslinker</term>
<listitem>
<simpara>a crosslinker with three reactive groups.</simpara>
</listitem>
</varlistentry>
</variablelist>
</glossary>
<bibliography xml:id="references">
<title>References</title>
<bibliodiv>
<bibliomixed>
<bibliomisc><anchor xml:id="viz2017" xreflabel="[1]"/>[1] Vizcaíno JA, Mayer G, Perkins S, Barsnes H, Vaudel M, Perez-Riverol Y, et al. The mzIdentML Data Standard Version 1.2, Supporting Advances in Proteome Informatics. Mol Cell Proteomics. 2017;16: 1275–1285.</bibliomisc>
</bibliomixed>
<bibliomixed>
<bibliomisc><anchor xml:id="giese2019" xreflabel="[2]"/>[2] Giese SH, Belsom A, Sinn L, Fischer L, Rappsilber J. Noncovalently Associated Peptides Observed during Liquid Chromatography-Mass Spectrometry and Their Effect on Cross-Link Analyses. Anal Chem. 2019;91: 2678–2685.</bibliomisc>
</bibliomixed>
<bibliomixed>
<bibliomisc><anchor xml:id="mayer2014" xreflabel="[3]"/>[3] Mayer G, Jones AR, Binz P-A, Deutsch EW, Orchard S, Montecchi-Palazzi L, et al. Controlled vocabularies and ontologies in proteomics: overview, principles and practice. Biochim Biophys Acta. 2014;1844: 98–107.</bibliomisc>
</bibliomixed>
<bibliomixed>
<bibliomisc><anchor xml:id="liu2017" xreflabel="[4]"/>[4] Liu F, Lössl P, Scheltema R, Viner R, Heck AJR. Optimized fragmentation schemes and data analysis strategies for proteome-wide cross-link identification. Nat Commun. 2017;8: 15473.</bibliomisc>
</bibliomixed>
<bibliomixed>
<bibliomisc><anchor xml:id="lenz2021" xreflabel="[5]"/>[5] Lenz S, Sinn LR, O’Reilly FJ, Fischer L, Wegner F, Rappsilber J. Reliable identification of protein-protein interactions by crosslinking mass spectrometry. Nat Commun. 2021;12: 1–11.</bibliomisc>
</bibliomixed>
<bibliomixed>
<bibliomisc><anchor xml:id="fischer2017" xreflabel="[6]"/>[6] Fischer L, Rappsilber J. Quirks of Error Estimation in Cross-Linking/Mass Spectrometry. Anal Chem. 2017;89: 3829.</bibliomisc>
</bibliomixed>
<bibliomixed>
<bibliomisc><anchor xml:id="mohr2024" xreflabel="[7]"/>[7] Mohr JP, Caudal A, Tian R, Bruce JE. Multidimensional Cross-Linking and Real-Time Informatics for Multiprotein Interaction Studies. J Proteome Res. 2024;23. doi:10.1021/acs.jproteome.3c00455</bibliomisc>
</bibliomixed>
</bibliodiv>
</bibliography>
<chapter xml:id="intellectual-property-statement">
<title>Intellectual Property Statement</title>
<simpara>The PSI takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights.
Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the PSI Chair.</simpara>
<simpara>The PSI invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this recommendation.
Please address the information to the PSI Chair (see the contact information on the PSI website).</simpara>
</chapter>
<chapter xml:id="copyright-notice">
<title>Copyright Notice</title>
<simpara>Copyright &#169; Proteomics Standards Initiative (2023).
All Rights Reserved.</simpara>
<simpara>This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the PSI or other organizations, except as needed for the purpose of developing Proteomics Recommendations in which case the procedures for copyrights defined in the PSI Document process must be followed, or as required to translate it into languages other than English.</simpara>
<simpara>The limited permissions granted above are perpetual and will not be revoked by the PSI or its successors or assigns.</simpara>
<simpara>This document and the information contained herein is provided on an "AS IS" basis and THE PROTEOMICS STANDARDS INITIATIVE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.</simpara>
</chapter>
</book>