2. Instructions to run GUI for the identification of cross linked peptides
Xilmass GUI has seven panels, which allows users to easily to give their settings.
Note that a user must provide the parameters indicated by an asterisk.
- Input/Output
- Cross-linking
- In-silico digestion
- Modifications
- Scoring
- Spectrum preprocessing
- Multithreading and validation
Input/Output
Select the FASTA database
: Provide the full path to the FASTA file that contains the protein sequences that are cross linked. Please make sure that your database is in the right format (See [here] (https://github.com/compomics/xilmass/wiki/5.-Database)).
Select the contaminants FASTA database
: Provide the full path to the FASTA file that contains contaminant protein sequences (OPTIONAL). If this information is provided, contaminant proteins are in silico digested and scored but such contaminant derived spectra will not be included in the final identification list. If you dont give any contaminant database, please make sure this entry is empty.
Select the cross-linked and mono-linked peptides search database
: Provide the full path of the search database that contains cross-linked and, if selected for monolinked peptides searching, mono-linked peptides. Requires only a folder. For running the first time, it is recommended to introduce an empty folder.
If it is the first time to search Xilmass against your database of interest, Xilmass will generate both a search database file and a folder for indexing. The search database is constructed based on your selected parameters on Cross-linking panel.
If you previously searched your database, Xilmass tries to locate previously indexed database on the given folder.
The search database contains cross-linked peptides, and if selected, also mono-linked peptides. Details about the search database can be found [here] (https://github.com/compomics/xilmass/wiki/5.-Database)). Briefly, the search database has very similar to a fasta format but a modified version of fasta. The first-time constructed search database will have the name of the FASTA database and “cx” with an extension of the cross-linked peptide database (“.fastacp”).
Select the spectra files directory
: Provide the full path to the folder that contains the mgf files (MS/MS spectra). Currently supporting mgf files.
Select the output file directory
: Provide the full path of the folder with the Xilmass result files for each give mgf file. An mgf name is written for the title of each of these Xilmass output files. When Xilmass executes searching, there will be also one file which contains all XPSMs (allXPSMs_list.txt), one file file which contains the validated XPSMs (validatedXPSMs_list.txt), and one file which contains the search settings (settings.txt)
Cross-linking
Select the cross-linker*
: A cross-linker name, currently supporting for DSS (d0/d12), BS3(d0/d4), EDC and GA.
Select the labeling type of a cross-linker¨*
: heavy
is for the usage of a only heavy labeled cross-linker; light
is for the usage of a light labeled cross-linker and both
is for the usage of both a heavy and light labeled cross-linker.
Consider side reactions (only for N-hydroxysuccinimide cross-linkers, such as DSS and BS3) for
: This option enables to assume that serine, threonine and/or tyrosine as a linkable for only N-hydroxysuccinimide cross-linkers, such as DSS and BS3.
Select the cross-linking type*
: intra
shows intra-protein cross linking; inter
shows inter-protein cross-linking; and both
shows inter- and intra-protein cross linking.
Mono-linking*
: This option enables including searches also for mono-linked peptides (recommended).
Minimum peptide length*
: A minimum length of each peptide allowed in a cross linked-peptide
Maximum total length of cross-linked peptides*
: A maximum length of a pair of cross-linked peptides allowed in a cross linked-peptide
Intra-linking
: This option allows to include searches also for intra-peptides (cross linking of the same peptides within the same protein)
In-silico digestion
Select the enzyme used for the in-silico digestion*:
Select an enzyme from the provided list.
Allowed number of miscleavages*:
Provide number of allowed miscleaveges
Minimum peptide mass considered for cross-linked combinations*:
The minimum mass of one putative peptide in order to include cross-linked peptide combinations
Maximum peptide mass considered for cross-linked combinations*:
The maximum mass of one putative peptide in order to include cross-linked peptide combinations
Modifications
Select the fixed modifications*:
Select fixed modifications from the given modification list.
Select the variable modifications*:
Select variable modifications from the given modification list.
Maximum number of variable modifications per peptide*:
Provide a number of maximum allowed variable modifications per peptide.
Scoring
Select the neutral losses to consider*:
This option allows including peaks for neutral losses while scoring. The first option is no neutral losses are taken into account. The second option introduces only singly charged water losses for D/E/S/T and ammonia losses for K/N/Q/R in the presence of the parent ions, as implemented in Andromeda. The last option is to introduce all water and ammonia losses are considered (both singly and doubly charged).
Select the fragmentation mode*:
Select one of the enlisted fragmentation modes, which as HCD (b and y ions also a2); CID (b and y ions), ETD (c and z ions).
Select the number of peptide tolerance windows*:
Provide the total number of peptide tolerance windows.
Peptide mass tolerance window 1:
The first opened peptide tolerance window in either PPM or Da. Base
is the center of this peptide tolerance mass window. For example, if its Value
is 2Da with its Base
of 1.5Da will open a window with values of -0.5Da to 3.5Da. Note that the unit of this base is always Dalton.
Peptide mass tolerance window 2:
The second opened peptide tolerance window in either PPM or Da. Base
is the center of this peptide tolerance mass window.
Peptide mass tolerance window 3:
The third opened peptide tolerance window in either PPM or Da. Base
is the center of this peptide tolerance mass window.
Peptide mass tolerance window 4:
The fourth opened-peptide tolerance window in either PPM or Da. Base
is the center of this peptide tolerance mass window.
Peptide mass tolerance window 5:
The fifth opened peptide tolerance window in either PPM or Da. Base
is the center of this peptide tolerance mass window.
Fragment mass tolerance*:
Provide the fragment mass tolerance value (in Dalton)
Minimum number matched peaks for cross-linked peptides*:
Number of minimum required theoretical peaks along peptide-bonds from each peptide in order to keep this cross-linked peptides to spectrum matches.
Peak matching
: This option allows finding either all matched theoretical peaks within a tolerance or only the closest theoretical peak within a tolerance.
MS1 mass differences reporting unit*:
Select the unit to report MS1 differences on the final list.
Spectrum preprocessing
Spectrum scoring mass window value*:
Provide the mass value to divide spectrum into windows/bins during scoring
Minumum number of filtered peaks per window*:
Provide the minimum number of filtered peaks per mass window during scoring - Inclusive
Maximum number of filtered peaks per window*:
Provide the maximum number of filtered peaks per mass window during scoring - Inclusive
Lower precursor mass bound for selecting the C13 peak over the C12 peak*:
Provide the minimum precursor mass (Da) that C13 peak might be selected over C12 (the point on which we start observing C13 peak selection above this given precursor mass).
Deisotope precision
: Provide the allowed tolerance between the C12 peak and the C12 with one C13 fragment peak (in Da).
Deconvulate precision
: Provide the precision value to select if a singly charged and its deconvoluted peak exist within this precision value (in Da).
Multithreading and validation
Number of threads
: Provide the number of cores for multithreading
Write separate Percolator input files*
: Enables writing separate Percolator input files. Currently, being tested and not fully-function yet
FDR calculation*:
Select either improved
or global
. improved
splits the XL sites lists into two groups and computes FDR for each sub-XL sites according to the given values of Inter-protein improved FDR value*
and Intra-protein improved FDR value*
. In case of the selection of global
, FDR is being computed with all XL sites, according to user given value of Global FDR value*