Example data
These datasets were obtained by peterslab. They include specific and non-specific digests, performed either in solution or under HDX-MS conditions, using single proteins, defined protein mixtures, or complex whole-cell lysates. All LC-MS/MS datasets were acquired on timsTOF instruments (Pro or SCP) and searched using the MASCOT search engine against a custom or UniProt database. Each dataset is provided as a zip file containing the search results in *.csv format, along with a second zip archive containing the corresponding FASTA database.
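As a starting point for working with these archives, the following minimal sketch reads the peptide table from a zipped CSV export. It rests on assumptions not stated above: that the archive holds at least one *.csv file, that the file is UTF-8 encoded, and that the peptide table begins at a header row whose first column is named `prot_hit_num`, preceded by a preamble of search parameters. The function name `read_peptide_table` is hypothetical, and the marker should be adjusted if your export differs.

```python
import csv
import zipfile


def read_peptide_table(zip_path, header_marker="prot_hit_num"):
    """Return the peptide rows (as dicts) from the first *.csv file in a zip archive.

    Assumption: the table header row starts with `header_marker`; any lines
    before it (search parameters etc.) are skipped.
    """
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        # utf-8-sig also strips a byte-order mark if one is present
        text = archive.read(csv_names[0]).decode("utf-8-sig")

    lines = text.splitlines(keepends=True)
    # Locate the table header; fall back to the first line if no marker is found
    start = next(
        (i for i, line in enumerate(lines)
         if line.lstrip('"').startswith(header_marker)),
        0,
    )
    return list(csv.DictReader(lines[start:]))
```

Each returned row is a dictionary keyed by the column names found in the header row, so downstream code can pick out whatever columns the export actually contains.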
AnPEP digests
AnPEP digests - MASCOT
AnPEP digests - database
A protein mixture (bovine serum albumin, bovine carbonic anhydrase 2, horse heart myoglobin, and horse cytochrome c, all commercially available, plus recombinant human 14-3-3 gamma) was digested in solution by AnPEP (also known as ProAlanase). Digestion was performed at pH 1.5 (30 mM HCl) for 1 h, 2 h, or overnight (ON), using different enzyme-to-protein ratios (1:50, 1:100, 1:200).
Longer digestion times and higher enzyme loadings result in reduced cleavage specificity. Under optimal conditions, digestion should preferentially occur after Pro, Ala, and Cys.
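One simple way to quantify the specificity described above is to count which residues sit in the P1 position (immediately N-terminal to each cleaved bond) and report the share that are Pro, Ala, or Cys. The sketch below is an illustration only, not DigDig's own calculation; the function names and the input format (1-based, inclusive start and end positions) are assumptions.

```python
from collections import Counter


def p1_residue_counts(protein, peptides):
    """Count the residue in P1 for every distinct cleaved bond.

    `peptides` is an iterable of (start, end) tuples, 1-based and inclusive.
    The protein's own N- and C-termini are not counted as cleavages.
    """
    bonds = set()  # a bond is identified by the 1-based position of its P1 residue
    for start, end in peptides:
        if start > 1:
            bonds.add(start - 1)  # cleavage that created the peptide N-terminus
        if end < len(protein):
            bonds.add(end)        # cleavage that created the peptide C-terminus
    return Counter(protein[pos - 1] for pos in bonds)


def fraction_after(counts, residues="PAC"):
    """Fraction of cleaved bonds whose P1 residue is one of `residues`."""
    total = sum(counts.values())
    return sum(counts[r] for r in residues) / total if total else 0.0
```

Comparing this fraction between the 1 h, 2 h, and overnight digests, or between enzyme-to-protein ratios, gives a single number per condition that tracks the loss of specificity.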
This dataset is associated with the publication: Kalaninova Z et al, Anal Chem. 2024, 96, 19084-19092.
CLCec1 digests
CLCec1 digests - MASCOT
CLCec1 digests - database
The recombinant Cl⁻/H⁺ antiporter CLC-ec1 from E. coli was digested online under HDX-MS conditions using immobilized protease columns: pepsin (iP), nepenthesin-2 (iN2), or co-immobilized pepsin + nepenthesin-2 (coiN2-P). Digestion was carried out at flow rates of 100 µL/min or 200 µL/min and at either 0 °C or 20 °C. Each condition was analyzed in triplicate. This dataset illustrates the effect of temperature and of a mixed-protease column on the digestion efficiency of membrane proteins.
This dataset is accompanied by an example Structure-Domain file which defines secondary structure elements (only helices in this membrane protein) and four types of “domains”: CP, PP, IM, and TM, corresponding to cytoplasmic, periplasmic, intramembrane, and transmembrane regions, respectively.
CyaA digests
CyaA digests - MASCOT
CyaA digests - database
The RTX toxin CyaA was digested online under HDX-MS conditions using pepsin (iP) or a co-immobilized pepsin–nepenthesin-2 (coiP-N2) column. Digestions were conducted at 200 µL/min, 0 °C, under denaturing quench conditions (4 M urea).
This dataset demonstrates digestion of a large protein (>1700 amino acids) and comparison of column performance.
Related publication: Osickova A et al, J Biol Chem. 2023, 299, 104978.
BSA trypsin
BSA trypsin - MASCOT
BSA trypsin - database
Bovine serum albumin was digested by trypsin under standard proteomic conditions: reduction and alkylation with TCEP and iodoacetamide, trypsin digestion (1:50) at pH 8.5, 12 h at 37 °C. The resulting LC-MS/MS data were searched using the “no enzyme” setting with carbamidomethylation (CAM) as a fixed modification. Despite being a classical tryptic digest of a single protein, the analysis reveals a surprisingly high number of semi-specific peptides, showing that even trypsin appears far from fully specific when its cleavage sites are examined in detail.
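To make the notion of semi-specific peptides concrete, the sketch below labels a peptide by how many of its termini agree with the trypsin rule (cleavage after Lys or Arg). It is a simplified illustration and not DigDig's classification: the function name is hypothetical, positions are assumed to be 1-based and inclusive, and the commonly applied exception for a following Pro is deliberately ignored.

```python
def classify_tryptic(protein, start, end, cleavage_residues="KR"):
    """Label a peptide as 'specific', 'semi-specific', or 'non-specific'.

    A terminus counts as specific when it coincides with the protein terminus
    or when the residue N-terminal to the cleaved bond is in `cleavage_residues`.
    """
    # N-terminal side: residue just before the peptide (position start - 1)
    n_ok = start == 1 or protein[start - 2] in cleavage_residues
    # C-terminal side: last residue of the peptide (position end)
    c_ok = end == len(protein) or protein[end - 1] in cleavage_residues

    if n_ok and c_ok:
        return "specific"
    if n_ok or c_ok:
        return "semi-specific"
    return "non-specific"
```

Applied to every identified peptide from a "no enzyme" search, the three labels can be tallied to show how much of the digest falls outside the expected cleavage rule.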
Data related to the DigDig publication.
Repetitions and PNGase Rc - Myoglobin and haptoglobin 2-2
Mb concat and Hpt22-PNGase - MASCOT
Hpt22 - database
Mb concat - database
These two datasets illustrate how DigDig processes redundant peptide sequences.
Horse heart myoglobin was digested under HDX-MS conditions using an online co-immobilized pepsin–nepenthesin-2 column (200 µL/min, 0 °C). The MS data were searched against a custom database in which the myoglobin sequence was replicated four times. This setup tests DigDig’s ability to recognize and report repeated peptide sequences.
Human haptoglobin 2 (Hpt2), which naturally contains a repetitive segment in its α-subunit, was analyzed under HDX-MS conditions. The protein was reduced and denatured (125 mM TCEP, 2 M urea, 5 min, 0 °C), then digested online using the co-immobilized pepsin–nepenthesin-2 column (200 µL/min, 0 °C). In a subset of experiments, a PNGase Rc column was added downstream to evaluate the effect of on-column deglycosylation. These data serve two purposes: they demonstrate how DigDig handles repetitive sequences in real proteins, and they illustrate the effect of deglycosylation on digestion patterns.
Human Hpt2 contains four N-glycosylation sites: NLT (168), NHS (191), NAT (195), and NYS (225); the numbering follows the mature sequence provided in the database. Comparison of datasets with and without PNGase treatment shows that, in non-deglycosylated samples, coverage is limited at the fourth N-glycosylation site and absent at the first three. Online deglycosylation markedly improves peptide redundancy and coverage across these glycosylation sites.
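The underlying problem with repeated sequences is that one peptide can map to several positions in the same protein. The following sketch shows one way to list every position; it is not DigDig's implementation, the function name is hypothetical, and the short sequence in the usage note is a toy example rather than the real myoglobin sequence.

```python
def find_all_occurrences(protein, peptide):
    """Return the 1-based start position of every match, including overlapping ones."""
    positions = []
    idx = protein.find(peptide)
    while idx != -1:
        positions.append(idx + 1)  # convert 0-based index to 1-based position
        # Resume one residue further on so that overlapping matches are found too
        idx = protein.find(peptide, idx + 1)
    return positions
```

For a toy unit `"GLSDGEWQ"` concatenated four times, `find_all_occurrences("GLSDGEWQ" * 4, "SDGEW")` returns `[3, 11, 19, 27]`, one hit per copy of the unit.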
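The site-level comparison above can be reproduced in outline by locating N-glycosylation sequons (Asn-X-Ser/Thr, where X is any residue except Pro) and counting how many identified peptides span each Asn. The sketch below is illustrative only: the function names are hypothetical, positions are 1-based and inclusive, and the sequence used in the test is a toy example, not the Hpt2 sequence.

```python
import re

# Lookahead keeps matches zero-width after the Asn, so adjacent sequons are all found
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


def site_coverage(sites, peptides):
    """Map each site to the number of peptides (start, end) that contain it."""
    return {
        site: sum(1 for start, end in peptides if start <= site <= end)
        for site in sites
    }
```

Running `site_coverage` on the peptide lists from the runs with and without the PNGase Rc column gives a per-site count that can be compared directly.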
HEK293 cell lysate, trypsin digest
HEK293 cell lysate - MASCOT
HEK293 cell lysate - database
HEK293 cell lysate was digested in solution with trypsin (1:50 enzyme:protein ratio, 12 h, 37 °C). Two sample preparation protocols were compared: (a) standard reduction and alkylation using TCEP and iodoacetamide prior to digestion, and (b) reduction with TCEP post-digestion, followed by immediate acidification to prevent disulfide bond formation.
MS data were searched using the “no enzyme” setting to allow unbiased assessment of cleavage specificity. The results highlight the behavior and apparent specificity of trypsin in complex mixtures. Data are associated with the DigDig publication.
Human serum, Asp-N digest
Human serum, Asp-N - MASCOT
Human serum, Asp-N - database
Unprocessed human serum was digested with Asp-N protease (enzyme-to-protein ratio 1:50, 12 h, 37 °C). Proteins were reduced with TCEP and alkylated with iodoacetamide. The data were searched using the “no enzyme” setting to enable unbiased evaluation of cleavage specificity. This dataset illustrates the cleavage preferences of a protease with N-terminal amino acid selectivity and serves as an example of non-tryptic digestion in complex biological samples.
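For a protease that cleaves on the N-terminal side of its target residue, the expected peptides all begin with that residue. The sketch below performs a simple in-silico Asp-N digest and measures how many observed peptides start with Asp. It is a simplified illustration under stated assumptions: cleavage is modeled only before Asp (the weaker cleavage before Glu that Asp-N can show is ignored), missed cleavages are not generated, and both function names are hypothetical.

```python
def aspn_expected_peptides(protein, target="D"):
    """Split a sequence before every `target` residue (no missed cleavages)."""
    # A target at position 0 needs no cut: the sequence already starts there
    cut_points = [i for i, aa in enumerate(protein) if aa == target and i > 0]
    bounds = [0] + cut_points + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


def n_terminal_target_fraction(observed, target="D"):
    """Fraction of observed peptide sequences that begin with `target`."""
    if not observed:
        return 0.0
    return sum(1 for pep in observed if pep.startswith(target)) / len(observed)
```

For example, `aspn_expected_peptides("MKDAAELDGR")` returns `["MK", "DAAEL", "DGR"]`. Note that the peptide from the protein N-terminus does not start with Asp, so the fraction stays below 1 even for a perfectly specific digest.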
Human serum, AnPEP online digest
Human serum, AnPEP online digest - MASCOT
Human serum, AnPEP online digest - database
Unprocessed human serum was diluted in 50 mM HEPES (pH 7.5), 150 mM NaCl, reduced with TCEP, and subsequently acidified to pH 1.5 with HCl. Proteolysis was performed online on an immobilized AnPEP column. Digestion was driven by a flow of 30 mM HCl (pH 1.5) at 100 µL/min, 0 °C. The resulting peptides were analyzed by LC-MS/MS and searched against the human UniProt database with no enzyme specificity. These data, published in Kalaninova Z et al, Anal Chem. 2024, 96, 19084-19092, demonstrate AnPEP’s preferential cleavage after Ala, Pro, and Cys, and illustrate how digestion metrics can be extracted for individual proteins within a complex sample.
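Sequence coverage is one digestion metric that can be computed per protein in a complex sample. The sketch below groups peptides by protein accession and reports the fraction of residues covered by at least one peptide. It is an illustration rather than DigDig's own metric; the function name, the `(accession, start, end)` input format, and the accessions in the usage note are assumptions, with protein lengths expected to come from the FASTA database.

```python
from collections import defaultdict


def coverage_by_protein(peptides, lengths):
    """Return {accession: fraction of residues covered by at least one peptide}.

    `peptides` is an iterable of (accession, start, end), 1-based and inclusive.
    `lengths` maps each accession to its sequence length.
    """
    covered = defaultdict(set)
    for accession, start, end in peptides:
        # A set of positions means overlapping peptides are counted only once
        covered[accession].update(range(start, end + 1))
    return {acc: len(pos) / lengths[acc] for acc, pos in covered.items()}
```

With peptides `[("ALBU", 1, 10), ("ALBU", 5, 20), ("TRFE", 3, 7)]` and lengths `{"ALBU": 40, "TRFE": 20}`, the result is `{"ALBU": 0.5, "TRFE": 0.25}`.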
