DiffNovo is a tool for de novo peptide sequencing from MS/MS spectra, built on a Transformer-diffusion model. This guide covers installation, dataset preparation, and the main workflows: model training, evaluation, and prediction.
Ebrahimi, Shiva, et al. "DiffNovo: A Transformer-Diffusion Model for De Novo Peptide Sequencing."
To manage dependencies efficiently, we recommend using conda. Start by creating a dedicated conda environment:
conda create --name diffnovo_env python=3.10
Activate the environment:
conda activate diffnovo_env
Install DiffNovo and its dependencies via pip:
pip install diffnovo
To verify a successful installation, check the command-line interface:
diffnovo --help
Annotated DIA datasets can be downloaded from the datasets page.
DiffNovo requires pretrained model weights for predictions in denovo or eval modes. Compatible weights (in .ckpt format) can be found on the pretrained models page.
Specify the model file during execution using the --model parameter.
DiffNovo predicts peptide sequences from MS/MS spectra stored in MGF files. Predictions are saved as a CSV file:
diffnovo --mode=denovo --model=pretrained_checkpoint.ckpt --peak_path=path/to/spectra.mgf
To assess the performance of de novo sequencing against known annotations:
diffnovo --mode=eval --model=pretrained_checkpoint.ckpt --peak_path=path/to/test/annotated_spectra.mgf
Annotations in the MGF file must include peptide sequences in the SEQ field.
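Because eval mode depends on the SEQ field, it can help to check an annotated file before running it. The following is a minimal sketch, not part of DiffNovo itself: the function name spectra_missing_seq is hypothetical, and it assumes the usual MGF layout in which each spectrum sits between BEGIN IONS and END IONS lines and the annotation appears as a SEQ=... header line.

```python
from typing import Iterable, List


def spectra_missing_seq(lines: Iterable[str]) -> List[int]:
    """Return 0-based indices of spectra that lack a non-empty SEQ= line.

    Hypothetical helper (not a DiffNovo API). Assumes each spectrum is
    delimited by BEGIN IONS / END IONS, as in standard MGF files.
    """
    missing = []
    index = -1        # index of the spectrum block currently being read
    in_block = False
    has_seq = False
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            in_block, has_seq = True, False
            index += 1
        elif line == "END IONS":
            if in_block and not has_seq:
                missing.append(index)
            in_block = False
        elif in_block and line.upper().startswith("SEQ="):
            # Count the annotation only if a sequence follows the "=".
            has_seq = len(line) > len("SEQ=")
    return missing
```

An open file handle can be passed directly, for example spectra_missing_seq(open("path/to/test/annotated_spectra.mgf")). An empty result means every spectrum carries an annotation.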
To train a new DiffNovo model from scratch, provide labeled training and validation datasets in MGF format:
diffnovo --mode=train --peak_path=path/to/train/annotated_spectra.mgf --peak_path_val=path/to/validation/annotated_spectra.mgf
MGF files must include peptide sequences in the SEQ field.
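If only a single annotated MGF file is available, the training and validation files have to be produced from it first. The sketch below shows one possible approach, a reproducible random split at the spectrum level. The helper names read_spectra and split_spectra, the 10% default validation fraction, and the seed are all assumptions made for illustration. They are not DiffNovo functions or recommended settings.

```python
import random
from typing import Iterable, List, Tuple


def read_spectra(lines: Iterable[str]) -> List[str]:
    """Collect each BEGIN IONS ... END IONS block as one string."""
    spectra: List[str] = []
    current = None  # lines of the block being read, or None outside a block
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == "BEGIN IONS":
            current = [line]
        elif current is not None:
            current.append(line)
            if line.strip() == "END IONS":
                spectra.append("\n".join(current) + "\n")
                current = None
    return spectra


def split_spectra(
    spectra: List[str], val_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle with a fixed seed and return (train, validation) lists."""
    shuffled = list(spectra)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one validation spectrum whenever any spectra exist.
    n_val = max(1, round(len(shuffled) * val_fraction)) if shuffled else 0
    return shuffled[n_val:], shuffled[:n_val]
```

Writing "".join(train) and "".join(val) to two files gives inputs for --peak_path and --peak_path_val. A per-spectrum split can place spectra of the same peptide in both sets, so grouping by the SEQ value before splitting may be preferable when a stricter validation set is needed.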
To fine-tune a pretrained DiffNovo model, set the --train_from_scratch parameter to false and supply the pretrained checkpoint with --model:
diffnovo --mode=train --model=pretrained_checkpoint.ckpt \
--peak_path=path/to/train/annotated_spectra.mgf \
--peak_path_val=path/to/validation/annotated_spectra.mgf
For further details, refer to our documentation or raise an issue on our GitHub repository.