cd-hit-2d

Sequence file and databases

CD-HIT-2D compares two protein datasets (db1 and db2). It identifies the sequences in db2 that are similar to sequences in db1 at or above a given sequence identity threshold.

> Choose db1

Load the search database (in FASTA format):

> Choose db2

Load the query FASTA file from your computer:

Sequence Identity Parameters
> Sequence identity cut-off :
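
The fields above map onto a command-line call. Below is a minimal sketch, assuming the standard `cd-hit-2d` options `-i` (db1), `-i2` (db2), `-o` (output file) and `-c` (sequence identity cut-off). The helper name and file names are hypothetical, and this server may assemble its command differently.

```python
from typing import List


def build_cd_hit_2d_command(db1: str, db2: str, output: str,
                            identity: float = 0.9) -> List[str]:
    """Assemble an argument list for cd-hit-2d (hypothetical helper)."""
    # The cut-off is a fraction, e.g. 0.9 for 90% identity.
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity cut-off must be a fraction in (0, 1]")
    return ["cd-hit-2d", "-i", db1, "-i2", db2, "-o", output, "-c", str(identity)]


# File names below are placeholders for illustration only.
cmd = build_cd_hit_2d_command("db1.fasta", "db2.fasta", "db2_novel", 0.9)
print(" ".join(cmd))
```

On a machine where `cd-hit-2d` is installed, the list could be passed to `subprocess.run(cmd, check=True)`.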

Algorithm Parameters
> -G: use global sequence identity

> -g: each sequence is assigned to the most similar cluster that meets the threshold

> -b: bandwidth of alignment
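
To illustrate what `-G` switches between: according to the CD-HIT user guide, global identity divides the number of identical residues in the alignment by the full length of the shorter sequence, while local identity divides by the length of the alignment. The sketch below uses hypothetical function names and made-up numbers, and is not the program's own code.

```python
def global_identity(identical: int, len_a: int, len_b: int) -> float:
    """Identical residues divided by the full length of the shorter sequence."""
    return identical / min(len_a, len_b)


def local_identity(identical: int, alignment_length: int) -> float:
    """Identical residues divided by the length of the alignment."""
    return identical / alignment_length


# Made-up example: 90 identical residues in a 100-column alignment
# between sequences of length 150 and 200.
g = global_identity(90, 150, 200)
l = local_identity(90, 100)
print(f"global={g:.2f} local={l:.2f}")
```

With the same alignment, the local value is never lower than the global one here, so a pair can pass a cut-off under local identity and fail it under global identity.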

Alignment Coverage Parameters

> -aL: minimal alignment coverage (fraction) for the longer sequence
> -AL: maximum unaligned part (amino acids/bases) for the longer sequence
> -aS: minimal alignment coverage (fraction) for the shorter sequence
> -AS: maximum unaligned part (amino acids/bases) for the shorter sequence
> -s: minimal length similarity (fraction)
> -S: maximum length difference (amino acids/bases)

Length Control Parameters
> Length difference cutoff (fraction) :

> Length difference cutoff (amino acids/bases) :
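
The coverage and length options can be read as a set of filters applied to each pair of sequences. The following sketch is one interpretation of the option descriptions above, not CD-HIT's implementation. The class and function names are hypothetical, and the default values are assumptions chosen so that every filter is switched off.

```python
from dataclasses import dataclass


@dataclass
class Filters:
    # Defaults are assumed values that effectively disable each filter.
    aL: float = 0.0     # minimal alignment coverage of the longer sequence
    AL: int = 99999999  # maximum unaligned residues in the longer sequence
    aS: float = 0.0     # minimal alignment coverage of the shorter sequence
    AS: int = 99999999  # maximum unaligned residues in the shorter sequence
    s: float = 0.0      # minimal length similarity (shorter / longer)
    S: int = 999999     # maximum length difference in residues


def passes(len_long: int, len_short: int,
           aligned_long: int, aligned_short: int, f: Filters) -> bool:
    """Return True if a sequence pair satisfies every filter in f."""
    if len_short > len_long:
        raise ValueError("len_long must be at least len_short")
    return (
        aligned_long / len_long >= f.aL
        and len_long - aligned_long <= f.AL
        and aligned_short / len_short >= f.aS
        and len_short - aligned_short <= f.AS
        and len_short / len_long >= f.s
        and len_long - len_short <= f.S
    )


# Made-up pair: lengths 200 and 150, with 140 and 135 residues aligned.
pair = (200, 150, 140, 135)
print(passes(*pair, Filters()))        # no filters active
print(passes(*pair, Filters(aL=0.8)))  # longer sequence is only 70% covered
print(passes(*pair, Filters(s=0.8)))   # length ratio is 0.75
```

The fraction options (`-aL`, `-aS`, `-s`) scale with sequence length, while the absolute options (`-AL`, `-AS`, `-S`) cap the number of residues, so the two kinds can be combined.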

Email address for job checking

Give your email address:

Developed by @Zhipeng He