binary for Linux (Redhat 7.1)
Download Comet binary for Alpha (Compaq Tru64 UNIX V5.0A)
Download Comet binary for Sun (Solaris 8)
Download Comet binary for SGI / IRIX
Download Comet binary for Mac OS X (thanks to Eric Frangulian)
Don't forget to make the file executable by using chmod +x
Instructions for Using Comet from the Command Line
comet -i myseqs.fa -m mymatrices -a 20 -o outfile
- -i [required]
- Follow this option with the name of a file containing the sequences to be analyzed. This file should be in FASTA format, e.g.:
>first_sequence
AGGTCGAG...
GTGGAAC...
>second_sequence
...
- -m [required]
- Use this option to supply the program with a file containing a list of nucleotide count matrices. Each matrix defines the DNA sequence motif of a cis-element. The file has the following format:
>first_motif
1 1
5 2 38 5
29 1 15 5
3 7 5 35
>second_motif
1 1
4 2 2 12
...
The first line of each matrix definition begins with the symbol > followed by a name for the motif. The second line, which is optional, specifies two weights for the motif: one for the + strand and the other for the - strand. These weights let you specify how often you expect each cis-element to occur on each strand in regulatory clusters. The weights are relative, so multiplying all the weights for all the motifs by a constant makes no difference. If in doubt, leave it out.
The remaining lines contain counts of adenine, cytosine, guanine and thymine observed at each position in the cis-element, in a sample of cis-elements of this type.
Palindromes: for matrices that are exact complementary palindromes, there is no distinction between the + and - strand. Comet automatically detects exact complementary palindromes and assigns the motif a single overall weight, which is the sum of the two numbers on the second line of the matrix description.
- -a [optional]
- Specifies the average distance expected between motifs in a cluster. The default is 35.
- -o [optional]
- Specifies the name of a file to write the output to. The default is to write output to the screen.
- -e [optional]
- Specifies an E-value threshold: clusters with E-values greater than this threshold are not output. The default is 10.
- -w [optional]
- Local abundances of A, C, G and T are counted in windows of size 2w+1. The default is 75.
- -p [optional]
- Number of pseudocounts to add to all entries in cis-element matrices. The default is 1.
- -s [optional]
- Specifies a file to write statistical information used as an intermediate step in calculating the E-values. This option is mainly for development purposes.
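The matrix file format described under -m, the palindrome rule, and the -p pseudocounts can be illustrated with a short Python sketch. This is not Comet's own code. It assumes that each count line holds one motif position in the order A, C, G, T, and that the optional weight line is the only two-number line in a definition. All function names (parse_matrices, reverse_complement, is_exact_palindrome, effective_weight, with_pseudocounts) are hypothetical helpers made up for this illustration.

```python
def parse_matrices(text):
    """Parse '>name' / optional weight line / count rows into a list of dicts.

    Assumes one motif position per line, with counts in the order A, C, G, T.
    """
    motifs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            motifs.append({"name": line[1:].strip(), "weights": None, "rows": []})
            continue
        if not motifs:
            raise ValueError("data found before the first '>' line")
        nums = [float(x) for x in line.split()]
        cur = motifs[-1]
        if len(nums) == 2 and cur["weights"] is None and not cur["rows"]:
            cur["weights"] = (nums[0], nums[1])  # (+ strand, - strand)
        elif len(nums) == 4:
            cur["rows"].append(tuple(nums))
        else:
            raise ValueError("unexpected line in motif %r: %r" % (cur["name"], line))
    return motifs


def reverse_complement(rows):
    """Reverse the positions and swap A<->T, C<->G: (a, c, g, t) -> (t, g, c, a)."""
    return [tuple(reversed(r)) for r in reversed(rows)]


def is_exact_palindrome(rows):
    """True if the matrix equals its own reverse complement."""
    return list(rows) == reverse_complement(rows)


def effective_weight(motif):
    """Summed weight for exact palindromes, otherwise the (+, -) pair.

    Returns None when no weight line was given; the default Comet uses
    in that case is not stated in the text above.
    """
    w = motif["weights"]
    if w is None:
        return None
    return w[0] + w[1] if is_exact_palindrome(motif["rows"]) else w


def with_pseudocounts(rows, p=1.0):
    """Add p to every entry, mirroring the description of the -p option."""
    return [tuple(x + p for x in r) for r in rows]


if __name__ == "__main__":
    sample = ">first_motif\n1 1\n5 2 38 5\n29 1 15 5\n3 7 5 35\n"
    for m in parse_matrices(sample):
        print(m["name"], is_exact_palindrome(m["rows"]), effective_weight(m))
```

A check like is_exact_palindrome can be run on a matrix file before passing it to Comet, to see which motifs will be treated as strand-symmetric.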
The E-values will not be accurate when using a collection of cis-element matrices including: very similar matrices, a matrix that is almost a complementary palindrome, or a matrix with a high propensity for self-overlap, e.g. consensus sequence AAAAAA. It is recommended that very similar matrices be combined into a single matrix, and near-palindromic matrices be made exactly palindromic.
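One possible way to make a near-palindromic matrix exactly palindromic is to add it to its own reverse complement, which pools the counts from both strands. The Python sketch below shows this under the assumption that each row is one position with counts in the order A, C, G, T. The approach and the function names (reverse_complement, make_palindromic) are illustrative assumptions, not a procedure prescribed by Comet.

```python
def reverse_complement(rows):
    """Reverse the positions and swap A<->T, C<->G: (a, c, g, t) -> (t, g, c, a)."""
    return [tuple(reversed(r)) for r in reversed(rows)]


def make_palindromic(rows):
    """Sum a count matrix with its reverse complement, position by position.

    The result always equals its own reverse complement, so it is an
    exact complementary palindrome.
    """
    rc = reverse_complement(rows)
    return [tuple(x + y for x, y in zip(r, s)) for r, s in zip(rows, rc)]


if __name__ == "__main__":
    # Hypothetical two-position motif that is almost, but not exactly, palindromic.
    near = [(9, 1, 0, 0), (0, 0, 0, 10)]
    fixed = make_palindromic(near)
    print(fixed, fixed == reverse_complement(fixed))
```

Because the weights are relative and the counts are only used per position, doubling the total count in this way should mainly affect how much the pseudocounts matter. Halving the summed counts is an equally reasonable variant.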