Cister Download

4/24/2002 : added -z option

Download Cister binary for Linux (Redhat 7.1)
Download Cister binary for Alpha (Compaq Tru64 UNIX V5.0A)
Download Cister binary for Sun (Solaris 8)
Download Cister binary for SGI / IRIX

Don't forget to make the file executable by using chmod +x

Instructions for Using Cister from the Command Line

Example usage:

cister -i myseqs.fa -m mymatrices -a 20 -b 8 -g 30000 -c clusterout -o motifout
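When Cister is driven from a script, the same command can be assembled programmatically. The following is a minimal sketch, assuming the binary is named cister and is on the PATH; the helper names cister_command and run_cister are hypothetical and not part of Cister. It uses only the flags shown in the example above.

```python
import subprocess


def cister_command(seqs, matrices, a=None, b=None, g=None,
                   cluster_out=None, motif_out=None, exe="cister"):
    """Build the argument list for one Cister run (hypothetical helper)."""
    # -i and -m are the two required options.
    cmd = [exe, "-i", seqs, "-m", matrices]
    # Optional flags are appended only when a value is supplied,
    # so Cister's own defaults apply otherwise.
    optional = [("-a", a), ("-b", b), ("-g", g),
                ("-c", cluster_out), ("-o", motif_out)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_cister(cmd):
    """Run the command; assumes the cister binary is installed and executable."""
    return subprocess.run(cmd, check=True)


cmd = cister_command("myseqs.fa", "mymatrices", a=20, b=8, g=30000,
                     cluster_out="clusterout", motif_out="motifout")
print(" ".join(cmd))  # reproduces the example command line above
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.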

Options

-i [required]
Follow this option with the name of a file containing the sequences to be analyzed. This file should be in FASTA format, e.g.:
>first_sequence
AGGTCGAG...
GTGGAAC...
>second_sequence
...
-m [required]

Use this option to supply the program with a file containing a list of nucleotide count matrices. Each matrix defines the DNA sequence motif of a cis-element. The file has the following format:

>first_motif
1 1
5 2 38 5
29 1 15 5
3 7 5 35
>second_motif
1 1
4 2 2 12
...
The first line of each matrix definition begins with the symbol > followed by a name for the motif. The second line, which is optional, specifies two weights for the motif: one for the + strand and the other for the - strand. These weights let you specify how often you expect each cis-element to occur on each strand in regulatory clusters. The weights are relative, so multiplying all the weights for all the motifs by a constant makes no difference. If in doubt, leave it out. The remaining lines contain counts of adenine, cytosine, guanine and thymine observed at each position in the cis-element, in a sample of cis-elements of this type.
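To check a matrix file before passing it to -m, the format described above can be read with a short script. This is a rough sketch and not Cister's own parser: the function name parse_matrices is hypothetical, fields are assumed to be whitespace-separated, and the second motif in the sample text is invented purely to show a definition without a weights line. Since the default weights are not stated here, a missing weights line is recorded as None.

```python
def parse_matrices(text):
    """Parse motif definitions into dicts with name, weights and counts.

    Hypothetical helper; counts are stored per position in A, C, G, T order.
    """
    motifs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A '>' line starts a new motif definition.
            motifs.append({"name": line[1:].strip(),
                           "weights": None, "counts": []})
            continue
        if not motifs:
            raise ValueError("data found before the first '>' line")
        motif = motifs[-1]
        fields = line.split()
        if len(fields) == 2 and motif["weights"] is None and not motif["counts"]:
            # Optional second line: weights for the + and - strands.
            motif["weights"] = (float(fields[0]), float(fields[1]))
        elif len(fields) == 4:
            # One position: counts of A, C, G and T.
            motif["counts"].append([float(x) for x in fields])
        else:
            raise ValueError("unexpected line in motif %r: %r"
                             % (motif["name"], line))
    return motifs


# first_motif is the example above; no_weights_example is invented.
sample = """\
>first_motif
1 1
5 2 38 5
29 1 15 5
3 7 5 35
>no_weights_example
4 2 2 12
10 3 3 4
"""

motifs = parse_matrices(sample)
for m in motifs:
    print(m["name"], m["weights"], len(m["counts"]), "positions")
```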

Palindromes: for matrices that are exact complementary palindromes, there is no distinction between the + and - strands. Cister automatically detects exact complementary palindromes and assigns the motif an overall weight equal to the sum of the two numbers on the second line of the matrix description.
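One way to picture the palindrome test is to compare a count matrix with its own reverse complement. The sketch below is an illustration under that assumption, not Cister's actual detection code; the function names are hypothetical, and the two-position AT matrix is invented for the demonstration. Rows are positions and columns are A, C, G, T, so reversing both the row order and the column order yields the reverse complement.

```python
def reverse_complement(counts):
    """Reverse-complement a count matrix (rows: positions; columns: A, C, G, T)."""
    # Reversing each row swaps A with T and C with G;
    # reversing the row order reads the motif from the other end.
    return [row[::-1] for row in reversed(counts)]


def is_exact_palindrome(counts):
    """True if the matrix is identical to its reverse complement."""
    return counts == reverse_complement(counts)


def overall_weight(weights):
    """Sum of the + and - strand weights, as described for palindromes."""
    plus, minus = weights
    return plus + minus


# Invented two-position motif "AT", which is its own reverse complement.
at_motif = [[10, 0, 0, 0],
            [0, 0, 0, 10]]
# The first_motif counts from the matrix example above.
first_motif = [[5, 2, 38, 5],
               [29, 1, 15, 5],
               [3, 7, 5, 35]]

print(is_exact_palindrome(at_motif))     # True
print(is_exact_palindrome(first_motif))  # False
print(overall_weight((1, 1)))            # 2
```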

-a [optional]
Specifies the average distance expected between motifs in a cluster. The default is 35.
-b [optional]
Specifies the average number of motifs expected in a cluster. The default is 6.
-g [optional]
Specifies the average distance expected between clusters. The default is 30000.
-w [optional]
Local background nucleotide abundances are counted using a sliding window of width 2w+1. The default is w = 1000.
-c [optional]
Specifies the name of a file for writing output about predicted cluster locations. Each line of the output file gives the posterior probability that the corresponding base in the sequence lies within a cis-element cluster.
-o [optional]
Specifies the name of a file for writing output about predicted cis-element locations.
-t [optional]
Specifies the minimum posterior probability for reporting a predicted cis-element. The default is 0.1.
-z [optional]
Suppresses printing of cluster posterior probabilities (in the file given by -c) that fall below the value specified with -z. If the value is greater than 0, each line of the -c file contains two numbers: the position in the sequence and the posterior probability.
-p [optional]
Number of pseudocounts to add to all entries in cis-element matrices. The default is 1.
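The effect of -p can be illustrated on a single matrix. In the sketch below, the function names are hypothetical, and converting the adjusted counts to per-position frequencies is an assumption about how the counts are used, made only to show why pseudocounts matter: they keep any nucleotide from having a frequency of exactly zero. The counts are those of first_motif from the matrix example.

```python
def add_pseudocounts(counts, p=1.0):
    """Add p to every entry of a count matrix (rows: positions; columns: A, C, G, T)."""
    return [[c + p for c in row] for row in counts]


def to_frequencies(counts):
    """Normalize each position's counts so they sum to 1 (assumed usage)."""
    return [[c / sum(row) for c in row] for row in counts]


first_motif = [[5, 2, 38, 5],
               [29, 1, 15, 5],
               [3, 7, 5, 35]]

adjusted = add_pseudocounts(first_motif, p=1)  # p=1 matches the documented default
freqs = to_frequencies(adjusted)
for row in freqs:
    print(" ".join("%.3f" % f for f in row))
```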