

site2genome

This tool locates sites (short DNA sequences) within the human, mouse, rat, fruit fly, and nematode (C. elegans) genomes. It works by fetching user-supplied longer sequences that contain the sites from GenBank, then aligning each sequence to the appropriate genome using the BLAT web server at UC Santa Cruz.

Enter up to 100 sites in this format:  
  GenBank identifier/Name               |  Sequence          |  Description
E.g.
  M73700.1                              |  caGGTCAaggCGATCtt |  Lactoferrin ERE
Or
  Human neutrophil lactoferrin promoter |  caGGTCAaggCGATCtt |  Lactoferrin ERE
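
As a rough illustration of the format above, the following Python sketch splits one input line into its three fields. The function name `parse_site_line` and the returned tuple layout are hypothetical and are not part of site2genome; the sketch only assumes that fields are separated by `|` and that the description may be left out.

```python
def parse_site_line(line):
    """Split 'identifier or name | sequence | description' into a tuple.

    Hypothetical helper for illustration only; not part of site2genome.
    """
    # Fields are separated by '|'; surrounding whitespace is padding.
    fields = [field.strip() for field in line.split("|")]
    if len(fields) > 3:
        raise ValueError("too many '|' separators")
    if len(fields) < 2 or not fields[0] or not fields[1]:
        raise ValueError("expected at least: identifier/name | sequence")
    identifier, sequence = fields[0], fields[1]
    # The description is optional, so default to an empty string.
    description = fields[2] if len(fields) == 3 else ""
    return identifier, sequence, description


example = "  M73700.1                              |  caGGTCAaggCGATCtt |  Lactoferrin ERE"
print(parse_site_line(example))
```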

Search name using: GenBank, LocusLink, or Ensembl
Option: Suppress RNA entries from name search results

You can specify the longer sequence by its GenBank identifier or by its name, as identified by GenBank or Ensembl. You may also search by a locus name in LocusLink. If you use a name, you must first specify the organism (human, mouse, rat, fruit fly, or nematode), and you may later have to select a specific hit (or hits) if the name does not resolve to a unique record. Please note that sites of interest most often lie in the upstream regulatory regions of genes rather than in the transcript's cDNA sequence, which is the most likely result when searching by a common name.
If the sequence has mixed case, the outermost uppercase letters define the site boundaries. Additional lowercase letters may help to place the site uniquely in the GenBank sequence. Unknown nucleotides may be specified by 'n'. The GenBank identifier and sequence should not contain any spaces. The description is optional.
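
The mixed-case rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the function names `site_bounds` and `find_site` are hypothetical, the comparison is assumed to be case-insensitive, `n` is assumed to match any single base, and a sequence with no uppercase letters is assumed to be a site in its entirety.

```python
def site_bounds(sequence):
    """Return (start, end) of the site within `sequence`, end exclusive.

    The site spans the outermost uppercase letters. Assumption: if there
    are no uppercase letters, the whole sequence is taken as the site.
    """
    upper = [i for i, ch in enumerate(sequence) if ch.isupper()]
    if not upper:
        return 0, len(sequence)
    return upper[0], upper[-1] + 1


def find_site(longer, query):
    """Return the site coordinates for every placement of `query` in `longer`.

    Assumptions: matching ignores case, and 'n' in the query matches any base.
    The lowercase flanks of the query must match too, which is how they can
    help place the site uniquely.
    """
    q, t = query.lower(), longer.lower()
    start, end = site_bounds(query)
    hits = []
    for offset in range(len(t) - len(q) + 1):
        window = t[offset:offset + len(q)]
        if all(a == "n" or a == b for a, b in zip(q, window)):
            hits.append((offset + start, offset + end))
    return hits


query = "caGGTCAaggCGATCtt"
start, end = site_bounds(query)
print(query[start:end])  # the site itself, without the lowercase flanks
print(find_site("ttttcaggtcaaggcgatcttgg", query))
```

More than one hit in the result would mean the flanking lowercase letters were not enough to place the site uniquely in the longer sequence.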


Please be patient: locating multiple sites may take some time!

Citation

Frith, M. C.*, Halees, A. S.*, Hansen, U. & Weng, Z. (Accepted)
Site2genome: Locating Short DNA Sequences in Whole Genomes
Bioinformatics. *Joint First Authors
