From CAGT

IRDB Help

[Annotations] [Arm Join Tool] [Browser] [Clustering] [Data Download] [Distributions] [FASTA] [Flanking Sequences] [GAP] [GFF] [Guest Account] [History] [IRF] [My Account] [Partitions] [Projects] [Redundancy] [Repeats] [Reports] [Sequences] [Sets] [Tools]

Annotations (top)

  1. annotation explanation Annotations are additional information associated with a sequence (such as gene or exon information).

    Annotations can be viewed either through the browser (accessible from the same page as distributions) or from the “filter repeats” page, where you can run queries on them.

  2. annotate a sequence

    Interface:
    You can annotate the sequence by selecting "annotate" on the sequences menu (to upload a new file) or by going to the modify menu (to view and maintain already uploaded ones).

    If you click on the “modify” link, you are presented (along with the sequence description info) with a list of annotation files (in GFF format), which is initially empty. You can then proceed to upload GFF files with different features one at a t ...continue

  3. downloading annotations You can download annotations uploaded by yourself from the >sequences>annotate menu.

  4. viewing annotations You can view repeat annotations from the same page as any other field (">sets>view repeats"). To make them viewable, go to "Change Columns" and select the annotations of interest (e.g., Gene, Intron). There are a number of predefined annotation fields. If a sequence has annotations that are not defined in these fields, you can select the "Other Feature" annotation field and it will be placed there together with any other " ...continue

  5. origin of data Annotations for sets in the Public Database project were downloaded from the following locations:

    NCBI
    ensembl.org
    wormbase.org
    genome.ucsc.edu
    yeastgenome.org

Arm Join Tool (top)

  1. arm join explanation The purpose of the tool is to reassemble repeats that were broken apart during the search.

    The algorithm is a first draft and not yet in its final form.

    The algorithm first orders the repeats by left index, builds a directed graph recording which repeat lies inside which, and then goes from left to right through the links, merging two at a time the repeats that match the criterion.

    The criterion is that the combined score is larger than the in ...continue

Browser (top)

  1. browser explanation You can get to the browser page by selecting a set via the SETS menu and clicking the "BROWSER" button. At the top of the page you are presented with the description of the set. Below the description, there is a box labeled "Browser Options". It contains various options for controlling the browser layout and which data the browser displays.

    First, select the zoom level. This is how much data the browser will try to display starting from the selected range point ...continue

Clustering (top)

  1. compare flanking sequences When clustering tandem repeats TRDB makes use of similarity in the repeating pattern. Once a cluster of repeats is obtained it is possible to execute a flanking sequence comparison to determine if a given set of repeats exhibits significant similarities in the sequences surrounding the repeats. This information can reinforce the notion of likeness between a specific pair of repeats.

    The output of the flanking sequence comparison is presented in two ...continue

  2. creating a cluster manually You don't need to run a clustering algorithm to create a cluster. Simply save a set "as cluster" when making a copy of it and place it into any of the available partitions. You might need this because some tools can only be run on clusters (flanking comparisons, label copies).

  3. label copies This tool labels each copy of a repeat with a unique letter. You can download the label explanation file by clicking the button on the bottom of the page.

  4. clustering explanation

    Clustering can be performed by running the clustering tool against a set. In running this tool you create a partition of the set. Partitions can be accessed from the main menu at any time. In the clustering tool page you provide the following:

    • A name for the new partition.
    • A project to which the partition will be associated.
    • A distance table to use for aligning repeats in the clust ...continue

Data Download (top)

  1. download explanation Data download is accessible from the tool menu. It allows you to download the contents of a set in different formats.

    First select the columns you wish to download by selecting them in the box on the left and moving them to the right by clicking the ">>" arrow. You can select multiple columns by holding down the SHIFT key.

    Then select the ordering of the set in the "Order By" selection box. Note that you cannot sort on some fields like (pattern, pro ...continue

Distributions (top)

  1. distributions explanation You can get to the distributions page by selecting a set via the SETS menu and clicking the "VIEW DISTRIBUTIONS" button. On the top of the page you are presented with the description of the set. Below the description, there is a box labeled "Distribution Options". It contains various options for controlling distributions layout.

    "Distribution On" drop down box allows you to choose the column of interest. The option below (graph or table) allows you to choose ...continue

FASTA (top)

  1. FASTA The FASTA format looks something like this:


    >myseq
    AGTCGTCGCTAGCTAGCTAGCATCGAGTCTTTTCGATCGAG
    CTAGCTAGCTAGCATGTCGCTCGAGCATGTCGCTCATGAGA
    TTTAGCTAGCTAGCATAGCATACGAGCATATCGGTGTCGCT


    The first line starts with a greater than sign ">" and contains a name or other identifier for the sequence. The remaining lines contain the sequence data. The sequence can be in upper or lower case letters. Anything other than letters (numbers for example) is ...continue
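
    As an illustration of the format described above, here is a minimal Python sketch of a FASTA reader. It is not the parser IRDB uses; the function name `parse_fasta` is hypothetical, and dropping non-letter characters and upper-casing the result are assumptions made for this example.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA records into an {identifier: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: everything after ">" is the identifier.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Assumption: non-letter characters (digits, spaces) are dropped
            # and the sequence is normalized to upper case.
            records[name] += "".join(ch for ch in line if ch.isalpha()).upper()
    return records


example = [
    ">myseq",
    "AGTCGTCGCTAGCTAGCTAGCATCGAGTCTTTTCGATCGAG",
    "CTAGCTAGCTAGCATGTCGCTCGAGCATGTCGCTCATGAGA",
    "TTTAGCTAGCTAGCATAGCATACGAGCATATCGGTGTCGCT",
]
records = parse_fasta(example)
print(list(records))
```

    The sequence lines are concatenated in order, so a multi-line record ends up as one string per identifier.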

Flanking Sequences (top)

  1. flanking sequences explanation Flanking sequences can be downloaded by using the "Flanking Sequence and Primers Extraction" tool. You are first asked to select a set. After doing so, you pick the flanking sequence options. Select the size of the flanking sequences first (50, 100, 200, 350, or 500). Then choose the output format (ASCII or XML). Don't forget to choose the line termination that your system understands (newline for unix, carriage return for mac, or both for windows). ...continue
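
    The options above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the tool's implementation: the names `extract_flanks` and `to_ascii` are hypothetical, repeat indices are assumed to be one-based and inclusive, and the ASCII layout (one flank per line) is invented for the example.

```python
ALLOWED_FLANK_SIZES = (50, 100, 200, 350, 500)

# Line terminations as described in the help text.
LINE_ENDINGS = {"unix": "\n", "mac": "\r", "windows": "\r\n"}


def extract_flanks(sequence: str, first: int, last: int, flank_size: int):
    """Return (left, right) flanks around a repeat at first..last (1-based, inclusive)."""
    if flank_size not in ALLOWED_FLANK_SIZES:
        raise ValueError("flank size must be one of %r" % (ALLOWED_FLANK_SIZES,))
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("repeat indices fall outside the sequence")
    # Flanks are truncated at the sequence ends rather than padded.
    left = sequence[max(0, first - 1 - flank_size):first - 1]
    right = sequence[last:last + flank_size]
    return left, right


def to_ascii(left: str, right: str, platform: str = "unix") -> str:
    """Hypothetical ASCII layout: one flank per line, with the chosen terminator."""
    eol = LINE_ENDINGS[platform]
    return left + eol + right + eol


sequence = "A" * 60 + "CGCGCG" + "T" * 60
left, right = extract_flanks(sequence, 61, 66, 50)
print(len(left), len(right))
```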

GAP (top)

  1. GAP detection This tool scans through the sequence and creates an annotation entry each time it finds a run of Ns of at least a certain minimum length. The annotation is stored in the "OTHER FEATURE" track. This makes it possible to find repeats that have big gaps overlapping or adjacent to them. NOTE: just as when uploading a GFF file, you will need to run the index regeneration from the "sequence->modify" menu in order to be able to search through the results.
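
    The scan described above can be sketched in a few lines of Python. This is an illustrative sketch, not the tool's code: the function name `find_gaps` is hypothetical, and reporting one-based inclusive coordinates and accepting lower-case "n" are assumptions.

```python
import re
from typing import List, Tuple


def find_gaps(sequence: str, min_length: int) -> List[Tuple[int, int]]:
    """Return (start, end) of every run of Ns at least min_length long.

    Coordinates are one-based and inclusive (an assumption for this sketch).
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    pattern = re.compile(r"[Nn]{%d,}" % min_length)
    # m.start() is zero-based, m.end() is exclusive, so (start + 1, end)
    # gives a one-based inclusive interval.
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


print(find_gaps("ACGTNNNNNACGNNACGT", 3))
```

    Each returned interval corresponds to one annotation entry that the tool would store.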

GFF (top)

  1. GFF The GFF (General Feature Format) format looks something like this:

    SEQ1	EMBL	atg	103	105	.	+	0
    SEQ1	EMBL	exon	103	172	.	+	0
    SEQ1	EMBL	splice5	172	173	.	+	.
    SEQ1	netgene	splice5	172	173	0.94	+	.
    SEQ1	genie	sp5-20	163	182	2.3	+	.
    SEQ1	genie	sp5-10	168	177	2.1	+	.
    SEQ2	grail	ATG	17	19	2.1	-	0
    
    Fields are tab delimited. The file size is unlimited. For the latest GFF file specification consult ...continue
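
    To make the tab-delimited layout concrete, here is a minimal Python sketch that reads one line shaped like the example above. The column names (seqname, source, feature, start, end, score, strand, frame) follow common GFF usage and are an assumption here, as are the names `GffRecord` and `parse_gff_line`; this is not the parser IRDB uses.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]  # "." in the file means no score
    strand: str
    frame: str


def parse_gff_line(line: str) -> GffRecord:
    """Parse the first eight tab-delimited columns of a GFF line."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-delimited fields")
    seqname, source, feature, start, end, score, strand, frame = fields[:8]
    return GffRecord(
        seqname, source, feature, int(start), int(end),
        None if score == "." else float(score), strand, frame,
    )


line = "\t".join(["SEQ1", "netgene", "splice5", "172", "173", "0.94", "+", "."])
record = parse_gff_line(line)
print(record.feature, record.start, record.end, record.score)
```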

Guest Account (top)

  1. guest account A guest account permits browsing of our online database but does not allow storage or copying of any data in the database. You must register to get a private workspace inside the database.

History (top)

  1. set history explanation Sets are often created, filtered and merged. After doing it for a while, you may not remember the history of the set. That's what the Set History page allows you to view. You can get to the HISTORY page by selecting a set via the SETS menu and clicking the "VIEW HISTORY" button. On the top of the page you are presented with the description of the set. On the bottom of the page there is an image with a tree like structure. It shows you the history of the set. Blue b ...continue

IRF (top)

  1. IRF Inverted Repeats Finder is a program to locate and display inverted repeats in DNA sequences. It was developed by Dr. Gary Benson.

My Account (top)

  1. modify user options
    1) Repeats Per Page - indicates how many repeats are displayed on one page while viewing a set (or cluster) of repeats.

    2) Show Repeats In New/Same window indicates whether a new browser window is created for each repeat while viewing it. It may be useful to use multiple windows if you want to compare two repeats next to each other.

  2. modify alignment options This option allows you to modify your default alignment parameters as well as the length of the flank used by the "compare flanks" tool (see clustering for more information).

  3. view explanation View menu allows you to change the columns that are displayed when you are viewing repeats and on which you can filter and order them. First select the columns you wish to display by selecting them in the box on the left ("Availible Extra COlumns") and moving them to the right by clicking the ">>" arrow to the "Selected Extra Columns". You can select multiple columns by holding down the SHIFT key. Some columns (like indices, pattern size and copy number) are there ...continue

  4. changing personal information This option allows you to modify your contact information. Your name is used to refer to you in the database. You can type either your real name or a handle of your choice. Your email address will only be used if we ever have to contact you. Make sure your contact email is correct, because it is also your login name (so you don't get locked out).

  5. changing password This option allows you to change your password. Type in the old password and the new password twice. As a reminder, if you forget your password, you can always go to the "password reminder" page. You can get to it from the login page.

  6. changing styles This option allows you to change the way your pages look. It has a set of predefined style sheets that you can choose to render your pages with. Later on we will add more options here for changing background color or image and maybe a feature to allow the user to upload their custom style sheets.

Partitions (top)

  1. partition explanation Partitions are produced by running a clustering algorithm against a set of repeats. Clustering is available through the TOOLS menu. After the algorithm is run, partitions contain clusters of similar repeats. The following information is displayed: completion status, cutoff value with which the algorithm was run, number of clusters produced, name of the clustering algorithm and the creation date. The following actions are available for partitions: delete. Clicking on t ...continue

Projects (top)

  1. projects explanation Projects are holders for sets of repeats and the results of analysis. If you want to generate a new set, you must have an active project you can add it to. You cannot add sets to public projects. The Projects page contains a list of projects you created or joined, allows you to perform various actions on them and lets you create new ones. The following information is displayed: project name, project owner, number of users in the project, number of sets in the project. The fol ...continue

  2. other options Delete option allows you to delete a project. Modify option allows you to modify project name, description and other fields. Clicking on info will pop up a window with more detailed information about a project. Note: you are not allowed to delete a project if there are still sets in it. Delete the sets first, and then delete the project.

  3. create a project Go to the "create project" page by clicking the "create a project" button on the projects page. Once there, write the name and the description (optional) and click on the "CREATE" button.

  4. modify a project Modify a project page is accessible by clicking the modify action in front of the project. This option allows you to do two things. First, you can modify the project's name and description. Second, you can add collaborators to the project. Just type the user's email into the textbox on the button and press add. If the user is in our database, (s)he will be added to your project. You can remove the user at any time by clicking the [remove] link in front of the user' ...continue

Redundancy (top)

  1. redundancy explanation The purpose of the Redundancy Tool in IRDB is to eliminate redundant repeats: repeats that were reported two or more times, and partial repeats (where one repeat is actually part of another). This happens because repeats can be found at different intervals or slightly different centers due to possible tandem repeats that have matching inverted repeats (ex: ATATAT and TATATA are both tandem repeats and inverted repeats that can easily be shifted 2 c ...continue

Repeats (top)

  1. edit comment Comments are short (up to 300 chars) annotations entered by the user.

    Users can modify comments of repeats of projects of which they are members.

    Users cannot modify comments of repeats which are parts of public projects (only administrators can do that).

    If a set is copied from a public project into a private project, comments can be modified.

  2. view repeats explanation Each row of the repeats table represents a single repeat. The first column indices are the indices where the repeat is located inside the original sequence. If you click on the indices link, a window will pop up with alignment explanation and a visual representation of the repeat occurrences. "Pattern size" is the length of the consensus pattern. "Copy Number" is the number of copies detected. Other columns are optional and can be added and removed via the "VIEW" m ...continue

  3. alignment explanation This page displays information about one repeat and its alignment. Information about the source sequence, annotations and other repeat characteristics is also displayed. Flanking sequences can optionally be displayed. When flanking sequences are selected, the database tries to find fractions of the pattern in them. It will display at most 10 fractions (in yellow) that have a score over 14.

  4. repeat explanation Repeats are tandem repeats found in DNA sequences by the TRF algorithm. In order to view them, you must select a set from the SETS menu and click on the "VIEW REPEATS" button. This takes you to the "sets > view repeats" page. This page has two main sections: the filter section and the viewing section.

  5. filter explanation Filter allows you to filter out repeats you do not want. Select a condition from the FIELD textbox, a comparison operator from the OP textbox and type the desired value into the VALUE textbox. For example, selecting PatternSize > 20 and pressing APPLY will filter out the repeats that are less than or equal to 20. Once you have entered a filter, it will be displayed above the selection line. If you want to get rid of the filter, uncheck the checkbox in front of the ...continue
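
    The FIELD / OP / VALUE idea can be illustrated with a small Python sketch. It is only a model of the behaviour described above, not the database's query code: the function name `apply_filter`, the dictionary representation of a repeat and the set of supported operators are all assumptions.

```python
import operator
from typing import Dict, List

# Hypothetical operator table; the real OP list may differ.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def apply_filter(repeats: List[Dict], field: str, op: str, value) -> List[Dict]:
    """Keep only the repeats for which `field op value` holds."""
    compare = OPERATORS[op]
    return [repeat for repeat in repeats if compare(repeat[field], value)]


repeats = [
    {"PatternSize": 10, "CopyNumber": 4.0},
    {"PatternSize": 20, "CopyNumber": 2.5},
    {"PatternSize": 35, "CopyNumber": 3.0},
]
# PatternSize > 20 removes repeats whose pattern size is 20 or less.
kept = apply_filter(repeats, "PatternSize", ">", 20)
print(kept)
```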

Reports (top)

  1. reports explanation Reports are documents inside the LBI database that you can create to keep track of your work and/or share it with other people. They are designed to make it easy to pull any kind of information out of the database and store it for further viewing. Your report can have an unlimited amount of text, as well as any number of pictures or records inserted in it (note: records and pictures are stored statically, therefore they will not change inside the report if your data is u ...continue

  2. create a report This page lets you create a new report by providing its name and abstract (optional). Press CREATE when you have finished filling in the required information and a new empty report will be created.

  3. modify a report This page lets you change the report name and abstract. In addition, you can share this report with other people by typing in their emails.

  4. edit a report When you click on the edit button of a new report you are presented with a single text area. You can start typing your text there. If you need to insert a resource, first save your work, then simply navigate to your resource inside the database and press the "SAVE RESOURCE" button. You will be prompted to select the report to add it to. Once you select the target report, press the "SAVE RESOURCE" button again. Your resource will be added to the end of the selected ...continue

Sequences (top)

  1. sequence explanation Sequences are DNA sequences in FASTA format. The Sequences page contains a list of sequences you uploaded, allows you to perform various actions on them and lets you upload new ones. The following information is displayed: sequence name, number of sequences inside the FASTA file, sequence length (if there is more than one subsequence in the file, this number is the sum of all subsequences), upload date. The following actions are available for sequences: delete, mod ...continue

  2. upload a new sequence The "upload a new sequence" form expects a DNA sequence in FASTA format (note: if you upload a file, make sure it is saved as a regular text file and not something else, like Word or rich text format). You can also just copy and paste the sequence text into the "Cut and Paste" section. You must also provide the name of the sequence and the name of the organism. Genbank number and description are optional. After you fill in all the necessary information, press "subm ...continue

  3. process sequence In order to run the search algorithm on a sequence you need to process it. You can get to the "process sequence" form by clicking the process link in front of the sequence you wish to process. Once there, make sure you enter the name of the future set and which project to add it to. At this point, you must have already created/joined a project. Note, you cannot add anything to public projects. Once you press process, a new set is created and you are automatic ...continue

  4. other sequence options Delete option allows you to delete a sequence. Modify option allows you to modify sequence name, description and other fields. Clicking on info will pop up a window with more detailed information about the sequence. Note: although deleting a sequence is allowable, some options (like set history) will not display correct information if this is done. You can also maintain annotation files from the "modify" menu (see "annotations" for more info).

  5. download sequence The “Download sequences” tool provides the following options:

    First, a range can be selected, to download a part of the sequence. By default, the range is set to the whole length of the sequence. The first input box is the starting index (note that we use a one-based coordinate system). The second is the length. You cannot enter a range exceeding the length of the sequence. (Note: if your file is a multi-sequence FASTA file, you will not be presented with the ...continue
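
    Because the range uses a one-based start plus a length, it helps to see how that maps onto zero-based slicing. The following Python sketch is an illustration only; the function name `subsequence` is hypothetical and the error handling is an assumption about how an out-of-range request might be rejected.

```python
def subsequence(sequence: str, start: int, length: int) -> str:
    """Return `length` bases beginning at the one-based position `start`."""
    if start < 1 or length < 0:
        raise ValueError("start must be >= 1 and length must be >= 0")
    if start - 1 + length > len(sequence):
        raise ValueError("range exceeds the length of the sequence")
    # One-based start maps to zero-based index start - 1.
    return sequence[start - 1:start - 1 + length]


print(subsequence("ACGTACGT", 3, 4))
```

    With start 1 and a length equal to the sequence length, the whole sequence is returned, which matches the default range described above.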

sets (top)

  1. Set Import SET IMPORT lets you import a set into the database using a native .DAT file (TRF/IRF output). Please note that in the TRDB version of this tool, the pattern is recomputed to produce a better pattern, and a profile is calculated along with it. In IRDB, profile calculation is skipped.

Sets (top)

  1. sets explanation Sets are produced by running the TRF algorithm on a DNA sequence. They are a collection of tandem repeats. Sets can be created, merged, copied, deleted and modified. Set is the main unit of the TRDB. Various tools like clustering, data download, etc... are run on sets.

    "View Repeats" button sends you to the page that lets you view repeats in a set. "View History" allows you to view the set history. "View Distributions" lets you view the way repeats a ...continue

  2. save or copy a set There are three reasons you might use this option. First, if you just filtered a set, you might want to save your results. Second, you might want to copy a set out of the public database into one of your projects. Finally, you might want to save a set as a cluster (to make some other tools available, like "compare flanks" or "label copies").

    First, select the project you want to add your new set to. Second, enter the name of your new set (a suggested name ...continue

  3. merge sets Merging is the process of combining two sets of repeats into a new set that shares some of the units from both of them (note: the new set may be a join, an intersect, or something else).

    First, select the project to add your new set to. Second, select the type of the merge ("A or B" is the default one and is usually the most commonly used one, which is a join). Third, pick the name for your new set (at least 4 characters). Optionally, you may pro ...continue
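
    The difference between a join and an intersect can be shown with Python's built-in sets. This is a conceptual sketch, not the merge tool itself: the function name `merge_sets` is hypothetical, repeats are represented here by (start, end) index pairs, and the "A and B" label for an intersect is an assumption (only "A or B" is named in this help text).

```python
from typing import FrozenSet, Tuple

Repeat = Tuple[int, int]  # hypothetical (start, end) identifier for a repeat


def merge_sets(a: FrozenSet[Repeat], b: FrozenSet[Repeat],
               merge_type: str = "A or B") -> FrozenSet[Repeat]:
    """Combine two sets of repeats according to the merge type."""
    if merge_type == "A or B":
        return a | b  # join: repeats found in either set
    if merge_type == "A and B":
        return a & b  # intersect: repeats found in both sets
    raise ValueError("unknown merge type: %r" % merge_type)


set_a = frozenset({(100, 140), (300, 360)})
set_b = frozenset({(300, 360), (900, 950)})
print(sorted(merge_sets(set_a, set_b)))
print(sorted(merge_sets(set_a, set_b, "A and B")))
```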

Tools (top)

  1. tools explanation The TOOLS menu gives you access to a number of tools you can run. A TOOL is defined as follows: "an action that affects some part of the database and generates some data which is either stored for later processing or immediately downloaded." Some database objects have ACTIONS associated with them (ex: you can select view or process actions for any sequence while viewing them). ACTIONS are also tools, but they are so interconnected with their objects as essential to the ...continue


