About the Center for Advanced Genomic Technology (CAGT)

CAGT evolved from the Molecular Engineering Research Laboratory (MERL), which was founded by Charles DeLisi in 1990 as a new interdisciplinary program in Boston University's College of Engineering. The primary participants in the work of the Center are faculty and students in the Bioinformatics program at Boston University.

Biological cells have developed methods for transmission and control of information, for memory and learning, and for error correction and adaptation, all optimized over hundreds of millions of years of evolution. Recent developments in high-throughput experimental and computational methods have placed us, for the first time, in a position to understand these processes and to use them in clinical medicine and engineering in ways that can currently barely be glimpsed. CAGT is positioned to play an important role in such progress through new forms of collaboration and training that will provide the intellectual foundation required for breakthrough technologies in computation, information handling and engineering.

 

 Recent News

  • BU IGERT Challenge Project video
  • With support from Merck Pharmaceuticals, Boston, CAGT and BUMC sponsored a one-day symposium, Preparing for Bio-Threats: Emerging and Re-emerging Infectious Diseases, on 12/14/05 at Boston University.
  • The CAGT website now uses collaborative wiki software to automate publishing; all CAGT members are welcome to participate.
  • Dr. Charles DeLisi, founder of the Boston University Bioinformatics Graduate Program, was featured in the September 2004 issue of Bio-IT World magazine in an intimate conversation on the mission of the BU Bioinformatics program, the milestones of his career, and how close we really are to finding a vaccine for AIDS.


Research tools developed by CAGT

VisANT is a web-based software framework for visualizing and analyzing many types of networks of biological interactions and associations. Networks are a useful computational tool for representing many types of biological data, such as biomolecular interactions, cellular pathways and functional modules. Given user-defined sets of interactions or groupings between genes or proteins, VisANT provides: (i) a visual interface for combining and annotating network data, (ii) supporting function and annotation data for different genomes from the Gene Ontology and KEGG databases and (iii) the statistical and analytical tools needed for extracting topological properties of the user-defined networks. Users can customize, modify, save and share network views, and can import basic network data representations from their own data sources and from standard exchange formats such as PSI-MI and BioPAX. The software framework also supports the development of more sophisticated visualization and analysis functions through its open API for Java-based plug-ins. VisANT is distributed freely via the web at http://visant.bu.edu and can also be downloaded for individual use.

  • Meta-Network: multi-scale visualization of bio-networks, ideal for networks of functional modules
  • Flexible visual schema of the network: customized node and edge annotation
  • Integrative data mining: 256k+ associations for 66 species, with name normalization for yeast, fly, Homo sapiens, etc.
  • Adjustable high performance: supports large networks; tested with 226k+ edges and nodes
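
To illustrate the kind of topological property that can be extracted from a user-defined network, here is a minimal Python sketch that computes node degrees and the degree distribution from a list of pairwise interactions. It is not VisANT code and does not use the VisANT plug-in API; the function names and the gene identifiers are hypothetical, and the network is assumed to be undirected.

```python
from collections import Counter, defaultdict


def node_degrees(edges):
    """Return {node: degree} for an undirected interaction list.

    Self-interactions and duplicate edges are ignored (an assumption
    made here for simplicity).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def degree_distribution(edges):
    """Return {degree: number of nodes with that degree}."""
    return dict(Counter(node_degrees(edges).values()))


# Hypothetical interaction list; the identifiers are placeholders.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
]

if __name__ == "__main__":
    print(node_degrees(edges))
    print(degree_distribution(edges))
```

Storing neighbors as sets keeps duplicate edges from inflating the degree, which matters when interaction data are merged from several sources.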


SeqVISTA, a graphical tool for sequence feature visualization and comparison, presents a holistic, graphical view of features annotated on nucleotide or protein sequences. This interactive tool highlights the residues in the sequence that correspond to features chosen by the user, and allows easy searching for sequence motifs or extraction of particular subsequences. SeqVISTA is able to display results from diverse sequence analysis tools in an integrated fashion, and aims to provide much-needed unity to the bioinformatics resources scattered around the Internet. The viewer may be launched on a GenBank record by a single click of a button installed in the web browser.
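
The motif search and subsequence extraction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SeqVISTA's implementation: the function names are hypothetical, motifs are assumed to be written as regular expressions, and positions are 0-based.

```python
import re


def find_motif(sequence, motif):
    """Return 0-based start positions of all motif matches.

    The motif is treated as a regular expression. A lookahead is used
    so that overlapping occurrences are reported as well.
    """
    pattern = re.compile("(?=(" + motif + "))", re.IGNORECASE)
    return [m.start() for m in pattern.finditer(sequence)]


def extract_subsequence(sequence, start, end):
    """Return the residues from start (inclusive) to end (exclusive)."""
    return sequence[start:end]


if __name__ == "__main__":
    seq = "GGTATAAAGGATATATGC"
    print(find_motif(seq, "TATA[AT]A"))   # a TATA-box-like pattern
    print(find_motif(seq, "ATAT"))        # overlapping matches
    print(extract_subsequence(seq, 2, 8))
```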

Integrated Platform for Regulatory Motif Detection: We argue for the importance of integrating multiple computational algorithms, and present an infrastructure that integrates eight web services covering key areas of transcriptional regulation. With an innovative architecture and extended functionality, especially pipeline management, the new infrastructure allows easy integration of gene regulation analysis software that is scattered over the Internet. It also enables bench biologists to perform a broad range of analyses using cutting-edge methods in a familiar environment, and lets bioinformatics researchers focus on developing new algorithms without investing substantial effort in complex pre- or post-processors.

Visual Data Mining: SeqVISTA presents a holistic, graphical view of features annotated on nucleotide or protein sequences. This interactive tool highlights the residues in the sequence that correspond to features chosen by the user, and allows easy searching for sequence motifs or extraction of particular subsequences. SeqVISTA is able to display results from diverse sequence analysis tools in an integrated fashion, and aims to provide much-needed unity to the bioinformatics resources scattered around the Internet.

Tandem Repeats Database (TRDB) is a public repository of information on tandem repeats in genomic DNA and contains a variety of tools for their analysis. These currently include the Tandem Repeats Finder algorithm, query and filtering capabilities for finding particular repeats of interest, repeat clustering algorithms based on sequence similarity, polymorphism prediction based on common patterns of mutation, PCR primer selection, and data download in a variety of formats. In addition, TRDB serves as a centralized research workbench. It provides storage space for analysis results and permits collaborators to privately share their data and analyses.

A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides. Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. To use the program, the user submits a sequence in FASTA format. There is no need to specify the pattern, the size of the pattern or any other parameter. The output consists of two files: a repeat table file and an alignment file. The repeat table contains information about each repeat, including its location, size, number of copies and nucleotide content. Clicking on the location indices for one of the table entries opens a second browser window that shows an alignment of the copies against a consensus pattern. The program is very fast, analyzing sequences on the order of 0.5 Mb in just a few seconds. Submitted sequences may be of arbitrary length. Repeats with pattern sizes in the range from 1 to 2000 bases are detected. Sequence information sent to the server is confidential and is deleted after program execution.
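
To make the definition concrete, the following Python sketch finds exact tandem repeats by brute force. It is deliberately simplified and is not the Tandem Repeats Finder algorithm: it only detects perfect copies, whereas the program described above also handles approximate copies. The function names and the `max_period` and `min_copies` parameters are assumptions made for this illustration.

```python
def is_primitive(pattern):
    """True if pattern is not itself a repetition of a shorter unit."""
    return (pattern + pattern).find(pattern, 1) == len(pattern)


def find_exact_tandem_repeats(seq, max_period=6, min_copies=2):
    """Return (start, period, copies, pattern) tuples, 0-based starts.

    Brute-force scan: for each period, compare each window with the
    windows that immediately follow it and count the exact copies.
    """
    seq = seq.upper()
    n = len(seq)
    results = []
    for period in range(1, max_period + 1):
        i = 0
        while i + period * min_copies <= n:
            pattern = seq[i:i + period]
            if not is_primitive(pattern):
                i += 1  # e.g. "CACA" is already covered by period 2
                continue
            copies = 1
            while seq[i + copies * period:i + (copies + 1) * period] == pattern:
                copies += 1
            if copies >= min_copies:
                results.append((i, period, copies, pattern))
                i += copies * period  # continue after the repeat
            else:
                i += 1
    return results


if __name__ == "__main__":
    for hit in find_exact_tandem_repeats("GGCACACACATT"):
        print(hit)
```

Skipping non-primitive patterns avoids reporting the same repeat several times under multiples of its true period.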

Protein Engineering