De novo interpretation of tandem mass spectra.
Authors: Ma B, Lajoie G Abstract De novo sequencing is an effective method for identifying unknown peptide sequences from their tandem mass spectra. This unit briefly introduces how this can be done manually. A protocol for using the PEAKS online software for automated de novo sequencing is described. Finally, we show how to use the PEAKS scores to validate the de novo sequencing results. PMID: 19274631 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research
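
The manual approach mentioned in the abstract above is usually based on reading the mass differences between neighbouring fragment peaks, since each difference should match the mass of one amino acid residue. The following is a minimal sketch of that idea only, not the PEAKS algorithm. It assumes singly charged ions of a single series with no missing peaks; the function name `read_ladder`, the tolerance, and the example peak list are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values). Leucine and
# isoleucine have the same mass, so they are reported together as "L/I".
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tol=0.01):
    """Label each gap between consecutive peaks with the matching residue.

    peaks: m/z values of one singly charged ion series (assumption).
    Returns one label per gap; "?" means no residue, or more than one,
    fits within the tolerance.
    """
    ordered = sorted(peaks)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        gap = high - low
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(gap - mass) <= tol]
        calls.append(matches[0] if len(matches) == 1 else "?")
    return calls


# Hypothetical ladder built from an arbitrary starting peak.
ladder = [100.0]
for aa in ("A", "K", "F"):
    ladder.append(ladder[-1] + RESIDUE_MASS[aa])
print(read_ladder(ladder))
```

With a tolerance of 0.01 Da the sketch can tell K from Q, which differ by about 0.036 Da; a wider tolerance would return "?" for that gap.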

Exploring human metabolites using the human metabolome database.
Authors: Forsythe IJ, Wishart DS Abstract The Human Metabolome Database (HMDB) is a Web-based bioinformatic/cheminformatic resource with detailed information about human metabolites and metabolic enzymes. It can be used for fields of study including metabolomics, biochemistry, clinical chemistry, biomarker discovery, medicine, nutrition, and general education. In addition to its comprehensive literature-derived data, the HMDB contains an extensive collection of experimental metabolite concentration data for plasma, urine, CSF, and/or other biofluids. The HMDB is fully searchable, with many tools for viewing...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Finding homologs in amino acid sequences using network BLAST searches.
Authors: Ladunga I PMID: 19274633 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using RepeatMasker to identify repetitive elements in genomic sequences.
Authors: Tarailo-Graovac M, Chen N Abstract RepeatMasker is a popular software tool widely used in computational genomics to identify, classify, and mask repetitive elements, including low-complexity sequences and interspersed repeats. RepeatMasker searches for repetitive sequence by aligning the input genome sequence against a library of known repeats, such as Repbase. Here, we describe two Basic Protocols that provide detailed guidelines on how to use RepeatMasker, either via the Web interface or command-line Unix/Linux system, to analyze repetitive elements in genomic sequences. Sequence comparisons in ...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research
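
Once repeats have been located, masking itself is a simple string operation. The sketch below illustrates the two common conventions, hard masking (bases replaced by N) and soft masking (bases lowercased). It is not RepeatMasker code: the function name `mask_repeats` and the 0-based, half-open interval format are assumptions made for this example, and the intervals would in practice come from the alignment against the repeat library.

```python
def mask_repeats(seq, intervals, soft=False):
    """Mask the given intervals of a nucleotide sequence.

    seq: nucleotide string (uppercased before masking).
    intervals: iterable of (start, end) pairs, 0-based and half-open
        (an assumption of this sketch).
    soft: if True, lowercase the masked bases; otherwise replace with 'N'.
    """
    chars = list(seq.upper())
    for start, end in intervals:
        # Clamp to the sequence bounds so stray coordinates cannot fail.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower() if soft else "N"
    return "".join(chars)


print(mask_repeats("ACGTACGTAC", [(2, 5)]))             # ACNNNCGTAC
print(mask_repeats("ACGTACGTAC", [(2, 5)], soft=True))  # ACgtaCGTAC
```

Soft masking keeps the original bases recoverable, which is why downstream tools often prefer it.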

Obtaining comparative genomic data with the VISTA family of computational tools.
Authors: Ratnere I, Dubchak I Abstract Comparison of DNA sequences from different species is a fundamental method for identifying functional elements, such as exons or enhancers, as they tend to exhibit significant sequence similarity due to purifying selection. Availability of whole-genome sequences for a constantly growing number of organisms makes identification of such elements within these genomes possible. There are two distinct phases in comparisons of genomic sequences: in the first, the sequences are aligned, and in the second, the resulting alignments are analyzed to find conservation signals tha...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

RNA secondary structure analysis using the Vienna RNA package.
Authors: Hofacker IL Abstract This unit documents how to use the Vienna RNA package for RNA secondary structure analysis. Possible tasks include structure prediction for single sequences, prediction of consensus structures, prediction of RNA-RNA interactions, and sequence design. PMID: 19496057 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

RNA secondary structure analysis using the RNAshapes package.
Authors: Reeder J, Giegerich R Abstract This unit shows how to use the RNAshapes package for the prediction of the secondary structure of a single RNA sequence using either minimum free energy methods or weighted ensemble information. It also includes a protocol for the consensus prediction of a set of related sequences. PMID: 19496058 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

ChEBI: an open bioinformatics and cheminformatics resource.
Authors: Degtyarenko K, Hastings J, de Matos P, Ennis M Abstract Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on "small" chemical compounds. This unit provides a detailed guide to browsing, searching, downloading, and programmatic access to the ChEBI database. PMID: 19496059 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Finding similar nucleotide sequences using network BLAST searches.
Authors: Ladunga I Abstract The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We inte...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

The importance of biological databases in biological discovery.
Authors: Baxevanis AD Abstract Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, the Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource are also covered. Non-sequence-centric databases su...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Searching Online Mendelian Inheritance in Man (OMIM) for information on genetic loci involved in human disease.
Authors: Borate B, Baxevanis AD Abstract Online Mendelian Inheritance in Man (OMIM) is a comprehensive compendium of information on human genes and genetic disorders, with a particular emphasis on the interplay between observed phenotypes and underlying genotypes. This unit focuses on the basic methodology for formulating OMIM searches and illustrates the types of information that can be retrieved from OMIM, including descriptions of clinical manifestations resulting from genetic abnormalities. This unit also provides information on additional relevant medical and molecular biology databases. A basic knowl...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Extracting biological meaning from large gene lists with DAVID.
Authors: Huang da W, Sherman BT, Zheng X, Yang J, Imamichi T, Stephens R, Lempicki RA Abstract High-throughput genomics screening studies, such as microarray, proteomics, etc., often result in large, "interesting" gene lists, ranging in size from hundreds to thousands of genes. Given the challenges of functionally interpreting such large gene lists, it is necessary to incorporate bioinformatics tools in the analysis. DAVID is a Web-based application that provides a high-throughput and integrative gene functional annotation environment to systematically extract biological themes behind large gene lists. Hig...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

An introduction to sequence similarity ("homology") searching.
Authors: Stormo GD Abstract Homologous sequences usually have the same, or very similar, functions, so new sequences can be reliably assigned functions if homologous sequences with known functions can be identified. Homology is inferred based on sequence similarity, and many methods have been developed to identify sequences that have statistically significant similarity. This unit provides an overview of some of the basic issues in identifying similarity among sequences and points out other units in this chapter that describe specific programs that are useful for this task. PMID: 19728288 [PubMed -...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using OrthoCluster for the detection of synteny blocks among multiple genomes.
Authors: Vergara IA, Chen N Abstract Synteny blocks are composed of two or more orthologous genes conserved among species, resulting from speciation from their last common ancestor. OrthoCluster (Zeng et al., 2008) is a fast and easy-to-use program for the identification of synteny blocks among multiple genomes. It allows users to identify synteny blocks that contain different types of mismatches, and to decide whether they require conservation of gene orientation and conservation of gene order within the blocks. OrthoCluster can also be used to find duplicated blocks within genomes. Although genes and the...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research
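
The idea of a synteny block as a run of orthologous genes kept in the same order can be illustrated with a much simpler case than OrthoCluster handles. The sketch below is not the OrthoCluster algorithm: it compares only two genomes, allows no mismatches, ignores strand, and assumes one-to-one orthologs. The function name `collinear_blocks` and the gene identifiers are hypothetical.

```python
def collinear_blocks(genome_a, genome_b, orthologs, min_size=2):
    """Find runs of adjacent genes in A whose orthologs are adjacent in B.

    genome_a, genome_b: gene identifiers in chromosomal order.
    orthologs: dict mapping a gene in A to its single ortholog in B.
    A run may be in the same order in B (step +1) or inverted (step -1).
    Returns a list of blocks, each a list of genes from genome_a.
    """
    pos_b = {gene: i for i, gene in enumerate(genome_b)}
    blocks = []
    current = []  # (gene in A, index of its ortholog in B)
    step = 0      # 0 = undetermined, +1 = same order, -1 = inverted

    def flush():
        # Keep the run only if it has enough genes to count as a block.
        if len(current) >= min_size:
            blocks.append([gene for gene, _ in current])

    for gene in genome_a:
        idx = pos_b.get(orthologs.get(gene))
        if idx is not None and current:
            delta = idx - current[-1][1]
            if delta in (1, -1) and step in (0, delta):
                current.append((gene, idx))
                step = delta
                continue
        # The run is broken: store it and start again from this gene.
        flush()
        current = [(gene, idx)] if idx is not None else []
        step = 0
    flush()
    return blocks


genome_a = ["a1", "a2", "a3", "a4", "a5", "a6"]
genome_b = ["b1", "b2", "b3", "bx", "b6", "b5"]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3", "a5": "b5", "a6": "b6"}
print(collinear_blocks(genome_a, genome_b, orthologs))
# [['a1', 'a2', 'a3'], ['a5', 'a6']]
```

In this example the second block is inverted in genome B, and the gene without an ortholog (`a4`) ends the first block because the sketch tolerates no mismatches.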

Analyzing gene expression data from microarray and next-generation DNA sequencing transcriptome profiling assays using GeneSifter Analysis Edition.
Authors: Porter S, Olson NE, Smith T Abstract Transcription profiling with microarrays has become a standard procedure for comparing the levels of gene expression between pairs of samples, or multiple samples following different experimental treatments. New technologies, collectively known as next-generation DNA sequencing methods, are also starting to be used for transcriptome analysis. These technologies, with their low background, large capacity for data collection, and dynamic range, provide a powerful and complementary tool to the assays that formerly relied on microarrays. In this chapter, we describ...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research