Using the Generic Synteny Browser (GBrowse_syn).
Authors: McKay SJ, Vergara IA, Stajich JE Abstract Genome browsers are software tools that allow the user to view genome annotations in the context of a reference sequence, such as a chromosome, contig, or scaffold. The Generic Genome Browser (GBrowse) is an open-source genome browser package developed as part of the Generic Model Database Project (see UNIT ; Stein et al., 2002). The increasing number of sequenced genomes has led to a corresponding growth in the field of comparative genomics, which requires methods to view and compare multiple genomes. Using the same software framework as GBrowse, the Generic ...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Searching NCBI's dbSNP database.
Authors: Bhagwat M Abstract The Single-Nucleotide Polymorphism database (dbSNP) is a variation database at the National Center for Biotechnology Information (NCBI). It is a public repository of submitted nucleotide variations and is part of NCBI's search and retrieval system Entrez. This unit describes two basic protocols to search dbSNP effectively, one to perform a text-based search and another to perform a sequence-based search. The unit also describes one of the result display formats called GeneView to obtain information about all submitted SNPs in a particular gene. PMID: 21154707 [PubMed - i...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

RNA-Seq read alignments with PALMapper.
Authors: Jean G, Kahles A, Sreedharan VT, De Bona F, Rätsch G Abstract Next-generation sequencing technologies have revolutionized genome and transcriptome sequencing. RNA-Seq experiments are able to generate huge amounts of transcriptome sequence reads at a fraction of the cost of Sanger sequencing. Reads produced by these technologies are relatively short and error prone. To utilize such reads for transcriptome reconstruction and gene-structure identification, one needs to be able to accurately align the sequence reads over intron boundaries. In this unit, we describe PALMapper, a fast and easy-to-use t...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Aligning short sequencing reads with Bowtie.
Authors: Langmead B Abstract This unit shows how to use the Bowtie package to align short sequencing reads, such as those output by second-generation sequencing instruments. It also includes protocols for building a genome index and calling consensus sequences from Bowtie alignments using SAMtools. PMID: 21154709 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Setting up the JBrowse genome browser.
Authors: Skinner ME, Holmes IH Abstract JBrowse is a Web-based tool for visualizing genomic data. Unlike most other Web-based genome browsers, JBrowse exploits the capabilities of the user's Web browser to make scrolling and zooming fast and smooth. It supports the browsers used by almost all Internet users, and is relatively simple to install. JBrowse can utilize multiple types of data in a variety of common genomic data formats, including genomic feature data in BioPerl databases, GFF files, BED files, and quantitative data in wiggle files. This unit describes how to obtain the JBrowse software, set it up...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Next generation sequence assembly with AMOS.
Authors: Treangen TJ, Sommer DD, Angly FE, Koren S, Pop M Abstract A Modular Open-Source Assembler (AMOS) was designed to offer a modular approach to genome assembly. AMOS includes a wide range of tools for assembly, including the lightweight de novo assemblers Minimus and Minimo, and Bambus 2, a robust scaffolder able to handle metagenomic and polymorphic data. This protocol describes how to configure and use AMOS for the assembly of Next Generation sequence data. Additionally, we provide three tutorial examples that include bacterial, viral, and metagenomic datasets with specific tips for improving assem...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using CisGenome to analyze ChIP-chip and ChIP-seq data.
Authors: Ji H, Jiang H, Ma W, Wong WH Abstract Chromatin immunoprecipitation (ChIP) coupled with genome tiling array hybridization (ChIP-chip) and ChIP followed by massively parallel sequencing (ChIP-seq) are high-throughput approaches to profiling genome-wide protein-DNA interactions. Both technologies are increasingly used to study transcription-factor binding sites and chromatin modifications. CisGenome is an integrated software system for analyzing ChIP-chip and ChIP-seq data. This unit describes basic functions of CisGenome and how to use them to find genomic regions with protein-DNA interactions, vis...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Inferring protein function from homology using the Princeton Protein Orthology Database (P-POD).
Authors: Livstone MS, Oughtred R, Heinicke S, Vernot B, Huttenhower C, Durand D, Dolinski K Abstract Inferring a protein's function by homology is a powerful tool for biologists. The Princeton Protein Orthology Database (P-POD) offers a simple way to visualize and analyze the relationships between homologous proteins in order to infer function. P-POD contains computationally generated analysis distinguishing orthologs from paralogs combined with curated published information on functional complementation and on human diseases. P-POD also features an applet, Notung, for users to explore and modify phylogene...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Administering GBrowse sites with WebGBrowse.
Authors: Podicheti R, Dong Q Abstract GBrowse is widely used by biologists to visualize genome annotation. We have developed WebGBrowse to facilitate hosting multiple GBrowse instances specific for different users on the same Web server. WebGBrowse automatically sets up each user-specific GBrowse instance by extracting information from user-supplied data to produce the appropriate configuration. This unit describes installation and administration of WebGBrowse for bioinformaticians who plan to manage local WebGBrowse servers in their institutions. PMID: 21400697 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

The importance of biological databases in biological discovery.
Authors: Baxevanis AD Abstract Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, the Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource are also covered. Non-sequence-centric databases, s...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Searching NCBI databases using Entrez.
Authors: Gibney G, Baxevanis AD Abstract One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two basic protocols describe simp...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Metabolomic data processing, analysis, and interpretation using MetaboAnalyst.
Authors: Xia J, Wishart DS Abstract MetaboAnalyst is a comprehensive, Web-based tool designed for processing, analyzing, and interpreting metabolomic data. It handles most of the common metabolomic data types including compound concentration lists, spectral bin lists, peak lists, and raw MS spectra. In addition to providing a variety of data processing and normalization procedures, MetaboAnalyst supports a number of data-analysis tasks using a range of univariate, multivariate, and machine-learning methods. MetaboAnalyst also offers two newly developed approaches, Metabolite Set Enrichment Analysis (MSEA) a...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

An introduction to recognizing functional domains.
Authors: Stormo GD Abstract This unit provides an overview of issues involved in domain recognition in protein and DNA sequences. It opens with a discussion of the two primary methods of domain representation, namely consensus sequences and alignment matrices (e.g., the log-odds matrix). The unit continues with a brief overview of some of the resources available for identifying functional domains in nucleotide sequences (e.g., transcription factor binding sites). In addition, it reviews databases such as Pfam and InterPro, which are available for protein analysis. PMID: 21633944 [PubMed - indexed f...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research
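
The two domain representations named in the abstract above, consensus sequences and log-odds alignment matrices, can be illustrated with a minimal Python sketch. It is not taken from the unit itself: the aligned sites, the uniform background frequencies, the pseudocount of 1, and the function names are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def log_odds_matrix(sites, background=None, pseudocount=1.0):
    """Build a log-odds (base 2) matrix from equal-length aligned sites.

    Each column maps a base to log2(observed frequency / background frequency).
    The uniform background and the pseudocount are illustrative assumptions.
    """
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        column = {
            base: math.log2(((counts[base] + pseudocount) / total) / background[base])
            for base in ALPHABET
        }
        matrix.append(column)
    return matrix


def consensus(matrix):
    """Return the highest-scoring base at each position."""
    return "".join(max(column, key=column.get) for column in matrix)


def score(matrix, seq):
    """Sum the per-position log-odds scores of a candidate sequence."""
    return sum(column[base] for column, base in zip(matrix, seq))


# Hypothetical aligned binding sites, used only to exercise the functions.
sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
matrix = log_odds_matrix(sites)
print(consensus(matrix))
print(score(matrix, "TATAAT") > score(matrix, "GGGGGG"))
```

The consensus string keeps only the best base per position, whereas the matrix retains how strongly each alternative base is favored or disfavored, so a candidate site can be scored quantitatively.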

Using MACS to identify peaks from ChIP-Seq data.
Authors: Feng J, Liu T, Zhang Y Abstract Model-based Analysis of ChIP-Seq (MACS) is a command-line tool designed by X. Shirley Liu and colleagues to analyze data generated by ChIP-Seq experiments in eukaryotes, especially mammals. MACS can be used to identify transcription factor binding sites and histone modification-enriched regions if the ChIP-Seq data, with or without control samples, are given. This unit describes two basic protocols that provide detailed information on how to use MACS to identify either the binding sites of a transcription factor or the enriched regions of a histone modification with...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Glossary of bioinformatics terms.
Authors: PMID: 21901738 [PubMed - indexed for MEDLINE]
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research