LC-MS data processing with MAVEN: a metabolomic analysis and visualization engine.
Authors: Clasquin MF, Melamud E, Rabinowitz JD
Abstract: MAVEN is an open-source software program for interactive processing of LC-MS-based metabolomics data. MAVEN enables rapid and reliable metabolite quantitation from multiple reaction monitoring data or high-resolution full-scan mass spectrometry data. It automatically detects and reports peak intensities for isotope-labeled metabolites. Menu-driven, click-based navigation allows visualization of raw and analyzed data. Here we provide a User Guide for MAVEN. Step-by-step instructions are provided for data import, peak alignment across samples, identific...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using the KEGG database resource.
Authors: Tanabe M, Kanehisa M
Abstract: KEGG (Kyoto Encyclopedia of Genes and Genomes) is a bioinformatics resource for understanding the functions and utilities of cells and organisms from both high-level and genomic perspectives. It is a self-sufficient, integrated resource consisting of genomic, chemical, and network information, with cross-references to numerous outside databases. The genomic and chemical information is a complete set of building blocks (genes and molecules) and the network information includes molecular wiring diagrams (interaction/reaction networks) and hierarchical classifications (r...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using Galaxy to perform large-scale interactive data analyses.
Authors: Hillman-Jackson J, Clements D, Blankenberg D, Taylor J, Nekrutenko A, Galaxy Team
Abstract: Innovations in biomedical research technologies continue to provide experimental biologists with novel and increasingly large genomic and high-throughput data resources to be analyzed. As creating and obtaining data has become easier, the key decision faced by many researchers is a practical one: where and how should an analysis be performed? Datasets are large, and analysis tool set-up and use are riddled with complexities outside the scope of core research activities. The authors believe that Galaxy provi...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using cloud computing infrastructure with CloudBioLinux, CloudMan, and Galaxy.
Authors: Afgan E, Chapman B, Jadan M, Franke V, Taylor J
Abstract: Cloud computing has revolutionized availability and access to computing and storage resources, making it possible to provision a large computational infrastructure with only a few clicks in a Web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this unit, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatic analyses, with fully automated management of the underlying cloud infrastructure. By ...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using the Reactome database.
Authors: Haw R, Stein L
Abstract: There is considerable interest in the bioinformatics community in creating pathway databases. The Reactome project (a collaboration between the Ontario Institute for Cancer Research, Cold Spring Harbor Laboratory, New York University Medical Center, and the European Bioinformatics Institute) is one such pathway database and collects structured information on all the biological pathways and processes in the human. It is an expert-authored and peer-reviewed, curated collection of well-documented molecular reactions that span the gamut from simple intermediate metabolism to si...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

The Human Gene Mutation Database (HGMD) and its exploitation in the fields of personalized genomics and molecular evolution.
Authors: Stenson PD, Ball EV, Mort M, Phillips AD, Shaw K, Cooper DN
Abstract: The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease (http://www.hgmd.org). Data cataloged include single-base-pair substitutions in coding, regulatory, and splicing-relevant regions, micro-deletions and micro-insertions, indels, and triplet repeat expansions, as well as gross gene deletions, insertions, duplications, and complex rearrangements. Each mutation is entered into HGMD only once, in order...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Comparative ncRNA gene and structure prediction using Foldalign and FoldalignM.
Authors: Havgaard J, Kaur S, Gorodkin J
Abstract: This unit describes how to use Foldalign and FoldalignM to make structural alignments of non-protein-coding RNA (ncRNA). These tools can be used to find new ncRNAs, to find the structure of novel ncRNAs, and to improve alignments for known ncRNAs. PMID: 22948726
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Some phenotype association tools in Galaxy: looking for disease SNPs in a full genome.
Authors: Giardine BM, Riemer C, Burhans R, Ratan A, Miller W
Abstract: This unit focuses on some of the tools available on the public Galaxy server that are useful for exploring possible associations between human genetic variants and phenotypes. We trace step-by-step through an example illustrating several methods for examining a single full-coverage genome to look for single-nucleotide polymorphisms (SNPs) that are either known to be associated with disease or suspected to have impact for other reasons. It makes use of public genomic data, tools designed specifically for working with variants, and also so...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Genotyping in the cloud with Crossbow.
Authors: Gurtowski J, Schatz MC, Langmead B
Abstract: Crossbow is a scalable, portable, and automatic cloud computing tool for identifying SNPs from high-coverage, short-read resequencing data. It is built on Apache Hadoop, an implementation of the MapReduce software framework. Hadoop allows Crossbow to distribute read alignment and SNP calling subtasks over a cluster of commodity computers. Two robust tools, Bowtie and SOAPsnp, implement the fundamental alignment and variant calling operations respectively, and have demonstrated capabilities within Crossbow of analyzing approximately one billion short read...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Analyzing protein-protein interactions from affinity purification-mass spectrometry data with SAINT.
Authors: Choi H, Liu G, Mellacheruvu D, Tyers M, Gingras AC, Nesvizhskii AI
Abstract: Significance Analysis of INTeractome (SAINT) is a software package for scoring protein-protein interactions based on label-free quantitative proteomics data (e.g., spectral count or intensity) in affinity purification-mass spectrometry (AP-MS) experiments. SAINT allows bench scientists to select bona fide interactions and remove nonspecific interactions in an unbiased manner. However, there is no 'one-size-fits-all' statistical model for every dataset, since the experimental design varies across studies. Key variables incl...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using ProHits to store, annotate, and analyze affinity purification-mass spectrometry (AP-MS) data.
Authors: Liu G, Zhang J, Choi H, Lambert JP, Srikumar T, Larsen B, Nesvizhskii AI, Raught B, Tyers M, Gingras AC
Abstract: Affinity purification coupled with mass spectrometry (AP-MS) is a robust technique used to identify protein-protein interactions. With recent improvements in sample preparation, and dramatic advances in MS instrumentation speed and sensitivity, this technique is becoming more widely used throughout the scientific community. To meet the needs of research groups both large and small, we have developed software solutions for tracking, scoring and analyzing AP-MS data. Here, we provide deta...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Exploring genetic, genomic, and phenotypic data at the Rat Genome Database.
We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat. PMID: 23255149
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

The UCSC Genome Browser.
Authors: Karolchik D, Hinrichs AS, Kent WJ
Abstract: The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using the Wash U Epigenome Browser to examine genome-wide sequencing data.
Authors: Zhou X, Wang T
Abstract: This unit describes the Wash U Epigenome Browser, a next-generation genomic data visualization system. The Browser currently hosts ENCODE and Roadmap Epigenomics data for human and model organisms. The Browser displays many sequencing-based data sets across all or part of the genome, on specific gene sets or pathways, and in the context of their metadata. Investigators can order, filter, aggregate, classify, and display data interactively based on given feature sets including metadata features, annotated biological pathways, and user-defined collections of genes or genomic ...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

PatternLab: from mass spectra to label-free differential shotgun proteomics.
Authors: Carvalho PC, Fischer JS, Xu T, Yates JR, Barbosa VC
Abstract: PatternLab for proteomics is a self-contained computational environment for analyzing shotgun proteomic data. Recent improvements incorporate modules to facilitate the computational analysis, such as FastaDBXtractor for sequence database preparation and ProLuCID runner for simplifying and managing the protein identification search engine; modules for pushing the limits on proteomics standards, such as SEPro, which relies on a semi-labeled decoy approach for increasing confidence in filtering and organizing peptide spectrum matches; and m...
Source: Current Protocols in Bioinformatics - November 12, 2014 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research