Using DrugBank for In Silico Drug Exploration and Discovery.
Authors: Wishart DS, Wu A Abstract DrugBank is a fully curated drug and drug target database that contains 8174 drug entries, including 1944 FDA-approved small-molecule drugs, 198 FDA-approved biotech (protein/peptide) drugs, 93 nutraceuticals, and over 6000 experimental drugs. Additionally, 4300 non-redundant protein (i.e., drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. DrugBank is primarily focused on providing both the query/search tools and biophysical data needed to facilitate drug discovery and drug development. This unit provides readers with a detailed descriptio...
Source: Current Protocols in Bioinformatics - June 21, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Comparative Protein Structure Modeling Using MODELLER.
Authors: Webb B, Sali A Abstract Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate ...
Source: Current Protocols in Bioinformatics - June 21, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Inference of Episodic Changes in Natural Selection Acting on Protein Coding Sequences via CODEML.
Authors: Bielawski JP, Baker JL, Mingrone J Abstract This unit provides protocols for using the CODEML program from the PAML package to make inferences about episodic natural selection in protein-coding sequences. The protocols cover inference tasks such as maximum likelihood estimation of selection intensity, testing the hypothesis of episodic positive selection, and identifying sites with a history of episodic evolution. We provide protocols for using the rich set of models implemented in CODEML to assess robustness, and for using bootstrapping to assess if the requirements for reliable statistical infer...
Source: Current Protocols in Bioinformatics - June 21, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

UniProt Tools.
Authors: Pundir S, Martin MJ, O'Donovan C, UniProt Consortium Abstract The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data (UniProt Consortium, 2015). The UniProt Web site receives ∼400,000 unique visitors per month and is the primary means to access UniProt. Along with various datasets that you can search, UniProt provides three main tools. These are the 'BLAST' tool for sequence similarity searching, the 'Align' tool for multiple sequence alignment, and the 'Retrieve/ID Mapping' tool for using a list of identifiers to retrieve UniProtKB proteins...
Source: Current Protocols in Bioinformatics - March 25, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using PSEA-Quant for Protein Set Enrichment Analysis of Quantitative Mass Spectrometry-Based Proteomics.
Authors: Lavallée-Adam M, Yates JR Abstract PSEA-Quant analyzes quantitative mass spectrometry-based proteomics datasets to identify enrichments of annotations contained in repositories such as the Gene Ontology and Molecular Signature databases. It allows users to identify the annotations that are significantly enriched for reproducibly quantified high abundance proteins. PSEA-Quant is available on the Web and as a command-line tool. It is compatible with all label-free and isotopic labeling-based quantitative proteomics methods. This protocol describes how to use PSEA-Quant and interpret its output. The...
Source: Current Protocols in Bioinformatics - March 25, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Introduction to Cheminformatics.
Authors: Wishart DS Abstract Cheminformatics is a field of information technology that focuses on the collection, storage, analysis, and manipulation of chemical data. The chemical data of interest typically includes information on small molecule formulas, structures, properties, spectra, and activities (biological or industrial). Cheminformatics originally emerged as a vehicle to help the drug discovery and development process; however, it now plays an increasingly important role in many areas of biology, chemistry, and biochemistry. The intent of this unit is to give readers some introduction...
Source: Current Protocols in Bioinformatics - March 25, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

MetaboLights: An Open-Access Database Repository for Metabolomics Data.
Authors: Kale NS, Haug K, Conesa P, Jayseelan K, Moreno P, Rocca-Serra P, Nainala VC, Spicer RA, Williams M, Li X, Salek RM, Griffin JL, Steinbeck C Abstract MetaboLights is the first general purpose, open-access database repository for cross-platform and cross-species metabolomics research at the European Bioinformatics Institute (EMBL-EBI). Based upon the open-source ISA framework, MetaboLights provides Metabolomics Standard Initiative (MSI) compliant metadata and raw experimental data associated with metabolomics experiments. Users can upload their study datasets into the MetaboLights Repository. These ...
Source: Current Protocols in Bioinformatics - March 25, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Finding Protein and Nucleotide Similarities with FASTA.
Authors: Pearson WR Abstract The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity...
Source: Current Protocols in Bioinformatics - March 25, 2016 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

cgpPindel: Identifying Somatically Acquired Insertion and Deletion Events from Paired End Sequencing.
Authors: Raine KM, Hinton J, Butler AP, Teague JW, Davies H, Tarpey P, Nik-Zainal S, Campbell PJ Abstract cgpPindel is a modified version of Pindel that is optimized for detecting somatic insertions and deletions (indels) in cancer genomes and other samples compared to a reference control. Post-hoc filters remove false positive calls, resulting in a high-quality dataset for downstream analysis. This unit provides concise instructions for both a simple 'one-shot' execution of cgpPindel and a more detailed approach suitable for large-scale compute farms. © 2015 by John Wiley & Sons, Inc. PMID: ...
Source: Current Protocols in Bioinformatics - December 20, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

VAGrENT: Variation Annotation Generator.
Authors: Menzies A, Teague JW, Butler AP, Davies H, Tarpey P, Nik-Zainal S, Campbell PJ Abstract VAGrENT is a tool that provides biological context and effect prediction for genomic sequence variants. It annotates single base substitutions and small insertions and deletions by comparing them to reference information within or close to genes or other transcribed elements. This information provides the critical insight required to inform the biological or clinical significance of variant data generated from sequencing studies. The software has been optimized to run efficiently against the large numbers and d...
Source: Current Protocols in Bioinformatics - December 20, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

iRegulon and i-cisTarget: Reconstructing Regulatory Networks Using Motif and Track Enrichment.
Authors: Verfaillie A, Imrichova H, Janky R, Aerts S Abstract Gene expression profiling is often used to identify genes that are co-expressed in a biological process or disease. Downstream analyses of co-expressed gene sets using bioinformatics methods can reveal candidate transcription factors (TF) that co-regulate these genes, based on the presence of shared TF binding sites. Drawing gene regulatory networks that connect TFs to their predicted target genes can uncover gene modules that implement a particular function. Here, we describe several protocols to analyze any set of co-expressed genes using iReg...
Source: Current Protocols in Bioinformatics - December 20, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

An Introduction to Genome Annotation.
Authors: Campbell MS, Yandell M Abstract Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation. © 2015 by John Wiley & Sons, Inc. PMID: 26678385 [PubMed - in process]
Source: Current Protocols in Bioinformatics - December 20, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Protein Structure and Function Prediction Using I-TASSER.
Authors: Yang J, Zhang Y Abstract I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure...
Source: Current Protocols in Bioinformatics - December 20, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Mapping RNA-seq Reads with STAR.
Authors: Dobin A, Gingeras TR Abstract Mapping of large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high levels of accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrangements, such as chimeric and circular RNA. STAR can align spliced sequences of any length with moderate error rates, providing scalability for emerging sequencing technologies. STAR generates output files that can be used fo...
Source: Current Protocols in Bioinformatics - September 6, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research

Using PepExplorer to Filter and Organize De Novo Peptide Sequencing Results.
Authors: da Veiga Leprevost F, Barbosa VC, Carvalho PC Abstract PepExplorer aids in the biological interpretation of de novo sequencing results; this is accomplished by assembling a list of homologous proteins obtained by aligning results from widely adopted de novo sequencing tools against a target-decoy sequence database. Our tool relies on pattern recognition to ensure that the results satisfy a user-given false-discovery rate (FDR). For this, it employs a radial basis function neural network that considers precursor charge states, de novo sequencing scores, peptide lengths, and alignment scores. P...
Source: Current Protocols in Bioinformatics - September 6, 2015 Category: Bioinformatics Tags: Curr Protoc Bioinformatics Source Type: research