Unbiased probabilistic taxonomic classification for DNA barcoding
We present a probabilistic method for taxonomical classification (PROTAX) of DNA sequences. Given a pre-defined taxonomical tree structure that is partially populated by reference sequences, PROTAX distributes a total probability of one across the set of all possible outcomes. PROTAX accounts for species that are present in the taxonomy but lack reference sequences, the possibility of unknown taxonomical units, and mislabeled reference sequences. PROTAX is based on a statistical multinomial regression model, and it can utilize any kind of sequence similarity measure or the outputs of other classifiers as predict...
Source: Bioinformatics - September 26, 2016 Category: Bioinformatics Authors: Somervuo, P., Koskela, S., Pennanen, J., Henrik Nilsson, R., Ovaskainen, O. Tags: SEQUENCE ANALYSIS Source Type: research
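The idea of splitting a probability of one over every outcome in a taxonomy can be illustrated with a small sketch. This is not PROTAX's implementation: the tree layout, the node names, the scores, and the function names `softmax` and `outcome_probabilities` are all assumptions made for illustration. The only element taken from the abstract is that a multinomial (softmax-style) model turns predictor scores into probabilities at each level, with room for "unknown" branches.

```python
import math


def softmax(scores):
    """Turn real-valued scores into probabilities that sum to one."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - top) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def outcome_probabilities(tree, scores, root="root"):
    """Split a probability of one over all leaf outcomes of a taxonomy.

    tree   : dict mapping a node to its list of children (hypothetical layout)
    scores : dict mapping each non-root node to a linear-predictor score
    """
    result = {}
    stack = [(root, 1.0)]
    while stack:
        node, mass = stack.pop()
        children = tree.get(node, [])
        if not children:
            result[node] = mass  # leaf: a final classification outcome
            continue
        # Divide the mass that reached this node among its children.
        for child, share in softmax({c: scores[c] for c in children}).items():
            stack.append((child, mass * share))
    return result


# Toy taxonomy with explicit "unknown" branches; all names and scores are made up.
toy_tree = {
    "root": ["GenusA", "unknown_genus"],
    "GenusA": ["species_1", "species_2", "unknown_species"],
}
toy_scores = {
    "GenusA": 2.0, "unknown_genus": 0.0,
    "species_1": 1.5, "species_2": 0.5, "unknown_species": -1.0,
}
probabilities = outcome_probabilities(toy_tree, toy_scores)
```

Because each node passes all of its probability mass on to its children, the leaf probabilities sum to one by construction, including the mass assigned to the "unknown" outcomes.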

AgIn: measuring the landscape of CpG methylation of individual repetitive elements
Motivation: Determining the methylation state of regions with high copy numbers is challenging for second-generation sequencing, because the read length is insufficient to map reads uniquely, especially when repetitive regions are long and nearly identical to each other. Single-molecule real-time (SMRT) sequencing is a promising method for observing such regions, because it is not vulnerable to GC bias, it produces long read lengths, and its kinetic information is sensitive to DNA modifications. Results: We propose a novel linear-time algorithm that combines the kinetic information for neighboring CpG sites and increases t...
Source: Bioinformatics - September 26, 2016 Category: Bioinformatics Authors: Suzuki, Y., Korlach, J., Turner, S. W., Tsukahara, T., Taniguchi, J., Qu, W., Ichikawa, K., Yoshimura, J., Yurino, H., Takahashi, Y., Mitsui, J., Ishiura, H., Tsuji, S., Takeda, H., Morishita, S. Tags: GENOME ANALYSIS Source Type: research

A time-varying group sparse additive model for genome-wide association studies of dynamic complex traits
Motivation: Despite the widespread popularity of genome-wide association studies (GWAS) for genetic mapping of complex traits, most existing GWAS methodologies are still limited to the use of static phenotypes measured at a single time point. In this work, we propose a new method for association mapping that considers dynamic phenotypes measured at a sequence of time points. Our approach relies on the use of Time-Varying Group Sparse Additive Models (TV-GroupSpAM) for high-dimensional, functional regression. Results: This new model detects a sparse set of genomic loci that are associated with trait dynamics, and demonstrat...
Source: Bioinformatics - September 26, 2016 Category: Bioinformatics Authors: Marchetti-Bowick, M., Yin, J., Howrylak, J. A., Xing, E. P. Tags: GENOME ANALYSIS Source Type: research

Zerone: a ChIP-seq discretizer for multiple replicates with built-in quality control
Motivation: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard method to investigate chromatin protein composition. As the number of community-available ChIP-seq profiles increases, it becomes more common to use data from different sources, which makes joint analysis challenging. Issues such as lack of reproducibility, heterogeneous quality and conflicts between replicates become evident when comparing datasets, especially when they are produced by different laboratories. Results: Here, we present Zerone, a ChIP-seq discretizer with built-in quality control. Zerone is powered b...
Source: Bioinformatics - September 26, 2016 Category: Bioinformatics Authors: Cusco, P., Filion, G. J. Tags: GENOME ANALYSIS Source Type: research

Evaluating the molecule-based prediction of clinical drug responses in cancer
This study provides a starting point to objectively evaluate the molecule-based predictions of clinical drug responses. Contact: jgu@tsinghua.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
Source: Bioinformatics - September 26, 2016 Category: Bioinformatics Authors: Ding, Z., Zu, S., Gu, J. Tags: SYSTEMS BIOLOGY Source Type: research

Functional classification of CATH superfamilies: a domain-based approach for protein function annotation
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Das, S., Lee, D., Sillitoe, I., Dawson, N. L., Lees, J. G., Orengo, C. A. Tags: CORRIGENDUM Source Type: research

HiPub: translating PubMed and PMC texts to networks for knowledge discovery
Summary: We introduce HiPub, a seamless Chrome browser plug-in that automatically recognizes, annotates and translates biomedical entities from texts into networks for knowledge discovery. Using a combination of two different named-entity recognition resources, HiPub can recognize genes, proteins, diseases, drugs, mutations and cell lines in texts, and achieve high precision and recall. HiPub extracts biomedical entity-relationships from texts to construct context-specific networks, and integrates existing network data from external databases for knowledge discovery. It allows users to add additional entities from related ...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Lee, K., Shin, W., Kim, B., Lee, S., Choi, Y., Kim, S., Jeon, M., Tan, A. C., Kang, J. Tags: DATA AND TEXT MINING Source Type: research

SETH detects and normalizes genetic variants in text
Summary: Descriptions of genetic variations and their effect are widely spread across the biomedical literature. However, finding all mentions of a specific variation, or all mentions of variations in a specific gene, is difficult to achieve due to the many ways such variations are described. Here, we describe SETH, a tool for the recognition of variations from text and their subsequent normalization to dbSNP or UniProt. SETH achieves high precision and recall on several evaluation corpora of PubMed abstracts. It is freely available and encompasses stand-alone scripts for isolated application and evaluation as well as a th...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Thomas, P., Rocktäschel, T., Hakenberg, J., Lichtblau, Y., Leser, U. Tags: DATA AND TEXT MINING Source Type: research

Rule-based modeling with Virtual Cell
Summary: Rule-based modeling is invaluable when the number of possible species and reactions in a model becomes too large to allow convenient manual specification. The popular rule-based software tools BioNetGen and NFSim provide powerful modeling and simulation capabilities at the cost of learning a complex scripting language which is used to specify these models. Here, we introduce a modeling tool that combines new graphical rule-based model specification with existing simulation engines in a seamless way within the familiar Virtual Cell (VCell) modeling environment. A mathematical model can be built integrating explicit ...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Schaff, J. C., Vasilescu, D., Moraru, I. I., Loew, L. M., Blinov, M. L. Tags: SYSTEMS BIOLOGY Source Type: research

R.JIVE for exploration of multi-source molecular data
Summary: The integrative analysis of multiple high-throughput data sources that are available for a common sample set is an increasingly common goal in biomedical research. Joint and individual variation explained (JIVE) is a tool for exploratory dimension reduction that decomposes a multi-source dataset into three terms: a low-rank approximation capturing joint variation across sources, low-rank approximations for structured variation individual to each source and residual noise. JIVE has been used to explore multi-source data for a variety of application areas but its accessibility was previously limited. We introduce R....
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: O'Connell, M. J., Lock, E. F. Tags: SYSTEMS BIOLOGY Source Type: research
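The three-term decomposition described in the JIVE abstract can be written compactly. The symbols below are assumptions chosen for illustration, not notation taken from the abstract: k data sources, X_i for the data matrix of source i, and r, r_i for the ranks of the joint and individual terms.

```latex
% Sketch of the JIVE decomposition for k data sources measured on a common sample set.
% X_i : data matrix for source i
% J_i : the block of the joint low-rank term J that belongs to source i
% A_i : low-rank structured variation individual to source i
% E_i : residual noise
\[
  X_i = J_i + A_i + E_i, \qquad i = 1, \dots, k,
\]
\[
  \operatorname{rank}(J) = r, \qquad
  \operatorname{rank}(A_i) = r_i, \qquad
  J = \begin{bmatrix} J_1 \\ \vdots \\ J_k \end{bmatrix}.
\]
```

Read this way, the joint term is a single low-rank matrix spanning all sources, while each individual term is low-rank only within its own source.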

MIA: non-targeted mass isotopolome analysis
Summary: MIA detects and visualizes isotopic enrichment in gas chromatography electron ionization mass spectrometry (GC–EI-MS) datasets in a non-targeted manner. It provides an easy-to-use graphical user interface that allows for visual mass isotopomer distribution analysis across multiple datasets. MIA helps to reveal changes in metabolic fluxes, visualizes metabolic proximity of isotopically enriched compounds and shows the fate of the applied stable isotope labeled tracer. Availability and Implementation: Linux and Windows binaries, documentation, and sample data are freely available for download at http://massiso...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Weindl, D., Wegner, A., Hiller, K. Tags: SYSTEMS BIOLOGY Source Type: research

RANKS: a flexible tool for node label ranking and classification in biological networks
Summary: RANKS is a flexible software package that can be easily applied to any bioinformatics task formalizable as ranking of nodes with respect to a property given as a label, such as automated protein function prediction, gene disease prioritization and drug repositioning. To this end RANKS provides an efficient and easy-to-use implementation of kernelized score functions, a semi-supervised algorithmic scheme embedding both local and global learning strategies for the analysis of biomolecular networks. To facilitate comparative assessment, baseline network-based methods, e.g. label propagation and random walk algorithms...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Valentini, G., Armano, G., Frasca, M., Lin, J., Mesiti, M., Re, M. Tags: SYSTEMS BIOLOGY Source Type: research

CART--a chemical annotation retrieval toolkit
Motivation: Data on bioactivities of drug-like chemicals are rapidly accumulating in public repositories, creating new opportunities for research in computational systems pharmacology. However, integrative analysis of these data sets is difficult due to prevailing ambiguity between chemical names and identifiers and a lack of cross-references between databases. Results: To address this challenge, we have developed CART, a Chemical Annotation Retrieval Toolkit. As a key functionality, it matches an input list of chemical names into a comprehensive reference space to assign unambiguous chemical identifiers. In this unified s...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Deghou, S., Zeller, G., Iskar, M., Driessen, M., Castillo, M., van Noort, V., Bork, P. Tags: SYSTEMS BIOLOGY Source Type: research

Combenefit: an interactive platform for the analysis and visualization of drug combinations
We present Combenefit, a new free software tool that enables the visualization, analysis and quantification of drug combination effects in terms of synergy and/or antagonism. Data from combination assays can be processed using classical synergy models (Loewe, Bliss, HSA), as single experiments or in batch for high-throughput screens. This user-friendly tool provides laboratory scientists with an easy and systematic way to analyze their data. The companion package provides bioinformaticians with critical implementations of routines enabling the processing of combination data. Availability and Implementation: Combenefit is pr...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Di Veroli, G. Y., Fornari, C., Wang, D., Mollard, S., Bramhall, J. L., Richards, F. M., Jodrell, D. I. Tags: SYSTEMS BIOLOGY Source Type: research
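Two of the classical reference models named in the Combenefit abstract, Bliss independence and highest single agent (HSA), are simple enough to sketch. The code below is not taken from Combenefit; the function names are hypothetical, and it assumes effects are expressed as fractions between 0 and 1 (for example, fractional growth inhibition). The Loewe model is omitted because it requires fitted dose-response curves.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined effect if the two drugs act independently (Bliss)."""
    return effect_a + effect_b - effect_a * effect_b


def hsa_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined effect under the highest-single-agent (HSA) model."""
    return max(effect_a, effect_b)


def excess_over_reference(observed: float, expected: float) -> float:
    """Positive values suggest synergy, negative values suggest antagonism."""
    return observed - expected


# Made-up example: each drug alone inhibits 50% of growth, the pair inhibits 90%.
bliss_ref = bliss_expected(0.5, 0.5)
hsa_ref = hsa_expected(0.5, 0.5)
bliss_excess = excess_over_reference(0.9, bliss_ref)
hsa_excess = excess_over_reference(0.9, hsa_ref)
```

The same observation can score differently against each reference model, which is one reason a tool in this area would report more than one.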

MEANS: Python package for Moment Expansion Approximation, iNference and Simulation
We present a free, user-friendly tool implementing an efficient moment expansion approximation with parametric closures that integrates well with the IPython interactive environment. Our package enables the analysis of complex stochastic systems without any constraints on the number of species and moments studied and the type of rate laws in the system. In addition to the approximation method our package provides numerous tools to help non-expert users in stochastic analysis. Availability and implementation: https://github.com/theosysbio/means Contacts: m.stumpf@imperial.ac.uk or e.lakatos13@imperial.ac.uk Supplementary in...
Source: Bioinformatics - September 10, 2016 Category: Bioinformatics Authors: Fan, S., Geissmann, Q., Lakatos, E., Lukauskas, S., Ale, A., Babtie, A. C., Kirk, P. D. W., Stumpf, M. P. H. Tags: SYSTEMS BIOLOGY Source Type: research