seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data
Motivation: One of the main goals of large-scale methylation studies is to detect differentially methylated loci. One way is to approach this problem sitewise, i.e. to find differentially methylated positions (DMPs). However, it has been shown that methylation is regulated in longer genomic regions, so it is more desirable to identify differentially methylated regions (DMRs) instead of DMPs. The new high-coverage arrays, like Illumina's 450k platform, make this possible at a reasonable cost. A few tools exist for DMR identification from this type of data, but there is no standard approach. Results: We propose a novel method for...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Kolde, R., Märtens, K., Lokk, K., Laur, S., Vilo, J. Tags: GENOME ANALYSIS Source Type: research

A simple yet accurate correction for winner's curse can predict signals discovered in much larger genome scans
Conclusions: FIQT is a simple, yet accurate, WCA method for Z-scores (and ORs/RRs, via simple transformations). Availability and Implementation: A 10-line R function implementation is available at https://github.com/bacanusa/FIQT. Contact: sabacanu@vcu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Bigdeli, T. B., Lee, D., Webb, B. T., Riley, B. P., Vladimirov, V. I., Fanous, A. H., Kendler, K. S., Bacanu, S.-A. Tags: GENOME ANALYSIS Source Type: research

SoFIA: a data integration framework for annotating high-throughput datasets
We present SoFIA, a framework for workflow-driven data integration with a focus on genomic annotation. SoFIA conceptualizes workflow templates as comprehensive workflows that cover as many data integration operations as possible in a given domain. However, these templates are not intended to be executed as a whole; instead, when given an integration task consisting of a set of input data and a set of desired output data, SoFIA derives a minimal workflow that completes the task. These workflows are typically fast and create exactly the information a user wants without requiring them to do any implementation work. Using a co...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Childs, L. H., Mamlouk, S., Brandt, J., Sers, C., Leser, U. Tags: GENOME ANALYSIS Source Type: research

Evaluation of hybrid and non-hybrid methods for de novo assembly of nanopore reads
Motivation: The recent emergence of nanopore sequencing technology has set a challenge for established assembly methods. In this work, we assessed how existing hybrid and non-hybrid de novo assembly methods perform on long and error-prone nanopore reads. Results: We benchmarked five non-hybrid (in terms of both error correction and scaffolding) assembly pipelines as well as two hybrid assemblers which use third generation sequencing data to scaffold Illumina assemblies. Tests were performed on several publicly available MinION and Illumina datasets of Escherichia coli K-12, using several sequencing coverages of nanopore data (20x,...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Sovic, I., Krizanovic, K., Skala, K., Sikic, M. Tags: GENOME ANALYSIS Source Type: research

DOGMA: domain-based transcriptome and proteome quality assessment
Motivation: Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary a lot. Therefore, quality assessment of assemblies and annotations is a crucial aspect of genome analysis pipelines. Results: We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content fo...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Dohmen, E., Kremer, L. P. M., Bornberg-Bauer, E., Kemena, C. Tags: GENOME ANALYSIS Source Type: research

densityCut: an efficient and versatile topological approach for automatic clustering of biological data
This article introduces densityCut, a novel density-based clustering algorithm, which is both time- and space-efficient and proceeds as follows: densityCut first roughly estimates the densities of data points from a K-nearest neighbour graph and then refines the densities via a random walk. A cluster consists of points falling into the basin of attraction of an estimated mode of the underlying density function. A post-processing step merges clusters and generates a hierarchical cluster tree. The number of clusters is selected from the most stable clustering in the hierarchical cluster tree. Experimental results on ten syn...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Ding, J., Shah, S., Condon, A. Tags: GENOME ANALYSIS Source Type: research
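
The first steps described in the densityCut abstract (K-nearest neighbour density estimate, random-walk refinement, assignment to basins of attraction) can be illustrated with a rough Python sketch. This is not the authors' implementation: the inverse k-th-neighbour distance as the initial density, the damped update with the hypothetical parameters `alpha` and `steps`, and the hill-climbing assignment are all assumptions chosen for illustration, and the cluster-merging and tree-building steps are omitted.

```python
import math

def knn_graph(points, k):
    """For each point, return the indices of its k nearest other points."""
    graph = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        graph.append(others[:k])
    return graph

def initial_density(points, graph):
    # Assumption: inverse distance to the k-th neighbour as a rough density.
    # Duplicate points (zero distance) are not handled in this sketch.
    return [1.0 / math.dist(points[i], points[nbrs[-1]])
            for i, nbrs in enumerate(graph)]

def refine_by_random_walk(density, graph, alpha=0.5, steps=20):
    # Assumption: a damped random walk on the kNN graph. Each point passes
    # its current density in equal shares to its k neighbours, and the
    # result is mixed with the initial estimate.
    k = len(graph[0])
    current = list(density)
    for _ in range(steps):
        inflow = [0.0] * len(density)
        for j, nbrs in enumerate(graph):
            for i in nbrs:
                inflow[i] += current[j] / k
        current = [alpha * inflow[i] + (1 - alpha) * density[i]
                   for i in range(len(density))]
    return current

def assign_to_modes(density, graph):
    # Each point climbs to the densest point among itself and its
    # neighbours until it reaches a local mode; the mode's index is the
    # cluster label. Ties are broken by index so the climb cannot cycle.
    def rank(j):
        return (density[j], -j)
    parent = [max([i] + nbrs, key=rank) for i, nbrs in enumerate(graph)]
    labels = []
    for start in range(len(parent)):
        node = start
        while parent[node] != node:
            node = parent[node]
        labels.append(node)
    return labels

# Two well-separated toy groups of five points each (illustrative data).
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05)]
graph = knn_graph(points, k=3)
density = refine_by_random_walk(initial_density(points, graph), graph)
labels = assign_to_modes(density, graph)
print(labels)
```

Because the neighbour graph never links the two groups in this toy data, no point can be assigned to a mode in the other group.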

On cross-conditional and fluctuation correlations in competitive RNA networks
Conclusion: Our results deepen the theoretical understanding of cross-talk in ceRNA networks, and have implications for the experimental identification of ceRNA cross-talk phenomena. Availability and Implementation: Simulation software available upon request. Contact: tperkins@ohri.ca
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Sanchez-Taltavull, D., MacLeod, M., Perkins, T. J. Tags: SYSTEMS Source Type: research

Bayesian parameter estimation for the Wnt pathway: an infinite mixture models approach
Motivation: Likelihood-free methods, like Approximate Bayesian Computation (ABC), have been extensively used in model-based statistical inference with intractable likelihood functions. When combined with Sequential Monte Carlo (SMC) algorithms they constitute a powerful approach for parameter estimation and model selection of mathematical models of complex biological systems. A crucial step in the ABC–SMC algorithms, significantly affecting their performance, is the propagation of a set of parameter vectors through a sequence of intermediate distributions using Markov kernels. Results: In this article, we employ Diri...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Koutroumpas, K., Ballarini, P., Votsi, I., Cournede, P.-H. Tags: SYSTEMS Source Type: research

Logical model specification aided by model-checking techniques: application to the mammalian cell cycle regulation
Motivation: Understanding the temporal behaviour of biological regulatory networks requires the integration of molecular information into a formal model. However, the analysis of model dynamics faces a combinatorial explosion as the number of regulatory components and interactions increases. Results: We use model-checking techniques to verify sophisticated dynamical properties resulting from the model regulatory structure in the absence of kinetic assumptions. We demonstrate the power of this approach by analysing a logical model of the molecular network controlling the mammalian cell cycle. This approach enables a systematic a...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Traynard, P., Faure, A., Fages, F., Thieffry, D. Tags: SYSTEMS Source Type: research

Edge-based sensitivity analysis of signaling networks by using Boolean dynamics
Motivation: Biological networks are composed of molecular components and their interactions, represented by nodes and edges, respectively, in a graph model. Based on this model, there have been many studies on the effects of node-based mutations on network dynamics, whereas little attention has been paid to edgetic mutations so far. Results: In this paper, we define an edgetic sensitivity measure that quantifies how likely a converging attractor is to be changed by edge-removal mutations in a Boolean network model. Through extensive simulations based on that measure, we found interesting properties of highly sensitive edges i...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Trinh, H.-C., Kwon, Y.-K. Tags: SYSTEMS Source Type: research
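
The idea of measuring how often an edge removal changes the attractor a Boolean network converges to can be sketched on a toy example. The Python code below is only an illustration under stated assumptions, not the measure as defined by the authors: it assumes synchronous updates, a signed-sum update rule (a node switches on if its active inputs sum to a positive value, off if negative, and keeps its state otherwise), and exhaustive enumeration of all initial states. The function names are hypothetical.

```python
from itertools import product

def step(state, edges):
    """One synchronous update under the assumed signed-sum rule."""
    total = [0] * len(state)
    for src, dst, sign in edges:
        if state[src]:
            total[dst] += sign
    return tuple(1 if t > 0 else (0 if t < 0 else s)
                 for s, t in zip(state, total))

def attractor(state, edges):
    """Follow the trajectory until a state repeats; return the cycle reached."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, edges)
    return frozenset(trajectory[seen[state]:])

def edge_sensitivity(n_nodes, edges, edge):
    """Fraction of initial states whose attractor changes when `edge` is removed."""
    mutated = [e for e in edges if e != edge]
    states = list(product((0, 1), repeat=n_nodes))
    changed = sum(attractor(s, edges) != attractor(s, mutated) for s in states)
    return changed / len(states)

# Toy network: two nodes that activate each other, as (source, target, sign).
edges = [(0, 1, +1), (1, 0, +1)]
sensitivity = edge_sensitivity(2, edges, (0, 1, +1))
print(sensitivity)
```

Exhaustive enumeration is only feasible for small networks, since the number of initial states doubles with every node; for larger networks, sampling initial states would be a natural substitute.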

iReMet-flux: constraint-based approach for integrating relative metabolite levels into a stoichiometric metabolic models
Motivation: Understanding the rerouting of metabolic reaction fluxes upon perturbations has the potential to link changes in molecular state of a cellular system to alteration of growth. Yet, differential flux profiling on a genome-scale level remains one of the biggest challenges in systems biology. This is particularly relevant in plants, for which fluxes in autotrophic growth necessitate time-consuming instationary labeling experiments and costly computations, feasible for small-scale networks. Results: Here we present a computationally and experimentally facile approach, termed iReMet-Flux, which integrates relative me...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Sajitz-Hermstein, M., Töpfer, N., Kleessen, S., Fernie, A. R., Nikoloski, Z. Tags: SYSTEMS Source Type: research

Genome wide predictions of miRNA regulation by transcription factors
Motivation: Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on trying to experimentally and computationally determine the set of transcription-factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little work has focused on inferring the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes including development and disease. The main challenge for predicting such interactions is the very small positive training set currently available. Another challenge i...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Ruffalo, M., Bar-Joseph, Z. Tags: SYSTEMS Source Type: research

A weighted exact test for mutually exclusive mutations in cancer
Motivation: The somatic mutations in the pathways that drive cancer development tend to be mutually exclusive across tumors, providing a signal for distinguishing driver mutations from a larger number of random passenger mutations. This mutual exclusivity signal can be confounded by high and highly variable mutation rates across a cohort of samples. Current statistical tests for exclusivity that incorporate both per-gene and per-sample mutational frequencies are computationally expensive and have limited precision. Results: We formulate a weighted exact test for assessing the significance of mutual exclusivity in an arbitr...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Leiserson, M. D. M., Reyna, M. A., Raphael, B. J. Tags: SYSTEMS Source Type: research

Large-scale inference of conjunctive Bayesian networks
The continuous time conjunctive Bayesian network (CT-CBN) is a graphical model for analyzing the waiting time process of the accumulation of genetic changes (mutations). CT-CBN models have been successfully used in several biological applications such as HIV drug resistance development and genetic progression of cancer. However, current approaches for parameter estimation and network structure learning of CBNs can only deal with a small number of mutations (<20). Here, we address this limitation by presenting an efficient and accurate approximate inference algorithm using a Monte Carlo expectation-maximization algorithm...
Source: Bioinformatics - August 31, 2016 Category: Bioinformatics Authors: Montazeri, H., Kuipers, J., Kouyos, R., Böni, J., Yerly, S., Klimkait, T., Aubert, V., Günthard, H. F., Beerenwinkel, N., The Swiss HIV Cohort Study Tags: SYSTEMS Source Type: research