Molecular mechanisms underlying COPD-muscle dysfunction unveiled through a systems medicine approach
We describe a discrete model-driven method combining mechanistic and probabilistic approaches to decipher the role of ROS in the activity state of the skeletal muscle regulatory network, assessed before and after an 8-week endurance training program in COPD patients and healthy subjects. In COPD, our computational analysis indicates abnormal training-induced regulatory responses leading to defective tissue remodeling and abnormal energy metabolism. Moreover, we identified tnf, insr, inha and myc as key regulators of abnormal training-induced adaptations in COPD. The tnf-insr pair was identified as a promising target for therap...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Marin de Mas, I., Fanchon, E., Papp, B., Kalko, S., Roca, J., Cascante, M. Tags: SYSTEMS BIOLOGY Source Type: research

BeReTa: a systematic method for identifying target transcriptional regulators to enhance microbial production of chemicals
We present BeReTa (Beneficial Regulator Targeting), a new algorithm for prioritization of TR manipulation targets, which makes use of unintegrated network models. BeReTa identifies TR manipulation targets by evaluating regulatory strengths of interactions and beneficial effects of reactions, and subsequently assigning beneficial scores for the TRs. We demonstrate that BeReTa can predict both known and novel TR manipulation targets for enhanced production of various chemicals in Escherichia coli. Furthermore, through a case study of antibiotics production in Streptomyces coelicolor, we successfully demonstrate its wide appl...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Kim, M., Sun, G., Lee, D.-Y., Kim, B.-G. Tags: SYSTEMS BIOLOGY Source Type: research

A likelihood ratio-based method to predict exact pedigrees for complex families from next-generation sequencing data
Motivation: Next-generation sequencing technology has considerably changed the way we screen for pathogenic mutations in rare Mendelian disorders. However, the identification of the disease-causing mutation amongst thousands of variants of partly unknown relevance is still challenging, and efficient techniques that reduce the genomic search space play a decisive role. Segregation or linkage analysis is often used to prioritize candidates; however, these approaches require correct information about the degree of relationship among the sequenced samples. For quality assurance, an automated control of pedigree structures and samp...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Heinrich, V., Kamphans, T., Mundlos, S., Robinson, P. N., Krawitz, P. M. Tags: GENETICS AND POPULATION ANALYSIS Source Type: research

Gene- and pathway-based association tests for multiple traits with GWAS summary statistics
Summary: To identify novel genetic variants associated with complex traits and to gain new insights into the underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying as...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Kwak, I.-Y., Pan, W. Tags: GENETICS AND POPULATION ANALYSIS Source Type: research

PDB2CD: a web-based application for the generation of circular dichroism spectra from protein atomic coordinates
Motivation: Circular dichroism (CD) spectroscopy is extensively utilized for determining the percentages of secondary structure content present in proteins. However, although a large contributor, secondary structure is not the only factor that influences the shape and magnitude of the CD spectrum produced. Other structural features can make contributions so an entire protein structural conformation can give rise to a CD spectrum. There is a need for an application capable of generating protein CD spectra from atomic coordinates. However, no empirically derived method to do this currently exists. Results: PDB2CD has been cr...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Mavridis, L., Janes, R. W. Tags: STRUCTURAL BIOINFORMATICS Source Type: research

Nanocall: an open source basecaller for Oxford Nanopore sequencing data
Motivation: The highly portable Oxford Nanopore MinION sequencer has enabled new applications of genome sequencing directly in the field. However, the MinION currently relies on a cloud computing platform, Metrichor (metrichor.com), for translating locally generated sequencing data into basecalls. Results: To allow offline and private analysis of MinION data, we created Nanocall. Nanocall is the first freely available, open-source basecaller for Oxford Nanopore sequencing data and does not require an internet connection. Using R7.3 chemistry, on two E. coli and two human samples, with natural as well as PCR-amplified DNA, N...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: David, M., Dursi, L. J., Yao, D., Boutros, P. C., Simpson, J. T. Tags: SEQUENCE ANALYSIS Source Type: research

Prediction of nucleosome positioning by the incorporation of frequencies and distributions of three different nucleotide segment lengths into a general pseudo k-tuple nucleotide composition
Motivation: Nucleosome positioning plays important roles in many eukaryotic intranuclear processes, such as transcriptional regulation and chromatin structure formation. The investigations of nucleosome positioning rules provide a deeper understanding of these intracellular processes. Results: Nucleosome positioning prediction was performed using a model consisting of three types of variables characterizing a DNA sequence—the number of five-nucleotide sequences, the number of three-nucleotide combinations in one period of a helix, and mono- and di-nucleotide distributions in DNA fragments. Using recently proposed str...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Awazu, A. Tags: SEQUENCE ANALYSIS Source Type: research

iRSpot-EL: identify recombination spots with an ensemble learning approach
Motivation: Coexisting in a DNA system, meiosis and recombination are two indispensable aspects of cell reproduction and growth. With the avalanche of genome sequences emerging in the post-genomic age, it is an urgent challenge to acquire information on DNA recombination spots because it can provide timely and useful insights into the mechanism of meiotic recombination and the process of genome evolution. Results: To address such a challenge, we have developed a predictor, called iRSpot-EL, by fusing different modes of pseudo K-tuple nucleotide composition and mode of dinucleotide-based auto-cross covariance into an ...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Liu, B., Wang, S., Long, R., Chou, K.-C. Tags: SEQUENCE ANALYSIS Source Type: research

SiNVICT: ultra-sensitive detection of single nucleotide variants and indels in circulating tumour DNA
Motivation: Successful development and application of precision oncology approaches require robust elucidation of the genomic landscape of a patient’s cancer and, ideally, the ability to monitor therapy-induced genomic changes in the tumour in an inexpensive and minimally invasive manner. Thanks to recent advances in sequencing technologies, ‘liquid biopsy’, the sampling of a patient’s bodily fluids such as blood and urine, is considered one of the most promising approaches to achieve this goal. In many cancer patients, and especially those with advanced metastatic disease, deep sequencing of circu...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Kockan, C., Hach, F., Sarrafi, I., Bell, R. H., McConeghy, B., Beja, K., Haegert, A., Wyatt, A. W., Volik, S. V., Chi, K. N., Collins, C. C., Sahinalp, S. C. Tags: SEQUENCE ANALYSIS Source Type: research

Multivariate two-part statistics for analysis of correlated mass spectrometry data from multiple biological specimens
Motivation: High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. Results: We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospe...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Taylor, S. L., Ruhaak, L. R., Weiss, R. H., Kelly, K., Kim, K. Tags: GENOME ANALYSIS Source Type: research

signeR: an empirical Bayesian approach to mutational signature discovery
Motivation: Mutational signatures can be used to understand cancer origins and provide a unique opportunity to group tumor types that share the same origins and result from similar processes. These signatures have been identified from high-throughput sequencing data generated from cancer genomes by using non-negative matrix factorisation (NMF) techniques. Current methods based on optimization techniques are strongly sensitive to initial conditions due to the high dimensionality and nonconvexity of the NMF paradigm. In this context, an important question is how to determine the actual number of signatures that best...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Rosales, R. A., Drummond, R. D., Valieris, R., Dias-Neto, E., da Silva, I. T. Tags: GENOME ANALYSIS Source Type: research

A novel method for predicting activity of cis-regulatory modules, based on a diverse training set
We present a novel strategy to improve the above-mentioned approach: to predict if a CRM drives a specific gene expression pattern, assess not only how similar the CRM is to other CRMs with similar activity but also to CRMs with distinct activities. We use a state-of-the-art statistical method to quantify a CRM’s sequence similarity to many different training sets of CRMs, and employ a classification algorithm to integrate these similarity scores into a single prediction of the CRM’s activity. This strategy is shown to significantly improve CRM activity prediction over current approaches. Availability and Imple...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Yang, W., Sinha, S. Tags: GENOME ANALYSIS Source Type: research

An integer programming framework for inferring disease complexes from network data
Source: Bioinformatics - December 18, 2016 Category: Bioinformatics Authors: Mazza, A., Klockmeier, K., Wanker, E., Sharan, R. Tags: CORRIGENDUM Source Type: research

MSIdV: a versatile tool to visualize biological indices from mass spectrometry imaging data
Summary: Mass spectrometry imaging (MSI) visualizes the simultaneous lateral distribution of multiple compounds on a sample surface. However, it is still difficult to visualize biological indices, such as the energy charge index, from multiple compounds because of the lack of publicly available tools. Here we present MSIdV, a visualization tool for biological indices calculated from mass spectrometry imaging data, which can effectively scan a series of mass spectra and process, calculate and visualize user-defined index measures accurately with a number of signal processing features. Availability and Implementation: MSIdV is imple...
Source: Bioinformatics - December 18, 2016 Category: Bioinformatics Authors: Hayakawa, E., Fujimura, Y., Miura, D. Tags: BIOIMAGE INFORMATICS Source Type: research

AlmostSignificant: simplifying quality control of high-throughput sequencing data
Motivation: The current generation of DNA sequencing technologies produces a large amount of data quickly. All of these data need to pass some form of quality control (QC) processing and checking before they can be used for any analysis. The large number of samples that are run through Illumina sequencing machines makes the process of QC an onerous and time-consuming task that requires multiple pieces of information from several sources. Results: AlmostSignificant is an open-source platform for aggregating multiple sources of quality metrics as well as run and sample meta-data associated with DNA sequencing runs from Illumi...
Source: Bioinformatics - December 18, 2016 Category: Bioinformatics Authors: Ward, J., Cole, C., Febrer, M., Barton, G. J. Tags: DATABASES AND ONTOLOGIES Source Type: research