Target prediction and a statistical sampling algorithm for RNA-RNA interaction — : 26 (2): 175 — Bioinformatics
Motivation: It has been proven that the accessibility of the target sites has a critical influence on RNA–RNA binding, in general and the specificity and efficiency of miRNAs and siRNAs, in particular. Recently, O(N6) time and O(N4) space dynamic programming (DP) algorithms have become available that compute the partition function of RNA–RNA interaction complexes, thereby providing detailed insights into their thermodynamic properties.
Results: Modifications to the grammars underlying earlier approaches enables the calculation of interaction probabilities for any given interval on the target RNA. The computation of the ‘hybrid probabilities’ is complemented by a stochastic sampling algorithm that produces a Boltzmann weighted ensemble of RNA–RNA interaction structures. The sampling of k structures requires only negligible additional memory resources and runs in O(k·N3).
Availability: The algorithms described here are implemented in C as part of the rip package. The source code of rip2 can be downloaded from http://www.combinatorics.cn/cbpc/rip.html and http://www.bioinf.uni-leipzig.de/Software/rip.html.
Motivation: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering- or consensus-based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however, they are often CPU intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ—a novel MQAP that compares 3D models of proteins without the need for CPU intensive structural alignments by utilizing the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead.
Results: The ModFOLDclustQ method is competitive with leading clustering-based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over five times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering-based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction.
Availability: The ModFOLDclustQ and ModFOLDclust2 methods are available to download from http://www.reading.ac.uk/bioinf/downloads/
AQUA: automated quality improvement for multiple sequence alignments — : 26 (2): 263 — Bioinformatics
Summary: Multiple sequence alignment (MSA) is a central tool in most modern biology studies. However, despite generations of valuable tools, human experts are still able to improve automatically generated MSAs. In an effort to automatically identify the most reliable MSA for a given protein family, we propose a very simple protocol, named AQUA for ‘Automated quality improvement for multiple sequence alignments’. Our current implementation relies on two alignment programs (MUSCLE and MAFFT), one refinement program (RASCAL) and one assessment program (NORMD), but other programs could be incorporated at any of the three steps.
Availability: AQUA is implemented in Tcl/Tk and runs in command line on all platforms. The source code is available under the GNU GPL license. Source code, README and Supplementary data are available at http://www.bork.embl.de/Docu/AQUA.