Imbalanced multi-label learning for identifying antimicrobial peptides and their functional types

Motivation: With the rapid increase of infection resistance to antibiotics, it is urgent to find novel infection therapeutics. In recent years, antimicrobial peptides (AMPs) have been utilized as potential alternatives for infection therapeutics. AMPs are key components of the innate immune system and can protect the host from various pathogenic bacteria. Identifying AMPs and their functional types has led to many studies, and various predictors using machine learning have been developed. However, there is room for improvement; in particular, no predictor takes into account the lack of balance among different functional AMPs. Results: In this paper, a new synthetic minority over-sampling technique on imbalanced and multi-label datasets, referred to as ML-SMOTE, was designed for processing and identifying AMPs’ functional families. A novel multi-label classifier, MLAMP, was also developed using ML-SMOTE and grey pseudo amino acid composition. The classifier obtained 0.4846 subset accuracy and 0.16 hamming loss. Availability and Implementation: A user-friendly web-server for MLAMP was established at http://www.jci-bioinfo.cn/MLAMP. Contacts: linweizhong@jci.edu.cn or xudong@missouri.edu
Source: Bioinformatics - Category: Bioinformatics Authors: Tags: SEQUENCE ANALYSIS Source Type: research