A Study of Learning Targets and Evaluation Criteria in Imbalanced Data Learning - Institute of Automation


Apr 18, 2016



Abstract: Imbalanced data learning is one of the challenges in big data processing. This program aims at a systematic study of the primary problem in imbalanced data learning, namely “What to learn?”. At the theoretical level, we will explore which specific learning targets imbalanced data learning requires at both the “linguistic” and “computational” levels. We will study the intrinsic properties of learning targets and evaluation criteria in order to reach a theoretical understanding of why some measures are appropriate for imbalanced data learning and others are not. We will further compare information-based learning targets and criteria with non-information-based ones, investigate their connections, and derive their relations with respect to the imbalance ratio. The goal of this analytical study is to provide guidelines for selecting learning targets and evaluation criteria. At the approach level, we will extend current classifiers with abstaining functions for wider applications, and we will study the optimization of the reject threshold and its associated properties. A novel boosting classifier will be developed by setting multiple learning targets, serving as a classifier example for large-scale data processing. These targets will include adaptation to the imbalance ratio of the data, abstaining and non-abstaining classification, and convex optimization. The final goal of this program is to put forward “learning target selection” as a new study theme in machine learning and to provide a study example of abstaining classifier design in imbalanced data learning.
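To make the abstract's point about evaluation criteria concrete, the following is a minimal Python sketch, not the project's own method. It assumes normalized mutual information as one possible information-based criterion and a simple confidence-based reject rule as one possible abstaining function; the function names, the 990:10 class split, and the threshold value are all illustrative assumptions.

```python
import math


def accuracy(cm):
    """Fraction of correct decisions; cm[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def normalized_mutual_information(cm):
    """Mutual information I(T;Y) between true and predicted labels, divided by H(T).

    This is an assumed example of an information-based criterion; the
    abstract does not say which measure the project uses.
    """
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(row[j] for row in cm) for j in range(len(cm[0]))]
    # Entropy of the true-label distribution.
    h_t = -sum((r / total) * math.log2(r / total) for r in row_sums if r > 0)
    mi = 0.0
    for i, row in enumerate(cm):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi / h_t if h_t > 0 else 0.0


def classify_with_reject(p_positive, threshold=0.7):
    """Hypothetical abstaining rule: return None when the classifier is unsure.

    p_positive is an estimated probability of the positive (minority) class;
    threshold is an assumed reject threshold in [0.5, 1].
    """
    confidence = max(p_positive, 1.0 - p_positive)
    if confidence < threshold:
        return None  # abstain
    return 1 if p_positive >= 0.5 else 0


if __name__ == "__main__":
    # 990 majority-class and 10 minority-class examples (illustrative numbers).
    majority_only = [[990, 0], [10, 0]]  # always predicts the majority class
    informative = [[940, 50], [2, 8]]    # finds most minority examples

    for name, cm in [("majority_only", majority_only), ("informative", informative)]:
        print(name, accuracy(cm), normalized_mutual_information(cm))

    print(classify_with_reject(0.55), classify_with_reject(0.9))
```

In this toy setting the classifier that always predicts the majority class has the higher accuracy, yet its predictions share no information with the true labels, while the classifier that finds most minority examples scores lower on accuracy and higher on the information-based measure. This is the kind of disagreement between criteria that the proposed study would need to explain.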

Keywords: machine learning; imbalanced data learning; learning target selection; evaluation criteria selection; Boosting classifier

Contact:

HU Baogang

E-mail: hug@nplr.ia.ac.cn

National Laboratory of Pattern Recognition
