Since class overlap has not been mathematically well characterised [35], no standard measure of the overlap degree has yet been defined. Several approaches have been formulated to estimate the overlap degree, each with limitations. For example, in [17], the overlap degree of a synthetic dataset was determined as the overlapping area relative to the total data space. In [34], the authors adapted this measurement to take class imbalance into account, on the grounds that the minority class is relatively more overwhelmed by class overlap: the overlap degree was instead measured as the overlapping area relative to the total area of the positive class. Another common approach uses the classification error as the estimated overlap degree, e.g., the percentage of instances misclassified by the k-Nearest Neighbour (kNN) rule [36] with respect to the total number of instances [37,38]. However, in [35], the authors showed that such an approach was inaccurate and proposed using the ridge curves of the probability density function to quantify class overlap. The computation was based on the ratio of the saddle point to the smaller peak of the ridge curves of the two classes. This method is one of the few existing methods that measure overlap from the actual contour of the data, and it can be extended to handle multi-class datasets. Its main drawback is that it is only applicable to datasets with normal distributions of both data and features, which is impractical for real-world datasets. In [39], the overlap degree was defined as the distance between the class centroids, which is likely to be inaccurate owing to the arbitrary shapes and non-uniformity of real data. Another approach [40] was based on Support Vector Data Description (SVDD) [41]. SVDD was used to locate the approximate boundary of each class in binary-class datasets, and the overlapping region was estimated from the number of common instances found within both boundaries.
Similar to the approach of [39], this method tends to introduce high errors in the overlap approximation, since SVDD is only capable of discovering a spherical-shaped boundary of a class, which is not ideal for real-world datasets.
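As a rough illustration of two of the simpler estimates surveyed above, the following sketch computes a kNN-error-based overlap estimate and the distance between class centroids. It is a minimal sketch under stated assumptions, not the exact procedure of [37,38] or [39]: the function names `knn_error_overlap` and `centroid_distance`, the leave-one-out evaluation, the Euclidean metric, and the default `k=3` are all choices made here for illustration.

```python
import math
from collections import Counter


def knn_error_overlap(points, labels, k=3):
    """Fraction of instances misclassified by a leave-one-out kNN vote.

    Assumption: leave-one-out evaluation with Euclidean distance; the
    surveyed works may use a different protocol or metric.
    """
    n = len(points)
    misclassified = 0
    for i in range(n):
        # Distances from instance i to every other instance (itself excluded),
        # keeping only the k closest.
        neighbours = sorted(
            (math.dist(points[i], points[j]), labels[j])
            for j in range(n)
            if j != i
        )[:k]
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in neighbours)
        predicted = votes.most_common(1)[0][0]
        if predicted != labels[i]:
            misclassified += 1
    return misclassified / n


def centroid_distance(points, labels, class_a, class_b):
    """Euclidean distance between the centroids (per-feature means) of two classes."""

    def centroid(cls):
        members = [p for p, y in zip(points, labels) if y == cls]
        return [sum(coord) / len(members) for coord in zip(*members)]

    return math.dist(centroid(class_a), centroid(class_b))


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated square clusters.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    labs = [0, 0, 0, 0, 1, 1, 1, 1]
    print(knn_error_overlap(pts, labs, k=3))
    print(centroid_distance(pts, labs, 0, 1))
```

On the well-separated toy clusters, every instance's nearest neighbours share its label, so the kNN-based estimate is zero. Both quantities reduce the data to a single number that ignores the shape of each class, which is consistent with the limitations noted above for the classification-error and centroid-based approaches.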