
4.5 RBF Neural Networks Dealing with Unbalanced Data

4.5.3 Training RBF Neural Networks on Unbalanced Data

Many data sets contain unbalanced cases, in which the sample sizes of different classes are unbalanced. Unbalanced training data may lead to an unbalanced architecture in training. In our work, we add larger weights to the minority classes in order to attract more attention to the minority members during training.

Assume that the number of samples in class i is N_i. The total number of samples in the data set is N = N_1 + ··· + N_i + ··· + N_M. The error function shown in Eq. (4.19) can be written as:

E_0(W) = \frac{1}{2} \sum_{i=1}^{M} \sum_{X_n \in A_i} \sum_{m=1}^{M} \bigl( y_m(X_n) - t_{nm} \bigr)^2 ,    (4.21)

where A_i denotes the set of samples belonging to class i. During the training of neural networks with unbalanced training data, a general error function such as Eq. (4.16) or Eq. (4.21) cannot lead to a balanced classification performance on all classes in the data set, because majority classes contribute more to the error than minority classes and therefore receive more weight adjustments. In supervised training algorithms, neural networks are constructed by minimizing a network error function whose variables are the weights connecting layers. Thus, the training procedure is biased towards frequently occurring classes.

In order to increase the contribution of minority classes in the weight adjustments, we change Eq. (4.21) to:

E(W) = \frac{1}{2} \sum_{i=1}^{M} \beta_i \sum_{X_n \in A_i} \sum_{m=1}^{M} \bigl( y_m(X_n) - t_{nm} \bigr)^2 ,    (4.22)

where β_i is the weight given to class i, chosen inversely proportional to the class size N_i. Differentiate E with respect to w_{mj}, and let


\frac{\partial E(W)}{\partial w_{mj}} = 0 .    (4.24)

Substituting Eq. (4.22) into Eq. (4.24), we obtain:

\sum_{i=1}^{M} \beta_i \sum_{X_n \in A_i} \bigl( y_m(X_n) - t_{nm} \bigr)\, \phi_j(X_n) = 0 .    (4.25)

We introduce a new parameter r_n replacing β_i:

r_n = \beta_i \quad \text{when } X_n \in A_i .    (4.26)

Substituting Eq. (4.26) into Eq. (4.25), we obtain:

\sum_{n=1}^{N} r_n \Bigl( \sum_{k} w_{mk}\, \phi_{nk} - t_{nm} \Bigr)\, \phi_{nj} = 0 ,    (4.27)

where φ_{nj} = φ_j(X_n). Similarly to the derivation in [22], this leads to the following new pseudo-inverse equation for calculating the weight matrix W:

\bigl( \Phi^{\mathrm{T}} \Phi \bigr)\, W^{\mathrm{T}} = \Phi^{\mathrm{T}} T .    (4.29)

Different from the pseudo-inverse equation shown in Eq. (4.12), here the elements of Φ and T are replaced by φ_{nj} → φ_{nj}\sqrt{r_n} and t_{ni} → t_{ni}\sqrt{r_n}.
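The following sketch illustrates one way to implement this weighted linear least-squares step with NumPy. The function name, the array layout (rows of Phi holding the hidden-unit outputs φ_nj of each sample, rows of T holding the targets t_ni), and the particular normalization β_i = N/(M·N_i) are illustrative assumptions, not the exact formulation of [103]:

```python
import numpy as np

def solve_output_weights(Phi, T, labels):
    """Solve the weighted pseudo-inverse equation (Phi'^T Phi') W^T = Phi'^T T'
    of Eq. (4.29), where the rows of Phi and T are scaled by sqrt(r_n)."""
    N = Phi.shape[0]
    counts = np.bincount(labels)          # N_i, the number of samples of each class
    beta = N / (len(counts) * counts)     # assumed choice: beta_i proportional to 1/N_i
    r = beta[labels]                      # r_n = beta_i when X_n belongs to class i (Eq. 4.26)

    sqrt_r = np.sqrt(r)[:, None]
    Phi_w = Phi * sqrt_r                  # phi_nj -> phi_nj * sqrt(r_n)
    T_w = T * sqrt_r                      # t_ni  -> t_ni  * sqrt(r_n)

    # Linear least-squares solution of the weighted normal equations
    W_T, *_ = np.linalg.lstsq(Phi_w, T_w, rcond=None)
    return W_T.T                          # weight matrix W, shape (M, number of hidden units)
```

Given the solved W, the network outputs for a set of patterns are obtained as Y = Phi_new @ W.T.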

As indicated in the above equations, we have taken the unbalanced data into consideration when training RBF neural networks. The parameters r_n introduce biased weights that are inversely related to the proportions of the classes in the data set. The effect of the weight parameters r_n is shown in Sect. 4.5.4. Compared with the training method that does not consider the unbalanced condition in the data, the classification accuracy of the minority classes is improved.

We also allow large overlaps between clusters of the same class to reduce the number of hidden units [102][104].

The modified training algorithm for RBF neural networks, in which small overlaps between clusters of different classes and large overlaps between clusters of the same class are allowed, is used in this section.

4.5.4 Experimental Results

The car evaluation data set in Chap. 3 is used here to demonstrate our algorithm. The data set is divided into three parts, i.e., training, validation, and test sets. Each experiment is repeated five times with different initial conditions and the average results are recorded.

We generate an unbalanced car data set based on function 5 shown in Chap. 3. There are nine attributes and two classes: Class A and Class B.

Samples which do not meet the conditions of Class A are samples of Class B in the car data set. There are 4000 patterns in the training data set and 2000 patterns in the testing data set, of which 507 and 205 patterns, respectively, belong to class 1 (Class A). The 2000 testing patterns are further divided into a validation set and a testing set of 1000 patterns each. Class A is the minority class and Class B is the majority class.

Classification error rates and the numbers of hidden units are compared between allowing only small overlaps and allowing large overlaps among clusters of the same class. When large overlaps among clusters of the same class are allowed, the number of hidden units is reduced from 328 to 303, while the classification error rate on the test data set increases only slightly, from 4.1% to 4.5%.

Table 4.6 compares the overall classification error rates with and without considering the unbalanced condition; here large overlaps are allowed between clusters with the same class label. Table 4.6 also shows that, when the unbalanced condition in the data set is taken into account, the classification error rate of the minority class decreases from 34.65% to 8.73%, while the error rate of the majority class increases slightly from 1.37% to 4.1%. Since, in most cases, the minority class carries important information, improving the individual accuracy of the minority class is critical.

In this section, we described a modification [103] to the algorithm for constructing and training the RBF network on unbalanced data, which increases the bias towards the minority classes. Each class is given a weight in the MSE function that is inversely proportional to its number of patterns. Experimental results show that the proposed method is effective in improving the classification accuracy of minority classes while maintaining the overall classification performance.
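As an illustrative calculation (assuming the simple normalization β_i = N/(M·N_i), which is one possible choice and not necessarily the exact formula used in [103]): for the car data set of Sect. 4.5.4, with N = 4000 training patterns, N_A = 507 and N_B = 3493, this gives β_A = 4000/(2 × 507) ≈ 3.94 and β_B = 4000/(2 × 3493) ≈ 0.57, so each minority-class pattern contributes roughly seven times as much to the error function as a majority-class pattern.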

4.6 Summary

In this chapter, we described a modified training algorithm for RBF neural networks, which we proposed earlier [107]. This modified algorithm leads to fewer hidden units while maintaining the classification accuracy of RBF classifiers. Training is carried out without knowing in advance the number of hidden units and without making any assumptions on the data.

We described two useful modifications to Roy et al.'s algorithm for the construction and training of an RBF network, by allowing for large overlaps among clusters of the same class and dynamically determining the cluster overlaps of different classes.

Table 4.6. Comparison of classification error rates of the RBF neural network for each class of the car data set, with and without considering the unbalanced condition, when allowing large overlaps between clusters with the same class label (average results of five independent runs). (© 2005 IEEE) We thank the IEEE for allowing the reproduction of this table, which first appeared in [103].

                                            Training set   Validation set   Testing set
Without considering unbalanced condition
  Overall error rates                       1.89%          5.0%             4.8%
  Class A                                   11.69%         27.69%           34.65%
  Class B                                   0.77%          2.41%            1.37%
Considering unbalanced condition
  Overall error rates                       1.2%           5.1%             4.5%
  Class A                                   4.27%          4.58%            8.73%
  Class B                                   0.85%          5.15%            4.1%

In RBF neural network classifiers, larger overlaps between clusters of different classes lead to higher classification errors. However, large overlaps between clusters with the same class label do not degrade classification performance, since such overlaps occur within a single class; with this modification, the number of hidden units is reduced while the classification error rate is reduced or maintained.

The ratio between the number of patterns of a certain class (in-class patterns) and the total number of patterns in a cluster represents the overlap between different classes. A dynamic parameter θ is applied to control this ratio according to the training condition: if the number of trials for finding a qualified cluster reaches a certain threshold, θ is decreased and the search for clusters continues, as sketched below.
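A minimal sketch of this control logic is given below. The function and parameter names, the initial value of θ, its decrement, and the trial threshold are hypothetical placeholders; the actual schedule used in the algorithm may differ:

```python
def search_qualified_cluster(propose_cluster, theta=0.95, theta_min=0.5,
                             theta_decay=0.05, max_trials=20):
    """Repeatedly propose candidate clusters until one is 'qualified', i.e. its
    ratio of in-class patterns to all patterns reaches the current theta.

    propose_cluster : callable returning (in_class_count, total_count) for a
                      newly generated candidate cluster (a hypothetical hook
                      into the clustering step of the training algorithm)
    """
    trials = 0
    while theta >= theta_min:
        in_class, total = propose_cluster()
        if total > 0 and in_class / total >= theta:
            return in_class, total, theta     # qualified cluster found
        trials += 1
        if trials >= max_trials:              # too many failed trials:
            theta -= theta_decay              # relax the purity requirement
            trials = 0
    return None                               # give up below the minimum theta
```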

The two modifications may help reduce detrimental effects from noisy patterns and isolated patterns while maintaining classification performance.

There are two training stages in the training algorithm. In the first stage, the widths and centers of the Gaussian kernel functions are determined by searching for clusters based on the proposed modifications. In the second stage, the weights connecting the hidden layer and the output layer are determined by the LLS method. Experimental results show that the modifications are effective in reducing the number of hidden units while maintaining or even increasing the classification accuracy.
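As an illustration of the quantities involved in the two stages, the sketch below (assuming NumPy and hypothetical array names) computes the hidden-layer output matrix from the centers and widths found in the first stage, together with the one-of-M target matrix used in the second stage:

```python
import numpy as np

def gaussian_design_matrix(X, centers, widths):
    """Hidden-layer outputs phi_nj = exp(-||X_n - c_j||^2 / (2 * sigma_j^2)).

    X       : (N, d) training patterns
    centers : (J, d) kernel centers from the first training stage
    widths  : (J,)   kernel widths from the first training stage
    """
    # Squared Euclidean distance between every pattern and every center
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths[None, :] ** 2))

def one_hot_targets(labels, num_classes):
    """Target matrix t_nm: 1 for the true class of pattern X_n, 0 elsewhere."""
    T = np.zeros((len(labels), num_classes))
    T[np.arange(len(labels)), labels] = 1.0
    return T
```

The output weights then follow from a linear least-squares solve of Φ Wᵀ = T, or of its weighted form in Eq. (4.29) when the unbalanced condition is considered.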

This new approach can feasibly be used for classification when the underlying distributions of the data are unknown. Its accuracy is comparable with that of Roy et al.'s method [264], although its computational time is greater. Based on the experimental results, there is room for further research to speed up the training algorithm. In future work, the present approach could be enhanced by analyzing the relationships among the clusters in order to improve classification accuracy and reduce computational time.

In addition, a new algorithm was presented for the construction and training of an RBF neural network with unbalanced data. In applications, minority classes with far fewer samples are often present in data sets, and the learning process of a neural network is usually biased towards classes with majority populations. Our study focused on improving the classification accuracy of minority classes while maintaining the overall classification performance.

5

Attribute Importance Ranking for Data Dimensionality Reduction

Large-scale data can only be handled with the aid of computers. However, processing commands may need to be entered manually by data analysts, and data mining results can be fully used by decision makers only when the results can be understood explicitly. The removal of irrelevant or redundant attributes helps us make decisions and analyze data efficiently.

Data miners are expected to present discovered knowledge in an easily understandable way. Data dimensionality reduction (DDR) is an essential part of the data mining process. Drawing on methods from pattern recognition and statistics, DDR is developed to fulfill objectives such as improving the accuracy of prediction models, scaling data mining models, reducing computational cost, and providing a better understanding of the extracted knowledge.

5.1 Introduction

DDR plays an important role in data mining tasks, since semi-automated or automated methods perform better on lower-dimensional data, from which irrelevant or redundant attributes have been removed, than on higher-dimensional data. Irrelevant or redundant attributes carry no useful information and often interfere with useful attributes. In the classification task, the main aim of DDR is to reduce the number of attributes used in classification while maintaining an acceptable classification accuracy.

The problem of DDR is to select a subset of attributes which represents the concept of the data without losing important information. Feature (attribute) extraction and feature selection are two techniques of DDR. LDA (linear discriminant analysis) [168][198] and PCA (principal component analysis) [166] are common feature extraction methods. However, through the transformation operation in feature extraction, new features are generated which are linear or non-linear combinations of the original features, and unwanted artifacts often come with these new features. In addition, a non-linear transformation is usually not reversible, which brings difficulties in understanding the data through the extracted features.

Feature selection does not generate unwanted artifacts, since it is carried out in the original measurement space; dimensionality reduction is achieved by removing redundant or irrelevant attributes without losing the original concept of the data.

In optimal feature selection, all possible feature combinations would have to be inspected. Though some methods have been explored to reduce this work [43], the high computational cost remains an unsolved problem. Under this circumstance, suboptimal feature selection algorithms are an alternative. Though suboptimal feature selection algorithms do not guarantee the optimal solution, the selected feature subset usually leads to higher performance of the induction system (such as a classifier).

One wishes to find a measure that can identify irrelevant attributes at little computational cost. Consider two samples with different class labels in a data set, each represented by a set of attributes. Differences are observed in the two samples' attributes, i.e., there are correlations between attributes and class labels. Irrelevant attributes do not reflect this correlation when changing from one sample to another, so the correlations may be used to rank attribute importance.

On the other hand, a large distance between classes is desirable in order to distinguish different classes. Irrelevant attributes have no positive influence on separating distinct classes, and the removal of redundant attributes has no negative influence on forming distinct classes. Hence, class separability can be used as a criterion to evaluate attribute importance.

Feature selection can be performed based on the evaluation of attribute importance. Dash et al. [71] proposed an entropy measure to rank attribute importance. In mutual-information-based feature selection (MIFS) [18][27], the information content of each attribute (feature) is evaluated with respect to the classes and the other attributes, and the importance level of each attribute is determined by this evaluation criterion. However, the number of attributes included in the selected attribute subset has to be predefined, which requires prior knowledge of the data. Kononenko [180] introduced the Relief-F method to rank attribute importance in order to reduce data dimensionality. In the Relief-F method, for a given instance, nearest neighbors are searched for from each class, the difference in each attribute is calculated for each pair of instances, and the importance level of the attribute is evaluated from the probabilities of these differences.
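For illustration, a heavily simplified two-class, Relief-style sketch is shown below (one nearest hit and one nearest miss per sampled instance, attributes assumed scaled to [0, 1]); the full Relief-F of [180] averages over k neighbours from every class and weights the misses by class priors:

```python
import numpy as np

def relief_scores(X, y, num_iterations=100, rng=None):
    """Rank attributes by a simplified Relief score.

    X : (N, d) data matrix, assumed scaled to [0, 1] per attribute
    y : (N,) binary class labels
    """
    rng = np.random.default_rng(rng)
    N, d = X.shape
    scores = np.zeros(d)
    for _ in range(num_iterations):
        i = rng.integers(N)
        dist = np.abs(X - X[i]).sum(axis=1)               # Manhattan distance to all samples
        dist[i] = np.inf                                   # exclude the instance itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))      # nearest neighbour of the same class
        miss = np.argmin(np.where(~same, dist, np.inf))    # nearest neighbour of the other class
        # Attributes that differ for the miss but not for the hit are rewarded
        scores += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return scores / num_iterations
```

Attributes can then be ranked by sorting the returned scores in descending order.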

In this chapter, we describe a novel separability-correlation measure (SCM), first proposed in [107], for determining the importance of the original attributes. Different attribute subsets obtained from the attribute ranking results are then used as inputs to RBF classifiers. The classification results are used to evaluate the feature subsets in order to reduce the data dimensionality, and the RBF network architecture can be simplified with the reduced attribute subsets. The SCM includes two parts: the ratio of the intraclass distance to the interclass distance, and an attribute-class correlation measure.
