
Application and Comparison of Classification Techniques in Controlling Credit Risk

1. Credit Risk and Credit Rating

In a broad sense, credit risk is the uncertainty or fluctuation of profit in credit activities. Attention centers on the possible future loss of credit assets. In other words, credit risk is the possibility that related factors change and affect credit assets negatively, thereby reducing the value of the bank as a whole (Yang, Hua & Yu 2003).

Credit activity in a commercial bank is strongly influenced by external as well as internal factors, such as macro-economic status, industrial situation and internal operations. For example, credit assets may suffer a loss because of an undesirable movement in interest rates, exchange rates or inflation. Management problems and inappropriate operations in commercial banks may also result in the loss of credit assets. Furthermore, a contractual counterparty may not meet its obligations stated in the contract, thereby causing the bank a financial loss. It is irrelevant whether the counterparty is unable to meet its contractual obligation due to financial distress or is unwilling to honor an unenforceable contract. This chapter focuses on the last kind of risk, i.e., the risk of breaching contracts.

A commercial bank is required to evaluate the credit level of clients when extending credit to them. The evaluation may not always be correct, and clients’ credit levels may change over time for various reasons. If the audit process lacks objectivity, loan quality may deteriorate and the bank faces a serious risk of losing credit assets. An effective instrument for reducing credit risk is a credit rating system, which has attracted considerable attention in China in recent years.

Credit rating is a qualified assessment and formal evaluation, by a credit bureau, of a company’s credit history and its capability of repaying obligations. It measures the default probability of the borrower and its ability to repay its financial debt obligations fully and on time (Guo 2003).

The credit ratings of companies are expressed as letter grades such as AAA, A, B, CC, etc. The highest rating is usually AAA, and the lowest is D. ‘+’ and ‘-’ can be attached to these letters to make them more precise. Credit rating provides an information channel for both the borrower and the lender, making the capital market work more effectively. Commercial banks can control their credit risk with the powerful support of credit ratings. Without credit ratings, a commercial bank would have to increase its charges to cover the extra risk due to information asymmetry. Therefore, the rated companies benefit from credit ratings too.

However, obtaining a company’s credit rating is usually very costly, since it requires credit bureaus to invest large amounts of time and human resources in a deep analysis of the company’s risk status, based on aspects ranging from strategic competitiveness to operational-level details (Huang, Chen, Hsu, Chen & Wu 2004). This leads to at least two major drawbacks of credit rating. First, it is not possible to rate all companies. The number of companies applying for credit rating is too large, and rating all of them is impractical. In addition, rating companies such as Moody’s and Standard & Poor’s, which rate other companies, will not rate companies that are unwilling to pay for this service. Second, credit rating cannot be performed frequently, so it cannot reflect in a timely manner the credit level of companies applying for loans. Usually, the rating work is carried out twice a year, and not all companies can afford to be rated very often. Therefore, intelligent approaches based on data mining are considered to support the credit activities of commercial banks. Such approaches have their own learning mechanisms and become intelligent after they have been trained on historical rating data. They can alert the bank as soon as they determine that a new client has a high risk of breaching a contract.

In past years, quantitative models such as linear discriminant analysis and neural networks have been applied to predict the credit level of new clients and have achieved satisfactory performance (Baesens 2003; Galindo & Tamayo 2000; Huang, Chen, Hsu et al. 2004; Pinches & Mingo 1973). For example, Pinches and Mingo applied discriminant analysis to bond rating data (Pinches & Mingo 1973). A logistic regression technique was also applied in this area (Ederington 1985). In addition to these traditional statistical methods, artificial intelligence techniques, such as case-based reasoning systems and neural networks, were adopted to improve prediction ability in credit rating.

Investigations of neural networks and numerous experiments revealed that such methods can normally reach higher accuracy than traditional statistical methods (Dutta & Shekhar 1988; Singleton & Surkan 1990; Moody & Utans 1995; Kim 1993; Maher & Sen 1997). Shin and Han proposed a case-based reasoning approach to predict the bond rating of firms, enhanced with genetic algorithms (GAs) that find an optimal or near-optimal weight vector for the attributes of cases in case matching (Shin & Han 1999). Good literature reviews can be found in Huang, Chen, Hsu et al. (2004) and Shin & Lee (2002).

This chapter aims at a more comprehensive application of classification techniques to credit rating. More specifically, it applies seven types of classification models: traditional statistical models (LDA, QDA and logistic regression), k-nearest neighbors, Bayesian networks (Naïve Bayes and TAN), a decision tree (C4.5), associative classification (CBA), neural networks, and SVM.

Though total accuracy is a commonly used measure for classification, in the credit rating context the cost of rating a bad client as a good one is much higher than that of rating a good client as a bad one. Thus, in this chapter receiver operating characteristic (ROC) curve analysis, which takes misclassification error into account, is considered. The value of AUC (area under the receiver operating characteristic curve) is taken as the performance evaluation criterion. The DeLong-Pearson method is applied to test whether there is a statistically significant difference between each pair of these classification techniques.
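To illustrate why AUC can be more informative than total accuracy here, the following is a minimal sketch in Python. It is not the chapter's own implementation: the function names `auc` and `accuracy`, the 0.5 threshold, and the toy labels and scores are all hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation, in which AUC equals the fraction of bad/good client pairs that the model ranks correctly, with ties counted as one half.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for a bad (defaulting) client, 0 for a good one.
    scores: model output; a higher score means "more likely to be bad".
    """
    bad = [s for y, s in zip(labels, scores) if y == 1]
    good = [s for y, s in zip(labels, scores) if y == 0]
    if not bad or not good:
        raise ValueError("AUC needs at least one client of each class")
    # Count bad/good pairs ranked correctly; a tie counts as half a pair.
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Total accuracy when clients scoring >= threshold are flagged as bad."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical toy data: two bad clients followed by six good ones.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
scores_a = [0.9, 0.4, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3]  # model A
scores_b = [0.9, 0.1, 0.3, 0.2, 0.1, 0.1, 0.2, 0.4]  # model B

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, accuracy(labels, scores), auc(labels, scores))
```

On this toy data both models have the same total accuracy (7 of 8 clients), yet model A ranks every bad client above every good one while model B scores one bad client among the safest. AUC separates the two models because it summarizes ranking quality over all thresholds instead of at a single cut-off.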

Furthermore, our study uses credit data from China, collected mainly by the Industrial and Commercial Bank of China. The rest of this chapter is organized as follows. The second section describes the data and indices used to train the models. The seven types of classification techniques are discussed in the third section, where the principles and characteristics of each technique are elaborated. The experimental settings, the classification performance and a statistical comparison are presented in the fourth section.