
Analysis and assessment of credit rating model in P2P lending : an instrument to solve information asymmetry between lenders and borrowers


Analysis and Assessment of Credit rating model in P2P lending

An instrument to solve information asymmetry between lenders and borrowers

By

Yang Yang

B.Sc. Management of Science and Project University of Science and Technology of China, 2007

SUBMITTED TO THE MIT SLOAN SCHOOL OF MANAGEMENT IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR DEGREE OF

MASTER OF SCIENCE IN MANAGEMENT STUDIES AT THE

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

JUNE 2015

© 2015 Yang Yang. All rights reserved.

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Signature of Author: [signature redacted]
MIT Sloan School of Management
May 8, 2015

Certified by: [signature redacted]
Christian Catalini
Assistant Professor of Technological Innovation, Entrepreneurship, and Strategic Management
Thesis Supervisor

Accepted by: [signature redacted]
Michael A. Cusumano
SMR Distinguished Professor of Management
Program Director, M.S. in Management Studies Program
MIT Sloan School of Management

Analysis and Assessment of Credit rating model in P2P lending

An instrument to solve information asymmetry between lenders and borrowers

By

Yang Yang

Submitted to the MIT Sloan School of Management on May 8, 2015 in Partial Fulfillment of the Requirements for the Degree of Master of Science in Management Studies.

ABSTRACT

Since the establishment of the first P2P lending platform in 2005, the P2P lending industry has been nibbling away at the market share of traditional consumer credit. In 2014, Lending Club and Prosper originated over 7 billion USD in personal loans. By comparison, Citi, one of the biggest traditional banks in the U.S., issued 25.2 billion USD in 2014. Given the advantages of P2P lending over traditional banks, the market for P2P lending is expected to grow rapidly as the internal systems of P2P lending platforms improve, external regulation matures, and more borrowers and lenders participate. Although most P2P lending platforms in China first imitated the business models of U.S. or European platforms, they have progressively evolved to incorporate different business models for legislative, economic or behavioral reasons.

Several findings emerge from analyzing the data from Lending Club and Prosper. First, although both platforms have progressively improved their default rates each year, both currently offer negative returns for investors. Second, when only finished/matured loans are considered, a higher credit score does not lead to lower default risk. Third, on average, a defaulted loan costs investors more than twice as much as the interest return the loan offers. Taking this cost matrix into consideration, the optimal model is not necessarily the one with the highest accuracy but the one with the maximum return. Fourth, the ex post return offered by the platforms is not enough to cover the potential risk facing investors.

Thesis Supervisor: Christian Catalini

Title: Assistant Professor of Technological Innovation, Entrepreneurship, and Strategic Management


PURPOSES OF THIS PAPER

It has been almost 10 years since the first P2P lending platform was founded in the UK. Although P2P lending has grown rapidly over that period, it is still in its infancy compared to the traditional banking industry. More than 70 academic papers on P2P lending appeared between 2008 and 2015, approaching it from different perspectives: the determinants of a loan being successfully funded by investors, regulation, credit risk, the determinants of credit quality and default probability, the business models of P2P lending across countries, internal information systems, and literature reviews.

Even though a handful of papers studied credit risk using data mining methodologies, most of them focused on explaining the determinants of a loan being successfully funded. Few studies considered a cost matrix in the model or compared results from Prosper and Lending Club. P2P lending is a two-sided market. To further boost market growth, P2P lending platforms also need to enhance the ability of investors to assess credit risk. By doing so, platforms can offer higher returns and thus attract more investors to lending activity.

The main purpose of this paper is to identify the key determinants of a loan's default probability and their respective coefficients, and then to build the optimal model to predict the loan's status. This model is intended to mitigate information asymmetry in P2P lending and gaming behavior by borrowers. In addition, this paper takes a dynamic review of the current development of P2P lending, building on previous literature.

Another motivation for this paper is that the Chinese government has just allowed non-state-owned companies to enter the personal credit rating business. The public believes this move will be a game changer for the internet finance industry, especially the P2P lending segment. This paper will assess whether a 3rd party credit rating will help investors avoid adverse selection.


Table of Contents

1. Introduction
1.1 Definition of P2P Lending
1.2 How P2P Lending Works (Lending Club, Prosper)
2. Market Review of P2P Lending
2.1 Market Size
2.2 Key Players and Respective Marketplace
2.3 Market Outlook of P2P Lending
2.4 Business Models of P2P Lending
3. Data Analysis and Modeling
3.1 Introduction
3.2 Key Variables
3.2.1 Prosper
3.2.2 Lending Club
3.3 Distribution of Dataset
3.3.1 Prosper
3.3.2 Lending Club
3.4 Model Building and Interpretation: Lending Club
3.4.1 Data Preparation
3.4.2 Model Building
3.4.3 Model Interpretation
3.4.4 Robustness Check
3.5 Model Building and Interpretation: Prosper
3.5.1 Data Preparation
3.5.2 Model Building
3.5.3 Model Interpretation
3.5.4 Robustness Check
3.6 Comparison of Findings in Model Building for Lending Club and Prosper
3.6.1 Similarities
3.6.2 Differences
3.6.3 Lessons for China's P2P Lending
4. Conclusion
4.1 Conclusion of This Paper
4.2 Further Research Proposed
5. References

1. Introduction

Freedman and Zhe Jin (2007) wrote the first academic paper to look into the business of P2P lending. They asked whether P2P lending would reshape the future of the financial industry or prove to be a fad that would wane over time. Even though more than six years have passed since that paper, it is still too early to answer that question. What we do see in the market is the emergence of more P2P lending platforms globally and the IPO of Lending Club in December 2014. The attitude of traditional banks toward this infant industry is also evolving. For instance, in early 2014, an employee of Wells Fargo told the media that an internal email had been sent asking all Wells Fargo employees not to engage in any P2P lending business. By contrast, many hedge funds and regional banks are purchasing personal loan products from P2P lending platforms because of their stable and attractive returns. In addition, more traditional financial institutions have opened their own P2P platforms to catch up with the trend.

1.1 Definition of P2P Lending

P2P stands for Peer-to-Peer or Person-to-Person. In P2P lending, platforms act as intermediaries that match lenders with borrowers and handle the transfer of money. P2P lending was first introduced by Zopa in the UK in 2005. At the time of writing, Zopa has originated 713 million GBP and is one of the biggest platforms in the world. The emergence of P2P lending is also a result of applying web 2.0 to the financial industry. By avoiding the overhead cost and infrastructure of traditional banks, P2P lending platforms can offer lower interest rates to borrowers and accumulate huge traffic within a short period (Dhand et al., 2008).

1.2 How P2P Lending Works (Lending Club, Prosper)


Borrowers apply for personal loans for various reasons. The main purpose of personal loans on Lending Club and Prosper is credit consolidation. A borrower applies for a loan by providing private information such as the loan amount, term, credit rating score, debt-to-income ratio, monthly income, occupation and loan purpose. The platform then assesses the information and sets a fixed interest rate for the loan. Once the borrower accepts the interest rate, the loan is listed on the platform, where investors can browse the loan information and decide whether to invest and how much.

Of the 73 papers on P2P lending published between 2008 and 2015, 20 discuss how to increase the likelihood of a loan being successfully funded and what the key determinants are. Compared with unverified variables, verified variables play a much more significant role in the decision to invest in a loan (Gregor et al., 2010). Borrowers who are willing to disclose more information also tend to pay lower interest rates (Böhme et al., 2010). Social ties increase the chances of having the loan fully funded (Sergio, 2009; Greiner & Wang, 2009; Herrero-Lopez, 2009; Hildebrand & Rocholl, 2010; Lin, 2009), reduce the ex post interest charged on the loan, and decrease the default risk associated with the loan (Lin et al., 2009; Zhensheng, 2014). Furthermore, some research focuses on how demographic information about borrowers, such as appearance and gender, contributes to loan funding. Research shows that appearance does influence lenders' decisions to fund a loan (Jefferson et al., 2012). Female borrowers are less likely to get loans funded than male borrowers.

Based on all the information provided by the borrower, investors then need to determine

whether to lend and how much to lend. The objective of lending money on P2P platforms is

to gain high return and mitigate default risk. Investors on P2P lending platforms are inclined

to invest in loans with higher ex post return, which also carry higher default risk. Assessing


Eight papers built models to investigate the key determinants of default risk, so that investors can use them as a guideline to avoid adverse selection. Loans with lower credit grades and longer terms carry higher default risk (Riza et al., 2015). This finding is the opposite of the result in this paper because, rather than using either completed loans or matured/finished loans, I used a combination of both. There are discrepancies between the risk premiums charged and the real default risk associated with loans on P2P lending platforms (Kumar, 2007). This conclusion is supported by evidence that the premium charged by P2P platforms is not enough to cover the potential loss of investors (Riza et al., 2015). It has also been recommended that default risk be mitigated by setting up a social reputation system on P2P lending platforms (Everett, 2010; Lin, 2009).

Platforms charge borrowers a loan origination fee once the loan is successfully funded. Investors are also charged a service fee for the management of installment payments from borrowers. A handful of papers focused on building the internal information systems of P2P platforms. For instance, Collier (2010) informed practice and theory on developing community reputation, which can reduce information asymmetry on Prosper and mitigate adverse selection. In addition, as intermediaries in the financial market, platforms are regulated by both the SEC and the CFPB. Four papers examined the current regulations on P2P lending and drew implications for the further development of regulation specific to P2P lending. A multi-agency regulatory approach to P2P lending should be implemented that imitates the approach applied to traditional lending (Eric et al., 2012).

Borrowers make monthly installment payments until the loan reaches maturity. If desired, they can also repay all remaining principal ahead of the loan's maturity by paying a service fee. Platforms also provide a trading system for investors who want to sell the loans they hold at a certain discount. This trading system, like an open market, gives investors more flexibility.
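The monthly installment mentioned above can be illustrated with the standard level-payment (annuity) formula. The sketch below is only an illustration: it assumes a fixed nominal annual rate compounded monthly and ignores origination and service fees, and the function names are hypothetical rather than taken from either platform.

```python
def monthly_installment(principal: float, annual_rate: float, months: int) -> float:
    """Level monthly payment on a fully amortizing fixed-rate loan."""
    r = annual_rate / 12.0  # assumption: nominal annual rate, compounded monthly
    if r == 0:
        return principal / months
    return principal * r / (1.0 - (1.0 + r) ** -months)


def amortization_schedule(principal: float, annual_rate: float, months: int):
    """Return one (month, interest, principal_repaid, balance) row per payment."""
    payment = monthly_installment(principal, annual_rate, months)
    r = annual_rate / 12.0
    balance = principal
    rows = []
    for month in range(1, months + 1):
        interest = balance * r       # interest accrues on the outstanding balance
        repaid = payment - interest  # the rest of the payment reduces principal
        balance -= repaid
        rows.append((month, interest, repaid, balance))
    return rows


if __name__ == "__main__":
    schedule = amortization_schedule(10_000, 0.12, 36)
    print(f"monthly payment: {monthly_installment(10_000, 0.12, 36):.2f}")
    print(f"balance after the last payment: {schedule[-1][3]:.6f}")
```

Because early payments are mostly interest, a loan that defaults in its first months has repaid very little principal, which is why early defaults are so costly to investors.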

However, some loans default in the early stages of installment payments. This causes a large loss for investors as a whole. Because each investment is small, investors are inclined not to hire an agency to recover their net principal loss (Freedman & Jin, 2008). Further research into after-default management in P2P lending is urgently needed because it can help reduce investors' net principal loss and improve the risk-adjusted return of platforms as a whole.

2. Market Review of P2P Lending

2.1 Market Size

The potential market size of P2P lending can be measured in both a micro and a macro way. In the micro view, the market for P2P lending is mainly the market for unsecured loans, including unsecured personal loans and lines of credit. The total amount of consumer credit in the U.S. as of October 2014 was 3.283 trillion USD, according to the Federal Reserve G.19 release. Per the Federal Reserve E.2 release, the total amount of outstanding business loans ranging from $10,000 to $99,000 is 3.4 billion USD. Summing these two components gives a potential market size for P2P lending of 3.286 trillion USD in the U.S. market alone. Currently, Prosper contributes 2 billion USD in loans, and Lending Club contributes 6 billion USD.

In the macro view, we can expand the market to mid-size business loans, since Lending Club also provides business loans of up to 300K USD. The total amount of business loans ranging from 100K to 999K USD is 12 billion USD (Donghon, 2014). Conservatively, we can add another 2.4 billion USD to the potential P2P lending market. This results in a market with a total size of 3.288 trillion USD. Investors on P2P lending platforms are poised to take between 25 percent and 30 percent of the business that traditional banks are doing. The overall market for P2P lending would then grow to about $1 trillion by 2025 (Cromwell, 2015).
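The additions in this section are easy to check. The short sketch below recomputes the micro and macro totals from the figures quoted in the text; the variable names are illustrative only, and no data beyond those figures is assumed.

```python
TRILLION = 1e12
BILLION = 1e9

# Figures quoted in the text (USD).
consumer_credit = 3.283 * TRILLION       # Federal Reserve G.19, October 2014
small_business_loans = 3.4 * BILLION     # business loans of $10,000 to $99,000
mid_business_loans = 12 * BILLION        # business loans of 100K to 999K
mid_business_addressable = 2.4 * BILLION # conservative portion added in the text

micro_total = consumer_credit + small_business_loans
macro_total = micro_total + mid_business_addressable

# Share of mid-size business loans that the conservative estimate includes.
conservative_share = mid_business_addressable / mid_business_loans

# Loans originated so far by Prosper and Lending Club, relative to the micro total.
originated = (2 + 6) * BILLION
penetration = originated / micro_total

if __name__ == "__main__":
    print(f"micro total: {micro_total / TRILLION:.4f} trillion USD")
    print(f"macro total: {macro_total / TRILLION:.4f} trillion USD")
    print(f"share of mid-size business loans counted: {conservative_share:.0%}")
    print(f"current penetration of the micro market: {penetration:.2%}")
```

The penetration figure makes the point of the section concrete: the two largest U.S. platforms together account for only a small fraction of the potential market.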

2.2 Key Players and Respective Marketplace

Rank  Lending Site  Country  Year Founded  Loan Volume ($ billion)
1     Lending Club  USA      2007          6
2     CreditEase    China    2006          3.2
3     Upstart       USA      2012          3
4     Prosper       USA      2006          2
5     Zopa          UK       2005          0.8

Lending Club. Lending Club, founded in 2007, has paid investors $590 million in interest. According to statistics from Lending Club's website, as of September 30, 2014, 83.17% of Lending Club borrowers reported that they use loans from Lending Club to refinance existing loans or pay off their credit cards. The breakdown of the main purposes of Lending Club loans is shown below.

[Figure: breakdown of the main purposes of Lending Club loans]

Prosper. Prosper, founded by Chris Larsen and John Witchel on February 5, 2006, was the first P2P lending platform in the U.S. It remains unlisted and is financially backed by several big names in venture capital. To date, Prosper has more than 2 million members and has originated over 2 billion USD in loans.

Upstart. Upstart was founded by ex-Googlers in 2012 in the U.S. and has originated more than $3 billion in loans, with an annual growth rate of 265%. The major difference between Upstart and other platforms is that, when assessing the credit quality of borrowers, Upstart starts with the same information but also includes academic variables in order to make the risk assessment more statistically grounded.

CreditEase. As reported by Peter Renton in 2014, CreditEase is the largest P2P lending platform in China and has generated more than $3.2 billion USD in loans to over 500,000 borrowers. The company was founded in 2006 and now operates in over 150 cities in China.

Zopa. Zopa is the oldest peer-to-peer lending company in the world. The company was founded in 2005 in the UK. It has lent $1 billion USD and has helped both borrowers and investors get better rates.

2.3 Market Outlook of P2P Lending

The growth of P2P lending has exceeded the public's expectations in recent years. Gartner (2010) predicted that P2P lending would increase by 66% to a total size of 5 billion USD by the end of 2013. Looking at the statistics of the biggest platforms, I found that Lending Club experienced an annual growth rate of over 150% through 2014. Prosper.com has also grown exponentially since its establishment. By the end of 2013, it had originated over 300 million USD in loans, and it raised this number to over 1.5 billion USD by the end of 2014.

Although it is extremely difficult to estimate the exact growth rate of P2P lending, several determinants indicate its future trend from a macro perspective.

1) Geographic expansion. P2P lending is not yet fully authorized in all states of the U.S. owing to the complexity of state autonomy. Even in China, the acceptance of P2P lending varies among regions. Further geographic expansion can be expected in the next few years.

2) More comprehensive legislation. The main reason that certain public authorities or groups are still skeptical about P2P lending is that it is still in its infancy and is less regulated than traditional banks. Regulations specific to P2P lending are urgently needed in the market.

3) Challenges from traditional banking. P2P lending has a large cost advantage over traditional banks. However, with the recovery of the U.S. economy, the government is considering loosening the requirements for loan borrowers, which would help traditional banks regain borrowers who currently are not entitled to a loan. In China, many financial institutions have also introduced their own P2P platforms to gain a piece of the pie.

4) Information asymmetry. Information asymmetry might lead investors to adverse selection (Akerlof, 1970) and moral hazard (Stiglitz and Weiss, 1981). Platforms are making various efforts to mitigate information asymmetry.

5) Bottom line of the economy and employment. The performance of both the economy and employment will affect the further development of P2P lending. According to the statistics from Prosper and Lending Club, the purpose of most borrowers is credit consolidation. A stronger economy and improved wages and employment rates mean that people's financial condition will be better and the need for credit consolidation will decline accordingly.

6) Institutional investors. P2P lending can provide a higher ROI than many other investments in the financial market. Some institutional investors purchase loan packages from platforms to gain stable cash flow and return.

A simple comparison among different financial investments is listed below. In 2013, P2P lending generated a much lower return than the NYSE and Dow Jones indices, but it outperformed both in 2014. However, for P2P lending platforms I am using the official investment return rate, and the true risk-adjusted investment return might differ from this figure. It is also worth noting that the superior stock market return in 2013 was due to the recovery from an economic and financial downturn. An ROI of around 10% is already very impressive in the financial investment sector. As reported by Bloomberg, the average return of hedge funds was 7.4% in 2013.

Year  Lending Club  Prosper  3-yr Treasury  NYSE    Dow Jones
2014  10.50%        9.79%    1.10%          4.22%   7.52%
2013  8.75%         9.86%    0.78%          23.18%  26.50%

By the end of 2014, the total amount of loans originated through P2P lending in China had reached $40 billion, with a default rate of 17.46%. 1.16 million borrowers got their loans [...] with numbers of 2013 respectively. There are 1,575 P2P lending platforms in China, and 275 went bankrupt in 2014, implying that roughly one out of six platforms was not sound. The average loan amount and the average amount funded per individual investor are $35,000 USD and $64,000 USD, respectively. These statistics come from Wangdaizhijia.com in China.

2.4 Business Models of P2P Lending

This section introduces the business models used by major P2P lending platforms in the U.S. and China and addresses the major differences between the two markets.

In the U.S. market, the business models of P2P lending platforms are quite similar to each other. Borrowers post their loans on platforms, and investors browse and choose loans to invest in. The P2P lending platform acts as an intermediary and is responsible for risk rating, determining the interest rate, document verification and interest payment management. However, Prosper and Lending Club still differ in several ways, as listed below.

1) Loan type. Prosper only originates personal loans ($2,000-$35,000 USD), while Lending Club originates business loans of up to $300,000 USD as well as personal loans ranging from $1,000 to $35,000 USD. In addition, Prosper and Lending Club provide loans with different maturities. Both provide 3-year and 5-year loans, and Lending Club provides a 1-year loan as well.

2) Interest rate. P2P platforms determine the interest rate by considering information that reflects borrowers' credit quality. Both Prosper and Lending Club stipulate a cap and a floor on the interest rate for loans in each credit rating/grade. However, the interest rate in the same credit category varies between Prosper and Lending Club because of their different credit rating logic.

3) Credit scoring. Prosper and Lending Club each provide a proprietary credit score as a major indicator of loan risk. Both offer 7 rating categories: Prosper from HR (worst) to AA (best) and Lending Club from G (worst) to A (best).

4) Origination fee. Platforms earn money by charging fees to borrowers. The cap and floor fee rates charged by Prosper and Lending Club are the same, but different rates are charged to borrowers in different risk categories. A simple comparison is listed below, including the credit rating, the respective interest rate and the origination fee.

Prosper                                   Lending Club
Rating  Interest Rate  Origination Fee    Rating  Interest Rate  Origination Fee
AA      6.05%-7.96%    1%-2%              A       5.49%-8.19%    [...]-3%
A       8.19%-11.33%   4%                 B       8.67%-11.99%   4%-5%
B       11.56%-14.06%  5%                 C       12.39%-14.99%  5%
C       14.59%-18.27%  5%                 D       15.59%-17.86%  5%
D       19%-22.68%     5%                 E       18.54%-21.99%  5%
E       23.44%-27.04%  5%                 F       22.99%-25.57%  5%
HR      27.75%-31.25%  5%                 G       25.8%-26.06%   5%

5) Affiliate & referral programs. Prosper runs an affiliate program to attract more borrowers and lenders through referrers, paying $100-150 USD for borrower leads and $50 USD for lender leads. Lending Club has also introduced an affiliate and referral program, but detailed bonuses are not provided on its website.

6) Notes trading. Both Prosper and Lending Club provide a notes trading platform where investors can trade the notes they hold with each other. Folio is a broker-dealer platform that charges sellers only 1%.

7) Early repayment. Borrowers can choose to pay off the remaining balance without paying any penalty, in order to avoid paying monthly interest in the future.

[...] based on the information provided by the borrowers. However, in its early years, Prosper ran an interest rate auction in which investors bid the lowest interest rate they would accept, competing to fund the most popular loans. This is why some loans were originated at a lower interest rate. Prosper stopped the interest rate auction in 2011 and adopted a fixed interest rate, as Lending Club does.
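The text gives only the outline of the auction, so the following is a hedged sketch of how such a lowest-rate auction could work, not a description of Prosper's actual rules. It assumes that the borrower names a maximum acceptable rate, that the lowest bids are filled first, and that every winning investor receives the rate of the last bid needed to fund the loan. The `Bid` type and `run_rate_auction` function are hypothetical names.

```python
from typing import List, NamedTuple, Optional, Tuple


class Bid(NamedTuple):
    investor: str
    amount: float    # amount the investor is willing to lend
    min_rate: float  # lowest annual interest rate the investor accepts


def run_rate_auction(
    loan_amount: float, max_rate: float, bids: List[Bid]
) -> Tuple[Optional[float], List[Tuple[str, float]]]:
    """Fill the loan with the lowest-rate bids first.

    Returns (clearing_rate, allocations). The clearing rate is the rate of the
    last bid needed to fund the loan (an assumption of this sketch). If the
    eligible bids cannot cover the loan, returns (None, []).
    """
    eligible = sorted(
        (b for b in bids if b.min_rate <= max_rate), key=lambda b: b.min_rate
    )
    remaining = loan_amount
    allocations = []
    clearing_rate = None
    for bid in eligible:
        if remaining <= 0:
            break
        taken = min(bid.amount, remaining)
        allocations.append((bid.investor, taken))
        remaining -= taken
        clearing_rate = bid.min_rate
    if remaining > 0:
        return None, []  # the listing is not fully funded
    return clearing_rate, allocations


if __name__ == "__main__":
    bids = [
        Bid("a", 500, 0.10),
        Bid("b", 500, 0.08),
        Bid("c", 500, 0.12),
        Bid("d", 500, 0.20),
    ]
    print(run_rate_auction(1000, 0.15, bids))
```

Under these assumptions, competition among investors for a popular listing pushes the clearing rate below the borrower's maximum, which is consistent with the observation that some auction-era loans were originated at lower interest rates.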

In China's market, P2P lending platforms basically follow the same model as those in the U.S., acting as intermediaries between borrowers and lenders. However, because of differences in the economic and legal environment, as well as in customer behavior, P2P lending in China has evolved some unique features. We use Hongling Capital and CreditEase as representatives, since they are two of the earliest P2P platforms to originate in China.

1) Loan type. Hongling Capital offers personal and business loans of between $500 and $1,600,000 USD, with maturities between 3 months and 12 months. CreditEase offers personal loans of between $1,600 USD and $1,000,000 USD, with maturities between 1 year and 4 years. Clearly, P2P lending platforms in China's market are more aggressive and also bear higher default risk.

2) Interest rate. Rather than determining the interest rate based on credit score, maturity and amount, as P2P platforms in the U.S. do, China's platforms determine interest rates simply based on loan type or maturity, because there is no credit agency that can provide a comprehensive credit report for individuals (China's PBOC only authorized certificates for credit agencies in January 2015). Hongling Capital sets interest rates between 8% and 18%, and CreditEase between 10% and 12.5%.

3) Credit scoring. The only credit report that a borrower can submit is the one provided by the PBOC, which includes the history of credit card usage and loan repayment. Unlike U.S. platforms, Chinese platforms do not rate borrowers into different credit categories. It is common practice for platforms to award credits to borrowers and investors when they successfully make a monthly payment or an investment. For instance, Hongling Capital sorts customers into 5 categories from V1 (lowest) to V5 (highest). Investors on Hongling Capital can refer to these categories as a risk indicator.

4) Origination fee. CreditEase charges investors 10% of interest earnings and borrowers 10% as a service fee. Rates and fees on Hongling Capital are more complex. Hongling Capital charges investors from 0% to 10% in fees, depending on their category from V1 to V5. For instance, investors in V1 pay 10% of interest earnings as a service fee, and those in V5 pay no service fee. For borrowers, Hongling Capital also charges various percentages of the loan as a service fee, based on the loan type. The overall range is from 3% to 14.6%.

5) Affiliate & referral programs. CreditEase does not pay a referral bonus, while Hongling Capital pays $6 USD if the referred customer registers as a normal member and $12 USD if the customer registers as a VIP.

6) Notes trading. Platforms in China also provide notes trading services to investors.

7) Early repayment. On CreditEase, borrowers who want to pay off the remaining loan early must pay, in addition to the interest for the current month, the remaining loan and the service fee, a penalty of 0.5% of the remaining loan to the platform. Similarly, borrowers on [...] the remaining loan earlier.

8) Principal guarantee. The biggest difference between the U.S. and China in P2P lending is that many platforms in China bring in a 3rd party company to guarantee the safety of investors' money in case any fraudulent funding happens. This is a remedy for the lack of credit scores for borrowers and platforms, and it improves investors' confidence. However, a 3rd party guarantee is not a panacea for P2P lending in China. A guarantee company certificate costs only $1 million USD, and there are cases where owners disappeared with the money, leaving investors to lose everything.

3. Data Analysis and Modeling

3.1 Introduction

This section addresses the following questions:

1) What are the distributions of PV, the rate of bad loans and interest across the different credit categories?

2) Does the risk-return profile improve from year to year, especially when platforms change their policies?

3) Are there any behavioral differences among borrowers and investors between Prosper and Lending Club?

4) How much do the determinant variables contribute to the performance of loans?

5) Can a model be built to determine the probability of default using different data mining methodologies?

6) As found by Riza, Yanbin, Benjamas and Min in 2014, the higher interest rate set by Prosper and Lending Club for riskier loans is not enough to compensate investors for the potential loss they are exposed to. This section uses an FCFF methodology to test this conclusion, taking into account the time value of future cash flows.
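To make the time-value argument concrete, the sketch below discounts an investor's monthly cash flows on a single loan, with and without a default. It is a plain discounted cash flow illustration under simplifying assumptions (level payments, monthly compounding, no fees, an optional recovery paid one month after the last installment) and is not the FCFF procedure used in this paper. All function and parameter names are hypothetical.

```python
from typing import List, Optional


def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Level payment on a fully amortizing loan (nominal rate, monthly compounding)."""
    r = annual_rate / 12.0
    if r == 0:
        return principal / months
    return principal * r / (1.0 - (1.0 + r) ** -months)


def loan_cash_flows(
    principal: float,
    annual_rate: float,
    months: int,
    default_after: Optional[int] = None,
    recovery_rate: float = 0.0,
) -> List[float]:
    """Investor cash flows by month, starting with the outlay at month 0.

    If default_after is k, the borrower makes k payments and then stops;
    recovery_rate times the outstanding balance is received one month later.
    """
    payment = monthly_payment(principal, annual_rate, months)
    r = annual_rate / 12.0
    flows = [-principal]
    balance = principal
    for month in range(1, months + 1):
        if default_after is not None and month > default_after:
            flows.append(recovery_rate * balance)
            break
        interest = balance * r
        balance -= payment - interest
        flows.append(payment)
    return flows


def npv(cash_flows: List[float], annual_discount_rate: float) -> float:
    """Net present value of monthly cash flows at a nominal annual discount rate."""
    r = annual_discount_rate / 12.0
    return sum(cf / (1.0 + r) ** t for t, cf in enumerate(cash_flows))


def expected_npv(p_default: float, npv_repaid: float, npv_defaulted: float) -> float:
    """Probability-weighted NPV over the two outcomes considered here."""
    return (1.0 - p_default) * npv_repaid + p_default * npv_defaulted


if __name__ == "__main__":
    repaid = npv(loan_cash_flows(10_000, 0.12, 36), 0.05)
    defaulted = npv(loan_cash_flows(10_000, 0.12, 36, default_after=6), 0.05)
    print(f"NPV if fully repaid:  {repaid:,.2f}")
    print(f"NPV if default at 6m: {defaulted:,.2f}")
    print(f"expected NPV at 10% default probability: "
          f"{expected_npv(0.10, repaid, defaulted):,.2f}")
```

Comparing the two NPVs shows why a modest default probability can erase the interest premium on a risky grade: the loss from one early default outweighs the gain from several fully repaid loans.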


3.2 Key Variables

3.2.1 Prosper

Variable name                Type     Definition
Credit Rating                Numeric  Proprietary credit rating by P2P lending platforms
Loan Status                  Dummy    Whether the loan is active, completed or defaulted
Borrower Rate                Numeric  Interest rate the borrower is willing to pay
Borrower APR                 Numeric  Actual rate the borrower needs to pay, considering service cost
Lender Yield                 Numeric  Actual rate lenders receive, considering service cost
Listing Category             Dummy    The purpose of the loan
Employment Duration          Numeric  The period of employment up to the creation of the listing
Is Borrower home owner       Numeric  Whether the borrower owns real estate
Current Credit Line          Numeric  The number of credit lines the borrower holds
OpenRevolvingMonthlyPayment  Numeric  The monthly payment on revolving accounts
RevolvingCreditBalance       Numeric  The current credit balance of revolving accounts
BankcardUtilization          Numeric  The percentage utilization of revolving credit balance
AvailableBankcardCredit      Numeric  The total amount of bank card credit at the creation of the loan
TradesNeverDelinquent        Numeric  The percentage of trades that have never been delinquent
DebtToIncomeRatio            Numeric  The ratio of debt to income, as a percentage
StatedMonthlyIncome          Numeric  Monthly income stated by the borrower
LoanOriginalAmount           Numeric  The original amount of the loan
Investors                    Numeric  The number of investors who fund the loan
Terms                        Numeric  The term length of the loan

Both Prosper and Lending Club define "bad loans" as loans that are 60+ days past due within

the first twelve months from the date of loan origination.
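The definition of a bad loan above can be written as a simple rule. The sketch below is a minimal illustration; the function names and the sample records are hypothetical and are not taken from either platform's dataset.

```python
def is_bad_loan(days_past_due, months_since_origination):
    """True if the loan is 60+ days past due within its first twelve months."""
    return days_past_due >= 60 and months_since_origination <= 12


def bad_loan_rate(loans):
    """Share of bad loans in a list of (days_past_due, months_since_origination)."""
    flags = [is_bad_loan(dpd, months) for dpd, months in loans]
    return sum(flags) / len(flags)


# Hypothetical records, for illustration only.
sample = [(0, 6), (75, 9), (90, 20), (61, 12)]
print(bad_loan_rate(sample))  # 2 of the 4 loans are flagged, so 0.5
```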

3.2.2 Lending Club

| Variables | Type | Definition |
|---|---|---|
| Grade | Dummy | The proprietary credit rating of Lending Club |
| loan_status | Dummy | The current status of the loan |
| int_rate | Numeric | The interest rate the borrower needs to pay |
| Purpose | Dummy | The purpose of the loan |
| emp_length | Numeric | The length of employment of the borrower |
| home_ownership | Dummy | Whether the borrower owns or rents a home |
| open_acc | Numeric | The number of open credit lines of the borrower |
| revol_util | Numeric | The current ratio of credit balance utilization |
| dti | Numeric | The debt to income ratio |
| annual_inc | Numeric | The amount of annual income |
| loan_amnt | Numeric | The amount of the loan |
| installment | Numeric | The amount of monthly payment |
| Term | Numeric | The term length of the loan |

3.3 Distribution of Dataset

3.3.1 Prosper

When depicting the distribution of loan characteristics, we exclude current listings and cancelled listings that were never completed and funded. In addition, records with the proprietary credit rating "NC" are excluded due to incomplete information; those loans were originated in early 2006 and 2007, when Prosper was in its infancy. There are 113 records that are missing a proprietary credit rating. We assume that these records will not influence the validity of our analysis, given their small number.

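| Credit Category | Successful Rate | Number of Loans | Total Amount | Average Amount | Median Amount | STDEV of Amount | Default Rate |
|---|---|---|---|---|---|---|---|
| AA | 30% | 6,487 | 61,402,940 | 9,466 | 12,000 | 6,664 | 11% |
| A | 23% | 10,479 | 101,490,254 | 9,685 | 11,000 | 6,664 | 16% |
| B | 25% | 12,023 | 117,411,802 | 9,764 | 12,000 | 8,345 | 22% |
| C | 29% | 14,892 | 125,436,437 | 8,423 | 10,000 | 7,044 | 28% |
| D | 47% | 15,259 | 96,539,254 | 6,326 | 7,500 | 5,853 | 31% |
| E | 49% | 10,286 | 43,717,649 | 4,250 | 4,000 | 2,629 | 37% |
| HR | 76% | 8,846 | 27,031,067 | 3,056 | 3,500 | 1,323 | 46% |

| Credit Category | AA | A | B | C | D | E | HR |
|---|---|---|---|---|---|---|---|
| Term: 1 year | 3% | 3% | 3% | 2% | 2% | 3% | 0% |
| Term: 3 years | 93% | 86% | 79% | 76% | 83% | 87% | 100% |
| Term: 5 years | 4% | 12% | 19% | 22% | 15% | 10% | 0% |
| Interest rate | 8.9% | 11.4% | 15.4% | 18.9% | 23.6% | 28.3% | 29.3% |
| $/investor | 53 | 73 | 87 | 104 | 91 | 103 | 89 |
| Credit Score | 791 | 738 | 712 | 682 | 667 | 640 | 621 |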

There are several features of the dataset distribution of Prosper. 1) Surprisingly, the success rate of a listing being funded as a loan increases as credit worsens. This might be caused by the higher interest rates paid on loans with worse credit ratings. 2) The majority of loans are rated C and D, consistent with our expectation that most loans on Prosper (and on most P2P lending platforms) come from borrowers with poor credit records. 3) From the best credit rating to the worst, the average and median loan amounts generally decline, mainly because of the limits placed by P2P platforms. 4) The default rate climbs as credit gets worse: 11% for AA loans versus 46% for HR loans. 5) As expected, the interest rate increases as credit quality declines. An assessment in the following section tests whether the interest rate advised by Prosper is enough to cover the potential loss. 6) There is a trend that, for loans with poor credit ratings, investors tend to place more money on each investment.

[Figure: Number of Loans. Number of loans and average amount by Prosper rating, AA to HR]

[Figure: Borrower Rate vs. Prosper Rating, with smoothed borrower rate]

Percentage of Total Loans by amount

| Year | AA | A | B | C | D | E | HR | Default Rate |
|---|---|---|---|---|---|---|---|---|
| 2006 | 7.5% | 7.7% | 9.3% | 11.2% | 9.8% | 8.9% | 45.6% | 39.2% |
| 2007 | 15.4% | 16.8% | 19.9% | 21.3% | 15.3% | 6.2% | 5.2% | 39.5% |
| 2008 | 23.3% | 19.5% | 23.2% | 17.4% | 11.2% | 3.0% | 2.5% | 33.0% |
| 2009 | 21.6% | 24.9% | 6.9% | 17.9% | 13.6% | 5.2% | 9.8% | 15.2% |
| 2010 | 16.1% | 20.9% | 14.7% | 9.0% | 19.5% | 8.3% | 11.5% | 16.7% |
| 2011 | 7.3% | 17.5% | 16.9% | 9.1% | 27.1% | 16.7% | 5.5% | 22.6% |
| 2012 | 7.3% | 17.7% | 18.1% | 22.8% | 18.9% | 5.4% | 9.9% | 31.2% |
| 2013 | 4.6% | 16.5% | 24.5% | 31.2% | 14.9% | 6.8% | 1.5% | 23.6% |
| 2014 | 6.8% | 18.3% | 24.0% | 29.4% | 1.4% | 6.5% | 13.6% | 24.5% |

7) Year by year, more investors switch from the A and AA classes to riskier loans, especially to loans rated B and C. This trend might be caused by investors seeking higher interest rates, as well as by the improved default rate within each credit category. 8) Both the overall default rate and the default rate of each credit category have trended downward. However, investors are becoming more risk-averse. The improvement in default rates can be explained by Prosper getting better at risk screening and verification.

(When calculating the default rate, loans originated after Q2 2014 are excluded from the dataset, because these loans have not yet had time to become more than 60 days past due, the point at which a loan is considered to be in default.)

Default rate YoY

| Year | AA | A | B | C | D | E | HR | Overall |
|---|---|---|---|---|---|---|---|---|
| 2006 | 8.8% | 16.7% | 24.7% | 36.2% | 35.8% | 48.8% | 64.8% | 39.2% |
| 2007 | 14.3% | 25.8% | 33.3% | 41.1% | 42.8% | 53.2% | 62.2% | 39.5% |
| 2008 | 18.3% | 25.6% | 32.9% | 33.4% | 37.4% | 43.6% | 52.5% | 33.0% |
| 2009 | 6.0% | 9.3% | 16.8% | 15.4% | 22.4% | 22.3% | 23.7% | 15.2% |
| 2010 | 3.9% | 9.8% | 11.2% | 15.3% | 21.4% | 24.9% | 25.4% | 16.7% |
| 2011 | 2.9% | 9.4% | 15.5% | 14.9% | 24.8% | 32.1% | 31.0% | 22.6% |
| 2012 | 8.1% | 9.3% | 14.1% | 20.1% | 23.9% | 25.9% | 28.5% | 31.2% |
| 2013 | 4.1% | 2.8% | 4.6% | 7.5% | 10.8% | 13.1% | 13.6% | 23.6% |
| 2014 | 8.7% | 0.4% | 0.7% | 1.2% | 1.6% | 2.5% | 1.7% | 24.5% |

3.3.2 Lending Club

| Credit Category | Successful Rate | Number of Loans | Total Amount | Average Amount | STDEV of Amount | Default Rate |
|---|---|---|---|---|---|---|
| A | 32.6% | 20,076 | 213,245,525 | 10,622 | 6,586 | 8.5% |
| B | 28.8% | 33,882 | 402,115,200 | 11,868 | 6,861 | 17.2% |
| C | 26.7% | 27,641 | 352,094,900 | 12,738 | 7,769 | 24.2% |
| D | 28.3% | 17,980 | 246,222,500 | 13,694 | 8,426 | 30.8% |
| E | 29.1% | 8,484 | 148,964,150 | 17,558 | 9,505 | 36.4% |
| F | 33.6% | 3,772 | 73,021,450 | 19,359 | 9,225 | 43.5% |
| G | 33.6% | 916 | 20,171,950 | 22,022 | 8,417 | 43.2% |

1) There is no significant difference in the success rate of a listing being funded across credit categories on Lending Club. 2) Loans are concentrated in the better credit grades, from A to D, in terms of both number of loans and total amount. 3) Unlike on Prosper, lower-credit loans on LC tend to be larger than higher-credit loans. This indicates that LC considers the loan amount as a factor when rating loans. 4) There is no significant shift in investors' risk aversion from year to year on Lending Club. 5) The default rate of LC is much lower than that of Prosper in each year and in each category, but this does not mean that the overall risk return that Lending Club [...] following sections. 6) Interest rates for loans of the same credit rank on LC and Prosper are similar. 7) Default rates show an improving trend from 2007 to 2010. I do not take years after 2011 into consideration, since most of those loans are still in the regular payment process, whereas most loans originated in earlier years are either fully paid or in default.

Percentage of Loans by credit grade-LC

| Year | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| 2007 | 22.7% | 24.3% | 29.9% | 14.7% | 5.6% | 2.8% | 0.0% |
| 2008 | 18.9% | 32.5% | 28.0% | 14.2% | 4.8% | 1.3% | 0.3% |
| 2009 | 25.0% | 28.9% | 25.3% | 13.9% | 5.0% | 1.4% | 0.5% |
| 2010 | 24.3% | 30.7% | 21.4% | 14.0% | 6.9% | 2.1% | 0.8% |
| 2011 | 26.5% | 30.2% | 18.1% | 12.9% | 8.0% | 3.3% | 0.9% |
| 2012 | 20.4% | 34.7% | 22.3% | 13.7% | 6.0% | 2.5% | 0.5% |
| 2013 | 13.1% | 32.7% | 28.3% | 15.3% | 6.7% | 3.3% | 0.6% |
| 2014 | 14.2% | 26.6% | 28.1% | 18.9% | 8.7% | 2.8% | 0.8% |

Default Rate YoY-LC

| Year | A | B | C | D | E | F | G | Overall |
|---|---|---|---|---|---|---|---|---|
| 2007 | 1.8% | 13.1% | 18.7% | 40.5% | 35.7% | 28.6% | 0.0% | 17.9% |
| 2008 | 5.8% | 14.6% | 17.8% | 24.3% | 16.0% | 47.6% | 50.0% | 15.8% |
| 2009 | 6.7% | 11.4% | 14.8% | 17.4% | 21.6% | 17.2% | 34.8% | 12.6% |
| 2010 | 4.7% | 11.1% | 14.5% | 18.6% | 22.5% | 30.0% | 28.4% | 12.6% |
| 2011 | 6.6% | 11.5% | 16.8% | 20.9% | 23.8% | 28.1% | 31.5% | 14.1% |
| 2012 | 6.3% | 11.0% | 15.1% | 19.1% | 23.4% | 25.6% | 30.7% | 13.2% |
| 2013 | 1.7% | 4.4% | 7.4% | 10.8% | 12.8% | 17.0% | 16.6% | 6.9% |
| 2014 | 0.5% | 1.1% | 1.8% | 2.8% | 3.8% | 5.8% | 5.8% | 1.9% |

[Figure: Number of Loans by Risk Category. Number of loans and average amount by credit grade, A to G]

[Figure: Interest Rate Range by Risk Category. Interest rate vs. credit grade, with smoothed curve]

3.4 Model Building and Interpretation-Lending Club

This section contains five steps. First, prune the datasets of Lending Club and Prosper for model building. Second, select variables and build the logistic model to predict the default probability. Third, interpret the significance of each variable and compare the estimates with expectations. Fourth, choose alternative data models to predict the loan status, as well as net profit/loss, and compare the results with the conclusions drawn from logistic regression. Last, as a robustness check, I will test the assumption of linearity between the predicting variables and the target, and explore the nonlinear relationship between the target and each individual predicting variable.

3.4.1 Data Preparation

In the data preparation, I tried to incorporate only parameters that can be verified to some extent. Some variables, such as loan purpose, can be freely fabricated by borrowers. Even if a model built on such subjective parameters performed well, its reliability would be questionable.

1) Homeownership. The original options for this variable include "Rent", "Own", "Mortgage", "None" and "Other". We create a dummy variable that takes the value 1 for "Own" or "Mortgage" and 0 for the rest.

2) There are over 300,000 rows of data; all current listings are excluded from the dataset since we're aiming to detect any indicators of risks from an investor's perspective.

3) Loan status is the target to predict. A loan status of "0" represents loans that have already finished all payments or are still in the payment process. "1" represents default loans, including charged-off loans, defaults, and delinquencies of more than 31 days (there are only two categories for delinquent loans: 30 days or less, and more than 31 days). Initially, there are 87,880 "completed" loans listed on Lending Club, while my interest is in loans that have either finished all payments or already been declared in default. Keeping that in mind, I further split completed loans into two categories: paid and in-process. Only 5,509 of the completed loans have finished all payments; the remaining 82,371 are still in the payment process. However, as shown in the graph below, 50% of bad loans were declared in default before the [...] month, and 75% before the 17th month. This implies that many of the 82,371 loans that have not finished all payments are likely to pay off all installments eventually. Therefore, in order to provide a reliable data model and mitigate bias toward completed loans, I treat completed loans that have paid at least 17 installments as finished loans and assume that they will not default in the future. By doing this, I get 38,555 good loans (finished all payments) and 24,871 bad loans (default or charged off).

[Figure: Number of months paid vs. loan status (0 or 1)]

4) Income verified. "0" means that the income is not verified, while "1" means that the income is verified.

5) Independent variables involved in the regression: loan amount, term, employment length, homeownership, annual income, whether the income is verified, debt-to-income ratio, FICO credit score, number of open accounts, revolving credit balance, utilization ratio of the revolving credit balance, and total accounts. I excluded the variable "purpose" from the model due to the low reliability of the value that borrowers enter when they apply for the loan.

6) The whole dataset is randomly partitioned into 43,426 training rows and 20,000 validation rows.

7) Profit/cost matrix. I need a cutoff value in order to classify the predictions into 0 or 1. To do that, I first need to compute the profit/cost matrix for Lending Club. There are 63,426 loans in the dataset, including 38,555 good loans and 24,871 bad loans. Good loans generate a profit of $108,339,408 on a total original amount of $450,364,975, representing an ROI of 24.1%. Bad loans cost investors a total loss of $219,172,141 on a total original amount of $350,771,625, representing a negative ROI of 62.5%. Finished loans as a whole cause a loss of $110,832,732 on a total amount of $801,136,600, representing a negative ROI of 13.8%. It may be surprising that the real ROI that Lending Club offers to investors is much lower than the one it advertises on its website. The profit/cost matrix is shown below.

Profit Matrix

| Actual loan status | Predicted 0 | Predicted 1 |
|---|---|---|
| 0 | 1 | -1 |
| 1 | -2.6 | 0 |

3.4.2 Model Building
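As a check on the profit matrix that closes Section 3.4.1, and which the models in this section rely on, the ROI figures of item 7) can be reproduced with a few lines of arithmetic. The sketch below uses only the totals quoted in the text; reading the -2.6 entry as the loss on a funded bad loan per unit of profit on a good loan is an assumed interpretation, since the text does not spell out how the entry was derived.

```python
# Totals quoted in item 7) of Section 3.4.1 (USD).
good_profit, good_principal = 108_339_408, 450_364_975
bad_loss, bad_principal = 219_172_141, 350_771_625

roi_good = good_profit / good_principal
roi_bad = -bad_loss / bad_principal
roi_all = (good_profit - bad_loss) / (good_principal + bad_principal)

# Loss on a bad loan expressed in units of the profit on a good loan
# (assumed to be the origin of the -2.6 entry in the profit matrix).
bad_vs_good = roi_bad / roi_good

print(f"{roi_good:.1%} {roi_bad:.1%} {roi_all:.1%} {bad_vs_good:.1f}")
# prints: 24.1% -62.5% -13.8% -2.6
```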

Before building the model in each step, I selected variables based on the R-Square, AIC and BIC rules, and then compared the performance of models using different variable combinations. 1) R-Square-oriented stepwise selection removes open_acc from the model. 2) Minimum AIC recommends further removing home_ownership from the model. 3) Minimum BIC gives the same result as AIC, excluding open_acc and home_ownership from the model. Detailed results are listed below.

Stepwise selection results ([X] marks a parameter entered into the model)

| Parameter | Sig Prob | Maximize RSquare | Minimum AIC | Minimum BIC |
|---|---|---|---|---|
| Intercept[1] | 1 | [X] | [X] | [X] |
| loan_amnt | 8.30E-70 | [X] | [X] | [X] |
| term | 3.00E-233 | [X] | [X] | [X] |
| emp_length | 5.00E-15 | [X] | [X] | [X] |
| home_ownership | 0.51441 | [X] | | |
| annual_inc | 1.30E-41 | [X] | [X] | [X] |
| is_inc_v | 6.81E-09 | [X] | [X] | [X] |
| dti | 3.20E-84 | [X] | [X] | [X] |
| FICO_Score | 0 | [X] | [X] | [X] |
| open_acc | 0.88003 | | | |
| revol_bal | 3.76E-09 | [X] | [X] | [X] |
| revol_util | 4.57E-06 | [X] | [X] | [X] |
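The AIC and BIC criteria used in the selection above trade goodness of fit against the number of parameters. A minimal sketch using the standard definitions is shown below. The log-likelihood values are made up for illustration, the parameter counts (11 versus 10) simply mirror dropping one variable, and the software the thesis used to compute the criteria is not stated; only the 43,426 training rows come from the text.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: dropping one weak predictor barely changes the likelihood.
n_obs = 43_426
full_ll, reduced_ll = -23000.0, -23000.2

print("AIC:", aic(full_ll, 11), aic(reduced_ll, 10))
print("BIC:", bic(full_ll, 11, n_obs), bic(reduced_ll, 10, n_obs))
```

Under both criteria the model with the lower value is preferred, so a variable whose removal costs almost no likelihood, such as one with a Sig Prob of 0.51, tends to be dropped.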

Based on the results of variable selection, I ran the logistic regression. Estimates of parameters under the two slightly different variable combinations are listed below. There is no significant difference in value or sign between the two sets of results. The RSquare-oriented variable combination offers an RSquare of 0.2135, while the AIC/BIC-selected variable combination gives only a slightly lower RSquare of 0.2134.

Estimate

| Term | Maximize RSquare | Minimum AIC/BIC |
|---|---|---|
| Intercept | -10.66162 | -10.67306 |
| loan_amnt | -0.00003 | -0.00003 |
| term | -0.03942 | -0.03937 |
| emp_length | -0.02573 | -0.02533 |
| home_ownership | 0.01513 | N/A |
| annual_inc | 0.00001 | 0.00001 |
| is_inc_v | -0.13985 | -0.13967 |
| dti | -0.03298 | -0.03296 |
| FICO_Score | 0.01900 | 0.01902 |
| revol_bal | 0.00001 | 0.00001 |
| revol_util | 0.21735 | 0.21590 |

Since the model using parameters selected by RSquare stepwise selection offers a slightly better result, I computed the formula accordingly, as below.

$$P(\text{Default}) = \frac{1}{1 + e^{-\left(-10.66162 + \sum_i \beta_i X_i\right)}}$$

$\beta_i$: coefficient of parameter $i$

$X_i$: parameters
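The formula above can be evaluated directly from the "Maximize RSquare" estimates. The sketch below is a minimal illustration: the applicant's values are hypothetical, and the units assumed for each variable (term in months, dti in percent, revol_util as a fraction) are guesses, since the text does not state how the variables were scaled.

```python
import math

# Estimates from the "Maximize RSquare" column of the Estimate table.
INTERCEPT = -10.66162
COEFFICIENTS = {
    "loan_amnt": -0.00003,
    "term": -0.03942,
    "emp_length": -0.02573,
    "home_ownership": 0.01513,
    "annual_inc": 0.00001,
    "is_inc_v": -0.13985,
    "dti": -0.03298,
    "FICO_Score": 0.01900,
    "revol_bal": 0.00001,
    "revol_util": 0.21735,
}


def default_probability(loan):
    """Logistic transform of the linear score, as in the formula above."""
    score = INTERCEPT + sum(COEFFICIENTS[name] * loan[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical applicant; values and units are assumptions for illustration.
applicant = {
    "loan_amnt": 10_000, "term": 36, "emp_length": 5, "home_ownership": 1,
    "annual_inc": 60_000, "is_inc_v": 1, "dti": 15, "FICO_Score": 700,
    "revol_bal": 10_000, "revol_util": 0.5,
}
p = default_probability(applicant)
print(f"P(Default) = {p:.3f}")
```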

The confusion matrices generated from the two combinations are listed below. Both models achieve their best performance at a cutoff value of 0.44, meaning that a loan is classified as default if its default probability is greater than or equal to 0.44, and as good otherwise. The overall accuracy of the two combinations is close: 69.1% for the RSquare combination and 68.8% for AIC/BIC. The former does a better job of identifying good loans, while the latter is more accurate in identifying bad ones. Both combinations improve the overall ROI of Lending Club, to negative 1.2% with the AIC/BIC combination and to negative 1.7% with the RSquare combination. Even though the risk return after enhancement is still negative, progress has been made by eliminating about 12 percentage points of loss. Not surprisingly, there is a price to pay for improving the overall risk-adjusted return to investors. Applying this model means the overall volume of loan origination will decline by 37.8%; however, the improvement in risk-adjusted return can help P2P platforms build credibility and attract more investors, and thus more borrowers, in the long run.
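One way to arrive at a cutoff such as 0.44 is to scan candidate cutoffs and keep the one that performs best on the validation data. The sketch below assumes that "best performance" means the highest total profit under the profit matrix of Section 3.4.1; the text does not say whether profit or accuracy was maximized. The function names and the toy probabilities are hypothetical.

```python
# Profit matrix of Section 3.4.1, keyed by (actual status, predicted status).
PROFIT = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -2.6, (1, 1): 0.0}


def classify(prob_default, cutoff):
    """Predict default (1) when the probability reaches the cutoff."""
    return 1 if prob_default >= cutoff else 0


def total_profit(probs, actuals, cutoff):
    """Sum of profit-matrix entries over all loans at a given cutoff."""
    return sum(PROFIT[(a, classify(p, cutoff))] for p, a in zip(probs, actuals))


def best_cutoff(probs, actuals, step=0.01):
    """Scan cutoffs 0.01, 0.02, ..., 0.99 and return the most profitable one."""
    cutoffs = [round(i * step, 2) for i in range(1, int(1 / step))]
    return max(cutoffs, key=lambda c: total_profit(probs, actuals, c))


# Toy validation set (hypothetical): predicted probabilities and actual status.
probs = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
actuals = [0, 0, 0, 1, 0, 1]
cutoff = best_cutoff(probs, actuals)
print(cutoff, total_profit(probs, actuals, cutoff))
```

Because a funded bad loan costs 2.6 times what a good loan earns, the most profitable cutoff generally sits below 0.5, which is in line with the 0.44 reported in the text.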

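Confusion Matrix-RSquare

| Actual loan status | Predicted 0 | Predicted 1 |
|---|---|---|
| 0 | 9,180 | 2,923 |
| 1 | 3,256 | 4,621 |

Confusion Matrix-AIC/BIC

| Actual loan status | Predicted 0 | Predicted 1 |
|---|---|---|
| 0 | 8,959 | 3,144 |
| 1 | 3,099 | 4,778 |

3.4.3 Model interpretation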

In this section, I will analyze the parameter estimates obtained in model building and compare them with expectations [...]. When a parameter is said to have a positive impact on the default rate, it means that the higher the value of the parameter, the higher the default probability of the loan, and vice versa.

Several papers have also tried to interpret the impact of parameters. FICO_Score has a negative impact on the default rate, while the debt-to-income ratio and credit line utilization have a positive impact (Riza, Yanbin, Benjamas and Min, 2015). However, in the results of the model that only includes finished loans, some of the variable estimates are not intuitive. This section starts by interpreting the variables that run counter to our expectations and then goes through those that match them.

1) "loan_amnt" has a negative impact on the default probability. Normally, a higher loan amount gives the impression of higher risk, but it turns out that this is not the case.

2) The same applies to "term". Two term lengths are allowed on Lending Club: 36 and 60 months. Holding all other features constant, a 60-month loan does not carry a higher default risk than a 36-month loan. A possible explanation is that Lending Club only approves a longer-term loan if the borrower is more qualified.

3) "home_ownership". Owning real estate does not necessarily mean that a borrower is more creditworthy; the estimate actually points the opposite way.

4) "annual_inc". A higher income stated by the borrower when applying for a loan does not guarantee a better outcome. The impact of this variable should be considered together with "is_inc_v", which has a negative impact on the default rate.

5) "dti" (debt-to-income ratio). This ratio also has a negative impact on the default rate. This could be explained by some of the borrowers' income information being fictitious. Further research in this paper will include only loans with verified income to see whether the result differs.

6) The most surprising finding is that "FICO_Score" has a positive impact on the default rate. One might expect borrowers with a higher FICO score to have better credit quality, since a credit score backed by a third-party agency is normally very reliable. However, on Lending Club (and, as discussed later, in Prosper's model), the FICO score is not a good indicator of credit quality. Lenders cannot simply base their decision on this score, which is what many investors actually do.

7) "revol_util" and "revol_bal" have a positive impact on the default rate, which is consistent with expectations. Because the majority of borrowers on Lending Club apply for loans to consolidate personal credit lines, a higher balance and utilization ratio indicate greater financial pressure in paying back the balance.
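The sign-based reading of the estimates above can be made quantitative: in a logistic model, raising a predictor by one unit multiplies the odds of the modeled outcome by exp(β). The sketch below applies this standard identity to three of the "Maximize RSquare" estimates; the helper name `odds_ratio` and the choice of a 10-point FICO increase are illustrative assumptions.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Multiplicative change in the odds when a predictor rises by `delta`."""
    return math.exp(beta * delta)


# Selected estimates from the "Maximize RSquare" column.
estimates = {"revol_util": 0.21735, "dti": -0.03298, "FICO_Score": 0.01900}

for name, beta in estimates.items():
    print(name, round(odds_ratio(beta), 4))

# Effect of a 10-point increase in FICO_Score rather than a single point.
print("FICO_Score +10:", round(odds_ratio(estimates["FICO_Score"], 10), 4))
```

A ratio above 1 corresponds to what the text calls a positive impact, and a ratio below 1 to a negative impact.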

3.4.4 Robustness Check

Besides building the model to predict the nominal target, I also considered using the same predicting variables to predict a numeric target, net profit/loss, to check whether numeric regression outperforms logistic regression. As in the previous section, I pruned the predicting variable combination using the RSquare, AIC and BIC criteria and list the result below. All three ways of ruling out variables give us the same result: to keep all variables in the linear regression model.

| Entered | Parameter | Estimate |
|---|---|---|
| [X] | Intercept | -13687.535 |
| [X] | loan_amnt | -0.1754355 |
| [X] | term | -106.27022 |
| [X] | emp_length | -44.95143 |
| [X] | annual_inc | 0.00523282 |
| [X] | is_inc_v | -239.96839 |
| [X] | dti | -96.287356 |
| [X] | FICO_Score | 27.2358601 |
| [X] | revol_bal | 0.01584572 |
| [X] | revol_util | 1126.27626 |

Looking at the estimates of variables in the linear regression, they make more intuitive sense than the results from the logistic regression. For instance, "loan_amnt", "term" and "dti" have negative coefficients with net profit, in the sense that the higher the value of the variable, the lower the profit or the higher the loss that the loan will cause investors. By contrast, FICO_Score and annual_inc have a positive effect on the loan's net profit/loss. The model generates an RSquare of 0.1072, which is significantly lower than the value from the logistic model. To further test which model is superior, I also drew the confusion matrix for the linear regression model by setting a profit/loss value as the cutoff between good and bad loans. At a cutoff value of net profit/loss of negative $2,100, the model achieves its highest accuracy of 67%, which can be broken down into 74% accuracy in identifying good loans and 55% accuracy in identifying bad loans. However, the performance of this model is still worse than that of the logistic model.

Confusion Matrix-RSquare

| Actual loan status | Predicted 0 | Predicted 1 |
|---|---|---|
| 0 | 9,152 | 3,146 |
| 1 | 3,422 | 4,258 |
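The accuracy figures quoted for the linear regression model follow directly from its confusion matrix. The sketch below recomputes them, assuming that rows are the actual loan status and columns the predicted status; this orientation is inferred from the two logistic confusion matrices, whose row totals are identical, and is not stated in the text.

```python
def confusion_metrics(matrix):
    """Overall and per-class accuracy from [[a0p0, a0p1], [a1p0, a1p1]]."""
    (good_ok, good_missed), (bad_missed, bad_ok) = matrix
    total = good_ok + good_missed + bad_missed + bad_ok
    return {
        "accuracy": (good_ok + bad_ok) / total,
        "good_loan_accuracy": good_ok / (good_ok + good_missed),
        "bad_loan_accuracy": bad_ok / (bad_missed + bad_ok),
    }


# Confusion matrix of the linear regression model (rows assumed to be actual).
linear = [[9152, 3146], [3422, 4258]]
m = confusion_metrics(linear)
print(f"{m['accuracy']:.0%} {m['good_loan_accuracy']:.0%} {m['bad_loan_accuracy']:.0%}")
# prints: 67% 74% 55%
```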

The different signs of the same parameter's coefficient for default probability and for net profit can be understood in two ways. First, the amount of net loss significantly outweighs that of net profit; therefore, the positive impact of FICO_Score or annual_inc cannot bring enough profit to push the net P/L into positive numbers. Second, it is nevertheless true that a higher FICO_Score and annual_inc can reduce the net loss if loans default, and can also increase the positive return if loans prove to be good.

I also used discriminant analysis and a neural network to classify good and bad loans and obtained the confusion matrices listed below. Both models outperform the logistic model in overall accuracy and in net profit when the cost matrix is applied to the results below. The overall accuracy of discriminant analysis is 68%, which breaks down into 70% accuracy for good loans and 65% for bad loans. Using the neural network, the accuracy turns out to be 69%, with 76% accuracy for good loans and 59% for bad ones. However, there are two key disadvantages of discriminant analysis and the neural network. One is that the structure of the model is non-transparent and users cannot interpret the importance of each parameter, so investors cannot easily apply the model when making investment decisions. Another disadvantage is that both models need
