Analyzing the economic efficiency of eBay-like online reputation reporting mechanisms

(1)

MIT LIBRARIES DUPL

(2)

(3)

(4)

(5)

14 i-i

B

u

JI

n

e>i.."

M

I

T

MIT

Sloan School

of

Management

Sloan

Working

Paper

4181-01

eBusiness@MIT

Working

Paper

102 October

2001

ANALYZING

THE

ECONOMIC

EFFICIENCY

OF

EBAY-LIKE

ONLINE

REPUTATION

REPORTING

MECHANISMS

Chrysanthos

Dellarocas

This

paper

is

available

through

the

Center

for

eBusiness(aJMIT

web

siteat the

following

URL:

http://ebusiness.mit.edu/research/papers.html

This

paper

also

can be

downloaded

without

charge

from

the

Social

Science

Research

Network

Electronic

Paper

Collection:

http://papers.ssm.coin/abstract=289968

(6)

(7)

Analyzing

the

economic

efficiencyofeBay-iikeonlinereputationreporting

mechanisms^

ChrysanthosDellarocas

SloanSchool of

Management

MassachusettsInstituteofTechnology Cambridge,

MA

02139,

USA

dellfemitedu

Abstract

This paperintroducesa

model

foranalyzing marketplaces,suchaseBay,

which

relyonbinary reputation

mechanisms

forqualitysignalingandqualitycontrol. Inour

model

sellerskeeptheir actualqualityprivate

and choose

what

quality to advertise.

The

reputation

mechanism

isprimarilyusedtoinducesellersto

advertisetruthfully.Buyers basetheirratings

on

thedifference

between

expectedandactual quality.

Furthennore.ratersare lenientand

do

not post negativeratingsunless transactionsend upexceptionally bad. ItIS

shown

that, insucha setting, the fairnessofthemarket

outcome

isdetermined

by

the relationship

betweenratingleniencyand correspondingstrictness

when

assessinga seller's feedbackprofile. Ifbuyers judgesellerstoostrictly(relative to

how

lenientlytheyrate)then,atsteadystate,sellerswillbeforcedto

understatetheirtrue quality.

On

theother hand,ifbuyersjudgetoolenientlythensellerscanget

away

with

consistently overstatingtheirtrue quality.

An

optimal

judgment

rule,

which

resultsin

outcomes

where,at

steadystate,buyersaccurately predictthe truequalityofsellers,istheoreticallypossibleto

denve

forall

leniencylevels.Furthennore,ifbuyersjudgesellersusing that rule,thenthe

more

lenientbuyersare

when

ratingsellers,the

more

likelyitisthat sellers will finditoptimal tosettle

down

tosteady-state quality

levels,asopposedto oscillatingbetween

good

qualityand badquality.

However,

it isarguedthatthis

optimalruledepends onseveral parameters,

which

are difficult toestmiatefromtheinformationthat

eBay

currently

makes

availabletoits

members.

Itistherefore questionabletowhatextent unsophisticated buyers

arecurrently using

eBay

feedbackinfonnation inan optimalway.

'_This_is

_work

_in_progress. _{Suggestions and}

_comments

very

welcome.

Please addresscorrespondenceto

(8)

(9)

1. Introduction

Online reputation reportingsystemsareemergingasan important quality signalingandqualitycontrol

mechanism

inonline tradingcommunities (Kollock 1999;Resnicket. al.

2000

). Reputationsystemscollect feedback from

members

of anonline

community

regarding past transactionswithother

members

ofthat

community

Feedbackisanalyzed,aggregatedand

made

publicly availabletothe

community

in theform of

member

feedbackprofiles. Ifoneacceptsthatpastbehaviorisa relatively reliablepredictoroffuture

behavior,then theseprofilescanactas apowerfulqualitysignalingandqualitycontrol

mechanism,

essentiallyactingas the digitalequivalentofa

member's

reputation.

eBay

reliesonitsreputation

mechanism

almostexclusivelyinordertobothproducetrust and induce

good

behavioronthe partofits

members.

eBay

buyersandsellersareencouragedto rateone anotherattheend of eachtransaction.

A

ratingcanbeadesignationof"praise","complaint"or "neutral", togetherwitha

short text

comment.

eBay makes

the

sums

ofpraise,complaintandneutral ratingssubmittedforeach

member,

aswellasallindividual

comments,

publicly availabletoall itsusers.Anecdotalandempirical

results

seem

todemonstratethateBay'sreputationsystemhas

managed

toprovideremarkablestabilityin

anotherwiseveryrisky tradingenvironment

(Dewan

and

Hsu

2001; Resnick and Zeckhauser2001).

The

risingpractical importanceofonline reputationsystemsnotonlyinvitesbut rather necessitates

rigorous research

on

theirfunctioningand consequences.

Are

such

mechanisms

trulyreliable?

Do

they

promoteefficientmarketoutcomes']'

To

whatextent are theymanipulablebystrategicbuyersand sellers?

What

isthe best

way

todesign

them?

How

shouldbuyers (andsellers)usethe informationprovided

by

such

mechanisms

intheirdecision-makingprocess?Thisisjustasmall subsetof

unanswered

questions,

which

inviteexcitingandvaluable research.

The

studyofreputation asa

mechanism

forinducing

good

behaviorinmarkets with asymmetric

infonnationiscertainlynotnew.Severaleconomistshave publishedimportant

works

analyzingits

properties(Rogerson, 1983; Schmalensee, 1978; Shapiro, 1982;

Smallwood

andConlisk, 1979; Wilson,

1985,just to

name

afew).

Nevertheless,althoughpast

work

ineconomicshas studied

some

oftheoveralleffectsofreputation,ithas

paid verylittleattentiontotheanalysisofspecific

mechanisms

forformingand

communicating

reputation,

inpartbecauseintraditionalbrickand mortarsocietiessuch

mechanisms

arelargely informal (theyare

often referred toas"word-of-mouthadvertising")and defydetailedmodeling.

The

few publishedresults

focusingonthe effectsofspecificpropertiesofreputation

mechanisms

clearly

make

thepointthatsuch

propertiescan havesignificant effectsonthemarket outcome. For example,

Rogerson

(1983)

shows

that

reputationbasedonsubjective binaryratings(e.g.good/bad,praise/complaint) createsanexternality,

which

affectsthe entiremarket. Shapiro (1982)

shows

that,unless the

mechanism by which

reputationisformed

satisfiescertainproperties, sellers

may

finditoptimaltocontinuouslyoscillatein quality,periodically

building

good

reputationand subsequently milkingit.

On

theotherhand, the design and implementation ofonline reputationsystemshas sofarbeentheresearch

domain

of

computer

scientists(seeBreseeet. al., 1998;Sarwaret.al.,2000; Schaferet. al.2001 for

overviews ofpast work).

The

emphasis ofpast

work

inthearea hasbeen on developingalgorithmsand systemsforcollecting,aggregatingandextracting useful infonnation fromsetsofuserratings,drawing from

work

in infonnationretrieval,datamining andcollaborative filtering.

The

analysisand evaluationof

theproposedalgorithmsistypically

done

in termsof computational complexity andstatisticalmetrics,such

as theirrunningtime,

memory

requirements,averagerecall andprecision,averagebias,etc.

We

believethatthereisaneedfor

work

thatbridgesthe

two

disciplines:research,

which

takesintoaccount

thealgorithmicdetailsofspecificreputationsystemsbut alsomodels

how

thesesystemsare

embedded

inside tradingcommunities andinvestigatestheireffectivenessandimpact, notonlyintermsof

computationalandstatisticalproperties,butrather intermsoftheir overall impactintheefficiencyofthe

market andthewelfareofthevariousclassesofmarketparticipants.

Given

thatreputation systems were conceivedinorderto assistchoiceinenvironmentsofimperfect information,theirimpact inthoselatter

(10)

(11)

market dimensions should betheultimatedeterminantofsuccessof any

new

proposed

new

algorithmand

system.

This papercontributesin thisdirectionby proposinga

model

foranalyzingthe

economic

efficiencyof

binary reputation systems, suchas theone used

by

eBay.

Section2 introducesthe

model

anditsunderlying assumptions.

We

assume

thatbuyersatisfactionon

eBay

isafunctionofthedifferencebetweentheadvertisedand truequalityofanitem. Insucha setting, the

reputation

mechanism

isprimarilyusedtoinducesellers toadvertisetruthfully.Section 3 formalizesthis

miuition

mto

a

number

ofpropertiesthateBay-like reputation

mechanisms

shouldsatisfy, inordertobe

considered wellfiinclioning.

Section4appliesour

model

inordertodetemiine under

what

circumstances such

mechanisms

canindeed

bewell functioning.Itis

shown

that the fairnessofthemarket

outcome

isdeterminedbytherelationship

betweenratingleniencyand

judgment

strictness

when

assessinga seller'sfeedbackprofile.

An

optimal

judgment

rule,

which

resultsin

outcomes

where,atsteadystate,buyersaccurately predictthe truequality

ofsellers, istheoreticallypossibletoderiveforallleniency levels.

A

rathersurprisingconclusionof our

analysisisthat, ifbuyersusethisoptimal

judgment

rule,thenthe

more

lenientbuyersare

when

rating

sellers,the

more

likelyit isthat sellers will finditoptimaltosettle

down

tosteady-statequality levels,as

opposedto oscillating

between

good

qualityand badquality. In thatsense,raterleniencybenefitsthe overallstabilityofthesystem.

However,

it isarguedthatthisoptimalruledepends

on

severalparameters,

which

are difficult toestimate fromthefeedbackinformationthat

eBay

currently

makes

availabletoits

members

Itistherefore questionabletowhatextentunsophisticatedbuyersarecurrently using

eBay

feedback infomiationin anoptimalway.

Section5considers the implicationsofrelaxing

some

ofthesimplifyingassumptions on

which

our analysis

isbased.Finally, Section6

summarizes

thecontributionsandconclusionsofthepaper.

2.

A

model

of reputation-mediated marketplaces with binary

feedback

This section introducesa

model

foranalyzing marketplaces,

which

relyexclusively

on

abinary reputation

mechanism

forqualitysignalingandqualitycontrol.

A

binaryreputation

mechanism

isa

mechanism where

ratersaregiventheopportunityto ratepasttransactions usingone of

two

values,

commonly

interpretedas

"positive"(i.e.satisfactory)and"negative"(i.e.unsatisfactory,problematic).

Our

intentionistousethis

model

inordertostudy the

economic

impact ofreputation

mechanisms

similartotheone used

by eBay

(seeResnick and Zeckhauser 2001 for adetailed description) .

Inour model,qualitiesarenon-negativereal-valuedquantities,

which

subsume

aspectsof both product

qualityandservicequality.

We

assume

thateachsellerproducesitems,

whose

real quality _q^ is

unknown

tobuyersand can only be determined with accuracyafterconsumption.

We

further

assume

thatallbuyers

prefer higher qualitytolowerquality,althoughtheymightdifferintheextentto

which

theyarepreparedto

pay foran extraunitofquality. Finally,

we

assume

thatalthoughthe realqualityofitemsisnot

communicated

tobuyers,sellersdo inform buyers by advertising.

On

eBay,advertisingcorrespondstothe

seller-supplied description,

which

accompaniesallitems.

The

advertised quality q^ of anitemis

completelycontrolled

by

the seller(i.e.thereis

no

validationofany kind bya thirdparty)and

may

or

may

notcorrespondto itsreal quality.

'

Inadditionto"positive" (praise)and"negative" (complaint)ratings,eBay'sreputation

mechanism

also

supports "neutral"ratings(which,however,arerarelyusedinactual practice).

As

will

become

apparent

below, our

model subsumes

raters

who

would

submit"neutral"ratingson

eBay

intothesetofraters

who

(12)

(13)

Sellersaimsto

maximize

thepresentvalueoftheirpayofffunction ;r(x,^ ,q )

= G{x,q

,q

)-c(x,q

)

where

.visthe

volume

ofsales, C(.)isthegrossrevenuefunctionandc(.)isthe costftinction.

We

assume

that dc/dq^

>0

imdd/r/dq^ >

forallsellers.

Under

theabove assumptions,sellershave anincentivetoover-advertisequality.

The

market

would

then

degeneratetoa"marketforlemons" (Akerlof1970). Inordertoavoidthis from happening, buyersare

giventheoptionto rateeachtransactionusinga"positive" or "negative"rating.

A

reputation system, operated

by

atrustworthythird party,accumulatesallratings into afeedbackprofile

R

=

(Z ,1. ,Z )foreachseller,

where

Z

isthe

sum

ofallpositiveratingsreceivedfor that seller

+ - norating +^ * '-'

duringthemostrecenttime

window,

X

isthe

sum

ofallnegativeratingsreceivedduringthe

same

period

and

Z

isthe

number

oftransactionsfor

which no

rating

was

submitted".

Time

windowing

isusedin

norating ^ ^

ordertoaddress thepossibility that sellers

may

improveor deterioratetheirbehavior overtime.For example,

on

eBay, feedbackprofilesdisplay the

sums

ofratingsreceivedduringthe past6

months

only.

Buyer

utility from purchase ofa singleitemis

modeled by [/=

^

9

-

p

,

where

/;istheprice,

^

isa

buyer'squalitysensitivityand qisthelevelofqualityperceived

by

thebuyerafterconsumption.

When

consideringapurchase,buyers

combine

all theinfomiationthat isavailabletothem, i.e.anitem's advertised qualityandaseller'sfeedbackprofile,inordertofomiasubjectiveassessmentof anitem's

estimatedquality_q ,where;

9,

=/(?„. R)

(1)

Armed

with

knowledge

ofpricesandestimatedqualities,buyersproceedtopurchaseone ofthe available items,presumablytheone

which maximizes

theirexpectedutility

U

=

6

q

-

p

. Followingapurchase,

buyers observetheitem'sperceivedquality q

=

q

+

£,

where

f

isanormallydistributed errorterm with

standard deviation

a

.

The

introductionof anerrortermisintendedtocollectively

model

a

number

of

phenomena, which

occurinactual practice.For example:

• buyers

may

misinterpretaseller'sadvertised quality(thisshouldbe

modeled

as ^

=q

+

e,however,

ouranalysisisidentical if

we

addthe errortermto q instead)

• sellers

may

exhibit small variationsinactual quality from onetransactiontoanother

• buyers

may

havesmall differencesin theirperceptionofqualitybased,say, ontheir

moods

thatday

•

some

aspectsofperceived quality

depend on

factors

beyond

aseller'scontrol(e.g.post-office delays)

Finally,buyers decide whetherto rale atransactionaswellas

what

rating to give.

Our model

assumesthat

ratingsare a functionofabuyer's satisfaction relative toher expectations.

We

defineabuyer'ssatisfaction

fromagiventransactiontobethedifferencebetweenperceivedand expectedutility. Thatis,

S

=

U-U=9(q

-q

+£)

.

Under

theabove assumptions,

5

isanormallydistributed

random

variable

with

mean

(q

-

_q,) andstandard deviation

6a.

One

interestingpropertyof eBay,

which

hasbeenreported

on

severalempiricalstudies,isthatmost buyers

giveveryfewnegativeratings tosellers.Resnickand Zeckhauser (2001)reportthat lessthan

0.5%

of buyersand 1

.2%

ofsellerspostneutral or negativefeedback abouttheirtradingpartners (see Figure 1).

They

tentativelyconcludethat raterseither rategenerouslyor prefertorefrainfromratingatallafterbad

experiences.

(14)

(15)

(16)

(17)

WF2:

Assuming

WFl

holds,underallsteady-statesellerstrategies

_{q^.qj

the qualityofsellersas

estimated

by

buyersbefore transactions takeplace,isequaltotheirtruequality(i.e._q^

=

_q^). Before

we

proceed,letusjustifytheabovedefinition

by

providinga brief rationaleforthedesirabilityof

properties

WFl

and

WF2.

First,thevalueofreputation

mechanisms

ingeneralrelies ontheassumptionthatpastbehaviorisa reliable

predictoroffuturebehavior (Wilson 1985). If oscillations

were

optimal, the predictive valueof cumulative

functionsofpastratings,suchas

1,1

,w ouldbegreatly diminished.Inenvironments

where

theprimary

(oronly)

mechanism

for certifyingandcontrollingsellerqualityisbasedonreputation,it is,therefore,

desirablethat sellers find itoptimal tosettle

down

toasteady-statebehaviorratherthanto oscillate.

Second,

accordmg

toour model,buyers

make

purchasedecisionsbased on

knowledge

ofpricesand

estimatedqualities. Ifit

were

possiblefor sellers tosettle

down

intoasteady-state strategythat

would

consistentlydeceivebuyersintoestimating q > q ,thenthis

would

allowsellerstoearn additionalprofits

attheexpense ofbuyers.Inthepresenceofcompetitive marketplaces,buyers

would

theneventually leave

themarketplaceinfavorofothermarketswithbetterinformation.

On

theotherhand,if,underallpossible steady-stateseller strategies the effectofthereputation

mechanism was

such,sothatbuyersestimated

q

<q

,thentheoppositeeffect

would

takeplace:buyers

would

realizeextra surplusattheexpense of

sellers.

Once

again,

we

would

thenexpectthat sellers

would

desert themarketplace in favorofother,

more

transparentmarkets.

The

onlyfairsteady-statestrategy, therefore,isone

where

_q^

-

_q^

.

4.

Can

binary reputation

mechanisms

be

wellfunctioning?

Thissectionwilldemonstratethat,givena rating function,

which

hasthegeneralform given

by

(2),

whether aneBay-like binary reputationsystemsatisfiesproperty

WF2

depends onthe relationshipbetween

(2)andthe quality estimation function <?

= /(^

,R). Furthermore,

we

will

show

that, ifbuyersare lenient

enough

when

theyrateand correspondinglystrict

when

theyjudgeseller profiles, sellers will finditoptimal

tosettle

down

tosteady-statetrueandadvertised qualitylevelsifsuch anequilibriumexistsunderperfect information.

4.1 Estimatedvs.real qualities insteadystate

Let us firstfocusourattention

on

thecircumstances under

which

abinary reputation

mechanism

satisfies

condition

WF2.

We

areassumingthat

WFl

holds.Therefore, thereexistsatleastonesteady-statestrategy iq ,q ) foreachseller''.

A

steady-state strategyisastrategythatoptimizesa seller'spayofffunction,while

atthe

same

timeresultinginanestimated quality q^

= f{q^,R)

,

which

isstableovertime.

Denote

q

=

q

+ ^

,

where ^

isthedeceptionfactor,thatis,thedistortion

between

estimatedandrealqualityat

steadystate.If.^

>

thenbuyersoverestimatea seller's true quality,whereasif.,

<

thenbuyers underestimatetrue quality. LetA'bethetotal

number

ofsalestransactionsofagivenseller inthemost

recenttime

window.

ItiseasytoseethatA'

=

Z

+

1 +

S

.

Assuming

thatbuyersrateaccordingto •'

-t- - noruling ^

(2),forlarge

N

atsteadystatethefollowingwillhold:

I

=N

?Tob[S>Q]

=

N^[(q^

-<? )/o-]

=

/V

0[-^/cr]

Z

=

_N

ProliS

<-X\

=

N-

0[(9^

-q^)la-Xlieo)]

=

N-^{^lo-Xl(e-

a)]

where

<t>(.) isthestandardnormal

CDF.

''

Section 4.2willexploretheconditionsunder

which

sellers willindeedfinditoptimalto settle

down

toa

(18)

(19)

Given

that q

= f(q

,R),satisfactionofcondition

WF2

depends

on

thequalityassessment function/

More

specifically,

/must

bechosensothat,forallsteady-statestrategies

_(q^,qj

theequation:

q=q^+^

= f(q^,R(4))

(4)

hasauniquesolutionat

^ =

.

eBay

doesnotspecify,oreven

recommend,

aspecificqualityassessment function/ Itsimplypublishesthe

quantities

S

and

I

foreachsellerandallowsbuyerstouseany assessmentrulethey seefit.Itis

importanttonoteat thispointthat

eBay

doesnot currently publishthequantity

£„„^„„„^(andtherefore

N)

foraseller.

As

we

will

show

below,

knowledge

ofA'isessential forconstructingreliablequality

assessmentfunctions.

The

resultsofthispaper, therefore,

make

astrongargumentthatthe

number

of

transactionsthathavereceived

no

ratingshouldbe addedtotheprofileinformationpublished

by

eBay

and

similarsystems.

In thispaper,

we

willexploreonegeneral familyofqualityassessmentfunctionswhich,

when

implemented

correctlyand used

m

conjunction withrating rule(2),satisfy

WF2.

We

willfurtherexploredifferent

ways

in

which

usersofa reputation

mechanism

canuse I.^

,2

and

TV inordertocorrectly

implement

those

functions.

The

generalform of ourqualifyassessmentfunctions isgivenby:

,=/„..R,

=

k

i'f'^»

(5)

[O

if^=(R)>0

Inotherwords, buyersassessthequalifyof an itemtobeequalto thatadvertisedbytheseller, if,basedon

the seller's profile,theyconcludethat the sellerdoesnot over-advertise.Otherwise, buyers

assume

thatthe

seller liesandassess

minimum

quality. Function(5)thereforeuses the infonnalion provided

by

the

reputation

mechanism

inordertoderivea(binary)assessmentoUruthfiilnessinadvertising.

Assuming

that sellershavea

way

ofreliably inferring thesignof

^

from feedbackprofileinfomiation,

sellers

who

over-advertisetheir quality will quickly seetheirestimated qualityfallto zero.Therefore, if/is

given

by

(5),equation(4)hasnosolutionfor

^ >

.

Note

thatfunction(5)doesnot preventsellersfromunder-advertisingtheirqualifybecausefor _^

=

^

+ ^

,

all

^ <

arealso solutionsofequation(4).

However,

giventhat

we

have

assumed

that drtldq^

>

,

we

would

not expectany profit-maximizingsellertounder-advertise. Therefore, theonlysteady-stateseller

strategy for sellers

would

beto truthfullyadvertisetheir real quality. In that case,buyers

would

estimate

q

=

o

=

,adesirableoutcome,

which

satisfies

WF2.

Let us

now

explore threedifferent

ways

in

which

buyers canuse

Z

,

S

andA' inordertoestimate the sign

of^

Assessment

based on

the

number of

positives

One way

toestimatesellerhonestyistorequirethatthefractionofpositiveratingsof

good

sellersexceeda threshold.

From

(3)

we

canseethat f]

=

Y. /A' canbeinterpretedas apoint estimatorof

^[-^1

o].

Given

(20)

(21)

that <i>[-4/a]

<

0.5 for alli^

>

,assessmentofthesignof

^

reducestotestingthestatisticalhypothesis

H

://

>

0.5 given fj.

The

correspondingqualityassessmentfunction thenbecomes:

{q

if

H

accepted

if

H^

rejected (6)

where

H^:t]>

0.5given^7

h

X

/N

Hypothesis

H

canbetestedusingone ofthe

known

techniquesfor

computing

confidenceintervalsof

proportions followingbinomialdistributions(e.g.BlythandStill 1983).

Function(6)isan appealing

method

forassessingseller qualitybecause ofitsrelativesimplicity.

Note

that

itscomputation doesnot require

knowledge

ofthe

model

parameters

A,dand

a

.

However,

(6)isdifficult

to

compute

reliablywithout

knowledge

of

M

thetotal

number

ofrated plus unrated transactionsofaseller.

As

was

mentioned,

eBay

doesnot

make

A'

known

toits

members.

Taking f]

=

_^JC^^

+

^_)

would

result

in largeoverestimationof<t>[-^/a],especially because,dueto ratingleniency, 5^„^„„„|, isexpectedtobe

quitesignificant (inthe datasetofFigure 1 S^^^„„ /

N

= 48.3%

). Forthatreason,one

would

infer that

qualityassessment based

on

the

number

ofpositiveratingsisnot(and shouldnot be)widely used

on

eBay. Thishypothesisisconsistentwithempirical observations

(Dewan

and

Hsu

2001).

Section42 willdiscussanother disadvantage offunction(6),

which

isthatit

makes

iteasierfor sellers to

oscillatebetweenperiods

where

theymilktheir

good

reputationbyoverstatingtheirqualityanddeceiving

buyersandperiods

where

theyrestore theirreputation

by

offeringbetterquality than

what

buyersexpect.

Assessment

based

on

the

number of

negatives

Inananalogous manner,

we

expect

good

sellers tohavefewnegativeratings.Therefore,another

way

to

estimatesellerhonestyistorequirethatthefractionofnegativeratingsof

good

sellersstay

below

a

threshold.

From

(3)

we

canseethat

^

=

Z

/A'canbeinterpreted asapoint estimatorof

cl)[^/(T

-

/I /(6* cr)].

Given

that cl)[,,^/cr-A/(6' o")]

>

0[-/l/(6' cr)] for all.,=> ,assessmentofthesign

of^^ reducestotestingthestatisticalhypothesis

H

[ :

^

<

<^[-

A

1(9

a

)\ given

C

The

correspondingqualityassessment functionthenbecomes:

\q if

H'

accepted

jo if//' rejected

vhere

H[: ( <k' =^[-Al{d a)\g\\m (

=

1. I

N

^'

|0 if//' rejected (7)

Let uscall k' =0[-/l/(6' cr)J the

optimum

trustworthinessthreshold. A* isamonotonicallydecreasing functionoftheleniency factor

X

.Therefore,the

more

lenientbuyersare

when

theyrate,thelowerthe thresholdofnegativeratings totransactionsabove which theyshouldnottrustsellers,andviceversa.This

ISa result thatcorrespondswellto

documented

empiricalfindings: most

eBay

buyers

weigh

negative

ratings

much

more

heavilythan positiveratings

when

assessingthetrustworthinessofaprospectiveseller

(Dewan

and

Hsu

2001).

Given

thatthey

seem

toberather lenient

when

they ratethosesellers,accordingto

(7),

we

would

expect

them

tobeslnct

when

assessing the qualityofsellers,andthereforetoberelatively

intolerantofnegativeratings.

The

big question,however,iswhether buyersuse therightthreshold

when

theyjudgesellers (inotherwords,arebuyers capable of

making

the"nght"

judgment

oijust

how

many-negative ratingsaretoomany?).

(22)

(23)

From

equation(7)

we

canalsoseethatan

optimum

k' canbederived forevery

A

.

One way

of

interpretingthisresultisthat satisfactionof

WF2

isalwayspossible

no

matter

how

lenient(orstrict)buyers

are

when

theyrate,providedthatiheystriketherightbalance betweenrating leniency

and

quality

assessmentstrictness.ln thenextsection,

we

shallprovethat,

more

lenientrating(andcorrespondingly

strictassessment)increasesthelikchhood that sellers will finditoptimalto settle

down

toasteady-state behavior.

Some

degree ofleniency,therefore,can bebeneficialto the stabilityofthemarketplace.

Itisalso importanttopointoutthat, unlessbuyers usethe rightthreshold k'

when

evaluatingthe

number

ofnegativeratingsofaseller,

WF2

willnotbesatisfied. Ifbuyersuseathreshold k> k' thenthere willbe some<^ > for

which

//' willbesatisfiedandsellerswillbeabletoconsistentlydeceive buyers

by

over-advertisingtheir quality. In contrast, \ik

<k'

,therewillbe

^^^ < suchthat //^'^ willberejectedforall

(^

> ^

, Inthelattercase, topreventtheirestimated quality

from

droppingtozero, sellers willbeforcedto

under-advertise and, therefore,beconsistentlyunder-appreciatedby |^„ |•

We

see,therefore, thatthechoiceofthe"right" k' iscrucial tothewell functioning

of

thereputation

mechanism,

and ofthemarketplaceingeneral.Itisimportanttoaskwhether buyers can be reasonably expectedtobeabletocorrectly deriveit.

From

equation(7),calculationofk' requires

knowledge

ofthe

model

parameters

i,^

and

a

Itisunlikelythatbuyers

would

haveaccurateunderstandingand

knowledge

ofthoseparameters(especially

o

,

which

partly reflectspropertiesoftheseller). Nevertheless,evenifthe

model

parametersarenot

known,

itispossibletoestimate thevalueof

^[-Xl(da)]

from

2

,S_ andA'

.

From

(3):

-^la

=<^

CLJN)

I

A-' =a>[-/l/(6ia)]=*[<D"'(^ /yV)+

0'(I

IN)] (8)

-Al(e cr)^^la

=

_^-\ZJN)

J ^ ^ n v y , i y . n

{iN

issmallthenaconfidenceintervalshouldbeconstructedfor k'

.

Even

withthehelpofequation (8).buyersstill needto

know

A'inordertoproperly

compute

function(7)

Overall,function(7)definesa rather fragile rule forassessingsellerqualityefficiently.

Given

thatalotof

eBay

buyersareheavilybasingtheir sellerqualityassessmentsonthe

number

ofnegativeratings

on

the

sellers' feedbackprofile,it isveryinteresting toask

what

methods

theyuseto

compute

their

trustworthinessthresholds and,even

more

important,whethertheirtrustworthiness thresholds

do

indeed

come

closetosatisfying

WF2.

Clearly,these areimportant questions,

which

invitefurtherempiricaland experimentalresults to

complement

the resultsofthiswork.

Assessment

based on

theratiobetweennegatives

and

positives

In both previouscases,correctimplementationofthequalityassessmentfunction required

knowledge

of

M

One

mightthinkthat,by basingqualityassessment onthe ratio

between

negativeandpositiveone

may

be

abletoderivean optimal assessmentfunction from

Z^

and

Z

only.

We

will

show

that thisisnotpossible.

From

(3)

we

get:

Function p(^)isnon-negative and monotonicallyincreasingin

_^

. Furthermore p(Q)

=

2 <t>l-Ai{d a)]

.

(24)

(25)

H"

.

p<l

0[-/li{fi a)] given

p

= Z_

/

Z^

.

The

correspondingqualityassessmentfunctionthen

becomes:

{q

if//' accepted

if//^rejected (10)

where

_Hl:p<2

*[-/l1(6 a)]given

p

=

Y. lY.^

Unless buyers have

knowledge

ofthe

model

parameters

A,6mAa

,calculationof2 <i>[-Xl{d a)\ from

(8)requires

knowledge

of

M

Therefore, usingthe ratioofnegativestopositivesisverysimilartousingthe

fractionofnegativesandisequallytricky to"getright"without

knowledge

ofA'.

4.2Existenceofsteady-statebehavior

The

analysisofSection4.1 hasbeen based ontheassumptionthat sellers settle

down

tosteady-statereal

andadvertised qualitylevels.Thissectionwillinvestigatetheconditionsunder

which

sellers will indeed

findItoptimalto

do

so.

The

alternativeisto oscillatebetweenbuildinga

good

reputationandthenmilking it

by

over-advertisingrealquality.

As

we

arguedinSection3,reputation-mediatedmarketplaces should be designedinordertoinducesellerstosettle

down

tosteadystate behavior(otherwise infomiation aboutpast

behaviorwillnotbevery helpfulas a

way

ofpredictingthe future).

The

principalresultofthissection isthat

when

qualityassessmentisbased onfunctions(7)or(10),

which

involve negativeratings, then,ifthe ratingleniencyfactor

A

islargeenough,sellerswillfinditoptimalto

settle

down

tosteadystatebehavior. In contrast,thereisno such guarantee

when

qualityassessmentis

basedon function(6),

which

only involves positiveratings.Thisresult

shows

that

more

lenientrating

(coupled with

more

strictqualityassessment)supportsstability inthe system. Forthe

same

reason,

although

more

fragileanddifficultto"getnght", functions(7)and(10),i.e. functions

which

baseseller

qualityassessmentonthe

number

ofnegativeratings, arepreferredtofunction(6),

which

only looksatthe

seller'spositiveratings.

Inordertoderiveourresult, letus consider

ways

in

which

sellers

may

attempttorealizeadditional profits

through oscillatingbehavior.

Assume

thata sellerisabletoperform N^ transactionsbeforeratingsofthose transactionsarepostedtoherfeedbackprofile.This

number

depends onthefrequency oftransactionsand

thedelay

between

transactionsandthepostingofratings

by

buyers (oneBay,thisdelayistypically 2-3 weeks).

Let us consider aseller

who,

attheend ofperiod0,has completedA'transactionsinthe currenttime

window

andhasaccumulateda

good

reputation,by producing andadvertising itemsofqualityq ,the

quality thatoptimizesprofitsassumingsteady-state behavior. Let usfurther

assume

thatbuyersassess qualitybased onfunction(7). Attheend ofperiod0:

I

=

—

_(^

=

0)

=

cD[-/l/(^a)]

=

A: (11)

N

period

Atthebeginningofperiod 1 the sellerdecidestomilk her reputation

by

choosinga realquality q'andthen

over-advertisingherquality

by ^

sothatherprofitisma.ximizedrelative tothesteadystatecase.

Given

theseller's

good

past reputation,initiallybuyerswillbedeceived.

However,

aftertheypurchasetheseller's

items,theywillrealize their inferiorqualityandwillpostproportionally

more

negativeratings.Therefore,

(26)

(27)

A'

ptTloJI

'nTn.

>k

(12)

and the seller'ssubsequentestimated qualitywill fall tozero

Assuming

that

some

buyersarewillingto

buy

from

somebody

with zero qualityifthe priee islow enough, oursellerwillslayin business.Inorderto

increase her reputationonceagain,sheneedstoreducethe ratio

Z_/

N

to

below

thethreshold A*.

The

only

way

shecan achievethisisto

go

throughaperiod

where

sheproduceshigher quality items but receiveslowerprices,surpassing buyers' expectations

(who

now

expect q

=

)by .5 ,. Let us

assume

that

it

would

take

N,

"redeeming"transactionsbefore

Z

/A'

<

A'. Attheend ofperiod2:

N

0[-A/(e a)]+N^ *[^,

/a-A/(0

a)]+ N^ <i>[-4Ja-A./(0 a)]

N

+

N

+iV,

k'=0[-A/{O a)] (13)

A

profit-maximizingsellerwillchooseto oscillateifthe profitfromthe"deceiving"transactionsrelative to

the steady-stateprofitexceedsthe lossfromthe"redeeming"transactionsrelative to thesteady-stateprofit.

Ifthese

two

quantitieshaveafiniteratio,then,providedthatthe

number N^

of"redeeming"transactions

thatarenecessaryinorderto

"undo"

thereputationeffectsofA^^ "deceiving"transactionsishighenough,

sellers willnotfinditprofitable to oscillateandwill settle

down

tosteady-staterealandadvertised quality

levels.

From

(13)after

some

algebraicmanipulation,

we

get:

A,

0[£/cr-A/(0a)]-0[-A/{0

a)]

N. «[-/?/{0 a)]

-fl>[-^,

/a-A/{0

(T)]

=

_gU.4r4,)

(14)

After

some

manipulation

we

get

dg/dA>0

and3'g/3/l"

>

.In fact g(.)

grows

exponentiallywith/I

Figure2 plotsg(A)for

^ =

<T

-

' ""''c-rno ronrocsntatU/P ^;QlllCCnf,<=

-

.^

-

,= and

some

representative valuesof1,

=

c,

=

c,.

05

1

15

2

Leniency

factor(lambda)

ksi=1 ksi=2 ksi=3 ksi=4 ksi=5

Figure2:

Minimum

ratioof

"redeeming"

to"deceiving" transactions

needed

in

order

torestoreone's

good

reputation followingaperiodof qualityover-reporting.

^

Furthermore,foragiven

A

,g{.)

grows

rapidlywithi^^ anddecreasesveryslowly with i^,. This

means

thatthe

minimum

necessaryratioof redeemingtodeceiving transactions

grows

with the

amount

ofinitial

(28)

(29)

From

Figure2itisevidentthatinmarketplaces

where

buyersrate leniently(and assess qualitystrictly),

sellersneed

many

more

"redeemingtransactions"inorderto restore their

good

reputation following afew

"deceivingtransactions".

The

relative

number

of redeemingtransactionsincreases exponentiallywiththe

leniencyfactor Otherwisesaid, the larger the

A

,the

more

difficultitisfor sellers to restore their

reputationoncethey loseit.Consequently, if

A

issufficiently large, sellers will finditoptimaltosettle

down

tosteady-staterealandadvertised qualitylevels.Q.E.D.

A

similarresultcanbederivedifbuyers basequalityassessment

on

theratio

Z_ /I^

.Incontrast, ifbuyers

basequalityassessment

on

function(7),ouranalysisgives:

A', _{_}

1-2

0[^, /CT]

(15)

Equation(15) gives /V,

=

A', for i^^

=

<^^and/V, slightly lessthanN^for ^^^

<

^^. Otherwisesaid, followinga setof deceivingtransactions, ittakes the

same number

of(orfewer)redeemingtransactionsin

orderto restoreone's

good

reputation. Insucha setting, itis

more

likely that

some

sellers willhaveprofit

functionsfor

which

itwillbeoptimalto oscillate. Therefore,oneexpectsthatinreputation-mediated

marketplaces

where

buyersuse(6) toassessseller quality,therewillbeless stabilitythaninmarketplaces

where

sellersuse(7)or(10).

The

resultsofthissectionprovide

some

interestingargumentsforbothratingleniencyaswell asforbasing

the qualityassessmentofsellersontheirnegative,ratherthantheirpositiveratings.

5.Realitychecks

and

some

recommendations

The

resultsoftheprevious sectionshave beenderived

by making

a

number

ofsimplifyingassumptions

about buyerbehavior.

More

specifically,

we

have

assumed

thatallbuyers havethe

same

qualitysensitivity

andleniency factor

A

. Furthemiore,

we

have

assumed

thatbuyers always submitratings

whenever

their

satisfaction rises

above

zero orfalls

below

-A

.Both assumptionsarenotlikely tohold ina real

marketplace. Buyers havedifferent personalities,andtherefore, areexpectedtohavedifferentquality

sensitivities,as wellasleniency parameters.Furthermore,ratings

do

incuracost(timetologon and submit

them)and

some

buyersdonot botherrating,even

when

transactions turnoutreally

good

orverybad. In

thissection,

we

will injectabitofrealitytoour

model

andwillexplore

how

ourresultschange if

we

take

intoaccounttheaboveconsiderations.

Reality

Check

#1:

Some

buyersneverrate

We

needtomodify ourratingfunctionr(S)in (2), sothat

when

S

>

, r(S)="+" withprobability /? and

r(S)

=

norating with probability (1

-

/9). Similariy,

when

S<-A,

r{S)="-" with probability y and

r(S)

=

norating withprobability (1

-

7).

Under

this

new

ratingfunction,thestatistical hypothesisin (6)

becomes

H^:r]>^

0.5 ,whilethehypothesisin(7)

becomes

H'^:C<7

0[-A

/(d a)].

We

seethatour

new

assumptionintroduces

two

additionalparameterstoour model.

The

parametersneedtobereliably

estimatedinorderforproperty

WF2

tobesatisfied.

Reality

Check

#2:Buyersdiffer inqualitysensitivity

and

leniency

Let's define a;

= A/

(9andlet's call p{co) theprobability distnbutionoftt»

among

buyers.

Then

(3)must

bemodifiedas:

1 =NProb[S>0]

=

N<t>[-^/a]

(16)

(30)

(31)

Ifqualityassessmentisbased onthe fractionofpositiveratingsusing(5),thenrealitycheck#2 doesnot introduce additional complications.

However,

ifqualityassessmentisbasedonthe fractionofnegative ratmgs,which,

m

thepresenceoflenient ratingsisthe rule

most

likelyto result in stable sellerbehavior, then

thmgs

do

become

considerably

more

complicated.

More

specifically,itiseasytoseethatthestatistical

hypothesistobetestedin(6)must

become

H'^:^<

k'

=

j<I)[-a»/cr]

p{w)

dco. Inordertocalculatethe

"right" k',one needs

knowledge

ofp(QJ)

.

Things

become

even

more

complicatedif

we

combine

realitychecks#1 and#2,

which would

bethe

situation thatmostcloselycorrespondsto actual reality.

Of

course,one can begintothinkof

ways

inwhich

individualbuyers might beabletoestimate,

maybe

with

some

degreeoferror,the additional

model

parameters /3,yandp(6;) from

Z

,

I

andjV .

However,

insteadof

embarking

inthisdirection,atthisstage

we

believethat

we

areprovided

enough

argumentsto

make

one ofthe

main

pointsofthispaper: Binaryreputation

mechanisms

canintheoryhe

wellfunctioningundertheassumption of simpleratingand assessmentrules,butonlyifbuyersusethe

rightthresholds

when

judgingsellertrustworthiness. Calculatingthe rightthreshold from

S

,S alone, the

onlyinformation currentlyprovided by eBay,isverydifficult. Calculatingthe rightthresholdfrom

I

,Z andA' ispossibleunderthesimplifyingassumptionsofSection 2 but

becomes

more

and

more

difficultasour

models

approachreality. In realistic cases, thecorrectassessmentruledependsnotonly

on

thefeedbackprofileofa sellerbutalsoonpropertiesofthe raterpopulation.

Given

thattheefficiencyof

themarketplacecruciallydepends onthe selectionofcorrectassessmentthresholdsonthe partofthebuyer,

themostsensiblecourseofresearch thereforeshould betothinkofadditional informationthatthe reputation

mechanism

can provideto raters, inorderto

make

thiscalculationeasier.

One

ideaisforthe operatorofthemarketplace, or

some

trusted third party, to introduceasmall

number

of

honestsellers into themarket.

The

behaviorofthosesellersmust be completely underthecontrolofthe

marketplaceoperator buttheir identitiesshouldbe

unknown

tobuyers(sothatbuyersarenotbiased

when

theyratethosesellers).

Given

thatthemarketplace

knows

thatforthosesellers ,^

=

,itcanusetheir

feedbackprofiles inordertoderive estimatesofy5,7,p(ft;)andk'

Those

estimates, or suitablychosen

derivatives,canthenbe

communicated

tothebuyersforthepurposeoffacilitating their sellerassessment

process.

6.Conclusions

The

objectiveofthispaper

was

toexploreto

what

extent binary reputationmechanisms, suchastheone usedateBay, arecapableof inducingefficientmarket

outcomes

in marketplaces

where

(a) truequality informationis

unknown

tobuyers,(b)advertised quality iscompletelyunderthe controlofthesellerand

(c)theonlyinformation availabletobuyersisanitem's advertised quality plus theseller'sfeedbackprofile.

The

firstcontributionofthepaperisthedefinitionofa setofconditionsforevaluating thewell functioning

ofareputation

mechanism

issuchsettings.

We

considerareputation

mechanism

tobewell-functioningifit

(a) inducessellers tosettle

down

to asteady-statebehavior assumingitisoptimal for

them

to

do

sounder

perfect qualityinformationand(b)atsteady-state, sellerqualityasestimated

by

buyersbefore transactions takeplaceisequal totheirtrue quality.

The

secondcontributionofthepaperisananalysisof whetherbinary reputation

mechanisms

canbe well-functioningundertheassumptionsthat(a)ratings arebasedonthedifference

between

buyers'trueutility

followingatransactionandtheirexpectations beforethetransactionand(b)buyersarerelatively lenient

when

they assessa seller'sfeedbackprofile.

Our

firstconclusionisthatifbinaryfeedbackprofilesareusedtodecidewhethera selleradvertises

(32)

(33)

assess quality equalto the

minimum

quality),then, intheory, binary reputationsystems canbewell functioning,providedthatbuyersstrikethe rightbalance betweenratingleniencyandqualityassessment

strictness Furthermore,assumingthatbuyersbasetheir

judgment

onthe ratioofnegativeratingsreceived

byaseller, ifbuyersare lenient

enough

when

theyjudgeseller

profiles,

we

have

shown

that sellers will finditoptimaltosettle

down

tosteady-state qualitylevelsifsuch

anequilibriumexistsunderperfectinfomiation.Thisisaninteresting

way

in

which

(a)judgingseller

trustworthinessbased ontheirnegativeratingsispreferabletobasingitontheirpositiveratingsand(b)

some

degreeofratingleniency helps bringstability to thesystem.

Our

second conclusion isthat,unlessbuyersusethe"nght"threshold parameters

when

theyjudgeseller

profiles,binary reputation

mechanisms

willnot function wellandthe resultingmarket

outcome

willbe

unfairfor either thebuyersor thesellers.Inthatsense,althoughbinary reputation

mechanism

can bewell functioningin theory,theyareexpectedtobequite fragileinpractice.

The

crucialquestion therefore

becomes

whetherbinary feedbackprofilesprovidesellers(esp. relativelyunsophisticatedones) with

enough

infonnaliontoderive the"right" seller

judgment

rules.

We

have foundthatthe "right"

judgment

rule(e.g.

what

isthe right

number

ofnegativeratings

above

which

a sellershouldnotbetrusted?) isdifficultto infercorrectlyfrom

knowledge

ofthe

sum

ofpositive

andnegativeratingsalone,

which

istheonlyinformation currentlyprovided

by eBay

toits

members.

If

knowledge

ofthe

sum

ofunrated transactionsisaddedtofeedbackprofiles, then,undera

number

of

simplifyingassumptions,itispossibletoderivenon-obviousbutrelativelysimple "optimal"

judgment

rules

which

result inwell functioning reputationmechanisms.

However,

ifthesimplifying assumptionsare

dropped,calculationofthe right

judgment

rulefrom

S^

,£_ and

N

onceagain

becomes

difficult,asit

requires

knowledge

notonlyofsellerratingsbutofthe raterpopulationas well.

Our

findings leadto the

recommendation

that

more

informationshouldbe providedto assist ratersof such marketplacesusefeedbackprofilesinthe"right"way.

One

idea isfortheoperatorofthemarketplace, or

some

trusted third party, to introduceasmall

number

ofhonestsellersunderitscontrolintothemarket. If

one knows,a priori, that a sellerishonest,herfeedbackprofilecanthenbe usedbythemarketplaceto

estimateandpublisha

number

ofadditional"market-wide" parameters which,togetherwith

_2^

,S_ and

N

,

willhelpbuyers

more

reliablyassess the qualityofothersellers inthe system.

The

theoretical resultsofthispaperraise

some

intriguing questionsrelated to the efficiency, fairnessand

stabilityofeBay-like electronic marketplaces.

The

author

would

welcome

experimental andempirical

evidencethat willshed

more

lightintothequestionsraisedand

would

validate the conclusions

drawn

from

hismodels.

References

Akerlof,G. (1970)

The

marketfor'lemons': Quality uncertaintyandthemarket

mechanism.

Quarterly Journal

ofEconomics

84, pp.488-500.

Blyth,C.R.and StillH.A. (1983) Binomial ConfidenceIntervals.Journal

of

the

American

Statistical

Association78, pp. 108-116.

Bresee,J.S.,

Heckerman,

D.,and Kadie, C. (1998)EmpiricalAnalysisofPredictiveAlgorithmsfor

CollaborativeFiltering.InProceedings ofthe14'''_Conference

_on

_Uncertainty_in_ArtificialIntelligence (UAl-98),pp.43-52,SanFrancisco, July 24-26, 1998.

Dew

an, S and Hsu,

V

(2001)Trustin ElectronicMarkets:PriceDiscoveryinGeneralistVersusSpecialty

OnlineAuctions.

Working

Paper.January31,2001.

Kollock,P.(1999)

The

ProductionofTrustinOnlineMarkets. In

Advances

in

Group

Processes(Vol. 16), eds. E.J.Lawler,

M. Macy,

S.Thyne, and

HA.

Walker, Greenwich,

CT:

JAIPress.

(34)

(35)

Resnick,P.,Zeckhauser,R.,Friedman, H.,

Kuwabara,

K.(2000) [Reputation Systems.

Communicaiions

of

the

ACM,

Vol. 43,(12),

December

2000,pp.45-48.

Resnick, P.and Zeckhauser,R. (2001)Trust

Among

StrangersinInternetTransactions: Empirical Analysis

ofeBay'sReputation System.

Working

Paperforthe

NBER

workshop

onempirical studiesofelectronic

commerce

January 2001.

Rogerson,

W

P (1983) Reputation and productquality.BellJounuil

of

Economics,Vol. 14(2),pp.508-16.

Sarwar, B. M., Karypis,G.,Konstan,J.A.,andRiedl,J.(2000).Analysisof

Recommender

Algorithmsfor

E-Commerce.

ACM

E-Commerce

2000

Conference, Minneapolis,

MN,

Oct. 17-20, 2000.

Schmalensee,R. (1978).Advertisingand ProductQuality.Joz/rm;/ o/Po/;7;ca/fcowowi', Vol. 86,pp.

485-503.

Schafer,J.B., Konstan,J.,andRiedl,J., (2001)Electronic

Commerce Recommender

Applications.Journal

of Data Mining

and

Knowledge

Discovery. January,

200

1

.

Shapiro, C. (1982)

Consumer

Information,ProductQuality,andSellerReputation. BellJournal

of

Economics

13(1),

pp

20-35,Spring1982.

Smallwood,

D.andConlisk,J.(1979).ProductQualityinMarkets

Where

Consumers Are

Imperfectly Informed. QuarterlyJournal

_of

Economics.Vol. 93,pp. 1-23.

Wilson, Robert(1985). Reputationsin

Games

andMarkets.InGame-Theoretic

Models of

Bargaining,

(36)

(37)

(38)

DEC

?00?

(39)

MIT LIBRARIES

IInilI mil

(40)