MIT Sloan School of Management

Sloan Working Paper 4180-01
eBusiness@MIT Working Paper 101
October 2001
BUILDING TRUST ON-LINE:
THE DESIGN OF RELIABLE REPUTATION REPORTING MECHANISMS
FOR ONLINE TRADING COMMUNITIES
Chrysanthos Dellarocas

This paper is available through the Center for eBusiness@MIT web site at the following URL:
http://ebusiness.mit.edu/research/papers.html

This paper also can be downloaded without charge from the Social Science Research Network Electronic Paper Collection.
Building Trust On-Line: The Design of Reliable Reputation Reporting Mechanisms for Online Trading Communities

Chrysanthos Dellarocas
Sloan School of Management
Massachusetts Institute of Technology
Room E53-315
Cambridge, MA 02139
dell@mit.edu
Abstract: Several properties of online interaction are challenging the accumulated wisdom of trading communities on how to produce and manage trust. Online reputation reporting systems have emerged as a promising trust management mechanism in such settings. The objective of this paper is to contribute to the construction of online reputation reporting systems that are robust in the presence of unfair and deceitful raters. The paper sets the stage by providing a critical overview of the current state of the art in this area. Following that, it identifies a number of important ways in which the reliability of the current generation of reputation reporting systems can be severely compromised by unfair buyers and sellers. The central contribution of the paper is a number of novel "immunization mechanisms" for effectively countering the undesirable effects of such fraudulent behavior. The paper describes the mechanisms, proves their properties and explains how various parameters of the marketplace microstructure, most notably the anonymity and authentication regimes, can influence their effectiveness. Finally, it concludes by discussing the implications of the findings for the managers and users of current and future electronic marketplaces and identifies some important open issues for future research.

1. Introduction
At the heart of any bilateral exchange there is a temptation, for the party who moves second, to defect from the agreed-upon terms in ways that result in individual gains for it (and losses for the other party). For example, in transactions where the buyer pays first, the seller is tempted not to provide the agreed-upon goods or services, or to provide them at a quality inferior to what was advertised to the buyer. Unless there are some other guarantees, the buyer would then be tempted to hold back on her side of the exchange as well. In such situations, the trade will never take place and both parties will end up being worse off. Unsecured bilateral exchanges thus have the structure of a Prisoner's Dilemma.

Our society has developed a wide range of informal mechanisms and formal institutions for managing such risks and reducing the likelihood that one party will end up empty-handed. Written contracts, commercial law, credit card companies and escrow services are examples of institutions with exactly these goals. Although mechanism design and institutional support can help reduce transaction risks, they can never eliminate them completely. One example is the risk involving the exchange of goods whose "real" quality can only be assessed by the buyer a relatively long time after a trade has been completed (e.g. used cars). Even where society does provide remedial measures to cover risks in such cases (for example, the Massachusetts "lemon law"), these are usually burdensome and costly, and most buyers would very much rather not have to resort to them. Generally speaking, the more the two sides of a transaction are separated in time and space, the greater the risks. In those cases, no transaction will take place unless the party who moves first possesses some sufficient degree of trust that the party who moves second will indeed honor its commitments. The production of trust, therefore, is a precondition for the existence of any market and civilized society in general (Dunn, 1984; Gambetta, 1990).
In "bricks and mortar" communities, the production of trust is based on several cues, often rational but sometimes purely intuitive. For example, we tend to trust or distrust potential trading partners based on their appearance, the tone of their voice or their body language. We also ask our already trusted partners about their prior experiences with the new prospect, under the assumption that past behavior is a relatively reliable predictor of future behavior. Taken together, these experiences form the reputation of our prospective partners.
The emergence of electronic markets and other types of online trading communities is changing the rules on many aspects of doing business. Electronic markets promise substantial gains in productivity and efficiency by bringing together a much larger set of buyers and sellers and by substantially reducing search and transaction costs (Bakos, 1997; Bakos, 1998). In theory, buyers can then look for the best possible deal and end up transacting with a different seller on every single transaction. None of these theoretical gains will be realized, however, unless market makers and online community managers find effective ways to produce trust among their members. The production of trust is thus emerging as an important management challenge in any organization that operates or participates in online trading communities.
Several properties of online communities challenge the accumulated wisdom of our societies on how to produce trust. Formal institutions, such as legal guarantees, are less effective in global electronic markets, which span multiple jurisdictions with often conflicting legal systems (Johnson and Post, 1996). For example, it is very difficult, and costly, for a buyer who resides in the U.S.A. to resolve a trading dispute with a seller who lives in Indonesia. The difficulty is compounded by the fact that, in many electronic markets, it is relatively easy for trading partners to suddenly "disappear" and reappear under a different identity. Furthermore, many of the cues based on which we tend to trust or distrust other individuals are absent in electronic markets, where face-to-face contact is the exception. Finally, one of the motivating forces behind electronic markets is the desire to open up the universe of potential trading partners and enable transactions among parties who have never worked together in the past. In such a large trading space, most of one's already trusted partners are unlikely to be able to provide much information about the reputation of many of the other prospects that one may be considering.

As a counterbalance to those challenges, electronic communities are capable of storing complete and accurate information about all transactions they mediate. Several researchers and practitioners have, therefore, started to look at ways in which this information can be aggregated and processed by the market makers or other trusted third parties in order to help online buyers and sellers assess each other's trustworthiness. This has led to a new breed of systems, which are quickly becoming an indispensable component of every successful online trading community: electronic reputation reporting systems.
We are already seeing the first generation of such systems in the form of online ratings, feedback or recommender systems (Resnick and Varian, 1997; Schafer et al., 2001). The basic idea is that online community members are given the ability to rate or provide feedback about their experiences with other community members. Feedback systems aim to build trust by aggregating such ratings of the past behavior of their users and making them available to other users as predictors of future behavior. eBay (www.ebay.com), for example, encourages both parties of each transaction to rate one another with either a positive (+1), neutral (0) or negative (-1) rating plus a short comment. eBay makes the cumulative ratings of its members, as well as all individual comments, publicly available to every registered user.
The majority of the current generation of online feedback systems have been developed by Internet entrepreneurs and their reliability has not yet been systematically researched. In fact, there is ample anecdotal evidence, as well as one recent legal case, related to the ability to effectively manipulate people's actions by using online feedback forums (stock message boards in this case) to spread false opinions. As more and more organizations participate in electronic marketplaces, online reputation reporting systems deserve new scrutiny, and the study of trust management systems in digital communities deserves to become a new addition to the burgeoning field of Management Science.

The objective of this paper is to contribute to the construction of online reputation reporting systems that are robust in the presence of unfair and deceitful raters. The paper sets the stage by providing a critical overview of the current state of the art in this area (Section 2). Following that, it identifies a number of important ways in which the predictive value of the current generation of reputation reporting systems can be severely compromised by unfair buyers and sellers (Section 3). The central contribution of the paper is a number of novel "immunization mechanisms" for effectively countering the undesirable effects of such fraudulent behavior. The paper describes the mechanisms, proves their properties and explains how various parameters of the marketplace microstructure can influence their effectiveness (Section 4). Finally, it concludes by discussing the implications of the findings for the managers and users of current and future electronic marketplaces and identifies some open issues for future research (Section 5).
2. Reputation reporting mechanisms in online communities
The relative ease with which computers can capture, store and process huge amounts of information about past transactions makes past behavior (reputational) information a particularly promising basis on which to produce trust in online communities. This fact, together with the fact that the other traditional ways of producing trust (institutional guarantees, indirect cues) do not work as well in cyberspace, has prompted researchers and practitioners to focus their attention on developing online trust building mechanisms based on reputational information. This section provides a critical survey of the state of the art in this field.

A reputation, as defined by Wilson (Wilson, 1985), is a "characteristic or attribute ascribed to one person by another. Operationally, this is usually represented as a prediction about likely future behavior. It is, however, primarily an empirical statement. Its predictive power depends on the supposition that past behavior is indicative of future behavior". Reputation has been the object of study of the social sciences for a long time (Schmalensee, 1978; Shapiro, 1982; Smallwood and Conlisk, 1979). Several economists and game theorists have demonstrated that, in the presence of imperfect information, the formation of reputations is an important force that helps buyers manage transaction risks, but also provides incentives to sellers to provide good service quality.

Having interacted with someone in the past is, of course, the most reliable source of information about that agent's reputation. But relying only on direct experiences is both inefficient and dangerous. Inefficient, because an individual will be limited in the number of exchange partners he or she has, and dangerous because one will discover untrustworthy partners only through hard experience (Kollock, 1999). These shortcomings are especially severe in the context of online communities, where the number of potential partners is huge and the institutional guarantees in case of negative experiences are weaker. Great gains are possible if information about past interactions is shared and aggregated within a group in the form of opinions, ratings or recommendations. In "bricks and mortar" communities this can take many forms: informal gossip networks, institutionalized rating agencies, professional critics, etc. In cyberspace, they take the form of online reputation reporting systems, also known as online recommender systems (Resnick and Varian, 1997). The following sections provide a brief discussion of the most important design challenges and categories of these systems.
Although the effective aggregation of other community members' opinions can be a very effective way to gather information about the reputation of prospective trading partners, it is not without pitfalls. The following paragraphs describe two important issues that need to be addressed by opinion-based reputation reporting mechanisms.

Subjectively measurable attributes.
In the rest of the paper we will use the term "agent" to refer to a participant (buyer or seller, human or software) of an online trading community. We say that an attribute Q of an agent s is subjectively measurable if identical behavior of agent s vis-à-vis two different agents b1 and b2 may result in two different ratings R_b1 ≠ R_b2 for attribute Q by the respective raters. The most common example of a subjectively measurable attribute is the notion of product or service "quality". In most transaction types, some of the attributes of interest are subjectively measurable. In order for an agent b to make use of other agents' ratings for subjectively measurable attributes as a basis for calculating agent s's reputation, it must first try to "translate" each of them into its own value system.

In traditional communities we address the above issue by primarily accepting recommendations from people whom we know already. In those cases, our prior experience with these people helps us gauge their opinions and "translate" them into our value system. For example, we may know from past experience that Bill is extremely demanding, and so a rating of "acceptable" on his scale would correspond to "brilliant" on our scale. As a further example, we may know that Mary and we have similar tastes in movies but not in food, so we follow her opinions on movies while we ignore her recommendations on restaurants. Due to the much larger number of potential trading partners in online communities, it is, once again, less likely that our immediate "friends" will have had direct experiences with several of the prospects considered. It is, therefore, more likely that we will have to rely on the opinions of strangers, so gauging such opinions becomes much more difficult.

Intentionally false opinions.
For a number of reasons (see Section 3), agents may deliberately provide false opinions about another agent, that is, opinions which bear no relationship to their truthful assessment of their experiences with that other agent. In contrast to subjective opinions, for which we have assumed that there can be a possibility of "translation" to somebody else's value system, false opinions are usually deliberately constructed to mislead their recipients, and the only sensible way to treat them is to ignore them. In order to be able to ignore them, however, one has to first be able to identify them. Before accepting opinions, raters must, therefore, also assess the trustworthiness of other agents with respect to giving honest opinions. Previous work on trust management (Yahalom et al.) has observed that an agent's trustworthiness as a recommendation provider is orthogonal to its trustworthiness as a service provider. In other words, an agent can be a high-quality service provider and a very unreliable recommendation provider, or vice versa.

In the rest of the section we will briefly survey the various classes of proposed online reputation reporting systems and will discuss how each of them fares in addressing the above issues.

2.2 Recommendation repositories
Recommendation repositories store and make available recommendations from a large number of community members without attempting to substantially process or qualify them. The Web is obviously very well suited for constructing such repositories. In fact, most current-generation web-based recommendation systems (message boards, opinion forums, etc.) fall into this category. A typical representative of this class of systems is the feedback mechanism of the auction site eBay. Other popular auction sites, such as Yahoo and Amazon, employ very similar mechanisms.
eBay encourages the buyer and seller of an eBay-mediated transaction to leave feedback for each other. Feedback consists of a numerical rating, which can be +1 (praise), 0 (neutral) or -1 (complaint), plus a short (80 characters max.) text comment. eBay then makes the list of all submitted feedback ratings and comments accessible to any other registered user of the system. eBay does calculate some rudimentary statistics of the submitted ratings for each user (the sum of positive, neutral and negative ratings in the last 7 days, past month and 6 months) but, otherwise, it does not filter, modify or process the submitted ratings.
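Windowed summary statistics of this kind can be sketched in a few lines. The data model (a list of timestamped +1/0/-1 ratings) and the exact window lengths are illustrative assumptions, not eBay's actual implementation:

```python
from datetime import datetime, timedelta

def feedback_summary(ratings, now):
    """Tally +1/0/-1 ratings over several trailing time windows.

    `ratings` is a list of (timestamp, value) pairs with value in {+1, 0, -1}.
    Returns, per window, the count of positive, neutral and negative ratings.
    """
    windows = {
        "7 days": timedelta(days=7),
        "1 month": timedelta(days=30),
        "6 months": timedelta(days=182),
    }
    summary = {}
    for label, span in windows.items():
        recent = [v for (t, v) in ratings if now - t <= span]
        summary[label] = {
            "positive": sum(1 for v in recent if v == +1),
            "neutral": sum(1 for v in recent if v == 0),
            "negative": sum(1 for v in recent if v == -1),
        }
    return summary

now = datetime(2001, 10, 1)
ratings = [
    (now - timedelta(days=2), +1),
    (now - timedelta(days=5), -1),
    (now - timedelta(days=20), +1),
    (now - timedelta(days=100), 0),
]
print(feedback_summary(ratings, now))
```

Note that, as the text observes, such summaries only aggregate; they neither filter nor weight the submitted ratings.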
Recommendation repositories are a step in the right direction. They make lots of information about other agents available to interested users, but they expect users to "make sense" of those ratings themselves and draw their own conclusions. On the one hand, this is consistent with the viewpoint that the assessment of somebody's trustworthiness is an essentially subjective process (Boon and Holmes, 1991). On the other hand, however, this baseline approach does not scale very well. In situations where there are dozens or hundreds of possibly conflicting ratings, users need to spend considerable effort reading "between the lines" of individual ratings in order to "translate" other people's ratings to their own value system or in order to decide whether a particular rating is honest or not. What's more, in communities where most raters are complete strangers to one another, there is no concrete evidence that reliable "reading between the lines" is possible at all. In fact, as we mentioned, there is ample anecdotal evidence of people being misled by following the recommendations of false messages posted on Internet feedback forums.

2.3 Professional (specialist) rating sites
Specialist-based recommendation systems employ trusted and knowledgeable specialists who engage in first-hand transactions with a number of service providers and then publish their "authoritative" ratings. Other users then use these ratings as a basis for forming their own assessment of someone's trustworthiness. Well-known examples include credit-rating agencies (Moody's) and e-commerce professional rating agencies, such as Gomez Advisors, Inc. (www.gomez.com). The biggest advantage of specialist-based recommendation systems is that they address the problem of false ratings mentioned above. In most cases specialists are professionals and take great pains to build and maintain their trustworthiness as disinterested, fair sources of opinions (otherwise they will quickly find themselves out of business).
On the other hand, specialist-based recommendation systems have a number of shortcomings, which become even more severe in online communities. First, specialists can only test a relatively small number of service providers. There is time and cost involved in performing these tests and, the larger and the more volatile the population of a community, the lower the percentage of certified providers. Second, specialists must be able to successfully conceal their identity, or else there is a danger that providers will provide atypically good service to the specialist for the purpose of receiving good ratings. Third, specialists are individuals with their own tastes and internal rating scales, which do not necessarily match those of any other user of the system. Individual users of specialist ratings still need to be able to gauge a specialist's recommendations in order to derive their own likely assessment. Last but not least, specialists typically base their ratings on a very small number of sample interactions with the service providers (often just one). This makes specialist ratings a very weak basis from which to estimate the variability of someone's service attributes, which is an important aspect of someone's trustworthiness, especially in dynamic, time-varying environments.

2.4 Collaborative filtering systems
Collaborative filtering techniques (Goldberg et al., 1992; Resnick et al., 1994; Shardanand and Maes, 1995; Billsus and Pazzani, 1998) attempt to process the "raw" ratings contained in a recommendation repository in order to help raters focus their attention only on the subset of those ratings which are most likely to be useful to them. The basic idea behind collaborative filtering is to use the past ratings submitted by an agent b as a basis for locating other agents b1, b2, ... whose ratings are likely to be most "useful" to agent b in order to accurately predict someone's reputation from its own subjective perspective. There are several classes of proposed techniques:
Classification or clustering approaches rely on the assumption that agent communities form a relatively small set of taste clusters, with the property that ratings of agents of the same cluster for similar things are similar to each other. Therefore, if the taste cluster of an agent b can be identified, then ratings of other members of that cluster for an attribute Q of agent s can be used as statistical samples for calculating the reputation of s from b's perspective. The problem of identifying the "right" taste cluster for a given agent reduces to the well-studied problem of classification/data clustering (Kaufman and Rousseeuw, 1990; Jain et al., 1999; Gordon, 1999). In the context of collaborative filtering, the similarity of two buyers is a function of the distance of their ratings for commonly rated sellers. Collaborative filtering researchers have experimented with a variety of approaches, based on statistical similarity measures (Resnick et al., 1994; Breese et al., 1998) as well as machine learning techniques (Billsus and Pazzani, 1998).

Regression approaches rely on the assumption that the ratings of an agent b can often be related to the ratings of another agent b' through a linear relationship of the form:

R_b(s) = α · R_b'(s) + β    for all agents s    (1)

This assumption is motivated by the belief, widely accepted by economists (Arrow, 1963; Sen, 1986), that even when agents have "similar" tastes, one user's internal scale is not comparable to another user's scale. According to this belief, in a given community the number of strict nearest neighbors will be very limited, while the assumption of (1) opens the possibility of using the recommendations of a much larger number of agents as the basis for calculating an agent's reputation. In that case, if we can estimate the parameters α, β for each pair of agents, we can use formula (1) to "translate" the ratings of agent b' to the "internal scale" of agent b and then treat the translated ratings as statistical samples for estimating the reputation from the perspective of agent b.
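The pairwise translation of formula (1) can be sketched with ordinary least squares on two buyers' ratings for commonly rated sellers. The data below are hypothetical, chosen so that one buyer is systematically harsher than the other by one point:

```python
import numpy as np

# Ratings by two buyers for the same five sellers (hypothetical data).
r_b_prime = np.array([2.0, 4.0, 5.0, 7.0, 8.0])   # agent b', the rater to be "translated"
r_b       = np.array([3.0, 5.0, 6.0, 8.0, 9.0])   # agent b's ratings on its own scale

# Least-squares estimate of the pairwise parameters alpha, beta of formula (1).
alpha, beta = np.polyfit(r_b_prime, r_b, deg=1)

# Translate a new rating by b' into b's internal scale.
translated = alpha * 6.0 + beta
print(round(alpha, 2), round(beta, 2), round(translated, 2))
```

Because the sample data are exactly linear, the fit recovers α = 1 and β = 1; with noisy real ratings, the translated values would be treated as statistical samples, as described above.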
The problem of estimating those parameters reduces to the well-studied problem of linear regression. There is a huge literature on the topic and a lot of efficient techniques which are applicable to this context (Malinvaud, 1966; Pindyck and Rubinfeld, 1981).

Both classification and regression approaches relate buyers to one another based on their ratings for a common set of sellers. If the universe of sellers is large enough, even active buyers may have rated a very small subset of sellers. Accordingly, classification and regression approaches may be unable to calculate estimated reputations for many seller-buyer pairs. Furthermore, the accuracy of such reputation estimates may be poor because fairly little ratings data can be used to derive them. This problem is known as reduced coverage and is due to the sparse nature of ratings. Such weaknesses are prompting researchers to experiment with the use of techniques from the field of Knowledge Discovery in Databases (Fayyad et al., 1996), which discover latent relationships among elements of sparse databases, in the context of online reputation reporting systems.
The promising use of one such technique, Singular Value Decomposition (SVD), has been reported in (Billsus and Pazzani, 1998; Sarwar et al., 2000).
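The SVD idea can be illustrated with a low-rank approximation of a small ratings matrix. The data, the mean-imputation step and the choice of rank are illustrative assumptions for the sketch, not the specific method of the cited papers:

```python
import numpy as np

# Hypothetical 4-buyers x 3-sellers rating matrix; np.nan marks unrated pairs.
R = np.array([[8.0, 6.0, np.nan],
              [7.0, 5.0, 6.0],
              [2.0, 1.0, 3.0],
              [np.nan, 6.0, 7.0]])

# Fill missing entries with each seller's mean rating before factoring.
col_means = np.nanmean(R, axis=0)
filled = np.where(np.isnan(R), col_means, R)

# A rank-2 truncated SVD smooths the filled matrix; its entries at the
# originally missing buyer-seller pairs serve as reputation predictions.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
approx = U[:, :k] * s[:k] @ Vt[:k, :]

print(np.round(approx, 2))
```

The latent factors found this way can relate buyers who share few or no commonly rated sellers, which is exactly the reduced-coverage situation described above.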
3. Unfair ratings in collaborative-filtering-based reputation reporting systems

Of the various classes of systems surveyed in the previous section, we believe that recommendation repositories with collaborative filtering have the best potential for scalability and accuracy. Nevertheless, while these techniques address issues related to the subjective nature of ratings, they do not address the problem of unfair ratings. This section looks at this problem in more detail. More specifically, our goal is to study a number of unfair rating scenarios and analyze their effects in compromising the reliability of a collaborative-filtering-based reputation reporting system.
To simplify the discussion, in the rest of the paper we make the following assumptions. We assume a trading community whose participants are distinguished into buyers and sellers. We further assume that only buyers can rate sellers; in a future study we will consider the implications of bi-directional ratings. In a typical transaction i, a buyer b contracts with a seller s for the provision of a service. Upon conclusion of the transaction, b provides a numerical rating R_b(i), reflecting some attribute Q of the service offered by s as perceived by b (ratings can only be submitted in conjunction with a transaction). Again, for the sake of simplicity, we assume that R_b(i) is a scalar quantity although, in most transactions, there is more than one critical attribute and R_b(i) would be a vector. We further assume the existence of an online reputation reporting mechanism whose goal is to store and process past ratings in order to calculate reliable personalized reputation estimates R̂_b(s) for a seller s upon request of a prospective buyer b.

In settings where the critical attribute Q for which ratings are provided is subjectively measurable, there exist four scenarios where buyers and/or sellers can intentionally try to "rig the system", resulting in biased reputation estimates which deviate from a "fair" assessment of attribute Q for a given seller:
a. Unfair ratings by buyers

• Unfairly high ratings ("ballot stuffing"): A seller colludes with a group of buyers in order to be given unfairly high ratings by them. This will have the effect of inflating the seller's reputation, therefore allowing that seller to receive more orders from buyers, and at a higher price than she deserves.

• Unfairly low ratings ("bad-mouthing"): Sellers can collude with buyers in order to "bad-mouth" other sellers that they want to drive out of the market. In such a situation, the conspiring buyers provide unfairly negative ratings to the targeted sellers, thus lowering their reputation.

b. Discriminatory seller behavior

• Negative discrimination: Sellers provide good service to everyone except a few specific buyers that they "don't like". If the number of buyers being discriminated upon is relatively small, the cumulative ratings of such sellers will remain high and will therefore fail to predict the poor service offered to the victims of the discrimination.

• Positive discrimination: Sellers provide exceptionally good service to a few select individuals and average service to the rest. The effect of this is equivalent to ballot stuffing. That is, if the favored group is sufficiently large, their favorable ratings will inflate the reputation of discriminating sellers and will create an externality against the rest of the buyers.
The observable effect of all four above scenarios is that there will be a dispersion of ratings for a given seller. If the rated attribute is not objectively measurable, it will be very difficult, or impossible, to distinguish rating dispersion due to genuine taste differences from that which is due to unfair ratings or discriminatory behavior. This creates a moral hazard, which requires additional mechanisms in order to be either avoided, or detected and resolved.
In the following analysis, we assume the use of collaborative filtering techniques in order to address the issue of subjective ratings. More specifically, we assume that, in order to estimate the personalized reputation of s from the perspective of b, some collaborative filtering technique is used to identify the nearest neighbor set N of b. N includes buyers who have previously rated s and who are the nearest neighbors of b, based on the similarity of their ratings for other commonly rated sellers with those of b. Sometimes, this step will filter out all unfair buyers. Suppose, however, that the colluders have taken collaborative filtering into account and have cleverly picked buyers whose tastes are similar to those of b in everything else except their ratings of s. In that case, the resulting set N will include some fair raters and some unfair raters.
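The nearest-neighbor selection step assumed here can be sketched using Pearson correlation over commonly rated sellers, one of the statistical similarity measures in the spirit of Resnick et al. (1994). The buyers, sellers and ratings below are hypothetical:

```python
import numpy as np

def similarity(a, b):
    """Pearson correlation between two buyers over commonly rated sellers.

    `a` and `b` map seller ids to ratings; only sellers rated by both count.
    Buyers with fewer than two common sellers, or constant ratings, score 0.
    """
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    x = np.array([a[s] for s in common], dtype=float)
    y = np.array([b[s] for s in common], dtype=float)
    if x.std() == 0 or y.std() == 0:
        return 0.0
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical ratings on a 0-9 scale, keyed by seller id.
b = {"s1": 8, "s2": 3, "s3": 7}
peers = {
    "b1": {"s1": 9, "s2": 2, "s3": 8},   # similar tastes to b
    "b2": {"s1": 1, "s2": 8, "s3": 2},   # opposite tastes
    "b3": {"s1": 5},                     # too little overlap
}

scores = {name: similarity(b, r) for name, r in peers.items()}
nearest = max(scores, key=scores.get)
print(scores, nearest)
```

As the text notes, colluders who mimic b's tastes on all other sellers would score highly under any such measure and therefore survive this filtering step.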
Effects when reputation is steady over time

The simplest scenario to analyze is one where we can assume that agent behavior, and therefore reputation, remains steady over time. That means that collaborative filtering algorithms can take into account all ratings in their database, no matter how old. In order to make our analysis more concrete, we will make the assumption that fair ratings can range between [R_min, R_max] and that they follow a distribution of the general form:

F(R) = max(R_min, min(R_max, z)), where z ~ N(μ, σ)    (2)

which in the rest of the paper will be approximated by F(R) ≈ N(μ, σ). The introduction of minimum and maximum rating bounds corresponds nicely with common practice. The assumption of normally distributed fair ratings requires more discussion. It is based on the previous assumption that those ratings belong to the nearest neighbor set of a given buyer, and therefore represent a single taste cluster. Within a taste cluster, it is expected that fair ratings will be relatively closely clustered around some value and hence approximately normally distributed.

In this paper we will focus on the reliable estimation of the reputation mean. Given all the above assumptions, the goal of a reliable reputation reporting system should be the calculation of a fair mean reputation estimate (MRE) R̂_fair which is equal to, or very close to, μ, the mean of the fair ratings.
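The fair-rating model of formula (2) and the resulting fair MRE can be simulated directly. The parameter values below are illustrative, chosen so that the clipping bounds barely affect the mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw fair ratings from the clipped normal of formula (2):
# F(R) = max(R_min, min(R_max, z)), with z ~ N(mu, sigma).
R_MIN, R_MAX = 0.0, 9.0
mu, sigma = 6.0, 1.0

z = rng.normal(mu, sigma, size=10_000)
fair_ratings = np.clip(z, R_MIN, R_MAX)

# With little probability mass clipped at these parameters, the sample
# mean of the fair ratings (the fair MRE) recovers mu closely.
fair_mre = fair_ratings.mean()
print(round(fair_mre, 2))
```

When μ sits near one of the bounds, the clipping would bias the sample mean, which is why the N(μ, σ) approximation is stated only for the interior of the ratings range.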
On the other hand, the goal of unfair raters is to strategically introduce unfair ratings in order to maximize the distance between the actual MRE R̂_actual, calculated by the reputation system, and the fair MRE. More specifically, the objective of ballot-stuffing agents is to maximize the MRE, while bad-mouthing agents aim to minimize it. Note that, in contrast to the case of fair ratings, it is not safe to make any assumptions about the form of the distribution of unfair ratings. Therefore, all analyses in the rest of this paper will calculate system behavior under the most disruptive possible unfair rating strategy. We will only analyze the case of ballot stuffing, since the case of bad-mouthing is symmetrical.
willonly analyzethecaseofballot-stuffingsince thecaseof bad-mouthingissymmetrical.Assume
thattheinitialcollaborativefilteringstepconstructsa nearestneighborsetN,in
which
theproportionofunfairratersis5andthe proportionoffairratersis i5. Finally,ourbaseline analysisin thissectionassumes
thattheactual
MRE
/?' istakentobethesamplemean
ofthemost
recent ratinggivento sby
eachb.aclual
qualifyingrater inA^.This simpleestimatorisconsistentwiththepracticeof
most
current-generationcommercial
recommender
systems(Schaferet. al. 2001).In that case,the actualMRE
willapproximate:b.actual ^ ' '^ '^ u
where
U
isthemean
valueofunfairratings.The
strategy,which maximizes
theabove
MRE
isonewhere
u
=
R
,i.e.where
allunfairbuyers givethemaximum
possiblerating totheseller.'^
u max
We
define themean
reputation estimate biasforacontaminatedsetofratings to be:B
=
Rl
-R'.
(5)b.uctual bjair
Inthe
above
scenario,themaximum
MRE
biasisgivenby:B
={\-S)H
+
dR
-^
=
5{R
-jU)
(6)max ' max
' max
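A quick numeric check of the bias formulas (5) and (6) under the worst-case ballot-stuffing strategy, where all unfair ratings equal R_max; the values of μ and δ are illustrative:

```python
# Worst-case ballot stuffing: a fraction delta of raters all report R_max.
R_MAX = 9.0
mu = 3.0        # mean of the fair ratings (illustrative)
delta = 0.3     # fraction of unfair raters in the nearest neighbor set

mre_actual = (1 - delta) * mu + delta * R_MAX   # contaminated MRE with u-bar = R_max
bias = mre_actual - mu                          # formula (5), with fair MRE = mu
bias_formula = delta * (R_MAX - mu)             # formula (6)

print(round(mre_actual, 2), round(bias, 2), round(bias_formula, 2))  # 4.8 1.8 1.8
```

The two bias computations agree, and the example illustrates the point made below: bias grows with δ and with the distance between μ and R_max.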
Figure 1 tabulates some values of B_max for several different values of μ and δ, in the special case where ratings range from [0, 9]. For the purpose of comparing this baseline case with the "immunization mechanisms" described in Section 4, we have highlighted biases above 5% of the ratings range (i.e. biases greater than ±0.5 points on ratings which range from 0 to 9). As can be seen, formula (6) can result in very significant inflation of a seller's MRE, especially for small μ and large δ.

[Figure 1: Maximum MRE bias B_max for several values of μ and δ.]
Keeping only the most recent rating given by each buyer to a given seller has been proposed as a solution. In environments where reputation estimates use all available ratings, this simple strategy ensures that eventually δ can never be more than the actual fraction of unfair raters in the community, usually a very small fraction. However, the strategy breaks down in environments where reputation estimates are based on ratings submitted within a relatively short time window (or where older ratings are heavily discounted). The following paragraph explains why.
Let us assume that the initial nearest neighbor set N_initial contains m fair raters and n unfair raters. In most cases n ≪ m. Assume further that the average interarrival time of fair ratings for a given seller is λ and that personalized MREs are based only on ratings for s submitted by buyers in N_initial within the time window W = [t − kλ, t]. Based on the above assumptions, the average number of fair ratings submitted within W would be equal to k. To ensure accurate reputation estimates, the width of the time window W should be relatively small; therefore k should generally be a small number (say, between 5 and 20). For k ≪ m we can assume that every rating submitted within W is from a distinct fair rater.

Assume now that unfair raters flood the system with ratings at a frequency much higher than the frequency of fair ratings. If the unfair rating frequency is high enough, every one of the n unfair raters will have submitted at least one rating within the time window W. As suggested by Zacharia et al., we keep only the last rating sent by each rater. Even using that rule, however, the above scenario would result in an active nearest neighbor set of raters where the fraction of unfair raters is δ = n/(n + k). This expression results in δ > 0.5 for n > k, independent of how small n is relative to m. For example, if n = 10 and k = 5, δ = 10/(10 + 5) ≈ 0.67. We therefore see that, for relatively small time windows, even a small (e.g. 5-10) number of colluding buyers can successfully use unfair ratings flooding to dominate the set of ratings used to calculate MREs and completely bias the estimate provided by the system.

The results of this section indicate that even a relatively small number of unfair raters can significantly compromise the reliability of collaborative-filtering-based reputation reporting systems. This requires the development of effective measures for addressing the problem. The next section proposes and analyzes several such measures.
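The flooding attack and the δ = n/(n + k) expression can be checked with a small deterministic simulation; the parameter values mirror the n = 10, k = 5 example above, with fair ratings arriving exactly once every λ time units:

```python
# Deterministic sketch of the unfair-ratings-flooding attack (illustrative values).
m, n = 100, 10      # fair raters and colluding unfair raters (n << m)
lam = 1.0           # interarrival time of fair ratings
k = 5               # window width W = (t_end - k*lam, t_end]

t_end = 200.0
window_start = t_end - k * lam

# Fair ratings arrive once every lam units, each from a distinct fair rater.
fair = [(float(t), "fair-%d" % (t % m)) for t in range(1, int(t_end) + 1)]

# Each of the n colluders floods the system every 0.1 time units near t_end.
unfair = [(t_end - i * 0.1, "unfair-%d" % j) for j in range(n) for i in range(50)]

# Keep only the most recent rating per rater inside the window (Zacharia et al.).
latest = {}
for t, rater in sorted(fair + unfair):
    if t > window_start:
        latest[rater] = t

n_unfair = sum(1 for r in latest if r.startswith("unfair"))
delta = n_unfair / len(latest)
print(n_unfair, len(latest) - n_unfair, round(delta, 2))  # 10 5 0.67
```

All ten colluders survive the last-rating-per-rater rule, while only k = 5 fair ratings fall inside the window, reproducing δ = n/(n + k) regardless of how large m is.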
4. Mechanisms for immunizing online reputation reporting systems against unfair rater behavior

Having recognized the problem of unfair ratings as a real and important one, this section proposes a number of mechanisms for eliminating or significantly reducing its adverse effects on the reliability of online reputation reporting systems.
4.1 Avoiding negative unfair ratings using controlled anonymity

The main argument of this section is that the anonymity regime of an online community can influence the kinds of reputation system attacks that are possible. A slightly surprising result is the realization that a fully transparent marketplace, where everybody knows everybody else's true identity, incurs more dangers of reputation system fraud than a marketplace where the true identities of traders are carefully concealed from each other but are known to a trusted third entity (usually the market maker).

Bad-mouthing and negative discrimination are based on the ability to pick a few specific "victims" and give them unfairly poor ratings or provide them with poor service, respectively. Usually, victims are selected based on some real-life attributes of their associated principal entities (for example, because they are our competitors or because of religious or racial prejudices). This adverse selection process can be avoided if the community conceals the true identities of the buyers and sellers from each other. In such a "controlled anonymity" scheme, the marketplace knows the true identity of all market participants by applying some effective authentication process before it allows access to any agent (Hutt et al., 1995). In addition, it keeps track of all transactions and ratings. The marketplace publishes the estimated reputations of buyers and sellers but keeps their identities concealed from each other (or assigns them pseudonyms which change from one transaction to the next, in order to make identity detection very difficult). In that way, buyers and sellers make their decisions solely based on the offered terms of trade as well as the published reputations. Because they can no longer identify their "victims", bad-mouthing and negative discrimination can be avoided.
Itisinterestingtoobservethat,while,in
most
cases,theanonymity
ofonlineconmiunities has been viewedas asourceofadditionalrisks(Kollock 1999;Friedman and Resnick1999), here
we
have anexample
ofasituation
where
some
controlleddegreeof anonymity can be usedtoeliminatesome
transactionrisks.Concealingthe identitiesof buyers andsellersisnot possibleinalldomains. For example, concealingthe
identityofsellersisnot possibleinrestaurantandhotelratings(althoughconcealingtheidentityof buyers
is).Inotherdomains,it
may
requirethecreative interventionofthe marketplace.For example,inamarketplaceofelectronic
component
distributors,itmay
requirethemarketplaceto act asanintermediaryshipping
hub
that willhelp erase informationaboutthe seller'saddress.Ifconcealingtheidentitiesof bothpartiesfrom eachotherisnot possible, thenit
may
stillbeusefultoconcealthe identityofoneparty only.
More
specifically,concealingtheidentityof buyersbut notsellersavoids negative discrimination against
hand
picked buyersbutdoesnotavoid bad-mouthing ofhand
pickedsellers.Inananalogousmanner, concealingtheidentityofsellersbut notbuyers avoids bad-mouthingbut not negative discrimination.
These
resultsaresummarized
inFigure2.Generallyspeaking,concealingtheidentitiesof buyersisusually easierthanconcealingtheidentitiesof
sellers (asimilarpointis
made
inCranor and Resnick1999). Thismeans
thatnegative discriminationiseasier toavoidthan"bad-mouthing". Furthermore, concealingtheidentitiesofsellersbefore a serviceis
performedisusually easier than afterwards. In
domains
withthisproperty, controlledanonymity
canbesubsequent bad-mouthing. For example,intheabove-mentioned marketplace ofelectronic
component
distributors,one couldconcealthe identitiesofsellers untilaftertheclosingofadeal.
Assuming
that thenumber
ofdistributors for agivencomponent
typeisrelatively large, thisstrategywould
make
itdifficult,or impossible,formalevolent buyersto intentionallypickspecific distributors forsubsequent
bad-mouthing.
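A controlled-anonymity scheme of this kind can be sketched in a few lines. The class and method names, and the use of random hex tokens as pseudonyms, are illustrative assumptions; only the behavior (authenticated registration, identities known solely to the market-maker, per-transaction pseudonyms, published reputations) comes from the discussion above:

```python
import secrets
from collections import defaultdict

class Marketplace:
    """Sketch of "controlled anonymity": the market-maker knows every
    trader's authenticated identity, while other traders see only a fresh
    per-transaction pseudonym plus a published reputation estimate."""

    def __init__(self):
        self._identity = {}                # pseudonym -> true identity
        self._reputation = {}              # true identity -> published MRE
        self._ratings = defaultdict(list)  # true identity -> received ratings

    def register(self, true_identity, reputation=5.0):
        # Stand-in for an effective authentication process (Hutt et al. 1995).
        self._reputation[true_identity] = reputation

    def listing_for(self, seller):
        """What a buyer is shown: a one-off pseudonym and the seller's
        published reputation, never the seller's true identity."""
        pseudonym = secrets.token_hex(8)   # changes from deal to deal
        self._identity[pseudonym] = seller
        return pseudonym, self._reputation[seller]

    def record_rating(self, pseudonym, rating):
        # The market-maker resolves the pseudonym internally, so ratings
        # still accumulate against the correct principal entity.
        seller = self._identity[pseudonym]
        self._ratings[seller].append(rating)
        return seller
```

Because the pseudonym is regenerated for every listing, a buyer cannot correlate two transactions with the same seller, which is exactly what makes hand-picking "victims" for bad-mouthing or discrimination impractical.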
4.2 Reducing the effect of unfair ratings using median filtering

The sample median Ỹ of n ordered observations Y₁ ≤ Y₂ ≤ ... ≤ Yₙ is the middle observation Y_k, where k = (n+1)/2 if n is odd. When n is even, Ỹ is considered to be any value between the two middle observations Y_k and Y_{k+1}, where k = n/2, although it is most often taken to be their average.

In the absence of unfair ratings (i.e. when δ = 0), fair ratings are drawn from the normal distribution N(μ, σ) assumed previously. It is well known (Hojo, 1931) that, as the size n of the sample increases, the median of a sample drawn from a normal distribution converges rapidly to a normal distribution with mean equal to the median of the parent distribution. In normal distributions, the median is equal to the mean. Therefore, in situations where there are no unfair raters, the use of the sample median results in unbiased fair MREs: E[R̂_b,fair] = μ.
Let us now assume that unfair raters know that MREs are based on the sample median. They will strategically try to introduce unfair ratings whose values maximize the absolute bias between the sample median of the fair set and the sample median of the contaminated set. More specifically, "ballot stuffers" will try to maximize that bias while "bad-mouthers" will try to minimize it. In the following analysis we consider the case of ballot stuffing. The case of bad-mouthing is symmetric, with the signs reversed.

Assume that the nearest neighbor set consists of n_f = (1 − δ)·n fair ratings and n_u = δ·n unfair ratings, where 0 ≤ δ < 0.5. The most disruptive unfair ratings strategy, in terms of influencing the sample median, is one where all unfair ratings are higher than the sample median of the contaminated set. In that case, and for δ < 0.5, all the ratings which are lower than or equal to the sample median will have to be fair ratings. Then the sample median of the contaminated set will be identical to the k-th order statistic of the set of n_f fair ratings, where k = (n+1)/2. It has been shown (Cadwell 1952) that, as the size n of the sample increases, the k-th order statistic of a sample drawn from a normal distribution N(μ, σ) converges rapidly to a normal distribution with mean equal to the q-th quantile of the parent distribution, where q = k/n. Therefore, for large rating samples n, under the worst possible unfair ratings strategy, the sample median of the contaminated set will converge to X = μ + σ·Φ⁻¹(q), where

q = k/n_f = [(n+1)/2] / [(1 − δ)·n] → 1 / [2·(1 − δ)]    (9)

and Φ⁻¹(·) is the inverse standard normal CDF.

Given that E[R̂_b,fair] = μ, the asymptotic formula for the average reputation bias achievable by δ·100% unfair ratings, when fair ratings are drawn from a normal distribution N(μ, σ) and unfair ratings follow the most disruptive possible unfair ratings distribution, is given by:

E[B_max] = E[R̂_b,actual − R̂_b,fair] = σ·Φ⁻¹(1 / [2·(1 − δ)])    (10)
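Equation (10) can be evaluated directly with the standard library's inverse normal CDF. The code below is a direct transcription of the formula; the particular δ and σ values in the loop are illustrative choices:

```python
from statistics import NormalDist

def max_median_bias(delta: float, sigma: float) -> float:
    """Worst-case asymptotic bias of the sample-median MRE, equation (10):
    E[B_max] = sigma * Phi^{-1}(1 / (2 * (1 - delta)))."""
    return sigma * NormalDist().inv_cdf(1.0 / (2.0 * (1.0 - delta)))

# The bias grows with the unfair ratio delta and scales linearly in sigma.
for delta in (0.1, 0.2, 0.3, 0.4):
    print(delta, round(max_median_bias(delta, sigma=0.5), 3))
```

Note that for δ = 0 the argument of Φ⁻¹ is 0.5, so the bias is exactly zero, recovering the unbiasedness result of the previous paragraphs.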
Figure 3 shows some of the values of E[B_max] for various values of δ and σ in the special case where ratings range from 0 to 9. Given that we have assumed that all ratings in the nearest neighbor set correspond to users in the same taste cluster, it is expected that the standard deviation of the fair ratings will be relatively small. Therefore, we did not consider standard deviations higher than 10% of the ratings range. It is obvious that the maximum bias increases with the percentage of unfair ratings and is directly proportional to the standard deviation of the fair ratings. As before, we have highlighted maximum average biases of 5% of the rating range or more. Figure 3 clearly shows that the use of the sample median as the basis for calculating MREs manages to reduce the maximum average bias to below 5% of the rating range for unfair rater ratios of up to 30-40% and a wide range of fair rating standard deviations.
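The robustness of the median against the worst-case ballot-stuffing strategy is easy to demonstrate numerically. The parameter values below (μ = 6, σ = 0.5 on a 0-9 scale, δ = 30%) are illustrative choices consistent with the ranges discussed above, not figures taken from the paper:

```python
import random
import statistics

random.seed(1)
mu, sigma, n_fair = 6.0, 0.5, 200        # fair ratings ~ N(mu, sigma)
delta = 0.3                              # 30% of the final set is unfair
n_unfair = round(delta * n_fair / (1 - delta))

fair = [random.gauss(mu, sigma) for _ in range(n_fair)]
# Worst-case ballot stuffing: every unfair rating sits above the median,
# here pinned to the top of the 0-9 rating scale.
contaminated = fair + [9.0] * n_unfair

mean_bias = statistics.mean(contaminated) - mu
median_bias = statistics.median(contaminated) - mu
print(round(mean_bias, 2), round(median_bias, 2))
```

The mean-based MRE is dragged most of the way toward the planted ratings, while the median's bias stays near the σ·Φ⁻¹(1/(2(1 − δ))) value predicted by equation (10).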
4.3 Using frequency filtering to eliminate unfair ratings flooding

Formulas (6) and (10) confirm the intuitive fact that the reputation bias due to unfair ratings increases with the ratio δ of unfair raters in a given sample. In settings where a seller's quality attributes can vary over time (most realistic settings), calculation of reputation should be based on recent ratings only, using time discounting or a time-window approach. In those cases, Section 3 demonstrated that by "flooding" the system with ratings, a relatively small number of unfair raters can manage to increase the ratio δ of unfair ratings in any given time window above 50% and completely compromise the reliability of the system. This section proposes an approach for effectively immunizing a reputation reporting system against unfair ratings flooding. The main idea is to filter raters in the nearest neighbor set based on their ratings submission frequency.
Description of frequency filtering

Step 1: Frequency filtering depends on estimating the average frequency of ratings submitted by each buyer for a given seller. Since this frequency is a time-varying quantity (sellers can become more or less popular with the passage of time), it, too, needs to be estimated using a time-window approach. More specifically:

1. Calculate the set F^s(t) of buyer-specific average ratings submission frequencies f_b^s(t) for seller s, for each buyer b that has submitted ratings for s during the ratings submission frequency calculation time window W_E = [t − E, t]. More precisely,

f_b^s(t) = (number of ratings submitted for s by b during W_E) / E    (11)

2. Set the cutoff frequency f_cutoff^s(t) to be equal to the k-th order statistic of the set F^s(t), where k = (1 − D)·n, n is the number of elements of F^s(t), and D is a conservative estimate of the fraction of unfair raters in the total buyer population for seller s. For example, if we assume that there are no more than 10% unfair raters among all the buyers for seller s, then D = 0.1. Assuming further that n = 100, i.e. that the set F^s(t) contains average ratings submission frequencies from 100 buyers, then the cutoff frequency would be equal to the 90-th smallest frequency (the 10-th largest frequency) present in the set F^s(t). The width E of the ratings submission frequency calculation time window W_E should be large enough to yield reliable frequency estimates.

Step 2: During the calculation of an MRE for seller s, eliminate all raters b in the nearest neighbor set for whom f_b^s > f_cutoff^s. In other words, eliminate all buyers whose average ratings submission frequency for seller s is above the cutoff frequency.
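Steps 1 and 2 can be sketched as follows. The function name and data layout are assumptions made for illustration; only the frequency formula (11), the (1 − D)·n order-statistic cutoff, and the elimination rule come from the description above:

```python
from collections import Counter

def frequency_filter(ratings_log, neighbor_set, t_now, E, D):
    """Frequency-filter the nearest neighbor set for one seller s.
    ratings_log: iterable of (timestamp, buyer_id) ratings for s."""
    # Step 1, equation (11): per-buyer average submission frequency
    # over the window W_E = [t_now - E, t_now].
    counts = Counter(b for t, b in ratings_log if t_now - E <= t <= t_now)
    freqs = {b: c / E for b, c in counts.items()}
    # Step 1, cutoff: the k-th order statistic with k = (1 - D) * n.
    ordered = sorted(freqs.values())
    k = max(1, int((1 - D) * len(ordered)))
    f_cutoff = ordered[k - 1]
    # Step 2: drop every rater whose frequency exceeds the cutoff.
    return {b for b in neighbor_set if freqs.get(b, 0.0) <= f_cutoff}

# Illustrative data: nine ordinary buyers rate once inside the window,
# while one colluder floods it with 50 ratings.
log = [(t, "flooder") for t in range(50)]
log += [(5, f"buyer-{i}") for i in range(9)]
kept = frequency_filter(log,
                        {f"buyer-{i}" for i in range(9)} | {"flooder"},
                        t_now=50, E=50, D=0.1)
print(sorted(kept))   # the flooder is eliminated, the nine buyers remain
```

With n = 10 frequencies and D = 0.1, the cutoff is the 9-th smallest frequency, so the single flooding rater (whose frequency is fifty times higher) falls above the cutoff and is removed before the MRE is computed.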
Analysis of frequency filtering

We will show that frequency filtering provides effective protection against unfair ratings flooding by guaranteeing that the ratio of unfair raters in the MRE calculation set cannot be more than twice as large as the ratio of unfair raters in the total buyer population. As before, we will assume that the entire buyer population is n, unfair raters are δ·n « n, and the width of the reputation estimation time window is a relatively small W (so that each rating within W typically comes from a different rater). Then, after applying frequency filtering to the nearest neighbor set of raters, in a typical time window we expect to find

W·(1 − δ)·n·∫₀^{f_cutoff} u·φ(u) du

fair ratings, where φ(u) is the probability density function of fair rating frequencies, and at most

W·δ·n·α·f_cutoff

unfair ratings, where 0 ≤ α ≤ 1 is the fraction of unfair raters with submission frequencies below f_cutoff. Therefore, the unfair/fair ratings ratio in the final set would be equal to:

(unfair ratings)/(fair ratings) = δ′/(1 − δ′) = [δ/(1 − δ)] · [α·f_cutoff / ∫₀^{f_cutoff} u·φ(u) du] = [δ/(1 − δ)] · I    (12)

where I = α·f_cutoff / ∫₀^{f_cutoff} u·φ(u) du denotes the inflation of the unfair/fair ratings ratio in the final set relative to its value in the original set.
The goal of unfair raters is to strategically distribute their ratings frequencies above and below the cutoff frequency in order to maximize I. In contrast, the goal of the market designer is to pick the cutoff frequency f_cutoff so as to minimize I. The cutoff frequency has been defined as the (1 − D)·n-th order statistic of the sample of buyer frequencies, where D ≥ δ. For relatively large samples, this converges to the q-th quantile of the fair rating frequencies, where

(1 − D)·n = q·(1 − δ)·n + α·δ·n  ⟹  q = (1 − D − α·δ)/(1 − δ)    (13)

From this point on, the exact analysis requires some assumptions about the probability density function of fair rating frequencies. We start by assuming a uniform distribution between F_min = f̄/(1 + s) and F_max = f̄·(1 + s). Let S = F_max − F_min. Then, by applying the properties of uniform probability distributions to equation (12), we get the following expression for the inflation I of unfair ratings:

I = 2·α·S·f_cutoff / (f_cutoff² − F_min²),  where f_cutoff = F_min + q·S    (14)
After some algebraic manipulation we find that ∂I/∂α > 0 and ∂I/∂D > 0. This means that unfair raters will want to maximize α, the fraction of their ratings frequencies that are less than or equal to f_cutoff, while market makers will want to minimize D, i.e. set D as close as possible to an accurate estimate of the ratio of unfair raters in the total population. Therefore, at equilibrium, α = 1, D = δ and:

I = 2·(F_max − ε·S) / [(1 − ε)·(F_min + F_max − ε·S)],  where ε = δ/(1 − δ)    (15)

The above expression for the unfair/fair ratings inflation depends on the spread S of fair rating frequencies. At the limiting cases we get

lim_{S→0} I = 1/(1 − ε)  and  lim_{S→∞} I = 2/(1 − ε)

By substituting the above limiting values of I in equation (12), we get the final formula for the equilibrium relationship between δ, the ratio of unfair raters in the total population of buyers, and δ′, the final ratio of unfair ratings in the nearest neighbor set when time windowing and frequency filtering are used:

δ/(1 − δ) ≤ δ′ ≤ 2·δ    (16)

Equation (16) shows that, no matter how hard unfair raters may try to "flood" the system with ratings, the presence of frequency filtering guarantees that they cannot inflate their presence in the final MRE calculation set by more than a factor of 2. This concludes the proof.
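The equilibrium bound (16) can be verified numerically for the uniform case. The normalization F_min = 1 and the particular spread values in the loop are arbitrary choices; the two functions transcribe equations (15) and (12) from the derivation above:

```python
def inflation_uniform(delta, spread):
    """Equilibrium inflation I of equation (15) (alpha = 1, D = delta)
    for fair frequencies uniform on [F_min, F_min + S], with F_min
    normalized to 1 and spread = S / F_min."""
    eps = delta / (1 - delta)
    f_min, S = 1.0, spread
    f_max = f_min + S
    return 2 * (f_max - eps * S) / ((1 - eps) * (f_min + f_max - eps * S))

def final_unfair_ratio(delta, spread):
    """Invert equation (12): delta'/(1 - delta') = delta/(1 - delta) * I."""
    x = delta / (1 - delta) * inflation_uniform(delta, spread)
    return x / (1 + x)

delta = 0.1
for spread in (1e-3, 1.0, 1e3, 1e6):
    d_final = final_unfair_ratio(delta, spread)
    # Equation (16): delta/(1 - delta) <= delta' <= 2 * delta.
    assert delta / (1 - delta) - 1e-9 <= d_final <= 2 * delta + 1e-9
```

As the spread grows from near zero to very large values, δ′ moves from the lower limit δ/(1 − δ) toward the upper limit 2δ, matching the two limiting values of I computed above.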
In most online communities, the exact ratio δ of unfair raters will not be known exactly. In such cases, if we have a belief that δ ≤ 0.1, then setting D = 0.1 has been experimentally shown to result in inflation ratios which also fall within the bounds of equation (16). A more realistic assumption about fair rating frequencies is that they follow a lognormal distribution with mean f̄ and variance related to the frequency spread S. This assumption is consistent with the findings of Lawrence (1980). In that case, the equilibrium inflation factor cannot be given in closed form. However, a numerical solution yields results which approximate very closely those obtained analytically for uniformly distributed fair rating frequencies (Figure 4).
Figure 4. Maximum unfair ratings inflation factors when frequency filtering is used (δ = D = 0.1); inflation is plotted against the frequency spread for uniform and lognormal fair rating frequency distributions.

Given that median filtering guarantees reputation biases of less than 5% of the ratings scale (e.g. less than ±0.5 points when ratings range from 1-10) for contamination ratios of up to 30-40%, and frequency filtering guarantees that unfair raters cannot use flooding to inflate their presence by more than a factor of two, the combination of frequency filtering and median filtering guarantees reputation biases of less than 5% when the ratio of unfair raters is up to 15-20% of the total buyer population for a given seller.

One possible criticism of the frequency filtering approach is that it potentially eliminates those fair buyers who transact most frequently with a given seller. In fact, in the absence of unfair raters, all raters who would be filtered out based on their high ratings submission frequency would be fair raters. Nevertheless, we do not believe that this property constitutes a weakness of the approach. We argue that the "best customers" of a given seller often receive preferential treatment, which is in a way a form of positive discrimination on behalf of the seller. Therefore, we believe that the potential elimination of such raters from the final reputation estimate in fact benefits the construction of more unbiased estimates for the benefit of first-time prospective buyers.
4.4 Issues in communities where buyer identity is not authenticated

The effectiveness of frequency filtering relies on the assumption that a given principal entity can only have one buyer agent acting on its behalf in a given marketplace. The technique is also valid in situations where principal entities have multiple buyer agents with authenticated identifiers. In that case, frequency filtering can be applied by aggregating the ratings submission frequencies of all agents known to act on behalf of the same principal entity.

In non-authenticated online communities (communities where "pseudonyms" are "cheap", to use the term of Friedman and Resnick) with time-windowed reputation estimation, unfair buyers can still manage to "flood" the system with unfair ratings by creating a large number of pseudonymously known buyer agents acting on their behalf. In that case, the total ratio δ of unfair agents relative to the entire buyer population can be made arbitrarily high. If each of the unfair agents takes care of submitting unfair ratings for seller s with frequency f_b^s < f_cutoff^s, then, because δ will be high, even in the presence of frequency filtering, unfair raters can still manage to severely contaminate a seller's fair reputation.

Further research is needed in order to develop immunization techniques that are effective in communities where the "true" identity of buyer agents cannot be authenticated. In the meantime, the observations of this section make a strong argument for using some reasonably effective authentication regime for buyers (for example, requiring that all newly registering buyers supply a valid credit card for authentication purposes) in all online communities where trust is based on reputational information.

5. Conclusions and Management Implications

We began this paper by arguing that managers of online marketplaces should pay special attention to the design of effective trust management mechanisms that will help guarantee the stability, longevity and growth of their respective communities. We pointed out some of the challenges of producing trust in online environments and argued that online reputation reporting systems, an emerging class of information systems, hold the potential of becoming an effective, scalable, and relatively low-cost approach for achieving this goal, especially when the set of buyers and sellers is large and volatile. Understanding the proper implementation, usage and limitations of such systems (in order to eventually overcome them) is therefore important, both for the managers as well as for the participants of online communities. This paper has contributed in this direction, first, by analyzing the reliability of current-generation reputation reporting systems in the presence of buyers who intentionally give unfair ratings to sellers and, second, by proposing and evaluating a set of "immunization mechanisms", which eliminate or significantly reduce the undesirable effects of such fraudulent behavior.

In Section 3, the motivations for submitting unfair ratings were discussed, and the effects of such ratings on biasing a reputation reporting system's mean reputation estimate of a seller were analyzed. We concluded that reputation estimation methods based on the ratings mean, which are commonly used in commercial recommender systems, are particularly vulnerable to unfair rating attacks, especially in contexts where a seller's reputation may vary over time.

We hope that the techniques proposed by this work will provide a useful basis that will stimulate further research in the important and promising field of online reputation reporting systems.

References
Arrow, K. (1963). Social Choice and Individual Values. Yale University Press.

Bakos, Y. (1997). Reducing Buyer Search Costs: Implications for Electronic Marketplaces. Management Science, Vol. 43 (12), December 1997.

Bakos, Y. (1998). Towards Friction-Free Markets: The Emerging Role of Electronic Marketplaces on the Internet. Communications of the ACM, Vol. 41 (8), August 1998, pp. 35-42.

Billsus, D. and Pazzani, M.J. (1998). Learning collaborative information filters. In Proceedings of the 15th International Conference on Machine Learning, July 1998, pp. 46-54.

Boon, S.D. and Holmes, J.G. (1991). The dynamics of interpersonal trust: resolving uncertainty in the face of risk. Pages 190-211 of Hinde, R.A. and Groebel, J. (eds), Cooperation and Prosocial Behaviour. Cambridge University Press.

Breese, J.S., Heckerman, D., and Kadie, C. (1998). Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI-98), pp. 43-52, San Francisco, July 24-26, 1998.

Cadwell, J.H. (1952). The distribution of quantiles of small samples. Biometrika, Vol. 39, pp. 207-211.

Cranor, L.F. and Resnick, P. (2000). Protocols for Automated Negotiations with Buyer Anonymity and Seller Reputations. To appear in Netnomics.

Dellarocas, C. (2000). The Design of Reliable Trust Management Systems for Online Trading Communities. Working Paper, available from http://ccs.mit.edu/dell/trustmgt.pdf

Dunn, J. (1984). The concept of 'trust' in the politics of John Locke. Chap. 12, pages 279-301 of Rorty, R., Schneewind, J.B., and Skinner, Q. (eds), Philosophy in History. Cambridge University Press.

Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. and Uthurusamy, R., eds. (1996). Advances in Knowledge Discovery and Data Mining. MIT Press, Cambridge, Mass.

Friedman, E.J. and Resnick, P. (1999). The Social Cost of Cheap Pseudonyms. Working paper. An earlier version was presented at the Telecommunications Policy Research Conference, Washington, DC, October 1998.

Gambetta, D. (ed). (1990). Trust. Oxford: Basil Blackwell.

Gordon, A.D. (1999). Classification. Boca Raton: Chapman & Hall/CRC.

Goldberg, D., Nichols, D., Oki, B.M., and Terry, D. (1992). Using Collaborative Filtering to Weave an Information Tapestry. Communications of the ACM, Vol. 35 (12), pp. 61-70, December 1992.

Hojo, T. (1931). Distribution of the median, quartiles and interquartile distance in samples from a normal population. Biometrika, Vol. 23, pp. 315-360.

Hutt, A.E., Bosworth, S. and Hoyt, D.B., eds. (1995). Computer Security Handbook (3rd edition). Wiley, New York.

Jain, A.K., Murty, M.N. and Flynn, P.J. (1999). Data clustering: a review. ACM Computing Surveys, Vol. 31 (3), September 1999, pp. 264-323.

Johnson, D.R. and Post, D.G. (1996). Law And Borders: The Rise of Law in Cyberspace. Stanford Law Review, Vol. 48.

Kaufman, L. and Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.

Kollock, P. (1999). The Production of Trust in Online Markets. In Advances in Group Processes (Vol. 16), eds. E.J. Lawler, M. Macy, S. Thyne, and H.A. Walker. Greenwich, CT: JAI Press.

Lawrence, R.J. (1980). The Lognormal Distribution of Buying Frequency Rates. Journal of Marketing Research, Vol. XVII, May 1980, pp. 212-226.

Malinvaud, E. (1966). Statistical Methods of Econometrics. Paris: North Holland.

Pindyck, R. and Rubinfeld, D.L. (1981). Econometric Models and Economic Forecasts (2nd Edition). McGraw-Hill, New York.

Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J. (1994). GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In Proceedings of the ACM 1994 Conference on Computer Supported Cooperative Work, pp. 175-186. New York, NY: ACM Press.

Resnick, P. and Varian, H.R. (1997). Recommender Systems. Communications of the ACM, Vol. 40 (3), pp. 56-58.

Sarwar, B.M., Karypis, G., Konstan, J.A., and Riedl, J. (2000). Application of Dimensionality Reduction in Recommender System: A Case Study. In ACM WebKDD 2000 Web Mining for E-Commerce Workshop.

Schmalensee, R. (1978). Advertising and Product Quality. Journal of Political Economy, Vol. 86, pp. 485-503.

Sen, A. (1986). Social choice theory. In Handbook of Mathematical Economics, Volume 3. Elsevier Science Publishers.

Schafer, J.B., Konstan, J., and Riedl, J. (2001). Electronic Commerce Recommender Applications. Journal of Data Mining and Knowledge Discovery, January 2001 (expected).

Shapiro, C. (1982). Consumer Information, Product Quality, and Seller Reputation. Bell Journal of Economics, Vol. 13 (1), pp. 20-35, Spring 1982.

Shardanand, U. and Maes, P. (1995). Social information filtering: Algorithms for automating "word of mouth". In Proceedings of the Conference on Human Factors in Computing Systems (CHI95), Denver, CO.

Smallwood, D. and Conlisk, J. (1979). Product Quality in Markets Where Consumers Are Imperfectly Informed. Quarterly Journal of Economics, Vol. 93, pp. 1-23.

Wilson, R. (1985). Reputations in Games and Markets. In Game-Theoretic Models of Bargaining, edited by Alvin Roth. Cambridge University Press, pp. 27-62.

Yahalom, R., Klein, B., and Beth, T. (1993). Trust Relationships in Secure Systems: A Distributed Authentication Perspective. In Proceedings of the IEEE Symposium on Research in Security and Privacy, Oakland, 1993.

Zacharia, G., Moukas, A., and Maes, P. (1999). Collaborative Reputation Mechanisms in Online Marketplaces. In Proceedings of the 32nd Hawaii International Conference on System Sciences (HICSS-32), Maui, Hawaii, January 1999.
Footnotes

¹ Jonathan Lebed, a 15-year-old boy, was sued by the SEC for buying large blocks of inexpensive, thinly traded stocks, posting false messages promoting the stocks on Internet message boards and then dumping the stocks after prices rose, partly as a result of his messages. The boy allegedly earned more than a quarter-million dollars in less than six months and settled the lawsuit on September 20, 2000 for $285,000 (Source: Associated Press).