I VEN V AN M ECHELEN
Prediction of a dichotomous criterion variable by means of a logical combination of dichotomous predictors
Mathématiques et sciences humaines, tome 102 (1988), p. 47-54
<http://www.numdam.org/item?id=MSH_1988__102__47_0>
© Centre d’analyse et de mathématiques sociales de l’EHESS, 1988, tous droits réservés.
L’accès aux archives de la revue « Mathématiques et sciences humaines » (http://
msh.revues.org/) implique l’accord avec les conditions générales d’utilisation (http://www.
numdam.org/conditions). Toute utilisation commerciale ou impression systématique est consti- tutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
PREDICTION OF A DICHOTOMOUS CRITERION VARIABLE
BY MEANS OF A LOGICAL COMBINATION OF DICHOTOMOUS PREDICTORS
Iven VAN MECHELEN 1
INTRODUCTION
Logical
relations between dichotomous variables are of interest in severalareas of human sciences. A number of methods have been
developed already
todetect such relations. For
example Lerman,
Gras and Rostam(1981) proposed
a
technique
to discover allsignificant pairwise implications
within a setof dichotomous variables. Another
example
is VanBuggenhaut’s (1987) approach
to reveal collections of morecomplex logical
relations.A
particular
case wherelogical
relations are of use is the situation in which one wants topredict
agiven
criterionby
means of alogical
combination of
predictor
variables. In this contribution we propose a method tofind,
for agiven
dichotomous criterion C and agiven
set of ndichotomous
potential predictor
variables(P1,...,Pn),
alogical
1 Université de
Leuven,
Tiensestraat102,
B 3000Leuven, Belgium.
The author is research assistant of The National Fund for Scientific Research
(Belgium).
fulfilled. More
formally:
with
2) generalized disjunction:
Both a subset ofpredictors
and a subset ofnegated predictors
aredisjunctively
combined. Moreformally:
with and
3) ordinary conjunction:
The criterionphenomenon
ispredicted
to takeplace
if all selectedpredictors
have value one. Stated in other words:the criterion event is
predicted
to takeplace
iff a set ofsingly
necessary and
jointly
sufficient conditions have been fulfilledsimultaneously.
Or moreformally:
with
4) generalized conjunction:
this is aconjunctive
combination of bothpredictors
andnegated predictors.
In other words: the criterion event ispredicted
to takeplace
iff both a set of necessary conditions and a set of exclusion conditions have been fulfilled. Moreformally:
with
and i
DISCREPANCIES
The
equivalence
of a criterion with alogical
combination ofpredictors
canbe read off from a truth table like Table 1. For the
example
in Table 1holds that the criterion C is
equivalent
with thegeneralized conjunctive
combination
PI
and(not PZ).
Table 1
Truth table for
hypothetical predictors
and criterionHowever,
inpractice
a condition ofperfect logical equivalence
isseldom met. There are two main reasons for this:
(1)
the values ofpredictor
and criterion variables can be disturbedby
error(e.g.
due tovarious measurement
problems)
and(2)
our models aresimplifications
ofreality;
otherpredictors
than the ones considered canplay a
role and thecorrect
logical
combination rule can be morecomplex
than the rule under consideration.In view of this
problem,
it is not feasible to evaluate the truth value ofempirical equivalence
relations in an all-or-none fashion. It makes more sense toaccept
less thanperfect
relations and to evaluate relations in terms of their number ofdiscrepancies (i.e.
the number of datapoints
that contradict therelation).
In the case of a
logical
combination ofpredictors
and a criterionthere are two
types
ofdiscrepancies: (1) "false positives"
where thecombination of
predictors
has a value of one and the criterion a value ofthe
implication
from the combination ofpredictors
to thecriterion, "false
negatives" only
contradict the inverseimplication. Therefore,
the twoimplication
relations can bedifferentially
loaded with error.Otherwise,
it can be useful to further examine thisby computing
for eachimplication
relation an index of
implicative strength (Lerman,
Gras &Rostam, 1981).
ALGORITHM
The aim of our
analysis
is to find for agiven
set ofpredictors,
agiven
criterion and a
given
combinationrule,
alogical
combination of thepredictors
that iso timall equivalent
with thecriterion,
thatis,
a combination with the minimal number ofdiscrepancies.
Probably
theonly
method to detectoptimal
combinations is the enumerative one.However,
even with a small number ofpredictors
thismethod is not workable.
Therefore,
a heuristic isproposed
here that leads tosatisfactory, yet possibly suboptimal
solutions. This heuristic is basedon Boolean
regression.
Such aregression
consists in astepwise
construction of a Boolean sum of
predictors;
in eachstep
thepredictor
thatmaximally
reduces the number of residualdiscrepancies
is added to theBoolean sum of the
previous step. -Up
till now, such a Booleanregression
was
mostly
not used in its ownright, although
it ispart
of theprocedures
in Boolean factor
analysis (Mickey,
Mundle &Engelman, 1983)
and inhierarchical classes
analysis (De
Boeck &Rosenberg,
inpress).-
For the case of an
ordinary disjunctive
combination rule it is evident thatordinary
Booleanregression-
of the criterion upon thepredictors
willlead to the
optimal
combination looked for. For thegeneralized disjunction,
a Booleanregression
of the criterion upon both the set ofpredictors
and theirnegations
will do thejob.
For the
conjunctive
combination rule one has to regress thenegation
of the criterion upon the
negation
of thepredictors.
This leads to anoptimal disjunctive
combination:After
negation
of both sides this appears to beequivalent
with theconjunctive
combination looked for:Finally,
for thegeneralized conjunctive
rule theoptimization problem
is solved
by
means of a Booleanregression
of thenegation
ofthe
criterionupon both the set of
predictors
and theirnegations.
FURTHER OPTIONS
During
the Booleanregression
the total number ofdiscrepancies
between the combination ofpredictors
and the criterion is minimized in astepwise procedure.
As standardoption,
falsepositives
and falsenegatives
aregiven
anequal weight
in this total number ofdiscrepancies. However,
thepractical importance
of the twotypes
ofdiscrepancies might
be differentdepending
on the situation. Forexample,
in a situation ofpersonnel
selection where
predictors
are scores on admission tests and the criterionto a
raising
of thepositive predictive
power of the finallogical
combination.
(Analogous reasoning
holds for the falsenegatives
and thenegative predictive power.)
Logically speaking
a differentialweighting
of the twotypes
ofdiscrepancies
amounts to a differentialweighting
of the twoimplications
that are contained in the
equivalence
relation between the combination ofpredictors
and the criterion. Togive
ahigher weight
to falsepositives,
means that one
predominantly
looks at theimplication
from the combination ofpredictors
to thecriterion; by giving
ahigher weight
to falsenegatives
one can scrutinize the inverse
implication.
ILLUSTRATION
A case
study
was set up in order to illustrate theproposed logical
combination method. This
study
concerns theprediction
ofpolitical preferences
of asubject by
means of alogical
combination ofperceived
attributes of the
politicians.
The
subject
for thisstudy
was a 25 year old malegraduate
student.First he was asked to list some
important Belgian political figures
he wasmore or less familiar with. Nineteen
politicians
were chosen: AnnemieNeyts, Guy Verhofstadt,
PaulStaes,
LudoDierickx,
FrankSwaelen,
LeoTindemans,
MarcEyskens,
Jean-LucDehaene,
JeanGol, G6rard Deprez,
WilfriedMartens,
Charles-FerdinandNothomb, Jos6 Happart,
JaakGabriels,
KarelDillen,
KrisMerckx,
Karel VanMiert, Willy Claes,
Louis Tobback. Then thesubject
was asked to describe each of thesepoliticians by
means of one ormore
typical
attributes. The twelve obtained attributes were:arrogant, realistic, idealistic, patronizing, diligent, eloquent, slimy, sly, diplomatic, willing
tolisten, experienced,
has aquality
program.Next,
a matrix ofpoliticians by
attributes was constructed. Thesubject
was askedto fill in this matrix
by assigning
to cell(i,j)
of the matrix a value of 1if the i-th
politician
had thej-th attribute,
and a value of 0 otherwise.Finally
he was asked to tell about each candidate whether he(she)
wasacceptable
to him as member of agovernment.
Of the nineteenpoliticians
six were
judged acceptable.
Theacceptability
was used as acriterion,
thatwe have tried to
predict by
means of the twelve attributes.The results of the
logical
combinationanalysis
showthat,
for theordinary
as well as for thegeneralized disjunctive
combinationrule,
onesingle
attribute"diligent"
appears asoptimal predictive
combination(with
21%
discrepancies). "Diligent"
was set forthby
thesubject
as"gives
methe
impression
to workhard".
Thepositive predictive
power is 60% and thenegative predictive
power 100%.Therefore, "diligence"
is a necessary, but not sufficient characteristic for apolitician
to bepreferred by
oursubject.
Thenegative predictive
power of 100% shows that a furtherdisjunctive
extension is notpossible.
On the otherhand,
with thesimple
as well as with the
generalized conjunctive
combinationrule,
theconjunction
of"diligent"
and"diplomatic"
appears as theoptimal
combination
(with
10.5%discrepancies).
Thepositive predictive
power of this combination is 100% and thenegative predictive
power 86.7%. So"diligent
anddiplomatic"
is a sufficient characteristic ofacceptable
politics,
arepretty
numerous!BIBLIOGRAPHY
(1)
DeBoeck, P., Rosenberg, S., "Hierarchical
Classesanalysis:
Modeland data
analysis", Psychometrika, (in press)
(2) Lerman, I.C., Gras, R., Rostam, H., "Elaboration
etevaluation d’un
indiced’implication
pour desdonnées binaires", Math. Sci. hum., 74,
(1981),
5-35.(3) Mickey,
M.R., Mundle, P., Engelman, R., "Boolean
factoranalysis’’,
inW.J. Dixon
(Ed.) ,
BMDPstatistical software, Berkeley, University
ofCalifornia
Press, 1983,
538-545.(4)
VanBuggenhaut, J., "Questionnaires booleens: schémas d’implications
et
degrés
decohésion", Math. Sci. hum., 98, (1987),
9-20.NOTE
The