Language-independent conceptual "bugs" in novice programming


HAL Id: hal-00190538

https://telearn.archives-ouvertes.fr/hal-00190538

Submitted on 23 Nov 2007



Roy D. Pea

To cite this version:

Roy D. Pea. Language-independent conceptual "bugs" in novice programming. Journal of Educational Computing Research, 1986, 2(1), pp. 25-36. ⟨hal-00190538⟩


J. EDUCATIONAL COMPUTING RESEARCH, Vol. 2(1), 1986

LANGUAGE-INDEPENDENT CONCEPTUAL "BUGS" IN NOVICE PROGRAMMING*

ROY D. PEA

ABSTRACT

This article argues for the existence of persistent conceptual "bugs" in how novices program and understand programs. These bugs are not specific to a given programming language, but appear to be language-independent. Furthermore, such bugs occur for novices from primary school to college age. Three different classes of bugs (parallelism, intentionality, and egocentrism) are identified, and exemplified through student errors. It is suggested that these classes of conceptual bugs are rooted in a "superbug," the default strategy that there is a hidden mind somewhere in the programming language that has intelligent interpretive powers.

It is well known that students have such pervasive conceptual misunderstandings as novice programmers that correct programs early in the learning process come as pleasant surprises. Even after a year or more of programming instruction, students have great difficulty predicting what output a program will have, in what order commands will be executed, or in writing and debugging original programs to solve problems. Furthermore, these problems are not confined to the very young student in elementary school [1-5] and junior high [6], but appear to pervade the programming activities of high school, college [7, 8], and mature adult students as well. What are the sources of these difficulties?

Many of these conceptual difficulties are confined to specific implementations of particular programming languages, and presumably can be ameliorated by redesigning the particular features of those implementations, or by means of automatic error finders. In an exemplary study, Soloway, Bonar, and Ehrlich have shown how an invented while looping construct not available in Pascal was easier for novices to use in writing programs than the standard Pascal looping constructs [9]. In this article, however, I plan to consider instead the kinds of fundamental and widespread conceptual misunderstandings or "bugs" [10] in program understanding that appear, from our own and others' work, to be relatively independent of specific commands or programming languages. These misunderstandings, we will argue, have less to do with the design of programming languages than with the problems people have in learning to give instructions to a computer.

* The research discussed in this article was supported by the Spencer Foundation and the National Institute of Education (Contract No. 400-830016).

© 1986, Baywood Publishing Co., Inc.

Much of our programming instruction treats learning to program as a new and independent skill having little to do with previous learning: "It is almost as close to a situation of a tabula rasa as we are going to find in an adult" [11, 12]. Furthermore, in the classroom setting, students' errors are commonly considered to be idiosyncratic problems. But something much more interesting psychologically is happening, and we must come to understand it. It is not that students don't know anything that is relevant to programming; they have an intuitive understanding of much of what we say about programming. Depending on their age and developmental level, students have available experiences, and a broad range of concepts and strategies relevant to learning to program [13]. But one of the most central aspects of their intelligence is misleading when it comes to learning to program. The novice programmer works intuitively and pursues many blind alleys in learning the formal skill of programming. But what does it mean to work "intuitively?"

Specifically, students have a predominant analogy that guides their behavior when, as novices, they write programming instructions to a computer. This analogy is conversing with a human. Their pragmatic strategies for using natural language with other humans lead them astray as they try to deal with programming, because programming is a formal system that interprets each part of a program (instructions to it) in terms of rules that are mechanistic. At least for the programming languages we will be referring to in our examples, there are strict rules for interpreting commands in a rigid sequential order, determined by how flow of control is dealt with in the language. While people are intelligent interpreters of conversations, computer programming languages are not. This fundamental feature of programming systematically violates human conversational maxims, such as the cooperative principles outlined by Grice [14], and developed in theories of natural language pragmatics (e.g., Cole [15]; Searle [16]). For example, a programming language cannot infer what a speaker means if she is not absolutely explicit, whereas a listener in a human-human conversation can query the speaker for clarification. There are similar problems in the developmental transition from oral to written communication of natural language [17, 18], where the absence of the listener sets new constraints on the explicitness with which meaning must be expressed.

My aim here is to explicate a few of the major obstacles to programming expertise presented by three major classes of students' conceptual bugs in understanding. This division is offered as a first attempt at defining a taxonomy for guiding discussion of these problems. These errors are bugs in the sense that they are systematic (that is, not random errors or sloppy work) and that they need revision and further instruction for students to make progress in learning to program. The data on which these arguments are based consisted of several years of Logo programming studies, with eight- to twelve-year-olds and fourteen- to seventeen-year-olds, and observations of high school students learning BASIC programming. Details follow where appropriate. I will close by suggesting some implications of these findings for how programming is taught.

CLASSES OF BUGS

Parallelism Bugs

The parallelism bug is revealed in diverse contexts, but its essence is the assumption that different lines in a program can be active or somehow known by the computer at the same time, or in parallel. Though there may be others, we can distinguish two different kinds of programs in which the parallelism misunderstanding is common.

One context in which the bug occurs is programs where conditional statements (IF ... THEN) occur outside of loops. A common example is one where, early in a program, a conditional statement appears. SIZE will be our variable name in this case. The program says:

IF SIZE = 10 THEN PRINT "HELLO"

Later in the program, a count-up loop is encountered, where a variable is incremented by one each time until it reaches ten.

FOR SIZE = 1 TO 10: PRINT SIZE: NEXT SIZE

Now we may ask: What do students think the computer will do? If they understand the control structure of the programming language (in this case, BASIC), they know that the IF statement is first evaluated for its truth. If SIZE is equal to ten, HELLO is printed, and control passes to the next statement. If the variable is not equal to ten, nothing is printed, and control passes to the next statement. The knowledgeable programmer knows that after the first line of the program (the IF line) is executed, it is inactive, and irrelevant to whatever the rest of the program instructions say because the control cycle never returns to it.

But a recurrent problem for students (in this case, high schoolers in their second year of computer science) to whom we have offered problems of this type is that a very different prediction is offered for what will happen. In one study, eight out of the fifteen students interviewed predicted that during the looping process, when the variable SIZE became equal to ten, HELLO would be printed. When asked to explain why, the student observed that, since variable SIZE was now equal to ten (i.e., within the loop) and the IF statement was "waiting for" the SIZE to be equal to ten, it could now print HELLO. But in fact, once the IF statement was evaluated and found false, it was never returned to in the program. There is a sense in which these students believe that all the lines in the program are active or alive at once. As one junior high student pronounced: "It looks at the program all at once because it is so fast." The program is thought to have an intelligence under the surface that monitors the action status of every line in the program simultaneously.

Now think about the logic of IF statements in natural conversation [19]. When I say to you, "If you want to go to the store, I'll drive you," there is more than an instantaneous duration to my IF statement. It may not be active for a week, or even all day, but your response does not have to be immediate. If in an hour you want to go to the store, I am still likely to drive you there. (The temporal period will vary according to context in ways currently little understood.) The idea of an IF statement being evaluated and then taken off the books, as it were, is odd from a natural language perspective. So the student has applied her intuitions about the duration of IF statements in natural language discourse to the initially mysterious domain of computer language discourse. It is possible that a different notation for if ... then (e.g., condition-action pairs) could attenuate this interpretive problem.
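The contrast between the "waiting" IF of conversation and the one-shot IF of the program can be made concrete with a short sketch. The following is a hypothetical Python rendering of the two BASIC statements, not code from the study; the function name `run_program`, the list of printed lines, and the starting value of zero for SIZE are all assumptions made for illustration.

```python
def run_program(initial_size=0):
    """Python rendering of the two BASIC statements; returns the printed lines."""
    printed = []
    size = initial_size            # assumption: an unset BASIC variable starts at 0

    # The conditional is evaluated exactly once, here, and never revisited.
    if size == 10:
        printed.append("HELLO")

    # The count-up loop runs afterwards; reaching 10 does not re-trigger the IF.
    for size in range(1, 11):
        printed.append(str(size))

    return printed


output = run_program()
print(output)
print("HELLO" in output)   # False: the IF was already "taken off the books"
```

Only when SIZE already equals ten at the moment the IF line is reached (for instance `run_program(10)`) does HELLO appear, and then it appears before the loop output rather than during it.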

A related finding involved novice Pascal programmers [7, 20]. A "while demon" bug was revealed when as many as a third of the college students assumed for simple Pascal programs that the actions in the while loop were continuously monitored for the exit condition to become true. For example, one student explained that "every time I [the variable tested in the while condition] is assigned a new value, the machine needs to check that value." The authors note that this interpretation is consistent with English while, as in "while the highway is two lanes, continue north."
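A minimal sketch of why the "while demon" reading fails is given here in Python rather than Pascal; the loop semantics at issue are the same in both. The function `run_while`, its `limit` parameter, and the particular arithmetic are invented for illustration. The tested variable is pushed past the limit in the middle of the body, yet the body runs to completion because the condition is examined only at the top of each pass.

```python
def run_while(limit=3):
    """Trace when the exit condition is actually tested."""
    trace = []
    i = 0
    while i < limit:               # the condition is tested only here, once per pass
        i = i + 5                  # i now exceeds the limit ...
        trace.append(f"body continues with i = {i}")   # ... yet the body carries on
        i = i - 4                  # net change of +1 per pass
    return trace, i


trace, final_i = run_while()
for line in trace:
    print(line)
print("final i:", final_i)
```

If the machine really did check the value "every time I is assigned a new value," the loop would exit on the first assignment inside the body; instead every pass finishes, and the loop stops only when the test at the top finds `i` equal to the limit.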

The generality of the phenomenon may be observed in a second example of the parallelism bug revealed by students attempting to comprehend programs not involving conditionals: in this case, variable assignment statements which occur in a program after lines referring to that variable. The student thinks (incorrectly) that what will happen later in a program influences what happens earlier. For example, consider the following four-line program:

AREA = Height × Width
Input Height
Input Width
PRINT AREA

Many students assume that there is no problem with this program (which would essentially be true were it written in Prolog, in which the interpreter does do inference!), and predict that it will print out the product of the height and width values the program user has input. But this is not true. When the first statement is executed, that is, the one that defines AREA as height times width, it has not yet received the input values. So it treats height and width as equal to the default value of zero. What is printed is not, as the student assumes, the product of the input values of height and width, but the product of the values of those variables available at the time the first line in the program was executed, that is, 0 × 0 = 0.
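The order dependence described above can be sketched in Python. Two assumptions are labeled in the code: the arguments `user_height` and `user_width` stand in for values typed at the keyboard, and the variables are explicitly set to zero first to imitate the default the article describes (Python itself would instead raise a NameError on an unassigned name). The function names are hypothetical.

```python
def area_program(user_height, user_width):
    """Mirror the four-line program; the arguments stand in for keyboard input."""
    height = 0                 # assumption: numeric variables default to zero
    width = 0

    area = height * width      # line 1: uses the values available right now (0 and 0)
    height = user_height       # line 2: arrives too late to affect area
    width = user_width         # line 3: likewise
    return area                # line 4: the value PRINT would show


def reordered_program(user_height, user_width):
    """Same statements with the inputs read before the product is computed."""
    height = user_height
    width = user_width
    return height * width


print(area_program(3, 4))        # 0
print(reordered_program(3, 4))   # 12
```

The assignment to `area` is a one-time computation, not a standing definition that is re-evaluated when `height` and `width` later change.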

Here, once again, we can see the influence of natural language conversational strategies, where implicit knowledge or expectations of what will come later can guide the interpretation of what occurs early in a conversation (or text). In natural language, apart from procedural instructions such as recipes or building plans, there is often no reason not to skip ahead. But in computer programming, the novice student must think: "What conditions regarding inputs are in effect as this line is executed?" In natural language, one rarely violates the meaning of a text by reading parts of it out of order, since line-by-line comprehension is not essential. In fact, we even teach scanning ahead for structure as a reading strategy. Nonetheless, natural language out-of-order reading does often disrupt text comprehension, as research on story grammars reveals.

Intentionality Bugs

There is another class of important language-independent conceptual bugs that we will call Intentionality Bugs. Intentionality Bugs are those in which the student attributes goal directedness or foresightedness to the program and, in so doing, "goes beyond the information given" in the lines of programming code being executed when the program is run. Students adopt what Dennett calls an "intentional stance" toward the complex system represented by the programming language, and assume that it has capacities or attributes of a human [21].

In one example which we have studied in detail [12], we ask students to talk out loud as they draw on graph paper what the graphics pen will draw as the following tail-recursive Logo program is executed. As depicted in the figure below, when one types SHAPE 40, the program draws a large square, a medium-sized square inside it, and then stops.

TO SHAPE :SIDE
IF :SIDE = 10 STOP
REPEAT 4 [FORWARD :SIDE RIGHT 90]
SHAPE :SIDE/2
END

More specifically, the program draws a square with a variable side that, when initialized on the first call, is forty units long. The first line of the program is a conditional counter with the purpose of stopping the drawing after the two squares are drawn. When executed, the next line draws a square the length of the variable SIDE (i.e., 40): REPEAT 4 [FORWARD :SIDE RIGHT 90]. The last line of the recursive program divides the variable SIDE by two, and since the program begins with a conditional statement that says when the variable SIDE equals 10 stop, the program draws the two squares of size forty and twenty and stops, because the variable SIDE then equals 10.
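The execution just traced can be mirrored in a short Python sketch. This is an illustrative translation, not the Logo used in the studies: instead of moving a graphics pen it records the side length of each square that would be drawn, and the `drawn` accumulator is an addition made purely so the result can be inspected.

```python
def shape(side, drawn=None):
    """Record the side length of each square the Logo procedure would draw."""
    if drawn is None:
        drawn = []
    if side == 10:                  # IF :SIDE = 10 STOP: halts and draws nothing
        return drawn
    drawn.append(side)              # REPEAT 4 [FORWARD :SIDE RIGHT 90]
    return shape(side / 2, drawn)   # SHAPE :SIDE/2


print(shape(40))   # the sides recorded are 40 and 20; no square of side 10
```

The conditional line only stops the procedure when the side reaches ten; it never causes a square of side ten to be drawn, which is exactly what the erroneous predictions discussed next overlook.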

When encountering the second line of the program, a conditional that says IF the value of the variable SIDE equals 10 STOP, some students erroneously predict that when the program is run, a box of side 10 will be drawn. When asked why, their comments are revealing. The students have glanced ahead in the program to see what is to them a familiar programming schema or "plan" [22]: a command line that results in the drawing of a square: REPEAT 4 [FORWARD (SOME DISTANCE) RIGHTANGLETURN (90 DEGREES)]. They then read the IF statement as if the program is commanding the computer to draw a square with sides equal to ten, because "it will draw a square," or "because it wants to draw a square." Other students recognize that the variable at the IF statement equals forty, but then say that the program sees the box statement line ahead which it wants to draw, but has to stop at ten!

In each case (the parallelism and intentionality bugs), the program has been given the status of an intentional being which has goals, and knows or sees what will happen elsewhere in itself.

Egocentrism Bugs

Egocentrism bugs are the flip side of intentionality bugs. Whereas intentionality bugs involve comprehending and tracing what a program will do, egocentrism bugs are involved in creating a program to do something. Each bug type presupposes that the computer can do what it has not been told to do in the program.

Egocentrism, an overemphasis on the perspective of self relative to that of others, is a pervasive characteristic of children's thinking, manifested in spatial cognition [23], communication [24], and other problem domains. Under the strenuous cognitive demands of a new task environment, it may also surface as a characteristic of the performances of novice programmers who are adolescents and adults. It should thus come as no surprise that the task performances of novice programmers are also subject to egocentric biases. Egocentrism bugs are those where students assume that there is more of their meaning for what they want to accomplish in the program than is actually present in the code they have written. Students giving evidence of this bug egocentrically assume that the computer can follow the advice former Mayor of Chicago Richard Daley used to give reporters:

"Don't print what I say, print what I mean!"


For example, lines of code or variable values are omitted by these students because it is assumed that the computer "knows" or can "fill in," as a human listener can, what the student wishes it to do.¹

Students do not literally say that the program knows what to do; the errors generated by this bug are almost perceptual in nature: their current conceptions do not guide their attention to these problems as relevant reasons for their programs' not working as planned. A common problem of this kind is the omission of punctuation or control characters, and the nonprovision of values for variables. Lest these omissions be thought of only as careless work, one can probe the students to test our current hypothesis, which attributes more significance to these omissions than oversight. When asked to explain what programs they have written will do, they gloss over the specific commands in a line of Logo code just written, asserting that a line of graphics code draws a square when, for example, they have included a move command to send the graphic turtle forward, but no turn command for making the necessary right angles:

REPEAT 4 [FORWARD 30]

It is as if they do not see that the necessary specifications to the computer have been omitted. All they have provided is the skeleton of a program, assuming that in some way the computer can fill in the rest, can say what they "mean."
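What the omitted turn command costs can be shown with a toy turtle written in Python. This is a hypothetical simulation, not a Logo interpreter: the function `run_turtle`, the tuple encoding of commands, and the initial upward heading are assumptions, and only 90-degree right turns are modeled.

```python
def run_turtle(commands, repeat=4):
    """Trace a tiny turtle; only FORWARD and RIGHT 90 are modeled."""
    x, y = 0, 0
    dx, dy = 0, 1                      # assumption: the turtle starts facing "up"
    visited = [(x, y)]
    for _ in range(repeat):
        for name, amount in commands:
            if name == "FORWARD":
                x, y = x + dx * amount, y + dy * amount
                visited.append((x, y))
            elif name == "RIGHT" and amount == 90:
                dx, dy = dy, -dx       # rotate the heading 90 degrees clockwise
    return visited


as_written = run_turtle([("FORWARD", 30)])                   # what the student typed
as_intended = run_turtle([("FORWARD", 30), ("RIGHT", 90)])   # what the student meant

print(as_written)    # every point lies on one straight line
print(as_intended)   # the path returns to the starting corner
```

The line as written moves the turtle 120 units along a single heading; only the version with the explicit turn closes the figure. Nothing in the machine supplies the missing RIGHT 90 on the student's behalf.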

Bonar and Soloway provide another clear case of egocentrism, manifested by a college student writing a program in Pascal [7]. The student was writing pseudo-code for the problem: "Write a program which reads in ten integers and prints the average of those integers." She wrote out:

Repeat
(1) Read a number (Num)
(1a) Count := Count + 1
(2) Add the number to Sum
(2a) Sum := Sum + Num
(3) until Count := 10
(4) Average := Sum div Num
(5) writeln ('average = ', Average)

When the interviewer asked whether (1a) was the "same kind of statement" as (2a), it became clear "that she thinks the Pascal translator knows far more about these roles than it does":

Are they the same kind. Ahhh, ummm, not exactly, because with this [1a] you are adding, you initialize it as zero and you're adding one to it [points to the right side of 1a], which is just a constant kind of thing. [Points to 2a] Sum, initialized to, uhh, Sum to Sum plus Num, ahh, that's [points to left side of 2a] storing two values in one, two variables [points to Sum and Num on the right side of 2a]. That's [now points to 1a] a counter, that's what keeps the whole loop under control. Whereas this thing [points to 2a], this was probably the most interesting thing ... about Pascal when I hit it. That you could have the same, you sorta have the same thing here [points to 1a], it was interesting that you could have, you could save space by having the Sum re-storing information on the left with two different things there [points to right side of 2a], so I didn't need to have two. No, they're different to me. I think of this [points to 1a] as just a constant, something that keeps the loop under control. And this [points to 2a] has something to do with something that you are gonna, that stores more kinds of information that you are going to take out of the loop with you [7, p. 5].

Here, again, we see the student believing that the programming language knows more about her intentions than it possibly can.

¹ Several counterexamples and, perhaps, part of a growing trend, are Teitelman's DWIM (Do What I Mean) systems added to the Interlisp programming environment, which correct spelling errors by using syntactic context, and commercially available syntax-correcting compilers. Such painless error revisions are the subject of feverous debates among programmers.

Soloway et al. have found among college Pascal programmers a set of errors that we believe also stems from egocentrism bugs [8]. They describe what they call a "mushed variables" bug. After a semester of Pascal, more than one quarter of their novice programmers used the same variable incorrectly for more than one role. For example, in the following program, the variable X is used both to store a value being read in [Read (X)] and to hold a running total [X := X + X]:

program Student26_Problem2;
var X, Ave : integer
begin
  repeat
    Read (X)
    X := X + X
  until X + X > 100;
  Ave := X div Nx;
  Write (Ave)
end.

They observe that students making these errors may have assumed that the computer would recognize that the same variable played two different roles, and that it could use the different values appropriately.
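The effect of giving one variable two roles can be sketched in Python. This is an illustrative simplification under stated assumptions, not a translation of the student's program: it iterates over a fixed, non-empty list instead of reading until a threshold is passed, and the function names are invented.

```python
def mushed_total(values):
    """One variable plays two roles, as in the student's repeat loop."""
    x = 0
    for value in values:
        x = value          # Read (X): overwrites whatever total was stored in x
        x = x + x          # X := X + X: merely doubles the value just read
    return x


def separate_roles(values):
    """Distinct names for the value just read, the running total, and the count."""
    total = 0
    count = 0
    for number in values:
        total = total + number
        count = count + 1
    return total, count, total // count   # assumes at least one value was read


print(mushed_total([5, 7, 9]))     # 18, which is only the last value doubled
print(separate_roles([5, 7, 9]))   # (21, 3, 7)
```

Each read destroys the "total" that the student believes is still held in the same variable; the machine keeps a single value per name and has no way of knowing that two roles were intended.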

CONCLUSIONS

All the bugs discussed (parallelism, intentionality, and egocentrism) appear to derive from what might be called a superbug. The superbug may be described as the idea that there is a hidden mind somewhere in the programming language that has intelligent, interpretive powers. It knows what has happened or will happen in lines of the program other than the line being executed; it can benevolently go beyond the information given to help the student achieve her goals in writing the program. This "hidden mind superbug" interpretation provides a deep explanation of the various misconceptions that plague the novice programmer.

But there is too facile an interpretation of this argument that must be avoided because it is false. It is not that students literally believe that the computer has a mind, or can think, or can interpret what was not explicitly stated. In our experience, novice programming students are likely to vehemently deny that the computer can think or that it is intelligent. Besides, instructors are very good at highlighting this point at the beginning of courses: Computers are dumb and can do nothing but what you tell them! But students' behaviors when working with programs often contradict their denials; they act as if the programming language is more than mechanistic. Their default strategy for making sense when encountering difficulties of program interpretation or when writing programs is to resort to the powerful analogy of natural language conversation, to assume a disambiguating mind which can understand. It is not clear at the current time whether this strategy is consciously pursued by students, or whether it is a tacit overgeneralization of conversational principles to computer programming "discourse." The central point is that this personal analogy should be seen as expected rather than bizarre behavior, for the students have no other analog, no other procedural device than "person" to which they can give written instructions that are then followed. Rumelhart and Norman have similarly emphasized the critical role of analogies in early learning of a domain: making links between the to-be-learned domain and known domains perceived by the student to be relevant [25]. But, in this case, mapping conventions for natural language instructions onto programming results in error-ridden performances.

A rival explanation for the aforementioned classes of bugs is that the novice programmer does not impute interpretive intelligence to the machine. It is not that the programmer assumes that a distinction needs to be expressed, and that the computer can make that distinction. Instead, he or she simply does not understand that there are ambiguities to be resolved in the code that has been written. From this perspective, the common developmental problem of coming to distinguish alternatives which are initially fused or collapsed in thought is viewed as the source of the kinds of errors we have discussed. While this possibility should be considered for some error-ridden programs, there are types of errors, such as the parallelism bugs, which are unlikely to result from such conceptual fusion. And it is difficult to see on this rival interpretation why we find that novice programmers often utilize intentional terms to describe the process by which the computer executes the commands presented by the program.


What are the implications of these findings for programming instruction? First, we need to be aware of the pervasiveness of programming misunderstandings that arise from the tacit applications of human conversational metaphor to programming. This is powerful transfer, to be sure, but it is misleading and does not work. Second, beyond being aware of these bugs, we have to arrange many more kinds of learning activities for students, and diagnostic activities for teachers, in which the bugs can be made obvious. We believe the persistence of these bugs is in part linked to the infrequency with which they are explicitly confronted by students and teachers alike. Bugs like these could be snared if one used program reading or debugging activities as central components of programming instruction. It was not until we did the tedious work of having students walk through every command in a program, thinking aloud and explaining how the computer would interpret it, that we became aware of the prevalence of these bugs. After that, we saw them everywhere.

There are additional complexities to be faced from a pedagogical perspective. From the programmer's viewpoint, it is not true that every operation to be carried out has to be made explicit. There are many things which programming languages automatically carry out, without, as it were, specific instructions to do so (e.g., physical address management; stack storage allocation; Pascal compiler disambiguation of the meaning of the semicolon from context). So the lesson the novice programmer needs to learn is that some meanings do not need to be explicitly expressed in the code he or she writes, while others do. Since the boundaries of required explicitness are conventions that vary across programming languages, the learner must realize the necessity of identifying in exactly what ways the language he or she is learning "invisibly" specifies the meaning of code written.

Much more research is needed on how best to help students see that computers read programs through a strictly mechanistic and interpretive process, whose rules are fairly simple once understood. We think this can best be achieved by providing clear models that show how the processing of control and data is regulated by the specific programming language under study. These explanations can be supported by explicit think-aloud examples of how the facile programmer thinks about and makes decisions with respect to program creation and understanding, and through instruction in comprehension-monitoring processes for computer programs similar to those that have been effective for written language understanding [26]. Other useful leads will come from artificial-intelligence, knowledge-based programmers' assistants [27], and debugging aids that seek to identify and remediate students' pervasive misconceptions in learning how to program [28].

Finally, we can be assured of (although not comforted by) the fact that such conceptual difficulties are not specific to the programming domain. There are other formal systems with abstract rules of interpretation (logic, physics, and mathematics) that are also very challenging for students to learn, rife with bugs [29], but well worth our concerted efforts to help students understand.


ACKNOWLEDGMENT

I would like to thank my colleagues at the Center for Children and Technology for discussing these issues, and I hereby express my appreciation for the constructive comments of several anonymous reviewers.

REFERENCES

1. D. M. Kurland and R. D. Pea, Children's Mental Models of Recursive Logo Programs, Journal of Educational Computing Research, 2, in press.

2. U. Leron, Some Problems in Children's Logo Learning, in Proceedings of the Seventh International Conference for the Psychology of Mathematics Education, Weizmann Institute, Jerusalem, 1983.

3. J. D. Milojkovic, "Children Learning Computer Programming: Cognitive and Motivational Consequences," doctoral dissertation, Department of Psychology, Stanford University, 1983.

4. R. Nachmias, D. Mioduser, and D. Chen, Acquisition of Basic Computer Programming Concepts by Children, (Technical Report Number 14), The Computers in Education Research Lab, School of Education, Tel Aviv University, Tel Aviv, 1985.

5. R. D. Pea, Logo Programming and Problem Solving, (Technical Report Number 12), Bank Street College of Education, Center for Children and Technology, New York, 1983.

6. R. Mawby, Proficiency Conditions for the Development of Thinking Skills Through Programming, paper presented at the Harvard University Conference on Thinking, Cambridge, Massachusetts, 1984.

7. J. Bonar and E. Soloway, Uncovering Principles of Novice Programming, in SIGPLAN-SIGACT, tenth annual symposium on Principles of Programming Languages, Austin, Texas, 1983.

8. E. Soloway, K. Ehrlich, J. Bonar, and J. Greenspan, What Do Novices Know about Programming? in Directions in Human-Computer Interactions, B. Shneiderman and A. Badre (eds.), Ablex, Norwood, New Jersey, 1982.

9. E. Soloway, J. Bonar, and K. Ehrlich, Cognitive Strategies and Looping Constructs: An Empirical Study, Communications of the ACM, November 1983.

10. J. S. Brown and R. Burton, Diagnostic Models for Procedural Bugs in Basic Mathematical Skills, Cognitive Science, 2, pp. 155-192, 1978.

11. J. R. Anderson, R. Farrell, and R. Sauers, Learning to Program in LISP, Cognitive Science, 8, pp. 87-129, 1984.

12. R. D. Pea and D. M. Kurland, On the Cognitive Effects of Learning Computer Programming, New Ideas in Psychology, 2, pp. 131-168, 1984.

13. R. D. Pea and D. M. Kurland, On the Cognitive Prerequisites of Learning Computer Programming, project report to the National Institute of Education. (Also Technical Report Number 18, Bank Street College of Education, Center for Children and Technology), New York, 1983.

14. H. P. Grice, Logic and Conversation, in Syntax and Semantics 3: Speech Acts, P. Cole and J. Morgan (eds.), Academic Press, New York, 1973.


15. P. Cole, Radical Pragmatics, Academic Press, New York, 1981.

16. J. R. Searle, Intentionality, Cambridge University Press, Cambridge, 1983.

17. D. R. Olson, From Utterance to Text: The Bias of Language in Speech and Writing, Harvard Educational Review, 47, pp. 257-281, 1977.

18. D. Tannen (ed.), Coherence in Spoken and Written Discourse, Ablex, Norwood, New Jersey, 1983.

19. J. D. McCawley, Everything that Linguists Have Always Wanted to Know about Logic (But Were Ashamed to Ask), University of Chicago Press, Chicago, 1981.

20. E. Soloway, J. Bonar, J. Barth, E. Rubin, and B. Woolf, Programming and Cognition: Why Your Students Write Those Crazy Programs, Proceedings of the National Educational Computing Conference, pp. 206-219, 1981.

21. D. Dennett, Brainstorms, Bradford Books, Montgomery, Vermont, 1978.

22. E. Soloway and K. Ehrlich, Empirical Studies of Programming Knowledge, IEEE Transactions on Software Engineering, in press.

23. J. Piaget and B. Inhelder, The Child's Conception of Space, Norton, New York, 1967.

24. J. H. Flavell, P. T. Botkin, C. L. Fry, J. W. Wright, and P. E. Jarvis, The Development of Role-taking and Communication Skills in Children, Wiley, New York, 1968.

25. D. E. Rumelhart and D. A. Norman, Analogical Processes in Learning, in Cognitive Skills and Their Acquisition, J. R. Anderson (ed.), Erlbaum, Hillsdale, New Jersey, 1981.

26. A. S. Palincsar and A. L. Brown, Reciprocal Teaching of Comprehension-fostering and Comprehension-monitoring Activities, Cognition and Instruction, 1, pp. 117-175, 1984.

27. R. A. Waters, A Knowledge-based Program Editor, Proceedings of the 7th International Joint Conference on Artificial Intelligence, Vol. II, pp. 920-926, 1982.

28. W. L. Johnson and E. Soloway, PROUST: Knowledge-based Program Understanding, (Technical Report Number 285), Department of Computer Science, Yale University, New Haven, Connecticut, 1984.

29. D. Gentner and A. Stevens (eds.), Mental Models, Erlbaum, Hillsdale, New Jersey, 1983.

Direct reprint requests to:
Dr. Roy D. Pea
Bank Street College
610 W. 112th Street
New York, NY 10025