All Values are Pointers to Objects - An Introduction to Scheme and its Implementation

As I said earlier, all values are conceptually pointers to objects on a heap, and you don't ever have to explicitly free memory.

By "object," I don't necessarily mean object in the object-oriented sense. I just mean data objects like Pascal records or C structs, which can be referenced via pointers and may (or may not) hold state information.

Some versions of Scheme do have object systems for object-oriented programming. (This includes our own RScheme system, where standard Scheme types are all classes in a unified object system.) In this book, however, we use the word "object" in a broader sense, meaning an entity that you can have a pointer to.

● All Values are Pointers: All Values are Pointers

● Objects on the Heap: Objects on the Heap

Go to the first, previous, next, last section, table of contents.

http://www.federated.com/~jim/schintro-v14/schintro_29.html11/3/2006 8:55:17 PM

An Introduction to Scheme and its Implementation - All Values are Pointers

Go to the first, previous, next, last section, table of contents.

All Values are Pointers

[ some of this needs to be moved up: ]

Conceptually, all Scheme objects are allocated on the heap, and referred to via pointers. This actually makes life simple, because you don't have to worry about whether you should dereference a pointer when you want to use a value--you always do. Since pointer dereferencing is uniform, procedures always dereference a pointer to a value when they really use the value, and you never have to explicitly force the dereferencing.

For example, the predefined Scheme procedure + takes two pointers to numbers, and automatically dereferences both pointers before doing the addition. It returns a pointer to the number that's the result of the addition.

So when we evaluate the expression (+ 2 3) to add two to three, we are taking a pointer to the integer 2 and a pointer to integer 3, and passing those as arguments to the procedure +. + returns a pointer to the integer 5. We can nest expressions, e.g., (* (+ 2 3) 6), so that the pointer to five is passed, in turn, to the procedure *. Since these functions all accept pointers as arguments and return pointers as values, you can just ignore the pointers, and write arithmetic expressions the way you would in any other language.

When you think about it, it doesn't make any sense to change the value of an integer, in a mathematical sense. For example, what would it mean to change the integer 6's value to be 7? It wouldn't mean anything sensible, for sure. 6 is a unique, abstract mathematical object that doesn't have any state that can be changed---6 is 6, and behaves like 6, forever.

What's going on in conventional programming languages is not really changing the value of an integer--it's replacing one (copy of an) integer value with (a copy of) another. That's because most programming languages have both pointer semantics (for pointer variables) and value semantics (for nonpointer

variables, like integers). You make multiple copies of values, and then clobber the copies when you perform an assignment.

In Scheme, we don't need to clobber the value of an integer, because we get the effect we want by replacing pointers with other pointers. An integer in Scheme is a unique entity, just as it is in mathematics. We don't have multiple copies of a particular number, just multiple references to it.

(Actually, Scheme's treatment of numbers is not quite this simple and pretty, for efficiency reasons I'll explain later, but it's close.)

As we'll see later, an implementation is free to optimize away these pointers if it doesn't affect the

http://www.federated.com/~jim/schintro-v14/schintro_30.html (1 of 2)11/3/2006 8:55:22 PM

An Introduction to Scheme and its Implementation - All Values are Pointers

programmer's view of things--but when you're trying to understand a program, you should always think of values as pointers to objects.

The uniform use of pointers makes lots of things simpler. In C or Pascal, you have to be careful whether you're dealing with a raw value or a pointer. If you have a pointer and you need the actual value, you have to explictly dereference the pointer (e.g., with C's prefix operator *, or Pascal's postfix operator ^).

If you have a value and you need a pointer to it, you have to take its address (e.g., with C's prefix &

operator, or Pascal's prefix operator ^).

In Scheme, none of that mess is necessary. User-defined routines pass pointers around, consistently, and when they bottom out into predefined routines (like the built-in + procedure or set! special form) those low-level built-in operations do any dereferencing that's necessary.

(Of course, when traversing lists and the like, the programmer has to ask for pointers to be dereferenced, but from the programmer's point of view, that just means grabbing another pointer value out of a field of an object you already have a pointer to.)

It is sometimes said that languages like Scheme (and Lisp, Smalltalk, Eiffel, and Java) "don't have pointers." It's at least as reasonable to say that the opposite is true--everything's a pointer. What they don't have is a distinction between pointers and nonpointers that you have to worry about.(2)

Go to the first, previous, next, last section, table of contents.

http://www.federated.com/~jim/schintro-v14/schintro_30.html (2 of 2)11/3/2006 8:55:22 PM

An Introduction to Scheme and its Implementation - Immediate Values

Go to the first, previous, next, last section, table of contents.

Most Implementations Optimize Away Many Pointers

You might think that making every value a pointer to an object would be expensive, because you'd have to have space for all of the pointers as well as the things they point to, and you'd have to use extra

instructions to access things via pointers.

Everything's a pointer at the language level--i.e., from the programmer's point of view--but a Scheme system doesn't actually have to represent things the way they appear at the languages level.

Most Scheme implementations optimize away a lot of pointers. For example, it's inefficient to actually represent integer values as pointers to integer objects on the heap. Scheme implementations therefore use tricks to represent integers without really using pointers. (Again, keep in mind that this is just an implementation trick that's hidden from the programmer. Integer values have the semantics of pointers, even if they're represented differently from other things.)

Rather than putting integer values on the heap, and then passing around pointers to them, most

implementations put the actual integer bit pattern directly into variables--after all, a reasonable-sized integer will fit in a machine word.

A short value (like a normal integer) stored directly into a variable is called an immediate value, in contrast to pointers which are used to refer to objects indirectly.

The problem with putting integers or other short values into variables is that Scheme has to tell them apart from each other, and from pointers which might have the same bit patterns.

The solution to this is tagging. The value in each variable actually has a few bits devoted to a type tag which says what kind of thing it is--e.g., whether it's a pointer or not. The use of a few bits for a tag slightly reduces the amount of storage available for the actual value, but as we'll see next, that usually isn't a problem.

It might seem that storing integer bit patterns directly in variables would break the abstraction that Scheme is supposed to present--the illusion that all values are pointers to objects on the heap. That's not so, though, because the language enforces restrictions that keep programmers from seeing the difference.

In the case of numbers and a few other types, you can't change the state of the object itself. There's no way to side-effect an integer object and make it behave differently. We say that integers are immutable, i.

e., you can't mutate (change) them.

http://www.federated.com/~jim/schintro-v14/schintro_31.html (1 of 2)11/3/2006 8:55:27 PM

An Introduction to Scheme and its Implementation - Immediate Values

If integers were actually allocated on the heap and referred to via pointers, and if you could change the integer's value, then that change would be visible through other pointers to the integer.

(That doesn't mean that a variable's value can't be one integer at one time, and another integer at

another--the variable's value is really a pointer to an integer, not the integer itself, and you're really just replacing a pointer to one integer with a pointer to another integer.)

Go to the first, previous, next, last section, table of contents.

http://www.federated.com/~jim/schintro-v14/schintro_31.html (2 of 2)11/3/2006 8:55:27 PM

An Introduction to Scheme and its Implementation - Objects on the Heap

Go to the first, previous, next, last section, table of contents.

Objects on the Heap

Most Scheme objects only have fields that are general-purpose value cells---any field can hold any Scheme value, whether it's a tagged immediate value or a tagged pointer to another heap-allocated object. (Of course, conceptually they're all pointers, so the type of a field is just "pointer to anything.") So, for example, a pair (also known in Lisp terminology as a "cons cell") is a heap-allocated object with two fields. Either field can hold any kind of value, such as a number, a text character, a boolean, or a pointer to another heap object.

The first field of a pair is called the car field, and the second field is called the cdr field. These are among the dumbest names for anything in all of computer science. (They are just a historical artifact of the first Lisp implementation and the machine it ran on.)

Pairs can be created using the procedure cons. For example, to create a pair with the number 22 as the value of its car field, and the number 15 as the value of its cdr field, you can write the procedure call (cons 22 15).

The fields of a pair are like variable bindings, in that they can hold any kind of Scheme value. Both bindings and fields are called value cells---i.e., they're places you can put any kind of value.

In most implementations, each heap-allocated object has a hidden "header" field that you, as a Scheme programmer, are not supposed to know about. This extra field holds type information, saying exactly what kind of heap allocated object it is. So, laid out in memory, the pair looks something like this:

+---+

In this case, the car field of the pair (cons cell) holds the integer 22, and the cdr field holds the integer 15.

The values stored in the fields of the pair are drawn as arrows, because they are pointers to the numbers 22 and 15.

http://www.federated.com/~jim/schintro-v14/schintro_32.html (1 of 2)11/3/2006 8:55:32 PM

An Introduction to Scheme and its Implementation - Objects on the Heap

(The actual representation of these values might be a 30-bit binary number with a two-bit tag field used to distinguish integers from real pointers, but you don't have to worry about that.)

Scheme provides a built-in procedure car to get the value of the car field of a pair, and set-car! to set that field's value. Likewise there are functions cdr and set-cdr! to get and set the cdr field's values.

Suppose we have a top-level variable binding for the variable foo, and its value is a pointer to the above pair. We would draw that situation something like this:

+---+

+---+ header| <PAIR> | foo | *----+--->+=========+

+---+ car| *----+---->22 +---+

cdr| *----+---->15 +---+

Most other objects in Scheme are represented similarly. For example, a vector (one-dimensional array) is typically represented as a linear array of value cells, which can hold any kind of value.

Even objects that aren't actually represented like this can be thought of this way, since conceptually, everything's on the heap and referred to via a pointer.

Go to the first, previous, next, last section, table of contents.

http://www.federated.com/~jim/schintro-v14/schintro_32.html (2 of 2)11/3/2006 8:55:32 PM

An Introduction to Scheme and its Implementation - Automatic Memory Management

Go to the first, previous, next, last section, table of contents.

Dans le document An Introduction to Scheme and its Implementation (Page 53-60)