forked from twcamper/sicp-kindle
-
Notifications
You must be signed in to change notification settings - Fork 0
/
book-Z-H-16.html
1866 lines (1445 loc) · 87 KB
/
book-Z-H-16.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http:https://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<!-- Generated from TeX source by tex2page, v 4o,
(c) Dorai Sitaram, http:https://www.cs.rice.edu/~dorai/tex2page -->
<head>
<meta name="generator" content="HTML Tidy for Linux (vers 7 December 2008), see www.w3.org" />
<title>Structure and Interpretation of Computer Programs</title>
<link href="book-Z-C.css" title="default" rel="stylesheet" type="text/css" />
<meta name="robots" content="noindex,follow" />
</head>
<body>
<mbp:pagebreak />
<p><a name="%_sec_2.3"></a></p>
<h2><a href="book-Z-H-4.html#%_toc_%_sec_2.3">2.3 Symbolic
Data</a></h2>
<p><a name="%_idx_1984"></a> <a name="%_idx_1986"></a>All the
compound data objects we have used so far were constructed
ultimately from numbers. In this section we extend the
representational capability of our language by introducing the
ability to work with arbitrary symbols as data.</p>
<p><a name="%_sec_2.3.1"></a></p>
<h3><a href="book-Z-H-4.html#%_toc_%_sec_2.3.1">2.3.1 Quotation</a></h3>
<p><a name="%_idx_1988"></a> If we can form compound data using
symbols, we can have lists such as</p>
<p><tt><a name="%_idx_1990"></a>(a b c d)<br />
(23 45 17)<br />
((Norah 12) (Molly 9) (Anna 7) (Lauren 6) (Charlotte 4))<br />
</tt></p>
<p>Lists containing symbols can look just like the expressions of
our language:</p>
<p><tt>(* (+ 23 45) (+ x 9))<br />
<br />
(define (fact n) (if (= n 1) 1 (* n (fact (- n 1)))))<br />
</tt></p>
<p>In order to manipulate symbols we need a new element in our
language: the ability to <em>quote</em> a data object. Suppose we
want to construct the list <tt>(a b)</tt>. We can't accomplish
this with <tt>(list a b)</tt>, because this expression constructs
a list of the <em>values</em> of <tt>a</tt> and <tt>b</tt> rather
than the symbols themselves. This issue is well known in the
context of <a name="%_idx_1992"></a><a name="%_idx_1994"></a>natural languages, where words and sentences may
be regarded either as semantic entities or as character strings
(syntactic entities). The common practice in natural languages is
to use quotation marks to indicate that a word or a sentence is
to be treated literally as a string of characters. For instance,
the first letter of ``John'' is clearly ``J.'' If we tell
somebody ``say your name aloud,'' we expect to hear that person's
name. However, if we tell somebody ``say `your name' aloud,'' we
expect to hear the words ``your name.'' Note that we are forced
to nest quotation marks to describe what somebody else might
say.<a href="#footnote_Temp_227" name="call_footnote_Temp_227" id="call_footnote_Temp_227"><sup><small>32</small></sup></a></p>
<p><a name="%_idx_1998"></a>We can follow this same practice to
identify lists and symbols that are to be treated as data objects
rather than as expressions to be evaluated. However, our format
for quoting differs from that of natural languages in that we
place a quotation mark (traditionally, the single <a name="%_idx_2000"></a>quote symbol <tt>'</tt>) only at the
beginning of the object to be quoted. We can get away with this
in Scheme syntax because we rely on blanks and parentheses to
delimit objects. Thus, the meaning of the single quote character
is to quote the next object.<a href="#footnote_Temp_228" name="call_footnote_Temp_228" id="call_footnote_Temp_228"><sup><small>33</small></sup></a></p>
<p><a name="%_idx_2010"></a>Now we can distinguish between
symbols and their values:</p>
<p><tt>(define a 1)<br />
<br />
(define b 2)<br />
<br />
(list a b)<br />
<i>(1 2)</i><br />
<br />
(list 'a 'b)<br />
<i>(a b)</i><br />
<br />
(list 'a b)<br />
<i>(a 2)</i><br /></tt></p>
<p><a name="%_idx_2012"></a>Quotation also allows us to type in
compound objects, using the conventional printed representation
for lists:<a href="#footnote_Temp_229" name="call_footnote_Temp_229" id="call_footnote_Temp_229"><sup><small>34</small></sup></a></p>
<p><tt>(car '(a b c))<br />
<i>a</i><br />
<br />
(cdr '(a b c))<br />
<i>(b c)</i><br /></tt></p>
<p><a name="%_idx_2018"></a><a name="%_idx_2020"></a>In keeping
with this, we can obtain the empty list by evaluating
<tt>'()</tt>, and thus dispense with the variable
<tt>nil</tt>.</p>
<p>One additional primitive used in manipulating symbols is
<a name="%_idx_2022"></a><a name="%_idx_2024"></a><a name="%_idx_2026"></a><a name="%_idx_2028"></a><tt>eq?</tt>, which
takes two symbols as arguments and tests whether they are the
same.<a href="#footnote_Temp_230" name="call_footnote_Temp_230" id="call_footnote_Temp_230"><sup><small>35</small></sup></a>
Using <tt>eq?</tt>, we can implement a useful procedure called
<tt>memq</tt>. This takes two arguments, a symbol and a list. If
the symbol is not contained in the list (i.e., is not
<tt>eq?</tt> to any item in the list), then <tt>memq</tt> returns
false. Otherwise, it returns the sublist of the list beginning
with the first occurrence of the symbol:</p>
<p><tt><a name="%_idx_2030"></a>(define (memq item x)<br />
(cond ((null? x) false)<br />
((eq? item (car x)) x)<br />
(else (memq item (cdr x)))))<br />
</tt></p>
<p>For example, the value of</p>
<p>
<tt>(memq 'apple '(pear banana prune))<br /></tt></p>
<p>is false, whereas the value of</p>
<p>
<tt>(memq 'apple '(x (apple sauce) y apple pear))<br />
</tt></p>
<p>is <tt>(apple pear)</tt>.</p>
<p><a name="%_thm_2.53"></a> <b>Exercise
2.53.</b> What would the interpreter print in response
to evaluating each of the following expressions?</p>
<p><tt>(list 'a 'b 'c)<br />
<br />
(list (list 'george))<br />
(cdr '((x1 x2) (y1 y2)))<br />
<br />
(cadr '((x1 x2) (y1 y2)))<br />
(pair? (car '(a short list)))<br />
(memq 'red '((red shoes) (blue socks)))<br />
<br />
(memq 'red '(red shoes blue socks))<br /></tt></p>
<p><a name="%_thm_2.54"></a> <b>Exercise 2.54.</b> Two
lists are said to be <a name="%_idx_2032"></a><a name="%_idx_2034"></a><a name="%_idx_2036"></a><tt>equal?</tt> if they
contain equal elements arranged in the same order. For
example,</p>
<p>
<tt>(equal? '(this is a list) '(this is a list))<br />
</tt></p>
<p>is true, but</p>
<p>
<tt>(equal? '(this is a list) '(this (is a) list))<br />
</tt></p>
<p>is false. To be more precise, we can define <tt>equal?</tt>
recursively in terms of the basic <tt>eq?</tt> equality of
symbols by saying that <tt>a</tt> and <tt>b</tt> are
<tt>equal?</tt> if they are both symbols and the symbols are
<tt>eq?</tt>, or if they are both lists such that <tt>(car
a)</tt> is <tt>equal?</tt> to <tt>(car b)</tt> and <tt>(cdr
a)</tt> is <tt>equal?</tt> to <tt>(cdr b)</tt>. Using this idea,
implement <tt>equal?</tt> as a procedure.<a href="#footnote_Temp_233" name="call_footnote_Temp_233" id="call_footnote_Temp_233"><sup><small>36</small></sup></a></p>
<p><a name="%_thm_2.55"></a> <b>Exercise 2.55.</b> Eva
Lu Ator types to the interpreter the expression</p>
<p><tt>(car ''abracadabra)<br /></tt></p>
<p>To her surprise, the interpreter prints back <tt>quote</tt>.
Explain.</p>
<p><a name="%_sec_2.3.2"></a></p>
<h3><a href="book-Z-H-4.html#%_toc_%_sec_2.3.2">2.3.2 Example:
Symbolic Differentiation</a></h3>
<p><a name="%_idx_2042"></a><a name="%_idx_2044"></a><a name="%_idx_2046"></a> As an illustration of symbol manipulation and a
further illustration of data abstraction, consider the design of
a procedure that performs symbolic differentiation of algebraic
expressions. We would like the procedure to take as arguments an
algebraic expression and a variable and to return the derivative
of the expression with respect to the variable. For example, if
the arguments to the procedure are
<em>a</em><em>x</em><sup>2</sup> + <em>b</em><em>x</em> +
<em>c</em> and <em>x</em>, the procedure should return
2<em>a</em><em>x</em> + <em>b</em>. Symbolic differentiation is
of special historical significance in Lisp. It was one of the
motivating examples behind the development of a computer language
for symbol manipulation. Furthermore, it marked the beginning of
the line of research that led to the development of powerful
systems for symbolic mathematical work, which are currently being
used by a growing number of applied mathematicians and
physicists.</p>
<p>In developing the symbolic-differentiation program, we will
follow the same strategy of data abstraction that we followed in
developing the rational-number system of section <a href="book-Z-H-14.html#%_sec_2.1.1">2.1.1</a>. That is, we will first
define a differentiation algorithm that operates on abstract
objects such as ``sums,'' ``products,'' and ``variables'' without
worrying about how these are to be represented. Only afterward
will we address the representation problem.</p>
<p><a name="%_sec_Temp_235"></a></p>
<h4><a href="book-Z-H-4.html#%_toc_%_sec_Temp_235">The
differentiation program with abstract data</a></h4>
<p><a name="%_idx_2048"></a> In order to keep things simple, we
will consider a very simple symbolic-differentiation program that
handles expressions that are built up using only the operations
of addition and multiplication with two arguments.
Differentiation of any such expression can be carried out by
applying the following reduction rules:</p>
<div align="left"><img src="ch2-Z-G-45.gif" border="0" /></div>
<div align="left"><img src="ch2-Z-G-46.gif" border="0" /></div>
<div align="left"><img src="ch2-Z-G-47.gif" border="0" /></div>
<div align="left"><img src="ch2-Z-G-48.gif" border="0" /></div>
<p>Observe that the latter two rules are recursive in nature.
That is, to obtain the derivative of a sum we first find the
derivatives of the terms and add them. Each of the terms may in
turn be an expression that needs to be decomposed. Decomposing
into smaller and smaller pieces will eventually produce pieces
that are either constants or variables, whose derivatives will be
either 0 or 1.</p>
<p>To embody these rules in a procedure we indulge in a little
<a name="%_idx_2050"></a>wishful thinking, as we did in designing
the rational-number implementation. If we had a means for
representing algebraic expressions, we should be able to tell
whether an expression is a sum, a product, a constant, or a
variable. We should be able to extract the parts of an
expression. For a sum, for example we want to be able to extract
the addend (first term) and the augend (second term). We should
also be able to construct expressions from parts. Let us assume
that we already have procedures to implement the following
selectors, constructors, and predicates:</p>
<table border="0">
<tr>
<td valign="top"><tt>(variable? e)</tt></td>
<td valign="top">Is <tt>e</tt> a variable?</td>
</tr>
<tr>
<td valign="top"><tt>(same-variable? v1 v2)</tt></td>
<td valign="top">Are <tt>v1</tt> and <tt>v2</tt> the same
variable?</td>
</tr>
<tr>
<td valign="top">
<p><tt>(sum? e)</tt></p>
</td>
<td valign="top">Is <tt>e</tt> a sum?</td>
</tr>
<tr>
<td valign="top"><tt>(addend e)</tt></td>
<td valign="top">Addend of the sum <tt>e</tt>.</td>
</tr>
<tr>
<td valign="top"><tt>(augend e)</tt></td>
<td valign="top">Augend of the sum <tt>e</tt>.</td>
</tr>
<tr>
<td valign="top"><tt>(make-sum a1 a2)</tt></td>
<td valign="top">Construct the sum of <tt>a1</tt> and
<tt>a2</tt>.</td>
</tr>
<tr>
<td valign="top">
<p><tt>(product? e)</tt></p>
</td>
<td valign="top">Is <tt>e</tt> a product?</td>
</tr>
<tr>
<td valign="top"><tt>(multiplier e)</tt></td>
<td valign="top">Multiplier of the product <tt>e</tt>.</td>
</tr>
<tr>
<td valign="top"><tt>(multiplicand e)</tt></td>
<td valign="top">Multiplicand of the product <tt>e</tt>.</td>
</tr>
<tr>
<td valign="top"><tt>(make-product m1 m2)</tt></td>
<td valign="top">Construct the product of <tt>m1</tt> and
<tt>m2</tt>.</td>
</tr>
</table>Using these, and the primitive predicate
<tt>number?</tt>, <a name="%_idx_2052"></a><a name="%_idx_2054"></a>which identifies numbers, we can express the
differentiation rules as the following procedure:
<p><tt><a name="%_idx_2056"></a>(define (deriv exp var)<br />
(cond ((number? exp) 0)<br />
((variable? exp)<br />
(if (same-variable? exp var) 1 0))<br />
((sum? exp)<br />
(make-sum (deriv (addend exp) var)<br />
(deriv (augend exp) var)))<br />
((product? exp)<br />
(make-sum<br />
(make-product (multiplier exp)<br />
(deriv (multiplicand exp) var))<br />
(make-product (deriv (multiplier exp) var)<br />
(multiplicand exp))))<br />
(else<br />
(error "unknown expression type -- DERIV" exp))))<br />
</tt></p>
<p>This <tt>deriv</tt> procedure incorporates the complete
differentiation algorithm. Since it is expressed in terms of
abstract data, it will work no matter how we choose to represent
algebraic expressions, as long as we design a proper set of
selectors and constructors. This is the issue we must address
next.</p>
<p><a name="%_sec_Temp_236"></a></p>
<h4><a href="book-Z-H-4.html#%_toc_%_sec_Temp_236">Representing
algebraic expressions</a></h4>
<p><a name="%_idx_2058"></a> We can imagine many ways to use list
structure to represent algebraic expressions. For example, we
could use lists of symbols that mirror the usual algebraic
notation, representing <em>a</em><em>x</em> + <em>b</em> as the
list <tt>(a * x + b)</tt>. However, one especially
straightforward choice is to use the same parenthesized prefix
notation that Lisp uses for combinations; that is, to represent
<em>a</em><em>x</em> + <em>b</em> as <tt>(+ (* a x) b)</tt>. Then
our data representation for the differentiation problem is as
follows:</p>
<ul>
<li>The variables are symbols. They are identified by the
primitive predicate <a name="%_idx_2060"></a><a name="%_idx_2062"></a><tt>symbol?</tt>:
<p><tt><a name="%_idx_2064"></a>(define (variable? x) (symbol? x))<br />
</tt></p>
</li>
<li>Two variables are the same if the symbols representing them
are <tt>eq?</tt>:
<p><tt><a name="%_idx_2066"></a>(define (same-variable? v1 v2)<br />
(and (variable? v1) (variable? v2) (eq? v1 v2)))<br />
</tt></p>
</li>
<li>Sums and products are constructed as lists:
<p><tt><a name="%_idx_2068"></a>(define (make-sum a1 a2) (list '+ a1 a2))<br />
<br />
<a name="%_idx_2070"></a>(define (make-product m1 m2) (list '* m1 m2))<br />
</tt></p>
</li>
<li>A sum is a list whose first element is the symbol
<tt>+</tt>:
<p><tt><a name="%_idx_2072"></a>(define (sum? x)<br />
(and (pair? x) (eq? (car x) '+)))<br />
</tt></p>
</li>
<li>The addend is the second item of the sum list:
<p><tt><a name="%_idx_2074"></a>(define (addend s) (cadr s))<br />
</tt></p>
</li>
<li>The augend is the third item of the sum list:
<p><tt><a name="%_idx_2076"></a>(define (augend s) (caddr s))<br />
</tt></p>
</li>
<li>A product is a list whose first element is the symbol <tt>
*</tt>:
<p><tt><a name="%_idx_2078"></a>(define (product? x)<br />
(and (pair? x) (eq? (car x) '*)))<br />
</tt></p>
</li>
<li>The multiplier is the second item of the product list:
<p><tt><a name="%_idx_2080"></a>(define (multiplier p) (cadr p))<br />
</tt></p>
</li>
<li>The multiplicand is the third item of the product list:
<p><tt><a name="%_idx_2082"></a>(define (multiplicand p) (caddr p))<br />
</tt></p>
</li>
</ul>
<p>Thus, we need only combine these with the algorithm as
embodied by <tt>deriv</tt> in order to have a working
symbolic-differentiation program. Let us look at some examples of
its behavior:</p>
<p><tt>(deriv '(+ x 3) 'x)<br />
<i>(+ 1 0)</i><br />
(deriv '(* x y) 'x)<br />
<i>(+ (* x 0) (* 1 y))</i><br />
(deriv '(* (* x y) (+ x 3)) 'x)<br />
<i>(+ (* (* x y) (+ 1 0))<br />
(* (+ (* x 0) (* 1 y))<br />
(+ x 3)))</i><br />
</tt></p>
<p>The program produces answers that are correct; however, they
are unsimplified. It is true that</p>
<div align="left"><img src="ch2-Z-G-49.gif" border="0" /></div>
<p>but we would like the program to know that <em>x</em> ·
0 = 0, 1 · <em>y</em> = <em>y</em>, and 0 + <em>y</em> =
<em>y</em>. The answer for the second example should have been
simply <tt>y</tt>. As the third example shows, this becomes
a serious issue when the expressions are complex.</p>
<p><a name="%_idx_2084"></a><a name="%_idx_2086"></a>Our
difficulty is much like the one we encountered with the
rational-number implementation: we haven't reduced answers to
simplest form. To accomplish the rational-number reduction, we
needed to change only the constructors and the selectors of the
implementation. We can adopt a similar strategy here. We won't
change <tt>deriv</tt> at all. Instead, we will change
<tt>make-sum</tt> so that if both summands are numbers,
<tt>make-sum</tt> will add them and return their sum. Also, if
one of the summands is 0, then <tt>make-sum</tt> will return the
other summand.</p>
<p><tt><a name="%_idx_2088"></a>(define (make-sum a1 a2)<br />
(cond ((=number? a1 0) a2)<br />
((=number? a2 0) a1)<br />
((and (number? a1) (number? a2)) (+ a1 a2))<br />
(else (list '+ a1 a2))))<br />
</tt></p>
<p>This uses the procedure <tt>=number?</tt>, which checks
whether an expression is equal to a given number:</p>
<p><tt><a name="%_idx_2090"></a>(define (=number? exp num)<br />
(and (number? exp) (= exp num)))<br />
</tt></p>
<p>Similarly, we will change <tt>make-product</tt> to build in
the rules that 0 times anything is 0 and 1 times anything is the
thing itself:</p>
<p><tt><a name="%_idx_2092"></a>(define (make-product m1 m2)<br />
(cond ((or (=number? m1 0) (=number? m2 0)) 0)<br />
((=number? m1 1) m2)<br />
((=number? m2 1) m1)<br />
((and (number? m1) (number? m2)) (* m1 m2))<br />
(else (list '* m1 m2))))<br />
</tt></p>
<p>Here is how this version works on our three examples:</p>
<p><tt>(deriv '(+ x 3) 'x)<br />
<i>1</i><br />
(deriv '(* x y) 'x)<br />
<i>y</i><br />
(deriv '(* (* x y) (+ x 3)) 'x)<br />
<i>(+ (* x y) (* y (+ x 3)))</i><br />
</tt></p>
<p>Although this is quite an improvement, the third example shows
that there is still a long way to go before we get a program that
puts expressions into a form that we might agree is ``simplest.''
The problem of algebraic simplification is complex because, among
other reasons, a form that may be simplest for one purpose may
not be for another.</p>
<p><a name="%_thm_2.56"></a> <b>Exercise
2.56.</b> <a name="%_idx_2094"></a>Show how to extend
the basic differentiator to handle more kinds of expressions. For
instance, implement the differentiation rule</p>
<div align="left"><img src="ch2-Z-G-50.gif" border="0" /></div>
<p>by adding a new clause to the <tt>deriv</tt> program and
defining appropriate procedures <tt>exponentiation?</tt>,
<tt>base</tt>, <tt>exponent</tt>, and
<tt>make-exponentiation</tt>. (You may use the symbol <tt>**</tt>
to denote exponentiation.) Build in the rules that anything
raised to the power 0 is 1 and anything raised to the power 1 is
the thing itself.</p>
<p><a name="%_thm_2.57"></a> <b>Exercise
2.57.</b> Extend the differentiation program to handle
sums and products of arbitrary numbers of (two or more) terms.
Then the last example above could be expressed as</p>
<p>
<tt>(deriv '(* x y (+ x 3)) 'x)<br />
</tt></p>
<p>Try to do this by changing only the representation for sums
and products, without changing the <tt>deriv</tt> procedure at
all. For example, the <tt>addend</tt> of a sum would be the first
term, and the <tt>augend</tt> would be the sum of the rest of the
terms.</p>
<p><a name="%_thm_2.58"></a> <b>Exercise
2.58.</b> <a name="%_idx_2096"></a><a name="%_idx_2098"></a>Suppose we want to modify the differentiation
program so that it works with ordinary mathematical notation, in
which <tt>+</tt> and <tt>*</tt> are infix rather than prefix
operators. Since the differentiation program is defined in terms
of abstract data, we can modify it to work with different
representations of expressions solely by changing the predicates,
selectors, and constructors that define the representation of the
algebraic expressions on which the differentiator is to
operate.</p>
<p>a. Show how to do this in order to differentiate algebraic
expressions presented in infix form, such as <tt>(x + (3 * (x +
(y + 2))))</tt>. To simplify the task, assume that <tt>+</tt> and
<tt>*</tt> always take two arguments and that expressions are
fully parenthesized.</p>
<p>b. The problem becomes substantially harder if we allow
standard algebraic notation, such as <tt>(x + 3 * (x + y +
2))</tt>, which drops unnecessary parentheses and assumes that
multiplication is done before addition. Can you design
appropriate predicates, selectors, and constructors for this
notation such that our derivative program still works?</p>
<p><a name="%_sec_2.3.3"></a></p>
<h3><a href="book-Z-H-4.html#%_toc_%_sec_2.3.3">2.3.3 Example:
Representing Sets</a></h3>
<p><a name="%_idx_2100"></a> In the previous examples we built
representations for two kinds of compound data objects: rational
numbers and algebraic expressions. In one of these examples we
had the choice of simplifying (reducing) the expressions at
either construction time or selection time, but other than that
the choice of a representation for these structures in terms of
lists was straightforward. When we turn to the representation of
sets, the choice of a representation is not so obvious. Indeed,
there are a number of possible representations, and they differ
significantly from one another in several ways.</p>
<p><a name="%_idx_2102"></a>Informally, a set is simply a
collection of distinct objects. To give a more precise definition
we can employ the method of data abstraction. That is, we define
``set'' by specifying the operations that are to be used on sets.
These are <tt>union-set</tt>, <tt>intersection-set</tt>,
<tt>element-of-set?</tt>, and <tt>adjoin-set</tt>. <a name="%_idx_2104"></a><tt>Element-of-set?</tt> is a predicate that
determines whether a given element is a member of a set. <a name="%_idx_2106"></a><tt>Adjoin-set</tt> takes an object and a set as
arguments and returns a set that contains the elements of the
original set and also the adjoined element. <a name="%_idx_2108"></a><tt>Union-set</tt> computes the union of two
sets, which is the set containing each element that appears in
either argument. <a name="%_idx_2110"></a><tt>Intersection-set</tt> computes the
intersection of two sets, which is the set containing only
elements that appear in both arguments. From the viewpoint of
data abstraction, we are free to design any representation that
implements these operations in a way consistent with the
interpretations given above.<a href="#footnote_Temp_240" name="call_footnote_Temp_240" id="call_footnote_Temp_240"><sup><small>37</small></sup></a></p>
<p><a name="%_sec_Temp_241"></a></p>
<h4><a href="book-Z-H-4.html#%_toc_%_sec_Temp_241">Sets as
unordered lists</a></h4>
<p><a name="%_idx_2112"></a><a name="%_idx_2114"></a> One way to
represent a set is as a list of its elements in which no element
appears more than once. The empty set is represented by the empty
list. In this representation, <tt>element-of-set?</tt> is similar
to the procedure <tt>memq</tt> of section <a href="#%_sec_2.3.1">2.3.1</a>. It uses <tt>equal?</tt> instead of
<tt>eq?</tt> so that the set elements need not be symbols:</p>
<p><tt><a name="%_idx_2116"></a>(define (element-of-set? x set)<br />
(cond ((null? set) false)<br />
((equal? x (car set)) true)<br />
(else (element-of-set? x (cdr set)))))<br />
</tt></p>
<p>Using this, we can write <tt>adjoin-set</tt>. If the object to
be adjoined is already in the set, we just return the set.
Otherwise, we use <tt>cons</tt> to add the object to the list
that represents the set:</p>
<p><tt><a name="%_idx_2118"></a>(define (adjoin-set x set)<br />
(if (element-of-set? x set)<br />
set<br />
(cons x set)))<br /></tt></p>
<p>For <tt>intersection-set</tt> we can use a recursive strategy.
If we know how to form the intersection of <tt>set2</tt> and the
<tt>cdr</tt> of <tt>set1</tt>, we only need to decide whether to
include the <tt>car</tt> of <tt>set1</tt> in this. But this
depends on whether <tt>(car set1)</tt> is also in <tt>set2</tt>.
Here is the resulting procedure:</p>
<p><tt><a name="%_idx_2120"></a>(define (intersection-set set1 set2)<br />
(cond ((or (null? set1) (null? set2)) '())<br />
((element-of-set? (car set1) set2) <br />
(cons (car set1)<br />
(intersection-set (cdr set1) set2)))<br />
(else (intersection-set (cdr set1) set2))))<br />
</tt></p>
<p>In designing a representation, one of the issues we should be
concerned with is efficiency. Consider the number of steps
required by our set operations. Since they all use
<tt>element-of-set?</tt>, the speed of this operation has a major
impact on the efficiency of the set implementation as a whole.
Now, in order to check whether an object is a member of a set,
<tt>element-of-set?</tt> may have to scan the entire set. (In the
worst case, the object turns out not to be in the set.) Hence, if
the set has <em>n</em> elements, <tt>element-of-set?</tt> might
take up to <em>n</em> steps. Thus, the number of steps required
grows as <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em>). The
number of steps required by <tt>adjoin-set</tt>, which uses this
operation, also grows as <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em>). For <tt>intersection-set</tt>, which does an
<tt>element-of-set?</tt> check for each element of <tt>set1</tt>,
the number of steps required grows as the product of the sizes of
the sets involved, or <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em><sup>2</sup>) for two sets of size <em>n</em>. The
same will be true of <tt>union-set</tt>.</p>
<p><a name="%_thm_2.59"></a> <b>Exercise
2.59.</b> Implement the <a name="%_idx_2122"></a><tt>union-set</tt> operation for the
unordered-list representation of sets.</p>
<p><a name="%_thm_2.60"></a> <b>Exercise 2.60.</b> We
specified that a set would be represented as a list with no
duplicates. Now suppose we allow duplicates. For instance, the
set {1,2,3} could be represented as the list <tt>(2 3 2 1 3 2
2)</tt>. Design procedures <tt>element-of-set?</tt>,
<tt>adjoin-set</tt>, <tt>union-set</tt>, and
<tt>intersection-set</tt> that operate on this representation.
How does the efficiency of each compare with the corresponding
procedure for the non-duplicate representation? Are there
applications for which you would use this representation in
preference to the non-duplicate one?</p>
<p><a name="%_sec_Temp_244"></a></p>
<h4><a href="book-Z-H-4.html#%_toc_%_sec_Temp_244">Sets as
ordered lists</a></h4>
<p><a name="%_idx_2124"></a><a name="%_idx_2126"></a> One way to
speed up our set operations is to change the representation so
that the set elements are listed in increasing order. To do this,
we need some way to compare two objects so that we can say which
is bigger. For example, we could compare symbols
lexicographically, or we could agree on some method for assigning
a unique number to an object and then compare the elements by
comparing the corresponding numbers. To keep our discussion
simple, we will consider only the case where the set elements are
numbers, so that we can compare elements using <tt>></tt> and
<tt><</tt>. We will represent a set of numbers by listing its
elements in increasing order. Whereas our first representation
above allowed us to represent the set {1,3,6,10} by listing the
elements in any order, our new representation allows only the
list <tt>(1 3 6 10)</tt>.</p>
<p>One advantage of ordering shows up in
<tt>element-of-set?</tt>: In checking for the presence of an
item, we no longer have to scan the entire set. If we reach a set
element that is larger than the item we are looking for, then we
know that the item is not in the set:</p>
<p><tt><a name="%_idx_2128"></a>(define (element-of-set? x set)<br />
(cond ((null? set) false)<br />
((= x (car set)) true)<br />
((< x (car set)) false)<br />
(else (element-of-set? x (cdr set)))))<br />
</tt></p>
<p>How many steps does this save? In the worst case, the item we
are looking for may be the largest one in the set, so the number
of steps is the same as for the unordered representation. On the
other hand, if we search for items of many different sizes we can
expect that sometimes we will be able to stop searching at a
point near the beginning of the list and that other times we will
still need to examine most of the list. On the average we should
expect to have to examine about half of the items in the set.
Thus, the average number of steps required will be about
<em>n</em>/2. This is still <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em>) growth, but it does save us, on the average, a
factor of 2 in number of steps over the previous
implementation.</p>
<p>We obtain a more impressive speedup with
<tt>intersection-set</tt>. In the unordered representation this
operation required <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em><sup>2</sup>) steps, because we performed a
complete scan of <tt>set2</tt> for each element of <tt>set1</tt>.
But with the ordered representation, we can use a more clever
method. Begin by comparing the initial elements, <tt>x1</tt> and
<tt>x2</tt>, of the two sets. If <tt>x1</tt> equals <tt>x2</tt>,
then that gives an element of the intersection, and the rest of
the intersection is the intersection of the <tt>cdr</tt>s of the
two sets. Suppose, however, that <tt>x1</tt> is less than
<tt>x2</tt>. Since <tt>x2</tt> is the smallest element in
<tt>set2</tt>, we can immediately conclude that <tt>x1</tt>
cannot appear anywhere in <tt>set2</tt> and hence is not in the
intersection. Hence, the intersection is equal to the
intersection of <tt>set2</tt> with the <tt>cdr</tt> of
<tt>set1</tt>. Similarly, if <tt>x2</tt> is less than
<tt>x1</tt>, then the intersection is given by the intersection
of <tt>set1</tt> with the <tt>cdr</tt> of <tt>set2</tt>. Here is
the procedure:</p>
<p><tt><a name="%_idx_2130"></a>(define (intersection-set set1 set2)<br />
(if (or (null? set1) (null? set2))<br />
'() <br />
(let ((x1 (car set1)) (x2 (car set2)))<br />
(cond ((= x1 x2)<br />
(cons x1<br />
(intersection-set (cdr set1)<br />
(cdr set2))))<br />
((< x1 x2)<br />
(intersection-set (cdr set1) set2))<br />
((< x2 x1)<br />
(intersection-set set1 (cdr set2)))))))<br />
</tt></p>
<p>To estimate the number of steps required by this process,
observe that at each step we reduce the intersection problem to
computing intersections of smaller sets -- removing the first
element from <tt>set1</tt> or <tt>set2</tt> or both. Thus, the
number of steps required is at most the sum of the sizes of
<tt>set1</tt> and <tt>set2</tt>, rather than the product of the
sizes as with the unordered representation. This is <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em>) growth rather than
<img src="book-Z-G-D-3.gif" border="0" />(<em>n</em><sup>2</sup>)
-- a considerable speedup, even for sets of moderate size.</p>
<p><a name="%_thm_2.61"></a> <b>Exercise
2.61.</b> Give an implementation of <a name="%_idx_2132"></a><tt>adjoin-set</tt> using the ordered
representation. By analogy with <tt>element-of-set?</tt> show how
to take advantage of the ordering to produce a procedure that
requires on the average about half as many steps as with the
unordered representation.</p>
<p><a name="%_thm_2.62"></a> <b>Exercise
2.62.</b> Give a <img src="book-Z-G-D-3.gif" border="0" />(<em>n</em>) implementation of <a name="%_idx_2134"></a><tt>union-set</tt> for sets represented as
ordered lists.</p>
<p><a name="%_sec_Temp_247"></a></p>
<h4><a href="book-Z-H-4.html#%_toc_%_sec_Temp_247">Sets as binary
trees</a></h4>
<p><a name="%_idx_2136"></a><a name="%_idx_2138"></a><a name="%_idx_2140"></a><a name="%_idx_2142"></a><a name="%_idx_2144"></a><a name="%_idx_2146"></a> We can do better than
the ordered-list representation by arranging the set elements in
the form of a tree. Each node of the tree holds one element of
the set, called the ``entry'' at that node, and a link to each of
two other (possibly empty) nodes. The ``left'' link points to
elements smaller than the one at the node, and the ``right'' link
to elements greater than the one at the node.
Figure <a href="#%_fig_2.16">2.16</a> shows some trees that
represent the set {1,3,5,7,9,11}. The same set may be represented
by a tree in a number of different ways. The only thing we
require for a valid representation is that all elements in the
left subtree be smaller than the node entry and that all elements
in the right subtree be larger.</p>
<p><a name="%_fig_2.16"></a></p>
<div align="left">
<div align="left">
<b>Figure 2.16:</b> Various binary trees that
represent the set { 1,3,5,7,9,11 }.
</div>
<table width="100%">
<tr>
<td><img src="ch2-Z-G-51.gif" border="0" /></td>
</tr>
<tr>
<td></td>
</tr>
</table>
</div>
<p>The advantage of the tree representation is this: Suppose we
want to check whether a number <em>x</em> is contained in a set.
We begin by comparing <em>x</em> with the entry in the top node.
If <em>x</em> is less than this, we know that we need only search
the left subtree; if <em>x</em> is greater, we need only search
the right subtree. Now, if the tree is ``balanced,'' each of
these subtrees will be about half the size of the original. Thus,
in one step we have reduced the problem of searching a tree of
size <em>n</em> to searching a tree of size <em>n</em>/2. Since
the size of the tree is halved at each step, we should expect
that the number of steps needed to search a tree of size
<em>n</em> grows as <img src="book-Z-G-D-3.gif" border="0" />(<tt>log</tt> <em>n</em>).<a href="#footnote_Temp_248" name="call_footnote_Temp_248" id="call_footnote_Temp_248"><sup><small>38</small></sup></a> For
large sets, this will be a significant speedup over the previous
representations.</p>
<p><a name="%_idx_2150"></a>We can represent trees by using
lists. Each node will be a list of three items: the entry at the
node, the left subtree, and the right subtree. A left or a right
subtree of the empty list will indicate that there is no subtree
connected there. We can describe this representation by the
following procedures:<a href="#footnote_Temp_249" name="call_footnote_Temp_249" id="call_footnote_Temp_249"><sup><small>39</small></sup></a></p>
<p><tt><a name="%_idx_2152"></a>(define (entry tree) (car tree))<br />
<a name="%_idx_2154"></a>(define (left-branch tree) (cadr tree))<br />
<a name="%_idx_2156"></a>(define (right-branch tree) (caddr tree))<br />
<a name="%_idx_2158"></a>(define (make-tree entry left right)<br />
(list entry left right))<br /></tt></p>
<p>Now we can write the <tt>element-of-set?</tt> procedure using
the strategy described above:</p>
<p><tt><a name="%_idx_2160"></a>(define (element-of-set? x set)<br />
(cond ((null? set) false)<br />
((= x (entry set)) true)<br />
((< x (entry set))<br />
(element-of-set? x (left-branch set)))<br />
((> x (entry set))<br />
(element-of-set? x (right-branch set)))))<br />
</tt></p>
<p>Adjoining an item to a set is implemented similarly and also
requires <img src="book-Z-G-D-3.gif" border="0" />(<tt>log</tt>
<em>n</em>) steps. To adjoin an item <tt>x</tt>, we compare
<tt>x</tt> with the node entry to determine whether <tt>x</tt>
should be added to the right or to the left branch, and having
adjoined <tt>x</tt> to the appropriate branch we piece this newly
constructed branch together with the original entry and the other
branch. If <tt>x</tt> is equal to the entry, we just return the
node. If we are asked to adjoin <tt>x</tt> to an empty tree, we
generate a tree that has <tt>x</tt> as the entry and empty right
and left branches. Here is the procedure:</p>
<p><tt><a name="%_idx_2162"></a>(define (adjoin-set x set)<br />
(cond ((null? set) (make-tree x '() '()))<br />
((= x (entry set)) set)<br />
((< x (entry set))<br />
(make-tree (entry set) <br />
(adjoin-set x (left-branch set))<br />
(right-branch set)))<br />
((> x (entry set))<br />
(make-tree (entry set)<br />
(left-branch set)<br />
(adjoin-set x (right-branch set))))))<br />
</tt></p>
<p>The above claim that searching the tree can be performed in a
logarithmic number of steps rests on the assumption that the tree
is <a name="%_idx_2164"></a><a name="%_idx_2166"></a>``balanced,'' i.e., that the left and the right
subtree of every tree have approximately the same number of
elements, so that each subtree contains about half the elements
of its parent. But how can we be certain that the trees we
construct will be balanced? Even if we start with a balanced
tree, adding elements with <tt>adjoin-set</tt> may produce an
unbalanced result. Since the position of a newly adjoined element
depends on how the element compares with the items already in the
set, we can expect that if we add elements ``randomly'' the tree
will tend to be balanced on the average. But this is not a
guarantee. For example, if we start with an empty set and adjoin
the numbers 1 through 7 in sequence we end up with the highly
unbalanced tree shown in figure <a href="#%_fig_2.17">2.17</a>. In this tree all the left subtrees are
empty, so it has no advantage over a simple ordered list. One way
to solve this problem is to define an operation that transforms
an arbitrary tree into a balanced tree with the same elements.
Then we can perform this transformation after every few
<tt>adjoin-set</tt> operations to keep our set in balance. There
are also other ways to solve this problem, most of which involve
designing new data structures for which searching and insertion
both can be done in <img src="book-Z-G-D-3.gif" border="0" />(<tt>log</tt> <em>n</em>) steps.<a href="#footnote_Temp_250" name="call_footnote_Temp_250" id="call_footnote_Temp_250"><sup><small>40</small></sup></a></p>
<p><a name="%_fig_2.17"></a></p>
<div align="left">
<div align="left">
<b>Figure 2.17:</b> Unbalanced tree produced by
adjoining 1 through 7 in sequence.
</div>
<table width="100%">
<tr>
<td><img src="ch2-Z-G-52.gif" border="0" /></td>
</tr>
<tr>
<td></td>
</tr>
</table>
</div>
<p><a name="%_thm_2.63"></a> <b>Exercise
2.63.</b> Each of the following two procedures
converts a <a name="%_idx_2182"></a><a name="%_idx_2184"></a>binary tree to a list. <a name="%_idx_2186"></a></p>
<p><tt>(define (tree->list-1 tree)<br />
(if (null? tree)<br />
'()<br />
(append (tree->list-1 (left-branch tree))<br />
(cons (entry tree)<br />
(tree->list-1 (right-branch tree))))))<br />