content/book-Z-H-13.html

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http:https://www.w3.org/TR/REC-html40/loose.dtd">

<html>
<!-- Generated from TeX source by tex2page, v 4o, 
     (c) Dorai Sitaram, http:https://www.cs.rice.edu/~dorai/tex2page -->

<head>
  <meta name="generator" content="HTML Tidy for Linux (vers 7 December 2008), see www.w3.org" />

  <title>Structure and Interpretation of Computer Programs</title>
  <link href="book-Z-C.css" title="default" rel="stylesheet" type="text/css" />
  <meta name="robots" content="noindex,follow" />
</head>

<body>
  <mbp:pagebreak />

  <p><a name="%_chap_2"></a></p>

  <div class="chapterheading">
    <h3 class="chapter"><a href="book-Z-H-4.html#%_toc_%_chap_2">Chapter 2</a></h3>
    <h1>Building Abstractions with Data</h1>
  </div>

  <div align="right">
    <table width="60%">
      <tr>
        <td>
          <div><span class="epigraph">&quot;We now come to the decisive step of mathematical abstraction: we forget about what
          the symbols stand for. <tt>...</tt>[The mathematician] need not be idle; there are many operations which he may
          carry out with these symbols, without ever having to look at the things they stand for.&quot;</span></div>

          <div><span class="epigraph attrib"><a name="%_idx_1254"></a>Hermann Weyl, <em>The Mathematical Way of Thinking</em></span></div>
        </td>
      </tr>
    </table>
  </div>

  <p><a name="%_idx_1256"></a><a name="%_idx_1258"></a>We
  concentrated in chapter&nbsp;1 on computational processes and on
  the role of procedures in program design. We saw how to use
  primitive data (numbers) and primitive operations (arithmetic
  operations), how to combine procedures to form compound
  procedures through composition, conditionals, and the use of
  parameters, and how to abstract procedures by using
  <tt>define</tt>. We saw that a procedure can be regarded as a
  pattern for the local evolution of a process, and we classified,
  reasoned about, and performed simple algorithmic analyses of some
  common patterns for processes as embodied in procedures. We also
  saw that higher-order procedures enhance the power of our
  language by enabling us to manipulate, and thereby to reason in
  terms of, general methods of computation. This is much of the
  essence of programming.</p>

  <p>In this chapter we are going to look at more complex data. All
  the procedures in chapter&nbsp;1 operate on simple numerical
  data, and simple data are not sufficient for many of the problems
  we wish to address using computation. Programs are typically
  designed to model complex phenomena, and more often than not one
  must construct computational objects that have several parts in
  order to model real-world phenomena that have several aspects.
  Thus, whereas our focus in chapter&nbsp;1 was on building
  abstractions by combining procedures to form compound procedures,
  we turn in this chapter to another key aspect of any programming
  language: the means it provides for building abstractions by
  combining data objects to form <em>compound data</em>.</p>

  <p>Why do we want compound data in a programming language? For
  the same reasons that we want compound procedures: to elevate the
  conceptual level at which we can design our programs, to increase
  the modularity of our designs, and to enhance the expressive
  power of our language. Just as the ability to define procedures
  enables us to deal with processes at a higher conceptual level
  than that of the primitive operations of the language, the
  ability to construct compound data objects enables us to deal
  with data at a higher conceptual level than that of the primitive
  data objects of the language.</p>

  <p><a name="%_idx_1260"></a>Consider the task of designing a
  system to perform arithmetic with rational numbers. We could
  imagine an operation <tt>add-rat</tt> that takes two rational
  numbers and produces their sum. In terms of simple data, a
  rational number can be thought of as two integers: a numerator
  and a denominator. Thus, we could design a program in which each
  rational number would be represented by two integers (a numerator
  and a denominator) and where <tt>add-rat</tt> would be
  implemented by two procedures (one producing the numerator of the
  sum and one producing the denominator). But this would be
  awkward, because we would then need to explicitly keep track of
  which numerators corresponded to which denominators. In a system
  intended to perform many operations on many rational numbers,
  such bookkeeping details would clutter the programs
  substantially, to say nothing of what they would do to our minds.
  It would be much better if we could ``glue together'' a numerator
  and denominator to form a pair -- a <em>compound data object</em>
  -- that our programs could manipulate in a way that would be
  consistent with regarding a rational number as a single
  conceptual unit.</p>

  <p>The use of compound data also enables us to increase the
  modularity of our programs. If we can manipulate rational numbers
  directly as objects in their own right, then we can separate the
  part of our program that deals with rational numbers per se from
  the details of how rational numbers may be represented as pairs
  of integers. The general technique of isolating the parts of a
  program that deal with how data objects are represented from the
  parts of a program that deal with how data objects are used is a
  powerful design methodology called <a name="%_idx_1262"></a><em>data abstraction</em>. We will see how data
  abstraction makes programs much easier to design, maintain, and
  modify.</p>

  <p>The use of compound data leads to a real increase in the
  expressive power of our programming language. Consider the idea
  of forming a ``linear combination'' <em>a</em><em>x</em> +
  <em>b</em><em>y</em>. We might like to write a procedure that
  would accept <em>a</em>, <em>b</em>, <em>x</em>, and <em>y</em>
  as arguments and return the value of <em>a</em><em>x</em> +
  <em>b</em><em>y</em>. This presents no difficulty if the
  arguments are to be numbers, because we can readily define the
  procedure</p>

  <p>
  <tt>(define&nbsp;(linear-combination&nbsp;a&nbsp;b&nbsp;x&nbsp;y)&nbsp;<br />

  &nbsp;&nbsp;(+&nbsp;(*&nbsp;a&nbsp;x)&nbsp;(*&nbsp;b&nbsp;y)))<br />
  </tt></p>

  <p>But suppose we are not concerned only with numbers. Suppose we
  would like to express, in procedural terms, the idea that one can
  form linear combinations whenever addition and multiplication are
  defined -- for rational numbers, complex numbers, polynomials, or
  whatever. We could express this as a procedure of the form</p>

  <p>
  <tt>(define&nbsp;(linear-combination&nbsp;a&nbsp;b&nbsp;x&nbsp;y)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />

  &nbsp;&nbsp;(add&nbsp;(mul&nbsp;a&nbsp;x)&nbsp;(mul&nbsp;b&nbsp;y)))&nbsp;<br />
  </tt></p>

  <p>where <tt>add</tt> and <tt>mul</tt> are not the primitive
  procedures <tt>+</tt> and <tt>*</tt> but rather more complex
  things that will perform the appropriate operations for whatever
  kinds of data we pass in as the arguments <tt>a</tt>, <tt>b</tt>,
  <tt>x</tt>, and <tt>y</tt>. The key point is that the only thing
  <tt>linear-combination</tt> should need to know about <tt>a</tt>,
  <tt>b</tt>, <tt>x</tt>, and <tt>y</tt> is that the procedures
  <tt>add</tt> and <tt>mul</tt> will perform the appropriate
  manipulations. From the perspective of the procedure
  <tt>linear-combination</tt>, it is irrelevant what <tt>a</tt>,
  <tt>b</tt>, <tt>x</tt>, and <tt>y</tt> are and even more
  irrelevant how they might happen to be represented in terms of
  more primitive data. This same example shows why it is important
  that our programming language provide the ability to manipulate
  compound objects directly: Without this, there is no way for a
  procedure such as <tt>linear-combination</tt> to pass its
  arguments along to <tt>add</tt> and <tt>mul</tt> without having
  to know their detailed structure.<a href="#footnote_Temp_131" name="call_footnote_Temp_131" id="call_footnote_Temp_131"><sup><small>1</small></sup></a> We begin
  this chapter by implementing the rational-number arithmetic
  system mentioned above. This will form the background for our
  discussion of compound data and data abstraction. As with
  compound procedures, the main issue to be addressed is that of
  abstraction as a technique for coping with complexity, and we
  will see how data abstraction enables us to erect suitable
  <a name="%_idx_1264"></a><em>abstraction barriers</em> between
  different parts of a program.</p>

  <p>We will see that the key to forming compound data is that a
  programming language should provide some kind of ``glue'' so that
  data objects can be combined to form more complex data objects.
  There are many possible kinds of glue. Indeed, we will discover
  how to form compound data using no special ``data'' operations at
  all, only procedures. This will further blur the distinction
  between ``procedure'' and ``data,'' which was already becoming
  tenuous toward the end of chapter&nbsp;1. We will also explore
  some conventional techniques for representing sequences and
  trees. One key idea in dealing with compound data is the notion
  of <a name="%_idx_1266"></a><em>closure</em> -- that the glue we
  use for combining data objects should allow us to combine not
  only primitive data objects, but compound data objects as well.
  Another key idea is that compound data objects can serve as
  <a name="%_idx_1268"></a><em>conventional interfaces</em> for
  combining program modules in mix-and-match ways. We illustrate
  some of these ideas by presenting a simple graphics language that
  exploits closure.</p>

  <p>We will then augment the representational power of our
  language by introducing <a name="%_idx_1270"></a><a name="%_idx_1272"></a><em>symbolic expressions</em> -- data whose
  elementary parts can be arbitrary symbols rather than only
  numbers. We explore various alternatives for representing sets of
  objects. We will find that, just as a given numerical function
  can be computed by many different computational processes, there
  are many ways in which a given data structure can be represented
  in terms of simpler objects, and the choice of representation can
  have significant impact on the time and space requirements of
  processes that manipulate the data. We will investigate these
  ideas in the context of symbolic differentiation, the
  representation of sets, and the encoding of information.</p>

  <p>Next we will take up the problem of working with data that may
  be represented differently by different parts of a program. This
  leads to the need to implement <a name="%_idx_1274"></a><a name="%_idx_1276"></a><em>generic operations</em>, which must handle
  many different types of data. Maintaining modularity in the
  presence of generic operations requires more powerful abstraction
  barriers than can be erected with simple data abstraction alone.
  In particular, we introduce <em>data-directed programming</em> as
  a technique that allows individual data representations to be
  designed in isolation and then combined <a name="%_idx_1278"></a><em>additively</em> (i.e., without
  modification). To illustrate the power of this approach to system
  design, we close the chapter by applying what we have learned to
  the implementation of a package for performing symbolic
  arithmetic on polynomials, in which the coefficients of the
  polynomials can be integers, rational numbers, complex numbers,
  and even other polynomials.</p>

  <div class="smallprint">
    <hr />
  </div>

  <div class="footnote">
    <p><a href="#call_footnote_Temp_131" name="footnote_Temp_131" id="footnote_Temp_131"><sup><small>1</small></sup></a> The
    ability to directly manipulate procedures provides an analogous
    increase in the expressive power of a programming language. For
    example, in section&nbsp;<a href="book-Z-H-12.html#%_sec_1.3.1">1.3.1</a> we introduced the
    <tt>sum</tt> procedure, which takes a procedure <tt>term</tt>
    as an argument and computes the sum of the values of
    <tt>term</tt> over some specified interval. In order to define
    <tt>sum</tt>, it is crucial that we be able to speak of a
    procedure such as <tt>term</tt> as an entity in its own right,
    without regard for how <tt>term</tt> might be expressed with
    more primitive operations. Indeed, if we did not have the
    notion of ``a procedure,'' it is doubtful that we would ever
    even think of the possibility of defining an operation such as
    <tt>sum</tt>. Moreover, insofar as performing the summation is
    concerned, the details of how <tt>term</tt> may be constructed
    from more primitive operations are irrelevant.</p>
  </div>

  
</body>
</html>