We’re here to master programming by learning how some fundamental concepts work. Instead of a tutorial showing each line of code, this repository contains a set of tests that describe the concepts being explored. There’s also some more information along this text for you to get started.
We’re using typescript and you need a working nodejs installed. The exact versions of the dependencies can be checked in the package.json file.
If you haven’t done it yet, clone the repository somewhere in your
computer using git
:
git clone [email protected]:clarete/dojo.git
You can now get into that directory and install the dependencies:
cd dojo
npm install
Then you’re ready to test your setup and get started with the exercises. To iterate, run the tests with:
npm test
You must see the following message in the first time you run it:
dojo $ npm test
> [email protected] test /tmp/dojo
> ts-node $(which jasmine) --config=jasmine.json
/tmp/dojo/node_modules/ts-node/src/index.ts:293
return new TSError(diagnosticText, diagnosticCodes)
^
TSError: ⨯ Unable to compile TypeScript:
kata1.spec.ts:1:42 - error TS2307: Cannot find module './api'.
1 import { BaseParser, ParsingError } from "./api";
~~~~~~~
at createTSError (/tmp/dojo/node_modules/ts-node/src/index.ts:293:12)
at reportTSError (/tmp/dojo/node_modules/ts-node/src/index.ts:297:19)
at getOutput (/tmp/dojo/node_modules/ts-node/src/index.ts:399:34)
at Object.compile (/tmp/dojo/node_modules/ts-node/src/index.ts:457:32)
at Module.m._compile (/tmp/dojo/node_modules/ts-node/src/index.ts:536:43)
at Module._extensions..js (internal/modules/cjs/loader.js:827:10)
at Object.require.extensions.(anonymous function) [as .ts] (/tmp/dojo/node_modules/ts-node/src/index.ts:539:12)
at Module.load (internal/modules/cjs/loader.js:685:32)
at Function.Module._load (internal/modules/cjs/loader.js:620:12)
at Module.require (internal/modules/cjs/loader.js:723:19)
npm ERR! Test failed. See above for more details.
Although that’s an error, right now it means success. You got as
far as you were meant to be in your journey. If you squint a bit,
you can see that the error happened because typescript
didn’t
find the module that implements the API required by the test.
That’s right, it’s on you to make this test and all the others
pass!
Although just looking through the tests will get you pretty far, especially in the beginning, at some point, things might get frustrating. If that ever happens or if you just prefer reading a bit about the abstract side before coding or even looking through the tests, continue the reading to get more context about what’s going on with all those tests and what you will build.
We usually say we “write software” because we use the most basic type of file to declare our programs: Plain Text. That’s the most powerful format we’ve ever created in computers. With a Plain Text file you can write a program that will be able to interpret whatever you wrote in whatever way your parser works.
Plain Text is as powerful as the parser we can write to interpret a plain text file. This is why it’s used as the fabric of the programs we write, because both us lil humans and machines can interpret them.
So we’re going to master one way of parsing machine readable text formats. Machine readable formats are rather easier to be parsed than natural language and although natural language is… hmmm… more natural to us humans, it’s different to tell computers how to do it. It’s way more involved.
We’re going to be writing a top down recursive parser. It means that 1. It reads the text from left to right, from top to bottom and also that it’s recursive, so it’s powerful enough to parse recursive structures (parse the definition of a list that can contain other lists for example).
This first exercise aims to teach us the basics about parsing. It
won’t involve recursion just yet. The exercise requires the
implementation of a class called BaseParser
. The class will have
to provide methods to control how far the parser has moved within
the input string. The test requires explicitly the BaseParser
to
provide the property cursor
.
The input, written to the parser instance only through
BaseParser#setInput
, isn’t ever mutated during parsing.
The method BaseParser#getCurrent()
will return the character at
the position the cursor
is in. And as a reminder, string
indexing starts from zero in typescript
so the cursor
must
also start from zero, thus getCurrent()
will return the first
character of the input
if the cursor
hasn’t moved yet.
The method BaseParser#next()
moves the cursor to the next
position in the input and must fail with a ParsingErros
if the
cursor has reached the end of the input.
The method BaseParser#expect(expected)
must either advance the
cursor if the expected
character is under the cursor, or fail
with a parsing error. And finally BaseParser#expectString(e)
should take a string and expect()
each character of the input
string e
to be matched.
The first Kata implements enough tooling to help our parser move
the cursor forward if it matches a string. It also introduces the
ParsingError
exception that is raised whenever an expected input
isn’t matched. This second exercise will teach you how to build
two little tools that will allow you to compose more involved
parsers much more simply.
- The method
BaseParser#zeroOrMore
will receive a parsing function and try to execute it indefinitely until aParsingError
happens. Then,zeroOrMore
returns all the successful outputs of the matching function.zeroOrMore
never fails, even if the matching function never succeeds. Instead, it just returns an empty array. - The method
BaseParser#or
will receive a list of parsing functions and try each of them and return the result of the very first successful one. Theor
method must reset the cursor to where it was before each failed attempt. If no alternatives work, theor
method will then fail with aParsingError
.
The second kata also introduces BaseParser#oneOrMore
but it is
just a syntactic sugar for calling a parsing function once and
then calling it again with zeroOrMore
.
Another very important concept that shows up in this kata is the idea of writing small parsing functions that work from where the cursor is located onward. Each parsing function will use these tools we built so far and other parsing functions for composing more complex parsers.
Pay very close attention to the little test function
expectInteger
within the test “should work with any matching
function”. It is an example of how a matching function can be
tailored from the most basic primitives getCurrent()
and
next()
. It also shows where the parsing error has to be
emitted on the custom matching functions
Most of the tooling needed for building our parser is now finished. This third exercise will teach you how to use the tooling you just built to parse a wide variety of readable formats.
The first thing we’re going to parse is numbers! If you’re familiar with parseInt, we’re going to implement a lightweight version of it. We’ll take a string and return the number contained in that string.
Let’s start pretty small though. Let’s parse a single decimal
digit. Although we’ve done something very similar in the last
test of the second kata (remember expectInteger
?), this time
we’ll try to use the functions we wrote instead of a regular
expression to match each digit.
To check if a character is a valid decimal digit, one just needs
to compare that character to characters from 0
to 9
. We can
use the combination of two methods we’ve built to implement this.
We just need a list of teeny little parsers that expect()
a
single character and then we can feed that list into the or
method. That gives you parseDigit()
!
For the next tests, you can follow the same logic of composing matching functions to be able to create more complex parsers.
The best advice for implementing parseDecimal()
is to use
parseDigit
oneOrMore
times!