Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lwt.ml: human-friendly edition – major refactoring and commenting of the Lwt core #354

Merged
merged 66 commits into from
Jul 5, 2017

Conversation

aantron
Copy link
Collaborator

@aantron aantron commented May 14, 2017

This PR heavily updates lwt.ml to make it much more legible and friendly.

What the new lwt.ml looks like...

For comparison, what the previous lwt.ml looks like: [link].


Why?

This PR is a series of gradual refactorings of lwt.ml to make it more legible, such that the final version shares almost no lines in common with the original, though the ASTs are still largely similar. There are two main purposes:

  • Make it much easier to contribute to Lwt. I've seen several comments along the lines of "I wanted to contribute, but I couldn't read lwt.ml," and I couldn't read it well either. Even if someone can't read the new lwt.ml, they should feel welcomed and encouraged to ask about it.

  • Have enough people actually understand the semantics of lwt.ml deeply to be able to do some combination of

    Ideally, enough people can read lwt.ml for the semantics to enter the community's organizational memory. This will hopefully free Lwt from some paralysis and inertia.

    I found that, while reading lwt.ml, I needed to refactor it in some places, to be able to continue reading it more and more deeply, instead of getting stuck on superficial difficulties. That eventually led to this whole PR.


Highlights of the new code

The new lwt.ml has:

  • Thorough documentation, including the mentioned Reading Guide and Overview, as well as many implementation detail comments. The Overview discusses all concepts necessary to understand everything in lwt.ml.
  • Sectioning with internal modules. Signatures limit the scope of helpers, and provide type-checked tables of contents. Internal modules can be folded by text editors that support code folding. Functions are reordered by category, compared to the original lwt.ml.
  • Lots of renaming for clearer, self-descriptive terminology, and a switch to the "promise" language.
  • General refactoring and cleanup to organize the code.
  • GADTs to eliminate a large number of assert false cases. The original code had 35 assert false expressions, which correspond to a considerably larger number of cases, because many appear under or-patterns. The new code has four assert false expressions.
  • Explicit annotation and documentation of promise state changes. Encoding (mutable) promise state in a GADT requires casts when state changes, and these casts explicitly document the state changes Lwt believes are possible between when Lwt gives a promise to a user, and when Lwt uses that promises again later.
  • No first-class modules. These were used for encoding existential types, and have been replaced by GADTs.

It's undoubtedly possible to improve the code even further, but I'd like to commit this as a starting point for that process.


How to review

This PR is broken up into 57 commits, of which:

  • The first 40 are trivial renamings, reorderings, and whitespace changes.
  • The next 11, from "be explicit about public/internal casts" to "refactor nchoose, npick, nchoose_split" are non-trivial changes that require true thinking. Of these, "make promise state into a GADT" is the most involved. Many of these have detailed commit messages.
  • The remaining 6 are again trivial renamings, reorderings, and whitespace changes.

It's still a lot to deal with, but hopefully this way of breaking up the history makes it somewhat possible to review the PR. It should also give lwt.ml fairly sane blame.

Each commit passes the Lwt test suite. Code coverage is 98%, though the test suite is more thorough than just achieving high coverage.


Pings and notes

I have to thank @dkim for some suggestions on improving the Overview. Thanks!

@talex5 Obviously, this affects tracing. Ultimately, I hope it makes tracing easier to maintain. We should probably also work on getting it maintained in the main Lwt repo at some point (#180).

This code should be bug-for-bug compatible with the previous code. That includes the assertion failures and segfaults we recently saw in #315 and #349. In fact, this code is likely to have worse behavior in such situations, due to having fewer assertion expressions. Preserved bugs also include #340.

Some of the stuff in the overview belongs in lwt.mli, and may get moved there when working on the manual.

I would appreciate it if anyone with access to a stress test, or any other non-trivial Lwt code, would give this rewrite a try (@mfp, @copy, @domsj?). The command to pin it should be:

opam pin add lwt https://github.com/ocsigen/lwt.git\#wat

Thanks!

aantron added 30 commits May 5, 2017 13:37
They either use the thread terminology, or will become misleading during
intermediate states of refactoring.
And rename its constructors, as well as some related types.
...also constructors and field names.
The preceding commits made some lines in lwt.ml really long; this commit
breaks them up.
...to the future location of the Basic_helpers module.
...to group it with the rest of the sequence-associated storage
functions.
...to group it with the rest of the completion loop functions.
...to the future Public_types module.
...to be closer to the sequential composition functions. They are the
only code in Lwt that uses the unify helper.
...to group with other initial promise functions. Lwt.protected will be
moved later: some refactoring is required first.
They belong in a miscellaneous section at the end of file.
...to a state query section at the end of file.
...to group it with the rest of the sequential composition functions.
...to group it with the rest of the sequential composition functions.
...so that they are next to the regular forms of the same functions.
...to group it with sequential composition functions.
Mainly because it is easier to read than the various choose and pick
functions, and so acts as a gentler introduction to the implementation
of concurrent composition.
Together with the next commit, this groups functions with a similar
implementation.
...to group functions with a similar implementation.
This allows moving the phantom types into internal module
Main_internal_types, making the code more organized, and removing the
need for an implementation note. See the deleted implementation note in
the diff for an explanation of the previous situation.

Suggested by Gabriel Radanne (@Drup) in

  #354 (comment)
The proxying vocabulary was suggested by Raphaël Proust
(@raphael-proust). Indeed, it coincides with the proxy terminology
already in use for one of the participants in what was formerly called
unification, so calling the whole process proxying just makes everything
simpler.
@aantron
Copy link
Collaborator Author

aantron commented Jun 29, 2017

Ok, the entire stability project turned out to be (probably) caused by the Skylake CPU bug. So, nothing is blocking work to finish off this PR. I've tried to address all the comments.

  • Switched "unification" vocabulary to "proxying".
  • Gave constructors to the phantom types, so they can be moved into an internal module.
  • Fixed typos.
  • Regarding the warning suppression (ping @avsm), I think we have to keep it. Since there is no _tags anymore, I will probably add it as an [@@@ocaml.warning] extension point inside lwt.ml in the upcoming merge commit.
  • Regarding the promise state terminology, I am now thinking that the right thing to do is to fully adopt JS terminology, i.e. completing becomes resolving, resolving becomes fulfilling, etc. I'll open a separate issue about that, and that issue will also include an item to write a table containing the mapping between old terminology, new terminology, and terminology in other popular languages or systems. For this PR, I've gone through the comments and tried to make sure the current transient terminology is self-consistent. A quirk (already observed by @edwintorok) is that what resolvers do is allow completing promises. After switching to JS terminology, resolvers will resolve promises. The JS terminology issue doesn't have to result in JS terminology; debate is welcome, but whatever the final terminology is, we will make everything as consistent and obvious as possible.
  • Regarding anything that is already explained in the .mli file, the .mli file will be rewritten as part of creating the new manual. I agree that several things from the comments in the new lwt.ml should, or are, explained there. However, the .mli file is currently a source of uncertainty, and I'd like to revisit lwt.ml once lwt.mli is written. I will adjust the comments then to reduce duplication, and everyone is welcome to participate in revisiting lwt.ml as well :)

This PR passes all the unit tests, and I believe it has also passed several stress tests by @smondet, @domsj, and @copy. I think we should merge it relatively soon. That will eliminate the chance of further bit rot, and dually eliminate uncertainty about working on lwt.ml/unblock several issues in it.

When merging the PR, I will incorporate the work on more accurate exception messages by @raphael-proust that has been added to master while the PR was open.

For whatever is left over, and upon new ideas, everyone is welcome to make further issues and PRs to improve lwt.ml. As it says inside:

* Please edit the code and the docs!

This code is meant to be readable, and to be edited. If you are reading
something, and think there is a much better way to express it, please go
ahead and open a pull request to the Lwt repository at

 https://github.com/ocsigen/lwt

I'll probably remove the word "much" from that. It sounds like a barrier.

So... :)

@aantron
Copy link
Collaborator Author

aantron commented Jun 29, 2017

(macOS build failures were caused by some upstream breakage on macOS, that was fixed in master by 0e63890 on May 16, so it should be fine on merge, and I will check that before putting the merge into master).

src/core/lwt.ml Outdated

* Documentation

The documentation begins with an Overview of major concepts and components.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Overview -> overview

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wrote that deliberately, but looking back on it months later, I have to agree :) Changing...

src/core/lwt.ml Outdated
type proxy = Proxy_and_this_constructor_is_not_used

type completed = Completed_and_this_constructor_is_not_used
type pending = Pending_and_this_constructor_is_not_used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should mark those types as private to enforce it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, I've done so.

This is actually enforcing even what prevented me from making the promise record type private – that the record would become immutable even inside the module Main_internal_types. private in a module is perfect for phantom types, though.

| Failed : exn -> ( _, underlying, completed) state
| Pending : 'a callbacks -> ('a, underlying, pending) state
| Proxy : ('a, _, 'c) promise -> ('a, proxy, 'c) state

Copy link
Contributor

@chambart chambart Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to avoid one indirection (and associated allocations) here.

type (_,_, _, _) kind =
    | Resolved : ('a,'a, underlying, completed) state
    | Failed   : (exn, _, underlying, completed) state
    | Pending  : ('a callbacks, 'a, underlying, pending) state
    | Proxy    : (('a, 'b, 'c) promise, 'a, proxy, 'c) state

type (_,_,_) promise = Promise : {
  mutable kind : ('value, 'a, 'u, 'c) kind;
  mutable value : 'value;
} -> ('a, 'u, 'c) promise

Of course the problem is that mutating the value requires both fields to be updated at once. You can't express that in the type system, hence you need to cheat some more with the type system to write a set function like:

val set : ('a, 'u, 'c) promise -> ('value, 'a, 'u, 'c) kind -> 'value -> unit

I'm not proud of this kind of suggestion, but in the case of Lwt the cost of traversing those pointers and allocating the state is far from negligible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem of that proposition is that, additionally to the set trick, it requires inline records. Lwt is still compatible with ocaml 4.02, inline records come from 4.03.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also reminds me of #94 (comment). We will have to keep it mind for the future.

aantron added a commit that referenced this pull request Jul 5, 2017
This merge commit includes a port to the new lwt.ml of the work by
Raphaël Proust (@raphael-proust) to improve exception messages,
originally committed in 38c167d.

The rewrite originally included a modification to _tags to disable a
certain warning (see the pull request). This has been dropped, because
Lwt no long has a _tags file, and most warnings don't seem to be enabled
in the new build system anyway. We should enable them soon, in
additional work.

Resolves #354.
@aantron aantron merged commit 577b5a1 into master Jul 5, 2017
@aantron
Copy link
Collaborator Author

aantron commented Jul 5, 2017

It's finally in! Thanks to all reviewers (whether commenting or not!) and stress testers!

@aantron
Copy link
Collaborator Author

aantron commented Jul 5, 2017

Hopefully, it's now open season on improving lwt.ml :)

@aantron aantron deleted the wat branch July 5, 2017 01:03
aantron added a commit that referenced this pull request Jul 7, 2017
- We generally build with something like -w +A-29.
- The Jbuilder --dev flag also turns on warnings as errors.
- The --dev flag has its own (currently) hardcoded warning set. We
  override it, because we want a more restrictive set, and for
  future-proofing.
- Fixed some warnings by modifying code.
- The new lwt.ml has received an annotation for disabling warning 4 for
  that file only; this was discussed in #354.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants