stdlib: Add query optimisation to `ets:fun2ms/1` #4

TD5 · 2023-05-22T14:47:11Z

Unlike writing match specs directly, ets:fun2ms/1 generates queries by translating an Erlang function expression. This is convenient and makes for readable queries, but it necessarily trades off some expressiveness in favour of simplicity. For example, it's not possible to generate a match spec pattern that matches against an in-scope variable: users are forced to use something like an equality guard instead. Here, we resolve that issue by reading the author's intention from the given function expression, generating the match spec as before via ms_transform, and then running an optimisation pass over it during compilation in order to generate more efficient queries.

Performance

Amongst other things, we optimise equality guards by moving them into the pattern, which can avoid scanning the whole table, making queries O(1) or O(log(n)) (depending on the table type) rather than O(n), where n is the number of rows in the table. In other words, this is not primarily a micro-optimisation, but rather a very substantial algorithmic complexity improvement for many common queries.

In practice, I have seen no situations where the new ets:fun2ms/1 queries are slower, but many simple queries can be executed drastically faster when the number of rows in the table is large.

For example, even a simple query over a table of a million rows made up of pairs of keys and values queried with:

make_query(Key) ->
  ets:fun2ms(fun({K, V}) when K =:= Key -> {K,V} end).

now executes >1000x faster in my local benchmarks. Almost any query which requires that a =:= guard always holds will potentially see a substantial performance improvement.
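As a rough way to reproduce this kind of comparison, here is a minimal sketch. It is not the benchmark used for the numbers above: the module name `fun2ms_bench` and the function `run/1` are made up for illustration, and the two match specs are written by hand to mirror the old and new shapes rather than generated by `ets:fun2ms/1`.

```erlang
-module(fun2ms_bench).
-export([run/1]).

%% Hypothetical micro-benchmark, not the one used for the numbers above.
%% Returns {GuardMicros, BoundMicros} for a set table of N rows.
run(N) when is_integer(N), N > 0 ->
    T = ets:new(fun2ms_bench, [set]),
    true = ets:insert(T, [{K, K * 2} || K <- lists:seq(1, N)]),
    Key = (N + 1) div 2,
    %% Old shape: the key is unbound in the pattern, so every row is
    %% visited and tested by the guard.
    GuardSpec = [{{'$1', '$2'}, [{'=:=', '$1', Key}], [{{'$1', '$2'}}]}],
    %% New shape: the key is bound in the pattern, so a set table can
    %% answer with a single key lookup.
    BoundSpec = [{{Key, '$2'}, [], [{{Key, '$2'}}]}],
    {GuardMicros, Rows} = timer:tc(ets, select, [T, GuardSpec]),
    %% Re-using Rows here asserts that both specs return the same data.
    {BoundMicros, Rows} = timer:tc(ets, select, [T, BoundSpec]),
    [{Key, _}] = Rows,
    true = ets:delete(T),
    {GuardMicros, BoundMicros}.
```

Calling `fun2ms_bench:run(1000000)` returns the two timings in microseconds. A single run is noisy, so the ratio should be read as indicative only.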

Theory

From the existing ETS match spec docs:

Traversals using match and select functions may not need to scan the
entire table depending on how the key is specified. A match pattern with
a fully bound key (without any match variables) will optimize the
operation to a single key lookup without any table traversal at all. For
ordered_set a partially bound key will limit the traversal to only scan
a subset of the table based on term order. A partially bound key is
either a list or a tuple with a prefix that is fully bound.

We can leverage this knowledge to re-write queries to make better use of the key.

For example:

make_query(Key) ->
  ets:fun2ms(fun({K, V}) when K =:= Key -> {K,V} end).

was previously compiled to:

{
  {'$1', '$2'},
  [
    {'=:=', '$1', Key}
  ],
  [{'$1', '$2'}]
}

This was sub-optimal: the equality guard did not result in a fast lookup using the table's key, so it was less efficient than the functionally-equivalent pattern match.

Now, the same function expression is compiled to this, more efficient, query:

{
  {Key, '$2'},
  [],
  [{Key, '$2'}]
}
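To make the equivalence concrete, the following sketch runs both shapes against a small table and compares the results. The module name `fun2ms_equiv` and the function `check/1` are hypothetical. The specs shown above are abbreviated for readability; a spec that `ets:select/2` accepts is a list of clauses, writes a result tuple in double braces, and wraps an external term in `{const, Term}`, so the hand-written specs below use that form.

```erlang
-module(fun2ms_equiv).
-export([check/1]).

%% Returns true when the guard-based spec and the bound-key spec select
%% the same rows for the given key. Key is assumed to be a plain term
%% that is not itself a match variable such as '$1' or '_'.
check(Key) ->
    T = ets:new(fun2ms_equiv, [set]),
    true = ets:insert(T, [{a, 1}, {b, 2}, {c, 3}]),
    %% Key tested in a guard: the pattern leaves the key position unbound.
    Guard = [{{'$1', '$2'}, [{'=:=', '$1', {const, Key}}], [{{'$1', '$2'}}]}],
    %% Key inlined into the pattern and the result: no guard is needed.
    Bound = [{{Key, '$2'}, [], [{{{const, Key}, '$2'}}]}],
    Same = ets:select(T, Guard) =:= ets:select(T, Bound),
    true = ets:delete(T),
    Same.
```

Both `fun2ms_equiv:check(b)` (a key that is present) and `fun2ms_equiv:check(z)` (a key that is absent) should return `true`.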

We can also simplify constant parts of queries statically, and perform other rewritings to improve efficiency, but the largest win comes from inlining the values of variables bound by guards such as (K =:= Key).

Implementation

This optimisation is implemented for all relevant literals that I could find. Floats were given extra consideration and testing because of the differences between ==/=:= and pattern matching. The handling of floats in ordered_set is safe because we only inline =:= guards into the match head and body, and we leave == as a guard: determining statically whether the table type would make inlining == safe is not feasible using the information available in the parse transform.
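The difference between `==` and exact matching can be seen with a small experiment. This is only an illustrative sketch on a `set` table with hand-written match specs; the module name `fun2ms_floats` and the function `demo/0` are hypothetical, and it does not exercise the parse transform itself.

```erlang
-module(fun2ms_floats).
-export([demo/0]).

%% Returns {ByEq, ByExact, ByPattern}: the rows selected for the integer 1
%% from a set table whose only key is the float 1.0.
demo() ->
    T = ets:new(fun2ms_floats, [set]),
    true = ets:insert(T, {1.0, float_row}),
    %% `==` compares numerically, so the integer 1 matches the key 1.0.
    ByEq = ets:select(T, [{{'$1', '$2'}, [{'==', '$1', 1}], ['$2']}]),
    %% `=:=` is exact: 1 and 1.0 are different terms.
    ByExact = ets:select(T, [{{'$1', '$2'}, [{'=:=', '$1', 1}], ['$2']}]),
    %% A key bound in the pattern is also matched exactly, like `=:=`.
    ByPattern = ets:select(T, [{{1, '$2'}, [], ['$2']}]),
    true = ets:delete(T),
    {ByEq, ByExact, ByPattern}.
```

Here `fun2ms_floats:demo()` returns `{[float_row], [], []}`: the `=:=` guard and the bound pattern agree with each other, whereas rewriting the `==` guard into the pattern would have changed the result.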

New unit tests cover the parse transform compiling to the expected match expression, the match expression matching the expected rows, and the equivalence between the naive match expression and the optimised one in terms of data returned. See the changes to ms_transform_SUITE.erl for more information.

This optimisation is specifically applied in ets:fun2ms/1, because I think users would expect generated match specs to avoid trivial inefficiencies (and, indeed, utilising the key efficiently when it was given as a parameter was impossible to express before). Moreover, by making use of ets:fun2ms/1, users have already ceded some control of the generated match spec to the tooling. Users who construct match specs directly will be unaffected.

Notably, since ets:fun2ms/1 is transformed at compile time (outside of the shell, at least), we don't pay any unnecessary performance penalty at runtime in order to apply these optimisations, and the cost of doing them at compile time is low relative to other operations.

Later work could explore runtime query-planning for ETS, but avoiding performance regressions for at least some queries will be harder to guarantee, since we would then have to consider the runtime cost of computing the optimisation itself.

Optimisation can be disabled with the no_optimise_fun2ms compiler flag, but by default it is enabled. The flag can be altered via the usual compile flag mechanisms, including the -compile(no_optimise_fun2ms) attribute.
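For instance, a module that opts out might look like the sketch below. The module name `fun2ms_flag_example` is hypothetical; the flag name is the one introduced by this change, and on an OTP build without it I would expect the attribute to be ignored as an unrecognised compile option.

```erlang
-module(fun2ms_flag_example).
%% Opt out of the fun2ms optimisation pass for this module only.
-compile(no_optimise_fun2ms).
%% Required for ets:fun2ms/1 to be translated at compile time.
-include_lib("stdlib/include/ms_transform.hrl").
-export([make_query/1]).

%% With the flag set, the equality guard is kept as a guard rather than
%% being inlined into the pattern.
make_query(Key) ->
    ets:fun2ms(fun({K, V}) when K =:= Key -> {K, V} end).
```

This can be handy for comparing the optimised and unoptimised specs for the same function expression while debugging.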


Add support for a new get request: atmark (SIOCATMARK). Also changed the name of the fion[read|write|space] requests to just n[read|write|space]. OTP-18660


* raimo/erts/fix-windows-build-warnings: Fix warning about not returning a value Fix warning about Sint* incompatible to LPDWORD Fix warning about ASSERT redefinition Fix warning about type cast from unsigned int to pointer Fix warning about (char *) vs. (int *) on Windows


Several functions in the `binary` module would accept an invalid pattern (such as an atom) if the subject binary was empty or if the `{scope,{0,0}}` option was given. The following functions were affected: match/{2,3} matches/{2,3} replace/{3,4} split/{2,3}

The flag works similarly to `-s` and `-run`, except that:

- Additional command line arguments starting with a hyphen will be passed to the invoked script as well, whilst with `-s` and `-run` these arguments will be passed to the runtime system.
- Command-line arguments will be passed directly to the function, without having to call `init:get_plain_arguments`.
- Scripts that make use of this option only need to define a function of arity one, as passing no arguments will result in a call like `func([])` as opposed to `func()`, which would be the behaviour of the existing options.

The documentation for the existing `-s` and `-run` options was updated to mention that they will not forward arguments starting with a hyphen to the specified function, to prevent surprises when using `argparse` or other option parser libraries.

init: Introduce -S flag OTP-18744



facebook-github-bot added the cla signed label May 22, 2023

TD5 force-pushed the ets-opt branch from ef754da to 687c3b0 Compare June 6, 2023 14:33

bmk and others added 28 commits August 3, 2023 10:06

[erts|esock] fionread ioctl for Windows

0f6fc89

OTP-18660

[kernel|esock|test] Add a simple test case

04b5ef7

Add a simple test case to simply print all ioctl requests. OTP-18660

[kernel|esock|test] Update for FreeBSD

615d376

OTP-18660

[kernel|esock] Update spec and doc for ioctl

034636a

OTP-18660

[erts,kernel|esock] Unix and doc fixes

a315d40

OTP-18660

[kernel|doc] Fixed since tags

8b82397

OTP-18660

[kernel|esock|doc] Fixed achor

ce05a4a

OTP-18660

[kernel|esock|doc] Cosmetic fix

dbc2b5c

OTP-18660

[kernel|esock|test] Corrected nread test case

c0cc2ba

Forgot to rename the fionread test case (to nread) and it also used the wrong request (fionread instead of nread). OTP-18660

[erts|esock] ifdef-ing

22acb89

OTP-18660

[erts|esock] Preliminary tcp_info support

dd7f6a1

OTP-18660

[erts,kernel|esock] Add support for ioctl tcp_info

d70b02e

Add support for the (windows only?) (I/O) control code SIO_TCP_INFO, tcp_info. The socket has to connected to return a value. Also updated the NTDDI version to NTDDI_WIN10_RS2 for all the esock stuff. OTP-18660

[kernel|esock|test] Add simple test case for ioctl:tcp_info

d50318e

OTP-18660

[kernel|esock] Add spec entry for tcp_info

9b58f0a

OTP-18660

[kernel|esock|doc] Add doc for ioctl:tcp_info

542ea04

OTP-18660

compiler: Clean up commented out code in beam_ssa_alias.erl

0e97f10

Remove commented out code and turn trace printouts into proper ?DP()-invocations.

[erts|esock] Add support for IOCTL RCVALL

47fb767

Add support for IOCTL rcvall (SIO_RCVALL). OTP-18660

[kernel|esock] Add support for ioctl rcvall

2b39b23

Update socket:ioctl/3 with support for set request rcvall. OTP-18660

[kernel|esock|doc] Add documentation for ioctl rcvall

a66cc8c

Add documentation for new ioctl set request rcvall. OTP-18660

[erts,kernel|esock] Add support for ioctl tcp_info

05fba99

Add support for the (windows only?) (I/O) control code SIO_TCP_INFO, tcp_info. The socket has to connected to return a value. Also updated the NTDDI version to NTDDI_WIN10_RS2 for all the esock stuff. OTP-18660

[erts|esock] Add support for ioctl rcvall_igmpmcast

17a639a

Add the support for the IOCTL rcvall_igmpmcast (SIO_RCVALL_IGMPMCAST). OTP-18660

[kernel|esock] Add support for ioctl rcvall_igmpmcast

58a2363

OTP-18660

[kernel|esock|doc] Add doc for the new ioctl request (rcvall_igmpmcast)

1175071

Add doc for ioctl rcvall_igmpmcast. OTP-18660

[erts|esock] Add support for ioctl rcvall_mcast

5454eb4

Add the support for the IOCTL rcvall_mcast (SIO_RCVALL_MCAST). OTP-18660

[kernel|esock] Add support for ioctl rcvall_mcast

0328511

OTP-18660

[kernel|esock|doc] Add doc for the new ioctl request (rcvall_mcast)

be19058

Add doc for ioctl rcvall_mcast. OTP-18660

Merge branch 'bmk/megaco/20230717/test_tweaking' into maint

85fa533

RaimoNiskanen and others added 24 commits August 31, 2023 09:29

Merge branch 'maint'

1049822

* maint: Fix warning about not returning a value Fix warning about Sint* incompatible to LPDWORD Fix warning about ASSERT redefinition Fix warning about type cast from unsigned int to pointer Fix warning about (char *) vs. (int *) on Windows

Merge pull request erlang#7597 from bjorng/bjorn/dialyzer/redefined-p…

62ef1b0

…roduct-type/erlangGH-7584/OTP-18738 dialyzer: Handle definition of type product/0

Merge pull request erlang#7599 from bjorng/bjorn/debugger/handle-maybe/…

7a6e424

…erlangGH-7410/OTP-18740 Teach the debugger to handle the maybe expression

Merge branch 'maint'

7388d5a

* maint: Teach the debugger to handle the maybe expression dialyzer: Handle definition of type product/0

Merge pull request erlang#7470 from jchristgit/init-dash-capital-s-flag

78d6886

init: Introduce -S flag OTP-18744

Update preloaded module init

8696b84

Merge pull request erlang#7551 from erlang/kuba/ssh/fix_doc_typo

70397e0

Update configure_algos.xml

Merge branch 'maint'

bbb84f3

Properly handle BEAM files without the "Type" chunk

c42b16a

When a BEAM file lacks a "Type" chunk, a default `any` type is set up, but all fields in the type was not initialized, which could lead to the JIT removing type tests when it was not safe to do so.

Merge branch 'raimo/triple-quoted-strings-warning' into maint

924483c

* raimo/triple-quoted-strings-warning: Update primary bootstrap Shorten code by list comprehension Update primary bootstrap Emit warning for triple quote chars Emit warning for triple quote chars

Merge pull request erlang#7610 from bjorng/bjorn/compiler/correct-bounds

c0474db

Fix incorrect range calculation for operator `rem`

Merge pull request erlang#7611 from bjorng/bjorn/stdlib/stricten-bina…

b6dcb5d

…ry-module/OTP-18743 binary module: Always detect invalid patterns

Merge branch 'maint'

7b16493

This is an 'ours' merge since we should not warn for triple double-quote strings on 'master'. The implementation of the feature will be merged later.

'dontroute' does not work on all platforms

62bd165

asn1: Unload generated code after each test case

fd1ea68

On 32-bit systems, we sometimes run out of address space when running the asn1 test suite. Try mitigating that by unloading generated code after each test case.

Merge branch 'bjorn/asn1/test-cuddling' into maint

9968037

* bjorn/asn1/test-cuddling: asn1: Unload generated code after each test case

Merge pull request erlang#7616 from bjorng/bjorn/jit/handle-missing-t…

db5c0aa

…ype-info/erlangGH-7492/OTP-18745 Properly handle BEAM files without the "Type" chunk

Merge pull request erlang#7603 from bjorng/bjorn/erts/no-type-info

0a6c695

Test BEAM files without type information

Merge branch 'maint'

b539c47

* maint: asn1: Unload generated code after each test case Properly handle BEAM files without the "Type" chunk Test BEAM files without type information

Merge branch 'raimo/kernel/gen_udp-pass-options/erlangGH-7569/OTP-187…

127a547

…34' into maint * raimo/kernel/gen_udp-pass-options/erlangGH-7569/OTP-18734: 'dontroute' does not work on all platforms Test more UDP socket options Test UDP socket options Allow missing options in open() calls

Merge branch 'maint'

1706caa

* maint: 'dontroute' does not work on all platforms Test more UDP socket options Test UDP socket options Allow missing options in open() calls

TD5 force-pushed the ets-opt branch 4 times, most recently from 614c926 to 6403107 Compare September 4, 2023 13:39

TD5 force-pushed the ets-opt branch from 6403107 to 4719d0a Compare September 4, 2023 13:43
