# Design Doc of rbs subtract
This document describes rbs subtract.
## Basic Usage
You can use `rbs subtract` from the command line.
```console=
# Print RBS to the stdout, which is generated.rbs - hand-written.rbs
$ rbs subtract generated.rbs hand-written.rbs
# It takes multiple minuends. The last argument becomes subtrahend.
$ rbs subtract generated-a.rbs generated-b.rbs hand-written.rbs
# It updates `.rbs` files directly with `-w` option.
$ rbs subtract -w generated-a.rbs generated-b.rbs hand-written.rbs
# It takes multiple subtrahends with --subtrahend option
$ rbs subtract generated.rbs --subtrahend=hand-written-a.rbs --subtrahend=hand-written-b.rbs
```
## The purpose of rbs subtract
`rbs subtract` focuses on combining auto-generated RBSs and hand-written RBSs (or two kinds of auto-generated RBSs).
There are several RBS generators, for example `rbs prototype` and RBS Rails. They are useful, but the RBSs they generate are not complete; for example, they include `untyped`.
So we want to override the generated RBSs with hand-written RBSs. But we had no good way to override them.
```rbs=
# auto-generated RBS
class C
  # This definition doesn't describe the argument and returned value types,
  # so we want to describe them with hand-written RBS.
  def m: (untyped) -> untyped
end

# hand-written RBS
class C
  # The following definition does not work with the generated RBS
  # because RBS doesn't allow duplicated method definitions.
  def m: (String) -> Integer

  # The following definition is valid, but it is not the expected behavior
  # because the overload still has `(untyped) -> untyped`.
  def m: (String) -> Integer | ...
end
```
To solve this problem, we need to remove the `C#m` definition from the generated RBS. But modifying a generated file by hand causes maintenance problems sooner or later.
`rbs subtract` solves this problem. It automatically removes the duplicated definitions from the generated RBS, so the generated RBSs stay maintainable.
The goal of `rbs subtract` is to modify generated RBSs so that they are valid together with the other RBSs. So, after `rbs subtract a.rbs b.rbs > c.rbs`, the environment consisting of `b.rbs` and `c.rbs` has to be valid.
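As a rough mental model of this goal, the subtraction can be sketched with plain Ruby hashes. This is only a toy illustration and not the actual implementation: it does not use the rbs gem, an "RBS file" is assumed to be a Hash from a class name to its method names, and `subtract` is a hypothetical helper.

```ruby
# Toy model: an "RBS file" is a Hash from class name to an Array of method names.
# This does not use the rbs gem; `subtract` is a hypothetical helper.
def subtract(minuend, subtrahend)
  minuend.each_with_object({}) do |(class_name, methods), result|
    # Drop every method that the subtrahend already defines for this class.
    remaining = methods - subtrahend.fetch(class_name, [])
    # A class whose members are all removed disappears from the output.
    result[class_name] = remaining unless remaining.empty?
  end
end

generated    = { "C" => ["m", "n"], "D" => ["x"] }
hand_written = { "C" => ["m"], "D" => ["x"] }

subtracted = subtract(generated, hand_written)

# No method is defined by both the subtrahend and the result,
# which is the property that keeps the combined environment valid.
overlap = subtracted.any? do |class_name, methods|
  !(methods & hand_written.fetch(class_name, [])).empty?
end
```

In this model, `subtracted` keeps only `C#n`, and `overlap` is false, which corresponds to `b.rbs` and `c.rbs` having no duplicated method definitions.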
## Example workflow with rbs subtract
We can use this command with the following workflow on a Rails application.
```bash=
# Generate RBSs for Active Record models under sig/rbs_rails directory
$ bin/rake rbs_rails:all
# Generate RBSs for all Ruby code under sig/prototype directory
$ rbs prototype rb --out-dir=sig/prototype --base-dir=. app lib
# Remove methods generated by RBS Rails from sig/prototype
$ rbs subtract --write sig/prototype sig/rbs_rails
# Remove hand-written methods from generated RBSs
$ rbs subtract --write sig/prototype sig/rbs_rails sig/hand-written
```
Then the sig directory contains complete RBS files as an environment. It means `rbs -Isig validate` passes (as long as there are no missing classes and so on).
## Detailed specifications
See the test file.
## Implementation details
The main implementation is `RBS::Subtractor`. It subtracts an environment from declarations.
It uses `RBS::Environment` as the subtrahend.
The subtrahend needs several declarations of the same class to be merged, so `RBS::Declarations` is not appropriate for this purpose.
The subtrahend RBS is probably incomplete; for example, it may depend on the minuend RBS. `RBS::DefinitionBuilder` does not work in this case, so it is also inappropriate.
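The need to merge declarations can be illustrated with a small sketch. It is a toy model under stated assumptions, not how `RBS::Environment` works internally: `Decl` and `index_members` are hypothetical names, and a declaration is reduced to a class name plus method names.

```ruby
require "set"

# Hypothetical, simplified stand-in for a class declaration.
Decl = Struct.new(:class_name, :method_names)

# Merge every declaration of the same class into one Set of method names,
# so that a lookup by class name sees members from all reopened classes.
def index_members(decls)
  decls.each_with_object(Hash.new { |hash, key| hash[key] = Set.new }) do |decl, index|
    index[decl.class_name].merge(decl.method_names)
  end
end

subtrahend = [
  Decl.new("C", ["a"]),
  Decl.new("D", ["x"]),
  Decl.new("C", ["b"]), # `class C` declared again, e.g. in another file
]

index = index_members(subtrahend)
```

With a flat list of declarations, checking whether `C#b` exists would require scanning every declaration of `C`; the merged index answers it with a single lookup.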
## Limitations
### Interfaces mixin
`rbs subtract` is not aware of interface mixins. For example:
```rbs=
# minuend - generated
class C
  def x: () -> untyped
end

# subtrahend - hand-written
class C
  include _I
end

interface _I
  def x: () -> untyped
end

# subtracted by `rbs subtract`
class C
  def x: () -> untyped
end
```
The `x` method remains in the subtracted RBS because, in the subtrahend, it is defined by `_I`, not by `C`.
This causes a duplicated method definition error, so I'd like to improve this situation.
#### Solution ideas
* Remove the entire `class C` from the subtracted RBS if the subtrahend includes interface mixins.
  * We can fix this problem easily.
  * But it may remove necessary methods.
* Use DefinitionBuilder to trace inheritance.
  * DefinitionBuilder builds inheritance, so we can correctly remove methods defined by interfaces.
  * But DefinitionBuilder needs a complete RBS environment.
* Search interface inheritance in the Subtractor.
  * Re-implement DefinitionBuilder so that it works with an incomplete environment.
  * It is a bit hard, and it doesn't emulate the behavior 100%.
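The last idea could look roughly like the following sketch. It is a toy model rather than a proposal for the real `RBS::Subtractor`: the environment is assumed to be a Hash from a class or interface name to its methods and included interfaces, and `methods_with_interfaces` is a hypothetical helper.

```ruby
# Toy environment: name => { methods: [...], includes: [...] }.
# `methods_with_interfaces` is a hypothetical helper, not part of the rbs gem.
def methods_with_interfaces(env, name, seen = [])
  entry = env[name]
  # Unknown names are skipped because the subtrahend may be incomplete.
  # Already visited names are skipped to avoid looping on circular includes.
  return [] if entry.nil? || seen.include?(name)

  seen << name
  own = entry.fetch(:methods, [])
  mixed_in = entry.fetch(:includes, []).flat_map do |interface_name|
    methods_with_interfaces(env, interface_name, seen)
  end
  (own + mixed_in).uniq
end

subtrahend = {
  "C"  => { methods: [], includes: ["_I", "_Missing"] }, # `_Missing` is not declared here
  "_I" => { methods: ["x"], includes: [] },
}

minuend_methods = ["x", "y"]
remaining = minuend_methods - methods_with_interfaces(subtrahend, "C")
```

In this model `x` is removed from the minuend because `C` includes `_I`, while the undeclared `_Missing` is ignored instead of raising an error. Real RBS has more cases (for example `extend`, aliases, and type parameters) that this sketch does not cover.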
### Different type parameters
The subtracted RBS doesn't work with the subtrahend if a class/module has different type parameters in the minuend and the subtrahend.
```rbs=
# a.rbs
class C
  def foo: () -> untyped
end

# b.rbs
class C[T]
  def bar: () -> untyped
end

# rbs subtract a.rbs b.rbs is the following, the same as a.rbs
# The type parameter of `C` is different, so it causes an error.
class C
  def foo: () -> untyped
end
```
### attr_accessor
Currently, `rbs subtract` removes an `attr_accessor` if the subtrahend contains one of the methods that the `attr_accessor` defines. For example:
```rbs=
# minuend.rbs
class C
  # It defines a and a=
  attr_accessor a: String
end

# subtrahend.rbs
class C
  def a: () -> String
end
```
In this case, `rbs subtract minuend.rbs subtrahend.rbs` prints nothing, so it removes `C#a=` unexpectedly.
This problem is easier to fix than the others. We can convert the `attr_accessor` to an `attr_reader` or an `attr_writer` in this case; in the example above, `attr_writer a: String` should remain.
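The conversion could be sketched as follows. This is a minimal illustration under assumptions, not the current behavior of `rbs subtract`: `subtract_accessor` is a hypothetical helper that works on plain strings instead of RBS AST members.

```ruby
# Hypothetical helper: decides what should remain of `attr_accessor name: type`
# given the method names that the subtrahend defines for the same class.
# Returns the remaining member as a String, or nil when nothing remains.
def subtract_accessor(name, type, subtrahend_methods)
  reader_taken = subtrahend_methods.include?(name)
  writer_taken = subtrahend_methods.include?("#{name}=")

  if reader_taken && writer_taken
    nil # both methods are defined elsewhere, so remove the member
  elsif reader_taken
    "attr_writer #{name}: #{type}" # keep only the writer
  elsif writer_taken
    "attr_reader #{name}: #{type}" # keep only the reader
  else
    "attr_accessor #{name}: #{type}" # nothing to subtract
  end
end

# The example above: the subtrahend defines only `a`.
remaining = subtract_accessor("a", "String", ["a"])
```

Here `remaining` is `attr_writer a: String`, so `C#a=` survives the subtraction.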
## Alternative Approaches
This section describes alternative approaches that I considered.
### Specify multiple subtrahends
#### Decided specification
`rbs subtract` treats the last argument as the subtrahend by default. But you can also specify multiple subtrahends with the `--subtrahend` option. For example:
```bash=
# Specify one subtrahend
$ rbs subtract minuend.rbs subtrahend.rbs
# Specify two or more subtrahends
$ rbs subtract minuend.rbs --subtrahend=subtrahend_1.rbs --subtrahend=subtrahend_2.rbs
```
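This argument handling can be sketched with Ruby's standard `optparse` library. It is a minimal sketch and not the actual CLI code: `split_operands` is a hypothetical helper, and the case of a single positional argument without `--subtrahend` is left unhandled on purpose.

```ruby
require "optparse"

# Hypothetical helper: splits ARGV-like input into [minuends, subtrahends].
def split_operands(argv)
  subtrahends = []
  parser = OptionParser.new
  # Each occurrence of --subtrahend=PATH appends one subtrahend.
  parser.on("--subtrahend=PATH") { |path| subtrahends << path }

  # `parse` works on a copy of argv and returns the positional arguments.
  minuends = parser.parse(argv)

  # Without --subtrahend, the last positional argument is the subtrahend.
  subtrahends << minuends.pop if subtrahends.empty? && minuends.size >= 2
  [minuends, subtrahends]
end

default_form  = split_operands(%w[minuend_1.rbs minuend_2.rbs subtrahend.rbs])
explicit_form = split_operands(%w[minuend.rbs --subtrahend=subtrahend_1.rbs --subtrahend=subtrahend_2.rbs])
```

In the default form the last argument becomes the only subtrahend; in the explicit form every positional argument stays a minuend.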
#### Why this feature is necessary
Specifying multiple subtrahends is useful in the following situation.
```
.
└── sig
    ├── app
    │   └── models/user.rbs
    ├── lib
    │   └── lib.rbs
    ├── prototype
    │   └── app/models/user.rbs
    └── rbs_rails
        └── app/models/user.rbs

6 directories, 4 files
```
In this case, we want `rbs subtract` to compute `(sig/prototype + sig/rbs_rails) - (sig/app + sig/lib)`, which takes two directories as the subtrahends.
#### Alternative Solutions
I considered the following solutions too.
```console=
# Separate minuends and subtrahends by `-`
#
# It looks cool, but it is not common as CLI.
# And I'm not sure I can implement it easily with optparse, because `-` is a meta character of optparse.
$ rbs subtract sig/prototype sig/rbs_rails - sig/app sig/lib
# Add --minuend option
#
# It is not bad, but I like --subtrahend.
$ rbs subtract --minuend=sig/prototype --minuend=sig/rbs_rails sig/app sig/lib
# Add --minuend and --subtrahend options
#
# It is too redundant.
$ rbs subtract --minuend=sig/prototype --minuend=sig/rbs_rails \
    --subtrahend=sig/app --subtrahend=sig/lib
# Specify comma separated files as subtrahend
#
# I do not want to implement the comma separated files because of escaping comma.
# It will introduce complexity.
$ rbs subtract sig/prototype sig/rbs_rails sig/app,sig/lib
```