Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dev/run): handling node number configuration #4973

Merged
merged 1 commit into from
Feb 13, 2024

Conversation

pgj
Copy link
Contributor

@pgj pgj commented Feb 4, 2024

The --node-number parameter is not interpreted correctly so that the node identifiers are shifted everywhere in the same way. For example, the lead port and node ports are assigned with no regard to what was set as the starting node index.

This could be a problem if, for example, one wants to simulate mixed-version cluster configurations and launch a set of nodes starting from a given index and use another invocation of the script to run another set of nodes with different subset of indexes.

Testing recommendations

There are no automated tests, but this can be tried out manually. For example, with Clouseau enabled, setting the node number to 2, both the ports and the Clouseau indexes are shifted:

$ make devclean
$ dev/run -a adm:pass --with-clouseau --node-number=2 -n 2
[ * ] Setup environment ... ok
[ * ] Ensure CouchDB is built ... ok
[ * ] Ensure Erlang boot script exists ... ok
[ * ] Prepare configuration files ... ok
[ * ] Start node node2 ... ok
[ * ] Start node node3 ... ok
[ * ] Start Clouseau node clouseau2 ... ok
[ * ] Start Clouseau node clouseau3 ... ok
[ * ] Check node at http:https://127.0.0.1:25984/ ... ok
[ * ] Check Clouseau node clouseau2 ... ok
[ * ] Check node at http:https://127.0.0.1:35984/ ... ok
[ * ] Check Clouseau node clouseau3 ... ok
[ * ] Running cluster setup ... ok
[ * ] Developers cluster is set up at http:https://127.0.0.1:25984.
Admin username: adm
Password: pass
Time to hack! ...

It is recommended to call the devclean target first to reset the persisted data (database contents, configuration) for the nodes to avoid problems at the startup.

@jiahuili430
Copy link
Contributor

jiahuili430 commented Feb 5, 2024

If we specify --node-number=2, do we need to change the default NODE in dev/remsh and dev/remsh-tls.

$ make devclean
$ ./dev/run --admin=adm:pass --node-number=2 -n 2 --no-eval ‘dev/remsh’
[ * ] Setup environment ... ok
[ * ] Ensure CouchDB is built ... ok
[ * ] Ensure Erlang boot script exists ... ok
[ * ] Prepare configuration files ... ok
[ * ] Start node node2 ... ok
[ * ] Start node node3 ... ok
[ * ] Check node at http:https://127.0.0.1:25984/ ... ok
[ * ] Check node at http:https://127.0.0.1:35984/ ... ok
[ * ] Running cluster setup ... ok
[ * ] Exec command dev/remsh ... Erlang/OTP 25 [erts-13.2.2.5] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

*** ERROR: Shell process terminated! (^G to start new job) ***

Probably not, since we can run NODE=2 ./dev/run --admin=adm:pass --node-number=2 -n 2 --no-eval 'dev/remsh'.

  • I feel, it's more convenient to start the node from 1.
  • It would be more acceptable if -n was a shortcut option to --node-number.
  • Maybe node-number is not a good name, maybe refactor it to a reasonable name?

These are just my personal feelings.
Anyway, your change makes --node-number=2 work. Thank you for your contribution.

@pgj
Copy link
Contributor Author

pgj commented Feb 5, 2024

@jiahuili430 I do not think dev/run should specifically support the remsh use case. For what it is worth, I just fixed what was already there. Fortunately, as you suggested in your comment above, there is an easy to way to make remsh fit if the numbering of nodes does not start from 1.

One as a starting index is quite fine in most of the cases. My use case is unusual, though valid. Since we have this --node-number flag there, I wanted to make it work again. I assume it has worked in the past, but some later changes broke it.

I agree that the --node-number could use a shorthand and a possibly a better name. Based on the helper text ("The node number to seed them when creating the node(s)"), --node-number-seed might be a good choice, with a shorthand of -s. However, I reckon that the lack of shorthand is due to the infrequent use of the flag.

Copy link
Contributor

@jaydoane jaydoane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main interest is being able to continue using the --no-eval dev/remsh trick to both start CouchDB and a remsh in the same command. Luckily, this works fine as @jiahuili430 already pointed out:

❯ make devclean
❯ NODE=3 dev/run -a adm:pass --with-clouseau --node-number=3 -n 3 --no-eval dev/remsh
[ * ] Setup environment ... ok
[ * ] Ensure CouchDB is built ... ok
[ * ] Ensure Erlang boot script exists ... ok
[ * ] Prepare configuration files ... ok
[ * ] Start node node3 ... ok
[ * ] Start node node4 ... ok
[ * ] Start node node5 ... ok
[ * ] Start Clouseau node clouseau3 ... ok
[ * ] Start Clouseau node clouseau4 ... ok
[ * ] Start Clouseau node clouseau5 ... ok
[ * ] Check node at http:https://127.0.0.1:35984/ ... ok
[ * ] Check Clouseau node clouseau3 ... ok
[ * ] Check node at http:https://127.0.0.1:45984/ ... ok
[ * ] Check Clouseau node clouseau4 ... ok
[ * ] Check node at http:https://127.0.0.1:55984/ ... ok
[ * ] Check Clouseau node clouseau5 ... ok
[ * ] Running cluster setup ... ok
[ * ] Exec command dev/remsh ... Erlang/OTP 25 [erts-13.2.2] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1] [jit]

Enabled docsh 0.7.0 from: /Users/jay/repos/docsh/_build/default/lib/docsh
Call h(docsh) for interactive help.

Eshell V13.2.2  (abort with ^G)
([email protected])1> mem3:nodes().
['[email protected]','[email protected]','[email protected]']
([email protected])2>

Nice improvement!

@big-r81
Copy link
Contributor

big-r81 commented Feb 8, 2024

NODE=3 dev/run -a adm:pass --with-clouseau --node-number=3 -n 3 --no-eval dev/remsh

Does this really work? Shouldn't the argument now be --node-number-seed?

@pgj
Copy link
Contributor Author

pgj commented Feb 8, 2024

@big-r81 earlier it was called --node-number maybe @jaydoane was using that version.

@jaydoane
Copy link
Contributor

jaydoane commented Feb 8, 2024

NODE=3 dev/run -a adm:pass --with-clouseau --node-number=3 -n 3 --no-eval dev/remsh

Does this really work? Shouldn't the argument now be --node-number-seed?

I guess I had an outdated version, but it did in fact work as I reported :)

Copy link
Contributor

@big-r81 big-r81 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

The `--node-number` parameter is not interpreted correctly so that
the node identifiers are shifted everywhere in the same way.  For
example, the lead port and node ports are assigned with no regard
to what was set as the starting node index.

This could be a problem if, for example, one wants to simulate
mixed-version cluster configurations and launch a set of nodes
starting from a given index and use another invocation of the
script to run another set of nodes with different subset of
indexes.

At the same time, the name `--node-number` does not describe well
the purpose of this flag, therefore it is renamed to
`--node-number-seed`.
@pgj pgj merged commit d546447 into apache:main Feb 13, 2024
14 checks passed
@pgj pgj deleted the fix/dev/run/node_number branch February 13, 2024 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants