
support exec for nomad alloc exec and change_mode = script #193

Closed
Procsiab opened this issue Oct 3, 2022 · 5 comments

Procsiab (Contributor) commented Oct 3, 2022

Hello, I am facing an error when trying to run a Job with a Template using the "new" change_mode = "script". My main goal is to trigger an Nginx reload whenever its templated configuration file changes.

First of all, here is the reference documentation for this template feature; in addition, below is the snippet of the templated task from my Nomad Job file:

template {
    data = <<EOH
    # redacted non-relevant stuff
    upstream php-handler {
        {{- range service "myfpmservicename" }}
        server {{ .Address }}:{{ .Port }}{{ end }};
    }
EOH
    destination = "local/nginx.conf"
    change_mode = "script"
    change_script {
        command       = "/usr/sbin/nginx"
        args          = ["-s", "reload"]
        timeout       = "1s"
        fail_on_error = true
    }
}

When I run the job with the task templated this way, it starts without errors; however, if I try to restart its Allocation, the template re-render is triggered and the following error occurs until I stop the Job and run it again:

Task hook failed | Template failed to run script /usr/sbin/nginx with arguments [-s reload] because task driver doesn't support the exec operation.

Then after the second restart attempt:

Template failed to run script /usr/sbin/nginx with arguments [-s reload] on change: rpc error: code = Unknown desc = task not found for given id Exit code: 0

My environment is the following:

  • Consul v1.13.1
  • Nomad v1.3.5
  • nomad-driver-podman compiled from 39a4a50
  • Podman 4.2.1

If you need more information, I am willing to help test this use case: at the moment I am using change_mode = "restart", or "signal" where I can deliver the signal to the container (a sketch of that workaround is below), but this is not always ideal since restarting a Task may lead to service unavailability.
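
Here is a minimal sketch of that signal-based workaround (assuming the task driver can forward signals to the container; Nginx re-reads its configuration on SIGHUP):

template {
    data          = "# templated nginx.conf contents (redacted)"
    destination   = "local/nginx.conf"
    change_mode   = "signal"
    # Nginx reloads its configuration when it receives SIGHUP
    change_signal = "SIGHUP"
}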

tgross (Member) commented Oct 4, 2022

Hi @Procsiab! Just for clarity: the error message you're getting here is saying that the task driver doesn't have the capability to execute scripts inside tasks. That means you can't use nomad alloc exec either. That's documented in the Capabilities table for the driver: https://www.nomadproject.io/plugins/drivers/podman#capabilities

It looks like this is possible in Podman (ref https://docs.podman.io/en/latest/markdown/podman-exec.1.html), just not implemented in the driver yet. I'm going to update the title of this issue and mark it as a feature request.
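
As an aside, the exec primitive the driver would need is already exposed by the Podman CLI; for illustration (the container name below is just a placeholder):

# run a one-off command inside a running container;
# "nginx-task" is a hypothetical container name
podman exec nginx-task nginx -s reload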

tgross changed the title from "Nomad Template change_mode script not supported" to "support exec for nomad alloc exec and change_mode = script" Oct 4, 2022
Procsiab (Contributor, Author) commented Oct 4, 2022

Thanks for your clear explanation @tgross! I had not realized that the nomad alloc exec capability was the one involved; this makes a lot of sense.
Also, thanks for updating the title and label for this issue accordingly ❤️

towe75 (Collaborator) commented Nov 21, 2022

@Procsiab the driver has supported nomad alloc exec for quite a while. It works both on the command line and when opening a terminal via the Nomad UI.

Example:

 nomad alloc exec 2849 systemctl status apache2

● apache2.service - The Apache HTTP Server
     Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/apache2.service.d
             └─override.conf
     Active: active (running) since Mon 2022-11-21 00:46:02 UTC; 5h 40min ago
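
Opening a terminal via the Nomad UI corresponds to an interactive exec on the command line; roughly, with a hypothetical allocation ID and shell path:

 nomad alloc exec -i -t 2849 /bin/sh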

I never tried to use it within the template stanza, though.

@tgross is there anything special about using it in this context? I recall there was some oddity when running scripted health checks, which might be related.

Procsiab (Contributor, Author) commented:

I can confirm @towe75's observation on manually executing commands inside containers with the alloc exec subcommand. Since I opened the issue, Nomad 1.4 has come out: I will soon take the time to upgrade and also try to reproduce the issue on it.

lgfa29 added the stage/needs-verification (issue needs verifying it still exists) and theme/driver labels Dec 1, 2022
Procsiab (Contributor, Author) commented:

Hello there, it has been some time since the last update; however, I am updating the issue with good news: I can no longer reproduce my issue when using a script reload action with the Nomad template stanza.

My Nomad job uses:

  • Caddy proxy 2.4.6
  • Nomad 1.6.1

An extract from my job file:

            template {
                data = <<EOH
{{ lookup('file', 'applications/caddy/Caddyfile.tpl') }}
EOH
                destination   = "local/Caddyfile"
                change_mode   = "script"
                change_script {
                    command       = "/usr/bin/caddy"
                    args          = ["reload", "--config", "/local/Caddyfile", "--adapter", "caddyfile"]
                    timeout       = "5s"
                    fail_on_error = true
                }
            }

The Nomad allocation logs on the reload event:

Jul 31, '23 10:29:11 +0000 | Task hook message | Template successfully ran script /usr/bin/caddy with arguments: [reload --config /local/Caddyfile --adapter caddyfile]. Exit code: 0
Jul 31, '23 10:29:05 +0000 | Task hook message | Template successfully ran script /usr/bin/caddy with arguments: [reload --config /local/Caddyfile --adapter caddyfile]. Exit code: 0
Jul 31, '23 10:29:03 +0000 | Task hook message | Template successfully ran script /usr/bin/caddy with arguments: [reload --config /local/Caddyfile --adapter caddyfile]. Exit code: 0
Jul 31, '23 10:28:45 +0000 | Started | Task started by client
Jul 31, '23 10:28:44 +0000 | Task Setup | Building Task Directory
Jul 31, '23 10:28:37 +0000 | Received | Task received by client

I can also confirm that the same allocation stays alive and reloads without stopping the underlying container.
