Workflow controller restarts with 'fatal error: concurrent map read and map write' #2638

Closed
onzo-dyk opened this issue Apr 8, 2020 · 2 comments · Fixed by #2851
onzo-dyk commented Apr 8, 2020

What happened:
The workflow controller restarts once or twice every day with fatal error: concurrent map read and map write
I have a few cron workflows active. AFAIK they execute without any errors. I cannot correlate the restart times with workflow executions or with deploys.

Environment:

  • Argo version:
argo: v2.6.3
  BuildDate: 2020-03-16T17:55:34Z
  GitCommit: 2e8ac609cba1ad3d69c765dea19bc58ea4b8a8c3
  GitTreeState: clean
  GitTag: v2.6.3
  GoVersion: go1.13.4
  Compiler: gc
  Platform: linux/amd64
  • Kubernetes version :
clientVersion:
  buildDate: "2019-02-28T13:37:52Z"
  compiler: gc
  gitCommit: c27b913fddd1a6c480c229191a087698aa92f0b1
  gitTreeState: clean
  gitVersion: v1.13.4
  goVersion: go1.11.5
  major: "1"
  minor: "13"
  platform: linux/amd64
serverVersion:
  buildDate: "2019-05-01T04:05:01Z"
  compiler: gc
  gitCommit: 7a578febe155a7366767abce40d8a16795a96371
  gitTreeState: clean
  gitVersion: v1.11.10
  goVersion: go1.10.8
  major: "1"
  minor: "11"
  platform: linux/amd64

Logs

Time                            message
April 8th 2020, 06:24:00.248    goroutine 6 [chan receive]:
April 8th 2020, 06:24:00.248    k8s.io/klog.(*loggingT).flushDaemon(0x257d560)
                                /go/pkg/mod/k8s.io/[email protected]/klog.go:1010 +0x8b
April 8th 2020, 06:24:00.248    goroutine 33 [chan receive, 1380 minutes]:
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    created by k8s.io/client-go/util/workqueue.newDelayingQueue
                                /go/pkg/mod/k8s.io/client-go0.0.0-20191225075139-73fd2ddc9180/util/workqueue/delaying_queue.go:56 +0x1d4
April 8th 2020, 06:24:00.248    created by k8s.io/client-go/util/workqueue.newDelayingQueue
                                /go/pkg/mod/k8s.io/client-go0.0.0-20191225075139-73fd2ddc9180/util/workqueue/delaying_queue.go:56 +0x1d4
April 8th 2020, 06:24:00.248    internal/poll.runtime_pollWait(0x7f6975f31000, 0x72, 0xffffffffffffffff)
                                /usr/local/go/src/runtime/neoll.go:184 +0x55
April 8th 2020, 06:24:00.248    internal/poll.(*pollDesc).waitRead(...)
                                /usr/local/go/src/internal/pl/fd_poll_runtime.go:92
April 8th 2020, 06:24:00.248    internal/poll.(*FD).Read(0xc0003f6f80, 0xc0005c6000, 0x8b2b, 0x8b2b, 0x0, 0x0, 0x0)
                                /usr/local/go/src/internal/pl/fd_unix.go:169 +0x1cf
April 8th 2020, 06:24:00.248    net.(*conn).Read(0xc00044a000, 0xc0005c6000, 0x8b2b, 0x8b2b, 0x0, 0x0, 0x0)
                                /usr/local/go/src/net/net.go84 +0x68
April 8th 2020, 06:24:00.248    crypto/tls.(*Conn).readRecord(...)
                                /usr/local/go/src/crypto/tlsonn.go:577
April 8th 2020, 06:24:00.248    io.ReadFull(...)
                                /usr/local/go/src/io/io.go:3
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    sync.(*Cond).Wait(0xc0003609b8)
                                /usr/local/go/src/sync/cond.:56 +0x9d
April 8th 2020, 06:24:00.248    goroutine 37 [select]:
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    created by k8s.io/client-go/util/workqueue.newQueue
                                /go/pkg/mod/k8s.io/client-go0.0.0-20191225075139-73fd2ddc9180/util/workqueue/queue.go:58 +0x132
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    k8s.io/client-go/util/workqueue.(*delayingType).waitingLoop(0xc0003d72d0)
                                /go/pkg/mod/k8s.io/client-go0.0.0-20191225075139-73fd2ddc9180/util/workqueue/delaying_queue.go:215 +0x344
April 8th 2020, 06:24:00.248    internal/poll.(*pollDesc).wait(0xc0003f6f98, 0x72, 0x8b00, 0x8b2b, 0xffffffffffffffff)
                                /usr/local/go/src/internal/pl/fd_poll_runtime.go:87 +0x45
April 8th 2020, 06:24:00.248    crypto/tls.(*Conn).readFromUntil(0xc000424380, 0x19b0a00, 0xc00044a000, 0x5, 0xc00044a000, 0xc00249c2d0)
                                /usr/local/go/src/crypto/tlsonn.go:802 +0xec
April 8th 2020, 06:24:00.248    io.ReadAtLeast(0x19aebe0, 0xc000444de0, 0xc0004701f8, 0x9, 0x9, 0x9, 0x404d15, 0xc001374420, 0xc000354dd8)
                                /usr/local/go/src/io/io.go:3 +0x87
April 8th 2020, 06:24:00.248    created by golang.org/x/net/http2.(*Transport).newClientConn
                                /go/pkg/mod/golang.org/x/net0.0.0-20191209160850-c0dbc17a3553/http2/transport.go:674 +0x62f
April 8th 2020, 06:24:00.248    k8s.io/client-go/tools/cache.(*controller).processLoop(0xc0003f7080)
                                /go/pkg/mod/k8s.io/client-go0.0.0-20191225075139-73fd2ddc9180/tools/cache/controller.go:150 +0x40
April 8th 2020, 06:24:00.248    k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000093fb0)
                                /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
April 8th 2020, 06:24:00.248    github.com/argoproj/argo/workflow/cron.(*Controller).runCronWorker-fm()
                                /go/src/github.com/argoproj/go/workflow/cron/controller.go:107 +0x2a fp=0xc0018cbe70 sp=0xc0018cbe58 pc=0x13f377a
April 8th 2020, 06:24:00.248    k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0002ec440, 0x3b9aca00, 0x0, 0x1, 0xc00040c300)
                                /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8 fp=0xc0018cbf90 sp=0xc0018cbee0 pc=0x11d16c8
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    goroutine 35 [chan receive]:
April 8th 2020, 06:24:00.248    github.com/argoproj/pkg/stats.StartStatsTicker.func1(0xc000330a50)
                                /go/pkg/mod/github.com/argopj/[email protected]/stats/stats_linux.go:19 +0x37
April 8th 2020, 06:24:00.248    os/signal.signal_recv(0x0)
                                /usr/local/go/src/runtime/siueue.go:147 +0x9c
April 8th 2020, 06:24:00.248    created by os/signal.init.0
                                /usr/local/go/src/os/signal/gnal_unix.go:29 +0x41
April 8th 2020, 06:24:00.248    github.com/argoproj/pkg/stats.RegisterStackDumper.func1()
                                /go/pkg/mod/github.com/argopj/[email protected]/stats/stats_linux.go:31 +0xa0
April 8th 2020, 06:24:00.248    created by k8s.io/client-go/util/workqueue.newQueue
                                /go/pkg/mod/k8s.io/client-go0.0.0-20191225075139-73fd2ddc9180/util/workqueue/queue.go:58 +0x132
April 8th 2020, 06:24:00.248    
April 8th 2020, 06:24:00.248    crypto/tls.(*Conn).readRecordOrCCS(0xc000424380, 0x0, 0x0, 0x437a7e)
                                /usr/local/go/src/crypto/tlsonn.go:609 +0x124
April 8th 2020, 06:24:00.248    bufio.(*Reader).Read(0xc000444de0, 0xc0004701f8, 0x9, 0x9, 0xc00205a2a0, 0xc000354dd8, 0xc000354cf0)
                                /usr/local/go/src/bufio/bufigo:226 +0x26a
April 8th 2020, 06:24:00.248    golang.org/x/net/http2.(*Framer).ReadFrame(0xc0004701c0, 0xc0007c0360, 0x0, 0x0, 0x0)
                                /go/pkg/mod/golang.org/x/net0.0.0-20191209160850-c0dbc17a3553/http2/frame.go:492 +0xa1
April 8th 2020, 06:24:00.248    goroutine 43 [sync.Cond.Wait, 8 minutes]:
April 8th 2020, 06:24:00.248    runtime.goparkunlock(...)
                                /usr/local/go/src/runtime/pr.go:310
April 8th 2020, 06:24:00.248    sync.runtime_notifyListWait(0xc0003609c8, 0xb9)
                                /usr/local/go/src/runtime/se.go:510 +0xf8
April 8th 2020, 06:24:00.248    github.com/argoproj/argo/workflow/cron.(*Controller).processNextCronItem(0xc000360a50, 0x0)
                                /go/src/github.com/argoproj/go/workflow/cron/controller.go:153 +0x47a fp=0xc0018cbe38 sp=0xc0018cbcc8 pc=0x13ee8ea
April 8th 2020, 06:24:00.248    k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0002ec440)
                                /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e fp=0xc0018cbee0 sp=0xc0018cbe70 pc=0x11d212e
April 8th 2020, 06:24:00.248    runtime.goexit()
                                /usr/local/go/src/runtime/asamd64.s:1357 +0x1 fp=0xc0018cbfd0 sp=0xc0018cbfc8 pc=0x45b691
April 8th 2020, 06:24:00.248    github.com/spf13/cobra.(*Command).ExecuteC(0xc00038ea00, 0xc000060750, 0xc00010df50, 0x40576f)
                                /go/pkg/mod/github.com/[email protected]/command.go:852 +0x2ea
April 8th 2020, 06:24:00.248    created by k8s.io/klog.init.0
                                /go/pkg/mod/k8s.io/[email protected]/klog.go:411 +0xd6
April 8th 2020, 06:24:00.244    fatal error: concurrent map read and map write


Message from the maintainers:

If you are impacted by this bug please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

Member

simster7 commented Apr 8, 2020

This seems like a bug internal to the Kubernetes client

@shibataka000
Contributor

The workflow controller restarts with fatal error: concurrent map read and map write in our environment, too.

Our environment is as follows.

  • Argo server version:
image: argoproj/argocli:v2.6.1
  • Workflow controller version:
image: argoproj/workflow-controller:v2.6.1
  • Kubernetes version:
Client Version: v1.17.5
Server Version: v1.15.10-eks-bac369

This is part of the workflow-controller's error log.

fatal error: concurrent map read and map write

goroutine 72 [running]:
runtime.throw(0x17b27e3, 0x21)
	/usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc0001dbc58 sp=0xc0001dbc28 pc=0x42e5d2
runtime.mapaccess2_faststr(0x157fd40, 0xc0003907b0, 0xc0013003a0, 0x20, 0xc00041e140, 0xc0000d31a0)
	/usr/local/go/src/runtime/map_faststr.go:116 +0x48f fp=0xc0001dbcc8 sp=0xc0001dbc58 pc=0x412cff
github.com/argoproj/argo/workflow/cron.(*Controller).processNextCronItem(0xc0004400b0, 0x0)
	/go/src/github.com/argoproj/argo/workflow/cron/controller.go:153 +0x47a fp=0xc0001dbe38 sp=0xc0001dbcc8 pc=0x13ee3ba
github.com/argoproj/argo/workflow/cron.(*Controller).runCronWorker(0xc0004400b0)
	/go/src/github.com/argoproj/argo/workflow/cron/controller.go:108 +0x2b fp=0xc0001dbe58 sp=0xc0001dbe38 pc=0x13edf1b
github.com/argoproj/argo/workflow/cron.(*Controller).runCronWorker-fm()
	/go/src/github.com/argoproj/argo/workflow/cron/controller.go:107 +0x2a fp=0xc0001dbe70 sp=0xc0001dbe58 pc=0x13f324a
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00041e160)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e fp=0xc0001dbee0 sp=0xc0001dbe70 pc=0x11d1bfe
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00041e160, 0x3b9aca00, 0x0, 0x1, 0xc0000aa180)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8 fp=0xc0001dbf90 sp=0xc0001dbee0 pc=0x11d1198
k8s.io/apimachinery/pkg/util/wait.Until(0xc00041e160, 0x3b9aca00, 0xc0000aa180)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d fp=0xc0001dbfc8 sp=0xc0001dbf90 pc=0x11d108d
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc0001dbfd0 sp=0xc0001dbfc8 pc=0x45b691
created by github.com/argoproj/argo/workflow/cron.(*Controller).Run
	/go/src/github.com/argoproj/argo/workflow/cron/controller.go:97 +0x4d9

You can see the full log here.
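
For readers unfamiliar with this error: the trace above fails in runtime.mapaccess2_faststr, a string-keyed map lookup, called from processNextCronItem. The Go runtime aborts the process when it detects an unsynchronized read and write of the same map from different goroutines. The sketch below shows one common way to guard a shared map with sync.RWMutex. It is only an illustration under assumptions: cronRegistry, newCronRegistry, Get, and Set are hypothetical names, and this is neither Argo's actual code nor necessarily the change made in #2851.

```go
package main

import (
	"fmt"
	"sync"
)

// cronRegistry is a hypothetical stand-in for a controller-owned map of
// cron workflows keyed by name. It is not a type from Argo.
type cronRegistry struct {
	mu    sync.RWMutex
	items map[string]string
}

func newCronRegistry() *cronRegistry {
	return &cronRegistry{items: make(map[string]string)}
}

// Get takes the read lock, so several readers may proceed at once.
func (r *cronRegistry) Get(key string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.items[key]
	return v, ok
}

// Set takes the write lock, excluding readers and other writers.
func (r *cronRegistry) Set(key, schedule string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.items[key] = schedule
}

func main() {
	reg := newCronRegistry()
	var wg sync.WaitGroup

	// One writer (id 0) and three readers touch the same map concurrently.
	// Without the mutex, the runtime may abort with
	// "fatal error: concurrent map read and map write".
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for n := 0; n < 1000; n++ {
				key := fmt.Sprintf("cron-%d", n%10)
				if id == 0 {
					reg.Set(key, "* * * * *")
				} else {
					reg.Get(key)
				}
			}
		}(i)
	}
	wg.Wait()

	_, ok := reg.Get("cron-0")
	fmt.Println("cron-0 present:", ok)
}
```

Running a program like this with the race detector enabled (go run -race) is a quick way to confirm that no unsynchronized access remains. sync.Map is an alternative when keys are mostly written once and read many times.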
