
'recreated' API objects are not (always) active in IDO DB #5205

Closed
grrvs opened this issue Apr 27, 2017 · 6 comments
Labels
area/api REST API area/db-ido Database output bug Something isn't working

Comments


grrvs commented Apr 27, 2017

If I create a bunch of hosts and services via the API, delete these objects, and recreate the same objects again, sometimes one or more objects are lost in Icingaweb2.
As far as I understand, they are not set to is_active=1 in the IDO DB (table icinga_objects).
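
A quick way to verify this on the IDO side (a sketch; the database name icinga and the plain mysql invocation are assumptions, adjust to your ido-mysql credentials):

# list host/service objects the IDO still considers inactive
# (objecttype_id 1 = host, 2 = service in the IDO schema)
mysql icinga -e "SELECT objecttype_id, name1, name2, is_active FROM icinga_objects WHERE objecttype_id IN (1, 2) AND is_active = 0;"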

Expected Behavior

All objects should be active if configured, even those that get created and deleted over and over.

Current Behavior

Some objects are not active, resulting in a 'funny state' - e.g. 22 services shown vs. 74 total.
(screenshot: icingaweb2_living_undead_objects)
Restarting the Icinga 2 process sometimes helps.

Possible Solution

no clue

Steps to Reproduce (for bugs)

  1. create a host via the API, e.g. host01
  2. delete that host via the API
  3. repeat steps 1 and 2 until it happens (remember to use the same name; a minimal loop is sketched below)
  4. ... or clone this repo and follow the readme - I tried to wrap everything up for testing in the single node Vagrant box :)
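
For reference, a minimal create/delete loop against the REST API (a sketch, not taken from the linked repo; the ApiUser credentials root:icinga, port 5665 and the host attributes are assumptions, adjust to your setup):

#!/bin/bash
# Repeatedly create and delete the same host via the Icinga 2 REST API,
# then check is_active in the IDO DB afterwards.
API="https://localhost:5665/v1/objects/hosts/host01"
AUTH="root:icinga"   # assumed ApiUser credentials

for i in $(seq 1 100); do
  # create host01
  curl -k -s -u "$AUTH" -H 'Accept: application/json' -X PUT "$API" \
       -d '{"attrs": {"check_command": "hostalive", "address": "127.0.0.1"}}'
  # delete host01 again (cascade also removes dependent objects)
  curl -k -s -u "$AUTH" -H 'Accept: application/json' -X DELETE "$API?cascade=1"
done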

Context

I first noticed this with 2.5.x as we set up an HA cluster with a single DB.
We tested a 'how long does it take to rebuild the configuration if we've lost the puppet DB and a crazy job deletes all monitoring items' scenario.
Our first guess was 'cluster out of sync', so we rebuilt the cluster and cleared the DB. Since 2.6 the cluster is stable, and now - well, this had to be another issue...

Your Environment

  • Version used: v2.6.3-192-gb62241e
  • Operating System and version: Centos 7 Vagrant box
  • Enabled features: api checker graphite ido-mysql mainlog notification
  • Config validation (icinga2 daemon -C):
[root@icinga2 ~]# icinga2 daemon -C
information/cli: Icinga application loader (version: v2.6.3-192-gb62241e)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
information/ApiListener: My API identity: icinga2
information/ConfigItem: Instantiated 4 ApiUsers.
information/ConfigItem: Instantiated 1 ApiListener.
information/ConfigItem: Instantiated 3 Zones.
information/ConfigItem: Instantiated 1 FileLogger.
information/ConfigItem: Instantiated 1 Endpoint.
information/ConfigItem: Instantiated 2 NotificationCommands.
information/ConfigItem: Instantiated 175 CheckCommands.
information/ConfigItem: Instantiated 2 HostGroups.
information/ConfigItem: Instantiated 1 IcingaApplication.
information/ConfigItem: Instantiated 5 Hosts.
information/ConfigItem: Instantiated 1 User.
information/ConfigItem: Instantiated 1 UserGroup.
information/ConfigItem: Instantiated 3 TimePeriods.
information/ConfigItem: Instantiated 3 ServiceGroups.
information/ConfigItem: Instantiated 80 Services.
information/ConfigItem: Instantiated 1 IdoMysqlConnection.
information/ConfigItem: Instantiated 1 NotificationComponent.
information/ConfigItem: Instantiated 1 GraphiteWriter.
information/ConfigItem: Instantiated 1 CheckerComponent.
information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
information/cli: Finished validating the configuration file(s).

grrvs commented Apr 28, 2017

I don't know if that is related, but I noticed the memory usage rising slowly - just run the script for half a day and you'll see. The dashboard is included in the repo and will be synced to the Vagrant Grafana.
(screenshot: grafana_screenshot)
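
If you want to watch this without Grafana, one way to log the memory usage over time (a sketch; assumes GNU ps/date as on the CentOS 7 box):

# log the combined resident set size (RSS, in KiB) of all icinga2 processes once a minute
while true; do
  rss=$(ps -C icinga2 -o rss= | awk '{sum += $1} END {print sum}')
  echo "$(date -Is) ${rss} KiB" >> /tmp/icinga2_rss.log
  sleep 60
done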

dnsmichi (Contributor) commented

Wow, what a nice report.

From a quick look I think the memory usage is related to #5148, while the object activation signal could be the same as reported in #4040. Both are somewhat hard to debug, but they are on our TODO list. @lippserd

@dnsmichi dnsmichi added area/api REST API bug Something isn't working area/db-ido Database output blocker Blocks a release or needs immediate attention labels Apr 28, 2017

grrvs commented Apr 28, 2017

Thanks - looking at the issue with less sleepy eyes, it may be related to #5152 as well

silenceJI commented
Hello,
Is there a temporary solution to this problem?

Stefar77 (Contributor) commented Jul 17, 2017

@grrvs
What happens if you sleep between removing node X and creating node X?

Deleting and re-creating a host within a really short period of time could indeed trigger weirdness in the current state.
Stressing the API like this will create deadlocks and thread leaks; can you check the number of threads your Icinga 2 process has?

You could try this test again using my API patches in #5419; they make the API a bit more stable.

After the test, wait 30 seconds and check whether the thread counter is back to normal.
Mine is always around ~40; if not, it will slowly go back down to ~40 within a few minutes.
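
One quick way to check the thread count (a sketch; assumes the oldest matching PID is the main icinga2 process):

# NLWP = number of threads of the oldest icinga2 process
ps -o nlwp= -p "$(pgrep -o icinga2)"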

dnsmichi (Contributor) commented

The parent ticket for solving these problems is #6012.
