Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set io_priority in all IO paths #4101

Open
chewbranca opened this issue Jul 13, 2022 · 4 comments
Open

Set io_priority in all IO paths #4101

chewbranca opened this issue Jul 13, 2022 · 4 comments

Comments

@chewbranca
Copy link
Contributor

Summary

There are a number of codepaths performing IO operations that do not set an io_priority flag, so they do not properly get labeled or prioritized by IOQ and default to other io_class [1]. Many moons ago I made a PR [2] that went through and properly tagged anywhere I could find making IO requests without an io_priority set. It looks like the main gaps around indexing and indexer compaction were fixed in [3], however there are still a number of places lacking an io_priority value.

I also think it would be worthwhile to introduce at least one, perhaps two, additional IO classes. I think we should add a system io_priority class for things like auth cache, loading mem3 shards, global changes, couch_per_user, etc. I think it might also be worthwhile to add a dedicated replication io_priority, although it kind of looks like the replicator no longer does local db replications? In [2], the replicator still allowed for doing a direct couch_changes fetch of the local database, so perhaps my original concern around the replicator doing direct changes reads of full databases going unflagged with io_priority is no longer relevant. We should probably still tag the replicator doc updates with either system or replication though.

Desired Behaviour

To minimize (or eliminate) the use of the fallback other io_priority class so that all IO is properly flagged with the appropriate IO class.

Possible Solution

The PR in [2] has a number of simple additions of where to put the io_priority process dictionary value when missing. I tracked those down by modifying the code to throw an exception when absent and then iterated through all the locations I could find. I found many, but the approach was not fully exhaustive so I don't think we should crash by default when there's no io_priority set.

Additional context

[1] https://github.com/apache/couchdb/blob/main/src/ioq/src/ioq.erl#L84-L85
[2] https://github.com/apache/couchdb/pull/1998/files
[3] 13bf0eb

@nickva
Copy link
Contributor

nickva commented Jul 13, 2022

@chewbranca good idea to have better priority classes!

Replicator db access from the replicator could be a system db access or a have its own replicator class. But I can see the user accessing db (user creating a replication doc) staying as an interactive request?

the replicator still allowed for doing a direct couch_changes fetch of the local database

I think that's just a left-over clause we didn't clean up when we removed the local access.

@chewbranca
Copy link
Contributor Author

Replicator db access from the replicator could be a system db access or a have its own replicator class. But I can see the user accessing db (user creating a replication doc) staying as an interactive request?

Yeah I think system for the replicator db access would work fine.

I do like the idea of being able to prioritize replicator traffic at a different priority level than interactive, I think that would be a useful addition allowing for replication traffic to be deprioritized a bit relative to normal interactive client requests. This would be a bigger change than just properly setting io_priority values in places where it's currently absent, as I believe for that to work we would need to do something like extrapolate from the http request user agent that it's Couch-Replicator and then propagate the replication io_priority from the coordinator node to all the rpc nodes.

I think that's just a left-over clause we didn't clean up when we removed the local access.

Makes sense 👍

@chewbranca
Copy link
Contributor Author

I've got two PRs out for this, one in the main CouchDB repo #4106 and a corresponding PR in the IOQ repo: apache/couchdb-ioq#19

@big-r81
Copy link
Contributor

big-r81 commented Dec 9, 2022

Can this be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants