-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create the build in an async manner to improve response time to webhooks #2870
Conversation
… with error if something goes wrong - the goal is to improve response time to webhook
First, thanks for the pull request 🙂 One concern I have is that always returning a build we give the impression a build was created, which is not always going to be the case. I see this causing confusion and resulting in support burden. For example, if the trigger condition is not satisfied the build is ignored, but a build is still returned in the hook body. For this reason, we would probably be better off always returning a 200 OK with an empty body to avoid any confusion. This would simplify the pull request and would only require changing a few lines of code in the hook handler.
Before making this change we also need to understand whether or not anything useful is being sent back in the hook response on error, because we would be taking that away from users who may be using it to debug today. My gut tells me nothing useful is being returned, but we should verify anyway. Lastly I had hoped one day to return more details in the response, including reasons a build may have been skipped. This will no longer be an option if we process the build async. The timeout errors in GitHub can be unsightly, however, they do not prevent the trigger routing from executing and scheduling a build. For the record I am not opposing this change, and I do think it is probably the right thing to do, but I wanted to document all the pros and cons and things we will need to consider so that others can weigh in. |
Thanks for the quick response. Calling the whole trigger async definitely works for our issue, but I'm not sure it's the best solution. When writing the PR I was thinking that if we wanted the response to be useful we might as well just have the correct build number in the response, that should be enough to debug, at least it's enough to call the drone API build info endpoint, but I missed the build being skipped in which case the response sent is invalid. What we could do, though, is create the build with the status "skipped" and with a message about why it was skipped. As for debugging from the webhooks response, we are doing it but only for issues with our configuration plugin because that's the only way although I think it's easier to debug from drone itself because not every contributor has access to the webhooks response but they do have access to their builds on the drone UI. |
this is not something I would want to do. This would create a lot of noise in the system and would create large gaps in build numbers, as well as increase database size. @tboerger any thoughts? maybe we store the last N webhooks that are sent to the system in a separate table / s3 and we periodically purge? |
I agree. Trying to find another solution 🤔 |
I think the optimal solution probably looks something like this:
not all systems provide access to webhook responses (such as Bitbucket) and it often requires admin access. So hook responses were never a great way to debug. Storing the data in the system and making it accessible via endpoint should make it way easier for users to debug. |
I like the idea. I think we can still make the response from the webhook useful doing something simple. What about that?
{
"hook_id": 1,
"link": "https://drone.com/api/hooks/1"
}
|
yes, we'll have to come up with a more formal rfc (we don't want to rush into this) but I feel pretty good about the ideas being discussed so far. Interested to hear from @ashwilliams1 and @tboerger |
I like the idea to store the webhooks. It's always a pain to find the right webhook within github for debugging. I also like listing the hooks in the ui, we could show attributes where you always got to toggle the hook view within github directly in the list, and we could even filter them. |
I don't know what's the decision-making process but if we go ahead at some point, I'd be happy to work on the implementation :) |
Brad is the project owner, so he is the one that decides the final solution, but he asked us, the other maintainers or contributors, for our opinion to get to a decision. |
This PR has been automatically closed due to inactivity. |
Hi,
I will give the context of this PR:
We have been working on a configuration plugin to handle our monorepo and the solution we came up with does the following steps:
Leveraging the multi-pipeline feature we have been able to come up with a solution which handles each microservice individually, erasing the drawbacks of monorepos in classic CI environments.
A root drone file using this can look like the following:
What we have done works great and we are planning to make it available to the community.
Yet there is one problem, our monorepo is pretty big and even though we created a cache mechanism using a mirror repo as reference, it still takes between 5~8 secs to fetch the repository in the configuration plugin and it leads to timeouts in the Github webhooks as the timeout is set at 10sec.
Following the good practices advised by github here https://developer.github.com/v3/guides/best-practices-for-integrators/, we've decided to come up with this PR which makes the handling of a webhook more asynchronous.
With this PR, the triggerer starts a goroutine to do most of the heavy lifting to create the build, if an error occurs, it creates the build with an error message.
Thanks for the feedbacks