Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add GitHub component #177

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

YCK1130
Copy link
Collaborator

@YCK1130 YCK1130 commented Jun 25, 2024

Because

  • We need a GitHub component to complete some development automation tasks

This commit

Related Issue

instill-ai/instill-core#1025

Todo

  • TASK_GET_ALL_PULL_REQUESTS: function to get all prs given owner name and repository name.
  • TASK_GET_PULL_REQUEST: function to get a specific pr given owner name, repository name, and pr number. (including file changes)
  • TASK_GET_REVIEW_COMMENT: get review comment inside a pull request
  • TASK_CREATE_REVIEW_COMMENT: create review comment inside a pull request
  • TASK_GET_COMMIT: get commit messages and file changes
  • TASK_CREATE_ISSUE: post issue
  • TASK_GET_ALL_ISSUES: get all issues in a repo
  • TASK_GET_ISSUE: get an issue
  • TASK_CREATE_WEBHOOK: register webhook, https://docs.github.com/en/webhooks/webhook-events-and-payloads

Copy link

@GeorgeWilliamStrong GeorgeWilliamStrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job with the documentation, just left a few minor comments for you to address.

Copy link
Member

@chuang8511 chuang8511 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finished my first review.

It is a big PR.
I did not have too much time to check the GitHub API document in details one by one, which means I may miss some details.
So, please help me do 2 things

  1. Please let me know if there is something concerning you that you hope others can re-confirm. This is including
  • The parts you want to make your code cleaner.
  • The parts you hope others to check the doc as well. And, please attach the doc url.
  • The parts you are afraid that will increase the code maintenance in the future
  • ....
  1. Please build several pipelines to make sure your Tasks solid & robust locally.
  • To avoid the duplicate work, you can save the recipe and to upload d0/staging in the future. (I am not so sure if we support this function now, and I will also check this part)

application/github/v0/config/definition.json Outdated Show resolved Hide resolved
"TASK_GET_REVIEW_COMMENTS",
"TASK_CREATE_REVIEW_COMMENT",
"TASK_GET_ALL_ISSUES",
"TASK_GET_ISSUE",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question

    "TASK_GET_ALL_PULL_REQUESTS",
    "TASK_GET_PULL_REQUEST",
    "TASK_GET_ALL_ISSUES",
    "TASK_GET_ISSUE",

I am curious about the purpose of the design here.
If we remain multiple one, will it be cleaner in terms of UX / codebase?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I make tasks to get a specific pr/issue to simplify the response. If the client can only use multiple ones, they may need to use another component to filter out the response they want and make the pipeline more complex.

Moreover, some detailed information may not be provided in the multiple ones' responses. The purpose of build "get all" here is to provide a tool for users to know what pr/issue the repo has if they don't know the pr/issue number.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can combine the two multiple tasks into a "TASK_GET_REPO_INFO" and return all the info users might want to know?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can combine the two multiple tasks into a "TASK_GET_REPO_INFO" and return all the info users might want to know?

I think your original way will be fine!
Aligning the vendors could bring as benefit in terms of workflow use case in my opinion.

It may be too various to know what users want to know in the cases of workflow because every company has different flows even when there are best practices.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @YCK1130 @chuang8511
Let's use LIST instead of GET_ALL, as it's a more typical name when listing the data.

  • "TASK_GET_ALL_PULL_REQUESTS" -> TASK_LIST_PULL_REQUESTS
  • "TASK_GET_ALL_ISSUES" -> TASK_LIST_ISSUES

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

One more thing, there are page, page_number options to determine how many items would be returned, should I include these in the field?

application/github/v0/config/tasks.json Outdated Show resolved Hide resolved
application/github/v0/config/tasks.json Show resolved Hide resolved
application/github/v0/config/tasks.json Outdated Show resolved Hide resolved
application/github/v0/config/tasks.json Show resolved Hide resolved
application/github/v0/commits.go Show resolved Hide resolved
application/github/v0/commits.go Outdated Show resolved Hide resolved
application/github/v0/pull_request.go Outdated Show resolved Hide resolved
application/github/v0/pull_request.go Outdated Show resolved Hide resolved
@YCK1130
Copy link
Collaborator Author

YCK1130 commented Jul 4, 2024

Concerns about this component

  1. The rate limit of GitHub API calls for unauthenticated requests is 60 requests per hour, which is quite small. I left the token field optional because some tasks are available without authentication.
  • Personal access token rate limit: 5000/hr
  • GitHub Enterprise Cloud organization: 15000/hr
  1. The "create review comment" task may need more checks. The docs of GitHub are quite unclear and there seems to be some mismatch between the document and the actual API call.
  2. I am using a Mock server to simulate the API calls just like in other components. I don't know whether it will be difficult to maintain in the future.

Copy link
Member

@chuang8511 chuang8511 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

second draft review.
I will check in details next week.

"TASK_GET_REVIEW_COMMENTS",
"TASK_CREATE_REVIEW_COMMENT",
"TASK_GET_ALL_ISSUES",
"TASK_GET_ISSUE",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can combine the two multiple tasks into a "TASK_GET_REPO_INFO" and return all the info users might want to know?

I think your original way will be fine!
Aligning the vendors could bring as benefit in terms of workflow use case in my opinion.

It may be too various to know what users want to know in the cases of workflow because every company has different flows even when there are best practices.

application/github/v0/config/tasks.json Outdated Show resolved Hide resolved
application/github/v0/main.go Outdated Show resolved Hide resolved
"TASK_GET_REVIEW_COMMENTS",
"TASK_CREATE_REVIEW_COMMENT",
"TASK_GET_ALL_ISSUES",
"TASK_GET_ISSUE",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @YCK1130 @chuang8511
Let's use LIST instead of GET_ALL, as it's a more typical name when listing the data.

  • "TASK_GET_ALL_PULL_REQUESTS" -> TASK_LIST_PULL_REQUESTS
  • "TASK_GET_ALL_ISSUES" -> TASK_LIST_ISSUES

return &base.ExecutionWrapper{Execution: e}, nil
}

func (e *execution) fillInDefaultValues(input *structpb.Struct) (*structpb.Struct, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YCK1130 @chuang8511
How about let's move this function to the base package and make it available for all components.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good idea. However, there are some concerns described in the Slack.

application/github/v0/client.go Show resolved Hide resolved
application/github/v0/issues.go Outdated Show resolved Hide resolved
application/github/v0/issues.go Outdated Show resolved Hide resolved
application/github/v0/review_coment.go Outdated Show resolved Hide resolved
"github.com/instill-ai/x/errmsg"
)

func middleWare(req string) int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for the mock server to throw errors across various tasks.

type MockIssuesService struct{}

func (m *MockIssuesService) ListByRepo(ctx context.Context, owner, repo string, opt *github.IssueListByRepoOptions) ([]*github.Issue, *github.Response, error) {
switch middleWare(owner) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can the owner string be converted to an HTTP code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 🚛 In progress
5 participants