[INS-2214] [Feature] Web Crawler Operator #616

Open
1 task done
praharshjain opened this issue Sep 29, 2023 · 5 comments

Comments

@praharshjain

praharshjain commented Sep 29, 2023

Is There an Existing Issue for This?

  • I have searched the existing issues

Project

Instill VDP

Is your Proposal Related to a Problem?

No, it is a new feature request.

Describe Your Proposed Solution

We can implement a "Web Crawler" operator that takes an initial URL and a depth (int) as input, recursively extracts links from the pages it reaches up to the given depth, and finally returns a list of strings (the extracted URLs).
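To make the proposal concrete, here is a minimal sketch of the crawl logic in Python, using only the standard library. It is not the operator's actual implementation: the names `crawl`, `fetch_html`, and `_LinkParser` are hypothetical, the traversal is an iterative breadth-first walk rather than literal recursion, and it assumes that a depth of 1 means "only the links found on the initial page". It also ignores concerns a real operator would probably need to handle, such as robots.txt, rate limiting, and domain restrictions.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class _LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url: str) -> str:
    """Default fetcher (hypothetical helper): download a page as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url: str, depth: int,
          fetch: Callable[[str], str] = fetch_html) -> List[str]:
    """Return the URLs discovered within `depth` link hops of `start_url`."""
    seen = {start_url}
    found: List[str] = []
    queue = deque([(start_url, 0)])
    while queue:
        url, level = queue.popleft()
        if level >= depth:
            continue  # pages at the depth limit are not expanded
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = _LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _ = urldefrag(urljoin(url, href))
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                found.append(link)
                queue.append((link, level + 1))
    return found
```

Calling `crawl("https://example.com/", 2)` would return the links on that page plus the links on each of those pages, in discovery order and without duplicates. Passing a custom `fetch` function makes the logic testable without network access.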

Highlight the Benefits

Such an operator will be useful for crawling and gathering online data. For example, the links captured by it can then be fed to the text extraction operator to build a knowledge base from linked documents.

Anything Else?

No response

INS-2214

@praharshjain praharshjain changed the title [Feature] Web Crawler Operator [INS-2214] [Feature] Web Crawler Operator Sep 29, 2023
@github-actions

This issue is a great way to kick-start your journey with our project, or to make a positive impact on open-source development. Jump in!

✨ Thank you for your contribution! ✨

@AnkitaMalik22

Can you please assign this to me?

@itssiddhantjain

Hello @praharshjain, please assign this issue to me, as I have already worked on this kind of problem in the past and have a lot of experience with it.

@lazyMonk1010

Hey @praharshjain, I want to work on this issue. It would be my first work in AI, so I would really like to take it on. Thank you! Please assign it to me.

@harshsoni7

harshsoni7 commented Oct 4, 2023

Can you please assign this to me?

Hi @AnkitaMalik22! Absolutely, we’re thrilled about your interest in our project! 🚀 Here’s the Contributing Guideline for Instill VDP to get you started on your journey! Please refer to the Contributing Guidelines for components as well. Don’t forget to link your pull request to the corresponding issue, and after your PR gets merged, please complete this form to claim your well-deserved points! If you ever have any questions or need a hand along the way, don’t hesitate to drop a message in this thread or hop into our Discord. Happy contributing! 😊🌟

@pinglin pinglin transferred this issue from instill-ai/community May 19, 2024