Feat(dspy): from_pandas support #1176

Merged
krypticmouse merged 7 commits into stanfordnlp:main on Jun 27, 2024

Conversation

Anindyadeep
Contributor

@Anindyadeep Anindyadeep commented Jun 19, 2024

This PR adds support for loading a dspy dataset directly from a pandas DataFrame. I personally found this useful when fetching data from some source, doing some cleanup, and getting a dspy dataset without first saving it as a CSV.

Fixes issue: #1177
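
For readers who want a concrete picture of the idea, here is a minimal sketch of turning DataFrame rows into dataset examples without a CSV round trip. It is not the code from this PR: `Example` is a hypothetical stand-in for `dspy.Example`, `examples_from_records` is an illustrative helper, and the `fields` and `input_keys` parameter names are assumptions. The sketch works on plain row dictionaries, which with pandas could be obtained via `df.to_dict(orient="records")`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional, Tuple


@dataclass
class Example:
    """Hypothetical stand-in for dspy.Example: one record plus its input keys."""

    data: Dict[str, Any]
    input_keys: Tuple[str, ...] = ()

    def inputs(self) -> Dict[str, Any]:
        # Fields that would be fed to the program.
        return {k: v for k, v in self.data.items() if k in self.input_keys}

    def labels(self) -> Dict[str, Any]:
        # Remaining fields, treated as expected outputs.
        return {k: v for k, v in self.data.items() if k not in self.input_keys}


def examples_from_records(
    records: Iterable[Dict[str, Any]],
    fields: Optional[List[str]] = None,
    input_keys: Tuple[str, ...] = (),
) -> List[Example]:
    """Build one Example per row, keeping only `fields` (all columns if None)."""
    examples = []
    for row in records:
        keys = list(row) if fields is None else fields
        examples.append(Example({k: row[k] for k in keys}, tuple(input_keys)))
    return examples


# Assumption: with pandas these rows would come from
# df.to_dict(orient="records") after any in-memory cleanup.
records = [
    {"question": "What is 2 + 2?", "answer": "4", "source": "quiz"},
    {"question": "Capital of France?", "answer": "Paris", "source": "quiz"},
]

dataset = examples_from_records(
    records, fields=["question", "answer"], input_keys=("question",)
)
```

The point of the sketch is only that cleaned-up rows can be converted in memory; the merged `from_pandas` method may differ in its signature and return type.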

@Anindyadeep
Contributor Author

PS: To satisfy the ruff checks, I made some additional changes to the type annotations. Let me know if I should remove them.

@Josephrp

very nice update !

@krypticmouse
Collaborator

I see one else block was removed in the from_huggingface method. Is it a breaking change? It might be for a few datasets; did you check?

@Anindyadeep
Contributor Author

> I see one else block removed in from_huggingface method is it a breaking change? It might be for few datasets, did you check?

Okay, so I reverted to the original implementation and added the from_pandas implementation on top of it. However, I am not sure how to fix the lint problem, because I see these errors even when I am in the container (built from the repo).

Let me know; I will do the same in my other PRs too.

@krypticmouse
Collaborator

Thanks for the contribution!! Did you try running ruff check . --fix?

@okhat
Collaborator

okhat commented Jun 27, 2024

Let's merge this if @krypticmouse approves

@Anindyadeep
Contributor Author

> Thanks for the contribution!! Did you try running ruff check . --fix?

Hey, thanks for the quick pointer. I did run this command earlier, and it gives me this output:

.....
testing/tasks/tweet_metric.py:66:5: ANN201 Missing return type annotation for public function `metric`
testing/tasks/tweet_metric.py:66:24: ARG001 Unused function argument: `trace`
testing/tasks/tweet_metric.py:67:5: N806 Variable `gpt3T` in function should be lowercase
testing/tasks/tweet_metric.py:67:12: N806 Variable `gpt4T` in function should be lowercase
testing/tasks/tweet_metric.py:73:121: E501 Line too long (122 > 120)
testing/tasks/tweet_metric.py:82:121: E501 Line too long (122 > 120)
testing/tasks/tweet_metric.py:94:9: ANN201 Missing return type annotation for public function `forward`
testing/tasks/tweet_metric.py:96:121: E501 Line too long (126 > 120)
testing/tasks/tweet_metric.py:104:121: E501 Line too long (126 > 120)
testing/tasks/tweet_metric.py:121:9: N806 Variable `gpt3T` in function should be lowercase
testing/tasks/tweet_metric.py:121:16: N806 Variable `gpt4T` in function should be lowercase
testing/tasks/tweet_metric.py:131:121: E501 Line too long (159 > 120)
testing/tasks/tweet_metric.py:140:121: E501 Line too long (155 > 120)
testing/tasks/tweet_metric.py:146:9: ANN201 Missing return type annotation for public function `get_program`
testing/tasks/tweet_metric.py:149:9: ANN201 Missing return type annotation for public function `get_metric`
Found 2281 errors.
No fixes available (692 hidden fixes can be enabled with the `--unsafe-fixes` option).

To get around this for now, I applied a small hack: I reverted the file to its contents on the main branch and added only the from_pandas function. However, I am guessing I might need to set up the environment again to avoid logs like the above when running ruff with --fix?

@krypticmouse
Collaborator

Yea try syncing the repo with the current version and lemme know if you still see the errors

Collaborator

@krypticmouse krypticmouse left a comment

Lgtm 🚢

@krypticmouse
Collaborator

Thank you for the contribution. Merging it!

@krypticmouse krypticmouse merged commit deff8ec into stanfordnlp:main Jun 27, 2024
5 checks passed
4 participants