SyntaxError in preprocessing_data.py #343

Closed
bpm246 opened this issue May 20, 2021 · 2 comments
Labels
bug Something isn't working

Comments


bpm246 commented May 20, 2021

I'm trying to download and tokenize Enron Emails, but a SyntaxError pops up in tools/preprocess_data.py, specifically at line 105:

File "tools/preprocess_data.py", line 105
def yield_from_files(fnames: list, semaphore):
^
SyntaxError: invalid syntax

My Python version is 3.8.5, so there shouldn't be any compatibility problem.

Any idea why this is happening?
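One way to check whether the interpreter that actually runs the script matches the expected 3.8.5 is to look at what the `python` and `python3` commands resolve to on PATH. The following is a minimal diagnostic sketch, not part of the repository; the helper name `interpreter_report` is hypothetical.

```python
import shutil
import subprocess


def interpreter_report(names=("python", "python3")):
    """Map each command name to (path, version string), or None if not on PATH.

    Hypothetical helper for illustration only.
    """
    report = {}
    for name in names:
        path = shutil.which(name)
        if path is None:
            report[name] = None
            continue
        # Python 2 prints its version to stderr and Python 3 to stdout,
        # so both streams are merged before reading the result.
        proc = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        report[name] = (path, proc.stdout.strip())
    return report


if __name__ == "__main__":
    for name, info in interpreter_report().items():
        print(name, "->", info)
```

If `python` reports a 2.x version while `python3` reports 3.8.5, any tool that shells out to plain `python` will run under Python 2 regardless of which interpreter started the parent script.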

bpm246 added the bug label May 20, 2021
Contributor

sdtblck commented May 21, 2021

I ran into this the other day - Are you running tools/preprocess_data.py or prepare_data.py in the main directory?

If the latter, I think it's because we're just using python to call preprocess_data.py here https://github.com/EleutherAI/gpt-neox/blob/main/tools/corpora.py#L133 and in some setups that can default to Python 2, where yield from didn't exist yet.

I'll try to make this more robust in a future PR - but for now can you let me know if your problem is solved by changing the line linked above to call python3 rather than python?
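A possible way to make such a call independent of what `python` resolves to is to launch the child process with `sys.executable`, the path of the interpreter that is already running. This is only a sketch of that general technique, not the actual code in tools/corpora.py; `build_command` and `child_major_version` are hypothetical names, and the `--flag` argument in the usage line is a placeholder rather than a real option of preprocess_data.py.

```python
import subprocess
import sys


def build_command(script, args):
    """Return an argv list that runs `script` with the current interpreter."""
    # sys.executable is the absolute path of the running Python binary, so
    # the child process does not depend on which `python` is first on PATH.
    return [sys.executable, script, *args]


def child_major_version():
    """Ask a child started via sys.executable for its major version number."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.version_info[0])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # "--flag" is a placeholder argument, not a real option of the script.
    print(build_command("tools/preprocess_data.py", ["--flag"]))
    print(child_major_version())
```

With this approach the child always runs under the same interpreter (and the same environment, such as a conda env) as the parent, whereas a hard-coded `python3` still depends on PATH.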

Author

bpm246 commented May 21, 2021

I tried to run the program with python3, but I kept getting the same error.
I wasn't using Anaconda as you recommend, and that was the problem. After installing Anaconda and the necessary packages, the process works without problems.
