Link in Readme produces 404 #117

Open
gladwig2 opened this issue Aug 19, 2023 · 15 comments

Comments

@gladwig2

In the readme, there is a link that if followed produces a 404. See below.

THIS REPO IS PROBABLY NOT WHAT YOU ARE LOOKING FOR. A copy of the Pile can be downloaded here.

@SinclairCoder

It also produces 404!

@gladwig2
Author

right, that is the link in the README

@SinclairCoder

So how can we download this dataset now?

@gladwig2
Author

gladwig2 commented Aug 21, 2023

I don't know... that's kind of why I submitted an issue. If you find an alternate site, please post it!

@SinclairCoder

ok

@nezhazheng

+1

@zigzagcai

+1, it returns 404 not found

@curehabit

+1

@stc2001

stc2001 commented Oct 3, 2023

magnet:?xt=urn:btih:0d366035664fdf51cfbe9f733953ba325776e667

@stc2001

stc2001 commented Oct 3, 2023

magnet:?xt=urn:btih:0d366035664fdf51cfbe9f733953ba325776e667

This is the magnet link.
P.S. The dataset has been banned because of copyright issues.
The legal departments of major companies such as Microsoft have over 10,000 ways to wage indefinite wars of attrition against rights organizations, and rights organizations should never expect to get even a penny out of those large companies. The only one that ultimately got hurt was the open-source organization. The big companies won: they knocked down the open-source organization and ended up with a dataset whose copyright infringement only they know about. Some of the existing copyright protections are unfair to open-source organizations.

@anthonyprinaldi

+1

@Jessy0429

magnet:?xt=urn:btih:0d366035664fdf51cfbe9f733953ba325776e667

This magnet link does not work :(

@Zero-Pointer

magnet:?xt=urn:btih:0d366035664fdf51cfbe9f733953ba325776e667

Thanks!!

@dsdanielpark

magnet:?xt=urn:btih:0d366035664fdf51cfbe9f733953ba325776e667

Thank you. I hope it works.

The magnet link starts downloading, but the speed is so slow. lol
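
For anyone getting mixed results with the magnet link, here is a minimal sketch that checks the link is well formed and shows what it contains. The function name `inspect_magnet` is made up for illustration. One possible explanation for the slow or failed downloads (an assumption, not something confirmed in this thread) is that the link carries no `tr` tracker parameters, so a client has to find peers through DHT or peer exchange alone.

```python
import string
from urllib.parse import urlsplit, parse_qs

# The magnet link posted in this thread.
MAGNET = "magnet:?xt=urn:btih:0d366035664fdf51cfbe9f733953ba325776e667"


def inspect_magnet(link):
    """Return the BitTorrent info hash and any trackers found in a magnet link.

    Hypothetical helper for illustration; it only parses the string and
    does not contact any peer or tracker.
    """
    parts = urlsplit(link)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet link")

    params = parse_qs(parts.query)
    prefix = "urn:btih:"
    hashes = [t[len(prefix):] for t in params.get("xt", []) if t.startswith(prefix)]
    if not hashes:
        raise ValueError("no BitTorrent info hash (xt=urn:btih:...) found")

    info_hash = hashes[0].lower()
    # A hex-encoded SHA-1 info hash is 40 hex characters long.
    # (Base32-encoded hashes also exist; this check does not cover them.)
    is_sha1_hex = len(info_hash) == 40 and all(c in string.hexdigits for c in info_hash)

    return {
        "info_hash": info_hash,
        "is_sha1_hex": is_sha1_hex,
        "trackers": params.get("tr", []),  # empty list when no trackers are given
    }


info = inspect_magnet(MAGNET)
print(info)
```

If the tracker list comes back empty, enabling DHT in the torrent client is probably worth checking before giving up on the link.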

@kuailehaha

https://web.archive.org/web/20240000000000*/https://the-eye.eu/public/AI/pile/
You need to sign up on the Wayback Machine website first. Then you can download the pile_preliminary_components.
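
As a rough sketch of how to check for snapshots without clicking through the calendar view, the Wayback Machine has an availability endpoint that returns the closest capture for a URL as JSON. The helper names below (`availability_query`, `closest_snapshot`, `fetch_closest`) are made up for illustration, and whether a given snapshot actually contains the data files, rather than just the directory listing, is not something this checks.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# The original hosting location referenced in the comment above.
PILE_URL = "https://the-eye.eu/public/AI/pile/"


def availability_query(target_url):
    """Build the Wayback Machine availability API query for a URL."""
    return "https://archive.org/wayback/available?" + urlencode({"url": target_url})


def closest_snapshot(payload):
    """Pull the closest snapshot URL out of an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None


def fetch_closest(target_url, timeout=30):
    """Query the API over the network and return the closest snapshot URL, if any."""
    with urlopen(availability_query(target_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))


query = availability_query(PILE_URL)
print(query)
# To actually hit the network: print(fetch_closest(PILE_URL))
```

The network call is left commented out so the snippet runs offline; uncomment it to see whether a capture exists for the directory.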
