fix: fix write s3 file 80GB limit issue #376

Merged Jul 9, 2024 (6 commits)


@bbtfr (Member) commented Jun 22, 2024

https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html

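AWS S3's multipart upload accepts at most 10,000 parts per upload, and the major cloud vendors in China follow the same convention. With 10,000 parts at the default 8MB block size, megfile can in the worst case only write files of up to about 80GB.

This PR adds a new DEFAULT_MIN_BLOCK_SIZE config option that separately controls the minimum part size used when uploading files. Since I don't want to bump the major version, the existing parameter names are kept as far as possible.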
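The arithmetic behind the limit can be sketched as follows. This is only an illustration, not megfile code: the helper names `max_file_size` and `min_part_size_for` are hypothetical, and "8MB" is assumed to mean 8 MiB (8 × 1024 × 1024 bytes).

```python
MAX_PARTS = 10_000  # S3 multipart upload part-count limit
MiB = 1024 * 1024


def max_file_size(part_size: int, max_parts: int = MAX_PARTS) -> int:
    """Largest file (in bytes) writable when every part has a fixed size."""
    return part_size * max_parts


def min_part_size_for(file_size: int, max_parts: int = MAX_PARTS) -> int:
    """Smallest fixed part size (in bytes) that fits file_size in max_parts parts."""
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-file_size // max_parts)


if __name__ == "__main__":
    limit = max_file_size(8 * MiB)
    print(f"8 MiB parts: {limit / 1024**3:.3f} GiB maximum")
    needed = min_part_size_for(1024**4)  # a 1 TiB file
    print(f"1 TiB file: parts of at least {needed / MiB:.1f} MiB")
```

Under these assumptions, raising the minimum part size is the only way to write larger files, since the part count is capped by the service.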


codecov bot commented Jun 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.22%. Comparing base (3771f4d) to head (f736622).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #376   +/-   ##
=======================================
  Coverage   97.22%   97.22%           
=======================================
  Files          44       44           
  Lines        6056     6057    +1     
=======================================
+ Hits         5888     5889    +1     
  Misses        168      168           


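@LoveEatCandy (Collaborator)

I think DEFAULT_WRITE_BLOCK_SIZE would be a better name, defaulting to DEFAULT_BLOCK_SIZE. Without a large-file requirement, smaller blocks are still preferable: they make better use of multithreading, and retries are cheap. I'm also planning to add DEFAULT_READ_BLOCK_SIZE so that reads and writes can each be customized. I'll describe all of this properly in the docs.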

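@bbtfr (Member, Author) commented Jun 24, 2024

> I think DEFAULT_WRITE_BLOCK_SIZE would be a better name, defaulting to DEFAULT_BLOCK_SIZE. Without a large-file requirement, smaller blocks are still preferable: they make better use of multithreading, and retries are cheap. I'm also planning to add DEFAULT_READ_BLOCK_SIZE so that reads and writes can each be customized. I'll describe all of this properly in the docs.

OK, I'll change it. I chose the current name mainly because I saw that MAX_BLOCK_SIZE is also write-related.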

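@LoveEatCandy (Collaborator)

I've merged your PR. Let's use your environment variable, but I changed the default back to what it was before, 8MB; anyone who needs more can set it larger. I plan to wait until Python 3.8 reaches end of life in October, then bump the major version and rename the environment variables consistently.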

@LoveEatCandy LoveEatCandy merged commit 18c4b2f into main Jul 9, 2024
11 checks passed
@LoveEatCandy LoveEatCandy deleted the liyang/fix-s3-write branch July 9, 2024 08:40
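@MoyanZitto (Contributor)

@LoveEatCandy Following up on this issue: looking at the implementation, when s3_buffered_open writes, block_size appears to be overridden by min_block_size. In other words, the block_size passed in for a write has no effect. That looks like a bug?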
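To make the reported behavior concrete, here is a hypothetical sketch. It is not megfile's code and not the fix that was eventually shipped; both function names are invented for illustration. The first function models the override described above, and the second shows one possible alternative in which min_block_size acts only as a lower bound.

```python
MiB = 1024 * 1024


def write_block_size_as_reported(block_size: int, min_block_size: int) -> int:
    """Models the reported behavior: the caller's block_size is ignored."""
    return min_block_size


def write_block_size_with_floor(block_size: int, min_block_size: int) -> int:
    """One possible alternative: min_block_size is a floor, not an override."""
    return max(block_size, min_block_size)


if __name__ == "__main__":
    requested, minimum = 64 * MiB, 8 * MiB
    print(write_block_size_as_reported(requested, minimum) // MiB)
    print(write_block_size_with_floor(requested, minimum) // MiB)
```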

@LoveEatCandy (Collaborator)

@MoyanZitto Yes, that problem does exist. I'll fix it next week.

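@LoveEatCandy (Collaborator)

> @LoveEatCandy Following up on this issue: looking at the implementation, when s3_buffered_open writes, block_size appears to be overridden by min_block_size. In other words, the block_size passed in for a write has no effect. That looks like a bug?

@MoyanZitto This was fixed in 3.1.0.post2; the corresponding PR is #392.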
