
Error copying large subset of files (1M+) from on-prem to Azure File Storage #2690

Open
NorthFace21 opened this issue May 23, 2024 · 3 comments

Comments

@NorthFace21

Which version of the AzCopy was used?

AzcopyVersion 10.24.0

Note: The version is visible when running AzCopy without any argument

Which platform are you using?

OS-Environment windows

What command did you run?

azcopy copy \\pc1\d$\ROFS\Defusion* https://azsrofs.file.core.windows.net --check-length=false --log-level=ERROR --preserve-smb-permissions=true --preserve-permissions=true --preserve-smb-info=true --follow-symlinks=false --overwrite=false --recursive .

What problem was encountered?

Each time I start a copy, once the counter reaches about 262,000 files copied, everything after that fails. (The number is very consistent.)
The error is:
dial tcp privateendpoint:443: connectex: Only one usage of each socket address (protocol/network address/port) is normally permitted.
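One purely numerical observation (mine, not from the report): the failure point is essentially 2^18 = 262,144, which could hint at an internal power-of-two limit rather than a coincidence. A quick check:

```shell
# 2^18, compared with the observed ~262,000-file failure point.
echo $((2 ** 18))
```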

How can we reproduce the problem in the simplest way?

Try to copy over 262,000 files in one single batch. I tried changing the concurrency value from 16 to 2048, with the same numbers every time.

Have you found a mitigation/solution?

I haven't found any solution at all. I am running the copy multiple times with overwrite set to false, and incurring more charges because of the repeated runs. So for 1M files I run the same copy 4 times, cancelling with Ctrl-C when it reaches 262,000, because after that all I get is FAILED FAILED.

Thanks

@NorthFace21
Author

My initial belief is that the connections are somehow kept alive, we hit the limit on the number of ports on the private endpoint, and then requests start failing. Stopping the process takes a long time (perhaps it is severing all those connections). Also of note: if I run 2-3 concurrent copies in 3 different command prompts from the same server to the same share using the same command, EACH one will reach 262,000 and then start failing on all remaining objects.

@philippjenni

philippjenni commented May 27, 2024

I have encountered the same problem when synchronising 2 blobs with AzCopy. The error "dial tcp 52.239.251.68:443: connectex: Only one usage of each socket address (protocol/network address/port) is normally permitted." is triggered when copying or deleting the files, and fills the EventLog on the system.

I was able to solve the problem by setting AZCOPY_CONCURRENCY_VALUE to 1. The errors are then gone, but the synchronisation is no longer as fast, which in my case does not play a major role, as there are many files but few changes.

I use the following command to synchronise: azcopy sync $Source $Target --recursive --delete-destination=true. The blob has around 1.7 million files in it.
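The workaround above can be sketched as a small wrapper ($Source and $Target are placeholders for your own paths; AZCOPY_CONCURRENCY_VALUE is AzCopy's documented knob for the number of parallel connections):

```shell
# Cap AzCopy at a single concurrent connection so it stops exhausting
# ephemeral ports, then run the same sync as above. Slower, but in a
# many-files/few-changes scenario that may be acceptable.
export AZCOPY_CONCURRENCY_VALUE=1
azcopy sync "$Source" "$Target" --recursive --delete-destination=true
```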


@NorthFace21
Author

Thanks for your input. I did experiment with the concurrency value set to 1, but time is somewhat important here: I have millions of small files, and this completely kills the performance of the tool. I didn't even complete that copy, as it was easier to do the cancel/resume approach. (I have over 40M files to sync in total, across a lot of different shares, but the size isn't that huge; we are talking about 10 TB total.)

I was, however, semi-successful with the job resume approach: start the copy, at 250k hit Ctrl-C, wait for it to stop, resume the copy, wait until about 500k, then cancel and resume again. However, it requires me to be present and watching the computer so that it doesn't start to fail; if it does, then the resume becomes useless.

If there were a way to split the copy into batches, say 250K files per batch, and then run the batches, that would work too. For now it's just really time-consuming, as I can't let it run overnight, but it is still faster than any other tool I tried! So I am kind of stuck between a rock and a hard place.
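The batching idea in the last paragraph could be sketched with AzCopy's --list-of-files flag, which takes a text file of paths relative to the source. Everything here (the local source path, destination URL, 250,000-line chunk size) is an assumption for illustration, and the echo makes it a dry run that only prints the per-batch commands:

```shell
# Enumerate the source once, split the listing into ~250k-line chunks
# (below the observed ~262k failure point), and emit one azcopy
# invocation per chunk. Drop the "echo" to actually run the copies.
SRC="${SRC:-.}"                                   # assumed local source directory
DEST="https://azsrofs.file.core.windows.net"      # destination from the issue
cd "$SRC"
find . -type f | sed 's|^\./||' > /tmp/all-files.txt
split -l 250000 /tmp/all-files.txt /tmp/batch-
for b in /tmp/batch-*; do
  echo azcopy copy . "$DEST" --list-of-files "$b" --overwrite=false
done
```

Since AzCopy already tracks completed files per job, --overwrite=false keeps reruns of a batch cheap if one fails partway.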

Thanks,
