explicitly set multiprocess start method to fork for cross-OS consistency

A process can be launched with one of three start methods: "spawn", "fork",
or "forkserver". The default on Unix (other than macOS) is fork, and the
resulting process inherits all resources from the parent process. Conversely,
the default on macOS/Windows is spawn, under which the child process inherits
only a minimal set of resources.

https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
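
As a rough illustration of the start-method selection described above, the sketch below requests a context by name and falls back to the platform default when the requested method is not offered (fork is not available on Windows). The helper names `pick_start_method`, `square`, and `run_squares` are hypothetical and are not part of the scripts changed in this commit.

```python
import multiprocessing as mp


def pick_start_method(preferred="fork"):
    """Return `preferred` if this platform supports it, else the platform default.

    Hypothetical helper for illustration only.
    """
    # get_all_start_methods() lists the supported methods, default first.
    available = mp.get_all_start_methods()
    return preferred if preferred in available else available[0]


def square(x):
    # Module-level function so that it can also be pickled under "spawn".
    return x * x


def run_squares(values, preferred="fork"):
    """Map `square` over `values` in a small pool using the chosen start method."""
    ctx = mp.get_context(pick_start_method(preferred))
    with ctx.Pool(2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(pick_start_method())
    print(run_squares([1, 2, 3]))
```

Calling `mp.get_context(...)` returns a context object scoped to that call site, so the choice does not alter the interpreter-wide default the way `mp.set_start_method` would.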
alistairewj committed Apr 28, 2022
1 parent 3c209d0 commit 77156ce
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion scripts/finish_dedup_wiki40b.py
@@ -110,7 +110,7 @@ def _generate_examples(self, split):
data_dir=args.data_dir)


- p = mp.Pool(96)
+ p = mp.get_context("fork").Pool(mp.cpu_count())
i = -1
for batch in ds:
i += 1
2 changes: 1 addition & 1 deletion scripts/load_dataset.py
@@ -74,7 +74,7 @@ def tok(x):

fout = open(os.path.join(save_dir, dataset_name+"."+split), "wb")

- with mp.Pool(mp.cpu_count()) as p:
+ with mp.get_context("fork").Pool(mp.cpu_count()) as p:
i = 0
sizes = [0]
for b in ds:
