
Default use_cache=False in eval harness integration #774

Merged
merged 4 commits into main from patch-eval-cache on Feb 3, 2023

Conversation

haileyschoelkopf
Contributor

When running the eval harness via GPT-NeoX with more than one GPU, evaluations would fail by default unless use_cache=False was set, with an error like the following:

  File "/fsx/hailey/gpt-neox/eval_tasks/eval_adapter.py", line 394, in run_eval
    results = evaluator.evaluate(
  File "/fsx/hailey/pythia/lm-evaluation-harness/lm_eval/utils.py", line 161, in _wrapper
    return fn(*args, **kwargs)
  File "/fsx/hailey/pythia/lm-evaluation-harness/lm_eval/evaluator.py", line 247, in evaluate
    resps = getattr(lm, reqtype)([req.args for req in reqs])
  File "/fsx/hailey/pythia/lm-evaluation-harness/lm_eval/base.py", line 820, in fn
    rem_res = getattr(self.lm, attr)(remaining_reqs)
  File "/fsx/hailey/pythia/lm-evaluation-harness/lm_eval/base.py", line 185, in loglikelihood
    return self._loglikelihood_tokens(new_reqs)
  File "/fsx/hailey/gpt-neox/eval_tasks/eval_adapter.py", line 248, in _loglikelihood_tokens
    self.cache_hook.add_partial(
  File "/fsx/hailey/pythia/lm-evaluation-harness/lm_eval/base.py", line 780, in add_partial
    self.dbdict[hsh] = res
  File "/fsx/gpt-neox/conda/envs/neox_deeperspeed/lib/python3.9/site-packages/sqlitedict.py", line 253, in __setitem__
    self.conn.execute(ADD_ITEM, (key, self.encode(value)))
  File "/fsx/gpt-neox/conda/envs/neox_deeperspeed/lib/python3.9/site-packages/sqlitedict.py", line 481, in execute
    self.check_raise_error()
  File "/fsx/gpt-neox/conda/envs/neox_deeperspeed/lib/python3.9/site-packages/sqlitedict.py", line 475, in check_raise_error
    reraise(e_type, e_value, e_tb)
  File "/fsx/gpt-neox/conda/envs/neox_deeperspeed/lib/python3.9/site-packages/sqlitedict.py", line 71, in reraise
    raise value
  File "/fsx/gpt-neox/conda/envs/neox_deeperspeed/lib/python3.9/site-packages/sqlitedict.py", line 409, in run
    cursor.execute(req, arg)
sqlite3.OperationalError: database is locked
Killing subprocess 1991856
Killing subprocess 1991857
Killing subprocess 1991858
Killing subprocess 1991859
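The failure mode is easy to reproduce in isolation: sqlitedict is backed by a single SQLite file, and SQLite allows only one writer at a time, so concurrent ranks writing to the same cache file hit SQLITE_BUSY. A minimal sketch (not the eval-harness code itself; two plain sqlite3 connections stand in for two GPU ranks sharing one cache file):

```python
import os
import sqlite3
import tempfile

# Two connections to one database file stand in for two ranks writing
# to the same sqlitedict cache.
path = os.path.join(tempfile.mkdtemp(), "cache.db")

# timeout=0 makes the second writer fail immediately instead of retrying.
rank_a = sqlite3.connect(path, timeout=0)
rank_b = sqlite3.connect(path, timeout=0)

rank_a.execute("CREATE TABLE cache (k TEXT PRIMARY KEY, v TEXT)")
rank_a.commit()

# Rank A starts a write transaction and holds the write lock...
rank_a.execute("INSERT INTO cache VALUES ('req1', 'resp1')")

# ...so rank B's concurrent write raises sqlite3.OperationalError.
caught = None
try:
    rank_b.execute("INSERT INTO cache VALUES ('req2', 'resp2')")
except sqlite3.OperationalError as e:
    caught = str(e)
finally:
    rank_a.rollback()
    rank_a.close()
    rank_b.close()

print(caught)  # database is locked
```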

This PR simply defaults the caching to False to avoid this. I think we want the cache off anyway: the cache's naming doesn't appear to distinguish between different evaluation steps, so evals at a later step could mistakenly read results cached at an earlier one.
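The change itself is small gating logic. A sketch of the idea, with illustrative names: CachingLM below is a stand-in for lm_eval.base.CachingLM, and maybe_wrap is not the actual adapter function in GPT-NeoX.

```python
# Sketch of the gating this PR applies (names are illustrative, not the
# real GPT-NeoX code). With use_cache=False (the new default) the model is
# handed to the evaluator unwrapped, so no sqlitedict file is ever opened.

class CachingLM:
    """Stand-in for lm_eval.base.CachingLM: memoizes requests in a sqlitedict."""
    def __init__(self, lm, cache_db_path):
        self.lm = lm
        self.cache_db_path = cache_db_path

def maybe_wrap(lm, use_cache=False, cache_db_path="lm_cache/neox.db"):
    # The PR flips the default to False: only wrap when explicitly requested.
    return CachingLM(lm, cache_db_path) if use_cache else lm

model = object()
assert maybe_wrap(model) is model                         # cache off by default
assert isinstance(maybe_wrap(model, use_cache=True), CachingLM)
```

With this default, multi-GPU runs never contend for the cache file unless the user opts in.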

@Quentin-Anthony Quentin-Anthony merged commit 26ef16d into main Feb 3, 2023
@Quentin-Anthony Quentin-Anthony deleted the patch-eval-cache branch February 3, 2023 14:44