Test coverage for optimum_lm.py #1872

Open — wants to merge 1 commit into main

Conversation

zafstojano (Contributor) commented May 22, 2024

Created tests for optimum_lm.py by adapting those from test_huggingface.py, using the Pythia-70m model.

Addresses: #1613

In short, I needed to update the expected logits / generated outputs. As I understand it, even though the underlying models are the same, the outputs differ slightly because of the optimizations performed by the optimum library.
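The comparison pattern the adapted tests rely on can be sketched as below. This is a minimal, self-contained illustration, not the actual fixtures: the function name, tolerance, and the placeholder expected value are assumptions, and the real tests in tests/models/test_optimum_lm.py load Pythia-70m and record their own expected outputs.

```python
import math

# Illustrative placeholder only: the real expected values are recorded from an
# actual optimum (ONNX) run of Pythia-70m in the test fixtures.
EXPECTED_LOGLIKELIHOOD = -3.14159


def check_loglikelihood(observed: float, expected: float, rel_tol: float = 1e-3) -> bool:
    """Compare an observed log-likelihood against a recorded expectation.

    A relative tolerance is used because optimum's graph optimizations can
    shift logits slightly relative to the plain transformers implementation,
    so exact equality with the test_huggingface.py values would fail.
    """
    return math.isclose(observed, expected, rel_tol=rel_tol)
```

This is why the expected values had to be re-recorded rather than copied: the optimum-produced numbers are close to, but not bitwise identical with, the transformers ones.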

Tests are passing:

pytest tests/models/test_optimum_lm.py

============================= test session starts =============================
platform linux -- Python 3.12.1, pytest-8.2.1, pluggy-1.5.0
rootdir: /home/z/projects/lm-evaluation-harness
configfile: pyproject.toml
plugins: anyio-4.3.0, cov-5.0.0, xdist-3.6.1
collected 7 items                                                             

tests/models/test_optimum_lm.py .......                                 [100%]

============================== warnings summary ===============================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: Type google._upb._message.MessageMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: Type google._upb._message.ScalarMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132
../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132
../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
    warnings.warn(

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/utils/import_utils.py:521
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/utils/import_utils.py:521: FutureWarning: `is_torch_tpu_available` is deprecated and will be removed in 4.41.0. Please use the `is_torch_xla_available` instead.
    warnings.warn(

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/modeling_utils.py:4371
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/modeling_utils.py:4371: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
    warnings.warn(

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py:867
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py:867: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    assert batch_size > 0, "batch_size has to be defined and > 0"

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py:555
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py:555: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if seq_len > self.max_seq_len_cached:

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/optimum/bettertransformer/models/attention.py:52
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/optimum/bettertransformer/models/attention.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if batch_size == 1 and attention_mask is not None and attention_mask[0, 0, -1, -1] < -1:

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/optimum/bettertransformer/models/attention.py:56
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/optimum/bettertransformer/models/attention.py:56: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if batch_size == 1 or self.training:

../../.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/optimum/bettertransformer/models/attention.py:70
  /home/z/.pyenv/versions/3.12.1/envs/lm-eval/lib/python3.12/site-packages/optimum/bettertransformer/models/attention.py:70: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if query_length > 1:

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 7 passed, 12 warnings in 35.77s =======================
