
Readme Example is broken #2193

Closed

drisspg opened this issue Mar 12, 2024 · 5 comments

Comments


drisspg commented Mar 12, 2024

Summary

Steps to reproduce

python3 install.py
pip install .  # add -e for an editable installation

import torchbenchmark.models.densenet121
model, example_inputs = torchbenchmark.models.densenet121.Model(test="eval", device="cuda", batch_size=1).get_module()
model(*example_inputs)
Traceback (most recent call last):
  File "/home/drisspg/meta/benchmark/torchbench_loop.py", line 14, in <module>
    main()
  File "/home/drisspg/meta/benchmark/torchbench_loop.py", line 6, in main
    model, example_inputs =  models.densenet121.Model(test="eval", device="cuda", batch_size=1).get_module()
AttributeError: module 'torchbenchmark.models' has no attribute 'densenet121'

Having torchbench not be a package, though, makes the path shenanigans hard to reason about, so I am not sure why the model is not being picked up.


xuzhao9 commented Mar 13, 2024

Sorry I could not reproduce this error:

(base) ➜  benchmark git:(main) python -c "import torchbenchmark.models.densenet121; print(torchbenchmark.models.densenet121.Model)"
/home/xz/miniconda3/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
<class 'torchbenchmark.models.densenet121.Model'>

This will also work:

python -c 'from torchbenchmark.models import densenet121; print(densenet121.Model)'
/home/xz/miniconda3/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
<class 'torchbenchmark.models.densenet121.Model'>

Can you please try with import torchbenchmark.models.densenet121, and then use
torchbenchmark.models.densenet121.Model() instead of models.densenet121.Model()?

--
I think I can reproduce when importing with torchbenchmark or torchbenchmark.models:

python -c 'import torchbenchmark; print(torchbenchmark.models.densenet121.Model)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'torchbenchmark.models' has no attribute 'densenet121'

I can look into supporting direct model instantiation after importing only torchbenchmark.models or the torchbenchmark module.


drisspg commented Mar 13, 2024

Yeah, I was able to work around this by making sure I was running the script from within the torchbench top-level directory, but I do think this is very counter-intuitive.

Here is the script I got working, although many models threw failures with the --precision=bf16 arg:
https://gist.github.com/drisspg/402855d460dbf130013b9230b35699c2#file-torchbench_loop-py

Feel free to close this issue if you think it's not reproducible.


xuzhao9 commented Mar 13, 2024

I will look into how to instantiate models by directly importing from torchbenchmark.models.
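One common way to make `torchbenchmark.models.densenet121` reachable after a bare `import torchbenchmark.models` is a PEP 562 module-level `__getattr__` that imports submodules on demand. The sketch below is hypothetical, not the fix that actually landed; `lazy_getattr` is a name invented here for illustration:

```python
import importlib


def lazy_getattr(package_name: str, name: str):
    """Import <package_name>.<name> on first attribute access.

    Mirrors what a PEP 562 module-level __getattr__ in the package's
    __init__.py would do (hypothetical sketch, not torchbench's code).
    """
    try:
        return importlib.import_module(f"{package_name}.{name}")
    except ImportError as exc:
        # Preserve normal attribute-error semantics for missing submodules.
        raise AttributeError(
            f"module {package_name!r} has no attribute {name!r}"
        ) from exc


# Inside torchbenchmark/models/__init__.py this would be wired up as:
#     def __getattr__(name):
#         return lazy_getattr(__name__, name)
```

With that hook in place, `torchbenchmark.models.densenet121` would trigger the import lazily instead of raising `AttributeError`.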

Can we have a list of models that don't support bf16? We fixed BERT_pytorch in #2185, but I hope we can have a complete list for the record.


drisspg commented Mar 17, 2024

Using this script:

from transformer_nuggets.utils.shape_trace import ShapeLog
import torch
from pathlib import Path
from tqdm import tqdm
import logging
import json

logging.basicConfig(level=logging.INFO)

def main():
    model_names = []
    success_count = 0
    failure_count = 0
    model_failures = {}
    # Collect every model directory under torchbenchmark/models/.
    for file in Path("torchbenchmark/models/").iterdir():
        if file.is_dir():
            model_names.append(file.name)
    for model_name in tqdm(model_names, desc="Logging models", unit="model"):
        try:
            module = __import__(f"torchbenchmark.models.{model_name}", fromlist=[model_name])
            model, example_inputs = module.Model(test="train", device="cuda", extra_args=["--precision=bf16"]).get_module()
            model(*example_inputs)
            success_count += 1
        except Exception as e:
            tqdm.write(f"Failed to log {model_name}: {e}")
            failure_count += 1
            model_failures[model_name] = str(e)

    tqdm.write(f"Successfully logged {success_count} models")
    tqdm.write(f"Failed to log {failure_count} models")
    with open("model_failures_bf16.txt", "w") as f:
        json.dump(model_failures, f)


if __name__ == "__main__":
    main()

returns the following model failures:

https://www.internalfb.com/intern/paste/P1197341526/


xuzhao9 commented Mar 19, 2024

Thanks for sharing the result. I created a new issue, #2203, to track bf16 precision support across the models.

The package import problem has been fixed, so I am closing this issue now.
