
in jupyter server, can not download huggingface model #4677

Closed

ThumbRocket opened this issue Jul 4, 2024 · 1 comment

Comments

ThumbRocket commented Jul 4, 2024

Hi,
I am using Colab as a Jupyter server, exposed via ngrok. In this setup, loading a pre-trained Hugging Face model hangs indefinitely at the location shown below.

# Code that is causing the problem
import pandas as pd

from transformers import GPT2Tokenizer, GPT2LMHeadModel, logging as transformers_logging

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
# Screenshot of the hang:
# https://github.com/googlecolab/colabtools/assets/66233986/9fe28172-2e37-4285-a538-a29ad4585d31

-------------------API location where the problem occurred-------------------
<ipython-input-2-b8e6d5b2ef2e> in <cell line: 6>()
      4 
      5 
----> 6 tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
      7 model = GPT2LMHeadModel.from_pretrained('gpt2')

/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2027                     # Try to get the tokenizer config to see if there are versioned tokenizer files.
   2028                     fast_tokenizer_file = FULL_TOKENIZER_FILE
-> 2029                     resolved_config_file = cached_file(
   2030                         pretrained_model_name_or_path,
   2031                         TOKENIZER_CONFIG_FILE,

/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    397     try:
    398         # Load from URL or cache if already cached
--> 399         resolved_file = hf_hub_download(
    400             path_or_repo_id,
    401             filename,

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py in _inner_fn(*args, **kwargs)
    112             kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
    113 
--> 114         return fn(*args, **kwargs)
    115 
    116     return _inner_fn  # type: ignore

/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py in hf_hub_download(repo_id, filename, subfolder, repo_type, revision, library_name, library_version, cache_dir, local_dir, user_agent, force_download, proxies, etag_timeout, token, local_files_only, headers, endpoint, legacy_cache_layout, resume_download, force_filename, local_dir_use_symlinks)
   1182         raise ValueError(f"Invalid repo type: {repo_type}. Accepted repo types are: {str(REPO_TYPES)}")
   1183 
-> 1184     headers = build_hf_headers(
   1185         token=token,
   1186         library_name=library_name,

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py in _inner_fn(*args, **kwargs)
    112             kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
    113 
--> 114         return fn(*args, **kwargs)
    115 
    116     return _inner_fn  # type: ignore

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_headers.py in build_hf_headers(token, is_write_action, library_name, library_version, user_agent, headers)
    122     """
    123     # Get auth token to send
--> 124     token_to_send = get_token_to_send(token)
    125     _validate_token_to_send(token_to_send, is_write_action=is_write_action)
    126 

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_headers.py in get_token_to_send(token)
    151 
    152     # Token is not provided: we get it from local cache
--> 153     cached_token = get_token()
    154 
    155     # Case token is explicitly required

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py in get_token()
     43         `str` or `None`: The token, `None` if it doesn't exist.
     44     """
---> 45     return _get_token_from_google_colab() or _get_token_from_environment() or _get_token_from_file()
     46 
     47 

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py in _get_token_from_google_colab()
     74 
     75         try:
---> 76             token = userdata.get("HF_TOKEN")
     77             _GOOGLE_COLAB_SECRET = _clean_token(token)
     78         except userdata.NotebookAccessError:

/usr/local/lib/python3.10/dist-packages/google/colab/userdata.py in get(key)
     47   # thread-safe.
     48   with _userdata_lock:
---> 49     resp = _message.blocking_request(
     50         'GetSecret', request={'key': key}, timeout_sec=None
     51     )

/usr/local/lib/python3.10/dist-packages/google/colab/_message.py in blocking_request(request_type, request, timeout_sec, parent)
    174       request_type, request, parent=parent, expect_reply=True
    175   )
--> 176   return read_reply_from_input(request_id, timeout_sec)

/usr/local/lib/python3.10/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
     94     reply = _read_next_input_message()
     95     if reply == _NOT_READY or not isinstance(reply, dict):
---> 96       time.sleep(0.025)
     97       continue
     98     if (

I think there is a firewall or authentication problem on Colab's side. Please check.
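A hedged reading of the traceback above (an assumption, not a confirmed diagnosis): the call stalls inside `get_token()` → `_get_token_from_google_colab()`, which issues a blocking `GetSecret` request to the Colab frontend via `userdata.get("HF_TOKEN")`. When the kernel is driven through a plain Jupyter client over ngrok, no Colab frontend is attached to answer that request, so `read_reply_from_input` loops forever. If that is the cause, passing `token=False` should sidestep it, since `huggingface_hub` then skips the token lookup entirely (the `gpt2` repo is public, so no auth header is needed):

```python
# Workaround sketch: token=False makes build_hf_headers() return None for
# the token without ever calling get_token(), so the blocking Colab
# secrets round-trip in the traceback above is never entered.
from huggingface_hub.utils import build_hf_headers

headers = build_hf_headers(token=False)
assert "authorization" not in headers  # no token was looked up or sent

# transformers forwards the same argument through from_pretrained:
# tokenizer = GPT2Tokenizer.from_pretrained('gpt2', token=False)
# model = GPT2LMHeadModel.from_pretrained('gpt2', token=False)
```

If this unblocks the download, it would point at the Colab secrets round-trip rather than a firewall.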

@ThumbRocket ThumbRocket added the bug label Jul 4, 2024
@metrizable (Contributor) commented:

@ThumbRocket Thanks for filing the issue and thanks for using Colab. I tried out the code that you shared and could not reproduce the issue you're facing.


Can you provide a reproducible notebook that exhibits the behavior?
