
KeyLLM seems to use OpenAI parameters that are deprecated #187

Open · lfoppiano opened this issue Nov 20, 2023 · 13 comments

@lfoppiano

First of all, this tool is amazing :-)

I'm trying to use KeyLLM with the OpenAI API, but when I import the OpenAI module from keybert, I noticed that the defaults look quite old, something like "gpt-3.5-instruct".

The code is something like this:

from keybert.llm import OpenAI
from keybert import KeyLLM

lc_chatgpt = OpenAI(model="gpt-3.5-turbo")
kw_model = KeyLLM(llm=lc_chatgpt)

[...]

keywords_abstracts = kw_model.extract_keywords(abstracts, embeddings=embeddings_abstracts, threshold=0.9)

When I try it following your instructions, I get a deprecation error:

openai.lib._old_api.APIRemovedInV1: 

You tried to access openai.Completion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

Here are the library versions:

openai                             1.3.3
keybert                            0.8.3

Thank you in advance

@MaartenGr
Owner

Ah, that is correct! It seems that openai has updated their package with some breaking changes. If you pin openai to 0.28, it might just work. I'll make sure to update the backend so that it works with their newest release. That will likely introduce a breaking change since I want to support only openai>=1.
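
(For anyone hitting this before the backend update, here is a minimal, untested sketch of the pinned workaround. It assumes keybert 0.8.3, whose OpenAI wrapper takes no client argument and calls the pre-1.0 module-level openai API; the "sk-..." key is a placeholder.)

# Temporary workaround: pin openai to the last pre-1.0 release
#   pip install "openai==0.28" "keybert==0.8.3"
import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

openai.api_key = "sk-..."                        # pre-1.0 style, module-level API key
llm = OpenAI(model="gpt-3.5-turbo", chat=True)   # 0.8.3 signature: no client argument
kw_model = KeyLLM(llm)

keywords = kw_model.extract_keywords(["A document about keyword extraction."])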

MaartenGr added a commit that referenced this issue Nov 29, 2023
@MaartenGr
Owner

@lfoppiano I just pushed a fix to #189. If you have the time, could you check whether it works for you?

@adegboyegaFAU

Hi,

I can't seem to get it to work.

I've installed keybert and openai as follows:

pip install keybert
pip install openai

The versions are:

keybert                   0.8.3
openai                    1.3.7

I've subsequently run the following:

import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

client = openai.OpenAI(api_key=OpenAI.api_key)
llm = OpenAI(client)
kw_model = KeyLLM(llm)

[...]

keywords = kw_model.extract_keywords(docs, check_vocab=True)

However, I end up with the following error:

---------------------------------------------------------------------------
APIRemovedInV1                            Traceback (most recent call last)
Cell In[105], line 2
      1 # Extract keywords
----> 2 keywords = kw_model.extract_keywords(docs, check_vocab=True)

File ~\.conda\envs\PhDProjectsWork\Lib\site-packages\keybert\_llm.py:126, in KeyLLM.extract_keywords(self, docs, check_vocab, candidate_keywords, threshold, embeddings)
    123         keywords = [in_cluster_keywords[index] for index in range(len(docs))]
    124 else:
    125     # Extract keywords using a Large Language Model (LLM)
--> 126     keywords = self.llm.extract_keywords(docs, candidate_keywords)
    128 # Only extract keywords that appear in the input document
    129 if check_vocab:

File ~\.conda\envs\PhDProjectsWork\Lib\site-packages\keybert\llm\_openai.py:177, in OpenAI.extract_keywords(self, documents, candidate_keywords)
    175         response = chat_completions_with_backoff(**kwargs)
    176     else:
--> 177         response = openai.ChatCompletion.create(**kwargs)
    178     keywords = response["choices"][0]["message"]["content"].strip()
    180 # Use a non-chat model
    181 else:

File ~\.conda\envs\PhDProjectsWork\Lib\site-packages\openai\lib\_old_api.py:39, in APIRemovedInV1Proxy.__call__(self, *_args, **_kwargs)
     38 def __call__(self, *_args: Any, **_kwargs: Any) -> Any:
---> 39     raise APIRemovedInV1(symbol=self._symbol)

APIRemovedInV1: 

You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

Reading the help documentation for OpenAI with
help(OpenAI)
it shows:

 |  Using the OpenAI API to extract keywords
 |  
 |      The default method is `openai.Completion` if `chat=False`.
 |      The prompts will also need to follow a completion task. If you
 |      are looking for a more interactive chats, use `chat=True`
 |      with `model=gpt-3.5-turbo`.

This would suggest that the error received is correct as openai.Completion is deprecated.

I thought the applied fix works for openai>1.0?
Could you help clarify what I'm not doing correctly?

On the other hand, if I try the following:

client = openai.OpenAI(api_key=OpenAI.api_key)
llm = OpenAI(client, chat=True, model="gpt-3.5-turbo")
kw_model = KeyLLM(llm)

I end up with the following error instead

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[108], line 3
      1 # Create LLM
      2 client = openai.OpenAI(api_key=OpenAI.api_key)
----> 3 llm = OpenAI(client, chat=True, model="gpt-3.5-turbo")
      5 # Load it in KeyLLM
      6 kw_model = KeyLLM(llm)

TypeError: OpenAI.__init__() got multiple values for argument 'model'

What am I doing wrong?

@MaartenGr
Owner

@adegboyegaFAU You are not using the fix. To install the fix, you should run the following instead:

pip install -U git+https://github.com/MaartenGr/KeyBERT@openai_fix
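
(With that branch installed, usage presumably follows the client-based pattern shown later in this thread; an untested sketch, assuming the openai>=1.0 client is passed as the first argument and "sk-..." is a placeholder key:)

import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

client = openai.OpenAI(api_key="sk-...")                 # openai>=1.0 client object
llm = OpenAI(client, model="gpt-3.5-turbo", chat=True)   # fixed backend takes the client first
kw_model = KeyLLM(llm)

keywords = kw_model.extract_keywords(["A document about keyword extraction."])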

@adegboyegaFAU

adegboyegaFAU commented Dec 8, 2023

Works now! Thanks @MaartenGr. I'd actually tried that previously from @lfoppiano's post on fix #189 and it didn't work. It turns out that what I hadn't done after uninstalling keybert was restart Anaconda.

I really do love the tool, by the way. Great work!

@lfoppiano
Author

@MaartenGr any estimate on when this fix will be released?

@MaartenGr
Owner

@lfoppiano I just pushed the fix to the main branch; an official release will follow either this week or next.

@lfoppiano
Author

Great, thanks!
I've been testing it extensively these days and it works fine.

@fabmeyer

@MaartenGr the fix is not yet released, right?

@fabmeyer

@lfoppiano Can you provide a minimal working example? I am running into problems when using the OpenAI LLM for keyword generation.

import os
import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

openai.api_key = os.getenv('OPENAI_API_KEY')
llm = OpenAI(
    client = openai,
    model = "gpt-3.5-turbo-instruct",
    prompt = "Summarize the following text of keywords with a maximum of 5 keywords: \n\n-",
    chat = False,
    verbose = False,
    )

kw_model_2 = KeyLLM(llm)

year = 2010
texts_to_process = unique_keywords_2[year]
topics = kw_model_2.extract_keywords(texts_to_process)

KeyError                                  Traceback (most recent call last)
File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/pydantic/main.py:759, in BaseModel.__getattr__(self, item)
    758 try:
--> 759     return pydantic_extra[item]
    760 except KeyError as exc:

KeyError: 'message'

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
Cell In[10], line 17
     15 year = 2010
     16 texts_to_process = unique_keywords_2[year]
---> 17 topics = kw_model_2.extract_keywords(texts_to_process)
     19 print(topics)

File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/keybert/_llm.py:126, in KeyLLM.extract_keywords(self, docs, check_vocab, candidate_keywords, threshold, embeddings)
    123         keywords = [in_cluster_keywords[index] for index in range(len(docs))]
    124 else:
    125     # Extract keywords using a Large Language Model (LLM)
--> 126     keywords = self.llm.extract_keywords(docs, candidate_keywords)
    128 # Only extract keywords that appear in the input document
    129 if check_vocab:

File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/keybert/llm/_openai.py:189, in OpenAI.extract_keywords(self, documents, candidate_keywords)
    187     else:
    188         response = self.client.completions.create(model=self.model, prompt=prompt, **self.generator_kwargs)
--> 189     keywords = response.choices[0].message.content.strip()
    190 keywords = [keyword.strip() for keyword in keywords.split(",")]
    191 all_keywords.append(keywords)

File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/pydantic/main.py:761, in BaseModel.__getattr__(self, item)
    759         return pydantic_extra[item]
    760     except KeyError as exc:
--> 761         raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}') from exc
    762     else:
    763         if hasattr(self.__class__, item):

AttributeError: 'CompletionChoice' object has no attribute 'message'

@lfoppiano
Author

lfoppiano commented Feb 15, 2024

@fabmeyer I use the gpt-3.5-turbo OpenAI model and chat=True.

I assembled an example from the code I've used (disclaimer: I did not test it):

import openai
from keybert.llm import OpenAI
from keybert import KeyLLM
from sentence_transformers import SentenceTransformer

client = openai.OpenAI()
chatgpt = OpenAI(client, model="gpt-3.5-turbo", chat=True)
kw_model = KeyLLM(llm=chatgpt)
model = SentenceTransformer('all-MiniLM-L6-v2')

# `works` is a list of dicts that may contain an 'abstract' field
abstracts = [work['abstract'] if 'abstract' in work and work['abstract'] is not None else "" for work in works]
embeddings_abstracts = model.encode(abstracts, convert_to_tensor=True)
keywords_abstracts = kw_model.extract_keywords(abstracts, embeddings=embeddings_abstracts, threshold=0.5)
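
(For comparison, a minimal, untested sketch of @fabmeyer's setup above switched to the chat path, which is what the example uses; the custom prompt is left out here, gpt-3.5-turbo replaces the instruct model, and the input document is a stand-in for unique_keywords_2[year]:)

import os
import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

client = openai.OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
llm = OpenAI(client, model="gpt-3.5-turbo", chat=True)   # chat path instead of completions
kw_model_2 = KeyLLM(llm)

texts_to_process = ["example document about keyword extraction"]  # stand-in for the real data
topics = kw_model_2.extract_keywords(texts_to_process)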

@MaartenGr
Owner

MaartenGr commented Feb 15, 2024

Ah right, I should definitely release an official version. Let me work on it for a bit and I'll let you know when I release 0.8.4.

@MaartenGr
Owner

Apologies for the delay (and thanks for the ping)! I just pushed 0.8.4 to PyPI, so all changes to the main branch should now be in the official release.
