
Workflow updated with the use of LLMs (using Amazon Bedrock) #3

Merged
19 commits merged into aws-samples:main on Aug 2, 2024

Conversation

@dlaredo dlaredo commented Jun 25, 2024

Issue #, if available:

Description of changes:

  • Incorporates LLMs to process the data in a more efficient manner
  • Uses Amazon Bedrock
  • Creates a new data-streamer to test the workflow

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


from langchain.prompts import PromptTemplate
from langchain.llms.bedrock import Bedrock
from langchain_community.chat_models import BedrockChat

The official package for accessing Bedrock models is now langchain-aws.

Contributor Author

Fixed in commit e611804

'claude': 'anthropic.claude-3-haiku-20240307-v1:0',
}

logging.getLogger().setLevel(os.environ.get('LOG_LEVEL', 'WARNING').upper())

Have you considered using Powertools for Lambda? They offer a cool structured logging convenience.
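As a sketch of what structured logging buys you (Powertools' Logger provides this out of the box, plus Lambda context fields), a stdlib-only equivalent of the idea might look like the following; the "workflow" logger name and message are illustrative:

```python
import json
import logging
import os
from io import StringIO

# Stdlib-only sketch of structured (JSON) logging. AWS Lambda Powertools'
# Logger gives you this out of the box, plus Lambda context metadata.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

stream = StringIO()  # stand-in for stdout so the result is easy to inspect
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("workflow")
logger.addHandler(handler)
logger.setLevel(os.environ.get("LOG_LEVEL", "WARNING").upper())

logger.warning("item processed")
entry = json.loads(stream.getvalue())
```

Each log line is then machine-parseable JSON rather than free text, which makes CloudWatch queries much easier.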

Contributor Author

Fixed in commit e611804

augmented_json_format_str = json.dumps(json_format)

logging.info(f'Extract data prompt')
logging.info(extract_data_prompt.format(json_format=augmented_json_format_str,

Have you considered using a few shot prompting template?
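For reference, the few-shot idea can be sketched without pinning a specific LangChain API (LangChain's FewShotChatMessagePromptTemplate wraps the same pattern); the example texts and output fields below are illustrative, not from this PR:

```python
import json

# Minimal few-shot prompt construction: prepend a handful of worked
# examples so the model sees the exact input/output shape expected.
examples = [
    {"text": "Love the new checkout flow!",
     "output": {"sentiment": "positive", "topic": "checkout"}},
    {"text": "App crashes on login.",
     "output": {"sentiment": "negative", "topic": "login"}},
]

def build_prompt(text: str) -> str:
    shots = "\n\n".join(
        f"Text: {ex['text']}\nJSON: {json.dumps(ex['output'])}" for ex in examples
    )
    return (
        "Extract sentiment and topic from the text as JSON.\n\n"
        f"{shots}\n\nText: {text}\nJSON:"
    )

prompt = build_prompt("Shipping was slow but support helped.")
```

The few worked examples anchor the output format far more reliably than instructions alone, especially for smaller models like Haiku.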

Contributor Author

Added in commit e611804


extract_data_prompt = ChatPromptTemplate.from_messages(messages_data)

chain_extract_data = extract_data_prompt | llm_data | StrOutputParser()

Have you considered using structured output or tools to do info extraction?
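A rough sketch of the difference: instead of parsing the model's raw string yourself, the chain returns a validated, typed object (in LangChain this is typically done with .with_structured_output() on the chat model, backed by a Pydantic model). The fields below are illustrative stand-ins:

```python
import json
from dataclasses import dataclass

# Illustrative schema; the PR ended up using Pydantic models with
# LangChain's structured-output support, which validates for you.
@dataclass
class ExtractedInformation:
    sentiment: str
    topics: list

def parse_llm_output(raw: str) -> ExtractedInformation:
    # Stand-in for what structured output does: JSON text -> typed object.
    data = json.loads(raw)
    return ExtractedInformation(sentiment=data["sentiment"], topics=data["topics"])

info = parse_llm_output('{"sentiment": "positive", "topics": ["pricing"]}')
```

The win is that downstream code works with attributes instead of re-parsing strings, and malformed model output fails loudly at one well-defined point.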

Contributor Author

Added in commit e611804

@dlaredo dlaredo requested a review from donatoaz July 30, 2024 20:43
@donatoaz donatoaz left a comment

Hi David, I think overall my original comments were neatly addressed, and this looks a lot more "current" in terms of how to use LangChain. Very good work.

I left some comments; they are all minor, so I will go ahead and approve. Take the comments in and apply them if you want; otherwise, feel free to merge.


Maybe this could be a PDF; that seems more inclusive, since not everyone has PowerPoint.

Contributor Author

The PPT is required by the solutions team but I'm including a PDF version anyway.

Contributor Author

Added PDF diagram in 0aa1ba7

item = event

logger.info('Item:')
logger.info(item)


We are already logging the entire event on line 124. Maybe you could remove one of the two.

Contributor Author

Fixed in e91f7a0


# Attempt to categorize item
text = item['text']
logger.info(f'Text: {text}')


Same here; we are logging this twice.

Contributor Author
@dlaredo dlaredo Aug 2, 2024

Fixed in e91f7a0

logger.info(f'Text: {text}')

text = demoji.replace(text, "")
item['text_clean'] = text


Same here; we are logging this twice.

Contributor Author
@dlaredo dlaredo Aug 2, 2024

Fixed in e91f7a0


try:

#meta_topics_str = ','.join(META_TOPICS)


Commented-out line?

Contributor Author
@dlaredo dlaredo Aug 2, 2024

Fixed in e91f7a0

text: str
) -> ExtractedInformation:

bedrock_llm = ChatBedrock(


Have you considered extracting this instantiation to root level so the client is created only once per lifetime of the lambda function?

Contributor Author

Can't extract it from the function, since the parameters are different for each instance of the client.
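One pattern that still works when the parameters vary is to memoize the client factory, so each distinct configuration is built only once per Lambda container. A hedged sketch, where make_client is a hypothetical stand-in for the ChatBedrock constructor:

```python
from functools import lru_cache

def make_client(model_id: str, temperature: float):
    # Hypothetical stand-in for ChatBedrock(model_id=..., temperature=...).
    return {"model_id": model_id, "temperature": temperature}

@lru_cache(maxsize=None)
def get_llm(model_id: str, temperature: float):
    # Same arguments -> same cached client for the lifetime of the container.
    return make_client(model_id, temperature)

a = get_llm("anthropic.claude-3-haiku-20240307-v1:0", 0.0)
b = get_llm("anthropic.claude-3-haiku-20240307-v1:0", 0.0)
```

Warm invocations with the same configuration then skip client construction entirely, while differing parameters still get their own instances.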


claude_information_extraction_prompt_template = INFORMATION_EXTRACTION_PROMPT_SELECTOR.get_prompt(MODEL_ID)

print("The prompt template")


Is there a reason for using prints when you have a logger instance? It might also be an opportunity to have some logs as INFO and others as DEBUG.
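A small sketch of the INFO/DEBUG split with the stdlib logger (the message contents are illustrative):

```python
import logging
from io import StringIO

# Leveled logging instead of print(): verbose details go to DEBUG and are
# silenced in production, while milestones stay visible at INFO.
stream = StringIO()
logging.basicConfig(stream=stream, level=logging.INFO,
                    format="%(levelname)s %(message)s", force=True)
log = logging.getLogger("extraction")

log.info("prompt template loaded")
log.debug("full template body: ...")  # dropped at INFO level
output = stream.getvalue()
```

Flipping the configured level to DEBUG in a dev environment then surfaces the verbose lines without any code changes.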

Contributor Author

Fixed in e91f7a0


text_insight = text_information_extraction(META_SENTIMENTS_STR, item['text_clean'])
logger.info(f'Text insights:')
logger.info(text_insight)


You seem to be double-logging some things; for example, line 119 logs the same as line 150.

Contributor Author

Fixed in e91f7a0

})

print("Information extraction object")
print(type(information_extraction_obj))


Is this necessary? We know the type is ExtractedInformation, since we are using structured outputs.

Contributor Author

Fixed in e91f7a0

region = args.region

sqs = boto3.client('sqs')
translate = boto3.client(service_name='translate', region_name=region, use_ssl=True)


Are you using this?

Contributor Author

This is optional, used only for sending a test load to the solution; the translation part is used only if the user decides to test posts in Spanish.

Contributor Author

dlaredo commented Aug 2, 2024

Solution upgraded to support the use of LLMs to make information extraction more efficient:

Description of changes:

  • Incorporates LLMs to process the data in a more efficient manner
  • Makes use of structured outputs using Pydantic models
  • Uses Amazon Bedrock
  • Creates a new data-streamer to test the workflow

@donatoaz donatoaz merged commit ace836a into aws-samples:main Aug 2, 2024