AWS re:Invent Day 5: all about the 2023 edition


GenAI omnipresent and responsible AI solution announced

The fifth and final day of AWS re:Invent 2023 concluded with in-depth sessions on some of the new features announced this week. Here is a deep dive into three of them.

1. Amazon Q, the new productivity companion from AWS

Today's digital age is marked by a relentless quest for productivity gains, and generative artificial intelligence (GenAI) technologies are at the heart of this transformation. According to a McKinsey study, up to 25% of our daily activities could be automated, freeing up as much as two hours a day for each employee. This applies to many areas, including marketing, R&D, financial analysis and technical support.

However, the adoption of GenAI is not without its challenges. Accuracy of answers, trust in sources and citations, data security and access control are major concerns. What's more, it's essential to quickly demonstrate the added value of these technologies.

Against this backdrop, Amazon has developed Q, a GenAI assistant that adapts specifically to the data, code and operations needs of enterprises. Amazon Q applies to a wide range of scenarios, acting both as a versatile companion for enterprises (GPT-style) and as an aid to implementing AWS services (Q is integrated into tools like Glue or QuickSight, for example).

But what's really great about Q is how easy it is to enrich it with company data and then query that data through the assistant, thanks to retrieval-augmented generation (RAG).

Q connects to over 40 data sources with integrated ingestion and vector indexing. Typical use cases include searching information stored in SharePoint, Confluence, Salesforce and internal wikis, accelerating content creation, summarizing documents and extracting insights for comparison. Amazon Q's key features include conversational interfaces, the execution of actions via plugins, and robust safety and security measures, such as toxicity prevention and the application of enterprise security rules to the curated data indexed for RAG.
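To make the RAG idea more concrete, here is a minimal sketch of the retrieve-then-augment pattern. It is not Amazon Q's implementation or API: the documents, the bag-of-words `embed` function and every name below are assumptions chosen for illustration, and a real system would use a learned embedding model and a proper vector store.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "company documents" (assumption); a real deployment
# would ingest them from sources such as SharePoint, Confluence or S3.
DOCUMENTS = {
    "vacation-policy": "Employees receive 25 days of paid vacation per year.",
    "expense-policy": "Travel expenses must be submitted within 30 days of the trip.",
    "onboarding": "New hires get a laptop and a badge on their first day.",
}


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# "Vector index": document id -> embedding, built once at ingestion time.
INDEX = {doc_id: embed(text) for doc_id, text in DOCUMENTS.items()}


def retrieve(question: str, k: int = 1) -> list:
    """Return the ids of the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda doc_id: cosine(q, INDEX[doc_id]), reverse=True)
    return ranked[:k]


def build_prompt(question: str) -> str:
    """Augment the question with retrieved passages, tagged with their source ids."""
    context = "\n".join(f"[{doc_id}] {DOCUMENTS[doc_id]}" for doc_id in retrieve(question))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"
```

The prompt returned by `build_prompt` is what would be sent to the language model. Keeping the source id next to each passage is what lets an assistant cite where an answer came from.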

A concrete example is the integration of Amazon Q into Slack, where simply mentioning @AmazonQ starts a conversation with the assistant. Practical demos also show how to create an application and add data from S3, highlighting Q's ease of use and efficiency in enterprise data management.

In short, Amazon Q represents a giant step towards intelligent, secure automation in the enterprise, promising not only substantial productivity gains but also better management of enterprise data.

2. Data integration simplified and enhanced with GenAI

For many years, AWS has offered AWS Glue, a serverless ETL data integration service that allows you to develop and run jobs in PySpark or Scala, essentially aimed at Big Data engineers. A few years ago, a graphical tool called Glue Studio appeared, designed to offer developers a more intuitive drag-and-drop experience (inspired by the self-service ETL tools on the market), enabling them to create data extraction and transformation jobs more rapidly.

However, managing connectors to external data sources remained complex with Glue, and significantly slowed down Time to Data for new data sources. AWS has therefore just announced the availability of native Glue connectors for the market's main cloud or on-premises data sources (including Snowflake, Azure Cosmos DB and Azure SQL, Google BigQuery, SAP HANA, Teradata, Oracle, SQL Server...) to significantly simplify the connection of Glue jobs to these sources. Connections to these data sources can now be configured simply by entering the required parameters on a few screens.

To complete its data integration suite, last August AWS introduced Glue Data Quality, a serverless solution designed to define and execute data quality rules based on a hundred or so built-in functions (uniqueness, non-nullity, referential integrity, row count, entropy calculation, etc.) and on DQDL (Data Quality Definition Language), a structured language that lets you combine these rules intuitively to build more complex ones.
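As a rough illustration of what such rules check, the sketch below evaluates three of the rule types mentioned above (non-nullity, uniqueness, row count) against a small in-memory dataset. It is plain Python, not DQDL and not the Glue Data Quality API; the function names and the `ORDERS` sample are assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[object]]
Rule = Callable[[List[Row]], bool]


def is_complete(column: str) -> Rule:
    """Non-nullity: every row has a non-null value in `column`."""
    return lambda rows: all(row.get(column) is not None for row in rows)


def is_unique(column: str) -> Rule:
    """Uniqueness: no value of `column` appears more than once."""
    def check(rows: List[Row]) -> bool:
        values = [row.get(column) for row in rows]
        return len(values) == len(set(values))
    return check


def row_count_at_least(minimum: int) -> Rule:
    """Row count: the dataset contains at least `minimum` rows."""
    return lambda rows: len(rows) >= minimum


def evaluate(rows: List[Row], rules: Dict[str, Rule]) -> Dict[str, bool]:
    """Run every named rule against the dataset and report pass/fail."""
    return {name: rule(rows) for name, rule in rules.items()}


# Hypothetical sample data with one duplicate id and one missing customer.
ORDERS = [
    {"order_id": 1, "customer": "acme"},
    {"order_id": 2, "customer": None},
    {"order_id": 2, "customer": "globex"},
]

RULES = {
    "order_id is complete": is_complete("order_id"),
    "order_id is unique": is_unique("order_id"),
    "customer is complete": is_complete("customer"),
    "at least 3 rows": row_count_at_least(3),
}

REPORT = evaluate(ORDERS, RULES)
```

Expressing each rule as a small function that returns a predicate makes it easy to combine rules into a named ruleset, which is the role a declarative language such as DQDL plays in the managed service.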

Here too, AWS is enriching its tool by introducing dynamic rules, that is, rules whose thresholds or comparison values are no longer static but calculated, generally from the historical values of the data itself. This reduces or even eliminates the time developers spend adjusting these thresholds as the data evolves. AWS has also announced a data anomaly detection feature, which uses Machine Learning algorithms to calculate and analyze data statistics, generate observations on potential data quality problems or data patterns that are difficult to detect or anticipate, and generate corresponding rules to continuously monitor their evolution over time.
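The difference between a static and a dynamic rule can be sketched in a few lines. The example below is an assumption-laden illustration, not Glue's implementation: the function names, the three-run window and the 80% ratio are all hypothetical choices.

```python
from statistics import mean
from typing import List


def dynamic_min_row_count(history: List[int], window: int = 3, ratio: float = 0.8) -> float:
    """Threshold derived from past runs: `ratio` times the mean of the last `window` counts.

    With no history yet, the threshold is 0 so the first run always passes.
    """
    recent = history[-window:]
    return mean(recent) * ratio if recent else 0.0


def check_row_count(current: int, history: List[int]) -> bool:
    """Dynamic rule: the current row count must not fall below the computed threshold."""
    return current >= dynamic_min_row_count(history)


# Hypothetical row counts observed on previous runs of the same job.
PAST_RUNS = [1000, 1100, 1200]

# A run with 900 rows is within the expected range, a run with 500 rows is not.
NORMAL_RUN_OK = check_row_count(900, PAST_RUNS)
SUSPICIOUS_RUN_OK = check_row_count(500, PAST_RUNS)
```

Because the threshold follows the recent history, a dataset that grows steadily keeps passing without anyone editing the rule, while a sudden drop is still flagged.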

Finally, generative AI also makes its appearance in Glue with the integration of Amazon CodeWhisperer in Glue Studio's notebooks, capable of generating Spark code corresponding to a processing task described in natural language (e.g. "Write Spark DataFrame into Redshift"), and of Amazon Q, capable of generating complete integration jobs from a descriptive sentence, but also of correcting certain errors or answering developers' questions instantly via an integrated chat, as an SME (Subject Matter Expert) would.

3. Towards more responsible generative AI with Guardrails for Amazon Bedrock

The contributions of generative AI to professional productivity (content creation, information search and summarization, generation of BI reports or code, etc.), to an enriched user experience (virtual assistants and chatbots, helpdesk agents for service centers, feedback or conversation analysis, etc.) and to business process optimization are already well established. However, developing generative applications raises new challenges that need to be anticipated and addressed, such as the handling of controversial questions, the provision of ethically inappropriate answers, brand-related security risks (information leakage), the protection of sensitive user data (PII), and the ever-present possibility of generating biased or stereotyped answers.

Most generative AI foundation models (FMs), such as Jurassic, Claude or Llama 2, already embed their own control mechanisms ("guardrails"), but these mechanisms remain limited to the general framework for which they were trained. AWS proposes to go a step further with the introduction of Guardrails for Bedrock, a solution enabling users to define their own security and ethics rules specific to each generative application.

These safeguards can be applied to any foundation model, as well as to Bedrock agents designed to help users accomplish complex tasks using generative AI. They can filter inappropriate content (hateful, insulting, sexual or violent) to varying degrees, in both prompts and generated responses; define and block forbidden topics using short natural language descriptions; or redact all or part of the sensitive information in the model's response before returning it to the user.
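The two checkpoints described above, one on the prompt and one on the response, can be pictured with a toy wrapper. This is only a conceptual sketch and not the Bedrock API: the `ToyGuardrail` class is hypothetical, and it matches denied topics with simple keywords and detects e-mail addresses with a regular expression, whereas the managed service relies on natural language topic descriptions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Simplified e-mail pattern used as the only PII detector in this sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


@dataclass
class ToyGuardrail:
    """Hypothetical guardrail: denied topics on input, PII redaction on output."""

    # Topic name -> keywords that signal the topic (keyword matching is an assumption).
    denied_topics: Dict[str, List[str]] = field(default_factory=dict)

    def blocked_topic(self, prompt: str) -> Optional[str]:
        """Return the first denied topic detected in the prompt, or None."""
        text = prompt.lower()
        for topic, keywords in self.denied_topics.items():
            if any(keyword in text for keyword in keywords):
                return topic
        return None

    def redact(self, response: str) -> str:
        """Mask e-mail addresses in the model response before it reaches the user."""
        return EMAIL.sub("[EMAIL]", response)


guardrail = ToyGuardrail(denied_topics={"investment advice": ["stock", "invest"]})

BLOCKED = guardrail.blocked_topic("Which stock should I buy?")
ALLOWED = guardrail.blocked_topic("What is the weather today?")
SAFE_RESPONSE = guardrail.redact("Contact jane.doe@example.com for details.")
```

Keeping the rules in a configuration object, separate from the model call, is what allows the same policy to be reused across different foundation models.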

These rules are created without any code, simply through configuration screens and natural language prompts, making them accessible to non-technical users, and can be tested by entering prompts directly into the console.

Charles Moureau

Director Cloud4Data

Thomas Dallemagne

Partner Advisory
