Amazon's Alexa+ is undergoing a significant transformation with new agentic capabilities, powered by generative AI and the Bedrock platform. This evolution moves Alexa from a reactive assistant to a proactive agent capable of complex, multi-step task completion, bridging gaps with innovative tools like Nova Act for seamless user experiences.
Alexa+ Breaks New Ground with Agentic Capabilities
Read original at WIRED →

A look at some of the tech challenges Amazon solved in rebuilding Alexa with generative AI

It’s late Sunday afternoon, your arms are full of laundry, and you’re thinking about your busy week ahead. Time to call on your personal assistant to ensure you get to work on time each morning, dressed for the weather and informed about the news.
A warm and familiar voice replies, “Absolutely. Let me work on that routine for you.”

For the hundreds of millions of Alexa customers worldwide, telling an Amazon voice assistant to perform simple tasks like playing music or resetting the thermostat has become normal. But the new Alexa+, with conversational ease and complex agentic capabilities, promises to be a transformative tool.
Thanks to advances in AI, led by Amazon Web Services (AWS), the opportunity finally arrived for Amazon to build the kind of assistant the company imagined from the start: one that doesn’t just respond to specific requests but can also have natural conversations to get a near-endless number of things done in the real world.
Alexa+ is rolling out in its early stages now. “We’re at the beginning of a long journey. Now that we have this great technology foundation, this is just the beginning of what we want to do,” says Nedim Fresko, VP of technology at Alexa. “We have so many ideas and so many features to implement, and we expect this will grow and develop with our customers.”

Complex Challenges to Overcome

Getting to this point wasn’t easy. The technologists at Alexa had to figure out how to reliably connect an assistant to the tens of thousands of services and devices customers want to use in their daily lives. “We couldn’t just build a tech stack from scratch and put generative AI at the center,” says Mara Segel, director of product management for Alexa. “We had to think about how we keep our high quality of service with all of these customers whose trust we’ve earned, and then how we bring in a whole new set of emerging technologies and complete those inventions to get them to the next level of assistance.”

The Alexa team quickly realized it faced distinct scenarios to resolve, with AWS powering new innovation through model orchestration and agentic capabilities.
Large language models, the basis of any natural-language AI system, don’t inherently know how to orchestrate APIs, the necessary building blocks of agentic capabilities. In some cases, third-party partners provide robust APIs to interface with Alexa. In others, especially with smaller companies and those that are less tech-forward, integrations are partial or missing entirely.
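The gap described here is usually bridged by exposing each service to the model as a declared "tool" it can select and call with structured arguments. The sketch below shows that pattern in miniature; the tool names, schemas, and handlers are hypothetical illustrations, not Alexa's actual interface.

```python
# Minimal sketch of exposing services to an LLM as callable "tools".
# All names and schemas here are invented for illustration.

TOOLS = {
    "set_thermostat": {
        "description": "Set the target temperature of a thermostat.",
        "parameters": {"temperature_f": "int", "room": "str"},
    },
    "book_ride": {
        "description": "Request a ride to a destination.",
        "parameters": {"destination": "str", "pickup_time": "str"},
    },
}

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to the matching service handler."""
    name, args = tool_call["name"], tool_call["arguments"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    # In a real system each branch would invoke the partner's API.
    if name == "set_thermostat":
        return f"thermostat in {args['room']} set to {args['temperature_f']}F"
    if name == "book_ride":
        return f"ride booked to {args['destination']} at {args['pickup_time']}"

# The model sees TOOLS in its prompt and emits a structured call such as:
call = {"name": "set_thermostat",
        "arguments": {"temperature_f": 72, "room": "living room"}}
print(dispatch(call))  # thermostat in living room set to 72F
```

The key point is that the LLM never calls an API directly; it emits a structured request that a deterministic dispatcher validates and executes.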
And as the agentic AI ecosystem rapidly develops across companies around the world, digital agents will need to learn to work directly with one another.

To reimagine Alexa in a world of generative AI, Amazon’s Devices & Services and Artificial General Intelligence teams joined forces, rallying around the company’s customer obsession.
Having so many tech capabilities under one roof enables AWS to provide customers with a huge range of services conveniently, reliably, and swiftly.

“Reinventing Alexa involves completely rearchitecting the system around generative AI and large language models, but also bringing along its core capabilities,” says Daniel Rausch, vice president of Alexa and Echo at Amazon.
“We don’t want to make any trade-offs or leave any customer behind in that journey. So, we have to build this thing at scale.”

APIs at Scale

The first breakthrough came where strong APIs were already in place. Amazon achieved something no one else has: orchestrating dozens of LLMs to reliably connect to tens of thousands of services and devices.
While many have struggled to link a single model to even one API, Alexa is already proving it can operate at an entirely different scale, with major partners across smart-home, dining, ride-sharing, and home services using AI built on the Amazon Bedrock platform.

Alexa customers can now do things like book a ride hands-free while they’re getting ready to leave, or they can reserve a table for dinner through natural conversation.
They can even add groceries to their cart and reorder household essentials without ever reaching for their phone. Say “I’m chilly,” and Alexa knows to turn up the heat. Follow with “Dim the lights,” and it understands the shift in context instantly. No pauses, no lag, no need to repeat yourself.

Looking forward, Amazon is expanding this capability with the Actions SDK, a developer tool designed to make integrations even simpler.
Instead of building custom code for Alexa, partners will be able to map their existing APIs to Alexa’s categories such as reservations, food delivery, and local services. This creates a consistent way for Alexa to talk to many different services, which is critical for scaling AI agents. Such standardization means Alexa+ won’t have to learn a new system each time—it can just plug into partners quickly and reliably.
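The article doesn't publish the Actions SDK interface, but the mapping it describes — a partner declaring its existing endpoints against Alexa's standard categories — can be pictured as a thin adapter layer. Every name below is illustrative, not the real SDK.

```python
# Illustrative adapter mapping a partner's existing API onto standard
# assistant categories; names are hypothetical, not the real Actions SDK.

CATEGORY_ACTIONS = {
    "reservations": ["search_availability", "create_booking"],
    "food_delivery": ["browse_menu", "place_order"],
}

class PartnerAdapter:
    """Declares which partner endpoint fulfills each standard action."""

    def __init__(self, name: str, category: str, endpoint_map: dict):
        required = set(CATEGORY_ACTIONS[category])
        missing = required - endpoint_map.keys()
        if missing:
            raise ValueError(f"{name} is missing actions: {sorted(missing)}")
        self.name, self.category, self.endpoints = name, category, endpoint_map

    def invoke(self, action: str, **params):
        # The assistant only ever speaks in category actions; the adapter
        # translates them to whatever the partner already built.
        return self.endpoints[action](**params)

# A restaurant partner maps its existing functions to the category's actions.
adapter = PartnerAdapter(
    "sushi-place",
    "reservations",
    {
        "search_availability": lambda date, size: [f"{date} 19:00 for {size}"],
        "create_booking": lambda slot: f"confirmed {slot}",
    },
)
print(adapter.invoke("search_availability", date="2025-06-01", size=2))
```

Because every partner in a category answers the same action names, the assistant can swap providers without learning a new interface — the standardization benefit the article points to.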
But not every partnering company offers this kind of access through robust APIs. Many have incomplete APIs or none at all. “There are additional service providers who may not be as technical, but do have wonderful websites and things to offer our customers,” Segel says.

Finding a way to automate the process and build in access for Alexa required the next breakthrough.
Agentic AI Comes to the Forefront

Longtime Alexa users have doubtless heard the dreaded response, “I’m sorry, I can’t help you with that.” You get it when you don’t phrase a request in a manner that Alexa can interpret, or when fulfilling your request would require interfacing with a web browser rather than an API.
Yet the promise of agentic AI means that users will expect Alexa to carry a task through, not stop halfway or hand it back to them partway through a seemingly simple exchange. To overcome this hurdle and simplify the experience for customers, Amazon developed Nova Act, an agentic model that uses advanced visual reasoning.
It is designed to interpret a website much like a person would: reading forms, recognizing buttons and icons, even navigating maps, and then taking action in a secure, authenticated browser session.

“Nova Act can navigate any website, whether it’s a modern design or an older design, whether the buttons are on the right side or the left side, whether they’re hidden, whether a pop-up appears,” says Michael Giannangeli, head of product for agentic AI, Amazon Nova.
“It handles it just like you would.”

For example, Thumbtack’s API already lets Alexa search for local service providers like plumbers or handymen. But booking and confirming the job still requires going through a website. Nova Act will step in, understanding the page visually, filling out details, and completing the booking.
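Conceptually, a browser agent of this kind runs an observe–decide–act loop: capture the page state, let a visual-reasoning model choose the next UI action, execute it, and repeat until the goal is met. The toy loop below mocks both the page and the model to show the loop's shape; it is not Nova Act's real API.

```python
# Toy observe-decide-act loop in the style of a browser agent.
# The "page" and "policy" are mocks; Nova Act's real interface differs.

goal_fields = {"date": "2025-06-01", "address": "123 Main St"}

def mock_policy(page: dict) -> dict:
    """Stand-in for a visual-reasoning model choosing the next UI action."""
    for field, value in goal_fields.items():
        if page["form"].get(field) is None:
            return {"type": "fill", "field": field, "value": value}
    return {"type": "click", "target": "confirm"}

page = {"form": {"date": None, "address": None}, "booked": False}

for _ in range(10):  # safety cap on the number of steps
    action = mock_policy(page)      # observe + decide
    if action["type"] == "fill":    # act: type into a form field
        page["form"][action["field"]] = action["value"]
    elif action["type"] == "click" and action["target"] == "confirm":
        page["booked"] = True       # act: press the confirm button
        break

print(page["booked"])  # True
```

The step cap matters in practice: agents that act on live pages need a bound on retries so a confusing layout degrades gracefully instead of looping forever.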
Nova Act is just one of several specialized AI models Alexa orchestrates to keep conversation natural, execute actions on the web, and validate accuracy. One of the most important tests will be how the system performs under real-world conditions, where customers expect answers in seconds, even as Alexa coordinates multiple models, services, and partners behind the scenes.
The Multi-Model Orchestrator

Delivering speed at scale requires a reimagined foundation and new advances in arbitration and latency. The rearchitected system is built on Amazon Bedrock, a platform for building generative AI applications and agents at production scale. AWS made it model-agnostic so the right model can be applied at each step.
Supporting this are what the Alexa team calls “experts”—collections of APIs, workflows, and reasoning tools tailored to domains such as booking a table, managing home services, or controlling smart home devices.

With this architecture, Alexa will be able to arbitrate among providers at runtime, evaluating multiple options for completing a task, surfacing the best fit, and respecting customer preferences.
Just as importantly, it will do this instantly. A sophisticated routing system is being developed to minimize latency by matching each request to the fastest and most effective path—balancing speed, accuracy, and reliability without breaking the flow of conversation.

“We have a model for every use case, from complex agentic tasks like coordinating across multiple agents and hundreds of Alexa experts, to really fast natural conversations with minimal latency, to things like navigating a web browser with Nova Act,” Giannangeli says.
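The speed-versus-accuracy routing described here can be reduced to a scoring problem: each candidate path advertises an estimated latency and quality, and the router picks the best trade-off for the request at hand. The candidates, numbers, and weighting below are invented for illustration, not Alexa's actual routing logic.

```python
# Illustrative request router: score candidate model paths on quality
# versus latency and pick the best trade-off. All numbers are made up.

CANDIDATES = [
    {"path": "fast-chat-model", "latency_ms": 100, "quality": 0.75},
    {"path": "agentic-planner", "latency_ms": 1000, "quality": 0.92},
    {"path": "browser-agent", "latency_ms": 3000, "quality": 0.97},
]

def route(latency_weight: float = 0.5) -> str:
    """Pick the candidate with the best quality-vs-latency score.

    latency_weight near 1.0 favors fast conversational turns;
    near 0.0 it favors accuracy for complex agentic tasks.
    """
    slowest = max(c["latency_ms"] for c in CANDIDATES)
    def score(c):
        # Penalize latency normalized against the slowest candidate.
        return c["quality"] - latency_weight * (c["latency_ms"] / slowest)
    return max(CANDIDATES, key=score)["path"]

print(route(latency_weight=0.9))  # latency-sensitive turn: fast-chat-model
print(route(latency_weight=0.0))  # accuracy-only task: browser-agent
```

In a production orchestrator the per-path estimates would come from live telemetry rather than constants, but the arbitration step — score every viable path per request, then commit to one — has the same shape.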
AI That Gets Things Done in the Real World

Such breakthroughs will have a profound impact on Alexa partners and users alike. Robust APIs will connect through the Actions SDK, Nova Act will step in when APIs fall short, and arbitration will let customers choose among real providers in real time. For partners, a consistent framework makes integration simple.
Users will reap an immediate benefit in the form of requests that carry through from start to finish.

For instance, you can tell Alexa that you’re in the mood for sushi, and it will immediately give you options for nearby sushi restaurants and delivery services, using natural conversation to help you order.
Similarly, booking a handyman, reserving a table, or managing a calendar can happen without handoffs or dead ends. “This is only the beginning,” Giannangeli says. “You’re going to see these models get better, faster, and more agentic, and it’s going to add even more value for our customers.”

As more companies build AI agents, Alexa is designed to work alongside them, opening the door to a future where services coordinate seamlessly on behalf of customers.
Alexa+ remains in Early Access, constantly improving. What is here today is powerful—and what is coming will be even more transformative as Amazon realizes its vision of what it calls “ambient assistance,” technology that fades into the background and supports people in everyday life.

“What I love about the team here and what we have under one roof at Amazon,” Rausch says, “is you have AWS, Amazon Bedrock, Amazon Nova working on systems that can actually get things done for customers to pay off a vision as big as Alexa Plus.”

