Experience with Big Data and delivering what customers need
Autumn is currently the VP of Engineering at Cohere. She’s been with the company since September 2022, scaling teams & tools. Prior to buying into the startup life, she spent 3 years in financial services and 14 years at a large non-profit. Her passion is helping innovative developers buck the status quo - moving their improvements out of R&D and into successful launches.
Autumn Moulder is an experienced technology leader with a diverse background in infrastructure, security, and data analytics. Currently serving as VP of Engineering at Cohere, Autumn has been instrumental in scaling teams for LLM infrastructure and security initiatives. Prior experience includes Senior Director of Developer Platforms & Infrastructure at Progressive Leasing, with a focus on governance and data systems. Autumn's career also spans roles at The Church of Jesus Christ of Latter-day Saints, culminating in leadership of data platforms and analytics strategies. Academic credentials include a Bachelor of Science in Computer Science from Brigham Young University and an Associate's degree from Brigham Young University - Idaho.
Cloud native takes on new meaning in the AI and HPC domains. What does cloud native mean when your software is tightly coupled to hardware? When capacity is fixed, which assumptions start to break down? How can you flex GPUs between batch training workloads and inference? Join us for a case study demonstrating how a small team scaled ML infrastructure from a single cloud to multiple clusters across 4 cloud providers - in under 6 months. We’ll share unique multi-cloud challenges we uncovered around supercomputing infrastructure, cross-cloud networking, capacity & quota management, batch workloads, FinOps, and observability. We will particularly highlight our experience using Kueue to manage fixed capacity across clouds & where Kubernetes still falls short for HPC workloads. Leave with a solid understanding of what it takes for an infrastructure team to support the lifecycle of a cloud native foundation model.
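For context on the Kueue pattern the abstract refers to, here is a minimal sketch of how fixed per-cloud GPU capacity can be modeled as queued quota; the flavor name, namespace, and quota numbers below are illustrative assumptions, not Cohere's actual configuration.

```yaml
# Illustrative Kueue setup (hypothetical names and quotas).
# Each cloud's reserved GPU capacity is modeled as a ResourceFlavor; the ClusterQueue
# caps admission at a fixed quota so batch jobs queue instead of failing to schedule.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: gpu-cloud-a            # one flavor per cloud/cluster with reserved capacity
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: training
spec:
  namespaceSelector: {}        # admit workloads from any namespace
  resourceGroups:
  - coveredResources: ["nvidia.com/gpu"]
    flavors:
    - name: gpu-cloud-a
      resources:
      - name: "nvidia.com/gpu"
        nominalQuota: 64       # fixed capacity: jobs beyond this wait in the queue
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-training
  namespace: ml-training
spec:
  clusterQueue: training
```

Batch Jobs opt in by carrying the `kueue.x-k8s.io/queue-name: team-training` label, and Kueue keeps them suspended until quota frees up rather than letting them contend for a fixed pool of GPUs.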
Deploying LLMs is challenging. This talk is a case study in how cloud native technologies, specifically Kubernetes and OCI artifacts, simplify private LLM deployments. Allowing teams to run models in their own infrastructure solves significant data governance & security challenges. However, it is still difficult to efficiently share large artifacts between model developers and model consumers. Autumn and Marwan share how open standards removed these roadblocks and simplified LLM delivery. First, we explore how Kubernetes made it possible to rapidly deliver a highly portable, cloud-native inference stack. Second, OCI artifacts have been underutilized as a delivery mechanism for artifacts beyond container images. We explore how we achieved significant efficiency gains by reducing duplicate storage, increasing download speed, and minimizing governance overhead. Walk away learning how you can leverage Kubernetes and OCI in your MLOps journey.
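As a rough sketch of the OCI-artifact delivery pattern the abstract describes (not the exact tooling from the talk), model weights published to a registry can be pulled at pod startup with a client such as the ORAS CLI and shared with the inference container through a volume; the registry, image tags, and paths below are hypothetical.

```yaml
# Hypothetical example: model weights published as an OCI artifact, pulled at
# pod startup with the ORAS CLI, then served by the inference container.
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference
spec:
  volumes:
  - name: model-weights
    emptyDir: {}               # shared scratch volume for the downloaded weights
  initContainers:
  - name: fetch-model
    image: ghcr.io/oras-project/oras:v1.2.0      # illustrative image/tag
    command: ["oras", "pull", "registry.example.com/models/my-llm:v1", "-o", "/models"]
    volumeMounts:
    - name: model-weights
      mountPath: /models
  containers:
  - name: server
    image: registry.example.com/inference-server:latest   # hypothetical serving image
    volumeMounts:
    - name: model-weights
      mountPath: /models
      readOnly: true
```

Because OCI layers are content-addressed, identical weights referenced by multiple tags or consumers are stored and transferred once, which is where the storage de-duplication and download-speed gains come from.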
In this segment from theCUBE + NYSE Wired’s AI Factories – Data Centers of the Future series, theCUBE’s John Furrier sits down with Autumn Moulder, VP of Engineering at Cohere, to unpack how AI factories are reshaping enterprise infrastructure and the software stacks that run on them. Moulder explains Cohere’s enterprise-first approach across security, privacy and efficiency – meeting organizations where they are with right-sized models and applications, including Cohere’s North product. She shares how enterprises balance general-purpose LLMs with specialized models to hit strict SLAs and minimize hallucinations, and why triage and routing agents often deliver outsized quality and efficiency gains in real-world automation workflows. A notable data point: 48% of Cohere North deployments have been in on-prem environments – evidence of AI factories turning modern data centers into purpose-built compute hubs for AI at scale.
The conversation digs into the practicalities of what runs on AI factories: GPU-dense architectures, evolving network and storage fabrics and the tight interplay between models and software required to achieve 10x outcomes. Moulder discusses why startups often stall at POC due to resilience and security gaps, and how leveraging a proven stack can accelerate production readiness. The duo explores best practices for vertically integrated vs. decoupled designs, the economics of AI-scale infrastructure, and the future of specialized LLMs as general models continue to improve. If you’re evaluating how to operationalize AI across hybrid cloud and on-prem estates, this is a candid look at the engineering realities behind the next era of compute.
Ihab Tarazi, senior vice president and chief technology officer of AI, compute and networking at Dell Technologies Inc., and Autumn Moulder, vice president of engineering at Cohere Inc., join theCUBE’s Savannah Peterson and Dave Vellante at Dell Technologies World 2025 to discuss the accelerating path from AI experimentation to full-scale enterprise deployment. Their conversation explores how Dell and Cohere are building secure, efficient AI solutions that meet the demands of modern businesses.
Moulder shares how Cohere is working to democratize AI access while preserving data privacy, offering scalable models that enterprises can run securely on-prem or in hybrid environments. Tarazi highlights the technical innovations that reduce costs per token and improve time-to-value.
The discussion highlights how these AI platforms enhance productivity, improve customer experiences and maintain enterprise-grade data standards. Together, Dell and Cohere are helping organizations operationalize AI without compromise.
In honour of International Women's Day on March 8, 2024, we're excited to share this panel dedicated to the phenomenal women in tech.
This event aims to highlight and appreciate the diverse and invaluable contributions of women on the Cohere team. Hear from Autumn Moulder, Cécile Robert-Michon, Ye Shen, Jessica Xie and Leila Chan Currie in this panel, showcasing their achievements and sharing their inspiring stories. This is an opportunity to recognize the incredible impact these women (and all women) have within the tech community.
Let's unite to uplift and encourage one another, recognizing the essential role women play in shaping the future of technology.
9 April 2025
At Cohere, we’re building the secure AI infrastructure enterprises need to adopt autonomous agents confidently, and the open A2A protocol ensures seamless, trusted collaboration—even in air-gapped environments—so that businesses can innovate at scale without compromising control or compliance.
15 April 2025
“With access to some of the first NVIDIA GB200 NVL72 systems in the cloud, we are pleased with how easily our workloads port to the NVIDIA Grace Blackwell architecture,” said Autumn Moulder, vice president of engineering at Cohere. “This unlocks incredible performance efficiency across our stack — from our vertically integrated North application running on a single Blackwell GPU to scaling training jobs across thousands of them. We’re looking forward to achieving even greater performance with additional optimizations soon.”
Jan 27, 2025
“AI is only as useful as the data you give it.”
09 Nov 2023
"This is similar to when, as an industry, we started moving into the cloud. All the same reasons that people gave for, 'I can't move out of my collocated facility to the cloud' are the exact same reasons we're hearing for why 'I need to run a private LLM and don't want to use a SaaS system,'" said Autumn Moulder, director of infrastructure and security at Cohere, a generative AI vendor in Toronto. "But they also don't have the expertise to run a fully open source [LLM] stack."
15 Apr 2025
As AI agents become a core part of all software systems, “interoperability is critical,” said Autumn Moulder, vice-president of engineering at Cohere, noting the sector is “in a rapid period of expansion where multiple industry standards are evolving.”
The system’s design means it “can deliver immediate utility, even as the network grows,” said Moulder.
The two protocols [MCP and A2A] together “ensure AI agents have the right context and can leverage the most useful tools,” said Moulder.
15 Apr 2025
"This unlocks incredible performance efficiency across our stack — from our vertically integrated North application running on a single Blackwell GPU to scaling training jobs across thousands of them," said Autumn Moulder, vice president of engineering at Cohere.