AI giveth and AI taketh CPU

Want to learn more about the topics Mark and Ryan discussed in this episode? Check out the AMD Advanced Insights podcast, a monthly show hosted by Mark.

Connect with Mark on LinkedIn.

[Intro Music]

Ryan Donovan: I'm Ryan Donovan, host of the Stack Overflow Podcast, and I'm here at HumanX to talk about the silicon aspect of AI. I'm here with Mark Papermaster, who's CTO of AMD. We're gonna talk all about it. So, welcome to the show.

Mark Papermaster: Ryan, thanks for having me.

Ryan Donovan: I wanted to start off, I'm sure you've seen this, I saw an April Fool's article about AMD buying Intel, and a lot of people were a little taken aback, but also feeling like this was an interesting turn of events. You all seem to have made better choices than Intel, at least in terms of your market cap. What do you think your strategy has been in terms of AI? Do you think your approach to AI has been rewarded?

Mark Papermaster: Well, Ryan, what we've done at AMD is had a very, very focused development effort to drive our roadmap to be leadership across everything from supercomputing, the cloud, the edge and endpoint devices like PCs. So, when you think about that, it drives us to be laser-focused on, 'what do customers need? What really gives customers value?' Because that's what we're focused on, in every product line and in the way we run our processes. And we really, over the last decade plus, reinvented how we develop products, and it starts with a keen ear in listening to our customers. [Lisa] Su became CEO, and she highlighted three mantras for the company: build great products, really listen to customers and have delighted customers, and simplify everything we do. So, that focus is really powerful, and there's a culture of collaborative innovation. My role as CTO is to drive innovation, to drive our roadmap, to make sure that we're not ever getting complacent, and so that led us to AI. I mean, what attracted me to AMD 14 years ago was the fact that AMD was really the only company that had a deep footprint in CPUs and a deep footprint in GPUs. And when you look at AI, it needs heterogeneous computing across both. And so, that's what we've been able to do, is to develop an engineering culture that really derives that value out of heterogeneous computing. And again, now we've even further expanded our portfolio all the way into embedded devices, and adaptive computing. So, AI is incredibly dependent on high-performance computing, and that's what our company is all about.

Ryan Donovan: Yeah, and it's interesting, you talk about the CPU and the GPU. I think my long-term understanding of AMD was mostly on the CPU, right?

Mark Papermaster: Yes.

Ryan Donovan: As an x86 alternative.

Mark Papermaster: You bet. That's our heritage.

Ryan Donovan: And obviously with AI, everybody is sort of focused on the GPU, TPU side of the house. How do you fit those together in a– I don't know if you fit them together in a single chip or in a single combined piece of silicon. How do you get those to work together?

Mark Papermaster: Well, what people don't realize is that we've been combining CPU and GPU since 2011, when we started with PCs. It's been 15 years. And it wasn't about AI, because it was 15 years ago. So, it was about, how do you get the best experience where you're running a combination of computing and graphics, whether it be gaming, whether it be just visualization on the PC that you're running, or workstation applications. So, we were really the first to create that type of tight integration of x86 with GPUs with a completely shared memory. So, it's very, very power efficient because it's not like you have to take something that you're processing in the CPU and send it over to a different GPU, with all the power that gets burned in doing that–

Ryan Donovan: So, they sort of share the, like, L1, L2 cache?

Mark Papermaster: Yeah, it's fully coherent. It's the same memory. So, it's not at a level one cache, but it's a shared memory, and so that is incredibly efficient. So, we have a lot of experience in heterogeneous computing, and we started over a dozen years ago. We started the Heterogeneous System Architecture effort with other companies because we've not only been committed to heterogeneous computing, we've been committed to doing that in an open ecosystem. And that really differentiates us versus our big GPU competitor, Nvidia. We've always been about being open. What you find is, as we then brought GPUs and CPUs into the data center, by then we had innovated on chiplets. So, you asked, is it always on the same piece of silicon? Well, for the data center where you need massive CPU and GPU compute, we use chiplets to be able to give different configurations. The number one and number two supercomputers in the world very tightly combine CPU and GPU chiplets together on one carrier, connected over silicon, but it's actually different chiplets that we put together. Incredibly energy efficient.

Ryan Donovan: Yeah.

Mark Papermaster: So, we're all about how to deliver value and innovation in how you put the computing elements together.

Ryan Donovan: So, talk about chiplets. What's the difference between a chip and a chiplet?

Mark Papermaster: Yeah. So, when you think about that, when you say it's one chip, it's a homogeneous chip, that means you're creating one design, and it's gonna go to one semiconductor technology node. It's all built on the same node. And building a chip, think about photography, where you create an image. Well, that's what you do. You're creating a chip on a given technology and creating images of multiple layers. That's all the transistors, and all the wiring, and how you put it together. And so, it turns out [that] as you build bigger and bigger chips, they're harder to manufacture. So, we innovated with chiplets, meaning partition it. [We] created a hierarchical and a partitioned organization. So, we broke out the CPU compute elements (this was our first heterogeneous implementation) from the chiplet that connects to all the memory and the IO off to storage, and other devices. And that way, we could put the CPU on the latest bleeding-edge semiconductor technology node, and the chip that docks to all the IO and memory doesn't need to be on there. It can be on a much more cost-efficient, older node. It also gave us agility. So, with just a CPU chiplet and an IO interface, we could create different combinations of CPU chiplets. You can take that same chiplet that you use for the data center, and you can use it in desktop PC applications. That's how we build our workstations, and we ended up doing the same thing in our GPUs. When we launched our big GPU to take on Nvidia in December of 2023, it was chiplet-based.

Ryan Donovan: I've heard there's a manufacturing bottleneck around the high-end chips, that [there are] basically two places that can make those. Does this get around the manufacturing bottleneck?

Mark Papermaster: Makes it easier because these chiplets are easier to manufacture. They yield better, so we're more efficient, and we work so closely with our supply chain. We worked with TSMC for decades. We plan our supply dependencies on them [for] years in advance. We've already locked in 2026 [and] 2027. We do the same thing with our memory partners. Long-term relationships we've had with them locking our supply, and then how you package them and put them together. Also, we're heavily invested in that industry. So, people do think it's all about the design. It's much more than just the design. It's how you architect putting it together. It's how you architect your supply chain and build those relationships to avoid bottlenecks in delivering to your customers.

Ryan Donovan: So, in the architecting [of] the chiplets and the chiplet combinations, how are you adjusting to the changes in workload needs, like the increased value of inference in the last year or so? And I'm sure there's gonna be other shifts.

Mark Papermaster: Chiplets are a piece of our agility there, Ryan. So, before inferencing took off and it was all about training 'cause we had to train all these models, we had a divergence of needs between AI and the traditional high-performance compute that all the national labs run on: weather forecasting, designing new molecules and enzymes. They need high precision. Some of the work, some of the analysis they do, can't use the approximations of AI. So, using chiplets, we've had almost identical versions of our GPUs, one oriented for high-performance computing that integrated CPU and GPU together on the same carrier, and another one for largely inferencing tasks that is all GPU. So, it gave us a lot of flexibility, just like our CPU chiplets gave us flexibility on how many CPU cores people need. We were able to tailor to what type of AI you are running. And now, we're seeing even more and more diversity because as inferencing starts to take off, inferencing has many different flavors of workloads. Are you vibe-coding, where you need low latency, and you're using our caches for low latency? Are you needing to have [a] large context window? Your prompts are huge, you're running agentic flows, so you're dependent on leveraging all of that memory that hangs off of the CPU, and we have that flexibility with very high-speed buses between our CPUs and GPUs. So, our approach of modularity and of partitioning out how we implement gives us a lot of flexibility to tailor as workloads evolve.

Ryan Donovan: Yeah. So in managing [those] workloads, where does that management happen? Does it happen at the silicon level, or is there a software level that sends things to the GPU levels, to the memory bus, or whatever it is.

Mark Papermaster: Yeah, that control happens at the software level, so that's why our ROCm stack– that's our software enablement stack, it's open, right? People love that because they can contribute code. They're not locked in. They're a part of it.

Ryan Donovan: They can fork it if they need it.

Mark Papermaster: Through a process, but they could have their own private fork. That's correct. But it is controlling, really, how workloads are best partitioned across the CPU, across the GPU, and of course, we have very highly performant compilers that optimize. And the best example of that would be the benchmarks that were just released about a week ago, and it's MLPerf. Have you heard of MLPerf? It's a very good machine learning benchmark because it's not one of those biased benchmarks. It was a cross-industry effort such that–

Ryan Donovan: It's not contaminated by the training data, right?

Mark Papermaster: It's not, and it's generally accepted workloads that are real. So, when people look at that benchmark, it represents, say for Stack Overflow, what you would be running. It truly should be representative. And we passed, for the first time, a million tokens per second. We have shown super high performance on our inferencing capability, and really just toe-to-toe with a big competitor in leadership in areas of our inferencing capability. So, our software stack– and people said, "oh, AMD's got a long way to go."

Ryan Donovan: Right.

Mark Papermaster: Turns out AI is helping us dramatically speed our optimization. We have AI really optimizing our compilation, our kernel development, and so we're able to drive even faster performance improvements than we had even imagined a year ago.

Ryan Donovan: That's interesting, and I wonder: x86, in my mind, was sort of the dominant paradigm for a long time. Then we have ARM computing, which has a different sort of architecture. Are you all sort of all in on x86, or are you looking at different chip architectures?

Mark Papermaster: Where you really need high performance, our x86 implementations are the leaders, with the highest density core count. Our new sixth-generation EPYC coming out later this year has up to 256 cores. These are high-performance cores. You can run simultaneous multithreading. That doubles it to 512 workflows. So, you think about agentic processing, everything works on x86. It's been the dominant architecture. So, your agentic flows might be spawning databases, might be spawning spreadsheets. They all work on x86. ARM is a fine architecture. It's growing its way to high performance. It still has more to go. We deploy it. We use it more in embedded applications. ARM is also on our network interface chip. So, to us, it's about using the right architecture for the right task. We use x86 where you have the highest performance needs, and we use ARM where you need a strong and fully dependable embedded CPU chip.

Ryan Donovan: Yeah. I think, unlike ARM—ARM is just a chip designer; you're in the manufacturing business, too, right? Or at least putting out full design, full chips, right?

Mark Papermaster: We sell full chips. And now, for the biggest consumers of AI clusters, we sell rack designs.

Ryan Donovan: Oh.

Mark Papermaster: We don't deliver the racks. We have a rack reference architecture. But we have to design and optimize all the way to rack level.

Ryan Donovan: That's interesting.

Mark Papermaster: Yes.

Ryan Donovan: What goes into the full design optimization of the rack? I assume that's memory, CPU cores, et cetera, like IO?

Mark Papermaster: Yeah. Think about it this way. So, you look at our Helios rack coming out at the end of this year. It has up to 72 GPUs in one rack. And if you're standing in front of it, on the front side, you pull this tray out. It's this huge tray. On that tray are two EPYC CPUs, our sixth-generation, two-nanometer CPUs. Incredibly highly efficient. And then, you see the GPUs on the same tray, and you also see the network interface chip on there. So, it's CPU, GPU, IO, and the memory is on there, so the high bandwidth memory is right in that same silicon carrier with the GPUs, and the DDR memory for the CPUs is on there. All that's on a compute sled. If you go to the backside and pull this tray out, you'll see the networks that scale out. So, you scale up to 72 nodes in a single rack, and then the network trays in the back allow you to scale out to connect these computer clusters together.

Ryan Donovan: To sort of offload workloads to other computing?

Mark Papermaster: Well, to create a mesh of yet a bigger computer. So, you start with 72, and now, over Ethernet networking, you create a mesh of thousands or even hundreds of thousands of GPUs. And that's what you need for frontier model training. It needs that level of scalability. Of course, an enterprise wouldn't need that.
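
[Editor's note: the scale-up and scale-out figures above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes every rack is fully populated with the 72 GPUs mentioned in the conversation; the function name `racks_needed` and the target cluster sizes are illustrative, not AMD figures.]

```python
import math

# From the conversation: up to 72 GPUs scale up inside one rack.
GPUS_PER_RACK = 72


def racks_needed(total_gpus: int, gpus_per_rack: int = GPUS_PER_RACK) -> int:
    """Smallest number of fully populated racks that can hold total_gpus."""
    return math.ceil(total_gpus / gpus_per_rack)


if __name__ == "__main__":
    # Illustrative cluster sizes: one rack, a mid-size cluster, a frontier-scale mesh.
    for target in (72, 1_000, 100_000):
        print(f"{target:>7} GPUs -> {racks_needed(target):>5} racks")
```

[Everything beyond the first rack has to be reached over the scale-out network, which is why the network trays matter as the GPU count grows.]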

Ryan Donovan: No. I mean, I'm not gonna have that on my desktop, right?

Mark Papermaster: You won't have it on a desktop. But that's the key to that modularity I described earlier. We take those same building blocks, and we can have it in exactly what you do need for inference: an air-cooled rack like you have today, or right in a workstation.

Ryan Donovan: You talked about the ROCm sort of software stack. I assume that routes the workload, but how much can that routing scale? You have 72 nodes on a single rack. Can it then scale to all of the ones connected by the network, or do you need a sort of more specialized hypervisor to do that?

Mark Papermaster: No, it's in ROCm. We have a communication library, RCCL. It's our communication software, and that's exactly what it's managing, ensuring very, very efficient scaling as you scale out to literally thousands, or tens of thousands, or even up to 100,000 GPUs.
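
[Editor's note: collective communication libraries exist to run operations such as all-reduce, where every GPU ends up holding the sum of every GPU's data. The sketch below simulates a generic ring all-reduce in plain Python to show the idea. It is a textbook algorithm used for illustration, not a description of how RCCL is implemented, and it does not call any RCCL API. For simplicity it assumes each rank's vector has exactly one element per rank.]

```python
def ring_all_reduce(data):
    """Simulate a ring all-reduce (sum) across len(data) ranks.

    data[r] is rank r's vector; each vector must have len(data) elements,
    so that each element acts as one chunk. Returns every rank's final vector.
    """
    n = len(data)
    bufs = [list(vec) for vec in data]

    # Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) % n to
    # its right-hand neighbour, which adds it to its own copy.
    for s in range(n - 1):
        sends = [(r, (r - s) % n, bufs[r][(r - s) % n]) for r in range(n)]
        for r, chunk, value in sends:
            bufs[(r + 1) % n][chunk] += value

    # Phase 2, all-gather: each rank now owns one fully reduced chunk and
    # passes finished chunks around the ring until everyone has all of them.
    for s in range(n - 1):
        sends = [(r, (r + 1 - s) % n, bufs[r][(r + 1 - s) % n]) for r in range(n)]
        for r, chunk, value in sends:
            bufs[(r + 1) % n][chunk] = value

    return bufs


if __name__ == "__main__":
    result = ring_all_reduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
    for rank, vec in enumerate(result):
        print(f"rank {rank}: {vec}")
```

[In a ring, each rank only ever talks to its neighbour, so the per-rank traffic stays roughly constant as ranks are added. That property is one reason ring-style collectives are popular at large scale.]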

Ryan Donovan: What's the biggest blocker in the manufacturing process for chips?

Mark Papermaster: So, on manufacturing, people say, "oh, look at what happened with memory. Memory's in such high demand," or, "are you gonna get enough wafers from TSMC, who's the number one chip wafer foundry?" And certainly, we use them extensively. We're one of their top customers. So, what we found at AMD is that the way to avoid manufacturing bottlenecks is to really plan your supply well in advance, and so we do that. We assure our supply. We remove manufacturing bottlenecks. Compared to writing software, it's all much, much slower. So, it takes us months to build these chips, and that's why you have to plan your demand well in advance. Agentic AI threw off the whole industry because there had typically been a ratio of one CPU to four or eight GPUs. Well, that's changing now. Now, you're seeing more and more CPUs relative to the number of GPUs just to process these agentic workflows. And so, that caught everyone off [guard]. So, we're racing to supply the demand because we plan, literally on an ongoing basis, about two years in advance, and this spike in demand for CPUs caught our customers off guard. They didn't see this coming, so therefore, we didn't get the demand signals.

Ryan Donovan: I did also wanna ask about the sort of manufacturing demand. I did read about the spike in the memory as it pertains to the consumer market; that folks like Western Digital have committed their entire 2026 manufacturing capabilities already. Do you see that spike in CPU demand affecting the consumer CPU market?

Mark Papermaster: It will, because the data center demand is going so high, and those are higher-margin products. So, it will put an additional constraint on consumer products, on phones, on PCs. Again, at AMD, we've got a great supply chain, so they're planning on an ongoing reservation cycle, two years in advance. So, we may see some impacts there, but as of right now, we don't. We've been able to contain it, but–

Ryan Donovan: But like you said, the AI software is moving so fast that the chip production you've had to shift, right?

Mark Papermaster: But Ryan, there's another trend that's going on, and that is small language models. So, people are looking to see, do I really have to run everything on the cloud or on a big, on-premise computer cluster that I build, or can I take some of the work that doesn't need a massive, trillion-parameter frontier model, but for what I'm doing in my business, can I contain that to a fine-tuned small language model? And that, for us, is very exciting. That's why we, over three years ago, AI accelerated all of our PC chips, all of our embedded chips, because you're gonna see more and more of the inferencing move to tailored small language models running at the edge; and it can still use the cloud or those big clusters that you have on premise to do your training and your big model fine-tuning.

Ryan Donovan: It does seem like there is a differentiation of the models that people are using where–

Mark Papermaster: Yes.

Ryan Donovan: They're using the big guys for the big stuff, and then everything else, evaluation, little routing, is going to smaller models and open source models. Yeah. Did you ever think about tailoring chips to specific models?

Mark Papermaster: Look, there's opportunity to do that. For a big company like AMD, we're selling to a broad community, so we won't be the one that races to create a tailored chip for a sliver of the market. But we have the capability with that agility I described earlier in our chat. And so, we watch it very, very closely. And if we see that sliver take off, or we see the trends, the signals that it's gonna be a growing workload, then we quickly can adapt with our chip design methodology to tailor to it. And the beauty is, it still all runs under our ROCm software stack, our open source stack, because we're not changing the programming constructs for the CPU or the GPU. So, for developers, they can have consistency, and then under the covers, we can tailor as we see certain workloads. We're seeing it right now with inferencing. Vibe-coding – that needs super low latency. But it's only on a portion. When you think about inferencing, there's a prefill stage and then a big decode stage, where you take the prompt and prefill, and you analyze what that is, and the decode is actually doing all of the inferencing to come back with the responses. But it turns out that that decode phase has multiple flavors. Some of it might need to be low-latency. Some of it might need to have higher batch sizes, which means more throughput. And so, we're seeing a real disaggregation in terms of how you can optimize to solve AI inference. Now, you start to see so many different variations of how people are using inferencing in industry.
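
[Editor's note: the latency-versus-throughput tension in the decode phase can be shown with a toy model. The sketch below assumes each decode step costs a fixed overhead plus a small per-sequence cost, and that every sequence in the batch emits one token per step. The constants `fixed_ms` and `per_seq_ms` are made-up values for illustration, not measurements of any AMD hardware.]

```python
def decode_step_ms(batch_size: int, fixed_ms: float = 20.0, per_seq_ms: float = 0.5) -> float:
    """Time for one decode step: each sequence in the batch waits this long per token."""
    return fixed_ms + per_seq_ms * batch_size


def tokens_per_second(batch_size: int) -> float:
    """Aggregate throughput: every sequence in the batch emits one token per step."""
    return batch_size * 1000.0 / decode_step_ms(batch_size)


if __name__ == "__main__":
    for batch in (1, 8, 64):
        print(
            f"batch={batch:>2}  "
            f"per-token latency={decode_step_ms(batch):6.1f} ms  "
            f"throughput={tokens_per_second(batch):8.1f} tokens/s"
        )
```

[Under these assumptions, a bigger batch raises total tokens per second but makes each individual user wait longer per token, which is the trade-off between latency-sensitive and throughput-oriented decode described above.]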

Ryan Donovan: Is there a particular workload that you all are thinking about building for? Is there something that you're like, "This workload is coming up. We don't have a fix for this"?

Mark Papermaster: Well, first of all, we have general-purpose CPUs and GPUs, so it all works.

Ryan Donovan: Sure.

Mark Papermaster: The question is, where might you fine-tune to get even more perform– really more tokens per watt, per dollar, right? 'Cause that's what it's all about. So, we're constantly tinkering with how to get more tokens per watt, per dollar, and so yeah, so it's actually a constant.
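
[Editor's note: "tokens per watt, per dollar" can be read in more than one way. The sketch below takes one plausible reading: throughput divided by power gives tokens per joule, and dividing that by system cost gives a cost-normalized efficiency score. The class name `ServerOption` and all numbers are hypothetical, chosen only to show the arithmetic.]

```python
from dataclasses import dataclass


@dataclass
class ServerOption:
    """Hypothetical inference system; all figures are illustrative."""
    name: str
    tokens_per_second: float
    power_watts: float
    cost_dollars: float

    def tokens_per_joule(self) -> float:
        # One watt is one joule per second, so tokens/s divided by watts is tokens/joule.
        return self.tokens_per_second / self.power_watts

    def tokens_per_joule_per_dollar(self) -> float:
        return self.tokens_per_joule() / self.cost_dollars


options = [
    ServerOption("A", tokens_per_second=10_000, power_watts=1_000, cost_dollars=20_000),
    ServerOption("B", tokens_per_second=12_000, power_watts=1_500, cost_dollars=25_000),
]

best = max(options, key=lambda o: o.tokens_per_joule_per_dollar())

if __name__ == "__main__":
    for o in options:
        print(f"{o.name}: {o.tokens_per_joule():.1f} tokens/J, "
              f"{o.tokens_per_joule_per_dollar():.6f} tokens/J/$")
    print("best by this metric:", best.name)
```

[Note that option B is faster in raw throughput but scores lower here, which is the kind of trade-off a combined metric is meant to expose.]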

Ryan Donovan: That is definitely– when I talk to people not in the software industry about AI, they point to a lot of the power consumption. What sort of levers do you have available to reduce that sort of watt per token per hour?

Mark Papermaster: Yeah. Guess what we found? It's not one lever. So, it has to be literally across the whole stack. So, we start at the most basic level, the transistor design. We work super close with TSMC, and we've actually altered their roadmap to adjust transistor devices to be more energy efficient. Then, how we put together, I mentioned the chiplets. We pick what's the most energy-efficient chiplet technology to choose, how you interconnect those chiplets. We've driven efficiencies there. How you package it together, how you deliver power to those chiplets, so we've driven efficiencies, how you deliver power. Now, when you put all those chiplets together, you package it—so, you have this big CPU or this big GPU module, and now how do you connect that? We've invested in photonics for the future generations to be able to connect at even more efficiency using photons versus copper. By the way, that won't be applied everywhere, 'cause it's more expensive. So, you'll pick and choose where you'd apply that connectivity. But then, it continues up the stack. We work with software developers so that they can optimize around our chips and move data around in a much more efficient way. We have coherency across our CPU and GPU that can reduce data movement. So, there are so many levers that can be applied. If you look at our CPUs, we've had 3D stacked SRAM for years. So, we have a next level of cache hierarchy 3D stacked, that just gets you tremendous performance benefits, but it's 3D stacked, so it's very energy efficient.

Ryan Donovan: We talked about vibe-coding a little bit, and efficiency of software. I've read that a lot of vibe-coded software is very inefficient. Do you worry that the software of the future will be a power drain? No matter how much you optimize the chips, we still have this sort of lousy vibe-coded software.

Mark Papermaster: We don't have the luxury to just work on chips and throw rocks at the application. We need to all collaborate.

Ryan Donovan: Yeah.

Mark Papermaster: It's the only way it's gonna happen. So, we deploy AI at AMD. We're using AI to speed our chip designs, to speed how we both design [and] how we validate them. We're using it to help us get our computer systems out to market faster, and even all the non-engineering applications. We're looking at where the power is consumed. We recently had the engineers identify how an agentic flow was using model context protocol, and the way it was being done, we realized could be done more efficiently. That gets the work done faster, so more token throughput, and that saves energy. More than ever, the industry has to band together and collaborate to drive energy efficiency.

Ryan Donovan: But it's interesting because I know Perplexity abandoned MCP in favor of just raw APIs.

Mark Papermaster: That's right. It's for the very reason I just described. They found it, it could get more efficiency.

Ryan Donovan: Yeah.

Mark Papermaster: But Ryan, that efficiency goes all the way up to even powering the data center, these gigawatt data centers. So, what we're finding is that the power monitoring at the data center shows that the GPU workloads can create spikes in the power consumption. That's very inefficient. So, as the energy industry and the data center operators identified this, we can put controls in to reduce those power spikes by slightly adjusting workflows when you're firing all those transistors at the same time. What creates a spike is when you get a lot of simultaneous switching. So, there are tweaks we can do that help there. So, it's just incredibly interrelated, all the facets of creating the compute capability, and then delivering this kind of training and inference capability in the most efficient way.
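
[Editor's note: a toy simulation shows why simultaneous switching produces spikes and why small timing adjustments help. It assumes each GPU alternates between a high-power and a low-power phase on a fixed period. The wattages, period, and offsets are invented for illustration and do not describe any actual control mechanism used by AMD or data center operators.]

```python
def gpu_power(t: int, offset: int, period: int = 4, high: int = 1000, low: int = 300) -> int:
    """Hypothetical GPU: draws `high` watts in the first half of each period, else `low`."""
    return high if (t + offset) % period < period // 2 else low


def peak_total_power(offsets, steps: int = 16) -> int:
    """Highest combined draw across all GPUs over the simulated time steps."""
    return max(sum(gpu_power(t, o) for o in offsets) for t in range(steps))


# All four GPUs enter their high-power phase at the same moment.
aligned_peak = peak_total_power([0, 0, 0, 0])

# Same work per GPU, but each one starts its cycle one step later than the last.
staggered_peak = peak_total_power([0, 1, 2, 3])

if __name__ == "__main__":
    print("aligned peak  :", aligned_peak, "W")
    print("staggered peak:", staggered_peak, "W")
```

[Each GPU does the same amount of work in both cases. Only the timing changes, yet the worst-case combined draw drops because the high-power phases no longer coincide.]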

Ryan Donovan: Yeah. I know you mentioned it a couple times, so I want to talk about the AI helping with the chip designs. I wonder what kind of models you use for that? Are those sort of special in-house trained models?

Mark Papermaster: They are indeed. We're a 56-year-old company.

Ryan Donovan: Right.

Mark Papermaster: We'll be 57, I think, later this year, and so we have a lot of history of chip design. So, we have proprietary models that we fine-tune with all of our data, and that helps us tremendously. That doesn't mean we don't use industry tools. We partner and use tools from all the computer-aided design and computer-aided electronics companies, so we use that, but we augment that with fine-tuning based on our data.

Ryan Donovan: Right. Have you found that the AI-guided chip design has unlocked new solutions, new things that your engineers hadn't thought of?

Mark Papermaster: Well, we're in the midst of that. So, to be honest with you, if you look at the last couple of years, we've been getting very, very nice point improvements. So, we look at one piece of the process, and we were getting, in some cases, 30% gains, 40% gains. But that doesn't mean the end-to-end workflow was getting a 40% productivity improvement. It was point improvements along the way. What's changed in the last months is that the models have gotten much better, and the learning on how to create effective agentic workflows has really evolved. And so, now that's where we're really getting the productivity gains, where we can apply these agentic workflows. So, you're just getting iterations done in parallel and nonstop. That's just more than our human teams could do. It's still very much driven by our expert engineers. So, they're setting up these flows. They are deciding where the human is in the loop and how to make sure you don't get garbage out. So, how you run agentic workflows has to really be architected and maintained, and so you have owners, and engineers are owning and managing these workflows. But now, we are indeed starting to see some breakthroughs in productivity, but it's quite early yet.
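
[Editor's note: the gap between a point improvement and an end-to-end improvement is the reasoning behind Amdahl's law. The sketch below assumes a "40% gain" means one step takes 40% less time, and that the step accounts for 20% of the total workflow. Both the interpretation and the 20% share are assumptions for illustration, not figures from the conversation.]

```python
def end_to_end_time_saved(step_share: float, step_time_saved: float) -> float:
    """Fraction of total workflow time saved when one step gets faster.

    step_share: fraction of total time spent in the improved step (0..1).
    step_time_saved: fraction by which that step's time shrinks (0..1).
    """
    return step_share * step_time_saved


def end_to_end_speedup(step_share: float, step_time_saved: float) -> float:
    """Overall speedup factor: old total time divided by new total time."""
    return 1.0 / (1.0 - end_to_end_time_saved(step_share, step_time_saved))


if __name__ == "__main__":
    share, saved = 0.20, 0.40  # assumed: step is 20% of the flow and gets 40% faster
    print(f"total time saved : {end_to_end_time_saved(share, saved):.1%}")
    print(f"overall speedup  : {end_to_end_speedup(share, saved):.3f}x")
```

[Under these assumptions, a 40% gain on one step saves only about 8% end to end, which is why improving many steps at once, as agentic workflows aim to do, matters more than any single point improvement.]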

Ryan Donovan: Yeah.

Mark Papermaster: These agentic workflows have been in the last months, not the last years.

Ryan Donovan: That is a fair point. And I wonder, do you think, like with AlphaGo, there was what they call move 37, which was a move sort of so inhuman and baffling, but it fit within the training data.

Mark Papermaster: We are already seeing some of those.

Ryan Donovan: Yeah?

Mark Papermaster: We've had problems that we've been trying to run optimization, getting more performance. We bang away with our top engineers. We're getting a percent of performance. Then, we put in an agentic workflow that literally is trying every kind– things we wouldn't have even thought of. And then, we're seeing a multiplicative improvement in performance.

Ryan Donovan: I mean, who would have thought outside of the box meant inside a black box, right?

Mark Papermaster: Yes. I love that description. I hadn't heard that before.

Ryan Donovan: Yeah. You mentioned open source software, the open ecosystem. How much of what you do is open, and how does that affect the very big customers that you have?

Mark Papermaster: Yeah. It's important for our big customers, our hyperscalers. You look at the Helios rack, that 72-node GPU rack (we're coming out with MI455 later this year). That's designed to an open rack standard, ORW, Open Rack Wide, done with Meta, and that spec was donated to the Open Compute Project, OCP. And so, why is that important to our big customers? It creates economy of scale. Other people can use it too, and as more people build to that, it brings the cost down. It just makes for more efficient, cost-efficient computing. But to enterprise customers, it has a huge value. So, let's say you're in oil and gas, or you're running a university compute cluster. If you go with our open ecosystem, you get that open ROCm stack, you can contribute to it, you can create your own private fork. But moreover, with that open ecosystem, you can pick many different hardware versions. You can have different network suppliers with you, different storage solutions. You're in control. All these inferencing solutions are starting to proliferate. It's not one size fits all. It's an open ecosystem with lots of partners creating different hardware solutions, different software solutions, different ISVs that can tailor. It's creating more choice for our customers, and it is not locking them in. So, we think it's a huge differentiator. But for your developers that listen to Stack Overflow, I'd really urge you, if you haven't kicked the tires on AMD, do it now, because it used to be, "oh, that ROCm stack, it's behind the competition." Well, AI has basically eliminated the moat. We can port code over, we can optimize it. We've been working for years on our new code promotion, our CI/CD process, to make sure that new code, as it's promoted, [has] been run through the right continuous integration and continuous delivery and has high quality. We've moved the needle tremendously on our software stack, and it's very, very robust.

Ryan Donovan: It's open. I mean, we love our open source here at Stack Overflow. So, what's the next thing you're anticipating and planning for?

Mark Papermaster: Well, what we're planning for is this incredible speed-up we've been seeing. So, as I said, hardware might run slower, but we have new GPUs every year for the data center. That's unheard of when you look at the history. It had been at about half of that rate. So, the speed will continue. What I think we're going to see more and more of is this collaborative innovation across the stack. We're going to see more and more understanding of which variants and combinations of inferencing take off, and how we can indeed optimize them and make them more efficient. And we're excited because we have great partnerships like the huge cluster that we announced that we're developing with OpenAI, and then, even more recently, with Meta. So, we really are seeing where the frontier models are going, and we're partnering with those in the industry. We have a number of financial institutions. Think about their demands running high-frequency trading. They have a whole different set of demands. You have well logging and all the seismic analysis with oil and gas. So, what I think is we're just seeing everybody starting to see how they tailor AI, and AI for science. That's the AI approximations combined with that high precision that we continue to support at AMD. We continue to support customers that need FP32, FP64, that full precision, along with approximate math formats like FP4 and FP8. We're going to support both, and again, give customers choice. So, you can see a lot more tailoring to what problem you are solving. And with the breadth of our portfolio from supercomputers all the way to the embedded edge, we couldn't be more excited to understand how we can help people with our open ecosystem tailored to their needs.
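
[Editor's note: the difference between full-precision and approximate formats can be seen by storing the same value at different widths. The sketch below uses Python's standard `struct` module to round a value to 32-bit and 16-bit floating point and compare the error against the 64-bit original. FP8 and FP4 are not available in `struct`, so FP16 stands in here as the low-precision example. The helper name `round_to` is illustrative.]

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round x to the float format given by a struct code, then widen back to FP64.

    "<d" is 64-bit, "<f" is 32-bit, "<e" is 16-bit (IEEE 754 binary formats).
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


value = 0.1  # Python floats are FP64, so this is the reference value

fp64 = round_to("<d", value)
fp32 = round_to("<f", value)
fp16 = round_to("<e", value)

err32 = abs(fp32 - value)
err16 = abs(fp16 - value)

if __name__ == "__main__":
    print(f"FP64: {fp64!r}")
    print(f"FP32: {fp32!r}  (error {err32:.3e})")
    print(f"FP16: {fp16!r}  (error {err16:.3e})")
```

[Narrower formats save memory and energy per operation but lose digits. That is acceptable for many inference workloads, and not for simulations that need the full precision mentioned above.]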

Ryan Donovan: Thank you for listening, everyone. I've been Ryan Donovan. I host the podcast and edit the blog here at Stack Overflow. If you have questions, concerns, comments, et cetera, you can email me at podcast@stackoverflow.com. And if you want to reach out to me directly, you can find me on LinkedIn.

Mark Papermaster: Well, Ryan, thanks so much for having me. Mark Papermaster, CTO at AMD, and for more information on topics that we discussed here today, check out the podcast I do called Advanced Insights that you can find on Spotify, on YouTube, and explore some of these topics in more depth.

Ryan Donovan: All right. Well, thank you for listening to this podcast, everyone, and we'll talk to you next time.
