Kong is an all-in-one API platform for AI and agentic workflows.
Marco previously joined the podcast in 2024.
Connect with Marco on Twitter.
Congratulations to user Mark for receiving a Lifeboat badge for their answer to 'Visual Studio Code: Expand the horizontal bar for scrolling tabs.'
TRANSCRIPT
[Intro music]
Ryan Donovan: Hello, and welcome to the Stack Overflow Podcast, a place to talk all things software and technology. I'm Ryan Donovan, your humble host, and today we are talking about AI agents and the API-ification of everything. My guest, returning to the program, is the CTO of Kong, Marco Palladino. How you doing today, Marco?
Marco Palladino: I'm doing great and thank you, Ryan, for having me here again.
Ryan Donovan: Of course. As the CTO of an API company, you must be sitting in the catbird seat a little bit. AI agents have ramped up API usage – what's the effect that you've seen as an API gateway provider?
Marco Palladino: Well, we're seeing that API consumption is being driven by the use cases. The world is building on top of APIs, and in the past, you know, we had mobile, which was a big driver for APIs; then the cloud native ecosystem was developing microservices – even more APIs. And like you said, AI is generating even more traffic on top of APIs. Whether we are consuming an LLM, or whether we're consuming MCP servers, or third-party APIs within our agents, at the end of the day, we're creating 10x more API traffic than we were creating before. So, this has certainly been a huge tailwind for API consumption in the world.
Ryan Donovan: With that, we've seen the MCP protocol come in as a standard. I was thinking about it before the show – MCP sort of looks like a kind of 'API gateway.' What do you think about that?
Marco Palladino: Well, so, MCP is a new protocol that was invented to be more friendly with the agents that we're building. Of course, you know, the agents can consume any API by implementing function calling, but MCP makes the advertising of the tools that are available, and the consumption of these tools, slightly easier and more intuitive from an agentic standpoint. So you know, if you were looking at MCP just by looking at the technology itself, effectively, it's an API protocol. It's not any different than REST. It's not any different than GraphQL. It's one way that we can use to consume third-party data and services. Now, because it's so friendly with the agentic ecosystem, of course, now there is a sprawl of MCP APIs that everybody's building so that they can build capable agents. You see, when building agents, there is lots of focus on the LLMs that we're gonna be using to provide the intelligence to the agents. But the agents, you know, they could be using the smartest model in the world, but if they do not have access to APIs or MCP servers to be able to interact with third-party data and services, well, then those agents are not gonna be very capable, because there is not much that they can do with that intelligence. So, it's very important—maybe even more so than choosing the LLM—to develop an ecosystem of integrations of APIs, or in this case, MCP servers, that the agents can consume, so that we can build smarter agents that can do more over time with our data.
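[Editor's note: For readers unfamiliar with MCP, here is a minimal sketch of the JSON-RPC shapes the protocol uses for tool discovery and invocation – the "advertising" and "consumption" Marco mentions. The tool name and arguments below are hypothetical.]

```python
# An agent discovers a server's tools with a JSON-RPC "tools/list" request...
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# ...and the server advertises each tool with a name, a description, and a
# JSON Schema for its inputs, which the model uses to decide when to call it.
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [{
            "name": "get_weather",  # hypothetical example tool
            "description": "Get the current weather for a city",
            "inputSchema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }]
    },
}

# Invoking a tool is a second JSON-RPC call with the chosen arguments.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "New York"}},
}
```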
Ryan Donovan: The MCP servers seem like both a boon for the agents themselves and for the folks providing them, in that they can limit and sort of direct where the agents go. Is that something you've seen? Is there something that is beneficial to the people providing the server?
Marco Palladino: You know, organizations' developers are building MCP wrappers right now on top of RESTful APIs simply because they wanna provide an even easier way – you know, because when things are easy, people adopt more of them, right? So, if it's easy, there's more adoption, that's just how it is, and if we can make it easier for agents to use APIs, via MCP in this case, then it's gonna be easier to create agents and make them capable. You see, the biggest problem with MCP right now is that everybody is hiring, you know, developers to be able to create a ton of MCP wrappers around all the APIs that already exist. You know, organizations already have tens of thousands of APIs that are ready to go, and they're now doing these huge efforts to basically create MCP servers on top of these APIs and make them available for consumption. Now, when they do that, there are a few challenges: we have to build them, we have to deploy them, we have to scale them, we have to secure them, we have to observe them. So, there is lots of work that goes into this MCP creation work. And at Kong, of course, we're working very closely with organizations that are going, you know, 100% into AI, and they're asking us for ways that we can simplify, you know, how they create MCP servers. And so, for example, you know, we provide MCP auto-generation, we provide a standardized way to secure all of the MCP servers using the new OAuth integration that the latest MCP specification describes, as well as the observability, you know, the governance, and all of that, when it comes to MCP. But for sure, there is a race towards creating MCP servers, because the sooner we have an ecosystem of MCP servers, the sooner we can build capable agents, and the sooner organizations can transform every business process into being an AI-native business process, or create AI-native products that end users and end customers can use.
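[Editor's note: The "MCP wrapper" pattern Marco describes looks roughly like the sketch below – exposing an existing REST endpoint as an MCP tool. This assumes the FastMCP helper from the official MCP Python SDK; the internal orders API URL is hypothetical. Auto-generation, as Kong offers, effectively does this per endpoint at scale instead of by hand.]

```python
import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("orders")  # the MCP server that agents will discover

@mcp.tool()
def get_order(order_id: str) -> dict:
    """Fetch an order by ID from the existing RESTful orders API."""
    # The wrapper adds nothing but translation: REST in, MCP tool out.
    resp = requests.get(f"https://internal.example.com/v1/orders/{order_id}")
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    mcp.run()  # serve the tool (stdio by default) for an agent to call
```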
Ryan Donovan: Well, I've talked to folks building MCP servers. It's actually a pretty simple protocol, from what I understand – a little bit of a thin layer on top of the API, and some folks have described it as a little bit janky. But you're talking about a lot of the, sort of, capacity planning, right? The securing, the load balancing, the traffic planning that comes with increasing the load that you're putting on the APIs, right?
Marco Palladino: Well, when it comes to MCP, I think it's just moving very fast, and it feels like the authors of the first version of the spec were probably caught off guard by the popularity that MCP got overnight. And so, you know, there were many things MCP didn't have – security was one of the biggest ones, and now the latest version of the spec describes an OAuth integration, where we can secure the access to the MCP servers. I guess, you know, there are lots of things that still have lots of opportunity to improve, and I think that one of the biggest problems—the biggest problem, I would even argue, that the industry is trying to solve right now—is: how do we impersonate end users in a chain of MCP tools that are being consumed? Because right now, most of these MCP servers are being used for internal use, but if we want to create an agent that an end user is using, like a regular user, then if our agent uses multiple MCP servers to do multiple things, we need to be able to impersonate the end user across the whole chain of commands that we're gonna run via the agent. And that is not a solved problem right now. There are many companies out there, and there is lots of experimentation on how to create, maybe, a credential store, how to do this securely, how to propagate authorization and entitlements across these third-party services, and my hope is that MCP will eventually come up with a standardized way to solve this problem.
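[Editor's note: One pattern being experimented with for the impersonation problem Marco describes is OAuth 2.0 token exchange (RFC 8693), where each hop trades the end user's token for a narrowly scoped token for the next downstream MCP server. This is a hypothetical sketch of one such approach, not a standardized MCP mechanism; the identity provider URL is made up.]

```python
import requests

def exchange_token(user_token: str, downstream_audience: str) -> str:
    """Trade the end user's token for one scoped to a single downstream service."""
    resp = requests.post(
        "https://idp.example.com/oauth/token",  # hypothetical identity provider
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": user_token,
            "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "audience": downstream_audience,  # e.g. the calendar MCP server
        },
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

# Each MCP call in the chain then carries a token that still identifies the
# end user but is only valid for that one downstream service.
```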
Ryan Donovan: Right. Yeah, 'cause that's exactly what I want when I run an agent: I want it to have my credentials, I want it to check my grocery list, or whatever. So, how do we do that securely? Because it seems like that's an easy way to just have tokens stolen and protocols breached.
Marco Palladino: Yes, and I guess that's one of the reasons why most MCP implementations right now are being used in secure private environments. But of course, we all know that while that's certainly useful for a variety of use cases, there are 100x more use cases if we can open up our agents to the public. And being able to provide a solution to this, I mean, becomes critical if you wanna build the agentic world – and we all truly believe that the prompt is gonna be the new browser in years to come. You know, we're not gonna be interacting with our services and applications using websites or front ends anymore. We're gonna be asking a prompt, and the prompt is gonna be consuming or invoking multiple agents, each one verticalized for its own scope of work, and these agents are gonna be performing the work for us. Like, for example, I'm not gonna be going on the Airbnb website and, you know, browsing and scrolling – I'm gonna be asking an agent, 'hey, I'm going to New York for this number of days – find something that fulfills my preferences.' And there's going to be another agent that has all the preferences I want stored somewhere that will communicate with this booking agent – and then find the flight, find the Airbnb house, you know, and so on and so forth. So, I truly believe that if we're moving to a place where agents are going to be the new users of the internet, because they're being triggered by humans like me—by the regular consumer—well then, of course, we have to solve these and many other problems. But the impersonation of the end user across the chain of requests that we're making through all the function-calling APIs or the MCP servers – that has to be solved. Otherwise, this will never become a reality.
Ryan Donovan: Right. I've been saying in a few places that we're probably gonna go to one single point of contact with the internet, be it a browser or a terminal, and that every SaaS company is gonna be an API. But it's interesting that we're sort of working this out in real time, 'cause agents are a pretty new capability—like less than a year maybe—that people have been banging on.
Marco Palladino: Yeah, I think it's pretty new, but it's one of those things that once you start seeing it, you cannot unsee it anymore. You know, it's obvious that we're going to be having an army of agents following us for all sorts of things that help us in our regular life or even professional life, and it seems inevitable that this is gonna happen. And when that happens, I think we're gonna be seeing one of those shifts of technology of the likes we've never seen before. You know, back in the day, businesses used to be on the Yellow Pages – you had to have your phone number as a business on the Yellow Pages, because that's what customers were using back then. You know, they were finding your business through the Yellow Pages, and so you had to have your phone number there. And then you had to have a website. Why? Because consumers moved over to the internet, and if you wanted to get customers, you had to have a website. And with the mobile revolution, we've seen that, you know, 15 years ago, customers were moving to smartphones, and if you wanted to target them, you needed to have a smartphone application. Well, what's happening now is that the consumer is going to move to the agents, because I'm gonna be asking agents to do things for me, and so there's gonna be a whole new internet that's gonna be agentic. That's why, you know, everybody's excited about this agentic world, because agentic truly is the next iteration of the internet as we know it, and there's going to be a whole new set of technologies, specs, and places where these agents are gonna be competing with each other and using third-party services. And so MCP happens to be an early iteration of this agentic world. And I dunno – the industry is moving so fast, I dunno if MCP is gonna be here for the next 10 years or if something else is gonna replace it. But the thing is, the genie is out of the bottle, and now everybody can see this agentic world, and everybody wants to go and build it. And I think it's a world full of opportunities. You know, if you're an entrepreneur today, you're basically looking at everything that we have today, from infrastructure to observability, to analytics, to even billing. All of that is going to be reinvented for the agentic world, and the opportunity for entrepreneurs to go build a great business always happens when there are these massive transitions in technology, because this creates an opening – an opportunity to come in where the incumbents are lagging behind because of the innovator's dilemma. And so I think it's gonna be a very, very exciting decade for the industry, for technology, but also for entrepreneurship.
Ryan Donovan: You all have done some recent surveys on how much the enterprise is actually jumping in and investing. I'd like to hear a little bit about that survey – and then, as you know, we've also released some survey data recently.
Marco Palladino: Well, so we did launch a survey. We interviewed, you know, hundreds of IT leaders and professionals, but also Kong users and Kong customers, and basically the majority of them—72% of them—are expecting to increase their, you know, LLM consumption over the next year. I was wondering, you know, what about the others? And my opinion is the others are the ones that are gonna be disrupted by these 72% that will, instead, go all-in and will exceed their investments when it comes to AI, because there is no other way. The other ones are gonna be disrupted, which brings me back to the opportunity that I was talking about before – you know, there's gonna be lots of companies out there that will not catch this opportunity, and there's gonna be a massive opportunity for others to step in. What we have noticed is that there is a huge increase in adoption of Google's models, you know, Gemini, Vertex, compared to the early days. If you remember, in the early days, Google was struggling a little bit. OpenAI came all of a sudden and almost ate Google's lunch – it seems like Google has recovered from that, and now 69% of the respondents reported using Google's models. And then, you know, just to wrap this up, something else that was very noticeable from the survey that we ran is that everybody's concerned about governance, security, guardrails... 'cause you know, there is no way to deploy agents in production without having full ownership of, 'what are these agents doing? What data are they interacting with? Is there anything that they're sharing that they should not be sharing?' So, there is lots of infrastructure work that needs to be built in order to provide production-ready agentic environments. And to be fair, you know, as the CTO of Kong, I see this as our opportunity to go and provide that infrastructure to the organizations that we work with.
Ryan Donovan: That's right. There's a man selling shovels right here. With our data, we've also noticed that people report planning to increase their AI usage, but at the same time, we saw that trust in AI tools fell. So, as a man providing tools for AI agents, how can you improve the trust that folks have in LLMs and AI agents?
Marco Palladino: I think we're moving and we're switching phases of AI adoption. You know, for the last couple of years, we had a bunch of AI teams that were working in a corner, you know, building the early agents, using, you know, their own tooling – they were building their own security, and so on. Now, of course, this does not scale across thousands of developers that need to get access to these AI opportunities. So, of course, you know, now that we have validated the first set of use cases, now that we have validated that AI can indeed help transform business processes, that it can help with providing new, engaging experiences to our customers, well, now we need to scale this, and we need to scale it in production. It's not enough to just have it behind the firewall. And of course, when you're making the shift from experimentation to production, well, there are a bunch of things that need to happen, right? First and foremost, we don't want the developers to reinvent the wheel every time they're building an agent. So, there are lots of crosscutting requirements that every agent needs to have. Think of observability for what the agent is doing. Think of creating tiers of access to the models. Think of being able to apply blanket policies for PII sanitization so that we don't expose customer data, you know, to the agent. Think about having security envelopes in place for detecting and preventing prompt injection attacks, and the like. You know, so there is lots of infrastructure that needs to be built that we definitely—and I say this for a fact—that we definitely do not want to reinvent for every agent that we deploy. And so, I think at this point, the platform teams, the core teams, the ones that are supposed to support the developer teams that are building the agents – you know, now the ball is in their court: okay, come up with a platform that can help all of these developers build agents that are, by default, working on an infrastructure that is secure, observable, governable, and so on. And so, we're seeing the big rollout now of these AI-native platforms that will allow the agents to move to the next stage.
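[Editor's note: A toy example of one of the blanket policies Marco mentions – PII sanitization applied once, in front of every model, rather than reimplemented inside each agent. The regex patterns are illustrative, not production-grade.]

```python
import re

# Illustrative patterns only; real PII detection is far more involved.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def sanitize_prompt(prompt: str) -> str:
    """Redact PII from a prompt before it is forwarded to any LLM."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED_{label.upper()}]", prompt)
    return prompt

print(sanitize_prompt("Email jane@example.com about card 4111 1111 1111 1111"))
# -> Email [REDACTED_EMAIL] about card [REDACTED_CARD]
```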
Ryan Donovan: It feels like AI and AI agents are sort of speedrunning the web development pipeline – like, we're already getting to these frameworks and these, sort of, almost service-mesh-type things that include all the platform stuff, so the folks building AI agents now just have to write the business logic.
Marco Palladino: You know, I think that the, you know, the industry is recognizing how important this whole movement is. You know, it's very hard to ignore. You know, we are working with customers in the financial services industry, and they're telling us that they've been innovating, using AI, on the concept of a wealth advisor. You know, back in the day, wealth advisors were humans that were allocated to ultra-high-net-worth individuals. And now they are increasing the engagement of everybody else – not just the super rich, but every banking customer. They're now, you know, starting to experiment with AI-powered wealth advisors that increase engagement, do portfolio analysis, you know, alert on risk management, and so on and so forth. And we're hearing from them that this is increasing the engagement of their customers by up to 50%, you know? And so when you're starting to see that AI is adding such a tangible, measurable, positive effect on the customer base—on the conversion rates, on the engagement rates—everybody cannot wait to go and build agents to deploy, because there is a tangible business outcome that's being created with them. And so sometimes these agents are pushing the boundaries of, you know, wanting to go to production, and pushing the boundaries of the platform teams as well – they need to support these agents, because everybody's basically building these as we speak. Now, of course, regulated industries – they can only move so fast, you know, because at one point, they have to bring everything to a halt if they don't have the right infrastructure; and so this is where Kong comes in, this is where, you know, other solutions come in, and this is how we're trying to create a production-grade environment for these agents to run in – giving the confidence that they're running in a predictable, secure, measurable way so that they can go to production.
Ryan Donovan: I've talked to observability folks and talked to LLM folks, and trying to get that observability piece for LLMs has been difficult. I think Anthropic has some research on, like, spotting how a model is thinking, but overall, how are you thinking about observability for LLMs and agents?
Marco Palladino: Yeah, so that's a very interesting problem to solve, especially because we have learned that many developers are actually using many different models, and not necessarily from the same provider. So, we have different models with different characteristics, and even different pricing points, and so we're seeing, more often than not, organizations wanting to use multiple models—sometimes even self-hosting their own models—but, you know, using them from multiple vendors. You know, when it comes to observability, what we do not want is to have a different observability policy in place for each vendor, for each LLM – we want to have one thing that runs across all the vendors and all the models. And observability starts from a basic understanding, you know: 'what agents are consuming what models?' And then it goes deeper into understanding, you know, how many tokens these agents are consuming, how much spend they're generating, whether there is any PII that they're sending—or that's being sent back—that we should be identifying, how many times the guardrails have been invoked, or, you know, how many times a guardrail has stopped a response from being sent back. And then, you know, if you go deeper and deeper into the observability space, we enter more exotic observability requirements, like, for example: needing to know what was the decision-making logic that made an agent, or a model, make a specific choice versus another choice. You see, the agents that are interesting in the future, even now, are the ones that are going to be making decisions. But if we ever give the ability—the agency—to an agent to make decisions, then we need to observe what were the criteria or the parameters that led the agent to make a specific decision, so that we can improve it over time. So, when we look at decision-making tracing, or decision-making auditing, I think we're entering something that's very much needed, but very much still an exotic requirement, or an exotic space, and the industry doesn't really have out-of-the-box solutions for this just yet. I think there's lots of opportunity here to innovate.
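[Editor's note: The first layer of what Marco describes – one accounting wrapper across all vendors and models, rather than per-provider telemetry – can be as simple as the sketch below. The model prices and the client interface here are hypothetical.]

```python
from collections import defaultdict

# Hypothetical per-model prices; real ones come from each vendor's rate card.
COST_PER_1K_TOKENS = {"cheap-model": 0.0005, "reasoning-model": 0.015}

usage = defaultdict(lambda: {"tokens": 0, "spend_usd": 0.0, "guardrail_blocks": 0})

def record_call(agent: str, model: str, tokens: int, blocked: bool) -> None:
    """Accumulate token counts, spend, and guardrail hits per agent."""
    usage[agent]["tokens"] += tokens
    usage[agent]["spend_usd"] += tokens / 1000 * COST_PER_1K_TOKENS[model]
    usage[agent]["guardrail_blocks"] += int(blocked)

# The same hook wraps every vendor's client, regardless of provider.
record_call("wealth-advisor", "reasoning-model", tokens=2_300, blocked=False)
print(usage["wealth-advisor"])
# {'tokens': 2300, 'spend_usd': 0.0345, 'guardrail_blocks': 0}
```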
Ryan Donovan: I think the LLM is still itself a pretty big black box. I just remember years ago in college looking at what a neural network actually is, and it was just a big sum function. I was like, 'how do you understand this?' You know, I've seen some folks having things like confidence scores, and other things like that, but being able to trace it and being able to understand exactly how it got that information is going to be key.
Marco Palladino: It's a multi-billion-dollar opportunity. If the world is moving to agents, and agents are making millions, billions, tens of billions, trillions of decisions every second, every minute – whoever builds the infrastructure to be able to track this decision-making process, and improve it and make it better... I mean, really, there is a multi-billion-dollar opportunity here, to be conservative.
Ryan Donovan: You mentioned things like checking the guardrails, checking the tokens—are there ways at the API layer—at the API protection layer, to manage that spend? To reduce tokens before it gets to the expensive part?
Marco Palladino: 100%. And you know, this is the bread and butter of what Kong does. You know, Kong started as an API management platform, but over time, we expanded into thinking of ourselves as a connectivity platform. Everything that's an API, that's an event, or that's AI – it's connections that are flying from one place to another, and so our platform allows us to basically provide the right controls, and governance, and accelerators for some of these capabilities. And so, when it comes to AI and agentic consumption, the LLMs themselves are APIs, our MCP servers are APIs – you know, we want to have a bird's-eye view of everything that's happening in our agents, and, you know, Kong is innovating and iterating into providing this type of solution. Basically, you know, it's very hard to improve our agents, or our understanding of the LLM consumption that we have—including cost, to your point—if we do not have the infrastructure in place to track all of these metrics and put them in a chart somewhere. And so, what Kong provides is the ability to associate a cost on a per-token basis so that we can see which agents are spending the most money. And by the way, we can also implement semantic routing, such that we can route simple prompts to cheaper models and more expensive prompts to reasoning models, which are more capable, but also more expensive. And so there is a whole area of cost optimization and token optimization that we're very invested in and innovating a lot on to be able to reduce that cost. Another one is the prompt compression capability that we have. We can basically save up to 5x of token consumption by shortening the prompt – compressing the prompt, like a zip file, you know? But we compress the prompt, not a file, and we retain 80% of the semantic meaning of that prompt, so that, basically, the prompt doesn't change its actual intent or meaning; and at the same time, we can save up to 5x token consumption to respond to that prompt. So, organizations that are farther ahead in AI are now looking at ways to optimize spend. Now, with that said, there are also many organizations that are not looking at spend so closely, because they're still trying to find the right use cases that they want to launch – they're still trying to find out what are the right horses to bet on before they deploy these experiments, and then, after they're successful, they start optimizing them.
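[Editor's note: A minimal sketch of the semantic-routing idea: classify each prompt and send simple ones to a cheap model and complex ones to a pricier reasoning model. A real implementation would classify semantically (e.g., with embeddings); this keyword heuristic is a stand-in, and the model names are hypothetical.]

```python
REASONING_HINTS = ("analyze", "compare", "plan", "prove", "step by step")

def route(prompt: str) -> str:
    """Pick a model tier from an (over-simplified) complexity signal."""
    if len(prompt) > 500 or any(h in prompt.lower() for h in REASONING_HINTS):
        return "reasoning-model"  # capable but expensive
    return "cheap-model"          # fine for simple lookups and rewrites

assert route("What's our refund policy?") == "cheap-model"
assert route("Analyze this portfolio and plan a rebalance") == "reasoning-model"
```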
Ryan Donovan: Are there cases you would recommend people to not use AI?
Marco Palladino: I would challenge that: are there cases where you wouldn't want to use AI – and if so, why not? AI is a new mindset shift. It's a new paradigm. It's a new way of thinking of solutions to problems. My belief is that we should ask ourselves: of the 1,000, 3,000, 10,000 business processes that we have today, which of these business processes could potentially be driven by AI, so that we can take valuable and expensive human resources and reallocate them to more complex problems that are actually going to push the organization—the leadership, the vision of the organization, the execution—forward, and delegate all the busy work, delegate all of that, to agents. So, I guess the question that I would ask organizations is: think about all the business processes that you have – can you replace them with AI? And if you can, what's stopping you from doing that? Because there is a competitor next door that is working day and night to replace those same business processes using AI, and guess what? That competitor is going to get a productivity gain that in the long run is going to make the difference. So, what are we waiting for? That competitor is just moving up the levels of abstraction and automating more things. Like, ultimately, I think that's what it comes down to with AI agents.
Marco Palladino: I'm not a big believer that AI will replace our work. You know, I think that our jobs just evolve in light of innovation, and so what we will see is more specialization on tasks that humans are much better suited to solve, and AI does everything else – it does the busy work, and the busy work, you know, we shouldn't be doing it in the first place if there is something that can do it in an autonomous way and in an intelligent way. So, why are we spending time doing that busy work?
Ryan Donovan: It's funny, I just read a great historical question on one of our Stack Exchange sites about this guy who's saying, 'oh no, I'm going to automate myself out of a job.' And then he updates it: no, he didn't automate himself out of the job – he got more work, got promotions, and by the end of it, five years later, he's CTO and he's leaving the company.
Marco Palladino: I do believe that there must be, you know, a drive to always aim for more and more specialization, and the individuals that have that curiosity, that want to specialize – they have nothing to worry about. And obviously, there are going to be lots of generalists that will have to also reinvent themselves in this new world, so that, you know, as AI takes on some of these generalist problems that it can solve autonomously, you know, they also evolve. I mean, look, humans and our skillsets are not any different than an organization: an organization always has to transform itself or disrupt itself, and likewise, we have to disrupt ourselves, or we're going to be disrupted by technology. But I don't believe—there are many scary headlines out there, 'oh, you know, it's going to put the whole world outta business'—I don't believe that. I think that the world will specialize in tasks that AI cannot yet do. I think it's very interesting how software is now being built by AI, as well. You know, there's a whole movement of – by the way, MCP: we spoke about MCP as a very important component of how these new modern IDEs and, you know, software development processes are integrated with third-party services and tools. I think that what we're seeing is that the security market looks like it's about to get 100x bigger because of all the code that's being shipped by AI, and AI is not mature enough yet to understand that it's shipping some serious vulnerabilities out there. But, you know, look, it's very fascinating. I'm very bullish. I think that the next 10 years are going to be very exciting, and you know, what's more exciting than seeing it? Building it. And so, there are lots of builders out there—and I consider myself a builder at Kong, I consider the company a builder in this new world—and there's nothing better than chiming in and building and helping a little bit with what we can.
Ryan Donovan: All right, ladies and gentlemen, thank you for listening today. It's come to that time of the show where we shout out somebody who came on to Stack Overflow, dropped some knowledge, shared some curiosity, and earned themselves a badge. Today we're shouting out the winner of a Lifeboat badge: somebody who found a question that was drowning at a score of -3, and answered it so well that they brought up the question itself and got themselves 20 points. So congrats to Mark, who dropped an answer on 'Visual Studio Code: Expand the horizontal bar for scrolling tabs.' If you're curious, we'll put it in the show notes. I'm Ryan Donovan. I edit the blog and the podcast here at Stack Overflow. If you have comments, concerns, or topics to cover, please email us at podcast@stackoverflow.com. And if you want to reach out to me directly, you can find me on LinkedIn.
Marco Palladino: My name is Marco Palladino. I am the CTO and co-founder of Kong. You can find me on Twitter or X at @subnetmarco, and you can learn more about what Kong is doing at konghq.com.
Ryan Donovan: Alright everyone, thank you for listening, and we'll talk to you next time.