This week we chat with Holly Cummins, the worldwide development practice lead for the IBM Garage. She shares the lesser-known history of microservices and strategies for optimizing cloud computing to help companies save money while helping the environment.
Episode Notes
You can find some more of Holly's work and bio here.
She gave a great talk at KubeCon 2020, How to Love K8s and Not Wreck the Planet, which you can watch on YouTube here.
And here's a lovely presentation, Containers Will Not Fix Your Broken DevOps Cultures, drawing on her long history of programming and consulting.
TRANSCRIPT:
Holly Cummins What we see is often, and I think it's all of us, it's just human nature, we tend to focus on the solution rather than the problem. And we get really excited by the solution. And maybe actually the problem is deeper.
[INTRO MUSIC]
BP Good morning, everybody. Welcome to the Stack Overflow Podcast.
Sara Chipps Hey, Ben. Hey, Paul.
Paul Ford Back again!
BP Well, we have a special guest with us today. Holly Cummins. Holly, would you like to say good morning?
HC Morning!
BP Where are you in the world today?
HC I'm in the UK, just south of London.
BP And it's Dr. Holly Cummins. What's your title over there? JavaOne Rockstar, no actual rock and roll involved. [Holly laughs] That's what it says in your email signature.
HC Yeah. Yeah, as you go through life, you sort of collect these little things. So my work title is that I'm the Worldwide Development Discipline Lead for the IBM Garage, which as you can hear rolls off the tongue. And I'm also an IBM Q Ambassador and a JavaOne Rockstar.
PF Those are great titles. For Americans, that's garage. Like I just imagine half the-- "what's a garage?" [Holly laughs]
HC Yeah, I find that even in the same sentence, I'll flip back and forth between the two.
PF Mhm, of course. Of course. So first of all, it's great to have someone here from IBM, because we should talk a little bit about the fact that IBM is utterly changing in a vast and important way, or sort of has, right? So rather than me mangling what's happening, could you explain what's happening to IBM?
02:05
HC Yeah, so it's not my specialty subject. But what we're doing is we've got two different business arms, one really focused on infrastructure services, and one focused more on hybrid cloud and AI. And the decision is that it makes more sense for these things to be managed as separate companies, because the strategy wants to be different for those two things. And they have different capital requirements. And they just end up feeling quite different.
PF So now there's going to be essentially two, or sort of two IBMs? Or how's it gonna work?
HC Yeah, the new one, we don't know what it's gonna be called yet. So the new one is sort of going through life as NewCo, and at some point it will get a more formal name.
PF Great example of shipping early, like just go ahead. [Holly laughs] Take your giant century-plus-old megacorp, split it in two and be like, "We'll figure the name out, don't you worry." So I mean, I know this is not your department, it's just sort of here we are. IBM is making a very big move around how it competes in the cloud space and how it does services. And so it's worth acknowledging that yet another giant transformation is upon us in our industry.
BP So Holly, tell us a little bit about how you came to the world of computer science and engineering and what it is you do specialize in?
HC So yeah, my background is actually not in computer science. It's in physics, although that's actually true for, I think, quite a lot of people in computer science. The team I'm in, we're sort of a services organization, a consultancy organization. And at one point, I looked around, and half of us had a physics background. So physics is sort of the second-choice degree, I think, for computer science. But I've been with IBM for about 20 years, and I've been kind of steadily moving my way up through the stack in my time at IBM. So I started out working on the performance of the JVM. So super low level. And that was really cool. Because I knew if I made like a half a percent performance improvement, there's so many servers that are running that JVM, it would just have this huge impact on the world. So that was kind of cool. And then I went up the stack a little bit, and I started working in WebSphere. So I did a lot of work with WebSphere and OSGi. And then I went and started working more on--
04:30
PF Wait, pause, because, no, you're fine, but many of our audience members are going to be younger, [Holly laughs] and they won't know. We need to tell them. I know WebSphere is actually still out there in some ways, but what is WebSphere? What is OSGi?
HC Yeah, so they don't necessarily need to be in the same sentence. So WebSphere is, I think, one of the real powerhouse application servers. Traditionally it didn't have an open source presence, and so that meant that when you looked at statistics for application server usage, it wasn't very prominent, but then if you looked at the actual market for application servers, it was quite prominent. And you know, a lot of the big banks, a lot of those really powerful workloads were running on WebSphere, and still are running on WebSphere. Because like a lot of the rest of us, WebSphere has been changing, has been keeping up with the times. And that's where OSGi comes in. So OSGi is a modularity technology for Java. It never quite made it big, but it's had some really big impact in some specialized places. They coined the term microservices, actually, way before the distributed microservices that we all know. And it was solving a lot of the same problems of, "I have 2 million classes on my class path, I don't really know which one is going to be used at any particular time, I definitely don't want them all visible on my class path. And I know that those people over there are using the wrong one. And they're using the wrong version. So how do I try and sort this out and put in some boundaries?" So instead of using the network to put in the boundaries, it used a modularity system, and then it gave you that dynamism as well, so that you could invoke services and not necessarily have to know about and compile in those services beforehand, but just invoke them dynamically. So what WebSphere did was they said, let's take this modularity technology, let's apply it to ourselves internally. And then that means that we can have this super modular, super lightweight server that starts up in like three seconds, because there's hardly anything to it. And then we can bring in those capabilities, rather than having to do what the model had been before of, we have to ship everything and load everything, because what if someone needs it at some point?
06:48
SC Is WebSphere specific to Java?
HC It is. The sort of the end of the story is that what WebSphere needed to do was they needed to modernize. But then at the same time, they had a lot of clients who really, you know, wanted to continue using it. Thank you very much. So we sort of figured out a way to replace the core while keeping the externals the same, and that became WebSphere Liberty, so it was sort of a variation, and now it's Open Liberty. So it's, um, now it does have, you know, there's really good usage metrics, because it's one of the open source application servers, but yeah, it's, it's Java.
PF Typing it in, going to look at it. WebSphere Liberty, I didn't know about WebSphere Liberty. This is one of the things where I'd--
SC I didn't know about WebSphere.
PF Oh, WebSphere to me is like, that was always the very serious one.
SC Wow, that's interesting.
PF Yeah. I mean, if you needed to build a large web platform for a large financial institution, WebSphere was always on that shortlist and sort of hand in hand with IBM consulting. So it totally is, but here it is, it's out in the world. WebSphere Liberty.
SC That's great, I think I always tried to avoid doing that, to avoid building large scale applications for financial environments. [Sara laughs]
PF I'm writing my, I'm writing my next blog platform on WebSphere. Okay, so I derailed completely. Can you continue from there and get us caught up to the present?
08:04
HC Yeah, so, I spent a while helping to build WebSphere Liberty. It was such a good thing that we did there, because if you go and read up on it more, you can see it has these amazing startup times, it has this amazing footprint. And there's kind of a fun accident in the history of WebSphere Liberty, because the market that they were originally thinking of when they wrote it was developers. They said we need to have a development experience that's really delightful. And so it has to start up really fast. And it has to be able to cope with running on a laptop. Because, you know, a developer isn't going to have a huge data center accessible to them. At the time we wrote it, that was the case. And then we realized that those exact same requirements that developers have, that it can come up and down really fast and that it has a really small footprint, are exactly the same as what you need in the cloud, because there your footprint is money. So it ended up being a really natural transition for Liberty, from working well for developers to working well in the cloud. But I also shifted from building products to helping people use products. So I moved into the Garage, which at the time was called the Bluemix Garage. And we're really operating like a startup within IBM to work with other startups, help them take advantage of the cloud, get the most out of the cloud.
PF So now tell us about a typical day. What are some of the things that you do?
HC So I, all sorts, usually I'll be talking to some customers who are maybe looking and trying to figure out, can we help them solve their problems? And then the first question there is always well, is the problem that you think you have the problem that you actually have? And so then that's part of it.
PF Well actually, pause there. Let's play that out. Let me be a customer. Tell me a typical problem they come in with, I'll be the customer.
09:51
HC What we see is often, and I think it's all of us, it's just human nature, we tend to focus on the solution rather than the problem, and we get really excited by the solution, and maybe actually the problem is deeper.
PF Holly, I need machine learning. Is it like that? I mean, how do they, what do they?
HC Yeah, yeah, often it tends to be quite fashion driven as well. Again, I think, you know, that's just human nature. So it's sort of, "Holly, I need machine learning," or, "Holly, I have data and I feel sure there must be something to do with the data, but I don't know what the right question is."
PF Right, I know I have a data lake, you know? Yeah, let's do something. Yes. Yeah.
HC Or a couple of years ago, it was, "I need a chatbot." Well, what's the chatbot going to do?
PF Oh, okay, I stumble in. I say, "Holly, I need an ML-powered chatbot. I read about it in CIO Monthly. Now, what do I do? Tell me what I need to actually be thinking about."
HC Yeah. So then what we'd usually do is some design thinking techniques to try and drill down to say, well, who's your user? And what problems are they actually having? Because the problems that they're having aren't necessarily the ones that have the most exciting technological solution, even though, because we work in technology, we always want it to be the exciting solution. My favorite was we had a client in a manufacturing scenario, and they wanted to do a chatbot for password reset. Because, yeah, I think what happens is that we naturally want to make an incremental change. So the way that they reset their passwords at the moment was they rang someone up. And they said, well, let's automate this. So instead of ringing someone up, what if they could just talk to a computer, and that would cut the person out of the process? So it feels like this really natural transition. But when we dug into it, we realized the reason they needed to reset their passwords all the time was because they were using these handheld devices while wearing really heavy gloves, because it was this factory floor scenario. [Ben & Paul laugh] And so they were fat-fingering the password. And actually, they didn't really even need a password at all. So they could have just had a QR code on the wall that they photographed in order to authenticate into the devices. Or they could have just changed the system so that instead of giving you three attempts to get it wrong, it gave you 10 attempts to get it wrong.
12:09
PF Or they could make an amazing little chatbot that pops up and says, "Hey, do you want to know your password?" [Holly & Paul laugh]
HC And the punchline is that it was going to be quicker and cheaper for them to write a chatbot in the cloud than it was for them to make the one-character change in their back-end system to allow more password attempts.
PF Ouuff.
HC Yeah so you sort of go around in a circle where the problem that you think you have isn't necessarily the right problem, nor is the second problem, you have to like--
PF You're digging deep into legacy stacks while you're doing this.
HC Yeah.
PF So okay, so a lot of kind of cloud advisory. And I know, one of the things I want to, I want to just sort of jump forward to, because it's fascinating to me, personally, is you do a lot of work around climate and sort of working in IT, and how to factor climate into your strategy. So you know, I think it's safe to say we have a maximalist very real sense of climate change coming and coming to IT. And it'd be great to get your perspective. So you know, you're talking to lots of developers, what should they be thinking about?
HC Yeah, I mean, all of us, I think, have this awareness of the problem, and we want to do something. And then it's such a huge problem that it's hard to even know where to begin. But for all of us there's one thing that we can do, which is so easy: we can start reducing waste. And there's a question of, is reducing waste enough? Do we need to actually do a more fundamental transformation? But before we even get to that question, we should say, okay, some of this waste is such low-hanging fruit, and we're just burning resources for no reason. So let's sort it out, because it's a double win: as well as reducing the energy usage and reducing the emissions, we're saving money. So like, what's not to like?
13:59
PF So how do we reduce waste? What would you advise?
HC There's two things. One is just often, and I think the cloud makes this way worse, what ends up happening is we do an experiment, and we put a server up on the cloud. And then we forget to turn it off. These things, they just sort of keep going and going and going. And there was a team, they did a survey. And they looked at 16,000 servers, so a fairly decent-sized survey, and a quarter of them were doing no useful work. So there's just these--
PF So there's a 25% savings right there.
HC Yeah, so you can just chop that off.
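As a rough illustration of the kind of check Holly is describing, here is a minimal Python sketch that flags servers whose peak CPU utilisation never rises above a small threshold. The server names, the sample figures, and the 5% cut-off are all assumptions made up for this example; the conversation does not say how the survey decided a server was doing no useful work.

```python
# Hypothetical sketch: flag servers that look idle based on their peak CPU use.
# The threshold and the sample data are illustrative assumptions, not figures
# from the survey mentioned in the episode.

def idle_servers(peak_cpu_by_server, threshold=0.05):
    """Return the sorted names of servers whose peak CPU fraction is below threshold."""
    return sorted(
        name for name, peak in peak_cpu_by_server.items() if peak < threshold
    )


def idle_fraction(peak_cpu_by_server, threshold=0.05):
    """Return the fraction of servers that look idle (0.0 for an empty fleet)."""
    if not peak_cpu_by_server:
        return 0.0
    return len(idle_servers(peak_cpu_by_server, threshold)) / len(peak_cpu_by_server)


# Made-up fleet: server name -> highest CPU fraction seen over the sample period.
sample_fleet = {
    "web-1": 0.62,
    "db-2": 0.40,
    "batch-7": 0.01,
    "demo-old": 0.0,
}

print(idle_servers(sample_fleet))
print(idle_fraction(sample_fleet))
```

A real audit would probably look at network and disk activity as well, since a quiet CPU alone does not prove a server is unused.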
SC Yeah, I'm thinking about the servers that I have running right now. I'll turn them off after. [Holly laughs]
PF I mean, when you think about it, right? Like at least 25% of servers should be just turned off on principle on the internet.
BP Alright, we're pausing the episode, we're all gonna turn those off.
PF Let's go turn those off. Just a big switch. Just shut down US East one for Amazon just to see what happens. I think.
HC Chaos testing. [Holly laughs]
PF Well, and also, just give it a week and see what you really need. Everybody gets upset in the moment, you know, everyone's like five nines, whatever. Let's just go for like one nine and see what's gonna work, right? You know, maybe I don't need my bank. I could, you know, probably I don't.
HC I think, I mean, yeah, 'cause we make these assumptions about what we need. And then when we start to challenge those assumptions, either accidentally through chaos testing or more deliberately, we realize there's a lot of things that we can do without. So I'm hearing these amazing stories now of people that put in just a little bit of automation so that at 5pm, it shuts their servers down, and at 9am, it brings them back up. And as long as you've got the disposable infrastructure so that you can safely do that, then they can save like a third of their energy usage.
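The 9am-to-5pm automation Holly mentions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the schedule is applied every day of the week, and the code only decides whether a server should be up. Actually starting and stopping machines would go through whatever API your cloud provider offers, which is not shown here.

```python
from datetime import date, datetime, time

# Working hours taken from the conversation: servers come up at 9am and go
# down at 5pm. Everything else in this sketch is an illustrative assumption.
START = time(9, 0)
STOP = time(17, 0)


def should_be_running(now, start=START, stop=STOP):
    """Return True if `now` falls inside the daily window [start, stop)."""
    return start <= now.time() < stop


def hours_off_per_day(start=START, stop=STOP):
    """Return how many hours per day the servers are switched off."""
    running = datetime.combine(date.min, stop) - datetime.combine(date.min, start)
    return 24 - running.total_seconds() / 3600


# A scheduler (cron, for example) could call this every few minutes and
# start or stop the servers whenever the answer changes.
print(should_be_running(datetime.now()))
print(hours_off_per_day())
```

How much energy this saves in practice depends on the workload and on how much the machines draw when idle, so the sketch makes no claim about the size of the saving.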
SC That's fascinating.
BP I have to tell you guys a quick story. So I've been watching Halt and Catch Fire to try and just familiarize myself with a little bit of--
15:54
SC So you can understand me better?
BP Yeah, yeah, get my 80s computer references tight. And one of the plotlines is he goes to work in the data center, because his stepfather is punishing him. And he goes in after he's clocked out at five, and all the machines are quiet, and he's like, what's going on? And the guy's like, oh, yeah, the machines run from nine to five, just like us. And then we turn them off, you know, nine to five. He's like, "Oh my God, we'll sell this network, I've discovered this untapped resource." So before there was climate change, there was the idea of selling all that off time.
PF Well, I mean, growing up, you wanted to leave the computer on because the off switch was the part that would break the most easily. [Ben laughs] You just leave that PC on, because that was the least likely to melt down your PC.
SC Yeah, it's funny, because we all have like 30 Docker instances doing things that we forget about. I was working with a group where we had an Azure instance running. And we would always make sure to turn it off, because it charges by the hour, so you want to make sure it's running for the least amount of time. So maybe that's a solution: instead of letting everyone start up their own little server environment, we can hold them accountable that way.
PF What is sadder, though, than when you launch a little experiment using a container, and then you go back a couple of weeks later to start it up and it takes like, maybe 30 seconds, because, you know, nobody else is looking at it. You're like, I didn't know that little website didn't get any traffic yet. Google, or whoever runs the containers, just put that guy right to bed. And how does climate factor into your day to day? Is it something that clients are ready to hear about? Are they coming to you about it? Where does it live inside of your daily work?
HC To be honest, it's still fairly fringe, I think. Some clients are asking about it. But I think we still have it sort of compartmentalised in this box of, that's something that we'll worry about in the future, that's something that someone else will worry about. And we kind of need to--
PF Oh, we sure will.
17:53
HC And we kind of need to be like bringing it in and looking at it when we make these technologies, both in our behaviors like should I be shutting this off? Should I maybe invest in the automation? But then also just thinking about, like, the implications of our technology choices of, yeah, this is good in this way. But actually, this is really bad in terms of climate. So maybe I should be trying to batch my workloads, maybe I should be making some other choices. Maybe I should be trying to share resources so that we're all in the same cluster, that kind of thing?
SC Yeah. Can you tell us a little bit more about your work in cloud native, and maybe talk to the folks listening? What is cloud native to IBM? And how do you think about it?
HC Cloud native is one of those terms that we all want to be cloud native, but there's about 16 different definitions of cloud native, which makes it really hard for us to sort of figure out whether we've actually achieved it. So one definition is the classic of it was born on the cloud. And then we see a lot of times it just gets sort of treated as a synonym for Kubernetes. And so when we say we want to go to cloud native, what we really mean is we want to migrate to Kubernetes, which is a good thing, but not necessarily the same thing.
SC You know, it's funny, in the year 2020 there's still a lot of people making this migration. Do people come to you at the IBM Garage because this is something they would like your help with? Or do you generally work with people that have already made the switch? What do you see the most?
HC We're seeing a bit of a shift, actually. So we're doing a lot more migration, because I think both in my team and in the IBM Garage, we started out as an innovation consultancy, but then what we realized is that there's about 20% of workloads that are either new workloads or really trivial to move to the cloud. And then there's this 80% that are much more entrenched in the legacy. But in order to make progress with the 20%, you have to be making progress with the 80% as well. So if you want to innovate, you have to talk to a back end. Otherwise, it's not innovation, it's just a pretty website. And so then, in order to actually allow more innovation, we also have to be brave and, you know, get on our hip waders and go into that legacy to say, okay, what can I do to this? I don't have to rewrite it all, but I need to figure out the points of friction and fix those points of friction so that I can have my system as a whole moving forward.
20:10
PF So one of the things that's fascinating to me, right, is that we're talking about innovation, we're talking about cloud, and a lot of your background was lower level, but in things that were going to feed into IDEs and feed into SDKs and make it easier to compose services. And then I go and I look at the different cloud services, and they're still relatively difficult to compose. And I'm wondering where you see things going. Like, I keep wondering, when does the IDE for cloud show up? Or is it here and I've missed it? How are people going to be working if they're building bottom up from cloud services?
HC That easiness, I think, is such a huge part of the value of cloud. In our industry, for like the last 60 years, we've just been trying to make things easier and easier. And then occasionally we've gone in the opposite direction, because we had a problem that was too big to solve, and we had to do it in two steps, where we solved the problem, and then we went back and made it easy. And I think with cloud, we're on a little bit of that loop now as well, where, when we first started with cloud, it was so wonderfully easy. And you know, you would just start something up and it would work. But then we realized that we couldn't actually debug everything in the cloud. And we realized that we couldn't actually make progress with our monolith. And then we've gone back, and now we do have, I think it's fair to say, quite a bit of complexity. And now we're trying to eat that complexity again, and say, okay, well, let's have more managed services. You know, IBM's got this managed Kubernetes. And it really does go back to, okay, it's actually really easy to just do this.
PF It's hard to abstract things. I mean, it just truly is. Like, what are you abstracting? And there's an easy fantasy that you'll just move boxes around on screen. But then it is interesting to see what's happening with orchestration in general. I'm just seeing a lot of developers get excited about things like Pulumi or Terraform, stuff like that, where the ability to kind of write programs about the services seems to be quite motivating. But yeah, we're not at that point yet. You can't just drag and drop a cloud into existence.
22:14
SC Isn't that kind of Glitch? With Glitch, you can a little bit drag and drop a cloud into existence.
PF You know, a great example, too, of a thing that solved a problem but made things harder was Git. Git is incredibly complicated to get up to speed on, it's really hard. And then GitHub showed up and kind of created the centralized infrastructure around it, and GitLab and other services, right? But that original idea that, oh, we're all gonna have decentralized version repositories on our hard drives, and we'll share them via email patches and sort of sync them up as we see fit, it turns out the humans couldn't handle that. We needed something relatively centralized in order to make sense of it. So it feels like we'll see that pattern play out in cloud services in some way or another.
SC Do you want to talk about your work on Java, a little bit about where you see the Java community right now? We got to speak to some folks from Oracle not that long ago, and they talked a bit about the open source world of Java. So that would be cool as well.
HC Yeah, Java has such a bright future. It's been declared dead so many times. And actually, [Ben laughs] it just seems to keep going from strength to strength. And one of the things that I find really exciting is how it has adapted to the cloud, and how it has changed. When I worked on the JVM, one of the conversations that we'd have a lot was about ahead-of-time compilation, because what Java does is you send in your bytecode, and then the server starts up, and it starts compiling your bytecode. And you're watching your server start up, saying, "But look at this scripting language, it's so fast. Why can't you do this ahead of time?" And then the answer comes back, you know, from people who know what they're talking about in this area, and they say, "Oh, no, no, no, you don't want to do ahead-of-time compilation." Because actually, if it watches your program for half an hour, if it watches your program for two days, it can optimize it way better than if it just did it ahead of time. So by making that trade-off of slightly slower startup time, you get better throughput. But again, that was when you would only restart your server every six months. Now in the cloud, these things are going up and down. So that old assumption that ahead-of-time is a bad idea has been flipped again, and things like Quarkus do ahead-of-time compilation, and they gain this phenomenal footprint and phenomenal startup time because of it.
24:38
SC That's really neat. You must have a really different understanding, working on the JVM and then working in Java. I think a lot of programmers, back-end developers, take for granted what's happening under the hood. And it must give you a really interesting perspective on the language itself.
HC Yeah, it's nice to know both the very underneath and the very top. But I think one of the beauties of our industry is that you don't have to. We were talking earlier about how abstraction is hard, and it totally is, and we get these failed abstractions that leak, and then in order to do anything meaningful, you have to go underneath. But the JVM has actually been a really successful abstraction, in that you don't have to know what's underneath in almost all cases, and you can just get something that works really pretty well.
BP Just for people who don't know what it was the Quarkus that you mentioned, what is that?
HC So Quarkus is Supersonic Subatomic Java. And so it's-- [Holly & Ben laugh]
SC So cool! That sounds awesome!
PF That was the nerdiest Beastie Boys song. That's very exciting.
25:49
HC So what it does is it combines a few technologies. So it takes GraalVM. And then on top of that, it has some runtime constraints, particularly around dynamism and reflection and a couple of Java features that you don't really particularly need in 2020. And once they put all of these things together, you get this runtime that is just stupidly fast. So yeah, you should all go look at a demo of Quarkus, because I'd heard about Quarkus, and it didn't really make sense until I saw a demo. And then I blinked and the server was up, and I was like, wait a minute, shouldn't the server have taken some time to come up? But no, it's super dynamic, it comes up, and then it has this ridiculously tiny footprint. I heard a story from someone who was doing booth duty with Quarkus. And I think they were not in the most technical role, but they were doing booth duty and, bless them, they forgot to turn it off each time they did a demo. So they'd show someone, look how amazing Quarkus is. And then the next person would come along, look how amazing, and each time they started it up. And at the end of the day, they had 200 instances on their little laptop. But the best thing about it is that it didn't matter. They had completely forgotten to shut down after every demo, and it just carried on, and it didn't affect their laptop at all to have 200 servers running.
PF That is a different era than, you know, dash Xmx and 180 megs. And if you do that 100 times, your computer will catch fire.
HC And then it ties back to climate change as well. Again, it's one of those double wins where if you can put 200 on a laptop, then in a cloud instance, you can have so many.
PF Interesting, okay, so, so we're back to optimizing we're back to actually taking computer science seriously as an industry. This is very exciting.
BP Somehow it got back to physics too, right? Quarkus, I like this. It all comes full circle for you, Holly.
[MUSIC]
BP Alrighty, everybody. It's that time of the episode. We always talk about this. So today's lifeboat is awarded to Snippets of Code: "Convert seconds into days, hours, minutes, and seconds." Convert seconds into seconds. Well, you know, you can have it however you like. Thanks to Snippets of Code for sharing some knowledge, getting us upvotes, saving this question and receiving a badge. I'm Ben Popper, the director of content here at Stack Overflow. And you can always find me on Twitter @BenPopper.
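For readers curious what the lifeboat question involves, here is one minimal Python sketch using repeated `divmod`. It is not the awarded answer, which isn't quoted in the episode; the function name `split_seconds` is made up for this example, and the sketch assumes a non-negative whole number of seconds.

```python
def split_seconds(total_seconds):
    """Split a non-negative number of seconds into (days, hours, minutes, seconds)."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    # Peel off one unit at a time: seconds, then minutes, then hours.
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds


print(split_seconds(93784))  # prints (1, 2, 3, 4)
```

This treats a day as exactly 24 hours, which sidesteps time zones and daylight saving entirely.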
28:16
HC I was just gonna say I love that in 2020, time is still one of the hardest problems in computer science. [Holly laughs] Speaking of abstraction.
PF Oh, absolutely. No, no, it never ends. This podcast is us repeatedly talking about time zones for about a year.
SC They're just bad.
PF Time zones and off-by-one errors. It's the same stuff. We're still, we're still struggling. Sara, tell the people who you are.
SC Yeah. I'm Sara Chipps. I'm the Director of Community here at Stack Overflow. Go to iwillvote.com to find out your nearest polling location. And I'll stop doing that after the third.
PF Me too. I'm Paul Ford, friend of Stack Overflow. You can check out my company at Postlight.com and vote, vote, vote, vote, vote, vote, vote.
HC So I'm Holly Cummins. I work for IBM, and you can find me on Twitter and various other places. But I'd love to point people, I talked a lot about Quarkus, so I think you can go look that up at quarkus.io, and I talked a bit about IBM Cloud as well. So I encourage people to go have a look at that.
[OUTRO MUSIC]