Security Superfriends Episode 5: Chad Kalmes

Rich Seiersen

Security Superfriends | Chad Kalmes, Vice President, Technology & Risk, at PagerDuty

Chad Kalmes is not the Hulk! He just identifies with him. Why? Chad has rage – but he uses it as a force for good! His teams have said as much – giving him a toy Hulk to honor that seething yet productive part of his psyche.

It’s that drive that has propelled him to his current role as VP of Technology and Risk at PagerDuty. And it’s what made him a success while at Twilio (where I first met him).

I asked Chad to speak with me about his ascent to world domination. I also wanted to chat about how he is navigating cloud native security risks. Chad is rare in that he has worked back to back for multiple cloud native behemoths, cutting across IT, Security and Risk. He attributes his ability to navigate roles and technologies fluidly to his consulting background. I like hearing that. No cookie-cutter backgrounds - lots of diversity in security!

He talks about security processes that can scale for modern software development. In increasingly DevOps centric environments, where engineering teams are organized around the services and products they are building, there is a shift left, but balance is needed: security can take on some tasks to make faster development easier. By providing some security services, such as setting standards and guardrails, he leaves the engineering teams free and independent to do what they need, and saves them time because security has already taken care of some of the work.

I hope you enjoy this fifth installment of Security Superfriends! Check out some edited highlights below, and be sure to watch the video or tune in via podcasts on our Soluble channels on Apple Podcasts, Spotify, Soundcloud, and Stitcher.

Richard Seiersen: Who’s your favorite superhero, and why?

Chad Kalmes: It has to be the Hulk because again you’ve got this notion of a logical, analytical really smart person who has his superhero power - he just uses his rage to get things done. I can empathize with that because to me there’s always something that drives me nuts – things don’t work right, processes take too long, people don’t get what you’re trying to communicate, so there’s always this undercurrent of anger that I love to channel into getting things done.

So I use my rage to try and fuel change in the companies I work for. I have a prop because a previous team that I worked for actually got me the Hulk statue because they could tell I was angry, but I was using that as a force for good.

RS: You’re at PagerDuty now; before that you were at Twilio. What is different and what’s the same between those two leading technology companies? What are you seeing, and how does that translate into what you’re having to do in your current role?

CK: On the surface, purely from a risk perspective, you might say, hey, there’s a lot in common with these companies: they’re both cloud native, both probably started blocks from each other in the city of San Francisco and are headquartered there. They touch on aspects of messaging and communications at their core, but you know the devil’s really in the details there, and I think once you get below some of those superficial or industry kind of commonalities, there’s a lot that’s different. Very different tech stacks, different approaches to how you architect and deploy in the public cloud, different legal and regulatory environments, all that kind of stuff.

I think you were at Twilio at roughly the same time, so you know that growth curve of seeing companies through pre-IPO to IPO to post-IPO. I love that because no two of those experiences are the same, and the way that companies need to approach that, even when it comes to filing the S-1 and calling out your risks to the organization, you really have to put on your hat and think through the eyes of that company: what’s different here, what’s unique about how we approach things. And that ultimately translates its way down to probably every process in the company.

RS: In Twilio’s case, they were cloud native going on 10 plus years. At the start, that meant being all in on AWS. But what is and isn’t cloud native has evolved. First generation cloud native companies are now needing to digitally transform to modernize. What have been the biggest changes you have seen, and how are you navigating those changes from a cybersecurity and risk management perspective?

CK: Twilio was really interesting because they obviously continued to try and shift left, but also because their platform, from a communication standpoint, was so broad and you had so many different products and APIs that the teams were cranking out, they’d made this decision, and I don’t know what current state is, but at the time, they invested very heavily in the SRE platform team because they needed to make as many pathways and roads for the rest of the engineering teams building individual products to just ship their stuff and make it as easy as possible.

At PagerDuty, we started as much more of a focused single product platform, and are just really starting to turn into a true platform where you’ve got multiple products and services. We’re still very much a DevOps centric environment where the engineering teams are very much organized, you know at a very small granular level around individual services and products that they’re building. But now I think we’re starting to see some of the benefits of, you know, maybe if we had more of an SRE function and more of a platform team, we could actually take work away from some of these other teams.

We don’t want to lose that shift left DevOps focus or centricity any more than we have to, but maybe shifting a quarter step to the right and providing some of that stuff as a service to those other teams would actually benefit us, and that’s ultimately a strategic call for the engineering organization. But I think it’s an interesting example of how, as you grow in scale and you increase in complexity, there’s always this sort of turn or tension between how much can we keep everything shifted left and let teams be as free and independent to do what they need, versus if we took care of just a little bit of that for them, they could actually do some other stuff faster. To me, that’s a fascinating evolution to watch and see, and, you know, who knows where it’ll go.

RS: How are you balancing developer freedom and responsibility versus a paved road approach to deploying new applications into the cloud? Specifically, how do you apply the security controls that are required without slowing down velocity?

CK: I’ll throw another superhero analogy or a quote out there: with great power comes great responsibility. Even in today’s PagerDuty, where we’re very DevOps centric, we still have certain common standards or things that are kind of non-negotiable. Even if there’s no one central team that does it, it’s everybody’s responsibility, and we hold those people accountable. Security is a prime example of that: we don’t necessarily mandate or have a security person on every single team, but we have certain practices, an approach, a mentality that we expect every team to adhere to.

And we’ve built as much of that as possible ideally as transparently as possible into all of the pipelines that those teams might use so that we set at least a high-water mark or a basic threshold that everybody, regardless of what their focus is or what they’re doing, has to adhere to.

I think that gets to the ultimate approach we’ll take to, call it, balancing DevOps versus an SRE platform. As a company, I personally don’t see us ever being the heavy paved roads type, where if you want to get to Second Street, you’ve got to go down Main and take a left on Third. Our approach is: let’s agree on where the guardrails need to be. Sometimes those guardrails have to be very explicitly called out. There will be caution tape on them, flashing lights, etc., but within there, we want to give people as much leeway as possible as long as they don’t hit those boundaries or repeatedly hit the same boundaries.

That’s kind of the litmus test that we use: as long as we have our standards and our expectations in place and people aren’t constantly banging into them, then maybe they’re good and appropriate. Sometimes we have to move those in a little bit, or sometimes we move them out, but it’s always that kind of balancing act, because we want to make sure that those standards are there, but exactly how different teams adhere to them or how it fits into their process, we want to be as flexible on that as possible.
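To make that litmus test concrete, here is a minimal sketch of how a team might spot guardrails that people keep banging into. It is an illustration only, not PagerDuty's process: the function name `guardrails_to_review`, the `(team, guardrail)` event shape, and the 50% threshold are all assumptions made up for this example.

```python
from collections import defaultdict


def guardrails_to_review(violations, total_teams, threshold=0.5):
    """Return guardrails hit by at least `threshold` of all teams.

    `violations` is an iterable of (team, guardrail) pairs, a hypothetical
    event shape. The 0.5 default is an arbitrary illustrative cutoff.
    """
    teams_by_guardrail = defaultdict(set)
    for team, guardrail in violations:
        # Count each team once per guardrail, however often it trips it.
        teams_by_guardrail[guardrail].add(team)
    return sorted(
        guardrail
        for guardrail, teams in teams_by_guardrail.items()
        if len(teams) / total_teams >= threshold
    )


if __name__ == "__main__":
    events = [
        ("payments", "no-public-buckets"),
        ("search", "no-public-buckets"),
        ("search", "no-public-buckets"),
        ("mobile", "pinned-base-images"),
    ]
    # Guardrails that at least half of four teams ran into.
    print(guardrails_to_review(events, total_teams=4))
```

A guardrail that shows up in this list is a candidate for a conversation, not an automatic change: it may need to move out, or teams may need better tooling to stay inside it.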

RS: Since you mentioned policy, and we are talking about cloud native security, where are you on policy as code? What are your thoughts on it as an emerging security practice?

CK: We’re definitely huge proponents of infrastructure as code, and within our security practice, I’ll give full credit where credit is due. Internally I think that team at PagerDuty is actually doing a very good job of starting to move from infrastructure as code controls to policy as code controls, and they are actually looking at how we can, call it, pre-instrument or pre-define a lot of our standards and control expectations so that, again, developers don’t need to think about it.

They’re applied transparently. Ideally those boundaries are set so that people aren’t banging into them, and they may not even know they’re there in a lot of cases. And specifically around the controls within our pipeline, how we think about and manage a lot of the boundary settings, and sort of that hard crunchy shell on our infrastructure, more and more of that is actually policy applied as code in the standards that our team publishes for the rest of the engineering organization.

We’re not there yet; I haven’t seen a lot of dedicated bespoke tooling that supports that, but I’m definitely seeing that mindset shift and some progress, so I do think that’s a big area for potential windfall going forward. The more you can make that stuff standard, transparent, and easy to then prove or illustrate in an audit because it’s all logged or it’s validated from a config setting, the better. I think that’s a direction that a lot of smart companies are definitely going to pursue.
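For readers new to the idea, here is a minimal sketch of what policy as code can look like: standards are written as small checks, evaluated against a resource's config, and every result is logged so it can be shown in an audit. This is an illustration under assumptions, not PagerDuty's tooling. The `Policy` class, the two example policies, and the config keys `encrypted` and `public` are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    name: str
    check: Callable[[dict], bool]  # returns True when the config complies


# Hypothetical guardrails; real standards would come from the security team.
POLICIES = [
    Policy("storage-encrypted", lambda cfg: cfg.get("encrypted") is True),
    Policy("no-public-access", lambda cfg: cfg.get("public") is not True),
]


def evaluate(resource_name: str, config: dict) -> list:
    """Run every policy against one resource config and record the outcome."""
    return [
        {"resource": resource_name, "policy": p.name, "passed": p.check(config)}
        for p in POLICIES
    ]


def is_allowed(results: list) -> bool:
    """A deploy is allowed only when every policy passed."""
    return all(r["passed"] for r in results)


if __name__ == "__main__":
    results = evaluate("example-bucket", {"encrypted": True, "public": True})
    for record in results:
        # One JSON line per check gives a simple, replayable audit trail.
        print(json.dumps(record))
    print("deploy allowed:", is_allowed(results))
```

Run inside a pipeline, a check like this stays invisible to teams whose configs comply, which matches the goal of guardrails people rarely notice.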

Be sure to watch the video for more, and subscribe to our Soluble channel to see more great episodes of Security Superfriends.

Want to catch the latest and greatest in Security Superfriends? Subscribe to our YouTube channel for past shows and updates, and listen on Apple Podcasts, Spotify, Soundcloud, Stitcher, or wherever else you get your podcasts.