Security Superfriends Episode 3: Aaron Stanley

Rich Seiersen

Security Superfriends | Aaron Stanley: Head of Security at Twilio

As fast as fast can be! This describes the reality of software development at Twilio. Aaron Stanley, who runs security at Twilio, describes all this in this installment of Security Superfriends. Along the way, he shares why Triangle Man (pictured below) is his favorite Superhero - largely because he beats Particle Man. As a They Might Be Giants superfan, I have decided that this may be the best Superhero answer of all time! (I wasn’t sure if it was possible to top Ely Kahn’s answer of Wolverine a few weeks ago.)

[Image: Triangle Man]

One area of interest to readers will be Aaron’s ascent to security hero status. I hope you are sitting down, because Aaron started out as a lawyer! (Gasp!) He quickly left that to go deep into digital forensics, and the rest is history.

One of the most interesting facts about Twilio is its small-team approach to development and to problem solving in general. It’s an embodiment of Netflix’s freedom and responsibility model. This approach is at the core of the company’s high-volume software releases and its meteoric rise in the technology world.

If you are wondering how security meshes with small-team, high-speed development, then check out this episode!

Below are some edited excerpts from the interview.

Richard Seiersen: How do you apply security in what I think is one of the most dynamic software environments in the world? If you read anything about Twilio, they’re pretty passionate about small, team-based development.

Aaron Stanley: Core to Twilio’s DNA is innovation. In order to grow a company and have products that both operate at extraordinary scale and are extremely feature rich and complex, you need your engineering teams to continuously innovate: build new products, build new features, and push them out. So we push things out really fast, but we also try very hard to incentivize innovation within the company so it won’t stagnate.

One of the ways to do that is to make individuals, or really small groups of people, accountable for the feature getting into production, and accountable for the results and the customer delight it provides. So we’ve tried as a company to keep our engineering teams small so that they’re quick, and so they can release features that people actually want to use. We don’t really have the kind of bureaucracy that happens in larger companies, where you have to have sign-offs for decisions by this person, and this person, and this person, and it takes a really long time to actually launch something unique, innovative, and market leading.

That obviously creates tough challenges for security. It means we have to operate really fast: if our engineering teams are pushing things out with hundreds of thousands of deployments every day, security has to be baked into the pipeline. So it’s difficult. It means that we have to leverage a lot of automation, we have to build a lot of tools ourselves, and we have to make sure that our engineering teams can keep innovating.

We’ve given them guardrails, so they can play in a path and swim in a pool, do a lot, and have a lot of freedom, but there are certain lines they can’t cross. That allows them to operate effectively, and it allows us to ensure that the really important controls don’t get violated. So that’s the balance we’ve had to strike, and are continuously trying to strike.

Obviously, the other thing that happens when you have lots of little teams that are incentivized to innovate is that they get to choose a lot of different functions, different languages, and different tech stacks. They can experiment with things, especially since we’re heavily leveraging AWS. Every time a new AWS feature comes out, a team can say that’s probably the better way to do what we’re trying to do, and so we have to stay as far ahead as we can. We’re trying to react to what teams are doing, but we have to keep abreast of all of the different options that the teams have as they move through and try to do what’s right for the customer.

RS: I think it could be said that Twilio is one of the first-generation cloud-native security platforms. But cloud native is now becoming much more ubiquitous. We’re seeing backend as a service, functions as a service, serverless, what have you, and of course Kubernetes. So where is Twilio on that spectrum of cloud native? Are they leaning heavily into the modern stuff like functions as a service and Kubernetes, and if so, what is that transition like for you as a defender?

AS: Twilio is a software company in addition to an API company, and if you look at some of the software offerings we have, it’s all serverless. Whereas we had a huge fleet of EC2 instances, and we still do to serve the legacy products that we continue to provide, the newer stuff is not that way. It’s how can we leverage the serverless platforms, how can we do an application purely in code so that we don’t have to worry about things like “what’s the CPU utilization of that box?” “Is the memory pressure getting too high?”

So just being able to leverage some of those new technologies has allowed us to move really fast. It’s also of course opened up new questions about how security works in the serverless world.

What happens when a bunch of AWS customers are using Lambda? Is it possible for Lambda data to spill over into other places? How do we show AWS where those things are happening? There’s been a lot of really interesting work within Twilio that leverages those platforms in a way that really helps our team.

The other thing is containers, which have been big for a while, and I think the thing that I learned at Twilio, the one thing that shocked me the most was when I started out, I knew that it was a Linux shop, and most of the infrastructure was Linux, but there’s still some stuff that I’m surprised by in terms of infrastructure and what’s not running on Linux, and what was it running on, and why was it there? And what do a bunch of Windows machines in an environment like that actually mean?

So in trying to manage those types of really hybrid environments, I was surprised at how there were some old, in my opinion, hardware and software stacks in the company that didn’t make a lot of sense. As I started to really talk to people and understand how the company got to where it is, I realized every decision has a really good reason behind it, that many things have been fought over and argued over, and we continue to stand by the decisions that we’ve made and will uphold them.

When we don’t, we change. But because the company grew up in the very beginning of AWS, we made a lot of choices around technologies that we were going to adopt and things we were going to build, which we have now come to rely on and which there are alternatives to. The question of course becomes: is the switching cost worth it, and are you going to get more out of the new thing? So whether it’s EKS, the Amazon version of Kubernetes, or your own Kubernetes cluster, you have to make those decisions, and then pay the switching costs.

So, as we as a company start to evaluate when are we going to do that, are we ready to make that, to pay that tax, we’re moving things, but it’s happening very slowly, so it’s a technology that the company is interested in, is working towards, is leaning into, but it is one of those things that’s like well, we’ve made a lot of good decisions, we’re where we are because we’ve made a lot of good decisions in the past, so let’s be very careful about making a different decision now, and really understand why we’re doing it.

That said, there is a lot that we could talk about in terms of how you run a modern stack and stay resilient in the cloud without doing it in a containerized way. I don’t think you can anymore. From the security team’s perspective, I think it is impossible for us to maintain good security practices across a hybrid cloud environment where you have Oracle Cloud, Google Cloud, AWS, Azure, Alibaba Cloud if you’re doing business over there, and probably stuff in either an office or a datacenter, so that you have lots of different environments to secure.

So one of the big promises that we have is that a container structure will solve a lot of the security headaches because it kind of neutralizes the platform layer, and if you can secure the platform, you’re good.

Be sure to watch the video to hear more, and subscribe to catch more great episodes of Security Superfriends.

Want to catch the latest and greatest in Security Superfriends? Subscribe to our YouTube channel for updates, and find all episodes of Security Superfriends on Apple Podcasts, Google Play, Spotify, or wherever else you get your podcasts.