1:1 with Alex Roetter: Leading Engineering at Twitter

Author

Alex

Co-founder at Waydev

Below is an excerpt from our conversation with Alex Roetter, former SVP of Engineering at Twitter, Managing Director and General Partner at Moxxie Ventures, and featured in the Netflix documentary, The Social Dilemma.

Alex led Twitter’s engineering organization of more than 2,000 people, helped build their ads network from near zero revenue to $2.5B a year, and scaled users to hundreds of millions.

He talked to our CEO and co-founder, Alex Circei, about the impact of development analytics, what it was like to scale an engineering team, and more.

Alex Circei: We would like to hear how you’ve managed to scale the engineering part at Twitter and what metrics did you use there?

Alex Roetter: When I joined Twitter, the whole company was a couple of hundred people, and I started the ads team with around ten people. There was a small group working on ads, but we grew exponentially, and I became the leader of that team. Later, that turned into about 400 people, and we grew from zero to $2.5B in revenue over a few years.

Then, I took over engineering for Twitter. Overnight my team went from 400, which was just the ads team, to the whole engineering organization, which was slightly over half the company. That meant around 2000 people, so there were multiple scaling activities and challenges.

One of these activities was a lot of hiring. The other was something we noticed immediately – we weren’t getting enough done for the size that we were, so I spent a lot of time trying to figure out why this was happening.

Speaking in graph theory terms, one high-level observation we made was that if you think about productivity, the fastest you can get work done is linear with the number of hires, so you’re never gonna get more than twice as much stuff done if you double the size of your team. Think about your organization as a graph: the number of pairwise edges in a graph with N nodes grows as N squared. So your communications overhead grows quadratically, but your theoretical best output grows only linearly. That’s really a communications drag, and there are many other communications drags that grow superlinearly as well.
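To make the quadratic growth concrete, here is a minimal sketch that counts the possible pairwise links among N people using the standard formula N(N-1)/2. The function name is illustrative, and the headcounts are the team sizes mentioned in the interview, not a model of how Twitter actually communicated.

```python
def pairwise_channels(n: int) -> int:
    """Number of distinct pairs among n people: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Headcounts taken from the interview: early ads team, later ads team, all of engineering.
for headcount in (10, 400, 2000):
    channels = pairwise_channels(headcount)
    print(f"{headcount:>5} people -> {channels:>9,} possible pairs")
```

Going from 1,000 to 2,000 people doubles the best-case output but roughly quadruples the number of possible pairs (499,500 versus 1,999,000).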

We just found ourselves saying “Wow, this company is huge. It’s 2000 engineers, 4000 people. Are we really getting as much done as we want to get done?” Given the massive investment and the amount of life energy that people were pouring in, the answer was NO.

A lot of the work was organizational, because we really wanted to make sure that people could run independently with clear interfaces. When you design software, you sit down and build a bunch of objects, where each one has a very clear interface, and minimal, but sufficient, set of public methods.

You don’t accidentally expose implementation details in the direct declarations of the public functions and you keep private members private, and you really make none, or the minimum number of them, public. You need the same thing in an organization: where objects are teams, or individuals. This allows people to run independently, with very clear expectations with respect to other parts of the organization. Often this doesn’t happen, and one is left wondering: “Who’s responsible for what? Do I have authority to do that?”
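As a rough illustration of the software half of this analogy, the sketch below shows a small class with a minimal public surface and private state. `RateLimiter` and its members are hypothetical names chosen for the example; nothing here comes from Twitter's codebase.

```python
class RateLimiter:
    """Hypothetical component: callers only see allow() and remaining."""

    def __init__(self, limit: int) -> None:
        self._limit = limit  # private: implementation detail
        self._used = 0       # private: internal state

    def allow(self) -> bool:
        """Public method: consume one unit if any budget is left."""
        if self._used >= self._limit:
            return False
        self._used += 1
        return True

    @property
    def remaining(self) -> int:
        """Public, read-only view of the remaining budget."""
        return self._limit - self._used


limiter = RateLimiter(limit=2)
print(limiter.allow(), limiter.allow(), limiter.allow(), limiter.remaining)
```

Callers depend only on `allow()` and `remaining`, so the internals can change freely. The organizational equivalent is a team whose commitments to others are explicit while its internal process stays its own business.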

For example, let’s say I’m on the Growth team and I’m responsible for DAUs (Daily Active Users), but a lot of the levers I need to pull are levers that I don’t control. So now, I’m responsible for something over which I don’t have the authority to change. It’s extremely frustrating and I’m set up to fail! I don’t even think it’s my fault because you asked me to do something, but you didn’t give me the resources or the ability to do it! We found a lot of things like that.

Also, we really wanted to understand why we were shipping software more slowly, so we did something very low-tech. I just walked around and asked people, “Is this the best job you’ve ever had?” And I sort of knew the answer was gonna be “no”, but I wanted to hear from people why they felt that way.

So I asked, “Why not?” And we learned things like the gulf between authority and accountability we talked about, and that it was not clear who owns decisions – that’s very important too. If it’s not clear who owns a data structure, your software is going to be riddled with bugs. Who needs to get a lock to it before they write to a certain piece of memory? Can anyone just write to it? Should I expect that it’s in the state that I last left it in? Those being unknowns would be a disaster in software architecture, but that’s exactly what organizations do.
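The data-structure ownership point can be sketched as follows, assuming a single owner object that holds the lock and is the only path to the underlying state. `OwnedCounter` and `worker` are hypothetical names used for illustration.

```python
import threading


class OwnedCounter:
    """Hypothetical shared state with one explicit owner.

    The owner holds the lock, and every read or write goes through
    its methods, so callers never touch the underlying memory directly.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def increment(self) -> None:
        # Writers must acquire the owner's lock before mutating state.
        with self._lock:
            self._value += 1

    def value(self) -> int:
        with self._lock:
            return self._value


def worker(counter: OwnedCounter, times: int) -> None:
    for _ in range(times):
        counter.increment()


counter = OwnedCounter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value())  # 8 threads x 1000 increments each
```

Because ownership is explicit, the questions in the paragraph above all have answers: writers go through `increment()`, nobody writes directly, and the lock guarantees no update is lost.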

There were also some tactical things: the builds had gotten extremely slow and there was inadequate testing. Git’s master branch in many of our largest repositories was often broken. I couldn’t submit code when master was broken, and I couldn’t do a PR or refresh and get a clean master on which to build, so it was very hard for many of our excellent engineers to run tests and validate that everything was working. It was very hard to deploy things in a world like that.

We invested and created a new team, which was first called “Team Awesome”. Then we called it “Engineering Effectiveness”, and it was just all about creating the best environment for engineers to do the best, most productive, most satisfying work of their careers. That was our north star and a bunch of the best engineers in the company transferred into this team.

Every time I gave a talk at an all-hands meeting, that’s what I talked about, because Twitter had this culture that the only thing that matters, and the only thing that the CEO cares about are demos and features. So the company was littered with demos that never shipped, which of course really hurt morale.

For example, we had this thing called Hack Week, where you could just build whatever you wanted for a week and demonstrate it at the end of that week. It was full of great ideas because the company was full of really creative, innovative, interesting, passionate people, but a tiny fraction of these things ever shipped.

It was extremely frustrating for everybody. It was frustrating for people that built all the projects that looked awesome and that people loved on demo day, but then never saw come to fruition, and it was frustrating for the company which wondered: “Why weren’t we investing in all these new ideas that never shipped?” We didn’t have a lack of ideas, we had a lack of execution, and a lack of ability to ship high-quality software quickly.

But there was a real culture where the only thing that mattered was shiny demos, not perceived “grunt work” like making the build faster. So I started talking to the company almost exclusively about the build being fast, the tests not being broken, how to deploy more continuously, the code review latency, and stuff that’s viewed as “not sexy”, but really crucial to building a frictionless machine where people can work. Companies have a way of deciding “Hey, whatever the senior leaders talk about all the time must be important.” At least, if you repeat yourself enough!

If you don’t have a very tight inner loop of understanding the customer, prototyping, testing, shipping, learning, and changing, over and over and over, then you can never make progress fast enough, especially when you’re really trying to optimize some metrics that are hard to move, like Twitter usage or product-market fit, or a new ad product. You really need to tighten the inner loop to have any hope of moving fast enough.

The core tools that support that workflow, e.g. fast builds, tests, deploys, etc., really have to be excellent and efficient for everyone.

Let’s think of what the smallest unit of a core agile team does every week: select user stories, ship tickets that address them, test the new version, learn from the users, postmortem that whole process, then rinse and repeat. If that’s not really friction-free, you’re never gonna make progress, and it’s extremely frustrating. Some really great people quit and went somewhere else where they could find more enjoyable, satisfying environments for themselves. We had to fix that environment asap.

Another thing that’s hard about scaling is the bigger things get, the easier it is to breed distrust. You know, if there are four of you working on a thing forever, and you’re always sitting around the same table together, it’s very hard to breed distrust and question intentions. The bigger you get, and the more management churn there is, and more new groups coming in through acquisition or hiring, the harder it becomes to trust everybody as a default.

Twitter was a high churn place, so, the whole company struggled with that. There is very little malice in the world compared to the amount of confusion or errors, but it’s easy to ascribe malice when there are large groups you’ve maybe never met, or haven’t built trust with.

Alex Circei: If you would have had Waydev, what do you think would have been different?

Alex Roetter: I think the value of a tool like Waydev is to fight distrust. There are many ways to fight distrust, but one of these is certainly transparency. Once you know what the person is working on, you understand their perspective and combat the natural tendency to lose trust as you scale.

As we discussed, we spent a bunch of time just making the engineering machine internally be as good as it could be, and part of that was trying to measure what was happening. That was a really hard problem.

Maybe if Waydev was around back then, it would have helped. We were able to measure some things, like how often the build was broken, deploy times, code coverage, etc.

So we did all of that, and we also did one other thing: we graphed tickets closed by date as a time series. We had this quarterly OKR planning process, which was supposed to be high level and directional. But some teams had completely abandoned an inner loop of agile development and were just running on quarters.

We saw that because their graphs of tickets closed just had four spikes a year. So, on the last day of every quarter, they closed a bunch of tickets, and the rest of the quarter, they closed almost no tickets, which was completely crazy. This data really reinforced our intuition about what teams were higher-performing vs. which ones had more opportunities to improve.
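A minimal sketch of how one might spot that pattern from ticket close dates alone. The function names, the seven-day window, and the sample data are all made up for illustration; the interview does not describe how the original graphs were produced.

```python
from datetime import date, timedelta


def quarter_end(d: date) -> date:
    """Last calendar day of the quarter containing d."""
    end_month = 3 * ((d.month - 1) // 3 + 1)
    if end_month == 12:
        return date(d.year, 12, 31)
    return date(d.year, end_month + 1, 1) - timedelta(days=1)


def share_closed_near_quarter_end(closed_dates, window_days=7):
    """Fraction of tickets closed within window_days of a quarter end."""
    if not closed_dates:
        return 0.0
    near = sum(1 for d in closed_dates if (quarter_end(d) - d).days < window_days)
    return near / len(closed_dates)


# Made-up data: one team closes nearly everything on the last day of the
# quarter, the other closes one ticket every three days.
spiky_team = [date(2023, 3, 31)] * 20 + [date(2023, 2, 10)] * 2
steady_team = [date(2023, 1, 2) + timedelta(days=3 * i) for i in range(30)]

print(f"spiky:  {share_closed_near_quarter_end(spiky_team):.2f}")
print(f"steady: {share_closed_near_quarter_end(steady_team):.2f}")
```

A share close to 1.0 corresponds to the four-spikes-a-year graph. A team closing tickets continuously lands much nearer to the fraction of the quarter that the window covers.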

Of course, one really wants to be doing work all the time, especially when some of the higher-performing teams really were closing tickets all the time. After we discovered that, we tried to debug it, asking some questions: “Are the tickets too big?” “Are people just not feeling urgency except at the end of the quarter?”

For example, let’s say you want to go run a race and you say: “Okay, I’m gonna run a 1600 meter race around a track in seven minutes” and your strategy would be “I’ll run each of the first three laps in 2 minutes and then I’ll do the last lap in 1 minute.” That’s a terrible strategy! And that’s what these teams were doing, not intentionally, but we discovered that from looking at this data.

Also, we did not have great metrics on code quality. We looked at little things like: “How often do tickets go back and forth? What is the latency for a code review?” But it was very hard to tell why latency was high when it was high. “Did the code reviewer not have time or was he or she simply not prioritizing the reviews? Were they making a bunch of pedantic comments or really useful comments that made the system better?” “Were the code reviews too large?”

We looked at the distribution of sizes of code reviews too. Obviously, it’s better to do smaller code reviews, instead of a single massive one that ends up being a lot of throw-away work based on the comments. The urge to not re-do work is so strong that it was easy to fight comments you received on a large review, even if they made things much better.
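The two measurements described here, review latency and review size, could be summarized along these lines. The `Review` record, its field names, and the 400-line threshold for a "large" review are assumptions for the sketch, and the sample reviews are invented.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Review:
    opened: datetime          # when the review was requested
    first_response: datetime  # when a reviewer first responded
    lines_changed: int        # size of the change under review


def latency_hours(review: Review) -> float:
    return (review.first_response - review.opened).total_seconds() / 3600


def summarize(reviews, large_threshold=400):
    """Median latency, median size, and share of reviews over the threshold."""
    large = [r for r in reviews if r.lines_changed > large_threshold]
    return {
        "median_latency_hours": median(latency_hours(r) for r in reviews),
        "median_lines_changed": median(r.lines_changed for r in reviews),
        "share_large": len(large) / len(reviews),
    }


# Invented sample data.
reviews = [
    Review(datetime(2023, 5, 1, 9), datetime(2023, 5, 1, 11), 40),
    Review(datetime(2023, 5, 1, 9), datetime(2023, 5, 2, 11), 1200),
    Review(datetime(2023, 5, 2, 10), datetime(2023, 5, 2, 15), 90),
]
print(summarize(reviews))
```

As the interview notes, numbers like these show that latency is high but not why. They cannot tell a busy reviewer from a pedantic one, so they are a prompt for a conversation, not a verdict.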

“How good was the code that we were building?” was a real blind spot for us. We had an experimental framework. So we looked a lot at what experiments were shipping, what experiments were not, what experiments were getting reverted, and what things perform at full scale, in the way that we predicted by experiment. That was a great system that provided a lot of insight. But we didn’t have something as good and illustrative on the engineering work and code itself.

There’s a really good side to all of this: great engineers know what’s getting in their way. You just have to ask them. One of the main frustrations is that no one’s asking them. We made a lot of progress just by asking engineers what slowed them down, listening, then spinning up projects like making the build faster, more distributed, rewarding code health instead of more demos that we could never ship.

But I always wished the system was more well-instrumented. Most systems that you’re trying to optimize are very well instrumented. If you think about a sales organization, it is extremely well-instrumented; you can really run sales by the numbers. If you’re a pilot, the aircraft systems are extremely well instrumented, and you can look at all the pressures, temperatures, fuel flow numbers, and all kinds of metrics. We were really struggling with the org not being well-instrumented. So, we tried to measure what was easy, trying to fix the low-hanging fruit, but it wasn’t perfect. That’s probably the area where I wish Waydev had existed several years ago.

Alex Circei: What was the structure of the teams? What tools did the engineering managers have to improve and work better?

Alex Roetter: We also did some work on the shape of the organization. We looked at things like the average span of control, which showed how many reports a manager had. By the way, I think that’s a horrible phrase because it implies that management is about control. Management is about the opposite of that: it’s about empowerment, enabling people to do work, and giving them the tools they need. It’s the packaging and the glue that holds things together. It’s not the actual work, which is building products for users. As a result, you should have the minimum skeleton number of managers you need, because management is not the main point of the organization.
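For reference, average span of control can be computed from a plain reporting map. The sketch below assumes a dictionary from each employee to their manager, and the names in it are placeholders.

```python
from collections import Counter


def average_span(manager_of: dict[str, str]) -> float:
    """Average number of direct reports per manager.

    manager_of maps each employee to their manager. Only people who
    have at least one direct report are counted as managers.
    """
    reports_per_manager = Counter(manager_of.values())
    if not reports_per_manager:
        return 0.0
    return sum(reports_per_manager.values()) / len(reports_per_manager)


# Placeholder org: "a" manages "b" and "c"; "b" manages "d", "e", "f".
org = {"b": "a", "c": "a", "d": "b", "e": "b", "f": "b"}
print(average_span(org))
```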

There were around 8 Engineering VPs, who reported to me. Then they each had organizations that were a couple hundred each, and from there it was recursive. But also, in some cases, we divorced the management structure from the teams doing work structure, and actually, the team that did this best was a team that we acquired into Twitter.

They had a bunch of engineers reporting to managers that knew about their specific type of engineering, whether it was mobile, backend, or whatever. But teams were assembled more on the fly. The team was the Agile group that was doing work that would meet up to burn through a bunch of user stories for a specific product that we were building.

Those could be disbanded and reassembled, but it didn’t require switching managers and coming up with a new career plan for yourself, being told that that path to the promotion that you are on, you’re not on anymore because now you have a new manager, and all the kind of stress that comes with switching managers. Because, at the end of the day, probably the single biggest determinant of whether or not you like your job is do you get along with your manager?

If you don’t like your manager, I will bet serious money that you will quit your job in a short amount of time, or transfer. So we tried to decouple that, and let project groups spin up, or shut down, without people getting different managers.

Another thing we tried to invest in is helping those managers have all the tools to succeed. We built and taught a course specifically on engineering management, covering how to run and effectively empower teams, what success looks like for them, and how to ask for help. It ended up being a great set of content and I’m really proud of the folks that built it.

We built that, but there wasn’t one specific way of doing work across the company. So it was more about best practices and providing a template, not mandating a single way to run a team. There wasn’t one set of tools that everyone had. Twitter, for better or for worse, was a very fragmented engineering culture. How one team worked was very different from how others worked, and in some cases it had some downsides. There were very different ways of thinking and working, as a function of people’s backgrounds, where they worked before, their specific beliefs, and many other things, and that caused a lot of tension. In many cases, people started resenting how other teams worked.

Twitter wasn’t really a place where you could mandate one way to do things. We did that in some cases, but that was a very expensive thing to do in terms of backlash and effectiveness. So we really tried to build toolboxes of things that people could use without actually mandating them. That said, I do think the more you can agree to a base set of things that are important and the ways to achieve those, the better.

As a result of that, we did come up with this thing called Engineering Principles, I think there were 10 to 12, and it was things that we valued. We had a point of view on build versus buy, appropriate reuse, and test-driven development, to name just a few examples. We tried to come up with philosophical tenets that we could all agree to, so that when we went out into the world of this very distributed, bottom-up, heterogeneous, decision-making environment, people at least had shared values that they could look at to make their own localized decisions.

I think that worked pretty well. It wasn’t perfect, but that was the thing that we tried to do just because we didn’t have a culture of top-down, homogeneous control, and a single way to work all the time.

Certainly, I think if you just showed up from another company, and you were now the manager of a group and you have to build things, there wasn’t really a pre-made box I could give you – “Here, all the tools, processes, methodologies, and philosophies that we believe in, and this is how software development works.” We tried to standardize that a little bit but there was very much this countercultural, kind of a revolutionary culture, where people didn’t want to do these things. It was an interesting challenge to try to create that to some extent.

Alex Circei: What would you think if each manager would use different tools? Would it be hard to have the same picture or a data-driven approach?

Alex Roetter: It’s really hard. Go back to the OOP analogy again: you can do whatever you want, it’s all behind your header file, and you implement it however you want. However, there are certain things, like “I need you to respond to this message and take these arguments and return an answer in this format.”

Code had to live on a certain server, the deploy tools had to work in a certain way, monitoring and observability had to be done in a certain way, so that our observability daemons could scrape variables and aggregate them across all the data centers. That requires you to report how much RAM you’re using in a standard way, report your latency in a certain way, etc.
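One way to picture this "do what you like behind the interface, but report in a standard way" arrangement is the sketch below. The `ReportsMetrics` protocol, the key names `ram_bytes` and `latency_ms`, and both service classes are hypothetical; the interview does not describe Twitter's actual observability API.

```python
from typing import Protocol


class ReportsMetrics(Protocol):
    """The agreed interface: any service must expose metrics() like this."""

    def metrics(self) -> dict[str, float]: ...


REQUIRED_KEYS = ("ram_bytes", "latency_ms")  # hypothetical standard names


class SearchService:
    def metrics(self) -> dict[str, float]:
        return {"ram_bytes": 2.0e9, "latency_ms": 35.0}


class TimelineService:
    # Extra, team-specific metrics are fine as long as the required ones exist.
    def metrics(self) -> dict[str, float]:
        return {"ram_bytes": 6.0e9, "latency_ms": 80.0, "cache_hit_rate": 0.93}


def scrape(services: list[ReportsMetrics]) -> dict[str, float]:
    """Aggregate the required metrics across services, rejecting non-conformers."""
    total_ram = 0.0
    worst_latency = 0.0
    for service in services:
        reported = service.metrics()
        missing = [k for k in REQUIRED_KEYS if k not in reported]
        if missing:
            raise ValueError(f"{type(service).__name__} is missing {missing}")
        total_ram += reported["ram_bytes"]
        worst_latency = max(worst_latency, reported["latency_ms"])
    return {"total_ram_bytes": total_ram, "worst_latency_ms": worst_latency}


summary = scrape([SearchService(), TimelineService()])
print(summary)
```

Each team keeps full freedom over how its service is built. The only thing mandated is the small reporting contract the scraper relies on.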

We tried to build a little interface where we could get certain things, and it wasn’t just machine metrics. “Your projects have to be in a certain place, so we can see what people are working on. Your OKRs need to be in this format and need to live here. We can see the goals of the whole organization, your key metrics need to be here.” We built some of that scaffolding.

For instance, I don’t care what you do in your house, but we do need to agree that when we meet up in court, we’re going to use the same set of laws, and that when we drive around and share roads, we’re gonna all drive on the right (at least in the US).

There is efficiency to a single way of doing everything, you’re never reinventing the wheel, you’re not doubling effort, but also there’s a huge cost. You’re now forcing everyone into a certain thing, which might not be the right fit for them. You are also squashing good ideas because you are limiting innovation.

Waydev has metrics that help engineering leaders visualize tickets better, check the unplanned work, so that you can better understand historical and present performance. As a leader, you want to check these things and react in real-time, so that when you scale, it’s easy to understand if software delivery is in place, you have the best standards, and you’re able to maintain a high velocity.

Alex Circei: Can you tell us a bit about what you are doing now?

Alex Roetter: I’m a full-time seed-stage investor now. A lot of the activities of an investor (identifying great talent, helping them succeed, coaching, and knowing when to get out of the way) are similar across operating and venture. Don’t get me wrong, a lot is different as well! Katie Stanton, who is a long-time colleague of mine (from Twitter and even before), and I are working together at Moxxie Ventures, a firm she founded a few years ago.

We just raised a new venture fund, an $85M seed-stage fund, and we’re deploying it now. We’re investing across verticals, and it’s always early-stage companies. We try to write $1.5-2M checks, try to buy 10% of companies in the early stage, and partner with them.

We really leverage our operating experience. Both Katie and I have operated for so long in different verticals. She operated in marketing, branding, go-to-market, and communications, and me, more on product and engineering. We try to find places where we can help leverage that operating experience to accelerate companies and help them avoid some of the pitfalls of operating, especially in hyper-growth environments.

We really enjoy that. We are not a vertical-specific fund. We look across verticals, but in particular, we’re doing a lot in climate tech these days. As you know, I really like developer tools, engineering productivity, and applications of AI. We’re also looking a lot at FinTech and financial inclusion, learning more about Web 3.0 and crypto stuff. We haven’t done many things there yet, but we are learning about it. And I think we will soon do something there, nothing announced yet.

But the area I’m most excited about is climate. It’s the largest problem facing us, it is at the root of other problems (inequality, poverty, injustice, poor health outcomes), and, transitioning the world’s economy to a sustainable future represents a massive economic opportunity for the investors and entrepreneurs who enable it!

It’s really fun. I get to work with great entrepreneurs and try to help them. It’s really satisfying and enjoyable.

Alex Circei: This is amazing, and I’m sure you are doing a lot of fun things and meeting a lot of amazing people.

Alex Roetter: Yeah, for example, I got to meet you and learn what you’re doing and back you in a small way. I’m working with people that are doing things as varied as AI for medicine, climate tech as I mentioned, AI, and many other things. It’s a real privilege to learn so much from them, and try to help in my own small way.

Alex Circei: Thank you so much for this!

Disclaimer: Alex Roetter is an investor in Waydev.
