Unicorn Project

Gene Kim's companion novel to Phoenix Project, with more framing provided by Business Strategy and data-backed Product Management (Agile Product Development). ISBN:1942788762 https://itrevolution.com/the-unicorn-project/

The Five Ideals: Locality and Simplicity; Focus, Flow, and Joy; Improvement of Daily Work (Improvement Kata); Psychological Safety; and Customer Focus (customer driven). https://www.infoq.com/articles/unicorn-project/

Core vs Context: see Inverted Doughnut

see Complexity Debt

Excerpts

Note to the Reader

The Unicorn Project takes place “in the present day,” and is a companion novel to The Phoenix Project (which also takes place “in the present day”). The events from both novels take place concurrently

Chapter 1: Wednesday, September 3

“You’re doing what?” Maxine blurts out, staring in disbelief at Chris, VP of R&D at Parts Unlimited.

“So you blame the person who was on vacation because that person couldn’t defend herself?” Maxine says in disgust. “That’s really admirable, Chris. Which leadership book did you get that from?”

“… I promised Steve and Dick that I’d put you in a role where you couldn’t make any production changes anymore,” Chris says, squirming. “So, uh, effective immediately, you’re moving from the manufacturing plant ERP systems to help with documentation for the Phoenix Project …”

“Oh, my God …” she says, finding her voice again. “You’re sending me to the Phoenix Project?!” she nearly yells

Maxine’s mind races, thinking about what she knows about the Phoenix Project. None of it is good. For years, it’s been the company death-march project

The board of directors just stripped Steve Masters of the chairmanship, so now he’s just CEO. And both the CIO and VP of IT Operations were fired yesterday, no explanations given, so Steve is now acting CIO too. Absolutely everyone is worried that there is going to be even more blood in the streets …

She can’t believe she’s starting to see these red flags at Parts Unlimited.

They have stores in almost every state and millions of loyal customers, although every metric shows those numbers declining

In the age of Uber and Lyft, the younger generation is more often choosing not to own cars at all, and if they do, they sure don’t fix their cars themselves

The room is so quiet you could hear a pin drop. It’s like a university library. Or a tomb, she thinks. It doesn’t look like a vibrant space where people work together to solve problems. Creating software should be a collaborative and conversational endeavor—individuals need to interact with each other to create new knowledge and value for the customer.

As she walks to the opposite corner where she was told to find Randy, Maxine suddenly smells it: the unmistakable smell of people who have slept in the office

I’m in charge of documentation and builds. In all honesty, things are a mess. We don’t have a standard Dev environment that developers can use

Maxine listens. All those stories about caveman technical practices in the Phoenix Project are actually true. She’s learned over her entire career that when people can’t get their builds going consistently, disaster is usually right around the corner

Equity Partners

Go Forward Options

Chapter 2: Friday, September 5th

She’s a developer who loves functional programming because she knows that pure functions and composability are better tools to think with. She eschews imperative programming in favor of declarative modes of thinking. She despises and has a healthy fear of state mutation and non-referential transparency. She favors the lambda calculus over Turing machines because of their mathematical purity. She loves LISPs because she loves her code as data and vice versa.

She’s baffled that no one knows of an actual person who uses Phoenix. Just who are they building all this code for?

The “15-minute stand-up meeting” went for almost 90 minutes because of all the emergencies. I don’t know how I missed this meeting yesterday—seems hard to miss because of all the yelling. Wow.
OMG. Almost no one else can build Phoenix on their laptops, either. They’re supposed to deploy this into production in TWO WEEKS! (No one is worried. Crazy. They think it will be delayed again.)

Steve usually presents with Dick, the CFO. About a year ago, Steve also started co-presenting with Sarah Moulton, the SVP of retail operations

Sarah will talk about the progress of the Phoenix Project later, and how it supports the three metrics I care most about: employee engagement, customer satisfaction, and cash flow

Effective last week, the board of directors has re-appointed Bob to be board chairman,” Steve says, his voice starting to quaver. Maxine watches with amazement as he wipes a tear from his eye

During wartime, it’s about finding ways to avoid extinction. And during wartime, the board will often split the roles of CEO and chairman.” Bob pauses, squinting into the bright lights, looking across the entirely silent audience. “I want everyone to know that I have complete confidence in Steve

The goal of the Phoenix Project is to enable our customers to order however they want, whether it be online, in our stores, or even through our channel partners. And wherever they order, they should be able to have their product delivered to their homes or to pick it up in one of our stores

I’ve decided that it’s time for us to finally get in the game. We will be launching the Phoenix Project later this month. No more delays. No more postponements.”
Maxine hears an audible gasp from the whole audience

Chapter 3: Monday, September 8

Maxine smiles when she sees that Kirsten Fingle is leading the meeting. She heads up the Project Management Office.

Chris is glaring at someone across the table from him who looks like Ed Harris from Apollo 13. When she quietly asks the person next to her who he is, he responds, “Bill Palmer, the new VP of IT Operations. Promoted last week after the big executive purge

Kirsten, with the earnestness that makes her so effective, doesn’t let it go. She leans forward. “No, really. I think you said, ‘Good luck, chumps.’ I’m always interested in your perspective, given your extensive success in plant operations. I’d love to better understand what made you laugh.”

Hi, I’m Kurt. I’m one of the QA managers who works for William. I heard in the meeting that you need license keys and environments and a bunch of other things to get a build running? I think I can help.

We’ve got hundreds of developers and QA people working on this project, and most can only build their portion of the code base. They’re not building the whole system, let alone testing it on any regular basis

“Oh, and by the way, there’s a rumor that might interest you,” Kurt says, looking around as if afraid of being overheard. “Word is that Sarah pushed for Phoenix to launch this week, and that Steve just approved it. All hell is about to break loose

At the very bottom of the notes, Derek wrote:
To get a Dev environment we need approval from your manager. The correct process is documented below. Closing ticket.

Helpdesk Derek is two buildings over, lower level.

Derek asks tentatively, “Do you mind if I ask a stupid question?”
“Of course. There are no stupid questions. Fire away,” she smiles, trying not to look manic.
“What’s a Dev environment? I’ve dealt with laptop issues, password resets, and things like that. But I’ve never heard of an ‘environment’ before

I’m sorry to have to close your ticket. Believe it or not, we’ve been out of storage space for the last three months

Learning Clojure, her favorite programming language, was the most difficult thing she had ever done, because it entirely removes the ability to change (or mutate) variables. Without doubt, it’s been one of her most rewarding learnings

text message from one of her collaborators on an open-source project she wrote to help her with personal task management. She started this project over five years ago to help her keep awesome work diaries.

Her app lets her easily push work into GitHub, JIRA, Trello, and the many other tools where she interacts with other people and teams

In that moment, Maxine decides she must bring this level of productivity that she’s helped create for middle-schoolers and her open-source project to the Phoenix Project, even if it means personal suffering in the short term.

Chapter 4: Thursday, September 11

the Phoenix Project is going to be launched tomorrow at five

no boring status meetings today. Instead, every meeting is in a genuine shitstorm, with people on the verge of panic

“Has Phoenix ever been deployed into production before?!”
“Nope,” he yells back.

Later that morning, Chris announces that William, the QA director, is in charge of the release team

William, when is your release team meeting?” Maxine asks him as he jogs by. She runs to keep up. “Can I help?”
“First meeting is in one hour

On the other side of the table is Bill Palmer, surrounded by a phalanx of faces she doesn’t recognize. She notices that there’s something … different about them.

These are all the Ops people, Maxine realizes. No wonder she hasn’t seen them around.

Marketing has pulled out all the stops. They’re going to spend almost a million dollars getting the word out about the Phoenix launch. All the store managers have been given instructions to tell every customer to download the app and hit the website Saturday—they’re even having contests to see which stores register the most new mobile customers.

Phoenix currently handles about five transactions per second

At that moment, Maxine hears a loud voice from the doorway. “For the survival of Parts Unlimited, we have to make this work, so of course we’re going to make it.”
Oh no, Maxine thinks. It’s Sarah Moulton.

“I can see that some of you have not bought into our mission,” Sarah says, appraising everyone in the room. “Well, as I mentioned in the Town Hall, the skills that got us here are not necessarily the same skills that will take us to where we need to go

The project manager in the group she’s sitting with says, “Won’t we need a bunch of firewall changes too? Not just to external traffic

At last, I’ve found you all,” John sneers, looking around as if he were a sheriff who had hunted down a group of outlaws. “I’m here about this mad plan to deploy the Phoenix application. This deployment will only happen over my dead body

Maxine looks around, thinking about the sudden, surreal appearance of Sarah and then John. She’s reminded of Redshirts by John Scalzi

Kurt is there with a black three-ring binder. Seeing her, he flashes a big smile. “I have a present for you!”
It’s an eighty-page document full of tabs. Just scanning the section headings makes her heart leap—they’re the painstakingly assembled Phoenix build instructions

bunch of people who really appreciate your work. We’ve been trying to crack the Phoenix build puzzle for months! But we’ve never been able to work on this full-time. Your notes helped us put all the pieces together. This saved us months of work!”

come to the Dockside Bar tonight at five. We meet there on Thursdays

the ‘official build team’ hasn’t exactly authorized these. They seem to view our efforts as a nuisance, or worse, as competition. Which, on the eve of the biggest and potentially most risky application launch in the history of the company, sure does seem odd, doesn’t it?

The build output is hypnotic and educational, because she’s seeing some components of Phoenix for the first time. There’s Java JAR files, .NET binaries, Python and Ruby scripts, and lots and lots of bash scripts.
Wait, is that a remote shell and installer that popped up?

Finally, nearly three hours after she started the build, she sees the scrolling output from her build window stop

Chapter 5: Thursday, September 11

Dockside parking lot, right on time for Kurt’s mysterious meeting

Hey, everyone, meet Maxine, the newest member of the Rebellion if I can help it. She’s the person that I’ve been telling you all about.”

Kurt laughs. “The reason Dave is so good is that he never asks for permission!

You’ve probably noticed that he’s running his own black market inside the company, right?”

the man in his late thirties who is wearing a funny vendor T-shirt. “This is Adam, one of my test engineers

Unfortunately, he’s so good at what he does, everyone seems to have him on speed-dial. And he’s on pager duty way too often, which we’re trying to fix.

last but not least is Dwayne,” says Kurt, gesturing to the oldest person at the table. He’s not only dressed differently than everyone, his laptop is different too—it’s a beast with a massive screen. “He’s a senior database and storage engineer from Ops and was the person who brought Brent into this group. They conspire all the time to find better ways to manage infrastructure

We invite you to be a part of the inner-circle of the ‘Rebellion.’ We’re recruiting the best and brightest engineers in the organization

And she notices that everyone has a small sticker of the Rebel Alliance from Star Wars on their laptop

Almost everything I need to do, I have to go up two levels, over two levels, and down two levels just to talk with a fellow engineer!”
“The Square!” cries out Adam, and everyone laughs

You’ve seen firsthand the reality bubble the bridge crew is in,” Kurt says. “They know the Phoenix Project is important, and yet they couldn’t have come up with a worse way to organize everyone to achieve it. They outsourced IT, brought it back in, outsourced one piece, maybe two pieces, shuffled them around … In many areas, we’re organized as if we’re still outsourced, and nothing can get done without permission from three or four levels of management.

You said you pitched William on funding an automated testing pilot.

We’re QA. We protect the organization from developers

You go around automating your QA, your budget shrinks instead of grows. I’m not saying you’re stupid, son, but you sure don’t seem to understand how this game works.’”

William is like a union leader, not a business leader,” Shannon says. “He only cares about growing his union membership dues

We’re going to lie low and keep doing what we’re doing, looking for new potential customers and recruits. We keep our eyes and ears open for opportunities to get in the game.”

It’s like in that movie Brazil, where the number-one fugitive is the rogue air conditioner repairman who fixes people’s air conditioners because Central Services never gets around to it. That’s us. We’re always on the lookout for places we can help. It’s a great way to make friends and find potential new recruits for the Rebellion.”

The build team is completely out of their league, with people who, I’m sorry to say, are the people who didn’t have enough experience to be application developers

Here’s what we still need: we need an automated way to create environments and perform code builds,” Kurt says, counting off on his fingers. “We need some way to automate those tests and some automated way to get those builds deployed into production. We need builds so that developers can actually do their work.
“So, who’s willing to volunteer some of their time to help Maxine get those Phoenix builds going?” Kurt asks. To Maxine’s surprise, all hands shoot up.

Chapter 6: Friday, September 12

At five p.m. the release starts on schedule

By midnight, it’s clear that a database migration is going to take five hours to complete instead of five minutes, with no way to stop it or restart it.

By two a.m. everyone realizes there is a very real risk that they are going to break every point-of-sale register in every one of the nearly thousand stores

Brent asks her to join a SWAT team to figure out how to speed up the database queries

They are stunned when they discover that clicking the product category drop-down box floods the database with 8,000 SQL queries
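
One pattern that commonly produces this kind of query flood is issuing a separate query for every item in a list (often called the N+1 query problem). The excerpt does not say what actually caused the 8,000 queries, so the sketch below is only an illustration: the `categories` and `products` tables and all function names are hypothetical. It uses Python's built-in `sqlite3` module to compare a per-category loop with a single grouped query.

```python
import sqlite3


def make_db(num_categories=5, products_per_category=3):
    """Build a small in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT)"
    )
    for c in range(1, num_categories + 1):
        conn.execute(
            "INSERT INTO categories (id, name) VALUES (?, ?)", (c, f"category-{c}")
        )
        for p in range(products_per_category):
            conn.execute(
                "INSERT INTO products (category_id, name) VALUES (?, ?)",
                (c, f"part-{c}-{p}"),
            )
    conn.commit()
    return conn


def counts_one_query_per_category(conn):
    """N+1 style: one query for the category list, then one more per category."""
    queries = 1
    result = {}
    categories = conn.execute("SELECT id, name FROM categories").fetchall()
    for cat_id, name in categories:
        row = conn.execute(
            "SELECT COUNT(*) FROM products WHERE category_id = ?", (cat_id,)
        ).fetchone()
        queries += 1
        result[name] = row[0]
    return result, queries


def counts_single_query(conn):
    """Same answer from a single JOIN with GROUP BY."""
    sql = (
        "SELECT c.name, COUNT(p.id) FROM categories c "
        "LEFT JOIN products p ON p.category_id = c.id "
        "GROUP BY c.id, c.name"
    )
    return dict(conn.execute(sql).fetchall()), 1
```

Both functions return the same counts, but the first issues one query per category plus one for the list, so its cost grows with the size of the drop-down.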

Dammit, I bet it’s another bad upload from the pricing team

“I need to inspect the CSV file that they uploaded into the app,” he says. “I think I can find one in the temporary directory on one of the application servers.” Maxine nods.

The first thing she would have done right away is write some automated tests to ensure that all input files are correctly formed before they allow them to corrupt their production database, and that the correct number of rows are actually in the file.
“Let me guess. You’re the only one who knows how to correct these bad uploads?” Maxine asks.
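
The pre-flight check Maxine has in mind could look something like the sketch below. The column names (`sku`, `price`), the `expected_rows` parameter, and the rule that a price must be a positive number are all assumptions made for illustration; the excerpt does not describe the real pricing file format.

```python
import csv
import io

REQUIRED_COLUMNS = ["sku", "price"]  # hypothetical column names


def validate_pricing_csv(text, expected_rows):
    """Return a list of error strings; an empty list means the file passed."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != REQUIRED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]

    errors = []
    rows = list(reader)
    if len(rows) != expected_rows:
        errors.append(f"expected {expected_rows} rows, found {len(rows)}")

    for i, row in enumerate(rows, start=1):
        # DictReader stores extra fields under the key None and fills
        # missing fields with the value None.
        if None in row or any(value is None for value in row.values()):
            errors.append(f"row {i}: wrong number of fields")
            continue
        if not row["sku"].strip():
            errors.append(f"row {i}: empty sku")
        try:
            price = float(row["price"])
        except ValueError:
            errors.append(f"row {i}: price is not a number")
            continue
        if not price > 0:  # also rejects NaN
            errors.append(f"row {i}: price must be positive")
    return errors
```

A caller would refuse to load the file into the database unless the returned list is empty.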

The Phoenix release is still nowhere near complete. “We’re fourteen hours into the launch, and the missile is still stuck in the tube,” Dwayne says glumly

it was the most amazing example of production data loss Maxine has ever seen.
Somehow, they managed to corrupt incoming customer orders

Most of the in-store systems are still down—not just the point-of-sale registers, but nearly all of the back-office applications that support the in-store staff

even the corporate website and email servers are having problems

Once again, Maxine’s sensibilities are offended by how entangled all these systems are with each other

all production changes must be approved by me, as well as Chris Allers and Bill Palmer

“It’s like the TV show Survivor,” says Shannon. “All the technology executives are just trying to last one more episode. Everyone is freaking out. Steve has been demoted, and Sarah is trying to convince everyone that she can save the company.”

We’ve got nearly sixty thousand erroneous and/or duplicate orders in the database, and we’ve got to fix them so that the finance people can get accurate revenue reports

Now, all product managers need to run everything by Sarah. Someone mutters, “Don’t hold your breath—she never responds right away.”
Great, Maxine thinks. Sarah has effectively paralyzed everyone in this room even further.
Throughout the day, all decisions and escalations quickly grind to a standstill, even for emergencies, which Maxine didn’t expect

Sarah has been sending out emails, sometimes in all caps, reminding people how important this is

The room is almost empty, even though this is a Sev 1 outage.
Apparently, everyone has had to go home sick. The Phoenix release forced people to work long hours together in close proximity all day and night, and with little sleep. Now everyone is dropping like flies

This is not acceptable, Wes,” says Sarah. “The business depends on us. The store managers depend on us. We need to do something!”
“Well, these were the risks we warned you about when you proposed proceeding with the Phoenix launch—but you emailed saying that we ‘need to break some eggs to make omelets,’ right?

They fired William

“Holy shit …” he finally says. Everyone next to Adam is also looking shocked at whatever is behind her.
Maxine turns around and sees Kurt walking through the entrance.
Next to him is Kirsten, the director of project management

Kirsten laughs. “I’ve long harbored a suspicion that how we manage technology at this company is not working. And it’s not just the Phoenix release catastrophe. Look at all the things we need from Phoenix that are still years away on the project plan.

Somehow, I think Project Management has turned into an army of paper pushers, being dragged into every single task because of all the dependencies

She turns to Kurt. “What was that term you used? Watermelon projects? Green on the outside, but red on the inside? That’s what every one of our IT projects is these days,” Kirsten observes wryly

We used to have three networking switches in all of our manufacturing plants. One for internal plant operations, one for employees and guest WiFi, and one for all our equipment vendors that need to phone home to their mothership.
“A couple of years ago, probably during budgeting season, some bean counter looked at those three networking vendors and decided to consolidate them down to one switch

But what they didn’t know was that they had three separate outsourcers managing the three different networks

Within a week, one of the manufacturing plants had their entire network knocked offline

All three outsourcers denied that it was them, even when we presented them the log files that clearly showed that one of them had disabled everyone else’s accounts. Apparently someone got tired of having their changes trampled on by the other two, so they just locked them out.”

Before, three teams were able to work independently on their own networks. And when they were all put on one network switch, suddenly they were coupled together, unable to work independently, having to communicate and coordinate in order to not interfere with each other, right?

They did it to reduce costs, but surely, in the end, it was more expensive for everyone all around

“Oh, my God. It’s just like the Phoenix Project!” she exclaims.
Silence falls upon the table as everyone stares at Maxine in a mix of horror and dawning realization.
“You mean everything that’s wrong with the Phoenix Project we did to ourselves?” Shannon asks

You are correct, Maxine. You are truly on the cusp of understanding the magnitude and scale of the challenges that await you,” a voice says from behind Maxine.

Chapter 7: Thursday, September 18

The owner of the familiar voice is, to Maxine’s surprise, the bartender the last time she was at the Dockside.

This is Dr. Erik Reid. You may not know this, but Steve and Dick have been trying to recruit him to serve on the board of Parts Unlimited for months. He’s worked with the company for decades. In fact, Erik was part of the initial MRP rollout in the ’80s, and then he helped the manufacturing plants adopt Lean principles and practices. We were one of the first companies to have an automated MRP system, and he’s a genuine hero among the manufacturing ranks.

On behalf of everyone in manufacturing operations, thanks for taking such good care of the MRP system

you’ve also created a system where small teams of engineers are able to work productively and independently of each other, with components painstakingly and splendidly isolated from each other, instead of being complected into a giant, ugly, knotty mess

“Uh, what does ‘complected’ mean?” Kurt asks.
Erik answers, “It’s an archaic word, resurrected by Sensei Rich Hickey. ‘Complect’ means to turn something simple into something complex.

you’ve trapped yourself in a system of work where you can no longer solve real business problems easily anymore

“The importance of lead times in software delivery is paramount, as Senseis Dr. Nicole Forsgren and Jez Humble have discovered in their research,” Erik says. “Code deployment lead time, code deployment frequency, and time to resolve problems are predictive of software delivery, operational performance, and organizational performance, and they correlate with burnout, employee engagement, and so much more.
“Simplicity is important because it enables locality

You’re saying that Phoenix used to be simple, but now it has become complected beyond recognition

Build responsibility moved from Dev to QA to interns. Tech giants like Facebook, Amazon, Netflix, Google, and Microsoft give Dev productivity responsibilities to only the most senior and experienced engineers. But here at Parts Unlimited, it’s the exact opposite

“I’ve started calling all of these things ‘complexity debt,’ because they’re not just technical issues—they’re business issues.

It’s a magnificent example of the First Ideal of Locality and Simplicity in our code and organizations.

In the Town Hall, Steve talked about how much he cares about employee engagement. What does he think when he sees that the department responsible for the most strategic program in the company is miserable? Shouldn’t that worry him?

There are Five Ideals,” Erik begins. The whole table turns their attention to him. “I’ve already told you about the First Ideal of Locality and Simplicity. We need to design things so that we have locality in our systems and the organizations that build them. And we need simplicity in everything we do. The last place we want complexity is internally, whether it’s in our code, in our organization, or in our processes. The external world is complex enough, so it would be intolerable if we allow it in things we can actually control!

The Second Ideal is Focus, Flow, and Joy.

The Third Ideal is Improvement of Daily Work. Reflect upon what the Toyota Andon cord teaches us about how we must elevate improvement of daily work over daily work itself.

The Fourth Ideal is Psychological Safety, where we make it safe to talk about problems, because solving problems requires prevention, which requires honesty, and honesty requires the absence of fear.

And finally, the Fifth Ideal is Customer Focus, where we ruthlessly question whether something actually matters to our customers, as in, are they willing to pay us for it or is it only of value to our functional silo?”

Kurt leans forward. “If I can get Chris to give me that chance, would you all be willing to join the team and show that we can change the trajectory of the Phoenix Project?”

Chapter 8: Tuesday, September 23

On Tuesday of the following week, Maxine arrives at work to see Kurt beaming. “I got the job,” he says exuberantly.
“Really? The Dev job?” Maxine asks.
“Yes, the Dev job!”

Apparently, the technology executives had an off-site with Steve earlier this week, and one of the things they agreed upon was a one-month feature freeze

Data Hub is being widely touted as the “root cause” of the catastrophic crashes during and after the Phoenix release. Chris even called them out by name during one of the meetings Maxine was in, which she thought was quite unfair.

Maxine has never liked that there are actually three inventory management systems—two for the physical stores (one was inherited by an acquisition and never retired) and another for the e-commerce channel. And there’s at least six order entry systems—three supporting the physical stores, one for e-commerce, another for OEM customers, and another for service station channel sales

Data Hub is a mishmash of technologies built up over the decades, including a big chunk that runs on Java servlets, some Python scripts, and something that she thinks is Delphi. There’s even a PHP web server

When there are multiple repair transactions being processed concurrently, sometimes one of the transactions gets the wrong customer ID, and sometimes Data Hub completely crashes,” he says. “I’ve tried putting a lock around the customer object, but it slowed down the entire application so much, it’s just not an option. We have enough performance problems as it is.”

she proposes rewriting the code path using functional programming principles

She has each thread make its own copy of the customer object. They rewrite each object method into a series of pure functions—a function whose output is completely dependent upon its inputs, with no side-effects, mutations, or accesses to global state
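
The idea in this passage can be sketched roughly as follows. The `Customer` fields and the `record_repair` function are hypothetical, and the example is written in Python because the excerpt does not show the Data Hub code. It only illustrates the general technique: a pure function returns a new value instead of mutating a shared object, so concurrent transactions need no lock.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Customer:
    """Immutable value: attributes cannot be reassigned after construction."""
    customer_id: int
    repairs: tuple = ()


def record_repair(customer, repair):
    """Pure function: the output depends only on the inputs.

    It returns a new Customer and leaves the one passed in untouched.
    """
    return replace(customer, repairs=customer.repairs + (repair,))


def process_all(transactions):
    """Process (customer, repair) pairs concurrently without any locking."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(lambda t: record_repair(*t), transactions))
```

Because each call works on its own immutable value, one transaction cannot end up with another transaction's customer ID, and no lock is required.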

Testing doesn’t start until Monday.”
Maxine feels her heart drop. “We can’t test it ourselves?”

They write the tests. They don’t even let us see the test plans anymore.

then tackle a crash-priority feature, this time to create some business rules around extended warranty plans, critical enough to be exempted from the feature freeze.
“Why is this so high priority?” Maxine asks Tom as she reads the ticket.
“This is hugely revenue-generating

Maxine guesses that they’ll have to bring in six other teams because of how many business systems this affects.
Maxine is dismayed as the number of teams that need to be involved keeps growing. This is again the opposite of the First Ideal of Locality and Simplicity

“We can’t directly access production logs?” Maxine asks, afraid of the answer.
“Nope. Ops people won’t let us,” he says, typing into the form.

As Tom had suspected, it was an internal networking change that caused the problem.

Eventually, Sarah gets involved and demands that there be severe consequences

there have been times when we’ve hired new developers and six months later, they still can’t do a full build on their machines

“My pain points?” Tom muses. “It’s our environments. We used to have a good handle on this, but then we got moved into the Phoenix Project and they made us use environments from their centralized environments team.

“That’s terrific,” Kurt says. “We all know how important environments are. For now, feel free to spend half your time on this—I’ll hide it in the timecarding system.”

“They’re not kidding around,” Brent says. “Bill has been awesome. He’s told me in no uncertain terms that I’m to work only on Phoenix-related things. He’s taken me off of pager rotation for basically everything

All the tech giants, at some point in their history, have used the feature freeze to massively rearchitect their systems

Interesting that these CEOs I mention all have a software background, isn’t it?

Contrast that with the tragic story of Nokia

In 2010, Risto Siilasmaa was a board director at Nokia. When he learned that generating a Symbian build took a whole forty-eight hours, he said that it felt like someone hit him in the head with a sledgehammer,” Erik says. “He knew that if it took two days for anyone to determine whether a change worked or would have to be redone, there was a fundamental and fatal flaw in their architecture that doomed their near-term profitability and long-term viability

Siilasmaa knew that all the hopes and promises made by the engineering organization was a mirage. Even though there were numerous internal efforts to migrate off of Symbian, it was always shot down by the top executives until it was too late.

I asked my project managers to sample a couple of features and find out how many teams were required to implement them. The average number of teams required was 4.2, which is shocking

But consider the forces arrayed against you. The entire Project Management Office aims to keep projects on-time and on-budget, following the rules and enforcing the promises written long ago. Look at how Chris’ direct reports act—despite Project Inversion, they keep working on the features because they’re afraid of slipping their dates

Innovation and learning occur at the edges, not the core. Problems must be solved on the front-lines

And that’s why the Third Ideal is Improvement of Daily Work. It is the dynamic that allows us to change and improve how we work, informed by learning. As Sensei Dr. Steven Spear said, ‘It is ignorance that is the mother of all problems, and the only thing that can overcome it is learning.’

The famous Andon cord is just one of their many tools that enable learning. When anyone encounters a problem, everyone is expected to ask for help at any time

And thus problems are quickly seen, swarmed, and solved, and then those learnings are spread far and wide, so all may benefit

huge library of rules and regulations, processes and procedures

Each adds to the coordination cost for everything we do, and drives up our cost of delay

Some think it’s about leaders being nice,” Erik guffaws. “Nonsense. It’s about excellence, the ruthless pursuit of perfection, the urgency to achieve the mission, a constant dissatisfaction with the status quo, and a zeal for helping those the organization serves.

Which brings us to the Fourth Ideal of Psychological Safety. No one will take risks, experiment, or innovate in a culture of fear, where people are afraid to tell the boss bad news

John Allspaw says, every incident is a learning opportunity, an unplanned investment that was made without our consent

“I did some asking around to find out what actually happened,” Dwayne continues. “Apparently, Chad had worked four nights in a row, in addition to working his normal daytime hours

You’d be surprised how deeply this sense of injustice would resonate with Steve. You’d know that if you’ve spent any time on the manufacturing floor

when Steve signed on as the COO and VP of manufacturing, he made it contingent upon the company publicly targeting zero on-the-job workplace injuries? He was almost laughed out of the room

Alcoans were extremely caring people. Every time people were injured, they mourned and there was always lots of regret—but they didn’t understand that they were responsible. It had become a learned condition to tolerate injuries

Chapter 9: Monday, September 29

where do the QA people sit? I bought all these donuts for them!”
Tom looks surprised. “Well, I’ve met a bunch of them—some of them are offshore, some are on-site, but I haven’t talked with them directly in a long time

end of next week, when they present the testing results.”
“Next week? Next week?!” Maxine’s jaw drops. “What are we supposed to do in the meantime?

So, let’s suppose that all our changes work perfectly—when would be the soonest that our customers actually get to use what we wrote?” she asks

seven weeks from now

Go over to Building 7 to deliver those donuts while he’s preoccupied. I’ll connect you with Charlotte, who is, or was, William’s assistant. She’s like the mother hen for all the QA people.”
Kurt finishes typing. “She’s expecting you. I think three boxes will be enough for the Data Hub team. Ask Charlotte how to most strategically deploy the remaining two boxes,” he adds with a smile.
“She’ll get a conference room for you and bring the Data Hub QA team by,” Kurt says. “You’ll get a chance to meet all of them. And maybe you’ll find some people who are looking for help.

Interesting that developers can’t get into the QA building,” Tom says. “Does that mean QA people aren’t allowed into the Dev building?”

After ninety minutes, it’s clear to Maxine that it isn’t really about Dev versus QA—instead, it’s about how Phoenix business requirements change so often, which almost always requires urgent code changes. This reduces the time available for testing, resulting in poorer quality, as evidenced by the latest Phoenix disaster.

She’s been dying to see the QA team workflow. When Maxine sees the tool, she is momentarily taken aback.
“Is that IE6?” Maxine asks, hesitantly. The last time she saw that version of Internet Explorer was in Windows XP.
Purna smiles, as if she’s used to having to explain this to people. “Yes. We’ve been using this tool for over a decade, and now we have to run the client inside of an old Windows VM

Purna turns to her and says, “Well, that’s about all we can do. The QA1 environment still hasn’t been reset. We’re waiting on a customer test data set from the Data Warehouse team, and the Phoenix Dev teams still haven’t started their merge … until we get those, there’s really nothing we can do.”
“The developers haven’t started merging?” Maxine says, her heart sinking. “How long does that take?”
“We usually get something within two or three days

What exactly do you need from the Data Warehouse team?” she asks, reminded of Brent’s data problems during the Phoenix release and Shannon describing her five frustrating years on that team.
“Oh, everyone waits for them,” she says. “They’re responsible for getting data from almost everywhere in the company, and cleaning and transforming it so that it can be used by other parts of the business. We’ve been waiting almost a year for anonymized customer data, and we still don’t have test data that includes recent products, prices, and active promotions. We always get pushed down the priority list, so our test data is years old.”

A tall man in his early fifties is angrily yelling at Charlotte and pointing at the pizzas and then at Tom and the other Data Hub developers.
Uh oh. That must be Roy

Roy looks at her, speechless. Finally, he releases her hand and says loudly, “Oh, no you don’t. I don’t know what you all are up to,

Kurt is about to respond when Kirsten walks into the room behind him, saying, “Hi, Kurt. Hi, Roy. Mind if I join you all? Oh, I love pizza.”

Maxine sees Roy approach Kurt. She inches over so she’s just close enough to hear him say, “… This isn’t over. You somehow managed to find a patron, but she won’t be able to protect you forever. You think you’re better than us? You think you can come in here, put on airs, and automate everyone’s job away? Not on my watch. I’ll make sure we bring you down.”

Chapter 10: Monday, September 29

Maxine loves it when everyone merges their changes frequently to the ‘master branch,’ such as once per day

On the other hand, you have what Phoenix developers do—a hundred developers work for weeks at a time without merging, and from what Purna says, it usually takes at least three days to merge. Maxine thinks, Who would ever want to work that way?

There are 392 Dev tickets to be merged

Some are lobbying to do these merges less frequently—instead of once per month, maybe once per quarter

Jared is the source code manager. Developers aren’t allowed access to production. The only time developers can push changes to the release branch is for P1 issues. This is only a P3 issue

Four hours after following Jared out of the room, Maxine is dazed and disoriented

she sees that Ops is imprisoned by the same wardens as the developers upstairs.

Chapter 11: Wednesday, October 1

It’s just a giant circle of tickets, Kurt, being created and passed around, over and over again, without end.

We did it to ourselves. Long ago, QA used to be a part of Dev, but when I joined, QA had been made independent. We made a bunch of rules about how we needed to be separate from Dev concerns, you know, to protect the business from all those crazy, reckless developers. Each year, we used anything that went wrong as an excuse to create more and more rules to ‘make developers more accountable,’ which just slowed us down even more

In Operations, we did it to ourselves too

In Ops, it’s worse, because we have so many areas of specialization

So we need a ticketing system to manage those complex flows of work. But it’s so easy for people to lose sight of what the purpose of all this work is. It’s why the Rebellion is so important.

Maxine receives a reply from Kurt:
Hahahahaha! Sorry, no. Maybe Monday. But Data Hub and its environment are almost ready to be tested

The real question, Maxine realizes, is which features they should be working on. She wonders what features in Data Hub would be most important for the business. And which business unit they should focus on

Maxine eventually notices a term that comes up over and over: “Item Promotion.”

She sees another pattern just like this, but for winter tires and chains, chains and windshield de-icing fluid, and many more

each discounted bundle always takes two months to create.

The ticket was created seven months ago, and the title reads, “Create in one step: new product bundle SKU with associated discount.”

The author of the ticket is Maggie Lee, the senior director of products

Maggie’s name has come up over and over again.
“She works for Sarah, and all the product owners for in-store and e-commerce report to her,”

Now, instead of waiting weeks to get access to one of the scarce QA environments, you can just run this Docker image on your laptop. It takes a couple minutes to download, but only a couple of seconds to start up.

Kurt frowns. “Actually, I take what I said back. This is a Dev and QA success story. We still have angry business stakeholders who don’t have their features. How do you get these features into production?”

Who’s the most powerful opposition?” Kurt asks.
“Security, most definitely,” Dwayne says. “They’re going to want to do a security review of the code before it goes into production

We probably need to go through TEP-LARB

Nothing gets through TEP and LARB

LARB’ stands for Lead Architecture Review Board,”

And to pitch anything to them, you first have to fill out the Technology Evaluation Process form, or the TEP,” he explains. “Maxine is right. It’s a lot of effort. It’s about fifty pages these days.”

Maxine replies, “You know, we could just run Data Hub ourselves. Like, run it completely without any help from Operations, similar to how we ran our own MRP system in my old group.

Suddenly, he frowns. “Wait, wait, wait. Does that mean we’re all going to have to wear a pager?”
“Yes,” says Brent, adamantly. “You build it, you run it.”
Tom’s excitement visibly fades.

Purna and the QA teams are using the Data Hub environments as well; once features are flagged as “Ready to Test,” they’re tested within hours

many defects and even a couple of features are completely implemented and tested in one day

Holy cow, is that what I think it is?” Maxine asks, stopping midstride and staring at the screen.
“If you mean, does this look like a continuous integration server that is doing code builds and automated tests on Data Hub for every check-in, running in the environments that you helped build? If so, then you would be absolutely right,” says Adam, a huge smile on his face

They now have better technical practices than most of Phoenix.

She’s delighted that the team is organizing itself, just as Erik predicted.

Chapter 12: Monday, October 13

a new constraint has emerged

Now it is obvious that the constraint is deployment—

We have other huge problems being connected so closely to Phoenix. It sometimes sends us tons of messages that hammer the back-end systems that we connect to

Dealing with those systems of record is a huge pain in the ass. We don’t have any real API strategy around here. No one knows what APIs are available, and even if you do, no one knows how you get access to them or deal with their crazy authentication or pagination schemes. Everyone’s documentation is crap, and some of these teams don’t even care if their APIs don’t work as advertised.

My name is Maggie Lee. I’m senior director of retail program management. What that really means is that I have the P&L responsibilities of all the products and programs behind our stores, which includes physical stores, e-commerce, and mobile. My group of product managers own strategy, understanding the customer and market, customer segmentation, identifying which customer problems we want to solve, pricing and packaging, and managing the profitability of everything in our portfolio.”

this could potentially save the quarter. And maybe even the company.

Steve has promised all the analysts that this holiday season we’re finally going to see an uptick in revenue. This is after years of over-promising and under-delivering. Everything hinges on Promotions being able to move the needle on sales

What exactly is in the way?” she asks.
“Who isn’t in the way?” Kurt laughs. “We’re meeting with Information Security tomorrow, who could kill this effort on a whim. But the real threat is the TEP-LARB.

Maggie finally smiles, in not an entirely kind way. “For this, I think we need to bring in the big guns.”
“Who’s that?” asks Maxine, curious who could possibly be a more powerful sponsor than Maggie.
Maggie grins. “Sarah. Take it from me, there is no one more effective at busting down inconvenient barriers than she is.

That’s exciting,” he says, agreeably. But then he takes off his glasses and puts them on the table. “Look, I really want to help, but I can’t. I’m responsible for making sure applications in my portfolio meet all applicable laws and regulations and that all those applications are secure. Given how radical of a change you’re making, I’m afraid we need to perform a complete due-diligence effort. And you simply can’t jump the entire queue. You have twenty people ahead of you who would scream bloody murder,”

Ron says, shaking his head. “I don’t set the priority or order of the applications. That comes from the business. You know, our customer.”
“But we are ‘the business!’ And those ‘customers’ you’re talking about aren’t our customers—they’re our colleagues!

Ron shrugs his shoulders. “If you want to change the order, you’ll have to talk to our boss, John.”

When her dad had a stroke two years ago, she had remarked on all the bewildering processes in the hospital to one of her doctor friends. Her friend responded, “You were lucky. The processes in a stroke ward tend to be superb, because everyone knows that every minute counts and waiting could be the difference between life and death.
“The worst systems tend to be in mental health and elderly care, where there is less urgency and often no patient advocate,”

As Maggie promised, they are on the LARB agenda on Thursday

At one table sit all the senior Dev and Enterprise architects, and at the other table sit all the Ops and Security architects

when Maxine hears that all they’re looking for is permission to use Tomcat, she’s aghast.
Having to ask permission to use Tomcat in production is like asking permission to use electricity—maybe it was once considered dangerous, but now it’s commonplace. Worse, it’s apparently their second time pitching the LARB.

That’s Ellen. She’s one of the best Ops people around

she sees all the Ops and Security architects shaking their heads. One of them says, “Dwayne, I appreciate what you’re saying, but we’ve never done anything even remotely similar to this. It’s embarrassing that we can’t even support Tomcat—but that shows you exactly why we can’t possibly support this. Unless there’s a group willing to volunteer to support this initiative as a side project, I think we need to table it.”
Dwayne speaks up, “Hell yes, I volunteer. And I’ll grab some people I know who would love to help the Data Hub team with the support responsibilities.”

The Ops Chair looks surprised but says, “I appreciate your enthusiasm, but I’m afraid that we cannot support your initiative at this time. Let’s pick this up in six months and see if conditions have changed by then.”

But we still got our asses kicked, right?”
Kurt says, “We did indeed. But if all goes according to plan, by the end of the day there will be a memo going out from Chris and Sarah announcing a small re-org that will allow Data Hub to operate outside the conventional Ops and QA processes. That will be the official go-ahead to do whatever we need to do.”

“What the hell have you gotten me into?!” Chris says, fuming at Kurt. “Maggie and Sarah tell me that you’ve proposed to create your own Ops organization inside of Dev?!

Maxine is in Chris’ office with Kurt, Dwayne, and Maggie. Chris is clearly not happy, but Maggie goes to extraordinary lengths to describe the business outcomes that need to be achieved and the grave consequences of not doing so.
Chris stares out his window for several moments and then turns to Maxine. “Do you think we really have the chops to keep all this from blowing up in our faces?”
“Absolutely, with the help of Dwayne and Brent from Ops,” she says with certainty.

As promised, Chris sends out a memo to everyone announcing a re-org—the Data Hub team is now reporting directly to him, and as an experiment, they’ll be exempt from the normal rules and regulations around changes, able to test their own code, deploy it, and operate it in production themselves.

Kurt laughs. “I don’t think Chris had that much choice in the matter. Both Maggie and Sarah took this all the way up to Steve.”

Many things in Data Hub blew up when installed on a current OS version … from this decade. They found several binary executables that no one could find the source code for. Data Hub had become this fragile and irreproducible artifact.

something strikes Maxine as odd. She notices that all the Data Hub engineers are pitching in

she asks Tom what’s going on. He says, “This sounds strange, but technically there isn’t any feature work even ready. Believe it or not, every feature is waiting on something from Product Management

What are all those blue cards?” she asks.
“Good eye. That’s exactly the problem,” he says. “Those are all the features that we’re working on, but we’re blocked because of something we need from Product Management

Oh, and here’s some yellow cards which are the features we’ve completed but that haven’t been accepted by the business stakeholders yet. This one has been waiting for forty days.

By the way, did you know that Sarah put a huge design agency on retainer, and now they’re flooding some of the other teams with wireframe diagrams that will probably never get worked on? And no matter how much the Dev managers ask them to stop sending wireframes, they still keep coming

She’s seen teams still waiting to be assigned designers, eventually making their own wireframes, HTML and CSS styling, and icons just to keep feature flow moving. These are the projects that teams are actually embarrassed to show other people, she thinks.
The good news is that Sarah got a bunch of great designers. The bad news is that she put them all where they weren’t needed and were actually slowing important development work

She searches for the first feature that she worked on with Tom about extended warranty programs. When she finds it, her jaw drops. That feature was first discussed almost two years ago.

Only 2.5 percent of the time required to go from concept to customers actually using the feature is spent in Development

That afternoon, Maggie comes up with an elegant solution. She decides to move the Data Hub product manager from the Marketing building to a desk right by Maxine starting Monday.

In the conference room, Maggie tells him, “You’re the bottleneck. Your top priority now is to make sure any questions that the technology teams have are quickly answered. Nothing else takes priority over that.”

she says, “If you’re too busy to work with the technology teams, I’ll move you into a pure product marketing role, and you don’t have to move your desk. Right now, I need product managers who are working side by side with the teams who are building what will achieve our most important business objectives. If you still want to be a product manager, I’ll figure out how to clear your plate and get those other responsibilities assigned to someone else.

better yet, engineers start gaining a much better understanding of the business domain

Chapter 13: Thursday, November 6

Okay, we found two other places where we missed some configuration settings in environment variables. Those are now all in version control

Tom confirms, “Data Hub is processing transactions again. We’re in the deployment business!” He looks around with a big smile. “Who wants to go to the Dockside to celebrate?”

Maggie tells the group more about their struggles, some of which surprises and worries Maxine. They have only completed integrations with two systems of record. They are still waiting for nearly twenty API integrations, including product, pricing, promotions, purchases

They’ve hired a bunch of data scientists to help create more effective offers, but they’re still waiting on the Data Warehouse team

Kurt turns to Maxine. “Maggie already has a bunch of development teams assigned to support the Promotions effort, but they clearly need some help. Based on what you’ve heard, who would you want to take with you to make the biggest difference?”

When he nods, she says to Kurt, “That’s twelve people

Kurt looks at Maggie. “We’ll have to convince the higher-ups to make a massive investment in speed to achieve your goals. Do you think you can swing that?”
“Hang on a second. You’d all do that for me?”

The next morning, Kurt, Maxine, Kirsten, and Maggie are once again in front of Chris. When Kurt proposes temporarily swarming the Promotions efforts, not surprisingly Chris seems exasperated

Chris furrows his brow. Just like last time, he turns to Maxine. “What do you think, Maxine? Do we really need to do this?”
Maxine studies him, realizing how uncomfortable he is with the constantly changing plans, very different from the static plans that characterized the Phoenix Project

Despite Sarah’s loud demands to find someone to blame on the temporary Data Hub outage yesterday, Kurt refuses to do anything along those lines

Kurt starts the meeting, saying, “Every time we have an outage, we’ll be conducting a blameless post-mortem like this one. The spirit and intent of these sessions are to learn from them

The goal is to enable the people closest to the problem to share what they saw, so we can make our systems safer

I’ve spent decades looking at stack traces, but I’ve never seen them in our deployment tool

You know, I should have rehearsed looking at logs in this new tool

Over the next hour, Maxine and the group assemble an amazingly detailed and vivid timeline of what actually happened

Maxine is certain that everyone has learned something about how Data Hub actually works, in stark contrast with their mental models of how they thought it works

She records a list of five things people will change right away that will likely prevent future outages and will certainly make fixing certain problems faster in the future

Maggie continues, “For our loyal customers, we know what they buy and how frequently they buy it

We know from our customer research that our core market uses mobile phone apps extensively

here is a picture of Tomas, a customer we interviewed during our market research

now it’s something he does with his two teenage daughters and son. He wants his kids to focus on STEM, but he insists that they understand mechanical basics and learn self-reliance

Tomas doesn’t consider himself to be very technical, but he has six computers at home that he supports for his entire family

Right now, he uses a spiral notebook and these file folders to keep records for each car he maintains

He uses his mobile phone all the time, primarily for messaging but also Amazon. He would love to have more of the maintenance routine codified

One of the most critical sources of data is the in-store inventory systems. We want to promote overstocked items, but we’ve got to be very careful that we don’t promote items that we don’t have on hand in that region.

If only we had a single view of the customer

we’ve built some customer profiles based on their behavior. The archetypes we’ve created so far are: Racing Enthusiast, Frugal Maintainer, Meticulous Maintainer, Catastrophic Late Maintainer, and Happy Hobbyist.” (Personas)

For now, we are focusing on the Meticulous Maintainers and the Catastrophic Late Maintainers

On the screen, you can see a bunch of hypotheses we have. These are offers we think will be a big hit

The problem is that executing on any of these ideas requires months. Anytime we want to do something, we’ve got to make a million changes across all of Phoenix. Phoenix has been rolling for three years, and we haven’t made one targeted promotion yet. And if we can’t experiment, we can’t learn!”

She looks at Maggie for several moments and then asks the question she’s been wanting to ask since last night. “Just what does it take to build great products? And how can we as developers help?”

You know, if you’re that interested in the customer, you could do the same in-store training that all employees and managers do

He points to a rack of large, thick books. “These are the books that you’ll use to help your customers.” They look like the huge phone books Maxine grew up with, four inches thick with razor thin newsprint paper.
“Your customers will often come in looking for a replacement part or with a problem that you need to diagnose,

Even for the 2010 model year, there’s table after table of all different configurations. Number of engine cylinders, size of engine, standard cab, extended cab, short wheelbase, long wheelbase … and for variation, there are a bunch of parts

often, the customer won’t know. When that happens, you walk out to their car with them and help them find the information. The fastest way is to record all the information on this little sheet.” He holds up a piece of paper. “This helps ensure you only have to go out to their car once.”

She had no idea how much time the in-store employees spend helping customers figure out why cars won’t start or what the strange noises coming from their engines mean.
Accurately diagnosing the problem is important, because they can help the customer avoid going to a service station.

Maxine is also exhausted from seeing the constant inadequacies of the computer systems supporting in-store employees. Matt was right—using the system was a nightmare. Once you knew the VIN and the part you needed, looking up certain out-of-stock parts required using a 3270 terminal session and keying in commands. This is the famous “green screen” mainframe interface, which most people have seen but few have used

The process of ordering out-of-stock parts is even worse

Inside those cabinets are racks of tablets that corporate deployed to the stores. Trouble is, all the apps make you fill out so many fields that they’re even harder to use than the computers. At least the computers have real keyboards. No one has used a tablet in months.”

She’s always been a prolific notetaker. She remembers reading somewhere, “In order to speak clearly, you need to be able to think clearly. And to think clearly, you usually need to be able to write it clearly

When she finishes her draft an hour later, she closes her laptop. She knows everyone won’t read her document, which means she’ll have to give a presentation on it

Why do we have to enter so much data? If this person is a repeat customer, do we still have to type all of this in?”

“I still don’t get it. Why do we have to type in so much information?” the youngest trainee asks.
“Corporate wants us to,” Matt says, eliciting laughs from the trainees.

“It would be great if the computer systems could tell us who’s purchased multiple batteries so that we could proactively make that recommendation,” Kurt adds.

Part Three: November 10–Present

Chapter 14: Monday, November 10

Before we start, there’s something I think we need to do,” Maggie says. “We really need a code name for this effort. If we’re working toward something big, we need to have a name.

She’s always loved the Tuckman phases of teams, going through form, storm, norm, and perform

And that’s why I’m excited to introduce the Unicorn Project,” she says, smiling as everyone in the audience laughs at the whimsical name. “I’d like to recognize Kurt Reznick and Maxine Chambers who approached me a short time ago with a radical idea to make this happen, along with a group of engineers who wanted to help. We have all been working with the support of the entire Phoenix Project, toward the goal of having incredibly effective campaigns in support of Black Friday

She also notes Steve’s oblique reference to Sarah and her absence today. She looks around and confirms that she is nowhere to be seen, wondering whether that’s good news or bad news.

A major part of Narwhal is that it will often store copies of the major company systems of record

Believe it or not, all of us are strongly in favor of a pure NoSQL solution

No one in the company had used NoSQL in production in a significant way, let alone for something so large and mission-critical. Usually Maxine thinks prudence and practicality would disqualify such a risky approach for such a high-stakes project

risk is losing relational integrity between all these tables that we’re copying from everywhere in the enterprise. As you know, a NoSQL database won’t enforce relational integrity like most databases we’re used to. But I’m comfortable that we can enforce it at the API level
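To make "enforce it at the API level" concrete, here is a minimal Python sketch (not from the book). It assumes every write goes through a single API class, with plain dicts standing in for NoSQL collections; `OrderApi`, `IntegrityError`, and the field names are hypothetical. A real system would also need to handle concurrent writers, which this sketch ignores.

```python
class IntegrityError(Exception):
    """Raised when a write would leave a dangling reference."""


class OrderApi:
    """Sole write path to the store, so foreign-key-style checks live here."""

    def __init__(self):
        # Plain dicts stand in for NoSQL collections keyed by document id.
        self.customers = {}
        self.orders = {}

    def put_customer(self, customer_id, doc):
        self.customers[customer_id] = dict(doc)

    def put_order(self, order_id, doc):
        # The database will not reject an unknown customer_id, so the API must.
        customer_id = doc.get("customer_id")
        if customer_id not in self.customers:
            raise IntegrityError(f"unknown customer_id: {customer_id!r}")
        self.orders[order_id] = dict(doc)

    def delete_customer(self, customer_id):
        # Refuse deletes that would orphan existing orders.
        if any(o["customer_id"] == customer_id for o in self.orders.values()):
            raise IntegrityError(f"customer {customer_id!r} still has orders")
        self.customers.pop(customer_id, None)
```

The trade-off is that integrity only holds as long as nothing bypasses the API and writes to the store directly.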

The most difficult part was not the mechanics of importing the data from twenty different business systems. Instead, it was trying to create a unified vocabulary and taxonomy that they could use, because almost every business system had different names for similar things.

Physical stores have five different definitions of in-store sales

Maxine found herself constantly switching between insisting on clarity and consistency to ensure accuracy to saying “good enough for now”

One of the two most senior data scientists from the Promotions team is visibly flustered. “We still don’t have the fields we need in the one percent subset of the customer list from the Data Warehouse team

Maxine immediately sees that the queries the data scientists are building are a complete mismatch to what they’ve built Narwhal for. Narwhal is stellar at handling API requests from all the various teams across the company, but now they’re learning that it’s spectacularly not great for what the Analytics teams need to do

the Data Warehouse teams still haven’t reconciled the different definitions of product, inventory, and customer from the physical stores and e-commerce stores

In order for the Unicorn team to succeed, they somehow need to be decoupled and liberated from the giant data warehouse, and maybe even Narwhal, to support the massive calculations and queries they need to do.

But I spent nearly five years on the Data Warehouse team thinking about this. Let me show you something I’ve wanted to do for years.

She is proposing to build a Spark-like big data and compute platform

Unlike Data Hub, where almost every business rule change also requires a change from the Data Hub team, this new scheme would allow a massive decoupling of services and data

it could also support an immutable event sourcing data model, which would be a massive simplification compared to the current morass of complexity built up over decades.
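As a rough illustration of what an immutable event sourcing model means (my own sketch, not the design described in the book): facts are only ever appended, and current state is derived by replaying them. The `Event` fields, the event kinds, and `InventoryLog` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact; frozen=True blocks mutation after creation."""
    kind: str       # "received" or "sold" (hypothetical event kinds)
    sku: str
    quantity: int


class InventoryLog:
    """Append-only log; state is never stored, only recomputed from events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        # The only write operation: no updates, no deletes.
        self._events.append(event)

    def on_hand(self, sku):
        # Replay every event for this SKU to derive the current quantity.
        total = 0
        for event in self._events:
            if event.sku != sku:
                continue
            if event.kind == "received":
                total += event.quantity
            elif event.kind == "sold":
                total -= event.quantity
        return total
```

Because nothing is overwritten, a new consumer can build its own view from the same log without asking the owning team for a change, which is the decoupling the passage points at.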

I’ve always wanted to run something like this, but … it’s just so much new infrastructure to build at once. This seems a bit reckless, even to me.

All my intuition and experience says that our data architecture has created another bottleneck that affects every area of the company.

Chapter 15: Tuesday, November 25

At times, it’s difficult to know who works on which team, because people are moving so fluidly between them

Shannon’s hypothesis is correct—it was a problem in the order entry back-end systems. All the systems in that particular cluster are pegged at one hundred percent CPU usage; unfortunately, the system being hit is part of the main ERP, which handles almost all the core financials of the company. It’s been running for over thirty years, but it’s stuck on a version that is almost fifteen years old

All those clients start resending the queries, causing even more requests to overload the back-end database
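A common way to damp this kind of retry storm is for clients to wait an exponentially growing, randomized delay before resending, instead of retrying immediately. A minimal Python sketch, not from the book; `call_with_backoff`, `backoff_delays`, and their default values are hypothetical.

```python
import random
import time


def backoff_delays(max_retries, base=0.5, cap=30.0, rng=random.random):
    """Yield one jittered delay per retry: random in [0, min(cap, base * 2**n))."""
    for attempt in range(max_retries):
        yield rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(operation, max_retries=5, sleep=time.sleep, **kwargs):
    """Run operation(); on failure wait a jittered, growing delay, then retry."""
    delays = backoff_delays(max_retries, **kwargs)
    while True:
        try:
            return operation()
        except Exception:  # broad on purpose for the sketch; narrow this in real code
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the error instead of hammering the back end
            sleep(delay)
```

The jitter matters as much as the growth: without it, all clients that failed together would retry together and hit the database in synchronized waves.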

Another forty-five minutes later, they cross over their goal of three thousand completed orders, grossing a quarter million dollars in revenue, and the orders are still coming in strong

the highest conversion rate we’ve ever achieved by at least a factor of five

We need everyone pulling their weight,” Sarah says, fuming in righteous indignation. “I think we need some mandatory overtime. Buy them more pizza and they’ll be happy to stay and do the work.

At three thirty a.m. she’s in the office with the rest of the team

Sarah is here too. As far as Maxine can tell, she appears to be haranguing someone about the pricing and promotional copy for one of the offers.
Maggie is also in the huddle, not looking happy

Within two minutes, over ten thousand people have hit the website and are going through the order funnel, and the rates of arrivals keep climbing. And again, all the CPU loads start climbing, much higher than in the test launch.

She watches as the number of orders continues to climb … until they flatline, just like on Tuesday.

Fulfillment options aren’t being shown! I’m guessing some fulfillment service is failing

This service calls out to a bunch of external APIs from the shipping providers, and some of those are failing. Brent suspects that they are being rate-limited by one of them

for something this mission-critical, there’s no way we should depend on external services, she thinks. We need to gracefully handle the case when they’re down or when they cut us off.

When we get shipping API failures, maybe we present just the ground shipment option. We know that this type of shipping is always available
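The fallback idea can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the team's actual code: `fulfillment_options`, `fetch_live_options`, and the shape of the option records are hypothetical.

```python
# Hypothetical default that is assumed to be always available.
GROUND_ONLY = [{"method": "ground", "label": "Ground shipping"}]


def fulfillment_options(order, fetch_live_options):
    """Return live shipping options, or ground-only if the provider call fails."""
    try:
        options = fetch_live_options(order)
    except Exception:
        # Provider is down or rate-limiting us: degrade instead of blocking checkout.
        return list(GROUND_ONLY)
    # An empty answer is treated like a failure so the customer always sees an option.
    return options or list(GROUND_ONLY)
```

The customer loses some choice during an outage, but the order funnel keeps moving.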

Web server page requests are timing out

Brent had blasted their site with a homemade bot army

The component wouldn’t render for bots, only for actual customers who were logged in.”
As real customers hit their sites, this component made a bunch of database lookups from the front-end servers, which were never tested at this scale.

The recommendation component is what’s causing the unusual server load. Can we disable it until traffic dies down?”

They finally decide to just change the HTML page, commenting out the recommendations component

They’re surprised when there is no impact to the front-end performance

It takes four more minutes for them to discover that there is one more place where the component can be rendered
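
The team lost time because the component was embedded in more than one page. One way to avoid that is a single switch checked inside the component itself. The excerpt does not describe such a switch; this is a hedged sketch, and `FLAGS`, `lookup`, and the page functions are hypothetical.

```python
# Hypothetical runtime switch; in practice this would come from config.
FLAGS = {"recommendations": True}


def render_recommendations(customer_id, lookup):
    """Render the recommendations block, or nothing when the flag is off.

    The check lives inside the component, so every page that embeds it is
    covered, and no database lookup runs while it is disabled.
    """
    if not FLAGS["recommendations"]:
        return ""
    items = lookup(customer_id)
    return "<ul>" + "".join(f"<li>{item}</li>" for item in items) + "</ul>"


def render_home_page(customer_id, lookup):
    return "<h1>Home</h1>" + render_recommendations(customer_id, lookup)


def render_cart_page(customer_id, lookup):
    return "<h1>Cart</h1>" + render_recommendations(customer_id, lookup)
```

Setting `FLAGS["recommendations"] = False` turns the component off on both pages at once, with no HTML edits per page.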

The server load finally dips another fifty percent when the most common graphic images are offloaded from their local web servers and moved to a Content Distribution Network (CDN)

This is the largest digital campaign that this company has ever done,” Maggie says. “We sent more emails today than ever. We pushed out more mobile app notifications than ever. We had the highest response rates. The highest conversion rates. We had higher e-commerce sales today than any other day in the company’s history

we’ve booked over $29 million in revenue today alone. We blew away last year’s sales record by a mile!”

we’re a long way from being done. We’re basically Blockbuster, who just figured out how to do paper coupon promotions. If you think that’s enough to save Parts Unlimited, you must be smoking something.

Erik arrives, grabbing the seat next to her. “Congratulations to you both—you did terrific today. Now, you need to show Steve and Dick how the future requires creating a dynamic, learning organization where experimentation and learning are a part of everyone’s daily work

The Fifth Ideal is about a ruthless Customer Focus, where you are truly striving for what is best for them, instead of the more parochial goals that they don’t care about, whether it’s your internal plans of record or how your functional silos are measured

Chapter 16: Friday, December 5

There’s more good news. We’ve put a huge amount of focus on improving the in-store systems to better help store managers incorporate the practices that we know every one of our rock star store managers use

in-store sales for our pilot stores are up almost seven percent

the nonpilot stores have had flat or negative same-store sales

Typing in these VIN numbers is the bane of almost every Parts Unlimited employee

We now enable customers to use their app to create a profile for all of their cars. They just scan the VIN on their car using their phone camera, and we automatically populate the information of their car

our in-store employees can scan a QR code on their phone to pull up their customer’s record

Besides being great for our customers, it’s great for our employees. For the first time, it’s like we’re doctors who have their patient’s charts on hand

we need to do it with the sponsorship and support of the highest levels of the organization.”
Maxine sees Steve smile, looking not just interested, but delighted. He applauds loudly, but before he can say anything, Erik speaks up.
“Ms. Lee is exactly right, Steve,” Erik says

Everything around you has been built upon the success of your Horizon 1 or cash-cow business. And as Maggie is alluding, you have nothing in Horizons 2 or 3.

Geoffrey Moore, who is most famous for his book Crossing the Chasm

I think he will be best-known for the Four Zones, which help us better organize ourselves to win in all Three Horizons

Horizon 1 is your successful, cash-cow businesses, where the customer, business, and operational models are well-known and predictable. For you, that’s your manufacturing and retail

Horizon 2 lines of business are so important, because they represent the future of the company

For you, this transition happens when your Horizon 2 business revenue hits $100 million

Horizon 2 efforts come from Horizon 3, where the focus is on velocity of learning and having a broad pool of ideas to explore

Horizon 1 thrives on process and consistency, on rules and compliance

In contrast, in Horizon 3, you must go fast, you must be constantly experimenting, and you must be allowed to break all the rules and processes governing Horizon 1

I will leave you with one last caution. Horizon 1 and Horizon 3 are often in conflict with each other.” He gestures at Sarah meaningfully. “Left unchecked, Horizon 1 leaders will consume all the resources of the company

Sarah stands up. She says, “Steve, as much as I appreciate what Maggie and her little team have done, I think this is a losing proposition. Your chairman and boss, Bob Strauss, has grave doubts about the future of the company

Bob’s idea of splitting up the company and selling off the pieces is our only prayer of salvaging shareholder value.

I’m convinced we should do another round of headcount reductions immediately to shore up profits

You will be responsible for nurturing new, promising ideas and exploring market risk, technical risk, and business model risk

At the end of each quarter, we’ll review progress on each initiative, and we’ll make a decision: continue funding the project; kill it, reassigning the team to the next best idea; double-down on the project; or graduate it to Horizon 2. We’ll also decide whether the entire program needs to be grown or shrunk.

To capture the best ideas, go create an Innovation Council. Find fifty of the most respected people from across the entire organization

Chapter 17: Friday, December 12

They were never ‘your people,’ Kurt. You were temporarily loaned a bunch of engineers for the Unicorn Project

But why now?” Kurt asks, incredulous. “What got everyone so riled up?”
Chris laughs humorlessly. “Sarah is stirring the pot, egging them all on

Over the last month, they had started building tons of automated tests for Phoenix so that they could better and more safely make changes. The effort was incredibly successful. However, with so many tests, running them now takes hours, and developers are starting to avoid checking in their changes, not wanting to wait for the long test times

This is two thousand lines of code to determine whether we can ship to the order location

Holy shit,” he says, staring at her screen in disbelief. “Fifty lines of code

The entire Unikitty CI cluster is down. Everyone’s builds are stuck.”

“I can’t believe it was a networking switch failure,” Dwayne says.

He presented his plan to create a competitive CI service to compete with Unikitty.”

Sarah has apparently started shouting from the rooftops that Unikitty is jeopardizing the entire company and that we should be shut down.

Sarah convinced Bob and the rest of the board to freeze all expenses, effective immediately. And Steve just found out that they’re denying the $5 million he was going to allocate to the innovation efforts

Sarah and our new board director, Alan, have convinced everyone that the success of the Black Friday promotions have unlocked huge new efficiencies; therefore, we don’t need as many people

We need you all to come up with a plan to eliminate $2 million from the IT organization—that’s about fifteen people across all of your groups.”

Yes, protecting the Innovation effort is our most important task.

Surprisingly, Kirsten has put on the table seven project managers, noting that the Rebellion has changed the way teams work. “Long term, we don’t want to manage our dependencies, we want to eliminate them

Last time we met, I mentioned Sensei Geoffrey Moore’s Three Horizons, but I didn’t have time to explain his concept of Core versus Context, which are what the Four Zones are about,” Erik says

they underinvest in Core, because they are being controlled by Context

Companies who become too burdened by Context are unable to properly invest in Core

You know that technology must become a core competency of this company and, indeed, that the future of Parts Unlimited depends upon it. But how much of the $80 million of your technology spending is Core, actively building competitive advantage, and how much of it is Context, which is important and maybe even mission-critical, but still needs to be standardized, managed down, and maybe even outsourced entirely?”

Electricity has become infrastructure that you buy from a utility company. It is interchangeable, and you choose suppliers primarily on price. There is rarely a competitive advantage to generating your own power. It is now merely Context, no longer Core.

As Sensei Clay Christensen once stated, one keeps what is ‘not good enough’ and outsources what is ‘more than good enough,’”

Steve, lucky for you, according to Sensei Moore, the person best suited to manage Context is someone just like Bill and Maxine

think carefully about how each and every position you eliminate might disrupt flow, especially when you don’t have locality in decision-making, as embodied by the First Ideal

middle managers are your interface between strategy and execution,” he says. “They are your prioritizers and your traffic cops. We all have this ideal of small teams working independently, but who manages the teams of teams? It’s your middle managers. Some call them derisively the ‘frozen middle,’ but you’ll find that properly developing this layer of people is critical to execute strategy

“We have three ERP systems,” Maxine offers. “It’s a pain to have to integrate with all of them. In fact, all three of them are owned by one company now. Maybe now’s the time to bite the bullet.”

Looking at the growing list of ignoble Context on the whiteboard, Maxine doesn’t feel dread. Instead, she feels inspired thinking about how jettisoning these things will liberate the company from things that slow it down and present the opportunity for engineers to work in areas of far higher value

Chapter 18: Thursday, December 18

Who is going to save Horizon 3? She looks around.
In that moment, she realizes that it’s all up to her now.

Steve has a message for you. He says, ‘Take charge of the Innovation Pitch meeting. Good luck!’ He will be there if he can, but he’ll likely only be able to stay for a couple of minutes.

the first idea that generates buzz throughout the auditorium is a rating system for garages and service stations, which immediately gets the nickname “Yelp for Garages”

A senior sales manager presents an idea to create a four-hour delivery service to their service station customers. This would enable those stations to offer more repair services, knowing that needed parts could be quickly delivered as needed. A competitive startup had recently emerged offering four-hour delivery, and the Parts Unlimited business unit that sold directly to service stations already cut their revenue forecasts for next year by ten percent because of them.

sell an engine sensor and create a huge array of offerings around it

The judges’ top choice is the Engine Sensor project

and the other two winners are the Service Station Ratings team and the Four-Hour Parts Delivery team

Maggie Lee is resuming her responsibilities for retail operations and the Innovation Council

“It was just like the movie Brazil!” Kurt says proudly, laughing. “I got killed with paperwork. Sarah opened up an investigation with HR about all the rules I broke

Can Sarah really get away with this?”
“For now. I’m suspended with pay for sixty days, pending further investigation,” he says. “Steve got Maggie off the hook for now too. Sarah is still at large, though.

everything is hinging on the success of the Horizon 3 projects. Steve is betting his job on it. If these efforts don’t pan out, Sarah will become the new, and likely last, CEO of Parts Unlimited as we know it.”

Although Debra frets about all the manual processes, Maxine knows that this is all about creating a Minimum Viable Product to test their offerings and confirm their hypotheses of what capabilities are required to fulfill them. This rapid iteration and learning before they invest heavily in rolling out a big, disruptive process is a great example of the Third Ideal of Improvement of Daily Work.

Chapter 19: Tuesday, January 13

Sarah Moulton is no longer with the company 

Maggie Lee will be taking over all retail-related concerns

our first profitable quarter in almost two and a half years

I’d like to congratulate Bill Palmer, who has been promoted to chief information officer, allowing me to vacate that position. And I’m pleased that I’ve gotten board approval to make him provisional chief operating officer

Thank you all for coming. This is the first of many ceremonies we will hold as we bid adieu to these things that used to inhabit our datacenters, tormenting us on a daily basis. I grew up with Kumquat servers nearly twenty years ago

Wes pulls a giant sledgehammer from behind him

These engine sensors are such cool devices! They’re manufactured in China, but the company that designed them is based not so far from here. I think it’s a very small shop,” Shannon says. “We’ve done some experiments modifying the software on the devices. They have ARM processors that run Linux. I’ve managed to change the configurations and reflash the devices, so now they’re sending their sensor data to our back-end servers instead of theirs.”

The garage recommendation service had seemed promising. The in-store managers liked having the data, but the complications created by sales account managers who owned the relationship with those garage owners with lower scores was sufficiently problematic. The business lead needed more time to come up with a better policy of what to do with those organizations. It was decided that further development of this idea would be suspended, and the Innovation Council decided to start the next most highly rated proposal

Surprisingly, we’re bringing an entirely new category of customers into our stores,” says Maggie. “We’ve found that many customers are car fleet managers and solo rideshare drivers

We found that a large number of Tesla owners are driving around for weeks with low tire pressure. We tested an offering to drive out to their car and refill their tires and their fluids, and we were all stunned at the high conversion rate.

there’s an area one hundred feet long and nearly fifty feet wide that is entirely empty, the racks having been hauled away.
On the floor are pieces of masking tape and paper tombstones indicating the business systems that used to reside on those servers.
“Email Server: $163K annual savings.”
“Helpdesk: $109K annual savings.”
“HR Systems: $188K annual savings.”

They will finally consolidate from twenty different warehouse management systems down to one. They will finally migrate to a current version of their ERP system

They used a technique called Wardley Maps to better localize what parts of various value chains were commodities and should be outsourced, which should be purchased, and which should be kept in-house because they created durable, competitive advantage. They used this exercise to methodically disposition their technology stacks, given the business context.

they found another technology gem right next to the MRP group: it was an event bus that ingested all the equipment sensor data from their manufacturing plants

it was exactly what Shannon had wanted when she had first pitched Panther

It is now at the center of Project Shamu, forming the foundation of a massive architectural change that will eventually touch almost every back-end service and API across the entire company

It will decouple all of these services from each other, allowing teams to make changes independently, no longer reliant upon the single Data Hub team to implement their business rule changes
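
The decoupling an event bus provides can be sketched with a tiny in-process publish/subscribe class. This only illustrates the pattern, not the Parts Unlimited system; the class, the topic name, and the event fields are all hypothetical.

```python
class EventBus:
    """Minimal publish/subscribe hub.

    Publishers know only a topic name; they never call consumers directly,
    so a team can add or change a consumer without touching the publisher.
    """

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        """Register `handler` (a one-argument callable) for `topic`."""
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        """Deliver `event` to every handler on `topic`; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)
```

A usage example under the same assumptions:

```python
bus = EventBus()
readings = []
bus.subscribe("sensor.reading", readings.append)
bus.publish("sensor.reading", {"plant": 3, "temp_c": 71})
```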

Epilogue: One Year Later

Steve had once again reclaimed the role of board chair, and he thanked Bob Strauss for his service to the company

The technology group is nearly twice the size of what it was when she was first exiled to the Phoenix Project

Thanks to Maxine’s endless and relentless lobbying, the TEP and LARB have both been disbanded. Proudly hanging on her desk is a certificate that says, “Lifetime Achievement Award to Maxine Chambers for Abolishing TEP-LARB,” which is signed by everyone from the original Rebellion

The engine sensor project is a monster runaway success, by far the fastest growing part of the company. Nearly two hundred thousand units have now been sold, generating $25 million in revenue

couldn’t keep enough in stock

it was their mobile app that made all the difference. People were buying the engine sensors because they loved the app so much. An entirely new demographic was entering the store. Many of the store managers had told her that it was the first time they’d seen so many people in their twenties coming to Parts Unlimited

Earlier today, Bill had taken her aside to let her know that she was being promoted. She was going to be the first distinguished engineer in the company’s history, reporting directly to Bill. She loves her proposed job description. Among other things, her charter is to help create a culture of engineering excellence across the entire company

Maxine is excited that there is finally a career ladder for individual contributors and brilliant technologists without having to become managers

Kurt’s now reporting directly to Chris. Rumor has it that he will soon be promoted to engineering director, and that Chris is trying to figure out how to finally retire and open up a bar in Florida. In the meantime, Chris has eliminated QA as a separate department, distributing them into the feature teams. Ops is quickly turning into a platform team and internal consultants

And in a surprising turn of events, earlier this week Maxine finally had that lunch meeting with Sarah, who had reached out to her. It was not at all what Maxine expected. Despite her initial wariness, she had fun and even learned some things

look at how Wall Street is valuing this company,” Erik says. “It’s at an all-time high, over 2.5 times higher than when you joined the Rebellion

Did you know that Steve keeps a book by his bed of the most important people in the company, so that he’ll always recognize them, even in a crowd at Disneyland. And did you know that you’re in it, as well as Kurt, Brent, and Shannon? A decade ago, only the top plant managers and store managers were in there. Now, there are engineers in there too.

The Five Ideals

The First Ideal: Locality and Simplicity
The Second Ideal: Focus, Flow, and Joy
The Third Ideal: Improvement of Daily Work
The Fourth Ideal: Psychological Safety
The Fifth Ideal: Customer Focus

References

The Unicorn Project was heavily influenced by many books

