Product Backlog Refinement Tips (formerly called Grooming)

Executive Summary

In this article, I discuss some tips for doing effective Product Backlog Refinement. “Refinement” is the new term in the Scrum Guide for what used to be called “Grooming”; this changed in 2011 in the Scrum Guide. The primary benefits of backlog refinement are increased productivity, increased understanding/quality, and prevention of delays due to unknowns. Backlog refinement can be a little difficult when getting started, but the benefits are large and they appear very quickly, usually after just a couple of Sprints of doing it. The goal of Product Backlog Refinement is to get Product Backlog Items (PBI’s) that are “Ready” (shared understanding, acceptance criteria, sized) to be brought into a Sprint. Items are usually brought into a Sprint at Sprint Planning, but they can also be brought in mid-Sprint at times.

Learn more about great Product Backlog Refinement practices in our Agile Requirements and Product Owner Techniques classes.

Some Assumptions of this article

  • There are several kinds of Product Backlog refinement, but in this article, I will focus almost exclusively on Weekly Team Product Backlog Refinement, except as otherwise noted.
  • I use Scrum terminology and a 2 week Sprint cycle in this article, but you can translate these concepts to other software approaches and iteration lengths.
  • I will use the more generic Product Backlog Item (PBI) term here, except where I make special mention of User Stories. Pretty much everything in this article also applies to User Stories.
  • Synonyms for Product Backlog Refinement: backlog grooming, sprint preview meeting, user story grooming, detailing user stories, story writing workshop, user story conversations, etc.
  • Any time I mention “backlog”, “backlog item” or “item” in this article, I’m referring to the Product Backlog and *not* referring to the Sprint Backlog.

Tips for Effective Backlog Refinement

In my tips articles, I try to describe highly effective ways to fulfill the Scrum framework. The Scrum framework intentionally leaves holes that are opportunities for teams to fulfill the framework in the way that is best for a particular team’s circumstances. The Scrum Guide defines the Scrum framework, and my tips are in no way meant to define or re-define Scrum. My tips are based on my numerous Scrum coaching experiences, and I believe them to be consistent with the Scrum Guide.

Tip – a practice that is definitely worth consideration, but might only be good in a few or very specific contexts or team situations.


Tip #1: Try to never schedule backlog refinement during the first or last 20% of the Sprint.

During the first 20% of the Sprint, the team is just getting started on this Sprint’s work, so you’ll want to give them some room to get a good start. During the last 20% of the Sprint, the team is working hard to get closure on the current sprint items, so that’s not really an ideal time to do it either. That middle 60% of the sprint is a good time to do backlog refinement.
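As a rough sketch of the arithmetic behind this tip, the snippet below computes which working days fall in that middle 60%. The function name and the choice to round the protected first and last slices up to whole working days are my own assumptions for illustration, not part of Scrum.

```python
import math


def refinement_window(sprint_days: int, edge_percent: int = 20) -> tuple[int, int]:
    """Return the (first, last) working day, 1-indexed, of the middle of the Sprint.

    Assumption: the protected first and last slices are rounded up to whole days.
    """
    edge_days = math.ceil(sprint_days * edge_percent / 100)
    first_day = edge_days + 1
    last_day = sprint_days - edge_days
    if first_day > last_day:
        raise ValueError("Sprint is too short to leave a middle window")
    return first_day, last_day


# A 2 week Sprint has 10 working days, so days 1-2 and 9-10 are protected.
print(refinement_window(10))  # (3, 8)
```

For a 2 week Sprint that starts on a Monday, this suggests scheduling refinement somewhere between the first Wednesday and the second Wednesday.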


Tip #2: Treat the backlog refinement meeting just like the first part of the Sprint Planning Meeting

  • Often called the “What” of the sprint planning meeting, this part of the meeting talks about what will be developed in the upcoming sprint.
    • The PO or Dev Team member presents the backlog items (often User Stories), the rest of the team asks for clarifications, and then the items are estimated by the team.
      • The PO then or later might indicate that the order will change based on the estimate. Good! That’s the kind of collaboration that’s supposed to occur!

Tip #3: Make sure the PO knows that, during all of this sprint’s backlog refinement sessions, they will be expected to present enough work to last about 2 Sprints *beyond* the current sprint.

The reason for bringing this much work to the meeting is twofold:
1. Oftentimes PBI’s will be reordered based on feedback in the meeting, so you want enough work left over that you’ll be ready to fill a sprint.

  • One common reason: External dependencies, or coordination with another group outside this team.

2. It’s a good practice anyway to have some finely refined PBI’s beyond what you’re working on in a Sprint in case someone frees up and needs more work.

The PO may need to collaborate with the team to know how much work 2 Sprints worth is, but it need not be a lengthy discussion — just a very rough guess. Hopefully your team has seen the PBI’s at a Release Planning, PI Planning, or other meeting or in some other backlog refinement discussion meeting. If not, then the PO can discuss the new PBI with a team member or two (informal product backlog refinement) to get a feel for the very rough effort involved (but be sure not to communicate this rough estimate to the team — don’t want to anchor/skew their initial estimates!).


Tip #4: The backlog items should be fine grained and well understood by the PO or a Dev Team member for this meeting to work well. Try to have an initial set of acceptance tests defined before the meeting occurs.

When the PO or a Dev Team member brings a backlog item to the meeting, what you don’t want to hear is, “Ok, I don’t have a lot of details on this item, but I’ve got the basic gist (or first draft) of it.” What you would rather hear is this: “I’ve worked on this item and I have informally collaborated with both the customers and the team on it. I have a really good idea of the details of the backlog item as well as the acceptance criteria for the PBI. It’s fairly solid, and I think there’s enough information there to start work almost immediately!”

Even if the initial acceptance criteria are high level, that is better than no acceptance criteria at all. The more the PO and team focus on getting to good detailed acceptance criteria, the better.  Also, see Story Testing Patterns.


Tip #5: Make sure everyone understands that estimates are not final until a PBI has been accepted into a sprint.

Since the team is getting a preview of the PBI’s, don’t let the team stress out too much on the estimates. Let them know that a PBI’s estimate is not final until the PBI has been accepted into the sprint. If someone comes up with new information between the backlog refinement and the time the item is brought into a Sprint, they are free to bring it up and re-estimate the PBI. This brings us to the converse of this tip…


Tip #6: Make sure everyone understands that Product Backlog order is not final until a PBI has been accepted into a sprint.

The PO is free to change the order of the backlog between the backlog refinement and Sprint Planning Meeting. This is ok. We welcome late change, remember? In practice, though, I’ve found big ordering changes happen rarely between the backlog refinement and Sprint Planning meeting, maybe 10% of the time or less.

While it may be frustrating for PBI’s to change order often, remember that this kind of information is something we want sooner rather than later.


Tip #7: Keep your eye on the goals of the meeting

One of the goals of the meeting is that, when it’s over, the team has a handful of PBI’s queued up and “Ready” to roll into a Sprint, and also has, in total, enough for the next two sprints beyond the current one. All major questions have been answered (or at least assigned to someone to answer), and the team feels confident and knowledgeable about the upcoming work. Of course, like everything, not every last detail will be known, but it’s not for a lack of trying. At the end of the meeting you might try asking this of the team: “Of all of the PBI’s that were presented, a) which ones would make you feel very uncomfortable if you had to start tomorrow? and b) which ones do you not feel confident at all in estimating?” If any of these kinds of risk are identified, assign someone to help mitigate that risk before the upcoming sprints. This brings us to our next tip…


Tip #8: Assign action items for any big risks or unknowns

For big requirements issues, this probably means assigning the issue to the PO. For big technology risks or unknowns, this means assigning a dev team member to research the issue at hand. The person assigned should be ready to report on the action item at or before the next refinement meeting, or definitely before the item is brought into the Sprint. This will usually involve re-estimating the item based on the new information, and that’s perfectly acceptable. Who does the assigning? In general, to support self organization, the PO and Dev Team members should volunteer to take an action item. The Scrum Master can help by asking questions like: “Who can take this one on and get back to the team on this?” Avoid having any one dominating person assign action items to people; that’s not very self organizing.


Tip #9: Remember that backlog items (and/or User Stories) are a collaboration between the PO and the Dev Team

The final responsibility for making sure the PBI’s are adequately detailed for the development team’s use falls on the Product Owner. However, don’t take that to mean it’s the PO’s job to do all that work. The items should be the results of a collaboration between the PO and the dev team (and also probably between the PO and the customers or end users). It is perfectly acceptable, and in fact encouraged, that the PO meet with 1, 2, or all members of the team to help detail a PBI prior to a refinement session (what I call “informal backlog refinement”). It is also perfectly acceptable if Dev Team members do the vast majority of refinement activities in direct consultation with users and stakeholders. Schedule meetings or have spontaneous collaborations as necessary, with the end goal being a well thought out, well detailed PBI when it is presented at a backlog refinement session. It is very atypical for a PBI to be one that is new to every single developer (the Scrum definition of “developer”) at a backlog refinement session. If this happens, consider it a bad smell and try to get developers involved earlier.

We highly recommend that you read the “Product Backlog Management Leader” section in our very best article on the Product Owner role: The New New Product Owner.

(Worth noting here, the Scrum Master has very little role in Product Backlog Refinement.  Their only real role is to teach the team how to communicate PBI’s with the PO, and how to execute efficient refinement.  Once this is accomplished, the Scrum Master will not attend refinement very often)


Tip #10: Remind the Dev Team to review the PBI’s to be discussed about 24 hours prior to the refinement session.

If your PBI’s are documented in any way before a backlog refinement session (such as in a tool, on a wiki or on index cards), send a reminder to the dev team to review them so they are prepared to speak intelligently about them.

If the PO is making lots of last minute edits right up until the refinement session, much of the information will be seen for the first time in the meeting, which can make the meeting last longer and be more confusing. Ideally, the PO and Dev Team should target their editing efforts at being ready 24 hours prior to the backlog refinement session.


Tip #11: Definitely feel free to split PBI’s during this meeting.

If anyone on the Scrum team feels the need to split items, during a refinement session is a perfect time to do it. You’ll probably want to re-estimate them too, and that’s fine. I hope it also goes without saying that if you split an 8 point PBI there should be no emphasis on making sure that the split out PBI’s all add up to 8 points. Just re-estimate them as if they were new, independent, PBI’s.


Tip #12: Optimize your time in the meeting.

For PBI’s that are well defined, just discuss and quickly estimate. For PBI’s that have already been discussed in a previous refinement session, you only really need to revisit them if there is new information about them. For PBI’s where there is new information, just discuss the new information enough to be able to give a new estimate.


Tip #13: Don’t be afraid to discuss a couple of items that are farther down the backlog

Sometimes there will be a need to discuss a backlog item that is further down the backlog. Here are some reasons for doing that:

  • The PO needs to gauge the rough size of a PBI
  • The dev team needs to identify external dependencies that will need to be started on ASAP while waiting for the rest of the item’s details to be fleshed out.

There can be many other reasons, too. While we don’t want to get into a habit of doing BRUF (Big Requirements Up Front) or BDUF (Big Design Up Front), there are rare situations where looking at a backlog item that is farther down the road is advantageous to the team and the organization as a whole. Don’t be afraid to do it, but be wary of slipping back into BRUF and BDUF.


Tip #14: Strongly consider doing some informal backlog refinement before the full team backlog refinement.

  • Informal Product Backlog Refinement
    • The Product Owner or Dev Team members will work with zero or more dev team members or stakeholders to refine stories and their order in the backlog. Said another way, this is a lighter more informal way to refine the items in much the same way the refinement is done in the Weekly Team Backlog Refinement. This kind of informal refinement should be a daily, if not hourly, occurrence for the Product Owner and Dev Team.
      • When the PO gets together with development team members (early collaborators), she should strongly consider making sure that the three amigos are there.
        • Also feel free to bring stakeholders in to discuss.
      • Development team members should feel free to discuss relative sizes with the PO, but I encourage teams to wait to record any estimate until the entire team has estimated the item first. Said another way, the early collaborators should try to avoid skewing or anchoring the entire Dev Team’s estimates.

Tip #15: Don’t be afraid to introduce late breaking PBI’s. Try to minimize them, but embrace them when they happen.

Remain Agile! Some of the major goals around doing backlog refinement are to reduce unknowns and risks, but not all risks are easily identifiable beforehand. Backlog refinement is not meant to eliminate risks, only to minimize them. Late breaking PBI’s will probably still happen (hopefully rarely), and they should generally be welcomed by the team. On the other hand, if it seems like most of your backlog items are late breaking, that is generally a bad smell.


Tip #16: Typical Meeting Attendees: Dev Team, possibly the PO, and rare appearances by other key people

On rare occasions, you will need other key people, but prefer keeping non Scrum Team members out of the meeting as a general rule unless they have a very specific reason to be there. From time to time, it will be useful to have a non Scrum Team member appear, but even those appearances should be limited. Try to limit the “other key people” to only attending while the relevant backlog items are being discussed, and empower the Dev Team to invite whoever they think will be helpful.


Tip #17: Experiment with the amount of refinement that your team does.

In the beginning, until you get “two sprints ahead” (beyond the current sprint), you will need more refinement than normal. After you get caught up, experiment with how much refinement is usually needed to stay “two sprints ahead.” When I coach teams, I usually tell them that 1-2 hours per week is typical, and that until they are “two sprints ahead,” they should double that amount. Don’t use these numbers as hard and fast rules, but instead as a starting place to adjust the time up or down as needed. The Scrum time-box for refinement is 10% of the Dev Team’s time per sprint. For a 2 week sprint, that means the equivalent of 1 full day (8 hours) per sprint for every Dev Team member. I find that most teams can stay well under that time box once they’re “two sprints ahead.”
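To make the numbers above concrete, here is a minimal sketch of the arithmetic. The function names, the 8 hour working day, and the 10 working days per 2 week Sprint are assumptions I chose for illustration; adjust them to your own context.

```python
def refinement_cap_hours(sprint_working_days: int,
                         hours_per_day: float = 8.0,
                         cap_percent: float = 10.0) -> float:
    """Upper bound on refinement time per Dev Team member for one Sprint."""
    return sprint_working_days * hours_per_day * cap_percent / 100


def planned_hours_per_sprint(hours_per_week: float, sprint_weeks: int,
                             catching_up: bool = False) -> float:
    """Planned refinement time per Sprint, doubled until the team is two Sprints ahead."""
    planned = hours_per_week * sprint_weeks
    return planned * 2 if catching_up else planned


cap = refinement_cap_hours(10)                               # 8.0 hours for a 2 week Sprint
steady = planned_hours_per_sprint(2, 2)                      # 4 hours at 2 hours per week
catch_up = planned_hours_per_sprint(2, 2, catching_up=True)  # 8 hours while catching up
print(cap, steady, catch_up, catch_up <= cap)
```

Under these assumptions, even the doubled "catching up" amount at the high end of the typical range just fits inside the 10% time box, and the steady state uses about half of it.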


Tip #18: Be sure to retrospect, inspect, and adapt!

Just like every other Scrum practice, it is absolutely imperative that your team inspects and adapts their practices. If your team feels like backlog refinement is a waste of time, then there are almost assuredly impediments that need removing. Do a “Five Why’s” analysis and read up on good backlog refinement techniques.



Conclusion

If backlog refinement is done well, you can skip the entire “What” part of the Sprint Planning meeting and go straight to decomposing PBI’s in the “How?” part of the meeting. How’s that for a short planning meeting? Most teams will find that having regular product backlog refinement increases overall team knowledge and eventually increases productivity, usually by a large factor.

An Agile Community of Practice Starter Kit

As I describe in my article An Agile Architecture Community of Practice, a very mature Community of Practice has mechanisms that represent what the community has come to agree upon over a long period of time. Those standards and working agreements need to mature over time as well…but what if you’re starting a new Community of Practice (CoP) from scratch and you’re NOT mature yet? Where do you start?

To me, the healthy starter kit for a CoP is comprised of the following:

  • The Community forms around an “improvement interest” or effort that spans multiple teams (Release Train, Product Group, Nexus, etc.), and that is scoped according to improvement domain, and also possibly according to organizational domain, geographic domain, and/or product (system under development) domain.
  • Roles: The following positions or roles need to be initially filled:
    • Community Agile Coach
      • fulfilled by an Agile Coach (and/or Scrum Master), absolutely required position, also good to have a “backup” at all times as well.
    • 2 Community Coordinators
      • Both of these coordinators must come from Scrum Development Teams*
        • * Exception:  if the community “improvement interest” is a topic whose “home base” is not on a Scrum Dev Team, then this requirement is waived.  For instance:  Scrum Master CoP, Agile Manager CoP, Agile Executive CoP, are not home based in Scrum Development Teams
        • The coordinators are responsible for ensuring the “communication mechanisms” below are bringing maximum value to the community.
    • 2-4 Subject Matter Experts
      • These are the people who are most passionate about the subject that the community is formed around. They typically already share their knowledge across their teams and/or organizations, and are looked up to widely across the Scrum Teams and/or org. Ideally they come from Scrum Teams, but this is not a hard requirement, and it is sometimes difficult to fulfill in orgs that are relatively new to Agile.
    • 1 Management Supporter (optional)
      • This is optional, and should only be filled if you can find someone with good Agile behavior (preferably someone with at least one Scrum or Scaled Scrum/Agile certification, preferably both). Communicate to this person that, in the community, they MUST act as a servant leader to the community, and must let the other roles run the community without too much influence or direction coming from the Management Supporter. The management supporter might be more directive outside the community activities, but not through the community mechanisms.
  • Artifacts: The following artifacts are required:
    • Community Improvement Backlog
      • Unlike the Product Backlog in Scrum, this Community Improvement Backlog (CIB) contains Community Backlog Items (CBI’s) that only represent “improvements” in the community’s “improvement interest.” There may be Product Backlog Items that represent work on the Product under development that come out of a community’s efforts, but those will land on a particular Scrum Team’s plate (even if the item applies across several teams). Remember, bring the work to the (already existing) Scrum Teams and put it on the Product Backlog! The CIB only holds community improvement ideas, not product development work.
      • Obviously, this artifact only needs a couple of items on it to start.
    • Community Working Agreements
      • Agreements about how your community will operate. (This is the CoP version of the “Team Working Agreements” in Agile.) The people who fulfill the above roles (the “role players”) are responsible for ensuring this happens. Many of the other things listed here in this blog post could be your starter set for CWA’s (Roles, Artifacts, Events, Communication Mechanisms).
      • CWA #1 should be something like “In everything we do as a Community, we try to honor the Agile Manifesto. So, we value individuals & interactions over processes and tools, we value responding to change over following a plan, and we value executing improvement experiments over endless theoretical discussion and “analysis paralysis” about how to go about improving.”
  • Events:
    • Meetings at least 3X a month, including at least one meeting open to the public. (Some meetings are best held more privately among the role players in the community, to help coordinate all efforts, public and private)
  • Communication Mechanisms:  The following comm mechanisms should be established from the beginning.  The Community Coordinators are responsible for ensuring maximum value is continuously obtained from these comm mechanisms.
    • Synchronous “push” communication for spontaneous collaboration:
      • If everyone is highly co-located, then the mechanism is “Talk to each other”
      • If not, then this mechanism is something like an “always on” videoconferencing system, but also could be a chat room (think Slack or HipChat)
      • Email is a very poor form of this mechanism, but if you can’t get something like the above, then an email list may be the best you can do right now. (If that is the case, “Find a better Synchronous Communication mechanism” should be high in your community’s improvement backlog.)
    • Asynchronous “collaborative” communication for longer lived communications
      • A wiki (something like a Confluence or MediaWiki)
        • At a minimum, your wiki (and its sub pages) should document
          • Your Community’s mission/interest
          • Your CIB or a link to your CIB (possibly kept in another tool, but a wiki is ok for this purpose too)
          • Your CWA’s
          • The description of the current position holders
          • Info about how to get onboarded to your community’s communication mechanisms.
      • Last resort:  If you can’t get a wiki, an “opaque” document repository mechanism (such as Sharepoint, Google Docs, Dropbox or your code repository)  can be used to keep documents that the community can use, store, and access.
      • (optional) A “work board” for more tactical coordination and possibly the Community Improvement Backlog.  (Something like Trello, a wiki, or your favorite Agile life cycle tool if you can get your company to give you a separate project area for your efforts)  If you can’t quickly get this, this might be a CBI.

The above is the basic starter set to get a Community of Practice started.  Something not mentioned above is the intended audience for the community.  The community is led by the role players, BUT the beneficiaries and implementers of the improvements are generally the “grass roots” members of the community, such as members of your Scrum Development teams.  Said another way, there is a vital implicit role in a Community of Practice, that of  the “target audience” for the community’s efforts.  When the community has “public” meetings, or other communications sent via public communication channels, the target audience benefits greatly from the improvements made by the community.

For me, there are always at least 3 CoP’s that are needed in every environment where multiple Scrum or Agile teams work together closely on a product (aka system under development).

  1. Agile Architecture Community of Practice
  2. Agile Testing Community of Practice
  3. Cross Team coordination Community of Practice (This one often already exists in your scaling framework, such as Scrum of Scrums, Nexus Daily Scrum, Nexus Integration Team, System Team, and there are many other examples)
    1. These practices are really just narrow CoP’s that focus on “Coordinating the flow of work to maximize the value delivery of frequent software releases”

At this point, you may be asking yourself “Why this many roles and structure just to get started?”  Here’s the reason:  Until you have the “minimum viable practice” in place, you should not start a Community of Practice.  You are creating a C-O-M-M-U-N-I-T-Y.  With any fewer than the above roles, artifacts, events, and communication mechanisms fulfilled, you won’t likely have a chance of creating a long running effort that is sustainable, productive, and beneficial over a longer period of time.  If you can’t convince the above handful of people to take a leadership role, then you have more work to do and more support to gather before starting.

The emphasis for your community should always be on “executing improvements” over having endless “how we run the community” process discussions or “how we should go about improvement X” discussions.  Remember, “individuals and interactions over processes and tools.”  🙂

 

Story Testing Patterns

Be sure to remember to mix and combine the patterns as necessary!
You can also download a One Page PDF or view the Story Testing Patterns Presentation that goes along with this summary.

Pattern: Generally Good For… / Generally Bad For…

<“Test that…”>
  • Generally Good For…
    • Beginning Story Testers
    • Simple Tests
    • Tests hard to describe using the other patterns
  • Generally Bad For…
    • Experienced Story Testers that know a better pattern.
    • Tests with a lot of setup logic or behavior logic (try a different pattern)
    • Tests where behavior depends on numerous test inputs

<Given/When/Then>
  • Generally Good For…
    • Tests that require
      • a lot of preconditions or setup, OR
      • setup that is important or easily forgotten
    • Tests that have a specific, non obvious trigger
    • Tests where there are few expected outputs
  • Generally Bad For…
    • Tests that have unimportant/simple/obvious preconditions
    • Tests where there are multiple different inputs and multiple different outputs
    • Tests where a single Given/When/Then only describes one of numerous very similar test scenarios

<Specification By Example – Conceptual or Concrete>
  • Generally Good For…
    • Tests that have numerous:
      • Inputs that affect output behavior
      • Outputs/expected behaviors
    • Tests where it’s important to test a lot of different data scenarios
    • Tests where the trigger event is somewhat obvious
    • Any test where it seems like a table would be useful to:
      • describe the test better, or
      • help explore all of the possible inputs and outputs for a test.
  • Generally Bad For…
    • Simple tests
    • Tests that are more about verifying simple UI behavior
      • For instance – “Test that an error message is displayed when the user enters an incorrect password.”
    • Tests where there is really only one input or precondition

<Bullet Points>
  • Generally Good For…
    • Teams that are highly co-located with PO
    • Stories that are very small (2-3 days)
    • Tests that are very simple
    • Tests with fairly obvious expected behavior
  • Generally Bad For…
    • Distributed Teams
    • Stories that are large (which is a bad habit anyway)
    • Tests that are not simple
    • Tests with non-obvious expected behavior

<“Test With…”>
  • Generally Good For…
    • Teams that are highly co-located with PO
    • Stories that are very small (2-3 days)
    • Tests that are very simple
    • Tests with fairly obvious expected behavior
  • Generally Bad For…
    • Distributed Teams
    • Stories that are large (which is a bad habit anyway)
    • Tests that are not simple
    • Tests with non-obvious expected behavior

<Flow Charts>
  • Generally Good For…
    • Tests where the flow of behavior is very complex, and easier to represent with a series of successive questions/answers
  • Generally Bad For…
    • Generally bad for everything else.

<State Diagrams>
  • Generally Good For…
    • Tests where a system object can go through numerous (often workflow related) states
  • Generally Bad For…
    • Generally bad for everything else.
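To illustrate why the Specification By Example pattern suits tests with numerous inputs and outputs, here is a minimal table-driven sketch. The discount rule, its thresholds, and every name in it are hypothetical and invented purely for illustration; the point is only the shape of the table, where each row is one concrete example.

```python
def discount_percent(order_total: float, is_member: bool) -> int:
    """Hypothetical business rule, invented purely for illustration."""
    if order_total >= 100:
        return 15 if is_member else 10
    return 5 if is_member else 0


# Each row is one concrete example: inputs on the left, expected output on the right.
EXAMPLES = [
    # (order_total, is_member, expected_discount)
    (50.00, False, 0),
    (50.00, True, 5),
    (100.00, False, 10),
    (100.00, True, 15),
]


def run_examples() -> list[str]:
    """Check every row of the table and return a description of each failing row."""
    failures = []
    for order_total, is_member, expected in EXAMPLES:
        actual = discount_percent(order_total, is_member)
        if actual != expected:
            failures.append(
                f"total={order_total}, member={is_member}: "
                f"expected {expected}, got {actual}"
            )
    return failures


print(run_examples())  # [] means every example in the table passed
```

Adding a new data scenario means adding one row, which is what makes the table a good place for the PO and team to explore inputs and outputs together.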

Remember to strongly prefer index cards (5×8), wikis, and whiteboards (take photos!) over ALM tools and other electronic documents/tools.


Agile User Story Slicing – An Alternative to the Vertical Slicing Metaphor for Valuable User Stories

Vertical slicing is a decent metaphor for how to ensure that User Stories are indeed valuable to users and key stakeholders. However, I’ve found it a little bit lacking for more complex systems, especially ones that also have upstream and downstream systems that the system under development interacts with. This model is especially useful if your Scrum Team is doing internal software, System Integration, Business Intelligence/Analytics, web services (ReST, SOA, etc.), or Microservices. Typical User Story practice encourages us to create stories with INVEST qualities. The practices below will help you do so, and help you connect better with the “V” part of the story, the part that means that story is Valuable to users and key stakeholders. It will also help you with the other letters in that acronym too. But we’ll focus on the V.

Rather than rely on the vertical slicing metaphor, I’m starting to teach more and more of my coaching and training clients the “System Boundary” view of story slicing. In this model, I tell them to draw a line around their system, and then consider the system boundaries, labeled SB1-SB4 below.

[Figure: System Boundaries, with the boundaries labeled SB1-SB4]
System requirements are best captured at the system boundary. As such, User Story Acceptance Criteria (and indeed their associated Acceptance Tests that are preferably automated) should document what happens at the system boundary, and no further.

Some Anti-Patterns
For instance, in my coaching travels, I have seen people creating “Analysis Stories”, “Technical Stories”, “Testing Stories”, “Back End Stories”, and “Integration Stories.” Nearly always, these are people who struggle with how to “slice the cake” to design a User Story that is truly Valuable, as in the INVEST acronym.

Sometimes, a team is very limited in the stories they can create because they are a component team, and are not able to create an end to end feature like feature teams can. In scaled Scrum implementations (several teams working on the same product/system), some organizations have remnants of waterfall still in their organizational design and team structure, such as a large organization composed almost exclusively of component teams. It’s important to keep an eye out for this “too many component teams” anti-pattern, because it can be quite devastating to productivity and value delivery.
For more info see “WODA” in this article about the Top 10 Challenges to Scrum that Organizations face. But with that caveat out of the way…

Great Agile User Stories will hit the system boundaries at least TWICE!

The boundaries of the system (or product, as we say in Scrum) under development are shown in the graphic above. SB1-SB4 represent the system boundaries, where we interface with our key stakeholders (humans, as well as upstream and downstream systems, which have humans who represent the interests of those systems). A good Agile User story will hit those boundaries twice. For instance, the typical “vertical slice” in a typical software application with a GUI will hit system boundaries SB3 and SB4. Sometimes, upstream data will somehow get surfaced in one of a product’s GUIs, so it will hit system boundaries SB1 and SB4.
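The "hit the boundaries at least twice" check can be sketched as a tiny helper. This is only an illustration under my own assumptions: the function name is hypothetical, and I am counting distinct boundaries rather than individual interactions.

```python
SYSTEM_BOUNDARIES = {"SB1", "SB2", "SB3", "SB4"}


def hits_enough_boundaries(touched: set[str], minimum: int = 2) -> bool:
    """Return True if a story touches at least `minimum` distinct system boundaries.

    Assumption: "twice" is interpreted as two distinct boundaries.
    """
    unknown = touched - SYSTEM_BOUNDARIES
    if unknown:
        raise ValueError(f"Unknown boundary labels: {sorted(unknown)}")
    return len(touched) >= minimum


# A typical GUI "vertical slice" hits SB3 and SB4.
print(hits_enough_boundaries({"SB3", "SB4"}))  # True

# Upstream data surfaced in a GUI hits SB1 and SB4.
print(hits_enough_boundaries({"SB1", "SB4"}))  # True

# A story that never reaches a boundary deserves a second look.
print(hits_enough_boundaries(set()))  # False
```

A False result is not proof that a story lacks value; it is a prompt to question how the story was sliced.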

Example: System Integration (Vendor sold product that your org integrates into its internal systems)

Another example might even hit all 4 boundaries! These are pretty typical with Scrum Teams that do System Integration. Some data could flow in from an upstream system (SB1), be presented in a product’s GUI (SB4) for some sort of approval or tweaking (SB3), then be approved and sent to the downstream system (SB2). (But of course, this might be better split into 2 stories that hit 2 boundaries each.)

Example: Web Services

It’s important to note that in this model, the upstream system and downstream system could be the same system. Imagine your product is a back end payment system for a retail web site, and that your back end system has no real GUI, just web service endpoints (ReST, SOA, etc). In that case, the “upstream” system would send a payment service request to your system (SB1), and then your system would do some processing on it and send it back to the (now considered) “downstream” system (SB2) with an approval or denial code or similar. Note that the upstream/downstream system might be in your org, or it might be external (what web services are really built for). It’s again important to note that if you have “back end stories”, you probably have component teams. Again, not all component teams are evil, but having too many of them in your organization is pretty damaging in terms of agility and productivity. See “WODA” in this article about the Top 10 Challenges to Scrum that Organizations face.

One other related note is that if you’re implementing Scrum, the people that represent those upstream and downstream systems (usually Dev Team members of those teams) should be involved in your Product Backlog Refinement and Sprint Reviews, so that they can collaborate about boundaries SB1 and SB2. In essence, these Dev Team members for the upstream/downstream systems are key stakeholders for your system.

In Summary
So, if your User Stories don’t touch the system boundaries at least twice, then strongly question whether you have sliced your story correctly to have “value” or not. Chances are, you haven’t, so look at other ways of slicing and dicing that User Story (see the links below for more help with slicing).
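
The “at least twice” rule can be turned into a quick sanity check during refinement. Below is a minimal sketch in Python, assuming each story is tagged with the boundaries it touches. The `UserStory` class, the `touches` field, and the two story titles are hypothetical; only the boundary names SB1-SB4 come from the model above, and the comments on what each boundary means are inferred from the examples in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Boundary(Enum):
    SB1 = "SB1"  # upstream system, per the examples in the text
    SB2 = "SB2"  # downstream system, per the examples in the text
    SB3 = "SB3"  # user input through the GUI (inferred)
    SB4 = "SB4"  # information presented in the GUI (inferred)


@dataclass(frozen=True)
class UserStory:
    title: str
    touches: frozenset  # Boundary values this story reaches


def likely_has_value(story: UserStory) -> bool:
    """Heuristic from the article: a valuable story hits at least two boundaries."""
    return len(story.touches) >= 2


# Hypothetical stories, for illustration only.
stories = [
    UserStory("Show upstream order data on the dashboard",
              frozenset({Boundary.SB1, Boundary.SB4})),
    UserStory("Create the orders table", frozenset()),
]

for story in stories:
    verdict = "ok" if likely_has_value(story) else "consider re-slicing"
    print(f"{story.title}: {verdict}")
```

A check like this only flags candidates for another look. Whether a story is truly Valuable is still a conversation for the team and the Product Owner.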

An Agile Architecture Community of Practice

Background

We often coach organizations on scaling Scrum, where 4-12 Scrum Teams are working on the same Product or a set of closely related Products (Applications, Systems, etc). We often get asked how things like Architecture and other multi-team concerns are handled in a scaled Scrum approach. In Agile practice, these multi-team concerns are usually handled via a mechanism called a “Community of Practice”. Below is an example of a moderately mature community, based on a compilation of ideas that we have seen work well in the field. Note also that a TeamSet should always have more than one “Community of Practice”, and new ones should be formed and dissolved as needed.

Caveats about the example below:

  1. We use the term “TeamSet” below to refer to a set of teams, the 4-12 mentioned above, working on a Product or set of closely related Products. Obviously, the higher the impact of the architectural initiatives, the more formal a process you will likely need. So, if your set of closely related Products is not that closely related, then this community should be less formal and less broadly applicable in its decisions. The converse of this is also likely true.
  2. The example below shows more of an “ideal end state”. Your org will likely have to take a few steps of organizational change before you can get here. But try to get as far as you can on step 1. You might surprise yourself.
  3. We have probably added more formality and more documentation below than would be typical in a real life, highly Agile, community of practice. We do so here primarily for illustrative learning purposes, to give you more ideas than are truly needed (i.e. so you can pick and choose what works in your context). In real life, the community would likely be less formal.
  4. This example is for Architecture, but this same kind of approach easily fits other types of CoP’s: Agile CoP, Scrum CoP, Scrum Master CoP, Product Owner CoP, DevOps CoP, Programming CoP, Automated Testing CoP, UX CoP, SAFe CoP, RTE CoP, Nexus CoP, LeSS CoP, etc. [ONE MORE REMINDER: Your CoP should have the minimum amount of formality necessary, and to the extent possible, should operate bottom up.]

The Aegis Architecture Community of Practice

Our Charter Statement: We all work on a set of closely related products, collectively known as Aegis. This community is organized around ensuring that the highest Architectural concerns that have a high impact on Aegis are efficiently and effectively addressed. Please note that this community focuses only on the highest Architectural concerns.

In Scope: The set of teams (TeamSet) that this community encompasses are: Stingray, 49’ers, Falcon, Journey, Explorers, Red October, Phoenix, and Hawking. The main “in scope” topic is high Architectural concerns, though the dividing line between “high” and “not high” is not always black and white. As you look below, hopefully the mechanisms we have in place will give you the idea of where that line is usually drawn.

Out of Scope: Practices related to Programming, Design and other concerns are generally left to others (other communities and/or the Scrum Teams themselves, to self organize and solve). Test and Build automation is left to others. The Director of Software Dev hires and chooses the LA (Lead Architect), so that is out of scope for our community. Our Arch CoP only covers the teams in the Aegis TeamSet, so for Arch issues that are decided at the corporate level, you will need to talk to the EAG (Enterprise Architecture Group, not a CoP yet. 😦 ). Our Lead Architect has good communication with the EAG, but anyone should feel free to go to the EAG for the appropriate services (just loop the Lead Architect in as well). Typically we form a working group from this CoP to go and talk to the EAG for any requests.


(In the sections below, note that we have specifically named them “Individuals”, “Interactions”, “Processes”, “Tools” to relate to the Agile Manifesto, valuing “Individuals and Interactions over Processes and Tools”.)

Individuals

Community Leadership

Role | Name | Home Scrum Team
Community Coordinators (CC) | Jill Hutchins, Henry Nguyen | Stingray, Falcon
Agile Coach | Jeff Schwaber (Liaison to Scrum Master CoP) | Falcon
Management Contact | Ellie Swanson | n/a, Director of Software Dev
Lead Architect (LA): 90% allocated to community, 10% allocated to Scrum Team | Ian Robison | Journey
Community Architects (CA): 50% allocated to community, 50% allocated to Scrum Team | Phillip Smith | Explorers
Community Architects (CA) | Brijesh Singh | Hawking
Community Architects (CA) | Jill Hutchins | Stingray
Community Architects (CA) | Henry Nguyen (Liaison to Tech Excellence CoP) | Falcon

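Community Members

Role | Name | Home Scrum Team
Architecture Team Reps (ATR): 10% allocated to community, 90% allocated to Scrum Team | Sheila Hill | Stingray
Architecture Team Reps (ATR) | Carlos Diego | 49’ers
Architecture Team Reps (ATR) | Rachel Story | Falcon
Architecture Team Reps (ATR) | Chris Bradford | Journey
Architecture Team Reps (ATR) | (open) (interim: Phillip Smith) | Explorers
Architecture Team Reps (ATR) | Adam Gideon | Red October
Architecture Team Reps (ATR) | Adrienne Maxhouse | Phoenix
Architecture Team Reps (ATR) | Sandesh Mokkarala | Hawking

General Member (GM): All people above are also considered GM’s. Anyone interested! Melissa Hayden (Liaison to Testing CoP).
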
Role Definitions
Community Coordinators: This role is to be a servant leader who guides the community on what initiatives and other efforts to focus on. This role also helps provide leadership on which decisions are truly of high enough concern to warrant formality and process via the community. The only requirements for this role are that the person has the appropriate architecture knowledge and has 3 months of experience in the role of ATR and/or CA prior to serving. The Lead Architect cannot be a coordinator (this helps prevent command and control hierarchical leadership, and respects self organization of the community). The coordinators are elected every 6 months.
Agile Coach: This role is to be a servant leader who guides the community on how to respect the Agile Manifesto, the Scrum Guide, and in general, the “Community of Practice” approach to self organization. This person is expected to be an experienced senior Scrum Master or Agile Coach. The requirements for this role are: 1 year of experience as a Scrum Master or Agile Coach, 2 Agile/Scrum certifications, and at least 3 months of participation as a General Member in this community prior to serving. This person will also need to be able to spend ~20% of their time playing this role. The Agile Coach is elected every 6 months.
Management Contact: This is a senior management role from the Dev Org, and cannot be a first line dev manager. Currently the Director of Software Dev plays this role. This contact is used to secure funding, facilities, and other logistical approvals needed for the community to hold its events. This person does not usually spend much time interacting with the community.
Lead Architect: This person is hired by the Director of Software Dev to be the Lead Architect for the TeamSet and for the community. This person participates heavily in the community and has some decision making power (see “Processes” below). The requirements for this role are determined by the Director of Software Dev. The Lead Architect cannot be a Community Coordinator (see that role description for more info).
Community Architect: This person spends around 50% of their time on community initiatives. They are expected to be good communicators, accessible, and have the appropriate architecture knowledge. The only requirements for this role are that the person has the appropriate architecture knowledge and has 3 months of experience in the role of ATR and/or CA prior to serving.
Architectural Team Rep: This person spends around 10% of their time on community initiatives. This usually revolves around being a communication radiator for their Home Scrum Team and ensuring that all relevant community communication gets shared with their Home Scrum Team. This person will also often help with architectural initiatives that their Scrum Team is sponsoring. This should never be considered a gate or bottleneck role; i.e., anyone on any Scrum Team can interact with the community without having to go through (or get approval from) their ATR.
General Member: Anyone who has an interest in architecture can participate in the public activities, meetings, and communication mechanisms of this community. In order to vote for the approval of CWA’s or in approval meetings, the person should have significant knowledge of the subject at hand, and have materially participated in 3 months’ worth of immediately prior community activities. It is strongly recommended that you recuse yourself from votes any time you do not meet these pre-requisites, OR any time you don’t have a strong opinion on the thing being voted on.
Liaison: The community has various volunteer liaisons to other parts of the organization, generally to other CoP’s. These people help us identify synergies and conflicts of scope between this community and the other communities. This is a pretty informal position, and any General Member is free to provide the same kind of information; it’s just that these people volunteer to definitely keep an ear to the ground between the two communities.

CWA’s re: Individuals

(CWA’s = Community Working Agreements)

  1. In everything we do, we try to honor Scrum(as defined in the Scrum Guide) and Agile (as defined in the Agile Manifesto). Because we have not yet chosen a scaled Scrum approach, we simply extend many of the ideas of the Scrum Guide and Agile Manifesto to our entire Community and TeamSet.
  2. The table and information above includes information that is also effectively CWA’s.
  3. Because we believe in the Agile value statement of valuing “Individuals and Interactions over Processes and Tools” and “Responding to Change over Following a Plan”, don’t ever be afraid to get some architects together in an ad hoc way to solve an architectural challenge — we can always retrofit those actions to our processes and tools later.
  4. Unless otherwise indicated, the use of the term “architect” refers to the LA, CA’s, and ATR’s (i.e. does not refer to GM’s).
  5. The architect titles above are completely independent from your job title and career path. The above titles are bestowed by the community (except for the LA, who is hired by the Director of Software Dev). This community makes no claim over the management or career path domains of the company. We are simply a self organizing community that co-exists with the rest of the organization.
  6. Every architect belongs to and works on a Scrum Team. In their Scrum Team, they hold no special authority or title other than “Scrum Dev Team member,” regardless of their title in this community.
  7. Being a GM of the community is voluntary, with one exception: each Scrum team must choose and have an ATR with the appropriate skills, who is not an LA or CA.
  8. All new CWA’s (re: Individuals or any other topic) must be approved by at least a “fist of 3” from the entire community. For all new CWA’s that have a heavy impact beyond the community, a forum must be held where all Scrum Team members from the TeamSet can give feedback prior to approval.
  9. There is always a Scrum team sponsor for each initiative, even if there is only a subset of the team working on the initiative.
  10. We strongly prefer bottom up initiation of all architectural initiatives, coming from the teams. Initiatives need not come from the architects, but the ATR for that sponsoring team must be involved and be highly informed of the initiative.
  11. The general time allocation expected of architects is listed in the chart above. Of course, there are exceptions at times. At any one given moment, an architect might have to choose between focusing on helping their team or helping their community. In that moment, the architect is expected to consult with others on which focus yields the most value for the entire TeamSet. Sometimes this means focusing on community efforts, and sometimes it means focusing on team efforts. Try to choose wisely in that moment.
  12. Coordinators can fulfill the role of either a CA or an ATR while also fulfilling the coordinator role, but this is not a requirement.
    1. Note above that an ATR cannot fulfill the role of ATR AND [CA or LA] at the same time, but an ATR could be an ATR and a coordinator at the same time.
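
The role combination rules in the CWA’s and role definitions above can be sketched as a small consistency check. This is only an illustration in Python, not a tool the community uses: the `Member` class and `role_violations` function are hypothetical names, and the sample roster is a subset of the table above that treats the interim Explorers arrangement as if it were a normal assignment.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    team: str
    roles: set = field(default_factory=set)  # subset of {"CC", "LA", "CA", "ATR"}


def role_violations(members, teams):
    """Return human-readable violations of the role-related rules."""
    problems = []
    for m in members:
        # CWA 7 and 12.1: an ATR may not also be an LA or CA.
        if "ATR" in m.roles and m.roles & {"LA", "CA"}:
            problems.append(f"{m.name}: an ATR cannot also be an LA or CA")
        # Role definitions: the Lead Architect cannot be a coordinator.
        if {"LA", "CC"} <= m.roles:
            problems.append(f"{m.name}: the LA cannot be a coordinator")
    # CWA 7: every Scrum Team must have an ATR.
    for team in sorted(teams):
        if not any(m.team == team and "ATR" in m.roles for m in members):
            problems.append(f"{team}: no ATR chosen")
    return problems


roster = [
    Member("Jill Hutchins", "Stingray", {"CC", "CA"}),
    Member("Sheila Hill", "Stingray", {"ATR"}),
    Member("Ian Robison", "Journey", {"LA"}),
    Member("Chris Bradford", "Journey", {"ATR"}),
    Member("Phillip Smith", "Explorers", {"CA", "ATR"}),  # interim ATR in the table
]
teams = {"Stingray", "Journey", "Explorers"}

print(role_violations(roster, teams))
```

With this sample roster, the only item reported is the interim CA-plus-ATR combination on Explorers, which is presumably why that seat is listed as open.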

Interactions

Communication Mechanisms
  • Ad hoc conversations and informal meetings: we encourage these the most! We encourage involving your ATR or CA when needed.
  • This Wiki
  • “Roughly” Monthly Public Meetings
  • Occasional ad hoc public meetings when needed (includes educational meetings, approval meetings, etc)
  • Group Chat (HipChat)
  • Email List
  • Occasional Private Meetings (Primarily only open to the LA, CA’s, ATR’s, and specially invited guests)
  • ATR’s are responsible for communicating the important outcomes of all of the above to their respective teams.
  • Architectural elements of the TeamSet Definition of Done
  • Scrum Team “Code Based Tools” pages

Communication Contact: To get connected to our communication mechanisms, ask your ATR who to talk to.

CWA’s re: Interactions

  1. In everything we do, we try to honor Scrum(as defined in the Scrum Guide) and Agile (as defined in the Agile Manifesto). Because we have not yet chosen a scaled Scrum approach, we simply extend many of the ideas of the Scrum Guide and Agile Manifesto to our entire Community and TeamSet.
  2. The table above includes information that is also effectively CWA’s.
  3. Because we believe in the Agile value statement of valuing “Individuals and Interactions over Processes and Tools” and “Responding to Change over Following a Plan”, don’t ever be afraid to get some architects together in an ad hoc way to solve an architectural challenge — we can always retrofit those actions to our processes and tools later.
  4. In all of our architecture discussions and initiatives, we agree to use Agile Emergent Architecture.
  5. Our community maintains a wiki for any communication where light documentation seems like a good communication mechanism. This page is just one of our pages. See our home page for lots more stuff.
  6. Any person in the TeamSet can contact their ATR or a CA for help, collaboration, mentoring, or whatever is needed. It is NOT required to go through your ATR for every architecture interaction. We generally discourage direct communication with the LA unless you are working on an initiative with that person. That person is VERY busy. Your ATR or CA will involve the LA if that is needed.
  7. Each Scrum Team must keep an updated “Code Based Tools”(CBT) page connected to their team wiki. Only include tools that your team regularly uses and/or has significant experience with. The purpose of this CBT page is to spread knowledge to the entire TeamSet about which tools are in use, and which teams have knowledge of those tools. On the CBT page, the team must include 2 categories of tools and info:
    1. CBT’s that they use that are expressly approved by the Arch Community Tool Matrix.
      1. For each 3rd party library, please specify the exact library and versions in use, why the library is being used (its purpose), known scope of use (product, module, class, etc), and how widespread its use is (low, medium, high).
    2. CBT’s that have not been expressly approved by the Arch Community Tool Matrix. (Note that this is not considered bad usage; not all tools are in the scope of our community.)
      1. Please specify the exact tool and versions in use, why the tool is being used (its purpose), known scope of use (product, module, class, etc), and how widespread its use is (low, medium, high).
  8. In all of our communication mechanisms, we try very hard to be specific about topics of discussion and whether or not they are “in scope for the community”.
    1. For instance, using our communication mechanisms to just get general ad hoc architectural or even design/implementation/technical help is perfectly fine, but say something like “This is really more in scope for just our team, but we could really use some help on — who can help us with that?”
    2. If you’re not sure whether a topic is in scope for the community, just ask the community for help in determining that!
    3. Obviously, if you realize that a topic is in scope for a different community, by all means, please use that community’s communication mechanisms instead of ours.
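
The information that item 7 above asks each team to record on its “Code Based Tools” page could be captured in a small record type. The sketch below is one possible shape in Python, not a format the community prescribes: the `CbtEntry` and `Usage` names, the field names, and the sample values (including the made-up `ExampleCsvLib`) are all assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Usage(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class CbtEntry:
    tool: str
    versions: list           # exact versions in use
    purpose: str             # why the tool is being used
    scope: str               # e.g. product, module, class
    usage: Usage             # how widespread its use is
    approved_in_matrix: bool # expressly approved by the Tool Matrix?


def split_by_approval(entries):
    """Split a team's entries into the two categories the CBT page requires."""
    approved = [e for e in entries if e.approved_in_matrix]
    other = [e for e in entries if not e.approved_in_matrix]
    return approved, other


# Hypothetical page content for one team.
entries = [
    CbtEntry("SLF4J", ["v4.0.1"], "application logging", "product", Usage.HIGH, True),
    CbtEntry("ExampleCsvLib", ["1.2"], "CSV export", "module", Usage.LOW, False),
]

approved, other = split_by_approval(entries)
print([e.tool for e in approved], [e.tool for e in other])
```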

Processes

Processes
  • Architecture Tools Approval Process (ATAP) (tools, frameworks, arch approaches, etc)
  • Election of Community Leaders
  • Quarterly Retrospectives

CWA’s re: Processes

  1. In everything we do, we try to honor Scrum(as defined in the Scrum Guide) and Agile (as defined in the Agile Manifesto). Because we have not yet chosen a scaled Scrum approach, we simply extend many of the ideas of the Scrum Guide and Agile Manifesto to our entire Community and TeamSet.
  2. The table above includes information that is also effectively CWA’s.
  3. Because we believe in the Agile value statement of valuing “Individuals and Interactions over Processes and Tools” and “Responding to Change over Following a Plan”, don’t ever be afraid to get some architects together in an ad hoc way to solve an architectural challenge — we can always retrofit those actions to our processes and tools later.
  4. We use the term “Tools” fairly broadly, to include essentially all architectural initiatives that require community approval or coordination.
  5. Community retrospectives are held at least once each quarter, and at least within the 2 weeks prior to a new community leadership election. At this time, we often review our CWA’s and Charter Statement to ensure we are in alignment.
  6. Every 6 months, an election is held to select the CC’s, CA’s, and Agile Coach.
  7. All architectural initiatives must include an “independent usage plan” that describes how future users of the initiative can be quickly educated on the tool/approach such that they will not be heavily dependent on tribal knowledge by a small number of people. This often includes light documentation as well as video recordings of education sessions for the initiative. Decreasing this type of “key person” risk enhances our Agility and ability to respond to change in the future.
  8. The ATAP is documented in detail elsewhere, but here is a summary:
    1. A Scrum Team suggests sponsoring an initiative to be approved as an experiment or as a tool, initiative, or decision that is approved for widespread community use.
      1. We encourage the teams, as much as possible, to sponsor initiatives of their own choosing, i.e. we prefer that they initiate.
        1. In rarer cases, sometimes the CA’s or LA will ask a team to sponsor, but the decision is up to the Scrum Team.
    2. An approval meeting is scheduled (giving the team time to be prepared). Sometimes this is done in regular monthly meetings, sometimes scheduled ad hoc.
    3. The Scrum Team makes review material available 1 week prior to the approval meeting for voting members to review prior to the approval meeting.
    4. The Scrum Team presents to the community.
    5. The community votes with a fist of five, where at least a fist of 3 is required from every approved voter. If that cannot be obtained, a “unity group” is formed, including those strongly in favor as well as any who voted a fist of 2 or lower (the dissenters), to discuss further and/or come up with a compromise within 2 weeks. If the unity group can agree within 2 weeks, then the vote is considered approved. If they cannot agree, then the LA makes the decision to approve or disapprove as a last resort.
    6. If approved, the community then documents the new tool as “approved for experiment” or “approved for use” and adds it to the tool matrix (see below).
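
The voting step of the ATAP summary above can be sketched in a few lines of Python. This is a hedged illustration only: `tally_fist_of_five` and `VoteOutcome` are hypothetical names, the sample votes are made up, and the code assumes that “strongly in favor” means a fist of 5, which the summary does not state. The two week unity group discussion and the LA’s last resort decision happen outside the code.

```python
from dataclasses import dataclass


@dataclass
class VoteOutcome:
    approved: bool
    unity_group: list  # voters asked to work out a compromise


def tally_fist_of_five(votes: dict) -> VoteOutcome:
    """votes maps an approved voter's name to the fingers shown (0 to 5)."""
    if not votes:
        raise ValueError("no votes cast")
    for name, fingers in votes.items():
        if not 0 <= fingers <= 5:
            raise ValueError(f"{name} cast an invalid vote: {fingers}")

    # Anyone at a fist of 2 or lower is a dissenter.
    dissenters = [n for n, f in votes.items() if f <= 2]
    if not dissenters:
        return VoteOutcome(approved=True, unity_group=[])

    # Assumption: "strongly in favor" means a fist of 5.
    strongly_in_favor = [n for n, f in votes.items() if f == 5]
    return VoteOutcome(approved=False, unity_group=strongly_in_favor + dissenters)


# Hypothetical votes, for illustration only.
outcome = tally_fist_of_five({"Sheila": 5, "Carlos": 3, "Rachel": 2})
print(outcome)
```

Here the vote does not pass because one voter is at a fist of 2, so the sketch proposes a unity group made up of the strongest supporter and the dissenter.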

Tools

Tools Matrix

Tool/Initiative Name | Being Proposed | Approved For Experiment | Approved For Use | Deprecated | Sunsetted

Logging Framework – SLF4J (v3.4, v4.0.1) X
Programming Language: Java (23.4 or above) X
Dependency Injection Framework: InjectorSpace 3.2 or above (Home Grown) X
Dependency Injection Framework: InjectorSpace 3.1 X
Dependency Injection Framework: InjectorSpace 3.0 or below X
Architectural Pattern/Domain Logic: Domain Model X
Architectural Pattern/Domain Logic: Service Layer X
Database: Oracle 23i or above X
Deployment Platforms: ???
Other various open source libraries: Must be GPL-3.0 or EPL-1.0 license. X
Programming Language: Scala (12 or above) X
Architectural Pattern: Microservices (will likely be limited in scope, as only valuable in certain contexts) X
Persistence Framework: Hibernate (54 or above) X
Architectural Pattern: Monolith applications X
Peer Review: Scrum Team must have documented procedures for peer reviews. X
… (author note: there would likely be many more items in a real CoP; these are just examples)

Note that the TeamSet Definition of Done requires that all 3rd party tools that are code based (Libraries, code frameworks, Development Environments, etc) be represented on the above Tool Matrix.
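
The Definition of Done rule above amounts to a simple set comparison between the tools a team uses and the tools on the matrix. A minimal sketch in Python, assuming tool names are compared as plain strings; the function name is hypothetical, and `ExampleCsvLib` is a made-up library used only to show a miss.

```python
def tools_missing_from_matrix(tools_in_use, tool_matrix):
    """Return the code based tools a team uses that are not on the Tool Matrix."""
    return sorted(set(tools_in_use) - set(tool_matrix))


# Names taken from the example matrix above; "ExampleCsvLib" is made up.
tool_matrix = {"SLF4J", "InjectorSpace", "Hibernate"}
team_tools = ["SLF4J", "Hibernate", "ExampleCsvLib"]

print(tools_missing_from_matrix(team_tools, tool_matrix))
```

Anything the check reports is a prompt for a conversation, for example a team sponsoring the tool through the ATAP.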

The items below have at one time been considered out of scope for the Architectural Community.

Out of Scope Matrix


CWA’s re: Tools

  1. In everything we do, we try to honor Scrum(as defined in the Scrum Guide) and Agile (as defined in the Agile Manifesto). Because we have not yet chosen a scaled Scrum approach, we simply extend many of the ideas of the Scrum Guide and Agile Manifesto to our entire Community and TeamSet.
  2. The table above includes information that is also effectively CWA’s.
  3. Because we believe in the Agile value statement of valuing “Individuals and Interactions over Processes and Tools” and “Responding to Change over Following a Plan”, don’t ever be afraid to get some architects together in an ad hoc way to solve an architectural challenge — we can always retrofit those actions to our processes and tools later.
  4. We use the term “Tools” fairly broadly, to include essentially all architectural initiatives that require community approval or coordination.
  5. We don’t yet have any more special CWA’s on the Tools topic. Most of what we record here is in the Tool Matrix above.
  6. These tools have been considered to be “out of scope” for this community: Agile ALM Tool, Wiki Tool Choice, Test Driven Development, Peer review procedures, Test Automation techniques, Build Automation techniques, process compliance.

 


Metrics for Agile Teams — Evidence Based Management for Software Organizations

If you’re looking for metrics to report up or “status” to report up in your software organization, look no further than Evidence Based Management for Software Organizations from Scrum.org.  The framework was just recently updated in 2018.  Note that these metrics are gathered at the Product level, not the team level.  If you plan to track metrics like this, and you try to track several teams on the same Product or in the same organization, what you will find is you will create an anti-pattern.  Once teams realize that they are competing against each other, they will stop helping each other!!!  So, don’t do that!  Instead, measure at the Product level, and have the teams retrospect both at the team level and at the product level on how they improve those metrics.  Also, remember that “not everything that counts can be counted”, and always consider subjective data in addition to objective data.  With all of those caveats, I highly recommend you click on the button at the bottom of the EBM page to download the “EBM Guide” as a PDF, and start measuring today!

An Introduction to Agile Emergent Architecture – Always Intentional


Let’s Define Architecture
By definition, architecture is about the major pieces of the system^1, and about satisfying the non functional requirements such as availability, usability, compliance, scalability, security, extensibility, maintainability, and all of the typical “ilities.” It’s important to remember that architecture does not cover anything and everything that can be put into one of the above categories, because those categories would encompass every last bit of the *entire* system. The architecture should not encompass every last corner of the system. Instead, architecture focuses on the very highest level concerns in each of those categories, and nothing more than that. Further, architecture focuses only on the most basic of building blocks in those categories, and nothing more than that. This leaves as much flexibility as possible for later agility and implementation. So, be sure your architecture is focusing only on the highest of concerns. (The lower concerns might be considered such things as design, code, or implementation.)

Big Up Front(BUF) Thinking
Legacy software development approaches put a large emphasis on understanding all the requirements for a system up front. This is also known as BRUF (Big Requirements Up Front). The disproven theory behind this thought is that if one understood all the requirements really well up front, then the optimal architectures, designs, and software code could be created. As such, BRUF has siblings known as BAUF (Big Architecture Up Front) and BDUF (Big Design Up Front). All of these Big Up Front (BUF) theories have been disproven for one overall reason: it takes time to gain this BUF understanding, and by the time you think you and your team have that understanding, the requirements have changed substantially, which then affects the architecture and design as well. What we have learned over the decades is that the requirements, technologies, users, technical team members, and the market for the product change quicker than we can gain that understanding, and certainly quicker than we can implement an architecture or design. In addition, customers often have trouble describing exactly what they want; they tend to have a much better understanding of what they want only after they see the working software. Further, with BUF thinking, the people who did the up front thinking are usually not the people implementing that thinking. So downstream from requirements, when changes to the architecture and design are truly needed, the previous history, the BUF thinking, and the totally different people involved make those changes extraordinarily costly *and* complex to retrofit. As such, Big Up Front thinking is a model that has been disproven, so it’s unacceptable, and indeed, now passé.

No Up Front(NUF) Thinking
However, the other end of the spectrum, NUF (No Up Front Thinking), is also unacceptable. You can’t create a cohesive architecture, that is financially viable, without some good upfront thinking. Without at least some good upfront thinking, the architecture turns out haphazard, almost accidental in nature. Some people have called this unintentional architecture, and the name probably fits.

Signs you Might be a Victim of NUF
One possible sign of NUF is a system rewrite. A system rewrite is almost always a sign of failure^2, and the two biggest reasons for that failure are inattention to the user marketplace and inattention to continuous technical excellence via architecture, design, testing and coding practices. Inattention to continuous technical excellence creates what is known as technical debt, which is the other telltale sign of NUF. Massive technical debt generally presents itself as a highly unacceptable number of bugs, new functionality that takes way longer than it should, or systems that get jettisoned or re-written. Massive technical debt is pretty much a guaranteed outcome of No Up Front Thinking. As such, since massive technical debt has so many bad outcomes, No Up Front Thinking is also unacceptable.

Big Up Front thinking can lead to No Up Front Thinking
Now, before we move on, let me also address another relevant point. BUF thinking can actually lead to NUF. If a BUF architecture is not kept up to date, is not shepherded, or is extremely inflexible to change, people will avoid thinking about and making architectural changes. This results in NUF and massive technical debt. So, a single system can be a victim of both BUF and NUF. It would be really great if there was a way to find that right balance between BUF and NUF… right?

The Better Way: Agile Emergent Architecture
Enter “Emergent Architecture”, a term suggested by Agile thought leaders. One can think of this as Little Up Front(LUF) Architecture, combined with continuous attention to technical excellence. You could even think of it as “Continuous Architecture” if you so desired. With Emergent Architecture, you do just enough, just in time, at the “last responsible moment.” It’s also important to note that Emergent Architecture is also 100% intentional architecture. Architecture doesn’t just “magically appear”.

The Benefits of Emergent Architecture
By architecting at the last responsible moment, you are minimizing the requirement churn damage that accompanies BUF. By architecting at the last responsible moment, you are taking advantage of the latest and greatest technology knowledge. Finally, by architecting at the last responsible moment, you are very sure that the people collaborating on that architecture are the people that are about to implement that architecture. All of a sudden, the ingredients for success are all in the right place at the right time!

Technical Excellence
Technical Excellence can refer to architecture, design, testing, coding, and probably other practices too. Regardless of whether you do BUF, NUF, or Emergent Architecture, the ability to quickly and cheaply extend or change your software architecture at any given moment is directly proportional to your practices around technical excellence. The higher your technical excellence, the more quickly and cheaply you can change direction (indeed, this is the definition of agility). Having said that, in BUF approaches, since people make the mistake of thinking that they can “lock down all the requirements and architecture Up Front”, they rarely put in the technical excellence needed for rapid change. Examples of technical excellence are pair programming, collective code ownership, continuous integration, continuous automated testing, build automation and build pipelines, Test Driven Development, Unit Testing, lightweight code reviews, YAGNI, and many other practices. Agile approaches harness change for the benefit of the customer, so don’t forget Agile Manifesto Principle #9: “Continuous attention to technical excellence and good design enhances agility.”

Architectural Runway and the “Last Responsible Moment”
Now, let’s be careful with the “last responsible moment”. Note the word “responsible” in “last responsible moment.” For different architectural pieces, that last responsible moment will be at different times. Some architectural pieces will require a longer runway, and other pieces can do well on a shorter runway. In summary, the more complexity and learning in the requirements, people, and technology, the longer the runway needed. If the complexity factors are low, then less runway is needed. Figure out that needed runway length (the length of time before that architectural piece needs to be in place), and work your way back to when that runway needs to begin being built.

Let’s look at some examples.

Architectural Runway for a Deployment Platform
For example, determining the deployment platform (the initial UI, logic, and other tiers) for the Minimum Viable Product for a new product probably needs to happen before the first Sprint of developing that product. Having said that, let’s not regress to BAUF, but let’s do some upfront thinking and have that deployment platform pretty well figured out before that first Sprint. Since every Sprint has to include at least some small amount of user/business valuable functionality, it’s going to be hard to create some releasable software functionality *and* create your initial deployment platform all in Sprint 1. It is theoretically possible, but not likely.

Architectural Runway for an Open Source Logging Framework (Complexity is low, should be a fairly quick decision)
Another example of “last responsible moment” might be choosing an open source framework to do logging in your system. If logging is forecasted to be in Sprint 23, there is no reason to choose that framework in Sprint 3. You can probably wait until a few sprints before Sprint 23 to begin working on that decision.

Architectural Runway for a 3rd Party Processing Component/Framework (Complexity is high, will be a long duration before a decision can be made)
Let’s assume the same as the above: we plan to begin using this component in Sprint 23, and we are currently in Sprint 3. We need to choose a large 3rd party proprietary component that does a large amount of processing or functionality (think accounting, medical, legal, or aerospace). Since it comes from a 3rd party, there is likely going to be a long runway needed to have that choice in place before functionality can be built on top of it. In this example, it might be perfectly fine to start the architectural discussions about which 3rd party framework to use and purchase in Sprint 3. Remember that you’re going to have to leave time for product evaluation, purchasing, legal, training/learning, and maybe even some technical investigation and technical spikes. It could actually take 20 sprints to execute that particular architectural runway before the architecture is first able to support business functionality.

Architectural Runway for logging standards across multiple teams (Complexity is medium, requires time for buy-in from multiple teams)
Remember that logging framework implemented in Sprint 23? Well, we’re now in Sprint 27, and now that the logging framework has been implemented on a couple of teams, the teams realize a need for standardization of use because the framework is beginning to be used widely across the product. At first people didn’t think this was a “high enough concern” to be considered architecture, but then they realized that the teams were using widely different “logging levels”, that debug logging often crossed module/system boundaries, and that debugging across those modules was getting difficult. The teams finally realized that there would be some major value in ensuring some light standards were followed across the teams. There might even be a nice wrapper around the chosen logging framework to make it easier for teams to use in a more consistent way. So, in Sprint 27, they began working on an architecture effort to solve those needs. By Sprint 31, the first teams were using the new wrapper and light standards.

One thing to note here is that, back in Sprint 20, when they were originally choosing a logging framework, they could have chosen to do all of that standardization up front, but that would have been BUF thinking, and they could not have predicted the ways the framework would be used. It was very possible that no other team would ever pick it up and use it, as happened with most of their open source frameworks. In other words, they would have been heavily speculating about needs rather than having a practical, experienced understanding of the real needs. Now, with the experience of a couple of different teams actually using it, in production, they can make better decisions about how to lightly standardize. In this way, they chose to do Emergent Architecture over BUF thinking.
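To make the “nice wrapper” idea a bit more concrete, here is a minimal sketch of what such a wrapper might look like. I’m assuming Python’s standard `logging` module as a stand-in for the chosen framework. The names `get_logger`, `ALLOWED_LEVELS`, `STANDARD_FORMAT`, and the correlation id field are all hypothetical, invented purely for illustration. Your teams’ real standards would come out of their own experience using the framework.

```python
import logging

# Hypothetical light standard: the only levels the teams agree to use.
ALLOWED_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}

# Hypothetical shared format, so log lines from different modules
# can be correlated when debugging crosses module/system boundaries.
STANDARD_FORMAT = (
    "%(asctime)s %(levelname)s [%(name)s] corr=%(correlation_id)s %(message)s"
)


def get_logger(team, module, correlation_id="-", level="INFO"):
    """Return a logger that follows the (assumed) cross-team standards."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"level must be one of {sorted(ALLOWED_LEVELS)}")

    logger = logging.getLogger(f"{team}.{module}")
    logger.setLevel(ALLOWED_LEVELS[level])

    # Attach the standard handler only once per named logger.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(STANDARD_FORMAT))
        logger.addHandler(handler)

    # LoggerAdapter adds correlation_id to every record it emits.
    return logging.LoggerAdapter(logger, {"correlation_id": correlation_id})


if __name__ == "__main__":
    log = get_logger("billing", "invoices", correlation_id="req-42", level="DEBUG")
    log.debug("invoice lookup started")
```

The point of keeping the wrapper this thin is that it only encodes what the teams have actually learned they need (consistent levels, a consistent format, a way to trace across modules) and nothing speculative.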

Architectural Runway for Performance and Scalability (Complexity is highly variable here, depends on the product and the target stakeholders for the release)
Hopefully, if you have a new product, you have a Minimum Viable Product (MVP) release that doesn’t require supporting a zillion users concurrently. However, every product is different, so act accordingly. An MVP will certainly need to perform and scale in a way that meets the needs of the initial users. Get this wrong, and you will pay a very heavy price in terms of customer satisfaction and product adoption. As such, for an initial release, this topic is extremely important. The non-functional requirements around performance and scalability will often be ordered high in your backlog, as they are both high value and high risk for an initial release. Having said that, sometimes premature optimization is a source of waste. The runway here will also depend on the tools and environments you need to do proper load testing. Choosing and setting those up can often take serious time, so that would extend the runway needed. So, the runway answer here is probably: “it depends.”

Architectural Runway for Government Regulatory Compliance (Complexity is highly variable here, depends on the weight/complexity of compliance, the agility of the agency, etc.)
The guidance for this piece of the architecture is almost identical to the factors involved in dealing with scalability and performance requirements. They are “must haves” that will often be ordered very high in the product backlog due to their risk and value. Note here that we are mainly discussing compliance needed by the product, not the process. (On a related subject, sometimes software process compliance ends up on your Definition of Done, which may or may not affect the product architecture or product backlog directly.)

The above runway discussions are just some examples to think about. The architectural runway for each of these needed pieces of the architecture will be different. As such, the “last responsible moment” to build that runway will require some serious thought as you refine your product backlog. Further, in order to deal with the unknown unknowns, you’d better put a time buffer into that runway since you can’t possibly know the unknowable!
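The “work your way back” arithmetic from the examples above can be sketched in a few lines. This is only an illustration. The function name `runway_start_sprint` is hypothetical, the 3-sprint runway for the logging framework is my assumption for “a few sprints”, and the 2-sprint buffer is an arbitrary example value, not a recommendation.

```python
def runway_start_sprint(needed_in_sprint, runway_sprints, buffer_sprints=0):
    """Work backwards from the sprint where an architectural piece must be
    in place to the sprint where work on its runway should begin.

    A result below 1 means the work has to start before Sprint 1.
    """
    if runway_sprints < 0 or buffer_sprints < 0:
        raise ValueError("runway and buffer lengths cannot be negative")
    return needed_in_sprint - runway_sprints - buffer_sprints


# 3rd party component: needed in Sprint 23, roughly a 20 sprint runway.
third_party_start = runway_start_sprint(23, 20)

# Logging framework: needed in Sprint 23, assumed 3 sprint runway.
logging_start = runway_start_sprint(23, 3)

# Same 3rd party component, with an example 2 sprint buffer for unknowns.
third_party_with_buffer = runway_start_sprint(23, 20, buffer_sprints=2)

print(third_party_start, logging_start, third_party_with_buffer)
```

Without a buffer, the 3rd party component lands on Sprint 3 and the logging framework on Sprint 20, which matches the examples. Adding even a small buffer to the long runway pushes the start back toward the very first Sprint, which is why the high complexity pieces deserve early attention during refinement.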

In Summary
Hopefully the above has given you a picture of what Emergent Architecture looks like. With each vital part of the architecture, you will need to consider the runway that gives you the “last responsible moment” and some buffer for the unknowns. Because the requirements will be emerging over time, so too will each of the architectural pieces. The architecture will emerge, or evolve, over time, but with intentional forethought to the necessary runway. Remember, due to the number and churn of moving parts (requirements, people, technology, etc.) inherent in almost all software development, the *only* way to keep an optimum architecture over time is to give continuous attention to technical excellence. Finally, in order to avoid the gigantic traps of BUF and NUF, Emergent Architecture is really the only sane choice.

(If you have several teams on the same product or in the same org who need to keep the Architecture of systems excellent, see: An Agile Architecture Community of Practice)


Notes:

^1 The focus of this article is Agile Emergent Architecture, so we give only brief attention to a definition of architecture here. It’s worth noting that many thought leaders have multiple credible definitions of architecture. Further, Martin Fowler and Molly Dishman have mentioned that some think the term “architect” is not even the correct metaphor — that maybe “city planner” is the better metaphor.

^2 It’s worth noting that there are some very rare scenarios where a system re-write might not be a sign of failure. For instance, if the system re-write is an early pivot in a product’s lifecycle, it might make great sense and not be due to the above mentioned causes. Having said that, I’ve never personally seen a re-write that was due to a good reason.
