This view of service management… On the origin and the descent of managing services. We put meat on the bones.

Verbs, nouns and kanban board structure (2022-12-02)

A kanban board structure may be designed by thinking of columns as verbs and cards as nouns.

The post Verbs, nouns and kanban board structure appeared first on This view of service management....


I have often been called upon to help organizations that have gotten off to a bad start in using kanban. Often, the team lacks an understanding of the scope of work items and the definitions of workflows. The kanban board structure suffers heavily. What approach have I found useful in making sense of these interrelated concepts? How can teams more easily improve the kanban board structure?

The Existing Situation

Too often, the existing situation reflects more of a cargo cult approach to managing the flow of work. The work items consist of a hodge-podge of all sorts of work with all sorts of scopes. Those scopes range from short meetings to months-long projects. What do I often see when the workflows have more detail than To Do / Doing / Done? In such cases, the columns of the board often consist of an incoherent mixture of high-level tasks, minor details, milestones and queues. The cards on the board, too, reflect a lack of effective policies defining card scope.

Card and Column Redundancy

Some cards are redundant with the workflow columns. For example, there might be a column labeled Review the document and, sure enough, there is also a card entitled Review the document. The breadth of the work varies from large projects to minor activities, such as attending a short coordination meeting, and everything in between.

Visualize all your work?

Such situations often exist because the team had been advised that all of its work should appear on a kanban board. While the team should indeed start to make visible all its work, doing so without any guidance or policies is not a sustainable practice. Thus, seeing cards of all sorts might be normal during the first few weeks that a team works with a board. But the team should quickly evolve from that state. Otherwise, it is not likely to get much benefit from either the board itself or from the kanban method. At worst, the board is apt to die out, being considered as administrative overhead with little added value.

Making sense of the cards and columns

Teams in such situations need quick and simple remedial actions to start making sense of the kanban board. They need clear and easily understandable principles to apply to the use of the board. Only then can teams start using their boards as fundamental tools for the continual improvement of the flow of work.

The analogy of language syntax[1] helps to provide this clarity. At the simplest level, I have found it useful to think of the columns on the board as verbs and the cards as nouns.

Columns are verbs

Let’s take a simple example. Suppose a team’s value stream consists of analyzing, building and testing work items. The columns on the In Progress portion of the board should all be labeled as verb phrases. Thus, the column labels might be Analyze, Build and Test.

Avoid noun phrases

Avoid noun phrases, such as Analysis, Construction and Testing, to label the columns. Noun phrases lead to confusion between the action performed and the object upon which that action acts.

Avoid adjectives and adverbs

Avoid adjectival or adverbial column labels, which tend to reflect the status of work rather than the activity the team performs. Work status should be obvious from the position of the work item in the relevant column; expressing it in another way is supererogatory. Labeling columns with status information is redundant and leads to confusion about where to place a card. You might object that the classic column names Ready and Done do not obey this rule. I discuss that issue below.

Labeling Queues

How would we label the columns used to model queues? After all, by definition, teams do not actively operate on work in queues. There might be a tendency to label queues with adjectives. Thus, you might see something like Doing Analysis followed by a queue called Analysis Done. I would recommend instead maintaining verb phrases. Thus, the columns would be Analyze and then Wait for Building (assuming the following column is Build). Similarly, we would see Build and then Wait for Testing. Using verb phrases for queues makes clear why a work item is in the column and what is expected to happen next.
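As an illustrative sketch (the column labels are taken from the example above; everything else is a hypothetical data structure, not any real kanban tool), the verb-phrase queue labels make it trivial to answer "what happens next?" for any card:

```python
# A sketch of the column sequence described above. Queue columns are
# wait states; their verb-phrase labels name the activity they await.
IN_PROGRESS_COLUMNS = [
    {"label": "Analyze", "is_queue": False},
    {"label": "Wait for Building", "is_queue": True},
    {"label": "Build", "is_queue": False},
    {"label": "Wait for Testing", "is_queue": True},
    {"label": "Test", "is_queue": False},
]

def next_activity(column_index):
    """Return the label of the next non-queue column, or None at the end."""
    for column in IN_PROGRESS_COLUMNS[column_index + 1:]:
        if not column["is_queue"]:
            return column["label"]
    return None
```

For a card sitting in Wait for Building (index 1), `next_activity(1)` returns "Build", which is exactly what the queue's own label already announces.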

Cards are Nouns

If columns are verbs, then the work items associated with cards are nouns. More particularly, they are the objects of the verbs. If, for example, a column is labeled with the verb Analyze, we need to ask the question, “Analyze what?” The title of a card provides the answer to that question. That same card title answers the questions, “What is waiting for building?” “What is being tested?” and so forth.

Suppose the team does marketing work. It defines a value stream for creating and executing a marketing campaign. Each marketing campaign (a noun) would have its own card. Suppose the team is managing the on-boarding of a new employee. Each new employee would have her or his own card. Suppose a pharmaceutical team manages a clinical trial phase. Each deliverable in the phase would have a corresponding card. Suppose a finance team is preparing a quarterly budget update. Each budget reforecast would have its own card, and so forth.

Nouns, Verbs and Value Streams

Using nouns and verbs as described above makes it much easier to think of a workflow or process as a value stream. The noun is the bearer of value. It is the noun whose value is increased incrementally as it passes through the value stream. In the pharmaceutical example, each column of the clinical trial phase leads to a better estimation of the safety and effectiveness of the drug. In the financial example, each column makes the budget reforecast more likely to be accurate. In the marketing example, each column brings the campaign closer to prospective customers apt to buy the company’s product.

The verbs are the successive transformations of the noun that progressively add value. When thinking in such terms, many of the problems to which I alluded above readily disappear. A card with a label such as “Coordination meeting” would make no sense. One does not “analyze” a coordination meeting as part of the normal workflow (unless it is a feedback loop/retrospective type of workflow). One does not “test” a coordination meeting, and so forth. Since activities such as coordination are not, by definition, value-adding activities, they should not appear as In Progress columns on the board.

Subjects of the Verbs

Continuing with the syntactical analogy, what would the subjects of the verbs be? Clearly, they are the actors performing the value-adding actions that correspond to each column.

These subjects, or actors, are typically visualized as attributes of the individual cards, not as attributes of columns. Why is this so? The simple answer is that making them card attributes considerably enhances the flexibility and adaptability of the board. But let’s take a closer look.

Generalists and Specialists

Suppose a team consists of specialists, where each action or column is performed by a distinct function. It might make sense to display the name of that function as the subject of the verb describing the column’s activity. Suppose a team working on a drug’s clinical trial consists of statisticians, programmers, medical writers and data managers. One might imagine that a column called Analyze the Probability might be more precisely labeled as Statistician Analyzes the Probability. Similarly, a column labeled Write the Report might instead be labeled as Medical Writer Writes the Report.

That approach makes little sense if the team consists of generalists, where many team members are capable of performing many different types of work. Indeed, much of the early development of kanban for knowledge work concerned such teams. As a result, the inclusion of function names as the subjects of column labels never gained traction.

Separation of Duties

In other cases, a team desires a separation of duties rather than being constrained by distinct technical expertise. In such cases, the team decides that the person performing one activity should be someone other than the person performing the previous activity. This approach could add value by avoiding many cognitive biases and by adding the thoughts and experiences of multiple people to the work item. It is also a common practice intended to improve information security.

Thus, if you have as succeeding columns Build then Test, it might be desirable that the tester be someone other than the builder. In this case, the identity of the worker—rather than the worker’s function—is important. Normally, worker identity is an attribute of the individual work items, not of the columns.
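A team wishing to enforce this policy in tooling might record worker identity on the card, keyed by column. The following is a hypothetical sketch under that assumption, not a reference to any real kanban tool:

```python
class SeparationOfDutiesError(Exception):
    """Raised when one person tries to perform two successive activities."""

def move_card(card, from_column, to_column, worker):
    """Move a card to the next column, recording worker identity on the
    card itself (not on the column) and refusing the move when the new
    worker is the one who performed the previous activity.
    """
    if card["workers"].get(from_column) == worker:
        raise SeparationOfDutiesError(
            f"{worker} performed '{from_column}' and so may not perform '{to_column}'"
        )
    card["column"] = to_column
    card["workers"][to_column] = worker
```

Because the policy lives in the card's attributes, the same board layout serves teams with or without separation of duties; only the move rule changes.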

The Backlog, Commitment and Done

Until now, I have been discussing the columns of a kanban board representing the In Progress phase of the workflow. Can we apply the same concepts to those parts of the workflow reflecting the backlog, commitment to doing work, completed work and archived work? These columns are largely generic in nature and have a well-established naming tradition. Therefore, it might be as well to leave the respective column titles as nouns or adjectives, rather than verb phrases.
There are nonetheless some good reasons to consider applying the same verb phrase practice to all the columns. These reasons include:
  • Better integration of a team’s value streams into the overall value chain
  • Clarification of what should happen in transitioning from the backlog to In Progress

Value Stream Integration

When a board’s workflow starts with “Backlog” and ends with “Done”, the visual presentation gives the impression that the team’s work is isolated and independent of activities elsewhere in the organization. But such isolation is seldom the case. It smacks of edge-of-the-world, flat-earth thinking. Rather, the object to which the value stream adds value is delivered to some customer, for the purpose of achieving certain outcomes.

As the value stream mapping exercise makes clear, it is of critical importance for a team to understand who are its suppliers (deliverers of input to the work) and customers (recipients of the output of the work). From the perspective of the enterprise, work is not “done” when it reaches the last column on the last kanban board. Rather, work is in a queue waiting for the next team or customer to make use of that work.

Consequently, it might make sense to rename a column from “Done” to “Wait for Team X to Handle” or something of the sort. This is particularly interesting if a team provides shared services, directing its output to various customers. Rather than a single “Done” column, the team might have multiple sub-columns for its output, depending on who should be receiving the output. A similar principle could be applied to the backlog.
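A routing rule for such output sub-columns could be sketched as follows; the column names and the `customer` card attribute are illustrative assumptions of the sketch:

```python
def output_column(card):
    """Route a completed work item to a waiting sub-column named for the
    team or customer that should receive it, rather than a generic Done.
    The verb-phrase label shows that, from the enterprise's perspective,
    the work is merely queued for the next consumer.
    """
    customer = card.get("customer")
    if customer is None:
        return "Wait for a Customer to Be Assigned"
    return f"Wait for {customer} to Handle"
```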

Transitioning out of the Backlog

A backlog is not simply a queue in which work items passively await further handling. What is really happening to work items in a backlog? It makes sense to limit the non-value-adding effort to manage items in a backlog. There is nonetheless certain work to be done.

The first task is to decide which backlog items the team should commit to executing. The second task is to make more precise what the team is committing to. I intend to discuss this second task in a future article. In any case, it behooves the team to limit its effort in performing these tasks, given the risk that a work item might never go through the value stream and deliver any value.

How, then, might we label this column with a verb phrase, rather than simply as Backlog? Perhaps a more expressive label would be Groom and Commit or Groom and Refine. What are the objects of this verb phrase? They are simply the requests made by customers to do work. As such, you may even wish to label the column as Groom and Refine Customer Requests.

This label makes it clear that the team needs to remove from the column any work items it does not intend to perform and to refine the remaining work items as a prerequisite to committing to the work. It is useful, then, to move to a new column those work items to which the team has committed.

That new column is a true queue, in the sense that work items simply wait there. No work is done on them. Indeed, the moment any work is done on such a work item, the card should be moved to the first In Progress column. How might we label this column, rather than something like Ready? Following the verb phrase recommendation, perhaps we could label it Wait for Capacity to Handle. Once that capacity is available, the next work item is selected, based on whatever priority policy the team has defined.
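The pull policy just described might be sketched as follows. The `priority_key` function stands in for whatever priority policy the team has defined, and the WIP limit on the first In Progress column is an assumption of the sketch:

```python
def pull_next(ready_queue, current_wip, wip_limit, priority_key):
    """Pull the highest-priority waiting card into the first In Progress
    column, but only when that column has spare capacity.

    `ready_queue` holds the cards in Wait for Capacity to Handle;
    `priority_key` encodes the team's priority policy (cost of delay,
    age of the request, etc.).
    """
    if current_wip >= wip_limit or not ready_queue:
        return None  # no capacity, or nothing committed yet
    ready_queue.sort(key=priority_key)
    return ready_queue.pop(0)
```

The point of the sketch is that nothing is pushed: a card leaves the queue only when capacity appears downstream.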


By modeling the column labels and card scopes using a language syntax analogy, we can visualize more clearly how work progresses through a value stream. We have simple rules to help teams define workflow structures and the scopes of cards.

In summary, each column is labeled as a verb phrase. Each work item is labeled as a noun phrase. That noun phrase is syntactically the object of the verb phrases in the columns. In certain cases, especially when agility is not very important and teams are composed of specialists, columns may be labeled as sentence fragments, including the name of the function responsible for doing the work in that column. In such cases, the function name would be the subject of the verb.

A simple example of a board applying these principles is illustrated below:

The article Verbs, nouns and kanban board structure by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


[1] I am aware that the analogy I make in this article to language syntax is meaningful only for many Indo-European, Hamito-Semitic and certain other languages. Such concepts as verb objects might be meaningless in many other languages. I hope the linguists among the readers will excuse the ethnocentrism.


The role of the problem manager (2021-09-25)

The post The role of the problem manager appeared first on This view of service management....


Before I talk about what I think a problem manager should be doing, we might start by summarizing what problem managers typically do. Of course, every problem manager performs differently a role that each organization defines differently. So, I can only make a list of some of the major responsibilities of problem managers. Rare are the problem managers who perform all these activities.

What problem managers do

Manage problem records

If an organization has a formal problem management discipline in place, it probably has a problem manager making records of problems and how they are being handled.

Act as a gateway

The formalism of problem management requires someone to identify problems and to determine when the organization meets the entry and exit criteria for each phase of the problem management value stream. The problem manager generally plays this role, if anyone does.

Form resolution teams

Since problems often require multiple sets of expertise to resolve, an organization needs a means for identifying who can reasonably contribute to that work and for getting these people to contribute.

Chair progress meetings

Although some problem resolution teams might be self-organizing, many organizations have a culture requiring the presence of a formal meeting chair who calls meetings, fixes meeting agendas, conducts the meeting and often documents what was done and decided during the meeting.

Train problem resolvers

To the extent that an organization has a problem management process that it expects resolvers to follow, the problem manager might be the person who trains those resolvers. Sometimes, the training also includes the use of various problem management tools.

Coach problem resolvers

Due to the limited effectiveness of one-off training, the organization might find various follow-up activities useful to develop the maturity of problem resolution over the long term.

Interface with other process managers

Problem management has many close relationships with other disciplines, such as risk management, incident management and change management, inter alia. The managers of these disciplines exchange information, negotiate boundaries, agree on the handling of specific cases and many other little details of the interfaces.

Track problem status

Sometimes, problems just seem to disappear without anyone having done anything explicitly to resolve them. Of course, something did change, but the link of that change to the resolution of the problem was not recognized. So, a problem manager may periodically review the lists of open problems and determine if they still exist.

Create and distribute reports

Problem managers oversee the creation of periodic reports about the health and progress of their discipline, as well as analyses of the aggregated problems being handled.

What should problem managers do?

I wouldn’t ask this question unless I thought something were missing from the typical roles performed by problem managers. Most of the activities described above are non-value-adding activities. Problem resolvers—the people who figure out the causality of a problem and define what should be done to mitigate the problem—deliver the real added value of problem management.

OK. A little bit of coordination does indeed help. However, many organizations have a culture that perpetuates control and coordination activities. Is problem management not as effective as you might like? Maybe you need more detailed processes, more training, more policies, more control. In short, more problem managers. A command and control approach to problem management thus dictates such behavior.

I would argue that a good measure of the problem manager’s success is the decreasing need for the role. There will always be problems. There will always be a need to handle them. But can we find a way to achieve the goals of problem management with less and less input from the problem manager?

Coach individuals in the formation of ad hoc teams

As an ad hoc team, a problem resolving group usually has a short life span. What percentage of that life span does the team spend in figuring out how to collaborate and what percentage does it spend in the value-adding work of resolving problems?

In my role as a problem manager, I have frequently seen cases where the strength of personalities in the resolving group determines its working approach. Some people make snap judgements about the causes of a problem and refuse to listen to the contributions of others. Other people have ideas about the problem, but are afraid of appearing foolish should their ideas not pan out. In any case, they might be unwilling to enter into conflict with their outspoken teammates.

In other cases, team members don’t know how to handle uncertainty. Many technicians either believe they know something (with 100% certainty) or they are simply unwilling to commit themselves. In other words, they see no useful ground between 100% sure and not knowing at all. And yet, that is precisely the ground where we almost always find ourselves.

Wouldn’t problem resolution be more efficient if the team’s storming and norming phases could be skipped? Shouldn’t it be possible for an ad hoc team to be performing from the start? I suggest that the problem manager should play a coaching role to develop teaming skills, encourage psychological safety, help people learn how to calibrate their levels of uncertainty and advise on appropriate levels of risk in the problem resolution activities.

Coach teams in self-organization methods

Often, existing organizational units have all the skills and authority required to handle a problem from end to end. In such cases, the presence of an external problem manager can be viewed as a form of external interference in the affairs of that organization. And yet, left to their own devices, such organizations often let problems fester until they provoke serious incidents.

Self-organization is the lowest overhead approach to resolving such problems. But teams often have hierarchical managers who dictate their activities and priorities. An organization will not likely transition spontaneously to a self-organizing culture. Thus, a problem manager/coach may usefully nudge organizations in a lower overhead direction.

But is this truly a role for a problem manager? In my view, the most fundamental problem of all—organizational units that perform ineffectively and inefficiently—is indeed a matter for a problem manager/coach.

Coach leaders in low overhead methods to find consensus

Although many problems may be handled by existing teams, handling other problems may require a consortium of people from multiple organizational units.

Some organizations depend on formal methods to re-assign people temporarily to ad hoc tasks. These cumbersome methods impede the rapid and flexible resolution of problems. Often, teams do not share the same priorities or have conflicts between internal priorities and enterprise priorities. A classic example of reinforcing and rewarding such attitudes and behavior is the use of personal and team bonuses based on achieving personal and team objectives.

Again, such situations are unlikely to change spontaneously. Indeed, I have more frequently seen the reinforcement of the causes of the issues than a true improvement. Is the team’s work ineffective? Enforce better compliance with the process! Find a manager who better controls the team! Increase the frequency of audits! Add more validation and approval steps! And so forth.

Instead of such regressive, illogical behavior, a problem manager/coach could play a role in helping diverse teams find a consensus on priorities and develop low-overhead behavior to ensure that people are available to address problems. Organization members should move toward spontaneously volunteering to work on problems rather than waiting for a manager to formulate a request, which is submitted to a resource allocation board and approved by a top-level manager with a budget.

Withering Away the Problem Manager Role

I promote a vision of problem managers becoming more like problem coaches. The problem coach’s role is to encourage a culture of behavior to rapidly and flexibly address problems with a minimum of overhead. The more successful the execution of this role, the more the organization becomes capable of spontaneously addressing problems as part of normal work. At the same time, the role of the problem coach becomes less and less needed. Ideally, the problem manager/coach role should tend to wither away.

The article The Role of the Problem Manager by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


The Three Indicators (2020-09-28)

When managing work in an agile manner, we should consider three types of indicators: lead, lag and along.

The post The Three Indicators appeared first on This view of service management....


Using the Kanban method leads us to rethink the indicators of performance and work management. Traditionally, we speak of lead indicators and lag indicators. But neither of these does justice to the essential benefit of Kanban: allowing teams to improve how they work while they do the work. Thus, it is useful to speak of a third type of indicator: the along indicator.[1]

Suppose you are racing down a road in your car and come to an unexpectedly sharp bend. There is danger that you might not safely negotiate the curve.

Lead indicator: When you get into the car, your passenger says you should not drive too fast. You risk not being able to react in time to unpredictable conditions in the road. But, since your passenger is always nagging you about your driving, you may or may not pay attention.

Along indicator: While you enter the curve, your passenger shouts “Slow down!” If your reaction time is sufficient and the road is not too slippery, you brake enough to safely negotiate the bend.

Lag indicator: The police report about yet another fatal accident at that bend concluded that the car was going too fast. As a result, they had a large sign erected near the start of the bend saying “DANGEROUS CURVE AHEAD. SLOW DOWN!” Alas, that sign will bring neither the driver nor the passenger back to life.

The usefulness of the along indicator may be brought into relief by looking first at indicators used in the scrum method. Consider the sprint during which the team executes its value stream (or process) multiple times. Lead indicators are measured before the start of a sprint. Lag indicators are measured after the end of a sprint. As we shall see, I think it useful to speak of three indicators: lead, lag and along.

Lead Indicators

Fig. 1: The lead indicator is measured before a process is executed and applied to one or more future process instances

The story points associated with user stories exemplify the lead indicator. Estimating story points provides an indicator of how much work should be planned for a sprint. On the other hand, velocity might exemplify a lag indicator, measuring the user stories the team completes during the sprint.

Lead indicators supposedly indicate how well an organization will likely perform. However, our VUCA world can severely limit the usefulness of such indicators. We make decisions using lead indicators as input, then perform actions based on those decisions. So often, the volatility of circumstances and the uncertainty of the lead indicators make those indicators less useful than desired. Lead indicators offer no guidance in addressing those unexpected changes that occur while the work is being done.

Lag Indicators

Fig. 2: The lag indicator is measured after one or more process instances and applied to one or more future process instances

Wer nicht von dreitausend Jahren
Sich weiß Rechenschaft zu geben,
Bleib im Dunkeln unerfahren,
Mag von Tag zu Tage leben.
He who cannot draw on three thousand years is living in the dark from hand to mouth.
-Johann Wolfgang von Goethe

Lag indicators have the accuracy of 20-20 hindsight. Well, they do if the data selection is not too biased. And the calculation algorithm must be correct. And that algorithm must be applied correctly. And the indicator should be part of a balanced decision-making system.

As systems become more complex, deciders find it more difficult to predict the results of any change. Lag indicators reinforce the illusion that a change in the recent past causes the current state of a system. Such illusions may lead to Bermuda triangle-type phenomena.

The idea that the future is unpredictable is un­der­mined every day by the ease with which the past is explained.
-Daniel Kahneman

Lag indicators might be useful if teams exploit them in a PDCA-style improvement cycle:

  1. They do some work
  2. They measure what they have done (i.e., they measure lag indicators)
  3. They make some changes
  4. Return to step 1.

For some, improvers are always at least one cycle behind when they choose improvements based on lag indicators. If the work done in step 1 (above) has unsatisfactory results, lag indicators come too late to prevent the problem in the current process cycle. Using lag indicators to make decisions about the next cycle of work faces the same VUCA issues as lead indicators. Such use of indicators reminds us of the phenomenon of the general always fighting the last war.

Often, a lag indicator aggregates multiple measurements of multiple cycles of work. Such aggregation increases even more the lag between the activities measured and the supposed improvements based on those measurements.

Along Indicators

Fig. 3: The along indicator is measured while a process instance is being executed and applied to that same process instance

Carpe diem quam mi­ni­mum credula postero.
Seize today and put as little trust as you can in the morrow.
-Quintus Horatius Flaccus

So, the question is whether we can find indicators that help us make decisions while work is being done. Can we make a difference in the performance of the current process cycle, given the current circumstances, not the predicted circumstances in the future? This is where the Kanban method shines.

Take the example of cycle time, a typical lag indicator. More often than not, we seek to reduce mean cycle time for a given type of work. Kanban provides us with visual indicators of the current conditions that slow cycle time:

  • blocked work items
  • large queues
  • bottlenecks
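As a hypothetical sketch, a tool could scan a board snapshot for these three conditions. The card attributes, the queue threshold and the use of WIP limits to flag bottlenecks are all illustrative assumptions:

```python
def along_signals(board, queue_threshold=3):
    """Scan a board snapshot for the three visual conditions that slow
    cycle time: blocked work items, large queues and bottlenecks
    (here approximated as columns at their WIP limit).
    """
    signals = []
    for column in board:
        blocked = [c["title"] for c in column["cards"] if c.get("blocked")]
        if blocked:
            signals.append(("blocked", column["label"], blocked))
        if column.get("is_queue") and len(column["cards"]) >= queue_threshold:
            signals.append(("large_queue", column["label"], len(column["cards"])))
        limit = column.get("wip_limit")
        if limit is not None and len(column["cards"]) >= limit:
            signals.append(("bottleneck", column["label"], len(column["cards"])))
    return signals
```

These signals are exactly what a physical board shows at a glance; the code merely makes the same observations available for a daily review.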

By themselves, such indicators do nothing to improve performance. Improvement might come when those indicators are input to a cadence of reviews, decisions and actions. Certain of these cadences are along cadences, whereas others are lag cadences.

The daily team meeting and any immediate follow-up actions exemplify the along cadence. They typically address the issue of blocked work items. Or certain bottlenecks might be addressed by an immediate re-balancing of how the team uses its resources.

But suppose you need to address a bottleneck by changing a team’s quantity of resources or by changing the value stream. Indicators aggregated over time support such decisions. A monthly or even quarterly cadence of such decisions addresses these aggregated indicators. While “number of days blocked for a single work item” might be an along indicator, “mean number of days blocked during the past month for all work items” might be a lag indicator.
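The contrast between those two indicators can be sketched as follows; the card attributes (`blocked_since`, `total_blocked_days`) are illustrative assumptions, not the fields of any real tool:

```python
from datetime import date

def days_blocked(card, today):
    """Along indicator: how long this one, still-active work item has
    been blocked. It can trigger action on that same item."""
    started = card.get("blocked_since")
    return (today - started).days if started else 0

def mean_days_blocked(completed_cards):
    """Lag indicator: mean blocked days aggregated over completed items.
    It can only inform decisions about future cycles of work."""
    if not completed_cards:
        return 0.0
    total = sum(c.get("total_blocked_days", 0) for c in completed_cards)
    return total / len(completed_cards)
```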

Note that aggregation does not, by itself, determine whether an indicator is lag or along. For example, traditional project management measures whether a single project is on time and within budget, both classic lag indicators. And both are measured too late to make any difference for the project concerned. They remain lag indicators when aggregated.

A matter of scope and timing

The astute reader will have noticed that along indicators are measured after one or more events occur, like lag indicators. Technically, this is true. The difference between a lag indicator and an along indicator lies in the granularity of the work we measure. An along indicator is useful only if the cycle time is significantly longer than the time to measure the indicator and react to the measurement.
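That condition can be stated as a simple predicate; the function name is illustrative and the time values are in whatever unit the team uses:

```python
def can_serve_as_along_indicator(measure_time, react_time, remaining_cycle_time):
    """An indicator can steer the current cycle only if measuring it and
    reacting to it both fit within the work that remains; otherwise the
    measurement can only inform a later cycle (as a lag or lead indicator).
    """
    return measure_time + react_time < remaining_cycle_time
```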

Let’s return to the analogy I provided at the start of this article.

Fig. 4: For lead indicators, the length of the cycle is not particularly relevant. The measurement and resulting action precede the cycle start. The lapse of time between the action taken and the cycle start should be short.

Lead indicator: The scope of the indicator is too broad. It refers to your driving in general or to the entire trip. The indicator is not specifically tailored to the event of approaching a sharp bend in the road. Indeed, neither of you may have been thinking about the risk of unexpected sharp bends.

while indicator cycle
Fig. 5: For along indicators, the time between the measurement and the action taken must be shorter than the remaining length of the cycle of work.

Along indicator: The indicator is tailored to the specific segment of the road on which you find yourself. It is communicated in such a way that the action to take is unmistakable and needs to be immediate.

lag indicator cycle
Fig. 6: If it takes longer to act on the measurement than the remaining duration of the work cycle, then the measurement cannot be an along indicator. It can only be a lag or lead indicator. Note that a lag indicator is typically measured after the end of the cycle, not during the cycle.

Lag indicator: Like the along indicator, it concerns only the immediate segment of the road. It might have some impact on the behavior of future drivers. But it comes too late to mitigate the current situation. Indeed, prudent or slow drivers might end up ignoring the sign completely. And if the driver is in a particular hurry one day, the sign might have no impact.

Three Indicators: Lead, Lag, Along

Management methods promoting agility encourage patterns of behavior enabling along indicators. These methods emphasize the value of adapting the granularity of work items. When properly defined, a work item is sufficiently large to output something useful to clients while keeping the administrative overhead low. It is sufficiently small to be measurable with along indicators and to allow for changes in direction with a minimum of lost effort. At the same time, whether or not the work item is completed successfully, it is of a size that encourages learning from the work experience.

I hope this discussion encourages you to seek out along indicators and make the three indicators—lead, lag, along—part of your continual improvement efforts.

The article The Three Indicators by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Unless otherwise indicated here, the diagrams are the work of the author.

Speedometer: By Bluescan – Own work, Public Domain,

Figs. 1-3: The embedded image of the protractor is drawn from Scientif38 – Own work, CC0,


1 In using the term “along” I attempt to preserve the alliteration of “lag” and “lead”. Terms like “while” and “during” are less happy. Other sources speak of “coincident indicators”, but I hardly wish to give the impression that such indicators are coincidental.


Visualization of Configurations 2020-09-16T17:27:00Z


In this article, I will delve into some of the issues associated with visualizing the configurations of systems.

As with many other disciplines in service management, the use of visualizations in configuration management can be problematic. I hope to highlight some of these issues with a view toward:

  • improving the functionality software developers build into configuration management
  • expanding how consumers of configuration information make use of visualizations.

Many IT organizations have a high opinion of tools providing visualizations of configuration information. One organization with which I worked 15 years ago used this capability to justify choosing a particular ITSM tool. I was not surprised, however, when they never used that capability as part of their configuration management work. It was a good example of an excitement factor in a product or even a reverse quality. But why was this so? What qualities should visualizations have for them to become performance factors in managing configurations?

Scope of this discussion

By “configuration visualization” I mean visualizing the structures of systems. A system consists of a set of elements that may relate to each other in many ways. A configuration visualization should identify:
  • the system being visualized
  • the scope of the particular elements in the system being visualized
  • the dimensions of relationships among elements being visualized.
For example, a system might be the set of nodes in a network to which data packets may be addressed. Those nodes are generally some sort of computing device, such as general-purpose computers, routers, printers, load balancers, etc. These nodes might relate to each other in many different ways. They might be physically connected. They might be able to route data to each other. They might follow each other as part of a business transaction, and so forth.

Configuration visualizations represent the structures of systems. Designers do not intend them to directly represent the dynamics of those systems. These visualizations do not have the purpose of describing a process or any sequence of events. That being said, the visualizations of such events generally include the nodes at which the events take place. Designers often present these nodes in a structured way based on a configuration visualization.

Consider the difference between the map of a transportation network (e.g., Fig. 1) and the display of the route to follow for a particular journey (e.g., Fig. 2). Building the itinerary (a process visualization) on the foundation of the former (a configuration visualization) greatly enhances its usefulness.
TPG network map
Fig. 1: A configuration visualization of a transportation network shows the possible routes, their spatial relationships and the possibilities for interconnection.
Fig. 2: A process visualization of the use of a transportation network shows a specific itinerary. In this case, it also shows the timing of events.

In sum, the configuration visualizations I will discuss here document only the structure of a system. They do not document the activities of managing that system or even the activities of managing the system’s structure. However, configuration visualizations generally document structures from the perspective of only one, or perhaps a few, functions of the system.

Configuration Visualization Tense, Aspect & Mood

When documenting configurations we may speak of various
  • tenses—when the depicted configuration exists (past, present, future)
  • aspects—does the visualization represent a single moment, an extended period, a series of repetitions
  • moods—the attitude of the visualization designer to the documented structure, or how the designer intends the viewer to relate to the visualization.

Configuration Tenses and Aspects

Often, we wish to know the configuration of a system in the current tense. How is the system configured now? People changing systems also want to know the future tense of the configuration. After a change is made, what will the configuration look like? (Such configurations might be understood as imperatives rather than futures, since changes often have unexpected or undesired results.) Part of the diagnosis of a problem involves understanding how a system was configured in the past. Sometimes the diagnostician wishes to know the past perfect (aspect) configuration. This aspect might show how the system was configured at an instant in the past (perhaps as part of an incident). Other times the configuration stakeholder needs to know the past imperfect configuration—the configuration during a continuous period in the past. One might also consider intentionally temporary configurations, often as part of a transitional phase in a series of changes.

Configuration Moods

The above examples mainly concern the indicative mood. Planners of potential changes also take interest in the conditional mood. “If we make such and such a change, what would the resulting configuration look like?” Configuration controllers need to distinguish between the indicative—what is the current configuration—and what the current configuration should be. Architects might establish jussive principles—principles that the organization expects every configuration to follow. Finally, strategists and high-level architects concern themselves with the subjunctive, presumptive or optative moods. “If we were to adopt the following principles or strategies, what might the resulting configuration look like?” As often as not, such hypotheses serve to discredit a certain approach.

Conventions for Visualizing Moods and Tenses

Unfortunately, authorities provide no standard visual techniques to distinguish among these various tenses, aspects and moods. Visualization designers must resort to labels as the means to distinguish between what was, what is, what will be, what should be and what could be. And, unfortunately, designers seldom indicate these moods in their visualizations. At most, they indicate the initial publication date, or perhaps the date of the last update.

Intelligent Visualization Tools

I suggest here how intelligent visualization tools might express the various tenses and moods described above.


Animation

Analysts often wish to compare two configurations of the same system, differing by tense or mood. Animation provides an intuitive way to achieve this by highlighting the transitions between states. For example, a visualization might have a timeline with a draggable pointer. The viewer drags the pointer to the desired date (past, present or future) and the tool updates the configuration accordingly.
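The timeline idea can be sketched as a lookup over timestamped snapshots. The snapshot store and its contents below are invented for illustration:

```python
import bisect
from datetime import date

# Hypothetical snapshot store: each entry is (valid_from, configuration).
# Dragging the timeline pointer to a date selects the snapshot then in force.
snapshots = [
    (date(2020, 1, 1), {"web-1": "v1", "db-1": "v1"}),
    (date(2020, 6, 1), {"web-1": "v2", "db-1": "v1"}),
    (date(2021, 3, 1), {"web-1": "v2", "web-2": "v2", "db-1": "v2"}),
]

def configuration_at(when):
    """Return the configuration snapshot valid at the given date."""
    dates = [d for d, _ in snapshots]
    i = bisect.bisect_right(dates, when) - 1
    if i < 0:
        raise ValueError("no snapshot before this date")
    return snapshots[i][1]

print(configuration_at(date(2020, 8, 15)))  # the June 2020 configuration
```

A real tool would interpolate the transitions visually; the lookup above only selects which state to render.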

If the visualization depicts a small number of changes to a system, animating each change separately makes the nature of the change more visible (see Fig. 3).

Fig. 3: An example of an animated change to a server cluster. Animation makes visible each change (namely, the new servers and their connection to the load balancer).

Animation may be useful, but only under various conditions. First, the elements that change must be visually distinct. For example, visualization viewers might have great difficulty perceiving the change from IP address 2001:171b:226b:19c0:740a:b0c7:faee:4fae to 2001:171b:226b:19c0:740a:b0c7:faee:4fee. The tool might mitigate this issue by highlighting changes. For example, the visualization designer might depict the background of changed elements using a contrasting color.
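Such change highlighting can be sketched as a simple attribute diff; the element name is invented, while the IPv6 addresses follow the example above:

```python
# Sketch of change detection for highlighting: compare two versions of an
# element's attributes and report which ones differ, so the renderer can
# draw them with a contrasting background.
before = {"name": "fw-edge-1",
          "ip": "2001:171b:226b:19c0:740a:b0c7:faee:4fae"}
after  = {"name": "fw-edge-1",
          "ip": "2001:171b:226b:19c0:740a:b0c7:faee:4fee"}

def changed_attributes(old, new):
    keys = old.keys() | new.keys()
    return sorted(k for k in keys if old.get(k) != new.get(k))

print(changed_attributes(before, after))  # ['ip']: highlight this field
```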

Second, a stable, unchanging visual context should surround the elements that do change. Lacking this stability, the viewer might have great difficulty visualizing what has changed.

Third, the layout of the elements should be stable (excepting those that change, of course). For example, if two new elements replace a single element in the upper right corner of a visualization, those new elements should also be in the upper right corner. Tools automating the layout of elements using a force-directed placement algorithm might not respect this constraint. Such algorithms intend to position the elements and their links in the most legible way.

For example, they might attempt to make all nodes equidistant and minimize the number of crossed links. However, if the change involves a significant increase or decrease in the number or size of elements, such algorithms might radically change the layout. The change in layout makes it difficult to perceive the change. Allowing for very slow animation might mitigate this issue.
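One way to preserve layout stability is to retain the previous coordinates of unchanged nodes and position only new nodes near an already-placed neighbor, rather than re-running the full algorithm. The node names, coordinates and offset below are invented for this sketch:

```python
# Sketch of layout stability: keep previous coordinates for unchanged
# nodes and place only the new nodes, instead of re-running the full
# force-directed algorithm (which may rearrange everything).
previous_pos = {"lb-1": (0.5, 0.9), "srv-1": (0.2, 0.4), "srv-2": (0.8, 0.4)}
edges = [("lb-1", "srv-1"), ("lb-1", "srv-2"), ("lb-1", "srv-3")]  # srv-3 is new

def incremental_layout(prev, edges):
    pos = dict(prev)                    # unchanged nodes stay put
    for a, b in edges:
        for new, anchor in ((a, b), (b, a)):
            if new not in pos and anchor in pos:
                x, y = pos[anchor]
                pos[new] = (x + 0.1, y - 0.3)   # drop near its neighbor
    return pos

layout = incremental_layout(previous_pos, edges)
print(layout["srv-1"])  # (0.2, 0.4): existing nodes keep their place
```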

See also Fig. 8.

ERD generated by phpmyadmin
Fig. 4: This illegible entity-relationship diagram generated by phpmyadmin exemplifies the usefulness of force-directed algorithms.

Color Saturation

Visualization viewers can easily detect the saturation, or vividness, of color (although certain people might have difficulty seeing colors). We might assign different levels of saturation to different tenses or moods (see Figs. 5–7). Of course, such a convention would require training to be correctly understood.

Color saturation used for configuration tense—past
Fig. 5: A past configuration. Saturation 95%
Color saturation used for configuration tense—present
Fig. 6: A present configuration. Saturation 45%
Color saturation used for configuration tense—future
Fig. 7: A future configuration. Saturation 15%
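The convention of Figs. 5–7 might be sketched as a mapping from tense to color saturation. The saturation values come from the figure captions; the hue and lightness are arbitrary choices:

```python
import colorsys

# Sketch of the saturation convention in Figs. 5-7: one hue, with
# saturation decreasing from past (95%) through present (45%) to
# future (15%).
SATURATION = {"past": 0.95, "present": 0.45, "future": 0.15}

def tense_color(tense, hue=0.58, lightness=0.5):
    """Return a CSS-style hex color for the given configuration tense."""
    r, g, b = colorsys.hls_to_rgb(hue, lightness, SATURATION[tense])
    return "#{:02x}{:02x}{:02x}".format(round(r * 255),
                                        round(g * 255),
                                        round(b * 255))

for tense in ("past", "present", "future"):
    print(tense, tense_color(tense))
```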

Used in conjunction with animation, though, the change in saturation would both highlight the change and be intuitively obvious (Fig. 8).

Fig. 8: An example of combining animation and color saturation to indicate a change in configuration tense. The unsaturated color indicates a future configuration. Of course, the animation could be more sophisticated but would be labor-intensive and its creation very hard to automate.

Designers might use many other attributes of color to indicate different moods or tenses. However, we already use most of these attributes for various other purposes. Using such attributes as hue, which often indicates element type or location, to reflect mood or tense risks creating confusion.


Watermarks

Watermarks might be an excellent means for indicating the tense or mood of a configuration visualization. Authors often use them to distinguish between a draft version of a document and a final version. A simple text watermark highlights in an unobtrusive way precisely which configuration the visualization depicts.

One might imagine that a visualization without any watermark would represent the present. Any other tense or mood would have the pertinent watermark. To be correctly interpreted, older visualizations should display the date at which they were last known to be valid.

Future configuration with watermark
Fig. 9: A future, planned configuration with a watermark

Multi-dimensional configuration visualizations

As with the analysis of any data sets, visualization designers often find it very useful to reduce the number of dimensions being analyzed.

When I speak of “multi-dimensional” visualizations, I am referring to more attributes than just the positioning in Cartesian space. In addition to “2D” or “3D” dimensions, I refer to any other attributes of a system’s elements or relationships useful for depiction and analysis.

As I mentioned above, a visualization takes the perspective of one or a few functions of the system being documented. Let’s take the example of the map of a transportation network (see Fig. 10) to illustrate this point.

LIRR map
Fig. 10: The configuration of a transportation network

The network functions to transport people or goods from place to place. Therefore, the map must give an idea of the relative positions of those places and, often, their surroundings. Often, the map depicts these positions schematically, but close enough to “reality” to be useful. A second dimension of the map illustrates the possible interconnections among routes. A third dimension might indicate the types of vehicles used on the line, such as train, boat, bus, etc.

The map depicts each dimension using a different convention. It might indicate stations with a circle or a line perpendicular to the route, together with the stations’ names. Different colors might indicate the various possible routes. Solid or dashed lines might indicate the type of vehicle. Other symbols might indicate the types of interchanges.

Only the designer’s imagination and the visualization’s messages limit the types of dimensions that a configuration visualization might display. The classical dimensions include:

  • position—the coordinates or relative location of an element in two- or three-dimensional space
  • ontological classification of element type—the classification of the essential nature of the element, such as “computer”, “printer”, “modem”, etc.
  • ontological classification of relationship type—if the visualization depicts relationships among elements, what are the types of relationships, such as “is part of”, “is connected to”, “depends on”, etc.

Other dimensions might include, for example, the age of the element, its manufacturer, its model, its vendor, its guarantee status, its maintenance contract status, and so forth.

Visualization idioms

Designers use various types of visualization idioms to depict static configuration information:
  • node-link diagrams (graphs or directed graphs)
  • enclosure diagrams
  • treemaps
  • adjacency matrixes
  • labeled illustrations.
Needless to say, designers regularly innovate new idioms useful for this purpose.

Directed Graphs

directed graph
Fig. 11: A simple directed graph

Components of directed graphs

A graph consists of a set of nodes, some of which are connected by lines, called “edges”. A directed graph is a graph whose edges have directions. Configuration managers commonly use directed graphs to represent a network of components, such as computers and other active network devices. Node-link diagram is a synonym for directed graph in this context.

Handling the complexity of directed graphs

Unless the system documented by the visualization is trivially small, a directed graph quickly becomes unwieldy. The tool creating the visualization may handle the complexity of such systems in four ways:

  • it can use an algorithm, such as force-directed placement, to position nodes in as pleasing a way as possible
  • it can collapse collections of nodes into individual symbols
  • it can filter the diagram based on any dimensions of the nodes and/or edges
  • it can limit the scope of the diagram, generally by showing a limited number of edges

Collapsing nodes uses such principles as physical location or logical function. For example, all nodes located in a building, a floor, a room, a city, etc. may be collapsed into a single symbol. Similarly, a cluster of computers, each with the same function, may be collapsed into a single node. Interactivity with the viewer makes such diagrams most useful. The visualization user should be able to collapse and expand nodes at will or filter on node and link attributes.
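Collapsing by a shared attribute can be sketched as follows, with invented node names and room assignments:

```python
# Sketch of collapsing nodes by a shared attribute (here, physical
# location): every node in the same room becomes one symbol, edges
# internal to a room disappear, and parallel edges between rooms merge.
node_room = {"srv-1": "room-A", "srv-2": "room-A",
             "pc-1": "room-B", "pc-2": "room-B", "rtr-1": "room-C"}
edges = [("srv-1", "rtr-1"), ("srv-2", "rtr-1"),
         ("pc-1", "rtr-1"), ("pc-1", "pc-2")]

def collapse(node_attr, edges):
    grouped = set()
    for a, b in edges:
        ga, gb = node_attr[a], node_attr[b]
        if ga != gb:                 # drop edges internal to one group
            grouped.add(tuple(sorted((ga, gb))))
    return sorted(grouped)

print(collapse(node_room, edges))
# [('room-A', 'room-C'), ('room-B', 'room-C')]
```

Expanding a collapsed symbol is then just a matter of re-rendering with the original node list for that group.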

Managing link ambiguity

visualization with ambiguous links
Fig. 12: Ambiguous relationship—all links are the same
visualization with different link styless
Fig. 13: Relationship type indicated by link style
visualization with labeled links
Fig. 14: Relationship type indicated by text labels

Any two nodes in a socio-technical system may interact in a variety of ways. Consider the relationships between a set of devices used in an organization and its various teams or personnel. In a graph, what does an edge between a certain machine and a certain team or other entity signify (see Figs. 12-14)? It might indicate many different types of interactions, such as:

  • the team uses the machine to achieve its business purpose
  • the team operates the machine so that others might use it
  • the team supplies the machine, being either a vendor or procurer
  • the team repairs the machine
  • the entity manufactures the machine
  • etc.

Thus, each edge or link should characterize a relationship between two nodes expressible as a verb (see Fig. 14). Unfortunately, many configuration managers—as well as the tools they use—use verbs so ambiguous that they add little value to the management of configurations. “Depends on” and “relates to” offend the most. These catch-all terms mean “a relationship exists, but one unlike any type of relationship that we have already defined.”
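Verb-typed links can be sketched as simple triples; the teams, machines and verbs here are invented examples:

```python
# Sketch of verb-typed relationships: each link carries an explicit verb,
# so "uses", "operates" and "repairs" can be filtered separately instead
# of collapsing into an ambiguous "relates to".
links = [
    ("finance-team", "uses",     "erp-server"),
    ("ops-team",     "operates", "erp-server"),
    ("vendor-x",     "repairs",  "erp-server"),
    ("ops-team",     "operates", "mail-server"),
]

def with_verb(links, verb):
    """Return the (source, target) pairs joined by the given verb."""
    return [(s, t) for s, v, t in links if v == verb]

print(with_verb(links, "operates"))
# [('ops-team', 'erp-server'), ('ops-team', 'mail-server')]
```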

Of course, a single entity or team might have multiple types of relationships with a single element. A semantically unambiguous visualization would require a separate link type for each type of relationship between any two nodes. But this approach quickly clutters the diagram, rendering it less legible. The visualization becomes a poorer communication vehicle (unless the visualization purports to describe that complication). As a result, some visualization designers collapse multiple relationship types into a single type of edge or link. The resulting visualizations might be simpler to view, but their ambiguity makes them much more difficult to interpret. Thus, they would be much less useful for any particular purpose.

Links and moods of visualization

Recall the discussion above about visualization moods. Designers of directed graphs often ignore this concept and mix different moods. A precise visualization would depict only a single mood at a time. The same issue applies to any configuration visualization idiom. I address it here given the popularity of using directed graphs for documenting configurations.

As an example, let us consider a visualization of nodes and the communication of data packets among them. We might be interested in the imperfect indicative aspect of such a system. In other words, between which pairs of nodes are packets being sent? Or, we might be interested in the pairs that could transmit data to each other, whether they do or not. Or, we might want to see the pairs that should be sending packets to each other, again, whether they do or not. Capacity management can make good use of all these moods. Others are particularly interesting for problem or incident management. And yet others are useful for system design, availability, release and change management.

Tools may draw graphs of a particular mood by gathering data from various management tools. Network sniffers can gather the data about which pairs of nodes are communicating with each other. An appropriate lapse of time must be selected for performing this analysis.
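Deriving the pairs that actually communicate might be sketched as an aggregation over hypothetical sniffer observations within a selected lapse of time (the timestamps and node names below are invented):

```python
from datetime import datetime, timedelta

# Hypothetical packet observations: (timestamp, source, destination).
observations = [
    (datetime(2023, 2, 1, 9, 0), "pc-1", "db-1"),
    (datetime(2023, 2, 1, 9, 5), "pc-2", "db-1"),
    (datetime(2023, 1, 15, 8, 0), "pc-3", "db-1"),   # outside the window
]

def communicating_pairs(obs, start, length):
    """Distinct (source, destination) pairs seen within the window."""
    end = start + length
    return sorted({(s, d) for t, s, d in obs if start <= t < end})

pairs = communicating_pairs(observations,
                            datetime(2023, 2, 1), timedelta(days=1))
print(pairs)  # [('pc-1', 'db-1'), ('pc-2', 'db-1')]
```

Choosing the window length matters: too short and intermittent traffic is missed; too long and obsolete relationships linger in the graph.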

Describing which nodes could communicate with each other requires knowledge of both the physical connection layer and the network layer. Intelligent switches could report which nodes have physical connections. Routers and firewalls could report the rules allowing for or forbidding the routing of data. The visualization tool could then draw a diagram based on this data. But how could a tool automate the creation of a diagram depicting the lack of communication between nodes? Tools can hardly detect physical connections that do not exist.

Such visualization automation might suffer from several constraints. Firstly, a communication defect might prevent the collection of the very data useful for managing an incident. Secondly, while the tool could collect current and historical data, collecting data for a planned, future configuration would additionally require some simulation capability.

Link density is a key metric for managing graphs. This metric measures the ratio of edges to nodes in the graph. If the link density is greater than three or four, it becomes very difficult to interpret the graph. The drawing tool should be able to detect link density and propose a collapsed, initial view of the system with a lower link density. The viewer should then be able to interactively filter the data displayed and expand collapsed nodes.
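The check described above can be sketched directly from that definition; the threshold of three comes from the text, while the node and edge counts are invented:

```python
# Sketch of the link-density heuristic: edges divided by nodes, with a
# threshold above which the tool should propose a collapsed initial view.
def link_density(n_nodes, n_edges):
    return n_edges / n_nodes

def should_collapse(n_nodes, n_edges, threshold=3.0):
    return link_density(n_nodes, n_edges) > threshold

print(link_density(10, 45))     # 4.5
print(should_collapse(10, 45))  # True: offer a collapsed initial view
```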

  • Ease of tracing node to node paths
  • Interactive features permit scaling over a very wide range
  • Easy to confuse the functional purposes and directions of edges
  • High link density renders the diagram illegible
  • In large diagrams, it may be difficult to find nodes by visual scanning of the diagram
  • Non-deterministic positioning of nodes


Treemaps

treemap at city level
Fig. 15: A treemap showing the relative number of nodes in a system by country and then by city.
treemap at node level
Fig. 16: The same data, but with the nodes themselves indicated.

Most systems consist of elements whose attributes may be organized in a hierarchic taxonomy. The physical locations of the nodes in a system provide a simple example of this. Consider the hierarchy Country→City→Site→Building→Floor→Room. Suppose you wish to analyze the numbers of end nodes of a data network, by location. A treemap gives an immediate view of how many such nodes each site, building or room contains.
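The counts behind such a treemap can be sketched as roll-ups along the hierarchy; the countries and cities below are invented, and deeper levels (site, building, floor, room) work the same way:

```python
from collections import Counter

# Hypothetical leaf nodes, each tagged with the first two levels of the
# Country→City→... hierarchy.
nodes = [
    ("CH", "Geneva"), ("CH", "Geneva"), ("CH", "Zurich"),
    ("FR", "Paris"), ("FR", "Paris"), ("FR", "Lyon"),
]

by_country = Counter(c for c, _ in nodes)  # sizes the country rectangles
by_city = Counter(nodes)                   # sizes the nested city rectangles

print(by_country)
print(by_city[("CH", "Geneva")])  # 2
```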

If the visualization tool allows for searching based on node identifiers, a treemap also gives a quick means for visualizing the value of the attribute for a given node. For example, suppose you seek the node “ABC123”. If the tool shows it in a contrasting hue, you can see immediately its position in the hierarchy.

Fig. 17: In this variant of the treemap, called a "circle packing", a query on a certain model of the node reveals the blue dots (5, 13, 26, etc.) and where they are located. Note that that model is used only in Switzerland.

This feature may be expanded to include queries on attributes outside of the taxonomy (see Fig. 17). For example, suppose you have a treemap showing the locations of the desktop computers. You do a query to display the computers of a certain model. You see immediately the location and the clustering of those computers.

In theory, a treemap might document different node attributes at different levels of the hierarchy. However, viewers are likely to have difficulty interpreting such maps.

  • Easy to document a very high number of leaf nodes
  • Fast querying of the attributes of leaf nodes
  • Easy to judge relative quantities of leaf nodes at a given level of the hierarchy
  • Only useful for single attributes in a hierarchic taxonomy
  • Groupings are completely abstract, bearing no relationship to the real layout of the nodes

Adjacency Matrixes

adjacency matrix
Fig. 18: This adjacency matrix documents 10 nodes showing—in this example—both the direction of communication (arrows) and the bandwidth of the link (colors).

An adjacency matrix has all the nodes to be documented laid out on both the vertical and horizontal axes. The information in the cells at the intersection of the columns and rows documents the relationships between the corresponding two nodes.

Depending on the link attributes being documented, the matrix might contain redundant information. For example, in Fig. 18 the color of cells 1:2 and 2:1 must be identical. On the other hand, the arrows may be different.

Fig. 18 shows a very simple adjacency matrix. A more sophisticated matrix might add rows and columns to document other attributes of the nodes. An analyst might use the values in these additional cells to interactively sort and filter the matrix. Thus, adjacency matrixes can be powerful analytical tools.
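Building such a matrix from a directed edge list can be sketched as follows, with invented nodes and edges. Here a cell holds 1 when the row node sends to the column node, a simplification of the arrows and colors of Fig. 18:

```python
# Sketch of an adjacency matrix built from a directed edge list:
# matrix[i][j] = 1 when node i sends to node j.
nodes = ["n1", "n2", "n3", "n4"]
edges = [("n1", "n2"), ("n2", "n1"), ("n1", "n3"), ("n4", "n3")]

index = {n: i for i, n in enumerate(nodes)}
matrix = [[0] * len(nodes) for _ in nodes]
for src, dst in edges:
    matrix[index[src]][index[dst]] = 1

for name, row in zip(nodes, matrix):
    print(name, row)
# n1 [0, 1, 1, 0]  (n1 sends to n2 and n3)
```

Sorting the rows and columns by some node attribute, as the text suggests, is then a matter of permuting `nodes` and rebuilding the index.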

  • Scalability even with high link density systems
  • Fast lookup of nodes (if listed in a structured order, such as alphabetically)
  • May require training to make good use of the diagrams
  • Difficult to analyze topologies

Enclosure Diagrams

node-link diagram of failover cluster
Fig. 19: Even a simple node–link diagram of a failover cluster can be messy, with many edges.
enclosure diagram of failover cluster
Fig. 20: An enclosure diagram of the same cluster is much more elegant.

An enclosure diagram displays one or more collections of nodes around which a line is drawn. Visualization designers commonly use enclosure diagrams to display clusters of nodes with similar functions. An example might be a set of IT servers that can fail over to each other. Figs. 17 & 20 provide examples, wherein nodes are enclosed by a set of circles.

An enclosure diagram is thus visually cleaner and simpler than a directed graph. Suppose a cluster contains four servers that can fail over to each other. A directed graph would require ten edges to display the cluster (see Fig. 19). An enclosure diagram would require only two lines (see Fig. 20).

  • Visual simplification, as compared to a directed graph
  • The layout of multiple enclosures within a complex system may be very difficult to compute

Labeled Illustrations

Rack mounted telephone equipment
Fig. 21: A labeled illustration of a system, such as this telephone equipment rack, and its installed components, is an excellent way to help identify the parts when you are viewing that system. This configuration visualization method is very old.

Product managers or manufacturers often use labeled illustrations to help device managers understand what they see. For example, a technician might enter a machine room to remove a device from a rack. A labeled illustration of that rack could indicate the slots and the locations of the installed devices. Technicians find such illustrations especially useful if the devices themselves are poorly labeled. Examples include labels being damaged, lost, too small to read, or in inaccessible positions.

Service agents might also use such illustrations for identifying the type of system before their eyes. This is especially true if the organization describes the attributes of the system using a highly standardized taxonomy. Anyone having used a guide book to identify birds or plants understands this.

switch icon
Fig. 22: Although icons are compact, recognizable functional symbols, they do not help to identify the model or serial number of a component (unless labeled)
fiber channel switch
Fig. 23: Service personnel can use illustrations to identify a specific component or its model; a label identifies its serial number.

Thus, the illustration designer makes a trade-off between the level of detail shown and the details helping to identify the object. In other words, the designer should abstract the gestalt of the object in question. For example, a data switch model might be readily identifiable by its overall shape, the number and layout of its ports and other significant details.

  • Easy identification of the components of a system
  • May require a library of component images
  • May require specialized functionality in the drawing tool to correctly place components within the system
  • Not useful for physically large systems (like a data network) or abstract components (like a database or an application)

Hybrid visualizations

We have seen that many different visualization idioms have the capability of documenting configurations, especially networks of elements. Each has its strengths and weaknesses. Why not make hybrid visualizations, combining the strengths and palliating the weaknesses?

Containers work well at the highest level of a visualization. They effectively portray completely separate domains and parent-child relationships within a domain. Although they can also overlap, as in an Euler diagram, such visualizations provide little information about the nature of the shared area.

Graphs or trees may effectively document the higher-level details within a container. If the number of nodes is not too great, they could go to the leaf level. In particularly complex structures or hierarchies with many levels, containers might indicate sub-graphs. A drill-down function would allow viewers to visualize the details within those containers.

A network of great depth may be too confusing or too computationally intensive to display at a graph’s full level of detail. Illustration designers may resolve this problem by replacing rooted sub-graphs with adjacency matrixes, trees, treemaps or, as we saw above, containers.

Failure to benefit from con­fig­u­ra­tion visualizations

It would seem that configuration data has everything to gain from visualizations. System stakeholders have difficulty grasping the connectivity of IT components using words alone. And yet, few organizations really benefit from creating diagrams. Why is this so?

I include among the reasons:

  • Inaccurate and in­com­plete un­der­pinning data
  • Not following Shnei­der­man’s mantra
  • No direct relationship between higher-level archi­tecture diagrams and phy­sical layer diagrams
  • System component re­la­tion­ships too complex for easy diagnosis via vi­su­ali­za­tions

Inaccurate and incomplete data

The problem of inaccurate or incomplete con­fig­u­ra­tion data is not, strictly speaking, a visualization issue. None­the­less, I will brief­ly summarize some of the reasons for this issue.

Consider, though, that a visualization drawn directly from data recorded in some database can hardly depict missing in­for­ma­tion. If a cluster contains ten servers but the con­fig­u­ra­tion ma­nage­ment system records only seven of them, do not imagine that the visualization will show the ghosts of those three missing machines.

Maintaining con­fig­u­ra­tion data as an afterthought

Service personnel often consider maintaining configuration data to be non-essential administrative overhead. Updating the data is a low-priority step distinct from performing the corresponding changes. As a result, such personnel sometimes do not update the data at all. Or they might update the data long enough after the fact that the details are no longer fresh in mind.

Configuration discovery as an afterthought

Some organizations attempt to address the poor integration of data management into change activities by automating the discovery of new or changed configurations in a system. Indeed, the recording of configuration data is often not established at the very start of a system’s creation. In such cases, automated configuration discovery is often adopted as the means to address the daunting task of documenting existing configurations. And yet, such automated discovery is often blocked from accessing the very components it is meant to document. Furthermore, it often reports unmanaged elements and attributes, needlessly complicating the data. And automated discovery can circumvent the intellectual process of struggling to understand how a system is pieced together. It can thereby yield large quantities of data without much understanding of how to use those data.

Unmanageable quantities of data

How do organizations measure the “quality” of configuration management? I have often seen them use the percentage of configuration elements registered in a database. They struggle to move incrementally upwards from a very poor 10%. By the time they reach 70%, they are thrilled by their progress but so exhausted by the effort that they hit a barrier beyond which they hardly advance. They then trot out a cost-benefit analysis to justify making only a symbolic effort to maintain or enlarge these data.

Shneiderman's mantra

Ben Shneiderman described the process of finding in­for­ma­tion from a visualization as:

  1. overview first
  2. zoom and filter
  3. details on demand

So common is this organizing principle that visualization tool designers have come to view it as a mantra.

Most ITSM tools with con­fig­u­ra­tion diagramming cap­a­bi­lity do indeed provide zooming and filtering cap­a­bility. Many allow the viewer to see the details of a component via a simple mouse-over or other simple technique. The problem is in how an overview is defined and how users implement the concept of an overview.

In the worst case, an “overview” is implemented without any aggregation of detail. For example, suppose you document a data center with 500 physical servers. A graph overview might attempt to display those 500 servers with their various network connections. The processing power of the computers used to do this is probably inadequate for the task. Even a high-resolution screen could allocate only a few pixels to each server, making the entire diagram useless. The network connectivity of the servers would probably be so complicated that the screen would be filled with black pixels.

Showing more components is not a useful way to provide an overview. There must be some aggregation principle applied, one that allows for the simplification of the visualization. Systems with very few components are the exception to this principle; in such cases, however, visualization brings relatively little benefit anyway.

The nature of the aggregation depends entirely on the purpose of the visualization. There is no single “right” aggregation prin­ciple. Com­po­nents could be aggregated based on the business functions they support. Another form of aggregation could be the models or versions of components. Physical location at various levels would be another example. Thus, an overview might show first the racks in the data center. With an average of twenty servers per rack, a much more manageable twenty-five nodes would appear in the initial overview.
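The rack-based aggregation described above can be sketched in a few lines (the inventory and its field names are hypothetical, purely for illustration):

```python
from collections import defaultdict

# Hypothetical flat inventory: 500 servers, 20 assigned to each rack.
servers = [{"name": f"srv{i:03d}", "rack": f"rack{i // 20:02d}"}
           for i in range(500)]

def aggregate_by(items, key):
    """Group components under a single overview glyph per aggregate value."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item["name"])
    return groups

racks = aggregate_by(servers, "rack")
# The overview now needs only 25 rack nodes instead of 500 server nodes.
```

The same helper could aggregate by business function or by model, simply by grouping on a different attribute.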

Remember that this ag­gre­ga­tion is not a form of filtering. Instead, the vi­su­al­i­za­tion needs to display an aggregate as a single glyph or shape. Suppose you wish to depict the set of components supporting a given business function, such as all financial ap­pli­ca­tions. You might achieve this using a rectangular box with three overlapping computer icons. An overview visualization might show as many such rectangular boxes as there are supported business functions. It would then be possible to zoom in to a single rectangle and filter the visualization according to other principles.

Such an approach would adequately im­ple­ment Shnei­der­man’s mantra, but most ITSM tools do not implement the required logic. After all, the business function is an attribute of the application running on the server, not of the server itself. Most users of these tools do not structure configuration data in a way that would support this approach. I cite various reasons for this:

  • Con­fig­u­ra­tion managers make illogical shortcuts in the con­fig­u­ra­tion data model. As per the example above, they assign a business function to a computer, rather than to the ap­pli­ca­tion processes running on that computer.
  • Some organizations have decided to treat architectural data, where potential aggregation principles are defined, separately from service management configuration data. Never the twain shall meet.
  • Even when con­fig­u­ra­tion management tools reflect arch­i­tec­tural principles, how should such data relationships be modeled? Should managers use a framework like TOGAF? Would or­gan­i­za­tions with­out architectural ex­per­tise give in to unwarranted sim­pli­fi­ca­tions? By what means would one know that a physical server supported a finance func­tion: via an attribute of the server itself? via an attribute of the ap­pli­ca­tions installed on the virtual servers realized on the physical server? via an attribute of a functional component of an ap­pli­ca­tion?
  • Functional aggregation would require knowledge of both the static structure of the components and their dynamic use. For example, knowing which functions a message bus channel supports depends on knowing the functional domains of the messages using that channel. A similar problem exists for level 1, 2 and 3 network components. Short-circuiting these issues by hard-coding attributes of the components leads to documentation that is exceedingly difficult to maintain.
  • Many con­fig­u­ra­tion ma­nage­ment tools lack the simple functionality that proper diagramming might require. For ex­ample, many of the edges in a graph re­pre­sent­ing a network of components should be bi-directional. And yet, how many tools can simply model this physical reality?

Segregated architectural drawings

Architectural drawings are a subset of con­fig­u­ra­tion vi­su­al­i­za­tions. They are subject to the same con­straints as other vi­su­al­i­za­tions. In many fields, there is a direct relationship between an ar­chi­tect’s draw­ing and the physical objects to be created based on that drawing. In some fields and organizations, however, there appears to be a barrier be­tween the visualizations created by ar­chi­tects and the visualizations of the cor­re­spond­ing physical layer. Many IT departments are a case in point.

There are various reasons for this segregation, which may be organizational, procedural or technical in nature. This is not the right place to investigate those reasons in more detail. But the result is that IT architectural drawings are generally created with tools specific to the architecture role. These tools may be of two types. Some draw diagrams based on the data in an underpinning database managed by the tool. Others are dedicated to the production of IT architectural diagrams, without an underpinning database. The configuration diagrams, on the other hand, are created either with service management tools or with dedicated drawing tools.

In theory, organizations should have some policies, procedures and techniques for ensuring the coherence be­tween ar­chi­tec­tural data and con­fig­u­ra­tion ma­nage­ment data. While good coherence probably exists in some organizations, I have never seen it myself. At most, I have seen one-off attempts to co-ordinate, for example, a list of IT services as defined in the service management tool with the list defined in the architecture tool. The result has been two lists with two different owners and no practice of keeping the lists synchronized. In time, however, there will probably be increasing convergence between ar­chi­tec­ture and service ma­nage­ment tools.

architecture diagram and configuration diagram
Fig. 24: A configuration illustration detail (right) could be an exploded view of a high-level architecture diagram (left)

Why is this important? We have already seen, following Shneiderman’s mantra, that it is useful for tools first to provide a collapsed overview of configurations and later to allow users to drill down or expand to the details. Architectural drawings at the business, application or technology levels provide an excellent set of principles for depicting a collapsed, high-level view. One might even imagine a three-level hierarchy: a top business layer, a middle physical element layer and a detailed physical element layer.

Most service management tools capable of generating con­fig­u­ra­tion visualizations rely on relationships to decide how to collapse and expand details. Often, only the anodyne parent-child re­la­tion­ship is the basis for this feature.

As a result, configuration managers some­times end up having the tail wag the dog. They create artificial or incomplete relationships in a CMDB just to enable the tool to draw a certain diagram. In the worst case, the distinction between a “service” and an “application” is lost.

Relationship too complex to diagnose via visualizations

Suppose you wish to use a configuration visualization to help diagnose an incident or problem detected on a certain component. Given the pressure, especially in the case of an incident, such an approach would be practical only in trivial cases. Suppose the component being investigated had only a handful of first- and second-degree relationships with other components. In this trivial case, a configuration visualization might indeed be useful. But cases with so few relationships are rare, even in simple infrastructures. Or, they might appear to exist if only a tiny fraction of the relationships were documented.

Otherwise, the tediousness of clicking on connected components and checking their status would far outweigh the possible benefits. In short, which approach to diagnosis would be better: using an application that simply delivers an answer, or the trial-and-error use of a visualization?

A rule of thumb for graphs is to limit the number of links to four times the number of nodes. Too many links result in occlusion of elements or the inability to discriminate elements (see Fig. 25). Alas, modern technology brings us well beyond that rule of thumb. Servers contain many more than four managed components. Data switches are typically linked to 16 or even 32 nodes. Racks might contain 40 1U computers. This high link density is not a problem when dril­ling down to a single node and its connections. But, at the overview level, such a high density makes visualizations difficult to draw and harder to use.
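The rule of thumb can be expressed as a trivial check (a sketch only; the 4:1 ratio is the heuristic cited above, and the function name is my own):

```python
def within_link_budget(num_nodes, num_links, ratio=4):
    """Rule of thumb: a graph stays readable while links <= ratio * nodes."""
    return num_links <= ratio * num_nodes

# 25 rack nodes stay readable with sparse cabling, but dense cabling
# quickly breaks the budget and the drawing starts to occlude itself.
print(within_link_budget(25, 90))    # 90 <= 100, still readable
print(within_link_budget(25, 300))   # over budget
```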

a graph with a very high link to node ratio
Fig. 25: The structure of the network is not visible when the number of nodes and links is too high
a graph of the same data with the multilevel scalable force-directed placement (sfdp) algorithm applied
Fig. 26: Simplifying the graph using a scaling algorithm makes the structure visible


There is no limit to what one might say about con­fig­u­ra­tion visualizations. I have at­tempted to outline here some of the issues I have found in over 30 years of experience in working with con­fig­u­ra­tion management and its visualizations. I hope these remarks will inspire you to reflect on how you visualize con­fig­u­ra­tion data and what you may do to make those visualizations more useful.

Creative Commons License The article Visualization of Configurations by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


[a]  Johnson, Brian, and Ben Shneiderman. “Tree-maps: A space-filling approach to the visualization of hierarchical information structures.” In Proceedings of the IEEE Conference on Visualization ’91. IEEE, 1991.

[b]  Shneiderman, Ben. “Tree visualization with tree-maps: 2-d space-filling approach.” ACM Transactions on Graphics 11.1 (1992): 92–99.

[c]  Shneiderman, Ben. “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations.” In Proceedings of the IEEE Symposium on Visual Languages, pp. 336–343. IEEE Computer Society, 1996.


Unless otherwise indicated here, the diagrams are the work of the author.

Fig. 21: By Charles S. Demarest – Demarest, Charles S. (July 1923). “Telephone Equipment for Long Cable Circuits”. Bell System Technical Journal. Vol. 2 no. 3. New York: American Telephone and Telegraph Company. p. 138. Public Domain,

Fig. 25: AT&T Labs, Visual Information Group. Downloaded from

Fig. 26: AT&T Labs, Visual Information Group. Downloaded from

The post Visualization of Configurations appeared first on This view of service management....

How to increase visualization maturity 2020-08-10T14:26:00Z We communicate information visually via a two-way street. Both visualization designers and viewers must have similar levels of maturity to benefit from visualizations. This article suggests techniques for increasing the overall maturity of visualization techniques in an organization.

The post How to increase visualization maturity appeared first on This view of service management....


In this series on information visualization, I have sug­gested many techniques for im­prov­ing how we visualize service manage­ment in­for­ma­tion. However, we communicate information via a two-way street. Visualization de­signers could benefit from these techniques, but viewers also need to understand them. How can you increase the visualization maturity of those viewers?

The bias that visu­alizations should be self-explanatory hinders efforts to expand visu­a­li­za­tion tech­niques. After all, if one picture is worth a thousand words, adding a thousand words to explain the picture makes little sense.

As with any form of human communication, learning its grammar, syntax and semantics enriches the use of the communication medium. We can appreciate music at an atavistic level. How much more can we appreciate it with an understanding of harmony and rhythm? We might find a poem beautiful for the sound of its language. How much more beautiful could it be if we knew something of the meaning and allusions of its words and images? We might appreciate the paintings of Leonardo as pretty pictures. But knowing, too, the work of his teacher, Verrocchio, shows us Leonardo’s genius. In short, we can communicate at the level of baby talk or we can make efforts to enrich our communications, adding insight, economy, beauty and effectiveness.

Building Trust

Developing the maturity of visualization audiences requires the audience members to attend to the visualization.1 In other words, they must pay attention to the visualizations, on the assumption that they contain useful messages. The more the audience trusts the visualization designer, the more likely it will attend to the visualizations and their messages.

Suppose you were to receive a registered letter from the tax authority. The letter ap­pa­rent­ly requires some action but includes a word you do not under­stand. You will make it your business to learn what that word means and retain that knowledge for future messages. Compare this to the case of opening an un­so­li­cited commercial message where the message obviously intends to sell, not to inform. You will be much less likely to look up any words in that message that you do not understand.

The relationship between trusting the designer of visu­a­li­za­tions and the ma­tur­ity of the visualizations can be either a virtuous cycle or a vicious cycle. If you start on the right foot and from the very beginning provide timely, accurate and use­ful vi­su­a­li­z­ations, your au­di­ence will increasingly trust you. But if you start on the wrong foot, your messages might be ignored. Your communications become in­creas­ingly viewed as SPAM. Perhaps you have seen examples of these phenomena when looking at visualizations of the progression of COVID-19. You tend to return to those visualizations that pro­vide you with the in­for­ma­tion you want in a way that suits your level of understanding. You tend to ignore sources that appear to be untrustworthy.

A key question: how shall we start on the right foot in the journey toward greater trust and greater maturity?


Annotations

A visualization creator might first think to annotate a visualization as a means for explaining it. Annotations include:

  • indications of the important data
  • explanations of how to inter­pret the visualization

Only the latter type concerns us here.

Let’s take a simple example. Our vision can easily de­ter­mine if a line is more or less straight. However, our brain has much more difficulty assessing the degrees of curvature. Consider Figs 1 and 2. Can you guess the functions represented by the curves in these two diagrams? Probably not.

tangent function
Fig. 1: Function A shown with linear scales
1 - cosine function
Fig. 2: Function B shown with linear scales

Now consider Figs. 3 and 4. These diagrams represent the same data as in Figs. 1 and 2, but the scale of the Y-axis is logarithmic. If you understand how to interpret a log scale, you would know im­me­di­ate­ly that the curve in Figs. 1 and 3 represents a tangent or ex­po­nen­tial function. The curves in Figs. 2 and 4 unambiguously represent an entirely different function (the function hap­pens to be 1 – cosine).

tangent function log scale
Fig. 3: Function A (see also Fig. 1) shown with vertical log scale
1 - cosine log scale
Fig. 4: Function B (see also Fig. 2) shown with vertical log scale

If you understand how to inter­pret a log scale on a graph, you immediately benefit from using that con­ven­tion. But what per­cen­tage of the vi­su­a­li­za­tion’s audience un­der­stands the use of log scales? If not high, an an­no­ta­tion can help train the audience and explain what the graph depicts (see Fig. 5).

tangent function log scale annotated
Fig. 5: Annotating the visualization to explain the use of a log scale
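The interpretive power of the log scale can also be shown numerically. In this sketch (plain Python, function names my own), samples of an exponential function become a straight line after a log transform, visible as constant first differences of the logs, while 1 − cosine does not:

```python
import math

def log_differences(values):
    """First differences of log-transformed samples; a straight line on a
    log scale means these differences are (nearly) constant."""
    logs = [math.log(v) for v in values]
    return [round(b - a, 6) for a, b in zip(logs, logs[1:])]

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
expo = [math.exp(x) for x in xs]        # exponential: linear on a log scale
wave = [1 - math.cos(x) for x in xs]    # 1 - cosine: curved on a log scale

print(log_differences(expo))   # constant slope of 0.5
print(log_differences(wave))   # clearly non-constant
```

This is exactly the visual cue a trained viewer extracts at a glance from Figs. 3 and 4.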

Before and After Comparisons

We may take a cue from the preceding discussion for the next learning technique. This technique directly compares a “baby-talk” visualization with a more mature visualization that better communicates the message.

Suppose you wish to illustrate the continuous evolution of the relative proportions of a variety of categories, such as the market share of services used by customers. Doing this with baby-talk graphs, such as bar charts, requires multiple bar charts, one for each point in time (see Fig. 6a). Comparing more than two bar charts challenges cognition. Grouping the bar chart (Fig. 6b) improves the visualization, but it aggregates the data into discrete periods, rather than showing continuous values. A streamgraph makes a more economical visualization of the same data and includes much more information (see Fig. 7). Viewers see at a glance the growth and waning of each category, and the visualization displays data continuously rather than at discrete times. Ideally, streamgraphs should interactively display data values according to the position of the mouse cursor, if the message being communicated requires such data (see Fig. 8).

If the designer shows the two different versions of the visu­a­li­za­tions side by side, the viewer will quickly realize the many ad­van­tages of the more mature visualization, so long as it com­mu­ni­cates the desired message more effectively. The viewer would need little text to understand the inherent advantages of the one over the other. We seek an “aha!” moment when the scales of lazy tradition fall away from the viewer’s eyes.

barchart time series
Fig. 6a: BEFORE: The designer only knows how to make bar charts and tries to use them to visualize a time series.
grouped bar chart time series
Fig. 6b: BEFORE: The designer figured out how to group by period, but the visualization still aggregates the data, rather than displaying it continuously
streamgraph time series
Fig. 7: AFTER: The designer uses a streamgraph to display the continuous evolution of the data over time

Fig. 8: AFTER: Interactive measurements at each point in the chart can provide details.


Formal training

Formal training in the interpretation of information visualizations will probably not stimulate much interest. However, formal training in the creation of information visualizations will be of greater interest to a population of analysts and managers. At the same time, such training will help increase the maturity of the population of information visualization viewers.

The value of a formal training program depends on the habits and existing training infrastructure in an organization. Some organizations maintain a large catalogue of available courses, often delivered on-line. Organizations should encourage employees to follow courses and should recognize their success.

Many of the online training platforms have courses, some being free, on data vi­su­a­li­za­tion. Be aware, how­ever, that some of these courses prin­ci­pal­ly concern the use of specific tools, such as Tableau or even R, to create visualizations. I would strong­ly recommend a more generic study of visualization before learning any tools. Many other courses treat visualization as a discipline subsidiary to the main theme, such as data science, statistics or mar­ket­ing.

However, if the organization lacks a formal training in­fra­struc­ture, informal self-study might be more effective. I hope that my series of articles on in­for­ma­tion vi­su­a­li­za­tions would serve this end. Many of those articles reference standard texts of visualizations which the reader might consult.


Testimonials

Some people will try anything, at least once. Others will doggedly stick with their existing practices, thinking change is always for the worse. But most people feel too busy to invest in new ways of doing things unless they have a strong reason to believe a change will be beneficial. Testimonials help to provide that reason.

A testimonial states how using a certain visu­a­li­za­tion made achiev­ing a goal more ef­fec­tive and/or more ef­fi­cient.

Using the statistical control chart allowed me to iden­tify two un­der­ly­ing prob­lems we never recognized before.

I could replace 12 pie charts with a single heat map, making the weekly usage patterns clear at a glance.

The stream graph showed the evolution of categories, convincing our manager to re­al­lo­cate resources.

Both formal and informal com­mu­ni­ca­tions may include tes­ti­moni­als. They may be written or oral. You might even consider using the sort of visual testimonials that appear as marketing materials: maw­kish perhaps, but effective.

Our customer told us we won the contract thanks, in part, to the visualizations in our offer
Robert S. Falkowitz
Robert Falkowitz
Visualization Designer

Fig. 9: An example of a testimonial

Using testimonials:

  • reinforces leadership skills
  • reduces colleagues’ per­cep­tion of risk
  • encourages the use of more appropriate, more eco­­no­m­i­cal and more ef­fec­tive visu­a­li­za­tions

Getting feedback

Remember that information visu­a­li­za­tions are a form of com­muni­ca­tion. The receiver of a message can provide feedback to the sender to confirm that the message was well-received, un­der­stood and useful.

Organization members should seek to increase incrementally the maturity of their vi­su­a­li­za­tions. Jumping from baby-talk to the most so­phis­ti­cated of visualization might leave the recipients perplexed. Com­mu­ni­ca­tions using con­ven­tions they do not un­der­stand might demotivate them.

Incremental improvement means creating visualizations only slightly richer and more sophisticated than the last ones. If I speak a sentence using six words unknown to my conversational partner, he or she might ignore the message. Using only one unfamiliar word might trigger my partner’s curiosity, who thereby learns a new word. Visualizations are similar. (I am reminded of William Faulkner’s critique of Ernest Hemingway: “He has never been known to use a word that might send a reader to the dictionary.”)

Feedback from the visualization recipient to its creator provides the means to ensure that improvements remain incremental and comprehensible. Engaged discussions about the visualizations provide the most useful feedback, but such exchanges are not always practical. Consider using a simple feedback method, such as providing a 1-10 rating system or a set of “good”, “bad” and “neutral” icons (see Fig. 10). A “1” would mean the recipient has no idea what the visualization tries to communicate. A “10” would mean that the viewer finds the communication as useful and concise as possible (given the level of maturity of the recipient).

force directed graph

Fig. 10: A method for capturing feedback about visualizations.

The visualization creator should probably follow up with recipients providing very low ratings. Likewise, the creator should be wary of consistently very high ratings. Too many 10s might indicate that the visualizations are too simple and are not contributing to improving maturity. In short, visualizations should generally be slightly ahead of the developing curve of maturity.
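A simple triage of such ratings might look like the following sketch. The function name and thresholds are my own illustrative assumptions, not taken from any tool:

```python
def feedback_signal(ratings, low=3, high=10, too_easy_share=0.8):
    """Hypothetical triage of 1-10 feedback ratings on a visualization:
    very low ratings warrant follow-up; a glut of top ratings suggests
    the visualization may be too simple to develop maturity."""
    follow_up = [r for r in ratings if r <= low]
    share_of_tens = sum(r == high for r in ratings) / len(ratings)
    return {
        "follow_up_count": len(follow_up),
        "maybe_too_simple": share_of_tens > too_easy_share,
    }

print(feedback_signal([10, 10, 9, 10, 10, 10]))
# flags the visualization as possibly too simple, with no follow-ups needed
```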


The suggestions made above represent the fruits of my own experience. I do not intend the list to be complete or ex­clu­sive. In­stead, I hope that each or­gan­i­za­tion seeking to increase the maturity of its information visu­a­li­za­tions will experiment, develop and share the methods that work best in its current context.

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International LicenseThe article How to increase visualization maturity by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

1 Dave Snowden (2011) describes the importance of trust more generally within knowledge management. See also

Unless otherwise indicated here, the diagrams are the work of the author.

Fig. 10: Includes a diagram by Deepthiyathiender – Own work, CC BY-SA 4.0,


Measuring Flow Efficiency 2020-08-06T13:59:00Z A very low overhead method is proposed for accurately measuring flow efficiency with flow management software.

The post Measuring Flow Efficiency appeared first on This view of service management....


Flow efficiency is one of the most important, startling and difficult to measure flow metrics. Flow efficiency describes the ratio of time spent on doing real, transactional work—the touch time—to the total calendar time needed to deliver output to a customer—the cycle time—expressed as a percentage.

If it takes a team 10 days to perform all the required work on a work item, but during those 10 days it was really working on the item only a total of 1 day, then its flow efficiency for that work item was (Touch time) / (Cycle time) * 100%, or 10%.
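The calculation itself is trivial; the example above can be sketched as (function name my own):

```python
def flow_efficiency(touch_time_days, cycle_time_days):
    """Flow efficiency: the percentage of cycle time spent actually working."""
    if cycle_time_days <= 0:
        raise ValueError("cycle time must be positive")
    return 100.0 * touch_time_days / cycle_time_days

print(flow_efficiency(1, 10))   # 10.0 — the 10-day, 1-day-of-work example
```

The hard part, as discussed below, is obtaining a trustworthy touch time value to feed into it.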

When team members are first presented with the idea of flow efficiency, something often clicks in their minds. It helps describe the phenomenon of taking months to deliver something when they had worked so little on the item. It brings to mind those endless delays, those fruitless meetings, the long periods of waiting for someone else to make a decision, return from holiday, get well again and so forth.

The problem of measuring flow efficiency

If we find flow efficiency so useful a metric, why don’t teams measure it and use it regularly to document improvements and to encourage yet further improvements? Done rightly, measuring flow efficiency can be very difficult.

The easy part of measuring flow efficiency is measuring cycle time.1 A team may easily take note of when it starts work on an item and when it has completed that work. It may find it much harder to measure when it is actively working on the item—the touch time. Two reasons explain this. First, the stakeholders might disagree about, or misunderstand, what constitutes that real work. Second, the administrative effort to note the starting and stopping of that work may be difficult to perform consistently and accurately. In short, many workers might not bother doing it, or they might record some fictive and imaginative values long after the fact. Completing weekly timesheets poses the same problem.

As a result, manual measurement of flow efficiency might be too time consuming and unreliable to perform without intensive supervision. This additional cost and overhead translates into rare and difficult to compare measurements.

What is the "real work" part of flow efficiency?

Different teams might have different concepts of what work duration ought to be measured. Cycle time might be divided into three categories: the time during which no one is working on the item; the time during which someone is performing the work that directly leads to completing the work item; and the time spent on coordinating and communicating the real work.

What is the "real cycle time" part of flow efficiency?

Some people might distinguish between poor-quality work that leads to defects and must be repeated, on the one hand, and adequate-quality work, on the other. As important as this distinction might be, I do not believe it reflects on the maturity of flow management, and it should not enter into the calculation of working time. Instead, worker ineffectiveness or inefficiency leads to longer cycle times. This explains why I emphasize the importance of measuring the end of cycle time as the time the work item is completed with the right quality. If you take one week to complete a work item, only to learn a week later that the quality is insufficient, and then you take another week to fix the defects, the real cycle time is three weeks, not one week.

A more effective way of measuring work

If you are using a physical card board to visualize and track work, it might be very difficult to track the start and end of every period of real work on every work item. I suppose you could do it by writing on the backs of the cards. However, using a virtual card board with the necessary features offers a simpler method requiring much less administrative effort.

Remember that teams, especially those performing knowledge work, tend toward very low flow efficiency. When a team does not actively manage flow, analysts report the typical flow efficiency at around 5% to 15%.2 But even when a team has reached a good level of maturity in flow management, the efficiency can still be under 50%. Consequently, the most common state of a work item is “not being worked on”, i.e., “not being touched”.

When you first create a work item, its state should thus be “not being worked on”. There are two ways to reflect this state:

  • the work item is in a queue
  • the work item is not in a queue, but is flagged as temporarily not advancing.
The latter case can occur for several reasons, of which the most important are:
  • the workers have switched context (started to work on something else without having finished the current value stream phase). They could do this voluntarily or an external party could impose the switch, such as when a manager wishes to expedite a work item
  • the work is blocked, generally because the team must wait for a third party.

The first reason is under the control of the team (a team might refuse to expedite an item, for example), while the second is largely not under the control of the team.
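A minimal sketch of how a tool might model these states; the enum and identifier names here are assumptions for illustration, not features of any particular kanban product:

```python
from enum import Enum, auto

class WaitReason(Enum):
    """Reasons a work item may be in the 'not being worked on' state."""
    IN_QUEUE = auto()             # the item sits in an explicit queue
    CONTEXT_SWITCH = auto()       # team voluntarily started something else
    EXPEDITE_PREEMPTED = auto()   # switch imposed, e.g. to expedite another item
    BLOCKED_THIRD_PARTY = auto()  # waiting for an external party

# Which waiting reasons are under the team's control? A team might
# refuse to expedite, or finish the current phase before switching;
# waiting on a third party is largely outside its control.
UNDER_TEAM_CONTROL = {WaitReason.CONTEXT_SWITCH, WaitReason.EXPEDITE_PREEMPTED}
```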

Low overhead steps for tracking working time

The following steps, illustrated with videos from a kanban software tool, show how a team may accurately calculate its flow efficiency with a minimum of administrative overhead.

Step 1:

The team pulls a work item into a new value stream phase only when it intends to start work on it immediately. The state of that work item is set to “working on it”, “being touched”.

In the example below, a worker pulls a card from the Requested column into the Analyze column. The worker makes no special changes to the card attributes. The cycle time starts at this moment. The touch time starts to accumulate.

Step 2:

As soon as the team finishes its current session of work on the item, it sets the state to “not working on it”. As per the discussion above, the tool might include various flags to distinguish the reasons for stopping work: voluntarily switched context; switched context due to expedited work; waiting for a third party; etc.

In the example below, a worker flags a card to indicate that the touch time has ended and the team no longer advances in its work on the item. The changed card color indicates the new state. The touch time stops accumulating.

Step 3:

The next time the team starts work on the item, it resets the state to “working on it”. The touch time starts accumulating again.

Step 4:

If the team finishes work on the current value stream phase, either it moves the item into a queue or, if explicit queues are not used, a worker resets the attribute to “not working on it”. Otherwise, return to Step 2.

In the example below, a worker pulls the card into the “Analysis completed” column, a queue. While in the queue, the touch time accumulation ceases.

Step 5:

As soon as the team completes the entire value stream for the work item, a worker drags the corresponding card into the “Done” column. The cycle time comes to an end and the touch time stops accumulating.

At this point, the kanban software may include the work item in its calculation of flow efficiency. The software sums all the periods during which the item was either in a queue or flagged as not being worked on, and subtracts that total from the cycle time to calculate the touch time.
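As an illustration only, here is how software might derive touch time from the state changes of Steps 1 through 5; the event log, state strings and timestamps are hypothetical:

```python
from datetime import datetime

# Hypothetical event log for one card, as produced by Steps 1-5:
# "working" = flagged as being touched, "waiting" = flagged idle or queued.
events = [
    (datetime(2023, 2, 1, 9, 0),  "working"),  # Step 1: pulled into Analyze
    (datetime(2023, 2, 1, 12, 0), "waiting"),  # Step 2: work session ends
    (datetime(2023, 2, 2, 9, 0),  "working"),  # Step 3: work resumes
    (datetime(2023, 2, 2, 11, 0), "waiting"),  # Step 4: moved to a queue
    (datetime(2023, 2, 3, 10, 0), "done"),     # Step 5: value stream complete
]

def touch_time_hours(events):
    """Sum the periods during which the card was flagged as worked on."""
    total = 0.0
    for (start, state), (end, _) in zip(events, events[1:]):
        if state == "working":
            total += (end - start).total_seconds() / 3600
    return total

cycle_hours = (events[-1][0] - events[0][0]).total_seconds() / 3600
print(touch_time_hours(events), cycle_hours)  # 5.0 of 49.0 hours touched
```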

Detecting and correcting errors

The team can easily detect unfeasibly long periods of work when data is collected this way. For example, if workers work only single daytime shifts, the team should investigate and fix working periods longer than, say, 12 hours (or any duration it chooses). Automated alerts in the software can help detect such cases.

If a user attempts to set the state to “not working on it” when the item is already in that state, it is likely that the start of the session was not recorded. The software should indicate this missing datum and allow the user to set a feasible start time.

Finally, if a card lacks both the start and end times of a session, this omission could be detected, but only if workers also tracked their coordination and communication activities. In that case, there would be periods for which the touch/wait state of the work item is not accounted for.
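The two checks above might be sketched as follows; the function name, state strings and the 12-hour threshold are illustrative assumptions, not features of any particular tool:

```python
MAX_SESSION_HOURS = 12  # plausible maximum for a single daytime shift; tune as needed

def check_session(current_state, new_state, session_hours=None):
    """Return data-quality warnings for a work-session state change."""
    warnings = []
    if new_state == "not working" and current_state == "not working":
        # stopping twice in a row: the session start was probably never recorded
        warnings.append("missing session start: please supply a start time")
    if session_hours is not None and session_hours > MAX_SESSION_HOURS:
        warnings.append(f"implausible session of {session_hours} h: investigate")
    return warnings

# A 30-hour "session" and a double stop both deserve investigation.
print(check_session("not working", "not working", session_hours=30))
```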


Calculating the flow efficiency

The calculation of the cycle time should be straightforward. If a work item has been marked as completed and then returned to an active phase of the value stream, the software should replace the initially recorded end time with the subsequent end time.

The software should calculate the touch time as the cycle time minus the time spent in queues during the cycle, minus the time flagged as “not working on it”. If the team reactivates a work item after it has been moved to a “done” phase, all the time spent in that “done” phase should be treated as if it were queue time.
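A minimal sketch of this calculation, with hypothetical figures:

```python
def flow_efficiency(cycle_time, queue_time, flagged_time):
    """Flow efficiency = touch time / cycle time, as a percentage.

    All three arguments are durations in the same unit (e.g. days).
    Time spent in a "done" phase before a reactivation should already
    be counted in queue_time, per the rule described above.
    """
    touch_time = cycle_time - queue_time - flagged_time
    return 100.0 * touch_time / cycle_time

# Example: a 10-day cycle with 6 days queued and 3 days flagged idle
# leaves 1 day of touch time.
print(flow_efficiency(10, 6, 3))  # 10.0 (percent)
```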

The article Measuring Flow Efficiency by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

1 Some people might prefer to define flow efficiency relative to lead time instead of cycle time. Lead time would be measured from when the customer commits to requesting some output. Presumably, there will be some waiting time between that commitment and the start of work. Thus, flow efficiency measured against lead time will probably be lower than if measured against cycle time.
2 This statistic is taught in Lean Kanban University training. It has been reconfirmed by numerous reports from Kanban coaches and consultants, based on their own experiences.

Unless otherwise indicated here, the visualizations are the work of the author.


Automated Value Stream Maps 2020-05-09T07:09:04Z An automated value stream map is an advanced example of how information visualizations may be integrated into service system management tools.



The previous article in this series gave an overview of visualization types useful for managing services but rarely seen. In this article, I will examine in detail a key visualization, the value stream map (VSM). I do not intend to explain how to use VSMs. This article assumes a basic understanding of value streams and of value stream maps. Instead, I will examine how you might automatically create and update that visualization within service and operations management tools.1

What is a value stream map?

Fig. 1: An example of a value stream map

A value stream map is one of the key visualizations used in lean management. It describes the flow of information and materials through a value stream. Many readily available sources describe these maps, so I will not go into any detail here. I will only note the iterative use of the maps in value stream mapping. This activity supports the continual improvement of an organization. It especially concerns identifying waste in the flow of materials and information.

Tools for creating value stream maps manually

Fig. 2: Tools for the manual design of value stream maps

Many different tools are capable of creating value stream maps. Virtually all these tools provide a VSM template, icons and drawing tools to enter text, position icons and draw connections.

I might mention in passing the simplest of tools: pencil, eraser and paper, or whiteboard, marker and eraser. Using these tools, especially in a group activity, allows for personal interactions like body language and tone of voice. Automated tools have no channels to communicate those interactions.

However useful such manually created diagrams might be, they have no built-in intelligence. They do not connect automatically to any underpinning data. Users may validate the accuracy of the diagram only manually. Maintenance of the maps is labor-intensive. In short, such tools cannot create automated value stream maps.

Partially automated value stream maps

Fig. 3: Some tools allow for automatic update of data in value stream map labels

Certain tools go a step beyond this sort of simple drawing. They allow shapes in the VSM to be related to data in spreadsheets. As the data in the spreadsheets change, managers may need to alter the diagram. In some cases, this synchronization may be automated.

In their simplest form, such tools remain essentially drawing tools: the user must manually create the objects on the VSM. In the more sophisticated form, these tools can draw complete VSMs based on data in the spreadsheet. To my knowledge, such tools hard-code the style and layout of the resulting VSM. They represent the simplest form of the automated value stream map.

Integrating VSM creation with service system management tools

The next step in the creation and maintenance of automated value stream maps would be to bypass the spreadsheets. Service management or operations management tools may provide the data to VSMs directly from the operational data they manage.

We may divide the setup of such automation into seven areas:

  • the design of a VSM template
  • the definition of the value stream
  • the identification of the data sources
  • the linking of the data sources to the VSM object attributes
  • the identification of thresholds to trigger alerts
  • the definition of analyses of the VSM data
  • the program for updating and communicating the VSMs

Once the designers complete this setup, the system may create VSMs in a largely automated way. As we will see, we may also automate some of the uses of VSMs, once delivered to the desired audience.

Design the VSM Template

A VSM template may define the default attributes for a VSM. These attributes might include the shapes and icons to use, the color palette, fonts and so forth. Technically, the template might take the form of an XSL style sheet applied to XML data.

The manual choices made by designers prevent the automation of template creation. Of course, some future and particularly sophisticated AI might be capable of executing this task.

Define the Value Stream

Value stream managers may define the value stream in a map either visually or by direct coding. Designers already do such work using business process automation tools or BPMN notation. They might find it easier to define the value stream phases and other VSM components using a visual tool. Theoretically, designers could directly write, or tune, the underpinning XML code. We might dub this technique “value stream as code”, analogous to “infrastructure as code”.

Lean management calls for gemba walks at the workplace to identify the phases of the value stream used in practice. How shall we conceive of a gemba walk when an IT system performs the service or process?

Certain tools can sniff network packets and trace other system artifacts, adding the intelligence needed to follow the flow of these virtual materials. Using such tools, it might be possible to identify flow based on the reality of how the service system processes information. Where possible, we should prefer this approach to basing the value stream on the theoretical architectural designs of a service.

For example, an electronic mail delivery service assigns unique identifiers to messages, allowing it to trace the real processing of those messages. We could apply a similar approach to other services if they had the necessary identifiers. There might be other methods for automatically identifying how a system processes data.

Among the factors influencing the usability of such methods are:

  • the degree to which nodes are shared
  • the complexity of the processing
  • the design of the information packet
  • the technologies in use

Automating the identification of the value stream phases might be possible if the service system were designed to allow the necessary tracing.2

Identify the Data Sources

Data maintained in automated management tools may supply most of the object attributes displayed on a VSM. I note below the exceptions that depend on manual updates.

You will see in the diagrams below that I suggest automated updates based on data in log files. In principle, those data represent the reality of what happens in a service system. This reality may well differ from what we find in normative configuration records, agreements and other such sources.

Cycle Times

Fig. 4: Cycle times may be captured from many sources, most, but not all, automatically

Cycle times may be measured and reported using various sources. Computer inputs and outputs might be timestamped. Kanban boards, whether physical or virtual, might record start and end times. Executors of purely manual tasks might report their cycle times.

In some cases, designers might calculate mean cycle times using Little’s Law:

Mean Lead Time = Mean work items in progress / Mean work items completed per unit time
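For example, with hypothetical figures:

```python
# Hypothetical figures: on average 12 work items are in progress and
# the team completes 3 items per day.
mean_wip = 12    # mean work items in progress
throughput = 3   # mean work items completed per unit time (per day)

mean_lead_time = mean_wip / throughput  # Little's Law
print(mean_lead_time)  # 4.0 days
```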

Make sure that the measured times do not include non-value-adding time.

When machines perform work, we can distinguish between value-adding time and non-value-adding time in a straightforward way. When people perform work, only the executor of the task can really distinguish what was value-adding from what was not. Consider the issues associated with completing a weekly timesheet recording the amount of work done on each assigned project.

Who knows what percentage of the time spent on a single task was value-adding? In general, only the person performing a task knows that. Note that the mere act of recording such information is, itself, non-value-adding. Furthermore, worker biases and other forms of error reduce the reliability of such time estimates. Consequently, you may wish to collect these data only periodically, not continuously. Independent controls on the recorded data could also help reduce bias and improve accuracy.

Take care to avoid high levels of measurement overhead. Random sampling may help to reduce that overhead, especially for a high volume of work items during the measurement period.

Queue/Inventory Sizes

A value stream map should report aggregated values of queue size. Instantaneous measurements of queue size support proactive allocation of resources and unblocking activities. However, they do not support optimization activities based on value stream maps. Instead, we seek statistics such as mean inventory size and standard deviation over the sample period.

If computerized agents perform services, monitoring agents can measure queue sizes. For example, a message transfer agent (MTA) will have an input and an output queue. Standard agents can measure the size of those queues and report those data to the event management system.
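The aggregation described above can be sketched as follows; the queue-size samples are hypothetical readings that a monitoring agent might report at regular intervals:

```python
from statistics import mean, stdev

# Hypothetical queue-size samples for an MTA input queue, taken by a
# monitoring agent at regular intervals over the sample period.
samples = [4, 7, 5, 9, 6, 8, 5, 4]

queue_mean = mean(samples)    # mean inventory size over the period
queue_stdev = stdev(samples)  # sample standard deviation

print(round(queue_mean, 2), round(queue_stdev, 2))
```

Note that `statistics.stdev` computes the sample (not population) standard deviation, which suits a set of periodic samples drawn from a longer-running process.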

For manual work, designers may derive queue sizes from kanban boards. The board designer may split each value stream phase into an “in progress” sub-column and a “completed” sub-column. In that case, the queue sizes may be read directly from the “completed” sub-columns. Otherwise, the “Ready” or “Backlog” columns at the left side of kanban boards display such work. Portfolio kanban software would be particularly useful for gathering such data. Furthermore, it can help ensure the same data are not counted multiple times.

For physical materials, the machines that automate the handling of materials may provide inventory sizes. Supply chain data may also provide the data needed for their calculation.

In an information technology context, inventories of goods might include computers, spare parts and other devices. These components may be in storage, awaiting use, or in the intermediate phases of a value stream. For example, a technician may clone a batch of disks to prepare computers for deployment to the desktop. After preparation, but before installation in the computers, the disks form part of an intermediate inventory.

The diagram for cycle times (Fig. 4) is also mostly relevant to capturing queue sizes.
Availability


Fig. 5: The availability of service system components (at the functional level) may be captured automatically, for the most part

In an automated value stream map, we should consider the availability of the whole system required for each value stream phase. Drilling down to the individual components becomes important only when defining specific actions to improve availability.

Analysts may measure the availability of computing devices and other machinery in many ways. For value stream mapping, the most appropriate way is to subtract the downtime due to incidents from the planned service time, then divide by the planned service time. However, I would not generalize the use of this metric for availability.3

The service management tool should understand the relationship of system components to the successful completion of each phase of the value stream. Incident tracking needs to be able to identify when any of those components has failed and to relate those failures to the components. In this way, the service management tool can automatically calculate availability for the value stream maps.

Resource and Capacity Use

The service management tool should detect system component unavailability. It should also know how much of their theoretical capacity the service or process uses over any given period, and it needs to understand how capacity use is related to performance.

Measuring the use of non-IT machines is more straightforward: some machines are either on or off, while others can function at different speeds. Agents can generally measure the percentage of processing cycles used on computing processors; combine this statistic with the processing capacity of a single cycle. Storage use, too, is very simple to measure.

The management tool should also have an idea of how capacity use affects performance. For example, running a machine faster might increase its failure rate and hasten the time before the next preventive maintenance. A computer processor might have some logarithmic relationship of capacity use to performance. Similarly, working people to exhaustion generally increases the error rate and lowers throughput. The over-use of resources generally provokes some form of waste; inversely, the under-use of resources is another form of waste.

Defect Rates

Being able to measure defect rates at each phase of the value stream implies that:
  • each phase has distinct criteria for successful completion
  • these criteria are tested at handover time to the next phase
  • the results of such tests—at least, the negative results—are logged
Logs may record the failures to meet those success criteria. The relevant automated value stream maps derive data directly from those logs. Application developers may include in their applications the capability to report intermediate failures to respect success criteria. There is increasing pressure on all developers to enhance in this way the observability of how software works.

When workers detect defects manually, such as via visual inspection of an intermediate product, they should maintain a corresponding manual log. The tool creating the automated value stream maps may process this log to report those defects on the maps.

Customer reports are also a source of information about defects. Customer support request records may contain numeric defect data. Records of the return of merchandise (if applicable) may also contain such data. Channels such as complaints to sales personnel may contain anecdotal defect information.

Take care to avoid the double-counting of defects. Stopping production upon detection of a defect and not passing defective products down the value stream serve to prevent miscounting.
Fig. 6: Defects in goods or services may be captured via various channels
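As an illustration, a per-phase defect rate might be derived from such a handover log; the log structure and phase names here are hypothetical:

```python
from collections import Counter

# Hypothetical handover log: (phase, passed_success_criteria).
# Each entry records one test of a phase's success criteria at handover.
log = [
    ("analyze", True), ("analyze", False), ("analyze", True),
    ("build",   True), ("build",   True),  ("build",   False),
    ("build",   False),
]

handovers = Counter(phase for phase, _ in log)
defects = Counter(phase for phase, ok in log if not ok)

# Defect rate = failed handovers / total handovers, per phase.
defect_rates = {p: defects[p] / handovers[p] for p in handovers}
print(defect_rates)  # analyze: 1 of 3, build: 2 of 4
```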

Batch Sizes

The size of a batch of work can have a very significant effect on the flow of that work. Consequently, it can have a significant impact on throughput and lead times. Despite this impact, service management tools do not generally provide a structured way of defining and managing batch sizes. Therefore, it might be difficult to automate the reporting of batch sizes in a VSM.

In a retail store, batch size might be the quantity of items ordered from a distributor when it is time to restock. In a factory, batch size might be the number of components to fetch when it is time to bring more to a station on the assembly line. But what do we mean by “batch size” in the context of services delivered via software applications?

Software applications might manage the flow and processing of information in batches, as distinct from handling every transaction separately. The daily accounts closing, which aggregates the day’s transactions and balances, exemplifies this. Responding to queries in blocks of data, rather than delivering all results at once, is another example. Thus, you might see the results of a query 25 lines at a time; if you want to see more, you click on the “See more” button.

Batching of work also occurs in the management of technology components. For example, when a user in your company needs a new computer, do you prepare just a single computer and deliver it, or do you prepare a batch of computers? Technicians reason that larger batches of computers prepared in advance permit more rapid deliveries. Of course, doing such work in batches may also lead to various forms of waste, such as overproduction and rework.

Therefore, there is a case for knowing and reporting the sizes of batches. Tuning batch size is part of the incremental changes you might make to optimize the flow of work.

Data about the sizes of batches might hide in various places in management tools. Work orders instructing someone to prepare x number of components might contain batch sizes. Application configuration files or database records might contain them. Or they might be implicit in the capacity of the infrastructure used.

For example, the size of a batch of goods delivered by truck might be “as many as can fit”. The number of new disks in a batch might be “the number of connections to the ghosting apparatus”. Remember, though, that a gemba walk might reveal actual batch sizes that differ from the planned or theoretical sizes.

Fig. 7: Data about batch sizes may be sourced in many places, some of which workers record manually

Changeover and Maintenance Times

Changeover times might have a high impact on the flow of work on assembly lines. However, software systems, by their very nature, do not have such issues; or, at least, they perform changeovers rapidly. The waste of such changeovers may become noticeable only when value stream managers have eliminated far more important sources of waste.

We may consider two types of software changeovers. First, system managers might stop some software running on a platform to free up resources for different software. Shutting down one virtual machine and starting up another on the same platform exemplifies this need. Another example is shutting down one application followed by starting up another.

The second case is typical of any operating system supporting pre-emptive multitasking. The processor cycles dedicated to process context switching are a form of changeover and waste. Monitoring the number of context switches, as opposed to their duration, is generally possible.

Whether a system is hardware or software, it may require shutdowns for maintenance purposes. Technicians often perform manual maintenance tasks according to work orders generated by the production control system. However, derive the data for the VSMs from the aggregate of the actual maintenance times; we prefer these to the expected times that work orders might indicate. Log and report automated maintenance tasks (which are generally non-value-adding activities). Examples include the periodic rebooting of servers or the shutdown of applications during the reorganization of indexes.

Similarly, virtually all software batch operations are non-value-adding actions. Think of importing daily exchange rates, adding the day’s transactions to a data warehouse or the periodic closing of books. These are not forms of maintenance, however. Report these activities as phases of the value stream, especially if they are performed frequently.

Fig. 8: Many changeover and maintenance activities automatically write to logs, but manual activities require the technician to record the execution times

Link the Data Sources to the VSM Objects

We have seen that a VSM may contain automatically reported data derived from various management tools. Some data, however, might be difficult to obtain automatically. Other data might reflect planned or expected values rather than the actual operational values.

The VSM designer must link the identified data sources to objects in the value stream map. For example, link each inventory shape to the calculation of its inventory size. Link mean cycle times to the segments in the VSM’s timeline, and so forth.

Identify Alert Thresholds and Algorithms

Managers might use value stream maps to visualize how various components of a service operate. But they use them principally to identify forms of waste and potential improvement activities. So, let’s also try to automate the VSM’s use in identifying issues and improvements. The automatic identification of issues depends, obviously, on first determining the criteria indicative of an issue. These criteria might be simplistic thresholds or more sophisticated algorithms, such as those used by AI analytics.

To the extent that thresholds are used, a service management tool might already record their definitions. The most obvious sources would be the agreements with customers and suppliers to respect certain lead times. They might also contain records of capacity thresholds for various service system components. Older approaches may have defined performance criteria in OLAs. (OLAs may be deprecated in methods focusing on the customer and using multidisciplinary teams responsible for entire services.)

Other sources of data might include industry benchmarks. For example, flow efficiency is a standard metric for flow management. It is defined as value-added time divided by total cycle time, expressed as a percentage, and it is commonly reported on value stream maps. Knowledge work activities like software engineering commonly have a flow efficiency of 5% to 15%. In other words, flow is abysmally poor.
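A simplistic threshold check of the kind described above might look like the following sketch; the threshold values, metric names and messages are illustrative assumptions, not taken from any tool or agreement:

```python
# Hypothetical thresholds, e.g. drawn from customer agreements or
# industry benchmarks such as the 5%-15% norm for knowledge work.
THRESHOLDS = {
    "flow_efficiency_min_pct": 15.0,
    "lead_time_max_days": 10.0,
}

def vsm_alerts(metrics):
    """Compare measured VSM metrics to thresholds; return alert messages."""
    alerts = []
    if metrics["flow_efficiency_pct"] < THRESHOLDS["flow_efficiency_min_pct"]:
        alerts.append("flow efficiency below threshold")
    if metrics["lead_time_days"] > THRESHOLDS["lead_time_max_days"]:
        alerts.append("lead time exceeds agreed maximum")
    return alerts

# A value stream with 8% flow efficiency and a 12-day lead time
# would trigger both alerts.
print(vsm_alerts({"flow_efficiency_pct": 8.0, "lead_time_days": 12.0}))
```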

Define Visual Analytics

Value stream maps should visually indicate issues worthy of further investigation and action. Only the imagination limits the visual techniques that VSM designers might use to highlight such issues. Examples of visualization techniques might include:
  • special colors within the color scheme in use
  • special fonts
  • changes to backgrounds around the objects or the labels concerned
  • text annotations
  • fish-eye display of map objects worthy of closer attention
Fig. 9: Illustration of various techniques used to highlight issues on an automated value stream map

Update and Communicate the Automated Value Stream Maps

Value stream maps need to be kept up to date, and value stream managers must have timely access to the updated versions. We shall want to automate these updates and map distributions as much as possible. Service management tools commonly have the capability to automate the update of visualizations; no innovations would be required to implement this function for value stream maps. Similarly, the communication of service management information is already well advanced. Tools support pushing information (generally by some electronic messaging) and pulling information (making it available via some information portal). More sophisticated tools also allow for subscribing to specific reports.

Validate and Decide

The simplest of drawing tools allow for group interactions and types of non-verbal communication. Unfortunately, electronic and automated tools provide no good channels for this type of communication. (Do you believe that adding a smiley to the end of a written message has the same force as a genuine smile from the person looking at you?) Furthermore, we should not underplay the value of struggling with building a visualization manually to enhance learning and acceptance. Are you more apt to understand the visualization in whose creation you have participated, or the visualization that has been created for you by a machine?

It is not a good idea to apply the information displayed in an automated value stream map without further analysis or challenge. Such visualizations would then be pointless; you might just as well let the automated creation process take the necessary improvement steps on its own! Therefore, value stream managers need to view the maps analytically. They need to discuss them and decide for themselves how to benefit from the information they display. They should attempt to discover their own insights. Only then should they decide which improvements to implement.

Implement Improvements

Implementation of the changes intended to improve the value stream concludes an iteration of the value stream mapping activity. What role could the automated value stream map play in this implementation activity? For many, the map would play no role at all during the implementation of the change.

An automated value stream map may conceivably act as a sort of operational control panel for a value stream. In other words, there could be a two-way relationship between value stream operations and the map. On the one hand, the map is drawn directly from operational data. On the other hand, changes to the map could automatically change the parameters of the flow of work. For example, batch sizes, shift durations and resource counts could be altered within an electronic automated value stream map. With such a technology, the map might also be used to test hypotheses about the impact of changing flow parameters. Most organizations, however, have a very long way to go before they develop such capabilities.

Summary of Benefits of Automated Value Stream Maps

We assume that the members of a service delivery organization have achieved consensus on what a value stream map should display. In this case, automation will vastly decrease the time needed to generate an acceptable and useful map.

As continual improvement tools, automated value stream maps may be useful in creating simulations of proposed improvements. Value stream managers may visually compare the situation of the recent past to a simulation of a proposed future. Visual simulations would be especially beneficial if the proposed changes were to alter the phases of the value stream itself.

Furthermore, automating the calculation and display of operational values removes the risk of certain errors in the map. When a person types or pens in a value (e.g., a cycle time) there is the risk of misreading or mishearing the value. That person might misplace the decimal point or write the wrong number of zeros. Automation also leads to consistency of output, which enhances the comparability of maps. This consistency is especially important in the algorithms used to calculate the numeric statistics reported on value stream maps. Two different persons might have different views on how to calculate availability; a single software instance for creating a map does not.

Summary of Drawbacks of Automated Value Stream Maps

I have already alluded above to the benefit of creating a value stream map manually. The creators struggle together in finding the best ways to present the information on the map. They might decide to adapt the map for the particular purposes of a given service. In the end, they understand the details of the map because they created each part themselves. Merely being presented with an image created by a third party makes learning from the map harder.

I described above how an automated value stream map might include visual indicators of factors that lead to waste. While they enhance map usability, they also present the risk of ignoring factors that are not visually dominant. Compare this situation to the bias of assuming all is well so long as the light on the control panel is green.

Setting up the automation of value stream map creation is itself a labor-intensive activity. It makes sense only if the resulting system will create value stream maps regularly. This would be the case if value stream maps were being used as intended. However, some immature organizations might think of value stream maps as one-off types of documentation. They might create them once and then file them. In such cases, automation makes little sense.

As with any form of automation, it makes sense if it solves one or more problems an organization is facing. But if the organization cannot identify the problems it is trying to solve, it cannot understand the value of automation. Such automation efforts are likely to be misguided and wasteful.

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License The article Automated Value Stream Maps by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


1 This is not a tutorial on how to use any particular service management tool. To my knowledge, no service management tool currently has the capability to automatically create and maintain value stream maps. However, if users are aware of what is possible, without very much effort on the part of the tool designers, they might start to request such capabilities.

2 Various tools exist that can track the flow of events through a service system. I am thinking of products from companies such as Dynatrace, New Relic, Amplitude or Splunk (no endorsement intended). The trick is to relate those events to the much higher-level value stream phases. It is unlikely that such relationships can be identified automatically.
3 When measuring the availability of an IT-based service, I generally recommend defining the metric as a percentage of customer requests that the system can fulfill. In this customer-oriented way, we avoid considering a system to be unavailable when no one wants to use it. However, the traditional use of value stream maps in a manufacturing context understands the availability of machinery as the percentage of planned time that equipment is functioning correctly. This interpretation corresponds to the IT definition of availability as measurable in terms of the percentage of time a component is down.

Unless otherwise indicated here, the diagrams are the work of the author.

The post Automated Value Stream Maps appeared first on This view of service management....

Information Visualizations for Service Management 2020-04-24T12:25:00Z Service management tools only rarely include the visualizations that provide a high level of information to support the management of services. The reasons for this rarity are discussed and many examples of high-level visualizations are provided.

The post Information Visualizations for Service Management appeared first on This view of service management....


In this installment of my series on information visualizations, I will describe a variety of visu­a­li­za­tion types that could be very useful for managing services but are rarely seen. There are many reasons for this rarity:

  • visualization creators are not aware of the chart types and their usefulness
  • the tool developers reuse charting libraries that do not include those types
  • the recipients of those chart types might not understand them at first sight
  • organizational inertia drags down the desire to improve or innovate
  • the tool developers do not perceive there to be a demand for chart types other than the ones they offer.

I do not include in this article the types of charts that are commonly available in the tools service managers tend to use: integrated ticketing tools and spreadsheet tools. These charts—bar charts, dot plots, area charts, radar charts, pie charts, bubble charts, diverse gauges, combinations of these types and yet several other types—tend to be designed for data visualizations rather than information visualizations. They are a form of baby talk in the realm of visualization. Let’s try to use a more mature language to communicate our messages.

About service management tools

My purpose here is not to review and compare products available in the marketplace. That being said, a few remarks about the existing products may be helpful to service managers seeking support for more sophisticated information visualization.

In preparation for this article, I reviewed more than 20 different service management tools, selected based on their longevity, sophistication and popularity. For many of these tools, I benefited from the advice and support of their sales and support personnel, for which I am very grateful.

I do not intend to publish the accumulated data about chart type availability here, as the focus of this article is precisely on what these tools do not provide as visualizations. However, if there is sufficient demand for this information, I might consider publishing it elsewhere.

Linking to third-party visualization tools

Many tools make it possible to create visualizations using external tools, either by batch exporting data, real-time query of their databases or by using protocols, such as REST, to feed data into other tools. This capability leads us to ask what the added value of internal visualization functionality might be. In other words, if it is possible to create visualizations via third-party tools, why bother including this functionality within service management tools at all?

What is the added value of information visualization within a service management tool?

There are two ways to answer this question. The first answer is based on the eternal best of breed versus best overall integrated functionality debate. I do not intend to step into that basket of crabs here.

The second answer is based on the value of increasing levels of functionality. As a tool adds functionality, it generally reduces the time needed to create visualizations. At the higher levels of functionality, it may also help visualization creators to avoid errors, especially when the visualization depends on a good knowledge of statistical analysis.

Fig. 1: Synopsis of differences among visualization tool types

Drawing tools

At the low end of that gradient is the tool that allows you to draw straight lines, circles, polygons and splines and fill in spaces. You can make any visualization at all with such tools, but:

  • their creation and maintenance is very labor-intensive
  • they do not allow for dynamic updating of the visualization
  • there is no drill-down, filtering or sorting capability

Simple charting tools

At the next level is the tool that automatically creates the coordinate system and places the data objects of the visualization, based on the underlying data. This saves a huge amount of time, compared to the first level, but at the cost of limited freedom in what objects can be displayed. Such tools might be able to update the visualization (semi-)dynamically, but they do not offer any drill-down capability. Any sorting and filtering capability depends on sorting or filtering the underpinning data, rather than the visualization itself.

Sophisticated charting tools

Those capabilities are added in the next level of functionality. The more sophisticated service management tools are at this level. But there remains one higher level that most of these tools lack.

Service management-specific tools

A few (very few) service management tools take visualizations to a higher level, providing (mostly) out-of-the-box charts that address issues specific to doing the work of managing services. The characteristics of such visualizations are best described using a few examples.

I call these visualizations “service management-specific” for lack of a better title. Of course, most of them may be useful for communications about fields other than service management. The mere fact that a type of visualization is available in a service management tool does not mean that the tool is producing service management-specific visualizations.

Example: the lead time histogram

Let’s take the example of a histogram displaying the distribution of lead times for a certain service. Creating a good histogram requires the following capabilities:

  • a useful number of buckets, and their size, must be defined
  • there should be a visual indication of the median and perhaps the mean values
  • there should be a visual indication of thresholds significant to the needs of the service customers, such as the lead time required to fulfill 90% of the service acts within the needed time
  • if the histogram is intended to be mono-modal, there should be an indication if there is a significant possibility of multi-modal underpinning data
Fig. 2: A lead time histogram annotated to describe the principal components
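The bucketing and annotation steps listed above can be sketched with Python's standard library alone. The lead times below are invented, and the 90% threshold is computed as a simple empirical percentile:

```python
import statistics

# Invented resolution lead times, in minutes.
lead_times = [35, 42, 55, 61, 64, 70, 72, 75, 80, 84,
              90, 95, 101, 110, 118, 125, 133, 150, 170, 210]

bucket_size = 30  # minutes; chosen to give a useful number of buckets
buckets = {}
for lt in lead_times:
    low = (lt // bucket_size) * bucket_size
    buckets[low] = buckets.get(low, 0) + 1

mean = statistics.mean(lead_times)
median = statistics.median(lead_times)
# Threshold significant to customers: the lead time within which 90% of
# the service acts in this sample were fulfilled (empirical percentile).
p90 = sorted(lead_times)[int(0.9 * len(lead_times)) - 1]

# A crude text rendering of the histogram bars and annotations.
for low in sorted(buckets):
    share = buckets[low] / len(lead_times)
    print(f"{low:>3}-{low + bucket_size:<3} {'#' * buckets[low]} ({share:.0%})")
print(f"median={median}, mean={mean:.1f}, 90% within {p90} minutes")
```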
All of these visual components can be created at any level of functionality. The differences are in:

  • the overall effort required
  • whether the knowledge required is embedded in the tool or in the tool creator
  • and, if the latter, whether the tool creator even has the requisite knowledge

To illustrate the last point, how many people have sufficient knowledge of statistical analysis to decide if a bump in a histogram is just a random variation or indicates that there are really two different types of samples in the underpinning data?

Example: the statistical control chart

Let me provide another example: the statistical control chart. It is very simple to create the scatter plot that is the chart type underlying this visualization. It is also very simple to include lines for the mean value and the standard deviations in virtually any spreadsheet program (even though this requires a great deal of redundant work). The real challenge is to present a visual analysis of the various series of data points that probably represent non-random variations. This is quite easy for a computer to do. Even though the statistical control chart has been a fundamental tool for the management of processes for more than half a century, few service management tools (among those I have analyzed) make such a visualization available. Furthermore, tools that do allow for the creation of such control charts do not generally visualize the evidence for non-random sequences of data points, as I have done via yellow highlighting in the chart below.1
Fig. 3: A control chart with systematic (non-random) variation highlighted
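The kind of non-random evidence highlighted in Fig. 3 can indeed be detected mechanically. As a sketch, the fragment below flags one classic signal: a run of eight successive points on the same side of the mean, one of the Western Electric / Nelson run rules. The data points are invented.

```python
import statistics

# Invented process measurements plotted on a control chart.
points = [50, 52, 48, 49, 53, 51, 54, 55, 56, 57, 58, 60, 59, 61, 47, 50]

def long_runs(data, center, run_length=8):
    """Return (start, end) index pairs of runs of at least `run_length`
    successive points strictly on one side of the center line."""
    def side(v):
        return (v > center) - (v < center)  # +1 above, -1 below, 0 on line
    runs, start = [], 0
    for i in range(1, len(data) + 1):
        if i == len(data) or side(data[i]) != side(data[start]):
            if i - start >= run_length and side(data[start]) != 0:
                runs.append((start, i - 1))
            start = i
    return runs

mean = statistics.mean(points)
print(long_runs(points, mean))  # → [(6, 13)]
```

A tool with this rule built in could highlight points 6 through 13 automatically, exactly the sort of annotation that the chart above adds by hand.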

Do not think that I am criticizing service management tool developers. They are all positioning themselves according to their perceptions of the marketspace and the niches in which they hope to deliver their products. The service management eco-system also includes the tool customers and the authors of service management frameworks. They have accorded relatively little importance to the sorts of analysis and communication that the visualizations described below are intended to support.

Higher-level visualizations for service management

In this section, I list a variety of visualizations that are particularly useful for managing services but are generally not available within service management tools. Indeed, many of them are available in only a few sophisticated charting tools.

Where possible, I have used a nomenclature based on the idioms defined by Tamara Munzner in Visualization Analysis & Design. That being said, there is little standardization of the names of visualizations.

The examples I provide below are more the raw visualization idioms (using Munzner’s terminology) than complete information visualizations. I would expect them to be embedded and adapted for the communication purposes, as described in my previous discussion of visualizations.

Control Charts

Fig. 4: Statistical control chart

The control chart, also called a Shewhart chart or a statistical control chart, is the classic example of visualizing whether a process is under control. The chart is a scatter plot indicating the mean value and upper and lower control limits based on the standard deviation of the sample. It provides a simple means for identifying process instances that are apt to be the result of non-random effects.

Parallel coordinates

Fig. 5: Parallel coordinates used to analyze network traffic

A parallel coordinate chart addresses the problem of not being able to visualize more than three dimensions projected onto a two-dimensional plane. Instead of trying to emulate Cartesian coordinates with axes at right angles to each other, it makes them parallel. Lines connect the values of each item on each axis.

An example of the power of visualizing with parallel coordinates would be the analysis of DDoS attacks by tracking the source IP, target IP, target port and packet size for the attacking packets. An attack has a characteristic visual pattern.

Sankey charts

Fig. 6: Sankey chart showing the volume of work items flowing from team to team

The chart par excellence for visualizing flow. In addition to displaying the nodes of a network and their connections, it represents the volume of flow between nodes by the width of the connector.

Typical applications include the analysis of the flow of work from node to node; the analysis of the delivery of services from providers to consumers; and the analysis of transitions or transformations from one set of states to a new set of states.

See a further discussion here.

Marey charts

Fig. 7: Marey chart

Originally designed to visualize the relative speeds in a transportation network, a Marey chart may also be used to show the speeds with which the phases of a value stream are executed. Multiple lines allow for comparisons of different teams, different periods, different processes, etc. See a further discussion here.

Cumulative Flow Diagrams

Fig. 8: Cumulative flow diagram

The cumulative flow diagram is a form of stacked area chart that quickly shows the evolution of the work backlog, work in progress and work completed over a period, such as the duration to date of a project. It also easily shows the changes in the mean lead time at a given moment.
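The bookkeeping behind such a diagram is easy to sketch. Each item below carries invented arrival, start and finish days; the three daily counts are the series that the stacked areas of the diagram display.

```python
# For each day, count items in backlog (arrived, not started), in progress
# (started, not finished) and done (finished). Plotted as stacked areas,
# these three series form a cumulative flow diagram. The dates are invented.
items = [  # (arrived, started, finished), as day numbers
    (0, 1, 3), (0, 2, 5), (1, 2, 4), (1, 3, 7),
    (2, 4, 8), (3, 5, 9), (4, 6, 9), (5, 7, 10),
]

for day in range(11):
    backlog = sum(1 for a, s, f in items if a <= day < s)
    wip = sum(1 for a, s, f in items if s <= day < f)
    done = sum(1 for a, s, f in items if f <= day)
    print(f"day {day:>2}: backlog={backlog} wip={wip} done={done}")
```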

Bump charts

Fig. 9: Bump chart, showing the evolution of the count of changes, grouped by the main reasons for the changes

A bump chart encodes a time series, showing the evolution of a continuous value for each nominal category of data. You can easily see the changes in relative importance of those categories over time.

Some people use “bump chart” as a synonym for “slope graph” which, however, is a different type of chart.

Violin Plots

Fig. 10: Violin plot comparing the reliability of five models of disks

A violin plot allows for easy comparison of various categories of data, showing the distribution of the data and, typically, such statistics as the high, low, mean and quartiles. As such, it resembles the simpler box plot.

See also bee swarms.

Bee swarms

Fig. 11: Bee swarm comparing five models of disks

A bee swarm is a form of dot plot where the individual data points are shown as dots but grouped according to categories. A bee swarm is like a violin plot with the detailed data points displayed, rather than the aggregated distribution statistics and the outline of the distribution. Thus, a bee swarm would be a good way of drilling down from a violin plot to the details.

See also Violin plots and Clustered scatter plots.

Clustered scatter plots

Fig. 12: Clustered scatter plot

Virtually all visualization tools allow for the creation of scatter (X-Y or dot) plots. Often, these data points are clustered. Considerable value is added if the tool can identify those clusters and automatically visualize them, such as with circles or shading. Such visualizations are extremely useful in AI applications. See also Contour Maps.

Tile Maps

Fig. 13: Tile map

When data can be related to geographical categories, such as countries or regions, a tile map is a useful way of visualizing that data. Not only can the data be plotted against many different types of projections; the projections themselves may be altered (distorted) to visualize the data. About 30% of the tools I analyzed provide some map-based visualizations, but only with the simplest, Mercator-style projections.

Stream Graphs

Fig. 14: Stream graph

A stream graph provides an extremely dense, impressionistic visualization of the evolution over time of the counts of a series of categories of data. For example, one might display how types of incidents or categories of changes gain and lose popularity over time. Care must be taken when viewers might be color blind. Good labeling and annotations greatly enhance this type of visualization.

A stream graph is an interesting example of how our perceptions can adapt to new visualizations. A moving horizontal axis characterizes this type of graph. This feature minimizes distortion of stacked shapes, but is unexpected by viewers accustomed to a Cartesian coordinate system.

Contour Maps

Fig. 15: Contour map

A contour map takes the clustered dot plot one step farther. It not only visually groups correlated data values, it also displays the degree of correlation via concentric contour lines.

Perceptual Maps

Fig. 16: Perceptual map

A perceptual map is useful for plotting the market’s perceptions of competing services. Multi-dimensional scaling may be used to reduce the plotted dimensions to the most significant ones. The services may be clustered in the resulting map and opportunities for (re-)positioning services identified.

Heat Maps

Fig. 17: Heat map

A heat map is a useful way of showing the evolution of cyclic patterns. A row corresponds to one cycle. It consists of as many tiles as there are subdivisions in the cycle. A cycle of 1 day might have 24 tiles; 1 year might have 12 tiles or 365 tiles, etc. Each tile could be color-coded to indicate a category or an ordinal value. Show as many rows as needed for additional cycles. For example, you might display the evolution of mean support calls per hour per day during a month.
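The arrangement can be sketched as one row per day and one tile per hour, with each tile shaded by an ordinal bucket of an invented call count:

```python
import random

# One row per day, one tile per hour; each tile's shade encodes an ordinal
# bucket of the (invented) support-call count for that hour of that day.
random.seed(1)
calls = {(d, h): random.randint(0, 9) for d in range(7) for h in range(24)}

shades = " .:*#"  # five ordinal shades, light to dark
for d in range(7):
    row = "".join(shades[min(calls[(d, h)] // 2, 4)] for h in range(24))
    print(f"day {d}: {row}")
```

Reading down a column compares the same hour across days; reading across a row shows the daily cycle, which is exactly what the heat map layout is for.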

A variant form, the cluster heat map, leverages the functionality of the re-orderable matrix, a particularly powerful analytical approach.

Note that some people call geographical maps colored to indicate some statistic as “heat maps”. We will live with the ambiguity. Adding to the confusion, both types of visualizations are also called “tile maps” by some.

Directed Graphs

Fig. 17: Directed graph

The directed graph (as well as the undirected graph) has become a very popular way of presenting and analyzing social networks. Insofar as some service providers have adopted the social network framework for organizing their work, especially support work, the directed graph can be applied directly to such activities. Some applications might include the analysis and presentation of the flow of information (although Sankey charts do a better job of showing the volume of flow); the identification of bottlenecks; and the reorganization of personnel to reduce inter-team communication and thereby increase overall performance.

Of course, the images used in configuration management tools to visualize configuration item relationships are essentially directed graphs (or just simple graphs). I will discuss such tools in detail in a separate article.

Value Stream Maps

Fig. 18: Value stream map

Value stream maps provide a comprehensive overview of the activities in a value stream, the resources used, the availability of activities, the mean execution and waiting times, and so forth. They are particularly useful for identifying waste and bottlenecks.

Wardley Maps

Fig. 19: Wardley map

Wardley Maps are designed to help visualize strategic choices regarding the possible evolutions of products.

Customer Journey Maps

Fig. 20: Customer journey map

Customer journey maps may take many forms. They are particularly useful for helping a service provider to understand how its services appear to the service consumers. As such, the consumers’ touchpoints are visualized, often relating them chronologically to each other, to the value stream, to the channels through which consumers interact and to performance measurements.

There are, of course, many other types of visualizations in addition to the ones illustrated above and the common “baby-talk” visualizations. This list is meant to be indicative rather than exclusive. Information communication is, by nature, open-ended and creative.

I have not included any of the visualizations that are just visually structured text. Many of the models for defining enterprise strategy, such as business model canvases, fall into this category. Similarly, I have not included the tabular organization of text and the ever-popular word cloud, although the latter treats words more like images than like strings of letters.

I expect to continue this series, looking at such issues as interactivity and the depiction of configurations.


Creative Commons License The article Information Visualizations for Service Management, by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


[a] H. Choi, H. Lee, and H. Kim, “Fast detection and visualization of network attacks on parallel coordinates,” Computers & Security, vol. 28, no. 5, pp. 276–288, 2009.

[b] Tamara Munzner, Visualization Analysis & Design. CRC Press (Taylor & Francis Group, LLC). Boca Raton, 2015. ISBN 978-1-4987-0776-3.

[c] Jacques Bertin, Sémiologie graphique. École des hautes études en sciences sociales, 2011 (original edition 1967). English version: Semiology of Graphics, Esri Press, 2010.

1  Thanks to Brent Knipfer for his feedback on the availability of statistical control charts.

Unless otherwise indicated here, the diagrams are the work of the author.

Fig. 17: Deepthiyathiender – Own work, CC BY-SA 4.0.


Kanban in the Time of Corona 2020-03-23T12:59:44Z I went to the supermarket yesterday and was delighted to see a standard kanban practice was implemented there. Attempting to limit the density of the shoppers in the store, you had to wait at the entry for a card—in fact, a kanban card—before entering. At the exit, you returned the card, enabling another entry to […]

The post Kanban in the Time of Corona appeared first on This view of service management....


I went to the supermarket yesterday and was delighted to see a standard kanban practice was implemented there. Attempting to limit the density of the shoppers in the store, you had to wait at the entry for a card—in fact, a kanban card—before entering. At the exit, you returned the card, enabling another entry to the store. The cards were delivered next to the post distributing disinfectant to your hands. I was not able to see if the cards themselves were disinfected between use.

This practice recalls the example given in introductory Kanban classes of the use of kanban cards to regulate the flow of visitors to some parks in Japan.

Next, I had to visit the pharmacy to buy a prescription drug. (Fortunately, it is not for any respiratory ailment.) While the pharmacy is open as per its normal schedule, entry is limited to one customer at a time. I suppose the augmented chance of ill visitors makes such a WIP limit advisable.

A second feature designed to improve flow and decrease lead times is the request that you email or call in advance, so that the order can be prepared in advance of your visit. Advance preparation allows the pharmacy to reduce waste and improve its use of resources. When you deliver a prescription in person, there is an awful lot of waiting time and movement. Furthermore, it is very difficult for the pharmacy to batch the orders and find the optimal batch size for fulfillment.

It might be difficult to reduce the movement type of waste, but the waiting can be reduced and different batch sizes tried out when orders are prepared in advance.

However, the practice of limiting influx to keep a sanitary distance between customers within the store goes for naught if they all bunch up at the entry to the pharmacy, waiting for their respective turns to enter. This is the problem of replenishment of the ready queue. Given that the batch of waiting customers is constantly changing, you can hardly expect them to work out on their own how to keep the flow of entries going while maintaining a good distance between the waiting customers. After all, in how many countries do people queue up for the bus in an orderly, civilized fashion?

Thus, the pharmacy was obliged to structure the backlog of customers by providing the same sorts of ropes and poles that you see in airports and cinemas. At the same time, signs requested that the waiting people maintain their distance from each other.

And so, it was inevitable: the scent of bitter drugs reminded me of the fate of unrequited kanban practices.

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License The article Kanban in the Time of Corona by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Visualizing uncertainty 2020-01-15T08:40:00Z In this age of visual management, Bayesian reasoning, machine learning and other statistical methods, it is increasingly important to understand how certain we are about the “facts” and how to visualize that uncertainty.

The post Visualizing uncertainty appeared first on This view of service management....


There is one thing certain about managing services: we are uncertain of service outcomes. Service performance levels are uncertain and even the outputs of our services entail significant uncertainty. If we try to persuade service stakeholders to use, to deliver or to manage services in a certain way by using information visualizations, we should be honest about the degree to which we think we understand what has happened, what should happen or what will happen. In this age of Bayesian reasoning, machine learning and other statistical methods, it is increasingly important to understand how certain we are about the “facts” and how to visualize that uncertainty.

Douglas Hubbard has discussed at length how many people misapprehend their own certainties.1 For many, often for those in technical fields of work, either they claim to be 100% sure of something or they refuse to offer an opinion. Not only is this phenomenon based on unwarranted degrees of certainty—100% sure simply does not exist—it abandons a very wide range of certainty, say, from 70% to 95%, wherein we can make very useful judgements.

The designers of information visualizations will be more or less sure of the information they present. Of course, they can label elements with text. But are there visual ways of indicating levels of certainty? The answer, as we will see below, is “yes”. The question, though, is how certain we can be that these visual methods are effective. In this article I will first present some general remarks about uncertainty and probability. Then, I will examine various techniques for the visual expression of uncertainty.

Describing uncertainty

Uncertainty can be described in many ways. If you ask an engineer how long it will take to resolve a certain incident, you might get the answer “four hours”. And if you follow up with “Are you sure?”, the engineer might respond “Sure I’m sure” or “I’m pretty sure” or maybe “Well, I guess I’m not very sure.” These are ordinal estimates of certainty. But they are likely to be influenced as much by emotion, personality and power relationships as by objective evaluations of uncertainty.

Objective assessments describe certainty with continuous values, usually expressed as a percentage ranging from 0% (certain that an assertion is false) to 100% (certain that an assertion is true). In other words, certainty is the probability that an assertion is true, and uncertainty is simply 100% minus that certainty.

We generally want to assess uncertainty over a range of values, such as a segment of calendar time or a range of lead times. We may describe such probabilities using probability density functions:

A probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample.2

Suppose we want to describe the probability of resolving an incident within a certain amount of time—the resolution lead time. That lead time would be the continuous random variable. The set of possible values would range from anything greater than 0 to some large number, say, 525600 minutes (i.e., one year).

Normally, we split up that range into a set of buckets, depending on the meaningful granularity of the lead time. For example, hardly anyone cares if you fix a broken printer in 208 minutes as opposed to 209 minutes. The difference of 1 minute is not significant to the outcomes the users expect. In such cases, a useful bucket duration might be 60 minutes. Perhaps, if you are resolving an incident regarding an automated securities trading system, a duration of one minute might be extremely significant, in which case your buckets should be much shorter than 60 minutes.

We want to know how probable it is that the lead time will fall into each one of the buckets. We may describe that probability mathematically as:

Fig. 1: The probability that x (the lead time) falls in the range of a to b equals the integral from a to b of the probability density function, f(x)
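Written out in conventional notation, the relationship described in Fig. 1 is:

```latex
P(a \le x \le b) = \int_a^b f(x)\,dx
```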

These probabilities—in the ranges a to b, b to c, c to d, etc.—are typically visualized using a histogram. Each bucket is counted in the selected sample of data and plotted as a vertical bar. Sometimes, a trend line supports interpolating values. We will return to the use of such trend lines in the examples given below. Labels of percentages make the chart easier to interpret.

Fig. 2: A histogram of lead times with a trend line. For example, there is a 14% probability that the lead time will be between 120 and 150 minutes

Often, we wish to determine the cumulative probability rather than the probability of a value falling in a single bucket. Suppose a customer requests a change by a service provider, stating that other actions must be coordinated with the implementation of the change. Therefore, it is not the lead time for the change that is most important; rather, it is the certainty that the change will be implemented by the announced date. In this case, the cumulative probability is useful to ascertain. Thus, the service provider might tell the customer that it is 80% probable that the change will be implemented in 210 hours. If the customer requires greater certainty, say 90%, then the lead time might have to be 230 hours, and so forth (see Fig. 3).

cumulative probability of completion
Fig. 3: The cumulative probability of completing a task by a certain time resembles a sigmoid curve. At short lead times (in this example, below about 250), completion is very improbable.

Uncertainty is an attribute of derived statistics as well as of measured or predicted values. For example, it is common to assess the correlation between two values. The analyst may then make an inference about whether one of those values might contribute to causing the other value or whether a third variable contributes to causing the tested variables. Thus, the measure of covariance is a proxy for the certainty of a possible causal relationship.

There are rules concerning how uncertainty is propagated from one or more independent var­i­ables to a dependent variable.3
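For example, under the usual assumption of independent errors, uncertainties of sums add in quadrature, and relative uncertainties of products add in quadrature. A sketch (the function names are my own):

```python
import math

def sum_uncertainty(*sigmas):
    """Standard uncertainty of a sum (or difference) of independent values."""
    return math.sqrt(sum(s ** 2 for s in sigmas))

def product_rel_uncertainty(*pairs):
    """Relative uncertainty of a product (or quotient) of independent
    values, given (value, uncertainty) pairs."""
    return math.sqrt(sum((s / v) ** 2 for v, s in pairs))

# Total lead time = queue time + work time, each with its own uncertainty:
print(sum_uncertainty(3.0, 4.0))  # 5.0
```

Note that the combined uncertainty is smaller than the simple sum of the individual uncertainties (5.0 rather than 7.0), precisely because independent errors partially cancel.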

Systematic versus random uncertainty

Uncertainty may be classified (among other ways) as being due to systematic causes or to random causes. Deming referred to these causes as special causes and com­mon causes, respectively. A sys­tem­a­tic cause should be identifiable and managed whereas a random cause needs to be iden­ti­fied as such, but not managed as if it were sys­tem­atic.

Examples of sys­tem­atic error might be in­cor­rectly ca­lib­rated or in­cor­rectly con­figured mea­sure­ment de­vices; be­havior designed to mislead or defraud, such as artificially lowering lead times; bugs in software that prevent it from being fit for use or fit for purpose; and many other examples.

Mistaking random causes for systematic causes is a form of bias discussed at length by Kahneman.4 Suppose a major incident takes a long time to resolve and the cause is assumed to be systematic, al­though it is, in fact, random. Steps are taken to improve performance. Lo and behold, the next, similar, incident is resolved more quickly. The managers assume that the better performance was due to the improvement steps, thereby per­pe­tu­ating those steps as “good practices”. But the reality was that the poor performance was at the lower range of random effects, so the next case would almost certainly show an improvement, no matter what steps might be taken in the interim.

It stands to reason that a visualization in­di­cat­ing the un­cer­tainty of information would add value by visually dis­tin­guish­ing random effects from systematic effects. The sta­tis­t­ical control chart is the classic visu­al­i­za­tion used to distinguish systematic variation from random variation (see Fig. 4). A set of eight standard guidelines helps to identify those data points that reflect systematic, as opposed to random, variation.

For example, the analyst should investigate for possible systematic causes any point more than three standard deviations away from the mean, or seven points in a row on the same side of the mean.
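These two guidelines can be screened for programmatically. Here is a simplified sketch covering just those two of the eight rules (not a full implementation):

```python
import statistics

def flag_systematic_points(values):
    """Flag indices violating two classic control-chart guidelines:
    (1) any point more than 3 standard deviations from the mean;
    (2) the 7th (and subsequent) of consecutive points on the same
        side of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    flagged = set()
    run, last_side = 0, 0
    for i, v in enumerate(values):
        if sd > 0 and abs(v - mean) > 3 * sd:
            flagged.add(i)                      # guideline 1
        side = (v > mean) - (v < mean)          # +1 above, -1 below, 0 on mean
        run = run + 1 if side != 0 and side == last_side else (1 if side != 0 else 0)
        last_side = side
        if run >= 7:
            flagged.add(i)                      # guideline 2
    return sorted(flagged)

# Seven points in a row above the mean trigger guideline 2 at index 6:
print(flag_systematic_points([10, 10, 10, 10, 10, 10, 10, 2]))  # [6]
```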

Rather than oblige each viewer to analyze the plot according to those eight guidelines, why not make them visually compelling? In the exam­ple shown here, the yellow back­ground highlights some of the exceptional data points that require further analysis.

statistical control chart with systematic variation highlighted
Fig. 4: A statistical control chart with additional visualizations of possible systematic error data points

How certain is certain?

As we have stated elsewhere, information visualizations are a means of communicating messages designed to persuade recipients to act or decide in a certain way. If we want such messages to communicate how certain we are about their contents, we should have a notion of what degree of probability is good enough for a given decision.

For example, suppose a customer asks for some new service for implementation by a fixed date. In other words, if the service is not made available by a certain date, then it would serve no purpose to implement it. So, the customer asks the service provider how certain it is that the service can be delivered on time. If the service provider says it is 100% certain, then the customer knows the provider has confused opinion and wishful thinking with an objective assessment of the probability of on-time delivery. Assuming the provider has an objective means for calculating the probability, what would the right answer be?

There is no simple rule of thumb to answer this question.5 The value of what is at risk also determines whether an action may reasonably be undertaken. Such issues occur commonly in allocating limited resources among the various services. Fig. 5 shows a simplified model of the changing benefits (value) of investing in two services, together with the combined value. A grey zone around each curve approximates the uncertainty of the predicted benefits. The width of those zones is determined by the probability that the prediction will be right two-thirds of the time, assuming this is an acceptable risk. If the risks were higher, that zone would have to be wider. The maximum benefits for the combined resource allocations would be somewhere in the indicated range. Note that the range of uncertainty for the combined benefits is considerably greater than for each service separately.

visualizing uncertainty in maximizing benefits
Fig. 5: A simplified model of allocating resources to two services, including the margin of uncertainty for the benefits of resource allocation and a visualization of the range within which the benefits are maximized.

Sources of uncertainty

Before we look at how to visualize uncertainty, let’s first look at the different types of uncertainty. We may first distinguish between uncertain measurements of events or states in the past and events or states predicted for the future.

Uncertainty about the past

Uncertainty about the past is typically the result of extra­po­lat­ing from a sample to a pop­u­la­tion. Suppose you wish to measure the satisfaction of consumers with your services. In all likelihood, only a small part of your entire con­sumer base will ever respond to a request for satisfaction levels. All other things being equal, the larger the size of that sample, the smaller the probable margin of error in comparing the sample statistics to the overall population statistics.
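For a satisfaction score expressed as a proportion, the familiar large-sample formula makes this concrete. A sketch assuming a simple random sample and the normal approximation (z = 1.96 for 95% confidence):

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p_hat
    from n responses (normal approximation, simple random sample)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Quadrupling the sample size roughly halves the margin of error:
print(round(margin_of_error(0.7, 100), 3))   # 0.09
print(round(margin_of_error(0.7, 400), 3))   # 0.045
```

So, if 70% of 100 respondents are satisfied, the population figure is probably somewhere between about 61% and 79%; with 400 respondents, the range narrows to roughly 65.5%–74.5%.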

When measurements are poorly conceived or poorly executed, they often introduce significant forms of bias into the results. Even assuming you are able to identify those biases, it can be extremely difficult to estimate how they cause your measurements to deviate from reality.

Uncertainty about the future

Predicting the future when it comes to delivering services involves estimating the margins of error in those predictions. In almost all cases, service output is the result of events and starting states of a complex adaptive sys­tem. There are far more agents in such systems than can be iden­ti­fied, not to speak of the dif­fi­cul­ties in measuring them and their interactive influences on all the other agents. As a result, we use simplifying models to estimate how the system will behave. Furthermore, when behavior is stochastic, we can only predict future states within a range of some breadth and some probability.

Suppose a customer asks a service provider to modify how the service functions. The customer will almost always ask how long it will take to implement that change. Using a model of the complexity of the work and the availability of resources, the provider might come up with an estimate that the average time for doing such work is about 20 days. By performing a Monte Carlo simulation, using historical data for doing similar work, the provider might determine that there is a 40% chance of completing the work in 20 days, a 75% chance of completing it in 25 days and a 95% chance of completing it in 35 days. By using historical data as the basis for the simulation, the many factors impacting lead time that are not easily modeled are also taken into account. Thus, the estimate pro­vided to the customer depends on the degree of certainty the provider believes the customer wants.
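Such a simulation can be sketched with a simple bootstrap: resample historical per-task durations, sum them for each simulated run, and read the quantiles off the results. The data and the breakdown into five tasks below are hypothetical:

```python
import random

def simulate_lead_times(history, n_tasks, runs=10_000, seed=42):
    """Bootstrap total lead times: each run sums `n_tasks` durations
    drawn (with replacement) from historical per-task data."""
    rng = random.Random(seed)
    return sorted(sum(rng.choice(history) for _ in range(n_tasks))
                  for _ in range(runs))

def percentile(sorted_values, p):
    """Empirical p-quantile (0 < p <= 1) of a pre-sorted list."""
    k = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[k]

history = [2, 3, 3, 4, 5, 5, 6, 8]     # days per similar past task
totals = simulate_lead_times(history, n_tasks=5)
for p in (0.40, 0.75, 0.95):
    print(f"{int(p * 100)}% chance of completing within "
          f"{percentile(totals, p)} days")
```

With enough runs the quantiles stabilize, and the provider can quote the lead time at whichever certainty level the customer requires.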

Once again, the margin of error in future predictions depending on historical data depends, too, on the factors men­tioned for uncertainty about past events.

Visualizations of uncertainty

Let’s look now at various visualization techniques that express uncertainty. These techniques range from simple ordinal messages to continuous values. In other words, some techniques are designed to express uncertainty as somewhere in the range from little uncertainty to great uncertainty. Other techniques include visual representations of probability density functions or even labels with such statistics as the correlation coefficient.

Error bars

An error bar is probably the most common method for visualizing the uncertainty in a graphical representation of data. An error bar is generally a line parallel to the axis of the dependent variable of a chart, going through each plotted value. Suppose the values are plotted as dots, for example, with the dependent variable on the Y-axis. Each dot would have a line traversing it parallel to the Y-axis. In the case of a bar chart, the line would be horizontally centered on the bar, extending above and below the summit of the bar.

In its simplest form, an error bar reflects three statistics about each data point: at the top of the bar, the bottom of the bar and the place where the bar crosses the plotted value. The interpretation of these positions varies according to the visualization. For example, a plot of stock prices would typically show each day’s high, low and closing price. But this reflects the uncertainty of the market, not the uncertainty of prices.

A more direct visualization of uncertainty might interpret these points as the mean value, the mean plus one standard deviation and the mean minus one standard deviation. This encoding might make sense if the distributions of values were all normal.

In other cases, the points might encode the median value, the first and the third quartiles. This encoding starts to give a sense of skewed dis­tri­bu­tions. Fig. 6 provides an example of a box plot with four statistics for each category: minimum value, 1st quartile, 3rd quartile and maximum value. The relative po­si­tion of the box and the length of the vertical line give indications of the distributions of the values. If it is important to give a more detailed view of how the data are distributed, a violin plot would be a better visualization.

box plot
Fig. 6: Example of a box plot, showing minimum, 1st quartile, 3rd quartile and maximum lead times, by service request type

As we have seen, box plots can encode many different statistics. In certain cases, such as in doc­u­ment­ing securities prices, the context makes it clear that the visualization encodes the opening, high, low and closing prices. But this is an exception that proves the rule that box plots should be labeled unless a well-known tra­di­tion defines the encoding.
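The statistics that a basic box plot encodes are easy to compute. A sketch using Python's standard library, with hypothetical lead times:

```python
import statistics

def five_number_summary(values):
    """Minimum, 1st quartile, median, 3rd quartile, maximum."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical lead times (minutes) for one service request type:
lead_times = [30, 45, 50, 55, 60, 70, 90, 120, 240]
print(five_number_summary(lead_times))  # (30, 50.0, 60.0, 90.0, 240)
```

The long gap between the 3rd quartile (90) and the maximum (240) is exactly the kind of skew that the relative position of the box and the length of the whiskers make visible.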

Violin plots

I have previously written at some length about violin plots as they may be used for services and kanban. See my article Violin plots for services & kanban. I provide here an example of such a plot (Fig. 7). The sizes and the shapes of the violins give a good indication of the degree of uncertainty in a value as well as how those values tend to be distributed.

Reliability shown via a violin chart
Fig. 7: An example of a violin plot comparing the distributions of failure rates for various models of hard disks.

Value suppressing uncertainty palette

This technique uses color to represent both a value and a level of uncertainty. Changes in hue encode the value being displayed. The degree of uncertainty is encoded mostly via the saturation, with lower saturation indicating higher uncertainty.

value suppressing uncertainty palette
Fig. 8: A key for displaying ranges of value by hue and ranges of uncertainty by saturation

The palette describes four levels of uncertainty. The number of value bins halves with each step of increasing uncertainty. Thus, at maximum uncertainty there is but one value bin. This increases to 2 bins, then 4 bins, then 8 bins at the lowest level of uncertainty.
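The binning behind such a palette can be sketched as follows (my own simplification of the published scheme):

```python
def vsup_bin(value, uncertainty, levels=4):
    """Map a value and an uncertainty (both in [0, 1)) to a palette cell.

    Returns (uncertainty_level, value_bin): level 0 is the most certain
    (8 value bins when levels=4); the top level has a single bin.
    """
    level = min(levels - 1, int(uncertainty * levels))
    n_bins = 2 ** (levels - 1 - level)       # 8, 4, 2, then 1 bins
    value_bin = min(n_bins - 1, int(value * n_bins))
    return level, value_bin

print(vsup_bin(0.6, 0.1))   # (0, 4): certain, so one of 8 value bins
print(vsup_bin(0.6, 0.9))   # (3, 0): very uncertain, single bin
```

Note that four levels yield 8 + 4 + 2 + 1 = 15 distinct cells, which is why the key in Fig. 8 requires 15 colors.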

How does this scheme work in practice? In Fig. 9 we see a map of the U.S. where each statistic is displayed according to the level of uncertainty (i.e., the margin of error in the statistic). To my eyes, it is easy to see that Montana, Idaho, Vermont and Wyoming have a high margin of error (.75–1). The Dakotas and Nebraska have a lower margin of error (.5–.75) and a low statistic (4%–10%), whereas Oregon, Nevada, New Mexico and others have a similar margin of error, but a higher statistic (10%–16%).

This example highlights the drawback of the approach. How easy is it to find the states with a low margin of error (0–.25) and a low statistic (4%–7%)? Maybe my color blindness is making the scheme less effective, except at the extremes.

In any case, the scheme is less than ideal because:

  • it uses too many different colors (15)
  • it uses color to encode two different types of data
  • it is easily misinterpreted as encoding a single range of values, rather than a range of values and the uncertainty of those values
uncertainty levels on a map
Fig. 9: A map combining measurements by region and showing the margin of error in the measurement using ranges of color saturation.

Continuous visual encoding

Uncertainty is traditionally visualized using a density plot. Violin plots include a variant of the density plot. When a time series is plotted as a line, how do we visualize the changing levels of uncertainty throughout that series? The change in uncertainty becomes especially striking as the visualization passes from measurements of past values to extrapolations into the future.

When a set of nominal categories is plotted as a bar chart, where each category might have a different level of uncertainty, how can we visualize that uncertainty in the same chart as the bars? The continuous encoding scheme provides a solution to these questions.

A continuous encoding scheme may be applied to bar charts by associating the thickness of the bar with the level of certainty of the statistic. The result looks like a normal bar chart but with strange caps atop each bar. Those caps are representations of the probability density of the value as it reaches the highest levels.

Look at the example in Fig. 10 to see how to interpret the chart. Look at bar C. Up until a value of about 560, the certainty of the value is nearly 100%. But above that value, certainty starts to drop. By a value of 750, the certainty has dropped to nearly 0%. As you can see in the chart, each category has somewhat dif­ferent certainty levels at the top of the range.

How could we use such a chart for service management purposes? Suppose we are comparing the expected lifetimes of different models of hard disks. The scale of the diagram might be in 1000s of hours. Thus, the expected lifetime of model C is almost certainly at least 650’000 hours. It might get as high as 750’000 hours, but beyond that there is virtually no chance a disk of that model would last any longer. The shape of the cap atop each bar indicates the distribution of failures once the disk reaches a certain age. This chart is a refinement over a simple bar chart that could only display a single statistic, such as the mean, about the lifetime of disks.

bar chart with certainty level
Fig. 10: In bar charts, only the very top of the bar is a zone of significant uncertainty. The horizontal thickness of the bar indicates the probability of the measurement on the Y-axis.
gradient chart of expected disk model life
Fig. 11: A gradient chart documenting the expected life spans of different models of hard disks

Fig. 11, a gradient chart, demonstrates an alternate method of showing uncertainty on a bar chart. Note, for example, that Model C stays very reliable until near the end of its life. Models B and E, however, have a long period of increasing unreliability at the end of their lives. In this example, gradients could also have been used to document reliability during the initial burn-in period.

A similar convention could be used for a time series plot wherein the plotted values have varying degrees of certainty. This would be the case if the reported statistics were extrapolated from samples. In service management, this might be the case if con­sumer sa­tis­fac­tion ratings were based on samples, insofar as it might be impossible or too expensive to poll all the consumers.

In such a chart (see Fig. 12), small density plots replace the traditional dots of a dot plot. The height of the plot indicates the level of un­cer­tainty of the value. Note that pre­dicted future values (in yellow) are much less certain than the past values based on samples.

Fig. 12: Plotting distributions for each data point in a time series.

The above encodings of uncertainty provide a lot of information—perhaps more in­for­ma­tion than many viewers need to make decisions. Fig. 13 shows a simpler technique for a line chart. Before the current time, the temp­erature is shown as a simple line, based on the recorded temp­eratures. Since predicted temp­eratures are uncertain, that black line continues, but within a grey background showing the changing margin of error. The visualization indicates that there is a margin of error, but does not indicate the probabilities within that margin of error.

uncertainty in weather forecast
Fig. 13: Using shading to indicate a range of probable future measurements

Hypothetical Outcome Plots (HOPs)

Hypothetical Outcome Plots are animated samples of possible out­comes. These possible outcomes reflect the un­cer­tainty about the data describing a certain state. There is evidence that the use of HOPs can help visualization viewers to better judge un­cer­tainty than with static encoding of uncertainty, such as with error bars or violin plots.

Think of a map showing different possible trajectories of a hurricane where the movement of the storm is animated simultaneously on each trajectory. HOPs avoid the am­bi­gu­ous encodings that cha­rac­terize the use of error bars or box plots. Apparently, non-specialist viewers find it easier to make statistical inferences from HOPs than from other visualizations describing uncertainty.

See the articles cited in the bibliography for more details.

Tools for visualizing uncertainty

The vast majority of tools for creating visualizations offer little support for visu­a­lizing un­cer­tain­ty. Box plots are the principal exception to this generalization. While many tools can generate box plots or error bars, they tend to have very limited con­fig­ura­bility.

This is less an issue for information visualizations than for data visualizations. We usually expect the latter to be generated in a largely automatic way, once the initial visualization parameters are set. Information visualizations, in contrast, take a specific position on interpreting the data and argue in favor of certain decisions; most of them require a certain degree of manual retouching to emphasize or to clarify the messages they communicate. Among the visualizations described above, the only ones I have not been able to generate with a combination of automatic and manual tools are the HOPs.

Be that as it may, we can hope that an understanding of the usefulness of visualizing uncertainty and a growing sophistication in the creation of information visualizations will increase the demand for tools that will ease their creation. As that demand increases, the availability of more sophisticated tools is likely to increase.

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License The article Visualizing uncertainty by Robert S. Falkowitz, including all its contents, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


[a]   UW Interactive Data Lab. “Value-Suppressing Uncertainty Palettes”.

[b]   Munzner, Tamara. Visualization Analysis & Design. CRC Press, 2014.

[c]     Jessica Hullman and Matthew Kay. “Uncertainty + Visualization, Explained”.

[d]    Jessica Hullman and Matthew Kay. “Uncertainty + Visualization, Explained (Part 2: Continuous Encodings)”.

[e]    UW Interactive Data Lab. “Hypothetical Outcome Plots: Experiencing the Uncertain”.

[f]     UW Interactive Data Lab. “Hypothetical Outcome Plots (HOPs) Help Users Separate Signal from Noise”.

[g]    Jessica Hullman, Paul Resnick, Eytan Adar. “Hypothetical Outcome Plots Outperform Error Bars and Violin Plots for Inferences About Reliability of Variable Ordering”. PLOS ONE, 10(11), 2015.

[h]    Barry N. Taylor and Chris E. Kuyatt. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results.  NIST Technical Note 1297. National Institute of Standards and Technology. There are multiple versions of this document. The 2009 version is available from Diane Publishing Co. The 1994 version is available online at

[i]     K. Potter, J. Kniss, R. Riesenfeld, and C.R. Johnson. “Visualizing summary statistics and uncertainty”. Proc. Eurographics IEEE – VGTC conference on Visualization. EuroVis’10. 2010. p. 823–832.

1 Douglas W. Hubbard. How to Measure Anything. Finding the Value of “Intangibles” in Business, 2nd ed. John Wiley & Sons, Inc., 2010.
3 See the generic description here and a simpler list here.
4 See, for example, Daniel Kahneman. Thinking, Fast and Slow. Macmillan, 2011.
5 Well, there are rules of thumb, but remember that our thumbs are all of different sizes. I learned this to my chagrin from my own experience. Before undergoing surgery I was told that 95% of cases show improvement, 4% show no change and the remainder show a loss of functionality. Although the odds appeared to be extremely strong in my favor, I nonetheless ended up in the 1% that showed a loss of functionality.

Unless otherwise indicated here, the diagrams are the work of the author.

Fig. 8: Downloaded from

Fig. 13: Downloaded from
