Utilizing Kanban to Manage Support at Typemock

A major focus area at Typemock is customer support. We believe support is a great source of new feature ideas and innovation for our product, and that one of the major reasons to purchase a Typemock product is the excellent level of customer support we provide. If you didn’t know, support at Typemock comes straight from the developers – we don’t offshore, outsource or offload our support to anyone; when you talk to someone who signs ‘Typemock Support’ you are talking to a world-class unit testing expert, straight from our development team.

A while ago we were a bit displeased with the way our customer support was being managed. Handling two product lines, growing demand for our products and a development team taking on some major, effort-heavy features (more on that soon!) all made us feel we were getting less effective at support. Another obstacle was the lack of visibility. Each iteration, one of our developers was assigned to customer support. For the entire iteration he was disconnected from the rest of the team, and we had no immediate way of telling whether he was idly twiddling his thumbs or crushed under a torrent of user complaints. The developer would have to come forth and say ‘I am free’ or ‘I am loaded’, rather than the process providing visibility into the state of affairs. Finally, we did not have a way to drive improvement – we did not know how long it takes for a case to go through support, what the major causes of complaints are, or how to turn these complaints into new features.

After putting a lot of thought into this, I took on the task (mentored by Eli) of applying a Kanban process to our customer support. In short (and there’s way more info out there on the interwebs), Kanban is a flow-based process that uses a pull, rather than push, mechanism to advance tasks through the flow. The worker (the support person or team, in our case) always starts at the last step of the process and looks for where they can advance a work item to the next step; they pull work items through the process. This is contrary to work items being thrown at the team and stockpiled until someone gets to them – in both scenarios the team is busy, but with Kanban you get better visibility into where the bottleneck is and which process area needs to be improved.

The first step in implementing our flow was to analyze our support work process and translate it into flow form: the steps in a support ticket’s lifecycle. We wanted this to be as simple as possible, to avoid having to move each ticket through 20 interlocked lifecycle steps, so we came up with this flow:

It pretty much speaks for itself – a support ticket can be in the support inbox, meaning it is yet to be handled. From there it goes to analysis, where it’s dissected, analyzed and deeply contemplated. After enough time to meditate on the case, the results can be communicated to the customer – a request for more info, a workaround, an answer or an admission of a bug. After we reply, the ticket is officially Done. Done tickets may return to the inbox to start the flow again – for instance, when we request more info to recreate an issue, we expect the customer to return to us with the data and begin going through the flow again.

To make sure the pull system works, you first set (initially arbitrary) caps on how many work items can exist at any flow stage. We chose the following setup:

This means that there can be up to 6 tickets in the inbox, 3 in analysis and 1 being replied to. The Done state is the bin holding all the cases we have handled. We prune it once an iteration, extract feature ideas from it and analyze common causes for the incidents.
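
For the process geeks, here is a minimal Python sketch of how such caps could be written down and checked, assuming the board is tracked as simple per-stage ticket counts. The stage names and limits follow the setup described above; `WIP_LIMITS`, `over_limit` and `can_pull_into` are hypothetical names made up for this illustration, and the example counts are invented.

```python
# Caps per flow stage, as described in the post. "done" is an uncapped bin.
# All names here are illustrative assumptions, not a real tool's API.
WIP_LIMITS = {"inbox": 6, "analysis": 3, "replying": 1, "done": None}


def over_limit(counts):
    """Return the stages whose ticket count exceeds its cap."""
    return [
        stage
        for stage, limit in WIP_LIMITS.items()
        if limit is not None and counts.get(stage, 0) > limit
    ]


def can_pull_into(counts, stage):
    """True if the stage still has room for one more ticket."""
    limit = WIP_LIMITS[stage]
    return limit is None or counts.get(stage, 0) < limit


# Invented example: every capped stage is exactly at its limit.
counts = {"inbox": 6, "analysis": 3, "replying": 1, "done": 12}
print(over_limit(counts))                 # []
print(can_pull_into(counts, "analysis"))  # False
```

A board that is exactly at its caps is still legal, but nothing can be pulled forward until a later stage frees up a slot.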

How does this help us improve our process? Let’s run a simple simulation. Let’s say Dror is providing support this sprint, and he has 4 items waiting for him in the inbox, and 2 are being analyzed:

While Dror is happily churning away at the items he’s currently investigating, customers continue to have questions. Imagine three of these questions have just landed in our support inbox. This would make for a 7-item inbox, which we defined as illegal in our flow, so Dror picks up one of the cases from the inbox and starts working on it right away:

Now, what would happen if another case landed in the inbox? Dror would have to immediately relieve two flow stages: move one case from analysis to replying, to make room to move another case from the inbox to analysis. This sort of situation gives us the first point where we can leverage Kanban: early congestion detection. Without a structured and capped flow, we would only find out about this congestion when cases started to age in the inbox and angry customers came knocking on my door. Not good. Using this methodology we know of the situation as it evolves and we can react. Also, as team lead I have excellent visibility – a quick glance at the Kanban board, counting the tickets (that’s the hard part) and comparing them to the cap on each stage shows me there’s congestion. There are more indicators of a stuck flow: a ticket staying in one place for too long, for instance. In this case, let’s say the team springs into action to unblock the stuck flow and lend a shoulder, and after an hour of all-hands support work we have the following state:

Pretty satisfactory. It took us a moment to identify that something bad was about to happen, and a very short time to react, adapt and clean up the mess before it happened. Now it’s Dror’s job to start pulling tickets through the flow. As there’s room to reply to a customer, he should consider which analysis case can be communicated to the customer. If there’s no such case, he should look for an inbox case to analyze, and so forth. If he’s idling without inbox cases, he should come to his manager and pull more work – at this point he can help the team’s sprint, solve a bug or learn something – whatever helps the iteration goals the most. The support person is no longer a dark resource, thrown into support work and returning a bit dusty a couple of weeks later. He’s an active participant in the sprint, providing and receiving support from the rest of the team.
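
The pull rule above can be sketched roughly as a function that scans the flow from its last step backward. This is only an illustration under a few assumptions: the board is reduced to per-stage counts, `ready_for_reply` stands for the number of analysis cases that already have something to communicate, and finishing a reply in progress is checked first (the post does not spell that step out). The function name, the action strings and the example counts are all hypothetical.

```python
# Caps for the capped stages, as described in the post.
WIP_LIMITS = {"inbox": 6, "analysis": 3, "replying": 1}

# Hypothetical action labels, used only for this illustration.
FINISH_REPLY = "finish the reply and move the ticket to done"
START_REPLY = "move an analyzed ticket to replying"
PULL_FROM_INBOX = "pull an inbox ticket into analysis"
KEEP_ANALYZING = "keep analyzing"
ASK_FOR_WORK = "ask the team lead for other work"


def next_action(counts, ready_for_reply):
    """Pick the next move by scanning the flow from its last step backward.

    counts: number of tickets currently in each stage.
    ready_for_reply: analysis tickets with a result ready to communicate.
    """
    # Assumption: a reply already in progress is finished first.
    if counts["replying"] > 0:
        return FINISH_REPLY
    # Room to reply and something worth saying: advance an analysis case.
    if ready_for_reply > 0 and counts["replying"] < WIP_LIMITS["replying"]:
        return START_REPLY
    # Otherwise pull a new case from the inbox, if analysis has room.
    if counts["inbox"] > 0 and counts["analysis"] < WIP_LIMITS["analysis"]:
        return PULL_FROM_INBOX
    # Analysis is occupied and nothing is ready yet.
    if counts["analysis"] > 0:
        return KEEP_ANALYZING
    # Nothing left in the flow: pull work from the rest of the sprint.
    return ASK_FOR_WORK


# Invented counts, not the state from the post.
print(next_action({"inbox": 2, "analysis": 1, "replying": 0}, ready_for_reply=1))
# move an analyzed ticket to replying
print(next_action({"inbox": 0, "analysis": 0, "replying": 0}, ready_for_reply=0))
# ask the team lead for other work
```

Checking the later stages first is what makes this a pull system: new inbox cases are only started once nothing further down the flow can be advanced.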

Another perk we got out of this process is visibility into support statistics. Once the process is structured, it gets easy to track ticket progress and analyze how long it took us to move tickets between stages. One indicator we track is time to close a ticket – the amount of time a ticket takes to get from Inbox to Done. Another is time to reply – the amount of time it takes us to move a ticket from the Inbox to Analysis (this comes with a notification to the customer). Using this process we managed to cut our time to reply from 14 hours to 4 (measured during business hours) within two months – we are happy with the improvement and we hope you are too.
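
To make the two indicators concrete, here is a small sketch of how they could be computed, assuming each ticket records the time it entered the Inbox, Analysis and Done stages. The field names, the `support_stats` helper and the sample timestamps are invented for illustration. For simplicity it measures plain wall-clock hours rather than business hours, and it ignores tickets that go through the flow more than once.

```python
from datetime import datetime
from statistics import mean


def hours_between(start, end):
    """Elapsed wall-clock hours between two datetimes."""
    return (end - start).total_seconds() / 3600


def support_stats(tickets):
    """Average time to reply (Inbox -> Analysis) and to close (Inbox -> Done).

    Each ticket is a dict with 'inbox', 'analysis' and 'done' datetimes;
    'analysis' and 'done' may be None if the ticket has not got there yet.
    """
    reply = [hours_between(t["inbox"], t["analysis"])
             for t in tickets if t["analysis"] is not None]
    close = [hours_between(t["inbox"], t["done"])
             for t in tickets if t["done"] is not None]
    return {
        "time_to_reply_h": mean(reply) if reply else None,
        "time_to_close_h": mean(close) if close else None,
    }


# Invented sample data: one closed ticket, one still in analysis.
tickets = [
    {"inbox": datetime(2024, 1, 8, 9, 0),
     "analysis": datetime(2024, 1, 8, 13, 0),
     "done": datetime(2024, 1, 9, 9, 0)},
    {"inbox": datetime(2024, 1, 8, 10, 0),
     "analysis": datetime(2024, 1, 8, 12, 0),
     "done": None},
]
print(support_stats(tickets))
# {'time_to_reply_h': 3.0, 'time_to_close_h': 24.0}
```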

So that’s it – I hope there are more process geeks out there who enjoyed a lengthy look into how we implemented a process to improve visibility, teamwork and control over our support performance. Do any of you have comments, success/war stories of your own or tips on how to make it even better? Sound off in the comments!