CIOReview
SEPTEMBER 2019

me, that blocking and tackling is based on the principles of IT service management (ITSM). I have found that the better I am at service management, the better I will be at operational excellence, customer satisfaction and, more importantly, IT agility. And, in a world that moves quickly and changes constantly, IT agility just might be the most important thing I can develop.

Perhaps my personal example will demonstrate how solid ITSM enables digital transformation. We design, develop and support two SaaS products that our clients and their employees use. When I first inherited these products, they were overly complex and fragile. Left to themselves, they broke several times a week. As you might expect, it is difficult to focus on innovation and digital transformation when we are not good at the basics of delivery. My first priority had to be to make these products solid, reliable and performant. I am happy to report that we have accomplished those tasks.

How did we pull that off? ITSM. We defined a service catalog and service levels and then used a quality incident response and resolution process to find and resolve the root cause of our application issues. As we consistently applied ITSM to our products, performance improved and we were able to shift our efforts from firefighting to proactive innovation.

In parallel, we got good at the process of production change. In my experience, the vast majority of system downtime comes from not-fully-verified changes to production systems. We defined a simple process (complex processes encourage workarounds and are themselves unreliable) in which every change, before we made it, had to have a valid, executed test plan, a roll-back plan (just in case our testing missed something) and a communication plan (so that we would have visibility into the change and fully understand its reach and impact on other things).
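For readers who like to see a process written down as a check, the three-part change rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ChangeRequest` structure, its field names and the `missing_requirements` and `approve` helpers are hypothetical, not the tooling we actually use.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """A proposed production change (hypothetical structure, for illustration only)."""
    summary: str
    test_plan_executed: bool = False  # a valid test plan exists and has been run
    rollback_plan: str = ""           # how the change would be backed out
    communication_plan: str = ""      # who is told, and what else the change touches


def missing_requirements(change: ChangeRequest) -> list[str]:
    """Return the requirements the change still lacks; an empty list means it may proceed."""
    missing = []
    if not change.test_plan_executed:
        missing.append("executed test plan")
    if not change.rollback_plan.strip():
        missing.append("roll-back plan")
    if not change.communication_plan.strip():
        missing.append("communication plan")
    return missing


def approve(change: ChangeRequest) -> bool:
    """A change is approved only when nothing is missing; there are no exceptions."""
    return not missing_requirements(change)
```

The check is deliberately small. A gate with only three questions is hard to work around and easy to apply to every change.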
Within a short period of time, our results improved, as did our credibility (and credibility matters when we are trying to convince the rest of the organization to digitally transform; we typically trust people who are demonstrably competent). Even as we got better at the business of IT, we retained our focus on ITSM principles.

As our product foundation improved, we started to seize opportunities to shorten our time to market. This led us to experiment with, and then adopt, agile software development methods and DevOps. We moved from twice-yearly product releases to monthly releases, then to daily releases and now to multiple micro-releases a day (if we need to). Shrinking our release cycle has created significant value. Fixes, enhancements and good old innovations get to our customers much faster. And each release is smaller and therefore less complex and less risky.

In order to pull this off, we had to be really good at service definition, delivery and management. We also had to eliminate exception handling so that we could standardize our processes before automating them. In other words, we had to continue to have our ITSM act together.

As we shorten our development cycles, we must shorten our incident response and root cause analysis cycles. In our case, we use production worthiness standards that define what we mean by a quality service. We meet weekly to review pending changes (including the week's product release) and any open incidents (an open incident is one for which we have not yet identified the root cause or implemented the countermeasure). We keep incidents open for as long as it takes us to get to and resolve the root cause. In one case, we had an incident open for over a year while we waited to learn enough about what happened to ensure that nothing like it would ever happen again. In searching for root cause, we focus our attention on the process rather than the specific incident.
If the last release introduced a software bug, we ask ourselves how we need to change our software development processes in order to reduce the likelihood of releasing a bug. If there are too many open database connections, we do not "solve" this by rebooting the database server. Rather, we work until we get to the core issue. I don't want to just find a memory leak; I want to find out what causes memory leaks.

Digital transformation depends on quality and agility. I have a strong opinion, formed over years of success and failure, that ITSM is the key to both quality and agility. And I need both if I want to lead an innovative approach to digital transformation.

Niel Nickolaisen