Wednesday, 23 January 2013

Code Dojos and Continuing Professional Development

"I know Kung Fu!" - "Show me."

On Monday we had yet another successful Code Dojo (see here), but what is a Code Dojo and why would we want to participate in one? Let us understand the context.

We live in a world of fly-by-wire businesses. Gone are the days in which IT was a cost-centre and, if the system went down, the work-force could revert to paper-based processes. Today's IT function is the core engine of high-performance businesses, and there is no paper-based fallback. When the network or a vital server goes down, everybody stops work. Some stop right away, some continue for minutes, perhaps hours in some cases, but ultimately everyone stops work once they need something vital from the business's core systems. If your business is transacted over the web, then for every minute your infrastructure is not accessible to customers, you lose both money and vital brand equity.

As I've written before, COPQ or Cost Of Poor Quality is a significant drag on a business's ability to innovate or improve. The principal mechanisms behind poor quality software production are systemic drivers (company culture, target-driven projects) and programmer skill. Changing and improving the Culture and System of an organisation are often outside the reach of the technical staff, but an individual's skillset is their own responsibility, and therefore within their reach to affect.

Unfortunately for the IT industry, the vast majority of academic institutions in the US, the UK and Europe do not teach modern programming methods, e.g. collaborative development (including pair-programming), Test-Driven Development (TDD), Continuous Integration (CI), Refactoring and Simple Design. A straw poll amongst London's developer community shows that fewer than half of the developers work for organisations that routinely employ TDD, and only about 1 in 10 employ pair-programming. This low level of take-up means that developers often lack good on-the-job training from their employers, and their only recourse for learning efficient high-quality modern development techniques is to self-study in their own time.

Whilst working for Sourcesense I saw a need for a workshop-based format for improving developer skills through deliberate practice, rather than the more normal classroom-based "tell them what you're going to tell them, tell them, then tell them what you've told them". The format was chosen because, like the game of Chess, explaining the 'rules' of TDD (or any other XP practice) takes only a minute or two, but mastering it can take years, and is principally driven by practice. In the book "Outliers" author Malcolm Gladwell identifies the "10,000 hour rule": the observation that, independent of the field of study, it takes about 10,000 hours of deliberate practice to master a skill. In order to count towards the total, the practice must be focussed, goal-directed, stretch one's abilities, give continuous feedback and be followed by self-reflection. Those 10,000 hours of deliberate practice can be difficult to obtain, as so much of our day-to-day work doesn't stretch us, isn't focussed, doesn't give us continuous feedback or doesn't allow us time for reflection.

In order to advance our skills, an environment that is dedicated to deliberate practice rather than delivery of value is a valuable resource, and this is exactly what a Code Dojo is.

During a Code Dojo we work in pairs, using TDD, Refactoring and adhering to the principles of Simple Design in order to solve a set-piece problem (or 'kata'). The focus is not on completing the problem in the time available, but rather doing the best job that you can, and improving your practice of the core techniques in doing so.  

Thursday, 10 January 2013

One OS to rule them all


On the way to see the Hobbit, I caught Canonical's announcement about Ubuntu on Phones: this is a pivotal announcement as it shows a landmark commitment to having a single Operating System (OS) across all platforms: phone, tablet, TV, desktop and server.

Ubuntu now fits your phone
In particular, Mark Shuttleworth's keynote shows the new Ubuntu touch-screen smart-phone User Interface (UI) and demonstrates some familiar concepts that will now be core to the Ubuntu phone, and other UIs - the fact that the indicators and controls are there only when you need them, the rest of the time they are off-screen and don't get in the way of your experience of whatever you are doing. This is of more significance and has more profound consequences than you might at first imagine.

In a similar fashion to Ubuntu, Mozilla have announced FirefoxOS (previously known as 'Boot to Gecko'), in which not only does the phone have a truly Open OS based upon Linux, but the actual UI of the OS is a web browser, your home screen is actually a home page, and your apps are actually web apps, just running in the OS's hosted browser environment.

FirefoxOS home screen

For a long long time game software development companies have realised that what players want - and expect - is that everything else will disappear when they are playing a game. All of the player's focus is on the game, the windows, icons, menus and pointers will disappear; the experience is immersive. More and more we are seeing this familiar theme replicated in other types of software:

  • Every movie-playing application has a 'full-screen' mode
  • Photo applications like iPhoto and Aperture have 'full-screen' modes
  • Document writing applications like WriteRoom make 'distraction-free' mode a key differentiator
  • Programming editors like Sublime Text have a 'distraction-free' mode
  • GarageBand has a 'full-screen' mode

More and more applications are using what has been termed an 'immersive UI' in order to optimise the experience for the user, to just let them get on with what they want to do and not clutter their work-space with things that they don't currently need.

In a lot of cases, I simply don't want to know what my battery level is, what user ID I'm logged in as or what the time is; I just want to get on with whatever it is I want to do. I want to manage by exception; by all means tell me when I'm low on battery, or when I need to leave for a meeting, but don't bother me unless I need to know something, or I ask.

Most of an operating system's features are not something I need to know about, I just need my apps to work. As a consumer I don't really care about CPU utilisation, what the strength of my WiFi signal is or what my most-used applications are. I only care about these things when I need to manage by exception, and there is no real difference here between a desktop, a tablet or a phone except in how profligate the OS can be with our screen real-estate.

The real benefits of an OS are in its non-functional features rather than its functional ones - that it is reliable, performant, responsive, secure etc. Again, these are not - or should not be - features, but rather minimum requirements to do the job.

As a developer, I need access to device capabilities - whether this is battery data, output to the screen, input from a keyboard or mouse, storage to a medium etc. The OS is an API that I interact with to find out information and to cause real-world side-effects. The more similar APIs are across different platforms, the happier I am because I have less work to do to reach a broader audience.

In the same way that the OS is an API, so is the Web Browser. As a consumer, I just want to use my chosen web-site, to consume information and services, to input data and perform tasks, and more and more the line between web-site and native application is being blurred. Ubuntu's 12.10 release finally accepted that there is no functional difference between a native app and a web app, and treats them in the same way. HTML5 local storage API is but one way in which we are getting closer and closer to a homogeneous application ecosystem.

Mozilla's 'Boot to Gecko' or FirefoxOS for phones is simply the end-point extension of this concept, in this case the OS is a web browser, and all apps are web apps, and whether the app is hosted on the phone or elsewhere is almost irrelevant. This leads us to the realisation that HTML5/CSS/JavaScript is now closer to being a 'write-once, run anywhere' environment than Java ever was, which has additional profound consequences. The OS and the Browser are converging, indeed it would be a defensible position to assert that the modern web-browser is in most ways that matter an operating system.

Digression: what is the difference between a product and a utility?
Answer: people care about the features of a product

Example: the vast majority of people don't care what brand of petrol they buy, they just want the cheapest petrol they can get that enables their vehicle to get from A to B. The petrol companies' attempts at producing key differentiators with additives packages and V-Power are all so much hog-wash, and the proof is the popularity of supermarket petrol. Everyone knows that Tesco and Sainsbury's don't make petrol, they just rebrand other companies' petrol, and no-one cares, as long as it works and it's cheap.

The rise of popularity of services like uSwitch and EnergyHelpline is proof that no-one cares about the 'product' that gas and electricity suppliers produce, all they care about is that it works and it's cheap. Gas supply is just a pipe through which methane flows, water supply is just a pipe through which water flows, and in much the same way Electricity supply is just a pipe through which electrons flow. We don't care how Electricity works as long as we can watch TV and cook meals in our microwave ovens.

Increasingly people are becoming aware that telecoms, particularly mobile telecoms, is just a pipe, much like the other utilities, and the so-called Value-Added Services that operators try and add to differentiate themselves are more and more irrelevant. People just want a pipe through which Internet packets flow.

What I'm getting at is that the operating system is analogous to a 'pipe' through which applications flow, how it works isn't relevant to 99.9% of consumers, we just want it to work and be as cheap as possible, and Ubuntu Linux, Android, Chromium and FirefoxOS are complete from a non-functional perspective and free of purchase cost. Similarly so for the Firefox and Chrome browsers. The operating system is simply a utility. The conceit of companies like Microsoft and Apple is that we actually care about their operating systems, the hard truth is that we just want them to disappear and let us get on with what we want to do, using apps, and we don't care whether those are native or web apps as long as we can get what we want to do, done.

Mozilla have announced that they're expecting FirefoxOS to drive the $50 smartphone; for many people in developing countries this will not just be their first smartphone, it will be their first access to the Internet. This changes everything!

Friday, 4 January 2013

The rise of Foxy thinking

Cunning like a fox

Now that we've survived the Mayan Apocalypse I predict that 2013 will see the rise of the generalist over the specialist.

This article last year from Harvard Business Review by Vikram Mansharamani, lecturer at Yale and author of 'Boombustology: Spotting Financial Bubbles Before They Burst', references Isaiah Berlin's 1953 essay "The Hedgehog and the Fox", which contrasts hedgehogs that "relate everything to a single, central vision" i.e. specialists, with foxes who "pursue many ends connected...if at all, only in some de-facto way" i.e. generalists. Berlin's essay is itself based upon the Greek poet Archilochus who wrote that "The fox knows many things, but the hedgehog knows one big thing."

The article talks of Professor Philip Tetlock's 20+ year study of 284 professional forecasters. He asked them to predict the probability of various occurrences both within and outside of their areas of expertise. Analysis of the 80,000+ forecasts found that experts are less accurate predictors than non-experts in their area of expertise. Tetlock's conclusion: when seeking accuracy of predictions, it is better to turn to those like "Berlin's prototypical fox, those who know many little things, draw from an eclectic array of traditions, and accept ambiguity and contradictions."

The time has come to acknowledge expertise as over-valued. There is no question that expertise and hedgehog logic are appropriate in certain domains e.g. hard sciences, but they certainly appear less fitting for domains plagued with uncertainty, ambiguity, and poorly-defined dynamics e.g. social sciences and business - the very circumstances that have given rise to Agile and Lean Startup. This kind of flexibility is what has led to the concept of generalising-specialists, the T-shaped skills profile and the concept of 'everybody codes, everybody tests, everybody deploys'.

The time has come for leaders to embrace the power of foxy thinking.


Tuesday, 30 October 2012

CoffeeScript and classes, privacy and encapsulation

There is an uneasy alliance between the concept of a 'class' in CoffeeScript and the class-less prototypal inheritance model of JavaScript, itself influenced by Self and Scheme. Many people coming to JavaScript, including the creators of CoffeeScript and Prototype.js, have added mechanisms to their languages/libraries in order to enable a more 'intuitive' mechanism for handling Object-Oriented (OO) concepts.

Whilst I'm not declaring the use of OO in JavaScript to be good or bad per se, what I would like to do is point out where the limitations of JavaScript (and hence CoffeeScript) lie, and how to avoid inadvertently tying your own shoelaces together.

This

First this: 'this' in a class body refers to the class itself (actually the constructor function, since there are no real classes in JavaScript or CoffeeScript; instantiated objects inherit from the constructor's prototype object, not from the constructor) and 'this' in a class constructor refers to the object under construction. Let's look at a simple example with a Game class for a hypothetical game with a board:
class Game
  constructor: ->
    @board = []
myGame = new Game
Remember that '@' is a shortcut for 'this.', so our game board starts off empty. If we want to be able to pass in a starting board then we may do something like:
class Game
  constructor: (startingBoard) ->
    @board = startingBoard
myGame = new Game [BOARD]
We need to be aware that if we ever call new Game without passing in a startingBoard, then instead of returning null or a value, a call to myGame.board[0] will throw an error:
TypeError: Cannot read property '0' of undefined
We need to account for the case where a startingBoard is not provided, and initialise the board to an empty state. We may choose:
class Game
  @board = [1]
  constructor: (startingBoard) ->
    @board = startingBoard
myGame = new Game [2]
This will appear to work, as myGame.board[0] will return 2. Unfortunately, as soon as you try:
myGame = new Game
then myGame.board[0] will throw a TypeError as above, because we've just assigned 'undefined' from Game's argument list to @board. To prevent this we can instead make use of CoffeeScript's existential operator in order to check that a startingBoard was set like so:
class Game
  @board = [1]
  constructor: (startingBoard) ->
    @board = startingBoard if startingBoard?
myGame = new Game 
This also doesn't work, because myGame.board[0] will still throw a TypeError. The problem is (as mentioned above) that 'this' in the constructor and 'this' in the class refer to two different objects: @board = [1] in the class body sets a property on the class itself, which instances never see. If instead we were to use:
class Game
  board: [1]
myGame = new Game 
Now myGame.board[0] returns 1 as we expect, as board is a public property of the Game class, which is 'inherited' by new instances of Game, and 
class Game
  board: [1]
  constructor: (startingBoard) ->
    @board = startingBoard if startingBoard?
myGame = new Game [2]
Now myGame.board[0] returns 2 as expected, and if we initialise myGame = new Game without arguments, then myGame.board[0] will return 1, and all is happy. 

Because of the way that the class operator in CoffeeScript populates the prototype object with class attributes and methods, we can be sure that the overridden @board in the constructor is affecting the returned object's board attribute and not the prototype's attribute, hence if we have:
myGame = new Game [2]
myOtherGame = new Game [3, 4]
Both myGame and myOtherGame refer to different objects with different board attributes. All is now well in class-land. 
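
To make the difference between the class-body forms concrete, here is a rough sketch in plain JavaScript of where each form puts 'board'. It is a hand-written approximation rather than actual compiler output, and the names StaticGame and ProtoGame are made up for illustration:

```javascript
// Hypothetical names; a hand-written approximation of the CoffeeScript
// classes above, not real compiler output.

// '@board = [1]' in the class body: a property of the constructor
// function itself, which instances do not inherit.
function StaticGame(startingBoard) {
  if (startingBoard != null) {
    this.board = startingBoard;
  }
}
StaticGame.board = [1];

// 'board: [1]' in the class body: a property of the prototype, which
// instances inherit until they set their own 'board'.
function ProtoGame(startingBoard) {
  if (startingBoard != null) {
    this.board = startingBoard;
  }
}
ProtoGame.prototype.board = [1];

var a = new StaticGame();
var b = new ProtoGame();
var c = new ProtoGame([2]);

console.log(a.board);                       // undefined
console.log(StaticGame.board[0]);           // 1 (lives on the constructor)
console.log(b.board[0]);                    // 1 (inherited from the prototype)
console.log(c.board[0]);                    // 2 (own property shadows the prototype)
console.log(ProtoGame.prototype.board[0]);  // 1 (prototype is untouched)
```

One caveat with the prototype form: every instance created without a startingBoard shares the same default array, so mutating it in place (for example with push) changes the default seen by all of them.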

Privacy

Now things get more interesting. Many developers who come from an OO background are wondering where Javascript's (and CoffeeScript's) private attributes and methods are. Although there is the convention that private attributes and methods are prefixed with an underscore e.g. _privateMethod this isn't 'real' privacy (i.e. enforced at compile-time or run-time), it is just convention. 

Some of the more experienced readers are familiar with Douglas Crockford's "module pattern", which uses the properties of a lexical closure to hide information from code that is outside the closure e.g.
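module = do ->
  # 'secret' is only visible inside this closure
  secret = 0
  # 'public' is treated as a reserved word by CoffeeScript, so the returned object is named 'api'
  api =
    get: -> secret
    inc: ->
      secret++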
In our module we use an immediately invoked anonymous function to set up a closure and return an object with two methods: get() and inc() which get the value of our private 'secret' variable and increment it respectively. 

Trying to access module.secret will return undefined, and trying to set module.secret will create a public attribute of the returned object that will have no relation to or access to our private variable referred to in the closure. You can try this for yourself at http://jsfiddle.net/2SbpB/

So how do we use this with classes in order to give us class and instance private methods and attributes?

Class warfare

Let's go back to our simple Game class:
class Game
  board = []
  constructor: (start) ->
    board = start
  getBoard: -> board

myGame = new Game [2,3]
Example JSFiddle at http://jsfiddle.net/sxWET/1/

Accessing myGame.board just returns undefined, but myGame.getBoard()[0] returns the expected value 2. This works great, and works equally well for methods too, until you realise that board, although private, is actually a private class attribute shared amongst all instances and not a private instance attribute. This may not be such a big issue in the above example, where Game may very well be a singleton, but in other cases, e.g. a Player class, it is a much bigger deal. You can see how this breaks down here: http://jsfiddle.net/AV7WS/
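
The shared-state problem can also be reproduced in a few lines of plain JavaScript. This is only a sketch: the Player class and its 'name' attribute are invented for illustration, and the wrapper function approximates the way a CoffeeScript class body is evaluated once rather than once per instance:

```javascript
// Hypothetical Player class. The outer function runs exactly once, so
// 'name' is a single variable shared by every instance.
var Player = (function () {
  var name = 'nobody';

  function Player(start) {
    name = start; // overwrites the one shared variable
  }

  Player.prototype.getName = function () {
    return name;
  };

  return Player;
})();

var alice = new Player('Alice');
var bob = new Player('Bob');

console.log(alice.name);      // undefined (not reachable from outside)
console.log(alice.getName()); // 'Bob' (constructing bob overwrote it)
console.log(bob.getName());   // 'Bob'
```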

So how do we get private instance attributes? We can assign the attributes and functions that access them to be part of the object returned by the constructor like so:
class Game
  constructor: (start = []) ->
    board = start
    @getBoard = -> board

myGame = new Game [2,3]
hisGame = new Game [4,5]
Each object's 'board' attribute is private and cannot be accessed or changed from outside the object. Try it out here: http://jsfiddle.net/fqhb7/ 

This works, sort of, but other attributes and methods of the class have no access to board, because they're not defined within the constructor's closure. Hmmm. 

Now realise also that each object instance now has its own attributes *and methods*, so if you're dealing with large numbers of objects performance will suffer. We're getting to the point at which we're realising that this is an ugly hack which has taken us outside the envelope of what the environment was designed to do.  
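
Both drawbacks are easy to see in plain JavaScript. The following is a minimal sketch, again a hand-written approximation rather than compiler output; the firstCell method is a made-up example of a prototype method that needs the board:

```javascript
function Game(start) {
  if (start == null) {
    start = [];
  }
  var board = start;          // private, one per instance
  this.getBoard = function () { // a new function object per instance
    return board;
  };
}

// Hypothetical prototype method: 'board' is not in scope here, so it
// has to go through the public accessor.
Game.prototype.firstCell = function () {
  return this.getBoard()[0];
};

var myGame = new Game([2, 3]);
var hisGame = new Game([4, 5]);

console.log(myGame.board);                            // undefined
console.log(myGame.firstCell());                      // 2
console.log(hisGame.firstCell());                     // 4
console.log(myGame.getBoard === hisGame.getBoard);    // false (one closure each)
console.log(myGame.firstCell === hisGame.firstCell);  // true (shared via the prototype)
```

Note that getBoard hands out a reference to the private array, so a caller can still change its contents; returning a copy (for example board.slice()) would close that gap at the cost of an allocation per call.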

I'd love to hear if anyone has found a more elegant way round this, but the general consensus seems to be "stop trying to make CoffeeScript/Javascript into something that it isn't, work within its limitations and try and take advantage of its strengths, rather than trying to morph it into another type of language by using fake typing workarounds".

Something else I'd also like to add: CoffeeScript's default of compiling files into anonymous closures removes a lot of the danger of pollution of the global namespace. If you're using node.js then you have the use of node's module system, with require and exports, and if you're working in the browser then you're definitely going to be interested in require.js which works in a very similar way. Both of these mechanisms provide much of the safety of private attributes and methods, except bounded by files rather than by classes.

We're seeing various ideas batted around for proposals for ES6 with regard to classes, Microsoft has already jumped the gun with TypeScript, and Google has jumped the shark with Dart. We'll see how effectively these two manage to square the circle as their products mature. Personally I'll be taking the road less travelled[1] and looking more closely at how we can use idioms from Self and Scheme to help our Javascript and CoffeeScript development. Stay tuned...

References: 
[1] "The Road Not Taken", Robert Frost, 1916, http://www.poets.org/viewmedia.php/prmMID/15717

Thursday, 25 October 2012

Cost of Poor Quality



"There's never enough time to do it right, but there's always time to do it over" -- various.

COPQ or 'Cost of Poor Quality' is a term often heard in engineering and manufacturing, where broken or defective product is often readily apparent, but rarely heard in software development. I'm going to start out with a quite restrictive definition of COPQ to begin with, principally because I'd like to start by surfacing COPQ by using things that are easily identified and measured, though in reality COPQ can encompass many more items that are less readily measured like 'difficult to extend' and 'unintuitive UI'.

Very few of us in the 'Digital Economy' actually enjoy producing poor quality software - we do it because we feel that we have no other choice, that circumstances, or people, drive us to. The purpose in adopting a relatively easy-to-measure definition is so that you can build a simple-to-understand business case for doing software right, and in doing so not only save your organisation a shed-load of cash, but also be able to stand by your work as something that you are proud of instead of something you're glad to be rid of.

COPQ =
cost of rework
+ cost of technical debt
+ cost of late delivery
+ cost of lost revenue

Here we define 'cost of rework' to be the amount of effort spent reworking code that has passed out of development, but has come back due to failing some aspect of testing, whether unit, system, regression, integration, UAT or deployment.

We define 'cost of technical debt' to be the very narrow cost of fixing bugs that have made it into production.

We define the 'cost of late delivery' to be the opportunity cost per unit time of not having the system ready, multiplied by the duration of the schedule slippage.

The 'cost of lost revenue' is in italics because this is more difficult to measure, and we really just need the first three in order to build a compelling business case. Let's work through these three items one at a time.

Example project
We'll consider an example project of 100,000 lines of code, or 100 KLOC. The COCOMO average developer productivity for a 100KLOC project is 330LOC/Person-Month[4]. This gives an effort calculation of 302 Person-Months. At an average Java developer salary of £47000/year[2], multiplied by a load-factor of 1.5 gives us a loaded-cost of £70,500/year or £5875/month, yielding a total project cost of £1.8m. Bearing in mind that only about 36%[4] of this cost is development, this equates to approximately £6.50/LOC.

Cost of rework
The industry average amount of rework is £1.75/LOC[3], leading to £175,000 rework costs. If your project is Java-based, you're out of luck: the industry average rework cost for Java is £3.36/LOC. Total cost of rework on a Java-based project is therefore £336,000.

Cost of technical debt
The industry average level of technical debt is 5 defects per KLOC[5], therefore we can expect to find and fix 500 defects in our 100KLOC programme before delivery. The cost to fix a defect, averaged over all classes of defects, is 5 hours per defect. Assuming an 8 hour working day and a 65% effectiveness, we can fix 22 defects per person-month. Each defect therefore costs us about £270, and we require 22.7 Person-Months to fix all of the defects, which equates to £133,500.

Cost of late delivery
The opportunity cost of a late delivery can sometimes be huge, but as we don't have industry figures for these risks and impacts we'll use something more easily measured: delay. The average schedule slippage on IT projects is 27%[6]. Note that HBR found that fully 1 in 6 of the projects surveyed was a 'Black Swan' with a cost overrun of (on average) 200% and an average schedule overrun of 70%. If the total cost of the project is £1.8m and the overrun is 27%, then this yields an average cost of late delivery of £486,000.

This gives us a total COPQ of £336,000 + £133,500 + £486,000 = £955,500, or 53% of the original programme budget; that's quite significant, by anyone's standards. What this tells us is that: a) the IT industry as a whole still isn't very good at controlling quality, cost or schedule, b) it is ironic that so many clients (and vendors) still seem to think that a fixed cost, fixed schedule, fixed scope contract is the default mechanism for commissioning an IT project.
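
The arithmetic above can be re-run with a few lines of JavaScript, which also makes it easy to substitute your own project's figures. The inputs are the ones quoted in the text; the variable names are just illustrative, and because nothing is rounded along the way the results differ slightly from the rounded figures in the prose:

```javascript
// Inputs as quoted in the text (GBP). Variable names are illustrative.
var linesOfCode = 100000;
var projectCost = 1800000;                  // rounded total project cost
var loadedMonthlyCost = 47000 * 1.5 / 12;   // loaded cost per person-month
var reworkCostPerLoc = 3.36;                // Java figure
var defectsPerKloc = 5;
var defectsFixedPerPersonMonth = 22;
var scheduleSlippage = 0.27;

var defects = defectsPerKloc * linesOfCode / 1000;

var costOfRework = reworkCostPerLoc * linesOfCode;
var costOfTechnicalDebt =
  (defects / defectsFixedPerPersonMonth) * loadedMonthlyCost;
var costOfLateDelivery = scheduleSlippage * projectCost;

var copq = costOfRework + costOfTechnicalDebt + costOfLateDelivery;
var shareOfBudget = copq / projectCost;

console.log('Rework:         ' + Math.round(costOfRework));
console.log('Technical debt: ' + Math.round(costOfTechnicalDebt));
console.log('Late delivery:  ' + Math.round(costOfLateDelivery));
console.log('COPQ:           ' + Math.round(copq));
console.log('Share of budget: ' + Math.round(shareOfBudget * 100) + '%');
```

The cost of lost revenue is left out here, as in the text, because it is the hardest of the four terms to measure.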

These figures should be easy enough to come by for your own projects, what is the COPQ of your current or most recent project? With these figures in mind there is surely a sound business case for spending the time and money to do it right the first time, and make all our lives easier and more productive.

References:

[1] Jeff Atwood, "Diseconomies of scale and Lines of Code", 2006, (link) references: Barry Boehm et al. "Software Cost Estimation with COCOMO II", 2000
[2] Current average UK Java salary from ITJobsWatch on 23 Oct 2012, (link)
[3] CAST Report on Application Software Health, 2011, (link)
[4] COCOMO cost-model calculator, NASA, (link)
[5] '3 harmful metrics and 2 helpful metrics', Capers Jones for CERM 2012, (link)
[6] 'Why Your IT Project May Be Riskier Than You Think', Harvard Business Review 2011, (link)

Tuesday, 28 August 2012

Value, and Throughput Accounting


There are many good articles written about how business value is the unacknowledged key driver of software development, and how Throughput Accounting gives businesses a new way to look at making business decisions based upon value delivered over time rather than Cost Accounting, which focusses on a simplistic view of cost-reduction as the main lever to control a business. But I'm not going to talk about that today.

Instead, let me ask a simple question: How do you prioritise a backlog? 

The answer is also simple: each backlog item has a value to the business and a cost. Cost is a function of actual cost to build and a weighted 'cost of risk', which is the sum of risk items individually multiplied by both their probability and their downside cost impact. 
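
As a rough sketch of that answer, the cost of an item can be computed as its build cost plus the probability-weighted sum of its risks. The backlog items and all of their figures below are invented, and ranking by value per unit of cost is an assumption made for the example rather than something prescribed here; in practice every input is an estimate rather than a known quantity:

```javascript
// Weighted cost of risk: sum of probability * downside impact.
function riskCost(risks) {
  return risks.reduce(function (sum, risk) {
    return sum + risk.probability * risk.impact;
  }, 0);
}

// Total cost: cost to build plus the weighted cost of risk.
function totalCost(item) {
  return item.buildCost + riskCost(item.risks);
}

// Assumption: rank by value delivered per unit of cost, highest first.
function prioritise(backlog) {
  return backlog.slice().sort(function (a, b) {
    return b.value / totalCost(b) - a.value / totalCost(a);
  });
}

// Hypothetical backlog; every figure is made up.
var backlog = [
  { name: 'export', value: 30000, buildCost: 10000,
    risks: [{ probability: 0.5, impact: 20000 }] },
  { name: 'search', value: 40000, buildCost: 15000,
    risks: [{ probability: 0.25, impact: 20000 }] },
  { name: 'login', value: 12000, buildCost: 4000, risks: [] }
];

var ordered = prioritise(backlog).map(function (item) {
  return item.name;
});
console.log(ordered.join(', ')); // login, search, export
```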


Neither cost nor value are generally known in advance, both are estimates. Why? 

Value is an estimate in most situations; there are some situations in which value is known to a fairly high degree of certainty e.g. 3rd party supplier contracts i.e. a customer has given you a fixed price contract to implement a feature. Even under such circumstances the value is not 100% certain, the customer may pull out, may go bankrupt, may renegotiate half-way through implementation for a lower-cost solution, etc. 

Cost is an estimation because not only the 'cost-of-risk' part of the cost is an estimation, but the actual 'cost to implement' is also not known with certainty beforehand. Even in those situations in which it might be thought that the 'cost to implement' component is known e.g. buying in a third-party Commercial Off-The-Shelf solution (COTS) there are always extra unforeseen costs that cannot be accounted for up-front e.g. cost to integrate, cost to deploy, cost to configure, cost to manage the project etc. 

In summary: cost is an unknown, and value is an unknown. There are ways of assessing and managing these levels of uncertainty in advance; these methods are well known to scientific researchers who even in the hard sciences (I'm speaking as a Physics graduate) must take into account random and non-random variances and look for statistically significant correlations in their observed experimental data. In over twenty years in the IT industry I have not seen any calculations of cost and value that would come anywhere near passing the levels of rigour necessary for the peer-review of a scientific paper. Now, you may argue that these levels of rigour are unnecessary outside of the scientific community. I could offer the counter example of the UK government's National Programme for IT (NPfIT), the £22bn technology revitalisation programme for the UK's National Health Service (NHS), but although I think that programmes like these do indeed need appropriate levels of rigour, I will instead make the simple observation that the vast majority of IT projects I have worked on have had no significant value analysis at all.

Let me say that again: more than 90% of Product Owners, Business Analysts, Project Stakeholders and Executive sponsors that I've dealt with over the last twenty-something years have had nothing more than a gut-feel about what a feature is worth. Not even a 'back of a fag-packet' style calculation. Now I'm guessing that somewhere, in a board room, some kind of business case was put forward for the project, and there were some kind of expected costs and benefits analysed, and some kind of ROI figures bandied about, but none of that kind of high-level information ever made it down to the project teams, despite it being an essential part of the PRINCE2 requirement for a business case document, a mandatory part of the one essential document (Project Initiation Document) that is key to the starting of any project.

Question: If no-one knows what a feature is worth, how can it be prioritised? Answer: simple - it is prioritised according to two common factors: a) how hard a stakeholder is willing to argue against his political rivals for its importance, and b) cost.

I would be prepared to argue that most of the debate that has taken place in Agile circles about Story Points and Estimation (of which whole books have been written) has been driven by the lack of any indication by the business of what the actual value of a Feature is, because if the only tool you have is a hammer (cost) every problem begins to look like a nail (more accurate and timely cost-reduction). 

The next time someone - Product Owner, Business Analyst etc. - asks you for an estimate of the cost to develop a feature, try saying "$10,000 plus or minus 1000%". When they demand - and I'm betting they will demand - a more accurate estimate say "I'll give you a more accurate calculation of cost when you give me an equally accurate calculation of business value."

Nine times out of ten you'll be safe in the knowledge that not only do they not know the value of a feature right now, but they have no way of determining the value of a feature either. A shocking state of affairs, but why should you be forced to do more work to cover up for someone else's lack of professionalism? As long as the conversation remains focussed on the red-herring of cost, the true subject of import: value (and hence throughput) will remain unaddressed. 

In a future article I'll talk about different kinds of value, and how we can use Net Present Value (NPV)-style calculations to drive backlog prioritisation, but really, it's better to crawl first before trying to run and tripping over the shoelaces that the business has tied together for us. 

Monday, 14 May 2012

Crowd-sourcing a better place to work


Love the colour scheme!
What would be your ideal work environment as a team member working on a cool technology project in central London? This is not (just) about the office space, it's about the team and the company you work for. I'd like to hear people's top three items about what they'd love to see for their perfect job. Why on Earth am I asking? Well, I'm helping set up a software company in central London, and I'm fascinated by the idea that the 'wisdom of crowds' can help me discover some of my 'unknown unknowns'.

Although I have a whole bunch of ideas, I will start the ball rolling by giving my own top three:
  1. A really great team - I don't have to agree with them, but I need to be able to respect them and be able to learn from them too. They're team players but they're not afraid of disagreement, not in an elitist jerk kind of way but in a "based upon what I know now, I don't agree; let's have an open conversation and I want you to convince me" kind of way. Without a great team, everything else is a non-starter. 
  2. An organisation that respects people - and by this I mean one that treats the team as grown-ups, like the professionals that they are, not like little kids that need to be told what to do and then watched over to make sure that they don't do anything naughty. This permeates throughout every aspect of an organisation, from pay and benefits to holiday policy to providing working materials. I want to feel that the company gives me the space to demonstrate that I am worthy of trust and respect, but starting from an assumption of adult professionalism, not kindergarten.
  3. A culture of excellence - by this I don't just mean that people are good at their jobs, and the company produces products that excite and engage the customer, but an organisation that understands that IT is essentially a learning business, and that we need time and space to learn and improve - not just technology and skills, but about the market, about our customers, about business and about each other. Those who learn fastest can adapt fastest and lead the market; today's startups are about learning faster than the competition, not doing faster or changing faster.
There's lots more, but part of my own journey is having the discipline to listen more, and talk less. I'd love to hear what your top three are...


Sunday, 13 May 2012

Net-negative producers


A friend asked me how a set-top box software team could possibly have 20 people on it; surely 5 or so should be sufficient for what is basically a UI app to parse a programme guide and stream some video?

In answer I told him the story of when I started on an assignment for H3G in their new posh glass building in Maidenhead, working in the 3G products division in early 2002 producing what were essentially a bunch of web apps. I asked my boss how many people were working on the project: "350" was his reply. My jaw dropped to the floor. "Don't worry," he said with a smile on his face "all the real work is being done by nine guys working in a back room of the pub over the road. This new glass building is just a decoy to confuse Vodafone and Orange!"

How on earth could it possibly take 350 people (DBAs, Product Managers, Architects, BAs, QAs, Developers) to produce a bunch of simple web apps? I simply couldn't get my head around it; I had assumed there would be perhaps 25-30 people, at most.

This was a pattern that was to repeat itself, and it took me a while to understand why. Back in my early days in the IT industry I assumed that experience and productivity were related by a linear function like productivity = (experience) times a constant: let's call it 'a', giving us p = ax. This implies that if a developer had twice as much experience then he would be twice as productive, more or less. The following diagram shows what I mean:


It took a few years before the true scale of what was going on became apparent: good developers with twice as much experience were not twice as productive, they were four or more times as productive. The relationship between experience (or perhaps ability) and productivity wasn't just polynomial, it was more like exponential growth! p = a^x


As it turns out there has been plenty written about this from Steve McConnell[1] (the research that shows great programmers are 20-25 times better than mediocre ones), to Mark Zuckerberg's[2] famous quote that “Someone who is exceptional in their role is not just a little better than someone who is pretty good, they are 100 times better”. So this isn't exactly new news. People debate this in the IT industry back and forth all the time, but I've noticed that the people who are great developers, or who are managers or executives who have worked with great developers, simply take it as empirical fact, and it only ever gets debated by people who haven't actually experienced the phenomenon (or actually read the research). It did however take a few more years for the real penny to drop, and that is that the curve I had drawn in my head sits too high: it needs shifting downwards so that it dips below zero at the low end. Instead of p = a^x the function is more like p = a^x - b


The end result of this is that the poor developers who are swimming in the shallow end of the gene pool are not just poorly performing, they are net-negative producers! What does this mean? This means that the programmers to the left side of the graph are the ones producing the bugs that the people in the middle portion of the graph are fixing, they are the ones inserting misleading comments, writing spaghetti code and implementing poor design that the developers further to the right are wasting time fixing, deleting, refactoring and redesigning. If your organisation doesn't have many people in the far right hand side of the graph, then it is easy to see that no matter how many developers you have, you might not be making any forward progress at all, although you will be burning a ton of cash! I've illustrated this with some shading below:


Thus you need more people to the right of the blue section in order to outweigh the net-negatives. If you don't have many people on the right-hand side of where we cross the x-axis, then the reason for the lack of progress becomes obvious. It is this graph that explains the original question: how can you have so many developers and yet be accomplishing so little? This is something I've been explaining to people for years. 
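For anyone who prefers arithmetic to graphs, the p = a^x - b model can be sketched in a few lines. The values a = 2 and b = 4, the 1-to-4 ability scale and the two team line-ups are all assumptions picked purely to make the sums easy to follow; they are not measurements of anything.

```python
def productivity(x, a=2.0, b=4.0):
    """Net output of one developer of ability x under the p = a^x - b model."""
    return a ** x - b


def team_output(abilities, a=2.0, b=4.0):
    """Sum of individual net outputs; negative means the team is going backwards."""
    return sum(productivity(x, a, b) for x in abilities)


# Hypothetical teams on an illustrative 1-4 ability scale.
big_team = [1] * 16 + [2] * 3 + [4]   # 20 people, mostly at the low end
small_team = [3, 3, 4, 4, 4]          # 5 people, all right of the x-axis crossing

print("big team:  ", team_output(big_team))
print("small team:", team_output(small_team))
```

With these made-up numbers the team of twenty comes out net negative while the team of five is comfortably positive, which is the whole point of the shaded graph: headcount on the wrong side of the crossing point subtracts rather than adds.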

Corollary: As yet another example of how pretty much every idea I've ever had has already been thought up by somebody cleverer, I found an example of prior art on Ward Cunningham's wiki[3] referencing work by G. Gordon Schulmeyer - in his own words:

"We've known since the early sixties, but have never come to grips with the implications that there are net negative producing programmers (NNPPs) on almost all projects, who insert enough spoilage to exceed the value of their production. So, it is important to make the bold statement: Taking a poor performer off the team can often be more productive than adding a good one."

This. This is why last year a major household name came to us asking us why with more than 120 off-shore Java developers on a project, they couldn't make any progress. Answer: (as harsh as it may sound) employ professionals that can actually do the job. As Red Adair said when a potential customer blanched upon being told how expensive his rates were: "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur."

Wednesday, 21 March 2012

Documentation and Agile

Today I saw a question on the LinkedIn Agile forum:
Where do you go to see what the code "really" does besides asking a developer?
What I worry about are where are the final documents that describe what the software does? That seems to get lost in Agile. And is sooooo important for long term maintenance, Technical Support/Call Center and years later Product Owner updates.

There are all kinds of assumptions and world-views wrapped up in this, including the old bugbear "we're Agile, so we don't need documentation", which is not only plainly contradicted by the Agile Manifesto but is obviously false to anyone who understands the principles and values upon which Agile is built. Documentation has value, but it also has cost. Where there is no better way of doing something, we will create documentation, but where there is an option with a higher benefit-to-cost ratio, we will use that instead.

The original poster's question above needs Root-Cause Analysis - example: why do you need documentation? Could it be because:

a) I am a new guy and I want to learn how the system works?
b) I am a customer service person and I want to evaluate whether what the customer is describing is a bug or the way the system normally works, so that I can close the case?
c) I am a support triage manager and I want to rate this case as a defect or a change request or WAD ('Working As Designed') so that I can send it to the appropriate team?
d) I am a Product Manager and I want to know what the system was specified to do so that I can construct a product roadmap?

All of these answers embody one or more organisational dysfunctions:

a) You need an on-boarding programme with proper training and mentoring; just sitting the poor guy down with a user manual is not the answer to making new staff effective quickly.
b) CSMs shouldn't be closing cases as WAD/"Not a bug"/whatever - the role of the CSM is to make the customer happy, and this means understanding what the customer's issue is and following through. Telling them, on the strength of your reading of the manual sitting on your desk, that "this is the way the software is supposed to work" isn't helping the customer. Gain understanding, formulate a plan of action with others, execute that plan and monitor and follow through with the results. CSMs should 'own' the customer experience.
c) Different siloed teams handling defects vs. change requests is a 'team smell'. People closing customer issues because the software is WAD is a process and a team smell. Actually having so many product issues that you need a support triage manager is an 'everything' smell!
d) If the Product Manager doesn't know what their product should do, God help you all! (Hint: satisfying the customer is a good start). As DHH said: you don't need an issue tracking system to tell you what is wrong with the product, you just need to listen to your customers, because they will tell you *every day* until it's fixed!

Going back to the original question: "How do I know what the software does if I don't have any documentation?"

1. If you want to know what the software does - actually use the system. 'Normal' behaviour is quite obvious to a regular customer.
2. If you want to know what customers want the system to do, simply ask people that field customer support calls (which should include your entire team on rotation at least once a month, everybody, no exceptions, even your execs). Quite frankly, it is irrelevant what someone specified the system to do X months or even years ago - whether it was a Product Manager or your customers - if none of your customers actually want the system to work like that now...

If you want documentation as contracts, then automated tests (acceptance, unit, integration, etc.) are the best solution, as every build has to fulfil these contracts. The works of Dan North and Gojko Adzic are valuable starting points here. Next stop, the code. If neither your tests nor your code are readable, you have a larger problem than not having a specification document!
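To make "tests as contracts" concrete, here is a minimal sketch using Python's standard `unittest` module. The `shipping_cost` function and its free-shipping rule are entirely hypothetical, invented for illustration; the point is only that the test names read as the specification and that every build either honours them or fails.

```python
import unittest


def shipping_cost(order_total):
    """Hypothetical business rule: orders of 50.00 or more ship free,
    everything else pays a flat 4.99."""
    return 0.0 if order_total >= 50.0 else 4.99


class ShippingCostSpecification(unittest.TestCase):
    """Each test name is one line of the 'document'; the body proves it."""

    def test_orders_of_fifty_or_more_ship_free(self):
        self.assertEqual(shipping_cost(50.0), 0.0)

    def test_orders_under_fifty_pay_the_flat_rate(self):
        self.assertEqual(shipping_cost(49.99), 4.99)
```

Saved in a test module and run with `python -m unittest`, this fails the moment somebody changes the threshold without changing the specification, which is exactly what a Word document on a shared drive can never do.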

The problem with written documentation that isn't executable (i.e. anything other than tests or code) is that it quickly falls out of sync with the product. It adds little or no value over the tests and code for the people actually writing it, and there is no automated method of validating the document against the actual product, so nobody can automatically verify that it is still correct. As a consequence simple human nature steps in; we are lazy forgetful creatures that don't like duplicating work we've already done. Result: the documentation is always a lie; as time progresses, the lie gets larger. Behaviour-Driven Development (BDD) or its child Specification By Example (SBE) can really help here.

In the past the documentation has been the contract between the customer and the supplier, between the marketing function and the IT function, between the development team and the support team, between the architecture team and the development team, between the development team and the operations team... The Agile way of working is to shift the emphasis away from contracts and towards working software as the primary metric of success, and to minimise contracts as mechanisms for limiting change and to shift to a new way of working based upon collaboration and prioritising change effectively.

This fundamental change in the way of working makes people who are used to working with written contracts all their life uncomfortable. Many people don't like change, many organisations don't like change. Agile isn't for everyone, and it isn't a silver bullet. Don't feel that you have to 'embrace Agile' if your organisation is not on-board with this kind of radical organisational change management programme.

For those that are committed to this kind of change, the rules of Simple Design, good Stories and acceptance criteria, and SBE/BDD (with their associated TDD and CI) are an excellent place to start with making the code and tests better fulfil the functions previously taken by documentation.

Saturday, 10 March 2012

The invisible deficit


I read with interest a tweet from Kent Beck the other day as it seemed to ring a bell: "the complexity created by a programmer is in inverse proportion to their ability to handle complexity". He followed up the tweet with a note on his Facebook page explaining that he'd been doing a code review of a developer's code and noticed that the guy didn't recognise that the solution he'd adopted was needlessly complex compared to the problem, but the developer simply couldn't see it. Kent finished with "The programmer least likely to be able to handle the extra complexity is exactly the one most likely to create it. Seems a little unfair. I'm interested in how to break this cycle, and whether it is even possible to break this cycle." (emphasis mine).

A link in the comments led me to the Dunning-Kruger effect, and a light bulb came on as I recognised a friend's comment from last week that none of the rest of the developers on his team were able to code for toffee, indeed this sprint all they had been tasked to do was to review his code, and by the time he'd left the office on Friday evening they'd failed to do even that! Over a beer he opined that he'd be better off bringing in a machete to work on Monday and just hacking them all to death and then he'd be able to work faster! Well, he is Italian, so I can forgive him the exaggeration. Having spent much of my time over the last three months interviewing Java developers for one of our clients I can empathise with my friend's feelings: there are a lot of mediocre developers out there, particularly in the Java world (I refer to Java as "the new COBOL").

So what is the Dunning-Kruger effect? Dr. David Dunning, a Cornell professor of social psychology, in 1996 read an article describing the arrest of a Pittsburgh bank robber called McArthur Wheeler; undisguised and in broad daylight the guy had attempted to hold up a bank, his photos were later released to the press and he was arrested and charged. He had been under the sadly mistaken belief that if he smeared his face in lemon juice it would make him invisible to video cameras. Yes, really. Not only was he too stupid to be a bank robber, he was too stupid to know that he was too stupid to be a bank robber!

Dunning wondered whether it was possible to measure a self-assessed level of competence against actual competence. Soon he and his graduate student, Justin Kruger, had organised a program of research and three years later their paper, “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments,” was published.

Surprise surprise! Kruger and Dunning found that incompetent people will:
  • Tend to overestimate their own level of skill;
  • Fail to recognise genuine skill in others;
  • Fail to recognise the extremity of their inadequacy.



In one fell swoop they have explained both why every taxi driver you've ever met is an expert in politics and international diplomacy, and why a psychology study showed 93% of drivers rate their driving ability as 'above average' [O. Svenson, Acta Psychologica, 1981]. The thing is, this knowledge isn't exactly new; empirically, wise men have known this for a long long time, for example:
  • "One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision" - Bertrand Russell
  • "Ignorance more frequently begets confidence than does knowledge" - Charles Darwin
  • “The only true wisdom is to know that you know nothing.” - Socrates
What Dunning has identified is a form of cognitive bias (something I talk about a lot, in relation to software development) based on the metacognitive inability of the lowly-skilled to recognise their shortcomings. This correlates strongly with Noel Burch's "Four stages of competence":
  1. Unconscious incompetence
  2. Conscious incompetence
  3. Conscious competence
  4. Unconscious competence

What we are seeing is the first stage, and as Burch correctly identified, moving from stage 1 to stage 2 is complicated by the fact that, being in stage 1, we are not even aware that there is something more to learn; like the blind spot on the retina, our mind rationalises reality around our lack of awareness to protect our self-esteem. Indeed, as Dunning says: "You can call it self-deception, but it also goes by the names rationalisation, wishful thinking, defensive processing, self-delusion, and motivated reasoning. There is a robust catalogue of strategies people follow to believe what they want to, and we research psychologists are hardly done describing the shape or the size of that catalogue!"

Faced with such a situation, how are we to move from Burch's stage 1 to stage 2 and become aware of our incompetence? To get out of the trap of cluelessness, the not knowing what we do not know, requires two things:
  1. Exposure to new information;  
  2. The ability to recognise that new information as useful and pertinent.  
The former happens all the time; it is the latter that presents the problem. As Kent asks, is it even possible to break the cycle? It occurred to me that, just as it is almost impossible to spot your own mistakes when writing, it is almost impossible to spot your own limitations when coding. Agile has built up a body of practices that rely on having another pair of eyes with another set of experiences and world-views to see what you are doing and to help you become more aware of your own limitations - these include (but are not limited to):
  • Peer review (not Agile per se, but worth mentioning)
  • Pair-programming
  • Retrospectives
  • Three amigos
  • Iteration showcase
  • Cross-functional teams
  • Face-to-face interaction
  • Daily stand-ups
  • Test-Driven Development (the computer is the observer)
  • Continuous Integration (the build machine is the observer)
All of these things allow us opportunities for feedback, to see beyond our own limitations, and give us opportunities to grow and learn; I believe this focus on collaboration is one of the core strengths of Agile and why it is more than just a 'bandwagon' or 'silver bullet' and truly represents a change for the better in Software Development.