These are not the NULLs you are looking for…

Just like Obi-Wan Kenobi messing with minds using the force, allowing null values in database fields can lead to undesired behavior.  Null values really aren’t a ‘value’…they are an absence of value.  They act more like a field state than a referenceable value.  This can lead to interesting behavior when querying a field based on value.

For example, consider the following table:

ID   field    Value
0    foo      this is a foo test
1    [null]   null test
2    boo      this is a boo test
3    goo      this is a goo test
4    [null]   another null test

If you issue a query like this:

Select * from table where field != 'foo'

One would normally look at this and think, “I’m going to get all the rows where field is not equal to ‘foo’”.  What you are really going to get is all the rows where field is not null and the value of field is not equal to foo. In other words, this is your result:

ID   field    Value
2    boo      this is a boo test
3    goo      this is a goo test

Which may or may not be what you are expecting.  Because you reference a value for field in the where clause, every row where field has no value (i.e. is null) is left out of the result.  Comparing null to ‘foo’ yields neither true nor false but ‘unknown’, and a where clause only keeps rows for which the condition is true.  If you were expecting those rows to be in the result set you have to explicitly reference them as shown in the following query:

Select * from table where (field is null or field != 'foo')

This query will return the result:

ID   field    Value
1    [null]   null test
2    boo      this is a boo test
3    goo      this is a goo test
4    [null]   another null test
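
To try the two queries side by side, here is a minimal sketch using Python's built-in sqlite3 module.  The table name demo is an assumption (table itself is a reserved word in SQL), and other database engines should behave the same way for this comparison, though that is worth confirming on your own system.

```python
import sqlite3

# In-memory database; "demo" is a stand-in name for the example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, field TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO demo (id, field, value) VALUES (?, ?, ?)",
    [
        (0, "foo", "this is a foo test"),
        (1, None, "null test"),  # None is stored as NULL
        (2, "boo", "this is a boo test"),
        (3, "goo", "this is a goo test"),
        (4, None, "another null test"),
    ],
)


def ids(sql):
    """Run a query and return the matching IDs in order."""
    return [row[0] for row in conn.execute(sql)]


# NULL != 'foo' evaluates to NULL (unknown), so the NULL rows are filtered out.
without_nulls = ids("SELECT id FROM demo WHERE field != 'foo' ORDER BY id")

# Naming the NULL case explicitly brings those rows back.
with_nulls = ids(
    "SELECT id FROM demo WHERE field IS NULL OR field != 'foo' ORDER BY id"
)

print(without_nulls)  # [2, 3]
print(with_nulls)     # [1, 2, 3, 4]
```

The first list matches the two-row result above and the second matches the four-row result.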

Depending on what you were expecting there is a wide variance between the two result sets.  When creating database tables one must take great care to decide whether NULL should be allowed and, if it is not allowed, what the default value should be.  When querying a table that has fields that allow NULL values one must take care in how those fields are referenced in the query to ensure that the right response is returned.
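
As a rough sketch of the other option, disallowing NULL and supplying a default, the following uses Python's sqlite3 module again.  The table name demo_default and the choice of an empty string as the default are assumptions made for illustration; the right default depends on what the field means in your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: NULL is disallowed and an empty string is the default.
conn.execute(
    "CREATE TABLE demo_default ("
    " id INTEGER PRIMARY KEY,"
    " field TEXT NOT NULL DEFAULT '',"
    " value TEXT)"
)
conn.execute(
    "INSERT INTO demo_default (id, field, value) VALUES (0, 'foo', 'this is a foo test')"
)
# No field supplied, so the column falls back to the default ''.
conn.execute("INSERT INTO demo_default (id, value) VALUES (1, 'no field supplied')")
conn.execute(
    "INSERT INTO demo_default (id, field, value) VALUES (2, 'boo', 'this is a boo test')"
)

# With no NULLs in the column, the plain inequality sees every non-foo row.
not_foo = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM demo_default WHERE field != 'foo' ORDER BY id"
    )
]

# An explicit NULL is rejected by the NOT NULL constraint.
try:
    conn.execute(
        "INSERT INTO demo_default (id, field, value) VALUES (3, NULL, 'explicit null')"
    )
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

print(not_foo)        # [1, 2]
print(null_rejected)  # True
```

The trade-off is that the empty string now has to mean "no value" everywhere the field is read, which is its own kind of convention to remember.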

Null values can be a valuable characteristic when referencing data but they can have undesirable consequences when not used appropriately.

KPI, Productivity Metrics, & Rock Fetches…oh my!

Productivity metrics for the common developer

So I’ve been thinking for the past few days about this subject, wondering how much I could say without setting off my Tourette’s or causing everyone’s eyes to roll back in their head.  I’m sure that most of the things I’m going to say will not come as a surprise nor will they be new concepts.  But I feel compelled to say them anyway.  Katie bar the door…

First, The Problem

We want a way of measuring performance.  We look to quantify, calculate, and project our daily activities in the never-ending search for improvement.  To quote the great Peter Drucker: “what gets measured improves” and “what gets measured gets managed”.  We want an empirical scoreboard to prove, justify, and show our performance and how it measures up to those around us.  These kinds of thoughts often lead to the conclusion “you can’t manage what you can’t measure” (a quote often attributed to Peter Drucker…only he never said it, more on this later).

In a regimented sport or activity, the rules for success or scoring are clear.  Yet even in these activities, additional statistical analysis proliferates as a way to indicate success.  Baseball is one of the greatest offenders (see Moneyball and sabermetrics), but statistics are used in virtually every other sport as well.  Still, only the score determines winning or losing, without regard to the metrics.  Not to pick at a sore, but consider the Super Bowl between the Seahawks and New England.  Seattle played well, by a fair number of statistics they played better than New England, and but for a single play at the end of the game going against them, Seattle wins.  The score was the ‘keyist’ performance indicator of success.  Nothing was more important than that.

The rules of business aren’t as clear as the NFL’s.  What determines winning or losing is much more subjective and less deterministic.  Even so, we search for key indicators to help us progress.  Yet the software development world has never come up with any meaningful metrics or statistics for individual software productivity that don’t do more harm than good.  Not for lack of effort, $$$, or desire.  Not the geniuses at IBM, not Microsoft, not Amazon (not even Apple or Google).  Neither has Wall Street, Silicon Valley, or Seattle.  They haven’t found any meaningful metrics because of the essential nature of the software development activity.

Key Principles & Attributes

Valuable metrics apply to repeatable tasks.  Otherwise, there is no correlation and definitely no causation.  That’s why metrics involving manufacturing, construction, customer onboarding, sales, etc. have meaningful insight.  Those activities consist for the most part of largely repeatable tasks.  The nature and order of the tasks might change but the task archetype holds true.  Software development is fundamentally different.  While delivering code is our output, that is NOT what we do (if it were you could use typing speed as a key productivity metric).  Software developers are knowledge workers.  Our job is to take inexact, complicated ideas & specifications and formulate specialized, exact, well-defined solutions.  In so doing, almost every task is new.  It’s like building a road across a bed of quicksand where every day one uses a wildly different & unknown pavement material.  Fred Brooks described it this way, “The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination.”  To paraphrase Steve McQueen, “We deal in abstractions, friend.”  Complex abstractions.  At best, our experience & knowledge help us to know if we understand the size of the problem, just not the detail (you could make the argument that if developers are consistently solving the same problem then the management really sucks…that metric should be something like: # of same problems solved = 1).  A good programmer can take a high-level complex problem, abstract out the essential complexity, and deliver a working solution in a foreign language (code) that can easily be understood and maintained by fellow programmers.  Some people are better at intuitively dealing with complex abstractions than others.  This is largely why the activity of programming frustrates some people and attracts others (oh look, shiny complex abstractions).

The Problems of Perspective

It is because of this high level of abstraction (and solving a different problem every time) that finding metrics that have insight, meaning, and value across this complexity is impossible.  Fred Brooks said, speaking of the activity that programmers do,

“…well over half of the time you spend working on a project (on the order of 70 percent) is spent thinking, and no tool, no matter how advanced, can think for you. Consequently, even if a tool did everything except the thinking for you – if it wrote 100 percent of the code, wrote 100 percent of the documentation, did 100 percent of the testing, burned the CD-ROMs, put them in boxes, and mailed them to your customers – the best you could hope for would be a 30 percent improvement in productivity. In order to do better than that, you have to change the way you think.”

While you can measure the time spent thinking, you can’t measure the qualitative value of it.  It’s the area of the biggest potential for productivity gain for a software developer.  It’s one of the reasons why going to conferences is so valuable…it has the goal of changing the way one thinks.  And you know what I think about books…  Everything else is focused on the 30% that’s left.

This is the fundamental perspective that people ‘seeking for a sign’ with metrics lack.  But these problems are not unique to software developers; the same problems occur with any knowledge worker.  The problem, and this is where I’m going to get controversial, is that most managers want to treat programmers like blue-collar workers (except for the hourly rate).  Now would be a good time to have read Peter Drucker’s seminal work on Knowledge Workers (there is a link for you at the end).  No really, go read it.  Without his view & understanding your life will lack meaning, focus, balance, the force…it will be a low-carb life.  Managers believe that if they can just find the right set of metrics, then programmers can be reduced to a replaceable, pluggable, and interchangeable cog (which will simplify their job considerably).

Other Knowledge Worker professionals have similar issues with productivity metrics, most notably Doctors, Lawyers, Poets, Journalists, and Athletes.  There have been a number of studies that have shown that metrics for most ‘Professional’ workers fail to quantify 60%-70% of their daily activities (Peopleware, Augustine, Mills, et al).  And it’s not for lack of trying (there is actually more research on the lack of good productivity metrics for Doctors and Lawyers than there is for any other class of Knowledge Worker).  What set of metrics would you want to judge your Doctor by?  Hours worked?  Patients per hour?  Number of maimed/dying patients?  Same with Lawyers…billable hours?  Billable hours are only a metric for how rich the partners are getting, not how efficient a particular attorney is.  Did George Steinbrenner say, “I need a shortstop ‘resource’ to plug in here” or did he go out and get Derek Jeter?  Did the Bulls need a shooting guard ‘resource’ or did they need a Michael Jordan?  How well did Apple do when they staffed a CEO ‘resource’ in Sculley instead of Steve Jobs?  All other knowledge workers are measured by their results.  Did the doctor fix your problem?  Did the lawyer make your problem go away?  Did MJ push off and hit the winning shot over Bryon Russell?  All results-oriented, not metric-based.  One could make a compelling argument that Peyton Manning in the last Super Bowl didn’t have very good metrics (very poor metrics in comparison), yet he did what was needed to win the game.  The result mattered, not the metric.  It’s fundamental to the very nature of what we do as software developers that we can’t and shouldn’t be managed as pluggable ‘resources’, nor will quantitative metrics apply to our requisite activities.

Not all Hope is Gone

While trying to find and use metrics for individual developer productivity is fraught with danger (Peter Drucker – “What gets measured gets managed”) and will lead to Teamicide™, there are some things that are good indicators of performance.  Those who would throw up their hands in alarm and quote, “you can’t manage what you can’t measure” are missing the bigger picture.  We ‘manage’ un-measurable things every single day, across a multitude of disciplines.  It just takes a different mindset and perspective to do so.  One uses different tools for different kinds of things.  If all you have or want is a ‘metric’ you are not going to be successful managing knowledge workers.  Almost all of the valuable indicators of performance for software development are going to be on a Team level, not on an individual level.  And they will be qualitative in nature instead of quantitative.  In other words, they will not be metrics.  You want to find archetypal patterns of impedance (roadblocks).  Think of the principles/tenets of Agile: Eliminating Waste, Amplifying Learning, Decide as Late as Possible, Deliver as Fast as Possible, Empower the Team, Build Integrity In, and See the Whole (stolen liberally from Mary Poppendieck).  Those things are vital to productivity.  But they aren’t quantitative criteria.  For example, I believe Unit Testing is a clear indicator of productivity.  But it’s not a clear metric.  If someone is writing a lot of tests…how good are those tests (i.e., are they testing getters/setters or are they testing the essential complexity of the abstraction)?  Are they getting better as time goes on?  If someone isn’t writing any tests, that might indicate a problem.  It’s an indicator, not something to put on a dashboard.  Personally, I would also ‘chart’ a list of knowledge gained month-to-month, year-to-year, by the team.  While not a direct link to thinking ability (attacking the 70%), if you see that knowledge acquisition is occurring, by its very nature the team is getting better and more productive.  Now not all knowledge is of equal value.  So again it’s qualitative instead of quantitative.

If you examine the developer workflow (the 30% stuff), there are things that we should know & track.  Consider the following:

Impediments:  Developers & Managers should know the impediments they have faced, the ones they are currently facing, and what type each one is.  Once impediments are classified (information not forthcoming, such as design specs or customer information; misunderstood requirements; added non-essential complexity; etc.), they should track how well each class is being resolved.  Again, the key is to look for patterns, not individual events.

Time between delivery and ‘live’ in production:  We should track how long things take to go from ‘done-done’ to being deployed.  Generally, small projects aren’t worth tracking, but larger projects can sit in limbo for a fair amount of time.  This we should care about.

Time in development phases:  This one is hard to quantify, but it’s how much time is spent in planning, design, implementation, testing, fixing implementation, fixing design issues, etc.  The recursive nature makes it hard to quantify but I’ll bet if developers & managers looked at the outliers they would find some relative patterns as to why some projects bounce around instead of being finished in a timely manner.

Number of tasks in WIP:  I think developers should strive to keep their current Work In Progress list small (too little is rarely the problem, it’s when it’s too big).  When I have a small number of tasks I’m likely to work on one until completion and then move on to the next.  When I have a larger number of tasks in my queue, it’s easy to bounce from task to task, doing some work, lightly servicing the demand, but not completing it.  While I recognize that issues arise, one of the key factors for productivity is the size of a person’s work queue.  Developers and Managers should be much better about balancing the Work In Progress (even at the risk of putting things on ice – the backlog).  In conjunction with this, classifying the nature of the task can be of use (but it does lead to ‘gaming’ of the system where someone ‘cries wolf’ to get higher priority).

Categorizing of the defects:  This can easily be done with tools software developers currently use (e.g., New Relic, TrackJS).  As we solve classes of defects we should stop seeing them occur (null variables, out of scope, etc.).  If we solve some of them but then keep seeing other issues of the same class, it indicates that we either need to learn more or lack discipline (or both).  If we don’t track them, we won’t see where we need improvement.

So all of these things have a commonality, they are the base principles of Kanban (too long, too late to start talking about Kanban now…subject for another day).  Becoming familiar with the principles of Kanban will help to understand and modify our workflow (or at least track it better), which should lead to gains in productivity (the 30% that Fred Brooks was talking about).  To get better than that we need to learn how to think better/faster (the 70%).

Summary

You can’t use metrics for developer productivity, and that’s a good thing.  No really, it’s OK.  Developers are Knowledge Workers and should be treated as such.  It’s fundamental to the essential work that we do.  If one is not careful, metrics will discourage, discriminate, depress, and destroy (you will get what you measure).  Any productivity guides should be aggregated on a team basis.  Trends are far more valuable than discrete measurements.  We should seek for archetypal understanding rather than specific detail (abstract away to a pattern, then solve for the pattern).  Learning how to better think will swamp any productivity gains that occur from workflow/process improvements (the 70% vs. 30%).

Required Reading

Beyond the Information Revolution, Peter Drucker, The Atlantic, 1999

http://www.theatlantic.com/past/issues/99oct/9910drucker.htm

The Five Most Important Questions You Will Ever Ask About Your Organization, Peter Drucker

The Fifth Discipline (systems thinking), Peter Senge

Lessons In Agile Management (Kanban), David Anderson

The Lean Startup, Eric Ries

Lean Software Development, Mary & Tom Poppendieck

Peopleware, Tom DeMarco & Timothy Lister

The Mythical Man-Month, Fred Brooks*

* I once heard Fred Brooks refer to MMM as the ‘Bible’ of software in that everybody quoted from it, few people have actually read it, and almost nobody follows what it says.

Book report: Designing Search

Designing Search is a very interesting and compelling treatise on the complexities of designing a search interface.

The first chapter alone, Starting from Zero, is well worth the price of admission…if for no other reason than it opens one’s mind to the possibilities, complexities, and land mines that lie in wait for the uninitiated.

The book is broken out into three sections: Optimizing eCommerce Search Results, Designing Search eCommerce Interactions, & The Future of eCommerce Search. Each section does a good job of covering the intended subject while at the same time illuminating how difficult it is to do search really well. At the end of each section I looked back at how much I had learned, yet I was left with a sense of how much more there is to learn, specifically about the search implementation that I’m currently working on.

Some of the best points that the book makes are with the importance of understanding the patterns and models of information seeking that exist with customers, searching, and the interaction between the two. In addition to the importance, the book also provides a decent domain pattern language that will quickly add the words ‘pogo sticking’, ‘thrashing’, ‘berry picking’ & ‘pearl growing’ to your daily lexicon.

The book also includes a fair number of insights from other authors and practitioners as well as a pretty good set of references for each chapter. It’s almost like this book is a gateway drug for Search…as it will change how you think, view, and interact with search forever more (and in my case caused me to queue up a fair number of other tomes on the subject for my next reading).

It’s a very good book, well written in an easy-to-read manner that encourages re-reading sections for more enlightenment.

The author also has a pretty good website, www.designcaffeine.com, where there are more resources on UX, design, etc.

It was the best of times…

Dickens

We live in a glorious age of computing.

Never in the history of the world has so much technology been available for so many people for such little cost.

For example, this blog runs on an EC2 instance in Amazon’s data center powered by NGINX, PHP-FPM, MySQL, and WordPress.  All open source and free software.  And the price of using Amazon’s data center is continually getting cheaper.  And I’m typing this on an iPad…

It’s a great time to be involved with technology…I can hardly wait to see what the future holds.