NoSQL Fast? Not always. A benchmark

NoSQL databases should offer superior performance and scalability. You’re giving up your trusted SQL for something, right?

This benchmark follows a simple use case:
Fetch 100.000 records from the database and stream that into an HTTP response. The test was done with 20 concurrent users.

We wanted to test the performance of different databases. Our options:

  • Cassandra – easily scalable
  • MongoDB – fast and flexible
  • PostgreSQL – trusted technology
  • Datastore – App Engine Scalable Database

Reading 100.000 records from the database requires the result to be paged. Having concurrent users means you don’t want to fetch the full result set into memory before serving it.
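
To make the paging idea concrete, here is a minimal sketch in Java. It is not the code used in the benchmark: fetchPage is a hypothetical stand-in for a paged database query (offset and limit in, a list of records out), and the class and method names are made up for illustration. The point is only that a request never holds more than one page in memory.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.BiFunction;

final class PagedStreamer {
    /**
     * Streams all records to the output one page at a time.
     * fetchPage.apply(offset, limit) is a hypothetical paged query, not a real driver call.
     * Returns the number of records written.
     */
    public static long streamAll(BiFunction<Integer, Integer, List<String>> fetchPage,
                                 int pageSize, OutputStream out) {
        long written = 0;
        int offset = 0;
        try {
            while (true) {
                List<String> page = fetchPage.apply(offset, pageSize);
                for (String record : page) {
                    out.write((record + "\n").getBytes(StandardCharsets.UTF_8));
                    written++;
                }
                if (page.size() < pageSize) {
                    break; // a short (or empty) page means we reached the end
                }
                offset += pageSize;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

Offset based paging is just the simplest thing to show here; a real database driver may offer cursors or tokens that page more efficiently.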

Multiple clients accessing the same data can be good or bad for the database. Caching will work well since data is effectively read multiple times. Concurrency might also introduce competition for this resource depending on the architecture of the database.

The results were quite surprising. Reading 100k rows from Cassandra with 20 threads took on average 61 seconds. MongoDB was roughly twice as fast and PostgreSQL almost 30 to 60 times faster than Cassandra! Our experience with importing and exporting data from Cassandra matches these numbers: mutating small amounts of data is fast, getting large amounts of data out is slow.

Times are in seconds. Average response times: PostgreSQL 0.9 seconds, MongoDB 17 seconds, Cassandra 61 seconds.

Databases

Lower is better. The 90% bar means: 90 percent of the requests were delivered in the given time
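
For readers who want to compute such a 90% figure themselves, here is a small sketch using the nearest-rank method. Whether the load testing tool behind the chart uses exactly this definition is an assumption on my part, and the class name is made up.

```java
import java.util.Arrays;

final class Percentile {
    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    public static double percentile(double[] samples, int p) {
        if (samples.length == 0 || p < 1 || p > 100) {
            throw new IllegalArgumentException("need samples and 1 <= p <= 100");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (p * sorted.length + 99) / 100; // integer ceil(p/100 * n)
        return sorted[rank - 1];
    }
}
```

With ten response times of 1 to 10 seconds, the 90% value is 9: nine out of ten requests finished within 9 seconds.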

Http Chunked Responses – streaming challenges

Streaming HTTP responses is not something many developers do. Most web frameworks provide some support, but they all seem to give you the message: keep your responses small. If you want to find out what pitfalls to look out for when streaming large responses, read on.

Streaming the HTTP response means you don’t know the length of the content when you start. This is mostly applicable to large responses, in our case a large result set from the database. Exploring a few options in Ruby and Java, there seem to be a few glaring omissions in the chunked response support. Be aware of them before you start and pick the framework that supports your use case best. Some frameworks (like Tapestry) don’t support it at all!

Blocking
Firstly, the response will block. The framework should not mind that the calling thread is tied up (blocked) while it serves a few minutes’ worth of data. Many Ruby web frameworks don’t handle long-lived responses very well.

Documentation
The second omission is the documentation. In the Play framework, the Async responses always seem to be about ‘long running calculations’. Streaming is also long running, but the first result is available almost instantly.

Multi threading
The third omission is multi threading. Most frameworks give you an object to write your response to (out or output stream) but if your response is large and requires transformation before output you might want to use multiple threads. Guess what happens when you hit the output stream with multiple threads? Breakage.
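
One way around the breakage, sketched below, is to let worker threads do the transformation while a single thread does all the writing. This is a simplified illustration, not what any particular framework does: the class and method names are made up, and it queues one future per record, so a real streaming version would feed it page by page.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class SingleWriter {
    /** Transforms records on a thread pool, but writes them from the calling thread only. */
    public static void transformAndWrite(List<String> records, Function<String, String> transform,
                                         OutputStream out, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String record : records) {
                Callable<String> task = () -> transform.apply(record);
                pending.add(pool.submit(task)); // transformation runs in parallel
            }
            for (Future<String> result : pending) {
                // only this thread touches the output stream, in the original order
                out.write(result.get().getBytes(StandardCharsets.UTF_8));
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```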

Buffering
The fourth omission is buffering. When writing a response you don’t want to send every character the minute it is available. One framework I found gets this one right: Undertow.
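
If your framework hands you a raw output stream, plain Java can do the buffering for you. The sketch below wraps a stream in java.io.BufferedOutputStream and counts how often data actually reaches the underlying stream. CountingStream is a made up stand-in for a socket, and this says nothing about how Undertow implements its buffering.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BufferedResponse {
    /** Hypothetical stand-in for a socket: counts how often data is pushed to it. */
    static final class CountingStream extends OutputStream {
        int writeCalls = 0;
        @Override public void write(int b) { writeCalls++; }
        @Override public void write(byte[] b, int off, int len) { writeCalls++; }
    }

    /** Writes totalBytes one byte at a time through a buffer and reports the underlying writes. */
    public static int writeCallsFor(int totalBytes, int bufferSize) {
        CountingStream sink = new CountingStream();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < totalBytes; i++) {
                out.write('x'); // naive streaming code: one byte per call
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writeCalls;
    }
}
```

Writing 1000 bytes through a 100 byte buffer reaches the underlying stream in a handful of chunks instead of 1000 separate writes.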

Pulling
The last omission I’ll discuss is pulling. When the client is downloading the response slowly, it’s easy to overwhelm the output stream (if it doesn’t block when it’s full). The most elegant solution would be one where the framework tells you when you can write more, when the client is (almost) done reading the previous response. The Grizzly framework seems to support this, I’ll check it out and report.

Java 8 – Streams and Lambdas

Pipeline programming

Reactive programming is the new rage. Declare your logic and let it go! If you’ve worked with Scala or Ruby, you’re used to combining different operations on your collection like mapping, sorting, filtering. When you can create functions (e.g. blocks) on the fly this becomes simple and readable. For instance mutating, filtering and sorting an array in Ruby is one line of code:

['b', 'c', 'a', 'ab'].map(&:capitalize).select{|i|i.length == 1}.sort # => ["A", "B", "C"]

Lambdas

Luckily Java 8 also has the ability to create functions (lambdas). They are great for transforming your collections. Where creating a filtered version of a collection involved creating a new collection and a loop over the old one, with lambdas you can often replace this with a short and expressive statement.

In its true statically typed, half object oriented nature, Java has special functional interfaces to represent lambdas, where primitives play a special part. Defining a function could look something like this:

ToIntFunction<? super String> alwaysOne = s -> 1;

Probably you will mostly use anonymous lambdas on collections. Collections (e.g. List, Map, Set) have new functions specifically designed for this. We’ll get to the Streams in a moment, first a short summary of the collection goodness you get with Java 8:

Java 8 adds these methods to collections:

  • forEach – perform some action on each element (defined on Iterable)
  • removeIf – remove elements for which the lambda returns true (defined on Collection)

java.util.List also adds:

  • replaceAll – the mutable version of ‘map’ which doesn’t return the new collection (gotta love Java)
  • sort – mutates collection, lambda should return compareTo compatible results (e.g. 0, 1, -1)

java.util.Map adds:

  • computeIfPresent – replace or remove one entry with result of lambda but only if entry already existed
  • computeIfAbsent – sets a map value if it’s not already there (great for lazy initialising)
  • compute – combination of the above functions
  • merge – add or replace entry using the old value as input
  • replaceAll – same as List
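
A small sketch of a few of these methods in action. The class and method names are made up for the example; the collection methods themselves (removeIf, replaceAll, sort, merge, computeIfAbsent, forEach) are the standard Java 8 ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CollectionLambdas {
    /** Same idea as the Ruby one-liner, but mutating a copy of the list in place. */
    public static List<String> cleanUp(List<String> input) {
        List<String> words = new ArrayList<>(input); // copy, because these methods mutate
        words.removeIf(w -> w.length() > 1);         // drop everything longer than one character
        words.replaceAll(String::toUpperCase);       // the mutable 'map'
        words.sort(String::compareTo);               // comparator returns negative, zero or positive
        return words;
    }

    /** merge: insert 1 for a new word, otherwise combine the old count with 1. */
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    /** computeIfAbsent: lazily create the list for a key (assumes non-empty words). */
    public static Map<Character, List<String>> groupByFirstLetter(List<String> words) {
        Map<Character, List<String>> groups = new HashMap<>();
        words.forEach(w -> groups.computeIfAbsent(w.charAt(0), k -> new ArrayList<>()).add(w));
        return groups;
    }
}
```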

Function pointers

Perhaps the weirdest syntactical change in Java 8 is the method reference operator ::. Instead of using a plain lambda expression:

stream.map(s -> s.toLowerCase())

you can refer to the method by its class and name directly:

stream.map(String::toLowerCase)

A bit less verbose but it might not be obvious when to use the reference over the lambda.
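
As a rough guide, a method reference works whenever the lambda would do nothing but pass its arguments straight to one method. The sketch below lists the four flavours next to their lambda equivalents; the class and constant names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class MethodRefs {
    // instance method of the argument itself: s -> s.toLowerCase()
    static final Function<String, String> LOWER = String::toLowerCase;

    // static method: s -> Integer.parseInt(s)
    static final Function<String, Integer> PARSE = Integer::parseInt;

    // instance method of one particular object: s -> "hello".equals(s)
    static final Predicate<String> IS_HELLO = "hello"::equals;

    // constructor: () -> new ArrayList<>()
    static final Supplier<List<String>> NEW_LIST = ArrayList::new;
}
```

As soon as you need an extra argument or a second call, you are back to a plain lambda.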

Enter Streams

Apart from the useful functions per collection, java.util.stream.Stream is the new functional kid on the Java block. Streams are designed specifically for lambda expressions, parallel processing and chaining multiple transformations together. A simple example shows Java 8 getting very close to the Scala/Ruby version:

Arrays.asList("a", "b").stream().map(String::toUpperCase).filter(s -> s.length() == 1).sorted()

A few things to note: generics are here to save the day, ‘s -> s.length’ is only possible because the type is already known. There is some type inference going on, so the input type (String in this case) can differ from the output type, which is very cool. Most notably Stream offers you two functions: map and reduce, renowned functions from the functional world. And since you can use concurrent computation by using parallel streams, you can easily write a fast single machine map/reduce using the stream API. If you’ve worked with Scala or Ruby you’ll realise how profound it is having these two functions available.
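
Here is a slightly fuller sketch: the same pipeline as the Ruby example, collected into a list, plus a tiny map/reduce on a parallel stream. The class and method names are made up; the stream calls are the standard java.util.stream API.

```java
import java.util.List;
import java.util.stream.Collectors;

final class StreamPipeline {
    /** Upper-case everything, keep single characters, sort, and collect into a new list. */
    public static List<String> singleLettersSorted(List<String> input) {
        return input.stream()
                .map(String::toUpperCase)
                .filter(s -> s.length() == 1)
                .sorted()
                .collect(Collectors.toList());
    }

    /** Single machine map/reduce: map each word to its length, reduce by summing. */
    public static int totalLength(List<String> input) {
        return input.parallelStream()
                .map(String::length)       // String in, Integer out
                .reduce(0, Integer::sum);  // associative, so safe to run in parallel
    }
}
```

For the input "b", "c", "a", "ab" the first method returns [A, B, C], just like the Ruby version, and the second returns 5.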

Not all lambdas are equal

Streams have two kinds of operations: intermediate (transforming, lazy) and terminal (value producing). In its true OO style, Java 8 delegates the responsibility for collecting all those (parallel) transformations into some output to Collectors. In a not so great OO style, but conveniently, terminal operations like sum() are available directly on some of the stream types.
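
The laziness of intermediate operations is easy to see with peek. In this sketch (names made up for the example) nothing is visited while the pipeline is being built; the elements only flow once the terminal collect is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyStreams {
    /** Returns a log showing when the elements are actually processed. */
    public static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .peek(s -> log.add("visited " + s)) // intermediate: nothing happens yet
                .map(String::toUpperCase);          // intermediate: still nothing
        log.add("pipeline built");
        List<String> result = pipeline.collect(Collectors.toList()); // terminal: now it runs
        log.add("result " + result);
        return log;
    }
}
```

The log starts with "pipeline built", followed by the three "visited" entries and finally "result [A, B, C]".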

Types of Streams

This:

Arrays.asList("1", "2", "3").stream().map(Integer::valueOf).filter(i -> i > 1).collect(Collectors.summingInt(Integer::intValue)); //  5

is equivalent to:

Arrays.asList("1", "2", "3").stream().mapToInt(Integer::parseInt).filter(i -> i > 1).sum(); //  5

If we weren’t producing Integers but more complex objects, the first variation would make more sense, but the second version creates an IntStream, which has convenient methods that only make sense on a numeric stream such as sum(). Creating an IntStream directly from a Collection containing Integers is not possible but using the static functions of the IntStream interface (good Lord) you can do:

IntStream.of(1, 2, 3).filter(i -> i > 1).sum();
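
If you already have a List of Integers, the usual detour is mapToInt. A small sketch, with made up class and method names:

```java
import java.util.IntSummaryStatistics;
import java.util.List;

final class IntStreams {
    /** Unbox into an IntStream, then use the numeric-only terminal operation sum(). */
    public static int sumAbove(List<Integer> numbers, int threshold) {
        return numbers.stream()
                .mapToInt(Integer::intValue)
                .filter(i -> i > threshold)
                .sum();
    }

    /** IntStream also offers min, max, average and count in one pass. */
    public static IntSummaryStatistics stats(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).summaryStatistics();
    }
}
```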

In short

The new lambdas in Java 8 have already been put to good use in the default libraries. Streams are going to make your Java programming life very different, adding fast and expressive ways of filtering and transforming your data. When Java 8 is adopted expect your functions (or your colleagues’ functions) to start accepting and returning Streams instead of collections. It will make your programs faster, shorter and more fun to write.

Preparing for an Agile Transition

Are you ready?

Transforming a team or organization to Agile thinking and working requires those involved to be ready for the change. Readiness is needed on all levels: management, coaches and the people doing the work. Many organizations start implementing Scrum and/or Agile, often without regard for readiness. Those transitions might succeed in the short run but will likely bounce back into old habits (the ‘jelly problem’).

Mind the grass roots

For years the main complaint in Agile was lack of management support. These days Agile transitions are more often started because upper management wants it. That’s great but it could lead to ignoring the grass roots. Make sure you create grass roots support for Agile. Keep in mind that bombastic presentations about the benefits of Agile work for managers, not for the work floor. You need to light the Agile fire and that can take a while. One option could be to keep management support for Agile secret until the employees start to get excited about Agile.

Tip: Tell management that any organizational change takes years, especially in large organizations. Suggest to start very small, low key and possibly even under cover.

Tip: Use a big bang Agile transition only after a period of slow grass roots and management preparation.

Mind the managers

If Agile is introduced by management, they often think that it means management is not the problem and that can hinder the transition. Management really does need to change, otherwise you wouldn’t need Agile. It can also increase cynicism amongst the people because they feel it all needs to come from them. The biggest problem is that management has a tendency to use command & control to implement Agile, which makes Agile a set of tricks, not an organizational change.

Tip: Make it explicit that managers will not lose their job after the Agile transition.

Tip: Discuss with management what they want to change in their own behavior before or as part of the Agile transition. If they say ‘nothing’, try again or give back the assignment.

The Jelly Problem

Organizations are in a way like jellies. You can push them and something will happen, they will move and shake. You think you’re getting somewhere but leave it alone for a while and the jelly just reverts to its old shape. A lot of change programs, including Agile, end up with the same jelly as they started.

Tip: Don’t confuse movement for progress, make sure everyone is on board with the change and spend explicit time making the Agile transition permanent.

Tip: Find out how previous change programs took place, why did they fail or succeed? Learn from that and adjust your transition accordingly.

Change in Progress

Lean teaches us that overburdening (‘muri’) is a key source of waste. It’s often overlooked that people can be overburdened with organizational change. Many organizations that currently start an Agile transition will likely have other changes already in progress. Employees often have multiple changes to deal with:

  • change teams,
  • change locations,
  • change your brand and vision,
  • change operating system,
  • change customer communication,
  • change to conform to new regulations,
  • change to comply with methods and certification (e.g. ISO),
  • change functions and roles (e.g. because of cost cuts or mergers)

How many changes are taking place while you are doing your Agile transition? How many have only recently finished? Are you sure people are not fatigued by all the changes? If they are, your Agile transition will have a small chance of succeeding. So the key to reducing overburdening is using ‘pull’ for your change programs.

Tip: Don’t push your transition onto overburdened people, find indications that the organization really has the time and capacity for doing the next change. Especially be aware of inactive but unfinished change programs.

Tip: On management level keep track of all change programs that are planned or in progress in the same part of the organization (see image)

Change in progress

 

The Clojure Ecosystem

The Clommunity

Having recently started programming in Clojure, I find the somewhat lackluster state of some of the Clojure community projects striking. Programming languages thrive when there is an active community and Clojure seems a mixed bag. Clojure sees regular (non breaking) releases and some of the Clojure books are amongst the best development books ever written.

Leiningen

Probably one of the tools that makes us sing songs of hope for Clojure is Leiningen. It’s an active project and it’s the de facto standard for building Clojure projects. It hasn’t crapped out on me once and I installed a few plugins with no issues.

IDE support

  • Sublime Text 2. The SublimeREPL project is pretty awesome, but it’s not Clojure specific. The REPL does start and being able to send snippets of code to the REPL is pretty neat. I’ve developed a few functions using this approach. The main issue is that the REPL is not using the Lein project but just starts a vanilla REPL. The REPL froze a few times: annoying. So I’d say it’s nice but not awesome.
  • Eclipse. Though both Eclipse and IntelliJ feel too heavy for a dynamic language, they do have plugins you can use. Counterclockwise is the Clojure plugin for Eclipse. It has support for Leiningen, but lacks support for running tests. Running a REPL inside Eclipse feels odd but sending code snippets seems to work. I couldn’t figure out how to load the full project into the REPL, which is strange because Eclipse is so project oriented.
  • IntelliJ. La Clojure is the plugin for IntelliJ, maintained by JetBrains (which ought to be a plus). Haven’t used it yet.
  • Emacs. There are several deceased Emacs Clojure plugins, but one has emerged (survived): nrepl.el. I didn’t know Emacs, so trying it out was a bit of a struggle. Emacs uses Lisp for some parts, so I imagine it has good support for Clojure’s syntax. The commands available from nrepl make it seem like a very powerful option.

SQL

For SQL databases there is Korma, which is still a bit minimal but it allowed me to do a fairly complex join query using aliases. It’s under active development, although I would advise you to double check that it works before using it in a live environment. Clojureql seems to be a more stable but also a bit stale alternative.

Solr

Solr is a popular Lucene based database, it has search features beyond regular SQL databases and it is generally well supported. I say generally because Solr seems to be a weak spot for Clojure. The main reason for this blog was the state of Clojure Solr libraries.

There are 5 Solr Clojure projects (that I could find).

  • clojure-solr seems to be the grand daddy given its age and the fact that one of the other projects is based on it. It’s dormant, even the forks are not up to date. 
  • solrclient is not really a project, more a one off trying to use Solr through its JSON interface.
  • solrclj is less dead and looks like a proper project. It hasn’t been updated to the latest Solr and Clojure versions though. If I’d fork one project to bring it up to date it would be this one.
  • star is, well, the rising star amongst these libraries. Only created hours ago and literally minutes after I started searching for Solr + Clojure on github. The big ‘cool’ here is that the author also created the famous Solr Ruby client rsolr. Though no real lines of Clojure have been written yet, from the first commit it already looks promising and this guy obviously knows what is required in a Solr client library.
  • icarus is a little bit more recent, there is a version that supports Solr 3.5 (4.2 is the most recent). It’s only for querying and looks like it’s no longer maintained. It’s based on clojure-solr.

Unit Testing

Clojure.test

Clojure comes with a unit testing API built right into the core. It’s cool that you can start writing unit tests out of the gate. Leiningen runs it out of the box and for a Java developer with JUnit experience (like me), it just works.

Midje

I’ve been reading Brian Marick’s book on Clojure; he authored his own testing framework, Midje. Haven’t looked at it yet, but worth a mention.

Lazytest

Lazytest was written by the same guy who did the default Clojure.test library. Lazytest has one main issue: it’s dead. The current development branch is unstable but also two years old, a fatal combination. It’s one giant rough edge and doesn’t seem to support Leiningen properly.

Autotest

One nice feature of SBT (Scala’s Leiningen) is the automatic compilation and running of tests on source change. There are some attempts to add this to Lein, but projects are either undocumented, too old or test framework specific. I do hope I’ll find a lein plugin that just does ‘detect source change = rerun tests’

Misc

One library worth mentioning is Cheshire. I’ve only used one function (parse-string) so far but it’s a feature rich and fun library to use. Of course its main strength comes from Clojure itself because any JSON structure can be parsed to and from a Clojure data structure (of maps and vectors). Keys turning into keywords makes it a pleasure to work with JSON. The author Lee Hinman seems a very productive member of the Clojure community, check out his other projects.

For further exploration of Clojure, here’s a list of libraries and Leiningen plugins

Android Tablet – a Review

I received the Asus Transformer Pad. An Android tablet. I have some experience with the iPad (first generation) and have an iPhone 4. So this is my first encounter with Android. The pad comes with Android 4.0

Unboxing

The user experience of a product starts with the box it’s shipped in. Unlike some other Android devices, the box was easy to open. The tablet was immediately visible after opening the box. However, it was wrapped in a plastic that prevented me from seeing the product in all its glory. Also, the plug was wrapped in a tight and thin plastic that was hard to remove. Fiddling with bits of plastic caused some irritation, and left me with a pile of packaging material.

The worst part of the unboxing was that the device had no battery life. I pressed multiple buttons (where’s the on switch?). It made me worry that the device might be broken. After struggling with more wrapping plastic, I plugged it in and started charging. Still not sure if it’s actually working, but it showed some sign of life.

Set up process

The set up process was boring. The dark styling didn’t help to encourage me. Some forms had a problem because the fields were hidden behind the keyboard and I couldn’t see them unless I hid the keyboard.

One example of the dismal procedure was the wireless network set up. During installation the device asked me to connect to my wireless network, which makes sense. The problem was the lack of confirmation when I entered my password. The network showed the word ‘connected’, but I wasn’t sure if that meant it was actually working. A green tick symbol or something showing the ‘connect to Internet’ step was finished successfully would have been helpful.

Software Update

After using the device for a few minutes, a new release of Android was available. The good thing is that it downloaded automatically over the air. What it didn’t say was which version was available or why I should upgrade. If you haven’t upgraded a device before, this would probably be a very confusing dialog. It made me feel like I’m upgrading because Asus wants me to, not because I want to.

Installing new Apps

The application store shows some nice suggestions like Evernote, which gives me confidence that Android is a serious platform. Installation of new apps is fairly easy. Searching for an app works OK but has its own issues: for instance you can’t see whether applications are optimized for the tablet. So when looking for a movie trailer app, the number of possible apps is huge. If I could filter out the ones that are designed for tablet devices, it would help me find the right one more easily.

The search field was often unresponsive, which was irritating. I couldn’t see if it was because it was still loading, so I just had to randomly push the search field to see if it would come available.

In-App navigation

Applications don’t seem to adhere to a common navigation style. Going back to the previous screen is not always obvious. It looks like pressing the top left app icon brings you back to the start screen, but I’m not sure. The back button is neat, but I’m often unsure where it will bring me and in some cases it didn’t work at all. The back button leaves too much room for developers to build the application like a website, which it isn’t.

Widgets

Of course one feature Android has that iOS doesn’t is widgets. I haven’t tried many, but a tablet especially has lots of room to show both app icons and widgets, so it’s a big plus for Android. To compensate for this great feature, Android keeps the bottom navigation bar always visible, which means apps don’t use the full screen and immersion suffers. The trailer app didn’t even switch to full screen when playing the video; this is where the desktopy feel is a bad thing.

Switching between apps

For Android, apps are ‘windows’ that you switch between. The ‘windows’ icon in the bottom bar is great, switching between apps is always available and the preview icons make it comfortable to find the right application. I still have to figure out how to close an app so it gets removed from view.

Summary

Android on the tablet is not bad, but not great either. My iPad experience has been a more pleasant one: apps look better, respond better and are more uniform. There are lots of apps, but Android developers seem less eager to create tablet specific versions, which is a shame. Apart from the widgets, everything feels ‘slightly worse’ than the iPad.

When not to use BDD (yet)

What is BDD

This blog is not about explaining BDD in detail, but here’s a short description. BDD is a way to build software against tests that are defined beforehand. BDD allows you to make sure your whole application is working, not just parts of it. It’s basically an outside-in way of testing, as opposed to unit testing which is focussed on the inner parts of the application.

What BDD is not

BDD is not a way to discover your application’s requirements or use cases. It’s a way of documenting the requirements and use cases in an executable way. Even the Wikipedia article on BDD gets it wrong here, so it’s a common misconception.

The pitfall I’ve come across is always starting your user story implementation with BDD. In the cases where you know how your application should behave that’s fine, but when the behavior still needs to be fleshed out BDD can get in your way. The confusion is between problem description and solution design, BDD is solution design while you might confuse it for problem description.

So let’s take an example:

Suppose your customer wants to have a view on the customers’ orders. The orders are stored in a database, but not necessarily in a user friendly structure. The user also isn’t very familiar with all the data that is stored in the order, because some of the data comes from external systems. Now, when you start asking your user for BDD examples, you will probably fail. You can’t get examples for statuses, missing information, multi-line orders etc. because the user simply doesn’t know what he/she wants exactly at this stage.

The usual course of action is to discuss some examples of the data, but in this case we chose to ‘just build something ™’. We asked the users to interact with the data for a few weeks by just showing the raw data structure. The users could relate the stored orders to other sources of information, their work process and past systems that stored customer orders. Only after that experience could they tell us how they wanted to see the data, what features they would like, what manual steps they’d like automated. And only now could we actually create a BDD scenario.

If you’d started with BDD, you would have been describing user goals and scenarios that they couldn’t really support. Since you’re not building to solve their problems (because you don’t know them yet), how can you create BDD scenarios? In that case your BDD tests will just be system integration tests, not value driven scenarios. So skip BDD in this case until you’ve learned a little bit more.

Lean Startup Approach – paper mockups

The Lean Startup reminded us all to find out what problem you’re solving before you start solving it. The goal in the Lean Startup is learning about your customers’ problems, desires, needs etc. The trick is to find the fastest (and cheapest) way to learn so that you can solve the right problem effectively. Sometimes writing code is the easiest way to start learning: get something out there. Now this kind of code is similar to the post-it notes in a customer brainstorming session: you throw them away.

Wrapping up

So to summarize, if you know how the application should behave use BDD. Often when a user just wants the same features he/she has seen before on another application or website, the behavior is clear. If application behavior is not clear, make sure the problems/needs are discovered and you’ve agreed on the application design before starting BDD.

A drawing that explains the two scenarios:

Application behavior is known, use BDD

Discovering through building, use BDD later

The State of Agile Estimation

The problem with estimating

Estimation is one of the holy grails of software development. Imagine if you could tell your customer exactly when something is done and how much it will cost…no more deadlines, no more surprises. That would be something, right?

But the world is not perfect, it’s complex. So estimation is hard. And when you work on something risky and challenging, which is hard to estimate, your customer probably wants to have accurate estimates even more!

Agile estimation has been around for a long time, XP and Scrum have brought us story points, velocity and (with the help of some project managers) project burn-downs.

Two years ago, my team was doing Scrum estimation by the book (I still don’t know which one though). The backlog was estimated for months ahead, and velocity was tracked as if our life depended on it. As in many teams, this didn’t work quite the way it was advertised. Planning meetings were tough, estimates (in points) were often way off, pressure was mounting and code was rushed at the end of the sprints to reach the velocity the team committed to. Now, if you have any experience in Agile, you’ll probably find a few mistakes in there.

So we want estimates, but they are hard to get right. We’ve had some interesting experience with estimation ourselves.

Stop estimating

We decided to stop estimating. First we needed to get the quality right, the process stable and the testing to be synchronized with coding. So we decided to reduce estimation to a bare minimum. Since we started we went through the following levels of estimation:

  1. Full-fledged Scrum Estimation and Tracking ™
  2. No estimation, just keep working until the MMF was done (an MMF is something the user can see, touch, taste or smell)
  3. Estimate the size of the MMF
  4. Try to keep the MMF size within 2 weeks

We’ve been stuck at 4 for quite a while, using a continuous development process. MMFs are not always the same size though, some can’t be reduced to two weeks and others are just not estimated correctly. There are two reasons I want to get back to a higher level of estimation and tracking:

  1. Visualize what the team achieves in a given period, much like we track our product performance in the field; you want to know if you’re improving.
  2. Give the stakeholders some idea when to expect which MMF in the coming months.

These two goals are different, the target is different, one is the team, the other is the stakeholders. I’m hoping to find a minimum viable tracking system that can serve both goals.

Wanting to track the work done and have some (rough) prediction of what we will deliver in the coming months is our goal. Let’s see how to achieve that.

The state of estimation

So, getting back to (team) estimation after abandoning it for 3 years should be interesting. A quick search shows me a mixed bag of the current state of Agile estimation. Three years ago, estimation was non optional. Everyone was doing Scrum (or waterfall) and estimates and velocity were the best things since sliced bread. Today, the situation has changed, with probably many people sharing the same experience we had (estimation sucks). So there seem to be two main approaches to estimation these days:

  1. Use very little estimating and focus on fast delivery
  2. Do estimation all the time

Examples for the first (little estimation) often stem from the Kanban arena. Some suggest just counting bite size stories. The stories are roughly the same size, so you can just count how many you’re doing each time period. Predicting far ahead into the future is not possible (and not a goal).
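
Counting stories boils down to very simple arithmetic. A sketch, assuming stories really are of roughly equal size and that past throughput says something about the near future (both are assumptions, and the class and method names are made up):

```java
final class ThroughputForecast {
    /** Average number of stories finished per period, e.g. per week. */
    public static double averageThroughput(int[] storiesDonePerPeriod) {
        if (storiesDonePerPeriod.length == 0) {
            throw new IllegalArgumentException("need at least one period of history");
        }
        long total = 0;
        for (int done : storiesDonePerPeriod) {
            total += done;
        }
        return (double) total / storiesDonePerPeriod.length;
    }

    /** Rough number of periods needed for the remaining stories, rounded up. */
    public static int periodsNeeded(int remainingStories, int[] storiesDonePerPeriod) {
        double average = averageThroughput(storiesDonePerPeriod);
        if (average <= 0) {
            throw new IllegalArgumentException("no stories finished in the history");
        }
        return (int) Math.ceil(remainingStories / average);
    }
}
```

With a history of 4, 6 and 5 stories per week, the average is 5, so 12 remaining stories would take about 3 weeks. The further ahead you look, the less such a number means.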

Scrum.org has removed velocity and story points from the guide, but the act of estimation is mentioned and still a key attribute of Scrum. How you estimate, it doesn’t say. Joel Semeniuk gives a good overview of how to tackle estimation.

Try, inspect, improve, repeat

We’ll give estimates another go. The main principles I think should apply:

  • Estimate on a high level
  • Separate size estimation from measuring time spent
  • Track prediction accuracy

Expect another blog post as we learn more about reintroducing estimating. If only I could tell you when I will write it ;)

 

Version numbers are overrated – use version labels instead

Most developers use version numbers for their software. It’s considered a ‘best practice’, but version numbers are overused and give you a false sense of control and compatibility. Use the simplest versioning scheme possible, where versionless is the simplest.

Rationale

What is a version number? It’s a label (doesn’t have to be numeric) that identifies one particular snapshot of your software. Usually there is one overall version number for the whole system, even when components of that system have their own versions. Using this label, anyone can reason about the state of the code for that label. Users might say, in version IE 9.0.1.2 there is a bug when I try to print. The developers will know exactly which state the code was in for that version number and should be able to find and run that version (and fix the bug). Summary: you need to label revisions of your code to refer to it later.

Numbers are silly

So far, so good. Version labels are useful. But instead of labeling the software with a date of release or the name of the latest feature, most developers use numeric labels. And not just a sequence number (1, 2, 3), but fancy numbers with dots! Giving your application a version number like 15.0.12.21.h makes it (and you) look really complicated and smart, don’t you think? Have you seen the Chrome version numbers? Anyway, let’s go into these version numbers more deeply. A typical application, let’s say a website for creating blogs, releases these versions:

  • 1.0
  • 1.1
  • 1.2
  • 2.0

What does that mean? Well, convention says:

  • first ever release
  • small change
  • small change
  • big change

That’s it, nothing more to it. But the numbers look more professional, don’t you think? Even though lists 1 and 2 have the exact same meaning. And was the change from 1.2 to 2.0 really that big? We don’t know. What this developer deems big might not be big enough to deserve a ‘major’ version bump in the eyes of others. So why use numbers at all? Why not use descriptive names like:

  • ‘first basic YourBlog release’
  • ‘fixed bugs in login’
  • ‘fixed bugs in page-saving’
  • ‘redesigned UI’

Now you might argue that you can’t see which version preceded which. Who cares? If you need to get back to the exact version of ‘fixed bugs in page-saving’ you can just check out that version (assuming you’ve labeled it in your source control). If you want, you can track the chronology somewhere else. Or you could add the release date to the label to give it even more meaning:

  • ‘2011-1-10 first basic CMS release’,
  • ‘2011-2-1 fixed bugs in login’,
  • ‘2011-2-9 fixed bugs in page-saving’,
  • ‘2011-3-2 redesigned UI (big change)’.

Now you have everything you need. Suppose you read these labels in your source control: much better than just 1.1, 1.2, don’t you think?
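
If you do want the chronology back from date-prefixed labels, it can be recovered with a few lines of code. The sketch below is only an illustration: it assumes every label starts with a year-month-day date followed by a space, and the function names `parse_label` and `chronological` are made up for this example. Note that dates written without zero padding (like 2011-1-10) do not sort reliably as plain strings, so the date is parsed first.

```python
from datetime import date


def parse_label(label):
    """Split a label like '2011-2-9 fixed bugs' into (date, description).

    Assumes the label starts with 'year-month-day' followed by a space.
    """
    stamp, _, description = label.partition(" ")
    year, month, day = (int(part) for part in stamp.split("-"))
    return date(year, month, day), description


def chronological(labels):
    """Return the labels ordered by the release date they start with."""
    return sorted(labels, key=lambda label: parse_label(label)[0])


labels = [
    "2011-3-2 redesigned UI (big change)",
    "2011-1-10 first basic CMS release",
    "2011-2-9 fixed bugs in page-saving",
    "2011-2-1 fixed bugs in login",
]

for label in chronological(labels):
    print(label)
```

Writing the dates zero-padded (2011-01-10) would make the parsing step unnecessary, because such labels already sort correctly as text.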

My Application is an API

There are cases where using the dot-numbers makes sense: when you’re developing an API or library for public use. Then the numbers signify something very important: compatibility. The convention basically says: if you increase the number behind the dot (1.1 -> 1.2), the library will be backwards compatible with all 1.x releases. When increasing the main (major) number (1.2 -> 2.0), compatibility is not guaranteed with 1.x releases. That is quite an exact definition of a big change. So if your API is always backwards compatible, you can stick to 1.x releases forever.

Now, this might be the root cause of the widespread use of numbers as version labels. Most developers think API development is the coolest thing in the business and many practices (like misuse of interfaces in Java) stem from thinking your code will be used as an API. In reality, most code is not an API. GUI applications don’t have a programmatic interface, so they don’t need version numbers. Even if you’re releasing a new version of your library for in-house use, no one will trust your backwards compatibility claims anyway; they will just re-test their whole system using your new version. Only when the library is developed externally do you want to know whether you can upgrade safely or not.

To summarize: for API development version numbers signify something important, but most people don’t build public APIs. They build applications or websites that don’t require this strict convention.
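
The compatibility convention for libraries is precise enough to write down as a check. This is a minimal sketch of the rule as described above, not the behaviour of any particular tool: it assumes versions of the form major.minor, and the names `parse_version` and `is_safe_upgrade` are invented for the example.

```python
def parse_version(version):
    """Turn '1.2' into (1, 2). Assumes a 'major.minor' version string."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_safe_upgrade(current, candidate):
    """True when the dot-number convention promises backwards compatibility.

    Same major number and an equal or higher minor number: compatible.
    A different major number: no guarantee.
    """
    cur_major, cur_minor = parse_version(current)
    new_major, new_minor = parse_version(candidate)
    return new_major == cur_major and new_minor >= cur_minor


print(is_safe_upgrade("1.1", "1.2"))  # minor bump within 1.x
print(is_safe_upgrade("1.2", "2.0"))  # major bump, compatibility not guaranteed
```

A check like this is only worth having when someone outside your team depends on the promise; for a website or GUI there is nothing to feed into it.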

Maven: WTF?

Maven doesn’t help people steer clear of the number fetish. When you create a new Maven project, the initial version for your module is 0.0.1. Yes, not one dot, but two! You’re going to be doing some serious API-building business. No, you’re not.

What does that mean? By convention, 0.0.1 means ultra-pre-beta-alpha release. It probably just prints ‘hello world’ in a console. So you start building and Maven increases your version with every release you make. 0.0.1, 0.0.2, 0.0.3. You’re not making much progress, so after a few weeks you make the bold move and change it to 0.1, your first pre-beta-alpha release! Or is it? Perhaps it’s already live, in production and making money. When do you switch over to 1.0? When all the bugs have been fixed? The first time you go live? 1.0 means done, right? Stable, right? Yes, it does. But your website is never done, your GUI will always have issues so there is no 1.0!

I find when building whole systems, there is never a clean cut release. The first public release is not much different from the one before or after. So stop fiddling with the numbers.

Just to get back to the API argument: when you release an API to the world, there will be a 1.0. It’s that contract again, meaning: this API is stable and ready for production use, it has undergone testing and everything. This number means something because you’re communicating it explicitly; it’s in the product you ship (like commons-lang-1.0.jar). Your website just has a URL, your GUI an icon, no version.

Version-less coding

Some code doesn’t need version numbers at all, it’s just code. If it’s checked into source control (on a particular branch) it’s usable. No 0.1 release or 0.1-SNAPSHOT. Just the code. If I want to make a change that breaks stuff, I’ll create a separate branch. Maven doesn’t allow this; it’s basically duplicating what my source control system can do better (track revisions). For libraries, I might want this, but for my main project, I don’t. One of the reasons these version numbers are still so prevalent is that Maven requires a version number. Starting with 0.0.1 will start you off thinking you need a complex version numbering scheme.

The build number alternative

I propose a complete alternative to using version numbers. It’s simple: build numbers and labels. Every time you build your system, for instance in your CI tool (like Hudson), it gets a build identifier, let’s say a timestamp. That number might be set in your source control as a label, or the revision of the code might be stored alongside the build number. Either way the build number will be a reference to a specific state of your code, but also to a specific attempt to build that code. Sometimes you have to build the same code multiple times to get a good release (your build procedure might have issues). Now the build numbers are just a sequence. You can label the builds that are actually going out as a release, so you can refer to them later. Using the build number, you can even pick up the exact artifacts from that build.

  • 2011-01-01-1543 – ‘added new content type’
  • 2011-01-02-1218
  • 2011-01-02-1543
  • 2011-01-02-1743 – ‘improved render time’
  • 2011-01-03-1109

Most CI tools (like Bamboo, probably others too) even support labeling a build.
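
To make the idea concrete, here is a small sketch of what such a build record could look like. It is an assumption-laden illustration, not how Hudson or Bamboo store builds: the `Build` class, the `new_build` and `released` helpers and the revision strings are all hypothetical. The only thing taken from the list above is the timestamp format of the build identifier.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Build:
    build_id: str                # timestamp identifier, e.g. '2011-01-02-1743'
    revision: str                # source control revision that was built
    label: Optional[str] = None  # only set for builds that go out as a release


def new_build(revision, built_at, label=None):
    """Create a build record whose identifier is the build timestamp."""
    return Build(built_at.strftime("%Y-%m-%d-%H%M"), revision, label)


def released(builds):
    """Return only the builds that were labeled, in build order."""
    return [b for b in sorted(builds, key=lambda b: b.build_id) if b.label]


# Hypothetical revisions; the same revision can be built more than once.
builds = [
    new_build("r101", datetime(2011, 1, 2, 17, 43), "improved render time"),
    new_build("r100", datetime(2011, 1, 1, 15, 43), "added new content type"),
    new_build("r101", datetime(2011, 1, 2, 12, 18)),
]

for build in released(builds):
    print(build.build_id, build.revision, build.label)
```

Because the identifier is a zero-padded timestamp, sorting the identifiers as text also sorts the builds in time, and the stored revision tells you exactly which state of the code to check out.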

One extra benefit of this is that your code doesn’t have to contain version numbers. I think it’s a smell that the version of the code is in the code. The version is something external to your code: your source control system has to deal with revisions, not the code itself. You’ll see that a lot of web projects don’t have a version at all, it’s just the code.

So, using labels instead of dot-numbers, everyone will know what the version entails but you don’t have to worry about the numbers anymore. So, let’s make it easier!

The Task Board Retrospective


When did you last improve your task board?

Most teams working with an Agile process like Scrum use a task board (physical or digital). Even many teams that only do a bit of Agile or Scrum will work with a task board. It’s a core practice in Agile software development, not specific to any methodology. So, it’s common practice.

So, like all things in Agile, it should be subject to scrutiny once in a while. A fair question is: why have a task board at all? Just because you’re doing Agile(tm)? No one benefits from blind adoption, so ask yourself this question. Not to get rid of your task board, but to find out if it really fits your purpose and to find room for improvement.

Continue reading
