Sunday, November 14, 2010

Me and Aunt Marge

I'm hopeless. I'm an old, narrow-minded, rigid UNIX-school developer. I will never be able to rise above the cultural differences between Windows and UNIX. Let's face it - I don't like Microsoft. After finally reading Joel Spolsky's article on Biculturalism I can appreciate the reasons behind the "Microsoft culture", but I guess I'll never be able to associate myself with it.

My wife has recently completed a course on VBA in Excel. A couple of days ago she suggested we sit together and try to implement something. Well, I thought it would be fun. If only I had known how wrong I was!

Now, what bothers me is that if Joel is right about this "Aunt Marge" philosophy, I wonder how it applies to VBA in Excel. True, my wife is an accountant, not a programmer. But programming macros in VBA is about as close to programming as a non-programmer's world gets. Let's put it this way - I would be surprised to see this application in Aunt Marge's toolbar (unless she dragged it there by mistake).

Anyway, we opened Excel and started. Here's a spec of what we intended to do. In a certain column, find a cell that matches a certain template - this indicates the start of a section. A table is expected to start two rows below that cell and ends with a "total" line. Copy the initial number and the "total" to a new row in the second sheet. Repeat the search until the end of the sheet, adding another row to the second sheet every time. Fairly simple.

Well, as a programmer blessed with plenty of the first virtue, I assumed there must be a function that finds a cell matching a certain pattern. So I opened the Help and searched for 'find regular expression'. Great - here's a Find function! I started writing the VB code, but when I typed the opening parenthesis I noticed that the parameters in the tooltip did not quite match the parameters in the Help. It took me a while to realize that there is more than one version of Find. No problem, I tried to locate the other one. That didn't work out. What does a frustrated programmer do these days? Google it! Sure enough, within a second I found an article about the "correct" Find method.

Now, it seems that Microsoft developers have decided that since most programmers use Google anyway, an API reference is no longer necessary. The article hinted that in order to look for a regular expression, one should use the parameter "SearchFormat". Excel's help gives the following information about this parameter: SearchFormat - Optional - Variant. Description - "The search format". Duh! It's great they use a word breaker to produce help files automatically. Naturally, no more light is shed on the mysterious Variant parameter in the Remarks section. And I'm not even mentioning that the syntax of regular expressions is nowhere to be found. Oh, sorry, I forgot - Aunt Marge doesn't need those!

On the other hand, the aforementioned Remarks section contains an absolutely remarkable (sorry for the wordplay) note that has shaken my concepts of good and evil. The values of the various optional parameters are saved each time you use this method. Which means that if you don't specify them on the second invocation, the values from the first one are used! It seems that the designers of this particular piece have never heard of the Principle of Least Astonishment.

This is actually a very good proof of Joel Spolsky's theory. This behavior stems from pure GUI thinking. It's perfect for humans to have their previous choices remembered. For programmers it's disastrous because it makes the behavior unpredictable. I can bet there's a corresponding UI dialog behind this Find function, and it's simply bad design to have this logic anywhere lower than the UI layer. But the Microsoft culture is so "Aunt Marge" oriented that perhaps they simply don't have enough people who know how to write software for developers.

I'm ashamed to admit that I gave up trying to make this Find method work for me and resorted to a hand-made search with a While loop. Each of us has his limits, sorry.

By the way, one of the times I desperately hit F1, it brought up a third version of Find - something completely different! Which reminded me why back in 1999 I fell in love with Java. In Java there is only one type representing a sequence of characters - String. Young programmers who haven't seen the Win32-COM-MFC-ATL nightmare won't understand what I'm talking about. But those who remember words like char*, wchar_t*, TCHAR*, LPCTSTR, CString, CComBSTR, BSTR etc. will be nodding in agreement. Ah, yes, there was also an OleString. Wasn't there? Never mind. I'm sure every single one of those had its rightful raison d'ĂȘtre. But could someone please tell me what it has to do with Aunt Marge?

Thursday, November 11, 2010

About usability

Recently I became concerned with the usability of the software we write. I came to it from two different perspectives in parallel. First, I have recently moved from one project to another. As a newcomer, it's easy to criticize the product. It's also a rare chance, because after a couple of weeks one gets used to the various nuisances. They call it a fresh look.

The second perspective is much narrower but much easier to use for demonstration. In my leisure time I'm developing a small Java application for my Nokia N97. A small thing that records a track using the built-in GPS and exports it in the GPX format. Just to prevent a flood of comments - I know there are dozens of applications doing just this. But this one is mine - I like it, I can add little features to it and it also gives me a little insight into the J2ME world. Just for the fun of it.

However, this small application continues to teach me usability lessons. It's amazing how easy it is to write a "regular" application and what it takes to plan a really easy-to-use one.

As an excuse, I have to say that at first I did download an application for track recording, but didn't quite like it (usability-wise). This is what brought me to writing my own. The sad thing is that the approach of that application "spoiled" me - I only "improved" it here and there but didn't rethink the whole concept.

A couple of examples to demonstrate what I mean. The first "working" version of my application constantly displayed the latitude, longitude, altitude and their respective accuracies (the numbers the GPS has to offer). There was a "Start" button that changed the display to two text fields, "Name" and "Description". Name was mandatory but had a default value of "Track"; Description was optional. After entering the name one had to press "Start" again to actually start the recording. Then the application started accumulating points. Later, when the "Finish" button (displayed instead of "Start") was pressed, the application dumped the file.

It took me a while to realize a number of problems with this approach. First, it doesn't make sense to accumulate points in memory when the application can start writing them to the file immediately. Apart from the memory optimization, it's a huge improvement in usability. For one thing, since I still haven't found out how to sign the MIDlet, there's an annoying security confirmation dialog popping up whenever the application tries to open the file for writing. It's better to show it right away than to delay it to the end. Another immediate benefit is that now pressing "Finish" takes no time - the file is already written and the application only needs to close it properly. Finally, if the application gets stuck (unfortunately, it happens once in a while for some obscure reason), the part of the track recorded before the lock-up isn't lost.
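To make the point concrete, here is a minimal sketch of the streaming idea in plain Java. The real MIDlet would write through the JSR-75 FileConnection API, and the class and method names below are made up for illustration:

import java.io.PrintWriter;
import java.io.Writer;

// A sketch: stream trackpoints into the GPX file as they arrive instead of
// accumulating them in memory until "Finish" is pressed.
public class StreamingGpxWriter {
    private final PrintWriter out;

    public StreamingGpxWriter(Writer target, String trackName) {
        out = new PrintWriter(target);
        out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        out.println("<gpx version=\"1.1\" creator=\"sketch\">");
        out.println("  <trk><name>" + trackName + "</name><trkseg>");
    }

    // Called for every GPS fix: the point hits the file immediately,
    // so a lock-up loses at most the very last point.
    public void addPoint(double lat, double lon, double ele) {
        out.println("    <trkpt lat=\"" + lat + "\" lon=\"" + lon + "\">"
                + "<ele>" + ele + "</ele></trkpt>");
        out.flush();
    }

    // "Finish" only has to close the tags - there is no bulk write at the end.
    public void close() {
        out.println("  </trkseg></trk>");
        out.println("</gpx>");
        out.close();
    }
}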

When trying to "reverse-engineer" why on Earth I didn't do it this way from the beginning, I recalled that the original application had a feature of adding waypoints along the way. In GPX, waypoints have to be written at the beginning of the file, so the program has to defer the writing until all of them are known (which doesn't happen until the end of the recording). However, even if I decide to add this feature, it still makes sense to write the file along the way and keep the waypoints in memory instead, rewriting the file when the track is finished (and only if any waypoints have been added). By the way, this is a typical example of how a marginal feature has the potential of destroying the main flow. We programmers spend so much time with computers that we have got used to thinking like them - 'then' is no different from 'else'. For humans it makes all the difference.

Another observation has to do with the fact that the phone is not a small computer. Well, technically it is, but not as far as the user is concerned. The user of a phone doesn't like using the keyboard! Even on the N97 with its full QWERTY keyboard, the keys are smaller than thumbs. However, in my first version the user had to change the default value of the track name - otherwise the previous track would get overwritten. An additional consideration is that the recorded track is later transferred to the computer anyway and can easily be edited there. So practically it doesn't matter what name is given to the track at the time of recording. From here the solution is obvious - give it a unique name automatically! The best option is to encode the date and time, so the track is easy to identify later. In the current version I still display the "name-description" dialog, but I believe it should be removed altogether. I never enter a description, and after having changed the default name to be unique I don't edit it either.
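For what it's worth, here is a sketch of how such a unique default name could be produced. CLDC has no java.text.SimpleDateFormat, so the timestamp is assembled by hand; the class and method names are, of course, illustrative:

import java.util.Calendar;

// A sketch: derive the default track name from the current date and time,
// so the user never has to touch the keyboard.
public class TrackNames {
    public static String defaultName() {
        Calendar c = Calendar.getInstance();
        return "Track-" + c.get(Calendar.YEAR)
                + "-" + pad(c.get(Calendar.MONTH) + 1)   // Calendar.MONTH is zero-based
                + "-" + pad(c.get(Calendar.DAY_OF_MONTH))
                + "_" + pad(c.get(Calendar.HOUR_OF_DAY))
                + "-" + pad(c.get(Calendar.MINUTE));
    }

    private static String pad(int value) {
        return value < 10 ? "0" + value : String.valueOf(value);
    }
}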

Here goes the last one (at least for now). After a couple of uses I realized that apart from the fact that they constantly change (and thus indicate that the application still works), the latitude and longitude numbers do not tell me much. On the other hand, on my trips I usually have a large-scale map and would love to make sure I am indeed at the point I think I am. Which made me dig up the formulae for calculating the "Transverse Mercator" projection coordinates (in practical terms, the new Israeli coordinate grid). So my current version displays these numbers instead, which is much more useful.

Sure, I'm reinventing the wheel - Apple understood long ago the difference between a regular product and a "great" one. Still, it's fun to watch this play out in a small, distilled example. Think usability!

Thursday, October 21, 2010

My "anti-patterns" - 3

After the previous entry with its rather bombastic statements ("don't use inheritance!"), this is going to be a more modest and specific one.

Pair

It seems that in almost every project there's someone who once wrote this:
public class Pair<F,S> {
    private final F first;
    private final S second;
    public Pair(F first, S second) {
        this.first = first;
        this.second = second;
    }
    public F first() {
        return first;
    }
    public S second() {
        return second;
    }
}
When I first saw this, I thought "if it's so useful, why doesn't Java have it already?" It didn't occur to me then, but after some time I came to the conclusion why. Not only is this class useless, it actually does harm.

Most obviously, it obscures the code. pair.second() - what is it? What is this statement talking about?! Who's on first? What's on second? Things should be referenced by their meaning, not by a position inside a structure! What is this? Assembly language? [SP+8]?!

Secondly, I believe it actually hides the reluctance of a developer to name things and think about them. It's much easier to hide behind an anonymous "pair" than to think about whether this "alliance" between two objects is really needed and what it should be called.

Sometimes it is also a quick way to circumvent the restriction that a method can have only one return value. Need a second one? (Most probably an additional flag.) Make it return a pair - the original object plus a boolean! What's the meaning of this boolean? Don't bother, it's "second"! Good luck reverse-engineering what this code does!
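Compare that to a tiny, admittedly boring, result class. The scenario and every name below are invented purely for illustration:

// Instead of returning Pair<String, Boolean> from some findOrCreate() method,
// return a class whose names explain what the values mean.
public final class FindOrCreateResult {
    private final String accountId;
    private final boolean created;

    public FindOrCreateResult(String accountId, boolean created) {
        this.accountId = accountId;
        this.created = created;
    }

    public String accountId() {
        return accountId;
    }

    // wasCreated() documents the flag; second() documents nothing.
    public boolean wasCreated() {
        return created;
    }
}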

Recently I happened to see the following "masterpiece":
HashMap<String, Pair<Integer, HashMap<String,
    Pair<String, String>>>>
Notice this "operator" at the end? Amazing how robust the Java compiler is!

By the way, the above example reminds me of another idea (which probably deserves a post of its own). I'm starting to think that collections, especially maps, should probably never be used per se, in particular as return values. They should rather be wrapped into classes whose names describe the meaning of the collection, not its type signature. Don't be alarmed - in most cases the wrapper class will have to expose only a very small subset of the collection's methods. However, those methods get a chance to have much better names, explaining what purpose the collection serves in this particular context. For instance, if you have a mapping from "logical" names of something to their "physical" names (whatever that means), code like namesMapping.getPhysicalName("Attr1") reads much nicer than namesMapping.get("Attr1").
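A minimal sketch of such a wrapper, with the names taken from the example above:

import java.util.HashMap;
import java.util.Map;

// The map is an implementation detail; the class name and its methods
// describe what the mapping is for.
public class NamesMapping {
    private final Map<String, String> logicalToPhysical =
            new HashMap<String, String>();

    public void register(String logicalName, String physicalName) {
        logicalToPhysical.put(logicalName, physicalName);
    }

    // Reads much nicer at the call site than a bare get(),
    // and only the handful of methods actually needed is exposed.
    public String getPhysicalName(String logicalName) {
        return logicalToPhysical.get(logicalName);
    }
}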

Back to the "pair". Whenever one uses this "pattern", duplication of its declaration is inevitable. Imagine a method that returns a pair. The description of the pair (its two types) is declared first in the signature of the method, again at the return statement (accompanied by the new keyword) and once more every time the method is called (most probably the return value is stored in a local variable). So the same combination of types is repeated all over the place, cluttering the code.

Hopefully I convinced you to avoid it. If not, think of a more advanced class - a triple!

Added on Nov 4, 2010: The world has definitely gone crazy: http://www.javatuples.org/. As a response, see also http://thedailywtf.com/Articles/Overloaded-Help.aspx.

Monday, September 20, 2010

My "anti-patterns" - 2

Continuing the "anti-patterns" subject. By the way, "anti-patterns" are not, strictly speaking, the "anti" of "patterns". "Anti-patterns" are specifically "don't do", while patterns do not necessarily mean "do". However, it's much easier to say "don't" than to say "do". Of the Ten Commandments, the vast majority are "don't"s. Of the 613 mitzvot, 365 are negative. So let's continue the wave...

Inheritance

I just love making controversial statements! So bombastic - do not use inheritance! OK, you know it was a joke. But! I see so many cases of inheritance being used for no good reason that I'm becoming worried. It seems that the new generation of developers was raised on the object-oriented hype. Functions are passé, we should program in objects! The next (highly dangerous!) conclusion coming out of this is "when there's a common piece of code, one should use inheritance". Wrong, wrong, completely wrong!

Several horrible constructions come out of this thinking. Abstract classes full of unrelated methods placed there only to be accessible in subclasses. "This new class is almost like the one we have - let's derive it from the latter." "We have a flow that differs in some details - let's create a super-class with a couple of abstract methods and implement them in subclasses." The latter is called the "template method" pattern. Finally, there's a pure abuse of inheritance (especially interfaces) that has to do with creating small abstract classes (or interfaces) no one really uses.

Let's look at these examples one by one. In the first case there is already a hierarchy of classes (for whatever reason) and two of them need to share some code (usually a utility function). The temptation is to put this code into one of the abstract classes common to both. I guess it comes from junior developers' fear of creating new units of code - be it classes, interfaces or whole modules. It turns out this requires a certain courage (who would have thought!). So a new protected method gets created in an abstract class.

Two immediate problems come out of this. First, placing the method in the abstract class invites future users of the method to disguise themselves as members of this hierarchy of classes without any real reason (otherwise they won't get access to the useful method). Second, when one tries to refactor the abstract class out into a new infrastructure module, the offending method in many cases knows about higher-level abstractions and prevents the abstract class from being extracted.

It's generally very easy to spot these "weeds" - try putting the word 'static' before the definition of such a method. If the code still compiles, you can safely extract the method to a utility class (yes, there's nothing wrong with utility classes of static methods!).
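To illustrate with completely made-up names: a path-normalizing helper that used to sit as a protected method on some AbstractHandler passes the 'static' test, so it can move to a plain utility class where anyone can call it without joining the hierarchy:

// A sketch of the extracted "weed" - nothing in it depends on the hierarchy.
public final class PathUtils {
    private PathUtils() {
        // no instances - just static helpers
    }

    public static String normalize(String path) {
        return path.replace('\\', '/');
    }
}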

The second case is perhaps the most dangerous one. There's a class that does almost what we want? Let's extend it and override the offending method(s). This creates inheritance for no good reason. Here the most important thing to look for is a possible violation of the Liskov substitution principle. Can an instance of this new class be used in whatever context an instance of the existing class is used? Probably not. So try to refactor the existing class instead. Either make it suit your needs or move the required functionality to another place. Beware of Conway's Law.

The third case (a.k.a. the template method pattern) may be an evolution of the previous one but may also be created on its own. It's often characterized by an abstract class having methods named 'postSomething' and 'preSomething'. A very curious idea - naming a method not according to what it does but according to when it is supposed to be called - a sort of callback. However, unlike callbacks, which are supposed to react to events, there are no events here. For instance:

public abstract class AbstractCommand {
    public final void execute() {
        validate();
        doExecute();
        postProcess();
    }
    protected abstract void validate();
    protected abstract void doExecute();
    protected abstract void postProcess();
}

Wow! What code! Excuse me, what exactly does this "design" give us? It prevents a future command from forgetting to "validate"? And, as you may guess, 28 out of 30 commands have an empty 'postProcess' method. By the way, another LessAbstractCommand quickly comes along, with a 'doExecute' method doing some "common" stuff and calling 'doRealExecute'. Within a couple of levels one quickly gets lost in those "real-real" executes. Great fun to debug as well!

Of course, this is an extreme example, but it's amazing how tempting it is to "establish a flow" and make derived classes implement only "plug" methods. Next time you are considering something of the kind, think instead of an easy-to-use set of utilities, or cornerstones to be used by future implementors. It's OK if more than one of them calls the same two methods in the same sequence - no harm done. This is not the kind of code that should be reused. With the template method, however intelligent you are, you won't be able to cover all cases in your "universal" flow, and one day your successor will add an ugly "prePostSomething" method implemented in only one subclass.
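For contrast, here is a hedged sketch of the "set of utilities" alternative, with all the names invented: a plain command that calls shared helpers explicitly, in exactly the order it actually needs:

// No abstract base class, no empty postProcess(): the command simply
// calls the utilities it needs and skips the ones it doesn't.
public class RenameCommand {
    private final String oldName;
    private final String newName;

    public RenameCommand(String oldName, String newName) {
        this.oldName = oldName;
        this.newName = newName;
    }

    public void execute() {
        CommandUtils.requireNonEmpty(oldName);
        CommandUtils.requireNonEmpty(newName);
        System.out.println("Renaming " + oldName + " to " + newName);
    }
}

final class CommandUtils {
    private CommandUtils() {
    }

    static void requireNonEmpty(String value) {
        if (value == null || value.length() == 0) {
            throw new IllegalArgumentException("Empty value");
        }
    }
}

If two such commands happen to call requireNonEmpty() in the same order, so be it - that is cheaper than a "universal" flow nobody quite fits into.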

Finally! Little interfaces - one of my favorites. "Many of our objects have a name and a description - let's create a common interface (superclass) NamedEntity (AbstractNamedEntity)!" My favorite saying goes "the fact that both a kangaroo and a soldier are capable of running doesn't mean they should both implement Runnable". The main question you should ask yourself is: will there be any code that receives an instance of NamedEntity and treats it as such? Check yourself thoroughly, and if the answer is no, refrain from doing it.

I was once swearing hard when I had to study a system where every object that had a 'startup' and a 'shutdown' method was declared to implement a common interface. Of course, there was no place in the system that handled all instances of this interface. But imagine what a pain it was trying to understand where the 'startup' method of a certain object is called from. You get 15 call sites, 14 of which are irrelevant to the specific case - now go find the right one!

So - is inheritance evil? No way! As indicated in a comment to my previous post - "guns don't kill, people do". Thanks, I liked it! But it's amazing how often inheritance is abused - that's all I want to say.

Friday, September 17, 2010

My "anti-patterns" - 1

Whole books have been written about "anti-patterns" in software design, so I probably won't be saying anything new here. There are, however, several anti-patterns which I consider sort of "my favorites". I first thought about covering all of them in one post, but then I recalled one of my readers criticizing me for putting too many statements together. So I'm going to publish them one by one instead.

Visitor pattern

It took me quite some time to grasp the idea of this pattern. Perhaps I'm kind of challenged, but I still don't think something that complicated should be used unless really (and I mean really) justified. So my first problem with this construction - it's hardly readable. I hate reading code where methods have obscure names. I always get lost in the labyrinth of 'visit' and 'accept' calling each other.

However, this is not the only reason. Look at the example from the Wikipedia article.
class CarElementPrintVisitor implements CarElementVisitor {
    public void visit(Wheel wheel) {      
        System.out.println("Visiting " +
            wheel.getName() + " wheel");
    }
 
    public void visit(Engine engine) {
        System.out.println("Visiting engine");
    }
 
    public void visit(Body body) {
        System.out.println("Visiting body");
    }
 
    public void visit(Car car) {      
        System.out.println("Visiting car");
    }
}
I guess I'm going to hurt the feelings of the advocates of this pattern, but this snippet makes me wonder how the pattern made its way into the classical book on object-oriented design. Sorry, but what I really see above is nothing but a 'switch' statement nicely disguised as method overloading. Wouldn't it be more straightforward to simply use 'instanceof' instead? At least the code would be more readable and would not require those obscure 'accept' methods.
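For comparison, this is roughly what the 'instanceof' version would look like, assuming the CarElement, Wheel, Engine, Body and Car types from the snippet above:

import java.util.List;

// The same dispatch written as plain instanceof checks - no accept()/visit()
// indirection, just an unfashionable but readable "switch".
public class CarElementPrinter {
    public void print(List<CarElement> elements) {
        for (CarElement element : elements) {
            if (element instanceof Wheel) {
                System.out.println("Visiting " + ((Wheel) element).getName() + " wheel");
            } else if (element instanceof Engine) {
                System.out.println("Visiting engine");
            } else if (element instanceof Body) {
                System.out.println("Visiting body");
            } else if (element instanceof Car) {
                System.out.println("Visiting car");
            }
        }
    }
}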

Finally, my last objection has to do with modularization of the code. Normally, when extending a class, one doesn't necessarily have to place the derived class in the same module as the base class. With the visitor pattern, the family of classes derived from CarElement is a closed circle. Not only does adding a new derivative require going over all the visitors, the derivative also has to live in the same module as the rest of the family. Of course, any design assumes something about possible future changes, and the same design cannot support every direction of change. This particular one supports adding new visitors but makes it practically impossible to add new elements to be visited.

To summarize, I have the impression that the visitor pattern is an illusion. It creates more problems than it solves and unnecessarily complicates the code. The only positive side is that it sounds cool when one claims at an interview, "I have mastered the visitor pattern!"

Friday, August 20, 2010

The simplest hot-spot detection tool

How do you find a performance problem in a working system at a customer site? Those used to running sophisticated profiler tools would try to sneak one into the customer environment, modify the runtime settings, take a snapshot, then another one... All this with the customer looking over their shoulder. A virtual shoulder (fortunately there's WebEx), but still.

This week I had two support cases that proved that there is a much simpler way. The key is to take a thread dump periodically and see whether there's any pattern.

One of our customers was complaining that after the system runs for several days, its performance degrades. "What is slow?" - "Everything!". A classic hopeless case. The moment you hear this, you know the escalation is going to stay open for weeks, with fruitless attempts to understand which specific scenario is slow and a disappointed customer in the end. But before you give up, take thread dumps! The more the merrier.

After examining them for a long time without a clue, I suddenly noticed something strange. A thread that was doing some database-related activity was inside the LinkedList.remove() method, called from inside the JDBC driver. This would not be so curious if it didn't repeat itself in almost every thread dump. Every time a thread was doing something JDBC-related, the thread dump caught it in that 'remove' method.

I admit that I forgot for a moment how this 'remove' really works and was under the impression that removing from a linked list is a constant-time operation. It is because of this temporary "blackout" that I was so stunned. What are the chances that every time I take a thread dump I "catch" a thread in a method that should take nanoseconds?! Something completely bizarre was going on here.

Sure enough, after finally reading the source of LinkedList I realized that the remove was not done through an iterator and hence had to find the object to be removed first, traversing the whole list. From there the conclusion was almost obvious - there was a leak in that list. Some digging in the driver's code revealed that this list was used only when a certain connection parameter was set (that's why it didn't always happen), but these are the "gory details".
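For the record, the difference in a nutshell - a toy sketch, not the driver's code:

import java.util.Iterator;
import java.util.LinkedList;

public class LinkedListRemoveDemo {
    public static void main(String[] args) {
        LinkedList<Integer> list = new LinkedList<Integer>();
        for (int i = 0; i < 100000; i++) {
            list.add(i);
        }

        // O(n): remove(Object) scans from the head looking for an equal
        // element - exactly what the thread dumps kept catching.
        list.remove(Integer.valueOf(99999));

        // O(1) per removal: the iterator already stands on the node to unlink.
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }
}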

The lesson here is that a purely statistical approach to finding performance hot-spots is not bad at all. Imagine you are running your program in the dark and randomly flash a light that shows you where it is right now. If your hot-spot is sufficiently "hot", most of the time you should find your program inside it (by the way, if I'm not mistaken, this is how profilers' sampling mode works).
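In fact, the "random flash" can be scripted in a few lines of Java, with no profiler at all. This is a minimal sketch rather than a tool - in a real support case one would more likely take repeated jstack (or kill -3) dumps of the customer's process:

import java.util.Map;

// Periodically print the top frame of every live thread in this JVM;
// a method that keeps showing up is a likely hot-spot.
// Essentially poor man's sampling.
public class PoorMansSampler {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
            for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
                StackTraceElement[] frames = entry.getValue();
                if (frames.length > 0) {
                    System.out.println(entry.getKey().getName() + " @ " + frames[0]);
                }
            }
            System.out.println("----");
            Thread.sleep(1000); // sample once a second
        }
    }
}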

Needless to say, the second support case I mentioned was very similar. Here it was even simpler because this customer was load-testing our system, so one thread dump was sufficient. Almost every "interesting" thread was at the same place. Case solved.

Lesson learned - when learning "advanced tools", do not forget the basics. Advanced tools only save time; it doesn't mean one is helpless without them.

Friday, August 13, 2010

On Infrastructure Teams


Prologue

Top-level manager: Why did it take your developers so long to implement this feature?!
Second-level manager 1: They had to invest in the infrastructure because it wasn't flexible enough to support this feature.
Top-level manager: Hmm.. And you? Why do you have a delay?
Second-level manager 2: We discovered that our component foo had a bug - my engineers spent a day brainstorming how to tackle it.
Top-level manager: This foo component, doesn't it do just what your bar component is doing?
Second-level manager 1: Well, almost, but we don't have a common codebase.
Top-level manager: I think I have a great idea! We should build an infrastructure team! We will put all smart folks there and they will write these components once and for all! We shall save a lot of resources by doing these components only once!

[A year later]

Developer: What do you think about adding this functionality to our system?
Team lead: Excellent idea! But wait... For this, the infra component we are using must be changed. I'll see if I can ask them to add this nice feature.
Infra team lead: Sounds cool! Do you have a written spec? No? You'd better write one, so we can understand better what you need. We would also be able to show it to other teams so they could have their say. That's a great suggestion, and I even think we should make something much more general out of it! I shall check our work plans; it looks like in June we will be able to start working on this. Great, keep coming with ideas like this! And don't forget to bring the spec!

[Another year has passed]


Top-level manager: Amazing, I didn't expect you to complete this so fast!
Second-level manager: It's because we decided to stop using the infra component and our engineers made something much lighter and faster in just a week! What they wrote is exactly what we need, without the overhead that component incurred.

[Couple of months later]


Customer support manager: Wait! I remember we solved this problem a year ago - how can it have happened again?!
Team lead: Well, since then we stopped using that infra component and have implemented our own...

* * *

Sounds familiar? I believe every organization goes through cycles like this. We, software developers, hate copy-and-paste. We love reuse. We cannot stand having two similar pieces of code without trying to refactor them into a common component. And the things we especially love to write are infrastructure components!

This is how infrastructure teams are born. It seems so natural! It's cool to work in such a team! But the more the team's area of influence expands, the more rigid it becomes. If it tries to serve only one project, the others get angry at it. If it tries to satisfy everyone, it finds it difficult to change anything. No single project has any sure way of influencing its work plans. It becomes unresponsive. And then the projects find (or invent) reasons to stop using the services of this team. And then the cycle starts again.

It's interesting to note that this happens at all levels. Shall we generalize this class to support this case as well? Wouldn't that make it too cumbersome and hard to maintain? Shall we use that third-party library? It looks like it does what we need, but it does so much more! Wouldn't it incur an overhead? Shouldn't we create a group that is responsible for the various security-related components? Should we use the services of IT or maintain our own servers? Etc., etc.

To use or to develop? It's an eternal dilemma in software. As always, the art is in combining the two approaches. Choose rigorously what to adopt without succumbing to the NIH syndrome. When I face a choice between adopting a third-party component and developing one internally, I try to ask myself: "what was the primary use case the off-the-shelf component's developers had in mind?" Does it match ours? Is what we need the primary purpose of the component or not? Even if what you need is served well by the component, but it is only one of a dozen areas it covers, you probably shouldn't take it. Size generally means a steep learning curve and serious expertise, and the price may not be justified. The size and complexity of the component should match the complexity of the problem you are trying to solve.

Another indicative question is whether the problem you are trying to solve lies on your primary path. For instance, you shouldn't develop your own build framework, unless that is your primary product. You should take an existing one and try to adapt yourself to its principles. However, never compromise when it comes to your primary expertise.

So, do we need infrastructure teams? It seems that, again, there's no good answer. We do. We don't. We create them. We hate them. Open source seems to be the most viable alternative. I don't necessarily mean internet-based open source; it may be an internal one. But it has to give the consumers a way to actively contribute changes (provided they don't break existing tests) and the central authority a way to supervise. Whenever there is a monopoly on changes to a component, it provokes antagonism among its users. Especially when they are sufficiently skilled to develop a better one by themselves.

Wednesday, July 28, 2010

On the evolution of software projects

One of the things that frustrate a junior developer browsing through a large software project's codebase is the variety of approaches to solving the same problem. At first glance this seems to be a bad thing - we don't like confusion, all these different ways require separate treatment when we come to change things, and so on. Could it be otherwise, however? After many years of working on different projects I have a feeling that this is a fact of life and the best we can do is learn to live with it.

Why does it happen? Why should there be more than one way to handle the same thing in the same project? There are several reasons. First, there is a lack of communication. When a project becomes large, people split into teams and talk less to each other. Even within the same team, we don't always ask our neighbor every small question we have. On the contrary, we love to solve problems and take pride in solving them ourselves. NIH - not invented here.

Second, we all have our own opinions. Even when we know there is a solution, we don't always like it. We think we know better. Sometimes it is even true. This is how competing solutions are born. If both are valid, they start to recruit followers. Often the division is drawn along team boundaries: two teams each have their own "methodology" and argue that their approach is better than the one the neighbors use. I like to view this silent "competition" as a battle for survival between species. It promotes mutations and evolution. If one of the approaches turns out to be superior to its competitor, the team boundary doesn't help - "dissidents" start to appear who adopt the "neighbors'" views. In the end one of the approaches goes extinct.

Finally, there is a third, and the most interesting, reason for the "disorder": a planned transition. The problem with large projects is that it is practically impossible to replace an existing approach in one giant leap. Even when everyone agrees that a certain technology (or simply a way of doing things) should be replaced by a better one, the switch cannot be done overnight. Especially when there are persistent artifacts that have to be taken care of, like files in a certain format or a database structure. So we enter a transition period during which the two technologies co-exist.

There are several interesting points about such transitions. First, they take time. And when I say "time" I mean a lot of time - sometimes years. This in itself causes an interesting phenomenon: sometimes, if the transition is not pushed hard enough, a second wave starts before the first one has finished - a third approach is born and then the three of them co-exist. Since I tend to stay in projects long enough, I sometimes find myself in the position of a historian who teaches newcomers why there are so many ways of doing the same thing. Unfortunately, many times the transition is never finished. Either its initiator leaves the project and no one picks up the glove, or the cost-to-gain ratio is too high. No one wants to do the dirty job of cleaning up.

The second thing about transitions - and I shall again resort to biological analogies - is the self-preservation of the technique being eradicated. Let me explain. Most of us program "by example". We look for another place in the code doing a similar thing and mimic the solution implemented there. Now, if there are enough places using the old approach, it keeps replicating itself like a virus. Sometimes I find myself wishing there were a way to magically mark all the existing places with a comment saying "don't copy me!"

So what should we do? How do we cope with this? As I stated in the beginning, the most important thing is acceptance. Unlike single-person university projects, real projects are diverse. They are written by a bunch of people with different views, they evolve over time and they do not transform at the wave of a wand. That's one of the reasons I like them!