70% of the Software You Build is Wasted (Part 1 of Series of Tool/Platform Rants)
Posted by Bob Warfield on September 4, 2007
The headline gives a gruesome statistic, but it is probably understated. At least 70% of the software you build is wasted because you are constantly reinventing the wheel by building components that do not deliver any competitive differentiation to your offering. You have to build them for the offering to work, but they’re things that everyone in your space also provides. Often, they are things that everyone in every space provides.
I was having lunch with a CTO friend recently and broached this subject to him in a deliberately provocative way. After he got past my delivery, he sighed and commented that he agreed. Every new job he takes requires reinventing the same wheels all over again. Another friend who is a marketing guy had exactly the same reaction even though he isn’t a techie. He knew exactly how much was being invested in Engineering to build stuff that he couldn’t put in a press release or otherwise tout. He referred to this work as a tax on innovation. The non-differentiated stuff would just barely be average if you did it extremely well. It would be average because it didn’t matter that it be any better than average: it wasn’t a competitive differentiator. Therefore you couldn’t afford to make it better than average if you were focusing your business properly.
I find it incredibly bleak to consider that 70% of the lines of code being written will be average at best and will likely make no difference to the business.
Consider the example of Security to give a flavor for what I’m talking about. In the Enterprise World where I’ve spent a lot of my career, Security is a key area of functionality that CIOs and IT guys want to know about before they’ll even think about buying your product. There is a long list of features and questions they have to be briefed on. How will you interface with our LDAP server? How will you deliver single sign-on under our portal standard? Do you support our (1 of 257 distinct formulas we’ve seen so far) exact recipe for how we want passwords and cookies to be handled by thin clients? All of the real offerings need to have good answers for all of these questions, so solving these problems gives you no proprietary advantage; it’s just part of the cost of doing business.
For SaaS companies, the tax is even higher. At a perpetual software company, you install inside the firewall. Many security issues for SaaS are taken for granted once you’re inside the firewall. There is a lot of IT glue already in place to enforce things like password policies (it has to change every 90 days except when there is a Harvest Moon, there must be at least 13.5 characters, 1 of which is a symbol from the Greek alphabet, 3 of which are the square root of your Mother’s first pet’s name, yada, yada, yada) and to let other vendors shoulder some of the burden for things like monitoring whether the software is functioning properly. As a SaaS vendor, you have to build all this glue that your perpetual peers take for granted. Is it any wonder that the common wisdom has become that SaaS takes more investment capital?
There are many more examples:
- Forms and UI: Most UI is really pretty similar, with a slightly different application of surface level cosmetics for branding. This is a good thing, because it means you understand the vocabulary if not the language when you see the UI of most web software. Yet, a staggering amount of work goes into recreating this wheel over and over again.
- Database Connections and Persistence: The data goes in a database in most cases, whether we’re talking MySQL or Oracle. Isn’t it amazing how much effort still has to go into this connection lo these many years since E.F. Codd postulated relational databases? And how long have we known that object oriented languages need a way to put objects into the database and then get them back out, something we call a “persistence layer”? Yet, we frequently have to build or at least extensively modify some component to make this happen.
- Reporting, Messaging, Data Feeds, yada, yada, yada. The list of things companies build that already exist in some form or fashion is huge. NIH is alive and well in today’s Software 2.0 world.
- Scalability: Caches and similar contrivances get recreated over and over again as traffic builds up on your web site.
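The scalability bullet is a case where the wheel often ships with the language already. As a minimal sketch in Python, assuming a hypothetical `load_profile` function standing in for an expensive database or service call (neither it nor the `CALLS` counter comes from any real product), the standard library's `functools.lru_cache` gives you an in-process cache in one line:

```python
from functools import lru_cache

# Hypothetical counter, only here so we can observe how often the
# "expensive" function body actually runs.
CALLS = {"count": 0}


@lru_cache(maxsize=1024)  # standard-library memoizing cache, least recently used eviction
def load_profile(user_id: int) -> str:
    """Stand-in for a slow database or service lookup (hypothetical)."""
    CALLS["count"] += 1
    return f"profile-for-user-{user_id}"


if __name__ == "__main__":
    load_profile(7)  # first call runs the function body
    load_profile(7)  # second call is served from the cache
    print(CALLS["count"])
    print(load_profile.cache_info())
```

This only covers a single process. A cache shared across web servers is a different wheel, but it too usually exists off the shelf before anyone needs to write it.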
There are many more examples that I’ll leave to the readers. How does this happen? I blame three candidates, one of which is cultural, and the other two are technological, yet enabled by the cultural quirk:
Software Developers are Producers Not Consumers of Code
Code reusability is hard, as anyone who has tried to herd the cats (developers) all together towards some form of reusability or core technology will tell you. Every developer will loudly proclaim that code must be reused. They will immediately follow this up by demanding to work in a core technology group that will produce the greatest code for sharing since sliced bread. What they’re really saying is, “Everyone should reuse code, but it has to be my code they reuse.” Take any software developer who is well regarded by his peers, collect some of his code, destroy all comments and other information that would connect the code to the star, and give the code to another engineer telling him he has to use it or maintain it. The recipient can be a star or just one of the troops in the trenches, it doesn’t matter. In 99 out of 100 cases, the recipient will loudly proclaim that the code is completely unusable and will have to be rewritten.
Programmers hate to reuse code because they hate to read and understand code. In the old days it was called “NIH” syndrome.
By now you may be thinking that “Software Developers are Doers, Not Learners”. In practice, I have not found this to be true, but I have found that the organizations the developers work for rarely invest in letting their developers learn, hence the end result is they’re stuck doing the same old thing in the same old Curly Braced Language:
The Tyranny of One-Size-Fits-All Curly Brace Languages
If everyone else’s code is crap, which we have established by now, I’d better have a language that lets me write anything, because you never know what I might have to write. I’d better have a curly brace language: a real Alpha Geek’s Power Tool for Programming. The Curly Brace Languages are C++, Java, and C#. They are all descendants of the mighty C language, which was created in 1972 in order to build Unix, an operating system. Because of this, C is considered a Systems Programming Language, although many used it to create Application Software. Because Systems Programming is about creating absolutely the gnarliest, most difficult types of software, such as operating systems and even new language compilers, it has to be able to get down to the finest levels of detail without making any assumptions. It can’t get in your way, in other words. Wikipedia puts it amusingly by saying that System Software talks to hardware while Application Software talks to people.
Unfortunately, for that 70% of wheel reinvention I mention above, not getting in your way means the Curly Braced Language also doesn’t help you much. You do all the heavy lifting! OTOH, should you need to write code that talks to hardware (does your business really need that?), the Curly Braced Language is indispensable. Blogger Russell Beattie puts it extremely well in Java Needs An Overhaul:
There’s something about the Java culture which just seems to encourage obtuse solutions over simplicity.
It isn’t the culture though; it’s the language that encourages obtuse solutions. It’s deep object oriented programming. It’s the fact that the Curly Brace Languages have become the assembly languages of our day, and there is nothing more obtuse than a big assembly language program. Don’t believe Curly Braces = Assembly Language? Consider the following strengths of Curly Braces, which are essentially the same as assembly language:
- You can talk directly to the hardware (yes, you can do Systems programming, but does your project really need to?)
- You get better performance (yes, but not in a multicore world where massive scalability not tight loops will rule)
- You might have to do some gnarly cool down-in-the-weeds Geek thing that can only be done by a Curly Brace Language (yes, but how often must you do this?)
Enough said. You can get the job done with Curly Braces, but 70% of your work is wasted because you have an electron microscope when you really needed a pair of reading glasses. Smart companies are now using more than one language, a practice called Polyglot Programming. They know the dangers of overly focusing on a single Curly Braced Language for all programming. I’ll have more to say about Polyglot Programming in a future post.
The Application Framework Tower of Babble
So what happens with my Curly Brace Language when I want to do Application Programming instead of Systems Programming? Well, unless I am crazy enough to ship an OS with my application (don’t laugh, it has been done in the past!), I need help talking to the OS that’s already in place. That’s because the Curly Braced Language is so busy not getting in your way that it doesn’t help you much either. It can talk to your hardware, but scarcely knows your operating system. So, the stuff that isn’t built in comes to you via the Application Framework (what used to be called libraries). Without an Application Framework, Curly Braced Languages can’t do much except write, “Hello, World.” Unfortunately, that’s been done and is no longer monetizable. Time to move on.
App Frameworks are supposed to be standardized so everyone can reuse that code, but fortunately, the best thing about standards is there are so many to choose from. I recently read a 4-part article about Web Application Frameworks for the Python language and lost count of how many different frameworks were mentioned. It was quite a good article, but you get a sense from it that App Frameworks can become another excuse to write more code, especially if your language allows it. The other problem with them is they are wicked hard to learn and understand. Learning to write code in C is not bad at all. The original book on C was 272 pages of clear, easy-to-read text. The original tome for learning to write in Microsoft Windows, Charles Petzold’s classic Programming Windows, was a dense 1478 pages! Did I mention learning Application Frameworks is hard? That’s more than 5 times as much reading for the framework as for the language!
Quoting Beattie’s “Java Needs An Overhaul” on Frameworks gives a flavor of what it’s like and why it’s broken:
As a Java developer, I was always so amazed at how difficult it was to use the standard Java Class Libraries for day-to-day tasks. Every app out there ends up having to include 20MB of .jars in order to get even the simplest functionality working because Java libraries are so low-level and incomplete.
What’s worse is that most of the frameworks are not that well implemented. In fact, there are no great frameworks that solve all of the 70% of problems that make up the tax we’re talking about here. This forces many large organizations to wind up writing their own framework, thereby empowering the internal crowd who wants code reuse so long as it is their code that is being reused. The proliferation of these frameworks inside large companies is so extreme that the benefits are usually lost. Hence more 70% tax burden.
Hey Wait, What Happened to Object Oriented Programming?
Yessir, making it easier to reuse code was one of the big promises of OOP. I love OOP and have loved it since encountering Smalltalk. I confess I’m an odd soul because LISP was the first programming language I learned, which raises a lot of eyebrows. I was a General in the OOP Wars where Borland C++ battled it out against Microsoft C++. I even ran a startup that sold a Modula-2 compiler of all things, for a little while in the 80’s. OOP is a powerful tool, but I have two criticisms of it. First, as it is traditionally deployed by the Curly Brace Languages, it is incredibly baroque to the point of being extremely powerful yet almost impossible to master. Experienced devotees of the Curly Braced OOP Priesthood will tell you that these constructs are exquisitely precise in letting them refactor their code along whatever architectural designs they desire, but that out of the universe of people who can write code, a much smaller universe can do object oriented programming. This is a shame, because the original concepts for object oriented programming came out of languages like Smalltalk (and Simula) that were designed to make programming approachable by anyone. There is a growing suspicion that OOP really doesn’t help productivity much at all, but I’m not yet ready to enter that camp myself. However, more than one person has said that the reusability of C++ is not significantly better than C, and I am in agreement with that.
The second issue I have is that OOP doesn’t really facilitate code reuse very well. It isn’t service oriented; it’s about controlling the fine grained behavior of objects in intricate ways. In fact, one could argue that it makes a lot of code much harder for someone to read and understand because of all the things that happen implicitly and in many and varied locations. Certainly anyone who has ever walked through a complex inheritance scenario using all the OOP bells and whistles in a Curly Braced language on code someone else wrote will tell you it was a harrowing adventure at best. The old computed GOTO in FORTRAN has nothing on OOP when it comes to the power to obscure meaning.
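To make "things that happen implicitly and in many and varied locations" concrete, here is a small sketch. It is written in Python rather than a Curly Braced language, and every class name in it is hypothetical, so treat it as an illustration of the general complaint rather than a claim about any particular codebase. Reading `Audited` on its own, you would assume `super().save()` means `Base.save()`. Once `Order` inherits from both `Audited` and `Cached`, that same call lands in `Cached` instead:

```python
class Base:
    def save(self) -> list:
        return ["Base.save"]


class Audited(Base):
    def save(self) -> list:
        # Looks like it calls Base.save(), but the real target depends on
        # the method resolution order of whichever subclass is in use.
        return ["Audited.save"] + super().save()


class Cached(Base):
    def save(self) -> list:
        return ["Cached.save"] + super().save()


class Order(Audited, Cached):
    """Hypothetical business object; defines no save() of its own."""


if __name__ == "__main__":
    # The call chain is decided by Order's method resolution order,
    # which is not visible from inside any single class body.
    print([cls.__name__ for cls in Order.__mro__])
    print(Order().save())
```

Nothing in `Order` mentions `save` at all, yet calling it runs code from three other classes in an order you can only work out by reading all of them.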
What About Open Source?
Open Source has spawned a lot of code reuse in certain areas, more so than any other trend I’ve seen in my career. Definitely more than Object Oriented Programming. I think it’s great, and I am a believer, but it has its limitations too. A lot of Open Source code is intended to be reused as-is or extensively modified. That is, it’s more like software that’s so cheap you may as well reuse it than software that is designed to be more reusable than other software. Truly reusable code would not need much modification to repurpose it, or the modification involved would be extremely trivial.
That leaves Open Source reusability down to build versus buy. If the code to be reused is sufficiently simple to just write, most developers will not select Open Source. If the code is extremely complex, and the economics or schedule do not allow for a rewrite, Open Source comes to the rescue. This tends to push Open Source code sharing more towards the grandiose and away from the prosaic. MySQL would be a heck of a thing for an app company to have to write before it could get on with developing expense reimbursement software.
Given this trade-off, I see most Open Source code reuse as being more a matter of module reuse. Some fairly large and complex Open Sourced subsystem gets packaged up with some glue code and becomes the centerpiece of an important part of an application. That’s great, but it doesn’t seem to whittle much off the 70%.
What’s the Answer?
If you want to quit wasting 70% of your efforts on software, you’re going to have to discover a way to reuse code, preferably code that (Gasp!) other people wrote. True code reuse benefits from the service oriented perspective I mentioned earlier. Forget the Curly Braced Power Tool perspective for a minute. Use the power tools to create the proprietary advantage that you currently only get to spend 30% of your time and resources on. Look for a simpler, service-oriented approach to the 70% of functionality that is undifferentiated, without making it so simple as to be unworkable. This will minimize the amount of learning your developers have to do to reuse the code components. This is why REST is rapidly becoming more popular than SOAP as a protocol for Service Oriented Architectures. It’s simpler.
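To give a rough feel for the difference in ceremony, here is a sketch that builds the same request in both styles using only Python's standard library. Nothing is sent over the network. The endpoint `api.example.com`, the `GetExpenseReport` operation, and the `urn:example:expenses` namespace are all made up for illustration; only the SOAP 1.1 envelope namespace and the `SOAPAction` header are standard.

```python
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical service, for illustration only


def rest_request(report_id: int) -> Request:
    """REST style: the resource is named by the URL, the verb is plain HTTP GET."""
    return Request(
        f"{BASE_URL}/expense-reports/{report_id}",
        headers={"Accept": "application/json"},
    )


def soap_request(report_id: int) -> Request:
    """SOAP 1.1 style: the operation and arguments travel inside an XML envelope."""
    envelope = f"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetExpenseReport xmlns="urn:example:expenses">
      <ReportId>{report_id}</ReportId>
    </GetExpenseReport>
  </soap:Body>
</soap:Envelope>"""
    return Request(
        f"{BASE_URL}/soap",
        data=envelope.encode("utf-8"),  # supplying a body makes this a POST
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"urn:example:expenses#GetExpenseReport"',
        },
    )


if __name__ == "__main__":
    rest = rest_request(42)
    soap = soap_request(42)
    print(rest.get_method(), rest.full_url)
    print(soap.get_method(), soap.full_url, len(soap.data), "bytes of XML")
```

The REST version is something a developer can read and reuse at a glance. The SOAP version asks them to learn an envelope format, a namespace scheme, and an extra header before they fetch anything.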
A lot of things succeed because they are simpler. C, in its day, was far simpler than Algol or PL/I or even COBOL. C++ was simpler than the overblown Ada. And Java simplified a lot of the issues that were on the C++ programmer’s mind. Now lately we see that scripting languages like PHP, Python, and Ruby have succeeded well because they’re simpler than the Curly Braced Languages.
There is no one-size fits all, so why not choose a couple of sizes for different occasions? Martin Fowler (author of one of my favorite books on Enterprise Patterns) puts it well when he says, “we will see multiple languages used in projects with people choosing a language for what it can do in the same way that people choose frameworks now.” Or, as the Meme Agora blog puts it, we are entering an era of Polyglot Programming.
PS: While you’re thinking about Polyglot Programming, consider that the Multicore Crisis is going to start kicking sand into a lot of the old machinery sometime soon anyway. The Curly Braced Languages will be the ones hardest hit by it because they’re closest to the cores.
Java Device Drivers from Sun: For those who don’t think you can talk to hardware in Java, here’s a detailed paper on how it works, and info on the Java Device Driver Kit (JDDK). If you’re uncomfortable reconciling the notion of device drivers running on virtual machines, Sun makes a good case for it. Their argument is that by letting the VM run as part of the kernel, you can create device drivers that are independent of the underlying CPU’s instruction set. This is particularly important to Sun who have both SPARC and x86 to worry about.