Biggest Post Ever Redux: NoSQL as a More Flexible Solution?
Posted by Bob Warfield on July 23, 2011
Thanks to Reddit, HackerNews, and a host of other sources, my post on NoSQL being a Premature Optimization just became the biggest post ever for Smoothspan Blog. Thanks to all for reading!
I’m actually surprised at how little argument the post has gotten. The best comeback has been that NoSQL is not just about scaling. You can see some of that sort of response in the comments for the original post.
The “it’s not about scaling” argument boils down to it being easier to model some kinds of problems with NoSQL than with relational because the tools and model are more flexible. To this, I can only respond, “yeah maybe, but was modeling really the hard part of what you’re doing?”
I’ve modeled a lot of things in relational. Some of them were very arbitrary and had little to do with the hardcore relational way of thinking. Come to think of it, most were pretty arbitrary. More than one commenter suggested that the existence of object-relational mapping layers is a clear indication of how painful relational can be. But it sure doesn’t feel that way if you’ve done a lot of it. It seems like the usual sort of shoehorning of some arbitrary notion into a data structure that we deal with all the time in Computer Science. There are lots of good, proven tools for it. I’ve built mapping layers too, and even that wasn’t all that hard. Adrian Cockcroft from Netflix left one of the very first comments and suggested it was hard to beat the productivity of Ruby on Rails with MySQL for a small team. That’s a case where the mapping layer became integral to the fabric of the framework, and one I’d love to see happen more often given how fundamental persistent storage is to a lot of problems. One could even argue that the fundamental thing that set Ruby on Rails apart was making persistence a first-class problem they wanted to solve up front. Maybe there is another Ruby on Rails success story just waiting for a NoSQL tool to get crossed with some up-and-coming dynamic language.
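To make the “mapping layers aren’t all that hard” point concrete, here is a minimal sketch of one. It is only an illustration, not anything from the original post or from Rails: it assumes Python with the standard-library sqlite3 module, and the `User` class and the `create_table`, `save`, and `load_all` helpers are hypothetical names invented for the example.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple


@dataclass
class User:
    # Hypothetical example object; any flat dataclass would do.
    id: int
    name: str
    email: str


def table_name(cls):
    # Assumption: one table per class, named after the class.
    return cls.__name__.lower()


def create_table(conn, cls):
    # Column names come from the dataclass fields. SQLite accepts
    # untyped columns, so no type mapping is attempted here.
    cols = ", ".join(f.name for f in fields(cls))
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table_name(cls)} ({cols})")


def save(conn, obj):
    # Values are bound as parameters; only identifiers (which come
    # from our own code, not user input) are interpolated.
    placeholders = ", ".join("?" for _ in fields(obj))
    conn.execute(
        f"INSERT INTO {table_name(type(obj))} VALUES ({placeholders})",
        astuple(obj),
    )


def load_all(conn, cls):
    # Rows come back in column-definition order, which matches the
    # dataclass field order, so each row can rebuild an object.
    rows = conn.execute(f"SELECT * FROM {table_name(cls)}").fetchall()
    return [cls(*row) for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn, User)
    save(conn, User(1, "Ada", "ada@example.com"))
    print(load_all(conn, User))
```

A real mapping layer also has to deal with relationships, schema changes, and type conversion, which is exactly the ground that mature frameworks have already covered.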
Go back to my original post on NoSQL and go through some of the Netflix materials. Some of the problems they had to solve in NoSQL were modeling problems too, BTW. The difference is that the warts and edge cases for relational are pretty well understood by now. You don’t have to invent your own solutions (as the Netflix people did for things like NULL handling). There are 6 or 8 to choose from, just waiting to be Googled.
But this all ignores my question about whether modeling was really the hard part. I don’t think it is, though developers love to think of the up-front “minimum best fit to their design vision” as the hard part. Having been through 6 startups now, I can say the hard part is all the stuff that isn’t written down. It’s the problems that pop up when things just don’t work, don’t work as expected, stop working, work too slowly, and generally just piss you off for no good reason you could have predicted in advance. They pop up in the middle of the night, late in the project, after customers get hold of the software, and when there is no turning back. They show up in spades when you hire new people and don’t have time to train them, so they just have to figure it out on their own. Such problems extend far beyond mere development of a prototype and well into the ops and day-to-day care and feeding of a successful system. They are problems born of a lack of maturity. They get gradually burnished away over time in mature toolsets as bugs are fixed, experience spreads, and patterns and know-how are disseminated to the community. NoSQL (let alone NewSQL) is just not old enough to be there yet relative to relational. Give it a minimum of 5 years and more likely 10. Companies like Netflix are helping make it happen as we speak. When there are 10 Netflixes that have all built big projects that are wildly different and that in total have involved over 1,000 developers, then we’ll have a start on it.
Meanwhile, you have a startup or a project that needs doing. If you have a cadre who have already been through NoSQL in a prior startup or project, they may have the experience and scar tissue to make an informed decision about it. They represent a localized burnishing of the worst problems away. If you haven’t ever done more than read articles and tinker with toy projects, why would you risk your important project playing around with this new technology? What do you hope to gain when there are proven solutions already at hand? Do you expect the Silver Bullet that will magically cut your development time in half? Do you think your startup or project is so easy you have the luxury to experiment with additional risk?