SmoothSpan Blog

For Executives, Entrepreneurs, and other Digerati who need to know about SaaS and Web 2.0.

Archive for the ‘enterprise software’ Category

Does the Cloud make Single-Tenancy OK for SaaS?

Posted by smoothspan on April 15, 2009

Multi-tenancy and all its flavors seem to be on people’s minds lately.  I just finished going back and forth with Phil Wainewright on some of the nuances of multi-tenancy and how it impacts the cost model for SaaS.  Phil’s post was on some of the sophisticated nuances of multi-tenancy as expressed by some of the latest announcements from the SaaS Vatican, Salesforce.com.

More recently, it has come to my attention, both through my fellow bloggers of the Enterprise Irregulars and via a trackback to my blog post, that there are those who are now saying the Cloud has made the world safe for SaaS vendors to forget multi-tenancy and plunge ahead with single-tenancy.

Not so fast! 

Color me skeptical.  If you don’t have a multi-tenant architecture, you’re going to argue it isn’t necessary.  That doesn’t make it right.  Before we get too far along, let me define multi-tenancy:

Multi-tenancy is a software architecture that allows multiple tenants to be hosted on a single box (or cluster of boxes) just as easily and economically as a single tenant could be hosted on the same configuration.

The bloggers taking the position that single tenancy is good enough hoist a variety of flags in support of their position, but for me, it boils down to answering one simple question about your SaaS business and its customers:

Fundamentally, do you have an application that can successfully run multiple tenants on a single box or not? 

If a single box has enough horsepower to run multiple customers for your app, the argument for single-tenant is completely (pardon my near-pun) untenable.
 
Salesforce runs 55,000 customers on 1,000 commodity servers.  You just aren’t going to be able to do that with a single tenant architecture no matter how much virtualization you choose to run.  If nothing else, virtualization runs afoul of a fixed cost/variable cost phenomenon very quickly.  A lot of the basic system software allocates fixed overhead, whether we’re talking about your DB server, your app server, your web server, or whatever.  Virtualization does not share the resources required for the fixed overhead, only the variable costs.  Multitenant shares the fixed overhead too.  That unshared fixed overhead puts an upper limit on the number of tenants you can shove onto a single box, no matter how small the tenant’s needs may be.
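
To make the fixed cost/variable cost point concrete, here is a minimal sketch of the capacity math.  Every number in it (box size, stack overhead, per-tenant load) is a made-up assumption for illustration, not a measurement from Salesforce or anyone else, and `tenants_per_box` is a hypothetical helper name.

```python
def tenants_per_box(box_capacity, fixed_overhead, per_tenant_load, shared_stack):
    """Return how many tenants fit on one box (all quantities in the same unit, e.g. MB of RAM).

    shared_stack=True  models multitenancy: the fixed overhead of the DB server,
                       app server, and web server is paid once for the whole box.
    shared_stack=False models one virtualized stack per tenant: every tenant
                       pays the fixed overhead again.
    """
    if shared_stack:
        return (box_capacity - fixed_overhead) // per_tenant_load
    return box_capacity // (fixed_overhead + per_tenant_load)


# Assumed, illustrative numbers only.
BOX_MB = 16384    # memory on one commodity box
STACK_MB = 1500   # fixed overhead of one DB + app + web server stack
TENANT_MB = 50    # variable load of one small tenant

print("multitenant:", tenants_per_box(BOX_MB, STACK_MB, TENANT_MB, shared_stack=True))
print("one VM per tenant:", tenants_per_box(BOX_MB, STACK_MB, TENANT_MB, shared_stack=False))
# Shrinking the tenant does not help the virtualized case: the fixed overhead caps it.
print("one VM per tiny tenant:", tenants_per_box(BOX_MB, STACK_MB, 1, shared_stack=False))
```

Under these assumptions the per-tenant-VM count barely moves when the tenant's own load shrinks, because the stack overhead dominates; that ceiling is the one described above.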
 
The new articles maintain that the Cloud fixes all this through the magic of elasticity.  Really?  That’s hogwash.  The Cloud, at best, and only if you really architected your app to take full advantage of elasticity, may help a little bit.  But most of the problem is database, and elasticity and databases so far remain a very hard technical problem to solve.  Try dynamically varying your partitioning and/or federation scheme to really scale up and down in real time in the Cloud.  It’s hard enough to get apps to scale to arbitrary Enterprise needs at all.  Try doing that in real time so you get multitenant cost savings?  Good luck!
 
So if you can’t run an average of 55 customers on each of 1000 servers like Salesforce, how many can you run without multitenant?  3?  5?  10?  What does that do to your cost versus true multitenant?  What does that do to the overhead of maintaining the servers?  What does that do to your cost of delivering the service and to the resultant cost model you have to saddle your customers with?  What does that do to you competitively if you’re up against a company that does have the true multitenant cost advantage?
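
The arithmetic behind those questions is easy to run.  This is a rough sketch that only counts servers; it assumes identical boxes and ignores everything else that goes into cost.  The densities of 3, 5, and 10 are the hypothetical figures from the paragraph above, and `servers_needed` is just an illustrative helper.

```python
def servers_needed(customers, tenants_per_server):
    """Smallest whole number of servers that can hold all customers (ceiling division)."""
    return -(-customers // tenants_per_server)


CUSTOMERS = 55_000
BASELINE = servers_needed(CUSTOMERS, 55)  # the Salesforce-style density of 55 per box

for density in (55, 10, 5, 3):
    servers = servers_needed(CUSTOMERS, density)
    print(f"{density:>2} tenants/server -> {servers:>6} servers "
          f"({servers / BASELINE:.1f}x the baseline)")
```

Even at 10 customers per box you are buying and maintaining several times the hardware of the multitenant competitor for the same customer count.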
 
One example on the cost subject that I am familiar with:  a lot of the Social Software companies wind up charging by page views or total participant seats.  In many ways, this is anathema to community where you should do everything you can to encourage participation and not penalize it.  This is especially true for outwardly facing communities where the company wants a predictable cost model and can’t imagine being charged by their continually changing customer base or especially the changing usage patterns of that base.
 
In fairness, there are some business model + company + market + architecture combinations where it wouldn’t matter because you can’t run multiples on a single box.  If you’re strictly selling to organizations that can’t run on a single box no matter what, single tenant is fine.  Perhaps this is what these other bloggers are saying, but I’m skeptical any Enterprise 2.0 app would have that requirement, and that’s the kind of software these bloggers are describing.  FWIW, Helpstream will run 150 customers and nearly 400K seats on a single box loaded to about 10% of capacity.  That’s nothing though.  Look to the Googles, Facebooks, Amazons, eBays, Twitters, and similar big web properties to see real capacity.  Now combine that kind of capacity with multiple tenants.  It is a powerful competitive position to be in.

As I say, one can imagine combinations where you couldn’t combine multiple tenants onto a single box.  A honking big transaction processing ERP app might be one.  For another, at Callidus, I had customers using as many as 150 CPUs to generate all the sales commission calculations for a huge salesforce.  To give an idea, we had telcos paying 20K sales reps and insurance companies paying 250K reps.  That’s a lot of transactions paid to a lot of people!  But those kinds of Enterprise deals are very unusual for SaaS companies, and that kind of app is pretty unusual too because of the number of transactions being processed and the complexity of the business logic.  Still, such apps could be successful in those markets with single tenancy.
 
The articles go on to talk about the advantages of being able to customize these multiple instances.  Frankly, that scares me too, because the whole SaaS model really starts to break apart there when you decide to radically customize each instance.  It may be a value add, but it is a radically different value add than SaaS.  In fact, at that point, it’s a hosted ASP model, not SaaS.  Useful for some organizations, but there is a reason that model never achieved broad market appeal. 
 
Lastly, let’s talk about the whole security business.  This is the 800lb Red Herring in the room.  The minute you go SaaS or Cloud, you have outsourced that problem.  You can listen to vendors argue all day long about which architecture is “safer”, but that is an oversimplification of the myriad factors that matter, to the point that it is just marketing and not substance.  It has as much to do with process as code architecture, which is why most of the security-related standards like SAS70 and HIPAA don’t spend a lot of time on software architecture so much as the processes that surround that software.  Don’t take my word for it.  Look at what happened to Amazon around the whole AmazonFail incident for lack of process on an area that didn’t even involve any code.  Their problem was due to a data change.

BTW, the multi-tenancy imperative gets stronger constantly due to the multicore crisis.  We no longer get faster CPUs (i.e. faster clock speeds) every 18 months according to Moore’s Law.  Instead, we get twice as many cores.  The easiest way to take advantage of more cores?  You guessed it: stack more tenants into the same box. 

For more, read Michael Dunham’s excellent post over on Haut Tec.

Posted in cloud, enterprise software, saas | 9 Comments »

If You Thought SaaS Was Annoying, The Cloud Babies Will Piss You Off!

Posted by smoothspan on January 7, 2009

I’ve been enjoying a spirited exchange with some of the Enterprise Irregulars around SaaS and Big Software for the Enterprise.  I won’t bore you with too many of the details, but we wound up in one of the classic cul de sacs these arguments often do.  Big Software was expressing their annoyance that once again incredible magic was being claimed, “Because it was SaaS.”  They were so annoyed at all the hype they perceived SaaS to be, and felt it was duping customers into believing too much in the name of SaaS.  If you read this blog at all (or have had a look at my resume), you will know I am an unabashed SaaS supporter, so when I hear someone shaking their head and bemoaning that SaaS is just a lot of hype, I spring into action.  Like any good evangelista, I launched into a long sermon about the many innovations SaaS has brought about that would be appropriate for any Enterprise (Big Software) product to adopt regardless of whether they have a SaaS offering.

As it was happening, I was surprised myself at how it was coming out.  I’m not sure I ever heard anyone say SaaS had innovations that should be copied back into on-prem software before, but as I was waxing forth on the topic, I realized it was one of those things that had been germinating in the back of my mind for quite some while.  Let’s talk about that for a minute and then I’ll get into the whole Cloud Babies thing. 

What innovations has SaaS created that others would do well to adopt?  I’m talking about product architecture and functionality here.  Largely, it boils down to the idea of making software that is flexible without requiring expensive custom SI work.  Big ERP is legendary for the amount of expensive SI work that is required to install it.  The cost of such work is extraordinary, and the price tag when that work goes awry has created some legendary scandals in the Big ERP history books.  Getting away from all that is one of the promises of SaaS, and as I was quick to point out in that debate, it’s not just hype.  The economics of SaaS won’t support the expensive SI customization work. 

So how do SaaS vendors deal with the problem?  First, let me be the first to admit that a lot of them don’t.  They just restrict the scope of their offering and you live with that.  Sometimes that means the offering can only be successful for Small or Medium sized businesses and Big Enterprises can’t make use of it.  But that’s not the best answer.  The best answer is to find a way to deliver the flexibility in a way that doesn’t require expensive custom work.  There are two ways the SaaS world tackles this–for some problems metadata is the answer, and for other problems end user-approachable self-service customization works.  Let me give some examples of each.

Metadata is literally “data about data”.  As such, it is a beautiful thing.  Let’s consider the database.  It is very common for different organizations to want to be able to customize the database to their own purposes.  Let’s say you have a record that keeps information about your customers.  A lot of this information will be common, and could be standardized.  We all want the customer’s name, their address, phone number, and perhaps a few other things.  But then there will also be a lot of things that differ from one organization to the next.  Perhaps one wants to assign a specific sales person to each customer.  Another wants to record that customer’s birthday (obviously this is a much smaller organization than the first!).  And so on.  Without metadata, each database has to be customized and changed.  With metadata, rather than changing each database, you build the idea of custom fields in, and then you can just tell the database what the custom fields will be in each case but the structure needn’t change.  Metadata is not unique to SaaS, but it is an important part of the “multitenant” concept.  It makes it possible for all those tenants to live in the same database, but still get to have all their custom fields.

Metadata can also make it possible to enable that second method for flexibility.  Customizing a database without metadata is going to require someone to get into the database, modify the schema, make sure reports are modified to deal with the new schema, make sure the schema changes don’t break the product, and on and on.  Such work is definitely the province of expensive and highly technical experts.  However, once we have metadata, we can create a simple user interface that lets almost anyone add new fields, and that handles all the rest of it automatically.  Suddenly we have made what had been a difficult and expensive technical task approachable in a self-service way by non-technical customers.  Not only that, but they can make these changes quickly and easily, and they can even iterate on them until they get it just right.
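
For the technically curious, here is a minimal sketch of the custom-field idea using Python's built-in sqlite3 module.  The table and function names are my own illustrative assumptions, not how Salesforce or any other vendor actually stores things.  The point it shows is that a tenant adding a field is just a row inserted into a metadata table, so the shared schema never changes.

```python
import sqlite3

# Hypothetical schema: standard fields live in `customer`; each tenant's custom
# fields are described in `custom_field_def` and stored in `custom_field_value`.
SCHEMA = """
CREATE TABLE customer (
    id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, name TEXT NOT NULL, phone TEXT
);
CREATE TABLE custom_field_def (
    id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, field_name TEXT NOT NULL,
    UNIQUE (tenant_id, field_name)
);
CREATE TABLE custom_field_value (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    field_id INTEGER NOT NULL REFERENCES custom_field_def(id),
    value TEXT,
    PRIMARY KEY (customer_id, field_id)
);
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_custom_field(conn, tenant_id, field_name):
    """The self-service step: declaring a field is an INSERT, not an ALTER TABLE."""
    conn.execute("INSERT INTO custom_field_def (tenant_id, field_name) VALUES (?, ?)",
                 (tenant_id, field_name))


def add_customer(conn, tenant_id, name, phone=None, **custom):
    # Resolve custom field names against this tenant's metadata before writing anything.
    field_ids = {}
    for field_name in custom:
        row = conn.execute(
            "SELECT id FROM custom_field_def WHERE tenant_id = ? AND field_name = ?",
            (tenant_id, field_name)).fetchone()
        if row is None:
            raise KeyError(f"{tenant_id} has not defined field {field_name!r}")
        field_ids[field_name] = row[0]
    cur = conn.execute("INSERT INTO customer (tenant_id, name, phone) VALUES (?, ?, ?)",
                       (tenant_id, name, phone))
    for field_name, value in custom.items():
        conn.execute("INSERT INTO custom_field_value VALUES (?, ?, ?)",
                     (cur.lastrowid, field_ids[field_name], str(value)))
    return cur.lastrowid


def get_customer(conn, customer_id):
    """Merge the standard columns with whatever custom fields the tenant defined."""
    name, phone = conn.execute("SELECT name, phone FROM customer WHERE id = ?",
                               (customer_id,)).fetchone()
    record = {"name": name, "phone": phone}
    record.update(conn.execute(
        "SELECT d.field_name, v.value FROM custom_field_value v "
        "JOIN custom_field_def d ON d.id = v.field_id WHERE v.customer_id = ?",
        (customer_id,)).fetchall())
    return record


# Two tenants share one database, each with its own custom field.
db = open_db()
add_custom_field(db, "big_co", "sales_rep")
add_custom_field(db, "corner_shop", "birthday")
big = add_customer(db, "big_co", "Globex", sales_rep="Pat")
small = add_customer(db, "corner_shop", "Sam", birthday="1970-03-14")
print(get_customer(db, big))
print(get_customer(db, small))
```

A real product would add field types, validation, and indexing on top, but a "simple user interface that lets almost anyone add new fields" can be little more than a form that calls something like `add_custom_field`.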

Hopefully you can see why making expensive “flexibility customization” easy like this is essential to SaaS.  It makes no sense to sign up for cheap monthly Software as a Service and then have to spend millions to get it customized before you can use it.  Salesforce.com and others have done a fabulous job figuring out how to deliver this kind of thing.  There were a few non-SaaS companies doing this earlier, but nobody had made it an end-to-end requirement for the whole application install experience before the SaaS world came along and its economics made it imperative.  One example of a company that did this sort of thing to good effect was Business Objects.  Its essential BI innovation was to make it possible to have the DB experts define the metadata needed to make querying the objects easy.  My old alma mater Callidus Software was another.  Our software computed Sales Compensation, which requires a lot of complex business logic.  Most of the players required expensive custom work to create comp plans, but we offered a product where business analysts could create the comp plans using formulas a lot like what you’d find in Excel.

The time is ripe, I would argue, for Big Software to be examined for opportunities to apply the same lessons.  Much Big Software is a couple generations older than the SaaS products of our time, so it isn’t surprising there should be some innovations worth looking at.  And in fact, Big Software are no dummies either.  See for example this discussion with Henning Kagermann of SAP’s changes in thinking about how to customize business processes.  Their Business By Design offering is not only a SaaS offering, but also a new generation concept for On-premises, and it is ripe with these sorts of ideas.  SAP has long been one of the customization heavyweights, but the pendulum seems to be swinging to the idea that next generation architectures might need to find ways to maintain flexibility while reducing the cost of customization. 

Adoption of these new ideas by the mainstream even outside of SaaS will be a good thing for all concerned.  But such adoption usually signals the maturation of an area, and this triggered little warning bells in my head.  If Big Software is upset and annoyed at the SaaS upstarts, who will upset and annoy the SaaS guys?  Who will unleash not just all the hype and disruption, but like SaaS, a set of innovations that SaaS, Big Software, and others will want to adopt too?  We’ve got a billion dollar SaaS leader in Salesforce, a gaggle of successful SaaS public companies still growing rapidly, an economic climate set to magnify the SaaS advantage further, and a number of exciting SaaS startups such as my own Helpstream.  The other thing is I’ve noted that when bubbles burst and everyone is wringing their hands in anguish, just as the hype from the last binge is dying down and consolidation is setting in, that’s usually when the next cycle is being born.  You just have to look around for it and it’s probably right there in plain sight.  Enter the Cloud Babies.

I call them Cloud Babies not out of any desire to denigrate, but because the Cloud is still in its infancy.  I am intentionally distinguishing SaaS from the Cloud too.  I mean the Cloud in the sense of Amazon, and perhaps Force.com.  The Cloud as a platform and a datacenter that is not only not the customer’s datacenter, but not even the software vendor’s datacenter.  I mean utility computing and everything that implies.

The Cloud Babies will be just as annoying to those not yet on the Cloud as SaaS is for those not yet selling (or buying) SaaS.  It’s going to seem ridiculously overhyped.  It’s going to seem like it isn’t real, that it won’t last, and that it will only matter to certain market segments or to small businesses but never large enterprises.  In fact, you can already read most of that out there.  But I have already seen enough of the Cloud (Helpstream moved to Amazon recently) to know that there is a lot more to it than that.  There is a kernel of hard reality to it.  The Cloud is disruptive.  It will lead to innovation.  It will lead to architecture changes that give fundamental advantages.  If you thought the Sequoia memo of doom about what startups should do in this economy was serious, they missed an important point.  Any startup running their own datacenter today is at a huge disadvantage to those who are already in the Cloud.

I saw on Twitter earlier today that Fred Wilson means to sell GOOG and AAPL tomorrow and buy AMZN.  I agree.  If the SaaS Guys were annoying, you ain’t seen nothing yet.  The Cloud Babies are really gonna piss you off!

Posted in amazon, cloud, data center, enterprise software, platforms, saas | 4 Comments »

IT is the Big Consolidator, but SaaS and Cloud Computing Could Be Equalizers

Posted by smoothspan on January 5, 2009

After sifting through the blizzard of Monday morning blog posts in Google Reader without finding much of interest (glad to hear Jobs looks to be in reasonable health), I turned to Twitter and immediately vectored onto some more interesting stuff.  The best yet was Andrew McAfee’s post on the impact of IT capital spending as a barrier to entry.  Conventional B-school wisdom is that industries mature in proportion to their capital spending.  Businesses that require a lot of capital spending have a barrier to entry, and so relatively few smaller firms can afford to play in those industries.  He gives oil exploration as one example.   But apparently IT capital spending runs completely counter to this.  The more an industry spends on IT, the more likely more businesses are going to be able to get in:

IT capital, in other words, appears to be unique in that it lowers barriers to entry rather than raising them.

What a great story for IT and our industry!  Interestingly, this study specifically excluded companies and industries that make or sell IT hardware or software, so this is the real economy, and not just the High Tech industry.  The theory for why this is so is that IT capital spending increases the efficiency of other parts of the business far in excess of the cost of the IT.   Hence overall, it makes things easier for the business.

Here is another interesting tidbit from the study:  it has been an established maxim that IT evens up the playing field for small companies versus large companies.  In other words, the right IT can make a smaller company very competitive with a larger one.  But this particular study appears to dispute that.  The more IT capital spending there is, the more the concentration of players is shifted towards big companies.  Interestingly, other kinds of capital spending favor fragmentation between large and small companies, albeit fewer of either due to the increased barriers to entry.

McAfee’s theory on this is that:

I believe that this is because modern IT increases the scope, the precision, and the fidelity with which a business innovation can be propagated throughout a company. To put it as tersely as possible, good ideas and good execution separate winners from losers, and IT helps companies execute on their good ideas (technology also helps companies generate good ideas, but that’s a subject for other posts).

I would put that another way, which is to say that IT reduces friction in an organization if well implemented, and allows a large organization to “think small” in nimbly and efficiently implementing smart strategies while growing to a larger scale.  ERP and other Enterprise Software makes it possible for Big Companies to “bottle” their Best Practices discoveries, and ensure consistent implementation of these practices through business process automation.  Another of McAfee’s great posts shows that the variation in profitability for industries that make outsized investments in IT is much greater than industries that don’t.  Put another way, there is a bigger spread in lowest to highest profitability where large IT investments are being made.  This tends to reinforce the idea that IT spending lowers the friction and enables the winners to rise more quickly over the losers.

If that’s all true, I think I see the problem for smaller companies.  Implementing the level of IT available to larger companies still becomes a barrier to entry for companies lacking the scale to undertake such expensive projects.  There is still friction there that keeps the little guys from competing effectively.  That’s where SaaS and Cloud Computing can still come in as equalizers that give the little guys a chance.

Forrester’s TEI (Total Economic Impact) ROI analysis makes the advantage of SaaS for smaller businesses more apparent.  For example, they state that for small businesses, with 100-249 employees and 50 users, SaaS has a better TEI throughout a 10 year life cycle, as well as lower cumulative costs.  For medium businesses with 250-499 employees and 100 users, this advantage falls to 7 years, largely due to a need to handle more integration and other more specialized requirements.  Somewhat larger businesses with 500-999 employees and 250 users have an advantage for 6 years in SaaS.  The largest business category in the report, businesses with 2500 employees and 500 users, still shows an advantage for SaaS out to 6 years, but it’s a pretty muddled picture where you have to look closely after about year 3 to see that advantage.

I find Forrester’s data to be a pretty convincing reason for why larger businesses have had an advantage in deploying IT, but also for why SaaS changes that picture to make it easier for smaller businesses to enjoy some of the same advantages.

Posted in business, cloud, enterprise software, saas | 1 Comment »

Helpstream in the News

Posted by smoothspan on December 24, 2008

There’s been a lot of great blogging activity around my company, Helpstream, lately. 

The latest is Paul Greenberg’s write up on CRM 2009 where he tells what really sets Helpstream apart:

Each of them is a genuine gem – in the case of Helpstream, I can’t even find a flaw. 

and:

This is my paradigm company for a CRM 2.0 feature set.  Para-digm.  They seem to have it all together.  They are the ones that I use as the example of the difference between CRM 2.0 and Web 2.0.  They are my numero uno for explaining the difference between CRM 2.0 and Web 2.0.

Thanks for the kind words, Paul!  This is exactly the kind of discussion we have with our partner Oracle, which is extremely interested in the whole “Social CRM” phenomenon.  Helpstream is, as Paul suggests, a really unique combination of traditional Customer Service technologies with some new Web 2.0 technologies that really rocks the house with new levels of ROI.

Also on deck are a couple of fabulous articles about Helpstream’s recent move to the Amazon Cloud.   Larry Dignan says we are “the blueprint” for how others can move to the cloud.  Thomas Foydell says Helpstream “moved up a whole other level” relative to other SaaS vendors like Salesforce and Netsuite by moving its datacenter into the cloud. 

They’re both great articles if you want to know more about the company that is my day job.

Posted in Web 2.0, amazon, business, cloud, enterprise software, saas | Leave a Comment »

Too Much Cash Bad for Internet and Enterprise Innovation?

Posted by smoothspan on December 22, 2008

Fascinating post by Larry Dignan where he looks at Bernstein analyst Jeffrey Lindsay’s musings.  Lindsay likens Microsoft, Google, and Yahoo to Ford, GM, and Chrysler.  His premise is that all of their cash is buying up successful Internet plays faster than VCs are funding new ones, and that this is similar to what happened in the early days of the automobile industry.

Lindsay goes on to say that he thinks having too much cash is causing these big players to do the wrong thing.  Microsoft loses $1.5B a year just to keep their hand in the Internet game, while all three are playing a cutthroat price war on advertising.  Meanwhile he thinks Google wastes too much money on inefficient internal product development.  I remember a lot of complaining back in the first dot com bubble by people like Andy Grove about how strange things get when the cost of capital falls to nearly zero.

Adding to the general blight on innovation is Lindsay’s contention that the big players don’t do anything once they’ve acquired the innovative companies and their management teams.  Not only do they not do anything, but they simply copy each other’s strategies.  Lindsay says they’re like yesterday’s unsuccessful media conglomerates, and blames this tendency for AOL and Yahoo’s downfalls.

I tend to agree with what’s been said here.  I’m not completely sure it’s bad for innovation though.  At some point, companies quit innovating as much and just focus on execution.  Provided they are acquired after that point, it may actually benefit innovation.  After all, the creative people who built the company may then go on to do something else innovative.  But it does tend to mean that the particular product, strategy, or niche plateaus and goes nowhere. 

The other thing that struck me about the article is that it applies to Enterprise software just as much as Internet software.  There are big companies like Oracle waiting for their next acquisition fish to grow big enough to be worth hooking.  Meanwhile, there are relatively few new plays being funded by VCs.  The SaaS crowd is very promising, but the dot com bubbles (there’ve been two now, haven’t there?) have starved the formation of new Enterprise plays.  In fact, the SaaS group is not very far along taking over from the perpetual license companies precisely because there are not yet great SaaS companies in every niche.

One of the things I keep waiting for is for the tech industry to show signs of maturity in understanding how to manage acquisitions.  There are some great models out there like General Electric, Johnson and Johnson, or 3M.  Most Tech Industry acquisition doesn’t have that great “collection of independent companies under one big brand” approach.  Our methods are more about milking companies that have peaked.  This is certainly a lucrative business (Oracle doesn’t do badly at all!), but I’m not sure it is as successful as what we see outside Tech.  The closest thing we have to it so far seems to be Cisco in terms of its ability to keep acquired franchises relatively vital and growing.  Does anyone know of other great examples in the Tech Industry?

Related Articles

Washington is Killing Silicon Valley:  Amen!

Posted in Web 2.0, business, enterprise software, venture | 1 Comment »

The Race for Internet Single Sign On

Posted by smoothspan on December 9, 2008

Single Sign On is a facility common in Enterprise Software that lets you sign in once (or at least use the same userid and credentials) to gain access to every piece of software, even though they may come from many different vendors.  It’s a nice time saving convenience.  There is currently a big move afoot to provide SSO (the usual abbreviation for Single Sign On) for the web itself.  Google has OpenID, Facebook has recently delivered Facebook Connect, and now there is MySpaceID.  

Who will be next?  The browser owners such as Mozilla?  SalesforceID?  Why not?  SFDC is cozying up to Google in various ways and it isn’t hard to implement SSO with the Salesforce platform.  My own company, Helpstream, supports Salesforce and OpenID (e.g. Google) SSO.  It’s a great convenience to our customers, and more importantly to our customers’ customers who use our application for Customer Service.  When it comes to security issues, why shouldn’t credit card issuers or some such get into the fray?

In the end, I can’t think of a good reason for any of these to be the dominant winner in the near future, so application vendors should support as many of them as they can.  Eventually businesses will insist on SSO.  They already have it for on-premises applications.  Who knows, maybe business will insist on it for security reasons.  That’s another factor in Enterprise use where businesses want an API that lets them rapidly shut off all the accounts for a particular user, for example, a terminated employee.  None of the current Internet SSO options support that, but we saw such functionality added to the iPhone not long ago.

Dave Winer, as channelled by Dare Obasanjo, says these standards are too complex and that points the way to a new generation.  I disagree.  It’s been easy to implement OpenID and Salesforce credentials at Helpstream, and we’re going to do Facebook next.  This is just wishful thinking from Winer and Obasanjo who abhor the idea that SSO might be locked up by one of these big players.  The lockup isn’t going to happen precisely because it is pretty easy to support more than one.  Dare also points out some good examples where you may not want a single ID identifying who you are in every web situation lest things become embarrassingly commingled.  OTOH, advertisers will love having yet another way to see whose footprints on various web pages are whose.

Keep watching the drama, and ask your software vendors to support the standards you want to use.  It’s all part of the growth and maturation process for Cloud Computing.  And be careful if you think your online presence is anonymous!

Related Articles

GigaOm:  MySpace launches MySpaceID

Posted in Web 2.0, cloud, enterprise software, platforms | 5 Comments »

One Week Later on Amazon Web Services

Posted by smoothspan on December 8, 2008

Well it’s official.  My company, Helpstream, has now been running our application entirely on Amazon Web Services for a week and we’re very happy with the result–it’s better, faster, and cheaper.  We’ve gotten a more robust system for our multitenant SaaS application that’s actually cheaper and easier for us.  Customers are reporting that the application even seems faster than it had been.  The effort involved was not too bad, though we did go through a multi-stage process before committing everything to Amazon.  I’ve chronicled that process on our corporate blog if you’re interested in seeing how such transitions are done.

Meanwhile, I can’t imagine why startups are fooling around with their own data centers.  Easy for me to say: we were too, just one short week ago!  But seriously folks, given the current economy and the fact that you can deliver a better service more easily and cheaply with Amazon, why wouldn’t you make that a high priority?

I remember sitting in our weekly staff meeting with my Products organization discussing how to phase the transition.  We’ve got quite a lot of business activity on the horizon, as well as over 120 customers using the service at present.  I was arguing for more baby steps, out of fear that we might screw something up.  My Director of Operations made the statement that when he looked at Amazon versus the sort of datacenter a startup can run, he couldn’t understand how we could afford to wait any longer than we had to.   What he meant was that the capabilities of AWS were not something we could even begin to approach any time soon.  When we took a careful look at what we were afraid of happening in a move, it turned out there was a strategy to mitigate every single risk.  So, we put together our migration plan and got on with it.  Boy were we happy we did!

Posted in amazon, cloud, ec2, enterprise software, platforms, saas | 6 Comments »

What is Twitter Good For in the Enterprise? 3 Key Use Cases

Posted by smoothspan on December 5, 2008

Some rumblings among the Enterprise Irregulars about Twitter this morning.  The usual discussion broke out between the Twitter-lovers and the I-don’t-get-Twitterers.  Being a group of Enterprise types, it was a little more focused on informed opinions and less on inflamed passions than this conversation often is, and it reminded me to write a bit about this topic which I had internalized, taken for granted, and then stopped worrying about.  Let’s just run through some uses for Twitter in the Enterprise and some reasons not to ignore it.

Parts of the Conversation Take Place on Twitter Because Some Prefer It

Whether you’re a Twitter lover or not, be aware that there is a group that wants to have their conversation there.  If you don’t connect with Twitter at all, you are going to miss out on that conversation.  Don’t assume the only thing being discussed is which fast food people had for lunch each day.  Why do people like Twitter for this conversation instead of blogs, forums, or social networks?  First, let’s just drop the “instead of”.  For many, it’s “in addition to”.  Second, I’ve written before about the idea that Learning Styles can influence how people like to consume or create content on the web.  Here is my diagram for a sort of “Myers Briggs” of the web:

If you think about the matrix, you’ll see why a lot of things on the web invite such a polarized love/hate relationship. It’s all about how people communicate and which type of web experience maps best to those preferences. It’s well understood through examples like the Myers-Briggs test that not everyone learns and communicates in the same way. If you’ve ever tried a system like Myers-Briggs, you’ll understand how much light it can shed on why two people are having a hard time communicating successfully in business. Keep in mind that the same thing can happen on the web, and if you want to be sure you are successfully communicating with, or at least listening to, every group, you have to cover every learning style.

Eventually business will realize this and they’ll create a superior web presence that checks all of the boxes on the matrix. Some are trying and getting close already.

Twitter Forces Short Responses: Ideal for Purposes Where Brevity Focuses

Getting back to the language of the Enterprise and its practitioners, have you ever heard about or employed some of the principles that can be used to make meetings or other inter-personal processes (offsites, budget planning, etc.) more efficient?  Consider messaging exercises of various kinds like creating mission statements, or key messages on a web page.  Don’t these exercises benefit when restricted to brevity?

Twitter falls into this category too.  By only allowing 140 characters, it changes the nature of the conversation that takes place.  Ask yourself what kinds of conversations are better served by only allowing 140 characters?  As a quick, special purpose brainstorming tool, I suspect there are a number of “Twitter Games” one could come up with that would be ideal.  How about the exercise of naming a product?  That seems ideal for a Twitter exchange.  Or how about working on an elevator pitch?

What about forcing brevity to summarize?  This transitions to the idea of Twitter as telemetry or news feed.  If you can scan a list in Twitter and see tinyurl clickthroughs for those that need more attention you’re being more efficient than dealing with the information in situ with the full mass of words.  That’s got to be valuable for a number of enterprise processes.

Twitter as Telemetry or News Feed

There are certain kinds of information where it is important to tell at a glance what the current status is, but to be able to go back over time and see how that has changed as well.  Think of the old-style stock tickers and news feeds.  Twitter is ideal for that purpose.  For example, I use TwitterFeed to update my Twitter stream every time I post to this blog.  That way, anyone following me sees there is a new post, sees the title, and can check it out with a tinyurl click if they like.

There are plenty of enterprise applications for an information stream that talks about what’s happening right now.   The reason I use the term “telemetry” is that Twitter can literally be viewed as a component of some larger system.  You can feed it messages (as I do with TwitterFeed, but it could be almost any corporate information source), and you can also pull the messages off Twitter via APIs to use in various ways.  Maybe you are a busy sales manager just trying to keep certain messages top of mind for your sales reps, but the messages change constantly.  They’re promotions or some such.  Build a quick and easy Twitter telemetry system where you can type the messages of the day in as needed and they appear in a window that the sales reps monitor.  At my day job (no, I don’t blog for a living!) for Helpstream, we have built Twitter into the business rule fabric of our Customer Service application.  You can use it in this telemetry fashion as you see fit for your business.  For example, it’s trivial to create a Twitter stream that would reflect every new idea submitted to our Idea Storm facility.

Conclusion

I hope I’ve given you some ideas for why Twitter could be useful in the enterprise.  There’s a lot more potential in Twitter than what I’ve covered.  What are your ideas for how to put it to work in your business?

Posted in Web 2.0, enterprise software, strategy, user interface | 8 Comments »

MySQL and BEA: Oracle and Sun Will Be At Each Other’s Throats!

Posted by smoothspan on January 16, 2008

Big news today is that Sun is buying MySQL and Oracle is buying BEA. This creates a couple of strange bedfellows to say the least. BEA is inextricably wrapped up in Sun’s Java business (is it really a business or just a hobby given the revenues it doesn’t produce?) which gives a reason for the two to get closer together. On the other hand, there is hardly a bigger threat to Oracle’s core database server business imaginable than MySQL, which has got to push the two companies further apart. What a tangled web!  Is Sun leaving Oracle to its own devices in order to pursue cloud computing?  Sure looks like it!

Let’s analyze these moves a bit. I want to start with BEA and Oracle.

As we all know, Oracle started that courtship dance not long ago and was rebuffed for not offering enough.  Amusingly, they closed almost exactly at the midpoint of the prices the two argued were “fair” at the outset.  Meanwhile, the recession is really setting in, stock prices are falling, and Oracle’s offer went up.  Since Cisco’s John Chambers mused about IT spending slowing, it has become a widely accepted article of faith that this will happen. So shall it be said, so shall it be written, Mr. Chambers. That’s a very bad thing for BEA, which is primarily selling to that market. The corporate IT market is their bread and butter for a number of reasons. Many ISV’s and web companies will look to Open Source solutions like Tomcat or JBoss with which to reduce costs. Corporate IT wants the superior support of a big player like BEA. The darker truth is that big Java seems to be falling out of favor among the bleeding edge crowd. Java itself gets a lot of criticism, but is strong enough to take it. J2EE is another matter, though there is still a huge amount of it going on. There is also the matter of the steady ascendency of RESTful architecture while BEA is one of the lynchpins of Big SOA.  There is already posturing about the importance of BEA to Oracle Fusion.  If it is so important, Fusion may be born with an obsolete architecture from day one.

The long and the short is that any competent tea leaf reader (is there any such thing?) would conclude that this was a good move for BEA to let themselves be bought before their curve has crested too much more. For Oracle’s part, it’s a further opportunity to consolidate their Big Corporate IT Hegemony and to feed their acquisition-based growth machine. I am not qualified to say whether they paid too much or not, but I do think the value curve for BEA is falling and will continue to fall post-acquisition. They are way late on the innovation curve, which looks to me like it has already fallen.  In short, BEA is a pure bean counting exercise: milk the revenue tail as efficiently as possible and then move on.  For this Oracle paid $8.5B.  Not surprisingly, even though it is a much bigger transaction, there is much less about it on the blogosphere as I write this than about the other transaction.

Speaking of which, let’s turn to the Sun+MySQL combination.  Jonathan Schwartz gets a bit artsy with his blog post introducing the acquisition, which he calls “Teach dolphins to fly.”  The metaphor is apropos.  Schwartz says that MySQL is the biggest up-and-coming database in the world of network computing (that’s how we say cloud computing without offending the dolphins that haven’t figured out how to fly yet).  What Sun will bring to the table is credibility, solidity, and support.  He talks about the Fortune 500 needing all that in the guise of:

Global Enterprise Support for MySQL – so that traditional enterprises looking for the same mission critical support they’ve come to expect with proprietary databases can have that peace of mind with MySQL, as well.

That business of “proprietary databases” means Oracle.  Jonathan just fired a good sized projectile across your bow Mr. Ellison.  What do you think of that? 

I know what I think.  Getting my tea leaf reading union card back out, I compare these two big acquisitions and walk away with a view that Oracle paid $8.5B to carve up an older steer and have a BBQ while Sun paid $1B to buy the most promising race horse to win the Kentucky Derby.  What a brilliant move for Sun!  Now they’ve united a couple of the big elements out there, Java being one and MySQL the other.  They could stand to add a decent scripting language, but unlike Microsoft’s typical tactics, they’ve learned not to ply a scorched earth policy towards other platforms, so they are peacefully coexisting until a better cohabitation arrangement comes along. 

We talked a little about the Oracle transaction being a good deal for BEA:  it’s a lucrative exit from declining fortunes.  What about MySQL?  Zack Urlocker comments about the rumor everyone knew, that MySQL had been poised to go public.  Let me tell you: this is a far better move.  Savvy private companies get right to the IPO altar, and then they find someone to buy them for a premium over what they would go out at.  What they gain in return is potentially huge.  The best possible example of this was VMware.  Now look where they are.  I will argue that would not have been possible without the springboard of EMC.  At least not this quickly.   Sun offers the same potential for MySQL.  It is truly the biggest open source deal in history.  It’s also a watershed liquidity event for a highly technical platform based offering from a sea of consumer web offerings.  The VC’s have been pretty tepid about new deals like MySQL.  Perhaps this will help more innovations to get funded.

What do others have to say about the deal?

 - Tim O’Reilly echoes the big open source and importance of database to platform themes.

 - Larry Dignan picks up on my rather combative title theme by pointing out that it puts Sun at war with the major DB vendors:  Microsoft, IBM and Oracle.  Personally, I think any overt combat will hurt those three.  The Open Source movement holds the higher moral ground and it just won’t be good PR to buck that too publicly.  Dignan sounds like he is making a little light of Schwartz’s conference call remark that it is the most important acquisition in Sun’s history, but I think that is no exaggeration on Jonathan’s part.  This is a hugely strategic move that affects every aspect of how Sun interfaces with the world computing ecosystem including its customers, many partners, and its future.  When Dignan asks what else Sun needs, I would argue a decent scripting language.  Since Google already has Python in hand, what about buying a company like Zend to get a leg up on PHP?  Last point from Larry is he asks, “If Sun makes MySQL more enterprise acceptable does that diminish its mojo with startups? Does it matter?”  Bottom line: improvements for the Enterprise in no way diminish what makes MySQL attractive to startups, providing Sun minds its manners.  So far it has been a good citizen.  With regards to, “Does it matter?”  Yes, it matters hugely.  MySQL is tapped into all the megatrends that lead to the future.  Startups are a part of that.  Of course that matters.

One other thought I’ve had:  what if Sun decides to build the ultimate database appliance?  I’m talking about order it, plug your CAT5 cable in, and forget about it.  Do for databases what disk arrays did for storage.  That seems to me a powerful combination.  Database servers require a painful amount of care and feeding to install and administer properly.  If Sun can convert them to appliances, it kills two birds with one stone.  First, it becomes a powerful incentive to buy more Sun hardware.  This will even help more fully monetize MySQL, which apparently only gets revenue from 1 in 10,000 users.  Second, it could radically simplify and commoditize a piece of the software and cloud computing fabric that is currently expensive and painful.  Such a move would be a radical revolution that would perforce drive a huge revenue opportunity for Sun.  They have enough smart people between Sun and MySQL to pull it off if they have the will.

Conclusion

Sun has made an uncannily good move in acquiring MySQL.  As Wired points out:

One company that won’t be thrilled by the news is Oracle, makers of the Oracle database which has managed to seduce a large segment of the enterprise market into the proprietary Oracle on the basis that the open source options lacked support.

With Sun backing the free MySQL option (and offering paid support) Oracle suddenly looks a bit expensive.

How else can you simultaneously lay a bet on owning a substantial piece of the computing fabric that all future roads are pointing to and send a big chill down Larry Ellison’s spine for the low low price of just $1B?  Awesome move, Jonathan!

Related Articles

VARGuy says the acquisition means Sun finally matters again.  $1B is cheap to “finally matter again!”

Posted in Open Source, Partnering, Web 2.0, business, enterprise software, platforms, saas, soa, strategy | 9 Comments »

Eventual Consistency Is Not That Scary

Posted by smoothspan on December 22, 2007

Amazon’s new SimpleDB offering, like many other post-modern databases such as CouchDB, offers massive scaling potential if users will accept eventual consistency.  It feels like a weighty decision.  Cast in the worst possible light, eventual consistency means the database will sometimes return the wrong answer in the interests of allowing it to keep scaling.  Gasp!  What good is a database that returns the wrong answer?  Why bother? 

Often waiting for the write answer (sorry, that inadvertent slip makes for a good pun so I’ll leave it in place) returns a different kind of wrong answer.  Specifically, it may not return an answer at all.  The system may simply appear to hang.

How does all this come about?  Largely, it’s a function of how fast changes in the database can be propagated to the point they’re available to everyone reading from the database.  For small numbers of users (i.e. we’re not scaling at all), this is easy.  There is one copy of the data sitting in a table structure, we lock up the readers so they can’t access it whenever we change that data, and everyone always gets the right answer.  Of course, solving simple problems is always easy.  It’s solving the hard problems that lands us the big bucks.  So how do we scale that out?  When we reach a point where we are delivering that information from that one single place as fast as it can be delivered, we have no choice but to make more places to deliver from.  There are many different mechanisms for replicating the data and making it all look like one big happy (but sometimes inconsistent) database.  Let’s look at them.

Once again, this problem may be simpler when cast in a certain way.  The most common and easiest approach is to keep one single structure as the source of truth for writing, and then replicate out changes to many other databases for reading.  All the common database software supports this.  If your single database could handle 100 users consistently, you can imagine if those 100 users were each another database you were replicating to, suddenly you could handle 100 * 100 users, or 10,000 users.  Now we’re scaling.  There are schemes to replicate the replicated and so on and so forth.  Note that in this scenario, all writing must still be done on the one single database.  This is okay, because for many problems, perhaps even the majority, readers far outnumber writers.  In fact, this works so well that we may not even use databases for the replication.  Instead, we might consider a vast in-memory cache.  Software such as memcached does this for us quite nicely, with another order of magnitude performance boost since reading things in memory is dramatically faster than trying to read from disk.

Okay, that’s pretty cool, but is it consistent?  This will depend on how fast you can replicate the data.  If you can get every database and cache in the system up to date between consecutive read requests, you are sure to be consistent.  In fact, it just has to get done between read requests for any piece of data that changed, which is a much lower bar to hurdle.  If consistency is critical, the system may be designed to inhibit reading until changes have propagated.  It takes some very clever algorithms to do this well without throwing a spanner into the works and bringing the system to its knees performance-wise.
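
To make the "is it consistent?" question concrete, here is a minimal sketch of one writer with several read replicas.  The Primary and Replica classes and the stock:widget key are made up for illustration and do not correspond to any particular database product.  It only shows the window in which replicas disagree before replication catches up.

```python
class Primary:
    """Single source of truth: every write lands here first."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Read-only copy that applies the primary's change log when asked to."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def sync(self):
        # Apply only the log entries this replica has not seen yet.
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def read(self, key):
        return self.data.get(key)


primary = Primary()
replicas = [Replica(primary) for _ in range(3)]

primary.write("stock:widget", 5)
for r in replicas:
    r.sync()

primary.write("stock:widget", 4)  # a new write arrives
replicas[0].sync()                # only one replica has caught up so far

# Readers now get different answers depending on which replica they hit.
stale = [r.read("stock:widget") for r in replicas]

for r in replicas:
    r.sync()

# Once replication finishes, every replica agrees again.
fresh = [r.read("stock:widget") for r in replicas]
```

A system that inhibits reads until changes have propagated would, in this toy model, refuse to answer between the second write and the final round of sync calls.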

Still, we can get pretty far.  Suppose your database can service 100 users with reads and writes and keep it all consistent with appropriate performance.  Let’s say we replace those 100 users with 100 copies of your database to get up to 10,000 users.  It’s now going to take twice as long.  During the first half, we’re copying changes from the Mother Server to all of the children.  The second half we’re serving the answers to the readers requesting them.  Let’s say we can keep the overall time the same just by halving how many are served.  So the Mother Server talks to 50 children.  Now we can scale to 50 * 50 = 2500 users.  Not nearly as good, but still much better than not scaling at all.  We can go 3 layers deep and have Mother serve 33 children who each serve 33 grandchildren to get to 33 * 33 * 33 = 35,937 users.  Not bad, but Google’s founders can still sleep soundly at night.  The reality is we probably can handle a lot more than 100 on our Mother Server.  Perhaps she’s good for 1000.  Now the 3-layered scheme will get us all the way to 333*333*333 = 36 million.  That starts to wake up the sound sleepers, or perhaps makes them restless.  Yet, that also means we’re using over 100,000 servers too: 1 Mother talks to 333 children who each have 333 grandchildren.  It’s a pretty wasteful scheme.
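
The back-of-the-envelope arithmetic above can be checked with a few lines of code.  This is a sketch of the simplified model in this post only: each server's capacity is split evenly across the tiers, so the fan-out is the per-server capacity divided by the number of tiers.  The function name replication_tree is mine.

```python
def replication_tree(per_server, tiers):
    """Return (fanout, users, servers) for a strictly consistent replication tree.

    per_server: how many clients (users or child servers) one server can handle.
    tiers:      number of layers the capacity is split across.
    """
    fanout = per_server // tiers            # 100 over 2 tiers -> 50, over 3 -> 33
    users = fanout ** tiers                 # readers reached at the bottom of the tree
    servers = sum(fanout ** level for level in range(tiers))  # Mother + children + ...
    return fanout, users, servers


for per_server, tiers in [(100, 2), (100, 3), (1000, 3)]:
    fanout, users, servers = replication_tree(per_server, tiers)
    print(per_server, tiers, fanout, users, servers)
```

For the 1000-user Mother Server on 3 tiers this gives 36,926,037 users (the "36 million" above, rounded down) on 1 + 333 + 110,889 = 111,223 servers, which is where "over 100,000 servers" comes from.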

Well, let’s bring in Eventual Consistency to reduce the waste.  Assume you are a startup CEO.  You are having a great day, because you are reading the wonderful review of your service in Techcrunch.  It seems like the IPO will be just around the corner after all that gushing does its inevitable work and millions suddenly find their way to your site.  Just at the peak of your bliss, the CTO walks in and says she has good news and bad news.  The bad news is the site is crashing and angry emails are pouring in.  The other bad news is that to fix it “right”, so that the data stays consistent, she needs your immediate approval to purchase 999 servers so she can set up a replicated scheme that runs 1 Mother Server (which you already own) and 999 children.  No way, you say.  What’s the good news?  With a sly smile, she tells you that if you’re willing to tolerate a little eventual consistency, your site could get by on a lot fewer servers than 999.

Suppose you are willing to have it take twice as long as normal for data to be up to date.  The readers will read just as fast, it’s just that if they’re reading something that changed, it won’t be correct until the second consecutive read or page refresh.  So, our old model that had the system able to handle 1,000 users, and replicated to 999 servers to handle 1 million users used to have to go to 3 tiers (333 * 333 * 333) to get to the next level at 36 million and still serve everything consistently and just as fast.  If we relax the “just as fast”, we can let our Mother Server handle 2,000 at half the speed to get to 2000 * 1000 = 2 million users on 3 tiers with 2000 servers instead of 100,000 servers to get to 36 million. If we run 4x slower on writes, we can get 4000*1000 = 4 million users with 4000 servers.  Eventually things will bog down and thrash, but you can see how tolerating Eventual Consistency can radically reduce your machine requirements in this simple architecture.  BTW, we all run into Eventual Consistency all the time on the web, whether or not we know it.  I use Google Reader to read blogs and WordPress to write this blog.  Any time a page refresh shows you a different result when you didn’t change anything, you may be looking at Eventual Consistency.  Even if you suspect others changed something, Google Reader still comes along frequently and says an error occurred and asks me to refresh.  It’s telling me they relied on Eventual Consistency and I have an inconsistent result.

As I mentioned, these approaches can still be wasteful of servers because of all the data copies that are flowing around.  This leads us to wonder, “What’s the next alternative?”  Instead of just using servers to copy data to other servers, which is a prime source of the waste, we could try to employ what’s called a sharded or Federated architecture.  In this approach, there is only one copy of each piece of data, but we’re dividing up that data so that each server is only responsible for a small subset of it.  Let’s say we have a database keeping up with our inventory for a big shopping site.  It’s really important to have it be consistent so that when people buy, they know the item was in stock.  Hey, it’s a contrived example and we know we can cheat on it, but go with it.  Let’s further suppose we have 100,000 SKU’s, or different kinds of items in our inventory.  We can divide this across 100 servers by letting each server be responsible for 1,000 items.  Then we write some code that acts as the go-between with the servers.  It simply checks the query to see what you are looking for, and sends your query to the correct sub-server.  Voila, you have a sharded architecture that scales very efficiently.  Our replicated model would blow out 99 copies from the 1 server, and it could be about 50 times faster (or handle 50x the users as I use a gross 1/2 time estimate for replication time) on reads, but it was no faster at all on writes.  That wouldn’t work for our inventory problem because writes are so common during the Christmas shopping season.
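
The "go-between" code can be sketched in a few lines.  This assumes SKUs are plain integers from 0 to 99,999 and that each shard is just an in-memory dictionary standing in for a real database server.  ShardedInventory and its method names are invented for this example.

```python
SKUS = 100_000
SHARDS = 100
PER_SHARD = SKUS // SHARDS  # 1,000 SKUs per server


class ShardedInventory:
    """Routes every read and write to the one shard that owns the SKU."""

    def __init__(self):
        # Each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(SHARDS)]

    def shard_for(self, sku):
        if not 0 <= sku < SKUS:
            raise KeyError(sku)
        return sku // PER_SHARD  # SKUs 0-999 -> shard 0, 1000-1999 -> shard 1, ...

    def set_stock(self, sku, qty):
        self.shards[self.shard_for(sku)][sku] = qty

    def get_stock(self, sku):
        return self.shards[self.shard_for(sku)].get(sku, 0)


inventory = ShardedInventory()
inventory.set_stock(42_500, 7)
print(inventory.shard_for(42_500), inventory.get_stock(42_500))
```

Because there is exactly one copy of each SKU's record, both reads and writes for that SKU land on a single server, which is why writes scale here when they did not in the replicated model.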

Now what are the pitfalls of sharding?  First, there is some assembly required.  Actually, there is a lot of assembly required.  It’s complicated to build such architectures.  Second, it may be very hard to load balance the shards.  Just dividing up the product inventory across 100 servers is not necessarily helpful.  You would want to use a knowledge of access patterns to divide the products so the load on each server is about the same.  If all the popular products wound up on one server, you’d have a scaling disaster.  These balances can change over time and have to be updated, which brings more complexity.  Some say you never stop fiddling with the tuning of a sharded architecture, but at least we don’t have Eventual Consistency.  Hmmm, or do we?  If you can ever get into a situation where there is more than one copy of the data and the one you are accessing is not up to date, Eventual Consistency could rear up as a design choice made by the DB owners.  In that case, they just give you the wrong answer and move on.

How can this happen in the sharded world?  It’s all about that load balancing.  Suppose our load balancer needs to move some data to a different shard.  Suppose the startup just bought 10 more servers and wants to create 10 additional shards.  While that data is in motion, there are still users on the site.  What do we tell them?  Sometimes companies can shut down the service to keep everything consistent while changes are made.  Certainly that is one answer, but it may annoy your users greatly.  Another answer is to tolerate Eventual Consistency while things are in motion with a promise of a return to full consistency when the shards are done rebalancing.  Here is a case where the Eventual Consistency didn’t last all that long, so maybe that’s better than the case where it happens a lot.
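
To get a feel for how much data is "in motion" when those 10 servers arrive, here is a rough sketch.  It assumes the simplest possible scheme: SKUs are integers and the range is just re-split evenly across however many shards exist.  Real systems may use smarter placement that moves far less data, so treat this as the naive case.

```python
import math

TOTAL_SKUS = 100_000


def shard_of(sku, shards, total=TOTAL_SKUS):
    """Even range split: each shard owns ceil(total / shards) consecutive SKUs."""
    per_shard = math.ceil(total / shards)
    return sku // per_shard


# Count the SKUs whose owning shard changes when we grow from 100 to 110 shards.
moved = sum(
    1 for sku in range(TOTAL_SKUS)
    if shard_of(sku, 100) != shard_of(sku, 110)
)
print(f"{moved:,} of {TOTAL_SKUS:,} SKUs change shards")
```

Under this naive re-split the large majority of SKUs end up on a different server, so the window in which a reader might hit the old copy is not a corner case.  That is the window where you either take the site down or tolerate some Eventual Consistency.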

Note that consistency is often in the eye of the beholder.  If we’re talking Internet users, ask yourself how much harm there would be if a page refresh delivered a different result.  In many applications, the user may even expect or welcome a different result.  An email program that suddenly shows mail after a refresh is not at all unexpected.  That the user didn’t know the mail was already on the server at the time of the first refresh doesn’t really hurt them.  There are cases where absolute consistency is very important.  Go back to the sharded database example.  It is normal to expect every single product in the inventory to have a unique id that lets us find that part.  Those ids have to be unique and consistent across all of the shards.  It is crucially important that any id changes are up to date before anything else is done or the system can get really corrupted.  So, we may create a mechanism to generate consistent ids across shards.  This adds still more architectural complexity.
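
One possible mechanism for ids that never collide across shards, offered as an illustration rather than as the way any particular system does it, is to give each shard its own interleaved slice of the number line.  ShardIdAllocator is a hypothetical name, and the scheme assumes the maximum number of shards is fixed up front.

```python
from itertools import count


class ShardIdAllocator:
    """Hands out ids of the form n * max_shards + shard_index.

    Two shards can never produce the same id, and no shard has to
    talk to any other shard to allocate one.
    """

    def __init__(self, shard_index, max_shards):
        if not 0 <= shard_index < max_shards:
            raise ValueError("shard_index must be in [0, max_shards)")
        self.shard_index = shard_index
        self.max_shards = max_shards
        self._local = count()  # per-shard counter: 0, 1, 2, ...

    def next_id(self):
        return next(self._local) * self.max_shards + self.shard_index

    @staticmethod
    def shard_from_id(item_id, max_shards):
        """Recover which shard issued an id."""
        return item_id % max_shards


allocators = [ShardIdAllocator(i, max_shards=4) for i in range(4)]
ids = [a.next_id() for a in allocators for _ in range(3)]
print(ids)
```

The trade-off is the fixed max_shards: changing it later means re-thinking every id already issued, which is exactly the kind of extra architectural complexity mentioned above.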

There are nightmare scenarios where it becomes impossible to shard efficiently.  I will oversimplify to make it easy and not necessarily correct, but I hope you will get the idea.  Suppose you’re dealing with operations that affect many different objects.  The objects are divided into shards naturally when examined individually, but the operations between the objects span many shards.  Perhaps the relationships between shards are incompatible to the extent that there is no way to shard them across machines such that every single operation doesn’t hit many shards instead of a single shard.  Hitting many shards will invalidate the sharding approach.  In times like this, we will again be tempted to opt for Eventual Consistency.  We’ll get to hitting all the shards in our sweet time, and any accesses before that update is finished will just live with inconsistent results.  Such scenarios can arise where there is no obvious good sharding algorithm, or where the relationships between the objects (perhaps it’s some sort of real time collaborative application where people are bouncing around touching objects unpredictably) are changing much too quickly to rebalance the shards.  One really common case of an operation hitting many shards is queries.  You can’t anticipate all queries such that any of them can be processed within a single shard unless you sharply limit the expressiveness of the query tools and languages.
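
Here is a small sketch of why ad hoc queries are the common many-shard case.  A lookup by SKU touches one shard, but a question like "which items are nearly out of stock?" has no SKU to route on, so it has to visit every shard and merge the results.  The shards are again plain dictionaries and scatter_gather is a name I made up.

```python
def scatter_gather(shards, predicate):
    """Run a predicate against every shard and merge the matching SKUs.

    Returns (sorted list of matching SKUs, number of shards touched).
    """
    hits = []
    touched = 0
    for shard in shards:
        touched += 1  # every shard is visited, whether or not it has a match
        hits.extend(sku for sku, qty in shard.items() if predicate(qty))
    return sorted(hits), touched


# Three tiny shards mapping SKU -> quantity in stock.
shards = [
    {1: 5, 2: 0},
    {1001: 2},
    {2001: 9},
]

low_stock, touched = scatter_gather(shards, lambda qty: qty < 3)
print(low_stock, touched)
```

With 100 shards that one query costs 100 round trips, and if the shards are being written to while it runs, the merged answer may never have been true at any single instant.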

I hope you come away from this discussion with some new insights:

-  Inconsistency derives from having multiple copies of the data that are not all in sync.

-  We need multiple copies to scale.  This is easiest for reads.  Scaling writes is much harder.

-  We can keep copies consistent at the expense of slowing everything down to wait for consistency.  The savings in relaxing this can be quite large.

-  We can somewhat balance that expense with increasingly complex architecture.  Sharding is more efficient than replication, but gets very complex and can still break down, for example. 

-  It’s still cheaper to allow for Eventual Consistency, and in many applications, the user experience is just as good.

Big web sites realized all this long ago.  That’s why sites like Amazon have systems like SimpleDB and Dynamo that are built from the ground up with Eventual Consistency in mind.  You need to look very carefully at your application to know what’s good or bad, and also understand what the performance envelope is for the Eventual Consistency.  Here are some thoughts from the blogosphere:

Dare Obasanjo

The documentation for the PutAttributes method has the following note

Because Amazon SimpleDB makes multiple copies of your data and uses an eventual consistency update model, an immediate GetAttributes or Query request (read) immediately after a DeleteAttributes or PutAttributes request (write) might not return the updated data.

This may or may not be a problem depending on your application. It may be OK for a del.icio.us style application if it took a few minutes before your tag updates were applied to a bookmark but the same can’t be said for an application like Twitter. What would be useful for developers would be if Amazon gave some more information around the delayed propagation such as average latency during peak and off-peak hours.

Here I think Dare’s example of Twitter suffering from Eventual Consistency is interesting.  In Twitter, we follow micro-blog postings.  What would be the impact of Eventual Consistency?  Of course it depends on the exact nature of the consistency, but let’s look at our replicated reader approach.  Recall that in the Eventual Consistency version, we simply tolerate that we allow reads to come in so fast that some of the replicated read servers are not up to date.  However, they are up to date with respect to a certain point in time, just not necessarily the present.  In other words, I could read at 10:00 am and get results on one server that are up to date through 10:00 am and on another results only up to date through 9:59 am.  For Twitter, depending on which server my session is connected to, my feeds may update a little behind the times.  Is that the end of the world?  For Twitter users, if they are engaged in a real time conversation, it means the person with the delayed feed may write something that looks out of sequence to the person with the up to date feed whenever the two are in a back and forth chat.  OTOH, if Twitter degraded to that mode rather than taking longer and longer to accept input or do updates, wouldn’t that be better?

Erik Onnen

Onnen wrote a post called “Socializing Eventual Consistency” that has two important points.  First, many developers are not used to talking about Eventual Consistency.  The knee jerk reaction is that it’s bad, not the right thing, or an unnecessary compromise for anyone but a huge player like Amazon.  It’s almost like a macho thing.  Onnen lacked the right examples and vocabulary to engage his peers when it was time to decide about it.  Hopefully all the chatter about Amazon’s SimpleDB and other massively scalable sites will get more familiarity flowing around these concepts.  I hope this article also makes it easier.

His other point is that when push comes to shove, most business users will prefer availability over consistency.  I think that is a key point.  It’s also a big takeaway from the next blog:

Werner Vogels

Amazon’s CTO posted to try to make Eventual Consistency and its trade offs more clear for all.  He lays a lot of good theoretical groundwork that boils down to explaining that there are tradeoffs and you can’t have it all.  This is similar to the message I’ve tried to portray above.  Eventually, you have to keep multiple copies of the data to scale.  Once that happens, it becomes harder and harder to maintain consistency and still scale.  Vogels provides a full taxonomy of concepts (i.e. Monotonic Write Consistency et al) with which to think about all this and evaluate the trade offs.  He also does a good job pointing out how often even conventional RDBMS’s wind up dealing with inconsistency.  Some of the best (and least obvious to many) examples include the idea that your mechanism for backups is often not fully consistent.  The right answer for many systems is to require that writes always work, but that reads are only eventually consistent.

Conclusion

I’ve covered a lot of consistency related tradeoffs involved in database systems for large web architectures.  Rest assured, that unless you are pretty unsuccessful, you will have to deal with this stuff.  Get ahead of the curve and understand for your application what the consistency requirements will be.  Do not start out being unnecessarily consistent.  That’s a premature optimization that can bite you in many ways.  Relaxing consistency as much as possible while still delivering a good user experience can lead to radically better scaling as well as making your life simpler.  Eventual Consistency is nothing to be afraid of.  Rather, it’s a key concept and tactic to be aware of.

Personally, I would seriously look into solutions like Amazon’s SimpleDB while I was at it.

Posted in amazon, data center, enterprise software, grid, platforms, soa, software development | 6 Comments »