SmoothSpan Blog

For Executives, Entrepreneurs, and other Digerati who need to know about SaaS and Web 2.0.

Archive for the 'ec2' Category


Hurry, The Cloud Computing Platform Opportunity is Perishable!

Posted by smoothspan on April 7, 2008

As I write this post, many are predicting that the big announcement from Google tonight will be that it’s opening up BigTable for the world to use.  At least Kevin Burton and Mike Arrington think so.  I hope so, because the world needs a lot more cloud computing choices.  I wonder how many have figured out just how little time remains to introduce new cloud computing platforms?

Ray Ozzie has said, “[the cloud market] really isn’t being taken seriously right now by anybody except Amazon.”  He’s right on the mark:  it isn’t being taken seriously by anyone except Amazon.  The distant runner-up is Benioff’s Force.com.  I say distant because there are a lot of problems with it, not the least of which is an economic model that makes it completely untenable for anyone but big corporate IT to use.  Technically, it is a completely closed and proprietary environment that offers only minimal leverage.  It’s true, they’re very serious about it, so in that sense we should add them to the list, but the way they’re going about it makes it seem less than serious.

Here’s an important tip for various big industry players who’ve made noise about Cloud Computing at various points:  it’s a perishable opportunity!  You don’t have forever to contemplate how to get in and start winning.

Why?

Because ultimately it boils down to differentiation and commoditization like any market.  The longer you wait, the more bipolar the market becomes.  Allow Amazon to get too strong and you’ll have two choices:

-  Copy Amazon’s APIs very closely and charge a lot less.

-  Launch a radically different approach that offers big advantages in some other way.

The middle ground will be untenable.  An API or service that is only slightly better than Amazon’s but is incompatible won’t succeed.  We’ve seen this time and time again in our industry.  It’ll play out the same way here.  For a brief time everyone can be slightly different.  Then the world will discover the differences don’t matter and they’ll gravitate towards one player.  If someone already has huge momentum (e.g. Amazon), you must either be incredibly differentiated or much cheaper.  Both are pretty hard to do.

We could ask whether Amazon has already reached the stage where only those two options can fly.  I don’t think so.  Not quite anyway.  It takes longer than you’d think, although their success has been phenomenal.  My prediction is that the window to introduce a major new cloud computing platform initiative is not quite 2 years.  If you’re not out by the end of 2009, you will face a major uphill struggle.  In fact, if you’re not a great big player, the window is much shorter.

There are significant challenges for the big players to execute quickly enough:

-  Sun never seems to execute on anything quickly enough.  Sorry guys, but the company just doesn’t evolve very fast.  That’s why you’re buying properties like MySQL, right?

-  Google wants to be a precision machine, focused on squeezing margin out of a lucrative model.  What would they do, if like Amazon, they announced this thing and suddenly had more traffic to it than their core properties?  They have a history of absorbing startups and then taking a long time to get the thing to a level they feel is commensurate with their standards.  Cloud computing is in many ways worse.  They lose control and let other people’s software run inside their firewalls on their servers.

-  Microsoft is in the unenviable position the old RISC world was in against Intel.  They have to build everything themselves on their platforms.  There is no synergy with third parties.  It’s ironic really.  The Intel/Microsoft PC Keiretsu could divide and conquer, and they were so successful that even Apple finally went Intel because the others couldn’t afford to do it themselves.

-  Yahoo?  People used to talk about them in the same breath, but clearly the wheels are coming off that stagecoach.  For a big player, cloud computing is not a little investment.  Particularly now when there is quite a lot of momentum already built.  Yahoo’s bets are laid, and they’re a lame duck besides.  Count them out.

-  IBM?  Could be.  They’ve made announcements but the follow up is weak.  IBM could certainly afford to throw enough services at the problem to get it going until the technology catches up.  They can sure sell such a thing.  The biggest challenge they have is their command and control culture may never let it reach critical mass.

-  Tata et al:  Big Indian or Chinese.  Why not?  These are huge companies overseas.  They have the expertise to do quite a lot.  The Asian markets are hot, hot, hot, and they’re not that well served by Amazon.  These guys would be my bet for the odds on Dark Horse players if they get it and can get their act together.  They’re ideal as low cost providers and like IBM, they can throw service at it until they get it right.  There is surprisingly little technology required at this stage to get started at the level Amazon is at.  You need an EC2 and an S3 clone and a bit of window dressing that does something they don’t.  How about an identity system?  I’ve written about that before.  Wouldn’t you think if a service was announced business would fly to it overseas?

Meanwhile, Amazon is coming to a sort of crossroads as well.  The traffic to Amazon Web Services exceeds the traffic to the rest of their properties combined.  This is no longer a remaindering strategy for unused MIPS as many VCs I talked to late last year seemed to feel.  Amazon is now experiencing significant growth and scaling pains for the service.  EC2 just went down for about an hour for many customers.

This is both good news and bad news for Amazon.  The good news is that they’re learning how to keep these systems up and the others haven’t even started up that learning curve.  The bad news is it annoys customers mightily.

The other thing I watch Amazon for is signs they’ll offer anything with AWS that they didn’t already have to build for their core business.  The availability of something interesting and new would be a further signal that this is not just a remaindering business.  More importantly, it would be a further barrier to entry and exit around their valuable property.  As it stands, EC2, S3, and SimpleDB are pretty low level.  They do not represent big barriers.  All that is available in one form or another via Open Source to others who want to play.  Amazon’s expertise in billing and payment processing is more differentiated, but not compelling and as currently offered, very Amazon-centric.

Note to Werner Vogels:  it’s time to look for key innovations in AWS to build lock-in while you continue to make the service more robust.

Note to others:  Time is running out.  Get in the game or move on.

Note to self:  Look for a dip and buy AMZN stock.

Related Articles

Google responded well to the challenges I set forth above with App Engine.  See my blog post for more details.  By focusing on language support instead of raw virtual machines, they’ve actually raised the bar in the sort of way I keep saying Amazon needs to above and below in the comments.  I stick to my 2 year prognosis.  If you aren’t a Big Player here within 2 years, the window will close.  What Google has done is raise the ante on what you must deliver to be in the poker game.

Posted in Web 2.0, amazon, data center, ec2, platforms, saas | 12 Comments »

Amazon Raises the Cloud Platform Bar Again With DevPay

Posted by smoothspan on January 1, 2008

Wow, what an exciting time to be watching the Amazon Cloud Platform evolve.  We’re just beginning to think through the recent SimpleDB announcement when Amazon launches DevPay.  Lucid Era CEO Ken Rudin says land grabs are all about a race to the top of the mountain to plant your flag there first.  It seems like Amazon has hired a helicopter in the quest to get there first.  Google, Yahoo, and others are barely talking about their cloud platforms and here is Amazon with new developments piling up on each other.  And unlike some of the developments announced by companies like Google, this stuff is ready to go.  They’re not just talking about it.

What’s DevPay all about, anyway?  Simply put, Amazon are providing a service to automate your billing.  If you use their web services to offer a service of your own, it gives you the ability to let Amazon deal with billing for you.  It’s based off the pricing model for the rest of the Amazon Web Services like EC2 and S3, but you can use any combination of one-time charges, recurring monthly charges, and metered Amazon Web Service usage. You have total flexibility to price your applications either higher or lower than your AWS usage.  In addition, they’re promising to put everything they know about how to do e-commerce (and who knows more than Amazon?) behind making the user experience great for your customers and you.
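To make that pricing model concrete, here is a minimal sketch of how a bill built from those three components might add up.  This is not DevPay’s API: the `PricePlan` class, the `monthly_bill` function, the usage dimension names, and every price below are hypothetical, chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass, field


@dataclass
class PricePlan:
    """Hypothetical plan combining the three charge types the post lists."""
    one_time_fee: float = 0.0          # charged once, at sign-up
    monthly_fee: float = 0.0           # recurring charge
    metered_rates: dict = field(default_factory=dict)  # dimension -> price per unit


def monthly_bill(plan, usage, first_month=False):
    """Return the amount owed for one month.

    `usage` maps a metered dimension (e.g. instance hours) to the quantity
    consumed.  An unknown dimension raises KeyError rather than being
    billed silently at zero.
    """
    metered = sum(plan.metered_rates[dim] * qty for dim, qty in usage.items())
    total = plan.monthly_fee + metered
    if first_month:
        total += plan.one_time_fee
    return total


if __name__ == "__main__":
    # All numbers are made up for illustration.
    plan = PricePlan(
        one_time_fee=20.0,
        monthly_fee=5.0,
        metered_rates={"instance_hours": 0.25, "gb_stored": 0.5},
    )
    usage = {"instance_hours": 100, "gb_stored": 10}
    print(monthly_bill(plan, usage, first_month=True))
    print(monthly_bill(plan, usage))
```

Notice that the only metered inputs in a model like this are quantities of underlying service usage, marked up or down by the seller.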

It’s not a tremendously big step forward, but it’s useful.  It’s another brick in the wall.  There are companies out there providing SaaS infrastructure for whom billing is a big piece of their offering, so obviously it is a problem that people care about having solved.  What are the pros and cons of this particular approach?

Let’s start with the pros.  If you are going to use Amazon Web Services anyway, DevPay makes the process dead simple for you to get paid for your service.  It’s ideal for microISV’s as a way to monetize their creations.  The potential is there for interesting revenue that’s tied to usage in the classic SaaS way.

What about the cons?  Here there are many, depending on what sort of business you are in and how you want to be perceived by customers.  I break it down into two major concerns: flexibility and branding.  Let’s start with branding, which I think is the more important concern.  It’s not clear to me from the announcement how you would go about disassociating your offering from Amazon so that it becomes your stand-alone brand.  You and your customers are going to have to acknowledge and accept that the offering you provide is part of the Amazon collective.  Resistance is futile.  This is the moral equivalent of not being able to accept a credit card directly, and instead having to refer customers to PayPal.  It works, but it detracts from your “big time” image.  If having a big time stand-alone image is important for you, DevPay is a non-starter at this stage.  It’s not clear to me that Amazon would have to keep it that way for all time, but perhaps they need to protect their own image as well, and would insist on it.

The second major problem is flexibility.  Yes, Amazon says you can “use any combination of one-time charges, recurring monthly charges, and metered Amazon Web Service usage”.  That sounds flexible, but it casts your business in light of what resources it consumes on Amazon.  Suppose you want a completely different metric?  Perhaps you have another expense of some kind, not well correlated with your Amazon usage, that has to be built in.  Perhaps you need to do something completely arbitrary.  It doesn’t look to me like Amazon can facilitate that at present.

Both of these limitations are things Amazon could choose to clean up.  So far, the impression one gets is that Amazon is just putting a pretty face on the considerable internal resources they’ve developed for their primary business and making them available.  What will be interesting is to see what happens when (and if) Amazon is prepared to add value in ways that never mattered to their core business.  Meanwhile, they’re doing a great job stealing a march on potential competition.  As a SaaS business, they should be quite sticky.  Anyone that writes for their platform will have a fair amount of work to back out and try another platform.  DevPay is another example.  It will create network lock-in by tying your customers’ business relationship in terms of billing and payment to Amazon, and in turn tying that to your use of Amazon Web Services.  For example, that same lack of flexibility might prevent you from migrating your S3 or EC2 usage to, say, Google.  There doesn’t look to be a way for you to build the Google costs into your billing in a flexible way.

We’ll see the next 5 to 10 years be a rich period of innovation and transition to Cloud Computing Platforms.  Just as many of the original PC OS platforms disappeared (CP/M anyone?) after an initial flurry of activity, and others have changed radically in importance (it no longer matters whether you run PC or Mac, does it?), so too will there be dramatic changes here.  The beneficiaries will be users as well as the platform vendors, but it’s going to take nimbleness and prescient thinking to place all your bets exactly right.  The good news is the cost of making a mistake is far less than it had been in the era of building your own datacenters!

Related Articles

To Rule the Clouds Takes Software: Why Amazon’s SimpleDB is a Huge Next Step

Coté’s Excellent Description of the Microsoft Web Rift

Posted in amazon, data center, ec2, grid, saas, strategy | 5 Comments »

What if Twitter Was Built on Amazon’s Cloud?

Posted by smoothspan on December 18, 2007

There was recent bellyaching in the blogosphere again about Twitter being down.  Dave Winer grumbles, “What other basic form of communication goes down for 12 hours at a time?”  There are various comments, and in the end, apparently it was about their moving ISPs.  Twitter themselves had this to say:

Twitter is humming along now after a late night. Our team worked earnestly into the night and morning on our largest and most complex maintenance project ever. Everything went pretty much according to plan except for one thing: an incorrect switch.

The switch in question caps traffic an unacceptable level. In order to correct this, we’ll need to get some hardware installed. Unfortunately, that means we’re not done with our datacenter move just yet. This type of work can be frustrating but it’s all towards Twitter’s highest goal: reliability.

Such moves are never easy, they always include a hitch of some kind, and the Twitter customer base is hopelessly addicted to the medium so Twitter hears about it whenever they turn the thing off for any period of time.  I look at this and for me it’s just one more reason I wouldn’t want to own a datacenter.

Suppose your service, or maybe even Twitter, was built on Amazon’s Cloud or some other Utility Computing solution.  You don’t own the servers, you are renting them.  If loads go up, you can simply rent more in direct proportion to the loads and on 10 minutes notice.  A recent High Scalability article on scaling Twitter shows they don’t really have all that many servers:

  • 1 MySQL Server (one big 8 core box) and 1 slave. Slave is read only for statistics and reporting.
  • 8 Sun X4100s.

    10 boxes, in other words.  Now it comes time to upgrade.  Much pain and frustration.  To do it well, and without interruption, they really need 2 complete copies of their infrastructure.  This way, they can prepare the new version and start cutting users over to it while leaving the old one running.  When everyone is over, the old system can be decommissioned.  For many startups, owning twice as much hardware as they use is just out of the question.  The more successful they become, the more expensive it becomes to entertain such a luxury.  Not so on a utility computing service like Amazon’s.  Purchase the use of twice as many servers for just as long as it takes to complete a successful upgrade and then cut them loose afterward.
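A back-of-the-envelope sketch of that tradeoff, under loudly stated assumptions: the hourly rate, the purchase price per box, and the length of the upgrade window below are hypothetical placeholders, not quoted prices.  Only the 10-box count comes from the article cited above.

```python
def rental_cost(servers, hourly_rate, hours):
    """Cost of renting `servers` extra machines for `hours` hours."""
    return servers * hourly_rate * hours


def purchase_cost(servers, price_per_server):
    """Up-front cost of buying a second, duplicate set of machines."""
    return servers * price_per_server


if __name__ == "__main__":
    SERVERS = 10               # box count from the High Scalability article
    HOURLY_RATE = 0.10         # hypothetical rental price per server-hour
    UPGRADE_HOURS = 48         # hypothetical length of the cutover window
    PRICE_PER_SERVER = 3000.0  # hypothetical purchase price per box

    rented = rental_cost(SERVERS, HOURLY_RATE, UPGRADE_HOURS)
    owned = purchase_cost(SERVERS, PRICE_PER_SERVER)
    print(f"Rent a duplicate fleet for the upgrade: ${rented:,.2f}")
    print(f"Buy a duplicate fleet outright:         ${owned:,.2f}")
```

Whatever the real prices turn out to be, the shape of the comparison holds: the rented copy costs money only for the hours of the upgrade, while the owned copy is paid for whether or not it is in use.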

    There are detractors to the Amazon approach out there, but do we really think it would make Twitter much less reliable?  What if it made it much more reliable?

    Here’s another thought that runs rampant:  how well would Amazon’s new SimpleDB work for a service like Twitter?  It seems tailor-made.  Certainly the notion of a “texty” database with up to 1024 characters per field seems like a fit.  It would be fascinating to see some of the Twitterati put up a Twitter clone on Amazon’s Web Services using SimpleDB just to see how well it works and how quickly it could be put together.  Given the platform and the requirements of the application, it seems like it would not be that hard to do the experiment.  It would certainly make for an interesting test of how well Amazon’s infrastructure really works.

    Posted in Web 2.0, data center, ec2, grid, platforms | 1 Comment »

    To Rule the Clouds Takes Software: Why Amazon SimpleDB is a Huge Next Step

    Posted by smoothspan on December 15, 2007

    One Ring to rule them all, One Ring to find them,
    One Ring to bring them all and in the darkness bind them…

    J. R. R. Tolkien

    There is much interesting cloud-related news in the blogosphere.  Various pundits are sharing a back and forth on the potential for cloud centralization to result in just a very few datacenters and what that might mean.  The really big news is Amazon’s fascinating new addition to their cloud platform of SimpleDB.  Let’s talk about what it all means.

    Sun’s CTO, Greg Papadopoulos, has been predicting that the earth’s compute resources will resolve into about “five hyperscale, pan-global broadband computing services giants” — with Sun, in its version of this future scenario, the primary supplier of hardware and operations software to those giants. The last was channeled via Phil Wainewright, who goes on to ask, “What is it about a computing grid that’s inherently “more centralized” in nature?”  He feels that Nick Carr has missed the mark and swallowed Sun’s line hook, line, and sinker.  For his part, Carr’s only crime was to seize on a good story, because at the same dinner, another Sun executive, Subodh Bapat, was telling Carr that sometime soon a major datacenter failure would have “major national effects.”  The irony is positively juicy with Sun talking out both sides of their proverbial mouths.

    The tradeoff that Carr and Wainewright are worried about is one of economies of scale that favor centralization versus flexibility and resiliency that favors decentralization.  Where they differ is that Carr sees economies of scale winning in a world where IT matters less and less and Wainewright favors the superior architectural possibilities of decentralization.  Is datacenter centralization inexorable?  In a word, yes, but it may not boil down to just 5 data center owners, and it may take quite a while for the forces at work to finish this evolution.  The factors that determine who the eventual winners will be are also quite interesting, and have the potential to change a lot of landscapes that today are relatively isolated.  Let’s consider what the forces of centralization are.

    First, there is a huge migration of software underway to the cloud.  In other words, software that is never installed on your machine or in your company’s datacenter.  It resides in the cloud and comes to you via the browser.  Examples include SaaS on the business side and the vast armada of consumer Web 2.0 products such as Facebook.  No category is safe from this trend, not even traditional bastions as should be clear from the growing crop of Microsoft Office competitors that reside in the cloud.

    Second, this migration leads to centralization.  The mere act of building around a cloud architecture, even if it is a private cloud in your own company’s datacenter, leads to centralization.  After all, software is moving off your desktop and into that datacenter.  When many companies are aggregated into a single datacenter, into a SaaS multi-tenant architecture, for example, further centralization occurs.  When you offer a ubiquitous service to the masses, as is the case with something like Google, the requirements to deliver that can lead to some of the largest datacenter operations in the land. 

    Third, there are the afore-mentioned economies of scale.  Google has grown so large that it now builds its own special-purpose switches and servers to enable it to grow more cheaply.  The big web empires are all built on the notion of scaling out rather than scaling up, and they run on commodity hardware.  Because they have so many servers, automating their care and feeding has been baked into their DNA.  Not so with most corporate datacenters that are just beginning to see the fruits of crude generic technologies like virtualization that seek to be all things to all people.  Virtualization is a great next step for them, but there are bigger steps ahead yet that will further reduce costs.

    Fourth, the ultimate irony is that centralization begets centralization through network effects.  This is the story of the big consumer web properties.  Every person that joins a social network adds more value to the network than the prior person did.  The value of the network grows exponentially.  This connectedness is facilitated most easily in today’s world by centralization.  Vendors that start to get traction increase their network effects in various ways:  Amazon charges to bring data in and out of their cloud, but not to transfer between services within the cloud.

    Lastly, there are green considerations at work.  The biggest costs associated with datacenters these days are around electricity and cooling.  Microsoft is building a data center in Siberia, which is both cold and pretty central to Asia.  Consider this:  given the speed of light over a fiber connection, what is the cost of latency in having a data center somewhere far north (and cold) in Canada like Winnipeg versus far south (and hot) like Austin, Texas?  It’s 1349 miles, which, as the photon travels (186,000 miles per second) is about 7.2 milliseconds.  The world’s fastest hard drive, the nifty Mtron solid state disks I’m now coveting thanks to Engadget and Kevin Burton, can only write a paltry 80K or so bytes in that time:  not even enough for one photo at decent resolution.  So consider a ring of datacenter clusters built in colder regions.  Centralized computing is up north where the cold that computers like is nearly free for the asking: just open a window many days.  Or come closer.  Put it up on a mountain peak.  Immerse it near a hydro dam and get the juice cheaper too.  It doesn’t matter.  Laying fiber is pretty cheap compared to paying the energy bills.
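The latency arithmetic above is easy to check.  The sketch below uses the post’s own figures (1349 miles, 186,000 miles per second) and adds one assumption of mine: a refractive index of about 1.47 for glass fiber, since light travels more slowly in fiber than in a vacuum.  The function name and constants are illustrative only.

```python
SPEED_OF_LIGHT_MILES_PER_SEC = 186_000  # vacuum figure used in the post
FIBER_REFRACTIVE_INDEX = 1.47           # assumption: typical for silica fiber


def one_way_latency_ms(distance_miles, refractive_index=1.0):
    """Propagation delay in milliseconds over `distance_miles`.

    A refractive index of 1.0 models light in a vacuum; a larger value
    slows the signal proportionally, as in glass fiber.
    """
    speed = SPEED_OF_LIGHT_MILES_PER_SEC / refractive_index
    return distance_miles / speed * 1000.0


if __name__ == "__main__":
    distance = 1349  # Winnipeg to Austin, per the post
    vacuum = one_way_latency_ms(distance)
    fiber = one_way_latency_ms(distance, FIBER_REFRACTIVE_INDEX)
    print(f"vacuum: {vacuum:.2f} ms one way")   # prints 7.25
    print(f"fiber:  {fiber:.2f} ms one way")    # prints 10.66
```

Even with the slower in-fiber speed, and ignoring routing and switching delays that a real path would add, the one-way figure stays in the neighborhood of 10 milliseconds, so the argument survives the correction.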

    The next question is trickier: how do these clouds compete?  Eventually, they will become commoditized, and they will compete on price, but we are a long ways from that point.  At least 10 years or more.  Before that can happen, customers have to agree on what the essential feature sets are for this “product”.  I believe this is where software comes into play, and that should be a matter of great concern for the hosting providers of today whose expertise largely does not revolve around software as a way to add value.   As Eric Schmidt said (via Nick Carr) when he started saying Google would enter this market:

    For clouds to reach their potential, they should be nearly as easy to program and navigate as the Web. This, say analysts, should open up growing markets for cloud search and software tools—a natural business for Google and its competitors.

    Some will immediately react with, “Hold it a minute, what about the hardware?  What about the network?”  The best of the cloud architectures will commoditize those considerations away.  In fact, commoditization will start down at the bottom of the technology stack and work its way up.  The first stage of that, BTW, is already almost over.  That was the choice of CPU.  MIPS?  PowerPC?  SPARC?  No, Intel/AMD are the winners.  The others still exist (not all of them!), but they’ve peaked and are on their way down at various terminal velocities.  Their owners need to milk them for profit, but it would be a losing battle to invest there.  Even Macs now carry Intel inside, and Sun now carries the ticker symbol “JAVA”, a not-so-subtle hat tip to the importance of software.

    Hardware boxes are largely a dead issue too.  There is too little opportunity to differentiate for very long and the cpu’s dictate an awful lot of what must be done.  Dell is an assembler and marketer of the lowest cost components delivered just in time lest they devalue in inventory.  Sun still pushes package design, and it may have some relevance to centralization, but this will be commoditized because of centralization.

    Next up will be the operating system.  Again, we’re pretty far down the path of Linux.  Corporations still carry a lot of other things inside their firewalls, but the clouds will be populated almost exclusively with Linux, and we could already see that has happened if we could get reliable statistics on it.  Linux defines the base minimum of what a cloud offering has to provide:  utility computing instances running Linux.  This is exactly what Amazon’s EC2 offers.

    What else does the cloud need?  Reliable archival storage.  Again, Amazon offers this with S3.  Cloud consumers are adopting it in droves because it makes sense.  It’s a better deal than a raw disk array because it adds value versus that disk array for archival storage.  The value is in the form of resiliency and backup.  Put the data on S3 and forget about those problems.  This begins the commoditization of storage.  Is it any wonder that EMC bought VMware and that a software offering is now most of their market cap?  Hardware guys, put on your thinking caps, this will get much worse.  What software assets do you bring to the table?

    3Tera is a service I’ve talked about before that has a very similar offering available from multiple hosting partners of theirs.  They create a virtual SAN that you can backup and mirror at the click of a mouse.  They let you configure Linux instances to your heart’s content.  Others will follow.  IBM’s Blue Cloud offers much the same.  This collection is today’s blueprint for what the Cloud offers in terms of a platform.

    But, this platform is a moving target, and it will keep moving up the stack.  Amazon just announced another rung up with SimpleDB.  For most software that goes into the Cloud, once you have an OS and a file system, the next thing you want to see is a database.  Certainly when I attended Amazon Startup Project, the availability of a robust database solution was the number one thing folks wanted to see Amazon bring out.  The GM of EC2 promised me that this was on the way and that there would be several announcements before the end of the year.  First we saw the availability of EC2 instances that had more memory, disk, and cpu, so that they’d make better database hosts.  SimpleDB is much more ambitious.  It’s a replacement for the conventional database as embodied in products like MySQL and Oracle, one that was designed from the ground up to live in a cloud computing world.  At one stroke it solves a lot of very interesting problems that used to challenge would-be EC2 users around the database.

    Along the lines of my list of factors that drive data center centralization, Phil Windley says the economics are impossible to stop.  Scoble asks whether MySQL, Oracle, and SQL Server are dead:

    Since Amazon brought out its S3 storage service, I’ve seen many many startups give up data centers altogether.

    Tell me why the same thing won’t happen here.

    There is no doubt in my mind that all startups will give up having datacenters altogether before this ends.  However, before we get too carried away in assuming that SimpleDB gives us that opportunity, let’s drop back and consider what its limitations are:

    - It is similar to a relational database, but there are significant differences.  Code will have to be reworked to run there, even if it doesn’t run afoul of the other issues.

    - Latency is a problem when your database is in another datacenter from the rest of your code.  Don MacAskill brings this one up, and all I can say is that this is another network effect that leads to more centralization.  If you like SimpleDB, it’s another reason to bring all of your code inside Amazon’s cloud.

    - All fields are strings, and they are limited to 1024 characters.  Savvy developers can use the 1024 characters to point to files of unlimited size on S3, or use other methods like combining fields to get around this limit.  Mind you, a lot can be done with that, but it is again a difference from traditional RDBMS systems and it means more work for developers who must overcome the limitation.

    - There are no joins.  If you want them (and many proponents of hugely scalable sites view joins as evil), you have to roll your own.

    - Transactions and consistency are also absent.  Reads are not guaranteed to be fully up to date with writes.

    - There is no indexing and a whole host of other trappings that database aficionados have gotten comfortable with.
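To give a feel for the extra work the 1024-character limit implies, here is a minimal sketch of the “combining fields” workaround: a long string is split across several numbered attributes and stitched back together on read.  It makes no calls to SimpleDB itself; the function names and the `name.0000` attribute naming scheme are my own assumptions for illustration.

```python
MAX_ATTR_CHARS = 1024  # per-field limit described in the list above


def split_into_attributes(name, value, limit=MAX_ATTR_CHARS):
    """Split `value` into numbered attributes of at most `limit` characters.

    Returns a dict such as {"body.0000": "...", "body.0001": "..."}.
    Zero-padded suffixes keep the pieces in order when sorted as strings.
    """
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)] or [""]
    return {f"{name}.{index:04d}": chunk for index, chunk in enumerate(chunks)}


def join_attributes(name, attributes):
    """Reassemble a value previously stored by split_into_attributes."""
    prefix = name + "."
    keys = sorted(
        key for key in attributes
        if key.startswith(prefix) and key[len(prefix):].isdigit()
    )
    return "".join(attributes[key] for key in keys)


if __name__ == "__main__":
    text = "x" * 2500
    stored = split_into_attributes("body", text)
    print(sorted(stored))                           # three attribute names
    print(join_attributes("body", stored) == text)  # True
```

The alternative mentioned in the list, keeping the large object on S3 and storing only its key in a field, trades this bookkeeping for a second round trip on every read.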

    Mind you, serious web software is created within these limitations including some at Amazon itself.  In exchange for living with them, you get massively scalable database access at good performance and very cheaply.  And, as Techcrunch says, you may be able to get rid of one of the highest cost IT operations jobs around, database administration, and your costs are even lower.  Remember my analysis that shows SaaS vendors need to achieve 16:1 operations cost advantages over conventional software and you can see this is a big step in that direction already.
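Living with the no-joins limitation usually means doing the join in application code.  A minimal sketch of one way to do that, a hash join over rows already fetched from two separate queries, follows.  The rows are plain dictionaries of strings, the `hash_join` name and the sample data are hypothetical, and nothing here talks to SimpleDB.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on a shared `key` field.

    Builds an in-memory index over `right_rows`, then probes it once per
    left row.  Every row is assumed to contain `key`.  Where both sides
    share a field name, the left row's value wins.
    """
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            joined.append({**right, **left})
    return joined


if __name__ == "__main__":
    users = [
        {"user_id": "u1", "name": "ann"},
        {"user_id": "u2", "name": "bob"},
    ]
    posts = [
        {"user_id": "u1", "text": "hello"},
        {"user_id": "u1", "text": "again"},
        {"user_id": "u3", "text": "orphan"},
    ]
    for row in hash_join(posts, users, "user_id"):
        print(row["name"], row["text"])
```

The cost is that both result sets have to fit in memory and travel over the network first, which is part of why latency to the database matters so much in this model.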

    There is no doubt that cloud computing will be massively disruptive, and that Amazon are well on their way in the race to plant their flag at the top of the mountain.  The pace of progress for Amazon Web Services has been blistering this year, and much more hype free than what we’ve gotten from the likes of Google and Facebook when it comes to platform speak.  It’s almost odd that we haven’t heard more from these other players, and especially from the likes of Google.  GigaOm says that SimpleDB completes the Amazon Web Services Trifecta.  They go on to say that Amazon’s announcements have the feel of a well thought out long term strategy, while Google’s make it sound like an ad hoc grab bag of tools.  I think that’s true, and perhaps reflective of Google’s culture, which is hugely decentralized to the point of giving developers 20% free time to work on projects of their choosing.  The problem is that such a culture can more easily give us a grab bag of applications, as Google has, than it can provide a well-designed platform, as Amazon has.  Or, as Mathew Ingram puts it, while everyone else was talking about it, Amazon went ahead and did it.

    I’ve talked to a dozen or so startups that are eagerly working with the Amazon Web Services and having great success, as well as some frustrations.  They require rethinking the old ways.  Integrity issues are particularly different in this brave new world, as are issues of latency.  That matters to how a lot of folks think about their applications.  Because of the learning curve, I don’t plan to go out and short Oracle immediately, but the sand has started running in the hourglass.  There will be more layers added to the cloud, and over time it will become harder and harder to ignore.  There will be economic advantage to those who embrace the new ways, and penalties for those who don’t.  This is a bet-your-business drama that’s unfolding, make no mistake.  At the very least, you need to get yourself educated about what these kinds of services offer and what they mean for application architecture.

    Businesses located low in the stack I’ve mentioned will be hit hard if they don’t have a strategy to embrace and win a piece of the cloud computing New Deal.  We’re talking hardware manufacturers like Sun, Dell, IBM, and HP.  Software infrastructure comes next.  Applications that depend on low cost delivery, aka SaaS, are also very much in the crosshairs, although probably at a slightly later date.

    Welcome to the brave new world of utility cloud computing.  Long live the server, the server is dead!

    Related Articles

    Amazon Raises the Cloud Platform Bar Again With DevPay

    Coté’s Excellent Description of the Microsoft Web Rift :  Nice post on cloud computing at Microsoft

    Posted in Web 2.0, amazon, data center, ec2, grid, platforms, saas | 10 Comments »

    Cloud Computing in Someone Else’s Cloud: The Future

    Posted by smoothspan on November 16, 2007

    Ever hear of a fabless chip company?  This is a company that sells Integrated Circuits but owns no manufacturing facilities.  They just write software, in effect, and send it out to someone else’s fab.  Brilliant.  Many kinds of manufacturers often do the same.  After all, manufacturing may not be the distinctive competency of a company, or the company may achieve better economies of scale by using centralized manufacturing owned by much larger companies.

    This is starting to happen big time with web software.  IBM just announced they’re going to join Amazon in the cloud computing business with “Blue Cloud”.  Companies will be able to buy capacity in someone else’s cloud which they sell as their own.  No need to own any hardware or even visit a colo center.  Why would you want to own a datacenter if you didn’t have to?  Why would you think you can do it as well as Amazon or IBM?  Many others including Yahoo, Google, and Microsoft will be a part of this future.  Sun is already there with Sun Grid. 

    So far, the formulas are pretty similar.  IBM and Amazon are both Linux-based systems built on virtualization software.  At some point, if enough hardware capacity is locked up in these rental data centers, it will become an important sales channel for all server hardware manufacturers.  Take Dell for example.  They’ve always sold direct.  Shouldn’t they consider this kind of business, especially when other hardware companies are going there?  What about HP?  Look at it as a way for hardware makers to switch from the equivalent of perpetual licensing to the SaaS rental model.

    What about Microsoft?  Can .NET be as successful if they don’t build a Cloud Computing Service that is .NET based?  Seems to me this is a strategic imperative for the OS crowd lest Linux steal the show.  Sun is already there with Solaris on Sun Grid.  This is the system my old alma mater Callidus Software uses to host their SaaS solution and it works well.  IBM is not missing the chance to offer PowerPC as well as x86 servers for Blue Cloud.  IBM is also partnering with Google around Cloud Computing, so there may be all sorts of interesting bedfellows before this new paradigm is done rolling out.

    A great example that’s being written about by Scoble and others is Mogulus.  CEO Max Haot says they don’t own a single server, it’s all being done on Amazon, and yet they’re serving live video channels to 15,000 people with just over $1M in funding.  You’ve got to love it!  A number of other serverless and near serverless companies commented on Scoble’s post if you want to see more.  These big guys are not the only ones in the business.  Certainly companies like OpSource and Rackspace count too. 

    There are many potential advantages, and a few pitfalls.  First the advantages: it’s a whole lot easier and cheaper to build out your infrastructure this way.  Why have anything to do with touching or owning any real hardware?  How does that add value to your business?  The real innovators will make it easy to flex your capacity and add more servers on extremely short notice.  Take a look at your average graph of web activity:

    CNN Traffic

    This is traffic for cnn.com.  Notice how spiky it is?  Those are some big spikes.  If your web service hits one, you must either have a ton of extra servers on tap, or deal with your site getting painfully slow or going down altogether.  With a utility computing or grid service such as Amazon EC2, you can provision new servers on 10 minutes’ notice, use them until the load goes away, and then quit paying for them.  Payment is in 1 hour increments.

    I know a SaaS vendor whose load doubles predictably one week out of every month because of what his app does.  He owns twice the servers to handle this peak.  He’s growing fast enough at the moment that he doesn’t sweat it much, but at some point, he could really benefit by flexing capacity.
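
    To put rough numbers on that, here is a minimal sketch comparing the two approaches.  It assumes a four-week month, a hypothetical baseline of 10 servers, and a rental rate of 10 cents per server hour (the EC2 small-instance price) as a stand-in for what a server costs to keep running.  None of these figures come from the vendor in question.

```python
HOURS_PER_WEEK = 7 * 24
CENTS_PER_SERVER_HOUR = 10  # assumption: EC2 small-instance price used as a stand-in


def fixed_cost_cents(baseline_servers, weeks=4):
    """Keep peak capacity (twice the baseline) running for the whole period."""
    return 2 * baseline_servers * weeks * HOURS_PER_WEEK * CENTS_PER_SERVER_HOUR


def flexed_cost_cents(baseline_servers, weeks=4, peak_weeks=1):
    """Run the baseline all period and rent the extra servers only for the peak."""
    baseline_hours = baseline_servers * weeks * HOURS_PER_WEEK
    burst_hours = baseline_servers * peak_weeks * HOURS_PER_WEEK
    return (baseline_hours + burst_hours) * CENTS_PER_SERVER_HOUR


if __name__ == "__main__":
    fixed = fixed_cost_cents(10)    # hypothetical baseline of 10 servers
    flexed = flexed_cost_cents(10)
    print(f"fixed:  ${fixed / 100:.2f}")
    print(f"flexed: ${flexed / 100:.2f}")
    print(f"saving: {100 * (fixed - flexed) / fixed:.1f}%")
```

    Under these assumptions the flexed bill is five eighths of the fixed one regardless of the baseline, because the extra servers are paid for one week in four instead of all four.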

    Now let’s talk about downsides.  First, most software doesn’t just run unchanged on these utility grids.  Even if it did, most software isn’t written to dynamically vary its use of servers.  Adding servers requires some manual rejiggering.  Amazon has a particularly difficult pitfall: you have to write your software to deal with a server going down without warning and losing all its data.  In fairness, you should have written your software to handle that anyway because it could happen that your whole machine is toast, but most companies don’t start out writing software that way.  There are companies, Elastra is one, that purport to have solutions to these problems.  Elastra has a MySQL solution that uses Amazon’s fabulously bulletproof S3 as its file system.

    The second issue isn’t so much a downside really.  We can’t blame these services for it at any rate.  What I’m talking about is automation.  To really take advantage here you need to radically increase your automation levels.  I recently saw a demo of some new 3Tera capabilities that I’ll be writing about that help a lot here.

    The bottom line?  You’re missing out if you’re not exploring utility computing: it can save you a bundle and make life a lot easier.  The subtext is that there are also a lot of new technologies, vendors, and partnerships coming down the pipe to help maximize the benefits.

    Related Articles

    Nick Carr picks up the theme.  One of the commenters raises an excellent point.  Using an IBM or Amazon gives peace of mind to customers of small startups.

    Posted in Web 2.0, business, data center, ec2, grid, saas, strategy | 3 Comments »

    Amazon Beefs Up EC2 With New Options

    Posted by smoothspan on October 16, 2007

    I’ve been a big fan of Amazon’s Web Services for quite a while and attended their Startup Project, which is an afternoon seeing what it can do and hearing from entrepreneurs who’ve built on this utility computing fabric.  Read my writeup on the Startup Project for more.  Amazon has been steadily rolling out improvements, such as the addition of SLA’s for the S3 storage service.  Today, there is big news in the Amazon EC2 camp:

    Amazon has just announced two new instance types for their EC2 utility computing service.  The original type will continue to be available as the “small” type.  The “large” type has four times the CPU, RAM, and Disk Storage, while the “extra large” has eight times the CPU, RAM, and Disk.  The large and extra large also sport 64 bit cpus.  Supersize your EC2!

    Why do this?  Because the original small instance was a tad lightweight for database activity with just 1.7GB of RAM while the extra large at 15GB is about right.  Imagine a cluster of the extra large instances running memcached and you can see how this is going to dramatically improve the possibilities for hosting large sites.

    One of the neat things about this new announcement is pricing.  They’ve basically linearly scaled pricing.  Whereas a small instance costs 10 cents per instance hour, the extra large has 8x the capacity and costs 8×10 cents or 80 cents per hour.
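
    The linear scaling is easy to check with a short sketch.  The 10 cent and 80 cent prices are the ones quoted above; the 40 cent figure for the large instance is my assumption, derived from its 4x capacity under strictly linear pricing.

```python
SMALL_CENTS_PER_HOUR = 10

# Capacity relative to the small instance, as described in the announcement.
CAPACITY_MULTIPLE = {"small": 1, "large": 4, "extra large": 8}


def hourly_price_cents(instance_type):
    """Price under strictly linear scaling (an assumption for 'large')."""
    return SMALL_CENTS_PER_HOUR * CAPACITY_MULTIPLE[instance_type]


def cents_per_capacity_unit(instance_type):
    """With linear pricing, every type costs the same per unit of capacity."""
    return hourly_price_cents(instance_type) / CAPACITY_MULTIPLE[instance_type]


if __name__ == "__main__":
    for name in CAPACITY_MULTIPLE:
        print(name, hourly_price_cents(name), cents_per_capacity_unit(name))
```

    The practical upshot is that there is no volume discount and no penalty for going big: you pick the instance size that suits the workload, not the one with the best price per unit.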

    What’s next?  These new instances open a lot of possibilities, but Amazon still doesn’t have painless persistence for databases like mySQL.  If you are running mySQL on an extra large instance and the server goes down for whatever reason, all the data on it is lost and you have to rebuild a new machine around some form of hot backup or failover.  That exercise has been left to the user.  It’s doable: you have to solve the problem in any data center of what you plan to do if the disk totally crashes and no data can be recovered.  However, folks have been vocally requesting a better solution from Amazon where the data doesn’t go away and the machine can be rebooted intact.  I was told by the EC2 folks at the Startup Project to expect 3 announcements before the end of the year that were related.  I’m guessing this is the first such announcement and two more will follow. 

    There’s tremendous excitement right now around these kinds of offerings.  They virtualize the data center to reduce the cost and complexity of setting up the infrastructure to do web software.  They allow you to flex capacity up or down and pay as you go.  Amazon is not the only such option.  I’ll be reporting on some others shortly.  It’s hard to see how it makes sense to build your own data center without the aid of one of these services any more. 

    Posted in Web 2.0, amazon, ec2, grid, multicore, platforms, saas, software development | 2 Comments »

    To Escape the Multicore Crisis, Go Out Not Up

    Posted by smoothspan on September 29, 2007

    Of course, you should never go up in a burning building; go out instead.  Amazon’s Werner Vogels sees the Multicore Crisis in much the same way:

    Only focusing on 50X just gives you faster Elephants, not the revolutionary new breeds of animals that can serve us better.

    Vogels is writing there about Michael Stonebraker’s claims that he can demonstrate a database architecture that outperforms conventional databases by a factor of 50X.  Stonebraker is no one to take lightly: he’s accomplished a lot of innovation in his career so far and he isn’t nearly done.  He advocates replacing the Oracle (and mySQL) style databases (which he calls legacy databases) with a collection of special purpose databases that are optimized for particular tasks such as OLTP or data warehousing.  It’s not unlike the concept that I and others have talked about that suggests that the one-language-fits-all paradigm is all wrong and you’d do better to adopt polyglot programming.

    I like Stonebraker’s work.  While I want the ability to scale out to any level that Vogels suggests, I will take the 50X improvement as a basic building block and then scale that out if I can.  That’s a significant scaling factor even looked at in the terms of the Multicore Language Timetable.  It’s nearly 8 years of Moore’s Cycles.  I’m also mindful that databases are the doorway to the I/O side of the equation which is often a lot harder to scale out.  Backing an engine that’s 50X faster sucking the bits off the disk with memcached ought to lead to some pretty amazing performance.
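
    For anyone who wants to check the Moore’s Cycles arithmetic, here is a minimal sketch.  The doubling period is an assumption: 18 months is the figure commonly quoted, and the answer moves if you pick a different one.

```python
import math


def years_of_moore(speedup, months_per_doubling=18):
    """Years of steady doubling needed to reach a given speedup factor."""
    doublings = math.log2(speedup)
    return doublings * months_per_doubling / 12


if __name__ == "__main__":
    print(f"{years_of_moore(50):.1f} years at 18 months per doubling")
    print(f"{years_of_moore(50, months_per_doubling=24):.1f} years at 24 months per doubling")
```

    A 50X gain is a little over five and a half doublings, which at 18 months apiece comes to roughly eight and a half years, so the ballpark holds.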

    But Vogels is right, in the long term we need to see different beasts than the elephants.  It was with that thought in mind that I’ve been reading with interest articles about Sequoia, an open source database clustering technology that makes a collection of database servers look like one more powerful server.  It can be used to increase performance and reliability.  It’s worth noting that Sequoia can be installed for any Java app using JDBC without modifying the app.  Their clever moniker for their technology is RAIDb:  Redundant Array of Inexpensive Databases.  There are different levels of RAIDb just as there are RAID levels that allow for partitioning, mirroring, and replication.  The choice of level or combinations of levels governs whether your application gets more performance, more reliability, or both.
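
    To make the mirroring idea concrete, here is a toy sketch.  It is not Sequoia’s API or implementation (Sequoia sits behind JDBC in Java); the class and method names are hypothetical, and in-memory SQLite databases stand in for separate database servers.  Every write goes to all backends, and reads rotate across them.

```python
import itertools
import sqlite3


class MirroredDatabase:
    """Toy mirror: writes are applied to every backend, reads are rotated."""

    def __init__(self, replicas=2):
        # Each in-memory SQLite database stands in for one database server.
        self.backends = [sqlite3.connect(":memory:") for _ in range(replicas)]
        self._next_reader = itertools.cycle(self.backends)

    def execute_write(self, sql, params=()):
        # Reliability comes from every replica holding a full copy.
        for db in self.backends:
            db.execute(sql, params)
            db.commit()

    def execute_read(self, sql, params=()):
        # Read throughput comes from spreading queries across replicas.
        return next(self._next_reader).execute(sql, params).fetchall()


if __name__ == "__main__":
    cluster = MirroredDatabase(replicas=3)
    cluster.execute_write("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    cluster.execute_write("INSERT INTO orders (item) VALUES (?)", ("book",))
    for _ in range(3):
        print(cluster.execute_read("SELECT item FROM orders"))
```

    The sketch skips everything that makes the real problem hard, such as a backend failing halfway through a write or rejoining with stale data.  It only shows why the application can keep talking to what looks like a single database.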

    Sequoia is not a panacea, but for some types of benchmarks such as TPC-W, it shows a nearly linear speedup as more cpus are added.  It seems likely a combination of approaches such as Stonebraker’s specialized databases for particular niches and clustering approaches like Sequoia all running on a utility computing fabric such as Amazon’s EC2 will finally break the multicore logjam for databases.

    Posted in Open Source, amazon, ec2, grid, multicore, platforms, software development | 3 Comments »

    Guido is Right to Leave the GIL in Python, Not for Multicore but for Utility Computing

    Posted by smoothspan on September 14, 2007

    There’s been a lot of back and forth in the Python community over something called the “GIL” or Global Interpreter Lock.  Probably the best “get rid of the GIL” argument comes from Juergen Brendel’s post.  Guido, the benevolent dictator of Python, has responded in his own blog that the GIL is here to stay and he doesn’t think it is a problem nor that it’s even the right choice to try to remove it.  Both combatants have been eloquent in expressing their views.  As is often the case, they’re optimizing to different design centers and likely will have to agree to disagree.

    Now let’s try to pick apart this issue in a way that everyone can understand and make sense of for large scalability issues in the world of SaaS and Web 2.0.  Note that my arguments may be invalid if your scaling regime is much smaller, but as we’ve seen for sites like Twitter, big time scaling is hard and has to be thought about carefully.

    First, a quick explanation on the GIL.  The GIL is a lock inside the Python interpreter that a thread must hold before it can run Python code or touch Python objects.  Only one thread may hold it at a time, so the other threads have to wait.

    Whoa!  That sounds like Python has no ability to scale for multiple cores at all!  How can that be a good thing?  You can see where all the heat is coming from in this discussion.  The GIL just sounds bad, and one blogger refers to it jokingly as the GIL of Doom.

    Yet all is not lost.  One can access multiple cpu’s using processes, and the processes run in parallel.  Experienced parallel programmers will know the difference between a process and a thread is that the process has its own state, while threads share their state with other threads.  Hence a thread can reach out and touch the other thread’s objects.  Python is making sure that when that touch happens, only one thread can touch at a time.  Processes don’t have this problem because their communication is carefully controlled and every process has its own objects.
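
    Here is a minimal sketch of that difference using Python’s standard threading and multiprocessing modules.  The names shared, touch, run_thread, and run_process are mine, and I’m assuming the “fork” start method, which is available on Linux and other Unix systems but not on Windows.

```python
import multiprocessing
import threading

# Module-level state: threads share it, a child process only gets a copy.
shared = {"touched_by": []}


def touch(name, conn=None):
    """Record a visit; if given a pipe, report what this worker now sees."""
    shared["touched_by"].append(name)
    if conn is not None:
        conn.send(list(shared["touched_by"]))
        conn.close()


def run_thread():
    """A thread reaches out and touches the parent's own object."""
    worker = threading.Thread(target=touch, args=("thread",))
    worker.start()
    worker.join()
    return list(shared["touched_by"])


def run_process():
    """A process touches its private copy; only the pipe message comes back."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like system
    parent_conn, child_conn = ctx.Pipe()
    worker = ctx.Process(target=touch, args=("process", child_conn))
    worker.start()
    child_view = parent_conn.recv()
    worker.join()
    return child_view, list(shared["touched_by"])


if __name__ == "__main__":
    print("after thread:", run_thread())
    child_view, parent_view = run_process()
    print("child saw:   ", child_view)
    print("parent sees: ", parent_view)
```

    The thread’s append shows up in the parent’s list, while the process’s append does not.  The parent only learns about it through the explicit message on the pipe, which is the carefully controlled communication described above.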

    Why do programmers care about threads versus processes?  In theory, threads are lighter weight and they can perform better than a process.  We used to argue back and forth at Oracle about whether to use threads or processes, and there were a lot of trade offs, but it often made sense to go for threads. 

    So why won’t Guido get rid of the GIL?  Well, for one thing, it was tried and it didn’t help.  A new interpreter was written with fine-grained locking that minimized the times when multiple threads were locked out.  It ran twice as slow (or worse on Linux) for most applications as the GIL version.  The reason is that having more lock calls was slower:  lock is a slow operating system function.  The way Guido put this was that on a 2 processor machine, Python would run slightly faster than on a single processor machine, and he saw that as too much overhead.  Now I’ve commented before that we need to waste more hardware in the interest of higher parallelism, and this factor of 2 goes away as soon as you run on a quad core cpu, so why not nix the GIL?  BTW, those demanding the demise of the GIL seem to feel that since Java can run faster and supports threads, that the attempt at removing the GIL must have been flawed and there is a better way.

    I find myself in a funny quandary on this one, but ultimately agreeing with Guido.  There is little doubt that the GIL creates a scalability speed bump, but that speed bump is localized at the low end of the scalability space.  If you want even more scalability, you still have to do as Guido recommends and use processes and sockets or some such to communicate between them.  I also note that a lot of authorities feel that it is also much harder to program threads than processes, and they call for shared nothing access.  Highly parallel languages like Erlang are focused on a process model for that reason, not a thread model.

    Let me explain what all that means.  Threads run inside the same virtual machine, and hence run on the same physical machine.  Processes can run on the same physical machine or in another physical machine.  If you architect your application around threads, you’ve done nothing to access multiple machines.  So, you can scale to as many cores as are on the single machine (which will be quite a few over time), but to really reach web scales, you’ll need to solve the multiple machine problem anyway.

    As Donald Knuth says, “premature optimization is the root of all evil (or at least most of it) in programming.”  Threads are a premature optimization when you need massive scaling, while processes lead to greater scalability.  If you’re planning to use a utility computing fabric, such as Amazon EC2, you’ll want processes.  In this case, I’m with Guido, because I think utility computing is more important in the big picture than optimizing for the cores on a single chip.  Take a look at my blog post on Amazon Startup Project to see just a few things folks are doing with this particular utility computing fabric.

    Posted in Web 2.0, amazon, data center, ec2, grid, multicore, platforms, saas, software development | No Comments »

    Amazon Startup Project Report

    Posted by smoothspan on September 13, 2007

    I attended the Silicon Valley edition of the Amazon Startup Project today.  This is their second such event, the first having been hosted in home-town Seattle.  The event took place at the Stanford University Faculty and was well attended: they basically filled the hall.  The agenda included an opening by Andy Jassy, Sr VP for Amazon Web Services, a discussion on the services themselves by Amazon Evangelist Mike Culver, a series of discussions by various startups using the services, a conversation with Kleiner Perkins VC Randy Komisar, and closing remarks by Jassy again.  Let me walk through what I picked up from the various segments.

    First up were the two talks by Amazon folk, Jassy and Mike Culver.  Jassy kept it pretty light, didn’t show slides, and generally set a good tone for what Amazon is trying to accomplish.  The message from him is they’re in it for the long haul, they’ve been doing API’s for years, and the world should expect this to be a cash generating business for Amazon relatively shortly.  That’s good news as I have sometimes heard folks wonder whether this is just remaindering infrastructure they can’t use or whether they are in fact serious.  The volumes of data and cpu they’re selling via these services are enormous and growing rapidly.

    Mike Culver’s presentation basically walked through the different Amazon Web Services and tried to give a brief overview of what they were, why you’d want such a thing, and examples of who was using them.  I had several takeaways from Mike’s presentation.  First, his segment on EC2 (Elastic Compute Cloud–the service that sells CPU’s) was the best.  His discussion of how hard it can be to estimate and prepare for the volumes and scaling you may encounter was spot on.  Some of the pithier bullets included:

    • Be prepared to scale down as well as up.
    • Queue everything and scale out the servicing of the queues.
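
    Those two bullets fit together nicely.  Once the work sits in a queue, the backlog tells you how many workers you need, in both directions.  Here is a minimal sizing sketch; the function name, the throughput figures, and the limits are all hypothetical and nothing here is an Amazon API.

```python
import math


def workers_needed(backlog, jobs_per_worker_per_min, drain_minutes,
                   min_workers=1, max_workers=100):
    """Workers required to drain the current backlog within the target time."""
    capacity_per_worker = jobs_per_worker_per_min * drain_minutes
    required = math.ceil(backlog / capacity_per_worker)
    # Clamp: never drop below a floor, never exceed what you are willing to pay for.
    return max(min_workers, min(max_workers, required))


if __name__ == "__main__":
    for backlog in (0, 500, 12_000, 5_000_000):
        print(backlog, "queued jobs ->", workers_needed(backlog, 100, 10), "workers")
```

    Run on a schedule against the real queue depth, a function like this is what lets you release servers when the spike passes instead of paying for the peak all month.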

    He showed a series of Alexa traffic slides that were particularly good.  First he showed CNN’s traffic:

    CNN Traffic

    As you can see, there are some significant peaks and valleys.  In theory, you’d need to build for the peaks and eat the cost of overcapacity for the valleys if you build your own data center.  With a utility computing fabric like Amazon’s you can scale up and down to deal with the demand.  He next overlaid Flickr onto this data:

    Flickr Traffic

    Flickr’s problem is a little different.  They went along for a while and then hit a huge spike in Q2 2006.  Imagine having to deal with that sort of spike by installing a bunch of new physical hardware.  Imagine how unhappy your customers would be while you did it and how close you would come to killing your staff.  Spikes like that are nearly impossible to anticipate.  CNN has bigger spikes, but they go away pretty rapidly.  Flickr had a sustained uptick.

    The last view overlaid Facebook onto the graph:

    Facebook Traffic

    Here we see yet another curve shape: exponential growth that winds up dwarfing the other two in a relatively short time.  Amazon’s point is that unless you have a utility computing fabric to draw on, you’re at the mercy of trying to chase one of these unpredictable curves, and you’re stuck between two ugly choices:  be behind the curve and make your customers and staff miserable with a series of painful firedrills, or be ahead of the curve and spend the money to handle spikes that may not be sustained, thereby wasting valuable capital.  Scaling is not just a multicore problem, it’s a crisis of creating a flexible enough infrastructure that you can tweak on a short time scale and pay for it as you need it.

    One of the things Mike slid in was the idea that Amazon’s paid for images were a form of SaaS.  To use EC2, you first come up with a machine image.  The image is a snapshot of the machine’s disk that you want to boot.  Amazon now has a service where you can put these images up and people pay you money to use them, while Amazon gets a cut.  The idea that these things are like SaaS is a bit far fetched.  By themselves they would be Software without much Service.  However, the thought I had was that they’re really more like Web Appliances.  Some folks have tried to compare SaaS and Appliance software–I still think it doesn’t wash for lack of Service in the appliance, but this Amazon thing is a lot cleaner way to deliver an appliance than having to ship a box.  Mike should change his preso to push it more like appliances!

    All of the presentations were good, but the best ones for me were by the startup users of the services.  What was great about them was that they pulled no punches.  The startups got to talk about both the good and bad points of the service, and it wasn’t too salesy about either Amazon or what the startups were doing.  It was more like, “Here’s what you need to know as you’re thinking about using this thing.”  I’ll give a brief summary of each:

    Jon Boutelle, CTO, Slideshare

    The Slideshare application is used to share slideshows on the web, SaaS-style.  Of course Jon’s preso was done using slideware.  His catchy title was “How to use S3 to avoid VC.”  His firm bootstrapped with minimum capital, and his point is not that you have to get the lowest possible price per GB (Amazon isn’t that), but that the way the price is charged matters a lot more to a bootstrapping firm.  In his firm’s case, they get the value out of S3 about 45 days before they have to pay for it.  In fact, they get their revenue from Google AdSense in advance of their billing from Amazon, so cash flow is good!

    He talked about how they got “TechCrunched” and the service just scaled up without a problem.  Many startups have been “TechCrunched” and found it brought the service to its knees because they got slammed by a wall of traffic, but not here.

    Joyce Park, CTO, Renkoo/BoozeMail

    Joyce was next up and had a cool app/widget called BoozeMail.  It’s a fun service that you can use whether or not you’re on Facebook to send a friend a “virtual drink”.  Joyce gave a great overview of what was great and what was bad about Amazon Web Services.  The good is that it has scaled extremely well for them.  She ran through some of their numbers that I didn’t write down, but they were very large.  The bad is that there have been some outages, and it’s pretty hard to run things like mySQL on AWS (more about that later).

    BoozeMail is using a Federated Database Architecture that tracks the senders and receivers on multiple DB servers.  The sender/receiver lists are broken down into groups, and they will not necessarily wind up on the same server.  At one point, they lost all of their Amazon machines simultaneously because they were all part of the same rack.  This obviously makes failover hard and they were not too happy about it. 
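
    Joyce didn’t go into how users are assigned to servers, so the following is only a sketch of one common way to do it: hash each user id to a shard.  The function names and the shard count are hypothetical.

```python
import hashlib


def shard_for(user_id, shard_count):
    """Map a user id to a DB server index with a hash that is stable across runs."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shards_for_message(sender, receiver, shard_count):
    """A message touches the sender's shard and the receiver's shard."""
    return {shard_for(sender, shard_count), shard_for(receiver, shard_count)}


if __name__ == "__main__":
    print(shards_for_message("alice", "bob", shard_count=4))
```

    Because sender and receiver can land on different servers, a single message may mean writes to two machines, which is part of why losing a whole rack at once hurts so much.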

    Persistence problems with Amazon are one of the thorniest issues to work through.  Your S3 data is safe, but an EC2 instance could fall over at any time without much warning.  Apparently Renkoo is beta testing under non-disclosure some technology that makes this better, although Joyce couldn’t talk about it.  More later.

    Something she mentioned that the others echoed is that disk access for EC2 is very slow.  Trying to get your data into memory cache is essential, and writes are particularly slow.  Again, more on the database aspects in a minute, but help is on the way.

    Sean Knapp, President of Technology, Ooyala

    Ooyala is a cool service that lets you select objects on high quality video.  The demo given at Startup Day was clicking on a football player who was about to make a touchdown to learn more about him.  Sean spent most of his preso showing what Ooyala is.  It is clearly an extremely impressive app, and it makes deep use of Amazon Web Services to virtually eliminate any need for doing their own hosting.  The message seemed to be if these guys can make their wild product work on Amazon, you certainly can too.

    Don MacAskill, CEO, Smugmug

    I’ve been reading Don’s blog for a while now, so I was pleased to get a chance to meet him finally.  Smugmug is a high end photo sharing service.  It charges for use SaaS-style, and is not an advertising supported model.  As I overheard Don telling someone, “You can offer a lot more when people actually pay you something than you can if you’re just getting ad revenue.”  Consequently, his customer base includes some tens of thousands of professional photographers who are really picky about their online photo experience.

    Smugmug has been through several generations of Amazon architectures, and may be the oldest customer I’ve come across.  They started out viewing Amazon as backup and morphed until today Amazon is their system of record and source of data that doesn’t have to be served too fast.  They use their own data center for the highest traffic items.  The architecture makes extensive use of caching, and apparently their caches get a 95% hit rate.
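
    Don didn’t describe the cache internals, so this is only a generic sketch of the pattern: a small read-through cache in front of a slow store that keeps track of its own hit rate.  The class name and the least-recently-used eviction policy are my assumptions, not Smugmug’s design.

```python
from collections import OrderedDict


class ReadThroughCache:
    """LRU cache that falls back to a slow origin fetch on a miss."""

    def __init__(self, fetch_from_origin, capacity):
        self.fetch_from_origin = fetch_from_origin  # e.g. a call out to the slow store
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = self.fetch_from_origin(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = ReadThroughCache(lambda key: f"photo bytes for {key}", capacity=2)
    for key in ["a", "a", "b", "a", "c", "b"]:
        cache.get(key)
    print(f"hit rate: {cache.hit_rate():.0%}")
```

    At a 95% hit rate only one request in twenty ever reaches the slow store, which is what makes “store a lot, serve a little” workable on the back end.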

    Don talked about an area he has blogged on in the past, which is how Amazon saves him money that goes right to the bottom line.

    Don’s summary on Amazon:

    • A startup can’t go wrong using it initially
    • Great for “store a lot” + “serve a little”
    • More problematic for “serve a lot”

    There are performance issues with the architecture around serve a lot and Don feels they charge a bit too much (though not egregiously) for bandwidth.  His view is that if you use more than a Gigabit connection, Amazon may be too expensive, but that they’re fine up to that usage level.

    His top feature requests:

    -  Better DB support/persistence

    -  Control over where physically your data winds up to avoid the “my whole rack died” problem that Joyce Park talked about.

    The Juicy Stuff and Other Observations

    At the end of the startup presentations, they opened up the startup folks to questions from the audience.  Without a doubt, the biggest source of questions surrounded database functionality:

    -  How do we make it persist?

    -  How do we make it fast?

    -  Can we run Oracle?  Hmmm…

    It’s so clear that this is the biggest obstacle to greater Amazon adoption.  Fortunately, it’s also clear it will be fixed.  I overheard one of the Amazon bigwigs telling someone to expect at least 3 end of year announcements to address the problem.  What is less clear is whether the announcements would be:

    a)  Some sort of mySQL service all bundled up neatly

    b)  Machine configurations better suited to DB use:  more spindles and memory were mentioned as desirable

    c)  Some solution to machines just going poof!  In other words, persistence at least at a level where the machine can reboot, access the data on its disk, and take off again without being reimaged.

    d)  Some or all of the above.

    Time will tell, but these guys know they need a solution.

    The other observation I will make is one that echoes Don’s observation on Smugmug:  I’m sure seeing a lot of Mac laptops out in the world.  3 of the 4 presenters were sporting Macs, and 2 of them had been customized with their company logos on the cover.  Kewl!

    Posted in Partnering, Web 2.0, amazon, data center, ec2, grid, multicore, platforms, saas, software development, strategy, venture | 12 Comments »

    Persistent mySQL Now Available for Amazon EC2/S3 Junkies

    Posted by smoothspan on September 2, 2007

    There are now two companies, Elastra and RightScale, who are offering solutions for Persistent mySQL on Amazon EC2/S3.  This is a significant development in utility computing because most companies wishing to use Amazon’s platform would have to solve this thorny problem before they could get on with doing something interesting.   Having an off-the-shelf solution makes it that much easier to adopt the platform.

    Some are concerned about the price or about getting locked into Amazon, but I think these are relatively safe bets.  First, we already have 2 players, and Amazon will likely offer a solution of its own.  Hence the price will stabilize to a lower point in a competitive marketplace.  Second, mySQL is the API here, not Amazon.  Any utility computing service that wants to make a go will have to support mySQL in some form or fashion, so rehosting may not even be that bad.  Hence the lock-in is minimal.

    Thanks to High Scalability for the heads up!

    Posted in Web 2.0, amazon, data center, ec2, grid, multicore, platforms, saas | No Comments »