The Impending Cloud Reshuffle (erikbern.com)
205 points by pierremenard on Dec 2, 2021 | hide | past | favorite | 127 comments



> What if cloud vendors focus on the lowest layer, and other (pure software) vendors on the layer above?

I'm at AWS re:Invent this week and I can tell you this is absolutely not what Amazon is doing. Two themes really stand out in the keynotes and associated presentations:

1.) Becoming a home for data and ML services backed by hardware technologies like Graviton, Trainium, and stupid fast network connectivity. Amazon has dozens of value-added services from S3 to analytic databases. By my count they have at least 16 managed data services.

2.) Extending the Amazon cloud wherever possible into non-cloud environments. Specifically: pushing to on-prem through technologies like AWS Outposts and Container Marketplace Anywhere, as well as integrating with IoT.

#2 looks like a pump fake to defuse customer lock-in concerns. The real play is #1, which is to become the preferred home for data, taking the maximum possible share of revenue in the market. Taken as a whole, Amazon has the best data story of any company I know of.


Best data story? They are sorely lacking a strong distributed database offering. Athena is slow and hard to optimize, and Redshift does not hyperscale - that's why everyone switched to Snowflake. These are the fundamentals of a data function at any company. GCP has BigQuery, which by itself serves enough use cases that companies would consider leaving AWS for it.


The company I work for uses GCP primarily for BigQuery. The rest of our compute followed so we didn't have to pay all of the networking fees to reach AWS.


Agreed. BigQuery, Cloud Spanner and Firebase are unique products which no other cloud provider can match even in the slightest. I think these things are a pull factor into GCP, and everything else can be easily replaced once people decide to just move everything into GCP.

I got sucked into GCP because GKE is (still) much more advanced than EKS or AKS and that alone was a pull factor for me, let alone the other services which I mentioned.


I don’t actually know that much about the AWS ecosystem, but from the outside I wouldn’t even consider doing anything greenfield there. They already seem so far behind in terms of offerings, some of which you touched on there, but the actual list is much longer AFAIK.

Does anyone who’s less ignorant than me want to give me a rundown of why AWS holds so much appeal? Is it purely non-technical reasons? Are they doing a much better job than GCP in areas I’m just not aware of?

I ask as a genuine question to learn more despite how it may read…


AWS is the cloud standard. It's not fancy, it's not cheap, but it does its job. There is more widespread knowledge around it vs. the other clouds => it's easier to hire. IMO not the best way to hire, but lots of companies do it that way. It's probably what MS Windows is for the desktop.


That was the impression I had from the outside. They were clearly number 1 for a long time and I guess that’s when everyone made the decision and now the idea of migrating is not feasible for most.


To someone not used to GCP, Cloud Spanner and BigQuery sound like great products, but the general trend seems to be to go with AWS or Azure. Maybe their solutions are good enough.

We have started to use Cosmos DB and it scales enough for us.


Redshift pulls down $100s of millions in annual revenue as far as I can tell. Not everyone is switching.


Almost unrelated question... Did you use the maps at the conference? Did you like them? (I work for the company that made the maps and I am wondering how people who are not us, and not the executives who chose us, think about us.)


That's a great question. I did use the map built into the re:Invent app and it's really handy. That said, it helps enormously to have been there before. Even a single venue like the Venetian is enormous and requires some practice to get from A to B quickly within the building. The re:Invent team has guides standing around to give directions, which also really helps.


I don't think the contention is that they will give up on these services, or publicly acknowledge that this is happening. A lot of AWS customers will be very happy with the native vendor services; it's just that the premium and corporate end of the market will go with more sophisticated ISV services. Also, this is a forward-looking prediction: it's happening in certain segments now and will become prevalent in other segments over time. That's the claim anyway.

Take Elasticsearch for example, AWS is offering a clone service, but if you want the best Elasticsearch experience on the latest code base maybe you're better off using Elastic's own Elastic Cloud service that's also within AWS. If things play out as Erik predicts (and Elastic doesn't screw up) they should do just fine. We'll see if that's how it plays out.


Elasticsearch's cloud offering is a hot mess; we didn't have a lot of confidence in how it was provisioned. We did a bake-off between ES Cloud, AWS OpenSearch, and hosting our own cluster on AWS, and the winner was... hosting our own (all things being equal we'd much rather go PaaS). AWS OpenSearch would have won out had it supported plugins, which I understand they are working on.


I think AWS knows that the best it can do is offer so much combined value that people don't want to leave the ecosystem. It's the same play that Amazon has used for Prime. If Prime were just shopping, people could leave Prime for Walmart+ or Target/Shipt. Instead, Prime is a service that keeps lots of touch points. You essentially get a free streaming service in Prime Video (included with your retail shipping thing) or you get a free shipping service (included with your video streaming); you get some perks on Twitch; you get a music streaming service that doesn't include all music, but has a lot for free (with no ads) and the potential for Music Unlimited for 20% less than Spotify/Apple Music.

If you can get the same value from another ecosystem, you might leave for a better price. If you can't get the same value from another ecosystem, you won't leave. If you're locked into proprietary AWS systems, it might require a lot of work to migrate.

I do get where the author is coming from with RedShift and Snowflake. I think it's reasonable to look at AWS's data warehouse position and Snowflake's success and see a bit of a failure on AWS's part. However, I think that Snowflake represents a threat to AWS. If all services above the infrastructure became like Snowflake, I could more easily move from AWS to GCP or Azure.

I'd note that it isn't just about companies who will switch cloud providers. It's about whether a company can credibly threaten that they might switch cloud providers. If I'm negotiating with AWS and they see that I'm using SQS, Lambda, RedShift, Timestream, Kinesis, and other proprietary AWS stuff, they know that my threat to leave AWS is hollow. I'd have to rewrite too much infrastructure. Sure, maybe I could replace SQS with Kafka, but it isn't a drop-in replacement. Sure, there are other time-series databases than Timestream, but it's going to require engineering time to migrate.

Yes, even in a world where all the software is the same on top of the infrastructure, it would take time and money to migrate to a different cloud, but it would be less time and money. That "less time and money" means that AWS needs to be more price competitive and it gives me more negotiating power. The harder it is to leave AWS, the more leverage Amazon has in negotiations. Companies like Spotify and Snap spend hundreds of millions on their cloud services each year. They ink major deals that they negotiate. If they rely on lots of proprietary AWS or GCP systems, it makes it a lot harder to leave - which leaves the cloud provider with the leverage when the contract comes up for renewal.


Amazon services are individually not that great in many cases. Their strength is the enormous diversity in the aggregate. You might use your own services for data and apps, but if it's all glued together with Lambda you aren't going anywhere.


Which is exactly why you shouldn't adopt serverless. It's a boondoggle lock-in trick.

If it were an open and widely supported standard it would be a different story.


This is literally what knative is. I don’t know the AWS world at all but at least in GCP their main ideas of serverless are:

1. Knative: an open source, standards-based serverless platform that's in the process of making its way into the CNCF. I assume this is what Fargate is on AWS?
2. Cloud Functions, which is also something that you can drop into AWS, Azure, OpenShift, etc.


> I think AWS knows that the best it can do is offer so much combined value that people don't want to leave the ecosystem.

This is exactly the Microsoft playbook.


I worked at AWS Systems Manager, which has a hybrid on-prem solution that lets you integrate non-aws resources into the IAM/HOSM world (sometimes just called Managed Instances). There was a bit of pushback internally around what kind of play we were allowed to make in the on-prem/hybrid space, and eventually the product just got KTLO'd. Kind of a shame, it was neat.


This seems prophetic. I'd be really curious what you could do with that kind of "Managed Instance" and why it was killed.


It wasn't killed, the feature just never got the kind of marketing and traction that we felt it could/should have. IAM went off and built their own blessed version later, I think.


Not to mention, as the author mentioned, there's a (likely) scenario that pure cloud providers become a low-margin commodity. We've already seen this play out. Even with lock-in, to attract new customers, you'll be competing mostly on price.


+1 Amazon has always been a "lower margin" business. One argument for why and how AWS beat all the tech vendors to the punch was that everyone else had businesses built on higher margins (Google, MSFT, Oracle, VMware, etc). Now that AWS delivers most of Amazon's current profit, it will be interesting to see if Jassy asks for more margin out of them or tries to find it in other parts of the business.


I agree. Cloud being a commodity is already prevalent. CSPs are cut throat beyond that, and bleeding into the integration space more than ever before. Data/ML is a key entry point today.


#1 is Nvidia's play as well, no? More the hardware side rather than the data side, of course.


Ahhh... maybe.

> There's some sort of folk wisdom that the lowest layer of cloud services is a pure commodity service.

> Cloud vendors might be pretty happy making money just in the lowest layer. Margins aren't so bad and vendor lock-in is still pretty high.

It may be folk wisdom, but big customers look at stuff like storage and CPU pricing when picking a vendor, because they need lots of storage and CPU. These services, in turn, are priced close to the actual cost of the underlying storage and CPU because of the high volume of storage and CPU involved.

I have worked at cloud vendors. Margins aren't great. Cloud vendors are always asking questions like, "How does vendor X sell this for so cheap? We can't make a profit at that price! How can we make our costs for X lower?" The lock-in is seen as a way to sell the higher-level services.

However, the high-level services, the stuff that start-ups compete with, also face competition with open-source projects. Things that used to be SaaS are now DIY with open-source projects running on IaaS. Ten years ago you had no Kubernetes, and you didn't have columnar storage out of the box on PostgreSQL. Cassandra was immature. Nowadays, open-source software running on IaaS has killed the need for bigger segments of the SaaS market, assuming you have enough expertise in-house to run it.


> assuming you have enough expertise in-house to run it.

Most small to mid-sized companies would almost always opt to pay a bit more for a managed offering and not have to bring in expertise in-house for things that are not their core competency.


For a small company, that's sensible. For a mid-size company, I think it's short-sighted. I'd argue that mid-size tech companies need the expertise that comes with running more of their own stack. If you don't have the expertise, then you're going to be dealing more with support engineers to solve your problems, and you won't have as much insight into your tech stack as you'd want.

There's a big difference between the experience of opening a support ticket to fix a problem with your database and paging one of your salaried engineers to fix it--someone who actually runs the service to begin with.


The way I've heard it phrased best is the way (I think) Allen Ward put it.

If a component is high cost, tightly integrated with the rest of the system, and difficult to design, then you need to be an expert on it in order to evaluate suppliers and make a good purchasing decision.

To become an expert, you have to build it for a while yourself. And once you have built it for a while and become an expert, you might as well keep that up, unless there's a strong economic incentive not to.


I've seen engineers build and run a clustered database and make a total pig's ear out of it. Every day was another outage (or the continuation of yesterday's outage). Several times we were a hair's breadth from actual irrevocable customer data loss. Just because you got a few of your salaried engineers to build a DB doesn't mean that they had the correct knowledge and respect for data to do it properly.


I’ve seen people make a mess out of SaaS offerings that are supposed to be foolproof. If we assume that we’re hiring incompetent engineers, we can make a mess out of any situation.


You’re both right: most companies want to hire the least competent (least costly) engineer capable of doing the job. Then they have mixed luck in hiring and wind up with a distribution of engineers, including some plain incompetent ones. They barely have competent enough engineers to build their core product, let alone perform undifferentiated technical operations on self-managed infrastructure.

I agree with you that if we assume all our engineers are strictly competent, then that gives us a major advantage over our vendors, and tilts the scales to build over buy.


I don't think it's strictly about competence and more about specialism. In my personal experience, which you can take with a large grain of salt, what happens is that the engineering manager with a certain type of career history tends to look down on any type of operations work and anyone who does it. I have heard things that come off sounding like this: "We hire world class computer scientists from prestigious Ivy League schools and FAANG experience, so why do I need to hire some one-trick-pony DB guy, when my engineers could build their own databases from scratch?"


If people still had witty quotes in email signatures, I would totally steal your last sentence.


How are mid-sized companies going to afford to retain that expertise, once they've trained it?

Or, to put it another way, mid-sized company X can never pay as much as hyperscale company Y for Z expertise. Because utilization at mid will always be less than at hyper.

I'm all for retaining in-house expertise, but it's fundamentally a return-on-capital optimization problem, albeit one where your capital asset has the ability to walk out the door. So in-housing the core and using managed everything else is a more reasonable bargain.


The expertise in question isn’t really used at “hyperscale” companies. Not everyone wants to work at large companies anyway. And the expertise isn’t that difficult to train or acquire (compared to getting another software engineer).

> Or, to put it another way, mid-sized company X can never pay as much as hyperscale company Y for Z expertise. Because utilization at mid will always be less than at hyper.

Being able to run a PostgreSQL cluster is a handy skill but won’t get you hired at Google.

Google can afford to pay an engineer to make cool stuff like a faster malloc implementation or analyze the best possible way to encode data to be stored on disk. That’s because making Google’s malloc 1% faster will pay someone’s salary many times over, or likewise for saving 1% CPU in your disk servers.

But running your own database cluster and container orchestration ain’t exactly rocket science these days, and running your own stack means you have an expert in-house when the shit hits the fan—and that’s when your expert earns their salary, many times over.


One root cause I've seen over and over during database outages is insufficient IO. It's caused by the other most common root cause: lack of actual expertise. Until you've been through the wringer, it's easy to lack respect for data: how important it is, how fucking large it is, and how long it takes to do basically anything with a few TB of it. If you didn't hire a real expert, and your DB-for-a-day guy miscalculated the spec for your bare metal clusters, you'll find it extremely hard to magic up faster hardware during a 4am full outage when Europe wakes up. If you're on cloud, at least you have the flexibility to increase block storage IOPS or quickly reboot a node onto beefier hardware or faster network connections. But suddenly the "cheap" DIY database is a lot more expensive than you budgeted for, and the phrase "nobody needs a dedicated DB guy" starts sounding really short-sighted.
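The "how long it takes to do basically anything with a few TB" point is easy to underestimate. A back-of-envelope sketch (the throughput figures are illustrative assumptions, not measurements from any particular cloud, disk, or network):

```python
# Back-of-envelope: how long it takes to move N TB at a given sustained
# throughput. The figures used below are illustrative assumptions, not
# measurements from any particular system.

def transfer_hours(tb: float, mb_per_s: float) -> float:
    """Hours to move `tb` terabytes at `mb_per_s` sustained MB/s."""
    seconds = (tb * 1_000_000) / mb_per_s  # 1 TB = 1e6 MB (decimal units)
    return seconds / 3600

# Restoring a 5 TB backup over a link sustaining 200 MB/s:
print(f"{transfer_hours(5, 200):.1f} hours")  # 6.9 hours
# The same restore throttled to 50 MB/s by undersized IOPS:
print(f"{transfer_hours(5, 50):.1f} hours")   # 27.8 hours
```

A 4x throughput shortfall turns an overnight restore into a full working day of downtime, which is the gap between "rough morning" and "postmortem with customers".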


I’m talking about managing your own software on IaaS as an alternative to running off SaaS, you seem to think I’m talking about ditching the cloud entirely!


Not at all! I did say: "If you're on cloud, at least you have the flexibility to..."


> Because utilization at mid will always be less than at hyper.

Will it?

Economies of scale have diminishing returns. If you have one engineer who is 20% busy, your engineer cost per unit will be five times higher than a company five times your size who keeps that one person 100% busy.

If you have four engineers who are all 100% busy, compared to a company 250 times your size with 1000 engineers, your cost per unit is equivalent.
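The argument above can be sketched as a tiny cost model (purely illustrative; salary normalized to 1.0 per engineer):

```python
# Sketch of the utilization argument above: engineer cost per unit of
# useful work as a function of how busy the engineers are. Purely
# illustrative; salary is normalized to 1.0 per engineer.

def cost_per_unit(engineers: int, utilization: float, salary: float = 1.0) -> float:
    """Total salary divided by total useful work delivered."""
    return (engineers * salary) / (engineers * utilization)

# One engineer who is 20% busy vs. one who is fully busy:
print(cost_per_unit(1, 0.2) / cost_per_unit(1, 1.0))  # 5.0
# Once everyone is saturated, scale stops helping:
print(cost_per_unit(4, 1.0) == cost_per_unit(1000, 1.0))  # True
```

The economies of scale live entirely in the utilization term: once a mid-size company can keep its experts busy, the hyperscaler's size advantage on this particular axis disappears.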


Hey, your math makes sense, and it's a good illustrative example, but in the real world, if your schedule has your engineers 100% busy, then you have no slack to handle emergencies/unexpected events.

I'd target 60-70% utilization of engineer time. The remaining 30-40% is for stuff which can be dropped when needed (refactoring, non-critical/experimental projects, personal learning, ...). And if you have good engineers, they probably have a really good idea of how to prioritize for filling that 30-40% so as to move the company forward.


The same applies to machines!

There is an inverse relationship between resource utilization and queue length, under reasonable assumptions. If you think of low CPU utilization as "wasted" CPU and try to fill it up, you can completely kill your service's ability to quickly respond to requests.

I like to make this analogy when talking to people who feel like they are not working hard enough. If you fill up an engineer's time with high-priority work, you get the same problem as if you fill up a machine's CPU with high-priority tasks... you get a system that cannot respond quickly, a system that spends a lot of time overloaded.

It's a fun exercise to try and calculate the relationship between utilization and expected queue length.

Plus, those batch jobs are still important, they're just not as time-sensitive.
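For the curious, the "fun exercise" has a closed form in the simplest textbook case: in an M/M/1 queue, the expected number of jobs in the system is rho / (1 - rho) at utilization rho. A minimal sketch (a textbook model, not a claim about any specific service):

```python
# In an M/M/1 queue (Poisson arrivals, exponential service times, one
# server), the expected number of jobs in the system is rho / (1 - rho),
# where rho is utilization. This is a textbook model, not a measurement
# of any specific service.

def mm1_jobs_in_system(rho: float) -> float:
    assert 0 <= rho < 1, "utilization must stay below 100% for stability"
    return rho / (1 - rho)

for rho in (0.5, 0.7, 0.9, 0.99):
    print(f"utilization {rho:.0%}: ~{mm1_jobs_in_system(rho):.1f} jobs in system")
# Queue length (and hence latency) blows up as utilization approaches 100%:
# 50% -> 1.0, 70% -> ~2.3, 90% -> 9.0, 99% -> 99.0
```

Going from 50% to 99% busy doesn't double the backlog, it multiplies it by ~100, which is exactly the "cannot respond quickly" failure mode described above, for machines and engineers alike.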


> Or, to put it another way, mid-sized company X can never pay as much as hyperscale company Y for Z expertise.

Hyperscale companies usually invent the wheel internally and maybe release it as open source a few years later.


Even for a small company in the software space, running your own self-hosted OSS stack isn't a big ask. I've spent ~80 hours over the past year setting up and running our stack. It's nothing too crazy[1] but it covers our needs with no ongoing fees and, more importantly, no outside dependencies to rely on.

[1] Proxmox hosting a virtual server running Ubuntu, providing Kimai, WeKan, Mattermost, Jitsi, Mediawiki, OpenVPN access, some Samba shares, source control, and comprehensive backups.


You think hiring AWS engineers is cheaper than hiring some old-style sysadmin who can manage any server?

Most likely you'll end up spending less in ops. The only problem may be finding talent, as everyone goes the AWS way so they can charge more.


I generally never understood the appeal of the cloud, and the particular way the 3 big vendors sold proprietary services as a means of vendor lock-in. I mean, I understand the appeal to them, seeing how they made ungodly amounts of money doing it, and given the choice you never want to be in a commodity market, but as a user/developer, the value proposition has always been unclear to me.

- The term 'containerization' implies in theory (and somewhat in practice) that you package your software in such a way that it can run on commodity hardware with enough resources.

- The huge success of open source means that for any given problem domain (KV stores, SQL, web servers, LBs, caches, etc.), there exists a best-in-class solution that's free as in speech. In fact, cloud providers' solutions use these libraries as well.

- The development of languages that combine productivity and performance (Go, Java, etc.), along with Moore's law enabling ridiculous core counts, RAM sizes, etc., means that there's less of an incentive to run a server farm when you can easily fit 128 peak-performance cores under your desk. Stack Overflow did this/has been doing this model with quite some success.


It's all about management cost. How much does it cost to operationalise your own server farm with the same guarantees you can get from a cloud vendor? Do you have the necessary infrastructure/SRE expertise to have team(s) deploy and operate your own server farm at the production level your business requires? Do you have a good hiring pipeline to supply you with candidates with the required expertise when the time comes to replace your experts? Is that a business risk worth taking if it's not your company's core business?

For example: I use Elasticsearch and have used it for the past 10 years. I have deployed ES clusters myself, I have been a team lead for infrastructure teams that had to support an ES cluster, etc. I would never, ever, choose to do it myself again if I can find a managed solution. It's a lot cheaper when you factor in that you will require at least 1 engineer with expertise (or one being trained for it) to properly support a running ES cluster in production.

The same applies to databases (I've managed shards/clusters of Postgres and MySQL in my life), Kubernetes clusters, and any other piece of infrastructure that becomes critical to your business: you have to manage it, and therein lies the majority of the cost of any solution: supporting it. The cloud provisions, manages, backs up, fails over, etc., without intervention from my side. Yeah, I can set up all of that myself, but that is another cost to take into consideration, and something also to be maintained over the years.

Yes, I can find open-source tools for a lot of issues in software engineering, but I can't find anyone to run those tools for free, so the trade-off analysis is based on this when looking into solutions, and a lot of the solutions from cloud vendors are exactly about managing resources for me and my teams.


> For example: I use Elasticsearch and have used it for the past 10 years, I have deployed ES clusters myself, I have been a team lead for infrastructure teams that had to support an ES cluster, etc. I would never, ever, choose to do it myself again if I can find a managed solution.

Ditto. I've seen ES / Cassandra cluster management casually dropped on a team who just happened to use a frontend that relies on them. Those are usually the most brittle, unloved databases in the world, and everyone on the team ends up walking away with the false impression that "X is a tire fire, never use it".


Funny you mention Cassandra. That was another piece of hell-on-earth for one of my teams to manage, even with a few seasoned SREs and database engineers. It is beautiful while it's working; it's hell to properly set up, configure, and get stable enough after ironing out the quirks and gotchas.


Lol. I didn't just pick that example at random. I've been there too.


I imagined so, haha, glad to see you on the other side, we both survived!


So this (at least in theory) should be covered by Kubernetes operators.


To me the ease of use is key. I don't want to deal with outages, hardware failures, networking issues, etc. I only want to focus on the software. That's why I don't understand why anybody (apart from some special cases) would run their own servers.


You will still have to deal with outages (even if probably at a much lower rate). Just today I got a confirmation from AWS that the 1h outage that our production, multi-AZ RDS database suffered *three months ago* was due to an underlying failure in their storage subsystem. I didn't have to fix any SAN or fiber issue, true, but I lost hours puzzling and debugging things to understand why it failed, with no good answer (until today).


Let's be honest. It is much different than having to sit in the car in the middle of the night to change a hard drive in the server that just failed because of a power surge.


Gonna do the typical HN nitpicking on examples:

> It is much different than having to sit in the car in the middle of the night to change a hard drive in the server that just failed because of a power surge.

Never ever had to go to a data center at night to replace a single failed disk, even in my DC days. Always the first working day after the failure, since the disks were redundant. Now, network gear upgrades during the night - I did a few of those, and I'm happy they are history for me now. But even in a cloud world you have to do low-traffic-time interventions from time to time.


This might happen, but Cloud vendors will fight tooth and nail to stop it from happening, because the lower layers are commodities and a race to the bottom pricewise. Also, many companies want a single neck to wring if anything goes wrong, and are thus unwilling to deal with a patchwork of third-party vendors.

That said, telcos used to mint money from overseas calls, SMS, ringtones, etc., until the Internet came along and everything went "over the top" (over data and thus outside operator billing). They didn't take that lying down, either (IIRC Skype was still banned in the UAE until COVID finally whacked some sense into them), but in the end telcos were still reduced to the dumb pipes they are today.


The "single neck to wring" doesn't really exist unless you're a Salesforce / Atlassian / ServiceNow sized customer. Below a certain size, good luck getting support from AWS beyond "read the docs". The thing is, most of the time your outages are because you screwed up. "Shared responsibility" means that unless AWS actually loses a whole region for the day, or S3 actually loses your data forever, it is still "your fault" - because you didn't balance across AZs, take backups, test backup recovery, understand point-in-time recovery or egress charges, etc.

As the article says, CTOs don't change cloud providers. I've been on the hook for DR exercises where the same stack is running on the same cloud, and they've failed because engineers make a crazy number of invisible hard-coded design choices. Once you build your product on AWS, you're pretty much stuck there. K8s offers kind of an escape hatch, because it's possible to build your SaaS on EKS in such a way that it could be migrated to AKS or GKE, but it's still a ton of work for the plumbers.

So the reason that AWS has this proliferation of ready-made services (and there are tons of them - try sitting an AWS professional certification) is that you'll generally use them in favour of deploying your own FOSS thing (even if it's "just a helm chart away"), because running a database cluster (for example) is never as simple as just deploying it. Once your engineers decide to use Cognito, Lambda, Kinesis, Aurora instead of doing it DIY, you're locked in. Hell, it would even cost a fortune to simply download the data to close your account (because AWS Snowball doesn't have an egress option where they'll drive a few PB of your data to your data center in a semi-truck).
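To put a rough number on the "cost a fortune to download the data" point, here's a sketch; the per-GB rate is an assumed ballpark for internet egress, not a quoted AWS price:

```python
# Rough illustration of the "cost a fortune to download your data" point.
# The per-GB rate is an assumed ballpark for internet egress, not a
# quoted AWS price; check current pricing before relying on it.

def egress_cost_usd(petabytes: float, usd_per_gb: float = 0.09) -> float:
    gb = petabytes * 1_000_000  # 1 PB = 1e6 GB (decimal units)
    return gb * usd_per_gb

# Pulling 3 PB out over the internet at an assumed ~$0.09/GB:
print(f"${egress_cost_usd(3):,.0f}")  # $270,000
```

Even at a heavily discounted negotiated rate, multi-PB egress lands in six figures, which is why "we could always leave" is less credible than it sounds.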


> This might happen, but Cloud vendors will fight tooth and nail to stop it from happening,

They are failing badly IMO. Most developers I know hate the whole experience of using AWS.

I can't complain much as I make good money by understanding their shit so others don't have to.

That said, I always find it amazing how one of the largest companies in the world does not have a UI with happy paths for simple common cases.

Most services are a hot mess of IAM, weird APIs, undocumented limitations and gotchas.


>They are failing badly IMO. Most developers I know hate the whole experience of using AWS.

It doesn't have to be great, only better than the alternatives. I would argue most devs prefer the rocky experience of AWS to working with their own operations people on aging hardware.


For now, maybe. But if your customers hate you, yet only a little less than they hate your competitors, all it takes is one new cloud provider to enter your market who ISN'T hated to completely wipe out the commodity low end of your business.

Maybe the major cloud players believe they can buy up new & unhated competitors before they can establish a loyal user base. That strategy has worked pretty well so far for the vendors of mobile computing -- which just happens to be the primary consumer of low end cloud services.

So if mobile can be commodified some day (the way unix workstations were by Linux), maybe basic cloud services can be too?


This analysis ignores the reality that most teams do not want to spend time optimizing each component in their stacks. As a CTO/CIO, you have to pick your battles, and the "more future vendors" box in the stack made my hair stand on end. Nobody wants their vendor stack to keep growing!

I need a managed PostgreSQL database; my needs would have to be relatively exotic for e.g. RDS to not be a good solve. AWS has deployment options for the big stacks. Tools like Vercel do not yet address the solid majority of Web developers who primarily write code for .NET or Java, and thus are not even in the conversation at most organizations (and nobody really wants 2+ PaaS vendors).

This is not to say the point solution vendors will not be able to build solid businesses. MongoDB is doing well alongside platform solutions like DynamoDB. But the gravity is clearly with the platforms, and nothing Mongo can do (short of becoming a hyperscaler cloud provider with lots of offerings) will change that.

Snowflake is a special snowflake; it's entirely possible that the lessons of Snowflake apply only to Snowflake.


I think most of us, seniors included, SEVERELY underestimate the cost of having a deeply fragmented stack.

I believe having 1 or 2 vendors max, providing most of your services, is immensely beneficial in the long run, for many reasons.

cf http://boringtechnology.club


I always assumed AWS had the same model as the Amazon store: if they see something gaining in popularity on their platform (or off it), they copy it and do it themselves.

It’s inevitable that things like Vercel, Netlify, Fly, and Render are copied by Amazon. It’s what they do, they want the full stack top to bottom fully integrated.

The thing they will never compete on, though, is developer UX; that's where the smaller vendors (like those mentioned above) will always shine. AWS is now about 100x too big to have a good developer UX - there is just too much there.

The company that is really shaking things up is CloudFlare, they don’t seem to be afraid of playing with the business model to undercut AWS.


Their big money comes from enterprise. Enterprise does not care about developer quality of life.


Exactly, and it shows in how overly complex the whole system has become.


There are a few factors involved but the biggest ones we aren't discussing are demographic shifts and fashion. We've got an entire new generation of developers entering into the marketplace that have no idea what life was like _before_ cloud computing was a thing, and their priorities are different than simply avoiding some capex and some graybeard sysadmins.

In terms of strategy, AWS is completely and totally unapologetic about copying companies built on top of their platform if it helps them in the future. The author suggests that there haven't been this many companies aimed at the vendors before, but that's a lack of insight into the history of how they came about in the first place. When AWS was first launched there were dozens and dozens of vendors competing with, building on top of, and partnering with cloud vendors. Almost none of them exist today.

Snowflake is an incredible exception but not a rule. The reason they succeeded (and the reason I personally believe fly will succeed) is because they're building hard to implement underlying technology and focusing on DX. That's basically it. If you're going to compete on economics you're absolutely hooped.


Don't forget that Snowflake (implicitly) supports a paradigm where all of your data ends up locked into AWS. It would just cost too much to move it elsewhere.

Note the (current) top post on this article:

> hodgesrm 4 hours ago | next [–]

> I'm at AWS re:Invent this week and ... Two themes really stand out in the keynotes and associated presentations:

> 1.) Becoming a home for data and ML services backed by hardware technologies like Graviton, Trainium, and stupid fast network connectivity. Amazon has dozens of value-added services from S3 to analytic databases.

So if Snowflake is helping AWS succeed in AWS goal 1 ("home for data"), I think AWS is smart enough to help Snowflake grow and prosper.


Your data ends up "locked" (as you mentioned, expensive to migrate/replicate) on the cloud platform that your Snowflake instances are running on; AWS, Google or Azure. I'd say it's in ALL of the cloud providers' interests to help Snowflake grow and prosper on their cloud platform.

At the moment it really seems that AWS is most successful here since all of the new and more advanced SF features are coming to AWS-based SF first. Then again, SF started on AWS so probably has most of their roots settled there.

While data replication across clouds in SF is expensive, I'm sure SF could/might absorb this one-time cost for you if you needed to migrate, esp if there was any sort of threat from a particular cloud provider (assuming we're talking about internal stages and tables, not external stages/tables).

It's really in SF's interest to lower this cost (or liability as you might see it) since they're really pushing for a multi-cloud data ecosystem. Just look at their data exchange product and how they're trying to make the underlying cloud platform irrelevant. The only thing really stopping this from taking off is the cost of setting up replicas in any of the regions or clouds that you want to share to or consume from.


> The reason they succeeded (and the reason I personally believe fly will succeed) is because they're building hard to implement underlying technology and focusing on DX.

I agree. This is how I think about successful software. Not entirely sure there is enough supporting data, or at least it's not easy to collect/access.


"fly"?


probably fly.io


In bigger companies, the effort of bringing in a new vendor is substantial. There are contracts that need to be approved by legal, security assessments from IT, billing arrangements from finance, user database integration, and so on. If the company is already using AWS, then any product offered by AWS has a huge advantage.

I am sure that even if independent services are more loved by programmers, there will still be a substantial number of companies using the AWS versions. And since the bigger companies would prefer them, I would not be surprised if AWS earns more money from them than from startups.


Definitely this: If I had to guess, Snowflake gets a lot more customers from organizations with pre-existing balkanized data silos that need to be fixed than from Redshift churn. Just a guess though.


I couldn't really follow this thesis. This observation popped out to me:

> [Redshift] was a brilliant move by AWS, because it immediately lowered the bar for a small company to start doing analytics.

Every sales organization I've ever seen collected too much data. What the kids today call "analytics".

And these orgs don't know what to do with it all. So much so that they delude themselves. Convincing themselves they're divining wisdom from noise. I have no idea what this phenomenon is called.

In fact, most recently, my last gig had decades of data hosted on Teradata and was migrating to Redshift. I worked on the Recommendations team, which was trying to evolve into Personalization.

Teams of data scientists. So much data. A cultural legacy of batch processing hidden behind ML pipelines. So much effort.

It took me a while to figure out most of the "work" my team did was completely fictional. Our most effective recommender algorithm was just showing people what they'd already looked at in the last 6 months. But this simple truth was hidden behind a massive Rube Goldberg machine.

So. What was true of CRM and ERP systems in the 90s remains true today. Collecting data without purpose, without a working hypothesis, without experiments validating the effort, is just wasted effort.

In my time, I've worked with two very smart marketing people. They knew how to design a survey, how to crunch the numbers, and they validated their own work. Once you see how a pro does it, you realize most everyone is just faking it, fooling themselves along with everyone else.

Pretty much just like every other discipline.

--

Oh. How does my cynicism relate to the OC's prediction about a cloud shuffle?

For the users of cloud stuff, making data collection, aggregation, and analysis easier and more accessible is a net negative.

Which I suppose is great for cloud providers.


>Convincing themselves they're divining wisdom from noise.

You are technically correct, the best kind of correct!


> Inferring wisdom from noise. I have no idea what this phenomenon is called.

That's called projection.


> I have no idea what this phenomenon is called.

It's called either pareidolia or apophenia


Perfect. Like seeing faces in tree bark.


This seems like a mix of wishful thinking and extrapolating from a single outlier (Snowflake).


I agree there's too much emphasis on Snowflake, but I think the article still makes a good point, which I interpreted as something like this:

1) Cloud provider core competency is basically IaaS

2) Some level of SaaS was baked in to attract or lock in customers, but isn't really a core competency

3) This leaves room for faster more agile startups to come in and build best of breed SaaS solutions on top of the IaaS, including those areas where cloud providers already have their own solutions.

I don't see a tipping point on the horizon to warrant language like "impending reshuffle", but I think it's an interesting trend to keep tabs on.


The article has an eye-grabbing title that promises quite a bit more than the content covers, which has become the norm nowadays. My read: the article builds on one core thesis --- the big 3 are becoming infrastructure-focused vendors, and more and more software layers, as they move to the cloud, will be owned by vertically aligned vendors with gradually refined market focuses (data warehousing, streaming/messaging, app hosting, etc.; and the pattern repeats itself at even higher levels).

I cannot say I have enough data to agree or disagree with the prediction. Mostly, the Cloud is still very early in its development.

The cloud today only changes how the underlying business model of provisioning computing works. I.e., people now go to the cloud for machines by default. That's a generational transition from the old DC/on-prem model.

But on top of this new model, software is only starting to gain properties that are fundamentally distinct from the old model. Lambda is one such example. Docker/Kubernetes, as the article points out, is actually a very incremental development over the old models (think vSphere, Puppet scripts, etc.).

My sentiment is that a fundamental shift like the cloud only manifests itself after it gives birth to a fundamental shift in how applications are developed and run. There are a lot of relatively new trends in this area, but none of them give me the impression of a killer paradigm. I am looking for something like PC + Windows, Internet + Google, Cloud + ? type of thing.

Let's see in 10 years.


This is the polar opposite of what is happening in cloud.

All vendors are moving up the stack, with the eventual endpoint being even business productivity apps aligned with Azure, GCP, AWS.

They all see IAAS as the commodity end with less value add and lock-in.

This is undeniable through actions and PR statements and has been the observed direction of travel for a decade at least.


Seems like Americans live in some sort of parallel-reality Internet where Hetzner et al. don't exist.


This so much. Seeing cloud spend at companies always makes me laugh.

I could do all that for a fraction of the cost with dedicated servers outside of EC2.

I don't know where this guy worked but I've seen at least 10 cloud migrations in my career, across different clients. It's not that uncommon and not too expensive unless you're relying on weird services on top of your cloud.


Cloud =/= just EC2 and VMs. Why stop at dedicated servers? We could go much cheaper with colocation, or even building a datacenter /s


You're kidding, but I've seen migrations from data centers to AWS (despite committing to much larger expenses) and from AWS to data centers (mainly to save money).

In practical terms, the hassle of managing hardware is far greater than having to manage software, especially now that everyone is on the container boat. Not that it was necessarily different for people on BSD + jails in the past, but that wasn't widespread.

I think dedicated servers are a sweet spot of complexity / cost.


Yes?

Not sure why the '/s'. Options exist, and not just "AWS or GCP".


Sure, AWS, GCP, and Azure are interchangeably expensive


You may not be looking at the same market. Some comparisons:

AWS: $16,000M revenue (Q3)
Azure: $9,000M est. revenue (Q3)
GCP: $4,500M revenue (Q3)
OVH: $181M revenue (Q3, ~$46M of it "cloud")
Cloudflare: $172M revenue (Q3)
Hetzner: $277M est. revenue (full year 2019)

A lot of customers find smaller providers very valuable. But the reason you see so many mentions of AWS/Azure/GCP is simply a reflection that yes, the bulk of hosted services in the world are based on those three. Yes the HN crowd might like the approach or thought of the smaller service providers, but they are much much smaller.


Money is nice, but if you counted where the Internet's websites are hosted, you'd probably see a different picture.

AWS solves mostly an accounting, not a tech problem.


Incidentally, Hetzner has recently expanded their cloud offering with a zone in the US: https://www.hetzner.com/news/11-21-usa-cloud/


Not to detract from the point, but I don't think you can map revenue numbers from Redshift to Snowflake like that. We made that move and ended up with 20x the monthly cost (after budgeting for 10x).

Seems like AWS is still making their money on the underlying layer in our case.


Did you spend more than 20x more on Snowflake? Were there any benefits to justify that massive cost increase?


His whole argument more or less hinges on the fact that the 3 services he picked only had AWS, GCP, and Azure as options, and hence that the big 3 are content to defend their market position. But that's not a done deal; things can change on a dime and the cloud vendors are being vigilant. Their 50-60% margin will be spent on developing more lock-in.

On one side you have Cloudflare. Cloudflare is the Apple of cloud. They're gunning for the big 3 by vertical integration and proprietary products and hence changing the way the market operates.

On the other side you have up-and-coming IaaS and MaaS providers chipping away at the margins. These are either commodity providers that cut the fat, or owners of data centers and/or connectivity who have the synergy to undercut the cloud platforms. This plays into the open-platform push such as cloud native / Kubernetes.

It's almost certain that the cloud providers will lose marketshare in the IAAS space as they simply can't compete on cost. Regulation is also going to catch up to prevent cloud lock-in. So the only play these platforms have is to fight on scale and features which means continuous expansion on all fronts, including higher-level services.


I think this guy is very optimistic on cloud if he thinks in ten years people will still view it as the right way to go. A bunch of monopolies are using various levers to push companies to shift to a model with uncapped spending (literally), with a huge extra profit margin. Cloud has benefits for two narrow segments of business: Rapidly scaling new companies and extremely globalized companies, and makes basically no sense for anyone else.

As regulations catch up, national borders continue to impact how businesses operate across networks, etc., I doubt the cloud model will still look good in ten years. Half the benefit of the cloud is just being... newer. Interfaces are often more modern simply because, well, the Windows Server team all started working on Azure instead. A decade from now, people will look at the pains and problems of working with AWS and Azure the way people look at dealing with Windows Server and hardware maintenance today: Looking a little rough.


> What happened?

Marketing. Everyone has heard of Snowflake. And not just IT people. Finance, marketing, I’m sure even the cleaners have heard how wonderful it is.

Redshift on the other hand is something that a lot of IT people and even some data warehousing people seem unaware of.

There are many people who seem to think that Snowflake is the only option.


Snowflake legit has some features Redshift doesn't, the killer one IMO being auto scaling for computing resources. That is really awesome for infrequent workloads.

Last week AWS unveiled Redshift Serverless, but it's still in technical preview.

Another point: Snowflake is just easier to use. I've seen young developers who knew a bit of database stuff set it up and play with it with little help.

Redshift, on the other hand, is much more entrenched in the whole AWS mess and requires one to understand shit like VPCs, security groups, IAM, CloudWatch, etc....


Maybe, but I've not heard of either.



To me, if you cut out all the garbage, this sounds like going back to managed dedicated hosting. All things are new! Managed hosting is so much better than designing your own stack! Look, you can pay us to just do it and it's like magic.

Seriously?

I hate the fact that most of my projects live on AWS right now. But I don't feel locked in; I can jump off if I want. I don't buy into the entire ecosystem (read: the parts that are difficult to migrate away from), but AWS is just an amazing way to manage lots of resources and patch them together in any way you can imagine. And therein lies the creativity and control -- and cost savings. Because no two projects scale in the same way. The fantastic thing about self-managed cloud environments is that you can find the right size and scalability for each thing, and everything is where you put it. Zero-downtime is up to you. The biggest drag on downtime is people in between you and access to restarting or working directly on those services. In the 2000's, I used to have to put in a fucking ticket with tech support to reboot a server in a datacenter six timezones away, and it wouldn't happen until someone had breakfast.

Do you think people who write code and handle the optimization of it across databases and servers around the world, want to pay a middleman to "magically" do everything - and see why it doesn't scale, and find out who to call when it breaks? No, of course they want control over how their organization runs.

I mean, part of my job is just identifying services we're currently overpaying for. I can take that down to a very fine-grained level with AWS. Why would I want to pay someone else to hide that information from me?

"The edge" is like "no-code visual programming language". No one seriously wants convenience over power; but it's a great marketing pitch.

2030 will not be the revenge of managed hosting brought to you by lazy corporations like Cloudflare. It will be distributed self-managed stacks running everywhere.


I think one thing that articles like this miss is that not all cloud services run, themselves, in VMs. For example, most of Google's don't. While virtualization has improved a lot, it wouldn't be possible to replicate many of Google's cloud offerings with competitive performance and cost on Google's Compute Engine. Cloud Storage, BigQuery, serverless (Cloud Run, App Engine, Cloud Functions), Spanner, Bigtable, and Firestore all come to mind.

One big challenge for cloud workloads is multi-tenancy. If you want to run arbitrary untrusted code on the cloud, your only option is something like gVisor (which isn't exactly a compromise-free solution). Nested virtualization is not secure and will likely even perform worse.


I agree, however I really like what https://firecracker-microvm.github.io/ is doing to change the above.


Can you elaborate on how Firecracker is changing the situation?

My claim is that it isn't possible to build services on top of GCE/EC2 that compete with 1st party cloud offerings in important ways.

The problems with nested virtualization are mostly in silicon and to some extent in kvm. Firecracker doesn't really have much to do with it as the VMM isn't the problem.

You could use Firecracker to build your own GCE/EC2 competitor on bare metal, but that solves a different problem.


You can get bare metal instances without a hypervisor on EC2


Not sure about this. Take Microsoft for example. Databricks runs on Azure, but they are investing heavily on Synapse workspaces so they own both the underlying infrastructure and the application stack on top.

I think the cloud providers have incredible amounts of engineering resources. Their aim is to do anything to get everyone onto the platforms. I don’t see any clear trend coming out in the years ahead. There will be a myriad of solutions (some wholly owned by the cloud provider and some with a third party like snowflake or databricks involved) and customers can choose what works for them.


> Other pure-software providers will build all the stuff on top of it.

I was just watching the AWS keynote and it seems they are doing the opposite, offering more and more ready-made services so that startups don't need to write software at all.

What's the end goal of this? In the future amazon will use ML to auto-generate startups and offer them for sale. Then startup entrepreneurs will not need to hire programmers anymore, they will just buy a premade startup from AWS and become traveling salesmen. Basically gig worker CEOs for Bezos.


The end goal is proprietary lock-in, AKA The Microsoft Strategy.

The reality of auto-ML is that you don't need garbage data to learn garbage. All you need is a "developer" who has no idea how dirty or unstable their data is, how accurate/precise the results should be, or where bias and edge cases manifest before GIGO becomes your project's mantra.

The claims of auto-ML and auto-programming are two sides of the same coin. They're both silicon snake oil.


This theory and the conclusion on what the landscape will look like are spot on -- so spot on that it's already obvious. In fact, it doesn't go far enough. People are overlooking another angle -- the evaporation of the moat around the larger providers in terms of the infrastructure they provide. Every year it gets easier and easier to run core AWS offerings (ex. S3 or EC2-like services) as long as you don't need 5 9s of availability (I'd argue most businesses actually don't need that, and don't have it even if the underlying infra does). Smaller players can run that infra at much smaller scale than AWS with higher relative margins.

Software is easier to administer these days (don't say buzzwords like "cloud native" never did anything for you), and the startling thing is that the larger clouds still have a near monopoly on robustness/service choice. Of course I'm lying a bit here -- OVH has managed databases, and there are Managed Service Providers (MSPs) that run in places that are not the major clouds but the point here is that fundamentally technological leverage and hardware prices are actually chipping away at the moats of the large clouds every day. Larger clouds have to move up the value chain, and that is exactly where smaller companies can have an advantage (not in pricing, but in quality).

Hetzner has just opened its first US data center in Ashburn, Virginia -- and IMO it was worth using when it was only in Germany. Time to interactive is really important, but almost no one worries about this when it comes to backends. Deploying your frontend to CloudFront/Vercel/any other CDN and leaving your relatively simple backend (what you might use for a simple three-tier Ruby/Django app) on a cheaper provider is extremely cost effective, but it requires know-how. Hetzner starting operations in the US is going to introduce new competition in the lower layers of this vertical, and there's absolutely no reason that MSPs can't take advantage of this and grow their customer base. I'm trying to make that know-how portable -- building a cloud that runs on many providers (starting with Hetzner) -- and I'm calling it Nimbus Web Services[0], with a launch early next year.

[0]: https://nimbusws.com


I disagree.

The profit margin/markup cloud providers make increases the higher you climb the abstraction layer. 1GB of RAM is 10X more expensive in Lambda than in EC2, but costs AWS the same to provide.

Second - the services offered are a key selling point and driver of competition between clouds. They all want more features.

Third - considering points 1 & 2, clouds prefer to offer their own versions of things, e.g. AWS and the Elasticsearch fork.
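The 10X figure in point one is roughly checkable with back-of-the-envelope math. A quick sketch, using approximate (possibly stale) on-demand prices -- treat the constants as assumptions, not quoted AWS rates:

```python
# Rough per-GB-hour RAM cost comparison: Lambda vs. an EC2
# memory-optimized instance. Prices below are approximations
# for illustration only, not current AWS rates.

LAMBDA_PER_GB_SECOND = 0.0000166667   # approx. Lambda price per GB-second
EC2_R5_LARGE_HOURLY = 0.126           # approx. r5.large on-demand, per hour
EC2_R5_LARGE_RAM_GB = 16              # r5.large memory

lambda_gb_hour = LAMBDA_PER_GB_SECOND * 3600      # ~ $0.06 / GB-hour
ec2_gb_hour = EC2_R5_LARGE_HOURLY / EC2_R5_LARGE_RAM_GB  # ~ $0.008 / GB-hour

print(f"Lambda: ${lambda_gb_hour:.4f} per GB-hour")
print(f"EC2:    ${ec2_gb_hour:.4f} per GB-hour")
print(f"Markup: ~{lambda_gb_hour / ec2_gb_hour:.1f}x")
```

With these numbers the markup comes out around 7-8x, and that still ignores the vCPUs bundled into the EC2 price, so the effective RAM markup is arguably closer to the 10X claimed.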


> There's never been this many companies going after services that traditionally belonged to the cloud vendors

Interesting article.

The words 'traditional' and 'cloud' in one sentence... I can't stand that. The cloud is just marketing speak for the internet. Make it vague on purpose so you can sell the same sh!t in another package to more people. And grab all the data in the meantime.

HN is smarter than this, right!?


Obviously the cloud vendors will try hard to foist lock-in on customers, and savvy customers should be trying hard to avoid those proprietary high level services

But what percentage of cloud programmers are wise to this? The individual programmers make the decisions that cause lock in

Cloud vendors know how locked in you are when you negotiate discounts, but by that time the key decisions are long past


Erik's contention is that the second part after the 'and' isn't obvious at all. There's no point avoiding lock in because you're locked in anyway. You might as well just use the features and services that best suit your needs, because that way you'll make best use of your developer time and resources. I think he's right, the costs of any migration to a new cloud service are so high that it's pointless making gestures pretending you're preparing for it.


There are huge discounts available. You don’t get them by moving, you get them because you’re not locked in


Every now and then you need to stand up a completely new system, or replace an existing system. At those times you get to pick what platform you will build it on, and Amazon wants you to pick AWS. Therefore AWS needs to be competitive.

In the vast majority of cases that decision really isn't going to be influenced by whether your existing system on AWS does or doesn't use Amazon specific tech. If you're "moving" your service from AWS to Azure those kinds of implementation details just don't matter, in most cases you're effectively going to build a replacement system and not really migrate an existing one. The costs of such a migration would in most cases dwarf any possible savings, no matter how carefully you tried to avoid lock-in.

I'm obviously generalising a fair bit, but the exceptions are likely to be less complex systems that are perfectly fine using fairly generic services.


Great post, thank you for sharing. I look forward to seeing this being reposted on HN in 2025, then again in 2030 for us to discuss.


This ignores a major factor driving usage of cloud-native services — namely that you don’t have to go through a corporate procurement process to use them, you can just use them. Which is why you’ll see any service offered by the major cloud providers achieve meaningful market share, even with an inferior product.


"Let's say a customer is spending $1M/year on Redshift. That nets AWS about6 $500-700k in gross profits..."

Can AWS seriously buy and operate infrastructure at half the price of anyone else? I can't imagine that being cheaper than running the datacenter yourself.


Yes.

They simply call city/state/country X and tell them: Yo, we are going to build a datacenter farm for AWS here. You are not going to tax us, you will provide half the funds, and we will try to source the workforce in your region, mkay?

And that is how their datacenter is almost gratis, and ours costs millions.

Once you are at AWS scale, things are different.


> Cloud vendors might be pretty happy making money just in the lowest layer. Margins aren't so bad and vendor lock-in is still pretty high.

But won't vendor lock-in be significantly lowered if the predicted scenario comes to pass?


I read the article and the comments here.

Regarding the article: "Kubernetes will be some weird thing people loved for five years"

What will replace it? It seems to me that people are moving towards K8s because it's the right abstraction.

Regarding suggestions to use Hetzner: I just looked because, while I knew the name, I didn't know the offerings. All I see is web hosting and Linux VMs.


There's a very particular kind of writing style happening here that I think I attribute to Matt Levine:

> If that customer switches their $1M/year budget to Snowflake, then about $400k goes back to AWS, making AWS about $200k in gross profits.

> That seems kind of bad for AWS? I don't know, we ignored a bunch of stuff here.

> …

> I'm not so sure? I spent six years as a CTO and moving from one cloud to another isn't something I even remotely considered. My company, like most, spent far more money on engineer salaries than the cloud itself. Putting precious eng time on a cloud migration isn't worth it unless cloud spend starts to become a significant fraction of your gross margins. Which is true for some companies! But those are in a minority.

I like this style -- conversational, well-cited, like an industry insider giving you an off-the-cuff read on their discipline at a party -- and I'm glad we're seeing more of it. Do I misattribute it to Money Stuff? Was someone doing it before?

(Some of my formative reading experiences were with Art Buchwald and Erma Bombeck and I think I see a hint of that too.)


haha, author here, I do admit that Matt Levine is one of my favorite writers today


Sorry to hijack the thread here. Why is your prediction that IBM gives up "hybrid multi-cloud"? Isn't hybrid multi-cloud exactly aligned with the future that you are portraying here (sans IBM's bet on k8s, which we could have another debate on) that what runs on top of cloud infrastructure will be available across multiple clouds?

Disclaimer: I work for IBM, and my opinions are my own.


monopolies consolidate, and no tech company wants to be just a utility. my prediction is the monopoly dark age deepens, innovation stagnates. cloud is really monetized technical debt


I don't think so.


I’m waiting for graphics cards to be accessible again so we can do ML generative art, music, and multimedia at home and implode the cloud for anything but storage and recovery of algorithmic libraries that regenerate seeds.

Sorry/not sorry but I’m sick of this bizarre startup fetish for syntax art and finance as usual.

The algorithms are public domain. It’s still feudal property ownership. We call caste cohort now, having conjured another abstraction to confuse the rubes.

We send kids to college to define statistical objects of unknowable information network effects in many contexts to keep them busy. We'll probably just morph to supporting the logistics network in the abstract. I can see teens and 20-somethings basically nationalizing that approach once enough old people die. It'll all be boxed up behind tidy APIs.



