Kubernetes may be the current darling of the open source crowd, but Hadoop was no less revered before it. Hadoop ultimately ran out of gas because it was incredibly hard to use. Kubernetes, though making strides, remains “no picnic to operate,” as Capital One’s Bernard Golden has stated. That’s a very diplomatic way of saying, as others have, that the Kubernetes “experience [can be] a pain in the ass.”
Is Kubernetes heading toward a Hadoop-style exit?
Probably not. While Hadoop got more complicated with age, Kubernetes keeps getting easier. It will likely never be “easy,” per se, but its complexity differs from that of Hadoop in critical ways, paving the way for Kubernetes to remain an industry standard for years to come.
Hadoop, the complex gift that kept on taking
Let’s first be clear about Hadoop. Or not clear, as the case may be. Apache Hadoop was complicated enough when it roughly translated to “MapReduce.” Over time, however, it kept evolving, and while that evolution led to more powerful options, those options proliferated. They also weren’t necessarily easy to get working together. As Tom Barber stated, “What does Hadoop actually do? MapReduce was replaced by Spark was replaced by other stuff and so on. Of course you can plug a lot into it but it’s still clunky.”
Why clunky? VMware’s Jared Rosoff captures the problem nicely: “Hadoop’s complexity comes from the fact that a typical Hadoop setup basically consists of dozens of independent and complicated systems that had different lifecycles and management models.” Flume, Chukwa, Hive, Pig, ZooKeeper, and so on. Clever names but a nightmare to get everything working together. Hadoop is a “complex stack of solutions,” argues Host Analytics CEO Dave Kellogg, and all that complexity is borne by the user.
Perhaps the biggest difference from Kubernetes, however, is the model used to extend Hadoop. As Rosoff notes, “Hadoop didn’t think about how people would extend it and the result was an ecosystem of incompatible extensions.” By contrast, he continues, “One thing Kubernetes gets extremely right is structuring the way it gets extended. Operators, CRI/CSI/CNI, ensure that as more vendors pile on, they do so in sane ways.” In other words, unlike Hadoop and its incompatible extensions, “Kubernetes with dozens of operators is still Kubernetes.”
Kubernetes, complexity you can rely on
That’s not to say Kubernetes is simple. As one of Kubernetes’ creators, Joe Beda of Heptio (VMware) is in a good position to declare, “Kubernetes is a complex system.” That complexity is somewhat necessary, he goes on, because “It does a lot and brings new abstractions.” Does everyone need all of those abstractions (and bells and whistles) all of the time? No. “I’m sure that there are plenty of people using Kubernetes that could get by with something simpler.”
But for those who need Kubernetes, Beda stresses, it’s not necessarily more complex than other systems with which people are already familiar. It may simply be “new” complex versus “old and comfortable” complex:
[A]s engineers, we tend to discount the complexity we build ourselves vs. complexity we need to learn. When you create a complex deployment system with Jenkins, Bash, Puppet/Chef/Salt/Ansible, AWS, Terraform, etc. you end up with a unique brand of complexity that you are comfortable with. It grew organically so it doesn’t feel complex.
But bringing new people on to help on an organically grown system like this is difficult. They may know some of the tools but the way that you’ve put them together is unique. This is a place where, IMO, Kubernetes adds value. Kubernetes provides a set of abstractions that solve a common set of problems. As people build understanding and skills around those problems they are more productive in more situations. There is still a steep learning curve! But that skill set is now valuable and portable between environments, projects, and jobs.
Catch that? Unlike complexity that lives in a particular deployment system you may have built at Company X (and is unique to that company), the kind of complexity you master with Kubernetes can follow you from company to company. In this way, it becomes effectively less complex than these other systems, because the knowledge is portable. Put another way, “learn once, apply everywhere.”
Learn once, apply everywhere
That learning, in turn, is much easier than it ever was with Hadoop. Kubernetes is an easier system to become familiar with, in part because of where it can run. As Gareth Rushgrove writes, “You can run Kubernetes locally much, much easier (Docker Desktop, Kind, MicroK8s) than the other similar examples. Lowering the barrier to entry makes it easier to become familiar, which combats perceived complexity.”
It also helps, as Cloud Native Computing Foundation executive Chris Aniszczyk stressed, that while “distributed systems are inherently complex, the upside with Kubernetes is that every major worldwide cloud provider and multiple vendors offer a managed conformant/certified version of it (no forks) which helps most users with complexity of managing at scale.” Even so, perhaps the right question, Tamal Saha indicates, is whether “Kubernetes [is] complex given the problem it tries to solve.” For him, the answer is no.
That is the same answer to the question, “Will Kubernetes get Hadooped?” Kubernetes is already well past that stage. Yes, as one commentator has posited, Kubernetes is “a complex orchestration tool, and not ideal for all use cases. Like so many of the tools in our space, it also takes time to learn, use, and understand. ‘A few hours’ isn’t going to be sufficient.” It’s a complex tool solving a complex problem. But there is “intentional complexity and accidental complexity,” as Beda argues. Hadoop suffered from the latter, while Kubernetes involves the former.
For these and other reasons, we should see Kubernetes continue to thrive as the industry standard for container orchestration.