In The Olden Days™ of web development, a single FTP server and a bash script were all that was necessary to deploy and serve an application. If it worked on your machine, it was good enough for the rest of the connected world. (Of course, if a server failed, a dozen sysadmins would scramble to the rack and swap out components.)
As technology improved and access to the Internet broadened, more and more operational knowledge was required to accomplish the same goal. Developers adapted by learning how to configure the nitty-gritty details of their servers. We picked the operating system, the database, and the web server; we learned how to mitigate disaster scenarios and manage package dependencies; we tweaked our instances’ regions and load balancers. Eventually, with the rise of bigger cloud hosts, we embraced the practice of DevOps, a methodology that let us treat operational work the same way we treat code. The tooling to spin up a server could now be reviewed, tested, and automated.
DevOps certainly made operations easier. But it’s also made development a lot harder. By conflating the role of an application developer with the role of an operations engineer, organizations spend more time managing their infrastructure than tackling the real challenge: delivering features their users will value.
When Test-Driven Development was “rediscovered” in the early aughts, it wasn’t that teams hadn’t been writing tests. Rather, TDD emphasized that writing tests was fundamental to software development: the responsibility of the engineer writing the software, not something thrown over the wall to a QA team somewhere. Testing your code became an essential part of the development process. The newfound expectation that developers would be responsible for testing their own code created a virtuous cycle throughout the industry: programmers received immediate and continuous feedback that they were both building the right features and building the features right.
Similarly, DevOps attempts to shift the duty of managing the operation of an application onto the person writing the code. Asking developers to reason about their infrastructure sounds like a move towards the same goal as asking developers to write tests alongside their features. Just as TDD did with testing, DevOps was envisioned to bring a system’s operations into the development feedback loop. But there’s a false equivalence between operations work and testing code. Writing good test suites requires one to think about the flow of software—the implications of different states and behaviors—which is a natural part of development. Reasoning about how much disk space is necessary, which operating system to use, or which version of a package to compile code with is, frankly, much harder. The difficulty of operations work demands more emphasis, not less.
DevOps has only managed to abstract the complexity of running a server, not eliminate it. While implementing functional tests is a fairly standardized practice among teams, the industry is still iterating on what a functional DevOps workflow looks like. We’re noticing a trend where organizations believe that by adopting DevOps practices and tooling, developers will be able to build applications more efficiently, but this isn’t necessarily the case.
The false premise of DevOps is that your team will be able to ship features quickly, since presumably, application developers will be able to spin up whatever infrastructure is necessary on their own. However, we’re finding that organizations are wasting more time configuring their operations than they are purportedly saving.
Organizations need to consider what their teams’ workflow will be before permitting developers to operate their infrastructure. They need to plan a strategy where every developer is using the same set of guidelines to deploy their apps. If an application is designed as a collection of microservices, it becomes imperative to prevent different groups from juggling different contexts. In short, there are human and communication issues to discuss before diving into the technical aspects.
Even after selecting a hosting platform or IaaS tooling base, changing your mind in the future can prove costly, both in terms of time and money. If you choose a certain cloud provider and tooling system, and design an automation workflow around them, the effects of vendor lock-in become real. Migration to different services is difficult: you can’t simply swap AWS for GCP, Azure, or DigitalOcean, or Puppet for Ansible or Terraform, without redesigning your entire process.
Prior to establishing a DevOps workflow, teams will also need to identify bottlenecks that they know will be out of their control, such as DNS configurations or security groups and access rights. It’s not impossible work, but it is work. The question is whether that’s an appropriate focus for application developers to reason about before they write their code.
Certainly, developers should have a good understanding of how all the parts of an application work together. Knowing the strengths and limitations of your infrastructure is essential to developing reliable software. While there will be developers who like to analyze the infrastructure side of operations, there will also be many who won’t—and that’s okay! We would like to see teams that encourage both sides of application management: how it works and how it runs, rather than forcing both considerations onto every application developer.
We call this methodology NoOps: the idea that operations can be so seamless that it doesn’t even appear to exist. Organizations should encourage collaboration between ops-minded developers and the development teams whose needs they serve. Insisting that every team adopt an IaaS workflow is often too great a responsibility too soon, particularly if the DevOps flow is forced rather than desired.
If you’re a smaller startup, or even a company with non-technical scaling issues, we encourage you to work with existing PaaS services like Heroku or Zeit. Deploying your application is then just a single git push away. The higher costs of these services, relative to running your own infra on other cloud providers, are inconsequential compared to the time you’ll save, not to mention the salaries of dedicated ops experts. If your team is spending more time working on tools for infrastructure than delivering meaningful features, something is wrong. Focus on building features for your users, and worry about fine-tuning your operations when you’ve hit it big!
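To make the contrast concrete, here is a minimal sketch of what a first deploy to a PaaS like Heroku can look like; the app name my-shiny-app is a hypothetical placeholder, and this assumes you have the Heroku CLI installed and a git repository with a runnable app (e.g. a Procfile) already in place:

```shell
# Create a new Heroku app (my-shiny-app is a placeholder name)
heroku create my-shiny-app

# Deploy: Heroku detects the language, builds, and releases on push
git push heroku master

# Tail the logs to confirm the release booted
heroku logs --tail
```

That is the entire operations story for many small teams: no provisioning scripts, no load balancer configuration, no instance sizing. The trade-off, as noted above, is a higher per-dyno price than raw cloud instances.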