Agent Interview: Eric Newbury on the super performant new Roc language
- Eric Newbury
- Cathy Colliver
How did you first find out about Roc?
One of our agents posted a talk about Roc in one of the Slack channels at Test Double—I think it was the functional-programming channel. I was fascinated because it’s a desire I’ve had for a while, which was: I love writing Elm, it would be great if I could write it on the backend. And that’s essentially the idea of Roc, which is a compile-to-binary functional language.
How are you supporting Roc as a new language?
So I watched all the talks, and at the end of the talk the language founder, Richard Feldman, gave a callout: “if anybody’s interested in playing around with the language, it’s not an open repository yet, because we’re still just trying to get the groundwork laid down, but send us an email, we’ll add you in.” Richard Feldman is a prominent speaker in the functional programming and Elm communities and authored one of the original Elm books out there, Elm in Action. I’ve been watching his talks for a few years, so I worked up some courage and finally sent him an email. He added me to their private Zulip chat workspace and suggested, “if you’re interested in contributing, here’s a list of good starter tickets.” So I took some growth time to give it a shot and started a bit at a time. I only have 3 functions implemented in the standard library now, but I’ve been able to pair with one of their main developers and am starting to grow my knowledge. I think there are roughly five or six core developers who have been there from the beginning, and now there are roughly 60?
What interested you most about Roc as a new programming language?
One of the biggest innovations in my opinion about this as a language is there’s a new layer of abstraction that I don’t think has existed before in how a programming language fits into the technology stack. Typically a programming language is bound to its runtime, one-to-one.
All that backstory to say that most languages are typically bound to a single runtime, but there’s an appetite for having access to different runtimes based on the type of application you are writing. Roc explicitly separates the language from the runtime. The language implementation itself shouldn’t be tied to how it’s being used, or to the purpose it’s being used for. That also pairs really well with the functional mindset, which is that all functions should be pure functions, meaning that they have no side effects. The way that you create effects in your system (because if your code can’t have an effect then it’s not doing anything, that’s the whole point of code) is by managing effects instead of causing side effects. That means your code doesn’t just say, “execute this HTTP request right now”; it says, “Hey, here’s the recipe for how to fetch data from a server,” and then hands it to some kind of central piece of architecture that’s responsible for managing those effects.
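To make the “recipe” idea concrete, here is a minimal sketch in Python (not Roc). The names `HttpGet`, `fetch_user`, and `run` are hypothetical, the URL is a placeholder, and the runtime fakes the network with a dictionary. It only illustrates the general shape of managed effects, not how Roc or Elm implement them.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class HttpGet:
    """A description of a request; constructing it performs no I/O."""
    url: str
    on_response: Callable[[str], str]  # pure function applied to the response body


def fetch_user(user_id: int) -> HttpGet:
    # Pure: returns the recipe for fetching a user, not the user itself.
    return HttpGet(
        url=f"https://example.invalid/users/{user_id}",
        on_response=lambda body: body.upper(),
    )


def run(effect: HttpGet, fake_server: Dict[str, str]) -> str:
    # The one place where effects are carried out. A real runtime would make
    # a network call here; this sketch looks the URL up in a dictionary.
    body = fake_server[effect.url]
    return effect.on_response(body)


recipe = fetch_user(7)
result = run(recipe, {"https://example.invalid/users/7": "ada"})
print(result)
```

Because `fetch_user` only builds a value, it can be called any number of times without touching the network; only `run` would need to change to talk to a real server.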
That’s kind of how Elm works, but it’s specific to a front end system where that managed part is the thing that interacts with the browser. For a general purpose language that’s a little bit harder to separate, which is why languages like Elixir don’t really have that hard separation. You can in fact have side effects within functions in Elixir, but there are many books out there describing architecture to make it more heavily structured. Designing Elixir Systems with OTP is one of the books that I’ve read that proposes patterns to separate out effects and keep the majority of your codebase “pure”. But at the end of the day Elixir doesn’t enforce it as a language, and Roc wants to do that. Therefore it’s trying to create an abstraction for that managed area, the architecture that’s not built into the language. So the language isn’t assuming it knows what you’re trying to solve. It’s just saying here’s a language, here’s how you write code, and then here on a separate layer of abstraction, called a “platform”, is where you choose your architecture and runtime.
The core standard library of the language itself doesn’t have any concept of where it’s running. There’s not even a concept of input/output. You can’t just write to the output, because the language itself doesn’t know where it is. It doesn’t know what it can do. Any kind of effect, including IO, is built into the platform. So, you’ve got a command line platform, and you’ve got a web server platform, which is essentially a platform that gives you hooks for declaring your routes, background jobs, different processes that could spin up, etc.
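As a rough illustration of that split, the sketch below (Python, not Roc) keeps the application free of I/O and lets two hypothetical “platforms” decide what writing output means. `Write`, `app`, `cli_platform`, and `capture_platform` are names invented for this example and do not correspond to Roc’s actual platform API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Write:
    """A request to emit a line of text; the app never performs it itself."""
    text: str


def app(name: str) -> List[Write]:
    # Application code: pure, with no knowledge of where it is running.
    return [Write(f"Hello, {name}!"), Write("Goodbye.")]


def cli_platform(effects: List[Write]) -> None:
    # One platform interprets Write as printing to standard output.
    for effect in effects:
        print(effect.text)


def capture_platform(effects: List[Write]) -> List[str]:
    # Another platform collects the text instead, e.g. to build a response body.
    return [effect.text for effect in effects]


cli_platform(app("Roc"))
captured = capture_platform(app("Roc"))
```

The same `app` runs unchanged on both platforms; only the code that interprets `Write` differs.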
And then on top of all that the compile target for the language isn’t just one thing. It’s not just compiling to Java bytecode or BEAM bytecode for Elixir or whatever it is that your language is compiling to. The main one is LLVM, which is basically a low-level compiler toolkit that allows you to target all the major CPU architectures. Its intermediate representation works like an assembly language that’s agnostic of the target architecture, and LLVM offers a lot of optimizations for each final target.
Benchmark testing of Roc at this point has shown that it can be almost as fast as C++, and it’s been shown that it’s faster than Go on some benchmarks. They have optimizations in the Roc language that allow it to do in-place mutation under the hood when it’s possible. So even though it’s a functional language it has performance that can match up to C++ imperative programming style.
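The idea of mutating in place “when it’s possible” can be sketched with a toy model. The Python below assumes the decision is based on whether a value has exactly one owner, in the spirit of reference counting; that mechanism, and the names `CountedList`, `share`, and `append`, are assumptions made for illustration rather than a description of Roc’s compiler.

```python
from typing import List


class CountedList:
    """A list with an explicit owner count, standing in for a reference count."""

    def __init__(self, items: List[int]) -> None:
        self.items = items
        self.owners = 1

    def share(self) -> "CountedList":
        # Hand out another reference to the same underlying list.
        self.owners += 1
        return self


def append(values: CountedList, item: int) -> CountedList:
    """Behaves like a pure append: other references never see a change."""
    if values.owners == 1:
        # Sole owner: nobody else can observe the mutation, so reuse the storage.
        values.items.append(item)
        return values
    # Shared: give up our reference and build a private copy instead.
    values.owners -= 1
    return CountedList(values.items + [item])


unique = CountedList([1, 2])
grown = append(unique, 3)      # sole owner: storage is reused

shared = CountedList([1, 2])
alias = shared.share()         # a second reference to the same list
copied = append(shared, 3)     # shared: a copy is made, alias is untouched
```

From the caller’s point of view both calls look like a pure function returning a new list; the in-place update is an optimization that is only taken when it cannot be observed.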
Additionally, they’re also writing other implementations of it, for WebAssembly and not just LLVM, so the language itself is much less bound to what it’s going to be run on. There are multiple groups of people within Roc working on different kinds of compilers for different purposes. I could even see this getting compiled to BEAM code that could run with all the power of the Erlang runtime, and then creating a platform that allows you to hook into OTP, which would be really, really cool. That would be exciting.
What types of programming could be interesting or possible in Roc?
Roc definitely seems targeted for use as a low-level language, in place of something like C++, where functional languages don’t yet have a foothold. Theoretically you could be writing something in Roc instead of Rust or C++. Rust is kind of halfway between. It’s doing a little bit of stuff that is more modern, with slightly functional concepts like pattern matching. But to me it’s still just a safer version of C. Roc is truly a pure functional language that is optimized for lightweight low-level applications, embedded systems, that kind of stuff.
What do you like most about functional programming?
The things I like most are subjective. For instance, I find functional code to just look more attractive, like on a purely aesthetic level. It just flows in a direction. To me it’s much more readable. If I’ve designed it well, I can read it from top to bottom like a recipe that just flows. Starting with this I want you to do this, this, this, and this. And you get to the bottom, and it’s just very clear what’s happening. I’ll give you something more concrete.
One of my favorite things about functional programming is the separation of behavior and data. This is the exact opposite of object oriented programming, which packages up data and behavior into an object: an object has state, and an object has methods, which are behavior on that state. Conversely, in functional style you simply have collections of functions (in modules) and data types, but state lives elsewhere.
Having done both for a while now, it tends to be easier for me to reason about code when I can take a batch of functionality and test them all individually, you know, removing data from the situation. “Okay, there’s no data here. Given this, I expect you to do this,” and it’s just clear instructions. You can test that inside and out.
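A small sketch of what that looks like in practice, written in Python with hypothetical names (`LineItem`, `subtotal`, `apply_discount`): the data is a plain immutable record, the behavior is a pair of pure functions, and checking them needs no setup beyond passing in values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LineItem:
    """Plain data: no methods, no hidden state."""
    price_cents: int
    quantity: int


def subtotal(items: List[LineItem]) -> int:
    # Behavior lives in functions that take data in and return data out.
    return sum(item.price_cents * item.quantity for item in items)


def apply_discount(amount_cents: int, percent: int) -> int:
    # Integer arithmetic keeps the result exact.
    return amount_cents - (amount_cents * percent) // 100


cart = [LineItem(price_cents=250, quantity=2), LineItem(price_cents=100, quantity=1)]
total = apply_discount(subtotal(cart), 10)

# "Given this, I expect you to do this": each function is checked on its own.
assert subtotal([]) == 0
assert subtotal(cart) == 600
assert apply_discount(600, 10) == 540
```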
The data types are also very concrete. It’s just structs and primitives, things that hold data in very simple forms. When you remove state from the equation and are only working with functions and data, testing becomes very easy. So maybe I’m homing in on the real reason. When you combine data and behavior, that inherently ends up creating state. In order for those two things to live side-by-side, state must exist in all of those places, which means there’s some knowledge that exists in space and time. Whereas, if you separate those two things, that no longer exists. You have behavior that exists outside of space and time. And in a separate location you have state, managed as a single responsibility.
Dealing with state is not really something functional programming talks about as a core principle, that I’m aware of. It’s just something that it’s forced to do in a different way. Different languages do this in different ways. But at its core, functional programming basically centralizes state into one place and that also, once again, makes it easier to reason about.
That is, this is my state management engine, this is my state machine. And it doesn’t involve any behavior. It doesn’t have data structures per se. It’s just an abstract state machine. And behavior and data definitions—everything like that—exists outside of that. This makes it much easier to reason about.
One advantage of doing it all in one place: you end up having fewer issues with race conditions, and different processes conflicting with each other that seem like they should be in separate worlds, but in reality are affecting each other. When you just say “everything with state lives over here”, well now you can’t escape the truth that these two things are happening in the same place. You’re forced to deal more with the atomic nature of updating state and mutating state. It’s not just happening wherever you happen to think you need it to happen. You have to do a little bit more upfront thinking: “I need the state to be in a different place, so let me set it up properly so that it all happens as part of everything else in the system.” It’s not pretending to be isolated when it’s not.
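One way to picture “everything with state lives over here” is a single store that applies messages one at a time, loosely in the style of the Elm architecture. The Python sketch below uses hypothetical names (`State`, `Increment`, `Reset`, `update`, `Store`) and is only an illustration of the pattern, not any particular language’s runtime.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class State:
    count: int = 0


@dataclass(frozen=True)
class Increment:
    by: int


@dataclass(frozen=True)
class Reset:
    pass


Message = Union[Increment, Reset]


def update(state: State, message: Message) -> State:
    # Behavior: a pure transition from one state to the next.
    if isinstance(message, Increment):
        return replace(state, count=state.count + message.by)
    return State()


class Store:
    """The only place that holds state; it applies messages one at a time."""

    def __init__(self) -> None:
        self.state = State()

    def dispatch(self, message: Message) -> None:
        self.state = update(self.state, message)


store = Store()
for message in [Increment(2), Increment(3)]:
    store.dispatch(message)
count_after_increments = store.state.count
store.dispatch(Reset())
```

Every change goes through `dispatch`, so two parts of the program that touch the same state cannot pretend to be isolated from each other, while `update` itself stays trivially testable.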
What is it about getting to a place where it’s easy to reason about the code that is so helpful?
I think specifically as a consultant who is asked to jump into new codebases a lot and then quickly be productive, functional code—or code that’s easy to reason about in general—makes it easy to understand an isolated part of the code without being forced to absorb the entire codebase into your head first.
Aside: Generally what I find is that when object oriented code is written really well, it tends to embrace a lot of the patterns that I see in functional code anyway. So it might just be that functional code happens to force you to have other good habits, and it’s just a kind of transitive property there.
Functional code allows you to just jump into a piece of the codebase, see some code and—without having to know what other things are happening in the system—you can understand what this part is supposed to be doing. And you can write a feature in that part so you can be productive more quickly. With code that’s easier to reason about, specifically functional code, because of those managed effects instead of side effects, you can be pretty sure that what you’re looking at is all you need to know to understand how to write a test and add features to that section.
How do new programming languages help software development evolve over time?
Richard Feldman gave a talk about almost this exact subject, which was kind of a walk through the history of the development of programming languages. It feels like we’ve kind of been on pause for 20 years for any major developments in programming languages, once Java and C++ hit the mainstream. In the last maybe 5 to 10 years, partially because of LLVM, creating a new language is getting easier. You can worry more about the syntax of the language and the design patterns of a language, without necessarily having to waste as much time on understanding the architecture of every processor that’s out there. You can hand off some of that grunt work and let a different platform do that. So I think that’s one of the reasons we’re seeing a surge of languages recently.
I think object oriented programming kind of ran its course a long time ago, but because of how much the tech boom has created business interest in technology and writing applications, there’s been such a focus on productivity and creating products with what we know. There has not been as much push for innovation. There have still been pain points with object oriented programming. I think people have felt those, and that’s why even in object oriented languages, best practices tend to lean away from purely OO concepts like inheritance and embrace more functional concepts like composition. But with the number of up-and-coming developers getting started, going through camps, going through universities, the focus is building competence with mainstream tools to be qualified to get existing jobs. But I think this status quo is coming to its logical end, where we are seeing benefits of other design patterns. And now we have technology that makes it a little bit easier to start acting on those amazing innovations.
I don’t think it’s really going to change what we’re able to create per se, because at the end of the day we’re all running code on the same architectures. And those innovations are happening separately with new kinds of processor architectures that don’t really have to do with the programming languages that run on them. But what it does do is change the kind of code we write and basically allows the code to be much less buggy. I believe that the number of bugs reported from functional languages is lower than the number reported from object oriented languages. (Though there are many variables there, for instance the proportion of senior developers writing functional code is higher.) So embracing functional languages going forward is going to allow you more productivity, hopefully less tech debt, fewer bugs.
A lot of these new languages are also going back to embracing static typing, like Elm and Roc. It’s one of the reasons I was excited about Roc, even though I love Elixir (also a functional language), because Roc is a statically typed language and I miss that. And it’s statically typed in the best way, not in the Java way. It’s not the way where you have to define every little thing, which makes for a ton of really verbose code and gets stuck on things that it should be able to just do. New statically typed languages don’t force you to be super verbose, but they are very precise.
For instance, you can do a lot with type inference, where the compiler can still make guarantees on types without you having to explicitly define them. This is similar to what TypeScript does, except TypeScript takes it as far as it can, but then can essentially “give up” and lose the ability to give a compile-time guarantee. In Roc, the types are never wrong (unless there’s a bug of course).
The main advantage is that a statically typed system is going to allow you to have more confidence about your codebase. It’s going to allow you to refactor your codebase without having to remember anything or go searching your codebase for matching strings. You can literally change something wherever you want to and you can know, for instance with Elm, with almost 100% confidence that your code will not compile until you have fixed all of the places where you were supposed to update that reference.
I think we’re going to start seeing more languages going back to embracing static typing because the pain of it is much, much less, and the benefits are much higher than they used to be. The compilers are also much more user-friendly the way they’re being written now, so there is a huge movement in programming languages to create empathetic compilers. They’re usually written in first person. I know that sounds weird, but when there’s an error message it will say something like, “I ran into this issue, and I couldn’t do XYZ. Here are some places that I found that might have a typo.” The idea is that the compiler isn’t there to yell at you, it’s there to tell you what it knows and what its best guess is as to the source of the problem. “I ran into this kind of a situation and here is what I found the options are. Good luck!” Having a statically typed system gives the compiler more knowledge, so that it can actually be more helpful.
You can move faster. Typically the selling point of dynamically typed languages is that theoretically you can go faster because you’re not worrying about specifying every type. The language just kind of figures it out and goes on its merry way. I would argue now that using a statically typed language actually enables you to move even faster, because maybe there are a few places where you have to define types and set those things up ahead of time, but that time is way less than the amount of time you end up saving by using these empathetic compilers.
You never have to waste brain power writing extra tests to find out “did we break something in here we don’t know about?” You can lean more heavily on your compiler and on your type system than you did on your tests, and now your tests can truly focus on the business logic. It’s not going to be about 100% code coverage, checking “did we touch every part of our codebase?” just to make sure that your application actually runs. No. Your tests will now be checking, “does this solve the problem that I wanted it to solve?”, and not some other problem. If it compiles, you know your code is going to run, and it’s going to run without errors. But the tests are there to make sure it’s actually doing the thing. It’s actually solving the problem that was intended. That shift in the purpose of testing will be a really cool thing for the future of programming languages, and potentially turn testing from a chore into a really enjoyable experience.
This interview is based on a recorded conversation with Eric Newbury and Cathy Colliver. It may or may not self-destruct.