By Wojciech Gawroński | October 18, 2018
You may have already read the first part of this series. If not, it is more or less a business case for serverless computing: I explained the what and the why behind serverless, and discussed the architectural, economic, and operational impact it has on your systems and products. We left off with a fascinating question, wondering whether the first word in the FaaS acronym (Function as a Service) means something special for functional programmers.
I went down that rabbit hole, which turned out to be an exciting chance to build know-how that we can leverage for our clients. The following blog post is a distilled version of that journey.
I would like to tell you a story about how much yak-shaving is required to use our beloved functional languages in the serverless world. The whole journey is backed by personal experience; moreover, I ran experiments that gave me not only a perspective on which functional language is best suited for serverless computing, but also a much broader understanding of where the serverless approach applies.
Is FaaS functional by nature?
Knowing the what and the why, let's talk about the technology available to us and return to our topic: the first word of the FaaS acronym suggests that we, as functional programmers, are in a privileged position. That is far from true.
First, I compiled a list of available functional programming languages. For this purpose, a functional language was one that enforces the functional paradigm; I did not consider multi-paradigm languages in my analysis.
Now let me introduce the top three most widely used public cloud providers and their capabilities in the FaaS (Function as a Service) space.
As you can see, the most popular platforms offer similar options. Node.js seems to be the most popular runtime choice, with Python, JVM, and .NET equally important. We can also use provider-specific runtimes like PHP or Go. One thing that speaks directly from the diagram: when it comes to functional languages, our choice is limited from the start.
Also, please note that none of those runtimes supports functional programming as a first-class citizen. It may sound controversial to many people, but I would say that .NET is the closest (it supports tail-call optimization via additional opcodes emitted by the F# compiler). The JVM has no support for it - at least for now - and Node.js (effectively V8) has it, but many things cause a bail-out (deoptimization when a particular condition is met, e.g., using exceptions).
There is one additional thing I have to mention before we finally seal the list of functional languages for our analysis.
There is a well-known workaround that allows spawning any executable from inside our package - it is called a shim. It is a small program, often written in Node.js or Python, that spawns another executable and redirects its input and output to the proper places. Thanks to it, many people could use Go before AWS officially supported it, or use Haskell, for example. I made one such exception in my analysis and included a particular language, to compare this workaround with fully supported runtimes.
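The idea can be sketched in a few lines. The snippet below is a minimal, hypothetical shim (not the exact code used in the benchmark): it spawns a native binary, feeds it the serialized event on stdin, and returns whatever the process prints on stdout. The binary name `./bootstrap` is an assumption for illustration.

```python
import json
import subprocess


def handler(event, context=None, command=("./bootstrap",)):
    """Minimal shim: forward the event to a native binary via stdin,
    and return its stdout (assumed to be JSON) as the response."""
    process = subprocess.run(
        list(command),
        input=json.dumps(event).encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,
    )
    return json.loads(process.stdout)
```

The real shims people used were usually Node.js one-liners around `child_process.spawn`, but the mechanics are the same: serialize, spawn, pipe, deserialize.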
Limited by the available runtimes, I chose the following languages to compare: PureScript, BuckleScript, ClojureScript, F#, Scala, and Clojure. It is worth noting that I performed the tests on just one cloud provider - AWS.
I deliberately skipped languages that can be supported only via shims (e.g., Erlang and Elixir - believe me, I did that with an aching heart), Elm (as it is an inherently front-end language), and Frege and Eta (I could not get either to run on any platform). I also skipped OCaml, but BuckleScript proudly represents the ML family. As for the exception mentioned earlier, I made it only for the king of functional languages - Haskell.
Having introduced the technology and the choices I made, let's discuss the methodology and analyze the measurements, along with some conclusions.
First things first: the complete source code is available on our company's GitHub. You can find it here: patternmatch/functional-programming-in-serverless-world.
Let's start by explaining what each implementation contains. Inside each lambda, we have two handlers, both accessible via Amazon API Gateway HTTP endpoints (I used the POST method to eliminate possible HTTP caching).
- Echo - what comes in, goes out.
- A functional and naive implementation of the sieve of Eratosthenes - a prime number generator.
The first one does not do any real work: it rewrites the body and content type from the request to the response. The second handler is a basic implementation of an algorithm that lists prime numbers up to a given bound - it is parametrized by a single number and returns a list of primes. In both cases, the implementation does not perform any unnecessary work, e.g., logging long strings.
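For reference, the "functional and naive" sieve can be sketched as follows. This Python version mirrors the idea only - the exact implementations live in the repository linked above - and deliberately uses the naive recursive filtering style rather than an optimized mutable sieve:

```python
def primes_up_to(limit):
    """Functional, naive sieve of Eratosthenes: take the head of the
    list as a prime, filter its multiples from the tail, recurse."""
    def sieve(numbers):
        if not numbers:
            return []
        head, tail = numbers[0], numbers[1:]
        return [head] + sieve([n for n in tail if n % head != 0])
    return sieve(list(range(2, limit + 1)))
```

The recursion depth grows with the number of primes found, which is exactly why this handler makes a good stress test: memory usage and execution time both scale visibly with the input parameter.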
Then we have four scenarios that I would like to test:
- Package Size.
- Memory Usage.
- Execution Time.
- Cold Start.
I would like to start by measuring package size, then go through memory usage (which depends on the upper limit of generated primes), execution time (also dependent on that parameter), and cold start time. Scenarios two and three use the prime numbers endpoint; the fourth scenario uses the echo handler.
You may wonder why we bother measuring package size. It turns out that among the many limitations, some are enforced on your code storage and on how many packages you can have at a given time per region.
When it comes to the second and third scenarios, I want to emphasize that I was not looking for the leanest and fastest runtime. What I am looking for is predictability - the most predictable and stable runtime lets us pay the least amount of money. Stability gives us room for optimizations, e.g., declaring a smaller memory tier upfront.
Moreover, it is not about overall latency, but about measuring code execution time alone - as that is the factor we pay for. We do not want to measure Lambda response time together with Amazon API Gateway latency.
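One practical way to collect exactly that number is to parse the REPORT line that AWS Lambda writes to CloudWatch Logs after every invocation. The sketch below assumes the REPORT format AWS used at the time of writing; field names and ordering may differ on other providers:

```python
import re

# Matches the metric fields of an AWS Lambda REPORT log line, e.g.:
# "REPORT RequestId: ... Duration: 2.04 ms Billed Duration: 100 ms
#  Memory Size: 128 MB Max Memory Used: 30 MB"
REPORT_PATTERN = re.compile(
    r"Duration: (?P<duration>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed>\d+) ms\s+"
    r"Memory Size: (?P<memory>\d+) MB\s+"
    r"Max Memory Used: (?P<used>\d+) MB"
)


def parse_report(line):
    """Extract the metrics we actually pay for from a REPORT line."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {
        "duration_ms": float(match.group("duration")),
        "billed_ms": int(match.group("billed")),
        "memory_mb": int(match.group("memory")),
        "max_used_mb": int(match.group("used")),
    }
```

Comparing `duration_ms` with `billed_ms` also makes the 100 ms billing granularity visible, which matters in the execution-time scenario below.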
Having said that, let’s jump into results!
Let's start by analyzing the most straightforward scenario. Package size is pretty easy to measure. We can see that JVM-based solutions are pretty high on the list. For me, the winner of this round is F#: the most lightweight and powerful at the same time, and the best supported as well. Haskell's footprint is also pretty low, thanks to amazing and aggressive compiler optimizations.
A thing worth mentioning: we need to know our toolchain. Otherwise, we may end up in a strange situation where the tooling blows up the size of our package. I chose ClojureScript not to bash it, but to show how this can end. The highest bar represents a ClojureScript project with no optimizations enabled - the difference between the unoptimized and optimized versions is enormous.
Next element - memory usage:
This time the winner is not visible at first sight. Node.js-based solutions allocate the least memory (mostly thanks to the lighter runtime/virtual machine). Haskell also has a pretty low footprint - this is related to the build mechanism: everything is bundled into a very thin executable that is trimmed and optimized upfront by the compiler. It is a double-edged sword - in case of any problems with that binary we are basically on our own, as it is a workaround.
I am a little surprised by ClojureScript's high position on that list, especially when generating the list of prime numbers for larger inputs. This time I would point to PureScript, which has a significantly smaller and more predictable memory footprint than the others, but I must admit that F# again does a great job - even if it is a bit bigger, it is pretty predictable as well.
This scenario is probably the most controversial measurement. Once again - we do not measure raw efficiency here, but rather cost-effectiveness, as execution time directly influences how much we pay in the end.
When it comes to different input values: for smaller numbers, nothing exciting happens, especially since we pay for 100 ms as the smallest billing slice - you can see that we omitted those on the chart. Exciting stuff starts to happen around an input size of 10,000 and above - both Scala and ClojureScript are significantly off, even hitting the hard limit at the end, which is equal to 20 seconds.
For me, the clear winner of this round is again F# - stable and predictable execution time can bring you the most significant savings.
The Ultimate Problem of All Benchmarks
There are two issues with this particular benchmark - its results are contextual.
One thing is that we compare pristine, default settings for each environment. At first, that sounds fair - but it is not.
A thing worth noting is that due to the fire-and-forget nature of those requests, underlying container reuse is minimal. Assuming events are not frequent enough, each time the FaaS platform executes your code it starts a fresh instance of that container.
It means a couple of things - most importantly, you cannot fully leverage the work the VM does for long-term optimizations, such as JIT compilation or sophisticated garbage collection algorithms. Because your code runs only for a short time, those techniques do not provide sufficient benefits.
The second problem is related. Because we use defaults, some of the runtimes pay an initial fee needlessly. The canonical example is the JVM, where good practice is to tune specific behavior with the following set of flags:
-XX:MaxHeapSize="75-80% of configured AWS Lambda memory tier" -XX:+UseSerialGC -XX:+TieredCompilation -Xshare:on
Another perfect example is something that Rich Hickey (the creator of the Clojure programming language) calls Situated Programs.
Such software runs for an extended period, consumes a large amount of memory, and accumulates state over time (e.g., recommendation or rule engines). Choosing a serverless approach for this type of workload would be a great mistake.
In other words - startup time. How painful is it for the end user to hit a cold serverless architecture before the containers are adequately warmed up?
The lightweight implementations (the Node.js-based ones) are the winners of this round. The smaller the runtime you need to carry, the faster you boot - no surprises here.
Another realization worth noting is how container reuse works. There are no guarantees; you need to determine it experimentally.
I strongly recommend reading both in detail, especially if you are interested in investing in serverless, but I will summarize them now.
Most platform providers keep containers idle for around 20-40 minutes. Keep in mind that this is neither guaranteed nor documented anywhere. Moreover, on AWS there is a significant difference depending on the memory tier: the biggest lambdas are recycled a little faster when idle. From the provider's perspective, it looks like a perfectly reasonable optimization. Another significant point is that every time we deploy or recreate a particular function, we are dealing with entirely cold infrastructure. Moreover, if more than one request hits such infrastructure concurrently, all of them pay the initial cost.
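Container reuse is easy to observe empirically from inside the function: module-level state survives a warm invocation but not a cold one. A minimal sketch of that experiment (the handler name and response shape are illustrative, not from the benchmark code):

```python
import time

# Module-level state is initialized once per container, at cold start,
# and survives across warm invocations of the same container.
_container_started_at = time.time()
_invocation_count = 0


def handler(event, context=None):
    """Report whether this invocation reused a warm container."""
    global _invocation_count
    _invocation_count += 1
    return {
        "cold_start": _invocation_count == 1,
        "container_age_seconds": time.time() - _container_started_at,
        "invocation": _invocation_count,
    }
```

Calling such an endpoint at increasing intervals and watching when `invocation` resets to 1 is exactly how the idle-recycling windows mentioned above were determined experimentally.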
We have talked about platform capabilities and performance - now let's discuss the most problematic pain point: enforced constraints. It is always a trade-off: we trade less work done on our side for additional rules applied to our solution and code.
First things first: we do not know what the containers look like. We cannot assume anything about reuse. We do not know the exact hardware or operating system - that is the point of serverless, but it matters for our shim workaround: knowing only that the container runs Linux is like knowing that the animal in front of you is a reptile rather than just an animal - more specific, yet still not enough to tell it is a crocodile. Moreover, on Azure there is no way to distinguish whether we landed on a Linux or Windows container, so our shim needs OS-sniffing logic to be fully compliant. These are drawbacks we need to remember, but in most cases we can deal with them.
However, there is more. Serverless is a somewhat new kid on the block, and a new approach requires new best practices - especially when it comes to debuggability/observability, logging, monitoring, and maintenance. By default, you use the provider's service, like Amazon CloudWatch - which may be problematic because of company policies, cost, or limitations for your use case. Luckily, some of that work is already done (an excellent example is debugging with services like AWS X-Ray). In some cases, though, you may find yourself in uncharted territory, on your own.
Speaking of providers: by choosing this paradigm, you are at the mercy of the provider to add features. A canonical example is support for .NET Core 2.0, added on AWS half a year after the official release. Before that, you were forced to use .NET Core 1.0 - not even 1.1. You can read the complaints in the GitHub issue.
If that's not enough, limits apply everywhere - execution time, filesystem access and space, concurrency, memory and networking, package size (e.g., the 50 MB package limit for AWS Lambda - anyone working with the JVM knows how limiting that is), the number of packages uploaded to the platform, and the number of created functions. Providers advertise serverless platforms as almost infinitely scalable, and in practice they can be - if you comply with the extensive list of their rules. You need to know them in the first place.
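Limits like the package size are easy to trip over at deploy time, so it pays to check before uploading. A small, illustrative pre-deploy check (the 50 MB constant reflects the AWS Lambda zipped-package limit mentioned above; the function name is ours, not part of any SDK):

```python
import os

# AWS Lambda's limit for directly uploaded zipped packages (as of writing).
LAMBDA_ZIPPED_LIMIT_MB = 50


def check_package_size(path, limit_mb=LAMBDA_ZIPPED_LIMIT_MB):
    """Return (size_mb, within_limit) for a deployment package on disk."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    return size_mb, size_mb <= limit_mb
```

Wiring a check like this into the build pipeline catches a bloated package (say, an unoptimized ClojureScript bundle) before the platform rejects it.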
For us functional programmers, the concurrency and memory limits are the most painful, as in most cases we are used to handing control to runtimes running on vertically scalable hardware that we control ourselves. When designing serverless solutions with functional languages, the emphasis lands in a different place than usual.
Another consideration is your initial choice of language and runtime. If you choose a heavier one, be prepared to fiddle with VM configuration options. Otherwise, cold starts or unused features may turn your hair much grayer than it is now.
The same goes for the build chain. I lost an afternoon investigating differences caused by the serverless-offline plugin and the ClojureScript build pipeline - in the end, I dropped the plugin because it introduced too many differences between the code actually deployed to the cloud and the implementation bent to the plugin's requirements for local testing. A similar remark applies to build pipeline options and tweaks - you need to know them, e.g., to keep generated packages lean. The perfect example is dead code elimination performed by compilers - the difference can be significant: in the ClojureScript case, the package was 18 MB without any optimizations versus 2.2 MB with all optimizations applied.
I do not want to discourage you from using serverless architecture. However, the critical thing here is context - you should not apply it blindly, based on hype.
If I had to create a fresh serverless project and wanted to use a functional programming language, the safest choice would be F#.
Of the runtimes and languages available to us, it is the most functional one, and both Azure Functions (in version 2) and AWS Lambda have sufficient support for the .NET Core platform.
That was a crazy but very informative ride. And everything started with a single question - so do not dismiss such questions as they arise, because who knows where open topics can take you.