You may not need GenServers and Supervision Trees

The thought that GenServers and supervision trees are treated as so essential in the Elixir/Erlang world that it deters people from using these languages has been brewing in my mind for quite some time. In fact, I have written about it before on the Elixir Forum. This is a summary and extended version of that post (including some choice replies) to give it more visibility.

It feels like we talk so much about GenServers etc. that people who come to Elixir feel like they need to use them or they’re not “really” using Elixir. I fear that this keeps people out of the Elixir community because it all just seems so complicated. However, you can write great applications without ever writing them yourself. I emphasize yourself because a lot of the libraries you’ll use (Phoenix, Ecto etc.) already have their supervision trees and their processes – you’re truly standing on the shoulders of giants. Still, I hear people say something to the tune of “We’re still using this like Rails – we should use GenServers” – without any need or concrete reasoning. Having something as simple as Rails (in parts, let’s not derail about Rails here) but at the same time fully parallel and more efficient is a huge boon to my mind.

I feel like hardly anyone ever highlights this. I’ve been programming Elixir since ~2015. I’ve never written a production GenServer or supervision tree (and I have multiple applications in production and multiple published libraries). I simply didn’t encounter these problems, or the ecosystem took care of them for me. Still, I felt somewhat inadequate because of it, as everyone seemed to already be talking about these. Which is also understandable: they are unique, they solve hard problems – there’s lots of knowledge to learn and share.

I’ve had the thought that you often don’t really need GenServers and supervision trees in my head for a long time (~2016) but was too afraid to voice it. Maybe I just don’t understand them enough? Maybe I need to learn more about them? Everyone always talks about them, surely enlightenment will come to me some time soon! I finally, nervously, wrote the aforementioned Elixir Forum thread in 2018. My nervousness went down a notch after José Valim, creator of Elixir and somewhat hyper-productive and omnipresent, liked the post within ~15 minutes of its creation. Most people were also really supportive, hence I’m happy to share this more openly. Sorry for the delay, a lot has been going on in my life 😉

Be mindful that I’m not saying you don’t need them – I’m merely saying that you can build significant and great Elixir applications without writing your own GenServers and supervision trees. It’s of course still a topic worth learning.

GenServers? Supervision Trees? Processes?

If you’ve ever watched a talk about Elixir or Erlang, you’re likely to have heard about these. They’re among the “killer features” of Erlang and Elixir. They give you that famed parallelism backed by the actor model, reliability, and the whole “Let it crash! (and restart in a known good state)” philosophy. Both GenServer and Supervisor are behaviours that show a process “how to behave”, and they both ship with Erlang/OTP.
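To make that concrete, here is a minimal sketch of a GenServer – a hypothetical counter, not taken from any real application – with the usual split between client API and server callbacks:

```elixir
defmodule Counter do
  use GenServer

  # Client API – these functions run in the calling process
  def start_link(initial \\ 0) do
    GenServer.start_link(__MODULE__, initial, name: __MODULE__)
  end

  def increment, do: GenServer.cast(__MODULE__, :increment)
  def value, do: GenServer.call(__MODULE__, :value)

  # Server callbacks – these run inside the GenServer process,
  # which holds the counter as its state
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:value, _from, count), do: {:reply, count, count}
end
```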

So what’s the problem with GenServers and Supervision Trees?

GenServers, supervisors etc. are great technologies that help you solve problems. They’re among the things that are most special & unique about Elixir & Erlang. As a result, lots of conference talks, blog posts etc. focus on them, and it seems everyone wants to use them. Through the big focus on them in the community, it sometimes feels like you can’t be a “real” Elixir/Erlang programmer until you’ve used and mastered them.

However, do you need them all the time? At least while using a framework (like Phoenix), chances are you don’t. The hidden detail, of course, is that you are using GenServers and friends without even knowing it – Phoenix runs every request and every channel[1] in its own process. Ecto has its own pool for your database connections. It’s already parallelized and you don’t need to take care of it. That’s the beauty of it. What I’m saying is that in the standard situation the ecosystem takes care of you.
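You can see this in any freshly generated Phoenix application: it already starts a supervision tree for you. A sketch of what the generated application module looks like (module names depend on your app’s name):

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Each child is supervised for you: the Ecto repo with its
    # connection pool, the Phoenix endpoint serving requests, etc.
    children = [
      MyApp.Repo,
      MyAppWeb.Endpoint
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```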

Building a relatively standard CRUD web application with Phoenix? No need.
Just using channels for a chat-like application in Phoenix? You’re good.

Of course, you don’t need them until you really do. I like how Jordan put it:

I agree that, 99% of the time, you do not need to use anything OTP related, but when you are in that 1% it’s really a game changer that makes Elixir as great as it is.

The Bad and the Ugly

It’s not like using GenServers and friends makes everything instantly better. Quite the opposite, it can make your code both harder to understand and slower.

Sometimes the introduction of processes just complicates the code; what could have been a simple interaction is now obfuscated through a bunch of GenServer calls. By trying to use these concepts you can also essentially grind your application to a halt performance-wise. To take an example from the excellent Adopting Elixir, which also covers this topic:

A new developer team started building their Phoenix applications. They had always heard GenServers could be treated like microservices but even tinier. This “wisdom” led them to push all of their database access control to GenServers.
(…)
Performance was abysmal. Under high-enough load, some pages took 3 seconds to render because they built a bottleneck where none existed. They defeated Ecto connection pools because all access happened through a single process. In essence, they made it easy to create global, mutable variables in Elixir. They essentially crippled the single biggest advantage of functional languages, for no gain whatsoever.

When to GenServer?

Adopting Elixir again provides some guidance on what processes are best used for:

  • Model state accessed by multiple processes.
  • Run multiple tasks concurrently.
  • Gracefully handle clean startup and exit concerns.
  • Communicate between servers.

They especially highlight that GenServers aren’t to be used for code organization.

Robert Virding (one of the creators of Erlang) also chimed in, and his response is so measured that I want to quote it in full:

I think the most important thing to understand and to use properly is the concurrency in the problem/solution/system. Using GenServers and other behaviours is just one way of doing the concurrency but it is not the only way. They are tools and like all tools they need to be used in the right way. The problem is to get the right level of concurrency which suits your problem and your solution to that problem. Too much concurrency means you will be doing excess work to no real gain, and too little concurrency means you will be making your system too sequential.

Now, as has been pointed out, many packages like Phoenix already provide a pretty decent level of concurrency which is suitable for many types of applications, at least the ones they were intended for. They will do this automatically so you don’t have to think about it in most cases, but it is still there. Understanding that is necessary so you can work out how much concurrency you need to explicitly add, if any. Unfortunately, because it is all managed for you “invisibly underneath”, many don’t realise that it is there.

While people agree in general, there are also some who say that most systems can benefit from a GenServer and supervision tree. As Saša Jurić points out:

With a lot of hand waving, I’d say that GenServers are OTP’s built-in building block for building responsive services, Tasks are the same for non-responsive ones, and a supervision tree is the built-in service manager, like systemd or upstart. In the past 10+ years of my backend experience, I’ve worked on small to medium systems, and all of them needed all of these technical approaches.

He also has a fairly extensive blog post on the topic of when to reach for these tools and when not to: To spawn, or not to spawn.

In the end, I’m also not an expert on GenServers and supervision trees – as I said, I’ve never written a production one. Still learning and still growing. I think knowing them well gives you a good basis to make informed decisions on when to use them and when not to.

Abstractions

But you came to Elixir for the parallelism! So you need GenServers right? No.

Elixir core and the community have been very good at providing easy-to-use solutions for writing fully parallel programs without having to write your own GenServers and supervision trees. There are gen_stage, flow and broadway, but somewhat more importantly a couple of these are built-ins, like Task (do something in parallel easily) and Agent (share state through a process).
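Agent, for instance, hides the whole process behind a handful of functions. A tiny, hypothetical example of sharing state:

```elixir
# Start an agent whose state is a map
{:ok, agent} = Agent.start_link(fn -> %{} end)

# Update and read that state from any process
Agent.update(agent, &Map.put(&1, :visits, 1))
Agent.get(agent, &Map.get(&1, :visits))
# => 1
```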

Want to geocode the pick-up and drop-off addresses of a shipment in parallel and then wait until both have finished? Say no more:
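(A sketch with Task – Geocoder.geocode/1 and shipment here are hypothetical stand-ins for your own geocoding call and data.)

```elixir
# Kick off both geocoding requests in parallel...
pickup_task = Task.async(fn -> Geocoder.geocode(shipment.pickup_address) end)
drop_off_task = Task.async(fn -> Geocoder.geocode(shipment.drop_off_address) end)

# ...then block until both have finished
pickup_location = Task.await(pickup_task)
drop_off_location = Task.await(drop_off_task)
```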

Learning

Although I’m saying you don’t need them and can write applications just fine without them, it’s a fascinating and interesting topic that can make you a better programmer – even if you never write your own supervision trees. And as I said, education is key to knowing when to reach for them and when not to. So here are a couple of books I can recommend to up your knowledge:

  • The Little Elixir & OTP Guidebook – my personal favorite. In it, the author takes you through writing implementations of poolboy, starting with a simple one, showing its shortcomings and then extending it into a complicated supervision tree – and you get the reasoning along the way, so you see which feature demands which level of complexity. A fascinating read for me.
  • Adopting Elixir – covers many aspects, as mentioned before, but is especially good on what to use these concepts for and what not to use them for (and is an interesting read overall if you’re considering getting your company into Elixir).
  • Elixir in Action – Saša Jurić is one of the most knowledgeable people I can think of on this topic, hence I quoted him before, and a significant part of the book is dedicated to it too. A second edition came out earlier this year, so it’s also fully up to date.
  • Functional Web Development with Elixir, OTP, and Phoenix – build a simple game, first with simple functions, then develop a GenServer interface to it with some supervisors, and then wire it all up in a Phoenix app – a good introduction, and especially good at showing some solid debugging work.
  • Programming Phoenix – an introductory book I wish everyone writing a Phoenix application read first, as it also covers why things are done a certain way, and it’s by the authors of the framework, which gives a unique perspective. It also has the aforementioned information on what is already parallelized etc., and it includes a pretty cool use case for a supervised GenServer (getting suggestions from an external service and ranking them).
  • Designing Elixir Systems with OTP – this is shaping up to be a great resource on GenServers, supervision trees and OTP in general. I haven’t read it yet, but James, Bruce and PragProg have all my trust, plus I’ve read some early praise already.

Final Thoughts

Well, you don’t need GenServers and supervision trees to start writing Elixir applications! Go out there, write an application, play with it, have fun, call yourself an Elixir programmer (because you are!). Still, learn about OTP to expand your mind and to know where to look when you encounter a problem where a supervision tree could help you.

When discussing this on the Elixir Forum, Dimitar also came up with a good thought: maybe OTP is what pulls people in and helps them discover all those other nice things?

I came for OTP. I stayed for the functional programming.

As a community, I think we should make it clearer that you don’t have to use GenServers and that doing so might actually be harmful. Of course, all those conference talks about how to use them, distributed systems etc. are very cool, but every now and then give me a talk about how a business succeeded by writing a fairly standard Phoenix application. Don’t overcomplicate things.

I’m not saying you shouldn’t learn about GenServers. You should. But know when to use them and when not to.

Lastly, if you disagree, I want you to scream at me and teach me the error of my ways 😃

 

[1] Technically Phoenix’s web server, Cowboy, doesn’t use GenServers and supervision trees for normal HTTP request handling but its own thing with similar functionality, so it still holds true that you don’t need to roll your own. Thanks to Hubert for pointing that out.

edit1: Correctly mention new edition of Elixir in Action, be more specific about Cowboy Processes and include a thought in closing paragraph. Thanks go to Saša, Hubert and Dimitar.

Revisiting “Tail Call Optimization in Elixir & Erlang” with benchee 1.0

All the way back in June 2016 I wrote a well-received blog post about tail call optimization in Elixir and Erlang. It was probably the first time I really showed off my benchmarking library benchee – it was just a couple of days after benchee’s 0.2.0 release, after all.

Tools should get better over time: allow you to do things more easily, promote good practices, or enable you to do completely new things. So how has benchee done? Here I want to take a look back and show how we’ve improved things.

What’s better now?

In the old benchmark I had to:

  • manually collect Operating System, CPU as well as Elixir and Erlang version data
  • manually create graphs in Libreoffice from the CSV output
  • be reminded that performance might vary for multiple inputs
  • crudely measure memory consumption in a single run on the command line

The new benchee:

  • collects and shows system information
  • produces extensive HTML reports with all kinds of graphs I couldn’t even produce before
  • has an inputs feature encouraging me to benchmark with multiple different inputs
  • is capable of doing memory measurements, showing me what consumes more or less memory

I think these are all great steps forward, of which I’m really proud.

Show me the new benchmark!

Here you go, careful it’s long (implementation of MyMap for reference):
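In condensed form, it looks roughly like this (a sketch of the three MyMap implementations and the benchee setup – a body-recursive map, a tail-recursive one, and a tail-recursive one with switched argument order, run against a small and a big input):

```elixir
defmodule MyMap do
  # Classic body-recursive map
  def map_body([], _function), do: []
  def map_body([head | tail], function) do
    [function.(head) | map_body(tail, function)]
  end

  # Tail-recursive map, accumulator first
  def map_tco(list, function) do
    Enum.reverse(do_map_tco([], list, function))
  end

  defp do_map_tco(acc, [], _function), do: acc
  defp do_map_tco(acc, [head | tail], function) do
    do_map_tco([function.(head) | acc], tail, function)
  end

  # Same as map_tco, but with the accumulator as the last argument
  def map_tco_arg_order(list, function) do
    Enum.reverse(do_map_tco_arg_order(list, function, []))
  end

  defp do_map_tco_arg_order([], _function, acc), do: acc
  defp do_map_tco_arg_order([head | tail], function, acc) do
    do_map_tco_arg_order(tail, function, [function.(head) | acc])
  end
end

Benchee.run(
  %{
    "body-recursive" => fn input -> MyMap.map_body(input, &(&1 + 1)) end,
    "tail-recursive" => fn input -> MyMap.map_tco(input, &(&1 + 1)) end,
    "tail-rec arg-order" => fn input -> MyMap.map_tco_arg_order(input, &(&1 + 1)) end
  },
  inputs: %{
    "Small (10 Thousand)" => Enum.to_list(1..10_000),
    "Big (5 Million)" => Enum.to_list(1..5_000_000)
  },
  memory_time: 2
)
```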

We can easily see that the tail-recursive functions seem to always consume more memory, and that our tail-recursive implementation with the switched argument order is mostly faster than its sibling (at least when we look at the median, which is worthwhile if we want to limit the impact of outliers).

Such an (informative) wall of text! How do we spice that up a bit? How about the HTML report generated from this? It contains about the same data but is enhanced with some nice graphs for comparison’s sake:

[graph: run time comparison]

[graph: box plot]

It doesn’t stop there though – some of my favourite graphs are the ones looking at individual scenarios:

[graph: histogram of run times]

This histogram shows us the distribution of the values pretty handily. We can easily see that most samples are in the 100 million – 150 million nanoseconds range (100-150 milliseconds in more digestible units – scaling values in the graphs is somewhere on the road map ;))

[graph: raw run times]

Here we can see the raw run times in the order they were recorded. This is helpful for spotting patterns like gradually increasing/decreasing run times or sudden spikes.

Something seems odd?

Speaking of spotting, have you noticed anything in those graphs? Almost all of them show that some big outliers might be around, skewing our results. The basic comparison shows a pretty big standard deviation, the box plot straight up shows outliers (little dots), the histogram shows that for a long time there’s nothing and then there’s a measurement that’s much higher, and in the raw run times we also see one enormous spike.

All of this is even more pronounced when we look at the graphs for the small input (10 000 elements):

[graphs: small input]

Why could this be? Well, my favourite suspect in this case is garbage collection. It can take quite a while and as such is a candidate for huge outliers – the more so the faster the benchmarks are.

So let’s try to take garbage collection out of the equation. This is somewhat controversial and we can’t take it out 100%, but we can significantly limit its impact through benchee’s hooks feature. By adding after_each: fn _ -> :erlang.garbage_collect() end to our configuration, we tell benchee to run garbage collection after every measurement, minimizing the chance that it triggers during a measurement and affects the results.
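In configuration terms that’s just one extra option – sketched here with one of the jobs from above:

```elixir
Benchee.run(
  %{
    "tail-recursive" => fn input -> MyMap.map_tco(input, &(&1 + 1)) end
  },
  inputs: %{"Small (10 Thousand)" => Enum.to_list(1..10_000)},
  # Force garbage collection after every measurement so it is less
  # likely to strike during one and skew the numbers
  after_each: fn _ -> :erlang.garbage_collect() end
)
```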

You can have a look at it in this HTML report. We can immediately see in the results and graphs that the standard deviation got a lot smaller and we have way fewer outliers now for our smaller input sizes:

[graphs: small input, with garbage collection after each measurement]

Note, however, that our sample size also went down significantly (from over 20 000 to… 30), so increasing the benchmarking time might be worthwhile to get more samples again.

How does it look for our big 5 million input, though?

[graph: big input, with garbage collection after each measurement]

Not much of an improvement… actually slightly worse. Strange. We can find the likely answer in the raw run time graphs of all of our contenders:

[graphs: raw run times of all contenders]

The first sample is always the slowest (while running with GC it seemed to be the third run). My theory is that for the larger amount of data, the BEAM needs to repeatedly grow the memory of the process we are benchmarking. This seems strange though, as that should have already happened during warmup (benchee uses one process for each scenario, which includes warmup and run time). It might be something different, but it very likely is a one-time cost.

To GC or not to GC

Is a good question. Especially for very micro benchmarks it can help stabilize/sanitize the measured times. Due to the high standard deviation/outliers, whoever is fastest can change quite a lot between repeated runs.

However, garbage collection happens in real-world scenarios, and the amount of “garbage” you produce can often be directly linked to your run time – taking the cleaning time out of the equation can yield results that are not necessarily applicable to the real world. You could also significantly increase the run time to level the playing field, so that by the law of large numbers we come closer to the true average – garbage collection spikes or not.

Wrapping up

Anyhow, this was just a little detour to show how some of these graphs can help us drill down, find out why our measurements are the way they are, and identify likely causes.

The improvements in benchee mean the promotion of better practices and much less manual work. In essence, I could just link the HTML report and then discuss the topic at hand (well, save for the benchmarking code, that’s not in there… yet 😉), which is great for publishing benchmarks. Speaking of discussions, I omitted the discussions around tail-recursive calls etc., with comments from José Valim and Robert Virding. Feel free to still read the old blog post for that – it’s not that old, after all.

Happy benchmarking!

Released: benchee 0.99, 1.0 & friends

It’s finally here – benchee 1.0! 🎉🎉🎉

The first benchee release was almost 3 years ago – it started a mission to improve benchmarking tooling in the Elixir ecosystem. And this is not the end of that mission – after all, it’s never done, and we’re not short of ideas for what to do next.

What’s in a 1.0?

Also called “Why did you take so long to call it 1.0?” – 1.0 for me means a good level of stability: a level where formatters no longer need updates with every second new benchee version because they would otherwise break. In recent releases we still shuffled major data structures around A LOT (just check all the Breaking Changes (Plugins)). Benchee was mostly stable from a user perspective, but stability makes it less of a risk to go ahead and write your own plugins – something benchee has always encouraged and was built to empower. I don’t have any plans for 2.0 right now – all the features I know of can easily be added to the existing structure.

It also means I’m happy with the features. What benchee offers is great, we have:

  • nanosecond-precise run time measurements
  • memory measurements
  • rich statistics
  • information about the system running the benchmarks, such as CPU, Elixir and Erlang versions
  • support for multiple inputs
  • hooks to support even unconventional scenarios
  • access to it all via your CLI, CSV, JSON or HTML (including nice graphs!)
  • and actually a lot more 😉

Benchee might have started out as “I want benchmark-ips in Elixir”, but it has surpassed it in many ways – so much so that I’d actually like to have benchee in Ruby, but that’s another topic. It makes me proud of what we accomplished.

With that amount of polish I can also easily sit back and not work on benchee for some time, because I know it’s good – it is “done” in the sense that it can do everything I wanted it to do when I started the project (and even more!).

As for what is actually in it: mostly the removal of deprecations. You can check out the Changelog.

What’s 0.99?

I found it nice how RSpec did their 2.99 → 3.0 switch – get your suite running on 2.99 without deprecation warnings and you can safely use 3.0. That was a great user experience. Ember.js handles their major versions similarly. Now, benchee is nowhere near as complex as those two, but we thought providing that nicety would still be great.

Features

As mentioned before, 0.99/1.0 don’t actually include many features – the previous 0.14.0 release from about a month ago was very feature-packed. These releases are a lot about polish: redoing the documentation, updating names, fixing typespecs, being more careful about what is and isn’t exposed in the public interface.

A small but important feature made it in though – displaying the absolute difference between measurements:

Comparison:
flat_map           2.34 K
map.flatten        1.22 K - 1.92x slower +393.09 μs

See that little +393.09 μs? It’s how much slower it was on average in absolute terms. With these comparisons people often focus too much on “OMG it’s almost 2 times as slow!!!”, but this number helps put it into context: it’s not even half a millisecond. If you only do this once per web request, the difference likely doesn’t matter. It’s a calculation I always did in my head; I’m happy to make it easily accessible for everyone.

Along with this patch, those values were added to our Statistics struct – including the “x times slower” values – which means formatters no longer have to implement this themselves. Hooray!

We’re an org now!

An astute observer might have noticed that all my benchee repos have moved to the GitHub organization bencheeorg. What’s that all about? It’s mostly a tribute to benchee being a community project, not a personal one. Many people have contributed massively to benchee, most notably Devon and Eric. Without Devon we probably still wouldn’t have memory measurements, and without Eric our unit scaling wouldn’t be as great as it is. Others such as Michał and OvermindDL1 have also contributed a lot through ideas, testing and help (especially with memory measurements :)). It feels wrong to keep the repositories attached to a single person.

Also, should anything happen to me (which I hope won’t happen), the others could still add people to the organization and carry on.

It also helps with another problem I’ve had: I want to extract small, useful libraries from benchee – statistics (introduced by me), system information gathering (introduced by Devon) and unit scaling (introduced by Eric) – so where do I put those repos? All under their own namespaces? All under my namespace? Nah, I put them in the benchee organization, where we share ownership – that’s where they belong.

The future of benchee

As I said, benchee isn’t done – there is an open PR to add reference jobs, which didn’t make it into the release. We’d like to add more types of memory measurements, as well as measuring reductions; incorporating profiling right after benchmarking to drill down on those bottlenecks sounds great; more compact console output is on the list; and we’d also like to include the benchmarking code itself in the suite so that formatters can display it. Finally, now might finally be the time to brush up on metaprogramming and write that DSL wrapper that people apparently want.

Help with all of these is very welcome. Personally, I’m really itching to extract the libraries I mentioned – let’s see about that. Also, to showcase benchee with some nice benchmarks – after all, what good is a great benchmarking tool if you rarely use it?

Video & Slides: Do You Need That Validation? Let Me Call You Back About It

I had a wonderful time at Ruby On Ice! I gave a talk that I loved preparing, formulating the ideas the right way. You’ll see it focuses a lot on the problems – that’s intentional, because if we’re not clear on the problems, what good is a solution?

You can find the video along with awesome sketch notes on the Ruby on Ice homepage.

Anyhow, here are the slides: speakerdeck slideshare PDF

(In case you wonder why the first slide is a beer: the talk was given on Sunday morning as the first talk after the party – welcoming people back was essential, as I was a bit afraid not many would show up. But they did!)

Abstract

Rails apps start nice and cute. Fast forward a year and business logic and view logic are entangled in our validations and callbacks – getting in our way at every turn. Wasn’t this supposed to be easy?

Let’s explore different approaches to improve the situation and untangle the web.

Benchee 0.14.0 – Micro Benchmarks? Pah, how about Nano Benchmarks!

Long time since the last benchee release, heh? Well, this one really packs a punch to compensate! It brings you higher precision when measuring run times, as well as a better way to specify formatter options. Let’s dive into the most notable changes here; the full list of changes can be found in the Changelog.

Of course, all formatters are also released in compatible versions.

Nanosecond precision measurements

Or in other words making measurements 1000 times more precise 💥

This new version gives you much more precision, which matters especially if you benchmark very fast functions. It even enables you to see when the compiler might completely optimize an operation away. Let’s take a look at this in action:
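(A reduced sketch of the kind of jobs involved – a “wrong” addition the compiler can evaluate at compile time versus a “right” one it can’t:)

```elixir
Benchee.run(
  %{
    # "Wrong": both operands are compile-time constants, so the compiler
    # can fold 1 + 1 away and we measure (nearly) nothing
    "Integer addition" => fn -> 1 + 1 end,
    # "Right": random operands can't be folded at compile time,
    # so the addition is actually performed and measured
    "Integer addition (random)" => fn -> :rand.uniform(100) + :rand.uniform(100) end
  },
  memory_time: 2
)
```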

You can see that the averages aren’t 0 ns, because sometimes the measured run time is very high – garbage collection and such. That’s also why the standard deviation is huge (a big difference between 0 and 23000 or so). However, if you look at the median (if you sort all measured values, it’s the one in the middle) and the mode (the most common value), you see that both of them are 0. Even the accompanying memory measurements are 0. Seems like there isn’t much happening there.

So why is that? The compiler optimizes these “benchmarks” away, because they evaluate to one static value that can be determined at compile time. If you write 1 + 1, the compiler knows you probably mean 2. Smart compilers. To avoid this, we have to trick the compiler by randomizing the values so that they’re not known at compile time (see the “right” integer addition).

That’s one thing we see thanks to our more accurate measurements; the other is that we can now measure how long a map over a range with 10 elements takes (around 355 ns for me – I trust the mode and median more here than the average).

How did we accomplish this? Well, it all started with looking into why measurements on Windows seemed to be weird. We noticed that the implementation of :timer.tc/1 had hard-coded the values to be measured in microseconds:

But in fact, nanoseconds are supported! So we now have our own simple time-measuring code. This is operating-system dependent though, as the BEAM works with native time units. To the best of our knowledge, nanosecond precision is available on Linux and macOS – not on Windows.
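The core of such a measurement – a simplified sketch, not benchee’s actual code – converts the BEAM’s native time unit into nanoseconds:

```elixir
# Simplified sketch: take two readings in the BEAM's native time unit
# and convert the difference to nanoseconds
measure = fn function ->
  start = :erlang.monotonic_time()
  function.()
  finish = :erlang.monotonic_time()
  :erlang.convert_time_unit(finish - start, :native, :nanosecond)
end

measure.(fn -> Enum.map(1..10, &(&1 + 1)) end)
# => elapsed time in nanoseconds
```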

It wasn’t enough to just switch to nanosecond precision, though. See, once you get down to nanoseconds the overhead of simply invoking an anonymous function (which benchee needs to do a lot) becomes noticeable. On my system this overhead is 78 nanoseconds. To compensate, benchee now measures the function call overhead and deducts it from the measured times. That’s how we can achieve measurements of 0 ns above – all the code does is return a constant, as the compiler optimized the computation away since the value can be determined at compile time.

A nice side effect is that the overhead-heavy function repetition is practically not used anymore on Linux and macOS, as no function is faster than a nanosecond. Hence, no more imprecise measurements due to repeating the function call just to make it measurable at all (on Windows we still repeat the function call, for instance 100 times, and then divide the measured time by that).

Formatter Configuration

This is best shown with an example. Up until now, if you wanted to pass options to any of the formatters, you had to do it like this:
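(A sketch of the old style, assuming the benchee_html formatter – the formatter module and its options live in two separate places:)

```elixir
Benchee.run(
  %{"flat_map" => fn -> Enum.flat_map(1..100, &[&1, &1]) end},
  formatters: [Benchee.Formatters.HTML],
  formatter_options: [html: [file: "output/report.html"]]
)
```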

This always felt awkward to me, but it really hit hard when I watched a benchee video tutorial. There the presenter said “…here we configure the formatter to be used and then down here we configure where it should be saved to…” – why would that be in two different places? They could be far apart in the code. There is no immediately visible connection between Benchee.Formatters.HTML and the html: down in formatter_options:. It makes no sense.

That API was never really well thought out, sadly.
So, what can we do instead? Well of course, bring the options closer together:
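(The same sketch in the new style – the formatter and its options travel together as a tuple:)

```elixir
Benchee.run(
  %{"flat_map" => fn -> Enum.flat_map(1..100, &[&1, &1]) end},
  formatters: [
    {Benchee.Formatters.HTML, file: "output/report.html"}
  ]
)
```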

So, if you want to pass options along instead of just specifying the module, you specify a tuple of module and options. Easy as pie. You know exactly which formatter the options belong to.

Road to 1.0?

Honestly, 1.0 should have happened many versions ago. Right now the plan is for this to be the last release with user-facing features. We’ll mingle the data structures a bit more (see the PR if interested), then put in deprecation warnings for functionality we’ll remove and call it 0.99. Then we remove the deprecated functionality and call it 1.0. So, this time indeed – it should be soon™. I have a track record of sneaking in just one more thing before 1.0, though 😅. You can track our 1.0 progress here.

Why did this take so long?

Looking at this release, it’s pretty packed. It should have been two releases (one for each major feature described above) that happened much sooner.

It’s definitely sad; I double-checked: measuring with the best available precision landed on the 21st of May, function call overhead measurement was basically done on the 27th of June, and the formatter options landed on the 10th of August. Keeping those out of your hands for so long really saddens me 😖.

Basically, these required updating the formatters, which isn’t particularly fun but necessary, as I want all formatters to be ready to release alongside a new benchee version. In addition, we put in even more work (Devon in big parts, specifically) and added support for memory measurements to all the formatters.

Beyond this? Well, I think life. Life happened. I moved apartments, which is a bunch of work. Then a lot of things happened at work, eventually leading to me quitting my job. Sometimes there’s just no time or head space for open source. I’m happy, though, to be as confident as one can be that benchee is robust and bug-free software, so I don’t have to worry about it breaking all the time. I can already see this statement haunting me if this release features numerous weird bugs 😉

In that vein, I hope you enjoy the new benchee version – happy to hear feedback, bugs or feature ideas!

And because you made it this far, you deserve an adorable bunny picture:

[photo: an adorable bunny]

Slides: Elixir, Your Monolith and You (Elixir Berlin Version)

I was supposed to give this talk at ElixirConf.EU but sadly fell ill. These are the slides (still titled alpha-1) that I used to give it at Elixir Berlin, where it was met with a great reception – which is also why I was so looking forward to giving it again and having it recorded… Anyhow, if you saw the talk and want to go through the slides again, or you were looking forward to the slides – here they are.

Slides can be viewed here or on speakerdeck, slideshare or PDF

Abstract

Elixir is great, so clearly we’ll all rewrite our applications in Elixir. Mostly, you can’t and shouldn’t do that. This presentation will show you another path. You’ll see how at Liefery, we started with small steps instead of rewriting everything. This allowed us to reap the benefits earlier and get comfortable before getting deeper into it. We’ll examine in detail the tactics we used to create two Elixir apps for new requirements, and how we integrated them with our existing Rails code base.

Join us on our tale of adopting Elixir and Phoenix and see what we learned, what we loved, and what bumps we hit along the road.

edit: slightly updated version from devday.io – PDF slideshare

benchee is now called bunny!

edit: This was an April Fools’ joke. However, bunny will remain functional. It’s only implemented as a thin wrapper around benchee, so unless we completely break the API (which I don’t see coming) it’ll remain functional. Continue reading for cute bunny pictures.

It is time for benchee to take the next step in its evolution as one of the prime benchmarking libraries. Going forward benchee will be called bunny!

[GIF: bunny]
Al likes the naming change!

We waited for this very special day to announce this very special naming change – what better day to announce something is being named bunny than Easter Sunday?

It is available on hex.pm now!

But why?

We think this is an abstraction that’s really going to offer us all the flexibility that we’re going to need for future development. As we approach 1.0, we wanted to get the API just right.

This is true courage.

We also haven’t exactly been subtle about dropping hints that this naming change was coming. For one, I have described benchmarking as bunnies eating food on numerous occasions (each bunny is a function that tries to eat its input as fast as it can!). Other than that, the frequently occurring bunny pictures (or even gifs) in benchee pull requests could have been a hint.

Also, eating is what they do best:

[GIF: bunny eating]
Yum yum, we like benchmarking

For now bunny still works a lot like benchee. However, it exposes a better and more expressive API for your pleasure. You know, bunny can’t only run like the good old benchee. No! Bunny can also sleep, hop, eat and jump!

This all comes with your own personal bunny assistant that helps you benchmark:

After all this hard work, the bunny needs to sleep a bit though:

[GIF: sleeping bunny]

This is clearly better than any other (benchmarking) library out there. What are you waiting for? Go and get bunny now. Also, I mean… just LOOK AT THEM!

[photos: bunnies]

Video & Slides: Stop Guessing and Start Measuring – Benchmarking in Practice (Lambda days)

I managed to get into Lambda Days this year and got a chance to present my benchmarking talk. You can watch the video here and check out the slides.

Sadly the bunny video isn’t working in the recording 😥

You can see the slides here or at speakerdeck, slideshare or PDF.

Abstract

“What’s the fastest way of doing this?” – you might ask yourself during development. Sure, you can guess – but how do you know? How long would that function take with a million elements? Is that tail-recursive function always faster?

Benchmarking is here to give you the answers, but there are many pitfalls in setting up a good benchmark and analyzing the results. This talk will guide you through, introduce best practices, and surprise you with some results along the way. You didn’t think that the order of arguments could influence its performance…or did you?

The curious case of the query that gets slower the fewer elements it affects

I wrote a nice blog post for the company I’m working at (Liefery) called “The curious case of the query that gets slower the fewer elements it affects”, which goes through a real-world benchmarking session with benchee. It covers a couple of things that can go wrong and how combined indexes and PostgreSQL’s EXPLAIN ANALYZE can help you overcome those problems. It’s honestly one of the best blog posts I think I’ve ever written, so head over and read it if that sounds interesting to you 🙂