Benchee 0.3.0 released – formatters, parallel benchmarking & more

Yesterday I released Benchee 0.3.0! Benchee is a tool for (micro) benchmarking in Elixir that focuses on being simple and extensible while providing you with good statistics. You can refer to the Changelog for detailed information about the changes. This post will look at the bigger changes and also give a bit of the why behind the new features and changes.

Multiple formatters

Arguably the biggest feature in Benchee 0.3.0 is built-in support for configuring multiple formatters for a benchmarking suite. This means that the benchmark is run once, and then multiple formatters are run on the benchmarking results. This way you can get both the console output and the corresponding CSV file using BencheeCSV. This was a pain point for me before, as you could either get one or the other or you had to use the more verbose API.

You can also see the new output/1 functions at work; as opposed to format/1 they also take care of the output themselves. BencheeCSV uses a custom configuration option to know which file to write to. This is also new, as formatters now have access to the full benchmarking suite, including configuration, raw run times and function definitions. This way they can be configured using options they define themselves, or a plugin could graph all run times if it wanted to.
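To make this more concrete, here is a rough sketch of what such a configuration could look like. The exact option names (the formatters list of output/1 functions and the csv: %{file: ...} option) as well as the argument order are my reconstruction from the description above, so treat them as assumptions rather than the definitive 0.3.0 API:

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Benchee.run(
  %{
    time:       3,
    formatters: [
      &Benchee.Formatters.Console.output/1,
      &Benchee.Formatters.CSV.output/1
    ],
    csv: %{file: "my_benchmark.csv"}
  },
  %{
    "flat_map"    => fn -> Enum.flat_map(list, map_fun) end,
    "map.flatten" => fn -> list |> Enum.map(map_fun) |> List.flatten end
  })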

Of course, formatters default to just the built-in console formatter.

Parallel benchmarking

Another big addition is parallel benchmarking. In Elixir, this just feels natural to have. You can specify a parallel key in the configuration and that tells Benchee how many tasks should execute any given benchmarking job in parallel.
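As an illustration, a minimal sketch (again assuming the map-based 0.3.0 interface used in the formatter example above):

list = Enum.to_list(1..10_000)

# each job is executed by 2 parallel processes for the configured 3 seconds
Benchee.run(
  %{time: 3, parallel: 2},
  %{"flat_map" => fn -> Enum.flat_map(list, fn(i) -> [i, i * i] end) end})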

Of course, if you want to see how a system behaves under load, overloading might be exactly what you want in order to stress test the system. And this was exactly the reason why Leon contributed this change back to Benchee:

I needed to benchmark integration tests for a telephony system we wrote – with this system the tests actually interfere with each other (they’re using an Ecto repo) and I wanted to see how far I could push the system as a whole. Making this small change to Benchee worked perfectly for what I needed🙂

(Of course it makes me extremely happy that people found adjusting Benchee for their use case simple, that’s one of the main goals of Benchee. Even better that it was contributed back❤ )

If you want to see more information and detail about “to benchmark in parallel or not” you can check the Benchee wiki. Spoiler alert: the more benchmarks run in parallel, the slower they each get – to an acceptable degree until the system is overloaded (more tasks executing in parallel than there are CPU cores to take care of them). The standard deviation also skyrockets.

While the effect does not seem to be very significant for parallel: 2 on my system, the default in Benchee remains parallel: 1 for the reasons mentioned above.

Print configuration information

Partly also due to the parallel change, Benchee will now print a brief summary of the benchmarking suite before executing it.


tobi@happy ~/github/benchee $ mix run samples/run_parallel.exs

Benchmark suite executing with the following configuration:
warmup: 2.0s
time: 3.0s
parallel: 2
Estimated total run time: 10.0s

Benchmarking flat_map...
Benchmarking map.flatten...

Name                  ips        average    deviation         median
map.flatten       1268.15       788.55μs    (±13.94%)       759.00μs
flat_map           706.35      1415.72μs     (±8.56%)      1419.00μs

Comparison:
map.flatten       1268.15
flat_map           706.35 - 1.80x slower

This was done so that when people share their benchmarks online one can easily see the configuration they were run with. E.g. was there any warmup time? Was the number of parallel tasks too high, and is that why the results are so bad?

It also prints an estimated total run time (number of jobs * (warmup + time)), so you know if there’s enough time to go and get a coffee before a benchmark finishes.

Map instead of a list of tuples

What is also marked as a “breaking” change in the Changelog is actually not THAT breaking. The main data structure handed to Benchee.run was changed to a map instead of a list of tuples and all corresponding data structures changed as well (important for plugins to know).

It used to be a list of tuples because of the possibility that benchmarks with the same name would override each other. However, having benchmarks with the same name is nonsensical, as you can’t discern their results in the output anyway. So this now feels like a much more fitting data structure.
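Side by side the difference looks roughly like this (a sketch, using the config-first argument order shown in the older posts further down):

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

# old: jobs as a list of tuples (still supported for now)
Benchee.run(%{time: 3},
            [{"flat_map",    fn -> Enum.flat_map(list, map_fun) end},
             {"map.flatten", fn -> list |> Enum.map(map_fun) |> List.flatten end}])

# new: jobs as a map – duplicate names simply can't exist any more
Benchee.run(%{time: 3},
            %{"flat_map"    => fn -> Enum.flat_map(list, map_fun) end,
              "map.flatten" => fn -> list |> Enum.map(map_fun) |> List.flatten end})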

The old main data structure of a list of tuples still works, and while I might remove it I don’t expect to right now, as all that is required to maintain it is 4 lines of code. This makes duplicated names no longer working the only real deprecation, although one might even call it a feature😉

Last, but not least, this release is the first one that got some community contributions in, which makes me extremely happy. So, thanks Alvin and Leon!😀

Tail Call Optimization in Elixir & Erlang – not as efficient and important as you probably think

(Automatic) Tail Call Optimization (TCO) is that great feature of Elixir and Erlang that everyone tells you about. It’s super fast, super cool and you should definitely always aim to make every recursive function tail-recursive. What if I told you that body-recursive functions can be faster and more memory efficient than their especially optimized tail-recursive counterparts?

Seems unlikely, doesn’t it? After all every beginner’s book will mention TCO, tell you how efficient it is and that you should definitely use it. Plus, maybe you’ve tried body recursion before in language X and your call stack blew up or it was horrendously slow. I did, and I thought tail-recursive functions were always better than body-recursive ones. Until one day, by accident, I wrote a non-tail-recursive function (so TCO didn’t apply to it). Someone told me and eagerly I replaced it with its tail-recursive counterpart. Then I stopped for a second and benchmarked it – the results were surprising to say the least.

Before we do a deep dive into the topic, let’s take a refresher on tail call optimization from Dave Thomas in Programming Elixir 1.2 (great book!):

(…) it ends up calling itself. In many languages, that adds a new frame to the stack. After a large number of messages, you might run out of memory.

This doesn’t happen in Elixir, as it implements tail-call optimization. If the last thing a function does is call itself, there’s no need to make the call. Instead, the runtime simply jumps back to the start of the function. If the recursive call has arguments, then these replace the original parameters.


Well, let’s get into this🙂

Writing a map implementation

So let’s write an implementation of the map function. One will be body-recursive, one will be tail-recursive. I’ll add another tail-recursive implementation using ++ but no reverse and one that just does not reverse the list in the end. The one that doesn’t reverse the list of course isn’t functionally equivalent to the others as the elements are not in order, but if you wrote your own function and don’t care about ordering this might be for you. In an update I also added a version where the argument order is different; for more on this see the results and edit6.

map_body here is the function I originally wrote. It is not tail-recursive, as the last operation in this function is the list append operation, not the call to map_body. Comparing it to all the other implementations, I’d also argue it’s the easiest and most readable, as we don’t have to care about accumulators or reversing the list.
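As the originally embedded code isn’t included here, here is a sketch of what these implementations look like (my reconstruction – module name, function names and argument order are illustrative only; as edit6 below shows, even the argument order can make a measurable difference):

defmodule MyMap do
  # body-recursive: the list construction happens *after* the recursive call
  # returns, so this is not a tail call
  def map_body([], _func), do: []
  def map_body([head | tail], func) do
    [func.(head) | map_body(tail, func)]
  end

  # tail-recursive with an accumulator, reversed at the end
  def map_tco(list, func) do
    map_tco(list, func, [])
  end

  defp map_tco([], _func, acc), do: Enum.reverse(acc)
  defp map_tco([head | tail], func, acc) do
    map_tco(tail, func, [func.(head) | acc])
  end

  # tail-recursive, appending with ++ instead of reversing –
  # ++ traverses the whole accumulator every time (O(n) per element)
  def map_tco_concat(list, func) do
    map_tco_concat(list, func, [])
  end

  defp map_tco_concat([], _func, acc), do: acc
  defp map_tco_concat([head | tail], func, acc) do
    map_tco_concat(tail, func, acc ++ [func.(head)])
  end

  # tail-recursive without the final reverse – not functionally equivalent,
  # as the resulting list is in reverse order
  def map_tco_no_reverse(list, func) do
    map_tco_no_reverse(list, func, [])
  end

  defp map_tco_no_reverse([], _func, acc), do: acc
  defp map_tco_no_reverse([head | tail], func, acc) do
    map_tco_no_reverse(tail, func, [func.(head) | acc])
  end
end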

Now that we have the code, let us benchmark the functions with benchee! Benchmark run on Elixir 1.3 with Erlang 19. Let’s just map over a large list and add one to each element of the list. We’ll also throw in the standard library implementation of map as a comparison baseline:
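A simpler version of such a benchmarking script might look roughly like the following. This is a sketch: it uses the list-of-tuples Benchee.run interface shown in the older posts further down, the MyMap module from the sketch above, and guesses at the list size; the job names are the ones visible in the console output excerpt reused in the Benchee 0.2.0 post below:

list    = Enum.to_list(1..10_000)
map_fun = fn(i) -> i + 1 end

Benchee.run(%{time: 10, warmup: 10},
            [{"stdlib map",               fn -> Enum.map(list, map_fun) end},
             {"bodyrecusrive map",        fn -> MyMap.map_body(list, map_fun) end},
             {"map tco no reverse",       fn -> MyMap.map_tco_no_reverse(list, map_fun) end},
             {"map with TCO and reverse", fn -> MyMap.map_tco(list, map_fun) end},
             {"map with TCO and ++",      fn -> MyMap.map_tco_concat(list, map_fun) end}])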

(The benchmarking was actually done by this slightly more verbose but equivalent script, so that it generates the CSV output and the console output in the same run using benchee’s more verbose interface. A feature to make that possible with the nicer interface is planned😉 )

For the more visual here is also a graph showcasing the results (visualized using benchee_csv):

Graphing iterations per second (higher is better)

So what do we see? The body-recursive function seems to be as fast as the version from the standard library. The reported values are faster, but well within the margin of error. Plus, the median of the two is the same while the standard deviation is higher for the standard library version. This hints at the possibility that the worse average may be caused by some outliers (resulting e.g. from garbage collection). The tail-recursive version with ++ is VERY SLOW, but that’s because appending with ++ so frequently is a bad idea, as it needs to traverse to the end of the linked list every time around (O(n)). But that’s not the main point.

The main point is that the tail-recursive version is about 14% slower! Even the tail-recursive version that doesn’t reverse the list is slower than the body-recursive implementation!

What is highly irritating and surprising to me is that the tail-recursive function with a slightly different argument order is significantly faster than my original implementation, almost 10%. And this is not a one off – it is consistently faster across a number of runs. You can see more about that implementation in edit6 below. Thankfully José Valim chimed in about the argument order adding the following:

The order of arguments will likely matter when we generate the branching code. The order of arguments will specially matter if performing binary matching. The order of function clauses matter too although I am not sure if it is measurable (if the empty clause comes first or last).

Now, maybe there is a better tail-recursive version (please tell me!) but this result is rather staggering but repeatable and consistent. So, what happened here?

An apparently common misconception

That tail-recursive functions are always faster seems to be a common misconception – common enough that it made the list of Erlang Performance Myths as “Myth: Tail-Recursive Functions are Much Faster Than Recursive Functions”! (Note: this section is currently being reworked so the name might change/link might not lead to it directly any more in the near-ish future)

To quote that:

A body-recursive list function and a tail-recursive function that calls lists:reverse/1 at the end will use the same amount of memory. lists:map/2, lists:filter/2, list comprehensions, and many other recursive functions now use the same amount of space as their tail-recursive equivalents.

So, which is faster? It depends. On Solaris/Sparc, the body-recursive function seems to be slightly faster, even for lists with a lot of elements. On the x86 architecture, tail-recursion was up to about 30% faster

The topic also recently came up on the erlang-questions mailing list again while talking about the rework of the aforementioned Erlang performance myths site (which is really worth the read!). In it Fred Hebert remarks (emphasis added by me):

In cases where all your function does is build a new list (or any other accumulator whose size is equivalent to the number of iterations and hence the stack) such as map/2 over nearly any data structure or say zip/2 over lists, body recursion may not only be simpler, but also faster and save memory over time.

He also has his own blog post on the topic.

But won’t it explode?

I had the same question. From my experience with the Clojure koans I expected the body-recursive function to blow up the call stack given a large enough input. But I didn’t manage to – no matter what I tried.

It seems to be impossible, as the BEAM VM that Erlang and Elixir run on differs in its implementation from other VMs; body recursion is only limited by the available RAM:

Erlang has no recursion limit. It is tail call optimised. If the recursive call is not a tail call it is limited by available RAM

Memory consumption

So what about memory consumption? Let’s create a list with one hundred million elements (100_000_000) and map over it measuring the memory consumption. When this is done the tail-recursive version takes almost 13 Gigabytes of memory while the body-recursive version takes a bit more than 11.5 Gigabytes. Details can be found in this gist.

Why is that? Most likely it is because, with such a large list, the tail-recursive version needs to create a new reversed version of the accumulator in order to return a correct result.

Body-recursive functions all the time now?

So let’s recap, the body-recursive version of map is:

  • faster
  • consumes less memory
  • easier to read and maintain

So why shouldn’t we do this every time? Well, there are other examples of course. Let’s take a look at a very dumb function deciding whether a number is even (implemented as a homage to this Clojure koans exercise that showed how the call stack blows up in Clojure without recur):
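The original embedded code isn’t shown here; a minimal reconstruction (function names guessed from edit1 below) could look like this:

defmodule MyEven do
  # body-recursive: the `not` is applied after the recursive call returns
  def is_even?(0), do: true
  def is_even?(n), do: not is_even?(n - 1)

  # tail-recursive: the running result travels along as an accumulator
  def is_even_tco?(n), do: is_even_tco?(n, true)

  defp is_even_tco?(0, acc), do: acc
  defp is_even_tco?(n, acc), do: is_even_tco?(n - 1, not acc)
end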

The tail-recursive version here is still 10% slower. But what about memory? Running the function with one hundred million as input takes 41 megabytes for the tail-recursive version (mind you, this is the whole Elixir process) but almost 6.7 gigabytes for the body-recursive version. Also, for that huge input the tail-recursive version took 1.3 seconds, while the body-recursive function took 3.86 seconds. So for larger inputs the tail-recursive version is also faster.

Stark contrast, isn’t it? That’s most likely because this time around there is no huge list to be carried around or accumulated – just a boolean and a number. Here the fact that the body-recursive function needs to keep its call stack in RAM is much more damaging, as it needs to call itself one hundred million times.

So, what now?

Tail-recursive functions still should be faster and more efficient for many or most use cases. Or that’s what I believe through years of being taught that tail call optimization leads to the fastest recursive functions😉. This post isn’t to say that TCO is bad or slow. It is here to say and highlight that there are cases where body-recursive functions are faster and more efficient than tail-recursive functions. I’m also still unsure why the tail-recursive function that does not reverse the list is still slower than the body-recursive version – it might be because it has to carry the accumulator around.

Maybe we should also take a step back in education and teaching and be more careful not to overemphasize tail call optimization and with it tail-recursive functions. Body-recursive functions can be a viable, or even superior, alternative and they should be presented as such.

There are, of course, cases where writing tail-recursive functions is absolutely vital, as Robert Virding, creator of Erlang, rightfully highlights:

No, the main case where is TCO is critical is in process top-loops. These functions never return (unless the process dies) so they will build up stack never to release it. Here you have to get it right. There are no alternatives. The same applies if you top-loop is actually composed of a set of mutually calling functions. There there are no alternatives. Sorry for pushing this again, and again, but it is critical. :slight_smile:

But what does this teach us in the end? Don’t take your assumptions stemming from other programming environments for granted. Also, don’t assume – always prove. So let’s finish with the closing words of the Erlang performance myths section on this:

So, the choice is now mostly a matter of taste. If you really do need the utmost speed, you must measure. You can no longer be sure that the tail-recursive list function always is the fastest.

edit1: there previously was a bug in is_even_tco? with a missing not, not caught by my tests as they were calling the wrong function😦 Thanks to Puella for notifying me.

edit2/addendum: (see next edit/update) It was pointed out at lobste.rs that when running it in an Erlang session the body-recursive function was significantly slower than the TCO version. Running what I believe to be equivalent code in Elixir and Erlang, it seems that the Erlang map_body version is significantly slower than in Elixir (2 to 3 times by the looks of it). I’d need to run it with an Erlang benchmarking tool to confirm this, though.

edit3/addendum2: The small tries mentioned in the first addendum were run in the shell, which is not a great idea. Using my little Erlang knowledge I made something that compiled that “benchmark”, and map_body is as fast/faster again (see the thread). Benchmarking can be fickle and wrong if not done right, so I would still like to run this in a proper Erlang benchmarking tool or use Benchee from Erlang. But no time right now😦

edit4: Added comment from Robert Virding regarding process top loops and how critical TCO is there. Thanks for reading, I’m honoured and surprised that one of the creators of Erlang read this🙂 His full post is of course worth a read.

edit5: Following the rightful nitpick I don’t write “tail call optimized” functions any more but rather “tail-recursive”, as tail call optimization is more of a feature of the compiler and not directly an attribute of the function.

edit6: Included another version in the benchmark that swaps the argument order so that the list stays the first argument and the accumulator is the last argument. Surprisingly (yet again) this version is constantly faster than the other tail-recursive implementation but still slower than body recursive. I want to thank Paweł for pointing his version out in the comments. The reversed argument order was the only distinguishing factor I could make out in his version, not the assignment of the new accumulator. I benchmarked all the different variants multiple times. It is consistently faster, although I could never reproduce it being the fastest. For the memory consumption example it seemed to consume about 300MB less than the original tail-recursive function and was a bit faster. Also since I reran it either way I ran it with the freshly released Elixir 1.3 and Erlang 19. I also increased the runtime of the benchmark as well as the warmup (to 10 and 10) to get more consistent results overall. And I wrote a new benchmarking script so that the results shown in the graph are from the same as the console output.

edit7: Added a little TCO intro as it might be helpful for some🙂

edit8: In case you are interested in what the bytecode looks like – I haven’t done it myself (not my area of expertise – yet) – there is a gist from Sasa Juric showing how you can get the bytecode from an Elixir function.

edit9: Added José’s comment about the argument order, thanks for reading and commenting!🙂

Benchee 0.2.0 – warmup & nicer console output

Less than a week after the initial release of my benchmarking library Benchee there is a new version – 0.2.0! The details are in the Changelog. That’s the what, but what about the why?

Warmup

Arguably the biggest change is the introduction of a warmup phase to the benchmarks. That is, the benchmark jobs are first run for some time without taking measurements to simulate a “warm”, already running system. I didn’t think it’d be that important, as the BEAM VM isn’t JITed (as opposed to the JVM) for all that I know. It is important once benchmarks get to be “macro” – for instance databases usually respond faster once they have gotten used to some queries, and our web servers spend most of their time serving “hot”.

However, even in my micro benchmarks I noticed that it could have an effect when a benchmark was moved around (being run first versus being run last). So I don’t know how big the effect is, but at least to some small degree there is warmup now. If you don’t want warmup – just set warmup: 0.
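As a tiny example, using the interface shown in the introductory post further down (the exact option values are just for illustration):

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

# 5 seconds of warmup before 3 seconds of measurements per job;
# warmup: 0 skips the warmup phase entirely
Benchee.run(%{warmup: 5, time: 3},
            [{"flat_map", fn -> Enum.flat_map(list, map_fun) end}])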

Nicer console output

Name                                    ips        average    deviation         median
bodyrecusrive map                  40047.87        24.97μs    (±32.55%)        25.00μs
stdlib map                         39724.07        25.17μs    (±61.41%)        25.00μs
map tco no reverse                 36388.50        27.48μs    (±23.22%)        27.00μs
map with TCO and reverse           33309.43        30.02μs    (±45.39%)        29.00μs
map with TCO and ++                  465.25      2149.40μs     (±4.84%)      2138.00μs

Comparison: 
bodyrecusrive map                  40047.87
stdlib map                         39724.07 - 1.01x slower
map tco no reverse                 36388.50 - 1.10x slower
map with TCO and reverse           33309.43 - 1.20x slower
map with TCO and ++                  465.25 - 86.08x slower

The output of the numbers is now right-aligned, which makes them easier to read and compare, as you can see differences in orders of magnitude much more easily. Also, the ugly empty line at the end of the output has been removed🙂

Benchee.measure

This is the API-incompatible change. It felt weird to me in version 0.1.0 that Benchee.benchmark would already run the function given to it. Now the jobs are defined through Benchee.benchmark and kept in a data structure (similar to the one Benchee.run uses). Benchee.measure then runs the jobs, measures the outcome and provides the results under the new run_times key instead of overriding the jobs key. This feels much nicer overall, and of course the high level Benchee.run is unaffected by this.
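Roughly, the verbose pipeline from the introductory post below now gains one step (a sketch – as far as I can tell the other steps stay the same):

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Benchee.init(%{time: 3})
|> Benchee.benchmark("flat_map", fn -> Enum.flat_map(list, map_fun) end)
|> Benchee.benchmark("map.flatten",
                     fn -> list |> Enum.map(map_fun) |> List.flatten end)
|> Benchee.measure   # new in 0.2.0: runs the jobs and fills the run_times key
|> Benchee.statistics
|> Benchee.Formatters.Console.format
|> IO.puts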

These additions already nicely improve what Benchee can do and got a couple of items off my “I want to do this in benchee” bucket list. There’s still more to come🙂

Introducing Benchee: simple and extensible benchmarking for Elixir

If you look around this blog it becomes pretty clear that I really love (micro) benchmarking. Naturally, while working more and more with Elixir (and loving it!) I wanted to benchmark something. Sadly, the existing options I found didn’t quite satisfy me. Be it for a different focus, missing statistics, lacking documentation or other things. So I decided to roll my own, it’s not like it’d be the first time.

Of course I tried extending existing solutions, but very long functions, very scarce test coverage, lots of dead and commented-out code and a rotting PR later, I decided it was time to create something new. So without further ado, please meet Benchee (of course available on hex)!

What’s great about Benchee?

Benchee is easy to use, well documented and can be extended (more on that in the following paragraphs). Benchee will run each benchmarking function you give it for a given amount of time and then compute statistics from it. Statistics is where it shines in my opinion. Benchee provides you with:

  • average run time (ok – yawn)
  • iterations per second, which is great for graphs etc. as higher is better here (as opposed to average run time)
  • standard deviation, an important value in my opinion as it gives you a feeling for how certain you can be about your measurements and how much they vary. Sadly, none of the elixir benchmarking tools I looked at supplied this value.
  • median, it’s basically the middle value of your distribution and is often cited as a value that reflects the “common” outcome better than average as it cuts out outliers. I never used a (micro) benchmarking tool that provided this value, but was often asked to provide it in my benchmarks. So here it is!

Also it gives a rather nice output on the console with headers so you know what is what. An example is further down but for now let’s talk design…

Designing a Benchmarking library

The design is influenced by my favourite Ruby benchmarking library: benchmark-ips. Of course I wanted to give it more of an Elixir spin and offer more options.

A lot of elixir solutions used macros. I wanted something that works purely with functions, no tricks. When I started to learn more about functional programming one of the things that stuck with me the most was that functional programming is about a series of transformations. So what do these transformations look like for benchmarking?

  1. Create a basic benchmarking configuration with things like how long should the benchmark run, should GC be enabled etc.
  2. Run individual benchmarks and record their raw execution times
  3. Compute statistics based on these raw run times per benchmark
  4. Format the statistics to be suitable for output
  5. Put out the formatted statistics to the console, a file or whatever

So what do you know, that’s exactly what the API of Benchee looks like!


list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Benchee.init(%{time: 3})
|> Benchee.benchmark("flat_map", fn -> Enum.flat_map(list, map_fun) end)
|> Benchee.benchmark("map.flatten",
                     fn -> list |> Enum.map(map_fun) |> List.flatten end)
|> Benchee.statistics
|> Benchee.Formatters.Console.format
|> IO.puts

What’s great about this? Well it’s super flexible and flows nicely with the beloved elixir pipe operator.

Why is this flexible and extensible? Well, don’t like how Benchee runs the benchmarks? Sub in your own benchmarking function! Want more/different statistics? Go compute your own! Want results to be displayed in a different format? Roll your own formatter! Or you just want to write the results to a file? Well, go ahead!

This is more than just cosmetics. It’d be easy to write a plugin that converts the results to some JSON format and then post them to a web service to gather benchmarking results or let it generate fancy graphs for you.

Of course, not everybody needs that flexibility. Some people might be scared away by the verboseness above. So there’s also a higher level interface that uses all the options you see above and condenses them down to one function call to efficiently define your benchmarks:

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Benchee.run(%{time: 3},
             [{"flat_map", fn -> Enum.flat_map(list, map_fun) end},
              {"map.flatten",
              fn -> list |> Enum.map(map_fun) |> List.flatten end}])

Let’s see some results!

You’ve seen two different ways to run the same benchmark with Benchee now, so what’s the result and what does it look like? Well here you go:

tobi@happy ~/github/benchee $ mix run samples/run.exs
Benchmarking flat_map...
Benchmarking map.flatten...

Name                          ips            average        deviation      median
map.flatten                   1311.84        762.29μs       (±13.77%)      747.0μs
flat_map                      896.17         1115.86μs      (±9.54%)       1136.0μs

Comparison:
map.flatten                   1311.84
flat_map                      896.17          - 1.46x slower

So what do you know, much to my own surprise calling map first and then flattening the result is significantly faster than a one-pass flat_map. Which is unlike Ruby, where flat_map is over two times as fast in the same scenario. So what does that tell us? Well, what we think about performance from other programming languages might not hold true. Also, there might be a bug in flat_map – it should be faster for all that I know. Need some time to investigate🙂

All that aside, wouldn’t a graph be nice? That’s a feature I envy benchfella for. But wait, we got this whole extensible architecture right? Generating the whole graph myself with error margins etc. might be a bit tough, though. But I got LibreOffice on my machine. A way to quickly feed my results into it would be great.

Meet BencheeCSV (the first and so far only Benchee plugin)! With it we can substitute the formatting and output steps to generate a CSV file to be consumed by a spreadsheet tool of our choice:

file = File.open!("test.csv", [:write])
list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Benchee.init
|> Benchee.benchmark("flat_map", fn -> Enum.flat_map(list, map_fun) end)
|> Benchee.benchmark("map.flatten",
                     fn -> list |> Enum.map(map_fun) |> List.flatten end)
|> Benchee.statistics
|> Benchee.Formatters.CSV.format
|> Enum.each(fn(row) -> IO.write(file, row) end)

And a couple of clicks later there is a graph including error margins:

[Graph: benchee_csv results including error margins]

How do I get it?

Well, just add benchee or benchee_csv to the deps of your mix.exs!

def deps do
  [{:benchee, "~> 0.1.0", only: :dev}]
end

Then run mix deps.get, create a benchmarking folder and create your new my_benchmark.exs! More information can be found in the online documentation or at the github repository.

Anything else?

Well Benchee tries to help you, that’s why when you try to micro benchmark an extremely fast function you might happen upon this beauty of a warning:

Warning: The function you are trying to benchmark is super fast, making time measures unreliable!
Benchee won’t measure individual runs but rather run it a couple of times and report the average back. Measures will still be correct, but the overhead of running it n times goes into the measurement. Also statistical results aren’t as good, as they are based on averages now. If possible, increase the input size so that an individual run takes more than 10μs

The reason why I put it there is pretty well explained. The measurements would simply be unreliable, as randomness and the measuring itself have too huge of an impact. Plus, measurements are in microseconds – so they aren’t that accurate either. I tried nanoseconds but quickly discarded them, as that seemed to add even more overhead.

Benchee then tries to run your benchmark n times and measure that; while this improves the situation somewhat, it adds the overhead of my repeat_n function to the benchmark.
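For the curious, such a helper boils down to something like this (a sketch, not necessarily Benchee’s exact implementation):

defmodule Repeater do
  # calls the given function n times; the loop itself adds a bit of overhead
  # that then becomes part of the measurement
  def repeat_n(_function, 0), do: :ok
  def repeat_n(function, n) do
    function.()
    repeat_n(function, n - 1)
  end
end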

So if you can, please benchmark with higher values🙂

Ideas for the future?

Benchee is just version 0.1.0, but a lot of work, features and thought has already gone into it. Here are features that I thought about but decided they are not necessary for a first release:

  • Turning off/reducing garbage collection: Especially micro benchmarking can be affected by garbage collection, as single runs will be much slower than the others, leading to a skyrocketing standard deviation and unreliable measures. Sadly, to the best of my knowledge, one can’t turn off GC on the BEAM. But people have shown me options where I could just set a very high memory space to reduce the chance of GC. Need to play with it.
  • Auto-scaling units: It’d be nice, for instance, to show the average time in milliseconds if a benchmark is slower, or to write something to the effect of “80.9 Million” iterations per second in the console output for a fast benchmark.
  • Better alignment for console output. Right now it’s left aligned, I think right alignment looks better and helps compare results.
  • Making sure Benchee is also usable for more macro benchmarks, e.g. functions that run in a matter of seconds or even minutes
  • Related to that, also provide the option to specify a warmup time. Elixir/Erlang isn’t JITed so it should have no impact there, but for macro benchmarks on Phoenix or similar with a database it should have an impact.
  • Give measuring memory consumption a shot
  • More statistics: Anything you are missing, wishing for?
  • Graph generation: A plugin to generate and share a graph right away would be nice
  • Configurable steps in Benchee.run: Right now if you want to use a plugin you have to use the more “verbose” API of Benchee. If Benchee gains traction and plugins really become a thing it’d be nice to configure them in the high level API like %{formatter: MyFormatModule} or %{formatter: MyFormatModule.format/1}.

So that’s it – have anything you’d like to see in Benchee? Please get in touch and let me know! In any case, give Benchee a try and happy benchmarking!


Elixir 1.3’s mix xref working its magic in a real world example

The upcoming Elixir 1.3, available as a release candidate, brings along many cool new features such as Calendar Types and (finally!) test groups for ExUnit. My favorite new feature hasn’t been discussed a lot though. It’s also missing in the “What’s coming in Elixir 1.3” blog post. So, I’d like to show it off: mix xref

So what is mix xref? In the words of the changelog:

Mix v1.3 includes a new task called xref that performs cross reference checks in your code. One of such checks is the ability to find calls to modules and functions that do not exist

Which is a great addition, finding these bugs where you have a typo or forgot an alias/import at compile time! Frankly, I thought/hoped Elixir already did this – and it did, but only sometimes (e.g. you couldn’t import a module that isn’t there).

The even better news is that mix xref is executed by default when your code (and its dependencies) is compiled – so check the warnings after upgrading!

Your tests should generally capture these bugs, but a compile time warning is even earlier in the process and not everything is well tested. We experienced that first hand last week, where it took us some time to trace a test failure back to a bug in timex_ecto, and we opened a PR improving test coverage with failing test cases for the bug. Curious mind that I am, I wanted to see if mix xref would have found that call to an undefined module. Let’s see:

tobi@airship ~/github/timex_ecto $ elixir -v
Erlang/OTP 18 [erts-7.3]  [64-bit] [smp:8:8] [async-threads:10] [kernel-poll:false]

Elixir 1.3.0-rc.0 (7881123)
tobi@airship ~/github/timex_ecto $ mix clean
tobi@airship ~/github/timex_ecto $ mix compile
Compiling 7 files (.ex)
warning: undefined protocol function dump/1 (for protocol Ecto.DataType)
  lib/datatype.ex:1

warning: undefined protocol function dump/1 (for protocol Ecto.DataType)
  lib/datatype.ex:23

warning: function Ecto.Type.blank?/1 is undefined or private
  lib/types/date.ex:14

warning: function Ecto.Type.blank?/1 is undefined or private
  lib/types/datetime.ex:14

warning: function Ecto.Type.blank?/1 is undefined or private
  lib/types/datetimetz.ex:34

warning: function Ecto.DateTimeWithTimezone.cast/1 is undefined (module Ecto.DateTimeWithTimezone is not available)
  lib/types/datetimetz.ex:57

warning: function Ecto.Type.blank?/1 is undefined or private
  lib/types/time.ex:14

Generated timex_ecto app

It found not only the bug we were hitting (Ecto.DateTimeWithTimezone is undefined) but more sources of potential bugs. Which is great, as my hope is that with mix xref those would never have made it into a hex package release in the first place.

This is why I believe that mix xref might have the biggest impact on the Elixir ecosystem of all the changes coming in 1.3. Don’t get me wrong, I love test groups, calendar types and all the other updates. But this should directly improve the quality of both released libraries and the development experience. Sure, these should all be caught by tests, but (sadly) not everybody tests that rigorously.

There are over 30 such problems reported by xref in the dependencies of our relatively small Phoenix app (standard Phoenix + edeliver + maybe 5 other dependencies and their dependencies). That’s potential crashes, bugs and dead code looming that I’d rather avoid. Hence, I’m REALLY excited for Elixir 1.3 and mix xref – and you should be too😉


Slides: Ruby to Elixir – what’s great and what you might miss

This is a talk I gave at the Polyglot Tech Meetup in Berlin in April. It has parts of my previous Elixir talk while adding a new perspective, more directly comparing Ruby and Elixir, as well as being shorter, as it was used as an intro to a fun panel discussion.

Abstract

Elixir and Phoenix are known for their speed, but that’s far from their only benefit. Elixir isn’t just a fast Ruby and Phoenix isn’t just Rails for Elixir. Through pattern matching, immutable data structures and new idioms your programs can not only become faster but more understandable and maintainable. While we look at the upsides we’ll also have a look at what you might be missing and could be improved.

Slides

Slides: Elixir & Phoenix – fast, concurrent and explicit

This is the first talk I ever gave about my two new favorite technologies to play with (at home and at work) – Elixir and Phoenix. I gave this talk at Vilnius.rb in March and at the Ruby User Group Berlin in April. Hope you enjoy it.

Abstract

Elixir and Phoenix are all the hype lately – what’s great about them? Is there more to them than “just” fast, concurrent and reliable?

This talk will give a short intro into both Elixir and Phoenix, highlighting strengths, differences from Ruby/Rails and weaknesses.

Slides


Don’t you Struct.new(…).new(…)

As I just happened upon it again, I gotta take a moment to talk about one of my most despised Ruby code patterns ever since I first encountered it: Struct.new(...).new.

Struct.new?

Struct.new is a convenient way to create a class with accessors:

2.2.2 :001 > Extend = Struct.new(:start, :length)
 => Extend 
2.2.2 :002 > instance = Extend.new(10, 20)
 => #<struct Extend start=10, length=20> 
2.2.2 :003 > instance.start
 => 10 
2.2.2 :004 > instance.length
 => 20

That’s neat, isn’t it? It is, but like much of Ruby’s power it needs to be wielded with caution.

Where Struct.new goes wrong – the second new

When you do Struct.new you create an anonymous class, another new on it creates an instance of that class. Therefore, Struct.new(...).new(...) creates an anonymous class and creates an instance of it at the same time. This is bad as we create a whole class to create only one instance of it! That is a capital waste.

As a one-off use it might be okay, for instance when you put it in a constant. The sad part is that this is not the only way I’ve seen it used. As in the first case where I encountered it, I see it used inside code that is invoked frequently – some sort of hot loop with calculations where more than one value is needed to represent the result of the calculation. Here programmers sometimes seem to reach for Struct.new(...).new(...).

Why is that bad?

Well, it incurs unnecessary overhead – creating the class every single time just isn’t needed. Not only that, as far as I understand it also creates new independent entries in the method cache. For JITed implementations (like JRuby) the methods would also be JITed independently. And it gets in the way of profiling, as you see lots of anonymous classes with only 1 to 10 method calls each.

But how bad is that performance hit? I wrote a little benchmark where an instance is created with 2 values and then those 2 values are read one time each. Once with Struct.new(...).new(...), once where the Struct.new(...) is saved in an intermediary constant. For fun and learning I threw in a similar usage with Array and Hash.

require 'benchmark/ips'

Benchmark.ips do |bm|
  bm.report "Struct.new(...).new" do
    value = Struct.new(:start, :end).new(10, 20)
    value.start
    value.end
  end

  SavedStruct = Struct.new(:start, :end)
  bm.report "SavedStruct.new" do
    value = SavedStruct.new(10, 20)
    value.start
    value.end
  end

  bm.report "2 element array" do
    value = [10, 20]
    value.first
    value.last
  end

  bm.report "Hash with 2 keys" do
    value = {start: 10, end: 20}
    value[:start]
    value[:end]
  end

  bm.compare!
end

I ran those benchmarks with CRuby 2.3. And the results, well, I was surprised how huge the impact really is. The “new-new” implementation is over 33 times slower than the SavedStruct equivalent. And about 54 times slower than the fastest solution (Array), although that’s also not my preferred solution.

Struct.new(...).new    137.801k (± 3.0%) i/s -    694.375k
    SavedStruct.new      4.592M (± 1.7%) i/s -     22.968M
    2 element array      7.465M (± 1.4%) i/s -     37.463M
   Hash with 2 keys      2.666M (± 1.6%) i/s -     13.418M

Comparison:
    2 element array:  7464662.6 i/s
    SavedStruct.new:  4592490.5 i/s - 1.63x slower
   Hash with 2 keys:  2665601.5 i/s - 2.80x slower
Struct.new(...).new:   137801.1 i/s - 54.17x slower
Benchmark in iterations per second (higher is better)

But that’s not all…

This is not just about performance, though. When people take this “shortcut” they also circumvent one of the hardest problems in programming – naming. What is that Struct with those values? Do they have any connection at all, or were they just smashed together because it seemed convenient at the time? What’s missing is the identification of the core concept that these values represent. As it stands it’s just an anonymous clump of data with these two values – one that also performs very poorly.

So, please avoid using Struct.new(...).new(...) – use a better alternative. Don’t recreate a class over and over – give it a name, put it into a constant and enjoy a better understanding of the concepts and increased performance.

Running a meetup

So far in this post series I covered what you should be aware of before you start organizing a meetup and the 5 basics defining your meetup. I saved one of the most important parts, how to actually run the meetup, for last.

A meetup is usually divided into a couple of phases: Before, Arrival, Main and Goodbye. To easily see what there is to do, the format of this post is slightly different from the others. It’s not so much a discussion but more of a checklist for each of the phases, so you don’t forget anything.

Before

As you might have noticed during the first two posts, most of the work done for a meetup happens before the meetup. The preparation is the real work, if it is good, then the meetup is mostly fun. The foundation for a great meetup is great preparation. Most of the things to prepare (online presence, atmosphere, talks, regular schedule, …) were discussed in previous posts. This is about what to do in the days leading up to the event.

  • Check with the hosts if everything is still set and clear up any questions (how many people are expected to come, how long the meetup will run, do they have a projector etc.) and remind them of important things (putting up signs etc.)
  • Check with the speakers if they are ready and everything is good to go for the meetup day and remind them of the CoC
  • See that you announce the meetup via the mailing list and twitter (I usually like to tweet about every talk individually to give the speakers some exposure and let interested attendees know what topics will be covered)
  • On the day of the meetup tweet again, make sure that the directions to the meetup are clear and let them know if there’s food at the meetup

Arrival

For Arrival I like to be the first person at the venue, so I get there 30 to 45 minutes before the official meetup start. If the meetup involves presentations of any kind, be sure to bring your own laptop. The laptop of a speaker might break down, they don’t have their adapter with them… lots of things can happen, so it’s good to have a backup on your side.

Then it’s time to make sure everything is set:

  • Who is the responsible person from the venue in case we need anything?
  • Is there something special we gotta pay attention to (for instance, keep windows closed so neighbours aren’t disturbed)?
  • Are there enough signs to the meetup place so people can find it easily?
  • Are there enough chairs for the expected crowd?
  • Is the projector there and ready? Are adapters there?
  • Is there a microphone system, do we need it?
  • Where’s the bathroom?
  • Is there Wifi? What’s the password?
  • Where are the drinks (+ food?)?
  • Where are my speakers? Do they have any preference when to speak? (I usually let them choose on a “first come first served” basis)
  • Check that the laptop of the speakers works OK with the projector (adapter etc.) before the meetup starts, to prevent bad surprises

Main

For the Main part I’ll make sure I found enough speakers to fill the content before the break and then start relatively on time. Starting a bit late is fine, as people always arrive late. The Ruby User Group Berlin even has this “tradition” where we always start 15 minutes late (pssst).

First comes the welcome and an overview that should include:

  • thanks everyone for coming
  • quick introduction to the format (talks, lightning talks)
  • today’s topics and speakers
  • where are the bathrooms
  • where are food & drinks
  • wifi (also good to show on a projector or have signs around)
  • mention general rules such as the CoC
  • host & sponsors (if you have some), I usually give them maximum 5 minutes to introduce themselves while advising for a shorter time – people get bored easily

Then it goes on to announcing the talks, as well as the different parts of the meetup (break, lightning talks), telling people that we are always looking for talks and encouraging them to approach me to bounce talk ideas around.

If there are small pauses in between speakers (while connecting to the projector) I like to share some related news (new version of major library X released, security vulnerability in Y, conferences) and ask the audience if they also have any news to share. I just don’t like sustained periods of silence while the meetup is supposed to be running.

To get the attention of people and have them be silent a long extended “Shhhhhhh” while standing on the stage usually works best in my experience. Sometimes it’s just enough to stand there, wait and look like you are going to say something. Holding up one hand (maybe with a balloon) also has worked pretty well for me.

Trying to get some attention at a Rails Girls Berlin workshop

Goodbye

For the Goodbye the essential topics are mostly:

  • thanks for coming!
  • next meetup place + time
  • call for talks
  • if there’s an after party place, tell them where it is and where to form a group to get going there
  • ask kindly for help cleaning the space up (stacking chairs, collecting bottles etc.)
  • thank the speakers once again for their talks

And that’s basically it… but don’t forget – after the meetup is before the next meetup and it’s always a lot less stressful to have things already organized a long time in advance. I try to aim for having the venue and a couple of talks confirmed a month in advance, admittedly I often fail at that.

And that’s also an important takeaway here: nothing is ever perfect and it doesn’t have to be. Don’t worry if you don’t get all of this straight or if you forget something… it happens. I’ve been doing this for a long time and I still forget things, but usually everything goes just fine. If I forget something, people remind me or ask. People also know that meetups are organized by volunteers, so they are forgiving and willing to help when something goes wrong.

So, let me know if this helped you or if I forgot to cover something and you have any remaining questions. This marks the end of this little series, so I hope that it helps you get your meetup started or that you got the insight into organizing meetups that you were looking for.

Defining the 5 basics of your meetup

After looking at some things you should be aware of before you start your own meetup, let’s take the next step and ask: “What will your meetup be like?”. In this post we’ll take a look at 5 basics that will define your meetup. These are:

  • Atmosphere
  • Activity
  • Place
  • Time
  • Refreshments

As always, this is my own opinion and I might have forgotten something. If you find something missing please let me know and I’ll happily amend this post🙂

Let’s start with Atmosphere as it is the fuzziest concept but probably the most important one.

Atmosphere – creating a friendly community

When you organize a meetup you create a space where people meet. These people are mostly strangers, especially in the beginning. For the continued success of a meetup it is important to create a welcoming, inclusive and friendly atmosphere. How do we get there?

One of the most important factors here is you, the organizer. Whether you want it or not you’ll lead by example. Your actions and behavior will influence the expected behavior of the group. Be friendly, welcoming, stay humble and approachable. It is important that people can come to you, raise concerns, alert you to problems, give feedback or just for a friendly chat.

Speaking of which, to signal to the outside that you want to create a friendly community that welcomes everyone who abides by a basic set of rules, you should have a code of conduct. You should also be willing to enforce these rules. I’m not going to debate the merits of codes of conduct in detail here, as that has been done better elsewhere and this post is going to be very long as it stands. Just be aware that merely “having” a CoC isn’t enough. Attendees, especially speakers, need to be made aware of it and you need to enforce actions against violators. Also, don’t roll your own. There are plenty of CoCs for events out there that you can take or adjust. For instance, in Berlin we created the Berlin Code of Conduct, which is based on the pdx.rb CoC, translated it into many languages and you can sign it. You will note the contact information on the web page. It is there so attendees can reach out, request help or report violations.

Activity – what do we do

What do you want to do during your meetup? There are multiple possibilities that vary for the size of the meetup and the organizational effort. Also be aware that you can mix and match these, which often makes sense.

Talks

This is what the meetups I organize mostly do. We have speakers that present about a topic somehow related to the overall topic of the meetup. We usually have 3 talks that are around 20 minutes each. We split that up into 2 talks, a break, then another talk and then lightning talks (or a quiz!).

Talks shouldn’t be much longer than 20 minutes (I usually cut people off at 30 minutes). Not everyone is interested in a given topic, so sitting there for 60 minutes hearing about something you are not interested in can be pretty frustrating. Also it’s hard to focus for so long (especially after a work day, as most meetups are on weekdays and in the afternoon). Moreover, it encourages the presenters to focus on the essentials.

Mostly talks are accompanied by Q&A sessions. Q&A is controversial, some people don’t like it at all while others LOVE Q&A. Some Q&A sessions tend to get down into nitty gritty discussions of one very deep topic that is only valuable to the one asking the questions. Moderation is key here, don’t let it get too long (I tend to do ~3 questions tops).

To get talks going it is great to have a way for people to submit talks with abstracts that you can then schedule for the meetup. This is why I love onruby (first class support!) and deplore meetup.com (have fun searching through all your messages and manually adding everything!). You need to inquire rather early whether speakers are ready to give their talk at your next meetup, as they need time to prepare (2 weeks to a month or more in advance is the optimum; a week works if you are lucky, but isn’t really fair to the speakers), and you need to do all the speaker management as well.

With this sort of setup, the greatest compliment you can get is that your meetup feels like a “mini conference”. It is a great setup to have, but especially with a high number of desired talks it is hard to keep going month after month. You need to have a rather big community to keep this going with good talks. Of course, it is fine to scale down and sometimes just have fewer talks. At the same time it is vital to establish that the community is friendly, talks by first time speakers are very welcome and that you’ll aid with feedback about slides and/or presentations.

Lightning Talks

Lightning talks are very short talks (usually 5 minutes – sharp, no running over!) that therefore often don’t need a big preparation time. I don’t usually schedule them in advance for meetups. I just ask during the meetup if anyone has a lightning talk they want to present and then they can come to me and get started right away.

Sometimes people just show off cool hacks or announce new events. It’s usually a lot of fun. I’ll allow a max of 4 lightning talks per event (so that the event doesn’t run too late), usually it’s less.

Coding Together

Especially in smaller meetups where you can’t prepare talks every time it is cool to just sit together and code learning from each other and trying out something new. In my opinion this is ideally accompanied by an introductory talk/tutorial about a library/pattern/whatever so that people can then get to playing with what they learned straight away.

I watched a talk by the creator of Elm where he mentions that for him an Elm meetup with talks often attracted PhDs, while a coding/hacking meetup encouraged newcomers and helped to spread the technology much better. I can’t verify the same for RUG::B (no PhDs and often plenty of juniors) but I can see how newcomers benefit more from a coding meetup and how it might be better for very new technologies that a lot of people just want to play with. In fact I’d love to start some sort of “hack together” meetup in Berlin, but time is scarce😉

Our local Clojure user group also runs a similar setup (talk + hacking). When I was there I found it very fun, because of the hands on nature and the pure joy of learning from my peers (we played with ClojureScript and Om).

Discussions

When you have neither talks nor coding together, you can also just sit together and discuss. I’ve seen this used mostly in meetups around “agile” development/project management. In a smaller group (usually 10 to 30 participants) participants first propose topics to discuss, and in the end a vote decides which topics will be discussed. Depending on the venue, either all topics are discussed together in the big group as participants chime in, or there are multiple topics which are then discussed in different rooms in parallel. The discussions are time boxed, and after a discussion is over the findings are summarized for the whole group (especially cool with multiple “tracks”, as you get to hear the conclusions of the other groups/topics).

The easiest form of this is to just get together in a Bar/Restaurant and talk about whatever people feel like talking about – rather unstructured. This can be a lot of fun depending on what you are after. I genuinely like people from the community and so it is great for me to get out and get to know them talking about things that aren’t necessarily programming related. We used to have both Ruby Picknick and Ruby Burgers (Berlin style – vegetarian) organized in Berlin, just a friendly get together. People also frequently go to a nearby bar/restaurant after a meetup.

If you choose the “discussion in a bar” format as your primary activity, make sure – both in the meetup structure and in the title – that the drinking part doesn’t take center stage, especially where alcoholic beverages are concerned. The concern here is not only about the quality of discussions derailing; it can also have a repelling effect on people who don’t drink, and especially minorities. Some people don’t feel safe in this environment (especially with lots of strangers!) or simply don’t like it by nature.

Quiz

A quiz prepared in advance about oddities, edge case behaviours or “Aha!” features of the technology of discussion has been really successful for us. Usually everyone is pretty engaged trying to figure out the answers plus you learn something. Mostly what you learn isn’t really that applicable but still fun to know.

Examples of what I’m talking about: Ruby Trivia one two and three.

Of course it’s great if you have some sort of prize for people who answer questions. It’s not necessary though, people are fine just answering for the fun and joy of it.

Refreshments – people are thirsty & hungry

Make sure that drinks are available at your chosen venue, either sponsored or for an affordable price. It’s important that there are non-alcoholic drinks (water, lemonade etc.). The presence of alcoholic drinks might depend on your country. In Germany it is rather normal to have beer available at meetups, while I know of meetups in Sweden where there is a conscious decision not to have alcohol at meetups. If you have alcohol at your meetups, take care that no one goes overboard. Meetups should never be about the drinks, meetups are about the people and the activity (talks, coding..).

Food is optional. It’s great to have food, otherwise people either go hungry for a long time or they have to hurry to get some dinner before the meetup starts. However, food for so many people isn’t exactly cheap. It can be more affordable for smaller meetups, though. I know of a couple of small meetups (5-20 people) that have a recurring food + drinks sponsor. I usually ask our hosts if they want to provide food. If they do, great. If they don’t, no problem. If there is food, be sure to announce it in advance so everyone knows. Make sure to also be inclusive with the provided food options. In Berlin we make sure to at least provide vegetarian options and try to provide vegan options.

Venue – where do we meet

The venue for the meetup should be reachable from wherever people mostly work. If you move a bit too far away you can expect a 10 to 30% drop in attendance (+ sometimes frustrated tweets of people who think it’s too far). The place of course should offer enough space for your expected crowd. For talks bigger open spaces mostly work best. If you want to have multiple separate discussion groups then of course you need multiple breakout spaces.

A regular venue is good because people don’t have to look up where to go and how to get there, they just know. It’s also less stress as an organizer as you don’t have to go looking for a venue every time. Often times the regular meeting space is the company at least one of the organizers works at.

However, the Ruby User Group Berlin likes to move around (a tradition from before I took it over). That has advantages as well: you get to know companies from your city, you introduce your meetup to employees of the host company (as they are likely to stick around), and companies that only host the meetup occasionally are much more likely to sponsor drinks and food😀

As usual, there is no best option. I think for a new meetup sticking to a regular venue is easier for organizers and attendees at first. Plus, in order to be able to move around venues every month you need a somewhat large pool of companies willing to host you, meaning your topic/technology needs to be sufficiently “mainstream”. Sometimes special venues also lead to an increased attendance. Our highest attended meetup ever (150 to 200, depending on whose count you believe) was in the brand new SoundCloud office that people really wanted to see, right after it opened.

Time – when do we meet

The most important property of the time is that it should be regular. This way attendees can get used to when it is. RUG::B is every first Thursday of the month at 19:30 unless that’s a holiday or something special. Done, people know that.

Of course check for scheduling conflicts with similar meetups (especially a Berlin problem?) – you don’t want to have attendees having to decide where to go.

As far as I can tell evenings of weekdays, excluding Friday, work best. Give people enough time to get to the venue and also to grab something to eat before arriving at the meetup. In Berlin meetups usually start around 19:00 or later. As people want to get home at some time make sure the meetup doesn’t run too long (admittedly, especially RUG::B often runs late).

There are also meetups that happen in the morning (to be more parent friendly among other things), which is another idea to be explored.

Now that we have the basics down, think about what your meetup should be like. How do you create a friendly atmosphere? What will the format of the meetup be? Where and when do you meet? What refreshments will there be? When that’s all decided, we’re all set up for the first meetup. So the next post will be about running the meetup – from start to finish🙂