Journeys of a not so young anymore Software Engineer

2016-06-20T09:18:51+02:00

Hey!

Thanks for your interesting post, it was really nice to read! These are interesting findings. However, I’m afraid that what your post describes as the tail-call-optimised solution is in fact not really one:

  defp _map_tco(acc, [head | tail], function) do
    _map_tco([function.(head) | acc], tail, function)
  end

Here, the last expression is indeed a call to the function itself, but within that call you are also performing the calculation (`function.(head)`). As I understand it, the function is only tail-call optimised if the recursive call is the last thing executed, and the only thing executed.

Please take a look at this solution:

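  # Public entry point: start with an empty accumulator.
  def my_map(list, fun) do
    do_my_map(list, fun, [])
  end

  # Base case: the accumulator is in reverse order, so flip it once at the end.
  def do_my_map([], _fun, acc) do
    Enum.reverse(acc)
  end

  # Recursive case: build the new accumulator first, then make the call.
  def do_my_map([head | tail], fun, acc) do
    new_acc = [fun.(head) | acc]
    do_my_map(tail, fun, new_acc)
  end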
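
To try these clauses outside the benchmark project, they can be wrapped in a module. This is a minimal sketch, assuming a hypothetical module name `TcoSketch` (the comment does not say which module the functions live in) and making the helper private:

```elixir
defmodule TcoSketch do
  # Hypothetical wrapper module; the name is an assumption, not from the comment.

  # Public entry point: starts the recursion with an empty accumulator.
  def my_map(list, fun), do: do_my_map(list, fun, [])

  # Base case: the accumulator was built back to front, so reverse it once.
  defp do_my_map([], _fun, acc), do: Enum.reverse(acc)

  # Recursive case: compute the new accumulator first, then recurse.
  defp do_my_map([head | tail], fun, acc) do
    new_acc = [fun.(head) | acc]
    do_my_map(tail, fun, new_acc)
  end
end

# prints [2, 3, 4]
IO.inspect(TcoSketch.my_map([1, 2, 3], fn i -> i + 1 end))
```

Pasting this into `iex` or running it with `elixir` should be enough to check that it behaves like `Enum.map/2` on small lists.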

The last `do_my_map` clause first performs the computation (`new_acc = [fun.(head) | acc]`) and only then calls the function with the new value (`do_my_map(tail, fun, new_acc)`). The results on my machine are as follows:

➜  tco_bench mix run lib/check.exs
Compiled lib/tco_bench.ex
Benchmarking map with TCO reverse...
Benchmarking map with TCO and ++...
Benchmarking map simple without TCO...
Benchmarking map TCO no reverse...
Benchmarking my_map...
Benchmarking stdlib map...

Name                          ips            average        deviation      median
map simple without TCO        3779.45        264.59μs       (±18.86%)      247.00μs
my_map                        3770.22        265.24μs       (±18.61%)      247.00μs
stdlib map                    3730.08        268.09μs       (±19.84%)      252.00μs
map TCO no reverse            3484.50        286.99μs       (±18.92%)      266.00μs
map with TCO reverse          3006.32        332.63μs       (±19.99%)      322.00μs
map with TCO and ++           3.99           250499.84μs    (±5.98%)       251028.00μs

Comparison:
map simple without TCO        3779.45
my_map                        3770.22         - 1.00x slower
stdlib map                    3730.08         - 1.01x slower
map TCO no reverse            3484.50         - 1.08x slower
map with TCO reverse          3006.32         - 1.26x slower
map with TCO and ++           3.99            - 946.75x slower

➜  tco_bench mix run lib/check.exs
Benchmarking map with TCO reverse...
Benchmarking map with TCO and ++...
Benchmarking map simple without TCO...
Benchmarking map TCO no reverse...
Benchmarking my_map...
Benchmarking stdlib map...

Name                          ips            average        deviation      median
map simple without TCO        3833.15        260.88μs       (±18.53%)      245.00μs
stdlib map                    3805.99        262.74μs       (±18.94%)      250.00μs
my_map                        3771.31        265.16μs       (±19.16%)      247.00μs
map TCO no reverse            3461.84        288.86μs       (±20.13%)      267.00μs
map with TCO reverse          3086.25        324.02μs       (±17.14%)      321.00μs
map with TCO and ++           3.95           253440.21μs    (±6.24%)       250962.00μs

Comparison:
map simple without TCO        3833.15
stdlib map                    3805.99         - 1.01x slower
my_map                        3771.31         - 1.02x slower
map TCO no reverse            3461.84         - 1.11x slower
map with TCO reverse          3086.25         - 1.24x slower
map with TCO and ++           3.95            - 971.47x slower

➜  tco_bench mix run lib/check.exs
Benchmarking map with TCO reverse...
Benchmarking map with TCO and ++...
Benchmarking map simple without TCO...
Benchmarking map TCO no reverse...
Benchmarking my_map...
Benchmarking stdlib map...

Name                          ips            average        deviation      median
map simple without TCO        3834.41        260.80μs       (±18.37%)      245.00μs
my_map                        3812.49        262.30μs       (±17.96%)      247.00μs
stdlib map                    3799.19        263.21μs       (±18.56%)      250.00μs
map TCO no reverse            3485.17        286.93μs       (±19.02%)      267.00μs
map with TCO reverse          3041.72        328.76μs       (±18.70%)      322.00μs
map with TCO and ++           4.09           244627.10μs    (±5.48%)       242761.00μs

Comparison:
map simple without TCO        3834.41
my_map                        3812.49         - 1.01x slower
stdlib map                    3799.19         - 1.01x slower
map TCO no reverse            3485.17         - 1.10x slower
map with TCO reverse          3041.72         - 1.26x slower
map with TCO and ++           4.09            - 938.00x slower

In fact, the solution without TCO comes out on top in all three runs, but in two of the three runs `my_map` takes second place and is almost as fast.
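
The "x slower" figures in the comparison are consistent with dividing the fastest job's ips by each other job's ips. Here is a minimal sketch of that arithmetic for the first run above, assuming a hypothetical helper module `BenchMath` (this is not Benchee's API, just a check of the printed numbers):

```elixir
defmodule BenchMath do
  # Hypothetical helper, not part of Benchee: ratio of the reference
  # (fastest) ips to another job's ips, rounded to two decimals.
  def times_slower(reference_ips, ips) do
    Float.round(reference_ips / ips, 2)
  end
end

# ips values copied from the first run above.
reference = 3779.45

IO.inspect(BenchMath.times_slower(reference, 3770.22)) # my_map
IO.inspect(BenchMath.times_slower(reference, 3006.32)) # map with TCO reverse
```

These come out at 1.0 and 1.26, matching the table. With ratios this close to 1 and deviations of roughly ±18%, the ordering of the top three entries is within the noise.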

Similarly, `my_is_even?`:

  def my_is_even?(n) do
    my_is_even?(n, true)
  end
  def my_is_even?(0, acc) do
    acc
  end
  def my_is_even?(n, acc) do
    new_n = n - 1
    new_acc = !acc
    my_is_even?(new_n, new_acc)
  end

With results:

➜  tco_bench mix run lib/check_number.exs
Compiled lib/tco_bench.ex
Benchmarking is_even?...
Benchmarking is_even_tco?...
Benchmarking my_is_even?...

Name                          ips            average        deviation      median
is_even?                      7.04           141987.93μs    (±3.05%)       142986.00μs
my_is_even?                   6.81           146892.47μs    (±3.60%)       145488.50μs
is_even_tco?                  6.78           147587.36μs    (±3.16%)       146817.00μs

Comparison:
is_even?                      7.04
my_is_even?                   6.81            - 1.03x slower
is_even_tco?                  6.78            - 1.04x slower

➜  tco_bench mix run lib/check_number.exs
Benchmarking is_even?...
Benchmarking is_even_tco?...
Benchmarking my_is_even?...

Name                          ips            average        deviation      median
is_even?                      7.04           142105.27μs    (±2.63%)       141226.50μs
my_is_even?                   6.90           144946.26μs    (±3.76%)       144587.50μs
is_even_tco?                  6.85           145950.56μs    (±4.11%)       144922.50μs

Comparison:
is_even?                      7.04
my_is_even?                   6.90            - 1.02x slower
is_even_tco?                  6.85            - 1.03x slower

➜  tco_bench mix run lib/check_number.exs
Benchmarking my_is_even?...
Benchmarking is_even?...
Benchmarking is_even_tco?...

Name                          ips            average        deviation      median
my_is_even?                   7.00           142768.66μs    (±3.36%)       143294.00μs
is_even?                      6.97           143423.80μs    (±3.78%)       142783.50μs
is_even_tco?                  6.88           145270.09μs    (±3.68%)       144500.50μs

Comparison:
my_is_even?                   7.00
is_even?                      6.97            - 1.00x slower
is_even_tco?                  6.88            - 1.02x slower

In fact, I haven’t noticed any significant difference (the results vary a bit from run to run) between binding the new values first:

  def my_is_even?(n, acc) do
    new_n = n - 1
    new_acc = !acc
    my_is_even?(new_n, new_acc)
  end

or just _flipping_ the `acc` result in the function call:

  def my_is_even?(n, acc) do
    new_n = n - 1
    my_is_even?(new_n, !acc)
  end
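
That the two variants return the same answers is easy to check. This is a small sketch, assuming hypothetical names `EvenSketch`, `even_bound?` and `even_inline?` (none of them are from the original code). The `n > 0` guard is also an addition, so that a negative input raises instead of recursing forever:

```elixir
defmodule EvenSketch do
  # Hypothetical module and function names, used only for this comparison.

  # Variant 1: bind the new values to names before the recursive call.
  def even_bound?(n), do: even_bound?(n, true)

  defp even_bound?(0, acc), do: acc

  defp even_bound?(n, acc) when n > 0 do
    new_n = n - 1
    new_acc = !acc
    even_bound?(new_n, new_acc)
  end

  # Variant 2: flip the accumulator directly in the call's argument list.
  def even_inline?(n), do: even_inline?(n, true)

  defp even_inline?(0, acc), do: acc

  defp even_inline?(n, acc) when n > 0 do
    new_n = n - 1
    even_inline?(new_n, !acc)
  end
end

# Print {n, variant 1 result, variant 2 result} for a few sample inputs.
for n <- [0, 1, 2, 7, 10_000] do
  IO.inspect({n, EvenSketch.even_bound?(n), EvenSketch.even_inline?(n)})
end
```

This only checks that the results agree. Timing the two variants still needs a benchmark run like the ones above.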

Again, thanks for your article! It’s a nice read, and you made me stop, think and in fact measure on my own. This is an invaluable lesson!

Best regards!

edit1 by PragTob: replaced github markdown style code blocks with code blocks from wordpress for better readability

	list = Enum.to_list(1..10_000)
	map_fun = fn(i) -> i + 1 end

	Benchee.run %{
	  "map tail-recursive with ++" =>
	    fn -> MyMap.map_tco_concat(list, map_fun) end,
	  "map with TCO reverse" =>
	    fn -> MyMap.map_tco(list, map_fun) end,
	  "stdlib map" =>
	    fn -> Enum.map(list, map_fun) end,
	  "map simple without TCO" =>
	    fn -> MyMap.map_body(list, map_fun) end,
	  "map with TCO new arg order" =>
	    fn -> MyMap.map_tco_arg_order(list, map_fun) end,
	  "map TCO no reverse" =>
	    fn -> MyMap.map_tco_no_reverse(list, map_fun) end
	}, time: 10, warmup: 10

	number = 10_000_000

	Benchee.run %{
	  "is_even?" => fn -> Number.is_even?(number) end,
	  "is_even_tco?" => fn -> Number.is_even_tco?(number) end,
	}

	tobi@happy ~/github/elixir_playground $ mix run bench/is_even.exs
	Benchmarking is_even?...
	Benchmarking is_even_tco?...

	Name                          ips            average        deviation      median
	is_even?                      10.26          97449.21μs     (±0.50%)       97263.00μs
	is_even_tco?                  9.39           106484.48μs    (±0.09%)       106459.50μs

	Comparison:
	is_even?                      10.26
	is_even_tco?                  9.39            - 1.09x slower

Writing a map implementation

An apparently common misconception

But won’t it explode?

Memory consumption

Body-recursive functions all the time now?

So, what now?


20 thoughts on “Tail Call Optimization in Elixir & Erlang – not as efficient and important as you probably think”
