Tuesday, April 14, 2015

Cache is the new GC

Now all rubyists are pros in all Garbage Collection techniques.

And we all have to know the difference between mark and sweep, reference counting, generational, and whatnot (I've also been seeing more references to this on the Lua mailing list lately).

And we're already optimizing to avoid creating so many objects. And we praise Rust and Nim for being so powerful and fast and for allowing us to manage memory manually. And we decide that checking for the existence of an element in a short Array is probably faster than checking in a small Set, even more so if we know the distribution of the expected values... Well... now, the next step is Cache.
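If you want to check the Array versus Set hunch instead of trusting it, a minimal sketch could look like this. The sample values, the `time_lookups` helper, and the iteration count are all made up for illustration; the result depends on the Ruby version, the collection size, and which element you look up, so treat whatever it prints as a hint and not as a verdict.

```ruby
require 'benchmark'
require 'set'

# Hypothetical data: a handful of short strings, as in a "short Array".
small_array = %w[get post put patch delete].freeze
small_set   = small_array.to_set.freeze

# Runs `iterations` membership checks against the given collection and
# returns the elapsed wall-clock time in seconds (a Float).
def time_lookups(collection, needle, iterations = 200_000)
  Benchmark.realtime do
    iterations.times { collection.include?(needle) }
  end
end

# 'get' is the first element, so this is the best case for the Array.
# Try 'delete' or a missing value to see how the distribution matters.
array_seconds = time_lookups(small_array, 'get')
set_seconds   = time_lookups(small_set, 'get')

puts format('Array#include?: %.4fs', array_seconds)
puts format('Set#include?:   %.4fs', set_seconds)
```

Looking up the first element favors the Array, and looking up the last one (or a missing one) forces a full scan, which is exactly where knowing the distribution of expected values comes in.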

Interesting points here: http://dev.mensfeld.pl/2015/04/ruby-global-method-cache-invalidation-impact-on-a-single-and-multithreaded-applications/

And after that come locality of variables and compiler tricks to optimize code. Here's a nice StackOverflow thread.

Inside this thread, the thing that brings you back to reality is that compilers are allowed to do pretty amazing things. Things so complex that if you have to keep them all in mind... well... good luck. I guess it makes sense for real-time systems or very low-level programming, but some of these techniques are really complex and seem really hard to anticipate.

What's the point of all that? No idea. It's just funny that sometimes we try to push low-level languages to higher levels, and then we program Ruby as if we were forging asm. Funny :) In the end, all benchmarks are made to lie in one way or another, so I guess the most important thing is to keep Amdahl's law in mind (or an adaptation of it): any optimization you make only pays off in the fraction of the code where that optimization applies. The original law is about intrinsically serial code that cannot be parallelized. My point is that optimizing for cache hits in Ruby, when you have a webapp that makes HTTP calls here and there to external systems, is not really the way to go.
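To put rough numbers on that adaptation of Amdahl's law, here is a small sketch. The `overall_speedup` helper is a name I made up, and the request timings (200 ms total, 10 ms of it in our own Ruby code) are invented for the example, not measured anywhere.

```ruby
# Amdahl-style estimate of the overall speedup when only part of the
# total time benefits from an optimization.
#   fraction:      share of the total time the optimization touches (0.0..1.0)
#   local_speedup: how many times faster that share becomes
def overall_speedup(fraction, local_speedup)
  1.0 / ((1.0 - fraction) + fraction / local_speedup)
end

# Hypothetical request: 200 ms in total, of which 190 ms is spent waiting
# on external HTTP calls and only 10 ms runs our own Ruby code.
total_ms = 200.0
ruby_ms  = 10.0
fraction = ruby_ms / total_ms

[2.0, 10.0, Float::INFINITY].each do |local_speedup|
  puts format('Ruby part %sx faster -> request %.3fx faster',
              local_speedup, overall_speedup(fraction, local_speedup))
end
```

With these made-up numbers, even an infinitely fast Ruby part makes the whole request only about 1.05 times faster, because the 190 ms of waiting is untouched.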