Thursday, May 7, 2015

Of programming for machine learning, speed and Julia

Machine learning is really about exploring models, moving around and modifying code. In machine learning, the level of abstraction developers expect is considerably higher than what C/C++ easily allows. When you are exploring new models and new solutions, you want to be able, from the start, to write code at a level of abstraction as close as possible to straight math equations. This explains the popularity of languages like R and Matlab, as these languages allow the user to write code and think in equations and tensors instead of thinking in containers, types and references.

The problem with these scripting languages is that they are extremely slow, as shown in the following graph (times are normalized so that a value of 1 is the C execution time). While these numbers only show part of the picture, I think that if you look around on the net you will quickly find that none of them are really controversial.
Recently, Python has gained huge popularity in the world of machine learning, as Python is a great, fairly high-level programming language with a huge community, and because NumPy provides awesome Matlab-like tensor manipulation with typed, fast operations implemented in native languages (C/C++/Fortran). Python, however, has some really big limitations. The main implementation of the language, CPython, only uses a straight bytecode interpreter and offers no way to restrict dynamic typing to allow for optimization. As such, it is pretty slow (although still much faster than R and Matlab). Its reliance on old, non-thread-safe C code forced the developers of CPython to implement what is known as the Global Interpreter Lock (GIL). Basically, it means that a single Python interpreter process can only have one thread executing bytecode at a time. As a result, shared-memory multithreading (not to be confused with multiprocessing) brings no speedup for CPU-bound Python code and only helps with IO-bound work, which greatly limits the options for parallelism in Python.
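
To make the effect of the Global Interpreter Lock concrete, here is a minimal sketch, assuming a standard CPython build with the lock enabled. The names `count_primes`, `timed` and `LIMITS`, as well as the workload size, are invented for this illustration. It runs the same CPU-bound pure-Python function sequentially and then on four threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work in pure Python: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed(label, fn):
    """Run fn(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result


# Hypothetical workload: four identical, independent tasks.
LIMITS = [20_000] * 4


def run_sequential():
    return [count_primes(limit) for limit in LIMITS]


def run_threaded():
    # Four threads, but only one of them can execute bytecode at a time.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(count_primes, LIMITS))


sequential = timed("sequential", run_sequential)
threaded = timed("4 threads", run_threaded)
```

Both runs compute the same counts. On a build with the lock enabled, the threaded run should typically take about as long as the sequential one, since the threads take turns instead of running in parallel. Exact timings depend on the machine and the interpreter version, so none are quoted here.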

Enter Julia. Julia is a JIT (Just In Time) compiled language, meaning that instead of simply interpreting bytecode, subroutines are compiled to native code just before they first run. This makes subsequent executions of a function or a loop much quicker than they would be if they were interpreted at each pass. The fact that the code is actually compiled instead of just being interpreted also allows static analysis and optimization to be performed on the code, for example removing code with no visible effects (dead code), propagating constants, etc. Julia also has optional type annotations, allowing for even greater static analysis and optimization.
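
The optimizations mentioned above can be illustrated with a toy pass. The sketch below is written in Python with the standard `ast` module (it assumes Python 3.9 or later for `ast.unparse`) and is not how Julia's compiler works internally; it only shows what folding constants and removing a dead branch mean. `ConstantFolder`, `optimize` and `SOURCE` are names invented for this example.

```python
import ast
import operator

# Map AST operator node types to the functions that evaluate them.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class ConstantFolder(ast.NodeTransformer):
    """Fold constant arithmetic and drop branches that can never run."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        fold = _BIN_OPS.get(type(node.op))
        if (fold is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            value = fold(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node

    def visit_If(self, node):
        self.generic_visit(node)
        # If the condition is a known constant, keep only the live branch.
        if isinstance(node.test, ast.Constant):
            live = node.body if node.test.value else node.orelse
            return live or ast.Pass()
        return node


def optimize(source):
    """Return `source` with constants folded and dead branches removed."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SOURCE = """
def scaled(x):
    factor = 2 * 3 + 1
    if 1 - 1:
        x = x * 1000
    else:
        x = x + 1
    return factor * x
"""

print(optimize(SOURCE))
```

In the printed result, `2 * 3 + 1` has been replaced by its value and the `x * 1000` branch is gone, because its condition folds to a constant that is always false. A real compiler performs this kind of analysis on a lower-level representation and with many more rules.
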
Shared-memory parallelism is still being developed for Julia. There is currently a working thread-safe demonstration version of Julia for Linux, but more work is needed before Julia supports shared-memory parallelism on other mainstream platforms.

Another extremely interesting feature of Julia is that it was built to be incredibly easy to interface with C/C++ and with Python. Note that when I say incredibly easy, it's no exaggeration. Some examples of code integration can be observed here for C, and here for Python.

Julia also fully supports IPython-style online notebooks, like this:

It also has a beautiful plotting library in Gadfly, aside from being able to use Matplotlib really easily:

All in all, even though Julia is still a bit too young to be fully considered for production code, it is an extremely promising project that I would watch really closely.


  1. Thanks Jules, I liked your post. I heard first about Julia from Mostafa while he was visiting us for few weeks. Sounds interesting to follow... I will follow the Julia evolution too.

  2. interesting! At least we know such a thing exists!