Converting Python Files to Cython

τэкnoкraτ

We make our prototypes in Python. The point of a prototype is that it can be produced quickly. However, fast development sometimes conflicts with fast execution, and we need to rewrite parts in C. Recently, instead of C, I tried Cython as a way to convert Python programs partially to C.

Suppose we have a normal Python program that works without error on Linux. (On Windows a similar approach should work with MinGW or Visual C.) The first step is to save example.py as example.pyx and create a Makefile like the following:

``` {.sourceCode .make}
all: cfiles compile

cfiles:
	cython -a example.pyx

compile:
	gcc -g -O2 -fpic -c example.c -o example.o `python3-config --includes`
	gcc -g -O2 -shared -o example.so example.o `python3-config --libs`

clean:
	rm -f example.c *.o *.so
```

Save this as `Makefile` (the recipe lines must be indented with tabs, not spaces) and now you should be able to run

``` {.sourceCode .bash}
$ make
```

in the directory that contains the Makefile and example.pyx, and get example.so as an importable Python module. Note that anything that ran before you saved the file with the .pyx extension should still work now.
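The `python3-config` calls in the Makefile ask the interpreter where its headers and libraries live. As a rough cross-check, and assuming the standard `sysconfig` module reports the same locations on your system, a short sketch like this prints comparable values. The helper name `build_flags` is made up for illustration.

```python
import sysconfig


def build_flags():
    """Return (include_flags, ext_suffix) for the running interpreter."""
    paths = sysconfig.get_paths()
    # "include" and "platinclude" are often the same directory;
    # keep their order but drop the duplicate.
    dirs = list(dict.fromkeys([paths["include"], paths["platinclude"]]))
    include_flags = " ".join("-I" + d for d in dirs)
    # Suffix the interpreter uses for its own extension modules.
    ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return include_flags, ext_suffix


if __name__ == "__main__":
    flags, suffix = build_flags()
    print("includes:", flags)
    print("extension suffix:", suffix)
```

If the include flags printed here differ from what `python3-config --includes` gives, the Makefile is probably picking up a different Python installation than the one you run.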

You can import the module as normal.

``` {.sourceCode .python}
import example
```
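If the original example.py is still lying around next to example.so, it may not be obvious which file the import picks up. A minimal sketch for checking this with the standard `importlib` machinery follows; `loaded_from_extension` is a hypothetical helper name, not part of Cython.

```python
import importlib.machinery
import importlib.util


def loaded_from_extension(module_name):
    """Return True if `module_name` resolves to a compiled extension file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin:
        # Module not found, or it has no file behind it.
        return False
    # EXTENSION_SUFFIXES lists the file endings the interpreter accepts
    # for compiled modules (for example ".so" on Linux).
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return spec.origin.endswith(suffixes)
```

After a successful `make`, `loaded_from_extension("example")` should return `True` when run from the directory that holds example.so.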

According to the Cython docs, this trivial conversion should yield a 5%
increase in performance. However, this is not the aim of Cython.
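To see what the conversion buys on your own code, you can time the same function before and after compiling. This is only a sketch built on the standard `timeit` module; `best_time` and `speedup` are hypothetical helper names, and the repeat counts are arbitrary defaults.

```python
import timeit


def best_time(func, *args, repeat=5, number=1000):
    """Best per-call time in seconds over `repeat` runs of `number` calls."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def speedup(old_func, new_func, *args):
    """How many times faster `new_func` is than `old_func` on the same input."""
    return best_time(old_func, *args) / best_time(new_func, *args)
```

For instance, with the pure Python module imported as `example_py` and the compiled one as `example` (both names are assumptions), `speedup(example_py.some_function, example.some_function, data)` gives the ratio for one function.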

Suppose we have some expensive calculations that we run over and over
in loops. We have code like this in one of our *feature comparison*
modules. It compares lines in a figure by looking at their length, angle
and midpoint relative to a center.

Previously the code calculated all of this in Python, and the tests (on a
large dataset) took days. Then I converted the *two functions* that make the
actual comparison, as follows.

``` {.sourceCode .cython}
import cv2
import numpy as np
from . import dtw
from pyemd import emd

# Call the C math functions directly instead of Python's math module.
cdef extern from "math.h":
    double log(double)
    double fabs(double)
    double sqrt(double)
    double pow(double, double)

# C-level helper: weighted sum of the length, midpoint and angle differences.
cdef _cmp_length_midpoint_angle(double a_len,
                                int a_mp_x,
                                int a_mp_y,
                                double a_angle,
                                double b_len,
                                int b_mp_x,
                                int b_mp_y,
                                double b_angle):
    len_diff = fabs(log(a_len) - log(b_len))
    mid_diff = 4 * sqrt(pow(a_mp_x - b_mp_x, 2) + pow(a_mp_y - b_mp_y, 2))
    ang_diff = 2 * fabs(a_angle - b_angle)
    return len_diff + mid_diff + ang_diff


# C-level helper: same comparison without the midpoint term.
cdef _cmp_length_angle(double a_len,
                       double a_angle,
                       double b_len,
                       double b_angle):
    len_diff = fabs(log(a_len) - log(b_len))
    ang_diff = 2 * fabs(a_angle - b_angle)
    return len_diff + ang_diff

# Python-visible wrappers: unpack (length, (mid_x, mid_y), angle) tuples.
def cmp_length_midpoint_angle(a, b):
    return _cmp_length_midpoint_angle(a[0], a[1][0], a[1][1], a[2],
                                      b[0], b[1][0], b[1][1], b[2])

def cmp_length_angle(a, b):
    return _cmp_length_angle(a[0], a[2], b[0], b[2])
```

In the previous version, cmp_length_midpoint_angle and cmp_length_angle were standard Python functions. The conversion took about half an hour and cut the running time by more than half. It paid off within the first few hours.
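The original Python versions are not shown here. As a sketch of what they may have looked like, assuming they computed the same formulas with the standard `math` module, the following pure Python functions can serve as a reference to check the compiled ones against. The `py_` prefix is added here only to keep the names distinct.

```python
import math


def py_cmp_length_midpoint_angle(a, b):
    """a and b are (length, (mid_x, mid_y), angle) tuples."""
    len_diff = math.fabs(math.log(a[0]) - math.log(b[0]))
    mid_diff = 4 * math.sqrt((a[1][0] - b[1][0]) ** 2 +
                             (a[1][1] - b[1][1]) ** 2)
    ang_diff = 2 * math.fabs(a[2] - b[2])
    return len_diff + mid_diff + ang_diff


def py_cmp_length_angle(a, b):
    """Same comparison without the midpoint term."""
    len_diff = math.fabs(math.log(a[0]) - math.log(b[0]))
    ang_diff = 2 * math.fabs(a[2] - b[2])
    return len_diff + ang_diff
```

One difference to keep in mind when comparing results: the Cython signature declares the midpoint coordinates as `int`, while this sketch uses whatever numeric type is passed in, so the two only agree exactly for integer midpoints.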

Cython is fantastic.


