SWIG is working. My new cache-aware data structure is working. The segmentation faults are gone. It seems faster, too. Good news, you may ask?
Unfortunately, no. While my code was working, I created a new branch in git and left the old version in master. Something seems to have gone wrong in the meantime: today I checked out the master branch and it was identical to my new branch.
This is bad.
Anyway, what I was trying to do was add vectorization. It is not working, and I have no idea why. The portions I actually vectorized report correct results, but naturally other parts got touched too as I converted my data to 16-byte-aligned AoS (array of structures) form. In the meantime, I wanted to go back to the older, working, scalar version.
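For illustration only, here is a minimal sketch of what a 16-byte-aligned AoS element can look like in C11. The names Vec4i, is_aligned16 and sum_vec4i are hypothetical, and the choice of four 32-bit integers per element is an assumption, not the layout of the real code. The aligned load _mm_load_si128 requires a 16-byte-aligned address, so a check like is_aligned16 is one cheap thing to assert when aligned data misbehaves.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type: four 32-bit ints, exactly one SSE register wide.
   _Alignas(16) forces every Vec4i (and every array of them) onto a 16-byte boundary. */
typedef struct {
    _Alignas(16) int32_t v[4];
} Vec4i;

/* Returns 1 if p is 16-byte aligned, which _mm_load_si128 requires. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Lane-wise sum over an array of Vec4i using aligned loads and stores. */
static Vec4i sum_vec4i(const Vec4i *items, size_t n)
{
    __m128i acc = _mm_setzero_si128();
    for (size_t i = 0; i < n; ++i) {
        acc = _mm_add_epi32(acc, _mm_load_si128((const __m128i *)items[i].v));
    }
    Vec4i out;
    _mm_store_si128((__m128i *)out.v, acc);
    return out;
}
```

Heap allocations need the same care: plain malloc is not guaranteed to return 16-byte-aligned memory on every platform, so an aligned allocator would be needed there.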
And now, it's gone.
God knows how much I struggled to get vector multiply working using only SSE2 intrinsics. Penryn-class CPUs have a direct instruction for it. It turns out that SSE3 wasn't so useful after all, since it mainly added floating-point intrinsics.
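For reference, a minimal sketch of the usual SSE2-only workaround, assuming the multiply in question is a lane-wise 32-bit integer multiply (which SSE4.1 on Penryn provides directly as _mm_mullo_epi32). This is a common reconstruction of the idea, not the lost code; mullo_epi32_sse2 and mul4_i32 are hypothetical names. SSE2 only offers _mm_mul_epu32, which multiplies lanes 0 and 2 into two 64-bit products, so the odd lanes have to be shifted down and multiplied separately, then the low halves interleaved back together.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Low 32 bits of each of the four 32-bit products, using SSE2 only.
   The low 32 bits are the same for signed and unsigned inputs. */
static __m128i mullo_epi32_sse2(__m128i a, __m128i b)
{
    /* Products of lanes 0 and 2, each as a 64-bit result. */
    __m128i even = _mm_mul_epu32(a, b);
    /* Shift lanes 1 and 3 down into positions 0 and 2, then multiply those. */
    __m128i odd = _mm_mul_epu32(_mm_srli_si128(a, 4), _mm_srli_si128(b, 4));
    /* Move the low 32 bits of each 64-bit product into lanes 0 and 1. */
    even = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    odd  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));
    /* Interleave back into the original lane order: p0, p1, p2, p3. */
    return _mm_unpacklo_epi32(even, odd);
}

/* Convenience wrapper on plain arrays; unaligned loads keep it safe for any pointer. */
static void mul4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, mullo_epi32_sse2(va, vb));
}
```

Comparing the output of something like mul4_i32 against a plain scalar loop over the same inputs is a quick way to confirm that the vectorized multiply itself is not the broken part.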
Can I have some divine intervention please?