Sunday, December 20, 2009

How to screw the Avatar 3D experience and alienate customers

How exactly might one go about doing that? After all, it is James Cameron's MASTERPIECE, which will redefine CGI forever, a worthy sequel to Titanic.

Here's how Odeon, Connaught Place, a proud member of Anil Ambani's empire, went about it.

You have a cinema screen with totally screwed up contrast ratio, color range that makes ugly cam prints look good and a sound system worse than a cheap pair of earphones.

And you charge people 200/300 bucks for it. Not to mention the 10x priced food to go with it.



Sunday, November 1, 2009

Mmap-ing temporary files

There is a cool way to allocate memory without actually doing malloc. It uses temporary files.

Here's how to do it:

1) Make a temporary file, and then use fileno() to get its file descriptor.

2) Seek to the end of the region which you want mmap-ed and then write a dummy byte (just a "") to make the file that size.

3) Actually mmap this file with that same size.
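
The three steps above can be sketched as follows. This is a minimal illustration in Python using only the standard library (tempfile and mmap), not the C code the steps describe; the function name map_temp_file and the 4096-byte size are made up for the example.

```python
import mmap
import tempfile


def map_temp_file(size):
    """Back `size` bytes of memory with an anonymous temporary file.

    Returns (file_object, mapping); the caller closes both when done.
    """
    # Step 1: make a temporary file and get its file descriptor.
    f = tempfile.TemporaryFile()
    fd = f.fileno()

    # Step 2: seek to the last byte of the region and write one dummy
    # byte, so that the file really is `size` bytes long.
    f.seek(size - 1)
    f.write(b"\0")
    f.flush()

    # Step 3: map the whole file into memory.
    buf = mmap.mmap(fd, size)
    return f, buf


f, buf = map_temp_file(4096)
buf[0:5] = b"hello"           # use it like any other writable buffer
print(len(buf), bytes(buf[0:5]))
buf.close()
f.close()
```

Closing the mapping and the file is all the cleanup needed here, since the temporary file goes away with its last handle.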

Why bother with all this, you ask? It is useful when C land and numpy need to share memory without leaking it. This can be achieved by using the numpy.memmap class and then passing in the numpy array to obtain the pointer.

This will in general need some amount of reworking in your library's memory allocation routine, but avoids the troubles associated with making sure that the C object is deallocated after the associated numpy object is freed.

Saturday, October 31, 2009

I wave, you wave, we all wave

I just got an invite for Google Wave, but there aren't many people around at the moment. Also, I can't see any "invite other folks" link either. Not bad for a 100th post, eh..?

Tuesday, October 27, 2009

Tweeting from python

Stumbled across this. Really cool. Now I can send myself a private tweet to keep track of my long running jobs, or just about any damn thing. I don't have to be present at the keyboard to monitor the progress for this.

The only problem is this: no proxy support. :(

Hopefully, it'll materialize soon.

Saturday, October 24, 2009

Parallel nirvana on the cheap

I just had a production run for this, it took 48 hours. Now time for some parallel goodness from python side. :) So, after much experimentation and frustration, here is a great module, and here is some great advice.

Of course, you can get it in C/C++ land too, just use this.

Saturday, October 17, 2009

Tegra2 netbooks

It seems Tegra2 based netbooks will come out next year. If it has a dual core ARM Cortex A9 running at 1.5 GHz or more, it'll probably be faster than my present laptop in CPU power. If it has a GPU based on the 9400M chipset, it'll be faster than the GPU in my laptop too, while supporting CUDA, OpenGL 3.2, OpenCL 1.1, WebGL etc. Not to mention that it'll have dedicated hardware for video decode acceleration as well. Tegra1 devices are supposed to be capable of playing back HD video for over 10 hours on a single battery charge. Hell, if this thing can give me 6 hours of HD video playback, I am gonna love it. On a smaller process, it should get more power efficient as well, but that is probably asking for too much. Considering the amount of integration it has, a netbook based on it will likely be cheaper as well... :) :)

If this thing comes with 2GB ram, then I am definitely gonna pick one up. With Chrome OS, it could prove mighty useful as well.

4GB ram is highly unlikely since ARM doesn't have a 64 bit CPU core out as yet, AFAIK. But Tegra3, :D

So my wishlist for Tegra2 would be,

1. Dual core ARM Cortex A9 with NEON at 2GHz (it'll be fun to write a vector backend for eigen on ARM)
2. 9400M GPU
3. hardware accelerated video decode for H.264, VC1, MPEG-4, DivX
4. 2GB RAM
5. minimum 6 hour battery life while playing 1080p HD video
6. nVidia supporting Chrome OS on it. (Let's hope Chrome OS will come with some nice hacking tools too, or at least that there will be some community support around building distributions that allow hacking it.)

Friday, October 16, 2009

Getting colors in git output on opensuse 11.1

I just installed opensuse 11.1 on my lab machine. Git was behaving very oddly here, displaying raw color codes instead of colored output with git-diff and other commands.

A bit of googling later, I found this, exactly the same issue, but on Mac OS X. It was useful to me. Problem fixed.

Sunday, October 11, 2009

Broad comparison of Larrabee and AMD and nVidia GPUs

Jawed, from B3D, in an excellent post describes the broad architectural features of Larrabee and GPUs from nVidia and AMD. Worth a read for anyone who is interested in the high performance hardware of tomorrow and for those who are looking to tap into the cheap teraflops of these beasts.

Wednesday, October 7, 2009

Looks like I did it

It seems that I have managed to graft a few changes onto my magnum opus successfully.

I am just waiting for second opinion and then, merge to master. :)

It's a bit slower than what I'd prefer, need to profile it.

Monday, September 28, 2009

Dropbox, file syncing service

Somebody recently recommended to me the dropbox file sharing service. It is really cool. You just see a local folder, you make changes to it and it is auto synced across multiple machines.

They provide an opensource plugin for nautilus, and a closed source daemon to actually talk to their servers. You install the opensource plugin, and then do a dropbox start -i, and the binary daemon is downloaded and installed. After that, it could not be simpler to use.

The package (even the opensource part) will never be part of fedora proper, and here's why. But yeah, it could be a useful inclusion in rpmfusion.

Saturday, September 19, 2009

Python extensions, wrapping with swig and bug hunting

If you use swig to wrap your c/c++ code into a python extension, you will sometimes come across bugs when you try to import your freshly minted module.

Errors which go something like,

_mod, attribute referenced before assignment.

They can be pretty nasty to hunt, since you don't know if the bug is in the python interpreter (HOLY COW! that's not where I tweeted ;) ), swig (please no!!) or your own code (damn, I'll have to debug it instead of passing it off on someone else..).

Chances are that the error is in your code. And if you get an error like the one shown above, then the python interpreter is throwing an error while importing your extension. To debug it, try the following,

1) if your module is named FOO, then do

import _FOO

as you have to prepend the underscore for the real module while swig sugarcoats it to make it look a little less ugly.

There is a very good chance that it is happening because there is a function in your code which is unknown to swig or the python interpreter is attempting to call a function that does not exist.

The path from now on is a little harder.

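2) run objdump on the compiled extension (assumed here to be named _FOO.so; substitute whatever your build produced)

objdump -dS _FOO.so > some-text-file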

And dive into that disassembly looking for the offending function. This isn't for the faint-hearted, but it is pretty straightforward, quick and simple if you are used to it. :)
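
Step 1 above can also be scripted. The helper below is only a sketch: the name import_raw_extension is made up, and FOO is the same placeholder module name used in the steps. It imports the underscore-prefixed module directly and hands back the interpreter's own error message, which usually names the missing symbol or file.

```python
import importlib


def import_raw_extension(name):
    """Import the low-level module that swig generates for `name`.

    Returns (module, None) on success or (None, error_text) on failure.
    """
    raw_name = "_" + name  # swig's proxy module FOO wraps the real _FOO
    try:
        module = importlib.import_module(raw_name)
    except ImportError as exc:
        # The loader's complaint (missing file, undefined symbol, ...)
        # ends up in this message.
        return None, "{}: {}".format(type(exc).__name__, exc)
    return module, None


module, error = import_raw_extension("FOO")
if error is not None:
    print("import of _FOO failed:", error)
```

If the message mentions an undefined symbol, that symbol is the name to search for in the objdump output from step 2.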

Happy bug hunting

Friday, September 18, 2009

Dual quaternions in eigen.

I have been trying to add dual quaternion support (mainly for rigid transformations) to my project. That is done, but now I am trying to push it into eigen. To follow the discussion, head here and there.

Friday, September 11, 2009

Benefits of flatter/more balanced expression trees

I am rewriting (partly) my magnum opus. This time, I have overloaded operators of a simple class wrapping the __m128i datatype. This allows me to generate much flatter expression trees, and boy, the code generated is fantastic. I have never seen such awesome assembly code generated by gcc from my code.

Here is the assembly dump for the brave and/or the foolish :)

0000000000400ba0 <_z11addressegenpkspkis0_s0_s0_s0_ssss>:
400ba0: 66 44 0f 6e 5c 24 08 movd 0x8(%rsp),%xmm11
400ba7: 44 0f bf 54 24 20 movswl 0x20(%rsp),%r10d
400bad: 66 0f 6e 74 24 10 movd 0x10(%rsp),%xmm6
400bb3: 4c 8b 1d 06 18 20 00 mov 0x201806(%rip),%r11 # 6023c0
400bba: 66 45 0f 61 db punpcklwd %xmm11,%xmm11
400bbf: 66 0f 61 f6 punpcklwd %xmm6,%xmm6
400bc3: 66 44 0f 6e 74 24 18 movd 0x18(%rsp),%xmm14
400bca: 41 f7 da neg %r10d
400bcd: 66 45 0f ef ff pxor %xmm15,%xmm15
400bd2: 66 41 0f 70 e3 00 pshufd $0x0,%xmm11,%xmm4
400bd8: 66 0f 70 de 00 pshufd $0x0,%xmm6,%xmm3
400bdd: 66 44 0f 6e 6c 24 20 movd 0x20(%rsp),%xmm13
400be4: 66 45 0f 61 f6 punpcklwd %xmm14,%xmm14
400be9: 66 41 0f fd 63 40 paddw 0x40(%r11),%xmm4
400bef: 66 45 0f 61 ed punpcklwd %xmm13,%xmm13
400bf4: 66 44 0f 6f d4 movdqa %xmm4,%xmm10
400bf9: 66 41 0f fd 5b 50 paddw 0x50(%r11),%xmm3
400bff: 44 89 54 24 d4 mov %r10d,-0x2c(%rsp)
400c04: 66 44 0f 6f cb movdqa %xmm3,%xmm9
400c09: 66 0f 6e 44 24 d4 movd -0x2c(%rsp),%xmm0
400c0f: 66 41 0f 70 ee 00 pshufd $0x0,%xmm14,%xmm5
400c15: 66 41 0f 6f 53 70 movdqa 0x70(%r11),%xmm2
400c1b: f3 44 0f 10 f8 movss %xmm0,%xmm15
400c20: 66 45 0f 70 e5 00 pshufd $0x0,%xmm13,%xmm12
400c26: 66 44 0f db ca pand %xmm2,%xmm9
400c2b: 66 45 0f 61 ff punpcklwd %xmm15,%xmm15
400c30: 66 44 0f 6f f3 movdqa %xmm3,%xmm14
400c35: 66 44 0f db d2 pand %xmm2,%xmm10
400c3a: 66 41 0f fd ec paddw %xmm12,%xmm5
400c3f: 66 41 0f 71 f1 02 psllw $0x2,%xmm9
400c45: 66 44 0f 6f e4 movdqa %xmm4,%xmm12
400c4a: 66 41 0f 71 d6 0f psrlw $0xf,%xmm14
400c50: 66 41 0f 70 ff 00 pshufd $0x0,%xmm15,%xmm7
400c56: 66 0f 6f f3 movdqa %xmm3,%xmm6
400c5a: 66 41 0f 71 f2 04 psllw $0x4,%xmm10
400c60: 66 45 0f eb d1 por %xmm9,%xmm10
400c65: 66 41 0f 71 d4 0f psrlw $0xf,%xmm12
400c6b: 66 41 0f ef 7b 60 pxor 0x60(%r11),%xmm7
400c71: 66 0f fd ef paddw %xmm7,%xmm5
400c75: 66 44 0f 6f c5 movdqa %xmm5,%xmm8
400c7a: 66 44 0f 6f ed movdqa %xmm5,%xmm13
400c7f: 66 0f 6f fd movdqa %xmm5,%xmm7
400c83: 66 44 0f db c2 pand %xmm2,%xmm8
400c88: 66 41 0f 71 d5 0f psrlw $0xf,%xmm13
400c8e: 66 0f 6f d5 movdqa %xmm5,%xmm2
400c92: 66 45 0f eb d0 por %xmm8,%xmm10
400c97: 66 0f 71 d2 04 psrlw $0x4,%xmm2
400c9c: 66 44 0f 6f c6 movdqa %xmm6,%xmm8
400ca1: 44 0f 29 17 movaps %xmm10,(%rdi)
400ca5: 0f 29 22 movaps %xmm4,(%rdx)
400ca8: 0f 29 19 movaps %xmm3,(%rcx)
400cab: 41 0f 29 28 movaps %xmm5,(%r8)
400caf: 48 8b 05 0a 17 20 00 mov 0x20170a(%rip),%rax # 6023c0
400cb6: 66 0f 6f 80 a0 00 00 movdqa 0xa0(%rax),%xmm0
400cbd: 00
400cbe: 66 44 0f 6f b8 90 00 movdqa 0x90(%rax),%xmm15
400cc5: 00 00
400cc7: 66 0f f9 c3 psubw %xmm3,%xmm0
400ccb: 66 0f 71 d0 0f psrlw $0xf,%xmm0
400cd0: 66 44 0f db f0 pand %xmm0,%xmm14
400cd5: 66 0f 6f c2 movdqa %xmm2,%xmm0
400cd9: 66 44 0f f9 fc psubw %xmm4,%xmm15
400cde: 66 41 0f 71 d7 0f psrlw $0xf,%xmm15
400ce4: 66 45 0f db e7 pand %xmm15,%xmm12
400ce9: 66 0f 6f 88 b0 00 00 movdqa 0xb0(%rax),%xmm1
400cf0: 00
400cf1: 66 0f 71 f0 06 psllw $0x6,%xmm0
400cf6: 66 0f f9 cd psubw %xmm5,%xmm1
400cfa: 66 45 0f db e6 pand %xmm14,%xmm12
400cff: 66 0f 71 d1 0f psrlw $0xf,%xmm1
400d04: 66 44 0f db e9 pand %xmm1,%xmm13
400d09: 66 0f 6f ec movdqa %xmm4,%xmm5
400d0d: 66 0f ef c9 pxor %xmm1,%xmm1
400d11: 66 0f 71 d5 04 psrlw $0x4,%xmm5
400d16: 66 45 0f db e5 pand %xmm13,%xmm12
400d1b: 66 0f 61 e9 punpcklwd %xmm1,%xmm5
400d1f: 66 0f 61 c1 punpcklwd %xmm1,%xmm0
400d23: 66 44 0f eb a0 80 00 por 0x80(%rax),%xmm12
400d2a: 00 00
400d2c: 45 0f 29 21 movaps %xmm12,(%r9)
400d30: 4c 8b 1d 89 16 20 00 mov 0x201689(%rip),%r11 # 6023c0
400d37: 66 45 0f 6f 8b c0 00 movdqa 0xc0(%r11),%xmm9
400d3e: 00 00
400d40: 66 41 0f db f9 pand %xmm9,%xmm7
400d45: 66 45 0f db c1 pand %xmm9,%xmm8
400d4a: 66 44 0f db cc pand %xmm4,%xmm9
400d4f: 66 0f 6f e3 movdqa %xmm3,%xmm4
400d53: 66 0f 71 d7 02 psrlw $0x2,%xmm7
400d58: 66 0f 71 d4 04 psrlw $0x4,%xmm4
400d5d: 66 0f 61 e1 punpcklwd %xmm1,%xmm4
400d61: 66 41 0f 71 f1 02 psllw $0x2,%xmm9
400d67: 66 45 0f eb c1 por %xmm9,%xmm8
400d6c: 66 44 0f eb c7 por %xmm7,%xmm8
400d71: 66 44 0f 61 c1 punpcklwd %xmm1,%xmm8
400d76: 44 0f 29 06 movaps %xmm8,(%rsi)
400d7a: 4c 8b 1d 3f 16 20 00 mov 0x20163f(%rip),%r11 # 6023c0
400d81: 0f 29 6c 24 e8 movaps %xmm5,-0x18(%rsp)
400d86: 41 8b bb d0 00 00 00 mov 0xd0(%r11),%edi
400d8d: 0f af 7c 24 e8 imul -0x18(%rsp),%edi
400d92: 01 3e add %edi,(%rsi)
400d94: 48 8d 7e 04 lea 0x4(%rsi),%rdi
400d98: 41 8b 8b d4 00 00 00 mov 0xd4(%r11),%ecx
400d9f: 0f af 4c 24 ec imul -0x14(%rsp),%ecx
400da4: 01 0f add %ecx,(%rdi)
400da6: 41 8b 93 d8 00 00 00 mov 0xd8(%r11),%edx
400dad: 48 8d 4e 08 lea 0x8(%rsi),%rcx
400db1: 0f af 54 24 f0 imul -0x10(%rsp),%edx
400db6: 0f 29 44 24 d8 movaps %xmm0,-0x28(%rsp)
400dbb: 01 11 add %edx,(%rcx)
400dbd: 48 8d 56 0c lea 0xc(%rsi),%rdx
400dc1: 45 8b 93 dc 00 00 00 mov 0xdc(%r11),%r10d
400dc8: 44 0f af 54 24 f4 imul -0xc(%rsp),%r10d
400dce: 44 01 12 add %r10d,(%rdx)
400dd1: 0f 29 64 24 e8 movaps %xmm4,-0x18(%rsp)
400dd6: 45 8b 8b e0 00 00 00 mov 0xe0(%r11),%r9d
400ddd: 44 0f af 4c 24 e8 imul -0x18(%rsp),%r9d
400de3: 44 01 0e add %r9d,(%rsi)
400de6: 44 8b 4c 24 d8 mov -0x28(%rsp),%r9d
400deb: 45 8b 83 e4 00 00 00 mov 0xe4(%r11),%r8d
400df2: 44 0f af 44 24 ec imul -0x14(%rsp),%r8d
400df8: 44 01 07 add %r8d,(%rdi)
400dfb: 41 8b 83 e8 00 00 00 mov 0xe8(%r11),%eax
400e02: 0f af 44 24 f0 imul -0x10(%rsp),%eax
400e07: 01 01 add %eax,(%rcx)
400e09: 45 8b 93 ec 00 00 00 mov 0xec(%r11),%r10d
400e10: 44 0f af 54 24 f4 imul -0xc(%rsp),%r10d
400e16: 44 01 12 add %r10d,(%rdx)
400e19: 0f 29 44 24 e8 movaps %xmm0,-0x18(%rsp)
400e1e: 44 01 0e add %r9d,(%rsi)
400e21: 44 8b 44 24 ec mov -0x14(%rsp),%r8d
400e26: 44 01 07 add %r8d,(%rdi)
400e29: 8b 74 24 f0 mov -0x10(%rsp),%esi
400e2d: 01 31 add %esi,(%rcx)
400e2f: 8b 44 24 f4 mov -0xc(%rsp),%eax
400e33: 01 02 add %eax,(%rdx)
400e35: c3 retq
400e36: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
400e3d: 00 00 00

To be fair, I am comparing gcc 4.3 to gcc4.4.

The compiler runs out of steam at around 400d86, but still this is great. I am using eigen after that. I have no idea why the operations have not been vectorized. Need to look into that as well.
But still, +1 to this.

EDIT: The vectorization for the code after 400d86 can be fixed if you follow this and this.
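
As a rough illustration of why a flatter tree can help, here is a toy model in Python. It is not the __m128i wrapper class from this post; the Node class and both helper functions are made up for the example. It builds the same eight-term sum once as a left-leaning chain and once as a balanced tree, and compares their depths. Depth stands in for the longest chain of operations that must run one after another.

```python
from functools import reduce


class Node:
    """A leaf (named operand) or the sum of two sub-expressions."""

    def __init__(self, left=None, right=None, name=None):
        self.left, self.right, self.name = left, right, name

    def __add__(self, other):
        # Overloaded operator: record the operation instead of computing it.
        return Node(self, other)

    def depth(self):
        # Leaves have depth 0; an addition is one level above its deeper child.
        if self.name is not None:
            return 0
        return 1 + max(self.left.depth(), self.right.depth())


def chain_sum(terms):
    """((t0 + t1) + t2) + ... : every addition waits on the previous one."""
    return reduce(lambda a, b: a + b, terms)


def balanced_sum(terms):
    """Split the terms in half recursively so the halves are independent."""
    if len(terms) == 1:
        return terms[0]
    mid = len(terms) // 2
    return balanced_sum(terms[:mid]) + balanced_sum(terms[mid:])


leaves = [Node(name="x%d" % i) for i in range(8)]
print("chain depth:   ", chain_sum(leaves).depth())
print("balanced depth:", balanced_sum(leaves).depth())
```

Both trees contain the same seven additions; the balanced one simply leaves more of them independent of each other, which is the kind of freedom a compiler's scheduler can exploit.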

Thursday, September 3, 2009

AMD's profiler for linux

I had no idea that there was an RPM for AMD's code analyst sitting in fedora's repositories until I came across this page.

I immediately got hold of the package and now it is sitting nicely on my laptop.

Time to grind my software against it. :)

Sunday, August 30, 2009

[update] short support in eigen

There was recently some good news on the vectorization for short operations front. I may be able to do it with a little less work then. If you want to follow the work that's being done on this, follow this thread. Help welcome, of course!

Now we have native 64 bit builds for chromium in fedora. Thanks to the chromium team and Tom, who has been maintaining those builds. This post was written using 64 bit chromium on fedora.

Monday, August 24, 2009

Adding short support to eigen

I have started work on adding short support to eigen. I think this would help me a lot as I don't need an int in my work. Short should be just fine. You can follow the progress of my work here. There is no vectorization support yet, but it should be able to generate scalar code just fine. For more in depth discussion on this, follow this thread.

Writing unit tests is up next.

Sunday, August 16, 2009

Emacs 23

Is it just me, or does emacs 23 (the one that ships with fedora, not ubuntu) look kinda cool?

BTW, it broke cuda mode, dunno why? :(

Google Wave goes live on 30th sept.

Sept is shaping up to be an exciting month. AMD's dx11 class GPUs launch on sept. 10, and Google Wave will go live on the 30th. It seems that Wave will be by-invitation-only, just like gmail was initially. :(

But still, let's hope it will open up faster.

Thursday, August 6, 2009

Installing AMD's opencl implementation for CPU

I just installed AMD's implementation of OpenCL for its CPUs on Fedora 11, x86-64. Here's how to do it.

1. Download the SDK from here and unpack it.

1A. Move the decompressed archive into the place where you want to install it. I chose /home/rpg/bin. I don't like binary crap arbitrarily polluting my system directories and settings.

2. Enter the directory of the sdk and do


You could use make -j2 or even make -j4 if you have multiple cores to burn.

3. Now you need to add a few places to your paths. Add this to your .bashrc and .bash_profile files. Both of them should be in your home folder.




The installation notes provided by SDK have some typos. For a start, it is lib and not bin and you add different folders for different architectures. And you definitely don't need root access for this.

Lucky me though, I have an SSE3 capable CPU. It is ironic to note that the inventor of x86-64 is leaving behind the classic x86-64. I wonder why they felt SSE3 was mandatory.

After all this, this is what I get to see,

$ ./BlackScholes
For test only: Expires on Wed Sep 30 00:00:00 2009

0.118873 7.04261 0 0.00272289 0 33.9944 5.90269 0.0174322 15.6142 31.8218 0.208182 31.4897 36.4303 4.70751 0 0 0.180582 0 23.1549 0 56.4034 19.635 38.0241 0.00524368 2.9981 52.4114 33.1443 50.3665 0 36.6461 0.167021 0.00245133

9.18383 0.216879 56.2858 23.5745 51.5608 3.39307e-06 0.0396819 14.4737 0.00709584 0 5.18874 0 0 3.79125 63.0257 49.3472 17.1783 51.754 0.191513 16.1561 0 0 0 29.3654 0.00833774 0 0 4.21006e-06 17.8919 6.60636e-05 4.04936 31.1893

Option samples   Time taken(sec)   Options / sec
4096             1.565             2617.25

The exact paths have been scrubbed from the output here though. Enjoy!

Monday, August 3, 2009

Chrome working again

Chrome is working again. Inside IITB. :)

Monday, July 13, 2009

Going home

I am writing this from inside the great firewall of China. :)

Thursday, July 9, 2009

No chrome for me :(

I knew this was coming, but I didn't think it would come so fast and in this way. Latest svn build just hit a SNAFU. Ah, the perils of living on the bleeding edge. Thank God, I still have firefox. And 3.5 is pretty good with its speed too.

Monday, July 6, 2009

Compiling open64 with gcc4.4

After this, God knows what got into me and I tried to compile open64 with gcc4.4 that comes with fedora 11. There were a few bugs along the way. I included cstring, cstdlib here and there and they were fixed. The last one that I got stuck on was in c-parse.c. Now I am officially giving up on my install-as-many-compilers-as-you-can binge. LLVM should be fine, for now.

EDIT0: Ok, I broke my promise to myself. I installed it using the binary rpm. :)

Installing LLVM-GCC in fedora 11, 64 bit

Fedora comes with a prepackaged llvm. To install that, you just need to do yum install llvm. But to compile anything with llvm, you need to install llvm-gcc. Here's how to do it.

1. install llvm by

$ yum install llvm

2. download llvm-gcc source code from here. I downloaded llvm-gcc4.2-2.5.source.tar.gz. Let's say you downloaded it to /path/to/llvm-gcc

3. now extract the tarball by

$ cd /path/to/llvm-gcc
$ tar -zxvf llvm-gcc4.2-2.5.source.tar.gz

4. make temp directories for compilation

$ cd /path/to/llvm-gcc/llvm-gcc4.2-2.5.source
$ mkdir install;mkdir obj

5. now comes the fun part of customizing everything to suit your needs. I did

$ cd obj
$ ../configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/usr --enable-languages=c,c++,fortran --disable-multilib

The extras here are fortran, disabled multilib for native x86-64 builds and the prefix directory for installation.

6. Now compile

$ make -j2

On your machine, it might be a little different depending on how many cpus you can spare for compilation.

7. Finally, install it with

$ make install

Now all your stuff will be installed in the install/bin directory of the source tree, that is /path/to/llvm-gcc/llvm-gcc4.2-2.5.source/install/bin, since --prefix pointed one level up from the obj directory where configure was run. You may place symlinks at your preferred places.

Visit to Alcatraz

Getting to San Francisco: $3.70

Ferry to Alcatraz, $26.00

Having your camera battery die on you after you get there: Priceless

Saturday, July 4, 2009

No chrome, :(

Have a look at this.

I will have to give up chrome when I return to IITB. I am disappointed, but hopefully, this feature will get implemented soon.

Thursday, July 2, 2009

More updates

After falling to almost 30, my to-watch list of movies has swelled to 42 again. :) Time to go back to India is approaching too. Mumbai would be as usual swamped with rains, no doubt. :(

Wednesday, July 1, 2009

Chrome on Fedora 11, 64 bit

I just installed google chrome using this. Even though I am running a 64 bit version. Install using yum and it will run perfectly. I am all over this thing. It is awesome, it runs like a rocket, GAAAWWWD. Every one should try this. Thanks for making this repo available. This thing is going to kick ass when done in gentoo. !!! Thank God for the competition.

EDIT0: I wrote this using chrome. The repo maintainer seems to be updating his repo fairly quickly (for now at least).

EDIT1: As correctly pointed out here, the rpms are not native 64 bit binaries. They are 32 bit ones, which work correctly on 64 bit systems too.

Friday, June 19, 2009

Why I love Python

Yesterday, my advisor was talking to me. I was telling him about the progress we have made.

Me: This is a python script to drive every thing

Advisor: What is Python?

M: It is a scripting language.

... 5 minutes later

A: Let me look into the script.

.....He reads for a minute or two.

A: Now I see why people program in Python. I can actually understand it.


Monday, June 15, 2009


We lost. :(

But honestly though, we didn't deserve to win considering the way we played. 14 wides, top 4 playing like they are holidaying on a tropical beach, paid for by God knows who.

But it keeps up the pattern we have been seeing since 1987. Did well, reached semis. 1991, a disaster. 1996, again reached semis and 1999, made a hash of it all. 2003, reached the finals and in 2007, suffered humiliation at the hands of Bangladesh. In 2007 T20, we actually won for a change, and in 2009..........

We'll lose to SA too on Tuesday, rest assured.

Knives are going to be out for Dhoni and co now, considering the Sehwag fiasco.

Tuesday, June 9, 2009

Reset signals

I had quite a moment today. I have been banging my head against a problem for quite some time now. Today, it got solved. I had generous help from Michal in it BTW.

The problem: the mother code uses an active high reset and the module for my chip uses active low. Epic fail!

Friday, June 5, 2009

Software updates and linux distros

Today, just out of curiosity, I did a uname -a and gcc -v on the linux server that connects to my hardware. And horror of horrors, what do I find? Gcc 3.4 and kernel 2.6.8. Python v2.3.

Now come on guys. Update your software at least once a year. I know everyone does not run Gentoo. But still.

Speaking of Gentoo, I must confess that I have been drawn to it. I really need the customizability for some applications/libraries at least. The usual complaints apply, however. I am told (by someone who is a serious linux pro) that gentoo offers too much choice, such as letting you install a bootloader yourself (aka, not by default).

I am coming around to the view of installing Sabayon and using portage to get my optimization-compilation kick. BTW, IMHO, Sabayon is the coolest looking distribution out there. I have downloaded its iso, just in case. I'll probably use the lab machine as the guinea pig in this experiment. :)

Honestly though, I think that with a Core i7, compilation would be much less of a problem. I really like hardware multithreading. I think all new CPUs should have it. :) Wish AMD would go this route as well.

I have been checking out the portage tree over at and it seems that they have done a really thorough job. I mean, apart from llvm, vTune and sage, I could find ebuilds for everything. Even flash, skype, imkl, acml, icc etc. The first two are a mystery, but with sage, I am not surprised. With their philosophy of making a giant static binary (the source tarball is >300M, I think), and bundling all the open source math software ever created (including their shiny python interpreter) into it, I am amazed that somebody managed to get it into Debian. I hope the fedora guys will have some luck as well. But it'll probably always be way behind on updates.

I'll get the videos done by the weekend hopefully. Promise.

Tuesday, June 2, 2009


Ok, the pics got done earlier than I expected. So enjoy. But the videos will probably take more time.

Pics/videos from Maker Faire 2009

I'll be putting up stuff I grabbed from Maker Faire 2009. It's gonna take a while, but I really want to annotate the stuff real well so that onlookers have some context around it. Promise.

Sunday, May 31, 2009

Running skype on fedora 64 bit

I just installed skype on my fedora 10 laptop (64 bit) using this and it works like a charm.

EDIT0: It is broken in fedora 11. I haven't got around to fixing it yet.

Wednesday, May 20, 2009

In Berkeley :)

I am back at the lab (LBNL) now. I have settled down here in the last couple of days. The weather is nice here. Though I have not yet been doing the stuff I had planned to do. Really need to get started on it soon.

Fedora 11 got delayed. :(

Will update this blog soon. Stay tuned.

Tuesday, April 28, 2009

Updates 2

This month, I have been really bad with my blogging. I did very few updates (in fact, this is the second one this month). But now I hope to turn a corner. The exams are over. The project (Sriram and mine) is done. And I am in the mood for some relaxation now. Though I still can't believe we managed to finish our project some 24 hours before the deadline. I mean, when was the last time it happened?

Never, dude.

We borrowed a lot of code from kgllib. It's a fantastically written library of opengl convenience functions. Too tied to Qt for my taste, but the Qt stuff was easy to remove. Initially, we wrote a lot of nice code, but soon after degenerated into bad programming practices. Global variables all over the place, messy code in between. We avoided the deprecated bits of opengl functionality wherever we could, which, IMHO, is a good thing. We had a nice time hunting for segmentation faults, and removing them. :)

Friday, April 17, 2009


1) Exams are going on. But I have a bit of a break here.

2) My IIT life is coming to an end. Not immediately, but yes the end has begun.

3) Our (Sriram and mine) opengl project is coming along. Need to fix a few bugs before we can get it to work. Had a few issues with git while collaborating on it. For now, we are just screwing any optimizations. :)

4) I was recently put up as a contributor to Eigen on their webpage. Seems wonderful to see my name in print. :)

Monday, March 16, 2009

Render to texture

We (that is, Sriram and myself) just got our render to texture facility working. We drew a quad and then generated some funny patterns in the pixel shader. It was fun. My first gpu side hacking. More of it is coming along soon, I promise.

Thursday, March 12, 2009

Haskell and fractals.

I recently came across Fraqtive. Very nice program. And well optimized and well written too, by the looks of it. It automatically takes care of multithreading and vectorization. And it uses some very nice algorithms to handle zoom ins etc. The big news for me came when I stumbled across this and realized that it uses template metaprogramming to generate fast C++ code. Bit more googling and what do I find, Haskell bindings for LLVM!!!

I remember that in Lisp one can get access to the code of the functions to manipulate them. If I can get them in Haskell, it would be great. Generate LLVM code from there, and then run it under a JIT, and after that, send it off to opengl and use shaders to generate nice colors for them....

Maybe I am getting too far ahead of myself. New language, new libraries. Maybe I should start from Python first :) But there's no denying that it would make for a very good learning project in Haskell.

Wednesday, March 11, 2009

Vectorized integer multiplication

Integer operations in vector ISAs rarely get much love (compared to floating point operations), unless they have something to do with video decode-encode. But I needed one for my own work. So I wrote one, and it turned out to be faster than the one in eigen, so it was committed.

Sunday, March 8, 2009

Open source developer

Now I am an official opensource developer. Yay!!!!. Feels great to see some stuff from my side being accepted there. Small beginning, I admit. But hopefully, as we go along, I will contribute more and more. BTW, this community is very friendly and responsive.

I wrote this because I wanted to use quaternions and it not being vectorized is obviously a shame. I have also sent a better routine for vectorized integer multiplication. Let's hope it gets in too. The code is only somewhat faster, but hey something is better than nothing.

Tuesday, March 3, 2009

Good news

Finally, some sanity is restored. Look here for a rant. Let's hope nvidia drivers won't break this.

Wednesday, February 25, 2009


Lots of 'em here today.

I came across this site. Hack LLVM from python. Wow. I really liked it. I even went through some of the tutorials up there which tell you how to hack llvm to make your own language. The lexer and the parser I am obviously going to generate using automated tools (if I ever do it). Making an AST from parsed input is obviously going to take some doing. But if you can get to LLVM IR, after that it is a joy ride.

Another nice thing about his website is that it is made using asciidoc. I immediately recognized it from its similarity to the git documentation. Seems way better suited to generating HTML than LaTeX and you can do nice code formatting and insert math formulas.

And the first preview (to my knowledge) of RV740 just got out here. Seems really good. Good luck AMD. And please don't go bankrupt.

Sunday, February 22, 2009

OpenGL Assignment

It's done. I managed to do the whole thing, even as much as for 150% credit, though it is useful only if I am stuck at the borderline. The mid sem was half a washout, but I think he gives papers like this. Of course, there was no time to optimize anything.

After the grades for this come out, I'll post it here depending on how embarrassing it was. :) Though I hope using Python for it wasn't a problem.

Friday, February 20, 2009


The exams are over. But the assignment I referred to here still needs doing. Thank God I have python. With this sword of Damocles hanging over my head, all thought of optimization, and of writing it as a python extension, has gone miles away.

Let's Rock.

Saturday, February 14, 2009

Exams, again

Midsems have begun, again. And immediately after them I am supposed to do an assignment for which the specs were changed midway. And now, I have to rewrite it all. I have a few ideas, and hopefully, will be able to implement them in time. Thank God for python, where would I be without you. And yes, atm, the usual aim of having it run blazing fast is out of the window.

Thursday, February 5, 2009

POVray videos

I just made a few videos from my POVray demo. I thought I'll put them up here. So here it is, at 800x600 resolution, encoded at 15 fps.


Wetting my feet

It seems a long while ago that I wanted to learn Haskell. And I haven't made any progress yet. But it seems to be changing. I came across this. It's a very nice book at a cursory glance. And I also got this book from the library. It seems to be pretty good as well.

Wednesday, February 4, 2009

POVray assignment done

The POVray assignment I wrote about earlier is now complete. Got some nice pics out of it. Here's one of them.

Now I am rendering them at higher resolution and will be making a movie out of them. Right now I have made an MPEG-4 one but I am looking to encode them in theora format. FFmpeg, I love you. Though this movie was made with mencoder, they both basically use the libavcodec and libavformat libraries.

It is somewhat surprising that in ray tracing too, texture mapping is used extensively.

Anyways, the pics look pretty cool to me.

And yes, I figured out how to get rid of that irritating warning. Here's how.

In the vfe/unix folder of the source tarball for Unix, there's a file called unixconsole.cpp. In it, the following code appears,

if (user_code != current_code)


fprintf(stderr, "%s: this pre-release version of POV-Ray for Unix %s\n",PACKAGE,current_week <>);


Comment it out. Afterwards, it will whine about expiring in a while, but those warnings are harmless; the expiry won't come to pass.

Happy ray tracing.

Friday, January 30, 2009

The road to GLSL: Part II

"Well, where's Part I, dude?" you may ask. Well, it was here, when I realized what I had with me for 2 years. And, in my 50th post, I am happy to tell you that we are nearly there now. Have a look.

[rpg@rpg ~]$ glxinfo|grep NVIDIA
server glx vendor string: NVIDIA Corporation
client glx vendor string: NVIDIA Corporation
OpenGL vendor string: NVIDIA Corporation
OpenGL version string: 2.1.2 NVIDIA 180.25
OpenGL shading language version string: 1.20 NVIDIA via Cg compiler
[rpg@rpg ~]$

No, my prompt isn't like that. I reinstalled Fedora 10 because I was fed up with KDE. Now I have both GNOME and KDE for future proofing. I'll just keep upgrading and use whatever seems better at the time.

The install went horribly though. Long story. Some data loss has occurred too, though I can't say how much. Hopefully, it is in the can-be-easily-mitigated category. My mistake. Should have acted more prudently.

Anyway, now the fun begins.

Wednesday, January 28, 2009

Rant against povray

Recently, we were given an assignment to render some scenes using POVray and demonstrate what we had learned about ray tracing. All right then, let's install it. OK, it may not have an OSI-compliant license, but as the page there states, that's because of when it was born. We are sure to have packages for it in Ubuntu. It's just a simple matter of running

$ sudo apt-get install povray

I couldn't have been more wrong.

This project is quite screwed up as it turns out.

There has been no stable release for more than 4 years now. The project's mailing lists distinctly convey a sense of a lack of developers. But this is the least of the troubles. Sure, I could have just stuck to the old but stable version. But it is not multithreaded.

Yup. That's right. A serial ray tracer.

If I didn't know better, I would have said that was a malapropism (GRE side effects, :-) ). People have been using SMP machines (if not multi-core ones) for a while now, and since POVray has been ported to many different machines before, it is incomprehensible that its developers haven't threaded it yet.

The most irritating thing about it is their concept of beta timeout. Their betas usually expire after a fixed time. And on top of it, these guys can't even release betas in time. Look here.

They can't release the updated betas on time, aren't releasing new stable versions at all and everyone else is supposed to update every month or so.

I somehow managed to patch the vfe.cpp file referenced in the above link. I even managed to render a few things like this.

And, then I went to dinner. What a mistake on my part.

And now, I have to fight this.

~/povray-3.7.0.beta.29/unix@rpg-lab> ./povray
povray: this pre-release version of POV-Ray for Unix has expired

Now only the betacode hack is working, in spite of modding the sources as mentioned in the link above to remove the timeout.

God help me.

Just look at Blender. It has a huge community behind it. Scriptable in Python (yay!), supports GLSL shaders, renders all right, imports/exports from/to every file format imaginable, can interface to other ray tracers as well. I think I am beginning to like it. Further, it can export to anything that ffmpeg supports.

I am running it on a machine with SSE3, but gcc can hardly be expected to make use of any of its horizontal math goodness.

Tuesday, January 27, 2009

GRE over, finally

My GRE is over. Thank God so much. The last few days have been full of nervous tension. But it is over. And it went well.

And now, let's install Fedora 10, and chase shaders, GLSL shaders. And yeah, before that, we need to back up our data, watch some movies, etc.

Thursday, January 22, 2009


Just wanna get over with it. Feeling good about it. Let's see how it goes.

Fingers crossed.

PS: Slumdog Millionaire gets 10 Oscar nominations. A. R. Rahman gets 3 of them. That guy is a genius. He's gonna win at least one of them. I am ecstatic about it.

Thursday, January 15, 2009

Thread Safety

OpenGL ain't thread safe. What a shame.

Turns out few libraries are thread safe. How the hell are average Joe programmers supposed to write multithreaded code when the libraries they rely upon aren't thread safe? Beats me.

Saturday, January 10, 2009

Exams and system calls

I have got my GRE coming up on the 27th. It's really keeping me away from my crazy shaders. I wanted to get into system call level programming earlier, as I find it to be a powerful tool. I looked at the mmap system call first and found it to be very nice.

The GLSL examples I have looked at usually define a read-from-file function which stats a file for its size, allocates memory, reads the file into it and passes the char * to the compiler. Better still, just stat the file to get its size, mmap that many bytes and then just pass the mmap-ed pointer to the compiler!
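A minimal sketch of that idea, assuming a POSIX system. The helper names map_file and unmap_file are made up for illustration and are not part of any GLSL example or library.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only.
 * Returns NULL on failure; on success *len_out holds the file size. */
static const char *map_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    /* fstat gives the size, so no read loop or malloc is needed. */
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: release a mapping obtained from map_file. */
static void unmap_file(const char *p, size_t len)
{
    munmap((void *)p, len);
}
```

One catch: the mapped bytes are not NUL-terminated, so the length has to be handed to the compiler explicitly (glShaderSource takes an array of lengths as its last argument) rather than passing NULL and relying on a terminator. The mapping can be released with unmap_file once the shader has been compiled.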

PPM images lend themselves to even better use. Just open the file, read the header info and close it. Then mmap the file and skip past the header (mmap offsets have to be page-aligned, so map from the start and advance the pointer by the header length), using the size from the prior header read, and voila, you can stream the texture to the GPU from hard disk asynchronously. Other texture loading libraries typically require you to allocate memory temporarily while they read the files. This way, we get rid of memory bugs (which can get nasty, BTW), and this is something we can do easily in the background, an ideal use for multi-threading. And it's simple too, as it requires no inter-thread I/O.
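Here is a rough sketch of the PPM trick, again assuming POSIX. The ppm_map struct and the ppm_map_open / ppm_map_close names are invented for this example. It only handles binary (P6) files with a maxval of 255 and no comment lines in the header, which is an assumption of the sketch and not a limit of the format.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical holder for a mapped PPM image. */
typedef struct {
    void *base;                  /* start of the mapping, needed for munmap */
    size_t map_len;              /* header plus pixel bytes */
    const unsigned char *pixels; /* base + header length, tightly packed RGB */
    int width, height;
} ppm_map;

/* Returns 0 on success, -1 on failure. */
static int ppm_map_open(const char *path, ppm_map *out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    /* Read the header. No trailing whitespace in the format string: exactly
     * one whitespace byte follows maxval, and pixel bytes may look like
     * whitespace too, so it is consumed with a single fgetc. */
    int w, h, maxval;
    int ok = fscanf(f, "P6 %d %d %d", &w, &h, &maxval) == 3
             && w > 0 && h > 0 && maxval == 255
             && fgetc(f) != EOF;
    long header_len = ok ? ftell(f) : -1;
    fclose(f);
    if (header_len < 0)
        return -1;

    size_t map_len = (size_t)header_len + (size_t)w * (size_t)h * 3;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Refuse truncated files: touching pages past EOF would raise SIGBUS. */
    off_t file_len = lseek(fd, 0, SEEK_END);
    if (file_len < 0 || (size_t)file_len < map_len) {
        close(fd);
        return -1;
    }

    /* Map from offset 0 because mmap offsets must be page-aligned. */
    void *base = mmap(NULL, map_len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    out->base = base;
    out->map_len = map_len;
    out->pixels = (const unsigned char *)base + header_len;
    out->width = w;
    out->height = h;
    return 0;
}

static void ppm_map_close(ppm_map *m)
{
    munmap(m->base, m->map_len);
}
```

The pixels pointer could then go to glTexImage2D as GL_RGB / GL_UNSIGNED_BYTE data. PPM rows are tightly packed, so GL_UNPACK_ALIGNMENT would need to be 1 unless the row size happens to be a multiple of 4. The kernel pages the file in as the data is touched, which is the part that can happen on a background thread.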

I know PPM images are too big to be practically used, but hey, OpenGL needs images in that format, and we will give it in that format. Compressed textures, we'll look into, but later.