Timer with nanosecond resolution for driving electronics

For my new "coding-with-electronics" hobby, I have studied the datasheet of the Texas Instruments TLC5940. I think I have a fairly accurate picture of what is needed to drive some LEDs (for an RGB LED cube, for example) and of what the code loop would look like.
The problem is that you have to send the data to the chip bit by bit, and for each bit you set another pin high, wait 16 ns, set that pin low, then wait another 16 ns.
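To make the loop concrete, here is a minimal sketch in C of what I have in mind. The names set_pin, delay_ns, PIN_SIN and PIN_SCLK are hypothetical placeholders I made up for illustration, not a real GPIO API, and the pulse counter only exists so the sketch can be checked on a desktop. I am also assuming the bits go out most significant bit first, which is how I read the datasheet.

```c
#include <stdint.h>

/* Hypothetical pin identifiers; real code would map these to GPIO numbers. */
enum pin { PIN_SIN, PIN_SCLK };

/* Stand-ins for platform-specific routines (assumptions, not a real API). */
static unsigned long sclk_pulses;   /* counts rising clock edges, for checking only */
static int pin_level[2];            /* last level written to each pin */

static void set_pin(enum pin p, int level)
{
    if (p == PIN_SCLK && level && !pin_level[p])
        sclk_pulses++;              /* low-to-high transition on the clock pin */
    pin_level[p] = level;
}

static void delay_ns(unsigned ns)
{
    (void)ns;   /* placeholder: this is the sub-microsecond wait I am asking about */
}

/* Shift one 12-bit value out, MSB first (assumed), one clock pulse per bit. */
static void shift_out_12(uint16_t value)
{
    for (int bit = 11; bit >= 0; bit--) {
        set_pin(PIN_SIN, (value >> bit) & 1);  /* present the data bit */
        set_pin(PIN_SCLK, 1);                  /* clock high */
        delay_ns(16);
        set_pin(PIN_SCLK, 0);                  /* clock low */
        delay_ns(16);
    }
}

/* Fill a chain of chips: 16 outputs of 12 bits per chip. */
static void fill_chain(const uint16_t *values, int chips)
{
    for (int i = 0; i < chips * 16; i++)
        shift_out_12(values[i]);
}
```

Calling fill_chain with an array of 80 values and chips set to 5 produces one clock pulse per bit, so everything hinges on how short delay_ns can really be.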
I am aware of the free-running counter at 1 MHz, but one pulse every 1 µs wastes a lot of time if you have, say, five TLC5940s in a chain, each with 16 outputs of 12 bits, i.e. 5*16*12 = 960 bits to fill. In the ideal case this takes 960*32 ns = 30720 ns, or 30.72 µs, compared with 960 µs when using the free-running counter.
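The arithmetic can be double-checked with a few lines of C. This is only a sketch of my own estimate: the function names are made up for illustration, and the 1000 ns per bit figure assumes one bit per tick of the 1 MHz counter.

```c
#include <stdio.h>

/* Total bits for a chain: chips x outputs per chip x bits per output. */
static long chain_bits(int chips, int outputs, int bits_per_output)
{
    return (long)chips * outputs * bits_per_output;
}

/* Time in nanoseconds to clock out `bits` bits at `ns_per_bit` each. */
static long transfer_ns(long bits, long ns_per_bit)
{
    return bits * ns_per_bit;
}

/* Print the ideal case (16 ns high + 16 ns low) next to the 1 MHz counter case. */
static void print_comparison(void)
{
    long bits = chain_bits(5, 16, 12);
    printf("%ld bits: %ld ns ideal, %ld ns with a 1 us tick per bit\n",
           bits, transfer_ns(bits, 32), transfer_ns(bits, 1000));
}
```

Calling print_comparison() from main shows the two totals side by side, a factor of roughly 31 between them.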
Is there a way to get another timer that would give a better time resolution?

Please help.

