53. HTTP GET is slow - WebKit porting to Mona OS

Using Twitter on our WebKit is not comfortable because it's slow, so the next step is to make it faster.
At first I thought rendering was slow. Yes, it was slow, but HTTP GET is much slower. As you know, to make Twitter work, WebKit has to fetch many resources, such as CSS, JavaScript, images, JSON and HTML. Each HTTP GET is slow compared to "normal" browsers.

How can we find the bottleneck?


There are two processes and many layers:
WebKit -> Curl (HTTP) -> lwip (TCP/IP) -> Ethernet driver (virtio).

I did a binary search over these layers, timing sections with the nowInMsec() API and rdtsc(), to narrow down where the bottleneck is. In the end, I found that most of the time is spent in memcpy: received data is copied into another buffer to merge it.
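The timing approach above can be sketched roughly like this. It is only an illustration, not the code used in the port: `read_ticks`, `struct probe`, `probe_add`, `probe_report` and `merge_chunk` are hypothetical names, and the non-x86 fallback to `timespec_get` is an assumption added for portability. The idea is to bracket one suspect region with a tick counter, accumulate the cost, and then move the brackets up or down the stack depending on where the time goes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Read a cheap, high-resolution counter.
 * On x86 this is the TSC (what rdtsc returns); elsewhere we fall back to
 * wall-clock nanoseconds, which is an assumption made for this sketch. */
static uint64_t read_ticks(void)
{
#if defined(__i386__) || defined(__x86_64__)
    return __builtin_ia32_rdtsc();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Accumulated cost of one measured region (hypothetical helper). */
struct probe {
    const char *name;
    uint64_t total_ticks;
    uint64_t calls;
};

static void probe_add(struct probe *p, uint64_t start, uint64_t end)
{
    assert(p != NULL);
    p->total_ticks += end - start;
    p->calls++;
}

static void probe_report(const struct probe *p)
{
    printf("%s: %llu ticks in %llu calls\n", p->name,
           (unsigned long long)p->total_ticks,
           (unsigned long long)p->calls);
}

/* Stand-in for the suspect region: copy one received chunk into a
 * larger buffer, with the copy bracketed by the tick counter. */
static void merge_chunk(struct probe *p, char *dst, const char *src, size_t n)
{
    uint64_t t0 = read_ticks();
    memcpy(dst, src, n);
    probe_add(p, t0, read_ticks());
}
```

Comparing the totals of probes placed at different layers tells you which half of the stack to split next.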

What should we do next?

We should think about the following things.

  • Is our memcpy fast enough?
  • Can we reduce the number of memcpy calls?

Is our memcpy fast enough?

Yes. We use a memcpy written in assembly, borrowed from newlib. It uses rep and movsl, and it's as fast as __builtin_memcpy.
One more aggressive option is an SSE-optimized memcpy. Should I try it? I miss Google Code Search; I can't find an SSE-optimized memcpy for GCC.
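For readers who have not seen a rep/movsl copy, here is a minimal sketch of the technique in GCC inline assembly. It is not newlib's code, and `copy_rep_movs` is a hypothetical name. It assumes an x86 target with the direction flag clear (as the usual ABIs guarantee) and simply falls back to the library memcpy elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n bytes from src to dst using the x86 string instructions:
 * "rep movsl" moves 4 bytes per iteration, "rep movsb" handles the
 * remaining 0..3 bytes. Regions must not overlap. */
static void *copy_rep_movs(void *dst, const void *src, size_t n)
{
    void *ret = dst;
    assert(n == 0 || (dst != NULL && src != NULL));
#if defined(__i386__) || defined(__x86_64__)
    size_t words = n / 4;   /* number of 32-bit moves */
    size_t tail  = n % 4;   /* leftover bytes */
    /* D = destination pointer, S = source pointer, c = repeat count.
     * All three are updated by the instruction, hence the "+" constraints. */
    __asm__ volatile("rep movsl"
                     : "+D"(dst), "+S"(src), "+c"(words)
                     :
                     : "memory");
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(tail)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);    /* non-x86 fallback, for this sketch only */
#endif
    return ret;
}
```

Calling `copy_rep_movs(dst, src, len)` behaves like memcpy for non-overlapping buffers, which makes it easy to time against __builtin_memcpy on the same data.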

Can we reduce the number of memcpy calls?

Technically, yes. But I don't want to change the lwip code.
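One way to cut down on copying without touching lwip itself might look like the sketch below. It assumes the merge happens on the receiving side and that the received pieces can be walked as a chain before merging, which may or may not match the real code path. `struct segment`, `chain_length` and `merge_once` are hypothetical names. The point is to size the destination once and copy each piece exactly once, instead of growing a buffer and re-copying what was already merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical received piece; only loosely modeled on a chained buffer. */
struct segment {
    const void *data;
    size_t len;
    struct segment *next;
};

/* Total payload size of the whole chain. */
static size_t chain_length(const struct segment *s)
{
    size_t total = 0;
    for (; s != NULL; s = s->next)
        total += s->len;
    return total;
}

/* Merge the chain into one buffer: one allocation, one memcpy per segment.
 * Returns NULL on allocation failure; the caller frees the result. */
static void *merge_once(const struct segment *head, size_t *out_len)
{
    size_t total = chain_length(head);
    char *buf = malloc(total ? total : 1);
    size_t off = 0;

    assert(out_len != NULL);
    if (buf == NULL)
        return NULL;
    for (const struct segment *s = head; s != NULL; s = s->next) {
        if (s->len > 0) {
            memcpy(buf + off, s->data, s->len);
            off += s->len;
        }
    }
    *out_len = total;
    return buf;
}
```

With n segments this does n copies in total, whereas appending to a buffer that is reallocated on each arrival can end up copying the earlier data again and again.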

Timer?

I guess another thing we should consider is the timer. WebKit, curl and lwip all use timers.
WebKit uses one for event handling, curl for timeout handling, and lwip for TCP retransmission.
If the timer function is not good enough, performance suffers.