Julius Plenz – Blog

trying pthreads

Today I played around with POSIX threads a little. In an assignment, we have to implement a very, very simple webserver that does asynchronous I/O. Since it should perform well, I thought I'd not only serialize I/O, but also parallelize it.

So there's a boss that just accepts new inbound connections and appends the fds to a queue:

clientfd = accept(sockfd, (struct sockaddr *) &client, &client_len);
if(clientfd == -1)
    error("accept");
new_request(clientfd);

The new_request function in turn appends it to a queue (of size TODOS = 64), and emits a cond_new signal for possibly waiting workers:

pthread_mutex_lock(&mutex);
while((todo_end + 1) % TODOS == todo_begin) {
    fprintf(stderr, "[master] Queue is completely filled; waiting\n");
    pthread_cond_wait(&cond_ready, &mutex);
}
fprintf(stderr, "[master] adding socket %d at position %d (begin=%d)\n",
    clientfd, todo_end, todo_begin);
todo[todo_end] = clientfd;
todo_end = (todo_end + 1) % TODOS;
pthread_cond_signal(&cond_new);
pthread_mutex_unlock(&mutex);

The workers (there being 8) will just emit a cond_ready, possibly wait until a cond_new is signalled, and then extract the first client fd from the queue. After that, a simple function involving some reads and writes will handle the communication on that fd.

pthread_mutex_lock(&mutex);
pthread_cond_signal(&cond_ready);
while(todo_end == todo_begin)
    pthread_cond_wait(&cond_new, &mutex);
clientfd = todo[todo_begin];
todo_begin = (todo_begin + 1) % TODOS;
pthread_mutex_unlock(&mutex);

// handle communication on clientfd

(Full source is here: webserver.c.)

Now this works pretty well and is fairly easy. I'm not very experienced with threads, though, and run into problems when I do massive parallel requests.

If I run ab, the Apache Benchmark tool with 10,000 requests, 1,000 concurrent, on the webserver it'll go up to 9000-something requests and then lock up.

$ ab -n 10000 -c 1000 http://localhost:8080/index.html
...
Completed 8000 requests
Completed 9000 requests
apr_poll: The timeout specified has expired (70007)
Total of 9808 requests completed

The webserver is blocked; its last line of output reads like this:

[master] Queue is completely filled; waiting

If I attach strace while in this blocking state, I get this:

$ strace -fp `pidof ./webserver`
Process 21090 attached with 9 threads - interrupt to quit
[pid 21099] recvfrom(32,  <unfinished ...>
[pid 21098] recvfrom(23,  <unfinished ...>
[pid 21097] recvfrom(31,  <unfinished ...>
[pid 21095] recvfrom(35,  <unfinished ...>
[pid 21094] recvfrom(34,  <unfinished ...>
[pid 21093] recvfrom(33,  <unfinished ...>
[pid 21092] recvfrom(26,  <unfinished ...>
[pid 21091] recvfrom(24,  <unfinished ...>
[pid 21090] futex(0x6024e4, FUTEX_WAIT_PRIVATE, 55883, NULL

So the children seem to be starving on unfinished recv calls, while the master thread waits for any children to work away the queue. (With a queue size of 1024 and 200 workers I couldn't reproduce the situation.)

How can one counteract this? Specify a timeout? Spawn workers on demand? Set the listen() backlog argument to a low value? – or is it all Apache Benchmark's fault? *confused*

posted 2012-01-17 tagged linux and c