Julius Plenz – Blog

C Programming Exam Fail

I just wrote an exam for the course Technische Informatik III which was about operating systems and network communication. In the exercises throughout the semster, we had to program in C a lot. Naturally, in the exam was one task about interpreting what a C program does.

It was really simple: Listening on a UDP socket and print incoming packets along with source address and port. The program looked somewhat like this (from what I remember; also some things were done in a not so clever way on the exercise sheet, and they had obfuscated the variable names to a non-descriptive a, b, etc.):

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdlib.h>
#include <error.h>

int main(int argc, char *argv[])
{
    int sockfd;
    struct sockaddr_in listen, incoming;
    socklen_t incoming_len;

    char buf[1024];
    int len; /* of received data */

    /* listen on 0.0.0.0:5000 */
    listen.sin_family = AF_INET;
    listen.sin_addr.s_addr = INADDR_ANY;
    listen.sin_port = htons(5000);

    if((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
        perror("socket");

    if(bind(sockfd, (struct sockaddr *) &listen, sizeof(listen)) == -1)
        perror("bind");

    while(1) {
        len = recvfrom(sockfd, buf, 1024, 0, (struct sockaddr *) &incoming,
                        &incoming_len);
        buf[len] = '\0';
        printf("from %s:%d: \"%s\"\n", inet_ntoa(incoming.sin_addr),
                ntohs(incoming.sin_port), buf);
    }
}

I lol'd so hard when I saw this. It's a classic off-by-one error. (Can you spot it, too?)

If you want to store x bytes of data in a string, reserve x+1 bytes for the NULL termination character. Here, if you send a message that is exactly 1024 bytes long (or longer, as it'll get truncated), buf[len] will actually be the 1025th byte. Which might just be anything.

And those guys want to teach network and filesystem programming – hilarious. :-D

posted 2012-02-17 tagged life, university and c

minimizing Linux filesystem cache effects

Last weekend I toyed around a bit and tried to write a shared object library that can be used via LD_PRELOAD to minimize the effect a program has on the Linux filesystem cache.

Basically the use case is that you have a productive system running, and you don't want your backup script to fill the filesystem cache with mostly useless information at night (files that were cached should stay cached). I didn't test whether this brings measurable improvements yet.

The coding was really fun and provided me with yet another insight how the simple concept of file descriptors in UNIX is just great. (GNU software is tough, though: I got stuck once, and found help on Stackoverflow, which I had never used before.)

posted 2012-02-09 tagged linux, c and hack

vlock and suspend to ram

I've had weird race conditions when using vlock together with s2ram. It appears suspend to ram wants to switch VTs, while vlock hooks into the switch requests and explicitly disables them. So some of the time, the machine would not suspend, while at other times, vlock wouldn't be able to acquire the VT.

To solve this, I wrote a simple vlock plugin, which simply clears the lock mechanism, writes mem to /sys/power/state and later reinstates the locking mechanism. This plugin is called after all and new. Thus, the screen will be locked properly before suspending.

Here's my suspend.c:

#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Include this header file to make sure the types of the dependencies
 * and hooks are correct. */
#include "vlock_plugin.h"
#include "../src/console_switch.h"

const char *succeeds[] = { "all", "new", NULL };
const char *depends[] =  { "all", "new", NULL };

bool vlock_start(void __attribute__ ((__unused__)) **ctx_ptr)
{
    int fd;

    unlock_console_switch();

    if((fd = open("/sys/power/state", O_WRONLY)) != -1) {
        if(write(fd, "mem", 3) == -1)
            perror("suspend: write");
        close(fd);
    }

    lock_console_switch();

    return true;
}

Simply paste it to the vlock modules folder, make suspend.so and copy it to /usr/lib/vlock/modules. I now invoke it like this:

env VLOCK_PLUGINS="all new suspend" vlock

posted 2012-01-20 tagged linux and c

trying pthreads

Today I played around with POSIX threads a little. In an assignment, we have to implement a very, very simple webserver that does asynchronous I/O. Since it should perform well, I thought I'd not only serialize I/O, but also parallelize it.

So there's a boss that just accepts new inbound connections and appends the fds to a queue:

clientfd = accept(sockfd, (struct sockaddr *) &client, &client_len);
if(clientfd == -1)
    error("accept");
new_request(clientfd);

The new_request function in turn appends it to a queue (of size TODOS = 64), and emits a cond_new signal for possibly waiting workers:

pthread_mutex_lock(&mutex);
while((todo_end + 1) % TODOS == todo_begin) {
    fprintf(stderr, "[master] Queue is completely filled; waiting\n");
    pthread_cond_wait(&cond_ready, &mutex);
}
fprintf(stderr, "[master] adding socket %d at position %d (begin=%d)\n",
    clientfd, todo_end, todo_begin);
todo[todo_end] = clientfd;
todo_end = (todo_end + 1) % TODOS;
pthread_cond_signal(&cond_new);
pthread_mutex_unlock(&mutex);

The workers (there being 8) will just emit a cond_ready, possibly wait until a cond_new is signalled, and then extract the first client fd from the queue. After that, a simple function involving some reads and writes will handle the communication on that fd.

pthread_mutex_lock(&mutex);
pthread_cond_signal(&cond_ready);
while(todo_end == todo_begin)
    pthread_cond_wait(&cond_new, &mutex);
clientfd = todo[todo_begin];
todo_begin = (todo_begin + 1) % TODOS;
pthread_mutex_unlock(&mutex);

// handle communication on clientfd

(Full source is here: webserver.c.)

Now this works pretty well and is fairly easy. I'm not very experienced with threads, though, and run into problems when I do massive parallel requests.

If I run ab, the Apache Benchmark tool with 10,000 requests, 1,000 concurrent, on the webserver it'll go up to 9000-something requests and then lock up.

$ ab -n 10000 -c 1000 http://localhost:8080/index.html
...
Completed 8000 requests
Completed 9000 requests
apr_poll: The timeout specified has expired (70007)
Total of 9808 requests completed

The webserver is blocked; its last line of output reads like this:

[master] Queue is completely filled; waiting

If I attach strace while in this blocking state, I get this:

$ strace -fp `pidof ./webserver`
Process 21090 attached with 9 threads - interrupt to quit
[pid 21099] recvfrom(32,  <unfinished ...>
[pid 21098] recvfrom(23,  <unfinished ...>
[pid 21097] recvfrom(31,  <unfinished ...>
[pid 21095] recvfrom(35,  <unfinished ...>
[pid 21094] recvfrom(34,  <unfinished ...>
[pid 21093] recvfrom(33,  <unfinished ...>
[pid 21092] recvfrom(26,  <unfinished ...>
[pid 21091] recvfrom(24,  <unfinished ...>
[pid 21090] futex(0x6024e4, FUTEX_WAIT_PRIVATE, 55883, NULL

So the children seem to be starving on unfinished recv calls, while the master thread waits for any children to work away the queue. (With a queue size of 1024 and 200 workers I couldn't reproduce the situation.)

How can one counteract this? Specify a timeout? Spawn workers on demand? Set the listen() backlog argument to a low value? – or is it all Apache Benchmark's fault? *confused*

posted 2012-01-17 tagged linux and c

mutt sidebar patch improvements

It is generally accepted as an almost universal truth that mutt sucks, but is the MUA that sucks less than all others. While people use either Vim or Emacs and fight about it, I hardly see any people fight about whether mutt is good or bad. There is, to my knowledge, no alternative worth mentioning.

Mutt dates back well into the mid-nineties. As you might imagine, with lots of contributors over the course of almost two decades, the code quality is rather messy.

When development had stalled for quite a while in the mid-2000's, a fork was attempted. While mutt-ng was quite popular for a while, most changes were incorporated back into mainline mutt at some point. (Ironically, the latest article in the mutt-ng development blog is from October 2006 and is titled "mutt-ng isn't dead!"). The development of main mutt gained some momentum again, triggered in large parts by the contributions of late Rocco Rutte.

I remember two big features that the original mutt authors just wouldn't integrate into mainline: The headercache patch and the sidebar patch. About the former I can't say anything, but lately I've been fixing the Sidebar patch in various places. (We use mutt at work and rely heavily on e-mail communication, so we'd like a bug-free user agent, naturally.)

When all the mutt forking went about five years ago, I didn't know much about it. Retrospectively, I see the people did a hell of a job. Long before mutt-ng was forked, Sven told me he and Mika met in Graz for several weeks to sift and sort through the availbale patches, intending to do a "super patch".

Mutt's code quality is arguably rather messy.

On top of that, the Sidebar patch tries to make it even worse. Imagine this: mutt draws a mail from position (line=x,char=0) to the end of the line. Now the sidebar patch will introduce a left "margin", such that the sidebar can be drawn there. Thus, all code parts where a line is started from the leftmost character has to be rewritten to check if the sidebar is active and possibly start drawing at (line=x,char=20).

The sidebar code quality is a fringe case of bad code. Really, it sucks. However, there's no real way to "do it right", since original mutt never planned for a sidebar.

Who maintains the sidebar patch? – Not sure. There's a version at thomer.com, but he says:

July 20, 2006 I quit. Sadly, there seems to be no desire to absorb the sidebar patch into the main source tree.

The most up-to-date version is found at Lunar Linux. Last update is from mid-2009.

Debian offers a mutt-patched package that includes the sidebar patch, albeit in a different version than usually found 'round the net. In short, this patch is a mess, too.

But since I made all the fixes, I decided to contact the package's maintainer, Antonio Radici. He promptly responded and said he'd happily fix all the issues, so I started by opening two bug reports. Nothing has happened since.

The patches run quite stable for my colleagues, so I think it's best to release them. Maybe someone else can use them. Please note that I have absolutely no interest in taking over any Sidebar patch maintainance. ;-)

For some of the patches I provide annotations. They all feature quite descriptive commit messages, and apply cleanly on top of the Debian mutt repository's master branch.

The first four patches are not by me, they are just the corresponding patches from the debian/patches/ directory applied to have a starting point.

The first few patches fix rather trivial bugs.

Now come the performance critical patches. They are the real reason I was assigned the task to repair the sidebar:

This patch fixes a huge speed penalty. Previously, the sidebar would count the mails (and thus read through the whole mbox) every time that mtime > atime! This is just an incredible oversight by the developer and must have burned hundreds of millions of CPU cycles.

This introduces a member `sb_last_checked' to the BUFFY struct. It
will be set by `mh_buffy_update', `buffy_maildir_update' and
`buffy_mbox_update' when they count all the mails.

Mboxes only: `buffy_mbox_update' will not be run unless the
condition "sb_last_checked > mtime of the file" holds. This solves
a huge performance penalty you obtain with big mailboxes. The
`mx_open_mailbox' call with the M_PEEK flag will *reset* mtime and
atime to the values from before. Thus, you cannot rely on "mtime >
atime" to check whether or not to count new mail.

Also, don't count mail if the sidebar is not active:

Then, I removed a lot of cruft and simply stupid design. Just consider one of the functions I removed:

-static int quick_log10(int n)
-{
-        char string[32];
-        sprintf(string, "%d", n);
-        return strlen(string);
-}

That is just insane.

Now, customizing the sidebar format is simple, straight-forward and mutt-like:

sidebar_format

    Format string for the sidebar. The sequences `%N', `%F' and
    `%S' will be replaced by the number of new or flagged messages
    or the total size of the mailbox. `%B' will be replaced with
    the name of the mailbox. The `%!' sequence will be expanded to
    `!' if there is one flagged message; to `!!' if there are two
    flagged messages; and to `n!' for n flagged messages, n>2.

While investigating mutt's performance, one thing struck me: To decode a mail (eg. from Base64), mutt will create a temporary file and print the contents into it, later reading them back. This also happens for evaluating filters that determine coloring. For example,

color   index  black green  '~b Julius'

will highlight mail containg my name in the body in bright green (this is tremendously useful). However, for displaying a message in the index, it will be decoded to a temporary file and later read back. This is just insane, and clearly a sign that the mutt authors wouldn't bother with dynamic memory allocation.

By chance I found a glib-only function fmemopen(), "fmemopen, open_memstream, open_wmemstream - open memory as stream".

From the commit message:

When searching the header or body for strings and the
`thorough_search' option is set, a temp file was created, parsed,
and then unlinked again. This is now done in memory using glibc's
open_memstream() and fmemopen() if they are available.

This makes mutt respond much more rapidly.

Finally, there are some patches that fix various other issues, see commit message for details.

There you go. I appreciate any comments or further improvements.

Update 1: The original author contacted me. He told me he's written most of the code in a single sitting late at night. ;-)

Update 2: The 16th patch will make mutt crash when you compile it with -D_FORTIFY_SOURCE=2. There's a fix: 0020-use-PATH_MAX-instead-of-_POSIX_PATH_MAX-when-realpat.patch (thanks, Jakob!)

posted 2012-01-08 tagged mutt, linux and c