Julius Plenz – Blog

An on demand Debugging Technique for long-running Processes

Debbuging long-running processes or server software is usually an “either–or”: Either you activate debugging and have huge files that you rarely if ever look at, and they take up a considerable amount of disk space – or you did not activate the debugging mode and thus cannot get to the debugging output to figure out what the program is doing right now.

There is a really nice quick and dirty Non-invasive printf debugging technique that just does a printf on a non-existent file descriptor, so that you can view the messages by strace-ing the process and grepping for EBADF.

I want to share here a few Perl code snippets for an approach that is a little neater IMO, yet a little bit more invasive. Consider a simple “server” doing some work, occasionally printing out a debug statement:

#!/usr/bin/perl

use strict;
use warnings;

sub Debug { }; # empty for now

while(1) {
    Debug("Here I am!");
    select undef, undef, undef, 0.1;
}

The idea is now to on demand create a UNIX domain socket where the process can write debug information to, so that (possibly a few) other processes can read and print out the debug info received on that socket.

We introduce a “global” structure $DEBUG, and a function to initialize and destroy the socket, which is named debug-<pid-of-process> and placed in /tmp.

my $DEBUG = {
    socket => undef,
    conn => [],
    last_check => 0,
};

sub Debug_Init {
    use IO::Socket;
    use Readonly;
    my Readonly $SOCKET = "/tmp/debug-$$";

    return if $DEBUG->{socket};

    unlink $SOCKET;
    my $s = IO::Socket::UNIX->new(
        Type => IO::Socket::SOCK_STREAM,
        Local => $SOCKET,
        Listen => 1,
    ) or die $!;
    $s->blocking(0);
    $DEBUG->{socket} = $s;
}

sub Debug_Cleanup {
    return unless $DEBUG->{socket};
    my $path = $DEBUG->{socket}->hostpath;
    undef $DEBUG->{socket};
    unlink $path;
}

When the process receives a SIGUSR1, we call Debug_Init, and to be sure we’ll clean up the socket in case of normal exit:

$SIG{USR1} = \&Debug_Init;
END { Debug_Cleanup; }

The socket is in non-blocking mode, so trying to accept() new connections will not block. Now, whenever we want to print out a debugging statement, we check if anyone has requested the debugging socket via SIGUSR1. After the first connection is accepted, we’ll only check once every second for new connections. For every accepted connection, we send the debugging message to that peer. (Note that UNIX domain sockets with Datagram type sadly do not support broadcast messaging – otherwise this would probably be easier.)

In case sending the message fails (probably because the peer disconnected), we’ll remove that connection from the list. If the last connection goes, we’ll unlink the socket.

sub Debug {
    return unless $DEBUG->{socket};
    my $s = $DEBUG->{socket};
    my $conn = $DEBUG->{conn};
    my $msg = shift or return;
    $msg .= "\n" unless $msg =~ /\n$/;

    if(time > $DEBUG->{last_check}) {
        while(my $c = $s->accept) {
            $c->shutdown(IO::Socket::SHUT_RD);
            push @$conn => $c;
        }
        $DEBUG->{last_check} = time if @$conn;
    }
    return unless @$conn;

    for(@$conn) {
        $_->send($msg, IO::Socket::MSG_NOSIGNAL) or undef $_;
    }
    @$conn = grep { defined } @$conn;

    unless(@$conn) {
        Debug_Cleanup();
    }
}

Here’s a simple script to display the debugging info for a given PID, assuming it uses the setup described above:

#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket;

my $pid = shift;
if(not defined $pid) {
    print "usage: $0 <pid>\n";
    exit(1);
}

kill USR1 => $pid or die $!;

my $path = "/tmp/debug-$pid";
select undef, undef, undef, 0.01 until -e $path;

my $s = IO::Socket::UNIX->new(
    Type => IO::Socket::SOCK_STREAM,
    Peer => $path,
) or die $!;
$s->shutdown(IO::Socket::SHUT_WR);

$| = 1;
while($s->recv(my $m, 4096)) {
    print $m;
}

We can now start the server; no debugging happens. But as soon as we send a SIGUSR1 and attach to the (now present) debug socket, we can see the debug information:

$ perl server & ; sleep 10
[1] 19731

$ perl debug-process 19731
Here I am!
Here I am!
Here I am!
^C

When we hit Ctrl-C, the debug socket vanishes again.

In my opinion this is a really neat way to have a debugging infrastructure in place “just in case”.

posted 2013-07-13 tagged linux and perl

Modify a Readonly scalar in Perl for testing purposes

The standard book Perl Best Practices advises in chapter 4.5 that one should use the Readonly Perl module instead of the constant standard module for various reasons.

An example might look like this:

package Myprogram;

use Exporter;
use Readonly;
our @EXPORT = qw(conffile);
Readonly our $BASEPATH => "$ENV{HOME}/.myprogram";

sub conffile { "$BASEPATH/config.ini" }

If you want to unit test your program now, you cannot just mess around and replace a potentially existing config file with a bogous one. You have to create a temporary directory and use that as the base path. That is, you have to modify your Readonly declared variable. I’ve not seen this documented, so I guess this might help others: Internally the method Readonly::Scalar::STORE is called when you do an assignment (see man perltie for details). In Readonly.pm, this is redefined to

*STORE = *UNTIE = sub {Readonly::croak $Readonly::MODIFY};

which dies with an error message. So you only have to circumvent this. The method STORE gets a reference of the location as first argument, and the value as second argument. So a quick-and-dirty workaround is just setting

*Readonly::Scalar::STORE = sub { ${$_[0]} = $_[1]; };

prior to assigning to the Readonly variable. If you want to do it properly, you should only change this locally in the block where you re-assign the value, so that subsequent attempts will again produce the usual error message. Such a test might look like this:

use strict;
use warnings;

use Test::More 'no_plan';

use Myprogram;

{
    no warnings 'redefine';
    local *Readonly::Scalar::STORE = sub { ${$_[0]} = $_[1]; };
    $Myprogram::BASEPATH = "/tmp";
}

is(Myprogram::conffile, "/tmp/config.ini", "get config filename");

For non-scalar values, this will probably work similar. (Read the source if in doubt.)

posted 2013-01-02 tagged perl

Nachlese

Verschiedene Texte, die ich heute las und für sehr lesenswert halte:

posted 2012-07-31 tagged linkdump, perl, oneliner, math and politik

debugging a zsh completion function

There's a Perl module threads::shared. However, it won't show up as a completion match to the perldoc command. Why is that?, a colleague asked me. Now, that obviously is a quirk, so let's debug it.

We start at /usr/share/zsh/functions/Completion/Unix/_perldoc, where we find this:

_alternative \
    'modules:module: _perl_modules -tP' \
    'pods:base pod: _perl_basepods' \
    'files:module or .pod file:_files -g "*.(pod|pm)(-.)"' &&
    ret=0

So, go on to _perl_modules: It searches for paths where Perl would store modules and their documentation (line 85):

inc=( $( $perl -e 'print "@INC"' ) )

Then, they do some complicated globbing stuff on the directories (line 104), where a loop iterates over $inc and stores each element in $libdir:

new_pms=( $libdir/{[A-Z]*/***/,}*${~sufpat}~*blib* )

There's the catch: This globbing expression will only recurse into subdirectories that start with an uppercase letter. Lowercase modules on the highest level are okay, though. That's why threads appears (from threads.pm), but threads::shared doesn't (from threads/shared.pm).

The search for all modules that are missed this way will translate to the following Z-Shell command:

$ print -l ${^=$(perl -e 'print "@INC"')}/[a-z]*/***/*.pm(.N)
/usr/lib/perl/5.10/threads/shared.pm
/usr/share/perl/5.10/encoding/warnings.pm
/usr/share/perl/5.10/warnings/register.pm

posted 2011-09-01 tagged zsh and perl

Perl: use feature;

With it's 5.10 version, Perl learned some pretty interesting new features. Some of them modify the syntactic interpretation of Perl code, so to stay backwards-compatible, you have to manually enable them. To do this, add a line like

use feature qw(:5.10);

to your other use declarations. (For one-liners, add the -E command line switch.) The documentation is available via perldoc feature or online.

There's one pretty trivial change that nonetheless has bothered me for a long time already. Why is there no function in Perl to print out some strings, and then finish with a newline? I mean, almost every sane programming language has this, and it helps writing clean code a lot (no fiddling with \n and so on). – Now, they backported the say function from Perl 6, which does just that: add an omplicit newline.

The new concept of state variables is nothing I need for now; however, the feature they strangely call switch (although the actual keywords are given/when) is pretty nice: It's pretty much like what you'd expect with a typical C construct like this:

switch(var) {
    case 1:
        /* do something */
        break;
    ...

    default:
        /* do default thing */
        break;
}

In Perl that's now:

given($var) {
    when(condition1) {
        # do something
    }

    do_other_thing when condition2; # as one-liner

    default {
        # fallthrough case
    }
}

Update: The statement when condition; one-liner doesn't work. Don't know where I got that from.

You can use simple strings (maps to $_ eq "string"), regular expressions (maps to $_ =~ /regex/) or function references like \&myfunc (maps to myfunc($_)).

The best improvement, however, are so-called named captures. (You don't have to enable those specifically.) It allows regexes to contain pairs of parentheses that you can assign a name to later extract the match. That means: no stupid and mind-boggling re-numbering of $2 to $3 etc., just because you added a set of brackets around something you want to extract from a string. Consider this example regex:

my $re = qr/
    (?<user>[^\@\s]+)
    \@
    (?<domain>[^\s">]+)
    /x;

You can now access the user-part of the e-mail address via $+{user}, not necessarily via $1. The immediate gain is obvious: if you add parentheses around the whole expression (to capture the whole e-mail address) the user part is still available via $+{user}. The numbered variable would be $2 now, though.

Another side-effect is that you can test several regexes in a row now, and later extract matching parts from all of these – of course only if you use distinct capture names. Hash entries in %+ will only be updated upon a match (and won't be cleared AFAIK, at least in the lexical scope).

posted 2011-07-02 tagged perl