Sunday, January 10, 2010

Choose indecision: Simple plugins in C

In my last post, I talked a bit about using shared libraries to change the implementation of some existing program at runtime. That was a nice trick for testing or debugging, but it's probably not how you would want to approach the task of making a system that's extensible and configurable.


Which brings us to the topic of today's entry: using dynamically loaded libraries as plugins to make C programs extensible and configurable. A plugin is just a shared library that provides an implementation of a particular interface, and which can be swapped for another implementation by some kind of runtime parameter.


A (contrived) hard-coded example

Suppose you have a piece of code that is prone to segmentation faults, and you want to get notification when such a fault occurs. (OK, this is sort of corny, because you would really want a core file, rather than just a log message, but it makes a fun example.)

You might begin by hard-coding a signal handler into your application to log the event. Here's an example of a program that sets up a simple signal-handling callback, which writes a message to a logfile and then exits:


hardcoded.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>  /* sleep, getpid */

/* This is our signal-handling callback. */
static void segv_callback(int sig) {
  FILE *log = fopen("logfile", "a");
  /* Only write the message if the logfile could be opened. */
  if (log != NULL) {
    fprintf(log, "Your program segfaulted. Sorry.\n");
    fclose(log);
  }
  exit(1);
}

/* This is our main service method, which does all the work.
 * (Or it would, if this example were about doing work.)
 */
static void do_stuff(void) {
  fprintf(stderr, "Doing stuff...\n");
  sleep(5);
}

/* Set the handling of SIGSEGV to call our callback. */
static void init_signal_handler(void) {
  struct sigaction sigact;
  memset(&sigact, 0, sizeof sigact);
  sigact.sa_handler = segv_callback;
  if (sigaction(SIGSEGV, &sigact, NULL) != 0) {
    fprintf(stderr, "Can't set up SEGV handler: %s\n", 
     strerror(errno));
    exit(1);
  }
}

int main(int ac, char **av) {
  init_signal_handler();
  /* Here is where we fake a segmentation fault. */
  kill(getpid(), SIGSEGV);
  do_stuff();
  return 0;
}

The interesting bit in this program is in init_signal_handler, where we use sigaction to specify the action that the process should take when it receives SIGSEGV, the signal that the OS sends when it detects a segmentation violation.
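
One caveat about the handler above: fopen, fprintf, and exit are not on the POSIX list of async-signal-safe functions, so calling them from a signal handler is not guaranteed to work. As a minimal sketch (the names safe_segv_callback and install_safe_segv_handler are made up for this illustration, and it writes to stderr rather than to the logfile), a handler that sticks to write and _exit might look like this:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler that uses only async-signal-safe calls:
 * write(2) and _exit(2).
 */
static void safe_segv_callback(int sig) {
  static const char msg[] = "Your program segfaulted. Sorry.\n";
  ssize_t n;
  (void)sig;
  /* Nothing useful can be done if the write fails, so ignore the result. */
  n = write(STDERR_FILENO, msg, sizeof msg - 1);
  (void)n;
  _exit(1);
}

/* Install the handler for SIGSEGV.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static int install_safe_segv_handler(void) {
  struct sigaction sigact;
  memset(&sigact, 0, sizeof sigact);
  sigemptyset(&sigact.sa_mask);
  sigact.sa_handler = safe_segv_callback;
  return sigaction(SIGSEGV, &sigact, NULL);
}
```

The rest of this post keeps the simpler stdio-based handlers, since the point here is the plugin mechanism rather than signal handling.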


Making it configurable

Now, suppose you don't want the response to the event hard-coded. Maybe some users want logging while others want some kind of email notification. You could hard-code this, too, with conditionals and a set of command-line arguments for each, but you could also use plugins to make the available options completely customizable. So, we're going to implement a plugin-based solution now.


First, we will define an interface for the plugins. In this simple example, the interface is just that the plugin must have a function called "segv_callback", which takes one integer argument:

void segv_callback(int);


Now, this interface isn't enforced by the compiler, so your program will need to handle failures when you attempt to look up the callback function by name. (This is something you never need to worry about in statically linked programs, where the build fails if you try to call an unknown function, or in normal dynamically linked programs, where the linker gives an error if any symbol can't be resolved.)
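
One way to get some compile-time checking back, at least on the plugin side, is to put the prototype in a header that every plugin includes. This is only a sketch: the header name segv_plugin.h, the typedef segv_callback_fn, and the variable last_signal_seen are assumptions made for this illustration, and the callback body is a stand-in for a real plugin.

```c
/* --- contents of a hypothetical shared header, e.g. segv_plugin.h --- */
typedef void (*segv_callback_fn)(int);  /* pointer type for the host program */
void segv_callback(int sig);            /* prototype for plugin authors */

/* --- a plugin source file that includes that header --- */
static int last_signal_seen = 0;

/* The definition must match the prototype above. A mismatch, such as
 * "int segv_callback(void)", is rejected when the plugin is compiled.
 */
void segv_callback(int sig) {
  last_signal_seen = sig;
}
```

The host program still has to check the result of the runtime lookup, because nothing forces a plugin author to use the header.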


Once we know the interface for the plugin, we can write a function to load the plugin library and initialize a function pointer to the segv_callback function in the library. Here is a function that does this.


/* This is just a function pointer. */
static void (*segv_callback)(int);

/* This method sets the function pointer segv_callback to point
 * to the method in the shared library file named by callback_lib.
 */
static void init_callback(char *callback_lib) {
  void *callback_handle;
  
  if ((callback_handle = dlopen(callback_lib, RTLD_LAZY | RTLD_LOCAL)) == NULL) {
    /* We failed to open the callback library. Fail. */
    fprintf(stderr, "Failed to open callback library: %s\n", dlerror());
    exit(1);
  }
  if ((segv_callback = dlsym(callback_handle, "segv_callback")) == NULL) {
    fprintf(stderr, "Failed to get callback function: %s\n", dlerror());
    exit(1);
  }
}

You call this function with the name of the library to be used as the plugin, and the function uses dlopen to get a handle to the library. Then it uses dlsym to get a pointer to the function named "segv_callback" within that library. Finally, it sets the value of the (global) function pointer segv_callback to the pointer that dlsym returned.
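
The lookup step can also be written a little more defensively. The POSIX documentation for dlsym suggests calling dlerror once to clear any stale error, calling dlsym, and then calling dlerror again to see whether the lookup failed. Here is a sketch of that pattern; the helper name lookup_callback and the typedef segv_callback_fn are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical name for the callback type used in this post. */
typedef void (*segv_callback_fn)(int);

/* Look up the function called name in the library behind handle.
 * Returns NULL, after printing the dlerror() text, if the lookup fails.
 */
static segv_callback_fn lookup_callback(void *handle, const char *name) {
  segv_callback_fn fn = NULL;
  const char *err;

  dlerror();  /* Clear any error left over from an earlier call. */
  /* Store through a void ** so that we never cast a void * directly
   * to a function pointer, which ISO C does not define.
   */
  *(void **)(&fn) = dlsym(handle, name);
  err = dlerror();
  if (err != NULL) {
    fprintf(stderr, "Failed to get callback function: %s\n", err);
    return NULL;
  }
  return fn;
}
```

With a helper like this, init_callback could assign the result of lookup_callback(callback_handle, "segv_callback") to the function pointer and exit if it is NULL.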


Here is one possible implementation of segv_callback, which sends email to the user running the process (to be precise, to the username corresponding to the effective UID for the process).


mail_plugin.c

#include <stdio.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>  /* geteuid */


void segv_callback(int signal) {
  FILE *mail;
  struct passwd *pwd;
  char *username = "root";
  char buf[256];
  memset(buf, 0, sizeof(buf));
  if ((pwd = getpwuid(geteuid())) != NULL) {
    username = pwd->pw_name;
  }
  snprintf(buf, 255, "/usr/bin/mail %s", username);
  if ((mail = popen(buf, "w")) != NULL) {
    fprintf(mail, "I'm sorry. Your program segfaulted. Have a nice day.\n");
    pclose(mail);
  }
  
  exit(1);
}


Here is another implementation of this plugin interface that just does logging, exactly like the original program:


logging_plugin.c


#include <stdio.h>
#include <stdlib.h>

void segv_callback(int signal) {
  FILE *log = fopen("logfile", "a");
  /* Only write the message if the logfile could be opened. */
  if (log != NULL) {
    fprintf(log, "Your program segfaulted. Sorry.\n");
    fclose(log);
  }
  exit(1);
}


When you've built the program and the plugin, you can run it like this:

$ ./simple_pluggable ./mail_plugin.so
$ mail
Mail version 8.1.2 01/15/2001.  Type ? for help.
"/var/mail/chris": 1 message 1 new
>N  1 chris@localhost    Sun Jan 10 16:01   13/433   
& 
Message 1:
From chris@localhost  Sun Jan 10 16:01:11 2010
X-Original-To: chris
To: chris@localhost
Date: Sun, 10 Jan 2010 16:01:11 -0800 (PST)
From: chris@localhost (Chris)

I'm sorry. Your program segfaulted. Have a nice day.


A little bit of state

Now, the plugin code above just sends email to the user running the process. But what if we want to specify, at runtime, a particular email account where such notifications should be sent? Then the plugin needs to get this information from somewhere.


It's possible that you could have a configuration file for this, or a lookup service of some kind (LDAP perhaps) that the plugin could call to get contact details. However, let's say that the desired behavior is to specify the email address on the command line, at the same time as you select the email callback. In that case, the plugin gets the address from a plugin initialization call.


In order to support initialization, the (informal) interface for your plugin needs another function, which we'll call init:

void init(char *arg);
void segv_callback(int);

The idea here is that you will initialize the plugin with some data, and the plugin will maintain the data in memory until it is needed when segv_callback is called. In this case, we're just storing a single string, but the initialization could be anything at all.


Here is a new version of the init_callback function (the function that loads a plugin) that we defined earlier. The difference is that this version takes a parameter, which is passed to the plugin's init function, if there is one.


/* This method sets the function pointer segv_callback to point
 * to the method in the shared library file named by callback_lib.
 *
 * It also calls the init method in that library.
 */
static void init_callback(char *callback_lib, char *init_arg) {
  void *callback_handle;
  void (*init)(char*);
  
  if ((callback_handle = 
       dlopen(callback_lib, RTLD_LAZY|RTLD_LOCAL)) == NULL) {
    /* We failed to open the callback library. Fail. */
    fprintf(stderr, "Failed to open library: %s\n", dlerror());
    exit(1);
  }
  /* init_arg, if it exists, is an argument to initialize plugin. */
  if ((init = dlsym(callback_handle, "init")) != NULL) {
    if (init_arg != NULL) {
      
      /* We only call init if it's defined
       * and we have an arg for it. 
       */
      init(init_arg);
    }
    else {
      /* in this case, we need an init arg, but none specified. */
      fprintf(stderr, "Need an init arg for this plugin.\n");
      exit(1);
    }
  }
  if ((segv_callback = 
       dlsym(callback_handle, "segv_callback")) == NULL) {
    fprintf(stderr, "Failed to get function: %s\n", dlerror());
    exit(1);
  }
}

And here is a plugin that sends mail to the user whose address is passed to the init function:


config_mail_plugin.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

static char *email_address = NULL;

/* Store a copy of the passed-in address. */
void init(char *addr) {
  email_address = strdup(addr);
}


void segv_callback(int signal) {
  FILE *mail;
  char buf[256];

  if (email_address == NULL) {
    fprintf(stderr, "Email address has not been initialized.\n");
    exit(1);
  }
  memset(buf, 0, sizeof(buf));
  snprintf(buf, 255, "/usr/bin/mail %s", email_address);
  if ((mail = popen(buf, "w")) != NULL) {
    fprintf(mail, 
     "I'm sorry %s. Your program has segfaulted.\n",
     email_address);
    pclose(mail);
  }
  
  exit(1);
}
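
One thing to watch in this plugin: the address comes straight from the command line and ends up inside a command string that popen hands to the shell, so an argument containing shell metacharacters could run arbitrary commands. A rough sketch of a guard that init could apply before storing the address follows. The function name is_safe_address and the allowed character set are assumptions for this example, and it is not a full validator of email address syntax.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if addr is non-empty and contains only letters, digits,
 * and the characters "@._+-"; returns 0 otherwise. A leading '-' is
 * also rejected so that the address can't be read as an option.
 */
static int is_safe_address(const char *addr) {
  const char *p;

  if (addr == NULL || *addr == '\0' || *addr == '-') {
    return 0;
  }
  for (p = addr; *p != '\0'; p++) {
    if (!isalnum((unsigned char)*p) && strchr("@._+-", *p) == NULL) {
      return 0;
    }
  }
  return 1;
}
```

A plugin using this check could leave email_address as NULL when the check fails, so that segv_callback reports the problem instead of running the command.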


When you've built this code and the plugin, you can run it like this:

$ ./stateful_pluggable ./config_mail_plugin.so chris@example.com

And then check your mailbox for notifications.


The End

This has been a demonstration of a fairly simple way to construct plugins on any kind of UNIX-ish system that supports dlopen/dlsym. If you are targeting any other system, you'll have to look elsewhere to find out how to load libraries at runtime.


Appendix: Files that aren't already listed above

Makefile


CFLAGS = -fPIC

all: hardcoded simple_pluggable stateful_pluggable logging_plugin.so mail_plugin.so config_mail_plugin.so

hardcoded: hardcoded.o
	$(CC) -o hardcoded hardcoded.o

simple_pluggable: simple_pluggable.o
	$(CC) -o simple_pluggable simple_pluggable.o -ldl

stateful_pluggable: stateful_pluggable.o
	$(CC) -o stateful_pluggable stateful_pluggable.o -ldl

logging_plugin.so: logging_plugin.o
	$(CC) -o logging_plugin.so logging_plugin.o -shared

mail_plugin.so: mail_plugin.o
	$(CC) -o mail_plugin.so mail_plugin.o -shared

config_mail_plugin.so: config_mail_plugin.o
	$(CC) -o config_mail_plugin.so config_mail_plugin.o -shared

clean:
	rm -f *.so *.o logfile


stateful_pluggable.c


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <errno.h>
#include <sys/types.h>
#include <dlfcn.h>
#include <unistd.h>  /* sleep, getpid */

/* This is just a function pointer. */
static void (*segv_callback)(int);

/* This is our main service method, which does all the work.
 * (Or it would, if this example were about doing work.)
 */
static void do_stuff(void) {
  fprintf(stderr, "Doing stuff...\n");
  sleep(5);
}

/* This method sets the function pointer segv_callback to point
 * to the method in the shared library file named by callback_lib.
 *
 * It also calls the init method in that library.
 */
static void init_callback(char *callback_lib, char *init_arg) {
  void *callback_handle;
  void (*init)(char*);
  
  if ((callback_handle = 
       dlopen(callback_lib, RTLD_LAZY|RTLD_LOCAL)) == NULL) {
    /* We failed to open the callback library. Fail. */
    fprintf(stderr, "Failed to open library: %s\n", dlerror());
    exit(1);
  }
  /* init_arg, if it exists, is an argument to initialize plugin. */
  if ((init = dlsym(callback_handle, "init")) != NULL) {
    if (init_arg != NULL) {
      
      /* We only call init if it's defined
       * and we have an arg for it. 
       */
      init(init_arg);
    }
    else {
      /* in this case, we need an init arg, but none specified. */
      fprintf(stderr, "Need an init arg for this plugin.\n");
      exit(1);
    }
  }
  if ((segv_callback = 
       dlsym(callback_handle, "segv_callback")) == NULL) {
    fprintf(stderr, "Failed to get function: %s\n", dlerror());
    exit(1);
  }
}

/* Set the handling of SIGSEGV to call our callback. */
static void init_signal_handler(void) {
  struct sigaction sigact;
  memset(&sigact, 0, sizeof sigact);
  sigact.sa_handler = segv_callback;
  if (sigaction(SIGSEGV, &sigact, NULL) != 0) {
    fprintf(stderr, "Can't set up SEGV handler: %s\n", strerror(errno));
    exit(1);
  }
}

int main(int ac, char **av) {
  char *init_arg = NULL;
  /* We're going to use the first command line argument, which will
   * specify which callback library to load.
   */
  if (ac < 2) {
    fprintf(stderr, "Usage: stateful_pluggable PLUGIN_FILE [arg]\n");
    exit(2);
  }
  if (ac > 2) {
    init_arg = av[2];
  }
  init_callback(av[1], init_arg);
  init_signal_handler();
  kill(getpid(), SIGSEGV);
  do_stuff();
  return 0;
}

simple_pluggable.c


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <errno.h>
#include <sys/types.h>
#include <dlfcn.h>
#include <unistd.h>  /* sleep, getpid */

/* This is just a function pointer. */
static void (*segv_callback)(int);

/* This is our main service method, which does all the work.
 * (Or it would, if this example were about doing work.)
 */
static void do_stuff(void) {
  fprintf(stderr, "Doing stuff...\n");
  sleep(5);
}

/* This method sets the function pointer segv_callback to point
 * to the method in the shared library file named by callback_lib.
 */
static void init_callback(char *callback_lib) {
  void *callback_handle;
  
  if ((callback_handle = dlopen(callback_lib, RTLD_LAZY | RTLD_LOCAL)) == NULL) {
    /* We failed to open the callback library. Fail. */
    fprintf(stderr, "Failed to open callback library: %s\n", dlerror());
    exit(1);
  }
  if ((segv_callback = dlsym(callback_handle, "segv_callback")) == NULL) {
    fprintf(stderr, "Failed to get callback function: %s\n", dlerror());
    exit(1);
  }
}

/* Set the handling of SIGSEGV to call our callback. */
static void init_signal_handler(void) {
  struct sigaction sigact;
  memset(&sigact, 0, sizeof sigact);
  sigact.sa_handler = segv_callback;
  if (sigaction(SIGSEGV, &sigact, NULL) != 0) {
    fprintf(stderr, "Can't set up SEGV handler: %s\n", strerror(errno));
    exit(1);
  }
}

int main(int ac, char **av) {
  /* We're going to use the first command line argument, which will
   * specify which callback library to load.
   */
  if (ac < 2) {
    fprintf(stderr, "Usage: simple_pluggable PLUGIN_FILE\n");
    exit(2);
  }
  init_callback(av[1]);
  init_signal_handler();
  kill(getpid(), SIGSEGV);
  do_stuff();
  return 0;
}

