-= A novel method for reversible computing =-

o Introduction
------------------------------------------

Traditional reversible computing is focussed on the fundamental physical limits of circuit design. Traditional computing inevitably involves discarding information after each calculation which, due to the second law of thermodynamics, requires energy to be dissipated. Reversible computing offers a solution to this problem. If it were possible to avoid discarding all, or most, of the information then we could reduce the energy being dissipated and, as any self-respecting processor overclocker knows, the cooler the chip, the faster the processor can go.

On top of that, full reversibility also offers several advantages for software, such as error detection, tamper prevention and debugging, and it is these that I'll be concentrating on in this article.

o Exceptions
------------------------------------------

There are obvious advantages to being able to roll back your software to a point in time, in the same way that you have a save point in a video game. Running code can often seem more akin to poking your way slowly through a minefield, especially in a volatile environment. How many times have you seen code that looks like

    int i = NUM_ATTEMPTS;
    /* Wait while the condition holds, for at most NUM_ATTEMPTS sleeps */
    while (i > 0 && test_for_condition()) {
        sleep(10);
        i--;
    }
    if (0 == i) {
        /* We ran out of attempts */
        fprintf(stderr, "There was an error trying to do 'foo'\n");
        return ERROR_CODE;
    }

or

    #define MALLOCTEST(var, type) (var = (type *) malloc(sizeof(type)))
    #define CLOBBER(var) if (var) { free(var); }

    /* Start at NULL so that CLOBBER is safe on the error path */
    foo  *a = NULL;
    bar  *b = NULL;
    quux *c = NULL;

    if (!MALLOCTEST(a, foo))  { goto ERROR; }
    if (!MALLOCTEST(b, bar))  { goto ERROR; }
    if (!MALLOCTEST(c, quux)) { goto ERROR; }

    do_some_stuff();
    return SUCCESS;

    ERROR:
        CLOBBER(c);
        CLOBBER(b);
        CLOBBER(a);
        return FAILURE;

Various languages implement features like exceptions which allow us to do roughly the same thing without the torturous macros.

    try {
        do_something();
    } catch (Exception e) {
        undo_something();
    }

However the same pattern is there: try something, check to see if it failed and then, if it's a fatal error, unwind everything that we've patiently put together and either try again or fail. Frankly this sort of repetitive grind is exactly what computers should be doing for us. They're good at it and, unlike us, they don't make mistakes when they're bored or distracted.

o Rollbacks, Transactions and Commits
------------------------------------------

Database interfaces have recognised this and provide facilities for rollbacks, transactions and commits.

A database commit is used to make a permanent change to a database - if you write a new row to a database then the write does not actually occur until the change is committed. A rollback is used to undo a change to a database - that row write can be rolled back and discarded right up to the point where you commit it.

A transaction consists of one or more SQL statements. Within a transaction you can have multiple SQL statements which read, modify and write to a database. You end the transaction by either committing all of the statements or rolling all of them back.

Transactions are useful when you have a series of writes to make to a database and need to make sure all the writes have happened correctly before committing them. So you wrap them all in a transaction and check for errors each time. If any of them fail you can roll them all back; if they all occur successfully then you can go ahead and commit the transaction.

Now this is more like it. The computer is handling the nitty-gritty minutiae of unwinding what we've done, leaving us to concentrate on the things that the computer isn't smart enough to handle for us.

However there's one small flaw in this plan - this only works for database operations, and really we'd like it to work for all of our code. Now reversible computing would be perfect for this but it's not available.
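
To make the commit and rollback idea concrete outside a database, here is a minimal sketch in Perl that applies it to a single in-memory data structure. The 'transaction' helper and the '%account' data are hypothetical, invented purely for illustration; 'dclone' comes from the core Storable module. The sketch takes a deep copy of the structure, runs a block of code, and restores the copy if the block dies.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical helper: run $code against $state (a hash ref) and
# roll $state back to its previous contents if $code dies.
sub transaction {
    my ($state, $code) = @_;
    my $snapshot = dclone($state);          # deep copy taken before we start
    my $ok = eval { $code->($state); 1 };
    return 1 if $ok;                        # "commit": keep the changes
    %$state = %$snapshot;                   # "rollback": restore the copy
    return 0;
}

my %account = (balance => 100, history => []);

# Succeeds, so the changes are kept
transaction(\%account, sub {
    my $acc = shift;
    $acc->{balance} -= 30;
    push @{ $acc->{history} }, 'withdrew 30';
});

# Dies after making its changes, so both of them are undone
transaction(\%account, sub {
    my $acc = shift;
    $acc->{balance} -= 500;
    push @{ $acc->{history} }, 'withdrew 500';
    die "insufficient funds\n" if $acc->{balance} < 0;
});

print "balance: $account{balance}\n";          # balance: 70
print "history: @{ $account{history} }\n";     # history: withdrew 30
```

Note that this only protects the one structure we explicitly hand to the helper. Anything else the block touches is left exactly as the block left it.
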
So we're going to have to be clever and implement the bits we want in other ways.

o Why implement this in Perl?
------------------------------------------

Perl has gained a reputation as, variously, a jumped up shell scripting language, good only for CGI scripting, or a hodge podge of line noise and badly tacked on Object Orientated features. All this may or may not be true, but Perl is an incredibly powerful programming environment with many rich features, including inspection and modification of the op-tree at runtime and powerful introspection and reflection mechanisms.

Furthermore the Perl community has a long tradition of experimentation and of poking fun at itself. On CPAN, the massive centralised repository of language extensions and helper modules, there is an entire namespace, Acme::, dedicated to joke modules. These range from modules that let you write your Perl in Latin or Klingon to one that corrects spelling mistakes in subroutine names (billed as a killer feature that will mean that Perl is the only language not falling foul of legislation in favour of the disabled - in this case dyslexics). One even prints out error messages in haiku.

Whilst many of these are deeply silly, quite often the namespace is used to demonstrate some obscure Perl arcana or internals wizardry in a light hearted manner. A lot, however, make use of source filters.

o Source Filters
------------------------------------------

Source filters are a feature of Perl that acts a little like the C preprocessor. They allow you to parse your own source code and make modifications before the code is executed. Damian Conway used this feature to implement the modules I mentioned previously that allow you to program in Latin and Klingon and, in tribute to him, I wrote one that allows you to write your Perl in 'Strine' - highly contrived Australian slang.
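
As a rough illustration of the mechanism, here is a minimal sketch using the core Filter::Util::Call module. The keyword 'chuck' is a made-up stand-in for 'return', invented for this example, and is not taken from any of the modules mentioned above. A real filter would normally live in its own module and be installed from that module's import routine; installing it from a BEGIN block, as here, is an assumption made to keep the example in a single file.

```perl
use strict;
use warnings;
use Filter::Util::Call;

BEGIN {
    # Install a closure filter on the file currently being compiled.
    # Each call reads one more line of source into $_ and may rewrite
    # it before the perl parser gets to see it.
    filter_add(sub {
        my $status = filter_read();
        s/\bchuck\b/return/g if $status > 0;
        return $status;
    });
}

# Every line from here on passes through the filter first.
sub double {
    my $n = shift;
    chuck $n * 2;
}

print double(21), "\n";
```

The filter is a blind textual substitution. It knows nothing about strings, comments or scope, so the made-up keyword would be rewritten wherever it appeared.
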

My first thought towards implementing this feature was to use source filters to recognise sections which the programmer had marked as a 'code transaction' and rewrite variable accesses so that everything went through temporary variables, which were written back if the section succeeded. However, despite (or perhaps because of) Perl's richness, the only thing that can reliably parse Perl code is the perl interpreter itself, so I would always have run the risk of misinterpreting some code and silently introducing bugs. Not only that, but it would have been difficult to cope with matters of scope and suchlike. This, then, was clearly a bad strategy.

o Tie-ing the stash
------------------------------------------

Perl has a nifty feature known as tie-ing. What this means is that you can overload one of the basic types - scalar, array, hash and filehandle - and intercept read and write calls. In practice this means that you can intercept all calls to, say, a hash lookup and fetch the resulting value from a database.

    tie my %db, 'Tie::Database', 'our.dbm', 'table';
    my $result = $db{$oid};
    print $result->{name}, "\n";

whilst behind the scenes Tie::Database has done a "SELECT * FROM table WHERE oid=$oid". (It should be noted that there are modules on CPAN that already do something similar and which aren't as contrived as this example - your primary key wouldn't need to be 'oid', for example.)

How does this help us? Well, Perl keeps all current package variables in what's called the stash, which looks remarkably like a hash, and keeps lexical ('my') variables in a similar structure called a pad. Robin Houston wrote a handy little module called PadWalker that allows us to inspect the latter. So, with a little help from Data::Dumper, a module that recursively prints out data structures for us, we can do ...

    use strict;
    use PadWalker qw( peek_my );
    use Data::Dumper;

    my $a = 42;
    my @b = qw(this is an array);
    my %c = ( foo => 'bar' );

    my %pad = %{peek_my(0)};

    for my $key (keys %pad) {
        print "$key = ", $pad{$key}, " => ", Dumper($pad{$key}), "\n";
    }

... which prints out ...

    $a = SCALAR(0x813d3f0) => $VAR1 = \42;

    @b = ARRAY(0x813d414) => $VAR1 = [
              'this',
              'is',
              'an',
              'array'
            ];

    %c = HASH(0x81d12ec) => $VAR1 = {
              'foo' => 'bar'
            };

So the answer is clear - we tie all current namespaces, intercept all calls to get or set variables and save them till later. When we finally 'commit' we'll copy these temporary variables back again and all will be peachy. If we roll back then we can just discard them. Right?

Well, no. Not really. We'll still have problems with scope, and we'll also have problems in that every time we want to write to our temporary stash we'll intercept that call and try to write that into our temporary stash and ...

On top of it all, the stash is not tie-able except in read-only mode. Whilst that has workarounds, this whole solution is starting to look like a bust - far too complicated and altogether too much work.

o Clone
------------------------------------------

What we need is to save a copy of the current state of the Perl interpreter when we start our transaction and then swap it back into place if we do a rollback. Much closer to the spirit of a save in a video game. So how about if we had something like this?

    void do_magic(SV* coderef) {
        PerlInterpreter *orig, *copy;

        /* Get the current interpreter ... */
        orig = Perl_get_context();

        /* ... and clone it */
        copy = perl_clone(orig, FALSE);

        /* Set our copy to be the current running interpreter */
        PERL_SET_CONTEXT(copy);

        /* Evaluate the code passed in */
        perl_call_sv(coderef, G_DISCARD|G_NOARGS|G_EVAL);

        /* Errk, it failed */
        if (SvTRUE(ERRSV)) {
            /* Set everything back to the original */
            PERL_SET_CONTEXT(orig);
            /* And toss the copy */
            perl_free(copy);
        /* ooh, it was fine */
        } else {
            /* Toss the original */
            perl_free(orig);
        }
    }

Looks like it should work, right? Well, it has some limitations, the most important being that it won't work on anything except threaded Perls and not everybody has those.
Perl has a well documented but chequered history with threading and this wouldn't be a terribly portable solution. Also, it didn't work. It probably could have been made to work, but why struggle when you can have a much easier ride?

o What If?
------------------------------------------

So how did we get it to work? Well, the answer was obvious in retrospect - easy, portable and can be done (almost) entirely in Perl space without grubbing around in XS, Perl's extension mechanism.

We use fork().

It is an egregious hack but incredibly nifty in a way - all we have to do is fork() when we enter a transaction. Then we open up a socket between parent and child and get the child to execute the code we've been passed. If it doesn't work then we close the child. If it does then we pass a message to the parent, which closes itself up, leaving the child in place.

    my $foo = "outside";
    whatif {
        $foo = "inside";
    };
    print $foo; # prints "inside"

    my $bar = "outside";
    whatif {
        $bar = "inside";
        die "Throw an exception for some reason";
    };
    print $bar; # prints "outside"

The syntactic sugar takes advantage of Perl's function prototypes - if we declare the 'whatif' subroutine like this

    sub whatif(&) {
        ...
    }

then Perl knows that it's expecting a code block as the first argument and, instead of having to do

    whatif(sub { ... });

which is equivalent to

    my $coderef = sub { ... };
    whatif($coderef);

we can simply write

    whatif { ... };

o If Only
------------------------------------------

With its

    whatif { ... };

syntax our transactionable block is beginning to look rather like Java exceptions with their

    try { ... } catch { ... };

blocks. Whilst the 'whatif' block can undo most things, it obviously can't touch anything external such as a database transaction or a file access. So what we need is something akin to a catch {} block that will let us clean up all the places that 'whatif' can't.

Enter

    ifonly { ... };

If we redeclare the prototype of 'whatif' to be

    sub whatif (&;@) {
        ...
    }

(that is, 'a code ref optionally followed by any number of other arguments') then the parser will let us add another subroutine afterwards. If we give that subroutine a prototype of (&) as well then it can also take a code block. In effect what

    whatif { ... } ifonly { ... };

actually represents is

    whatif( sub {}, ifonly( sub {} ) );

and means that we can do some stuff like

    my $dbh = get_db_handle();
    whatif {
        $dbh->prepare($statement);
        do_some_other_stuff();
        die "risky stuff didn't work" unless risky_stuff();
    } ifonly {
        $dbh->rollback();
    };
    $dbh->commit;

o Bait and Switch
------------------------------------------

This does leave one issue though. By forking we're changing the process id, and so we're going to end up with a different environment from the one we had before the start of the transaction. Whilst it's only one little thing it's annoying, and we've come so far that there's no point in spoiling things for a ha'pennyworth of tar.

Fortunately we don't actually have to change the real process id, since Perl scripts rely on the special variable '$$' to tell them what it is. Therefore all we need to do is execute a man-in-the-middle attack. Only this has one small problem - since '$$' is special it's read-only and we can't go monkeying around with it.

Or can we?

Perl has a philosophy of trusting its programmers and trying not to get in the way. It knows that sometimes programmers are going to knowingly and willingly break the rules and that they're prepared to take responsibility for their actions. As Piers Cawley, a noted Perl programmer, once said, "Actually Perl *can* be a Bondage & Discipline language, but it's unique among such languages in that it lets you use safe words."

In other words - it's better to know that your programmers may do something in the spirit of

    #define private public
    #include "header_with_lots_of_private_functions.h"

and prepare for it.
And therefore we can start grubbing round in Perl's internals and change the value of $$ using some fairly simple XS code

    void
    setreadonly(name, value)
        char * name
        int    value
      CODE:
        GV *tmpgv;
        if ((tmpgv = gv_fetchpv(name, TRUE, SVt_PV))) {
            SvREADONLY_off(GvSV(tmpgv));
            sv_setiv(GvSV(tmpgv), value);
            SvREADONLY_on(GvSV(tmpgv));
        }

What this does is, given the name of a variable, fetch it, turn off the read-only flag, set the variable to the value passed in and then turn the read-only flag back on. And we use it, from Perl, like this

    setreadonly('$', $new_process_id);

o Conclusions
------------------------------------------

I hold my hands up and admit it - in the guise of an article on reversible computing I've also taken the opportunity to evangelise about Perl a little bit and give a stealth tutorial on some of the lesser known but nevertheless cool features it possesses. I couldn't think of any way to bring AUTOLOAD into the article, but if you've found this interesting then you may want to go and read about that.

Whatif started off as an intellectual challenge, posed by Mark Fowler one day, and was originally intended as a joke module to be placed in the Acme:: namespace. However I started to find it useful and decided to leave it in the non-jokey mainstream - if only out of vanity. I then started to get a trickle of feedback that indicated that people were, amazingly, finding it useful, to the extent that at some point someone suggested it as a feature that should be incorporated natively into the next major rewrite of Perl.

o References
------------------------------------------

- http://www.perl.com/lpt/a/2001/10/31/lighter.html
- http://search.cpan.org/dist/Whatif/Whatif.pm

------------------------------------------

o TODO
------------------------------------------

- More cool examples of Whatif?
- More code from Whatif.pm - build up bit by bit?
- Spell check
- Check tone
- Conclusion
- References
- Nested Whatif blocks