Scsh | Hummus and Magnets

I use shell scripts a lot. To automate things I do often. To tie together different commands like grep and sed. For one-shot tasks like running the same command 100 times and calculating the average execution time. Basically, when I want to instruct my machine to carry out some operation I'll almost always do it by invoking a shell script, usually one I've written myself.

Up until I left the v8 project I'd written most shell scripts in bash or, if that became too horrible, python. But when I left v8 to work on wave I made a decision: no more. I don't want to be stuck with a choice between a language that is, frankly, grotesquely horrible, bash, or one that is okay but just not made for what I was using it for, python.

That's when I remembered scsh, the scheme shell (pronounced "skish", rhymes with fish). I had read Olin Shivers' report on it and of course the famous acknowledgements and I'd always thought it was a brilliant idea to use scheme as a glue language. I'd never actually tried the tool though so I decided that this was the perfect time to give it a try.

The latest release of scsh is from 2006. You don't get the impression that it's a project under active development. It's also not available on most systems I use. This would normally have put me off, but the thought that it might rid my life of bash motivated me to give it a try anyway.

On mac it's easy to install, just

sudo port install scsh

On my linux machine I had to build and install it myself, and I had to tweak the build files a little for it to build on a 64-bit machine; if I remember correctly all I had to do was add -m32 at the right place in a generated Makefile.

Having installed it I was ready to start running shell scripts written in scheme. Or so I thought. I wrote my first script,

#!/usr/local/bin/scsh

(display "Hello World!")

and ran it. No dice.

$ ./helloworld.ss
Unknown switch ./helloworld.ss 
Usage: scsh [meta-arg] [switch ..] [end-option arg ...]
meta-arg: \
switch:
... snip ...
-s <script> Specify script.
... snip ...

Ah, I forgot to use the -s option. Add that, try again:

$ ./helloworld.ss
Error: EOF inside block comment -- #! missing a closing !#
       #{Input-port #{Input-channel "./helloworld.ss"}}

Okay, now it's running the script in scsh but it chokes on the #!. For a language designed to run shell scripts it's surprisingly uncooperative. After experimenting for a while and a few google searches I came upon the required magic incantation. My script was now:

#!/usr/local/bin/scsh -s
!#
(display "Hello World!")

While this violates the principle of least astonishment (POLA) in 100 different ways, it works. Yay! It works because #! ... !# happens to be the block comment syntax in scheme, or at least in scheme48, which is the implementation scsh is based on.
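The usage message above also lists a \ meta-arg. As a minimal sketch, assuming scsh lives at /usr/local/bin/scsh as in the examples here, a header can use it to pull further switches from the second line of the script, which makes room for an entry point given with -e:

```scheme
#!/usr/local/bin/scsh \
-e main -s
!#

;; With -e main, scsh calls (main args) once the script is loaded.
;; args is the command line as a list, with the script name first.
(define (main args)
  (display "Hello World!")
  (newline))
```

The first two lines plus the closing !# still form one block comment, so the scheme reader skips them just as it does in the shorter header.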

Okay, now I could start actually using it to write scripts. The first thing I wanted to implement was a set of wrappers that helped keep track of a handful of git clones of the same underlying non-git repository. That way I can keep tests running and build output intact in one workspace while I work on something else in another separate one, something that using different git branches in the same workspace doesn't give you.

Scsh uses macros and unquote to run external commands so for instance this function,

(define (git-new-branch name)
  (run (git checkout -b ,name)))

will run the command

git checkout -b <name>

The , means that the value of the parameter name should be inserted there. The output goes to standard output. You can also get the output back as a list of strings by using run/strings instead of run. As an example of using that here's a function that returns whether or not a git repository has pending changes:
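To make the difference concrete, here is a small sketch using echo as a stand-in command. The function names are made up for this example; run, run/strings and the , and ,@ unquoting are scsh's own:

```scheme
;; run sends the command's output straight to standard output.
(define (say word)
  (run (echo ,word)))

;; run/strings captures the output instead and returns it as a
;; list with one string per line.
(define (say/lines word)
  (run/strings (echo ,word)))

;; ,@ splices a whole list of arguments into the command.
(define (say-all/lines words)
  (run/strings (echo ,@words)))
```

Calling (say/lines "hello") should give back a one-element list holding the line that echo printed, rather than printing anything itself.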

(define CHANGE-RX
  (rx (| "Changed but not updated:" "Changes to be committed:")))
(define (git-has-changes)
  (call-with-current-continuation
    (lambda (return)
      (define (process-line line)
        (if (string-match CHANGE-RX line)
            (return #t)))
      (let ((output (run/strings (git status))))
        (map process-line output))
      (return #f))))

This code runs git status and then iterates through the returned strings looking for either Changed but not updated: or Changes to be committed:, returning immediately when it finds one. Scsh comes with a rich regular expression library, which is the rx part above. It's more verbose than POSIX regexps and lacks some of their conveniences, but on the other hand it is much more straightforward and readable and seems to be at least as powerful, if not more.
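The early exit doesn't have to go through call-with-current-continuation. As a sketch, assuming the regexp-search? predicate from scsh 0.6, the same check could be written as a plain recursive loop. The helper name any-line-matches? is made up for this example:

```scheme
(define CHANGE-RX
  (rx (| "Changed but not updated:" "Changes to be committed:")))

;; Returns #t as soon as one of |lines| matches CHANGE-RX,
;; and #f if none of them do.
(define (any-line-matches? lines)
  (cond ((null? lines) #f)
        ((regexp-search? CHANGE-RX (car lines)) #t)
        (else (any-line-matches? (cdr lines)))))

(define (git-has-changes)
  (any-line-matches? (run/strings (git status))))
```

Either version stops at the first matching line; which one reads better is mostly a matter of taste.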

At this point you may say: hey, I could have written that function in one line using grep. And you could. The difference is that when I use grep the complexity of my script increases exponentially with the complexity of what I'm trying to accomplish. With scsh a script may start out a bit more verbose, as above, but when the script grows a little more complex, as they tend to do, I can solve the problem using standard high-level programming constructs that are already hardwired in my brain instead of having to pore over the grep manpage to figure out how I make it do what I'm trying to do.

For instance, I can write a one-line script using find and sed that removes all lines containing a.b.c.X from a file, easy. But if I want to extend my script a bit so it only removes a.b.c.X when it occurs within a block of lines enclosed in square brackets that also contains a.b.c.Y, the problem has become too complex for me to solve by chaining together shell commands. I'm sure it can be done but I would have to spend an inordinate amount of time figuring out how. On the other hand, doing this in scsh I can solve each individual problem separately: finding blocks enclosed in square brackets, searching for a.b.c.Y, deleting lines containing a.b.c.X, and then combine the individual operations using standard language constructs.

;; This script removes all lines enclosed in brackets containing
;; |to-be-removed| but only if the block also contains
;; |removal-indicator|.
(define (main args)
  (let ((to-be-removed (cadr args))
        (removal-indicator (caddr args))
        (file-name (cadddr args)))
    ;; Regexp matching python lists
    (define LIST-RE
      (rx (: "[" (submatch (* (~ "]"))) "]")))
    ;; Regexp matching lines containing |to-be-removed|.
    (define STRIP-REMOVED-RE
      (rx
        (: #\newline 
           (* (~ #\newline)) 
           ,to-be-removed
           (* (~ #\newline)))))
    ;; Processes all matches
    (define (process-input str)
      (regexp-substitute/global
        #f LIST-RE str 'pre process-list 'post))
    ;; Processes the contents of square brackets
    (define (process-list match)
      (let ((result-contents (remove-if-required
                               (match:substring match 1))))
        (string-append "[" result-contents "]")))
    ;; Removes all occurrences of |to-be-removed| where it is
    ;; together with |removal-indicator|
    (define (remove-if-required str)
      (if (and
            (string-contains str removal-indicator)
            (string-contains str to-be-removed))
          (strip-removed str)
          str))
    ;; Removes one line containing |to-be-removed|
    (define (strip-removed str)
      (regexp-substitute/global
        #f STRIP-REMOVED-RE str 'pre "" 'post))
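    ;; Read the file (at most 100000 characters), close it, then
    ;; write the processed contents back to the same file.
    (let* ((input-port (open-input-file file-name))
           (input (read-string 100000 input-port)))
      (close-input-port input-port)
      (let ((output-port (open-output-file file-name)))
        (display (process-input input) output-port)
        (close-output-port output-port)))))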

This is more verbose but took a lot less time to write and debug than it would have taken me to write an equivalent bash script, and it will be much easier to understand and extend later on. And this is scsh competing with bash where bash is strong. As the complexity of your script increases the power of scsh's abstractions becomes more and more apparent. Here's the command I use to update all my git clones from the central repository, using one über-workspace that stays in sync with the underlying repository and then a number of unter-workspaces that clone the über-workspace:

;; Performs the work of a 'sync' operation.
(define (run-sync)
  (within (@workspace UBER-WORKSPACE)
    (within (@git-branch MASTER-BRANCH-NAME)
      (sync-from-repository)))
  (for (workspace in UNTER-WORKSPACES)
    (within (@workspace workspace)
      (pull-from-master workspace))))

This is not pseudo-code, this is literally the code that is run. The within form takes care of entering something, a directory, branch or whatever, carrying out an operation and leaving again, dealing gracefully with errors at any point. The sync-from-repository and pull-from-master functions are straightforward one-line calls to external tools. I usually wrap external calls in a function that includes logging to make debugging easier.
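The definitions of within and for aren't shown in this post. As a minimal sketch of how such forms could be built with syntax-rules and dynamic-wind, assume a context is just a pair of thunks, one to enter and one to leave. The make-context helper and this version of @workspace are hypothetical; cwd and chdir are scsh procedures:

```scheme
;; Hypothetical representation: a context is a pair of thunks,
;; one run on entry and one run on exit.
(define (make-context enter leave) (cons enter leave))

;; (within context body ...) runs body between the context's enter
;; and leave thunks. dynamic-wind also runs the leave thunk when
;; body exits non-locally through a continuation.
(define-syntax within
  (syntax-rules ()
    ((within context body ...)
     (let ((ctx context))
       (dynamic-wind
         (car ctx)
         (lambda () body ...)
         (cdr ctx))))))

;; (for (var in lst) body ...) evaluates body once per element.
(define-syntax for
  (syntax-rules (in)
    ((for (var in lst) body ...)
     (for-each (lambda (var) body ...) lst))))

;; A context that changes into |dir| on entry and restores the
;; previous working directory on exit.
(define (@workspace dir)
  (let ((previous #f))
    (make-context
      (lambda () (set! previous (cwd)) (chdir dir))
      (lambda () (chdir previous)))))

;; Usage: print the working directory from inside each workspace.
(define (show-workspaces dirs)
  (for (dir in dirs)
    (within (@workspace dir)
      (run (pwd)))))
```

Keeping the enter and leave steps in one value is what lets a single within form serve directories, branches and anything else that can be entered and left.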

The above function uses a number of generally useful abstractions, including the within and for forms. It should be noted that these are not built into scsh; I defined them myself using scheme's define-syntax. You would obviously like these abstractions to live in separate files that can be shared between different scripts. Importing or including other files is not one of scsh's strong suits. There is a module system that I have no doubt is powerful and clever, but I just didn't have the patience to figure out how it works, so I use a scsh runner script that takes care of loading libraries before starting the script:

#!/bin/sh
#
# Usage: scsh.sh <scsh script> <options> ...
#
# Loads the specified scsh script and all .sm files in the same
# directory and calls the 'main' function.
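SCRIPT=$1
ROOT=`dirname "$SCRIPT"`
# Build one "-l <file>" switch per library; the glob expands in sorted order.
LIBS=""
for LIB in "$ROOT"/*.sm; do
  [ -f "$LIB" ] && LIBS="$LIBS -l $LIB"
done
MAIN=main

exec /usr/local/bin/scsh $LIBS -e $MAIN -s "$@"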

Using this script rather than calling scsh directly I can factor utilities out into .sm (scsh module) files and have them loaded automatically. With a library loading mechanism, a bit of practice with scheme and some basic convenience utilities in place, scsh is an extremely powerful tool. Here are some more examples taken from my scripts.

Here's an example of defining command-line options:

(define parse-options
  (option-parser
    ((--runs r)
      (set! number-of-runs r))
    (--help
      (exit-with-usage))))

The option-parser form lets you define a number of command-line options and the action to perform when the option is encountered. It returns a function that performs the appropriate processing and returns a list of those arguments that were left when removing all the ones that were recognized.
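The option-parser form itself isn't shown here. As a rough and purely hypothetical sketch of the behaviour just described, here is a version written as a plain procedure rather than a macro. The name make-option-parser and the (name takes-value? action) entry layout are assumptions made for this example, not the real definition:

```scheme
;; |handlers| is a list of (name takes-value? action) entries.
;; The returned procedure runs the action for every recognized
;; option and returns the arguments it did not recognize.
;; An option that takes a value must be followed by one.
(define (make-option-parser handlers)
  (lambda (args)
    (let loop ((args args) (rest '()))
      (cond ((null? args) (reverse rest))
            ((assoc (car args) handlers)
             => (lambda (entry)
                  (let ((takes-value? (cadr entry))
                        (action (caddr entry)))
                    (if takes-value?
                        (begin (action (cadr args))
                               (loop (cddr args) rest))
                        (begin (action)
                               (loop (cdr args) rest))))))
            (else (loop (cdr args) (cons (car args) rest)))))))

(define number-of-runs 1)

(define parse-options/sketch
  (make-option-parser
    (list (list "--runs" #t
                (lambda (r) (set! number-of-runs (string->number r))))
          (list "--help" #f
                (lambda () (display "Usage: see script header") (newline))))))
```

With these definitions, (parse-options/sketch '("--runs" "100" "foo.txt")) sets number-of-runs to 100 and returns the list ("foo.txt").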

This command enters the branch in the über-workspace that corresponds to your current git branch in an unter-workspace and asks for the underlying changelist id:

(define (get-current-cl)
  (let* ((all-branches (git-list-branches))
         (current-branch (car all-branches)))
    (within (@review-client current-branch)
      (get-current-cl))))

Again, this command actually does a lot of work but you hardly notice because it's all been packed away in the various abstractions. You only see the high-level structure of what's going on.

This command pushes the current unter-workspace branch to the über-workspace, enters that branch and exports it as a change against the underlying repository:

(define (run-export)
  (let* ((all-branches (git-list-branches))
         (current-branch (car all-branches)))
    (git-push "origin" current-branch)
    (within (@workspace UBER-WORKSPACE)
      (within (@git-branch current-branch)
        (export-change)))))

Overall I'd suggest that if you've ever been frustrated with traditional shell scripts and have a basic knowledge of scheme, or want to learn it, you should give scsh a chance. I've never been a lisp or scheme fanatic but despite some amount of unfriendliness from the tool itself, including the odd way you have to invoke it and poor error reporting, I've been totally won over. Scsh FTW!