Common Object Language Project Homepage


PART I

A Distributed, Higher-Order External Module Interface for Declarative Languages Based On Common Object Language

Preliminary Tutorial and Request for Comments

I have been working on a generic interface definition language that allows programs written in different languages to share, at runtime, high-level structures in a distributed-processes environment. The framework is designed with declarative languages in mind, something none of the current frameworks such as CORBA or Microsoft .Net really addresses. I've written the draft of a research paper which is available at www.cs.hofstra.edu/~cscccl/col/colpaper.ps.

In order to re-produce the experiments described here, the user is assumed to have some rudimentary knowledge of how to use Teyjus Lambda Prolog, gcc, Gnu Prolog, and the MIT version of Scheme. I have now began implementing mechanisms based on the framework, which I hope will prove to be useful in the very near future. The basic framework centers on using a representation of simply typed lambda terms as a communications protocol between languages. It is used in the style of CORBA in that "meta-compilers" (called IDL compilers in CORBA) are provided that generate code stubs for various languages (currently limited to Lambda Prolog, C, and Scheme). This setup eliminates the need to alter the low-level implementations of languages, as is the case with the Microsoft .Net framework.

The first realistic manifestation of the general "middleware" framework is a usable external language interface for Lambda Prolog, Prolog and Scheme. Teyjus Lambda Prolog programs are able to make calls for the remote-evaluation of terms. Currently, Lambda Prolog programs can only act as caller (client). Near-future developments will also allow other languages to access Lambda Prolog services. The following is a half-tutorial, half-description of the how the mechanism works.

The files referred to here are available at www.cs.hofstra.edu/~cscccl/col/col.tar.gz. To use it you must have Teyjus Lambda Prolog already installed. To help clarify how one can expect to use the system, files highlighted in bold are files the users must write. All other files are automatically generated or are static components of the system.

Let's say you want to write a set of procedures in C that can be called from Lambda Prolog. You start by writing a signature in a "common object language" (COL). The word "object" has nothing to do with OOP. It indicates an "object-level" language as opposed to the "meta-languages" (lambda prolog and C) that implement it, and communicate through it.

The following is the contents of the file circles.col, which contains such a signature:

% COL signature for circle functions  "circles.col"  
square : int -> int
dist   : int -> int -> int -> int -> real
circu  : real -> real
$

The $ indicates end of file. Comments can be inserted into a signature in the % style of lambda prolog, but this is not a lambda prolog signature. The base types are int, real, string, and user defined types (see second example below). You can't have type variables (see my paper). The file must have a ".col" suffix. You then run the appropriate meta-compilers to generate a lambda prolog program stub as well as the a code stub for C. The meta-compilers are also written in lambda prolog, though they can conceptually be written in any language.

Do the following:

> tjsim mclp       (after tjcc mclp of course)
[mclp] ?- collpmc "circles", stop.
> tjsim mcc
[mcc] ?- colcmc "circles", stop.

"mclp" stands for "meta-compiler for lambda prolog" and "mcc" that for C. "collpmc" generates the files:

These files accumulate the Lambda Prolog module "collp", which is the base code-stub for lambda prolog. collp.mod is worth glancing at, especially at the predicate "remote_eval".

The "colcmc" goal generates the C files

That #includes the base-files colcserver.h and colcserver.c. The contents of "circlessrv.h" includes headers for functions you will need to write in C, such as "int square(int);" (It also includes headers for functions that are generated automatically in circlessrv.c - the user can ignore these).

You then need to write a program that #include "circlessrv.h", and implements the functions specified in the COL signature. While writing this you do not need to be aware of how the underlining mechanisms of the COL framework work. This means that you can give them to somebody who knows nothing about lambda prolog to write.

The file circlesimp.c implements these services. Please take a look. This is all the code *you* have to write in addition to the initial signature.

Compile the C side with

> gcc -lnsl -lsocket -lm colcserver.c circlessrv.c circlesimp.c -o circsrv

(on solaris; drop the -l's for linux)

And start the server with

> ./circsrv &

You can now write a lambda prolog program that accumulates the "circles" module (which was generated by the "mclp" meta-compiler). In such a program, you can call for remote services by opening a "col_session" via the goals

  open_col_session "localhost" S, 
  remote_eval S TERM RESULT, 
  ...
  close_col_session S.

The string "localhost" is an IP host name, since the underlining implementation currently uses TCP/IP sockets. Future developments will eliminate networking references from the interface - there will a generic daemon process that relays requests. A user will then open a session by giving the name of a signature instead of a hostname.

You can remotely evaluate any TERM of the signature "circles.sig". Try the following:

---
( tjcc circles first )
> tjsim circles
[circles] ?- open_col_session "localhost" S,
             remote_eval S (square (square 3)) T,
             close_col_session S.

The answer substitution:
T = 81
S = scsession ... 

More solutions (y/n)? 
---

Also look at usecircles.mod. Note in this program that one can use Lambda Prolog's implication in goals (=>) to memorize sessions to avoid having to pass them around as parameters. Try

> tjsim usecircles.
[usecircles] ?- test (circle (point 0 0) 1.5).

The key point here is that, except for the special open_col_session and remote_eval predicates, the user need not be aware of the underlining mechanism.

(CAVEAT: currently, the C server is not multi-threaded. This means you have to close_col_session before opening another one).

When writing the C functions, you do have to remember that the client and server are distinct processes. This means, for example, that if you call "getpid" to get the Unix process ID, it will be the ID of the remote C process, not that of the lambda prolog calling process. You can use side-effects locally but the COL interface itself is purely functional. Also, COL declarations such as a:string are translated into the C header char* a();


Higher Order Remote Procedure Calls

The second example uses Scheme (MIT version) to provide a set of services to Lambda Prolog (or potentially any client language). This example includes passing lambda prolog lambda terms to higher-order scheme functions. Functions are indeed first-class citizens in COL. The following COL signature are the contents of lists.col:

---
% COL signature for integer lists, and some arith functions  "lists.col"
tuple   : type
nul     : tuple
new     : int -> tuple -> tuple
sqr     : int -> int
plus,times : int -> int -> int
size    : tuple -> int
mapfn   : (int -> int) -> tuple -> tuple
fold    : (int -> int -> int) -> tuple -> int
$
---

Note that in COL you can define new types. The COL does not include built-in types for lists. However, at a very slight loss of referential transparency, you can implement lists by defining new types (tuple) and constructors (new and nul). Terms of this signature such as (new 3 nul) can be used to represent lists. A simple conversion predicate between such lists and native lambda prolog lists can be given (predicate convtuplist in uselists.mod).

Run the meta-compilers:

---
> tjsim mclp
[mclp] ?- collpmc "lists", stop.
> tjsim mcs
[mcs] ?- colsmc "lists", stop.
---

As before, the meta-compiler "mclp" for lambda prolog generates the files

The meta-compiler for scheme "mcs" generates the files The second file is for use when scheme is the caller (see use below). For now, I will discuss the first file, which will (load "colscheme.sch"), the base program stub for Scheme.

You need to write a scheme program that (load "lists.sch") in turn, and which implements the specified functions.

The file listsimp.sch implements these functions.

Notice that most of the functions are just Curried forms of regular Scheme functions, since COL doesn't have product types (and neither does lambda prolog). Some versions of Scheme, including MIT Scheme used here, do not treat (f a b) and ((f a) b) as equivalent. The function "fromlist" converts a regular scheme list into an appropriate term of the "lists" signature, e.g., '(2 3) to ((new 2) ((new 3) nul)). This function needs to be called before a scheme function returns a list back to lambda prolog. This is a loss in referential transparency, because users need to know when it's needed and when it's not. However, you can also think from the perspective that users need to know the signature of terms they are allowed to return. From this perspective there is no loss of transparency :-).

Kill the C server, start the Scheme server (in a separate window) by:

---
> scheme (assume MIT version, other versions need minor modification)
1 ]=> (load "listsimp.sch")
1 ]=> (servecol)
---

This will work for most versions of MIT scheme that people actually have. But *theoretically*, their latest version has changed certain things. If you have one of the latest releases of MIT scheme (7.5 or later I think), try using (startserver) instead of (servecol).

You can now write a lambda prolog program that uses the predicate "callscheme" (defined in "collp.mod"). This predicate is identical to "remote_eval" except in a minor detail in how a term is sent out over a socket. In the future I will eliminate this difference, since ideally your lambda prolog program shouldn't be aware of what language the functions are implemented in.

Have a look at uselists.mod, a sample lambda prolog program that uses the lists services. Running it will yield:

---
> tjsim uselists
[uselists] ?- test [2,3,5,7,11,13].
the size of the list is 6
the sum of the list (called with fold) is 41
and here's the list with everything incremented (called with mapfn):
  3 :: 4 :: 6 :: 8 :: 12 :: 14 :: nil

yes
---

Notice the two higher-order remote-evaluation calls in uselists.mod:

      callscheme S (fold plus T) Sum,
      callscheme S (mapfn (x\ (plus x 1)) T) TM

The second goal involves an anonymous lambda term. These terms are entirely native to lambda prolog - they are terms of the signature "lists.sig", which was generated from the col signature. The slight loss in transparency is again in the use of the "convtuplist" to convert native lists to "lists.sig" structures (not to be confused with the underlining COL representation, which looks something like capp (id 3) (cic 2)...).

The Scheme language is very flexible. It is possible to "fake" higher-order abstract syntax using *quote* and *eval*, up to a point. This is how I managed to translate a common, discrete representation of lambda terms into native scheme abstractions. Unlike with lambda prolog however, this translation is not declarative (see my paper). But it remains useful. No degree of transparency is lost in calling a Scheme function with a higher-order parameter. However, there IS a loss of transparency if you want to return an anonymous Scheme lambda term back to lambda prolog. This must be done in a special way: you must return something like

   (mksclosure '(lambda (x) (sqr x)) (the-environment))

from your programs. A function ("sctau" in colscheme.sch) descends into the structure of this "symbolic" lambda term to create a COL representation of lambda terms that will be transmitted back to a remote process. It needs the environment to "eval" symbols encountered inside the term (the-environment returns the current environment in MIT scheme; eval requires an environment argument). In a language such as Lambda Prolog that can truly support higher-order abstract syntax, no such devices are needed. The ability to descend into the structure of lambda terms is of course one of the things lambda prolog do best. A language that can't support higher-order abstract syntax must either implement specialized, implementation-dependent features, or suffer a loss of referential transparency when communicating higher-order structures with distributed processes. This is perhaps an advantage of higher-order abstract syntax that has not been previously noticed.

If you try to compile "lists.col" for C it will fail because the signature is not first-order. Theoretically, however, since COL is an object-level language, it can be encoded in any language. It's possible to define a first-order structure for lambda terms in C. One can still perform computations on these structures, and translate them into the COL representation for transmission to a remote process, but they will not be functions of the native languages, and thus the user must observe special forms. I decided not to implement the higher-order fragment of COL for C. The file folists.col and those automatically generated from it defines the first-order fragment of lists.col. The loss of transparency due to the use of user-defined types is also greater in the case of C than Scheme or Lambda Prolog (see comments in folistsimp.c).

***********

(CAVEAT:

It's important to point out that the "closure" form described above is only used to translate native Scheme abstractions to COL form. Closures are not transmitted between processes. Likewise, when a C function returns a char*, the string is copied to a buffer which is transmitted over a socket. It does not mean that the remote process will receive a pointer.

There are a couple of other restrictions for a scheme program to return lambda terms. The terms must be in beta-eta-long normal form (with respect to COL types). Procedures exist to do this automatically but I've decided to leave them out for now for efficiency concerns. On the lambda prolog side, the term you expect to get back from a remote process should be of no higher than third order (you can embed still-higher order terms using a constructor). COL type-checking is not yet implemented for Scheme, but this is not a problem as long as Scheme is only used as server. Look at the "callscheme" predicate in collp.mod and you'll see that the COL *object-level* type returned by Scheme must be consistent with that of the term sent out.)

Also, the temporarily-fixed COL TCP port is 20027. A C server, even if it crashes, can re-bind to this port immediately. But if a Scheme server crashes, you may have to wait a few minutes before restarting it. Only one sever can currently be active on a host, but this will change when I have the "broker" daemon set up.

***********


Calling C from Scheme

The COL framework was not designed just for Lambda Prolog. It forms the basis for interoperation between many languages. Currently, a Scheme program can call a C process (or another Scheme process).

There are two ways for a Scheme function to access COL services. Compile the circles.col signature into a Scheme stub, start the C server, and do the following:

---
> tjsim mcs
[mcs] ?- colsmc "circles", stop.     (generates circles.sch)
> circsrv &      (start C server)
> scheme
1 ]=> (load "circles.sch")

1 ]=> (define S (open_col_session "localhost"))

1 ]=> (remote_eval '(square (square (square 2))) (the-environment) S)

;Value: 256

1 ]=> (close_col_session S)
---

Just as regular eval, remote_eval needs an environment argument, plus the additional "session" argument.

There is another way for Scheme to call for remote evaluation that is much more transparent. Take a look at the file circlesclt.sch, which was also generated by the meta-compiler. You'll see that there are functions that simply hides the "remote_eval" mechanism. They are self-contained, and can be called as any regular function. I've even rendered them in non-curried form for ease of use in Scheme programs.

To use this client stub, you simply (load "circlesclt.sch"), then make calls such as

1 ]=> (squareR (squareR 3))

These functions are distinguished by the "R" appended to their names. You can call them as you would any scheme function. The only exception is that, if you pass a lambda term to a function, you must quote it. If the lambda term contains free instances of locally-meaningful variables, you must also use the "mksclosure" form described above. For example:

1 ]=> (define l (fromlist '(1 2 3 4)))
1 ]=> (mapfnR '(lambda (x) (times x x)) l)

;Value 1: (1 4 9 16)

1 ]=> (let ((y 3)) 
        (mapfnR (mksclosure '(lambda (x) (times x y)) (the-environment)) l))

;Value 2: (3 6 9 12)

(And if you are using "listsclt.sch", be sure to (load "listsimp.sch") as well, or else it won't know how to translate lists to local forms).

Currently Scheme is the only language that can be both caller and callee. Lambda Prolog can only be caller and C only callee. But this situation will change as more languages are added to the framework. Look for ML, Java, and "regular" Prolog in about a year (or less).

FINALLY,

Be aware that the project is still in the experimental stage. I implemented each component carefully but can't guarantee against all potential problems. Also, please at least skim through my paper.

And please email me your comments at liang@cs.umn.edu or cscccl@hofstra.edu.