This is a collection of some things I would enjoy looking into, though I probably will never have the time for most of them. Feel free to contact me if you want to communicate with me about any of this.
Many interesting computations can be seen as transformations of values from one coordinate system, reference system, or unit to another.
These considerations will influence whether the design is centered around the transformations (apply this transformation to this data) or around the reference systems (transform this data into that reference system).
There are also efficiency concerns.
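A minimal sketch of the two designs, with invented names, using temperatures as the example (and assuming that a single canonical system suffices):

```java
import java.util.function.Function;

// Hypothetical sketch: two ways to organise unit conversions.
public class Units {
    // Design 1: centered on the transformations -- conversions are
    // first-class values that you apply to data.
    static Function<Double, Double> celsiusToFahrenheit = c -> c * 9 / 5 + 32;

    // Design 2: centered on the reference systems -- every system knows
    // how to convert to and from one canonical system (here: Celsius).
    interface TemperatureSystem {
        double toCelsius(double value);
        double fromCelsius(double celsius);
    }

    static final TemperatureSystem FAHRENHEIT = new TemperatureSystem() {
        public double toCelsius(double f) { return (f - 32) * 5 / 9; }
        public double fromCelsius(double c) { return c * 9 / 5 + 32; }
    };

    static final TemperatureSystem KELVIN = new TemperatureSystem() {
        public double toCelsius(double k) { return k - 273.15; }
        public double fromCelsius(double c) { return c + 273.15; }
    };

    // With design 2, any-to-any conversion is composition through
    // the canonical system: n systems need n conversions, not n^2.
    static double convert(double value, TemperatureSystem from, TemperatureSystem to) {
        return to.fromCelsius(from.toCelsius(value));
    }

    public static void main(String[] args) {
        System.out.println(celsiusToFahrenheit.apply(100.0)); // 212.0
        System.out.println(convert(212.0, FAHRENHEIT, KELVIN));
    }
}
```

The quadratic-versus-linear number of conversion routines is one of the efficiency trade-offs between the two designs.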
There are many scenarios where (formal) languages are embedded inside other languages. For instance, suppose people write a tool to translate text from one language A into another language B, and they also have a rule language R in which general transformation rules can be written down. Then sentences of A and B have to be embedded into the rule language R. I have some ideas on how to cope with the resulting parser chaos, which are written down here.
One (not totally unexpected) lesson from the KeY project is that it is hard work to design a logical calculus that can be used to reason about a real programming language like Java. If one wants to stick to rules with a relatively simple structure, one needs hundreds of them. On the other hand, most of these rules correspond to the operational semantics of the language in a more or less straightforward way.
Now I wonder whether one could generate most of such a calculus from another representation of the semantics, namely an interpreter, written in a sufficiently simple language.
It is pretty clear that one can (in theory) use a calculus for the language in which an interpreter is written to verify code in the interpreted language, simply by symbolically executing the interpreter on the given input. But this is not very nice to work with.
One can also use a compiler to translate the program into a simpler language, and then use a calculus for that simpler language. But then one still has to do the proofs for the translated program.
Couldn't one take an interpreter/compiler/partial evaluator or whatever (probably written in a language specifically restricted for this purpose) and use it to determine the behaviour of a calculus, in such a way that the underlying base language remains invisible to the user? Maybe by using a calculus for this simpler base language?
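A toy illustration of the main ingredient this would need: an interpreter for a made-up minimal expression language, written in a deliberately restricted style (only case analysis on the syntax plus recursion), so that each case of the interpreter is a plausible candidate for one calculus rule. All names here are invented for illustration.

```java
import java.util.Map;

// Hypothetical toy language: numbers, variables, addition, a conditional.
// Each case of eval() mirrors what might become one rule of a calculus.
public class Interp {
    interface Expr {}
    record Num(int value) implements Expr {}
    record Var(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record IfZero(Expr cond, Expr thenB, Expr elseB) implements Expr {}

    static int eval(Expr e, Map<String, Integer> env) {
        if (e instanceof Num n) return n.value();
        if (e instanceof Var v) return env.get(v.name());
        if (e instanceof Add a) return eval(a.left(), env) + eval(a.right(), env);
        if (e instanceof IfZero i)
            return eval(i.cond(), env) == 0 ? eval(i.thenB(), env)
                                            : eval(i.elseB(), env);
        throw new IllegalArgumentException("unknown expression");
    }

    public static void main(String[] args) {
        // if x == 0 then 1 else x + 1
        Expr e = new IfZero(new Var("x"),
                            new Num(1),
                            new Add(new Var("x"), new Num(1)));
        System.out.println(eval(e, Map.of("x", 0)));  // 1
        System.out.println(eval(e, Map.of("x", 41))); // 42
    }
}
```

The IfZero case, for instance, would become the usual rule that splits a proof into a branch where the condition is zero and one where it is not.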
The satisfiability problem of first-order predicate logic is known to be decidable if one restricts oneself to one of a number of (syntactically defined) subsets of the set of all possible formulae. The classical decision problem consists in identifying such subsets or classes, and of course in finding corresponding decision procedures.
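One classic example of such a syntactically defined class is the Bernays-Schönfinkel class: prenex sentences with all existential quantifiers preceding all universal ones, and no function symbols, for which satisfiability is decidable. For instance:

```latex
% Bernays-Schönfinkel class: sentences of the prenex form
% exists* forall* without function symbols, e.g.
\exists x\,\exists y\,\forall z\,
  \bigl( P(x,z) \lor \neg P(y,z) \bigr)
```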
For many of these decidable classes, decision procedures based on resolution exist. Some of them are based on hyper-resolution, others on ordered resolution and various other refinements.
The question is: can the same be done with (hyper-, ordered-, whatnot...) tableaux? It hasn't been done yet, at least not for the classic classes solvable by resolution. I spent a lot of time in 2000 and 2001 trying to find tableau equivalents of these procedures, but didn't really find anything.
Somehow, I am convinced this should be possible. Note that Chris Fermüller thinks it isn't, and he's an authority. On the other hand, Alexander Leitsch, who works at the same department, thinks it should be possible, though not easy. We'll see.
Meta programming means writing programs that analyze and generate other programs. A simple application is parameterizing programs with types, as in generic programming. But one could also parameterize programs with bits of code, or with a general configuration specification from which an optimal implementation is derived, as in generative programming.
Recoder is a nice library for meta programming in Java. But what would be even nicer is a language extension that allows verbatim inclusion of code templates in a program in a syntactically pleasant way. This has been done for Standard ML in this paper, with the shortcoming that reflection doesn't cover type declarations.
Could this be done for Java? Could it be done in a typesafe way, say using Java's generic types? For instance, could we have functions that take an expression of type T and return a method declaration returning T? Of course, this might be pretty difficult, given Java's complicated typing rules. But how could it be done in principle?
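A purely hypothetical sketch of what such an API might look like: none of these classes exist in any real library; they only illustrate the idea that a code template could carry, as a type parameter, the type of the code it produces.

```java
// Invented classes for illustration only -- a typed quotation API
// for Java does not exist; this sketches what its skeleton might be.
public class Templates {
    // A fragment of code that, when spliced in, has type T.
    record Expr<T>(String source) {}
    // A method declaration whose return type is T.
    record MethodDecl<T>(String source) {}

    // A template: given any expression of type T, build a getter
    // method returning T.  The shared type parameter links the two.
    static <T> MethodDecl<T> makeGetter(String name, Class<T> type, Expr<T> body) {
        return new MethodDecl<>(
            "public " + type.getSimpleName() + " " + name
            + "() { return " + body.source() + "; }");
    }

    public static void main(String[] args) {
        Expr<Integer> answer = new Expr<>("6 * 7");
        MethodDecl<Integer> getter = makeGetter("getAnswer", Integer.class, answer);
        System.out.println(getter.source());
        // prints: public Integer getAnswer() { return 6 * 7; }
    }
}
```

In a real language extension the strings would of course be proper syntax trees, checked by the compiler; the point is only that the type system can relate the template's input and output.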
In the last few years, people have become increasingly aware that implementation inheritance in OO languages is mostly evil, except perhaps when inheriting from an abstract superclass that is well documented to make clear how it may be subclassed.
What would happen if one left implementation inheritance out of a language like Java, keeping only interfaces and (multiple) interface inheritance, which poses no problems? What used to be done with implementation inheritance could be done with delegation instead.
While we're at it, could we make interface types the only (non-primitive) types? That would mean you could not declare a variable to be of a certain `class', but only as holding an object that implements a given interface.
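A sketch, with invented classes, of how the combination might look in today's Java: the only type a client ever names is an interface, and reuse happens by delegating to a wrapped implementation rather than by subclassing.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    // The only (non-primitive) type a client ever names.
    interface Counter {
        void increment();
        int count();
    }

    static class SimpleCounter implements Counter {
        private int n = 0;
        public void increment() { n++; }
        public int count() { return n; }
    }

    // Reuse by delegation instead of implementation inheritance:
    // LoggingCounter is not a subclass of SimpleCounter; it just
    // forwards to some other Counter it holds.
    static class LoggingCounter implements Counter {
        private final Counter inner;
        final List<String> log = new ArrayList<>();
        LoggingCounter(Counter inner) { this.inner = inner; }
        public void increment() { log.add("increment"); inner.increment(); }
        public int count() { return inner.count(); }
    }

    public static void main(String[] args) {
        // The variable holds "an object implementing Counter",
        // never a value of a class type.
        Counter c = new LoggingCounter(new SimpleCounter());
        c.increment();
        c.increment();
        System.out.println(c.count()); // 2
    }
}
```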
The property of implementing an interface may be seen as a predicate on classes. If you look at the functional language Haskell (or was it only in Gofer?), you will find that it has n-ary predicates on data types (what they call type or constructor classes). This could be done in a procedural language as well: there might be an interface `Convertible' which is implemented by a pair of types if one is convertible to the other. The problem is that this probably requires a `multiple dispatching' mechanism, if method (or procedure) calls are supposed to be dispatched on dynamic types. Am I right?
In the meantime I found MultiJava, which is Java with multiple dispatch.
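The dispatch problem, sketched in plain Java with invented types: a binary `convert' should pick its behaviour from the dynamic types of both arguments, which a single-receiver call cannot do, so one ends up hand-coding the case analysis that a multiple-dispatch mechanism like MultiJava's would perform automatically.

```java
public class Convert {
    interface Value {}
    record IntVal(int i) implements Value {}
    record StrVal(String s) implements Value {}

    // Behaviour depends on the dynamic types of BOTH arguments.
    // In plain Java this has to be written as an explicit case
    // analysis; multiple dispatch would let each case be its own
    // method, selected at run time.
    static Value convert(Value from, Value to) {
        if (from instanceof IntVal i && to instanceof StrVal)
            return new StrVal(Integer.toString(i.i()));
        if (from instanceof StrVal s && to instanceof IntVal)
            return new IntVal(Integer.parseInt(s.s()));
        return from; // both of the same kind: nothing to do
    }

    public static void main(String[] args) {
        System.out.println(convert(new IntVal(42), new StrVal("")));
        // prints: StrVal[s=42]
    }
}
```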
This is a continuation of the previous idea. In Haskell and Gofer, no multiple dispatching is needed, because the precise types are known at compile time. This is possible (among other reasons) because there can be no heterogeneous data structures in these languages.
If C is a type class, a list of C's is not a type. A function operating on lists of C's is typed as operating on lists of T's for any type T that is an instance of C. The difference is that all elements of the list must then be of the same type T!
What if we relax this restriction and allow classes as types, or heterogeneous data structures in general? I suppose this has been done already in some OO-functional synthesis, right? Did that need multiple dispatching? I suppose the result would be a little like a functional version of what I ended up with in the previous section.
In the meantime, I have found a paper by Konstantin Läufer about extending Haskell with existential types, which make heterogeneous data structures and dynamic binding possible.
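For contrast, here is the Java rendering of the same thing, with invented types: the interface type plays roughly the role of the existentially quantified type, a list of Printable holds values of different classes, and dynamic binding picks the right method for each element.

```java
import java.util.List;

public class Hetero {
    // An interface type corresponds roughly to an existentially
    // quantified type: "some type that supports render()".
    interface Printable { String render(); }

    record Num(int n) implements Printable {
        public String render() { return Integer.toString(n); }
    }
    record Txt(String s) implements Printable {
        public String render() { return "\"" + s + "\""; }
    }

    public static void main(String[] args) {
        // A heterogeneous list: elements have different dynamic types.
        List<Printable> xs = List.of(new Num(1), new Txt("two"));
        for (Printable p : xs)
            System.out.println(p.render());
    }
}
```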
I have been involved in a lot of work on program verification for procedural and OO languages. I'd like to see what can be done for functional languages.
One interesting aspect might be using provable properties of a function to optimize compilation. E.g., instead of forcing a function to be strict with a special `strict' functor, one might prove that it is strict and use that property for optimized compilation.
Note that I tend to disagree with the notion that a program is just a formula, or a term, or whatever, in a given logic. In my experience, ignoring the operational aspects of a programming language, be it functional, logic-based, or anything else, gets you into trouble sooner or later, because it abstracts away from an essential property of programs, namely that of being runnable.
It is well known that the valid sentences of arithmetic are not recursively enumerable. Hence, there can be no complete deduction system for arithmetic. The same holds for various theories of inductive data types which are strong enough to allow an embedding of arithmetic.
On the other hand, an induction rule or axiom schema like the one in Peano arithmetic is sufficient to prove pretty much anything people usually want to prove in mathematics or program verification. The difficulty of inductive theorem proving thus cannot be accounted for simply by the inherent incompleteness of arithmetic. A deduction system in which any logical consequence of the Peano axioms could be derived would be fine. The problem with induction axioms/rules in automated theorem proving is guessing induction hypotheses. This is somewhat similar to guessing cut formulae, except that cuts can be eliminated. I suppose the guessing of induction hypotheses can't. Can anybody who knows about proof theory tell me why?
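For reference, the schema in question is the first-order induction schema of Peano arithmetic, with one instance for every formula φ(n); guessing a suitable φ is the hard part.

```latex
% First-order induction schema of Peano arithmetic:
% one axiom per formula \varphi(n).
\bigl( \varphi(0) \land \forall n\, (\varphi(n) \rightarrow \varphi(n+1)) \bigr)
  \rightarrow \forall n\, \varphi(n)
```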
Update May 2003: Alexander Leitsch, who knows about proof theory, has told me: cut elimination is indeed not possible in the presence of induction, and analytic cuts won't do either. If I remember rightly, this goes back to some result of Hilbert's.
The graphics and graphicx packages for LaTeX are device-independent, in that they support various backends depending on the intended DVI driver. Unfortunately, they are not very powerful. I would like something like PSTricks, but independent of PostScript, so one could use it with pdfTeX, for instance.
I wrote a prototype for a converter from Metafont to PostScript Type 1 fonts some time in 2001, using MetaPost, the constructive area geometry capabilities of Java 2D, and ttf2pt1 (for hinting), but it's not in working shape. It should be feasible, though. If you're interested, mail me.
I used to have random Helmut Kohl-style quotations and morning news items on my private homepage, generated from a context-free grammar. I'd like to write a generator that works with a more powerful mechanism, e.g. attribute grammars with finite-domain constraints attached. Wouldn't this be really useless fun?
Koen Claessen told me that this might actually be useful, namely for test case generation.
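A minimal sketch of the context-free starting point (the attributes and constraints would go on top of this): nonterminals map to alternatives, each alternative is a sequence of symbols, and generation expands the start symbol recursively with random choices. The grammar and all names are invented.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;

public class Babble {
    // Grammar: nonterminal -> alternatives; each alternative is a
    // sequence of symbols; symbols not in the map are terminals.
    static final Map<String, List<List<String>>> GRAMMAR = Map.of(
        "S",    List.of(List.of("the", "NOUN", "VERB")),
        "NOUN", List.of(List.of("chancellor"), List.of("news")),
        "VERB", List.of(List.of("speaks"), List.of("repeats", "itself")));

    static String generate(String symbol, Random rnd) {
        List<List<String>> alts = GRAMMAR.get(symbol);
        if (alts == null) return symbol; // terminal symbol
        // pick a random alternative and expand its symbols in order
        List<String> alt = alts.get(rnd.nextInt(alts.size()));
        StringBuilder out = new StringBuilder();
        for (String s : alt) {
            if (out.length() > 0) out.append(' ');
            out.append(generate(s, rnd));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate("S", new Random()));
        // e.g. "the chancellor speaks" or "the news repeats itself"
    }
}
```

For test case generation one would additionally want control over the distribution of choices, which is where constraints and attributes come in.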