This is a collection of some things I would enjoy looking into, though I probably will never have the time for most of them. Feel free to contact me if you want to communicate with me about any of this.

Many interesting computations can be seen as transformations of values from one coordinate system or reference system or unit to another. Here are some examples:

- Unit conversion of physical quantities (e.g. US/metric)
- Currency conversion for monetary amounts
- Conversion of dates between calendars
- Conversion of times between time zones
- Conversion of cartesian plane or spatial coordinates, e.g. for computer graphics
- Conversion between cartesian and polar coordinates
- Conversion between various geodetic and geographical coordinate systems for cartography etc.
- Conversion between different spherical coordinate systems for astronomy
- Conversion of colours between colour spaces, e.g. for colour management
- Conversion of text between character encodings
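Most of these examples share the same shape: a value, a reference system, and a conversion that changes only the reference system. As a minimal sketch (class and map names invented for illustration; the length factors are the standard exact definitions), unit conversion via a common base unit looks like this:

```java
// A minimal sketch of the "transformation between reference systems" view,
// using unit conversion as the example. All names here are hypothetical.
import java.util.HashMap;
import java.util.Map;

public class Units {
    // Each unit is defined by its scale factor relative to a base unit (metres).
    private static final Map<String, Double> TO_METRES = new HashMap<>();
    static {
        TO_METRES.put("m", 1.0);
        TO_METRES.put("km", 1000.0);
        TO_METRES.put("mi", 1609.344);
        TO_METRES.put("ft", 0.3048);
    }

    // Converting via a common base unit means n units need n factors,
    // not n*n pairwise transformations.
    public static double convert(double value, String from, String to) {
        return value * TO_METRES.get(from) / TO_METRES.get(to);
    }
}
```

One design point worth noting: routing everything through a base unit keeps the table linear in the number of units, at the price of fixing one distinguished reference system.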

Some questions that come up when designing such conversions:

- Will there be many different coordinate systems involved, or just two, as for cartesian/polar?
- Is there only one sensible transformation between two reference systems (like for cartesian/polar, or unit conversion), or does the application have to choose between transformations, like for colour space conversion (perceptual/colorimetric etc.) or currency conversion (different exchange rates)?
- If there are several possible transformations, is it actually the transformations that differ (colours) or would one rather consider the reference systems to change with respect to each other (exchange rates change over time)? Or both (different institutions also have different exchange rates at one point in time)?
- Can we afford to do some programming whenever we add a new transformation, or will we need to add new transformations dynamically?

These considerations will influence whether the design is centered around the transformations (apply this transformation to this data) or around the reference systems (transform this data into that reference system).
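For the cartesian/polar case, where there are only two systems and one sensible transformation each way, the two designs can be sketched side by side (a toy example; all names are made up):

```java
// Contrasting the two designs for the simple cartesian/polar case.
public class Coords {
    // Transformation-centred: "apply this transformation to this data".
    interface Transformation { double[] apply(double[] p); }

    static final Transformation CARTESIAN_TO_POLAR = p -> new double[] {
        Math.hypot(p[0], p[1]), Math.atan2(p[1], p[0])
    };
    static final Transformation POLAR_TO_CARTESIAN = p -> new double[] {
        p[0] * Math.cos(p[1]), p[0] * Math.sin(p[1])
    };

    // Reference-system-centred: "transform this data into that system";
    // the caller only names source and target system.
    enum CoordSystem { CARTESIAN, POLAR }

    static double[] convert(double[] p, CoordSystem from, CoordSystem to) {
        if (from == to) return p;
        return (from == CoordSystem.CARTESIAN ? CARTESIAN_TO_POLAR
                                              : POLAR_TO_CARTESIAN).apply(p);
    }
}
```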

There are also efficiency concerns:

- Will we typically need to transform tens or thousands or millions of points? To transform an image from one color space to another, we don't want to create a `Color' object for every pixel.
- Will we typically apply several transforms in sequence, like in computer graphics? If they are linear transforms, we can precompute the concatenation of transforms for efficiency.
- Can we store the actual data in the same data structure independently of the reference system (e.g. an object with two floats for 2D graphics), or in a similar one (converting colours from RGB to CMYK needs one more coordinate), or might the representations be completely different (like for colours identified by name, or descriptions of geographic locations that use letters)?
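The first two efficiency points can be illustrated with Java 2D's AffineTransform, which works on flat coordinate arrays (no per-point object) and lets one precompose linear transforms once (the surrounding class is just an illustrative wrapper):

```java
// Sketch of the efficiency points: transform many points stored in a flat
// array, after precomposing two linear transforms into a single matrix.
import java.awt.geom.AffineTransform;

public class Bulk {
    public static double[] scaleThenTranslate(double[] pts) {
        AffineTransform t = AffineTransform.getTranslateInstance(10, 0);
        // concatenate() appends the scale so it is applied first:
        // one matrix multiplication up front, then a single pass over the data.
        t.concatenate(AffineTransform.getScaleInstance(2, 2));
        double[] out = new double[pts.length];
        t.transform(pts, 0, out, 0, pts.length / 2);
        return out;
    }
}
```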

There are many scenarios where (formal) languages are embedded inside other languages. For instance, suppose people write a tool to translate text from one language A into another language B. They also have a rule language R that can be used to write down general transformation rules. Then sentences of A and B have to be embedded into the rule language R. I have some ideas about how to cope with the resulting parser chaos, which are written down here.

One (not totally unexpected) experience from the KeY project is that it is hard work to design a logical calculus that can be used to reason about a real programming language like for instance Java. If one wants to stick to rules that have a relatively simple structure, then one needs hundreds of them. On the other hand, most of these rules correspond to the operational semantics of the language in a more or less straightforward way.

Now I wonder if it is not possible to *generate* most of
a calculus given another representation of the semantics, namely
an interpreter, written in a sufficiently simple language.

It is pretty clear that one can (in theory) use a calculus for the language in which an interpreter is written to verify code in the interpreted language, simply by symbolically executing the interpreter on the given input. But that is not very nice to work with.

One can also use a compiler to *translate* the program
into a simpler language, and then use a calculus for that simpler
language. But then, one still has to do the proofs for the
translated program.

Couldn't one take an interpreter/compiler/partial evaluator or
whatever (probably written in a language specifically restricted
for this purpose) and use it to determine the behaviour of a
calculus? In such a way that for the user the underlying base
language remains invisible? *Maybe,* using a calculus for
this simpler base language?

The satisfiability problem of first-order predicate logic is
undecidable in general, but it becomes decidable if one restricts
oneself to one of a number of (syntactically defined) subsets of
the set of all possible formulae. The *decision problem*
consists in identifying such subsets or classes, and of course in
finding corresponding decision procedures.

For many of these decidable classes, decision procedures based on resolution exist. Some of them are based on hyper-resolution, others on ordered resolution and various other refinements.

The question is, can the same be done with (hyper-, ordered-, what-not) tableaux? It hasn't been done yet, at least not for these classic classes solvable by resolution. I spent a lot of time in 2000 and 2001 trying to find tableau equivalents of these procedures, but didn't really find anything.

Somehow, I am convinced this should be possible. Note that Chris Fermüller thinks it isn't, and he's an authority. On the other hand, Alexander Leitsch, who works at the same department, thinks it should be possible, though not easy. We'll see.

Meta programming means writing programs which analyze and generate other programs. A simple application is parameterizing programs with types, as in generic programming. But you could also parameterize programs with bits of code. Or with a general configuration specification from which an optimal implementation is derived, like in generative programming.

Recoder is a nice library for meta programming in Java. But what would be even nicer is a language extension that allows verbatim inclusion of code templates in a program in a syntactically nice way. This has been done for Standard ML in this paper, with the shortcoming that reflection doesn't cover type declarations.

Could this be done for Java? Could it be done in a typesafe
way, say with added Generic Types in Java? Say, could we have
functions which take an expression of type *T* and return
a method declaration returning *T*? Of course, this
might be pretty difficult, given the complicated typing rules of
Java. But how could it be done in principle?
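Short of a real language extension, ordinary generics already give a taste of typed program fragments: an expression of type *T* can be modelled as a value of type Expr<*T*>, and combinators preserve the type. This only models evaluation, not real quotation of Java syntax, and all names are invented:

```java
// A rough sketch of typed program fragments via generics. An Expr<T> is a
// typed piece of code; combinators build bigger fragments without losing
// the type. Hypothetical names, for illustration only.
public class Meta {
    interface Expr<T> { T eval(); }

    static Expr<Integer> lit(int n) { return () -> n; }

    // The combinator's signature guarantees well-typedness of the result.
    static Expr<Integer> plus(Expr<Integer> a, Expr<Integer> b) {
        return () -> a.eval() + b.eval();
    }

    // A "method declaration returning T" is here just a named Expr<T>.
    static <T> Expr<T> define(String name, Expr<T> body) { return body; }
}
```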

In the last few years people have gotten more and more aware that implementation inheritance in OO languages is mostly evil. Except maybe when you inherit from an abstract superclass which is very well documented to make clear how it may be subclassed.

What would happen if one left implementation inheritance out of a language like Java? Only interfaces and (multiple) interface inheritance, which is no problem. What you did with implementation inheritance might be done with delegation.

While we're at it, could we make interface types the only (non-primitive) types? Meaning that you couldn't declare a variable of being of a certain `class', but only as holding an object that implements a given interface.
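A rough sketch of what that style looks like in today's Java: variables only ever have interface types, and reuse happens by forwarding to a wrapped object instead of extending a class (names are made up):

```java
// Interfaces-plus-delegation in place of implementation inheritance:
// no class extends another; behaviour is reused by forwarding.
public class Delegation {
    interface Counter { void inc(); int value(); }

    static Counter basic() {
        return new Counter() {
            private int n = 0;
            public void inc() { n++; }
            public int value() { return n; }
        };
    }

    // "Inherits" Counter behaviour by delegating, overriding only inc().
    static Counter doubling(Counter inner) {
        return new Counter() {
            public void inc() { inner.inc(); inner.inc(); }
            public int value() { return inner.value(); }
        };
    }
}
```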

The property of implementing an interface may be seen as a
predicate on classes. If you look at the functional language
Haskell (or was it only in Gofer?), you will find that they have
*n*-ary predicates on data-types (what they call type- or
constructor-classes). This could be done in a procedural language
as well. There might be an interface `Convertible' which is
implemented by a pair of types if one is convertible to the other.
The problem is, this probably requires a `multiple dispatching'
mechanism, if method (or procedure) calls are supposed to be based
on dynamic types. Am I right?

In the meantime I found MultiJava, which is Java with multiple dispatch.
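To make the contrast concrete, here is a sketch in plain Java: generics express the binary `Convertible' predicate statically, while dispatching on the runtime types of two arguments has to be simulated by hand, which is exactly what a multimethod mechanism automates (all names are invented):

```java
// Binary predicates on types, statically vs. dynamically. Toy names only.
public class Multi {
    // Static version: "A is convertible to B" as a generic interface;
    // each instance witnesses one pair of types.
    interface Convertible<A, B> { B convert(A a); }

    static final Convertible<Integer, String> INT_TO_STRING = a -> "#" + a;

    // Dynamic version: dispatch on the runtime types of *both* arguments,
    // simulated with instance tests. A multimethod mechanism (as in
    // MultiJava) would generate this dispatch automatically.
    static String describe(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) return "int,int";
        if (a instanceof Integer) return "int,other";
        return "other";
    }
}
```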

This is a continuation of the previous idea. In Haskell and Gofer, no multiple dispatching is needed because the precise types are known at compile-time. This can be done (amongst other reasons) because there can be no heterogeneous data structures in these languages.

If *C* is a type class, a *list of C's* is not a type. A
function operating on lists of *C*'s is typed as operating
on lists of *T*'s for any type *T* that is an
instance of *C*. The difference is that all elements of
the list must be of the same type *T*!

What if we relax this restriction and allow classes as types, or heterogeneous data structures in general? I suppose this has been done already in some OO-functional synthesis, right? Did that need multiple dispatching? I suppose the result would be a little like a functional version of what I ended up with in the previous section.

In the meantime, I found a paper by Konstantin Läufer about extending Haskell with Existential Types, which makes heterogeneous data structures and dynamic binding possible.
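In Java terms, what existential types buy Haskell is roughly this: a list typed by an interface may mix values of different classes, and calls on the elements are dynamically bound (a toy sketch; names are made up):

```java
// Heterogeneous data with dynamic binding, the OO counterpart of
// existentially quantified element types. Illustrative names only.
import java.util.List;

public class Hetero {
    interface Shape { double area(); }

    record Square(double side) implements Shape {
        public double area() { return side * side; }
    }
    record Circle(double r) implements Shape {
        public double area() { return Math.PI * r * r; }
    }

    static double total(List<Shape> shapes) {
        // Each element may have a different runtime class; area() is
        // dispatched dynamically per element.
        return shapes.stream().mapToDouble(Shape::area).sum();
    }
}
```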

I have been involved in a lot of work on program verification for procedural and OO languages. I'd like to see what can be done for functional languages.

One interesting aspect might be using provable properties of a function to optimize compilation. E.g., instead of forcing a function to be strict with a special `strict' functor, one might prove that it is strict and use that property for optimized compilation.

Note that I tend to disagree with the notion that a program is just a formula, or a term, or whatever in a given logic. In my experience, ignoring the operational aspects of a programming language, be it functional, logic, or any other kind, gets you into trouble sooner or later, because it abstracts away from an essential property of programs, namely that of being runnable.

It is well-known that the valid sentences of arithmetic are not recursively enumerable. Hence, there can be no complete deduction system for arithmetic. The same holds for various theories of inductive data types which are strong enough to allow embedding of arithmetic.

On the other hand, an induction rule or axiom schema like the
one in Peano arithmetic is sufficient to prove pretty much
anything people usually *want* to prove in mathematics or
program verification. The difficulty of inductive theorem proving
thus cannot be accounted for simply by the inherent incompleteness
of arithmetic. A deduction system in which any logical consequence
of the Peano axioms could be derived would be fine. The problem
with induction axioms/rules for automated theorem proving is
guessing induction hypotheses. This is somewhat similar to
guessing cut formulae. Except that cuts can be eliminated.
I suppose the guessing of induction hypotheses can't. Can
anybody who knows about proof theory tell me why?
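For concreteness, the induction axiom schema of first-order Peano arithmetic, one axiom per formula:

```latex
% The first-order induction schema: one axiom for each formula \varphi(x).
\varphi(0) \;\land\; \forall x\,\bigl(\varphi(x) \rightarrow \varphi(x+1)\bigr)
  \;\rightarrow\; \forall x\,\varphi(x)
```

In an automated proof, the hard step is choosing which instance \varphi to use, which is the guessing problem mentioned above.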

Update May 2003: Alexander Leitsch, who knows about proof theory, has told me: Cut elimination is indeed not possible in the presence of induction, and also analytic cuts won't do. If I remember rightly, this goes back to some result of Hilbert's.

The graphics and graphicx packages for LaTeX are device-independent, in that they support various backends depending on the intended DVI driver. Unfortunately they are not very powerful. I should like something like PSTricks, but independent of PostScript, so one can use it with pdfTeX, for instance.

I have written a prototype for a converter from Metafont to PostScript Type 1 fonts some time in 2001, using MetaPost, the Constructive Area Geometry capabilities of Java 2D, and ttf2pt1 (for hinting), but it's not in working shape. It should be feasible, though. If you're interested, mail me.

I used to have random Helmut-Kohl-style quotations and the morning news on my private homepage. These were generated from a context-free grammar. I'd like to write a generator that works with a more powerful mechanism, e.g. attributed grammars with finite-domain constraints attached. Wouldn't this be really useless fun?

Koen Claessen told me that this might actually be useful, namely for test case generation.
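A minimal sketch of the basic mechanism, random expansion of a context-free grammar (grammar, class, and method names invented for illustration):

```java
// Random sentence generation from a tiny context-free grammar.
import java.util.List;
import java.util.Map;
import java.util.Random;

public class Babble {
    // Each nonterminal maps to its alternatives; an alternative is a
    // sequence of symbols. Symbols without a rule are terminals.
    static final Map<String, List<List<String>>> RULES = Map.of(
        "S",  List.of(List.of("NP", "VP")),
        "NP", List.of(List.of("the", "N")),
        "VP", List.of(List.of("V", "NP")),
        "N",  List.of(List.of("chancellor"), List.of("press")),
        "V",  List.of(List.of("praises"), List.of("blames"))
    );

    static String expand(String symbol, Random rnd) {
        List<List<String>> alts = RULES.get(symbol);
        if (alts == null) return symbol;              // terminal symbol
        List<String> alt = alts.get(rnd.nextInt(alts.size()));
        StringBuilder sb = new StringBuilder();
        for (String s : alt) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(expand(s, rnd));                // recurse on the body
        }
        return sb.toString();
    }
}
```

Attributed grammars or attached constraints would hook in exactly where an alternative is chosen, restricting the choices instead of picking uniformly at random, which is also the point where test-case generation would steer the output.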

Martin Giese - martin (dot) giese (at) risc (dot) uni-linz (dot) ac (dot) at