Wednesday, June 14, 2017

Alain Colmerauer, Machine Translation Pioneer

TAUM group showing off a piece of Q-System output
circa 1970

My recent birthday, my 88th, was clouded over by news that somebody I'd worked for nearly 50 years ago had died. He was Alain Colmerauer, an outstanding French computer scientist, Chevalier de la Légion d'Honneur (the French equivalent of a knighthood), emeritus professor at the University of Marseille-Luminy. My work for him only lasted three years, from 1968 to 1971, but they were very formative years for me. Also for others; I've received messages from two other ex-colleagues saying they were influenced by him. All that was in the days before I conceived the notion of Natural Translation, when I was part of a Canadian group doing research on machine translation. There will be many obits and tributes to him, but I would like to add a few personal reminiscences.

In the late 1960s I was working as a linguistic research assistant in the machine translation project at the University of Montreal, a French-speaking university. We had acquired a linguistic model of the translation process from the leading research group in France, the one at the University of Grenoble. It was the dependency grammar of the French linguist Lucien Tesnière. But we didn't have software to implement it.

Then in 1968 Alain came to Canada and to the University of Montreal as a coopérant. The coopérants were young French university graduates who, under a scheme devised by De Gaulle's government, were sent to work for two years in developing countries in lieu of their compulsory military service. During that time they received only army pay. For diplomatic reasons, probably to favour relations with Quebec, Canada was included among the receiving countries. With Alain came at least two other coopérant computer graduates whom I came to know, Michel van Canaghem and Guy de Chastellier. They came from the University of Grenoble; it had a strong computer science department, but Alain's background was in mathematics. At the young age of 28 he had recently obtained a Doctorat d'État, a French superior, competitive doctorate that no longer exists. One day in his office later on he asked me if I would like to see his doctoral thesis. So he showed it to me. It had about 40 pages. I expressed surprise that he could obtain a Doctorat d'État with a thesis of a mere 40 pages. He smiled and replied, "Only in mathematics."

Though Grenoble had a well-known machine translation project, Alain wasn't in it and didn't come directly to our Montreal project. He came first to the computer science department. The university had a state-of-the-art computer centre with a CDC mainframe and an encouraging engineer manager, Jean Baudot, who was interested in linguistics. But the head of the MT project, Guy Rondeau, was a good talent-spotter (after all, he recruited me!) and he didn't miss the opportunity to recruit Alain. And so we met. Then Rondeau left the university hurriedly in a huff and the university needed a credible replacement in order to safeguard its lucrative MT research contract with the National Research Council of Canada (NRC). So it appointed Alain, and that's how he became my boss.

One of the first things Alain did was stop the quarterly publication of our research papers. He said we should not publish until we had something really substantial to present. It taught me to look down on the 'publish or die' attitude so prevalent in our universities, which produces more minor articles and theses than people have time to read. We eventually waited two years.

He set about providing us with the software we needed. The leading linguistic paradigm of the time was transformational grammar (TG). Alain was well acquainted with the TG of Noam Chomsky through his wife, who was writing her PhD thesis on it. His first product was a TG program which he called a W-Grammar because it was inspired by the Algol programming language invented by the Dutch computer scientist A. van Wijngaarden. Indeed it was through Alain that I learnt about the European style of programming represented by Algol, more logical and transparent than the then-current American languages like Fortran. W-Grammar was usable for MT and so I wrote the first (and perhaps only) proof-of-concept piece of translation in it, just one sentence. Alain was a bit disappointed that I didn't use Chomskyan TG but the Tesnière dependency model. However, he was very open-minded and later even allowed me time and resources to work on my own side-project, the Transformulator (a forerunner of translation memories). He was also sure of himself. Some computer science colleagues told him, on theoretical grounds, that the Q-Systems might not work; but he thought the danger was negligible and went ahead anyway.


I liked W-Grammar and would have continued with it, but something better soon came from him that rendered it obsolete. This was his much better known Q-Systems. (The Q stood for Quebec.) There is no point in describing Q-Systems here, since there is a good article on them in Wikipedia. Alain was a hands-on computer scientist: he was proud that he programmed Q-Systems in Algol himself in the space of six intensive weeks.

Q-Systems were a high-level language, a revolutionary tool for us linguists. With them we were equipped to devise an elaborate English to French MT system. The task was too much for one person, so it was split up into stages and parcelled out. The chaining of programs in Q-Systems made this feasible. I got to design the English morphology analyser and programmed it with the aid of a student, Laurent Belisle. Alain once paid me what for me was a supreme compliment: "Brian, your morphology never fails."

By 1971 we were ready to make a presentation to the NRC and to publish. The publication is the volume TAUM 71. (TAUM stands for Traduction automatique à l'Université de Montréal.) It's difficult to find today because it was only intended for the NRC, but it's a classic of the so-called rule-governed approach to MT. That paradigm was overturned by the invention of statistical MT in the late 1980s, so it might look as if we were barking up the wrong tree. However, the right tree wasn't available to us, because the computers of the time couldn't have handled the enormous databases that are needed for statistical MT.

By 1971, with TAUM 71 published and his coopérant obligations acquitted, Alain felt the tug of his home country and returned to France. One side-consequence was that he left me his spacious Montreal apartment on prestigious Nun's Island in the St. Lawrence River along with its antique furniture. But not long afterwards I myself left Montreal for Ottawa. Thereafter our interests diverged so widely, his towards computer programming and mine towards translation theory, that I had little contact with him. I visited him once at his office at Marseille-Luminy University and was present there at a discussion in which the ever faithful Michel van Canaghem was urging him to switch to what was then the latest development in computing, a micro-computer. I attended Guy de Chastellier's wedding in the Montreal Basilica. But these days you can watch people's careers from afar on the internet. And, as you can judge from the above, those halcyon years in Montreal under Alain's leadership have remained bright in my memory.

References
Alain Colmerauer (ed.). TAUM 71. Montreal: Projet de Traduction Automatique de l'Université de Montréal, 1971. 223 p. Available at https://books.google.es/books/about/Projet_de_traduction_automatique_de_l_un.html?id=4lL_nQEACAAJ&redir_esc=y.

Alain Colmerauer and Guy de Chastellier. W-Grammar. Département d'informatique, Université de Montréal, c1969, 8 p. Available at alain.colmerauer.free.fr/alcol/ArchivesPublications/Wgrammar/Wgrammar.pdf.

Q-systems. Wikipedia. 2016. 

Brian Harris and Laurent Belisle. POLYGRAM grapho-morphology analyzer for English. In TAUM 71, pp. 46-105.

Image
Alain Colmerauer is holding the Q-System output. Far left with pipe is Michel van Canaghem. With long hair, looking over the output, is Jules Dansereau, a Canadian language analyst for French. Behind Jules may be Richard Kittredge, American linguist.
Photo by courtesy of Colette Colmerauer.