Do minority languages need machine translation?

This is an abbreviated transcript of a talk I gave at a British-Irish Council conference on language technology in indigenous, minority and lesser-used languages in Dublin earlier this month (November 2015) under the title ‘Do minority languages need the same language technology as majority languages?’ I wanted to bust the myth that machine translation is necessary for the revival of minority languages. What I had to say didn’t go down well with some in the audience, especially people who work in machine translation (unsurprisingly). So beware, there is controversy ahead!

Be patient! Bí othar!


10 reasons why Irish is an absolutely awesome language

Health warning! Learning Irish will open your mind, win you interesting friends and make you attractive to the opposite sex.

I have devoted a large chunk of my career to learning Irish, working with Irish and making a living out of Irish. So I thought it would be fair to put together a list of reasons why I think the language is worth it. Mine are proper linguistic reasons though – none of that starry-eyed sentimental nonsense about the language being ‘beautiful’ or ‘romantic’! So, put your language geek hats on, here we go!

(Many of the features mentioned here are actually common to all Celtic languages, including Scottish Gaelic and Welsh, but let’s not be splitting hairs now.)

Breathing new life into old data: how to retro-digitize a dictionary


A new digital home for granddad and grandma: breis.focloir.ie

I have recently worked on a project where we retro-digitized two Irish dictionaries and published them on the web, so I thought it would be a good idea to summarize my experience here. Hopefully somebody somewhere will find it useful.

In the slang of people who care about such things, retro-digitization is the process of taking a work that had previously been published on paper (often a long time ago, way before computers made their way into publishing) and converting it into a digital, computer-readable format. A bit like retro-fitting a house or pimping up an old car. This involves not only scanning and OCRing the pages, but also structuring and indexing the content so it can be searched and interrogated in ways that would have been impossible on paper. This is the bit that matters most if what you are retro-digitizing is a dictionary.
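To make the "structuring and indexing" step a little more concrete, here is a minimal sketch in Python. It is not how the project described in this post actually did it. It assumes, purely for illustration, that the OCR output has one entry per line with the headword first, and the function names (`normalize`, `structure`, `build_index`, `lookup`) and the two sample lines are made up for the example.

```python
import unicodedata
from collections import defaultdict


def normalize(word):
    """Lowercase a word and strip diacritics, so 'foclóir' matches 'focloir'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def structure(ocr_lines):
    """Turn raw OCR'd lines into structured entries.

    Assumption for this sketch: one entry per line, headword first,
    everything after the first space is the body of the entry.
    """
    entries = []
    for line in ocr_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from the scan
        headword, _, body = line.partition(" ")
        entries.append({"headword": headword, "body": body})
    return entries


def build_index(entries):
    """Map each normalized headword to the positions of its entries."""
    index = defaultdict(list)
    for position, entry in enumerate(entries):
        index[normalize(entry["headword"])].append(position)
    return index


def lookup(query, entries, index):
    """Return all entries whose headword matches the query, ignoring accents."""
    return [entries[i] for i in index.get(normalize(query), [])]


# Toy input standing in for OCR output (invented for this example).
ocr_lines = ["foclóir m. dictionary", "", "madra m. dog"]
entries = structure(ocr_lines)
index = build_index(entries)
results = lookup("focloir", entries, index)
```

The point of the accent-insensitive index is that a user who types a headword without its length marks still finds the entry, which is exactly the kind of search a paper dictionary cannot offer.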

The dictionaries we retro-digitized are Foclóir Gaeilge-Béarla [Irish-English Dictionary] from 1977 (editor Niall Ó Dónaill), and English-Irish Dictionary from 1959 (editor Tomás de Bhaldraithe). Both are sizeable volumes which, despite their age, enjoy the respect, even adoration, of Irish speakers everywhere, and are still widely used and widely available in bookshops. People have been saying for ages how nice it would be if we had electronic versions of these. And now we do, available freely to everybody on a website. Here’s how we got there.

Linguistics of the Gaelic Languages 2013: a conference report

Oh, the things I do for fun at weekends! For example, last weekend I attended the Linguistics of the Gaelic Languages conference in University College Dublin (19–20 April 2013). This was a small but focused event, with 20 to 30 people attending to discuss the latest research on Irish, Scottish Gaelic and Manx. Here is my report.

The linguistic relativity of up and down

In this article, I am going to give a nice and simple example of how learning a new language causes you to start perceiving the world differently. By doing so I will provide support for the Sapir-Whorf hypothesis (in its weak form), which claims that the language you speak predetermines, to some extent, how you think. I will demonstrate this on my favourite toy language, Irish.