Cipher 2: An Printíseach Rúnda (The Secret Apprentice): A Language Game for Irish CALL
- Mr Liang Xu, Dublin City University
- Dr Monica Ward, Dublin City University
- Dr Elaine Uí Dhonnchadha, Trinity College Dublin
This paper describes Cipher 2: An Printíseach Rúnda (The Secret Apprentice), an application that combines pedagogical text, error noticing and gaming to enhance the learning of Irish. We review Computer Assisted Language Learning for Irish and give an overview of the game. Cipher 2 is an enhanced version of Cipher (Xu and Chamberlain, 2020), a game designed for English language learners.
Background and related work
Computer Assisted Language Learning (CALL) refers to the use of computing technologies to enhance language learning. CALL has existed for many years, and its use has grown recently, particularly in Mobile Assisted Language Learning (MALL) and in popular language learning apps such as Duolingo and Memrise. While it is easy to see the appeal for language learners, CALL resources are not easy to develop. Developing resources for Less Commonly Taught Languages (LCTLs) is even more difficult, as there are fewer digital and non-digital language resources available and fewer developers with the linguistic and software skills needed to design and build them.
Irish is a Less Commonly Taught Language and there are few high-quality CALL resources available for language learners. Irish has a paradoxical position in Irish education. On the one hand, parents appreciate the value of the language in terms of its unique position in society and its cultural value, while on the other hand some parents would prefer that their child learn a language with greater global reach, such as Spanish or Chinese. This causes complications when it comes to teaching and learning Irish in schools. Students often lack motivation to learn the language and teachers sometimes struggle to engage their students. Spelling errors in particular are a constant problem for learners of Irish, because the Irish orthography system is opaque and very different from that of English.
Learner corpora can provide valuable insights into learner difficulties. They typically contain error-annotated samples of language produced by second language learners, but they are difficult and time-consuming to build. Error-tagged learner corpora can be generated by inserting known errors into appropriate texts, which is the method used here. They can also be created using innovative methods of annotation such as “games with a purpose”, which can speed up the annotation task. Large learner corpora exist for English (Murakami and Alexopoulou, 2016) and other major languages. However, learner corpus research and development for Irish is at an earlier stage (Ní Ghloinn et al., 2018; Ní Chiarain and Ní Chasaide, 2019). A corpus of teaching materials can also provide valuable raw materials for the development of pedagogically sound CALL applications, as the language samples are generally aligned to particular proficiency levels and school curricula. The recently developed corpus of Irish educational materials EduGA (Ó Meachair, 2019), as well as the New Corpus for Ireland (NCI) (Kilgarriff et al., 2006), may be a source of appropriate text samples.
Data and Methodology
The data used in Cipher 2 will be a combination of transcribed copybook samples, transcribed samples from textbooks and other educational samples from existing corpora. In addition, a list of artificial errors based on common Irish errors (Ó Baoill and Ó Tuathail, 1992) will be used to encode Irish text by inserting common errors into appropriate text samples. Error annotation data is collected by gathering the players’ annotations of the encoded Irish text while they decipher it (i.e. while they try to identify errors and understand the text). Written data will also be collected from Irish learners, as players type some Irish text in answer to a question that is part of the game. With further analysis, this data will tell us which errors students notice and which they are less likely to notice. Irish text collected from learners can be used for future CALL research and for building Irish learner corpora. Data collection and analysis are currently in progress.
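The error-insertion step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: the names `InsertedError`, `ERROR_LIST` and `insert_errors` are hypothetical, tokenisation is plain whitespace splitting, and the two sample entries (a missing síneadh fada and a missing lenition) are illustrative examples rather than items taken from the cited error list.

```python
from dataclasses import dataclass


@dataclass
class InsertedError:
    position: int     # index of the word in the whitespace-tokenised text
    correct: str      # original, correct form
    erroneous: str    # form shown to the player
    error_type: str   # label for the kind of error


# Hypothetical error list: correct form -> (erroneous form, error type).
# These entries are illustrative only.
ERROR_LIST = {
    "tá": ("ta", "missing_fada"),
    "chara": ("cara", "missing_lenition"),
}


def insert_errors(text, error_list):
    """Replace listed words with their erroneous forms and record each change."""
    encoded, gold = [], []
    for i, token in enumerate(text.split()):
        if token in error_list:
            wrong, error_type = error_list[token]
            encoded.append(wrong)
            gold.append(InsertedError(i, token, wrong, error_type))
        else:
            encoded.append(token)
    return " ".join(encoded), gold


encoded_text, gold_errors = insert_errors("tá mo chara anseo", ERROR_LIST)
print(encoded_text)
for err in gold_errors:
    print(err)
```

Keeping the record of inserted errors alongside the encoded text is what makes the result error-tagged: player annotations can later be compared against that record.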
Cipher – a game with a purpose
Cipher is an engaging language “game with a purpose” designed for identifying errors in text (Xu and Chamberlain, 2020). Using the idea of a game with a purpose (Von Ahn, 2006), the task of error identification is gamified so that players are encouraged to identify errors in text. While playing the game, they also make error annotations to the text, which gives us the data for further analysis. The Cipher study investigated whether text errors can be detected through a game; the results showed that people notice errors in text easily, so detecting errors with a game is feasible. According to player feedback, Cipher also has the potential to facilitate language learning. In order to investigate this, we launched the project Cipher 2: An Printíseach Rúnda.
Cipher for English
Cipher is an English text-based game. The participants were university students in the UK with English as a second language, at B1-C2 CEFR levels. The text in the game is from the Phrase Detectives Corpus 1.0.
The player is given a piece of text which has been encoded. The text is encoded using “ciphers”, which modify some words in the text according to certain rules (e.g. all vowels are removed, the initial consonant is doubled, etc.). Apart from cipher errors, there are genuine errors such as misspellings, imported from lists of common English errors available on Kaggle. The player’s aim is to decipher the text, i.e. to locate errors in the text and to identify the ciphers (e.g. all vowels are missing). The data-gathering strategy in Cipher is to collect the player’s annotation each time they find an error and categorise it.
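The two cipher rules given as examples above can be sketched as small word-level functions. This is a minimal illustration, not the game's code: the function names are hypothetical, and choosing which words to modify by an explicit set of positions is an assumption, since the text only says that some words are modified.

```python
VOWELS = set("aeiouAEIOU")


def remove_vowels(word):
    """Cipher rule: drop every vowel from the word."""
    return "".join(ch for ch in word if ch not in VOWELS)


def double_initial_consonant(word):
    """Cipher rule: repeat the first letter when it is a consonant."""
    if word and word[0].isalpha() and word[0] not in VOWELS:
        return word[0] + word
    return word


def encode(text, rule, positions):
    """Apply `rule` to the words at the given positions (assumed selection scheme)."""
    words = text.split()
    return " ".join(rule(w) if i in positions else w for i, w in enumerate(words))


print(encode("the secret game", remove_vowels, {1}))
print(encode("the secret game", double_initial_consonant, {0, 2}))
```

Because each rule is an ordinary function on a single word, further ciphers could be added without changing `encode`.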
The game has a number of strengths. Error-annotated text can be collected, and both identified and unidentified errors (i.e. those not spotted by the player) can be analysed. Playing the game encourages “noticing” on the part of the language learner, and players have reported that the game is fun to play.
Cipher for Irish: An Printíseach Rúnda
Based on Cipher for English, we are currently working on Cipher 2: An Printíseach Rúnda, an Irish version of Cipher with updated game features, which serves as a CALL tool for Irish. The target learners of Irish are primary and secondary school students at A1-B1 CEFR levels. Given the language levels of the learners, the game will be integrated with pedagogically oriented materials. The text in the game will include samples from Irish textbooks and corpora. Moreover, new game features and interfaces have been designed and added to emphasise the language learning purpose and to facilitate the collection of learner data from younger learners. An error noticing feature has been added that lets the player compare each detected error with its correct form.
A challenge encountered in the development of Cipher 2 is that many of the resources available for English are not readily available for Irish. For instance, we will need to develop the list of common error types for Irish from the limited learner data available to us and from previous studies. As a result, in addition to using Irish errors derived from Irish learner corpora, we will generate an artificial error list for Irish which contains various misspellings of Irish words. This error list helps to encode the text and plays an important role in the game. Moreover, it allows us to understand which errors students fail to detect and whether they are the more common errors in Irish.
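One way to see which inserted errors students fail to detect is to compare the positions a player flagged with the positions where errors were inserted, grouped by error type. The sketch below assumes hypothetical inputs (a list of position and error type pairs, and a set of flagged positions) and a hypothetical function name `miss_rates`; the sample data is invented purely for illustration and is not a result from the game.

```python
from collections import Counter


def miss_rates(inserted, flagged):
    """Return, per error type, the fraction of inserted errors the player missed.

    inserted: list of (position, error_type) pairs for errors placed in the text.
    flagged:  set of word positions the player marked as errors.
    """
    total, missed = Counter(), Counter()
    for position, error_type in inserted:
        total[error_type] += 1
        if position not in flagged:
            missed[error_type] += 1
    return {error_type: missed[error_type] / total[error_type] for error_type in total}


# Invented example data, for illustration only.
inserted = [(0, "missing_fada"), (2, "missing_lenition"), (5, "missing_fada")]
flagged = {0, 2}
print(miss_rates(inserted, flagged))
```

Aggregating such rates over many players and texts would indicate which error types are missed most often, which could then be compared with how common those errors are in learner writing.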
Discussion and Conclusion
Gamification of language tasks helps make language learning more enjoyable, which is particularly important when learner motivation is low. It is hoped that the error noticing task in the game will lead students to pay more attention to their spelling when writing in the future, as they will become more aware of errors in text, including errors they did not previously know about, and will learn from the correct forms provided. This is particularly useful for Irish, where spelling errors are a constant problem for learners. The game-based user interface, interesting game elements, and plots may motivate students to spend time in the game deciphering Irish text, comprehending it, and writing Irish text themselves. Initial reviews with users indicate that the game is enjoyable for learners. Data collection and analysis are ongoing.
Written data from learners is invaluable to CALL research: it can be used to study learner language and to build learner corpora. Moreover, analysis of student annotations in the error corpus can reveal which errors students are most likely to miss, which is useful to teachers and researchers. Currently, this game-based language learning tool is available only for Irish, but we envisage that the tool and the findings from the Irish context can be applied to other LCTLs in the future.
References
- Kilgarriff, A., Rundell, M. and Uí Dhonnchadha, E. (2006). Efficient corpus development for lexicography: Building the New Corpus for Ireland. Language Resources and Evaluation, 40(2).
- Murakami, A. and Alexopoulou, T. (2016). L1 influence on the acquisition order of English grammatical morphemes: A Learner Corpus Study. Studies in Second Language Acquisition, 38(3), 365-401. doi:10.1017/S0272263115000352.
- Ní Chiarain, N. and Ní Chasaide, A. (2019). An Scéalaí: autonomous learners harnessing speech and language technologies. In Proceedings of SLaTE 2019: 8th ISCA Workshop on Speech and Language Technology in Education.
- Ní Ghloinn, A., Uí Dhonnchadha, E. and O’Keeffe, A. (2018). The design and annotation of the TEG learner corpus of Irish. Proceedings of Inter-Varietal Applied Corpus Studies International Biennial Conference. Malta.
- Ó Baoill, D. and Ó Tuathail, E. (1992). Úrchúrsa Gaeilge. Institiúid Teangeolaíochta Éireann.
- Ó Meachair, M.J. (2019). The Creation and Complexity Analysis of a Corpus of Educational Materials in Irish (EduGA). PhD Thesis. Trinity College Dublin.
- Von Ahn, L. (2006). Games with a purpose. Computer, 39(6):92-94.
- Xu, L. and Chamberlain, J. (2020). Cipher: A prototype game-with-a-purpose for detecting errors in text. In Workshop on Games and Natural Language Processing, pages 17-25.