Maltese speech synthesiser
In August 2010, Crimsonwing was awarded the tender to develop the Maltese Text to Speech Synthesiser by the Foundation for Information Technology Accessibility (FITA). The project was initiated by FITA and is co-financed 85% by the European Regional Development Fund and 15% by the Maltese government.
Being EU-funded, the project falls under the EU’s cohesion policy 2007-2013, “Investing In Competitiveness For A Better Quality Of Life”. The aim of the synthesiser is to make a significant difference to the many Maltese-speaking people who need accessible electronic media, for example people who are partially sighted, blind, illiterate, injured or physically disabled.
FITA is the principal advocate and coordinator for making information and communications technology (ICT) accessible to disabled people in the Maltese islands. Its main function is to support disabled individuals in overcoming or removing barriers to education and employment through ICT. Through empowerment and social inclusion, disabled persons can rely less on family and state support and become enabled individuals who contribute productively to society and the economy.
background & requirements
An in-depth study was undertaken by FITA in 2009 to identify whether there was a need for enabling technology for disabled people in Malta. This study confirmed that there was a significant need for the availability of a Maltese Speech Synthesiser – with 44% of the participants stating the need for speech-enabled software using the Maltese language.
A tender was issued for the creation of a Maltese Speech Synthesiser, which had to comply with the Speech Application Programming Interface (SAPI) standard. This standard ensures support for a wide range of speech-enabled applications on the Microsoft Windows platform, including the most popular screen readers and educational software applications.
The key beneficiaries of the Maltese Speech Engine include:
- Blind and visually impaired people
- Persons with dyslexia and similar disabilities
- Illiterate people
- Elderly people or people with mobility impairments
Crimsonwing was awarded this tender based on its experience in research and development, the wide range of expertise within the company, and a trustworthy team that was willing and excited to face the challenges of such an innovative project. Furthermore, Crimsonwing was willing to develop close alliances with distinguished members of local academia. From the early stages of the project, Crimsonwing's involvement was crucial to the delivery of the Maltese Speech Synthesiser. The company invested in substantial research to tackle the linguistic and technical challenges posed by the project.
The contract required two major components: a SAPI-compliant Maltese Speech Engine and a Maltese lexicon with audio and annotation capabilities. The system was composed of three major stages. The first stage analysed the text, for instance by converting numbers, dates, times and abbreviations into their text equivalents. The second stage performed a linguistic analysis to determine prosodic effects such as phrasing and intonation. The third and final stage converted the phoneme stream and prosodic effects into an acoustic digital signal.
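The first stage described above can be illustrated with a minimal sketch. It is not Crimsonwing's implementation: the abbreviation table, the function names and the use of English number words are all assumptions made for illustration, and a real Maltese front end would need Maltese number rules and a far larger lexicon.

```python
import re

# Hypothetical abbreviation table (illustrative only, not from the project).
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Saint", "etc.": "et cetera"}

# English number words stand in for Maltese ones in this sketch.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" and " + number_to_words(rest) if rest else "")


def normalise(text: str) -> str:
    """Expand known abbreviations, then spell out numbers of up to three digits."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    # Longer digit runs (such as years) are left untouched by this pattern.
    return re.sub(r"\b\d{1,3}\b",
                  lambda match: number_to_words(int(match.group())),
                  text)


print(normalise("Dr. Borg has 42 files"))
```

The plain string replacement is deliberately naive; dates, times and context-dependent abbreviations would each need their own rules before the text is handed to the linguistic analysis stage.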
The software also needed to be capable of reading back users’ typed text, so that they could hear what they had written and make revisions. The Maltese Speech Synthesiser was also to offer additional speech controls such as tone, pitch, rate of speech and even speaker gender and age. In fact, the tender mandated the delivery of three voices: a male adult voice, a female adult voice, and a child voice. Apart from other acceptance criteria, the quality of the Speech Synthesiser was judged primarily on its intelligibility and naturalness. Speech recording and diphone extraction were performed for each of the three voices.
Crimsonwing developed everything from scratch to cater for FITA’s particular needs. This bespoke solution meant that Crimsonwing had to develop software tools to support the complete statistical analysis and build-up of the Lexicon, Prosody and Unit Databases. Crimsonwing invested considerable effort in research and delivered state-of-the-art methodologies for building the Maltese Speech Engine and the Maltese lexicon. It also published academic papers and presented them at Conferences for Maltese Linguistics. Crimsonwing developed a highly detailed project plan, including work schedules and timings, which FITA assessed continuously. As part of this plan, Crimsonwing focused on the following major research areas:
- Concatenative Speech Synthesis Methodologies
- Prosody Modelling
- Unit Selection Synthesis
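To make the idea of concatenative synthesis from diphones more concrete, here is a small sketch under stated assumptions. The phoneme symbols, the silence marker, the function names and the toy unit database are all hypothetical; the project's actual Unit Database and selection method are not described in this article.

```python
def to_diphones(phonemes, silence="_"):
    """Turn a phoneme sequence into diphone names.

    Each diphone spans the transition from one phoneme to the next;
    the sequence is padded with a silence marker at both ends.
    """
    padded = [silence, *phonemes, silence]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


def concatenate(diphones, unit_db):
    """Join the recorded samples of each diphone into one signal.

    `unit_db` is a hypothetical mapping from diphone name to a list of
    audio samples. A missing unit raises KeyError.
    """
    signal = []
    for name in diphones:
        signal.extend(unit_db[name])
    return signal


# Placeholder phoneme symbols, not a real Maltese transcription.
units = to_diphones(["m", "a", "l", "t", "a"])
print(units)

# Toy database: one fake sample per diphone.
toy_db = {name: [index] for index, name in enumerate(units)}
print(concatenate(units, toy_db))
```

A unit selection system would go further than this lookup, choosing among several recorded candidates per unit and smoothing the joins, with the prosody model supplying target pitch and duration.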
The Maltese Speech Engine (MSE) developed by Crimsonwing is compatible with version 5 of the Speech Application Programming Interface (SAPI) standard. Only speech-enabled software that is compatible with this standard can use the MSE. This includes screen readers and other educational and assistive technology software. The first version of the Maltese Speech Synthesiser runs on Windows XP, Windows Vista, and Windows 7 (32-bit and 64-bit).
The greater inclusion of disabled persons within the information society will continue to benefit Maltese people as a whole by empowering individuals to gain better access to education and obtain gainful employment. E-services that rely on the Maltese language have been made more user-friendly for Maltese speakers by using the Maltese Speech Engine to facilitate computer access. The success of the development is evidenced by the fact that the Maltese Speech Synthesiser is now pre-installed on all computers within the Education Department of the Government of Malta.
Roger Davies-Barrett, Project Manager for ERDF 114, stated that “the aim has always been to get as close to natural speech as possible and you only have to listen to the English speaking engines trying to pronounce Maltese text to realise how much better the Synthesiser we’ve developed is. Many e-government websites will now be fully accessible to Maltese speaking people who previously gave up due to badly pronounced Maltese.”
Carmel Gafa, head of technology at Crimsonwing, said, “The Maltese Speech Synthesiser that we’ve developed will make a significant difference to the lives of many Maltese speaking people who need accessible electronic media. We’re very satisfied with the excellent product we have delivered, and with having had the opportunity to socially contribute something so useful to Maltese society with our technical expertise.”