Adding a new language or dialect dictionary

The dictionary files used by IXIASOFT CCMS are located in the system/dicts/ collection. There are two files that handle dictionary processing per language:

  • a [language].aff file, which handles the rules (hyphenation, prefixes, affixes, etc.) relating to the target language
  • a [language].dic file, which contains the word list for that language

By default, at least one language dictionary is installed with your version of the CCMS.

The following procedure describes how to add an additional set of dictionary files to the CCMS. The files must be added to two separate locations within the system.

The CCMS uses the Hunspell dictionaries, which are available for free.

Important: The abbreviation used for renaming the dictionary in the following procedure must match the language setting within the prolog of a map/topic, e.g. xml:lang="en-us".
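The naming rule above can be checked before any files are renamed. The following is a minimal sketch only: it assumes that xml:lang is set on the root element of the topic, and the function name expected_dictionary_names and the sample topic are hypothetical, not part of the CCMS.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def expected_dictionary_names(topic_xml):
    """Return the (.aff, .dic) file names implied by the root xml:lang value.

    Hypothetical helper: assumes xml:lang sits on the root element.
    """
    root = ET.fromstring(topic_xml)
    lang = root.get(XML_LANG)
    if lang is None:
        raise ValueError("root element has no xml:lang attribute")
    # Case is preserved on purpose: the file names must match xml:lang exactly.
    return f"{lang}.aff", f"{lang}.dic"


# Illustrative topic, not taken from a real CCMS template.
sample = '<topic id="t1" xml:lang="en-us"><title>Sample</title></topic>'
print(expected_dictionary_names(sample))
```

For the sample topic, the dictionary files would need to be named en-us.aff and en-us.dic.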

To add a new dictionary file, do the following:

  1. Locate the source of a Hunspell dictionary for the language or dialect that you are interested in.
    A popular resource for Hunspell dictionaries is hosted on GitHub. Go to https://github.com/wooorm/dictionaries, and then click Clone or download > Download ZIP.
  2. Open the downloaded zip file, and choose the appropriate language folder to add. Transfer the index.aff and index.dic from that folder to your local system.
  3. Rename the index.aff and index.dic files so that they are identifiable as a separate language or dialect. The folder that the files come from provides the language code to use.
    For example, if you were adding the German dictionary, you could rename index.aff and index.dic to de-de.aff and de-de.dic. If you are adding a dialect, such as British English, you could rename the files to en-GB.aff and en-GB.dic instead.
    Note: The name of these files should follow a 4-letter convention describing the language followed by the dialect. Also, the case used must match that used by xml:lang within the topic templates.
  4. Within TEXTML Administration, navigate to /system/dicts.
  5. Right-click dicts and select Insert documents. Click Add file, select the two renamed dictionary files located on your system, then click OK.
  6. Prepare a second copy of these files for the system/dicts/customs collection, which holds additional, customized words for that language. To begin, open the renamed [language].dic on your local system.
  7. Select all of the content within the file, and delete it. Replace it with 0 in the first line of the file, followed by a carriage return. The content of the file should look like the following:
    0				
    				
  8. Save the modified [language].dic file.
  9. Within TEXTML Administration, navigate to /system/dicts/customs.
  10. Right-click customs and select Insert documents. Click Add file, select the renamed [language].aff file and the modified [language].dic file located on your system, then click OK.
  11. Within TEXTML Administration, navigate to /system/conf and open languages.xml.
  12. If the new language/dialect is not present, add the following lines, where [language] is the 4-letter code used by xml:lang within the topic templates:
    	<language groups="[language group name]" name="[full name of language or dialect]">
    		<code type="ISO-639-1">[language]</code>
    		<code type="ISO-639-2/B">[language]</code>
    	</language>

    For more information on the [language group name] and [full name of language or dialect], see the other language entries in the file for examples of what to use, or review the ISO-639-2 code list, available from: loc.gov/standards/iso639-2/php/code_list.php.

    Note: Adding a new, active language to this file automatically prompts users, when they create a new topic, to choose the language, which is reflected in the xml:lang.
The dictionaries are now added and active in the CCMS.
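
The local file preparation in steps 2-3 and 6-8 can also be scripted before the files are inserted through TEXTML Administration. The following is a rough sketch under stated assumptions: the function name prepare_dictionary_files and the staging folder layout (dicts and customs subfolders) are hypothetical, and the upload itself is still done manually as described above.

```python
import shutil
from pathlib import Path


def prepare_dictionary_files(source_dir, out_dir, lang):
    """Stage the files for system/dicts and system/dicts/customs locally.

    source_dir: extracted language folder holding index.aff and index.dic
    out_dir:    hypothetical local staging folder (not a CCMS location)
    lang:       code matching xml:lang exactly, for example "en-GB"
    """
    source_dir, out_dir = Path(source_dir), Path(out_dir)
    main_dir = out_dir / "dicts"
    customs_dir = out_dir / "customs"
    main_dir.mkdir(parents=True, exist_ok=True)
    customs_dir.mkdir(parents=True, exist_ok=True)

    # Steps 2-3: copy index.aff and index.dic under the language-specific name.
    for ext in ("aff", "dic"):
        shutil.copyfile(source_dir / f"index.{ext}", main_dir / f"{lang}.{ext}")

    # Steps 6-8: the customs copy keeps the same .aff file, but its .dic
    # file contains only "0" on the first line followed by a line break.
    shutil.copyfile(source_dir / "index.aff", customs_dir / f"{lang}.aff")
    (customs_dir / f"{lang}.dic").write_text("0\n", encoding="utf-8")

    return main_dir, customs_dir
```

For example, calling prepare_dictionary_files("dictionaries/en-GB", "staging", "en-GB") would leave en-GB.aff and en-GB.dic in staging/dicts for step 5, and en-GB.aff plus the emptied en-GB.dic in staging/customs for step 10. The paths in this call are illustrative.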