This document provides a guide to using the Wunderkammer Import Package 2 to port electronic dictionaries for display on mobile phones through Wunderkammer. There are three major steps to this process:
- Ensuring that the original dictionary file is in a suitable format
- Setting up the dictionary configuration in wkimport
- Transferring and installing the dictionary on mobile phones
There are two optional steps that can be undertaken to further customise the dictionary:
- Creating a custom theme
- Adding custom fonts
Note that the Wunderkammer Import Package requires Java to run. This can be downloaded for most major computer platforms for free from the linked website if it is not already installed.
Common problems that arise when importing dictionaries are listed in the troubleshooting section. If you encounter a problem that is not listed there, write to James at the address james followed by the at sign and then pfed dot info.
Source dictionary format
The Wunderkammer Import Package can read and convert dictionaries stored in the backslash coded format used by Shoebox/Toolbox and in the XML format used by Kirrkirr. Although it is possible to create a mobile phone dictionary directly from an existing electronic dictionary, there are a few design features of Wunderkammer that should be taken into consideration when importing dictionaries to make the most of the platform.
Wunderkammer does not have any real support for multiple senses within a single entry. It is possible to suggest a subgrouping of fields in an entry through the order in which they appear (as long as wkimport is set to take the field order from the source dictionary - see under Mappings tab below). However, Wunderkammer does not recognise any groupings below the level of the entry and so it is not possible, for example, to target a menu search or link to a particular part of an entry. Long entries with multiple senses might also be difficult for users to read on their phones because they may have to scroll down a long way to read the entire entry. The best strategy for formatting dictionaries that make use of multiple senses is probably to divide the senses into separate homonymous entries.
The importing package automatically uniquifies the entries in the input dictionary. If there are any homographic entries in the input dictionary (that is, entries that have identical lemmas), a number will be added after each of the lemmas to distinguish them. The numbering starts from 1 for each group of identical lemmas. For example, in a dictionary that contains two entries, each of which has the lemma turla, the lemmas will be renamed to turla 1 and turla 2.
If a link points to a lemma that is uniquified when imported then the link will be broken. The uniquifying and link verification routines print lists of the lemmas uniquified and the broken links to the console so it is possible to see which lemmas were uniquified and which links were broken.
When the standard uniquify method renames a lemma, it does not check that there is not already a lemma with the new name. Dictionary makers should try to avoid using names with the same format as those outputted by the uniquifying method to prevent the possibility of creating new homographic entries through uniquification.
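The numbering behaviour described above can be sketched in Java as follows. This is only an illustration written from the description in this guide, not the code used by wkimport; the class name, the method name and the sample lemma "other" are hypothetical. The second call in main shows the pitfall just mentioned: a source lemma that already looks like a numbered lemma ends up identical to a renamed one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of homograph numbering; not the wkimport source. */
final class HomographNumbering {

    /** Appends " 1", " 2", ... to every lemma that occurs more than once. */
    static List<String> uniquify(List<String> lemmas) {
        // First pass: count how often each lemma occurs.
        Map<String, Integer> totals = new HashMap<>();
        for (String lemma : lemmas) {
            totals.merge(lemma, 1, Integer::sum);
        }
        // Second pass: number the members of each homograph group from 1.
        // No check is made that the new name is itself unused.
        Map<String, Integer> seen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String lemma : lemmas) {
            if (totals.get(lemma) > 1) {
                int n = seen.merge(lemma, 1, Integer::sum);
                result.add(lemma + " " + n);
            } else {
                result.add(lemma);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two homographs are numbered; the unrelated lemma is left alone.
        System.out.println(uniquify(Arrays.asList("turla", "other", "turla")));
        // A pre-existing "turla 1" now clashes with the renamed first entry.
        System.out.println(uniquify(Arrays.asList("turla", "turla", "turla 1")));
    }
}
```

In the second case the result contains turla 1 twice, which is why lemma names in the source dictionary should not follow the "lemma number" pattern.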
The importing process is driven by the Java application wkimport.jar. On most computer platforms that have Java installed, it should be enough to double click on wkimport.jar to start the program.
When wkimport is started for the first time, it will detect the operating system language and attempt to display the user interface in that language. If wkimport does not know the operating system language, it will default to English. The wkimport user interface language can be manually changed from the Settings > Language menu.
Before it can import a dictionary, wkimport needs to know where the source dictionary content is stored and how the content should be presented in Wunderkammer. This section provides a guide to what information is required and how it can be entered into the three tabs of the wkimport user interface: Input/Output, Mappings and Menus.
The configuration data entered into wkimport can be saved and loaded using the Save and Open commands on the File menu. Two sets of sample dictionary data have been included in the Wunderkammer Import Package to provide working examples of dictionary configurations. The Kaurna sample dictionary is an XML dictionary. Its configuration file can be found at ./demodics/kaurnademo/kaurnaconfig.cfg. The Tura sample dictionary is a Shoebox/Toolbox dictionary, and its configuration file is also included in the package.
Once all the required data have been entered, the Wunderkammer dictionary can be created by selecting Create dictionary from the Run menu. The user interface view will jump to the Console tab where information about the progress of the importing process and reports of any errors will be shown. The jar and jad files for the resulting dictionary will then be available in the output directory specified in the Input/Output tab.
The Input/Output tab collects information about the dictionary source and output files. This information can be typed into the text fields. Text fields that require file or directory paths have buttons to their right that can be clicked to open a file selection dialog box that automatically enters the path of the selected file into the text field.
The data required in the text fields are:
- Input fields
Input dictionary: The path to the file that contains the source Shoebox/Toolbox or XML dictionary.
Input dictionary type: The type of dictionary contained in the source dictionary file: either Shoebox/Toolbox dictionary or XML dictionary.
Trim input fields: Whether leading and trailing whitespace should be trimmed from the data in input fields. See point 2 in the troubleshooting section below for more information. This option is only available for Shoebox/Toolbox dictionaries; whitespace is always trimmed from XML dictionaries.
Entry XPath: The XPath to the element that contains dictionary entries. This information is not applicable to Shoebox/Toolbox dictionaries, so this field is not enabled when the dictionary type is set to Shoebox/Toolbox dictionary. wkimport recognises the beginning of a new entry in a Shoebox/Toolbox dictionary when it encounters a field mapped to lemma (see Mappings tab below for information on field mappings).
Images directory: The path to the directory that contains the image files to be included in the dictionary. All the images that are to be included in the dictionary should be in a single directory with no subdirectories. If there are no images to include in the dictionary, this field should be left blank.
Sound directory: The path to the directory that contains the sound files to be included in the dictionary. All the sounds that are to be included in the dictionary should be in a single directory with no subdirectories. If there are no sounds to include in the dictionary, this field should be left blank.
Icon file: The path to the file to use as the dictionary icon on mobile phones. The 'standard' icon at ./standardfiles/icons/frogicon.png can be used by those who do not wish to create their own custom icons.
Theme file: The path to the theme file that controls the appearance of the mobile phone dictionary. Standard theme files can be found in ./standardfiles/themes/. Information on creating custom theme files can be found under Custom theme below.
- Output fields
Output directory: The path to the directory that the dictionary jar and jad files should be written to.
Dictionary name: The name that the output dictionary should be given. Do not use spaces in the dictionary name.
Vendor name: The name of the 'vendor' of the dictionary. Java ME jad files require a vendor name. In most cases, this should probably be the name of the organisation sponsoring the dictionary or the name of the lexicographer responsible for compiling the dictionary. Do not use spaces in the vendor name.
Dictionary version: The version of the dictionary. The best approach is probably to assign version numbers to successive versions of the dictionary according to the sort of numbering systems used in software development.
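To illustrate how entries are delimited in a Shoebox/Toolbox source, and what Trim input fields changes, the following Java sketch splits backslash-coded lines into entries. It is not the wkimport parser: the class and method names, the ps marker and the sample data are assumptions made for the example, and it assumes that the lx field is the one mapped to lemma.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not the parser used by wkimport. */
final class BackslashEntrySplitter {

    /**
     * Groups backslash-coded lines into entries. A new entry starts at each
     * line whose marker equals lemmaMarker. Each field is a (marker, value) pair.
     */
    static List<List<Map.Entry<String, String>>> split(
            List<String> lines, String lemmaMarker, boolean trim) {
        List<List<Map.Entry<String, String>>> entries = new ArrayList<>();
        List<Map.Entry<String, String>> current = null;
        for (String line : lines) {
            if (!line.startsWith("\\")) {
                continue; // this sketch skips blank and continuation lines
            }
            int space = line.indexOf(' ');
            String marker = space < 0 ? line.substring(1) : line.substring(1, space);
            String value = space < 0 ? "" : line.substring(space + 1);
            if (trim) {
                value = value.trim(); // what "Trim input fields" is described as doing
            }
            if (marker.equals(lemmaMarker)) {
                current = new ArrayList<>();
                entries.add(current);
            }
            if (current != null) { // fields before the first lemma are dropped
                current.add(new SimpleEntry<>(marker, value));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Invented sample data; note the trailing space after "n".
        List<String> lines = Arrays.asList(
                "\\lx turla", "\\ps n ", "\\lx turla", "\\ps v");
        System.out.println(split(lines, "lx", true));
        System.out.println(split(lines, "lx", false));
    }
}
```

With trimming switched off, the value of the first ps field keeps its trailing space, which is the kind of difference that leads to the duplicated menu items described in the troubleshooting section.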
In the Mappings tab, fields in the source dictionary can be 'mapped' to fields in the output dictionary. This means that a correspondence between the fields is established so that wkimport knows, for example, that an lx field in a source dictionary should appear as a lemma field in the output dictionary.
When wkimport is first started the Input fields list will be empty. If the input dictionary is a Shoebox/Toolbox dictionary, the list can be automatically populated from the input dictionary by clicking the Populate list button immediately below the Input fields list, as can be seen in Figure 2 below.
If the source dictionary is an XML dictionary, however, the XPaths for the input fields must be entered manually by clicking the Add XPath button and then entering the XPath in the dialog box that appears. This can be seen in Figure 3 below. Note that the XPaths must refer to XML elements; they cannot refer to XML attributes.
Both for Shoebox/Toolbox and XML dictionaries, unwanted input fields that appear in the Input fields list can be removed from the list by selecting the field and then clicking the Remove selected button immediately below the list.
To establish a mapping, select a field from the Input fields list on the left, select its corresponding Wunderkammer field from the Output fields on the right, and then click the Map button. The new mapping should then appear in the Mappings list at the bottom. It is possible to create multiple mappings from one Input field to different Output fields or from different Input fields to a single Output field. Unwanted mappings can be removed by selecting the mapping and clicking the Remove selected button immediately above the right side of the Mappings list.
Each of the Output fields has a conventional association to a particular type of data that is typically stored in dictionaries. These associations are spelt out in the list below.
lemma: The lemma or headword.
sd: The semantic domain.
pos: The part of speech.
glossdef: A gloss or a definition.
sound: The name of the sound file to play in this entry.
image: The name of an image to include in an entry.
link: A field that provides a link to another entry in the dictionary.
ri: An additional field that has no conventional association.
rii: An additional field that has no conventional association.
riii: An additional field that has no conventional association.
Note that even though most of the fields have conventional associations to particular types of data, the data in sd, pos, glossdef, ri, rii and riii fields are treated simply as plain text by Wunderkammer. This means that any type of data intended to be displayed as text could be stored in these fields. The way that text should be rendered in each of these fields and the link field is determined by the theme. See below under Custom themes for information on how to modify themes. All of these fields can be repeated within a single entry.
The other fields are treated specially by Wunderkammer and must contain specific types of data. The lemma field must contain the lemma, or headword, of the entry. The sound and image fields must contain the names of sound and image files that should be played or shown in the entry. The link field must contain the value of the lemma of the entry that it links to. There can only be one sound field in each entry. The link fields can be repeated in a single entry.
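Because a link field must hold exactly the lemma of its target entry, links can be checked before importing by comparing each link value against the set of lemmas. The Java sketch below shows the idea under stated assumptions: the class and method names are hypothetical, the sample data are invented, and the check that wkimport itself performs may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a link check; not the wkimport routine. */
final class LinkCheckSketch {

    /** Returns every link value that does not match any lemma exactly. */
    static List<String> brokenLinks(Set<String> lemmas, List<String> linkValues) {
        List<String> broken = new ArrayList<>();
        for (String target : linkValues) {
            // The comparison is exact: case and spacing must match.
            if (!lemmas.contains(target)) {
                broken.add(target);
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        // Lemmas as they would look after two "turla" entries were uniquified.
        Set<String> lemmas = new HashSet<>(Arrays.asList("turla 1", "turla 2", "other"));
        // A link written against the original, unnumbered lemma no longer resolves.
        System.out.println(brokenLinks(lemmas, Arrays.asList("turla", "other")));
    }
}
```

A link that still points to the unnumbered form turla is reported, which matches the warning above that links to uniquified lemmas are broken.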
The In entries checkbox below the Map button is used to determine whether the specific mapping should be shown in entries in the final dictionary. Some fields are only included in the input dictionary for the purpose of making indexes and should not appear in entries in the final dictionary. For example, an input dictionary might have a reverse index field that contains values that are the same as or simply transformations of those in a gloss field, e.g. from gloss 'swamp grass' to 'grass, swamp' or 'cockatoo' to 'cockatoo'. When In entries is selected and a mapping is made, the output field will be shown in entries in the final dictionary. When In entries is not selected, the field will not be shown in entries in the final dictionary. In the Mappings list, fields that will be shown in entries are marked as true and those that will not be shown are marked as false. In Figures 2 and 3 above it can be seen that the ri field is marked as false, since it is simply used in these dictionaries for creating a reverse index and should not appear in entries.
The checkbox Field order from source dictionary, immediately above the Mappings list, is used to determine whether the order in which fields are shown in the output dictionary follows the order in which they appear in the input dictionary or whether it follows the order of the mappings in the Mappings list. When the box is not selected, the fields in entries in the output dictionary will be in exactly the order listed in the Mappings list (except for lemma fields, which are not part of the body of entries). The order of fields in the Mappings list can be adjusted using the up and down arrows to the left of the Field order from source dictionary checkbox. When the checkbox is selected, the order of fields in entries in the output dictionary will be the same as those in the input dictionary. If the order of fields in the source dictionary is not consistent from entry to entry, this inconsistency will appear in the output dictionary. The Field order from source dictionary checkbox cannot be selected for XML dictionaries.
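The two ordering modes can be pictured with the short Java sketch below. It is an illustration of the behaviour described above rather than the wkimport implementation; the class and method names are hypothetical, and it assumes that every field in the entry has a mapping in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the two field-ordering modes. */
final class FieldOrderSketch {

    /**
     * sourceOrder: output field names of one entry, in source order.
     * mappingOrder: output field names in the order of the Mappings list.
     * fromSource: the state of "Field order from source dictionary".
     */
    static List<String> displayOrder(
            List<String> sourceOrder, List<String> mappingOrder, boolean fromSource) {
        List<String> body = new ArrayList<>(sourceOrder);
        body.removeIf("lemma"::equals); // lemma is not part of the entry body
        if (!fromSource) {
            // List.sort is stable, so repeated fields keep their relative order.
            body.sort(Comparator.comparingInt(mappingOrder::indexOf));
        }
        return body;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("lemma", "glossdef", "pos", "glossdef");
        List<String> mappings = Arrays.asList("lemma", "pos", "glossdef");
        System.out.println(displayOrder(source, mappings, true));  // source order kept
        System.out.println(displayOrder(source, mappings, false)); // Mappings list order
    }
}
```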
The Menus tab allows the menus of the output dictionary to be specified. The Wunderkammer menu system is structured as a tree. The first menu that is loaded is always the root menu. From this menu there can be any number of submenus that are embedded to any depth. Each submenu displays a list of the data contained in the field that the submenu is associated with. For example, a submenu associated with the field lemma will display a list of all lemmas in the dictionary. Submenus that are embedded within other submenus will not show the fields of all entries in the dictionary, but only those that would be contained under the item selected in the menus they are embedded in. For example, in the case where there is a menu of semantic domains that contains a menu of lemmas, when the user selects a semantic domain from the first menu only lemmas of entries within the selected semantic domain will be shown in the embedded menu of lemmas. When a user navigates to the bottom of the menu system they will be taken to the entry that corresponds to the last selected menu item.
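The way an embedded menu narrows its contents can be sketched in Java as below. This is a simplified model written for this guide, not Wunderkammer code: the class and method names and the sample entries are invented, each entry is assumed to have a single value per field, and plain alphabetical order stands in for the menu's configured sort order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical model of nested menus; not the Wunderkammer implementation. */
final class MenuSketch {

    /** Builds a minimal entry with a lemma and a semantic domain. */
    static Map<String, String> entry(String lemma, String sd) {
        Map<String, String> e = new HashMap<>();
        e.put("lemma", lemma);
        e.put("sd", sd);
        return e;
    }

    /** The distinct values of one field: the items a menu on that field lists. */
    static List<String> items(List<Map<String, String>> entries, String field) {
        TreeSet<String> values = new TreeSet<>();
        for (Map<String, String> e : entries) {
            String value = e.get(field);
            if (value != null) {
                values.add(value);
            }
        }
        return new ArrayList<>(values);
    }

    /** Keeps only the entries under the item selected in the parent menu. */
    static List<Map<String, String>> narrow(
            List<Map<String, String>> entries, String field, String selected) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> e : entries) {
            if (selected.equals(e.get(field))) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Map<String, String>> entries = Arrays.asList(
                entry("emu", "Living things"),
                entry("kangaroo", "Living things"),
                entry("rock", "Landscape"));
        System.out.println(items(entries, "sd")); // the semantic domain menu
        // The embedded lemma menu after "Living things" has been selected.
        System.out.println(items(narrow(entries, "sd", "Living things"), "lemma"));
    }
}
```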
A submenu can be added to the tree by selecting the menu that should be its parent and clicking the Add child button. The name of the menu that will be displayed to the user in Wunderkammer can be set in the Menu name text field, the entry field that it is linked to can be selected in the Field selection box, and the sort order used for the menu can be entered in the Sort order text field. The syntax for describing sort orders follows that used by the Java RuleBasedCollator. To confirm changes to these properties of menus, click the Update node button. Unwanted menus can be removed by clicking the Remove selected button.
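As a hedged example of the rule syntax, the Java sketch below builds a java.text.RuleBasedCollator from a rule string and uses it to sort a few words. The rule string and the words are invented for illustration (a small alphabet in which the digraph ng counts as a single letter after n), and the class and method names are hypothetical; exactly how wkimport passes the Sort order text to the collator is not documented here.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical demonstration of RuleBasedCollator rules. */
final class SortOrderDemo {

    /** Returns a copy of words sorted according to the given collation rules. */
    static List<String> sortWith(String rules, List<String> words) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator(rules);
        } catch (ParseException e) {
            // Thrown when the rule string is not valid collation syntax.
            throw new IllegalArgumentException("Invalid sort order: " + rules, e);
        }
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator); // a Collator can be used as a Comparator
        return sorted;
    }

    public static void main(String[] args) {
        // "<" introduces the next letter in the order; "ng" is one letter here.
        String rules = "< a < i < n < ng < o < u";
        System.out.println(sortWith(rules, Arrays.asList("nu", "nga", "na", "no")));
    }
}
```

With these rules nga is placed after nu, because ng is treated as a separate letter that follows n. It is safest to list every character that occurs in the menu's field, since the position of unlisted characters depends on the collator's fallback behaviour.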
The Custom font button can be used to load a custom font for displaying the menu tree, menu names and sort orders. Custom fonts might be needed for languages that use non-Roman scripts or special Roman-based characters (see Custom fonts for more information). Custom fonts must be installed on the host computer for wkimport to be able to use them.
To run a Wunderkammer dictionary the jar and jad files produced by wkimport for the dictionary must be transferred to a mobile phone. The files could be transferred from a computer using Bluetooth, removable memory cards or a USB connection, depending on what options are available on the phone and the computer the files are being transferred from.
If a phone has internet access, it may also be possible to download the dictionary directly from the internet on to the phone. For instance, the Kaurna dictionary demo MIDlet can be downloaded by opening a phone's web browser and taking it to the address http://www.pfed.info/wunderkammer.jad. Note that the mobile network operator may charge extortionate fees for the data transfer. It costs nothing to transfer the files directly from a computer to the phone using any of the methods described in the paragraph above, however.
It should be fairly straightforward to install (if necessary) and run the files once they have been transferred to the mobile phone. There is too much diversity in mobile phone models to be able to describe the steps required here. Information about how to install software on particular phone models can probably be found in the phone manual or online.
Since Wunderkammer is a Java ME program, it cannot be run in Java SE, the standard environment used on desktop computers. To run a Wunderkammer dictionary on a computer, it is necessary to use an emulator. There are several Java ME emulators available, but the most reliable is probably the one included in the Sun Java Wireless Toolkit, which can be downloaded for free from the linked website.
It is possible to change the appearance and localisation settings of Wunderkammer by bundling the program with a modified resource file. The standard resource files can be found in the directory ./standardfiles/themes. These can be edited with the ResourceEditor application, which is bundled with the LWUIT library. ResourceEditor is located at LWUIT/util/ResourceEditor.jar in the package. There is documentation included with ResourceEditor.
To change the general appearance of Wunderkammer, the theme, images and animations stored within the resource file need to be edited. To modify the localisation settings or change the additional text that is added to fields within entries, the localisation settings need to be edited.
Custom fonts may be needed for dictionaries of languages that use non-Roman scripts or special characters. Any custom fonts used must be included in the Wunderkammer theme file. It might also be necessary to write a special input method to allow users to enter characters from the custom font in the menu search box. Dmitry Idiatov has provided detailed instructions on creating custom fonts, incorporating them into theme files and creating custom input methods on the PFED blog. The relevant posts are at 1, 2, 3 and 4.
1. wkimport has trouble reading the input dictionary file. Error messages like java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 are often indicative of this problem.
The input dictionary must be encoded as UTF-8 without a byte order mark (UTF-8 no BOM). To ensure this is the case, open the dictionary in an advanced text editor and save it again in the correct encoding. Free text editors that provide this functionality include TextWrangler (for Mac OS X) and Notepad++ (for Windows). Linux users should already have their own favourite text editor with this functionality.
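For those who prefer to check the file programmatically, the following Java sketch detects and removes a UTF-8 byte order mark (the bytes EF BB BF at the start of the file). It is offered as an optional helper, not as part of the Wunderkammer Import Package; the class and method names are hypothetical. It only removes the BOM and does not convert a file that is stored in some other encoding.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper for removing a UTF-8 byte order mark. */
final class BomStripper {

    /** True if the data starts with the UTF-8 BOM bytes EF BB BF. */
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Returns the data without a leading UTF-8 BOM, if there was one. */
    static byte[] stripUtf8Bom(byte[] data) {
        return hasUtf8Bom(data) ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java BomStripper <dictionary file>");
            return;
        }
        Path file = Paths.get(args[0]);
        byte[] data = Files.readAllBytes(file);
        if (hasUtf8Bom(data)) {
            Files.write(file, stripUtf8Bom(data)); // overwrites the file in place
            System.out.println("Removed BOM from " + file);
        } else {
            System.out.println("No UTF-8 BOM found in " + file);
        }
    }
}
```

Since the file is overwritten in place, it is advisable to keep a backup copy of the dictionary before running a helper like this.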
2. There is a semantic domain menu in the output dictionary (or other type of menu that groups entries together) and the same semantic domain is appearing multiple times, e.g. 'Living things' and 'Living things ' (the second with a trailing space).
Make sure that entries that should appear in the same semantic domain really do have exactly the same text in their semantic domain fields. wkimport is case sensitive and is also sensitive to leading and trailing spaces in all input fields (as in the example above) when Trim input fields is not turned on.
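Values that differ only in case or in surrounding spaces can be hard to spot by eye. The Java sketch below groups field values by a normalised form and reports the groups that contain more than one spelling. It is an illustrative aid under stated assumptions, not a feature of wkimport: the class and method names are hypothetical and the sample values are invented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical check for near-duplicate field values. */
final class DomainCheckSketch {

    /**
     * Maps each normalised value (trimmed, lower case) to the distinct raw
     * spellings found for it, keeping only values with two or more spellings.
     */
    static Map<String, Set<String>> nearDuplicates(List<String> values) {
        Map<String, Set<String>> groups = new TreeMap<>();
        for (String value : values) {
            String key = value.trim().toLowerCase(Locale.ROOT);
            groups.computeIfAbsent(key, k -> new TreeSet<>()).add(value);
        }
        // Drop the groups in which every occurrence is spelt identically.
        groups.values().removeIf(spellings -> spellings.size() < 2);
        return groups;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList(
                "Living things", "Living things ", "Plants", "Plants");
        // Only "living things" is reported: it has two different spellings.
        System.out.println(nearDuplicates(domains));
    }
}
```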
Version 2.1 of Guide to importing dictionaries, 15 August 2010. Wunderkammer project.