How can I make a mini English-Chinese dictionary for Ubuntu?

When you open Ubuntu, you see many English words. For example, accessories, education, graphics, internet, office, other, sound, video, system tools, preferences, run, and log out all appear in the GUI.

How can I get a list of all English words used in the Ubuntu GUI?

I want a list of all English words so I can translate them into Chinese, to create a glossary for Ubuntu: a guide in Chinese to English-language Ubuntu terminology. This would work as a mini dictionary: a two-column file with English on the left and Chinese on the right.

It would be ridiculous to compile the whole thing just by reading the words off the GUI. I want a smarter way to do the task. Is there a list of all the words used by Ubuntu?

Screenshot of text editor showing English and Chinese words

Typing by hand as I come across words in the GUI is a foolish way to get the task done.

  1. Copy out all English words one-by-one from the GUI when the locale is set to en_US.
  2. Copy out all Chinese words one-by-one from the GUI when the locale is set to zh.
  3. Put them together in one file, with English on the left and Chinese on the right.

Is there a smarter way to do this?
Which characters are in my /usr/lib/locale/locale-archive?


Solution 1:

Internationalized strings are stored in several different ways, depending on the GUI toolkit or specification.

  • The ones shown in the screenshot come from menu .directory files, which use the same standard as .desktop files from the freedesktop project. See Desktop Entry Specification: Localized values for keys.

    Here is a script to extract them:

    1. Create a shell file

      nano extract_zh_translation.sh
      

      Add

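      #!/bin/bash
      # Usage: ./extract_zh_translation.sh "GLOB"
      # Prints the English and zh_CN Name of every desktop entry matching GLOB.

      # $1 is left unquoted on purpose so the quoted glob is expanded here.
      for f in $1
      do
              [ -f "$f" ] || continue

              # Take only the first Name= and Name[zh_CN]= line of each file.
              name_en=$(sed -n -e 's/^\s*Name\s*=\s*//p' "$f" | head -n 1)
              name_zh=$(sed -n -e 's/^\s*Name\[zh_CN\]\s*=\s*//p' "$f" | head -n 1)

              # Pad with tabs so the Chinese column lines up (8-column tab stops).
              case $(( ${#name_en} / 8 )) in
              0)      printf '%s\t\t\t%s\n' "$name_en" "$name_zh"
                      ;;
              1)      printf '%s\t\t%s\n' "$name_en" "$name_zh"
                      ;;
              *)      printf '%s\t%s\n' "$name_en" "$name_zh"
                      ;;
              esac
      done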
      
    2. Make it executable

      chmod +x extract_zh_translation.sh
      
    3. Run it on menu directory entries

      ./extract_zh_translation.sh "/usr/share/desktop-directories/*.directory"
      

      Here is an output sample

    4. If you want application menu entries too

      ./extract_zh_translation.sh "/usr/share/applications/*.desktop"
      

      Here is an output sample

  • Other translations in the Unity GUI use gettext.

    Files with the .gmo or .mo extension are in the final binary format. Check:

    locate -br '\.mo$' | grep zh_CN
    

    The source files have the .po extension. You can download them from https://translations.launchpad.net/ or get them by downloading the complete source, for example:

    apt-get source unity
    
  • Other GUI toolkits, such as Qt and Java, use different formats.
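
For the gettext case, the .po source files already pair each English string (msgid) with its translation (msgstr), so they can be flattened into the two-column layout the question asks for. The following is only a minimal sketch, not a full .po parser: the function name po_to_tsv is made up for illustration, and it assumes plain msgid/msgstr entries, takes only the first plural form, ignores msgctxt, and leaves escape sequences such as \n or \" untouched.

```shell
#!/bin/sh
# po_to_tsv (hypothetical helper): print "msgid<TAB>msgstr" for every
# translated entry found in the given .po files, or on standard input.
po_to_tsv() {
    awk '
        # Remove the keyword and the surrounding double quotes from a line.
        function unquote(s) {
            sub(/^[^"]*"/, "", s)
            sub(/"[ \t]*$/, "", s)
            return s
        }
        # Print the pair collected so far. The header entry (empty msgid)
        # and untranslated entries (empty msgstr) are skipped.
        function emit() {
            if (id != "" && str != "") printf "%s\t%s\n", id, str
            id = ""; str = ""; field = ""
        }
        /^msgid "/       { emit(); field = "id"; id = unquote($0); next }
        /^msgstr "/      { field = "str"; str = unquote($0); next }
        /^msgstr\[0\] "/ { field = "str"; str = unquote($0); next }
        # Continuation lines extend whichever string is currently open.
        /^"/             { if (field == "id") id = id unquote($0)
                           else if (field == "str") str = str unquote($0)
                           next }
        # Anything else (comment, blank line, msgctxt, plural) closes it.
        { field = "" }
        END { emit() }
    ' "$@"
}
```

If only the binary .mo file is at hand, msgunfmt from the gettext tools can turn it back into .po text first, for example msgunfmt example.mo | po_to_tsv, where example.mo stands for whichever file the locate command above reports.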

Solution 2:

See the Ubuntu Translations page: https://wiki.ubuntu.com/Translations/

You can get the expressions and their translations there.

Solution 3:

I think he is referring to another problem. On my Ubuntu I have set up English, Hebrew and French. You have to go to System Settings, Language Support and add the additional languages.

To have it boot up in a different language, you drag the preferred language to the top of the list. For me, it works like a charm.