THL Simplified Phonetic Transcription of Standard Tibetan
by David Germano and Nicolas Tournadre
December 12, 2003
Section 9 of 13

Word Boundaries

Tibetan punctuation marks only the boundaries of syllables, not the boundaries of words. A small dot (the tsheg) separates each syllable (and in some exceptional cases may enclose two syllables), but the reader must determine which syllables combine to form a word. The following are our general principles for rendering Tibetan personal names and technical terms in the present context:

  • Monosyllabic words should be rendered as a single word and should not be combined with other syllables: rkub is “kup,” khyi is “khyi,” and so forth.
  • Bisyllabic words should be rendered as a single word: lha sa is “Lhasa,” bsod nams is “Sönam,” sngon po is “ngönpo,” and so forth.
  • Trisyllabic words are generally rendered as a single word: lha mo skyid is “Lhamokyi,” dpal ldan rgyal is “Pendengyel.” Note: one should be mindful not to combine the syllables of different words – thus bod rang skyong ljongs is “Bö Rangkyong Jong,” bu ston rin chen grub is “Butön Rinchendrup,” and ye shes ’od is “Yeshé Ö.”
  • Quadrisyllabic words should be rendered as two words: bsod nams rin chen is “Sönam Rinchen,” gang byung mang byung is “gangjung mangjung,” gtsang pa khang mtshan is “Tsangpa Khangtsen,” and so forth.
  • For personal and place names, the first letter of each word should be capitalized. Thus, “Sönam Rinchen” and not “Sönam rinchen.”
  • Grammatical particles should be rendered with the word with which they are construed – usually the preceding word. Thus, chos kyi rnam grangs is “chökyi namdrang,” gtan la phab pa is “tenla pappa,” and ’gyur med is “gyurmé.” An example of a particle that precedes the word it modifies: ma byas pa is “majepa.”
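
The length-based rules above can be sketched in Python as follows. This is only a minimal illustration, not the THL conversion program: the function name join_syllables and its proper_name flag are hypothetical, the input is assumed to be a single name or term whose syllables have already been transcribed one by one, and units longer than four syllables are rejected because the rules above do not cover them. The sketch cannot tell that a sequence such as ye shes ’od spans two words; that decision still has to come from a reader or a word list.

```python
def join_syllables(syllables, proper_name=False):
    """Join already-transcribed syllables of ONE name or term.

    Hypothetical helper: one to three syllables form a single word,
    four syllables form two words of two syllables each.
    """
    count = len(syllables)
    if count == 0:
        raise ValueError("expected at least one syllable")
    if count <= 3:
        words = ["".join(syllables)]
    elif count == 4:
        words = ["".join(syllables[:2]), "".join(syllables[2:])]
    else:
        # Assumption: longer units must be split by hand before calling this.
        raise ValueError("units longer than four syllables are not covered")
    if proper_name:
        # Capitalize the first letter of each word in personal and place names.
        words = [word[:1].upper() + word[1:] for word in words]
    return " ".join(words)


print(join_syllables(["sö", "nam", "rin", "chen"], proper_name=True))
print(join_syllables(["gang", "jung", "mang", "jung"]))
```

Under these assumptions, the two calls print “Sönam Rinchen” and “gangjung mangjung,” matching the quadrisyllabic examples given above.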

We will maintain a running list of individual words that are exceptions to the rules stated above. For instance, the city rgyal rtse is actually pronounced “Gyantsé” rather than “Gyentsé” as the rules would dictate. We ask that users report such exceptions to us so that we can factor them into the conversion program.
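
One way such an exception list might be consulted is sketched below in Python. The table name EXCEPTIONS, the function apply_exceptions, and the choice of a transliterated spelling as the lookup key are all assumptions made for illustration; the only entry shown is the rgyal rtse example given above.

```python
# Hypothetical exception table keyed by transliterated spelling.
EXCEPTIONS = {
    "rgyal rtse": "Gyantsé",
}


def apply_exceptions(transliterated_word, rule_based_output):
    """Return the listed pronunciation if one exists, else the rule-based form."""
    # Normalize runs of whitespace so "rgyal  rtse" still matches the key.
    key = " ".join(transliterated_word.split())
    return EXCEPTIONS.get(key, rule_based_output)


print(apply_exceptions("rgyal rtse", "Gyentsé"))
print(apply_exceptions("lha sa", "Lhasa"))
```

Checking the exception table before trusting the rule-based output keeps the general rules unchanged while letting individual words override them.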

Ultimately, the automated recognition of word boundaries will require a word list. In the meantime, we will mark up the boundaries of words to generate “Sönam Rinchen” rather than “Sönamrinchen.”
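
The effect of marking up word boundaries can be illustrated with a short Python sketch. The markup format is not specified here, so the sketch simply assumes that boundaries arrive as a list of words, each a list of already-transcribed syllables; the function name render_marked_up and its capitalize flag are hypothetical.

```python
def render_marked_up(words, capitalize=False):
    """Render words whose boundaries were marked by hand.

    `words` is a list of words, each given as a list of transcribed syllables.
    """
    rendered = []
    for syllables in words:
        word = "".join(syllables)  # syllables inside one word are run together
        if capitalize:
            word = word[:1].upper() + word[1:]
        rendered.append(word)
    return " ".join(rendered)  # marked word boundaries become spaces


# With the boundary marked after the second syllable:
print(render_marked_up([["sö", "nam"], ["rin", "chen"]], capitalize=True))
# With no boundary marked, all four syllables collapse into one word:
print(render_marked_up([["sö", "nam", "rin", "chen"]], capitalize=True))
```

The first call prints “Sönam Rinchen” and the second prints “Sönamrinchen,” which is the difference the manual markup is meant to control.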

Note: at present the THL Simplified Phonetics system is geared towards computer-generated output for Tibetan words and phrases. We are working to adapt it for use with longer passages and entire texts – for instance, converting an entire liturgical text for non-Tibetan speakers who want to chant the liturgy in Tibetan. This is not currently possible, however, because a computer program cannot reliably identify word boundaries in Tibetan texts. Until we resolve this problem, our interim solution is for the program to process each syllable individually and separate the results with spaces, ignoring the few rules that depend on identifying word boundaries (ba becomes wa when it is the final syllable of a word; é with diacritic accent is used when it is the final sound of a word).
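
The interim syllable-by-syllable approach might look like the following Python sketch. It is not the actual conversion program: the lookup table SYLLABLE_TABLE stands in for the real per-syllable transcription rules and holds only a few illustrative entries, the function name is hypothetical, and the input is assumed to be transliterated text with syllables separated by spaces.

```python
# Illustrative stand-in for the per-syllable transcription rules.
SYLLABLE_TABLE = {
    "bsod": "sö",
    "nams": "nam",
    "rin": "rin",
    "chen": "chen",
}


def transcribe_syllable_by_syllable(text, table=SYLLABLE_TABLE):
    """Transcribe each syllable on its own and separate the results with spaces.

    Rules that depend on word boundaries (word-final ba becoming wa,
    word-final é) are deliberately not applied.
    """
    output = []
    for syllable in text.split():
        # A syllable missing from the table raises KeyError rather than guessing.
        output.append(table[syllable])
    return " ".join(output)


print(transcribe_syllable_by_syllable("bsod nams rin chen"))
```

Under these assumptions the call prints “sö nam rin chen”: every syllable stands alone, and no attempt is made to recover “Sönam Rinchen.”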

