Page Numbers of Original Texts
For an XML document that represents a physical text or, as with JIATS, one that has a PDF equivalent delineating putative page breaks, the page boundaries are marked up with the <milestone /> element. This is an empty, self-closing element that contains no text or children. (That is, instead of a pair of opening and closing tags, it is a single tag that ends with "/>".) Because Tibetan proper nouns have two display forms (scholarly extended Wylie transliteration and easily readable phonetics), a special form of markup is used when a page or line break occurs within a Tibetan proper noun. See the description of this exception below.
- Element: <milestone />
- Use: placed at the location in the text where a new page or line begins
- unit: the unit being recorded by the milestone, usually "page" or "line"
- ed: identifies the edition or text being referred to. This is used especially when two kinds of milestones occur together, as when a JIATS article (which has its own page breaks) contains a transliteration of a text with embedded page breaks.
- n: (optional) the number of the page or line being recorded. In JIATS articles, n is omitted from page milestones; the page number is determined by the number of preceding milestones.
- rend: determines how the milestone is rendered. Values are:
  - "ownline": displays the page or line number on its own line
  - "indent": indents the line before displaying the number
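The rule that a JIATS page number follows from the number of preceding milestones can be sketched in Python. This is only an illustration under stated assumptions, not THL's actual processing code: the function name derive_page_numbers and the first_page parameter are hypothetical, the article's own page breaks are assumed to be the page milestones without an ed attribute, and the first milestone is assumed to open the page after first_page.

```python
import xml.etree.ElementTree as ET


def derive_page_numbers(xml_text, first_page=1):
    """Return the page number that each article page milestone opens.

    Assumptions (not stated in the guidelines): milestones carrying an
    ``ed`` attribute belong to an embedded text and are skipped, and the
    text before the first milestone is on page ``first_page``.
    """
    root = ET.fromstring(xml_text)
    numbers = []
    preceding = 0
    # iter() walks the tree in document order, so the running count is
    # the number of article page milestones seen so far.
    for el in root.iter("milestone"):
        if el.get("unit") == "page" and el.get("ed") is None:
            preceding += 1
            numbers.append(first_page + preceding)
    return numbers


# Hypothetical fragment mixing article page breaks with an embedded line break.
sample = (
    '<div><p>with fecal analysis of bear <milestone unit="page"/>droppings</p>'
    '<p><milestone ed="IOL" unit="line" n="1"/>text '
    '<milestone unit="page"/>more</p></div>'
)
print(derive_page_numbers(sample))
```

Passing a different first_page (for instance the first page of the article in its journal issue) shifts every derived number by the same amount.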
A milestone is generally displayed within square brackets as the unit followed by its number, such as "page 4". In digital texts no unit is given, only the page and line, such as "8a" or "14b.3".
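As a rough sketch of these display rules, the Python function below builds the label for one milestone. The function name milestone_label and the digital_text flag are hypothetical, and several details are assumptions rather than documented behavior: that digital texts keep the square brackets, that "indent" starts a new line, and that the indent is four spaces wide.

```python
def milestone_label(unit, n, rend=None, digital_text=False):
    """Build an illustrative display string for a single milestone."""
    # Digital texts show only the number; other texts show the unit too.
    core = str(n) if digital_text else f"{unit} {n}"
    label = f"[{core}]"
    if rend == "ownline":
        # Place the label on a line of its own.
        return f"\n{label}\n"
    if rend == "indent":
        # Assumption: begin a new line, indent it, then show the label.
        return f"\n    {label}"
    return label


print(milestone_label("page", 4))
print(milestone_label("line", "14b.3", digital_text=True))
```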
JIATS example of an article's page break:
… with fecal analysis of bear <milestone unit="page"/>droppings in the mid-1990s …
JIATS example of a transcription of a text contained within an article:
<p><milestone ed="IOL" unit="panel" n="1" rend="ownline"/><milestone ed="IOL" unit="line" rend="indent" n="1"/>༆། །རང་བཞིན་
Digital Tibetan text example (from a Dege Kangyur text):
… །ཚུལ་ཁྲིམས་འཆལ་པས་<milestone unit="page" n="2a" /><milestone unit="line" n="2a.1" />ཟིན་རྣམས་ཀྱི། …
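To illustrate how such inline milestones might be turned into visible markers, the Python sketch below walks an element and replaces each <milestone /> that has an n attribute with its number in square brackets. The function name flatten_with_labels is hypothetical, the enclosing <p> in the sample is added only so the fragment parses, and the bracketed output format is an assumption rather than the THL stylesheet's actual behavior.

```python
import xml.etree.ElementTree as ET


def flatten_with_labels(element):
    """Return the element's text with each numbered milestone shown as "[n]"."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == "milestone":
            n = child.get("n")
            if n is not None:
                parts.append(f"[{n}]")
        else:
            # Recurse into ordinary elements so nested milestones are kept.
            parts.append(flatten_with_labels(child))
        # The tail is the text that follows the child inside its parent.
        parts.append(child.tail or "")
    return "".join(parts)


sample = (
    '<p>།ཚུལ་ཁྲིམས་འཆལ་པས་<milestone unit="page" n="2a" />'
    '<milestone unit="line" n="2a.1" />ཟིན་རྣམས་ཀྱི།</p>'
)
print(flatten_with_labels(ET.fromstring(sample)))
```

Milestones without n, such as JIATS article page breaks, are skipped here because their numbers would have to be derived separately.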
When a page-break milestone falls within an element that has different popular and scholarly views, such as a persName or title, special markup is needed. The element containing the break is inserted twice, once for each view: the scholarly view keeps the <milestone /> element, while the popular view uses <pb rend="milestone"/> to mark the location of the break. This pair of elements is then wrapped in a third element of the same name with type set to "views". Thus,
<title type="views">
<title level="m" lang="tib" n="s">rin po che snang gsal spu gri <milestone unit="page"/>’bar bas ’khrul snang rtsad nas gcod pa nam mkha’i mtha’ dang mnyam pa’i rgyud</title>
<title level="m" lang="tib" n="p">Rinpoché Nangsel Pudri <pb rend="milestone"/>Barwé Trülnang Tsené Chöpa Namkhé Ta Dang Nyampé Gyü</title>
</title>
The same is done for other elements, such as <persName>.
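A quick consistency check for this wrapper pattern could look like the Python sketch below. It is a hedged illustration, not a THL tool: the function name check_views is hypothetical, and reading n="s" as the scholarly view and n="p" as the popular view is inferred from the title example rather than stated explicitly. The sample titles are truncated excerpts of that example.

```python
import xml.etree.ElementTree as ET


def check_views(wrapper):
    """Return a list of problems found in a type="views" wrapper."""
    problems = []
    if wrapper.get("type") != "views":
        problems.append('wrapper lacks type="views"')
    # The two views are direct children with the same tag as the wrapper.
    by_view = {c.get("n"): c for c in wrapper if c.tag == wrapper.tag}
    scholarly, popular = by_view.get("s"), by_view.get("p")
    if scholarly is None or scholarly.find(".//milestone") is None:
        problems.append("scholarly view (n='s') needs a <milestone/>")
    if popular is None or popular.find(".//pb[@rend='milestone']") is None:
        problems.append("popular view (n='p') needs <pb rend='milestone'/>")
    return problems


good = (
    '<title type="views">'
    '<title level="m" lang="tib" n="s">rin po che snang gsal spu gri '
    '<milestone unit="page"/>’bar bas</title>'
    '<title level="m" lang="tib" n="p">Rinpoché Nangsel Pudri '
    '<pb rend="milestone"/>Barwé</title>'
    '</title>'
)
print(check_views(ET.fromstring(good)))
```

Because the check compares child tags with the wrapper's own tag, the same function would apply to a <persName type="views"> wrapper.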