Digital Standards for South Asian Writing Systems

Preserving Historical and Indigenous Writing Traditions Through Unicode

Anshuman Pandey, University of Michigan, Ann Arbor



Introduction

Presented here is an overview and bibliography of my work on the development of character-encoding standards for the writing systems of South Asia. These standards are published, or intended for publication, in the Universal Character Set (UCS) / Unicode Standard. In simplest terms, a character-encoding standard enables the native usage of writing systems on computers. The UCS / Unicode is an international standard published jointly by the International Organization for Standardization (ISO) and the Unicode Consortium. The standard is known formally as the International Standard ISO/IEC 10646, "Information technology -- Universal multiple-octet coded character set (UCS)".
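
To make the idea of a character-encoding standard concrete, the following minimal Python sketch shows how a character is identified by a number (its code point) and how that number is serialized into bytes by the UTF-8 and UTF-16 encoding forms. The function name `describe` and the choice of sample code points are illustrative assumptions, not part of any proposal described on this page.

```python
# Illustrative sketch: a code point is an integer; encoding forms turn it into bytes.
# The helper name `describe` is hypothetical and chosen only for this example.

def to_hex(data: bytes) -> str:
    """Format a byte string as space-separated uppercase hex pairs."""
    return " ".join(f"{byte:02X}" for byte in data)

def describe(code_point: int) -> dict:
    """Return the U+ notation and the UTF-8 / UTF-16 (big-endian) bytes of a code point."""
    character = chr(code_point)
    return {
        "code_point": f"U+{code_point:04X}",
        "utf8": to_hex(character.encode("utf-8")),
        "utf16": to_hex(character.encode("utf-16-be")),
    }

# U+0915 is DEVANAGARI LETTER KA; U+11C00 is the first code point of the
# Bhaiksuki range listed on this page and lies outside the Basic Multilingual Plane.
for sample in (0x0915, 0x11C00):
    print(describe(sample))
```

Code points above U+FFFF, such as those in the Bhaiksuki range, take four bytes in UTF-8 and a surrogate pair in UTF-16, which is why software support for supplementary planes matters for many historical scripts.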

I am an American historian of South Asia. One facet of my research is dedicated to an ongoing project to document the historical, indigenous, and lesser-studied writing systems of the region (and in some cases, of areas far afield), to promote awareness and study of these writing systems, and to enable the preservation and usage of these scripts through global standards and modern technologies. The documents provided here are proposals for character-encoding standards that I have authored and submitted to the Unicode Technical Committee and to ISO for inclusion in Unicode. These documents contain detailed histories of the writing systems, descriptions of orthographies, examples of usage, and information related to the technical implementation of the scripts.

All work presented here is based upon original research. Several of these papers on historical writing systems and scripts associated with minority linguistic communities are the only English-language resources available for these writing systems. Additional information on script-encoding projects that I am currently pursuing may be found on my blog. Please direct comments to me at the email address provided at the bottom of this page.

Several of these projects have been partially supported by the Scripts Encoding Initiative (SEI) at the University of California, Berkeley. A list of SEI encoding projects that I have completed or with which I am currently engaged is available at "Alphabetical List of Scripts Not Yet Encoded".


Overview of Projects

Listed below are script blocks and individual characters for which I have authored and submitted character-encoding proposals. The section named "Characters Added to the Standard" enumerates scripts and characters that are published as part of the UCS / Unicode; a link to the code chart at the website of the Unicode Consortium is provided. The section "Accepted for Future Inclusion" lists scripts and characters that are pending publication in a future version of the standard. Scripts and characters for which projects are ongoing are listed in the section "Projects in Progress".

  • Accepted for Future Inclusion

      • Bhaiksuki (U+11C00..11C6F)
      • U+A8FC DEVANAGARI SIGN SIDDHAM
      • U+A8FD DEVANAGARI JAIN OM
      • U+1123E KHOJKI SIGN SUKUN
      • Multani (U+11280..112AF)
      • U+111C9 SHARADA SANDHI MARK
      • U+111CA SHARADA SIGN NUKTA
      • U+111CB SHARADA VOWEL MODIFIER MARK
      • U+111CC SHARADA EXTRA SHORT VOWEL MARK
      • U+111CD SHARADA SUTRA MARK
      • U+111CE SHARADA CONTINUATION SIGN
      • U+111DA SHARADA EKAM
      • U+111DB SHARADA HEADSTROKE
      • U+111DC SHARADA SIGN SIDDHAM
      • U+111DD SHARADA SECTION MARK-1
      • U+111DE SHARADA SECTION MARK-2
      • Zanabazar Square (U+11A00..11A4F)

  • Projects in Progress

      • Balti 'A'
      • Balti 'B'
      • Coorgi-Cox Alphabet
      • Devanagari additions
      • Dhimal / Dham
      • Dhives Akuru
      • Diwani Siyaq Numbers
      • Gangga Malayu
      • Gondi
      • Gujarati additions
      • Indic Siyaq Numbers / Raqm
      • Jenticha (Koinch Brehs)
      • Kawi
      • Khambu Rai
      • Khatt-i Baburi
      • Khema (Tamu Khema Phri) / Gurung
      • Khojki additions
      • Kirat Rai
      • Landa
      • Magar Akkha
      • Nandinagari
      • Newar
      • Ottoman Siyaq Numbers
      • Pau Cin Hau Syllabary
      • Persian Siyaq Numbers
      • Pyu
      • Ranjana
      • Rohingya
      • Sharada additions
      • Siddham additions
      • Soyombo
      • Tani Lipi
      • Tikamuli
      • Tolong Siki
      • Zou (Zolai)
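
The script ranges given under "Accepted for Future Inclusion" use the notation U+first..last, with both ends in hexadecimal. As a rough sketch of how such ranges can be read programmatically, the Python below parses the three ranges copied from that list, reports how many code points each reserves, and looks up which range a given code point falls in. The names `RANGES`, `parse_range`, and `block_of` are hypothetical, and the counts are sizes of the reserved ranges, not the number of characters actually assigned.

```python
import re
from typing import Optional

# Ranges copied from the list above; keys are the script names as given there.
RANGES = {
    "Bhaiksuki": "U+11C00..11C6F",
    "Multani": "U+11280..112AF",
    "Zanabazar Square": "U+11A00..11A4F",
}

# "U+" followed by two hexadecimal code points separated by "..".
RANGE_PATTERN = re.compile(r"U\+([0-9A-F]{4,6})\.\.([0-9A-F]{4,6})")

def parse_range(text: str) -> range:
    """Convert 'U+XXXX..YYYY' into an inclusive range of integer code points."""
    match = RANGE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a code point range: {text!r}")
    first, last = (int(group, 16) for group in match.groups())
    if first > last:
        raise ValueError(f"range is reversed: {text!r}")
    return range(first, last + 1)

def block_of(code_point: int) -> Optional[str]:
    """Return the name of the listed range containing code_point, or None."""
    for name, text in RANGES.items():
        if code_point in parse_range(text):
            return name
    return None

for name, text in RANGES.items():
    print(f"{name}: {len(parse_range(text))} code points reserved")
```

Individual characters in the list, such as U+111C9 SHARADA SANDHI MARK, belong to blocks that are not spelled out as ranges on this page, so `block_of(0x111C9)` returns `None` in this sketch.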


Documents Authored and Submitted

Listed here are documents I have authored and submitted for the standardization of writing systems and individual characters in the UCS / Unicode.

Presentations

  • "A Pre-script-ion for the Future: The Role of Unicode in the Development of Minority Languages in South Asia". Sessional presentation. Internationalization and Unicode Conference (35th), Santa Clara, October 19, 2011. [Abstract]
  • "Script Encoding Initiative 2010: Progress and Endgame", with Deborah Anderson (University of California, Berkeley). Panel: Unicode Outside of Industry: News from Academic and Non-Profit Projects. Internationalization and Unicode Conference (34th), Santa Clara, October 20, 2010. [Abstract]


Appreciation



Press



Anshuman Pandey (pandey at umich period edu)
Department of History
University of Michigan
October 2014