Letters and Numbers Orientation By Codepoint
This page is intended to help analyze Unicode characters with respect to text orientation. It is not yet comprehensive.
Category Codes:
| Code | UTR50 | MSFT | Meaning |
|---|---|---|---|
| U | U | S | Upright; translates between horizontal and vertical |
| R | S | R | Sideways; rotates between horizontal and vertical |
| TU | T | ST | Typeset upright with alternate glyph. Best fallback is just upright. |
| TR | SB | RT | Typeset upright with alternate glyph. Best fallback is just sideways. |
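
The correspondence in the table above can be written down as a small lookup, which may help when comparing this page's codes against UTR50 or MSFT data. This is only a sketch: the names `CATEGORY_CODES` and `to_page_code` are hypothetical and do not come from either source.

```python
# Category codes transcribed from the table above:
# this page's code -> (UTR50 code, MSFT code).
CATEGORY_CODES = {
    "U": ("U", "S"),
    "R": ("S", "R"),
    "TU": ("T", "ST"),
    "TR": ("SB", "RT"),
}


def to_page_code(value, source):
    """Translate a UTR50 or MSFT code into this page's code.

    `source` must be "UTR50" or "MSFT" (hypothetical labels for the
    two columns of the table).
    """
    index = {"UTR50": 0, "MSFT": 1}[source]
    for page_code, others in CATEGORY_CODES.items():
        if others[index] == value:
            return page_code
    raise KeyError(f"unknown {source} code: {value!r}")
```

Note that "S" means sideways in the UTR50 column but upright in the MSFT column, so the source always has to be given explicitly.
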
Two modes are presented: Stacking (text-orientation: upright) and Default (TBD).
Letters (L) and Script-Specific Numbers (N)
Letters and script-specific (non-Common) numbers are classified by their Script property (including the Script Extensions property). Common numbers are listed separately.
| Code | Name | Stack | Mixed | Memo | UTR | MS | |
|---|---|---|---|---|---|---|---|
| Bopo | Bopomofo | U | U | ||||
| Brai | Braille | U | R | :?: Checking with DAISY, but no reply yet. Most resources say Braille cannot flow vertically. [[http://www.design-thinking.jp/2011/04/blog-post.html | This page]] indicates [[http://en.wikipedia.org/wiki/Sanada_Yukimura | Yukimura Sanada]] developed vertical Braille as R in the 16th century, but this is probably different from today’s Braille. [[http://www6.ocn.ne.jp/~takut/tenjiehon.html | This book]] has vertical modern Braille, but the picture does not show whether it is U or R. A definite scan of Mongolian Braille, however, shows that it is R. |
| Egyp | Egyptian Hieroglyphs | U | U | Egyptian hieroglyphs are upright when written in columns | |||
| Hira | Hiragana | U | U | ||||
| Kana | Katakana | U | U | Unclear whether halfwidth katakana should be upright or sideways; voice marks are broken if set upright. | |||
| Hani | Han | U | U | ||||
| Hang | Hangul | U | U | ||||
| Lisu | Lisu | U | R | Lisu-script characters are used intermixed with Latin, so their orientations must match | UR | UU | |
| Merc | Meroitic Cursive | U | U | Egyptian hieroglyphs are upright when written in columns | UR | UR | |
| Mero | Meroitic Hieroglyphs | U | U | Egyptian hieroglyphs are upright when written in columns | UR | UU | |
| Mong | Mongolian | V | V | The Unicode code charts show Mongolian with vertical glyphs, while most fonts today have glyphs rotated 90 degrees CCW, so the script is U from the Unicode point of view but R from the UA point of view. Call it V. | |||
| Ogam | Ogham | R | R | ||||
| Orkh | Old Turkic | R | R | Old Turkic has a strong tradition of vertical writing. Unclear whether it rotates clockwise or counter-clockwise, but it definitely rotates. | |||
| Phag | Phags Pa | V | V | Same as Mongolian. | |||
| Yiii | Yi | U | U | Old documents show Yi rotated sideways (as vertical script), but one example of modern Yi (typeset horizontally) uses upright-stacked captions | |||
| Arab | Arabic | U | R | :?: Still debating how to handle cursive RTL in stacked mode | UR | RR | |
| Mand | Mandaic | U | R | :?: Still debating how to handle cursive RTL in stacked mode | UR | RR | |
| Miao | Miao | U | R | :?: Needs some research to determine whether U/R or U/U | UR | UU | |
| Syrc | Syriac | U | R | :?: Still debating how to handle cursive RTL in stacked mode | UR | RR | |
| – | Canadian_Aboriginal | U | R | :?: UTR#50 has U/U, unclear why | UU | UU | |
| – | Oriya, Telugu, Kannada, Malayalam, Sinhala, Myanmar, Khmer, Tai_Tham, Javanese, Cham | U | R | :?: Unclear why MSFT chose R/R, seems wrong | UR | RR | |
| – | Linear_B, Ugaritic, Old_Persian, Avestan | U | R | :?: Unclear why MSFT chose U/U. Cuneiform in particular derives (via rotation) from vertical writing, so U/U seems an illogical choice | UR | UU | |
| – | All others | U | R | Unless the script has a vertical tradition, it is sideways in mixed mode and upright in stacked |
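
The script table reduces to a short list of scripts that differ from the "All others" rule plus a default. The sketch below shows one way to encode that. The names `SCRIPT_ORIENTATION`, `DEFAULT_ORIENTATION` and `script_orientation` are hypothetical, the values are transcribed from the Stack and Mixed columns above, and several of them are still marked as open questions on this page.

```python
# (stack, mixed) orientation per script code, taken from the Stack and
# Mixed columns of the table above. Only scripts that differ from the
# "All others" rule are listed. "V" is this page's label for scripts
# handled like Mongolian.
SCRIPT_ORIENTATION = {
    "Bopo": ("U", "U"),
    "Egyp": ("U", "U"),
    "Hira": ("U", "U"),
    "Kana": ("U", "U"),
    "Hani": ("U", "U"),
    "Hang": ("U", "U"),
    "Merc": ("U", "U"),
    "Mero": ("U", "U"),
    "Mong": ("V", "V"),
    "Ogam": ("R", "R"),
    "Orkh": ("R", "R"),
    "Phag": ("V", "V"),
    "Yiii": ("U", "U"),
}

# "All others": upright when stacked, sideways in mixed mode.
DEFAULT_ORIENTATION = ("U", "R")


def script_orientation(script_code):
    """Return (stack, mixed) for a script code such as "Hani" or "Latn"."""
    return SCRIPT_ORIENTATION.get(script_code, DEFAULT_ORIENTATION)
```

Scripts listed in the table with U/R (Braille, Lisu, Arabic and so on) are left out of the dictionary because the default already covers them.
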
There are some exceptions:
| Code | Description | Char | Stack | Mix | Memo | |
|---|---|---|---|---|---|---|
| [[http://www.fileformat.info/info/unicode/char/30FC/index.htm | U+30FC]] | KATAKANA-HIRAGANA PROLONGED SOUND MARK | ー | TR | TR | |
| [[http://www.fileformat.info/info/unicode/char/FF70/index.htm | U+FF70]] | HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK | ｰ | TR | R | :?: Halfwidth? |
| U+FF61-FFDF, U+FFE8-FFEF | All halfwidth letters | | U | R | |
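
Per-codepoint exceptions like these would typically be checked before falling back to the script-based classification. The following is a minimal sketch of such a check. The names `EXCEPTIONS` and `exception_for` are hypothetical, and letting the single-codepoint entry for U+FF70 take precedence over the halfwidth range that contains it is an assumption, not something this page states.

```python
# Exceptions from the table above: (first, last, stack, mixed).
# Assumption: more specific entries are listed first so that U+FF70
# wins over the halfwidth range U+FF61-FFDF that also contains it.
EXCEPTIONS = [
    (0x30FC, 0x30FC, "TR", "TR"),
    (0xFF70, 0xFF70, "TR", "R"),
    (0xFF61, 0xFFDF, "U", "R"),
    (0xFFE8, 0xFFEF, "U", "R"),
]


def exception_for(char):
    """Return (stack, mixed) if `char` is listed as an exception, else None."""
    cp = ord(char)
    for first, last, stack, mixed in EXCEPTIONS:
        if first <= cp <= last:
            return (stack, mixed)
    return None
```

A caller would use the returned pair when it is not `None` and otherwise classify the character by its script.
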
Some interesting cases:
Letterlike Symbols Block Letters
See also Symbols from this block and Math symbols from this block
| Code | Description | Char | Stack | Mix | Memo |
|---|---|---|---|---|---|
| U+2102 | DOUBLE-STRUCK CAPITAL C | ℂ | U | R | Part of mathematical double-struck set |
| U+2107 | EULER CONSTANT | ℇ | U | R | Match PLANCK CONSTANT |
| U+210A | SCRIPT SMALL G | ℊ | U | R | Part of mathematical script set |
| U+210B | SCRIPT CAPITAL H | ℋ | U | R | Part of mathematical script set |
| U+210C | BLACK-LETTER CAPITAL H | ℌ | U | R | Match other math letters |
| U+210D | DOUBLE-STRUCK CAPITAL H | ℍ | U | R | Part of mathematical double-struck set |
| U+210E | PLANCK CONSTANT | ℎ | U | R | Part of mathematical italic set |
| U+210F | PLANCK CONSTANT OVER TWO PI | ℏ | U | R | Match PLANCK CONSTANT |
| U+2110 | SCRIPT CAPITAL I | ℐ | U | R | Part of mathematical script set |
| U+2111 | BLACK-LETTER CAPITAL I | ℑ | U | R | Match other math letters |
| U+2112 | SCRIPT CAPITAL L | ℒ | U | R | Part of mathematical script set |
| U+2113 | SCRIPT SMALL L | ℓ | U | U | EA compatibility unit is upright. Not unified with mathematical script l. |
| U+2115 | DOUBLE-STRUCK CAPITAL N | ℕ | U | R | Part of mathematical double-struck set |
| U+2119 | DOUBLE-STRUCK CAPITAL P | ℙ | U | R | Part of mathematical double-struck set |
| U+211A | DOUBLE-STRUCK CAPITAL Q | ℚ | U | R | Part of mathematical double-struck set |
| U+211B | SCRIPT CAPITAL R | ℛ | U | R | Part of mathematical script set |
| U+211C | BLACK-LETTER CAPITAL R | ℜ | U | R | Match other math letters |
| U+211D | DOUBLE-STRUCK CAPITAL R | ℝ | U | R | Part of mathematical double-struck set |
| U+2124 | DOUBLE-STRUCK CAPITAL Z | ℤ | U | R | Part of mathematical double-struck set |
| U+2126 | OHM SIGN | Ω | U | U | EA compatibility unit is upright. :!: NFC-folds to omega |
| U+2128 | BLACK-LETTER CAPITAL Z | ℨ | U | R | Match other math letters |
| U+212A | KELVIN SIGN | K | U | U | EA compatibility unit is upright. :!: NFC-folds to K |
| U+212B | ANGSTROM SIGN | Å | U | U | EA compatibility unit is upright. :!: NFC-folds to Aring |
| U+212C | SCRIPT CAPITAL B | ℬ | U | R | Part of mathematical script set |
| U+212D | BLACK-LETTER CAPITAL C | ℭ | U | R | Match other math letters |
| U+212F | SCRIPT SMALL E | ℯ | U | R | Part of mathematical script set |
| U+2130 | SCRIPT CAPITAL E | ℰ | U | R | Part of mathematical script set |
| U+2131 | SCRIPT CAPITAL F | ℱ | U | R | Part of mathematical script set |
| U+2132 | TURNED CAPITAL F | Ⅎ | U | R | Claudian must match Latin |
| U+2133 | SCRIPT CAPITAL M | ℳ | U | R | Part of mathematical script set |
| U+2134 | SCRIPT SMALL O | ℴ | U | R | Part of mathematical script set |
| U+2139 | INFORMATION SOURCE | ℹ | U | U | Symbolic, not math |
| U+213C | DOUBLE-STRUCK SMALL PI | ℼ | U | R | Match double-struck Latin |
| U+213D | DOUBLE-STRUCK SMALL GAMMA | ℽ | U | R | Match double-struck Latin |
| U+213E | DOUBLE-STRUCK CAPITAL GAMMA | ℾ | U | R | Match double-struck Latin |
| U+213F | DOUBLE-STRUCK CAPITAL PI | ℿ | U | R | Match double-struck Latin |
| U+2135 | ALEF SYMBOL | ℵ | U | R | Math symbol |
| U+2136 | BET SYMBOL | ℶ | U | R | Math symbol |
| U+2137 | GIMEL SYMBOL | ℷ | U | R | Math symbol |
| U+2138 | DALET SYMBOL | ℸ | U | R | Math symbol |
| U+2145 | DOUBLE-STRUCK ITALIC CAPITAL D | ⅅ | U | R | Math symbol, match double-struck Latin |
| U+2146 | DOUBLE-STRUCK ITALIC SMALL D | ⅆ | U | R | Math symbol, match double-struck Latin |
| U+2147 | DOUBLE-STRUCK ITALIC SMALL E | ⅇ | U | R | Math symbol, match double-struck Latin |
| U+2148 | DOUBLE-STRUCK ITALIC SMALL I | ⅈ | U | R | Math symbol, match double-struck Latin |
| U+2149 | DOUBLE-STRUCK ITALIC SMALL J | ⅉ | U | R | Math symbol, match double-struck Latin |
| U+214E | TURNED SMALL F | ⅎ | U | R | Claudian must match Latin |
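
The rows marked as NFC-folding (OHM SIGN, KELVIN SIGN, ANGSTROM SIGN) can be checked with Python's standard `unicodedata` module. The helper name `nfc_folds` is hypothetical. The likely consequence, stated here as an inference rather than something this page spells out, is that text normalized to NFC no longer contains these codepoints, so their U/U classification cannot be applied to it.

```python
import unicodedata


def nfc_folds(char):
    """Return True if NFC normalization replaces `char` with something else."""
    return unicodedata.normalize("NFC", char) != char


# The three compatibility units flagged in the table above.
for cp in (0x2126, 0x212A, 0x212B):
    char = chr(cp)
    folded = unicodedata.normalize("NFC", char)
    print(f"U+{cp:04X} -> U+{ord(folded):04X} (folds: {nfc_folds(char)})")
```

Characters such as U+2102 DOUBLE-STRUCK CAPITAL C have only a compatibility decomposition, which NFC does not apply, so `nfc_folds` returns `False` for them.
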