This page is intended to help analyze Unicode with respect to text orientation. It is far from comprehensive yet.
Category Codes:
Code | UTR50 | MSFT | Meaning |
---|---|---|---|
U | U | S | Upright; translates between horizontal and vertical |
R | S | R | Sideways; rotates between horizontal and vertical |
TU | T | ST | Typeset upright with alternate glyph. Best fallback is just upright. |
TR | SB | RT | Typeset upright with alternate glyph. Best fallback is just sideways. |
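The correspondence between the three naming schemes can be sketched as a small lookup table. This is only an illustration of the table above: the dictionary layout and the function names `from_utr50` and `fallback` are assumptions made for this example, not part of UTR50 or any implementation.

```python
# Category codes as listed in the table above.
# The dict layout and function names are illustrative, not from any spec.
CATEGORY_CODES = {
    # page code: (UTR50 code, MSFT code, fallback orientation)
    "U": ("U", "S", "U"),
    "R": ("S", "R", "R"),
    "TU": ("T", "ST", "U"),
    "TR": ("SB", "RT", "R"),
}


def from_utr50(utr50_code):
    """Return this page's code for a UTR50 code, or None if unknown."""
    for code, (utr50, _msft, _fallback) in CATEGORY_CODES.items():
        if utr50 == utr50_code:
            return code
    return None


def fallback(code):
    """Orientation to use when no alternate vertical glyph is available."""
    return CATEGORY_CODES[code][2]


print(from_utr50("SB"))  # TR
print(fallback("TU"))    # U
```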
Two modes are presented: Stacking (text-orientation: upright) and Default (TBD).
Letters and script-specific (non-Common) numbers are classified by using their script property (including Script Extensions property). Common numbers are listed separately.
Code | Name | Stack | Mixed | Memo | UTR | MS |
---|---|---|---|---|---|---|
Bopo | Bopomofo | U | U | |||
Brai | Braille | U | R | Checking with DAISY; no reply yet. Most resources say Braille cannot flow vertically. This page indicates that Yukimura Sanada developed vertical Braille as R in the 16th century, but this is probably different from today's Braille. This book shows vertical modern Braille, but the picture does not make clear whether it is U or R. A definite scan of Mongolian Braille, however, shows that it is R. | ||
Egyp | Egyptian Hieroglyphs | U | U | Egyptian hieroglyphs are upright when written in columns | ||
Hira | Hiragana | U | U | |||
Kana | Katakana | U | U | Unclear whether halfwidth katakana should be upright or sideways; voice marks are broken if set upright. | ||
Hani | Han | U | U | |||
Hang | Hangul | U | U | |||
Lisu | Lisu | U | R | Lisu-script characters are used intermixed with Latin, so their orientations must match | UR | UU |
Merc | Meroitic Cursive | U | U | Egyptian hieroglyphs are upright when written in columns | UR | UR |
Mero | Meroitic Hieroglyphs | U | U | Egyptian hieroglyphs are upright when written in columns | UR | UU |
Mong | Mongolian | V | V | The Unicode code charts show Mongolian with vertical glyphs, while most fonts today have glyphs rotated 90 degrees CCW, so they are U from the Unicode point of view but R from the UA point of view. Call it V. | ||
Ogam | Ogham | R | R | |||
Orkh | Old Turkic | R | R | Old Turkic has a strong tradition of vertical writing. Unclear whether it rotates clockwise or counter-clockwise, but it definitely rotates. | ||
Phag | Phags Pa | V | V | Same as Mongolian. | ||
Yiii | Yi | U | U | Old documents show Yi rotated sideways (as vertical script), but one example of modern Yi (typeset horizontally) uses upright-stacked captions | ||
Arab | Arabic | U | R | Still debating how to handle cursive RTL in stacked mode | UR | RR |
Mand | Mandaic | U | R | Still debating how to handle cursive RTL in stacked mode | UR | RR |
Miao | Miao | U | R | Needs some research to determine whether U/R or U/U | UR | UU |
Syrc | Syriac | U | R | Still debating how to handle cursive RTL in stacked mode | UR | RR |
– | Canadian_Aboriginal | U | R | UTR#50 has U/U, unclear why | UU | UU |
– | Oriya, Telugu, Kannada, Malayalam, Sinhala, Myanmar, Khmer, Tai_Tham, Javanese, Cham | U | R | Unclear why MSFT chose R/R, seems wrong | UR | RR |
– | Linear_B, Ugaritic, Old_Persian, Avestan | U | R | Unclear why MSFT chose U/U. Cuneiform in particular derives (via rotation) from vertical writing, so U/U seems an illogical choice | UR | UU |
– | All others | U | R | Unless the script has a vertical tradition, it is sideways in mixed mode and upright in stacked |
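A minimal sketch of how the script table might be applied, assuming the caller already knows a character's four-letter script code (Python's standard library does not expose the Script property, so that lookup is left out). Only scripts whose values differ from the "All others" default of U/R are listed; the names `SCRIPT_ORIENTATION` and `orientation_for_script` are hypothetical.

```python
# Scripts from the table above whose (stack, mixed) values differ from the
# "All others" default. Names here are illustrative assumptions.
SCRIPT_ORIENTATION = {
    "Bopo": ("U", "U"),
    "Egyp": ("U", "U"),
    "Hira": ("U", "U"),
    "Kana": ("U", "U"),
    "Hani": ("U", "U"),
    "Hang": ("U", "U"),
    "Merc": ("U", "U"),
    "Mero": ("U", "U"),
    "Yiii": ("U", "U"),
    "Mong": ("V", "V"),
    "Phag": ("V", "V"),
    "Ogam": ("R", "R"),
    "Orkh": ("R", "R"),
}

# "All others": upright when stacked, sideways in mixed mode.
DEFAULT_ORIENTATION = ("U", "R")


def orientation_for_script(script, mode):
    """Return the orientation code for a script in 'stack' or 'mixed' mode."""
    if mode not in ("stack", "mixed"):
        raise ValueError("mode must be 'stack' or 'mixed'")
    stack, mixed = SCRIPT_ORIENTATION.get(script, DEFAULT_ORIENTATION)
    return stack if mode == "stack" else mixed


print(orientation_for_script("Hani", "mixed"))  # U
print(orientation_for_script("Latn", "mixed"))  # R
```

A character with several Script Extensions values would need a rule for combining the per-script results; this sketch does not attempt one.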
There are some exceptions:
Code | Description | Char | Stack | Mix | Memo |
---|---|---|---|---|---|
U+30FC | KATAKANA-HIRAGANA PROLONGED SOUND MARK | ー | TR | TR | |
U+FF70 | HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK | ｰ | TR | R | Halfwidth? |
U+FF61-FFDF, U+FFE8-FFEF | All halfwidth letters | | U | R | |
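The exceptions above would have to be checked before the script-based classification. The sketch below shows one possible ordering, with single code points taking priority over ranges, because U+FF70 falls inside the halfwidth range U+FF61-FFDF. The function name `exception_for` and the data layout are assumptions for illustration.

```python
# Per-code-point exceptions from the table above: (stack, mixed).
SINGLE_EXCEPTIONS = {
    0x30FC: ("TR", "TR"),  # KATAKANA-HIRAGANA PROLONGED SOUND MARK
    0xFF70: ("TR", "R"),   # HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
}

# Halfwidth letter ranges (inclusive), all classified U / R.
HALFWIDTH_RANGES = [(0xFF61, 0xFFDF), (0xFFE8, 0xFFEF)]


def exception_for(ch):
    """Return (stack, mixed) if ch is listed as an exception, else None."""
    cp = ord(ch)
    # Single code points first: U+FF70 also lies inside a halfwidth range.
    if cp in SINGLE_EXCEPTIONS:
        return SINGLE_EXCEPTIONS[cp]
    for low, high in HALFWIDTH_RANGES:
        if low <= cp <= high:
            return ("U", "R")
    return None


print(exception_for("\u30fc"))  # ('TR', 'TR')
print(exception_for("\uff70"))  # ('TR', 'R')
print(exception_for("\uff76"))  # ('U', 'R')
print(exception_for("A"))       # None
```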
Some interesting cases:
See also Symbols from this block and Math symbols from this block.
Code | Description | Char | Stack | Mix | Memo |
---|---|---|---|---|---|
U+2102 | DOUBLE-STRUCK CAPITAL C | ℂ | U | R | Part of mathematical double-struck set |
U+2107 | EULER CONSTANT | ℇ | U | R | Match PLANCK CONSTANT |
U+210A | SCRIPT SMALL G | ℊ | U | R | Part of mathematical script set |
U+210B | SCRIPT CAPITAL H | ℋ | U | R | Part of mathematical script set |
U+210C | BLACK-LETTER CAPITAL H | ℌ | U | R | Match other math letters |
U+210D | DOUBLE-STRUCK CAPITAL H | ℍ | U | R | Part of mathematical double-struck set |
U+210E | PLANCK CONSTANT | ℎ | U | R | Part of mathematical italic set |
U+210F | PLANCK CONSTANT OVER TWO PI | ℏ | U | R | Match PLANCK CONSTANT |
U+2110 | SCRIPT CAPITAL I | ℐ | U | R | Part of mathematical script set |
U+2111 | BLACK-LETTER CAPITAL I | ℑ | U | R | Match other math letters |
U+2112 | SCRIPT CAPITAL L | ℒ | U | R | Part of mathematical script set |
U+2113 | SCRIPT SMALL L | ℓ | U | U | EA compatibility unit is upright. Not unified with mathematical script l. |
U+2115 | DOUBLE-STRUCK CAPITAL N | ℕ | U | R | Part of mathematical double-struck set |
U+2119 | DOUBLE-STRUCK CAPITAL P | ℙ | U | R | Part of mathematical double-struck set |
U+211A | DOUBLE-STRUCK CAPITAL Q | ℚ | U | R | Part of mathematical double-struck set |
U+211B | SCRIPT CAPITAL R | ℛ | U | R | Part of mathematical script set |
U+211C | BLACK-LETTER CAPITAL R | ℜ | U | R | Match other math letters |
U+211D | DOUBLE-STRUCK CAPITAL R | ℝ | U | R | Part of mathematical double-struck set |
U+2124 | DOUBLE-STRUCK CAPITAL Z | ℤ | U | R | Part of mathematical double-struck set |
U+2126 | OHM SIGN | Ω | U | U | EA compatibility unit is upright. NFC-folds to omega |
U+2128 | BLACK-LETTER CAPITAL Z | ℨ | U | R | Match other math letters |
U+212A | KELVIN SIGN | K | U | U | EA compatibility unit is upright. NFC-folds to K |
U+212B | ANGSTROM SIGN | Å | U | U | EA compatibility unit is upright. NFC-folds to Aring |
U+212C | SCRIPT CAPITAL B | ℬ | U | R | Part of mathematical script set |
U+212D | BLACK-LETTER CAPITAL C | ℭ | U | R | Match other math letters |
U+212F | SCRIPT SMALL E | ℯ | U | R | Part of mathematical script set |
U+2130 | SCRIPT CAPITAL E | ℰ | U | R | Part of mathematical script set |
U+2131 | SCRIPT CAPITAL F | ℱ | U | R | Part of mathematical script set |
U+2132 | TURNED CAPITAL F | Ⅎ | U | R | Claudian must match Latin |
U+2133 | SCRIPT CAPITAL M | ℳ | U | R | Part of mathematical script set |
U+2134 | SCRIPT SMALL O | ℴ | U | R | Part of mathematical script set |
U+2139 | INFORMATION SOURCE | ℹ | U | U | Symbolic, not math |
U+213C | DOUBLE-STRUCK SMALL PI | ℼ | U | R | Match double-struck Latin |
U+213D | DOUBLE-STRUCK SMALL GAMMA | ℽ | U | R | Match double-struck Latin |
U+213E | DOUBLE-STRUCK CAPITAL GAMMA | ℾ | U | R | Match double-struck Latin |
U+213F | DOUBLE-STRUCK CAPITAL PI | ℿ | U | R | Match double-struck Latin |
U+2135 | ALEF SYMBOL | ℵ | U | R | Math symbol |
U+2136 | BET SYMBOL | ℶ | U | R | Math symbol |
U+2137 | GIMEL SYMBOL | ℷ | U | R | Math symbol |
U+2138 | DALET SYMBOL | ℸ | U | R | Math symbol |
U+2145 | DOUBLE-STRUCK ITALIC CAPITAL D | ⅅ | U | R | Math symbol, match double-struck Latin |
U+2146 | DOUBLE-STRUCK ITALIC SMALL D | ⅆ | U | R | Math symbol, match double-struck Latin |
U+2147 | DOUBLE-STRUCK ITALIC SMALL E | ⅇ | U | R | Math symbol, match double-struck Latin |
U+2148 | DOUBLE-STRUCK ITALIC SMALL I | ⅈ | U | R | Math symbol, match double-struck Latin |
U+2149 | DOUBLE-STRUCK ITALIC SMALL J | ⅉ | U | R | Math symbol, match double-struck Latin |
U+214E | TURNED SMALL F | ⅎ | U | R | Claudian must match Latin |
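The "NFC-folds to" notes for OHM SIGN, KELVIN SIGN and ANGSTROM SIGN can be checked with Python's standard `unicodedata` module. This is a small verification sketch; the helper name `nfc_target` is made up for this example.

```python
import unicodedata


def nfc_target(code_point):
    """Return the code point that a single character becomes under NFC."""
    folded = unicodedata.normalize("NFC", chr(code_point))
    # Every input used below normalizes to exactly one character.
    assert len(folded) == 1
    return ord(folded)


for cp in (0x2126, 0x212A, 0x212B, 0x2113):
    target = nfc_target(cp)
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))} -> U+{target:04X}")
```

OHM SIGN, KELVIN SIGN and ANGSTROM SIGN come back as U+03A9, U+004B and U+00C5, while SCRIPT SMALL L is unchanged. A UA that normalizes text before layout therefore cannot tell the three unit signs from the ordinary letters.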