====== Letters and Numbers Orientation By Codepoint ====== This page is intended to help analyze Unicode wrt text orientation. It is not comprehensive at all yet. Category Codes: ^Code^UTR50^MSFT^Meaning^ ^U|U|S|Upright; translates between horizontal and vertical| ^R|S|R|Sideways; rotates between horizontal and vertical| ^TU|T|ST|Typeset upright with alternate glyph. Best fallback is just upright.| ^TR|SB|RT|Typeset upright with alternate glyph. Best fallback is just sideways.| Two modes are presented: Stacking (''text-orientation: upright'') and Default (TBD). ===== Letters (L*) and Script-Specific Numbers (N*) ===== Letters and script-specific (non-Common) numbers are classified by using their script property (including [[http://www.unicode.org/Public/UNIDATA/ScriptExtensions.txt|Script Extensions property]]). [[numbers|Common numbers]] are listed separately. ^Code^Name^Stack^Mixed^Memo^UTR^MS^ |Bopo|Bopomofo|U|U| | |Brai|Braille|U|R|:?: Checking with DAISY but haven't got back yet. Most resources say Braille cannot flow vertical. [[http://www.design-thinking.jp/2011/04/blog-post.html|This page]] indicates [[http://en.wikipedia.org/wiki/Sanada_Yukimura|Yukimura Sanada]] developed vertical Braille as R in 16th century, but this is probably different from the today's Braille. [[http://www6.ocn.ne.jp/~takut/tenjiehon.html|This book]] has vertical modern Braille, but can't identify if it's U or R from the picture. A definite scan of Mongolian Braille, however, shows that it is R.| |Egyp|Egyptian Hieroglyphs|U|U|Egypgian hieroglyhs are upright when written in columns| |Hira|Hiragana|U|U| | |Kana|Katakana|U|U|Unclear whether halfwidth katakana should be upright or sideways; voice marks are broken if set upright.| |Hani|Han|U|U| | |Hang|Hangul|U|U| | |Lisu|Lisu|U|R|Lisu-script characters are used intermixed with Latin, so their orientations **must** match|UR|UU| |Merc|Meroitic Cursive|U|U|Egypgian hieroglyhs are upright when written in columns|UR|UR| |Mero|Meroitic Hieroglyphs|U|U|Egypgian hieroglyhs are upright when written in columns|UR|UU| |Mong|Mongolian|V|V|Mongolian in Unicode code chart shows vertical glyphs and most font today has glyphs in 90 degree CCW rotated, so they are U from Unicode point of view, but R from UA point of view. Call it V.| |Ogam|Ogham|R|R| | |Orkh|Old Turkic|R|R|Old Turkic has a strong tradition of vertical writing. Unclear whether it rotates clockwise or counter-clockwise, but it definitely rotates.| |Phag|Phags Pa|V|V|Same as Mongolian.| |Yiii|Yi|U|U|Old documents show Yi rotated sideways (as vertical script), but one example of modern Yi (typeset horizontally) uses upright-stacked captions| |Arab|Arabic|U|R|:?: Still debating how to handle cursive RTL in stacked mode|UR|RR| |Mand|Mandaic|U|R|:?: Still debating how to handle cursive RTL in stacked mode|UR|RR| |Miao|Maio|U|R|:?: Needs some research to determine whether U/R or U/U|UR|UU| |Syrc|Syriac|U|R|:?: Still debating how to handle cursive RTL in stacked mode|UR|RR| |--|Canadian_Aboriginal|U|R|:?: UTR#50 has U/U, unclear why|UU|UU| |Oriya, Telugu, Kannada, Malayalam, Sinhala, Myanmar, Khmer, Tai_Tham, Javanese, Cham||U|R|:?: Unclear why MSFT chose R/R, seems wrong|UR|RR| |Linear_B, Ugaritic, Old_Persian, Avestan||U|R|:?: Unclear why MSFT chose U/U. Cuneiform in particular derives (via rotation) from vertical writing, so U/U seems an illogical choice|UR|UU| |--|All others|U|R|Unless the script has a vertical tradition, it is sideways in mixed mode and upright in stacked| There are some exceptions: ^Code^Description^Char^Stack^Mix^Memo^ |[[http://www.fileformat.info/info/unicode/char/30FC/index.htm|U+30FC]]|KATAKANA-HIRAGANA PROLONGED SOUND MARK|ー|TR|TR| | |[[http://www.fileformat.info/info/unicode/char/FF70/index.htm|U+FF70]]|HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK|ー|TR|R|:?: Halfwidth? | |U+FF61-FFDF, U+FFE8-FFEF||All halfwidth letters|U|R| Some interesting cases: * [[http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3AGeneral_Category%3D%2F%5EL%2F%3A%5D%26%5B%3AScript%3DCommon%3A%5D&g=|L* & Script=Common]] * [[http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3AGeneral_Category%3D%2F%5EN%2F%3A%5D%26%5B%3AScript%3DCommon%3A%5D&g=|N* & Script=Common]] ===== Letterlike Symbols Block Letters ====== See also [[:spec:utr50:symbols:textual|Symbols from this block]] and [[:spec:utr50:symbols:math|Math symbols from this block]] |U+2102|DOUBLE-STRUCK CAPITAL C|ℂ|U|R|Part of mathematical double-struck set| |U+2107|EULER CONSTANT|ℇ|U|R|Match PLANCK CONSTANT| |U+210A|SCRIPT SMALL G|ℊ|U|R|Part of mathematical script set| |U+210B|SCRIPT CAPITAL H|ℋ|U|R|Part of mathematical script set| |U+210C|BLACK-LETTER CAPITAL H|ℌ|U|R|Match other math letters| |U+210D|DOUBLE-STRUCK CAPITAL H|ℍ|U|R|Part of mathematical double-struck set| |U+210E|PLANCK CONSTANT|ℎ|U|R|Part of mathematical italic set| |U+210F|PLANCK CONSTANT OVER TWO PI|ℏ|U|R|Match PLANCK CONSTANT| |U+2110|SCRIPT CAPITAL I|ℐ|U|R|Part of mathematical script set| |U+2111|BLACK-LETTER CAPITAL I|ℑ|U|R|Match other math letters| |U+2112|SCRIPT CAPITAL L|ℒ|U|R|Part of mathematical script set| |U+2113|SCRIPT SMALL L|ℓ|U|U|EA compatibility unit is upright. Not unified with mathematical script l.| |U+2115|DOUBLE-STRUCK CAPITAL N|ℕ|U|R|Part of mathematical double-struck set| |U+2119|DOUBLE-STRUCK CAPITAL P|ℙ|U|R|Part of mathematical double-struck set| |U+211A|DOUBLE-STRUCK CAPITAL Q|ℚ|U|R|Part of mathematical double-struck set| |U+211B|SCRIPT CAPITAL R|ℛ|U|R|Part of mathematical script set| |U+211C|BLACK-LETTER CAPITAL R|ℜ|U|R|Match other math letters| |U+211D|DOUBLE-STRUCK CAPITAL R|ℝ|U|R|Part of mathematical double-struck set| |U+2124|DOUBLE-STRUCK CAPITAL Z|ℤ|U|R|Part of mathematical double-struck set| |U+2126|OHM SIGN|Ω|U|U|EA compatibility unit is upright. :!: NFC-folds to omega| |U+2128|BLACK-LETTER CAPITAL Z|ℨ|U|R|Match other math letters| |U+212A|KELVIN SIGN|K|U|U|EA compatibility unit is upright. :!: NFC-folds to K| |U+212B|ANGSTROM SIGN|Å|U|U|EA compatibility unit is upright. :!: NFC-folds to Aring| |U+212C|SCRIPT CAPITAL B|ℬ|U|R|Part of mathematical script set| |U+212D|BLACK-LETTER CAPITAL C|ℭ|U|R|Match other math letters| |U+212F|SCRIPT SMALL E|ℯ|U|R|Part of mathematical script set| |U+2130|SCRIPT CAPITAL E|ℰ|U|R|Part of mathematical script set| |U+2131|SCRIPT CAPITAL F|ℱ|U|R|Part of mathematical script set| |U+2132|TURNED CAPITAL F|Ⅎ|U|R|Claudian must match Latin| |U+2133|SCRIPT CAPITAL M|ℳ|U|R|Part of mathematical script set| |U+2134|SCRIPT SMALL O|ℴ|U|R|Part of mathematical script set| |U+2139|INFORMATION SOURCE|ℹ|U|U|Symbolic, not math| |U+213C|DOUBLE-STRUCK SMALL PI|ℼ|U|R|Match double-struck Latin| |U+213D|DOUBLE-STRUCK SMALL GAMMA|ℽ|U|R|Match double-struck Latin| |U+213E|DOUBLE-STRUCK CAPITAL GAMMA|ℾ|U|R|Match double-struck Latin| |U+213F|DOUBLE-STRUCK CAPITAL PI|ℿ|U|R|Match double-struck Latin| |U+2135|ALEF SYMBOL|ℵ|U|R|Math symbol| |U+2136|BET SYMBOL|ℶ|U|R|Math symbol| |U+2137|GIMEL SYMBOL|ℷ|U|R|Math symbol| |U+2138|DALET SYMBOL|ℸ|U|R|Math symbol| |U+2145|DOUBLE-STRUCK ITALIC CAPITAL D|ⅅ|U|R|Math symbol, match double-struck Latin| |U+2146|DOUBLE-STRUCK ITALIC SMALL D|ⅆ|U|R|Math symbol, match double-struck Latin| |U+2147|DOUBLE-STRUCK ITALIC SMALL E|ⅇ|U|R|Math symbol, match double-struck Latin| |U+2148|DOUBLE-STRUCK ITALIC SMALL I|ⅈ|U|R|Math symbol, match double-struck Latin| |U+2149|DOUBLE-STRUCK ITALIC SMALL J|ⅉ|U|R|Math symbol, match double-struck Latin| |U+214E|TURNED SMALL F|ⅎ|U|R|Claudian must match Latin|