ideas:at-text-transform [2012/04/24 02:46] – [The character-type descriptor] florian | ideas:at-text-transform [2018/11/04 10:50] (current) – old revision restored (2018/09/25 17:09) fantasai | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | This is an early draft for a possible generic mechanism to allow authors to define custom text-transforms. | + | <note warning>This page used to hold a proposal |
- | ====== Defining Custom Text Transforms: | + | Wikis are good to sketch ideas, but not ideal to maintain specifications, |
- | The general form of an @text-transform at-rule is: | + | Feedback welcome on that page. You may also consult old version |
- | + | ||
- | <code css> | + | |
- | @text-transform < | + | |
- | { [ descriptor: value; ]+ } | + | |
- | </ | + | |
- | + | ||
- | < | + | |
- | + | ||
- | A text transform created using this at-rule | + | |
- | + | ||
- | + | ||
- | Each @text-transform rule specifies a value for every text-transform descriptor, either implicitly or explicitly. Those not given an explicit value in the rule take the initial value listed with each descriptor in this specification. These descriptors apply solely within the context of the @text-transform rule in which they are defined, and do not apply to document language elements. There is no notion of which elements the descriptors apply to or whether the values are inherited by child elements. When a given descriptor occurs multiple times in a given @text-transform rule, only the last specified value is used; all prior values for that descriptor must be ignored. | + | |
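
The resolution rule described above (initial values for omitted descriptors, last declaration wins) can be sketched in Python. This is only an illustration, not part of the proposal: the function name `resolve_descriptors` and the representation of a rule as a list of `(name, value)` pairs are assumptions, and the initial values are the defaults listed for `character-type` and `scope` on this page.

```python
# Initial values taken from the descriptor definitions on this page.
# "transformation" has no default (N/A), so it is absent unless declared.
INITIAL_VALUES = {"character-type": "extended", "scope": "all"}


def resolve_descriptors(declarations):
    """Return the effective descriptor values for one @text-transform rule.

    `declarations` is a list of (name, value) pairs in source order
    (a hypothetical, already-parsed form of the rule body).
    """
    resolved = dict(INITIAL_VALUES)
    for name, value in declarations:
        # A later declaration of the same descriptor replaces earlier ones.
        resolved[name] = value
    return resolved


rule = [("scope", "initial"), ("transformation", "uppercase"), ("scope", "final")]
effective = resolve_descriptors(rule)
```

Here `effective["scope"]` is `"final"` because the second `scope` declaration overrides the first, and `effective["character-type"]` falls back to `"extended"`.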
- | + | ||
- | ===== The transformation descriptor ===== | + | |
- | + | ||
- | <code bnf> | + | |
- | Name: transformation | + | |
- | Value: < | + | |
- | default: N/A | + | |
- | + | ||
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </code> | + | |
- | + | ||
- | This descriptor defines which character will be replaced by which, by listing a series of conversions, | + | |
- | + | ||
- | Conversions may refer to existing text transforms, either predefined by CSS or defined by the author. While a transformation using only a single such conversion is not very useful, combining it with other conversions allows authors to extend or define variants of existing transforms. Referring to the text-transform currently being defined is not allowed, and makes the whole descriptor invalid. | + | |
- | + | ||
- | Conversions may also define new mapping from one < | + | |
- | + | ||
- | When defined using a < | + | |
- | + | ||
- | A < | + | |
- | + | ||
- | If defined by an < | + | |
- | + | ||
- | In addition to [[http:// | + | |
- | + | ||
- | In a < | + | |
- | + | ||
- | <note warning> | + | |
- | + | ||
- | <note warning> | + | |
- | + | ||
- | <note warning> | + | |
- | the syntax should be.</ | + | |
- | + | ||
- | Examples: | + | |
- | + | ||
- | <code css> | + | |
- | @text-transform latin-only-uppercase | + | |
- | { | + | |
- | transformation: | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | The following two transforms are identical. | + | |
- | + | ||
- | <code css> | + | |
- | @text-transform abcdef1 | + | |
- | { | + | |
- | transformation: | + | |
- | } | + | |
- | @text-transform abcdef2 | + | |
- | { | + | |
- | transformation: | + | |
- | " | + | |
- | " | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | + | ||
- | ===== The character-type descriptor ===== | + | |
- | + | ||
- | <code bnf> | + | |
- | Value: extended | legacy | single | spaced | + | |
- | Default: extended | + | |
- | </ | + | |
- | + | ||
- | <note warning> | + | |
- | ISSUE 4: extended is proposed as the default, but is this the right choice? Maybe single would be more suitable | + | |
</ | </ | ||
- | |||
- | This definition affects what is meant by character processing in two different contexts: | ||
- | |||
- | * strings used as an < | ||
- | * the text to which the text-transform will be applied. | ||
- | |||
- | In an < | ||
- | |||
- | * extended: characters are extended grapheme clusters, as defined in [[http:// | ||
- | * legacy: characters are legacy grapheme clusters, as defined in [[http:// | ||
- | * single: characters are single Unicode code points. | ||
- | * spaced: characters are space separated sequences of Unicode code points. | ||
- | |||
- | How the text to which the text-transform property is applied must be processed also depends on the value of this descriptor. | ||
- | |||
- | When the value is ' | ||
- | |||
- | < | ||
- | Example: | ||
- | < | ||
- | @text-transform foo { | ||
- | character-type: | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | If the text to which the above text transform is applied contains the U+65 U+301 sequence (' | ||
- | |||
- | On the other hand, the following text transform would transform that same sequence into U+61 U+301 (' | ||
- | |||
- | < | ||
- | @text-transform foo { | ||
- | character-type: | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | </ | ||
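
The difference between processing by single code points and by clusters can be sketched in Python, using the U+65 U+301 sequence from the example above. This is a rough illustration under stated assumptions: the helper names are hypothetical, the mapping from "e" to "a" is chosen only to match the code points mentioned in the text, and the cluster splitting below is a simplification (a base character followed by combining marks), not the full grapheme cluster algorithm of Unicode.

```python
import unicodedata


def split_units(text, character_type):
    """Split text into the units a transform would match against."""
    if character_type == "single":
        # Every Unicode code point is its own unit.
        return list(text)
    # Simplified stand-in for grapheme clusters: attach each combining
    # mark to the unit that precedes it.
    units = []
    for ch in text:
        if units and unicodedata.combining(ch):
            units[-1] += ch
        else:
            units.append(ch)
    return units


def apply_mapping(text, mapping, character_type):
    """Replace each unit found in `mapping`; leave other units untouched."""
    units = split_units(text, character_type)
    return "".join(mapping.get(unit, unit) for unit in units)


source = "e\u0301"  # U+65 U+301
by_code_point = apply_mapping(source, {"e": "a"}, "single")
by_cluster = apply_mapping(source, {"e": "a"}, "extended")
```

With `"single"`, the "e" is matched on its own and the result is U+61 U+301. With cluster-based processing, the whole sequence is one unit that does not equal "e", so the text is left unchanged.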
- | |||
- | <note warning> ISSUE 5: | ||
- | Define the processing model on the text for the ' | ||
- | |||
- | < | ||
- | character-type: | ||
- | transformation: | ||
- | }</ | ||
- | </ | ||
- | |||
- | |||
- | <note warning> | ||
- | |||
- | |||
- | <note warning> | ||
- | |||
- | |||
- | ===== The scope descriptor ===== | ||
- | <code bnf> | ||
- | Value: all | [initial || medial || final] | ||
- | Default: all | ||
- | </ | ||
- | |||
- | This descriptor makes it possible to restrict which characters in the source text are affected by the transform. | ||
- | |||
- | * ' | ||
- | * ' | ||
- | * ' | ||
- | * ' | ||
- | |||
- | <note warning> | ||
- | |||
- | The definition of " | ||
- | |||
- | The transformation descriptor may be used to refer to existing text-transforms in the definition of a new one. If the text-transforms | ||
- | referred to have a different scope than the scope specified in the text-transform that refers to them, they apply at the intersection of the | ||
- | two scopes. | ||
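
The intersection rule can be sketched in Python by treating each scope value as a set of positions. The names `parse_scope` and `effective_scope` are hypothetical, and the sketch assumes that `all` is equivalent to listing `initial`, `medial` and `final` together.

```python
POSITIONS = frozenset({"initial", "medial", "final"})


def parse_scope(value):
    """Turn a scope descriptor value into a set of positions."""
    keywords = value.split()
    if keywords == ["all"]:
        return POSITIONS
    chosen = frozenset(keywords)
    # Reject empty values, unknown keywords and repeated keywords.
    if not chosen or not chosen <= POSITIONS or len(chosen) != len(keywords):
        raise ValueError(f"invalid scope value: {value!r}")
    return chosen


def effective_scope(referring, referenced):
    """Positions at which a referenced transform applies inside another one."""
    return parse_scope(referring) & parse_scope(referenced)


narrowed = effective_scope("initial", "all")
disjoint = effective_scope("initial", "medial final")
```

`narrowed` contains only `initial`, which is the situation of a capitalize-style transform built on an uppercase transform with the default scope. `disjoint` is empty, meaning the referenced transform would never apply.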
- | |||
- | Example: | ||
- | |||
- | <code css> | ||
- | @text-transform latin-only-uppercase | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | @text-transform latin-only-capitalize | ||
- | { | ||
- | transformation: | ||
- | scope: initial; | ||
- | } | ||
- | </ | ||
- | |||
- | ===== DOM interaction ===== | ||
- | |||
- | Custom text transform values defined within @text-transform rules are accessible via the following modifications to the CSS Object Model. | ||
- | |||
- | ==== Interface CSSRule ==== | ||
- | |||
- | The following additional rule type is added to the CSSRule interface. | ||
- | |||
- | === IDL Definition === | ||
- | < | ||
- | interface CSSRule { | ||
- | ... | ||
- | const unsigned short TEXT_TRANSFORM_RULE = 1000; | ||
- | ... | ||
- | }; | ||
- | </ | ||
- | |||
- | ==== Interface CSSTextTransformRule ==== | ||
- | |||
- | The CSSTextTransformRule interface represents a single @text-transform rule and its descriptors. | ||
- | |||
- | === IDL Definition === | ||
- | interface CSSTextTransformRule : CSSRule { | ||
- | attribute | ||
- | readonly attribute CSSStyleDeclaration style; | ||
- | }; | ||
- | | ||
- | === Attributes === | ||
- | |||
- | == name of type DOMString == | ||
- | This attribute is the name of the transform, used by the text-transform property. | ||
- | |||
- | == style of type CSSStyleDeclaration == | ||
- | This attribute represents all the descriptors associated with this text-transform. | ||
- | ====== Use cases ====== | ||
- | |||
- | ===== Single-language use cases ===== | ||
- | |||
- | The following use cases only apply to a single language. Defining all the possibly useful text-transforms for all languages would go beyond the capacity and expertise of the CSS WG. Having the generic mechanism allows authors to solve their specific problem. | ||
- | |||
- | ==== Full-size kana ==== | ||
- | In Japanese, small kana appearing within ruby are sometimes replaced by the equivalent full-size kana. The following transform defines this conversion: | ||
- | |||
- | <code css> | ||
- | @text-transform full-size-kana | ||
- | { | ||
- | transformation: | ||
- | " | ||
- | " | ||
- | } | ||
- | </ | ||
- | |||
- | ==== German ß ==== | ||
- | |||
- | As discussed [[http:// | ||
- | |||
- | This letter being rather new, authors are bound to disagree whether it is a proper uppercase variant of U+00DF, or not. Those who think it is not may use text-transform: | ||
- | |||
- | <code css> | ||
- | @text-transform german-uppercase | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | |||
- | @text-transform german-lowercase | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | <note warning> | ||
- | ISSUE 8: It has been suggested that overloading existing values with a language descriptor or selector would be better: <code css> | ||
- | { | ||
- | transformation: | ||
- | language: de; | ||
- | } | ||
- | </ | ||
- | { | ||
- | transformation: | ||
- | }</ | ||
- | </ | ||
- | |||
- | ==== Turkish i/ı ==== | ||
- | |||
- | http:// | ||
- | |||
- | In Turkish and a few related languages, dotted and dotless i are distinct letters, both in upper and lower case. | ||
- | |||
- | The uppercasing and lowercasing algorithms defined for the text-transform property preserve this distinction only when the content language of the element is known. | ||
- | |||
- | Someone, for example in a user style sheet, may want to apply an uppercase or lowercase transform to a document where language is insufficiently marked up, but known to the author of the style sheet to be Turkish. In this case, the generic uppercase and lowercase transforms would fail, but the following would work. | ||
- | |||
- | <code css> | ||
- | @text-transform turkic-uppercase | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | |||
- | @text-transform turkic-lowercase | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | |||
- | ==== Georgian upper/lower case ==== | ||
- | |||
- | http:// | ||
- | http:// | ||
- | |||
- | The Georgian language has used three different unicameral alphabets through history: Asomtavruli, | ||
- | |||
- | <code css> | ||
- | @text-transform Mkhedruli-to-Asomtavruli | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | |||
- | @text-transform Asomtavruli-to-Mkhedruli | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | |||
- | ===== Cross-language use cases ===== | ||
- | |||
- | The following use cases apply to several languages, but are rare enough that they are better addressed by authors when needed than by the CSS WG. | ||
- | |||
- | ==== Long s ==== | ||
- | |||
- | http:// | ||
- | http:// | ||
- | |||
- | In old (18th century and earlier) European texts, the letter s, when in the middle or at the beginning of a word, was written ſ (U+017F). An s occurring at the end of a word was written like the modern s. | ||
- | |||
- | Modern readers are often unfamiliar with this letter form, and for readability reasons, one may want to convert from one to the other. The following transform would accomplish this. | ||
- | |||
- | <code css> | ||
- | @text-transform modernize-s | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | |||
- | This does the opposite transform: | ||
- | |||
- | <code css> | ||
- | @text-transform long-s | ||
- | { | ||
- | transformation: | ||
- | scope: initial medial; | ||
- | } | ||
- | </ | ||
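
The effect of restricting a transform to initial and medial positions can be sketched in Python. This is an approximation under stated assumptions: the function name `long_s` is hypothetical, a "word" is taken to be a run of letters, and the last letter of every word (including a one-letter word) is treated as final, which the draft does not settle.

```python
import re

LONG_S = "\u017f"  # ſ


def long_s(text):
    """Replace s with ſ everywhere except at the end of a word."""

    def convert(match):
        word = match.group(0)
        # Initial and medial letters are eligible; the last one is final.
        return word[:-1].replace("s", LONG_S) + word[-1]

    # [^\W\d_]+ matches runs of letters, used here as a stand-in for words.
    return re.sub(r"[^\W\d_]+", convert, text)


converted = long_s("success is sweet")
```

In this sketch, "success" keeps only its last s in the modern form, "is" is unchanged because its s is word-final, and "sweet" gets a long s at the start.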
- | |||
- | ===== Miscellaneous ===== | ||
- | |||
- | Here are some more examples of how the generic mechanism may be used. | ||
- | |||
- | ==== Transliteration ==== | ||
- | |||
- | Most writing systems of the world have at least one common transliteration scheme into the roman script. | ||
- | |||
- | <code css romanization.css> | ||
- | @text-transform romanization | ||
- | { | ||
- | character-type: | ||
- | /* ISO 9 (Cyrillic) */ | ||
- | transformation: | ||
- | Ж ж Ӂ ӂ Ӝ ӝ Җ җ З з Ӟ ӟ Ѕ ѕ Ӡ ӡ И и Ӥ ӥ І і Ї ї Й й Ј ј К к Қ қ Ҟ ҟ Л л Љ љ | ||
- | М м Н н Њ њ Ҥ ҥ Ң ң О о Ӧ ӧ Ө ө П п Ҧ ҧ Р р С с Ҫ ҫ Т т Ҭ ҭ Ћ ћ Ќ ќ | ||
- | У у У́ у́ Ў ў Ӱ ӱ Ӳ ӳ Ү ү Ф ф Х х Ҳ ҳ Һ һ Ц ц Ҵ ҵ Ч ч Ӵ ӵ Ҷ ҷ Џ џ Ш ш Щ щ | ||
- | Ъ ъ ’ Ы ы Ӹ ӹ Ь ь Э э Ю ю Я я Ѣ ѣ Ѫ ѫ Ѳ ѳ Ѵ ѵ Ҩ ҩ" | ||
- | to "A a Ă ă Ä ä A̋ a̋ B b V v G g G̀ g̀ Ğ ğ Ġ ġ D d Đ đ Ǵ ǵ E e Ë ë Ĕ ĕ Ê ê C̆ c̆ Ç̆ ç̆ | ||
- | Ž ž Z̆ z̆ Z̄ z̄ Ž̦ ž̧ Z z Z̈ z̈ Ẑ ẑ Ź ź I i Î î Ì ì Ï ï J j J̌ ǰ K k Ķ ķ K̄ k̄ L l L̂ l̂ | ||
- | M m N n N̂ n̂ Ṅ ṅ Ṇ ṇ O o Ö ö Ô ô P p Ṕ ṕ R r S s Ç ç T t Ţ ţ Ć ć Ḱ ḱ | ||
- | U u Ú ú Ŭ ŭ Ü ü Ű ű Ù ù F f H h Ḩ ḩ Ḥ ḥ C c C̄ c̄ Č č C̈ c̈ Ç ç D̂ d̂ Š š Ŝ ŝ | ||
- | ʺ ʺ ‵ Y y Ÿ ÿ ʹ ʹ È è Û û Â â Ě ě Ǎ ǎ F̀ f̀ Ỳ ỳ Ò ò", | ||
- | /* ISO 843 (Greek) */ | ||
- | "Α α Ά ά Β β Γ γ Δ δ Ε ε Έ έ Ζ ζ Η η Ή ή Θ θ Ι ι Ί ί Ϊ ϊ ΐ Κ κ Λ λ Μ μ | ||
- | Ν ν Ξ ξ Ο ο Ό ό Π π Ρ ρ Σ σ ς Τ τ Υ υ Ύ ύ Ϋ ϋ Φ φ Χ χ Ψ ψ Ω ω Ώ ώ" | ||
- | to "A a Á á V v G g D d E e É é Z z Ī ī Ī́ ī́ Th th I i Í í Ï ï ḯ K k L l M m | ||
- | N n X x O o Ó ó P p R r S s s T t Y y Ý ý Ÿ ÿ F f Ch ch Ps ps Ō ō Ṓ ṓ"; | ||
- | } | ||
- | </ | ||
- | |||
- | ==== Comic book vikings ==== | ||
- | In the " | ||
- | |||
- | This effect could be obtained by the following transform: | ||
- | |||
- | <code css> | ||
- | @text-transform fake-norse | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | </ | ||
- | |||
- | ==== Leet speak ==== | ||
- | In Internet, hacker and gamer culture, it is common to replace characters with other characters or character sequences that have a somewhat similar appearance. Although no single agreed convention exists and the mappings are sometimes neither injective nor surjective, one could simulate this playful style with a transform like the following: | ||
- | |||
- | <code css> | ||
- | @text-transform leet-speak | ||
- | { | ||
- | transformation: | ||
- | } | ||
- | </ |