This shows you the differences between two versions of the page.
at-text-transform [2011/12/01 07:41] – [The convert-predefined descriptor] florian | at-text-transform [2014/12/09 15:48] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | This is an early draft for a possible generic mechanism to allow authors to define custom text-transforms. | + | This page has moved. See [[ideas:at-text-transform|here]] |
- | + | ||
- | ====== Defining Custom Text Transforms: the @text-transform rule====== | + | |
- | + | ||
- | The general form of an @text-transform at-rule is: | + | |
- | + | ||
- | < | + | |
- | @text-transform < | + | |
- | { [ descriptor: value; ]+ } | + | |
- | </ | + | |
- | + | ||
- | The descriptors express the conversion from certain characters to other characters, using different mechanisms to specify the source and target characters. If several descriptors are used, the resulting transform is obtained by applying them all successively, in the order in which they appear in the @text-transform rule. | + |
- | + | ||
- | <note warning> | + | |
- | + | ||
- | Example: | + | |
- | + | ||
- | The following two transforms are identical. | + | |
- | + | ||
- | < | + | |
- | @text-transform abcdef1 | + |
- | { | + | |
- | convert: " | + | |
- | } | + | |
- | + | ||
- | @text-transform abcdef2 | + |
- | { | + | |
- | convert: " | + | |
- | convert: " | + | |
- | convert: " | + | |
- | } | + | |
- | </ | + | |
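The ordering rule can be modelled outside CSS. The following Python sketch is only an illustration under stated assumptions: `make_convert` and `apply_in_order` are hypothetical helper names, and the strings are placeholders rather than the values used in the example. It shows that one convert step and the same mapping split across three steps agree, and that the order of steps matters when one step produces a character that a later step consumes.

```python
def make_convert(src: str, dst: str):
    """Build one 'convert' step mapping src[i] to dst[i] (hypothetical helper)."""
    table = str.maketrans(dict(zip(src, dst)))
    return lambda text: text.translate(table)


def apply_in_order(text: str, steps) -> str:
    """Apply every step to the text, in the order the steps are listed."""
    for step in steps:
        text = step(text)
    return text


# One step versus the same mapping split into three steps.
single = [make_convert("abcdef", "ABCDEF")]
split = [make_convert("ab", "AB"), make_convert("cd", "CD"), make_convert("ef", "EF")]

# Two steps where the output of the first is an input of the second.
chained = [make_convert("a", "b"), make_convert("b", "c")]

print(apply_in_order("a faded cab", single))          # A FADED CAB
print(apply_in_order("a faded cab", split))           # A FADED CAB
print(apply_in_order("ab", chained))                  # cc
print(apply_in_order("ab", list(reversed(chained))))  # bc
```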
- | + | ||
- | ===== The convert descriptor ===== | + | |
- | + | ||
- | < | + | |
- | Name: convert | + | |
- | Value: < | + | |
- | default: N/A | + | |
- | </ | + | |
- | + | ||
- | This descriptor creates a 1 to 1 mapping from the characters in the first string to the characters in the second string. | + | |
- | <note warning> | + | |
- | + | ||
- | Both strings should be of equal length. If they are not, the longer one is truncated to the same length as the shorter one. | + |
- | <note warning> | + | |
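As a rough model of the convert descriptor, the Python sketch below builds the 1 to 1 mapping and applies the truncation rule for strings of unequal length. The function name `convert_table` is hypothetical, and lengths are counted in Unicode code points, which is an assumption about how "length" is meant here.

```python
def convert_table(src: str, dst: str) -> dict:
    """Map the i-th character of src to the i-th character of dst.

    If the strings differ in length, the longer one is truncated to the
    length of the shorter one before the mapping is built.
    """
    n = min(len(src), len(dst))
    return str.maketrans(src[:n], dst[:n])


# "c" has no counterpart in the shorter string, so it is left untouched.
table = convert_table("abc", "XY")
print("abcde".translate(table))  # XYcde
```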
- | + | ||
- | ===== The convert-range descriptor ===== | + | |
- | + | ||
- | < | + | |
- | Name: convert-range | + | |
- | Value: < | + | |
- | default: N/A | + | |
- | </ | + | |
- | + | ||
- | It would sometimes be tedious to use the convert descriptor when the list of characters is long, but this can be simplified using convert-range when the characters' | + | |
- | + | ||
- | Each pair of strings defines a range of Unicode characters, inclusive of the endpoints listed. Each of the 4 strings must contain exactly one Unicode character. | + |
- | + | ||
- | < | + | |
- | + | ||
- | The numerical code point value of the character in the first (resp. third) string must be less than the one in the second (resp. fourth) string. If it is not, the descriptor must be ignored. Both ranges should be of equal length. If they are not, the longer one is truncated to the same length as the shorter one. The ranges may overlap. | + |
- | + | ||
- | + | ||
- | Example: | + | |
- | < | + | |
- | @text-transform latin-only-uppercase | + | |
- | { | + | |
- | convert-range: | + | |
- | } | + | |
- | </ | + | |
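The range rules can be sketched in Python as follows. This is a minimal model, not a reference implementation: `convert_range` is a hypothetical name, an ignored descriptor is represented by an empty mapping, and treating a string that is not exactly one character long as "ignored" is an assumption, since the text does not say what happens in that case. The a-z to A-Z ranges are likewise illustrative values.

```python
def convert_range(src_first: str, src_last: str,
                  dst_first: str, dst_last: str) -> dict:
    """Build a code point mapping from one inclusive range onto another."""
    bounds = (src_first, src_last, dst_first, dst_last)
    # Assumption: anything other than single-character strings is ignored.
    if any(len(s) != 1 for s in bounds):
        return {}
    a, b, c, d = (ord(s) for s in bounds)
    # The first (resp. third) code point must be less than the second
    # (resp. fourth); otherwise the descriptor is ignored.
    if a >= b or c >= d:
        return {}
    # The longer range is truncated to the length of the shorter one.
    n = min(b - a, d - c) + 1
    return {a + i: c + i for i in range(n)}


latin_upper = convert_range("a", "z", "A", "Z")
print("h\u00e9llo".translate(latin_upper))                     # HéLLO
print("abcd".translate(convert_range("a", "z", "A", "C")))     # ABCd
print(convert_range("z", "a", "A", "Z"))                       # {}
```

Only the 26 code points in the source range are affected, which is why the accented letter passes through unchanged.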
- | + | ||
- | ===== The convert-predefined descriptor ===== | + | |
- | + | ||
- | < | + | |
- | Name: convert-predefined | + | |
- | Value: < | + | |
- | default: N/A | + | |
- | </ | + | |
- | + | ||
- | This descriptor makes it possible to refer to existing text transforms, either predefined by CSS or defined by the author. While an @text-transform using only this descriptor is not very useful, combining it with other descriptors allows authors to extend or define variants of existing transforms. convert-predefined cannot refer to the text-transform whose definition it is part of. | + |
- | + | ||
- | + | ||
- | <note warning> | + | |
- | < | + | |
- | + | ||
- | <note warning> | + | |
- | < | + | |
- | Value: all | initial | + | |
- | Default: all | + | |
- | </ | + | |
- | It would let people define customized versions of text-transform: | + | |
- | </ | + | |
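One way to picture how convert-predefined combines with other descriptors is the Python sketch below. Everything in it is an assumption made for illustration: the rule names `shout` and `loop`, the representation of a rule as a list of `(descriptor, value)` pairs, the use of `str.upper` and `str.lower` as stand-ins for the predefined transforms, and the choice to raise an error on self-reference (the sketch rejects any reference cycle, which is broader than the restriction stated above).

```python
# Stand-ins for transforms predefined by CSS (illustrative only).
PREDEFINED = {"uppercase": str.upper, "lowercase": str.lower}


def run_transform(name: str, rules: dict, text: str, active: tuple = ()) -> str:
    """Apply the named transform, resolving convert-predefined references."""
    if name in active:
        raise ValueError(f"{name!r} refers to itself")
    if name not in rules:
        return PREDEFINED[name](text)
    for kind, value in rules[name]:
        if kind == "convert":
            src, dst = value
            text = text.translate(str.maketrans(src, dst))
        elif kind == "convert-predefined":
            text = run_transform(value, rules, text, active + (name,))
    return text


rules = {
    # Extend the predefined uppercase transform with one extra conversion.
    "shout": [("convert-predefined", "uppercase"), ("convert", (".", "!"))],
    # A rule that refers to itself, which is not allowed.
    "loop": [("convert-predefined", "loop")],
}

print(run_transform("shout", rules, "hello."))  # HELLO!
```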
- | + | ||
- | + | ||
- | + | ||
- | + | ||
- | ====== Use cases ====== | + | |
- | + | ||
- | ===== Single-language use cases ===== | + |
- | + | ||
- | The following use cases only apply to a single language. Defining all the possibly useful text-transforms for all languages would go beyond the capacity and expertise of the CSS WG. Having the generic mechanism allows authors to solve their specific problem. | + | |
- | + | ||
- | ==== Full-size kana ==== | + | |
- | In Japanese, small kana appearing within ruby are sometimes replaced by the equivalent full-size kana. The following transform defines this conversion. | + |
- | + | ||
- | < | + | |
- | @text-transform full-size-kana | + | |
- | { | + | |
- | convert: " | + | |
- | convert: " | + | |
- | convert: " | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | ==== German ß ==== | + | |
- | + | ||
- | As discussed [[http:// | + | |
- | + | ||
- | This letter being rather new, authors are bound to disagree whether it is a proper uppercase variant of U+00DF, or not. Those who think it is not may use text-transform: | + | |
- | + | ||
- | < | + | |
- | @text-transform german-uppercase | + | |
- | { | + | |
- | convert-predefined: | + | |
- | convert: " | + | |
- | } | + | |
- | + | ||
- | @text-transform german-lowercase | + | |
- | { | + | |
- | convert-predefined: | + | |
- | convert: " | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | ==== Turkish i/ı ==== | + | |
- | + | ||
- | http:// | + | |
- | + | ||
- | In Turkish and a few related languages, dotted and dotless i are distinct letters, both in upper and lower case. | + |
- | + | ||
- | The uppercasing and lowercasing algorithm defined for the text-transform property only preserves this distinction when the content language of the element is known. | + |
- | + | ||
- | Someone, for example in a user style sheet, may want to apply an uppercase or lowercase transform to a document where language is insufficiently marked up, but known to the author of the style sheet to be Turkish. In this case, the generic uppercase and lowercase transforms would fail, but the following would work. | + | |
- | + | ||
- | + | ||
- | < | + | |
- | @text-transform turkic-uppercase | + | |
- | { | + | |
- | convert: " | + | |
- | convert-predefined: | + | |
- | } | + | |
- | + | ||
- | @text-transform turkic-lowercase | + | |
- | { | + | |
- | convert: " | + | |
- | convert-predefined: | + | |
- | } | + | |
- | </ | + | |
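The same two-step idea (a targeted conversion followed by the generic case transform) can be tried in Python, whose `str.upper` and `str.lower` are language-independent and therefore show the same problem. This is a sketch under assumptions: the function names mirror the rule names above, but the character pairs are chosen here for illustration and are not taken from the rules themselves.

```python
# Dotted and dotless i, written as escapes to avoid encoding ambiguity:
#   U+0131 is dotless lowercase i, U+0130 is dotted uppercase I.
TO_UPPER = str.maketrans("i\u0131", "\u0130I")
TO_LOWER = str.maketrans("\u0130I", "i\u0131")


def turkic_uppercase(text: str) -> str:
    """Convert the Turkic i letters first, then apply generic uppercasing."""
    return text.translate(TO_UPPER).upper()


def turkic_lowercase(text: str) -> str:
    """Convert the Turkic I letters first, then apply generic lowercasing."""
    return text.translate(TO_LOWER).lower()


word = "\u0131\u015f\u0131k iyi"  # lowercase sample text
print(turkic_uppercase(word))
print(turkic_lowercase(turkic_uppercase(word)) == word)  # True
print(word.upper().lower() == word)  # False: the generic round trip loses the distinction
```

Running the conversion before the generic transform matters: once generic uppercasing has turned both i and dotless i into I, the distinction can no longer be recovered.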
- | + | ||
- | ==== Georgian upper/lower case ==== | + | |
- | + | ||
- | http:// | + | |
- | http:// | + | |
- | + | ||
- | The Georgian language has used three different unicameral alphabets through history: Asomtavruli, | + | |
- | + | ||
- | @text-transform Mkhedruli-to-Asomtavruli | + | |
- | { | + | |
- | convert: " | + | |
- | } | + | |
- | + | ||
- | @text-transform Asomtavruli-to-Mkhedruli | + | |
- | { | + | |
- | convert: " | + | |
- | } | + | |
- | + | ||
- | + | ||
- | ===== Cross-language use cases ===== | + | |
- | + | ||
- | The following use cases apply to several languages, but are rare enough that they are better addressed by authors when needed than by the CSS WG. | + |
- | + | ||
- | ==== Long s ==== | + | |
- | + | ||
- | http:// | + | |
- | http:// | + | |
- | + | ||
- | In old (18th century and earlier) European texts, the letter s, when at the beginning or in the middle of a word, was written ſ (U+017F). An s occurring at the end of a word was written as the modern s is. | + |
- | + | ||
- | Modern readers are often unfamiliar with this letter form, and for readability reasons, one may want to convert the long s to the modern s. The following transform would accomplish this. | + |
- | + | ||
- | < | + | |
- | @text-transform modernize-s | + | |
- | { | + | |
- | convert: " | + | |
- | } | + | |
- | </ | + | |
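For comparison, the equivalent single-character conversion in Python is a one-line replacement. The function name `modernize_s` simply mirrors the rule name above; the sample word is an illustrative choice.

```python
LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S


def modernize_s(text: str) -> str:
    """Replace every long s with the modern s; other characters are untouched."""
    return text.replace(LONG_S, "s")


print(modernize_s("Congre\u017fs"))  # Congress
```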
- | + | ||
- | ===== Miscellaneous ===== | + | |
- | + | ||
- | Here are some more examples of how the generic mechanism may be used. | + |
- | + | ||
- | ==== Comic book vikings ==== | + | |
- | In the " | + | |
- | + | ||
- | This effect could be obtained by the following transform: | + | |
- | + | ||
- | < | + | |
- | @text-transform fake-norse | + | |
- | { | + | |
- | convert: " | + | |
- | } | + | |
- | </ | + |