<html>
<div style="color: red; font-size: 20px; border: 2px solid red; padding: 10px; line-height: 1.5; text-align: center;">
<strong>
This page has been deprecated and is no longer being maintained.
<br>For up-to-date information on contributing and authoring CSS test suites, see:
<br><a href="http://testthewebforward.org/docs/reftests.html">http://testthewebforward.org/docs/reftests.html</a>
</strong>
</div>
</html>
Reftests
A reftest is a test that compares the visual output of one file (the testcase) with the output of one or more other files (the references). Unlike the standard self-describing tests, reftests can be scripted to run and report results automatically.
A test can be both a self-describing test and a reftest at the same time. This is preferable, since it allows for both machine comparison and manual verification, which is particularly useful if the test and the reference both render incorrectly in the same way!
Here is an example of a reftest that is also a self-describing test:
- TEST
- The test file applies a transform to an SVG element using translate(50 50). When transformed properly, a red element on the page will be hidden from view.
- REF
- The reference file achieves the intended rendering by using an svg element with x=50 and y=50 and no transform.
In some cases, a test cannot be a reftest. For example, there is no way to create a reference for underlining, since the position and thickness of the underline depend on the UA, the font, and/or the platform. In such cases, a self-describing test must be used. However, once it's established that underlining an inline element works, it's possible to construct a reftest for underlining a block element, by constructing a reference using underlines on a <span> that wraps all the content inside the block.
Components of a Reftest
A reftest has three parts:
- test file
- The test file. This must follow the CSS test format guidelines.
- reference file
- This is a different, usually simpler, file that results in the same rendering as the test. The reference file must not use the same features that are being tested. Sometimes more than one reference file is required.
- reftest comparison
- One or more reference links that say which files are to be compared and whether they are to render identically or differently.
The Reftest Test File
The test file uses the technology to be tested. This file must follow the CSS test format guidelines.
In addition to matching a reftest reference, the test may also function as a self-describing test. This is preferred because having the description lets the tester check that the test and the reference are not both being rendered incorrectly and triggering a false pass, and because designing it for an obvious fail makes it easier to find what went wrong when the reftest does fail.
If the test must perform some processing before a comparison can be made to the reference, add class=reftest-wait to the root element, and remove it when the comparison can be made.
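As a rough illustration of how a harness might notice this marker before deciding when to take its screenshot, the Python sketch below checks whether the root element of a test carries the reftest-wait class. The names waits_for_script and _RootClassParser are hypothetical, and a real harness would watch the live document until the class is removed rather than inspect static markup.

```python
from html.parser import HTMLParser


class _RootClassParser(HTMLParser):
    """Records the class list of the first element seen (the root element)."""

    def __init__(self):
        super().__init__()
        self.root_classes = None  # stays None until the first start tag

    def handle_starttag(self, tag, attrs):
        if self.root_classes is None:
            # A missing or valueless class attribute yields an empty list.
            self.root_classes = (dict(attrs).get("class") or "").split()


def waits_for_script(markup: str) -> bool:
    """Return True if the root element has the reftest-wait class."""
    parser = _RootClassParser()
    parser.feed(markup)
    parser.close()
    return "reftest-wait" in (parser.root_classes or [])
```

Only the root element is inspected, so a reftest-wait class placed on body or any other element is ignored by this check.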
The Reftest Reference File
The reference file uses a different method to produce the same rendering as the test file. Multiple tests can (and often do) share the same reference file.
References should be named after the earliest test that uses them in the test-topic series they belong to, and must have either -ref or -notref appended to the name. Variations on a reference can be denoted by appending additional suffixes after -ref or -notref. If present, such a suffix must either be entirely numeric (e.g. file-ref002.html) or be separated by a dash (e.g. file-ref-a.html). Depending on the test suite, references may be placed in the reference subdirectory of the main test directory or directly in the main test directory. If they are placed in the reference subdirectory then the -ref or -notref suffix may be omitted from their filename.
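The naming rule can be checked mechanically. The following Python sketch is one possible reading of it, not an official validator: the function name is_valid_reference_name is hypothetical, the allowed characters in a dash-separated suffix are an assumption, and the exemption for files inside the reference subdirectory is deliberately not modelled.

```python
import re

# Assumed reading of the rule: <name>-ref or <name>-notref, optionally followed
# by an all-numeric suffix or by one or more dash-separated suffixes.
_REFERENCE_NAME = re.compile(
    r"""
    .+                            # name of the test the reference is named after
    -(?:ref|notref)               # mandatory -ref or -notref marker
    (?:\d+|(?:-[A-Za-z0-9]+)+)?   # optional suffix: digits only, or dash-separated
    \.[A-Za-z0-9]+                # file extension
    """,
    re.VERBOSE,
)


def is_valid_reference_name(filename: str) -> bool:
    """Return True if filename follows the -ref / -notref naming convention."""
    return _REFERENCE_NAME.fullmatch(filename) is not None
```

Under this reading, file-ref002.html and file-ref-a.html are accepted, while file-refa.html is rejected because its suffix is neither numeric nor dash-separated.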
In some cases when creating the reference file, it is necessary to use features that, although different from the tested features, may themselves fail in such a manner as to cause the reference to render identically to a failed test. When this is the case, in order to reduce the possibility of false positive testing outcomes, multiple reference files should be used, each using a different technique to render the reference. One possibility is to create one or more references that must not match the test file, that is, files that render in the same manner as a failed test.
In other cases, the specification under test may allow multiple possible renderings. In this situation references must be supplied for each allowed rendering.
For example, if two self-describing tests list-style-type-003.xht and list-style-type-004.xht share the same reference, that reference could be named list-style-type-003-ref.xht.
Like the format for the test file, the reftest reference format is also XHTML or HTML5 in UTF-8 with bitmap images in PNG format. Unlike the format for the test file, there is no metadata except for the author credits and optionally reference links, reviewer information, and requirement flags. Specification links must NOT be present in reference files.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>CSS Reftest Reference</title>
<link rel="author" title="NAME_OF_AUTHOR" href="mailto:EMAIL OR http://CONTACT_PAGE"/>
<meta name="flags" content="[requirement flags]" />
<style type="text/css"><![CDATA[
CSS FOR REFERENCE
]]></style>
</head>
<body>
CONTENT OF REFERENCE
</body>
</html>
Common References
There are several common references, such as those used for parsing and selectors tests. Their names begin with ref- so they can be easily found in the reftest directory. Email public-css-testsuite@w3.org if you would like to add to the common references collection.
The Reftest Comparison Links
In order to designate which files are to be compared to the test file, and the nature of the comparison, the test file must have one or more links to the reference files as described in the test format.
If multiple reference files must be matched, each reference file should, in turn, link to the next reference.
If multiple renderings are conforming, each possible rendering should have its own reference file linked from the test file.
In cases where it's likely for the test and the reference to misrender identically, the test should also have one (or more) mismatch references.
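To make the linking concrete, here is a small Python sketch that collects the reference links from a test file. It assumes the convention from the test format documentation (not spelled out on this page) that references are declared with link elements whose rel is match or mismatch; the class name ReferenceLinkParser and the file names in the sample are hypothetical.

```python
from html.parser import HTMLParser


class ReferenceLinkParser(HTMLParser):
    """Collects (rel, href) pairs for match and mismatch reference links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        href = attributes.get("href")
        if rel in ("match", "mismatch") and href:
            self.links.append((rel, href))


SAMPLE_TEST = """
<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <title>CSS Test: hypothetical example</title>
  <link rel="author" title="NAME_OF_AUTHOR" href="http://example.org/"/>
  <link rel="match" href="reference/example-001-ref.xht"/>
  <link rel="mismatch" href="reference/example-001-notref.xht"/>
 </head>
 <body><p>Test passes if there is no red.</p></body>
</html>
"""

parser = ReferenceLinkParser()
parser.feed(SAMPLE_TEST)
parser.close()
reference_links = parser.links
```

After this runs, reference_links holds one match entry and one mismatch entry, and the author link is ignored.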
The Reftest Manifest
Note: The use of reftest manifest files in the test source is deprecated in favor of reference links.
The reftest manifest is a plain text file that lists test-reference pairs for comparison. The test build process produces reftest manifests as needed for input into testing tools.
Each line starts with == to indicate equality or != to indicate inequality. This is followed by a space, the relative path to the test file, another space, and the relative path to the reference.
If a test has multiple == references then at least one of those references must match the test. If a test has multiple != references, then none of those references may match the test. The reference file may also have entries in the manifest: in this case, the renderings of the references must match each other as well for the test to be considered passed.
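The pass rule for a test with several direct references can be written as a few lines of Python. This is a minimal sketch under stated assumptions: the function name reftest_passes is hypothetical, each boolean says whether a reference rendered identically to the test, and manifest entries between references themselves are not modelled.

```python
from typing import Iterable


def reftest_passes(
    equal_to_match_refs: Iterable[bool],
    equal_to_mismatch_refs: Iterable[bool],
) -> bool:
    """Combine per-reference comparison results into one verdict.

    equal_to_match_refs: for each == reference, did it render like the test?
    equal_to_mismatch_refs: for each != reference, did it render like the test?
    """
    match_results = list(equal_to_match_refs)
    mismatch_results = list(equal_to_mismatch_refs)
    # At least one == reference must match (vacuously true if there are none).
    match_ok = any(match_results) if match_results else True
    # No != reference may render identically to the test.
    mismatch_ok = not any(mismatch_results)
    return match_ok and mismatch_ok
```

For example, a test with two == references of which only the second matches, and one != reference that differs, is a pass: reftest_passes([False, True], [False]) returns True.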
White space followed by # indicates the start of a comment that runs until the end of the line. A line starting with # is also a comment.
The reftest manifest should be named reftest.list and placed in the reference subdirectory of the main test directory.
Here is an example of a reftest manifest.
# reftest.list snippet
== ../test-topic-000.xht test-topic-000-ref.xht
== ../test-topic-001.xht test-topic-001-ref.xht
== ../test-topic-002.xht test-topic-000-ref.xht # note same reference as test 000
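A manifest in this format is simple to read programmatically. The Python sketch below parses the W3C-style lines described above; the names ManifestEntry and parse_manifest are hypothetical, and Mozilla-style annotations are not handled.

```python
import re
from typing import List, NamedTuple


class ManifestEntry(NamedTuple):
    must_match: bool  # True for ==, False for !=
    test: str         # relative path to the test file
    reference: str    # relative path to the reference file


# A comment starts at "#" when it opens the line or follows white space.
_COMMENT = re.compile(r"(^|\s)#.*$")


def parse_manifest(text: str) -> List[ManifestEntry]:
    """Parse reftest manifest text into a list of entries."""
    entries = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = _COMMENT.sub("", raw).strip()
        if not line:
            continue  # blank line or comment-only line
        parts = line.split()
        if len(parts) != 3 or parts[0] not in ("==", "!="):
            raise ValueError(f"line {line_number}: cannot parse {raw!r}")
        entries.append(ManifestEntry(parts[0] == "==", parts[1], parts[2]))
    return entries


SAMPLE = """\
# reftest.list snippet
== ../test-topic-000.xht test-topic-000-ref.xht
== ../test-topic-001.xht test-topic-001-ref.xht
== ../test-topic-002.xht test-topic-000-ref.xht # note same reference as test 000
"""

entries = parse_manifest(SAMPLE)
```

Parsing the sample yields three entries, the first and third of which share test-topic-000-ref.xht as their reference.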
Mozilla's manifest format, which is more-or-less a superset of the W3C format, allows annotations such as whether the test is expected to pass or fail. These can be useful when setting up automated regression testing.
Converting to Reftest
Most of the CSS2.1 tests are self-describing tests that could be reftests, but are not. (They were written before the reftest format was adopted.) They are slowly being converted into reftests, and your help in this effort is welcome. Some guidelines are offered below:
If the test uses Ahem, make sure its font size is a multiple of 5px, otherwise it may render inconsistently (due to rounding errors introduced by certain platforms' font APIs).
If multiple tests can share a reference, share the reference. (This allows for some optimizations in the test runs.)
If multiple files have almost but not quite the same rendering, and could almost but not quite render identically and thus share a reference, you may tweak the tests to render identically provided that this does not affect the tests' precision or correctness.
48 unreftestable tests
The following is a list of 48 CSS2.1 tests that appear to be unreftestable:
[RC6] c414-flt-000 Reasons why: if the viewport is 640px wide or so, then the “Blue rectangle” text should flow around and below each teal block. A reftest that copes with viewport resizing without JavaScript would be very hard to write; I do not think it can be done. Also, positioning floated-left and floated-right boxes inside the same container seems impossible to do.
[RC6] float-001 Reasons why: Text flowing around a floated box is very difficult to reftest.
[RC6] float-002 Reasons why: Text flowing around a floated box is very difficult to reftest.
[RC6] c5523-width-000 Reasons why: the position of the baseline in a line box is not predictable when an inline box is taller than the set line height. “The inline-level boxes are aligned vertically according to their 'vertical-align' property. (…) If such boxes are tall enough, there are multiple solutions and CSS 2.1 does not define the position of the line box's baseline” (CSS 2.1, section 10.8, Line height calculations).
[RC6] c5524-height-000 Reasons why: the position of the baseline in a line box is not predictable when an inline box is taller than the set line height. “The inline-level boxes are aligned vertically according to their 'vertical-align' property. (…) If such boxes are tall enough, there are multiple solutions and CSS 2.1 does not define the position of the line box's baseline” (CSS 2.1, section 10.8, Line height calculations).
[RC6] margin-collapse-164 Reasons why: As of today, March 1st 2012, it is still not established officially if the test is valid or invalid, correct or incorrect.
[RC6] floats-151 Reasons why: the test itself is difficult to understand. The negative margin-left (-10em) is not actually applied any wider than the width of the PASS text node. It is possible to do a reftest but maybe not one that will be accurate and precise in all browsers.
[RC6] c5525-fltwrap-000 Reasons why: The main reason is that the content of the central column should flow around the left column and then below it once past the bottom of the left column. This is impossible to simulate with a table or with DHTML layers. Secondary reasons: 1- fractional pixels may be rendered differently with properties using different elements (positioned DHTML layers, floated blocks, table cells) 2- the 8px margin bottom of body element is difficult to simulate when using positioned DHTML layers.
Links, screenshot and tentatives-reftests of c5525-fltwrap-000
[nightly-unstable] font-weight-016 Reasons why: it's probably impossible to create a reftest for this, as it is impossible to know in advance how font-weight will affect the width of characters. It varies from one browser to another.
[nightly-unstable] font-size-119 Reasons why: font-size keywords cannot be reliably converted into font-size pixels. So, there is no way to know how tall the document box is, and the presence of a vertical scrollbar and its relative position are therefore undefined.
[RC6] c24-first-lttr-000 Reasons why: browsers may apply different line-height value to :first-letter pseudo-element depending on font used. When using DejaVu Serif font, Firefox 11.0 and Chrome 18.0.1025.142 apply line-height: 1.17 while Opera 11.62 applies line-height: 1.18. When using FreeSerif font, all 3 browsers use the same computed line-height value so that there is no need to specify it.
[nightly-unstable] inherit-004 Reasons why: “Empty inline elements generate empty inline boxes, but these boxes still have margins, padding, borders and a line height, and thus influence these calculations just like elements with content.” (from section 10.8). So, an inline box with 'font-weight: bold' can influence the height of the line box.
[RC6] border-bottom-style-003 Reasons why: The following border-styles are impossible to reftest: dotted, dashed, ridge, groove, inset, outset, double. Only solid, none, hidden (and sometimes inherit) are reftestable.
[RC6] border-bottom-width-036 Reasons why: Depending on how tall 1cm is (how it is actually resolved: as 37px or 38px for border), the height of green that we may see could be 75px or it could be 76px. Eg.: Opera 11.64 displays a filled green rectangle of 76px of height while other browsers display a filled green rectangle of 75px.
[RC6] border-bottom-width-047 Reasons why: Depending on how tall 1mm is (how it is actually resolved: as 3px or 4px for border), the height of green that we may see could be 7px or it could be 8px. E.g.: Opera 11.64 displays a filled green rectangle of 8px of height while other browsers display a filled green rectangle of 7px.
[RC6] border-bottom-width-014 Reasons why: 1pt is 1.33333px (3pt == 4px); so, it could be possible for a browser to resolve such value as 2px (round up) or as 1px (round down); so, this situation is impossible to predict, therefore impossible to reftest.
[RC6] border-bottom-width-092 Reasons why: border-width: thin or border-width: medium or border-width: thick is impossible to reftest. It's all up to the UA to decide on such border-width.
[nightly-unstable] line-height-122 Reasons why: How much descent space (below the baseline) is allocated depends entirely on the font used. So, in this test, it is impossible to calculate and predict the vertical height of the bright green line. With Ahem font, this would be computable and predictable.
[nightly-unstable] line-height-123 Reasons why: How much descent space (below the baseline) is allocated depends entirely on the font used. So, in this test, it is impossible to calculate and predict the vertical height of the bright green line. With Ahem font, this would be computable and predictable.
[nightly-unstable] line-height-124 Reasons why: How much descent space (below the baseline) is allocated depends entirely on the font used. So, in this test, it is impossible to calculate and predict the vertical height of the bright green line. With Ahem font, this would be computable and predictable.
[RC6] c414-flt-wrap-000 Reasons why: the test uses fractional values (14.98em, 0.01em) which can not be converted consistently into a reftest.
[RC6] c5522-brdr-002 Reasons why: the topmost cell will be as wide as the 2 other cell width combined with one cell in a nested table. It may be possible to reftest this test… but it's not obvious.
[RC6] floats-103 Reasons why: It's possible to reftest this test but so far I have not been able to with a table. Maybe it would be possible with a nested table to overcome difficulties.
[RC6] inlines-007 Reasons why: It's impossible to predict the vertical position of baseline line since it varies depending on local font in use.
[RC6] inlines-014 Reasons why: It's impossible to predict the amount of descent space below the baseline for the local font in use. The test requires specifying a line-height that affects the cell box.
[RC6] inlines-015 Reasons why: It's impossible to predict the amount of descent space below the baseline for the local font in use. The test requires specifying a line-height that affects the cell box.
[RC6] c5525-fltcont-000 Reasons why: It seems impossible to emulate or replace 'text-align: justify' by another feature. Even without 'text-align: justify', the test would still seem difficult to reftest.
[RC6] margin-left-applies-to-008 Reasons why: Content area of an inline non-replaced element is based on the font type and font-size but the CSS2.1 specification does not specify how. So, the height of the 2 borders (blue and orange) is impossible to predict. The same difficulty applies to [RC6] margin-right-applies-to-008. With Ahem font, this would be computable and predictable.
[RC6] padding-left-applies-to-008 Reasons why: Content area of an inline non-replaced element is based on the font type and font-size but the CSS2.1 specification does not specify how. So, the height of the 2 borders (blue and orange) is impossible to predict. The same difficulty applies to [RC6] padding-right-applies-to-008. With Ahem font, this would be computable and predictable.
[RC6] padding-applies-to-008 Reasons why: Content area of an inline non-replaced element is based on the font type and font-size but the CSS2.1 specification does not specify how. So, the height of the orange borders is impossible to predict. The same difficulty applies to [RC6] margin-applies-to-008. With Ahem font, this would be computable and predictable.
[RC6] border-bottom-width-applies-to-014 Reasons why: The bottom edge of the empty cell should be “sitting” on the baseline. Now, depending on local font used, it is not predictable what would be the vertical baseline-alignment for such local font: this can vary.
[RC6] position-relative-002 Reasons why: I have not found a way to create a reliable and trustworthy reftest for this test, despite a lot of time spent on it. Even after setting line-height to 1.25, the offsetTop values of <span>b</span> and of <span id="span1">a</span> are inexplicably 55px and 80px in Chrome 21, while they are 54px and 79px in other browsers (Firefox 15.0.1 and Opera 12.02), which is what I think they should be. I do not know why, or whether this is some kind of bug. [Addendum: the offsetTop value of <span>b</span> varies depending on the local font in use.]
[RC6] block-formatting-contexts-013 Reasons why: The height of the horizontal scrollbar and the width of the vertical scrollbar are impossible to predict: these are user-settable preferences, and browser defaults differ from browser to browser and from one operating system to another.
[RC6] height-014 Reasons why: 'height: 1pt' is impossible to convert into a reftest as 1pt can be resolved as 1px (this is what Opera 12.02, Chrome 22.0.1229.79 and Konqueror 4.9.2 do) or as 2px (this is how Firefox 15.0.1 handles 1pt). The same kind of problem affects [RC6] height-047 ('height: 1mm') and other similar tests (height-036 and 1cm).
[RC6] min-height-113 Reasons why: Many tests with scrollbar(s) are unreftestable because the height of horizontal scrollbar and the width of vertical scrollbar are entirely user-settable in operating systems, (at least Windows and Linux KDE), therefore unpredictable. Some browsers (like Konqueror 4.9.2) also have semi-transparent areas around the scrollbar thumb and scrollbar arrows: so, an overlapping green square with scrollbar(s) may still display red from the overlapped red square and this is impossible to prevent/work around.
[RC6] replaced-intrinsic-ratio-001 Reasons why: We would first need to create an image made of a green triangle inside a filled lime rectangle. Then it's not clear if oblique shapes (bitmap) would not be different from svg drawing. Unknown, unclear at this moment.
[RC6] replaced-min-max-001 Reasons why: Stretched images will create bigger and fuzzier black dots: this is impossible to reftest appropriately.
[RC6] background-position-002 Reasons why: Precise baseline-alignment positioning of ruler image is impossible to do with serif font; it varies from font to font. I re-made the test to overcome such difficulty. I have now been able to create a reftest for such test.
[RC6] inline-formatting-context-023 Reasons why: 1px offset in Opera 11.61 which is unexplained as of now. I have concluded that Opera 11.61 has a bug. I have now been able to create a reftest for such test.
[RC6] floats-101 I have now been able to create a reftest for such test. In the test, margin collapsing was not occurring while it must occur in the reftest. So, the code has been adjusted to take this into account.
[RC6] containing-block-009 I have now been able to create a reftest for such test.
[RC6] floats-146 Reasons why: 0.2em padding and 0.2em margin create fractional pixels; border-width: thin is not as reliable as border-width: 1px. The test has been modified to work around those 2 issues.
[nightly-unstable] line-height-125 Reasons why: the offsetHeight of FAIL varies from one browser to another and when using a different font. When using the same font type (“Liberation Serif”), the offsetHeight of the text node “FAIL” is 57px in Chrome 17 and 56px in Firefox 10.0.2 and Opera 11.61. With other font types, there is and should be still a 1px difference. I have now been able to create a reftest for such test.
[RC6] floats-143 Reasons why: When using the same sans-serif font, different browsers will use a different offsetWidth for the text node “PA” and for the text node “SS”. For example, when using “DejaVu Sans” font, Chrome 17.0.963.56 will use 48px for “PA” and then 46px for “SS” while Firefox 10.0.2 and Opera 11.61 will use 45px for both “PA” and “SS”. The used width of text nodes is unpredictable. I have now been able to create a reftest for such test. It must be said, however, that when using a font like FreeSans, Firefox 11.0 will not match the reftest perfectly; it will be off by 1px.
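Several entries in this list (border-bottom-width-014, -036 and -047, height-014 and height-047) come down to how an absolute length rounds to whole pixels. The Python sketch below works through that arithmetic, assuming the CSS 2.1 ratios of 96px per inch, 2.54cm per inch and 72pt per inch. The function name candidate_pixel_sizes is hypothetical, and real browsers may apply other rounding or snapping rules than plain floor and ceiling.

```python
import math

# CSS 2.1 absolute units expressed in px (1in = 96px = 2.54cm = 72pt).
PX_PER_UNIT = {
    "in": 96.0,
    "cm": 96.0 / 2.54,
    "mm": 96.0 / 25.4,
    "pt": 96.0 / 72.0,
}


def candidate_pixel_sizes(value: float, unit: str):
    """Return (exact px, rounded-down px, rounded-up px) for a CSS length."""
    exact = value * PX_PER_UNIT[unit]
    return exact, math.floor(exact), math.ceil(exact)


for length, unit in [(1, "pt"), (1, "mm"), (1, "cm")]:
    exact, low, high = candidate_pixel_sizes(length, unit)
    print(f"{length}{unit} = {exact:.4f}px, may be drawn as {low}px or {high}px")
```

The two candidates for each length are 1px or 2px for 1pt, 3px or 4px for 1mm, and 37px or 38px for 1cm, which matches the differences reported between browsers above.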