Deseret is now officially encoded in the Supplementary Multilingual Plane, also
called Plane One.
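As a rough illustration of what "Plane One" means in practice, the following Python sketch shows how a single Deseret code point is represented in UTF-8 and UTF-16. The helper name `describe` is made up for this example, and U+10400 is used simply because it is the first code point of the Deseret block.

```python
def describe(ch: str) -> dict:
    """Summarise how one character is encoded (illustrative helper only)."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    # Split the UTF-16 bytes into 16-bit code units.
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return {
        "code_point": f"U+{cp:04X}",
        "plane": cp >> 16,  # plane 0 is the BMP, plane 1 is the SMP
        "utf8": " ".join(f"{b:02X}" for b in utf8),
        "utf16": " ".join(f"{u:04X}" for u in units),
    }


if __name__ == "__main__":
    # U+10400 is the first code point in the Deseret block.
    for key, value in describe("\U00010400").items():
        print(key, value)
```

A character outside the BMP takes four bytes in UTF-8 and a surrogate pair in UTF-16, which is one reason older software that assumed 16-bit characters had trouble with it.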
There may still be problems displaying the text on some systems which support
non-BMP ranges of Unicode. In order to avoid having the browser ignore the
spacing character (it was running some words together) it was necessary to
add a couple of line feeds between each word in the source HTML.
Also, I had to manually set the view/encoding to “User
Defined” after selecting the desired Plane One font as
the “User Defined” font in the [Tools] - [Internet
Options] - [General] - [Fonts] menu. This is because, in order
to be able to claim that this page is valid HTML and add the
valid HTML graphic from W3 to this page, the character set
(charset) declaration in the HTML header was changed from
“x-user-defined” to “UTF-8”.
Using the user defined charset would force this page to load
correctly the first time in the Internet Explorer browser,
as long as an appropriate font was selected for the user defined
font setting in the [Tools] - [Internet Options] menu. But
“x-user-defined” is apparently a Windows-specific
internal code page, and the W3C validator is quite correct
in rejecting it as an invalid charset.
Using the UTF-8 charset tag forces the Internet Explorer browser
to load this page as UTF-8, using some internal font selection
mechanism which is somewhat vague and unpredictable.
Unless the font is specified in the HTML header, the browser
may select a font which doesn't include any glyphs in the
desired range over a font which has the glyphs.
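To make the two header choices concrete, here is a small Python sketch that assembles a minimal test page with a charset declaration and, optionally, a font named in a STYLE block. The function `build_test_page` and the font name "Example Plane One Font" are hypothetical, invented only for this illustration; this is not the markup actually used on this page.

```python
from typing import Optional


def build_test_page(body_markup: str,
                    charset: str = "UTF-8",
                    font_family: Optional[str] = None) -> str:
    """Return a minimal HTML 4.01 page as a string (illustrative only)."""
    style = ""
    if font_family is not None:
        # Naming a font here makes the page "font-specific".
        style = ('<style type="text/css">'
                 f'body {{ font-family: "{font_family}"; }}'
                 '</style>\n')
    return (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">\n'
        "<html>\n<head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        "<title>Plane One test</title>\n"
        f"{style}"
        "</head>\n<body>\n"
        f"<p>{body_markup}</p>\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Font left to the browser: the situation described above.
    print(build_test_page("&#x10400;"))
    # Font named explicitly (hypothetical font name).
    print(build_test_page("&#x10400;", font_family="Example Plane One Font"))
```

Without the optional font argument, the choice of font is left entirely to the browser, which is where the unpredictability comes in.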
In order to correctly display pages which are encoded using NCRs but labelled
as charset UTF-8, the user has to manually switch the
encoding to user defined in the [View] - [Encoding] menu. Not just once,
but each and every time the page is visited or even just refreshed. This
means the user is forced to load the page twice, and with some servers,
this means running banners, pop-up screens, cookies, and other scripted
nonsense twice, too.
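For readers unfamiliar with NCRs (numeric character references), the Python sketch below shows one way to turn Plane One text into NCRs and back again. The function name `to_ncrs` is my own for this example; `html.unescape` and the `xmlcharrefreplace` error handler are part of the Python standard library.

```python
import html


def to_ncrs(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal NCR.

    ASCII characters, including '&' and '<', are passed through unchanged,
    so this is only suitable for text that is already safe as HTML.
    """
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


if __name__ == "__main__":
    deseret = "\U00010400\U00010401"  # two letters from the Deseret block
    markup = to_ncrs(deseret)
    print(markup)
    # The standard library can also produce decimal NCRs directly.
    print(deseret.encode("ascii", "xmlcharrefreplace").decode("ascii"))
    # Decoding the NCRs gives back the original text.
    print(html.unescape(markup) == deseret)
```

Because the NCR form is pure ASCII, the bytes of the page are the same whatever charset label is used; only the browser's handling of the label differs.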
One solution would be to simply select appropriate fonts using
a STYLE declaration. But, that would make this a “font-specific”
page, and since the purpose of these pages is to allow people to test their
own browsers and fonts on various scripts of Unicode, making this page
font-specific would defeat one of its purposes. (This solution won't work
as of this writing anyway; UTF-8 Plane One text can't be displayed in
the browser regardless of whether real UTF-8 or NCRs are used. My
understanding is that the various browser programmers are working
on the Plane One UTF-8 bug, and this could well be fixed soon, if not
Font-specific pages have problems of their own. Many web pages are
now setting the new Unicode version of Arial as their preferred font
for multilingual text. Those of us who don't have the new version of
Arial installed, mainly because of license restrictions, usually have an
older version of the Arial font installed because Arial has been popular
for a long time. This means that when looking at, for instance, the
otherwise wonderful Universal Declaration of Human Rights font-specific
multilingual pages at the United Nations' web site, many of the pages display
with a lot of “missing glyph” or null-boxes. In order to
correctly display the page under those conditions, the user must save a
copy of the HTML file to disk, open the HTML file in an editor, manually
change the style or font-face declaration to a more appropriate selection,
and then load the altered, archived HTML file back into the browser while
So, the other alternative for correct display is to simply set the charset
to x-user-defined. This is what was done on my Gothic Plane One test page,
http://home.att.net/~jameskass/gothictest.htm, and this is a popular
method with web page authors. It may not be valid HTML, but it works.
My home page