
<!ENTITY % Block "(abstract,(section|include)*)">

So, the other day I registered a new domain and sat down to start fleshing out the contents of the site. Eventually, the inevitable questions of separating form from content gave way to the inevitable questions of separating form from content from structure. In the past, I have relied heavily on server-based widgets, like PHP, for doing headers and footers. Later, I even wrote an Apache::Aaronland handler to take that PHP site and mod_perl-ize it without changing a single character in the source files. But both solutions just break if the site is moved to a server with no bells and whistles. At the very least, the raw HTML is all fucked up and won't validate.

So, the thinking goes, maybe I will just create YA generic DTD for webpages or use one that someone else has written. Personally, I was considering using acmeml (for lack of a better name) or otlml and running the files through an XSLT processor. But it's the same problem: what if you don't have access to an XSLT widget, or it breaks? Further, there's always just that extra little bit of formatting that you want to be able to add to your content; witness the description element in RSS.

Wouldn't it be nice to have a markup language that you could add structure and logic to? One to which arbitrary tags could be applied and that still "just worked" when the document proper is sent to a browser. Even if the backend magic suddenly broke, the site might look like crap but presumably it would still be usable. HTML is, we'll all agree, not the best candidate at first blush. XHTML, on the other hand, comes pretty close. Through the magic of parameter entities and the ability to define and tweak them inside the DOCTYPE declaration, you can essentially wrap (X)HTML in your own case-specific tags.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "" [
<!ENTITY % dtdMods SYSTEM "">
%dtdMods;]>

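The external file that the dtdMods entity pulls in (call it dtd.mod) is where the overrides live. Here is a rough sketch of what such a file might contain. Only the %Block; definition and the p and div children are taken from this post; the `*` repetition on those children, the empty `include` element and its `href` attribute are assumptions made up for illustration.

```xml
<!-- dtd.mod (sketch; everything except %Block; is an assumption) -->

<!-- This is read before the XHTML DTD itself, and the first
     declaration of an entity is the one that sticks, so body
     now takes one abstract followed by sections and includes. -->
<!ENTITY % Block "(abstract,(section|include)*)">

<!-- p and div are declared later on by the XHTML DTD -->
<!ELEMENT abstract (p)*>
<!ELEMENT section  (div)*>

<!-- hypothetical: an empty placeholder for the backend to fill in -->
<!ELEMENT include  EMPTY>
<!ATTLIST include  href CDATA #REQUIRED>
```

A document body written against that sketch might then look something like this (the file name is invented):

```xml
<body>
 <abstract><p>What this page is about.</p></abstract>
 <section><div><h2>A heading</h2><p>Plain old XHTML from here on in.</p></div></section>
 <include href="footer.html" />
</body>
```
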
This is still an imperfect solution. The first problem is that the browsers have never been taught to deal with this syntax, so the %dtdMods;]> from the DOCTYPE declaration gets printed to the browser window. Dunno. The second problem involves the fact that I am overriding the %Block; entity, in dtd.mod, which is used to determine the child elements for the body tag. The good news is that I can re-assign the list of valid children, in this case abstract, section and include, thus applying more structure to my document than a pure formatting language allows out of the box. Since the children of the first two elements are p and div, respectively, I can start tapping away in HTML to my heart's content. The bad news is that the %Block; entity is also used by the blockquote and noscript tags, so they inherit the new content model too. There isn't much to do about this since you can't redefine elements; oh well. A third problem is that you cannot use already defined parameter entities inside new definitions...

<!ENTITY % foo "(a|b|%c;|d)*">

...without causing the W3C validator grief. I don't know why. Rudimentary testing suggests that you should not waste your time trying to assign styles to your new tags. Your mileage may vary.
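
A possible explanation for the entity problem, offered as a guess rather than something the validator spelled out: a parameter entity has to be declared before it is referenced, and the XML spec does not allow parameter entity references inside a declaration in the internal subset at all. Since the overrides are read before the XHTML DTD, none of its entities exist yet at that point. If that is what is going on, one workaround is to paste the expansion in by hand. In the sketch below, `c1` and `c2` are made-up names standing in for whatever %c; would have expanded to.

```xml
<!-- Instead of referencing %c; inside the new definition, spell
     out its contents. c1 and c2 are hypothetical element names
     standing in for the real expansion of %c;. -->
<!ENTITY % foo "(a|b|c1|c2|d)*">
```

The obvious cost is that the copy has to be kept in sync by hand if the original entity ever changes.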
