About a week ago I hooked up an HttpModule which set Content-Type to application/xhtml+xml for those user agents that “get” it. With the decent size of this site problems reared their head really fast. Having solved a number of challenges I ran into a dead-end and had to pull the plug on application/xhtml+xml.
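For anyone who hasn't seen this kind of content negotiation, here is a rough sketch of the decision such a module has to make. It is written in JavaScript rather than as the actual ASP.NET module, and `pickContentType` and its ranking rule are assumptions made up for illustration, not the code that ran on this site.

```javascript
// Hypothetical helper: choose a Content-Type from the request's Accept header.
// Assumption: XHTML is served only when the user agent lists it explicitly
// with a non-zero q value that is not lower than the one given to text/html.
function pickContentType(acceptHeader) {
  const XHTML = "application/xhtml+xml";
  const HTML = "text/html";
  if (!acceptHeader) return HTML;

  // Parse entries such as "application/xml;q=0.9" into a type -> q map.
  const quality = {};
  for (const part of acceptHeader.split(",")) {
    const [type, ...params] = part.trim().split(";");
    let q = 1; // q defaults to 1 when omitted
    for (const param of params) {
      const [key, value] = param.trim().split("=");
      if (key.trim() === "q") q = parseFloat(value);
    }
    quality[type.trim().toLowerCase()] = q;
  }

  const xhtmlQ = quality[XHTML];
  if (xhtmlQ === undefined || !(xhtmlQ > 0)) return HTML;
  const htmlQ = quality[HTML] !== undefined ? quality[HTML] : 0;
  return xhtmlQ >= htmlQ ? XHTML : HTML;
}
```

A user agent that never mentions application/xhtml+xml in its Accept header, or sends no Accept header at all, falls back to text/html under this rule.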
The biggest strength and yet the biggest weakness of this aspect of ASP.NET is server controls. They are the best thing since sliced bread, yet you are stuck with the markup they produce. And if they produce incorrect markup… yes, you’re stuck with it.
One good example is the <asp:ValidationSummary> control. If you have a form with a bunch of fields, some of them required, some that have to pass RegEx rules, etc, the ValidationSummary control can collect all data entry errors and display a nice summary. I use this control in Tools and The Hall of Fame. It’s a great idea, but the control renders markup that breaks everything when served as XML.
For example, with one required field on a form the ValidationSummary control produces the following HTML:
<table id="_ctl0_vs" class="vsummary" cellpadding="0"
cellspacing="0" border="0" width="100%">
<tr><td>
<font color="Red">Enter site URL<br></font>
</td></tr>
</table>
As you see the font tag is thrown into this sauce as well as an open <br> tag. Both Opera and Mozilla choke on this code snippet because, since they attempt to parse XML, there’s no matching pair for the open <br>. A workaround would be… to ditch the validation summary control. But… it’s so helpful.
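Since the site already post-processes output, one conceivable stopgap is a filter that closes the void elements before the markup leaves the server. The sketch below is in JavaScript and purely illustrative: `closeVoidElements` and the `VOID_TAGS` list are hypothetical names, and regex-based rewriting like this is fragile (an attribute value containing ">" would trip it up). It only makes the markup well-formed; the font tag would still be there.

```javascript
// Elements that HTML allows to stay open but XML requires to be closed.
// Assumption: this short list is enough for the controls in question.
const VOID_TAGS = ["br", "hr", "img", "input", "meta", "link"];

// Hypothetical filter: rewrite <br> or <br/> as <br />, keeping attributes.
function closeVoidElements(markup) {
  const pattern = new RegExp(
    "<(" + VOID_TAGS.join("|") + ")\\b([^<>]*?)\\s*/?>",
    "gi"
  );
  // Tag names are lowercased because XHTML element names are case-sensitive.
  return markup.replace(
    pattern,
    (match, tag, attrs) => "<" + tag.toLowerCase() + attrs + " />"
  );
}

const fixed = closeVoidElements('<font color="Red">Enter site URL<br></font>');
console.log(fixed);
```

Tags that are already self-closed pass through unchanged apart from normalized spacing, so running the filter twice is harmless.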
Another problem I ran into when working on the polling control was about postbacks. If there’s a control on the page that posts back and requires handling, a special snippet of JavaScript is injected:
<script type="text/javascript">
<!--
function __doPostBack(eventTarget, eventArgument) {
var theform = document.getElementById ('__aspnetForm');
theform.__EVENTTARGET.value = eventTarget.split("$").join(":");
theform.__EVENTARGUMENT.value = eventArgument;
theform.submit();
}
// -->
</script>
Well, it looks a little different in ASP.NET 1.0 and 1.1. Back in February I explained what I did to “fix” it. Anyway, Mozilla hates it. Pardon my ignorance, but it seems that the comment tags throw it off. No postback happens and all you get is “__doPostBack is not defined” even though it’s there.
If I switch Content-Type back to text/html Mozilla likes it again. Not having postbacks is a bummer. A big one. Pretty much everything about building custom server controls revolves around postbacks. Interestingly enough Opera doesn’t mind it.
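The suspicion about the comment tags is consistent with how XML works: to an XML parser, everything between the opening and closing comment markers really is a comment, so a strict parser is entitled to throw the function away along with it. A commonly suggested alternative is to guard the script with a CDATA section instead. The sketch below, in JavaScript, shows what such a rewrite of the script body might look like; `wrapScriptInCdata` is a hypothetical helper, not something ASP.NET provides, and it assumes the script itself never contains the sequence "]]>".

```javascript
// Hypothetical helper: swap the old HTML comment guard inside a <script>
// body for a CDATA section hidden behind JavaScript line comments.
function wrapScriptInCdata(scriptBody) {
  const inner = scriptBody
    // Drop a leading "<!--" line if present.
    .replace(/^\s*<!--[ \t]*\r?\n?/, "")
    // Drop a trailing "// -->" or "-->" line if present.
    .replace(/\r?\n?[ \t]*(\/\/)?[ \t]*-->\s*$/, "");
  return "\n//<![CDATA[\n" + inner + "\n//]]>\n";
}

const original = "\n<!--\nfunction __doPostBack(t, a) { /* ... */ }\n// -->\n";
console.log(wrapScriptInCdata(original));
```

The "//" in front of each CDATA marker keeps the script valid JavaScript when the same page is served as text/html.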
If you dig around server controls with Reflector I’m sure you’ll find a couple more examples along the same lines. So what does all this mean?
We’re Not There Yet
ASP.NET isn’t ready yet to produce markup that can be served as XML with the application/xhtml+xml content type. No, it’s not a sin unto death and, please, don’t list us with the sons of perdition. I didn’t expect Microsoft folks to worry about it in the first pass anyway, so I’m willing to cut them lotsa slack. Until ASP.NET 2.0. Nevertheless, I think there’s a bigger issue here which brings me to my next point.
Don’t Serve Web Applications as XML
It is my deepest conviction that we’re well over the hill with “web sites” of the dot com era. Businesses face much more complex tasks these days and simple HTML sites don’t cut it. The only worthy application for a strictly-HTML site I can think of is a brochure site. Everything else requires server-side processing.
I never thought of AspNetResources.com as a web site. This is a web application. It follows the now traditional n-tier architecture with a presentation layer and a data layer. I excluded a business logic layer because there’s no strong need for it yet. Five HTTP modules handle pre- and post-processing. Everything feeds off of SQL Server via stored procedures. Searching is handled by Full-Text Search. A number of user controls and custom server controls handle presentation. This is far from the biggest or most complex web app, but even here it’s easy enough to slip up.
When you serve your content as XML there’s no margin for error. You deviate just a little and the browser throws a parser error. It’s supposed to. However, therein lies a nasty side effect—you can’t trap the error and present a nicer message. I’m a big believer in handling server-side errors but a client-side error renders you helpless. You can’t even know something went kaboom. This is pretty ugly. User agents need to do better than this.
Imagine for a second Citi Bank, Allstate or Yahoo blowing up with a red-on-beige error that a mismatched tag was encountered. None of those companies would risk busting their business for the sake of the noble cause of markup purity.
On the other hand, if you run a web app in complete isolation and under your God-like control it is a perfect ground for XML. A blog or a personal site is a perfect candidate.
XML Is a Contract
Let me digress. The idea of sending some data from a Publisher to Subscriber is no new notion. The idea of web services, therefore, is nothing new. What is new is that we’re finally agreeing on certain protocols (read “contracts”) that make it happen. SOAP and WSDL are a good example. Being XML based, they simply define contracts a sender and a receiver should stick to. You have to stick to this contract 100%. This is where closing all tags becomes of paramount importance.
On the web there are fewer contracts. HTML is a contract but it’s being violated constantly. XHTML is a contract but as long as it says that something “may get deprecated” it will be violated.
People are nowhere near perfection. You can’t demand it from them. You can’t be vigilant about properly closing all tags or quoting attributes 24x7. The moment you turn your work over and let others drive your CMS you can forget about purity. The moment someone leaves a comment on your site you take chances.
How many times have you seen a web site you developed go to complete crap because the people in charge copy and paste chunks of text from MS Word with horrendous markup? Or they simply don’t bother typing in correct HTML and why should they? Does everybody need to know how many belts they need to change in their cars and how to get to their timing belt? Which oil do you put in your engine? 5W-30, 10W-30 or 10W-40? Is everyone fluent in typing correct XHTML comments? :)
All these problems stem from a simple fact: nobody ever agreed to your contract. Add comment validators and processors but all they do is bend others to your contract and yours alone.
DOCTYPEs Do Matter
I side with DOCTYPEs rather than with serving content as XML because I think the business case for DOCTYPEs is stronger. The purism case is weaker, but this ain’t no Zion. This is the Matrix as we know it. I’m glad we have DOCTYPE switching because it imposes stricter rules on coding practices and shows “a more excellent way.” I’m also glad it fixes a number of interpretation inconsistencies in Internet Explorer. Use DOCTYPEs (XHTML, if possible), validate, fix, validate again.
If you’re in the business of developing web applications, ditch application/xhtml+xml, forget it and don’t look back. Don’t hurt your business. Back to content-type=text/html…