HTML & XHTML: The Definitive Guide

Authors: Chuck Musciano and Bill Kennedy
Publisher: O'Reilly

Reviewed by Kevin Ryan

This Fourth Edition is subtitled "Creating Effective Web Pages" but would probably be better termed "An HTML 4.0 Reference with passing references to XML."

This is not one of those 1,200-page tomes that get used as doorstops more than they are read. It is a great companion to another O'Reilly Definitive Guide, Cascading Style Sheets by Eric Meyer, which Kristin Bradley reviewed in June's AJ. Together the two give a solid basis for web authoring in its current incarnation.

We all know what HTML is, and on page 9 the authors explain what this XHTML is and how it works. All of these MLs (Markup Languages) descend from the grandma, SGML, which is too hard for normal people to work with: HTML is an application of SGML, and XML is a simplified subset of it. I like to think of XML as a meta-markup language, a set of rules that allows you to create your own tags, which is why it is called eXtensible. The problem with HTML (both the older version 3 and the new version 4) is that it tolerates sloppy tag usage. For instance, you can open a paragraph with a <p> tag and never close it with </p>. XML doesn't allow this. So if you want to write HTML that conforms to XML rules, you have to be especially careful. XHTML is the standard that defines such HTML. Using XHTML is a good idea because the web world is moving toward XML, and future browsers may not be as forgiving of those sloppy tags as the current ones are.
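To make the unclosed <p> point concrete, here is a minimal sketch of my own (it is not from the book). It assumes Python's standard xml.etree.ElementTree parser as a stand-in for any strict XML parser; the helper name is_well_formed and the two sample fragments are made up for illustration. The check only tests XML well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise.

    Hypothetical helper for illustration: it checks well-formedness only,
    not whether the tags are legal XHTML.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style fragment: neither <p> is ever closed.
sloppy = "<body><p>First paragraph<p>Second paragraph</body>"

# XHTML-style fragment: every <p> has a matching </p>.
strict = "<body><p>First paragraph</p><p>Second paragraph</p></body>"

print(is_well_formed(sloppy))  # False: </body> arrives while <p> is still open
print(is_well_formed(strict))  # True
```

A browser of today would render both fragments the same way; an XML parser rejects the first one outright, which is the strictness XHTML asks authors to live with.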

Add to this confusing mixture the extensions that Microsoft and Netscape created for their own browsers only. Fortunately, with HTML version 4, many of these extensions have been folded into the standard, so it is now much easier to write HTML that works in both browsers.

Chapter 2 is called "Quick Start" and runs you through writing basic HTML. The authors handle concepts such as content-based styles and physical styles so that a beginning HTML writer can comprehend the differences. Friendly advice also guides the neophyte. This quick start also includes images and frames. It's not until chapter 3 (page 45) that the term "well-formed" comes into play. They call XHTML the "prissy cousin of HTML."

From that point the book turns into a quasi-reference. It goes through text tags, image tags, link tags and list tags in chapters 5 through 7. Chapter 8 covers Cascading Style Sheets (CSS), but too briefly to have any real value beyond a basic understanding. Get the other O'Reilly book for that. Form, table, frame and applet tags are covered in chapters 9 through 12. Extensions for Netscape 4.0 are briefly dealt with in the next chapter. XML is covered inadequately in 20 pages in Chapter 15. XHTML gets only 15 pages before the main part of the book finishes with some tricks. The last 100 pages are reference tables of tags and the DTD.

I found the book generally helpful because it didn't assume knowledge of HTML in the first section, but then built on the introduction to explain clearly what each set of tags does. The layout of each chapter was also helpful. I could read the first few pages of general introduction to get the main concepts and then skip the tag definitions and explanations on the first read-through.

But there were ways the book could be improved. As with any 650-page book, you will find some inaccuracies, but there were more than I anticipated here. On the page for the <p> tag, the authors do not follow XML conventions: the example includes only the opening tag, not the closing </p>. To their credit, they discuss the common tag structure, and the example reflects what most people do, which makes this reference more descriptive than prescriptive. Some illustrations do not fit their captions, such as the background examples on page 156.

Another issue is the treatment of HTML history, which gets short shrift in the book. Some may see the history as irrelevant; others may want more of a perspective. The book looks toward the future more systematically: tags slated for oblivion (deprecated), such as the <strike> tag, and those already obsolete are listed clearly. The chapter on legacy Netscape (4.0) extensions was odd because the authors maintain that HTML 4.0 deals with browser differences. Perhaps the fact that Netscape's newer browser has not worked out well is a motivation for its inclusion.

My final quibble is that the appendices are extremely hard to read. A set of symbols marking deprecated and obsolete tags, as well as other distinctions such as physical versus content-based tags, could have been used throughout the book to make quick perusal and reference use easier.

Overall, though, the book is a solid explanation of where HTML is going these days, and what is needed to make a proper (well-formed) document on the W3 today. I just wonder if it would be more expedient to go straight to XML and start making documents that conform completely, so as to ensure that they can be read and accessed for longer. If you are pressed for training time and need to keep up to date on HTML standards, get this book.

Algorithmica Japonica

October, 2001

The Newsletter of the Tokyo PC Users Group


Tokyo PC Users Group, Post Office Box 103, Shibuya-Ku, Tokyo 150-8691, JAPAN