Archive of UserLand's first discussion group, started October 5, 1998.

Scripting News XML DTD/Parsing Issues

Author: Matt Hamer
Posted: 5/17/1999; 1:36:04 PM
Topic: Scripting News XML DTD/Parsing Issues
Msg #: 6368

I've recently encountered two interesting problems while processing the XML version of Scripting News, both related to HTML embedded in the content. The first problem is related to embedded HTML tags.

The New channel: BMW Motorcycle Owners of America. Now, by looking at it I know that 'Perfect!' should be italicized. What if the content were:


In the first case I need to send "Update" to the browser but in the second I need to send, " My suggestion would be to refine the content model for to define the HTML tags that are allowed, like this:

This is a "mixed" content model.
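To make the "mixed" idea concrete, here is a rough sketch of what a consumer sees once the allowed tags are declared as real child elements instead of escaped text. The element name "text", the sample item, and the set of allowed tags are all assumptions made up for illustration; they are not taken from the actual Scripting News DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration for a mixed content model (an assumption):
#   <!ELEMENT text (#PCDATA | a | i | b)*>
# Hypothetical item that would be valid against it:
SAMPLE = '<text>Update: <i>Perfect!</i> Read <a href="http://example.com/">more</a>.</text>'

def flatten(elem):
    """Return (tag, text) pairs in document order; tag is None for plain text."""
    parts = []
    if elem.text:
        parts.append((None, elem.text))          # text before the first child
    for child in elem:
        parts.append((child.tag, "".join(child.itertext())))
        if child.tail:
            parts.append((None, child.tail))     # text following this child
    return parts

parts = flatten(ET.fromstring(SAMPLE))
for tag, text in parts:
    print(tag, repr(text))
```

With the tags parsed as elements, the application can tell markup that should be rendered (the i element here) apart from character data that should be shown literally.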

What I'm suggesting is in the spirit of John Cowan's Itsy Bitsy Teeny Weeny Simple Hypertext DTD (his comment at the top is especially applicable).

I'd be happy to help with the DTD modifications.

The second problem has to do with "pre-defined" HTML entities. This problem is more serious, because it causes a parsing error.

The problem is that XML parsers don't (or shouldn't) understand the &eacute; entity. I'm currently using James Clark's XP and I get the following message when trying to parse the file:

reference to undefined entity "eacute"
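The same failure can be reproduced with other non-validating parsers. The sketch below uses Python's standard library parser rather than XP, so the wording of the error will differ; the element name "text" is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

def try_parse(xml_text):
    """Return the root element's text, or a short description of the parse error."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return "parse error: %s" % err

# An HTML named entity that the document never declares: rejected.
named = try_parse("<text>caf&eacute;</text>")

# The equivalent numeric character reference: accepted by any XML parser.
numeric = try_parse("<text>caf&#233;</text>")

print(named)
print(numeric)
```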

The easiest thing to do would probably be to convert HTML entities to their numeric character references, for example:

&eacute; -> &#233;
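A minimal sketch of that conversion, assuming it runs over the item text before the XML file is written out. The function name to_numeric is made up for this example. The five entities that XML itself predefines are left alone, and so are names that are not in the HTML entity table.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser already understands; these must not be rewritten.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def to_numeric(text):
    """Replace HTML named entities with decimal numeric character references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)                 # leave untouched
        return "&#%d;" % name2codepoint[name]
    return _ENTITY.sub(repl, text)

converted = to_numeric("caf&eacute; &amp; cr&egrave;me")
print(converted)
```

Doing this on the producing side means consumers need no HTML entity declarations at all.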

Any thoughts?

This page was archived on 6/13/2001; 4:50:14 PM.

© Copyright 1998-2001 UserLand Software, Inc.