On Thu, 2005-06-09 at 20:10 +1000, telford@xxxxxxxxxxxxxxxxxxxxx wrote:
...
> Thus if anyone is going to design a communications language it
> should be a robust and that means it can recover from problems
> and can guarantee resynchronisation from an arbitrary seek.
> XML doesn't live up to the promise of being a universal markup
> language because it is too annoying an too brittle.

Uhm. Sure. Here's a gig of download; your 500K of usable detail can be
found spread throughout it.

Seriously, XML itself is no more brittle than your ASCII file; it's what
you put in it that makes a specific XML environment brittle or not. It's
just SGML after all, which is precisely what HTML is. The parser you are
using sucks. Sorry, but that's the root of your problem.

> By the way, how DO I get perl to read such a file?
>
> Do I have to write my own parser?

Convert the (probably cp-1252) text into UTF-8, then parse it. Or set an
encoding in the header; it looks like the Perl bindings suck a certain
amount.

Rob

--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
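
For illustration only, here is a minimal sketch of the two routes mentioned above (convert to UTF-8 first, or declare the encoding in the header). It is written in Python rather than Perl, and it assumes the file really is cp-1252 and carries no XML declaration of its own; the function names and the sample bytes are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_after_converting(raw: bytes) -> ET.Element:
    """Route 1: transcode the bytes to UTF-8, then parse.

    Assumption: raw is Windows-1252 (cp-1252) with no XML declaration.
    """
    text = raw.decode("cp1252")
    # UTF-8 is what an XML parser assumes when no encoding is declared.
    return ET.fromstring(text.encode("utf-8"))


def parse_with_declared_encoding(raw: bytes) -> ET.Element:
    """Route 2: leave the bytes alone and declare the encoding up front.

    Assumption: raw has no declaration yet, so prepending one is safe.
    """
    header = b'<?xml version="1.0" encoding="windows-1252"?>'
    return ET.fromstring(header + raw)


# Hypothetical sample: 0xE9 is e-acute, 0x93/0x94 are curly quotes in cp-1252.
sample = b"<note>caf\xe9 \x93quoted\x94</note>"
root_a = parse_after_converting(sample)
root_b = parse_with_declared_encoding(sample)
```

Either way the parser ends up with the same text; the difference is only whether the file or the reader carries the knowledge of the encoding.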