On Wed, 2011-08-10 at 09:40 -0500, Matthew J. Roth wrote:
Tim wrote:
I used to use the underscore, as it made sense (to me, and other programmers) as a substitute for a space. But there are two drawbacks:
- Try explaining to the clueless what an underscore is, and how to
type it. Try doing that again and again, and you get real sick of it.
You have the messy combinations of punctuation such as:
Shakespeare_-_The_Taming_of_the_Shrew
Where it'd really be better to collapse all punctuation down to just one punctuation symbol. That's "better" as in "easier and more convenient," not more lexically correct. Remember these are URIs (i.e. codes), not general language.
- If you ever want a URI printed on a newspaper or magazine, whoever
types it may not be able to get an underscore into the text, unless they're familiar with how their publishing system works. And, even then, they may fail. Many of them will convert an underscore into an em dash, since an underscore is hardly ever desired in print, yet proper dashes are wanted all the time.
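For what it's worth, the "collapse all punctuation down to just one symbol" idea can be sketched in a few lines of Python. This is only an illustration, not anything from the thread: the function name `collapse_punctuation` and the choice of a hyphen as the single separator are assumptions, and it only treats ASCII letters and digits as "not punctuation".

```python
import re


def collapse_punctuation(title: str, sep: str = "-") -> str:
    """Replace every run of non-alphanumeric characters with one separator.

    Hypothetical helper for illustration; the hyphen default is an assumption.
    """
    # Any run of characters other than ASCII letters and digits
    # (spaces, underscores, dashes, commas, ...) becomes one separator.
    slug = re.sub(r"[^A-Za-z0-9]+", sep, title)
    # Drop separators left dangling at either end.
    return slug.strip(sep)


print(collapse_punctuation("Shakespeare_-_The_Taming_of_the_Shrew"))
# prints: Shakespeare-The-Taming-of-the-Shrew
```

The explicit character class is deliberate: `\w` would count the underscore as a word character and leave it in place.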
- Host Names (or 'labels' in DNS jargon) as traditionally defined by
RFC 952 and RFC 1123 may be composed of upper and lower case characters, numeric characters, and the dash character. RFC 2181 significantly liberalized the valid character set including the use of "_" (underscore), but it is still a *good idea* to stick to the traditionally defined characters[¹].
It's become much worse than that with new classes of labels allowing non-ASCII character sets. See http://tools.ietf.org/html/rfc5890
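As a rough sketch of what "sticking to the traditionally defined characters" means in practice, here is a small Python check for the letters/digits/dash rule quoted above. The names `LABEL_RE` and `is_traditional_hostname` are made up for this example. The 63-character label limit and the "no leading or trailing dash" rule come from the DNS and host-name RFCs rather than from the quoted text, and the 253-character overall limit is the usual textual form of the DNS name-length limit. It deliberately ignores the internationalized labels covered by RFC 5890.

```python
import re

# One traditional label: ASCII letters, digits and dashes, 1 to 63
# characters, not starting or ending with a dash.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_traditional_hostname(name: str) -> bool:
    """Return True if every label uses only the traditional character set.

    Hypothetical helper for illustration; not a full DNS validator.
    """
    # A single trailing dot (fully qualified form) is allowed.
    if name.endswith("."):
        name = name[:-1]
    if not name or len(name) > 253:
        return False
    # Every dot-separated label must match in full; empty labels fail.
    return all(LABEL_RE.fullmatch(label) for label in name.split("."))


print(is_traditional_hostname("www.example.com"))      # True
print(is_traditional_hostname("my_host.example.com"))  # False
```

The underscore case is rejected here even though RFC 2181 permits it in DNS labels generally; that is the whole point of the conservative check.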
poc