Monday, 17 December 2012

I want my octal!

- or - Even poor standards are better than none

In this day and age, hardly anyone uses octal. Why bother? Hexadecimal works just fine, and so does decimal. There are a VERY few places where octal makes reasonable sense (Unix file permissions assign three bits to each of owner, group, and others, so it makes sense to describe a mode as, for instance, 0644), but mostly it's just historical. In C and C-derived languages, a quoted string can have octal escapes in it: '\101' means 'A', and '\040' means space.
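Both uses of octal are easy to check in Python, which inherits the C-style escapes. This is only a small sketch; the variable names are mine, and it leans on the standard library's stat.filemode to render the permission bits.

```python
import stat

# Octal escapes in a string literal: up to three octal digits after the backslash.
letter = "\101"  # octal 101 = decimal 65 = "A"
space = "\040"   # octal 40 = decimal 32 = " "

# 0o644 is three groups of three bits: rw- for owner, r-- for group, r-- for others.
# S_IFREG marks the mode as belonging to a regular file so filemode can render it.
permissions = stat.filemode(stat.S_IFREG | 0o644)

print(repr(letter), repr(space))
print(permissions)  # -rw-r--r--
```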

But BIND works differently. A TXT record can store arbitrary binary data (up to 255 bytes per string, but multiple strings per record are permitted). Three-digit escape sequences are supported, but they are decimal, not octal; this is the \DDD form defined for master files in RFC 1035. For example, this in the master file (note that the whitespace after \009 in the second string is a literal tab character):

test IN TXT "Hello,\032world\033" "\000\009 \255"

produces this dig output:

;; ANSWER SECTION: 3600 IN TXT "Hello, world!" "\000\009\009\255"

(For comparison, a C-style program would display that second string as "\0\t\t\377", as can be demonstrated using Pike's DNS resolver.)
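To make the difference concrete, here is a rough Python sketch of how the decimal escapes above could be decoded. It is not BIND's actual parser: the function name decode_decimal_escapes and the regular expression are my own, and the sketch only handles \DDD plus a backslash followed by a single other byte.

```python
import re

# Either \DDD (exactly three decimal digits) or a backslash plus any other byte.
_ESCAPE = re.compile(rb"\\([0-9]{3})|\\(.)", re.DOTALL)


def decode_decimal_escapes(raw: bytes) -> bytes:
    """Decode master-file style escapes, reading \\DDD as a DECIMAL byte value."""

    def replace(match):
        digits = match.group(1)
        if digits is None:
            # Backslash followed by a non-digit: keep that byte literally.
            return match.group(2)
        value = int(digits.decode("ascii"), 10)
        if value > 255:
            raise ValueError(f"escape out of byte range: {value}")
        return bytes([value])

    return _ESCAPE.sub(replace, raw)


first = decode_decimal_escapes(rb"Hello,\032world\033")
# The \t here is Python's way of writing the literal tab from the master file.
second = decode_decimal_escapes(b"\\000\\009\t\\255")

print(first)   # b'Hello, world!'
print(second)  # b'\x00\t\t\xff'

# The same three digits mean different things in the two conventions.
print(int("032", 10), int("032", 8))  # 32 26
```

Note that "\009" has no octal reading at all, since 9 is not an octal digit, which is one more way a C programmer's instincts go wrong here.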

BIND's master file format is the only system or language that I am aware of that uses decimal in this way. Had C never existed, decimal would probably have been the better choice, but since C is ubiquitous, there's a VERY strong argument for consistency. Even Python, which often goes against the C heritage in order to improve code readability, has kept octal escapes in its string literals. It's just not worth fighting human nature: follow an established standard, even if it would be somewhat cleaner to do something different.

Of course, BIND can't change now. Its own backward compatibility is far more important than matching C. But it's likely to be something that will trip up many a C programmer for decades to come.