A palindromic polyglot program in x86 machine code, Perl, shell, and make

https://binary.golf/6:

Binary Golf Grand Prix is an annual small file format competition, currently in its sixth year. The goal is to make the smallest possible file that fits the criteria of the challenge.

This year's BGGP challenge was to output or display 6. I always wanted to work with actual machine code, so I decided to submit a DOS COM executable. Why? Because the COM format has no headers or other metadata; you can just put some x86 instructions in a file and run it directly.

Having no experience with DOS, I started by looking up a "hello world" example and found https://github.com/susam/hello:

MOV AH, 9
MOV DX, 108
INT 21
RET
DB 'hello, world', D, A, '$'

This loads 9 into the AH register (the upper byte of AX) and executes interrupt 0x21, which triggers the DOS "display string" routine. The address of the string is given directly in DX; $ is used as an in-band string terminator because DOS is weird.
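
Where does the 108 come from? The numbers in the listing are hexadecimal (note INT 21 for interrupt 0x21), and DOS loads a COM image at offset 0x100 of its segment, so the string starts right after the code. A small Python sketch of that arithmetic, assuming the usual 16-bit encodings noted in the comments (the names are made up for illustration):

```python
# DOS loads a .COM image at offset 0x100 of its segment.
COM_LOAD_OFFSET = 0x100

# Encoded size in bytes of each instruction in the hello-world listing.
# Assumption: the standard 16-bit encodings shown in the comments.
instruction_sizes = {
    "MOV AH, 9": 2,    # B4 09
    "MOV DX, 108": 3,  # BA 08 01 (16-bit immediate, little-endian)
    "INT 21": 2,       # CD 21
    "RET": 1,          # C3
}

def string_offset(sizes, load_offset=COM_LOAD_OFFSET):
    """Offset of the first byte after the code, where the DB data begins."""
    return load_offset + sum(sizes.values())

print(hex(string_offset(instruction_sizes)))
```

The code occupies 8 bytes, so the string begins at 0x100 + 8.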

Adapting this snippet to output 6 instead is trivial, but I discovered something better: Function 2 of interrupt 0x21 outputs a character (code given in DL) directly. That gives us:

MOV AH, 2
MOV DL, '6'
INT 21
RET

Or in binary:

b4 02 b2 36 cd 21 c3
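
For readers who want to see how the mnemonics map to those bytes, here is a minimal Python sketch of a hand-rolled "assembler" for just these four instructions. The helper names are invented for this example; the opcodes are the ones visible in the hex dump above.

```python
def mov_ah(imm8):
    """MOV AH, imm8 -> opcode B4 followed by the immediate byte."""
    return bytes([0xB4, imm8])

def mov_dl(imm8):
    """MOV DL, imm8 -> opcode B2 followed by the immediate byte."""
    return bytes([0xB2, imm8])

def interrupt(number):
    """INT imm8 -> opcode CD followed by the interrupt number."""
    return bytes([0xCD, number])

def ret():
    """RET (near return) -> single opcode C3."""
    return bytes([0xC3])

# MOV AH, 2 / MOV DL, '6' / INT 21 / RET
code = mov_ah(2) + mov_dl(ord("6")) + interrupt(0x21) + ret()
print(code.hex(" "))
```

Writing `code` to a file in binary mode (for example a hypothetical `six.com`) should give the same 7-byte executable.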

If you write these 7 bytes to a .COM file, the result is already a fully functional DOS executable. And since the RET instruction terminates the program, nothing after it is ever executed, so we can append whatever bytes we want, for example to create a palindrome:

b4 02 b2 36 cd 21 c3 21 cd 36 b2 02 b4
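
The mirroring step is mechanical: keep the program, then append everything except its last byte in reverse order, so that the RET byte ends up in the middle. A short Python sketch (the function name `mirror` is just for illustration):

```python
def mirror(data: bytes) -> bytes:
    """Return an odd-length palindrome with the last byte of data as its centre."""
    return data + data[:-1][::-1]

program = bytes.fromhex("b402b236cd21c3")
palindrome = mirror(program)

print(palindrome.hex(" "))
print(palindrome == palindrome[::-1])  # palindrome check
```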

However, I've always liked polyglots. It turns out that the byte sequence 23 de corresponds to the x86 instruction AND BX, SI (which modifies BX, but is harmless otherwise). And byte 23 happens to be character # in ASCII, which means anything that follows will be ignored as a comment when the binary file is read by an interpreter that understands # as a comment marker (which includes Perl, Python, Ruby, PHP, make, and the shell). This leads to the following x86/Perl polyglot:

#<DE><B4><02><B2>6<CD>!<C3>
print 6;
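
The claim that 23 de decodes to AND BX, SI can be checked by splitting the second byte (the ModRM byte) into its bit fields. The following Python sketch handles only this one opcode in register-to-register form; it is an illustration, not a general disassembler, and the function name is invented.

```python
# 16-bit general-purpose registers in x86 encoding order.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

def decode_and_r16(code: bytes) -> str:
    """Decode opcode 23 /r (AND r16, r/m16), register-direct form only."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x23:
        raise ValueError("only opcode 0x23 is handled in this sketch")
    mod = modrm >> 6          # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111  # middle three bits: destination register
    rm = modrm & 0b111          # low three bits: source register when mod == 3
    if mod != 0b11:
        raise ValueError("memory operands are not handled in this sketch")
    return f"AND {REG16[reg]}, {REG16[rm]}"

print(decode_and_r16(b"#\xde"))  # b"#" is byte 0x23
```

0xDE is 11 011 110 in binary: mode 3 (register), register 3 (BX), register 6 (SI).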

And with a few modifications, we get a palindrome again:

#<DE><B4><02><B2>6<CD>!<C3>#
print 6#6 tnirp
#<C3>!<CD>6<B2><02><B4><DE>#

This is also a valid shell script, but it tries to run a program called print with arguments '6#6' and 'tnirp'. We can make the shell recognize # as a comment marker by putting a space in front, but there is no print command, so how do we make the shell use echo while retaining print for perl? Fortunately we don't need to if we're willing to use 6 as a "format string" and switch to printf:

#<DE><B4><02><B2>6<CD>!<C3>#
printf 6 # 6 ftnirp
#<C3>!<CD>6<B2><02><B4><DE>#

We can do one better and add make to the mix. We just need some form of dummy target and an empty list of prerequisites; the rest will be the shell command we already have. Normally that would look like this (with a literal tab before printf):

foo:
        printf 6

However, at least GNU make lets you write it all in one line without using tabs:

foo: ; printf 6

This form happens to be valid Perl already: foo: ; is just a label attached to a null statement. But the shell would try to run a program called foo: and we don't want that. A creative choice of label name and spacing takes care of this problem as well:

true :;printf 6

To make this means: The target true can be created/updated (with no prerequisites) by running printf 6. Since it is the first target in our "makefile", true automatically becomes the default target.

To perl this means: A label (named true) is attached to a null statement, followed by printf(6) (the final semicolon being optional because we're at the end of the file). 6 is implicitly converted to the format string "6", which simply outputs 6.

To the shell this means: Run the true command (with an argument of ':'), then run the printf command (with an argument of 6).

The final form, a 55-byte palindrome that is also valid x86 machine code:

#<DE><B4><02><B2>6<CD>!<C3>#
true :;printf 6 # 6 ftnirp;: eurt
#<C3>!<CD>6<B2><02><B4><DE>#
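
To double-check the byte count and the palindrome property, the file can be rebuilt from its first half. This is a Python sketch under the assumption that the lines are separated by single LF bytes and that there is no trailing newline (which a palindrome starting with # requires); the variable names are made up.

```python
# AND BX, SI followed by the 7-byte "print 6" DOS program.
machine_code = bytes.fromhex("23deb402b236cd21c3")

# Everything before the central '#': first line, newline, start of second line.
first_half = machine_code + b"#\n" + b"true :;printf 6 "

# The central '#' starts the comment that hides the mirrored half
# from Perl, the shell and make.
polyglot = first_half + b"#" + first_half[::-1]

print(len(polyglot))
print(polyglot == polyglot[::-1])
print(polyglot.split(b"\n")[1].decode("ascii"))
```

Writing `polyglot` to a file in binary mode reproduces the listing above; the same file could then be handed to perl, sh, make -f, or a DOS emulator under a .COM name.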

That's it!

About mauke

I blog about Perl.