<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="../feed.xsl" type="text/xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Susam's Assembly Pages</title>
  <subtitle>Feed for Susam's Assembly Pages</subtitle>
  <link href="https://susam.net/"/>
  <link href="https://susam.net/tag/assembly.xml" rel="self"/>
  <id>https://susam.net/tag/assembly.xml</id>
  <updated>2007-11-19T00:00:00Z</updated>
  <author><name>Susam Pal</name></author>
  <entry>
    <title>Writing Boot Sector Code</title>
    <link href="https://susam.net/writing-boot-sector-code.html"/>
    <id>urn:uuid:e29d5e21-1e8b-4687-9289-be2dae3e17ad</id>
    <updated>2007-11-19T00:00:00Z</updated>
    <content type="html">
<!-- BEGIN HTML -->
&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;
  In this article, we discuss how to write our own
  &lt;code&gt;&quot;hello, world&quot;&lt;/code&gt; program into the boot sector.  At the
  time of this writing, most such code examples available on the web
  are meant for the Netwide Assembler (NASM).  Very little material is
  available that could be tried with the readily available GNU tools
  like the GNU assembler (as) and the GNU linker (ld).  This article
  is an effort to fill this gap.
&lt;/p&gt;
&lt;h2 id=&quot;boot-sector&quot;&gt;Boot Sector&lt;/h2&gt;
&lt;p&gt;
  When the computer starts, the processor starts executing
  instructions at the memory address 0xffff:0x0000 (CS:IP).  This is
  an address in the BIOS ROM.  The machine instructions at this
  address begin the boot sequence.  In practice, this memory address
  contains a &lt;code&gt;JMP&lt;/code&gt; instruction to another address,
  typically 0xf000:0xe05b.  This latter address contains the code to
  perform power-on self test (POST), perform several initialisations,
  find the boot device, load the code from the boot sector into memory
  and execute it.  From here, the code in the boot sector takes
  control.  In IBM-compatible PCs, the boot sector is the first sector
  of a data storage device.  This is 512 bytes in length.  The
  following table shows what the boot sector contains.
&lt;/p&gt;
&lt;table class=&quot;grid center textcenter&quot;&gt;
  &lt;tr&gt;
    &lt;th colspan=&quot;2&quot;&gt;Address&lt;/th&gt;
    &lt;th rowspan=&quot;2&quot;&gt;Description&lt;/th&gt;
    &lt;th rowspan=&quot;2&quot;&gt;Size in bytes&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th&gt;Hex&lt;/th&gt;&lt;th&gt;Dec&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;000&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;Code&lt;/td&gt;&lt;td&gt;440&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;1b8&lt;/td&gt;&lt;td&gt;440&lt;/td&gt;&lt;td&gt;Optional disk signature&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;1bc&lt;/td&gt;&lt;td&gt;444&lt;/td&gt;&lt;td&gt;0x0000&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;1be&lt;/td&gt;&lt;td&gt;446&lt;/td&gt;
    &lt;td&gt;Four 16-byte entries for primary partitions&lt;/td&gt;&lt;td&gt;64&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;1fe&lt;/td&gt;&lt;td&gt;510&lt;/td&gt;&lt;td&gt;0xaa55&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;
  This type of boot sector found in IBM-compatible PCs is also known
  as the master boot record (MBR).  The next two sections explain how
  to write executable code into the boot sector.  Two programs are
  discussed in these two sections: one that merely prints a character
  and another that prints a string.
&lt;/p&gt;
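&lt;p&gt;
  The offsets in the table can be cross-checked with a little shell
  arithmetic.  The following is only a sketch; the variable names are
  illustrative and not part of any tool.  It adds up the field sizes
  and prints each decimal offset in hexadecimal.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Field sizes in bytes, taken from the table above.
code=440; disk_sig=4; zero=2; partitions=$((4 * 16)); boot_sig=2

# The sizes should add up to one 512-byte sector.
total=$((code + disk_sig + zero + partitions + boot_sig))
echo "total: $total"

# Print each field offset in decimal and hexadecimal.
for offset in 0 440 444 446 510; do
  printf '%3d = 0x%03x\n' "$offset" "$offset"
done&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  This should report a total of 512 and the hexadecimal offsets 0x000,
  0x1b8, 0x1bc, 0x1be and 0x1fe, matching the two address columns of
  the table.
&lt;/p&gt;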
&lt;p&gt;
  The reader is expected to have a working knowledge of x86 assembly
  language programming using the GNU assembler.  The details of
  assembly language won&apos;t be discussed here.  Only how to write code
  for the boot sector will be discussed.
&lt;/p&gt;
&lt;p&gt;
  The code examples were verified by using the following tools while
  writing this article:
&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Debian GNU/Linux 4.0 (etch)&lt;/li&gt;
  &lt;li&gt;GNU assembler (GNU Binutils for Debian) 2.17&lt;/li&gt;
  &lt;li&gt;GNU ld (GNU Binutils for Debian) 2.17&lt;/li&gt;
  &lt;li&gt;dd (coreutils) 5.97&lt;/li&gt;
  &lt;li&gt;DOSBox 0.65&lt;/li&gt;
  &lt;li&gt;QEMU 0.8.2&lt;/li&gt;
&lt;/ol&gt;
&lt;!--
Version information available here:
http://archive.debian.org/debian/dists/etch/main/binary-i386/Packages.gz
--&gt;
&lt;h2 id=&quot;print-character&quot;&gt;Print Character&lt;/h2&gt;
&lt;p&gt;
  The following code prints the character &apos;A&apos; in yellow on a blue
  background:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;.code16
.section .text
.globl _start
_start:
  mov $0xb800, %ax
  mov %ax, %ds
  mov $0x1e41, %ax
  xor %di, %di
  mov %ax, (%di)
idle:
  hlt
  jmp idle&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  We save the above code in a file, say &lt;code&gt;a.s&lt;/code&gt;, then
  assemble and link this code with the following commands:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;as -o a.o a.s
ld --oformat binary -o a.com a.o&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The above commands should generate a 15-byte output file
  named &lt;code&gt;a.com&lt;/code&gt;.  The &lt;code&gt;.code16&lt;/code&gt; directive in the
  source code tells the assembler that this code is meant for 16-bit
  mode.  The &lt;code&gt;_start&lt;/code&gt; label is meant to tell the linker
  that this is the entry point in the program.
&lt;/p&gt;
&lt;p&gt;
  The video memory of the VGA is mapped to various segments between
  0xa000 and 0xc000 in the main memory.  The colour text mode is
  mapped to the segment 0xb800.  The first two instructions copy
  0xb800 into the data segment register, so that any data offset
  specified is an offset in this segment.  Then the ASCII code for the
  character &apos;A&apos; (i.e. 0x41 or 65) is copied into the first location in
  this segment and the attribute (0x1e) of this character to the
  second location.  The higher nibble (0x1) is the attribute for
  background colour and the lower nibble (0xe) is that of the
  foreground colour.  The highest bit of each nibble is the
  intensifier bit.  Depending on the video mode setup, the highest bit
  of the background nibble may instead make the character blink.  The
  other three bits of each nibble represent red, green and blue.
  This is represented in a tabular
  form below.
&lt;/p&gt;
&lt;table class=&quot;grid center textcenter&quot;&gt;
  &lt;tr&gt;
    &lt;td colspan=&quot;8&quot;&gt;Attribute&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&quot;4&quot;&gt;Background&lt;/td&gt;
    &lt;td colspan=&quot;4&quot;&gt;Foreground&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;I&lt;/td&gt;
    &lt;td&gt;R&lt;/td&gt;
    &lt;td&gt;G&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;td&gt;I&lt;/td&gt;
    &lt;td&gt;R&lt;/td&gt;
    &lt;td&gt;G&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;0&lt;/td&gt;
    &lt;td&gt;0&lt;/td&gt;
    &lt;td&gt;0&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
    &lt;td&gt;0&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&quot;4&quot;&gt;0x1&lt;/td&gt;
    &lt;td colspan=&quot;4&quot;&gt;0xe&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;
  We can see from the table that the background colour is dark blue
  and the foreground colour is bright yellow.  We assemble and link
  the code with the &lt;code&gt;as&lt;/code&gt; and &lt;code&gt;ld&lt;/code&gt; commands
  mentioned earlier and generate an executable binary consisting of
  machine code.
&lt;/p&gt;
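&lt;p&gt;
  As a rough illustration, the attribute byte and the 16-bit word
  stored by the program can be reconstructed with shell arithmetic.
  The variable names below are made up for this sketch.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Background and foreground nibbles from the table above.
bg=$((0x1)); fg=$((0xe)); char=$((0x41))

# The background nibble is the high nibble of the attribute byte.
attr=$((bg * 16 + fg))

# The attribute is the high byte of the word, the character the low byte.
word=$((attr * 256 + char))

printf 'attribute = 0x%02x\n' "$attr"
printf 'word      = 0x%04x\n' "$word"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  This should print 0x1e for the attribute and 0x1e41 for the word,
  which is the value the program moves into AX.  Since x86 stores the
  low byte first, the character 0x41 lands in the first location and
  the attribute 0x1e in the second.
&lt;/p&gt;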
&lt;p&gt;
  Before writing the executable binary into the boot sector, we might
  want to verify whether the code works correctly with an emulator.
  DOSBox is a pretty good emulator for this purpose.  It is available
  as the &lt;code&gt;dosbox&lt;/code&gt; package in Debian.  Here is one way to
  run the executable binary file using DOSBox:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dosbox -c cls a.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The letter &lt;code&gt;A&lt;/code&gt; printed in yellow on a blue background
  should appear in the first column of the first row of the screen.
&lt;/p&gt;
&lt;p&gt;
  In the &lt;code&gt;ld&lt;/code&gt; command earlier to generate the executable
  binary, we used the extension name &lt;code&gt;com&lt;/code&gt; for the binary
  file to make DOSBox believe that it is a DOS COM file, i.e. merely
  machine code and data with no headers.  In fact, the &lt;code&gt;--oformat
  binary&lt;/code&gt; option in the &lt;code&gt;ld&lt;/code&gt; command ensures that the
  output file contains only machine code.  This is why we are able to
  run the binary with DOSBox for verification.  If we do not use
  DOSBox, any extension name or no extension name for the binary would
  suffice.
&lt;/p&gt;
&lt;p&gt;
  Once we are satisfied with the output of &lt;code&gt;a.com&lt;/code&gt; running
  in DOSBox, we create a boot image with these commands:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cp a.com a.img
echo 55 aa | xxd -r -p | dd seek=510 bs=1 of=a.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  This boot image can be tested with DOSBox using the following
  command:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dosbox -c cls -c &apos;boot a.img&apos;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Yet another way to test this image would be to make the QEMU x86
  system emulator boot from it.  Here is the command to do so:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;qemu-system-i386 -fda a.img&lt;/code&gt;&lt;/pre&gt;
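&lt;p&gt;
  Before going any further with an image, it may be worth confirming
  that it is exactly 512 bytes long and ends with the bytes 0x55 and
  0xaa.  Here is a minimal sketch that uses a stand-in file:
  &lt;code&gt;demo.img&lt;/code&gt; is a hypothetical name and its first 15 bytes
  are zeros rather than real machine code.  The
  &lt;code&gt;conv=notrunc&lt;/code&gt; operand is an addition made for this
  sketch; it asks &lt;code&gt;dd&lt;/code&gt; to leave the existing bytes of the
  output file in place.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Stand-in for a.com: 15 zero bytes, not real machine code.
dd if=/dev/zero of=demo.img bs=1 count=15

# Write 0x55 0xaa (octal 125 and 252) at offset 510.
printf '\125\252' | dd of=demo.img bs=1 seek=510 conv=notrunc

# Show the file size and the last two bytes.
wc -c demo.img
od -An -tx1 -j510 -N2 demo.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The size reported by &lt;code&gt;wc&lt;/code&gt; should be 512 and the two
  bytes printed by &lt;code&gt;od&lt;/code&gt; should be 55 and aa.  The same two
  checks can be run against a real image by substituting its name.
&lt;/p&gt;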
&lt;p&gt;
  Finally, if you are feeling brave enough, you could write this image
  to the boot sector of an actual physical storage device, such as a
  USB flash drive and then boot your computer with it.  To do so, you
  first need to determine the device file that represents the storage
  device.  There are many ways to do this.  A couple of commands that
  may be helpful to locate the storage device are &lt;code&gt;mount&lt;/code&gt;
  and &lt;code&gt;fdisk -l&lt;/code&gt;.  Assuming that there is a USB flash drive
  at &lt;code&gt;/dev/sdx&lt;/code&gt;, the boot image can be written to its boot
  sector using this command:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cp a.img /dev/sdx&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  &lt;em&gt;
    CAUTION: You need to be absolutely sure of the device path of the
    device being written to.  The device path &lt;code&gt;/dev/sdx&lt;/code&gt; is
    only an example here.  If the boot image is written to the wrong
    device, access to the data on that device could be lost.
  &lt;/em&gt;
&lt;/p&gt;
&lt;p&gt;
  Now booting the computer with this device should display the
  letter &apos;A&apos; in yellow on a blue background.
&lt;/p&gt;
&lt;h2 id=&quot;print-string&quot;&gt;Print String&lt;/h2&gt;
&lt;p&gt;
  The following code prints the string &quot;hello, world&quot; in yellow on a
  blue background:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;.code16

.section .text
.globl _start
_start:
  ljmp $0, $start
start:
  mov $0xb800, %ax
  mov %ax, %ds
  xor %di, %di
  mov $message, %si
  mov $0x1e, %ah
print:
  mov %cs:(%si), %al
  mov %ax, (%di)
  inc %si
  inc %di
  inc %di
  cmp $24, %di
  jne print
idle:
  hlt
  jmp idle

.section .data
message:
  .ascii &quot;hello, world&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The BIOS reads the code from the first sector of the boot device
  into the memory at physical address 0x7c00 and jumps to that
  address.  While most BIOS implementations jump to 0x0000:0x7c00
  (CS:IP) to execute the boot sector code loaded at this address,
  unfortunately there are some BIOS implementations that jump to
  0x07c0:0x0000 instead to reach this address.  We will soon see that
  we are going to use offsets relative to the code segment to locate
  our string and copy it to video memory.  While the physical address
  of the string is always going to be the same regardless of which of
  the two types of BIOS implementations run our program, the offset of
  the string is going to differ based on the BIOS implementation.  If
  the register CS is set to 0 and the register IP is set to 0x7c00
  when the BIOS jumps to our program, the offset of the string is
  going to be greater than 0x7c00.  But if CS and IP are set to 0x07c0
  and 0 respectively, when the BIOS jumps to our program, the offset
  of the string is going to be much smaller.
&lt;/p&gt;
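&lt;p&gt;
  The two segment and offset pairs can be compared with a short
  calculation.  The sketch below relies on the real-mode rule that the
  physical address is the segment multiplied by 16 plus the offset, a
  general x86 fact rather than something specific to this program.
  The function name &lt;code&gt;phys&lt;/code&gt; is made up for this example.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Physical address in real mode: segment * 16 + offset.
phys() { printf '0x%05x\n' $(($1 * 16 + $2)); }

phys 0x0000 0x7c00
phys 0x07c0 0x0000&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Both calls should print 0x07c00.  The physical address is the same
  in both cases; only the split between CS and IP differs, which is
  why offsets relative to CS differ too.
&lt;/p&gt;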
&lt;p&gt;
  We cannot know in advance which type of BIOS implementation is going
  to load our program into memory, so we need to prepare our program
  to handle both scenarios: one in which the BIOS executes our program
  by jumping to 0x0000:0x7c00 as well as the other in which the BIOS
  jumps to 0x07c0:0x0000 to execute our program.  We do this by using
  a very popular technique of setting the register CS to 0 ourselves
  by executing a far jump instruction to the code segment 0.  The very
  first instruction in this program that performs &lt;code&gt;ljmp $0,
  $start&lt;/code&gt; accomplishes this.
&lt;/p&gt;
&lt;p&gt;
  There are two sections in this code.  The text section has the
  executable instructions.  The data section has the string we want to
  print.  The code copies the first byte of the string to the memory
  location 0xb800:0x0000, its attribute to 0xb800:0x0001, the second
  byte of the string to 0xb800:0x0002, its attribute to 0xb800:0x0003
  and so on until it has advanced to 0xb800:0x0018 after having
  written 24 bytes for the 12 characters we need to print.  The
  instruction &lt;code&gt;mov %cs:(%si), %al&lt;/code&gt; copies one character
  from the string indexed by the SI register in the code segment into
  the AL register.  We are reading the characters from the code
  segment because we will place the string in the code segment using
  the linker commands discussed later.
&lt;/p&gt;
&lt;p&gt;
  However, while testing with DOSBox, things are a little different.
  In DOS, the text section is loaded at an offset 0x0100 in the code
  segment.  This should be specified to the linker while linking so
  that it can correctly resolve the value of the label
  named &lt;code&gt;message&lt;/code&gt;.  Therefore we will assemble and link our
  program twice: once for testing it with DOSBox and once again for
  creating the boot image.
&lt;/p&gt;
&lt;p&gt;
  To understand the offset at which the data section can be put, it is
  worth looking at what the binary code looks like after a trial link
  with the following commands:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;as -o hello.o hello.s
ld --oformat binary -Ttext 0 -Tdata 40 -o hello.com hello.o
objdump -bbinary -mi8086 -D hello.com
xxd -g1 hello.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The &lt;code&gt;-Ttext 0&lt;/code&gt; option tells the linker to assume that the
  text section should be loaded at offset 0x0 in the code segment.
  Similarly, the &lt;code&gt;-Tdata 40&lt;/code&gt; tells the linker to assume
  that the data section is at offset 0x40.
&lt;/p&gt;
&lt;p&gt;
  The &lt;code&gt;objdump&lt;/code&gt; command mentioned above disassembles the
  generated binary file.  This shows where the text section and data
  section are placed.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;objdump -bbinary -mi8086 -D hello.com&lt;/kbd&gt;

hello.com:     file format binary


Disassembly of section .data:

00000000 &amp;lt;.data&amp;gt;:
   0:   ea 05 00 00 00          ljmp   $0x0,$0x5
   5:   b8 00 b8                mov    $0xb800,%ax
   8:   8e d8                   mov    %ax,%ds
   a:   31 ff                   xor    %di,%di
   c:   be 40 00                mov    $0x40,%si
   f:   b4 1e                   mov    $0x1e,%ah
  11:   2e 8a 04                mov    %cs:(%si),%al
  14:   89 05                   mov    %ax,(%di)
  16:   46                      inc    %si
  17:   47                      inc    %di
  18:   47                      inc    %di
  19:   83 ff 18                cmp    $0x18,%di
  1c:   75 f3                   jne    0x11
  1e:   f4                      hlt
  1f:   eb fd                   jmp    0x1e
        ...
  3d:   00 00                   add    %al,(%bx,%si)
  3f:   00 68 65                add    %ch,0x65(%bx,%si)
  42:   6c                      insb   (%dx),%es:(%di)
  43:   6c                      insb   (%dx),%es:(%di)
  44:   6f                      outsw  %ds:(%si),(%dx)
  45:   2c 20                   sub    $0x20,%al
  47:   77 6f                   ja     0xb8
  49:   72 6c                   jb     0xb7
  4b:   64                      fs&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Note that the &lt;samp&gt;...&lt;/samp&gt; above indicates zero bytes skipped
  by &lt;code&gt;objdump&lt;/code&gt;.  The text section is above these zero bytes
  and the data section is below them.  Let us also see the output of
  the &lt;code&gt;xxd&lt;/code&gt; command:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;xxd -g1 hello.com&lt;/kbd&gt;
00000000: ea 05 00 00 00 b8 00 b8 8e d8 31 ff be 40 00 b4  ..........1..@..
00000010: 1e 2e 8a 04 89 05 46 47 47 83 ff 18 75 f3 f4 eb  ......FGG...u...
00000020: fd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000040: 68 65 6c 6c 6f 2c 20 77 6f 72 6c 64              hello, world&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Both outputs above show that the text section occupies the first
  0x21 bytes (33 bytes).  The data section is 0xc bytes (12 bytes) in
  length.  Let us create a binary where the region from offset 0x0 to
  offset 0x20 contains the text section and the region from offset
  0x21 to offset 0x2c contains the data section.  The total length of
  the binary would then be 0x2d bytes (45 bytes).  We will create a
  new binary as per this plan.
&lt;/p&gt;
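&lt;p&gt;
  The arithmetic behind this plan can be sketched in the shell.  The
  variable names are illustrative.  The three base values are the load
  offsets that appear in this article: 0x0 for the trial link, 0x100
  for DOS and 0x7c00 for the boot sector.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Section lengths read off the trial link output.
text_len=$((0x21)); data_len=$((0xc))

total=$((text_len + data_len))
printf 'total length: 0x%x (%d bytes)\n' "$total" "$total"

# The data section starts right after the text section.
for base in 0x0 0x100 0x7c00; do
  printf 'text at 0x%x, data at 0x%x\n' $(($base)) $(($base + text_len))
done&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  This should print a total length of 0x2d (45 bytes) and data section
  addresses of 0x21, 0x121 and 0x7c21 for the three load offsets.
&lt;/p&gt;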
&lt;p&gt;
  However, while creating the new binary, we should remember that DOS
  would load the binary at offset 0x100, so we need to tell the linker
  to assume 0x100 as the offset of the text section and 0x121 as the
  offset of the data section, so that it resolves the value of the
  label named &lt;code&gt;message&lt;/code&gt; accordingly.  Moreover, while
  testing with DOS, we must remove the far jump instruction at the top
  of our program because DOS does not load our program at physical
  address 0x7c00 of the memory.  Removing the 5-byte far jump shortens
  the text section, but the data section can remain at 0x121 because
  the linker fills the gap with zero bytes, as it did in the trial
  link.  We create a new binary in this manner with these commands:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;grep -v ljmp hello.s &amp;gt; dos-hello.s
as -o hello.o dos-hello.s
ld --oformat binary -Ttext 100 -Tdata 121 -o hello.com hello.o&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Now we can test this program with DOSBox with the following command:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dosbox -c cls hello.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  If everything looks fine, we assemble and link our program once
  again for the boot sector and create a boot image with these commands:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;as -o hello.o hello.s
ld --oformat binary -Ttext 7c00 -Tdata 7c21 -o hello.img hello.o
echo 55 aa | xxd -r -p | dd seek=510 bs=1 of=hello.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Now we can test this image with DOSBox like this:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dosbox -c cls -c &apos;boot hello.img&apos;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  We can also test the image with QEMU with the following command:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;qemu-system-i386 -fda hello.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Finally, this image can be written to the boot sector as follows:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cp hello.img /dev/sdx&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  &lt;em&gt;
    CAUTION: Again, one needs to be very careful with the commands
    here.  The device path &lt;code&gt;/dev/sdx&lt;/code&gt; is only an example.
    This path must be changed to the path of the actual device one
    wants to write the boot sector binary to.
  &lt;/em&gt;
&lt;/p&gt;
&lt;p&gt;
  Once written to the device successfully, the computer may be booted
  with this device to display the &quot;hello, world&quot; string on the screen.
&lt;/p&gt;
<!-- ### -->
&lt;p&gt;
  &lt;a href=&quot;https://susam.net/writing-boot-sector-code.html&quot;&gt;Read on website&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/assembly.html&quot;&gt;#assembly&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/programming.html&quot;&gt;#programming&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/linux.html&quot;&gt;#linux&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/technology.html&quot;&gt;#technology&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/how-to.html&quot;&gt;#how-to&lt;/a&gt;
&lt;/p&gt;
<!-- END HTML -->
    </content>
  </entry>
  <entry>
    <title>Self-Printing Machine Code</title>
    <link href="https://susam.net/self-printing-machine-code.html"/>
    <id>urn:uuid:cd929f40-02ba-4368-8583-0e0e2374865f</id>
    <updated>2005-10-27T00:00:00Z</updated>
    <content type="html">
<!-- BEGIN HTML -->
&lt;p&gt;
  The following 12-byte program composed of pure x86 machine code
  writes itself to standard output when executed in a DOS environment:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fc b1 0c ac 92 b4 02 cd 21 e2 f8 c3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  We can write these bytes to a file with the .COM extension and
  execute it in DOS.  It runs successfully in MS-DOS 6.22, Windows 98,
  as well as in DOSBox and writes a copy of itself to standard output.
&lt;/p&gt;
&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#demo&quot;&gt;Demo&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#quine-conundrums&quot;&gt;Quine Conundrums&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#proper-quines&quot;&gt;Proper Quines&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#a-note-on-dos-services&quot;&gt;A Note on DOS Services&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#writing-to-video-memory-directly&quot;&gt;Writing to Video Memory Directly&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#boot-program&quot;&gt;Boot Program&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;demo&quot;&gt;Demo&lt;/h2&gt;
&lt;p&gt;
  On a Unix or Linux system, the following commands demonstrate this
  program with the help of DOSBox:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo fc b1 0c ac 92 b4 02 cd 21 e2 f8 c3 | xxd -r -p &amp;gt; foo.com
dosbox -c &apos;MOUNT C .&apos; -c &apos;C:\FOO &amp;gt; C:\OUT.COM&apos; -c &apos;EXIT&apos;
diff foo.com OUT.COM&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The &lt;code&gt;diff&lt;/code&gt; command should produce no output, confirming
  that the output of the program is identical to the program itself.
  On an actual MS-DOS 6.22 system or a Windows 98 system, we can
  demonstrate this program in the following manner:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG&lt;/kbd&gt;
-&lt;kbd&gt;E 100 fc b1 0c ac 92 b4 02 cd 21 e2 f8 c3&lt;/kbd&gt;
-&lt;kbd&gt;N FOO.COM&lt;/kbd&gt;
-&lt;kbd&gt;R CX&lt;/kbd&gt;
CX 0000
:&lt;kbd&gt;C&lt;/kbd&gt;
-&lt;kbd&gt;W&lt;/kbd&gt;
Writing 0000C bytes
-&lt;kbd&gt;Q&lt;/kbd&gt;

C:\&amp;gt;&lt;kbd&gt;FOO &amp;gt; OUT.COM&lt;/kbd&gt;

C:\&amp;gt;&lt;kbd&gt;FC FOO.COM OUT.COM&lt;/kbd&gt;
Comparing files FOO.COM and OUT.COM
FC: no differences encountered&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  In the &lt;code&gt;DEBUG&lt;/code&gt; session shown above, we use the debugger
  command &lt;code&gt;E&lt;/code&gt; to enter the machine code at offset 0x100 of
  the code segment.  Then we use the &lt;code&gt;N&lt;/code&gt; command to name
  the file we want to write this machine code to.  The command &lt;code&gt;R
  CX&lt;/code&gt; is used to specify that we want to write 0xC (decimal 12)
  bytes to this file.  The &lt;code&gt;W&lt;/code&gt; command writes the 12 bytes
  entered at offset 0x100.  The &lt;code&gt;Q&lt;/code&gt; command quits the
  debugger.  Then we run the new &lt;code&gt;FOO.COM&lt;/code&gt; program while
  redirecting its output to &lt;code&gt;OUT.COM&lt;/code&gt;.  Finally, we use
  the &lt;code&gt;FC&lt;/code&gt; command to compare the two files and confirm
  that they are exactly the same.
&lt;/p&gt;
&lt;p&gt;
  Let us disassemble this program now and see what it does.  The
  output below is generated using the Netwide Disassembler (NDISASM),
  a tool that comes with Netwide Assembler (NASM):
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x100 foo.com&lt;/kbd&gt;
00000100  FC                cld
00000101  B10C              mov cl,0xc
00000103  AC                lodsb
00000104  92                xchg ax,dx
00000105  B402              mov ah,0x2
00000107  CD21              int 0x21
00000109  E2F8              loop 0x103
0000010B  C3                ret&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  When DOS executes a program in a .COM file, it loads the machine code
  in the file at offset 0x100 of the code segment chosen by DOS.  That
  is why we ask the disassembler to assume a load address of 0x100
  with the &lt;code&gt;-o&lt;/code&gt; command line option.  The first instruction
  clears the direction flag.  The purpose of this instruction is
  explained later.  The next instruction sets the register CL to 0xc
  (decimal 12).  The register CH is already set to 0 by default when a
  .COM program starts.  Thus setting the register CL to 0xc
  effectively sets the entire register CX to 0xc.  The register CX is
  used as a loop counter for the &lt;code&gt;loop 0x103&lt;/code&gt; instruction
  that comes later.  Every time this loop instruction executes, it
  decrements CX and makes a near jump to offset 0x103 if CX is not 0.
  This results in 12 iterations of the loop.
&lt;/p&gt;
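&lt;p&gt;
  The jump target shown in the disassembly can be derived from the
  encoded bytes.  The sketch below assumes the usual x86 rule for
  short jumps: the second byte is a signed 8-bit displacement added to
  the address of the next instruction.  The function name
  &lt;code&gt;target&lt;/code&gt; is made up for this example.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Add a signed 8-bit displacement to the address of the next instruction.
target() {
  next=$(($1)); disp=$(($2))
  # Values of 128 and above represent negative displacements.
  if [ "$disp" -ge 128 ]; then disp=$((disp - 256)); fi
  printf '0x%x\n' $((next + disp))
}

# The loop instruction is encoded as e2 f8 at offset 0x109,
# so the next instruction is at offset 0x10b.
target 0x10b 0xf8&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  This should print 0x103, the offset of the &lt;code&gt;lodsb&lt;/code&gt;
  instruction, since 0xf8 read as a signed byte is -8.
&lt;/p&gt;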
&lt;p&gt;
  In each iteration of the loop, the instructions from offset 0x103 to
  offset 0x109 are executed.  The &lt;code&gt;lodsb&lt;/code&gt; instruction loads
  a byte from address DS:SI into AL.  When DOS starts executing this
  program, DS and SI are set to CS and 0x100 by default, so at the
  beginning DS:SI points to the first byte of the program.
  The &lt;code&gt;xchg&lt;/code&gt; instruction exchanges the values in AX and DX.
  Thus the byte we just loaded into AL ends up in DL.  Then we set AH
  to 2 and generate the software interrupt 0x21 (decimal 33) to write
  the byte in DL to standard output.  This is how each iteration reads
  a byte of this program and writes it to standard output.
&lt;/p&gt;
&lt;p&gt;
  The &lt;code&gt;lodsb&lt;/code&gt; instruction increments or decrements SI
  depending on the state of the direction flag (DF).  When DF is
  cleared, it increments SI.  If DF is set, it decrements SI.  We use
  the &lt;code&gt;cld&lt;/code&gt; instruction at the beginning to clear DF, so
  that in each iteration of the loop, SI moves forward to point to the
  next byte of the program.  This is how the 12 iterations of the loop
  write 12 bytes of the program to standard output.  In many DOS
  environments, the DF flag is already in cleared state when a .COM
  program starts, so the CLD instruction could be omitted in such
  environments.  However, there are some environments where DF may not
  be in cleared state when our program starts, so it is a best
  practice to clear DF before relying on it.
&lt;/p&gt;
&lt;p&gt;
  Finally, when the loop terminates, we execute the &lt;code&gt;RET&lt;/code&gt;
  instruction to terminate the program.
&lt;/p&gt;
&lt;h2 id=&quot;quine-conundrums&quot;&gt;Quine Conundrums&lt;/h2&gt;
&lt;p&gt;
  While reading the description of the self-printing program presented
  earlier, one might wonder if it is a quine.  While there is no
  standardised definition of the term &lt;em&gt;quine&lt;/em&gt;, it is generally
  accepted that a quine is a computer program that takes no input and
  produces an exact copy of its own source code as its output.  Since
  a quine cannot take any input, tricks involving reading its own
  source code or evaluating itself are ruled out.
&lt;/p&gt;
&lt;p&gt;
  For example, this shell script is a valid quine:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;s=&apos;s=\47%s\47;printf &quot;$s&quot; &quot;$s&quot;\n&apos;;printf &quot;$s&quot; &quot;$s&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  However, the following shell script is not considered a proper
  quine:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cat $0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The shell script above reads its own source code which is considered
  cheating.  Improper quines like this are often called &lt;em&gt;cheating
  quines&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;
  Is our 12-byte x86 program a quine?  It turns out that we have a
  conundrum.  There is no notion of source code for our program.
  There would have been one if we had written out the source code of
  this program in assembly language.  In such a case we would first
  need to choose an assembler and a proper quine would need to produce
  an exact copy of the assembly language source code (not the machine
  code bytes) for the chosen assembler.  But we are not doing that
  here.  We want the machine code to produce an exact copy of itself.
  There is no source code involved.  We only have machine code.  So we
  could argue that the whole notion of a machine code quine is nonsense.
  No machine code quine can exist because there is no source code to
  produce as output.
&lt;/p&gt;
&lt;p&gt;
  However, we could also argue that the machine code is the input for
  the CPU that the CPU fetches, decodes and converts to a sequence of
  state changes in the CPU.  If we define a machine code quine to be a
  machine code program that writes its own bytes, then we could say
  that we have a machine code quine here.
&lt;/p&gt;
&lt;p&gt;
  Let us now entertain the thought that our 12-byte program is indeed
  a machine code quine.  Now we have a new conundrum.  Is it a proper
  quine?  This program reads its own bytes from memory and writes
  them.  Does that make it a cheating quine?  What would a proper
  quine written in pure machine code even look like?  If we look at
  the shell script quine above, we see that it contains parts of the
  executable part of the script code embedded in a string as data.
  Then we format the string cleverly to produce a new string that
  looks exactly like the entire shell script.  It is a common pattern
  followed in many quines.  The quine does not read its own code but
  it reads some data defined by the code and formats that data to look
  like its own code.  However, in pure machine code like this the
  lines between data and code are blurred.  Even if we try to keep the
  bytes we want to read at a separate place in the memory and treat it
  like data, they would look exactly like machine instructions, so one
  might wonder if there is any point in trying to make a machine quine
  that does not read its own bytes.  Nevertheless the next section
  shows how to accomplish this.
&lt;/p&gt;
&lt;h2 id=&quot;proper-quines&quot;&gt;Proper Quines&lt;/h2&gt;
&lt;p&gt;
  If the thought of a machine code quine program reading its own bytes
  from the memory makes you uncomfortable, here is an adaptation of the
  previous program that keeps the machine instructions to be executed
  separate from the data bytes to be read by the program.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fc b3 02 b1 14 be 14 01 ac 92 b4 02 cd 21 e2 f8 4b 75 f0 c3
fc b3 02 b1 14 be 14 01 ac 92 b4 02 cd 21 e2 f8 4b 75 f0 c3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is how we can demonstrate this 40-byte program:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo fc b3 02 b1 14 be 14 01 ac 92 b4 02 cd 21 e2 f8 4b 75 f0 c3 | xxd -r -p &amp;gt; foo.com
echo fc b3 02 b1 14 be 14 01 ac 92 b4 02 cd 21 e2 f8 4b 75 f0 c3 | xxd -r -p &amp;gt;&amp;gt; foo.com
dosbox -c &apos;MOUNT C .&apos; -c &apos;C:\FOO &amp;gt; C:\OUT.COM&apos; -c &apos;EXIT&apos;
diff foo.com OUT.COM&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is the disassembly:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x100 foo.com&lt;/kbd&gt;
00000100  FC                cld
00000101  B302              mov bl,0x2
00000103  B114              mov cl,0x14
00000105  BE1401            mov si,0x114
00000108  AC                lodsb
00000109  92                xchg ax,dx
0000010A  B402              mov ah,0x2
0000010C  CD21              int 0x21
0000010E  E2F8              loop 0x108
00000110  4B                dec bx
00000111  75F0              jnz 0x103
00000113  C3                ret
00000114  FC                cld
00000115  B302              mov bl,0x2
00000117  B114              mov cl,0x14
00000119  BE1401            mov si,0x114
0000011C  AC                lodsb
0000011D  92                xchg ax,dx
0000011E  B402              mov ah,0x2
00000120  CD21              int 0x21
00000122  E2F8              loop 0x11c
00000124  4B                dec bx
00000125  75F0              jnz 0x117
00000127  C3                ret&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The first 20 bytes are the executable part of the program.  The next
  20 bytes are the data read by the program.  The executable bytes are
  identical to the data bytes.  The executable part of the program has
  an outer loop that iterates twice.  In each iteration, it reads the
  data bytes and writes them to standard output.  Therefore, in two
  iterations of the outer loop, it writes the data bytes twice.  In
  this manner, the output is identical to the program itself.
&lt;/p&gt;
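&lt;p&gt;
  The control flow described above can be modelled in a few lines of
  Python.  This is only an illustrative sketch and not part of the
  toolchain used in this article.  The names &lt;code&gt;half&lt;/code&gt;,
  &lt;code&gt;program&lt;/code&gt; and &lt;code&gt;output&lt;/code&gt; are made up for this
  example, and the two loops merely imitate what the registers BL, CL
  and SI do in the machine code.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# The 40-byte program: 20 executable bytes followed by 20 data bytes.
half = bytes.fromhex('fc b3 02 b1 14 be 14 01 ac 92 b4 02 cd 21 e2 f8 4b 75 f0 c3')
program = half + half

load_offset = 0x100                 # a .COM program is loaded at offset 0x100
output = bytearray()
for _ in range(2):                  # outer loop: mov bl,0x2 ... dec bx, jnz
    si = 0x114 - load_offset        # mov si,0x114 points at the data bytes
    for _ in range(0x14):           # inner loop: mov cl,0x14 ... loop
        output.append(program[si])  # lodsb, then int 0x21 with AH set to 2
        si += 1

print(len(program), bytes(output) == program)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  If the model is faithful to the machine code, it should report a
  length of 40 and confirm that the simulated output equals the
  program.
&lt;/p&gt;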
&lt;p&gt;
  Here is another simpler 32-byte quine based on this approach:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;b8 23 09 fe c0 a2 20 01 ba 10 01 cd 21 cd 21 c3
b8 23 09 fe c0 a2 20 01 ba 10 01 cd 21 cd 21 c3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here are the commands to demonstrate this quine:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo b8 23 09 fe c0 a2 20 01 ba 10 01 cd 21 cd 21 c3 | xxd -r -p &amp;gt; foo.com
echo b8 23 09 fe c0 a2 20 01 ba 10 01 cd 21 cd 21 c3 | xxd -r -p &amp;gt;&amp;gt; foo.com
dosbox -c &apos;MOUNT C .&apos; -c &apos;C:\FOO &amp;gt; C:\OUT.COM&apos; -c &apos;EXIT&apos;
diff foo.com OUT.COM&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is the disassembly:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x100 foo.com&lt;/kbd&gt;
00000100  B82309            mov ax,0x923
00000103  FEC0              inc al
00000105  A22001            mov [0x120],al
00000108  BA1001            mov dx,0x110
0000010B  CD21              int 0x21
0000010D  CD21              int 0x21
0000010F  C3                ret
00000110  B82309            mov ax,0x923
00000113  FEC0              inc al
00000115  A22001            mov [0x120],al
00000118  BA1001            mov dx,0x110
0000011B  CD21              int 0x21
0000011D  CD21              int 0x21
0000011F  C3                ret&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  This example too has two parts.  The first half has the executable
  bytes and the second half has the data bytes.  Both parts are
  identical.  This example sets AH to 9 in the first instruction and
  then later uses &lt;code&gt;int 0x21&lt;/code&gt; to invoke the DOS service that
  prints a dollar-terminated string beginning at the address specified
  in DS:DX.  When a .COM program starts, DS already points to the
  current code segment, so we don&apos;t have to set it explicitly.  The
  dollar symbol has an ASCII code of 0x24 (decimal 36).  We need to be
  careful about not having this value anywhere within the data
  bytes or this DOS function would prematurely stop printing our data
  bytes as soon as it encounters this value.  That is why we set AL to
  0x23 in the first instruction, then increment it to 0x24 in the
  second instruction and then copy this value to the end of the data
  bytes in the third instruction.  Finally, we execute &lt;code&gt;int
  0x21&lt;/code&gt; twice to write the data bytes twice to standard output,
  so that the output matches the program itself.
&lt;/p&gt;
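&lt;p&gt;
  The role of the dollar terminator can be checked with a small Python
  model.  This is a hedged sketch rather than a faithful emulation of
  DOS: the helper &lt;code&gt;dollar_string&lt;/code&gt; and the
  &lt;code&gt;memory&lt;/code&gt; array are hypothetical names introduced only
  for this example, and the byte at offset 0x120 is assumed to be
  writable scratch space just past the end of the program.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;half = bytes.fromhex('b8 23 09 fe c0 a2 20 01 ba 10 01 cd 21 cd 21 c3')
program = half + half

# The terminator value 0x24 must not occur anywhere in the program bytes.
assert 0x24 not in program

# Model the memory from offset 0x100 onwards, with one spare byte at 0x120.
memory = bytearray(program) + bytearray(1)
al = 0x23                        # mov ax,0x923 sets AL to 0x23 and AH to 9
al += 1                          # inc al
memory[0x120 - 0x100] = al       # mov [0x120],al

def dollar_string(mem, start):
    # Return the bytes from start up to, but excluding, the first 0x24.
    return bytes(mem[start:mem.index(0x24, start)])

output = dollar_string(memory, 0x110 - 0x100) * 2   # int 0x21 executed twice
print(output == program)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Had the first instruction loaded 0x24 into AL directly, the
  assertion near the top of this sketch would fail, which mirrors the
  reason for the &lt;code&gt;inc al&lt;/code&gt; instruction in the program.
&lt;/p&gt;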
&lt;p&gt;
  While both these programs take care not to read the same memory
  region that is being executed by the CPU, the data bytes they read
  look exactly like the executable bytes.  This is what I meant when I
  mentioned earlier that the lines between code and data are blurred
  in an exercise like this.  This is why I don&apos;t really see a point in
  keeping the executable bytes separate from the data bytes while
  writing machine code quines.
&lt;/p&gt;
&lt;h2 id=&quot;a-note-on-dos-services&quot;&gt;A Note on DOS Services&lt;/h2&gt;
&lt;p&gt;
  The self-printing programs presented above use &lt;code&gt;int 0x21&lt;/code&gt;
  which offers DOS services that support various input/output
  functions.  In the first two programs, we selected the function to
  write a character to standard output by setting AH to 2 before
  invoking this software interrupt.  In the next program, we selected
  the function to write a dollar-terminated string to standard output
  by setting AH to 9.
&lt;/p&gt;
&lt;p&gt;
  The &lt;code&gt;ret&lt;/code&gt; instruction in the end too relies on DOS
  services.  When a .COM program starts, the register SP contains
  0xfffe.  The stack memory locations at offset 0xfffe and 0xffff
  contain 0x00 and 0x00 respectively.  Further, the memory address at
  offset 0x0000 contains the instruction &lt;code&gt;int 0x20&lt;/code&gt; which
  is a DOS service that terminates the program.  As a result,
  executing the &lt;code&gt;ret&lt;/code&gt; instruction pops 0x0000 off the stack
  at 0xfffe and loads it into IP.  This results in the
  instruction &lt;code&gt;int 0x20&lt;/code&gt; at offset 0x0000 getting executed.
  This instruction terminates the program and returns to DOS.
&lt;/p&gt;
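&lt;p&gt;
  As a rough illustration of the paragraph above, the following Python
  sketch models the segment of a .COM program as a zero-filled array
  and imitates what &lt;code&gt;ret&lt;/code&gt; does.  The variable names are
  invented for this example and everything except the two bytes
  of &lt;code&gt;int 0x20&lt;/code&gt; at offset 0x0000 is simply assumed to be
  zero.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;segment = bytearray(0x10000)            # one 64 KB segment, zero-filled
segment[0:2] = bytes.fromhex('cd 20')   # int 0x20 at offset 0x0000
sp = 0xfffe                             # SP when a .COM program starts

# ret: pop a little-endian word off the stack and load it into IP.
ip = int.from_bytes(segment[sp:sp + 2], 'little')
sp = (sp + 2) % 0x10000

print(hex(ip), segment[ip:ip + 2].hex())&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  In this model, IP ends up as 0 and the two bytes found there are
  0xcd and 0x20, that is, the &lt;code&gt;int 0x20&lt;/code&gt; instruction.
&lt;/p&gt;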
&lt;p&gt;
  Relying on DOS services gives us a comfortable environment to work
  with.  In particular, DOS implements the notion of &lt;em&gt;standard
  output&lt;/em&gt; which lets us redirect standard output to a file.  This
  lets us conveniently compare the original program file and the
  output file with the &lt;code&gt;FC&lt;/code&gt; command and confirm that they
  are identical.
&lt;/p&gt;
&lt;p&gt;
  But one might wonder if we could avoid relying on DOS services
  completely and still write a program that prints its own bytes to
  screen.  We definitely can.  We could write directly to video memory
  at address 0xb800:0x0000 and show the bytes of the program on
  screen.  We could also forgo DOS completely and let BIOS load our
  program from the boot sector and execute it.  The next two sections
  discuss these things.
&lt;/p&gt;
&lt;h2 id=&quot;writing-to-video-memory-directly&quot;&gt;Writing to Video Memory Directly&lt;/h2&gt;
&lt;p&gt;
  Here is an example of an 18-byte self-printing program that writes
  directly to the video memory at address 0xb800:0x0000.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fc b4 b8 8e c0 31 ff b1 12 b4 0a ac ab e2 fc f4 eb fd&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here are the commands to create and run this program:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo fc b4 b8 8e c0 31 ff b1 12 b4 0a ac ab e2 fc f4 eb fd | xxd -r -p &amp;gt; foo.com
dosbox foo.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  With the default code page active, i.e. with code page 437 active,
  the program should display an output that looks approximately like
  the following and halt:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;&amp;#x207F;&amp;#x2524;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#x31;&amp;#xA0;&amp;#x2592;&amp;#x2195;&amp;#x2524;&amp;#x25D9;&amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x2320;&amp;#x3B4;&amp;#xB2;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Now of course this type of output looks like gibberish, but there is a
  quick and dirty way to confirm that this output indeed represents
  the bytes of our program.  We can use the &lt;code&gt;TYPE&lt;/code&gt; command
  of DOS to print the program and check if the symbols that appear in
  its output seem consistent with the output above.  Here is an
  example:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;TYPE FOO.COM&lt;/kbd&gt;
&amp;#x207F;&amp;#x2524;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#x31;&amp;#xA0;&amp;#x2592;&amp;#x2195;&amp;#x2524;
          &amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x2320;&amp;#x3B4;&amp;#xB2;
C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  This output looks very similar to the previous one except that the
  byte value 0x0a is rendered as a line break in this output whereas
  in the previous output this byte value is represented as a circle in
  a box.  This method of visually inspecting the output would not have
  worked very well if there were any control characters such as
  backspace or carriage return that result in characters being erased
  in the displayed output.
&lt;/p&gt;
&lt;p&gt;
  A proper way to verify that the output of the program represents the
  bytes of the program would be to take each symbol from the output of
  the program, look it up in a chart for code page 437 and confirm
  that its byte value matches the corresponding byte of the
  program.  Here is one such chart that approximates
  the symbols in code page 437 with Unicode
  symbols: &lt;a href=&quot;code/cp437/cp437.html&quot;&gt;cp437.html&lt;/a&gt;.
&lt;/p&gt;
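&lt;p&gt;
  The lookup can also be automated.  The following Python sketch is
  one possible way to do it and relies on the &lt;code&gt;cp437&lt;/code&gt;
  codec that ships with Python.  That codec decodes the byte values
  0x00 to 0x1f as control characters rather than as the symbols shown
  on screen, so the small &lt;code&gt;glyphs&lt;/code&gt; table (a name made up
  for this example) supplies the two symbols that matter for this
  program, taken from the output shown earlier in this section.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;program = bytes.fromhex('fc b4 b8 8e c0 31 ff b1 12 b4 0a ac ab e2 fc f4 eb fd')

# On-screen symbols for the two bytes below 0x20 that occur in the program.
glyphs = {0x0a: 0x25d9, 0x12: 0x2195}

# Unicode code point that approximates the on-screen symbol for each byte.
code_points = [glyphs.get(b, ord(bytes([b]).decode('cp437'))) for b in program]
print(' '.join(format(cp, 'x') for cp in code_points))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The printed code points can then be compared, one by one, with the
  symbols that appear on screen.
&lt;/p&gt;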
&lt;p&gt;
  Here is the disassembly of the above program:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x100 foo.com&lt;/kbd&gt;
00000100  FC                cld
00000101  B4B8              mov ah,0xb8
00000103  8EC0              mov es,ax
00000105  31FF              xor di,di
00000107  B112              mov cl,0x12
00000109  B40A              mov ah,0xa
0000010B  AC                lodsb
0000010C  AB                stosw
0000010D  E2FC              loop 0x10b
0000010F  F4                hlt
00000110  EBFD              jmp short 0x10f&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  This program sets ES to 0xb800 and DI to 0.  Thus ES:DI points to
  the video memory at address 0xb800:0x0000.  DS:SI points to the
  first instruction of this program by default.  Further, AH is set to
  0xa.  This is used to specify the colour attribute of the text to be
  displayed on screen.  Each iteration of the loop in this program
  loads a byte of the program and writes it along with the colour
  attribute to video memory.  The &lt;code&gt;lodsb&lt;/code&gt; instruction loads
  a byte of the program from the memory address specified by DS:SI
  into AL and increments SI by 1.  AH is already set to 0xa.  The
  value 0xa (binary 00001010) here specifies black as the background
  colour and bright green as the foreground colour.
  The &lt;code&gt;stosw&lt;/code&gt; instruction stores a word from AX to the
  memory address specified by ES:DI and increments DI by 2.  In this
  manner, the byte in AL and its colour attribute in AH get copied to
  the video memory.
&lt;/p&gt;
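&lt;p&gt;
  To make the effect of &lt;code&gt;lodsb&lt;/code&gt; and &lt;code&gt;stosw&lt;/code&gt;
  more concrete, here is a small Python model of the loop.  It is only
  a sketch: the &lt;code&gt;video&lt;/code&gt; array stands in for the memory at
  0xb800:0x0000 and assumes an 80 by 25 text mode with two bytes per
  character cell, and the variable names simply mirror the registers.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;program = bytes.fromhex('fc b4 b8 8e c0 31 ff b1 12 b4 0a ac ab e2 fc f4 eb fd')
video = bytearray(80 * 25 * 2)   # assumed 80x25 text mode, 2 bytes per cell

si, di, ah = 0, 0, 0x0a
for _ in range(0x12):            # mov cl,0x12 ... loop
    al = program[si]             # lodsb: load the byte at DS:SI into AL
    si += 1                      # lodsb: increment SI by 1
    video[di] = al               # stosw: low byte of AX is the character
    video[di + 1] = ah           # stosw: high byte of AX is the attribute
    di += 2                      # stosw: increment DI by 2

# The high nibble of the attribute is the background, the low nibble the foreground.
background, foreground = divmod(ah, 16)
print(bytes(video[0:36:2]) == program, background, foreground)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  In this model, the even offsets of the video memory hold the bytes
  of the program and the odd offsets hold the attribute 0xa, which
  splits into background colour 0 (black) and foreground colour 10
  (bright green).
&lt;/p&gt;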
&lt;p&gt;
  Once again, if you are not happy about the program reading its own
  executable bytes, we can keep the bytes we read separate from the
  bytes the CPU executes.  Here is a 54-byte program that does this:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fc b3 02 b4 b8 8e c0 31 ff be 1b 01 b9 1b 00 b4
0a ac ab e2 fc 4b 75 f1 f4 eb fd fc b3 02 b4 b8
8e c0 31 ff be 1b 01 b9 1b 00 b4 0a ac ab e2 fc
4b 75 f1 f4 eb fd&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is how we can create and run this program:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo fc b3 02 b4 b8 8e c0 31 ff be 1b 01 b9 1b 00 b4 | xxd -r -p &amp;gt; foo.com
echo 0a ac ab e2 fc 4b 75 f1 f4 eb fd fc b3 02 b4 b8 | xxd -r -p &amp;gt;&amp;gt; foo.com
echo 8e c0 31 ff be 1b 01 b9 1b 00 b4 0a ac ab e2 fc | xxd -r -p &amp;gt;&amp;gt; foo.com
echo 4b 75 f1 f4 eb fd | xxd -r -p &amp;gt;&amp;gt; foo.com
dosbox foo.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  With code page 437 active, the output should look approximately like
  this:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;&amp;#x207F;&amp;#x2502;&amp;#x263B;&amp;#x2524;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#x31;&amp;#xA0;&amp;#x255B;&amp;#x2190;&amp;#x263A;&amp;#x2563;&amp;#x2190;&amp;#x20;&amp;#x2524;&amp;#x25D9;&amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x4B;&amp;#x75;&amp;#xB1;&amp;#x2320;&amp;#x3B4;&amp;#xB2;&amp;#x207F;&amp;#x2502;&amp;#x263B;&amp;#x2524;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#x31;&amp;#xA0;&amp;#x255B;&amp;#x2190;&amp;#x263A;&amp;#x2563;&amp;#x2190;&amp;#x20;&amp;#x2524;&amp;#x25D9;&amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x4B;&amp;#x75;&amp;#xB1;&amp;#x2320;&amp;#x3B4;&amp;#xB2;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  We can clearly see in this output that the first 27 bytes of output
  are identical to the next 27 bytes of the output.  Like the proper
  quines discussed earlier, this one too has two halves that are
  identical to each other.  The executable code in the first half
  reads the data bytes from the second half and prints the data bytes
  twice so that the output is an exact copy of all 54 bytes in
  the program.  Here is the disassembly:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x100 foo.com&lt;/kbd&gt;
00000100  FC                cld
00000101  B302              mov bl,0x2
00000103  B4B8              mov ah,0xb8
00000105  8EC0              mov es,ax
00000107  31FF              xor di,di
00000109  BE1B01            mov si,0x11b
0000010C  B91B00            mov cx,0x1b
0000010F  B40A              mov ah,0xa
00000111  AC                lodsb
00000112  AB                stosw
00000113  E2FC              loop 0x111
00000115  4B                dec bx
00000116  75F1              jnz 0x109
00000118  F4                hlt
00000119  EBFD              jmp short 0x118
0000011B  FC                cld
0000011C  B302              mov bl,0x2
0000011E  B4B8              mov ah,0xb8
00000120  8EC0              mov es,ax
00000122  31FF              xor di,di
00000124  BE1B01            mov si,0x11b
00000127  B91B00            mov cx,0x1b
0000012A  B40A              mov ah,0xa
0000012C  AC                lodsb
0000012D  AB                stosw
0000012E  E2FC              loop 0x12c
00000130  4B                dec bx
00000131  75F1              jnz 0x124
00000133  F4                hlt
00000134  EBFD              jmp short 0x133&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  This disassembly is rather long but we can clearly see that the
  bytes from offset 0x100 to offset 0x11a are identical to the bytes
  from offset 0x11b to 0x135.  These are the bytes we see in the
  output of the program too.
&lt;/p&gt;
&lt;h2 id=&quot;boot-program&quot;&gt;Boot Program&lt;/h2&gt;
&lt;p&gt;
  The 32-byte program below writes itself to video memory when
  executed from the boot sector:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ea 05 7c 00 00 fc b8 00 b8 8e c0 8c c8 8e d8 31
ff be 00 7c b9 20 00 b4 0a ac ab e2 fc f4 eb fd&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  We can create a boot image that contains these bytes, write it to
  the boot sector of a drive and boot an IBM PC compatible computer
  with it.  On booting, this program prints its own bytes on the
  screen.
&lt;/p&gt;
&lt;p&gt;
  On a Unix or Linux system, the following commands can be used to
  create a boot image with the above program:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo ea 05 7c 00 00 fc b8 00 b8 8e c0 8c c8 8e d8 31 | xxd -r -p &amp;gt; boot.img
echo ff be 00 7c b9 20 00 b4 0a ac ab e2 fc f4 eb fd | xxd -r -p &amp;gt;&amp;gt; boot.img
echo 55 aa | xxd -r -p | dd seek=510 bs=1 of=boot.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Now we can test this boot image using DOSBox with the following
  command:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dosbox -c cls -c &apos;boot boot.img&apos;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  We can also test this image using the QEMU x86 system emulator as
  follows:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;qemu-system-i386 -fda boot.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  We could also write this image to the boot sector of an actual
  physical storage device, such as a USB flash drive and then boot the
  computer with it.  Here is an example command that writes the boot
  image to the drive represented by the device
  path &lt;code&gt;/dev/sdx&lt;/code&gt;.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cp boot.img /dev/sdx&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  &lt;em&gt;
    CAUTION: You need to be absolutely sure of the device path of the
    device being written to.  The device path &lt;code&gt;/dev/sdx&lt;/code&gt; is
    only an example here.  If the boot image is written to the wrong
    device, access to the data on that device would be lost.
  &lt;/em&gt;
&lt;/p&gt;
&lt;p&gt;
  On testing this boot image with an emulator or a real computer, the
  output should look approximately like this:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;&amp;#x3A9;&amp;#x2663;&amp;#x7C;&amp;#x20;&amp;#x20;&amp;#x207F;&amp;#x2555;&amp;#x20;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#xEE;&amp;#x255A;&amp;#xC4;&amp;#x256A;&amp;#x31;&amp;#xA0;&amp;#x255B;&amp;#x20;&amp;#x7C;&amp;#x2563;&amp;#x20;&amp;#x20;&amp;#x2524;&amp;#x25D9;&amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x2320;&amp;#x3B4;&amp;#xB2;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  This looks like gibberish; however, every symbol in the above output
  corresponds to a byte of the program mentioned earlier.  For
  example, the first symbol (omega) represents the byte value 0xea,
  the second symbol (club) represents the byte value 0x05 and so on.
  The chart at &lt;a href=&quot;code/cp437/cp437.html&quot;&gt;cp437.html&lt;/a&gt; can be
  used to confirm that each symbol in the output indeed represents
  the corresponding byte of the program.
&lt;/p&gt;
&lt;p&gt;
  Here is the disassembly of the program:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x7c00 boot.img&lt;/kbd&gt;
00007C00  EA057C0000        jmp 0x0:0x7c05
00007C05  FC                cld
00007C06  B800B8            mov ax,0xb800
00007C09  8EC0              mov es,ax
00007C0B  8CC8              mov ax,cs
00007C0D  8ED8              mov ds,ax
00007C0F  31FF              xor di,di
00007C11  BE007C            mov si,0x7c00
00007C14  B92000            mov cx,0x20
00007C17  B40A              mov ah,0xa
00007C19  AC                lodsb
00007C1A  AB                stosw
00007C1B  E2FC              loop 0x7c19
00007C1D  F4                hlt
00007C1E  EBFD              jmp short 0x7c1d
00007C20  0000              add [bx+si],al
00007C22  0000              add [bx+si],al
...&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The ellipsis at the end represents the remaining bytes, which
  consist of zeroes followed by the boot sector magic bytes 0x55 and
  0xaa.  They have been omitted here for the sake of brevity.
&lt;/p&gt;
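&lt;p&gt;
  The layout of the boot image can be summarised with a short Python
  sketch.  It builds the 512 bytes in memory only and is meant as an
  illustration of the layout, not as a replacement for the commands
  shown earlier.  The name &lt;code&gt;image&lt;/code&gt; is made up for this
  example.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;program = bytes.fromhex(
    'ea 05 7c 00 00 fc b8 00 b8 8e c0 8c c8 8e d8 31'
    'ff be 00 7c b9 20 00 b4 0a ac ab e2 fc f4 eb fd'
)

image = bytearray(512)                    # one boot sector, zero-filled
image[0:len(program)] = program           # program at the start of the sector
image[510:512] = bytes.fromhex('55 aa')   # boot sector magic bytes at the end

print(len(image), image[32:510].count(0), image[510:].hex())&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The 32 bytes of the program are followed by 478 zero bytes and then
  the two magic bytes, for a total of 512 bytes.
&lt;/p&gt;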
&lt;p&gt;
  When a computer boots, the BIOS reads the boot sector code from the
  first sector of the boot device into the memory at physical address
  0x7c00 and jumps to this address.  Most BIOS implementations jump to
  0x0000:0x7c00 but there are some implementations that jump to
  0x07c0:0x0000 instead.  Both these jumps are jumps to the same
  physical address 0x7c00 but this difference poses a problem for us
  because the offsets in our program depend on which jump the BIOS
  executed.  In order to ensure that our program can run with both
  types of BIOS implementations, we use a popular trick of having the
  first instruction of our program execute a jump to address
  0x0000:0x7c05 in order to reach the second instruction.  This sets
  the register CS to 0 and IP to 0x7c05 and we don&apos;t have to worry
  about the differences between BIOS implementations anymore.  We can
  now pretend as if a BIOS implementation that jumps to 0x0000:0x7c00
  is going to load our program.
&lt;/p&gt;
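&lt;p&gt;
  The claim that both jumps reach the same physical address follows
  from how real mode forms an address from a segment and an offset.
  Here is a minimal Python sketch of that arithmetic.  The function
  name &lt;code&gt;physical&lt;/code&gt; is hypothetical and the sketch ignores
  the wrap-around that occurs beyond 20 bits on older processors.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def physical(segment, offset):
    # Real-mode address translation: shift the segment left by 4 bits
    # (multiply by 16) and add the offset.
    return segment * 16 + offset

for segment, offset in [(0x0000, 0x7c00), (0x07c0, 0x0000), (0x0000, 0x7c05)]:
    print(hex(physical(segment, offset)))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The first two pairs yield 0x7c00 and the third yields 0x7c05, the
  address of the second instruction of our program.
&lt;/p&gt;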
&lt;p&gt;
  The remainder of the program is similar to the one in the previous
  section.  However, there are some small but important differences.
  While the DOS environment guarantees that AL and CH are initialised
  to 0 when a .COM program starts, the BIOS offers no such guarantee
  while loading and executing a boot program.  This is why we use the
  registers AX and CX (as opposed to only AH and CL) in
  the &lt;code&gt;mov&lt;/code&gt; instructions to initialise them.  Similarly,
  while DOS initialises SI to 0x100 when a .COM program starts, for a
  boot program, we set the register SI ourselves.
&lt;/p&gt;
&lt;p&gt;
  If you feel uncomfortable about calling the above program a quine
  because it reads its own bytes from the memory, we could have the
  program read the bytes it needs to print from a separate place in
  memory.  We do not execute these bytes.  We only read them and copy
  them to video memory.  The following 76-byte program does this:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ea 05 7c 00 00 fc bb 02 00 b8 00 b8 8e c0 8c c8
8e d8 31 ff be 26 7c b9 26 00 b4 0a ac ab e2 fc
4b 75 f1 f4 eb fd ea 05 7c 00 00 fc bb 02 00 b8
00 b8 8e c0 8c c8 8e d8 31 ff be 26 7c b9 26 00
b4 0a ac ab e2 fc 4b 75 f1 f4 eb fd&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is how we can create a boot image with this:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo ea 05 7c 00 00 fc bb 02 00 b8 00 b8 8e c0 8c c8 | xxd -r -p &amp;gt; boot.img
echo 8e d8 31 ff be 26 7c b9 26 00 b4 0a ac ab e2 fc | xxd -r -p &amp;gt;&amp;gt; boot.img
echo 4b 75 f1 f4 eb fd ea 05 7c 00 00 fc bb 02 00 b8 | xxd -r -p &amp;gt;&amp;gt; boot.img
echo 00 b8 8e c0 8c c8 8e d8 31 ff be 26 7c b9 26 00 | xxd -r -p &amp;gt;&amp;gt; boot.img
echo b4 0a ac ab e2 fc 4b 75 f1 f4 eb fd | xxd -r -p &amp;gt;&amp;gt; boot.img
echo 55 aa | xxd -r -p | dd seek=510 bs=1 of=boot.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Here are the commands to test this boot image:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dosbox -c cls -c &apos;boot boot.img&apos;
qemu-system-i386 -fda boot.img&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The output should look like this:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;&amp;#x3A9;&amp;#x2663;&amp;#x7C;&amp;#x20;&amp;#x20;&amp;#x207F;&amp;#x2557;&amp;#x263B;&amp;#x20;&amp;#x2555;&amp;#x20;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#xEE;&amp;#x255A;&amp;#xC4;&amp;#x256A;&amp;#x31;&amp;#xA0;&amp;#x255B;&amp;#x26;&amp;#x7C;&amp;#x2563;&amp;#x26;&amp;#x20;&amp;#x2524;&amp;#x25D9;&amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x4B;&amp;#x75;&amp;#xB1;&amp;#x2320;&amp;#x3B4;&amp;#xB2;&amp;#x3A9;&amp;#x2663;&amp;#x7C;&amp;#x20;&amp;#x20;&amp;#x207F;&amp;#x2557;&amp;#x263B;&amp;#x20;&amp;#x2555;&amp;#x20;&amp;#x2555;&amp;#xC4;&amp;#x2514;&amp;#xEE;&amp;#x255A;&amp;#xC4;&amp;#x256A;&amp;#x31;&amp;#xA0;&amp;#x255B;&amp;#x26;&amp;#x7C;&amp;#x2563;&amp;#x26;&amp;#x20;&amp;#x2524;&amp;#x25D9;&amp;#xBC;&amp;#xBD;&amp;#x393;&amp;#x207F;&amp;#x4B;&amp;#x75;&amp;#xB1;&amp;#x2320;&amp;#x3B4;&amp;#xB2;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is the disassembly of this program:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;$ &lt;kbd&gt;ndisasm -o 0x7c00 boot.img&lt;/kbd&gt;
00007C00  EA057C0000        jmp 0x0:0x7c05
00007C05  FC                cld
00007C06  BB0200            mov bx,0x2
00007C09  B800B8            mov ax,0xb800
00007C0C  8EC0              mov es,ax
00007C0E  8CC8              mov ax,cs
00007C10  8ED8              mov ds,ax
00007C12  31FF              xor di,di
00007C14  BE267C            mov si,0x7c26
00007C17  B92600            mov cx,0x26
00007C1A  B40A              mov ah,0xa
00007C1C  AC                lodsb
00007C1D  AB                stosw
00007C1E  E2FC              loop 0x7c1c
00007C20  4B                dec bx
00007C21  75F1              jnz 0x7c14
00007C23  F4                hlt
00007C24  EBFD              jmp short 0x7c23
00007C26  EA057C0000        jmp 0x0:0x7c05
00007C2B  FC                cld
00007C2C  BB0200            mov bx,0x2
00007C2F  B800B8            mov ax,0xb800
00007C32  8EC0              mov es,ax
00007C34  8CC8              mov ax,cs
00007C36  8ED8              mov ds,ax
00007C38  31FF              xor di,di
00007C3A  BE267C            mov si,0x7c26
00007C3D  B92600            mov cx,0x26
00007C40  B40A              mov ah,0xa
00007C42  AC                lodsb
00007C43  AB                stosw
00007C44  E2FC              loop 0x7c42
00007C46  4B                dec bx
00007C47  75F1              jnz 0x7c3a
00007C49  F4                hlt
00007C4A  EBFD              jmp short 0x7c49
00007C4C  0000              add [bx+si],al
00007C4E  0000              add [bx+si],al
...&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  This program has two identical halves.  The first half, from offset
  0x7c00 to offset 0x7c25, contains the executable bytes.  The second
  half, from offset 0x7c26 to offset 0x7c4b, contains the data bytes
  read by the executable bytes.  The executable part of the code has
  an outer loop that uses the register BX as the counter variable.  It
  sets BX to 2 so that
  the outer loop iterates twice.  In each iteration, it reads data
  bytes from the second half of the program and prints them.  The code
  to read bytes and print them is very similar to our earlier program.
  Since the data bytes in the second half are identical to the
  executable bytes in the first half, printing the data bytes twice
  amounts to printing all bytes of the program.
&lt;/p&gt;
&lt;p&gt;
  While this program does avoid reading the bytes that the CPU
  executes, the data bytes look exactly like the executable bytes.
  Although I do not see any point in trying to avoid reading
  executable bytes in an exercise like this, this program serves as an
  example of a self-printing boot program that does not execute the
  bytes it reads.
&lt;/p&gt;
<!-- ### -->
&lt;p&gt;
  &lt;a href="https://susam.net/self-printing-machine-code.html"&gt;Read on website&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/assembly.html&quot;&gt;#assembly&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/programming.html&quot;&gt;#programming&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/dos.html&quot;&gt;#dos&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/technology.html&quot;&gt;#technology&lt;/a&gt;
&lt;/p&gt;
<!-- END HTML -->
    </content>
  </entry>
  <entry>
    <title>Rebooting With JMP Instruction</title>
    <link href="https://susam.net/rebooting-with-jmp-instruction.html"/>
    <id>urn:uuid:a02852df-66d3-4afc-b893-31f58bd5d902</id>
    <updated>2003-03-02T00:00:00Z</updated>
    <content type="html">
<!-- BEGIN HTML -->
&lt;p&gt;
  While learning about x86 microprocessors, I realised that it is
  possible to reboot a computer running MS-DOS or Windows 98 by
  jumping to the memory address FFFF:0000.  Here is an
  example &lt;code&gt;DEBUG.EXE&lt;/code&gt; session from MS-DOS 6.22:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG&lt;/kbd&gt;
-&lt;kbd&gt;G =FFFF:0000&lt;/kbd&gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  In the above example, we start the DOS debugger and then enter
  the &lt;code&gt;G&lt;/code&gt; (go) command to execute the program at FFFF:0000.
  Just doing this simple operation should reboot the system
  immediately.
&lt;/p&gt;
&lt;p&gt;
  When the computer boots, the x86 microprocessor starts in real mode
  and executes the instruction at FFFF:0000.  This is an address in
  the BIOS ROM that contains a far jump instruction to go to another
  address, typically F000:E05B.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG&lt;/kbd&gt;
-&lt;kbd&gt;U FFFF:0000 4&lt;/kbd&gt;
FFFF:0000 EA5BE000F0    JMP     F000:E05B&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The address F000:E05B contains the BIOS start-up program which
  performs a power-on self-test (POST), initialises the peripheral
  devices, loads the boot sector code and executes it.  These
  operations complete the booting sequence.
&lt;/p&gt;
&lt;p&gt;
  The important point worth noting here is that the very first
  instruction the microprocessor executes after booting is the
  instruction at FFFF:0000.  We can use this fact to create a tiny
  executable program that can be used to reboot the computer.  Of
  course, we can always perform a soft reboot using the key
  sequence &lt;kbd&gt;ctrl&lt;/kbd&gt;+&lt;kbd&gt;alt&lt;/kbd&gt;+&lt;kbd&gt;del&lt;/kbd&gt;.  However,
  just for fun, let us create a program to reboot the computer with
  a &lt;code&gt;JMP FFFF:0000&lt;/code&gt; instruction.
&lt;/p&gt;
&lt;h2 id=&quot;reboot-program&quot;&gt;Reboot Program&lt;/h2&gt;
&lt;p&gt;
  Here is a complete &lt;code&gt;DEBUG.EXE&lt;/code&gt; session that shows how we
  could write a simple reboot program:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG&lt;/kbd&gt;
-&lt;kbd&gt;A&lt;/kbd&gt;
1165:0100 &lt;kbd&gt;JMP FFFF:0000&lt;/kbd&gt;
1165:0105
-&lt;kbd&gt;N REBOOT.COM&lt;/kbd&gt;
-&lt;kbd&gt;R CX&lt;/kbd&gt;
CX 0000
:&lt;kbd&gt;5&lt;/kbd&gt;
-&lt;kbd&gt;W&lt;/kbd&gt;
Writing 00005 bytes
-&lt;kbd&gt;Q&lt;/kbd&gt;

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Note that the &lt;code&gt;N&lt;/code&gt; (name) command specifies the name of
  the file where we write the binary machine code to.  Also, note that
  the &lt;code&gt;W&lt;/code&gt; (write) command expects the registers BX and CX
  to contain the number of bytes to be written to the file.  When the
  DOS debugger starts, it initialises BX to 0 automatically,
  so we only set the register CX to 5 with the &lt;code&gt;R CX&lt;/code&gt;
  command above.
&lt;/p&gt;
&lt;p&gt;
  Now we can execute this 5-byte program like this:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;REBOOT&lt;/kbd&gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;h2 id=&quot;debugger-scripting&quot;&gt;Debugger Scripting&lt;/h2&gt;
&lt;p&gt;
  In the previous section, we saw how we can start
  &lt;code&gt;DEBUG.EXE&lt;/code&gt; and type the debugger commands and the
  assembly language instruction to jump to FFFF:0000.  We can also keep
  these debugger inputs in a separate text file and feed that to the
  debugger.  Here is how the content of such a text file would look:
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A
JMP FFFF:0000

N REBOOT.COM
R CX
5
W
Q&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  If the above input is saved in a file, say, &lt;code&gt;REBOOT.TXT&lt;/code&gt;,
  then we can run the DOS command &lt;code&gt;DEBUG &amp;lt; REBOOT.TXT&lt;/code&gt;
  to assemble the program and create the binary executable file.  The
  following DOS session example shows how this command behaves:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG &amp;lt; REBOOT.TXT&lt;/kbd&gt;
-A
1165:0100 JMP FFFF:0000
1165:0105
-N REBOOT.COM
-R CX
CX 0000
:5
-W
Writing 00005 bytes
-Q

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;h2 id=&quot;disassembly&quot;&gt;Disassembly&lt;/h2&gt;
&lt;p&gt;
  Here is a quick demonstration of how we can disassemble the
  executable code:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG REBOOT.COM&lt;/kbd&gt;
-&lt;kbd&gt;U 100 104&lt;/kbd&gt;
117C:0100 EA0000FFFF    JMP     FFFF:0000&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  While we did not really need to disassemble this tiny program, the
  above example shows how we can use the debugger
  command &lt;code&gt;U&lt;/code&gt; (unassemble) to translate machine code to
  assembly language mnemonics.
&lt;/p&gt;
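&lt;p&gt;
  The five bytes shown in the disassembly follow the usual encoding of
  a far jump: the opcode 0xea, then the 16-bit offset, then the 16-bit
  segment, with both values stored in little-endian order.  The
  following Python sketch reproduces this encoding.  The function
  name &lt;code&gt;far_jmp&lt;/code&gt; is invented for this example.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def far_jmp(segment, offset):
    # Opcode 0xea followed by the offset and the segment, both little-endian.
    return bytes([0xea]) + offset.to_bytes(2, 'little') + segment.to_bytes(2, 'little')

print(far_jmp(0xffff, 0x0000).hex())   # JMP FFFF:0000
print(far_jmp(0xf000, 0xe05b).hex())   # JMP F000:E05B&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The two results correspond to the bytes EA0000FFFF of our reboot
  program and the bytes EA5BE000F0 found at FFFF:0000 in the BIOS ROM
  session shown earlier.
&lt;/p&gt;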
<!-- ### -->
&lt;p&gt;
  &lt;a href="https://susam.net/rebooting-with-jmp-instruction.html"&gt;Read on website&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/assembly.html&quot;&gt;#assembly&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/programming.html&quot;&gt;#programming&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/dos.html&quot;&gt;#dos&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/technology.html&quot;&gt;#technology&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/how-to.html&quot;&gt;#how-to&lt;/a&gt;
&lt;/p&gt;
<!-- END HTML -->
    </content>
  </entry>
  <entry>
    <title>Programming With DOS Debugger</title>
    <link href="https://susam.net/programming-with-dos-debugger.html"/>
    <id>urn:uuid:a6f153b0-167a-4fbc-9fbe-28f3ee473c05</id>
    <updated>2003-02-11T00:00:00Z</updated>
    <content type="html">
<!-- BEGIN HTML -->
&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;
  MS-DOS as well as Windows 98 come with a debugger program
  named &lt;code&gt;DEBUG.EXE&lt;/code&gt; that can be used to work with assembly
  language instructions and machine code.  In MS-DOS version 6.22, this
  program is typically present at &lt;code&gt;C:\DOS\DEBUG.EXE&lt;/code&gt;.  On
  Windows 98, it is usually present
  at &lt;code&gt;C:\Windows\Command\Debug.exe&lt;/code&gt;.  It is
  a line-oriented debugger that supports various useful features to
  work with and debug binary executable programs consisting of machine
  code.
&lt;/p&gt;
&lt;p&gt;
  In this post, we see how we can use this debugger program to
  assemble a few minimal programs that print some characters to
  standard output.  We first create a 7-byte program that prints a
  single character.  Then we create a 23-byte program that prints the
  &quot;hello, world&quot; string.  All the steps provided in this post work well
  with Windows 98 too.
&lt;/p&gt;
&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#print-character&quot;&gt;Print Character&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#hello-world&quot;&gt;Hello, World&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#debugger-scripting&quot;&gt;Debugger Scripting&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#disassembly&quot;&gt;Disassembly&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#int-20-vs-ret&quot;&gt;INT 20 vs RET&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;print-character&quot;&gt;Print Character&lt;/h2&gt;
&lt;p&gt;
  Let us first see how to create a tiny 7-byte program that prints the
  character &lt;code&gt;A&lt;/code&gt; to standard output.  The
  following &lt;code&gt;DEBUG.EXE&lt;/code&gt; session shows how we do it.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG&lt;/kbd&gt;
-&lt;kbd&gt;A&lt;/kbd&gt;
1165:0100 &lt;kbd&gt;MOV AH, 2&lt;/kbd&gt;
1165:0102 &lt;kbd&gt;MOV DL, 41&lt;/kbd&gt;
1165:0104 &lt;kbd&gt;INT 21&lt;/kbd&gt;
1165:0106 &lt;kbd&gt;RET&lt;/kbd&gt;
1165:0107
-&lt;kbd&gt;G&lt;/kbd&gt;
A
Program terminated normally
-&lt;kbd&gt;N A.COM&lt;/kbd&gt;
-&lt;kbd&gt;R CX&lt;/kbd&gt;
CX 0000
:&lt;kbd&gt;7&lt;/kbd&gt;
-&lt;kbd&gt;W&lt;/kbd&gt;
Writing 00007 bytes
-&lt;kbd&gt;Q&lt;/kbd&gt;

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Now we can execute this program as follows:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;A&lt;/kbd&gt;
A
C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The debugger command &lt;code&gt;A&lt;/code&gt; creates machine executable code
  from assembly language instructions.  The machine code created is
  written to the main memory at address CS:0100 by default.  The first
  three instructions generate the software interrupt 0x21 (decimal 33)
  with AH set to 2 and DL set to 0x41 (decimal 65) which happens to be
  the ASCII code of the character &lt;code&gt;A&lt;/code&gt;.  Interrupt 0x21
  offers a wide variety of DOS services.  Setting AH to 2 tells this
  interrupt to invoke the function that prints a single character to
  standard output.  This function expects DL to be set to the ASCII
  code of the character we want to print.
&lt;/p&gt;
&lt;p&gt;
  The command &lt;code&gt;G&lt;/code&gt; executes the program in memory from the
  current location.  The current location is defined by the current
  value of CS:IP which is CS:0100 by default.  We use this command to
  confirm that the program runs as expected.
&lt;/p&gt;
&lt;p&gt;
  Next we prepare to write the machine code to a binary executable
  file.  The command &lt;code&gt;N&lt;/code&gt; is used to specify the name of the
  file.  The command &lt;code&gt;W&lt;/code&gt; is used to write the machine code
  to the file.  This command expects the register pair BX:CX to contain
  the number of bytes to be written to the file, with BX holding the
  high-order word and CX the low-order word.  When the DOS debugger
  starts, BX is already initialised to 0, so we only set the register
  CX to 7 with the &lt;code&gt;R CX&lt;/code&gt; command.  Finally, we use the
  command &lt;code&gt;Q&lt;/code&gt; to quit the debugger and return to MS-DOS.
&lt;/p&gt;
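&lt;p&gt;
  The value to enter for CX can be read off the assembler prompt: the
  offset shown after the last instruction (0107 above) minus 0100
  gives the number of bytes assembled.  As a rough sketch of the same
  steps applied to a slightly longer program, here are the inputs one
  might type into the debugger to create a program that
  prints &lt;code&gt;HI&lt;/code&gt;.  The program and the file
  name &lt;code&gt;HI.COM&lt;/code&gt; are illustrative assumptions only.  AH is
  set again before the second interrupt merely as a precaution.  The
  seven instructions occupy offsets 0100 to 010C, so the size entered
  for CX is D (decimal 13).
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A
MOV AH, 2
MOV DL, 48
INT 21
MOV AH, 2
MOV DL, 49
INT 21
RET

N HI.COM
R CX
D
W
Q&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  The blank line after &lt;code&gt;RET&lt;/code&gt; stands for pressing the Enter
  key on an empty assembler prompt, which ends the assembly input and
  returns to the debugger prompt.
&lt;/p&gt;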
&lt;h2 id=&quot;hello-world&quot;&gt;Hello, World&lt;/h2&gt;
&lt;p&gt;
  The following &lt;code&gt;DEBUG.EXE&lt;/code&gt; session shows how to create a
  program that prints a string.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG&lt;/kbd&gt;
-&lt;kbd&gt;A&lt;/kbd&gt;
1165:0100 &lt;kbd&gt;MOV AH, 9&lt;/kbd&gt;
1165:0102 &lt;kbd&gt;MOV DX, 108&lt;/kbd&gt;
1165:0105 &lt;kbd&gt;INT 21&lt;/kbd&gt;
1165:0107 &lt;kbd&gt;RET&lt;/kbd&gt;
1165:0108 &lt;kbd&gt;DB &apos;hello, world&apos;, D, A, &apos;$&apos;&lt;/kbd&gt;
1165:0117
-&lt;kbd&gt;G&lt;/kbd&gt;
hello, world

Program terminated normally
-&lt;kbd&gt;N HELLO.COM&lt;/kbd&gt;
-&lt;kbd&gt;R CX&lt;/kbd&gt;
CX 0000
:&lt;kbd&gt;17&lt;/kbd&gt;
-&lt;kbd&gt;W&lt;/kbd&gt;
Writing 00017 bytes
-&lt;kbd&gt;Q&lt;/kbd&gt;

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Now we can execute this 23-byte program like this:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;HELLO&lt;/kbd&gt;
hello, world

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  In the program above we use the pseudo-instruction &lt;code&gt;DB&lt;/code&gt;
  to define the bytes of the string we want to print.  We add the
  trailing bytes 0xD and 0xA to print the carriage return (CR) and the
  line feed (LF) characters so that the string is terminated with a
  newline.  Finally, the string is terminated with the byte for dollar
  sign (&lt;code&gt;&apos;$&apos;&lt;/code&gt;) because the software interrupt we generate
  next expects the string to be terminated with this symbol&apos;s byte
  value.
&lt;/p&gt;
&lt;p&gt;
  We use the software interrupt 0x21 again.  However, this time we set
  AH to 9 to invoke the function that prints a string.  This function
  expects DS:DX to point to the address of a string terminated with
  the byte value of &lt;code&gt;&apos;$&apos;&lt;/code&gt;.  In a .COM program, the
  register &lt;code&gt;DS&lt;/code&gt; initially has the same value
  as &lt;code&gt;CS&lt;/code&gt;, so we only set &lt;code&gt;DX&lt;/code&gt; to the
  offset at which the string begins.
&lt;/p&gt;
&lt;h2 id=&quot;debugger-scripting&quot;&gt;Debugger Scripting&lt;/h2&gt;
&lt;p&gt;
  We have already seen how to assemble a &quot;hello, world&quot; program
  in the previous section.  We started the debugger program, typed
  some commands and typed assembly language instructions to create our
  program.  It is also possible to prepare a separate input file with
  all the debugger commands and assembly language instructions in it.
  We then feed this file to the debugger program.  This can be useful
  while writing more complex programs where we cannot afford to lose
  our assembly language source code if we inadvertently crash the
  debugger by executing an illegal instruction.
&lt;/p&gt;
&lt;p&gt;
  To create a separate input file that can be fed to the debugger, we
  may use the DOS command &lt;code&gt;EDIT HELLO.TXT&lt;/code&gt; to open a new
  file with MS-DOS Editor, then type in the following debugger
  commands and then save and exit the editor.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A
MOV AH, 9
MOV DX, 108
INT 21
RET
DB &apos;hello, world&apos;, D, A, &apos;$&apos;

N HELLO.COM
R CX
17
W
Q&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  This is almost the same as the inputs we typed into the debugger in
  the previous section.  The only difference from the previous section
  is that we omit the &lt;code&gt;G&lt;/code&gt; command here because we don&apos;t
  really need to run the program while assembling it, although we
  could do so if we really wanted to.
&lt;/p&gt;
&lt;p&gt;
  Then we can run the DOS command &lt;code&gt;DEBUG &amp;lt; HELLO.TXT&lt;/code&gt; to
  assemble the program and create the binary executable file.  Here is
  a DOS session example that shows what the output of this command
  looks like:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG &amp;lt; HELLO.TXT&lt;/kbd&gt;
-A
1165:0100 MOV AH, 9
1165:0102 MOV DX, 108
1165:0105 INT 21
1165:0107 RET
1165:0108 DB &apos;hello, world&apos;, D, A, &apos;$&apos;
1165:0117
-N HELLO.COM
-R CX
CX 0000
:17
-W
Writing 00017 bytes
-Q

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The output is in fact very similar to the debugger session in the
  previous section.
&lt;/p&gt;
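&lt;p&gt;
  If we did want to run the program while assembling it, the input
  file could include the &lt;code&gt;G&lt;/code&gt; command right after the blank
  line that ends the assembly input.  The following is only a sketch of
  such an input file.  It is assumed to behave like the interactive
  session in the previous section, with the output of the program
  appearing in between the output of the debugger.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A
MOV AH, 9
MOV DX, 108
INT 21
RET
DB &apos;hello, world&apos;, D, A, &apos;$&apos;

G
N HELLO.COM
R CX
17
W
Q&lt;/code&gt;&lt;/pre&gt;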
&lt;h2 id=&quot;disassembly&quot;&gt;Disassembly&lt;/h2&gt;
&lt;p&gt;
  Now that we have seen how to assemble simple programs into binary
  executable files using the debugger, we will briefly see how to
  disassemble the binary executable files.  This could be useful when
  we want to debug an existing program.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG A.COM&lt;/kbd&gt;
-&lt;kbd&gt;U 100 106&lt;/kbd&gt;
117C:0100 B402          MOV     AH,02
117C:0102 B241          MOV     DL,41
117C:0104 CD21          INT     21
117C:0106 C3            RET&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The debugger command &lt;code&gt;U&lt;/code&gt; (unassemble) is used to
  translate the binary machine code to assembly language mnemonics.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG HELLO.COM&lt;/kbd&gt;
-&lt;kbd&gt;U 100 116&lt;/kbd&gt;
117C:0100 B409          MOV     AH,09
117C:0102 BA0801        MOV     DX,0108
117C:0105 CD21          INT     21
117C:0107 C3            RET
117C:0108 68            DB      68
117C:0109 65            DB      65
117C:010A 6C            DB      6C
117C:010B 6C            DB      6C
117C:010C 6F            DB      6F
117C:010D 2C20          SUB     AL,20
117C:010F 776F          JA      0180
117C:0111 726C          JB      017F
117C:0113 64            DB      64
117C:0114 0D0A24        OR      AX,240A
-&lt;kbd&gt;D 100 116&lt;/kbd&gt;
117C:0100  B4 09 BA 08 01 CD 21 C3-68 65 6C 6C 6F 2C 20 77   ......!.hello, w
117C:0110  6F 72 6C 64 0D 0A 24                              orld..$&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The bytes from offset 0108 onwards are the data of the string, not
  machine instructions.  The &lt;code&gt;U&lt;/code&gt; command cannot tell data
  from code, so it decodes these bytes as if they were instructions,
  which is why its output for them looks meaningless.  The debugger
  command &lt;code&gt;D&lt;/code&gt; (dump) shows the same bytes in hexadecimal
  along with their ASCII characters, which makes the string easy to
  spot.
&lt;/p&gt;
&lt;h2 id=&quot;int-20-vs-ret&quot;&gt;INT 20 vs RET&lt;/h2&gt;
&lt;p&gt;
  Another way to terminate a .COM program is to simply use the
  instruction &lt;code&gt;INT 20&lt;/code&gt;.  This consumes two bytes in the
  machine code: &lt;code&gt;CD 20&lt;/code&gt;.  While producing the smallest
  possible executables was not really the goal of this post, the code
  examples above indulge in a little bit of size reduction by using
  the &lt;code&gt;RET&lt;/code&gt; instruction to terminate the program.  This
  consumes only one byte: &lt;code&gt;C3&lt;/code&gt;.  This works because when a
  .COM program starts, the register SP contains FFFE.  The stack memory
  locations at offset FFFE and FFFF contain 00 and 00 respectively.
  Further, the memory address offset 0000 contains the
  instruction &lt;code&gt;INT 20&lt;/code&gt;.  Here is a demonstration of these
  facts using the debugger program:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG HELLO.COM&lt;/kbd&gt;
-&lt;kbd&gt;R SP&lt;/kbd&gt;
SP FFFE
:
-&lt;kbd&gt;D FFFE&lt;/kbd&gt;
117C:FFF0                                            00 00
-&lt;kbd&gt;U 0 1&lt;/kbd&gt;
117C:0000 CD20          INT     20&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  As a result, executing the &lt;code&gt;RET&lt;/code&gt; instruction pops 0000
  off the stack at FFFE and loads it into IP.  This results in the
  instruction &lt;code&gt;INT 20&lt;/code&gt; at offset 0000 getting executed
  which leads to program termination.
&lt;/p&gt;
&lt;p&gt;
  While both &lt;code&gt;INT 20&lt;/code&gt; and &lt;code&gt;RET&lt;/code&gt; lead to
  successful program termination both in DOS and while
  debugging with &lt;code&gt;DEBUG.EXE&lt;/code&gt;, there is a difference
  between them that affects the debugging experience.  Terminating the
  program with &lt;code&gt;INT 20&lt;/code&gt; allows us to run the program
  repeatedly within the debugger by repeated applications of
  the &lt;code&gt;G&lt;/code&gt; debugger command.  But when we terminate the
  program with &lt;code&gt;RET&lt;/code&gt;, we cannot run the program repeatedly
  in this manner.  The program runs and terminates successfully the
  first time we run it in the debugger but the stack does not get
  reinitialised with zeros to prepare it for another execution of the
  program within the debugger.  Therefore when we try to run the
  program the second time using the &lt;code&gt;G&lt;/code&gt; command, the
  program does not terminate successfully.  It hangs instead.  It is
  possible to work around this by reinitialising the stack with the
  debugger command &lt;code&gt;E FFFE 0 0&lt;/code&gt; before
  running &lt;code&gt;G&lt;/code&gt; again.
&lt;/p&gt;
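&lt;p&gt;
  For comparison, here is a sketch of a debugger input file that
  assembles the &quot;hello, world&quot; program with &lt;code&gt;INT 20&lt;/code&gt;
  instead of &lt;code&gt;RET&lt;/code&gt;.  The file
  name &lt;code&gt;HELLO20.COM&lt;/code&gt; is a made-up name used only for
  illustration.  Since &lt;code&gt;INT 20&lt;/code&gt; is one byte longer
  than &lt;code&gt;RET&lt;/code&gt;, the string now begins at offset 0109 instead
  of 0108 and the program grows from 0x17 (decimal 23) bytes to 0x18
  (decimal 24) bytes, so both the operand of &lt;code&gt;MOV DX&lt;/code&gt; and
  the value entered for CX change accordingly.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A
MOV AH, 9
MOV DX, 109
INT 21
INT 20
DB &apos;hello, world&apos;, D, A, &apos;$&apos;

N HELLO20.COM
R CX
18
W
Q&lt;/code&gt;&lt;/pre&gt;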
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;
  Although the DOS debugger is very limited in features in comparison
  with sophisticated assemblers like NASM, MASM, etc., this humble
  program can perform some of the basic operations involved in working
  with assembly language and machine code.  It can read and write
  binary executable files, examine memory, execute machine
  instructions in memory, modify registers, edit binary files, etc.
  The fact that this debugger program is always available on MS-DOS
  and Windows 98 systems means that these systems are ready for some
  rudimentary assembly language programming without requiring any
  additional tools.
&lt;/p&gt;
<!-- ### -->
&lt;p&gt;
  &lt;a href="https://susam.net/programming-with-dos-debugger.html"&gt;Read on website&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/assembly.html&quot;&gt;#assembly&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/programming.html&quot;&gt;#programming&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/dos.html&quot;&gt;#dos&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/technology.html&quot;&gt;#technology&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/how-to.html&quot;&gt;#how-to&lt;/a&gt;
&lt;/p&gt;
<!-- END HTML -->
    </content>
  </entry>
  <entry>
    <title>Editing Binaries in DOS</title>
    <link href="https://susam.net/editing-binaries-in-dos.html"/>
    <id>urn:uuid:117ca989-211b-4e80-a73b-6b168387e735</id>
    <updated>2002-07-18T00:00:00Z</updated>
    <content type="html">
<!-- BEGIN HTML -->
&lt;p&gt;
  Both MS-DOS and Windows 98 come with a debugger program
  named &lt;code&gt;DEBUG.EXE&lt;/code&gt; that makes it possible to edit binary
  files without requiring additional tools.  Although the primary
  purpose of this program is to test and debug executable files, it
  can be used to edit binary files too.  Two examples of this are
  shown in this post.  The first example edits a string of bytes in an
  executable file.  The second one edits machine instructions to alter
  the behaviour of the program.  Both examples provided in the next
  two sections can be reproduced on MS-DOS version 6.22.  These
  examples can be performed on Windows 98 too after minor adjustments.
&lt;/p&gt;
&lt;h2 id=&quot;editing-data&quot;&gt;Editing Data&lt;/h2&gt;
&lt;p&gt;
  Let us first see an example of editing an error message produced by
  the &lt;code&gt;MODE&lt;/code&gt; command.  This DOS command is used for
  displaying and reconfiguring system settings.  For example, the
  following command sets the display to show 40 characters per line:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;MODE 40&lt;/kbd&gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  The following command reverts the display to show 80 characters per
  line:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;MODE 80&lt;/kbd&gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  Here is another example of this command that shows the current
  settings for serial port COM1:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;MODE COM1&lt;/kbd&gt;

Status for device COM1:
-----------------------
Retry=NONE

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  An invalid parameter leads to an error like this:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;MODE 0&lt;/kbd&gt;

Invalid parameter - 0

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  We will edit this error message to be slightly more helpful.  The
  following debugger session shows how.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG C:\DOS\MODE.COM&lt;/kbd&gt;
-&lt;kbd&gt;S 0 FFFF &apos;Invalid parameter&apos;&lt;/kbd&gt;
117C:19D1
-&lt;kbd&gt;D 19D0 19FF&lt;/kbd&gt;
117C:19D0  13 49 6E 76 61 6C 69 64-20 70 61 72 61 6D 65 74   .Invalid paramet
117C:19E0  65 72 0D 0A 20 0D 0A 49-6E 76 61 6C 69 64 20 6E   er.. ..Invalid n
117C:19F0  75 6D 62 65 72 20 6F 66-20 70 61 72 61 6D 65 74   umber of paramet
-&lt;kbd&gt;E 19D0 12 &apos;No soup for you!&apos; D A&lt;/kbd&gt;
-&lt;kbd&gt;D 19D0 19FF&lt;/kbd&gt;
117C:19D0  12 4E 6F 20 73 6F 75 70-20 66 6F 72 20 79 6F 75   .No soup for you
117C:19E0  21 0D 0A 0A 20 0D 0A 49-6E 76 61 6C 69 64 20 6E   !... ..Invalid n
117C:19F0  75 6D 62 65 72 20 6F 66-20 70 61 72 61 6D 65 74   umber of paramet
-&lt;kbd&gt;N SOUP.COM&lt;/kbd&gt;
-&lt;kbd&gt;W&lt;/kbd&gt;
Writing 05C11 bytes
-&lt;kbd&gt;Q&lt;/kbd&gt;

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  We first open &lt;code&gt;MODE.COM&lt;/code&gt; with the debugger.  When we do
  so, the entire program is loaded at offset 0x100 of the code
  segment (CS).  Then we use the &lt;code&gt;S&lt;/code&gt; debugger command to
  search for the string &quot;Invalid parameter&quot;.  This prints the offset
  at which this string occurs in memory.
&lt;/p&gt;
&lt;p&gt;
  We use the &lt;code&gt;D&lt;/code&gt; command to dump the bytes around that
  offset.  In the first row of the output, the byte value 13 (decimal
  19) represents the length of the string that follows it.  Indeed
  there are 19 bytes in the string composed of the text &lt;code&gt;&quot;Invalid
  parameter&quot;&lt;/code&gt; and the following carriage return (CR) and line
  feed (LF) characters.  The CR and LF characters have ASCII codes 0xD
  (decimal 13) and 0xA (decimal 10).  These values can be seen at the
  third and fourth places of the second row of the output of this
  command.
&lt;/p&gt;
&lt;p&gt;
  Then we use the &lt;code&gt;E&lt;/code&gt; command to enter a new string length
  followed by a new string to replace the existing error message.
  Note that we enter a string length of 0x12 (decimal 18) which is
  indeed the length of the string that follows it, including the
  trailing CR and LF characters.  After entering the
  new string, we dump the memory again with &lt;code&gt;D&lt;/code&gt; to verify
  that the new string is now present in memory.
&lt;/p&gt;
&lt;p&gt;
  After confirming that the edited string looks good, we use
  the &lt;code&gt;N&lt;/code&gt; command to specify the name of the file we want
  to write the edited binary to.  When we later run
  the &lt;code&gt;W&lt;/code&gt; command, it writes the bytes starting at offset
  0x100 to the named file.  It reads the number of bytes to be written
  to the file from the BX and CX registers.  These
  registers are already initialised to the length of the file when we
  load a file in the debugger.  Since we have not modified these
  registers ourselves, we don&apos;t need to set them again.  In case you
  do need to set the BX and CX registers in a different situation, the
  commands to do so are &lt;code&gt;R BX&lt;/code&gt; and &lt;code&gt;R CX&lt;/code&gt;
  respectively.
&lt;/p&gt;
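&lt;p&gt;
  As a small illustration of those two commands, the following inputs
  would set BX:CX to 0x5C11, the length reported for this file by
  the &lt;code&gt;W&lt;/code&gt; command in the session above.  Each &lt;code&gt;R&lt;/code&gt;
  command shows the current value of the register and then waits at a
  colon prompt for the new value.  This is only a sketch; it is not
  needed in the session above because the registers already hold these
  values.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;R BX
0
R CX
5C11&lt;/code&gt;&lt;/pre&gt;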
&lt;p&gt;
  Finally, the &lt;code&gt;W&lt;/code&gt; command writes the file and
  the &lt;code&gt;Q&lt;/code&gt; command quits the debugger.  Now we can test the
  new program as follows:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;SOUP 0&lt;/kbd&gt;

No soup for you! - 0

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;h2 id=&quot;editing-machine-instructions&quot;&gt;Editing Machine Instructions&lt;/h2&gt;
&lt;p&gt;
  In this section, we will see how to edit the binary we created in
  the previous section further to add our own machine instructions to
  print a welcome message when the program starts.  Here is an example
  debugger session that shows how to do it.
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;DEBUG SOUP.COM&lt;/kbd&gt;
-&lt;kbd&gt;U&lt;/kbd&gt;
117C:0100 E99521        JMP     2298
117C:0103 51            PUSH    CX
117C:0104 8ACA          MOV     CL,DL
117C:0106 D0E1          SHL     CL,1
117C:0108 32ED          XOR     CH,CH
117C:010A 80CD03        OR      CH,03
117C:010D D2E5          SHL     CH,CL
117C:010F 2E            CS:
117C:0110 222E7D01      AND     CH,[017D]
117C:0114 2E            CS:
117C:0115 890E6402      MOV     [0264],CX
117C:0119 59            POP     CX
117C:011A 7505          JNZ     0121
117C:011C EA39E700F0    JMP     F000:E739
-&lt;kbd&gt;D 300&lt;/kbd&gt;
117C:0300  07 1F C3 18 18 18 18 18-00 00 00 00 00 00 00 00   ................
117C:0310  00 00 FF 00 00 00 00 00-FF 00 00 00 00 00 00 00   ................
117C:0320  00 00 00 00 00 00 00 00-00 00 FF FF 90 00 40 00   ..............@.
117C:0330  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
117C:0340  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
117C:0350  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
117C:0360  00 00 00 FF 00 00 00 00-00 00 00 00 00 00 00 00   ................
117C:0370  02 00 2B C0 8E C0 A0 71-03 A2 BA 07 A2 BC 07 3C   ..+....q.......&amp;lt;
-&lt;kbd&gt;A&lt;/kbd&gt;
117C:0100 &lt;kbd&gt;JMP 330&lt;/kbd&gt;
117C:0103
-&lt;kbd&gt;A 330&lt;/kbd&gt;
117C:0330 &lt;kbd&gt;MOV AH, 9&lt;/kbd&gt;
117C:0332 &lt;kbd&gt;MOV DX, 33A&lt;/kbd&gt;
117C:0335 &lt;kbd&gt;INT 21&lt;/kbd&gt;
117C:0337 &lt;kbd&gt;JMP 2298&lt;/kbd&gt;
117C:033A &lt;kbd&gt;DB &apos;Welcome to Soup Kitchen!&apos;, D, A, &apos;$&apos;&lt;/kbd&gt;
117C:0355
-&lt;kbd&gt;W&lt;/kbd&gt;
Writing 05C11 bytes
-&lt;kbd&gt;Q&lt;/kbd&gt;

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  At the beginning, we use the debugger command &lt;code&gt;U&lt;/code&gt; to
  unassemble (disassemble) some bytes at the top of the program to see
  what they look like.  We see that the very first instruction is a
  jump to offset 0x2298.  The debugger command &lt;code&gt;D 300&lt;/code&gt;
  shows that there are contiguous zero bytes around offset 0x330.  We
  replace some of these zero bytes with new machine instructions that
  print our welcome message.  To do this, we first replace the jump
  instruction at the top with a jump instruction to offset 0x330 where
  we then place the machine code for our welcome message.  This new
  machine code prints the welcome message and then jumps to offset
  0x2298 allowing the remainder of the program to execute as usual.
&lt;/p&gt;
&lt;p&gt;
  The debugger command &lt;code&gt;A&lt;/code&gt; is used to assemble the machine
  code for the altered jump instruction at the top.  By default it
  writes the assembled machine code to CS:0100 which is the address at
  which DOS loads executable programs.  Then we use the debugger
  command &lt;code&gt;A 330&lt;/code&gt; to add new machine code at offset 0x330.
  We try not to go beyond the region with contiguous zeroes while
  writing our machine instructions.  Fortunately for us, our entire
  code for the welcome message occupies 37 bytes and the last byte
  of our code lands at offset 0x354.
&lt;/p&gt;
&lt;p&gt;
  Finally, we write the updated program in memory back to the file
  named &lt;code&gt;SOUP.COM&lt;/code&gt;.  Since the debugger was used to load
  the file named &lt;code&gt;SOUP.COM&lt;/code&gt;, we do not need to use
  the &lt;code&gt;N&lt;/code&gt; command to specify the name of the file again.
  When a file has just been loaded into the debugger, by default
  the &lt;code&gt;W&lt;/code&gt; command writes the program in memory back to the
  same file that was loaded into the memory.
&lt;/p&gt;
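&lt;p&gt;
  The same edits could also be kept in a debugger input file so that
  they can be reapplied without retyping them.  The following is a
  sketch of such a file.  The file name &lt;code&gt;PATCH.TXT&lt;/code&gt; is a
  made-up name, and the offsets 0x330 and 0x2298 are the ones observed
  in the session above, so they are assumed to hold only for the same
  copy of &lt;code&gt;SOUP.COM&lt;/code&gt;.  The first command is written
  as &lt;code&gt;A 100&lt;/code&gt; to state the default address explicitly.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A 100
JMP 330

A 330
MOV AH, 9
MOV DX, 33A
INT 21
JMP 2298
DB &apos;Welcome to Soup Kitchen!&apos;, D, A, &apos;$&apos;

W
Q&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
  Such a file would then be fed to the debugger with the DOS
  command &lt;code&gt;DEBUG SOUP.COM &amp;lt; PATCH.TXT&lt;/code&gt;.  Each blank line
  ends the assembly input started by the preceding &lt;code&gt;A&lt;/code&gt;
  command.
&lt;/p&gt;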
&lt;p&gt;
  Now our updated program should behave as shown below:
&lt;/p&gt;
&lt;pre&gt;&lt;samp&gt;C:\&amp;gt;&lt;kbd&gt;SOUP COM1&lt;/kbd&gt;
Welcome to Soup Kitchen!

Status for device COM1:
-----------------------
Retry=NONE

C:\&amp;gt;&lt;kbd&gt;SOUP 0&lt;/kbd&gt;
Welcome to Soup Kitchen!

No soup for you! - 0

C:\&amp;gt;&lt;/samp&gt;&lt;/pre&gt;
&lt;p&gt;
  That&apos;s our modified program that prints a welcome message and our
  own error message created with the humble DOS debugger.
&lt;/p&gt;
<!-- ### -->
&lt;p&gt;
  &lt;a href="https://susam.net/editing-binaries-in-dos.html"&gt;Read on website&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/assembly.html&quot;&gt;#assembly&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/programming.html&quot;&gt;#programming&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/dos.html&quot;&gt;#dos&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/technology.html&quot;&gt;#technology&lt;/a&gt; |
  &lt;a href=&quot;https://susam.net/tag/how-to.html&quot;&gt;#how-to&lt;/a&gt;
&lt;/p&gt;
<!-- END HTML -->
    </content>
  </entry>
</feed>
