Let's talk architecture (the long post)

Let’s talk architecture

Before we start implementing our disassembler we should probably take a step back and ask “why”. Well, two-fold certainly.

Reason number one is obvious - we want humans to be able to understand which assembly instructions were compiled in order to produce the code that’s been input. But, (and it’s a big but)… we also need to decode it so we can write the code required to emulate the processor.

That’s actually two very different sets of requirements so how do we approach it in such a way so as to make sure we maximize code-reuse. With hundreds of instructions, we want to minimize the code we write as much as possible.

Go wide?… or Go deep?

Is it better to write a fully functional disassembler first which decodes everything correctly and then implement the emulator or… is it better to implement the disassembler AND the emulator as the same time but for a smaller number of instructions?

My preference is the latter as if^H^Hwhen I code myself into a corner I will only have to refector a small number of instructions instead of all approximately 409 instructions.

Ideally I’d like to automate as much of the code generation as possible and only have to hard-code any exceptions we find. Is there a best practice way to write a VM that works that way?

Let’s lay down some requirements:

When I originally wrote this post I enumerated four or five different ways to write code that was re-usable for both disassembling the machine code and calling a function in the emulator to execute it. It got so long, convoluted and theoretical that I deleted it all and I’m only going to write about how we’re going to proceed.

Das Requirements!

There is a potential for massive pain here as there are over 400 instructions. That has the potential to require 400 functions for disassembly and 400 more functions for emulation. That’s a lot of work, how can we reduce it?
Given the number of potential functions, if^H^Hwhen we need to refactor, I don’t want to have to touch 800 functions.
Remember that the real hardware has 8 discrete cogs. This means that there are 8 cores executing at the same time, each with their own memory and internal state. They SHARE access to the hub (shared memory), and the IO pins. Each instruction emulation function should take the cog’s current state as an input and return the cogs’ new state as an output. Note to functional programmer - the cogs have side-effects so we have to emulate that, no purity for you!
Any instructions that are either illegal, unsupported (yet), or just plain wrong we will decode as a “WTF” instruction which dumps debug output for us and makes no state changes.

Our first instruction, “NoOPeration”

0000 0000000 000 000000000 000000000        NOP

The disassembler will decode this to just “NOP”.
The instruction emulation function will take the cog’s current state and return it unchanged.

Implementing the Two State solution.

The simplest implemention of this would look like this:

  def dis_instr(0::size(32)),          do: "NOP"
  def exe_instr(0::size(32) cogstate), do: cogstate

Alas, our first problem appears when we implement more than one instruction. Let’s make up a fake one, called “YIKES” (which is really just another NOP)

defmodule P2Dasm.Sandbox do
  # NOP instruction
  def dis_instr(<<0::size(32)>>),           do: "NOP"
  def exe_instr(<<0::size(32)>>, cogstate), do: cogstate

  # YIKES instruction
  def dis_instr(<<1::size(32)>>),           do: "YIKES"
  def exe_instr(<<1::size(32)>>, cogstate), do: cogstate
end

… but when we compile it, the compiler throws some warnings. (I don’t like warnings):

Compiling 1 file (.ex)
warning: clauses with the same name and arity (number of arguments) should be grouped together, "def dis_instr/1" was previously defined (lib/p2_dasm/sandbox.ex:3)
  lib/p2_dasm/sandbox.ex:7

warning: clauses with the same name and arity (number of arguments) should be grouped together, "def exe_instr/2" was previously defined (lib/p2_dasm/sandbox.ex:4)
  lib/p2_dasm/sandbox.ex:8

In plain English, the compiler is wanting us to put the functions with the same name (and arity, which means the number of arguments) together. Like this:

  # Disassembly Instructions
  def dis_instr(<<0::size(32)>>),           do: "NOP"
  def dis_instr(<<1::size(32)>>),           do: "YIKES"

  # Emulation Instructions
  def exe_instr(<<0::size(32)>>, cogstate), do: cogstate
  def exe_instr(<<1::size(32)>>, cogstate), do: cogstate
end

Huzzah! No more warnings - YAY!

Unfortunately, in exchange for no warnings we get a substantial new bugrisk. If we ended up with 400 definitions for dis_instr/1 and 400 definitions for exe_instr/2 and we kept them separated, could we be absolutely sure that we correctly mapped every instruction consistently?

A simple mistake in one place would cause the disassembler to return one assembly instruction and the emulator to execute something different. That sounds like a great way to introduce nasty hard-to-find bugs. Let’s not do this.

How about instead we just use one function and return both values?

  def dis_exe_instr(<<0::size(32)>>, cogstate) do
    %{asm: "NOP", newcogstate: cogstate}
  end

The problem we now have of course is that we can’t call the disassembler seperately without doing it from a cog process and executing the instruction.

What to do?

Let’s talk about function composition

One of the signature functions (yeah, I know) of a functional language is that “functions are first class citizens”. What does that mean exactly? Well, we’ve all seen variables be set to numbers, strings, pointers to functions, etc etc and we’ve all seen functions return those kinds of values.

In functional programming you can pass an actual function to a function and… get a function as your return value.

The classic example

Let’s say we want to make a function that returns 2 + the number given. Let’s call the function, add2.

  def add2(x), do: x + 2

Simple, but what if we wanted an add3, add4, add5, add6?

  def add2(x), do: x + 2
  def add3(x), do: x + 3
  def add4(x), do: x + 4
  def add5(x), do: x + 5
  def add6(x), do: x + 2 # <-- accidental typo of death

That’s a lot of code. Surely there’s a simpler way?

We create a function which takes the number and returns a function that adds that number - like this:

  def makeadd(number), do: fn(x) -> x + number end

Let’s see it work!

iex(1)> defmodule T do
...(1)>   def makeadd(number), do: fn(x) -> x + number end
...(1)> end
iex(2)> add2 = T.makeadd(2)
#Function<0.71974434/1 in T.makeadd/1>
iex(3)> add2.(10)
12
iex(4)> add19 = T.makeadd(19)
#Function<0.71974434/1 in T.makeadd/1>
iex(5)> add19.(76)
95

Pretty neat huh? But you told me I could pass a function too. Sure can!

iex(1)> defmodule T do
...(1)>   def somethingnumber(function, number), do: fn (x) -> function.(number, x) end
...(1)> end
iex(2)> add2 = T.somethingnumber(&+/2, 2)
#Function<0.3612865/1 in T.somethingnumber/2>
iex(3)> add2.(42)
44
iex(4)> multiply15 = T.somethingnumber(&*/2, 15)
#Function<0.3612865/1 in T.somethingnumber/2>
iex(5)> multiply15.(8)
120

If you’re not used to this concept you’re probably tilting your head to one side and wondering what the point of all that is…

Well, how about this:

defmodule P2Dasm.Sandbox do
  def decode_instr(function, <<0::size(32)>>),do: function.(:NOP)
  def decode_instr(function, <<1::size(32)>>),do: function.(:YIKES)

  def dis_instr(:NOP),             do: "NOP"
  def dis_instr(:YIKES),           do: "YIKES"

  def exe_instr(:NOP, cogstate),   do: cogstate
  def exe_instr(:YIKES, cogstate), do: cogstate
end

This means that our calling function (or shell) determins what the incoming function is at runtime and that’s exceedingly powerful. It also means that the decoding only happens in one place and that removes the risk of inconsistent typos and allows terrific introspection at runtime and MORE!

Examples:

I want to see the decoder’s internal representation of the decoded instruction (IO.inspect is an elixir function that prints to stdout a textual representation of what it’s passed, typically for debugging)

iex(1)> P2Dasm.Sandbox.decode_instr(&IO.inspect/1, <<0::size(32)>>)
:NOP

I want to get the disassembled version:

iex(2)> P2Dasm.Sandbox.decode_instr(&P2Dasm.Sandbox.dis_instr/1, <<0::size(32)>>)
"NOP"

I want to execute the instruction on my emulated cog:

iex(3)> cogstate = :something_fake
:something_fake
iex(4)> functiontopass = fn(instruction) -> P2Dasm.Sandbox.exe_instr(instruction, cogstate) end
#Function<6.99386804/1 in :erl_eval.expr/5>
iex(5)> P2Dasm.Sandbox.decode_instr(functiontopass, <<0::size(32)>>)
:something_fake
iex(6)>

This last one may need a little explaination, but it’s the same principle as the makeadd function I introduced up above. The value functiontopass is a function that has a copy of the value of cogstate embedded in it at the time that it was created. It is NOT a function pointer. If it were a function pointer and cogstate changed before the decode_instr was called then you’d expect it to execute on that changed state.

No, functiontopass is immutable. It’s not changing, it is what it is until the heat death of the universe (or garbage collection, whichever comes first)

Lastly, there is a final option, and that’s the case where instead of having the code execute and pass back the new state we pass back a function that takes NO arguments but when executed actually does the execution. Why would you do that? What if you wanted to write a larger function that did a diff between the old state and the new state in order to do insane levels of introspection?

We wrap it in a function call and return that function:

iex(6)> functiontopass = fn(instruction) -> fn() -> P2Dasm.Sandbox.exe_instr(instruction, cogstate) end end
#Function<6.99386804/1 in :erl_eval.expr/5>
iex(7)> wrapped_execution = P2Dasm.Sandbox.decode_instr(functiontopass, <<0::size(32)>>)
#Function<20.99386804/0 in :erl_eval.expr/5>
iex(8)> wrapped_execution.()
:something_fake

It is vital to understand that the actual execution of the instruction doesn’t happen until that last line.

That’s all for now!

Ladies and Gentlemen, that last example took a function as an argument and returned a function as a result.

There’s a lot to chew on on this page if you’re unfamiliar with functional terms… next up, we’ll start by designing and building our cog and execute some (well, one) instruction on it.

Enter… the NOPSLED