Building Computing Machines from Ground Up: BrainF--k Interpreter
revision 0.7.3
In this post I will build upon a design of simple programmable counters and implement a primitive BrainF–k machine.
BrainF–k Revisited
BrainF–k is a minimalistic programming language that is supposed to be interpreted by a BrainF–k machine. This machine consists of two tapes, one (unlimited) for program, another looped one for data. Data tape contains 2^15 locations or cells each of which can store an 8-bit value. A head can move along data tape, increment or decrement current value or reading or writing value from or to user. Head action is determined by current value of program tape. After current instruction is processed head steps to an adjacent instruction written on program tape.
There are just eight possible instructions, which is four more than needed to be Turing complete1: since tape and cell values are finite and can flip over there is no real need for decrements. Two extra instructions (for user i/o) are added for convenience, in principle they are not needed as well since input can be preset before starting the machine and output collected if it ever stops. Below is a reference table of BrainF–k commands:
COMMAND No EFFECT
-------------------------------------------------------------------------------
+ 0 Increment value in current head position;
- 1 decrement value in current head position;
> 2 move head right (i.e. increment head position);
< 3 move head left (i.e. decrement head position);
. 4 output current cell value;
, 5 input into current cell;
[ 6 do nothing if current value is nonzero, fast-forward to command
after matching ] otherwise;
] 7 do nothing if current value is zero, go back to command after
matching [ otherwise.
The machine is extremely simple and an emulator could be trivially implemented in any non-esoteric programming language in a matter of minutes. Having refreshed what BrainF–k is let us dive into implementation of the machine in hardware. But before that
A word on tapes
When talking about Turing tar pits a term “tape” is frequently used. This word bears a distinct sound of age long gone and can confuse, but in reality it merely states the fact that memory device used in a machine is sequential. Thus there is no need to use real tape for building a BrainF–k machine, anything that allows sequential access to data stored inside will do. For example RAM chip for which address is generated by a counter can serve as an example of tape emulating device: the only way to access any given location in memory is to count up or down towards it from current one. A possible circuit is presented on the image below:
Sequential memory unit made out of ordinary RAM and a counter. Transition to high on IncrAddr and DecAddr inputs increments or decrements current address respectively. Data is a bus shared betweeen imput and output, Read/Write toggles chip operation mode between reading data from it and writing data to it. Reset asynchronously clears memory, ChipSelect enables or disables component, pulse on Clock input writes value to memory.
I have not described RAM chips yet, but Logisim implemtation is quite similar to ROM with the difference of extra inputs allowing to write data to addresses present on address input. Conceptually it would probably be right to use circuit similar to described above as a tape emulator and some other circuitry to emulate head. In fact BrainF–k machine that I designed first historically did exactly this. Such approach is also closer to more traditional computers where one can separate CPU from data and program memory. However I will not follow this route because on one hand it would further break my commitment to medium scale ICs (which I still have to unbreak by replacing control ROMs with something less integrated), on the other hand even when RAM is made from scratch the solution does not scale down well. RAM units are not really complex, however they are extremely repetituous and labor-intensive: hundreds or even thousands of identical subuntis must be produced to assemble RAM and it does not make much sense to build RAMs to capacity of several bytes. Any possible gain of proper RAM unit would be negated at such capacities and in the end will require additional effort to bring in more such elements.
Instead I decided to design BrainfF–k machine around memory device consisting of indvidual cells, each of which is responsible for decoding address in memory and acting upon that. Such decision significantly increases component count and price of a final machine, but at the same time gives several advantages. First of all, if each cell is independent one can add as many cells to minimal working solution as she sees fit: eight cells would work as good as twenty or one thousand twenty five. When a need for more memory arises more cells could be added seamlessly provided that address bus capacity is not exhausted. For example BrainF–k machine could be demonstrated to operate properly with three cells avaliable on tape, but more complex programs like fixed point arithmetics computations or equations solvers would require several times more cells while emulators of other computing systems would happily claim as much memory as a user could afford. Another advantage lies in ability to visualise cell contents and behavior. Adding several LEDs to each cell allows one to easily monitor its operation, provides very straightforward debugging tool and even allows one to create makeshift output device with little overhead.
Consider once again a counter described in the previous post. It can load and store data. Several such counters could be used together to store more than one value. A multiplexer could be used to select which counter to use basing on a value in another counter. This departs from the original plan of address self-decoding, but I believe that it is fine for demonstration purposes and address-aware cells could be introduced separately as a drop-in replacement. The behavior described so far could be much easier implemented with plain registers, but counters allow for one extra shortcut: I don’t need to design a separate block for arithmetics any more, since each memory cell is capable of incrementing and decrementing! All I have to do is to come up with a bus that will route all necessary signals properly.
The resulting “tape” then will look like this:
Sequential memory unit made from discrete counters. Eight counters are connected to demultiplexed control bus CTLIn, their outputs multiplexed and outputed to DOUT after being trimmed to 7 bit width. Data could be written to current cell through DIN port, cell selection is done by a counter driven by ACCtl input. For demonstration purpose address is encoded as 8 bit value and output through Addr port. Each counters output is connected to a separate output port to support visualisation.
An individual counter wiring for sequential memory unit.
This approach while very handy for BrainF–k machine blurs the border between CPU and memory as they are present in more practical computers. That said a BrainF–k machine as described below could be easily modified to use RAM chip and distinctive head – a single counter could perform increments and decrements of values, provided that control ROM contains descriptions of steps necessary to load values from memory and to store them back.
Note that here I heavily use Logisim tunnels, entitites that share state with peers of the same name. This feature is extremely handy because circuits tend to become a mess otherwise. Individual counters and necessary wiring is hidden in subcircuits. Also here I cut several more corners by using multiplexers to route control signals to cell selected via address input and to collect its output. In real world it will be rather hard to build a multiplexer that switches a bus between multiple addresses, moreover such design would be very unflexible. This problem could be solved by connecting each cell’s control inputs to the same source through two input AND gate with second input fed by cell’s address detector. Outputs can be bundled together in the same way with each individual output line ANDed with address detector output and then by performing OR between all same lines of all cells. Alternatively a tri-state buffer (a buffer which effectively shuts output off by switching to third, high-impedance state) could be used instead of logic gates. Note, however, that both solutions have drawbacks. Tri-state buffers are expensive and could be not available in component base of choice, and multi-input OR gates built by cascading regular two to four input gates introduce huge propagation delays: it takes time for each individual cell to change its state according to input. Connecting them in series effectvely multiplies propagation delay of individual cell by the number of stages. Consider typical propagation time of a CMOS two input OR gate of about 60 ns. With sixty three stages it will take about 3.8 us for output to stabilize which translates to maximum feasible clock frequency of about 250 kHz. 255 stages would cap frequency at about 60 kHz. This problem could be mitigated by building custom multi-input OR gate, but this is a topic for a different discussion.
The memory unit shown above exposes 8-bit DIN bus for uploading data to current cell, 3-bit CTLin bus for control signals to drive individual cells, 3-bit ACCtl bus to drive address counter and a 1-bit reset input. It provides output DOUT for reading current value trimmed to seven bit, but more of that later. Note that I decided to make address counter part of memory unit. I did it for demonstration purpose only. The way it is done now it will make stacking such memory units a much harder task than it should be.
Below is an image of a sequential memory unit connected to a LED matrix for the purpose of memory state visualisation:
Sequential memory unit connected to LED array to visualize memory contents. DIN -- data input, CADDR -- current address output, ADDRCTL -- address counter controls, CTL -- memory cell control inputs, DOUT -- data output, RESET -- reset input. A number after colon denotes port width.
As for program tape regular ROM with a program counter will do the trick.
Having discussed how tapes could be approached let us move towards the design itself.
Fist half of commands
I will start with first four commands from the table above. They all are very
similar and actually quite familiar since they increment or decrement a single
value, data value in case of +
and -
and address in case of >
and
<
. I will use the same technique as before: use a table stored in ROM in
conjunction with a register to drive other elements.
To populate the table let us first enumerate all possible commands in the language. I suggest following enumeration from the table above. I believe now it should be pretty clear that control table will be almost identical to the one seen in previous post since the logic and the sequence of steps are pretty much the same. Now since there are several counters in the circuit I introduce two CTRSEL lines which will help me to route control signals to proper counters.
ADDRESS STORED VALUE
+-----------+ +----------------------+
Command CODE STEP STATE VECTOR NEXT STEP --> ADDR VALUE
-------------------------------------------------------------------------------
+ 000 00 010 00 01 0x0 0x21
000 01 011 00 10 0x1 0x32
000 10 010 11 11 0x2 0x2f
000 11 011 11 00 0x3 0x3c
- 001 00 110 00 01 0x4 0x61
001 01 111 00 10 0x5 0x72
001 10 010 11 11 0x6 0x2f
001 11 011 11 00 0x7 0x3c
> 010 00 010 01 01 0x8 0x25
010 01 011 01 10 0x9 0x36
010 10 010 11 11 0xa 0x2f
010 11 011 11 00 0xb 0x3c
< 011 00 110 01 01 0xc 0x65
011 01 111 01 10 0xd 0x76
011 10 010 11 11 0xe 0x2f
011 11 011 11 00 0xf 0x3c
-------------------------------------------------------------------------------
STATE VECTOR = [LOAD, COUNT, CLK, CTRSEL:2]
CTRSEL:
00 -- current cell counter 01 -- address counter 11 -- program counter
Since all commands are basically identical let’s discuss the first one. It consists of four steps. First two steps increment counter 0b00 which is current memory cell: load for counter 0b00 is set high, then while it is sill high clock pulse is sent to counter 0b00. Second two steps are identical, but are performed for counter 0b11 which is program counter.
An animation of a circuit that can run programs consisting solely of these four commands is shown below
A circuit that implements first four commands. PCL -- PC latch, blue LEDs indicate current memory location.
Control ROM data line has the same structure as state vector in control table footer. The program which is running on the animation above is the following:
+>++>+++---<--<-
Note that sequential memory unit also outputs its address for demonstration purpose. Also note that in this circuit PC connected to program ROM through additional register clocked by falling edge of PC clock. This is done to prevent final step jumps: simple inverter is not enough in this case because control signals are not static any more and change values simultaneously with clock. That is the reason to introduce one extra latch to hold current address until it is not needed any more. Also it might be tempting to reduce the number of steps needed to modify counter contents to just one, but in practice that could lead to erroneous counter updates when transitioning from one step to another so I decided to play it safe and keep separate step for control setup and clock update for now.
So far so good! One half of the commands are dealt with. Now let us finish the task and add missing commands, they can’t be that different, can they?
Adding I/O
The next two commands in the table are .
for output and ,
for input. The
main problem with output is the need for a proper device to which data will be
written. Again we see an example of a device which requires more patience than
ingenuity, since any electronic output device could be reduced to an array of
memory elements with some auxiliary logic to process line feed, carriage return
and other control symbols. A rudimentary terminal could be rather easily built
with enormous amount of shift registers, a ROM with symbol table, and some
counters with little control logic. I will, however, postpone this
demonstration and will use terminal emulator provided by Logisim. It has quite
simple interface – one 7-bit data input for ASCII values and clock input which
actually puts a symbol present on data input on screen.
ROM contents is fed to a display. Memory data output is sent directly to display data input, the same clock pulse is used to both cycle through memory and to update display.
Thus to output cell contents to a terminal this data must first be put on terminal data input, then a terminal clock pulse should be issued. First part comes for free – cell contents is always present on sequential memory output. It only has to be trimmed to fit ASCII width. The pulse could be done the usual way. Thus previous control table could be extended with the following one:
ADDRESS STORED VALUE
+-----------+ +----------------------+
Command CODE STEP STATE VECTOR NEXT STEP --> ADDR VALUE
-------------------------------------------------------------------------------
. 100 00 0000 10 01 0x10 0x09
100 01 1000 10 10 0x11 0x8a
100 10 0010 11 11 0x12 0x2f
100 11 0011 11 00 0x13 0x3c
-------------------------------------------------------------------------------
STATE VECTOR = [TERMCLK, LOAD, COUNT, CLK, CTRSEL:2]
CTRSEL:
10 -- no counter selected 11 -- program counter
Not much to discuss here – a new control line is added to drive clock input of terminal device, its change to high is padded in the table (step 0b00) to avoid spurious terminal updates.
Output was quite straightforward, even though it required a new complex device
to be used and I had to increase my backlog of promises by one more entry.
Input is a little more involved. Let us pause for a moment and consider what
does it mean to input data into machine. From perspective of a counter it means
that I must set proper values on control inputs and then generate clock pulse
when the data is available on data input. The last part poses a problem: this
data is provided by a user and even slowest machines will operate much faster
than the fastest of users could react to data input prompt and toggle switches
or press buttons. Thus the machine must stop doing anything until user finishes
entering data. This effectively means that machine must freeze mid-command.
That, in turn, means that after starting sequence for ,
machine must stop
reacting to clock input until user allows it to continue. It is tempting to
break clock line with some logic to achieve just this, but it is better not to
since it is an antipattern which won’t harm in this particular case, but has
great potential for elusive bugs in more complex designs. It is better to
prevent next step latch from operating by changing its control inputs states.
Indeed if constant value at load input is replaced by inverted output of a
one-bit latch then its operation is unchanged. Then we can asynchronously set
the latch which will result in next step latch stopping to react to clock
input. Providing a button which asynchronously resets the latch to the user
would allow the machine to continue operation after the user is done with
input.
Thus the sequence of steps should look like this: first stop the machine by blocking next command latch, then (when it gets unblocked by a user) set current cells controls to load value from input, issue clock pulse to current cell, then increment program counter. Resulting addition to control table will look like this:
ADDRESS STORED VALUE
+-----------+ +----------------------+
Command CODE STEP STATE VECTOR NEXT STEP --> ADDR VALUE
-------------------------------------------------------------------------------
, 101 000 10000 10 001 0x28 0x211
101 001 00100 00 010 0x29 0x082
101 010 00101 00 011 0x2a 0x0a3
101 011 00010 11 100 0x2b 0x05c
101 100 00011 11 000 0x2c 0x078
-------------------------------------------------------------------------------
STATE VECTOR = [HLT, TERMCLK, LOAD, COUNT, CLK, CTRSEL:2]
CTRSEL:
00 -- current cell counter 10 -- no counter selected 11 -- program counter
Note that one more control line, HLT, is added. Signal on HLT synchronously sets trigger which controls next step operation. Also I had to add another address line to account for the fact that now I need five steps to perform this operation. As a result the entire table must be amended to account for this change. In order not to clutter the text I put it here. Resulting circuit is presented on image below:
A circuit demonstrating I/O. HALT LED lights up when circuit is waiting for user input, CONTINUE button unsets J-K trigger which gets blocked by high level on HLT line.
The program it is running is quite straightforward:
,>,>,>,>,>,>,>,++<+<+<--<<---<+<.>.>.>.>.>.>.>.,
It reads eight values from input to eight neighbouring cells, modifies some values, prints results on a terminal and finally halts by starting to wait for more input. If all input values are 0x64 then after running the code combination deadbeef should appear on output device. On the schematic above I have hardwired input to be always equal to 0x64, but in real circuit one could use eight switches to produce desired result.
Now it is time for a word of caution. Wiring a button directly to asynchronous reset is definitely not the best idea. The button is human operated and when clock frequency is more than several hertz it becomes extremely easy to keep it pressed way after command processing is finished. Since asynchronuous reset is level-driven this will eventually lead to wrong values being read from input. Much better solution would be to make the button generate single pulse of fixed width per button press. However I did not come up with a clean way to do this in Logisim – it is possible to generate a pulse with an invertor and AND gate since propagation delays are taken into account to some extent, but I have not devised any straightforward way to debug it, so here is this warning instead. On the other hand in real circuit a 555 timer in standard single-pulse configuration should suffice.
Last two commands
Even after a cursory glance it becomes obvious that two final commands stand out from all previously seen. By definition each of them does not one, but two different things depending on condition which could be determined only in run time. When a cell that happens to be current when PC starts pointing to a bracket contains zero a sequence of actions differs from one when a cell is non-zero. Thus each command must have more than one entry in control table. This is somewhat new since all previous examples had only one and immediately poses a question: how exactly one could distinguish between different entries for a single command? Obviously runtime state should somehow be used to determine particular entry. And since entries start at different addresses an extra address line driven by run time state comes to resque. State of this line should be determined by the contents of current cell and could be easily generated by 8-NOR gate fed by cell output:
Zero flag implementation. Counter output is connected to 8-NOR gate. Counter holds value of 0 thus 8-NOR output is 1 which is indicated by LED labled 'Zero'.
Let us call this line Zero Flag. Now command code is extended by Zero Flag and proper step sequence is chosen depending on current cell state.
This immediately poses a question: what to do with other commands? It is impossible to introduce another address line and not to affect other commands. At the very least all locations would move, but in case of zero flag which became a part of address extra entries must be added to control table since it operates continuously and is present at every moment. Fortunately it does not affect execution of other commands so entries for them could be just copied to new locations in control ROM.
Another helpful property of the last two commands is their symmetry. You can
pick any one of them, invert all statements and you’ll have second command
description. Thus I’ll discuss only one command in details, second one could
be easily derived. For that I pick [
.
The part when current cell is non-zero is trivial: do nothing which effectively means ‘do nothing, but remember to increment PC’. That easily translates into two entries in control table:
ADDRESS STORED VALUE
+-----------+ +----------------------+
Command CODE Z STEP STATE VECTOR NEXT STEP --> ADDR VALUE
-------------------------------------------------------------------------------
[ 110 0 000 00010 11 001 0x60 0x59
110 0 001 00011 11 000 0x61 0x78
-------------------------------------------------------------------------------
STATE VECTOR = [HLT, TERMCLK, LOAD, COUNT, CLK, CTRSEL:2]
Z -- zero flag, is 1 when contents is 0
CTRSEL:
11 -- program counter
Now let us look closer at the second part as it looks a bit tricky. The description says ‘fast forward to matching bracket’. Every word in this description is important. First of all, ‘fast forward’ means that address must be continuously incremented. No matter which command appears during this increment it must be skipped. Which in turn means that from now on each command must have two types of entries for it: one for regular operation, one for skipping any activity. This could be achieved by introducing another flag line driven by a trigger. Let us call this one ‘motion flag’. Since brackets are symmetrical I plan to reuse it later.
Second important phrase is ‘matching bracket’. Arguably the ability to nest
brackets define the power of the language, thus it is crucially important to
properly skip some brackets as well! Consider, for example, this snippet:
[>[-]-]
. If when it is encountered address counter contains address of an
empty cell then everything inside the outer brackets must be ignored. How this
should be implemented? Note, that brackets in a correct program are balanced
i.e. the number of of opening and closing brackets match and first bracket in
any program is the opening one. Thus if one starts with an empty counter at
the beginning of a program, steps through it without executing commands and
increments the counter each time an open bracket appears and decrements it when
closing bracket is met, in the end this counter should contain zero as well.
This zero value will appear as soon as closing bracket were reached. The same
holds true for any subprogram inside a pair of brackets: a pair of brackets
can’t contain closing bracket for a bracket outside of it since it would cause
ambiguity. So if a subprogram starts with a bracket, we traverse it and update
counter as described and at some point see closing bracket simultaneously with
empty counter then this bracket is the matching one. This logic could be
implemented in hardware with yet another counter, let me call it bracket
counter, and a flag – matching bracket flag – to indicate that the counter is
zero. When matching bracket flag is 1 then the next bracket is matching and
motion flag must be toggled and thus normal operation should resume.
Unfortunately this requires yet another address line, but at least other
commands are not affected by it so no need to modify them, copying values is
enough. Note however, that other flags could in general affect what should be
done so don’t forget to copy responsibly.
Now let us look closely at [
one more time. Let us ask ourselves, have we
analysed interpreter’s behaviour in fullness? Let us imagine a slide over
program in search for matching bracket during which we meet [
. Do we have
enough data to do the right thing? We have Motion Flag to tell us that we are
moving, we have zero flag which we don’t care about, finally we have matching
bracket flag to tell us whether we should stop. Looks almost good, but there is
this last instruction which was supposed to be absolutely symmetric and which I
was hoping to just generate automatically: it could initiate this slide as well
and that means that we can slide over [
in two cases, when searching for
matching ]
and when searching for matching [
. Which means that bracket
counter could be either incremented or decremented, depending on command which
has initiated this slide. Thus when updating PC during a slide we need one more
bit of data, namely movement direction to modify it properly. For that I will
introduce one more flag: direction flag. While the same result could have been
achieved with a latch and some basic logic without introducing an extra address
line I opted for an explicit flag since it makes it clearer what exactly is
going on. Finally it appears that [
is covered from all sides: regular
operation, movement and proper PC and flags update action depending on movement
direction.
Let me condense and revisit the discussion above before presenting you with a
full control table. After considering [
control ROM address bus got extra
four bits driven by flags: Zero, Motion, Bracket match and Direction.
Flags have the following meaning:
0 1
-------------------------------------------------------------------------------
[Z]ero Current cell is not zero Current cell is zero
[M]otion Regular operation Moving to find match
[B]racket match Next bracket is not matching Next bracket is matching
[D]irection Moving forward Moving backwards
Depending on these flags [
is interpreted in the following ways:
- when a cell is non-zero and no motion is taking place then just increment the PC independently of remaining flags values2 ;
- when a cell is zero and no motion is happening toggle motion flag and increment PC;
- when moving and program pointer is moving forward increment bracket counter and PC no matter what are the values of zero flag or branch match (branch match will change anyway so no reason to care for its value now);
- when moving and moving backwards and bracket is not matched decrement bracket counter and decrement PC no matter what zero flag value is;
- when moving and moving backwards and bracket matches then toggle motion and direction flags and increment PC.
Reasoning about ]
is basically the same so I won’t elaborate on it here.
Other commands are modified as follows:
- Original entry is kept for regular operation (i.e. Motion Flag equals to zero) and any other flags combinations;
- when moving and direction is forward just increment PC disregarding other flags’ values;
- when moving and moving backwards decrement PC.
Now we are ready to populate the final control table. To save space I will show only unique entries. I will denote that entry is the same for all values of a flag by replacing a value with X. Thus final table will look like this:
ADDRESS STORED VALUE
+--------------+ +-----------------------+
Command CODE ZMBD STEP STATE VECTOR NEXT STEP --> ADDR VALUE
-------------------------------------------------------------------------------
+ 000 X0XX 000 0000 010 000 001 ... 0x0081
000 X0XX 001 0000 011 000 010 ... 0x00c2
000 X0XX 010 0000 010 011 011 ... 0x009b
000 X0XX 011 0000 011 011 000 ... 0x00d8
000 X1X0 000 0000 010 011 001 ... 0x0099
000 X1X0 001 0000 011 011 000 ... 0x00d8
000 X1X1 000 0000 110 011 001 ... 0x0199
000 X1X1 001 0000 111 011 000 ... 0x01d8
- 001 X0XX 000 0000 110 000 001 ... 0x0181
001 X0XX 001 0000 111 000 010 ... 0x01c2
001 X0XX 010 0000 010 011 011 ... 0x009b
001 X0XX 011 0000 011 011 000 ... 0x00d8
001 X1X0 000 0000 010 011 001 ... 0x0099
001 X1X0 001 0000 011 011 000 ... 0x00d8
001 X1X1 000 0000 110 011 001 ... 0x0199
001 X1X1 001 0000 111 011 000 ... 0x01d8
> 010 X0XX 000 0000 010 001 001 ... 0x0089
010 X0XX 001 0000 011 001 010 ... 0x00ca
010 X0XX 010 0000 010 011 011 ... 0x009b
010 X0XX 011 0000 011 011 000 ... 0x00d8
010 X1X0 000 0000 010 011 001 ... 0x0099
010 X1X0 001 0000 011 011 000 ... 0x00d8
010 X1X1 000 0000 110 011 001 ... 0x0199
010 X1X1 001 0000 111 011 000 ... 0x01d8
< 011 X0XX 000 0000 110 001 001 ... 0x0189
011 X0XX 001 0000 111 001 010 ... 0x01ca
011 X0XX 010 0000 010 011 011 ... 0x009b
011 X0XX 011 0000 011 011 000 ... 0x00d8
011 X1X0 000 0000 010 011 001 ... 0x0099
011 X1X0 001 0000 011 011 000 ... 0x00d8
011 X1X1 000 0000 110 011 001 ... 0x0199
011 X1X1 001 0000 111 011 000 ... 0x01d8
. 100 X0XX 000 0000 000 010 001 ... 0x0011
100 X0XX 001 0001 000 010 010 ... 0x0212
100 X0XX 010 0000 010 011 011 ... 0x009b
100 X0XX 011 0000 011 011 000 ... 0x00d8
100 X1X0 000 0000 010 011 001 ... 0x0099
100 X1X0 001 0000 011 011 000 ... 0x00d8
100 X1X1 000 0000 110 011 001 ... 0x0199
100 X1X1 001 0000 111 011 000 ... 0x01d8
, 101 X0XX 000 0010 000 010 001 ... 0x0411
101 X0XX 001 0000 100 000 010 ... 0x0102
101 X0XX 010 0000 101 000 011 ... 0x0143
101 X0XX 011 0000 010 011 100 ... 0x009c
101 X0XX 100 0000 011 011 000 ... 0x00d8
101 X1X0 000 0000 010 011 001 ... 0x0099
101 X1X0 001 0000 011 011 000 ... 0x00d8
101 X1X1 000 0000 110 011 001 ... 0x0199
101 X1X1 001 0000 111 011 000 ... 0x01d8
[ 110 00XX 000 0000 010 011 001 ... 0x0099
110 00XX 001 0000 011 011 000 ... 0x00d8
110 10XX 000 1000 010 011 001 ... 0x1099
110 10XX 001 0000 011 011 000 ... 0x00d8
110 X1X0 000 0000 010 100 001 ... 0x00a1
110 X1X0 001 0000 011 100 010 ... 0x00e2
110 X1X0 010 0000 010 011 011 ... 0x009b
110 X1X0 011 0000 011 011 000 ... 0x00d8
110 X101 000 0000 110 100 001 ... 0x01a1
110 X101 001 0000 111 100 010 ... 0x01e2
110 X101 010 0000 110 011 011 ... 0x019b
110 X101 011 0000 111 011 000 ... 0x01d8
110 X111 000 1100 000 010 001 ... 0x1811
110 X111 001 0000 010 011 010 ... 0x009a
110 X111 010 0000 011 011 000 ... 0x00d8
] 111 00XX 000 1100 110 011 001 ... 0x1999
111 00XX 001 0000 111 011 000 ... 0x01d8
111 10XX 000 0000 010 011 001 ... 0x0099
111 10XX 001 0000 011 011 000 ... 0x00d8
111 X1X1 000 0000 010 100 001 ... 0x00a1
111 X1X1 001 0000 011 100 010 ... 0x00e2
111 X1X1 010 0000 110 011 011 ... 0x019b
111 X1X1 011 0000 111 011 000 ... 0x01d8
111 X100 000 0000 110 100 001 ... 0x01a1
111 X100 001 0000 111 100 010 ... 0x01e2
111 X100 010 0000 010 011 011 ... 0x009b
111 X100 011 0000 011 011 000 ... 0x00d8
111 X110 000 1000 000 010 001 ... 0x1011
111 X110 001 0000 010 011 010 ... 0x009a
111 X110 010 0000 011 011 000 ... 0x00d8
-------------------------------------------------------------------------------
FLAGS = [Zero, Motion, Bracket match, Direction]
STATE VECTOR = [(TOGGLE_MOTION_FLAG, TOGGLE_PC_DIRECTION, HLT, TERMCLK),
(LOAD, COUNT, CLK), CTRSEL:3]
CTRSEL:
000 -- current cell counter 001 -- address counter
010 -- no counter selected 011 -- program counter 100 -- bracket counter
I have not expanded address values since it would have taken too much screen space and put ellipses in their place. Final circuit is presented below:
Final computer. Part of circuit below dashed line is essentially the same as for six commands with the exception of flag latch FLatch and address and control lines necessary for flags operation. T flip-flops MF and MD implement motion and motion direction flags respectively, counter BCtr with 8-NOR gate implement branch counting and matching branch flag, 8-NOR gate ZF implements zero flag.
The circuit is mostly the same with the exception of flags which are separated from the familiar part with a dashed line. Note, that flag values are not connected directly to control ROM but instead routed through a latch which is clocked by the same pulse that clocks PC. This is done to prevent mid-command jumps when a flag gets updated. Also I had to use lots and lots of tunnels to keep schematics readable.
Here the final circuit is shown running hello world example from Wikipedia with user input added as final command to halt the machine. The program looks like this:
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.
<.+++.------.--------.>>+.>++.,
It takes a while to evaluate it and at regular demonstration speed a full animation would have taken about twenty minutes to cycle through, with most of the time spent shuffling values around which is even less entertaining to watch than it sounds, that is why I included only part of it. And here you can find sources for the circuit and play with it. Note that sequential memory unit has two separate outputs for data, one 7-bit, the other 8-bit. This is done to simplify circuit appearance since it does not require separate converter from 8-bit to 7-bit which is needed by display.
Afterword
BrainF–k computer as it is presented here is quite impractical since it is lacking most of the features of modern digital computers taken for granted. Even if one can abstract for a moment from how depressingly low level it is such glaring holes as lack for interrupt support, or of jumps catch an eye. But one must remember, that this computer is but a step towards building a much more user-friendlier machine. In this section I will sketch a path forward or how one could build up even on such modest and unpalatable design. I will briefly outline simple hardware tweaks and little amendments which, while staying mostly within strict framework of the language allow one to do much more with a BrainF–k machine than could be fathomed at first (or even second) glance.
First of all, consider output. In machine presented above it is done in a
straightforward way: by using .
command. But what if I said to you, that .
is not really needed? After all all that .
does is pulsing single OUTPUT
CLOCK line. But if you are ready to sacrifice one memory cell and wire its
lower bit to OUTPUT CLOCK and then run code like >{go to this special
cell}++<{go back}
, this action would be equivalent to running .
from the
perspective of an external circuit waiting for a clock pulse on OUTPUT CLOCK.
Having two or many such special purpose cells could provide control over
several pulse-triggered devices. Or alternatively a memory cell could be
sacrificed to drive a selector which chooses where to route clock pulse
generated by .
basing on this cells contents. Or this cell could be
designated as an output port directly wired to some device like display
controller data bus and ++
on yet another special-purpose cell would be wired
to CLK of the same external device? Or several memory cells could be used
together to form much wider buses.
Second, adding circuitry that allows one to stop machine at will and then resume execution opens even wider horizons. Think of direct memory access and some rudimentary interrupt handling. Then one could always connect some kind of sensor directly to one or more memory cells and use it like some hellish replacement for microcontroller board, with the number of input ports limited by imagination and physical constants. Or think of expanding memory of a single BrainF–k computer by adding a dedicated bank switching cell. As if this were not enough think of sacrilege of pseudorandom access to program tape: ROM switching through a dedicated memory location could substitute procedure calls with a convention in place for data exchange. And I have not come to the most exciting part of the story yet.
Third, single BrainF–k computer is quite unwieldy: it is hard to program, it is absurdly sequential, it is uninterruptable. But what if there are more than one such computer tied together by means of halt lines and shared memory cells? What if some of these machines are used as controllers, bound to run the same program forever? Using BrainF–k machine itself as a building block running relatively simple code lets one achieve much more than could be with single computer running more sophisticated program. Moreover, individual machines could easily be built with less memory and other task-specific optimisations. It could easily turn out that building a say 8080 emulator with several BrainF–k computers is simpler than writing complete emulator for one such computer. The list of options could go on and on, with basic BrainF–k machine being just a small peek into realm of computers.
Finally it is important to point out, that this design is very rough and unoptimized and many implementation details are not top-performers in the very least. Control table entries are not optimized and at times excessive because I decided that hardware running them will be easier to understand, circuits could be tuned much finer as well. I have, however, fought the urge to optimize it further, since I believe this is the case when crude design can convey an idea better than fully optimized one. This is just the first step after all.
Updates
-
When I have almost finished writing this post I found this gem – a good example of minimalistic design plagued by a very different design constraint. Instead of lots of small chips all is done with several ROMs which is expensive gate-wise, but extremely educational.
-
Kudos to Settis for providing a patched version of Logisim which greatly simplified generation of frames for animations – it took much less time to redraw everything with it than to draw original erroneous images!
-
Other flags should be zero as well and a better way to handle this would be to halt the computer if such state is reached, but this adds to overall complexity of the design which I desperately tried to keep simple, and thus decided to postpone error signaling till next time. The same reasoning stands for other steps here. ↩