PDP-7 Unix

Unix version 0 was written in 1969 by Ken Thompson, on a PDP-7.  Recently, the source code for Unix V0 has been discovered, and you can read it here, as pdf scans of printouts.  You can read about the discovery, and the effort to boot Unix V0 on a real PDP-7, here.  The project home page is here.

I got interested in PDP-7 Unix, and then in PDP-7 assembler.  Eventually, I wrote an Antlr4 grammar to parse PDP-7 assembler files in the original "as" format that Thompson wrote them in, here.  The resulting grammar is here.


Building a simple RIAK ORM

I've been interested in RIAK for a while, and ORMs are nothing short of fascinating. I decided to try writing an ORM for RIAK, and the results are here:


My ORM is not an ORM, of course, because RIAK is not relational. However, it is ORM-like; I can store POJOs and retrieve them.

The features I wanted for my ORM were those that I am accustomed to with Hibernate, or eBean.

  • Ability to store POJOs and retrieve them
  • Ability to store object trees of POJOs which contain POJOs
  • Support for Lists of POJOs composed inside a POJO
  • Lazy Loading

In the end, I ended up with an ORM-like layer that can store data in any Key-Value store. There is a plugin which supports RIAK, and there is an emulated Key-Value store built on a filesystem, which is useful for development purposes. Theoretically cBean could work on any Key-Value store, such as MongoDB, but I haven't built that support yet. Adding support for a Key-Value store is as simple as implementing the interface KVService.
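To illustrate the pluggable-store idea, here is a minimal sketch in Python rather than cBean's own Java. The names KeyValueStore, InMemoryStore and FileStore, and the put/get/delete methods, are hypothetical stand-ins; the real KVService interface may look quite different. The file-backed variant assumes keys are safe to use as file names.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Optional


class KeyValueStore(ABC):
    """Hypothetical stand-in for the role KVService plays (not the real interface)."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store value under key, overwriting any existing value."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the value for key, or None if the key is absent."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove key if it is present."""


class InMemoryStore(KeyValueStore):
    """Dictionary-backed store, handy for tests."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class FileStore(KeyValueStore):
    """One file per key inside a directory (assumes keys are valid file names)."""

    def __init__(self, root: str) -> None:
        self._root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key: str) -> str:
        return os.path.join(self._root, key)

    def put(self, key: str, value: bytes) -> None:
        with open(self._path(key), "wb") as f:
            f.write(value)

    def get(self, key: str) -> Optional[bytes]:
        try:
            with open(self._path(key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def delete(self, key: str) -> None:
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            pass


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for store in (InMemoryStore(), FileStore(tmp)):
            store.put("42", b"first")
            store.put("42", b"second")  # same key: overwrite, not an error
            print(store.get("42"))      # b'second'
            store.delete("42")
            print(store.get("42"))      # None
```

Code written against the abstract class does not care which backend it is handed, which is the property that makes a filesystem emulation useful during development.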

Supported Types

  • All java simple types (int, String, long, etc)
  • All java wrapper types (Integer, Long, etc)
  • Contained POJOs
  • java.util.List of POJOs
  • java.util.UUID
  • java.util.Date


Similar to Hibernate or eBean, cBean POJOs must be annotated. I didn't choose to use the JPA annotations, but instead defined my own. They are here.  There are numerous cBean annotated POJOs in the tests, here.


Similar to JPA, the @Entity annotation simply marks the POJO as one which is of interest to cBean.


Every POJO must have an id field and it must be a String or UUID.  The @Id field is a little different in cBean than in JPA.  Inserting a new object with the same JPA @Id as one that already exists in the RDBMS is an error.  In cBean, you simply overwrite the existing object.


The @Property annotation is used to indicate that a specific property (simple type, wrapper type, object type or list type) will be persisted.  POJO fields without the @Property annotation are ignored.  There are a number of properties which are valid on an @Property annotation.

  • cascadeSave
  • cascadeDelete
  • cascadeLoad
  • ignore
  • nullable

cascadeSave, cascadeDelete, and cascadeLoad are only relevant for POJO fields which are themselves POJOs, or lists of POJOs.  Setting "cascadeLoad=false", naturally, indicates that the field is lazy-loaded.


A POJO can define an Integer field and annotate it with @Version.  cBean will increment the annotated field by 1 each time the POJO is saved.

Referential Integrity

Frameworks like Hibernate or eBean provide referential integrity because RDBMSs provide referential integrity.  Key-Value stores, such as RIAK, do not provide referential integrity, and therefore neither does cBean.  It is therefore entirely possible to persist a POJO which contains a POJO, and have the contained POJO be deleted underneath the parent.  In an RDBMS this can be prevented with a foreign key; there is no such protection in cBean.  Application code using cBean must be aware that POJOs in lists, for example, may not be resolvable when the list is reloaded.

The strategy that cBean uses for handling broken "foreign keys" is two-fold:

  • If a POJO contains a child POJO and the child is deleted, that Object will be set to null on reload
  • If a POJO contains a list of POJOs and one of the elements is deleted, the element Object will be set to null.  The List size() will remain the same as when it was persisted.
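The two rules above can be modelled in a few lines. This is a sketch in Python, not cBean's Java code: the dictionary standing in for the Key-Value store, the record layout, and the function names resolve_child and resolve_list are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def resolve_child(store: Dict[str, dict], child_id: Optional[str]) -> Optional[dict]:
    """Return the stored child, or None when the key no longer resolves."""
    if child_id is None:
        return None
    return store.get(child_id)


def resolve_list(store: Dict[str, dict], child_ids: List[str]) -> List[Optional[dict]]:
    """Resolve every id; a missing element becomes None so the list keeps its size."""
    return [store.get(child_id) for child_id in child_ids]


# A toy "store" holding three children, and a parent that refers to them by key.
store = {
    "c1": {"name": "first"},
    "c2": {"name": "second"},
    "c3": {"name": "third"},
}
parent = {"id": "p1", "child": "c2", "children": ["c1", "c2", "c3"]}

# Delete a child underneath the parent; nothing prevents this.
del store["c2"]

children = resolve_list(store, parent["children"])
print(resolve_child(store, parent["child"]))  # None
print(len(children))                          # 3
print(children[1])                            # None
```

The practical consequence is that callers should null-check both single children and list elements after a reload.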

Example Code

There is a working example at https://github.com/teverett/cbean/tree/master/example.

Building QEMU

In general, I install QEMU on my Macbook using MacPorts.  However I recently had a need to get the tip of the QEMU development tree.

Getting the QEMU source tree is trivial:

git clone git://git.qemu-project.org/qemu.git

I needed an updated version of dtc:

git submodule update --init dtc

The build instructions from the README are:

 mkdir build
 cd build

However, in my case I only need ARM emulation, so:

../configure --target-list=arm-softmmu
make install

The binary qemu-system-arm will be at /usr/local/bin:

oscar:build tom$ /usr/local/bin/qemu-system-arm --version
QEMU emulator version 2.4.94, Copyright (c) 2003-2008 Fabrice Bellard




I've tried a couple of different mp3 taggers on my mp3 library; however, most seem to have trouble with large libraries.  So, after doing some reading about AcoustID and MusicBrainz, I decided to quickly code up my own tagger, MusicBrainzTagger.

MusicBrainzTagger is a command-line application which recurses a directory of mp3 files and tags each one, one by one.  This approach allows it to handle very large libraries; it only processes one file at a time.  File processing consists of reading any ID3 tags in the input mp3, and then calculating the Acoustic ID fingerprint.  The fingerprint is then resolved to a MusicBrainz ID which is used to look up the recording.
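The one-file-at-a-time approach can be sketched as a lazy directory walk. This is an illustration in Python, not MusicBrainzTagger's actual code: iter_mp3_files and process_library are hypothetical names, and the per-file work (reading ID3 tags, fingerprinting, the MusicBrainz lookup) is left as a callback because those steps depend on client libraries not shown here.

```python
import os
import tempfile
from typing import Callable, Iterator


def iter_mp3_files(root: str) -> Iterator[str]:
    """Lazily yield mp3 paths under root, so the whole library is never held in memory."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if name.lower().endswith(".mp3"):
                yield os.path.join(dirpath, name)


def process_library(root: str, handle_file: Callable[[str], None]) -> int:
    """Call handle_file on each mp3, one at a time, and return how many were seen."""
    count = 0
    for path in iter_mp3_files(root):
        handle_file(path)  # placeholder for: read tags, fingerprint, look up, retag
        count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as library:
        os.makedirs(os.path.join(library, "album"))
        for relative in ("a.mp3", "notes.txt", os.path.join("album", "b.MP3")):
            open(os.path.join(library, relative), "wb").close()
        print(process_library(library, print))
```

Because the walk is a generator, memory use stays flat no matter how many files the library contains.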

MusicBrainzTagger then tags the file, renames it, and moves it to a new directory.

Bare Metal coding on FreeBSD

I recently got interested in the technical details of how ARM OSes work, so I decided to try my hand at writing a simple one.  This blog post is not about the OS itself, but about setting up the development environment.

In my case, I'm developing in a terminal session, on FreeBSD 10 on an AMD64 host, so I'll need to cross-compile all my code.  Luckily, the ports tree includes gcc-arm-embedded, a port of the launchpad ARM cross tools.  It's easy to install:

pkg install gcc-arm-embedded

This package includes all the tools which are needed:

-rwxr-xr-x 1 root wheel 711488 Oct 3 11:17 arm-none-eabi-addr2line
-rwxr-xr-x 2 root wheel 740040 Oct 3 11:17 arm-none-eabi-ar
-rwxr-xr-x 2 root wheel 1298680 Oct 3 11:17 arm-none-eabi-as
-rwxr-xr-x 2 root wheel 620816 Oct 3 11:17 arm-none-eabi-c++
-rwxr-xr-x 1 root wheel 710528 Oct 3 11:17 arm-none-eabi-c++filt
-rwxr-xr-x 1 root wheel 620608 Oct 3 11:17 arm-none-eabi-cpp
-rwxr-xr-x 1 root wheel 29416 Oct 3 11:17 arm-none-eabi-elfedit
-rwxr-xr-x 2 root wheel 620816 Oct 3 11:17 arm-none-eabi-g++
-rwxr-xr-x 2 root wheel 620608 Oct 3 11:17 arm-none-eabi-gcc
-rwxr-xr-x 2 root wheel 620608 Oct 3 11:17 arm-none-eabi-gcc-4.8.4
-rwxr-xr-x 1 root wheel 24480 Oct 3 11:17 arm-none-eabi-gcc-ar
-rwxr-xr-x 1 root wheel 24448 Oct 3 11:17 arm-none-eabi-gcc-nm
-rwxr-xr-x 1 root wheel 24448 Oct 3 11:17 arm-none-eabi-gcc-ranlib
-rwxr-xr-x 1 root wheel 271072 Oct 3 11:17 arm-none-eabi-gcov
-rwxr-xr-x 1 root wheel 3992568 Oct 3 11:17 arm-none-eabi-gdb
-rwxr-xr-x 1 root wheel 776672 Oct 3 11:17 arm-none-eabi-gprof
-rwxr-xr-x 4 root wheel 1025912 Oct 3 11:17 arm-none-eabi-ld
-rwxr-xr-x 4 root wheel 1025912 Oct 3 11:17 arm-none-eabi-ld.bfd
-rwxr-xr-x 2 root wheel 722928 Oct 3 11:17 arm-none-eabi-nm
-rwxr-xr-x 2 root wheel 906848 Oct 3 11:17 arm-none-eabi-objcopy
-rwxr-xr-x 2 root wheel 1123424 Oct 3 11:17 arm-none-eabi-objdump
-rwxr-xr-x 2 root wheel 740056 Oct 3 11:17 arm-none-eabi-ranlib
-rwxr-xr-x 1 root wheel 365208 Oct 3 11:17 arm-none-eabi-readelf
-rwxr-xr-x 1 root wheel 712976 Oct 3 11:17 arm-none-eabi-size
-rwxr-xr-x 1 root wheel 712080 Oct 3 11:17 arm-none-eabi-strings
-rwxr-xr-x 2 root wheel 906864 Oct 3 11:17 arm-none-eabi-strip

Additionally, an ARM simulator such as QEMU will be needed.  FreeBSD also includes that port:

pkg install qemu-devel

I could easily use BSD Make; however, I prefer GNU Make, so I've installed that too:

pkg install gmake

With these tools installed, I have enough to cross-compile ARM assembler and C code, link it, and run it in QEMU and debug with GDB.

FemtoOS; a simple bootloader and protected mode kernel

I've recently become interested in how the i386 boot loader works.  There is an excellent example of a boot loader here, and another here.  Some simple protected-mode code which implements a kernel capable of writing a line of text is here.  FemtoOS is the result of combining code from all of those into a simple boot loader plus a protected mode kernel that outputs text.  In my case, I'm compiling the kernel on FreeBSD.

The bootloader is in boot.asm.  Like any i386 boot loader, it's loaded by the BIOS at address 0x7C00, in 16-bit real mode.

It starts by clearing the screen, outputting a message, and then loading kernel.bin from the disk.  kernel.bin is loaded at address 0x1000.  The bootloader then enters protected mode, sets up a GDT and passes control to kernel.bin at address 0x1000.  Looking at the documentation for BIOS interrupt 13h here, it's clear that a single sector has 512 bytes.  boot.bin is exactly 512 bytes long, so kernel.bin starts at the second sector on the disk.  The floppy image was created using this code from the Makefile:

cat boot.bin kernel.bin /dev/zero | dd bs=512 count=2880 of=floppy.img
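To make the resulting layout concrete, here is a rough Python equivalent of that pipeline. It is only a sketch: build_floppy_image is a hypothetical helper, not part of FemtoOS, and unlike dd (which would silently truncate) it raises an error if the inputs do not fit.

```python
SECTOR_SIZE = 512     # bytes per sector
SECTOR_COUNT = 2880   # sectors on a 1.44 MB floppy (the dd "count" above)


def build_floppy_image(boot: bytes, kernel: bytes) -> bytes:
    """Concatenate boot sector and kernel, then zero-pad to a full floppy image."""
    if len(boot) != SECTOR_SIZE:
        raise ValueError("boot sector must be exactly 512 bytes")
    image_size = SECTOR_SIZE * SECTOR_COUNT
    data = boot + kernel
    if len(data) > image_size:
        raise ValueError("boot sector plus kernel do not fit on the floppy")
    return data + bytes(image_size - len(data))


# Stand-in contents; real use would read boot.bin and kernel.bin from disk.
image = build_floppy_image(bytes(SECTOR_SIZE), b"KERNEL")
print(len(image))                         # 1474560
print(image[SECTOR_SIZE:SECTOR_SIZE + 6]) # b'KERNEL'
```

Since the boot sector fills the first 512 bytes exactly, the kernel always begins at byte offset 512 of the image.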

This code from boot.asm reads 18 sectors, starting at sector 2, into RAM at 0x1000.  The largest kernel.bin can be, therefore, is 512*18 = 9216 bytes (9KB).

 mov ax, 0                              
 mov es, ax                              
 mov bx, 0x1000          ; Destination address = 0000:1000
 mov ah, 02h             ; READ SECTOR-command
 mov al, 12h             ; Number of sectors to read (0x12 = 18 sectors)
 mov dl, [drive]         ; Load boot disk
 mov ch, 0               ; Cylinder = 0
 mov cl, 2               ; Starting Sector = 2
 mov dh, 0               ; Head = 0
 int 13h                 ; Call interrupt 13h
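As a sanity check on those register values, the cylinder/head/sector address can be converted to a byte offset in the floppy image. This Python sketch assumes the standard 1.44 MB geometry (2 heads, 18 sectors per track); chs_to_lba and chs_to_offset are illustrative helper names, not part of FemtoOS.

```python
SECTOR_SIZE = 512        # bytes per sector
HEADS = 2                # assumed: standard 1.44 MB floppy geometry
SECTORS_PER_TRACK = 18   # assumed: standard 1.44 MB floppy geometry


def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
    """Convert a CHS address to a zero-based linear sector number."""
    if not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("CHS sector numbers start at 1")
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


def chs_to_offset(cylinder: int, head: int, sector: int) -> int:
    """Byte offset of a CHS address within the disk image."""
    return chs_to_lba(cylinder, head, sector) * SECTOR_SIZE


print(chs_to_offset(0, 0, 1))  # 0    -> boot.bin
print(chs_to_offset(0, 0, 2))  # 512  -> first byte of kernel.bin
print(18 * SECTOR_SIZE)        # 9216 -> bytes read by the 18-sector request
```

CHS sector numbers are 1-based while cylinders and heads are 0-based, which is why CL = 2 with CH = 0 and DH = 0 lands on the sector immediately after the boot sector.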

kernel.bin is linked from the object files created from loader.asm, main.c and video.c.  When the bootloader passes control to kernel.bin, it starts at loader.asm which in turn passes control to main() from main.c.  Note that while boot.asm contains both 16-bit and 32-bit code, loader.asm contains only 32-bit code; it is called after boot.asm has put the host in 32-bit protected mode.

This code sets up the GDT

 xor ax, ax              ; Clear AX register
 mov ds, ax              ; Set DS-register to 0 - used by lgdt
 lgdt [gdt_desc]         ; Load the GDT descriptor

and this code puts the machine into protected mode, then passes control to loader.asm.  Note that immediately after putting the machine into protected mode, a jmp instruction is issued to the label "kernel_segments", which marks the first 32-bit instruction executed on boot.  From this point on, we are in 32-bit protected mode.

 mov eax, cr0            ; Copy the contents of CR0 into EAX
 or eax, 1               ; Set bit 0 (PE) to enable protected mode
 mov cr0, eax            ; Copy the contents of EAX into CR0       
 jmp 08h:kernel_segments ; Jump to code segment, offset kernel_segments
[BITS 32]                ; We now need 32-bit instructions
 mov ax, 10h             ; Save data segment identifier
 mov ds, ax              ; Move a valid data segment into the data segment register
 mov ss, ax              ; Move a valid data segment into the stack segment register
 mov esp, 090000h        ; Move the stack pointer to 090000h       
 jmp 08h:0x1000          ; Jump to section 08h (code), offset 01000h
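The values 08h and 10h are segment selectors, not addresses. On x86 a selector packs a descriptor table index into bits 3 and up, a table indicator into bit 2, and a requested privilege level into bits 0 and 1. The Python sketch below decodes them and shows the CR0 change; decode_selector and enable_protected_mode are illustrative names, and the GDT contents themselves are not shown in this post.

```python
CR0_PE = 0x1  # bit 0 of CR0: protection enable


def decode_selector(selector: int) -> dict:
    """Split an x86 segment selector into its three fields."""
    return {
        "index": selector >> 3,                         # entry number in the table
        "table": "LDT" if selector & 0b100 else "GDT",  # table indicator bit
        "rpl": selector & 0b11,                         # requested privilege level
    }


def enable_protected_mode(cr0: int) -> int:
    """Model of 'or eax, 1': set the PE bit and leave the other bits alone."""
    return cr0 | CR0_PE


print(decode_selector(0x08))             # {'index': 1, 'table': 'GDT', 'rpl': 0}
print(decode_selector(0x10))             # {'index': 2, 'table': 'GDT', 'rpl': 0}
print(hex(enable_protected_mode(0x10)))  # 0x11
```

So the far jump uses GDT entry 1 as the code segment and the data and stack registers are loaded with GDT entry 2, both at privilege level 0.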

The code in loader.asm that calls the C code main() is pretty simple:

  call main  ; Call our kernel's main() function

There is good documentation for the protected mode text console here.  The simple implementation of this is in video.c.

There is a floppy disk image here, which boots in both QEMU and VMware.