On the Fukutake Shoten "Study Box"

Last updated 2019-02-28

* Observations on the hardware

It's... a cassette tape drive for the Famicom. There are no external controls for the tape mechanism, so it must be under program control. An external power supply is required (presumably to supply enough current to run the tape motor).

The few board shots available show about seven ICs on the "SBX memory and I/O controller board", at least one of which is an ASIC. Two appear to be 32kx8 RAMs attached to the CPU bus, and two appear to be 8kx8 RAMs attached to the PPU bus.

Tapes appear to be dual-track ("stereo"), with one track (the left?) being audio for playback and the other (the right?) being data for the CPU. There is no erase head, so we may expect that this is a playback (read) only system.

There appears to be 3k of RAM from $4400-$4FFF, and there is some evidence that there is RAM for all four nametables (and possibly also separate RAM for PPU $3000 - $3EFF).

Inspection shows that there is some sort of switch to indicate whether the drive door is open or closed.

Chip identification suggests that there is a 4-bit MCU in DIP-16 that might be mediating the interface to the tape drive logic. There is also an MCU on the tape drive board itself.

There doesn't appear to be a "rewind to start" step in the overall process. Instead, once a target "page" is selected, the system tries to read from the tape in order to find the start of a page, /any/ page, and then determines what to do once it has managed to read a page header.

* Data track analysis

We're working with what appears to be "Enjoy English 01". The file is a little-endian RIFF WAVE, "Microsoft PCM", 16-bit, stereo, 44100 Hz. The left track is the audio track, the right track is the data track. There is some amount of bleed-through, the data sounding faintly on the audio side (and this is blatantly obvious under spectrographic analysis).
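As a starting point for this kind of analysis, the data track can be pulled out of such a file with nothing but the Python standard library. This is a minimal sketch, assuming a 16-bit stereo PCM WAV with the data on the right channel as described above; the function name =read_data_track= is made up for illustration.

#+BEGIN_SRC python
import array
import sys
import wave


def read_data_track(path):
    """Return (sample_rate, samples) for the right channel of a WAV file.

    Assumes 16-bit stereo PCM, with the data track on the right channel.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Frames are interleaved: left, right, left, right, ...
    return rate, samples[1::2]
#+END_SRC

The returned samples are still raw audio; recovering flux transitions and bits from them is a separate step.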
At about 30 seconds into the recording, the data track starts containing a wave of about 2390 Hz. At about 33.15 seconds, that starts getting broken up by some sort of encoded data. Preliminary surmise is that the lead-in tone is either to indicate that there is data following, or to provide a synchronization signal for a PLL, possibly both.

At 44100 Hz for audio, and 1789773 Hz for the CPU clock, we have 40.584 CPU cycles to the audio sample. For the lead-in tone, I get 23 cycles over 425 samples, or about 18.478 samples/cycle. Frequency analysis claims a 2405 Hz peak.

It turns out that the data is using a 4800 bit-per-second MFM encoding at the lowest level. The tape medium clearly cannot handle bit rates (magnetic phase reversals) much faster than this. MFM can encode bits at this rate while not transitioning more than once per bit (not exceeding the capacity of the tape) or less than once per two bit periods (specifically, while it can take a full two bit periods to transition at times, it's spread over three bits), meaning that the process of "clock recovery" is fairly simple.

The first implication of a 4800 bps data rate is Nyquist: any tape digitization must be sampled at /no less/ than 9600 Hz, and preferably more. I am not an expert, so I would prefer to err on the side of /considerably/ more. 44100 Hz is nice for reasons other than data recovery. Under the circumstances, using a lossy audio compression codec is likely to completely destroy the data track.

MFM (Modified Frequency Modulation) encoding is isochronous (each bit takes up the same amount of time), self-clocking (flux transitions occur at predictable locations and a certain minimum rate), and run-length limited (there will never be two consecutive half-bit periods with a flux transition, nor will there be more than four without). An encoded bitstream looks somewhat like the crude diagram below (best viewed in a fixed-width font).
A couple of features of note: pulse widths vary from two to four half-bit periods, pulses transition in the middle of a bit period for a 1 and at the edge of a bit period for a 0 /unless the bit adjoining that edge is a 1/, and a row of consecutive 0 bits looks identical to a row of consecutive 1 bits.

#+BEGIN_SRC fundamental
| 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
|                               |
|-+     +---+     +-------+   +-|
| |     |   |     |       |   | |
| +-----+   +-----+       +---+ |
#+END_SRC

The lead-in region for each page (a long string of consecutive 0 bits) terminates in a single 1 bit, after which we enter the data region. The data region consists of repeated sequences of a 0 bit followed by 8 data bits, MSB-first, to be read out from $4200 (all 8 bits in parallel). Because every ninth bit is guaranteed to be a zero, encountering a run of nine or more consecutive 1 bits means that we are reading the lead-in region and are desynchronized by half a bit.

More timing: 4800 bits/second divided by 9 bits/byte is 533 and 1/3 bytes/second, or one byte every 3355.8 CPU cycles, give or take. Using a per-bit model, that's one bit per 372.87 CPU cycles. Tape speed variation may cause actual timing to vary somewhat.

* Memory map

| $4200 - $4203 | Registers        |
| $4400 - $4FFF | RAM              |
| $8000 - $BFFF | Switched ROM     |
| $C000 - $FFFF | Fixed ROM bank 0 |

* Registers

| $4200 | R | Data byte read from tape |
| $4200 | W | ??? |
| $4201 | R | Tape read status? |
|       |   | $80 - something to do with $4202.0? decoder disabled? |
|       |   | decoder data ready? |
|       |   | $40 - tape data clock synched? current tape data bit? |
|       |   | current tape flux polarity? tape motor running? |
|       |   | seek complete? |
|       |   | $20 - set when in data region during seek? possibly |
|       |   | set when in data region generally? or set normally |
|       |   | and cleared when in a data region? |
|       |   | $10 - ??? |
| $4201 | W | Bits 0-3 select 16k ROM bank at $8000 - $BFFF |
| $4202 | R | Tape drive status? |
|       |   | $40 - shift register ready for next bit? |
|       |   | $08 - power supply not connected |
| $4202 | W | Tape drive control? |
|       |   | $80 - output data bit |
|       |   | $40 - ??? |
|       |   | $20 - pulse low to reset drive controller? |
|       |   | $10 - pulse low to clock data bit |
|       |   | $08 - ??? |
|       |   | $04 - ??? maybe tape audio enable? |
|       |   | $02 - irq enable? |
|       |   | $01 - data decoding enable? |

$4203 appears to be unused.

* The tape drive control

The tape drive appears to be managed by way of a small microcontroller, with commands sent by a synchronous serial scheme. Commands are 8 bits wide, clocked out MSB-first on $4202.7, with $4202.4 as the clock line (either active-low or edge-triggered). The system sets $4202.6 when the next bit may be written.

Routine at $FDA2 appears to be (slowly!) clocking bits out on $4202.7 with $4202.4 as the transmit clock bit and $4202.6 (read) as the "ready" bit.

** Observed commands

- $00: "Seek backwards to start of current page"?
- $01-$07: "Skip forward /n/ pages"?
- $41: "Skip backwards /n/-$40 pages"?
- $FF: ??? (skip to end of data block, maybe?)

$FD70 - $FD99 contain a series of six-byte routines to send commands $82 through $87.

* Overall operation flow of reading from tape

1. A command is sent to seek to the start of a page.
2. The system waits for $4201.6 to rise, indicating that a page has been found.
3. Once $4201.6 is observed to have been set, $4202.1 is set to enable IRQs.
4. When a 1 bit is read from the tape bitstream, an IRQ is delivered.
5. The IRQ handler sets $4202.0, to capture the next 8 decoded bits.
6. Eight bits later, another IRQ is delivered.
7. The IRQ handler reads the data byte from $4200, clears $4202.0 to disable the decoder, waits for $4201.7 to fall (indicating that the decoder logic is now ready to be re-enabled), and sets $4202.0 again to read the next byte.

Presumably, the IRQ handler must disable and re-enable the decoder within the space of the single 0 bit encoded between bytes on the tape.
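The byte framing that this flow relies on (a lead-in of 0 bits, a single 1 sync bit, then a 0 bit ahead of each MSB-first data byte) can be modelled in software. The following Python is a minimal sketch, assuming the bits have already been recovered from the MFM signal; =decode_page_bits= is a hypothetical name, and stopping at the first framing error is an assumption rather than observed hardware behavior.

#+BEGIN_SRC python
def decode_page_bits(bits):
    """Turn a page's decoded bit sequence into bytes.

    bits: iterable of 0/1 values, starting somewhere in the lead-in.
    """
    bits = list(bits)

    # Skip the lead-in zeros, up to and including the single 1 sync bit.
    try:
        pos = bits.index(1) + 1
    except ValueError:
        return b""  # never left the lead-in

    out = bytearray()
    while pos + 9 <= len(bits):
        if bits[pos] != 0:
            break  # the separator must be 0 (assumption: stop on error)
        value = 0
        for bit in bits[pos + 1:pos + 9]:
            value = (value << 1) | bit  # MSB arrives first
        out.append(value)
        pos += 9
    return bytes(out)


# Sixteen lead-in zeros, the sync bit, then one framed byte ($C5).
example = [0] * 16 + [1] + [0, 1, 1, 0, 0, 0, 1, 0, 1]
print(decode_page_bits(example).hex())  # prints "c5"
#+END_SRC

A model like this is mainly useful for checking a software MFM decoder against a tape image, since the real hardware hands over each byte through $4200.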
* Firmware reference

Some parts of the firmware appear to read from what are currently believed to be ROM locations and perform device I/O based on their contents. The two cases found thus far (routines at $FE06 and $FE93) have been to /not/ perform the I/O, but there may be something interesting going on here.

The seek loop at $EA3B - $EA7E appears to be concerned with $07FC, $C5, $C7, $E2, $07F5, $ED, $94, $EE, and $EF.

The IRQ handler appears to be configured by $07ED, $D0, $D1, $D2, and $D3.

** IRQ handler behavior?

The first time into the IRQ handler it appears that $07F7 is zeroed, so flow runs to $EE85, which sets $07F7 to three and clears $07F6, $07EC, and $CF. Control then passes to $EECF.

$EECF raises $4202.0 (enable decoder, maybe?), and then flow decisions are made based on $07E9 and $07F7. If $07E9 is zero (or $07F7 is zero, but we know that it is not), set the D flag in the caller (?) and lower $4202.0 and $4202.1. We don't see this lowering, so $07E9 is not zero. Control therefore passes to $EF13.

$EF13 does some messing with $E4 based on $CF (???), then conditionally arranges a return to $F2EC (which doesn't happen). Control passes to $EF2D.

$EF2D calls $F2AD, which... does some sort of counter-based thing. After that are some tests for skipping past some logic that disables IRQ generation and hacks up the return address. Clearly not triggered, so control leaves the IRQ handler normally.

Essentially, the only thing that we can see being done on the first pass is setting up $07F7, adjusting some counterish state, and enabling further IRQs on some basis.

Second entry, $07F7 is known to be three. Control passes to $EE96, which reads $4200 and stashes it at $D5, then lowers $4202.0 (the decoder enable?). $07EC was cleared last time, so we branch to $EEAC. [WRONG! We go to $EF6C.]

$EF6C calls $EF85, then jumps to $EEBF. Overall, more wastage in terms of code design. What does $EF85 do?

At $EEAC, we check to see if $D5 is equal to #$C5. If it is, we go to $EF72. If it isn't, we check it for zero. If it's not zero, we set $CF.3 and control resumes at $EEBF.

If we got to $EF72, various state bits are changed: $07F6 is set to three, $07F7 is set to one, and $EA and $07EC are zeroed. Control then moves to $EECA.

At $EEBF, we check $CF & #$1F. If it's nonzero, we blank $07F7. Control proceeds to $EECA.

At $EECA, we busy-wait for $4201.7 to fall. Control proceeds to $EECF, at which point we're in the same flow-path as before.

Observationally, if $4200 returns other than $C5 on its first read, the system emits drive command $FF. If it then returns $C5 (or $C4 or $00) on the second read, the system emits drive command $FF anyway. Is there some flow path that causes the system to /not/ emit drive command $FF at this point?

If $4200 returns $C5 on the first read, then $01 thereafter, we get eight reads total (the $C5 and seven $01s) before a new command is placed, and that new command depends on the page number input on the title screen, implying that it is a "seek" command or a "load this page" command. Specifically, this sequence of values tells the system that the tape is positioned at page 02.

If $4200 returns the actual data on the tape, no new commands appear to be issued.

** Outputs to the drive control register

Init at $E80A sets $4202 to $FC.
IRQ handler at $EE9B lowers $4202.0.
IRQ handler at $EECF raises $4202.0.
IRQ handler at $EEF3 masks $4202 to $FC (lowers .0 and .1?).
Unknown at $F2EF (entry point $F2EC) sets $4202 to $FC.
Unknown at $F8BD masks $4202 to $FC, while IRQs are masked.
Routine at $F8EB lowers $4202.2.
Routine at $F8F5 raises $4202.2.
Unknown at $F9A1 raises $4202.1.
Routine prologue at $FDB3 does something complex to $4202 (tries to lower .7, raises .4, and raises an arbitrary group of other bits passed in X, which turns out to only ever be $80 or $00).
At $FDCD, lowers $4202.4.
At $FDD7, raises $4202.4.
Routine at $FDF0 lowers $4202.5, waits about 1280 cycles (eleven scanlines?), then raises it again.

** Inputs from the drive control register

At $E97C, if $4202.6 is not set, the system hangs at the main screen without showing the "page" selector. This apparently happens with real hardware when there is no tape inserted.

At $E9C4, if $4202.6 is not set, control returns to $E97C -- some sort of clocked data transfer or other init sequence?

** Bank 00 locations

| $E700 | Some sort of jump table |
| $EBFC | Wait for an NMI (enabling them, first) |
| $EE7B | IRQ handler entry (main tape data handling logic) |
| $F025 | ??? Something to do with data reception? |
| $F16C | (internal block) set up data receive buffer ($E8/$E9), among other things? |
| $F1AF | (internal block?) copy received byte to data receive buffer? |
| $F5BC | Enable VBlank NMI |
| $F5C9 | Disable VBlank NMI |
| $F8F5 | Raise $4202.3 (why?) |
| $F8FF | Per-NMI tape state machine? |
| $FDE4 | A delay loop |
| $FDE9 | A delay loop |
| $FDF0 | Reset tape drive controller? |
| $FE06 | Main RESET entry, appears to just bounce to $E700, but might also set up some hardware |
| $FE40 | Some sort of data block, runs through $FE7F |
| $FFFA | NMI vector, value is $07FD |
| $FFFC | RESET vector, value is $FE06 |
| $FFFE | IRQ/BRK vector, value is $EE7B |

** ZPage locations

| $C0       | Waiting-for-VBlank flag (cleared every NMI) |
| $C5       | ??? |
| $B0 - $B2 | A 10ths-of-a-second counter? (see $EE18, NMI called) |
| $B6       | ??? used for an interaction loop exit? |
| $D5       | Value read from $4200 during IRQ |
| $E0       | Value stored to $4202 (tape drive control?) |
| $E4       | ??? |
| $E5       | ??? (see $FD9A) |
| $E8 - $E9 | A buffer pointer of some sort |
| $ED       | current tape page? |
| $F2       | ??? |
| $F3 - $F6 | controller data? related to $F7 - $FA |
| $F7 - $FA | controller data? related to $F3 - $F6 |
| $FF       | Value stored as PPU control |

** System RAM locations

| $0000 - $00FF | ZPage (see ZPage locations section) |
| $0140 - ???   | Unknown data buffer |
| ??? - $01FF   | CPU stack |
| $0200 - $02FF | OAM (sprite) DMA source buffer |
| $07F3         | ??? |
| $07F5         | tape seek direction? |
| $07FC         | ??? some sort of flag |
| $07FD         | NMI direction stub, typically a JMP instruction |

** 3k RAM locations

| $440A | ??? |
| $440C | ??? |
| $440E | ??? |

* Existing emulations

There are two known existing emulations, in FCEUX and Mesen. Neither works.

** FCEUX

https://github.com/TASVideos/fceux/blob/master/src/boards/186.cpp

FCEUX claims two 32k RAM chips on the device.

BRAM is mapped as 4K banks at $4000 - $5FFF, with $4000 - $4FFF being fixed to bank 0 and $5000 - $5FFF being selected by the low three bits of writes to $4200, and $4000 - $43FF not actually being accessible as RAM (due to being shadowed by the CPU and StudyBox register spaces). FCEUX does not implement the $5000 - $5FFF region.

PRAM is mapped as an 8K bank at $6000 - $7FFF, selected by the high two bits of writes to $4200.

FCEUX implements PROM and PRAM paging, and the low BRAM page. It returns fixed values for $4200 - $4203, and $FF for $4204 - $43FF.

This implementation (with the fixed value for reading $4202) will be unable to clock out a full tape drive command byte, and will likely "hang" at the initial loading screen.

** Mesen

https://github.com/SourMesen/Mesen/blob/master/Core/StudyBox.h

Mesen does not claim anything about the hardware. It implements the 32k paged region at $6000 - $7FFF, the fixed RAM at $4400 - $4FFF, the PRG ROM region at $8000 - $FFFF, and fixed values for all of the registers... except for $4202.

For $4202, it implements a concept of "tape ready", which controls $4202.6. Specifically, about 100 cycles after a write to $4202, it sets "tape ready" based on the written $4202.4.

This implementation (with $4202.6 toggling based on the value written to $4202.4) will likely be able to clock out a full drive command byte, but will likely "hang" at the initial loading screen (possibly playing the "loading" tune).

* EOF