This document gives some pointers on what's needed to port super-reu to a new FPGA platform.
The design has been created for a system clock speed of 80-100 MHz.
It can probably still work at much lower speeds, but the timing in bus_manager (and phi_recovery, if used) would need to be adjusted accordingly.
When implemented on a Cyclone 10, around 3000 logic elements are used (see the Quartus Fitter Usage Summary report for details) for the design with 4 DMA channels. No LUTs with more than 4 inputs are used in this implementation.
The only BRAM used by the design is for the ROML/ROMH emulation, which uses 128 Kbits of BRAM. ROMH is not actually used by the sample design, so this can be halved if needed. ROML can be removed as well, booting software from floppy disk or datassette instead, which reduces BRAM use to 0.
Conversely, if lots of BRAM is available, it can be used for the REU memory, removing the need for external RAM.
The following expansion port pins must be available to the FPGA:
IRQ - For the DMA channels to be able to raise interrupts, the FPGA must be able to pull the IRQ pin low.
R/W - The FPGA must be able to both sample and drive this pin high/low.
EXROM - For the boot ROM to work, the FPGA needs to be able to pull this pin low.
GAME - Like EXROM, but needed only for ROMH, which is not used by the sample boot ROM.
IO1/IO2 - The FPGA must be able to tell if either of these is pulled low, but does not need to know which one. The signals can be combined (with an AND gate) either internally or externally.
ROML/ROMH - Like IO1/IO2, the FPGA must be able to tell if either is pulled low, but does not need to differentiate between them. The default ROM does not rely on the ROMH pin being connected, and if running without ROM there is no need to connect ROML either. If ROMH is unconnected, then GAME must not be driven low. If ROML is unconnected, then EXROM must not be driven low.
BA - The FPGA must be able to sample the level of this pin, to detect VIC use of the bus.
DMA - The FPGA must be able to pull this pin low (surprise!)
D0-D7 - The FPGA must be able to both sample and drive these pins high/low. A single control signal to control the direction of all 8 pins is sufficient.
A0-A15 - The FPGA must be able to both sample and drive these pins high/low. A single control signal to control the direction of all 16 pins is sufficient.
RESET - The FPGA must be able to sample this pin, but if ROM code is loaded from an external source it should also be able to pull RESET low until the ROM load completes.
PHI2 - The FPGA must be able to sample this pin to align its bus cycle timing with that of the C64.
There does not need to be any connection to the DOTClk or NMI pins.
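Since the IO1/IO2 and ROML/ROMH pairs are active low, ANDing the two pin levels gives a signal that goes low whenever either pin is low; inverting it yields the active-high ioef and romlh inputs that bus_manager expects. The truth table can be sketched in Python as follows (the function name combine_active_low is hypothetical and only used for illustration):

```python
def combine_active_low(pin_a: int, pin_b: int) -> int:
    """Return 1 when at least one of two active-low pins is pulled low."""
    combined_n = pin_a & pin_b  # AND gate: output is low if either input is low
    return 1 - combined_n       # invert to get an active-high "selected" flag


# Truth table for IO1/IO2 (the same logic applies to ROML/ROMH).
for io1 in (0, 1):
    for io2 in (0, 1):
        print(f"IO1={io1} IO2={io2} -> ioef={combine_active_low(io1, io2)}")
```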
All inputs and outputs on the expansion port use TTL signal levels. Level shifters should be added if required by the FPGA chip.
128 Kbytes to 16 Mbytes of RAM is needed to back the external memory of the DMA channels. In the Chameleon implementation this is implemented with an external SDRAM, and in the Orange Cartridge implementation by HyperRAM, but internal memory resources could also be used. At least two memory ports (one read/write, one write only) are needed if the MMC64 function is desired, but accesses are allowed to take multiple cycles, and can thus be serialized if only one hardware port is available (as is the case with the SDRAM and HyperRAM).
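The serialization of two logical RAM ports onto one hardware port can be illustrated with a small behavioural model. This is only a sketch: it assumes the toggle handshake described for ram_req/ram_ack later in this document, and it picks a fixed priority between the clients, which is an assumption rather than something the design prescribes. The names RamClient and serve_one are hypothetical.

```python
class RamClient:
    """A RAM port user; an operation is pending while req != ack."""

    def __init__(self, name):
        self.name = name
        self.req = 0
        self.ack = 0
        self.we = 0
        self.a = 0
        self.d = 0
        self.q = 0

    def request(self, we, a, d=0):
        assert self.req == self.ack, "previous operation not yet acknowledged"
        self.we, self.a, self.d = we, a, d
        self.req ^= 1  # toggling req starts a new operation


def serve_one(clients, ram):
    """Serve the first pending client (fixed priority) and return its name."""
    for client in clients:
        if client.req != client.ack:
            if client.we:
                ram[client.a] = client.d
            else:
                client.q = ram.get(client.a, 0)
            client.ack = client.req  # ack follows req once the access is done
            return client.name
    return None


ram = {}
dma = RamClient("dma")      # read/write port
mmc64 = RamClient("mmc64")  # write-only port
mmc64.request(we=1, a=0x100, d=0x55)
dma.request(we=0, a=0x100)
order = [serve_one([mmc64, dma], ram) for _ in range(3)]
print(order, hex(dma.q))
```

Each call to serve_one stands for one complete access on the single hardware port, however many cycles that takes on the real memory.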
If the MMC64 function is desired, an SDcard reader must be connected to the FPGA. Only SPI mode is used, so the mandatory signals are CS, CLK, DI, DO, and CD. WP is also expected, but only reported to software and not needed by the sample application.
The Chameleon and Orange Cartridge implementations load the contents of the ROM from SPI flash on startup. This is done to allow for convenient update of the ROM contents, but is not necessary at all. Even if using the ROM functionality, the contents could be put into the FPGA bitstream using e.g. $readmemh.
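For the $readmemh route, the ROM image first has to be converted to a text file of hex values. A minimal Python sketch of that conversion is shown here; the function name to_readmemh, the fill value 0xFF and the one-byte-per-line layout are choices made for this example, not something mandated by the design.

```python
def to_readmemh(rom: bytes, addr_bits: int = 13, fill: int = 0xFF) -> str:
    """Format a ROM image as one two-digit hex byte per line.

    The image is padded with `fill` up to 2**addr_bits bytes
    (13 bits = 8K, 14 bits = 16K).
    """
    size = 1 << addr_bits
    if len(rom) > size:
        raise ValueError(f"ROM image is {len(rom)} bytes, limit is {size}")
    padded = rom + bytes([fill]) * (size - len(rom))
    return "\n".join(f"{b:02x}" for b in padded) + "\n"


# Tiny demonstration with a 4-byte "ROM" so the output stays readable.
print(to_readmemh(bytes([0x01, 0xAB]), addr_bits=2), end="")
```

The resulting text would be written to a file and named in the $readmemh call of the memory's initial block.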
The Chameleon implementation uses two LEDs to indicate init completion and SDcard activity, and two buttons to reset the C64 and the FPGA respectively. The Orange Cartridge implementation uses the red and green channels of the RGB LED to indicate the same information as the Chameleon LEDs, and the single button to reset the C64. All this is completely optional.
This section describes the RTL modules making up the super-reu core, and how to interface them with a top level design.
The bus_manager module handles the interface to the C64 expansion port, and deals with the bus cycle timing.
clk - The system clock (100 MHz)
reset - Synchronous reset, when 1 the bus will be tristated and DMA deasserted
phi - This should be a recreated PHI signal which is in phase with the external PHI2 signal but synced with the clk clock for setup and hold purposes
ds_dir - If a bidirectional level shifter is used for the data bus, this output indicates the currently desired direction, 0 = C64 drives into FPGA, 1 = FPGA drives into expansion port.
ds_en_n - If a bidirectional level shifter is used for the data bus, this output indicates whether or not it should be active. 0 = level shifter drives in the direction indicated by ds_dir, 1 = both sides tri-stated
d_d - Data bus input, connect to expansion port data bus
d_q - Data bus output, drive data pins to this value when d_oe is 1
d_oe - Data bus output enable. 0 = tristate data pins, 1 = drive data pins
as_dir - If a bidirectional level shifter is used for the address bus, this output indicates the currently desired direction, 0 = C64 drives into FPGA, 1 = FPGA drives into expansion port.
as_en_n - If a bidirectional level shifter is used for the address bus, this output indicates whether or not it should be active. 0 = level shifter drives in the direction indicated by as_dir, 1 = both sides tri-stated
a_d - Address bus input, connect to expansion port address bus
a_q - Address bus output, drive address pins to this value when a_oe is 1
a_oe - Address bus output enable. 0 = tristate address pins, 1 = drive address pins
ba - Connect this input to BA on the expansion port
ioef - This input should be 1 whenever either IO1 or IO2 is low
romlh - This input should be 1 whenever either ROML or ROMH is low. If ROM is not used (neither EXROM nor GAME is driven low), this input can be set to constant 0.
rw_in - Read/write control input, connect to R/W on the expansion port
rw_out - Read/write control output; when this output is 1 the R/W pin on the expansion port should be driven low. When it is 0 the output should be tri-stated / open.
dma - DMA output; when this output is 1 the DMA pin on the expansion port should be driven low. When it is 0 the output should be tri-stated / open.
romlhdata - Data for ROML/ROMH. Should be valid starting one system clock cycle after romlh_r_strobe is asserted.
romlh_r_strobe - This output is set to 1 to trigger a read from ROM into romlhdata. The ROM address should be taken from the 14 least significant bits of the expansion port address bus.
ioefdata - Data for IO1/IO2. Should be valid starting one system clock cycle after ioef_r_strobe is asserted.
ioef_r_strobe - This output is set to 1 to trigger a read from IO registers into ioefdata. The register address should be taken from the 9 least significant bits of the expansion port address bus.
ioef_w_strobe - This output is set to 1 to trigger a write to IO registers. The register address should be taken from the 9 least significant bits of the expansion port address bus, and the data to write from the expansion port data bus.
ff00_w_strobe - This output is set to 1 for one system clock cycle when a write to address FF00 is detected.
clockport_enable - Set this input to 1 to redirect the I/O range from the address given by parameter CLOCKPORT_START (inclusive, default value DE02) to the address given by parameter CLOCKPORT_END (exclusive, default value DE10) to the clockport instead of the ioef_r_strobe and ioef_w_strobe signals. If clockport functionality is not needed, it can be disabled by tying this signal constant low.
clockport_read - This output is set to 1 when the clock port IORD signal should be asserted (low), together with the desired chip select pin. If clockport functionality is not needed, this signal can be ignored.
clockport_write - This output is set to 1 when the clock port IOWR signal should be asserted (low), together with the desired chip select pin. If clockport functionality is not needed, this signal can be ignored.
dma_a - DMA request address input
dma_d - DMA request data input (for write cycles)
dma_q - DMA request data output (for read cycles), valid once the DMA operation is acked.
dma_rw - Read/write mode for DMA request (0 = read, 1 = write)
dma_req - Input which should be toggled to request a new DMA transfer
dma_ack - Output which will take the value of dma_req once the DMA transfer has been completed.
dma_alloc - When this input is 1, the bus manager will assert DMA at the next opportunity (if it is not already asserted), and keep it asserted until the input returns to 0, even if there are no DMA transfers requested using dma_req. This can be used to lock the bus for multiple transfers.
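The dma_req/dma_ack pair is a toggle handshake: a transfer is pending whenever the two signals differ, and it is done when dma_ack has caught up with dma_req. The following Python sketch models the requester side against a stand-in for bus_manager. The class BusManagerStub, the function dma_read and the fixed three-tick latency are all invented for illustration; the real latency depends on the C64 bus cycle, and dma_alloc is not modelled.

```python
class BusManagerStub:
    """Stand-in for bus_manager: acks a request a fixed number of ticks later."""

    def __init__(self, memory, latency=3):
        self.memory = memory
        self.latency = latency
        self.dma_ack = 0
        self.dma_q = 0
        self._countdown = None

    def tick(self, dma_req, dma_rw, dma_a, dma_d):
        if dma_req == self.dma_ack:
            return  # nothing pending
        if self._countdown is None:
            self._countdown = self.latency
        self._countdown -= 1
        if self._countdown == 0:
            if dma_rw:                       # 1 = write
                self.memory[dma_a] = dma_d
            else:                            # 0 = read
                self.dma_q = self.memory.get(dma_a, 0)
            self.dma_ack = dma_req           # ack takes the value of req
            self._countdown = None


def dma_read(bus, addr, req):
    """Toggle req, wait for ack to match, return (data, new req, ticks)."""
    req ^= 1
    ticks = 0
    while bus.dma_ack != req:
        bus.tick(req, 0, addr, 0)
        ticks += 1
    return bus.dma_q, req, ticks


bus = BusManagerStub({0xD020: 0x0E})
value, req, ticks = dma_read(bus, 0xD020, req=0)
print(hex(value), req, ticks)
```

Note that the requester never returns dma_req to a rest level; the next transfer simply toggles it again.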
The following schematic illustrates how a single bit of the data bus should be hooked up when using external level shifters (address bus is analogous):
For open-collector signals such as R/W, a simpler level shifting setup can be employed, as illustrated in the following schematic:
The address_decoder module handles accesses to the DE00-DFFF area by dividing the address space into apertures handled by separate modules. The instantiation parameter a_bits indicates how many bits of the addresses need to be tested. This should be set to 9 since the upper 7 bits have already been matched by the PLA in the C64. The instantiation parameter devices specifies the number of apertures, and thus the number of modules they are connected to. Finally, the instantiation parameters base_addresses and aperture_widths give the base address (always a full 16-bit address, but only the lower a_bits bits will actually be compared) and size (as a power of two) for each aperture.
a - Input of the a_bits lowest bits of the expansion port address
read_strobe - Connect to ioef_r_strobe on bus_manager
write_strobe - Connect to ioef_w_strobe on bus_manager
read_data - Connect to ioefdata on bus_manager
read_strobes - Output of individual read strobes, one per device (devices bits in total)
write_strobes - Output of individual write strobes, one per device (devices bits in total)
read_datas - Input of groups of 8 bits, one group per device, carrying the responses to read strobes
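The aperture matching can be sketched as a behavioural model in Python. It assumes that each entry of aperture_widths is the base-2 logarithm of the aperture size, so that only the address bits above the aperture width (and below a_bits) are compared; that reading of "size (as a power of two)" is an assumption, and the base addresses and widths in the example are made up.

```python
def decode(addr, base_addresses, aperture_widths, a_bits=9):
    """Return one hit flag per device for the given expansion port address."""
    low_mask = (1 << a_bits) - 1
    hits = []
    for base, width in zip(base_addresses, aperture_widths):
        # Compare only bits [a_bits-1 : width]; lower bits select a register.
        compare_mask = low_mask & ~((1 << width) - 1)
        hits.append((addr & compare_mask) == (base & compare_mask))
    return hits


# Hypothetical layout: 64 bytes at DF00 and 16 bytes at DE00.
bases = [0xDF00, 0xDE00]
widths = [6, 4]
for addr in (0xDF10, 0xDE05, 0xDE40):
    print(hex(addr), decode(addr, bases, widths))
```

With this layout a 64-byte aperture is exactly the 16*channels minimum for a four-channel DMA register block.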
The actual DMA channels. The instantiation parameter channels selects the number of channels (1-16). The register aperture must be at least 16*channels bytes large. The instantiation parameter ram_a_bits selects the size of the expansion RAM addresses (17-24 bits).
clk - The system clock (100 MHz)
reset - Synchronous reset, when 1 all registers will be reset
irq - When this output is 1, the expansion port IRQ pin should be pulled low. When it is 0, the pin should be tri-stated / open.
a - Address bus input, connect to expansion port address bus
d_d - Data bus input, connect to expansion port data bus
d_q - Data bus output, connect to address_decoder
read_strobe - Connect to address_decoder
write_strobe - Connect to address_decoder
ff00_strobe - Connect to ff00_w_strobe on bus_manager
dma_a - Connect to bus_manager
dma_d - Connect to bus_manager
dma_q - Connect to bus_manager
dma_rw - Connect to bus_manager
dma_req - Connect to bus_manager
dma_ack - Connect to bus_manager
dma_alloc - Connect to bus_manager
ram_a - Address output for RAM port
ram_d - Data output for RAM port writes
ram_q - Data input from RAM port reads
ram_we - Read/write mode for RAM port (0=read, 1=write)
ram_req - Toggled when the module wants to perform a RAM operation
ram_ack - Input which should take the value of ram_req once the RAM operation is complete
phi2tick - This input should be 1 for one system clock cycle every period of the PHI2 clock. It is used for the pacing functionality.
The suggested timing for phi2tick is to assert it towards the end of the low part of PHI2, since the dma_req signal is sampled during the first half of the low part of PHI2 (approx 200 ns from the falling edge), which gives the DMA engine ample time to prepare the DMA request before the next checkpoint.
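To get a feel for the numbers, the timing can be converted into system clock cycles. The Python sketch below assumes a PAL machine (PHI2 at about 985 kHz), a 100 MHz system clock and a 50% PHI2 duty cycle; the helper name to_clocks is made up for this example.

```python
def to_clocks(duration_ns: float, clk_hz: float) -> int:
    """Convert a duration in nanoseconds to whole system clock cycles."""
    return round(duration_ns * clk_hz / 1e9)


CLK_HZ = 100_000_000   # system clock assumed for this example
PHI2_HZ = 985_248      # PAL C64 PHI2 frequency (NTSC is about 1.023 MHz)

period_ns = 1e9 / PHI2_HZ
half_ns = period_ns / 2   # assumes a 50% duty cycle
sample_ns = 200           # dma_req sample point after the falling edge

# Time from a phi2tick at the end of the low phase to the next sample point:
# the whole high phase plus the 200 ns offset into the following low phase.
budget_ns = half_ns + sample_ns

print("clocks per PHI2 period:", to_clocks(period_ns, CLK_HZ))
print("clocks until dma_req is sampled:", to_clocks(budget_ns, CLK_HZ))
```

At lower system clock speeds the same durations map to proportionally fewer cycles, which is why the timing would need adjusting when porting to a slower clock.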
The mmc64 module relies on an external implementation of SPI (this is because the SPI implementation is shared with the ROM loading function in the Chameleon implementation). Thus it simply signals that it wants to start a transfer and the 8 bits that it wants to send, and expects the corresponding input bits to be available when the transfer request is acknowledged. The exact SPI clocks provided for the two speeds are not important, but the fast speed should be at least 6 MHz to be able to sustain streaming movie playback, and the slow speed must not exceed 400 kHz.
The instantiation parameter ram_a_bits selects the size of the expansion RAM addresses (17-24 bits).
clk - System clock
reset - Synchronous reset, when 1 all registers will be reset
a - Address bus input, connect to expansion port address bus
d_d - Data bus input, connect to expansion port data bus
d_q - Data bus output, connect to address_decoder
read_strobe - Connect to address_decoder
write_strobe - Connect to address_decoder
spi_q - The DI bits collected during the latest SPI transfer
spi_d - The DO bits to send in the next transfer
spi_req - Toggled when the module wants to perform a new transfer
spi_speed - Requested SPI clock of the transfer, 0=250 kHz, 1=8 MHz
spi_ack - Input which should take the value of spi_req once the transfer is complete
wp - Input from the Write Protect tab detector on the SDcard reader
cd - Input from the Card Detect sensor on the SDcard reader
spi_cs - Card Select signal to the SDcard
exrom, game - Inputs for the current state of the corresponding expansion port pins. Only used for the status register.
disable_exrom - Output that indicates the state of bit 5 in the Control register. If using ROM, then the EXROM pin should be driven low when this value is 0.
ram_a - Address output for RAM port
ram_d - Data output for RAM port writes
ram_q - Data input from RAM port reads (not used)
ram_we - Read/write mode for RAM port (always 1=write)
ram_req - Toggled when the module wants to perform a RAM operation
ram_ack - Input which should take the value of ram_req once the RAM operation is complete
A dummy register file example for adding new functionality.
clk - System clock
reset - Synchronous reset, when 1 all registers will be reset
a - Address bus input, connect to expansion port address bus
d_d - Data bus input, connect to expansion port data bus
d_q - Data bus output, connect to address_decoder
read_strobe - Connect to address_decoder
write_strobe - Connect to address_decoder
clockport_enable - Set to 1 if the first write to $DE01 has the LSB set (RR compatibility)
soft_reset - This output is asserted for 255 system clock cycles after the magic value $52 is written to $DE00. This can be used to implement a software controlled reset signal.
A simple ROM emulator based on BRAM. It has one read port and one write port (where the latter is used for initialization).
The instantiation parameter ram_a_bits selects the size of the ROM addresses. Use 14 for 16K ROM (ROML+ROMH), 13 for 8K ROM (ROML only).
clk - System clock
read_addr - Read port address input
write_addr - Write port address input
read_data - Read port data output
write_data - Write port data input
read_strobe - Read port strobe, data will be available the following cycle
write_strobe - Write port strobe, data should be made available the same cycle
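The port timing can be illustrated with a small cycle-based model in Python. It is a sketch only: the class name RomModel is invented, each call to tick stands for one rising clk edge, and the behaviour when the same address is read and written in one cycle (old data is returned) is an assumption.

```python
class RomModel:
    """Cycle-based model of a BRAM with one read port and one write port."""

    def __init__(self, addr_bits=13):
        self.mask = (1 << addr_bits) - 1
        self.mem = bytearray(1 << addr_bits)
        self.read_data = 0  # registered output, updated on the clock edge

    def tick(self, read_strobe=0, read_addr=0,
             write_strobe=0, write_addr=0, write_data=0):
        """Advance one clk edge; read_data is valid from the next cycle on."""
        if read_strobe:
            self.read_data = self.mem[read_addr & self.mask]
        if write_strobe:
            self.mem[write_addr & self.mask] = write_data & 0xFF


rom = RomModel(addr_bits=13)                          # 8K, ROML only
rom.tick(write_strobe=1, write_addr=5, write_data=0x42)
rom.tick(read_strobe=1, read_addr=5)
print(hex(rom.read_data), len(rom.mem))
```

With addr_bits set to 14 the model holds 16384 bytes, which corresponds to the 128 Kbits of BRAM quoted for ROML plus ROMH.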
A simplified version of chameleon_phi_clock by Peter Wendrich. It's a PLL with an NCO that generates a local version of the PHI2 clock offset by a configurable number of system clocks.
The instantiation parameter phase_shift selects the number of system clocks that the locally generated PHI2 clock precedes the C64 PHI2 clock.
clk - System clock
phi2_in - The PHI2 signal from the C64 expansion port
phi2_out - The locally regenerated PHI2 clock
phi2_out_lock - This output is set to 1 when the locally generated PHI2 clock is phase locked to the C64 PHI2 clock.
full_m2 - This output is set to 1 during the 2nd to last cycle before the falling edge of phi2_out
full_m1 - This output is set to 1 during the last cycle before the falling edge of phi2_out
full_p0 - This output is set to 1 during the first cycle after the falling edge of phi2_out
full_p1 - This output is set to 1 during the 2nd cycle after the falling edge of phi2_out
half_m2 - This output is set to 1 during the 2nd to last cycle before the rising edge of phi2_out
half_m1 - This output is set to 1 during the last cycle before the rising edge of phi2_out
half_p0 - This output is set to 1 during the first cycle after the rising edge of phi2_out
half_p1 - This output is set to 1 during the 2nd cycle after the rising edge of phi2_out
A cold/warm reset generator. The instantiation parameter reset_cycles selects the number of system cycles after which the reset signal is deasserted again.
clk - System clock
ext_reset_n - External reset input. Connect this to the reset signal from the C64 expansion port. In order to prevent feedback loops, this input reacts only to a falling edge.
int_reset - A synchronous input which will trigger a warm reset when 1
soft_reset - When this input is asserted, a falling edge on ext_reset_n does not trigger an internal reset. This allows resetting of the C64 without resetting the super-reu core.
reset - Reset output
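The interplay of the three inputs can be sketched with a cycle-based Python model. This is a guess at the behaviour based only on the port descriptions above: the class name ResetModel is invented, the model starts with reset asserted to represent the cold reset, and the exact cycle counts of the real module may differ.

```python
class ResetModel:
    """Cycle-based sketch of a cold/warm reset generator."""

    def __init__(self, reset_cycles):
        self.reset_cycles = reset_cycles
        self.counter = reset_cycles  # assumption: cold reset starts asserted
        self.prev_ext = 1

    def tick(self, ext_reset_n=1, int_reset=0, soft_reset=0):
        """Advance one clk cycle and return the reset output for that cycle."""
        falling = self.prev_ext == 1 and ext_reset_n == 0
        self.prev_ext = ext_reset_n
        if int_reset or (falling and not soft_reset):
            self.counter = self.reset_cycles  # (re)start the reset period
        out = 1 if self.counter > 0 else 0
        if self.counter > 0:
            self.counter -= 1
        return out


gen = ResetModel(reset_cycles=3)
cold = [gen.tick() for _ in range(4)]
# ext_reset_n held low: only the falling edge restarts the counter.
warm = [gen.tick(ext_reset_n=0) for _ in range(4)]
print(cold, warm)
```

Because only the falling edge is acted upon, holding the expansion port RESET line low (for example while the FPGA itself drives it) does not keep the core in reset.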
An SPI master implementation for use with the mmc64 module, or any other function needing to perform SPI transfers. The instantiation parameters clk_speed and sck_speed should indicate the speed of the system clock and the desired SCK speed, respectively. Any unit (Hz, kHz, MHz) can be used as long as it is the same for both parameters. A second "fast" SCK speed (there is no requirement that it is actually faster) can also be specified using the parameter sck_fast_speed.
clk - System clock
reset - Reset input
sclk - Connect to SCLK pin
miso - Connect to MISO pin
mosi - Connect to MOSI pin
req - Toggle this input to start a new SPI transfer
ack - This output takes the same value as req when the transfer is completed
fast_speed_en - When two speeds are configured, a 1 here selects the "fast" speed. If sck_fast_speed is not specified, this input does not need to be connected
d - The byte to send on MOSI (should remain valid until the transfer is acknowledged)
q - The byte which was received on MISO (will be valid from the point where the transfer is acknowledged to the point where a new transfer is requested)
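When choosing values for clk_speed, sck_speed and sck_fast_speed, it can help to check what SCK frequency an integer clock divider can actually produce. The Python sketch below assumes that SCK is generated by counting a whole number of system clocks per half period and that the divider is rounded up so the requested speed is never exceeded; how the module itself rounds is not stated here, so treat this as an estimate. The function name sck_divider is hypothetical.

```python
import math


def sck_divider(clk_speed, sck_speed):
    """Return (system clocks per SCK half period, resulting SCK speed).

    Both arguments must use the same unit; the result uses that unit too.
    """
    half_period = math.ceil(clk_speed / (2 * sck_speed))
    return half_period, clk_speed / (2 * half_period)


# Example in kHz: 100 MHz system clock, 250 kHz slow and 8 MHz fast speed.
slow = sck_divider(100_000, 250)
fast = sck_divider(100_000, 8_000)
print("slow:", slow)
print("fast:", fast)
```

In this example the fast setting ends up a little below the requested 8 MHz but still above the 6 MHz that the mmc64 module needs for streaming, and the slow setting stays under the 400 kHz limit.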