cpldcpu / tinytapeout_mcpu5

8 bit CPU optimized for the constraints of tinytapeout

Zinnia (MCPU5)

An 8 bit RISC CPU for TinyTapeout. TinyTapeout combines 500 designs on a single IC to be taped out with the Open MPW-7. This offers the opportunity to get a design made on a real IC, but it also comes with some constraints:

  • Maximum allowed area is 100 x 100 µm² in Skywater 130nm CMOS technology. The actual number of useable gates depends on cell size and routing limitations.
  • Only eight digital inputs and eight digital outputs are allowed.
  • I/O will be provided via the scanchain (a long shift register) and is hence rather slow.

Designing a CPU around these constraints offers a nice challenge. Challenge accepted!

Content of repository

  • src/ contains the original submission to TinyTapeout
  • design/ Cleaned up source, Testbench, Assembler and code examples
  • design_plus/ A version of the CPU with an improved ISA that did not make it into the tape out.
  • See below for a design description

Design Description

Top level

The strict limitations on I/O do not allow implementing a normal interface with a bidirectional data bus and a separate address bus. One way of addressing this would be to reduce the data width of the CPU to 4 bit, but this was deemed too limiting. Another option, implementing a serial interface, appeared too slow and too complex.

Instead the I/Os were allocated as shown below.

The CPU is based on the Harvard Architecture with separate data and program memories. The data memory is completely internal to the CPU. The program memory is external and is accessed through the I/O. All data has to be loaded as constants through machine code instructions.

Two of the input pins are used for clock and reset. The remaining six carry the instruction, which is therefore six bits long. The output is multiplexed between the program counter (when clk is '1') and otherwise the content of the main register, the Accumulator. Reading the Accumulator is how the program output is observed.
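The pin allocation described above can be sketched as a small Python model. This is only an illustration: the bit positions (clk on bit 0, reset on bit 1, instruction on bits 7..2) and the function names are assumptions made for the example, not taken from the design sources.

```python
def pack_inputs(clk: int, rst: int, inst: int) -> int:
    """Pack clock, reset and a 6 bit instruction into one 8 bit input word.

    Assumed (hypothetical) layout: bit 0 = clk, bit 1 = rst, bits 7..2 = inst.
    """
    assert clk in (0, 1) and rst in (0, 1)
    assert 0 <= inst < 64  # instructions are six bits wide
    return (inst << 2) | (rst << 1) | clk


def output_pins(clk: int, pc: int, accu: int) -> int:
    """Model the output multiplexer: PC while clk is 1, Accumulator otherwise."""
    return (pc if clk == 1 else accu) & 0xFF


word = pack_inputs(clk=1, rst=0, inst=0b101010)
print(f"{word:08b}")
print(hex(output_pins(1, 0x12, 0x34)), hex(output_pins(0, 0x12, 0x34)))
```

An external program memory would sample the program counter during the high clock phase and present the next 6 bit opcode on the instruction inputs.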

Programmers Model

Besides simplifying the external interface, the Harvard Architecture implementation also removes the requirement to interleave code and data access on the bus. Every instruction can be executed in a single clock cycle. Due to this, no state machine for micro-sequencing is required and instructions can be decoded directly from the inst[5:0] input.

All data operations are performed on the accumulator. In addition, there are eight data registers. The data registers are implemented as a single port memory based on latches, which significantly reduced area usage compared to a two port implementation. The Accu is complemented by a single carry flag, which can be used for conditional branches.
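As a rough illustration of the accumulator and carry flag, the sketch below models an 8 bit addition that wraps around and reports the carry out. The function name and the choice of an add operation are assumptions for the example; the exact flag behaviour of each instruction is defined by the instruction table, not by this sketch.

```python
def add8(accu: int, operand: int) -> tuple[int, int]:
    """Add two 8 bit values; return the 8 bit result and the carry out (0 or 1)."""
    total = (accu & 0xFF) + (operand & 0xFF)
    return total & 0xFF, total >> 8


# Eight data registers plus one accumulator and one carry flag.
regs = [0] * 8
regs[3] = 100

accu, carry = add8(200, regs[3])
print(accu, carry)
```

A conditional branch would then test `carry` to decide whether to change the program counter.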

Handling of constants is supported by the integer flag ("I-Flag"), which enables loading an eight bit constant with two consecutive 6 bit opcodes.
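One plausible way such a flag could work is sketched below. It assumes that each load opcode carries a 4 bit immediate, that the first load replaces the accumulator and sets the flag, and that a second consecutive load shifts the accumulator left by four bits and inserts the new nibble. These details and all names are assumptions for illustration and are not confirmed by the text above; the first load might, for example, sign-extend its nibble, which would not change the two-nibble result.

```python
def load_imm(accu: int, iflag: int, imm4: int) -> tuple[int, int]:
    """Apply one hypothetical load-immediate opcode; return (accu, iflag)."""
    imm4 &= 0xF
    if iflag:
        # Second consecutive load: shift in the low nibble.
        accu = ((accu << 4) | imm4) & 0xFF
    else:
        # First load: replace the accumulator with the nibble.
        accu = imm4
    return accu, 1  # assumed: any other instruction would clear the flag


def load_constant(value: int) -> int:
    """Load an 8 bit constant using two consecutive load-immediate opcodes."""
    accu, iflag = 0, 0
    for nibble in ((value >> 4) & 0xF, value & 0xF):
        accu, iflag = load_imm(accu, iflag, nibble)
    return accu


print(hex(load_constant(0xA5)))
```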

Instruction Set Architecture

The list of instructions and their encoding is shown below. One challenge in the instruction set design was to encode the target address for branches. The limited opcode size only allows for a four bit immediate to be encoded as a maximum. Initially, I considered introducing an additional segment register for long jumps, but ultimately decided to introduce relative addressing for conditional branches and a long jmp instruction that is fed from the accumulator.
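The reach of a relative branch with a signed four bit immediate can be illustrated with a short sketch. It assumes two's complement encoding of the offset, an offset relative to the address of the branch itself, and an 8 bit program counter; all three are assumptions for the example, as are the function names.

```python
def sign_extend4(imm4: int) -> int:
    """Interpret a 4 bit field as a two's complement number in -8..+7."""
    imm4 &= 0xF
    return imm4 - 16 if imm4 & 0x8 else imm4


def branch_target(pc: int, simm4: int, pc_bits: int = 8) -> int:
    """Compute the target of a taken relative branch (assumed PC width: pc_bits)."""
    return (pc + sign_extend4(simm4)) & ((1 << pc_bits) - 1)


print(sign_extend4(0x7), sign_extend4(0x8))
print(hex(branch_target(0x10, 0xE)), hex(branch_target(0x10, 0x3)))
```

With only sixteen reachable offsets, anything beyond a short loop or skip needs the long jmp through the accumulator.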

Having both NOT and NEG may seem excessive, but the implementation was cheap on resources and some instruction sequences could be simplified.

Apart from NOT, no Boolean logic instructions (AND/OR/NOR/XOR) are supported, since they were not needed in any of my typical test programs.

[Figure: list of instructions and their encoding]

The table below shows common instruction sequences that can be realized with macros.

[Figure: table of common instruction sequences realized with macros]

Design after placement and routing

The total cell count after synthesis is 489. With any additional features, the routing pass no longer completed. The summary and floorplan below show the synthesis result for a 115 x 115 µm² area; however, the design fits into 100 x 100 µm² as well.

[Figures: synthesis summary and floorplan]

Summary

Zinnia (MCPU5) is a successful 8 bit processor implementation considering the TinyTapeout constraints. Both Fibonacci and prime search algorithms were successfully ported and run in the testbench.

In hindsight, two design decisions in the instruction set architecture seem limiting:

  • NEG not setting any flag. This is a missed opportunity to simplify test for zero.
  • The relative branch range of sIMM4 is too short to be useful. Instead, a more efficient implementation for long jmps is required, for example one based on a sideloading register.

Both of these issues are addressed in an updated implementation.

Original TinyTapeout Readme


Go to https://tinytapeout.com for instructions!

How to change the Wokwi project

Edit the Makefile and change the WOKWI_PROJECT_ID to match your project.

What is this about?

This repo is a template you can make a copy of for your own ASIC design using Wokwi.

When you edit the Makefile to choose a different ID, the GitHub Action will fetch the digital netlist of your design from Wokwi.

The design gets wrapped in some extra logic that builds a 'scan chain'. This is a way to put lots of designs onto one chip and still have access to them all. You can see all of the technical details here.

After that, the action uses the open source ASIC tool called OpenLane to build the files needed to fabricate an ASIC.

What files get made?

When the action is complete, you can click here to see the latest build of your design. You need to download the zip file and take a look at the contents:

  • gds_render.svg - picture of your ASIC design
  • gds.html - zoomable picture of your ASIC design
  • runs/wokwi/reports/final_summary_report.csv - CSV file with lots of details about the design
  • runs/wokwi/reports/synthesis/1-synthesis.stat.rpt.strategy4 - list of the standard cells used by your design
  • runs/wokwi/results/final/gds/user_module.gds - the final GDS file needed to make your design

What next?

License: Apache License 2.0

