

# CHIPYARD Architecture and Components



### **Use Cases**





## **CHIPYARD** Organization

### What is Chipyard?

- An organized framework for various SoC design tools
- A curated IP library of open-source RISC-V SoC components
- A methodology for agile SoC architecture design, exploration, and evaluation










### SoC Architecture





### **Tiles and Cores**



#### Tiles:

- Each Tile contains a RISC-V core and private caches
- Several varieties of cores are supported
- Interface supports integrating your own RISC-V core implementation
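To make the tile abstraction concrete, here is a minimal conceptual sketch in Python. It is not Chipyard code (Chipyard itself is written in Chisel/Scala), and every name in it (`Core`, `Tile`, `RocketLike`, `MyCustomCore`, `build_soc`) as well as the 32 KiB cache sizes are hypothetical, chosen only to illustrate how a tile pairs one core with private caches behind a common interface.

```python
from dataclasses import dataclass
from typing import List, Protocol


class Core(Protocol):
    """Hypothetical interface a core must satisfy to be placed in a tile."""

    name: str

    def isa(self) -> str:
        """Return the RISC-V ISA string the core implements."""
        ...


@dataclass
class RocketLike:
    """Stand-in for an in-order core (illustrative only)."""

    name: str = "rocket-like in-order core"

    def isa(self) -> str:
        return "RV64GC"


@dataclass
class MyCustomCore:
    """Stand-in for a user-supplied core plugged into the same interface."""

    name: str = "my custom core"

    def isa(self) -> str:
        return "RV64IMAC"


@dataclass
class Tile:
    """A tile bundles exactly one core with its private L1 caches."""

    core: Core
    l1i_kib: int
    l1d_kib: int

    def describe(self) -> str:
        return (
            f"{self.core.name} ({self.core.isa()}), "
            f"L1I {self.l1i_kib} KiB, L1D {self.l1d_kib} KiB"
        )


def build_soc(cores: List[Core], l1i_kib: int = 32, l1d_kib: int = 32) -> List[Tile]:
    """Wrap each core in its own tile; cache sizes are assumed defaults."""
    return [Tile(core=c, l1i_kib=l1i_kib, l1d_kib=l1d_kib) for c in cores]


if __name__ == "__main__":
    for tile in build_soc([RocketLike(), MyCustomCore()]):
        print(tile.describe())
```

The point of the sketch is the design choice, not the details: because the tile depends only on the `Core` interface, a custom core can be swapped in without changing the code that builds the rest of the system.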




## **Rocket and BOOM**



#### **Rocket:**

- First open-source RISC-V CPU
- In-order, single-issue RV64GC core
- Efficient design point for low-power devices

#### SonicBOOM:

- Superscalar out-of-order RISC-V CPU
- Advanced microarchitectural features to maximize IPC
- TAGE branch prediction, OOO load-store-unit, register renaming
- High-performance design point for general-purpose systems

**Figure: SonicBOOM pipeline block diagram.** Front end: L1 instruction cache (32 KiB, 8-way) with ICache TLB and tags, fetching 16 bytes/cycle; L0 BTB (1-cycle redirect), dense L1 BTB (2-cycle redirect), TAGE-L branch predictor (3-cycle redirect), and return-address stack; instruction fetch and predecode (4 cycles, 16-byte window); fetch buffer (32 entries); 4-wide decode. Back end: rename/allocate/retirement with a 128-entry reorder buffer; integer physical register file (128 registers); FP, INT, and MEM issue queues; ALU, FPU, FDIV, and AGU execution units; load queue (32 entries) and store buffer with forwarding (32 entries). Memory: L2 cache (512 KiB, 8-way).
| (32 entries) (32 entries)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                           |  |  |  |  |
| (32 entries)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                           |  |  |  |  |
| (32 entries)<br>88/cycle 88/cycle DCache 88/cycle Next-line Prefetcher                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |                           |  |  |  |  |
| (32 entries)<br>88/cycle 88/cycle DCache 71 B 88/cycle Prefetcher<br>L1 Data Cache 8 MSHRs                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |                           |  |  |  |  |
| (32 entries)<br>88/cycle 88/cycle DCache 71 B 88/cycle Prefetcher<br>L1 Data Cache 8 MSHRs                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 8bit/cycle                |  |  |  |  |



## **Rocket and BOOM**



#### **Rocket and SonicBOOM:**

- Support RV64GC ISA profile
- Boots off-the-shelf RISC-V Linux distros (buildroot, Fedora, etc.)
- Fully synthesizable, tapeout-proven
- Described in Chisel
- Fully open-sourced

*Figure: SonicBOOM pipeline block diagram. Labels recoverable from the figure:*

- **Front end:** ICache TLB and tags; L1 instruction cache (32 KiB, 8-way); L0 BTB (1-cycle redirect); dense L1 BTB (2-cycle redirect); TAGE-L branch predictor (3-cycle redirect); return-address stack; instruction fetch and predecode (4 cycles, 16-byte window); 4-wide decode
- **Execute:** rename/allocate; reorder buffer; register file (128 registers); predicate register file (16 bits); 32-entry issue queues, including a MEM issue queue; ALU/branch units, FPUs, AGUs
- **Load/store unit:** load queue (32 entries); store buffer and forwarding (32 entries); L1 data cache (32 KiB, 8-way, 8 MSHRs); next-line prefetcher; 8 B/cycle data paths



## PULP Cores in CHIPYARD



#### CVA6 (Formerly Ariane):

- RV64IMAC 6-stage single-issue in-order core
- Open-source
- Implemented in SystemVerilog
- Developed at ETH Zurich as part of PULP
- Now maintained by OpenHW Group



#### Ibex (Formerly Zero-RISCY):

- RV32IMC 2-stage single-issue in-order core
- Open-source
- Implemented in SystemVerilog
- Developed at ETH Zurich as part of PULP
- Now maintained by lowRISC



## **Sodor Education Cores**

#### **Sodor Core Collection**

- Collection of RV32IM cores for teaching and education
- 1-stage, 2-stage, 3-stage, 5-stage implementations
- Micro-coded "bus-based" implementation
- Used in introductory computer architecture courses at Berkeley









#### Spike:

- Open-source RISC-V ISA simulator
- Fast, extensible, C++ functional model

#### Spike-as-a-Tile:

- DPI interface between RTL SoC simulation and SoC functional model
- Spike "virtual platform"
- Enables testing of complex device software in RTL simulation



### **RoCC Accelerators**



#### **RoCC Accelerators:**

- Tightly-coupled accelerator interface
- Attach custom accelerators to Rocket or BOOM cores



#### **Examples:**



- Hwacha vector accelerator
- SHA3 accelerator
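
To make "tightly-coupled" concrete, here is a minimal Python sketch of how a RoCC command is packed into a single 32-bit custom instruction. The field layout (`funct7`, `rs2`, `rs1`, `xd`, `xs1`, `xs2`, `rd`, `opcode`) follows the commonly documented Rocket Chip RoCC format, and `0b0001011` is the RISC-V custom-0 major opcode. The slides do not spell out the encoding, so treat this as an illustration and not as a reference. The helper names `encode_rocc` and `decode_rocc` are hypothetical.

```python
# Sketch: packing and unpacking a RoCC custom-instruction word.
# Helper names are illustrative; they are not part of Chipyard or Rocket Chip.

CUSTOM0 = 0b0001011  # RISC-V custom-0 major opcode


def encode_rocc(funct7, rd, rs1, rs2, xd, xs1, xs2, opcode=CUSTOM0):
    """Pack the RoCC fields into one 32-bit instruction word."""
    assert 0 <= funct7 < 128
    assert all(0 <= reg < 32 for reg in (rd, rs1, rs2))
    return (
        (funct7 << 25)        # accelerator-defined function code
        | (rs2 << 20)         # second source register index
        | (rs1 << 15)         # first source register index
        | (int(xd) << 14)     # 1 if the core expects a result written to rd
        | (int(xs1) << 13)    # 1 if the value of rs1 is sent to the accelerator
        | (int(xs2) << 12)    # 1 if the value of rs2 is sent to the accelerator
        | (rd << 7)           # destination register index
        | opcode
    )


def decode_rocc(word):
    """Unpack a 32-bit instruction word into its RoCC fields."""
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "xd": bool((word >> 14) & 1),
        "xs1": bool((word >> 13) & 1),
        "xs2": bool((word >> 12) & 1),
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


word = encode_rocc(funct7=1, rd=10, rs1=11, rs2=12, xd=True, xs1=True, xs2=True)
fields = decode_rocc(word)
```

Because the command is an ordinary instruction, the core passes register operands straight to the accelerator and can receive a result in `rd`, with no memory-mapped register traffic.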



### **MMIO Accelerators**



#### **MMIO Accelerators:**

- Controlled by MMIO-mapped registers
- Supports DMA to memory system
- Examples:
  - Nvidia NVDLA accelerator
  - FFT accelerator generator
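
The MMIO programming model can be sketched with a toy Python device. Everything here is hypothetical: the register names, their offsets, and the "sum a buffer" job are invented for illustration and do not describe NVDLA or the FFT generator. The point is the control pattern: software writes configuration registers, sets a start bit, polls a status register, and the device fetches its own data from memory (the DMA part).

```python
# Toy model of an MMIO-controlled accelerator with DMA.
# Register layout and behaviour are made up for illustration only.

class ToyMmioAccelerator:
    """Sums LEN 8-byte words starting at address SRC and exposes the total in RESULT."""

    SRC, LEN, CTRL, STATUS, RESULT = 0x00, 0x08, 0x10, 0x18, 0x20  # hypothetical offsets

    def __init__(self, memory):
        self.memory = memory  # shared "DRAM": dict of address -> word
        self.regs = {off: 0 for off in
                     (self.SRC, self.LEN, self.CTRL, self.STATUS, self.RESULT)}

    def write(self, offset, value):
        """MMIO store from the CPU."""
        self.regs[offset] = value
        if offset == self.CTRL and value & 1:  # start bit kicks off the job
            self._run()

    def read(self, offset):
        """MMIO load from the CPU."""
        return self.regs[offset]

    def _run(self):
        base, count = self.regs[self.SRC], self.regs[self.LEN]
        # "DMA": the device reads memory itself; the CPU never touches the data.
        total = sum(self.memory.get(base + 8 * i, 0) for i in range(count))
        self.regs[self.RESULT] = total
        self.regs[self.STATUS] = 1  # done flag


memory = {0x1000 + 8 * i: value for i, value in enumerate([3, 5, 7, 11])}
dev = ToyMmioAccelerator(memory)

dev.write(dev.SRC, 0x1000)   # where the input buffer lives
dev.write(dev.LEN, 4)        # how many words to process
dev.write(dev.CTRL, 1)       # start
while dev.read(dev.STATUS) == 0:
    pass                     # poll until the device reports completion
result = dev.read(dev.RESULT)
```

In a real SoC the job would run concurrently with the CPU, so the polling loop (or an interrupt) matters; in this toy model the job finishes inside the `CTRL` write.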



### **Coherent Interconnect**





### **Protocol Shims**



### AMBA-to-TileLink shims enable easy integration with existing IP

- Works for cores/peripherals/accelerators
- Drop-in Verilog integration of CVA6 and NVDLA



### **NoC Interconnect**





### Constellation

#### A parameterized Chisel generator for SoC interconnects

- Protocol-independent transport layer
- Supports TileLink, AXI-4
- Highly parameterized
- Deadlock-freedom
- Virtual-channel wormhole routing
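
Why virtual channels help with deadlock-freedom can be shown with the classic channel-dependency-graph argument: a wormhole-routed network is deadlock-free if the graph of "holds channel A while waiting for channel B" edges has no cycle. The Python sketch below applies that check to a unidirectional ring, with and without a second virtual channel. This is a textbook illustration under assumed routing rules, not a description of how Constellation itself verifies its networks, and all function names are hypothetical.

```python
# Channel-dependency check on a unidirectional ring (illustrative only).
# A channel is (link, vc); link i carries traffic from node i to node (i + 1) % n.

def ring_route(src, dst, n, use_dateline):
    """Return the channels a packet occupies, in order, going from src to dst."""
    hops, node, vc = [], src, 0
    while node != dst:
        hops.append((node, vc))
        if use_dateline and node == n - 1:
            vc = 1  # after crossing the dateline link, continue on VC 1
        node = (node + 1) % n
    return hops


def channel_dependencies(n, use_dateline):
    """Collect 'held -> wanted' edges over all source/destination pairs."""
    deps = {}
    for src in range(n):
        for dst in range(n):
            if src == dst:
                continue
            path = ring_route(src, dst, n, use_dateline)
            for held, wanted in zip(path, path[1:]):
                deps.setdefault(held, set()).add(wanted)
    return deps


def has_cycle(deps):
    """Depth-first search for a cycle in the dependency graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(channel):
        color[channel] = GREY
        for nxt in deps.get(channel, ()):
            state = color.get(nxt, WHITE)
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        color[channel] = BLACK
        return False

    return any(color.get(ch, WHITE) == WHITE and visit(ch) for ch in list(deps))


single_vc_can_deadlock = has_cycle(channel_dependencies(4, use_dateline=False))
dateline_can_deadlock = has_cycle(channel_dependencies(4, use_dateline=True))
```

With one virtual channel the ring's dependencies wrap around into a cycle; switching to a second virtual channel at the dateline breaks that cycle without adding physical links.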





### L2/DRAM



#### Shared memory:

- Open-source TileLink L2 developed by SiFive
  - Directory-based coherence with MOESI-like protocol
  - Configurable capacity/banking
- Supports broadcast-based coherence in no-L2 systems
- Supports incoherent memory systems

#### DRAM:

- AXI-4 DRAM interface to external memory controller
- Interfaces with DRAMSim

## Peripherals and IO



#### Peripherals and IO:

- Open-source RocketChip blocks
  - Interrupt controllers
  - JTAG, Debug module, BootROM
  - UART, GPIOs, SPI, I2C, PWM, etc.
- TestChipIP: useful IP for test chips
  - Clock-management devices
  - SerDes
  - Scratchpads








## **CHIPYARD** Organization

SoC Configuration

*Figure: Chipyard flow from SoC configuration to target-specific Verilog. Stages recoverable from the figure:*

- **Inputs:** custom configuration; RTL generators (RISC-V cores, accelerators, peripherals); custom Verilog
- **RTL build process:** IO and Harness binding, producing FIRRTL IR
- **FireSim transforms:** FAME decoupling, FPGA platform mapping, assertion/printf synthesis, ILA wiring, RAM optimizations
- **VLSI transforms:** top and harness split, replace memories, module promotion, module grouping, cell technology mapping
- **Outputs:** FireSim Verilog, behavioral Verilog, FPGA-mapped Verilog, VLSI Verilog
- **Downstream flows shown:** software RTL simulation, Hammer

## **Composable Configurations**










## **CHIPYARD** Organization

#### SW RTL Simulation:

- RTL-level simulation with Verilator or VCS
- Hands-on tutorial next

#### FPGA prototyping:

- Fast, non-deterministic prototypes
- Bringup platform for taped-out chips

#### Hammer VLSI flow:

- Tape out a custom config in some process technology
- Overview of flow later

#### FireSim:

- Fast, accurate FPGA-accelerated simulations
- Hands-on tutorial later












## **Elaboration Flow**

- Chisel programs generate FIRRTL representations of hardware
- FIRRTL passes transform the target netlist
  - FireSim's AutoCounter/AutoILA/Printf use FIRRTL passes
  - VLSI flows can use FIRRTL passes to adjust the module hierarchy
- NEW in Chipyard 1.9.1: CIRCT Backend
  - CIRCT generates tool-friendly synthesizable Verilog from FIRRTL
  - Very fast/powerful FIRRTL compiler
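
The idea of passes transforming an intermediate representation can be sketched in a few lines of Python. The "IR" here is a deliberately tiny stand-in (a dictionary from module name to child instances), and the two passes are hypothetical analogues of memory replacement and module grouping. None of this is the FIRRTL or CIRCT API; it only illustrates the pipeline structure.

```python
# Toy pass pipeline over a toy circuit IR (not FIRRTL or CIRCT).
from functools import reduce

# Toy IR: module name -> list of instantiated child modules.
circuit = {
    "Top": ["Tile", "L2", "Uart"],
    "Tile": ["Core", "BehavioralMem"],
    "L2": ["BehavioralMem"],
    "Core": [],
    "Uart": [],
    "BehavioralMem": [],
}


def replace_memories(ir):
    """Swap behavioral memories for a (hypothetical) SRAM macro module."""
    def swap(name):
        return "SramMacro" if name == "BehavioralMem" else name
    return {swap(mod): [swap(child) for child in kids] for mod, kids in ir.items()}


def group_modules(ir, parent="Top", members=("Tile", "L2"), group="CoreComplex"):
    """Move selected children of `parent` under a new wrapper module."""
    out = {mod: list(kids) for mod, kids in ir.items()}
    out[group] = [child for child in out[parent] if child in members]
    out[parent] = [group] + [child for child in out[parent] if child not in members]
    return out


def run_passes(ir, passes):
    """Apply each pass in order; every pass returns a new IR."""
    return reduce(lambda acc, one_pass: one_pass(acc), passes, ir)


lowered = run_passes(circuit, [replace_memories, group_modules])
```

Each pass takes an IR and returns a new one, so a flow is just an ordered list of passes. A VLSI flow and a simulation flow can share the same source design and differ only in which passes they run.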





## **CHIPYARD** Organization

**Configs:** Describe the parameterization of a multi-generator SoC

**Generators:** Flexible, reusable library of open-source Chisel generators (and Verilog too)

**IOBinders/HarnessBinders:** Enable configuring IO strategy and Harness features

**FIRRTL/CIRCT Passes:** Structured mechanism for supporting multiple flows

**Target flows:** Different use-cases for different types of users
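
How Configs compose can be illustrated with a small Python model. In Chipyard, configs are Scala classes built by chaining config fragments with `++`, where fragments further left override those further right. The sketch below mimics that lookup rule with `+`. The class, the parameter keys, and the fragment names are all hypothetical and exist only for this example.

```python
# Toy model of composable config fragments (not the Chipyard/Rocket Chip API).

class Config:
    """A chain of partial key -> value mappings; the leftmost fragment wins."""

    def __init__(self, values):
        self.chain = [dict(values)]  # highest-priority fragment first

    def __add__(self, other):
        combined = Config({})
        combined.chain = self.chain + other.chain
        return combined

    def __getitem__(self, key):
        for fragment in self.chain:
            if key in fragment:
                return fragment[key]
        raise KeyError(key)


# Hypothetical fragments: a base SoC plus two overrides.
base = Config({"core": "rocket", "n_cores": 1, "l2_kib": 512, "accelerator": None})
with_boom = Config({"core": "boom"})
with_sha3 = Config({"accelerator": "sha3"})

# Compose: overrides on the left, defaults on the right.
my_soc = with_sha3 + with_boom + base

core = my_soc["core"]                # overridden by with_boom
accelerator = my_soc["accelerator"]  # overridden by with_sha3
l2_kib = my_soc["l2_kib"]            # inherited from base
```

Because each fragment only states what it changes, a new design point is one line of composition, and generators read their parameters from the combined config without knowing which fragment supplied them.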





## **CHIPYARD** Learning Curve

#### Advanced-level

- Configure custom IO/clocking setups
- Develop custom FireSim extensions
- Integrate and tape-out a complete SoC

#### **Evaluation-level**

- Integrate or develop custom hardware IP into Chipyard
- Run FireSim FPGA-accelerated simulations
- Push a design through the Hammer VLSI flow
- Build your own system

#### **Exploratory-level**

- Configure a custom SoC from pre-existing components
- Generate RTL and run it in RTL-level simulation
- Evaluate existing RISC-V designs



## **CHIPYARD** For Education

Proven in many Berkeley Architecture courses

- Hardware for Machine Learning
- Undergraduate Computer Architecture
- Graduate Computer Architecture
- Advanced Digital ICs
- Tapeout HW design course

Advantages of a common shared HW framework

- Reduced ramp-up time for students
- Students learn the framework once, reuse it in later courses
- Enables more advanced course projects (tape out a chip in one semester)





## **CHIPYARD** For Tapeouts

#### Standard Chipyard "Flow" For Tapeout

1. **RTL Development:** Develop new accelerators/devices and test rapidly
2. **FireSim:** Evaluate your design on real workloads
3. **Hammer:** Reusable/extensible VLSI flow
4. **Bringup:** Generate FPGA bringup platforms using Chipyard

### Chipyard is a single-source-of-truth for a chip

- Enables parallel workflows across different parts of the flow
- Reproducible environments simplify debugging
- Continuous integration for tapeouts







## CHIPYARD Community

#### **Documentation:**

- https://chipyard.readthedocs.io/en/dev/
- 133 pages
- Most of today's tutorial content is covered there

#### Mailing List:

- google.com/forum/#!forum/chipyard

#### **Open-sourced:**

- All code is hosted on GitHub
- Issues, feature-requests, PRs are welcomed




#### Welcome to Chipyard's documentation!



Chipyard is a framework for designing and evaluating full-system hardware using agile teams. It is composed of a collection of tools and libraries designed to provide an integration between open-source and commercial tools for the development of systems-on-chip.






## CHIPYARD

An open, extensible research and design platform for RISC-V SoCs

- Unified framework of parameterized generators
- One-stop-shop for RISC-V SoC design exploration
- Supports a variety of flows for multiple use cases
- Open-sourced, community- and research-friendly

