Can't boot Linux with MAIN_RAM sizes above 512MB

JoyBed opened this issue · comments

I stumbled upon an interesting bug. Linux is unable to boot if the memory size is above 512MB. In the LiteX BIOS the whole RAM passes the tests and works, but when Linux starts to boot it fails if the RAM size is specified above 512MB. Where does this limitation come from?
[screenshot]

Well... when I limit it in the dts to 512MB then Linux can boot, but the biggest problem I always had is NaxRiscv. The main_ram works on ANY softcore but not on NaxRiscv; here is the screenshot:
[screenshot]
That's when it's connected to main_ram through the peripheral bus; when I connect the main_ram directly to the AXI4 ports it gets stuck at "Memtest at 0x40000000" and doesn't even count.

Hi,

I just tested via :
litex_sim --cpu-type=naxriscv --with-sdram --sdram-module=MT41K128M16 --sdram-data-width=64

I got :

--=============== SoC ==================--
CPU:		NaxRiscv 32-bit @ 1MHz
BUS:		wishbone 32-bit @ 4GiB
CSR:		32-bit data
ROM:		128.0KiB
SRAM:		8.0KiB
L2:		8.0KiB
SDRAM:		1.0GiB 64-bit @ 8MT/s (CL-6 CWL-5)
MAIN-RAM:	1.0GiB

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Switching SDRAM to hardware control.
Memtest at 0x40000000 (8.0KiB)...
  Write: 0x40000000-0x40002000 8.0KiB   
   Read: 0x40000000-0x40002000 8.0KiB   
Memtest OK
Memspeed at 0x40000000 (Sequential, 8.0KiB)...
  Write speed: 2.9MiB/s
   Read speed: 3.4MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Cancelled

--============= Console ================--
litex> mem_test 0x40000000 0x1000
Memtest at 0x40000000 (4.0KiB)...
  Write: 0x40000000-0x40001000 4.0KiB   
   Read: 0x40000000-0x40001000 4.0KiB   
Memtest OK

litex> mem_test 0x50000000 0x1000
Memtest at 0x50000000 (4.0KiB)...
  Write: 0x50000000-0x50001000 4.0KiB   
   Read: 0x50000000-0x50001000 4.0KiB   
Memtest OK

litex> mem_test 0x60000000 0x1000
Memtest at 0x60000000 (4.0KiB)...
  Write: 0x60000000-0x60001000 4.0KiB   
   Read: 0x60000000-0x60001000 4.0KiB   
Memtest OK

litex> mem_test 0x70000000 0x1000
Memtest at 0x70000000 (4.0KiB)...
  Write: 0x70000000-0x70001000 4.0KiB   
   Read: 0x70000000-0x70001000 4.0KiB   
Memtest OK

What command line did you use?
One specific thing about NaxRiscv is that the CPU is generated with exact knowledge of "where is some RAM I can access".
So if there is a bug there, it could create your case.
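To make that concrete, here is a purely illustrative Python sketch (not the actual naxriscv core.py code) of the idea: the SoC's memory regions are serialized and handed to the generator, so the produced CPU has the memory map baked in, and any mismatch between that list and the real SoC map can cause exactly this kind of failure. The region values below simply mirror the defaults discussed later in this thread.

```python
# Illustrative only: the generated NaxRiscv "knows" which ranges are cacheable
# RAM (reached via the mbus) and which are peripherals (reached via the pbus).
# The exact argument names used by core.py may differ.
regions = [
    # (origin,     size,        cacheable, via_pbus)
    (0x0000_0000, 0x0002_0000, True,  False),  # rom
    (0x4000_0000, 0x4000_0000, True,  False),  # main_ram -> mbus
    (0xF000_0000, 0x0001_0000, False, True ),  # CSRs     -> pbus
]
for origin, size, cacheable, via_pbus in regions:
    print(f"region 0x{origin:08x}-0x{origin + size - 1:08x} "
          f"cacheable={cacheable} bus={'pbus' if via_pbus else 'mbus'}")
```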

I used this:
./xilinx_zybo_z7_20.py --variant=original --cpu-type=naxriscv --xlen=64 --scala-args='rvc=true,rvf=true,rvd=true,alu-count=2,decode-count=2,mmu=true' --with-fpu --with-rvc --with-ps7 --bus-standard=axi-lite --with-spi-sdcard --sys-clk-freq=125e6 --with-xadc --csr-json zybo.json --uart-baudrate=2000000 --build --update-repo=wipe+recommended --vivado-synth-directive=PerformanceOptimized --vivado-route-directive=AggressiveExplore --with-hdmi-video-framebuffer --l2-bytes=262144 --l2-ways=16 --with-jtag-tap
When I generate a NaxRiscv with the mbus connected to DRAM, it freezes at the mem_test at 0x40000000; when I generate it with DRAM connected to the pbus, I get those data errors as in the screenshot.

when I generate it with DRAM connected to the pbus

Hmm, you should really not do that; the Nax SoC is really intended to use cacheable memory through the mbus. Things going through the pbus may not support atomic accesses and the like.

When I generate a NaxRiscv with the mbus connected to DRAM, it freezes at the mem_test at 0x40000000

Ahhh, that is one thing. Probably related to the Zynq nature of the FPGA. Not sure if there is a way to set up a simulation of the SoC / Zynq with Vivado?

@JoyBed the DRAM present on zybo board is connected to the PS -> you can't use it from PL

I don't know. I can launch a simulation within Vivado, but that's not helping much as I can't connect to the UART. At least I don't know of a way to do that.

@JoyBed the DRAM present on zybo board is connected to the PS -> you can't use it from PL

Actually you can, through the slave ports of the PS7 system; the HP slave ports are connected directly to the DRAM. I am using that. It's working with EVERY softcore in LiteX except NaxRiscv. Here you can see my target file:

```python
#!/usr/bin/env python3

#
# This file is part of LiteX-Boards.
#
# Copyright (c) 2019-2020 Florent Kermarrec <florent@enjoy-digital.fr>,
# Copyright (c) 2022-2023 Oliver Szabo <16oliver16@gmail.com>
# SPDX-License-Identifier: BSD-2-Clause

import math
import os
import shutil
from migen import *
from litex.gen import LiteXModule
from litex_boards.platforms import digilent_zybo_z7_20
from litex.soc.interconnect import axi
from litex.soc.interconnect import wishbone
from litex.soc.cores.clock import *
from litex.soc.integration.soc_core import *
from litex.soc.integration.builder import *
from litex.soc.cores.video import VideoVGAPHY
from litex.soc.cores.video import VideoS7HDMIPHY
from litex.soc.cores.usb_ohci import USBOHCI
from litex.soc.cores.led import LedChaser
from litex.soc.cores.xadc import XADC
from litex.soc.cores.dna  import DNA
from litex.soc.integration.soc import SoCRegion
from litex.soc.interconnect import csr_eventmanager
from litex.soc.interconnect.csr_eventmanager import EventManager, EventSourceLevel, EventSourcePulse
from litex.soc.interconnect.csr import AutoCSR
from litex.soc.cores import cpu
from litex.build.tools import write_to_file


# CRG ----------------------------------------------------------------------------------------------

class _CRG(LiteXModule):
    def __init__(self, platform, sys_clk_freq, toolchain="vivado", use_ps7_clk=False, with_video_pll=False, with_usb_pll=False):
        self.rst    = Signal()
        self.cd_sys = ClockDomain()
        self.cd_vga = ClockDomain()
        self.cd_hdmi = ClockDomain()
        self.cd_hdmi5x = ClockDomain()
        self.cd_usb    = ClockDomain()
        # # #

	# Clk
        clk125 = platform.request("clk125")
        
        if use_ps7_clk:
            self.comb   +=  ClockSignal("sys").eq(ClockSignal("ps7"))
            self.comb   +=  ResetSignal("sys").eq(ResetSignal("ps7") | self.rst)
        else:
            # MMCM.
            #if toolchain == "vivado":
            #    self.mmcm = mmcm = S7MMCM(speedgrade=-2)
            #else:
            #    self.mmcm = mmcm = S7PLL(speedgrade=-2)
            self.mmcm = mmcm = S7MMCM(speedgrade=-2)
            #self.mmcm = mmcm = S7PLL(speedgrade=-1)
            #self.comb += mmcm.reset.eq(self.rst)
            mmcm.register_clkin(clk125, 125e6)
            mmcm.create_clkout(self.cd_sys, sys_clk_freq)
            platform.add_false_path_constraints(self.cd_sys.clk, mmcm.clkin) # Ignore sys_clk to mmcm.clkin path created by SoC's rst.
            mmcm.expose_drp()
            self.comb += mmcm.reset.eq(mmcm.drp_reset.re | self.rst)
            
        # Video PLL.
        if with_video_pll:
            self.video_pll = video_pll = S7PLL(speedgrade=-2)
            self.comb += video_pll.reset.eq(self.rst)
            video_pll.register_clkin(clk125, 125e6)
            #video_pll.create_clkout(self.cd_vga, 40e6)
            video_pll.create_clkout(self.cd_hdmi,   148.5e6)
            video_pll.create_clkout(self.cd_hdmi5x, 5*148.5e6)
            platform.add_false_path_constraints(self.cd_sys.clk, video_pll.clkin) # Ignore sys_clk to video_pll.clkin path created by SoC's rst.
            
        # USB PLL
        if with_usb_pll:
            mmcm.create_clkout(self.cd_usb, 48e6)
            
# BaseSoC ------------------------------------------------------------------------------------------

class BaseSoC(SoCCore):
    mem_map = {**SoCCore.mem_map, **{
        #"usb_ohci":     0xc0000000,
        "usb_ohci":	0x18000000,
    }}
    def __init__(self, sys_clk_freq=100e6, 
    	variant = "original",
    	toolchain="vivado", 
    	with_ps7 = False,
    	with_dna = False,
    	with_xadc = False,
    	with_usb_host=False, 
    	with_led_chaser = False,
    	with_video_terminal = False,
        with_video_framebuffer = False,
        with_hdmi_video_terminal = False,
        with_hdmi_video_framebuffer = False, 
    	**kwargs):

        self.interrupt_map = {
            "ps" : 2,
        }

        platform = digilent_zybo_z7_20.Platform(variant=variant)
        self.builder    = None
        self.with_ps7   = with_ps7
        
        # CRG --------------------------------------------------------------------------------------
        use_ps7_clk     = (kwargs.get("cpu_type", None) == "zynq7000")
        with_video_pll  = (with_hdmi_video_terminal or with_hdmi_video_framebuffer)
        with_usb_pll    = with_usb_host
        self.crg        = _CRG(platform, sys_clk_freq, toolchain,
            use_ps7_clk    = use_ps7_clk,
            with_video_pll = (with_video_terminal or with_video_framebuffer or
                              with_hdmi_video_terminal or with_hdmi_video_framebuffer),
            with_usb_pll   = with_usb_host)

        # SoCCore ----------------------------------------------------------------------------------
        if kwargs["uart_name"] == "serial":
            kwargs["uart_name"] = "usb_uart" # Use USB-UART Pmod on JB.
        if kwargs.get("cpu_type", None) == "zynq7000":
            kwargs["integrated_sram_size"] = 0x0
            kwargs["with_uart"] = False
            self.mem_map = {
                'csr': 0x4000_0000,  # Zynq GP0 default
            }
        SoCCore.__init__(self, platform, sys_clk_freq, ident="LiteX SoC on Zybo Z7/original Zybo", **kwargs)
        
        # USB Host ---------------------------------------------------------------------------------
        if with_usb_host:
            self.submodules.usb_ohci = USBOHCI(platform, platform.request("usb_host"), usb_clk_freq=int(48e6))
            self.bus.add_slave("usb_ohci_ctrl", self.usb_ohci.wb_ctrl, region=SoCRegion(origin=self.mem_map["usb_ohci"], size=0x100000, cached=False))
            #self.bus.add_slave("usb_ohci_ctrl", self.usb_ohci.wb_ctrl)
            self.dma_bus.add_master("usb_ohci_dma", master=self.usb_ohci.wb_dma)
            self.comb += self.cpu.interrupt[16].eq(self.usb_ohci.interrupt)
        
        # Zynq7000 Integration ---------------------------------------------------------------------
        if kwargs.get("cpu_type", None) == "zynq7000":
            self.cpu.use_rom = True
            if variant in ["z7-10", "z7-20", "original"]:
                # Get and set the pre-generated .xci FIXME: change location? add it to the repository? Make config
                os.makedirs("xci", exist_ok=True)
                os.system("wget https://github.com/litex-hub/litex-boards/files/8339591/zybo_z7_ps7.txt")
                os.system("mv zybo_z7_ps7.txt xci/zybo_z7_ps7.xci")
                self.cpu.set_ps7_xci("xci/zybo_z7_ps7.xci")
            else:
                self.cpu.set_ps7(name="ps", config = platform.ps7_config)

            # Connect AXI GP0 to the SoC with base address of 0x40000000 (default one)
            wb_gp0  = wishbone.Interface()
            self.submodules += axi.AXI2Wishbone(
                axi          = self.cpu.add_axi_gp_master(),
                wishbone     = wb_gp0,
                base_address = 0x40000000)
            self.bus.add_master(master=wb_gp0)
            #TODO memory size dependend on board variant
            self.bus.add_region("sram", SoCRegion(
                origin = self.cpu.mem_map["sram"],
                size   = 512 * 1024 * 1024 - self.cpu.mem_map["sram"])
            )
            self.bus.add_region("rom", SoCRegion(
                origin = self.cpu.mem_map["rom"],
                size   = 256 * 1024 * 1024 // 8,
                linker = True)
            )
            self.constants["CONFIG_CLOCK_FREQUENCY"] = 666666687
            self.bus.add_region("flash", SoCRegion(
                origin = 0xFC00_0000,
                size = 0x4_0000,
                mode = "rwx")
            )

        # PS7 as Slave Integration ---------------------------------------------------------------------
        elif with_ps7:
            cpu_cls = cpu.CPUS["zynq7000"]
            zynq = cpu_cls(self.platform, "standard") # zynq7000 has no variants
            zynq.set_ps7(name="ps", config = platform.ps7_config)
            #axi_M_GP0 = zynq.add_axi_gp_master()
            #self.bus.add_master(master=axi_M_GP0)
            axi_S_HP0     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_HP1     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_HP2     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_HP3     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_GP0     = zynq.add_axi_gp_slave(clock_domain = self.crg.cd_sys.name)
            hp_ports      = [axi_S_HP0, axi_S_HP1, axi_S_HP2, axi_S_HP3]

            # PS7 DDR3 Interface -----------------------------
            ddr_addr      = self.cpu.mem_map["main_ram"]
            #map_fct_ddr   = lambda sig : sig - ddr_addr + 0x0008_0000
            map_fct_ddr   = lambda sig : sig - ddr_addr + 0x0010_0000
            sdram_size = 0x4000_0000
            
            if hasattr(self.cpu, "add_memory_buses"):
                self.cpu.add_memory_buses(address_width = 32, data_width = 64)
            
            if len(self.cpu.memory_buses): # if CPU has dedicated memory bus
                print("--------Connecting DDR to direct RAM port of the softcore using HP bus.--------")
                i = 0  # index of the PS7 HP port used for this memory bus
                for mem_bus in self.cpu.memory_buses:
                    axi_ddr = axi.AXIInterface(hp_ports[i].data_width, hp_ports[i].address_width, "byte", hp_ports[i].id_width)
                    self.comb += axi_ddr.connect_mapped(hp_ports[i], map_fct_ddr)
                    data_width_ratio = int(axi_ddr.data_width/mem_bus.data_width)
                    print("Connecting: ", str(mem_bus), " to ", str(axi_ddr))
                    print("CPU memory bus data width: ", mem_bus.data_width, " bits")
                    print("DDR bus data width: ", axi_ddr.data_width, " bits")
                    print("CPU memory bus address width: ", mem_bus.address_width, " bits")
                    print("DDR bus address width: ", axi_ddr.address_width, " bits")
                    print("CPU memory bus id width: ", mem_bus.id_width, " bits")
                    print("DDR bus id width: ", axi_ddr.id_width, " bits")
                    # Connect directly
                    if data_width_ratio == 1:
                        print("Direct connection")
                        self.comb += mem_bus.connect(axi_ddr)
                    # UpConvert
                    elif data_width_ratio > 1:
                        print("UpConversion")
                        axi_port = axi.AXIInterface(data_width = axi_ddr.data_width, addressing="byte", id_width = len(mem_bus.aw.id))
                        self.submodules += axi.AXIUpConverter(axi_from = mem_bus, axi_to = axi_port,)
                        self.comb += axi_port.connect(axi_ddr)
                    # DownConvert
                    else:
                        print("DownConversion")
                        axi_port = axi.AXIInterface(data_width = axi_ddr.data_width, addressing="byte", id_width = len(mem_bus.aw.id))
                        self.submodules += axi.AXIDownConverter(axi_from = mem_bus, axi_to = axi_port,)
                        self.comb += axi_port.connect(axi_ddr)
                    i = i + 1
                # Add SDRAM region
                origin = None
                main_ram_region = SoCRegion(
                    origin = self.mem_map.get("main_ram", origin),
                    size   = sdram_size,
                    mode   = "rwx")
                self.bus.add_region("main_ram", main_ram_region)
            else:
                print("--------Connecting DDR to general bus of the softcore using GP bus.--------")
                axi_ddr = axi.AXIInterface(axi_S_GP0.data_width, axi_S_GP0.address_width, "byte", axi_S_GP0.id_width)
                #axi_ddr = axi.AXIInterface(axi_S_HP0.data_width, axi_S_HP0.address_width, addressing="byte", axi_S_HP0.id_width)
                self.comb += axi_ddr.connect_mapped(axi_S_GP0, map_fct_ddr)
                #self.comb += axi_ddr.connect_mapped(axi_S_HP0, map_fct_ddr)
                
                self.bus.add_slave(
                   name="main_ram",slave=axi_ddr,
                    region=SoCRegion(
                        origin=ddr_addr,
                        size=sdram_size,
                        mode="rwx"
                    )
                )
            print("---------------------------- End ----------------------------------------------")
            
        
        # Video VGA ------------------------------------------------------------------------------------
        if with_video_terminal or with_video_framebuffer:
            if with_video_terminal:
                self.videophy = VideoVGAPHY(platform.request("vga"), clock_domain="vga")
                self.add_video_terminal(phy=self.videophy, timings="800x600@60Hz", clock_domain="vga")
            if with_video_framebuffer:
                #TODO
                print("Not implemented yet!")
                
        # Video HDMI ------------------------------------------------------------------------------------
        if with_hdmi_video_terminal or with_hdmi_video_framebuffer:
            if with_hdmi_video_terminal:
                self.videophy = VideoS7HDMIPHY(platform.request("hdmi_out"), clock_domain="hdmi")
                self.add_video_terminal(phy=self.videophy, timings="800x600@60Hz", clock_domain="hdmi")
            if with_hdmi_video_framebuffer:
                from my_modules import dvi_framebuffer
                platform.add_source("./my_modules/dvi_framebuffer.v")
                self.cfg_bus = cfg_bus = axi.AXILiteInterface(address_width=32, data_width=32, addressing="byte")
                axi_S_GP1 = zynq.add_axi_gp_slave(clock_domain = self.crg.cd_hdmi.name)
                self.out_bus = out_bus = axi.AXIInterface(axi_S_GP1.data_width, axi_S_GP1.address_width, "byte", axi_S_GP1.id_width)
                self.comb += out_bus.connect_mapped(axi_S_GP1, map_fct_ddr)
                self.submodules.hdmi_framebuffer = hdmi_framebuffer = dvi_framebuffer.dvi_framebuffer(self.crg.cd_hdmi.clk, self.crg.cd_hdmi5x.clk, self.crg.rst, Signal(), cfg_bus, out_bus, platform.request("hdmi_out"))
                self.bus.add_slave("framebuffer_ctrl", cfg_bus, region=SoCRegion(origin=0x87000000, size=0x10000, mode="rw", cached=False))
                
        #Leds -------------------------------------------------------------------------------------
        if with_led_chaser:
            self.leds = LedChaser(
                pads         = platform.request_all("user_led"),
                sys_clk_freq = sys_clk_freq)
        
        # XADC -------------------------------------------------------------------------------------
        if with_xadc:
            self.xadc = XADC()
            
        # DNA --------------------------------------------------------------------------------------
        if with_dna:
            self.dna = DNA()
            self.dna.add_timing_constraints(platform, sys_clk_freq, self.crg.cd_sys.clk)

    def finalize(self, *args, **kwargs):
        super(BaseSoC, self).finalize(*args, **kwargs)
        if self.cpu_type == "zynq7000":
            libxil_path = os.path.join(self.builder.software_dir, 'libxil')
            os.makedirs(os.path.realpath(libxil_path), exist_ok=True)
            lib = os.path.join(libxil_path, 'embeddedsw')
            if not os.path.exists(lib):
                os.system("git clone --depth 1 https://github.com/Xilinx/embeddedsw {}".format(lib))

            os.makedirs(os.path.realpath(self.builder.include_dir), exist_ok=True)
            for header in [
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps_hw.h',
                'lib/bsp/standalone/src/common/xil_types.h',
                'lib/bsp/standalone/src/common/xil_assert.h',
                'lib/bsp/standalone/src/common/xil_io.h',
                'lib/bsp/standalone/src/common/xil_printf.h',
                'lib/bsp/standalone/src/common/xstatus.h',
                'lib/bsp/standalone/src/common/xdebug.h',
                'lib/bsp/standalone/src/arm/cortexa9/xpseudo_asm.h',
                'lib/bsp/standalone/src/arm/cortexa9/xreg_cortexa9.h',
                'lib/bsp/standalone/src/arm/cortexa9/xil_cache.h',
                'lib/bsp/standalone/src/arm/cortexa9/xparameters_ps.h',
                'lib/bsp/standalone/src/arm/cortexa9/xil_errata.h',
                'lib/bsp/standalone/src/arm/cortexa9/xtime_l.h',
                'lib/bsp/standalone/src/arm/common/xil_exception.h',
                'lib/bsp/standalone/src/arm/common/gcc/xpseudo_asm_gcc.h',
            ]:
                shutil.copy(os.path.join(lib, header), self.builder.include_dir)
            write_to_file(os.path.join(self.builder.include_dir, 'bspconfig.h'),
                        '#define FPU_HARD_FLOAT_ABI_ENABLED 1')
            write_to_file(os.path.join(self.builder.include_dir, 'xparameters.h'), '''
#ifndef __XPARAMETERS_H
#define __XPARAMETERS_H

#include "xparameters_ps.h"

#define STDOUT_BASEADDRESS            XPS_UART1_BASEADDR
#define XPAR_PS7_DDR_0_S_AXI_BASEADDR 0x00100000
#define XPAR_PS7_DDR_0_S_AXI_HIGHADDR 0x3FFFFFFF
#endif
''')

        elif self.with_ps7:
            libxil_path = os.path.join(self.builder.software_dir, 'libxil')
            os.makedirs(os.path.realpath(libxil_path), exist_ok=True)
            lib = os.path.join(libxil_path, 'embeddedsw')
            if not os.path.exists(lib):
                os.system("git clone --depth 1 https://github.com/Xilinx/embeddedsw {}".format(lib))

            os.makedirs(os.path.realpath(self.builder.include_dir), exist_ok=True)
            for header in [
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps_hw.h',
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps.h',
                'lib/bsp/standalone/src/common/xil_types.h',
                'lib/bsp/standalone/src/common/xil_assert.h',
                'lib/bsp/standalone/src/common/xil_io.h',
                'lib/bsp/standalone/src/common/xil_printf.h',
                'lib/bsp/standalone/src/common/xplatform_info.h',
                'lib/bsp/standalone/src/common/xstatus.h',
                'lib/bsp/standalone/src/common/xdebug.h'
            ]:
                shutil.copy(os.path.join(lib, header), self.builder.include_dir)
            write_to_file(os.path.join(self.builder.include_dir, 'uart_ps.h'), '''
#ifdef __cplusplus
extern "C" {
#endif
#include "xuartps_hw.h"
#include "system.h"
#define CSR_UART_BASE
#define UART_POLLING
static inline void uart_rxtx_write(char c) {
    XUartPs_WriteReg(STDOUT_BASEADDRESS, XUARTPS_FIFO_OFFSET, (uint32_t) c);
}
static inline uint8_t uart_rxtx_read(void) {
    return XUartPs_ReadReg(STDOUT_BASEADDRESS, XUARTPS_FIFO_OFFSET);
}
static inline uint8_t uart_txfull_read(void) {
    return XUartPs_IsTransmitFull(STDOUT_BASEADDRESS);
}
static inline uint8_t uart_rxempty_read(void) {
    return !XUartPs_IsReceiveData(STDOUT_BASEADDRESS);
}
static inline void uart_ev_pending_write(uint8_t x) { }
static inline uint8_t uart_ev_pending_read(void) {
    return 0;
}
static inline void uart_ev_enable_write(uint8_t x) { }
#ifdef __cplusplus
}
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xil_cache.h'), '''
#ifndef XIL_CACHE_H
#define XIL_CACHE_H
#include "xil_types.h"
#include "xparameters.h"
#include "system.h"
#ifdef __cplusplus
extern "C" {
#endif
void Xil_DCacheFlush(void);
void Xil_ICacheFlush(void);
void Xil_L2CacheFlush(void);
#ifdef __cplusplus
}
#endif
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xil_cache.c'), '''
#include "system.h"
void Xil_DCacheFlush(void){
    flush_cpu_dcache();
}
void Xil_ICacheFlush(void) {
    flush_cpu_icache();
}
void Xil_L2CacheFlush(void) {
    flush_l2_cache();
}
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xparameters.h'), '''
#ifndef __XPARAMETERS_H
#define __XPARAMETERS_H
#include "generated/mem.h"
#define STDOUT_BASEADDRESS            PS_IO_BASE + 0x1000
#define STDIN_BASEADDRESS             PS_IO_BASE + 0x1000
#define XPAR_PS7_DDR_0_S_AXI_BASEADDR MAIN_RAM_BASE
#define XPAR_PS7_DDR_0_S_AXI_HIGHADDR MAIN_RAM_BASE + MAIN_RAM_SIZE
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xpseudo_asm.h'), '''
#ifndef XPSEUDO_ASM_H
#define XPSEUDO_ASM_H
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'bspconfig.h'), '''
#ifndef XPSEUDO_ASM_H
#define XPSEUDO_ASM_H
#endif
''')


# Build --------------------------------------------------------------------------------------------

def main():
    from litex.build.parser import LiteXArgumentParser
    parser = LiteXArgumentParser(platform=digilent_zybo_z7_20.Platform, description="LiteX SoC on Zybo Z7/original Zybo")
    parser.add_target_argument("--sys-clk-freq",          default=125e6, type=float,     help="System clock frequency.")
    parser.add_target_argument("--variant",               default="original",            help="Board variant (z7-10, z7-20 or original).")
    parser.add_target_argument("--with-ps7",              action="store_true",           help="Add the PS7 as slave for soft CPUs.")
    parser.add_target_argument("--with-usb-host",         action="store_true",           help="Enable USB host support.(PMOD)")
    parser.add_target_argument("--with-xadc",             action="store_true",           help="Enable 7-Series XADC.")
    parser.add_target_argument("--with-dna",              action="store_true",           help="Enable 7-Series DNA.")
    sdopts = parser.target_group.add_mutually_exclusive_group()
    sdopts.add_argument("--with-spi-sdcard",              action="store_true",           help="Enable SPI-mode SDCard support.(PMOD)")
    sdopts.add_argument("--with-sdcard",      		  action="store_true",      	 help="Enable SDCard support.(PMOD)")
    viopts = parser.target_group.add_mutually_exclusive_group()
    viopts.add_argument("--with-video-terminal",          action="store_true",           help="Enable Video Terminal (VGA).")
    viopts.add_argument("--with-video-framebuffer",       action="store_true",           help="Enable Video Framebuffer (VGA).")
    viopts.add_argument("--with-hdmi-video-terminal",     action="store_true",           help="Enable Video Terminal (HDMI).")
    viopts.add_argument("--with-hdmi-video-framebuffer",  action="store_true",           help="Enable Video Framebuffer (HDMI).")
    args = parser.parse_args()

    soc = BaseSoC(
        sys_clk_freq = args.sys_clk_freq,
        variant = args.variant,
        with_ps7 = args.with_ps7,
        with_xadc = args.with_xadc,
        with_dna = args.with_dna,
        with_usb_host = args.with_usb_host,
        with_video_terminal = args.with_video_terminal,
        with_video_framebuffer = args.with_video_framebuffer,
        with_hdmi_video_terminal = args.with_hdmi_video_terminal,
        with_hdmi_video_framebuffer = args.with_hdmi_video_framebuffer,
        **soc_core_argdict(args)
    )
    
    if args.with_spi_sdcard:
        soc.platform.add_extension(digilent_zybo_z7_20._sd_card_pmod_io)
        soc.add_spi_sdcard(software_debug=True)
    if args.with_sdcard:
        soc.platform.add_extension(digilent_zybo_z7_20._sd_card_pmod_io)
        soc.add_sdcard(software_debug=True)
    
    builder = Builder(soc, **builder_argdict(args))
    
    if args.cpu_type == "zynq7000" or args.with_ps7:
        soc.builder = builder
        builder.add_software_package('libxil')
        builder.add_software_library('libxil')
    if args.build:
        builder.build(**parser.toolchain_argdict)
    if args.load:
        prog = soc.platform.create_programmer()
        prog.load_bitstream(builder.get_bitstream_filename(mode="sram"), device=1)

if __name__ == "__main__":
    main()
```

Maybe the reason why this isn't working is that if, let's say, we specify that the memory is on the AXI bus at 0x40000000, then NaxRiscv can access it, but the memory accesses on that AXI bus will be emitted without that 0x40000000 offset.

I don't know what the expected behaviour is from the zynq / litex side.

I don't quite understand now. If it has it on the mbus, then accesses on the mbus are done without the 0x40000000 offset? So an access to 0x45000000 through the mbus emits address 0x05000000?

So an access to 0x45000000 through the mbus emits address 0x05000000?

Yes, I need to double check, but that is quite possible.
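For reference, whether the 0x40000000 base survives is just a question of the address mapping applied when bridging onto the PS7 HP port. A minimal sketch along the lines of the `map_fct_ddr` lambda already used in the target file above (the 0x0010_0000 offset is the value taken from that file); which of the two mappings is correct depends on whether the CPU/interconnect has already stripped the base:

```python
# If the CPU strips the main_ram base, an access to 0x4500_0000 arrives on the
# bus as 0x0500_0000 and the base must NOT be subtracted a second time.
main_ram_base = 0x4000_0000
ps_ddr_offset = 0x0010_0000  # offset used in the target above for the PS DDR

# Bus address still carries the full 0x4xxx_xxxx address:
map_full_addr     = lambda addr: addr - main_ram_base + ps_ddr_offset
# Base already removed by the CPU/interconnect:
map_stripped_addr = lambda addr: addr + ps_ddr_offset

# e.g. self.comb += axi_ddr.connect_mapped(axi_S_HP0, map_full_addr)
```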

I tried it with the mbus connected to DRAM and with the bus address offset (the 0x40000000) stripped; still, when the DRAM is connected to the mbus it locks up at the boot-up mem_test. I don't know what's wrong. Before experimenting with NaxRiscv I was using Rocket, which had its memory bus connected straight to the DRAM, and it worked like a charm. Before that I used VexRiscv, which had the same main_ram address 0x40000000 as NaxRiscv, and it was working too. Even Microwatt and Serv were able to work with it. NaxRiscv is the only one doing this. I'm out of ideas.

I'm looking at it. Trying to get the offset to be preserved.

Also, did you try the vexriscv_smp CPU?

Yes, and it was working and booting Linux just fine, but not with the memory bus to DRAM, as the memory bus of vexriscv_smp has a LiteDRAM interface, not AXI4 like the PS7 of the Zynq. I also successfully booted Linux on Rocket and Microwatt, and I wanted to move to NaxRiscv for performance reasons; I can also fit 2 NaxRiscv cores into my FPGA, while with the Linux variant of Rocket I can only fit one.

Very strange behaviour: through the mbus it locks up at mem_test even when I specified no L2 cache. I also tried it again with DRAM on the pbus without L2 to see if it makes a difference. Still the same behaviour.
[screenshot]
EDIT:
Now I stumbled upon an interesting thing: it has problems with some addresses in the 0x40000000-0x41000000 region (it seems to be around address 0x40c00000) and then on every 0xX0be0000; everything in between tests OK... strange.
[screenshot]
[screenshot]

Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot:
[screenshot]
Even Linux booted when I specified in the device tree that those addresses are reserved memory regions.
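If jtagbone or uartbone is enabled on the SoC, one way to narrow such address-dependent failures down is to scan the suspect range from the host through litex_server. A minimal sketch, assuming litex_server is already running against the board and using the standard RemoteClient API; the base/size/step values are just examples:

```python
#!/usr/bin/env python3
# Host-side scan of main_ram through the litex_server bridge.
from litex import RemoteClient

bus = RemoteClient()
bus.open()

base, size, step = 0x4000_0000, 0x0100_0000, 0x0002_0000  # 16 MiB in 128 KiB steps
pattern = 0xDEADBEEF
for addr in range(base, base + size, step):
    bus.write(addr, pattern)
    read = bus.read(addr)
    if read != pattern:
        print(f"FAIL @ 0x{addr:08x}: wrote 0x{pattern:08x}, read 0x{read:08x}")

bus.close()
```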

Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot:

I do not expect it to boot in that configuration. Also, performance will be very bad.

I pushed a potential fix with #1940.

With this one, mbus accesses will preserve the full 32-bit address, instead of removing the 0x40000000 offset.

I don't have any Zynq board, so let me know how it goes :)

Also, keep in mind that VexiiRiscv is very close to feature parity, with performance not too far away. (WIP)

I tried it, but it still locks up on mem_test when the DRAM is connected to the mbus:
[screenshot]
I don't understand this behaviour; it's very strange.

Did you check that it passes timing?

Yes, it passes. Timings are all positive; no negative setup or hold slack.
[screenshot]

Weird, I just tested on the Digilent Nexys Video and it can run Debian just fine.
Did you delete the pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/NaxRiscvLitex_????.v files before retrying with the fixes?
Otherwise it will not regenerate the NaxRiscv SoC and will reuse the cached one.

Yes, I deleted the generated Verilog files. I don't understand this strange behaviour at all. My goal is to boot Debian/Fedora on it, as I did with Rocket. I am thinking of adding LiteScope or jtagbone to the SoC to see the signals on the mbus<->DRAM bus and figure out what is going on.
EDIT: Alternatively, I can give you remote access to my workstation so you can check out the code and the behaviour of the SoC faster.

to see the signals on the mbus<->DRAM bus and figure out what is going on.

Yes, that would be the way to proceed: probing the mbus, as well as the dbus that comes out of the CPU itself.

Ideally, instead of relying on hardware debug, we would run a simulation; that would give us full visibility on what is happening.

I don't know if simulation would do anything, as the mbus is connected to the PS7 block, which contains hardened components, not softcores. Also, you can't connect to the UART in a Vivado simulation. So the only way to usefully debug it is the debug options in LiteX.

Here it is with the Rocket softcore, with Rocket's memory bus connected straight to the DRAM. Working, as with every other softcore except NaxRiscv.
[screenshot]
Even the whole RAM is there and OK.
[screenshot]

Can you send your custom board files for me to recreate this? Maybe it is the memory region definition that is messed up.

The target and platform files?

yes

Here you go: custom_zybo.zip
It contains the platform and target files, and also a modified Zynq7000 core file so that the HP ports take their ACLK from the softcore bus; otherwise it will not work.
Also use this pull request, I'm using that function: https://github.com/enjoy-digital/litex/pull/1522

Any news?

Just tried now.

[info] LitexMemoryRegion(SM(0x80000000, 0x80000000),io,p)
[info] LitexMemoryRegion(SM(0x0, 0x20000),rxc,p)
[info] LitexMemoryRegion(SM(0x10000000, 0x2000),rwxc,p)
[info] LitexMemoryRegion(SM(0x40000000, 0x40000000),rwxc,p)
[info] LitexMemoryRegion(SM(0xf0000000, 0x10000),rw,p)

0x40000000, 0x40000000 => address, size

which would normally mean that there should be 1 GB accessible at 0x40000000.

The thing is, "[info] LitexMemoryRegion(SM(0x80000000, 0x80000000),io,p)" is overlapping that memory range.
You need to push that memory region (0x80000000, 0x80000000) to (0xC0000000, 0x40000000).

Note it should not be on ",p)" but on ",m)", else you will get bad performance (meaning not pbus, but mbus).

io_regions = {0x4000_0000: 0xbc00_0000} # Origin, Length.
Does it mean that the DDR is mapped as if it were an io region?

Uncomment these lines:

#if hasattr(self.cpu, "add_memory_buses"):
    #self.cpu.add_memory_buses(address_width = 32, data_width = 64)

That would connect the DRAM to the mbus instead of the pbus. Also, you can change the sdram_size parameter to half so it doesn't overlap with anything (if I understood correctly).
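For reference, the uncommented lines in the target's BaseSoC look like this (this simply reproduces the file shared above):

```python
# Request dedicated memory buses from the CPU so that the DDR is reached
# through the mbus (the cached/coherent path) instead of the peripheral bus.
if hasattr(self.cpu, "add_memory_buses"):
    self.cpu.add_memory_buses(address_width=32, data_width=64)
```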

The thing is, "[info] LitexMemoryRegion(SM(0x80000000, 0x80000000),io,p)" is overlapping that memory range.

Ahhh, forget that, my bad XD

So, the only way to debug, I would say, would be with the logic analyser.

Also one thing: with NaxRiscv you really need to use the argument --bus-standard axi-lite.
The axi-lite <> wishbone conversion is bugged, I think. That may explain the crashes you had before (not talking about the memory range).

Yeah, I know that. I have the pbus as axi-lite and the mbus to DRAM is full AXI4. I don't usually use Wishbone when it's not specifically needed; I try to always use the bus standard the core has as its native output. I also tried to add the logic analyser but I can't figure it out; I started writing the necessary parts into the code, as you can see in the files. Maybe you can write it in properly; when I tried it, I got no output on the console and no output from the analyser.

Hmm, so yes, at this point we really need to add probes / a logic analyser and trace memory accesses at 0x60000000 to see how far they reach.
Also, note that today I also got VexiiRiscv to run Debian; there is one hardware bug I know of which triggers an instruction access fault on some specific binaries, but it should be fixed soon.
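For reference, a minimal LiteScope sketch of what such a probe could look like inside the target's BaseSoC, after the memory buses have been created (here `mem_bus` stands for one of `self.cpu.memory_buses`; the signal selection is just an example, and jtagbone/uartbone plus litex_server are assumed for reading the capture back):

```python
# Capture a few mbus AXI handshake signals and dump them from the host with
# litex_server + litescope_cli.
from litescope import LiteScopeAnalyzer

analyzer_signals = [
    mem_bus.aw.valid, mem_bus.aw.ready, mem_bus.aw.addr,
    mem_bus.w.valid,  mem_bus.w.ready,
    mem_bus.b.valid,  mem_bus.b.ready,  mem_bus.b.resp,
    mem_bus.ar.valid, mem_bus.ar.ready, mem_bus.ar.addr,
    mem_bus.r.valid,  mem_bus.r.ready,
]
self.submodules.analyzer = LiteScopeAnalyzer(analyzer_signals,
    depth        = 512,
    clock_domain = "sys",
    csr_csv      = "analyzer.csv")
```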

Wait, what? VexRiscv running Debian? Distros require RV64GC and VexRiscv is RV32GC, or did I miss something?

did I miss something?

Yes, you missed the "ii"
VexiiRiscv, not VexRiscv ^^

Ooooh, now I've checked it out and it looks cool. When will it get integrated into LiteX? How did you get it onto the FPGA board? I'm totally new and lost in Spinal/Scala.

There is a PR which is WIP and not up to date: #1923

I will probably need a few more weeks to get things fixed and cleaned up.

I will try it out. Can you check out the analyser in the files I gave you to see if it's okay?

I will try it out.

I will let you know when it is ready ^^ don't bother until then.

Can you check out the analyser in the files I gave you to see if it's okay?

Which file? I haven't seen any. Yes, I can.

The target/platform files. I added it to them before you asked for them, so you can see it there.

Ahhhh, I did try them, to check the memory region things.
But I can't test anything on hardware.

@JoyBed Got the VexiiRiscv bug fixed; everything seems stable now. A bit of timing optimization remains, and then it should be good for use.

@JoyBed #1923 is now good.

Still a few optimisations to do, but that takes time ^^

I'm using it to run Debian with, for instance:

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv  --with-jtag-tap  --bus-standard axi-lite --vexii-args=" \
--allow-bypass-from=0 --debug-privileged --with-mul --with-div --div-ipc --with-rva --with-supervisor --performance-counters 9 \
--regfile-async --xlen=64 --with-rvc --with-rvf --with-rvd --fma-reduced-accuracy \
--fetch-l1 --fetch-l1-ways=4 --fetch-l1-mem-data-width-min=64 \
--lsu-l1 --lsu-l1-ways=4  --lsu-l1-mem-data-width-min=64 --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2  --with-lsu-bypass \
--with-btb --with-ras --with-gshare --relaxed-branch"  --cpu-count=4 --with-jtag-tap  --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144 --update-repo=no  --sys-clk-freq 100000000 

How many LUTs does a dual-core VexiiRiscv take?

For which ISA?
RV64IMAFDC to run Debian?
Or something softcore-friendly like RV32IMA to just run Linux?

For RV64IMAFDC, single issue, everything enabled with memory coherency and a single core, it is around 12K LUTs at nearly 100 MHz on Artix-7 -1 (slow speed grade).
The FPU takes a lot of space, around 5K LUTs per core. RVC is also a pain in the ass ^^

Wow, a Debian-capable core in only 12K LUTs? That's amazing! I could then comfortably fit even 4 of them in my FPGA!

Note: a recent change in LiteX broke things XD It works up to LiteX 86a43c9.

Will just reverting this fix things?

I mean, you can check out 86a43c9 and it will work, but later it will not.

Otherwise there are two commits to revert to get things to work:

I don't have these in my local checkout, so I don't need to worry about them. My local is like a month old.

I updated the Vexii with a fix. Now it works with upstream LiteX.

Funny, for some reason I can't check out your PR. Never mind though, I will manually add the needed files.

Also a quick question: does VexiiRiscv keep the offset when talking through the memory bus or not?

@JoyBed Ahhh right, Vexii was using the old code which was removing the offset.
I just pushed a fix in this PR; it should be good now.

Either way, with VexiiRiscv I have the same problem as with NaxRiscv: a lockup when the DRAM is connected through the memory bus, but through the peripheral bus it works.

It locks up only if you try to use more than 512 MB of RAM, right?

No, at any amount of DRAM. I tried from 32 MB all the way up to 1 GB.

Now, when I was checking it out, why are the args for FPU and RVC commented out? I was wondering why Linux was not booting. Also, isn't it done like in the old VexRiscv, where the FPU can be configured to be shared between cores?

Now, when I was checking it out, why are the args for FPU and RVC commented out?

Do you mean things around :
https://github.com/SpinalHDL/VexiiRiscv/blob/4d2ff4b29d04bf033239ace06fb0f61b3600362d/src/main/scala/vexiiriscv/Param.scala#L163 ?

It is only for debug and isn't enabled for LiteX.
See:

  //  Debug modifiers
  val debugParam = sys.env.getOrElse("VEXIIRISCV_DEBUG_PARAM", "0").toInt.toBoolean
  if(debugParam) {

As long as you feed LiteX with:

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv  --with-jtag-tap  --bus-standard axi-lite --vexii-args=" \
--allow-bypass-from=0 --debug-privileged --with-mul --with-div --div-ipc --with-rva --with-supervisor --performance-counters 9 \
--regfile-async --xlen=64 --with-rvc --with-rvf --with-rvd --fma-reduced-accuracy \
--fetch-l1 --fetch-l1-ways=4 --fetch-l1-mem-data-width-min=64 \
--lsu-l1 --lsu-l1-ways=4  --lsu-l1-mem-data-width-min=64 --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2  --with-lsu-bypass \
--with-btb --with-ras --with-gshare --relaxed-branch"  --cpu-count=4 --with-jtag-tap  --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144 --update-repo=no  --sys-clk-freq 100000000  --build

it should be Debian-ready.

Also, isn't it done like in the old VexRiscv, where the FPU can be configured to be shared between cores?

Right, it isn't.
Currently, the FPU is tightly integrated into the pipeline via some plugins:
https://github.com/SpinalHDL/VexiiRiscv/blob/4d2ff4b29d04bf033239ace06fb0f61b3600362d/src/main/scala/vexiiriscv/Param.scala#L660

It would be great to have a lighter alternative (with less performance).
I was thinking that maybe an FSM-like FPU (instead of a fully pipelined one) would allow reusing a lot of hardware.
Or FPU sharing like in VexRiscv, yes.
I implemented things the way they are now because I wanted to aim at a full-performance, tightly coupled FPU, which would also work great in an ASIC.

Through peripheral memory, did you get Linux to work?

On my side, quad-core Debian is very, very stable; I did a lot of tests with it. Also, things like USB host / Bluetooth / SD card / Ethernet are working well.

Yes, through the peripheral bus it's working. But the arguments --with-rvc --with-rvf --with-rvd gave an error as not recognised, so I checked core.py and they are commented out. Also, some lines weren't in the right order in the core.py file, so I reorganised it a bit. When I get home I will send you the core.py I modified from your original one in the PR.

--with-rvc --with-rvf --with-rvd

Not recognized by the Python LiteX side itself, or by the SpinalHDL generation?
Note that --with-rvc --with-rvf --with-rvd are part of the --vexii-args; they aren't fed directly to LiteX itself:
--other-args --vexii-args=" ... --with-rvc --with-rvf --with-rvd ... " --other-args

Yes, they are not recognised by SpinalHDL.
[screenshot]

EDIT: Never mind, I see the error. I revised core.py a bit so there's no need to call NaxRiscv for the repo update. Wanna see it?

Ahhh, I think you have an old version of VexiiRiscv then.
I gave you the command with --update-repo=no; I didn't notice it. It should be --update-repo=recommended.

That may explain a lot.

Then pythondata-cpu-vexiiriscv/pythondata_cpu_vexiiriscv/verilog/ext/vexiiriscv should be on "fpu_internal", "8a239d10" (after running the SoC generation).

Yes, but with the core.py version in the PR, update_repo doesn't work; I fixed it and also removed the dependency on NaxRiscv in the process. Yes, now the pythondata is on "fpu_internal". Can we communicate on some other platform so we can get things done and close this issue faster?

Sure, here is my Discord: dolu1990
Would that work for you?

Yup, I sent you a friend request.

Hi @Dolu1990, @JoyBed,

have you been able to fix/understand the issue/limitation while discussing directly?

Hi @enjoy-digital! Actually, we pinned down the problem. It not only affects NaxRiscv's mbus but also other CPUs' memory buses; it's not a problem of the softcores though. The problem is that the Zynq-7000 has older AXI3 ports, while basically everything here is AXI4. AXI4 masters have no WID signal while AXI3 expects one, and the PS7 block locks up when receiving an AWID of any value other than 0, because the WID input on the PS7 is left unconnected and is therefore interpreted as 0. We are trying to make a bridge between the AXI4 master and the AXI3 slave. The only unaffected softcore is Rocket, as its TileLink-to-AXI4 bridge uses AWID = 0, so the lack of WID is not a problem.
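To make the failure mode concrete: AXI4 dropped WID because write data beats must follow the AW order, so an AXI4 master drives nothing onto the PS7's AXI3 WID input, and the port misbehaves whenever AWID is non-zero. A minimal Migen sketch of the WID-regeneration part of such a bridge, under the assumption that write data follows address order; backpressure (stalling AW when the ID FIFO is full) is omitted for brevity, and the widths are placeholders:

```python
from migen import Module, Signal
from migen.genlib.fifo import SyncFIFO

class AXI3WIDGen(Module):
    """Sketch only: regenerate the AXI3 WID expected by the PS7 HP ports by
    replaying accepted AWIDs, in order, onto the W channel."""
    def __init__(self, id_width=6, depth=8):
        # AXI4 master side (only the handshake signals needed here).
        self.awvalid = Signal()
        self.awready = Signal()
        self.awid    = Signal(id_width)
        self.wvalid  = Signal()
        self.wready  = Signal()
        self.wlast   = Signal()
        # AXI3 slave side.
        self.wid     = Signal(id_width)

        self.submodules.ids = ids = SyncFIFO(width=id_width, depth=depth)
        self.comb += [
            ids.we.eq(self.awvalid & self.awready),             # push the ID of each accepted AW
            ids.din.eq(self.awid),
            self.wid.eq(ids.dout),                              # present it as WID
            ids.re.eq(self.wvalid & self.wready & self.wlast),  # pop after the last data beat
        ]
```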

Thanks @JoyBed for the feedback. Regarding your application, do you think an improvement should be made to LiteX to at least prevent things from building? If you could provide more information about the Zynq7000 integration you are doing and a minimal repro, we could try to raise an error if this case is not supposed to be supported and shouldn't build.