Skip to content
  • Kategorien
  • Aktuell
  • Tags
  • Beliebt
  • World
  • Benutzer
  • Gruppen
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Standard: (Kein Skin)
  • Kein Skin
Einklappen

other.li Forum

  1. Übersicht
  2. Uncategorized
  3. Optimizing IRQ latency on the STM32H743 @ 480 MHz, perhaps for NES ROM emulation...

Optimizing IRQ latency on the STM32H743 @ 480 MHz, perhaps for NES ROM emulation...

Geplant Angeheftet Gesperrt Verschoben Uncategorized
electronicsstm32
7 Beiträge 1 Kommentatoren 0 Aufrufe
  • Älteste zuerst
  • Neuste zuerst
  • Meiste Stimmen
Antworten
  • In einem neuen Thema antworten
Anmelden zum Antworten
Dieses Thema wurde gelöscht. Nur Nutzer mit entsprechenden Rechten können es sehen.
  • ? Offline
    ? Offline
    Gast
    schrieb am zuletzt editiert von
    #1

    Optimizing IRQ latency on the STM32H743 @ 480 MHz, perhaps for NES ROM emulation... Best result so far: 100 nanoseconds input-to-output latency when the vector table and the IRQ handler are relocated to Tightly-Coupled Memory without making HAL calls. Not bad, but the GPIO controller (several buses away) looks like the real performance killer here. WARNING: buggy code, see correction https://mk.absturztau.be/notes/ajvb448y305b01i4. #electronics #STM32

    ? 1 Antwort Letzte Antwort
    0
    • ? Gast

      Optimizing IRQ latency on the STM32H743 @ 480 MHz, perhaps for NES ROM emulation... Best result so far: 100 nanoseconds input-to-output latency when the vector table and the IRQ handler are relocated to Tightly-Coupled Memory without making HAL calls. Not bad, but the GPIO controller (several buses away) looks like the real performance killer here. WARNING: buggy code, see correction https://mk.absturztau.be/notes/ajvb448y305b01i4. #electronics #STM32

      ? Offline
      ? Offline
      Gast
      schrieb am zuletzt editiert von
      #2

      Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. Just enabled i-cache and d-cache, and the IRQ latency dropped from 100 ns to 70 ns. 🚀 But cache shouldn't work like this. So my code is still touching slow memory somewhere. The stack perhaps, which is still in "normal" RAM. The slow Flash perhaps also makes it slower to abort main() if an instruction is stuck in a wait state. Need to check everything carefully... #electronics #STM32

      ? 1 Antwort Letzte Antwort
      0
      • ? Gast

        Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. Just enabled i-cache and d-cache, and the IRQ latency dropped from 100 ns to 70 ns. 🚀 But cache shouldn't work like this. So my code is still touching slow memory somewhere. The stack perhaps, which is still in "normal" RAM. The slow Flash perhaps also makes it slower to abort main() if an instruction is stuck in a wait state. Need to check everything carefully... #electronics #STM32

        ? Offline
        ? Offline
        Gast
        schrieb am zuletzt editiert von
        #3

        Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. The 70 ns vs. 100 ns overhead mystery solved. I did not correctly relocate the vector table to Tightly-Coupled Memory properly, it was still in Flash. The STM32 HAL macro USER_VECT_TAB_ADDRESS is a flag, not a memory address! In fact, only several hardcoded addresses are available, a real user override is not provided (the name "user" is a lie). Solution: just change VTOR manually, don't trust the startup code. I'm now getting 70-ns IRQ without CPU cache. #electronics #STM32

        ? 1 Antwort Letzte Antwort
        0
        • ? Gast

          Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. The 70 ns vs. 100 ns overhead mystery solved. I did not correctly relocate the vector table to Tightly-Coupled Memory properly, it was still in Flash. The STM32 HAL macro USER_VECT_TAB_ADDRESS is a flag, not a memory address! In fact, only several hardcoded addresses are available, a real user override is not provided (the name "user" is a lie). Solution: just change VTOR manually, don't trust the startup code. I'm now getting 70-ns IRQ without CPU cache. #electronics #STM32

          ? Offline
          ? Offline
          Gast
          schrieb am zuletzt editiert von
          #4

          I do not understand how the NES system bus works, even after reading multiple tutorials. Only one way to find out... #electronics #NES #NESdev

          ? 1 Antwort Letzte Antwort
          0
          • ? Gast

            I do not understand how the NES system bus works, even after reading multiple tutorials. Only one way to find out... #electronics #NES #NESdev

            ? Offline
            ? Offline
            Gast
            schrieb am zuletzt editiert von
            #5

            Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. I decided to try an event loop using the WFE instruction instead of IRQs, and I managed to get 60 ns input-to-output latency. I suspect this is the best possible latency. Latency did not improve by abusing QSPI controller to generate a write request (in fact it slightly degraded), even if the QSPI controller is physically close to the CPU. Clearly, passively monitoring signals is not the way to go for bus emulation. Perhaps the solution is predicting the clock before it even arrives, by internally generating a phase-shifted version of it. #electronics #STM32

            ? 1 Antwort Letzte Antwort
            0
            • ? Gast

              Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. I decided to try an event loop using the WFE instruction instead of IRQs, and I managed to get 60 ns input-to-output latency. I suspect this is the best possible latency. Latency did not improve by abusing QSPI controller to generate a write request (in fact it slightly degraded), even if the QSPI controller is physically close to the CPU. Clearly, passively monitoring signals is not the way to go for bus emulation. Perhaps the solution is predicting the clock before it even arrives, by internally generating a phase-shifted version of it. #electronics #STM32

              ? Offline
              ? Offline
              Gast
              schrieb am zuletzt editiert von
              #6

              Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. My "zero-latency IRQ" idea is a success, now I'm getting a 17.30 ns "effective" latency! Upon receiving every rising edge of the clock, the hardware immediately starts a timer that fires after a programmed delay, calculated to be slightly before the next clock rising edge. This way, the firmware is triggered from recovered, phase-shifted version of the clock, a little bit like how analog NTSC TVs got their H/VSYNC. Interrupt latency is completely eliminated for all but the first clock cycle (which is also predictable with pre-enabled outputs, since it's always the reset vector) Perfect bus emulation starts looking feasible. #electronics #STM32

              ? 1 Antwort Letzte Antwort
              0
              • ? Gast

                Keep optimizing IRQ latency on the STM32H743 @ 480 MHz. My "zero-latency IRQ" idea is a success, now I'm getting a 17.30 ns "effective" latency! Upon receiving every rising edge of the clock, the hardware immediately starts a timer that fires after a programmed delay, calculated to be slightly before the next clock rising edge. This way, the firmware is triggered from recovered, phase-shifted version of the clock, a little bit like how analog NTSC TVs got their H/VSYNC. Interrupt latency is completely eliminated for all but the first clock cycle (which is also predictable with pre-enabled outputs, since it's always the reset vector) Perfect bus emulation starts looking feasible. #electronics #STM32

                ? Offline
                ? Offline
                Gast
                schrieb zuletzt editiert von
                #7

                Making a 60-pin Famicom debug cartridge for testing my cartridge emulator... #electronics #NES #NESdev

                1 Antwort Letzte Antwort
                0
                • monkee@other.liM monkee@other.li shared this topic
                Antworten
                • In einem neuen Thema antworten
                Anmelden zum Antworten
                • Älteste zuerst
                • Neuste zuerst
                • Meiste Stimmen


                • Anmelden

                • Anmelden oder registrieren, um zu suchen
                • Erster Beitrag
                  Letzter Beitrag
                0
                • Kategorien
                • Aktuell
                • Tags
                • Beliebt
                • World
                • Benutzer
                • Gruppen