
Over four months, Cao J. contributed to the OpenXiangShan/GEM5 repository by engineering advanced branch prediction features in C++ and Python, focusing on microarchitecture and performance optimization. He integrated the MicroTAGE predictor into the CPU pipeline, enhanced debugging with ABTB flags, and improved misprediction handling for TAGE and UBTB components. His work included refactoring legacy code, aligning BTB and MBTB logic, and expanding statistical tracking for greater observability and accuracy. By consolidating prediction paths and removing unused logic, Cao J. delivered maintainable, configurable systems that support more reliable simulation, faster tuning, and deeper architectural exploration for CPU performance modeling.
January 2026 (OpenXiangShan/GEM5) focused on streamlining the branch prediction and BTB (branch target buffer) pipeline to improve simulation accuracy and maintainability. Core efforts centered on consolidating predwrongSource handling for wrong-source predictions, aligning BTB entry processing with stage-based predictions, and removing legacy basetable logic to simplify the code path. Key results include a refactored BTB/prediction path with consolidated source handling, new wrong-prediction statistics for better diagnostic visibility, and less dead code after basetable logic was removed from BTBTAGE, MBTB, and related components.
December 2025: Delivered substantial branch predictor enhancements in GEM5 (OpenXiangShan) with a focus on accuracy, configurability, and observability across ITTAGE/TAGE and UBTB/BTB. Key outcomes include ABTB configurability, S3 prediction handling, misprediction counters, and expanded commit statistics across ITTAGE/BTB components, supported by cleanup efforts. MBTB basetable alignment and integration progressed with fixed-target statistics and support for a choice MBTB basetable, improving accuracy and analytics. Backend refactors and bug fixes across ABTB/MBTB/UBTB improved coherence and reduced technical debt. Overall business impact: more reliable performance modeling, faster tuning, and better decision support for architecture exploration. Technologies demonstrated: C++, hardware simulation internals, performance analytics, and collaborative Git workflows across multiple contributors.
November 2025 (OpenXiangShan/GEM5): Delivered debugging enhancements for the internal branch predictor (ABTB) and targeted UBTB/TAGE improvements to strengthen reliability and microarchitectural performance. Implemented a dedicated ABTB debug flag, cleaned up legacy code for maintainability, and improved misprediction handling within the TAGE path. Fixed UBTB-related bugs to improve accuracy and measurement fidelity, enabling faster diagnosis and iteration for future optimizations.
OpenXiangShan/GEM5 (2025-10) focused on a targeted performance enhancement by integrating MicroTAGE into the S1 stage of the CPU pipeline. This work adds a MicroTAGE branch predictor to S1, updates predictor configuration, and refactors code for maintainability and easier integration with existing branch-prediction logic. The effort delivers a concrete architectural advancement for dynamic branch prediction, ready for broader evaluation and future optimization. Delivered commit: 340af13e8310dbbcdf9797b1d77ead2159321d2b ("Microtage on s1 perf (#567)").
