
Milan Radosavljevic enhanced the ROCm/rocprofiler-sdk repository by improving its packaging system to ensure the sql.h header is included in ROCPD’s installed headers. He addressed a recurring integration issue where downstream projects encountered header mismatches, focusing on installation reliability and consistency. Using C++ and CMake, Milan automated the header installation process, aligning the installed header set with developer expectations and reducing support overhead. His work involved careful build system adjustments and git-based change management, resulting in smoother onboarding for developers integrating SQL tooling. Over the month, Milan concentrated on packaging depth rather than bug fixes, demonstrating strong build system expertise.
January 2026 ROCm/rocm-systems monthly performance summary highlighting key features, major fixes, and impact.
Key features delivered:
- Observability Enhancements: Upgraded to a structured logging stack using spdlog, introduced a dedicated rocprofiler-systems-logger, and added runtime log level control via command-line arguments and environment variables. This replaced legacy logging macros with modern equivalents and improved performance and maintainability. Relevant commits include: 318d13870f374cc806f64b77d3b9336b1baf8076 (spdlog integration and logging upgrade) and 940488ed5809b12740d80fa223b7fbba335a000c (fixing process_page category naming).
- Profiling metrics clarification: Renamed the process_page category to reflect physical memory usage, increasing profiling clarity for users.
- PyTorch Environment Auto-Discovery: Added automatic PyTorch library discovery for Python applications, enabling automatic identification of Python interpreters and corresponding PyTorch library paths to ensure correct runtime linkage. Commit: b533f56197ba1089653de799f737ab9d0c38626c.
Major bugs fixed:
- Corrected the naming and description of the process_page profiling category to align with physical memory usage, reducing user confusion and improving profiling accuracy (commit 940488ed).
Overall impact and accomplishments:
- Significantly improved observability, performance, and maintainability of ROCm/rocm-systems across users and deployments.
- Reduced runtime linkage issues by automating PyTorch library discovery, resulting in smoother environment setup and fewer support incidents.
- Strengthened profiling accuracy, providing clearer, more actionable metrics for performance tuning.
Technologies/skills demonstrated:
- CMake build integration, submodule management, and migration to spdlog for structured logging.
- Modern C++ practices (exception-based error handling, improved runtime log control).
- Python runtime discovery and interpreter/path mapping for dynamic library linkage.
- Cross-repo collaboration and clear commit hygiene across multiple features.
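The PyTorch auto-discovery idea described above can be illustrated with a minimal Python sketch. This is not the rocprofiler-systems implementation: the function names `find_torch_lib_dir` and `with_library_path` are hypothetical, and the assumption that PyTorch's shared libraries live in a `lib` directory next to the package's `__init__.py` reflects the usual wheel layout rather than anything stated in the summary.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Mapping, Optional

# Probe executed inside the target interpreter. find_spec locates the
# package without importing it, so no GPU runtime is initialized.
_PROBE = (
    "import importlib.util, os\n"
    "spec = importlib.util.find_spec('torch')\n"
    "origin = getattr(spec, 'origin', None)\n"
    "print(os.path.join(os.path.dirname(origin), 'lib') if origin else '')\n"
)


def find_torch_lib_dir(interpreter: str) -> Optional[Path]:
    """Return the torch/lib directory seen by `interpreter`, or None.

    Hypothetical helper: assumes the shared libraries sit in a `lib`
    directory beside the torch package's __init__.py.
    """
    try:
        proc = subprocess.run(
            [interpreter, "-c", _PROBE],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Interpreter missing, not executable, or unresponsive.
        return None
    candidate = proc.stdout.strip()
    if proc.returncode != 0 or not candidate:
        return None
    path = Path(candidate)
    return path if path.is_dir() else None


def with_library_path(env: Mapping[str, str], lib_dir: Path) -> dict:
    """Return a copy of `env` with `lib_dir` first in LD_LIBRARY_PATH."""
    updated = dict(env)
    parts = [p for p in updated.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    entry = str(lib_dir)
    if entry not in parts:
        parts.insert(0, entry)
    updated["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return updated


if __name__ == "__main__":
    lib_dir = find_torch_lib_dir(sys.executable)
    if lib_dir is None:
        print("PyTorch not found for", sys.executable)
    else:
        print(with_library_path(os.environ, lib_dir)["LD_LIBRARY_PATH"])
```

Running the probe in a subprocess keeps the lookup tied to the interpreter that will actually run the profiled application, which may differ from the one running the discovery code.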
Concise monthly summary for 2025-12 focusing on delivering ROC Profiler enhancements and trace_cache testing for ROCm/rocm-systems. Highlights include performance gains, data fidelity improvements, and expanded test coverage driving reliability for profiling workflows and customer-facing performance analysis.
Concise monthly summary for 2025-11 focusing on business value and technical achievements for ROCm/rocm-systems.
Month: 2025-10 | Repos: ROCm/rocm-systems. Summary: In October 2025, delivered stability improvements and tunable profiling data collection across the ROCm stack. Key features delivered include Perfetto flush period configurability via ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS, and ROCPD cache management improvements with category region caching, a caching refactor, and related metrics enhancements. A critical bug fix addressed rocprof-sys-python runtime errors by ensuring Python libraries are located and loaded via LD_LIBRARY_PATH. The ROCPD work also fixed VAAPI traces and improved JSON parsing and UUID handling for more reliable data stores. Impact: Improved profiling reliability, lower overhead, and faster debugging cycles; more robust CI and builds. Technologies demonstrated: Python runtime hygiene, environment configuration (LD_LIBRARY_PATH), Perfetto, caching architectures, CMake/build hygiene, nlohmann JSON, VAAPI tracing, and container/CI readiness.
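As a rough illustration of how an environment-driven setting such as ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS might be consumed, here is a minimal Python sketch. Only the variable name comes from the summary; the helper `read_flush_period_ms`, the default of 10000 ms, and the choice to fall back to the default on empty, non-numeric, or non-positive values are all assumptions made for the example, not the tool's documented behavior.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS"  # name taken from the summary
DEFAULT_FLUSH_PERIOD_MS = 10_000  # hypothetical default, not from the source


def read_flush_period_ms(
    env: Optional[Mapping[str, str]] = None,
    default: int = DEFAULT_FLUSH_PERIOD_MS,
) -> int:
    """Return the flush period in milliseconds, falling back to `default`.

    Assumption: empty, non-integer, and non-positive values are treated
    as invalid and replaced by the default.
    """
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


if __name__ == "__main__":
    print(read_flush_period_ms())
```

Passing the environment as a mapping keeps the parsing logic testable without mutating the real process environment.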
August 2025 monthly summary focusing on delivering robust CI/CD automation and enhanced observability for ROCm subsystems. Key outcomes include streamlined cross-OS CI for rocprofiler-systems, trace-based data logging enhancements in amd_smi and cpu_freq, and a bug fix ensuring correct stream ID association in rocpd tracing. These efforts improved reliability, observability, and data-driven performance debugging, enabling faster iterations and better developer productivity.
