Exceeds - Team AI Productivity Dashboard

Marko Vlahovic

PROFILE

Marko Vlahovic developed and optimized hardware performance counter systems across the tenstorrent/tt-llk and tenstorrent/tt-metal repositories, focusing on end-to-end observability and efficient data collection. He implemented a C++ performance counter infrastructure, first with per-thread L1 buffers and later with a shared L1 buffer architecture, and integrated Python tooling for configuration, data analysis, and automated metric reporting. His work enabled detailed tracking of kernel-level events, such as REQUESTS versus GRANTS, and introduced synchronization improvements that protect data integrity across threads. By reducing memory usage and simplifying metrics, Marko established a reproducible, maintainable foundation for performance analysis and capacity planning, demonstrating depth in C++, Python, and system architecture.

Overall Statistics

Feature vs Bugs

100% Features

Repository Contributions

2 Total

Bugs

Commits

Features

Lines of code

3,329

Activity Months: 2

Work History

March 2026

1 Commit • 1 Feature

Mar 1, 2026

March 2026: Delivered a high-impact optimization to the performance counter subsystem of tt-metal, with an emphasis on memory efficiency, data integrity, and maintainability.

February 2026

1 Commit • 1 Feature

Feb 1, 2026

February 2026: Delivered the Tensix Performance Counter System for tt-llk, establishing end-to-end hardware performance observability across the UNPACK, MATH, and PACK threads and enabling data-driven optimization. Core features include C++ PerfCounters plumbing with per-thread L1 buffers, Python tooling for configuration and readout, and derived metrics with automated summaries. Added matmul kernel instrumentation and kernel-level integration to provide side-by-side REQUESTS vs GRANTS analysis, improving visibility into arbitration, stalls, and bottlenecks. This work lays the foundation for reproducible performance analysis, targeted tuning, and better capacity planning.
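
As an illustration of the kind of derived metric a REQUESTS vs GRANTS comparison can produce, the sketch below computes a per-thread grant ratio from raw counts. It is not the tt-llk tooling: the `CounterSample` type, the field names, the function names, and the sample values are all hypothetical, and it assumes each readout yields one request count, one grant count, and a cycle count per thread.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical readout for one thread; not the real tt-llk layout."""
    thread: str    # e.g. "UNPACK", "MATH", "PACK"
    requests: int  # number of REQUESTS events counted
    grants: int    # number of GRANTS events counted
    cycles: int    # cycles covered by the measurement window


def derive_metrics(sample: CounterSample) -> dict:
    """Turn raw counts into simple derived metrics (assumed definitions)."""
    # With no requests nothing was denied, so treat the ratio as 1.0.
    grant_ratio = sample.grants / sample.requests if sample.requests else 1.0
    return {
        "thread": sample.thread,
        "grant_ratio": grant_ratio,
        "ungranted_requests": max(sample.requests - sample.grants, 0),
        "requests_per_cycle": (
            sample.requests / sample.cycles if sample.cycles else 0.0
        ),
    }


def summarize(samples: list) -> list:
    """Return derived metrics sorted so the lowest grant ratio comes first."""
    return sorted((derive_metrics(s) for s in samples),
                  key=lambda row: row["grant_ratio"])


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    readout = [
        CounterSample("UNPACK", requests=1000, grants=750, cycles=2000),
        CounterSample("MATH", requests=400, grants=400, cycles=2000),
        CounterSample("PACK", requests=0, grants=0, cycles=2000),
    ]
    for row in summarize(readout):
        print(row)
```

Sorting by grant ratio puts the thread with the largest share of ungranted requests at the top of the summary, which is one plausible way to surface arbitration bottlenecks first.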

Quality Metrics

Correctness: 100.0%

Maintainability: 80.0%

Architecture: 100.0%

Performance: 80.0%

AI Usage: 50.0%

Skills & Technologies

Programming Languages

C++, Python

Technical Skills

C++ development, Python scripting, data analysis, data visualization, hardware integration, multithreading, performance analysis, performance optimization, system architecture

Repositories Contributed To

2 repos

Overview of all repositories you've contributed to across your timeline

tenstorrent/tt-llk

Feb 2026 – Feb 2026

1 Month active

Languages Used

C++, Python

Technical Skills

C++ development, Python scripting, data visualization, hardware integration, performance analysis

tenstorrent/tt-metal

Mar 2026 – Mar 2026

1 Month active

Languages Used

C++, Python

Technical Skills

data analysis, multithreading, performance optimization, system architecture