
Zach Montoya engineered distributed tracing, observability, and benchmarking features across multiple DataDog repositories, including dd-trace-dotnet, dd-trace-py, and system-tests. He delivered cross-language OpenTelemetry integrations, enhanced runtime metrics, and improved trace context propagation, using C#, Python, and C++ to address reliability and performance. Zach refactored APIs for usability, implemented regression and parametric tests, and optimized build automation and CI/CD pipelines. His work included documentation consolidation and configuration management, reducing onboarding friction and production risk. By expanding test coverage and introducing feature flags, Zach ensured robust, maintainable code that improved cross-team collaboration and enabled safer, faster delivery of tracing features.

February 2026: DataDog/system-tests — OpenTelemetry Baggage API Endpoints implemented for Datadog and OpenTelemetry, with tests validating baggage propagation and removal. No explicit bug fixes reported; focus was on feature delivery and test coverage to improve observability and reliability.
November 2025: Focused on delivering reliable observability features and fixing critical cross-repo issues, with emphasis on system reliability and cross-team collaboration.
October 2025: Delivered major OpenTelemetry metrics enhancements, testing improvements, and observability features across four repos, resulting in more reliable metrics collection, better traceability, and improved performance. Highlights include dd-trace-py OTel Metrics integration enhancements with custom defaults and OTLP fallback fixes; system-tests OTLP metrics testing improvements with expanded coverage and reduced runtime; nginx-datadog automatic W3C baggage tagging; and dd-trace-dotnet OpenTelemetry metrics processing optimizations. Business value: increased reliability of metrics pipelines, faster feedback loops, and clearer traceability for end-to-end observability.
September 2025 (2025-09) — DataDog/nginx-datadog focused on strengthening distributed tracing reliability for NGINX subrequests. Delivered a bug fix to ensure the root span context is injected into subrequests (not the subrequest span), restoring the original tracing behavior and preventing trace fragmentation. Implemented regression tests to lock in the fix. The change aligns with commit 0e81220a00f538b31dd3c3036d6c32d83308ff5d (fix: Correctly handle auth requests when subrequest logging is enabled (#245)). This work improves end-to-end trace accuracy, reduces debugging time, and strengthens observability for authenticated request workflows.
Month: 2025-08 – DataDog/dd-trace-dotnet: Delivered governance, stability, and build hygiene improvements that enhance developer productivity and runtime reliability. Key outcomes include streamlined CODEOWNERS ownership for the SDK capabilities team, memory-growth controls in the tracer during dynamic config updates, stabilization of OTLP metrics snapshot tests, and cleaner build/dependency configurations for clearer releases and better compatibility.
Monthly summary for 2025-07: Focused on delivering targeted observability enhancements across Python, .NET, and documentation to advance reliability, interoperability, and developer experience. Implemented OpenTelemetry Metrics API opt-in integration in DataDog/dd-trace-py, enabling optional OpenTelemetry SDK dependency, global MeterProvider management, and support for synchronous and asynchronous instruments via an environment-variable toggle. This reduces integration friction for OTEL adopters and improves metrics consistency across services. Updated Datadog documentation to clarify DD_TRACE_PROPAGATION_BEHAVIOR_EXTRACT usage, detailing how incoming distributed tracing headers are handled, the accepted values ('continue', 'restart', 'ignore'), default behavior, and language-specific notes, resulting in clearer guidance and reduced misconfigurations. Strengthened .NET observability with AWS Trace Context Injection Guard for Disabled Integrations in DataDog/dd-trace-dotnet, preventing trace context injection when AWS SDK or Kinesis integrations are disabled and adding unit tests to verify the guard behavior. Overall impact: improved interoperability with OpenTelemetry, safer integration semantics, clearer configuration guidance, and enhanced test coverage. Technologies/skills demonstrated: Python and .NET instrumentation, OpenTelemetry integration patterns, environment-variable feature flags, unit testing, and documentation practices.
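The three accepted values of DD_TRACE_PROPAGATION_BEHAVIOR_EXTRACT can be pictured with a small sketch. The summary above only lists the values; the meaning given to each one here (continue the incoming trace, restart with a span link back to it, or ignore it entirely) is an assumption based on a reading of the public Datadog documentation, and `SpanContext`, `StartDecision`, and `apply_extract_behavior` are hypothetical names, not tracer APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpanContext:
    """Hypothetical stand-in for a trace context extracted from headers."""
    trace_id: int
    span_id: int


@dataclass
class StartDecision:
    """How a new local root span relates to the incoming context."""
    parent: Optional[SpanContext] = None
    links: List[SpanContext] = field(default_factory=list)


def apply_extract_behavior(behavior: str, incoming: Optional[SpanContext]) -> StartDecision:
    # Assumed semantics, not confirmed by the summary above.
    if behavior == "continue":
        # Join the incoming trace when a valid context was extracted.
        return StartDecision(parent=incoming)
    if behavior == "restart":
        # Start a new trace, but keep a link back to the incoming context.
        links = [incoming] if incoming is not None else []
        return StartDecision(parent=None, links=links)
    if behavior == "ignore":
        # Start a new trace and drop the incoming context entirely.
        return StartDecision()
    raise ValueError(f"unsupported behavior: {behavior!r}")
```

Called with `"restart"` and an extracted context, the sketch returns no parent and a single link, which is the shape one would expect to see on the new root span under the assumed semantics.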
June 2025: Focused on performance visibility and cross-language test reliability across two repos. Key initiatives covered OpenTelemetry tracing benchmarks and cross-language ExtractBehavior test coverage, improving insight into tracer performance and reducing regression risk.
May 2025 focused on improving tracing correctness, expanding performance benchmarking, and hardening CI stability across two repositories. Delivered a regression test for Tracestate parent_id propagation, introduced OpenTelemetry benchmarking projects for microbenchmarks, and implemented CI timeout mitigations. Also enhanced BenchmarkDotNet configurability to balance accuracy and runtime, followed by a stability rollback to maintain CI reliability.
April 2025 highlights: Delivered critical reliability fixes, expanded test coverage, and improved documentation across DataDog/dd-trace-dotnet, DataDog/system-tests, and DataDog/documentation. Key outcomes include fixing startup hook injection for .NET 8, enabling dynamic configuration tests in the C++ agent, correcting OpenTelemetry span handling in the .NET parametric app, and consolidating the OpenTelemetry Runtime Metrics documentation with an improved navigation experience. These changes reduce startup/instrumentation risk, enhance test validation, and streamline user onboarding and developer guidance, delivering measurable business value in reliability, coverage, and documentation usability.
Month: 2025-03. Focused on improving onboarding, cross-language observability, and benchmarking capabilities across the Datadog tracing stack. Key initiatives spanned documentation improvements, tracing configuration features, runtime metrics enablement, and expanded test coverage.
Key features delivered:
- DataDog/documentation: Azure Service Bus integration documentation enhancements in Data Streams Monitoring (consolidated and clarified setup, updated setup table, simplified onboarding by removing external tracing page references); Unified Datadog APM Runtime Metrics documentation (consolidated cross-language docs into a single page with compatibility, setup instructions, and language-specific data collected); Clarify dotnet tracing correlation IDs format (document that trace_id and span_id are 64-bit decimal numbers to ensure proper log-trace correlation).
- DataDog/dd-trace-dotnet: Experimental feature flags and flexible tag parsing (new config key to enable experimental tracing features; DD_TAGS parsing flexibility to align with the Datadog Agent, including spaces and empty values); 128-bit trace ID logging enabled by default and related documentation (default true when 128-bit IDs are generated; updated logging format and docs for the DD_TRACE_128_BIT_TRACEID_LOGGING_ENABLED setting).
- DataDog/system-tests: Telemetry script stability and removal of unnecessary file deletion (remove an unintended file deletion in tracer telemetry intake script); Enhanced .NET log injection test coverage and 128-bit trace ID support (expanded tests, updated DD_TAGS test cases, validated 128-bit tracing across configurations).
- DataDog/dd-trace-rb: Experimental Runtime ID collection for per-process metrics (new flag DD_TRACE_EXPERIMENTAL_RUNTIME_ID_ENABLED with documentation updates).
- DataDog/dd-trace-go: Macrobenchmark configurations for Go 1.23/1.24 with tracing and runtime metrics (new macrobenchmark setup focusing on tracing with runtime metrics enabled).
- DataDog/dd-trace-py: Nightly macrobenchmark configuration: tracing-runtime-metrics-enabled (runtime metrics disabled) to validate behavior without runtime metrics.
- DataDog/dd-trace-js: CI Macrobenchmark: Enable runtime metrics for tracing benchmarks (new macrobenchmark configuration enabling runtime metrics for tracing).
Major bugs fixed:
- Clarified 64-bit decimal format for trace IDs in .NET correlation IDs documentation (trace-id and span-id are 64-bit decimal numbers).
- Telemetry update script stability: removed unnecessary file deletion to prevent unintended data loss.
Overall impact and accomplishments:
- Improved onboarding and reduced friction with Azure Service Bus integration setup; consolidated APM runtime metrics docs; aligned tracing docs and defaults across languages; expanded test coverage to reduce regression risk; introduced safe feature flags for experimental tracing work; enabled robust performance analysis through runtime metrics and 128-bit tracing.
Technologies/skills demonstrated:
- Cross-language documentation wrangling, tracing configuration and feature flag design, 128-bit tracing, macrobenchmark orchestration, and test coverage expansion across Go, Python, Ruby, and JavaScript.
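The relationship between a 128-bit trace ID and the 64-bit decimal form mentioned above comes down to simple arithmetic, sketched below. This is only an illustration: it assumes the 64-bit value is the low-order half of the 128-bit ID, which the summary does not state, and `lower_64_bits_decimal` is a hypothetical helper rather than a function from any Datadog library.

```python
MASK_64 = (1 << 64) - 1  # keeps only the low-order 64 bits


def lower_64_bits_decimal(trace_id_hex: str) -> str:
    """Return the low 64 bits of a hex trace ID as a decimal string.

    Assumption: the 64-bit correlation ID corresponds to the low-order
    half of a 128-bit trace ID written as 32 hex characters.
    """
    value = int(trace_id_hex, 16)
    return str(value & MASK_64)


# A 128-bit ID whose low-order half is 0xff
print(lower_64_bits_decimal("67890abcdef0123400000000000000ff"))  # prints 255
```

A log pipeline that expects decimal IDs would compare against the string this returns, while one that expects the full 128-bit ID would use the 32-character hex form unchanged.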
February 2025 monthly summary: Delivered cross-language tracing test hardening, updated documentation to reflect GA status, and fixed baggage parsing in W3CBaggagePropagator. These efforts improved test robustness, clarified GA readiness, and baggage data accuracy, enabling faster QA cycles and reduced production risk.
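For readers unfamiliar with the header involved, a W3C `baggage` header is a comma-separated list of `key=value` members, each optionally followed by `;`-separated properties, with percent-encoded values. The sketch below parses that shape using only the standard library. It is not the W3CBaggagePropagator code, and the summary does not say what the parsing defect was; `parse_baggage_header` is a hypothetical name, and skipping malformed members is a design choice of this sketch.

```python
from urllib.parse import unquote


def parse_baggage_header(header: str) -> dict:
    """Parse a W3C-style baggage header into a key/value dict.

    Members without '=' or with an empty key are skipped, and
    properties after ';' are dropped in this simplified sketch.
    """
    items = {}
    for member in header.split(","):
        # Drop optional properties such as ";prop=1".
        key_value = member.split(";", 1)[0]
        if "=" not in key_value:
            continue
        key, value = key_value.split("=", 1)
        key = key.strip()
        if key:
            # Values are percent-encoded on the wire.
            items[key] = unquote(value.strip())
    return items


print(parse_baggage_header("userId=alice, serverNode=DF%2028;prop=1, bad"))
# prints {'userId': 'alice', 'serverNode': 'DF 28'}
```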
January 2025: Key feature work and testing across dd-trace-dotnet and system-tests focused on improving tracing API usability, OTEL interoperability, and observability. Key deliveries include: Span Links API simplification and chainable AddLink in dd-trace-dotnet; OpenTelemetry propagation interoperability and configurability enhancements; OpenTelemetry Propagator Compatibility Testing in system-tests; Runtime Metrics Feature Testing in system-tests. Major impact: reduced API friction, improved cross-language propagation behavior, and expanded validation of runtime metrics tagging. Technologies demonstrated: OpenTelemetry integration, TextMapPropagator usage, baggage handling, cross-language test automation, and runtime metrics instrumentation. Business value: more reliable distributed tracing, easier customer integration, and stronger observability signals.
During December 2024, focused on strengthening distributed tracing validation in DataDog/system-tests by adding end-to-end header propagation tests for Datadog tracing with x-datadog-parent-id = 0 and enabling .NET header tag test coverage. Implemented tests across standard and browser-based synthetic monitoring, with manifest updates, and resolved test bugs by updating .NET header tag test versions to fix known issues, improving coverage of header-tag functionality. These efforts reduce release risk for header handling and improve reliability of tracing features across platforms.
November 2024 monthly summary highlighting business value and technical achievements across DataDog/documentation, DataDog/dd-trace-dotnet, and DataDog/system-tests. Key outcomes include clearer .NET APM library configuration guidance with a dedicated trace context propagation section; resolution of potential .NET Framework assembly loading issues by vendoring SharpZipLib and removing System.IO.Compression in DebugLogReader; and alignment of B3 trace context propagation tests across languages to reflect current propagator behavior. These efforts improve reliability, cross-language interoperability, and developer onboarding, translating into reduced production risk and faster feature delivery.