Recently, a breaking change was made to how dependency checks work. The change was released in 1.32.0/0.53b0. There were multiple issues with this approach but also multiple benefits. This issue is meant to explain the reasons for the change, the affected use cases, the breakages, and possible solutions.
Pre-existing dependency conflict logic for autoinstrumentation
Each instrumentation stores the version restrictions for its instrumented library under `[project.optional-dependencies]`. For instance, the Flask instrumentation lists `flask >= 1.0`.
The importlib_metadata `Distribution` object's `requires` field includes the instrumentation's required and optional dependencies. The optional dependencies carry an extra marker, `extra == 'instruments'`. For example: `['opentelemetry-api~=1.12', 'opentelemetry-instrumentation-wsgi==0.52b1', ..., "flask>=1.0; extra == 'instruments'"]`.
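As a minimal sketch of how those requirement strings can be split into required vs. optional dependencies, the snippet below filters on the `extra == 'instruments'` marker. The `requires` list is illustrative, mirroring the Flask instrumentation's metadata; the helper name is hypothetical, not part of the real codebase.

```python
# Illustrative requirement strings, in the shape importlib.metadata reports them.
requires = [
    "opentelemetry-api~=1.12",
    "opentelemetry-instrumentation-wsgi==0.52b1",
    "flask>=1.0; extra == 'instruments'",
]

def optional_instrument_deps(reqs):
    """Return the requirement specs gated behind the 'instruments' extra."""
    deps = []
    for req in reqs:
        if ";" in req:
            spec, marker = req.split(";", 1)
            if "extra == 'instruments'" in marker:
                deps.append(spec.strip())
    return deps

print(optional_instrument_deps(requires))  # ['flask>=1.0']
```

A real implementation would parse the PEP 508 marker properly (e.g. with the `packaging` library) rather than substring-matching, but the shape of the data is the same.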
The `get_dist_dependency_conflicts` function, removed by the breaking change, identifies the optional dependencies that have `extra == 'instruments'` and returns a `DependencyConflict` if those requirements are not met. For instance, a conflict is returned if Flask < 1.0 is installed OR if Flask is not installed at all. (The latter is essential for the "codeless cloud autoinstrumentation" and "instrumentation pack" use cases explained below.)
Autoinstrumentation's `_load.py` calls `get_dist_dependency_conflicts` before instantiating the instrumentor objects. If a dependency conflict is returned, the instrumentor object is not instantiated.
For autoinstrumentation at least, the dependency check was done before any instrumentor was instantiated. Autoinstrumentation does not assume that the instrumented library is installed. Most instrumentor objects, however, were written assuming that the instrumented library is installed and therefore that a dependency check would be done before the Instrumentor object is instantiated.
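The old flow can be sketched roughly as below. This is a simplified stand-in for the removed `get_dist_dependency_conflicts` (the real helper also validated version constraints against what is installed); the function name and the crude spec parsing are illustrative only.

```python
import importlib.util

def dist_dependency_conflict(dep_specs):
    """Rough stand-in for the removed check: report a conflict if an
    instrumented package is absent. (The real helper also checked versions.)"""
    for spec in dep_specs:
        # crude: take the package name before any version operator
        name = spec.split(">")[0].split("<")[0].split("=")[0].strip()
        if importlib.util.find_spec(name) is None:
            return f"missing dependency: {spec}"
    return None  # no conflict: safe to instantiate the instrumentor

# The point of the old design: the check runs BEFORE instantiation,
# so an instrumentor whose library is missing is simply never created.
conflict = dist_dependency_conflict(["some_absent_instrumented_lib>=1.0"])
instrumentor = None if conflict else object()  # object() stands in for Instrumentor()
print(conflict)
```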
New dependency conflict logic from change
OPTIONAL: Instrumentations list optional requirements in the `_instruments` field in `package.py`.
The `<Library>Instrumentor` object's `instrumentation_dependencies` method returns the optional dependencies. Most often, this pulls from `package.py`'s `_instruments` field. However, in more complicated cases, such as the KafkaInstrumentor, it may return different requirements depending on what is installed. Importantly, this Kafka design still assumes Kafka is installed and will crash if not.
Note that `[project.optional-dependencies]` is no longer relevant. (As far as I can tell, it would only be used in this packaging script.) The optional dependencies identified in the importlib_metadata `Distribution.requires` field have no bearing on whether an instrumentation will be initialized. In fact, the `_instruments` field is merely common style and is not required either. All that matters is the `<Library>Instrumentor` object's `instrumentation_dependencies` method.
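A minimal sketch of the shape the new logic relies on: a hypothetical instrumentor whose `instrumentation_dependencies()` is the single source of truth. The class bodies here are simplified assumptions, not the real `opentelemetry-instrumentation` base class.

```python
class BaseInstrumentor:
    """Simplified stand-in for the real abstract base class."""
    def instrumentation_dependencies(self):
        raise NotImplementedError

class FlaskInstrumentor(BaseInstrumentor):
    # Commonly pulled from package.py's _instruments tuple, but that is
    # only convention -- this method is all the new check looks at.
    _instruments = ("flask >= 1.0",)

    def instrumentation_dependencies(self):
        return self._instruments

print(FlaskInstrumentor().instrumentation_dependencies())  # ('flask >= 1.0',)
```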
Reason for change: Multi-package instrumentations
The old approach does not work well for Kafka or PsycoPG2. These instrumentations have multiple alternative packages they could instrument. For instance, Kafka can instrument kafka-python OR kafka-python-ng. It should not require both to be installed. However, in the old approach, multiple entries in `[project.optional-dependencies]` are treated as ALL required. So, when Kafka lists `"kafka-python >= 2.0, < 3.0", "kafka-python-ng >= 2.0, < 3.0"`, the old approach would only attempt to instrument Kafka in the unrealistic scenario where both kafka-python and kafka-python-ng are installed. To summarize, the old approach is only designed for "AND instrumentations" but not "OR instrumentations".
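The AND-vs-OR distinction can be shown with two tiny predicates. This is a sketch of the semantics only; the function names are made up, and real checks would also validate version specifiers.

```python
def all_required_met(specs, installed):
    """Old-style 'AND' check: every listed package must be installed."""
    return all(name in installed for name, _version in specs)

def any_required_met(specs, installed):
    """'OR' check that multi-package instrumentations like Kafka need."""
    return any(name in installed for name, _version in specs)

kafka_specs = [("kafka-python", ">=2.0,<3.0"), ("kafka-python-ng", ">=2.0,<3.0")]
installed = {"kafka-python-ng"}  # the realistic case: only one variant present

print(all_required_met(kafka_specs, installed))  # False: old logic skips Kafka
print(any_required_met(kafka_specs, installed))  # True: OR semantics instruments it
```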
Secondary reason: Manual vs Auto consistency
Change breakdown
The change moves dependency checks into the `Instrumentor.instrument` method itself. In other words, no dependency check is done before Instrumentors are instantiated. Since Instrumentors generally assume the instrumented packages are installed, this causes any such Instrumentor to crash, generally with an ImportError, even before the new dependency check in `Instrumentor.instrument` begins.
In short, this means that the new dependency check only prevents breakage when the instrumented package is installed but with the wrong version.
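The failure mode can be simulated in a few lines. The module name below is hypothetical; the point is only that an eager import of a missing instrumented library raises before any `instrument()`-time check could run.

```python
def load_instrumentor():
    # Typical instrumentor module shape: the instrumented library is
    # imported eagerly, before any dependency check inside instrument().
    import some_missing_instrumented_lib  # hypothetical package name
    return "instrumentor ready"

try:
    result = load_instrumentor()
except ImportError as exc:
    # The crash lands here -- the new in-method dependency check never ran.
    result = f"failed early: {type(exc).__name__}"

print(result)  # failed early: ModuleNotFoundError
```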
Use cases
Before explaining the breakages, here are some relevant use cases
Instrumentation packs
OpenTelemetry clients may include multiple instrumentations automatically. For instance, the azure-monitor-opentelemetry "distro" provides an easy one-line solution to set up OTel providers, exporters, and instrumentations of the most popular libraries. For example, it includes the Flask instrumentation automatically. It is up to the dependency conflict check to decide whether that instrumentor should be instantiated and whether the library should be instrumented.
One-click codeless autoinstrumentation from Cloud providers
Multiple cloud providers, such as Azure, provide OpenTelemetry autoinstrumentation as a feature in their UI. This means that with a single click, you can enable any and all supported instrumentations, so the cloud service must install the instrumentations. For both ease of use and to avoid ballooning start-up times, this is done by side-loading pre-installed instrumentations. For example, the Flask instrumentation will be present regardless of whether the user has a Flask app. It is up to the dependency conflict check to decide whether that instrumentor should be instantiated and whether the library should be instrumented.
Note that the fundamental difference between this and other autoinstrumentation scenarios is that the instrumented app and the autoinstrumentation agent come from two different parties: the cloud customer and the cloud provider, respectively.
Summarized breakages and fixes
Public method get_dist_dependency_conflicts deleted. Fixed in ___
Instrumentation requirements are no longer taken from pyproject.toml but rather from the abstract `Instrumentor.instrumentation_dependencies()` method.
Instrumented libraries are now assumed to be present for all installed instrumentations, whether they rely on an "and" or an "or" list of instrumented libraries. This breaks the "instrumentation pack" and "cloud-provided autoinstrumentation" scenarios.
Instrumentor objects are now assumed to instantiate gracefully when the instrumented library is not installed.
There is no distinction between the `ModuleNotFoundError` raised when an instrumented library is not installed and all other possible sources of that error. Dependency checks now only constrain the version of the instrumented package, not whether it is installed at all. `DependencyConflictError` is only raised when the library is installed but at the wrong version.
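One way to make the missing distinction, sketched under the assumption that the instrumented library's top-level module name is known: `ModuleNotFoundError` carries a `.name` attribute identifying which module was missing, so "the instrumented library itself is absent" can be told apart from a transitive import failure. The helper name is hypothetical.

```python
def classify_import_failure(exc, instrumented_module):
    """Distinguish 'the instrumented library is absent' from other errors."""
    if isinstance(exc, ModuleNotFoundError) and exc.name == instrumented_module:
        return "library not installed"
    return "other import failure"

# Simulated errors: one for the instrumented library itself, one for a
# transitive dependency that failed to import.
missing = ModuleNotFoundError("No module named 'flask'", name="flask")
transitive = ModuleNotFoundError("No module named 'werkzeug'", name="werkzeug")

print(classify_import_failure(missing, "flask"))     # library not installed
print(classify_import_failure(transitive, "flask"))  # other import failure
```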
Possible solutions
Revert change, add new "instruments_either" package field
We could add a new field besides "instruments" that acts as an "or" list while leaving the existing field to act as an "and" list. `get_dependency_conflicts` would then be changed to utilize both fields. Instrumentations like Kafka would leave "instruments" blank but populate "instruments_either". Most instrumentations would keep their current "instruments" value and not require any changes.
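A sketch of the proposed semantics, with "instruments" as an AND list and the hypothetical "instruments_either" as an OR list (field and function names are part of the proposal, not existing API; real logic would also check version specifiers):

```python
def has_conflict(instruments, instruments_either, installed):
    """Return a conflict description, or None if the instrumentation may run."""
    missing_all = [pkg for pkg in instruments if pkg not in installed]
    if missing_all:
        return f"missing required: {missing_all}"
    if instruments_either and not any(pkg in installed for pkg in instruments_either):
        return f"none of the alternatives installed: {instruments_either}"
    return None

# Kafka-style instrumentation: empty AND list, OR list of the two variants.
either = ["kafka-python", "kafka-python-ng"]
print(has_conflict([], either, {"kafka-python-ng"}))  # None: one variant suffices
print(has_conflict([], either, set()))                # conflict: neither installed
```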
If we wish to keep the similarity between Manual and Auto, we could either do a partial revert or simply offer this as a manual instrumentation option as well. I think it makes sense to allow users to do a dependency check before instantiating the Instrumentor, even for manual instrumentation users.
Retrofit all Instrumentation's Instrumentor objects
Instrumentation modules and Instrumentor objects would all be changed to not automatically import their instrumented libraries; they would check their own dependencies instead. This includes changing Kafka and PsycoPG2 as well.
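The retrofit pattern would look roughly like this: the import of the instrumented library moves inside the method, so instantiation alone can never raise. This is a hypothetical shape, not the current KafkaInstrumentor.

```python
class KafkaInstrumentor:  # hypothetical retrofitted shape
    def _instrument(self):
        # Import moved inside the method: merely instantiating the
        # instrumentor can no longer raise ImportError.
        try:
            import kafka  # noqa: F401  -- module provided by either variant
        except ImportError:
            return False  # library absent: skip gracefully
        return True

instrumentor = KafkaInstrumentor()  # safe even when kafka is not installed
print(instrumentor._instrument())
```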
Implement new should_instrument method in each instrumentation
This method would provide the flexibility of `instrumentation_dependencies`, but with more clarity for use cases like Kafka and PsycoPG2. It would also work for the "codeless cloud autoinstrumentation" and "instrumentation pack" use cases. Depending on implementation, this may also require retrofitting instrumentations or instrumentor objects to not automatically import their instrumented libraries.
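A sketch of what the proposed hook could look like, expressing OR semantics directly instead of via metadata parsing. Both the method name (`should_instrument`) and the module names probed are assumptions for illustration.

```python
import importlib.util

class KafkaInstrumentor:  # hypothetical: should_instrument is a proposed hook
    def should_instrument(self):
        # OR semantics stated directly: instrument if either variant's
        # top-level module is importable (module names are assumptions).
        return any(
            importlib.util.find_spec(mod) is not None
            for mod in ("kafka", "kafka_python_ng")
        )

print(KafkaInstrumentor().should_instrument())
```

Autoinstrumentation's loader would call `should_instrument()` before ever touching `instrument()`, restoring the old "check before instantiate/activate" guarantee.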
Implement separate Instrumentations instead of "OR scenarios"
There could be a KafkaInstrumentation and a KafkaNGInstrumentation.
jeremydvoss changed the title from "DRAFT: Detailed breakdown of dependency conflict check breaking change" to "Detailed breakdown of dependency conflict check breaking change" on May 7, 2025.
I don't think we should blindly update the docs; rather, we should reevaluate the differing instructions between the getting-started docs and the potential use of the operator.
I do think the docs are fine (we used them with success to get going). There is just a higher-level question of how we should be generating and then installing OpenTelemetry instrumentations when we can do it two different ways.
I do think the long-term answer is "support both", so maybe we just need a line in these docs that points at the operator a bit more clearly. But even that is a bit risky in "getting started" documentation.
TL;DR: I don't have good answers, I just wanted to share what tripped me up a bit.
Links:
Repo before change: https://github.com/open-telemetry/opentelemetry-python-contrib/tree/8582da5b8decd99f3780e820b5652d4c72b7a953
Breaking change PR: #3202
New tracebacks and logs example: Azure/azure-sdk-for-python#40517