microsoft / CLRInstrumentationEngine

The CLR Instrumentation Engine is a cooperation profiler that allows running multiple profiling extensions in the same process.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Decouple libxml2 dependency from engine

jander-msft opened this issue · comments

The engine currently relies on libxml2 on Linux in order to load and parse instrumentation method configurations. It would be ideal if this dependency could be removed so that it does not require, for example, app containers to install this dependency; this would move closer to being able to dynamically attach the engine from a sidecar container without putting library requirements on the app container.

Alternatives to the dynamic dependency:

  • statically link the library.
  • convert to JSON configuration and use something like rapidJSON; could allow XML configuration via separate helper shard library that is loaded on demand (or is loaded by default but can be opted-out).

I wonder about both of these options.

  1. If we statically link the library, we may need to actually check out and build the library ourselves because we can't guarantee that any libs would be compatible with our compiler/configuration.

  2. Would moving to rapidJSON be a breaking change?

Note, if we build and statically link libxml2, we might be able to use a single code path for xml reading instead of depending on msxml on Windows.

  1. If we statically link the library, we may need to actually check out and build the library ourselves because we can't guarantee that any libs would be compatible with our compiler/configuration.

Correct. It's an option, but I think option 2 is better.

  1. Would moving to rapidJSON be a breaking change?

Directly cutting over the something like rapidJSON would be a breaking change.

However, we could be smarter by supporting both where the engine can detect what parser needs to be use (checking file extensions, looking up an explicit setting somewhere).

By default, it would assume XML and either (1) use the already existing solution where the the engine has the libxml2 or msxml dependency or (2) move the existing solution to another shared library that the engine would load dynamically when it knows to use XML parsing, thus avoiding the need to always have XML parsing and having to consider if a direct dependency on libxml2 or msxml would cause runtime issues (on Windows, the msxml library is required to be resolved up front whereas on Linux, the libxml2 library is resolved at point of usage).

If a configuration is JSON, we'd use the built-in JSON parsing (I would advocate for a JSON parsing solution that doesn't require additional libraries, because if it does require other libraries, then that defeats the purpose of this issue; it does not have to be rapidJSON).

What about the nagler profiler? Do you think we should use json for its configuration instead of xml?

@jander-msft @delmyers it would be great to remove libxml2 dependency. This is issue for us on most of the distributions because libxml2 is not dotnet dependency. I really like idea to move to rapidJSON and this will not cause any issues for us as we ship CLR IE inside our package. Any ETA on this?