py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Home Page: https://www.pywhy.org/dowhy

Add accessor to CausalModel._estimator_cache

drawlinson opened this issue

Is your feature request related to a problem? Please describe.
I sometimes need to access the actual CausalEstimator object[s], e.g. to access the learned coefficients in a regression estimator to examine feature importance or stability. It's useful for many Estimators to be able to examine the resulting model, not just the outputs of prediction/inference.

However, the CausalModel interface doesn't expose the Estimator objects in any step of the most convenient process (i.e. create a CausalModel m and call m.estimate_effect(), which is the most widely used method in the docs). The estimator objects are never returned, but are available in an internal structure called _estimator_cache. I assume the structure is internal because of the underscore and the fact there's no documentation of it.

Describe the solution you'd like
We could add an accessor function like this:

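class CausalModel:
    ...
    def __init__(self, *args, **kwargs):
        ...
        # Existing internal cache of estimator objects, keyed by method name
        self._estimator_cache = {}
        ...

    def get_estimator(self, method_name):
        # Return the cached CausalEstimator for the given method name
        return self._estimator_cache[method_name]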

Convenience: Could also default method_name to None and return the first estimator if any present, and/or trap exceptions and return None if method_name not in _estimator_cache.keys().
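A minimal sketch of that convenience variant, to make the intended behaviour concrete. It uses a hypothetical stand-in class (CausalModelSketch) rather than the real CausalModel, and it assumes the cache is a plain dict keyed by method name; the placeholder string stands in for a fitted estimator object.

```python
class CausalModelSketch:
    """Hypothetical stand-in for CausalModel; only the cache and accessor are modeled."""

    def __init__(self):
        # Assumed shape of the cache: method name -> estimator object.
        self._estimator_cache = {}

    def get_estimator(self, method_name=None):
        """Return the cached estimator for method_name, or None if it is absent.

        With method_name=None, return the first cached estimator
        (dicts preserve insertion order), or None if the cache is empty.
        """
        if method_name is None:
            return next(iter(self._estimator_cache.values()), None)
        # dict.get returns None for a missing key instead of raising KeyError.
        return self._estimator_cache.get(method_name)


model = CausalModelSketch()
# A placeholder string stands in for a fitted estimator in this sketch.
model._estimator_cache["backdoor.linear_regression"] = "fitted-estimator-placeholder"

first = model.get_estimator()            # first cached estimator
missing = model.get_estimator("absent")  # None, no exception raised
```

Returning None for a missing name keeps calling code simple, at the cost of hiding typos in method_name; raising KeyError, as in the basic version above, is the stricter alternative.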

I've tested this and it does work - the estimators are retained and the lifespan of the objects in the cache generally makes sense (potential confusion around which estimator matches which result but I think this would be rare).

Describe alternatives you've considered
An alternative would be to return the estimator used in estimate_effect(), but this probably breaks compatibility with a large amount of existing user code.

Additional context
I can implement and submit a PR if the change is desired.

If this is added, I could also modify an example to extract model coefficients and show how we can explore them in a plot.

Thanks for the suggestion, @drawlinson. Currently, you can access the estimator after each estimate_effect call.

effect = model.estimate_effect(...)
effect.estimator  # returns the fitted CausalEstimator object

Does the above code provide one way to address the issue? That said, I do see the convenience and clarity of your proposal for the end-user. It can be used to compare the different estimators on the same causal model. Feel free to issue a PR and I can review.

I hadn't spotted the estimator object there! In fact, I'd used it in the past and just forgot about it. I will make a PR, but the method you've pointed out does solve the problem, so the change I'm proposing is only a convenience.

Apologies for the delay. PR here: #1113