probtorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Home Page: http://pytorch.org

Provide programmatic access to param names and values

fritzo opened this issue

@ragulpr pointed out a need for programmatic access to params.

Maybe we can implement this as a .param_values property that returns an OrderedDict from names to current values. Note that .params currently over-specifies possible parameters and does not serve this purpose.

class Normal(ExponentialFamily):
    @lazy_property
    def param_values(self):
        return OrderedDict([
            ('loc', self.loc),
            ('scale', self.scale),
        ]) 
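
For reference, usage would then look something like this (a minimal sketch assuming the Normal class above; the exact printed values are only illustrative):

d = Normal(torch.tensor(0.), torch.tensor(1.))
d.param_values
# OrderedDict([('loc', tensor(0.)), ('scale', tensor(1.))])
for name, value in d.param_values.items():
    print(name, value)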

This is still subtle to use since some params are not tensors, e.g. Multinomial takes an int n.

Do we have any particular standardization for how attributes of Distributions are named, relative to the names in .params?

Just thinking that for many (most?) distributions, the keys in .params could actually be used directly to get the values needed for this dict. It would be nice if a default implementation could cover most distributions, and then we would only need to explicitly implement .canonical_params for those distributions which have over-specified params dicts.

There is no such standard and I believe there cannot be. This is difficult because some distributions take multiple parameterizations (e.g. probs vs logits for Categorical, covariance_matrix vs scale_tril for MultivariateNormal). Not only are these params lazily constructed, but they should all be specified in the .params dict (some are currently missing and this is a bug). Thus a naive implementation of .param_values that simply inspected .params would overparameterize some distributions, e.g. specifying both logits and probs for Categorical.
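
For concreteness, Categorical lazily derives whichever parameterization it was not given, so reading every declared param off the instance pulls in both (a small illustration; the comments describe assumed behavior):

c = Categorical(probs=torch.tensor([0.25, 0.75]))
c.probs    # the parameterization that was passed in
c.logits   # lazily computed from probs; a naive .param_values would report both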

Ah, sorry, I wasn't clear; I actually meant for the values themselves. For example, for the normal distribution, .params has keys 'loc' and 'scale', and the values themselves are at self.loc and self.scale.

So in theory, in the example you gave above, we could just loop over .params.keys() and build the dict with p[key] = getattr(self, key).

This doesn't work for either Categorical or MultivariateNormal, as you say. But I think it could work for a lot of them if our naming of params and attributes were consistent, and we would just have to implement canonical_params specifically for those few outliers.
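
A rough sketch of what that could look like (a hypothetical default plus a per-distribution override; the canonical_params name follows the suggestion above, everything else here is assumed and not existing API):

from collections import OrderedDict

from torch.distributions import constraints


class Distribution(object):
    params = {}

    @property
    def canonical_params(self):
        # Default: assume each key in .params has a same-named attribute.
        return OrderedDict((name, getattr(self, name)) for name in self.params)


class Categorical(Distribution):
    # Over-specified: declares both parameterizations in .params.
    params = {'probs': constraints.simplex, 'logits': constraints.real}

    @property
    def canonical_params(self):
        # Outliers override to pick a single canonical parameterization.
        return OrderedDict([('probs', self.probs)])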

I've been working with something like this:

from numbers import Number

import torch
from torch.distributions import constraints
from torch.distributions.utils import broadcast_all


class Pareto(PositiveDistribution):
    r"""Pareto distribution parameterized by `scale` and `shape`."""
    params = {'scale': constraints.positive, 'shape': constraints.positive}
    support = constraints.positive
    has_rsample = True

    def __init__(self, scale, shape, is_discrete=False):
        self.scale, self.shape = broadcast_all(scale, shape)
        # Keep a positional view of the parameters alongside the named attributes.
        self.params_tuple = (self.scale, self.shape)
        batch_shape = torch.Size() if isinstance(scale, Number) else self.scale.size()
        super(Pareto, self).__init__(batch_shape, is_discrete=is_discrete)

And then you can initialize it as

params_tuple = (torch.ones(1), torch.ones(1))
dist = Pareto(*params_tuple, **distkwargs)

I rarely find myself accessing the parameters by name (e.g. dist.scale), since I'm working with many distributions that have different parameter names. I think this is an important abstraction. Also, the more we can sideline having to name parameters, the safer we are, as this is a religious topic 😄. What is the need for a dict vs a tuple?

  • I'm not sure about the level of thread-safety of this approach.
  • I'm not sure about the performance cost of looping as proposed by @tbrx.