planetarypy / pvl

Python implementation of PVL (Parameter Value Language)

What do we need to do to release 1.0.0?

rbeyer opened this issue · comments

What criteria should we use to set the bar for the 1.0.0 release? The alpha versions have been out there for a few months, but I'm not sure if anyone is using them, and so not sure that just having time pass is a validation of any kind.

So I'd like to see some thoughts on what else might need to be tested before we mint the 1.0.0 release and unleash it on PyPI and conda?

If we can't come up with any, then I suggest we just do it, as that will flush out other bugs from the users.

Any reasons not to release a 1.0.0 sooner rather than later revolve around whether you expect to change the API in a significant way; if you do, then just cut 2.0.0 or whatever semver version is needed. I say go for it; if we end up at version 36 by the end of the month, then so be it, it's just a number.

That's my feeling, too. I'm just nervous about the 1.0.0 bump, but as you point out, I don't think anyone expects perfection from a *.0.0 rev, just something different, and hopefully better.

If we can't come up with any, then I suggest we just do it, as that will flush out other bugs from the users.

I think that's a fitting approach for a small user base like ours. If we had thousands of users, we'd be carefully checking code coverage and using other tools to make sure we had our bases covered. In our case, the few users are our testers, and I think that's fine.

So I sat down to cook up the PR for a 1.0.0 release, and had some misgivings. There is one potentially major change that could be made, and one minor one.

Major: Change Returned dict-like type
One of the things that I didn't change with the new architecture was the contents of pvl/_collections.py. That module defines an OrderedMultiDict class, which is the core dict-like type that the pvl library builds and then returns via pvl.load() and pvl.loads(). The classes PVLModule, PVLGroup, and PVLObject are mostly just thin wrappers around that core OrderedMultiDict class.
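To see why a multi-valued dict is needed at all, consider that PVL allows the same key (e.g. repeated GROUP blocks) to appear more than once at the same level. A toy sketch, not pvl's actual implementation:

```python
# Toy sketch (not pvl's actual code): a plain dict silently drops
# data when a PVL label repeats a key.
pairs = [("INSTRUMENT", "CAM1"), ("INSTRUMENT", "CAM2")]

plain = dict(pairs)  # the second value overwrites the first
assert plain == {"INSTRUMENT": "CAM2"}

# A multi-dict keeps every occurrence, which is the behavior that
# OrderedMultiDict (and the multidict library) provide.
multi = {}
for key, value in pairs:
    multi.setdefault(key, []).append(value)
assert multi["INSTRUMENT"] == ["CAM1", "CAM2"]
```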

Admittedly, there wasn't anything like that in the Standard Library, and so it is impressive that Trevor implemented a custom version of this more than 5 years ago.

However, there is a quite stable and robust 3rd party library for just such a type: multidict. And so, I've been wondering about changing to it for pvl.

Do we need to change? Not particularly. The pvl._collections.OrderedMultiDict works, it works with the modern architecture, it works under Python 3.6, all of our tests pass.

Pros:

  • One less (potentially gnarly) hunk of code for us to maintain under the pvl roof.
  • Potential benefits as that library advances, and possible familiarity benefits for users who already know it.

Cons:

  • Some methods that pvl users are used to would change (e.g. getlist() would become getall()), but there aren't many of these. This has the potential to cause some breakage when users upgrade; we can cushion it by still providing the PVLModule, PVLGroup, and PVLObject classes and keeping the older methods with a deprecation warning.
  • This would introduce a required package. I think it is pretty awesome that pvl doesn't have any dependencies beyond the Python Standard Library, but in the grand scheme of things this seems like a pretty skinny dependency, as multidict itself doesn't have any upstream dependencies.

I'm not hard over on this, because I haven't done any testing, and I'm not sure how easy or difficult this would end up being.

Minor: Black formatting
Okay, first, I didn't use a linter. Then I started using flake8, and now, mostly because I know that Michael and Andrew use it, I started experimenting with the Black code formatter, and have embraced using it in other projects. Should we 'Black' the pvl code? Any reason not to?

How do you work that into 'contribution' guidelines? Do you include a make target for it? Integrate it into the 'lint' target so it runs black and then flake8?

I think it is entirely justifiable to switch to multidict because it lowers our maintenance burden, and the use of deprecation warnings is okay. Alternatively, you could use a mixin to add those functions to the multidict package object, with deprecation warnings in the mixin functions.
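The mixin approach could be sketched roughly like this. The classes below are illustrative stand-ins, not pvl's or multidict's real implementations; the sketch assumes the old getlist() returned an empty list for a missing key, while multidict's getall() raises KeyError:

```python
import warnings


class _ToyMultiDict:
    """Minimal stand-in for multidict.MultiDict, for illustration only."""

    def __init__(self):
        self._items = []

    def add(self, key, value):
        self._items.append((key, value))

    def getall(self, key):
        values = [v for k, v in self._items if k == key]
        if not values:
            raise KeyError(key)
        return values


class GetlistCompat:
    """Mixin restoring the legacy getlist() name with a DeprecationWarning."""

    def getlist(self, key):
        warnings.warn(
            "getlist() is deprecated; use getall() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        try:
            return self.getall(key)
        except KeyError:
            return []  # assumed legacy behavior for a missing key


class PVLModule(GetlistCompat, _ToyMultiDict):
    """Old class name kept so existing user code still imports and runs."""
```

Old code calling module.getlist("NAME") keeps working but emits a DeprecationWarning, while new code can move to getall().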

My feeling is that data structures tend to be fairly generic, so they don't really belong in a more focused library, because they are useful generically. For example, it wouldn't be weird for someone to use pvl to read a PVL file, but it would be if they just wanted a multidict class to use.

There is a trade space in terms of adding dependencies, but we don't need to worry about that for this issue.

As for black/flake8: I don't use flake8, but I found this plugin, https://pypi.org/project/flake8-black/, that should allow this to work. Between flake8, tox, and pytest, it should be possible to integrate the Black checks as a test, so that anyone who needs their tests to pass must also pass all pep8/Black checks.
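One possible wiring (a sketch, assuming the flake8-black plugin linked above is installed): the plugin surfaces Black formatting drift as BLK-prefixed flake8 errors, so whatever already runs flake8 (a Makefile lint target, tox, or a pytest plugin) picks up the Black check for free. A couple of flake8 settings are commonly used so the two tools don't fight:

```ini
# setup.cfg sketch -- flake8-black needs no configuration beyond
# being installed; these settings just align flake8 with Black.
[flake8]
max-line-length = 88    # Black's default line length
extend-ignore = E203    # Black's slice spacing is flagged by E203
```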

As a consumer of this library, I would really like documentation on what changed in the API. It doesn't need to be extensive, but at least a listing of classes that were added, removed, or had their API changed. That way I know where things in my code are likely to need updating. For example, I have some code that uses a custom decoder to handle some non-standard comments.

That's a good point. There is some high level information in the HISTORY, but it isn't at the granularity that you're talking about.

And the module docs aren't sufficient, because you want a 'diff' not just the new way?

@jessemapel in my above comment I talked about granularity, but you really asked for an overview. I made a stab at addressing this in the #68 HISTORY.rst entry. Can I get some feedback about whether that satisfies your need, or if you are looking for something different?

You asked for a listing of changed classes, and I did that for the classes and interfaces that are exposed to the user. However, almost everything under the hood (which we don't usually expect users to interact with) related to parsing and encoding was overhauled at a fundamental level, so documenting all of that would be pretty overwhelming; for that, I'd refer you to the module documentation. Most (if not all) of the new modules, classes, and functions now have docstrings, which are surfaced via readthedocs.

I think the history document is sufficient for what I want. What's the plan for keeping that document up to date? Will it be modified as part of the release process or on each PR?

Ticking over to 1.0.0 was a longer process than I expected. During this process, for every PR (which were the "alpha.N" entries) I did end up making a separate entry (this made it easier to refer to this or that functionality that we had enabled as of alpha.2 or alpha.5 or whatever). My anticipation is that for future releases, as PRs land, they will simply add a bullet point to an "accumulating" list in the HISTORY file, and when we bake a release, all of those accumulated items will be grouped under the release.

My anticipation is that for future releases, as PRs land, they will simply add a bullet point to an "accumulating" list in the HISTORY file, and when we bake a release, all of those accumulated items will be grouped under the release.

Sounds like a solid policy; you should add that to the contributing docs.

Good point. I went to do so, and realized that we already had some language to that effect in our PR Guidelines.