Inserting key/value pairs in PDS label
andywinhold opened this issue
Is there a way to insert desired key/value lines at a specific point in a PDS label?
ex:
from
PDS_VERSION_ID =
DD_VERSION_ID =
LABEL_REVISION_NOTE =
^IMAGE
to
PDS_VERSION_ID =
DD_VERSION_ID =
LABEL_REVISION_NOTE =
DATA_SET_ID =
PRODUCT_ID =
INSTRUMENT_HOST_NAME =
^IMAGE
There's nothing built in; however, you could use label.items() to mutate the label:
>>> import pvl
>>> label = pvl.loads("""
... PDS_VERSION_ID = a
... DD_VERSION_ID = b
... LABEL_REVISION_NOTE = c
... ^IMAGE = d
... """)
>>> items = label.items()
>>> extra = [
... ('LABEL_REVISION_NOTE', 'more'),
... ('DATA_SET_ID', 'more'),
... ('PRODUCT_ID', 'more'),
... ('INSTRUMENT_HOST_NAME', 'more'),
... ]
>>> new_label = pvl.PVLModule(items[:3] + extra + items[3:])
>>> print(pvl.dumps(new_label))
PDS_VERSION_ID = a
DD_VERSION_ID = b
LABEL_REVISION_NOTE = c
LABEL_REVISION_NOTE = more
DATA_SET_ID = more
PRODUCT_ID = more
INSTRUMENT_HOST_NAME = more
^IMAGE = d
END
The way the IDL code handled this situation was to provide insert_before() and insert_after() functions. So it could be possible to implement something that works like this:
label.insert_after('LABEL_REVISION_NOTE', extra)
There are issues here with disambiguating duplicate keys (if that's technically even allowed) and with addressing nested keys.
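As a rough illustration of the idea, here is a minimal sketch that works on a plain list of (key, value) pairs rather than on a pvl object. The helper names find_key, insert_before and insert_after are hypothetical and not part of pvl; the sketch sidesteps the duplicate-key question by always using the first match, and it ignores nested keys.

```python
def find_key(items, key):
    """Return the position of the first pair whose key matches."""
    for position, (existing_key, _) in enumerate(items):
        if existing_key == key:
            return position
    raise KeyError(key)


def insert_after(items, key, new_items):
    """Return a new list with new_items placed right after the first `key`."""
    position = find_key(items, key) + 1
    return items[:position] + list(new_items) + items[position:]


def insert_before(items, key, new_items):
    """Return a new list with new_items placed right before the first `key`."""
    position = find_key(items, key)
    return items[:position] + list(new_items) + items[position:]


items = [
    ("PDS_VERSION_ID", "a"),
    ("DD_VERSION_ID", "b"),
    ("LABEL_REVISION_NOTE", "c"),
    ("^IMAGE", "d"),
]
extra = [("DATA_SET_ID", "more"), ("PRODUCT_ID", "more")]

# The original list is left untouched; a new list is returned.
result = insert_after(items, "LABEL_REVISION_NOTE", extra)
```

The resulting list of pairs could then be handed to pvl.PVLModule(...) in the same way as in the example above.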
Edited - I accidentally said items instead of label, changed it to label.
It might be more pythonic to implement some of the list interface (i.e. label.insert(index, object), label.index(object, [start, [end]]) etc.). Also slicing might be nice.
@wtolson, in the case of label.insert(index, object), what would index be?
@godber for the builtin list, index is the int position to insert the object:
>>> l = [1, 2, 3]
>>> l.insert(1, 42)
>>> print(l)
[1, 42, 2, 3]
While it would be good to at least replicate this, it might be interesting to also overload it for a str index, similar to your insert_after()/insert_before(). The only issue I see is that it may be confusing if there are multiple items with the same key.
@wtolson and @godber I really appreciate the help.
I like the original approach, but if I'm right, pvl.loads() requires that you know the initial string, especially values a, b, c and d (in this case).
I'm using this in tandem with planetaryimage's PDS3Image(), which populates an initial label for you. Since those initial label values vary from product to product, the pvl.loads() string would have to be adjusted for each new product. Or so I understand it. For example:
IMAGE
MAXIMUM = ___
MEAN = ___
MEDIAN = ___
@pbvarga1 suggested another approach (for this case): instantiating the PDS3Image() class inside my code and making edits to the pvl module created in PDS3Image._create_label().
To handle the ambiguities with multiple items with similar keys we could use an optional argument to indicate before/after the first, second, third, ... all cases of the string index. The default, I think, would be before/after the first instance. For example,
# Default is after the first instance
label.insert_after('LABEL_REVISION_NOTE', extra)
# After the third (second?) instance
label.insert_after('LABEL_REVISION_NOTE', extra, instance=2)
# After every instance
label.insert_after('LABEL_REVISION_NOTE', extra, every_instance=True)
I'm not sure about the argument names, or whether it would be better to use index counting or cardinal counting. Index counting would be more pythonic, but cardinal would be more... human? Either way, I think this will help quell the ambiguities.
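To make the proposal concrete, here is a minimal sketch on a plain list of (key, value) pairs. It is not pvl code: the function is hypothetical, the parameter names instance and every_instance are simply taken from the example calls above, and zero-based (index) counting is assumed, so instance=2 means the third occurrence.

```python
def insert_after(items, key, new_items, instance=0, every_instance=False):
    """Return a new list with new_items inserted after occurrences of `key`.

    instance is zero-based (an assumption); every_instance=True inserts
    after every occurrence and ignores `instance`.
    """
    matches = [i for i, (k, _) in enumerate(items) if k == key]
    if not matches:
        raise KeyError(key)
    # An out-of-range `instance` raises IndexError here.
    targets = matches if every_instance else [matches[instance]]

    result = list(items)
    # Work from the back so earlier positions stay valid while inserting.
    for position in reversed(targets):
        result[position + 1:position + 1] = new_items
    return result


items = [("NOTE", "a"), ("X", "1"), ("NOTE", "b"), ("NOTE", "c")]
extra = [("NEW", "n")]

first = insert_after(items, "NOTE", extra)
third = insert_after(items, "NOTE", extra, instance=2)
every = insert_after(items, "NOTE", extra, every_instance=True)
```

Switching to cardinal counting would only change the lookup to matches[instance - 1], so the choice is mostly about which convention users will find less surprising.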
@wtolson On Python 3, label.items() returns an ItemsView, which is not subscriptable:
In [1]: import pvl
In [2]: label = pvl.loads("""
...: foo = bar
...: monty = python
...: """)
In [3]: items = label.items()
In [4]: items[0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-95f461411437> in <module>()
----> 1 items[0]
TypeError: 'ItemsView' object does not support indexing
So either ItemsView will have to change, or there needs to be a different way to access self.__items.
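One possible workaround, without touching the view class, is to copy the view into a list before slicing. The sketch below uses a standard-library OrderedDict as a stand-in, since its items() view is also not subscriptable on Python 3; that pvl's ItemsView can be copied with list() in the same way is an assumption based on it being an iterable view.

```python
from collections import OrderedDict

# Stand-in for a parsed label; not a pvl object.
label = OrderedDict([("foo", "bar"), ("monty", "python")])

view = label.items()
try:
    view[0]
    indexable = True
except TypeError:
    # Views are iterable but do not support indexing or slicing.
    indexable = False

# Copying into a list gives ordinary indexing and slicing.
items = list(label.items())
first = items[0]
rebuilt = OrderedDict(items[:1] + [("spam", "eggs")] + items[1:])
```

With that assumption, the earlier example would probably only need items = list(label.items()) to run on both Python 2 and Python 3.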
@pbvarga1 TypeError: 'ItemsView' object does not support indexing should probably be a separate ticket.
@andywinhold maybe this whole issue should be on PDS3Image instead. Higher level insert() capability could be added there. Though maybe pvl should at least support insert() by integer.
If I recall correctly, there is some fundamental issue with not being able to edit the label on the PDS3Image object.
@godber I'll look at PDS3Image and see what I can come up with. It's good to know there may already be an issue with editing.
Yeah, I agree that from the user perspective, they are most often thinking in terms of "after label X" or "before label Y". In fact, from the user perspective, it's hard to actually even know the numerical index. In a 400 line PDS label, just figuring out the integer index is a hassle.
Ok, so design wise, this has me thinking the following things:
- implementing insert() with the pythonic interface seems like a sensible thing to do
- users still need a method of doing inserts in a relative manner.
So assume you've implemented insert(), would we then provide the higher level functionality by
- implementing insert_before() and insert_after() methods on top of insert()?
- add optional keyword arguments to insert()?
- add the higher level functionality EXTERNAL from pvl, perhaps in PlanetaryImage?
I think there is real need for these relative insert capabilities directly in pvl. @wtolson you have any spare cycles to comment?
Also, @pbvarga1, I suspect you could look at your existing PR and see that with some refactoring you have already (accidentally?) implemented insert()... I have NOT actually looked at the code.
I like having insert_after and insert_before be their own methods. It will make code that uses these methods easier to read, and the methods easier to remember how to use. Then insert only deals with integer indexes.
However, I could overload insert(key, object) by including boolean keywords like relative=False and before=True. Then if relative is True or key is a string, we use insert_before. Otherwise we assume key is an int and insert in place. The code in the PR could pretty easily be refactored to include insert like this.
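For discussion, here is a minimal sketch of what such an overloaded insert could look like. PairList is a hypothetical stand-in container, not the pvl class and not the code from the PR. It also drops the relative keyword, on the assumption that checking whether key is a str is enough to pick relative mode, and unlike list.insert it takes a list of pairs rather than a single object.

```python
class PairList:
    """Hypothetical ordered (key, value) container used only for illustration."""

    def __init__(self, pairs=()):
        self._pairs = list(pairs)

    def index(self, key):
        """Return the position of the first pair with the given key."""
        for position, (existing_key, _) in enumerate(self._pairs):
            if existing_key == key:
                return position
        raise KeyError(key)

    def insert(self, key, new_pairs, before=True):
        """Insert at an int position, or relative to the first matching str key.

        `before` only matters when `key` is a str.
        """
        if isinstance(key, str):
            position = self.index(key) + (0 if before else 1)
        else:
            position = key
        self._pairs[position:position] = list(new_pairs)

    def insert_before(self, key, new_pairs):
        self.insert(key, new_pairs, before=True)

    def insert_after(self, key, new_pairs):
        self.insert(key, new_pairs, before=False)

    def keys(self):
        return [k for k, _ in self._pairs]


label = PairList([("A", 1), ("B", 2), ("C", 3)])
label.insert(1, [("X", 0)])          # integer position, like list.insert
label.insert_after("B", [("Y", 0)])  # relative to a key
label.insert("A", [("Z", 0)])        # str key defaults to "before"
```

Keeping insert_before and insert_after as thin wrappers means the readable names stay available while all of the positioning logic lives in one place.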