amoffat / sh

Python process launching

Home Page: https://sh.readthedocs.io/en/latest/

Support binding a callable with a baked command

supriyo-biswas opened this issue · comments

First of all, I'd like to say thanks for this module, it's a lifesaver when working with stuff that only provides command line tools.

Sometimes, I'd like to bind a callable so that I can easily process the output of a command without repetitive boilerplate.

As an example, I'm currently working with the gcloud CLI, where I do stuff like this:

iam = sh.gcloud.iam.bake(format='json')
list_of_roles = json.loads(str(iam.roles.list(show_deleted=True, project='my-project')))

Having to do stuff like json.loads(str(...)) for every invocation gets repetitive quickly, and it also makes stuff a little more difficult to reason about.

Therefore, I'd like to have a feature to bind a callable such that callable(RunningCommand()) is returned. Usage of the API might look like this:

iam = sh.gcloud.iam.bake(format='json', _callable=lambda x: json.loads(str(x)))
list_of_roles = iam.roles.list(show_deleted=True, project='my-project') # returns a list

Thanks for bringing this up. I can relate to your issue and have had a similar experience with the repetition.

There is another feature I have been thinking about, called "shims" that would do something similar to what you describe. Basically, anyone in the community could write a "shim package" plugin, named something like sh-shim-gcloud. It would contain input and output shims that could be hooked into the underlying commands.

So for example, if you had the sh-shim-gcloud package installed, you would be able to call sh.gcloud.iam.roles.list(...) and if the format='json' arg was detected by the shim, it would automatically deserialize to python objects. Similarly, it could detect format='yaml'.

It's a very general solution, but it would allow community members to add their own shim packages, and sh would pick them up and use them automatically.
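
To make this concrete, here is a purely hypothetical sketch of what an output shim might register. No shim API exists in sh today, so every name below is made up for illustration:

import json

# Hypothetical sh-shim-gcloud output shim; none of this API exists in sh yet
def gcloud_output_shim(running_command):
    # if the underlying gcloud invocation asked for JSON, deserialize its output
    if b'--format=json' in running_command.cmd:
        return json.loads(str(running_command))
    return running_command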

We can leave this issue open to invite more discussion around this topic.

I have a feeling we're special-casing something that could be solved with standard Python customization patterns. With a small patch to sh you could do:

import json
import sh

class MyRunningCommand(sh.RunningCommand):
    def json(self):
        return json.loads(str(self))

sh.Command.RunningCommandCls = MyRunningCommand

echo = sh.Command("echo")
res = echo('{"a": 123}').json()
print(res, type(res))  # {'a': 123} <class 'dict'>

There's a small caveat in that you need to use the sh.Command() indirection, due to the way SelfWrapper handles imports.

The nice thing is that it allows customization of other aspects of RunningCommand, including things we haven't thought of yet.
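
For instance (a sketch only, reusing the same RunningCommandCls hook; the class and method names here are illustrative), the subclass could just as well grow helpers that have nothing to do with JSON:

import sh

class LinesRunningCommand(sh.RunningCommand):
    def lines(self):
        # split the decoded stdout into a list of lines
        return str(self).splitlines()

sh.Command.RunningCommandCls = LinesRunningCommand

ls = sh.Command("ls")
print(ls("-1").lines())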

The patch, for reference:

--- a/sh.py
+++ b/sh.py
@@ -1157,6 +1157,7 @@ class Command(object):
     RunningCommand object, which represents the Command put into an execution
     state. """
     thread_local = threading.local()
+    RunningCommandCls = RunningCommand
 
     _call_args = {
         "fg": False,  # run command in foreground
@@ -1517,7 +1518,7 @@ class Command(object):
         if output_redirect_is_filename(stderr):
             stderr = open(str(stderr), "wb")
 
-        return RunningCommand(cmd, call_args, stdin, stdout, stderr)
+        return self.__class__.RunningCommandCls(cmd, call_args, stdin, stdout, stderr)

@supriyo-biswas I just committed the above patch. Would my suggested solution work for you?

With the second PR, the standard case without the sh.Command() indirection also works:

import json
from sh import RunningCommand, Command, echo


class MyRunningCommand(RunningCommand):
    def json(self):
        return json.loads(str(self))


Command.RunningCommandCls = MyRunningCommand

res = echo('{"a": 123}').json()
print(res, type(res))  # {'a': 123} <class 'dict'>

There are other commands for which I'd like the ability to get their raw output. If I understand correctly, the current proposal is an all-or-nothing deal?

Nope, this will work just like before in the normal case. But if you want the output transformed to JSON, you call the .json() method.
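
A quick sketch building on the MyRunningCommand example above (the literal output is only illustrative):

out = echo('{"a": 123}')
print(str(out))    # plain text, exactly as before
print(out.json())  # opt-in parsing: {'a': 123}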

In my opinion, the .json() method is still repetitive boilerplate; my aim for this feature is something that feels natural, as if the externally called command were a part of Python itself.

However, I would understand if you want to take the project in a different direction. The PRs that I made continue to work for me :)

Coming from C# and Java, I'm not sure I would call the 7 chars in .json() "boilerplate", compared to the alternative _callable=.... Anyway, if you really want to avoid an extra explicit call on every invocation, you could move the format detection into the running command class itself, e.g. as a property (the name value below is only illustrative):

import json
from sh import RunningCommand, Command, some_command

class MyRunningCommand(RunningCommand):
    @property
    def value(self):
        res = str(self)
        # auto-detect JSON output from the arguments the command was run with
        if b'--format=json' in self.cmd:
            return json.loads(res)
        return res


Command.RunningCommandCls = MyRunningCommand

res = some_command('{"a": 123}', format="json")
print(res.value, type(res.value))

@ecederstrand one concern I have is manipulating the global Command.RunningCommandCls from potentially different places. Not only could there be a race condition, but different submodules could compete and overwrite that class type. There would need to be a way to scope that change locally. We have at least one mechanism for scoping sh objects (execution contexts), but it will add to the complexity.

Revisiting this thread again, @supriyo-biswas your original suggestion is ergonomic but the complication with it is that output from a subprocess is not produced at once... it is streamed in asynchronously, in different chunk sizes, and put into different buffers. The pseudocode from your original suggestion implies that all of the data has been received and the process has ended, but it would need to account for the streaming cases somehow. Do you have some thoughts on how this post processor should handle the streaming case?
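
For context, this is roughly what the streaming case looks like today with sh's _out callback (a sketch only; the gcloud arguments are borrowed from the example at the top of this issue):

import json
import sh

chunks = []

def collect(chunk):
    # sh calls this repeatedly with pieces of stdout while gcloud is still running
    chunks.append(chunk)

sh.gcloud.iam.roles.list(show_deleted=True, project='my-project',
                         format='json', _out=collect)

# only after the process has exited is the complete document available to parse
roles = json.loads(''.join(chunks))

A _callable post processor would need to decide whether it runs once at the end, like the join above, or incrementally on each chunk.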

@amoffat I understand your concern and agree that messing with global state is problematic.

It is actually possible to change the RunningCommandCls of just the execution context:

import sh

sh2 = sh()

class A(sh.RunningCommand):
    def __str__(self):
        return f"In {self.__class__}"

class B(A):
    pass

sh.Command.RunningCommandCls = A
sh2.Command.RunningCommandCls = B
print(sh.echo())
print(sh2.echo())

which outputs:

In <class '__main__.A'>
In <class '__main__.B'>

@ecederstrand oh, great! That solves my concern. It sounds like this issue is solved then, yeah? @supriyo-biswas if you'd like to work on your original suggestion, feel free to start a PR. I think there is no reason why both solutions can't exist.