vatlab / sos

SoS workflow system for daily data analysis

Home Page:http://vatlab.github.io/sos-docs

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Option to hide `stdout` for non-task

gaow opened this issue · comments

commented

I seem to recall we have talked about this before -- for non-tasks, is there a way to hide, or better, redirect stdout / stderr elsewhere just like for task? The output of some programs are so overwhelming that it drastically slows down the entire process because of huge amount of text printed to the terminal ...

commented

-v1 would suppress output of script execution...

commented

-v1 would suppress output of script execution...

Indeed, but not stderr unless I use -v0. Also because there are many concurrent processes going on, ideally if we can configure stderr and stdout to write to files it would be easier to check them later.

commented

Looks to me that both stdout and stderr are suppressed (code).

commented

Actually v1 did not suppress stderr (code).

commented

The line you pointed to is for task. this line is for regular execution, which says,

stderr=None if env.verbosity > 0 else subprocess.DEVNULL,
                                         stdout=None if env.verbosity > 1 else subprocess.DEVNULL

So v1 shows stderr but not stdout. This makes sense because stderr is meant for errors and should have higher priority.

commented

I agree. But unfortunately some programs prints debug information or warning messages to stderr which can be a bit overwhelming. If there is really an error then the step should fail and throw an exception anyways. For R and python packages it is possible to handle warning messages in the code, but not for bash executables. For such cases I was hoping that SoS can redirect the stderr output to a file called {_output}.log via some options, eg:

R: stdout = f'{_output}.out', stderr = f'{_output}.err'
 ...

Not sure if it is a good idea, but it looks useful to me. After all the only reason to use non-task mode is performance of job submission; that often means there involve lots of jobs, and such messages will easily get overwhelming.

commented

Sounds like a good idea ...

commented

but wondering what should happen for

R: stdout = f'{_output}.log', stderr = f'{_output}.log'
 ...

And for

input: for_each ...
R: stdout='samefile.log'
...

Should we let users take the risk?

commented

Should we let users take the risk?

I guess it is fair enough that users take the risk. And we make it w mode not a for append? I am thinking of scenarios when there are more than 2 actions in the same step like #880; but maybe users should just name them differently ...

commented

I am not sure about the w and a part because stdout and stderr are streams and are supposed to be in a mode.

commented

I see. I guess my question is whether or not to remove those files, if exist, before running the code that will write to them, thus preventing repeated runs adding too much to it. I assumed this is natural to do. But the corner case would be :

R: stdout = f'{_output}.log', stderr = f'{_output}.log'
 ...
python: stdout = f'{_output}.log', stderr = f'{_output}.log'
...

So removing existing f'{_output}.log' should only happen at the first action, not the 2nd one. Then after that everything should append.

commented

I used ab to open file without any error checking. Please test.

commented

These options right now only accept a file, but users might want to use stdout=subprocess.DEVNULL.

commented

Great! It works perfectly.

These options right now only accept a file, but users might want to use stdout=subprocess.DEVNULL.

Right, I think it would be useful to have. Then it'd be stdout = None?

commented

The problem is that stdout=None is the subprocess.Popen default for standard output.

commented

Right. But from my reading of the code we are parsing the options anyways. So can we change the behavior such that we keep the default behavior if stdout/stderr is not seen in kwargs, and when seen None will get translated to DEVNULL? Or other keywords such as /dev/null will do, too?

commented

We can certainly implement\ the behavior (missing as standard, None as DEVNULL) but this is not very pythonic so I am trying to see if there can be a better even more general method (e.g. can handle non-path object like DEVNULL and allow users to pass arbitrary object with write attribute).

But I agree that stdout=None is the most obvious method for users.

commented

We actually have another choice, stdout=False, which seems to translate better to no standard output.

commented

Okey, I changed it from None to False, and added a line in the error message prompt to point to the error file location. Should I push the patch below and document? I do not think it impacts unit test as far as I can tell.

diff --git a/src/sos/actions.py b/src/sos/actions.py
index 4b28e5a6..38d99d51 100644
--- a/src/sos/actions.py
+++ b/src/sos/actions.py
@@ -323,14 +323,14 @@ class SoS_ExecuteScript:
                             stderr=subprocess.PIPE, bufsize=0)
                         out, err = child.communicate()
                         if 'stdout' in kwargs:
-                            if kwargs['stdout'] is not None:
+                            if kwargs['stdout']:
                                 with open(kwargs['stdout'], 'ab') as so:
                                     so.write(out)
                         else:
                             sys.stdout.write(out.decode())
 
                         if 'stderr' in kwargs:
-                            if kwargs['stderr'] is not None:
+                            if kwargs['stderr']:
                                 with open(kwargs['stderr'], 'ab') as se:
                                     se.write(err)
                         else:
@@ -343,7 +343,7 @@ class SoS_ExecuteScript:
                 elif '__std_out__' in env.sos_dict and '__std_err__' in env.sos_dict:
                     if 'stdout' in kwargs or 'stderr' in kwargs:
                         if 'stdout' in kwargs:
-                            if kwargs['stdout'] is None:
+                            if kwargs['stdout'] is False:
                                 so = subprocess.DEVNULL
                             else:
                                 so = open(kwargs['stdout'], 'ab')
@@ -353,7 +353,7 @@ class SoS_ExecuteScript:
                             so = subprocess.DEVNULL
 
                         if 'stderr' in kwargs:
-                            if kwargs['stderr'] is None:
+                            if kwargs['stderr'] is False:
                                 se = subprocess.DEVNULL
                             else:
                                 se = open(kwargs['stderr'], 'ab')
@@ -379,7 +379,7 @@ class SoS_ExecuteScript:
                         ret = p.wait()
                 else:
                     if 'stdout' in kwargs:
-                        if kwargs['stdout'] is None:
+                        if kwargs['stdout'] is False:
                             so = subprocess.DEVNULL
                         else:
                             so = open(kwargs['stdout'], 'ab')
@@ -389,7 +389,7 @@ class SoS_ExecuteScript:
                         so = subprocess.DEVNULL
 
                     if 'stderr' in kwargs:
-                        if kwargs['stderr'] is None:
+                        if kwargs['stderr'] is False:
                             se = subprocess.DEVNULL
                         else:
                             se = open(kwargs['stderr'], 'ab')
@@ -412,11 +412,12 @@ class SoS_ExecuteScript:
                         debug_args = '{filename:q}'
                     else:
                         debug_args = self.args
-                    cmd = interpolate(f'{self.interpreter} {debug_args}',
+                    cmd = interpolate(f'{self.interpreter.strip()} ``{debug_args}``',
                                       {'filename': sos_targets(debug_script_file), 'script': self.script})
-                    raise RuntimeError('Failed to execute commmand ``{}`` (ret={}, workdir={}{})'.format(
+                    raise RuntimeError('Failed to execute commmand "{}" (ret={}, workdir={}{}{})'.format(
                         cmd, ret, os.getcwd(),
-                        f', task={os.path.basename(env.sos_dict["__std_err__"]).split(".")[0]}' if '__std_out__' in env.sos_dict else ''))
+                        f', task={os.path.basename(env.sos_dict["__std_err__"]).split(".")[0]}' if '__std_out__' in env.sos_dict else '',
+                        f', err=``{kwargs["stderr"]}``' if 'stderr' in kwargs and os.path.isfile(kwargs['stderr']) else ''))
             except RuntimeError:
                 raise
             except Exception as e:
commented

Ok with me.