don't use hardcoded /bin/bash, allow other shells/engines

puffnfresh opened this issue 6 years ago

Brian McKenna commented 6 years ago

I use NixOS, which does not have /bin/bash; it only has /bin/sh. Would that work instead?

Brian McKenna · Answer 1 · Wed Oct 24 2018 05:02:13 GMT+0800 (China Standard Time)

/usr/bin/env does exist, so the scripts could use /usr/bin/env bash instead.

Mateusz Czapliński · Answer 2 · Wed Oct 24 2018 07:04:00 GMT+0800 (China Standard Time)

Hmmm! I wonder what's the best approach to this. Is /usr/bin/env bash universal enough? I seem to vaguely recall I had trouble with it somewhere, but can't say for sure; so now it's just pure FUD on my side... Would it make sense to add it as an option? Actually, per your ghci idea, if I were to add an option to specify a custom interpreter during runtime, then it only makes sense that a custom shebang would probably be needed anyway...

Could you please try to give an example of how a full proper ghci invocation for "runtime" should look, vs. how a shebang should look? I could then also try to add similar examples for Lua, and try to build some generalization on this... I'd love to try to come back to this; but tomorrow — now I must just go to sleep... Thanks!

Ford Hurley · Answer 3 · Wed Oct 24 2018 08:59:49 GMT+0800 (China Standard Time)

I would think searching for something universal is the wrong idea. If there’s a way to do it, I’d want it to insert a shebang that resolves to the shell it was developed in (inside up). That is, after all, the shell I knew it worked in.

Mateusz Czapliński · Answer 4 · Wed Oct 24 2018 15:43:39 GMT+0800 (China Standard Time)

@fordhurley Uh, oh; sorry, I'm afraid I don't follow: can you please try to rephrase your comment for me? Maybe especially by replacing the occurrences of "it" in your comment with the particular thing you refer to at the various places — I'd be super grateful; unfortunately I'm getting confused enough with them that I can't grasp the general idea of what you're trying to communicate to me... Also, in particular: what do you mean by "inside up"? TIA!

@puffnfresh Ok, I'm copying what you wrote on lobste.rs here to organize the idea exploration better for myself. Could I please ask you to also help me with how a shebang could look in this case? Is it even possible to build one? Also, do I understand correctly that the -e ":m + X" is a pattern for listing any extra libraries one could want to have available in the final Haskell expression?

ghci \
  -XOverloadedStrings \
  -e ':m + Data.Aeson' \
  -e ':m + Control.Lens' \
  -e ':m + Data.Aeson.Lens' \
  -e "interact ($COMMAND)"

Hmm; what if I tried to embed Lua in up?... that sounds a bit scary in that the resulting complexity may break it apart, but who knows... could also be wicked fun :D

Rohan Verma · Answer 5 · Wed Oct 24 2018 16:58:25 GMT+0800 (China Standard Time)

I think we should have working defaults for at least Linux and macOS. We can use https://golang.org/pkg/runtime/#GOOS to detect the platform.

Afterwards, we can provide flags that allow setting these values explicitly in case one wants to adjust them for a specific use case.

Alexander Hultnér · Answer 6 · Wed Oct 24 2018 17:11:11 GMT+0800 (China Standard Time)

@akavel /usr/bin/env x (where x, in this case, could be sh or bash) should be considered the more portable choice.

This way you also respect the user's choice if they have multiple bash versions installed. For instance, macOS or similar system-integrity-protected platforms might ship with one older version of bash that is considered stable, while the user might have installed the latest bleeding-edge neo_rebash 2018.11 (not a real tool) and placed it higher in the priority order for the bash command.

https://en.wikipedia.org/wiki/Shebang_(Unix)#Portability

Alexander Hultnér · Answer 7 · Wed Oct 24 2018 17:14:12 GMT+0800 (China Standard Time)

I'd maybe even go as far as to look at the $SHELL variable; that way the user would be running it in their own shell. After all, bash is a GNU thing and doesn't exist on other UNIX systems (e.g. FreeBSD) unless explicitly installed, which is also why those systems weren't affected by Shellshock. With that in consideration, I'd pick either /usr/bin/env sh for maximum portability or $SHELL for user preference.

Rohan Verma · Answer 8 · Wed Oct 24 2018 17:16:17 GMT+0800 (China Standard Time)

+1 for $SHELL combined with an explicit flag to override.

Matt Singletary · Answer 9 · Wed Oct 24 2018 17:30:07 GMT+0800 (China Standard Time)

I think flag, then $SHELL, would be the most reasonable order of preference.

But when I imagine using it, I think I would use it like this (since I use zsh and would want to share the script portably):
$ foo |& ./up --shell='/usr/bin/env sh'
which seems like a bit of a mouthful. Maybe this smells like more of a template thing?

Matt Singletary · Answer 10 · Wed Oct 24 2018 17:41:35 GMT+0800 (China Standard Time)

Thinking some more, maybe /usr/bin/env sh by default and documentation encouraging the recipe of:

$ foo |& ./up --shell='$SHELL'

which seems clear and debuggable for users.

dopeghoti · Answer 11 · Wed Oct 24 2018 23:25:34 GMT+0800 (China Standard Time)

Why the switch from bash to sh as the default, though? I support respecting $SHELL or overriding with e.g. a --shell switch, but entirely losing access within up to the bashisms I use routinely would make this less useful to me.

Adam · Answer 12 · Thu Oct 25 2018 00:14:30 GMT+0800 (China Standard Time)

+1 on using $SHELL by default with a flag to override

Alexander Hultnér · Answer 13 · Thu Oct 25 2018 01:43:47 GMT+0800 (China Standard Time)

@dopeghoti Because bash isn't Unix and doesn't exist on many Unix systems; sh, on the other hand, does. Bash is a GNU/Linux-specific thing. Of course one could use it if they specify so, but locking out all non-GNU users by default seems unnecessary.

Koushik Roy · Answer 14 · Thu Oct 25 2018 02:52:58 GMT+0800 (China Standard Time)

Would prefer flag, $SHELL, and /usr/bin/env sh in that order. Flag for an explicit override, envar to use your current shell for ergonomics, and lastly the portable POSIX fallback.
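A minimal sketch of that order of preference, in POSIX sh. The function name resolve_shell is made up for illustration, and its single argument stands in for the value of a hypothetical --shell flag; nothing here is taken from up's actual code.

```sh
# resolve_shell FLAG_VALUE
# Prints the interpreter to use, in this order:
#   1. the explicit flag value, if non-empty (hypothetical --shell flag)
#   2. $SHELL, if set and non-empty
#   3. the portable fallback "/usr/bin/env sh"
resolve_shell() {
    flag_value=$1
    if [ -n "$flag_value" ]; then
        printf '%s\n' "$flag_value"
    elif [ -n "${SHELL:-}" ]; then
        printf '%s\n' "$SHELL"
    else
        printf '%s\n' '/usr/bin/env sh'
    fi
}

# Example: no flag given, so the result depends on the caller's $SHELL.
resolve_shell ''
```

Passing an empty string models "flag not given"; a real implementation would more likely take this from its argument parser.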

Mateusz Czapliński · Answer 15 · Thu Oct 25 2018 05:28:33 GMT+0800 (China Standard Time)

Hmmm; could it be that I've seen the env incantation fail in some Docker container based on Alpine Linux with only BusyBox available?

Alexander Hultnér · Answer 16 · Thu Oct 25 2018 05:32:03 GMT+0800 (China Standard Time)

That doesn't seem right. My team uses Alpine Linux for almost all of our production containers, always with /usr/bin/env sh shebangs in scripts; we've banned Bashisms/GNUisms.

@akavel Could it be that you used additional arguments, e.g. #!/usr/bin/env sh -arg? How such a line is split into arguments is not standardized and varies between POSIX systems.

Mateusz Czapliński · Answer 17 · Thu Oct 25 2018 06:25:52 GMT+0800 (China Standard Time)

Nah, not an additional argument. Hm, ok, I can't remember, so that's probably on me; maybe I'm just mixing something up. I will try to come back to this, but I want to focus on "the rm affair" first, to get it out of my way. This one still needs some thought in my opinion. First, because the actual command used on the command line differs from what's then used in the script: currently bash -c (in future, $SHELL -c?) vs. /bin/bash (in future, $SHELL). This subtle difference needs to be accounted for somehow. The same applies if it is to be used for other interpreters, like Haskell (ghci), Lua, etc.

One thing that now occurred to me, that could have potential for unifying them: is there some way to (easily, ideally) create a "fake file" on Linux, that would actually exist only in memory? Then maybe just the shebang-like command template could be reused for both cases?... Is /dev/shm/ the thing I should check? Or is access to it limited in some important ways? Also, as an extra question, if something like this is possible, could it be done in a way compatible with MacOS and BSDs?
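To make the two invocation shapes concrete, here is a rough sketch in POSIX sh. The helper name run_from_tempfile and the up-sketch file prefix are invented for illustration. It assumes mktemp is available and that /dev/shm, where present and writable, is a memory-backed directory (typical on Linux, not something macOS provides); otherwise it falls back to $TMPDIR or /tmp, which may be on disk.

```sh
# run_from_tempfile INTERPRETER COMMAND
# Writes COMMAND to a temporary file and runs "INTERPRETER FILE", the same
# argument shape a shebang line produces. Standard input is left untouched,
# so it can still carry the pipeline's data.
run_from_tempfile() {
    interpreter=$1
    command_text=$2
    # Prefer /dev/shm when it exists and is writable; this is an assumption
    # about the platform, not a guarantee that the file stays in memory.
    if [ -d /dev/shm ] && [ -w /dev/shm ]; then
        dir=/dev/shm
    else
        dir=${TMPDIR:-/tmp}
    fi
    script=$(mktemp "$dir/up-sketch.XXXXXX") || return 1
    printf '%s\n' "$command_text" > "$script"
    "$interpreter" "$script"
    status=$?
    rm -f "$script"
    return "$status"
}

# Live-evaluation shape: the command travels as an argument to -c.
printf 'alpha\nbeta\n' | sh -c 'grep beta'

# Shebang-like shape: the command travels as a file path.
printf 'alpha\nbeta\n' | run_from_tempfile sh 'grep beta'
```

Both calls print beta. The temporary file is only read by the interpreter, never executed directly, so a noexec mount on /dev/shm should not matter for this approach.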

Keith Thompson · Answer 18 · Fri Oct 26 2018 01:55:44 GMT+0800 (China Standard Time)

I suggest that using $SHELL is the right thing.

If I'm using up, the whole point is that I want to see how a pipeline of commands is going to behave. I'm almost certainly interested in seeing how it behaves in the shell I use interactively -- even if it's csh, or tcsh, or fish.

An option to use a different shell might be useful, but SHELL=... up ... (or env SHELL=... up) does that, and can easily be wrapped in a function or alias.

(As for the #!/usr/bin/env trick, it has advantages and disadvantages, which I've discussed here.)

Bart · Answer 19 · Fri Oct 26 2018 02:27:36 GMT+0800 (China Standard Time)

I could contribute the code for this if you want, @akavel.

Ford Hurley · Answer 20 · Fri Oct 26 2018 07:08:04 GMT+0800 (China Standard Time)

@akavel sorry for my confusingly worded earlier comment. @Keith-S-Thompson did a much better job articulating the point I meant to make:

> If I'm using up, the whole point is that I want to see how a pipeline of commands is going to behave. I'm almost certainly interested in seeing how it behaves in the shell I use interactively -- even if it's csh, or tcsh, or fish.

Mateusz Czapliński · Answer 21 · Fri Oct 26 2018 08:26:19 GMT+0800 (China Standard Time)

@barthr Thanks for the offer; however, I'm not yet sure what exactly I want to do here, so most probably I won't be ready to accept your PR at this point, I'm afraid...

Mateusz Czapliński · Answer 22 · Fri Oct 26 2018 08:37:06 GMT+0800 (China Standard Time)

@Keith-S-Thompson @fordhurley I do understand the idea, rationale, and the general point.

If you want to help me go further with thinking about this feature, and hopefully shorten the time to its completion in some way, please try to help me with the questions I expressed in the comment above. Also: can you provide a list comparing "shebang" lines vs. "bash -c" lines for your favourite shells? Do they all accept a -c option in exactly the same way?

Alexander Hultnér · Answer 23 · Fri Oct 26 2018 16:11:49 GMT+0800 (China Standard Time)

> @Keith-S-Thompson @fordhurley I do understand the idea, rationale, and the general point.
>
> If you want to help me go further with thinking about this feature, and hopefully shorten the time to its completion in some way, please try to help me with the questions I expressed in the comment above. Also: can you provide a list comparing "shebang" lines vs. "bash -c" lines for your favourite shells? Do they all accept a -c option in exactly the same way?

@akavel You could just pipe the input to the shell, that's how the shebang works behind the scenes. No need for the -c argument in this case.

E.g.

 » echo "ls | wc -c" | zsh
     193

 » echo "ls | wc -c" | bash
     193

 » echo "ls | wc -c" | csh
     193

 » echo "ls | wc -c" | tcsh
     193

 » echo "ls | wc -c" | sh
     193

 » echo "ls | wc -c" | $SHELL
     193

Bart · Answer 24 · Fri Oct 26 2018 16:32:03 GMT+0800 (China Standard Time)

> I suggest that using $SHELL is the right thing.
>
> If I'm using up, the whole point is that I want to see how a pipeline of commands is going to behave. I'm almost certainly interested in seeing how it behaves in the shell I use interactively -- even if it's csh, or tcsh, or fish.
>
> An option to use a different shell might be useful, but SHELL=... up ... (or env SHELL=... up) does that, and can easily be wrapped in a function or alias.

I agree completely, since $SHELL is already a configurable, standard variable that your terminal uses to determine which shell you're using.

Rohan Verma · Answer 25 · Fri Oct 26 2018 16:37:41 GMT+0800 (China Standard Time)

> One thing that now occurred to me, that could have potential for unifying them: is there some way to (easily, ideally) create a "fake file" on Linux, that would actually exist only in memory? Then maybe just the shebang-like command template could be reused for both cases?... Is /dev/shm/ the thing I should check? Or is access to it limited in some important ways? Also, as an extra question, if something like this is possible, could it be done in a way compatible with MacOS and BSDs?

We can use Go's template package (https://golang.org/pkg/text/template/).
The implementation could start from a default template like:
"{{ .Shell }} {{ .Command }}"
and we can then provide a flag that overrides this template.

Mateusz Czapliński · Answer 26 · Sat Oct 27 2018 04:34:31 GMT+0800 (China Standard Time)

@Hultner So, actually, no, that's not how shebang works behind the scenes:

Shebang (Unix) - Wikipedia

In Unix-like operating systems, when a text file with a shebang is used as if it is an executable, the program loader parses the rest of the file's initial line as an interpreter directive; the specified interpreter program is executed, passing to it as an argument the path that was initially used when attempting to run the script,

execve(2): execute program - Linux man page

Interpreter scripts

An interpreter script is a text file that has execute permission enabled and whose first line is of the form:

#! interpreter [optional-arg]

The interpreter must be a valid pathname for an executable which is not itself a script. If the filename argument of execve() specifies an interpreter script, then interpreter will be invoked with the following arguments:

interpreter [optional-arg] filename arg...

That couldn't even work here: if the script were redirected to the shell's standard input, then the actual pipeline input could no longer be redirected there...

Mateusz Czapliński · Answer 27 · Sat Oct 27 2018 06:38:15 GMT+0800 (China Standard Time)

Ok; I had a very cool discussion on #nixos IRC, so I'm going for $SHELL. Notably, it seems to have a lot of advantages, and turns out to be even more powerful than I initially thought. Specifically:

- it is expected to be the user's shell of choice, so it will have the most "expected" behavior (a.k.a. ergonomics / principle of least surprise) when evaluated interactively in up
- the shell will be required to support a -c COMMAND_LINE flag (for live evaluation), but this seems surprisingly common: even powershell (!), rc (!), and python (!) seem to support it
- if one wants to use an "engine" that does not support -c, it is still possible to do that by writing a script and setting it as a temporary value of SHELL during invocation of up (this is what was surprising and unexpected for me: that this effectively provides a fairly easy way to plug in custom "engines", such as ghci, lua, etc.)
- it's easy to override, e.g.: lshw | SHELL=`which zsh` up (NOTE that SHELL should be a full path; otherwise it won't work in a shebang)
- I see the upN.sh scripts as "sketches"; it's ok if "publishing" them to a wider audience requires some editing (such as modifying the shebang to add /usr/bin/env); it's more important that they work without effort for quick use by the person who created them, especially immediately after quitting up. This is also well served by using SHELL, and again matches the principle of least surprise.
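The "script as SHELL" idea from the list above could look roughly like the following sketch. It uses awk as a stand-in engine, because awk does not take its program through a shell-style -c flag. The names make_awk_engine and live_eval are invented for illustration, and live_eval only imitates the assumed "ENGINE -c COMMAND" call; how up really invokes $SHELL is not shown in this thread. It also assumes the temporary directory allows executing files.

```sh
# make_awk_engine: writes a small adapter script and prints its path.
# The adapter accepts "-c PROGRAM" like a shell would and passes PROGRAM to awk.
make_awk_engine() {
    engine_path=$(mktemp "${TMPDIR:-/tmp}/awk-engine.XXXXXX") || return 1
    cat > "$engine_path" <<'EOF'
#!/bin/sh
# Drop the leading -c, then hand the remaining argument to awk as its program.
if [ "$1" = "-c" ]; then
    shift
fi
exec awk -- "$1"
EOF
    chmod +x "$engine_path"
    printf '%s\n' "$engine_path"
}

# live_eval ENGINE COMMAND: stands in for the assumed "ENGINE -c COMMAND" call.
live_eval() {
    "$1" -c "$2"
}

engine=$(make_awk_engine) || exit 1
# Prints the second field of each input line: b, then d.
printf 'a b\nc d\n' | live_eval "$engine" '{ print $2 }'
rm -f "$engine"
```

With such an adapter saved at a fixed full path, an invocation along the lines of SHELL=/full/path/to/adapter up would presumably select it. The same pattern could wrap the ghci command shown earlier in the thread, though that is untested here.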