RichiH / vcsh

config manager based on Git

foreach greppable mode

danielshahaf opened this issue · comments

I'd like the foreach command to be able to prefix the repository name to each line of the output. This will allow, for example, vcsh foreach ls-files ./ | egrep '^(foo|bar)' to show all files in the current directory that belong to either the 'foo' vcsh repository or the 'bar' vcsh repository.

WDYT?

I like the idea, but this would probably mean catching all output and printing it ourselves. Does anyone have any other ideas?

Bueller? Bueller? Are we waiting for other ideas, or?

#252

My solution is just piping the output to cat if there's nothing to do, and to sed otherwise.
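A minimal sketch of the cat-or-sed idea, not the actual code from #252. The function name `prefix_filter`, its on/off argument, and the `name: ` separator are assumptions made for illustration. The name is interpolated into the sed script unescaped, so this assumes repository names without characters that are special to sed.

```shell
# Hypothetical helper: prefix each line of stdin with a repository name
# when prefixing is requested, otherwise pass the stream through untouched.
#   $1: repository name
#   $2: "1" to enable the prefix (assumed convention), anything else to disable
prefix_filter() {
    if [ "$2" = "1" ]; then
        # Insert "name: " at the start of every line.
        sed "s/^/$1: /"
    else
        # Nothing to do: cat keeps the pipeline shape identical.
        cat
    fi
}
```

For example, `git ls-files | prefix_filter foo 1` would emit each path as `foo: path`.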

I've changed another behavior: previously, a single line containing the repo name would be output before each repository's output. This seems unnecessary now with the new option, and even a bit unwanted.

--help was updated; no additional unit tests were added.

The desired usage looks like it's doing more work than it should be:

$ vcsh foreach ls-files ./ | egrep '^(foo|bar)'

What are you going to do with these results that would not be better served by just going straight for the data? Like this perhaps:

$ xargs -n1 vcsh list-tracked <<< "foo bar"

This will avoid iterating over repositories you don't want results from anyway.

If what you want is the name of the repo instead of the names of files, why not use vcsh which?

I'm not saying there are no use cases for adding a prefix, but it would be nice to have a use case that isn't already better served by some other command. That would make any actual implementations easier to evaluate.

@alerque If I am interested in specific repos called foo and bar, then yes, that xargs idiom would be good (assuming the end of the output for foo / start of the output for bar can be recognized).
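One way to make the boundary between the two repositories recognizable is to loop over the names and tag every line. This is only a sketch: `list_tracked_prefixed` is a hypothetical name, the `name: ` separator is an assumption, and repository names are assumed to contain nothing special to sed.

```shell
# Hypothetical helper: list tracked files for the named repositories only,
# tagging each line with the repository it came from.
list_tracked_prefixed() {
    for repo in "$@"; do
        # Only the requested repositories are queried.
        vcsh list-tracked "$repo" | sed "s/^/$repo: /"
    done
}
```

Usage would be `list_tracked_prefixed foo bar`.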

I think my original use-case was to eyeball the output, and I was hoping for a more SQL-like output mode: "first column: repo name, second column: whatever the foreach'd command printed".

Having the repo name on each line would also allow sorting the vcsh foreach ls-files output by filename rather than by repo name, while still being able to track, for each file, what repo it is in.

In general, output is easier to parse if all lines are of the same form. Right now, some lines are ${repo_name}: and some are output of the command for that repo, so parsing/post-processing has to account for these different kinds of lines, keep state, etc.
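To illustrate the state-keeping involved, here is a rough sketch of a post-processing filter that turns the current output into one "repo, tab, line" record per line. It assumes header lines look exactly like `name:`, that the caller passes the known repository names (without whitespace or backslashes), and that no command output line happens to equal a header. `denormalize` is a hypothetical name.

```shell
# Hypothetical filter: remember the most recent "name:" header and
# prepend that name to every following output line.
#   $@: known repository names
denormalize() {
    awk -v names="$*" '
        BEGIN {
            # Build a lookup table of header lines ("foo:" -> "foo").
            n = split(names, a, " ")
            for (i = 1; i <= n; i++) known[a[i] ":"] = a[i]
        }
        # A header line switches the current repository and is not printed.
        ($0 in known) { repo = known[$0]; next }
        # Every other line is tagged with the current repository.
        { print repo "\t" $0 }
    '
}
```

With records in this shape, something like `sort -t "$(printf '\t')" -k2` would order the listing by filename while keeping the repository name attached to each file.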

Putting it differently: The data is currently somewhat normalized in the form of
name1
output1
name2
output2

Denormalizing by prefixing per line is doable. It means catching and printing all lines; risky with escape chars, UTF-8, right-to-left, etc., but doable.

If we do that, it most likely needs to hide behind a --format-output or --plumbing (or --porcelain to stay closer to Git's historic misunderstanding of the terminology), as it's not commonly useful and the prefix breaks other usage, e.g. vcsh foreach diff | cat.

Such an option would also need to unset all ENV which usually redirects output into less etc. Unclear how it would break for things like | vim.

Yes, this would need to be behind a flag. Grep uses -H, --with-filename for something similar. Honestly the suggestion of -p in #285 seems reasonable to me whether you think of it as --print or --prefix.

I don't think ENV vars will be an issue in this case: Git already detects whether the output is a console or a pipeline, and if it's a pipeline it avoids turning it over to a pager. I'm pretty sure we can just rely on that mechanism, at least for the Git context. The -g general context might be harder. I guess you get what you pay for though: if you ask for a linewise prefix on output and then fire up a TUI, it's kind of hard to blame us. We won't be able to catch every case of what could get run; the best we can hope for is solid processing of anything that was intended to be linewise console output.
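The kind of check being relied on can be approximated in shell with `test -t`, which reports whether a file descriptor is attached to a terminal. This is a sketch of the general mechanism, not Git's implementation, and `output_kind` is a hypothetical name.

```shell
# Hypothetical helper: report whether a file descriptor (default: stdout)
# is a terminal or something else, such as a pipe or a file.
output_kind() {
    if [ -t "${1:-1}" ]; then
        echo terminal
    else
        echo pipeline
    fi
}
```

Once the output is piped into a prefixing filter, the command on the left side of the pipe no longer sees a terminal on stdout, which is why a pager would normally not be started in that case.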

@danielshahaf For the use case of listing which files are in which repos I still think you might be better off reversing your query and iterating files and using vcsh which, but I do see some possible uses of this in general.

Perhaps you can review #285 and make sure it passes muster for your use case(s). cc @julien-lecomte

This issue is old. I haven't got a clue what my use case was at the time.

@alerque The cumulative diff of #285 looks perfect :-)

(so long as one doesn't use backslashes, leading minuses, or regexp metacharacters in their repo names)

What's the correct way to escape there? ${VCSH_REPO_NAME}?

No, the shell parses the variable just fine here; it's what sed may do with the result of the expansion that has possible issues.

I did think about that. It would have been more robust to use awk there, but I wasn't prepared to lobby for a new dependency (although in #288 I think I'm going to). However, it isn't that bad. Note that this is in the replacement section, not the match section, so there are a lot fewer characters that could be an issue.

  • Backslashes themselves won't be a problem, only backslash-digits, which are back-references.
  • Leading minuses aren't a problem.
  • Most regexp metacharacters only matter in the match, not the replacement section.

By far the most likely troublemaker would be forward slashes, but I don't think we support paths as names anyway.

We could potentially escape anything in names that would be troublesome, but it would mean running the name itself through sed first to make the substitutions.
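A sketch of what that pre-escaping step might look like, under the assumption that the prefix is inserted with `s/^/.../` using `/` as the delimiter. In a sed replacement the backslash, the ampersand (which stands for the matched text), and the delimiter are the characters that need protecting. Both function names are hypothetical, and names containing newlines are not handled.

```shell
# Hypothetical helper: escape a string for safe use in the replacement
# part of a sed "s/.../.../" command that uses "/" as its delimiter.
escape_sed_replacement() {
    # Put a backslash in front of every backslash, ampersand, and slash.
    # "|" is used as the delimiter here so "/" can sit in the bracket list.
    printf '%s\n' "$1" | sed 's|[\\&/]|\\&|g'
}

# Hypothetical helper: prefix every line of stdin with "name: ",
# escaping the name first.
prefix_lines_safe() {
    esc=$(escape_sed_replacement "$1")
    sed "s/^/$esc: /"
}
```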

I have no problem adding awk as a dependency; I consider it part of any Unix, and as far as Google tells me Cygwin carries it too. It would need testing on OS X, but else...
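For comparison, a minimal sketch of an awk-based prefix. This is an illustration only, not code from #285 or #288; `prefix_lines_awk` and the `PREFIX_NAME` environment variable are hypothetical names. The name is passed through the environment rather than with `-v`, because awk applies backslash-escape processing to `-v` assignments.

```shell
# Hypothetical helper: prefix every line of stdin with "name: " using awk.
# The name is never interpreted as a pattern or replacement, so slashes,
# ampersands, and backslashes in it are printed as-is.
prefix_lines_awk() {
    PREFIX_NAME="$1" awk '{ print ENVIRON["PREFIX_NAME"] ": " $0 }'
}
```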