Fish-shell: Can't read pipe to function

Created on 5 Jul 2012  ·  21 Comments  ·  Source: fish-shell/fish-shell

Test:

function testfun
    set input (cat)
    echo $input
end

echo testing123 | testfun

This should output "testing123", but produces nothing.

It works perfectly in bash:

function testfun
{
    input="$(cat)"
    echo $input
}

echo testing123 | testfun

All 21 comments

You can use 'read' function as workaround.

function t
    while read -l line
        echo $line
    end
end

I just realized that it doesn't work in fish because stdin is piped to 'set' instead of 'cat'.

This issue goes deeper than just "set" working a certain way. Piping into a function at all is screwed up. And terminal I/O sent into a function does work, but is still a bit weird--it appears to buffer the input and deliver it all at once. Observe what happens with this function:

~> function meh
       cat
   end
~> # First, the way it's supposed to work.
~> # As input, we press the keys: a RET b RET control-D
~> cat
a
a
b
b
~> cat | cat
a
a
b
b
~> # Now...
~> meh
a
a
b
b
~> # So far so good, but...
~> cat | meh
a
b
^D
... um...
^D
control-D repeatedly does not work
try control-C
Job 1, “cat | meh” has stopped
~> fg
Send job 1, “cat | meh” to foreground
cat: stdin: Interrupted system call
~> jobs
jobs: There are no jobs
~> # Dear lord.
~> # For completeness...
~> meh | cat
a
b
aD
b
~> 

Also, cat | meh | cat behaves the same way, as does cat | begin; cat; end.
I can tell you further that the "cat" that complains about an interrupted system call in cat | meh is the first "cat". That is:

~> cp /bin/cat mycat
~> ./mycat | meh
Job 1, “./mycat | meh” has stopped  #after control-C
~> fg
Send job 1, “./mycat | meh” to foreground
mycat: stdin: Interrupted system call

So there's that. Obviously this is something to do with how fish calls functions and how it constructs pipes into them. Does anyone happen to know about this?

Ok, I am finding that running
pbpaste | begin; cat; end
repeatedly in a fresh fish shell, with the clipboard being "23\n", will sometimes just print 23 back out, and will sometimes cause the shell to lock up, at which point control-C can do nothing. I assume this must be a race condition of some sort. Oh boy.

Meanwhile, it looks like the signal SIGTTIN is sent to the "mycat" in ./mycat | begin; cat; end:

     21    SIGTTIN      stop process         background read attempted from
                                             control terminal

Then, according to the GNU libc manual: "A process cannot read from the user's terminal while it is running as a background job. When any process in a background job tries to read from the terminal, all of the processes in the job are sent a SIGTTIN signal."

So, looks like the "mycat" either gets started in the background, or is started and then put in the background, when it gets piped into a fish function-kind-of-thing. Perhaps this knowledge will help.
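
One way to test that theory would be to compare a process's group with the terminal's foreground process group. The following is only a bash sketch (not fish), it is Linux-only because it reads /proc, and job_state is a hypothetical helper name that does not appear anywhere in this thread:

```bash
# Linux-only sketch: classify a process by comparing its process group with
# the foreground process group of its controlling terminal.
# "job_state" is a made-up helper name for this illustration.
job_state() {
    local pid=${1:-$BASHPID} stat
    stat=$(cat "/proc/$pid/stat" 2>/dev/null) || { echo "unknown"; return 1; }
    # Strip "pid (comm) " so a command name containing spaces cannot shift the fields.
    stat=${stat##*) }
    # Remaining fields start at: state ppid pgrp session tty_nr tpgid ...
    local state ppid pgrp session tty_nr tpgid rest
    read -r state ppid pgrp session tty_nr tpgid rest <<<"$stat"
    if [ "$tty_nr" -eq 0 ]; then
        echo "no controlling terminal"
    elif [ "$pgrp" -eq "$tpgid" ]; then
        echo "foreground"
    else
        # Normally the case in which a terminal read stops the job with SIGTTIN.
        echo "background: a read from the terminal would raise SIGTTIN"
    fi
}
```

Passing the PID of a stopped pipeline member (for example the mycat above) from a second terminal would be one way to check whether it really ended up outside the foreground group. That usage is a suggestion, not something anyone in the thread has run.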

This apparently backgrounds both sides of a pipe... But running fg pulls the process out of the background, allowing it to work as it is supposed to.

~ $ alias pjson='python -m json.tool | pygmentize -l json'
~ $ curl -u smoku -X GET -H "Content-Type: application/json" 'https://jira.......' | pjson
Job 4, 'curl -u smoku -X GET…' has stopped
~ $ fg
Enter host password for user 'smoku': ********
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   593    0   593    0     0   1372      0 --:--:-- --:--:-- --:--:--  1375
~ $ fg
{
    "expand": "renderedFields,names,schema,transitions,operations,editmeta,changelog",
    "id": "29874"
}

A bit annoying that I needed to create a pjson wrapper script in $PATH instead of a simple alias... :(

For my reference this is also openSUSE bug https://bugzilla.opensuse.org/show_bug.cgi?id=963548

Yay! I think I found my workaround! Thanks to gustafj's comment in issue #110 explaining fish piping syntax, I have come up with this:

function line --argument-names n
    cat 1>| tail -n +$n | head -1
end

This issue, coupled with the default grep wrapper function, causes quite a few problems. If this issue isn't going to be fixed any time soon, the default grep alias should probably be removed (or replaced with an abbreviation, maybe?) to at least minimize the occurrences.

Using cat as @milieu suggests doesn't seem to fix the issue for me, on Fish 2.3.1 (which I just realized is slightly behind, but it's the version packaged for Fedora 25)

It seems that there is a difference between execution within the shell and from the command line:

(zsh)$ ./fish -c "read n | grep nothing"    
read> lol
(zsh)$ ./fish
(fish)$ read n | grep nothing
read> 
# Stuck forever, needs to kill the terminal. ^C, ^Z have no impact.

Maybe this can help debug the issue?

@layus: No, that's #3805, an issue where fish itself can't gain control of the terminal.

I think some of the original behavior in this regard has changed; the following applies to fish master/3.0.

There are two fundamental issues fish gets wrong here, the first is buffering function/block output (I'm pretty sure a modern shell would not buffer anything anywhere) and the second is being unable to correctly chain input/output across a block. There is a lot of ambiguity (or at least room for acceptable differences of opinion) in what the correct behavior should look like in some corner cases, but I don't think anyone will defend what fish does currently as being optimal.

In general, you have external commands (and builtins which are effectively treated the same way, by and large) which are easy: one input, two outputs, one of which can be chained to a subsequent command, the other of which must be redirected to a file or the tty. But blocks and functions are tricky since you're basically mapping an input (as there can only be one) to a sequence of (what eventually expand to) external commands or builtins.

That said, I disagree that the current behavior is wrong. (cat) should not read data that was piped into the command it is executed within:

mqudsi@ZBook /m/c/U/m/Documents> type testfun
testfun is a function with definition
function testfun
    set input (cat)
    printf "You said '%s'\n" $input
end
mqudsi@ZBook ~/r/fish-shell> echo testing123 | testfun
hello
^D
You said 'hello'

You are piping input into the block. Regardless of whether set consumes the input, consumes part of the input, or ignores the input entirely, cat is correct to connect to /dev/tty for input, which then correctly gets passed to the shell for substitution into the command line. In fact, there are/were (many) bugs filed against this repo complaining about cases where "subshells" were _not_ reading from the terminal when executed with some levels of indirection. IMHO, it is bash that is broken here, especially since bash supports real subshells and offers asynchronicity here.

The only broken behavior I would say stems from cases where external commands are launched in a function/block and do not fully consume the input:

mqudsi@ZBook /m/c/U/m/r/fish-shell> printf 'foo\nbar\n' | begin
                                        head -n1 | read -l line1
                                        head -n2 | read -l line2
                                        echo line1: $line1
                                        echo line2: $line2
                                    end
line1: foo
line2:
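
For comparison, here is a minimal bash sketch (bash, not fish; first_two_lines is a made-up name). It says nothing about what fish does internally. It only illustrates one effect that is not specific to any shell: an external head may read a whole buffer from a pipe and cannot put the unused part back, whereas bash's builtin read stops at the newline when reading from a pipe, so the next reader still sees the rest of the stream.

```bash
# Split the first two lines of stdin using only the shell's builtin `read`.
# On a pipe, bash's `read` consumes input up to the newline and no further,
# so the remaining input is still available to the next reader.
first_two_lines() {
    local line1 line2
    IFS= read -r line1
    IFS= read -r line2
    echo "line1: $line1"
    echo "line2: $line2"
}

printf 'foo\nbar\n' | first_two_lines
```

Whether the empty line2 above comes from head over-reading or from how fish feeds the block is an open question in this thread; the sketch only shows that the reader's buffering matters.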

TBH I'm very surprised but this works correctly:

mqudsi@ZBook /m/c/U/m/r/fish-shell> printf 'foo\nbar\n' | begin
                                        /bin/echo 'hi from echo'
                                        cat | read -z from_cat
                                        printf 'from_cat: "%s"' $from_cat
                                    end
hi from echo
from_cat: "foo
bar
"¶  

And this is also correct:

mqudsi@ZBook /m/c/U/m/r/fish-shell> printf 'foo\nbar\n' | begin
                                        cat | read -zl from_cat1
                                        cat | read -zl from_cat2
                                        printf 'from_cat1: "%s"\n' $from_cat1
                                        printf 'from_cat2: "%s"\n' $from_cat2
                                    end
from_cat1: "foo
bar
"
from_cat2: ""

Especially when taking into account the plan of someday introducing real subshells in fish with asynchronous execution, I would say that fish's behavior in regards to the original case reported here is correct. In fact, I'm inclined to close this issue entirely, unless anyone objects and can make a convincing argument here.

While the original bug report is imho invalid, the issues raised by @waterhouse are spot-on and great catches. But the good news is that #5219 appears to fix them, including the cat | meh case reported.

mqudsi@ZBook ~/r/fish-shell> cat | meh
a
a
b
b
^D
mqudsi@ZBook ~/r/fish-shell>

> That said, I disagree that the current behavior is wrong. (cat) should not read data that was piped into the command it is executed within.

I very much disagree on that!

> cat is correct to connect to /dev/tty for input

That's a question of the mental model. I would say that cat connects to "the current stdin" for input. If the function or block isn't redirected, that's the tty. If it is redirected, that's that! So connecting to /dev/tty here would be incorrect.

> complaining about cases where "subshells" were not reading from the terminal when executed with some levels of indirection

Note that those were all about "global" command substitutions. E.g. running echo (fzf) on the commandline. In that case, there is no stdin.

So what I would say would work sort of like this:

echo | echo (cat) # from tty

begin
   echo | echo (cat) # from file
end < file

There is a related issue (#1035) about stderr in this case, which isn't redirected either. That was quite an issue with the old math function, because it happened to feature a command substitution inside of it, and so you couldn't redirect that.

This is the stdin part of it. If a function does a bare (cat), is it really useful to always have that read from the tty? Or couldn't you just use </dev/tty in that case?
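
To make the two options concrete, here is a small bash sketch (bash rather than fish, since bash already follows the "current stdin" model, as the original report shows). The function names from_stdin and from_tty are invented for this illustration:

```bash
# The command substitution inherits the function's stdin, so `cat` reads the pipe.
from_stdin() {
    local input
    input=$(cat)
    printf "You said '%s'\n" "$input"
}

# Explicit opt-in to the terminal. This fails when there is no controlling
# terminal, for example in a cron job or another non-interactive context.
from_tty() {
    local input
    input=$(cat </dev/tty)
    printf "You said '%s'\n" "$input"
}

echo testing123 | from_stdin
```

With this model, reading the terminal stays possible but has to be requested, which is the trade-off being argued for here.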

Interesting thoughts.

I guess it boils down to whether the parentheses denote simple substitution (i.e. "pretend the contents of the parentheses were on the line above, run them to completion, store the result in a variable, and substitute the variable here") or if they're (currently broken) subshells. I thought the consensus was that fish is missing proper subshell support, but the intention has always been to fix that "at some point."

If it's the former, then yes, I agree, the current behavior is broken because if you move the contents of the parentheses to a different line, it should certainly read from the input being redirected to the block.

But subshells are a much more powerful concept than that, and they let you do things that aren't possible with command substitution and create much more responsive and capable scripts. While it's technically possible to connect whatever is being input into the block to the stdin of a command executed in a subshell, I think that would be incompatible with the mental model there.

> whether the parentheses denote simple substitution (i.e. "pretend the contents of the parentheses were on the line above, run them to completion, store the result in a variable, and substitute the variable here") or if they're (currently broken) subshells.

I don't think these terms are defined clearly enough to be of too much use here.

For me, it comes down to what's more natural, more typical and more useful.

Reading from the terminal is certainly useful, and sometimes you want to read from the terminal even though you have another stdin (e.g. fzf does basically exclusively this).

But I think that reading from stdin is far more typical, especially considering that non-interactive uses won't read from tty at all. And since reading from tty is still possible (via that </dev/tty redirection), it seems okay to leave that as the secondary option.

The fact that there's no opposite of </dev/tty in the model I'm suggesting is making me reconsider my position.

I might not be deep enough into shells to understand the discussion completely. But I need to solve something and I'm wondering if I need a bash script to solve it.

It's basically a very simple task: I want to pipe stdout from (z)cat through pv to a mysql cli (basically to restore a backup) and because I don't want to enter the connect string I want to use a function for it:

function mysqlenv --description 'connect to mysql server using config from .env'
  mysql -u (getEnv DB_USERNAME) -p(getEnv DB_PASSWORD) (getEnv DB_DATABASE)
end

At first I was sure this would work, because obviously the stdin of a command is the stdout of the command to its left, but now I'm confused. OK, mysqlenv is not a command, it's a function. Now I'm here reading a lot of text and a lot of "this should work", but nothing is working.

What I tried:

  • cat -|mysql... no output; mysql does not get input; Ctrl+C exits mysql; the pipe is running in the background
  • mysql... <&0 no output; mysql does not get input; Ctrl+C exits mysql; the pipe is running in the background
  • set input (cat); mysql... no output; mysql does not get input; Ctrl+C exits everything; nothing remains in the background
  • read -z|mysql... no output; mysql does not get input; Ctrl+C prints ^C

Again, my command line: zcat some_backup.sql.gz|pv -s (zsize some_backup.sql.gz)|mysqlenv. It shows the pipe status when used directly with mysql (without a fish function in between), so it should work.

So how do I pass the function's stdin to the stdin of a command inside the function?

Don't say I have to reconnect for every line via while read.... It may work but it is not a solution as it is too slow to work with.

@tflori: Much simpler. Just leave the command as-is, without any redirections. The issue isn't with the commands directly in the function. Something like

function foo
    cat
end

works. The cat gets stdin like it should.

What doesn't work is a command inside a command substitution; that's when you hit this bug.

@faho that means the initial function should work? but it does not. maybe my version is outdated? I'm using 2.7.1 currently

> maybe my version is outdated? I'm using 2.7.1 currently

@tflori: Yes, you'll want 3.0.2.
