off topic: combined output of concurrent processes
Amadeus W.M.
amadeus84 at verizon.net
Sun Apr 15 05:52:43 UTC 2012
>
> if your script forks off lots of curls into a file and does not wait for
> them all, then you may get to run the grep before they all finish, hence
> the weird results.
If ioTest.sh is the original example I posted, I'm NOT doing this:
./ioTest.sh | grep ^A | wc -l
I am doing this:
./ioTest.sh > out # go drink beer
grep ^A out | wc -l
All echoes should have completed by the time I run grep. Yet I see fewer
than 100 lines.
It does work if I append instead of write, though.
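To be concrete, the only thing that changes between the failing and the
working run is the redirection operator; the grep is identical (I remove
"out" first so the append starts from an empty file):

./ioTest.sh > out     # write/truncate: grep ^A out | wc -l gives fewer than 100
rm -f out
./ioTest.sh >> out    # append: grep ^A out | wc -l gives all 100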
>
> Note that you can only wait for immediate children (whereas the pipe
> does not show EOF until all attached processes have exited - that means
> it works for non immediate children too).
>
> Consider:
>
> for n in `seq 1 100`
> do echo FOO &
> done >zot
> wait
>
With this exact script, it works for FOO (probably because it's short).
For FOOOOOOOOOO... (1000 Os) I again see fewer than 100 lines in "zot".
That is, if I iterate 100 times; if I iterate only, say, 10-20 times, I
seem to get all the lines. Could it have something to do with the number
of jobs executed in the background?
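Concretely, the failing variant is the same loop with a longer string
(the "long" variable is just my shorthand so I don't paste 1000 Os into
the message):

long=$(printf 'O%.0s' {1..1000})   # a string of 1000 Os
for n in `seq 1 100`
do echo "F$long" &
done >zot
wait
wc -l zot    # sometimes fewer than 100 lines when iterating 100 times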
The real code is like this:
#!/bin/bash
# fetch every URL from myURLs concurrently
for url in $(cat myURLs)
do
    curl -s "$url" &
done
I pipe the combined curl outputs to a program that parses the HTML and
keeps track of something (I do pipe after all). I could do that serially
(without &), but parallel is better. I'm only spewing out some 20 network
requests simultaneously, and so far no warnings from Verizon. I'm guessing
that if I do, say, 1000, I might set off some alarms. But that's another
problem.
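For what it's worth, the whole thing is driven roughly like this (the
script and parser names here are made up); per the point above, the pipe
doesn't show EOF until every backgrounded curl has exited, so the parser
sees all the output without an explicit wait:

./fetchAll.sh | ./parseHtml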