quasi-[OT] Adobe Flash
Marko Vojinovic
vvmarko at gmail.com
Sat Oct 23 17:10:23 UTC 2010
On Saturday, October 23, 2010 14:44:25 Patrick O'Callaghan wrote:
> On Sat, 2010-10-23 at 12:27 +0100, Marko Vojinovic wrote:
> > On Saturday, October 23, 2010 04:27:45 Patrick O'Callaghan wrote:
> > > On Thu, 2010-10-21 at 21:33 -0600, Petrus de Calguarium wrote:
> > > > Fortunately, Suvayu's brilliant script gets around that and manages
> > > > to access the file, even though it is already deleted, while
> > > > Patrick's suggestion of hard linking to it does not work, because it
> > > > is already deleted, unless he also has some ingenious trick up his
> > > > sleeve to "get a handle on" the deleted file.
> > >
> > > Yes, it's a neat trick. However the 'cp' will terminate when it reaches
> > > the end of the input file, even if it's still being written to by the
> > > flash process. That would explain why the output is sometimes
> > > truncated. Getting round that would need a copy process that waits to
> > > see if there's more output, either by polling or by using inotify. IOW
> > > something conceptually similar to "tail -f".
> >
> > Just to follow that idea, would something like
> >
> > tail -f /proc/<pid>/fd/<file_id> > /tmp/flashfile.flv
> >
> > work? (Maybe with a couple more switches to tail, to start from the
> > beginning of the file, etc...)
>
> I suspect that 'tail' is designed for text files (it has options for how
> many lines to output etc.) 'man tail' is not very clear on whether it
> can work for binary files, e.g. what happens when it gets a null byte in
> the stream? Some experimentation is in order.
>
> Also, 'tail -f' will sit forever waiting for input, even if nothing is
> writing to the file. The present case is slightly different in that we
> can assume (until Adobe changes it again ...) that a single process is
> writing to the Flash buffer file, hence the idea of using inotify to
> notice when the writer has gone. However on second thoughts that may not
> be necessary. Given that the /proc file will disappear when the writer
> dies, it would be enough to loop until getting an error (EIO?, not
> sure).
Well, without wasting much time on this, I tried to copy a random .jpg file I
had lying around, using the following:
tail -q --bytes=1G file1.jpg > file2.jpg
That produced file2.jpg which was exactly the same as file1.jpg. This suggests
that tail would work correctly for binary files (or at least this one that I
tried :-) ).
Looking at man tail, I found the following to be useful for this particular
purpose:
--bytes specifies how many bytes to output, counted from the end of the
file. As in my example above, a value larger than the file itself (1G) ensures
that the file is read from the very beginning.
-q suppresses the file name headers that tail can print, which would corrupt
a binary file.
-f keeps reading the file and appending to the output as the file grows,
until tail is killed.
--pid (used together with -f) makes tail monitor the process that writes to
the file and terminate automatically once that process dies.
Granted, I have no idea what happens if tail receives a null byte, but I guess
it should not treat it specially, since (with -f) it keeps watching for data
appended to the file. So it should not terminate after a null, by design. One
should examine the source code of tail or experiment with various binary
inputs to determine the exact behavior, but I have a feeling it should work.
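A quick experiment along those lines might look like the sketch below. It
assumes GNU tail (for the long --bytes option) and uses made-up file names
(in.bin, out.bin) in a temporary directory. The input deliberately contains
several null bytes and other non-text bytes.

```shell
#!/bin/sh
# Build a small test file containing null bytes and other non-text bytes.
workdir=$(mktemp -d)
printf 'abc\000def\000\377\376\000end' > "$workdir/in.bin"

# Same options as in the .jpg test: no headers, "last" 1G bytes of the file.
tail -q --bytes=1G "$workdir/in.bin" > "$workdir/out.bin"

# cmp returns 0 only when the two files are byte-for-byte identical.
if cmp "$workdir/in.bin" "$workdir/out.bin"; then
    echo "identical"
else
    echo "different"
fi
```

If the two files compare equal, the null bytes were copied through like any
other byte.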
So, given the <pid> and <file_id> information, hopefully this should Just Work:
tail -f -q --bytes=1G --pid=<pid> /proc/<pid>/fd/<file_id> > /tmp/flashfile.flv
Haven't tried it, though. :-) I guess the one gigabyte size value is bigger
than any flash file one can find to download from the Internet, so it should be ok.
As for the <pid>, my guess is that the process that opens the file for writing
is the only one allowed to actually write to it, since otherwise one can get a
race condition and data might get corrupted. Once that process dies, tail will
die along with it, leaving a clean /tmp/flashfile.flv as a result. At least that
is my theory. ;-)
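The theory can be checked without Flash at all. The sketch below uses a
stand-in writer, a background subshell that appends a few chunks to an
ordinary file and then exits, and follows it with the same tail options as
the command above. The file names (growing.bin, copy.bin) and the writer
itself are made up for the test, and GNU tail is assumed.

```shell
#!/bin/sh
# Stand-in for the Flash buffer file and for the copy we want to end up with.
workdir=$(mktemp -d)
src=$workdir/growing.bin
dst=$workdir/copy.bin
: > "$src"

# Stand-in writer: appends three chunks (each with a null byte), then exits.
(
    for i in 1 2 3; do
        printf 'chunk %s\000\n' "$i" >> "$src"
        sleep 1
    done
) &
writer=$!

# Follow the file from its first byte; tail exits after the writer has died.
tail -f -q --bytes=1G --pid="$writer" "$src" > "$dst"

# The copy should now match the finished source file.
cmp "$src" "$dst" && echo "copy is complete"
```

GNU tail also accepts --bytes=+1, meaning "start output at byte 1", which
would avoid guessing a size like 1G. Whether either form behaves the same way
on a /proc/<pid>/fd/<file_id> entry for a deleted file is untested here.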
Now, all that is needed is that someone write a script and try it out. I am
not very well versed in extracting <pid> and <file_id> and such stuff, but
otherwise the script should be trivial. :-)
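For the <file_id> part, one possible starting point is sketched below. It
assumes a Linux /proc, where each entry in /proc/<pid>/fd is a symlink whose
target ends in " (deleted)" once the file has been unlinked. The function
name find_deleted_fds is made up for this example, and finding the right
<pid> (the browser or plugin process) is left to the caller.

```shell
#!/bin/sh
# find_deleted_fds is a hypothetical helper name, not an existing tool.
# Usage: find_deleted_fds <pid>
# Prints "<file_id><TAB><link target>" for every file that the process
# still holds open although it has already been deleted.
find_deleted_fds() {
    for fd in /proc/"$1"/fd/*; do
        # Each entry is a symlink; skip any that cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s\t%s\n' "${fd##*/}" "$target" ;;
        esac
    done
}

# Example call (the pid is a placeholder):
#   find_deleted_fds 1234
```

The first column of the output is the number to use as <file_id> in the tail
command.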
Best, :-)
Marko