dd question
stan
gryt2 at q.com
Fri Dec 10 21:28:05 UTC 2010
On Fri, 10 Dec 2010 03:11:25 +0000 (UTC)
"Amadeus W.M." <amadeus84 at verizon.net> wrote:
> I have a binary file with data. Each block of 48 bytes is a record. I
> want to extract the first 8 bytes within each record. I'm thinking
> this should be possible with dd, but gawk, perl - anything goes. It
> just has to be fast, because the data files are ~ 1Gb.
>
> I can do this in C++ but I was just wondering if it can be done with
> existing well tested tools.
The binary aspect makes it tricky. If they were EOL-delimited records,
lots of tools could do this.
Here's a Python function, not checked though. It requires enough
memory to slurp the whole file. Put it in a file, edit the filenames,
and run it as python <filename>. I'd guess it takes less than a minute,
but I'm not sure; it should be fine for a one-off.
def extract(filename1=None, filename2=None):
    if filename1 is not None and filename2 is not None:
        infile = open(filename1, "rb")
        slurp = infile.read()  # needs at least as much memory as the file size
        infile.close()
        outfile = open(filename2, "wb")
        # Step through the data by offset. Re-slicing with slurp = slurp[48:]
        # would copy the rest of the file once per record, which is far too
        # slow for a ~1 GB input.
        for start in range(0, len(slurp), 48):  # one 48-byte record per step
            first8 = slurp[start:start + 8]  # first 8 bytes of this record
            outfile.write(first8)  # write them out, no separator
        outfile.close()

extract(filename1="your input filename with path",
        filename2="your output filename with path")
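
If holding the whole file in memory is a problem, a streaming variant
might look like the sketch below. It is not from the original post and
is only lightly reasoned through: the name extract_streaming and the
records_per_chunk parameter are made up for illustration, and it assumes
the input is a regular file on disk, so that every read except the last
returns a whole number of 48-byte records.

```python
RECORD_SIZE = 48  # bytes per record, as stated in the question
KEY_SIZE = 8      # leading bytes to keep from each record


def extract_streaming(in_path, out_path, records_per_chunk=65536):
    """Copy the first KEY_SIZE bytes of every RECORD_SIZE-byte record.

    Reads a whole number of records at a time, so memory use stays near
    records_per_chunk * RECORD_SIZE bytes regardless of the file size.
    """
    chunk_size = RECORD_SIZE * records_per_chunk
    with open(in_path, "rb") as infile, open(out_path, "wb") as outfile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            # Collect the leading bytes of each record in this chunk,
            # then write them with a single call.
            outfile.write(b"".join(
                chunk[start:start + KEY_SIZE]
                for start in range(0, len(chunk), RECORD_SIZE)
            ))
```

Call it as extract_streaming("input path", "output path"). With the
default records_per_chunk it reads about 3 MB at a time; a trailing
partial record, if there is one, contributes whatever leading bytes it
has, same as the slurping version.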