Hi
I need to write random data to a partition before encrypting it. The suggested way is to use urandom:
#dd if=/dev/urandom of=/dev/sda2
What is the use of the "bs" operand in this case? I sometimes see the above command executed as follows:
#dd if=/dev/urandom of=/dev/sda2 bs=1M
For the hard drive I got, should I use the following command:
#dd if=/dev/urandom of=/dev/sda2 bs=4096
From what I understand, the first command above will write data in 512-byte blocks, the second one in 1 MB blocks, and the third in 4096-byte blocks. Right? I am a bit confused about the usage of this operand in this case.
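For reference, you can confirm what sector sizes a drive actually reports with blockdev from util-linux (sda here is just an example, not necessarily your disk):

#blockdev --getss /dev/sda
#blockdev --getpbsz /dev/sda

The first prints the logical sector size and the second the physical one; the same values are exposed in /sys/block/sda/queue/logical_block_size and physical_block_size.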
On Thu, 2011-07-21 at 19:48 +1000, yudi v wrote:
Hi. From what I understand, the first command above will write data in 512-byte blocks, the second one in 1 MB blocks, and the third in 4096-byte blocks. Right?
Yep. The 1M should also yield considerably better performance. (Though the random number generator may cap the performance in this case).
- Gilboa
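A quick way to check whether urandom itself is the limiting factor (a rough test only; count=100 is an arbitrary choice) is to time it against /dev/null, where no disk is involved:

#dd if=/dev/urandom of=/dev/null bs=1M count=100

If the rate reported there is lower than the drive's sequential write speed, a bigger bs won't buy you much.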
On Thu, Jul 21, 2011 at 8:04 PM, Gilboa Davara gilboad@gmail.com wrote:
On Thu, 2011-07-21 at 19:48 +1000, yudi v wrote:
Hi. From what I understand, the first command above will write data in 512-byte blocks, the second one in 1 MB blocks, and the third in 4096-byte blocks. Right?
Yep. The 1M should also yield considerably better performance. (Though the random number generator may cap the performance in this case).
Gilboa
Sorry, could you please elaborate a bit more on how a larger block size results in better performance?
On 21 Jul 2011 at 20:15, yudi v wrote:
On Thu, Jul 21, 2011 at 8:04 PM, Gilboa Davara gilboad@gmail.com wrote: On Thu, 2011-07-21 at 19:48 +1000, yudi v wrote:
Hi. From what I understand, the first command above will write data in 512-byte blocks, the second one in 1 MB blocks, and the third in 4096-byte blocks. Right?
Yep. The 1M should also yield considerably better performance. (Though the random number generator may cap the performance in this case). - Gilboa
Sorry, could you please elaborate a bit more on how a larger block size results in better performance?
I maintain a disk imaging project, and bs can make a big difference depending on the disk and hardware being used. Here is a quick set of tests I just did, creating a 10MB file from urandom on my system. You may get higher or lower results depending on your hardware and the size of the file or disk.
time dd if=/dev/urandom of=test1 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.23827 s, 8.5 MB/s

real    0m1.241s
user    0m0.000s
sys     0m1.222s

time dd if=/dev/urandom of=test1 bs=4096 count=2560
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 1.26018 s, 8.3 MB/s

real    0m1.264s
user    0m0.001s
sys     0m1.241s

time dd if=/dev/urandom of=test1 bs=512 count=20480
20480+0 records in
20480+0 records out
10485760 bytes (10 MB) copied, 1.3672 s, 7.7 MB/s

real    0m1.371s
user    0m0.002s
sys     0m1.352s
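If you want to rerun the same comparison yourself, a small loop does it (just a sketch; test1 is a throwaway file in the current directory, and the sizes match the three runs above):

for args in "bs=1M count=10" "bs=4096 count=2560" "bs=512 count=20480"; do
    rm -f test1
    echo "== dd if=/dev/urandom of=test1 $args =="
    time dd if=/dev/urandom of=test1 $args
done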
+----------------------------------------------------------+
Michael D. Setzer II - Computer Science Instructor
Guam Community College Computer Center
mailto:mikes@kuentos.guam.net  mailto:msetzerii@gmail.com
http://www.guam.net/home/mikes
Guam - Where America's Day Begins
G4L Disk Imaging Project maintainer
http://sourceforge.net/projects/g4l/
+----------------------------------------------------------+
Sorry, could you please elaborate a bit more on how a larger block size results in better performance?
-- Kind regards, Yudi
Ouch, off the top of my head, there are two major reasons:
1. (Mechanical) disk drives (AKA hard drives) dislike random reads/writes, as they require the drive to constantly "move" the head to a different track.
2. File systems tend to like big blocks, as it's easier for them to allocate adjacent blocks, reducing file fragmentation (fragmentation increases the number of seeks; see 1).
However, using uber-large blocks (such as 1M) may actually decrease performance due to the synchronous nature of dd itself.
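To put a number on the per-call overhead: dd issues one read()/write() pair per block, so for the 4 GiB file in the demo below the block size directly sets how many calls are made:

$ echo $((4096*1024*1024/512))
8388608
$ echo $((4096*1024*1024/1024/1024))
4096

i.e. roughly 8.4 million write() calls at bs=512 versus 4096 at bs=1M.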
Here's a short demo: (Taken from a 5 x 500GB software RAID5; ext4 partition over LVM) Notice that the best performance is reached when using 64KB blocks.
$ rm -f temp.img; export BS=512; time dd if=/dev/zero of=temp.img bs=$BS count=$(((4096*1024*1024)/$BS))
4294967296 bytes (4.3 GB) copied, 151.229 s, 28.4 MB/s

real    2m31.231s
user    0m0.830s
sys     0m32.678s

$ rm -f temp.img; export BS=16384; time dd if=/dev/zero of=temp.img bs=$BS count=$(((4096*1024*1024)/$BS))
4294967296 bytes (4.3 GB) copied, 106.988 s, 40.1 MB/s

real    1m46.990s
user    0m0.041s
sys     0m15.659s

$ rm -f temp.img; export BS=65536; time dd if=/dev/zero of=temp.img bs=$BS count=$(((4096*1024*1024)/$BS))
65536+0 records in
65536+0 records out
4294967296 bytes (4.3 GB) copied, 69.0871 s, 62.2 MB/s

real    1m9.089s
user    0m0.012s
sys     0m47.636s

$ rm -f temp.img; export BS=$((1024*1024)); time dd if=/dev/zero of=temp.img bs=$BS count=$(((4096*1024*1024)/$BS))
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 98.6219 s, 43.5 MB/s

real    1m38.639s
user    0m0.003s
sys     0m4.317s
On Thu, 2011-07-21 at 15:34 +0300, Gilboa Davara wrote:
Sorry, could you please elaborate a bit more on how a larger block size results in better performance?
-- Kind regards, Yudi
Ouch, off the top of my head, there are two major reasons:
1. (Mechanical) disk drives (AKA hard drives) dislike random reads/writes, as they require the drive to constantly "move" the head to a different track.
2. File systems tend to like big blocks, as it's easier for them to allocate adjacent blocks, reducing file fragmentation (fragmentation increases the number of seeks; see 1).
Neither of these points is relevant in the case posited by the OP. She's writing directly to the drive, not to some file. There is no question of random versus sequential and there's no filesystem block allocation going on.
The actual reason for using larger block sizes is that it gives the driver a chance to minimize the number of user-to-kernel data copies. It may also reduce kernel-to-device DMA operations, depending on whether the dd sync options are used. Making the bs value an integer multiple of the physical block size of the device is probably a good idea in most circumstances.
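As an aside (a variant to consider, not something the OP has to use): GNU dd can also take the page cache out of the picture with oflag=direct, in which case the block size has to be a multiple of the logical sector size anyway, e.g.:

#dd if=/dev/urandom of=/dev/sda2 bs=1M oflag=direct

1M is a multiple of both 512 and 4096, so it satisfies that alignment requirement on either kind of drive (though if the partition size isn't an exact multiple of bs, the final short write may need care on older dd versions).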
However, using uber-large-blocks (such as 1M) may actually decrease the performance due to the synchronous nature of dd itself.
Here's a short demo: (Taken from a 5 x 500GB software RAID5; ext4 partition over LVM) Notice that the best performance is reached when using 64KB blocks.
Writes to an ext4 filesystem inside an LVM volume inside a RAID 5 array tell you virtually nothing about raw write performance to a single drive. There are too many factors to consider, one of which is the CPU cost of calculating block parity and another is the fact that at least two physical disk writes are taking place for every logical one, and possibly more if inode and indirect blocks are also being updated.
poc
On Jul 21, 2011 7:20pm, Patrick O'Callaghan pocallaghan@gmail.com wrote:
Neither of these points is relevant in the case posited by the OP. She's writing directly to the drive, not to some file. There is no question of random versus sequential and there's no filesystem block allocation going on.
As you may have guessed, I failed to notice that she is using dd directly on sdX instead of on a file. My mistake.
As for raw vs. ext4/LVM, I completely disagree (I did not even mention the MD stripe size or LVM block size), but this is more-or-less OT for the subject at hand.
- Gilboa
On Fri, 2011-07-22 at 09:35 +1000, yudi v wrote:
Making the bs value an integer multiple of the physical block size of the device is probably a good idea in most circumstances.
Going back to my original question: as the HDD I am using has 4096-byte physical and 512-byte logical blocks, what would be a recommended bs value?
4096
poc