Results of some vmstat tests using different blocksizes in the read/write of 3GB of data.
dd if=/dev/sda of=/dev/null bs=X count=Y followed by dd if=/dev/zero of=/3GBfile bs=X count=Y
... for each of X = 512B, 1K, 2K, 4K, 8K, 16K, 32K, 64K, 128K, 256K and 512K, where Y is the block count needed to reach 3GB at that blocksize.
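
For reference, the Y values can be worked out with a small loop like the one below. This is only a sketch: it assumes "3GB" means 3 * 1024^3 bytes, the count_for helper is a name made up for illustration, and the loop just prints the dd command lines rather than running them (the write test would create /3GBfile).

```shell
#!/bin/sh
# Total data per pass. Assumption: 3GB means 3 * 1024^3 bytes.
TOTAL=$((3 * 1024 * 1024 * 1024))

# count_for BS_BYTES: print the dd count (Y) needed to reach TOTAL bytes.
# (Hypothetical helper, not part of the original test run.)
count_for() {
    echo $((TOTAL / $1))
}

# Print, but do not execute, the read and write commands for each blocksize.
for bs in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288; do
    y=$(count_for "$bs")
    echo "dd if=/dev/sda of=/dev/null bs=$bs count=$y"
    echo "dd if=/dev/zero of=/3GBfile bs=$bs count=$y"
done
```

Under that assumption, bs=512 needs count=6291456 and bs=512K (524288 bytes) needs count=6144.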
The answer to the question "does changing the blocksize make any difference?" is "No".
The attached PDF graphs the results of a whole run - again we've got a fairly solid 80000 blocks/s on reads and very bursty blocks/s on writes, with lots of processes being blocked when writing.