The best way to learn is by doing, so to learn the concepts of virtual disks we're going to create a 1GiB [20] virtual disk from scratch. This information applies to disks in general; its value is not limited to virtual disks.
What makes virtual disks any different from actual hard drives? We'll examine this question by creating a virtual disk from scratch.
What does your operating system think a disk drive is? I have a 320 GB SATA drive in my computer which is represented in Linux as the file /dev/sda. Using file, stat and fdisk we'll see what Linux thinks the /dev/sda file is.
Let's start out by looking at what a regular drive looks like to our operating system. Throughout this section the regular drive we'll be comparing our findings against will be a 320GB [21] SATA hard drive that Linux references as /dev/sda. The following example shows some basic information about the device.
Example 3.1. Regular Disk Drive
    $ file /dev/sda
    /dev/sda: block special
    $ stat /dev/sda
      File: `/dev/sda'
      Size: 0    Blocks: 0    IO Block: 4096    block special file
    Device: 5h/5d    Inode: 5217    Links: 1    Device type: 8,0
    Access: (0660/brw-rw----)  Uid: ( 0/ root)  Gid: ( 6/ disk)
    Access: 2010-09-15 01:09:02.060722589 -0400
    Modify: 2010-09-12 11:03:20.831372852 -0400
    Change: 2010-09-12 11:03:26.226369247 -0400
    $ sudo fdisk -l /dev/sda
    Disk /dev/sda: 320.1 GB, 320071851520 bytes
    255 heads, 63 sectors/track, 38913 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x12031202
       Device Boot   Start     End      Blocks  Id  System
    /dev/sda1            1   25496  204796588+   7  HPFS/NTFS
    /dev/sda2        25497   31870   51199155   83  Linux
    /dev/sda3        31871   33086    9767520   82  Linux swap / Solaris
    /dev/sda4        33087   38913   46805377+   5  Extended
    /dev/sda5   *    33087   38913   46805346   83  Linux
TODO: Make this into a screenco with callouts on what to look for
The term block is generally interchangeable with the term sector. The only difference in their meaning is contextual. It's common usage to say block when referring to the data being referenced and to use sector when speaking about disk geometry. Officially the term data block was defined by ANSI ASC X3 in ANSI X3.221-199x - AT Attachment Interface for Disk Drives (ATA-1) [22] [23] §3.1.3 as:
This term describes a data transfer, and is typically a single sector […]
Storage units need to be clearly defined. Luckily some very smart people[24] already took care of that. The International Electrotechnical Commission [25] defined binary prefixes for use in the fields of data processing and data transmission. Below are some prefixes as they apply to bytes. See Appendix A, Appendix: Man Pages for the full prefix listing.
Abbrev. | Measurement | Name |
---|---|---|
1B | = 8 bits | The byte |
1KiB | = 1B * 2^10 | The kibibyte |
1MiB | = 1KiB * 2^10 | The mebibyte |
1GiB | = 1MiB * 2^10 | The gibibyte |
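The table is easy to check with shell arithmetic. The following is a minimal sketch; the variable names are ours, and the byte count for sda is the one fdisk reports in Example 3.1.

```shell
# Binary prefixes expressed in bytes, using shell integer arithmetic.
KiB=1024
MiB=$((KiB * 1024))
GiB=$((MiB * 1024))
echo "1KiB = $KiB bytes"
echo "1MiB = $MiB bytes"
echo "1GiB = $GiB bytes"

# The 320GB drive from Example 3.1 reports 320071851520 bytes.
# Integer division shows how many whole GiB that is.
sda_bytes=320071851520
sda_gib=$((sda_bytes / GiB))
echo "sda holds $sda_gib whole GiB"
```

The last line shows why the distinction matters: a drive sold as 320GB holds fewer than 320 GiB.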
We'll use the dd command to create the file that represents our virtual disk. Other higher level tools like qemu-img exist to do similar things, but using dd will give us a deeper insight into what's going on. dd will only be used in the introductory part of this document; later on we will use the qemu-img command almost exclusively.
If we're creating a 1GiB disk, the file needs to be exactly 2^30 bytes in size. By default dd operates in block-sized chunks. This means that to create 2^30 bytes it needs to push a calculable number of these chunks into our target disk file. This number is referred to as the count. To calculate the proper count setting we need only divide the total number of bytes required by the size of each block. The block size is given to dd with the bs option, which specifies the block size in bytes. If not explicitly defined, it defaults to 512-byte blocks (2^9).
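The division described above can be sketched in the shell. The variable names here are ours, chosen to mirror dd's bs and count options.

```shell
# Total size wanted: 1GiB = 2^30 bytes.
total_bytes=$((1024 * 1024 * 1024))
# dd's default block size, 2^9 bytes.
bs=512
# Number of blocks dd must write, plus a check that nothing is left over.
count=$((total_bytes / bs))
remainder=$((total_bytes % bs))
echo "bs=$bs count=$count remainder=$remainder"
```

A remainder of 0 confirms that a whole number of 512-byte blocks adds up to exactly 1GiB.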
We need to fill the file with something that has a negligible value. On Unix systems the best thing to use is the output from /dev/zero (a special character device, like a keyboard). We specify /dev/zero as our input file to dd by using the if option.
Note: NUL being a control character [26] means it's a non-printing character (it doesn't represent a written symbol), so if you want to identify it you can use cat like this to print 5 NUL characters in Caret Notation [27]:
    $ dd if=/dev/zero bs=1 count=5 2>/dev/null | cat -v
    ^@^@^@^@^@
You can also convert the output from /dev/zero into ASCII 0 characters like this:
    $ dd if=/dev/zero bs=1 count=5 2>/dev/null | tr "\0" "\60"
    00000
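Another way to look at the same bytes, offered here as a small sketch rather than something taken from the original examples, is to dump them as hexadecimal with od. The variable name nul_hex is ours.

```shell
# Read 5 bytes from /dev/zero and print each one as two hex digits.
# od -An suppresses the offset column; -tx1 selects one-byte hex units.
nul_hex=$(dd if=/dev/zero bs=1 count=5 2>/dev/null | od -An -tx1)
echo "$nul_hex"
```

Each byte should show up as 00, which is the numeric value of NUL.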
With the information from the preceding sections we can now create the file that will soon be a virtual disk. The file we create will be called disk1.raw and filled with 2097152 blocks of NUL characters from /dev/zero. Here's the command:
    $ dd if=/dev/zero of=disk1.raw bs=512 count=2097152
Now that you know what /dev/zero is, it's obvious this is just a file containing 2^30 bytes (1GiB) of data, each byte literally having the value 0.
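If you'd like to convince yourself of that claim, here is a scaled-down sketch. It assumes a 1MiB scratch file made with mktemp (not disk1.raw itself) so that it runs quickly; the same checks should apply to the full-size file.

```shell
# Scaled-down sketch: 1MiB (2048 blocks of 512 bytes) instead of 1GiB.
img=$(mktemp)            # hypothetical scratch file, not disk1.raw
dd if=/dev/zero of="$img" bs=512 count=2048 2>/dev/null

# Size in bytes, as counted by wc.
size=$(wc -c < "$img" | tr -d ' ')
# Delete every NUL byte and count what is left; 0 means all bytes were NUL.
non_nul=$(tr -d '\0' < "$img" | wc -c | tr -d ' ')
echo "size=$size non_nul=$non_nul"
rm -f "$img"
```

The size is the block size times the count, and no byte other than NUL survives the tr filter.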
As in Example 3.1, “Regular Disk Drive”, let's take a look at the file we created from the operating system's point of view.
Example 3.3. Examining the Created File
    $ dd if=/dev/zero of=disk1.raw bs=512 count=2097152
    2097152+0 records in
    2097152+0 records out
    1073741824 bytes (1.1 GB) copied, 10.8062 s, 99.4 MB/s
    $ file disk1.raw
    disk1.raw: data
    $ stat disk1.raw
      File: `disk1.raw'
      Size: 1073741824    Blocks: 2097152    IO Block: 4096    regular file
    Device: 805h/2053d    Inode: 151552    Links: 1
    Access: (0644/-rw-r--r--)  Uid: ( 500/tim)  Gid: ( 500/tim)
    Access: 2010-09-15 02:51:36.147724384 -0400
    Modify: 2010-09-15 02:51:25.729720057 -0400
    Change: 2010-09-15 02:51:25.729720057 -0400
    $ fdisk -l disk1.raw
    Disk disk1.raw: 0 MB, 0 bytes
    255 heads, 63 sectors/track, 0 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x00000000
    Disk disk1.raw doesn't contain a valid partition table
From this it's quite clear that there isn't much that disk1.raw has in common with the actual disk drive sda. Using this information, let's put the physical disk and the virtual disk side-by-side and make some observations about their properties.
file thinks it's “data”, which the file manual page says is how it labels what are usually “binary” or non-printable files.
stat says it's just a regular file.
fdisk doesn't know how big it is, nor can it find any partition information on it.
Table 3.1. Attribute Comparison
Command | sda | disk1.raw |
---|---|---|
file | block special | data |
stat | block special | regular file |
fdisk | Contains partition table | Missing partition table |
These results make perfect sense, as disk1.raw is just 2^30 0's in a row.
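The distinction stat draws between a block special file and a regular file can also be tested from a script. This is a minimal sketch using the shell's built-in file tests; the kind function and the scratch file are our own illustrative names, and the scratch file merely stands in for disk1.raw.

```shell
# Report how the system classifies a path, using the shell's file tests.
kind() {
    if [ -b "$1" ]; then
        echo "block special"
    elif [ -c "$1" ]; then
        echo "character special"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "other or missing"
    fi
}

scratch=$(mktemp)        # hypothetical stand-in for disk1.raw
scratch_kind=$(kind "$scratch")
echo "$scratch: $scratch_kind"
rm -f "$scratch"
```

Running kind against /dev/sda on a machine like the one in Example 3.1 should take the first branch, while /dev/zero should take the second.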
Use GNU parted to put a valid partition table on the image file.
Example 3.4. Create a Partition Table
    $ parted disk1.raw mklabel msdos
    WARNING: You are not superuser. Watch out for permissions.
Let's examine the image again to see how the operating system thinks it has changed.
Example 3.5. Overview - What Changed
    $ file disk1.raw
    disk1.raw: x86 boot sector, code offset 0xb8
    $ stat disk1.raw
      File: `disk1.raw'
      Size: 1073741824    Blocks: 2097160    IO Block: 4096    regular file
    Device: 805h/2053d    Inode: 151552    Links: 1
    Access: (0644/-rw-r--r--)  Uid: ( 500/tim)  Gid: ( 500/tim)
    Access: 2010-09-15 19:38:30.516826093 -0400
    Modify: 2010-09-15 19:38:25.934611550 -0400
    Change: 2010-09-15 19:38:25.934611550 -0400
    $ fdisk -l disk1.raw
    You must set cylinders.
    You can do this from the extra functions menu.
    Disk disk1.raw: 0 MB, 0 bytes
    255 heads, 63 sectors/track, 0 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x000e44e8
    Device Boot Start End Blocks Id System
Now, instead of “data”, the file command thinks it is an “x86 boot sector”. That sounds pretty accurate as we just put a partition table on it.
stat still thinks it's a regular file, as opposed to a block special device, or a socket, etc…
fdisk was able to find a partition table in the boot sector which file found.
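An msdos (MBR) partition table lives in the first 512-byte sector, which ends with the two signature bytes 0x55 0xAA. That signature is presumably part of what file keys on, though we haven't confirmed it against file's magic database. The sketch below reads those two bytes. The boot_sig function is our own name, and the one-sector scratch file is a hand-built stand-in so the example doesn't need parted.

```shell
# Print the last two bytes of the first 512-byte sector as hex.
boot_sig() {
    dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Build a hypothetical one-sector stand-in so the sketch does not need parted.
sector=$(mktemp)
dd if=/dev/zero of="$sector" bs=512 count=1 2>/dev/null
blank_sig=$(boot_sig "$sector")

# Octal \125 \252 are 0x55 0xAA; conv=notrunc keeps the rest of the file.
printf '\125\252' | dd of="$sector" bs=1 seek=510 conv=notrunc 2>/dev/null
sig=$(boot_sig "$sector")

echo "before: $blank_sig after: $sig"
rm -f "$sector"
```

Pointing boot_sig at disk1.raw before and after the parted step should show the same kind of change, assuming parted wrote a standard MBR.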
Table 3.2. What parted Changed
Command | sda | disk1.raw | disk1.raw (via parted) |
---|---|---|---|
file | block special | data | x86 boot sector |
stat | block special | regular file | regular file |
fdisk | has partition table | no partition table | valid partition table, unknown cylinder count |
[20] Check out Appendix A, Appendix: Man Pages for a review of binary/decimal prefixes if “GiB” is foreign to you.
[21] If you're wondering why I didn't say 320GiB here, it's because “320GB” is the capacity as defined by the manufacturer.
[22] ANSI X3.221-199x Working Draft: http://www.t10.org/t13/project/d0791r4c-ATA-1.pdf
[23] Technical Committee (T13) Homepage: http://www.t10.org/t13/
[24] IEC 60027-2, Second edition, 2000-11, Letter symbols to be used in electrical technology - Part 2: Telecommunications and electronics: http://webstore.iec.ch/webstore/webstore.nsf/artnum/034558
[25] The IEEE also adopted this method for unit prefixes. Within the IEEE it is known as IEEE Std 1541-2002: http://ieeexplore.ieee.org/servlet/opac?punumber=5254929
[26] Wikipedia.org - Control Characters: http://en.wikipedia.org/wiki/Control_code
[27] Wikipedia.org - Caret Notation: http://en.wikipedia.org/wiki/Caret_notation