Statistics for perl hash tables

The following data is from a 64-bit perl-5.19.10 (i.e. 32-bit jenkins - corrected from the earlier 64-bit siphash - which randomly shuffles the first two entries of each collision chain), running the core test suite with -DH.

number of hash keys:

0   3634661x
1   3263596x
2   4361350x
3   2745134x
4   4199196x
5   2921502x
... and so on linearly falling ...
29  1821569x
30  736893x
31  312185x
32  643500x
33  379865x
34  371403x
35  1050967x
36  1016470x
37  1226918x
38  8558286x
39  1671515x
40  4594200x
41  2859521x
42  627789x
43  686036x
...
67  220743x
68  3382026x
69  1225141x
70  147903x
71  95882x
...
307 8330x
308 8962x
309 7902x
310 9041x
...
513 2413x
514 2713x
515 2708x
516 1945x
...
1098    410x
1099    433x
1100    402x
1101    400x
...
1848    69x
2225    14756x
2226    66x
2283    14877x
2284    63x
2309    14818x
2310    60x
2658    25321x
2659    56x
4239    9221x
4240    53x
5654    23526x
5655    46x
6355    1799x
6356    41x
6768    1641x
6769    38x
7476    2001x
7477    35x
7983    37x
7984    76x
7985    27x
7989    25x
7990    23x
8292    54x
8293    19x
9716    18x
10132   19x
10133   18x
13060   1968x
13061   15x
19745   39503x
19746   13x
20893   3795x
20894   9x
22124   3862x
22125   5x
23647   3865x
23648   1x
25137   328x

hash table sizes:

0   1933x
1   294x
3   432x
7   24128585x
15  14099448x
31  15040566x
63  28189821x
127 19795904x
255 10955641x
511 3851922x
1023    2009591x
2047    961707x
4095    180486x
8191    236082x
16383   141396x
32767   136848x

depth of linear search of collisions:

0   19076114x
1   76051832x
2   19870023x
3   4091878x
4   577090x
5   46417x
6   16933x
7   346x
8   23x

It was produced with this patch and a short script to sum up the occurrences of the hash keys, sizes and tried collisions. The tests are still work in progress, see more in the perl-hash-stats repo.
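
A minimal sketch of such a summing helper, assuming the relevant numbers (key count, table size or collision depth) have already been grepped out of the -DH output, one number per line:

#!/usr/bin/perl
# Hypothetical summing helper: reads one number per line and prints
# "value  countx" pairs like the listings above.
use strict;
use warnings;

my %count;
while (my $line = <>) {
    chomp $line;
    next unless $line =~ /^\d+$/;   # ignore everything but plain numbers
    $count{$line}++;
}
for my $value (sort { $a <=> $b } keys %count) {
    printf "%-7d %dx\n", $value, $count{$value};
}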

The hashes in the perl5 test suite have a median of 20 keys and an average of 133.2 keys. The most common key counts are 4, 101 and 2.

Average case (perl core testsuite)

| Hash Function     | collisions| cycles/hash |
|:------------------|----------:|------------:|
| CRC32             | 1.078     | 29.78       |
| SUPERFAST         | 1.081     | 34.72       |
| ONE_AT_A_TIME_HARD| 1.092     | 83.75       |
| SIPHASH           | 1.091     | 154.68      |
| ONE_AT_A_TIME     | 1.098     | 43.62       |
| ONE_AT_A_TIME_OLD | 1.100     | 43.62       |
| MURMUR3           | 1.105     | 34.03       |
| DJB2              | 1.131     | 44.73       |
| CITY              |           | 30.13       |
| SDBM              | -         | 30.57       |

Fewer collisions are better, and fewer cycles/hash is faster. A hash table lookup consists of one hash function call, whose cost depends only on the length of the key, followed by resolving 0-n collisions (0-8 in our average case).

  • collisions are the number of linked-list iterations per hash table usage. See the sketch after this list for how such an average follows from the collision-depth histogram above.
  • cycles/hash is measured with smhasher for 10-byte keys (see "Small key speed test").
  • SDBM and DJB2 did not produce a workable miniperl with -O2; I needed to patch them. Seeing that a HASH of 0, which effectively creates one long chain of linear collisions in HvARRAY[0], does not work in current perl5 makes me feel bad. Note that hashing seed + len is there to prevent the \0 attack.
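
As a quick sanity check, the average number of linked-list iterations follows directly from the collision-depth histogram above; a minimal sketch:

use strict;
use warnings;

# "depth of linear search of collisions" histogram from above
my %depth = (
    0 => 19076114, 1 => 76051832, 2 => 19870023, 3 => 4091878,
    4 => 577090,   5 => 46417,    6 => 16933,    7 => 346, 8 => 23,
);
my ($lookups, $iterations) = (0, 0);
for my $d (keys %depth) {
    $lookups    += $depth{$d};
    $iterations += $d * $depth{$d};
}
# prints roughly 1.09, the same ballpark as the collision column above
printf "avg search depth: %.3f over %d lookups\n",
    $iterations / $lookups, $lookups;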

So this is a pretty good distribution for testing better hash tables. Sorry, no beautiful plots yet; later.

The problem I see with the new 5.18 hash tables is manifold:

  • instead of using a fast and optimized hash function we are using a slow, unoptimized one. Instead of picking a good and fast function and avoiding collision attacks in the collision handling, we try to avoid attacks by hashing more slowly. The chosen hash functions are also not the best ones (faster, with fewer collisions) that have been independently verified on average data.
  • we hash unnecessary bits. Typically only the last 5-9 bits are needed (sizes 32-512); the biggest hash table in the core test suite needs 15 bits! See the small sketch after this list.
  • we cargo-cult salting by adding the random seed to the end of the key. Every hash function combines the key with the salt anyway, and we only need the last bits. The last bits should be dominated by the key, not the seed, so this is wrong. We add the len to the seed to prevent Ruslan's \0 attack.
  • the real problem is only to avoid linear collision exhaustion, O(n/2) search time, but we didn't change the collision strategy, e.g. from a linear list to a sorted array with binary search, i.e. O(log n). A sorted array of collision entries is much faster to search and much more cache friendly; that's why open addressing is so popular. Sorting up to 6 collisions is thankfully trivial (a hard-coded insertion sort), and we only need to fall back to complicated sort function calls under real attacks.
  • we randomize seeds only when rehashing, i.e. when the attack has already occurred. We did it better in 5.8.1, but this was "optimized" away in 5.8.2, which led to the problem with 5.18. Old info: cPanel perl randomizes the seed for all hashes, so it is even immune to the attack Ruslan is talking about in the public blog post at booking.com, but the full motivation for the current 5.18 design is not publicly accessible. (It was accessible, I just did not find it.)
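
A minimal sketch of the "unnecessary bits" point from the list above (the real lookup is C code in hv.c, which masks the hash value with HvMAX to pick the bucket):

# Sketch only: for the common HvMAX values 31..511 only the last
# 5-9 bits of the hash value ever reach the bucket index.
my $hash = 0xdeadbeef;                 # a hypothetical 32-bit hash value
for my $max (7, 63, 511, 32767) {      # HvMAX values from the size stats above
    my $bits   = length sprintf "%b", $max;   # bits actually used for the index
    my $bucket = $hash & $max;
    printf "HvMAX %5d -> bucket %5d (%2d bits used)\n", $max, $bucket, $bits;
}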

I'm working on fixing the 5.18 hash table problem and also evaluating robin-hood (fastest open addressing hash table) and cmph BMZ (fastest perfect hash), and using HW-optimized hash functions (crc32 intrinsics via SSE4.2 or armv8).

My analysis of the usual hash functions is here: github.com/rurban/smhasher, where you can see how ridiculously slow siphash is and how slow and bad jenkins is, compared to faster and more secure functions. There are also the results of the full test suite for each hash function, and the easier-to-follow stats evaluating quality vs performance.

Good open addressing hash tables are evaluated at github.com/goossaert/hashmap.

Perfect hash tables are explained at cmph.sourceforge.net/.

The public motivation to slow down the perl hash function is here. The discussion within the security team is not available, but as I was told, only the actual attack is not published, which is good. What is not good is that the developers have no idea how to evaluate hash tables and no sane discussion is possible on the mailing list, hence this blog and the various repos. In the end I will submit the best solution as a patch.

I am also working on a symbolic proof, besides the statistical one, with the stp SAT solver, to measure the "quality" of a hash function over the possible seeds by counting the number of collisions over the typical table sizes. With such a solver you can automatically create attack vectors against most websites with bad hash collision strategies, so I'm a bit reluctant to publish that.

Note that "hardening" a hash function does not help in hardening a hash table. It only will get slower, but only marginally better.

Notes:

  • Edited SIPHASH out. We are using jenkins_hard in blead now. This is 2x faster than siphash and only 6x slower than unoptimized City, but 4-20x slower than SSE4-optimized crc32 intrinsics, with not that much additional collision cost.
  • Edited the seed + len confusion. This is a good thing to do. In fact CRC32 is currently the leader in fewest collisions and performance.

Do not use each

The each hash iterator has two problems which almost nobody is aware of, and which might lead to unexpected and hard-to-detect bugs. We have been using the perl-compiler in production for 12 years and only recently found out that the B function walksymtable misses random symbols from time to time. It entirely depends on hash randomization. I cannot predict what each did to your code.

First problem: Do not change the data

Adding or deleting a key might lead to rehashing, which will re-order the keys. The iterator will just continue and then either miss keys or get keys again a second time.
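
A tiny demo of the effect (a sketch; the exact outcome depends on the hash seed and perl version):

my %h;
@h{'a' .. 'p'} = (1) x 16;
my @seen;
my $added = 0;
while (my ($k) = each %h) {
    push @seen, $k;
    if (!$added++) {                  # a single burst of inserts forces a split
        $h{"new$_"} = 1 for 1 .. 20;
    }
}
# @seen may now miss some of 'a'..'p' or list a key twice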

perldoc -f each:

If you add or delete a hash's elements while iterating over it, entries may be skipped or duplicated--so don't do that. Exception: It is always safe to delete the item most recently returned by "each()", so the following code works properly:

while (($key, $value) = each %hash) {
    print $key, "\n";
    delete $hash{$key};   # This is safe
}

Below is the bug, fixed with http://perl5.git.perl.org/perl.git/commit/5cc8528c900964306cba9b53c6eaa27af540eaea?f=ext/B/B.pm, but even the p5p committer was not able to understand the underlying problem of why it failed. This bug was in core perl from 5.005 (1998) until 5.18.

sub walksymtable {
    my ($symref, $method, $recurse, $prefix) = @_;
    my $sym;
    my $ref;
    my $fullname;
    no strict 'refs';
    $prefix = '' unless defined $prefix;
    while (($sym, $ref) = each %$symref) {
        $fullname = "*main::".$prefix.$sym;
        if ($sym =~ /::$/) {
            $sym = $prefix . $sym;
            if (svref_2object(\*$sym)->NAME ne "main::"
                && $sym ne "<none>::"
                && &$recurse($sym))
            {
                walksymtable(\%$fullname, $method, $recurse, $sym);
            }
        } else {
            svref_2object(\*$fullname)->$method();
        }
    }
}

The fix is to replace

while (($sym, $ref) = each %$symref)

with

foreach my $sym (keys %$symref) {
    my $ref = $symref->{$sym};

walksymtable walks all entries in stashes (symbol table hashes) and all their children, and calls a method on each symbol. There is no apparent destructive code in walksymtable that could change the stash, but the called code might lead to changes; here it's &$recurse and ->$method(). It can be totally unexpected, e.g. loading a module which changes the symbol table or the @ISA hierarchy, which might lead to skipped symbols. Even just parsing normal perl code, such as attributes or named regex references, will magically load code and change the symbol table.

each is a bit more performant than foreach, because foreach first stores the list to iterate over and then just advances an index into it. But with for/foreach you are still able to change entries in the hash, you are safe to use recursion, and you are safe from unexpected side effects as in the walksymtable function.
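
A quick way to measure the difference yourself, as a sketch (numbers vary by perl version and hash size):

use strict;
use warnings;
use Benchmark qw(cmpthese);

my %h;
@h{1 .. 10_000} = (1 .. 10_000);
cmpthese(-2, {
    each_loop    => sub { my $n = 0; while (my ($k, $v) = each %h) { $n += $v } },
    foreach_keys => sub { my $n = 0; $n += $h{$_} for keys %h },
});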

Second problem: Not re-entrant

while (($key, $value) = each %hash) {
    use Data::Dumper;
    print $key, "\n";
    delete $hash{$key};
    print Dumper(\%hash);  # This is a bug
}

The problem is that Data::Dumper internally uses each to iterate over the hash, which resets the iterator of the hash to the end. The outer while loop will not be able to continue, and the iterator cannot even be saved and reset manually. This is a Data::Dumper bug that went undetected and unfixed in core perl for 16 years, since perl 5.005 (1998).
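
A safe variant is to take a snapshot of the keys first, so nested code may call each on the same hash without disturbing the outer loop; a sketch:

use Data::Dumper;

for my $key (keys %hash) {        # keys() returns a copy, no shared iterator
    print $key, "\n";
    delete $hash{$key};
    print Dumper(\%hash);         # may run each() internally, we no longer care
}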

each not being re-entrant is not even documented.

The hash iterators should really have been stored in the op, not in the hash. This is one of the oldest perl5 bugs, one which will never be fixed.

So each should be treated as in PHP: avoid it like the plague. Only use it in optimized cases where you know what you are doing, and never in libraries, as you saw with Data::Dumper.

My new buildbot

I've added buildslaves on various architectures to my new buildbot at http://perl514.cpanel.net/build/. It took me about a day to set it up.

The projects tested so far are: perl5, rakudo, nqp, parrot and p2.

The buildslaves are currently: centos5 (old system), cygwin, cygwin64, darwin (my macbook air), debian (fresh), mingw32 (strawberry), ppc (an old powerbook).

And soon: mingw64 (python refuses mingw64 and demands msvc), freebsd, openbsd, solaris10 (intel 64bit).

All the slaves initiate the connection to the buildbot master on some private TCP port, so they can all stay behind my firewall. Most of them are VMs on my devbox, and I start/stop them at random.

The setup consisted of installing python 2.7 and setuptools, and then easy_install buildbot-slave. The configuration is a bit tricky, but it's python code, not just some data format, and it's not jenkins.

I currently only watch the main branches, not smoke-me. The biggest problem is that some projects are not "remake clean": git update + configure + make + make test might need a make clean or even a git clean -dxf before. This is fine if your smokers are big and fast, but my smokers are small and slow.

The UI is also not the best so far. I want to group the projects into BuildSets, and eventually add benchmark data.

I have already found and fixed a couple of previously undetected bugs.

Eventually I'll add random CPAN module testing against all my installed major perls on all these architectures. My old smoker just smoked the stable B::C against the latest blead changes daily. And from time to time I get some cpantesters failure reports, but nothing beats smoking beforehand and all at once. There's also an interface to submit ad-hoc builds via 'try'.

This is the buildbot master cfg:

# -*- python -*-
# ex: set syntax=python:

allslaves=["debian",
    "ppc", "darwin",
    "cygwin","cygwin64",
    "mingw32","mingw64","freebsd8",
    "openbsd49","solaris10","centos5",
    ]
projects=["parrot","nqp","rakudo","perl5","p2"]
builders = []
for p in projects:
    for slave in allslaves:
        name = p + "-" + slave
        builders.append(name)

# This is the dictionary that the buildmaster pays attention to. We also use
# a shorter alias to save typing.
c = BuildmasterConfig = {}

####### BUILDSLAVES

# The 'slaves' list defines the set of recognized buildslaves. Each element is
# a BuildSlave object, specifying a unique slave name and password.  The same
# slave name and password must be configured on the slave.
from buildbot.buildslave import BuildSlave
c['slaves'] = []
for slave in allslaves:
    c['slaves'].append(BuildSlave(slave, "******"))

# 'slavePortnum' defines the TCP port to listen on for connections from slaves.
# This must match the value configured into the buildslaves (with their
# --master option)
c['slavePortnum'] = 9989

####### CHANGESOURCES

# the 'change_source' setting tells the buildmaster how it should find out
# about source code changes.

from buildbot.changes.gitpoller import GitPoller
c['change_source'] = []
c['change_source'].append(GitPoller(
        'git://github.com/parrot/parrot.git',
        workdir='gitpoller-parrot', 
        # branches=True for all branches only with buildbot master (0.9)
        branch='master',
        project='parrot',
        pollinterval=300))
c['change_source'].append(GitPoller(
        'git://github.com/perl6/nqp.git',
        workdir='gitpoller-nqp',
        branch='master',
        project='nqp',
        pollinterval=300))
c['change_source'].append(GitPoller(
        'git://github.com/rakudo/rakudo.git',
        workdir='gitpoller-rakudo',
        branch='nom',
        project='rakudo',
        pollinterval=300))
c['change_source'].append(GitPoller(
        'git://github.com/Perl/perl5.git',
        workdir='gitpoller-perl5',
        branch='blead',
        project='perl5',
        pollinterval=300))
c['change_source'].append(GitPoller(
        'git://github.com/perl11/p2.git',
        workdir='gitpoller-p2',
        branch='p2',
        project='p2',
        pollinterval=300))


####### SCHEDULERS

# Configure the Schedulers, which decide how to react to incoming changes.  In this
# case, just kick off a 'runtests' build

from buildbot.schedulers.basic import SingleBranchScheduler
from buildbot.schedulers.forcesched import ForceScheduler
from buildbot.changes import filter
c['schedulers'] = []
for p in projects:
    names = []
    for slave in allslaves:
        names.append(p + "-" + slave)
    c['schedulers'].append(SingleBranchScheduler(
        name=p,
        change_filter=filter.ChangeFilter(project=p,branch_re='.*'),
        treeStableTimer=None,
        builderNames=names))

c['schedulers'].append(ForceScheduler(
    name="force",
    builderNames=builders))

####### BUILDERS

# The 'builders' list defines the Builders, which tell Buildbot how to perform a build:
# what steps, and which slaves can execute them.  Note that any particular build will
# only take place on one slave.

from buildbot.process.factory import BuildFactory
from buildbot.steps.source.git import Git
from buildbot.steps.shell import ShellCommand

factory = {}
p='parrot'
factory[p] = BuildFactory()
# check out the source
factory[p].addStep(Git(repourl='git://github.com/parrot/parrot.git', mode='incremental'))
#factory[p].addStep(ShellCommand(command=['git','clean','-dxf']))
factory[p].addStep(ShellCommand(command=["perl","Configure.pl","--optimize"]))
factory[p].addStep(ShellCommand(command=["make","-j4"]))
factory[p].addStep(ShellCommand(command=["make","-j4","fulltest"]))

p='nqp'
factory[p] = BuildFactory()
factory[p].addStep(Git(repourl='git://github.com/perl6/nqp.git', mode='incremental'))
#factory[p].addStep(ShellCommand(command=['git','clean','-dxf']))
#factory[p].addStep(ShellCommand(command=['rm','install/bin/parrot*','||','true']))
factory[p].addStep(ShellCommand(command=['make','clean']))
factory[p].addStep(ShellCommand(command=["perl","Configure.pl",
    "--gen-parrot=master","--backends=parrot"]))
# -j4 to be fixed with rotwang/parallel-make-gh152
factory[p].addStep(ShellCommand(command=["make","-j4"]))
factory[p].addStep(ShellCommand(command=["make","-j4","test"]))
#factory[p].addStep(ShellCommand(command=["make"]))
#factory[p].addStep(ShellCommand(command=["make","test"]))

p='rakudo'
factory[p] = BuildFactory()
factory[p].addStep(Git(repourl='git://github.com/rakudo/rakudo.git', mode='incremental'))
#factory[p].addStep(ShellCommand(command=['make','clean']))
#factory[p].addStep(ShellCommand(command=['git','clean','-dxf']))
factory[p].addStep(ShellCommand(command=["perl","Configure.pl",
    "--gen-parrot=master","--backends=parrot","--gen-nqp=master","--parrot-make-option='--jobs=4'"]))
factory[p].addStep(ShellCommand(command=["make"]))
factory[p].addStep(ShellCommand(command=["make","test"]))

p='perl5'
factory[p] = BuildFactory()
factory[p].addStep(Git(repourl='git://github.com/Perl/perl5.git', mode='incremental'))
factory[p].addStep(ShellCommand(command=['git','clean','-dxf']))
factory[p].addStep(ShellCommand(command=['sh','Configure','-de','-Dusedevel','-Doptimize=-Os']))
factory[p].addStep(ShellCommand(command=["make","-j4"]))
factory[p].addStep(ShellCommand(command=["make","-j4","test"]))

p='p2'
factory[p] = BuildFactory()
factory[p].addStep(Git(repourl='git://github.com/perl11/p2.git', mode='incremental'))
factory[p].addStep(ShellCommand(command=['git','clean','-dxf']))
factory[p].addStep(ShellCommand(command=['make','config']))
factory[p].addStep(ShellCommand(command=["make","-j4"]))
factory[p].addStep(ShellCommand(command=["make","-j4","test"]))

from buildbot.config import BuilderConfig

c['builders'] = []

for p in projects:
    for slave in allslaves:
        name = p+"-"+slave
        c['builders'].append(
            BuilderConfig(name=name,
                          slavenames=slave,
                          factory=factory[p]))

####### STATUS TARGETS

# 'status' is a list of Status Targets. The results of each build will be
# pushed to these targets. buildbot/status/*.py has a variety to choose from,
# including web pages, email senders, and IRC bots.

c['status'] = []

from buildbot.status import html
from buildbot.status.web import authz, auth

authz_cfg=authz.Authz(
    # change any of these to True to enable; see the manual for more
    # options
    auth=auth.BasicAuth([("******","******")]),
    gracefulShutdown = False,
    forceBuild = 'auth', # use this to test your slave once it is set up
    forceAllBuilds = False,
    pingBuilder = False,
    stopBuild = False,
    stopAllBuilds = False,
    cancelPendingBuild = False,
)
c['status'].append(html.WebStatus(http_port=8010, authz=authz_cfg))

####### PROJECT IDENTITY

# the 'title' string will appear at the top of this buildbot
# installation's html.WebStatus home page (linked to the
# 'titleURL') and is embedded in the title of the waterfall HTML page.

c['title'] = "Perl VM"
c['titleURL'] = "http://parrot.org/"

# the 'buildbotURL' string should point to the location where the buildbot's
# internal web server (usually the html.WebStatus page) is visible. This
# typically uses the port number set in the Waterfall 'status' entry, but
# with an externally-visible host name which the buildbot cannot figure out
# without some help.

c['buildbotURL'] = "https://perl514.cpanel.net:8010/"

####### DB URL

c['db'] = {
    # This specifies what database buildbot uses to store its state.  You can leave
    # this at its default for all but the largest installations.
    'db_url' : "sqlite:///state.sqlite",
}

B::C[C] and rperl

With rperl we now have a second serious perl compiler project, which has distinct advantages over B::C and B::CC, but also several disadvantages.

Basically we need to define compilation blocks for rperl (restricted perl) - typed and compile-time optimizable syntax - and combine it with unoptimizable dynamic perl5 syntax, to be able to run ordinary perl5 code, partially optimized.

  • B::C can compile dynamic perl5 syntax to C code and simply run the unoptimized optree through libperl. 10x better startup and destruction times, and 10% memory savings.

  • B::CC does control-flow and simple type optimizations, esp. on stack variables, but has no proper type integration yet and will never be able to run 100% of perl5 code. 2-5x better run-time. perl5 is too dynamic a language to be properly compilable as you would expect. But perl5 has support to parse and store types, just nobody is using it yet (see the small sketch after this list). And p5p does not support optimizations based on the type information; it rather blocks such attempts. So we need a separate compilation to C (reini), C++ (will) or LLVM (goccy).

  • perlcc, the frontend, cannot yet compile .pm packages to single shared libraries and link them together, which would allow different compile-time optimizations and reduce compile and link times.

  • rperl can compile restricted typed syntax, using C++ libraries to operate on typed data and source-level type information. 5-20x better run-time on 5-10% of your code. This is a much better approach than B::CC's, which optimizes only int and num lexical variables. rperl can also efficiently handle typed, non-magic arrayrefs and hashrefs, and typed function arguments and return types, because it has control over the parser and compiler, which B::CC has not.
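
As an aside to the B::CC point above, perl5 has parsed and stored "my TYPE $var" declarations for ages; a small sketch (the class name is merely recorded with the lexical, and apart from the fields pragma nothing in core optimizes on it):

use strict;
use warnings;

package Dog { sub new { bless { name => $_[1] }, $_[0] } }

my Dog $spot = Dog->new("Spot");   # the class name is attached to the pad entry,
print ref($spot), "\n";            # but core currently does not optimize on it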

The problem is that rperl cannot use the perl AST, the B optree, because it is too tightly bound to the mostly undocumented internal ops and data, which are hard to work with properly from outside perl5. So rperl uses the PPI AST, interestingly called a DOCTREE, and does the type optimization and translation to C++ on this doctree via Inline::C++ (which is, by the way, called Inline::CPP, not CXX, and has nothing to do with the C preprocessor).

If you look at rperl/docs/rperl_grammar.txt you'll see the parsed boundaries at which another compiler can interact with rperl. Basically we could use separate compilation units (module files or programs) or single blocks with rperl syntax, and maybe later, as a more advanced problem, individual subroutines, but then you need to pass the type information of the arguments and return values back and forth. This is done via standard XS typemaps.

  • perlcc and a new script called buildcc should be usable as frontends to link separate compilation units together as compiled libraries (shared or static), and the different compilers should be able to detect each other and pass the work back and forth.

    It needs to be separate because they use different parsers and different compilers, but they agree on the same types and the typed calling convention. This calling convention should be specified on the C/C++ level, and for the user it would be nice if the perl-level types also agreed. The base for these types needs to be the perl6 types, because these are the only ones in existence today.

  • B::C can do int and num and accepts str.

  • rperl also requires object and void, and can already do aggregate types, like arrayrefs and hashrefs of those types.

  • perl6 adds bool, some more number types, and intermediate and meta-object types, which we don't need yet. What needs to be added are sized array declarations, to be able to omit run-time bounds checks, but this should be trivial.

rperl already offers compile-time or run-time type-checks and compile-time type optimizations.

p5p and the mop project have no interest in helping with compile-time optimizations (on type and const information), but there is some minor progress in getting support for compile-time readonly-ness into core. My estimate for this is three years, for a simple project which needs about 2 months of work, and which actually already exists. It might get in by adding proper class, role and method support, but I would not bet on it, as the current p5-mop project ignores those possibilities, and even within perl6 I see no focus on OO performance. I see not a single use oo :closed :final or the accompanying use class pragma yet. But at least they implemented the basic type optimizations.

How to interface?

Inline is not easy to interface with, is notoriously hard to debug, and has several long-standing bugs and omissions. But it's the best and most transparent way, and the only way to mix perl and rperl code on the source level. The biggest bug is the notorious namespace hack, i.e. the stashes for the generated functions and data are missing. Argument passing also has some serious limitations. Will is using a really crazy hack right now in rperl to push arrayref and hashref arguments properly onto the stack. The Inline stack macros are too simple; they only work for trivial examples. Inline::CPP does not properly support passing plain arrays and hashes to and from perl5 code, so for now rperl uses just scalar types, esp. arrayrefs and hashrefs. We will need to look into lifting those limitations.

Methods

Methods and subroutines need to be separated at compile time for rperl/Inline::CPP. In the old days there was the :method attribute idea, which is not accessible at compile time, only at run time. And there were Devel::Declare based hacks to support class and method keywords, but they never made it into core. Nowadays Devel::Declare is not needed anymore, but there is still no flag to denote methods-only or OO semantics, as in perl6, to enable OO compile-time optimizations such as method dispatch and method inlining: use oo :closed :final. We cannot even declare read-only hashes or @ISA arrays properly yet, as they still clash with restricted hashes and COW. So they need to be implemented separately, as done in rperl. In rperl method types are done via special type names, to denote the return type and whether it's a method or a normal subroutine.

Technically the B::C -m switch ("compile module") can already easily be used for .pm modules alone; it compiles to a library, not to a standalone executable. But the problem is to automate the separation in the code walker. A module should only compile itself, not its dependencies, but the parser has already included those before the compiler sees it. The bytecode compiler B::Bytecode already does this separation by calling out on the require op to the other module, and require observes the .pmc extension, so it loads the other bytecode-compiled module. With B::C the walker and loader are different. The split needs to be done on the FILE field in nextstate COPs, CVs and GVs; it cannot be done on the current separation, the GV namespaces.

The B::C project timeline is to implement -m and its new separation immediately after the 1.43 release, with a target date of fall 2014. buildcc then collects the deps and drives perlcc, and perlcc is used to compile the modules. Work started in the modul branch in the perl-compiler repo.

-- We, Reini Urban and Will Braswell, wrote most parts of this blog post on Saturday, January 4, 2014 in Austin at an informal perl11 party celebrating the rperl 1.0 beta, with Matt Trout across the street (it was cold and late), but he didn't show up. Maybe he will add some comments later. He supported the idea of an integration, somehow.

How to resize a NTFS qemu qcow2 image (the easy way)

There's so much bad information out there on how to simply resize an NTFS qemu qcow2 (windows in kvm) image, and I frequently need to enlarge my windows images, esp. on win8 (64 bit), so I'll document it here for the next time:

  • Don't use fdisk if you have gparted!
  • Don't waste space converting a qcow2 to raw
  • Shutdown the vm!

  • $ sudo su -

  • $ ls /var/lib/libvirt/images/*.qcow2
  • $ qemu-img resize /var/lib/libvirt/images/windows_x64.qcow2 +5GB

  • $ modprobe nbd max_part=63

  • $ qemu-nbd -c /dev/nbd0 /var/lib/libvirt/images/windows_x64.qcow2

  • $ # fdisk -l /dev/nbd0 # or better

  • $ kpartx /dev/nbd0 # to check the partitions and sizes
  • $ ntfsresize --info /dev/nbd0 # to check if the volumes are not dirty

  • $ gparted /dev/nbd0 # grow the partition to its maximum size

  • $ killall qemu-nbd

  • Start the vm and let windows do the chkdsk /f (in win8 automatically)

  • In winxp you might need to do ntfsresize -x /dev/nbd0p2 manually