Tangential question: why does it normally take so long to start traditional VMs in the first place? At least on Windows, if you start a traditional VM, it takes several seconds for it to start running anything.
Edit: when I say anything, I'm not talking user programs. I mean as in, before even the first instruction of the firmware -- before even the virtual disk file is zeroed out, in cases where it needs to be. You literally can't pause the VM during this interval because the window hasn't even popped up yet, and even when it has, you still can't for a while because it literally hasn't started running anything. So the kernel and even firmware initialization slowness are entirely irrelevant to my question.
You can optimize a lot to start a Linux kernel in under a second, but if you're using a standard kernel, there is all manner of timeouts and poll attempts that make the kernel waste time booting. There's also a non-trivial amount of time the VM spends in the UEFI/CSM system preparing the virtual hardware and initializing the system environment for your bootloader. I'm pretty sure WSL2 uses a special kernel to avoid the unnecessary overhead.
You also need to start OS services, configure filesystems, prepare caches, configure networking, and so on. If you're not booting UKIs (unified kernel images) or similar, you'll also be loading a bootloader, then loading an initramfs into memory, then loading the main OS and starting the services you actually need, with each step requiring certain daemons and hardware probes to work correctly.
There are tools to fix this problem. Amazon's Firecracker can start a Linux VM in a time similar to that of a container (milliseconds), mainly by skipping the BIOS/UEFI and legacy device emulation entirely and booting a kernel directly into a minimal virtio device model; it can also snapshot an initialized VM and restore that instead of performing a real boot. https://firecracker-microvm.github.io/
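To give a sense of how little Firecracker deals with, here's a minimal sketch of a config you could feed it (file names and sizes are placeholders; you'd point it at your own uncompressed kernel and ext4 rootfs):

    {
      "boot-source": {
        "kernel_image_path": "vmlinux",
        "boot_args": "console=ttyS0 reboot=k panic=1"
      },
      "drives": [{
        "drive_id": "rootfs",
        "path_on_host": "rootfs.ext4",
        "is_root_device": true,
        "is_read_only": false
      }],
      "machine-config": { "vcpu_count": 1, "mem_size_mib": 128 }
    }

Run it with something like:

    firecracker --no-api --config-file vm.json

There's no BIOS, no bootloader, no legacy device probing: the kernel entry point is effectively the first guest instruction.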
On Windows, I think it depends on the hypervisor you use. Hyper-V has a pretty slow UEFI environment, its hard disk access always seems rather slow to me, and most Linux distros don't seem to package dedicated minimal kernels for it.
I'm saying it takes a long time for it to even execute a single instruction, in the BIOS itself. Even for the window to pop up, before you can even pause the VM (because it hasn't even started yet). What you're describing comes after all that, which I already understand and am not asking about.
Unsubstantiated hunch: the hypervisor is doing a shitload of probes against the host system before allocating/configuring virtual hardware devices/behaviors. Since the host's hardware/driver/kernel situation can change between hypervisor invocations, it might have to re-answer a ton of questions about the host environment in order to provide things like "the VM/host USB bridge uses so-and-so optimized host kernel/driver functionality to speed up accesses to a VM-attached USB device". Between running such checks for all behaviors the VM needs, and the possibility that wasteful checks (e.g. for rare VM behaviors or virtual hardware that's not in use) are also performed, that could take some time.
On the other hand, it could just as easily be something simple, like setting up hugepages or checksumming virtual hard disk image files.
Both are total guesses, though. Could be anything!
I have always wondered the same; never tried looking into it, but I wouldn't be surprised if Defender at least played a part in it. Defender is a huge source of general slowness on Windows in my experience.
Please tell me you are joking. Even if it’s a lie.
Management Engine... actually, I do not have the energy to deal with paranoid people. I never had that kind of energy. I never will. You’re all so efficient at drawing energy out of conversations and killing them. You’re like conversational vampires. It’s exhausting.
I don’t even care if you’re right or wrong about Intel ME. It is just so exhausting listening to you guys because of your word choices. It’s like you try to get ignored.
I respect your opinion and all, you just need to work on your messaging or something.
I think you need to provide more details on what VM software you’re using. On VirtualBox what you describe is very noticeable, and it didn’t have that delay in older versions. So it could be just an issue with that VM software and not a general “traditional VMs” issue.
Yup, I'm asking about VirtualBox mainly; I just don't understand what the heck it's doing during that time that takes so long. Although I don't recall other VMs (like, say, Hyper-V) being dramatically different either (ignoring WSL2 here).
Are you just guessing or have you actually seen the delay I'm talking about disappear as a result of this (or as a result of anything else for that matter)? Because I've already done this (yes, entirely, even the kernel mode drivers) and it's definitely not the issue.
There was a release of Subversion back in the day that reduced the number of files that were opened during a repo action like update, and the number of times any one file got opened. On Linux it ran about 2-3x faster. Very nice change.
On Windows it was almost 10x faster. On the project where this change was released, my morning ritual was to come in, log on, run an svn update, lock my screen and go get coffee. I had at least ten minutes to kill after I got coffee, if the pot wasn’t empty when I got there.
Windows is hot garbage about fopen, particularly when virus scanning is on.
Yes. The delay you’re complaining about happens because you are looking at general-purpose hypervisors, which come with virtualized hardware and need to mimic a bunch of stuff so that most software will work as usual.
For example: your VM starts up with the CPU in 16-bit mode, because that’s just how things work on x86, and then it waits for the guest OS to set the CPU into 64-bit mode.
This is completely unnecessary if you just want to run x86-64 code in a virtualized environment and you control the guest kernel: you can just assume things are in 64-bit mode, because it’s not the ’70s or whatever.
The guest OS would also need to probe a few ports to find a bootable disk. If you control the kernel then you can just not do that and boot directly.
No, it is not. The “first instruction in the BIOS” is 16-bit mode code when dealing with an x86 VM.
A virtual environment doesn’t even really need any BIOS or anything like that.
You can feel free to test with QEMU direct kernel booting to see that this skips a lot of the delay, without even having to use a specialized hypervisor like Firecracker.
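Something like this, assuming a locally built bzImage with virtio-mmio support and a raw rootfs image (paths are placeholders):

    qemu-system-x86_64 -enable-kvm -M microvm -m 512M -nographic \
        -kernel bzImage -append "console=ttyS0 root=/dev/vda rw" \
        -drive file=rootfs.img,format=raw,if=none,id=hd0 \
        -device virtio-blk-device,drive=hd0

The -M microvm machine type drops most of the legacy PC platform, so there's almost nothing to enumerate, and the kernel is entered directly instead of via firmware and a bootloader.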
A bare VM may not have a BIOS, it's just partitioning supported by the host CPU and OS. The emulation of the legacy PC hardware stack for conventional OS compatibility is a separate thing. If the guest OS is custom-designed to launch in a bare VM with known topology it can boot very, very fast.
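To make "just partitioning supported by the host CPU and OS" concrete, here's a minimal sketch against the Linux KVM API (error handling omitted; the guest code bytes are hand-assembled 16-bit x86) that runs guest instructions with no BIOS and no virtual hardware at all:

    /* kvm-min.c: build with cc kvm-min.c; needs a Linux host with /dev/kvm. */
    #include <fcntl.h>
    #include <linux/kvm.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* Guest code (16-bit real mode): write 'A' to port 0x3f8, then halt. */
        const uint8_t code[] = {
            0xb0, 'A',        /* mov al, 'A'   */
            0xba, 0xf8, 0x03, /* mov dx, 0x3f8 */
            0xee,             /* out dx, al    */
            0xf4,             /* hlt           */
        };

        int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
        int vm  = ioctl(kvm, KVM_CREATE_VM, 0);

        /* One page of guest RAM, mapped at guest physical address 0. */
        void *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        memcpy(mem, code, sizeof code);
        struct kvm_userspace_memory_region region = {
            .slot = 0, .guest_phys_addr = 0,
            .memory_size = 0x1000, .userspace_addr = (uint64_t)mem,
        };
        ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

        int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
        struct kvm_run *run = mmap(NULL, ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, NULL),
                                   PROT_READ | PROT_WRITE, MAP_SHARED, vcpu, 0);

        /* Point CS:IP at guest address 0 -- no firmware involved anywhere. */
        struct kvm_sregs sregs;
        ioctl(vcpu, KVM_GET_SREGS, &sregs);
        sregs.cs.base = 0;
        sregs.cs.selector = 0;
        ioctl(vcpu, KVM_SET_SREGS, &sregs);
        struct kvm_regs regs = { .rip = 0, .rflags = 0x2 };
        ioctl(vcpu, KVM_SET_REGS, &regs);

        for (;;) {
            ioctl(vcpu, KVM_RUN, NULL);
            if (run->exit_reason == KVM_EXIT_IO && run->io.port == 0x3f8)
                putchar(*((char *)run + run->io.data_offset));  /* prints 'A' */
            else if (run->exit_reason == KVM_EXIT_HLT)
                break;
        }
        return 0;
    }

The first guest instruction executes as soon as KVM_RUN is issued; everything slow about a "traditional" VM launch is layered on top of this.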
In Linux, VM memory allocations can be slow if the hypervisor tries to allocate GBs of RAM using 4K pages. There are ways to help it allocate 1GB at a time (1 GiB hugepages), which vastly speeds it up.
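On a Linux host with QEMU, for example, the rough recipe looks like this (a sketch, assuming a 4 GiB guest, root access, and CPU/kernel support for 1 GiB pages; paths are placeholders):

    # reserve four 1 GiB hugepages and mount a hugetlbfs for them
    echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
    mount -t hugetlbfs -o pagesize=1G none /mnt/huge1g

    qemu-system-x86_64 -m 4G \
        -object memory-backend-file,id=mem0,size=4G,mem-path=/mnt/huge1g,share=on,prealloc=on \
        -numa node,memdev=mem0 ...

Note that prealloc=on deliberately pays the whole allocation (and zeroing) cost at startup; dropping it defers that work until the guest touches each page.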
Try Windows Server Core on an SSD. I've seen VMs launch in low single-digit seconds. You can strip it down even further by removing non-64-bit support, Defender, etc...
I mean, it is basically booting a computer from scratch, so it kind of makes sense. You have to allocate memory, start virtual CPUs, initialize devices, run BIOS/UEFI checks, perform hardware enumeration, all that jazz, while emulating all of it, which tends to be slower than "real" implementations. I guess there are a bunch of security-related steps as well, like zeroing pages and similar things, that take additional time.
If I let a VM use most of my hardware, it takes a few seconds from start to login prompt, which is the same time it takes for my Arch desktop to boot from pressing the button to seeing the login prompt.
> You have to allocate memory, start virtual CPUs, initialize devices, run BIOS/UEFI checks, perform hardware enumeration, all that jazz, while emulating all of it, which tends to be slower than "real" implementations.
That's not what I'm asking.
I'm saying it takes a long time for it to even execute a single instruction, in the BIOS itself. Even for the window to pop up, before you can even pause the VM (because it hasn't even started yet). What you're describing comes after all that, which I already understand and am not asking about.
Without any context in terms of what the VM is doing or what VMM software you use, my best guess is that the OS/VMM are pre-allocating memory for the VM. This might involve paging out other processes' memory, which could take some time.
I think Task Manager would tell you if there is a blip of memory usage and paging activity at the time. And I'm sure Windows itself has profilers that can tell you what is happening when the VM is started.
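For example (a sketch; wpr ships with recent Windows, and I'm assuming a VirtualBox VM started via VBoxManage with a placeholder name):

    wpr -start CPU -start VirtualAllocation -filemode
    "C:\Program Files\Oracle\VirtualBox\VBoxManage.exe" startvm "SomeVM"
    wpr -stop vmstart.etl

Opening vmstart.etl in Windows Performance Analyzer should show where the time goes during that pre-boot gap.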
VirtualBox on Windows, primarily. Though I feel like I haven't seen other VMs in the past start up a whole ton faster (maybe somewhat) (ignoring WSL2). Page files are already disabled, there's plenty of free RAM, and it makes no difference how little RAM the guest is allocated (even if it's 256MB). So no, those are not the issues. VirtualBox itself seems to be doing something slow during that time and I don't know what that is.
I remembered something about VirtualBox not playing nicely with Hyper-V on Windows, and dug up a possibly relevant post[0] on their forums. IIRC we ended up moving a few build systems to Docker and dropping VirtualBox because of hyper-v related issues, but it's been a few years.
That's the unrelated green-turtle issue. It's only relevant after the guest has actually started running instructions. I'm talking about before that point.
I'm not aware of any turtles, that was just the first thing I found when trying to see if VirtualBox and Hyper-V were still a problematic combo.
Again, it was a few years ago, but we didn't solve the problem or identify an actual root cause. We stopped banging our heads against that particular wall and switched technologies.
What is your definition of free memory? If the system has read a lot of data, the page cache is probably occupying most of the RAM you consider free. Look at cache and standby counters.
I’ve noticed that Windows can only evict data from the page cache at about 5 GB/s. I do not know if this zeros the memory or if that would need to be done in the allocation path.
A couple years ago I tracked down a long pause while starting qemu on Linux to it zeroing the 100s of GB of RAM given to the VM as 1 GB huge pages.
These may or may not be big contributors to what you are seeing, depending on the VM’s RAM size.
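For the cache/standby counters mentioned above, one way to read them on Windows (assuming the standard perf counters are present) is PowerShell:

    Get-Counter '\Memory\Cache Bytes',
                '\Memory\Standby Cache Normal Priority Bytes',
                '\Memory\Standby Cache Reserve Bytes'

If those numbers account for most of your "free" RAM, a big VM allocation has to evict or zero that memory first.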
I experienced something similar back when Microsoft decided to usurp all hypervisors made for Windows and make Windows itself run as a VM on Hyper-V running as a Type 1 hypervisor on the hardware. That made it so other VMs could only run on Hyper-V alongside Windows or with nested virtualization.
So this meant VMware, VirtualBox, etc., as they were, would no longer work on Windows. Microsoft required all of them to switch to using Hyper-V libs behind the scenes to launch Hyper-V VMs and then present them as their own (while hiding them from the Hyper-V UI).
VirtualBox was slow, hot garbage on its own before this happened, but now it's even worse. They didn't optimize their Hyper-V integration as well as VMware (eventually) did. VMware is still worse off than it was, though, since it has to inherit all of Hyper-V's problems behind the scenes.
> Edit: when I say anything, I'm not talking user programs. I mean as in, before even the first instruction of the firmware -- before even the virtual disk file is zeroed out, in cases where it needs to be. You literally can't pause the VM during this interval because the window hasn't even popped up yet, and even when it has, you still can't for a while because it literally hasn't started running anything. So the kernel and even firmware initialization slowness are entirely irrelevant to my question.
Why is that?