A Virtual Machine (VM) is an efficient, isolated duplicate of a real computer system. More than one VM may be provided concurrently by a single real system. A real system may have a number of resources that it provides to an operating system or application software for use. The central processing unit (CPU), also referred to as the processor, and motherboard chipset may provide a set of instructions and other foundational elements for processing data, memory allocation, and input/output (I/O) handling. The real system may further include hardware devices and resources such as memory, video, audio, disk drives, and ports (universal serial bus, parallel, serial). In a real system, the basic I/O system (BIOS) provides a low-level interface that an operating system can use to access various motherboard and I/O resources. With a real system, when an operating system accesses a hardware device, it typically communicates through a low-level device driver that interfaces directly to physical hardware device memory or I/O ports.
When a system is hosting a virtual machine environment, one or more guest software applications may be executed by the CPU in such a manner that each guest software application (guest) can execute as though it were executing with exclusive control of the system. This may require that the CPU execute a Virtual Machine Monitor (VMM) along with the guest to prevent the guest from altering the state of the system in a way that would conflict with the execution of other guests. The VMM may be referred to as the monitor. The VMM may be provided as software, firmware, hardware, or a combination of two or more of these.
The VMM may place the processor in a mode in which execution of certain instructions that could alter the state of the CPU and create conflicts with other guests is trapped and control is passed to the VMM. Instructions which are trapped may be called privileged instructions. The VMM is then able to handle the guest attempt to execute a privileged instruction in a manner that makes the trapping of the instruction transparent to the guest while preventing the processor from being placed in a state that interferes with the execution of other guests. When a guest executes privileged instructions that inspect or modify hardware state, the instructions appear to the guest to execute directly on the hardware, but they are instead trapped and passed to the VMM to be virtualized.
When a trap to the VMM occurs, the VMM may save the state of the processor as it was when the privileged instruction was executed by the guest. The VMM may then restore the state of the processor to what it should be after execution of the privileged instruction before control is returned to the guest. The trap from guest to VMM is referred to as a VMEXIT. The monitor may resume the guest with either a VMRESUME or a VMLAUNCH instruction, which may be collectively referred to as a VMENTER. The time taken by a VMEXIT and VMENTER pair is referred to as the Exit-Enter Time (EET).
As shown in FIG. 1, a computer system may include a central processing unit (CPU) 10, also referred to as a processor, coupled to a random access memory (RAM) 30. A memory bridge 20 may couple the processor 10 to the memory 30. The RAM may be any of a variety of types of memory such as synchronous dynamic random access memory (SDRAM), RAMBUS® dynamic random access memory (RDRAM), or extended data out random access memory (EDO RAM).
The computer system may include a number of devices that are coupled to the processor 10. A video device 22 may provide a visual display that may receive data from the processor 10 through the memory bridge 20. The memory bridge may also be coupled to an I/O bridge 40. The I/O bridge may be coupled in turn to various devices such as disk drives 42, a Peripheral Component Interconnect (PCI) bus 44 that supports various expansion cards, local I/O devices 46 such as timers and power control devices, and Universal Serial Bus (USB) 48 connectors.
The RAM 30 may be loaded with data that represents executable instructions that may be executed by the processor 10. The RAM 30 may further contain data structures used by the processor to control the execution of the processor such as pointers to routines to be executed when certain conditions are detected, data structures such as push down stacks to temporarily hold data being used by the processor, and other data structures to define the processing environment such as task contexts. It will be understood that the amount of RAM 30 accessible by the processor 10 may exceed the amount of RAM that is physically present in the computer system. Various memory management techniques may be used to manipulate the contents of the physical RAM 30 so that it appears to the processor 10 that all of the accessible RAM is present. The contents of the RAM 30 will be described as though all accessible RAM is physically present to avoid obscuring the operation of the described embodiments of the invention, but it should be understood that the structures described as being in memory may not all be in physical memory concurrently and that different memory structures may occupy the same physical memory successively while remaining logically distinct.
The processor 10 may be used to host one or more virtual machines (VMs). As shown in FIG. 2, a portion of RAM 30 may be assigned to each virtual machine 34 as a virtual machine context. The assigned portion of RAM 30 may be all or part of the RAM available to the processor 10. The assigned portion of RAM 30 may be loaded and unloaded as required to allow one virtual machine 34A to use some or all of the physical RAM assigned to another virtual machine 34B. The RAM 30 may support a virtual memory system to manage the use of the RAM so that each virtual machine 34A is able to use the RAM without regard to other virtual machines 34B that might also be hosted by the processor 10.
Each virtual machine 34A provides an environment for the execution of software that appears to be a dedicated physical machine that is protected and isolated from other virtual machines 34B. While only two virtual machines are shown, it is to be understood that any number of virtual machines may be hosted by the processor used in embodiments of the invention. Guest software may be executed in each virtual machine 34. The guest software may have an operating system (OS) 36 and one or more application programs 38 that are executed by the OS. The OS 36 on each virtual machine 34 may be the same as or different from the OS on other virtual machines.
The processor may host a Virtual Machine Monitor (VMM) 32 to manage the one or more virtual machines 34. The VMM 32 may trap the execution of certain instructions, which may be termed privileged instructions, by the virtual machines 34 so that each virtual machine 34A is able to operate without regard to other virtual machines 34B that might also be hosted by the processor 10. Privileged instructions may make a persistent change to the state of the processor that would alter the behavior of other virtual machines executed thereafter. The VMM 32 may virtualize the execution of privileged instructions that are trapped so that these instructions provide the expected machine state for the currently executing context without having the machine state persist to affect the later execution of other virtual machines.
Floating point operations using a Floating Point Unit (FPU) are examples of privileged instructions that may be virtualized. Guest software running in a virtual machine assumes that it can use the FPU as required to perform floating point arithmetic. FPU operations may use and affect a number of registers, which may be relatively wide, to hold floating point values. The VMM may virtualize the FPU for each of the virtual machines so that the registers as set by the operation of each virtual machine are present whenever the virtual machine is using the FPU. The virtualization of the FPU may be accomplished by one of several algorithms.
An algorithm for virtualization of the FPU, shown in the flow charts of FIGS. 3 and 4, may save the VMM's FPU state 50 and restore the virtual machine's FPU state 52 just prior to entering the virtual machine context 54. This may be termed an unconditional algorithm because the machine state is always saved and the state for the new context is restored on context changes. Shortly after the virtual machine context exits to the VMM 60, the virtual machine's FPU state is saved 62 and the VMM's FPU state is restored 64. The unconditional algorithm ensures that the FPU state is such that the VMM and each virtual machine can use the FPU without conflict with the other users of the FPU. The time consumed by the unconditional virtualization algorithm for each virtual machine context entered is twice the Save-Restore Time (SRT). SRT is the time required to save the FPU state of one context and restore the FPU state of another context, which occurs twice for each transfer of control to a virtual machine, once at the start and once at the end of the transfer. Thus the unconditional algorithm overhead time cost is 2×SRT.
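As a rough illustration (not part of the patent text), the following C sketch shows the shape of the unconditional algorithm around a single transfer of control to a guest. The fpu_state_t buffers, the fxsave/fxrstor wrappers, and the vmenter_guest helper are hypothetical placeholders standing in for whatever a particular VMM provides.

    #include <stdint.h>

    /* Hypothetical 512-byte FXSAVE area, 16-byte aligned (x86/GCC). */
    typedef struct { uint8_t bytes[512]; } __attribute__((aligned(16))) fpu_state_t;

    static fpu_state_t vmm_fpu;        /* FPU state owned by the VMM */
    static fpu_state_t guest_fpu[8];   /* one FPU state per guest VM */

    static inline void fpu_save(fpu_state_t *s)          { __asm__ volatile("fxsave %0" : "=m"(*s)); }
    static inline void fpu_restore(const fpu_state_t *s) { __asm__ volatile("fxrstor %0" : : "m"(*s)); }

    extern void vmenter_guest(int vm); /* hypothetical: VMENTER into guest vm, returns on VMEXIT */

    /* Unconditional algorithm: swap the FPU state on both sides of the
     * transfer of control, one SRT at entry and one SRT at exit (2*SRT). */
    void run_guest_unconditional(int vm)
    {
        fpu_save(&vmm_fpu);            /* FIG. 3: save the VMM's FPU state (50)          */
        fpu_restore(&guest_fpu[vm]);   /* restore the guest's FPU state (52)             */
        vmenter_guest(vm);             /* enter the VM context (54), run until exit (60) */

        fpu_save(&guest_fpu[vm]);      /* FIG. 4: save the guest's FPU state (62)        */
        fpu_restore(&vmm_fpu);         /* restore the VMM's FPU state (64)               */
    }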
It is possible that the VMM or a guest may not use the FPU. In such cases it is unnecessary to save and restore the FPU state. If such cases can be detected so that the FPU state is saved and restored less frequently than is done by the unconditional algorithm, it may be possible to reduce the overhead associated with virtualization of the FPU.
As shown in FIG. 1, the processor 10 may include a control register 12 to determine whether a currently executing task can execute instructions that affect the machine state. The control register 12 may be a predetermined location in memory 30, not shown, or a data storage location within the processor 10, as shown in FIG. 1. A control register 12, such as Control Register 0 (CR0) in an IA-32 Intel® Architecture processor 10, may include a flag 14 to control whether one or more instructions are privileged, such as the Task Switched (TS) bit, bit 3 of CR0, which controls whether floating point instructions will be executed or cause an exception. Another exemplary processor state flag 14′ controlling instruction privilege is the Monitor Coprocessor (MP) bit, bit 1 of CR0.
The state of the control register 12 may be used to determine if a save and restore of the FPU state is required. If the processor 10 is configured so that the instructions that persistently affect the processor state are privileged, then the state that is protected by making the instructions privileged does not need to be saved, as an exception will cause control to be transferred to the VMM if and when the currently executing task attempts to use the protected state. For example, if the control register 12 includes TS 14 and MP 14′ bits, both bits being set may configure the processor 10 so that all instructions that use or affect the FPU state will cause an exception. If it is determined that the thread to which the VMM is preparing to transfer control has not cleared either of these bits, then it is not necessary to save and restore the FPU state prior to transferring control to the thread. However, if the thread has configured the processor so that one or more instructions that use or affect the FPU state are not privileged, then the FPU state as set by the VMM is saved and the FPU state as set by the thread when previously executed is restored.
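As a minimal sketch of that test (the bit positions are the architectural ones for CR0, MP being bit 1 and TS bit 3; guest_cr0 is assumed to be the VM's virtualized CR0 value):

    #include <stdbool.h>
    #include <stdint.h>

    #define CR0_MP (1u << 1)   /* Monitor Coprocessor */
    #define CR0_TS (1u << 3)   /* Task Switched       */

    /* With both TS and MP set, instructions that use or affect the FPU state
     * will fault rather than execute, so no save/restore is needed before
     * transferring control to this thread.                                  */
    bool fpu_save_restore_needed(uint64_t guest_cr0)
    {
        bool fpu_is_privileged = (guest_cr0 & CR0_TS) && (guest_cr0 & CR0_MP);
        return !fpu_is_privileged;
    }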
Algorithms to minimize the overhead of virtualization may attempt to minimize the number of times an FPU save and restore is required. This can be achieved with various algorithms. Some algorithms may delay the save and restore until the FPU is about to be used in a context which does not currently own the FPU, such as when VM(i) uses the FPU while the FPU context is that of VM(j). Other algorithms may delay the save and restore until the FPU is highly likely to be used in a context that does not own the FPU. These may be termed selective algorithms because the machine state is only saved, and the state for the new context restored, on context changes when it appears that the machine state will be used by the new context.
FIG. 5 illustrates an exemplary selective algorithm for virtualization of the FPU that uses the processor state for privilege of instructions in an attempt to perform the save and restore only when the FPU is highly likely to be used in a context that does not own the FPU. This exemplary selective algorithm assumes that the VMM always uses the FPU and therefore anytime a VM is going to use the FPU a save and restore of the FPU state will be required.
If the processor state for the VM that is to receive control makes the instructions that use or affect the FPU state privileged 70-YES, then the selective algorithm assumes that the VM will not use the FPU and does not do a save and restore of the FPU state. The VM context will be entered 72 with the VMM's FPU state. Conversely, if the processor state for the VM makes the FPU instructions unprivileged 70-NO, then the selective algorithm assumes that the VM will use the FPU, and the VMM FPU state is saved 74 and the VM FPU state restored 76 before the VM context is entered 78. If the VM does use the FPU when the selective algorithm assumed that it would not 80-YES, either by attempting to change the processor state with regard to privilege or by simply executing the privileged FPU instructions, then the processor will raise an exception and transfer control to the VMM 82. In response to this exception the VMM will save the FPU state of the VMM 84 and restore the FPU state for use by the VM 86. The VMM then re-enters the VM context 88. If the VM FPU state was restored either before or during execution of the VM context, then upon exiting the VM context 90 the VM's FPU state will be saved 92 and the VMM's FPU state will be restored 94 before the selective virtualization algorithm exits 96. If the FPU instructions were privileged 70-YES and the VM did not use the FPU 80-NO, then the selective virtualization algorithm exits 98 having avoided the SRT cost.
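In code form, this exemplary selective algorithm might look like the sketch below, which assumes, as the algorithm does, that the VMM always uses the FPU. The helpers (fpu_save, fpu_restore, vmenter_guest, fpu_instructions_privileged, guest_used_fpu) are hypothetical and correspond to the numbered steps of FIG. 5.

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct { uint8_t bytes[512]; } __attribute__((aligned(16))) fpu_state_t;

    extern fpu_state_t vmm_fpu, guest_fpu[];
    extern void fpu_save(fpu_state_t *s);
    extern void fpu_restore(const fpu_state_t *s);
    extern void vmenter_guest(int vm);                /* returns on VMEXIT            */
    extern bool fpu_instructions_privileged(int vm);  /* guest CR0.TS and CR0.MP set  */
    extern bool guest_used_fpu(int vm);               /* set by the exception handler */

    void run_guest_selective(int vm)
    {
        bool swapped = false;

        if (!fpu_instructions_privileged(vm)) {   /* 70-NO: guest likely to use FPU */
            fpu_save(&vmm_fpu);                   /* 74 */
            fpu_restore(&guest_fpu[vm]);          /* 76 */
            swapped = true;
        }
        vmenter_guest(vm);                        /* 72 or 78; a wrong prediction    */
                                                  /* traps to the VMM (82), which    */
                                                  /* swaps state (84, 86) and        */
                                                  /* re-enters the guest (88)        */

        if (swapped || guest_used_fpu(vm)) {      /* 90 */
            fpu_save(&guest_fpu[vm]);             /* 92 */
            fpu_restore(&vmm_fpu);                /* 94 */
        }                                         /* else exit with no SRT cost (98) */
    }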
The selective algorithm incurs costs in terms of the overhead time of the FPU virtualization algorithm. If the selective algorithm correctly predicts that the VM will not use the FPU, thus avoiding an unnecessary save and restore, the cost is zero. If the selective algorithm correctly predicts that the VM will use the FPU, the cost is twice the Save-Restore Time (2×SRT), the same cost as the unconditional algorithm. If the selective algorithm incorrectly predicts that the VM will not use the FPU, thus causing an exception to be raised leading to a save and restore, the cost is EET+2×SRT. This latter case incurs a higher cost than the unconditional algorithm. Thus the ability of the selective algorithm to reduce the overhead of FPU virtualization from the overhead of the unconditional algorithm depends on the effectiveness of the selective algorithm in predicting that the VM will not use the FPU.
If the fraction of correct predictions that the VM will not use the FPU is Q, and the fraction of incorrect predictions that the VM will not use the FPU is R, then the fraction of correct predictions that the VM will use the FPU is (1−Q−R). The actual overhead of the selective algorithm is then
    ((1−Q−R)×2×SRT) + (R×(2×SRT+EET)) + (Q×0)
which reduces to
    2×SRT − 2Q×SRT − 2R×SRT + 2R×SRT + R×EET
which further reduces to
    2×SRT − 2Q×SRT + R×EET
which further reduces to
    2×SRT×(1−Q) + R×EET
2×SRT×(1−Q) represents the expected value for SRT overhead and R×EET represents the expected value for EET overhead. Expected value of overhead is used to mean the statistical expectation of time cost for a context change based on the observation of a number of context changes. The selective algorithm will incur less cost in overhead than the unconditional algorithm if
    2×SRT×(1−Q) + R×EET < 2×SRT
which reduces to
    R×EET < 2Q×SRT
which further reduces to
    EET/(2×SRT) < Q/R
EET and SRT are relatively constant times that can be computed for a given processor environment. Thus the effectiveness of the selective algorithm can be compared to the unconditional algorithm by measuring the fraction of correct predictions, Q, and incorrect predictions, R, that the VM will not use the FPU, and comparing the ratio Q/R to the precomputed constant EET/(2×SRT).
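For example, the comparison can be evaluated directly from measured counts. The helper below is an illustrative sketch (not from the patent); it cross-multiplies the inequality to stay in integer arithmetic, with Q and R represented by raw counts of correct and incorrect "VM will not use the FPU" predictions taken over the same observation window.

    #include <stdbool.h>
    #include <stdint.h>

    /* True if EET/(2*SRT) < Q/R, i.e. the selective algorithm is expected to
     * have lower overhead than the unconditional algorithm.                  */
    bool selective_beats_unconditional(uint64_t eet_cycles, uint64_t srt_cycles,
                                       uint64_t correct_no_fpu,   /* Q count */
                                       uint64_t incorrect_no_fpu) /* R count */
    {
        if (incorrect_no_fpu == 0)
            return true;    /* no mispredictions: no exception cost at all */
        return eet_cycles * incorrect_no_fpu < 2 * srt_cycles * correct_no_fpu;
    }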
FIGS. 6-9 show another exemplary selective algorithm for virtualization of the FPU. This selective algorithm puts the processor in the state where the instructions that use or affect the FPU state are privileged before entering the VM context. The processor state is virtualized, for example by providing shadow copies of the processor state flags such as CR0.TS and CR0.MP that reflect the processor state as perceived by the VM 100. The processor is configured so that the FPU instructions are privileged 102. Control is transferred to the VM context 104 where use of the FPU or of the processor state flags will result in a processor exception that exits to the VMM.
FIG. 7 is a flowchart for the processing of an FPU exception by the VMM. An attempted execution of a privileged instruction to use the FPU by the VM creates an exception that causes an exit to the VMM 110. The VMM's FPU state is saved 112 and the VM's FPU state is restored 114. Since it is now safe for the VM to use the FPU, the processor is configured so that the FPU instructions are unprivileged 116. This means there will be at most one FPU exception for a transfer of control to a VM context. The FPU exception may also be reflected to the VM context 118 for possible use by programs executing in the VM context, such as an operating system. Control is then returned to the VM context 120. The time overhead cost of the FPU exception processing is the EET plus the SRT. A second SRT is incurred upon exiting the VM context as discussed below.
FIG. 8 is a flowchart for the VMM processing of an attempt by the VM to change a processor state flag. An attempt to change a state flag by the VM creates an exception that causes an exit to the VMM 130. The VMM updates the virtualized state flags, such as by setting values of shadow state flags, according to the changes attempted by the VM 132. Control is then returned to the VM context 134. The time overhead cost of the state change exception processing is the EET. The VM may make any number of changes to the state flags during a single transfer of control to a VM context.
FIG. 9 is a flowchart for the VMM processing when the VM context is exited 140. If the VM is using the FPU 142-YES, meaning that the processing of FIG. 7 occurred during the transfer of control to the VM context, then the VM's FPU state is saved 144 and the VMM's FPU state is restored 146 before VMM processing continues 148. This incurs the time overhead cost of a second SRT as mentioned above.
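Strung together, FIGS. 6-9 amount to an entry step, two exception handlers, and an exit step. The C sketch below is one possible rendering of those four steps; the shadow-flag field, the set_guest_fpu_privileged knob, and reflect_fpu_exception are hypothetical stand-ins for a particular VMM's mechanisms.

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct { uint8_t bytes[512]; } __attribute__((aligned(16))) fpu_state_t;

    struct vm {
        fpu_state_t fpu;          /* guest's saved FPU state                  */
        uint64_t    shadow_cr0;   /* CR0.TS/CR0.MP as perceived by the guest  */
        bool        owns_fpu;     /* set once the deferred swap has been done */
    };

    extern fpu_state_t vmm_fpu;
    extern void fpu_save(fpu_state_t *s);
    extern void fpu_restore(const fpu_state_t *s);
    extern void set_guest_fpu_privileged(struct vm *vm, bool on); /* hypothetical */
    extern void reflect_fpu_exception(struct vm *vm);             /* FIG. 7, 118  */

    /* FIG. 6: virtualize the state flags (100) and make every FPU
     * instruction trap (102) before transferring control (104).            */
    void before_vmenter(struct vm *vm)
    {
        vm->owns_fpu = false;
        set_guest_fpu_privileged(vm, true);
    }

    /* FIG. 7: the first guest FPU use exits to the VMM (110); swap state
     * once (112, 114), then make FPU instructions unprivileged (116) so at
     * most one FPU exception occurs per transfer of control.               */
    void on_fpu_exception(struct vm *vm)
    {
        fpu_save(&vmm_fpu);
        fpu_restore(&vm->fpu);
        set_guest_fpu_privileged(vm, false);
        vm->owns_fpu = true;
        reflect_fpu_exception(vm);    /* 118: optionally visible to the guest OS */
    }

    /* FIG. 8: a guest write to the state flags (130) only updates the
     * shadow copy (132); the cost is one EET per attempt.                  */
    void on_state_flag_write(struct vm *vm, uint64_t new_cr0)
    {
        vm->shadow_cr0 = new_cr0;
    }

    /* FIG. 9: on exit (140), undo the swap only if it happened (142-146). */
    void after_vmexit(struct vm *vm)
    {
        if (vm->owns_fpu) {
            fpu_save(&vm->fpu);
            fpu_restore(&vmm_fpu);
        }
    }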
There are four possible scenarios with regard to the time overhead cost of this selective algorithm: if the VM neither changes the state flags nor uses the FPU, the cost is zero; if the VM changes the state flags but does not use the FPU, the cost is one EET for each state-change exception; if the VM uses the FPU without changing the state flags, the cost is 2×SRT plus one EET for the single FPU exception; and if the VM both changes the state flags and uses the FPU, the cost is 2×SRT plus one EET for each exception raised.
If S is the fraction of VM executions that use the FPU and T is the average number of exceptions created by changes to the processor state flags or by use of the FPU, the total overhead of this selective algorithm is
    S×2×SRT + T×EET
S×2×SRT represents the expected value for SRT overhead and T×EET represents the expected value for EET overhead. This selective algorithm has a lower cost in overhead time than the unconditional algorithm if
    S×2×SRT + T×EET < 2×SRT
This reduces to
    T×EET < 2×SRT×(1−S)
which reduces further to
    EET/(2×SRT) < (1−S)/T
As with the previously discussed selective algorithm, EET and SRT are relatively constant times that can be computed for a given processor environment. Thus the effectiveness of this selective algorithm can be compared to the unconditional algorithm by measuring the fraction of executions where the VM does not use the FPU, (1−S), and the average number of exceptions raised, T, and comparing the ratio (1−S)/T to the precomputed constant EET/(2×SRT).
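The same style of check as before can be written for this algorithm; the sketch below is again illustrative only, with S and T represented by counts taken over one observation window (fpu_runs out of total_runs VM executions used the FPU, and exception_count exceptions were raised in total).

    #include <stdbool.h>
    #include <stdint.h>

    /* True if EET/(2*SRT) < (1-S)/T, i.e. this selective algorithm is
     * expected to have lower overhead than the unconditional algorithm.   */
    bool second_selective_beats_unconditional(uint64_t eet_cycles, uint64_t srt_cycles,
                                              uint64_t total_runs, uint64_t fpu_runs,
                                              uint64_t exception_count)
    {
        if (exception_count == 0)
            return true;                       /* no traps, no exception cost */
        return eet_cycles * exception_count <
               2 * srt_cycles * (total_runs - fpu_runs);
    }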
Other exemplary selective algorithms may not assume that the VMM always uses the FPU and may instead track the present owner of the FPU state to further reduce the overhead of the FPU virtualization. Such selective algorithms may maintain a value that indicates the present owner of the FPU state, since the currently running thread may not be the owner. Such selective algorithms are able to simply enter a virtual machine if the value indicates that the virtual machine being entered owns the FPU state, with a possible savings of 2×SRT. This will of course affect the expression for the overhead cost of the selective virtualization algorithm that may be used to select the virtualization algorithm, such as by comparing a metric derived from the expression for the overhead cost to the precomputed constant EET/(2×SRT).
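A minimal sketch of the owner-tracking idea, using a hypothetical fpu_owner tag where −1 denotes the VMM, might be:

    #include <stdint.h>

    typedef struct { uint8_t bytes[512]; } __attribute__((aligned(16))) fpu_state_t;

    #define FPU_OWNER_VMM (-1)

    static int fpu_owner = FPU_OWNER_VMM;   /* whose state is live in the FPU */

    extern fpu_state_t vmm_fpu, guest_fpu[];
    extern void fpu_save(fpu_state_t *s);
    extern void fpu_restore(const fpu_state_t *s);

    /* Called when a context actually needs the FPU: swap only if some other
     * context owns the live state, otherwise the 2*SRT cost is avoided.     */
    void claim_fpu(int new_owner)
    {
        if (fpu_owner == new_owner)
            return;                                    /* already the owner */
        fpu_save(fpu_owner == FPU_OWNER_VMM ? &vmm_fpu : &guest_fpu[fpu_owner]);
        fpu_restore(new_owner == FPU_OWNER_VMM ? &vmm_fpu : &guest_fpu[new_owner]);
        fpu_owner = new_owner;
    }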
It will be appreciated that no one virtualization algorithm will be lowest in cost at all times. Changes in the work load may make different virtualization algorithms more efficient at different times. FIG. 10 is a flowchart for an adaptive algorithm that may be used to periodically select the virtualization algorithm. An exemplary adaptive algorithm to select either the unconditional virtualization algorithm or a selective algorithm that selectively saves and restores the machine state runs the selective algorithm for a period of time while accumulating statistics on its predictions, compares the measured ratio (such as Q/R or (1−S)/T) with the precomputed constant EET/(2×SRT), and selects whichever algorithm has the lower expected overhead.
The calculating of EET and SRT may be performed only once and the results saved 150 as these values are essentially constant for a given processor configuration. The selected virtualization algorithm is used for a period of time 160 and then the adaptive algorithm is again used to select the virtualization algorithm so that the virtualization algorithm in use may change from time to time as the workload changes. The length of time for accumulating statistics on the selective algorithm and the interval between successive accumulations and possible changes in virtualization algorithms may be responsive to the workload of the processor.
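One way such an adaptive loop could be sketched in C is shown below; measure_eet_cycles, measure_srt_cycles, run_selective_window, and run_with_algorithm are hypothetical helpers, and the decision uses the Q/R comparison derived above.

    #include <stdbool.h>
    #include <stdint.h>

    enum algo { ALGO_UNCONDITIONAL, ALGO_SELECTIVE };

    struct window_stats {            /* accumulated over one observation window      */
        uint64_t correct_no_fpu;     /* correct "will not use FPU" predictions (Q)   */
        uint64_t incorrect_no_fpu;   /* incorrect "will not use FPU" predictions (R) */
    };

    extern uint64_t measure_eet_cycles(void);               /* hypothetical       */
    extern uint64_t measure_srt_cycles(void);                /* hypothetical       */
    extern struct window_stats run_selective_window(void);   /* gather statistics  */
    extern void run_with_algorithm(enum algo a);             /* one interval (160) */

    void adaptive_select_loop(void)
    {
        /* 150: EET and SRT are essentially constant, so compute them once. */
        uint64_t eet = measure_eet_cycles();
        uint64_t srt = measure_srt_cycles();

        for (;;) {
            struct window_stats s = run_selective_window();
            /* Selective wins when EET/(2*SRT) < Q/R, cross-multiplied here. */
            bool selective = (s.incorrect_no_fpu == 0) ||
                             (eet * s.incorrect_no_fpu < 2 * srt * s.correct_no_fpu);
            run_with_algorithm(selective ? ALGO_SELECTIVE : ALGO_UNCONDITIONAL);
        }
    }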
The adaptive algorithm may be extended to a selection from more than two candidate algorithms. Each of the selective algorithms that selectively saves and restores the machine state when there is a change of context may be executed to allow statistics to be accumulated as to the overhead time cost of the selective algorithm under the processing workload at that time. Costs are computed for each of the selective algorithms and the lowest cost algorithm from amongst all candidate algorithms is selected.
Source: http://www.freepatentsonline.com/7500244.html
Original post: http://www.cnblogs.com/coryxie/p/3798374.html