02/09/2010 Industrial Project Course (234313) Virtualization-aware database engine Final Presentation Students: Filimonov Dennis, Maor Dahan Supervisor: Abel Gordon
Page 1: 02/09/2010 Industrial Project Course (234313) Virtualization-aware database engine Final Presentation Industrial Project Course (234313) Virtualization-aware.

02/09/2010

Industrial Project Course (234313)

Virtualization-aware database engine

Final Presentation

• Students: Filimonov Dennis, Maor Dahan
• Supervisor: Abel Gordon


Introduction

What is machine virtualization?

It is the ability to run multiple operating systems simultaneously on the same physical machine.

Virtualization can provide server consolidation, a legacy environment on new platforms, and simplified management.

Systems running on virtual machines suffer performance penalties.


Goals

• Analyze how virtualization technologies affect application performance and present alternative methods for reducing virtualization overhead.
• Analyze and measure the performance of the MySQL open source DB engine, used as a test subject, running on a virtual machine hypervisor.
• Identify the critical virtualization overhead.
• Prototype an approach to reduce virtualization overhead by making the application aware of the virtualized environment.


Methodology

KVM: Kernel-based Virtual Machine

• It is an open source full virtualization solution for Linux on x86 hardware containing virtualization extensions.

Virtio: an I/O virtualization framework for Linux

• We developed Virtio-SQL, which enables communication between the guest machine and the host machine.
• We integrated a char device in the Virtio-SQL frontend driver to enable user space communication with the MySQL server running in the guest machine.
• The Virtio-SQL backend driver we developed sits in the hypervisor.


Methodology (Cont.)

MySQL

• MySQL's pluggable storage engine architecture gives users the flexibility to choose from a variety of purpose-built storage engines that are optimized for specific application demands.
• We isolated an engine from the whole MySQL stack so that it can run in the host machine.
• The isolated engine is compiled into a dynamic library, which the Virtio-SQL backend opens in order to call its functions.

Design

[Diagram: components of the new approach]

Guest machine: MySQL server; storage engine frontend; kernel module Virtio-SQL frontend (Virtio frontend, char device).

Host machine: QEMU-KVM virtualizer; Virtio backend; Virtio-SQL backend; storage engine backend run-time library.


Design (Cont.)

• The frontend storage engine receives SQL function calls for execution from the remote user, then forwards them to the char device integrated in the Virtio-SQL frontend driver. The char device provides user space communication between the server and the Virtio-SQL frontend, which is part of the kernel.
• The Virtio-SQL frontend driver communicates with the Virtio-SQL backend driver located in the KVM hypervisor, which receives the function call and delivers it to the storage engine backend for execution.
• The query result is propagated back the same way.


Measurements

• The /proc file system is a virtual file system that provides communication between the Linux kernel and user space. We used it to retrieve information measured and recorded in the kernel.
• We modified the KVM kernel module and injected code that measures and records the cycles spent in guest and root mode for each VM exit; the data is then retrieved through the /proc file system.
• We wrote scripts to automate measurements.
• We wrote code that generates random SQL insert queries.
• We used SSH to activate remote scripts on the guest machine from the host machine.
• We wrote data analysis and extraction scripts.
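A generator of random SQL insert queries, as mentioned above, might look roughly like the following. This is a minimal sketch and not the project's actual script: the table name `bench`, the column name `data`, and the function names are hypothetical, and the record size is taken to be the length of the inserted payload in characters.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits  # no quotes, so no escaping needed


def random_insert(table: str, record_size: int, rng: random.Random) -> str:
    """Build one INSERT statement whose payload is record_size random characters."""
    payload = "".join(rng.choices(ALPHABET, k=record_size))
    # 'data' is a hypothetical column name used only for illustration.
    return f"INSERT INTO {table} (data) VALUES ('{payload}');"


def generate_inserts(table: str, record_size: int, count: int, seed: int = 0):
    """Yield count random INSERT statements; the seed makes a run repeatable."""
    rng = random.Random(seed)
    for _ in range(count):
        yield random_insert(table, record_size, rng)


if __name__ == "__main__":
    # Three 1k-record queries against the hypothetical table 'bench'.
    for query in generate_inserts("bench", 1024, 3):
        print(query[:60] + "...")
```

Seeding the generator is a design choice that lets the same query stream be replayed against each configuration.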


Benchmarking

Test specification

We compared three test configurations:
1. A default engine (MyISAM) running on a virtual machine.
2. New approach engine running on a virtual machine.
3. New approach engine running on a virtual machine while using batches of 8 queries at a time.

• We ran 7 tests on each of the configurations: insert queries with record sizes of 1k, 2k, 4k, 8k, 16k, 32k and 64k.
• Each test on a system ran 10 times and an average was calculated.
• The number of queries ranged from 500,000 for the 1k test to 16,000 for the 64k test.
• Each test ran for one and a half minutes.
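The third configuration groups queries into batches of 8. A minimal sketch of such grouping is shown below; the function name `batches` is hypothetical, and how a batch is actually handed to the Virtio-SQL frontend is not shown because the slides do not describe it.

```python
from itertools import islice


def batches(queries, size=8):
    """Yield consecutive lists of at most `size` queries from any iterable."""
    iterator = iter(queries)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # 20 placeholder queries become batches of 8, 8 and 4.
    placeholder_queries = [f"query {i}" for i in range(20)]
    for batch in batches(placeholder_queries):
        print(len(batch), batch[0], "...", batch[-1])
```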


Results

• Tests with insert queries larger than 2k are more than 20% faster, and up to 32% faster for 64k insert queries!

• For smaller insert queries, batching of inserts can increase performance, but the new approach is still slower than the regular run.

• For bigger insert queries, batching doesn't show significant improvement.
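The percentages quoted here can be read as a relative time improvement over the MyISAM run on the virtual machine. The sketch below assumes the improvement is computed as (baseline − new) / baseline on the averages of the repeated runs; the slides do not state the exact formula, and the function name and sample timings are made up for illustration.

```python
from statistics import mean


def time_improvement(baseline_runs, candidate_runs):
    """Percent improvement of the candidate's average run time over the baseline's.

    Positive means the candidate is faster; negative means it is slower.
    """
    baseline = mean(baseline_runs)
    candidate = mean(candidate_runs)
    return 100.0 * (baseline - candidate) / baseline


if __name__ == "__main__":
    # Hypothetical timings in seconds for 10 repetitions of one test.
    myisam = [90.0] * 10
    new_approach = [72.0] * 10
    print(f"{time_improvement(myisam, new_approach):.1f}%")
```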

[Chart: Average time improvement of run. X axis: test number (insert record size); Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
New approach no batching: -30%, -13%, 21%, 19%, 27%, 29%, 27%
New approach with batching: -12%, -3%, 25%, 18%, 28%, 27%, 32%
MyISAM no virtualization: 50%, 51%, 61%, 58%, 61%, 64%, 55%
MyISAM virtio-Block: 8.43%, 11.27%, 27.50%, 28.13%, 33.98%, 24.11%, 16.32%


So where does the difference come from?

Reminder: exit reasons are the reasons the hypervisor had to exit from the guest and switch to the host machine.

• EXCEPTION_NMI: an exception or non-maskable interrupt occurred. Most of them are caused by page faults handled by the host to maintain the shadow page tables (the machine does not have EPT support).
• EXTERNAL_INTERRUPT: an interrupt was raised by the real hardware while the guest was running.
• PENDING_INTERRUPT: if the guest disables interrupts, the host cannot inject any interrupt. In this case, the host can write a value to the VMCS telling the processor to exit when the guest enables interrupts. Thus, right after the guest enables interrupts, the host can inject them.
• CR_ACCESS: the guest read or wrote a control register in the processor.
• IO_INSTRUCTION: the guest executed an I/O instruction.
• APIC_ACCESS: advanced programmable interrupt controller access. Most of them are caused when the guest accesses the APIC page, probably to acknowledge interrupts.
• HALT: the processor is idling.
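Given per-exit-reason cycle counts collected in root mode, the share of host time attributable to each exit reason can be derived as in the following sketch. The function name and the sample numbers are hypothetical; only the exit reason names come from the list above.

```python
def host_time_share(host_cycles_by_exit):
    """Map each exit reason to its percentage of the total host (root mode) cycles."""
    total = sum(host_cycles_by_exit.values())
    if total == 0:
        return {reason: 0.0 for reason in host_cycles_by_exit}
    return {
        reason: 100.0 * cycles / total
        for reason, cycles in host_cycles_by_exit.items()
    }


if __name__ == "__main__":
    # Made-up cycle counts, for illustration only.
    sample = {
        "HALT": 750,
        "IO_INSTRUCTION": 200,
        "APIC_ACCESS": 50,
    }
    for reason, share in host_time_share(sample).items():
        print(f"{reason}: {share:.2f}%")
```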

[Chart: Total time distribution between guest and host machine. X axis: test (1k to 64k, each for MyISAM, no batching and batching); Y axis: clock cycles (0 to 500,000,000,000); series: Guest machine, Host machine.]

[Chart: clock cycles per test (1k to 64k, each for MyISAM, no batching and batching). Y axis: clock cycles (0 to 500,000,000,000); series: Host machine, Guest machine.]

[Chart: Proportion of time spent in the host machine because of HALT exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 31.35%, 50.93%, 77.65%, 81.01%, 89.72%, 91.10%, 92.72%
New approach no batching: 18.16%, 27.05%, 42.75%, 63.94%, 66.69%, 66.47%, 75.50%
New approach with batching: 26.34%, 35.16%, 49.18%, 70.16%, 63.04%, 67.20%, 77.70%

[Chart: Percentage decrease in number of HALT exits in comparison to the number in the MyISAM run. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
New approach no batching: -53.41%, -37.79%, -35.44%, -34.98%, -30.63%, -28.44%, -11.63%
New approach with batching: -20.34%, -8.41%, -6.99%, -3.72%, -2.57%, -1.03%, 11.11%

[Chart: Percentage decrease in number of I/O instruction exits in comparison to the number in the MyISAM run. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
New approach no batching: -545.92%, -261.88%, -121.33%, -50.19%, -14.23%, 4.70%, 12.04%
New approach with batching: -53.55%, -20.57%, 2.19%, 8.58%, 16.31%, 19.42%, 21.55%

[Chart: Proportion of time spent in the host machine because of I/O exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 3.04%, 3.16%, 2.66%, 3.34%, 2.68%, 2.99%, 3.04%
New approach no batching: 29.99%, 30.16%, 28.01%, 19.41%, 21.09%, 23.30%, 19.33%
New approach with batching: 14.82%, 17.99%, 19.93%, 14.14%, 21.77%, 20.61%, 12.21%

[Chart: Proportion of time spent in the host machine because of APIC access exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 57.86%, 40.22%, 16.75%, 12.13%, 5.46%, 3.28%, 1.99%
New approach no batching: 50.07%, 41.06%, 27.05%, 14.50%, 9.82%, 6.57%, 2.76%
New approach with batching: 56.84%, 44.48%, 28.03%, 13.26%, 11.65%, 6.65%, 4.14%

[Chart: Percentage decrease in number of APIC access exits in comparison to the number in the MyISAM run. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
New approach no batching: -42.82%, -37.49%, -31.34%, -27.27%, -25.90%, -23.41%, -17.77%
New approach with batching: -19.23%, -11.77%, -6.21%, -4.96%, -4.96%, -5.03%, -1.44%


And what about the other exits?

[Chart: Proportion of time spent in the host machine because of pending interrupt exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 3.64%, 2.57%, 1.68%, 2.18%, 1.23%, 1.82%, 1.69%
New approach no batching: 0.98%, 0.90%, 1.44%, 1.55%, 1.63%, 2.65%, 1.90%
New approach with batching: 1.16%, 1.29%, 1.75%, 1.77%, 2.50%, 3.45%, 4.05%

[Chart: Proportion of time spent in the host machine because of external interrupt exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 0.03%, 0.02%, 0.01%, 0.02%, 0.01%, 0.01%, 0.00%
New approach no batching: 0.03%, 0.03%, 0.02%, 0.03%, 0.02%, 0.02%, 0.01%
New approach with batching: 0.02%, 0.03%, 0.02%, 0.02%, 0.02%, 0.02%, 0.01%

[Chart: Proportion of time spent in the host machine because of CR_ACCESS exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 0.19%, 0.28%, 0.20%, 0.16%, 0.16%, 0.15%, 0.15%
New approach no batching: 0.10%, 0.14%, 0.11%, 0.08%, 0.11%, 0.15%, 0.08%
New approach with batching: 0.15%, 0.19%, 0.19%, 0.12%, 0.24%, 0.27%, 0.18%

[Chart: Proportion of time spent in the host machine because of EXCEPTION_NMI exit in the guest machine. X axis: test; Y axis: percent.]

Record size: 1k, 2k, 4k, 8k, 16k, 32k, 64k
VM MyISAM run: 3.89%, 2.82%, 1.04%, 1.16%, 0.75%, 0.67%, 0.41%
New approach no batching: 0.66%, 0.66%, 0.62%, 0.49%, 0.64%, 0.84%, 0.42%
New approach with batching: 0.67%, 0.88%, 0.90%, 0.52%, 0.78%, 1.19%, 0.66%


So what did we learn?

• There is a strong relation between the exit distribution and the insert record length.
• Some exits are very dominant while others are negligible.
• Writing to the real disk is more efficient than writing to the virtual disk; therefore the HALT overhead is reduced.
• The total time spent in the host machine is smaller for the new approach, while the time spent in the guest machine remains similar; therefore we reduce the virtualization overhead.
• As a result of transferring a small but significant part of the code to the host machine, we gained a significant improvement.
• We observe that, for the new approach, small insert queries degrade performance.


Conclusions

• Isolating a MySQL storage engine to work independently from the MySQL stack is extremely difficult because of its dependence on global variables and its usage of external functions.
• Virtualization overhead can be reduced by changes at the application layer.
• Measuring virtualization overhead is long and tedious, so scripts should be used to automate measurements.
• Analyzing virtualization overhead is very complex because many variables, and the relations between them, need to be considered.
• Virtio alternatives should be considered to improve performance further.


Deliverables

Documentation
• User's Manual
• Developer's Manual
• Project Internet site

Code
• Virtio-SQL driver code.
• KVM code changes.
• Measurement scripts.
• New approach MySQL engine (supports only INSERT queries).

Thank you

