<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Linuxvirtualization | Linux</title>
	<atom:link href="http://linux.sjolshagen.net/tag/virtualization/feed/" rel="self" type="application/rss+xml" />
	<link>http://linux.sjolshagen.net</link>
	<description>Linux for Businesses</description>
	<lastBuildDate>Wed, 01 Feb 2012 17:33:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Linux: Configure &#8220;bridge at boot&#8221; for NIC(s) in Fedora 13</title>
		<link>http://linux.sjolshagen.net/2010/07/28/linux-configure-bridge-at-boot-for-nics-in-fedora-13/</link>
		<comments>http://linux.sjolshagen.net/2010/07/28/linux-configure-bridge-at-boot-for-nics-in-fedora-13/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 14:31:51 +0000</pubDate>
		<dc:creator>Thomas S</dc:creator>
				<category><![CDATA[EqualLogic]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Mission Critical Computing]]></category>
		<category><![CDATA[Virtualization]]></category>
		<category><![CDATA[bridge]]></category>
		<category><![CDATA[Fedora 13]]></category>
		<category><![CDATA[ipv4]]></category>
		<category><![CDATA[iscsi]]></category>
		<category><![CDATA[KVM]]></category>
		<category><![CDATA[libvirt]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[network tuning]]></category>
		<category><![CDATA[NetworkManager]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[virsh]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://linux.sjolshagen.net/?p=385</guid>
		<description><![CDATA[Sometimes, for instance when having a limited number of Network Interface Cards (NICs) on a system that will be used for a Linux hosted platform virtualization solution (and you&#8217;re running Fedora 13), the easiest approach to giving each of the guests &#8220;direct&#8221; access to a network is to configure the physical devices as bridges on...]]></description>
			<content:encoded><![CDATA[<p>Sometimes, for instance when a system that will host a Linux-based <a href="http://en.wikipedia.org/wiki/Hardware_virtualization#Concept">platform virtualization</a> solution has only a limited number of Network Interface Cards (NICs) (and you&#8217;re running Fedora 13), the easiest way to give each of the guests &#8220;direct&#8221; access to a network is to configure the physical devices as <a href="http://gd.tuwien.ac.at/linuxcommand.org/man_pages/brctl8.html">bridge</a>s on the host.</p>
<p>This lets the <a href="http://www.libvirt.org/">libvirt virtualization (management) abstraction interface</a> easily build &#8220;bridges of bridges&#8221; that, in turn, let a <a href="http://www.linux-kvm.com/">Kernel-based Virtual Machine (KVM)</a> guest get its own &#8220;public&#8221; IP address and route its traffic directly onto the ether (via the lower levels of the IP stack of the host environment). &#8220;Public&#8221; is in quotes only because I happen to think the average bear would not be so silly as to put their Linux/KVM host directly onto the internet. Right???</p>
<p>There are, as is the case with all things Linux or UNIX, a couple of ways to skin this particular bear (sorry, that&#8217;s bad!), but the one that makes the most sense to me is to have <span style="font-family: courier new,courier;">init</span> take care of the configuration as part of the system boot process (when the <span style="font-family: courier new,courier;">network</span> service executes). In its simplest form, doing that requires access to a terminal window and a text editor on the Fedora host, but it is actually very simple once you know what you&#8217;re doing. Hopefully, the following will help you learn (if you don&#8217;t already know and are only reading this because you&#8217;re looking around and are a very bored individual).<span id="more-385"></span></p>
<p>At a high level, all you have to do is edit the <span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/ifcfg-ethX</span> file for the Ethernet (eth) device you want to bridge, specify that the device will belong to a bridge named &lt;bridgename&gt;, and then create a <span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/ifcfg-&lt;bridgename&gt;</span> file that configures the bridge (telling it how to obtain an IP address, probably statically, along with any routing and DNS configuration information).</p>
<p>Prior to configuring the system to use bridging, I had configured static IP addresses on the two physical interfaces I would be creating bridges for. Since that configuration was being retired in order to use the same interfaces for guest-to-iSCSI-SAN traffic, I edited the existing configuration files and created the new ones mentioned above. All of the <strong>edited items are in bold</strong>. Before you edit and create these files, life gets a little easier if you:</p>
<ol>
<li>Back up the original <span style="font-family: courier new,courier;">ifcfg-eth[N]</span> files to some other location than <span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/</span></li>
<li>Take the interface down: <span style="font-family: courier new,courier;"># ifdown eth[N]</span></li>
</ol>
<p>Then, as an example, here are the two to four files you need (ifcfg-eth[N], ifcfg-eth[N+1], ifcfg-bridge[N] and ifcfg-bridge[N+1]):</p>
<p><span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/ifcfg-eth2:</span></p>
<pre># Intel Corporation 82571EB Gigabit Ethernet Controller
DEVICE=eth2
BOOTPROTO=none
HWADDR=00:15:17:6C:97:94
<strong>#IPADDR=X.Y.Z.231
#NETMASK=255.255.255.0
#PREFIX=24
#DEFROUTE=yes</strong>
ONBOOT=yes
TYPE=Ethernet
<strong>#DNS1=[DNS-SERVER1]
#DNS2=[DNS-SERVER2]</strong>
IPV6INIT=no
USERCTL=no
IPV4_FAILURE_FATAL=yes
NAME="System eth2"
<strong>BRIDGE=iscsi-bridge0</strong>
<strong>MTU=9000</strong></pre>
<p><span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/ifcfg-eth3:</span></p>
<pre># Intel Corporation 82571EB Gigabit Ethernet Controller
DEVICE=eth3
<strong>#IPADDR=X.Y.Z.232
#NETMASK=255.255.255.0
#PREFIX=24
#DEFROUTE=yes
#IPV4_FAILURE_FATAL=yes</strong>
HWADDR=00:15:17:6C:97:95
ONBOOT=yes
BOOTPROTO=none
TYPE=Ethernet
<strong>BRIDGE=iscsi-bridge1</strong>
IPV6INIT=no
USERCTL=no
NAME="System eth3"
<strong>MTU=9000</strong></pre>
<p><span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/ifcfg-iscsi-bridge0:</span></p>
<pre><strong>DEVICE=iscsi-bridge0
ONBOOT=yes
TYPE=Bridge
IPADDR=X.Y.Z.231
NETMASK=255.255.255.0
STP=off
MTU=9000
DELAY=0</strong></pre>
<p><span style="font-family: courier new,courier;">/etc/sysconfig/network-scripts/ifcfg-iscsi-bridge1:</span></p>
<pre><strong>DEVICE=iscsi-bridge1
ONBOOT=yes
TYPE=Bridge
IPADDR=X.Y.Z.232
NETMASK=255.255.255.0
MTU=9000
STP=off
DELAY=0</strong></pre>
<p>Although there&#8217;s a perfectly valid way of achieving the same thing through the <span style="font-family: courier new,courier;">virsh/libvirtd</span> management interface to <span style="font-family: courier new,courier;">libvirt</span>, as well as with the <a href="http://projects.gnome.org/NetworkManager/">NetworkManager</a> tools, my preference is to make this configuration &#8220;stick&#8221; using the old <span style="font-family: courier new,courier;">network</span> init service. The problems I see with the <span style="font-family: courier new,courier;">NetworkManager</span>/<span style="font-family: courier new,courier;">libvirtd</span> approach are twofold:</p>
<ul>
<li>Timing of <span style="font-family: courier new,courier;">NetworkManager</span> start-up (not all that early) relative to the Open-iSCSI stack startup (early)</li>
<li>Timing of <span style="font-family: courier new,courier;">libvirtd</span> start-up (one of the last services to get called) relative to other iSCSI volumes needing to be available for the host environment</li>
</ul>
<p>So, for this example, disable <span style="font-family: courier new,courier;">NetworkManager</span> as a boot service and enable the <span style="font-family: courier new,courier;">network</span> service:</p>
<pre># service NetworkManager stop
# chkconfig NetworkManager off
# chkconfig network on
# ifup iscsi-bridge0
# ifup iscsi-bridge1</pre>
<p>And, Bob&#8217;s yer uncle (or, at least, he should be!). To verify that everything is working properly, ping an IP target that should be reachable from the bridge device interface(s) only:</p>
<pre># ping -I iscsi-bridge0 [TARGET-IP]</pre>
<pre># ping -I iscsi-bridge1 [TARGET-IP]</pre>
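<p>As an extra sanity check, and only as a sketch that assumes the <span style="font-family: courier new,courier;">bridge-utils</span> package (which provides <span style="font-family: courier new,courier;">brctl</span>) is installed on the host, you can also confirm that each NIC ended up attached to its bridge and that the MTU setting took effect:</p>
<pre>## List the bridges and the interfaces attached to each
# brctl show
## Inspect the link state and MTU of a bridge and of its physical NIC
# ip link show iscsi-bridge0
# ip link show eth2</pre>
<p>If things went according to plan, <span style="font-family: courier new,courier;">eth2</span> should be listed as an interface of <span style="font-family: courier new,courier;">iscsi-bridge0</span> (and <span style="font-family: courier new,courier;">eth3</span> of <span style="font-family: courier new,courier;">iscsi-bridge1</span>), and both the NICs and the bridges should report the MTU from the ifcfg files above.</p>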
<p>By the way: if you use bridged interfaces, have no iSCSI volumes on the host (they are for guests only, in other words) and have iptables enabled on the host (which you should), make sure to configure your host iptables to leave the bridged interfaces alone. For details, see the &#8211; soon to be created, I promise &#8211; post about performance tuning the Linux IPv4 environment for iSCSI initiators on this site. Alternatively, you can grab the information pertaining to the relevant bridge sysctl.conf entries from the <a href="https://inquiries.redhat.com/go/redhat/rhel-hp-proliant">KVM scalability white paper</a> that Red Hat published (and for which I provided most of the content in my previous career).</p>
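<p>For the impatient: the entries usually cited for this purpose are the bridge netfilter switches shown below. Treat this as a sketch based on the standard kernel bridge sysctls, not as a quote from the white paper, and check that it matches your own firewall policy before applying it. In <span style="font-family: courier new,courier;">/etc/sysctl.conf</span>:</p>
<pre># Do not pass bridged traffic through the host's netfilter tables
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0</pre>
<p>The settings can be loaded without a reboot by running <span style="font-family: courier new,courier;"># sysctl -p</span>. Note that these keys only exist once the <span style="font-family: courier new,courier;">bridge</span> kernel module has been loaded.</p>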
]]></content:encoded>
			<wfw:commentRss>http://linux.sjolshagen.net/2010/07/28/linux-configure-bridge-at-boot-for-nics-in-fedora-13/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Can PeeCees (PC) ever be used for mission critical workloads?</title>
		<link>http://linux.sjolshagen.net/2010/04/22/can-peecees-pc-ever-be-used-for-mission-critical-workloads/</link>
		<comments>http://linux.sjolshagen.net/2010/04/22/can-peecees-pc-ever-be-used-for-mission-critical-workloads/#comments</comments>
		<pubDate>Thu, 22 Apr 2010 20:38:23 +0000</pubDate>
		<dc:creator>Thomas S</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Mission Critical Computing]]></category>
		<category><![CDATA[mission critical]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://linux.sjolshagen.net/?p=205</guid>
		<description><![CDATA[Heresy! I think PCs are ready to host truly mission-critical workloads today...]]></description>
			<content:encoded><![CDATA[<p>Obviously, I&#8217;m not referring to the &#8220;BestBuy&#8221; version of computers we all tend to have in our own homes. I&#8217;m referring to servers built on 64-bit x86 processors from either AMD or Intel. Forgive me, as I tend to get a little agitated when people attempt to define &#8220;Mission Critical&#8221; from the perspective of server hardware. To me, doing so is utter nonsense, since it&#8217;s really how the hardware is used that defines whether it&#8217;s part of a &#8220;Mission Critical&#8221; (or &#8220;Business Critical&#8221;) solution or not.</p>
<p><span id="more-205"></span></p>
<p>There are plenty of people out there using hardware that the vendors have not classified as &#8220;Mission Critical&#8221; for what amount to highly &#8220;mission critical&#8221; solutions for their businesses, and vice versa (people using servers that are classified as &#8220;mission-critical&#8221; for workloads that really aren&#8217;t). Examples of the former include mail servers, firewalls, small but critical databases, etc. There are also examples that are time dependent in nature. For instance, what would happen if a file and print service that the finance department uses to generate various time-sensitive reports went off-line during end-of-quarter or end-of-year processing&#8230;?</p>
<p>I know, it&#8217;s a contrived example, but it does actually demonstrate the point. It is how a given system is used within the business that defines its degree of criticality, not the server it runs on!</p>
<p>I&#8217;ll be the first to concede that hardware certainly can help ensure that a mission critical service/workload remains available to its end users (and can, sometimes, be very good at ensuring the service/workload is <em>not</em> available too!) but it&#8217;s by no means the only or most critical component in that equation!</p>
<p>What defines &#8220;availability&#8221;?</p>
<p>To some, the &#8220;number of nines&#8221; &#8211; i.e. &#8220;five nines&#8221; means 99.999% availability &#8211; is the gold standard. I tend to agree: the &#8220;number of nines&#8221; is not only one of the best-known metrics for availability, it&#8217;s also a very good one. It captures the essence of the problem: how much time did my service spend being available to its consumers versus the amount of time it was <strong>un</strong>available? And a typical &#8220;Mission Critical&#8221; environment tends to operate within the 4-5 &#8220;nines&#8221; range.</p>
<p>Since &#8220;four nines&#8221; represents about 52 minutes of downtime per year, and &#8220;five nines&#8221; about 5 minutes, most people tend to think only in terms of large, expensive, &#8220;enterprise class&#8221; servers when architecting for mission critical workloads. Why? Because, historically speaking, these were the only systems where it was technologically feasible to include the silicon and hardware design capabilities needed to guarantee four or five nines at a price point customers (you) were willing to accept.</p>
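<p>If you want to check the arithmetic behind those downtime figures, here is a small sketch. The function name <span style="font-family: courier new,courier;">downtime_minutes</span> is made up for this illustration, and it assumes a 365-day year:</p>
<pre># Yearly downtime budget, in minutes, for a given availability percentage.
# Assumes a 365-day year (525,600 minutes).
downtime_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.1f\n", 365 * 24 * 60 * (1 - pct / 100) }'
}

downtime_minutes 99.99     # four nines: prints 52.6
downtime_minutes 99.999    # five nines: prints 5.3</pre>
<p>Each additional nine cuts the yearly downtime budget by a factor of ten, which is why the jump from four to five nines has traditionally been so expensive.</p>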
<p>However, with the growing acceptance of virtualization technologies such as VMware, Microsoft Hyper-V, Citrix XenServer and the various open source virtualization technologies included in the major commercial Linux distributions, the picture of what a &#8220;mission-critical&#8221; server looks like has changed.</p>
<p>It&#8217;s now possible to quickly build &#8220;always on&#8221; system architectures based on industry-standard hardware and extremely flexible hypervisors and management software. And you can do it for what amounts to pennies on the dollar compared to the traditionally very expensive and often proprietary systems from the likes of Bull, Fujitsu, NEC, IBM, Sun (now Oracle), HP, etc.</p>
<p>In my view, the greatest paradigm shift in highly available mission-critical workload designs follows on the heels of <em>one</em> key enabling technology: live migration. Its advent and acceptance in production environments basically means that there are no technological limits to how flexible your application environment can be. If you were to pair &#8220;always-on&#8221; live migration with resource management, high availability services, data replication and a little bit of ingenuity, you could, with relative ease, have a fault tolerant solution for your application environment. Or, if your resource needs aren&#8217;t too great, you could leverage the VMware ESX 4.0 fault tolerance capability. Basically, the sky&#8217;s the limit!</p>
<p>So, to answer my first question: yes, PCs can be used for mission-critical workloads. The introduction of virtualization technologies to the x86 platform only makes your job of ensuring availability for those mission-critical workloads a whole lot easier. At least, that&#8217;s my opinion.</p>
]]></content:encoded>
			<wfw:commentRss>http://linux.sjolshagen.net/2010/04/22/can-peecees-pc-ever-be-used-for-mission-critical-workloads/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Scaling up your virtualization solution on 8-socket HP ProLiant Servers</title>
		<link>http://linux.sjolshagen.net/2010/03/01/scaling-up-your-virtualization-solution-on-8-socket-hp-proliant-servers/</link>
		<comments>http://linux.sjolshagen.net/2010/03/01/scaling-up-your-virtualization-solution-on-8-socket-hp-proliant-servers/#comments</comments>
		<pubDate>Mon, 01 Mar 2010 19:10:28 +0000</pubDate>
		<dc:creator>Thomas S</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Virtualization]]></category>
		<category><![CDATA[8-socket]]></category>
		<category><![CDATA[DL 785]]></category>
		<category><![CDATA[HP ProLiant]]></category>
		<category><![CDATA[KVM]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Red Hat]]></category>
		<category><![CDATA[RHEL 5.4]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://linux.sjolshagen.net/?p=145</guid>
		<description><![CDATA[Some of the things we've learned while testing the KVM based virtualization solution in RHEL 5.4 on an 8-socket HP ProLiant server.]]></description>
			<content:encoded><![CDATA[<p>These days, when wearing my “Linux planner” hat, and with virtualization being the “phrase that pays”, I’m often asked for guidance on how best to take advantage of the technology included in our 8-socket HP ProLiant server offerings for Linux-based virtualization solutions like Red Hat Enterprise Virtualization or SUSE Linux Enterprise Server with Xen. (There’s a plethora of information out there about VMware ESX/ESXi 3.5.x and vSphere 4.0, so I’m not going to talk about that this time around.)</p>
<p>The problem I’ve had, until recently, was providing actual – objective &#8211; data as a means to help illustrate my points.  For instance, I could not clearly illustrate how a snoop filter on the CPU interconnect can improve the linearity of the workload scalability in a virtualized environment (see Fig. 1).</p>
<div id="attachment_143" class="wp-caption aligncenter" style="width: 310px"><a href="http://linux.sjolshagen.net/wp-content/uploads/2010/03/Best-run-pinned-vs-unpinned.png"><img class="size-medium wp-image-143" title="Pinned and un-pinned tiles" src="http://linux.sjolshagen.net/wp-content/uploads/2010/03/Best-run-pinned-vs-unpinned-300x192.png" alt="" width="300" height="192" /></a><p class="wp-caption-text">Fig. 1: Average response time with pinned vs. un-pinned processors</p></div>
<p>I was unable to demonstrate the benefits of the NUMA-aware scheduler that the Linux kernel uses, and how it <em>does</em> improve performance when your workloads run with memory interleaving disabled (in Fig. 2, this shows up as the improvement in average response times from the web servers included in the workload); see Fig. 2 and 3. Unless, for support reasons, your application vendor explicitly tells you otherwise, of course!</p>
<div id="attachment_151" class="wp-caption aligncenter" style="width: 310px"><a rel="attachment wp-att-151" href="http://linux.sjolshagen.net/2010/03/scaling-up-your-virtualization-solution-on-8-socket-hp-proliant-servers/non-interleaved-memory-avg_response/"><img class="size-medium wp-image-151" title="Non-interleaved Memory Config" src="http://linux.sjolshagen.net/wp-content/uploads/2010/03/non-interleaved-memory-avg_response-300x192.png" alt="" width="300" height="192" /></a><p class="wp-caption-text">Fig. 2: Average Response Times - Non-interleaved Memory Config</p></div>
<div id="attachment_153" class="wp-caption aligncenter" style="width: 310px"><a href="http://linux.sjolshagen.net/wp-content/uploads/2010/03/Interleaved-memory-avg-response.png"><img class="size-medium wp-image-153" title="Average Response Times - Interleaved RAM" src="http://linux.sjolshagen.net/wp-content/uploads/2010/03/Interleaved-memory-avg-response-300x225.png" alt="" width="300" height="225" /></a><p class="wp-caption-text">Fig. 3: Average Response Times - Interleaved memory</p></div>
<p>I also used to have a hard time explaining how and why to tune the Linux kernel for these systems. For instance, I only suspected how little tuning of the host platform (none) is required in order to drive pretty significant numbers of guests (98) in these environments &#8211; see Fig. 4. Nor could I show how, if you engage in some very minor tuning of the network stack, those very same workload performance results can be extended even further (to 256 guests) – see Fig. 5:</p>
<div id="attachment_161" class="wp-caption aligncenter" style="width: 310px"><a href="http://linux.sjolshagen.net/wp-content/uploads/2010/03/forgot-to-tune-linearity-graph.png"><img class="size-medium wp-image-161" title="Default tuning for Host server" src="http://linux.sjolshagen.net/wp-content/uploads/2010/03/forgot-to-tune-linearity-graph-300x192.png" alt="" width="300" height="192" /></a><p class="wp-caption-text">Fig. 4: The system has not been tuned beyond its &quot;out of the box&quot; state.</p></div>
<div id="attachment_163" class="wp-caption aligncenter" style="width: 310px"><a href="http://linux.sjolshagen.net/wp-content/uploads/2010/03/tuned-slice.png"><img class="size-medium wp-image-163" title="Fully tuned and linear scalability" src="http://linux.sjolshagen.net/wp-content/uploads/2010/03/tuned-slice-300x192.png" alt="" width="300" height="192" /></a><p class="wp-caption-text">Fig. 5: System is tuned and exhibiting linear scalability to 256 KVM guests</p></div>
<p>As part of a joint documentation effort with Red Hat, all of the data collected has been brought together in a <a href="https://inquiries.redhat.com/go/redhat/rhel-hp-proliant">Reference Architecture document  &#8211; “Scaling RHEL 5.4 + KVM up to 256 Guests&#8221;</a> available for free from Red Hat’s website.</p>
<p>We obviously picked the guest density to prove a point about the platform. However, it’s worth mentioning that <strong><em>256 guests</em></strong> <strong><em>does not represent the upper bound for the platform</em></strong>. It only represents the point where we thought the density went (far) beyond what is reasonable to expect in a production environment in this day and age.</p>
]]></content:encoded>
			<wfw:commentRss>http://linux.sjolshagen.net/2010/03/01/scaling-up-your-virtualization-solution-on-8-socket-hp-proliant-servers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>KVM/Qemu and caching of I/O</title>
		<link>http://linux.sjolshagen.net/2010/01/10/kvmqemu-and-caching-of-io/</link>
		<comments>http://linux.sjolshagen.net/2010/01/10/kvmqemu-and-caching-of-io/#comments</comments>
		<pubDate>Sun, 10 Jan 2010 15:00:36 +0000</pubDate>
		<dc:creator>Thomas S</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[disk i/o]]></category>
		<category><![CDATA[KVM]]></category>
		<category><![CDATA[libvirt]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://linux.sjolshagen.net/?p=97</guid>
		<description><![CDATA[A feeble(ish) attempt at documenting the 'cache' properties for the Kernel Virtual Machine when managed by libvirtd.]]></description>
			<content:encoded><![CDATA[<p>I like to live &#8220;on the edge&#8221;. At least technologically speaking.</p>
<p>As a consequence, in my environment, I&#8217;ve got a couple of KVM guests that are running Fedora 12 with Red Hat Cluster v3.0.6 installed. That&#8217;s not really &#8220;living on the edge&#8221;. The &#8220;living on the edge&#8221; part of that configuration is that the two guests share a clustered file system. This clustered file system is hosted on a DRBD replicated volume between two standard internal SATA drives hosted on two different KVM host systems. And these host systems are, in turn, their own Fedora 12 based Red Hat Cluster.</p>
<p>Obviously, there are plenty of opportunities for data to go &#8220;missing&#8221; (get corrupted/get lost/disappear/etc) in a configuration like this. And I thought I&#8217;d been able to eliminate them all.</p>
<p>That was what I thought, until I ran one of the KVM guests on one of the hosts, and the other on the other. My GFS2 file system wasn&#8217;t impressed! And I was stumped. DRBD had been configured with synchronous replication (let&#8217;s not talk about the performance impact of that decision, shall we&#8230;?) but obviously the data wasn&#8217;t being committed simultaneously to both drives<sup>[1]</sup>.</p>
<p>I now suspect that&#8217;s happening because the KVM hosts were caching the data on the guests&#8217; behalf. That could be a very spiffy performance boost<sup>[2]</sup>, but it causes all sorts of problems for my clustered applications that rely on the data in the file system being where it&#8217;s supposed to be.</p>
<p>So, I had to dig around a little and discovered  that Qemu/KVM/libvirt actually supports setting the caching properties for the &#8216;physical&#8217; devices backing its virtual hard drives (i.e. the hard drives or container files exported to the guest as &#8220;disks&#8221;). And it&#8217;s &#8211; if you&#8217;re using the CLI interfaces for managing KVM, libvirtd &amp; virsh &#8211; fairly easy to set it to what you want/need it to be.</p>
<p>The caching properties you can set are:</p>
<ul>
<li>writeback</li>
<li>writethrough</li>
<li>none</li>
<li>default</li>
</ul>
<p>Unfortunately, I&#8217;ve not been able to locate a way to set this while creating the guest with virt-manager. However, virt-install does let you set it, and if the guest is inactive (i.e. not running), you can set it by editing the &lt;driver&gt; tag of the relevant disk in the guest&#8217;s XML definition.</p>
<p>For example:</p>
<blockquote>
<pre>&lt;disk type='block' device='disk'&gt;
   &lt;driver name='qemu' cache='none'/&gt;
   &lt;source dev='/dev/mapper/sharedVG01-www--local'/&gt;
   &lt;target dev='vde' bus='virtio'/&gt;
&lt;/disk&gt;</pre>
</blockquote>
<div><span style="color: #800000;">NOTE: </span><span style="color: #000080;">Early versions of libvirtd may </span><strong><em><span style="color: #000080;">not</span></em><span style="font-weight: normal;"><span style="color: #000080;"> support the &lt;driver cache=''&gt; nomenclature.</span> I&#8217;m using 0.7.5 in my environment, but I believe any recent version of libvirtd (0.7 and later, for sure) includes support for this. To check your libvirt version, issue the command:</span></strong></div>
<blockquote>
<pre># virsh version</pre>
</blockquote>
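<p>For what it&#8217;s worth, here is a minimal sketch of both routes mentioned above. The guest name <span style="font-family: courier new,courier;">guest01</span>, the memory size and the install media path are hypothetical placeholders (they are not from my environment), and the exact option names depend on your virt-install version:</p>
<blockquote>
<pre>## Route 1: set the cache mode when creating the guest
# virt-install --name guest01 --ram 1024 \
    --disk path=/dev/mapper/sharedVG01-www--local,bus=virtio,cache=none \
    --cdrom /path/to/install.iso

## Route 2: shut down an existing guest, then edit the driver tag in its XML
# virsh shutdown guest01
# virsh edit guest01</pre>
</blockquote>
<p>Either way, the change only takes effect the next time the guest is started.</p>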
<h3>Apropos:</h3>
<p>[1] = I know, I know. A DRBD mirror set up to use &#8220;protocol C&#8221; doesn&#8217;t, technically, commit the data simultaneously to both devices. It only &#8220;looks&#8221; like that because the write() operation does not return success until the data has been successfully written on the &#8220;remote&#8221; device as well as the local one.</p>
<p>[2] = It is, actually. As an example, the reason why the likes of Xen, KVM, etc. have been able to post IO benchmarks that are more than 100% of the performance of the underlying hardware is that the host environment caches the data on the guest&#8217;s behalf. Looks good on benchmarks. Not so much if your host fails before the data have been flushed from the host cache onto the physical disk devices. Applications tend to get cranky when data they &#8220;know&#8221; was committed to persistent storage is missing&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://linux.sjolshagen.net/2010/01/10/kvmqemu-and-caching-of-io/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting ready for an industry conference</title>
		<link>http://linux.sjolshagen.net/2009/05/09/getting-ready-for-an-industry-conference/</link>
		<comments>http://linux.sjolshagen.net/2009/05/09/getting-ready-for-an-industry-conference/#comments</comments>
		<pubDate>Sat, 09 May 2009 17:44:10 +0000</pubDate>
		<dc:creator>Thomas S</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://linux.sjolshagen.net/?p=6</guid>
		<description><![CDATA[I wound up with more work on my plate this week in order to get ready for a large conference this summer where I need to talk about scalability, performance and virtualization of large scale-up platforms. ]]></description>
			<content:encoded><![CDATA[<p>This week was one of those &#8220;in spite of the best laid plans of mice and men&#8221; weeks.</p>
<p>One of the engineers I&#8217;ve been working with to get a better understanding of the performance and scalability of a scale-up environment running Red Hat Virtualization (Xen) as well as the Kernel-based Virtual Machine (KVM) is changing roles within the company. Good for him!</p>
<p>Unfortunately, this means that I&#8217;ll have to do some of the data-collection work on my own. On top of my daily responsibilities. So, we met this week to try and figure out how we could compress what&#8217;s been, thus far, a multi-month effort into something a bit more schedule friendly for the both of us. And I think we&#8217;ve got a reasonable plan, all things considered.</p>
<p>One of the Achilles&#8217; heels of almost any of the virtualization solutions, thus far, has been the IO throughput (and latency) for disk-related IO operations. If you look at testing done by various vendors and the Open Source community at large, the throughput has been either &#8220;not great&#8221; or inconsistent. The &#8220;not great&#8221; element of this hasn&#8217;t, historically, really been that huge of a deal for customers, since they seem to have planned around it by not including disk IO intensive workloads in their consolidation/virtualization plans.</p>
<p>However, more and more customers are looking to virtualize <em>everything</em> running on their server platforms in an effort to save power, cooling and management costs. As a result, the &#8220;not great&#8221; performance behavior has become enough of an issue that all of the virtualization vendors now support (or will in the near future), minimally, para-virtualized IO drivers and/or other performance optimizations.</p>
<p>Consequently, they also appear to be pushing Linux (and Windows) further &#8220;back&#8221; into their data centers and looking to use the two for more and more critical tasks, even in a virtualized/consolidated context. An on-going problem for myself and our customers has been the amount of fact-based information available in terms of how to best configure these environments to optimize IO performance or to help customers understand the actual limitations and benefits of the various IO options in a virtualized environment.</p>
<p>Also, there&#8217;s very little in terms of stated best practices for using things like raw devices, volume manager backed devices, file systems or file containers for the various types of workloads. So, I was hoping we&#8217;d be able to provide some of that in a presentation at a large conference this summer.  And my original plan was that I&#8217;d only be the consumer of the data, not the creator of it. Thus the &#8220;best laid plans of mice and men&#8221; statement early on.</p>
<p>So, for the next couple of weeks, I&#8217;ll be trying to collect whatever the engineer is unable to collect. Then I&#8217;ll have to graph it, &#8216;gussy it up&#8217; (with pretty colors), add some configuration recommendations and bring it all with me to the sessions I have planned for the conference.</p>
]]></content:encoded>
			<wfw:commentRss>http://linux.sjolshagen.net/2009/05/09/getting-ready-for-an-industry-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

