A guide to using NPIV technology across KVM guests on IBM Power Systems
In this article, I am sharing with you my experience in
implementing N-Port ID Virtualization (NPIV) on IBM® Power Systems™ with kernel-based virtual
machine (KVM). NPIV is a Fibre Channel (FC) technology that allows a single physical Fibre
Channel host bus adapter (HBA) to be shared as multiple virtual ports, known as virtual HBAs (vHBAs).
Each virtual port is identified by its own worldwide port name (WWPN) and worldwide node
name (WWNN). Typically, with virtualization, a vHBA controls the logical unit numbers
(LUNs) to be used by the virtual machines (VMs). This is why multiple HBAs are
required for multiple guests, and why NPIV matters: it provides unique
vHBAs that each guest can make use of.
A storage area network (SAN) fabric is a
network with routers and switches. A SAN is configured into a number of zones.
A device using the SAN can communicate only with the devices that are included in
the same zone.
The following are the prerequisites for using LUNs (SAN storage volumes in this context)
across guests with NPIV, the technology used to share a single physical HBA as vHBAs:
- SAN storage to create volumes
- SAN switch to provide connectivity between the hypervisor host and the SAN storage using
zones. In our case, the vHBAs on the hypervisor host side are zoned with the SAN controllers
- Hypervisor host and SAN connected through FC switches
- Hypervisor host operating system with the following versions (a quick way to verify them is shown after this list):
- libvirt version 1.0.0 or later
- QEMU version 2.7 or later
The following are the sequential steps involved in utilizing SAN storage LUN in KVM
guests using NPIV.
- Creating vHBA in host hypervisor
- Zoning in switch
- Creating a SAN volume
- Discovering LUN in host
- Booting KVM with LUN
Figure 1 provides a pictorial representation of vHBA connectivity with SAN and FC Switch.
Figure 1. Virtual FC connectivity to SAN and storage
A physical HBA that is capable of supporting NPIV can be seen in the hypervisor host using
the following libvirt command:
# virsh nodedev-list --cap vports
scsi_host1
scsi_host2
Run the following command to view the WWPN of the HBA:
# virsh nodedev-dumpxml scsi_host1
<device>
  <name>scsi_host1</name>
  <path>/sys/devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:09.0/0001:09:00.0/host1</path>
  <parent>pci_0001_09_00_0</parent>
  <capability type='scsi_host'>
    <host>1</host>
    <unique_id>0</unique_id>
    <capability type='fc_host'>
      <wwnn>20000110fb8f0ebc</wwnn>
      <wwpn>10000090fb9f1ebc</wwpn>
      <fabric_wwn>100050eb1a99d430</fabric_wwn>
    </capability>
    <capability type='vport_ops'>
      <max_vports>255</max_vports>
      <vports>2</vports>
    </capability>
  </capability>
</device>
The <path> XML tag above shows the sysfs path associated with the HBA.
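If you prefer to read the same information directly from sysfs, the fc_host class exposes the WWNN, WWPN, and fabric name as individual attribute files (a quick check, assuming the HBA is host1 as in the output above):
# cat /sys/class/fc_host/host1/node_name
0x20000110fb8f0ebc
# cat /sys/class/fc_host/host1/port_name
0x10000090fb9f1ebc
# cat /sys/class/fc_host/host1/fabric_name
0x100050eb1a99d430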
Log in to the FC switch interface and check whether the hypervisor host and
the SAN controller WWPNs are connected, using the
switchshow command. You can notice that the
WWPNs of the hypervisor host as well as the SAN storage are connected to ports of the FC switch.
> switchshow
…
Index Port Address  Media Speed State     Proto
==================================================================================
  0    0   010000   id    N8    Online    FC  F-Port  50:05:17:28:0b:12:55:ab
  1    1   010100   id    N8    Online    FC  F-Port  50:05:17:28:0b:12:55:ac
…
 12   12   010c00   id    N16   No_Light  FC
 13   13   010d00   id    N16   No_Light  FC
 14   14   010e00   --    N16   No_Module FC
 15   15   010f00   id    N16   No_Light  FC
…
 22   22   011600   id    N8    Online    FC  F-Port  10:00:00:90:fb:9f:1e:bc
 23   23   011700   id    N8    Online    FC  F-Port  10:00:00:90:fa:8f:0e:bd
The highlights show the WWPNs of the hypervisor host and SAN controller connected to
the FC switch. Using a single HBA as the parent, a number of vHBAs
can be created. You can notice that
scsi_host1 is used as the parent in the following XML file, which is used as input for the
virsh nodedev-create command:
<device>
  <parent>scsi_host1</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
    </capability>
  </capability>
</device>

# virsh nodedev-create vhba.xml
Node device scsi_host3 created from vhba.xml

# virsh nodedev-create vhba.xml
Node device scsi_host4 created from vhba.xml

# virsh nodedev-dumpxml scsi_host3
<device>
  <name>scsi_host3</name>
  <path>/sys/devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:09.0/0001:09:00.0/host1/vport-1:0-0/host3</path>
  <parent>scsi_host1</parent>
  <capability type='scsi_host'>
    <host>3</host>
    <unique_id>2</unique_id>
    <capability type='fc_host'>
      <wwnn>5001a4aa3d07073a</wwnn>
      <wwpn>5001a4ae03a79be0</wwpn>
      <fabric_wwn>100050eb1a99d430</fabric_wwn>
    </capability>
  </capability>
</device>
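When the fc_host capability is left empty as above, libvirt generates the WWNN and WWPN for you. If you want the vHBA to come up with known names (for example, so that existing switch zoning still applies if you ever re-create the vHBA), you can specify them explicitly. A minimal sketch with placeholder values that you would replace with your own:
<device>
  <parent>scsi_host1</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
      <!-- placeholder values; replace with your own WWNN/WWPN -->
      <wwnn>2001000000000001</wwnn>
      <wwpn>1001000000000001</wwpn>
    </capability>
  </capability>
</device>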
We have created two vHBAs for zoning. The output of the
switchshow command confirms that the physical port now carries two additional NPIV logins:
> switchshow
…
Index Port Address  Media Speed State    Proto
===================================================================================
…
 22   22   011600   id    N8    Online   FC  F-Port  1 N Port + 2 NPIV public
The next step is to zone the vHBA WWPN
with the SAN controller WWPNs.
Zoning in Brocade FC switch
Zoning is the process of grouping the WWPNs of the intended devices (ports) together so that
those devices can communicate with each other. Devices are identified by their unique WWPNs,
and only WWPNs placed in the same zone can communicate through the switch. Log in to the FC
switch interface. The WWPN of the
scsi_host3 vHBA (which is
5001a4ae03a79be0) has to be zoned with the WWPN of the SAN controller. The
zoneshow command lists the existing zones and the
effective configuration, named
current_cfg here, which is the configuration currently enforced by the switch.
Effective configuration:
 cfg:   current_cfg
Assuming that the WWPN of the other
scsi_host4 vHBA is
5002b4152fcb28a4 and the WWPN of the
SAN controller is
500507680b2255fe, you can create a zone using the following commands:
> zonecreate "vhba01", "50:05:07:68:0b:22:55:fe"
> zoneadd "vhba01", "50:01:a4:ae:03:a7:9b:e0"
> zoneadd "vhba01", "50:02:b4:15:2f:cb:28:a4"
You can add the created zone to the current configuration and enable it:
> cfgadd "current_cfg","vhba01"
> cfgsave
> cfgenable current_cfg
Make sure that your new zone vhba01 is visible in the
Effective configuration section using the zoneshow command.
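For reference, the effective configuration should now look roughly like the following (illustrative output; the exact layout varies with the switch firmware):
> zoneshow
Effective configuration:
 cfg:   current_cfg
 zone:  vhba01
                50:05:07:68:0b:22:55:fe
                50:01:a4:ae:03:a7:9b:e0
                50:02:b4:15:2f:cb:28:a4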
Note: All the above switch commands are specific to the Brocade FC switch. There are similar commands for Cisco FCoE switch.
SAN Volume creation
After zoning is complete, the
hypervisor host's WWPNs must be mapped to the new volume created on the SAN. Refer to the
IBM System Storage SAN
Volume Controller and Storwize V7000 Best Practices and Performance Guidelines for
details on creating volumes and mapping them to hosts.
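As a rough illustration of what that mapping involves on a SAN Volume Controller/Storwize system, the CLI sequence looks something like the following sketch; the host name, pool name, volume name, and size are placeholders, and the same steps can also be done from the storage GUI:
> mkhost -name kvmhost -fcwwpn 5001A4AE03A79BE0:5002B4152FCB28A4
> mkvdisk -mdiskgrp pool0 -size 50 -unit gb -name kvm_lun0
> mkvdiskhostmap -host kvmhost kvm_lun0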
LUN discovery in hypervisor host
For the new LUN to be displayed at the hypervisor host, run the
rescan-scsi-bus.sh command, available as part of the sg3_utils package. The new
disks that appear in the hypervisor host are the LUN being accessed through the two virtual HBAs.
Install the sg3_utils package and scan for the new disks using the following command:
# rescan-scsi-bus.sh
Host adapter 0 (ipr) found.
Host adapter 1 (lpfc) found.
Host adapter 2 (lpfc) found.
Scanning SCSI subsystem for new devices
Scanning host 0 for SCSI target IDs 0 1 2 3 4 5 6 7, all LUNs
…
2 new device(s) found.
        [1:0:1:0]
        [1:0:1:1]
0 device(s) removed.
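To confirm which block devices arrived through the vHBA rather than through a physical port, you can check the persistent by-path names, which include the vport WWPN (the sdc target shown here is illustrative; your device name may differ):
# ls -l /dev/disk/by-path/ | grep vport
… pci-0001:09:00.0-vport-0x5001a4ae03a79be0-fc-0x500507680b2255fe-lun-0 -> ../../sdc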
Now these devices can be used like any other disk. The libvirt implementation allows you to configure the LUNs either directly on the
virtual machine or as part of a storage pool whose volumes can then be attached to a virtual machine.
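For the direct option, a LUN can be passed to a guest as a block device by referencing its persistent by-path name in the guest's disk definition. A minimal sketch, assuming the by-path name discovered above and a free target name of sdb in the guest:
<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/disk/by-path/pci-0001:09:00.0-vport-0x5001a4ae03a79be0-fc-0x500507680b2255fe-lun-0'/>
  <target dev='sdb' bus='scsi'/>
</disk>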
Booting KVM guest with NPIV-based storage volume
The available LUNs can be configured to be used like any other disk. You can choose to use
the LUN as a raw disk, install a guest on the LUN, create a volume out of the LUN with pools, and
more. The following example demonstrates a guest using the volumes created from LUNs
(configured with pools).
Creation of pool
The pool uses the WWPN and WWNN of the scsi_host3 vHBA in the
following XML file:
<pool type="scsi"> <name>npiv-pool</name> <target> <path>/dev/disk/by-path</path> </target> <source><adapter type="scsi_host" wwnn="5001a4aa3d07073a" wwpn="5001a4ae03a79be0" /></source> </pool> # virsh pool-define pool.xml Pool npiv-pool defined from pool.xml # virsh pool-start npiv-pool Pool npiv-pool started
After a pool has been started, you can see the LUN being controlled by
the scsi_host3 vHBA.
# virsh vol-list npiv-pool --details
Name        Path
---------------------------------------------------------------------------------------------------
unit:0:1:0  /dev/disk/by-path/pci-0001:09:00.0-vport-0x5001a4ae03a79be0-fc-0x500507680b2255fe-lun-0
Create a disk XML definition using this volume and hot or cold attach it to a guest.
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='npiv-pool' volume='unit:0:1:0'/>
  <target dev='sdc' bus='scsi'/>
</disk>
Assuming that a guest is defined and is yet to
be started, cold-attach the disk using the following command:
# virsh attach-device <guest-name> volume-attach.xml --config
Device attached successfully
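If the guest is already running, the same definition can be hot attached instead; --live applies it only to the running guest, while --persistent applies it to both the running guest and its saved configuration:
# virsh attach-device <guest-name> volume-attach.xml --live
Device attached successfully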
Boot the guest system and observe the new disk using the lsblk command
inside the guest; this is the LUN passed through from the host. In the guest, you can see the disk as follows:
[root@localhost ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0   30G  0 disk
├─sda1   8:1    0    4M  0 part
├─sda2   8:2    0  500M  0 part /boot
└─sda3   8:3    0 29.5G  0 part /
vda      1:0    0   50G  0 disk
Significance of NPIV with guests
Without NPIV, all guests would be sharing the same WWPN on the single physical N_Port. Any LUN zoned to this WWPN
would be visible to all guests on that hypervisor host because all guests are using the same
physical N_Port, same WWPN, and same N_Port_ID. With NPIV, the physical N_Port can
register additional WWPNs (and N_Port_IDs). Each guest can have its own WWPN. When you build
SAN zones and present LUNs using the guest-specific WWPN, then the LUNs will only be visible
to that guest and not to any other guests.
This article described the steps to set up and use a SAN storage volume in a KVM guest
environment using NPIV on an IBM Power® server. To summarize, NPIV is significant in that it
allows multiple guests on the same hypervisor host to use different volumes from SAN storage
through a single physical HBA connected to the FC fabric.