iSCSI on a Dell MD3000i
My company recently got a Dell MD3000i “SAN” for a pretty good deal. We’re starting to dabble in such things as virtualization, and the offer was too good to refuse. The MD3000i is pretty basic. Really it seems like just a JBOD with an iSCSI head, but it is a good way to start to play in that space. We got it with dual controllers and about 6TB of space.
Of course it comes with an installer for RedHat (Dell’s supported flavor), but not one for Gentoo. So here are my tried and tested installation instructions.
On the server, you have to ensure that iSCSI support is included in the kernel.
Run make menuconfig and make certain the following options are configured:
Device Drivers --->
    SCSI device support --->
        [*] SCSI device support
        <*> SCSI disk support
        SCSI Transports --->
            {M} iSCSI Transport Attributes
        [*] SCSI low-level drivers --->
            <M> iSCSI Initiator over TCP/IP
Cryptographic options --->
    [*] Cryptographic API
    <*> CRC32c CRC algorithm
Note that the Transport and Initiator MUST be built as modules. Open-iscsi requires them to be modules and checks for them in the init script.
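Before rebuilding, it can be worth checking the config file for these options. The sketch below is one way to do that. The function name check_iscsi_kconfig is my own invention, and it assumes the usual symbol names for these menu entries (CONFIG_SCSI_ISCSI_ATTRS, CONFIG_ISCSI_TCP, CONFIG_CRYPTO_CRC32C) and the usual Gentoo location /usr/src/linux/.config. Accepting either y or m for CRC32c is also my assumption.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel .config has the iSCSI options
# set the way open-iscsi wants them (transport and initiator as modules).
check_iscsi_kconfig() {
    config="${1:-/usr/src/linux/.config}"   # assumed default location
    status=0
    # Each entry is SYMBOL=accepted-values, alternatives separated by '|'
    for want in SCSI_ISCSI_ATTRS=m ISCSI_TCP=m 'CRYPTO_CRC32C=y|m'; do
        sym="${want%%=*}"
        vals="${want#*=}"
        if grep -Eq "^CONFIG_${sym}=(${vals})\$" "$config"; then
            echo "ok      CONFIG_${sym}"
        else
            echo "MISSING CONFIG_${sym} (want ${vals})"
            status=1
        fi
    done
    return "$status"
}
```

Run it as check_iscsi_kconfig with no arguments, or pass the path to another config file. It returns non-zero if anything is missing.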
Emerge the packages:
emerge -av sys-block/open-iscsi sys-fs/multipath-tools
Configure the ethernet ports. You probably want to use multipathing for redundancy if you can. In this case, the server has 4 NICs and I am using the second and third (eth1 and eth2), going to two separate switches, each with connections to the two controllers on the MD3000i. You also want to use jumbo frames (MTU > 1500) if your switch supports them; make sure to turn them on in the switch as well if necessary. You also want to put the SAN traffic on its own subnet, and on its own VLAN if the switch carries any regular traffic. I have not yet tackled the QoS issue on the switches, but we are not yet close to capacity so I think it should be okay.
In /etc/conf.d/net:
config_eth1=( "10.8.251.<base>/24 brd 10.8.251.255")
mtu_eth1="9000"
config_eth2=( "10.8.252.<base>/24 brd 10.8.252.255")
mtu_eth2="9000"
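Jumbo frames only help if every hop passes them, so it is worth testing once the interfaces are up. A minimal sketch, assuming the Linux iputils ping (whose -M do option sets the don't-fragment bit); both function names are my own:

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one frame of the given MTU:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Hypothetical wrapper: ping with don't-fragment set, so packets that are too
# big fail instead of being silently fragmented.
# Usage: check_jumbo <target-ip> [mtu]
check_jumbo() {
    target="$1"
    mtu="${2:-9000}"
    ping -M do -c 3 -s "$(icmp_payload_for_mtu "$mtu")" "$target"
}
```

For example, check_jumbo 10.8.251.8 sends 8972-byte payloads to one of the controller ports. If the pings fail while an ordinary ping works, something along the path is probably still at MTU 1500.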
Edit /etc/iscsi/initiatorname.iscsi. Note that the naming convention is iqn.<domain registration yyyy-mm>.<fqdn in reverse>:<whatever you want to make it unique inside your org>. I use the hostname so I can easily recognize what server it is and the MAC address of the primary NIC for uniqueness.
InitiatorName=iqn.2006-10.org.onejohn.<hostname>:<hostname>.onejohn.<MAC address>
InitiatorAlias=<hostname>.iscsi1
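If you have more than a couple of servers, generating the name saves typos. This is only a sketch of the general convention (iqn, date, reversed domain, then a unique part after the colon), not exactly the layout I use above. The function name make_iqn and the choice to lowercase the MAC and drop its colons are my own assumptions.

```shell
#!/bin/sh
# Hypothetical helper: build an initiator name of the form
#   iqn.<yyyy-mm>.<reversed domain>:<hostname>.<mac without colons>
# Usage: make_iqn <yyyy-mm> <domain> <hostname> <mac>
make_iqn() {
    date="$1"; domain="$2"; host="$3"; mac="$4"
    # Reverse the dot-separated domain labels, e.g. example.org -> org.example
    reversed=$(echo "$domain" | awk -F. '{ for (i = NF; i > 1; i--) printf "%s.", $i; print $1 }')
    # Lowercase the MAC and drop the colons
    mac=$(echo "$mac" | tr 'A-F' 'a-f' | tr -d ':')
    echo "iqn.${date}.${reversed}:${host}.${mac}"
}
```

Calling make_iqn 2006-10 onejohn.org db1 00:1A:2B:3C:4D:5E (db1 and the MAC are made-up values) gives a string you can paste after InitiatorName=.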
Set up /etc/multipath.conf. I have two outgoing NICs, each on its own switch, and each switch is connected to both controllers, so there are 4 paths to each LUN. Multipath collapses them into a single device name and automatically switches between the available paths as needed.
defaults {
    udev_dir /dev
    polling_interval 10
    selector "round-robin 0"
    path_grouping_policy group_by_prio
    getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
    prio_callout "/sbin/mpath_prio_rdac /dev/%n"
    path_checker rdac
    hardware_handler "1 rdac"
    rr_min_io 1000
    rr_weight priorities
    failback 10
    #no_path_retry queue
    user_friendly_names no
}
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z][[0-9]*]"
    devnode "^sda[0-9]*$" # make sure all local scsi disks are here
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
    wwid 360026b90004b6c4a0000036c4b1aadf0
    device {
        vendor "DELL "
        product "Universal Xport*"
    }
}
multipaths {
}
devices {
    device {
        vendor "DELL " # get from `cat /sys/block/sdab/device/vendor > /tmp/sanvend`. Spaces are important
        product "MD3000i " # get from `cat /sys/block/sdab/device/model > /tmp/sanmodel`. Spaces are important
        path_grouping_policy multibus
        getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
        prio_callout "/sbin/mpath_prio_rdac %d"
        path_checker rdac
        path_selector "round-robin 0"
        failback 10
        rr_min_io 1000
    }
}
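Since the trailing spaces in the vendor and product strings matter, it helps to print them inside quotes so the padding is visible. A small sketch; show_padded and show_scsi_id are names I made up, and sdab is just the example device from the comments in the config:

```shell
#!/bin/sh
# Hypothetical helper: print the first line of a file wrapped in double
# quotes, keeping leading and trailing spaces intact.
show_padded() {
    # IFS= stops read from trimming spaces; "|| :" tolerates a missing final newline
    IFS= read -r line < "$1" || :
    printf '"%s"\n' "$line"
}

# Print the vendor and model strings for one SCSI disk, e.g. show_scsi_id sdab
show_scsi_id() {
    dev="$1"
    printf 'vendor  '; show_padded "/sys/block/${dev}/device/vendor"
    printf 'product '; show_padded "/sys/block/${dev}/device/model"
}
```

Copy the quoted strings, padding included, into the vendor and product lines of the device section.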
Update the /etc/udev/rules.d/66-kpartx.rules and change the name to 66-rkgkpartx.rules so it isn’t overwritten:
--- multipath-tools-0.4.8/kpartx/kpartx.rules 2007-08-02 17:05:37.000000000 -0400
+++ /etc/udev/rules.d/66-rkgkpartx.rules 2010-01-06 16:10:32.000000000 -0500
@@ -7,7 +7,7 @@
 KERNEL!="dm-*", GOTO="kpartx_end"
 ACTION=="remove", GOTO="kpartx_end"
-ENV{DM_TABLE_STATE}!="LIVE", GOTO="kpartx_end"
+ENV{DM_TABLE_LIVE}!="1", GOTO="kpartx_end"
 ENV{DM_UUID}=="?*", IMPORT{program}=="/lib/udev/kpartx_id %M %m $env{DM_UUID}"
@@ -18,7 +18,7 @@
     SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_NAME}"
 # Create persistent links for dmraid tables
-ENV{DM_UUID}=="mpath-*", \
+ENV{DM_UUID}=="dmraid-*", \
     SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_NAME}"
 # Create persistent links for partitions
@@ -26,10 +26,10 @@
     SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_NAME}-part$env{DM_PART}"
 # Create dm tables for partitions
-ENV{DM_STATE}=="ACTIVE", ENV{DM_UUID}=="mpath-*", \
-    RUN+="/sbin/kpartx -a -p -part /dev/$kernel"
-ENV{DM_STATE}=="ACTIVE", ENV{DM_UUID}=="dmraid-*", \
-    RUN+="/sbin/kpartx -a -p -part /dev/$kernel"
+ENV{DM_STATUS}=="ACTIVE", ENV{DM_UUID}=="mpath-*", \
+    RUN+="/sbin/kpartx -a -p '' /dev/$kernel"
+ENV{DM_STATUS}=="ACTIVE", ENV{DM_UUID}=="dmraid-*", \
+    RUN+="/sbin/kpartx -a -p '' /dev/$kernel"
 LABEL="kpartx_end"
Start iscsid and multipathd:
/etc/init.d/iscsid start
/etc/init.d/multipathd start
If iscsid complains about “No Records Found!”, try starting it again until “rc-status -a | grep iscsid” shows it running.
Create the interfaces:
iscsiadm -m iface -I iface0 --op=new
iscsiadm -m iface -I iface0 --op=update -n iface.hwaddress -v <MAC address of eth1>
iscsiadm -m iface -I iface1 --op=new
iscsiadm -m iface -I iface1 --op=update -n iface.hwaddress -v <MAC address of eth2>
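To avoid copying MAC addresses by hand, the commands above can be generated from sysfs, which exposes each NIC's MAC in /sys/class/net/<nic>/address. A sketch only: print_iface_cmds and the SYSFS_NET override are my own names, and the function prints the commands rather than running them so you can look them over first.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the iscsiadm commands that bind
# iface0, iface1, ... to the MAC addresses of the given NICs, in order.
# SYSFS_NET can be pointed elsewhere for testing; default is /sys/class/net.
print_iface_cmds() {
    n=0
    for nic in "$@"; do
        mac=$(cat "${SYSFS_NET:-/sys/class/net}/${nic}/address") || return 1
        echo "iscsiadm -m iface -I iface${n} --op=new"
        echo "iscsiadm -m iface -I iface${n} --op=update -n iface.hwaddress -v ${mac}"
        n=$((n + 1))
    done
}
```

Run print_iface_cmds eth1 eth2 to review the output, then pipe it to sh once it looks right.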
In the Modular Disk Storage Manager, go to Configure/Configure Host Access (Manual) and enter the host name and Linux as the OS. On the next screen it should show up under Known iSCSI initiators. Add it, and hit Next. Choose “No: This host will NOT share access…”, hit next and Finish.
On the server, discover the targets:
iscsi_discovery 10.8.251.8 -m
Log into the discovered targets:
iscsiadm -m node -T iqn.1984-05.com.dell:powervault.md3000i.60026b90004b6c4a000000004b1aadba -l
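The long target name does not have to be typed by hand. After discovery, iscsiadm -m node lists one record per line in the form ip:port,tpgt target-name, so the name can be pulled from the second column. A small sketch, with list_targets being a name of my own:

```shell
#!/bin/sh
# Hypothetical helper: read "ip:port,tpgt target-name" records on stdin
# (the format printed by `iscsiadm -m node`) and print each target name once.
list_targets() {
    awk 'NF >= 2 { print $2 }' | sort -u
}
```

Use it as iscsiadm -m node | list_targets. With a single MD3000i you should see one name even though there are several portals.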
Set the nodes that can be logged into to automatic:
iscsiadm -m node -T iqn.1984-05.com.dell:powervault.md3000i.60026b90004b6c4a000000004b1aadba -p 10.8.251.8,3260 -I iface0 -o update -n node.startup -v automatic
iscsiadm -m node -T iqn.1984-05.com.dell:powervault.md3000i.60026b90004b6c4a000000004b1aadba -p 10.8.251.9,3260 -I iface0 -o update -n node.startup -v automatic
iscsiadm -m node -T iqn.1984-05.com.dell:powervault.md3000i.60026b90004b6c4a000000004b1aadba -p 10.8.252.8,3260 -I iface1 -o update -n node.startup -v automatic
iscsiadm -m node -T iqn.1984-05.com.dell:powervault.md3000i.60026b90004b6c4a000000004b1aadba -p 10.8.252.9,3260 -I iface1 -o update -n node.startup -v automatic
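The four commands above differ only in the portal and the interface, so they can be generated by a loop. This is a sketch, not what I actually ran: print_startup_cmds is a made-up name, it only prints the commands, and it writes the portal as ip:port, which is the form the iscsiadm man page documents for -p.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one "node.startup = automatic" update
# per portal.
# Usage: print_startup_cmds <target> <iface>=<ip:port> [<iface>=<ip:port> ...]
print_startup_cmds() {
    target="$1"; shift
    for pair in "$@"; do
        iface="${pair%%=*}"     # text before the first '='
        portal="${pair#*=}"     # text after the first '='
        echo "iscsiadm -m node -T ${target} -p ${portal} -I ${iface} -o update -n node.startup -v automatic"
    done
}
```

For this setup the arguments would be the target name followed by iface0=10.8.251.8:3260 iface0=10.8.251.9:3260 iface1=10.8.252.8:3260 iface1=10.8.252.9:3260. Pipe the output to sh to apply it.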
Create partitions on the SAN. Tell iscsid to rescan:
iscsiadm -m node -R
Get the target wwids from multipath:
multipath -d
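The multipath output is verbose, and only the wwids are needed for the next step. A rough filter, assuming the IDs look like the ones in this post (a 3 followed by 32 lowercase hex digits) and a grep that supports -o; extract_wwids is a name of my own:

```shell
#!/bin/sh
# Hypothetical helper: pull WWIDs out of multipath output on stdin.
# Assumes each ID is a "3" followed by 32 lowercase hex digits, standing
# alone as a word (for example inside parentheses).
extract_wwids() {
    grep -owE '3[0-9a-f]{32}' | sort -u
}
```

Run multipath -d | extract_wwids to get one wwid per line, ready to paste into multipath.conf.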
In the multipaths section of /etc/multipath.conf add them:
multipath {
    wwid 360024b90004b6caa000003834b88c49d # from the multipath -d output
    alias foo # fix to a nice name
}
Partition the new device with fdisk and format the new partition:
fdisk /dev/mapper/foo
mkfs.xfs /dev/mapper/foo1
Add them to fstab:
/dev/mapper/foo1 /mnt/foo xfs _netdev,noatime 0 2
Create /etc/init.d/iscsi-mount:
#!/sbin/runscript

depend() {
    need iscsid multipathd
    before mysql # fix to whatever dependencies there are.
}

start() {
    ebegin "Mounting _netdev devices"
    sleep 3 # make sure multipaths are loaded and settled
    /sbin/multipath
    /bin/mount -a -O _netdev -v
    eend $?
}

stop() {
    ebegin "Unmounting _netdev devices"
    /bin/umount -a -O _netdev -v
    eend $?
}

restart() {
    svc_stop
    svc_start
}
and make it executable:
chmod a+x /etc/init.d/iscsi-mount
Add them to default startup:
rc-update add iscsid default
rc-update add multipath default
rc-update add iscsi-mount default
May 12th, 2010 at 7:32 am
Very helpful. Thank you 🙂
November 3rd, 2010 at 1:49 pm
Hi, thanks for this article it really helped me to get things started.
I still have some problems with my multipath configuration. Could you send me the output of your multipath -ll ? Mine looks like:
md3000i_file_service_1 (36001c23000b970ea000003654cb403ec) dm-0 ,
[size=1.4T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ #:#:#:# sdf 8:80 [active][undef]
\_ #:#:#:# sde 8:64 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ #:#:#:# sdc 8:32 [active][undef]
\_ #:#:#:# sdd 8:48 [active][undef]
I was wondering if it is normal that there is no device name at the end of the first line, as I have seen in other posts. Also I am not sure if all the udev rules work as expected, because I have lvm2 installed as well.
November 3rd, 2010 at 7:50 pm
Chris.
Here is one of mine:
databasepv (360026b90004b6c4a0000088e4cc82414) dm-3 ,
[size=1.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=6][active]
\_ #:#:#:# sdd 68:48 [active][ready]
\_ #:#:#:# sdf 68:80 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ #:#:#:# sdc 68:32 [active][ghost]
\_ #:#:#:# sde 68:64 [active][ghost]
So I do not have a device name either. I suspect that better SANs might push those names, but I have not tried one yet so I don’t know for sure.
With LVM installed, I have sometimes run into issues with iSCSI trying to shut down before LVM does, but LVM keeping a lock on the volume so iSCSI is not able to shut down properly. Then LVM can’t flush its buffers, and the system hangs on shutdown or reboot. The only solution I’ve found is to issue a “dmsetup remove vg” on all volume groups on the iSCSI lun before shutting down. In Gentoo, that can go in /etc/init.d/local.stop.
December 13th, 2010 at 11:25 pm
Since you made your comment I have done some more investigation. It seems that
multipath-tools 0.4.8 was built for an old version of sysfs and was
looking in the wrong location for the vendor and model (and some other
stuff). Upgrading to 0.4.9-r1 fixes it. In portage 0.4.9-r1 is still
masked so if you are using Gentoo you will need to ~ keyword it for your
arch in /etc/portage/package.keywords.
Here is the new output showing the vendor and model. Also note the
pretty new ASCII art lines:
test1 (360026b90004b6c4a00000d334ce13319) dm-4 DELL,MD3000i
size=700G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| |- 16:0:0:0 sdb 8:16 active ready running
| `- 15:0:0:0 sdd 8:48 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 17:0:0:0 sdc 8:32 active ghost running
  `- 18:0:0:0 sde 8:64 active ghost running
February 18th, 2012 at 9:36 am
Some truly interesting details you have written. Assisted me a lot, just what I was looking for :D.