Sunday, May 15, 2011

A page from my obscure hardware book: Infiniband on Ubuntu

A while back, I was tasked with setting up a lab's new infiniband mesh on Ubuntu.

For those unfamiliar with infiniband, it is a high-end (read: expensive) networking interface which provides insane throughput with really low latency. Our particular implementation did 40 Gbps with microsecond latency in 2010. I'll give you a moment to pick your jaw up off the floor now. Finished? Ok, let's continue.

Infiniband has an open source friendly consortium backing its development, so you would think it should be easy to set up, right? Unfortunately not. If you've stumbled on this page from a google search, chances are you already know what I'm talking about. The hardware drivers are provided as kernel module sources, which must be compiled for each new version of the kernel.

Alarm bells should be going off in your head at this moment. The amount of work required to maintain the system just went up by an order of magnitude. There isn't a DKMS implementation of this available.

Since you're already reeling from the implications of the time lost maintaining this anyway, I've got more bad news for you Ubuntu users. The packages in question are only provided as RPMs. Ubuntu/Debian are not among the officially supported distributions. (Here's something to console you.)

Back in 2010 we were using the Ubuntu 10.04 Lucid Lynx LTS release, so our kernel version was 2.6.32. The corresponding OFED driver release to use is 1.5.1. After unpacking that, you need to unpack the individual srpms. There are about 15 srpms in this folder, each of which contains its own tarball. Since you've already lost enough time to the future cost of maintaining this beast, we can get bash to unpack things for you.
cd OFED-1.5.1/SRPMS
for i in *.rpm; do rpm -i --force-debian "$i"; done
cd ~/rpmbuild/SOURCES
for i in *.tar.gz; do tar -zxvf "$i"; done
for i in *.tar.bz2; do tar -jxvf "$i"; done
Thankfully, there are a few of the OFED components maintained in the Ubuntu 10.04 repos. Here are the packages you would install to get them.
apt-get install libipathverbs1 libcxgb3-1 librdmacm1 libibverbs1 libmthca1 libopenmpi-dev libopenmpi1.3 openmpi-bin openmpi-common openmpi-doc libmlx4-1 rdmacm-utils ibverbs-utils build-essential byacc bison flex
After installing these packages, there are only a few more packages which need to be compiled from source. These are libibcm, libibumad, libibmad, opensm, and infiniband-diags. I've included the bash commands here for your convenience.
cd libibcm*/
./configure && make && make install
cd ../libibumad*/
./configure && make && make install
cd ../libibmad*/
./configure && make && make install
cd ../opensm*/
./configure && make && make install
cd ../infiniband-diags*/
./configure && make && make install
Wrapping it all up with an `ldconfig` means you've installed all the necessary components.

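Since this package was originally intended for RHEL, where /bin/sh is bash, there are some changes you need to make to the scripts in the infiniband-diags package. On Ubuntu /bin/sh is dash, which chokes on the bash-specific syntax in those scripts.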
cd /usr/local/sbin/
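sed -i '1s|^#!/bin/sh|#!/bin/bash|' *
If your regular expressions are rusty, this command just replaces the #!/bin/sh shebang on the first line of each script with #!/bin/bash. Restricting it to line 1 keeps sed from touching any other /bin/sh string, including ones buried inside the compiled binaries that live in the same directory.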
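If you'd rather not let sed loose on every file in the directory, here is a slightly more defensive sketch of the same rewrite. It only edits files whose first line really is a /bin/sh shebang. The function name fix_shebangs is my own invention for illustration, not something shipped with infiniband-diags, and it assumes GNU sed (for -i), which is what Ubuntu provides.

```shell
# fix_shebangs DIR: hypothetical helper, not part of infiniband-diags.
# Rewrites "#!/bin/sh" to "#!/bin/bash" on line 1 of each regular file in DIR,
# leaving binaries and scripts with other interpreters untouched.
fix_shebangs() {
    dir=$1
    for f in "$dir"/*; do
        # Skip directories, sockets, and an unmatched glob.
        [ -f "$f" ] || continue
        # Only touch files whose first line is exactly a /bin/sh shebang
        # (optionally followed by arguments).
        head -n 1 "$f" | grep -Eq '^#!/bin/sh([[:space:]]|$)' || continue
        # Assumes GNU sed for in-place editing.
        sed -i '1s|^#!/bin/sh|#!/bin/bash|' "$f"
    done
}
```

Run it as root with `fix_shebangs /usr/local/sbin`.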

Finally, to get the hardware itself running, you will need to load some kernel modules. Depending on which hardware you run, you will need to make some changes to this list. Use modprobe to load things first, and then commit the necessary list to /etc/modules.

The following is the list of modules I used for ip over infiniband (ipoib) on a Mellanox ConnectX adapter.
ib_ipoib
ib_addr
ib_mad
ib_sa
ib_cm
ib_uverbs
ib_ucm
ib_umad
mlx4_ib
It is likely that you will need to replace mlx4_ib with another kernel module if you are not using recent Mellanox adapters.
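The "modprobe first, then commit to /etc/modules" routine can be sketched as a pair of small functions. Both names (persist_module and load_ib_modules) are made up for this post, and the module list is just the one above, so adjust it for your hardware. The point of the sketch is that a module only gets written to /etc/modules after modprobe succeeds, and it never gets written twice.

```shell
# persist_module MODULE [FILE]: hypothetical helper.
# Appends MODULE to FILE (default /etc/modules) unless it is already listed.
persist_module() {
    mod=$1
    file=${2:-/etc/modules}
    # -x: match the whole line, -F: literal string, -q: no output.
    grep -qxF "$mod" "$file" 2>/dev/null || printf '%s\n' "$mod" >> "$file"
}

# load_ib_modules: hypothetical helper, must run as root.
# Loads each module and records it only if the load succeeded.
load_ib_modules() {
    for m in ib_ipoib ib_addr ib_mad ib_sa ib_cm ib_uverbs ib_ucm ib_umad mlx4_ib; do
        if modprobe "$m"; then
            persist_module "$m"
        else
            echo "failed to load $m" >&2
            return 1
        fi
    done
}
```

Call `load_ib_modules` once as root. If a module fails to load, nothing gets committed for it, so a bad entry never ends up in /etc/modules.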

The final step is to configure the infiniband interface itself. In /etc/network/interfaces, add
auto ib0
iface ib0 inet static
       pre-up opensm -B
       address 10.x.x.x
       netmask 255.255.255.0
It is important to include the 'pre-up opensm -B' line because infiniband requires a subnet manager (the mesh manager, if you like) before any port will become active. Don't worry if every host has this line. Only one opensm acts as the master at a time. Any opensm started later detects the existing master and drops into standby rather than competing with it.
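A quick way to confirm that a subnet manager is doing its job is to look at the port state reported by ibstat (part of infiniband-diags). Ports sit in "Initializing" until a subnet manager has configured them, and then move to "Active". The helper below is a rough sketch, with a name I made up (count_active_ports), and it assumes the `State: Active` line format that ibstat prints for each port.

```shell
# count_active_ports: hypothetical helper.
# Reads ibstat-style text on stdin and prints the number of ports whose
# logical state is Active. Assumes each port has a line like "State: Active".
count_active_ports() {
    # Anchoring on leading whitespace skips the "Physical state:" lines.
    # grep -c exits non-zero when the count is 0, so mask that with true.
    grep -c '^[[:space:]]*State:[[:space:]]*Active' || true
}
```

Use it as `ibstat | count_active_ports`. A result of 0 with the cable plugged in usually means no opensm is running anywhere on the fabric.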

That's it. Have fun with your infiniband mesh!

Attribution: When I was originally in the horrifying position of setting up infiniband hardware from scratch, I used the lucubration blog as my guide. A lot of these instructions are derived from there.
