Homelab Updates Part 2

This is a continuation of the previous article on hardware updates to my homelab, focusing on the configuration of the major hardware components discussed.

No need for much of a preamble here; let's get into it.

Network Configuration

There are a bunch of advanced features to experiment with on this Cisco switch, but for now just getting it up and running should be good enough.

I factory reset it, enabled setup mode and hopped onto the console. After running through the initial setup program and assigning some static IP addresses and passwords, I confirmed I could connect with the new addresses and proceeded with the following configuration steps:

The switch and hypervisor are physically connected via a pair of SFP+ DAC cables. It would be criminal not to configure them for balanced failover. Fortunately, Cisco's EtherChannel and Illumos' Link Aggregation interoperate via LACP, so we'll be setting that up.

I set the following configuration on the switch:

#configure terminal
 interface Te1/0/1
  switchport mode trunk
  channel-group 1 mode active
 interface Te1/0/2
  switchport mode trunk
  channel-group 1 mode active
end

Since I had already configured link aggregation on the SmartOS side, I also confirmed this was functional on the switch side:

#show etherchannel 1 summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+--------------------------
1      Po1(SU)         LACP      Te1/0/1(P)  Te1/0/2(P)

Port-channel 1 is currently in use, and both 10GbE ports are bundled in it.
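
The same aggregation can be sanity-checked from the SmartOS side; dladm can report per-port LACP state and extended link status (a quick check, not something from my setup notes):

# dladm show-aggr -L
# dladm show-aggr -x

Each port should report that it is aggregatable, synchronized, collecting, and distributing.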

Virtual LAN Segments

This switch also supports VLANs, which a bunch of my pre-existing hardware can take advantage of: IP phones, wireless access points, and SmartOS itself. After roughly sketching out some ideas, I came to the following layout:

  • Infrastructure network (VLAN ID 1, IPv4 /27, IPv6 /64) dedicated virtual network for network and infrastructure management interfaces: switches, access points, IP phones, iDRACs, hypervisors.
  • Embedded network (VLAN ID 2, IPv4 /27, IPv6 /64) dedicated virtual network for embedded devices: Chromecasts, IoT devices.
  • Internal network (internal etherstub, IPv4 /27, IPv6 /64) dedicated etherstub for internally facing VMs and zones.
  • External network (external etherstub, IPv4 /27, IPv6 /64) dedicated etherstub for externally facing VMs and zones.
  • Private network (VLAN ID 3, IPv4 /27, IPv6 /64) dedicated virtual network for physically secured devices: workstations, laptops, etc.
  • Guest network (VLAN ID 4, IPv4 /27, IPv6 /64) dedicated virtual network for guest wireless devices.
  • Public network (VLAN ID 5, IPv4 via DHCP) dedicated virtual network for upstream network connectivity.

With this in mind, I set the following configuration on the switch:

#configure terminal
 vlan 2
  name embedded
 vlan 3
  name private
 vlan 4
  name guest
 vlan 5
  name public
end

And then I confirmed with the following command:

#show vlan

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
1    default                          active    Gi1/0/1, Gi1/0/2, Gi1/0/3
                                                Gi1/0/4, Gi1/0/5, Gi1/0/6
                                                Gi1/0/7, Gi1/0/8, Gi1/0/9
                                                Gi1/0/10, Gi1/0/11, Gi1/0/12
                                                Gi1/0/13, Gi1/0/14, Gi1/0/15
                                                Gi1/0/16, Gi1/0/17, Gi1/0/18
                                                Gi1/0/19, Gi1/0/20, Gi1/0/21
                                                Gi1/0/22, Gi1/0/23, Gi1/0/24
2    embedded                         active
3    private                          active
4    guest                            active
5    public                           active
1002 fddi-default                     act/unsup
1003 token-ring-default               act/unsup
1004 fddinet-default                  act/unsup
1005 trnet-default                    act/unsup

VLAN Type  SAID       MTU   Parent RingNo BridgeNo Stp  BrdgMode Trans1 Trans2
---- ----- ---------- ----- ------ ------ -------- ---- -------- ------ ------
1    enet  100001     1500  -      -      -        -    -        0      0
2    enet  100002     1500  -      -      -        -    -        0      0
3    enet  100003     1500  -      -      -        -    -        0      0
4    enet  100004     1500  -      -      -        -    -        0      0
5    enet  100005     1500  -      -      -        -    -        0      0
1002 fddi  101002     1500  -      -      -        -    -        0      0
1003 tr    101003     1500  -      -      -        -    -        0      0
1004 fdnet 101004     1500  -      -      -        ieee -        0      0
1005 trnet 101005     1500  -      -      -        ibm  -        0      0

Remote SPAN VLANs
------------------------------------------------------------------------------


Primary Secondary Type              Ports
------- --------- ----------------- ------------------------------------------
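
The switch now carries the tagged segments, but a zone only sees a VLAN once a VNIC is created over it on the SmartOS side. As a one-off illustration (the private0 VNIC name is made up; in practice vmadm creates these per zone from the vlan_id property in a VM's nics definition):

# dladm create-vnic -l aggr0 -v 3 private0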

Saving the Running Configuration

After ensuring that everything works as intended, the startup configuration needs to be overwritten with the current running configuration. This is done with the following:

#copy running-config startup-config

Conclusion

While Cisco IOS has quite the learning curve compared to what I'm used to in switch configuration, it wasn't at all unpleasant to work with once I slowed down and took the time required to understand it.

While the above configuration steps were the bare minimum to get this network up and running, there's a bunch more stuff of interest in that Cisco switch that I would like to dig into in the future.

But as usual, we'll save that for another day.

SmartOS Network Configuration

As I wanted my administrative interface to function over a link aggregation, I set the following in /usbkey/config:

# Aggregation from Intel 2x10GBE Interfaces (e2,e3)
aggr0_aggr=00:00:00:00:00:00,00:00:00:00:00:00
aggr0_lacp_mode=active

# Administrative Interface
admin_nic=aggr0
admin_ip=172.22.1.8
admin_netmask=255.255.255.224
admin_network=172.22.1.0
admin_gateway=172.22.1.1

# Additional Etherstubs
etherstub=external0,internal0

# Common Configuration
hostname=gz-1
dns_domain=ewellnet
dns_resolvers=172.22.1.1
ntp_conf_file=ntp.conf
root_authorized_keys_file=authorized_keys

Some brief highlights:

  • aggr0_aggr refers to the interfaces to use in the link aggregation by hardware address.
  • aggr0_lacp_mode ensures this side is actively participating in LACP, instead of passively waiting for another active party.
  • admin_nic sets the administrative interface, in this case, to the link aggregation. The rest of the admin_ parameters are as set by the SmartOS installation.
  • etherstub sets additional etherstubs to be configured upon boot; both appear in the sanity check below.
  • I'm setting my own custom ntp.conf so that I can use my global zone as a network time server across multiple subnets.
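
Once the box comes up with this configuration, the resulting data links are easy to check from the global zone (a quick look, nothing exhaustive):

# dladm show-aggr
# dladm show-etherstub
# ipadm show-addr

show-aggr should list aggr0 in active LACP mode, show-etherstub should list external0 and internal0, and show-addr should include the administrative address on the aggregation.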

/usbkey/config.inc/ntp.conf:

driftfile /var/ntp/ntp.drift
logfile /var/log/ntp.log

# Ignore all network traffic by default
restrict default ignore
restrict -6 default ignore

# Allow localhost to manage ntpd
restrict 127.0.0.1
restrict -6 ::1

# Allow servers to reply to our queries
restrict source nomodify noquery notrap

# Allow local subnets to query this server
restrict 172.22.0.0 mask 255.255.252.0 nomodify

# Time Servers
pool 0.smartos.pool.ntp.org burst iburst minpoll 4
pool 1.smartos.pool.ntp.org burst iburst minpoll 4
pool 2.smartos.pool.ntp.org burst iburst minpoll 4
pool 3.smartos.pool.ntp.org burst iburst minpoll 4
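
After the global zone picks up this file (restarting the NTP service with svcadm restart ntp should do it, short of a reboot), peer state can be spot-checked with:

# ntpq -p

Each pool server should eventually report sane offset and jitter values.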

SmartOS Zpool Configuration

After getting a hardware configuration together that worked for Illumos, I spent a few weeks testing various vdev configurations for performance and spatial efficiency. As the tests evolved over that time, I wasn't fully satisfied with the consistency of the methodology, and I will be rerunning those tests, both for my own information and to feature in a future article. Still, some pretty solid results shone through:

  • ZFS pools based on three five-drive RAIDZ vdevs significantly outperformed pools based on two eight-drive RAIDZ2 vdevs in terms of sequential read performance (17.9% faster) and storage efficiency (7%).
  • ZFS pools with special allocation class vdevs outperformed pools without them in terms of sequential read performance (18.3% faster).

The performance advantages of RAIDZ outweigh the resiliency advantages of RAIDZ2 in my case: this pool configuration also includes a hot spare, reducing the window of exposure to pool loss, and critical datasets are regularly replicated off-site.

The zones pool of the new server was manually created during SmartOS installation with the following command:

# zpool create \
  -o autotrim=on -O atime=off \
  -O checksum=edonr -O compression=lz4 \
  -O recordsize=1M -O special_small_blocks=128K \
  zones \
    raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
    raidz c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 \
    raidz c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 \
    spare c1t15d0 \
    special \
      mirror c3t1d0 c4t1d0 \
      mirror c5t1d0 c6t1d0

Parameters of note:

  • autotrim=on enables automatic TRIM for all TRIM-capable devices in the pool; in this case, the NVMe SSDs backing the special vdevs.
  • atime=off disables file access time updates, reducing metadata writes and improving throughput. This is a standard parameter used by SmartOS zones pools.
  • checksum=edonr uses the edonr checksum instead of fletcher for filesystem checksum calculations. I found during previous testing that, out of all the cryptographically strong checksum algorithms available to ZFS, edonr performed the best. This should be re-verified.
  • compression=lz4 explicitly uses lz4 for block compression, which is almost universally a good idea. This could also be set to compression=on and will use the ZFS default compression algorithm, which will attempt to balance compression speed with compression ratio.
  • recordsize=1M raises the maximum record size from 128K to 1M. This improves performance for large sequential file access by keeping RAIDZ stripes on individual leaf devices relatively large (36K-256K) and ensures that a range of file sizes fit on the normal vdevs instead of the special vdevs, thanks to the next parameter.
  • special_small_blocks=128K allows for data blocks up to 128K to also be stored on the NVMe SSDs instead of the hard drives. This should drastically improve random IO and overall throughput.

The combination of the last two parameters effectively creates a hybrid storage pool: all metadata and data blocks up to 128K go to the NVMe special vdevs, while blocks between 128K and 1M land on the hard drives. By adjusting recordsize and/or special_small_blocks, different storage properties can be achieved for different datasets.
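
For example, a dataset meant to live entirely on the SSDs could raise special_small_blocks to match its record size (zones/fast is a hypothetical dataset name, not part of my actual layout):

# zfs create -o recordsize=128K -o special_small_blocks=128K zones/fast

With the two properties equal, every block in the dataset qualifies as a small block and lands on the special vdevs.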

Cache only metadata for swap zvol

I've never liked the idea of swap pages being cached in the ARC, which happens by default in SmartOS. Fortunately, it's easy to switch that behavior on and off at any time:

# zfs set primarycache=metadata zones/swap

The above command ensures that only metadata for zones/swap makes its way into the ARC, preserving it for normal file access. I can't really foresee a case where I'd want to reverse this (perhaps with L2ARC devices installed in this pool?), but either way it's trivial to revert to the normal behavior with:

# zfs inherit primarycache zones/swap
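
Either state is easy to confirm at a glance:

# zfs get primarycache zones/swap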

Calming the Dragon (fans)

It was a bit of a surprise when I started this server up for the first time after adding a non-Dell-certified PCIe card. If that sounds like a bit of a shakedown, it is. It also sounded like the building was about to take off. As many before me have, I discovered that Dell is very conservative when it comes to cooling devices whose temperatures its firmware can't monitor. And by conservative, I mean liberal with the cooling.

Some people solve this problem by completely disabling the automatic thermal profiles and manually stepping up and down the fan speed via ipmitool and some cron scripts that run every minute.

That struck me as a horrible idea.

It would be so much better to let the dedicated firmware that monitors system temperature keep tracking component temperatures and adjusting airflow to correct for them. If only there were a way to tell it not to worry about those PCIe devices behind the curtain.

Fortunately, at least someone at Dell agrees with me.

It turns out you can disable the third-party PCIe cooling response, preventing it from loudly complaining about additional PCIe devices in the system. The only utility required to change this behavior is ipmitool, which is already part of the SmartOS global zone.

To check the current cooling response status, run the following command:

# ipmitool raw 0x30 0xCE 0x01 0x16 0x05 0x00 0x00 0x00

The following response means the third-party cooling response is disabled. Quiet.

 16 05 00 00 00 05 00 01 00 00

The following response means the third-party cooling response is enabled. Loud.

 16 05 00 00 00 05 00 00 00 00

To disable the third-party cooling response, run the following command:

# ipmitool raw 0x30 0xCE 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x01 0x00 0x00

To enable the third-party cooling response, run the following command:

# ipmitool raw 0x30 0xCE 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x00

This setting appears to persist across power cycles, but if you find yourself needing to run cards that run hot (like NV1604s), it would be wise to re-enable the default behavior to avoid a potential fire hazard.
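
If you want to watch the effect of toggling this, the fan sensor readings are available through the same tool, which makes it easy to compare speeds before and after:

# ipmitool sdr type Fan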