• Yote.zip@pawb.socialOP
    1 year ago

    Unfortunately I don’t have a current hyperfixation at the moment. I’m wrapping up a contract at work and it requires babysitting at all hours of the day. I’m looking to pick up a new video game soon hopefully.

    My last hyperfixation was stuffing more hardware into my NAS and it went well. I put a 40gbps Infiniband network card (Mellanox Connect X-3) in it and my desktop, then connected them point to point. The cards were like $20 each which is wild, but they’re not exactly plug-and-play. The Infiniband cable that connects them was $30, which was silly compared to the card price.

    Infiniband is very interesting and weird. Normally you would connect machines over Ethernet, but Infiniband requires its own software stack and application support. It does come with IPoIB (IP over InfiniBand), which lets normal IP applications run over the link at the cost of losing the high-performance native Infiniband stack. The main Infiniband-native thing I use is my NFS server for file transfers.
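    If anyone wants to replicate the point-to-point side, here is a minimal sketch of what bringing up IPoIB looks like (interface names and addresses are just examples, and a subnet manager like opensm has to be running on one end of the link):

    # Load the IPoIB driver and give the IB interface a static address.
    # ibp10s0 and the 10.10.10.0/30 subnet are placeholders for your own setup.
    modprobe ib_ipoib
    ip addr add 10.10.10.1/30 dev ibp10s0    # use 10.10.10.2/30 on the other machine
    ip link set ibp10s0 up
    ibstat                                   # port should report State: Active once opensm is running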

    I combine all this with ZFS, 64GB of RAM (I would love 128GB, but that would mean buying 4x32GB sticks and I only had 16GB sticks lying around), and a 1TB NVMe L2ARC cache to keep my most-used files in fast flash memory. That lets me get the full 40gbps read speed almost all the time, even though my spinning rust collection is 54TB usable (108TB raw).
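    For anyone copying this, attaching the L2ARC device is a one-liner; the pool and device names below are placeholders:

    # Add an NVMe drive as an L2ARC read cache for the pool
    zpool add tank cache /dev/disk/by-id/nvme-EXAMPLE_1TB
    zpool iostat -v tank    # the device shows up under a "cache" section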

    I also stuck a couple of $35 58GB Intel Optane sticks in to use as a mirrored boot drive, and partitioned 25GB off each of them for a SLOG device (which acts sort of like a write cache for synchronous writes), letting me write to the pool extremely quickly without waiting for the data to land on the spinning disks. On very large file transfers this benefit is diminished because ZFS still has to flush everything to disk every 5 seconds, but for 99% of transfers I get the full 40gbps and low latency. Intel Optane sticks or similar are mandatory for a SLOG device because normal SSDs will wear out very quickly in this role.
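    The SLOG is just as easy to attach once the Optane sticks are partitioned (a sketch with placeholder device names; whether to mirror the log device is up to you):

    # Add a mirrored SLOG built from two small Optane partitions
    zpool add tank log mirror \
        /dev/disk/by-id/nvme-OPTANE_A-part2 \
        /dev/disk/by-id/nvme-OPTANE_B-part2
    # Note: the SLOG only absorbs synchronous writes; async writes bypass it entirely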

    The speed and latency are nice. All said it probably only cost me like $150 for the upgrades, but I already had a functional NAS beforehand.

    • AVincentInSpace@pawb.social
      1 year ago

      OH MY GOODNESS WE HAVE THE SAME INTERESTS. I tried the exact same InfiniBand thing a while back only to realize the cards I bought were either duds or needed some weird server something to make them work – neither would show up in lspci, and I wasn’t sure how to even begin to diagnose that. I also read online that the throughput of IPoIB was like 7Gbps – sounds like that’s not true?

      Also, holy cats! 108 TERABYTES of spinning rust in RAID10? How many hard drives is that? Do you actually have a Storinator in your living room? What do you DO with it?

      Also, do you have any cool tips for working with zfs? I’ve been screwing around a bit with TrueNAS lately and it’s been a real pain in the rear. Apparently ZFS remembers the last machine it was mounted from and gets mad if you try to mount it from a different one, plus there’s the problem of it being impossible to change the number of drives in an array without creating an entirely new array (have they fixed that yet?). I’ve been wanting to use btrfs instead but 1) slow and 2) the internet is filled with horror stories about btrfs raid.

      • Yote.zip@pawb.socialOP
        1 year ago

        neither would show up in lspci, and I wasn’t sure how to even begin to diagnose that. I also read online that the throughput of IPoIB was like 7Gbps – sounds like that’s not true?

        I believe IPoIB on my specific Mellanox Connect X-3 is limited to 10gbps, while native Infiniband connections can go to 40gbps. It probably depends on the network card itself. All of my high-bandwidth use cases go through NFS, and anything else (e.g. media streaming) doesn’t need more than 10gbps. I run Proxmox and Debian stuff, and I was able to get everything working by installing the rdma-core and opensm packages. I have the following in my root crontab to switch the IB ports over to “connected” mode, which allows a higher MTU and is faster:

        # Put both IB interfaces into "connected" mode and raise their MTU at boot
        @reboot echo connected > /sys/class/net/ibp10s0/mode
        @reboot echo connected > /sys/class/net/ibp10s0d1/mode
        @reboot /usr/sbin/ifconfig ibp10s0 mtu 65520
        @reboot /usr/sbin/ifconfig ibp10s0d1 mtu 65520
        

        I also use @reboot echo rdma 20049 > /proc/fs/nfsd/portlist in that crontab to let the NFS server accept RDMA connections for Infiniband communication. It was really tough to figure out a lot of the Infiniband stuff until I found this manual, after which everything just worked like it should. Overall I would prefer equivalent Ethernet hardware if given the choice, but Infiniband gear is dirt cheap and it’s hard to argue with $20 for 40gbps.
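        On the client side, the RDMA mount looks something like this (server name, export path, and mountpoint are examples):

        # Mount the NFS export over RDMA on port 20049
        mount -t nfs -o proto=rdma,port=20049 nas-ib:/tank/share /mnt/share
        nfsstat -m    # the mount should list proto=rdma if it worked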

        Also, holy cats! 108 TERABYTES of spinning rust in RAID10? How many hard drives is that? Do you actually have a Storinator in your living room? What do you DO with it?

        It’s 6x18TB drives, and they fit easily in a Node 804. There’s still plenty of room for growth in that case, and I just run cheap consumer hardware in it. I store “Linux ISOs” on it as well as any and all data from my life. I’m pretty loaded IRL, so I figure if something is worth archiving it’s worth archiving right, and I don’t mind keeping the highest-quality version of anything I like.

        Also, do you have any cool tips for working with zfs?

        Yeah, ZFS is quite easy to work with once you get a compatible kernel. TrueNAS is a dead simple way to interface with ZFS, though I wouldn’t recommend it as the only thing on your NAS because it’s very inflexible about running non-TrueNAS-approved use cases, Docker, etc. Personally I would recommend running Proxmox with a minimal TrueNAS VM inside it, or just skipping TrueNAS entirely and letting Proxmox manage your ZFS pool. You can install Cockpit + this plugin for a simple GUI ZFS manager that does most of the important stuff, without needing to run a full TrueNAS VM. If you’re still new to ZFS I would stick with TrueNAS though, since it will hold your hand while you learn. Once you understand ZFS better you can drop it if it’s getting in your way.
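        If you do let Proxmox (or any plain Linux box) manage the pool, it’s all just stock OpenZFS commands anyway; a few of the day-to-day ones, with placeholder pool/dataset names:

        zfs create -o compression=lz4 tank/media    # datasets are cheap, make one per use case
        zfs set atime=off tank                      # small win for read-heavy workloads
        zpool scrub tank                            # worth scheduling roughly monthly
        zpool status -v tank                        # health, errors, scrub progress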

        Apparently ZFS remembers the last machine it was mounted from and gets mad if you try to mount it from a different one

        This shouldn’t be the case - you may need to configure your pool to use disk IDs (/dev/disk/by-id, which are portable between machines) instead of generic block device names like /dev/sdX (which aren’t even guaranteed to stay the same between reboots). A ZFS pool should move between machines with no fuss.
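        The usual dance when moving a pool between machines looks like this (pool name is a placeholder):

        zpool export tank                        # on the old machine, for a clean handoff
        zpool import -d /dev/disk/by-id tank     # on the new machine, using stable by-id names
        zpool import -d /dev/disk/by-id -f tank  # force it if the old machine died without exporting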

        the problem of it being impossible to change the number of drives in an array without creating an entirely new array (have they fixed that yet?)

        Yes, this is a very annoying problem, and it’s the main reason I use ZFS’s equivalent of RAID10: mirror vdevs. Mirrors in ZFS are much more flexible than RAIDZ, and that flexibility extends to some surprisingly random things. For example, a SLOG device can be freely detached from a zpool consisting of mirrors, but not one consisting of RAIDZ. Mirror drives can be added in pairs at any time, which means I can add a couple of drives of any size whenever I feel like it - that works well for larger disk sizes in the future and for random drives that go on sale. RAIDZ’s mythical future is RAIDZ expansion, which would let you grow a RAIDZ array from e.g. 4 disks to 5 disks without destroying and recreating it. That future is a reality in the sense that the code has already been merged; it’s just waiting to get baked into an OpenZFS release.
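        Concretely, growing a pool of mirrors looks like this (disk names are placeholders):

        # Start with two mirrored pairs (ZFS's RAID10 equivalent)
        zpool create tank \
            mirror /dev/disk/by-id/ata-DISK_A /dev/disk/by-id/ata-DISK_B \
            mirror /dev/disk/by-id/ata-DISK_C /dev/disk/by-id/ata-DISK_D
        # Later: bolt on another pair of any size without touching the existing vdevs
        zpool add tank mirror /dev/disk/by-id/ata-DISK_E /dev/disk/by-id/ata-DISK_F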

        I’ve been wanting to use btrfs instead but 1) slow and 2) the internet is filled with horror stories about btrfs raid

        BTRFS parity RAID (RAID5/6) is a non-starter in that it’s still marked “unstable” by the developers and can cause data loss. However, you can use MergerFS+SnapRAID for the RAID logic and back that setup with individual BTRFS drives that don’t know they’re part of a RAID. MergerFS is an overlay filesystem that combines any number of drives so they appear as a single drive, no matter what filesystem each drive uses. When you write to a MergerFS pool, whole files transparently land on individual disks instead of being striped, and the policy it uses to distribute files can be changed to other methods as well. SnapRAID is a data redundancy tool that calculates parity for any number of data drives onto 1-6 dedicated parity drives, which lets you restore the data in the event of a failure. MergerFS and SnapRAID are almost always used together in order to give a traditional RAID-like experience.
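        Roughly what that looks like in practice - the paths, drive names, and options below are just examples, so check the MergerFS and SnapRAID docs for what fits your setup:

        # /etc/fstab: pool three data drives into one mount with MergerFS
        /mnt/disk1:/mnt/disk2:/mnt/disk3  /mnt/storage  fuse.mergerfs  cache.files=off,category.create=mfs,moveonenospc=true,fsname=mergerfs  0 0

        # /etc/snapraid.conf: one parity drive protecting the three data drives
        parity /mnt/parity1/snapraid.parity
        content /var/snapraid.content
        content /mnt/disk1/snapraid.content
        data d1 /mnt/disk1/
        data d2 /mnt/disk2/
        data d3 /mnt/disk3/

        # then run periodically (cron or a systemd timer):
        #   snapraid sync     # update parity after adding/changing files
        #   snapraid scrub    # verify data against parity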

        This solution isn’t quite as fancy as ZFS, but it would still be my recommendation for ad-hoc budget setups where you have mismatched drives and random hardware, because SnapRAID doesn’t care about drive size uniformity (unlike ZFS). You dedicate your largest 1-2 disks to parity and then throw as many data drives as you want at it. The drives don’t work in tandem, so read speed is just the speed of whichever disk holds the file. BTRFS is a great filesystem as long as you avoid its parity RAID modes, and its speed is probably equivalent to ZFS or maybe even faster. However, in normal usage ZFS cheats a lot with its smart ARC cache and other tricks to make common disk activity much faster than the disks themselves.