Originally written in Russian on: 2021-01-12.
Translated to English on: 2022-11-23.
This article explains FreeBSD installation without sysinstall,
allowing fine tuning and showing how simple it all actually is. Note
that this reflects my personal experience only. It is a collection of
notes.
Using ZFS on the bare disk is trivial: create a ZFS pool on the disk and copy the zfsboot loader into the reserved empty space inside the pool:
zpool create zroot ada0
dd if=/boot/zfsboot of=/dev/ada0 count=1
dd if=/boot/zfsboot of=/dev/ada0 iseek=1 oseek=1024
Easy, beautiful, minimalistic and working, but I do not recommend it, because at the very least you have to place your swap inside a zvol, which adds considerable overhead.
If you want to create swap that way anyway, then I recommend the following options for the zvol:
volblocksize=4k         # amd64's native page size
sync=always             # always write data to the disk, without keeping in RAM
logbias=throughput      # place data on the disk directly and immediately,
                        # without intermediate ZIL
primarycache=metadata   # keep only metadata in the ARC cache
Create the GPT scheme and partitions for the bootloader, swap and root filesystem, everything with explicit labels. Use 4K alignment, because all modern drives have 4K physical sectors and unaligned access can drastically decrease performance.
gpart create -s GPT diskid/DISK-SERIAL
gpart add -t freebsd-boot -a 4K -s 512K -l SERIAL-BOOT diskid/DISK-SERIAL
gpart add -t freebsd-swap -s 2G -l SERIAL-SWAP diskid/DISK-SERIAL
gpart add -t freebsd-zfs -l SERIAL-ROOT diskid/DISK-SERIAL
Install MBR and ZFS bootloaders.
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 diskid/DISK-SERIAL
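The 4K alignment requirement comes down to simple arithmetic on the partition's starting sector. Below is a minimal sketch, assuming the start offset is given in 512-byte logical sectors (which is how it is usually displayed for such drives); the helper name is_4k_aligned and the sample offsets are hypothetical.

```shell
# Hypothetical helper: succeeds if a partition starting at the given LBA
# (counted in 512-byte logical sectors, an assumption of this sketch)
# begins on a 4096-byte boundary.
is_4k_aligned() {
    start_lba=$1
    [ $(( start_lba * 512 % 4096 )) -eq 0 ]
}

# Sample offsets, for illustration only.
for lba in 34 40 1064; do
    if is_4k_aligned "$lba"; then
        echo "LBA $lba: aligned"
    else
        echo "LBA $lba: NOT aligned"
    fi
done
```

An offset is aligned exactly when it is a multiple of 8 such sectors, so 34 fails the check while 40 and 1064 pass.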
FreeBSD includes a ready-made FAT32 partition image with the EFI bootloader already installed in it. The only difference from the previous section is the bootloader installation:
gpart create -s GPT diskid/DISK-SERIAL
gpart add -t efi -s 800K diskid/DISK-SERIAL
gpart add -t freebsd-swap -a 4K -s 2G -l SERIAL-SWAP diskid/DISK-SERIAL
gpart add -t freebsd-zfs -l SERIAL-ROOT diskid/DISK-SERIAL
gpart bootcode -p /boot/boot1.efifat -i 1 diskid/DISK-SERIAL
If there is no boot1.efifat, then you can create it manually:
gpart add -t efi -s XXX diskid/DISK-SERIAL
newfs_msdos [-c 1] /dev/diskid/DISK-SERIALp1
mount -t msdosfs /dev/diskid/DISK-SERIALp1 /mnt
mkdir -p /mnt/EFI/BOOT
cp /boot/loader.efi /mnt/EFI/BOOT/BOOTx64.efi
umount /mnt
If you want to use UFS2 instead of ZFS for some reason, then replace
freebsd-zfs with freebsd-ufs, and the bootloader with gptboot.
sysctl vfs.zfs.min_auto_ashift=12
zpool create zroot gpt/SERIAL-ROOT
zfs set checksum=sha256 compression=lz4 atime=off mountpoint=/mnt zroot
If you want to create a mirror, then specify that here:
zpool create zroot mirror gpt/SERIAL1-ROOT gpt/SERIAL2-ROOT
I always recommend using disks with explicit labels, rather than unstable and fragile enumeration like adaX or anything relying on disk "geography". I am not aware of any problems with diskid/XXX usage, where XXX is, for example, a serial number.
You can use the glabel command to create those labels; it records the
label at the end of the disk and makes it available under the
label/XXX path. But this is a BSD-specific solution. A GPT partition
will be visible and recognized by any GPT-aware OS. Moreover, with GPT
you can also align your partitions to 4K sectors. Some people also
make the last partition slightly smaller (1GB less, for example) to
work around possible problems with drives from different vendors,
which can easily have a slightly different actual number of sectors.
GPT helps with all of those issues.
Solaris documentation recommends giving the whole disk to ZFS, but that applies to that OS only, because only in that case does ZFS there take control of the drive's write cache. FreeBSD does not have that problem, and you can give partitions to ZFS without any difference.
So I strongly advise using GPT-labeled, aligned partitions:
gpart create -s GPT diskid/SERIAL1
gpart create -s GPT diskid/SERIAL2
gpart add -t freebsd-zfs -a 4K -l SERIAL1-STORAGE diskid/SERIAL1
gpart add -t freebsd-zfs -a 4K -l SERIAL2-STORAGE diskid/SERIAL2
sysctl vfs.zfs.min_auto_ashift=12
zpool create storage gpt/SERIAL1-STORAGE gpt/SERIAL2-STORAGE
I set the LZ4 compression algorithm on the zroot pool right away. There are few cases where compression hurts or is useless. There is no reason not to use fast transparent compression on the root filesystem: most of its data is compressible, so less has to be read from and written to the disk, which gives a considerable real performance boost.
atime is disabled, because hardly anyone has met a case where it is
useful, and it can cause considerable overhead. If some software needs
honest atime behaviour, then create a separate dataset for it, with
atime enabled.
I always use only cryptographically secure hash functions for
checksums. For all datasets I prefer either skein or sha512: the first
is much faster than the others, and the second is faster than sha256
on 64-bit systems. Unfortunately, FreeBSD’s loader supports only
sha256, so the root dataset has to use sha256. With cryptographic
hashes you can also enable the deduplication feature in the future.
It is critical to use the proper ashift. Its value is the power of two that gives the disk’s sector size: ashift=12 means 2^12 = 4096 bytes. You can set it only during initial pool creation; it is immutable after that. The problem with ashift is that modern hard drives still like to lie about their real physical sector size, reporting 512 bytes.
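The relation between a sector size and its ashift can be sketched in plain shell arithmetic. This is only an illustration: the function name ashift_for is hypothetical and is not part of any ZFS tool.

```shell
# Hypothetical helper: print the ashift (base-2 logarithm) for a sector
# size given in bytes; fail if the size is not a power of two.
ashift_for() {
    size=$1
    shift_val=0
    [ "$size" -ge 1 ] || return 1
    while [ "$size" -gt 1 ]; do
        [ $(( size % 2 )) -eq 0 ] || return 1
        size=$(( size / 2 ))
        shift_val=$(( shift_val + 1 ))
    done
    echo "$shift_val"
}

ashift_for 512    # prints 9
ashift_for 4096   # prints 12
```

A drive that reports 512-byte sectors while really using 4K ones would therefore end up with ashift=9 unless the minimum is forced to 12.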
If you plan to use encrypted GELI drives/partitions, then do not
forget about GELI’s sector size too! The larger the sector, the fewer
sectors you have to process and the lower the CPU burden of key
generation (8 times fewer sectors with 4K than with 512-byte ones):
geli init -s 4K ...
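The factor of eight follows directly from the sector counts. A small sketch, assuming an example provider of 1 GiB; the helper name sectors_in is hypothetical:

```shell
# Hypothetical helper: how many sectors a provider of the given size
# (bytes) is split into for a given sector size (bytes).
sectors_in() {
    echo $(( $1 / $2 ))
}

size=$(( 1024 * 1024 * 1024 ))   # example only: a 1 GiB provider
small=$(sectors_in "$size" 512)
large=$(sectors_in "$size" 4096)
echo "512-byte sectors: $small"
echo "4K sectors:       $large"
echo "ratio:            $(( small / large ))"
```

The ratio does not depend on the provider size, as long as the size is a multiple of 4096 bytes.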
If you want to use UFS2 (for a non-critical low-power system), then I
would create it with newfs -Ut [-E] /dev/gpt/SERIAL-ROOT.
Soft updates considerably increase performance. TRIM is a must for
SSDs (ZFS does TRIM automatically even when used on top of GELI).
for what in base kernel [src ports lib32 kernel-dbg tests] ; do
    tar xfC /usr/freebsd-dist/$what.txz /mnt
done
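The same loop can be wrapped so that it stops at the first missing archive and can be previewed before anything is unpacked. This is a sketch under stated assumptions: the function name extract_sets and the DRY_RUN variable are hypothetical, not part of FreeBSD.

```shell
# Hypothetical wrapper: extract the named distribution sets into a target
# directory, stopping at the first missing archive. With DRY_RUN=1 the
# tar commands are only printed, nothing is unpacked.
extract_sets() {
    dist=$1
    target=$2
    shift 2
    for what in "$@"; do
        archive="$dist/$what.txz"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "tar xfC $archive $target"
            continue
        fi
        [ -f "$archive" ] || { echo "missing: $archive" >&2; return 1; }
        tar xfC "$archive" "$target" || return 1
    done
}

# Preview only: prints the two tar commands for base and kernel.
DRY_RUN=1 extract_sets /usr/freebsd-dist /mnt base kernel
```

Optional sets (src, ports and so on) are simply added as further arguments.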
Everything below is done inside the mounted /mnt:
chroot /mnt
# cat > /boot/loader.conf <<EOF
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot"
# HyperThreading sucks at security and stability/reliability
machdep.hyperthreading_allowed=0
# Meltdown/Spectre mitigation disable
vm.pmap.pti=0
hw.ibrs_disable=1
# XSAVEOPT can work unreliably on some amd64 CPUs
hw.cpu_stdext_disable=0x1
aesni_load="YES" # GELI will use hardware acceleration after that automatically
# Just to remember that ipfw by default drops everything and you can
# lose control of the server after you enable it
#net.inet.ip.fw.default_to_accept=0
EOF
# cat > /etc/fstab <<EOF
tmpfs                     /tmp     tmpfs    rw,nosuid,mode=1777  0 0
fdescfs                   /dev/fd  fdescfs  rw                   0 0
proc                      /proc    procfs   rw                   0 0
/dev/gpt/SERIAL-SWAP.eli  none     swap     sw                   0 0
EOF
It is crucial to specify .eli at the end of the path to the swap volume – that way it will be encrypted with a temporary one-time key.
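Since a forgotten .eli suffix silently leaves swap unencrypted, a quick check of the file can help. A minimal sketch, run here against a temporary sample rather than the real /etc/fstab; the function name unencrypted_swap and the OTHER-SWAP label are hypothetical.

```shell
# Hypothetical helper: print every swap device from an fstab-style file
# whose device path does not end in ".eli". Comment lines are skipped.
unencrypted_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" && $1 !~ /\.eli$/ { print $1 }' "$1"
}

# Demonstration on a temporary sample file, not on the real /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<EOF
/dev/gpt/SERIAL-SWAP.eli none swap sw 0 0
/dev/gpt/OTHER-SWAP      none swap sw 0 0
EOF
unencrypted_swap "$sample"   # prints /dev/gpt/OTHER-SWAP
rm -f "$sample"
```

Empty output means every swap entry carries the suffix.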
sysctl tuning
# cat >> /etc/sysctl.conf <<EOF
kern.msgbuf_show_timestamp=1 # timestamps in dmesg output
kern.cam.ada.write_cache=0 # with ZFS you do not want any write buffering
security.bsd.stack_guard_page=1
security.bsd.see_other_uids=0
security.bsd.unprivileged_idprio=1 # personally I often use idprio
kern.randompid=1
net.inet.ip.random_id=1
security.bsd.hardlink_check_gid=1
security.bsd.hardlink_check_uid=1
#security.jail.allow_raw_sockets=1 # if you want to use ping inside jails
net.inet.tcp.drop_synfin=1
net.inet6.ip6.use_tempaddr=0 # do not randomize link-local IPv6
                             # addresses, but use MAC-based deterministic
                             # ones for convenience
net.inet.tcp.tso=0 # TSO generally hurts on routers, but
                   # I disable it everywhere just for sure
kern.elf64.allow_wx=1 # W^X
# Turn off ability to compress IPsec traffic and allow its processing by
# firewall
net.inet.ipcomp.ipcomp_enable=0
net.inet.ipsec.filtertunnel=1
net.inet6.ipsec6.filtertunnel=1
# Users are allowed to create TAP interfaces. Make them up, when opened
#net.link.tap.user_open=1
#net.link.tap.up_on_open=1
#vfs.usermount=1 # Allow user to do mounts
vfs.nfsd.server_min_nfsvers=4 # Force only NFSv4 usage, just for sure
#kern.sched.preempt_thresh=200 # Should be better for interactive desktop
EOF
# cat > /etc/rc.conf <<EOF
hostname="stargrave.org"
zfs_enable="YES"
clear_tmp_enable="YES" # actually does not make any sense if we have got tmpfs
ntpd_enable="YES"
# chronyd_enable="YES"
sshd_rsa_enable=no
sshd_dsa_enable=no
sshd_ecdsa_enable=no
sshd_ed25519_enable=yes
sshd_enable="YES"
# Prioritize IPv6 addresses got from DNS
ip6addrctl_enable="YES"
ip6addrctl_policy="ipv6_prefer"
ipv6_ipv4mapping="YES" # it is convenient to have [::] IPv6-listening
                       # daemons to be automatically available also on IPv4
nfsv4_server_enable="YES"
# I always use Postfix instead of sendmail
postfix_enable="YES"
sendmail_enable="NONE"
firewall_enable="YES"
firewall_script="/etc/ipfw.rules"
# Be sure IPv6 link-local addresses are up and all hardware offloading
# is off, that may hurt in many cases on routers
ifconfig_XXX="-txcsum -txcsum6 -rxcsum -rxcsum6 -tso -lro -vlanhwtso up"
ifconfig_XXX_ipv6="inet6 -ifdisabled"
EOF
tzsetup /usr/share/zoneinfo/Europe/Moscow
ipfw firewall setup
# cat > /etc/ipfw.rules <<EOF
#!/bin/sh -x
ipfw -f flush
ipfw -f table all destroy
ipfw zero
ipfw disable one_pass
add="ipfw add"
\$add deny all from any to any frag
\$add allow { icmp or icmp6 } from any to any keep-state
\$add allow esp from any to any keep-state
\$add allow udp from any to any isakmp keep-state
\$add allow all from any to any via lo0
\$add deny all from any to any not verrevpath in via XXX
\$add check-state
\$add deny tcp from any to any established via XXX
\$add allow tcp from any to me ssh keep-state
\$add allow all from me to any out keep-state
#\$add deny log all from any to any
EOF
# chmod 600 /etc/ipfw.rules
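The backslashes before $add in the rules above matter: the here-document delimiter is unquoted, so the shell would otherwise expand $add while writing the file instead of leaving it for the script to expand at run time. A minimal sketch of the difference; the function name demo_heredoc and its sample text are only for illustration.

```shell
# Hypothetical demo: inside an unquoted here-document, "\$add" is written
# out literally, while a bare "$add" is expanded immediately.
demo_heredoc() {
    add="expanded now"
    cat <<EOF
\$add stays literal
$add
EOF
}

demo_heredoc
```

The first output line keeps the text $add as is; the second shows the value the variable had when the here-document was read.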
sed -i.tmp "/periodic/ s/^/#/" /etc/crontab
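The sed command above prefixes every crontab line that mentions periodic with a comment character and keeps a backup copy with the .tmp suffix. Its effect can be previewed on sample text first; the function name comment_periodic and the sample lines below are illustrative and are not the actual contents of /etc/crontab.

```shell
# Same sed expression as above, applied to text on stdin instead of
# editing /etc/crontab in place.
comment_periodic() {
    sed '/periodic/ s/^/#/'
}

# Illustrative input only.
comment_periodic <<'EOF'
*/5 * * * * root /usr/libexec/atrun
1 3 * * * root periodic daily
EOF
```

Only the second sample line gets the leading #; lines without the word periodic pass through unchanged.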
echo server time.stargrave.org iburst >> /etc/ntp.conf
If chrony is used (which I recommend):
% cat > /usr/local/etc/chrony.conf <<EOF
server time.stargrave.org iburst
driftfile /var/db/chrony/drift
mailonchange stargrave@stargrave.org 0.5
makestep 1 3
EOF
echo V4: / > /etc/exports
zfs umount zroot
zfs set mountpoint=none zroot
zpool export zroot
reboot
You can install Postfix and configure the host to use the relay:
# cat >> /usr/local/etc/postfix/main.cf <<EOF
inet_interfaces = loopback-only
mynetworks_style = host
relayhost = [gw.stargrave.org]
EOF