AuriStor File System News

17 August 2024 - AuriStorFS v2021.05-44 Released

The AuriStorFS v2021.05-44 release is an update for all systems.

v2021.05-44 (17 August 2024)

Support added for Linux 6.11 kernels; Red Hat Enterprise Linux 7 ELS; and Red Hat Enterprise Linux 9.3/9.4 RT kernels.

Important Bug fixes:

Fileservers - This release is important because it fixes two salvageserver bugs which can result in data loss in damaged volumes.

All Cache Managers - This release fixes several bugs that could result in an operating system panic.

Linux Kernel Modules - Kernel modules built from prior releases of AuriStorFS included object files compiled from assembly code that the Linux objtool could not process. As a result, objtool silently quit without fully post-processing the AuriStorFS kernel modules. Linux 6.9 kernels began generating a warning when an improperly post-processed kernel module was loaded. Failure to properly post-process a kernel module can result in the module being susceptible to side channel attacks.

Important notes regarding Ubuntu 22.04 LTS and 24.04 LTS AppArmor and AuriStorFS clients:

The version of the AppArmor Linux Security Module shipped as part of Ubuntu 22.04 LTS and 24.04 LTS is not the same as the version included in Linus' upstream repository. AuriStor has received reports from end users of system panics triggered by AppArmor denying the AuriStorFS cache manager access to the contents of the AuriStorFS disk cache. All access to the AuriStorFS disk cache files is performed using "root" credentials stored when the disk cache is initialized at startup. The AuriStorFS kernel module does not expect that its ability to read or write to the disk cache files will be prevented. Prior to the v2021.05-44 release, any failure to read or write the disk cache with "root" credentials would result in a system panic. The v2021.05-44 release has restructured the kernel module to permit syscalls to be failed when a Linux Security Module blocks the attempt to open, read or write a cache file. The error code used to reject the access to the disk cache will be returned to the calling process and the access to /afs will be denied.

26 June 2024 - AuriStorFS v2021.05-41 Released

The AuriStorFS v2021.05-41 release is an update for all systems.

v2021.05-41 (26 June 2024)

Support added for Linux 6.10 kernels and macOS 15 (Sequoia) beta.

IPv6 calls are now more resilient to transient routing errors. Although RFC1122 states that ICMP6_DST_UNREACH_NOROUTE, ICMP6_DST_UNREACH_BEYONDSCOPE, and ICMP6_DST_UNREACH_ADDR messages are to be considered fatal, in practice they are often transient. The Linux kernel has considered these messages to be transient for many years. An in-flight call will ignore these messages and terminate due to the configured timeout if traffic cannot be delivered.

Cache managers can now detect the deletion of a volume as well as handle the reuse of a volume name by a volume with a distinct volumeId. When a volume deletion is detected all volume mount points are invalidated.

Cache manager failover to alternative read-only volume replicas has been improved.

The file and volume management services have improved logging of ICMP and ICMPv6 errors, including the reason code and offending endpoint.

An improved work-around is provided for a bug in MIT Kerberos credential management which resulted in multiple afs/cell@REALM service tickets being cached.

A bug in the command parser which restricted the number of included configuration files to ten has been fixed. There is no limit on the number of included configuration files. The limit is on the depth of the inclusion.
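The difference between limiting the number of included files and limiting the inclusion depth can be sketched as follows. The `include` directive syntax, the `MAX_INCLUDE_DEPTH` value and the `load_config` helper are assumptions made for illustration only; they are not the AuriStorFS parser or its actual limit.

```python
from pathlib import Path

MAX_INCLUDE_DEPTH = 10  # hypothetical value, chosen only for this sketch


def load_config(path, depth=0):
    """Return the non-include lines of `path` and of every file it includes.

    Only the nesting depth is bounded; a single file may include any
    number of other files.
    """
    if depth > MAX_INCLUDE_DEPTH:
        raise RecursionError(f"include depth exceeds {MAX_INCLUDE_DEPTH}: {path}")
    lines = []
    base = Path(path)
    for raw in base.read_text().splitlines():
        line = raw.strip()
        if line.startswith("include "):  # hypothetical directive syntax
            child = base.parent / line.split(None, 1)[1]
            lines.extend(load_config(child, depth + 1))
        elif line:
            lines.append(line)
    return lines
```

In this sketch a top-level file that includes twenty-five siblings loads without complaint, while a chain of files that each include the next is rejected once the nesting exceeds the depth bound.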

On Linux distributions with systemd, the systemd out-of-memory killer is disabled for AuriStorFS services.

The voldump tool has been re-implemented. The new version shares the same dump generation engine as the volserver. New -config and -logfile options have been added.

20 May 2024 - AuriStorFS v2021.05-39 Released

The AuriStorFS v2021.05-39 release is an important update for all systems.

v2021.05-39 (20 May 2024)

Since the prior release, support has been added for Red Hat 9.4, AlmaLinux 9.4, Rocky Linux 9.4, Fedora 40, Ubuntu 24.04, SLES 15.5, OpenSUSE Leap 15.5, Linux 6.9 kernels, and arm64 for Debian/Ubuntu distributions.

AuriStorFS continues to push the performance envelope by increasing the parallelism of the Rx RPC stack. Random data is required during encryption of every Rx DATA packet. The v2021.05-39 release introduces thread-local random number generation. This avoids the process-global mutual exclusion barrier protecting the internals of the Kerberos distribution's random number generators, which forced application threads scheduled to multiple cores to serialize packet encryption. The elimination of delays when encrypting outgoing DATA packets can prevent call data flow stalls.
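The idea of per-thread generators can be sketched in a few lines. This is only an illustration of the technique: the names `thread_rng` and `packet_nonce` are hypothetical, and Python's `random.Random` stands in for whatever generator the real implementation uses (it is not suitable for cryptographic use).

```python
import os
import random
import threading

_tls = threading.local()


def thread_rng():
    """Return a generator owned by the calling thread, creating it on first use."""
    rng = getattr(_tls, "rng", None)
    if rng is None:
        # Seed each thread's generator independently from the operating system.
        rng = random.Random(os.urandom(32))
        _tls.rng = rng
    return rng


def packet_nonce(nbytes=16):
    """Produce random bytes for one packet using only the caller's own generator."""
    return thread_rng().getrandbits(8 * nbytes).to_bytes(nbytes, "big")
```

Because each thread only ever touches its own generator state, threads running on different cores do not contend on a shared lock around that state.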

High throughput Rx calls rely upon reliable execution of time-based events which are processed by a single Rx event thread. The Rx network stack schedules a garbage collection operation to execute once per minute. This operation enforces call timeouts, destroys idle connections and destroys idle peers. The operation has historically been performed by the Rx event thread which is responsible for performing actions in response to call RTOs, sending NAT Ping and keep-alive packets, and retrying connection challenge and reachability checks.

The time complexity of the garbage collection operation is determined by the number of calls, connections, and peers. The busier the Rx endpoint the more work must be performed during each garbage collection run and the longer it takes to complete. While garbage collection is active other events cannot be processed which can interfere with the proper flow control of active calls.

As with all Rx events, the garbage collection event is scheduled to execute at an absolute clock time. If the system clock drifts (or is administratively set) backwards, garbage collection will not be performed until the clock catches up with the scheduled time.

Another responsibility of the garbage collection procedure is to terminate calls if the system clock drifted backwards by five minutes or longer. However, when the clock drifts backwards, garbage collection is not performed until the clock has advanced beyond the point where calls require termination. As a result, calls are not terminated due to backwards clock drift and they can stall.

This release re-implements the garbage collection procedure using a dedicated thread and relative waits. This change ensures that the garbage collection procedure will not prevent the execution of call related events and permits calls to be terminated when large backward clock drifts are detected.
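A minimal sketch of the "dedicated thread and relative waits" pattern follows. The `GarbageCollector` class and its `collect` callback are hypothetical names used for illustration and do not reflect the Rx implementation.

```python
import threading


class GarbageCollector:
    """Run `collect` every `interval` seconds on a dedicated thread."""

    def __init__(self, collect, interval=60.0):
        self._collect = collect
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait() takes a relative timeout rather than an absolute
        # wall-clock deadline, so the next run is not tied to a scheduled
        # clock time. The loop exits as soon as stop() sets the event.
        while not self._stop.wait(self._interval):
            self._collect()
```

Because the collector has its own thread, a slow `collect` run in this sketch delays only the next collection and not any other timer-driven work.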

29 February 2024 - AuriStorFS v2021.05-38 Released

The AuriStorFS v2021.05-38 release is an important update for all systems.

v2021.05-38 (29 February 2024)

As with other AuriStorFS releases since the beginning of 2024, this release includes additional improvements to the Rx RPC implementation which are related to the possibility of silent data corruption when Rx jumbograms are in use. Prior releases disabled the negotiation of Rx jumbograms such that the v2021.05-37 Rx peer will refuse to send Rx jumbograms and will request that the remote peer does not send them. However, a bad actor could choose to send Rx jumbograms even though they were asked not to. v2021.05-38 introduces additional protections to ensure that a corrupt Rx jumbogram is dropped instead of being accepted.

The v2021.05-38 Rx RPC implementation also includes two optimizations. First, when Rx initiators complete a call they will no longer send an extra ACK packet to the Rx acceptor of the completed call. The sending of this unnecessary ACK creates additional work for the server which can result in increased latency for other calls being processed by the server.

Second, all AuriStor Rx services require a reach check for incoming calls from Rx peers to help protect against Distributed Reflection Denial of Service (DRDoS) attacks and execution of RPCs when the response cannot be delivered to the caller. A new reach check is required for each new call that arrives more than 60 seconds after the prior reach check completed. v2021.05-38 Rx considers the successful acknowledgment of a response DATA packet as a reach check validation. With this change reach checks will not be periodically required for a peer that completes at least one call per 60 seconds. A 1 RTT delay is therefore avoided each time a reach check can be avoided. In addition, reach checks require the service to process an additional ACK packet. Eliminating a large number of reach checks can improve overall service performance.
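The reach check bookkeeping described above can be modeled roughly as follows. The `PeerReachability` class and its method names are hypothetical; only the 60 second lifetime and the rule that an acknowledged response DATA packet counts as a validation come from the text.

```python
REACH_CHECK_LIFETIME = 60.0  # seconds, per the description above


class PeerReachability:
    """Track when a peer was last shown to be reachable."""

    def __init__(self):
        self.validated_at = None  # time of the last validation, if any

    def needs_reach_check(self, now):
        """A new call needs a reach check if validation is missing or stale."""
        return (self.validated_at is None
                or now - self.validated_at > REACH_CHECK_LIFETIME)

    def reach_check_completed(self, now):
        self.validated_at = now

    def response_data_acked(self, now):
        # v2021.05-38 behavior as described above: an acknowledged response
        # DATA packet counts as a reach check validation.
        self.validated_at = now
```

In this model a peer that has a response acknowledged at least once every 60 seconds keeps refreshing `validated_at` and never pays the extra round trip.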

The final Rx RPC change in this release is specific to kernel implementations. Prior releases restricted the frequency of executing time scheduled Rx events to a granularity no smaller than 500ms. As a result an RTO timer event for a lost packet could not be shorter than 500ms even if the measured RTT for the connection is significantly smaller. The minimum RTO for a connection in AuriStor Rx is 200ms. The inability to schedule shorter timeouts impacts recovery from packet loss.
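The cost of the 500ms floor can be shown with simple arithmetic. The model below is a deliberate simplification (the effective delay is taken as the larger of the requested RTO and the timer granularity), and `effective_timeout_ms` is a hypothetical helper.

```python
def effective_timeout_ms(rto_ms, granularity_ms):
    """Earliest delay at which a timer event can fire (simplified model)."""
    return max(rto_ms, granularity_ms)


MIN_RTO_MS = 200          # minimum RTO stated above
OLD_GRANULARITY_MS = 500  # granularity floor in prior kernel releases

delay = effective_timeout_ms(MIN_RTO_MS, OLD_GRANULARITY_MS)
extra = delay - MIN_RTO_MS
print(f"retransmit after {delay} ms, {extra} ms later than the RTO asks for")
# prints "retransmit after 500 ms, 300 ms later than the RTO asks for"
```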

For client systems, the v2021.05-38 release contains fixes for two bugs that have resulted in system crashes on Linux when resource limits have been exceeded either by the system as a whole or for the process accessing /afs.

CrayOS SLES 5.14.21 is now a supported client platform.

5 February 2024 - AuriStorFS v2021.05-37 Released

The AuriStorFS v2021.05-37 release is an important update for all systems.

10 January 2024 - AuriStorFS v2021.05-36 Released

The AuriStorFS v2021.05-36 release is an important update for all systems.

21 December 2023 - AuriStorFS v2021.05-34 Released

The AuriStorFS v2021.05-34 release is an important update for Linux client systems.

Linux cache manager improvements
- v2021.05-33 introduced a critical bug for Linux cache managers. Creating a hard link produces an undercount of the linked inode's i_count. This undercount can result in a kernel module assertion failure if the inode is garbage collected due to memory pressure. The following message will be logged to dmesg
```
     "yfs: inode freed while on LRU"
```
  followed by a kernel BUG report. This bug is fixed in v2021.05-34.
- If the oom-killer terminates a process while it is executing within the AuriStorFS kernel module it is possible for memory allocations to fail. This can lead to failures reading from the auristorfs cache. This release includes additional logic to permit failing the cache request without triggering a NULL pointer dereference.
- If the auristorfs disk cache filesystem is remounted read-only then the disk cache will become unusable. Instead of triggering a system panic when attempts to read or write fail, log a warning and fail the request.

27 November 2023 - AuriStorFS v2021.05-33 Released

The AuriStorFS v2021.05-33 release is a recommended update for all systems.

9 October 2023 - AuriStorFS v2021.05-32 Released

The AuriStorFS v2021.05-32 release is a CRITICAL UPDATE for Linux aarch64 client systems.

UNIX Cache Managers
- Cache Manager:
  - CRITICAL UPDATE for aarch64 systems. Prior releases incorrectly compiled Neon source code routines and as a result floating point errors can occur.
  - The d_revalidate dentry operation should return false if the fileserver reports a FileID as non-existent in response to an InlineBulkStatus or FetchStatus RPC.
- aklog and klog.krb5
  - Only output an error message if the token can be set into neither the AuriStorFS cache manager nor the Linux kernel afs cache manager.

25 September 2023 - AuriStorFS v2021.05-31 Released

The AuriStorFS v2021.05-31 release includes support for macOS 14 Sonoma and Linux 6.6 kernels.

7 September 2023 - AuriStorFS v2021.05-30 Released

The AuriStorFS v2021.05-30 release includes important bug fixes for clients, servers and administrative tooling.

27 May 2023 - AuriStorFS v2021.05-29 Released

AuriStorFS Clients and Servers version v2021.05 Patch 29 released.

New platform: Linux 6.4 kernels.
Fixed memory leaks in the yfs.ko kernel module that could be triggered by executing fs commands such as examine, whereis, listquota, fetchacl, cleanacl, storeacl, whoami, lsmount, bypassthreshold and getserverprefs.
On Linux, prevents a kernel panic if the configured cache directory is located on a filesystem such as overlayfs which does not support the functionality required to be a cache.
Improved handling of vos operations that forward a volume between two volservers when the vos process is separated from one or both of the volservers by a NAT/PAT/firewall which times out port mappings within 100 seconds.

11 May 2023 - AuriStorFS v2021.05-28 Released

AuriStorFS Clients and Servers version v2021.05 Patch 28 released.

Fixes for volserver bugs.
New GPG signing key deployed for package compatibility with Fedora 38.

1 May 2023 - AuriStorFS v2021.05-27 Released

AuriStorFS Clients and Servers version v2021.05 Patch 27 released.

Fixes for bug introduced in v2021.05 Patch 26 within volserver and vos.

17 April 2023 - AuriStorFS v2021.05-26 Released

28 December 2022 - AuriStorFS v2021.05-25 Released

19 October 2022 - AuriStorFS v2021.05-24 Supports Apple macOS 13 (Ventura)

AuriStorFS Clients and Servers released for Apple macOS 13 (Ventura) on both Apple Silicon and Intel architectures.

4 October 2022 - AuriStorFS v2021.05 Patch 23 Released

New Supported Platforms:
- Fedora 37
- Linux 6.0 kernels
UNIX Cache Manager
- Locally configured cell aliases can now be used when evaluating magic mount paths /afs/.@mount/<cell-name-or-alias>/<volume-name>/.
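As an illustration of how a magic mount path with an alias might be interpreted, the sketch below splits the path and maps an alias to its cell name. The `resolve_magic_mount` function, the alias table, and the `ex` / `example.org` names are hypothetical; the actual cache manager logic is not described in these notes.

```python
MAGIC_PREFIX = "/afs/.@mount/"


def resolve_magic_mount(path, aliases):
    """Split a magic mount path into (cell name, volume name)."""
    if not path.startswith(MAGIC_PREFIX):
        raise ValueError(f"not a magic mount path: {path}")
    parts = path[len(MAGIC_PREFIX):].strip("/").split("/")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected <cell-name-or-alias>/<volume-name>: {path}")
    cell_or_alias, volume = parts[0], parts[1]
    # A locally configured alias maps to its full cell name; anything else
    # is taken to be a cell name already.
    return aliases.get(cell_or_alias, cell_or_alias), volume
```

With an alias table of `{"ex": "example.org"}`, both `/afs/.@mount/ex/root.cell/` and `/afs/.@mount/example.org/root.cell/` resolve to the same cell and volume in this sketch.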

12 September 2022 - AuriStorFS v2021.05 Patch 22 Released

Location Service
- Introduce new RPCs YFSVL_GetAuthoritativeEntryByName, YFSVL_GetAuthoritativeEntryByID, and YFSVL_AuthoritativeListAttributesU2. When available on all servers and the vos command these RPCs will be used to ensure that vos commands that modify the database will only fetch data from the elected Location Service coordinator.
UBIK clients (vos, pts, afsbackup)
- Avoid unnecessary delays when issuing RPCs that are not supported on all of the servers in the UBIK quorum.
Volume Server
- Since the introduction of v2021.05 vos listmaxacl, vos listrootacl and vos listseclevels failed to end the ITReadOnly volume transaction created to perform the query. This was due to a permission failure in the Volume Server.

6 September 2022 - AuriStorFS v2021.05-21 Released

AuriStorFS Clients and Servers version v2021.05 Patch 21 released.
- Volume Group Object Store
  - Volume clone removal time reduced by eliminating unnecessary backing store I/O operations.
  - A data-version-vnode object creation race that might result in spurious creation failures was fixed.
  - A data-version-vnode object reference count race that might result in improper deletion was fixed.
- vos command line tool
  - vos backup and vos backupsys can now update a pre-existing backup volume while a vos release is in progress.
- RX RPC
  - No longer send ACK packets in response to DATA packets if an ABORT packet has been sent.
  - The Sent RX BUSY packets counter is once again reported to rxdebug server port -rxstats.
  - Improve resiliency when the RX peer advertises unreasonable values for ACK packet trailer fields. This change permits continued communication with broken RX RPC implementations.
- Linux kernel module
  - Linux 6.0 mainline kernels are now supported
  - Fix a build error with Linux mainline 5.19 or later kernels when the architecture is aarch64.
  - The kernel module now includes description, author and version information that can be displayed via modinfo.

15 August 2022 - AuriStorFS v2021.05-20 Released

AuriStorFS Clients and Servers version v2021.05 Patch 20 released. This release:
- Volume Server changes
  - The volserver validation checks introduced in v2021.05-19 break the restoration of incremental volume dumps. This release fixes the regression and adds tests to validate the behavior.

13 August 2022 - AuriStorFS v2021.05-19 Released

AuriStorFS Clients and Servers version v2021.05 Patch 19 released. This release:
- FileServer Updates
  - Reorganize how Volumes are initialized for use with the VLRU. This change avoids the possibility of an assertion failure if a volume is placed into an error state without it being detached.
  - Prevent a race which could result in a core dump during startup of the VLRU Scanner Thread. This race was introduced in 2021.05-18.
  - When creating a cross-directory hard link, delay the assignment of a per-file ACL until after the copy-on-write operation succeeds.
- Protect Service changes:
  - Prevent crash if authenticated Kerberos v5 identity contains a dot in the first component and "allow-dotted-principals" is disabled.
- Volume Server changes
  - When a Volume Dump RPC fails due to an RX Peer Unreachable error log the ICMP error details (if available).
  - Introduce additional consistency checks when receiving a dump stream.
- Salvage Service changes
  - Do not assign a parent directory vnode to orphaned files with attached per-file ACLs.
- Backup Service changes
  - Prior releases failed to re-open log files after receiving a signal from logrotate.
- Volume Package changes
  - The volume group link table limit that prevented a vnode from being linked to more than seven volumes within a volume group has been removed. Exceeding this limit when creating a new volume clone (readonly, backup or other) could result in the entire volume group becoming unusable. The limit of seven was derived from the design of the IBM/OpenAFS volume group object store format. There is no design limit in the AuriStorFS object store format.
- UBIK Service changes
  - Prevent threads attempting to perform a write transaction from jumping ahead of a queue of threads waiting for exclusive access.
- UBIK Client changes
  - When it is known that a coordinator is required to complete an RPC, increase the RX connection dead time from 12s to 60s in case the coordinator is under heavy load.
  - When it is known that a coordinator is required to complete an RPC, disable the RX hard dead timeout since it is not safe for a write transaction RPC to be retried.
- RX RPC
  - Include the DATA packet serial number in the transmitted reachability check PING ACK. This permits the reachability test ACK to be used for RTT measurement.
  - Do not terminate a call due to an idle dead timeout if there is data pending in the receive queue when the timeout period expires. Instead deliver the received data to the application. This change prevents idle dead timeouts on slow lossy network paths.
  - Fix assignment of RX DATA, CHALLENGE, and RESPONSE packet serial numbers on macOS (KERNEL) and Linux (userspace). Due to a mistake in the implementation of atomic_add_and_read the wrong serial numbers were assigned to outgoing packets.
- vos subcommand changes
  - Do not default the -clone switch to yes if the volume type is readonly or backup. There is no benefit to using a clone for these volume types as the volumes are not taken away from the fileserver during the volume operation. dump, shadow, and copy are affected.
  - Report an error sooner for move if the volume type is not read-write.
  - Report an error sooner for movesite if the volume type is not readonly.
  - Report an error sooner for copysite if the volume type is not readonly.
  - When dumping a volume using a clone the clone will be flagged as temporary to ensure that if the transaction is interrupted that the clone will be automatically garbage collected.

12 July 2022 - AuriStorFS v2021.05-18 Released

AuriStorFS Clients and Servers version v2021.05 Patch 18 released. This release:
- New Supported Platforms:
  - Linux Kernel 5.19
- FileServer Updates
  - Improved tracking of idle but in-use volumes to avoid unnecessary volume salvaging after an emergency fileserver restart or transition from active to passive server instance.
  - Streamlined the fileserver shutdown process to reduce volume contention. These changes ensure that a fileserver with millions of attached volumes can perform a clean shutdown in just a few seconds.
  - Prevent idle dead timeouts during StoreData calls exceeding 8MB of data over slow lossy networks.
  - Protect against signed extension parameter overflow when processing RXAFS_FetchData calls.
- Volume Server changes
  - Include temporary volumes (those flagged as 'destroyMe') in the output of AFSVolListVolumes and AFSVolListOneVolume calls.
- Cache Manager
  - Prevent a kernel memory leak of less than 64 bytes for each bulkstat RPC issued to a fileserver. Bulkstat RPCs can be frequently issued and over time this small leak can consume a large amount of kernel memory. Leak introduced in AuriStorFS v0.196.
  - The Perl::AFS module directly executes pioctls via the OpenAFS compatibility pioctl interface instead of the AuriStorFS pioctl interface. When Perl::AFS is used to store an access control list (ACL), the deprecated RXAFS_StoreACL RPC would be used in place of the newer RXAFS_StoreACL2 or RXYFS_StoreOpaqueACL2 RPCs. This release alters the behavior of the cache manager to use the newer RPCs if available on the fileserver and fallback to the deprecated RPC. The use of the deprecated RPC was restricted to use of the OpenAFS pioctl interface.
- RX RPC
  - Handle a race during RX connection pool probes that could have resulted in the wrong RX Service ID being returned for a contacted service. Failure to identify the correct Service ID can result in a degradation of service.
  - The Path MTU detection logic sends padded PING ACK packets and requests a PING_RESPONSE ACK be sent if received. This permits the sender of the PING to probe the maximum transmission unit of the path. Under some circumstances attempts were made to send negative padding which resulted in a failure when sending the PING ACK. As a result, the Path MTU could not be measured. This release prevents the use of negative padding.
- Some shells append a slash to an expanded directory name in response to tab completion. These trailing slashes interfered with "fs lsmount", "fs flushmount" and "fs removeacl" processing. This release includes a change to prevent these commands from breaking when presented a trailing slash.
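A minimal sketch of tolerating a tab-completed trailing slash is shown below. The `normalize_path_argument` helper is a hypothetical name and is not taken from the fs command source.

```python
def normalize_path_argument(path):
    """Drop trailing slashes, but keep a bare "/" intact."""
    stripped = path.rstrip("/")
    return stripped if stripped else path[:1]
```

For example, `/afs/example.org/user/` and `/afs/example.org/user` both normalize to `/afs/example.org/user`, so a command sees the same path either way.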

16 May 2022 - AuriStorFS v2021.05-17 Released

AuriStorFS Clients and Servers version v2021.05 Patch 17 released. This release:
- improves the reliability of the RX RPC protocol ACK packet processing and congestion avoidance algorithms.
- hints the creation of outgoing fileserver callback connections to improve the selection of a compatible network interface.
- improves the reliability of UBIK client processes such as vos that mix database reads with database writes. Such clients can write stale data to the database if the source data is read from a non-coordinator replica of the database that is temporarily out of sync with the coordinator.
- New Supported Platforms:
  - Red Hat Enterprise Linux 9.0
  - Red Hat Enterprise Linux 8.6
  - Fedora 36
  - Ubuntu 22.04
  - Linux Kernel 5.18
  - Debian "arm hard float"
- Cell Service Database Updates
  - Update cern.ch, ics.muni.cz, ifh.de, cs.cmu.edu, qatar.cmu.edu, it.kth.se
  - Remove uni-hohenheim.de, rz-uni-jena.de, mathematik.uni-stuttgart.de, stud.mathematik.uni-stuttgart.de, wam.umd.edu
  - Add ee.cooper.edu
  - Restore ams.cern.ch, md.kth.se, italia

25 October 2021 - AuriStorFS supports macOS 12 Monterey

AuriStorFS Client installers beginning with the v2021.05-9 release are supported on macOS 12 Monterey on both Apple Silicon and Intel Macs. Note: please upgrade Big Sur, Catalina, and Mojave macOS systems to AuriStorFS v2021.05-9 before upgrading macOS to Monterey. macOS Monterey will deadlock during shutdown when previous versions of AuriStorFS built for Big Sur, Catalina and Mojave are installed.

9 June 2021 - USENIX LISA21 Talks Available for Public Viewing

Leveraging AFS Storage Systems to Ease Global Software Deployment
Tracy J. Di Marco White, Goldman Sachs
Using AFS as both a file store and an object store, we provide software to hundreds of thousands of client systems within both public and private cloud. As we see a continual increase in the frequency of software deployments, in the number of different software packages, and in the number of versions of each software package, we have also adapted our software deployment systems. Both of our software deployment systems use AFS, but one is unaware of AFS, and one makes specific use of various AFS features. I'll cover how the infrastructure has grown from several private data centers, and how our use of AFS has eased migration to both private and public cloud. I'll discuss the changes we are making to both the AFS-unaware and AFS-aware deployment systems, as well as discuss bugs, bottlenecks, and patterns of software development and usage that we've discovered through the change process.
Hands-Off Testing for Networked Filesystems
Daria Phoebe Brashear, AuriStor, Inc.
Cross-platform network filesystems require testing, but in-kernel interface testing is problematic under the best of circumstances. This talk will discuss the techniques used at AuriStor for automating hands-off testing using buildbot, TAP, docker, and kvm.

31 May 2021 - AuriStor File System v2021.05 released for clients and servers

22 April 2021 - AuriStor File System v2021.04 released for clients and servers

AuriStorFS v2021.04 contains another round of significant improvements to the UBIK replication infrastructure used by the location, protection and backup services. The changes remove a serialization point by providing further data isolation between read-transactions and write-transactions. This change further reduces the risk of a thundering herd of reader threads subsequent to write-transaction completion in a heavily loaded cell (six ubik peers, peaks of 7000 reads/second/peer, and average of 22 writes/second). This release also adds additional data consistency protections to the UBIK application services (vlserver, ptserver, and budbserver). Sites that take advantage of the protection service's groups of groups (aka "supergroups") capability will appreciate a noticeable reduction in CPU usage when responding to fileserver queries for a user's current protection set (CPS). The RXGK service co-located with the location service includes a change to prepare for issuance of tokens that do not mandate wire privacy and integrity protection. AuriStor recommends that all UBIK database service instances be updated to v2021.04.
AuriStor recommends that fileservers be updated when convenient to do so. There is a risk of partition lock deadlock during fileserver startup introduced in v0.209 and a potential vnode metadata consistency issue when clients perform cross-directory renames of vnodes with a link count greater than one.
The UNIX / Linux cache manager changes are primarily bug fixes for issues that have been present for years. These include the possibility of an infinite kernel loop if a rare file write / truncate pattern occurs, and a bug in silly rename handling that can prevent cache manager initiated garbage collection of vnodes. On Linux, an overwritten ERESTARTSYS error during fetch or store data RPCs could result in transient failures. Upgrading to v2021.04 is recommended but not urgent.

13 March 2021 - AuriStor File System v0.209 released for clients and servers

v0.209 introduces a new cache manager architecture on all Linux platforms. The new architecture includes a redesign of:
- kernel extension load
- kernel extension unload
- /afs mount
- /afs unmount
New platform support includes:
- Linux mainline 5.11 and 5.12 kernels
- gcc11 compilation
- Updates for Linux ppc64 and ppc64le architectures
- Hardware accelerated cryptographic routines for Linux __aarch64__.
- Ubuntu 18.04 and 20.04 -oem kernel modules.
Higher throughput read and write transactions for UBIK based location, protection and backup database services. The combination of write transaction isolation and lookup caching substantially increases the rate of read and write transactions while reducing the risk that read transactions can block for extended periods.
Compared to v0.200, these changes have reduced the minimum write transaction time by 75%, the mean by 50% and the maximum by 13%. The rate of write transaction completion increased by 35%.
UNIX cache manager negative volume name lookup caching.
bos getfile - similar to bos getlog but can be used to transfer files with arbitrary binary contents.
asetkey delete by key sub-type.
CIDR based assignment of server endpoint priorities.
Over 600 additional improvements.
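The CIDR based assignment of server endpoint priorities listed above can be pictured as a most-specific-match lookup. The rule table, the priority values and the `endpoint_priority` function below are hypothetical and use documentation address ranges; they do not reflect the AuriStorFS configuration syntax.

```python
import ipaddress

# Hypothetical rules: (network, priority). Values are illustrative only.
RULES = [
    (ipaddress.ip_network("192.0.2.0/24"), 10),
    (ipaddress.ip_network("192.0.2.128/25"), 5),
    (ipaddress.ip_network("2001:db8::/32"), 30),
]
DEFAULT_PRIORITY = 100


def endpoint_priority(address, rules=RULES, default=DEFAULT_PRIORITY):
    """Return the priority of the most specific network containing `address`."""
    addr = ipaddress.ip_address(address)
    matches = [(net.prefixlen, prio) for net, prio in rules
               if net.version == addr.version and addr in net]
    if not matches:
        return default
    # The longest prefix is the most specific match.
    return max(matches)[1]
```

Here `192.0.2.200` falls in both IPv4 networks and takes the priority of the narrower /25, while an address outside every rule falls back to the default.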

12 November 2020 - AuriStor File System v0.201 released - Apple macOS Big Sur 11.0 on Apple Silicon and Intel

v0.201 introduces a new cache manager architecture on all macOS versions except for High Sierra (10.12). The new architecture includes a redesign of:
- kernel extension load
- kernel extension unload (not available on Big Sur)
- /afs mount
- /afs unmount
- userspace networking
The conversion to userspace networking will have two visible impacts for end users:
- The Apple Firewall as configured by System Preferences -> Security & Privacy -> Firewall is now enforced. The "Automatically allow downloaded signed software to receive incoming connections" setting includes AuriStorFS.
- Observed network throughput is likely to vary compared to previous releases.
On Catalina the "Legacy Kernel Extension" warnings that were displayed after boot with previous releases of AuriStorFS are no longer presented with v0.201.
AuriStorFS /afs access is expected to continue to function when upgrading from Mojave or Catalina to Big Sur. However, as AuriStorFS is built specifically for each macOS release, it is recommended that end users install a Big Sur specific AuriStorFS package.
AuriStorFS on Apple Silicon supports hardware accelerated aes256-cts-hmac-sha1-96 and aes128-cts-hmac-sha1-96 using AuriStor's proprietary implementation.

4 November 2020 - AuriStor File System v0.200 released

The v0.200 release is an important release targeted at AuriStorFS server deployments and Unix cache managers.

fileserver: remote denial of service
Impacted versions: All releases from v0.116 through v0.198 are affected.
CVE: None
CVSS Base Score:             4.9
Impact Subscore:             3.6
Exploitability Subscore:     1.2
CVSS Temporal Score:         4.8
CVSS Environmental Score:    3.1
Modified Impact Subscore:    1.8
Overall CVSS Score:          3.1
CVSS v3.1:
AV:N/AC:L/PR:H/UI:N/S:U/C:N/I:N/A:H/E:F/RL:U/RC:C/CR:X/IR:X/AR:L/MAV:N/MAC:L/MPR:H/MUI:N/MS:U/MC:N/MI:N/MA:H
Details on this denial of service vulnerability will be disclosed thirty
(30) days after release to give customer sites an opportunity to update
their fileservers to v0.200.
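For readers who want to check the figures, the Base Score and its two subscores can be recomputed from the base metrics in the vector string. The sketch below follows the published CVSS v3.1 base equations and ignores the temporal and environmental metrics; the function names are this sketch's own.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place as defined by the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_scores(vector):
    """Return (base, impact, exploitability) for a CVSS v3.1 vector string."""
    m = dict(item.split(":") for item in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - ((1 - WEIGHTS["C"][m["C"]])
               * (1 - WEIGHTS["I"][m["I"]])
               * (1 - WEIGHTS["A"][m["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * pr * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0, impact, exploitability
    total = impact + exploitability
    base = roundup(min(1.08 * total, 10)) if changed else roundup(min(total, 10))
    return base, impact, exploitability
```

Applied to the vector above, `base_scores` returns a base score of 4.9 with impact and exploitability subscores that round to 3.6 and 1.2, matching the values listed in the advisory.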

Linux cache managers prior to v0.199 are susceptible to a general protection fault if a server unreachable network error occurs during a direct i/o operation.

The macOS backgrounder in prior releases can repeatedly segmentation fault and restart when there is no network connectivity.

This release includes major improvements in the handling of RPCs that are interrupted by NAT/PAT devices timing out their udp endpoint mappings.

12 October 2020 - AuriStor File System v0.198 released - Critical Server Update

The v0.198 release is a security release targeted at AuriStorFS server deployments.

vlserver: remote denial of service
Impacted versions: All releases from v0.193 through v0.197 are affected.
CVE: CVE-2020-26119
CVSS Base Score:             8.6
Impact Subscore:             4.0
Exploitability Subscore:     3.9
CVSS Temporal Score:         8.0
CVSS Environmental Score:    9.3
Modified Impact Subscore:    5.9
Overall CVSS Score:          9.3
CVSS v3.1:
AV:N/AC:L/PR:N/UI:N/S:C/C:N/I:N/A:H/E:F/RL:O/RC:C/CR:X/IR:X/AR:H/MAV:N/MAC:L/MPR:N/MUI:N/MS:C/MC:N/MI:N/MA:H
Details on this denial of service vulnerability will be disclosed only
after all customer sites have updated their location servers to v0.198.
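
The Temporal Score above follows from the Base Score and the E, RL, and RC metrics in the vector. As a hedged illustration (the function names are invented here, and the weights are those published in the CVSS v3.1 specification), the relationship can be sketched as:

```python
import math

# Temporal metric weights from the CVSS v3.1 specification.
EXPLOIT_MATURITY = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}
REMEDIATION_LEVEL = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}
REPORT_CONFIDENCE = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}


def roundup(x):
    """CVSS v3.1 Roundup: smallest value with one decimal place that is >= x."""
    i = int(round(x * 100000))
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def temporal_score(base, e, rl, rc):
    """Temporal score = Roundup(base * E * RL * RC)."""
    return roundup(
        base * EXPLOIT_MATURITY[e] * REMEDIATION_LEVEL[rl] * REPORT_CONFIDENCE[rc]
    )


# v0.198 vlserver issue: base 8.6 with E:F / RL:O / RC:C
print(temporal_score(8.6, "F", "O", "C"))  # 8.0
```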

26 August 2020 - AuriStor File System v0.197 released

The v0.197 release includes significant performance improvements for UNIX cache managers, especially for macOS and Red Hat Enterprise Linux (including derivatives). macOS users accessing AuriStorFS cells with yfs-rxgk will experience AuriStor's proprietary hardware accelerated implementations of aes256-cts-hmac-sha1-96 and aes256-cts-hmac-sha512-384, and optimistic caching of vnode status information for the first time. Linux users will benefit from more aggressive optimistic caching of status information as well as support for SELinux labels and the world's first path-ioctl implementation for FUSE. CentOS users will appreciate the dedicated repository for CentOS kernel modules. This release also introduces support for the Linux mainline 5.8 and 5.9 kernels.

Cell administrators will appreciate an improved vos eachvol and new pts eachuser and pts eachgroup commands. The new pts whoami -rxkad switch and improved logging of yfs-rxgk authentication failures ease debugging of authentication and authorization configuration errors.

All servers have been updated to improve reliability and performance. New audit events have been added to the fileserver when a rename operation unlinks an existing target. A rename that replaces a directory will no longer orphan the unlinked directory. Volserver transactions can no longer be stolen by another cell administrator. UBIK service coordinators protect themselves against peers that accept a transaction RPC but never complete it.

Last but not least, AuriStor has introduced fs ignorelist to replace fs blacklist. Likewise ignorelist-dns, ignorelist-volroot, ignorelist-afsmountdir, and ignorelist-volrootprefix replace the former [afsd] "blacklist" variants. In all cases "blacklist" is accepted as a hidden alias.

14 May 2020 - AuriStor File System v0.195 released

The v0.195 release is a CRITICAL update for all macOS and Linux cache managers. The changes in v0.195 correct bugs that can result in data corruption.

v0.195 also introduces support for Linux 5.7 kernels.

2 April 2020 - AuriStor File System v0.194 released

The v0.194 release is a CRITICAL update for all UBIK servers (vlserver, ptserver, buserver) and the macOS and Linux cache managers. The changes in v0.194 correct bugs that can result in data corruption in all of the above.

v0.194 also introduces support for Linux 5.6 kernels and Red Hat Enterprise Linux 7.8.

30 January 2020 - AuriStor File System v0.192 released

The v0.192 release is primarily a bug fix release focused on the Linux cache manager, the fileserver/salvageserver and "vos".

The fileserver fixes are critical for any server with a vice partition supporting Linux reflinks (xfs+reflinks, btrfs, ocfs2). AuriStor is unaware of any customers operating fileservers with these configurations. There is also a fix to permit salvaging volumes containing more than 76 million vnodes.

The Unix cache manager changes improve stability, efficiency, and scalability. Post-0.189 changes exposed race conditions and reference count errors which can lead to a system panic or deadlock. In addition to addressing these deficiencies this release removes bottlenecks that restricted the number of simultaneous vfs operations that could be processed by the auristorfs cache manager. The changes in this release have been successfully tested with greater than 400 simultaneous requests sustained for several days.

Changes to the "vos move" and "vos movesite" commands preserve the volume's last update timestamp, ensuring proper calculation of the change set for incremental transfers. Without this fix, a RW volume with an existing BACK volume can revert to the contents of the BACK volume if it is moved twice without first removing or updating the BACK volume.

16 December 2019 - AuriStor File System v0.191 released

The AuriStor File System v0.191 release addresses bugs that were identified after the release of v0.190, works around a regression in RHEL7 kernels, and improves Linux cache manager startup/shutdown. Notable changes include:

Fileserver bug fixes when Linux reflinks are supported by vice partition backing stores (xfs+reflinks, btrfs, or ocfs2).

Re-enabling SIMD processor extensions for non-RHEL userland Linux processes when the kernel module does not export __kernel_fpu_begin or __kernel_fpu_end.

Improvements in Linux kernel module startup and shutdown that reduce the risk that a reboot will be required.

Work-around for a RHEL 7.6 and 7.7 regression that impacts Linux systems configured to export /afs via nfsd.

15 November 2019 - AuriStor File System v0.190 released

The AuriStor File System v0.190 release addresses bugs that were identified shortly after the release of v0.189.

28 October 2019 - AuriStor File System v0.189 released

The v0.189 release includes a broad range of performance improvements, new features, and of course, bug fixes. The highlights include:

UNIX Cache Manager performance improvements

Faster "git status" operation on repositories stored in /afs.

Faster and less CPU intensive writing of large (>64GB) files to /afs. Prior to this release writing files larger than 1TB might not complete. With this release store data throughput is consistent regardless of file size. (See UNIX Cache Manager large file performance improvements).

New Fileserver support for Data Loss Prevention scanning and Backup solutions that walk the directory tree. (See New Feature: Fileserver Implicit ACLs).

A major rewrite of the Ubik recovery engine to eliminate contention with the coordinator election procedures. These changes ensure that regardless of the database size or network latency, Ubik recovery can no longer alter the timing of election ballots. In prior releases a lengthy recovery could prevent an election from being conducted which could result in the expiration of the coordinator's term. Reduction in thread contention will also enhance the performance of the vlserver, ptserver and buserver.

Reduced Ubik transaction times resulting from parallelization of quorum updates.

Many improvements intended to reduce the amount of data included in an incremental volume dump and/or avoid the need for generating a full volume dump during a "vos release".

New support for acceptor-only keys that permit key rotation without single points of failure or flag days. (See New Feature: Key Rotation Using Acceptor Only Keys).

Auditing improvements.

Updated platform support including Fedora Core 31 and Oracle Linux

and much more ...

8 October 2019 - AuriStor File System v0.188 released for macOS Catalina (10.15)

AuriStorFS v0.188 installer for macOS Catalina (10.15) released in conjunction with Apple's release of macOS Catalina (10.15).

23 June 2019 - AuriStor File System v0.188 released

v0.188 addresses three issues experienced by customers.

The first is UBIK coordinator term expiration of the location service after periodic load spikes that increased the size of the vlserver thread pool from 100 threads to more than 13,000 threads. The load spike would last for under a minute, and the thread pool would scale back to 100 threads after 20 minutes. Forty minutes later another load spike would occur, repeating the pattern. The existing rx packet allocator behaved very poorly under this workload pattern, allocating additional rx packets with each load spike. After a week more than five million rx packets had been allocated on some vlservers.

As the allocated packet counts increased and the number of threads decreased, the packets per thread ratio increased as well. When the thread pool resized to 100 threads the number of packets assigned to each thread grew to the point where packet transfers began to interfere with rx data transfer and event processing. The UBIK coordinator election algorithm is time sensitive and a failure to deliver votes or time out RPCs in a timely manner can result in election failure.

v0.188 addresses the root causes by replacing:

the rx multi call implementation used to conduct UBIK elections with a new variation that manages its own timeouts and does not rely upon timeouts set upon each individual rx rpc.

the condvar timed wait implementation with a version that has finer grained clock resolution: 1ns instead of 1s.

the rx packet allocator with a new implementation that is better suited for use with dynamic thread pools and larger window sizes. The new allocator also significantly reduces lock contention when obtaining and releasing packets.

The second problem is loss of volume access after the fileserver writes to the FileLog:
CopyOnWrite corruption prevention: detected zero nlink for volume N inode vnode:V unique:U tag:T (dest), forcing volume offline

No data corruption occurs but after each occurrence the volume is salvaged. After the 16th automatic salvage the volume is taken offline until there is manual intervention by an administrator.

This bug, a file descriptor leak, was introduced in v0.184 as a side effect of one of the fixes for the libyfs_vol reference counting errors.

The final issue is the on-going problems that some customers have experienced with Linux clients either with the shell reporting:

shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

or "mount --bind" failing with

mount: mount(2) failed: No such file or directory

or

$ cd /afs/example.com/
$ ls -al /proc/self/cwd

generates

/proc/self/cwd -> /afs/example.com (deleted)

The symptom occurs when a Linux dentry (directory entry) object ends up in an unhashed state although it is referenced by an inode.
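
As a rough diagnostic aid, the "(deleted)" marker shown above can be detected from user space by reading the /proc/self/cwd symbolic link. The sketch below is only an illustration: the helper names are hypothetical, and the same marker also appears when a directory really has been removed, so a positive result is a hint rather than proof of an unhashed dentry.

```python
import os

# Suffix the Linux kernel appends to the link target of an unlinked
# or unhashed directory entry.
DELETED_SUFFIX = " (deleted)"


def looks_unhashed(link_target):
    """Return True if a /proc/<pid>/cwd link target carries the marker."""
    return link_target.endswith(DELETED_SUFFIX)


def check_cwd(proc_link="/proc/self/cwd"):
    """Return True/False for the current process, or None if unreadable."""
    try:
        target = os.readlink(proc_link)
    except OSError:
        # Not on Linux, or /proc is unavailable.
        return None
    return looks_unhashed(target)


if __name__ == "__main__":
    print(check_cwd())
```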

Since v0.180 AuriStor has revised code paths to improve error code reporting and avoid race conditions that can generate this behavior. Apparently, there are still additional conditions that have yet to be identified. v0.188 includes a band-aid whereby an unhashed dentry will be rehashed when needed. However, AuriStor is still trying to find and address the root cause. Therefore the AuriStorFS kernel module will log a warning when a dentry is rehashed.

29 May 2019 - AuriStor File System v0.186 released

Changes since v0.184 are primarily focused on the UNIX/Linux cache manager and fixing operational issues reported in the volserver and fileserver. The v0.184 release implemented major changes to the UNIX/Linux cache manager. This release fixes bugs introduced in v0.184 and missed edge cases. It also continues the refactoring of internal interfaces to propagate error codes and signals to userland applications. VolserLog messages related to the volserver transaction lifecycle have been thoroughly revamped. Additional reliability improvements within RX are included.

New Platforms:

Red Hat Enterprise 8

xfs reflinks (requires new filesystem) changes AuriStorFS StoreData RPC copy-on-write performance from O(filelength) to O(write length).

significant improvements in udp performance compared to rhel7.

extended berkeley packet filters provides for fairer distribution of rx call processing across multiple rx listener threads.

Fedora 30

Unix CM:

v0.184 moved the /etc/yfs/cmstate.dat file to /var/yfs. With this change afsd would fail to start if /etc/yfs/cmstate.dat exists but contains invalid state information. This is fixed.

v0.184 introduced a potential deadlock during directory processing. This is fixed.

Many sites have noticed that clients with v0.184 installed might log Lost contact with xxxx server ... referencing a strange negative error code and that fileservers might log FetchData Write failure ... errors from any Linux client version.
These errors might correlate to corruption of pages in the Linux page cache. The corruption is that one or more contiguous pages might be inappropriately zero filled.
This release implements many code changes intended to prevent Linux page cache and AFS disk cache corruption.

Better data version checks

More invalidation of cache chunk data version when zapping

Only zero fill pages past the server end of file

Always advance RPC stream pointer when skipping over missing pages or when populating pages from the disk cache chunk.

Never match a data version number equal to -1.

Avoid truncation races between find_get_page() and page locking.

Some sites have experienced failures of Linux mount --bind of /afs paths or getcwd returning ENOENT. This release fixes a dentry race that can produce an unhashed directory entry.
Some uses of the directory will continue to work, as the first lookup following the race will associate a new dentry with the inode, as an additional alias. Directories are not supposed to have aliases on Linux, so the vfs code assumes that d_alias is at most a list of 1 element, and accesses the entry in a slightly different way in a few places. Some sites get the new hashed dentry, others get the original unhashed one.

Propagate EINTR and ERESTARTSYS during location server queries to userland.

Handle common error table errors obtained outside an afs_Analyze loop. Map VL errors to ENODEV and RX, RXKAD, RXGK errors to ETIMEDOUT

Log all server down and server up events. Transition events from server probes failed to log messages.

Avoid leaking local errors to the fileserver if a failure occurs during Direct IO processing.

RX RPC networking:

If the RPC initiator successfully completes a call without consuming all of the response data fail the call by sending an RX_PROTOCOL_ERROR ABORT to the acceptor and returning a new error, RX_CALL_PREMATURE_END, to the initiator.
Prior to this change failure to consume all of the response data would be silently ignored by the initiator and the acceptor might resend the unconsumed data until any idle timeout expired. The default idle timeout is 60 seconds.

Avoid event cancellation race with rx call termination during process shutdown. This race when lost can prevent a process such as vos from terminating after successfully completing its work.

Avoid transmitting ABORT, CHALLENGE, and RESPONSE packets with an uninitialized sequence number. The sequence number is ignored for these packets but set it to zero.

Volserver:

Frequent issuance of "vos listvol" commands can no longer interfere with volume transaction idle timeout processing.

Since IBM AFS 3.5 the volserver has logged transaction status every 30 seconds to the VolserLog. In v0.184 the volserver logs the following lifecycle messages at level 0:

trans id on volume id is older than s seconds

trans id on volume id has timed out

trans id on volume id has been idle for more than s seconds

On a busy volserver these messages can flood the VolserLog.
This change raises the level of messages 1 and 3 to 125 and introduces a new "Created trans id on volume id" message logged at level 5.
With this change level 0 logs unexpected termination of each transaction. Level 125 will include the 30 second updates for sites that require them.
The partition, volume parentid and transaction iflags fields have been added to each log message.
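
The effect of the new levels can be pictured with a small threshold filter. This is a sketch under an assumption not stated in the note, namely that a message is written when its level is less than or equal to the configured log level; the table and function names are invented for the example and the message texts are abbreviated.

```python
# Levels as described for this release (message text abbreviated).
LIFECYCLE_LEVELS = {
    "Created trans id on volume id": 5,
    "trans id on volume id is older than s seconds": 125,
    "trans id on volume id has timed out": 0,
    "trans id on volume id has been idle for more than s seconds": 125,
}


def messages_logged_at(configured_level, levels=LIFECYCLE_LEVELS):
    """Return the lifecycle messages emitted at the configured log level.

    Assumption: a message is logged when its level <= configured_level.
    """
    return [msg for msg, lvl in levels.items() if lvl <= configured_level]


for level in (0, 5, 125):
    print(level, messages_logged_at(level))
```

Under that assumption, level 0 shows only the timeout message, level 5 adds transaction creation, and level 125 restores the periodic age and idle updates.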

RPCs issued by vos listvol will no longer block in the volserver if the requested volume requires salvaging. The volume attachment retries can block the salvageserver from acquiring an exclusive volume lock resulting in a salvage failure and a soft-deadlock. From now on the vos listvol command will fail immediately.

Fileserver:

If the vice partition's backing store is unmounted or otherwise becomes unavailable the fileserver could terminate unexpectedly due to a segmentation fault. Beginning with this release the fileserver will survive but all requests for objects stored on the missing vice partition will fail.

Introduce the ability to configure random error injection during FetchStatus, FetchData, and StoreData RPC processing.

Add File IDs to "FetchData Write Failure" FileLog messages.

Ubik services:

This release introduces the ability to configure a separate debug log level for ubik than for the application service. By default, when the "ubik_debug" level is unspecified or set to zero, the application's log level determines which "ubik: " log entries are written to the log.
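
The fallback rule described above can be written as a one-line decision. The sketch is illustrative only: the function name is hypothetical, and the assumption that a non-zero "ubik_debug" value fully replaces the application level for "ubik: " entries is an interpretation of the note rather than documented behavior.

```python
def effective_ubik_level(app_level, ubik_debug=None):
    """Pick the log level that governs "ubik: " log entries.

    Unspecified (None) or zero falls back to the application's level.
    Assumption: any other value overrides the application level.
    """
    if ubik_debug is None or ubik_debug == 0:
        return app_level
    return ubik_debug


print(effective_ubik_level(5))        # falls back to the application level
print(effective_ubik_level(5, 125))   # separate ubik level in effect
```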

26 March 2019 - AuriStor File System v0.184 released

Security improvements include volserver validation of destination volserver security policies prior to transmitting marshaled volume data. Prior to v0.184 the volservers were trusted to reject volumes whose security policy could not be enforced. Linux cache managers can no longer be keyed with rxkad tokens. This release also introduces a PAM module capable of managing tokens for AuriStorFS, Linux kernel kAFS, or both.

The UNIX Cache Manager underwent major revisions to improve the end user experience by revealing more error codes, improving directory cache efficiency, and overall resiliency. The cache manager implementation was redesigned to be more compatible with operating systems such as Linux and macOS that support restartable system calls. With these changes errors such as "Operation not permitted", "No space left on device", "Quota exceeded", and "Interrupted system call" can be reliably reported to applications. Previously such errors might have been converted to "I/O error". These changes are expected to reduce the likelihood of "mount --bind" and getcwd failures on Linux with "No such file or directory" errors.

A potentially serious race condition and reference counting error in the vol package shared by the Fileserver and Volserver could prevent volumes from being detached which in turn could prevent the Fileserver and Volserver from shutting down. After 30 minutes the BOSServer would terminate both processes. The reference counting errors could also prevent a volserver from marshaling volume data for backups, releases, or migrations.

This release moves the location of the cache manager's cmstate.dat from /etc/yfs/ to /var/yfs/ or /var/lib/yfs depending upon the operating system. The cmstate.dat file stores the cache manager's persistent UUID which must be unique. The cmstate.dat file must not be replicated. If virtual machines are cloned the cmstate.dat must be removed. The cmstate.dat file must not be managed by a configuration management system.

The release includes two new vos command options:

* "vos addsite -force"
* "vos listvol -id "

Finally, this release includes a Linux PAM module as well as support for the Amazon Linux 2 distribution and many more quality and performance improvements.

2 January 2019 - AuriStor, Inc. sponsors 2nd Linux Kernel AFS Hackathon and BoF at USENIX Vault'19

AuriStor, Inc. is pleased to sponsor and invite AFS and Linux kernel developers to the second Linux kernel AFS (kAFS) Hackathon and Birds of a Feather meeting. The hackathon and BoF will be co-located with the USENIX Vault '19 - Linux Storage and Filesystems Conference. Read more...

15 November 2018 - AuriStor File System v0.180 released

This release improves RX call reliability across network paths with a high degree of packet loss and/or round trip times larger than 60ms. The corrected bugs have been present in all IBM derived RX implementations dating back to the mid 90s. The impact of these bugs is an increased risk of timeouts and performance degradation for long lived calls over high latency network paths that periodically experience packet loss. Volume operations such as moves, releases, backups and restores over WAN connections are particularly susceptible due to the amount of data transmitted in each RPC.

One feature change is experimental support for RX windows larger than 255 packets (360KB). This release extends the RX flow control state machine to support windows larger than the Selective Acknowledgment table. The new maximum of 65535 packets (90MB) could theoretically fill a 100 gbit/second pipe provided that the packet allocator and packet queue management strategies could keep up.
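
The window sizes quoted above, and the claim about filling a 100 gbit/second pipe, follow from simple bandwidth-delay arithmetic. The sketch below assumes 1444 bytes of data per RX packet, a value chosen because it reproduces the sizes quoted in these notes; the function names are invented for the example.

```python
PAYLOAD_BYTES = 1444  # assumed data bytes per RX packet (see lead-in)


def window_bytes(packets, payload=PAYLOAD_BYTES):
    """Data carried by one full window of packets."""
    return packets * payload


def max_rtt_to_fill(packets, link_bits_per_sec, payload=PAYLOAD_BYTES):
    """Largest round trip time (seconds) at which a single window still
    covers the link's bandwidth-delay product."""
    return window_bytes(packets, payload) * 8 / link_bits_per_sec


print(window_bytes(255))                      # 368220 bytes, about 360 KB
print(window_bytes(65535) / 2**20)            # about 90 MB
print(max_rtt_to_fill(65535, 100e9) * 1000)   # RTT budget in milliseconds
```

With these assumptions a 65535-packet window can keep a 100 gbit/second path busy only while the round trip time stays below roughly 7.6 ms, which is why the note describes the figure as theoretical.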

A change to volume use statistics tracking when volumes are moved, copied, and restored. The AFS volume dump stream format which is used for volume archives and volume transfers can store the daily and weekly vnode access counts but none of the other extended volume statistics maintained by the fileserver. When a volume is moved it makes sense for the use counts to be migrated with the volume to the new location. When a volume is copied it makes sense that the new location should start its counters from zero instead of the values collected at the location that was used as the source. Finally, when restoring a volume or releasing a new snapshot of a volume to readonly or backup sites, the use counts should remain unaltered. Beginning with this release, when AuriStorFS v0.178 or later "vos" is used in combination with an AuriStorFS v0.178 or later destination "volserver" the desired use count management will take place. At the moment the weekly access counts are only accessible when using the "vos examine -format" switch.
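
The three cases described above amount to a small decision table. The following sketch only restates that policy in code; the function name, the operation labels, and the dictionary representation of the counters are hypothetical and do not reflect the actual vos or volserver interfaces.

```python
def dest_use_counts(operation, source_counts, existing_dest_counts):
    """Return the use counts the destination volume should hold.

    move             -> counts migrate with the volume
    copy             -> new location starts counting from zero
    restore, release -> destination counts are left unaltered
    """
    if operation == "move":
        return dict(source_counts)
    if operation == "copy":
        return {name: 0 for name in source_counts}
    if operation in ("restore", "release"):
        return dict(existing_dest_counts)
    raise ValueError("unknown operation: %r" % (operation,))


src = {"daily": 120, "weekly": 800}
dst = {"daily": 7, "weekly": 40}
for op in ("move", "copy", "release"):
    print(op, dest_use_counts(op, src, dst))
```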

Many more quality and performance improvements

22 Oct 2018 - AuriStor File System v0.177 released

AuriStor's RX implementation has undergone a major upgrade of its flow control model. Prior implementations were based on TCP Reno Congestion Control as documented in RFC5681; and SACK behavior that was loosely modelled on RFC2018. The new RX state machine implements SACK based loss recovery as documented in RFC6675, with elements of New Reno from RFC6582 on top of TCP-style congestion control elements as documented in RFC5681. The new RX also implements RFC2861 style congestion window validation.

When sending data the RX peer implementing these changes will be more likely to sustain the maximum available throughput while at the same time improving fairness towards competing network data flows. The improved estimation of available pipe capacity permits an increase in the default maximum window size from 60 packets (84.6 KB) to 128 packets (180.5 KB). The larger window size increases the per call theoretical maximum throughput on a 1ms RTT link from 693 mbit/sec to 1478 mbit/sec and on a 30ms RTT link from 23.1 mbit/sec to 49.29 mbit/sec.
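
The per call limits above come from the fact that at most one window of data can be in flight per round trip, so throughput is bounded by window size divided by RTT. A minimal sketch of that estimate follows, assuming 1444 data bytes per packet (inferred from the 60 packets = 84.6 KB figure) and ignoring header overhead and loss; the function name is invented for the example.

```python
def max_throughput_mbit(window_packets, rtt_seconds, payload_bytes=1444):
    """Window-limited throughput in megabits per second.

    Assumption: payload_bytes is the data carried per RX packet.
    """
    bits_per_window = window_packets * payload_bytes * 8
    return bits_per_window / rtt_seconds / 1e6


for window in (60, 128):
    for rtt_ms in (1, 30):
        rate = max_throughput_mbit(window, rtt_ms / 1000.0)
        print("%3d packets, %2d ms RTT: %.2f mbit/sec" % (window, rtt_ms, rate))
```

The same function shows why a larger window matters most on long paths: doubling the RTT halves the achievable rate for a fixed window.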

Workarounds for an IBM AFS and OpenAFS RX header userStatus field information leakage bug. This bug inadvertently interferes with the RX service upgrade mechanism that permits AuriStorFS clients (including Linux kafs) and services to detect each other without undesirable timeouts or extra round trips.

When an affected IBM or OpenAFS cache manager (or fileserver) establishes a connection to an AuriStorFS server the bug can result in an unintentional RX service upgrade. For example, if a pre-v0.175 fileserver incorrectly upgraded an incoming RX connection from RXAFS to RXYFS, it would mistakenly believe the client offered the RXYFSCB callback service; which it doesn't. The failure to establish a successful connection to the RXYFSCB service would cause the fileserver to reject the client's RXAFS requests with a VBUSY error.

A fileserver is supposed to be able to serve data from a .readonly or .backup volume while the volserver is dumping or forwarding the volume contents. This functionality introduced in IBM AFS 3.3 was fatally broken in AuriStorFS v0.157 when the volume disk interface was overhauled to avoid data corruption. Then starting with v0.168 "vos release" failed to terminate the volume transaction used to clone the RW volume to the RO site on the same server. Attempts to read from volumes that were exclusively in-use by the volserver would return VOFFLINE (106) errors.

As of AuriStorFS v0.175 release .readonly and .backup volumes can once again be attached to fileservers while a "vos release" or "vos dump" command is in process. Since some of the fixed defects were in "vos" and others in the fileserver both "vos" and the fileserver must be updated to v0.175 to ensure correct behavior.

A major security model change to the Backup Tape Controller (butc), backup coordinator command, and the backup service to address OPENAFS-SA-2018-001.txt

Starting with v0.175 butc supports:

yfs-rxgk and rxkad authentication

AES256-CTS-HMAC-SHA1-96 or 56-bit fcrypt wire encryption

super user authorization

auditing of all remote procedure call requests

The new security model is incompatible with the existing "backup" and "butc" processes. The new "butc" always executes using "localauth" credentials just as any other cell service does; it can no longer be executed using tokens obtained via aklog.

The butc service will by default require all incoming RPCs to be authenticated as a super user either via use of -localauth credentials or end user identities found in the UserListExt or ReaderList bosserver configuration.

As a side effect of these changes, both backup and butc gain IPv6 support.

As the new security model is incompatible with the existing deployed butc and backup processes, the 0.175 version includes configuration knobs to force the use of the old security model for backward compatibility. Use of these knobs restores the privilege escalation vulnerability. Please contact AuriStorFS support if your site requires use of this configuration.

New data input validation improvements within the vlserver and volserver. These changes ensure that the vlserver cannot store volume location records referencing invalid fileservers or volume site parameters; and that the volserver cannot forward volume data to volservers that are not registered with the cell's location service.

The same validation has been added to vos to ensure that it cannot be instructed to violate cell constraints.

Correct Linux disk cache management to support AppArmor sandboxes.

Many more quality and performance improvements.

24 Sep 2018 - AuriStor File System supports Apple macOS Mojave (10.14)

In conjunction with Apple's release of macOS Mojave (10.14) to the general public, AuriStor announces the release of AuriStorFS v0.174 for macOS Mojave. Both AuriStorFS clients and servers can be installed.

22 Jun 2018 - AuriStor File System v0.174 released

AuriStor announces the release of AuriStor File System v0.174. In addition to the usual mix of bug fixes and functionality improvements, the v0.174 release includes a very special gift: A new x86_64 assembly language implementation of the AES256-CTS-HMAC-SHA1-96 encryption algorithm for Linux and macOS. This implementation leverages the following Intel processor extensions (when available):

Advanced Encryption Standard New Instructions (AES-NI)

Streaming Single Instruction Multiple Data (SSE, SSE2, SSSE3, SSE4)

Advanced Vector Instructions (AVX, AVX2)

Originally intended for use by the Linux kernel module, the AuriStor implementation of AES256-CTS-HMAC-SHA1-96 is 2.4 times faster than OpenSSL and Apple's Common Crypto assembly language implementations. As a result, AuriStor has decided to leverage its implementation exclusively on Linux and macOS for servers, and administration tools.

The AuriStor assembly language implementation is ten times faster than the C language implementation used by previous releases of the AuriStorFS cache manager on x86_64 Linux.

On processors that implement AES-NI and AVX2 the performance cost of yfs-rxgk integrity protected and encrypted connections compared to rxnull unprotected connections is expected to be minimal. The Intel Core i5-4250U CPU @ 1.30GHz (Haswell 22nm), a low-end consumer processor, can compute (encrypt, sign, verify, decrypt) better than 217,000 yfs-rxgk packets per second (or 2.3 Gbit/second) per core.

A 20-core server class processor with 10 cores dedicated to Rx listener threads and 10 cores remaining for application service threads (where cryptographic operations are performed) can saturate dual-bonded 10gbit/second network interfaces with yfs-rxgk protected traffic.

One customer compared "vos release" of a small volume storing 10GB in 5000 files and directories between v0.167 and v0.173 on RHEL 6.9 x86_64. The customer observed:

a 24% reduction in clock time to complete the operation

a 100% increase in the peak number of packets sent per second

The reductions in processor time per packet result in reduced per-packet latency and an increased capacity to scale the number of simultaneous RPCs per file server, volume server, location server and protection server.

The incentive for sites to migrate from the 1980s rxkad to yfs-rxgk is greater than ever.

27 Apr 2018 - AuriStor File System v0.170 released

v0.170 is primarily a performance improvement release. AuriStor RX v0.170 is the world's first implementation capable of transferring more than 5.5TB per call. For the first time in AFS history, volumes larger than 5.5TB can be moved, replicated, backed up and restored. v0.170 includes Meltdown and Spectre optimizations for UBIK services reducing by more than 50% the number of syscalls required to process UBIK requests. The v0.170 release includes more than 400 changes compared to v0.168. v0.169 was not publicly released.

6 Mar 2018 - AuriStor File System v0.168 released - Critical Update

v0.168 is a critical bug fix release addressing a fileserver denial of service vulnerability [CVE-2018-7444] and a client side bug in fs setacl -negative which generates more permissive access control lists than intended [CVE-2018-7168]. The v0.168 fileserver provides cell administrators the ability to prevent clients incorporating the bug from storing ACLs. This release also adds support for Red Hat Enterprise Linux 7.5 kernels and includes optimizations to reduce the impact of Meltdown and Spectre mitigations. v0.168 also includes major improvements to the volume transaction lifecycle. Interrupted or failed transactions no longer require cell administrators to manually clean up temporary volumes. The v0.168 release includes nearly 400 changes compared to v0.167.

7 Dec 2017 - AuriStor File System v0.167 released - Critical Update

v0.167 is a critical bug fix release addressing a denial of service vulnerability [CVE-2017-17432] in all services and clients. This release also adds support for the forthcoming Linux 4.15 kernel and two new vos subcommands, movesite and copysite.

13 Nov 2017 - AuriStor File System v0.164 released

v0.164 is a bug fix and performance release. This release includes a major rewrite of core cache manager I/O pathways on Linux supporting direct I/O, cache bypass, and read-ahead. This release includes additional improvements and bug fixes to UBIK beyond those shipped in v0.163 to support mixed OpenAFS and AuriStorFS deployments.

15 October 2017 - AuriStor File System v0.163 released

v0.163 contains major updates to the UBIK database replication protocol implementation that increase resiliency to peer communication failures and permit sites to mix IBM/OpenAFS and AuriStorFS servers without introducing single points of failures. These changes combined with those included in v0.162 simplify the migration from OpenAFS to AuriStorFS. v0.163 introduces Linux 4.14 kernel support. File server detection of and protection against unresponsive cache manager callback service implementations.

6 October 2017 - AuriStor File System v0.162 released

v0.162 contains AFS3-compatibility changes for the UBIK database replication protocol that permit AuriStorFS servers to be deployed in IBM AFS and OpenAFS cells without configuring them as clones. First release with Fedora 27 support. Bug fixes and on-going improvements.

24 September 2017 - AuriStor File System v0.160 released

v0.160 introduces macOS High Sierra and Apple File System support. On Linux, exporting the /afs file namespace via Linux nfsd using NFS2, NFS3, and NFS4 is now supported. Reduced memory utilization by the RX networking stack. File server workaround for deadlocks that are known to occur within IBM AFS and OpenAFS Unix cache managers. Bug fixes and general improvements.

8 August 2017 - AuriStor File System v0.159 released

v0.159 introduces the "vos eachfs" command. Linux 4.13 kernel support. Continued performance improvements and bug fixes.

12 July 2017 - AuriStor File System v0.157 released

Linux 4.12 kernel support. Fedora 26 support. Fileserver support for XFS and BTRFS reflinks for improved vice partition copy-on-write performance. Volserver and "vos" support for quotas larger than 2TB. Linux cache manager performance enhancements to address parallel workflows. macOS fix for Orpheus' Lyre. On-going bug fixing and improvements.

12 July 2017 - AuriStorFS vs "Orpheus' Lyre Puts Kerberos to Sleep" bug

Nico Williams, Viktor Dukhovni and Jeffrey Altman announced the discovery of the "Orpheus' Lyre puts Kerberos to Sleep" bug:

As the name suggests, this implementation flaw can result in a failure of Kerberos mutual authentication. Kerberos is supposed to provide a secure method of network authentication impervious to man-in-the-middle attacks. Fortunately, the protocol is secure, but a mistake made by many implementations permits an attacker to successfully perform service impersonation and, in conjunction with credential delegation (ticket forwarding), client impersonation. The attack is silent and cannot be detected.

This is a client-side vulnerability, so it must be fixed by patching the client systems. Systems that have more than one Kerberos implementation installed must obtain patches for all of the implementations to be secure.

The MIT Kerberos implementation was never vulnerable. As patches for other implementations become available the https://www.orpheus-lyre.info/ site will be updated to indicate that.

Yesterday Microsoft issued patches and those should in my opinion be treated as critical with minimal delays before deployment. Heimdal also issued a patch which is included in version 7.4.

AuriStorFS bundles Heimdal when the local operating system's Kerberos and GSS-API cannot satisfy its requirements. The affected platforms include:

Apple MacOS (all versions)

Solaris (all versions)

Microsoft Windows (all versions)

Apple iOS (all versions)

6 July 2017 - OpenAFS 1.6.21 bug fix improves throughput of data sent from OpenAFS to AuriStorFS

This 1.6.21 release of OpenAFS includes a fix to Rx which improves the performance of Rx connections between OpenAFS and AuriStorFS when the OpenAFS peer is writing bulk data to the AuriStorFS peer. Examples include:

OpenAFS cache manager issuing RXAFS_StoreData calls to AuriStorFS file servers

OpenAFS volserver forwarding volume data to an AuriStorFS volserver

OpenAFS volserver dumping volume data to an AuriStorFS "vos dump"

OpenAFS "vos restore" restoring volume data to an AuriStorFS volserver

There are of course other scenarios involving backups, bulk vlserver queries, etc.

The fix avoids the introduction of 100ms delays as the AuriStor Rx peer attempts to re-open a call window which had been closed due to a lack of receive buffers while waiting for the incoming data to be consumed.

4 April 2017 - AuriStor File System v0.150 released

New features include support for IBM TSM in the AuriStorFS Backup Tape Controllers. "vos eachvol" enhancements. Faster "pts examine" performance. Automated salvager repair of corrupted volume vnode index file entries. New "vos status" command provides more informative volserver transaction status output including bytes sent and received for each call when both "vos" and "volserver" are v0.150 or above. Linux 4.11 kernel support. "fs" commands now support the -nofollow switch. Many bug fixes and reliability improvements.

20 March 2017 - AuriStor supports kAFS development with Linux Vault Hackathon

This Wednesday AuriStor, Inc. is sponsoring a Hackathon and BOF in support of kAFS and AF_RXRPC development at the Linux Foundation's annual Vault conference.

What are kAFS and AF_RXRPC?

AF_RXRPC is an implementation of the Rx RPC protocol implemented in the Linux mainline network stack as a socket family accessible both to userland and in-kernel processes.

kAFS is an implementation of the AFS and AuriStorFS file system client in the Linux mainline kernel.

Why are AF_RXRPC and kAFS important?

The AFS file system namespace has been available on Linux as a third party add-on since the IBM days. The IBM AFS derived implementations suffer from:

performance limitations due to the existence of a global lock to protect internal data structures

license incompatibility with GPL_ONLY licensed kernel functionality that further restricts performance and functional capabilities

In addition, out of tree file system modules:

are not a standard component of most Linux distributions thereby preventing ubiquitous access to the /afs file system namespace

are not kept in sync with core filesystem and network layer changes in the Linux kernel by the developers responsible for those changes

Collectively, these issues increase the hurdles to use of the /afs file system namespace.

Organizations must be careful not to deploy new Linux kernel versions until such time as updated AFS or AuriStorFS kernel modules are developed and distributed.

Organizations cannot obtain the full benefit of the latest hardware, whether that be hardware support for advanced cryptographic operations, patented processes such as RCU, or other techniques that can scale file system access to tens or hundreds of CPU cores.

Lack of common distribution complicates the use of the /afs file namespace in support of Linux container based deployments as most organizations are unwilling or unable to deploy custom kernels across their internal and cloud (AWS, Azure, ...) infrastructures.

What is the History of kAFS and AF_RXRPC?

David Howells began work on an in-tree AFS client for Linux circa 2001. Unlike the IBM AFS derived cache manager, David's implementation is not a monolithic file system and proprietary network stack designed for portability across operating systems. Instead, David's AFS client is designed as separate modular components that are integrated into Linux to maximize their usefulness not only for AFS but for a broader class of applications:

Instead of implementing Rx as a proprietary component of the AFS file system, David added Rx as a native socket family integrated with the Linux kernel networking stack at the same layer as UDP and TCP processing. This produces noticeable reductions in packet processing overhead. At the same time, the Rx RPC protocol becomes readily available as a lightweight secure RPC for userland applications. As a demonstration of how easy it is to use, David implemented much of the AFS administration command suite (bos, pts, vos) in Python by combining a Python XDR class with AF_RXRPC network sockets.

Instead of AFS Process Authentication Groups (PAGs), David designed the Linux Keyrings which are now a core Linux component used in support of many file systems and network identity solutions.

David developed the FS-Cache file system caching layer which is used in support of NFS* and CIFS file systems.

David's kAFS is the AFS and AuriStorFS specific file system functionality including the callback services.
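To give a feel for why a Python XDR class is enough to build RPC tooling, here is a minimal sketch of XDR encoding for two basic types. It is not David's code and says nothing about the argument layout of any AFS RPC; the function names `xdr_uint` and `xdr_string` are hypothetical. It relies only on the general XDR rules that an unsigned integer is four big-endian bytes and a string is a length followed by its bytes, zero-padded to a multiple of four.

```python
import struct


def xdr_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as four big-endian bytes."""
    return struct.pack(">I", value)


def xdr_string(text: str) -> bytes:
    """Encode a string as length + bytes, zero-padded to a 4-byte boundary."""
    data = text.encode("utf-8")
    padding = (-len(data)) % 4          # 0..3 pad bytes
    return xdr_uint(len(data)) + data + b"\x00" * padding


encoded = xdr_string("afs")             # 4-byte length, 3 data bytes, 1 pad byte
```

A request body is then just the concatenation of such encoded fields, which is what makes a small XDR helper plus a network socket a workable basis for command line tools.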

Unfortunately, David Howells has received minimal support from the AFS user community. As a result, neither AF_RXRPC nor kAFS has been included in any major Linux distribution.

Why is AuriStor, Inc. contributing?

AuriStor, Inc. has invested substantial resources into its AuriStorFS Linux client in support of Red Hat Enterprise Linux, Fedora, CentOS, Debian and Ubuntu and will continue to do so. Yet, AuriStor, Inc. recognizes that widespread adoption of AuriStorFS servers for large scale Enterprise and Research deployments requires higher performance, greater scale and easier maintenance for Linux systems.

AuriStor, Inc. also recognizes that many of the workflows that have relied upon the /afs file namespace for software and configuration distribution are migrating to containers. That transition is not without its own challenges related to the management of Container identity and authentication to persistent network based resources. AuriStor, Inc. believes that the global /afs file namespace combined with the AuriStor Security Model (combined identity authentication and multi-factor constrained elevation authorization) is best suited to addressing the outstanding Container deployment issues.

AuriStor, Inc. believes that only through a native in-tree client can these issues be addressed.

What is AuriStor, Inc. contributing?

AuriStor, Inc. has leveraged its expertise and extensive quality assurance infrastructure to identify flaws in the AF_RXRPC and kAFS implementations. Over the last year hundreds of corrections and enhancements have been merged into the Linux mainline tree. Missing functionality has been identified and is being implemented one piece at a time.

As kAFS approaches production readiness AuriStor, Inc. will contribute native AuriStorFS client support including an implementation of the "yfs-rxgk" security class to the AF_RXRPC socket family.

It is our hope that by the end of 2017 kAFS and AF_RXRPC will be ready for inclusion in major Linux distributions side-by-side with NFS* and CIFS.

AuriStor, Inc. is also working with major players in the Container eco-system and the Linux Foundation to address the identity management problem. When successful, it will be possible to launch Containers with network credentials such as Kerberos tickets and AFS/AuriStorFS tokens managed by the host. AuriStor, Inc. believes that this functionality combined with the /afs file namespace will allow true portability of Containerized processes across private and public cloud infrastructures.

27 December 2016 - AuriStor File System v0.145 released

v0.145 is the latest in the on-going efforts to improve the fileserver's ability to perform volume operations when the volume is under heavy load and to recover when the unexpected happens.

A STORY:

One of the strengths of the /afs model is the ability to move, release, backup, and dump volumes while they are being accessed by clients under production loads. For example, the following scenario should in theory be handled without a hiccup:

Take two file servers each with at least one vice partition.

Create a volume V on fs1/a

Add RO sites on fs1/a and fs2/a

On at least two clients execute "iozone -Rac" in separate directories of volume V using the RW path

On at least one client start a loop that lists one of the directories with stat info that the iozone test is writing to from V.backup.

On at least one client start a loop that lists one of the directories with stat info that the iozone test is writing to from V.readonly.

Repeat the following process in a tight loop

"vos backup V"

"vos release V"

"vos move V fs2 a"

"vos backup V"

"vos release V"

"vos move V fs1 a"

Organizations do not actively operate their cells in this fashion but the expectation is that "if they did, it should work." Of course, the answer is "it didn't before v0.145". Why not, and where did it fail?

For those of you that are unfamiliar with the fileserver architecture, here is the process list as reported by the bosserver for a fileserver:
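For readers who want to try the scenario, the tight loop listed above can be scripted. The following is a minimal Python sketch, not AuriStor's test harness. The command strings are copied from the list in their abbreviated form (a real "vos" invocation will need whatever arguments your cell requires), and `cycle_commands`, `run_command` and `run_cycles` are hypothetical helper names. The runner is injectable so the sequence can be inspected without touching a cell.

```python
import shlex
import subprocess
from typing import Callable, List


def cycle_commands(volume: str = "V") -> List[str]:
    """Return one round trip of the stress loop, in the order given in the text."""
    return [
        f"vos backup {volume}",
        f"vos release {volume}",
        f"vos move {volume} fs2 a",   # abbreviated form, as written in the text
        f"vos backup {volume}",
        f"vos release {volume}",
        f"vos move {volume} fs1 a",   # move back to the original site
    ]


def run_command(command: str) -> int:
    """Run one command and return its exit status (needs a real 'vos' binary)."""
    return subprocess.run(shlex.split(command)).returncode


def run_cycles(cycles: int, runner: Callable[[str], int] = run_command) -> int:
    """Run the round trip 'cycles' times, stopping at the first failing command.

    Returns the number of commands that completed successfully.
    """
    completed = 0
    for _ in range(cycles):
        for command in cycle_commands():
            if runner(command) != 0:
                return completed
            completed += 1
    return completed
```

Passing a stub such as `runner=lambda command: 0` performs a dry run; the default runner executes each command and stops the loop as soon as one fails.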

[C:\]bos status great-lakes.auristor.com -long
Instance dafs, (type is dafs) currently running normally.
    Auxiliary status is: file server running.
    Process last started at Sat Dec 24 12:19:03 2016 (3 proc starts)
    Command 1 is '/usr/libexec/yfs/fileserver'
    Command 2 is '/usr/libexec/yfs/volserver'
    Command 3 is '/usr/libexec/yfs/salvageserver'
    Command 4 is '/usr/libexec/yfs/salvager'

The first three commands execute a set of dependent processes. The fileserver process:

Registers the fileserver with the VL service and communicates with the PT service

Processes all requests from AFS clients (aka cache managers) via the RXAFS and RXYFS Rx network services.

Maintains a cache of all volume headers

Is the exclusive owner of all volumes. The volserver and salvageserver processes request readonly or exclusive access to volumes from the fileserver

Issues requests to the salvageserver to perform consistency checks and repair volumes when a problem is detected with the volume headers, the on-disk data, or other.

The goal is to ensure that a volume is available to the fileserver process as long as there are active requests.

Each "vos" command is implemented by one or more VL and VOL RPCs. The VOL RPCs are processed by the volserver. The volserver can:

Submit a query to the fileserver to obtain the necessary data to satisfy the request

Request readonly access to a RW volume which can be used to produce a new clone (for .readonly or .backup or .roclone)

Request exclusive access to a RW, RO, BK or an entire volume group

Request a volume be salvaged by the salvageserver

When the salvageserver is asked to salvage a volume it requests exclusive access to the volume group from the fileserver. There are a lot of moving parts.
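One way to picture these moving parts is as a check-out protocol in which the fileserver is the sole arbiter of volume access. The Python sketch below is a toy model built only from the description above. The class and method names (`FileserverOwner`, `request`, `release`, `available_to_clients`) are hypothetical, and the rules that readonly holders may share a volume and that clients are blocked only by an exclusive holder are assumptions, not a description of the real fileserver interfaces.

```python
from typing import Dict, List, Tuple


class FileserverOwner:
    """Toy model: the fileserver owns every volume and lends out access."""

    def __init__(self) -> None:
        # volume name -> list of (holder, mode) grants currently outstanding
        self._grants: Dict[str, List[Tuple[str, str]]] = {}

    def request(self, holder: str, volume: str, mode: str) -> bool:
        """Grant 'readonly' or 'exclusive' access if it does not conflict."""
        if mode not in ("readonly", "exclusive"):
            raise ValueError(f"unknown mode: {mode}")
        grants = self._grants.setdefault(volume, [])
        if any(m == "exclusive" for _, m in grants):
            return False              # already held exclusively by someone
        if mode == "exclusive" and grants:
            return False              # cannot go exclusive while lent out
        grants.append((holder, mode))
        return True

    def release(self, holder: str, volume: str) -> None:
        """Hand every grant the holder has on the volume back to the fileserver."""
        self._grants[volume] = [
            g for g in self._grants.get(volume, []) if g[0] != holder
        ]

    def available_to_clients(self, volume: str) -> bool:
        """Assumption: clients are served unless the volume is held exclusively."""
        return not any(m == "exclusive" for _, m in self._grants.get(volume, []))
```

In this model a volserver holding readonly access (for a clone) leaves the volume available to clients, while a salvageserver or volserver holding exclusive access takes it away from them until `release` is called.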

For each of the "vos backup", "vos release" and "vos move" commands the volserver will exercise different combinations of query, readonly and exclusive access to volumes. The various modes of client requests to the fileserver:

reading from the .backup

reading from the .readonly

writing to the RW

Together, these client request modes produce contention between the volservers and the fileservers for control of the volume.

The expected behavior is that a volume will be offline for the shortest amount of time possible, that the client will retry the request in a timely manner, and, most importantly, that the client will not treat a temporary outage as a fatal error.

What could go wrong?

For starters, the client retry algorithm is quite poor. Whenever a volume is busy or offline the client will sleep for 15 seconds. In other words, the client will block for an eternity and when it finally does retry it's likely to find the volserver with exclusive ownership of the volume and sleep again for another 15 seconds. The v0.145 release does not fix the client behavior. That will come in a future release but for testing purposes we removed all of the sleeps from the clients and forced immediate retries in order to maximize the contention.

Once sufficient contention was generated, we observed that after two round trips of the volume (moving from fs1/a to fs2/a to fs1/a to fs2/a and back to fs1/a), the volume would go offline and not come back. We noticed that the fileserver would be asked to put the volume into an online state but would immediately request a salvage. The salvageserver would verify the state, ask the fileserver to put the volume into service, but the fileserver would immediately request another salvage. This would repeat a dozen times before the volume would be taken offline permanently. Attempts to manually salvage the volume would succeed but the volume would still not return to an online state. The failure was 100% reproducible.

For those of you that remember IBM AFS and OpenAFS 1.4.x and earlier, there was no on-demand attachment or salvaging of volumes. The fileserver was much simpler. The fileserver forced a salvage of all volumes at startup and then attached them. The volserver always obtained exclusive access to volumes. The fileserver never cached any volume metadata. If something went wrong with a volume, it was taken offline until an administrator intervened.
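The cost of a fixed 15 second sleep is easy to see in a small simulation. The sketch below is illustrative only: the function name `time_to_success` and the outage schedule are made up, and the one second alternative is a hypothetical value, not the delay used by any client release.

```python
from typing import List, Tuple


def time_to_success(offline_windows: List[Tuple[float, float]],
                    retry_delay: float,
                    max_attempts: int = 1000) -> float:
    """Return the simulated time at which a request first finds the volume online.

    offline_windows holds half-open intervals [start, end) during which the
    volume is busy or offline. The client first tries at t=0 and, after each
    failure, sleeps retry_delay seconds before trying again.
    """
    now = 0.0
    for _ in range(max_attempts):
        if not any(start <= now < end for start, end in offline_windows):
            return now
        now += retry_delay
    raise RuntimeError("volume never found online")


# Hypothetical schedule: short outages at 0-2s, 14-17s and 29-31s.
windows = [(0.0, 2.0), (14.0, 17.0), (29.0, 31.0)]
slow = time_to_success(windows, retry_delay=15.0)   # fixed 15 second sleep
fast = time_to_success(windows, retry_delay=1.0)    # shorter, hypothetical delay
```

With this schedule the 15 second client keeps waking up inside the next short outage and only succeeds at t=45, while the client with the shorter delay succeeds at t=2, as soon as the first outage ends.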

The struggles of the last few months have all been directly attributable to the changes introduced as part of the demand attach functionality introduced in OpenAFS 1.6.x. We have encountered deadlocks, on-disk volume metadata corruption, copy-on-write data corruption, salvager failures, and now fileserver volume metadata cache corruption.

After two round-trips of volume movement while the RW, RO, and BK are all under active usage, the fileserver's volume group cache would end up out of sync with the volserver. The fileserver would believe the volume was still owned by the volserver when it wasn't. The fileserver would request the salvageserver to verify the volume state but nothing it could do would make a difference.

As of v0.145 the handoff from volserver to fileserver has been corrected. Now the volume can be backed up, released, and moved continuously while each of the volume types are actively accessed. No salvages. No VNOVOL errors. The iozone benchmark will continue to function even if the fileserver or volserver processes are periodically killed.
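The failure mode and its correction can be pictured with a toy state model. This is a sketch under stated assumptions, not the fileserver's actual logic: `VolumeState`, `salvage` and `bring_online` are hypothetical names, the limit of 12 salvage attempts stands in for "a dozen times", and the fix is reduced to "the cache learns that the volserver gave the volume back".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class VolumeState:
    on_disk_ok: bool = True           # what a salvage finds on disk
    cached_owner: str = "volserver"   # fileserver's cached view of the owner


def salvage(vol: VolumeState) -> bool:
    """Verify on-disk state. This toy salvage never touches the fileserver cache."""
    return vol.on_disk_ok


def bring_online(vol: VolumeState, handoff_fixed: bool,
                 max_salvages: int = 12) -> Tuple[str, int]:
    """Return the final state ('online' or 'offline') and the salvage count."""
    if handoff_fixed:
        # Corrected handoff: the cache records that the volume was returned.
        vol.cached_owner = "fileserver"
    salvages = 0
    while vol.cached_owner != "fileserver":
        if salvages == max_salvages:
            return "offline", salvages   # give up and leave the volume offline
        salvage(vol)                      # succeeds, but the stale entry remains
        salvages += 1
    return "online", salvages
```

With the stale cache entry, `bring_online(VolumeState(), handoff_fixed=False)` burns through every salvage attempt and ends offline even though nothing is wrong on disk. With the handoff corrected, the volume comes online without a single salvage.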

With a modified client to reduce the delays after receiving a VBUSY or VOFFLINE error, the iozone benchmark continues to function although at a lower rate of throughput. There are periodic pauses when large copy-on-write operations need to be performed. Which brings me to our New Year's resolution.

In 2017, AuriStor aims to remove the copy-on-write delays. The COW delays are due to the need to copy the entire backing file each time a file is modified after a volume clone is created. On Linux distributions that include "xfs reflinks" support for vice partitions AuriStor fileservers will be able to complete COW operations in constant time without regard to the length of the file being modified. This change coupled with client-side improvements to the retry algorithms will significantly reduce the performance hit after volume clone operations.
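A rough cost model shows why the file length matters so much. The sketch below is a back-of-the-envelope estimate, not a measurement: the function name `cow_delay_seconds`, the 10 GiB file and the 500 MiB/s copy bandwidth are all hypothetical, and the reflink case is modeled as zero data copied, ignoring the small constant metadata cost.

```python
def cow_delay_seconds(file_size: int, copy_bandwidth: float, reflink: bool) -> float:
    """Estimate the delay before the first write to a cloned file can proceed.

    Without reflinks the whole backing file is copied, whatever the size of
    the write. With reflinks the model assumes the blocks are shared and no
    file data is copied up front.
    """
    if reflink:
        return 0.0
    return file_size / copy_bandwidth


GIB = 1024 ** 3
MIB = 1024 ** 2
full_copy = cow_delay_seconds(10 * GIB, 500 * MIB, reflink=False)   # about 20.5 s
shared = cow_delay_seconds(10 * GIB, 500 * MIB, reflink=True)       # 0.0 in this model
```

Under these assumptions, changing a single byte of a 10 GiB file right after a clone stalls the writer for roughly 20 seconds without reflinks, and the stall grows linearly with file size. With reflinks the delay no longer depends on the file length.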

Have a Happy New Year! The AuriStor team is looking forward to an excellent year.

30 November 2016 - OpenAFS Security Advisory OPENAFS-SA-2016-003

Today OpenAFS announced Security Advisory OPENAFS-SA-2016-003 and released 1.6.20 which is an urgent security release for all versions of OpenAFS. IBM AFS cache managers and all OpenAFS Windows clients are also affected. There is no update to the OpenAFS Windows client.

AuriStor File System clients and servers do not experience this information leakage. However, volumes migrated to AuriStorFS file servers from OpenAFS or IBM AFS file servers will retain the information leakage.

In addition to the impact described in the announcement it is worth noting that all backups and any archived dump files will contain information leakage. Restoring a backup or dump file containing information leakage will restore that leaked information to the file servers where it will be delivered to cache managers.

Salvaging restored volumes with the -salvagedirs option is required to purge the information leakage.

It is worth emphasizing that IBM AFS and OpenAFS volserver operations including all backup operations occur in the clear. Therefore, all leaked information will be visible to passive viewers on the network segments across which volume backups and moves occur.
