AFS Dump Stream Format
Jeffrey Altman, AuriStor, Inc.
14 November 2023

Summary
=======

The AFS Dump Stream is a marshaling format for the contents of an AFS
volume. A Dump can encode all of the metadata and data necessary to
reproduce the contents of an AFS Volume including its Vnodes.
Alternatively, a Dump can include only the Volume metadata and data that
changed during a specified time period: "from" time through "to" time.
Multiple Dumps for the same Volume can be merged to form a new Dump that
stores the metadata and data for multiple time periods.

The AFS Dump Stream is used in the following scenarios to convey the
contents of AFS Volumes between processes.

 . AFS Volume Server to Volume Server transfers via AFSVolRestore RPC in
   response to "vos move", "vos copy", "vos release", etc.

 . AFS Volume Server to "vos dump" command via AFSVolDump or
   AFSVolDumpV2 RPCs. "vos dump" can write the stream to a file which
   can be archived or can write the stream to stdout for processing by
   another tool.

 . AFS Volume Server to AFS Backup Tape Controller (butc) via AFSVolDump
   or AFSVolDumpV2 RPCs which writes the Dump stream to tape, to a file,
   or to an object store.

 . "vos restore" command to AFS Volume Server via AFSVolRestore RPC. The
   input to "vos restore" can be a dump file obtained via "vos dump" or
   can be generated by another tool.

 . AFS Backup Tape Controller (butc) to AFS Volume Server via
   AFSVolRestore RPC which reads the Dump stream from tape, from a file,
   or from an object store and restores the Volume.

The following tools can generate or operate on the contents of dump
files:

 . The "voldump" command can generate a dump file by reading a volume's
   object store located on a vice partition.

 . The "genvol" command can import the contents of a directory tree into
   a dump file which can be restored using "vos restore".

 . The "dumptool" command allows the user to enter an interactive mode
   to navigate and extract content from a volume dump.

 . The "afsdump_extract" command can be used to extract the vnode object
   data from a dump file.

Dump Format
===========

An AFS Dump stream consists of Header tags, implicit data, sub-tags and
sub-tag data. A tag is an octet with a value between 0x01 and 0x7f.

A Header tag is an octet with a value between 0x01 and 0x14. Header tags
between 0x01 and 0x04 are legacy tags:

 * DUMPHEADER (0x01)
 * VOLUMEHEADER (0x02)
 * VNODE (0x03)
 * DUMPEND (0x04)

The DUMPHEADER and VNODE tags are followed by a structured value. The
VOLUMEHEADER and DUMPEND tags are dataless.

Header tags from 0x05 to 0x14 are followed by a Length and a Value
(TLV).

Tags in the range 0x15 through 0x7d are Sub-tags. Each Header tag has
its own namespace of Sub-tags. There are four categories of sub-tags:

 . Tags defined before October 2009 are Legacy tags. The Value following
   the tag is specific to the Header tag, sub-tag combination. A Dump
   stream containing a Legacy tag cannot be successfully parsed if the
   tag is not recognized.

 . Standard or Non-Legacy tags can be parsed even if they are not
   recognized. The format of each tag or sub-tag falls into one of these
   categories:

   - Sub-tags between 0x15 and 0x60 are Tag-Length-Value (TLV) sub-tags
     which are followed by a Length and a Value.

   - Sub-tags between 0x61 and 0x7a are Standard sub-tags which are
     followed by a 32-bit value in network byte order.

   - Sub-tags between 0x7b and 0x7d are Dataless sub-tags which are not
     followed by any Value.

The Dataless tag 0x7e indicates that the subsequent tag or sub-tag is
CRITICAL and must not be skipped if its meaning is unrecognized or
unsupported.

The Dataless sub-tag 0x7f is reserved for future standardization.

The tag 0x00 is reserved as an invalid value and must not be used.

A dump stream containing unrecognized Legacy tags cannot be parsed
because it is impossible to determine when the unrecognized tag's data
has been consumed. Non-legacy tags can be parsed when unrecognized
provided they do not have indefinite length.
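
The octet ranges described above can be summarized as a small
classifier. This is only a sketch: the function name and the category
labels are hypothetical, and the ranges describe the non-legacy parsing
rules only. A Legacy sub-tag (for example 'n' in D_DUMPHEADER) keeps its
own format regardless of the range its octet falls in.

```python
def classify_tag(octet: int) -> str:
    """Map a tag octet to the category implied by its numeric range.

    Hypothetical helper: the labels are illustrative and Legacy sub-tags
    must still be looked up per Header tag before trusting the range.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("a tag is a single octet")
    if octet == 0x00 or octet > 0x7F:
        return "invalid"            # 0x00 is reserved; tags end at 0x7f
    if octet <= 0x04:
        return "legacy-header"      # D_DUMPHEADER .. D_DUMPEND
    if octet <= 0x14:
        return "tlv-header"         # unregistered header tags
    if octet <= 0x60:
        return "tlv-subtag"         # followed by Length and Value
    if octet <= 0x7A:
        return "standard-subtag"    # followed by a 32-bit NBO value
    if octet <= 0x7D:
        return "dataless-subtag"    # no Value follows
    if octet == 0x7E:
        return "critical"           # marks the next tag as CRITICAL
    return "reserved"               # 0x7f


if __name__ == "__main__":
    for value in (0x01, 0x15, 0x61, 0x7B, 0x7E):
        print(hex(value), classify_tag(value))
```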

TLV Length Determination
------------------------

A TLV Length has variable size depending upon the value of the first
octet:

 . If the first octet is less than or equal to 0x7f, then the Length of
   the Value that follows is that number of octets. For example, if the
   first octet is 0x08 then the length of the Value is eight octets.

 . If the first octet is 0x80, then the Value that follows is of
   indefinite length. Parsing of the Value must determine when the end
   of the Value has been reached. If the format is unknown, then parsing
   of the dump stream must fail. An example of an indefinite length
   Value is a C-string which is known to be terminated by a NUL octet.

 . If the first octet 'L' is between 0x81 and 0x88, the length is the
   value of the next (L & 0x0f) octets combined in most significant byte
   (MSB) order.

 . If the first octet is greater than 0x88, it is an invalid length and
   the parsing of the dump stream must fail.

Dump Tag Registry
=================

The Dump Tag Registry is maintained at https://registrar.central.org.

Header Tags
===========

There are four registered Header tags, each of which is Legacy:

    0x01  D_DUMPHEADER
    0x02  D_VOLUMEHEADER
    0x03  D_VNODE
    0x04  D_DUMPEND

The unregistered header tags 0x05 to 0x14 are Tag-Length-Value as
described in the previous section. Unregistered header tags can be
preceded by the CRITICAL tag.

D_DUMPHEADER (The Dump Header Tag)
==================================

The D_DUMPHEADER tag (0x01) is a Legacy tag followed by two unsigned
32-bit integers in network byte order:

    DUMPBEGINMAGIC  0xB3A11322
    DUMPVERSION     0x00000001

If either of the constants does not match, the parsing of the dump
stream must fail.
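
The length rules under "TLV Length Determination" and the fixed prefix
of D_DUMPHEADER can be sketched as follows. The function names, the use
of None to signal an indefinite length, and the choice of ValueError for
failures are assumptions made for illustration; they are not taken from
any existing implementation.

```python
import struct

DUMPBEGINMAGIC = 0xB3A11322
DUMPVERSION = 0x00000001


def read_tlv_length(buf: bytes, pos: int):
    """Decode a TLV Length starting at buf[pos].

    Returns (length, next_pos). length is None for an indefinite
    length (first octet 0x80). Hypothetical helper.
    """
    first = buf[pos]
    pos += 1
    if first <= 0x7F:
        return first, pos                  # short form: the octet is the length
    if first == 0x80:
        return None, pos                   # indefinite length
    if first <= 0x88:
        count = first & 0x0F               # 1..8 length octets follow
        if pos + count > len(buf):
            raise ValueError("truncated TLV length")
        return int.from_bytes(buf[pos:pos + count], "big"), pos + count
    raise ValueError("invalid TLV length octet 0x%02x" % first)


def read_dump_header_prefix(buf: bytes, pos: int = 0) -> int:
    """Check the D_DUMPHEADER tag, magic and version; return next offset."""
    if buf[pos] != 0x01:
        raise ValueError("stream does not start with D_DUMPHEADER")
    magic, version = struct.unpack_from(">II", buf, pos + 1)
    if magic != DUMPBEGINMAGIC or version != DUMPVERSION:
        raise ValueError("bad dump magic or version")
    return pos + 9                         # 1 tag octet + two 32-bit values


if __name__ == "__main__":
    print(read_tlv_length(b"\x82\x01\x00", 0))
    header = bytes([0x01]) + struct.pack(">II", DUMPBEGINMAGIC, DUMPVERSION)
    print(read_dump_header_prefix(header))
```

A caller that receives None for the length has to know the Value's
format (for example a NUL-terminated C-string) or stop parsing.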

After DUMPBEGINMAGIC and DUMPVERSION are some of the following sub-tags:

    21  0x15  64-bit Volume ID [TLV]
    22  0x16  100ns Dump Times [TLV]
   'n'  0x6e  Volume Name [Legacy: C-String]
   't'  0x74  32-bit Dump Times [Legacy]
   'v'  0x76  32-bit Volume ID [Legacy]

Dump Header Volume ID
---------------------

Each dump stream must specify the Volume ID of the volume. The legacy
Volume ID sub-tag 0x76 ('v') is followed by an unsigned 32-bit value in
network byte order. The TLV sub-tag 0x15 represents a 64-bit value. To
ensure backward compatibility sub-tag 0x76 ('v') should be used whenever
the Volume ID fits within 32 bits. Otherwise, sub-tag 0x15 should be
used preceded by the CRITICAL tag (0x7e).

Each dump stream must include the time range(s) the dump covers. The
time ranges are specified using the legacy 0x74 ('t') sub-tag and/or the
0x16 sub-tag.

Dump Header Time Ranges
-----------------------

There must be at least one timestamp range present in the dump stream.
Most dump streams contain a single timestamp range and one
D_VOLUMEHEADER.

If the initial "from" timestamp is UNIX Epoch then the dump stream is a
"full dump" containing all of the metadata and data up until the "to"
timestamp.

If the initial "from" timestamp is later than UNIX Epoch, the dump
stream is an "incremental dump" which does not include the metadata and
data for all volume objects.

Dump streams can be merged by combining the contents between the
D_DUMPHEADER and D_DUMPEND tags of two dump streams. The D_DUMPHEADER of
the merged dump stream must contain each of the "from".."to" ranges from
the merged dumps. The ranges must be in sequential order.

The timestamp range can be specified either using the Legacy 't' (0x74)
sub-tag or the (0x16) TLV sub-tag.

The legacy 't' (0x74) sub-tag is followed by an unsigned 16-bit network
byte order 'Count' whose valid range is one to fifty (1..50). Following
the 'Count' are the specified number of "from" and "to" timestamp pairs.
Each timestamp is an unsigned 32-bit network byte order value
representing the number of seconds since UNIX Epoch.

The 0x16 TLV sub-tag is followed by pairs of 64-bit timestamps
representing the number of 100ns units since UNIX Epoch. Each 64-bit
100ns timestamp is stored as the high 32 bits in network byte order
followed by the low 32 bits in network byte order. The number of
timestamp range pairs is computed as the TLV Length divided by 16. If
the 0x16 sub-tag is sent in addition to the 0x74 sub-tag, it is sent
without a preceding CRITICAL sub-tag. If both 0x16 and 0x74 ('t')
sub-tags are present and recognized, then the 0x16 sub-tag takes
precedence.

To maximize backward compatibility, timestamps should be included using
both sub-tags 0x16 and 0x74 ('t'). Implementations that do not
understand sub-tag 0x16 will ignore it and obtain timestamp information
from sub-tag 0x74 ('t'). Implementations that understand sub-tag 0x16
should ignore sub-tag 0x74 ('t').

Dump Header Volume Name
-----------------------

Each dump stream may include an optional volume name via the legacy
sub-tag 0x6e ('n'). The name data length is indefinite. The name must be
a NUL-terminated sequence of octets.

The maximum volume name C-string supported by IBM AFS and OpenAFS is
32 octets including the NUL terminator. The maximum volume name C-string
supported by AuriStorFS is 512 octets including the NUL terminator. IBM
AFS and OpenAFS will truncate the name if the provided name is longer
than is supported.

Note that after a volume is restored to a Volume Server, the volume name
is descriptive and is never used to look up a Volume. All lookups are
performed by the Volume ID.

End of D_DUMPHEADER tag
-----------------------

The D_DUMPHEADER tag is complete when the next Dump Header tag
(0x02..0x14) is consumed.
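
As an illustration of the 0x16 Dump Times layout described above, the
following sketch decodes the Value of that sub-tag (the octets after the
TLV Length) into "from".."to" ranges and applies the full versus
incremental rule. The function names are hypothetical, and rejecting a
Value whose size is not a multiple of 16 is an assumption rather than a
stated requirement.

```python
import struct


def parse_dump_times_100ns(value: bytes):
    """Decode the Value of D_DUMPHEADER sub-tag 0x16.

    Returns a list of (from, to) pairs in 100ns units since UNIX Epoch.
    Each timestamp is hi (32-bit NBO) followed by lo (32-bit NBO).
    """
    if len(value) == 0 or len(value) % 16 != 0:
        # Assumption: at least one range, and no partial range allowed.
        raise ValueError("dump times must be a whole number of 16-octet pairs")
    ranges = []
    for offset in range(0, len(value), 16):
        from_hi, from_lo, to_hi, to_lo = struct.unpack_from(">IIII", value, offset)
        ranges.append(((from_hi << 32) | from_lo, (to_hi << 32) | to_lo))
    return ranges


def is_full_dump(ranges) -> bool:
    """A dump is "full" when the initial "from" timestamp is UNIX Epoch."""
    return ranges[0][0] == 0


if __name__ == "__main__":
    value = struct.pack(">IIII", 0, 0, 1, 2)
    times = parse_dump_times_100ns(value)
    print(times, is_full_dump(times))
```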

D_VOLUMEHEADER (The Volume Header Tag)
======================================

The D_VOLUMEHEADER tag (0x02) is a Legacy tag that has no untagged data
and is followed by at least one of these sub-tags:

    21  0x15  64-bit Volume IDs [TLV]
    22  0x16  Volume Maximum Access Control List (AuriStorFS) [TLV]
    23  0x17  Volume Security Levels (AuriStorFS) [TLV]
    24  0x18  64-bit Volume Maximum Quota [TLV]
    25  0x19  64-bit Volume Disk Usage [TLV]
    26  0x1a  100ns Volume Timestamps [TLV]
    27  0x1b  64-bit Volume Features (AuriStorFS) [TLV]
    28  0x1c  64-bit Volume Owner [TLV]
    29  0x1d  64-bit Volume Minimum Quota [TLV]
    30  0x1e  64-bit Volume File Count [TLV]
   'A'  0x41  32-bit Last Access Date (Legacy)
   'B'  0x42  32-bit Backup Date (Legacy)
   'C'  0x43  32-bit Creation Date (Legacy)
   'D'  0x44  32-bit Day Use Date (Legacy) [deprecated]
   'E'  0x45  32-bit Expiration Date (Legacy)
   'F'  0x46  OSD Policy (Legacy: 32-bit) [deprecated - replaced by 'P']
   'M'  0x4d  MOTD (Legacy: C-String)
   'O'  0x4f  Offline Message (Legacy: C-String)
   'P'  0x50  OSD Policy (Legacy: 32-bit) [deprecated - replaced by 'y']
   'U'  0x55  32-bit Update Date (Legacy)
   'V'  0x56  32-bit Volume Update Counter (Legacy)
   'W'  0x57  Week Use Statistics (Legacy: structure) [deprecated]
   'Z'  0x5a  Day Use Statistics (Legacy) [deprecated]
   'a'  0x61  Account Number (Standard) [deprecated]
   'b'  0x62  Blessed Flag (Legacy: 8-bit)
   'c'  0x63  32-bit Clone Id (Standard)
   'd'  0x64  32-bit Disk Usage (Standard)
   'f'  0x66  32-bit File Count (Standard)
   'i'  0x69  32-bit Volume Id (Standard)
   'm'  0x6d  32-bit Volume Minimum Quota (Standard)
   'n'  0x6e  Volume Name (Legacy: C-String)
   'o'  0x6f  32-bit Volume Owner (Standard)
   'p'  0x70  32-bit Parent Volume Id (Standard)
   'q'  0x71  32-bit Volume Maximum Quota (Standard)
   'r'  0x72  OSD maximum number of files (Standard)
   's'  0x73  In-Service Flag (Legacy: 8-bit)
   't'  0x74  Volume Type (Legacy: 8-bit)
   'u'  0x75  32-bit Volume Next Uniquifier (Standard)
   'v'  0x76  32-bit Volume Stamp Version (Standard) [deprecated]
   'y'  0x79  OSD Policy (Standard)

Volume ID
----------

Each
D_VOLUMEHEADER must specify the Volume ID of the volume which must match
the Volume ID specified in the D_DUMPHEADER. The legacy Volume ID
sub-tag 0x69 ('i') is followed by an unsigned 32-bit value in network
byte order. The TLV sub-tag 0x15 represents a 64-bit value. To ensure
backward compatibility sub-tag 0x69 ('i') should be used whenever the
Volume ID fits within 32 bits. Otherwise, sub-tag 0x15 should be used
preceded by the CRITICAL tag (0x7e).

Volume Name
-----------

Each D_VOLUMEHEADER may include an optional volume name via the legacy
sub-tag 0x6e ('n'). The name data length is indefinite. The name must be
a NUL-terminated sequence of octets.

The maximum volume name C-string supported by the IBM AFS and OpenAFS
Volume Server is 32 octets including the terminating NUL. The maximum
volume name C-string supported by AuriStorFS is 512 octets including the
terminating NUL. When a Volume Server restores a dump, it will truncate
the provided name if it is longer than is supported.

Note that after a volume is restored to a Volume Server, the volume name
is descriptive and is never used to look up a Volume. All lookups are
performed using the Volume ID.

End of D_VOLUMEHEADER Tag
-------------------------

The D_VOLUMEHEADER tag is complete when the next Dump Header tag
(0x02..0x14) is consumed, which must be a D_VNODE tag. A volume must
have at least a root directory vnode.

Volume IDs
----------

There are three Volume IDs maintained for each volume that are sent as
Legacy D_VOLUMEHEADER Sub-tags:

    Sub-Tag     Description
    ----------  -------------------------------------------------------
    'i' (0x69)  Volume ID: 32-bit NBO
    'p' (0x70)  Parent Volume ID: 32-bit NBO
    'c' (0x63)  Clone ID: 32-bit NBO

AuriStorFS supports 64-bit Volume IDs. The TLV Sub-tag 21 (0x15) 64-bit
Volume IDs encodes each of the 64-bit IDs.

    1. Volume ID Hi: 32-bit NBO
    2. Volume ID Lo: 32-bit NBO
    3. Parent ID Hi: 32-bit NBO
    4. Parent ID Lo: 32-bit NBO
    5. Clone ID Hi: 32-bit NBO
    6. Clone ID Lo: 32-bit NBO

For backward compatibility Sub-tag 21 (0x15) is only sent if one of the
IDs exceeds MAX_AFS_UINT32 and it is always preceded by the CRITICAL
Tag. If Sub-tag 21 (0x15) is sent, the legacy sub-tags are not sent.

Volume Timestamps
-----------------

There are five timestamps maintained for each volume that are sent as
Legacy D_VOLUMEHEADER Sub-tags:

    Sub-Tag     Description
    ----------  -------------------------------------------------------
    'A' (0x41)  Last Access Time: 32-bit NBO (seconds since UNIX Epoch)
    'U' (0x55)  Last Update Time: 32-bit NBO (seconds since UNIX Epoch)
    'C' (0x43)  Creation Time: 32-bit NBO (seconds since UNIX Epoch)
    'B' (0x42)  Last Backup Time: 32-bit NBO (seconds since UNIX Epoch)
    'E' (0x45)  Expiration Time: 32-bit NBO (seconds since UNIX Epoch)

AuriStorFS servers record timestamps at 100ns granularity. The TLV
sub-tag 26 (0x1a) 100ns Volume Timestamps encodes each of the above
timestamps as 64-bit 100ns units since UNIX Epoch.

     1. Last Access Time Hi: 32-bit NBO
     2. Last Access Time Lo: 32-bit NBO
     3. Last Update Time Hi: 32-bit NBO
     4. Last Update Time Lo: 32-bit NBO
     5. Creation Time Hi: 32-bit NBO
     6. Creation Time Lo: 32-bit NBO
     7. Last Backup Time Hi: 32-bit NBO
     8. Last Backup Time Lo: 32-bit NBO
     9. Expiration Time Hi: 32-bit NBO
    10. Expiration Time Lo: 32-bit NBO

Additional timestamps might be added in the future. Unrecognized
timestamps should be consumed and ignored unless the CRITICAL tag
preceded sub-tag 26 (0x1a).

For backward compatibility both the Legacy sub-tags and sub-tag 26
(0x1a) are sent without a preceding CRITICAL tag. Sub-tag 26 (0x1a)
values should be used if 100ns timestamps are supported.

Volume Owner
------------

The Protection Service ID of the Volume Owner is sent as Legacy
D_VOLUMEHEADER Sub-tag 'o' (0x6f) Volume Owner: 32-bit NBO.

AuriStorFS cells support 64-bit Protection Service IDs. The TLV Sub-tag
28 (0x1c) encodes the Volume Owner as a 64-bit value.

For backward compatibility Sub-tag 28 (0x1c) is only sent if the
absolute value of the Volume Owner exceeds MAX_AFS_INT32 and is always
preceded by the CRITICAL Tag. If Sub-tag 28 (0x1c) is sent, the legacy
sub-tag is not sent.

Volume Maximum Quota (Thin Provisioning)
----------------------------------------

The Volume Maximum Quota is sent as Legacy D_VOLUMEHEADER Sub-tag 'q'
(0x71): 32-bit NBO 1KB units; a maximum of 2TB.

AuriStorFS supports Volume Maximum Quota values up to 16 zettabytes. The
TLV Sub-tag 24 (0x18) encodes the Volume Maximum Quota as a 64-bit value
(1KB units).

For backward compatibility Sub-tag 24 (0x18) is only sent if the Volume
Maximum Quota exceeds MAX_AFS_INT32 and is always preceded by the
CRITICAL Tag. If Sub-tag 24 (0x18) is sent, the legacy sub-tag is not
sent.

Volume Minimum Quota (Thick Provisioning)
-----------------------------------------

The Volume Minimum Quota is sent as Legacy D_VOLUMEHEADER Sub-tag 'm'
(0x6d): 32-bit NBO 1KB units; a maximum of 2TB.

AuriStorFS supports Volume Minimum Quota values up to 16 zettabytes. The
TLV Sub-tag 29 (0x1d) encodes the Volume Minimum Quota as a 64-bit value
(1KB units).

For backward compatibility Sub-tag 29 (0x1d) is only sent if the Volume
Minimum Quota exceeds MAX_AFS_INT32 and is always preceded by the
CRITICAL Tag. If Sub-tag 29 (0x1d) is sent, the legacy sub-tag is not
sent.

Volume File Count
-----------------

The number of Vnodes (directories, files, symlinks) stored in the Volume
is transmitted by Legacy sub-tag 'f' (0x66): 32-bit NBO.

AuriStorFS supports greater than MAX_AFS_UINT32 Vnodes in a Volume. The
TLV Sub-tag 30 (0x1e) 64-bit File Count encodes values that cannot be
represented by the Legacy sub-tag.

For backward compatibility Sub-tag 30 (0x1e) is only sent if the Volume
File Count exceeds MAX_AFS_UINT32 and is always preceded by the CRITICAL
Tag. If Sub-tag 30 (0x1e) is sent, the legacy sub-tag is not sent.
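
The "legacy sub-tag when the value fits, otherwise CRITICAL plus TLV"
pattern used by the sections above can be sketched for the Volume
Maximum Quota. This is a sketch under stated assumptions: the function
name is hypothetical, MAX_AFS_INT32 is taken to be 0x7FFFFFFF, and the
64-bit Value is assumed to be eight octets in network byte order
(equivalent to hi then lo 32-bit words), which the text above does not
spell out for this sub-tag.

```python
import struct

MAX_AFS_INT32 = 0x7FFFFFFF   # assumed value of the constant
CRITICAL = 0x7E
SUBTAG_MAX_QUOTA_32 = 0x71   # 'q'
SUBTAG_MAX_QUOTA_64 = 0x18   # TLV sub-tag 24


def encode_max_quota(quota_kb: int) -> bytes:
    """Encode a Volume Maximum Quota (1KB units) for D_VOLUMEHEADER."""
    if quota_kb < 0 or quota_kb > 0xFFFFFFFFFFFFFFFF:
        raise ValueError("quota does not fit in 64 bits")
    if quota_kb <= MAX_AFS_INT32:
        # Fits the legacy form: tag followed by a 32-bit NBO value.
        return bytes([SUBTAG_MAX_QUOTA_32]) + struct.pack(">I", quota_kb)
    # Too large: CRITICAL tag, TLV tag, one-octet Length of 8, 64-bit Value.
    # Assumption: the Value is a big-endian 64-bit integer.
    return bytes([CRITICAL, SUBTAG_MAX_QUOTA_64, 0x08]) + struct.pack(">Q", quota_kb)


if __name__ == "__main__":
    print(encode_max_quota(5000).hex())
    print(encode_max_quota(MAX_AFS_INT32 + 1).hex())
```

Only one of the two encodings is emitted, matching the rule that the
legacy sub-tag is not sent when the TLV sub-tag is sent.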

Volume Disk Usage
-----------------

The total number of 1KB units in use by the Volume is transmitted by
Legacy sub-tag 'd' (0x64): 32-bit NBO. Transmission of the Disk Usage is
optional.

IBM AFS 3.6 and OpenAFS do not restrict the Volume usage to 2TB and the
disk usage counter wraps negative when the usage exceeds 2TB.

AuriStorFS supports Volume usage up to 16 zettabytes. The TLV Sub-tag 25
(0x19) 64-bit Disk Usage encodes values that cannot be represented by
the Legacy sub-tag.

For backward compatibility Sub-tag 25 (0x19) is only sent if the Volume
Disk Usage exceeds MAX_AFS_INT32. As the value is optional, it is not
preceded by the CRITICAL Tag. If Sub-tag 25 (0x19) is sent, the legacy
sub-tag is not sent.

Offline Message
---------------

The Legacy Sub-tag 'O' (0x4f) encodes an indefinite length C-string
known as the Volume Offline message. This is an optional value.

Volume Type
-----------

The Legacy Sub-tag 't' (0x74) encodes an 8-bit value representing the
Volume Type. Allocated values include:

    RWVOL    0
    ROVOL    1
    BACKVOL  2
    RWREPL   3

Volume Blessed Flag
-------------------

The Legacy Sub-tag 'b' (0x62) encodes an 8-bit boolean value
representing the Volume Blessed state. If true, the Volume is blessed by
the administrator and the Volume can be attached by the fileserver and
volserver. If false, the volume is not blessed and it cannot be
attached. A volume that cannot be attached cannot be accessed by end
users.

Volume In-Service Flag
----------------------

The Legacy Sub-tag 's' (0x73) encodes an 8-bit boolean value
representing the Volume In-Service state. If true, the Volume is
in-service and can be attached by the fileserver. If false, the volume
is out of service and its content cannot be served by the fileserver.

Volume Next Uniquifier
----------------------

The Legacy Sub-tag 'u' (0x75) encodes the next Vnode uniquifier as a
32-bit NBO value. The next uniquifier value is incremented each time a
Vnode is created.
As vnode numbers can be reused after a Vnode is destroyed, the
uniquifier reduces the risk that a cache manager will confuse the new
Vnode with a previously destroyed Vnode sharing the same vnode number.

Volume Maximum ACL
------------------

AuriStorFS supports a Maximum Volume ACL which is used to restrict the
rights that can be granted to users or hosts by Directory/File ACLs. The
TLV Sub-tag 22 (0x16) transmits the XDR encoded ACL data. This Sub-tag
is always preceded by the CRITICAL tag.

Volume Security Levels
----------------------

AuriStorFS supports Volume Security Levels which specify the permitted
Rx Security Classes and Protection Level combinations that may be used
when accessing the Volume content. The TLV Sub-tag 23 (0x17) transmits
pairs of 32-bit NBO Security Class and 32-bit NBO Protection Level. This
Sub-tag is always preceded by the CRITICAL tag. If the receiving server
is not prepared to enforce the specified security levels the restoration
of the volume should be aborted.

Volume servers will confirm that the destination of an AFSVolRestore RPC
will enforce the security levels before initiating the RPC. This is not
always possible when restoring a dump file or from a backup.

Volume Features
---------------

AuriStorFS supports Volume Features which can be used to enable
functionality such as readonly mode, cross-directory hard links,
per-file acls, and large directories. The Sub-tag 27 (0x1b) transmits
the Volume Features configuration. This Sub-tag is always preceded by
the CRITICAL tag.

Other Sub-tags
--------------

The author of this document is not familiar with the OSD Sub-tags or
with the usage of deprecated Sub-tags used by neither OpenAFS nor
AuriStorFS.

D_VNODE (The Vnode Header Tag)
==============================

The D_VNODE tag (0x03) is a Legacy tag that is immediately followed by
two mandatory 32-bit values in network byte order: vnodeNumber and
vnodeUnique.
These fields are followed by zero or more of these sub-tags:

    21  0x15  File Access Control List (TLV) [deprecated]
    22  0x16  100ns Vnode Timestamps (TLV)
    23  0x17  64-bit Vnode Author, Owner, Group IDs (TLV)
    24  0x18  96-bit Vnode Number and Parent Number (TLV)
    25  0x19  64-bit Vnode Data Version (TLV)
    26  0x1a  64-bit Extended Access Control List (TLV)
    27  0x1b  16-bit Directory Type (TLV)
   'A'  0x41  32-bit Access Control List (Legacy: C-String)
   'L'  0x4c  OSD vnode length (hi,lo) (TLV)
   'O'  0x4f  OSD metadata string (TLV)
   'P'  0x50  OSD directory policy index (Legacy: 32-bit)
              [deprecated - replaced by 'd']
   'a'  0x61  32-bit Author ID (Standard)
   'b'  0x62  Unix Mode Bits (Legacy: 16-bit)
   'd'  0x64  OSD directory policy index (Standard)
   'f'  0x66  File Stream - Small (Legacy: structured)
   'g'  0x67  32-bit Group ID (Standard)
   'h'  0x68  File Stream - Large (Legacy: structured)
   'l'  0x6c  Link Count (Legacy: 16-bit)
   'm'  0x6d  32-bit UNIX Modify Time (Standard)
   'o'  0x6f  32-bit Owner ID (Standard)
   'p'  0x70  32-bit Parent Vnode Number (Standard)
   's'  0x73  32-bit Server Modify Time (Standard)
   't'  0x74  Vnode Type (Legacy: 8-bit)
   'u'  0x75  32-bit OSD Last Access Time (Standard) [deprecated]
   'v'  0x76  32-bit Data Version (Standard)
   'x'  0x78  OSD File Online Flag (Standard)
   'y'  0x79  OSD vnode length (Legacy: structure)
              [deprecated - replaced by 'L']
   'z'  0x7a  OSD metadata string (Legacy: C-String)
              [deprecated - replaced by 'O']
   123  0x7b  whiteout file or opaque directory

End of D_VNODE Tag
------------------

The D_VNODE tag is complete when the next Dump Header tag (0x02..0x14)
is consumed. The next tag can be any Dump Header tag other than
D_DUMPHEADER. If the next tag is D_DUMPEND then the dump stream is
complete. If the next tag is D_VOLUMEHEADER then this is a merged dump
stream and the next incremental dump has begun.

Full vs Incremental Volume dumps
--------------------------------

In a "full" dump stream the D_VNODE tag and its sub-tags must represent
all of the metadata and data associated with the Vnode (Volume ID, Vnode
Number and Unique).

In an "incremental" dump stream the D_VNODE tag containing just the
Vnode Number and Unique indicates that the Vnode is present and there
have been no changes during the dump timestamp range (Dump Header
"from".."to").

A D_VNODE in an "incremental" dump stream can omit any sub-tags whose
value is known not to have changed since the beginning of the dump
range. For example, the File Stream is frequently omitted if there has
been no change to the Data Version during the dump time period.

When restoring an "incremental" dump stream, if a D_VNODE is not present
for a previously existing Vnode ID (Number and Unique), then that Vnode
has been deleted from the Volume.

Vnode Number and Parent Number
------------------------------

The D_VNODE tag's embedded Vnode Number field and the Parent Vnode
Number sub-tag 'p' (0x70) are capable of representing 32-bit Vnode
Numbers in network byte order. If either the Vnode Number or Parent
Vnode Number is larger than can be represented in 32 bits, then the
embedded Vnode Number should be set to zero and sub-tag 0x18 preceded by
the CRITICAL sub-tag must be the first sub-tag in the stream. If sub-tag
0x18 is present, the embedded Vnode Number will be ignored.

D_VNODE sub-tag 0x18 ("96-bit Vnode Number and Parent Number") is a TLV
tag. The tag transmits a 96-bit Vnode Number followed by an optional
96-bit Parent Vnode Number. Each 96-bit Vnode Number is transmitted as
hi (32-bit), mid (32-bit), and lo (32-bit); each in network byte order.
If a Parent Number is present, the 0x70 ('p') sub-tag must be ignored.

When cross-directory hard links are unsupported, the Parent Vnode Number
indicates the parent directory in which the Vnode is linked.
This directory is the one whose Access Control List (ACL) will be used.
When cross-directory hard links are in use, there is no well defined
Parent Vnode Number and the Vnode must be assigned its own ACL.

Vnode Type
----------

The Legacy Sub-tag 't' (0x74) encodes an 8-bit value representing the
Vnode Type. Allocated values include:

    1. vFile
    2. vDirectory
    3. vSymlink

AFS Mount Points are stored as Symlink Vnodes with the UNIX mode bits
set to 0644.

Vnode Data Version
------------------

The Legacy Sub-tag 'v' (0x76) Data Version encodes a 32-bit NBO value
representing the Vnode data version. The Data Version is incremented
each time the Vnode's File Stream is modified. The Data Version is used
by caching clients to determine whether or not cached data is current.
When a Vnode is created the Data Version starts at zero, which implies
that the File Stream is empty.

Since IBM AFS 3.3, fileservers and cache managers have supported 64-bit
Data Versions but the Dump Format has only supported 32-bit values. The
TLV Sub-tag 25 (0x19) encodes 64-bit Data Version values as a hi
(32-bit) and lo (32-bit); each in network byte order.

For backward compatibility Sub-tag 25 (0x19) is only sent if the Data
Version number exceeds MAX_AFS_UINT32. Sub-tag 25 (0x19) must be
preceded by the CRITICAL Tag. If Sub-tag 25 (0x19) is sent, the legacy
sub-tag is not sent.

When restoring an "incremental" dump stream, if a D_VNODE does not
include a File Stream the receiver must confirm that the received Data
Version matches the currently stored version. If the Data Version has
changed, the restoration of the dump must fail.

Vnode Data Stream
-----------------

Prior to OpenAFS 1.4.0 the maximum size of a file was limited to
MAX_AFS_INT32 bytes. The Legacy Sub-tag 'f' (0x66) encodes the stream
length as a 32-bit NBO followed by the data stream which is parsed as

    for (i = 0; i < length; i++)
        8-bit : data byte

OpenAFS 1.4.0 introduced 64-bit file lengths.
The Legacy Sub-tag 'h' (0x68) encodes the huge file length as a hi
(32-bit NBO) and lo (32-bit NBO) followed by the data stream which is
parsed as

    length = (hi << 32 | lo)
    for (i = 0; i < length; i++)
        8-bit : data byte

For backward compatibility Sub-tag 'h' (0x68) is only sent whenever the
stream length is greater than MAX_AFS_INT32. If Sub-tag 'h' (0x68) is
sent then Sub-tag 'f' (0x66) must not be sent.

The addition of Sub-tag 'h' (0x68) predates the introduction of the
CRITICAL tag. Therefore, the CRITICAL tag does not precede Sub-tag 'h'
(0x68).

It should be noted that the Legacy Sub-tags 'f' and 'h' are not
restricted to File Vnodes. The Vnode Data Stream Sub-tags are also used
to store the contents of Directory Vnodes and Symlink Vnodes. AFS Mount
Points are stored as a Symlink with UNIX Mode Bits set to 0644.

Vnode Timestamps
----------------

There are two timestamps maintained for each Vnode that can be sent as
Legacy D_VNODE Sub-tags:

    Sub-Tag     Description
    ----------  -------------------------------------------------------
    's' (0x73)  Server Modify Time: 32-bit NBO (seconds since UNIX Epoch)
    'm' (0x6d)  UNIX Modify Time: 32-bit NBO (seconds since UNIX Epoch)

AuriStorFS servers record timestamps at 100ns granularity and maintain
three additional timestamps: Server Modify DV Time, Server Create Time,
and Last Access Time. The TLV sub-tag 22 (0x16) 100ns Vnode Timestamps
encodes each of the above timestamps as 64-bit 100ns units since UNIX
Epoch.

     1. UNIX Modify Time Hi: 32-bit NBO
     2. UNIX Modify Time Lo: 32-bit NBO
     3. Server Modify Time Hi: 32-bit NBO
     4. Server Modify Time Lo: 32-bit NBO
     5. Server Modify DV Time Hi: 32-bit NBO
     6. Server Modify DV Time Lo: 32-bit NBO
     7. Server Create Time Hi: 32-bit NBO
     8. Server Create Time Lo: 32-bit NBO
     9. Last Access Time Hi: 32-bit NBO
    10. Last Access Time Lo: 32-bit NBO

Additional timestamps might be added in the future.
Unrecognized timestamps should be consumed and ignored unless the
CRITICAL tag immediately preceded Sub-tag 22 (0x16).

For backward compatibility both the Legacy sub-tags and sub-tag 22
(0x16) are sent without a preceding CRITICAL tag. Sub-tag 22 (0x16)
values should be used if 100ns timestamps are supported.

Vnode Author, Owner, and Group IDs
----------------------------------

Legacy Sub-tags are sent to represent the Author, Owner and Group
Protection Service IDs assigned to the Vnode.

    Sub-Tag     Description
    ----------  -------------------------------------------------------
    'a' (0x61)  Author ID: signed 32-bit NBO
    'o' (0x6f)  Owner ID: signed 32-bit NBO
    'g' (0x67)  Group ID: signed 32-bit NBO

AuriStorFS cells support 64-bit Protection Service IDs. The TLV Sub-tag
23 (0x17) encodes the Vnode Author, Owner and Group as 64-bit values.

    1. Author Hi: 32-bit NBO
    2. Author Lo: 32-bit NBO
    3. Owner Hi: 32-bit NBO
    4. Owner Lo: 32-bit NBO
    5. Group Hi: 32-bit NBO
    6. Group Lo: 32-bit NBO

For backward compatibility Sub-tag 23 (0x17) is only sent if the
absolute value of one or more of Vnode Author, Owner and Group exceeds
MAX_AFS_INT32, and is always preceded by the CRITICAL Tag. If Sub-tag 23
(0x17) is sent, the Legacy Sub-tags are not sent.

Vnode Unix Mode Bits
--------------------

The Legacy Sub-tag 'b' (0x62) Unix Mode Bits encodes a 16-bit NBO value.
Only 12 bits are stored by IBM AFS, OpenAFS and AuriStorFS servers.

Vnode Link Count
----------------

The Legacy Sub-tag 'l' (0x6c) Link Count encodes a 16-bit NBO value
representing the Vnode link count. This value should be a positive value
as a zero link count would imply that the Vnode had been deleted and a
deleted Vnode should not be represented in the Dump stream.

Vnode ACLs
----------

The Legacy Sub-tag 'A' (0x41) Access Control List (ACL) encodes the
contents of an AFS3 Directory ACL as an indefinite length C-string.

AuriStorFS supports an Extended ACL which is a superset of the AFS ACL.
See https://www.auristor.com/documentation/man/linux/7/auristorfs_acls.html.
TLV Sub-tag 26 (0x1a) Extended ACL conveys the XDR encoding of an
Extended ACL. AuriStorFS supports Extended ACLs on all Vnode types.

For backwards compatibility the Legacy Sub-tag 'A' (0x41) is used
whenever the Vnode type is Directory and the ACL can be represented by
the AFS3 format. In all other cases, the TLV Sub-tag 26 (0x1a) is sent
preceded by the CRITICAL tag.

Vnode Directory Type
--------------------

The TLV Sub-tag 27 (0x1b) encodes the 16-bit Directory Type identifier.
The AFS3 Directory header includes a 16-bit directory type identifier
(1234) stored in network byte order at DirHeader.header.tag. This
sub-tag must be preceded by a CRITICAL tag if the Directory Type
identifier is anything other than (1234). The Directory Type identifiers
(0) and (65535) are reserved as invalid values.

Vnode Whiteout and Opaque directories
-------------------------------------

The Dataless Sub-tag '{' (0x7b) is sent to indicate that a File Vnode is
a whiteout entry or that a Directory Vnode is opaque. This tag should
always be preceded by a CRITICAL tag.

For compatibility with Linux overlayfs there should only be one whiteout
vnode per volume. AuriStorFS allocates the whiteout vnode with a
uniquifier value of 0. Whiteouts can only be used when cross-directory
hard links and per-file acls are supported.

Vnode OSD Sub-tags
------------------

The author of this document is not familiar with OSD Sub-tags.

D_DUMPEND (The Dump End Tag)
============================

The D_DUMPEND Tag (0x04) is a Legacy Dataless Tag. D_DUMPEND must be
present at the end of a Dump Stream to indicate that the stream has not
been truncated.

Working with Merged Dump streams
================================

A merged dump stream combines two or more dump streams for the same
volume into a single dump stream that can be restored using a single
RPC.
For example, consider a full dump taken on 31 May 2023 and an
incremental dump taken on 1 June 2023. The full dump is

    D_DUMPHEADER
      't' UNIX Epoch .. 31 May 2023
    D_VOLUMEHEADER
    D_VNODE 1.1
    D_VNODE 2.2
    D_VNODE 4.3
    ...
    D_VNODE 88.2342
    D_DUMPEND

and the incremental is

    D_DUMPHEADER
      't' 31 May 2023 .. 1 June 2023
    D_VOLUMEHEADER
    D_VNODE 1.1
    D_VNODE 4.3
    ...
    D_VNODE 88.2342
    D_DUMPEND

where Vnode 2.2 was deleted. The merged dump stream becomes

    D_DUMPHEADER
      't' UNIX Epoch .. 31 May 2023,
          31 May 2023 .. 1 June 2023
    D_VOLUMEHEADER
    D_VNODE 1.1
    D_VNODE 2.2
    D_VNODE 4.3
    ...
    D_VNODE 88.2342
    D_VOLUMEHEADER
    D_VNODE 1.1
    D_VNODE 4.3
    ...
    D_VNODE 88.2342
    D_DUMPEND

When using D_DUMPHEADER Sub-tag 't' (0x74) there is a limit of 50 dump
streams that can be merged together. When using Sub-tag 22 (0x16)
there is no limit on the number of dump streams that can be merged.

Non-standard Legacy Tags and Sub-Tags
=====================================

Tags and Sub-tags defined before October 2009 are Legacy tags. This
section lists the Tags and Sub-Tags that are not compliant with the
parsing rules adopted in October 2009.

D_DUMPHEADER:        /* BeginMagic: 32-bit NBO, Version: 32-bit NBO */
D_VOLUMEHEADER:      /* Dataless */
D_VNODE:             /* Vnode ID: 32-bit NBO, Unique: 32-bit NBO */
D_DUMPEND:           /* Dataless: End of Stream */

D_DUMPHEADER:   'n'  /* volume name: C-string */
D_DUMPHEADER:   't'  /* 16-bit NBO Count,
                        (Count / 2) x (from: 32-bit NBO, to: 32-bit NBO) */

D_VOLUMEHEADER: 'A'  /* V_accessDate: 32-bit NBO */
D_VOLUMEHEADER: 'C'  /* V_creationDate: 32-bit NBO */
D_VOLUMEHEADER: 'D'  /* V_dayUseDate: 32-bit NBO */
D_VOLUMEHEADER: 'E'  /* V_expirationDate: 32-bit NBO */
D_VOLUMEHEADER: 'F'  /* OSD Policy: 32-bit NBO */
D_VOLUMEHEADER: 'M'  /* nullstring (motd): C-String */
D_VOLUMEHEADER: 'O'  /* V_offlineMessage: C-String */
D_VOLUMEHEADER: 'P'  /* OSD Policy: 32-bit NBO */
D_VOLUMEHEADER: 'U'  /* V_updateDate: 32-bit NBO */
D_VOLUMEHEADER: 'V'  /* volUpdateCounter: 32-bit NBO */
D_VOLUMEHEADER: 'W'  /* V_weekUse: 16-bit NBO Count, Count x 32-bit NBO */
D_VOLUMEHEADER: 'Z'  /* V_dayUse: 32-bit NBO */
D_VOLUMEHEADER: 'b'  /* V_blessed: 8-bit */
D_VOLUMEHEADER: 'n'  /* V_name: C-string */
D_VOLUMEHEADER: 's'  /* V_inService: 8-bit */
D_VOLUMEHEADER: 't'  /* V_type: 8-bit */

D_VNODE:        'A'  /* VVnodeDiskACL: C-string */
D_VNODE:        'P'  /* OSD Directory Policy Index: 32-bit NBO */
D_VNODE:        'b'  /* modeBits: 16-bit NBO */
D_VNODE:        'f'  /* small file: 32-bit NBO Count, Count x 8-bit */
D_VNODE:        'h'  /* large file: 32-bit NBO Hi, 32-bit NBO Lo,
                        (Hi << 32 | Lo) x 8-bit */
D_VNODE:        'l'  /* linkcount: 16-bit NBO */
D_VNODE:        't'  /* type: 8-bit */
D_VNODE:        'y'  /* OSD vnode length: 32-bit NBO Hi, 32-bit NBO Lo */
D_VNODE:        'z'  /* OSD metadata string: C-string */

History
=======

1. The AFS Dump Stream format predates Transarc AFS 3.0 with the first
   implementation dating to 1987.

    D_DUMPHEADER
    {
      D_VOLUMEHEADER
      D_VNODE (zero or more)
    } (one or more)
    D_DUMPEND

    A dump is a "full" dump if the initial "from" time is UNIX Epoch
    (0). Otherwise, the dump is an "incremental".
    A dump that contains more than one pair of time ranges and
    D_VOLUMEHEADER tags is referred to as a "merged dump".

    DumpHeader (D_DUMPHEADER = 1)
      Mandatory untagged fields:
        32-bit - DUMPBEGINMAGIC = 0xB3A11322
        32-bit - DUMPVERSION = 1
      Optional tagged fields:
        'n': C-String - volume name (truncation permitted)
        't': structure - dump range (from, to) pairs
             16-bit - timeCount [max 50 time pairs]
             for (i = 0; i < timeCount; i++)
               32-bit - time [seconds since UNIX Epoch]
        'v': 32-bit - volume id

    VolumeHeader (D_VOLUMEHEADER = 2)
      No mandatory untagged data
      Optional tagged fields:
        'a': 32-bit - accountNumber
        'b': 8-bit - blessed flag
        'c': 32-bit - clone volume id
        'd': 32-bit - disk usage (bogus - should be calculated locally)
        'f': 32-bit - filecount
        'i': 32-bit - volume id
        'm': 32-bit - minimum quota
        'n': C-string - volume name (truncation permitted)
        'o': 32-bit - owner
        'p': 32-bit - parent volume id
        'q': 32-bit - maximum quota
        's': 8-bit - inService flag
        't': 8-bit - volume type (RWVOL, ROVOL, BACKVOL)
        'u': 32-bit - uniquifier
        'v': 32-bit - value discarded
        'A': 32-bit - accessDate
        'B': 32-bit - backupDate
        'C': 32-bit - creationDate (seconds since UNIX Epoch)
        'D': 32-bit - dayUseDate
        'E': 32-bit - expirationDate
        'M': C-string - message of the day (truncation permitted)
        'O': C-string - offline message (truncation permitted)
        'U': 32-bit - updateDate
        'W': structure - week use statistics
             16-bit - length
             for (i = 0; i < length; i++)
               32-bit - weekUse[i] statistic
        'Z': 32-bit - dayUse statistic

    VnodeHeader (D_VNODE = 3)
      Mandatory untagged fields
        32-bit - vnode id
        32-bit - uniquifier
      Optional tagged fields
        'A': C-string - access control list (type == Directory only)
        'a': 32-bit - author
        'b': 16-bit - mode bits (only 12 bits are valid)
        'f': structure - file data stream
             32-bit - length
             for (i = 0; i < length; i++)
               8-bit : data byte
        'l': 16-bit - link count
        'm': 32-bit - unix modify time
        'o': 32-bit - owner
        'p': 32-bit - parent vnode id
        's': 32-bit - server modify time
        't': 8-bit - type
        'v': 32-bit - data version

    DumpEnd (D_DUMPEND = 4)
      Mandatory untagged fields
        32-bit - DUMPENDMAGIC = 0x3A214B6E

    Unknown tags were ignored and would result in random failure to
    restore the dump stream. If any of Dump Version, Begin Magic or
    End Magic are not recognized, the restore fails.

2. June 1989 additions

    VnodeHeader (D_VNODE = 3)
      Optional tagged fields
        'g': 32-bit - group

    The addition of the 'g' tag makes the dump stream incompatible
    with previous releases. The dump version and magic values are
    unchanged.

3. AFS 3.2

    VolumeHeader 'M' C-String changed from "message of the day" to
    "read statistics".

4. OpenAFS 08db75c1968917a452f1d7c2a8a88b7a3e538ded adds large file
   support

    Introduces D_VNODE tag
        'h': structure - 64-bit length
             32-bit - hi
             32-bit - lo
             for (i = 0; i < (hi << 32 | lo); i++)
               8-bit : data byte

    This change is backward incompatible and would result in hard to
    diagnose failures.

5. OpenAFS cfa7b866c8deb876b06fd634d34ecdd30fb9b819 adds volume update
   counter

    Introduces D_VOLUMEHEADER tag
        'V': 32-bit - volUpdateCounter

    This change is backward incompatible because unrecognized tags
    cause volume restores to fail. Generation of this tag was disabled
    by f202b9778e4489fd80288c5be36e3c102b0cfba9.

6. OpenAFS 3f2dd80697959f5922032f4d4a7c9ef0cfadf35c implemented a tag
   parsing standard described in
   https://rt.central.org/rt/Ticket/Display.html?id=17947. New tags
   added in compliance with this standard are backward compatible to
   any parser that implements the standard.

    Tag and sub-tag values are signed octets with values between 0x0
    and 0x7f. Tags are grouped into the following categories:

          0         : Reserved - Invalid
          1 ..  20  : Header Tags
         21 ..  96  : Tag-Length-Value (TLV) Sub-Tags
         97 .. 122  : Standard Sub-Tags
        123 .. 125  : Dataless Sub-Tags
        126         : CRITICAL Tag
        127         : Reserved

    All tags and sub-tags implemented by OpenAFS prior to the adoption
    of this standard maintain their existing data format. These tags
    are the Legacy tags.
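
    The category ranges listed above lend themselves to a simple
    lookup. The following is a minimal Python sketch, not taken from
    any OpenAFS or AuriStorFS source; the function name classify_tag
    and the returned category strings are hypothetical. It reflects
    only the numeric ranges. Legacy tags keep their historical data
    formats even when their values fall inside one of these ranges
    (for example 'A' (0x41) in the Vnode namespace), so a real parser
    would have to check its Legacy tag table first.

```python
def classify_tag(tag: int) -> str:
    """Map a tag octet to its category in the October 2009 numbering.

    Hypothetical helper: the category names are illustrative only.
    """
    if not 0 <= tag <= 0x7F:
        raise ValueError(f"tag 0x{tag:02x} is outside 0x00..0x7f")
    if tag == 0:
        return "invalid"    # 0: reserved as an invalid value
    if tag <= 20:
        return "header"     # 1 .. 20
    if tag <= 96:
        return "tlv"        # 21 .. 96: followed by Length and Value
    if tag <= 122:
        return "standard"   # 97 .. 122: followed by a 32-bit NBO value
    if tag <= 125:
        return "dataless"   # 123 .. 125: no data follows
    if tag == 126:
        return "critical"   # 0x7e: marks the next tag as CRITICAL
    return "reserved"       # 127
```

    For instance, classify_tag(0x16) yields "tlv" and
    classify_tag(ord('g')) yields "standard", which matches the
    32-bit encoding of the 'g' group sub-tag.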
    All tags implemented subsequent to the adoption of this standard
    by OpenAFS must adhere to the following constraints.

    Header Tags are parsed using TLV rules (see below). Header Tags
    may be followed by zero or more Sub-tags. Header Tags may be
    preceded by the CRITICAL tag.

    TLV tags are followed by a mandatory one octet length 'L'.

      If 0 <= 'L' <= 127, then 'L' represents the number of data
      octets that follow.

      If 'L' == 128, then the length of the data is indeterminate and
      can only be determined by parsing the data (e.g. a C-string or
      XDR stream).

      If 129 <= 'L' <= 255, then ('L' & 0x7F) is the number of octets
      that follow that are used to construct the data length.
      ('L' & 0x7F) must be greater than 0 and must not exceed 8.

      When constructing the length

        p = &octets[0];
        length = 0;
        for (i = 0; i < ('L' & 0x7F); i++)
          length = (length << 8) | *p++;

      Then the data stream is parsed as

        for (i = 0; i < length; i++)
          8-bit : data byte

    Standard Tags are followed by a 4 octet (32-bit) integer value in
    network byte order.

    Dataless Tags are not followed by any data.

    The dataless tag 0x7e is designated to mean that the next tag to
    follow it is CRITICAL. If the following tag is unrecognized, then
    processing of the dump stream must be terminated. The unrecognized
    tag must not be ignored. The CRITICAL tag can be followed by any
    Header Tag or Sub-Tag value.

    Note: the OpenAFS implementation includes the following errors:

      1. it defines the start of the TLV Sub-tag range as 6 (0x06)
         instead of 21 (0x15).
      2. it does not enforce the maximum number of TLV length octets.
      3. it does not implement TLV indefinite length processing.
      4. it does not implement processing of unrecognized Header Tags
         according to this specification.

7. AuriStor caa90b2d7a22549e586d92961b7a6b677e4c22e2 corrected the TLV
   range.

8.
   OpenAFS 1e6fb1b7b7ed32e2035452db9fc221f38a8b4956 and
   cb6de07fb8a12199ad0f1c4990f19074a9a54fcc, and
   34e495d69a8831c57cac2ccf18898e63f02c7745 clarify which timestamp
   should be used as the Dump Header "to" time for each volume type:

        RW - updateDate
        RO - creationDate
        BK - backupDate (if there is no backupDate use creationDate)

9. OpenAFS-OSD used a number of sub-tags over the lifetime of its
   development. All sub-tags are registered.

    Volume Header sub-tags
        'F' 0x46 OSD Policy (32-bit) (deprecated - replaced by 'P')
        'P' 0x50 OSD Policy (32-bit) (deprecated - replaced by 'y')
        'r' 0x72 OSD maximum number of files
        'y' 0x79 OSD Policy

    Vnode Header sub-tags
        'L' 0x4c OSD vnode length (hi,lo) [TLV]
        'O' 0x4f OSD metadata string [TLV]
        'P' 0x50 OSD directory policy index (32-bit; replaced by 'd')
        'd' 0x64 OSD directory policy index
        'u' 0x75 OSD lastUsageTime (deprecated)
        'x' 0x78 OSD fileOnline
        'y' 0x79 OSD vnode length (non-standard; replaced by 'L')
        'z' 0x7a OSD metadata string (non-standard; replaced by 'O')

10. AuriStorFS has used the following tags. All tags are standard
    compliant. All sub-tags are registered.

    Dump Header Tags
         21 0x15 TAG_DUMPVOLID     [TLV]
         22 0x16 TAG_DUMPTIMES     [TLV]

    Volume Header Tags
         21 0x15 TAG_VOLIDS        [TLV]
         22 0x16 TAG_VOLACL        [TLV]
         23 0x17 TAG_VOLSECLEVELS  [TLV]
         24 0x18 TAG_MAXQUOTA      [TLV]
         25 0x19 TAG_DISKUSED      [TLV]
         26 0x1a TAG_VOLTIMES      [TLV]
         27 0x1b TAG_VOLFEATURES   [TLV]
         28 0x1c TAG_VOLOWNER      [TLV]
         29 0x1d TAG_VOLMINQUOTA   [TLV]
         30 0x1e TAG_VOLFILECOUNT  [TLV]

    Vnode Header Tags
         21 0x15 TAG_FILEACL       [TLV] (deprecated)
         22 0x16 TAG_VNODETIMES    [TLV]
         23 0x17 TAG_VNODEAOGIDS   [TLV]
         24 0x18 TAG_VNODEFIDS     [TLV]
         25 0x19 TAG_VNODEDV       [TLV]
         26 0x1a TAG_ACL           [TLV]
        123 0x7b TAG_VNODEWHTOPQ   [dataless]

    TAG_DUMPVOLID    - 64-bit volume id (hi, lo)
    TAG_DUMPTIMES    - Up to 50, 64-bit 100ns from, to time pairs
    TAG_VOLIDS       - 64-bit volume id (hi, lo), parent id (hi, lo),
                       clone id (hi, lo)
    TAG_VOLACL       - Opaque XDR encoded Volume Max ACL
    TAG_VOLSECLEVELS - List of Security Class (32-bit) and Security
                       Level (32-bit) pairs
    TAG_MAXQUOTA     - 64-bit Maximum Quota size (replaces 'q')
    TAG_VOLMINQUOTA  - 64-bit Min Quota (replaces 'm')
    TAG_VOLFILECOUNT - 64-bit File Count (replaces 'f')
    TAG_DISKUSED     - 64-bit Disk Used (replaces 'd')
    TAG_VOLTIMES     - List of 64-bit 100ns dates in the following
                       order (if present): accessDate, updateDate,
                       creationDate, backupDate, expirationDate.
                       (Replaces the equivalent 32-bit time tags).
                       Additional dates are reserved and should be
                       ignored if unrecognized.
    TAG_VOLFEATURES  - Two 32-bit fields: supported features and
                       volume features.
    TAG_VOLOWNER     - 64-bit Owner PTS ID (replaces 'o')
    TAG_FILEACL      - Opaque XDR encoded ACL
    TAG_VNODETIMES   - List of 64-bit 100ns dates in the following
                       order (if present):
                         1. unixModifyTime
                         2. serverModifyTime
                         3. serverModifyDVTime
                         4. serverCreateTime
                         5. lastAccessTime
                       Additional dates are reserved and should be
                       ignored if unrecognized.
    TAG_VNODEAOGIDS  - 64-bit Author, Owner, and Group PTS IDs.
                       (replaces 'a', 'o', and 'g')
    TAG_VNODEFIDS    - List of 64-bit FileIDs in the following order
                       (if present): vnode id, parent id.
                       (replaces 'p' and overwrites the implicit
                       D_VNODE id).
    TAG_VNODEDV      - 64-bit data version (hi, lo). (replaces 'v')
    TAG_ACL          - Opaque XDR encoded ACL for directories and
                       files (replaces 'A').
    TAG_VNODEWHTOPQ  - Indicates the vnode is either a whiteout file
                       or an opaque directory.
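
To illustrate how a reader might consume one of the TLV sub-tags listed
above, the following Python sketch decodes the TLV length octets
according to the rules in History item 6 and then reads a 64-bit value
stored as two 32-bit NBO halves. The function names read_tlv_length and
read_u64_hi_lo are hypothetical, and the assumption that TAG_VNODEDV
carries exactly one (hi, lo) pair of 32-bit words is inferred from the
description above rather than from an implementation.

```python
import struct


def read_tlv_length(buf: bytes, pos: int):
    """Decode a TLV length starting at buf[pos].

    Returns (length, new_pos). length is None for the indefinite
    form (first octet == 128), where only parsing the data itself
    can find its end.
    """
    first = buf[pos]
    pos += 1
    if first <= 127:
        return first, pos          # short form: the octet is the length
    if first == 128:
        return None, pos           # indefinite length
    count = first & 0x7F           # 129..255: number of length octets
    if count > 8:
        raise ValueError("more than 8 TLV length octets")
    if pos + count > len(buf):
        raise ValueError("truncated TLV length")
    # Big-endian accumulation, same as length = (length << 8) | octet.
    return int.from_bytes(buf[pos:pos + count], "big"), pos + count


def read_u64_hi_lo(buf: bytes, pos: int):
    """Read two 32-bit NBO words and combine them as (hi << 32) | lo."""
    hi, lo = struct.unpack_from(">II", buf, pos)
    return (hi << 32) | lo, pos + 8


# Assumed example: sub-tag 25 (0x19), length 8, data version 5.
stream = bytes([0x19, 0x08]) + struct.pack(">II", 0, 5)
length, pos = read_tlv_length(stream, 1)   # skip the sub-tag octet
data_version, pos = read_u64_hi_lo(stream, pos)
```

A parser that meets an unrecognized TLV sub-tag with a definite length
could use the decoded length to skip the value, provided the sub-tag
was not preceded by the CRITICAL tag.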