Add API to access "extended file attributes" (xattr, EA) #49604
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Area label would be area-System.IO I suppose. |
We don't yet have a public type called |
This needs a deeper look into the platform-specific capabilities of the existing tools. Will hopefully have a look at it soon. |
Well Mr. MsftBot the label change was a bit too early... |
@heinrich-ulbricht have you had a chance to look into this more deeply? @mklement0 mentioned the following today:
Another thing we need to keep in mind is that Windows has both extended attributes and reparse points, but they are mutually exclusive (one cannot exist if the other exists). See this document. |
@carlossanlop Unfortunately not. It is relatively easy to propose something that would serve my specific use case. But I feel the need to be more thorough so that the proposal encompasses all platforms and respective platform features of xattr. Having found xattr only recently myself I'm a bit hesitant to propose anything to be honest ^^ |
Thanks for starting this proposal, @heinrich-ulbricht. Regarding the terminology on Windows: it seems to me that the equivalent of extended attributes on Linux and macOS / BSD are NTFS alternate data streams; the table you link to uses attribute names used by NTFS itself (whereas, as stated in the initial post, extended attributes are about user-defined payloads that have no special meaning to the file-system or the system). If I understand the linked NTFS documentation correctly, the equivalent of a single extended attribute (a user-defined key-value pair containing arbitrary data) in the Unix world is a single alternate data stream on Windows (NTFS). |
@mklement0 I see, I misinterpreted the NTFS internal attribute name. I worked with ADS in the (pre .NET Core) past, sounds interesting as an EA equivalent. Does this mean that .NET would have to provide a translation from an ADS API to EA file system features on Linux et al.? And that this does not yet exist? A quick search did not reveal anything other than old posts complaining that .NET access to ADS is not possible at all or about quirks like PowerShell restrictions and/or bugs (further down the page). While googling I also found a StackExchange question describing my use case: "Is there a standardised way of adding custom metadata to image and other filetypes?". And another random data point: Azure File Sync might profit from this feature as ADS are currently not synched. (Assuming .NET plays a role here. And desperately trying to find arguments.) What's your take on which direction this proposal might take? |
The relevant Windows functions are ZwQueryEaFile and ZwSetEaFile, but the documentation seems to cover kernel-mode calls only. There are also On Linux, I think .NET should treat the extended attribute namespace as part of the string, rather than automatically insert or delete a "user." prefix. That way, the API would be usable with all namespaces, including ones defined in the future. Such an API would make it harder to use extended attributes in a way that is portable across operating systems, but I don't know whether developers even want to do so; perhaps it is more important to read and write OS-specific extended attributes that are already in use. |
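To make the "namespace is part of the string" idea concrete, here is a minimal sketch in Python (not .NET). `split_xattr_name` is a hypothetical helper invented for illustration. The point is that the full name is passed through unchanged, and the namespace is only reported, never inserted or stripped.

```python
def split_xattr_name(full_name: str) -> tuple:
    """Split a Linux xattr name into (namespace, remainder) without altering it.

    Hypothetical helper: the full name, e.g. "user.comment", is what would be
    handed to the OS as-is. Nothing is added to or removed from the caller's string.
    """
    namespace, sep, rest = full_name.partition(".")
    if not sep or not namespace or not rest:
        raise ValueError(f"not a namespaced xattr name: {full_name!r}")
    return namespace, rest


if __name__ == "__main__":
    # Works for any namespace, including ones the helper has never heard of.
    print(split_xattr_name("user.comment"))
    print(split_xattr_name("security.selinux"))
```

Because the helper does not keep a list of known namespaces, names in namespaces defined in the future would pass through the same way.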
@heinrich-ulbricht, yes, as far as I'm aware there are no .NET APIs for ADS (NTFS alternate data streams) yet, though PowerShell's cmdlets do support them.
I think a platform-neutral abstraction along the lines you've proposed would be beneficial, and seems possible at least in principle, given that what the platform-specific features share in the abstract is the ability to attach name-value pairs of arbitrary data to file-system items - but there are many details to be hashed out, and it'll be a challenge to find the right balance between focusing on shared, platform-neutral functionality vs. not preventing use of platform-specific functionality. @KalleOlaviNiemitalo, I think we need to get clarity on the shared concepts as opposed to terminology. Even though in terms of names, the Windows EAs (extended attributes) seem to be a closer fit, conceptually, ADS seem to be the equivalent of the Unix-world extended attributes, i.e. arbitrary, user-defined pieces of data associated with file-system items that have no special meaning to either the file-system or the operating system. Can you clarify how the Windows EAs fit into the picture here? |
As mentioned here, EAs exist for OS/2 compatibility. So the question is: does it make sense to consider this API in the modern world? |
@iSazonov This article shows active development regarding EAs: User Xattr Support Finally Landing For NFS In Linux 5.9:
Now that there seem to be two options to implement EA-compatibility on Windows - EAs and ADS - there are different goals one might try to achieve:
My need when opening this ticket was to put user-controlled metadata on files that is persisted when moving those files across file system boundaries (ext2, ext3, ext4, JFS, Squashfs, Yaffs2, ReiserFS, Reiser4, XFS, Btrfs, OrangeFS, Lustre, OCFS2 1.6, ZFS, and F2FS - and NTFS). And to have a .NET API to set this metadata in a cross-platform .NET application. |
Oh and I don't know if somebody tried to be funny here. Citation from IETF's RFC 8276:
(Emphasis mine.) |
Thanks for the sleuthing, @iSazonov. So it does sound like EAs and ADS are alternative implementations of the same basic functionality - but for compatibility with different operating systems, given that the blog post that @heinrich-ulbricht linked to states:
The EA tools repo you mention additionally states the following, disconcerting fact (emphasis added):
Anecdotally, I can say that while I've come across ADS many times, I had never heard of EAs before - which, given the above, is perhaps not too surprising. Also, the NFSv4-related RFC 8276 @heinrich-ulbricht links to with respect to NFS references ADS (though not by that exact name):
|
I do not think this is feasible, as the only feature guaranteed to be available across file-systems is file content, so you'd have to store the metadata as part of the content, which in turn means that you require special tools / APIs to read the regular content - such a file won't act like a regular file anymore (and for directories you'd be out of luck anyway). I think the best a potential cross-platform .NET API could hope for is described by what FreeBSD has done in its
The linked
|
Perhaps ZwSetEaFile can do this. Given that NTFS extended attributes (EAs) are so little known, is it worth exposing them in the .NET runtime? |
Because named streams on Windows can already be created, read, and deleted by passing the appropriate strings to System.IO.FileStream and such, I think there is less need for an alternative "extended file attributes" API that does the same thing; that's why it might be more useful to have the API access EAs as implemented in NTFS. It appears .NET doesn't currently have an API for enumerating named streams (FindFirstStreamW), though. |
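The "pass the appropriate string" approach to named streams could look roughly like this. The sketch is in Python rather than .NET, and `stream_path`, `write_stream` and `read_stream` are hypothetical helper names. It assumes Windows with an NTFS volume. On other systems the colon would simply become part of an ordinary file name, so the I/O helpers refuse to run there.

```python
import os


def stream_path(file_path: str, stream_name: str) -> str:
    """Build the "file:stream" path syntax used for NTFS named streams."""
    if not stream_name or any(c in stream_name for c in ":\\/"):
        raise ValueError(f"invalid stream name: {stream_name!r}")
    return f"{file_path}:{stream_name}"


def write_stream(file_path: str, stream_name: str, data: bytes) -> None:
    """Write data to a named stream (assumes Windows and NTFS)."""
    if os.name != "nt":
        raise OSError("named streams require Windows/NTFS")
    with open(stream_path(file_path, stream_name), "wb") as f:
        f.write(data)


def read_stream(file_path: str, stream_name: str) -> bytes:
    """Read a named stream back (assumes Windows and NTFS)."""
    if os.name != "nt":
        raise OSError("named streams require Windows/NTFS")
    with open(stream_path(file_path, stream_name), "rb") as f:
        return f.read()


if __name__ == "__main__":
    print(stream_path("report.txt", "origin"))
```

Note that this covers create, read and overwrite only. Listing the streams of a file is the part that needs an enumeration API such as FindFirstStreamW.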
Yes, and PowerShell implements this internally (GetStreams() method) so I think it makes sense to consider the API in .Net. |
Hm, modifying file content does not seem like an option for arbitrary files if the application doesn't "own" the files. And you are right, there is no guarantee the EAs survive when files (or directories) are moved between file systems. But I think from an API perspective I expect to be able to CRUD EAs in .NET on different platforms. This at least would make applications possible that use EAs to store application-specific metadata. Whether EAs survive moving to another file system is out of scope for the API. Reading the comments I see the hesitancy to pull this "old" EA feature out of the closet into a modern API. It would be good to know how widespread the use of EAs currently really is - or estimate how widespread it would be if there was an API. Whatever the case I'm loving the discussion so far. Cross-platform, cross-file system, cross-team and cross-century. 🤩 |
Another thought: if it won't be possible to provide either
then a .NET application would have to
The latter sounds like a new project to be had that provides an adapter API for those tools so they could be used via one API on any platform. This would be the not-so-clean but pragmatic approach. Ideally the companion-binaries are already part of the platform so we don't have to cope with licensing or security implications. And every platform specialist could plug in "their" tool. |
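As a rough illustration of such an adapter, the sketch below (Python, not .NET) only builds the command line for the tool that usually ships with each platform. The function names are hypothetical. The tools assumed are `getfattr`/`setfattr` from the Linux `attr` package and `xattr` on macOS. No Windows tool is assumed, so that branch is left unimplemented.

```python
def build_read_command(platform: str, name: str, path: str) -> list:
    """Return the argv a hypothetical adapter would run to read one attribute."""
    if platform.startswith("linux"):
        # getfattr prints only the raw value when given --only-values.
        return ["getfattr", "--only-values", "-n", name, path]
    if platform == "darwin":
        # xattr -p prints the value of the named attribute.
        return ["xattr", "-p", name, path]
    raise NotImplementedError(f"no known tool for platform {platform!r}")


def build_write_command(platform: str, name: str, value: str, path: str) -> list:
    """Return the argv a hypothetical adapter would run to set one attribute."""
    if platform.startswith("linux"):
        return ["setfattr", "-n", name, "-v", value, path]
    if platform == "darwin":
        return ["xattr", "-w", name, value, path]
    raise NotImplementedError(f"no known tool for platform {platform!r}")


if __name__ == "__main__":
    # The platform string uses the same values as Python's sys.platform.
    print(build_read_command("linux", "user.example.id", "report.txt"))
    print(build_write_command("darwin", "example.id", "42", "report.txt"))
```

The resulting list could be handed to a process launcher. Each platform specialist would then only need to supply the branch for "their" tool.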
Just adding another data point for the topic of EAs on Windows: Appropriate function to set NTFS extended attributes: ZwSetEaFile or NtSetEaFile |
Found more info about xattr and cross-platform support over here in discussions in the borg backup repo: borgbackup/borg#1342 and borgbackup/borg#1681 And again, a rather amusing take on this topic:
Also some code that respects different platforms: https://github.com/borgbackup/borg/blob/master/src/borg/xattr.py |
Especially on Linux this is very useful to set (and remove) SELinux attributes or for example the NTACL/DOSATTRIB xattrs that samba leaves behind. So the underlying technology does matter (for that use case at least), it shouldn't just be a way to store random metadata in "some" store. The exact keys/names and tech are important so other native tools can use the same attributes. Right now you can call the actual setxattr functions directly or use the Mono.Posix package, but it's not as nice. |
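For comparison, this is roughly what calling the native functions directly looks like from Python, whose `os` module wraps `setxattr`, `getxattr`, `listxattr` and `removexattr` on Linux only. `xattr_roundtrip` is a hypothetical helper. The sketch uses a `user.` name because `security.*` attributes such as `security.selinux` usually need elevated privileges. It treats any OS error as "not available here", which is a simplification.

```python
import os
import tempfile
from typing import Optional


def xattr_roundtrip(path: str, name: str, value: bytes) -> Optional[bytes]:
    """Set, read back, and remove one xattr; return None if that is not possible here."""
    if not hasattr(os, "setxattr"):
        # Python only exposes these functions on Linux.
        return None
    try:
        os.setxattr(path, name, value)       # create or replace the attribute
        stored = os.getxattr(path, name)     # values are raw bytes, not text
        listed = name in os.listxattr(path)  # names come back with their namespace
        os.removexattr(path, name)
        return stored if listed else None
    except OSError:
        # Simplification for this sketch: e.g. the file system has no xattr support.
        return None


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        # The full name, including the "user." namespace, is passed as-is.
        print(xattr_roundtrip(f.name, "user.example.id", "Grüße".encode("utf-8")))
```

The value is handled as bytes throughout, so non-ASCII payloads work as long as the caller picks an encoding.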
That would be "security.selinux" -- so the .NET API should not restrict itself to the "user." namespace. Perhaps the .NET API could take an Another option might be a Windows recognizes "$Kernel." and "$Kernel.Purge." prefixes on extended attribute names. Kernel Extended Attributes |
On Windows, the NFS client was reported to specially handle the extended attribute names "NfsActOnLink", "NfsV3Attributes", and "NfsSymlinkTargetName": "Re: Visible symlinks under Windows" posted to samba-technical on 2008-06-23. That suggests other file system redirectors might similarly define magic names and risk conflict with user-defined extended attributes. On the other hand, I don't know whether the NFS client even advertises FILE_SUPPORTS_EXTENDED_ATTRIBUTES in GetVolumeInformationW; if it doesn't, then I suppose it is free to use the extended attribute API for its own purposes. If an application creates files with extended attributes or alternative data streams, and OneDrive moves them to cloud storage, are the EAs and ADSs lost? |
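Checking what a volume advertises could be sketched as below (Python with `ctypes`, Windows only for the actual query). The helper names are hypothetical. The two flag values are the ones defined in the Windows SDK headers for `FILE_NAMED_STREAMS` and `FILE_SUPPORTS_EXTENDED_ATTRIBUTES`. Whether a given redirector such as the NFS client sets them is exactly the open question above and is not answered by this code.

```python
import ctypes
import os

# File system flags returned by GetVolumeInformationW (values from the Windows SDK).
FILE_NAMED_STREAMS = 0x00040000
FILE_SUPPORTS_EXTENDED_ATTRIBUTES = 0x00800000


def decode_volume_flags(flags: int) -> dict:
    """Report which of the two metadata features a volume advertises."""
    return {
        "named_streams": bool(flags & FILE_NAMED_STREAMS),
        "extended_attributes": bool(flags & FILE_SUPPORTS_EXTENDED_ATTRIBUTES),
    }


def query_volume_flags(root: str) -> int:
    """Windows only: return the file system flags for a root such as 'C:\\'."""
    if os.name != "nt":
        raise OSError("GetVolumeInformationW is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    flags = ctypes.c_uint32(0)  # receives lpFileSystemFlags (a DWORD)
    ok = kernel32.GetVolumeInformationW(
        ctypes.c_wchar_p(root), None, 0, None, None, ctypes.byref(flags), None, 0
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return flags.value


if __name__ == "__main__":
    # Decoding works anywhere; the query itself needs Windows.
    print(decode_volume_flags(FILE_NAMED_STREAMS | FILE_SUPPORTS_EXTENDED_ATTRIBUTES))
```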
It would be good to have access to this functionality. Just setting and reading ADS/EA would be sufficient for me, with values as byte arrays, since not all data is English or ASCII strings. I don't care about deleting ADS/EA. Storing extra meta information about files is kind of useful, as is being able to scan files for alternate data streams without messing around with Win32 API and PInvoke crap, something I'd prefer not to do anymore. In fact it would be great if .NET would grow and import legacy API stuff, regardless of cross-platform support. Cross-platform functionality can be added in time, if it exists. Also, stop focusing on moving files to different file systems or cloud drives; the scope here is files stored on file systems capable of ADS/EA and nothing else. If a user or process moves a file to a file system unable to deal with ADS/EA, that is beyond what .NET should be able to solve. |
Note: came across this nice sample of creating and copying EAs on Windows, leaving it here, in case somebody wants to see that in action: https://github.com/gtworek/PSBits/tree/master/CopyEAs |
Another sample that reads EAs on Windows: https://gist.github.com/jborean93/50a517a8105338b28256ff0ea27ab2c8 Today I have another use case for Alternate Data Streams that needs to work cross-platform. And so this topic came back :D. |
Background and Motivation
The API currently lacks support to create, read, update and delete "extended file attributes". What are extended file attributes? Citation from the Wikipedia article:
On *nix based systems they are called xattr, on Windows Extended Attributes (EA).
MacOS documentation: https://ss64.com/osx/xattr.html
Linux documentation: https://man7.org/linux/man-pages/man7/xattr.7.html
Windows documentation (?): https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/a82e9105-2405-4e37-b2c3-28c773902d85?redirectedfrom=MSDN ($EA and $EA_INFORMATION)
Hints to usage of EA on Windows: https://superuser.com/q/396692/93905
My use case specifically is storing an ID for files that is not visible for the user but links those files to external cloud sources those files were generated (and need to be updated) from. Internet Explorer seems to store the "downloaded from Internet" information there, Antivirus vendors store scan information, Dropbox used those properties in the past as well - there definitely are use cases.
Extended file attributes are a lightweight, cross-platform way of storing file metadata that cannot be tinkered with (at least not easily) by the user. Platform/file system support seems to be good. There are even recent developments like Linux NFS support for xattr: User Xattr Support Finally Landing For NFS In Linux 5.9 To cite one comment: "I've been wanting this for years!" ;)
Proposed API
As an extension to System.IO.File?
(Note: Like the existing File.GetAttributes API)
(Note: It's also possible to set those attributes on directories.)
Usage Examples
Alternative Designs
The API design should reflect existing implementations from other languages and/or platforms. If this proposal is deemed worthy of being pursued further then we would have to look deeper into existing designs.
Risks
Effort for a rarely-used (?) feature.
Short-sighted design; searching for files based on those property values could be desirable. How could this be designed in contrast to existing designs?