Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Linux, FileShare.None doesn't seem to be taken into account when retrying to open a locked file #59995

Closed
nicolasl-rc opened this issue Oct 5, 2021 · 14 comments

Comments

@nicolasl-rc
Copy link

nicolasl-rc commented Oct 5, 2021

Description

We have 2 processes. The first one opens a file with an exclusive lock (using FileShare.None), and keeps this lock for some time. Another process starts and tries to open the same file with an exclusive lock (also using FileShare.None). And it retries to open it until it gets the exclusive lock.

Here are two small programs. First, test1 is started, then, after the lock is acquired (but before it is released), test2 is started.

using System;
using System.IO;
using System.Threading;

namespace test1
{
    class Program
    {
        static void Main(string[] args)
        {
            var lockfilepath = Path.Combine(Path.GetTempPath(), "test-lock");
            Console.WriteLine($"{DateTime.Now} test1 {lockfilepath}");
            using(var exclusiveLock = new FileStream(lockfilepath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None, 1, FileOptions.DeleteOnClose)) {
                Console.WriteLine($"{DateTime.Now} {lockfilepath} locked");
                Thread.Sleep(10 * 1000);
                Console.WriteLine($"{DateTime.Now} releasing lock");
            }
            Console.WriteLine($"{DateTime.Now} lock released");
        }
    }
}
using System;
using System.IO;
using System.Threading;

namespace test2
{
    class Program
    {
        static void Main(string[] args)
        {
            var lockfilepath = Path.Combine(Path.GetTempPath(), "test-lock");
            Console.WriteLine($"{DateTime.Now} test2 - {lockfilepath}");
            while(true) {
                try {
                    var exclusiveLock = new FileStream(lockfilepath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None, 1, FileOptions.DeleteOnClose);
                    break;
                } catch(Exception _) {
                    Console.WriteLine($"{DateTime.Now} retrying");
                    Thread.Sleep(1000);
                }
            }
            Console.WriteLine($"{DateTime.Now} lock acquired");
        }
    }
}

What we see:
The first attempt by test2 to open the file fails as expected. However, the second attempt succeeds, even though test1 still has the lock.

What we expect:
test2 should loop until test1 releases the lock.

Configuration

.NET version: 6.0.100-rc.1.21463.6
OS: Linux (distro: rocky linux 8)
Architecture: x64

Regression?

This works as expected on Windows (.NET Framework 4.8, .NET Core 3.1, .NET 6 RC1) and Linux with .NET Core 3.1.

It seems to be linked to the use of FileOptions.DeleteOnClose. Using FileOptions.None gives the expected behavior.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO untriaged New issue has not been triaged by the area owner labels Oct 5, 2021
@ghost
Copy link

ghost commented Oct 5, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

We have 2 processes. The first one opens a file with an exclusive lock (using FileShare.None), and keeps this lock for some time. Another process starts and tries to open the same file with an exclusive lock (also using FileShare.None). And it retries to open it until it gets the exclusive lock.

Here are two small programs. First, test1 is started, then, after the lock is acquired (but before it is released), test2 is started.

using System;
using System.IO;
using System.Threading;

namespace test1
{
    class Program
    {
        static void Main(string[] args)
        {
            var lockfilepath = Path.Combine(Path.GetTempPath(), "test-lock");
            Console.WriteLine($"{DateTime.Now} test1 {lockfilepath}");
            using(var exclusiveLock = new FileStream(lockfilepath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None, 1, FileOptions.DeleteOnClose)) {
                Console.WriteLine($"{DateTime.Now} {lockfilepath} locked");
                Thread.Sleep(10 * 1000);
                Console.WriteLine($"{DateTime.Now} releasing lock");
            }
            Console.WriteLine($"{DateTime.Now} lock released");
        }
    }
}
using System;
using System.IO;
using System.Threading;

namespace test2
{
    class Program
    {
        static void Main(string[] args)
        {
            var lockfilepath = Path.Combine(Path.GetTempPath(), "test-lock");
            Console.WriteLine($"{DateTime.Now} test2 - {lockfilepath}");
            while(true) {
                try {
                    var exclusiveLock = new FileStream(lockfilepath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None, 1, FileOptions.DeleteOnClose);
                    break;
                } catch(Exception _) {
                    Console.WriteLine($"{DateTime.Now} retrying");
                    Thread.Sleep(1000);
                }
            }
            Console.WriteLine($"{DateTime.Now} lock acquired");
        }
    }
}

What we see:
The first attempt by test2 to open the file fails as expected. However, the second attempt succeeds, even though test1 still has the lock.

What we expect:
test2 should loop until test1 releases the lock.

Configuration

.NET version: 6.0.100-rc.1.21463.6
OS: Linux (distro: rocky linux 8)
Architecture: x64

Regression?

This works as expected on Windows (.NET Framework 4.8, .NET Core 3.1, .NET 6 RC1) and Linux with .NET Core 3.1.

Author: nicolasl-rc
Assignees: -
Labels:

area-System.IO, untriaged

Milestone: -

@carlossanlop
Copy link
Member

@tmds is this a repro of the issue you're fixing with #55327 ?

@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Oct 5, 2021
@tmds
Copy link
Member

tmds commented Oct 5, 2021

This is a 6.0 regression introduced by #55153.

_deleteOnClose gets set before we're able to lock the file. So even when we can't lock the file, Dispose will remove it.

private void Init(string path, FileMode mode, FileAccess access, FileShare share, FileOptions options, long preallocationSize)
{
IsAsync = (options & FileOptions.Asynchronous) != 0;
_deleteOnClose = (options & FileOptions.DeleteOnClose) != 0;
// Lock the file if requested via FileShare. This is only advisory locking. FileShare.None implies an exclusive
// lock on the file and all other modes use a shared lock. While this is not as granular as Windows, not mandatory,
// and not atomic with file opening, it's better than nothing.
Interop.Sys.LockOperations lockOperation = (share == FileShare.None) ? Interop.Sys.LockOperations.LOCK_EX : Interop.Sys.LockOperations.LOCK_SH;
if (CanLockTheFile(lockOperation, access) && !(_isLocked = Interop.Sys.FLock(this, lockOperation | Interop.Sys.LockOperations.LOCK_NB) >= 0))

cc @stephentoub

@tmds
Copy link
Member

tmds commented Oct 5, 2021

The fix is to assign _deleteOnClose after the file is locked as in #55327.

is this a repro of the issue you're fixing with #55327 ?

The reproducer describes a scenario where the first program has locked the file for the whole time that the second program is started/stopped. That worked in previous versions of .NET.

#55327 makes it works when both programs are started/stopped at random times. That has never worked perfect on Unix so far due to races.

@stephentoub
Copy link
Member

The fix is to assign _deleteOnClose after the file is locked as in #55327.

Does that mean #55327 fixed this?

@tmds
Copy link
Member

tmds commented Oct 6, 2021

Does that mean #55327 fixed this?

Yes, this, and the races that occur on Unix when concurrently opening/closing a file with FileShare.None and FileOptions.DeleteOnClose.

@stephentoub
Copy link
Member

stephentoub commented Oct 6, 2021

Thanks. What's the bare minimum src change required to fix just the regression?

@tmds
Copy link
Member

tmds commented Oct 6, 2021

Moving _deleteOnClose down a couple of lines so it is after obtaining the lock.

I could imagine the reproducer is a simplified version of production code that would be subject to the races that are fixed in #55327 (but very unlikely to occur).

If you like a minimal fix, I can make a PR for it against 6.0 branch.

@stephentoub
Copy link
Member

If you like a minimal fix, I can make a PR for it against 6.0 branch.

Please and thank you :)

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Oct 7, 2021
@adamsitnik
Copy link
Member

@nicolasl-rc thank you for a great bug report!

@tmds
Copy link
Member

tmds commented Oct 7, 2021

@nicolasl-rc fyi, what happens in your reproducer:

When the second program runs the first time, it ends up deleting the file from the first program.
So when you run it the second time, it creates a new file (distinct from the one locked by the first program).

The fix is to not delete the file when we fail to lock.

@nicolasl-rc
Copy link
Author

Thank you for looking at it so rapidly!

Just to be sure: will the fix be included in the .NET 6 release?

We are migrating our products from .NET Core 3.1 to .NET 6, and would like to be ready as soon as possible after the official release (we already have customers asking us about .NET 6 support). And as far as I can tell, this is the last issue we have.

@tmds
Copy link
Member

tmds commented Oct 7, 2021

Yes, it will be fixed for .NET 6.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 8, 2021
@jeffhandley
Copy link
Member

Fixed by #60112

@ghost ghost locked as resolved and limited conversation to collaborators Nov 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants