[Bug] EntraID OIDC - ACLs not being applied to OIDC registered users #2377

Closed
2 of 4 tasks
SysAdminSmith opened this issue Jan 25, 2025 · 32 comments · Fixed by #2388
Labels
bug Something isn't working
Milestone

Comments

@SysAdminSmith

SysAdminSmith commented Jan 25, 2025

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Users register via Entra ID OIDC. They are granted access by virtue of Security Group membership. When they log in for the first time they are assigned a User ID, and their Name and Username are registered. The username is their EntraID email address (i.e. [email protected]). I have also created a local user (node.mgr) to manage subnet nodes and exit nodes. This user has an ID and username only (node.mgr).

When the default Allow All from All policy is in place all users can receive subnet routes, exit node routes, and have access to the DNS server.

However, when implementing the following ACL, only the node.mgr user has the proper policies applied. Any other user, all of whom register via OIDC, can see (tailscale status) every node to which they should have access and can "tailscale ping" those devices, but none can resolve FQDNs, reach the outside world, or navigate the subnets provided by the subnet routers.

Expected Behavior

Expected behavior is that every user defined in a group has the policies applied based on the group to which they are party.

I am new to Headscale/Tailscale so I am fully cognizant that I may be misconstruing the ACL or misunderstanding how it works so I apologize in advance if this is not a bug but, rather, user error. If the latter, I would sincerely appreciate being pointed in the right direction.

Steps To Reproduce

Create ACL
Apply ACL
Login

Environment

- OS: Fedora 41, Debian 12 (Headscale Server), Debian 12 (tailscale clients), Windows 10/11 (tailscale clients)
- Headscale version: 0.24.1
- Tailscale version: 1.78.1

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Anything else?

{
  "groups": {
    "group:InformationTechnology": [
      "[email protected]",
      "[email protected]",
      "node.mgr"
    ],
    "group:Operations": [
      "[email protected]",
      "[email protected]"
    ],
    "group:DataServices": [
      "[email protected]"
    ],
 
...Truncated...
 
  },
  "hosts": {
    "tech-dns-tn": "100.64.0.2",
    "exitnode-tn-att": "100.64.0.3",
    "subnetnode-lab-tn": "100.64.0.4",
    "subnetnode-azu-tn": "100.64.0.5",
    "subnetnode-128-tn": "100.64.0.6",
    "subnetnode-088-tn": "100.64.0.7",
    "subnetnode-008-tn": "100.64.0.8",
    "exitnode-tn": "100.64.0.9"
  },
  "acls": [
    {
      "action": "accept",
      "src": [
        "group:InformationTechnology"
      ],
      "dst": [
        "subnetnode-lab-tn:*",
        "subnetnode-azu-tn:*",
        "subnetnode-128-tn:*",
        "subnetnode-088-tn:*",
        "subnetnode-008-tn:*",
        "tech-dns-tn:*",
        "exitnode-tn-att:*",
        "exitnode-tn:*"
      ]
    },
    {
      "action": "accept",
      "src": [
        "group:Operations",
        "group:DataServices",
        "group:Management",
        "group:Standard"
      ],
      "dst": [
        "tech-dns-tn:*",
        "exitnode-tn-att:*",
        "exitnode-tn:*"
      ]
    },
    {
      "action": "accept",
      "src": [
        "subnetnode-lab-tn",
        "subnetnode-azu-tn",
        "subnetnode-128-tn",
        "subnetnode-088-tn",
        "subnetnode-008-tn",
        "tech-dns-tn",
        "exitnode-tn-att",
        "exitnode-tn"
      ],
      "dst": [
        "*:*"
      ]
    }
  ]
}
@SysAdminSmith SysAdminSmith added the bug Something isn't working label Jan 25, 2025
@SysAdminSmith SysAdminSmith changed the title [Bug] EntrID OIDC - ACLs not being applied to OIDC registered users [Bug] EntraID OIDC - ACLs not being applied to OIDC registered users Jan 25, 2025
@GoodiesHQ

I am also experiencing this. In headscale version 0.24.0, ACL worked by the user's email address (the username has no effect). If no email was defined, it worked on Provider ID (which was horrendously ugly having an entire URL in the ACL entries, but it did work). Now in 0.24.1, the user.name field gets populated with the email address but nothing seems to work in the ACL.

@kradalby
Collaborator

kradalby commented Jan 26, 2025

@SysAdminSmith

Could you give me some example output of your users?

Something like the output of headscale users list --output json?

In headscale version 0.24.0, ACL worked by the user's email address (the username has no effect). If no email was defined, it worked on Provider ID (which was horrendously ugly having an entire URL in the ACL entries, but it did work). Now in 0.24.1, the user.name field gets populated with the email address but nothing seems to work in the ACL.

Are you saying the behaviour differs between 0.24.0 and 0.24.1? With Entra, which did not seem to send emails based on another issue, I would not expect that.

@GoodiesHQ Could you give me the same output but from 0.24.0 and 0.24.1?

@SysAdminSmith
Author

SysAdminSmith commented Jan 26, 2025

I just posted this in Discord in response to a suggestion I downgrade to 0.24.0 and manually change usernames - which auto populate with email names ([email protected]) - to just user.name.

Sadly this did not work for me.
I downgraded to 0.24.0
I deleted an existing user.
Restarted Headscale.
Had the user sign in via OIDC. It was assigned a [email protected] username. I manually changed the username to user.name1.
Tailscale status on client node "shows" access to all nodes they should have access to but:
Magic DNS seems to be working but DNS, generally, does not work.
Changed ACL to accept all to all and users are able to get to WAN via exitnodes and navigate the non-Tailscale network via subnet routers.

Manually created users do not have any issues irrespective of the ACLs in place. My one non-OIDC user "node.mgr" has the same access as "[email protected]".

@kradalby

Could you give me some example output of your users?

Something like the output of headscale users list --output json?

(Name, Display Name, and Tenant ID and sub claim are randomized)

This is in 0.24.0

[
	{
		"id": 1,
		"name": "user.name1",
		"created_at": {
			"seconds": 1737743188,
			"nanos": 738930533
		},
		"display_name": "User Name1",
		"provider_id": "https://login.microsoftonline.com/abcd1234-5678-90ef-ghij-klmnopqrstuv/v2.0/wXYZ9876-5432-10fe-dcba-lmnopqrstuvw",
		"provider": "oidc"
	},
	{
		"id": 2,
		"name": "node.mgr",
		"created_at": {
			"seconds": 1737743579,
			"nanos": 362789820
		}
	}
]

@Codelica

Curious your OIDC user is missing an "email" property, perhaps your IDP isn't including email_verified.

@Codelica

I am also experiencing this. In headscale version 0.24.0, ACL worked by the user's email address (the username has no effect). If no email was defined, it worked on Provider ID (which was horrendously ugly having an entire URL in the ACL entries, but it did work). Now in 0.24.1, the user.name field gets populated with the email address but nothing seems to work in the ACL.

This would match my experience.

In 0.24.0 with OIDC users that have plain usernames (fred, mike, etc) and email addresses ([email protected], [email protected], etc) populated, I have working user ACLs based on email address.

In 0.24.1, usernames get migrated to email addresses (as users login again), but I can't seem to get user ACLs to work based on email address.

@SysAdminSmith
Author

Curious your OIDC user is missing an "email" property, perhaps your IDP isn't including email_verified.

Do you mind expanding a bit on this?

@Codelica

In 0.24.0 at least, user ACLs seem to key off email address, but your user with username user.name1 doesn't have any email address property set. I believe that happens when the email_verified claim isn't being sent by your IDP based on this comment.

My OIDC users do have the email property set like:

...
	{
		"id": 2,
		"name": "testuser",
		"created_at": {
			"seconds": 1737933527,
			"nanos": 133145801
		},
		"display_name": "Test User",
=>              "email": "[email protected]",
		"provider_id": "https://auth.domain.com/304400930772615424",
		"provider": "oidc"
	}

And the email address can be used for user based ACLs in 0.24.0. But not in 0.24.1 it seems.

@SysAdminSmith
Author

Thank you :) I did some further research and apparently Entra ID does not provide email_verified by default. I'm looking into a solution.

@SysAdminSmith
Author

I downgraded to 0.24.0 and went the UPN route since EntraID provides it by default (i.e. in the headscale config I set email_claim: "upn").

When I logged in, just the Display Name was populated; neither the username nor the email address was captured. I manually modified the user to create a username (user.name) which is referenced in the ACL.

No difference, sadly.

What is so damn odd is what I see when I set the endpoint to --accept-routes, --accept-dns, and --exit-node=exitnode:

=== 'Use Tailscale DNS' status ===

Tailscale DNS: enabled.

Tailscale is configured to handle DNS queries on this device.
Run 'tailscale set --accept-dns=false' to revert to your system default DNS resolver.

=== MagicDNS configuration ===

This is the DNS configuration provided by the coordination server to this device.

MagicDNS: enabled tailnet-wide (suffix = dauntless.tail)

Other devices in your tailnet can reach this device at dd-0121-001l.dauntless.tail

Resolvers (in preference order):
  - 100.64.0.2

Split DNS Routes:

Search Domains:
  - dauntless.tail

=== System DNS configuration ===

This is the DNS configuration that Tailscale believes your operating system is using.
Tailscale may use this configuration if 'Override Local DNS' is disabled in the admin console,
or if no resolvers are provided by the coordination server.

  (reading the system DNS configuration is not supported on this platform)

[this is a preliminary version of this command; the output format may change in the future]
tailscale dns query google.com
DNS query for "google.com" (A) using internal resolver:

failed to query DNS: 500 Internal Server Error: resolving using "http://100.64.0.3:34152/dns-query": 403 Forbidden

For whatever reason, it is forwarding DNS requests to the exit node (100.64.0.3). The DNS server is configured at 100.64.0.2.

Further:

Jan 27 11:56:38 DD-0121-001L tailscaled[87856]: [RATELIMIT] format("dns udp query: %v")
Jan 27 11:56:46 DD-0121-001L tailscaled[87856]: [RATELIMIT] format("open-conn-track: flow %v %v > %v rejected due to %v") (6 dropped)
Jan 27 11:56:46 DD-0121-001L tailscaled[87856]: open-conn-track: flow TCP 100.64.0.1:43420 > 192.168.128.1:53 rejected due to acl
Jan 27 11:56:47 DD-0121-001L tailscaled[87856]: open-conn-track: flow TCP 100.64.0.1:43420 > 192.168.128.1:53 rejected due to acl

I'm not sure what it's trying to do by sending to port 53 on a private IP. That is a DNS server, but I am not sure how tailscale even knows about it.

@Codelica

Well, AFAIK the email address has to be used for user ACLs in 0.24.0, so without that set you may be out of luck there. And 0.24.1 doesn't seem to work with email or username as far as I can tell.

Not sure about the dns issue, but without a working ACL maybe it's trying whatever it can to get some dns back through your exit node. Pure guess, seems odd. I know Tailscale does try to transparently upgrade you to DNS over HTTPS (DoH).

@SlackingVeteran

SlackingVeteran commented Jan 27, 2025

I have been fighting this ACL issue for about a week now, and I finally found a repro case on my end.

If a user's user.name field is updated, the ACL no longer works for any user. On 0.24.0 I had manually assigned each user a username and it broke my ACL. Once I upgraded to 0.24.1, the ACL started working as soon as the server was up, and it broke again once a user logged out and logged in, which caused the user.name to be updated to include the email.
Then I renamed the user's user.name back to what it was, and the ACL stayed broken. I then downgraded to 0.24.0 and it magically started working again.

It looks like update of the user.name field for any user breaks the ACL and it cannot be fixed until a minor version upgrade is done? Is there any database migration that occurs during version upgrade/downgrade? Because it only seems to get fixed after that

@kradalby
Collaborator

@SlackingVeteran this is useful, can you show me your ACL?

Do you use your "old" usernames in the ACL, or have you migrated to what ends up being populated in the ACL (likely emails)?

@SlackingVeteran

@kradalby I was using usernames in the ACL on v0.24.0, which was not working. Upgrading to 0.24.1 fixed it to the point where usernames worked until the username was updated by OIDC to be the email. I switched to using the email in the ACL after it stopped working, and it still did not apply the appropriate ACL. I then changed the username from the email to a custom username and used that username in the ACL, and that did not work either. I downgraded to 0.24.0 and the username started working again.
Following is my ACL on 0.24.0:

{
  "groups": {},
  "tagOwners": {},
  "hosts": {},
  "acls": [
    {
      "action": "accept",
      "src": [
        "user"
      ],
      "dst": [
        "user1:*"
      ]
    },
    {
      "action": "accept",
      "src": [
        "user2"
      ],
      "dst": [
        "user2:*"
      ]
    },
    {
      "action": "accept",
      "src": [
        "user3"
      ],
      "dst": [
        "user3:*"
      ]
    }
  ],
  "ssh": []
}

@kradalby
Collaborator

kradalby commented Jan 29, 2025

@SlackingVeteran, you have to use the exact value present in the user object. It will check the Email and then the Name (username) field, in that order. So if it used to be stripped down from email [email protected] to user2, it will now be [email protected], as we do not strip them anymore.

When using OIDC, rename should not be used, as it will be overwritten every time the user logs in. I filed #2387 to block its usage when OIDC is used, to avoid confusion.
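
For anyone wanting to check their policy against the "exact value" rule above, here is a rough Python sketch. It is not Headscale code: `unmatched_tokens` is a hypothetical helper, the sample users and policy are made up, and it assumes the `name` and `email` field names seen in the `headscale users list --output json` output posted earlier in this thread. It only looks at group members and plain `src` entries.

```python
def unmatched_tokens(policy, users):
    """Return user tokens from the policy that equal no user's email or name."""
    # Collect every non-empty email and name exactly as stored on the user objects.
    known = set()
    for user in users:
        for field in ("email", "name"):
            if user.get(field):
                known.add(user[field])

    hosts = set(policy.get("hosts", {}))
    candidates = []
    for members in policy.get("groups", {}).values():
        candidates.extend(members)
    for rule in policy.get("acls", []):
        for src in rule.get("src", []):
            # Skip wildcards, group/tag references, and host aliases.
            # Raw IPs or CIDRs are not handled here and would be reported.
            if src == "*" or src.startswith(("group:", "tag:")) or src in hosts:
                continue
            candidates.append(src)

    # Keep first-seen order and report each missing token once.
    missing = []
    for token in candidates:
        if token not in known and token not in missing:
            missing.append(token)
    return missing


# Hypothetical sample data (the example.com addresses are placeholders).
users = [
    {"id": 1, "name": "fred@example.com", "email": "fred@example.com"},
    {"id": 2, "name": "node.mgr"},
]
policy = {
    "groups": {"group:it": ["fred", "node.mgr"]},
    "hosts": {"tech-dns": "100.64.0.2"},
    "acls": [
        {"action": "accept", "src": ["group:it", "tech-dns"], "dst": ["*:*"]},
    ],
}
print(unmatched_tokens(policy, users))  # prints ['fred']
```

In practice the two inputs would come from `json.load` on the users output and the policy file. A policy that contains comments or trailing commas would need cleaning up first, since the standard `json` module rejects those.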

@SlackingVeteran

Thanks @kradalby

@Codelica

@kradalby I started fresh again thinking I may have made some mistake, but I'm afraid I can't get OIDC user based ACLs to work as you describe in 0.24.1.

Starting in a working state in 0.24.0 (note not .1) where I have 1 user (which is stable between login/logout as usernames can't be email addresses yet in 0.24.0):

root@cloud:~ # headscale users list
ID | Name         | Username | Email                 | Created
1  | Fred Wilson  | fred     | [email protected]       | 2025-01-21 22:31:18

And a very simple ACL:

{
  "groups": {},
  "tagOwners": {},
  "hosts": {},
  "acls": [
    {
      "action": "accept",
      "src": [
        "[email protected]"
      ],
      "dst": [
        "*:*"
      ]
    }
  ],
  "ssh": []
}

It works as expected, the user fred ([email protected]) sees and can reach everything.

Then I update the system to 0.24.1 -- initially the ACL still works fine.

But as soon as fred ([email protected]) does a logout/login, his Headscale username is updated to his email, as expected:

root@cloud:~ # headscale users list
ID | Name         | Username           | Email                 | Created
1  | Fred Wilson  | [email protected]    | [email protected]       | 2025-01-21 22:31:18

But the ACL no longer works, with the following logged to the Headscale console:

2025-01-29T12:19:12-07:00 WRN No IPs found with the alias [email protected]

Which according to headscale nodes list isn't true. [email protected] is shown as the user on his 4 nodes.

If I try editing the ACL using his email [email protected], Headscale will log WRN No IPs found with the alias [email protected] each time it's applied.

The only thing that seems to work for me to identify his user in the ACL is using his provider_id (which I tried based on some commits I was looking at).

Is there some specific output that would help investigate this?

@Codelica

Codelica commented Jan 29, 2025

I think the problem may be here.

Once the user does a logout/login in 0.24.1 both user.Email and user.Name will match their email address, which would then hit the multiple user check every time?


Edit: I just ran with trace logging and that is what is hitting for me:

2025-01-29T13:16:37-07:00 TRC home/runner/work/headscale/headscale/hscontrol/policy/acls.go:1043 > could not determine user to filter nodes by error="multiple users with token \"[email protected]\" found: no user matching"
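
To make the suspected failure mode concrete, here is a minimal Python sketch (Headscale is written in Go, and this is not its code). It assumes the lookup counts one match per matching field, which is what the trace message above suggests. `find_user_buggy` and `find_user_deduped` are hypothetical names, and counting distinct user IDs is just one possible way around it, not necessarily what the actual fix does.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    email: str = ""


class MultipleUsersError(Exception):
    """Raised when a token appears to match more than one user."""


def find_user_buggy(users, token):
    # Counts one match per matching field, so a user whose email and
    # name are both equal to the token is counted twice.
    matches = []
    for user in users:
        if token and user.email == token:
            matches.append(user)
        if token and user.name == token:
            matches.append(user)
    if len(matches) > 1:
        raise MultipleUsersError(f"multiple users with token {token!r} found")
    return matches[0] if matches else None


def find_user_deduped(users, token):
    # Counts distinct user IDs, so one user matching on both fields is
    # fine, while two different users sharing a token still raise.
    by_id = {}
    for user in users:
        if token and token in (user.email, user.name):
            by_id[user.id] = user
    if len(by_id) > 1:
        raise MultipleUsersError(f"multiple users with token {token!r} found")
    return next(iter(by_id.values()), None)


# Hypothetical user whose name was rewritten to the email on re-login.
fred = User(id=1, name="fred@example.com", email="fred@example.com")
try:
    find_user_buggy([fred], "fred@example.com")
except MultipleUsersError as err:
    print("buggy lookup:", err)
print("deduped lookup:", find_user_deduped([fred], "fred@example.com").id)
```

With a single user whose name and email are identical, the first lookup raises and the second returns that user, which lines up with the ACL breaking only after the re-login rewrites the username.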

@SlackingVeteran

SlackingVeteran commented Jan 29, 2025

Great find @Codelica. I haven't updated to 0.24.1 yet because of this issue, and it looks like you found the cause. It might be worth creating a PR. An else if there should fix the issue, or a continue inside the first if block.

@Codelica

Unfortunately I'm not a Go dev :(, although it would be the one language I'd love to pick up if there's ever time (seems doubtful). I have a feeling he'll want to preserve a check for legitimate multiple matches though, just checking based on multiple unique user IDs instead.

@kradalby
Collaborator

Once the user does a logout/login in 0.24.1 both user.Email and user.Name will match their email address, which would then hit the multiple user check every time?

I think this is it, good catch!

I have actually redone that function in #2388, a bit by accident, but it should fix that problem.
Can you test that branch?

@SlackingVeteran

I would test it but my Kubernetes deployment relies on pre-built docker images. If there is an image for it I can test it out right away

@SlackingVeteran

I built the docker container myself to test this and can verify email for OIDC logged in user is working as expected in ACL now

@SysAdminSmith
Author

SysAdminSmith commented Jan 30, 2025

I updated to 0.24.2 and unfortunately I am still having issues. Am I right to believe that all users must have an email address now? Because, as mentioned previously, EntraID does not provide email by default in a way that fills the email field. The username will be filled by the email EntraID provides, but the email field will remain empty (user.name is retrieved via EntraID; node.mgr was a user created locally):

ID | Name          | Username                             | Email | Created            
2  |               | node.mgr                             |       | 2025-01-24 18:32:59
8  | User Name     | [email protected]                    |       | 2025-01-30 15:53:45

@suyashFSG
Copy link

user.name will have email once user logs out and logs back in with OIDC. Account stays unchanged until user re-logins

@suyashFSG

As for the email field, it will stay empty unless the OIDC claims contain email_verified with a true value.
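
A rough Python sketch of the behaviour described here, as reported in this thread rather than taken from the Headscale source: the email is only kept when `email_verified` is true. `email_from_claims` is a hypothetical helper, the claim sets are made up, and treating only a boolean true as verified is an assumption.

```python
def email_from_claims(claims):
    """Return the email to store, or "" when it is not marked as verified."""
    # Assumption: only a boolean true counts as verified.
    if claims.get("email_verified") is True:
        return claims.get("email", "")
    return ""


# Hypothetical claim sets; the values are placeholders.
without_verified = {"preferred_username": "user@example.com", "email": "user@example.com"}
with_verified = {"email": "user@example.com", "email_verified": True}

print(repr(email_from_claims(without_verified)))  # prints ''
print(repr(email_from_claims(with_verified)))  # prints 'user@example.com'
```

Under that reading, an IdP that sends an email but no `email_verified` claim would leave the Email column empty, which matches the user table posted above.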

@Codelica

@SysAdminSmith Code in 0.24.2 checks the Username and Email fields you see in that table (user.Email and user.Name in the actual user object).

So you should be able to reference those 2 users as "node.mgr" and "[email protected]" even with their Email field empty.

@suyashFSG

I use Entra as well and I have added additional claim on entra that has true value. You should be able to follow this instruction on Headscale discord to do so https://discord.com/channels/896711691637780480/1105842846386356294/1331037795900461106

@SysAdminSmith
Author

I use Entra as well and I have added additional claim on entra that has true value. You should be able to follow this instruction on Headscale discord to do so https://discord.com/channels/896711691637780480/1105842846386356294/1331037795900461106

Thank you for this! What "scopes" did you use in your config.yaml? Also, did you declare a separate "email_claim" in your config.yaml?

@suyashFSG

My scope is openid profile email, and on Entra I had to fill every user's Email field under Properties -> Contact Information. I wish it was possible to define which claim to use for email instead, because email for Entra is in UPN and preferred_username by default.

@SysAdminSmith
Author

My scope is openid profile email, and on Entra I had to fill every user's Email field under Properties -> Contact Information. I wish it was possible to define which claim to use for email instead, because email for Entra is in UPN and preferred_username by default.

ooof.

Thank you!

@SysAdminSmith
Author

I am not entirely sure why this bug was closed, because I still cannot use ACLs with my headscale setup. I have upgraded to 0.24.2 and followed @suyashFSG's suggestions, but I am still unable to use ACLs.

Funny thing is, if the default is "deny all to all", then the directive of "accept all from all to all" is technically an ACL that "works". But, outside of that, any ACL rule will break a user's ability to use DNS and access anything outside of the tailnet.

Perhaps it has something to do with EntraID? I don't know. I am happy to provide whatever information is needed.

@andreyrd

andreyrd commented Feb 7, 2025

I'm seeing very similar issues with a custom OIDC server. Email is populating correctly but ACLs are just seemingly not working. Haven't really been able to get to the root of what exactly is happening.

Edit: Never mind, I believe my issue is actually #1475
