-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
restore socket descriptor permission management #25804
restore socket descriptor permission management #25804
Conversation
86e1b25
to
9613d7a
Compare
656f34f
to
c70597f
Compare
@jukkar I have this almost working. It works as expected for regular native sockets, as they are all backed by a It doesn't work for registered socket families with I've just now started looking at this part and will hopefully come up with a generic solution. |
@jukkar think I found something workable. There's no common object type I can use for underlying socket descriptor objects, so it's not as simple as the last time I looked in this code, where you could make Any given socket registered with However, I can use a technique that was contributed to make driver disambiguation possible, because in order to distinguish drivers, we needed to be able to identify what subsystem it belongs to. If you look through subsystem headers you'll see that the API struct definition for all subsystems is tagged with I want to try to do something similar to tag socket descriptor object type struct definitions with Then the kernel object tables will have entries for any struct instance it finds that was tagged with We'll be able to handle any kind of socket object then. Later on, could probably extend this technique to other fd types. It would need some notion of object type inheritance, but very doable. |
@@ -446,8 +446,73 @@ void test_v4_accept_timeout(void) | |||
k_sleep(TCP_TEARDOWN_TIMEOUT); | |||
} | |||
|
|||
#ifdef CONFIG_USERSPACE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests look good. I wish we had these originally so would not have missed the issue.
We probably also need to add tests to other socket types like net_mgmt, websocket etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I think the end goal we should strive for is to exercise the cross product of all socket APIs against all socket types.
It's really quite breathtaking how forcefully you weigh your opinions on topics that you are so obviously uninformed on. Go read about security features like SMEP/SMAP, or contemplate the fundamental differences and constraints on handling between user vs kernel pointers if you don't understand why we are "snapshotting the world" upon entry/exit to system calls. While you're doing that, given that you're the one who broke this to begin with, maybe you could lay off the trolling at least in this PR? |
Certain types of system call validation may need to be pushed deeper in the implementation and not performed in the verification function. If such checks are only pertinent when the caller was from user mode, we need an API to detect this situation. This is implemented by having thread->syscall_frame be non-NULL only while a user system call is in progress. The template for the system call marshalling functions is changed to clear this value on exit. A test is added to prove that this works. Signed-off-by: Andrew Boie <[email protected]>
Instead of handling this ad hoc, generalize which kobject types can be allocated. Signed-off-by: Andrew Boie <[email protected]>
Now we can build up lists of data structures matching a list of particular tags, with __subsystem being just one case. Relax searches to also look inside C files, since struct prototypes may be declared there as well. Signed-off-by: Andrew Boie <[email protected]>
Any data structure declaration tagged with __net_socket will end up in the kernel object table with type K_OBJ_NET_SOCKET. These all correspond to objects which are associated with socket file descriptors and can handle the socket vtable API. Signed-off-by: Andrew Boie <[email protected]>
Used for permission validation when accessing the associated file descriptors from user mode. There often get defined in implementation code, expand the search to look in drivers/ and subsys/net/. Signed-off-by: Andrew Boie <[email protected]>
Zephyr running on MPU devices have a different memory model than process-oriented OSes like Linux and require a method to set kernel object permissions on a file descriptor's underlying context object. Add this, and a test to show that it is working. Signed-off-by: Andrew Boie <[email protected]>
This socket is shared by all the test cases which run in different threads. Just make it a global object here. Signed-off-by: Andrew Boie <[email protected]>
Add test case to prove that this new API works. Signed-off-by: Andrew Boie <[email protected]>
Anytime a file descriptor context object is updated, we need to reset its access permissions and initialization state. This is the most centralized place to do it. Signed-off-by: Andrew Boie <[email protected]>
This is just done in common code now. Signed-off-by: Andrew Boie <[email protected]>
We now have a low-level function z_dynamic_object_create() which is not a system call and is used for installing kernel objects that are not supported by k_object_alloc(). Checking for valid object type enumeration values moved completely to the implementation function. A few debug messages and comments were improved. Futexes and sys_mutexes are now properly excluded from dynamic generation. Signed-off-by: Andrew Boie <[email protected]>
The socketpair file descriptor context objects are heap allocated and not drawn from a static pool. Register these as kernel objects when we create them if user mode is enabled. Signed-off-by: Andrew Boie <[email protected]>
The original sockets system calls used file descriptors which were actually net_context pointers. For all socket system calls, any calls from user mode would check if the caller had permission to use the net context. This was later changed to not stuff net_context pointers into file descriptors, but all the permission checking was unintentionally lost, allowing all threads on the system to read/write all socket file descriptors in the system at will, with no way to isolate applications running on the same microcontroller from each other's network activity. This patch restores the permission checks on network context objects for socket system calls that originated from user mode. The call to z_object_recycle() was never removed from zsock_socket_internal(); this is again leveraged to grant the caller who opened the socket permission on the net_context associated with the returned file descriptor. To ensure that all socket calls do this checking, all uses of z_get_fd_obj_and_vtable() have been routed through get_sock_vtable(). Objects have initialization state set and thread permissions reset to just the caller in common zsock_socket() code. Signed-off-by: Andrew Boie <[email protected]>
61aa0c2
to
9c936e6
Compare
My questions in #25443 (comment) were exactly about details of validating user vs kernel pointers. And you refused (or were unable) to answer them.
I'm trying to share my load of maintenance of this stuff, and learn as I go. If you're not interested, or unable, to cooperate, and discuss various aspects of this matter, then feel to maintain and update it yourself as various subsystems evolve (and apparently accuse people of breaking it). |
Merging this now so it makes 2.3.0-rc2 |
@carlescufi, @galak: FYI, this failed build in our CI for gnuarmemb toolchain (but not Zephyr SDK). The overall picture: https://ci.linaro.org/job/zephyr-upstream/5245/ . Example of failed build: https://ci.linaro.org/job/zephyr-upstream/5245/PLATFORM=frdm_k64f,ZEPHYR_TOOLCHAIN_VARIANT=gnuarmemb,label=docker-xenial-amd64-13/console . The error:
|
…019-q4-major Locally, this toolchain doesn't work for me due to zephyrproject-rtos/zephyr#22060 . Trying to upgrade CI, as there's report in that issue that "it worked on another machine". We need upgrade due to zephyrproject-rtos/zephyr#25804 (comment) Change-Id: I26477ed4705d6916f244465405309776eb06f98e Signed-off-by: Paul Sokolovsky <[email protected]>
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-v1.14-branch v1.14-branch
# Navigate to the new working tree
cd .worktrees/backport-v1.14-branch
# Create a new branch
git switch --create backport-25804-to-v1.14-branch
# Cherry-pick the merged commits of this pull request and resolve the conflicts
git cherry-pick c951d71ebaff67b6d33c57b6fa7e5d47bcffa4b8~15..c951d71ebaff67b6d33c57b6fa7e5d47bcffa4b8
# Push it to GitHub
git push --set-upstream origin backport-25804-to-v1.14-branch
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-v1.14-branch Then, create a pull request where the |
A change made to the socket system calls to use actual file descriptors (instead of
struct net_context
stuffed into an integer) unintentionally lost the ability to manage per-thread permissions of file descriptors, which is needed in MPU-based user mode memory models (currently all arches that implement user mode in Zephyr).This PR:
z_is_in_user_syscall()
so that system call verification checks may be done deep within implementation functions if those checks are only relevant to syscalls originating from user mode.__net_socket
in their declaration will have their instances tracked as kernel objects with typeK_OBJ_NET_SOCKET
struct net_context
so that APIs likek_object_access_grant()
may be used.K_OBJ_NET_SOCKET
objects dynamically, needed by thesocketpair
implementation (which pulls data from a heap instead of a pool like other socket types).Please review individual commit messages for more information.
Process-based virtual memory is coming to Zephyr for systems with an MMU later this year. In the fullness of time, we can adopt file descriptors that have process-level scope (instead of global scope), removing the need for this management on those types of systems. We will likely continue to need this infrastructure indefinitely for MPU-based systems, but it is the same permission model we already use for all kernel objects and device driver instances.