Libseccomp: RFE: more precisely defined error return values

Created on 7 Oct 2017  ·  24 Comments  ·  Source: seccomp/libseccomp

Looking at libseccomp error return values about system calls, it is not possible to determine which of the following reasons caused the errors:

  1. syscall doesn't exist on some arch
  2. syscall cannot be matched on some arch (because it's multiplexed, think socket/socketcall)
  3. other error cases

When constructing the seccomp filter, the caller may consider some of these reasons fatal but not the others, so more detailed error information (wider set of error values than just -EINVAL) would be needed.

See also systemd PR 6952.

enhancement priority/medium

All 24 comments

Hi @topimiettinen, sorry it has taken so long to get to this, but I think it's time we fix this.

@drakenclimber this one is going to be a doozy. All of the libseccomp APIs should have manpages at this point (if they don't we need to create an issue for that), with all of the manpages having some hand-wavy "negative values on error" comments in the RETURN VALUE section. I think we need to do the following:

  • manually audit each API call to generate a list of possible return values
  • decide if these return values make sense, modify the code if they don't
  • document each possible return value in the associated manpage with a brief explanation of what the error code indicates

Thoughts?

Dang.... I grudgingly agree with everything you wrote above, especially the effort required :). Getting the return codes right, and then documenting them will be a really big effort.

And yes, while not glamorous work, I think it's critical. I have been working on some cgroup stuff and we recently ran across a container implementation that completely misunderstood a cgroup feature... but there was no definitive document in the kernel or anywhere else to properly guide the way, so they made it work as best they could.

It's great to see activity on this issue! I'm not against a thorough review, but the original request was limited to being able to distinguish different failure modes, which is somewhat orthogonal. The review would certainly help, and I think it is even a prerequisite to a degree.

We need to do the review at some point, and it might as well be now. The longer we hold off, the less useful the error codes will become for callers, and the entire point of libseccomp is to make this stuff easier to use :)

Dang.... I grudgingly agree with everything you wrote above, especially the effort required :). Getting the return codes right, and then documenting them will be a really big effort.

Yeah, this is one of the reasons this issue has sat for so long, but I've been putting it off long enough (at least @drakenclimber can say he has been putting it off for less than a year!). Later today (tomorrow?) I'll break this up into chunks/multiple-issues (with some suggestions) to make it a bit easier to tackle in pieces.

As a bit of a positive, it looks like libseccomp really only uses nine unique errno values (according to a very crude check):

# grep -e "-E[A-Z0-9]\+" src/*.{h,c} | sed 's/.*-\(E[A-Z0-9]\+\).*/\1/' | sort -u
EACCES
EDOM
EEXIST
EFAULT
EINVAL
ENOMEM
EOPNOTSUPP
EPERM
ESRCH

... this should help shrink the problem space a bit, especially if we can agree to common semantic value for each error code across the library (which we definitely should do).
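To make the "small problem space" point concrete, here is a minimal sketch of a lookup over the nine values from the grep output above. The helper name `scmp_errno_name` is made up for illustration and is not part of libseccomp; the sketch only assumes the usual convention that failures come back as negated errno values.

```c
#include <errno.h>
#include <stddef.h>

/* The nine errno values reported by the crude grep above. */
struct errno_name {
	int value;
	const char *name;
};

static const struct errno_name scmp_errnos[] = {
	{ EACCES, "EACCES" }, { EDOM, "EDOM" },     { EEXIST, "EEXIST" },
	{ EFAULT, "EFAULT" }, { EINVAL, "EINVAL" }, { ENOMEM, "ENOMEM" },
	{ EOPNOTSUPP, "EOPNOTSUPP" }, { EPERM, "EPERM" }, { ESRCH, "ESRCH" },
};

/*
 * Hypothetical helper (not a libseccomp function): map a negative return
 * code to its symbolic name, or NULL if it is not one of the nine values.
 */
const char *scmp_errno_name(int rc)
{
	size_t i;

	if (rc >= 0)
		return NULL;	/* not an error return */
	for (i = 0; i < sizeof(scmp_errnos) / sizeof(scmp_errnos[0]); i++) {
		if (rc == -scmp_errnos[i].value)
			return scmp_errnos[i].name;
	}
	return NULL;
}
```

A table like this would also be a natural place to hang the agreed meaning of each code once that is settled; the sketch deliberately leaves the meanings out because the thread has not defined them.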

This is a bit later than intended, but here is a full list of the functions making up the libseccomp API:

const struct scmp_version *seccomp_version(void)
unsigned int seccomp_api_get(void)
int seccomp_api_set(unsigned int level)
scmp_filter_ctx seccomp_init(uint32_t def_action)
int seccomp_reset(scmp_filter_ctx ctx, uint32_t def_action)
void seccomp_release(scmp_filter_ctx ctx)
int seccomp_merge(scmp_filter_ctx ctx_dst, scmp_filter_ctx ctx_src)
uint32_t seccomp_arch_resolve_name(const char *arch_name)
uint32_t seccomp_arch_native(void)
int seccomp_arch_exist(const scmp_filter_ctx ctx, uint32_t arch_token)
int seccomp_arch_add(scmp_filter_ctx ctx, uint32_t arch_token)
int seccomp_arch_remove(scmp_filter_ctx ctx, uint32_t arch_token)
int seccomp_load(const scmp_filter_ctx ctx)
int seccomp_attr_get(const scmp_filter_ctx ctx, enum scmp_filter_attr attr, uint32_t *value)
int seccomp_attr_set(scmp_filter_ctx ctx, enum scmp_filter_attr attr, uint32_t value)
char *seccomp_syscall_resolve_num_arch(uint32_t arch_token, int num)
int seccomp_syscall_resolve_name_arch(uint32_t arch_token, const char *name)
int seccomp_syscall_resolve_name_rewrite(uint32_t arch_token, const char *name)
int seccomp_syscall_resolve_name(const char *name)
int seccomp_syscall_priority(scmp_filter_ctx ctx, int syscall, uint8_t priority)
int seccomp_rule_add_array(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, const struct scmp_arg_cmp *arg_array)
int seccomp_rule_add(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, ...)
int seccomp_rule_add_exact_array(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, const struct scmp_arg_cmp *arg_array)
int seccomp_rule_add_exact(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, ...)
int seccomp_notify_alloc(struct seccomp_notif **req, struct seccomp_notif_resp **resp)
void seccomp_notify_free(struct seccomp_notif *req, struct seccomp_notif_resp *resp)
int seccomp_notify_receive(int fd, struct seccomp_notif *req)
int seccomp_notify_respond(int fd, struct seccomp_notif_resp *resp)
int seccomp_notify_id_valid(int fd, uint64_t id)
int seccomp_notify_fd(const scmp_filter_ctx ctx)
int seccomp_export_pfc(const scmp_filter_ctx ctx, int fd)
int seccomp_export_bpf(const scmp_filter_ctx ctx, int fd)

... of these functions, we only need to worry ourselves about the functions returning "int".

Possible groupings of functions that should have similar code paths and return values.

  • Group A
int seccomp_arch_exist(const scmp_filter_ctx ctx, uint32_t arch_token)
int seccomp_arch_add(scmp_filter_ctx ctx, uint32_t arch_token)
int seccomp_arch_remove(scmp_filter_ctx ctx, uint32_t arch_token)
  • Group B
int seccomp_attr_get(const scmp_filter_ctx ctx, enum scmp_filter_attr attr, uint32_t *value)
int seccomp_attr_set(scmp_filter_ctx ctx, enum scmp_filter_attr attr, uint32_t value)
  • Group C
int seccomp_syscall_resolve_name_arch(uint32_t arch_token, const char *name)
int seccomp_syscall_resolve_name_rewrite(uint32_t arch_token, const char *name)
int seccomp_syscall_resolve_name(const char *name)
  • Group D
int seccomp_rule_add_array(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, const struct scmp_arg_cmp *arg_array)
int seccomp_rule_add(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, ...)
int seccomp_rule_add_exact_array(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, const struct scmp_arg_cmp *arg_array)
int seccomp_rule_add_exact(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, ...)
  • Group E
int seccomp_notify_receive(int fd, struct seccomp_notif *req)
int seccomp_notify_respond(int fd, struct seccomp_notif_resp *resp)
int seccomp_notify_id_valid(int fd, uint64_t id)
int seccomp_notify_fd(const scmp_filter_ctx ctx)
  • Group F
int seccomp_load(const scmp_filter_ctx ctx)
int seccomp_export_bpf(const scmp_filter_ctx ctx, int fd)

... if the function is not in one of the groups above, it is likely unique in its code path and/or return value.

Group C only returns __NR_SCMP_ERROR.

Group D can return one of EINVAL, EPERM, EOPNOTSUPP, ENOMEM, EDOM, EFAULT, EEXIST.

seccomp_load() can return EINVAL, ENOMEM, ESRCH and also errno values from prctl() (only EACCES, EFAULT, EINVAL for PR_SET_NO_NEW_PRIVS and PR_SET_SECCOMP) and seccomp() (EACCES, EFAULT, EINVAL, ENOMEM, ESRCH) syscalls according to their manual pages.

When it comes to libseccomp passing syscall errors back to the caller, e.g. prctl() and seccomp(), I'm thinking we just need to hide those behind a single errno value (perhaps ENOSYS?) so that libseccomp isn't affected by any changes in the kernel (or ABI differences).

If this becomes a problem for debugging, perhaps we could introduce a new attr which would pass the errno value directly back to the caller.
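A rough sketch of that idea, assuming a syscall wrapper that returns -1 and sets errno on failure. The helper `mask_sys_error` and its parameters are hypothetical, not existing libseccomp code; the catch-all value is passed in because the choice of value is still open, and `raw_rc` stands in for the possible debugging attribute.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a libseccomp function.
 *   sys_rc:    return value of the syscall wrapper (-1 on failure)
 *   sys_errno: errno saved immediately after the call
 *   catch_all: the single errno value used to hide kernel errors
 *   raw_rc:    stand-in for the proposed "pass errno through" attribute
 */
int mask_sys_error(int sys_rc, int sys_errno, int catch_all, bool raw_rc)
{
	if (sys_rc >= 0)
		return sys_rc;		/* success: pass the value through */
	if (raw_rc)
		return -sys_errno;	/* debugging: expose the kernel's errno */
	return -catch_all;		/* default: one stable value for callers */
}
```

With something like this, a failing prctl() or seccomp() call would look the same to the caller regardless of kernel version, e.g. `mask_sys_error(-1, EACCES, ENOSYS, false)` yields `-ENOSYS`, while setting `raw_rc` would yield `-EACCES`.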

Possibly, but then it would be nice to make sure there is no ABI/API breakage if the users of libseccomp are already expecting certain errno values. The original problem was that some system calls exist only on some architectures (like ugetrlimit only on x86_32) but this couldn't be distinguished from a mistyped system call name, so different (possibly synthetic) error codes would be needed.

Well, as we've said previously, we currently don't really guarantee specific errno values, just "negative values on failure" so while it would be unfortunate to break users that are currently making assumptions about specific errno values, I think changing things to provide a strong errno guarantee across kernel versions and ABIs in the future is a worthwhile tradeoff.

I agree.

Once this evaluation is complete and we update the errno values on failures, I would feel more comfortable making some sort of guarantee of the errno values returned.

The grouping idea maybe wasn't the best, so let's start a list to keep track of this all _(I will keep updating this as we progress)_:

  • [x] seccomp_reset
    Currently returns: EINVAL, ENOMEM.

  • [x] seccomp_merge
    Currently returns: EINVAL, EDOM, EEXIST, ENOMEM.

  • [x] seccomp_arch_exist
    Currently returns: EINVAL, EEXIST.

  • [x] seccomp_arch_add
    Currently returns: EINVAL, EEXIST, ENOMEM, EDOM.

  • [x] seccomp_arch_remove
    Currently returns: EINVAL, EEXIST.

  • [x] seccomp_load
    Currently returns: EINVAL, ENOMEM, ESRCH, ECANCELED.

  • [x] seccomp_attr_get
    Currently returns: EINVAL, EEXIST.

  • [x] seccomp_attr_set
    Currently returns: EINVAL, EACCES, EOPNOTSUPP, EEXIST.

  • [x] seccomp_syscall_resolve_name_arch
    Already well defined, returns the syscall value or __NR_SCMP_ERROR on failure.

  • [x] seccomp_syscall_resolve_name_rewrite
    Already well defined, returns the syscall value or __NR_SCMP_ERROR on failure.

  • [x] seccomp_syscall_resolve_name
    Already well defined, returns the syscall value or __NR_SCMP_ERROR on failure.

  • [x] seccomp_syscall_priority
    Currently returns: EINVAL, EDOM, EFAULT, ENOMEM.

  • [x] seccomp_rule_add_array
    Currently returns: EINVAL, EOPNOTSUPP, ENOMEM, EDOM, EFAULT, EEXIST.

  • [x] seccomp_rule_add
    Currently returns: EINVAL, EOPNOTSUPP, ENOMEM, EDOM, EFAULT, EEXIST.

  • [x] seccomp_rule_add_exact_array
    Currently returns: EINVAL, EOPNOTSUPP, ENOMEM, EDOM, EFAULT, EEXIST.

  • [x] seccomp_rule_add_exact
    Currently returns: EINVAL, EOPNOTSUPP, ENOMEM, EDOM, EFAULT, EEXIST.

  • [x] seccomp_notify_alloc
    Currently returns: EOPNOTSUPP, ENOMEM, EFAULT, ECANCELED. The manpage already specifies -1 on error, which likely refers to just the seccomp() errno.

  • [x] seccomp_notify_receive
    Currently returns: EOPNOTSUPP and ECANCELED. The manpage already specifies -1 on error, which likely refers to just the seccomp() errno.

  • [x] seccomp_notify_respond
    Currently returns: EOPNOTSUPP and ECANCELED. The manpage already specifies -1 on error, which likely refers to just the seccomp() errno.

  • [x] seccomp_notify_id_valid
    Currently returns: EOPNOTSUPP and ECANCELED. The manpage already specifies -ENOENT on error (invalid ID), which likely refers to just the seccomp() errno.

  • [x] seccomp_notify_fd
    Already well defined, returns the notification fd.

  • [x] seccomp_export_pfc
    Currently returns: EINVAL and ECANCELED.

  • [x] seccomp_export_bpf
    Currently returns: EINVAL, ENOMEM, and ECANCELED.
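For what it's worth, a list like the one above could be turned into a small table for tests. The following is only a sketch: the struct, the table, and `rc_is_listed` are made-up names, the table copies just a few entries from the checklist, and it assumes these functions return zero on success and a negated errno value on failure.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ERRS 8

/* One row per API function; errs is zero-terminated. */
struct rc_spec {
	const char *func;
	int errs[MAX_ERRS];
};

/* A subset of the checklist above, copied as-is. */
static const struct rc_spec rc_specs[] = {
	{ "seccomp_reset",      { EINVAL, ENOMEM } },
	{ "seccomp_merge",      { EINVAL, EDOM, EEXIST, ENOMEM } },
	{ "seccomp_arch_add",   { EINVAL, EEXIST, ENOMEM, EDOM } },
	{ "seccomp_load",       { EINVAL, ENOMEM, ESRCH, ECANCELED } },
	{ "seccomp_export_pfc", { EINVAL, ECANCELED } },
	{ "seccomp_export_bpf", { EINVAL, ENOMEM, ECANCELED } },
};

/*
 * Hypothetical test helper: true if rc is zero (assumed success) or a
 * negated errno value listed for func; false for anything else, including
 * functions that are not in the table.
 */
bool rc_is_listed(const char *func, int rc)
{
	size_t i, j;

	for (i = 0; i < sizeof(rc_specs) / sizeof(rc_specs[0]); i++) {
		if (strcmp(rc_specs[i].func, func) != 0)
			continue;
		if (rc == 0)
			return true;
		for (j = 0; j < MAX_ERRS && rc_specs[i].errs[j] != 0; j++) {
			if (rc == -rc_specs[i].errs[j])
				return true;
		}
		return false;
	}
	return false;
}
```

A regression test could call each API with known-bad input and check the result against the table, which would catch a code change that silently introduces an undocumented return value.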

Now that we have a list of which functions return which error codes, I'm feeling a bit better about this, especially since we are already fairly consistent with how we use our error codes. That last bit should help tremendously.

I'm going to start a PR so we can start collecting fixes and feedback on the changes, I'll post that here soon.

Given the rather "special" nature of ENOSYS, I'm having reservations about using the errno as our kernel/libc catch-all. I'll have to take a look at other values, does anyone have any strong feelings/thoughts on this?

There is still a lot missing, mostly manpage edits and code comments (not to mention testing), but you can look at the following branch to get an idea of what I'm thinking:

Given the rather "special" nature of ENOSYS, I'm having reservations about using the errno as our kernel/libc catch-all. I'll have to take a look at other values, does anyone have any strong feelings/thoughts on this?

What about EIO?

I don't know if that would make the error much more actionable. How about instead extending the API with functions which do not use errno values, but for example:

  • SCMP_ERROR_UNKNOWN_SYSCALL: the syscall is not known by libseccomp: caller can use this to reject user input (e.g. typo in syscall name)
  • SCMP_ERROR_SYSCALL_NOT_FOR_THIS_ARCH: the syscall is known by libseccomp but not available here: caller can ignore this error for just this architecture
  • SCMP_ERROR_API_USAGE: libseccomp detects a problem with the calling logic which should not happen in correctly written code: caller could trigger assert()
  • SCMP_ERROR_KERNEL_OTHER: libseccomp was OK with the input and the sequence of calls, but kernel returned certain well defined errors e.g. ENOMEM, EPERM, ENOSYS: caller should check the errno for action. In case the reason for error is known by libseccomp to happen because of error made by caller (say EFAULT), maybe SCMP_ERROR_API_USAGE could be used instead.
  • SCMP_ERROR_KERNEL_API_USAGE: libseccomp was OK with the input etc, but kernel didn't like it for not so clear and obvious reasons. This could indicate a kernel change, disabled config, bug in libseccomp or caller, very weird user input etc. Action for the caller could be to log the event and errno with request to pass the info to developers (caller and/or libseccomp) for further analysis, not assert()able.

Alternatively, the API with errnos could remain as is, but a new function could be used to request the above error code.
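To illustrate how a caller might consume such codes, here is a sketch in C. The `SCMP_ERROR_*` names are the ones proposed above and do not exist in libseccomp; the numeric values, the `caller_action` enum, and `caller_policy` are invented for the example and simply mirror the action suggested in each bullet.

```c
/* Proposed error classes from the list above; not part of libseccomp. */
enum scmp_error {
	SCMP_ERROR_UNKNOWN_SYSCALL = 1,
	SCMP_ERROR_SYSCALL_NOT_FOR_THIS_ARCH,
	SCMP_ERROR_API_USAGE,
	SCMP_ERROR_KERNEL_OTHER,
	SCMP_ERROR_KERNEL_API_USAGE,
};

/* Hypothetical caller-side reactions, one per proposed error class. */
enum caller_action {
	ACTION_REJECT_INPUT,	/* e.g. typo in a syscall name */
	ACTION_SKIP_ARCH,	/* ignore for this architecture only */
	ACTION_ASSERT,		/* bug in the calling logic */
	ACTION_CHECK_ERRNO,	/* well-defined kernel error, inspect errno */
	ACTION_LOG_AND_REPORT,	/* unclear kernel failure, ask for a report */
};

/* Map a proposed error class to the action its bullet suggests. */
enum caller_action caller_policy(enum scmp_error err)
{
	switch (err) {
	case SCMP_ERROR_UNKNOWN_SYSCALL:
		return ACTION_REJECT_INPUT;
	case SCMP_ERROR_SYSCALL_NOT_FOR_THIS_ARCH:
		return ACTION_SKIP_ARCH;
	case SCMP_ERROR_API_USAGE:
		return ACTION_ASSERT;
	case SCMP_ERROR_KERNEL_OTHER:
		return ACTION_CHECK_ERRNO;
	case SCMP_ERROR_KERNEL_API_USAGE:
		return ACTION_LOG_AND_REPORT;
	}
	/* Unknown class: treat it like the least specific case. */
	return ACTION_LOG_AND_REPORT;
}
```

The first two cases are the ones the original request cares about: a caller building a multi-arch filter could reject a mistyped name outright but quietly skip a syscall that merely does not exist on one architecture.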

Perhaps @poettering or @keszybz could comment too.

I don't know if that would make the error much more actionable. How about instead extending the API with functions which do not use errno values ...

Well, before we could even consider doing something like that (and I'm not sure we want to do that) we need to settle on stable and supported return codes. That's what we are working on here and what we are targeting for v2.5.

Let's get the stable/supported return codes in v2.5 and we can see how that goes, if we need to do something additional we can consider it for v2.6.

What about EIO?

I got pulled off this for a while due to other work priorities and some kernel stuff, but now that I'm back on libseccomp I'm realizing that using EIO here seems wrong. Let me think a bit more on this.

What about ECANCELED as the catch-all kernel error code?

I'm curious what you think @drakenclimber, you've been quiet on this for a while.

I got pulled off this for a while due to other work priorities and some kernel stuff, but now that I'm back on libseccomp I'm realizing that using EIO here seems wrong. Let me think a bit more on this.

Same here. Things have been a little hectic lately :/.

I'm curious what you think @drakenclimber, you've been quiet on this for a while.

Sure. I want to read through the whole thread again, and then I'll chime in.

What about ECANCELED as the catch-all kernel error code?

I admit I wasn't tremendously familiar with ECANCELED. I briefly looked through the kernel source for its usage and did a google search as well. ECANCELED has no collisions with previous libseccomp, prctl(), or other APIs we've used, and I think it can reasonably encapsulate whatever error the kernel throws at us.

tl;dr - I'm cool with ECANCELED as our catch-all for kernel errors.

Great, thanks. I've got a half-baked patchset I'll finish up and submit as a PR for review.
