Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The interesting part to me is that this bug does not necessarily happen in an unsafe block. The fix happens in an unsafe block, I think the API should change to avoid this. Perhaps by forcing users to pass a lambda to do stuff instead of having to manually lock and drop?


The `unsafe` block was present because `List::remove` is marked `unsafe` [0]:

    /// Removes the provided item from this list and returns it.
    ///
    /// This returns `None` if the item is not in the list. (Note that by the safety requirements,
    /// this means that the item is not in any list.)
    ///
    /// # Safety
    ///
    /// `item` must not be in a different linked list (with the same id).
    pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
I think it'd be tricky at best to make this particular API safe since doing so requires reasoning across arbitrary other List instances. At the very least I don't think locks would help here, since temporary exclusive access to a list won't stop you from adding the same element to multiple lists.

[0]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...


If the API cannot be made safe then it must be marked unsafe.


I mean, remove() is already marked unsafe?

Otherwise there's the question of where exactly the API boundaries are. In the most general case, your unsafe boundary is going to be the module boundary; as long as what you publicly expose is safe modulo bugs, you're good. In this case the fix was in a crate-internal function, so I suppose one could argue that the public API was/is fine.

That being said, I'm not super-familiar with the code in question so I can't definitively say that there's no way to make internal changes to reduce the risk of similar errors.


Yeah this is a bad fix. It should be impossible to cause incorrect things to happen from safe code, especially from safe code calling safe code.


The author of the patch does mention that the better thing to do in the long run is to replace the data structure with one that is possible to better encapsulate: https://lore.kernel.org/all/20251111-binder-fix-list-remove-...



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: