Ticket #149 (new defect)
Opened 4 weeks ago
Deadlock on left-over semaphores
Description
With the persistent shared memory registry, the registered segments are persistent as well. In particular, they survive a crash of Fawkes, for example one caused by a segfault in a plugin.
Now consider a buffer with an assigned semaphore used as a read-write lock. In our case the writer was the openni-data pointcloud thread. This thread locks, for example, the XYZ buffer for writing, writes the converted point cloud data, and unlocks the buffer again. If the process crashes while this semaphore is locked (we had this in some students' experimental code), the semaphore stays locked. Since it is persistent, the buffer remains locked across the next restart of Fawkes, causing an immediate deadlock in the pointcloud thread.
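The failure mode can be reproduced outside of Fawkes. The following is a minimal sketch, assuming the buffer lock is a System V semaphore (the ticket does not say which kind is used) and using a hypothetical helper name, lock_survives_crash(). A child process takes the semaphore and exits without releasing it, which stands in for the crashing plugin. The parent then plays the restarted process and tries to take the same semaphore without blocking.

```cpp
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// On Linux the caller has to define this union for semctl().
union semun {
  int val;
  struct semid_ds *buf;
  unsigned short *array;
};

// Hypothetical helper, not part of Fawkes. Returns true if a semaphore
// taken by a "crashed" child is still locked when the parent checks it.
bool lock_survives_crash()
{
  // One semaphore with initial value 1, i.e. unlocked.
  int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
  if (semid < 0) return false;

  union semun arg;
  arg.val = 1;
  if (semctl(semid, 0, SETVAL, arg) < 0) {
    semctl(semid, 0, IPC_RMID);
    return false;
  }

  pid_t pid = fork();
  if (pid < 0) {
    semctl(semid, 0, IPC_RMID);
    return false;
  }
  if (pid == 0) {
    // Child: take the lock without SEM_UNDO, then die without unlocking.
    struct sembuf lock;
    lock.sem_num = 0;
    lock.sem_op  = -1;
    lock.sem_flg = 0;
    semop(semid, &lock, 1);
    _exit(1);
  }

  int status = 0;
  waitpid(pid, &status, 0);

  // Parent: a non-blocking lock attempt fails with EAGAIN if the
  // semaphore is still held. A blocking attempt would hang here.
  struct sembuf try_lock;
  try_lock.sem_num = 0;
  try_lock.sem_op  = -1;
  try_lock.sem_flg = IPC_NOWAIT;
  bool still_locked =
    (semop(semid, &try_lock, 1) == -1 && errno == EAGAIN);

  // Clean up the semaphore set so the sketch leaves nothing behind.
  semctl(semid, 0, IPC_RMID);
  return still_locked;
}
```

The sketch removes its semaphore set at the end. In the situation described above nothing removes it, so the locked state is what the next start finds.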
The workaround is to call 'fvshmem -c' between restarts to force the buffers to be removed. A freshly started Fawkes will then re-initialize the buffers and semaphores.
To fix this ticket a solution might be to add buffer timeouts. It is reasonable to assume that buffers on a robot change frequently, or at least have some typical update interval. This interval could be stored in the shared memory header. If the buffer is later opened again and has not been updated for, say, twice the expected interval, the buffer should be considered dead and re-created (or at least its semaphore). We probably want a way to opt out of this behavior, for example by setting the expected interval to 0.
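The timeout check proposed in this ticket could look roughly like the following. This is only a sketch of the idea: the struct BufferHeader, its field names, the millisecond unit and the function buffer_is_dead() are assumptions made for illustration, not the existing Fawkes shared memory header.

```cpp
#include <cstdint>

// Hypothetical header fields; not the actual Fawkes shared memory header.
struct BufferHeader {
  std::int64_t last_update_ms;        // time of the last completed write
  std::int64_t expected_interval_ms;  // typical update interval, 0 = opt out
};

// Returns true if the buffer has gone without an update for more than
// twice its expected interval and should be treated as dead.
bool buffer_is_dead(const BufferHeader &h, std::int64_t now_ms)
{
  if (h.expected_interval_ms <= 0) {
    return false;  // expected interval 0: timeout handling is disabled
  }
  return (now_ms - h.last_update_ms) > 2 * h.expected_interval_ms;
}
```

A process opening an existing buffer would run this check first and, if it returns true, re-create the buffer or at least re-initialize its semaphore before trying to lock it. The factor of two is the "say twice the expected time" from the description and would probably be worth making configurable.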

