Revision 55 as of 2009-03-16 03:30:58
On Locking Schemes on Linux Device Drivers
Hello fellow application developer or maintainer,
recently we (cdrkit and libburnia developers) came across increasing problems with reliable and safe device locking. This paper collects our ponderings after having received this advice from Alan Cox on LKML: http://lkml.org/lkml/2007/3/31/175 and having sincerely attempted to solve the problem in user space.
Introduction
Our concern is the effect that even read-only operations have on optical media drives (recorders) while they are recording. Depending on the device model, such interference can spoil the recording process and possibly waste the medium.
Since many programs already act on such devices, we see the need for reliable communication that allows proper device locking wherever good will for cooperation is present.
But in short: good will seems not to be enough. We failed to find a viable method for the necessary coordination of the participants.
State of the practice
There are various locking techniques used in other areas which are more or less applicable in our case.
Path/Inode based locking mechanisms
In general, these mechanisms are not well suited to our purpose. They use the filename or inode as identity, which makes them unreliable when used alone in two respects:
- they do not cope with multiple device files that give access to the same driver through different paths
- they do not automatically cope with multiple device drivers that are accessible through different co-existing user space interfaces, as with the sg and sr drivers.
We evaluated:
- Lock files associated with target file
Principle: an additional file is created during the action on the real target file. See http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLOCKLOCKFILES
Pros:
- regular filesystem operation, no additional infrastructure required
Cons:
- The location and name of the lock file need to be known and agreed upon upfront among all application developers, or be documented extensively
- Permission problems may prevent the creation of lock files (security issues), especially for self-compiled applications whose users have no root permissions to install them in the required way
- Special precautions are necessary against stale locks
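The lock file principle can be sketched in C as follows. This is a minimal illustration under assumptions: the helper names lockfile_acquire and lockfile_release are hypothetical, the lock path is left to the caller, and the file content follows the HDB UUCP format that the FHS section linked above prescribes (PID as a ten-character decimal number plus newline). The sketch makes no attempt to detect stale locks left behind by a crashed process.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to create the lock file at lock_path.
 * Returns 0 on success and -1 on failure. errno is EEXIST if the
 * lock file already exists, i.e. somebody else holds the lock. */
int lockfile_acquire(const char *lock_path)
{
    assert(lock_path != NULL);

    /* O_CREAT | O_EXCL makes creation atomic: exactly one caller succeeds. */
    int fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    /* Record the owner so that others can check whether it is still alive. */
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%10ld\n", (long)getpid());
    if (write(fd, buf, (size_t)len) != (ssize_t)len) {
        int saved = errno;
        close(fd);
        unlink(lock_path);      /* do not leave a half-written lock behind */
        errno = saved;
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: give the lock back by removing the lock file. */
int lockfile_release(const char *lock_path)
{
    assert(lock_path != NULL);
    return unlink(lock_path);
}
```

A caller would acquire the lock before opening the device file and release it after closing it. The permission problem named above shows up here as open(2) failing with EACCES when the lock directory is not writable for the user.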
- fcntl(2) exclusive file locking
Principle: the lock is applied to an open file handle and thus probably refers to an inode. See fcntl(2) for details.
Pros:
- POSIX
Cons:
- needs open(2) as a precondition, which has to be avoided on unlocked device files
- locks can be released inadvertently by submodules which just open and close the same file (inode?).
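The second drawback can be made visible with a small experiment. The sketch below is written in C under assumptions: all function names are hypothetical, and an ordinary file stands in for the device file. One function takes the lock, one plays the part of a submodule that merely opens and closes the same file, and one asks from a forked child whether the lock is still in place.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Place an exclusive advisory lock on the whole file behind fd.
 * Returns 0 on success, -1 if the lock is held by another process. */
int lock_whole_file(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "up to the end of the file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Plays a submodule that only inspects the file and knows nothing of the lock. */
int innocent_submodule(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* Closing ANY descriptor of the file drops all fcntl locks
     * that this process holds on that file. */
    return close(fd);
}

/* Ask from a child process whether somebody holds a lock on path.
 * Returns 1 if locked, 0 if free, -1 on error. */
int is_locked_by_other(const char *path)
{
    assert(path != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The child does not inherit the parent's fcntl locks. */
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        struct flock fl = {0};
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

With classic POSIX record locks one would expect is_locked_by_other() to report the lock after lock_whole_file(), and to report it gone after innocent_submodule() has run in the same process, although the descriptor that took the lock is still open.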
Other locking mechanisms
- O_EXCL locking
Principle: the O_EXCL flag is passed to the open call of a device file. The device is locked exclusively for the calling PID. The lock is maintained in the device driver and bound to the particular major/minor combination.
Pros:
- reliable advisory exclusive locking for a device within one device driver
Cons:
- for sr it requires kernel 2.6.x (x>=7 or so), with sg it might work on 2.4.
- O_EXCL already has a meaning for software like libblkid, and this is not the meaning we would need.
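For illustration, the O_EXCL open can be sketched in C as below. The helper name open_drive_exclusive is hypothetical, and the choice of O_RDWR and O_NONBLOCK is an assumption about what a burn program would typically want; the programs discussed here may use other flags.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a device file such as /dev/sr0 with the
 * advisory O_EXCL lock. Returns the file descriptor, or -1 on failure.
 * On Linux 2.6 block devices errno is EBUSY if the device is already
 * in exclusive use, for example because it is mounted. */
int open_drive_exclusive(const char *dev_path)
{
    assert(dev_path != NULL);

    /* No O_CREAT: O_EXCL is used here only as a lock request to the driver.
     * O_NONBLOCK lets the open succeed even if no medium is loaded. */
    return open(dev_path, O_RDWR | O_EXCL | O_NONBLOCK);
}
```

The lock lasts until the descriptor is closed. It only excludes other openers that also pass O_EXCL to the same driver, so a program that opens the drive without the flag, or through the other driver, is not held back.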
- System V Semaphores
See semget(2) and semop(2) with SEM_UNDO. They have been considered and rejected, mainly because too many potential device names would need pre-allocated semaphore objects.
None of the mechanisms above solves the problem with the co-existing drivers for sr and sg, anyway.
Applicability on CD/(HD)DVD/BD drives
As explained in the introduction, locking matters for optical media recording because the drive is in a delicate operating mode while it records. Ideally, no other application should touch the drive during that time. Even reading information from it can spoil the recording run.
Currently we are aware of at least the following participants in drive collisions. They take differing precautions, none of which can prevent an inadvertent open(2) of a busy drive under all circumstances.
- mount: the block device is mounted with the O_EXCL flag but the mount executable also uses libblkid which opens the devices without locking and reads magic data from it. (The problem is not with mutual exclusion of mount(8) and burn programs but with libblkid justifiably misunderstanding the meaning of our O_EXCL lock.)
- hald (HAL daemon): frequently opens the block devices with O_EXCL flag.
- wodim: opens the devices with O_EXCL flag. Opening /dev/sg is possible and is more likely to happen with versions prior to 1.1.4.
- growisofs: opens the block devices with O_EXCL flag. Opening /dev/sg was never encouraged and does not work on kernel 2.4 (not tested yet on 2.6).
- cdrskin (via libburn): opens the devices with O_EXCL flag. For serious operations on the drive it uses either /dev/sr* or /dev/hd*, never both. Operations on other path representations of the same device are restricted to open(2) O_RDONLY and to obtaining the SCSI parameters host, channel, id, lun.
- cdrecord: no locking. The author recommends doing it the way Solaris does (which seems to do explicit locking, maintained internally in the device driver or on major/minor pairs).
Any of the listed programs is currently able to spoil a recording run just by its proper operation, if only the circumstances are unfortunate enough. This compilation is mostly heuristic and may be erroneous in details. In any case, the problems and the users' disappointment are real.
Hopeless proposal of a locking algorithm
In dialog with Ted Ts'o we developed a proposal which would nearly fulfill the coordination needs of well-meaning programs. Nearly. But not sufficiently, and only with substantial effort.
We finally failed due to the coarseness of O_EXCL and due to the implementation of fcntl(F_SETLK), which is not really suitable for a modular software architecture.
See the detailed specification and declaration of failure at http://libburnia.pykix.org/browser/libburn/trunk/doc/ddlp.txt?format=txt