On Locking Schemes on Linux Device Drivers

Hello fellow application developer or maintainer,

recently we (cdrkit and libburnia developers) came across increasing problems with reliable and safe device locking. This paper collects our ponderings after having received this advice from Alan Cox on LKML: http://lkml.org/lkml/2007/3/31/175 and having sincerely attempted to solve the problem in user space.

Introduction

Our concern is the influence of even read-only operations on optical media drives (recorders) while they are recording. Depending on the device model, such interference can spoil the recording process and possibly waste the medium.

Since many programs already act on such devices we see the need for reliable communication in order to allow proper device locking if good will for cooperation is present.

Well, in short: Good will seems not to be enough. We failed to find a viable method for the necessary coordination of the participants.

State of the practice

There are various locking techniques used in other areas which are more or less applicable in our case.

Path/Inode based locking mechanisms

In general, these mechanisms are not well suited for our purpose. They use the filename or inode as identity. In our case this poses problems: they fall short in two places, which makes them unreliable when used alone:

We evaluated:

Other locking mechanisms


None of the mechanisms above solves the problem with the co-existing drivers for sr and sg, anyway.

Applicability on CD/(HD)DVD/BD drives

As explained in the introduction, locking is important for optical media recording because the drive is in a delicate operation mode while it records. Ideally, no other application should touch the drive. Even reading info from the drive can spoil the recording run. Currently we are aware of at least the following participants in drive collisions. They take differing precautions for this case, none of which can really prevent an inadvertent open(2) of a busy drive under all circumstances.

Any of the listed programs can currently spoil a recording run just by its normal operation if the circumstances are unfortunate enough. This compilation is mostly heuristic and may be erroneous in details. In any case, the problems and the users' disappointment are real.

Proposed locking algorithm

It adopts the FHS idea of locking a proxy before any open(2) is performed, but avoids the known drawbacks of the FHS /var/lock/ protocol.

It is designed to allow the use of any of the sg, sr, scd device drivers at the discretion of the programs. It is also designed to include the less ambiguous situation of drive access via /dev/hd*.


Compliant processes apply open(2) to suspected CD/DVD burner device files only if they are able to do this via one of the following paths:

and only after they have obtained a lock on them. (N= 31 or 255 ?)

Locking is performed similarly to the UUCP tradition, but without the potential race conditions or potential stale locks: Unlike with FHS /var/lock, it is not the mere existence of the lock file that establishes the lock state. The lock is instead implemented by open(2) with O_RDWR and then fcntl(2) with F_SETLK. The lock file descriptor is held open until the lock is obsolete.
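As a minimal sketch of the open(2)-then-fcntl(2) step, the following Python uses os.open and fcntl.lockf, which wraps the fcntl(2) locking calls (with LOCK_NB it issues F_SETLK). The function names, the O_CREAT flag and the 0o666 mode are assumptions made for illustration; the text above only prescribes O_RDWR and F_SETLK.

```python
import errno
import fcntl
import os


def acquire_proxy_lock(lock_path):
    """Try to lock a proxy file; return its open fd, or None if it is busy.

    The lock lives only as long as the returned descriptor stays open,
    so a crashed holder cannot leave a stale lock behind.
    """
    # O_CREAT and the mode are assumptions; the text only names O_RDWR.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # LOCK_EX | LOCK_NB results in fcntl(fd, F_SETLK, ...) with F_WRLCK.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # another process holds the lock
        raise
    return fd


def release_proxy_lock(fd):
    """Drop the lock by closing the descriptor that carries it."""
    os.close(fd)
```

Note that fcntl(2) record locks belong to the process: closing any descriptor for the lock file releases the lock, and a second attempt from within the same process does not conflict with the first.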

Paths other than the permissible ones have to be translated. The call stat(2) with its result element .st_rdev makes it possible to search for a matching device file among the permissible ones. So /dev/nec_burner can be translated to exactly one of /dev/sr0, /dev/sg2, /dev/hdd. (If not, then it is hardly a burner device.)
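The translation by st_rdev could look roughly like the sketch below. The function name translate_device_path and the caller-supplied candidate list are hypothetical. Comparing the file type (block versus character) in addition to st_rdev is an assumption beyond the text, added because a block device and a character device can carry the same device number.

```python
import os
import stat


def translate_device_path(path, permissible):
    """Return the permissible device file that is the same device as path.

    Two device files count as the same device when both the file type
    (block or character) and st_rdev agree. Returns None if nothing matches.
    """
    target = os.stat(path)  # follows symlinks such as /dev/nec_burner
    if not (stat.S_ISBLK(target.st_mode) or stat.S_ISCHR(target.st_mode)):
        return None  # not a device file at all
    for candidate in permissible:
        try:
            st = os.stat(candidate)
        except OSError:
            continue  # candidate does not exist on this system
        same_type = stat.S_IFMT(st.st_mode) == stat.S_IFMT(target.st_mode)
        if same_type and st.st_rdev == target.st_rdev:
            return candidate
    return None
```

Only stat(2) is used here, so the translation itself never opens the drive and cannot disturb a running burn.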

To circumvent the sg-sr-scd ambiguity, those devices must get locked in all three of their permissible path instances. E.g. not only /dev/sr0 has to be locked before open(2) for serious usage is allowed, but also /dev/sg2 and /dev/scd0.
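The all-three-or-nothing requirement can be sketched as follows. This is only an illustration: the function name lock_all_or_nothing is hypothetical, the caller supplies the lock file paths (how a device path maps to its lock file is not settled in this paper), and O_CREAT with mode 0o666 is an assumption.

```python
import fcntl
import os


def lock_all_or_nothing(lock_paths):
    """Lock every proxy file in lock_paths, or none of them.

    Returns the list of open descriptors on success, None on failure.
    """
    held = []
    for path in lock_paths:
        # O_CREAT and the mode are assumptions made for this sketch.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
        held.append(fd)
        try:
            # Non-blocking exclusive lock, i.e. fcntl(2) with F_SETLK.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Someone else holds this one: give back everything we took.
            for taken in held:
                os.close(taken)
            return None
    return held
```

A caller would keep the returned descriptors open for the whole drive session and close them all when the drive is no longer needed.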

The device triples are formed from those device files which have the same SCSI parameters Host,Channel,Id,Lun from ioctl(SCSI_IOCTL_GET_IDLUN). Since this needs open(2), the search has to be accompanied by the locking of the tested files. Those which do not match get released immediately. If all three files are found and locked, it is guaranteed that any of them is free for usage. If any of the three is not found, then the lock is not granted due to a suspected collision between two locking contestants.
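To illustrate the comparison by SCSI address, here is a hedged sketch of the SCSI_IOCTL_GET_IDLUN call. The request number 0x5382 and the packing of the first 32-bit word (id, lun, channel, host number, one byte each, from low to high) follow the Linux SCSI headers and kernel source as far as we know them. The function names are hypothetical, and the ioctl part has to be run against a real SCSI device file to show anything.

```python
import fcntl
import struct

SCSI_IOCTL_GET_IDLUN = 0x5382  # value from <scsi/scsi.h>


def decode_idlun(dev_id):
    """Split the packed dev_id word into (host, channel, id, lun)."""
    return (
        (dev_id >> 24) & 0xFF,  # host number
        (dev_id >> 16) & 0xFF,  # channel
        dev_id & 0xFF,          # target id
        (dev_id >> 8) & 0xFF,   # lun
    )


def scsi_address(fd):
    """Ask the kernel for the SCSI address behind an open device fd."""
    buf = bytearray(8)  # struct { u32 dev_id; u32 host_unique_id; }
    fcntl.ioctl(fd, SCSI_IOCTL_GET_IDLUN, buf)  # fills buf in place
    dev_id, _host_unique_id = struct.unpack("=II", buf)
    return decode_idlun(dev_id)


def same_drive(fd_a, fd_b):
    """True if both descriptors address the same Host,Channel,Id,Lun."""
    return scsi_address(fd_a) == scsi_address(fd_b)
```

Each field is truncated to 8 bits by this ioctl, so two drives on hosts whose numbers differ only above 255 would compare equal here.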

This cannot disturb a serious drive operation because such an operation is allowed to start only if all three paths are locked. Thus there would be no starting point for a device-triple search at all.

Further precautions like open(O_EXCL) or fcntl(F_SETLK) on the device file itself are allowed. Programs are asked politely to offer expert options to disable them. In general a program is free to use a device in any way after a lock has been obtained successfully.


All we need for this is a directory which is present on any Linux system and is supposed to offer rwx-permissions to anybody who is allowed to access the devices.

As an application programmer I would propose /tmp/ and some file name prefix. It would work, after all. It would be covered by the FHS specs, except for the fact that /var/lock is the paragraph which matches our problem more specifically, and fails to solve it.

To perform the sketched algorithm in /var/lock would violate FHS. The often restrictive permission settings of /var/lock would also make an additional rule necessary: If a lock file is missing and cannot be created, the device may be used as if a lock had been granted. (Provident sysadmins would then create the lock files in /var/lock/ once and allow rw-permission for all intended users.)
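The additional rule adds a third outcome besides "granted" and "busy". A minimal sketch, assuming hypothetical state names and a hypothetical function try_lock_with_fallback: if the lock file is missing and cannot be created, the caller proceeds without a lock. The case of an existing but unopenable lock file is not covered by the rule above, so the sketch simply re-raises the error.

```python
import errno
import fcntl
import os

GRANTED, BUSY, NO_LOCK_FILE = "granted", "busy", "no-lock-file"


def try_lock_with_fallback(lock_path):
    """Return (state, fd); fd is None unless state is GRANTED."""
    try:
        fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
    except OSError:
        if not os.path.lexists(lock_path):
            # Missing and not creatable: use the device as if locked.
            return NO_LOCK_FILE, None
        raise  # file exists but is unusable: no rule is defined for this
    try:
        # Non-blocking exclusive lock, i.e. fcntl(2) with F_SETLK.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return BUSY, None
        raise
    return GRANTED, fd
```

Only BUSY forbids the use of the device. With NO_LOCK_FILE the program runs unprotected, which is the price of restrictive permissions on the lock directory.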

This is where we should ask the broad Linux public for opinions and advice. We are not much in a hurry and therefore should ponder duly over every aspect.