The Raku Precompilation Problem and Proposals

This page outlines the problem that precompilation poses for Raku (formerly Perl 6) module packaging within Debian, gives some technical background, and lists possible ways to handle it. This page most certainly needs your help; if you have any input, please contribute!

Raku Precompilation

When rakudo tries to load a module, it first checks whether the module already exists on disk in precompiled form. If it does, the precompiled form is loaded; otherwise the source is loaded, precompiled and stored. This saves a lot of time and is generally a great thing to do. The behaviour can be observed by setting RAKUDO_MODULE_DEBUG=1:

  robertle@momoko:~/test2$ find
  .
  ./test.pl6
  ./lib
  ./lib/MyShit.pm6
  robertle@momoko:~/test2$ PERL6LIB=lib RAKUDO_MODULE_DEBUG=1 ./test.pl6 
   1 RMD: Loading settings CORE
   1 RMD: going to load Perl6::BOOTSTRAP
   1 RMD: Settings CORE loaded
   1 RMD: Attempting 'MyShit' as a pragma
   1 RMD:   'MyShit' is not a valid pragma
   1 RMD: Attempting to load 'MyShit'
   1 RMD:   Late loading 'MyShit'
   1 RMD: Parsing specs: lib
   1 RMD: try-load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B: /home/robertle/test2/lib/MyShit.pm6
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/.perl6/precomp
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/site/precomp
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/vendor/precomp
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/precomp
   1 RMD: Precompiling /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc (  )
   2     RMD: Loading settings CORE
   2     RMD: going to load Perl6::BOOTSTRAP
   2     RMD: Settings CORE loaded
   1 RMD: Precompiled /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc
   1 RMD: Writing dependencies and byte code to /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.tmp for source checksum: 74105D252BB3B89F8D88CCA0698D99AB8A855F9A
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id from /home/robertle/test2/lib/.precomp
   1 RMD: Loading precompiled
        /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
   1 RMD: Performing imports for 'MyShit'
   1 RMD: Imports for 'MyShit' done
  robertle@momoko:~/test2$ PERL6LIB=lib RAKUDO_MODULE_DEBUG=1 ./test.pl6 
   1 RMD: Loading settings CORE
   1 RMD: going to load Perl6::BOOTSTRAP
   1 RMD: Settings CORE loaded
   1 RMD: Attempting 'MyShit' as a pragma
   1 RMD:   'MyShit' is not a valid pragma
   1 RMD: Attempting to load 'MyShit'
   1 RMD:   Late loading 'MyShit'
   1 RMD: Parsing specs: lib
   1 RMD: try-load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B: /home/robertle/test2/lib/MyShit.pm6
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
   1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id from /home/robertle/test2/lib/.precomp
   1 RMD: Loading precompiled
        /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
   1 RMD: Performing imports for 'MyShit'
   1 RMD: Imports for 'MyShit' done
  robertle@momoko:~/test2$ find .
  .
  ./test.pl6
  ./lib
  ./lib/.precomp
  ./lib/.precomp/.lock
  ./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C
  ./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55
  ./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id
  ./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
  ./lib/MyShit.pm6

Note the 09C8...315C in the path: this is specific to the rakudo compiler in question, since the precompiled form depends on the compiler to some degree. This compiler ID can be checked from within Raku through $*PERL.compiler.id. It appears to be created via nqp/tools/build/gen-version.pl by hashing the relevant sources. The 5535...B91B is a hash of the module name and related metadata (version, author...), and the /55/ in the middle is just the first two characters of the module hash, which makes sure we do not end up with millions of files in the same directory.
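
The directory layout described above can be sketched in a few lines of Python. This is only an illustration of the path structure seen in the debug output, not rakudo's actual implementation; the function name precomp_path is made up for this example, and the two IDs are simply copied from the transcript.

```python
from pathlib import PurePosixPath


def precomp_path(store: PurePosixPath, compiler_id: str, module_id: str) -> PurePosixPath:
    """Build <store>/<compiler-id>/<first two chars of module-id>/<module-id>.

    Hypothetical helper: it mirrors the layout visible in the RMD log,
    it does not compute the IDs themselves.
    """
    return store / compiler_id / module_id[:2] / module_id


# Values taken from the transcript above.
store = PurePosixPath("/home/robertle/test2/lib/.precomp")
compiler_id = "09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C"
module_id = "5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B"

print(precomp_path(store, compiler_id, module_id))
```

The two-character bucket directory is the only derived component; everything else is passed in unchanged.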

The precompiled files can go into other directories/repositories than the one the source is in though, which can be seen with a simple experiment:

robertle@momoko:~/test2$ chmod -w lib
robertle@momoko:~/test2$ find
.
./test.pl6
./lib
./lib/MyShit.pm6
./lib2
robertle@momoko:~/test2$ PERL6LIB=lib2,lib RAKUDO_MODULE_DEBUG=1 ./test.pl6 
 1 RMD: Loading settings CORE
 1 RMD: going to load Perl6::BOOTSTRAP
 1 RMD: Settings CORE loaded
 1 RMD: Attempting 'MyShit' as a pragma
 1 RMD:   'MyShit' is not a valid pragma
 1 RMD: Attempting to load 'MyShit'
 1 RMD:   Late loading 'MyShit'
 1 RMD: Parsing specs: lib2,lib
 1 RMD: try-load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B: /home/robertle/test2/lib/MyShit.pm6
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib2/.precomp
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/.perl6/precomp
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/site/precomp
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/vendor/precomp
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/precomp
 1 RMD: Precompiling /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc (  )
 2     RMD: Loading settings CORE
 2     RMD: going to load Perl6::BOOTSTRAP
 2     RMD: Settings CORE loaded
 1 RMD: Precompiled /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc
 1 RMD: Writing dependencies and byte code to /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.tmp for source checksum: 74105D252BB3B89F8D88CCA0698D99AB8A855F9A
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib2/.precomp
 1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id from /home/robertle/test2/lib2/.precomp
 1 RMD: Loading precompiled
        /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
 1 RMD: Performing imports for 'MyShit'
 1 RMD: Imports for 'MyShit' done
robertle@momoko:~/test2$ find
.
./test.pl6
./lib
./lib/MyShit.pm6
./lib2
./lib2/.precomp
./lib2/.precomp/.lock
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B

In the case above, the precompiled files could not be written to the repository the source came from, so they were stored in lib2 instead. This only works in one direction in the repository stack, though. In the case above the repository stack was:

    /home/robertle/test2/lib2
    /home/robertle/test2/lib
    /home/robertle/.perl6
    /home/robertle/rakudo/install/share/perl6/site
    /home/robertle/rakudo/install/share/perl6/vendor
    /home/robertle/rakudo/install/share/perl6
    CompUnit::Repository::AbsolutePath<94388303163488>
    CompUnit::Repository::NQP<94388281200304>
    CompUnit::Repository::Perl5<94388281200344>

Note that these are all local paths; a properly deployed Debian rakudo package would of course have the site and vendor directories outside user home directories.
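
The lookup order visible in the "Trying to load ... from ..." lines can be modelled roughly as follows. This is a simplified sketch under the assumption that each repository contributes one precomp store directory and that the first hit wins; the function name find_precompiled is hypothetical, and the sketch deliberately ignores the .repo-id check and the rules for where new precomp files get written.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_precompiled(stores: Iterable[Path], compiler_id: str, module_id: str) -> Optional[Path]:
    """Return the first existing precomp file in the given store order, or None.

    Hypothetical helper: walks the stores in stack order, like the
    sequence of 'Trying to load' lines in the debug output.
    """
    for store in stores:
        candidate = store / compiler_id / module_id[:2] / module_id
        if candidate.is_file():
            return candidate
    return None


# Store order copied from the second transcript (first three entries only).
stores = [
    Path("/home/robertle/test2/lib2/.precomp"),
    Path("/home/robertle/test2/lib/.precomp"),
    Path("/home/robertle/.perl6/precomp"),
]
hit = find_precompiled(
    stores,
    "09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C",
    "5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B",
)
print(hit if hit is not None else "not precompiled yet")
```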

The Problem

The most basic problem with this is that we need to clean up after packages, but if precompiled files get generated on the system, we cannot tell which precomp files belong to which package. So what should happen on package de-install?

This gets worse with regular users: if a module in site/vendor (so a Raku package installed through apt) first gets loaded by someone with write permissions to that directory, the precomp files get generated there. If, however, it first gets loaded by a user who cannot write to that directory, that user will end up with precomp files in her local .perl6 instead. Cluttering the user's home directory with package data seems less than awesome.

Additionally, when rakudo gets upgraded so that the compiler ID changes, the old precomp files are no longer valid and new ones will be created. Again, these could be created in user home directories if a user is the first to load the module after the new rakudo is installed. So rakudo upgrades cause extra littering with precomp files.
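
Because each compiler ID gets its own subdirectory in a precomp store, leftovers from an older rakudo can at least be identified by name. The following sketch lists such directories; it assumes the layout shown in the transcripts above, and the function name stale_compiler_dirs is invented for this example. It only reports candidates and does not delete anything.

```python
from pathlib import Path
from typing import List


def stale_compiler_dirs(store: Path, current_compiler_id: str) -> List[Path]:
    """List subdirectories of a precomp store that belong to other compiler IDs.

    Hypothetical helper: plain files such as .lock are skipped because
    only directories are considered.
    """
    if not store.is_dir():
        return []
    return sorted(
        entry
        for entry in store.iterdir()
        if entry.is_dir() and entry.name != current_compiler_id
    )


# Example call against a user's store; prints an empty list if it does not exist.
print(stale_compiler_dirs(Path.home() / ".perl6" / "precomp",
                          "09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C"))
```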

Possible Solutions

A) Re-Pre-Compile at module install and rakudo upgrade

The original discussion and more details are in https://github.com/ugexe/zef/issues/117. The basic strategy is: each module package would only ship source files. During installation, these sources would be copied to a tempdir and precompiled there. The precomp files would then be copied to the actual repository, and the list of precomp files would be stored. On package de-install, these files can then be cleaned up. On package upgrade, we would first remove the precomp files from the old version, and then do the same for the new version. Each module package would also be registered on installation, so that the cleanup and regeneration of precomp files can be triggered by a rakudo upgrade.
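
The bookkeeping part of this strategy (store the list of precomp files at install time, remove exactly those files at de-install time) could look roughly like the sketch below. It is not the tooling that Debian or zef actually use; the JSON manifest format and the function names record_precomp_files and remove_precomp_files are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Iterable, List


def record_precomp_files(manifest: Path, files: Iterable[Path]) -> None:
    """Install step (hypothetical): remember which precomp files a package created."""
    manifest.write_text(json.dumps(sorted(str(f) for f in files)))


def remove_precomp_files(manifest: Path) -> List[str]:
    """De-install step (hypothetical): delete the recorded files, then the manifest.

    Files that are already gone are skipped; the list of files that
    were actually removed is returned.
    """
    removed = []
    for name in json.loads(manifest.read_text()):
        path = Path(name)
        if path.is_file():
            path.unlink()
            removed.append(name)
    manifest.unlink()
    return removed
```

A package upgrade would then amount to calling remove_precomp_files for the old version, precompiling the new sources, and calling record_precomp_files again. A rakudo upgrade would repeat that for every registered module package.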

B) Ship pre-compiled files in packages

Further down in the zef issue linked above, niner proposes a different strategy: Debian packages would contain both sources and precompiled files, and would only need to be unpacked. No pre/post scripting is necessary, and no precomp files are generated in user directories. This does however mean that the module package has a fairly tight dependency on the rakudo version that is compatible with these precomp files. As a consequence, we would need to re-build module packages whenever rakudo changes, and all Raku modules on a system would need to be upgraded together. Within Debian, this could be achieved via the "transitions" mechanism (Teams/ReleaseTeam/Transitions).

Discussion, Pros/Cons

Upstream documentation

* CompUnits and where to find them: How and when Raku modules are compiled, where they are stored, and how to access them in compiled form.

Other Experiences

Perl 5

Perl 5 XS modules are tightly coupled to the runtime version, and this is handled through the transition mechanism described above. It would be interesting to know what the experiences with that are, and how the automated version bumps (+b1, +b2, ...) are done.

https://lists.debian.org/debian-perl/2018/05/msg00011.html

Python

https://www.debian.org/doc/packaging-manuals/python-policy/ch-module_packages.html suggests that they also create pre-compiled files on package installation and clean them up on de-install. It looks as if .pyc files are not dependent on the Python runtime version apart from the 2.6 vs 2.7 vs 3 "major" version, which is not an issue as these live in different source directories anyway. This can also be checked by looking at any Python module package.

https://lists.debian.org/debian-python/2018/05/msg00006.html

Haskell

Haskell modules appear to be dependent on a specific libghc-base-dev version, and there are ~2.5k Haskell packages, so the Haskell team will have experience with big transitions.

https://lists.debian.org/debian-haskell/2018/05/msg00005.html

Decision

Back in 2018, we decided to go with option A). Both options have their merits and problems, but the factors tipping the scales were the need for coordinated updates across a possibly large set of modules in proposal B) and the Debian admins' dislike of a permanent tracker. There are good experiences with setups similar to A) in emacs and Python packages as well.

Experience has shown that pre-compilation at installation time takes several seconds and is not fun from the user's point of view. It is also a waste of resources. So, for Bullseye, we decided to switch to option B).

The changes are: