The Raku Precompilation Problem and Proposals
This page outlines the problem that precompilation poses for Raku module packaging within Debian, gives some technical background, and lists possible ways to handle it. This page most certainly needs your help: if you have any input, please contribute!
Raku Precompilation
When rakudo loads a module, it first checks whether the module already exists on disk in precompiled form. If it does, the precompiled form is loaded; otherwise the source is loaded, precompiled and stored. This saves a lot of time and is generally a great thing to do. It can be observed by setting RAKUDO_MODULE_DEBUG=1:
robertle@momoko:~/test2$ find .
./test.pl6
./lib
./lib/MyShit.pm6
robertle@momoko:~/test2$ PERL6LIB=lib RAKUDO_MODULE_DEBUG=1 ./test.pl6
1 RMD: Loading settings CORE
1 RMD: going to load Perl6::BOOTSTRAP
1 RMD: Settings CORE loaded
1 RMD: Attempting 'MyShit' as a pragma
1 RMD: 'MyShit' is not a valid pragma
1 RMD: Attempting to load 'MyShit'
1 RMD: Late loading 'MyShit'
1 RMD: Parsing specs: lib
1 RMD: try-load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B: /home/robertle/test2/lib/MyShit.pm6
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/.perl6/precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/site/precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/vendor/precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/precomp
1 RMD: Precompiling /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc ( )
2 RMD: Loading settings CORE
2 RMD: going to load Perl6::BOOTSTRAP
2 RMD: Settings CORE loaded
1 RMD: Precompiled /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc
1 RMD: Writing dependencies and byte code to /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.tmp for source checksum: 74105D252BB3B89F8D88CCA0698D99AB8A855F9A
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id from /home/robertle/test2/lib/.precomp
1 RMD: Loading precompiled /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
1 RMD: Performing imports for 'MyShit'
1 RMD: Imports for 'MyShit' done
robertle@momoko:~/test2$ PERL6LIB=lib RAKUDO_MODULE_DEBUG=1 ./test.pl6
1 RMD: Loading settings CORE
1 RMD: going to load Perl6::BOOTSTRAP
1 RMD: Settings CORE loaded
1 RMD: Attempting 'MyShit' as a pragma
1 RMD: 'MyShit' is not a valid pragma
1 RMD: Attempting to load 'MyShit'
1 RMD: Late loading 'MyShit'
1 RMD: Parsing specs: lib
1 RMD: try-load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B: /home/robertle/test2/lib/MyShit.pm6
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id from /home/robertle/test2/lib/.precomp
1 RMD: Loading precompiled /home/robertle/test2/lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
1 RMD: Performing imports for 'MyShit'
1 RMD: Imports for 'MyShit' done
robertle@momoko:~/test2$ find .
.
./test.pl6
./lib
./lib/.precomp
./lib/.precomp/.lock
./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C
./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55
./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id
./lib/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
./lib/MyShit.pm6
Note the 09C8...315C in the path: this is specific to the rakudo compiler in question, since the precompiled form depends on the compiler to some degree. This compiler ID can be checked from within the language through $*PERL.compiler.id. It appears to be effectively created via nqp/tools/build/gen-version.pl by hashing the relevant sources. The 5535...B91B is a hash of the module name and related metadata (version, author...), and the /55/ in the middle is just the first two characters of the module hash, which makes sure we do not end up with millions of files in the same directory.
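The directory layout described above can be sketched in a few lines of Python. This is only a model of what the debug log shows for a file-system repository such as lib, not rakudo's actual implementation; the function name precomp_path is made up for illustration.

```python
from pathlib import PurePosixPath


def precomp_path(repo_root: str, compiler_id: str, module_id: str) -> PurePosixPath:
    """Model of the layout seen in the log (an assumption, not rakudo code):
    <repo>/.precomp/<compiler-id>/<first two chars of module-id>/<module-id>
    """
    return PurePosixPath(repo_root) / ".precomp" / compiler_id / module_id[:2] / module_id


# Values copied from the transcript above.
path = precomp_path(
    "/home/robertle/test2/lib",
    "09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C",
    "5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B",
)
print(path)
```

Because the compiler ID is a path component, precomp files built by different rakudo versions never overwrite each other; they simply end up in sibling directories.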
The precompiled files can also go into directories/repositories other than the one the source is in, which can be seen with a simple experiment:
robertle@momoko:~/test2$ chmod -w lib
robertle@momoko:~/test2$ find .
./test.pl6
./lib
./lib/MyShit.pm6
./lib2
robertle@momoko:~/test2$ PERL6LIB=lib2,lib RAKUDO_MODULE_DEBUG=1 ./test.pl6
1 RMD: Loading settings CORE
1 RMD: going to load Perl6::BOOTSTRAP
1 RMD: Settings CORE loaded
1 RMD: Attempting 'MyShit' as a pragma
1 RMD: 'MyShit' is not a valid pragma
1 RMD: Attempting to load 'MyShit'
1 RMD: Late loading 'MyShit'
1 RMD: Parsing specs: lib2,lib
1 RMD: try-load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B: /home/robertle/test2/lib/MyShit.pm6
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib2/.precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib/.precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/.perl6/precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/site/precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/vendor/precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/rakudo/install/share/perl6/precomp
1 RMD: Precompiling /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc ( )
2 RMD: Loading settings CORE
2 RMD: going to load Perl6::BOOTSTRAP
2 RMD: Settings CORE loaded
1 RMD: Precompiled /home/robertle/test2/lib/MyShit.pm6 into /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.bc
1 RMD: Writing dependencies and byte code to /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.tmp for source checksum: 74105D252BB3B89F8D88CCA0698D99AB8A855F9A
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B from /home/robertle/test2/lib2/.precomp
1 RMD: Trying to load 5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id from /home/robertle/test2/lib2/.precomp
1 RMD: Loading precompiled /home/robertle/test2/lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
1 RMD: Performing imports for 'MyShit'
1 RMD: Imports for 'MyShit' done
robertle@momoko:~/test2$ find .
./test.pl6
./lib
./lib/MyShit.pm6
./lib2
./lib2/.precomp
./lib2/.precomp/.lock
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B.repo-id
./lib2/.precomp/09C8C6A0D6E69AF983FA3B22BDB63DA7A1EF315C/55/5535F31A43EAA3DFE87B53B20C8FA94D1B04B91B
In the case above, the precompiled files could not be written to the repository the source came from, so they got stored in lib2 instead. This only works in one direction in the repository stack, though. In the case above the repository stack was:
/home/robertle/test2/lib2
/home/robertle/test2/lib
/home/robertle/.perl6
/home/robertle/rakudo/install/share/perl6/site
/home/robertle/rakudo/install/share/perl6/vendor
/home/robertle/rakudo/install/share/perl6
CompUnit::Repository::AbsolutePath<94388303163488>
CompUnit::Repository::NQP<94388281200304>
CompUnit::Repository::Perl5<94388281200344>
Note that these are all local paths; a properly deployed Debian rakudo package would of course have the site and vendor directories outside of user home directories.
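The fallback behaviour can be modelled roughly as follows. This is a simplified sketch, not rakudo's real logic: it assumes that "one direction" means only repositories at or before the source's position in the stack are candidates, and that the first writable one wins. The function name pick_precomp_store and the explicit set of writable repositories are illustrative assumptions.

```python
from typing import Optional, Sequence, Set


def pick_precomp_store(stack: Sequence[str], source_repo: str,
                       writable: Set[str]) -> Optional[str]:
    """Return the repository that would receive the precomp files (assumed model)."""
    # Assumption: only repositories from the head of the stack down to the
    # one holding the source are considered; later ones are never used.
    candidates = stack[: stack.index(source_repo) + 1]
    for repo in candidates:
        if repo in writable:
            return repo
    return None  # nothing writable: nothing gets stored


stack = [
    "/home/robertle/test2/lib2",
    "/home/robertle/test2/lib",
    "/home/robertle/.perl6",
    "/home/robertle/rakudo/install/share/perl6/site",
]

# The experiment above: lib is read-only, lib2 is writable.
store = pick_precomp_store(stack, "/home/robertle/test2/lib",
                           {"/home/robertle/test2/lib2", "/home/robertle/.perl6"})
print(store)
```

Under the same model, a module living in the site repository and loaded by a user who can only write to their own home directory ends up with precomp files in .perl6.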
The Problem
The most basic problem with this is that we need to clean up after packages, but if precompiled files get generated on the system, we have no way to tell which precomp files belong to which package. So what should happen on package de-install?
This gets worse with regular users: if a module in site/vendor (so a Raku package installed through apt) first gets loaded by someone with write permissions to that directory, the precomp files get generated there. If, however, it first gets loaded by a user who cannot write to that directory, that user will end up with precomp files in her local .perl6 instead. Cluttering the user's home directory with package data seems less than awesome.
Additionally, when rakudo gets upgraded so that the compiler ID changes, the old precomp files are no longer valid and new ones will be created. Again, these could be created in user home directories if a user is the first to load the module after the new rakudo install. So rakudo upgrades cause extra littering with precomp files.
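To make the littering concrete, here is a small sketch that lists the per-compiler directories in a .precomp tree that no longer match the current compiler ID. It assumes the <repo>/.precomp/<compiler-id>/... layout visible in the transcripts; the helper name and both compiler IDs in the demo are placeholders, not real values.

```python
import tempfile
from pathlib import Path
from typing import List


def stale_compiler_dirs(precomp_root: Path, current_compiler_id: str) -> List[str]:
    """Names of per-compiler subdirectories that do not match the current compiler ID."""
    return sorted(
        entry.name
        for entry in precomp_root.iterdir()
        if entry.is_dir() and entry.name != current_compiler_id
    )


# Demo on a throwaway directory; both compiler IDs are made-up placeholders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".precomp"
    (root / "OLD-COMPILER-ID" / "55").mkdir(parents=True)
    (root / "NEW-COMPILER-ID" / "55").mkdir(parents=True)
    (root / ".lock").touch()  # plain files such as .lock are ignored
    stale = stale_compiler_dirs(root, "NEW-COMPILER-ID")

print(stale)
```

Nothing in the packaging system knows about these leftover directories, which is exactly why they are hard to clean up when they sit in a user's home directory.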
Possible Solutions
A) Re-Pre-Compile at module install and rakudo upgrade
The original discussion and more details are in https://github.com/ugexe/zef/issues/117. The basic strategy is: each module package would only ship source files. During installation, these sources would be copied to a tempdir and precompiled there. The precomp files would then be copied to the actual repository, and the list of precomp files would be stored. On package de-install, these files can then be cleaned up. On package upgrade, we would first remove the precomp files from the old version, and then do the same for the new version. Each module package would also be registered on installation, so that the cleanup and regeneration of precomp files can be triggered by a rakudo upgrade.
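The install and de-install steps of option A could look roughly like the sketch below. It is not the actual helper script: the function names, the manifest format (one relative path per line) and the fake_precompile stand-in are all assumptions made for illustration, and a real implementation would invoke rakudo where the stand-in writes placeholder files.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List


def install_precomp(sources: Path, repo: Path, manifest: Path,
                    precompile: Callable[[Path, Path], None]) -> List[str]:
    """Precompile in a temp dir, copy the results into repo, record what was copied."""
    installed: List[str] = []
    with tempfile.TemporaryDirectory() as tmp:
        staged = Path(tmp) / "src"
        out = Path(tmp) / "out"
        shutil.copytree(sources, staged)   # work on a copy of the sources
        out.mkdir()
        precompile(staged, out)            # hypothetical hook: would call rakudo
        for f in sorted(p for p in out.rglob("*") if p.is_file()):
            rel = f.relative_to(out)
            target = repo / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)
            installed.append(rel.as_posix())
    # The manifest is what makes a clean de-install possible later.
    manifest.write_text("".join(name + "\n" for name in installed))
    return installed


def remove_precomp(repo: Path, manifest: Path) -> None:
    """Delete exactly the files recorded at install time, then the manifest."""
    for rel in manifest.read_text().splitlines():
        (repo / rel).unlink(missing_ok=True)
    manifest.unlink()


def fake_precompile(src: Path, out: Path) -> None:
    # Stand-in for the real compiler: one placeholder file per source file.
    for f in src.rglob("*.pm6"):
        (out / (f.name + ".precomp")).write_text("placeholder")
```

A package upgrade would call remove_precomp for the old version and install_precomp for the new one; a rakudo upgrade would do the same for every registered module package.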
B) Ship pre-compiled files in packages
Further down in the zef issue linked above, niner proposes a different strategy: Debian packages would contain both sources and precompiled files, and would only need to be unpacked. No pre/post scripting would be necessary, and no precomp files would be generated in user directories. This does however mean that the module package has a fairly tight dependency on the rakudo version that is compatible with these precomp files. As a consequence, we would need to re-build module packages whenever rakudo changes, and all Raku modules on a system would need to be upgraded together. Within Debian, this could be achieved via the "transitions" mechanism (Teams/ReleaseTeam/Transitions).
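The tight coupling in option B can be illustrated with a tiny sketch: every module package records the compiler ID it was built against, and any package whose recorded ID differs from the current one has to be rebuilt. The package names and compiler IDs below are invented placeholders, and the function is only a model of the idea, not part of any Debian tooling.

```python
from typing import Dict, List


def packages_needing_rebuild(built_against: Dict[str, str],
                             current_compiler_id: str) -> List[str]:
    """Packages whose shipped precomp files do not match the current compiler ID."""
    return sorted(
        name for name, compiler_id in built_against.items()
        if compiler_id != current_compiler_id
    )


# Hypothetical archive state right after a rakudo upload.
built_against = {
    "raku-example-a": "OLD-COMPILER-ID",
    "raku-example-b": "NEW-COMPILER-ID",
    "raku-example-c": "OLD-COMPILER-ID",
}
to_rebuild = packages_needing_rebuild(built_against, "NEW-COMPILER-ID")
print(to_rebuild)
```

After a rakudo change this list is typically every module package, which is why the rebuild has to be coordinated as a transition.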
Discussion, Pros/Cons
- Burden for user system: in A), the user system needs to do the precompilation on installation, which wastes CPU cycles. Option B avoids that, but requires new downloads of all module packages when rakudo gets upgraded in a significant way, so wastes download time and bandwidth.
- Install-time complexity: A) requires quite an elaborate scripting system that needs to be maintained, can go wrong, and has edge cases that are hard to test. Option B is relatively straightforward during the installation.
- Build-time complexity: in A) each module package can be maintained independently of the others and of the runtime; in B) we need to tightly couple them and transition them together. This is more effort, and also means that reaction times to new releases are slower. It is possible that we would not want to do this every month. We would also need to time this right around stable releases, or live with a rakudo in stable that is a month or two older than strictly necessary.
- Usage of internals: in A) we encode a lot of knowledge of rakudo internals into the scripts, which is a bit dirty and bound to break when these internals change. Note that in both cases we would need to determine when a rakudo becomes incompatible with precompiled sources through $*PERL.compiler.id, which is also a bit of an internal.
- Where does it fail: In proposal A possible failures would most likely happen on the user's system, where they are quite hard to debug and fix. In proposal B the failures are more likely to happen in our build infrastructure, where they can be debugged and fixed with relative ease.
Discussions with the Debian sys admins (see https://lists.debian.org/debian-devel/2017/05/msg00344.html and https://lists.debian.org/debian-devel/2017/06/msg00184.html) suggest that they do not like the permanent tracker required for proposal B.
Upstream documentation
* CompUnits and where to find them: How and when Raku modules are compiled, where they are stored, and how to access them in compiled form.
Other Experiences
Perl 5
Perl 5 XS modules are tightly coupled to the runtime version, and this is handled through the transition mechanism described above. It would be interesting to know what the experiences with that are, and how the automated version bumps (+b1, +b2, ...) are done.
https://lists.debian.org/debian-perl/2018/05/msg00011.html
Python
https://www.debian.org/doc/packaging-manuals/python-policy/ch-module_packages.html suggests that they also create pre-compiled files on package installation and clean them up on deinstall. It looks as if .pyc files are not dependent on the python runtime version apart from the 2.6 vs 2.7 vs 3 "major" version, which is not an issue as these live in different source directories anyway. This can also be checked by looking at any python module package.
https://lists.debian.org/debian-python/2018/05/msg00006.html
Haskell
Haskell modules appear to be dependent on a specific libghc-base-dev version and there are ~2.5k haskell packages, so they will have experiences with big transitions.
https://lists.debian.org/debian-haskell/2018/05/msg00005.html
Decision
Back in 2018, we decided to go with option A). Both options have their merits and problems, but the factors tipping the scales are the need for coordinated updates across a possibly large set of modules in proposal B and the dislike of the debian admins of a permanent tracker. There are good experiences with setups similar to A) in emacs and python packages as well.
Experience has shown that pre-compilation at installation takes several seconds, which is no fun from the user's point of view. It is also a waste of resources. So, for Bullseye, we've decided to switch to option B).
The changes are:
- The rakudo-helper.pl script is deprecated and will be removed from Rakudo once Bullseye is out.
- rakudo now has a permanent tracker.
- Raku modules are shipped with arch any (instead of all); otherwise the permanent tracker does not work.
- The new Raku module style is supported by dh-raku.