

A description of the capabilities of Tea4CUPS with examples.

Introduction to Tea4CUPS

The final filter a print job passes through before being sent to a printer is the backend filter (usb, parallel, socket etc), which transports the printer-ready job to the printer. Tea4CUPS is not intended to replace the function of any existing backend being used with the printer. It is a script (written in Python) which wraps round the chosen backend to manipulate print data before or after they are dispatched by the backend to the printer.

                          DATA1=DATA2 without a Tea4CUPS filter.

+------------+    +--------------+       +----------+       +----------+       +---------+
|            |    |     CUPS     | DATA1 | Tea4CUPS | DATA2 | Tea4CUPS | DATA3 |         |
| Input file |--->|      +       |------>|  filter  |------>| wrapped  |------>| Printer |
|            |    | cups-filters |       |          |   |   | backend  |   |   |         |
+____________+    +--------------+       +----------+   |   +----------+   |   +---------+
                                                        |                  |
                                                        |                  |
                                           Prehooks <---+                  +---> Posthooks

The manipulation may be done via a combination of prehooks, posthooks and a filter. There can be only one filter per print queue but as many hooks as wanted can be specified for the same print queue. Hooks may be scripts and do not alter the data which are sent to the printer. If the data should be modified before being sent to a hook or the printer a filter should be defined.

There is ample documentation (including examples) for Tea4CUPS in /etc/cups/tea4cups.conf and in the README in /usr/share/doc/cups-tea4cups/.

Wrapping a Backend with Tea4CUPS

A print queue managed by Tea4CUPS can be created with lpadmin, the web interface or system-config-printer. If the unwrapped DEVICE_URI is socket://192.168.7.200:9100 the wrapped one (the -v option to lpadmin) would be

tea4cups:/socket://192.168.7.200:9100
or
tea4cups:socket://192.168.7.200:9100

On Jessie lpadmin complains about tea4cups://socket://192.168.7.200:9100.
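
A wrapped device-uri is the unwrapped one with a tea4cups:/ prefix, so it can be built mechanically. The following is a minimal sketch only; the function name wrap_uri is hypothetical and is not part of Tea4CUPS.

```shell
#!/bin/bash
# wrap_uri (hypothetical helper): prefix an existing DEVICE_URI with
# tea4cups:/ so that it can be given to the -v option of lpadmin.
wrap_uri() {
    printf 'tea4cups:/%s\n' "$1"
}

wrap_uri socket://192.168.7.200:9100
# prints tea4cups:/socket://192.168.7.200:9100
```

The output could then be used as the argument to -v, for example with lpadmin -p <queue> -v "$(wrap_uri <DEVICE_URI>)" -E -m <PPD>.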

Capturing an Input File

tea4cups:// is a special device-uri in that it is virtual. It is useful when sending data to a real printer with a wrapped backend is not wanted. The input is still available to be manipulated by hooks and/or a filter.

It is tempting to think tea4cups:/file:/tmp/out would print to /tmp/out. However, file:/ is built into CUPS and is not a backend device-uri. But, with a virtual device-uri, a hook could be used for capturing the input file. The following technique is used in some of the scripts on this page.

Set up a virtual raw queue:

   lpadmin -p virtq -v tea4cups:// -E -m raw

The input file avoids any filtering which would be done by cups-filters and instead is sent directly to the tea4cups wrapped backend:

                        DATA1=DATA2 without a Tea4CUPS filter.

+------------+         +----------+       +------------------+          +---------+
|            |  DATA1  | Tea4CUPS | DATA2 | Tea4CUPS wrapped |   DATA3  |         |
| Input file |-------->|  filter  |------>| virtual backend  |--------->| Nowhere |
|            |         |          |   |   |                  |    |     |         |
+____________+         +----------+   |   +------------------+    |     +---------+
                                      |                           |
                                      |                           |
                         Prehooks <---+                           +---> Posthooks

A hook or filter can now work directly with the input file to modify it and send the result to a print queue which has been set up with:

  lpadmin -p realq -v <DEVICE_URI> -E -m <PPD>

Jobs received from virtq can be sent to realq using a prehook or posthook with the line

  lp -d realq <OPTIONS> <file>

in it.

Enforcing number-up=2 Printing (1)

A question which is sometimes seen concerns wanting to enforce default printer settings for applications and printing from the command line. It might be because some users on a network don't have the patience to check their printer settings or maybe it is seen as desirable to limit the alteration of some settings on an expensive-to-run colour printer. Within CUPS there is no surefire way of achieving this. A user is either allowed to print or not print. If allowed to print then that user can change the print job options in the application's interface or with lpoptions. This is irrespective of how the print queue has been set up.

Set up a virtual raw queue. A prehook such as

  prehook_something : lp -d realq -o "$TEAOPTIONS" "$TEADATAFILE"

would pass the options given to virtq on to realq unchanged.

With

  prehook_enforce : lp -d realq -o "$TEAOPTIONS number-up=2" "$TEADATAFILE"

the number-up=2 option replaces any similar option in $TEAOPTIONS.

An option to print double-sided might not be present in the virtq option list but it might be wanted as a default for printing to realq. sides=two-sided-long-edge looks like something which could be put in prehook_enforce to make every print job use the duplex facility on the printer.
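
The prehook above relies on the last occurrence of an option taking precedence. If it is preferred to remove the user's value explicitly before adding the enforced one, something like the following sketch could be used inside a hook script. The function name enforce_option is hypothetical, and the sketch assumes the options are whitespace-separated name=value words with no embedded spaces or shell wildcard characters.

```shell
#!/bin/bash
# enforce_option (hypothetical helper): drop any existing setting of an
# option from an option string and append the enforced value.
#   $1  option string, e.g. the contents of $TEAOPTIONS
#   $2  option name
#   $3  value to enforce
enforce_option() {
    local out="" opt
    # Unquoted on purpose: split the option string into words.
    for opt in $1; do
        case $opt in
            "$2"|"$2"=*) ;;              # discard the user's setting
            *) out="$out $opt" ;;
        esac
    done
    out="$out $2=$3"
    printf '%s\n' "${out# }"
}

enforce_option "number-up=4 sides=one-sided" number-up 2
# prints sides=one-sided number-up=2
```

In a hook script the result would be passed to lp with -o "$(enforce_option "$TEAOPTIONS" number-up 2)".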

Enforcing number-up=2 Printing (2)

Because it is a raw queue the options offered by virtq in the printing dialogue for an application such as Firefox and with lpoptions are limited. They are certainly fewer than the options which realq with its PPD would display. This might not matter too much to most users if realq has been set up sensibly, but nevertheless it could be regarded as a deficiency by users and administrators alike.

Instead of setting up a virtual raw queue as before, set up pretendq with the PPD used for realq:

  lpadmin -p pretendq -v tea4cups:/socket://192.168.7.200:9100 -E -m <PPD_for_realq>

We do not want the cups-filters processed data sent to the printer or used by a prehook. With the prehook line

  prehook_spool : /usr/local/bin/spool

this script would realise these two objectives:

  #!/bin/bash
  # spool: a script to print from /var/spool/cups.

  # Location of CUPS' "d" files.
  DDIR=/var/spool/cups
  # Find job originally sent to the queue.
  JOB=$(ls $DDIR | grep "$TEAJOBID"-001)

  lp -d realq -o "$TEAOPTIONS number-up=2"  $DDIR/$JOB

  # Causes the job not to be sent to the real backend.
  exit -1
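
The grep in the script matches any file name containing the job ID followed by -001, so job 42 would also match the file belonging to job 142. CUPS normally names the first data file of a job with a d, the job ID zero-padded to five digits and -001. Assuming that naming scheme holds on the system in question, the name can be built directly. The function name dfile_name is hypothetical.

```shell
#!/bin/bash
# dfile_name (hypothetical helper): name of the first data file CUPS
# keeps for a job, assuming the d<zero-padded job ID>-001 naming scheme.
#   $1  job ID, e.g. the contents of $TEAJOBID
dfile_name() {
    printf 'd%05d-001\n' "$1"
}

dfile_name 42
# prints d00042-001
```

In the spool script JOB=$(dfile_name "$TEAJOBID") could then replace the ls and grep pipeline.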

Manual Duplexing

Without duplexing capability on the printer, double-sided printing requires manual intervention. One way is to print the odd numbered pages of a document in reverse and then print the even numbered pages in normal order after returning the first print run to the printer tray with the first printed page at the bottom of the pile. Let's see the result of that with three documents having, respectively, three, four and five pages.

lp -d <print_queue> docA docB docC

The order of printing is

  C5 C3 C1 B3 B1 A3 A1

and the order from the top of the pile is

  A1 A3 B1 B3 C1 C3 C5

Printing the even numbered pages gives

  A2 B2 B4 C2 C4

so, matching up sheets, we have

  A1 A3 B1 B3 C1 C3 C5
  A2 B2 B4 C2 C4

There are two consequences here:

* Two sheets have pages from different documents: A3, B2 and B3, C2.
* The last two pages in the pile are not pulled out of the tray.

What we want is for a blank page to be added to the end of even numbered pages of a document if the total number of pages in that document is odd. Like this (using "0" for an empty page):

  A2 A0 B2 B4 C2 C4 C0

Matching sheets now gives

  A1 A3 B1 B3 C1 C3 C5
  A2 A0 B2 B4 C2 C4 C0
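
The padded even run can be worked out mechanically. The sketch below is purely illustrative (the function name even_run and the name:pages argument format are made up for this example); it lists the even pages of each document and appends a "0" for a blank page whenever the page count is odd.

```shell
#!/bin/bash
# even_run (illustrative only): print the even-page run for a list of
# documents given as name:pages, adding <name>0 for the blank page
# needed when a document has an odd number of pages.
even_run() {
    local line="" doc name pages p
    for doc in "$@"; do
        name=${doc%%:*}
        pages=${doc##*:}
        p=2
        while [ "$p" -le "$pages" ]; do
            line="$line $name$p"
            p=$((p + 2))
        done
        if [ $((pages % 2)) -eq 1 ]; then
            line="$line ${name}0"
        fi
    done
    printf '%s\n' "${line# }"
}

even_run A:3 B:4 C:5
# prints A2 A0 B2 B4 C2 C4 C0
```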

With pdftopdf from cups-filters 1.0.55-1 or newer this is done automatically, so on Jessie you do not have to think about it. If your pdftopdf is from an older version of cups-filters or is not part of the filtering chain it is not done. pdftopdf would not be used for a raw queue or (prior to cups-filters 1.10.0) a queue where a PostScript file is being sent to a PostScript printer.

Using Tea4CUPS the wanted blank pages can be inserted by having a hook script send a form feed to the printer when printing the even numbered pages in a document containing an odd number of pages. Whether the total number of pages in a document is odd or even can be determined with the help of pkpgcounter.

Set up virtq

   lpadmin -p virtq -v tea4cups:// -E -m raw

Have

   prehook_manualduplex : /usr/local/bin/manualduplex $TEADATAFILE

as a hook in /etc/cups/tea4cups.conf and use this for /usr/local/bin/manualduplex:

#!/bin/bash

# Are we printing the document's odd or even pages?
PAGESET=$(echo $TEAOPTIONS | grep -o "page-set.*" | cut -d"=" -f2 | cut -d" " -f1)
# Number of pages in the document.
PAGES=$(pkpgcounter $1)
# Has the document an odd or even number of pages?
REM=$(($PAGES % 2))
# Print queue.
PRINTQ=realq
# Command for printing.
PRINT=(/usr/bin/lp -d $PRINTQ -o "$TEAOPTIONS" $1)

case $PAGESET in
     even) case $REM in
               0) "${PRINT[@]}"                 ;;
               1) "${PRINT[@]}"
                  # Form feed.
                  echo -en "\f" | lp -d $PRINTQ ;;
           esac                                 ;;
        *) "${PRINT[@]}"                        ;;
esac

realq is set up as described in the section Capturing an Input File.

Printing from an application is a matter of selecting the printer virtq, choosing odd numbered pages and printing them in reverse order, followed by printing even numbered pages. Otherwise:

  lp -d virtq -o 'outputorder=reverse page-set=odd' <files>
  lp -d virtq -o 'page-set=even' <files>
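
Several scripts on this page pull a single value out of $TEAOPTIONS with a grep and cut pipeline. As an alternative, here is a minimal sketch of a reusable lookup. The function name get_option is hypothetical, and it assumes the options are whitespace-separated name=value words without embedded spaces.

```shell
#!/bin/bash
# get_option (hypothetical helper): print the value of one option from
# an option string; return non-zero if the option is not present.
#   $1  option string, e.g. the contents of $TEAOPTIONS
#   $2  option name
get_option() {
    local opt
    # Unquoted on purpose: split the option string into words.
    for opt in $1; do
        case $opt in
            "$2"=*) printf '%s\n' "${opt#*=}"
                    return 0 ;;
        esac
    done
    return 1
}

get_option "outputorder=reverse page-set=odd" page-set
# prints odd
```

In manualduplex this would read PAGESET=$(get_option "$TEAOPTIONS" page-set).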

Printing Microsoft Documents

The technique is based on converting the document to a PDF via a prehook and a script and then submitting the PDF to a print queue, realq. unoconv does the conversion. Its installation will bring in a number of LibreOffice packages but X is not required because unoconv can be started as a listener for unoconv clients. For listening permanently in the background:

  unoconv -l &

A prehook would call the script with

   prehook_unoconv : /usr/local/bin/print-office-docs $TEADATAFILE

and invoke

  #!/bin/bash

  # The working directory and filename.
  OUT=$TEADIRECTORY/tmp/tea4cups-$TEAJOBID-$TEATITLE.pdf

  UNO=/usr/bin/unoconv
  MIMETYPE=$(file --mime-type -b $1)
  PPT=application/vnd.ms-powerpoint
  DOC=application/msword
  DOCX=application/vnd.openxmlformats-officedocument.wordprocessingml.document
  RTF=text/rtf
  ODT=application/vnd.oasis.opendocument.text

  # Have "*)" exit with an error message?
  case $MIMETYPE in
     $PPT|$DOC|$DOCX|$RTF|$ODT) $UNO -o $OUT $1
                                lp -d realq -o "$TEAOPTIONS" $OUT
                                rm $OUT
                                  ;;
                             *) lp -d realq -o "$TEAOPTIONS" $1
                                  ;;
  esac

Print queues virtq and realq are set up as indicated in the previous section. The usual command line, without any options, would be:

  lp -d virtq <office_file>

What should happen if the file is not an office-type file is open to configuration in the script.
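
One way of making that decision explicit is to separate the MIME type test from the action taken. The following sketch uses the same MIME types as the script; the function name is_office_type is hypothetical.

```shell
#!/bin/bash
# is_office_type (hypothetical helper): succeed if the MIME type given
# as $1 is one of those the script above hands to unoconv.
is_office_type() {
    case $1 in
        application/vnd.ms-powerpoint|\
        application/msword|\
        application/vnd.openxmlformats-officedocument.wordprocessingml.document|\
        text/rtf|\
        application/vnd.oasis.opendocument.text)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example: decide what to do with a file of type application/pdf.
if is_office_type application/pdf ; then
    echo "convert with unoconv"
else
    echo "not an office document"
fi
# prints not an office document
```

The else branch is where the script could print the file unchanged, as it does now, or log an error and give up.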

Text to PDF with a Selected Font

On the texttopdf page we see that a font for printing text files can be selected by fontconfig or specified as a system-wide or user default. If a text file is sent to a remote server the font used for printing is determined by what happens on the server. One solution to having a text file printed with a client's chosen font is first to convert it to a PDF which has the fonts embedded in it. It is the PDF which is sent to the server.

The same technique (converting a text file to a PDF) can also be used when sending a file to both local and remote print queues and wanting to choose a font for the job. This is the approach taken in the following script.

Suppose the choice available for a monospaced font is one of FreeMono and DejaVuSansMono. Create the files pdf.freemono and pdf.dejavu in $HOME/charsets:

# pdf.freemono
charset utf8
0000 04FF ltor single /usr/share/fonts/truetype/freemono/FreeMono.ttf

# pdf.dejavu
charset utf8
0000 04FF ltor single /usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf

The command for converting a text file to a PDF file having the FreeMono font on the client machine is:

CUPS_DATADIR=$HOME CHARSET=freemono /usr/lib/cups/filter/texttopdf 0 0 0 0 0 in.txt > out.pdf

This causes texttopdf to look for the file pdf.freemono in the charsets directory of $CUPS_DATADIR. $CUPS_DATADIR would normally be /usr/share/cups/ but texttopdf will now look for the character set in a charsets directory in the home directory. This command is used in the script below. Note that there the environment variables $CUPS_DATADIR and $CHARSET are set up prior to invoking /usr/lib/cups/filter/texttopdf.
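
Files like pdf.freemono and pdf.dejavu could be generated rather than typed in. This is only a sketch: the function name write_charset is hypothetical and the file contents simply follow the two-line layout shown above.

```shell
#!/bin/bash
# write_charset (hypothetical helper): create <dir>/charsets/pdf.<name>
# pointing texttopdf at a single TrueType font file.
#   $1  directory to be used as CUPS_DATADIR
#   $2  name, the part after "pdf."
#   $3  full path to the font file
write_charset() {
    mkdir -p "$1/charsets" || return 1
    {
        printf 'charset utf8\n'
        printf '0000 04FF ltor single %s\n' "$3"
    } > "$1/charsets/pdf.$2"
}
```

For the FreeMono example above the call would be:

  write_charset "$HOME" freemono /usr/share/fonts/truetype/freemono/FreeMono.ttf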

We can now proceed with setting up a raw virtual queue and using the prehook line:

prehook_texttopdf : /usr/local/bin/texttopdf $TEADATAFILE

A viable script for texttopdf is:

#!/bin/bash

# Detect whether the input file is text/plain.
MIMETYPE=$(file --mime-type -b $1)
# The print queue to send the output PDF to.
PRINTQ=realq
# The user's CUPS_DATADIR. Where the pdf.* files are.
CUPS_DATADIR=/home/$TEAUSERNAME
# Font_name. Should be the "*" part of pdf.*.
CHARSET=$(echo $TEAOPTIONS | grep -E -o "font=[a-z]+" | cut -d"=" -f2)

# The default pdf.utf-8.simple is used if the font option is empty or
# if the user specifies a font_name unknown to the files in $HOME/charsets.
if [ -z "$CHARSET" ] ; then
   CUPS_DATADIR="/usr/share/cups" ; CHARSET="utf-8"
fi

# Make sure texttopdf sees both variables in its environment.
export CUPS_DATADIR CHARSET

if [ -e "$CUPS_DATADIR/charsets/pdf.$CHARSET" ] && [ $MIMETYPE = "text/plain" ]
   then
      /usr/lib/cups/filter/texttopdf 1 1 1 1 1 $1 | lp -d $PRINTQ
   else
     lp -d $PRINTQ $1
fi

Printing would be with

  lp -d virtq -o font=<font_name> <text_file>

Creating Searchable PDFs from Text Files (1)

A PDF file produced with the texttopdf filter as in the previous section is not searchable, so we will look at a way of creating such PDFs with Abiword and have Tea4CUPS send the output by mail to a user.

An X setup is not necessary if xvfb is installed. The first step is to convert the text file to a .abw file, which then gets converted to the wanted searchable PDF.

  xvfb-run abiword --to=abw --to-name=out.abw in.txt

  xvfb-run abiword --to=pdf --to-name=searchable.pdf out.abw

You are wondering why a conversion straight to PDF with --to=pdf isn't being done? Well, it could be - but the PDF will have a serif font at a 12pt size and this may not be to your liking.

A .abw file is xml text so we lose nothing with taking this route and gain the ability to have a configurable monospaced font by editing out.abw. FontFamily= and FontSize= will be the options we will use with lp. The FF and FS variables hold the command line -o values. Without values for FontFamily and FontSize FreeMono at a 10pt font size is the default for the PDF.

The Tea4CUPS variable TEABILLING is conscripted to allow an email address to be specified on the command line. Without it the PDF file is sent to the user's home directory on the machine the script runs on.

The script is:

  #!/bin/bash

  # Check for a text input file?
  # Execute abiword without installing xorg.
  XVFB=/usr/bin/xvfb-run
  # Mail program.
  MAIL=/usr/bin/mutt
  # Mail subject.
  SUBJ="Your requested PDF: $TEATITLE.pdf"
  # Where we work. And the filename we work with.
  OUT=$TEADIRECTORY/tmp/tea4cups-$TEAUSERNAME-$TEATITLE
  # Have we been given FontFamily and FontSize?
  FF=$(echo $TEAOPTIONS | grep -o "FontFamily.*" | cut -d"=" -f2 | cut -d" " -f1)
  FS=$(echo $TEAOPTIONS | grep -o "FontSize.*" | cut -d"=" -f2 | cut -d" " -f1)

  # First convert the input file to a .abw (XML) file.
  $XVFB abiword --to=abw --to-name=$OUT.abw $TEADATAFILE > /dev/null

  # Edit $OUT.abw for FontFamily ($FF) and FontSize ($FS).
  case $FF in
      "") awk '{sub("Liberation Serif","FreeMono",$0); print $0}' $OUT.abw > $OUT.FF.abw ;;
       *) awk '{sub("Liberation Serif","'$FF'",$0); print $0}' $OUT.abw > $OUT.FF.abw ;;
  esac

  case $FS in
      "") awk '{sub("font-size:12pt","font-size:10pt",$0); print $0}' $OUT.FF.abw > $OUT.FS.abw ;;
       *) awk '{sub("font-size:12pt","font-size:"'$FS'"pt"); print $0}' $OUT.FF.abw > $OUT.FS.abw ;;
  esac

  # Convert to a PDF.
  $XVFB abiword --to=pdf --to-name=$OUT.pdf $OUT.FS.abw > /dev/null

  if [ -z "$TEABILLING" ] ; then
     $MAIL -s "$SUBJ" -a $OUT.pdf -- $TEAUSERNAME@localhost
  else
     $MAIL -s "$SUBJ" -a $OUT.pdf -- $TEABILLING
  fi

  # Clean up. /var/spool/cups/tmp/sent? Rely on /etc/logrotate.d/cups-daemon
  # to clear /var/spool/cups/tmp/?
  rm $OUT*

The prehook line is:

  prehook_searchablepdf : /usr/local/bin/searchablepdf

A command for printing would look like:

  lp -d virtq -o 'FontFamily=DroidSansMono FontSize=10 job-billing=user@example.com' <text_file>

Creating Searchable PDFs from Text Files (2)

This is an alternative to using Abiword and might be seen as easier and more straightforward to implement. It relies on pdftocairo to convert the PDF produced by the texttopdf filter into another PDF, which is searchable. The script is a simplified version of the one in a previous section and pdftocairo performs the same task in a later script.

#!/bin/bash
# texttopdf. Convert text files to searchable PDFs.

MIMETYPE=$(file --mime-type -b $1)
TTP="/usr/lib/cups/filter/texttopdf 1 1 1 1 1 $1"
DIR="/home/$TEAUSERNAME/PDF/"
PDF="$DIR/$TEATITLE.pdf"
TU="$TEAUSERNAME"

# Create directory for PDFs if one does not exist.
if [ ! -d "$DIR" ] ; then
   mkdir "$DIR"
   chown "$TU":"$TU" "$DIR"
fi

if [ "$MIMETYPE" = "text/plain" ]
   then
      $TTP | pdftocairo -pdf - "$PDF"
      chown "$TU:$TU" "$PDF"
   else 
      rm "$TEADATAFILE"
fi

A prehook for a virtual raw queue would be

prehook_texttopdf : /usr/local/bin/texttopdf $TEADATAFILE

Virtual PDF Printer (1)

Create a raw printer queue named TeaPDF:

lpadmin -p TeaPDF -v tea4cups:// -E -m raw

Create pre and post hooks in /etc/cups/tea4cups.conf by adding the following lines at the end of the file:

[TeaPDF]
prehook_rawpdf  : /bin/cat $TEADATAFILE |  su $TEAUSERNAME -l -c "cat - > `/usr/bin/getent passwd $TEAUSERNAME | /usr/bin/cut -f 6,6 -d :`'/PDF/job-$TEAJOBID-$TEATITLE.pdf'"
posthook_rawpdf : /bin/cat >/tmp/log_of_pdf_creation_for_job_$TEAJOBID

In the user's home directory create a new directory named PDF, otherwise print jobs to TeaPDF will be aborted. Any print job sent to TeaPDF will be saved to the new PDF directory with a file name of the form job-<JOB-ID>-<TITLE>.pdf.

Virtual PDF Printer (2)

It is very common for a user to use cups-pdf to convert a file to PDF format. Unfortunately, it is possible for the resulting PDFs (particularly from GTK applications) to be of variable quality and not have searchable or extractable text. The script below, print-to-pdf, is intended to remedy both these deficiencies.

#!/bin/bash
# print-to-pdf. A script to produce searchable PDFs.

# Determine the MIME media type of the file to be printed.
MIMETYPE=$(file --mime-type -b "$TEADATAFILE")

# Directory for the output file.
DIR="/home/$TEAUSERNAME/PDF/"

CF1=cupsfilter
CF2="cupsfilter -m application/vnd.cups-pdf -o"
TU="$TEAUSERNAME"
TD="$TEADATAFILE"

# Get any page-ranges. Useful for jobs submitted with lp but not for
# jobs from most GTK/QT apps. The latter pre-process jobs to sort out
# the pages to be printed and do not send "-o page-ranges" to pdftopdf.
TO=$(echo "$TEAOPTIONS" | grep -o 'page-ranges[^ [:space:]]\+')

# $TEATITLE might be file:///etc/services.
TT=$(basename "$TEATITLE")

# Create directory for PDFs if it does not exist.
if [ ! -d "$DIR" ] ; then
   mkdir "$DIR"
   chown "$TU":"$TU" "$DIR"
fi

# Put a PDF in $DIR. Make those produced from text files searchable with
# pdftocairo.
transfer () {
if [ ! -z "$TO" ] ; then
   PAGES=$(echo $TO | cut -d"=" -f2)
   PDF="$(echo $PDF | cut -d"." -f1)_(pages_numbers_$PAGES).pdf"
      case "$MIMETYPE" in
              application/pdf) $CF2 "$TO" "$TD" > "$DIR/$PDF"                    ;;
                   text/plain) $CF2 "$TO" "$TD" | pdftocairo -pdf - $DIR/$PDF    ;;
       application/postscript) $CF2 "$TO" "$TD" > "$DIR/$PDF"
      esac
else
      case "$MIMETYPE" in
              application/pdf) $CF1 "$TD" > "$DIR/$PDF"                    ;;
                   text/plain) $CF1 "$TD" | pdftocairo -pdf - $DIR/$PDF    ;;
       application/postscript) $CF1 "$TD" > "$DIR/$PDF"
      esac
fi
chown "$TU":"$TU" "$DIR/$PDF"
}

# Check existence of a .pdf extension. Provide one if necessary. Replace
# a space with a "_". Remove last "_" in a filename.
print_pdf () {
if [ ${TT: -4} == ".pdf" ] ; then
   PDF=$(echo $TT | tr [:space:] '_' | sed 's/.$//')
else
   PDF=$(echo $TT | tr [:space:] '_' | sed 's/.$//').pdf
fi
transfer
}

# Replace a space with a "_". Remove last "_" and .txt in a filename.
print_txt () {
if [ ${TT: -4} == ".txt" ] ; then
   PDF=$(echo $TT | tr [:space:] '_' | sed 's/.....$//').pdf
else
   PDF=$(echo $TT | tr [:space:] '_' | sed 's/.$//').pdf
fi
transfer
}

# Replace a space with a "_". Remove last "_" and .ps in a filename.
print_ps () {
if [ ${TT: -3} == ".ps" ] ; then
   PDF=$(echo $TT | tr [:space:] '_' | sed 's/....$//').pdf
else
   PDF=$(echo $TT | tr [:space:] '_' | sed 's/.$//').pdf
fi
transfer
}

case "$MIMETYPE" in
          application/pdf) print_pdf    ;;
               text/plain) print_txt    ;;
   application/postscript) print_ps
esac

All three file types avoid doing pstops and pdftops conversions (as done by cups-pdf) in the final stages of the filtering chain. Apart from PDFs of decent quality being produced there will be some decrease in processing time.
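
The three print_* functions in the script differ only in the extension they strip, so the naming step could be done in one place. The following is a sketch of that idea and not a drop-in replacement. The function name pdf_name is hypothetical and the sketch only knows about the .pdf, .txt and .ps extensions.

```shell
#!/bin/bash
# pdf_name (hypothetical helper): derive an output file name from a job
# title. The directory part is dropped, whitespace becomes "_", a
# trailing .pdf, .txt or .ps is removed and .pdf is appended.
#   $1  job title, e.g. the contents of $TEATITLE
pdf_name() {
    local base
    base=$(basename "$1")
    base=$(printf '%s' "$base" | tr '[:space:]' '_')
    case $base in
        *.pdf|*.txt|*.ps) base=${base%.*} ;;
    esac
    printf '%s.pdf\n' "$base"
}

pdf_name "file:///etc/services"
# prints services.pdf
pdf_name "my notes.txt"
# prints my_notes.pdf
```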

The print queue to set up:

lpadmin -p PrinttoPDF -v tea4cups:// -E -m raw

The prehook to use:

prehook_pdf : /usr/local/bin/<print-to-pdf>

Creating a PDF/A Document with unoconv

Some individuals and organisations want to have PDFs they produce or acquire accessible many, many years from now and to this end there is an ISO specification treating long term archiving of PDF documents. The technical aspects are described at the PDF Association and at PDFlib GmbH.

The only software in Debian which appears to support conversion of non-PDF and non-PostScript documents to PDF/A is LibreOffice. It is designed to produce PDF/A-1a compliant files.

As in the section Printing Microsoft Documents, have unoconv listening in the background and set up a virtual raw queue with lpadmin or the web interface of CUPS. Create a doc-pdfa directory in /usr/local/share.

A very simple script would be

  #!/bin/bash

  OUTPDF=/usr/local/share/doc-pdfa/$TEATITLE.pdf
  unoconv -f pdf -eSelectPdfVersion=1 -o $OUTPDF $1
  chmod 444 $OUTPDF

which is called from

  prehook_pdfa_unoconv : /usr/local/bin/pdfa-unoconv $TEADATAFILE

Creating a PDF/A Document with Ghostscript

PDF and PostScript file types are not candidates for conversion with unoconv but Ghostscript can produce PDF/A-1b and PDF/A-2b outputs as documented on the PostScript-to-PDF converter web page. The Ghostscript FAQ explains why PDF/A-1a production is not supported.

Following the advice in this bug report the system's PDFA_def.ps is copied to /usr/local/share/ghostscript and (if Debian 8 is being used) the line

  /ICCProfile (ISO Coated sb.icc)   % Customize.

is altered to something which suits, such as

  /ICCProfile (/usr/share/color/icc/ghostscript/srgb.icc)   % Customize.

The ghostscript directory will have to be created and you are advised to use the full path to the *.icc file even if not on Jessie. You might also want to read this Debian bug report.

The Tea4CUPS queue setup is the same as with unoconv and a basic script for converting a PDF to PDF/A-1b could look like this:

  #!/bin/bash

  # Remove pdf extension (.pdf) from the input file name.
  BN=$(basename $TEATITLE .pdf)
  # Replace extension when output file is produced.
  OUTPDF=/usr/local/share/doc-pdfa/$BN.pdf

  gs -dPDFA=1  -dBATCH -dNOPAUSE  -sDEVICE=pdfwrite  \
     -sColorConversionStrategy=/RGB                  \
     -dPDFACompatibilityPolicy=1                     \
     -sOutputFile=$OUTPDF                            \
      /usr/local/share/ghostscript/PDFA_def.ps       \
      $1

  chmod 444 $OUTPDF

The prehook is

  prehook_pdfa_gs : /usr/local/bin/pdfa-gs $TEADATAFILE

PDF files may contain transparent objects and layers but transparency is not permitted in the PDF/A-1 specification. Ghostscript deals with this by rendering such portions of the PDF to images. "Text" in these regions is no longer text after this and is not searchable or extractable.

If a searchable PDF/A-1b is a requirement you can add -dNOTRANSPARENCY to the gs command line, but be aware this could alter the appearance of the PDF on screen or in print. PDF/A-2 handles transparency, so an alternative approach is replacing -dPDFA=1 with -dPDFA=2 on the gs command line.

Tip

It is worthwhile testing PDFs for transparency prior to processing because the conversion of a whole page to an image can take a long time and involve much swapping out to disk.

Obtain pdf_info.ps from the toolbin directory in the Ghostscript source package and run

  gs -dNODISPLAY -q -sFile=your.pdf pdf_info.ps

Printer Accounting

The accounting schemes which CUPS supports are outlined at the CUPS website and on a machine with cups installed. Some of the problems associated with obtaining accurate information with software accounting are discussed on these openSUSE and KDE pages.

Whatever the pitfalls of software approaches to counting printed pages it is suggested you take a look at pkpgcounter. It is easy enough to use as a prehook and can provide information for a raw queue. $TEATITLE may contain spaces; they will need removing.

  prehook_accounting : echo $TEAUSERNAME $TEATITLE `pkpgcounter $TEADATAFILE` $TEACOPIES >> /var/log/cups/printaccounting.log

pkpgcounter calculates the number of pages needed to print a given document of a recognised file type. Multiplying by the number of copies gives the total number of pages used. $TEACOPIES is the number of copies recorded by CUPS in argv[4] of the error_log; that is, the number of copies asked for when the job is submitted to cups. How this is managed depends on CUPS and the printer driver. The CUPS filters could decide to produce a single copy but instruct the CUPS backend to send that copy several times in a row.

It might be useful to know that the -d option to pkpgcounter displays the file format of a document it is given to examine.
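
A title containing spaces would add extra fields to a log line and upset any later processing by column. One possible way of removing them is sketched here; the function name squash_title is hypothetical.

```shell
#!/bin/bash
# squash_title (hypothetical helper): remove spaces and tabs from a job
# title so that it occupies a single field in the accounting log.
#   $1  job title, e.g. the contents of $TEATITLE
squash_title() {
    printf '%s\n' "$1" | tr -d '[:blank:]'
}

squash_title "Planet Debian"
# prints PlanetDebian
```

As this is a function it would have to live in a small script called from the prehook rather than in the prehook line itself.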

A sample printaccounting.log:

  brian nsswitch.conf 1 1
  brian PlanetDebian 104 1
  brian linux-Howtoextractthefilenamewithouttheextensionfromafullpath?-SuperUser 2 1
  brian https://www.debian.org/releases/stable/amd64/install.txt.en 94 5
  brian services 4 3

And a way of calculating pages printed:

  awk '{ total+=$3*$4 }  END { print total }' /var/log/cups/printaccounting.log
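
The same log can be broken down by user. The sketch below assumes the log layout shown above, with the user name in the first field and the page count and number of copies in the last two fields. Counting from the end of the line means a title which still contains spaces does not upset the totals. The function name pages_per_user is hypothetical.

```shell
#!/bin/bash
# pages_per_user (hypothetical helper): print "user total" for every
# user in an accounting log whose lines end in "<pages> <copies>".
#   $1  path to the log file
pages_per_user() {
    awk '{ total[$1] += $(NF-1) * $NF }
         END { for (user in total) print user, total[user] }' "$1" | sort
}
```

Running pages_per_user /var/log/cups/printaccounting.log on the sample log above would be expected to print a single line, brian 589.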

Printer Accounting and the Page Log

The page_log file in /var/log/cups/ lists each page that is sent to a print queue and can be used as the basis for printer accounting. There is a proviso though (which is mentioned in the documentation) - drivers must provide page accounting information.

Except for PostScript printers with no filters, page_log depends on
the driver generating PAGE: messages for each page it produces.  If
the driver doesn't generate PAGE: messages, you don't get a page_log
file.

Let us take the example of an HP M401dne printer; it uses a PPD file from the printer-driver-postscript-hp package and the PPD has the line

  *cupsFilter: "application/vnd.cups-postscript 0 hpps"

in it.

For a PDF submitted to a queue using this PPD the filter chain would be

  PDF in --> pdftopdf --> pdftops --> hpps --> PostScript out

and pdftops starts the pstops filter. pstops does do page accounting but unfortunately hpps does not and, because it is the final filter in the chain, the page_log file is empty. Fortunately, Tea4CUPS can help out.

Set up a queue for the M401 in the usual way:

  lpadmin -p realq -v <DEVICE_URI> -E -m <401dne_PPD>

Copy 401dne_PPD, edit to remove the *cupsFilter line from it and set up a second queue:

  lpadmin -p pagelogq -v file:/dev/null -E -m <401dne_PPD_without_*cupsFilter>

Set up a virtual raw queue with the tea4cups backend:

  lpadmin -p m401 -v tea4cups:// -E -m raw

Now have two hooks in tea4cups.conf; one to send the job to the printer and one to write to page_log when pdftops is the final filter in the filter chain.

  prehook_pagelog : lp -d pagelogq -U "$TEAUSERNAME" -t "$TEATITLE" "$TEADATAFILE"
  prehook_print : lp -d realq "$TEADATAFILE"

CUPS 2.0b1 and later have page_log files disabled by default. Enabling them is a matter of providing a value for PageLogFormat, which is present in cupsd.conf but set to an empty string.

Resharing a Raw Print Queue

A printer is connected by USB to a server running cupsd. Because of resource constraints and other considerations a decision is taken not to have avahi-daemon on the server. This means the print queue set up with

  lpadmin -p printq -v usb://.... -E -o printer-is-shared -m <PPD>

is not advertised. On another machine a queue pointing to printq could be used to access the printer:

  lpadmin -p anotherq -v ipp://<IP_of_server>/printers/printq -E -m raw

anotherq is a raw queue so any job sent from there to printq doesn't get processed until it reaches the server.

The second machine does have avahi-daemon so it might look reasonable to broadcast the existence of anotherq by adding -o printer-is-shared to the previous lpadmin command. Unfortunately, this does not work and is very unlikely to work with any future CUPS. Explanations and discussion are in STR#4738 and STR#4766.

But what we can do is set up a Tea4CUPS virtual print queue with

  lpadmin -p yetanotherq -v tea4cups:// -E -o printer-is-shared -m raw

and have a pre-hook directing the print job to anotherq:

  prehook_reshare : lp -d anotherq -o "$TEAOPTIONS" "$TEADATAFILE"

There is no restriction made by CUPS on the virtual queue being shared.

Versions of cups-browsed earlier than 1.8.2 will not create a local queue pointing to the remote yetanotherq queue, so that queue will not be visible to command line programs or non-GTK applications on the client machine. Please see bug #814020.

A Customised Printing Dialog

Many people regularly use a restricted set of printing options to send a job to a print queue. For example, they might always want duplex printing at draft quality as the default method when using a particular colour laser printer and only switch to a better quality when the printout is work-related. Or again, a user is only interested in printing images at photo quality on glossy 4x6 paper with the choice of borderless or not. It can be a chore to continually have to decide on and change options when printing from applications or the command line.

Setting up a different print queue for each task is an option which could be taken, as is using lpoptions with an instance. This section presents an alternative technique using Tea4CUPS and a script. The script print-dialog provides a description of the task to be undertaken and avoids any repetitive changing of printing options. The idea is to have a dialog which does not require a user to have to remember queue names or think too deeply about printing options or provide extensive input. The script can easily be tailored to the needs of the user. It is presented here in the contexts of an application which uses the GTK (Firefox, Evince etc) and QT (Okular etc) print dialogs or printing from the command line in an xterm (or something similar) with lp.

#!/bin/bash
# print-dialog. A script to present a user with a limited set of printing options.

JOB=$1

msg()
{
dialog --msgbox "Printing is in progress. Press the ENTER \
key or click on OK to dismiss this notice." 8 40
rm "$JOB"
}

# Assumes realq is visible to lpstat on the machine being printed from.
# Use 'lp -h <hostname_or_IP> -d realq' for a non-advertised network queue.
# Use 'lpoptions -p realq -l' for the default PPD options of realq.
printing()
{
case $choice in
        1) lp -d realq -o 'fit-to-page Duplex=DuplexNoTumble OutputMode=Draft' "$JOB"
           msg        ;;
        2) lp -d realq -o 'fit-to-page Duplex=DuplexNoTumble OutputMode=Best' "$JOB"
           msg        ;;
        3) lp -d realq -o 'MediaType=Glossy OutputMode=Photo PageSize=Photo4x6' "$JOB"
           msg        ;;
esac
}

dialog --backtitle "Printing selection" \
--radiolist "Move up and down with the arrow keys or choose \
1, 2 or 3 from the keyboard. Select with the SPACEBAR. Press \
the ENTER key. \n\nYou can also use the mouse to \
select (click on the choice) and print (click on OK)." 20 85 4 \
1 "Print double-sided at draft quality" off \
2 "Print double-sided at best quality" off \
3 "Print a photo on 6x4 glossy paper with borders" off 2> /tmp/menuchoice.$$

sel=$?
choice=$(cat /tmp/menuchoice.$$)
case $sel in
    0) printing ; rm /tmp/menuchoice.$$ ;;
    1)                                  ;;
  255)                                  ;;
esac

Although some adjustment might be required, a working set of prehooks with a virtual raw queue (virtq) is:

[virtq]
prehook_print-dialog_0 : cp $TEADATAFILE /tmp/job
prehook_print-dialog_1 : chown $TEAUSERNAME:$TEAUSERNAME /tmp/job
prehook_print-dialog_2 : LANG=C XAUTHORITY=/home/$TEAUSERNAME/.Xauthority DISPLAY=:0 su -c "xterm -geometry 100x35+0+0 -fa DejaVuSansMono -e print-dialog /tmp/job" $TEAUSERNAME

The third prehook ensures the xterm and the print-dialog script run as a user and not as root. Because of this the first two prehooks are needed to ensure the submitted file can be printed from a user account (which does not have access to /var/spool/cups).

Dealing with Duplicate Print Jobs

Users generally expect a file sent for printing to come out of the printer almost immediately. When it does not, some are given to sending the file again (and again!) under the mistaken impression that this will kick the system into action. We are not concerned here with the causes of the delay. Network problems, limitations on resources (disk space, printer memory, CPU etc) and large print jobs already in the queue or being sent can be some of the factors. What we are concerned with is the paper and ink/toner wastage when all the sent duplicate jobs eventually get printed.

With the following script all duplicates sent to the queue are deleted. No account is taken of the reason why a second job is submitted, whether it be due to impatience or oversight or it is intentional. The script relies on the md5sum ($TEAMD5SUM) being the same for each submitted duplicate job. /etc/cups/tea4cups.conf mentions a situation in which this might not be the case.

#!/bin/bash
# rmdup. Delete duplicate jobs.

REALQ=realq
MD5SUM=/tmp/MD5SUM
if [ ! -e "$MD5SUM" ] ; then 
   touch "$MD5SUM"
fi

DUP=$(grep "$TEAUSERNAME" "$MD5SUM" | cut -d":" -f2 | grep -m1 "$TEAMD5SUM")

if [ -z "$DUP" ] ; then
   lp -d "$REALQ" "$TEADATAFILE"
   echo "$TEAUSERNAME:$TEAMD5SUM" >> "$MD5SUM"
else
   rm "$TEADATAFILE"
fi

The prehook with a virtual raw queue is

prehook_rmdup : /usr/local/bin/rmdup $TEADATAFILE
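
The grep pipeline in rmdup matches the user name anywhere in a line, so a user called ann would also be matched by records for joanne. A stricter test compares whole lines. The following is a sketch under the assumption that the record file holds one user:md5sum pair per line; the function names is_duplicate and record_job are hypothetical.

```shell
#!/bin/bash
# is_duplicate (hypothetical helper): succeed if the record file already
# holds exactly the line "<user>:<md5sum>".
#   $1  record file    $2  user name    $3  md5sum of the job
is_duplicate() {
    [ -f "$1" ] && grep -F -x -q -- "$2:$3" "$1"
}

# record_job (hypothetical helper): remember a job which has been printed.
record_job() {
    printf '%s:%s\n' "$2" "$3" >> "$1"
}
```

In rmdup the test would become is_duplicate "$MD5SUM" "$TEAUSERNAME" "$TEAMD5SUM", with record_job called after a successful lp.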

Credits

The writing of this page benefited greatly from the contributions made by Jerome Alet <alet SPAMFREE AT librelogiciel DOT com>, the author of Tea4CUPS, on the Tea4CUPS and CUPS user mailing lists.

Finally

Although the scripts on this page were tested on Jessie and are intended to be working scripts it is possible they are suboptimal for your needs. You are encouraged to correct any serious mistakes in them and submit ideas for using Tea4CUPS in novel or interesting ways.