This project has been selected for GSoC 2018
Student
Name | IRC nick | Earliest time for weekly call (UTC) | Latest time for weekly call (UTC) |
| duskybomb | 02:00 | 20:00 |
Mentors
Name | IRC nick | Earliest time for weekly call (UTC) | Latest time for weekly call (UTC) |
| m3nu | 01:00 | 16:00 |
| tlevine | | |
PimMoerenhout | | | |
Project Introduction
Despite efforts to develop new formats for the exchange of invoices, most invoices are still exchanged as PDF files, mainly because PDF is a dependable fallback: anyone can view, print, sign or stamp the document. Purely machine-readable formats, like EDIFACT[1], are only used for high-volume business relationships by large companies.
In January 2018, France finalized a new standard called Factur-X, which builds on the German standard ZUGFeRD as well as the European norm EN 16931. Factur-X allows different XML-based invoice representations, like CII[17], to be embedded in the PDF. Benefits of using this standard are:
- No change in processes. Companies can keep sending invoices in PDF format.
- Invoices can still be printed and can have graphical design elements.
- Invoice recipients using compatible accounting software can directly process these machine-readable invoices.
- No separate record needs to be kept except the original file.
- Invoices lacking an embedded version can have it added retroactively.
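To give a rough idea of what such an embedded XML representation looks like to a program, the sketch below parses a heavily simplified CII-style fragment with Python's standard library. The sample invoice values and the `read_header` helper are made up for illustration, and the namespace URIs are the ones commonly seen in CII documents; they should be checked against the official schema before being relied on.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as commonly used in UN/CEFACT Cross Industry Invoice
# (CII) documents. Assumption: verify the URIs against the official schema.
NS = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100",
    "udt": "urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100",
}

# Illustrative fragment only; a real invoice carries many more fields.
SAMPLE_XML = """\
<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100"
    xmlns:udt="urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100">
  <rsm:ExchangedDocument>
    <ram:ID>INV-2018-001</ram:ID>
    <ram:TypeCode>380</ram:TypeCode>
    <ram:IssueDateTime>
      <udt:DateTimeString format="102">20180514</udt:DateTimeString>
    </ram:IssueDateTime>
  </rsm:ExchangedDocument>
</rsm:CrossIndustryInvoice>
"""


def read_header(xml_text):
    """Return a few header fields of the invoice as a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    doc = root.find("rsm:ExchangedDocument", NS)
    return {
        "invoice_number": doc.findtext("ram:ID", namespaces=NS),
        "type_code": doc.findtext("ram:TypeCode", namespaces=NS),
        "issue_date": doc.findtext(
            "ram:IssueDateTime/udt:DateTimeString", namespaces=NS
        ),
    }


header = read_header(SAMPLE_XML)
print(header)
```

Extracting the XML attachment from the PDF container itself is a separate step that needs a PDF library and is not shown here.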
More on the Factur-X standard (translated from [7]):
- Factur-X is a European standard (EN 16931) for mixed electronic invoicing (PDF for users and XML data for automated processing), the first implementation of the European Semantic Standard EN 16931 published by the European Commission on October 16th [2017]. Factur-X is at the same time an invoice readable in PDF format, containing all information useful for its processing, especially when reconciliation with orders or receipts reveals discrepancies, and invoice data presented as a structured file, complete or not, allowing information systems to perform automated integration and reconciliation. The primary objective of Factur-X is to enable suppliers and billers to create value-added invoices containing as much information as possible in structured form, according to their ability to produce it in this form, and to leave recipients free to use the added value thus provided, or not.
While there are a number of commercial libraries[2, 10] and several Java libraries[8, 9] that support the Factur-X standard, there is no fully developed, easy-to-use open-source Python library.[4]
This Debian-backed and Google-sponsored project aims to advance the ecosystem for machine-readable invoice exchange and make it easily accessible for the whole Python community by making the following contributions:
- Python library to read, write, add and edit Factur-X metadata in different XML flavors.
- Command line interface to process PDF files and access the main library functions.
- A way to add structured data to existing files, including files produced by legacy accounting systems (via the invoice2data project).
- New desktop (web?) GUI to add, edit, import and export Factur-X metadata into and out of PDF files.
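As a rough illustration of how a command line interface could expose the main library functions, here is a minimal `argparse` sketch. The program name `invoicex`, the `extract` and `embed` subcommands, and their options are all hypothetical and not taken from the project; the handlers only return a description of what they would do.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical subcommands."""
    parser = argparse.ArgumentParser(
        prog="invoicex",  # hypothetical program name
        description="Work with XML metadata embedded in PDF invoices.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    extract = sub.add_parser("extract", help="write embedded XML to a standalone file")
    extract.add_argument("pdf", help="input PDF invoice")
    extract.add_argument("-o", "--output", default="invoice.xml", help="output XML file")

    embed = sub.add_parser("embed", help="add XML from a standalone file to a PDF")
    embed.add_argument("pdf", help="PDF invoice to update")
    embed.add_argument("xml", help="XML file to embed")
    return parser


def main(argv=None):
    """Parse arguments and dispatch; real work is replaced by placeholders."""
    args = build_parser().parse_args(argv)
    if args.command == "extract":
        return f"would extract XML from {args.pdf} into {args.output}"
    return f"would embed {args.xml} into {args.pdf}"


# Example invocation with an explicit argument list instead of sys.argv.
print(main(["extract", "invoice.pdf", "-o", "out.xml"]))
```

Keeping the parser separate from the dispatch function makes it easy to test the CLI without touching any PDF files.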
Goals of 2018 GSoC
- Python library to handle metadata embedded in PDF invoices adhering to the Factur-X Standard.
  - Read metadata embedded in PDF files
  - Map XML metadata to Python dict
  - Edit existing metadata
  - Batch-update metadata from Python dictionary
  - Save new or existing metadata in different XML-flavors.
- Reference workflow for invoice generation: Using LibreOffice to compile a template into a PDF with added XML metadata.
- Integrate invoice2data to support the discovery and embedding of metadata in existing PDF files using templates for highest accuracy.
- Develop cross-platform desktop GUI:
  - edit metadata of existing PDFs
  - extract metadata TO standalone file
  - add metadata FROM standalone file
  - export metadata from multiple files to e.g. CSV?
- Website with documentation and introduction
- Testing and publication workflow for all projects
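Several of the goals above (mapping XML metadata to a dict, batch-updating it from a dict, and exporting it to CSV) can be sketched with the standard library alone. The example below uses a toy, namespace-free XML layout and a hypothetical `FIELD_PATHS` mapping purely for illustration; real Factur-X XML is namespaced and far richer, and the function names here are not the library's API.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from flat dict keys to element paths in a toy invoice.
FIELD_PATHS = {
    "invoice_number": "Header/ID",
    "currency": "Totals/Currency",
    "amount_due": "Totals/AmountDue",
}


def to_dict(root):
    """Map the XML tree to a flat dict using FIELD_PATHS."""
    return {key: root.findtext(path) for key, path in FIELD_PATHS.items()}


def batch_update(root, updates):
    """Apply several field changes at once; unknown or missing fields raise."""
    for key, value in updates.items():
        if key not in FIELD_PATHS:
            raise KeyError(f"unknown field: {key}")
        node = root.find(FIELD_PATHS[key])
        if node is None:
            raise ValueError(f"element for {key} not present in document")
        node.text = str(value)


def export_csv(records):
    """Serialize a list of flat dicts (one per invoice) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(FIELD_PATHS), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


invoice = ET.fromstring(
    "<Invoice><Header><ID>INV-1</ID></Header>"
    "<Totals><Currency>EUR</Currency><AmountDue>100.00</AmountDue></Totals>"
    "</Invoice>"
)
batch_update(invoice, {"invoice_number": "INV-2", "amount_due": "120.50"})
csv_text = export_csv([to_dict(invoice)])
print(csv_text)
```

Driving both reading and writing from a single path table keeps the dict view and the XML in sync; a real implementation would likely need one such table per XML flavor.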
Bonus deliverables:
- Cryptographic signing of PDF invoices to guarantee authenticity
Development tools and resources
Source code repositories
Source code is kept on GitHub for easy access by the wider community. Bugs, ideas and progress on features are tracked via GitHub issues.
InvoiceX-GUI: Factur-x GUI
Style guides and best practices
Communication channels
#invoice2data on OFTC/Debian IRC (open to everyone)
Technical skills for student to become familiar with
- Python and related tooling for testing, continuous integration and publishing.
- XML and PDF processing
- Git
- Web development
Current Tasks
Task | Status |
Refactor and extend facturx project | In Progress |
Close existing invoice2data issues | Completed |
Integrate facturx library with invoice2data | Not started |
Develop desktop GUI to view, add and edit Factur-X details | In Progress |
Documentation and simple website | Not started |
References
1: EDIFACT: https://en.wikipedia.org/wiki/XML/EDIFACT
2: http://www.pdflib.com/knowledge-base/pdfa/zugferd-invoices/
3: https://groups.google.com/forum/#!topic/zugferd/AwIh3X56fi8
4: https://groups.google.com/forum/#!topic/zugferd/BRBrlIvWAt4
5: Official standard - paid and DRMed: https://www.evs.ee/products/evs-en-16931-1-2017
6: French standard: http://fnfe-mpe.org/factur-x/
7: German standard: http://www.ferd-net.de/electronic-invoicing446/introduction/index.html (has extensive sample files)
8: Java implementation: https://github.com/ZUGFeRD/mustangproject
9: Another Java implementation: https://konik.io/docs/
10: Commercial Java library: https://developers.itextpdf.com/examples/zugferd/creating-zugferd-xml-files
12: Python library to add XMP data to different files: https://python-xmp-toolkit.readthedocs.io/en/latest/
13: https://github.com/akretion/factur-x/blob/master/facturx/facturx.py
14: Discussion about invoice metadata in PDFs: https://forums.adobe.com/message/7914863#7914863
15: Related service to provide business-focused metadata from PDF: https://www.pdfmapper.com/index_en.html
17: UNECE invoice standard: http://tfig.unece.org/contents/cross-industry-invoice-cii.htm
18: UBL Invoice standard: http://docs.oasis-open.org/ubl/os-UBL-2.1/UBL-2.1.html#T-INVOICE