A Python interface to XSL-FO libraries (conversion of HTML to PDF, RTF, DOCX, WML and ODT)

The zopyx.convert package helps you to convert HTML to PDF, RTF, ODT, DOCX and WML using XSL-FO technology.

Requirements

  • Java 1.5.0 or higher (FOP 0.94 requires Java 1.6 or higher)

  • csstoxslfo (included)

  • XFC-4.0 (XMLMind) for ODT, RTF, DOCX and WML support (if needed)

  • XINC 2.0 (Lunasil) for PDF support (commercial)

  • or FOP 0.94 (Apache project) for PDF support (free)

  • BeautifulSoup (will be installed automatically through easy_install. See Installation.)

  • ElementTree (will be installed automatically through easy_install. See Installation.)

Installation

  • install zopyx.convert either with easy_install or by downloading the sources from the Python Cheeseshop. This automatically installs the BeautifulSoup and ElementTree modules if necessary.

  • the environment variable $XFC_DIR must be set and point to the root of your XFC installation directory

  • the environment variable $XINC_HOME must be set and point to the root of your XINC installation directory

  • the environment variable $FOP_HOME must be set and point to the root of your FOP installation directory
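
The environment variables listed above can be checked before the first conversion. The following is only a minimal sketch: the variable names come from the list above, while the function name check_converter_env and the wording of the messages are made up for illustration and are not part of zopyx.convert.

```python
import os

# Variable names as given in the installation notes; the keys are only labels
# used in the report.
CONVERTER_ENV_VARS = {
    "XFC": "XFC_DIR",
    "XINC": "XINC_HOME",
    "FOP": "FOP_HOME",
}


def check_converter_env(environ=None):
    """Return a dict mapping each converter label to a problem string or None.

    Hypothetical helper, not part of zopyx.convert.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for name, var in CONVERTER_ENV_VARS.items():
        path = environ.get(var)
        if not path:
            report[name] = "%s is not set" % var
        elif not os.path.isdir(path):
            report[name] = "%s points to a missing directory: %s" % (var, path)
        else:
            report[name] = None  # variable is set and the directory exists
    return report


if __name__ == "__main__":
    for name, problem in sorted(check_converter_env().items()):
        print("%-5s %s" % (name, problem or "ok"))
```

A converter that you do not intend to use will show up as "not set" in this report; whether that matters depends on the output formats you need.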

Supported platforms

Windows, Unix

Usage

Some examples from the Python command-line:

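from zopyx.convert import Converter

# The converter takes the path of the HTML source file
C = Converter('/path/to/some/file.html')

# Calling the converter with a format name returns the output filename
pdf_filename = C('pdf')         # using XINC
pdf2_filename = C('pdf2')       # using FOP
rtf_filename = C('rtf')
odt_filename = C('odt')
wml_filename = C('wml')
docx_filename = C('docx')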
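
When several formats are requested at once, one failing backend (for example a missing XFC installation) should probably not abort the remaining conversions. The sketch below assumes only what the example above shows, namely that a Converter instance is callable with a format name. The helper names convert_all and convert_file are hypothetical, and since the exceptions raised by zopyx.convert are not documented here, a generic Exception is caught.

```python
def convert_all(converter, formats=("pdf", "rtf", "odt")):
    """Call converter(fmt) for every format and collect results and errors.

    Hypothetical helper; `converter` is any callable taking a format name.
    """
    results, errors = {}, {}
    for fmt in formats:
        try:
            results[fmt] = converter(fmt)
        except Exception as exc:  # exception types are an unknown here
            errors[fmt] = exc
    return results, errors


def convert_file(html_path, formats=("pdf", "rtf", "odt")):
    """Convert one HTML file with zopyx.convert (requires the package)."""
    # Imported lazily so that the sketch can be loaded without the package.
    from zopyx.convert import Converter
    return convert_all(Converter(html_path), formats)
```

Calling convert_file('/path/to/some/file.html') would then return one dictionary with the results of the successful conversions and one with the exceptions of the failed ones.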

A very simple command-line converter is also available:

xslfo-convert --format rtf --output foo.rtf sample.html
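
The command-line converter can also be driven from a script. This is a rough sketch that uses only the --format and --output options shown above; the function names are hypothetical, and it is an assumption that xslfo-convert signals failure through a non-zero exit status.

```python
import subprocess


def build_command(source, fmt, output):
    """Build the argument list for one xslfo-convert call."""
    return ["xslfo-convert", "--format", fmt, "--output", output, source]


def convert(source, fmt, output):
    """Run xslfo-convert and return the output path.

    subprocess.check_call raises CalledProcessError on a non-zero exit
    status (assumed here to indicate a failed conversion).
    """
    subprocess.check_call(build_command(source, fmt, output))
    return output
```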

xslfo-convert has a --test option that converts some sample HTML. If everything is ok, you should see something like this:

>xslfo-convert --test
Entering testmode
pdf: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.pdf
rtf: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.rtf
docx: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.docx
odt: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.odt
wml: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.wml
pdf: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.pdf
rtf: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.rtf
docx: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.docx
odt: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.odt
wml: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.wml

How zopyx.convert works internally

  • The source HTML file is converted to XHTML using BeautifulSoup (mxTidy was used before version 0.5.0)

  • the XHTML file is converted to FO using the great “csstoxslfo” converter written by Werner Donne.

  • the FO file is passed to one of the external converters (XINC, FOP or XFC) to generate the desired output format

  • all converters are based on Java technology, which makes the conversion solution highly portable across operating systems (including Windows)
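
The steps above can be summarized as a small lookup. This is only an illustrative sketch and not the package's actual code: the mapping of 'pdf' to XINC and 'pdf2' to FOP follows the usage example, the XFC formats follow the requirements list, and the names BACKENDS and conversion_plan are made up.

```python
# Format name -> external converter, as described in this document.
BACKENDS = {
    "pdf": "XINC",
    "pdf2": "FOP",
    "rtf": "XFC",
    "odt": "XFC",
    "docx": "XFC",
    "wml": "XFC",
}


def conversion_plan(fmt):
    """Return the stages (tool, input, output) for one output format."""
    try:
        backend = BACKENDS[fmt]
    except KeyError:
        raise ValueError("unsupported format: %r" % (fmt,))
    return [
        ("HTML cleanup", "HTML", "XHTML"),  # step 1: tidy the source HTML
        ("csstoxslfo", "XHTML", "FO"),      # step 2: XHTML to XSL-FO
        (backend, "FO", fmt),               # step 3: external Java converter
    ]


if __name__ == "__main__":
    for tool, src, dst in conversion_plan("rtf"):
        print("%s: %s -> %s" % (tool, src, dst))
```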

Known issues

  • If you are using zopyx.convert together with FOP: use the latest FOP 0.94 only. Don’t use any packaged FOP version like the one from MacPorts which is known to be broken.

  • Make sure you have read the csstoxslfo documentation. csstoxslfo has several requirements regarding the HTML markup; don’t expect it to be the ultimate HTML converter. The necessary markup is described in the csstoxslfo documentation, and questions about it will not be answered.

Author

zopyx.convert was written by Andreas Jung for ZOPYX Ltd. & Co. KG, Tuebingen, Germany.

License

zopyx.convert is published under the GNU Lesser General Public License version 2.1 (LGPL 2.1). See LICENSE.txt.

Contact

ZOPYX Ltd. & Co. KG
c/o Andreas Jung,
Charlottenstr. 37/1
D-72070 Tuebingen, Germany
E-mail: info at zopyx dot com

Changes:

1.1.8 (26.06.2008)

  • changed logging levels

  • reorganized files

1.1.7 (20.06.2008)

  • better support for csstoxslfo commandline options

1.1.6 (19.04.2008)

  • call ‘fop’ using bash

  • better logger configuration

  • minor code cleanup

1.1.5 (01.03.2008)

  • updated documentation

1.1.4 (05.02.2008)

  • remove duplicate ID attributes

1.1.3 (31.01.2008)

  • clarified Java requirements for FOP

1.1.2 (22.01.2008)

  • removed some nasty debugging code

1.1.1 (22.01.2008)

  • supporting FOP on Windows

1.1.0 (20.01.2008)

  • support for free FOP PDF converter

1.0.6 (14.10.2007)

  • html2fo: added workaround for generated FO code for PRE tags

1.0.5 (05.10.2007)

  • minor bugfixes

1.0.4 (05.10.2007)

  • Windows support added

1.0.3 (04.10.2007)

  • passing -Duser.language=en to java in order to prevent corrupted FO code caused by locales

1.0.2 (03.10.2007)

  • bugfix

1.0.1 (03.10.2007)

  • added --test option to command-line frontend

1.0.0 (30.09.2007)

  • update to css2xslfo V 1.5.0

  • official 1.0.0 release

0.5.0 (09.09.2007)

  • replaced mxTidy related code with the BeautifulSoup module (no longer requires any compiling)

  • html2fo checks the existence of images

0.4.9 (25.07.2007)

  • support for utidy lib (which is the preferred tidy library). Using mx.Tidy only as fallback

0.4.8 (unreleased)

  • unreleased

0.4.7 (08.07.2007)

  • reSTified documentation

0.4.6 (08.07.2007)

  • fixes in availableFormats()

0.4.5 (07.07.2007)

  • various FO fixes

0.4.4 (06.07.2007)

  • using logging module

0.4.3 (05.07.2007)

  • html2fo: using ElementTree for most FO modifications

0.4.2 (30.06.2007)

  • converting page-break-after: always back into break-after: page

0.4.1 (24.06.2007)

  • various fixes

0.4.0 (24.06.2007)

  • added zope interfaces

  • converters are now classes

  • added unittests

0.3.1 (18.06.2007)

  • html2fo() and the converter constructor got a new ‘encoding’ parameter in order to specify the input encoding of the HTML file. This parameter will be passed down to Tidy in order to perform a proper conversion of non-ascii characters.

0.3.0 (unreleased)

  • using subprocess module of Python

  • new Convert() class for high-level XSLFO access

  • logger added

  • better checks for XINC, XFC

  • updated documentation

0.2.0 (16.06.2007)

  • PDF support added

  • command line interface added

  • mxTidy integration

0.1.0 (16.06.2007)

  • initial release
