Entries tagged with “Python PDF Series”.
Sat 21 Jul 2012
Python PDF Series – An Intro to metaPDF
Posted by Mike under Python
[3] Comments
While researching PDF libraries for Python, I stumbled across another little project called metaPDF. According to its website, metaPDF is a lightweight Python library optimized for metadata extraction and insertion, and it is a fast wrapper over the excellent pyPdf library. It works by quickly searching the last 2048 bytes of the PDF before parsing the xref table, offering a 50-60% performance increase over directly parsing the table line by line. I’m not really sure how useful that will be, but let’s try it out and see what metaPDF can do. (more…)
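To give a rough idea of what “searching the last 2048 bytes” could look like, here is a minimal sketch using only the standard library. It is not metaPDF’s actual code: the function name `find_startxref` and the `TAIL_SIZE` constant are made up for illustration, and the sketch only assumes that a PDF ends with a `startxref` keyword followed by the byte offset of the xref table.

```python
import os
import re

# Number of trailing bytes to inspect, taken from the metaPDF description.
TAIL_SIZE = 2048


def find_startxref(path, tail_size=TAIL_SIZE):
    """Hypothetical helper: return the offset that follows the last
    'startxref' keyword in the file's tail, or None if it is not found."""
    with open(path, "rb") as pdf_file:
        # Jump to the end to learn the file size, then back up by tail_size.
        pdf_file.seek(0, os.SEEK_END)
        file_size = pdf_file.tell()
        pdf_file.seek(max(0, file_size - tail_size))
        tail = pdf_file.read()

    # Only the tail is scanned, so the rest of the file is never read.
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    return int(matches[-1]) if matches else None
```

Calling `find_startxref("some.pdf")` (the file name is a placeholder) would return the offset at which a parser could start reading the xref table, without touching anything but the end of the file.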
Wed 11 Jul 2012
PyPDF2: The New Fork of pyPdf
Posted by Mike under Python
[2] Comments
Today I learned that the pyPDF project is NOT dead, as I had originally thought. In fact, it’s been forked into PyPDF2 (note the slightly different spelling). There’s also a possibility that someone else has taken over the original pyPDF project and is actively working on it. You can follow all that over on reddit if you like. In the meantime, I decided to give PyPDF2 a whirl and see how it is different from the original. Feel free to follow along if you have a free moment or two. (more…)
Tue 10 Jul 2012
An Intro to pyfpdf – A Simple Python PDF Generation Library
Posted by Mike under Python
[3] Comments
Today we’ll be looking at a simple PDF generation library called pyfpdf, a port of FPDF, which is a PHP library. This is not a replacement for Reportlab, but it does give you more than enough to create simple PDFs and may meet your needs. Let’s take a look and see what it can do! (more…)
Sat 7 Jul 2012
A Quick Intro to pdfrw
Posted by Mike under Python
[6] Comments
I’m always on the lookout for Python PDF libraries and I happened to stumble across pdfrw the other day. It looks like a replacement for pyPdf in that it can read, write and join PDFs, and it can use Reportlab for concatenation and watermarking, among other things. The project also appears slightly dead in that its last update was in 2011, but then again, pyPdf’s last update was in 2010, so pdfrw is a little fresher. In this article, we’ll take a little test drive of pdfrw and see if it’s useful or not. Come and join the fun! (more…)
Wed 27 Jun 2012
Reportlab: Mixing Fixed Content and Flowables
Posted by Mike under Python
[2] Comments
Recently I needed the ability to use Reportlab’s flowables, but place them in fixed locations. Some of you are probably wondering why I would want to do that. The nice thing about flowables, like the Paragraph, is that they’re easily styled. If I could bold something or center something AND put it in a fixed location, then that would rock! It took a lot of Googling and trial and error, but I finally got a decent template put together that I could use for mailings. In this article, I’m going to show you how to do this too. (more…)
Sun 17 Jun 2012
An Intro to rst2pdf – Changing Restructured Text into PDFs with Python
Posted by Mike under Cross-Platform, Python
1 Comment
There are several cool ways to create PDFs with Python. In this article we will be focusing on a cool little tool called rst2pdf, which takes a text file that contains Restructured Text and converts it to a PDF. The rst2pdf package requires Reportlab to function. This won’t be a tutorial on Restructured Text, although we’ll have to discuss it to some degree just to understand what’s going on. (more…)
Tue 21 Sep 2010
Reportlab Tables – Creating Tables in PDFs with Python
Posted by Mike under Cross-Platform, Python
[2] Comments
Back in March of this year, I wrote a simple tutorial on Reportlab, a handy 3rd party Python package that allows the developer to create PDFs programmatically. Recently, I received a request to cover how to do tables in Reportlab. Since my Reportlab article is so popular, I figured it was probably worth the trouble to figure out tables. In this article, I will attempt to show you the basics of inserting tables into Reportlab-generated PDFs. (more…)
Sat 15 May 2010
Manipulating PDFs with Python and pyPdf
Posted by Mike under Cross-Platform, Python, wxPython
[4] Comments
There’s a handy 3rd party module called pyPdf out there that you can use to merge PDF documents together, rotate pages, split and crop pages, and decrypt/encrypt PDF documents. In this article, we’ll take a look at a few of these functions and then create a simple GUI with wxPython that will allow us to merge a couple of PDFs. (more…)
Mon 8 Mar 2010
A Simple Step-by-Step Reportlab Tutorial
Posted by Mike under Cross-Platform, Python
[19] Comments
The subtitle for this article could easily be “How To Create PDFs with Python”, but WordPress doesn’t support that. Anyway, the premier PDF library in Python is Reportlab. It is not distributed with the standard library, so you’ll need to download it if you want to run the examples in this tutorial. There will also be at least one example of how to put an image into a PDF, which means you’ll also need the Python Imaging Library (PIL). As I understand it, Reportlab is compatible with Python 2.x, IronPython and Jython. The Reportlab developers are currently working on a port for Python 3.x (or will be soon). (more…)
