PDF Generation within Embedded Systems

A traditional data logger presents its data as plain text, comma-separated values (CSV), or Extensible Markup Language (XML). Files in these formats work well if you have supporting software to process the data, but if you want a quick way of distributing a file that anyone can open, a Portable Document Format (PDF) file is more suitable. A PDF presents data in a consistent, easy-to-read manner: instead of a wall of text, the user can be shown, for example, a clearly laid out summary with plots and images. Anyone can open or pass on the file without needing any software beyond what they already have on their computer.

We have developed embedded software that generates a dynamic PDF file directly on a microcontroller. The file can then be written to an SD card or delivered via USB (the device can show up as a mass storage device, just like a flash drive). Use as a data logger is just one of the potential applications.

PC and Embedded System Comparison

We began by writing the software in C# to run on a PC, as this let us quickly determine what was required to generate a PDF in code. This turned out to be reasonably simple, although we could already tell that writing the code for an embedded device (an 8-bit PIC microcontroller in this case) would bring its own challenges.

When writing software to run on a PC, there is a huge pool of resources available to the program. The table below compares a typical PC with a typical 8-bit microcontroller.

                     PC               Microcontroller
RAM (B)              4,294,967,296    256
ROM (KB)             1,073,741,824    8
Clock Speed (MHz)    3000             16

Quite a difference! The PC has roughly 17 million times more memory (RAM), 134 million times more storage (ROM) and a clock speed about 188 times higher. So while it is easy to generate an entire PDF file in RAM on a PC, this is simply impossible on many microcontrollers.

For most file formats this would not pose much of a problem, as the file can be written from start to finish. However, the PDF format is designed for fast viewing. One of the ways this is achieved is a cross-reference table at the end of the file, which records the location of every component (pages, images, fonts etc.). The table allows the viewing software to read the required parts of the file directly, saving time and memory. This referencing technique, along with others, can be tricky to handle programmatically, because the location of each component is only known once everything before it has been written.
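
To make the lookup concrete, here is a minimal sketch in C of how viewing software might use the table. It relies on one detail of the PDF format: in a classic cross-reference table every entry is exactly 20 bytes (a 10-digit byte offset, a 5-digit generation number and an n or f flag), so the position of entry n can be found by arithmetic alone. The names parse_xref_entry, xref_entry_pos and XREF_ENTRY_LEN are made up for this illustration; they are not taken from the software described here or from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XREF_ENTRY_LEN 20u  /* fixed width: 10 + 1 + 5 + 1 + 1 + 2 bytes */

/* Where entry number n starts, given the file position of entry 0. */
static uint32_t xref_entry_pos(uint32_t first_entry_pos, uint32_t n)
{
    return first_entry_pos + n * XREF_ENTRY_LEN;
}

/* Parse one entry such as "0000000058 00000 n \n".
 * Returns 1 for an in-use object, 0 for a free one, -1 if malformed.
 * Offsets too large for 32 bits are not handled in this sketch. */
static int parse_xref_entry(const char *entry, uint32_t *offset,
                            uint16_t *generation)
{
    uint32_t off = 0;
    uint32_t gen = 0;
    int i;

    assert(entry != NULL && offset != NULL && generation != NULL);

    for (i = 0; i < 10; i++) {            /* 10-digit byte offset */
        if (entry[i] < '0' || entry[i] > '9') return -1;
        off = off * 10u + (uint32_t)(entry[i] - '0');
    }
    if (entry[10] != ' ') return -1;
    for (i = 11; i < 16; i++) {           /* 5-digit generation number */
        if (entry[i] < '0' || entry[i] > '9') return -1;
        gen = gen * 10u + (uint32_t)(entry[i] - '0');
    }
    if (entry[16] != ' ') return -1;
    if (entry[17] != 'n' && entry[17] != 'f') return -1;

    *offset = off;
    *generation = (uint16_t)gen;
    return entry[17] == 'n';
}
```

A viewer that wants object 3 can seek straight to xref_entry_pos(first_entry_pos, 3), read 20 bytes, parse them and then jump to the returned offset without scanning the rest of the file. The flip side for a generator is that every one of those offsets has to be exact.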

Solution

The simplest solution would be to generate the entire PDF in RAM and then go back and fill in all the missing references but, as discussed earlier, the constraints of an embedded system rule this out. By using a number of predictive and compensating methods, the generating software can overcome this difficulty, progressively generating the file line by line and pushing each line to external storage (e.g. an SD card). In this manner only a small part of the file is in memory at any point in time, so the limited resources are not depleted while large, complicated PDF files remain possible. A simple example page, generated in this manner, is shown below.

[Image: pdf_generation]
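
The predictive and compensating methods themselves are not detailed here, so the following is only a rough sketch of one way streaming generation can work, not the actual implementation. It assumes that each object is pushed to storage as soon as it is produced and that the only thing kept in RAM is a running byte count plus one 32-bit offset per object, from which the cross-reference table is written at the very end. All names (pdf_writer, pdf_put, pdf_begin, pdf_object, pdf_end, MAX_OBJECTS) are hypothetical, and a standard C FILE stands in for the SD card or USB driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_OBJECTS 8   /* hypothetical limit; each slot costs 4 bytes of RAM */

typedef struct {
    FILE    *out;                       /* stand-in for the SD card or USB sink */
    uint32_t pos;                       /* bytes written so far */
    uint32_t offsets[MAX_OBJECTS + 1];  /* byte offset of object i at index i */
    uint8_t  count;                     /* highest object number written */
} pdf_writer;

/* Push a string to storage and advance the running byte count. */
static void pdf_put(pdf_writer *w, const char *s)
{
    size_t n = strlen(s);
    fwrite(s, 1, n, w->out);
    w->pos += (uint32_t)n;
}

/* Start a new file by writing the PDF header line. */
static void pdf_begin(pdf_writer *w, FILE *out)
{
    w->out = out;
    w->pos = 0;
    w->count = 0;
    pdf_put(w, "%PDF-1.4\n");
}

/* Write one complete object; only its offset stays in RAM afterwards. */
static void pdf_object(pdf_writer *w, const char *body)
{
    char head[16];

    assert(w->count < MAX_OBJECTS);
    w->count++;
    w->offsets[w->count] = w->pos;
    snprintf(head, sizeof head, "%u 0 obj\n", (unsigned)w->count);
    pdf_put(w, head);
    pdf_put(w, body);
    pdf_put(w, "\nendobj\n");
}

/* Emit the cross-reference table and trailer from the recorded offsets.
 * root is the object number of the document catalog. */
static void pdf_end(pdf_writer *w, uint8_t root)
{
    char line[80];
    uint32_t xref_pos = w->pos;
    uint8_t i;

    snprintf(line, sizeof line, "xref\n0 %u\n", (unsigned)(w->count + 1));
    pdf_put(w, line);
    pdf_put(w, "0000000000 65535 f \n");    /* entry 0 is always free */
    for (i = 1; i <= w->count; i++) {
        snprintf(line, sizeof line, "%010lu 00000 n \n",
                 (unsigned long)w->offsets[i]);
        pdf_put(w, line);
    }
    snprintf(line, sizeof line,
             "trailer\n<< /Size %u /Root %u 0 R >>\nstartxref\n%lu\n%%%%EOF\n",
             (unsigned)(w->count + 1), (unsigned)root,
             (unsigned long)xref_pos);
    pdf_put(w, line);
}
```

Typical use would be pdf_begin, one pdf_object call per object, then pdf_end with the catalog's object number. With MAX_OBJECTS set to 8 the offset table costs 36 bytes, which is already a noticeable share of a 256-byte RAM budget, so a real implementation on such a part would presumably need something smarter than a plain array.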

Working within tight memory limits like this is a common difficulty when writing code for an embedded system. The real challenge is in understanding how the microcontroller operates and being aware of its restrictions. With this in mind, fast and efficient systems can be created.