Commit Briefs

cc79fa051f Sven M. Hallberg

fix a copy-and-paste mistake in parse_fonts (master)

Pretty sure that this got copied from below in 317cc8fb and should be dict_t, i.e. the case of the "font dictionary" being a (single) font resource itself.


2b528fbdf3 Sven M. Hallberg

fix broken indentation in page content code


1d12105938 Sven M. Hallberg

fix wrong indentation in act_viol (leakcheck)


2e272c3132 Sven M. Hallberg

free parse result ifdef LEAKCHECK

This covers the main parse result and a possible "error parse", but not the calls to h_parse() in filters and parse_obj().


7d36d3a94d Sven M. Hallberg

free the parse result from p_startxref


53d6518a00 Sven M. Hallberg

free parse result in act_viol


6a516036da Sven M. Hallberg

light style pass over act_viol


68108a4aa0 Sven M. Hallberg

statically allocate global lzw decoder context

Avoids the use of malloc(). Also factors out table initialization to a function lzw_init_table().
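
As an illustration of the change described above, here is a minimal sketch of a statically allocated decoder context with a separate table initializer. Only the name lzw_init_table() comes from the commit message; the struct layout, the field names, the constants, and the function's signature are assumptions and may not match lzw.c. The values 256 (clear table), 257 (end of data), and the initial 9-bit code width follow the PDF LZWDecode conventions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real definitions in lzw.c may differ. */
#define LZW_MAX_CODES  4096  /* codes are at most 12 bits wide */
#define LZW_CLEAR_CODE 256   /* clear-table marker */
#define LZW_EOD_CODE   257   /* end-of-data marker */
#define LZW_FIRST_CODE 258   /* first code available for new strings */

typedef struct {
    uint16_t prefix;  /* code of the string this entry extends */
    uint8_t  suffix;  /* last byte of the string */
    uint16_t length;  /* string length in bytes */
} lzw_entry;

typedef struct {
    lzw_entry table[LZW_MAX_CODES];
    int next_code;    /* next free table slot */
    int code_width;   /* current code width in bits */
} lzw_context;

/* Static storage duration: the context exists for the whole run,
 * so no malloc() or free() is needed. */
static lzw_context lzw_ctx;

/* Reset the table to its initial state: 256 single-byte entries,
 * with the first free slot placed after the two reserved codes. */
static void lzw_init_table(lzw_context *ctx)
{
    assert(ctx != NULL);
    for (int i = 0; i < 256; i++) {
        ctx->table[i].prefix = 0;
        ctx->table[i].suffix = (uint8_t)i;
        ctx->table[i].length = 1;
    }
    ctx->next_code = LZW_FIRST_CODE;
    ctx->code_width = 9;
}
```

In this sketch a decoder would call lzw_init_table(&lzw_ctx) once before decoding and again whenever it reads the clear-table code.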


1afde767c4 Sven M. Hallberg

print an error message if /Root not found

If we are actually processing page content, that is.


f0c8a4732e Sven M. Hallberg

correctly look for /Root in the last trailer section

A mistake snuck into commit 76e546ce, taking the last element of the xrefs array as the "last" trailer section. But the array is filled in reverse order by following the chain of startxref and /Prev pointers, so the (logical) last/latest section is xrefs[0].
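
The ordering issue can be sketched as follows. This is not the code from pdf.c: the section type, its fields, and both helper functions are hypothetical stand-ins. It only shows why an array filled by following startxref and then the /Prev pointers ends up with the newest section at index 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one xref/trailer section. */
typedef struct {
    long offset;  /* file offset where the section starts */
    long prev;    /* value of /Prev, or -1 if absent */
} section;

/* Look up the section that starts at the given offset. */
static const section *find_section(const section *all, size_t n, long offset)
{
    for (size_t i = 0; i < n; i++)
        if (all[i].offset == offset)
            return &all[i];
    return NULL;
}

/* Start at startxref and follow /Prev. Sections are stored in the
 * order visited, so xrefs[0] is the newest and the last filled slot
 * is the oldest. The max bound also stops a cyclic /Prev chain. */
static size_t follow_chain(const section *all, size_t n, long startxref,
                           const section **xrefs, size_t max)
{
    size_t count = 0;
    long off = startxref;

    while (off >= 0 && count < max) {
        const section *s = find_section(all, n, off);
        if (s == NULL)
            break;
        xrefs[count++] = s;
        off = s->prev;
    }
    return count;
}
```

For a file with sections at offsets 100, 500, and 900, where startxref points at 900, the chain visits 900, 500, 100. The trailer to search for /Root is therefore the one at xrefs[0], not the one at xrefs[count - 1].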


Tags

This repository contains no tags

Tree

.gitignore
LICENSE
Makefile
README
TODO
lzw.c
lzw.h
pdf.1.mdoc
pdf.1.txt
pdf.c
test/

README

Beginnings of a PDF parser in Hammer
====================================


BUILDING

   Simply call 'make' in the top level directory.

       $ make

   The environment variables CC, CFLAGS, and LDFLAGS can be used in the usual
   way to control the compiler to use, compiler flags, and linker flags,
   respectively.

   This program uses the Hammer parser combinator library. It needs a recent
   version, which can be obtained from:

       https://gitlab.special-circumstanc.es/hammer/hammer/

   See the file README.md in that repository for build/install instructions.
   It is recommended to install Hammer as a system library. See also the
   TROUBLESHOOTING section below.


USAGE

       ./pdf [options] input.pdf [oid]

   The 'pdf' utility attempts to parse and validate the given PDF file. If
   successful, it prints the resulting AST to stdout in JSON format.
   It exits 0 on success, 1 if the input file was found to be invalid, and >1
   if an error occurs. The optional oid argument selects a specific object to
   print instead of the whole document.

   Refer to the supplied manual page 'pdf.1' for details.


TROUBLESHOOTING

   <hammer/hammer.h> or libhammer.so not found:

     If Hammer is not installed as a system library, or is installed in a
     nonstandard location, cc and ld will fail to locate its headers and
     library. The quick fix is to create symlinks called 'hammer' and 'lib'
     pointing to Hammer's source and build output directories, respectively:

         $ ln -s ../hammer/src hammer
         $ ln -s ../hammer/build/opt/src lib
         $ make

     Likewise, when running 'pdf' directly, ld.so will fail to locate
     libhammer.so. The quick fix is to point LD_LIBRARY_PATH to the 'lib' dir:

         $ export LD_LIBRARY_PATH=$PWD/lib
         $ ./pdf <filename>


EVALUATING TEST RESULTS

   A suite of example files is provided in the test/ directory. To run the
   test suite:

       $ make test
 
   For every file in the test/valid/ and test/invalid/ subdirectories, the pdf
   parser is invoked.

   For the valid samples, a message of the following form is displayed on a
   successful parse (exit code 0):

       OK: test/valid/<filename>

   Non-fatal messages may be displayed above it, but the presence of the "OK"
   line indicates that the test passed. On any nonzero exit, i.e. if either
   the file is deemed invalid or the program encountered an unexpected error,
   error messages are displayed above an indication of the following form
   that includes the exact exit code:

       FAIL (exit <n>): test/valid/<filename>

   For the invalid samples, messages about parse errors are suppressed and an
   "OK" is displayed if and only if pdf exits with 1 ("invalid input"). An
   exit code of 0 or abnormal termination will produce the "FAIL" message with
   any program output appearing above it.
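
   The pass/fail rule described above can be summarized in a few lines of C.
   This is only a sketch of the rule, not the logic used by 'make test'; the
   function name and the convention of passing a negative value for abnormal
   termination are assumptions made for illustration.

```c
#include <assert.h>   /* for the sanity checks shown in the note below */
#include <stdbool.h>

/* expect_valid: true for samples from test/valid/, false for samples
 *               from test/invalid/.
 * exit_code:    exit status of pdf; by assumption, the caller passes a
 *               negative value if pdf terminated abnormally.
 * Returns true if the test counts as "OK", false if it is a "FAIL". */
static bool test_passed(bool expect_valid, int exit_code)
{
    if (expect_valid)
        return exit_code == 0;  /* valid sample must parse cleanly */
    return exit_code == 1;      /* invalid sample must be rejected as invalid */
}
```

   Under this rule, assert(test_passed(true, 0)) and
   assert(test_passed(false, 1)) both hold, while an invalid sample that
   makes pdf exit with 0, with a code greater than 1, or abnormally is
   reported as a failure.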


COPYRIGHT

   Various authors. Released under the terms of the ISC license.

   See LICENSE for full copyright and licensing notice.