Commit Briefs

3c662d44f4 Sven M. Hallberg

make dump (dumpjpegs)


8ef141df42 Sven M. Hallberg

quick hack to dump /DCTDecode as jpeg files


cd24df616f pompolic

Merge branch 'fix-assertion-a-used-failed' into 'master'

Fix segfault on dictionaries with odd lengths

See merge request pesco/pdf!16


86fecbce40 pompolic

Merge branch 'fix-aux-xrefs-segfault' into 'master'

Fix segfault when `decode_stream` fails in xrefs

See merge request pesco/pdf!17


27b2ab1324 xentrac

Fix segfault on dictionaries with odd lengths

It’s probably a bug that our dictionary parser is inserting a key-value “pair” into our dictionary structure which just has a key but no value, but the proximal cause of the crash was that `dictentry` is reading off the end of the key-value pair and getting a null pointer. This fixes the bug revealed by the instigator in input file assertion-a-used-failed.
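The kind of guard this commit message describes can be sketched as follows. This is a minimal illustration, not the actual patch: `Pair` and `lookup_value` are hypothetical stand-ins for the Hammer token types and for `dictentry`, and the sketch assumes each entry records how many of its elements are actually present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed key-value sequence; not the pdf.c types. */
typedef struct {
    const char *elems[2];  /* elems[0] = key, elems[1] = value */
    size_t used;           /* number of elements actually present */
} Pair;

/* Return the value stored under key, or NULL if absent or malformed. */
const char *lookup_value(const Pair *pairs, size_t n, const char *key)
{
    assert(pairs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* A "pair" holding only a key is skipped, so elems[1] is never
         * read past the end of what the parser actually produced. */
        if (pairs[i].used < 2)
            continue;
        if (strcmp(pairs[i].elems[0], key) == 0)
            return pairs[i].elems[1];
    }
    return NULL;
}
```

With an entry such as `{ {"Length", NULL}, 1 }` in the array, a lookup for "Length" returns NULL instead of dereferencing a missing value.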


79dc4dd64d pompolic

Merge branch 'fix-decode-assert-fail' into 'master'

Report incorrect /Filter type with decode failure

See merge request pesco/pdf!18


a5abf1e2d9 xentrac

Fix segfault when `decode_stream` fails in xrefs

In instigator-crashes/aux-xrefs-segfault an invalid flate-encoded stream was producing this behavior:

    inflate: invalid distance too far back (-3)
    parse error in stream (XRef)
    ../instigator-crashes/aux-xrefs-segfault: error parsing xref section at position 249939 (0x3d053)

    Program received signal SIGSEGV, Segmentation fault.
    0x000055555555d91f in lookup_xref (aux=0x7fffffffdf60, nr=4, gen=0) at pdf.c:1249
    1249            HCountedArray *subs = H_INDEX_SEQ(aux->xrefs[i], 0);

What was happening was that `act_ks_value`, indirectly invoked by `parse_xrefs`, invoked `decode_stream`, which produced the "inflate:" message and returned NULL; so `act_ks_value` produced the "parse error in stream" message and returned an HParseResult of that NULL pointer. Higher up the stack `act_xrstm` packs this NULL pointer into element 0 of a new `h_sequence`.

`parse_xrefs` was happily storing this `h_sequence` into `aux->xrefs[0]`, then blithely continuing to the next loop iteration, at which point it would report "error parsing xref section" and return back to main().

However, this did not abort parsing the file! main() was continuing on to attempt to parse the PDF file as a whole, but the first time the resulting parse tried to `lookup_xref`, that lookup would attempt to iterate over the xrefs section in the file, checking to see if the xref number belonged to any of them. The line of code above then segfaulted while attempting to assert that the NULL was actually a valid `h_sequence` pointer.

So this patch simply prevents `parse_xrefs` from treating the failed xrefs section as valid. The result is that, as before, the parse exits shortly because it can't follow any xrefs, but now without segfaulting!

    inflate: invalid distance too far back (-3)
    parse error in stream (XRef)
    ../instigator-crashes/aux-xrefs-segfault: error parsing xref section at position 255242 (0x3e50a)
    VIOLATION[1]@433 (0x1b1): Missing endobj token (severity=1)
    ../instigator-crashes/aux-xrefs-segfault: no parse
    VIOLATION[1]@433 (0x1b1): Missing endobj token (severity=1)
    ../instigator-crashes/aux-xrefs-segfault: error after position 433 (0x1b1)
    [Inferior 1 (process 626584) exited with code 01]
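The idea of not recording a failed xref section can be sketched in isolation. This is only an illustration under stated assumptions, not the code in pdf.c: `Section`, `Aux`, `collect_sections`, `lookup_section`, and `MAX_SECTIONS` are hypothetical names, and a NULL entry stands in for a section whose stream failed to decode.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SECTIONS 8

/* Hypothetical stand-ins; the real code works on Hammer parse results. */
typedef struct { int first_nr; int count; } Section;

typedef struct {
    const Section *xrefs[MAX_SECTIONS];
    size_t nxrefs;
} Aux;

/* Store only the sections that parsed; return how many were rejected.
 * Stops early once MAX_SECTIONS sections have been stored. */
size_t collect_sections(Aux *aux, const Section *const *parsed, size_t n)
{
    size_t failed = 0;
    aux->nxrefs = 0;
    for (size_t i = 0; i < n && aux->nxrefs < MAX_SECTIONS; i++) {
        if (parsed[i] == NULL) {   /* decode or parse failure: do not store */
            failed++;
            continue;
        }
        aux->xrefs[aux->nxrefs++] = parsed[i];
    }
    return failed;
}

/* Return the stored section covering object number nr, or NULL. */
const Section *lookup_section(const Aux *aux, int nr)
{
    for (size_t i = 0; i < aux->nxrefs; i++) {
        const Section *s = aux->xrefs[i];   /* never NULL by construction */
        if (nr >= s->first_nr && nr < s->first_nr + s->count)
            return s;
    }
    return NULL;
}
```

Because failed sections never reach `aux->xrefs`, a later lookup can iterate over the array without a per-entry NULL check and simply reports "not found" when no valid section covers the requested object.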


c9ab81f899 Pompolic

Fix overflow in act_rldstring


7dbed70aae Pompolic

Merge branch 'xentrac/pdf-fix-digit-pair-assert'


4019289144 xentrac

Fix typo in comment


Branches

Tags

This repository contains no tags

Tree

.gitignore
LICENSE
Makefile
README
TODO
lzw-ab-license.txt
lzw-lib.c
lzw-lib.h
pdf.c
t/

README

Beginnings of a PDF parser in Hammer
====================================

 - Currently needs a custom Hammer branch. You'll need to build against this:

   https://gitlab.special-circumstanc.es/pesco/hammer/tree/pdf

   For detailed build instructions, see README.md in that repository.

 - Help the default Makefile find Hammer

       $ ln -s ../hammer/src hammer         # needed for building pdf, include files
       $ ln -s ../hammer/build/opt/src lib  # needed for running pdf, to locate libhammer.so

 - Notes for 2020-04-27 release:

    The release branch has been tested to build with the `2020-04-27_RELEASE` branch of Hammer, located at https://gitlab.special-circumstanc.es/pesco/hammer/tree/2020-04-27_RELEASE

 - Build:

       $ pushd ../hammer; scons; popd       # build Hammer
       $ make pdf

 - Usage:

       $ export LD_LIBRARY_PATH=./lib       # see Troubleshooting section below to see if this is needed
       $ ldd ./pdf | grep libhammer         # verify that libhammer.so was found
       $ ./pdf <filename>

       # place some test files in the t/ directory...
       $ make test

 - Troubleshooting:

       libhammer.so not found:

           If Hammer is not installed as a system library, ld may fail to locate libhammer.so. The quick fix for this is altering LD_LIBRARY_PATH before running pdf:

           $ export LD_LIBRARY_PATH=./lib
           $ make test

           The second solution is executing "scons install" when building Hammer, which will install it in ld's usual search path:

           $ pushd ../hammer; scons install; popd
           # ... Update ldconfig cache if needed
           $ make pdf
           $ make test

 - Evaluating test results:
 
   The pdf parser is executed on every file in the t/ directory. On a successful parse, a message of the following form is displayed:

   OK: t/<filename>

   In case of a non-fatal parse error, error messages may be displayed, but the presence of the "OK" line indicates that pdf exited successfully. On a failed test run, only parse error messages are displayed.

 - Copyright:

  - pesco 2019,2020
  - pompolic 2020
  - Paul Vines 2020
  - David Bryant (modified lzw-ab code)

  See LICENSE and lzw-ab-license.txt for full copyright and licensing notice.