KHJK gotweb

Commits

Commit:: cc79fa051f794094a7067c2801b15b89015ef618
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Jun 15 15:53:52 2023 UTC

fix a copy-and-paste mistake in parse_fonts

Pretty sure that this got copied from below in 317cc8fb and should be dict_t, i.e. the case of the "font dictionary" being a (single) font resource itself.

Commit:: 2b528fbdf315ca85f18f9ed28a5fef1513039c7c
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Jun 15 14:51:39 2023 UTC

fix broken indentation in page content code

Commit:: 1d12105938ec9cbee8e5f64acfb855ae9cd4213f
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Jun 15 14:51:04 2023 UTC

fix wrong indentation in act_viol

Commit:: 2e272c3132a6f583a268ea0e840f56033f6b155b
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Jun 15 11:58:59 2023 UTC

free parse result ifdef LEAKCHECK

This covers the main parse result and a possible "error parse", but not the calls to h_parse() in filters and parse_obj().

Commit:: 7d36d3a94d71dec034a00e45e226794a81744cfe
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Wed Jun 14 13:51:17 2023 UTC

free the parse result from p_startxref

Commit:: 53d6518a0026320c70a6ffb8ddd91512d0cde20e
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Wed Jun 14 13:31:08 2023 UTC

free parse result in act_viol

Commit:: 6a516036da003849e63b9aec1e594548ec0da278
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Wed Jun 14 13:26:53 2023 UTC

light style pass over act_viol

Commit:: 68108a4aa05489ffac8c941ae90c4d995f4eaa47
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Wed Jun 14 13:16:21 2023 UTC

statically allocate global lzw decoder context

Avoids the use of malloc(). Also factors out table initialization to a function lzw_init_table().
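
A minimal sketch of what such a change might look like. Only the name lzw_init_table() comes from the commit message; the struct layout, the field names and the constants are assumptions made for illustration, based on the 12-bit codes and the two reserved codes (256 and 257) of PDF's LZWDecode filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZW_MAX_CODES	4096	/* codes are at most 12 bits wide */
#define LZW_CLEAR	256	/* clear-table code */
#define LZW_EOD		257	/* end-of-data code */

/* Hypothetical layout; the real decoder context may differ. */
struct lzw_entry {
	uint16_t prefix;	/* code of the preceding string, if any */
	uint8_t  last;		/* final byte of the string */
	uint16_t len;		/* length of the string in bytes */
};

struct lzw_context {
	struct lzw_entry table[LZW_MAX_CODES];
	size_t next;		/* next free code */
};

static_assert(LZW_MAX_CODES <= UINT16_MAX,
    "codes must fit in the prefix field");

/* One global context in static storage: no malloc(), no free(). */
static struct lzw_context lzw;

/* Reset the table to the 256 single-byte strings. */
void
lzw_init_table(void)
{
	for (size_t i = 0; i < 256; i++) {
		lzw.table[i].prefix = 0;
		lzw.table[i].last = (uint8_t)i;
		lzw.table[i].len = 1;
	}
	lzw.next = LZW_EOD + 1;	/* 256 and 257 are reserved */
}
```

With the initialization in its own function, the same routine can serve both the start of decoding and a clear-table code in the input.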

Commit:: 1afde767c483e9672beed15c549717de2db061da
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Fri Apr 14 11:47:57 2023 UTC

print an error message if /Root not found

If we are actually processing page content, that is.

Commit:: f0c8a4732e52479072004950a488c09a892c0d58
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Fri Apr 14 11:45:23 2023 UTC

correctly look for /Root in the last trailer section

A mistake snuck into commit 76e546ce, taking the last element of the xrefs array as the "last" trailer section. But the array is filled in reverse order by following the chain of startxref and /Prev pointers, so the (logical) last/latest section is xrefs[0].
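
The ordering can be illustrated with a simplified model. Everything in this sketch (struct section, NO_PREV, collect_sections()) is hypothetical and only mimics the shape of the problem; it is not the code of parse_xrefs().

```c
#include <assert.h>
#include <stddef.h>

#define NO_PREV ((size_t)-1)

/* Hypothetical, simplified model of a cross-reference section. */
struct section {
	size_t offset;	/* where this section starts in the file */
	size_t prev;	/* offset of the previous section, or NO_PREV */
};

/* Find the section starting at the given offset; NULL if none. */
static const struct section *
find_section(const struct section *file, size_t nfile, size_t offset)
{
	for (size_t i = 0; i < nfile; i++)
		if (file[i].offset == offset)
			return &file[i];
	return NULL;
}

/*
 * Follow the chain from startxref through the prev pointers, storing
 * sections in the order visited.  Returns the number stored.
 */
size_t
collect_sections(const struct section *file, size_t nfile, size_t startxref,
    const struct section **out, size_t maxout)
{
	size_t n = 0;
	size_t offset = startxref;

	assert(file != NULL && out != NULL);
	while (offset != NO_PREV && n < maxout) {
		const struct section *s = find_section(file, nfile, offset);
		if (s == NULL)
			break;		/* dangling pointer; stop here */
		out[n++] = s;		/* newest first, oldest last */
		offset = s->prev;
	}
	return n;
}
```

Because startxref points at the most recently written section, that section is visited first and ends up at index 0; the final element of the output array is the oldest section in the file.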

Commit:: 06ed0943b4ffd56be1323c336f0cb52379ccd6d1
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Apr 13 17:31:08 2023 UTC

fix format specifier for printing HBytes

Since HBytes is a length/pointer pair and not a null-terminated string, we must pass the length as an argument to printf. The correct format specifier for that is "%.*s" (string with "precision" = length), not "%*s" (string with minimum field width).
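
The difference between the two specifiers can be seen in a small sketch. The struct below is a hypothetical stand-in for a length/pointer pair such as HBytes, and both function names are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a length/pointer pair such as HBytes. */
struct bytes {
	const char *ptr;	/* not necessarily null-terminated */
	size_t len;
};

/*
 * Format b into buf using "%.*s": the int argument is a precision,
 * i.e. the maximum number of bytes printf will read from the pointer.
 * Returns the value of snprintf().
 */
int
format_bytes(char *buf, size_t bufsz, struct bytes b)
{
	assert(b.len <= INT_MAX);	/* the precision argument is an int */
	return snprintf(buf, bufsz, "%.*s", (int)b.len, b.ptr);
}

/*
 * For contrast: "%*s" takes a minimum field width.  It pads short
 * strings with spaces but still reads up to the terminating null byte,
 * so it is only safe for proper C strings.
 */
int
format_padded(char *buf, size_t bufsz, int width, const char *s)
{
	return snprintf(buf, bufsz, "%*s", width, s);
}
```

Note the cast to int: both "*" forms expect an int argument, so passing a size_t length directly would be a type mismatch.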

Commit:: 11e873cc864fe2d1d96d72b11374e34b553412b3
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Apr 13 17:11:40 2023 UTC

add missing printf argument

Forgotten in b3dda3fe when adding the input file name to error messages.

Commit:: 656f5a3f4d37e12933c4fd4b5bc0c9450ae60969
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 16:58:27 2023 UTC

remove stale comment

Finished reviewing past modifications to parse_xrefs().

NB: All code attributed to Sumit Ray has been removed from this function.

Commit:: a1014f81d804955bb38b434865b733271aa3d7a7
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 16:52:28 2023 UTC

improve handling of parse errors in xref stream data

Improve on the bugfix in commit a5abf1e2:

- Reinstate the assert for 'res->ast != NULL'. If it fails, there is a bug in the parser, not an error in the input file.
- Provide a distinct error message for the case where p_xref fails on a cross-reference stream because of invalid data.
- Only skip storing the invalid section. Try to follow the /Prev entry in the stream dictionary to find more sections.

Commit:: 512de3c2ead8ba1a54f5b7d4e4fd9050854ccc78
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 16:38:58 2023 UTC

remove a comment

I cannot tell what this refers to. The (nonexistent) else case of the if statement above it is simply the case of the object number in question not falling within this subsection. Anyway, the function lookup_xref() is a low-level utility used during parsing, not a place to produce error messages.

Commit:: c8be9e8432f98ec8d146fda0d5ce02958a68ecc4
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 15:47:59 2023 UTC

comments regarding act_ks_value

HParseResult was introduced in 6b54ebfa (generally parse stream objects) to hold the result of parsing the stream data, including the application of any filters. This is produced in act_ks_value(). The fact that parse errors in stream data are thus detectable is in fact significant for xref stream processing, so we should not just return the bare data on error.

Commit:: e61966396178de7c6118f47291cf1d93b96072d3
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 14:18:22 2023 UTC

adjust comments

Commit:: b3dda3fe558da73e9b929790be8eb74afad337c6
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 14:14:52 2023 UTC

don't emulate VIOL in error messages

While it might seem like a good idea to "grade" errors by severity, we are not *really* in any place to do so accurately. Our tasks are (a) to decide, internally, whether to print a message or silently ignore a malformation, and (b) to ultimately judge the file valid or invalid as a whole. Note that the latter part, as stated before, is not the responsibility of parse_xrefs().

Reinstate the input file name in these error messages. That information is useful when running the program on multiple files from a script, as we have been doing.

While we're at it, fix style (line lengths).

Commit:: 9ff8c465fbd3eb44f85988f3249768e4caa91ab0
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 13:40:42 2023 UTC

add test cases for out-of-bounds xref pointers

Both currently fail because the parser proper does not validate these offsets.

Commit:: 9196b5c2b80d606f09cde523b1931d6c9c921692
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 13:40:42 2023 UTC

drop use of h_seek in parse_xrefs

Now that we are validating the offset ourselves, we no longer need h_seek() to do our bounds checking. But add a defensive assert just in case.

Commit:: dd3c8e62ac41add9bad416af8b71cc5db02de029
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 13:40:42 2023 UTC

bounds-check /Prev pointers

Mirrors the check for startxref. I considered unifying the two into one test at the start of the loop, but then we would lose the information whether we got the offset from startxref or a /Prev.

Commit:: aa40560780b0cbea24d03b68570f3aac3b352da5
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 13:40:42 2023 UTC

report location of invalid startxref

This is useful information, especially in hex, when looking into the file. The invalid value itself, on the other hand, is not so useful.

Commit:: 550c070d23ab6702b3961e54b5d19bc6aad33e04
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 13:27:46 2023 UTC

adjust error message

The correct and standard format specifier for values of type size_t is %zu. There is no need to point out the valid bounds. Match style with the other messages.
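
As a brief illustration of the specifier, the following sketch formats a size_t value without any cast. The function name and the wording of the message are made up for this example and are not the program's actual output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format an error message about an out-of-bounds offset.  The message
 * wording is invented for this example.
 */
int
format_offset_error(char *buf, size_t bufsz, const char *fname, size_t offset)
{
	assert(buf != NULL && fname != NULL);
	/* %zu matches size_t exactly; no cast is needed */
	return snprintf(buf, bufsz, "%s: offset %zu out of bounds",
	    fname, offset);
}
```

Specifiers such as %d or %lu only happen to work on platforms where size_t has the same representation as the named type; %zu is correct everywhere a C99 printf is available.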

Commit:: 431c7db3b7ea3e2db9cc7066cb5334e4bb7dcb75
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Thu Mar 30 13:27:46 2023 UTC

remove useless/erroneous condition

The offset can never be negative (size_t is unsigned). And this treated offset = 0 as out of bounds, which is nonsense. In fact, offset == size is also not invalid (it is the end of file).
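
The reasoning above boils down to a single comparison. The helper below is a hypothetical sketch of such a check, not the condition as it appears in parse_xrefs().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* size_t is unsigned, so a test for "offset < 0" could never be true. */
static_assert((size_t)-1 > 0, "size_t is unsigned");

/*
 * An offset into an input of the given size is in bounds if it points
 * at a byte of the input or just past the last one (end of file).
 */
bool
offset_in_bounds(size_t offset, size_t size)
{
	return offset <= size;
}
```

Both edge cases named in the message pass this test: offset 0 is the first byte, and offset == size is the end of file.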

Commit:: 9883a543682945509e8b20b5e9444e1b52876a09
From:: Sven M. Hallberg <pesco@khjk.org>
Date:: Tue Mar 28 17:44:30 2023 UTC

revert parse_xrefs to its original signature

Passing the aux struct by reference may look cleaner, but it was deliberate to keep parse_xrefs() independent of that struct, since the latter is conceptually part of the parser's interface and the former is not. Also, this way parse_xrefs() has a proper return value that signals success or failure. Plus, no ugly indirection or temporary variable is needed to access sz.
