commit a5abf1e2d9cdc9bbb71f02f8555d2055309541c5
from: xentrac
date: Fri Feb 26 04:05:57 2021 UTC

Fix segfault when `decode_stream` fails in xrefs

In instigator-crashes/aux-xrefs-segfault an invalid flate-encoded
stream was producing this behavior:

    inflate: invalid distance too far back (-3)
    parse error in stream (XRef)
    ../instigator-crashes/aux-xrefs-segfault: error parsing xref section at position 249939 (0x3d053)

    Program received signal SIGSEGV, Segmentation fault.
    0x000055555555d91f in lookup_xref (aux=0x7fffffffdf60, nr=4, gen=0) at pdf.c:1249
    1249            HCountedArray *subs = H_INDEX_SEQ(aux->xrefs[i], 0);

What was happening was that `act_ks_value`, indirectly invoked by
`parse_xrefs`, invoked `decode_stream`, which produced the "inflate:"
message and returned NULL; so `act_ks_value` produced the "parse error
in stream" message and returned an HParseResult of that NULL pointer.
Higher up the stack `act_xrstm` packs this NULL pointer into element 0
of a new `h_sequence`.

`parse_xrefs` was happily storing this `h_sequence` into
`aux->xrefs[0]`, then blithely continuing to the next loop iteration,
at which point it would report "error parsing xref section" and return
back to main().

However, this did not abort parsing the file! main() was continuing
on to attempt to parse the PDF file as a whole, but the first time the
resulting parse tried to `lookup_xref`, that lookup would attempt to
iterate over the xrefs section in the file, checking to see if the xref
number belonged to any of them. The line of code above then segfaulted
while attempting to assert that the NULL was actually a valid
`h_sequence` pointer.

So this patch simply prevents `parse_xrefs` from treating the failed
xrefs section as valid. The result is that, as before, the parse exits
shortly because it can't follow any xrefs, but now without segfaulting!

    inflate: invalid distance too far back (-3)
    parse error in stream (XRef)
    ../instigator-crashes/aux-xrefs-segfault: error parsing xref section at position 255242 (0x3e50a)
    VIOLATION[1]@433 (0x1b1): Missing endobj token (severity=1)
    ../instigator-crashes/aux-xrefs-segfault: no parse
    VIOLATION[1]@433 (0x1b1): Missing endobj token (severity=1)
    ../instigator-crashes/aux-xrefs-segfault: error after position 433 (0x1b1)
    [Inferior 1 (process 626584) exited with code 01]

commit - c9ab81f899e5ed4668d95cf5d250364c5ba50922
commit + a5abf1e2d9cdc9bbb71f02f8555d2055309541c5
blob - c2d370e2a67ee3320b3b7ed10179087ef21e2ecb
blob + 6782e47ee5cfbd3f96506e4eb59136d4d742d6f2
--- pdf.c
+++ pdf.c
@@ -2356,12 +2356,11 @@ parse_xrefs(const uint8_t *input, size_t sz, size_t *n
 		//res = h_parse(p_xref, input + offset, sz - offset);
 		HParser *p = h_right(h_seek(offset * 8, SEEK_SET), p_xref);	// XXX
 		res = h_parse(p, input, sz);
-		if (res == NULL) {
+		if (res == NULL || res->ast == NULL || H_INDEX_TOKEN(res->ast, 0) == NULL) {
 			fprintf(stderr, "%s: error parsing xref section at "
 			    "position %zu (%#zx)\n", infile, offset, offset);
 			break;
 		}
-		assert(res->ast != NULL);
 
 		/* save this section in xrefs */
 		if (n >= SIZE_MAX / sizeof(HParsedToken *))
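
For readers who want to see the failure mode in isolation, here is a minimal sketch in C of the kind of guard the patched condition applies. It does not use Hammer or pdf.c. `Token`, `Result`, `xref_result_ok` and `collect_sections` are hypothetical stand-ins invented for illustration, and the extra `used > 0` check is an assumption of this sketch, not something the patch is known to do. The point is only that a non-NULL result can still carry a NULL AST or a NULL first element, so all levels are checked before a section is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for parser types; NOT the Hammer API. */
typedef struct Token {
    size_t used;           /* number of child elements */
    struct Token **elems;  /* child elements; individual entries may be NULL */
} Token;

typedef struct {
    const Token *ast;      /* may be NULL even when the result itself is not */
} Result;

/* Returns 1 only if the result, its AST, and element 0 of the AST all exist. */
static int xref_result_ok(const Result *res)
{
    return res != NULL
        && res->ast != NULL
        && res->ast->used > 0          /* assumption: bounds check added here */
        && res->ast->elems != NULL
        && res->ast->elems[0] != NULL;
}

/*
 * Copies the AST of each valid section into out[] and stops at the first
 * invalid one, so a failed section is never stored. Returns the number of
 * sections stored.
 */
static size_t collect_sections(const Result *const *results, size_t count,
                               const Token **out)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        if (!xref_result_ok(results[i])) {
            fprintf(stderr, "error parsing section %zu\n", i);
            break;
        }
        out[n++] = results[i]->ast;
    }
    return n;
}
```

A later lookup that walks `out[0..n-1]` can then dereference element 0 of each stored section without re-checking for NULL, which is the invariant the crash in `lookup_xref` was relying on.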