Various Plan 9 fixes
Needs a bunch of *f functions in APE math.h:
sinf, cosf, tanf, acosf, fabsf, ceilf, floorf, sqrtf, atan2f, fmodf,
hypotf, expf, powf, signbit, logf, log10f
Needs realpath (XSI) in APE stdlib.h.
Needs timegm (GNU) in APE time.h.
Depends on libjpeg, jbig2dec, and freetype.
Bug 703379: Fix conversion from Indexed Separation colorspace.
Fix mupdf-gl chapter/page handling.
Internally, fz_locations are numbered from 0. Externally we number from
1. Adjust when parsing.
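
The adjustment above can be sketched roughly as follows. This is a minimal illustration, not the mupdf-gl code: the `location` struct and `parse_location` are hypothetical stand-ins, and treating a bare "page" as belonging to the first chapter is a simplifying assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for an fz_location-style pair; both fields are zero-based. */
typedef struct { int chapter; int page; } location;

/*
 * Parse "page" or "chapter:page" as typed by a user (numbered from 1)
 * into a zero-based location. Returns 1 on success, 0 on malformed input.
 */
static int parse_location(const char *s, location *out)
{
	char *end;
	long a, b;

	assert(s != NULL && out != NULL);

	a = strtol(s, &end, 10);
	if (end == s || a < 1 || a > INT_MAX)
		return 0;
	if (*end == '\0')
	{
		/* Bare "page": assumed here to mean a page in the first chapter. */
		out->chapter = 0;
		out->page = (int)(a - 1);
		return 1;
	}
	if (*end != ':')
		return 0;

	s = end + 1;
	b = strtol(s, &end, 10);
	if (end == s || *end != '\0' || b < 1 || b > INT_MAX)
		return 0;

	/* External numbers start at 1, internal ones at 0. */
	out->chapter = (int)(a - 1);
	out->page = (int)(b - 1);
	return 1;
}
```

With this sketch, "2:5" yields chapter 1, page 4, and "0" is rejected because external numbering starts at 1.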
OSS-Fuzz 29728: Avoid buffer overflow.
Don't access past the end of the xref_index.
Coverity 216668: Avoid null deref.
Probably never happens in practice as we don't go looking for objects
that don't exist, but this is safer.
PDF OCR: Support vertical writing
Ensure that we give reasonable output for vertical writing.
I can't actually get Tesseract to give me vertical writing in
general, but testing this with a file with a single column of
Japanese gives reasonable results.
At some point, we can look at using a font definition with
/WMode 1 to maybe simplify this.
Update mupdf-gl to accept "chapter:page" as well as just "page".
Bug 703392: Fix PDFDocument.insertPage to match C code.
Negative numbers are defined to work, as is INT_MAX. The
JNI code should not complain about them.
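
As a rough sketch of the kind of normalisation this implies: the helper below is hypothetical, and it assumes that both a negative index and INT_MAX mean "append at the end", which the text above does not spell out.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: map a requested insert position onto an index in
 * [0, count], or return -1 if it lies beyond the end.
 * Assumption: negative values and INT_MAX both mean "append".
 */
static int normalize_insert_index(int at, int count)
{
	assert(count >= 0);

	if (at < 0 || at == INT_MAX)
		return count;
	if (at > count)
		return -1;
	return at;
}
```

A binding layer that follows this convention would pass such values through unchanged rather than rejecting them up front.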
Bug 701640: Fix linearisation offset problem.
When linearising, we write objects out in one pass to establish offsets.
We then create the hint objects, and do another pass to properly
write the file out.
During the first pass, we were creating a hint object without a
stream. The stream was then created before the second pass was
performed. Accordingly, the overall length of that object changes.
The code allowed for the change in the size of the stream, but
was failing to allow for it changing from not having a stream at all
to having a stream. (i.e. it allowed for the stream changing size,
but did not allow for the addition of "stream\n" and "endstream\n".)
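
To make the size difference concrete, here is a simplified sketch. The function names are hypothetical and the size model is deliberately crude: it counts only the dictionary text, the two keywords quoted above, and the stream data.

```c
#include <assert.h>
#include <string.h>

/* Bytes added by the stream keywords alone, independent of the data length. */
static size_t stream_keyword_overhead(void)
{
	return strlen("stream\n") + strlen("endstream\n");
}

/*
 * Hypothetical size estimate for an object body: dict_len bytes of
 * dictionary text, plus the keywords and data if a stream is present.
 */
static size_t object_body_size(size_t dict_len, int has_stream, size_t data_len)
{
	if (!has_stream)
	{
		assert(data_len == 0);
		return dict_len;
	}
	return dict_len + stream_keyword_overhead() + data_len;
}
```

In this model, an object that gains even an empty stream between passes grows by 17 bytes, which an allowance for the data length alone does not cover.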
Due to the 'slack' inherent in writing objects with default large
numbers in the first pass, and those numbers being reduced to smaller
values in the second pass, we were getting away with this in most
cases. The addition of an ASCIIHexDecode filter was tipping us over
The fix is simply to ensure we create the object with a stream to
Bug 703312: Update coding-overview.html to strengthen locking requirements.
Bug 703366: Fix double free of object during linearization.
This appears to happen because we parse an illegal object from
a broken file and assign it to object 0, which is defined to
Here, we fix the parsing code so this can't happen.
Tweak pdf_update_stream to work better with undo.
Set the new length in the object before we update the stream.
Setting the new length has the effect of moving/copying the
old object/old stream into the journal. Previously we were
setting the stream, THEN moving the old version which meant
the new stream was copied in to accompany the old object.
Fix mutool usage messages not to leak.
Mostly this is to avoid me running "mutool whatever" to get the
usage message and then having to wade through pages of Memento
leaked block information to actually find what I was looking for.
mutool create: Add some exception handling.
Avoids leaks in failure cases.
Bug 703388: document that fz_drop_context() should not be called inside fz_try/always/catch blocks.
Improve OCR handling of R2L text.
Previously, Tesseract was handing us the chars for a word in R2L order
and we were outputting them in L2R order, meaning that cut/paste would
Add JNI bindings for undo/redo.
Fix 'wonky bboxes' issue in OCRd text.
When we save a 'High Security Redaction' version of a document
(even one with no redactions in it) the resultant file often
gives 'wonky' selection boxes when viewed in Acrobat (or similar).
This is because the OCR routine returns a separate bbox for each
word, and the baselines of those bboxes don't line up.
Accordingly, we now collect the recognised words up as lines before
flushing them to the stream, and use a line bbox rather than a
word bbox to base the positions on.
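
The line bbox idea can be sketched as a union over the word bboxes. This is an illustration only: the `rect` type is a hypothetical stand-in modelled on fz_rect, and the grouping of words into lines is assumed to have already been done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle type, laid out like fz_rect. */
typedef struct { float x0, y0, x1, y1; } rect;

/* Smallest rectangle covering both a and b. */
static rect rect_union(rect a, rect b)
{
	rect r;
	r.x0 = a.x0 < b.x0 ? a.x0 : b.x0;
	r.y0 = a.y0 < b.y0 ? a.y0 : b.y0;
	r.x1 = a.x1 > b.x1 ? a.x1 : b.x1;
	r.y1 = a.y1 > b.y1 ? a.y1 : b.y1;
	return r;
}

/* Bbox of a whole line, given the bboxes of the n words collected for it. */
static rect line_bbox(const rect *words, int n)
{
	rect r;
	int i;

	assert(words != NULL && n > 0);

	r = words[0];
	for (i = 1; i < n; i++)
		r = rect_union(r, words[i]);
	return r;
}
```

Positioning every word against the shared y0/y1 of the line bbox, rather than its own, is what keeps the baselines aligned for horizontal text.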
This code currently relies on text being horizontal. Vertical mode
text will probably break it. The existing code relies on both
L2R and horizontal text, so we are no worse off with this.
We can cope with R2L and vertical text when we have some examples.
Bug 701919: Add page number to man page for mupdf.
Thanks to Daniel Lublin.
Bug 702898: Fix SVG output clipping issues.
When we define a pattern, we do so using the SVG symbol mechanism.
Some SVG renderers, by default, make a clipbox around the rendered
symbol. This is no good for contents of patterns that are offset
and should still appear correctly due to the repeats in the patterns.
Accordingly, we make every such symbol have the style="overflow:visible"