Various plan9 fixes. Needs a bunch of *f functions in APE math.h: sinf, cosf, tanf, acosf, fabsf, ceilf, floorf, sqrtf, atan2f, fmodf, hypotf, expf, powf, signbit, logf, log10f. Needs realpath (XSI) in APE stdlib.h. Needs timegm (GNU) in APE time.h. Depends on libjpeg, jbig2dec, and freetype.
Bug 703379: Fix conversion from Indexed Separation colorspace.
Fix MuPDF-gl chapter/page handling. Internally, fz_locations are numbered from 0. Externally we number from 1. Adjust when parsing.
OSS-Fuzz 29728: Avoid buffer overflow. Don't access past the end of the xref_index.
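The kind of guard described here can be sketched as follows. This is a minimal illustration, not the actual MuPDF change: the function name, the `int` element type and the explicit length parameter are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds-checked lookup into an xref index.
 * Returns 1 and stores the entry in *out when num is in range,
 * returns 0 (leaving *out untouched) otherwise. */
static int xref_index_get(const int *xref_index, int len, int num, int *out)
{
	assert(out != NULL);
	/* Reject a missing table and any index outside [0, len). */
	if (xref_index == NULL || num < 0 || num >= len)
		return 0;
	*out = xref_index[num];
	return 1;
}
```

Callers check the return value instead of reading the array directly, so an object number taken from a broken file cannot index past the end.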
Coverity 216668: Avoid null deref. Probably never happens in practice as we don't go looking for objects that don't exist, but this is safer.
PDF OCR: Support vertical writing. Ensure that we give reasonable output for vertical writing. I can't actually get Tesseract to give me vertical writing in general, but testing this with a file with a single column of Japanese gives reasonable results. At some point, we can look at using a font definition with /WMode 1 to maybe simplify this.
Update mupdf-gl to accept "chapter:page" as well as just "page".
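A rough sketch of parsing "page" or "chapter:page" with the 1-based to 0-based adjustment. The struct is a stand-in for fz_location, and the function name and the choice of chapter 0 when only a page is given are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for fz_location; not the MuPDF type itself. */
struct location { int chapter; int page; };

/* Parse "page" or "chapter:page" (numbered from 1, as shown to users)
 * into a location numbered from 0. Returns 1 on success, 0 on bad input. */
static int parse_location(const char *s, struct location *loc)
{
	char *end;
	long a, b;

	assert(s != NULL && loc != NULL);
	a = strtol(s, &end, 10);
	if (end == s || a < 1)
		return 0;
	if (*end == '\0') {
		/* Bare page number: assume the first chapter. */
		loc->chapter = 0;
		loc->page = (int)(a - 1);
		return 1;
	}
	if (*end != ':')
		return 0;
	s = end + 1;
	b = strtol(s, &end, 10);
	if (end == s || *end != '\0' || b < 1)
		return 0;
	loc->chapter = (int)(a - 1);
	loc->page = (int)(b - 1);
	return 1;
}
```

With this sketch, "2:5" yields chapter 1, page 4, and "7" yields chapter 0, page 6.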
Bug 703392: Fix PDFDocument.insertPage to match C code. Negative numbers are defined to work, as is INT_MAX. The JNI code should not complain about them.
Bug 701640: Fix linearisation offset problem. When linearising, we write objects out in one pass to establish offsets. We then create the hint objects, and do another pass to properly write the file out. During the first pass, we were creating a hint object without a stream. The stream was then created before the second pass was performed. Accordingly, the overall length of that object changes. The code allowed for the change in the size of the stream, but was failing to allow for it changing from not having a stream at all to having a stream. (i.e. it allowed for the stream changing size, but did not allow for the addition of "stream\n" and "endstream\n".) Due to the 'slack' inherent in writing objects with default large numbers in the first pass, and those numbers being reduced to smaller values in the second pass, we were getting away with this in most cases. The addition of an ASCIIHexDecode filter was tipping us over the edge. The fix is simply to ensure we create the object with a stream to start with.
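The size mismatch can be shown with a toy model of an object's serialized length. This is only a sketch of the arithmetic: the function and its parameters are hypothetical and ignore everything else the writer emits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical size model: the dictionary bytes plus, when the object
 * has a stream, the two keywords and the stream data. */
static size_t object_size(size_t dict_len, int has_stream, size_t stream_len)
{
	size_t n = dict_len;

	/* An object without a stream cannot carry stream bytes. */
	assert(has_stream || stream_len == 0);
	if (has_stream)
		n += strlen("stream\n") + stream_len + strlen("endstream\n");
	return n;
}
```

Going from no stream to a stream adds 17 bytes of keywords on top of the data, which is the part the offset calculation missed. If the first pass already writes the object with an empty stream, the only difference between passes is the stream data itself.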
Bug 703312: Update coding-overview.html to strengthen locking requirements.
Bug 703366: Fix double free of object during linearization. This appears to happen because we parse an illegal object from a broken file and assign it to object 0, which is defined to be free. Here, we fix the parsing code so this can't happen.
Tweak pdf_update_stream to work better with undo. Set the new length in the object before we update the stream. Setting the new length has the effect of moving/copying the old object/old stream into the journal. Previously we were setting the stream, THEN moving the old version which meant the new stream was copied in to accompany the old object.
Fix mutool usage messages not to leak. Mostly this is to avoid me running "mutool whatever" to get the usage message and then having to wade through pages of Memento leaked block information to actually find what I was looking for.
mutool create: Add some exception handling. Avoids leaks in failure cases.
Bug 703388: document that fz_drop_context() should not be called inside fz_try/always/catch blocks.
Improve OCR handling of R2L text. Previously, Tesseract was handing us the chars for a word in R2L order and we were outputting them in L2R order, meaning that cut/paste would reverse words.
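The commit does not say how the ordering was corrected. Purely as an illustration of the mismatch, the helper below (a hypothetical name) flips the code points of one word between the two orders. The real fix may work differently.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse n Unicode code points in place. */
static void reverse_codepoints(int *cp, size_t n)
{
	size_t i, j;

	assert(cp != NULL || n == 0);
	if (n < 2)
		return;
	/* Swap from both ends towards the middle. */
	for (i = 0, j = n - 1; i < j; i++, j--) {
		int t = cp[i];
		cp[i] = cp[j];
		cp[j] = t;
	}
}
```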
Add JNI bindings for undo/redo.
Fix 'wonky bboxes' issue in OCRd text. When we save a 'High Security Redaction' version of a document (even one without any redactions), the resulting file often gives 'wonky' selection boxes when viewed in Acrobat (or similar). This is because the OCR routine returns a different bbox for each word, so the baselines of neighbouring words don't line up. Accordingly, we now collect the recognised words up as lines before flushing them to the stream, and use a line bbox rather than a word bbox to base the positions on. This code currently relies on text being horizontal. Vertical mode text will probably break it. The existing code relies on both L2R and horizontal text, so we are no worse off with this. We can cope with R2L and vertical text when we have some examples.
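Collecting word bboxes into a single line bbox can be sketched as a plain union of rectangles. The struct is a stand-in for fz_rect and the function name is hypothetical. How the real code decides which words belong to a line is not covered here.

```c
#include <assert.h>

/* Hypothetical stand-in for fz_rect. */
struct rect { float x0, y0, x1, y1; };

/* Return the smallest rectangle covering all n word bboxes.
 * For n == 0 an all-zero rectangle is returned. */
static struct rect line_bbox(const struct rect *words, int n)
{
	struct rect r = { 0, 0, 0, 0 };
	int i;

	assert(n >= 0);
	if (n == 0)
		return r;
	r = words[0];
	for (i = 1; i < n; i++) {
		if (words[i].x0 < r.x0) r.x0 = words[i].x0;
		if (words[i].y0 < r.y0) r.y0 = words[i].y0;
		if (words[i].x1 > r.x1) r.x1 = words[i].x1;
		if (words[i].y1 > r.y1) r.y1 = words[i].y1;
	}
	return r;
}
```

Every word on the line then shares the same vertical extent, so positions derived from it line up even when the individual word boxes differ.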
Bug 701919: Add page number to man page for mupdf. Thanks to Daniel Lublin.
Bug 702898: Fix SVG output clipping issues. When we define a pattern, we do so using the SVG symbol mechanism. Some SVG renderers, by default, make a clipbox around the rendered symbol. This is no good for contents of patterns that are offset and should still appear correctly due to the repeats in the patterns. Accordingly, we make every such symbol have the style="overflow:visible" attribute.
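As an illustration of the attribute in question, the helper below formats an opening symbol tag carrying style="overflow:visible". The function name and the "pat_%d" id scheme are invented for this sketch and are not taken from the MuPDF SVG device.

```c
#include <assert.h>
#include <stdio.h>

/* Write the opening tag of a pattern symbol into buf.
 * Returns the snprintf result: the length the full tag needs. */
static int format_pattern_symbol(char *buf, size_t len, int id)
{
	assert(buf != NULL && len > 0);
	/* overflow:visible stops renderers clipping to the symbol's box. */
	return snprintf(buf, len,
		"<symbol id=\"pat_%d\" style=\"overflow:visible\">", id);
}
```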