From: Brian Wolff
Date: Fri, 6 Dec 2013 20:29:30 +0000 (-0400)
Subject: Normalize newlines in DjVu text-layer metadata.
X-Git-Tag: 1.31.0-rc.0~17730^2
X-Git-Url: https://git.cyclocoop.org/%7B%24admin_url%7Dcompta/operations/modifier.php?a=commitdiff_plain;h=9b48c297fb7d078d2b79597d72612e885aadd707;p=lhc%2Fweb%2Fwiklou.git

Normalize newlines in DjVu text-layer metadata.

Currently, newlines in the DjVu text layer are stored as the literal
string '\n'. It's up to the consumer to unescape that into a real
newline. Other formats like PDFs return newlines as an actual \n
character when getPageText() is called. I think getPageText() should
not require callers to do this.

Change-Id: Ie1a438bbce5444c53ff6b7b3aaf2b5267ba3c8b4
---

diff --git a/includes/media/DjVuImage.php b/includes/media/DjVuImage.php
index 9bfc378d08..971c865236 100644
--- a/includes/media/DjVuImage.php
+++ b/includes/media/DjVuImage.php
@@ -306,7 +306,9 @@ EOR;
 	function pageTextCallback( $matches ) {
 		# Get rid of invalid UTF-8, strip control characters
-		return '';
+		$val = htmlspecialchars( UtfNormal::cleanUp( stripcslashes( $matches[1] ) ) );
+		$val = str_replace( array( "\n", '�' ), array( ' ', '' ), $val );
+		return '';
 	}
 	/**
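
The unescaping described in the commit message can be illustrated with a minimal standalone sketch. It is not MediaWiki's pageTextCallback(): the helper name unescapeDjvuText() and the sample string are hypothetical, and the UtfNormal::cleanUp() step is left out because it needs MediaWiki itself. Only the stripcslashes() and htmlspecialchars() calls visible in the diff are reproduced.

```php
<?php
// Hypothetical helper for illustration only; not part of MediaWiki.
// It mirrors the unescape and escape steps visible in the diff and
// skips UtfNormal::cleanUp(), which is only available inside MediaWiki.
function unescapeDjvuText( string $raw ): string {
	// stripcslashes() turns the two-character sequence backslash + n
	// into a single line-feed character (it also handles \t, \\, etc.).
	$text = stripcslashes( $raw );

	// Escape HTML/XML special characters, as the patched code does.
	return htmlspecialchars( $text );
}

// Single quotes keep the backslash literal, like the stored text layer.
$stored = 'first line\nsecond <line>';
$clean = unescapeDjvuText( $stored );

// The literal '\n' is gone and a real newline is present instead.
var_dump( strpos( $clean, "\n" ) !== false ); // bool(true)
```

With the unescape done on the producer side like this, a caller of getPageText() would get real newline characters and would not need its own stripcslashes() pass.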