| author | Xin LI <delphij@FreeBSD.org> | 2025-05-03 01:54:10 -0700 |
|---|---|---|
| committer | Xin LI <delphij@FreeBSD.org> | 2025-05-03 01:54:10 -0700 |
| commit | 12eff5f0d8b7f4cf5c87b697cc5729d6c223e4c0 (patch) | |
| tree | 716576e8bdf396ffb0f92e282a2c86b9cbb8d48a | |
| parent | 956197bcea3aa907f58c3550fa935dec81930f4c (diff) | |
Vendor import of xz 5.8.1 (trimmed) [vendor/xz/5.8.1]
82 files changed, 7396 insertions, 2112 deletions
@@ -24,7 +24,7 @@ Authors of XZ Utils
     by Michał Górny.
 
     Architecture-specific CRC optimizations were contributed by
-    Ilya Kurdyukov, Hans Jansen, and Chenxi Mao.
+    Ilya Kurdyukov, Chenxi Mao, and Xi Ruoyao.
 
     Other authors:
       - Jonathan Nieder
@@ -40,6 +40,12 @@ XZ Utils Licensing
     free software licenses. These aren't built or installed as part
     of XZ Utils.
 
+    The following command may be helpful in finding per-file license
+    information. It works on xz.git and on a clean file tree extracted
+    from a release tarball.
+
+        sh build-aux/license-check.sh -v
+
     For the files under the BSD Zero Clause License (0BSD), if
     a copyright notice is needed, the following is sufficient:
 
@@ -59,25 +65,6 @@ XZ Utils Licensing
       - COPYING.GPLv2: GNU General Public License version 2
       - COPYING.GPLv3: GNU General Public License version 3
 
-    A note about old XZ Utils releases:
-
-        XZ Utils releases 5.4.6 and older and 5.5.1alpha have a
-        significant amount of code put into the public domain and
-        that obviously remains so. The switch from public domain to
-        0BSD for newer releases was made in Febrary 2024 because
-        public domain has (real or perceived) legal ambiguities in
-        some jurisdictions.
-
-        There is very little *practical* difference between public
-        domain and 0BSD. The main difference likely is that one
-        shouldn't claim that 0BSD-licensed code is in the public
-        domain; 0BSD-licensed code is copyrighted but available under
-        an extremely permissive license. Neither 0BSD nor public domain
-        require retaining or reproducing author, copyright holder, or
-        license notices when distributing the software. (Compare to,
-        for example, BSD 2-Clause "Simplified" License which does have
-        such requirements.)
-
     If you have questions, don't hesitate to ask for more information.
     The contact information is in the README file.
 
diff --git a/ChangeLog b/ChangeLog index 2d36d7bb1043..577dce5e12a2 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,14 +1,2702 @@ -commit 9331ce4009ddc839f5191d234cc41b2d4797376d +commit a522a226545730551f7e7c2685fab27cf567746c Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-10-01 12:21:22 +0300 +Date: 2025-04-03 14:34:43 +0300 - Bump version and soname for 5.6.3 + Bump version and soname for 5.8.1 src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -commit f52857ffde768058db0e0e13f68a2660ca9f1330 +commit 1c462c2ad86ff85766928638431029cd0b0dc995 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:43 +0300 + + Add NEWS for 5.8.1 + + NEWS | 30 ++++++++++++++++++++++++++++++ + 1 file changed, 30 insertions(+) + +commit 513cabcf7f5ce1c3ed0619e791393fc53d1dbbd0 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:43 +0300 + + Tests: Call lzma_code() in smaller chunks in fuzz_common.h + + This makes it easy to crash fuzz_decode_stream_mt when tested + against the code from 5.8.0. + + Obviously this might make it harder to reach some other code path now. + The previous code has been in use since 2018 when fuzzing was added + in 106d1a663d4b ("Tests: Add a fuzz test program and a config file + for OSS-Fuzz."). + + tests/ossfuzz/fuzz_common.h | 31 ++++++++++++++++++++++++------- + 1 file changed, 24 insertions(+), 7 deletions(-) + +commit 48440e24a25911ae59e8518b67a1e0f6f1c293bf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:43 +0300 + + Tests: Add a fuzzing target for the multithreaded .xz decoder + + It doesn't seem possible to trigger the CVE-2025-31115 bug with this + fuzzing target at the moment. It's because the code in fuzz_common.h + passes the whole input buffer to lzma_code() at once. 
+ + tests/ossfuzz/fuzz_decode_stream_mt.c | 47 +++++++++++++++++++++++++++++++++++ + 1 file changed, 47 insertions(+) + +commit 0c80045ab82c406858d9d5bcea9f48ebc3d0a81d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:42 +0300 + + liblzma: mt dec: Fix lack of parallelization in single-shot decoding + + Single-shot decoding means calling lzma_code() by giving it the whole + input at once and enough output buffer space to store the uncompressed + data, and combining this with LZMA_FINISH and no timeout + (lzma_mt.timeout = 0). This way the file is decoded with a single + lzma_code() call if possible. + + The bug prevented the decoder from starting more than one worker thread + in single-shot mode. The issue was noticed when reviewing the code; + there are no bug reports. Thus maybe few have tried this mode. + + Fixes: 64b6d496dc81 ("liblzma: Threaded decoder: Always wait for output if LZMA_FINISH is used.") + + src/liblzma/common/stream_decoder_mt.c | 11 +++++++++-- + 1 file changed, 9 insertions(+), 2 deletions(-) + +commit 8188048854e8d11071b8a50d093c74f4c030acc9 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:42 +0300 + + liblzma: mt dec: Don't modify thr->in_size in the worker thread + + Don't set thr->in_size = 0 when returning the thread to the stack of + available threads. Not only is it useless, but the main thread may + read the value in SEQ_BLOCK_THR_RUN. With valid inputs, it made + no difference if the main thread saw the original value or 0. With + invalid inputs (when worker thread stops early), thr->in_size was + no longer modified after the previous commit with the security fix + ("Don't free the input buffer too early"). + + So while the bug appears harmless now, it's important to fix it because + the variable was being modified without proper locking. It's trivial + to fix because there is no need to change the value. 
Only main thread + needs to set the value in (in SEQ_BLOCK_THR_INIT) when starting a new + Block before the worker thread is activated. + + Fixes: 4cce3e27f529 ("liblzma: Add threaded .xz decompressor.") + Reviewed-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> + Thanks-to: Sam James <sam@gentoo.org> + + src/liblzma/common/stream_decoder_mt.c | 6 ++++-- + 1 file changed, 4 insertions(+), 2 deletions(-) + +commit d5a2ffe41bb77b918a8c96084885d4dbe4bf6480 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:42 +0300 + + liblzma: mt dec: Don't free the input buffer too early (CVE-2025-31115) + + The input buffer must be valid as long as the main thread is writing + to the worker-specific input buffer. Fix it by making the worker + thread not free the buffer on errors and not return the worker thread to + the pool. The input buffer will be freed when threads_end() is called. + + With invalid input, the bug could at least result in a crash. The + effects include heap use after free and writing to an address based + on the null pointer plus an offset. + + The bug has been there since the first committed version of the threaded + decoder and thus affects versions from 5.3.3alpha to 5.8.0. + + As the commit message in 4cce3e27f529 says, I had made significant + changes on top of Sebastian's patch. This bug was indeed introduced + by my changes; it wasn't in Sebastian's version. + + Thanks to Harri K. Koskinen for discovering and reporting this issue. + + Fixes: 4cce3e27f529 ("liblzma: Add threaded .xz decompressor.") + Reported-by: Harri K. 
Koskinen <x64nop@nannu.org> + Reviewed-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> + Thanks-to: Sam James <sam@gentoo.org> + + src/liblzma/common/stream_decoder_mt.c | 31 ++++++++++++++++++++++--------- + 1 file changed, 22 insertions(+), 9 deletions(-) + +commit c0c835964dfaeb2513a3c0bdb642105152fe9f34 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:42 +0300 + + liblzma: mt dec: Simplify by removing the THR_STOP state + + The main thread can directly set THR_IDLE in threads_stop() which is + called when errors are detected. threads_stop() won't return the stopped + threads to the pool or free the memory pointed by thr->in anymore, but + it doesn't matter because the existing workers won't be reused after + an error. The resources will be cleaned up when threads_end() is + called (reinitializing the decoder always calls threads_end()). + + Reviewed-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> + Thanks-to: Sam James <sam@gentoo.org> + + src/liblzma/common/stream_decoder_mt.c | 75 +++++++++++++--------------------- + 1 file changed, 29 insertions(+), 46 deletions(-) + +commit 831b55b971cf579ee16a854f177c36b20d3c6999 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:42 +0300 + + liblzma: mt dec: Fix a comment + + Reviewed-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> + Thanks-to: Sam James <sam@gentoo.org> + + src/liblzma/common/stream_decoder_mt.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit b9d168eee4fb6393b4fe207c0aeb5faee316ca1a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-03 14:34:30 +0300 + + liblzma: Add assertions to lzma_bufcpy() + + src/liblzma/common/common.c | 6 ++++++ + 1 file changed, 6 insertions(+) + +commit c8e0a4897b4d0f906966f5d4d4f662221d64f3ae +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-04-02 16:40:22 +0300 + + DOS: Update Makefile to fix the build + + dos/Makefile | 2 ++ + 1 file changed, 2 insertions(+) + 
+commit 307c02ed698a69763ef1c9c0df4ff24727442118 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-29 12:41:32 +0200 + + sysdefs.h: Avoid <stdalign.h> even with C11 compilers + + Oracle Developer Studio 12.6 on Solaris 10 claims C11 support in + __STDC_VERSION__ and supports _Alignas. However, <stdalign.h> is missing. + We only need alignas, so define it to _Alignas with C11/C17 compilers. + If something included <stdalign.h> later, it shouldn't cause problems. + + Thanks to Ihsan Dogan for reporting the issue and testing the fix. + + Fixes: c0e7eaae8d6eef1e313c9d0da20ccf126ec61f38 + + src/common/sysdefs.h | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +commit 7ce38b318339d6c01378a77585e08169ca3a604e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-29 12:32:05 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 688e51bde4c987589717b2be1a1fde9576c604fc +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-29 12:21:51 +0200 + + Translations: Update the Croatian translation + + po/hr.po | 14 +++++++------- + 1 file changed, 7 insertions(+), 7 deletions(-) + +commit 173fb5c68b08a8c1369550267be258132b7760c6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 18:23:57 +0200 + + doc/SHA256SUMS: Add 5.8.0 + + doc/SHA256SUMS | 6 ++++++ + 1 file changed, 6 insertions(+) + +commit db9258e828bc2cd96e3954f1ddcc9d3530589025 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:32 +0200 + + Bump version and soname for 5.8.0 + + Also remove the LZMA_UNSTABLE macro. 
+ + src/liblzma/Makefile.am | 2 +- + src/liblzma/api/lzma/bcj.h | 2 -- + src/liblzma/api/lzma/version.h | 6 +++--- + src/liblzma/common/common.h | 2 -- + src/liblzma/liblzma_generic.map | 2 +- + src/liblzma/liblzma_linux.map | 2 +- + 6 files changed, 6 insertions(+), 10 deletions(-) + +commit bfb752a38f89ed03fc93d54f11c09f43fda64bc2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:32 +0200 + + Add NEWS for 5.8.0 + + NEWS | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 62 insertions(+) + +commit 6ccbb904da851eb0c174c8dbd43e84da31739720 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:31 +0200 + + Translations: Run "make -C po update-po" + + POT-Creation-Date is set to match the timestamp in 5.7.2beta which + in the Translation Project is known as 5.8.0-pre1. The strings + haven't changed since 5.7.1alpha but a few comments have. + + This is a very noisy commit, but this helps keeping the PO files + similar between the Git repository and stable release tarballs. 
+ + po/ca.po | 964 ++++++++++++++++++++++++++++++++++++++++++++--------------- + po/cs.po | 935 ++++++++++++++++++++++++++++++++++++++++++---------------- + po/da.po | 663 ++++++++++++++++++++++++++++++----------- + po/de.po | 7 +- + po/eo.po | 966 +++++++++++++++++++++++++++++++++++++++++++++--------------- + po/es.po | 7 +- + po/fi.po | 2 +- + po/fr.po | 916 +++++++++++++++++++++++++++++++++++++++++--------------- + po/hu.po | 966 +++++++++++++++++++++++++++++++++++++++++++++--------------- + po/ka.po | 7 +- + po/ko.po | 7 +- + po/nl.po | 7 +- + po/pl.po | 7 +- + po/pt_BR.po | 962 ++++++++++++++++++++++++++++++++++++++++++++--------------- + po/sr.po | 2 +- + po/sv.po | 7 +- + po/tr.po | 7 +- + po/uk.po | 7 +- + po/vi.po | 948 +++++++++++++++++++++++++++++++++++++++++++--------------- + po/zh_CN.po | 940 ++++++++++++++++++++++++++++++++++++++++++++-------------- + po/zh_TW.po | 2 +- + 21 files changed, 6209 insertions(+), 2120 deletions(-) + +commit 891a5f057a6bb2dd2e3ce5e3bdd7a1f1ee03b800 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:31 +0200 + + Translations: Run po4a/update-po + + Also remove the trivial obsolete messages like man page dates. + + This is a noisy commit, but this helps keeping the PO files similar + between the Git repository and stable release tarballs. + + po4a/fr.po | 82 +++++++++++++++++++++++++++++++++++++------------------ + po4a/pt_BR.po | 88 +++++++++++++++++++++++++++++++++++++++++------------------ + po4a/sr.po | 79 ++++++++++++++++++++++++++++++++++------------------- + 3 files changed, 167 insertions(+), 82 deletions(-) + +commit 4f52e7387012cb3510b01c937dd9b3a0c6a3ac6c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:31 +0200 + + Translations: Partially fix overtranslation in Serbian man pages + + Names of environment variables and some other strings must be present + in the original form. The translator couldn't be reached so I'm + changing some of the strings myself. 
In the "Robot mode" section, + occurrences in the middle of sentences weren't changed to reduce + the chance of grammar breakage, but I kept the translated strings in + parenthesis in the headings. It's not ideal, but now people shouldn't + need to look at the English man page to find the English strings. + + po4a/sr.po | 66 ++++++++++++++++++++++++++++++++++++++++++-------------------- + 1 file changed, 45 insertions(+), 21 deletions(-) + +commit ff5d944749b99eb5ab35e2ebaf01d05a59e7169b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:31 +0200 + + liblzma: Count the extra bytes in LZMA/LZMA2 decoder memory usage + + src/liblzma/lz/lz_decoder.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +commit 943b012d09f717f7b44284c4e4976ea41264c731 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:31 +0200 + + liblzma: Use SSE2 intrinsics instead of memcpy() in dict_repeat() + + SSE2 is supported on every x86-64 processor. The SSE2 code is used on + 32-bit x86 if compiler options permit unconditional use of SSE2. + + dict_repeat() copies short random-sized unaligned buffers. At least + on glibc, FreeBSD, and Windows (MSYS2, UCRT, MSVCRT), memcpy() is + clearly faster than byte-by-byte copying in this use case. Compared + to the memcpy() version, the new SSE2 version reduces decompression + time by 0-5 % depending on the machine and libc. It should never be + slower than the memcpy() version. + + However, on musl 1.2.5 on x86-64, the memcpy() version is the slowest. + Compared to the memcpy() version: + + - The byte-by-version takes 6-7 % less time to decompress. + - The SSE2 version takes 16-18 % less time to decompress. + + The numbers are from decompressing a Linux kernel source tarball in + single-threaded mode on older AMD and Intel systems. The tarball + compresses well, and thus dict_repeat() performance matters more + than with some other files. 
+ + src/liblzma/lz/lz_decoder.c | 14 ++++++-- + src/liblzma/lz/lz_decoder.h | 87 ++++++++++++++++++++++++++++++++++++++++----- + 2 files changed, 90 insertions(+), 11 deletions(-) + +commit bc14e4c94e788d42eeab984298391fc0ca46f969 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:31 +0200 + + liblzma: Add "restrict" to a few functions in lz_decoder.h + + This doesn't make any difference in practice because compilers can + already see that writing through the dict->buf pointer cannot modify + the contents of *dict itself: The LZMA decoder makes a local copy of + the lzma_dict structure, and even if it didn't, the pointer to + lzma_dict in the LZMA decoder is already "restrict". + + It's nice to add "restrict" anyway. uint8_t is typically unsigned char + which can alias anything. Without the above conditions or "restrict", + compilers could need to assume that writing through dict->buf might + modify *dict. This would matter in dict_repeat() because the loops + refer to dict->buf and dict->pos instead of making local copies of + those members for the duration of the loops. If compilers had to + assume that writing through dict->buf can affect *dict, then compilers + would need to emit code that reloads dict->buf and dict->pos after + every write through dict->buf. + + src/liblzma/lz/lz_decoder.h | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +commit e82ee090c567e560f51a056775a17f534d159d65 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:30 +0200 + + liblzma: Define LZ_DICT_INIT_POS for initial dictionary position + + It's more readable. 
+ + src/liblzma/lz/lz_decoder.c | 4 ++-- + src/liblzma/lz/lz_decoder.h | 9 ++++++--- + 2 files changed, 8 insertions(+), 5 deletions(-) + +commit 8e7cd0091e5239334437decbe1989662d45a2f47 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:30 +0200 + + Windows: Update README-Windows.txt about UCRT + + windows/README-Windows.txt | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +commit 2c24292d341e505e5579fccac3bce5bc71d839ef +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-25 15:18:15 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 48053c90898fa191a216aefca01626520a7413f4 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-17 15:33:25 +0200 + + Translations: Update the Italian translation + + po/it.po | 32 ++++++++++++++++---------------- + 1 file changed, 16 insertions(+), 16 deletions(-) + +commit 8d6f06a65f50358fad13567f5dd8af41ef1d2b58 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-17 15:28:56 +0200 + + Translations: Update the Portuguese translation + + The language tag in the Translation Project is pt, not pt_PT, + thus I changed the "Language:" line to pt. + + po/pt.po | 1045 +++++++++++++++++++++++++++++++------------------------------- + 1 file changed, 526 insertions(+), 519 deletions(-) + +commit c3439b039f46fe547ad603e16dc3bd63c1ca9b0c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-14 13:02:21 +0200 + + Translations: Update the Italian translation + + po/it.po | 1020 +++++++++++++++++++++++++++++++------------------------------- + 1 file changed, 516 insertions(+), 504 deletions(-) + +commit 79b4ab8d79528dd633a84df2d29e63f5d13ccbdf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-12 20:48:39 +0200 + + Translations: Update the Italian man page translations + + Only trivial additions but this keeps the file in sync with the TP. 
+ + po4a/it.po | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 515b6fc8557825e1335012b3b1c8cf71e2c38775 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-12 19:38:54 +0200 + + Translations: Update the Italian man page translations + + po4a/it.po | 129 ++++++++++++++++++++++++++++++++++++------------------------- + 1 file changed, 77 insertions(+), 52 deletions(-) + +commit 333b7c0b776295f0941269b4e6cdb1a0ba5f6218 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-10 21:00:31 +0200 + + Translations: Update the Korean man page translations + + po4a/ko.po | 139 +++++++++++++++++++++++++++++++++++-------------------------- + 1 file changed, 80 insertions(+), 59 deletions(-) + +commit ae52ebd27dc0be5e1ba62fb0c45255d8563fcd88 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-10 20:56:57 +0200 + + Translations: Update the German man page translations + + po4a/de.po | 102 ++++++++++++++++++++++++++++++++++++++----------------------- + 1 file changed, 63 insertions(+), 39 deletions(-) + +commit 1028e52c93d2292b44ff7bae8e721025d2f2c94d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-10 13:13:30 +0200 + + CMake: Fix tuklib_use_system_extensions + + Revert back to a macro so that list(APPEND CMAKE_REQUIRED_DEFINITIONS) + will affect the calling scope. I had forgotten that while CMake + functions inherit the variables from the parent scope, the changes + to them are local unless using set(... PARENT_SCOPE). + + This also means that the commit message in 5bb77d0920dc is wrong. The + commit itself is still fine, making it clearer that -DHAVE_SYS_PARAM_H + is only needed for specific check_c_source_compiles() calls. 
+ + Fixes: c1ea7bd0b60eed6ebcdf9a713ca69034f6f07179 + + cmake/tuklib_common.cmake | 7 +++++-- + 1 file changed, 5 insertions(+), 2 deletions(-) + +commit 80e48836024ec2d7cbd557575be6da3d1f055cba +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-10 11:38:55 +0200 + + INSTALL: Document -bmaxdata on AIX + + This is based on a pull request and AIX docs. I haven't tested the + instructions myself. + + Closes: https://github.com/tukaani-project/xz/pull/137 + + INSTALL | 5 +++++ + 1 file changed, 5 insertions(+) + +commit ab319186b6d0454285ff4941a777ac95e580f60f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-10 11:37:19 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 4434671a04436038f88ab0feaa251cc8d7abb683 +Author: Collin Funk <collin.funk1@gmail.com> +Date: 2025-03-09 19:14:31 -0700 + + tuklib_physmem: Silence -Wsign-conversion on AIX + + Closes: https://github.com/tukaani-project/xz/pull/168 + + src/common/tuklib_physmem.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 18bcaa4fafc935d89ffde94301fa6427907306bf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-09 22:10:38 +0200 + + Translations: Update the Romanian man page translations + + po4a/ro.po | 110 ++++++++++++++++++++++++++++++++++++------------------------- + 1 file changed, 66 insertions(+), 44 deletions(-) + +commit 1e17b7f42fe2f9df279f44ad7043d3753cd00363 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-09 21:28:15 +0200 + + Translations: Update the Croatian translation + + po/hr.po | 19 +++++++++++-------- + 1 file changed, 11 insertions(+), 8 deletions(-) + +commit ff85e6130d5940896915cdbb99aa9ece9d41240b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-09 21:23:34 +0200 + + Translations: Update the Romanian translation + + po/ro.po | 24 +++++++++++++----------- + 1 file changed, 13 insertions(+), 11 deletions(-) + +commit a5bfb33f30f77e656723d365db8b06e089d3de61 +Author: Lasse 
Collin <lasse.collin@tukaani.org> +Date: 2025-03-09 21:11:34 +0200 + + Translations: Update the Ukrainian man page translations + + po4a/uk.po | 107 ++++++++++++++++++++++++++++++++++++------------------------- + 1 file changed, 64 insertions(+), 43 deletions(-) + +commit 5bb77d0920dcf949d8eb04eb19204b7b199e42df +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-09 14:43:07 +0200 + + CMake: Use cmake_push_check_state in tuklib_cpucores and tuklib_physmem + + Now the changes to CMAKE_REQUIRED_DEFINITIONS are temporary and don't + leak to the calling code. + + cmake/tuklib_cpucores.cmake | 3 +++ + cmake/tuklib_physmem.cmake | 4 +++- + 2 files changed, 6 insertions(+), 1 deletion(-) + +commit c1ea7bd0b60eed6ebcdf9a713ca69034f6f07179 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-09 14:06:35 +0200 + + CMake: Revise tuklib_use_system_extensions + + Define NetBSD and Darwin/macOS feature test macros. Autoconf defines + these too (and a few others). + + Define the macros on Windows except with MSVC. The _GNU_SOURCE macro + makes a difference with mingw-w64. + + Use a function instead of a macro. Don't take the TARGET_OR_ALL argument + because there's always global effect because the global variable + CMAKE_REQUIRED_DEFINITIONS is modified. 
+ + CMakeLists.txt | 2 +- + cmake/tuklib_common.cmake | 27 +++++++++++++++------------ + 2 files changed, 16 insertions(+), 13 deletions(-) + +commit 4243c45a48ef8c103d77b75d9f93d48adcb631db +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-08 14:54:29 +0200 + + doc/SHA256SUMS: Add 5.7.2beta + + doc/SHA256SUMS | 3 +++ + 1 file changed, 3 insertions(+) + +commit cc7f2fc1cf9f3c63cbce90ee92bfbb004f98140b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-08 14:29:57 +0200 + + Bump version and soname for 5.7.2beta + + src/liblzma/Makefile.am | 2 +- + src/liblzma/api/lzma/version.h | 4 ++-- + src/liblzma/liblzma_generic.map | 2 +- + src/liblzma/liblzma_linux.map | 2 +- + 4 files changed, 5 insertions(+), 5 deletions(-) + +commit 62e44b36167de27541776dcf677ed04077c9fd19 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-08 14:24:38 +0200 + + Add NEWS for 5.7.2beta + + NEWS | 35 +++++++++++++++++++++++++++++++++++ + 1 file changed, 35 insertions(+) + +commit 70f1f203789433b5d7b8b22e1655abc465d659f7 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-08 14:23:00 +0200 + + COPYING: Remove the note about old releases + + COPYING | 19 ------------------- + 1 file changed, 19 deletions(-) + +commit db9827dc38ff79de747a6fc7a99619e961dbc5e6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-08 14:22:28 +0200 + + xz: Update the man page about the environment variables again + + src/xz/xz.1 | 22 +++++++++++----------- + 1 file changed, 11 insertions(+), 11 deletions(-) + +commit 99c584891bd1d946561cebded2226df9b83f1efb +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-06 19:26:09 +0200 + + liblzma: Edit spelling in a comment + + It was found with codespell. 
+ + src/liblzma/api/lzma/container.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 7a234c8c05a8f64efde013cd6a6d31a90b7d0d28 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-06 19:14:23 +0200 + + xz: Update the man page about the environment variables + + src/xz/xz.1 | 26 ++++++++++++++++++++++++-- + 1 file changed, 24 insertions(+), 2 deletions(-) + +commit 808f05af3ef40730d40b3798666757bd866484f1 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-06 17:37:39 +0200 + + Docs: Add a few TRANSLATORS comments to man pages + + All translators know that --command-line-options must not be translated. + With some other strings it's not obvious when the untranslated string + must be preserved. These comments hopefully help. + + src/scripts/xzmore.1 | 2 ++ + src/xz/xz.1 | 22 ++++++++++++++++++++++ + 2 files changed, 24 insertions(+) + +commit 051de255f00dda331e2a6fa189a6e7fe56a7c69b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-06 16:34:32 +0200 + + Scripts: Mark the LZMA Utils script aliases as deprecated + + The deprecated aliases are lzcmp, lzdiff, lzless, lzmore, + lzgrep, lzegrep, and lzfgrep. The commands that start with + the xz prefix have identical behavior, for example, both + lzgrep and xzgrep handle all supported file formats. + + This doesn't affect lzma, unlzma, lzcat, lzmadec, or lzmainfo. + The last release of LZMA Utils was made in 2008, but the lzma + compatibility alias for the gzip-like tool is still in common use. + Deprecating it would cause unnecessary breakage. 
+ + src/scripts/xzdiff.1 | 5 ++++- + src/scripts/xzgrep.1 | 6 +++++- + src/scripts/xzless.1 | 4 +++- + src/scripts/xzmore.1 | 4 +++- + 4 files changed, 15 insertions(+), 4 deletions(-) + +commit 4941ea454c02cf15a64d6434a0778fc2a81282fc +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-02 21:13:04 +0200 + + Translations: Add Serbian man page translations + + po4a/po4a.conf | 2 +- + po4a/sr.po | 3892 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 3893 insertions(+), 1 deletion(-) + +commit d142d96f24daa451edaabfca8594e202932b3c0b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-03-02 20:42:14 +0200 + + Translations: Update Georgian translation + + po/ka.po | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 9b7e45d841195c8fd8d286e26f810df28c53dd16 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-28 21:07:21 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 9351592710e0df3238b09d39c545a643c50ac88f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-22 16:04:58 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 9023be7831faca2f28def55e16c39e3a42e1e262 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-19 16:33:52 +0200 + + Translations: Update the Croatian translation + + po/hr.po | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 2eaf242c56e8c65db83d48b018fa44aeafeb33a5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-17 21:46:15 +0200 + + Build: Fix out-of-tree builds when using the replacement getopt_long + + Nowaways $(top_builddir)/lib/getopt.h depends on headers in + $(top_srcdir)/lib, so both have to be in the include path. + CMake-based build already did this. 
+ + Fixes: 7e884c00d0093c38339f17fb1d280eec493f42ca + + src/lzmainfo/Makefile.am | 6 ++++-- + src/xz/Makefile.am | 6 ++++-- + src/xzdec/Makefile.am | 6 ++++-- + 3 files changed, 12 insertions(+), 6 deletions(-) + +commit 41322b2c60cd2c67a1053cb40d27e573420185b7 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-17 18:25:52 +0200 + + m4/getopt.m4: Remove an outdated comment + + m4/getopt.m4 | 3 --- + 1 file changed, 3 deletions(-) + +commit 03c23a4952bce1b50a1d213ca2d1c15acd76a489 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-17 18:11:58 +0200 + + Build: Allow forcing the use of the replacement getopt_long + + Now one can pass gl_replace_getopt=yes to configure to force the use + of GNU getopt_long from the lib directory. This only checks that the + value of gl_replace_getopt is non-empty, so one cannot force the + replacement to be disabled. + + Closes: https://github.com/tukaani-project/xz/pull/166 + + m4/getopt.m4 | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +commit c23b837d15960ecc0d537f0260f389904e1e7f02 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-17 18:11:42 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 2672a38f1159babf9ba3cca429f644bb823a8bdd +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-12 19:23:31 +0200 + + Update THANKS + + THANKS | 2 ++ + 1 file changed, 2 insertions(+) + +commit 4fdcbfaf3f222299747c6a815762a74eeb1b0b23 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-11 12:13:41 +0200 + + Update THANKS + + THANKS | 3 +++ + 1 file changed, 3 insertions(+) + +commit 0d553568f1af9a35779ecac41392a6c871786930 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-08 11:39:08 +0200 + + Translations: Update the Polish translation + + po/pl.po | 802 ++++++++++++++++++++++++++++++++++++--------------------------- + 1 file changed, 464 insertions(+), 338 deletions(-) + +commit 9f165076aebb3b5115d2b6520529db8fa11a6bdd 
+Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-07 19:12:03 +0200 + + Docs: Update TODO a little + + TODO | 22 ++++------------------ + 1 file changed, 4 insertions(+), 18 deletions(-) + +commit f5aa292c534f87b9dd588e667d1c65ed31e5f289 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-07 18:50:56 +0200 + + Add researcher credits of CVE-2022-1271 and CVE-2024-47611 to THANKS + + These are specific phrases that were included in the advisories and + NEWS. It's nice to have them in THANKS as well. + + THANKS | 4 ++++ + 1 file changed, 4 insertions(+) + +commit 7cf463b5add70e3fb48a10de3965c8beb6c01ad9 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-07 18:43:00 +0200 + + Update THANKS + + THANKS | 5 +++++ + 1 file changed, 5 insertions(+) + +commit 6b7fe7e27b77038592e2c2e31df955059dda7d1d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-04 14:12:46 +0200 + + Docs: Update the "Translations" section in README + + Make it clearer that translations cannot be accepted if they don't + come via the Translation Project. + + Column headings have been handled automatically for years and now --help + is autowrapped too, so the related instructions can be removed. + + README | 107 ++++++++++++++++++++++++----------------------------------------- + 1 file changed, 39 insertions(+), 68 deletions(-) + +commit 2c7aee94936babf84b61b55420e503a0b2629ec1 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-04 13:23:53 +0200 + + debug/translations.bash: Revise a little + + Make it work for out-of-tree builds without requiring one to specify + the location of the xz executable. + + Add xz --filters-help. + + Make the output shorter by reducing the number of xz -lvv test files. + + Show the value of LANGUAGE environment variable. + + Show the xz.git version using git describe --abbrev=8 instead of =4. 
+ + debug/translation.bash | 24 +++++++++++------------- + 1 file changed, 11 insertions(+), 13 deletions(-) + +commit c6b15e7045209002bbbf4979c48072af01c20d8d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-04 13:20:52 +0200 + + Build: Use "git describe --abbrev=8" in snapshot tarball names + + 8 is more likely to be reproducible than the old 4 without being + excessively long for a small repository like this. + + Makefile.am | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 0ce97987c5b27cfb6f98984e5fd7477880e0cf33 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-04 19:37:17 +0200 + + Update THANKS + + THANKS | 2 ++ + 1 file changed, 2 insertions(+) + +commit 353c33355cb12e5016d49052fd1e90d15568aa37 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-03 16:29:31 +0200 + + Translations: Update the Serbian translation + + po/sr.po | 805 ++++++++++++++++++++++++++++++++++++--------------------------- + 1 file changed, 458 insertions(+), 347 deletions(-) + +commit 887dc281885052bced32b3aa309506ea58a2e78e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-03 16:15:38 +0200 + + Translations: Update Chinese (traditional) translation + + Since there are no spaces between words, the unsophisticated automatic + word wrapping code needs some help. Compared to the version in the + Translation Project, I added a few \t characters which the word + wrapping code interprets as zero width spaces (hopefully they are + placed correctly). 
These edits can be seen with this command: + + grep -v ^# po/zh_TW.po | grep --color -F '\t' + + po/zh_TW.po | 843 +++++++++++++++++++++++++++++++++--------------------------- + 1 file changed, 471 insertions(+), 372 deletions(-) + +commit 0f1454cf5f460a4095f47f8f73f5a290e9777d7f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-03 16:12:44 +0200 + + Update THANKS + + THANKS | 2 ++ + 1 file changed, 2 insertions(+) + +commit 23ea031820086d302a213be005a091df763b8a7b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-02 14:15:07 +0200 + + Build: Update posix-shell.m4 from Gnulib + + Tabs have been converted to spaces and a "serial" number has been + added. The previous version was from 2008/2009. There are no functional + changes since then but now it's clearer that the copy in XZ Utils + isn't outdated. + + The new file was picked from the Gnulib commit + 81a4c1e3b7692e95c0806d948cbab9148ad85ef2. A later commit adds + a warranty disclaimer to the license, which obviously is fine, + but I didn't find a SPDX license identifier for the new license, + so for simplicity I used the earlier commit. + + m4/posix-shell.m4 | 31 ++++++++++++++++--------------- + 1 file changed, 16 insertions(+), 15 deletions(-) + +commit 84c33c0384aa4604ff7956f2fae6f83ea60ba96b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-02 12:51:03 +0200 + + Build: Check for -fsanitize= also in $CC + + People may put -fsanitize in CC instead of CFLAGS so check both. + Landlock sandbox isn't compatible with sanitizers so it's nice + to catch the incompatible options at configure time. + + Don't attempt to do the same in CMakeLists.txt; the check for + CMAKE_C_FLAGS / CFLAGS shall be enough there. The extra flags from + the CC environment variable go into the undocumented internal variable + CMAKE_C_COMPILER_ARG1 (all flags from CC go into that same variable). + Peeking the internal variable merely for improved diagnostics isn't + worth it. 
+ + Fixes: 88588b1246d8c26ffbc138b3e5c413c5f14c3179 + + configure.ac | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +commit a7304ea4a7daede9789a8fe422b714e372737120 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2023-09-26 19:11:20 +0300 + + Build: Remove the FIXME about -Werror checks + + configure.ac | 7 ------- + 1 file changed, 7 deletions(-) + +commit 1780bba74075da5e7764615bd323e95e19057dee +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2023-09-26 19:10:51 +0300 + + Build: If using a GCC compatible compiler, ensure that -Werror works + + The check can be skipped by passing SKIP_WERROR_CHECK=yes to configure. + It won't be documented anywhere else than in the error message. + + Ways to test: + + ./configure CC=gcc CFLAGS=-Wunused-macros + ./configure CC=clang CFLAGS=-Weverything + ./configure CC=clang CFLAGS=-Weverything SKIP_WERROR_CHECK=yes + + configure.ac | 26 ++++++++++++++++++++++++++ + 1 file changed, 26 insertions(+) + +commit 3aca2daefbdedd7cc0fb75ddde6b714273b1cc1d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-02 14:30:15 +0200 + + Update THANKS + + THANKS | 4 ++++ + 1 file changed, 4 insertions(+) + +commit 186ff78ab40ceb07cde139506cab42a927ca99d2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-02-01 12:49:09 +0200 + + Translations: Update Romanian translation + + po/ro.po | 12 ++++++------ + 1 file changed, 6 insertions(+), 6 deletions(-) + +commit 40a8ce3e10747ca5233610cc2cb704fc303c48e4 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-30 18:16:43 +0200 + + Translations: Update Korean man page translations + + po4a/ko.po | 146 ++++++++++++++++++++++++------------------------------------- + 1 file changed, 56 insertions(+), 90 deletions(-) + +commit 1787f9bd18ea8798d64b636cdefe6d0fda9b8f72 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-30 18:15:52 +0200 + + Translations: Add Italian man page translations + + po4a/it.po | 3876 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + po4a/po4a.conf | 2 +- + 2 files changed, 3877 insertions(+), 1 deletion(-) + +commit 9b9182e561787a811fc0178489589f28c3e0174c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 22:18:29 +0200 + + Translations: Update the Finnish translation + + po/fi.po | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +commit 7d73ff7a9d8eab6270f0b1ff7d10c0aa6f5ba53f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 20:50:03 +0200 + + lzmainfo: Use tuklib_mbstr_wrap for --help text + + Some languages have so long strings that they need to be wrapped. + + CMakeLists.txt | 4 ++++ + src/lzmainfo/Makefile.am | 2 ++ + src/lzmainfo/lzmainfo.c | 36 ++++++++++++++++++++++++++---------- + 3 files changed, 32 insertions(+), 10 deletions(-) + +commit c56eb4707627d700695813fccdddd1483eac4f21 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 20:00:06 +0200 + + Translations: Update the Croatian translation + + po/hr.po | 926 ++++++++++++++++++++++++++++++++++++--------------------------- + 1 file changed, 529 insertions(+), 397 deletions(-) + +commit 69f4aec0a2442ab81f9ab66e5871a6546aefb0fc +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:56:01 +0200 + + Translations: Update the Finnish translation + + po/fi.po | 911 +++++++++++++++++++++++++++++++++------------------------------ + 1 file changed, 483 insertions(+), 428 deletions(-) + +commit d49dde33cf5f488bb38b1f57e172c4e3343fb383 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:55:27 +0200 + + Translations: Update the German man page translations + + po4a/de.po | 147 +++++++++++++++++++++++-------------------------------------- + 1 file changed, 55 insertions(+), 92 deletions(-) + +commit 23b99fc4a1f35bec5d63ffd02b14cacbdce9fe3c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:55:17 +0200 + + Translations: Update the German translation + + po/de.po | 
825 +++++++++++++++++++++++++++++++++++---------------------------- + 1 file changed, 460 insertions(+), 365 deletions(-) + +commit 7edab2bde0606b42229d9c04fe664069e38de3fb +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:55:05 +0200 + + Translations: Update the Turkish translation + + po/tr.po | 892 +++++++++++++++++++++++++++++++++++---------------------------- + 1 file changed, 490 insertions(+), 402 deletions(-) + +commit fac4d0fa5277d7a1f621707621ee9516f0bdbac5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:54:36 +0200 + + Translations: Add the Dutch translation + + po/LINGUAS | 1 + + po/nl.po | 1268 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 1269 insertions(+) + +commit abe5092f24b55dde9f7f78fac1bf810bce173273 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:53:50 +0200 + + Translations: Update the Georgian translation + + po/ka.po | 153 +++++++++++++++++++++++++++++++++++++++++++++++---------------- + 1 file changed, 115 insertions(+), 38 deletions(-) + +commit b97b23c78d8100eec363c3e999c511560366d347 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:53:21 +0200 + + Translations: Update the Spanish translation + + po/es.po | 824 ++++++++++++++++++++++++++++++++++----------------------------- + 1 file changed, 450 insertions(+), 374 deletions(-) + +commit c68318cb49e0562bd22e88724ce85e76c6789a3a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:53:06 +0200 + + Translations: Update the Korean translation + + po/ko.po | 785 +++++++++++++++++++++++++++++++++++++-------------------------- + 1 file changed, 460 insertions(+), 325 deletions(-) + +commit 153ee17f635962a474499f786ea1de1e1a2bb276 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:52:42 +0200 + + Translations: Update the Romanian man page translations + + po4a/ro.po | 141 +++++++++++++++++++++++-------------------------------------- + 1 file 
changed, 54 insertions(+), 87 deletions(-) + +commit 6ed308197e1f9d6c7a5cfe5aae301e75544017c4 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:51:59 +0200 + + Translations: Update the Romanian translation + + po/ro.po | 818 +++++++++++++++++++++++++++++++++++---------------------------- + 1 file changed, 461 insertions(+), 357 deletions(-) + +commit 06028803e19219f642aa9abddd3525c43594ec6c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:50:50 +0200 + + Translations: Update the Ukrainian man page translations + + po4a/uk.po | 142 +++++++++++++++++++++++-------------------------------------- + 1 file changed, 54 insertions(+), 88 deletions(-) + +commit 8cbaf896a65a53c1d1e7e2ffc80d6ea216b1e8df +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:50:26 +0200 + + Translations: Update the Ukrainian translation + + po/uk.po | 813 ++++++++++++++++++++++++++++++++++++--------------------------- + 1 file changed, 460 insertions(+), 353 deletions(-) + +commit 81c352907b8048b97d9868947026701a49f377ef +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-29 19:48:43 +0200 + + Translations: Update the Swedish translation + + po/sv.po | 847 ++++++++++++++++++++++++++++++++++----------------------------- + 1 file changed, 462 insertions(+), 385 deletions(-) + +commit 999ce263718a52ba74245c3e2a416ab11494d1b1 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-28 16:33:32 +0200 + + tuklib_physmem: Clean up disabled code + + src/common/tuklib_physmem.c | 9 +-------- + 1 file changed, 1 insertion(+), 8 deletions(-) + +commit 4d7e7c9d94f7a5ad4931a5bbd6ed9d00173fa1ab +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-28 16:28:18 +0200 + + Windows: Avoid an error message on broken pipe + + Also make xz not process more input files after a broken pipe has + been detected. This matches the behavior on POSIX. 
If all files + are being written to standard output, trying with the next file is + pointless when it's known that standard output won't accept more data. + + xzdec already stopped after the first error. It does so with all + errors, so it differs from xz: + + $ xz -dc not_found_1 not_found_2 + xz: not_found_1: No such file or directory + xz: not_found_2: No such file or directory + + $ xzdec not_found_1 not_found_2 + xzdec: not_found_1: No such file or directory + + Reported-by: Vincent Torri + + src/xz/file_io.c | 13 +++++++++++++ + src/xzdec/xzdec.c | 11 ++++++++++- + 2 files changed, 23 insertions(+), 1 deletion(-) + +commit 95b638480aa8203e547c709c651f421c22db1718 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-23 19:59:17 +0200 + + doc/SHA256SUMS: Add 5.6.4 and 5.7.1alpha + + doc/SHA256SUMS | 9 +++++++++ + 1 file changed, 9 insertions(+) + +commit cdae0df31e4c2dfb1e885941cd1998e5a2b6e39d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-23 11:50:42 +0200 + + Bump version and soname for 5.7.1alpha + + src/liblzma/Makefile.am | 2 +- + src/liblzma/api/lzma/version.h | 2 +- + src/liblzma/liblzma_generic.map | 2 +- + src/liblzma/liblzma_linux.map | 2 +- + 4 files changed, 4 insertions(+), 4 deletions(-) + +commit 4d2af2c43bae25ef4ef9cd88304471d4859aa322 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-23 11:48:43 +0200 + + Translations: Run po4a/update-po + + po4a/de.po | 64 +++++++++++++++++++++++++++++++++++++++++++++++++---------- + po4a/fr.po | 57 +++++++++++++++++++++++++++++++++++++++++++++++----- + po4a/ko.po | 64 +++++++++++++++++++++++++++++++++++++++++++++++++---------- + po4a/pt_BR.po | 57 +++++++++++++++++++++++++++++++++++++++++++++++----- + po4a/ro.po | 64 +++++++++++++++++++++++++++++++++++++++++++++++++---------- + po4a/uk.po | 64 +++++++++++++++++++++++++++++++++++++++++++++++++---------- + 6 files changed, 320 insertions(+), 50 deletions(-) + +commit ff0b825505e60e21b32e33c42f551c8f34ba393f +Author: 
Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-23 11:40:46 +0200 + + Add NEWS for 5.7.1alpha + + NEWS | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 107 insertions(+) + +commit f6cd3e3bfc8d1f5a76dd55170968bf4582b95baf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-23 11:40:46 +0200 + + Add NEWS for 5.6.4 + + NEWS | 45 +++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 45 insertions(+) + +commit b3af3297e4d6cf0eafb48155aa97bb06c82a9228 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-23 11:40:46 +0200 + + NEWS: The security fix in 5.6.3 is known as CVE-2024-47611 + + NEWS | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +commit a04b9dd0c7c74fabd8c393d2dc68a221276d6e29 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-22 16:55:09 +0200 + + windows/build.bash: Fix error message + + Fixes: 1ee716f74085223c8fbcae1d5a384e6bf53c0f6a + + windows/build.bash | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 4eae859ae8ad7072eaa74aeaee79a2c3c12c55cb +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-22 15:03:55 +0200 + + Windows: Disable MinGW-w64's stdio functions in size-optimized builds + + This only affects builds with UCRT. With legacy MSVCRT, the replacement + functions are always enabled. + + Omitting the MinGW-w64 replacements saves over 20 KiB per executable. + The downside is that --enable-small or XZ_SMALL=ON disables thousand + separator support in xz messages. If someone is OK with the slower + speed of slightly smaller builds, lack of thousand separators won't + matter. + + Don't override __USE_MINGW_ANSI_STDIO if it is already defined (via + CPPFLAGS or such method). 
+ + src/common/sysdefs.h | 30 +++++++++++++++++++++--------- + src/xz/util.c | 6 +++++- + 2 files changed, 26 insertions(+), 10 deletions(-) + +commit a831bc185bdd44c06847eae8df2d35cc281f65da +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-20 16:44:27 +0200 + + liblzma: Add raw ARM64, RISC-V, and x86 BCJ filter APIs + + Put them behind the LZMA_UNSTABLE macro for now. + + These low-level special APIs might become useful in erofs-utils. + + src/liblzma/api/lzma/bcj.h | 99 +++++++++++++++++++++++++++++++++++++++++ + src/liblzma/common/common.h | 2 + + src/liblzma/liblzma_generic.map | 10 +++++ + src/liblzma/liblzma_linux.map | 10 +++++ + src/liblzma/simple/arm64.c | 18 ++++++++ + src/liblzma/simple/riscv.c | 18 ++++++++ + src/liblzma/simple/x86.c | 24 ++++++++++ + 7 files changed, 181 insertions(+) + +commit 6f5cdd4534faf7db4b6c123651d6a606bc59b98c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-20 16:31:49 +0200 + + xz: Unify a few strings with liblzma + + Avoid having both "%s: foo" and "foo" as translatable strings + so that translators don't need to handle it twice. + + src/xz/options.c | 11 ++++++----- + src/xz/util.c | 4 ++-- + 2 files changed, 8 insertions(+), 7 deletions(-) + +commit 713fdaa8b06a83f18b06811aba7b9bd7b7cbf1cb +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-20 16:31:49 +0200 + + xz: Translate error messages from lzma_str_to_filters() + + liblzma doesn't use gettext but the messages are included in xz.pot, + so xz can translate the messages. 
+ + src/xz/coder.c | 9 +++------ + 1 file changed, 3 insertions(+), 6 deletions(-) + +commit f2e2b267cab8d7aa0b0a58c325546ee5070c0028 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-20 16:31:49 +0200 + + liblzma: Mark string conversion messages as translatable + + po/POTFILES.in | 1 + + src/liblzma/common/string_conversion.c | 96 ++++++++++++++++++++-------------- + 2 files changed, 59 insertions(+), 38 deletions(-) + +commit f49d7413d9a0d480ded6d448c1ef7475ae6cd1c9 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-20 16:31:35 +0200 + + liblzma: Tweak a few error messages in lzma_str_to_filters() + + src/liblzma/common/string_conversion.c | 9 +++++---- + 1 file changed, 5 insertions(+), 4 deletions(-) + +commit da359c360e986b21cd8d7b888c6a80f56b9d49c7 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-19 20:11:54 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit f032373561cefaf07f92ffe3fbc471ec6770456e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-19 19:40:32 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 51f038f8cbd5d8a95954c05bfcbbc32f2a313615 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-13 08:44:58 +0200 + + liblzma: memcmplen.h: Use 8-byte method on 64-bit unaligned archs + + Previously it was enabled only on x86-64 and ARM64 when support + for unaligned access was also detected or manually enabled at build time. + + In the default build configuration, the 8-byte method is now enabled + also on 64-bit RISC-V and 64-bit PowerPC (both endiannesses). It was + reported that on big endian POWER9, encoding time may decrease by 12-13 %. + + This change only affects builds with GCC and Clang because the code + uses __builtin_ctzll or __builtin_clzll. + + Thanks to Marcus Comstedt for testing on POWER9.
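The 8-byte comparison mentioned in the memcmplen.h commit above can be sketched roughly as follows. This is not the code from memcmplen.h: the function name match_len is hypothetical, memcpy() stands in for whatever unaligned-read helpers liblzma uses, and the endianness test relies on the __BYTE_ORDER__ macro that GCC and Clang predefine. It only illustrates why __builtin_ctzll is used on little endian and __builtin_clzll on big endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the code in memcmplen.h): return how many
 * leading bytes of a and b are equal, looking at no more than limit bytes.
 */
size_t
match_len(const uint8_t *a, const uint8_t *b, size_t limit)
{
	size_t len = 0;

#if defined(__GNUC__) || defined(__clang__)
	/* Compare eight bytes per iteration while at least eight remain. */
	while (limit - len >= 8) {
		uint64_t x;
		uint64_t y;

		/* memcpy keeps the unaligned reads well-defined in C. */
		memcpy(&x, a + len, 8);
		memcpy(&y, b + len, 8);

		const uint64_t diff = x ^ y;
		if (diff != 0) {
#	if defined(__BYTE_ORDER__) \
		&& __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
			/* The first byte in memory is the most significant. */
			return len + ((size_t)__builtin_clzll(diff) >> 3);
#	else
			/* Assumed little endian: first byte is least significant. */
			return len + ((size_t)__builtin_ctzll(diff) >> 3);
#	endif
		}

		len += 8;
	}
#endif

	/* Remaining tail bytes, and the fallback for other compilers. */
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}
```

Dividing the bit index of the first differing bit by eight gives the index of the first differing byte, which is why a missing endianness check (as in the big endian ARM64 breakage fixed in 150356207c8d) yields wrong match lengths.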
+ + src/liblzma/common/memcmplen.h | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +commit 96336b0110d47756a9fd2a103fbf0a99e905fbed +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-12 13:06:17 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 150356207c8d6a3e0af465b676430d19d62f884c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-12 12:59:20 +0200 + + liblzma: Fix the encoder breakage on big endian ARM64 + + When the 8-byte method was enabled for ARM64, a check for endianness + wasn't added. This broke the LZMA/LZMA2 encoder. Test suite caught it. + + Fixes: cd64dd70d5665b6048829c45772d08606f44672e + Co-authored-by: Marcus Comstedt <marcus@mc.pp.se> + + src/liblzma/common/memcmplen.h | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +commit b01b0958025a2da284b53a583f313f8140636cb5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-12 11:04:27 +0200 + + Windows: Update manifest comments about long UTF-8 filenames + + src/common/w32_application.manifest.comments.txt | 23 +++++++++++++++-------- + 1 file changed, 15 insertions(+), 8 deletions(-) + +commit 0dfc67d37ebb038be8a9b17b536d1b561d52e81a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-12 10:47:58 +0200 + + Windows: Update build.bash and its README-Windows.txt to UCRT + + While MSVCRT builds are possible, UCRT works better with UTF-8. + A 32-bit build is included still but hopefully it's not actually + needed anymore. + + windows/README-Windows.txt | 17 ++++++++--------- + windows/build.bash | 20 ++++++++++++++------ + 2 files changed, 22 insertions(+), 15 deletions(-) + +commit 7b3eb2db6c4ba24b5eb438e58ab1ca57e14e59c2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-10 13:11:40 +0200 + + Translations: Update Serbian translation + + I rewrapped a few overlong lines. Those edits aren't in the + Translation Project. 
Automatic wrapping in the master branch + means that these strings need to be updated soon anyway. + + po/sr.po | 346 ++++++++++++++++++++++----------------------------------------- + 1 file changed, 121 insertions(+), 225 deletions(-) + +commit 950da11ce09c90412dcbca29689575037640667a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-08 19:26:29 +0200 + + Build: Use --sort=name in TAR_OPTIONS + + Use also LC_COLLATE=C to make the sorting locale-independent. + Sorting makes the file order reproducible. + + Makefile.am | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +commit 75d91d6b39ea3e2fae8f027dcec01be2dca9594d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-08 19:08:08 +0200 + + xz: Workaround broken O_SEARCH in musl + + Testing with musl 1.2.5 and Linux 6.12, O_SEARCH doesn't result + in a file descriptor that works with fsync() although it should work. + See the added comment. + + The same issue affected gzip --synchronous: + + https://bugs.gnu.org/75405 + + Thanks to Paul Eggert. + + src/xz/file_io.c | 11 +++++++++++ + 1 file changed, 11 insertions(+) + +commit ea92eae122a3ccefa61087f84fd99b417fc9ee3c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-07 21:34:33 +0200 + + Revert "xz: O_SEARCH cannot be used for fsync()" + + This reverts commit 4014e2479c7b0273f15bd0c9c017c5fe859b0d8f. + + POSIX-conforming O_SEARCH should allow fsync(). + + src/xz/file_io.c | 21 +++++++++++---------- + 1 file changed, 11 insertions(+), 10 deletions(-) + +commit 4014e2479c7b0273f15bd0c9c017c5fe859b0d8f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-05 21:43:11 +0200 + + xz: O_SEARCH cannot be used for fsync() + + Opening a directory with O_SEARCH results in a file descriptor that can + be used with functions like openat(). Such a file descriptor cannot be + used with fsync(). Use O_RDONLY instead. + + In musl, O_SEARCH becomes Linux-specific O_PATH. A file descriptor + from O_PATH doesn't allow fsync(). 
+ + Seems that it's not possible to fsync() a directory that has write + and search permissions but not read permission. + + Fixes: 2a9e91d796d091740489d951fa7780525e4275f1 + + src/xz/file_io.c | 21 ++++++++++----------- + 1 file changed, 10 insertions(+), 11 deletions(-) + +commit ad2b57cb477b753293c25a01fc24c7f84ee523c2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-05 20:48:28 +0200 + + CI: Make ctest show errors from failed tests + + build-aux/ci_build.bash | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit c405264c031aceaf68dfd1546d6337afcebd48e5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-05 20:14:49 +0200 + + tuklib_mbstr_nonprint: Preserve the value of errno + + A typical use case is like this: + + printf("%s: %s\n", tuklib_mask_nonprint(filename), strerror(errno)); + + tuklib_mask_nonprint() may call mbrtowc() and malloc() which may modify + errno. If errno isn't preserved, the error message might be wrong if + a compiler decides to call tuklib_mask_nonprint() before strerror(). + + Fixes: 40e573305535960574404d2eae848b248c95ea7e + + src/common/tuklib_mbstr_nonprint.c | 17 ++++++++++++++--- + src/common/tuklib_mbstr_nonprint.h | 4 +++- + 2 files changed, 17 insertions(+), 4 deletions(-) + +commit 2a9e91d796d091740489d951fa7780525e4275f1 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-05 20:14:49 +0200 + + xz: Use fsync() before deleting the input file, and add --no-sync + + xz's default behavior is to delete the input file after successful + compression or decompression (unless writing to standard output). + If the system crashes soon after the deletion, it is possible that + the newly written file has not yet hit the disk while the previous + delete operation might have. In that case neither the original file + nor the written file is available. + + Call fsync() on the file. On POSIX systems, sync also the directory + where the file was created. 
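The "sync the file, then its directory" sequence described above can be sketched for POSIX systems as shown below. It is a minimal illustration, not the code added to src/xz/file_io.c: the function name sync_file_and_dir is hypothetical, and O_RDONLY is used unconditionally for the directory instead of O_SEARCH.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not xz's implementation): flush an already
 * written file and then the directory that contains it.
 * Returns 0 on success and -1 on error with errno set.
 */
int
sync_file_and_dir(int fd, const char *path)
{
	/* Make sure the file contents have reached the disk first. */
	if (fsync(fd) != 0)
		return -1;

	/* dirname() may modify its argument, so work on a copy. */
	char *copy = strdup(path);
	if (copy == NULL)
		return -1;

	/* O_RDONLY requires read permission on the directory. */
	const int dir_fd = open(dirname(copy), O_RDONLY);
	free(copy);
	if (dir_fd == -1)
		return -1;

	/* Syncing the directory makes the new directory entry durable. */
	const int ret = fsync(dir_fd);

	/* Preserve errno from fsync() across close(). */
	const int saved_errno = errno;
	(void)close(dir_fd);
	errno = saved_errno;

	return ret == 0 ? 0 : -1;
}
```

Only after both calls succeed is it safe to unlink the input file. Skipping this function entirely corresponds to what --no-sync does.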
+ + Add a new option --no-sync which disables fsync() usage. It can avoid + a (possibly significant) performance penalty when processing many + small files. It's fine to use --no-sync when one knows that the files + are easy to recreate or restore after a system crash. + + Using fsync() after every flush initiated by --flush-timeout was + considered. It wasn't implemented at least for now. + + - --flush-timeout is typically used when writing to stdout. If stdout + is a file, xz cannot (portably) sync the directory of the file. + One would need to create the output file first, sync the directory, + and then run xz with fsync() enabled. + + - If xz --flush-timeout output goes to a file, it's possible to use + a separate script to sync the file, for example, once per minute + while telling xz to flush more frequently. + + - Not supporting syncing with --flush-timeout was simpler. + + Portability notes: + + - On systems that lack O_SEARCH (like Linux), "xz dir/file" will now + fail if "dir" cannot be opened for reading. If "dir" still has + write and search permissions (like d-wx------ in "ls -l"), + previously xz would have been able to compress "dir/file" still. + Now it only works if using --no-sync (or --keep or --stdout). + + - <libgen.h> and dirname() should be available on all POSIX systems, + and aren't needed on non-POSIX systems. + + - fsync() is available on all POSIX systems. The directory syncing + could be changed to fdatasync() although at least on ext4 it + doesn't seem to make a performance difference in xz's usage. + fdatasync() would need a build system check to support (old) + special cases, for example, MINIX 3.3.0 doesn't have fdatasync() + and Solaris 10 needs -lrt. + + - On native Windows, _commit() is used to replace fsync(). Directory + syncing isn't done and shouldn't be needed. (In Cygwin, fsync() on + directories is a no-op.) + + - DJGPP has fsync() for files. 
;-) + + Using fsync() was considered somewhere around 2009 and again in 2016 but + those times the idea was rejected. For comparison, GNU gzip 1.7 (2016) + added the option --synchronous which enables fsync(). + + Co-authored-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> + Fixes: https://bugs.debian.org/814089 + Link: https://www.mail-archive.com/xz-devel@tukaani.org/msg00282.html + Closes: https://github.com/tukaani-project/xz/pull/151 + + src/xz/args.c | 14 ++++++ + src/xz/args.h | 2 +- + src/xz/file_io.c | 129 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- + src/xz/file_io.h | 6 +++ + src/xz/message.c | 3 ++ + src/xz/sandbox.c | 5 ++- + src/xz/xz.1 | 24 ++++++++++- + 7 files changed, 177 insertions(+), 6 deletions(-) + +commit 2e28c7145747b3287283f13c9d2becd73a7c4a1f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-27 09:15:50 +0200 + + xz: Use "goto" for error handling in io_open_dest_real() + + src/xz/file_io.c | 20 +++++++++----------- + 1 file changed, 9 insertions(+), 11 deletions(-) + +commit 75107217670a97b7b772833669d88c3c2f188e37 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-05 12:10:05 +0200 + + liblzma: Always validate the first digit of a preset string + + lzma_str_to_filters() may call parse_lzma12_preset() in two ways. The + call from str_to_filters() detects the string type from the first + character(s) and as a side-effect it validates the first digit of + the preset string. So this change makes no difference there. + + However, the call from parse_options() doesn't pre-validate the string. + parse_lzma12_preset() will return an invalid value which is passed to + lzma_lzma_preset() which safely rejects it. 
The bug still affects the + error message: + + $ xz --filters=lzma2:preset=X + xz: Error in --filters=FILTERS option: + xz: lzma2:preset=X + xz: ^ + xz: Unsupported preset + + After the fix: + + $ xz --filters=lzma2:preset=X + xz: Error in --filters=FILTERS option: + xz: lzma2:preset=X + xz: ^ + xz: Unsupported preset + + The ^ now correctly points to the X and not past it because the X itself + is the problematic character. + + Fixes: cedeeca2ea6ada5b0411b2ae10d7a859e837f203 + + src/liblzma/common/string_conversion.c | 4 ++++ + 1 file changed, 4 insertions(+) + +commit 52ff32433734d03befd85a5bf00fba77d6501455 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-05 11:40:34 +0200 + + xz: Fix getopt_long argument type in --filters* + + Forgetting the argument (or not using = to separate the option from + the argument) resulted in lzma_str_to_filters() being called with NULL + as input string argument. The function handles it fine but xz passes + the NULL to printf() too: + + $ xz --filters + xz: Error in --filters=FILTERS option: + xz: (null) + xz: ^ + xz: Unexpected NULL pointer argument(s) to lzma_str_to_filters() + + Now it's correct: + + $ xz --filters + xz: option '--filters' requires an argument + + The --filters-help option doesn't take any arguments. + + Fixes: 9ded880a0221f4d1256845fc4ab957ffd377c760 + Fixes: d6af7f347077b22403133239592e478931307759 + Fixes: a165d7df1964121eb9df715e6f836a31c865beef + + src/xz/args.c | 22 +++++++++++----------- + 1 file changed, 11 insertions(+), 11 deletions(-) + +commit 2655c81b5e92278b0fd51f6537c1116f8349b02a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-04 20:04:56 +0200 + + xzdec: Don't leave Landlock file descriptor open for no reason + + This fix is similar to 48ff3f06521ca326996ab9a04d1b342098960427.
+ + Fixes: d74fb5f060b76db709b50f5fd37490394e52f975 + + src/xzdec/xzdec.c | 2 ++ + 1 file changed, 2 insertions(+) + +commit 35df4c2bc0500e60ba9d0d163d37a6d110d6841e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-04 20:02:18 +0200 + + xz: Make --single-stream imply --keep + + Suggested by xx on #tukaani on 2024-04-12. + + src/xz/args.c | 3 +++ + src/xz/xz.1 | 9 ++++++++- + 2 files changed, 11 insertions(+), 1 deletion(-) + +commit 6f412814a8019700248229ce972530159a0d9872 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-04 19:57:07 +0200 + + Update AUTHORS + + The contributions have been rewritten. + + AUTHORS | 2 +- + src/liblzma/check/crc32_arm64.h | 1 - + src/liblzma/check/crc32_fast.c | 1 - + src/liblzma/check/crc_common.h | 1 - + 4 files changed, 1 insertion(+), 4 deletions(-) + +commit 5651d153031a7ee2581cdba9bff658031826cb50 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-04 15:02:16 +0200 + + xz: Avoid printf formats like %2$s + + It's a POSIX feature that isn't in standard C. It's not available on + Windows. Even MinGW-w64 with __USE_MINGW_ANSI_STDIO doesn't support + it even though it supports POSIX %'d for thousand separators. + + Gettext's <libintl.h> provides overrides for printf and other functions + which do support the %2$s formats. Translations use them. But xz should + work on Windows without <libintl.h> too. + + Fixes: 3e9177fd206d20d6d8acc7d203c25a9ae0549229 + + src/xz/message.c | 51 ++++++++++++++++++++++++++++++++------------------- + 1 file changed, 32 insertions(+), 19 deletions(-) + +commit 63b246c90e7677c617faab1d3f6fc5c643b5e7cf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-04 14:41:37 +0200 + + tuklib_mbstr_wrap: Add printf format attribute + + It's supported by GCC 3.x already. 
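A printf format attribute of the kind mentioned above lets GCC-compatible compilers check format strings passed to variadic wrapper functions. The sketch below is illustrative only: the macro name ATTR_PRINTF and the function format_into are hypothetical and are not the names used in tuklib_common.h or tuklib_mbstr_wrap.h.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macro: expands to nothing on compilers without support. */
#if defined(__GNUC__) && __GNUC__ >= 3
#	define ATTR_PRINTF(fmt_index, args_index) \
		__attribute__((__format__(__printf__, fmt_index, args_index)))
#else
#	define ATTR_PRINTF(fmt_index, args_index)
#endif

/*
 * Hypothetical wrapper: the format string is argument 3 and the
 * variadic arguments start at argument 4.
 */
ATTR_PRINTF(3, 4)
int
format_into(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	va_start(ap, fmt);
	const int ret = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return ret;
}
```

With the attribute in place, a call such as format_into(buf, sizeof(buf), "%s", 123) produces a -Wformat diagnostic at compile time instead of undefined behavior at run time.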
+ + src/common/tuklib_common.h | 7 +++++++ + src/common/tuklib_mbstr_wrap.h | 1 + + 2 files changed, 8 insertions(+) + +commit a7313c01d9b8db71ffb61dc1dd7c4ea928824b4b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-04 13:44:12 +0200 + + xz: Translate a Windows-specific string + + Originally I thought that native Windows builds wouldn't be translated + but nowadays at least MSYS2 ships such binaries. + + src/xz/file_io.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 00eb6073c088be9e7516dfc00a13ef520827b57c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-02 15:32:10 +0200 + + xz: Use my_landlock.h + + A slightly silly thing is that xz may now query the ABI version up to + three times. We could call my_landlock_ruleset_attr_forbid_all() only + once and cache the result but it didn't seem worth doing. + + CMakeLists.txt | 1 + + src/xz/sandbox.c | 72 ++++++++++---------------------------------------------- + 2 files changed, 13 insertions(+), 60 deletions(-) + +commit 0fc5a625d7cc4ad51fde9367de088b9ad3bd40f6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-02 15:32:10 +0200 + + xzdec: Use my_landlock.h + + CMakeLists.txt | 1 + + src/xzdec/xzdec.c | 34 ++++++---------------------------- + 2 files changed, 7 insertions(+), 28 deletions(-) + +commit 38cb8ec9fd70d25fca6b473de44cf61586238552 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-02 15:32:10 +0200 + + Add my_landlock.h with helper functions to use Linux Landlock + + This supports up to Landlock ABI version 6. The current code in + xz and xzdec only supports up to ABI version 4.
+ + src/Makefile.am | 1 + + src/common/my_landlock.h | 141 +++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 142 insertions(+) + +commit 672da29bb3a209a727ae46c0df948d7eea69f2e2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-01 18:46:50 +0200 + + liblzma: Silence warnings from "clang -Wimplicit-fallthrough" + + src/liblzma/lzma/lzma_decoder.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 1a8a1ad9a1e3179ce267baa551fb17b30624b4dd +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-01 15:34:51 +0200 + + Build: Use -Wimplicit-fallthrough=5 when supported + + Now that we have the FALLTHROUGH macro, use the strictest mode with + GCC so that comment-based fallthrough markings are no longer accepted. + + In GCC, -Wextra includes -Wimplicit-fallthrough=3 and + -Wimplicit-fallthrough is the same as -Wimplicit-fallthrough=3. + Thus, the strict mode requires specifying -Wimplicit-fallthrough=5. + + Clang has -Wimplicit-fallthrough which is *not* enabled by -Wextra. + Clang doesn't have a variant that takes an argument. Thus we need + to check for -Wimplicit-fallthrough. Do it before checking for + -Wimplicit-fallthrough=5 so that the latter overrides the former + when using GCC. 
+ + CMakeLists.txt | 2 ++ + configure.ac | 2 ++ + 2 files changed, 4 insertions(+) + +commit 94adc996e45cc5cad9352cc3271d3a1a2f5c4c22 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-01 15:30:50 +0200 + + Replace "Fall through" comments with FALLTHROUGH + + src/liblzma/common/alone_decoder.c | 3 +-- + src/liblzma/common/auto_decoder.c | 5 ++--- + src/liblzma/common/block_decoder.c | 6 ++---- + src/liblzma/common/block_encoder.c | 6 ++---- + src/liblzma/common/common.c | 2 +- + src/liblzma/common/file_info.c | 22 +++++++++------------- + src/liblzma/common/index_decoder.c | 9 +++------ + src/liblzma/common/index_encoder.c | 6 ++---- + src/liblzma/common/index_hash.c | 7 +++---- + src/liblzma/common/lzip_decoder.c | 14 +++++--------- + src/liblzma/common/stream_decoder.c | 16 ++++++---------- + src/liblzma/common/stream_decoder_mt.c | 25 +++++++++---------------- + src/liblzma/common/stream_encoder_mt.c | 10 ++++------ + src/liblzma/lzma/lzma2_encoder.c | 9 +++------ + src/xz/args.c | 2 +- + src/xz/list.c | 3 +-- + 16 files changed, 54 insertions(+), 91 deletions(-) + +commit f31c3a6647b5a5d056324a9c83e6b2c940ebec22 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-01 15:08:51 +0200 + + sysdefs.h: Add FALLTHROUGH macro + + src/common/sysdefs.h | 9 +++++++++ + 1 file changed, 9 insertions(+) + +commit e34dbd6a0ae7a560a5508d51fc0bd142c5a320dc +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-01 15:06:15 +0200 + + xzdec: Fix language in a comment + + src/xzdec/xzdec.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 16821252c504071f5c2012e415e59cbf5fb79820 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-02 13:35:48 +0200 + + Windows: Make NLS require UCRT and gettext-runtime >= 0.23.1 + + Also remove the recently-added workaround from tuklib_gettext.h. + Requiring a new enough gettext-runtime is cleaner. 
I guess it's + mostly MSYS2 where xz is built with translation support, so once + MSYS2 has Gettext >= 0.23.1, this requirement shouldn't be a problem + in practice. + + CMakeLists.txt | 29 ++++++++++++++++++++++++++ + configure.ac | 29 ++++++++++++++++++++++++++ + src/common/tuklib_gettext.h | 51 --------------------------------------------- + 3 files changed, 58 insertions(+), 51 deletions(-) + +commit aa1807ed942579f700a08ab091b796cf04e31aec +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2025-01-02 11:52:17 +0200 + + windows/build-with-cmake.bat: Fix ENABLE_NLS to XZ_NLS + + Fixes: 29f77c7b707f2458fb047e77497354b195e05b14 + + windows/build-with-cmake.bat | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit ea21c76aa2406ba06ac154fe57741734c04f260f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-30 11:21:57 +0200 + + Build: Use git log --pretty=medium when creating ChangeLog + + It's the default in git-log. Specifying it explicitly is good in case + a user has set format.pretty to a different value. + + Makefile.am | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 08050c0788ce5bac0ffd572e9784a2749c4a13df +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-30 10:51:33 +0200 + + Windows: Update MinGW-w64 + CMake instructions to recommend UCRT + + windows/INSTALL-MinGW-w64_with_CMake.txt | 38 +++++++++++++++++++------------- + 1 file changed, 23 insertions(+), 15 deletions(-) + +commit 653732bd6f06d8f465bf353bf6e1c16f1405b906 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-30 10:51:26 +0200 + + xz man page: Describe the source file deletion in -z and -d options + + The DESCRIPTION section always explained it, and the OPTIONS section + only described the differences to the default behavior. However, new + users in a hurry may skip reading DESCRIPTION. 
The default behavior + is a bit dangerous, thus it's good to repeat in --compress and + --decompress docs that source file is removed after successful operation. + + Fixes: https://github.com/tukaani-project/xz/issues/150 + + src/xz/xz.1 | 17 ++++++++++++++++- + 1 file changed, 16 insertions(+), 1 deletion(-) + +commit bb79f79b278fd4fb06a0bcd5ab3445c468f9baaf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-27 21:52:28 +0200 + + Build: Set libtool -version-info so that it matches with CMake + + In the past, they haven't been in sync in development versions + although they (of course) have been in stable releases. + + src/liblzma/Makefile.am | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit cf54f70e14c218faf5019ffa2fa769ed73772ee8 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-28 18:28:56 +0200 + + CMake/macOS: Use GNU Libtool compatible shared library versioning + + Because this increases the Mach-O compatibility_version, this commit + shouldn't cause any ABI compatibility trouble for existing CMake users + on macOS. This is assuming that they won't later downgrade to an older + liblzma version that was built with CMake before this commit. + + Meson allows customising the Mach-O versioning too. So the three + build systems can be configured to be compatible. + + CMakeLists.txt | 51 ++++++++++++++++++++++++++++++++++++++++++++++++--- + 1 file changed, 48 insertions(+), 3 deletions(-) + +commit 94e17916689d38bc09bf35e602ed6f6276034b59 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-28 14:49:45 +0200 + + CMake: Edit a comment + + CMakeLists.txt | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 6b50590725aeae8a2aed06faa3238cb9f8771c1b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-28 20:39:49 +0200 + + version.sh: Omit an unwanted dot from development versions + + It printed 5.7.0.alpha instead of 5.7.0alpha. 
+ + Fixes: e7a42cda7c827e016619e8cab15e2faf5d4181ae + + build-aux/version.sh | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit f7a248f56e94310a080051c4a709c08514fa48b1 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-27 16:25:07 +0200 + + CMake: Remove a duplicate word from a comment + + CMakeLists.txt | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 8b7c55d148f4a9b3702207164e862437ddffad33 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-27 16:23:12 +0200 + + INSTALL: Document CMAKE_DLL_NAME_WITH_SOVERSION + + INSTALL | 19 +++++++++++++++++++ + 1 file changed, 19 insertions(+) + +commit 260d5d36203955a7148ae1ab05d0931c942028d5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-26 21:27:18 +0200 + + xz: Fix comments + + src/xz/file_io.c | 4 ++-- + src/xz/file_io.h | 4 ++-- + 2 files changed, 4 insertions(+), 4 deletions(-) + +commit bf6da9a573a780cd1a7fb1728ef55d09e58dad11 +Author: Dexter Castor Döpping <dexter.c.dopping@gmail.com> +Date: 2024-12-22 13:44:03 +0100 + + CMake: Disable unity builds project-wide + + liblzma and xz can't be compiled as a unity/jumbo build because of + redeclarations and type name reuse. The CMake documentation recommends + setting UNITY_BUILD to false in this case. + + This is especially important if we're compiled as a subproject and the + consumer wants to use CMAKE_UNITY_BUILD=ON for the rest of their code + base. + + Closes: https://github.com/tukaani-project/xz/pull/158 + + CMakeLists.txt | 6 ++++++ + 1 file changed, 6 insertions(+) + +commit f8c328eed1bf0a0168132025a52116b7735f894c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-20 08:51:18 +0200 + + Windows: Workaround a UTF-8 issue in Gettext's libintl_setlocale() + + See the comment. In this package, locale is set at program startup and + not changed later, so the point (2) in the comment isn't a problem. 
+ + Fixes: 46ee0061629fb075d61d83839e14dd193337af59 + + src/common/tuklib_gettext.h | 51 +++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 51 insertions(+) + +commit 03533906093529701ba91081907d8977991997de +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-20 06:50:36 +0200 + + Revert "Windows: Use UTF-8 locale when active code page is UTF-8" + + This reverts commit 0d0b574cc45045d6150d397776340c068df59e2a. + + src/common/tuklib_gettext.h | 32 ++------------------------------ + 1 file changed, 2 insertions(+), 30 deletions(-) + +commit 4b319e05afef4eab2fbafb6223f25d128ec99fce +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-19 18:31:09 +0200 + + xzdec: Use setlocale() instead of tuklib_gettext_setlocale() + + xzdec isn't translated and doesn't need libintl on Windows even + when NLS is enabled, thus libintl_setlocale() cannot interfere + with the locale settings. Thus, standard setlocale() works perfectly. + + In the commit 78868b6e, the explanation in the commit message is wrong. + + Fixes: 78868b6ed63fa4c89f73e3dfed27abfb8b0d46db + + src/xzdec/xzdec.c | 9 +++------ + 1 file changed, 3 insertions(+), 6 deletions(-) + +commit 34b80e282ea76ec793eaedaef58a36c3913dec78 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-19 19:36:15 +0200 + + Windows: Revert the setlocale(LC_ALL, ".UTF8") documentation + + Only leave the FindFirstFileA() notes from 20dfca81, reverting + the incorrect setlocale() notes. On Windows, Gettext's <libintl.h> + overrides setlocale() with libintl_setlocale() wrapper. I hadn't + noticed this, and thus my conclusions were wrong. 
+ + Fixes: 20dfca8171dad4c64785ac61d5b68972c444877b + + src/common/w32_application.manifest.comments.txt | 21 +-------------------- + 1 file changed, 1 insertion(+), 20 deletions(-) + +commit 5794cda064ce980450eaa5a4e2c71bd317168ce4 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 17:49:05 +0200 + + tuklib_mbstr_wrap: Silence a warning from Clang + + Fixes: ca529c3f41a4a19a59e2e252e6dd9255f130c634 + + src/common/tuklib_mbstr_wrap.c | 9 +++++++++ + 1 file changed, 9 insertions(+) + +commit 16c9796ef970ae349c54fef9a346e394d7cc4c75 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:00:09 +0200 + + Update THANKS + + THANKS | 2 ++ + 1 file changed, 2 insertions(+) + +commit 3b5c8a1fcab385eed9cc95684223fddd7cf5a053 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:00:09 +0200 + + Update TODO + + Fixes: 5f6dddc6c911df02ba660564e78e6de80947c947 + + TODO | 3 --- + 1 file changed, 3 deletions(-) + +commit 22a35e64ce3d331b668f15f858a7bb3da3acc78e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:00:09 +0200 + + lzmainfo: Use tuklib_mbstr_nonprint + + CMakeLists.txt | 3 +++ + src/lzmainfo/Makefile.am | 1 + + src/lzmainfo/lzmainfo.c | 16 ++++++++++------ + 3 files changed, 14 insertions(+), 6 deletions(-) + +commit 03111595ee713e0f94fb4f4a19a15594d5149347 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:00:09 +0200 + + xzdec: Use tuklib_mbstr_nonprint + + CMakeLists.txt | 3 +++ + src/xzdec/Makefile.am | 2 ++ + src/xzdec/xzdec.c | 15 +++++++++++---- + 3 files changed, 16 insertions(+), 4 deletions(-) + +commit d22f96921fd2f94d842f3cc2e5f729cb3cca5122 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:00:09 +0200 + + xz: Use tuklib_mbstr_nonprint + + Call tuklib_mask_nonprint() on filenames and also on a few other + strings from the command line too. + + The filename printed by "xz --robot --list" (in list.c) is also masked. 
+ It's good to get rid of tabs and newlines which would desync the output + but masking other chars wouldn't be strictly necessary. It might matter + with sensible filenames if LC_CTYPE is "C" (when iswprint() might reject + non-ASCII chars) and a script wants to read a filename from xz's output. + Hopefully it's an unusual enough corner case to not be a real problem. + + CMakeLists.txt | 2 ++ + src/xz/Makefile.am | 1 + + src/xz/coder.c | 19 ++++++++----- + src/xz/file_io.c | 81 ++++++++++++++++++++++++++++++++++-------------------- + src/xz/list.c | 32 +++++++++++++-------- + src/xz/main.c | 10 +++++-- + src/xz/message.c | 8 ++++-- + src/xz/options.c | 10 ++++--- + src/xz/private.h | 1 + + src/xz/suffix.c | 12 ++++---- + 10 files changed, 113 insertions(+), 63 deletions(-) + +commit 40e573305535960574404d2eae848b248c95ea7e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:00:09 +0200 + + Add tuklib_mbstr_nonprint to mask non-printable characters + + Malicious filenames or other untrusted strings may affect the state of + the terminal when such strings are printed as part of (error) messages. + Add functions that mask such characters. + + It's not enough to handle only single-byte control characters. + In multibyte locales, some control characters are multibyte too, for + example, terminals interpret C1 control characters (U+0080 to U+009F) + that are two bytes as UTF-8. + + Instead of checking for control characters with iswcntrl(), this + uses iswprint() to detect printable characters. This is much stricter. + On Windows it's actually too strict as it rejects some characters that + definitely are printable. + + Gnulib's quotearg would do a lot more but I hope this simpler method + is good enough here. + + Thanks to Ryan Colyer for the discussion about the problems of + the earlier single-byte-only method. + + Thanks to Christian Weisgerber for reporting a bug in an earlier + version of this code. + + Thanks to Jeroen Roovers for a typo fix. 
+ + Closes: https://github.com/tukaani-project/xz/pull/118 + + src/Makefile.am | 2 + + src/common/tuklib_mbstr_nonprint.c | 151 +++++++++++++++++++++++++++++++++++++ + src/common/tuklib_mbstr_nonprint.h | 69 +++++++++++++++++ + 3 files changed, 222 insertions(+) + +commit 36190c8c4bb13d1eab84a30f3650a5ec5ff0e402 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 11:33:09 +0200 + + Translations: Add preliminary Georgian translation + + Most of the auto-wrapped strings are translated already. A few + strings have changed since this was created though. This file + isn't in the Translation Project *yet* because these strings + are still very new. + + Closes: https://github.com/tukaani-project/xz/pull/145 + + po/LINGUAS | 1 + + po/ka.po | 1186 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 1187 insertions(+) + +commit 4a0c4f92b820b84ace625a95305a9d56cb662f4e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-30 20:50:20 +0200 + + xz: Make one string simpler for translators + + Leading spaces in the string can get miscounted by translators. + + src/xz/list.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 3fcf547e926f6c0414b23459f7b43164f7e8c378 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-17 10:26:10 +0200 + + lzmainfo: Sync the translatable strings with xz + + src/lzmainfo/lzmainfo.c | 20 ++++++++++++-------- + 1 file changed, 12 insertions(+), 8 deletions(-) + +commit 3e9177fd206d20d6d8acc7d203c25a9ae0549229 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-17 10:26:10 +0200 + + xz: Use automatic word wrapping for help texts + + --long-help is now one line longer because --lzma1 is now on its + own line. 
+ + CMakeLists.txt | 2 + + src/xz/Makefile.am | 3 +- + src/xz/message.c | 482 ++++++++++++++++++++++++++++++++++------------------- + 3 files changed, 313 insertions(+), 174 deletions(-) + +commit a0eecc9eb23ac583ccf442de3f5c106d4b09482d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-16 18:46:45 +0200 + + po/Makevars: Add --keyword=W_:... to XGETTEXT_OPTIONS + + The text was copied from tuklib_gettext.h. + + Also rearrange the --keyword options to be last on the line. + + po/Makevars | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit ca529c3f41a4a19a59e2e252e6dd9255f130c634 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-16 18:43:52 +0200 + + Add tuklib_mbstr_wrap for automatic word wrapping + + Automatic word wrapping makes translators' work easier and reduces + errors like misaligned columns or overlong lines. Right-to-left + languages and languages that don't use spaces between words will + still need extra effort. (xz hasn't been translated to any RTL + language so far.) + + cmake/tuklib_mbstr.cmake | 4 + + m4/tuklib_mbstr.m4 | 2 +- + src/Makefile.am | 2 + + src/common/tuklib_gettext.h | 11 ++ + src/common/tuklib_mbstr_wrap.c | 285 +++++++++++++++++++++++++++++++++++++++++ + src/common/tuklib_mbstr_wrap.h | 203 +++++++++++++++++++++++++++++ + 6 files changed, 506 insertions(+), 1 deletion(-) + +commit 314b83cebad0244a0015a8abc6d8d086b581c215 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-17 17:57:18 +0200 + + Build: Sort filenames to ASCII order in Makefile.am + + src/Makefile.am | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit df399c52554dfdf60259ca2cce97adbcfff39dc0 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-21 18:51:24 +0300 + + tuklib_mbstr_width: Add tuklib_mbstr_width_mem() + + It's a new function split from tuklib_mbstr_width(). + It's useful with partial strings that aren't terminated with \0. 
+ + src/common/tuklib_mbstr.h | 17 +++++++++++++++++ + src/common/tuklib_mbstr_width.c | 8 ++++++++ + 2 files changed, 25 insertions(+) + +commit 51081efae4c52c226e96da95313916eba99f885f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-16 20:08:27 +0200 + + tuklib_mbstr_width: Update a comment about shift states + + src/common/tuklib_mbstr_width.c | 11 ++++++++--- + 1 file changed, 8 insertions(+), 3 deletions(-) + +commit 7ff1b0ac53866877bdfd79acf5fee0269058c58b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-21 18:47:56 +0300 + + tuklib_mbstr_width: Don't mention shift states in the API docs + + It is assumed that this code won't be used with charsets that use + locking shift states. + + src/common/tuklib_mbstr.h | 8 ++------ + 1 file changed, 2 insertions(+), 6 deletions(-) + +commit 3c16105936320e4095dbe84fa9a33a4a6d46a597 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-21 18:41:41 +0300 + + tuklib_mbstr_width: Use stricter return value checking + + This should make no difference in practice (at least if mbrtowc() + isn't broken). + + src/common/tuklib_mbstr_width.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit b797c44c42ea54fe1c52722a2fca0c9618575598 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-16 20:06:07 +0200 + + tuklib_mbstr_width: Change the behavior when wcwidth() is not available + + If wcwidth() isn't available (Windows), previously it was assumed + that one byte == one column in the terminal. Now it is assumed that + one multibyte character == one column. This works better with UTF-8. + Languages that only use single-width characters without any combining + characters should work correctly with this. + + In xz, none of po/*.po contain combining characters and only ko.po, + zh_CN.po, and zh_TW.po contain fullwidth characters. Thus, "only" + those three translations in xz are broken on Windows with the + UTF-8 code page. 
Broken means that column headings in xz -lvv and + (only in the master branch) strings in --long-help are misaligned, + so it's not a huge problem. I don't know if those three languages + displayed perfectly before the UTF-8 change because I hadn't tested + translations with native Windows builds before. + + Fixes: 46ee0061629fb075d61d83839e14dd193337af59 + + src/common/tuklib_mbstr_width.c | 13 +++++++++++-- + 1 file changed, 11 insertions(+), 2 deletions(-) + +commit 78868b6ed63fa4c89f73e3dfed27abfb8b0d46db +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:23:13 +0200 + + xzdec: Use setlocale() via tuklib_gettext_setlocale() + + xzdec isn't translated and didn't have locale-specific behavior + in the past. On Windows with UTF-8 in the application manifest, + setting the locale makes a difference though: + + - Without any setlocale() call, non-ASCII filenames don't display + properly in Command Prompt unless one first uses "chcp 65001" + to set the console code page to UTF-8. + + - setlocale(LC_ALL, "") is enough to make non-ASCII filenames + print correctly in Command Prompt without using "chcp 65001", + assuming that the non-UTF-8 code page (like 850) supports + those non-ASCII characters. + + - setlocale(LC_ALL, ".UTF8") is even better because then mbrtowc() and + such functions use an UTF-8 locale instead of a legacy code page. + The tuklib_gettext_setlocale() macro takes care of this (without + enabling any translations). + + Fixes: 46ee0061629fb075d61d83839e14dd193337af59 + + src/xzdec/xzdec.c | 12 ++++++++++++ + 1 file changed, 12 insertions(+) + +commit 0d0b574cc45045d6150d397776340c068df59e2a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-17 14:59:37 +0200 + + Windows: Use UTF-8 locale when active code page is UTF-8 + + XZ Utils 5.6.3 set the active code page to UTF-8 to fix CVE-2024-47611. + This wasn't paired with UCRT-specific setlocale(LC_ALL, ".UTF8"), thus + non-ASCII characters from translations became mojibake. 
+ + Fixes: 46ee0061629fb075d61d83839e14dd193337af59 + + src/common/tuklib_gettext.h | 32 ++++++++++++++++++++++++++++++-- + 1 file changed, 30 insertions(+), 2 deletions(-) + +commit 20dfca8171dad4c64785ac61d5b68972c444877b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-17 15:01:29 +0200 + + Windows: Document the need for setlocale(LC_ALL, ".UTF8") + + Also warn about unpaired surrogates and (somewhat UTF-8-specific) + MAX_PATH issue in FindFirstFileA(). + + Fixes: 46ee0061629fb075d61d83839e14dd193337af59 + + src/common/w32_application.manifest.comments.txt | 28 +++++++++++++++++++++++- + 1 file changed, 27 insertions(+), 1 deletion(-) + +commit 4e936f234056e5831013ed922145b666b04bb1e3 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-18 14:12:22 +0200 + + xzdec: Call tuklib_progname_init() early enough + + If the early pledge() call on OpenBSD fails, it calls my_errorf() + which requires the "progname" variable. + + Fixes: d74fb5f060b76db709b50f5fd37490394e52f975 + + src/xzdec/xzdec.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 61feaf681bd793dc5c919732b44bca7dcf2ed1b8 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-15 19:08:32 +0200 + + CMake: Bump maximum policy version to 3.31 + + With CMake 3.31, there were a few warnings from + CMP0177 "install() DESTINATION paths are normalized". + These occurred because the install(FILES) command in + my_install_man_lang() is called with a DESTINATION path + that contains two consecutive slashes, for example, + "share/man//man1". Such a path is for the English man pages. + With translated man pages, the language code goes between + the slashes. The warning was probably triggered because the + extra slash gets removed by the normalization. 
+ + CMakeLists.txt | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit b0bb84dd7bbdcc85243386a0051c7b2cb5fc6a18 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-15 18:35:27 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit bee0c044d30a6ad3b3d94901c27e7519f6f46e27 +Author: Dexter Castor Döpping <dexter.c.dopping@gmail.com> +Date: 2024-12-08 18:24:29 +0100 + + liblzma: Fix incorrect macro name in a comment + + Fixes: 33b8a24b6646a9dbfd8358405aec466b13078559 + Closes: https://github.com/tukaani-project/xz/pull/155 + + src/liblzma/api/lzma/lzma12.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 2cfa1ad0a9eb62b1847cf13f9aee290158978a3a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-17 10:36:43 +0200 + + license-check.sh: Add an exception for doc/SHA256SUMS + + Fixes: 36b531022f24a2ab57a2dfb9e5052f1c176e9d9a + + build-aux/license-check.sh | 1 + + 1 file changed, 1 insertion(+) + +commit 36b531022f24a2ab57a2dfb9e5052f1c176e9d9a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-12-01 21:38:17 +0200 + + doc/SHA256SUMS: Add the list of SHA-256 hashes of release files + + The release files are signed but verifying the signatures cannot + catch certain types of attacks: + + 1. A malicious maintainer could make more than one variant of + a package. One could be for general distribution. Another + with malicious content could be targeted to specific users, + for example, distributing the malicious version on a mirror + controlled by the attacker. + + 2. If the signing key of an honest maintainer was compromised + without being detected, a similar situation as described + above could occur. + + SHA256SUMS could be put on the project website but having it in + the Git repository makes it obvious that old lines aren't modified + when the file is updated. + + Hashes of uncompressed files are included too. 
This way tarballs + can be recompressed and the hashes can still be verified. + + .gitattributes | 1 + + doc/SHA256SUMS | 218 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 219 insertions(+) + +commit fe9e66993fdbcc2981c7361b9b034a451eb0fc42 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-11-30 12:05:59 +0200 + + Docs: Remove .github/SECURITY.md + + One of the reasons to have this file in the xz repository was to + show vulnerability reporting info in the Security section on GitHub. + On 2024-11-25, I added SECURITY.md to the tukaani-project organization + on GitHub: + + https://github.com/tukaani-project/.github/blob/main/SECURITY.md + + GitHub shows that file in all projects in the organization unless + overridden by a project-specific SECURITY.md. Thus, removing + the file from the xz repo makes GitHub show the organization-wide + text instead. + + Maintaining a single copy for the whole GitHub organization makes + things simpler. It's also nicer to have fewer GitHub-specific files + in the xz repo. Information how to report bugs (including security + issues) is available in README and on the home page too. + + The OpenSSF Scorecard tool didn't find .github/SECURITY.md from the + xz repository. There was a suggestion to move the file to the top-level + directory where Scorecard should find it. However, Scorecard does find + the organization-wide SECURITY.md. 
Thus, the file isn't needed in the + xz repository to score points in the Scorecard game: + + https://scorecard.dev/viewer/?uri=github.com/tukaani-project/xz + + Closes: https://github.com/tukaani-project/xz/issues/148 + Closes: https://github.com/tukaani-project/xz/pull/149 + + .github/SECURITY.md | 14 -------------- + 1 file changed, 14 deletions(-) + +commit b36177273602ebc83e9cc58517f63a7b6af33f70 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-11-30 10:27:14 +0200 + + Translations: Update the Chinese (traditional) translation + + po/zh_TW.po | 201 +++++++++++++++++++++++++----------------------------------- + 1 file changed, 84 insertions(+), 117 deletions(-) + +commit c15115f7ede492f20c91b08ba485f9426f60233f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-30 19:54:34 +0200 + + liblzma: Optimize the loop conditions in BCJ filters + + Compilers cannot optimize the addition "i + 4" away since theoretically + it could overflow. + + src/liblzma/simple/arm.c | 4 +++- + src/liblzma/simple/arm64.c | 4 +++- + src/liblzma/simple/armthumb.c | 7 ++++++- + src/liblzma/simple/ia64.c | 4 +++- + src/liblzma/simple/powerpc.c | 4 +++- + src/liblzma/simple/sparc.c | 5 +++-- + 6 files changed, 21 insertions(+), 7 deletions(-) + +commit 9f69e71e78621fd056f5eaaad7cdcd9279310fb5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-11-25 16:26:54 +0200 + + Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit 48ff3f06521ca326996ab9a04d1b342098960427 +Author: Mark Wielaard <mark@klomp.org> +Date: 2024-11-25 12:28:44 +0200 + + xz: Landlock: Fix a file descriptor leak + + src/xz/sandbox.c | 1 + + 1 file changed, 1 insertion(+) + +commit dbca3d078ec581600600abebbb18769d3d713914 +Author: Sam James <sam@gentoo.org> +Date: 2024-10-02 03:04:03 +0100 + + CI: update FreeBSD, NetBSD, OpenBSD, Solaris actions + + Checked the changes and they're all innocuous. 
This should hopefully + fix the "externally managed" pip error in these jobs that started + recently. + + .github/workflows/freebsd.yml | 2 +- + .github/workflows/netbsd.yml | 2 +- + .github/workflows/openbsd.yml | 2 +- + .github/workflows/solaris.yml | 2 +- + 4 files changed, 4 insertions(+), 4 deletions(-) + +commit a94b85bea3f04d8c1f4e2e6f648a9a15bc6ce58f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-10-01 12:17:39 +0300 @@ -17,18 +2705,31 @@ Date: 2024-10-01 12:17:39 +0300 NEWS | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) -commit b8f52990b5d47a50902bf33cd2305ce985457bac +commit be4bf94446b6286a5dffdde85fc1d21448f4edff +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-01 14:49:41 +0300 + + cmake/tuklib_large_file_support.cmake: Add a missing include + + v5.2 didn't build with CMake. Other branches had + include(CMakePushCheckState) in top-level CMakeLists.txt + which made the build work. + + Fixes: 597f49b61475438a43a417236989b2acc968a686 + + cmake/tuklib_large_file_support.cmake | 1 + + 1 file changed, 1 insertion(+) + +commit 1ebbe915d4e0d877154261b5f8103719a6722975 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-10-01 12:10:23 +0300 Update THANKS - - (cherry picked from commit 1ebbe915d4e0d877154261b5f8103719a6722975) THANKS | 2 ++ 1 file changed, 2 insertions(+) -commit 51f6f455873911894f155e6997bc23a9be8f42ba +commit 74702ee00ecfd080d8ab11118cd25dbe6c437ec0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-10-01 12:10:23 +0300 @@ -36,8 +2737,6 @@ Date: 2024-10-01 12:10:23 +0300 This ensures that the test programs get executed the same way as the binaries that are installed. 
- - (cherry picked from commit 74702ee00ecfd080d8ab11118cd25dbe6c437ec0) CMakeLists.txt | 14 ++++++++++---- tests/Makefile.am | 10 ++++++++++ @@ -45,7 +2744,19 @@ Date: 2024-10-01 12:10:23 +0300 tests/tests_w32res.rc | 18 ++++++++++++++++++ 4 files changed, 70 insertions(+), 5 deletions(-) -commit bf518b9ba446327a062ddfe67e7e0a5baed2394f +commit 7ddf2273e0e4654582ee65db19d44431bfdb5791 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-10-01 12:10:23 +0300 + + license-check.sh: Add an exception for w32_application.manifest + + The file gets embedded as is into executables, thus it cannot + hold a license identifier. + + build-aux/license-check.sh | 1 + + 1 file changed, 1 insertion(+) + +commit 46ee0061629fb075d61d83839e14dd193337af59 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-10-01 12:10:23 +0300 @@ -77,8 +2788,6 @@ Date: 2024-10-01 12:10:23 +0300 Thanks to Kelvin Lee for testing with MSVC and helping with the required build system fixes. - - (cherry picked from commit 46ee0061629fb075d61d83839e14dd193337af59) CMakeLists.txt | 18 +++ src/Makefile.am | 4 +- @@ -87,7 +2796,7 @@ Date: 2024-10-01 12:10:23 +0300 src/common/w32_application.manifest.comments.txt | 178 +++++++++++++++++++++++ 5 files changed, 232 insertions(+), 1 deletion(-) -commit 5718ce932e6ad4262d5fffc9e2a7a838f963d7e5 +commit dad153091552b52a41b95ec4981c6951f1cae487 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-29 14:46:52 +0300 @@ -96,13 +2805,11 @@ Date: 2024-09-29 14:46:52 +0300 Now the information in the "Details" tab in the file properties dialog matches the naming convention of Cygwin and MSYS2. This is only a cosmetic change. 
- - (cherry picked from commit dad153091552b52a41b95ec4981c6951f1cae487) src/liblzma/liblzma_w32res.rc | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) -commit e77c0ca61d12ebac433b7661840cb18d7031700a +commit 8940ecb96fe9f0f2a9cfb8b66fe9ed31ffbea904 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-25 15:47:55 +0300 @@ -110,13 +2817,11 @@ Date: 2024-09-25 15:47:55 +0300 LANGUAGE and VS_VERSION_INFO begin new statements so put an empty line between them. - - (cherry picked from commit 8940ecb96fe9f0f2a9cfb8b66fe9ed31ffbea904) src/common/common_w32res.rc | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) -commit e0ba0f26d9f3f53cedc92fb13303924c39d00392 +commit c3b9dad07d3fd9319f88386b7095019bcea45ce1 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-28 20:09:50 +0300 @@ -128,13 +2833,11 @@ Date: 2024-09-28 20:09:50 +0300 for Cygwin or MSYS2 because in that context it should be useless. (If Cygwin or MSYS2 is used to host building of normal Windows binaries then the DEF file is still created.) - - (cherry picked from commit c3b9dad07d3fd9319f88386b7095019bcea45ce1) CMakeLists.txt | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) -commit 69637d0c323c0d7d9619cff637c7ce97dabc4f02 +commit da4f275bd1c18b897e5c2dd0043546de3accce0a Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-28 15:19:14 +0300 @@ -142,37 +2845,31 @@ Date: 2024-09-28 15:19:14 +0300 If common_w32res.rc is modified, the resource files need to be rebuilt. In contrast, the liblzma*.map files truly are link dependencies. 
- - (cherry picked from commit da4f275bd1c18b897e5c2dd0043546de3accce0a) CMakeLists.txt | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) -commit af8533459c60d7bc5b55f2f516251af4572169e4 +commit 1c673c0aac7f7dee8dda2c1140351c8417a71e47 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-29 01:20:03 +0300 CMake: Checking for CYGWIN covers MSYS2 too On MSYS2, both CYGWIN and MSYS are set. - - (cherry picked from commit 1c673c0aac7f7dee8dda2c1140351c8417a71e47) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit eca08e4c204db404911e513f95110dcb0fb919bd +commit 6aaa0173b839e28429d43a8b62d257ad2f3b4521 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-28 09:37:30 +0300 Translations: Add the SPDX license identifier to pt_BR.po - - (cherry picked from commit 6aaa0173b839e28429d43a8b62d257ad2f3b4521) po/pt_BR.po | 2 ++ 1 file changed, 2 insertions(+) -commit 85801c96c32456300177fbbad1506b07f5dd0a47 +commit dc7b9f24b737e4e55bcbbdde6754883f991c2cfb Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-25 16:41:37 +0300 @@ -181,34 +2878,29 @@ Date: 2024-09-25 16:41:37 +0300 CMakeLists.txt was using xzdec_w32res.rc for both xzdec and lzmadec. 
Fixes: 998d0b29536094a89cf385a3b894e157db1ccefe - (cherry picked from commit dc7b9f24b737e4e55bcbbdde6754883f991c2cfb) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit a341d19c835a8c10fcf561b00b548c53af43381e +commit b834ae5f80911a3819d6cdb484f61b257174c544 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-25 21:29:59 +0300 Translations: Update the Brazilian Portuguese translation - - (cherry picked from commit b834ae5f80911a3819d6cdb484f61b257174c544) po/pt_BR.po | 144 ++++++++++++++++++++++-------------------------------------- 1 file changed, 53 insertions(+), 91 deletions(-) -commit e69c0b9b2e00ade984393ef9cabac57342072328 +commit eceb023d4c129fd63ee881a2d8696eaf52ad1532 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-17 01:21:15 +0300 Update THANKS - - (cherry picked from commit eceb023d4c129fd63ee881a2d8696eaf52ad1532) THANKS | 1 + 1 file changed, 1 insertion(+) -commit aef9a25b3200457c16846b046222fb2c7967afe0 +commit 76cfd0a9bb33ae8e534b1f73f6359dc825589f2f Author: Tobias Stoeckmann <tobias@stoeckmann.org> Date: 2024-09-16 23:19:46 +0200 @@ -222,12 +2914,11 @@ Date: 2024-09-16 23:19:46 +0200 Co-authored-by: Lasse Collin <lasse.collin@tukaani.org> Closes: https://github.com/tukaani-project/xz/pull/144 - (cherry picked from commit 76cfd0a9bb33ae8e534b1f73f6359dc825589f2f) src/lzmainfo/lzmainfo.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) -commit 40a7f163f56aca6b3c8b83e9382f5e5cb4f8e93b +commit 78355aebb7fb654302e5e33692ba109909dacaff Author: Tobias Stoeckmann <tobias@stoeckmann.org> Date: 2024-09-16 22:04:40 +0200 @@ -240,24 +2931,20 @@ Date: 2024-09-16 22:04:40 +0200 Fixes: 792331bdee706aa852a78b171040ebf814c6f3ae Closes: https://github.com/tukaani-project/xz/pull/143 [ Lasse: Commit message edits ] - - (cherry picked from commit 78355aebb7fb654302e5e33692ba109909dacaff) src/xzdec/xzdec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit c98714a57058ac381365c2ff1e1d1cd63a5742c4 
+commit e5758db7bd75587a2499e0771907521a4aa86908 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-10 13:54:47 +0300 Update THANKS - - (cherry picked from commit e5758db7bd75587a2499e0771907521a4aa86908) THANKS | 1 + 1 file changed, 1 insertion(+) -commit 4ed449517817b3659b35d19f39703e3c460f46c2 +commit 80ffa38f56657257ed4d90d76f6bd2f2bcb8163c Author: Firas Khalil Khana <firasuke@gmail.com> Date: 2024-09-10 12:30:32 +0300 @@ -265,12 +2952,11 @@ Date: 2024-09-10 12:30:32 +0300 Fixes: e9be74f5b129fe8a5388d588e68b1b7f5168a310 Closes: https://github.com/tukaani-project/xz/pull/141 - (cherry picked from commit 80ffa38f56657257ed4d90d76f6bd2f2bcb8163c) autogen.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 3b83577a1547e72cb78a905ad3d308a799ded485 +commit 68c54e45d042add64a4cb44bfc87ca74d29b87e2 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-02 20:08:40 +0300 @@ -284,13 +2970,11 @@ Date: 2024-09-02 20:08:40 +0300 previous lines. - "make update-po" was run to remove line numbers from comments. - - (cherry picked from commit 68c54e45d042add64a4cb44bfc87ca74d29b87e2) po/zh_CN.po | 102 ++++++++++++++++++++++++------------------------------------ 1 file changed, 40 insertions(+), 62 deletions(-) -commit 06f4c7edda0387eb6a2d6303804b59dcf4d3db1f +commit 2230692aa1bcebb586100183831e3daf1714d60a Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-09-02 19:40:50 +0300 @@ -304,24 +2988,55 @@ Date: 2024-09-02 19:40:50 +0300 to match the changes in fccebe2b4fd513488fc920e4dac32562ed3c7637 and 093490b58271e9424ce38a7b1b38bcf61b9c86c6. xz.pot in the TP is older than these commits. 
- - (cherry picked from commit 2230692aa1bcebb586100183831e3daf1714d60a) po/ca.po | 171 ++++++++++++++++++++++++++------------------------------------- 1 file changed, 69 insertions(+), 102 deletions(-) -commit 406cb5b669e47c0e45c98f1afb7be998084a93d0 +commit 3e7723ce26f74c71919984a6180504b4548cbb7e Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-08-22 11:01:07 +0300 +Date: 2024-08-22 14:06:16 +0300 Update THANKS + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit d3e0e679b2b8b428598bb8ba56a17715190814db +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-08-22 14:06:16 +0300 + + CMake: Don't install lzmadec.1 symlinks if XZ_TOOL_LZMADEC=OFF - (cherry picked from commit 5e375987509fab484b7bef0b90be92f241c58c91) + Thanks-to: 榆柳松 (ZhengSen Wang) <wzhengsen@gmail.com> + Fixes: fb50c6ba1d4c9405e5b12b5988b01a3002638c5d + Closes: https://github.com/tukaani-project/xz/pull/134 + + CMakeLists.txt | 12 ++++++++++-- + 1 file changed, 10 insertions(+), 2 deletions(-) + +commit acdf21033abe347d9a279e9fe757f90ed16c1dbb +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-08-22 14:06:16 +0300 + + CMake: Fix the build when XZ_TOOL_LZMADEC=OFF + + Co-developed-by: 榆柳松 (ZhengSen Wang) <wzhengsen@gmail.com> + Fixes: fb50c6ba1d4c9405e5b12b5988b01a3002638c5d + Fixes: https://github.com/tukaani-project/xz/pull/134 + + CMakeLists.txt | 6 ++++-- + 1 file changed, 4 insertions(+), 2 deletions(-) + +commit 5e375987509fab484b7bef0b90be92f241c58c91 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-08-22 11:01:07 +0300 + + Update THANKS THANKS | 1 + 1 file changed, 1 insertion(+) -commit 3a4a05d75eb41ddc41899324df0511670ceaaf1e +commit 6cd7c8607843c337edfe2c472aa316602a393754 Author: Yifeng Li <tomli@tomli.me> Date: 2024-08-22 02:18:49 +0000 @@ -350,35 +3065,50 @@ Date: 2024-08-22 02:18:49 +0000 Fixes: 3182a330c1512cc1f5c87b5c5a272578e60a5158 Fixes: https://github.com/tukaani-project/xz/issues/121 Closes: 
https://github.com/tukaani-project/xz/pull/136 - (cherry picked from commit 6cd7c8607843c337edfe2c472aa316602a393754) src/liblzma/rangecoder/range_decoder.h | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) -commit 4669f06d1a8d31de4b8b5861b5e8afd82cacd721 +commit bf901dee5d4c46609645e50311c0cb2dfdcf9738 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-19 20:02:43 +0300 Build: Comment that elf_aux_info(3) will be available on OpenBSD >= 7.6 - - (cherry picked from commit bf901dee5d4c46609645e50311c0cb2dfdcf9738) CMakeLists.txt | 2 +- configure.ac | 17 +++++++++++------ 2 files changed, 12 insertions(+), 7 deletions(-) -commit 9edddda5636d7b3504a033c31e8ea763e293fd35 +commit f7103c2c2a8fa51d1f308ba7387beeff20a0d4dd +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-19 19:42:26 +0300 + + Revert "liblzma: Add ARM64 CRC32 instruction support detection on OpenBSD" + + This reverts commit dc03f6290f5b9bd3d50c7e12e58dee870889d599. + + OpenBSD 7.6 will support elf_aux_info(3), and the detection code used + on FreeBSD will work on OpenBSD 7.6 too. Keep things simpler and drop + the OpenBSD-specific sysctl() method. + + Thanks to Christian Weisgerber. 
+ + CMakeLists.txt | 6 ------ + configure.ac | 9 --------- + src/liblzma/check/crc32_arm64.h | 15 --------------- + src/liblzma/check/crc_common.h | 1 - + 4 files changed, 31 deletions(-) + +commit 7c292dd0bf23cefcdf4b1509f3666322e08a7ede Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-13 22:10:37 +0300 liblzma: Tweak a comment - - (cherry picked from commit 7c292dd0bf23cefcdf4b1509f3666322e08a7ede) src/liblzma/simple/arm64.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit 1a93ab55d1563f5eb9b2c1b8240384046fe4bb97 +commit 6408edac5529d6ec0abf52794074f229c8362303 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-11 22:17:56 +0300 @@ -387,18 +3117,27 @@ Date: 2024-07-11 22:17:56 +0300 CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit cfe4465742ad2963fb0d9795e258615d7c1cf32d +commit 9231c39ffb518196d6664a86e5325e744621a21b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-06 15:13:19 +0300 + + CMake: Require CMake 3.20 or later + + This allows a few cleanups. + + CMakeLists.txt | 78 ++++++++++++++++++++-------------------------------------- + 1 file changed, 27 insertions(+), 51 deletions(-) + +commit 028185dd4889e3d6235ff13560160ebca6985021 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-09 14:27:51 +0300 Update THANKS - - (cherry picked from commit 028185dd4889e3d6235ff13560160ebca6985021) THANKS | 1 + 1 file changed, 1 insertion(+) -commit 0f47db18d04434203b350bde4909a5e468f197cc +commit baecfa142644eb5f5c6dd6f8e2f531c362fa3747 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-06 14:04:48 +0300 @@ -407,13 +3146,20 @@ Date: 2024-07-06 14:04:48 +0300 It won't be implemented. find + xargs is more flexible, for example, it allows compressing small files in parallel. An example for that has been included in the xz man page since 2010. 
- - (cherry picked from commit baecfa142644eb5f5c6dd6f8e2f531c362fa3747) src/xz/args.c | 1 - 1 file changed, 1 deletion(-) -commit 07f52c3528e43c4a925a3fc59a933c89f5604d92 +commit f691d58fae82bd815c5f86ffad10fe9b6b59dad8 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-06 14:04:16 +0300 + + Document --disable-loongarch-crc32 in INSTALL + + INSTALL | 8 ++++++++ + 1 file changed, 8 insertions(+) + +commit b3e53122f42796aaebd767bab920cf7bedf69966 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-03 20:45:48 +0300 @@ -428,23 +3174,170 @@ Date: 2024-07-03 20:45:48 +0300 Fixes: ac05f1b0d7cda1e7ae79775a8dfecc54601d7f1c Fixes: https://github.com/tukaani-project/xz/issues/129#issuecomment-2204522994 - (cherry picked from commit b3e53122f42796aaebd767bab920cf7bedf69966) CMakeLists.txt | 13 +++++++++++++ 1 file changed, 13 insertions(+) -commit eccb4d258b01651d06a2a31b8b68be9b04b7998c +commit 5742ec1fc7f2cf1c82cfe3477bb90594a4658374 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-07-02 22:49:33 +0300 Update THANKS - - (cherry picked from commit 5742ec1fc7f2cf1c82cfe3477bb90594a4658374) THANKS | 1 + 1 file changed, 1 insertion(+) -commit c9bd00327f064778babb014302718a18d65cf7d3 +commit 2d13d10357ecad243d7e4ff1de0e6b437c38a47a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 20:23:35 +0300 + + CMake: Improve NLS error messages + + CMakeLists.txt | 11 +++++++---- + 1 file changed, 7 insertions(+), 4 deletions(-) + +commit 628d8d2c4fdf9e6a91c7bba7a743f400a94c2909 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 20:19:47 +0300 + + CMake: Update the comment at the top of CMakeLists.txt + + While po/*.gmo files won't be used from the release tarball, + the generated translated man pages will be used still. Those + are text files and po4a has slightly more dependencies than + gettext tools so installing po4a might be a bit more challenging + in some situations. 
+ + CMakeLists.txt | 17 +++++++---------- + 1 file changed, 7 insertions(+), 10 deletions(-) + +commit b4b23c94fd4429abc663ced28d5cdc9cf7eb7507 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 20:12:40 +0300 + + CMake: Drop support for pre-generated po/*.gmo files + + When a release tarball is created using Autotools, the tarball includes + po/*.gmo files which are binary files generated from po/*.po. Other + tarball creation methods don't and won't create the .gmo files. + + It feels clearer if CMake will never install pre-generated binary files + from the source package. If people are able to install CMake, they + likely are able to install gettext tools as well (assuming they want + translations). + + CMakeLists.txt | 66 +++++++++++++++++++--------------------------------------- + 1 file changed, 21 insertions(+), 45 deletions(-) + +commit fb99f8e8c50171b898cb79fe1dc703d5f91e4f0a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 19:14:50 +0300 + + CMake: Make XZ_NLS handling more robust + + If a user set XZ_NLS=ON but find_package(Intl) failed or CMake version + wasn't at least 3.20, the configuration would fail in a cryptic way. + + If XZ_NLS is enabled, require that CMake is new enough and that either + gettext tools or pre-generated .gmo files are available. Otherwise fail + the configuration. Previously missing gettext tools and .gmo files would + only result in a warning. + + Missing man page translations are still only a warning. + + Thanks to Peter Seiderer for the bug report. 
+ + Fixes: https://github.com/tukaani-project/xz/issues/129 + Closes: https://github.com/tukaani-project/xz/pull/130 + + CMakeLists.txt | 82 ++++++++++++++++++++++++++++++++-------------------------- + 1 file changed, 46 insertions(+), 36 deletions(-) + +commit ec6157570ea8a8e38158894e530d35416ff6a0f8 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 19:39:05 +0300 + + CI: Add gettext as a dependency to CMake builds + + .github/workflows/ci.yml | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 24f0f7e399de03bb2ff675d97b723d14f17ed6ac +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 18:43:56 +0300 + + CMake: Fix ENABLE_NLS comment too + + Fixes: 29f77c7b707f2458fb047e77497354b195e05b14 + + CMakeLists.txt | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit a0df0676130bc565af0ec911e68a1d0fbc3ed0fb +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-02 18:02:50 +0300 + + CMake: The compile definition is ENABLE_NLS, not XZ_NLS + + The CMake variables were renamed and accidentally also + the compile definition was renamed. As a result, translation + support wasn't actually enabled in the executables. + + Fixes: 29f77c7b707f2458fb047e77497354b195e05b14 + + CMakeLists.txt | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 45d08abc33ccc52d2f050dcec458badc2ce59d0b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-07-01 17:33:20 +0300 + + Update AUTHORS and THANKS + + AUTHORS | 2 +- + THANKS | 1 + + 2 files changed, 2 insertions(+), 1 deletion(-) + +commit 7baf6835cfbf9c85ba37f9ffb7d4f87fb86a474e +Author: Xi Ruoyao <xry111@xry111.site> +Date: 2024-06-28 13:36:43 +0300 + + liblzma: Speed up CRC32 calculation on 64-bit LoongArch + + The crc.w.{b/h/w/d}.w instructions in LoongArch can calculate the CRC32 + result for 1/2/4/8 bytes in a single operation. Using these is much + faster compared to the generic method. 
+ + Optimized CRC32 is enabled unconditionally on 64-bit LoongArch because + the LoongArch specification says that CRC32 instructions shall be + implemented for 64-bit processors. Optimized CRC32 isn't enabled for + 32-bit LoongArch processors because not enough information is available + about them. + + Co-authored-by: Lasse Collin <lasse.collin@tukaani.org> + + Closes: https://github.com/tukaani-project/xz/pull/86 + + CMakeLists.txt | 25 ++++++++++++++ + configure.ac | 40 +++++++++++++++++++++++ + src/liblzma/check/Makefile.inc | 3 +- + src/liblzma/check/crc32_fast.c | 2 ++ + src/liblzma/check/crc32_loongarch.h | 65 +++++++++++++++++++++++++++++++++++++ + src/liblzma/check/crc_common.h | 15 +++++++++ + 6 files changed, 149 insertions(+), 1 deletion(-) + +commit 0ed893668554fb0758003289f8a6af9bd08b89d1 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-28 14:20:49 +0300 + + liblzma: ARM64 CRC32: Align the buffer faster + + Instead of doing it byte by byte, use the 1/2/4-byte CRC32 instructions. + + src/liblzma/check/crc32_arm64.h | 54 ++++++++++++++++++++++++++++++----------- + 1 file changed, 40 insertions(+), 14 deletions(-) + +commit 7e99856f66c07852c4e0de7aa01951e9147d86b0 Author: Sam James <sam@gentoo.org> Date: 2024-06-28 14:18:35 +0300 @@ -454,13 +3347,11 @@ Date: 2024-06-28 14:18:35 +0300 6c095a98fbec70b790253a663173ecdb669108c4 and speeds up the Valgrind job a bit, because non-xz tools aren't run unnecessarily with Valgrind by the script tests. - - (cherry picked from commit 7e99856f66c07852c4e0de7aa01951e9147d86b0) .github/workflows/ci.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 495de6ec9d7834c4ef4d5286844ef7b784eb951b +commit 2402e8a1ae92676fa0d4cb1b761d7f62f005c098 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-25 16:00:22 +0300 @@ -470,13 +3361,11 @@ Date: 2024-06-25 16:00:22 +0300 at that point in configure. 
But prepending is the correct way because in general the libraries being added might require other libraries that come later on the command line. - - (cherry picked from commit 2402e8a1ae92676fa0d4cb1b761d7f62f005c098) configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 55bf3f49a812e20a21e42323e39526bb31d9341a +commit 7bb46f2b7b3989c1b589a247a251470f65e91cda Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-25 14:24:29 +0300 @@ -502,15 +3391,13 @@ Date: 2024-06-25 14:24:29 +0300 on specific headers and types: if headers or types are missing, compilation will fail. Using the linker makes these checks more similar to the ones in cmake/tuklib_*.cmake which always link. - - (cherry picked from commit 7bb46f2b7b3989c1b589a247a251470f65e91cda) configure.ac | 8 ++++++-- m4/tuklib_cpucores.m4 | 8 ++++---- m4/tuklib_physmem.m4 | 17 +++++++++++------ 3 files changed, 21 insertions(+), 12 deletions(-) -commit b45270d88f0de1b2e8bf510f0e370a5db4067e1f +commit 35eb57355ad1c415a838d26192d5af84abb7cf39 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 23:35:59 +0300 @@ -529,13 +3416,11 @@ Date: 2024-06-24 23:35:59 +0300 warning flags that break -Werror anyway (but this isn't the only check in configure.ac that has this problem). Using AC_LINK_IFELSE also makes the check more similar to how it is done in CMakeLists.txt. - - (cherry picked from commit 35eb57355ad1c415a838d26192d5af84abb7cf39) configure.ac | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) -commit 2c3e4cbbdcefe214ef3033a725049034b73e9756 +commit 5a728813c378cc3c4c9c95793762452418d08f1b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 23:34:34 +0300 @@ -543,13 +3428,40 @@ Date: 2024-06-24 23:34:34 +0300 It's nice to keep these in sync. The use of main() will later allow AC_LINK_IFELSE usage too which may avoid the more fragile -Werror. 
- - (cherry picked from commit 5a728813c378cc3c4c9c95793762452418d08f1b) configure.ac | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) -commit 809e69f1f574dad3c9b00d4f01b9ef1a492319f3 +commit 5279828635a95abdef82e691fc4979d362780e63 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-24 20:14:43 +0300 + + CMake: Not experimental anymore + + While the CMake support has gotten a lot less testing than + the Autotools-based build, the supported features should now + be equal. The output may differ slightly, for example, + liblzma.pc may have + + Libs.private: -pthread -lpthread + + with Autotools on GNU/Linux. CMake doesn't put any options + in Libs.private because on modern glibc the pthread functions + are in libc. The options options aren't required to link static + liblzma into an application. + + Autotools-based build doesn't generate or install + lib/cmake/liblzma-*.cmake files. This means that on most + platforms one cannot rely on + + find_package(liblzma 5.2.5 REQUIRED CONFIG) + + or such finding those files. + + CMakeLists.txt | 9 ++++++--- + 1 file changed, 6 insertions(+), 3 deletions(-) + +commit de215a0517645d16343f3a5336d3df884a4f665f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-25 16:11:13 +0300 @@ -557,13 +3469,11 @@ Date: 2024-06-25 16:11:13 +0300 I had missed this simpler method before. It does create a dependency so that if .in.h changes the copying is done again. - - (cherry picked from commit de215a0517645d16343f3a5336d3df884a4f665f) CMakeLists.txt | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) -commit 52a8c87f37f4bd133f670722d2d4b73a74e352bc +commit e620f35097c0ad20cd76d8258750aa706758ced9 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-25 15:51:48 +0300 @@ -574,13 +3484,31 @@ Date: 2024-06-25 15:51:48 +0300 the thread libs from CMAKE_REQUIRED_LIBRARIES after the check for pthread_condattr_setclock() but keeping the libs should be fine too. 
Then it's ready in case more pthread functions were wanted some day. - - (cherry picked from commit e620f35097c0ad20cd76d8258750aa706758ced9) CMakeLists.txt | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) -commit 1591747bf692d10c3b2fd92c9dc8ba931626fd84 +commit 068a70e54932ca32ca2922aff5a67a62615c650b +Author: Sam James <sam@gentoo.org> +Date: 2024-06-24 19:25:30 +0100 + + CMake: Tweak comments + + Co-authored-by: Lasse Collin <lasse.collin@tukaani.org> + + CMakeLists.txt | 15 +++++++-------- + 1 file changed, 7 insertions(+), 8 deletions(-) + +commit 3c95c93bca593bdd54ac5cc01526b12c82c78faa +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-24 22:42:01 +0300 + + CMake: Edit white space for consistency + + CMakeLists.txt | 26 +++++++++++++------------- + 1 file changed, 13 insertions(+), 13 deletions(-) + +commit 114cba69dbb96003e676c8c87a2e9943b12d065f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 22:41:10 +0300 @@ -619,23 +3547,60 @@ Date: 2024-06-24 22:41:10 +0300 it works with GCC 4.9 to 14.1 on x86-64. 
Reported-by: Sam James <sam@gentoo.org> - (cherry picked from commit 114cba69dbb96003e676c8c87a2e9943b12d065f) CMakeLists.txt | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) -commit cc386f4ff4b87ff895fbc30fd3b13ee6e6152ace +commit 78e882205e1f1e91df2af2cb7da00fe205dede99 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-24 21:19:14 +0300 + + CMake: Use MATCHES instead of multiple STREQUAL + + CMakeLists.txt | 11 ++++------- + 1 file changed, 4 insertions(+), 7 deletions(-) + +commit d3f20382fc1bd865eb70a65455d5022ed05caac8 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 21:06:18 +0300 CMake: Improve the comment about LIBS - - (cherry picked from commit d3f20382fc1bd865eb70a65455d5022ed05caac8) CMakeLists.txt | 6 ++++++ 1 file changed, 6 insertions(+) -commit 65aaa0f87048f78a3f69c4ec0ad03723a2354fa7 +commit 33ec377729a3889e58d98934b2777b2754a3e045 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-24 20:01:25 +0300 + + CMake: Fix a typo in a message + + It was spotted with codespell. + + CMakeLists.txt | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit 2a47be823cd6c717bc91fa29c7710c9b1ae0331f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-24 19:58:54 +0300 + + Document CMake options in INSTALL + + INSTALL | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----- + 1 file changed, 106 insertions(+), 9 deletions(-) + +commit 3faf4e8079a46bd46e05cd1234365724a6a33802 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-24 17:18:44 +0300 + + CI: Don't omit crc32 from the list with CMake anymore + + XZ_CHECKS accepts it but works without too. 
+ + build-aux/ci_build.bash | 10 +--------- + 1 file changed, 1 insertion(+), 9 deletions(-) + +commit 1bf83cded2955282fe1a868f08c83d4e5d6dca4a Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 17:39:54 +0300 @@ -655,13 +3620,11 @@ Date: 2024-06-24 17:39:54 +0300 More information: https://mail.gnu.org/archive/html/config-patches/2022-05/msg00003.html - - (cherry picked from commit 1bf83cded2955282fe1a868f08c83d4e5d6dca4a) build-aux/ci_build.bash | 9 +++++++++ 1 file changed, 9 insertions(+) -commit 810f1a8aee9edb3bff430559f4b832cd0ec50797 +commit dbcdabf68fee9ed694b68c3a82e6adbeff20b679 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 15:24:52 +0300 @@ -669,13 +3632,11 @@ Date: 2024-06-24 15:24:52 +0300 The old method put it in CFLAGS which is a wrong place because config.guess doesn't read CFLAGS. - - (cherry picked from commit dbcdabf68fee9ed694b68c3a82e6adbeff20b679) .github/workflows/ci.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit dde14ded9a3240fd524d9bc01c9ceeb4d7909e95 +commit 0c1e6d900bac127464fb30a854776e1810ab5f16 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-24 14:54:17 +0300 @@ -686,13 +3647,193 @@ Date: 2024-06-24 14:54:17 +0300 The syntax in ci_build.bash was broken in case one wished to put spaces in CC. 
- - (cherry picked from commit 0c1e6d900bac127464fb30a854776e1810ab5f16) build-aux/ci_build.bash | 4 ---- 1 file changed, 4 deletions(-) -commit 85a55e1120bebac2f3cd9af8965f4a6335eeeb9b +commit a3d6eb797c1bd9b0425ef6754e475e43e62bf075 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 23:25:42 +0300 + + CMake: Add autodetection for 32-bit x86 CRC assembly usage + + CMakeLists.txt | 33 ++++++++++++++++++--------------- + 1 file changed, 18 insertions(+), 15 deletions(-) + +commit dbc14f213e5cf866f1f42b7c6381a91e1189908c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 23:00:59 +0300 + + CMake: Move option(XZ_ASM_I386) downwards a few lines + + CMakeLists.txt | 16 ++++++++-------- + 1 file changed, 8 insertions(+), 8 deletions(-) + +commit e5c2b07b489b155c1bebd5cb5e5b94325c2fef1a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 18:45:41 +0300 + + DOS: Update Makefile and config.h for the CRC changes + + dos/Makefile | 4 ++-- + dos/config.h | 3 +++ + 2 files changed, 5 insertions(+), 2 deletions(-) + +commit fe77c4e130d62dc3f9c1de40a18c0c6caa5a4d88 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-23 15:35:35 +0300 + + liblzma: Tidy up crc_common.h + + Prefix ARM64_RUNTIME_DETECTION with CRC_ and reorder it to be with + the other ARM64-specific lines. That macro isn't used outside this + file. + + ARM64 CLMUL implementation doesn't exist yet and thus CRC64_ARM64_CLMUL + isn't used anywhere yet. + + It's not ideal that the single-letter CRC utility macros are here + as they pollute the namespace of the LZ encoder files. Those could + be moved their own crc_macros.h like they were in 5.2.x but in practice + this is fine enough already. 
+ + src/liblzma/check/crc_common.h | 62 ++++++++++++++++++++++++++++-------------- + 1 file changed, 42 insertions(+), 20 deletions(-) + +commit 7484d375384f551d475ff44a93590a225e0cb8f6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-23 14:22:08 +0300 + + liblzma: Move lzma_crcXX_table[][] declarations to crc_common.h + + LZ encoder needs lzma_crc32_table[0] but otherwise those tables + are private to the CRC code. In contrast, the other things in + check.h are needed in several places. + + src/liblzma/check/check.h | 18 ------------------ + src/liblzma/check/crc32_small.c | 3 +++ + src/liblzma/check/crc_common.h | 18 ++++++++++++++++++ + src/liblzma/lz/lz_encoder_hash.h | 4 ++-- + 4 files changed, 23 insertions(+), 20 deletions(-) + +commit 85b081f5d4598342b8c155a2c08697fb2adc372c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-19 18:38:22 +0300 + + liblzma: Make 32-bit x86 CRC assembly co-exist with CLMUL + + Now runtime detection of CLMUL support can pick between the CLMUL and + the generic assembly implementations. Whatever overhead this has for + builds that omit CLMUL completely isn't important because builds for + any non-ancient system is likely to include the CLMUL code too. + + Handle the CRC tables in crcXX_fast.c files because now these files + are built even when assembly code is used. + + If 32-bit x86 assembly is enabled then it will always be built even + if compiler flags were such that CLMUL would be allowed unconditionally. + That is, runtime detection will be used anyway. This keeps the build + rules simpler. + + In LZ encoder, build and use lzma_lz_hash_table[256] if CLMUL CRC + is used without runtime detection. Previously this wasn't needed + because crc32_table.c included the lzma_crc32_table[][] in the build + unless encoder support had been disabled. Including an 8 KiB table + was silly when only 1 KiB is actually used. So now liblzma is 7 KiB + smaller if CLMUL is enabled without runtime detection. 
+ + CMakeLists.txt | 8 ++------ + src/liblzma/check/Makefile.inc | 8 ++------ + src/liblzma/check/crc32_fast.c | 14 ++++++++++++- + src/liblzma/check/crc32_table.c | 42 --------------------------------------- + src/liblzma/check/crc32_x86.S | 14 +++++-------- + src/liblzma/check/crc64_fast.c | 18 +++++++++++++---- + src/liblzma/check/crc64_table.c | 37 ---------------------------------- + src/liblzma/check/crc64_x86.S | 14 +++++-------- + src/liblzma/check/crc_common.h | 18 +++++++++-------- + src/liblzma/check/crc_x86_clmul.h | 5 ----- + src/liblzma/lz/lz_encoder.c | 2 +- + src/liblzma/lz/lz_encoder_hash.h | 30 ++++++++++++++++++++-------- + 12 files changed, 74 insertions(+), 136 deletions(-) + +commit 6667d503b5dc9826654e3d9ad505e1883ff6c388 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-19 17:44:41 +0300 + + liblzma: CRC: Rename crcXX_generic to lzma_crcXX_generic + + This prepares for the possibility that lzma_crc32_generic and + lzma_crc64_generic are extern functions. + + src/liblzma/check/crc32_fast.c | 6 +++--- + src/liblzma/check/crc64_fast.c | 6 +++--- + 2 files changed, 6 insertions(+), 6 deletions(-) + +commit 1dca581ff20aa1cde61e9e5267d3aeb0af9b6845 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 22:55:22 +0300 + + CMake: Define HAVE_CRC_X86_ASM when 32-bit x86 CRC assembly is used + + CMakeLists.txt | 3 +++ + 1 file changed, 3 insertions(+) + +commit f76837acb65676e541d8ee79cd62dbbf27280a62 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-10 16:00:26 +0300 + + Build: Define HAVE_CRC_X86_ASM when 32-bit x86 CRC assembly is used + + This makes it easier to determine when the CRC tables are needed. 
+ + configure.ac | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +commit 9ce0866b070850da4dc837741ff055faa218bdd6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-21 00:46:09 +0300 + + CI: Update to the new renamed options in CMakeLists.txt + + build-aux/ci_build.bash | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +commit 0232e66d5bc5b01a25a447c657e51747626488ab +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 18:12:22 +0300 + + CMake: Add XZ_EXTERNAL_SHA256 + + CMakeLists.txt | 121 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--- + 1 file changed, 116 insertions(+), 5 deletions(-) + +commit 4535b80caead82a7ddf7feb988b8fbc773152522 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 18:12:21 +0300 + + CMake: Move threading detection a few lines up + + It feels clearer this way, and when support for external SHA-256 + is added, this will keep the order of the library detection the + same as in configure.ac (check for pthreads before libmd) although + it shouldn't matter in practice. + + CMakeLists.txt | 176 ++++++++++++++++++++++++++++----------------------------- + 1 file changed, 88 insertions(+), 88 deletions(-) + +commit 94d062dbac34d366eb26625034200cc3457e6645 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 18:12:21 +0300 + + CMake: Move the sandbox code out of the liblzma section + + Sandboxing is for the command line tools, not liblzma. + No functional changes. + + CMakeLists.txt | 214 ++++++++++++++++++++++++++++----------------------------- + 1 file changed, 107 insertions(+), 107 deletions(-) + +commit 75ce4797d49621710e6da95d8cb91541028c6d68 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-20 18:12:21 +0300 @@ -700,24 +3841,373 @@ Date: 2024-06-20 18:12:21 +0300 This makes no difference yet because -lrt is currently the only option that might be added to LIBS. 
- - (cherry picked from commit 75ce4797d49621710e6da95d8cb91541028c6d68) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit e24a762f1be6bf379df73b7fe0a115ccae139a35 +commit 47aaa92516fd9609821d04e5e94ca6558e56d62b +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Don't install scripts if the xz tool isn't built + + The scripts need the xz tool. + + CMakeLists.txt | 11 +++++++++-- + tests/tests.cmake | 2 +- + 2 files changed, 10 insertions(+), 3 deletions(-) + +commit fb50c6ba1d4c9405e5b12b5988b01a3002638c5d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add XZ_TOOL_XZDEC and XZ_TOOL_LZMADEC + + CMakeLists.txt | 15 ++++++++++++++- + 1 file changed, 14 insertions(+), 1 deletion(-) + +commit def767f7d18ccbd81cd5e5b46c8b6031f3a1de34 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add XZ_TOOL_LZMAINFO + + CMakeLists.txt | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +commit 5600e370fb7e11eafabc6c3ef5bf6510e859f4f0 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add XZ_TOOL_XZ + + CMakeLists.txt | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +commit 6a3c4aaa43a90da441e1156c5ffd2e6098f5521f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + Windows: Drop Visual Studio 2013 support + + This simplifies things a little. Building liblzma with VS2013 probably + still worked but building the command line tools was not supported. + + Microsoft ended support for VS2013 on 2024-04. 
+ + CMakeLists.txt | 9 +++++++-- + src/common/sysdefs.h | 6 +----- + windows/INSTALL-MSVC.txt | 8 ++------ + 3 files changed, 10 insertions(+), 13 deletions(-) + +commit 5d5c92b26246936461a635dda1f95740d7de2058 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add XZ_TOOL_SCRIPTS + + CMakeLists.txt | 44 +++++++++++++++++++++++++++++--------------- + 1 file changed, 29 insertions(+), 15 deletions(-) + +commit d274a2bc00d235f07e96aaf82c149794cfe82b12 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add XZ_DOC + + CMakeLists.txt | 45 ++++++++++++++++++++++++--------------------- + 1 file changed, 24 insertions(+), 21 deletions(-) + +commit 188143a50ade67253ed256608f50f78aa1380403 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-20 21:53:03 +0300 + + CMake: Refactor XZ_SYMBOL_VERSIONING to match configure.ac + + Make the available options and their behavior match + --enable-symbol-versions in configure.ac. + + Don't enable symbol versions on Linux if not using glibc. Previously + the generic variant was selected on Microblaze or if using NVHPC + without checking that libc is glibc. + + Leave the cache variable to "auto" or "yes" if that was specified + instead of setting it to the autodetected value by default. A downside + is that one cannot easily see which variant the autodetection code + has selected. The same applies to XZ_SANDBOX and XZ_THREADS though. + + CMakeLists.txt | 125 ++++++++++++++++++++++++++++++++++----------------------- + 1 file changed, 75 insertions(+), 50 deletions(-) + +commit cc52ef8ed3b75a581262c587f6c06c213a550f86 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Use the same option list for XZ_THREADS as in configure.ac + + Also clarify that "yes" will fail if no threading support is found. + If no threading is wanted, it has to be disabled manually. 
+ + configure.ac doesn't behave this way at the moment. Instead it + assumes pthreads to be present if not targeting Windows. If pthreads + actually are missing, the build fails later. + + CMakeLists.txt | 18 ++++++++++-------- + 1 file changed, 10 insertions(+), 8 deletions(-) + +commit 37f7af3452bab0a34ce320c2ad532835f18752d9 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Use the same option list for XZ_SANDBOX as in configure.ac + + It's simpler to document this way. + + CMakeLists.txt | 20 ++++++++++---------- + 1 file changed, 10 insertions(+), 10 deletions(-) + +commit c715dec8e800b65145918cfb0ee9bbc90faa8aad Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-15 18:07:04 +0300 CMake: Fix indentation + + CMakeLists.txt | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +commit ea379f2f180befabd2039342db8eaeb757fdd2b7 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add warning options for GCC and Clang + + The list was copied from configure.ac and should be kept in sync. + (Pretend that the deleted comment in CMakeLists.txt didn't exist.) - (cherry picked from commit c715dec8e800b65145918cfb0ee9bbc90faa8aad) + There is no need to add equivalent of --enable-werror as CMake >= 3.24 + supports -DCMAKE_COMPILE_WARNING_AS_ERROR=ON. + + CMakeLists.txt | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++----- + 1 file changed, 59 insertions(+), 5 deletions(-) + +commit 74223338197b7dfcd69f56df78b6502805a75f23 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Use \040 instead of \x20 for a space + + This is for consistency with 4c81c9611f8b2e1ad65eb7fa166afc570c58607e + where \040 has to be used because \0x20F gets interpret at three hex + digits. Octals escapes are never longer than three digits. 
CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 99555b721b55263a6892b1093f2806f09a92e1fb +commit e8854b6bdc956c46dc4232bd07c17163034a00f2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Add XZ_ASSUME_RAM + + CMakeLists.txt | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +commit e1127e75cb82e0385f02c995771d6fe1420f43c5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename liblzma_INSTALL_CMAKEDIR to XZ_INSTALL_CMAKEDIR + + CMakeLists.txt | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 96abfe98c15e431a50a6a31015c5bb05540ab2ff +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Refactor ADDITIONAL_CHECK_TYPES to XZ_CHECKS + + Now "crc32" is in the list too for completeness but it doesn't + actually have any effect. The description of the cache variable + says that "crc32 is always built" so it should be clear enough. + + CMakeLists.txt | 14 +++++++------- + tests/tests.cmake | 17 ++++++++--------- + 2 files changed, 15 insertions(+), 16 deletions(-) + +commit 679500ffe00ecb4f02292129e7529ab7392f3943 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename the cache variable POSIX_SHELL to XZ_POSIX_SHELL + + We still need the variable POSIX_SHELL for configure_file() + but it doesn't need to be a cache variable. 
+ + CMakeLists.txt | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +commit e5c0eb2e50e5522a0a55e7ba83fe49b04c8a6eef +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENCODERS and DECODERS to use XZ_ prefix + + CMakeLists.txt | 34 +++++++++++++++++----------------- + tests/tests.cmake | 4 ++-- + 2 files changed, 19 insertions(+), 19 deletions(-) + +commit e7785e2061f95d44aa6c0856b09cc0fbad7d6154 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename MATCH_FINDERS to XZ_MATCH_FINDERS + + CMakeLists.txt | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 63294806b488a27a28a0960f6a257695dd2b569a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename SYMBOL_VERSIONING to XZ_SYMBOL_VERSIONING + + CMakeLists.txt | 9 +++++---- + 1 file changed, 5 insertions(+), 4 deletions(-) + +commit ad245b133675d285bca5d48123062e9d1e3f747e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENABLE_THREADS to XZ_THREADS + + CMakeLists.txt | 24 +++++++++++------------- + 1 file changed, 11 insertions(+), 13 deletions(-) + +commit 4250d4de32e66e558cc2ebe73b05255633c933ed +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENABLE_SANDBOX to XZ_SANDBOX + + CMakeLists.txt | 23 +++++++++++------------ + 1 file changed, 11 insertions(+), 12 deletions(-) + +commit 0fdcd0c582f1a38542cd647dde449d9447d5888d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENABLE_X86_ASM to XZ_ASM_I386 + + CMakeLists.txt | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +commit e017d5526e316003fdb2a3f76acbb83443f14ddf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename CREATE_XZ_SYMLINKS to XZ_TOOL_SYMLINKS + + This only affects the 
names unxz and xzcat. The xz-prefixed script + symlinks (xzfgrep and such) are always created if scripts are enabled. + + CMakeLists.txt | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit 04cac14fcb9fb302c24e90b04ca4b77d3717b50c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename CREATE_LZMA_SYMLINKS to XZ_TOOL_LZMA_SYMLINKS + + Update the description too. + + It affects creation of not only the legacy lzma, unlzma, lzcat symlinks + but also lzgrep and other legacy names for the scripts. The last + LZMA Utils release was made in 2008 but these names are still used + in some places to handle .lzma files. + + CMakeLists.txt | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +commit 612ccebf884eb1a9b6848e230c24f97a03fe917a +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ALLOW_ARM64_CRC32 to XZ_ARM64_CRC32 + + Update description too. + + CMakeLists.txt | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 3dcc12290d6dffbe7f10f501c141d325bad65901 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ALLOW_CLMUL_CRC to XZ_CLMUL_CRC + + Update description too. 
+ + CMakeLists.txt | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 4b8faa72442da9aa1a356f5848aae798d8588a7d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENABLE_DOXYGEN to XZ_DOXYGEN + + CMakeLists.txt | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +commit b56273ae575bac350e50b0c689269dcab04b04b3 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename LZIP_DECODER to XZ_LZIP_DECODER + + CMakeLists.txt | 4 ++-- + tests/tests.cmake | 2 +- + 2 files changed, 3 insertions(+), 3 deletions(-) + +commit 2343992fcbe8b436da6df888be37713cccaff0ab +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename MICROLZMA_ENCODER/DECODER to XZ_MICROLZMA_ENCODER/DECODER + + CMakeLists.txt | 8 ++++---- + tests/tests.cmake | 2 +- + 2 files changed, 5 insertions(+), 5 deletions(-) + +commit 96f0a6632cc0598a26d93255b0c444df18dc7891 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENABLE_SMALL to XZ_SMALL + + CMakeLists.txt | 14 +++++++------- + 1 file changed, 7 insertions(+), 7 deletions(-) + +commit 29f77c7b707f2458fb047e77497354b195e05b14 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-15 18:07:04 +0300 + + CMake: Rename ENABLE_NLS to XZ_NLS + + Also update the description to mention that this affects installation + of translated man pages too. + + Prefixing the cache variables with the project name helps if + the package is used as a subproject in another package. + It also makes the package-specific options group more nicely + in ccmake and cmake-gui. 
+ + CMakeLists.txt | 28 +++++++++++++++------------- + 1 file changed, 15 insertions(+), 13 deletions(-) + +commit ac05f1b0d7cda1e7ae79775a8dfecc54601d7f1c Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-15 23:34:29 +0300 @@ -736,24 +4226,20 @@ Date: 2024-06-15 23:34:29 +0300 they aren't, or something like that... It seems best to always specify a scope keyword as the meanings of those three keywords at least are clear. - - (cherry picked from commit ac05f1b0d7cda1e7ae79775a8dfecc54601d7f1c) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 258bae30a2040138c783b5c380cef0ca603663ed +commit 82986d8c691a294c78b48d8391303e5c428b5437 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-16 19:39:32 +0300 CMake: Add empty lines - - (cherry picked from commit 82986d8c691a294c78b48d8391303e5c428b5437) CMakeLists.txt | 2 ++ 1 file changed, 2 insertions(+) -commit a95a9601a109f0d0d059dea7a5a44efa87ef1401 +commit 2aecffe0f0e14f3ef635e8cd7b405420f2385de2 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-16 19:37:36 +0300 @@ -762,24 +4248,20 @@ Date: 2024-06-16 19:37:36 +0300 This shouldn't make much difference in practice as on Windows no flags are needed anyway and unitialized variable (when threading is disabled) expands to empty. But it's clearer this way. 
- - (cherry picked from commit 2aecffe0f0e14f3ef635e8cd7b405420f2385de2) CMakeLists.txt | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) -commit 65a10ddd439ad435d2c0176106b1e2d6b9c1b3a1 +commit 664918bd3635ea8e773f06022286ecb0c485166c Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-17 18:20:14 +0300 Update THANKS - - (cherry picked from commit 664918bd3635ea8e773f06022286ecb0c485166c) THANKS | 3 +++ 1 file changed, 3 insertions(+) -commit 6ad5739094ac69ac448a84493f2c7ddfc6eb0688 +commit 5ca96a93488d0f5a530c78b274cac317453807ff Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-16 19:25:07 +0300 @@ -787,13 +4269,11 @@ Date: 2024-06-16 19:25:07 +0300 vcpkg doesn't specify the newline type so it should be fine to use native newlines in liblzma.pc on Windows. - - (cherry picked from commit 5ca96a93488d0f5a530c78b274cac317453807ff) CMakeLists.txt | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) -commit 4107f2066764bb3a31d114852bc20722d582fd82 +commit ebd155c3a1b87411edae06d3bdaa9659ec057522 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-16 19:18:56 +0300 @@ -805,13 +4285,27 @@ Date: 2024-06-16 19:18:56 +0300 absolute path. Thanks to Eli Schwartz. - - (cherry picked from commit ebd155c3a1b87411edae06d3bdaa9659ec057522) CMakeLists.txt | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) -commit ff697eb154361417d94284e0c569aa08cacf9031 +commit 7a366d93cfd74ce10201db400be8836199944e36 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-16 18:33:08 +0300 + + Revert "CMake: Set only "prefix" as an absolute path in liblzma.pc" + + This reverts commit 5d1c649ba9eb7a5b9371252ebfbc2911dc774e69. + + While CMAKE_INSTALL_<dir> tend to be relative paths, they don't need + to be. Thus the commit was broken. A fancier method is required. + + Thanks to Eli Schwartz for the bug report and explanation. 
+ + CMakeLists.txt | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit 30a2d5d51006301a3ddab5ef1f5ff0a9d74dce6f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-16 13:39:37 +0300 @@ -822,13 +4316,153 @@ Date: 2024-06-16 13:39:37 +0300 not needed even though it's a "static inline" function. Thanks to Ilya Kurdyukov. - - (cherry picked from commit 30a2d5d51006301a3ddab5ef1f5ff0a9d74dce6f) src/liblzma/check/crc_x86_clmul.h | 4 ++++ 1 file changed, 4 insertions(+) -commit 4e4a568f6a089c867891c2388a19624e312eb2f3 +commit 54eaea5ea49bb8bca4286d4412f19ac73187489e +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-16 13:21:34 +0300 + + liblzma: x86 CLMUL CRC: Rewrite + + It's faster with both tiny and large buffers and doesn't require + disabling any sanitizers. With large buffers the extra speed is + from folding four 16-byte chunks in parallel. + + The 32-bit x86 with MSVC reportedly still needs a workaround. + Now the simpler "__asm mov ebx, ebx" trick is enough but it + needs to be in lzma_crc64() instead of crc64_arch_optimized(). + Thanks to Iouri Kharon for testing and the fix. + + Thanks to Ilya Kurdyukov for testing the speed with aligned and + unaligned buffers on a few x86 processors and on E2K v6. + + Thanks to Sam James for general feedback. 
+ + Fixes: https://github.com/tukaani-project/xz/issues/112 + Fixes: https://github.com/tukaani-project/xz/issues/122 + + src/liblzma/check/crc64_fast.c | 8 + + src/liblzma/check/crc_x86_clmul.h | 437 ++++++++++++++++++++------------------ + 2 files changed, 237 insertions(+), 208 deletions(-) + +commit c0e7eaae8d6eef1e313c9d0da20ccf126ec61f38 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-01 14:44:04 +0300 + + sysdefs.h: Add alignas + + src/common/sysdefs.h | 11 +++++++++++ + 1 file changed, 11 insertions(+) + +commit 20014c261451381d5e2f58e63e7b1fbefd4df4bf +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-11 12:47:59 +0300 + + liblzma: Use a single macro to select CLMUL CRC to build + + This way it's clearer that two things cannot be selected + at the same time. + + src/liblzma/check/crc32_fast.c | 2 +- + src/liblzma/check/crc64_fast.c | 2 +- + src/liblzma/check/crc_x86_clmul.h | 18 ++++++++++-------- + 3 files changed, 12 insertions(+), 10 deletions(-) + +commit d8fb0986171bd6a3066b236fc9a6b3d573c8e441 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-10 15:31:01 +0300 + + liblzma: CRC32 CLMUL: Refactor the constants and simplify + + By using modulus scaled constants, the final reduction can + be simplified. + + src/liblzma/check/crc_x86_clmul.h | 52 +++++++-------------------------------- + 1 file changed, 9 insertions(+), 43 deletions(-) + +commit ef652ac391ff7e8cda656238dc5b5f83bc1554c2 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-10 15:12:48 +0300 + + liblzma: CRC64 CLMUL: Refactor the constants + + Now it refers to crc_clmul_consts_gen.c. vfold8 was renamed to mu_p + and the p no longer has the lowest bit set (it makes no difference + as the output bits it affects are ignored). 
+ + src/liblzma/check/crc_x86_clmul.h | 43 +++++++-------------------------------- + 1 file changed, 7 insertions(+), 36 deletions(-) + +commit 9f5fc17e32bf5c7c6cfadf40c29a1dedb4cc03ac +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-10 14:45:44 +0300 + + liblzma: Add crc_clmul_consts_gen.c + + It's a standalone program that prints the required constants. + It's won't be a part of the normal build of the package. + + src/liblzma/check/Makefile.inc | 1 + + src/liblzma/check/crc_clmul_consts_gen.c | 160 +++++++++++++++++++++++++++++++ + 2 files changed, 161 insertions(+) + +commit 71b147aab7fe4a60ed57b697d5bb490f099894be +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-09 21:44:03 +0300 + + liblzma: Remove CRC_USE_GENERIC_FOR_SMALL_INPUTS + + It was already commented out. + + src/liblzma/check/crc32_fast.c | 21 --------------------- + src/liblzma/check/crc64_fast.c | 5 ----- + src/liblzma/check/crc_common.h | 14 -------------- + src/liblzma/check/crc_x86_clmul.h | 9 +-------- + 4 files changed, 1 insertion(+), 48 deletions(-) + +commit f99a7be40645f86959a5b180dfae948dd165e07c +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-09 21:03:39 +0300 + + liblzma: Remove crc_attr_no_sanitize_address + + It's not enough to silence the address sanitizer. Also memory and + thread sanitizers would need to be silenced. They, at least currently, + aren't smart enough to see that the extra bytes are discarded from + the xmm registers by later instructions. + + Valgrind is smarter, possibly because this kind of code isn't weird + to write in assembly. Agner Fog's optimizing_assembly.pdf even mentions + this idea of doing an aligned read and then discarding the extra + bytes. The sanitizers don't instrument assembly code but Valgrind + checks all code. + + It's better to change the implementation to avoid the sanitization + attributes which also look scary in the code. 
(Somehow they can look + more scary than __asm__ which is implictly unsanitized.) + + See also: + https://github.com/tukaani-project/xz/issues/112 + https://github.com/tukaani-project/xz/issues/122 + + src/liblzma/check/crc_common.h | 9 --------- + src/liblzma/check/crc_x86_clmul.h | 3 --- + 2 files changed, 12 deletions(-) + +commit ead4d151996f8a18bf9b07eb1e175c0a1590e562 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-10 15:37:49 +0300 + + Revert "Build: Temporarily disable CRC CLMUL to silence OSS Fuzz" + + This reverts commit 9f1a6d6f9a258886933a22239a5b81af34b28199. + + configure.ac | 4 +--- + 1 file changed, 1 insertion(+), 3 deletions(-) + +commit 2178acf8a4d40a93e970cfcf9b807d5ef6c8da92 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-12 14:26:44 +0300 @@ -837,24 +4471,20 @@ Date: 2024-06-12 14:26:44 +0300 There is no need to make a similar change in configure.ac. With Autoconf 2.72, the deprecated macro AC_PROG_CC_C99 is an alias for AC_PROG_CC which prefers a C11 compiler. - - (cherry picked from commit 2178acf8a4d40a93e970cfcf9b807d5ef6c8da92) CMakeLists.txt | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) -commit 849e757a8cce41bfd6acfaa7dd3b07324363de90 +commit c97e9c12fef4d1093ee2a75236742481361f50f5 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-12 14:20:21 +0300 Update THANKS - - (cherry picked from commit c97e9c12fef4d1093ee2a75236742481361f50f5) THANKS | 4 ++++ 1 file changed, 4 insertions(+) -commit 1305056a54e68895e052506bceb26274f52bbc9a +commit 89e9f12e03324b8a186e807b268f34f92d1b2f41 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-11 11:15:49 +0300 @@ -862,24 +4492,20 @@ Date: 2024-06-11 11:15:49 +0300 A similar one was already there for CRC64 but nowadays also CRC32 has a CLMUL implementation, so it's good to test it better too. 
- - (cherry picked from commit 89e9f12e03324b8a186e807b268f34f92d1b2f41) tests/test_check.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) -commit a44493ec41edc98f24ed9933668e7372f5267a40 +commit c7164b1927e3fe7cdba70ee4687e1a590a81043b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-11 22:42:26 +0300 xz: Fix white space - - (cherry picked from commit c7164b1927e3fe7cdba70ee4687e1a590a81043b) src/xz/list.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -commit 5e74a6a8138b3c102193d731120139d5a854f2cf +commit 0a32d2072c598de281058b26dc08920fbf0cd2a1 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-11 21:59:09 +0300 @@ -888,43 +4514,46 @@ Date: 2024-06-11 21:59:09 +0300 Thanks to Sam James for spotting it. Fixes: f644473a211394447824ea00518d0a214ff3f7f2 - (cherry picked from commit 0a32d2072c598de281058b26dc08920fbf0cd2a1) src/liblzma/check/crc_x86_clmul.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 3f7edc673cf21b3e4db3e2f11746905e0a393db7 +commit afd9b4d282a10186808c3331dad4caf79c02d55f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-10 15:52:26 +0300 liblzma: Fix a comment indentation - - (cherry picked from commit afd9b4d282a10186808c3331dad4caf79c02d55f) src/liblzma/check/crc_common.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -commit 8a9cc7ca0867494f39990f0d4cbe0972042f6d59 +commit 50e6bff274568c568930e15094da8217e7d47d28 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-09 22:09:12 +0300 liblzma: Fix white space - - (cherry picked from commit 50e6bff274568c568930e15094da8217e7d47d28) src/liblzma/check/crc32_table.c | 10 +++++----- src/liblzma/check/crc_x86_clmul.h | 6 +++--- src/liblzma/check/sha256.c | 2 +- 3 files changed, 9 insertions(+), 9 deletions(-) -commit b29b13082fe578a3bb9384a5939c82055f796a34 +commit caea7844d3824755d053b4743c4913d73ac2db3d +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-01 14:25:29 +0300 + + 
tuklib: __STDC_VERSION__ in C23 is 202311 + + src/common/tuklib_common.h | 4 +--- + 1 file changed, 1 insertion(+), 3 deletions(-) + +commit 9e73918a4f14be754a23f74dda45ca431939a4a0 Author: RainRat <rainrat78@yahoo.ca> Date: 2024-06-05 15:21:49 -0700 Fix typos Closes: https://github.com/tukaani-project/xz/pull/124 - (cherry picked from commit 9e73918a4f14be754a23f74dda45ca431939a4a0) INSTALL | 2 +- doc/examples/03_compress_custom.c | 2 +- @@ -934,7 +4563,7 @@ Date: 2024-06-05 15:21:49 -0700 tests/test_filter_str.c | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) -commit 6f66155e01a6467e70db48cddbe790bdb8d87754 +commit 04b23addf3733873667675df2439725f076c2f36 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-07 15:47:20 +0300 @@ -956,23 +4585,55 @@ Date: 2024-06-07 15:47:20 +0300 Co-authored-by: Christian Weisgerber <naddy@mips.inka.de> Co-authored-by: Brad Smith <brad@comstyle.com> Closes: https://github.com/tukaani-project/xz/pull/126 - (cherry picked from commit 04b23addf3733873667675df2439725f076c2f36) src/common/tuklib_integer.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) -commit 5522759d31e0f1513fffbdf39a955f12d373f121 +commit dc03f6290f5b9bd3d50c7e12e58dee870889d599 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-07 15:06:59 +0300 + + liblzma: Add ARM64 CRC32 instruction support detection on OpenBSD + + The C code is from Christian Weisgerber, I merely reordered the OSes. + Then I added the build system checks without testing them. + + Also thanks to Brad Smith who submitted a similar patch on GitHub + a few hours after Christian had sent his via email. 
+ + Co-authored-by: Christian Weisgerber <naddy@mips.inka.de> + Closes: https://github.com/tukaani-project/xz/pull/125 + + CMakeLists.txt | 6 ++++++ + configure.ac | 9 +++++++++ + src/liblzma/check/crc32_arm64.h | 15 +++++++++++++++ + src/liblzma/check/crc_common.h | 1 + + 4 files changed, 31 insertions(+) + +commit f5c2ae58ec68c665e62c790b842657afcb31474c Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-05 13:55:43 +0300 Update THANKS - - (cherry picked from commit f5c2ae58ec68c665e62c790b842657afcb31474c) THANKS | 2 ++ 1 file changed, 2 insertions(+) -commit 45aed6f37f17e5fac215290204e03894965cf1d5 +commit e5491dfab9c54dc7078a8d3d07fabb91d6e06418 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-05 13:42:47 +0300 + + CMake: Include the "alpha" or "beta" suffix in PACKAGE_VERSION + + This way the version string gets into xzgrep and other scripts + in full and also into liblzma.pc. + + For the project() command, a suffixless string is required though. + + CMakeLists.txt | 16 +++++++++++++--- + 1 file changed, 13 insertions(+), 3 deletions(-) + +commit 1d3c61575fda0be6b2d50c9e32a343349d5cd5c0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-05 13:30:28 +0300 @@ -986,12 +4647,23 @@ Date: 2024-06-05 13:30:28 +0300 was used as the fallback. It has the same value as xz_VERSION. Fixes: 7e3493d40eac0c3fa3d5124097745a70e15c41f6 - (cherry picked from commit 1d3c61575fda0be6b2d50c9e32a343349d5cd5c0) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 198271a6ed0e6ac6820f8f44172a203aa44abe39 +commit 5d1c649ba9eb7a5b9371252ebfbc2911dc774e69 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-06-05 12:59:59 +0300 + + CMake: Set only "prefix" as an absolute path in liblzma.pc + + CMake provides variables that are relative to CMAKE_INSTALL_PREFIX + so use them instead of repeating the full path. 
+ + CMakeLists.txt | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit e0d6d05ce0d464e966c0669bbf869202a43cc2f7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-04 23:59:29 +0300 @@ -1027,13 +4699,11 @@ Date: 2024-06-04 23:59:29 +0300 See the discussion: https://github.com/microsoft/vcpkg/pull/39024 Thanks to Vincent Torri for confirming the naming issue on Cygwin. - - (cherry picked from commit e0d6d05ce0d464e966c0669bbf869202a43cc2f7) CMakeLists.txt | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-) -commit 92e5425979199407080fd80e67c15f2cbf85392b +commit e7a42cda7c827e016619e8cab15e2faf5d4181ae Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-03 16:55:03 +0300 @@ -1062,35 +4732,29 @@ Date: 2024-06-03 16:55:03 +0300 in this commit are smaller and should have a smaller risk for regressions. It's also possible that version.sh will be dropped entirely at some point. - - (cherry picked from commit e7a42cda7c827e016619e8cab15e2faf5d4181ae) build-aux/version.sh | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) -commit 0c089a33a5b1f5b9451b332484c68e1d6f02631a +commit a61c9ab4751f2710dcd5459c7d74bbf20781f0f9 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-03 17:07:11 +0300 CI: Don't require po4a on Solaris - - (cherry picked from commit a61c9ab4751f2710dcd5459c7d74bbf20781f0f9) .github/workflows/solaris.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 83d3792711295656a3de69bbcd98dcb4b06be1c2 +commit 5229bdf5335ce18ed54beb7e646e39927663be86 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-03 15:08:15 +0300 CI: Use set -e on Solaris too - - (cherry picked from commit 5229bdf5335ce18ed54beb7e646e39927663be86) .github/workflows/solaris.yml | 1 + 1 file changed, 1 insertion(+) -commit 9c64d4fd787ea7bca3795be55367504a9f47a68c +commit afa938e429c1ce07d26d02999352fb014b62ff3d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-06-03 
17:44:50 +0300 @@ -1103,12 +4767,11 @@ Date: 2024-06-03 17:44:50 +0300 See: https://github.com/microsoft/vcpkg/blob/eb895b95aac6fd7485373702f29f508c42a180a0/ports/liblzma/portfile.cmake https://github.com/microsoft/vcpkg/pull/39024#issuecomment-2145064670 - (cherry picked from commit afa938e429c1ce07d26d02999352fb014b62ff3d) CMakeLists.txt | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) -commit 42754176bd84c4539db55a9e70bdcdd5700c709f +commit 35f8649f08341639a627fd06350e938124ca3622 Author: Sam James <sam@gentoo.org> Date: 2024-06-03 06:16:23 +0100 @@ -1120,8 +4783,6 @@ Date: 2024-06-03 06:16:23 +0100 Maybe going forward we can limit this further by only being paranoid for the jobs with any access to tokens. - - (cherry picked from commit 35f8649f08341639a627fd06350e938124ca3622) .github/workflows/ci.yml | 4 ++-- .github/workflows/freebsd.yml | 2 +- @@ -1131,30 +4792,27 @@ Date: 2024-06-03 06:16:23 +0100 .github/workflows/windows-ci.yml | 4 ++-- 6 files changed, 8 insertions(+), 8 deletions(-) -commit 9a5fee7022eddffdfcee32a7e43f64635581b393 +commit e885dae37ff5b1dbc760dabc1e03e866a7302ef2 Author: Christoph Junghans <christoph.junghans@gmail.com> Date: 2024-04-30 07:49:26 -0600 ci: set -e on openbsd Closes: https://github.com/tukaani-project/xz/pull/116 - (cherry picked from commit e885dae37ff5b1dbc760dabc1e03e866a7302ef2) .github/workflows/openbsd.yml | 1 + 1 file changed, 1 insertion(+) -commit a2d66de54f234999a7d42305988cf2c3e0b1b8f6 +commit 21b02dd128cf9e8c76325ec124f70381862dcf19 Author: Christoph Junghans <christoph.junghans@gmail.com> Date: 2024-04-30 07:48:58 -0600 ci: set -e on netbsd - - (cherry picked from commit 21b02dd128cf9e8c76325ec124f70381862dcf19) .github/workflows/netbsd.yml | 1 + 1 file changed, 1 insertion(+) -commit 1bdc70176b59b0e22c0a580c518dc5d0f2fd0723 +commit 8641f0c24c041136670c975b23408184b45431bc Author: Christoph Junghans <christoph.junghans@gmail.com> Date: 2024-04-25 14:56:06 -0700 @@ -1163,38 +4821,33 @@ Date: 
2024-04-25 14:56:06 -0700 Without "set -e" the job will always be successful. See vmactions/freebsd-vm#72 - - (cherry picked from commit 8641f0c24c041136670c975b23408184b45431bc) .github/workflows/freebsd.yml | 1 + 1 file changed, 1 insertion(+) -commit 4132277103acdf1c01f8b5a4c12c0992c330ade4 +commit ef616683ef11f11ffdfbe0624da33905e28a70f9 Author: Andrew Murray <radarhere@users.noreply.github.com> Date: 2024-04-25 09:24:46 +1000 Updated actions Closes: https://github.com/tukaani-project/xz/pull/115 - (cherry picked from commit ef616683ef11f11ffdfbe0624da33905e28a70f9) .github/workflows/ci.yml | 4 ++-- .github/workflows/windows-ci.yml | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) -commit 1575414636104773cefc62cf075726c6ee7ae37d +commit 57b440d316da9ac9cb312ee7e6890f5382556f10 Author: Sam James <sam@gentoo.org> Date: 2024-06-03 02:49:40 +0100 ci: add po4a - - (cherry picked from commit 57b440d316da9ac9cb312ee7e6890f5382556f10) .github/workflows/netbsd.yml | 2 +- .github/workflows/openbsd.yml | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) -commit c3e293037e1bb2bd9efedbb0e75387d1282cc03f +commit 08cdf4be9a673d78efe393b53dd73bf43c81dd95 Author: Sam James <sam@gentoo.org> Date: 2024-04-13 21:02:04 +0100 @@ -1203,13 +4856,11 @@ Date: 2024-04-13 21:02:04 +0100 Inspired by https://github.com/RsyncProject/rsync/commit/3f2a38b01184cae9a931280b534acf5a3dae2e94. It runs on Solaris 5.11 via a VirtualBox VM. - - (cherry picked from commit 08cdf4be9a673d78efe393b53dd73bf43c81dd95) .github/workflows/solaris.yml | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) -commit dc6b6011b45b0d0ddd0650f4885e24c68b37fddf +commit b69768c8bd1a34fde311935c551d061ba52d9a3f Author: Sam James <sam@gentoo.org> Date: 2024-04-14 08:08:00 +0100 @@ -1227,61 +4878,32 @@ Date: 2024-04-14 08:08:00 +0100 It's presumably because of older gettext missing format attributes. This is with `gcc (GCC) 7.3.0`. 
- - (cherry picked from commit b69768c8bd1a34fde311935c551d061ba52d9a3f) src/xz/list.c | 7 +++++++ 1 file changed, 7 insertions(+) -commit 7ce2ac795a812ecf1eb2d6b62f51b55ac799c2a5 +commit bb90e1f66d9beb490c4c99763e79519045968710 Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-05-31 21:36:26 +0300 +Date: 2024-06-03 11:44:28 +0300 - Update THANKS + license-check.sh: Fix reporting of unclear license info - (cherry picked from commit b8d134e61ede9f4a296226d97f5c20721fb4e8e2) - - THANKS | 3 +++ - 1 file changed, 3 insertions(+) + The main feature was broken because an old variable name hadn't + been updated to match the rest of the script. -commit 3ec664d3f652133136587a51d4505b1abe1acdd7 -Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-05-29 18:03:51 +0300 - - Bump version and soname for 5.6.2 - - src/liblzma/Makefile.am | 2 +- - src/liblzma/api/lzma/version.h | 2 +- - 2 files changed, 2 insertions(+), 2 deletions(-) - -commit 3cc0aa702e50b786c52c6f3d3f831a635c4df197 -Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-05-29 18:03:04 +0300 - - Add NEWS for 5.6.2 - - NEWS | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ - 1 file changed, 130 insertions(+) - -commit 526d3f7f2c2d5e134157d08b37fb5fd0b125799e -Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-05-29 18:03:04 +0300 - - Add NEWS for 5.4.7 - - NEWS | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ - 1 file changed, 89 insertions(+) + build-aux/license-check.sh | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) -commit 660b09279e8f544acf120d29194d5c3051b484eb +commit b8d134e61ede9f4a296226d97f5c20721fb4e8e2 Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-05-29 18:03:04 +0300 +Date: 2024-05-31 21:36:26 +0300 - Add NEWS for 5.2.13 + Update THANKS - NEWS | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ - 1 file changed, 115 insertions(+) + THANKS | 3 +++ + 1 file changed, 3 
insertions(+) -commit 7d76282dac766c0ced8ae24e0f7ce0005f3e377d +commit 162587d3fb3fcedc6eee61eda3ccaaf60c80f0de Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-29 17:47:13 +0300 @@ -1298,7 +4920,7 @@ Date: 2024-05-29 17:47:13 +0300 po4a/uk.po | 1592 ++++++++++--------- 6 files changed, 6114 insertions(+), 9521 deletions(-) -commit 4470c3f7d8954bb47b280ec07ad0bd4be2223083 +commit 50cd8ed002473c5cd53980e70a53e5e6ad646ffe Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-29 17:44:53 +0300 @@ -1336,7 +4958,34 @@ Date: 2024-05-29 17:44:53 +0300 po/zh_TW.po | 558 ++++++++++++++++++++++++--------------- 23 files changed, 7257 insertions(+), 5132 deletions(-) -commit 33b8a85face5392b5ac843bdbe3a72f024cad6ef +commit 16dbd865c8833462e1604a1e13f7effe55bb3fe6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-29 18:03:04 +0300 + + Add NEWS for 5.6.2 + + NEWS | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 130 insertions(+) + +commit a0eeb5f9369c43508610dcf00140edb8e2be92a6 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-29 18:03:04 +0300 + + Add NEWS for 5.4.7 + + NEWS | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 89 insertions(+) + +commit 9b476fb93a9672f2e70b56e3e9c7e9cfedd6c162 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-29 18:03:04 +0300 + + Add NEWS for 5.2.13 + + NEWS | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 115 insertions(+) + +commit 9284f1aea31f0eb23e2ea72f7218b271e2234762 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-29 16:33:24 +0300 @@ -1373,14 +5022,12 @@ Date: 2024-05-29 16:33:24 +0300 Distribution tarballs will still have non-reproducible POT-Creation-Date in po/xz.pot and po4a/xz-man.pot but those are just two files. Even they could be made reproducible from a Git timestamp if desired. 
- - (cherry picked from commit 9284f1aea31f0eb23e2ea72f7218b271e2234762) Makefile.am | 3 ++- po/Makevars | 6 +++++- 2 files changed, 7 insertions(+), 2 deletions(-) -commit 09daebd66b55799bbc495b84310a86c91bbfc1c8 +commit 4beba1cd62d7f8f7a6f1e899b68292d94c53b599 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 21:10:33 +0300 @@ -1397,72 +5044,86 @@ Date: 2024-05-28 21:10:33 +0300 The --add-location=file option was removed as redundant. The line numbers don't exist in the .pot file due to --porefs file and thus they cannot get copied to the .po files either. - - (cherry picked from commit 4beba1cd62d7f8f7a6f1e899b68292d94c53b599) po4a/update-po | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) -commit 51ad72dae4e516e9292f6f399bd1e4970b77f7c1 +commit b14c130a58a649f9a73392eeb122cb252327c569 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 18:36:53 +0300 Update contact info in README - - (cherry picked from commit b14c130a58a649f9a73392eeb122cb252327c569) README | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) -commit 18463917f9b255b8f925fa54ab9388319735b14a +commit 75f5f2e014b0ee646963f36bc6a9c840fb272353 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 13:25:07 +0300 Translations: Use --package-name=xz-man with po4a This is to match reality. See the added comment. - - (cherry picked from commit 75f5f2e014b0ee646963f36bc6a9c840fb272353) po4a/update-po | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) -commit 26bbcb13cd2bbb56fe406544a484b4edfc7e0837 +commit eb217d016cfbbba1babc19a61095b3ea25898af6 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 13:03:40 +0300 Translations: Omit --package-name from po/Makevars This is closer to the reality in the po/*.po files. 
- - (cherry picked from commit eb217d016cfbbba1babc19a61095b3ea25898af6) po/Makevars | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) -commit c35ee804b89556d15bc8cdc16867f4316e69392f +commit d28a4b2520adeeaa1b9e921bf42c7c1f36552c06 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-27 17:45:51 +0300 + + license-check.sh: Use '--' with slightly untrusted filenames + + Names from git ls-files should be safe but if one runs it on + a tree without the .git dir and there are extra files, it's + safer to have the end of arguments marked with '--'. + + build-aux/license-check.sh | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +commit fda0ec862a34094cf23fc25d0e0a95858c3a3ab5 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-27 17:41:37 +0300 + + license-check.sh: Use xargs -0 instead of -d + + Neither are in POSIX but -0 is much more portable in practice. + + Despite the old comment, the grep usage should be portable already. + + build-aux/license-check.sh | 11 ++++++----- + 1 file changed, 6 insertions(+), 5 deletions(-) + +commit 9114267038deaecf4832a5cacb5acbe6591ac839 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 01:17:45 +0300 Translations: Omit man page line numbers from .pot and .po files - - (cherry picked from commit 9114267038deaecf4832a5cacb5acbe6591ac839) po4a/update-po | 5 +++++ 1 file changed, 5 insertions(+) -commit 0f4429d47f9cfe2cdfbad115a7bc2f11221cb217 +commit 093490b58271e9424ce38a7b1b38bcf61b9c86c6 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 01:06:30 +0300 Translations: Use the xgettext option --add-location=file - - (cherry picked from commit 093490b58271e9424ce38a7b1b38bcf61b9c86c6) po/Makevars | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -commit a93e2c2d1d34a6f609d24a8e62072ce78df7a734 +commit fccebe2b4fd513488fc920e4dac32562ed3c7637 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-28 00:43:53 +0300 @@ -1473,24 +5134,20 @@ Date: 2024-05-28 
00:43:53 +0300 The option is available since gettext 0.19 (2014). configure.ac requires 0.19.6. - - (cherry picked from commit fccebe2b4fd513488fc920e4dac32562ed3c7637) po/Makevars | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit d4389895592e9a8e0f6391fdad816ae0537bb07b +commit f361d9ae85707a87eb28db400eb7229cec103d58 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-27 12:22:08 +0300 Build: Use $(SHELL) instead of sh to run scripts in Makefile.am - - (cherry picked from commit f361d9ae85707a87eb28db400eb7229cec103d58) - Makefile.am | 10 +++++----- - 1 file changed, 5 insertions(+), 5 deletions(-) + Makefile.am | 14 +++++++------- + 1 file changed, 7 insertions(+), 7 deletions(-) -commit 5781414b6e3120098b0060d073aa2b0580ff6f40 +commit a26dece34793a09aac2476f954d162d03e9cf62b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-23 17:25:13 +0300 @@ -1501,8 +5158,6 @@ Date: 2024-05-23 17:25:13 +0300 and translated strings are identical in this case so it wouldn't matter. But patching the translations helps still because then po4a will show the correct translation percentage. - - (cherry picked from commit a26dece34793a09aac2476f954d162d03e9cf62b) po4a/de.po | 8 ++++---- po4a/fr.po | 4 ++-- @@ -1512,7 +5167,7 @@ Date: 2024-05-23 17:25:13 +0300 po4a/uk.po | 8 ++++---- 6 files changed, 18 insertions(+), 18 deletions(-) -commit 3670e0616eb9d86e7519d2b76242fd32c6e0c1ae +commit 24387c234b4eed1ef9a7eaa107391740b4095568 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-23 15:15:18 +0300 @@ -1526,43 +5181,76 @@ Date: 2024-05-23 15:15:18 +0300 On top of this, if the code is run on modern processors that support the CLMUL instruction, then the C code should be faster (but then one should also be using a x86-64 build if possible). 
- - (cherry picked from commit 24387c234b4eed1ef9a7eaa107391740b4095568) CMakeLists.txt | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-) -commit c1b001b09e902ecacabb8a2ae1fc991018a4d1f8 +commit 0fb3c9c3f684f5a25bd425ed079a20a79f0c969d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-23 14:26:45 +0300 CMake: Rename USE_DOXYGEN to ENABLE_DOXYGEN It's more consistent with the other option() uses. - - (cherry picked from commit 0fb3c9c3f684f5a25bd425ed079a20a79f0c969d) CMakeLists.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit 7213fe39c717d4623c92af715484a71d9a6ff8d0 +commit 6bbec3bda02bf87d24fa095074456e723589921f +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-22 15:21:53 +0300 + + Mention license-check.sh in COPYING + + COPYING | 6 ++++++ + 1 file changed, 6 insertions(+) + +commit 62733592a1cc6f0b41f46ef52e06d1a6fe1ff38a Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-22 15:21:53 +0300 Use more confident language in COPYING - - (cherry picked from commit 62733592a1cc6f0b41f46ef52e06d1a6fe1ff38a) COPYING | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) -commit 15358be94a4e3f9c20f331b64b3980f3e5283760 +commit a119a4209e8827e1d7c2cfd30cb9f5a9b76f9dff +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-22 15:21:53 +0300 + + Build: Run license-check.sh in "mydist" and "dist-hook" + + In mydist the point is to check using the file list from the Git + repository. In dist-hook it is to check that the TARBALL_IGNORE + patterns work when the .git dir or the "git" command aren't available. + + Refuse to create a distribution tarball if license issues are found. 
+ + Makefile.am | 2 ++ + 1 file changed, 2 insertions(+) + +commit f3434ecfcb45154508752986f4fc670b8f0555dc +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-22 15:21:53 +0300 + + Add build-aux/license-check.sh + + This helps in spotting files that lack SPDX license identifier + and which haven't been explicitly white listed either. The script + requires the .git directory to be present as only the files that + are in the Git repository are checked. + + XZ Utils isn't FSFE REUSE compliant for now. + + Makefile.am | 1 + + build-aux/license-check.sh | 174 +++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 175 insertions(+) + +commit 9ae2ebc1e504a1814b0788de95fb5c58c0328dde Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-29 17:16:38 +0300 Add SPDX license identifiers to files under tests/ossfuzz - - (cherry picked from commit 9ae2ebc1e504a1814b0788de95fb5c58c0328dde) tests/ossfuzz/Makefile | 2 ++ tests/ossfuzz/config/fuzz_decode_alone.options | 2 ++ @@ -1572,18 +5260,16 @@ Date: 2024-04-29 17:16:38 +0300 tests/ossfuzz/config/fuzz_xz.dict | 2 ++ 6 files changed, 12 insertions(+) -commit 1aa92c7ffd0bf8f9738ebf3bd1263bd6f5f096a2 +commit 9000d70eb9815bd7f43ffddc1c3316c507aa0e05 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-29 17:16:06 +0300 Add SPDX license identifier to .codespellrc - - (cherry picked from commit 9000d70eb9815bd7f43ffddc1c3316c507aa0e05) .codespellrc | 2 ++ 1 file changed, 2 insertions(+) -commit 3c7e400fdcabc0a1b78863948fc17964667a9401 +commit 903c16fcfa5bfad0cdb2a7383d941243bcb12e76 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-22 15:12:09 +0300 @@ -1592,49 +5278,90 @@ Date: 2024-05-22 15:12:09 +0300 The po4a directory is in EXTRA_DIST and thus all files there are included in the package. .gitignore doesn't belong in the package so keep that file out of the po4a directory. 
- - (cherry picked from commit 903c16fcfa5bfad0cdb2a7383d941243bcb12e76) .gitignore | 4 ++++ po4a/.gitignore | 3 --- 2 files changed, 4 insertions(+), 3 deletions(-) -commit 8a99272d4a9358dabdb5bc0b72f4c5240a9dc066 +commit 56f1d5ed68e84ba5dfa328ea2291b8f46c995125 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 - CMake: Add comments + Tests: Make the config.h grep patterns Meson compatible + + Now the test scripts detect both + + #define HAVE_DECODER_ARM + #define HAVE_DECODER_ARM 1 + + as support for the ARM filter without confusing it with these: + + #define HAVE_DECODER_ARM64 + #define HAVE_DECODER_ARM64 1 + + Previously only the ones ending with " 1" were accepted for + the macros where this kind of confusion was possible. + + This should help with Meson support because Meson's built-in + features produce config.h entries that are either + + #define FOO 1 + #define FOO 0 + + or: + + #define FOO + #undef FOO + + The former method has a benefit that one can use "#if FOO" and -Wundef + will catch if a #define is missing (for example, it helps catching + typos). But XZ Utils has to use the latter since it has been + convenient with Autoconf's default behavior.[*] While it's easy to + emulate the Autoconf style (#define FOO 1 vs. no #define at all) + in Meson, it results in clumsy code. Thus it's better to change + the few places in the tests where this difference matters. - (cherry picked from commit 9d997d6f9d4f042412e45c7b7a23a14ad2e4f9aa) + [*] While most checks in Autoconf default to the second style above, + a few things use the first style (like AC_CHECK_DECLS). The mix + of both styles is the most confusing as one has to remember which + macro needs #ifdef and which #if. Currently HAVE_VISIBILITY is + only such config.h entry that is 1 or 0. It comes unmodified + from Gnulib's visibility.m4. 
+ + tests/test_compress.sh | 4 ++-- + tests/test_files.sh | 2 +- + 2 files changed, 3 insertions(+), 3 deletions(-) + +commit 9d997d6f9d4f042412e45c7b7a23a14ad2e4f9aa +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-20 16:55:00 +0300 + + CMake: Add comments tests/tests.cmake | 2 ++ 1 file changed, 2 insertions(+) -commit c35259c9e2400f6f88c269d95ecafdb223ff45d2 +commit d35368b33e54bad2f566df99fac29ffea38e34de Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 CMake: Remove the note that some tests aren't run They are now in the common build configurations. - - (cherry picked from commit d35368b33e54bad2f566df99fac29ffea38e34de) CMakeLists.txt | 2 -- 1 file changed, 2 deletions(-) -commit 30982a215395f19b3837c3da540e1cb3f913569f +commit dc232d584619b2819a9c52d6ad5d8b5d56b392ba Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 CMake: Add support for test_files.sh - - (cherry picked from commit dc232d584619b2819a9c52d6ad5d8b5d56b392ba) tests/tests.cmake | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) -commit 3a8f81e0ad4cd1c102a03ff09e703cf8cb074afc +commit a7e9230af9d1f87f474fe38886eb977d4149dc9b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 @@ -1645,24 +5372,20 @@ Date: 2024-05-20 16:55:00 +0300 If ../config.h doesn't exist, assume that all encoders and decoders are available. 
- - (cherry picked from commit a7e9230af9d1f87f474fe38886eb977d4149dc9b) tests/test_files.sh | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) -commit 0644675c829143112c85455f8a6aa91bfc4e1bbb +commit b40e6efbb48d740b9b5b303e59e344801cbb5bd8 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 CMake: Add support for test_compress.sh tests - - (cherry picked from commit b40e6efbb48d740b9b5b303e59e344801cbb5bd8) tests/tests.cmake | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) -commit dcc02a6ca0e0ac4e330e820683754badbcf9815b +commit ac3222d2cb1ff3a15eb6d58f9ea9bc78e8bc3bb2 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 @@ -1682,13 +5405,11 @@ Date: 2024-05-20 16:55:00 +0300 Use the default check type instead of forcing CRC32 or CRC64. Now the script doesn't need to check if CRC64 is available. - - (cherry picked from commit ac3222d2cb1ff3a15eb6d58f9ea9bc78e8bc3bb2) tests/test_compress.sh | 41 +++++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-) -commit c761b7051fb2ebb6da3cbecafe695fb5af7b2c9c +commit 006040b29c83104403621e950ada0c8956c56b3d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 @@ -1698,32 +5419,26 @@ Date: 2024-05-20 16:55:00 +0300 features were built but the CMake build doesn't create config.h. So instead those test scripts will be run only when all relevant features have been enabled. 
- - (cherry picked from commit 006040b29c83104403621e950ada0c8956c56b3d) tests/tests.cmake | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) -commit a71bc2d75b95f85fe046f0fd1fb25d36be2b20ba +commit 6167607a6ea72fb74eefb943c4566e3cab528cd2 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-20 16:55:00 +0300 Tests: test_suffix.sh: Add a comment - - (cherry picked from commit 6167607a6ea72fb74eefb943c4566e3cab528cd2) tests/test_suffix.sh | 3 +++ 1 file changed, 3 insertions(+) -commit 8fda5ce872632e464a1f9660b3ab8dac939a03c6 +commit 4e9023857d287f624562156b60dc23d2b64c0f10 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-18 00:34:07 +0300 Fix typos Thanks to xx on #tukaani. - - (cherry picked from commit 4e9023857d287f624562156b60dc23d2b64c0f10) src/common/mythread.h | 2 +- src/common/tuklib_integer.h | 2 +- @@ -1733,53 +5448,71 @@ Date: 2024-05-18 00:34:07 +0300 src/scripts/xzgrep.in | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) -commit 2729079bcb8dd1c3ab1a79426690d17f6f8e6f7d +commit b14d08fbbc254485ace9ccfe7908674f608a62ae Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-18 00:23:52 +0300 liblzma: Fix white space Thanks to xx on #tukaani. - - (cherry picked from commit b14d08fbbc254485ace9ccfe7908674f608a62ae) src/liblzma/simple/simple_coder.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) -commit a289c4dfeb3ded35e129c48b13f46605f0138704 +commit 9f1a6d6f9a258886933a22239a5b81af34b28199 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-05-15 23:14:17 +0300 + + Build: Temporarily disable CRC CLMUL to silence OSS Fuzz + + The code makes aligned 16-byte reads which may read up to 15 bytes + before the beginning or past the end of the buffer if the buffer + is misaligned. The unneeded bytes are then ignored. It cannot cross + page boundaries and thus cannot cause access violations. 
+ + This inherently trips address sanitizer which was already disabled + with __attribute__((__no_sanitize_address__)). However, it also + trips memory sanitizer if the extra bytes are uninitialized because + memory sanitizer doesn't see that those bytes then get ignored by + byte shuffling in the xmm registers. + + The plan is to change the code so that all sanitizers pass but it's + not finished yet (performance shouldn't get worse) so as a temporary + measure to keep OSS Fuzz happy, the CLMUL CRC is now disabled even + though I think the code is fine to use (and easy enough to review + the memory accesses in it too). + + configure.ac | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +commit 142e670a413a7bce1a2647f1cf1f33f8ee2dbe88 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-13 17:15:04 +0300 xz: Document the static function get_chains_memusage() - - (cherry picked from commit 142e670a413a7bce1a2647f1cf1f33f8ee2dbe88) src/xz/coder.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) -commit 6f0db31713845386ce2419c55b2df89b53b80dd3 +commit 78e984399a64bfee5d11e7308e0bdbc1006db2ca Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-13 17:07:22 +0300 xz: Rename filters_memusage_max() to get_chains_memusage() - - (cherry picked from commit 78e984399a64bfee5d11e7308e0bdbc1006db2ca) src/xz/coder.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) -commit d7e2bf7e2dc9289a7a5dd0311d19d10de6d7ea1b +commit 54c3db0a83d3e67d89aba92a0957f2dce9b111a7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-13 17:04:05 +0300 xz: Rename filter_memusages to chains_memusages - - (cherry picked from commit 54c3db0a83d3e67d89aba92a0957f2dce9b111a7) src/xz/coder.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -commit 58f200b6d1dc4cbc1ab3315a359120ab6eb84878 +commit d9e1ae79ec90d6a7eafeaceaf0ece4f0c83d4417 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 22:26:30 +0300 @@
-1799,134 +5532,115 @@ Date: 2024-05-12 22:26:30 +0300 makes the most sense. Fixes: 5f0c5a04388f8334962c70bc37a8c2ff8f605e0a - (cherry picked from commit d9e1ae79ec90d6a7eafeaceaf0ece4f0c83d4417) src/xz/coder.c | 163 ++++++++++++++++++++------------------------------------- 1 file changed, 57 insertions(+), 106 deletions(-) -commit 41bdc9fa5cc2fc2a70f4331329ac724773cc2f26 +commit 0ee56983d198b776878432703de664049b1be32e Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-13 12:14:00 +0300 xz: Edit comments - - (cherry picked from commit 0ee56983d198b776878432703de664049b1be32e) src/xz/coder.h | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) -commit 52e40c1912dfdbf8c7aa85e3a4c3eb138fa73d5d +commit ec82a49c3553f7206104582dbfb8b64fa433b491 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-13 12:03:51 +0300 xz: Rename chain_idx to chain_num - - (cherry picked from commit ec82a49c3553f7206104582dbfb8b64fa433b491) src/xz/coder.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -commit 8a019633319c694423691f58c55fa23a46e45ded +commit a731a6993c34bbbd55abaf9c166718682b1da24f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 22:29:11 +0300 xz: Edit coding style - - (cherry picked from commit a731a6993c34bbbd55abaf9c166718682b1da24f) src/xz/coder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit e3ad7eda74caea29849e2e9ec01212f5f7d0f574 +commit 32eb176b89243fce3112347fe43a8ad14a9fd2be Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 22:16:05 +0300 xz: Edit comments Fixes: 5f0c5a04388f8334962c70bc37a8c2ff8f605e0a - (cherry picked from commit 32eb176b89243fce3112347fe43a8ad14a9fd2be) src/xz/coder.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) -commit 09cabae2ab47a06f6eee02419a815d4bfd0d9490 +commit b90339f4daa510d2b1b8c550f855a99667f1d004 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 21:57:49 +0300 xz: Fix grammar in a comment Fixes: 
cb3111e3ed84152912b5138d690c8d9f00c6ef02 - (cherry picked from commit b90339f4daa510d2b1b8c550f855a99667f1d004) src/xz/coder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit c10b66fbf9b2442741a1f052bdb4ce7009af9cda +commit 4c0bdaf13d651b22ba13bd93f8379724d6ccdc13 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 21:46:56 +0300 xz: Rename filter_memusages to encoder_memusages - - (cherry picked from commit 4c0bdaf13d651b22ba13bd93f8379724d6ccdc13) src/xz/coder.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) -commit 9132ce3564b2c003bffd6de6294a3d98dccf314e +commit b54aa023e0ec291b06e976e5f094ab0549e7b09b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 21:42:05 +0300 xz: Edit coding style - - (cherry picked from commit b54aa023e0ec291b06e976e5f094ab0549e7b09b) src/xz/coder.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) -commit d642e13874e93b03959d1de523f1c8ebe9428838 +commit 49f67d3d3f42b640a7dfc4ca04c8934f658e10ce Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 21:31:02 +0300 xz: Rename filters_index to chain_num The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68. - - (cherry picked from commit 49f67d3d3f42b640a7dfc4ca04c8934f658e10ce) src/xz/args.c | 8 ++++---- src/xz/coder.c | 8 ++++---- src/xz/coder.h | 2 +- 3 files changed, 9 insertions(+), 9 deletions(-) -commit 47599f3b73f0a2bc18e0a8367d723f1eb0f11b63 +commit ff9e8b3d069ecfa52ec43dcdb198542d1692a492 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 21:22:43 +0300 xz: Replace a few uint32_t with "unsigned" to reduce the number of casts These hold only tiny values. 
- - (cherry picked from commit ff9e8b3d069ecfa52ec43dcdb198542d1692a492) src/xz/args.c | 2 +- src/xz/coder.c | 17 ++++++++--------- src/xz/coder.h | 2 +- 3 files changed, 10 insertions(+), 11 deletions(-) -commit 8f5ab75c454ea8676ed09c7f6eda8afe87b008ad +commit b5e6c1113b1ba02c282bd9163eccdb521c937a78 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 21:10:45 +0300 xz: Rename filters_used_mask to chains_used_mask The reason is the same as in bd0782c1f13e52cd0fd8415208e30e47004a4c68. - - (cherry picked from commit b5e6c1113b1ba02c282bd9163eccdb521c937a78) src/xz/coder.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) -commit 3eb7cf9dd5b90a074f741234225d7de51ad88774 +commit 32500dfaadae2ea36fda2e17b49ae7d9ac1acf52 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 17:14:43 +0300 @@ -1936,12 +5650,11 @@ Date: 2024-05-12 17:14:43 +0300 of the filter chain handling. Fixes: d6af7f347077b22403133239592e478931307759 - (cherry picked from commit 32500dfaadae2ea36fda2e17b49ae7d9ac1acf52) src/xz/coder.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) -commit 067961ee0e1adaa66a43fbf8c3be31697554a839 +commit ad146b1f42bbb678175a503a45ce525e779f9b8b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 17:09:17 +0300 @@ -1959,58 +5672,49 @@ Date: 2024-05-12 17:09:17 +0300 This also renames "filter_idx" to "chain_idx" which is used as an index as in chains[chain_idx]. 
- - (cherry picked from commit ad146b1f42bbb678175a503a45ce525e779f9b8b) src/xz/coder.c | 68 +++++++++++++++++++++++++++++----------------------------- 1 file changed, 34 insertions(+), 34 deletions(-) -commit 6822f6f891d43c97ea379a51223ce8ea69439161 +commit 5a4ae4e4d0105404184e9a82ee08f94e1b7783e0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 16:56:15 +0300 xz: Clean up a comment - - (cherry picked from commit 5a4ae4e4d0105404184e9a82ee08f94e1b7783e0) src/xz/coder.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) -commit 0e5e3e7bdcfcdc4b4607665ff0f6ad794e5195af +commit 2de80494ed9a4dc7db395a32a5efb770ce769804 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 16:52:09 +0300 xz: Add clarifying assertions - - (cherry picked from commit 2de80494ed9a4dc7db395a32a5efb770ce769804) src/xz/coder.c | 4 ++++ 1 file changed, 4 insertions(+) -commit 77bcf6b76a26833923e62b2dec717474d5d44700 +commit 1eaad004bf7748976324672db028e34f42802e61 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-10 20:23:33 +0300 xz: Add a clarifying assertion Fixes: 5f0c5a04388f8334962c70bc37a8c2ff8f605e0a - (cherry picked from commit 1eaad004bf7748976324672db028e34f42802e61) src/xz/coder.c | 1 + 1 file changed, 1 insertion(+) -commit df3efc058a256629ea0153b4750d3df308757038 +commit 605094329b986244833c967c04963cacc41a868d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 16:47:17 +0300 xz: Clarify a comment - - (cherry picked from commit 605094329b986244833c967c04963cacc41a868d) src/xz/coder.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -commit 4ebfe11cd33439675f03e1e3725abf03d6f8251b +commit 8fac2577f2dbb9491afd8500f60d004c9071df3b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 16:28:25 +0300 @@ -2018,59 +5722,49 @@ Date: 2024-05-12 16:28:25 +0300 This is slightly simpler and it avoids looping through the opt_block_list array. 
- - (cherry picked from commit 8fac2577f2dbb9491afd8500f60d004c9071df3b) src/xz/coder.c | 95 ++++++++++++++++++++++++---------------------------------- 1 file changed, 39 insertions(+), 56 deletions(-) -commit bfea6913618357a7034a1d79079bccb688262124 +commit 81d350dab864b985b740742772f3b132d4c52914 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 15:48:45 +0300 xz: Remember the filter chains and the largest Block in parse_block_list() - - (cherry picked from commit 81d350dab864b985b740742772f3b132d4c52914) src/xz/args.c | 18 ++++++++++++++++++ src/xz/coder.c | 2 ++ src/xz/coder.h | 13 +++++++++++++ 3 files changed, 33 insertions(+) -commit d4e33e73922427a0f5277b91b239af538fd41c06 +commit 46ab56968f7dfdac187710a1223659d832fa1565 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 15:38:48 +0300 xz: Update a comment and initialization of filters_used_mask - - (cherry picked from commit 46ab56968f7dfdac187710a1223659d832fa1565) src/xz/coder.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) -commit 3c130737c9bb4a5021bb14eb19e9ceae30ffef3a +commit e89293a0baeb8663707c6b4a74fbb310ec698a8f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 15:08:10 +0300 xz: parse_block_list: Edit integer type casting - - (cherry picked from commit e89293a0baeb8663707c6b4a74fbb310ec698a8f) src/xz/args.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) -commit 40c8513b4ee42b8c0fae9b2a229e078ac7e0f87a +commit 87011e40c168255cd2edea129ee68c901770603b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-12 14:51:37 +0300 xz: Make filter_memusages a local variable - - (cherry picked from commit 87011e40c168255cd2edea129ee68c901770603b) src/xz/coder.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-) -commit cacaf25aa71cd1110cc049d037c11e4075602c35 +commit 347b412a9374e0456bef9da0d7d79174c0b6f1a5 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-10 20:33:08 
+0300 @@ -2085,12 +5779,11 @@ Date: 2024-05-10 20:33:08 +0300 calculation if verbosity level isn't high enough. Fixes: 5f0c5a04388f8334962c70bc37a8c2ff8f605e0a - (cherry picked from commit 347b412a9374e0456bef9da0d7d79174c0b6f1a5) src/xz/coder.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) -commit 3495a6b291f49079485854bb185a52c29d06cd2f +commit 31358c057c9de9d6aba96bae112b2d17942de7cb Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-10 20:22:58 +0300 @@ -2099,12 +5792,11 @@ Date: 2024-05-10 20:22:58 +0300 lzma_options_lzma.dict_size is uint32_t so use it here too. Fixes: 5f0c5a04388f8334962c70bc37a8c2ff8f605e0a - (cherry picked from commit 31358c057c9de9d6aba96bae112b2d17942de7cb) src/xz/coder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 2861d856deb557734f067c5c471d670f0b0c6684 +commit 3f71e0f3a118e1012526f94fd640a626d30cb599 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-08 21:40:07 +0300 @@ -2114,23 +5806,20 @@ Date: 2024-05-08 21:40:07 +0300 at least nice=4. It is rounded up internally by liblzma when needed. Fixes: 5cd9f0df78cc4f8a7807bf6104adea13034fbb45 - (cherry picked from commit 3f71e0f3a118e1012526f94fd640a626d30cb599) debug/translation.bash | 1 - 1 file changed, 1 deletion(-) -commit 54546babc3feb2786e541b80f9e7216b8f1bd543 +commit b05a516830095a0e1937aeb31c937fb0400408b6 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-07 20:41:28 +0300 Fix the date of NEWS for 5.4.5 - - (cherry picked from commit b05a516830095a0e1937aeb31c937fb0400408b6) NEWS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit a7e58d1fdb493d58854ac599347cf64da0cecca4 +commit 6d336aeb97b69c496ddc626af403f6f21c753658 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-07 16:21:15 +0300 @@ -2138,13 +5827,11 @@ Date: 2024-05-07 16:21:15 +0300 This fixes the syntax of the "serial" line and renames a temporary variable. 
- - (cherry picked from commit 6d336aeb97b69c496ddc626af403f6f21c753658) m4/visibility.m4 | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) -commit 07a9cda037042b262ba6c8c18fae4a5b3333d508 +commit ab51e8ee610e2a893906859848f93d5cb0d5ba83 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-07 15:05:21 +0300 @@ -2154,24 +5841,20 @@ Date: 2024-05-07 15:05:21 +0300 The top-level Makefile.am puts the whole po4a directory into distribution tarball (it's simpler) so deleting these temporary files is needed to prevent them from getting into tarballs. - - (cherry picked from commit ab51e8ee610e2a893906859848f93d5cb0d5ba83) po4a/update-po | 4 ++++ 1 file changed, 4 insertions(+) -commit 1b4e7dca243d8ef297a245b5ee3ce9cd1ca20f56 +commit e4780244a17420cc95d5498cd6e02ad10eac6e5f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-07 13:12:17 +0300 xz: Edit comments and coding style - - (cherry picked from commit e4780244a17420cc95d5498cd6e02ad10eac6e5f) src/xz/coder.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) -commit 18683525a78e96ec6d7c2b4e841e94ad39be7096 +commit fe4d8b0c80eaeca3381be302eeb89aba871a7e7c Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-06 23:08:22 +0300 @@ -2180,12 +5863,11 @@ Date: 2024-05-06 23:08:22 +0300 It likely was a leftover from a development version of the code. 
Fixes: 183819bfd9efac8c184d9bf123325719b7eee30f - (cherry picked from commit fe4d8b0c80eaeca3381be302eeb89aba871a7e7c) src/xz/coder.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) -commit 005f0398645b0342c9c1915d422743c77ec1d435 +commit 9bef5b8d17dd5e009d6a6b2becc2dc535da53937 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-06 23:04:31 +0300 @@ -2195,35 +5877,31 @@ Date: 2024-05-06 23:04:31 +0300 Fixes: 5f0c5a04388f8334962c70bc37a8c2ff8f605e0a Fixes: 479fd58d60622331fcbe48fddf756927b9f80d9a - (cherry picked from commit 9bef5b8d17dd5e009d6a6b2becc2dc535da53937) src/xz/coder.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) -commit 34be4e6aa62376314fde250ea4f142c18274272f +commit de06b9f0c0a3f72569829ecadbc9c0a3ef099f57 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-06 23:00:09 +0300 liblzma: Omit an unneeded array from the x86 filter Fixes: 6aa2a6deeba04808a0fe4461396e7fb70277f3d4 - (cherry picked from commit de06b9f0c0a3f72569829ecadbc9c0a3ef099f57) src/liblzma/simple/x86.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) -commit 79e329b771210c30ea317dd4d99e8968f3e6f9b2 +commit 7da488cb933fdf51cfc14cb5810beb0766224380 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-06 22:56:31 +0300 CMake: Add test_suffix.sh to the tests - - (cherry picked from commit 7da488cb933fdf51cfc14cb5810beb0766224380) tests/tests.cmake | 13 +++++++++++++ 1 file changed, 13 insertions(+) -commit 86f33bb90c6cfe6950f1d36c9e5dd7fdc9798124 +commit a805594ed0b4cbf7b81aa28ff46a8ab3c83c6876 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-06 22:55:54 +0300 @@ -2231,13 +5909,11 @@ Date: 2024-05-06 22:55:54 +0300 It needs to find the xz executable from a different directory and work without config.h. 
- - (cherry picked from commit a805594ed0b4cbf7b81aa28ff46a8ab3c83c6876) tests/test_suffix.sh | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) -commit 1e243ab378e8f78ebb3af741fb38354954cf20f9 +commit 50e19489387774bab3c4a988397d0d9c7a142a46 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-06 20:45:34 +0300 @@ -2253,25 +5929,22 @@ Date: 2024-05-06 20:45:34 +0300 Thus it's enough to mention the need for --disable-threads as configure doesn't autodetect the lack of pthreads. - - (cherry picked from commit 50e19489387774bab3c4a988397d0d9c7a142a46) INSTALL | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-) -commit 8595b5ab3ba766eb6daed890bfe91a16fe329c2c +commit 68d18aea1422a2b86b98b71d0b019233d84e01b0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-02 23:00:16 +0300 Windows: Remove the "doc/api" line from README-Windows.txt Fixes: 252aa1d67bc015eeba462803ab72edeb7744d864 - (cherry picked from commit 68d18aea1422a2b86b98b71d0b019233d84e01b0) windows/README-Windows.txt | 2 -- 1 file changed, 2 deletions(-) -commit a3f163a4ad97189744107e964e4dea505fbcc252 +commit 8ede961374613aa302a13571d662cfaea1cf91f7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-05-02 22:59:04 +0300 @@ -2281,12 +5954,11 @@ Date: 2024-05-02 22:59:04 +0300 still built liblzma API docs with Doxygen. Fixes: d3a77ebc04bf1db8d52de2d9b0f07877bc4fd139 - (cherry picked from commit 8ede961374613aa302a13571d662cfaea1cf91f7) Makefile.am | 5 ----- 1 file changed, 5 deletions(-) -commit cb0e847fe07099c1ef6d8076f6a46e17bc431acb +commit 9a6761aa35ed84d30bd2fda2333a4fdf3f46ecdc Author: Sam James <sam@gentoo.org> Date: 2024-05-02 13:26:40 +0100 @@ -2294,15 +5966,13 @@ Date: 2024-05-02 13:26:40 +0100 I've checked over each of these and they're straightforward applications of the relevant Github Actions. 
- - (cherry picked from commit 9a6761aa35ed84d30bd2fda2333a4fdf3f46ecdc) .github/workflows/freebsd.yml | 2 ++ .github/workflows/netbsd.yml | 2 ++ .github/workflows/openbsd.yml | 2 ++ 3 files changed, 6 insertions(+) -commit c3c854dc759fe0c5549aa0a730be9e259243edb6 +commit 81efe6119f86e3274e512c9eca5ec22b2196c2b3 Author: Yaroslav Halchenko <debian@onerussian.com> Date: 2024-03-29 14:37:24 -0400 @@ -2313,24 +5983,20 @@ Date: 2024-03-29 14:37:24 -0400 This is the first commit from https://github.com/tukaani-project/xz/pull/93 with trivial edits by Lasse Collin. - - (cherry picked from commit 81efe6119f86e3274e512c9eca5ec22b2196c2b3) .codespellrc | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) -commit 3216301aa20fcf9d5a7485e35a295d5c451d9658 +commit 905bfc74fe2670fd9c39014803017ab53d325401 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-30 14:37:11 +0300 Add .gitattributes to clean up git-archive output - - (cherry picked from commit 905bfc74fe2670fd9c39014803017ab53d325401) .gitattributes | 7 +++++++ 1 file changed, 7 insertions(+) -commit f99e7c69ada9e0db0ee1ebbc38c8ce9390cd9788 +commit 3334c71d3d4294a4f6569df3ba9bcf2443dfa501 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 12:11:09 +0300 @@ -2344,13 +6010,11 @@ Date: 2024-04-19 12:11:09 +0300 read-only sandbox is used for multi-file case. On the other hand, xz doesn't go to the strictest mode when processing the last file when more than one file was specified; xzdec does. - - (cherry picked from commit 3334c71d3d4294a4f6569df3ba9bcf2443dfa501) src/xzdec/xzdec.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) -commit bfe9be7a46cfd3b3069c15f7ba1432192bca1f5b +commit 278563ef8f2b8d98d7f2c85e1a64ec1bc21d26d8 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-30 22:22:45 +0300 @@ -2371,13 +6035,12 @@ Date: 2024-04-30 22:22:45 +0300 methods that check the function type on indirect function calls. 
Fixes: 3b34851de1eaf358cf9268922fa0eeed8278d680 - (cherry picked from commit 278563ef8f2b8d98d7f2c85e1a64ec1bc21d26d8) src/liblzma/common/filter_decoder.c | 15 ++++++++++++--- src/liblzma/common/filter_encoder.c | 17 +++++++++++++---- 2 files changed, 25 insertions(+), 7 deletions(-) -commit 882eadc5b820b6b1495fc91ba3573ac2aa6c1df3 +commit 77c8f60547decefca8f2d0c905d9c708c38ee8ff Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-30 21:41:11 +0300 @@ -2394,12 +6057,11 @@ Date: 2024-04-30 21:41:11 +0300 Fixes: 88ccf47205d7f3aa314d358c72ef214f10f68b43 Co-authored-by: Sam James <sam@gentoo.org> - (cherry picked from commit 77c8f60547decefca8f2d0c905d9c708c38ee8ff) src/xz/args.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) -commit ec5458e1c9b2beb416781e81ad4ff22b0149b99d +commit 64503cc2b76a388ced4ec5f68234a07f0dcddcd5 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 20:42:00 +0300 @@ -2410,26 +6072,22 @@ Date: 2024-04-27 20:42:00 +0300 This uses the update-doxygen script, thus this is under if(UNIX) although Doxygen itself can run on Windows too. - - (cherry picked from commit 64503cc2b76a388ced4ec5f68234a07f0dcddcd5) CMakeLists.txt | 40 +++++++++++++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 9 deletions(-) -commit 8c93ced56bcb23df723dab23b7477d580720f522 +commit 0a7f5a80d8532a1d8cfa0a902c9d1ad7651eca37 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-20 23:36:39 +0300 CMake: List API headers in LIBLZMA_API_HEADERS variable This way the same list will be usable in more than one location. 
- - (cherry picked from commit 0a7f5a80d8532a1d8cfa0a902c9d1ad7651eca37) CMakeLists.txt | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) -commit f7c9bab0372db357511e42c9c610a2cfe5fca9b1 +commit 541406bee3f09e9813103c6406b10fc6ab2e0d30 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 15:16:42 +0300 @@ -2437,41 +6095,35 @@ Date: 2024-04-19 15:16:42 +0300 Also add a note that packagers should check the licensing of the Doxygen output. - - (cherry picked from commit 541406bee3f09e9813103c6406b10fc6ab2e0d30) PACKAGERS | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) -commit 28e7d130cb843e96d7e6b0358f8dd58bd1b2a275 +commit e21efdf96f39378fe417479f89e97046680406f5 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 17:47:09 +0300 Build: Add --enable-doxygen to generate and install API docs It requires Doxygen. This option is disabled by default. - - (cherry picked from commit e21efdf96f39378fe417479f89e97046680406f5) INSTALL | 6 ++++++ configure.ac | 10 +++++++++- src/liblzma/api/Makefile.am | 19 +++++++++++++++++++ 3 files changed, 34 insertions(+), 1 deletion(-) -commit cca7e6c05bc6cc51c0271c36856b7fe29f65c648 +commit 0ece09a575d7e542bda8825808ddd6cf7de8cc4b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 15:15:17 +0300 Doxygen: update-doxygen: Support out-of-tree builds Also, now $0 is used to refer to the script itself. - - (cherry picked from commit 0ece09a575d7e542bda8825808ddd6cf7de8cc4b) doxygen/update-doxygen | 110 ++++++++++++++++++++++++++++++------------------- 1 file changed, 68 insertions(+), 42 deletions(-) -commit 8090d3dc7f0eea4a3a61f4f6d46a0d0866e345fe +commit 2c519f641f266fd897edf680827d9c905f411440 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-28 21:08:00 +0300 @@ -2479,13 +6131,11 @@ Date: 2024-04-28 21:08:00 +0300 This omits all comments and a few non-default options that weren't needed. 
Now it contains no copyrighted content from Doxygen itself. - - (cherry picked from commit 2c519f641f266fd897edf680827d9c905f411440) doxygen/Doxyfile | 2698 +----------------------------------------------------- 1 file changed, 25 insertions(+), 2673 deletions(-) -commit 0721b8bfe558502669f06c97601fe59ad0d52541 +commit bdba39a57530d11b88440df8024002be3d09e4a1 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 15:14:02 +0300 @@ -2496,57 +6146,47 @@ Date: 2024-04-19 15:14:02 +0300 pre-generated liblzma API docs anymore, the extra bloat and extra license info of the JavaScript files won't affect the upstream source package anymore. - - (cherry picked from commit bdba39a57530d11b88440df8024002be3d09e4a1) doxygen/update-doxygen | 21 --------------------- 1 file changed, 21 deletions(-) -commit 1ddb40f6fd286c3c6ef510735112db1ac1b60936 +commit d3a77ebc04bf1db8d52de2d9b0f07877bc4fd139 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 17:26:41 +0300 Build: Remove old Doxygen rules from top-level Makefile.am - - (cherry picked from commit d3a77ebc04bf1db8d52de2d9b0f07877bc4fd139) Makefile.am | 12 ------------ 1 file changed, 12 deletions(-) -commit 092af76234b1bc79380427456b3215aa0b80f339 +commit fd7faa4c338a42a6a40e854b837d285ae2e8c609 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 15:10:06 +0300 Update COPYING to match the autogen.sh and mydist changes - - (cherry picked from commit fd7faa4c338a42a6a40e854b837d285ae2e8c609) COPYING | 11 ----------- 1 file changed, 11 deletions(-) -commit 77bce9a0a250cfb20333ee0dca036b3193dd4941 +commit b2bc55d8a0a9f2f59bfd4302067300e650f6baa3 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 17:23:43 +0300 Build: Don't run update-doxygen as part of "make mydist" - - (cherry picked from commit b2bc55d8a0a9f2f59bfd4302067300e650f6baa3) Makefile.am | 1 - 1 file changed, 1 deletion(-) -commit 3a2fc62f59b2e8cc45f8d8fd9988b4305efe4bff +commit e9be74f5b129fe8a5388d588e68b1b7f5168a310 
Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 15:09:48 +0300 autogen.sh: Don't generated Doxygen docs anymore - - (cherry picked from commit e9be74f5b129fe8a5388d588e68b1b7f5168a310) autogen.sh | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-) -commit b04c16f9a5a8675a87783305568cadfa3f17d999 +commit 252aa1d67bc015eeba462803ab72edeb7744d864 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 17:41:36 +0300 @@ -2554,24 +6194,20 @@ Date: 2024-04-19 17:41:36 +0300 They will be omitted from the source tarball and I don't want to make Doxygen a dependency of build.bash. - - (cherry picked from commit 252aa1d67bc015eeba462803ab72edeb7744d864) windows/build.bash | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit d4dd3c8f6169adf50cad8fe6872e0f5fcb82475c +commit 634095364d87444d62d8ec54c134c0cd4705f5d7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 14:14:47 +0300 README: Don't mention PDF man pages anymore - - (cherry picked from commit 634095364d87444d62d8ec54c134c0cd4705f5d7) README | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -commit be90720d6cd7fbb1b170794445815f579b444a6f +commit dc684bf76ea23574ee9d88382057381e04e6089a Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 14:10:39 +0300 @@ -2579,37 +6215,32 @@ Date: 2024-04-19 14:10:39 +0300 pdf-local rule was added to create the PDFs still with "make pdf". The install rules are missing but that likely doesn't matter at all. 
- - (cherry picked from commit dc684bf76ea23574ee9d88382057381e04e6089a) Makefile.am | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) -commit f724552d0c1ae2e3aa693d80d8d0da962dfac4e8 +commit e3531ab4125cbd5c01ebd3200791350960547189 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 13:54:39 +0300 windows/build.bash: Don't copy PDF man pages to the package - - (cherry picked from commit e3531ab4125cbd5c01ebd3200791350960547189) windows/README-Windows.txt | 2 +- windows/build.bash | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -commit 00e774819c6550a8eac219e9f6f083ab2b155505 +commit 710a4573ef2cbd19c66318c3b2d1388e418e26c7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-28 01:34:50 +0300 Tests: test_index: Fix failures when features are disabled Fixes: cd88423e76d54eb72aea037364f3ebb21f122503 - (cherry picked from commit 710a4573ef2cbd19c66318c3b2d1388e418e26c7) tests/test_index.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) -commit 51133ad71eecc19bdb3ab287a0732fd9441753f4 +commit aaff75c3486c4489ce88b0efb36b41cf138af7c3 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-20 17:09:11 +0300 @@ -2625,15 +6256,13 @@ Date: 2024-04-20 17:09:11 +0300 who want them, and those who won't run the tests anyway have a straightforward way to ensure that nothing from the "tests" directory can affect the build process. 
- - (cherry picked from commit aaff75c3486c4489ce88b0efb36b41cf138af7c3) CMakeLists.txt | 76 ++--------------------------------------------- tests/Makefile.am | 1 + tests/tests.cmake | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 92 insertions(+), 73 deletions(-) -commit 85b5595b67f0081b2a900104ed7589de4bb75e12 +commit a5f2aa5618fe9183706c9c514c3067985f6c338b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-20 13:12:50 +0300 @@ -2642,8 +6271,6 @@ Date: 2024-04-20 13:12:50 +0300 These are very old but the exact test file isn't easy to reproduce as it was compiled from a short C program (bcj_test.c) long ago. These tests weren't very good anyway, just a little better than nothing. - - (cherry picked from commit a5f2aa5618fe9183706c9c514c3067985f6c338b) tests/Makefile.am | 7 ---- tests/bcj_test.c | 64 --------------------------------- @@ -2656,40 +6283,34 @@ Date: 2024-04-20 13:12:50 +0300 tests/test_compress_prepared_bcj_x86 | 4 --- 9 files changed, 87 deletions(-) -commit d8228d1ea08155a17acaadd76ed95805d3b0a929 +commit d879686469c9c4bf2a7c0bb6420ebe4530fc8f07 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 18:30:40 +0300 Tests: test_index: Edit a misleading test - - (cherry picked from commit d879686469c9c4bf2a7c0bb6420ebe4530fc8f07) tests/test_index.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) -commit 2358ef8238f166c49e66f438e7494d4d352eb113 +commit 612005bbdb0dea9dc09e9e2e9cc16a15c1480acd Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 16:46:01 +0300 Tests: test_index: Use minimal values to test integer overflow - - (cherry picked from commit 612005bbdb0dea9dc09e9e2e9cc16a15c1480acd) tests/test_index.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit 54f4a4162aae8796580489013583d6148be5a473 +commit 4ad88b2544c2aaf8de8f38af54587098cbe66c1d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 15:13:39 +0300 Tests: test_index: Test 
lzma_index_buffer_decode() more - - (cherry picked from commit 4ad88b2544c2aaf8de8f38af54587098cbe66c1d) tests/test_index.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-) -commit 85ab59a6b70db33f320a3ea7a854249cb693dea2 +commit 575b11b0d291e66c5fce31ce7a72f11436d57c83 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 15:08:29 +0300 @@ -2697,35 +6318,29 @@ Date: 2024-04-27 15:08:29 +0300 On LZMA_DATA_ERROR from lzma_index_buffer_decode(), *i = NULL was already done but this adds a test for that case too. - - (cherry picked from commit 575b11b0d291e66c5fce31ce7a72f11436d57c83) tests/test_index.c | 31 +++++++++++++++++++++++++++---- 1 file changed, 27 insertions(+), 4 deletions(-) -commit fb42599e44dde417305c7d92fd782147ca923079 +commit 2c970debdb285823f01f75e875561d893345ac2b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 15:01:25 +0300 Tests: test_index: Test lzma_index_buffer_encode() with empty output buf - - (cherry picked from commit 2c970debdb285823f01f75e875561d893345ac2b) tests/test_index.c | 3 +++ 1 file changed, 3 insertions(+) -commit 20cac20f63a96a39391f2d613bef0f7bd6553495 +commit cd88423e76d54eb72aea037364f3ebb21f122503 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 14:59:55 +0300 Tests: test_index: Replace if-statements with tuktest assertions - - (cherry picked from commit cd88423e76d54eb72aea037364f3ebb21f122503) tests/test_index.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) -commit 91e3ea8735752db5d0373991e84607196070aeaa +commit 7f865577a6224fbbb5f5ca52574b62ea8ac9bf51 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 14:56:16 +0300 @@ -2737,46 +6352,38 @@ Date: 2024-04-27 14:56:16 +0300 Putting the pre-increment in the if-statement was clearly wrong although in practice it didn't matter here as the function is called only a couple of times. 
- - (cherry picked from commit 7f865577a6224fbbb5f5ca52574b62ea8ac9bf51) tests/test_index.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) -commit df1659a6c8367db69e82e2ea59ad5f959cf4e615 +commit 12313a3b6596cdcf012e180597f84d231f8730d3 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 14:51:52 +0300 Tests: test_index: Verify also iter.block.number_in_stream - - (cherry picked from commit 12313a3b6596cdcf012e180597f84d231f8730d3) tests/test_index.c | 2 ++ 1 file changed, 2 insertions(+) -commit e083e95dbfda73900109cca4c82c8713d0a1da21 +commit ad2654010d9d641ce1601beeff00630027e6bcd4 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 14:51:06 +0300 Tests: test_index: Check cases that aren't a multiple of 4 bytes - - (cherry picked from commit ad2654010d9d641ce1601beeff00630027e6bcd4) tests/test_index.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-) -commit b0d3b86ecf1881d10e6614b64b0fcc6c16a3b08f +commit 2524fcf2b68b662035437cee8edbe80067c0c240 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 14:40:25 +0300 Tests: test_index: Edit comments and white space - - (cherry picked from commit 2524fcf2b68b662035437cee8edbe80067c0c240) tests/test_index.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) -commit bae288ea6ffb976c36e2387c03d75ce84a8a1034 +commit 71eed2520e2eecae89bade9dceea16e56cfa2ea0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-27 14:33:38 +0300 @@ -2792,83 +6399,69 @@ Date: 2024-04-27 14:33:38 +0300 In practice this matters very little: The problem can occur only if the functions are called with invalid arguments, that is, the calling application must already have a bug. 
- - (cherry picked from commit 71eed2520e2eecae89bade9dceea16e56cfa2ea0) src/liblzma/common/index_decoder.c | 11 +++++++++++ 1 file changed, 11 insertions(+) -commit f10cb93f335900a29e50f990b751996ef026b3a3 +commit 0478473953f50716a2bc37b619b1c7dc2682b1ad Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-26 18:25:18 +0300 CMake: Bump maximum policy version to 3.29 - - (cherry picked from commit 0478473953f50716a2bc37b619b1c7dc2682b1ad) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 59055d70cdd3df091264ae9da793821bfd65314d +commit a607e2b40d23f7d998dbaba76692aa30b4c3d9d3 Author: Sam James <sam@gentoo.org> Date: 2024-04-13 22:30:44 +0100 ci: add NetBSD - - (cherry picked from commit a607e2b40d23f7d998dbaba76692aa30b4c3d9d3) .github/workflows/netbsd.yml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) -commit 812c1f95f37751aaa1e020fc2360949a674842fd +commit 72c210336de26fb87a928160d025fa10a638d23b Author: Sam James <sam@gentoo.org> Date: 2024-04-13 23:49:26 +0100 ci: add FreeBSD - - (cherry picked from commit 72c210336de26fb87a928160d025fa10a638d23b) .github/workflows/freebsd.yml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) -commit d2a4f963c28b864aa179464f7827cc10c6e1365d +commit b526ec2dbfb5889845ea60548c4f5b1f97d84ab2 Author: Sam James <sam@gentoo.org> Date: 2024-04-13 23:16:08 +0100 ci: add OpenBSD - - (cherry picked from commit b526ec2dbfb5889845ea60548c4f5b1f97d84ab2) .github/workflows/openbsd.yml | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) -commit 493bc57c33385bda5ad32d01ab73dcfe8f5e7ced +commit c7ef767c49351743d8d011574abb9e200bf6b24f Author: Sam James <sam@gentoo.org> Date: 2024-04-15 05:53:01 +0100 liblzma: outqueue: add header guard Reported by github's codeql. 
- - (cherry picked from commit c7ef767c49351743d8d011574abb9e200bf6b24f) src/liblzma/common/outqueue.h | 5 +++++ 1 file changed, 5 insertions(+) -commit cede418d4f8e1fb4c8a30839fa5d3b14743e83d4 +commit 55dcae3056d95cb2ddb8b560c12ba7596bc79f2c Author: Sam James <sam@gentoo.org> Date: 2024-04-15 05:53:56 +0100 liblzma: easy_preset: add header guard Reported by github's codeql. - - (cherry picked from commit 55dcae3056d95cb2ddb8b560c12ba7596bc79f2c) src/liblzma/common/easy_preset.h | 5 +++++ 1 file changed, 5 insertions(+) -commit 6e76a25df28b47407a201bf0381fa6d3c80cb0bb +commit 4ffc60f32397371769b7d6b5e3ed8626292d58df Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-25 14:00:57 +0300 @@ -2891,8 +6484,6 @@ Date: 2024-04-25 14:00:57 +0300 Thanks to Sam James for pointing out the compiler warning on NetBSD 10.0. - - (cherry picked from commit 4ffc60f32397371769b7d6b5e3ed8626292d58df) src/common/tuklib_integer.h | 47 ++++++++++++++++++++------------------ src/liblzma/check/crc32_fast.c | 4 ++-- @@ -2901,76 +6492,65 @@ Date: 2024-04-25 14:00:57 +0300 src/liblzma/check/crc64_tablegen.c | 2 +- 5 files changed, 31 insertions(+), 28 deletions(-) -commit 0ca14871f306b97ce81bfe44c4a39b6b2af31bb3 +commit 08ab0966a75b501aa7c717622223f0c13a113c75 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-24 01:20:26 +0300 liblzma: API doc cleanups - - (cherry picked from commit 08ab0966a75b501aa7c717622223f0c13a113c75) src/liblzma/api/lzma/container.h | 2 +- src/liblzma/api/lzma/index.h | 6 +++--- src/liblzma/api/lzma/vli.h | 5 ++--- 3 files changed, 6 insertions(+), 7 deletions(-) -commit 94a462850bc8718f5dd5b30116bce2165b2403c2 +commit 3ac8a9bb4cccbee88350696dc9c645c48d77c989 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-23 16:35:33 +0300 Tests: test_filter_str: Add a few assertions - - (cherry picked from commit 3ac8a9bb4cccbee88350696dc9c645c48d77c989) tests/test_filter_str.c | 4 ++++ 1 file changed, 4 insertions(+) -commit 
72058ca22a7f3c9c67ed58be624f8302c6337cd7 +commit 26c69be80523b05c84dea86c47c4ddd9a10945d7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-23 16:35:08 +0300 Tests: test_filter_str: Move one assertion and add a comment - - (cherry picked from commit 26c69be80523b05c84dea86c47c4ddd9a10945d7) tests/test_filter_str.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) -commit c59ebbe1c6dd18b78a046aae3133702dd52c352e +commit 4f6af853bc99904efb8b6c28a0af7b81a8476c1b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-23 16:26:06 +0300 Tests: test_filter_str: Tweak comments and white space - - (cherry picked from commit 4f6af853bc99904efb8b6c28a0af7b81a8476c1b) tests/test_filter_str.c | 3 +++ 1 file changed, 3 insertions(+) -commit ceda860934b0272689d0722ceeb490cf9c559956 +commit c92663aa1bd576e0615498a4189acf0df12e84b9 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-23 16:25:22 +0300 Tests: test_filter_str: Add missing RISC-V case Fixes: 89ea1a22f4ed3685b053b7260bc5acf6c75d1664 - (cherry picked from commit c92663aa1bd576e0615498a4189acf0df12e84b9) tests/test_filter_str.c | 3 +++ 1 file changed, 3 insertions(+) -commit 2234b7cc472e62f3401216a71261579342fa2959 +commit b0366df1d7ed26268101f9303a001c91c0806dfc Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-22 22:23:32 +0300 Tests: test_filter_str: Test *error_pos more thoroughly - - (cherry picked from commit b0366df1d7ed26268101f9303a001c91c0806dfc) tests/test_filter_str.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 76 insertions(+), 1 deletion(-) -commit 3ba3ef57f929670adb1f9c5e5207a81a29374237 +commit 70d12dd069bb9bb0d6bb1c8fafc4e6f77780263d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-22 21:54:39 +0300 @@ -2981,34 +6561,29 @@ Date: 2024-04-22 21:54:39 +0300 or filters == NULL or unsupported flags were specified. 
Fixes: cedeeca2ea6ada5b0411b2ae10d7a859e837f203 - (cherry picked from commit 70d12dd069bb9bb0d6bb1c8fafc4e6f77780263d) src/liblzma/common/string_conversion.c | 6 ++++++ 1 file changed, 6 insertions(+) -commit 57ad820e15381344a812c78ce9b67a77a60b9cf3 +commit ed8e552395701fbf046027cebc8be4a6755b263f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-22 20:31:25 +0300 liblzma: Clean up white space - - (cherry picked from commit ed8e552395701fbf046027cebc8be4a6755b263f) src/liblzma/lz/lz_encoder.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit ba0b5bfe7cb3cdbd9a4e3c268e10c304cb834e8a +commit 2f06920f20b1ad63b7953dc09569e1d424998849 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-22 18:35:19 +0300 Tests: test_filter_flags: Edit comments and style - - (cherry picked from commit 2f06920f20b1ad63b7953dc09569e1d424998849) tests/test_filter_flags.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) -commit d2ed6759596185ac6a9c69ea713c27cd4bd1d9ba +commit b101e1d1dbc81577c0c9aa0cb89cf2e46a15eb82 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-22 16:39:44 +0300 @@ -3016,35 +6591,29 @@ Date: 2024-04-22 16:39:44 +0300 The array could become empty and then the initializer would be simply {} which is allowed only in GNU-C and C23. 
- - (cherry picked from commit b101e1d1dbc81577c0c9aa0cb89cf2e46a15eb82) tests/test_filter_flags.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) -commit 9a70e93fef3fd5943484e56f1881a7c6e3296027 +commit f8f3a220ac8afcb8cb2812917d3b77e00c2eab0d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-21 20:32:16 +0300 DOS: Omit useless defines from config.h - - (cherry picked from commit f8f3a220ac8afcb8cb2812917d3b77e00c2eab0d) dos/config.h | 12 ------------ 1 file changed, 12 deletions(-) -commit dc4740f720e08bdd496aa2736db3b7aea6dd3d1e +commit fc1921b04b8840caaa777c2bd5340d41b259da20 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-21 20:27:50 +0300 Build: Omit useless checks for fcntl.h, limits.h, and sys/time.h - - (cherry picked from commit fc1921b04b8840caaa777c2bd5340d41b259da20) configure.ac | 6 ------ 1 file changed, 6 deletions(-) -commit 6e210d5766b25d36729152a13c5889bb0605a1e3 +commit 6aa2a6deeba04808a0fe4461396e7fb70277f3d4 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 22:04:21 +0300 @@ -3057,82 +6626,69 @@ Date: 2024-04-19 22:04:21 +0300 ones have quite a different implementation for the same filter. Thanks to Sam James. 
- - (cherry picked from commit 6aa2a6deeba04808a0fe4461396e7fb70277f3d4) src/liblzma/simple/x86.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) -commit 4019b012f29008ea6545aba6fe6c141a2d920ae2 +commit e89d3e83b4496d0b5410870634970c0aa9721d59 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 23:18:19 +0300 Update .gitignore - - (cherry picked from commit e89d3e83b4496d0b5410870634970c0aa9721d59) .gitignore | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-) -commit 09a0311a1e8cdefbcfab9e490cdd41c97a459d24 +commit 86fc4ee859709da0ff9617a1490f13ddac0a109b Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 20:53:24 +0300 Tests: test_lzip_decoder: Tweak coding style and comments - - (cherry picked from commit 86fc4ee859709da0ff9617a1490f13ddac0a109b) tests/test_lzip_decoder.c | 58 +++++++++++++++++++++++------------------------ 1 file changed, 28 insertions(+), 30 deletions(-) -commit 3117336a0291309ddd2a54d2966a589f9f806850 +commit 38be573a279bd7b608ee7d8509ec10884e6fb0d5 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 20:51:36 +0300 Tests: test_lzip_decoder: Remove redundant initializations - - (cherry picked from commit 38be573a279bd7b608ee7d8509ec10884e6fb0d5) tests/test_lzip_decoder.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) -commit f78081eb12c804ec4f5a3dc569b859646b16e9e5 +commit d7e4bc53eacfab9f3de95d8252bdfdc9419079c9 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-19 20:47:24 +0300 Tests: test_lzip_decoder: Remove unneeded tuktest_malloc() calls - - (cherry picked from commit d7e4bc53eacfab9f3de95d8252bdfdc9419079c9) tests/test_lzip_decoder.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-) -commit 7413383e4280065b79ca70abe4d8ebc78055b35a +commit eeca8f7c5baf1ad69606bb734d5001763466d58f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-15 20:35:07 +0300 xz: Fix white space error. Thanks to xx on #tukaani. 
- - (cherry picked from commit eeca8f7c5baf1ad69606bb734d5001763466d58f) src/xz/args.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit eed2f26c0edb6e31a50d48bab4ff619778690a1e +commit 462ca9409940a19f743daee6b3bcc611277d0007 Author: Sam James <sam@gentoo.org> Date: 2024-04-11 23:01:44 +0100 xz: add missing noreturn for message_filters_help Fixes: a165d7df1964121eb9df715e6f836a31c865beef - (cherry picked from commit 462ca9409940a19f743daee6b3bcc611277d0007) src/xz/message.h | 1 + 1 file changed, 1 insertion(+) -commit 2633d8df616405bd54fd748d7bf887ebc4505b88 +commit 863f13d2828b99b0539ce73f9cf85bde32358034 Author: Sam James <sam@gentoo.org> Date: 2024-04-11 19:34:04 +0100 @@ -3158,24 +6714,20 @@ Date: 2024-04-11 19:34:04 +0100 Just suppress -Wsign-conversion for `signals_init` for macOS given there's no real nice way of fixing this. - - (cherry picked from commit 863f13d2828b99b0539ce73f9cf85bde32358034) src/xz/signals.c | 7 +++++++ 1 file changed, 7 insertions(+) -commit 50fb269c7a9cf62a9f3fe08859e2aa4348b600a7 +commit fcbd0d199933a69713cb293cbd7409a757d854cd Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-13 22:19:40 +0300 Tests: test_microlzma: Add a "FIXME?" about LZMA_FINISH handling - - (cherry picked from commit fcbd0d199933a69713cb293cbd7409a757d854cd) tests/test_microlzma.c | 8 ++++++++ 1 file changed, 8 insertions(+) -commit 3e2ff2d38c54c8fc7ce15aaf91185dc105d9c92c +commit 0fe2dfa68355d2b165544b2bc8babf77dcc2039e Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-13 18:05:31 +0300 @@ -3184,13 +6736,11 @@ Date: 2024-04-13 18:05:31 +0300 A few lines were reordered, a few ARRAY_SIZE were changed to sizeof, and a few uint32_t were changed to size_t. No real functional changes were intended. 
- - (cherry picked from commit 0fe2dfa68355d2b165544b2bc8babf77dcc2039e) tests/test_microlzma.c | 149 +++++++++++++++++++++++++++---------------------- 1 file changed, 83 insertions(+), 66 deletions(-) -commit ebc8b8de19d641c37ab7959a224bcd0ff4c0833f +commit 97f0ee0f1f903f4e7c4ea23e9b89d687025d2992 Author: Ryan Carsten Schmidt <git@ryandesign.com> Date: 2024-04-12 19:31:13 -0500 @@ -3198,13 +6748,11 @@ Date: 2024-04-12 19:31:13 -0500 hw.ncpu counts all CPUs including inactive ones. hw.activecpu counts only the active CPUs. - - (cherry picked from commit 97f0ee0f1f903f4e7c4ea23e9b89d687025d2992) build-aux/ci_build.bash | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 1e63f7d53648beb6dd5acb5771850d7c4bc30477 +commit 73f629e321b74f68c9954728fa4f19261afccf46 Author: Sam James <sam@gentoo.org> Date: 2024-04-10 18:33:55 +0100 @@ -3212,26 +6760,22 @@ Date: 2024-04-10 18:33:55 +0100 We discussed the name and it's less cognitive load to just call it '.bash' so you don't have an immediate question about if bashisms are OK. - - (cherry picked from commit 73f629e321b74f68c9954728fa4f19261afccf46) .github/workflows/ci.yml | 52 ++++++++++++++++---------------- .github/workflows/windows-ci.yml | 20 ++++++------ build-aux/{ci_build.sh => ci_build.bash} | 0 3 files changed, 36 insertions(+), 36 deletions(-) -commit aea54a4724414466a20afd7493156d40d0a2741c +commit 8709407a9ef8e7e8aec117879400e4dd3e227ada Author: Sam James <sam@gentoo.org> Date: 2024-04-10 17:42:23 +0100 ci: build in parallel by default - - (cherry picked from commit 8709407a9ef8e7e8aec117879400e4dd3e227ada) build-aux/ci_build.sh | 2 ++ 1 file changed, 2 insertions(+) -commit 4381fcf00b2fabb6dcc9fd5cf35d520feb9e775a +commit 65bf7e0a1ca6386f17608e8afb84ac470c18d23f Author: Sam James <sam@gentoo.org> Date: 2024-04-10 15:41:08 +0100 @@ -3240,13 +6784,11 @@ Date: 2024-04-10 15:41:08 +0100 We need this for when we're passing sanitizer flags or -gdwarf-4 for Clang with Valgrind. 
Just always start with -O2 if CFLAGS isn't set in the environment and append what was passed on the command line. - - (cherry picked from commit 65bf7e0a1ca6386f17608e8afb84ac470c18d23f) build-aux/ci_build.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -commit 752ba5ed99ec754bafbdc4d87a2876cb2566ecc4 +commit bc899f9e0700ad153bd65f4804c4de7515c8a847 Author: Sam James <sam@gentoo.org> Date: 2024-04-10 15:17:47 +0100 @@ -3254,13 +6796,11 @@ Date: 2024-04-10 15:17:47 +0100 This is a lot easier to work with than the save-logs thing the action tries to do... - - (cherry picked from commit bc899f9e0700ad153bd65f4804c4de7515c8a847) build-aux/ci_build.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit cc21af171599ffe0419fc32a30edd3ef7d479865 +commit b5e3470442531717b2457b40ab412740296af1bc Author: Sam James <sam@gentoo.org> Date: 2024-04-10 12:38:51 +0100 @@ -3270,13 +6810,11 @@ Date: 2024-04-10 12:38:51 +0100 I made in Meson for this in October [0]. [0] https://github.com/mesonbuild/meson/commit/7b7d2e060b447de9c2642848847370a58711ac1c - - (cherry picked from commit b5e3470442531717b2457b40ab412740296af1bc) .github/workflows/ci.yml | 1 + 1 file changed, 1 insertion(+) -commit 2d2d5f14b392cd1aeddab7ce34fd50ba5422e5b5 +commit 6c095a98fbec70b790253a663173ecdb669108c4 Author: Sam James <sam@gentoo.org> Date: 2024-04-10 11:43:10 +0100 @@ -3295,14 +6833,12 @@ Date: 2024-04-10 11:43:10 +0100 [0] https://www.gnu.org/software/autoconf-archive/ax_valgrind_check.html [1] https://tecnocode.co.uk/2014/12/23/automatically-valgrinding-code-with-ax_valgrind_check/ - - (cherry picked from commit 6c095a98fbec70b790253a663173ecdb669108c4) .github/workflows/ci.yml | 11 ++++++++++- build-aux/ci_build.sh | 8 +++++--- 2 files changed, 15 insertions(+), 4 deletions(-) -commit 5d20a612051fac3ca6d99abe3cd7e0e3370e5b67 +commit 6286c1900c2d2ca33d9b1b397122c7bcdb9a4d59 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 23:20:02 +0300 @@ -3310,15 +6846,13 @@ Date: 
2024-04-10 23:20:02 +0300 A macro is useful to prevent a single #if directive from getting too ugly but only one macro is needed for all archs. - - (cherry picked from commit 6286c1900c2d2ca33d9b1b397122c7bcdb9a4d59) src/liblzma/check/crc32_table.c | 10 ++++------ src/liblzma/check/crc64_table.c | 4 ++-- src/liblzma/check/crc_common.h | 5 +++-- 3 files changed, 9 insertions(+), 10 deletions(-) -commit 2a80827e23169c624560ac89714bf5084cbead43 +commit 45da936c879acf4f053a3055665bf1b10ded4462 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 23:09:40 +0300 @@ -3328,12 +6862,11 @@ Date: 2024-04-10 23:09:40 +0300 when it should have. Fixes: 1940f0ec28f08c0ac72c1413d9706fb82eabe6ad - (cherry picked from commit 45da936c879acf4f053a3055665bf1b10ded4462) src/liblzma/check/crc32_table.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit a54117377151356c1e2494ba1febc245cb71b51c +commit 308a9af85400b0e2019f0f012c8354e831d06d65 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 22:21:51 +0300 @@ -3342,12 +6875,11 @@ Date: 2024-04-10 22:21:51 +0300 This can speed up configure a tiny bit. Fixes: c5f6d79cc9515a7f22d7ea4860c6cc394b295732 - (cherry picked from commit 308a9af85400b0e2019f0f012c8354e831d06d65) configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 9223ad6e78a666cc9f9aba135d1755fec184a24a +commit fc43cecd32bf9d5f8caa599206b15c9569af1eb6 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 22:04:27 +0300 @@ -3356,12 +6888,11 @@ Date: 2024-04-10 22:04:27 +0300 I didn't test this but it shouldn't change any functionality. 
Fixes: 761f5b69a4c778c8bcb09279b845b07c28790575 - (cherry picked from commit fc43cecd32bf9d5f8caa599206b15c9569af1eb6) src/liblzma/check/crc32_arm64.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) -commit 32ceb2c36a0e450037bbe906c2a1ea42607b9d21 +commit 1024cd4cd966b998fedec51e385e9ee9a49b3c57 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 21:59:27 +0300 @@ -3372,12 +6903,11 @@ Date: 2024-04-10 21:59:27 +0300 I didn't test this. Fixes: 761f5b69a4c778c8bcb09279b845b07c28790575 - (cherry picked from commit 1024cd4cd966b998fedec51e385e9ee9a49b3c57) src/liblzma/check/crc32_arm64.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) -commit 42915101e914dba353c236925bc1d5e4826d3f7a +commit 2337f7021c860b026e3e849e60a9ae8d09ec0ea0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 21:56:33 +0300 @@ -3386,35 +6916,39 @@ Date: 2024-04-10 21:56:33 +0300 Subtracting from 0 is negation, this just keeps warnings away. Fixes: 761f5b69a4c778c8bcb09279b845b07c28790575 - (cherry picked from commit 2337f7021c860b026e3e849e60a9ae8d09ec0ea0) src/liblzma/check/crc32_arm64.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 42a9482b48f0171852fbaddbdc729a56f2daa547 +commit d8fffd01aa1a3c18e437a222abd34699e23ff5e7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-10 21:55:10 +0300 liblzma: ARM64 CRC32: Tweak coding style and comments - - (cherry picked from commit d8fffd01aa1a3c18e437a222abd34699e23ff5e7) src/liblzma/check/crc32_arm64.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) -commit 38a3ec5a7e2ddeee3686be64b037aa1377f31fd1 +commit 780d2c236de0e4749655696c2e0c26fb7565afd3 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-04-09 21:55:01 +0300 + + Update SECURITY.md. 
+ + .github/SECURITY.md | 25 ++++++++----------------- + 1 file changed, 8 insertions(+), 17 deletions(-) + +commit 986865ea2f9d1f8dbef4a130926df106b0f6d41a Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-09 17:47:01 +0300 CI: Remove ifunc support. - - (cherry picked from commit 986865ea2f9d1f8dbef4a130926df106b0f6d41a) .github/workflows/ci.yml | 13 +++---------- build-aux/ci_build.sh | 5 +---- 2 files changed, 4 insertions(+), 14 deletions(-) -commit 34d1252f093944ff350a88a6196539f95902ad41 +commit 689ae2427342a2ea1206eb5ca08301baf410e7e0 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-09 17:43:16 +0300 @@ -3426,8 +6960,6 @@ Date: 2024-04-09 17:43:16 +0300 extra code to support it. The only case where ifunc *might* matter for performance is if the CRC functions are used directly by an application. In normal compression use it's completely irrelevant. - - (cherry picked from commit 689ae2427342a2ea1206eb5ca08301baf410e7e0) CMakeLists.txt | 79 --------------------------------------- INSTALL | 8 ---- @@ -3438,69 +6970,57 @@ Date: 2024-04-09 17:43:16 +0300 src/liblzma/check/crc_x86_clmul.h | 11 +----- 7 files changed, 8 insertions(+), 247 deletions(-) -commit a594b39685051cd1ec866360bc4dd6c22f301bb4 +commit 6b4c859059a7eb9b0547590c081668e14ecf8af6 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 22:04:41 +0300 tests/files/README: Update the main heading. - - (cherry picked from commit 6b4c859059a7eb9b0547590c081668e14ecf8af6) tests/files/README | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit fa76e3ef597ee2e9d150461a42d270a386204042 +commit 2a851e06b891ce894f918faff32a6cca6fdecee6 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 22:02:45 +0300 tests/files/README: Explain how to recreate the ARM64 test files. 
- - (cherry picked from commit 2a851e06b891ce894f918faff32a6cca6fdecee6) tests/files/README | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) -commit 112fa0aba6be30968811c9131f1b995cf9e92e75 +commit 3d09b721b94e18fe1f853a04799697f5de10b291 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 21:51:55 +0300 debug: Add generator for the ARM64 test file data. - - (cherry picked from commit 3d09b721b94e18fe1f853a04799697f5de10b291) debug/Makefile.am | 3 +- debug/testfilegen-arm64.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 118 insertions(+), 1 deletion(-) -commit 1a1f3d0323d5991a3238566e8f517d5116358b5c +commit 31ef676567c9d6fcc4ec9fc833c312f7a7c21c48 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 21:19:38 +0300 xz man page: Use .ft CR instead of CW to silence warnings from groff. - - (cherry picked from commit 31ef676567c9d6fcc4ec9fc833c312f7a7c21c48) src/xz/xz.1 | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) -commit 9f9203f574f895c40a86a83c45c6bb79c25bb5d2 +commit 780cbf29d5a88db2b546e9b7b019c4c33ca72685 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 19:28:35 +0300 Fix NEWS for 5.6.0 and 5.6.1. - - (cherry picked from commit 780cbf29d5a88db2b546e9b7b019c4c33ca72685) NEWS | 6 ++++++ 1 file changed, 6 insertions(+) -commit 12876b33c79e36d7e51e8ba6ab7162bd2129cb5b +commit bfd0c7c478e93a1911b845459549ff94587b6ea2 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 19:22:26 +0300 Remove the XZ logo. 
- - (cherry picked from commit bfd0c7c478e93a1911b845459549ff94587b6ea2) COPYING | 5 - COPYING.CC-BY-SA-4.0 | 427 --------------------------------------------------- @@ -3511,15 +7031,13 @@ Date: 2024-04-08 19:22:26 +0300 doxygen/footer.html | 13 -- 7 files changed, 3 insertions(+), 452 deletions(-) -commit 879295d91f06c241fd8a8fc1ca95776dbeb45f93 +commit 77a294d98a9d2d48f7e4ac273711518bf689f5c4 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 18:27:39 +0300 Update maintainer and author info. The other maintainer suddenly disappeared. - - (cherry picked from commit 77a294d98a9d2d48f7e4ac273711518bf689f5c4) AUTHORS | 9 +++++++-- README | 10 +++------- @@ -3527,29 +7045,26 @@ Date: 2024-04-08 18:27:39 +0300 src/liblzma/api/lzma.h | 2 +- 4 files changed, 11 insertions(+), 11 deletions(-) -commit 859617d30d81317236e004b323fed0883f932dcf +commit 8dd03d4484ccf80022722a16d0ed9b37f2b58072 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 18:05:32 +0300 Docs: Update .xz file format specification to 1.2.1. This only reverts the XZ URL changes. - - (cherry picked from commit 8dd03d4484ccf80022722a16d0ed9b37f2b58072) doc/xz-file-format.txt | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) -commit eeb74fba1f6ea334a519015938b4a26c6ba5d4eb +commit 17aa2e1a796d3f758802df29afc89dcf335db567 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 17:33:56 +0300 Update website URLs back to tukaani.org. The XZ projects were moved back to their original URLs. 
- - (cherry picked from commit 17aa2e1a796d3f758802df29afc89dcf335db567) + .github/SECURITY.md | 2 +- CMakeLists.txt | 2 +- COPYING | 3 +-- README | 4 ++-- @@ -3561,78 +7076,129 @@ Date: 2024-04-08 17:33:56 +0300 src/xz/xz.1 | 6 +++--- src/xzdec/xzdec.1 | 4 ++-- windows/README-Windows.txt | 2 +- - 11 files changed, 20 insertions(+), 21 deletions(-) + 12 files changed, 21 insertions(+), 22 deletions(-) -commit a7b9cd70004bfc1abadc7e865dfce765f7b8b59d +commit 2739db981023373a2ddabc7b456c7e658bb4f582 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 17:07:08 +0300 xzdec: Tweak coding style and comments. - - (cherry picked from commit 2739db981023373a2ddabc7b456c7e658bb4f582) src/xzdec/xzdec.c | 32 +++++++++++++++++++++----------- 1 file changed, 21 insertions(+), 11 deletions(-) -commit ebe9d6d8cb27168706078009b3f64da8fde63833 +commit 408b6adb2a07d07c6535f859571cca38837caaf3 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-04-08 15:53:46 +0300 tests/ossfuzz: Tiny fix to a comment. - - (cherry picked from commit 408b6adb2a07d07c6535f859571cca38837caaf3) tests/ossfuzz/fuzz_decode_stream.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 78ab47d65d916207233abbcdb0ccfd6efb946c05 +commit db4dd74a344580e0b81436598d9741a3454245b0 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-04-09 18:22:16 +0300 + + Update THANKS. + + THANKS | 1 + + 1 file changed, 1 insertion(+) + +commit e93e13c8b3bec925c56e0c0b675d8000a0f7f754 +Author: Lasse Collin <lasse.collin@tukaani.org> +Date: 2024-04-08 15:32:58 +0300 + + Remove the backdoor found in 5.6.0 and 5.6.1 (CVE-2024-3094). + + While the backdoor was inactive (and thus harmless) without inserting + a small trigger code into the build system when the source package was + created, it's good to remove this anyway: + + - The executable payloads were embedded as binary blobs in + the test files. This was a blatant violation of the + Debian Free Software Guidelines. 
+ + - On machines that see lots bots poking at the SSH port, the backdoor + noticeably increased CPU load, resulting in degraded user experience + and thus overwhelmingly negative user feedback. + + - The maintainer who added the backdoor has disappeared. + + - Backdoors are bad for security. + + This reverts the following without making any other changes: + + 6e636819 Tests: Update two test files. + a3a29bbd Tests: Test --single-stream can decompress bad-3-corrupt_lzma2.xz. + 0b4ccc91 Tests: Update RISC-V test files. + 8c9b8b20 liblzma: Fix typos in crc32_fast.c and crc64_fast.c. + 82ecc538 liblzma: Fix false Valgrind error report with GCC. + cf44e4b7 Tests: Add a few test files. + 3060e107 Tests: Use smaller dictionary size in RISC-V test files. + e2870db5 Tests: Add two RISC-V Filter test files. + + The RISC-V test files also have real content that tests the filter + but the real content would fit into much smaller files. A generator + program would need to be available as well. + + Thanks to Andres Freund for finding and reporting it and making + it public quickly so others could act without a delay. 
+ See: https://www.openwall.com/lists/oss-security/2024/03/29/4 + + src/liblzma/check/crc32_fast.c | 7 +++++-- + src/liblzma/check/crc64_fast.c | 4 +++- + src/liblzma/check/crc_common.h | 25 ------------------------- + tests/files/README | 27 --------------------------- + tests/files/bad-3-corrupt_lzma2.xz | Bin 512 -> 0 bytes + tests/files/bad-dict_size.lzma | Bin 41 -> 0 bytes + tests/files/good-1-riscv-lzma2-1.xz | Bin 7424 -> 0 bytes + tests/files/good-1-riscv-lzma2-2.xz | Bin 7432 -> 0 bytes + tests/files/good-2cat.xz | Bin 136 -> 0 bytes + tests/files/good-large_compressed.lzma | Bin 35421 -> 0 bytes + tests/files/good-small_compressed.lzma | Bin 258 -> 0 bytes + tests/test_files.sh | 11 ----------- + 12 files changed, 8 insertions(+), 66 deletions(-) + +commit f9cf4c05edd14dedfe63833f8ccbe41b55823b00 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-30 14:36:28 +0200 CMake: Fix sabotaged Landlock sandbox check. It never enabled it. - - (cherry picked from commit f9cf4c05edd14dedfe63833f8ccbe41b55823b00) CMakeLists.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 5f178c364c3b5c6fe87099b7624d5c76995ff8e6 -Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-05-22 14:08:33 +0300 +commit af071ef7702debef4f1d324616a0137a5001c14c +Author: Jia Tan <jiat0218@gmail.com> +Date: 2024-03-26 01:50:02 +0800 - Delete SECURITY.md from v5.6 - - It's too easily out of date in the stable branches. - It's not included in the release packages anyway. + Docs: Simplify SECURITY.md. - .github/SECURITY.md | 29 ----------------------------- - 1 file changed, 29 deletions(-) + .github/SECURITY.md | 8 +------- + 1 file changed, 1 insertion(+), 7 deletions(-) -commit b3a756188004a16de5956c368e3b0efd1a9bccb0 +commit 0b99783d63f27606936bb79a16c52d0d70c0b56f Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-22 17:46:30 +0200 liblzma: memcmplen.h: Add a comment why subtraction is used. 
- - (cherry picked from commit 0b99783d63f27606936bb79a16c52d0d70c0b56f) src/liblzma/common/memcmplen.h | 13 +++++++++++++ 1 file changed, 13 insertions(+) -commit 94939a145f362ff8b09fb37fc72901743f7f5cb2 +commit 8a25ba024d55610c448c6e4f1400a00bae51b493 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-15 17:43:39 +0200 INSTALL: Document arguments of --enable-symbol-versions. - - (cherry picked from commit 8a25ba024d55610c448c6e4f1400a00bae51b493) INSTALL | 43 +++++++++++++++++++++++++++++++++++++++---- 1 file changed, 39 insertions(+), 4 deletions(-) -commit fa14c8aaf0d0266b7e0c3b7c766159299c1a0f18 +commit 49324b711f9d42b3543bf2f3ae598eaa03360bd5 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-15 17:15:50 +0200 @@ -3643,13 +7209,11 @@ Date: 2024-03-15 17:15:50 +0200 AC_EGREP_CPP uses AC_REQUIRE so the outermost if-commands must be changed to AS_IF to ensure that things wont break some day. See 5a5bd7f871818029d5ccbe189f087f591258c294. - - (cherry picked from commit 49324b711f9d42b3543bf2f3ae598eaa03360bd5) configure.ac | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) -commit 73baa8d99b51c7623ed95afe6411302d9ff56864 +commit c273123ed0ebaebf49994057a7fe98aae7f42c40 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-15 16:36:35 +0200 @@ -3658,36 +7222,30 @@ Date: 2024-03-15 16:36:35 +0200 It doesn't support the __symver__ attribute or __asm__(".symver ..."). The generic symbol versioning can still be used since it only needs linker support. - - (cherry picked from commit c273123ed0ebaebf49994057a7fe98aae7f42c40) CMakeLists.txt | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) -commit 886633f42376f4648d931917733c8a59fb2e1f6c +commit df7f487648d18a3992386a59b8a061edca862d17 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-13 21:38:24 +0200 Update THANKS. 
- - (cherry picked from commit df7f487648d18a3992386a59b8a061edca862d17) THANKS | 1 + 1 file changed, 1 insertion(+) -commit 760f622f0d73632df2347aaca7ac7ff5761e98b6 +commit 3217b82b3ec023bf8338249134a076bea0ea30ec Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-13 21:30:18 +0200 liblzma: Minor comment edits. - - (cherry picked from commit 3217b82b3ec023bf8338249134a076bea0ea30ec) src/liblzma/common/string_conversion.c | 4 ++-- src/liblzma/delta/delta_decoder.c | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) -commit 403b4c78b81f67bc3787542f55f555407253316c +commit 096bc0e3f8fb4bfc4d2f3f64a7f219401ffb4c31 Author: Sergey Kosukhin <sergey.kosukhin@mpimet.mpg.de> Date: 2024-03-13 13:07:13 +0100 @@ -3703,15 +7261,13 @@ Date: 2024-03-13 13:07:13 +0100 vectorization is enabled, which results in failed tests. This introduces NVHPC-specific workarounds that address the issues. - - (cherry picked from commit 096bc0e3f8fb4bfc4d2f3f64a7f219401ffb4c31) src/liblzma/common/string_conversion.c | 6 ++++-- src/liblzma/delta/delta_decoder.c | 3 +++ src/liblzma/rangecoder/range_decoder.h | 1 + 3 files changed, 8 insertions(+), 2 deletions(-) -commit 1888fb49f629340758e98e69d5aa328f6f73c5e1 +commit 2ad7fad67080e88fa7fc191f9d613d8b7add9c62 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-13 21:17:10 +0200 @@ -3724,24 +7280,20 @@ Date: 2024-03-13 21:17:10 +0200 configure.ac tries to enable symbol versioning only with glibc so now CMake does the same. - - (cherry picked from commit 2ad7fad67080e88fa7fc191f9d613d8b7add9c62) CMakeLists.txt | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) -commit 4b3c84e8eebbcf712fc2396dbb8117cce2d72464 +commit 82f0c0d39eb2c026b1d96ee706f70ace868d4ed4 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-13 20:32:46 +0200 CMake: Make symbol versioning configurable. 
- - (cherry picked from commit 82f0c0d39eb2c026b1d96ee706f70ace868d4ed4) CMakeLists.txt | 62 +++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 42 insertions(+), 20 deletions(-) -commit 69d1e20208eb9bd1f4f1c8ee4e49cc82d681a877 +commit 45d33bfc45e4295b8ad743bc2ae61cc724f98076 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-13 19:47:36 +0200 @@ -3749,13 +7301,11 @@ Date: 2024-03-13 19:47:36 +0200 The AC_MSG_ERROR line is overlong anyway as are a few other AC_MSG_ERROR lines already. - - (cherry picked from commit 45d33bfc45e4295b8ad743bc2ae61cc724f98076) configure.ac | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) -commit 051d6b5c85a874c78249693865fd751088f403a2 +commit f56ed6fac6619b56b005878d3b5210e2f0d721c0 Author: Sergey Kosukhin <sergey.kosukhin@mpimet.mpg.de> Date: 2024-03-12 20:03:49 +0100 @@ -3768,87 +7318,11 @@ Date: 2024-03-12 20:03:49 +0100 the compiler does not support all features that are required for the linux versioning), the latter might help in overriding the assumptions made in the configure script. - - (cherry picked from commit f56ed6fac6619b56b005878d3b5210e2f0d721c0) configure.ac | 91 +++++++++++++++++++++++++++++++++--------------------------- 1 file changed, 50 insertions(+), 41 deletions(-) -commit 95dcea4b5df0b180af461e4584d2bcf7725e3aef -Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-04-09 18:22:16 +0300 - - Update THANKS. - - THANKS | 1 + - 1 file changed, 1 insertion(+) - -commit 1107712e372f7593ad729764c0c2644d0e4aa675 -Author: Lasse Collin <lasse.collin@tukaani.org> -Date: 2024-04-08 15:32:58 +0300 - - Remove the backdoor found in 5.6.0 and 5.6.1 (CVE-2024-3094). - - While the backdoor was inactive (and thus harmless) without inserting - a small trigger code into the build system when the source package was - created, it's good to remove this anyway: - - - The executable payloads were embedded as binary blobs in - the test files. 
This was a blatant violation of the - Debian Free Software Guidelines. - - - On machines that see lots bots poking at the SSH port, the backdoor - noticeably increased CPU load, resulting in degraded user experience - and thus overwhelmingly negative user feedback. - - - The maintainer who added the backdoor has disappeared. - - - Backdoors are bad for security. - - This reverts the following without making any other changes: - - 6e636819 Tests: Update two test files. - a3a29bbd Tests: Test --single-stream can decompress bad-3-corrupt_lzma2.xz. - 0b4ccc91 Tests: Update RISC-V test files. - 8c9b8b20 liblzma: Fix typos in crc32_fast.c and crc64_fast.c. - 82ecc538 liblzma: Fix false Valgrind error report with GCC. - cf44e4b7 Tests: Add a few test files. - 3060e107 Tests: Use smaller dictionary size in RISC-V test files. - e2870db5 Tests: Add two RISC-V Filter test files. - - The RISC-V test files also have real content that tests the filter - but the real content would fit into much smaller files. A generator - program would need to be available as well. - - Thanks to Andres Freund for finding and reporting it and making - it public quickly so others could act without a delay. 
- See: https://www.openwall.com/lists/oss-security/2024/03/29/4 - - src/liblzma/check/crc32_fast.c | 7 +++++-- - src/liblzma/check/crc64_fast.c | 4 +++- - src/liblzma/check/crc_common.h | 25 ------------------------- - tests/files/README | 27 --------------------------- - tests/files/bad-3-corrupt_lzma2.xz | Bin 512 -> 0 bytes - tests/files/bad-dict_size.lzma | Bin 41 -> 0 bytes - tests/files/good-1-riscv-lzma2-1.xz | Bin 7424 -> 0 bytes - tests/files/good-1-riscv-lzma2-2.xz | Bin 7432 -> 0 bytes - tests/files/good-2cat.xz | Bin 136 -> 0 bytes - tests/files/good-large_compressed.lzma | Bin 35421 -> 0 bytes - tests/files/good-small_compressed.lzma | Bin 258 -> 0 bytes - tests/test_files.sh | 11 ----------- - 12 files changed, 8 insertions(+), 66 deletions(-) - -commit fd1b975b7851e081ed6e5cf63df946cd5cbdbb94 -Author: Jia Tan <jiat0218@gmail.com> -Date: 2024-03-09 11:42:50 +0800 - - Bump version and soname for 5.6.1. - - src/liblzma/Makefile.am | 2 +- - src/liblzma/api/lzma/version.h | 2 +- - 2 files changed, 2 insertions(+), 2 deletions(-) - -commit a2cda572498e96163fe4e2bde096d5dd7b814668 +commit a4f2e20d8466369b1bb277c66f75c9e4ba9cc378 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 11:27:27 +0800 @@ -3857,7 +7331,7 @@ Date: 2024-03-09 11:27:27 +0800 NEWS | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) -commit 8583c6021124e388bce044a09f00ebabfd6165a7 +commit f01be8ad754a905d8c418601767480ec11621b02 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 10:43:20 +0800 @@ -3871,7 +7345,7 @@ Date: 2024-03-09 10:43:20 +0800 po4a/uk.po | 702 +++++++++++++++++++++++++++++----------------------------- 6 files changed, 2024 insertions(+), 1974 deletions(-) -commit 74b138d2a6529f2c07729d7c77b1725a8e8b16f1 +commit 6e636819e8f070330d835fce46289a3ff72a7b89 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 10:18:29 +0800 @@ -3885,7 +7359,7 @@ Date: 2024-03-09 10:18:29 +0800 tests/files/good-large_compressed.lzma | Bin 35430 -> 35421 bytes 2 files 
changed, 0 insertions(+), 0 deletions(-) -commit 3ec6dfd656bdd40ede2a5f11e6be338988e38be4 +commit a3a29bbd5d86183fc7eae8f0182dace374e778d8 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 10:08:32 +0800 @@ -3897,7 +7371,7 @@ Date: 2024-03-09 10:08:32 +0800 tests/test_files.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) -commit a67dcce6109c2f932a0a86abb0d7a95d3c31fb3e +commit 0b4ccc91454dbcf0bf521b9bd51aa270581ee23c Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 10:05:32 +0800 @@ -3909,7 +7383,7 @@ Date: 2024-03-09 10:05:32 +0800 tests/files/good-1-riscv-lzma2-2.xz | Bin 7512 -> 7432 bytes 2 files changed, 0 insertions(+), 0 deletions(-) -commit 058337b0f1da9f166049ecc972fa5c499c1af08c +commit 8c9b8b2063daa78ead9f648c2ec3c91e8615dffb Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 09:52:32 +0800 @@ -3919,7 +7393,7 @@ Date: 2024-03-09 09:52:32 +0800 src/liblzma/check/crc64_fast.c | 3 +-- 2 files changed, 3 insertions(+), 4 deletions(-) -commit cd5de9c1bbab3dd41b34b37a89c193fb6ff51ca5 +commit b93a8d7631d9517da63f03e0185455024a4609e8 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 09:49:55 +0800 @@ -3933,7 +7407,7 @@ Date: 2024-03-09 09:49:55 +0800 tests/test_microlzma.c | 12 ++++-------- 4 files changed, 22 insertions(+), 23 deletions(-) -commit 651a1545c8b6150051a0b44857136efd419afc6f +commit 82ecc538193b380a21622aea02b0ba078e7ade92 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-09 09:20:57 +0800 @@ -3954,7 +7428,7 @@ Date: 2024-03-09 09:20:57 +0800 src/liblzma/check/crc_common.h | 25 +++++++++++++++++++++++++ 3 files changed, 31 insertions(+), 10 deletions(-) -commit 6e97b299f1b22e366ec42ba5dc5b9d0746e87b84 +commit 3007e74ef250f0ce95d97ffbdf2282284f93764d Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-05 23:21:26 +0200 @@ -3963,7 +7437,7 @@ Date: 2024-03-05 23:21:26 +0200 src/liblzma/simple/riscv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -commit 4e1c97052b5f14f4d6dda99d12cbbd01e66e3712 +commit 
72d2933bfae514e0dbb123488e9f1eb7cf64175f Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-05 00:34:46 +0800 @@ -3976,7 +7450,7 @@ Date: 2024-03-05 00:34:46 +0800 src/liblzma/check/crc64_fast.c | 3 +++ 2 files changed, 8 insertions(+) -commit ed957d39426695e948b06de0ed952a2fbbe84bd1 +commit e5faaebbcf02ea880cfc56edc702d4f7298788ad Author: Jia Tan <jiat0218@gmail.com> Date: 2024-03-05 00:27:31 +0800 @@ -4001,7 +7475,7 @@ Date: 2024-03-05 00:27:31 +0800 configure.ac | 7 +++++++ 2 files changed, 14 insertions(+) -commit e98ddaf85a1a8fb3cc863637f83356cc9db31e13 +commit 7eeadd279a24c26ca7ff1292b7df802b89409eb7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-03-04 19:23:18 +0200 @@ -4010,7 +7484,7 @@ Date: 2024-03-04 19:23:18 +0200 src/liblzma/simple/riscv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit 319cec142f67fe294e0486402f1569f223d9a83d +commit 5f3d0595296cc3035eae9e7bb6c3ffb1e1267333 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-02-29 16:35:52 +0200 @@ -4019,7 +7493,7 @@ Date: 2024-02-29 16:35:52 +0200 CMakeLists.txt | 9 +++++++++ 1 file changed, 9 insertions(+) -commit 46c3e113d8eeb1a731a60829fa7f5d1b519f7f26 +commit 4cd1042ee752d61370c685d0d8b20c1e935672f7 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-02-29 16:35:52 +0200 @@ -4033,7 +7507,7 @@ Date: 2024-02-29 16:35:52 +0200 CMakeLists.txt | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) -commit 86bec8334bb1dcb6d9293a11cdccd895b17f364b +commit a94b42362c8e807f92236d6d63373f04991e3a50 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-02-28 18:26:25 +0200 @@ -4042,7 +7516,7 @@ Date: 2024-02-28 18:26:25 +0200 src/xz/coder.c | 10 ++++++++++ 1 file changed, 10 insertions(+) -commit 5c91b454c24e043ca8f2cc7d2b09bd091dafe655 +commit bbf112e32307a75a54a9e170bc392811443d5c87 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-02-27 23:42:41 +0800 @@ -4066,7 +7540,7 @@ Date: 2024-02-27 23:42:41 +0800 src/xz/coder.c | 4 ++-- 1 file changed, 2 
insertions(+), 2 deletions(-) -commit d0e57b2f159f8fd03a9a89f2f593a768d0487898 +commit 649f6447441510d593a88475ad6df4bcdf74ce48 Author: Lasse Collin <lasse.collin@tukaani.org> Date: 2024-02-26 23:06:13 +0200 @@ -4075,7 +7549,7 @@ Date: 2024-02-26 23:06:13 +0200 THANKS | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -commit d416be55ac02af1144fed455fb18b710147bb490 +commit 1255b7d849bf53f196a842ef2a508ed0ff577eaa Author: Jia Tan <jiat0218@gmail.com> Date: 2024-02-26 23:39:29 +0800 @@ -4084,7 +7558,7 @@ Date: 2024-02-26 23:39:29 +0800 THANKS | 1 + 1 file changed, 1 insertion(+) -commit f06b33edd2aeabdb11836a2bf0b681768dad29d3 +commit eee579fff50099ba163c12305e81a4bd42b7dd53 Author: Chien Wong <m@xv97.com> Date: 2024-02-25 21:38:13 +0800 @@ -4095,7 +7569,7 @@ Date: 2024-02-25 21:38:13 +0800 src/xz/xz.1 | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -commit a100f9111c8cc7f5b5f0e4a5e8af3de7161c7975 +commit 328c52da8a2bbb81307644efdb58db2c422d9ba7 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-02-26 23:02:06 +0800 @@ -4113,7 +7587,7 @@ Date: 2024-02-26 23:02:06 +0800 src/xzdec/xzdec.c | 8 ++++---- 5 files changed, 54 insertions(+), 10 deletions(-) -commit d85efdc8911e6e8964ec920af44c8a6fe0a4c3c2 +commit eb8ad59e9bab32a8d655796afd39597ea6dcc64d Author: Jia Tan <jiat0218@gmail.com> Date: 2024-02-26 20:06:10 +0800 @@ -4123,7 +7597,7 @@ Date: 2024-02-26 20:06:10 +0800 CMakeLists.txt | 1 + 2 files changed, 2 insertions(+) -commit 42ee4256739779005a7f921946c8a8e483d1f2ed +commit 9eed1b9a3ae140e93a82febc05a0181e9a4f5093 Author: Jia Tan <jiat0218@gmail.com> Date: 2024-02-26 19:56:25 +0800 @@ -4132,7 +7606,7 @@ Date: 2024-02-26 19:56:25 +0800 tests/test_microlzma.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) -commit c83349dfd9cf9c495005b6d30e2fd34a9cafc18a +commit 8bf9f72ee1c05b9e205a72807e8a9e304785673d Author: Jia Tan <jiat0218@gmail.com> Date: 2024-02-25 21:41:55 +0800 @@ -4142,11 +7616,18 @@ Date: 2024-02-25 21:41:55 +0800 NEWS | 2 +- 2 files 
changed, 2 insertions(+), 2 deletions(-) -commit 2d7d862e3ffa8cec4fd3fdffcd84e984a17aa429 +commit 5d8d915ebe2e345820a0f54d1baf8d7d4824c0c7 Author: Jia Tan <jiat0218@gmail.com> -Date: 2024-02-24 15:55:08 +0800 +Date: 2024-02-24 16:30:06 +0800 - Bump version and soname for 5.6.0. + Bump version and soname for 5.7.0alpha. + + Like 5.5.0alpha, 5.7.0alpha won't be released, it's just to mark that + the branch is not stable. + + Once again there is no API/ABI stability for new features in devel + versions. The major soname won't be bumped even if API/ABI of new + features breaks between devel releases. src/liblzma/Makefile.am | 2 +- src/liblzma/api/lzma/version.h | 6 +++--- @@ -10,6 +10,7 @@ XZ Utils 2. Version numbering 3. Reporting bugs 4. Translations + 4.1. Testing translations 5. Other implementations of the .xz format 6. Contact information @@ -203,77 +204,47 @@ XZ Utils https://translationproject.org/html/translators.html - Below are notes and testing instructions specific to xz - translations. + Updates to translations won't be accepted by methods that bypass + the Translation Project because there is a risk of duplicate work: + translation updates made in the xz repository aren't seen by the + translators in the Translation Project. If you have found bugs in + a translation, please report them to the Language-Team address + which can be found near the beginning of the PO file. - Testing can be done by installing xz into a temporary directory: + If you find language problems in the original English strings, + feel free to suggest improvements. Ask if something is unclear. + + +4.1. Testing translations + + Testing can be done by installing xz into a temporary directory. + + If building from Git repository (not tarball), generate the + Autotools files: + + ./autogen.sh + + Create a subdirectory for the build files. The tmp-build directory + can be deleted after testing. 
+ + mkdir tmp-build + cd tmp-build + ../configure --disable-shared --enable-debug --prefix=$PWD/inst + + Edit the .po file in the po directory. Then build and install to + the "tmp-build/inst" directory, and use translations.bash to see + how some of the messages look. Repeat these steps if needed: - ./configure --disable-shared --prefix=/tmp/xz-test - # <Edit the .po file in the po directory.> make -C po update-po - make install - bash debug/translation.bash | less - bash debug/translation.bash | less -S # For --list outputs - - Repeat the above as needed (no need to re-run configure though). - - Note especially the following: - - - The output of --help and --long-help must look nice on - an 80-column terminal. It's OK to add extra lines if needed. - - - In contrast, don't add extra lines to error messages and such. - They are often preceded with e.g. a filename on the same line, - so you have no way to predict where to put a \n. Let the terminal - do the wrapping even if it looks ugly. Adding new lines will be - even uglier in the generic case even if it looks nice in a few - limited examples. - - - Be careful with column alignment in tables and table-like output - (--list, --list --verbose --verbose, --info-memory, --help, and - --long-help): - - * All descriptions of options in --help should start in the - same column (but it doesn't need to be the same column as - in the English messages; just be consistent if you change it). - Check that both --help and --long-help look OK, since they - share several strings. - - * --list --verbose and --info-memory print lines that have - the format "Description: %s". If you need a longer - description, you can put extra space between the colon - and %s. Then you may need to add extra space to other - strings too so that the result as a whole looks good (all - values start at the same column). - - * The columns of the actual tables in --list --verbose --verbose - should be aligned properly. Abbreviate if necessary. 
It might - be good to keep at least 2 or 3 spaces between column headings - and avoid spaces in the headings so that the columns stand out - better, but this is a matter of opinion. Do what you think - looks best. - - - Be careful to put a period at the end of a sentence when the - original version has it, and don't put it when the original - doesn't have it. Similarly, be careful with \n characters - at the beginning and end of the strings. - - - Read the TRANSLATORS comments that have been extracted from the - source code and included in xz.pot. Some comments suggest - testing with a specific command which needs an .xz file. You - may use e.g. any tests/files/good-*.xz. However, these test - commands are included in translations.bash output, so reading - translations.bash output carefully can be enough. - - - If you find language problems in the original English strings, - feel free to suggest improvements. Ask if something is unclear. - - - The translated messages should be understandable (sometimes this - may be a problem with the original English messages too). Don't - make a direct word-by-word translation from English especially if - the result doesn't sound good in your language. - - Thanks for your help! + make -j"$(nproc)" install + bash ../debug/translation.bash | less + bash ../debug/translation.bash | less -S # For --list outputs + + To test other languages, set the LANGUAGE environment variable + before running translations.bash. The value should match the PO file + name without the .po suffix. Example: + + export LANGUAGE=fi 5. Other implementations of the .xz format @@ -20,6 +20,7 @@ has been important. :-) In alphabetical order: - Jakub Bogusz - Adam Borowski - Maarten Bosmans + - Roel Bouckaert - Lukas Braune - Benjamin Buch - Trent W. Buck @@ -29,26 +30,35 @@ has been important. 
:-) In alphabetical order: - Frank Busse - Daniel Mealha Cabrita - Milo Casagrande + - Cristiano Ceglia - Marek Černocký - Tomer Chachamu - Vitaly Chikunov - Antoine Cœur + - Elijah Almeida Coimbra - Felix Collin + - Ryan Colyer + - Marcus Comstedt + - Vincent Cruz - Gabi Davar + - Ron Desmond - İhsan Doğan - Chris Donawa - Andrew Dudman - Markus Duft - İsmail Dönmez + - Dexter Castor Döpping - Paul Eggert - Robert Elz - Gilles Espinasse - Denis Excoffier - Vincent Fazio - Michael Felt + - Sean Fenian - Michael Fox - Andres Freund - Mike Frysinger + - Collin Funk - Daniel Richard G. - Tomasz Gajc - Bjarni Ingi Gislason @@ -57,10 +67,14 @@ has been important. :-) In alphabetical order: - Matthew Good - Michał Górny - Jason Gorski + - Alexander M. Greenham - Juan Manuel Guerrero - Gabriela Gutierrez - Diederik de Haas + - Jan Terje Hansen + - Tobias Lahrmann Hansen - Joachim Henke + - Lizandro Heredia - Christian Hesse - Vincenzo Innocente - Peter Ivanov @@ -76,9 +90,11 @@ has been important. :-) In alphabetical order: - Per Øyvind Karlsen - Firas Khalil Khana - Iouri Kharon + - Kim Jinyeong - Thomas Klausner - Richard Koch - Anton Kochkov + - Harri K. Koskinen - Ville Koskinen - Sergey Kosukhin - Marcin Kowalczyk @@ -103,14 +119,20 @@ has been important. :-) In alphabetical order: - Chenxi Mao - Gregory Margo - Julien Marrec + - Pierre-Yves Martin - Ed Maste - Martin Matuška + - Scott McAllister + - Chris McCrohan + - Derwin McGeary - Ivan A. Melnikov - Jim Meyering - Arkadiusz Miskiewicz - Nathan Moinvaziri - Étienne Mollier - Conley Moorhous + - Dirk Müller + - Rainer Müller - Andrew Murray - Rafał Mużyło - Adrien Nader @@ -118,28 +140,34 @@ has been important. 
:-) In alphabetical order: - Alexander Neumann - Hongbo Ni - Jonathan Nieder + - Asgeir Storesund Nilsen - Andre Noll + - Ruarí Ødegaard - Peter O'Gorman - Dimitri Papadopoulos Orfanos - Daniel Packard - Filip Palian - Peter Pallinger - Kai Pastor + - Keith Patton - Rui Paulo - Igor Pavlov - Diego Elio Pettenò - Elbert Pol + - Guiorgy Potskhishvili - Mikko Pouru - Frank Prochnow - Rich Prohaska - Trần Ngọc Quân - Pavel Raiskup + - Matthieu Rakotojaona - Ole André Vadla Ravnås - Eric S. Raymond - Robert Readman - Bernhard Reutner-Fischer - Markus Rickert - Cristian Rodríguez + - Jeroen Roovers - Christian von Roques - Boud Roukema - Torsten Rupp @@ -156,6 +184,7 @@ has been important. :-) In alphabetical order: - Dan Shechter - Stuart Shelton - Sebastian Andrzej Siewior + - Andrej Skenderija - Ville Skyttä - Brad Smith - Bruce Stark @@ -181,20 +210,28 @@ has been important. :-) In alphabetical order: - Christian Weisgerber - Dan Weiss - Bert Wesarg + - Mark Wielaard - Fredrik Wikstrom - Jim Wilcoxson - Ralf Wildenhues - Charles Wilson - Lars Wirzenius + - Vincent Wixsom - Pilorz Wojciech - Chien Wong + - Xi Ruoyao - Ryan Young - Andreas Zieringer + - 榆柳松 (ZhengSen Wang) Companies: - Google - Sandfly Security +Other credits: + - cleemy desu wayo working with Trend Micro Zero Day Initiative + - Orange Tsai and splitline from DEVCORE Research Team + Also thanks to all the people who have participated in the Tukaani project. I have probably forgot to add some names to the above list. Sorry about @@ -5,12 +5,7 @@ XZ Utils To-Do List Known bugs ---------- - The test suite is too incomplete. - - If the memory usage limit is less than about 13 MiB, xz is unable to - automatically scale down the compression settings enough even though - it would be possible by switching from BT2/BT3/BT4 match finder to - HC3/HC4. + The test suite is incomplete. XZ Utils compress some files significantly worse than LZMA Utils. 
This is due to faster compression presets used by XZ Utils, and @@ -19,9 +14,6 @@ Known bugs compress extremely well, so going from compression ratio of 0.003 to 0.004 means big relative increase in the compressed file size. - xz doesn't quote unprintable characters when it displays file names - given on the command line. - tuklib_exit() doesn't block signals => EINTR is possible. If liblzma has created threads and fork() gets called, liblzma @@ -41,9 +33,6 @@ Missing features be mostly useful when using a preset dictionary in LZMA2, but it may have other uses too. Compare to deflateCopy() in zlib. - Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and - other streams that don't have an end of payload marker. - Adjust dictionary size when the input file size is known. Maybe do this only if an option is given. @@ -67,9 +56,9 @@ Missing features Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at Block and Stream boundaries. - lzma_strerror() to convert lzma_ret to human readable form? - This is tricky, because the same error codes are used with - slightly different meanings, and this cannot be fixed anymore. + Error codes from lzma_code() aren't very specific. A more detailed + error message (string) could be provided too. It could be returned + by a new function or use a currently-reserved member of lzma_stream. Make it possible to adjust LZMA2 options in the middle of a Block so that the encoding speed vs. compression ratio can be optimized @@ -97,9 +86,3 @@ Documentation Document the LZMA1 and LZMA2 algorithms. - -Miscellaneous ------------- - - Try to get the media type for .xz registered at IANA. 
- diff --git a/src/common/my_landlock.h b/src/common/my_landlock.h new file mode 100644 index 000000000000..e135d08c858f --- /dev/null +++ b/src/common/my_landlock.h @@ -0,0 +1,141 @@ +// SPDX-License-Identifier: 0BSD + +/////////////////////////////////////////////////////////////////////////////// +// +/// \file my_landlock.h +/// \brief Linux Landlock sandbox helper functions +// +// Author: Lasse Collin +// +/////////////////////////////////////////////////////////////////////////////// + +#ifndef MY_LANDLOCK_H +#define MY_LANDLOCK_H + +#include "sysdefs.h" + +#include <linux/landlock.h> +#include <sys/syscall.h> +#include <sys/prctl.h> + + +/// \brief Initialize Landlock ruleset attributes to forbid everything +/// +/// The supported Landlock ABI is checked at runtime and only the supported +/// actions are forbidden in the attributes. Thus, if the attributes are +/// used with my_landlock_create_ruleset(), it shouldn't fail. +/// +/// \return On success, the Landlock ABI version is returned (a positive +/// integer). If Landlock isn't supported, -1 is returned. 
+static int +my_landlock_ruleset_attr_forbid_all(struct landlock_ruleset_attr *attr) +{ + memzero(attr, sizeof(*attr)); + + const int abi_version = syscall(SYS_landlock_create_ruleset, + (void *)NULL, 0, LANDLOCK_CREATE_RULESET_VERSION); + if (abi_version <= 0) + return -1; + + // ABI 1 except the few at the end + attr->handled_access_fs + = LANDLOCK_ACCESS_FS_EXECUTE + | LANDLOCK_ACCESS_FS_WRITE_FILE + | LANDLOCK_ACCESS_FS_READ_FILE + | LANDLOCK_ACCESS_FS_READ_DIR + | LANDLOCK_ACCESS_FS_REMOVE_DIR + | LANDLOCK_ACCESS_FS_REMOVE_FILE + | LANDLOCK_ACCESS_FS_MAKE_CHAR + | LANDLOCK_ACCESS_FS_MAKE_DIR + | LANDLOCK_ACCESS_FS_MAKE_REG + | LANDLOCK_ACCESS_FS_MAKE_SOCK + | LANDLOCK_ACCESS_FS_MAKE_FIFO + | LANDLOCK_ACCESS_FS_MAKE_BLOCK + | LANDLOCK_ACCESS_FS_MAKE_SYM +#ifdef LANDLOCK_ACCESS_FS_REFER + | LANDLOCK_ACCESS_FS_REFER // ABI 2 +#endif +#ifdef LANDLOCK_ACCESS_FS_TRUNCATE + | LANDLOCK_ACCESS_FS_TRUNCATE // ABI 3 +#endif +#ifdef LANDLOCK_ACCESS_FS_IOCTL_DEV + | LANDLOCK_ACCESS_FS_IOCTL_DEV // ABI 5 +#endif + ; + +#ifdef LANDLOCK_ACCESS_NET_BIND_TCP + // ABI 4 + attr->handled_access_net + = LANDLOCK_ACCESS_NET_BIND_TCP + | LANDLOCK_ACCESS_NET_CONNECT_TCP; +#endif + +#ifdef LANDLOCK_SCOPE_SIGNAL + // ABI 6 + attr->scoped + = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET + | LANDLOCK_SCOPE_SIGNAL; +#endif + + // Disable flags that require a new ABI version. 
+ switch (abi_version) { + case 1: +#ifdef LANDLOCK_ACCESS_FS_REFER + attr->handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER; +#endif + FALLTHROUGH; + + case 2: +#ifdef LANDLOCK_ACCESS_FS_TRUNCATE + attr->handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE; +#endif + FALLTHROUGH; + + case 3: +#ifdef LANDLOCK_ACCESS_NET_BIND_TCP + attr->handled_access_net = 0; +#endif + FALLTHROUGH; + + case 4: +#ifdef LANDLOCK_ACCESS_FS_IOCTL_DEV + attr->handled_access_fs &= ~LANDLOCK_ACCESS_FS_IOCTL_DEV; +#endif + FALLTHROUGH; + + case 5: +#ifdef LANDLOCK_SCOPE_SIGNAL + attr->scoped = 0; +#endif + FALLTHROUGH; + + default: + // We only know about the features of the ABIs 1-6. + break; + } + + return abi_version; +} + + +/// \brief Wrapper for the landlock_create_ruleset(2) syscall +/// +/// Syscall wrappers provide argument type checking. +/// +/// \note Remember to call `prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)` too! +static inline int +my_landlock_create_ruleset(const struct landlock_ruleset_attr *attr, + size_t size, uint32_t flags) +{ + return syscall(SYS_landlock_create_ruleset, attr, size, flags); +} + + +/// \brief Wrapper for the landlock_restrict_self(2) syscall +static inline int +my_landlock_restrict_self(int ruleset_fd, uint32_t flags) +{ + return syscall(SYS_landlock_restrict_self, ruleset_fd, flags); +} + +#endif diff --git a/src/common/sysdefs.h b/src/common/sysdefs.h index 5f3785b5137a..b10ffa7c3b18 100644 --- a/src/common/sysdefs.h +++ b/src/common/sysdefs.h @@ -23,17 +23,29 @@ # include <config.h> #endif -// This #define ensures that C99 and POSIX compliant stdio functions are -// available with MinGW-w64 (both 32-bit and 64-bit). Modern MinGW-w64 adds -// this automatically, for example, when the compiler is in C99 (or later) -// mode when building against msvcrt.dll. It still doesn't hurt to be explicit -// that we always want this and #define this unconditionally. +// Choose if MinGW-w64's stdio replacement functions should be used. 
+// The default has varied slightly in the past so it's clearest to always +// set it explicitly. // -// With Universal CRT (UCRT) this is less important because UCRT contains -// C99-compatible stdio functions. It's still nice to #define this as UCRT -// doesn't support the POSIX thousand separator flag in printf (like "%'u"). -#ifdef __MINGW32__ +// Modern MinGW-w64 enables the replacement functions even with UCRT +// when _GNU_SOURCE is defined. That's good because UCRT doesn't support +// the POSIX thousand separator flag in printf (like "%'u"). Otherwise +// XZ Utils works with the UCRT stdio functions. +// +// The replacement functions add over 20 KiB to each executable. For +// size-optimized builds (HAVE_SMALL), disable the replacements. +// Then thousand separators aren't shown in xz's messages but this is +// a minor downside compare to the slower speed of the HAVE_SMALL builds. +// +// The legacy MSVCRT is pre-C99 and it's best to always use the stdio +// replacements functions from MinGW-w64. +#if defined(__MINGW32__) && !defined(__USE_MINGW_ANSI_STDIO) # define __USE_MINGW_ANSI_STDIO 1 +# include <_mingw.h> +# if defined(_UCRT) && defined(HAVE_SMALL) +# undef __USE_MINGW_ANSI_STDIO +# define __USE_MINGW_ANSI_STDIO 0 +# endif #endif // size_t and NULL @@ -156,17 +168,26 @@ typedef unsigned char _Bool; # define __bool_true_false_are_defined 1 #endif +// We may need alignas from C11/C17/C23. +#if __STDC_VERSION__ >= 202311 + // alignas is a keyword in C23. Do nothing. +#elif __STDC_VERSION__ >= 201112 + // Oracle Developer Studio 12.6 lacks <stdalign.h>. + // For simplicity, avoid the header with all C11/C17 compilers. +# define alignas _Alignas +#elif defined(__GNUC__) || defined(__clang__) +# define alignas(n) __attribute__((__aligned__(n))) +#else +# define alignas(n) +#endif + #include <string.h> -// Visual Studio 2013 update 2 supports only __inline, not inline. -// MSVC v19.0 / VS 2015 and newer support both. 
+// MSVC v19.00 (VS 2015 version 14.0) and later should work. // // MSVC v19.27 (VS 2019 version 16.7) added support for restrict. // Older ones support only __restrict. #ifdef _MSC_VER -# if _MSC_VER < 1900 && !defined(inline) -# define inline __inline -# endif # if _MSC_VER < 1927 && !defined(restrict) # define restrict __restrict # endif @@ -196,4 +217,13 @@ typedef unsigned char _Bool; # define lzma_attr_alloc_size(x) #endif +#if __STDC_VERSION__ >= 202311 +# define FALLTHROUGH [[__fallthrough__]] +#elif (defined(__GNUC__) && __GNUC__ >= 7) \ + || (defined(__clang_major__) && __clang_major__ >= 10) +# define FALLTHROUGH __attribute__((__fallthrough__)) +#else +# define FALLTHROUGH ((void)0) +#endif + #endif diff --git a/src/common/tuklib_common.h b/src/common/tuklib_common.h index 7554dfc86fb6..d73f07255e4d 100644 --- a/src/common/tuklib_common.h +++ b/src/common/tuklib_common.h @@ -56,6 +56,13 @@ # define TUKLIB_GNUC_REQ(major, minor) 0 #endif +#if defined(__GNUC__) || defined(__clang__) +# define tuklib_attr_format_printf(fmt_index, args_index) \ + __attribute__((__format__(__printf__, fmt_index, args_index))) +#else +# define tuklib_attr_format_printf(fmt_index, args_index) +#endif + // tuklib_attr_noreturn attribute is used to mark functions as non-returning. // We cannot use "noreturn" as the macro name because then C23 code that // uses [[noreturn]] would break as it would expand to [[ [[noreturn]] ]]. @@ -68,9 +75,7 @@ // __attribute__((nonnull(1))) // extern void foo(const char *s); // -// FIXME: Update __STDC_VERSION__ for the final C23 version. 202000 is used -// by GCC 13 and Clang 15 with -std=c2x. 
-#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202000 +#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311 # define tuklib_attr_noreturn [[noreturn]] #elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112 # define tuklib_attr_noreturn _Noreturn diff --git a/src/common/tuklib_gettext.h b/src/common/tuklib_gettext.h index 3ef5cb7292b5..e5ad5e6f78a1 100644 --- a/src/common/tuklib_gettext.h +++ b/src/common/tuklib_gettext.h @@ -40,4 +40,15 @@ #endif #define N_(msgid) msgid +// Optional: Strings that are word wrapped using tuklib_mbstr_wrap may be +// marked with W_("foo) in the source code. xgettext can then add a comment +// to all such strings to inform translators. The following option needs to +// be added to XGETTEXT_OPTIONS in po/Makevars or in an equivalent place: +// +// '--keyword=W_:1,"This is word wrapped at spaces. The Unicode character U+00A0 works as a non-breaking space. Tab (\t) is interpret as a zero-width space (the tab itself is not displayed); U+200B is NOT supported. Manual word wrapping with \n is supported but requires care."' +// +// NOTE: The double-quotes in the --keyword argument above must be passed to +// xgettext as is, thus one needs the single-quotes in Makevars. +#define W_(msgid) _(msgid) + #endif diff --git a/src/common/tuklib_mbstr.h b/src/common/tuklib_mbstr.h index 4c8eeb7e3700..5ac06eb35e88 100644 --- a/src/common/tuklib_mbstr.h +++ b/src/common/tuklib_mbstr.h @@ -27,10 +27,7 @@ extern size_t tuklib_mbstr_width(const char *str, size_t *bytes); /// /// This is somewhat similar to wcswidth() but works on multibyte strings. /// -/// \param str String whose width is to be calculated. If the -/// current locale uses a multibyte character set -/// that has shift states, the string must begin -/// and end in the initial shift state. +/// \param str String whose width is to be calculated. 
/// \param bytes If this is not NULL, *bytes is set to the /// value returned by strlen(str) (even if an /// error occurs when calculating the width). @@ -38,8 +35,24 @@ extern size_t tuklib_mbstr_width(const char *str, size_t *bytes); /// \return On success, the number of columns needed to display the /// string e.g. in a terminal emulator is returned. On error, /// (size_t)-1 is returned. Possible errors include invalid, -/// partial, or non-printable multibyte character in str, or -/// that str doesn't end in the initial shift state. +/// partial, or non-printable multibyte character in str. + +#define tuklib_mbstr_width_mem TUKLIB_SYMBOL(tuklib_mbstr_width_mem) +extern size_t tuklib_mbstr_width_mem(const char *str, size_t len); +///< +/// \brief Get the number of columns needed for the multibyte buffer +/// +/// This is like tuklib_mbstr_width() except that this takes the buffer +/// length in bytes as the second argument. This allows using the function +/// for buffers that aren't terminated with '\0'. +/// +/// \param str String whose width is to be calculated. +/// \param len Number of bytes to read from str. +/// +/// \return On success, the number of columns needed to display the +/// string e.g. in a terminal emulator is returned. On error, +/// (size_t)-1 is returned. Possible errors include invalid, +/// partial, or non-printable multibyte character in str. 
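Editor's note: the contract documented above for tuklib_mbstr_width_mem() (a byte count instead of a terminating '\0', and (size_t)-1 on invalid or partial input) can be illustrated with a minimal standalone sketch. It is not the vendored implementation. The name sketch_width_mem is hypothetical, and the sketch makes two assumptions of its own: it counts one column per multibyte character (the simplification the patch describes for builds without wcwidth()), and it uses iswprint() to reject non-printable characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

// Hypothetical stand-in for tuklib_mbstr_width_mem(); NOT the vendored code.
// Returns the number of columns needed for the first len bytes of buf, or
// (size_t)-1 if the bytes contain an invalid, partial, or non-printable
// multibyte character in the current locale.
//
// Assumption: one multibyte character == one column, so wide (for example
// CJK) characters are undercounted.
static size_t
sketch_width_mem(const char *buf, size_t len)
{
	assert(buf != NULL);

	mbstate_t state;
	memset(&state, 0, sizeof(state));

	size_t width = 0;
	size_t i = 0;

	while (i < len) {
		wchar_t wc;
		const size_t ret = mbrtowc(&wc, buf + i, len - i, &state);

		// 0 means an embedded '\0'. (size_t)-1 and (size_t)-2 are
		// both larger than the remaining length. Reject all three.
		if (ret < 1 || ret > len - i)
			return (size_t)-1;

		i += ret;

		// Sketch-only check for non-printable characters.
		if (!iswprint((wint_t)wc))
			return (size_t)-1;

		++width;
	}

	return width;
}
```

Because the length is passed explicitly, a caller can measure a prefix of a longer string, for example sketch_width_mem(name, 3), without copying it into a NUL-terminated buffer first.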
#define tuklib_mbstr_fw TUKLIB_SYMBOL(tuklib_mbstr_fw) extern int tuklib_mbstr_fw(const char *str, int columns_min); diff --git a/src/common/tuklib_mbstr_nonprint.c b/src/common/tuklib_mbstr_nonprint.c new file mode 100644 index 000000000000..dc778757b148 --- /dev/null +++ b/src/common/tuklib_mbstr_nonprint.c @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: 0BSD + +/////////////////////////////////////////////////////////////////////////////// +// +/// \file tuklib_mbstr_nonprint.c +/// \brief Find and replace non-printable characters with question marks +// +// Author: Lasse Collin +// +/////////////////////////////////////////////////////////////////////////////// + +#include "tuklib_mbstr_nonprint.h" +#include <stdlib.h> +#include <string.h> +#include <errno.h> + +#ifdef HAVE_MBRTOWC +# include <wchar.h> +# include <wctype.h> +#else +# include <ctype.h> +#endif + + +static bool +is_next_printable(const char *str, size_t len, size_t *next_len) +{ +#ifdef HAVE_MBRTOWC + // This assumes that character sets with locking shift states aren't + // used, and thus mbsinit() is never needed. + mbstate_t ps; + memset(&ps, 0, sizeof(ps)); + + wchar_t wc; + *next_len = mbrtowc(&wc, str, len, &ps); + + if (*next_len == (size_t)-2) { + // Incomplete multibyte sequence: Treat the whole sequence + // as a single non-printable multibyte character that ends + // the string. + *next_len = len; + return false; + } + + // Check more broadly than just ret == (size_t)-1 to be safe + // in case mbrtowc() returns something weird. This check + // covers (size_t)-1 (that is, SIZE_MAX) too because len is from + // strlen() and the terminating '\0' isn't part of the length. + if (*next_len < 1 || *next_len > len) { + // Invalid multibyte sequence: Treat the first byte as + // a non-printable single-byte character. Decoding will + // be restarted from the next byte on the next call to + // this function. 
+ *next_len = 1; + return false; + } + +# if defined(_WIN32) && !defined(__CYGWIN__) + // On Windows, wchar_t stores UTF-16 code units, thus characters + // outside the Basic Multilingual Plane (BMP) don't fit into + // a single wchar_t. In an UTF-8 locale, UCRT's mbrtowc() returns + // successfully when the input is a non-BMP character but the + // output is the replacement character U+FFFD. + // + // iswprint() returns 0 for U+FFFD on Windows for some reason. Treat + // U+FFFD as printable and thus also all non-BMP chars as printable. + if (wc == 0xFFFD) + return true; +# endif + + return iswprint((wint_t)wc) != 0; +#else + (void)len; + *next_len = 1; + return isprint((unsigned char)str[0]) != 0; +#endif +} + + +static bool +has_nonprint(const char *str, size_t len) +{ + for (size_t i = 0; i < len; ) { + size_t next_len; + if (!is_next_printable(str + i, len - i, &next_len)) + return true; + + i += next_len; + } + + return false; +} + + +extern bool +tuklib_has_nonprint(const char *str) +{ + const int saved_errno = errno; + const bool ret = has_nonprint(str, strlen(str)); + errno = saved_errno; + return ret; +} + + +extern const char * +tuklib_mask_nonprint_r(const char *str, char **mem) +{ + const int saved_errno = errno; + + // Free the old string, if any. + free(*mem); + *mem = NULL; + + // If the whole input string contains only printable characters, + // return the input string. + const size_t len = strlen(str); + if (!has_nonprint(str, len)) { + errno = saved_errno; + return str; + } + + // Allocate memory for the masked string. Since we use the single-byte + // character '?' to mask non-printable characters, it's possible that + // a few bytes less memory would be needed in reality if multibyte + // characters are masked. + // + // If allocation fails, return "???" because it should be safer than + // returning the unmasked string. 
+ *mem = malloc(len + 1); + if (*mem == NULL) { + errno = saved_errno; + return "???"; + } + + // Replace all non-printable characters with '?'. + char *dest = *mem; + + for (size_t i = 0; i < len; ) { + size_t next_len; + if (is_next_printable(str + i, len - i, &next_len)) { + memcpy(dest, str + i, next_len); + dest += next_len; + } else { + *dest++ = '?'; + } + + i += next_len; + } + + *dest = '\0'; + + errno = saved_errno; + return *mem; +} + + +extern const char * +tuklib_mask_nonprint(const char *str) +{ + static char *mem = NULL; + return tuklib_mask_nonprint_r(str, &mem); +} diff --git a/src/common/tuklib_mbstr_nonprint.h b/src/common/tuklib_mbstr_nonprint.h new file mode 100644 index 000000000000..6fc969109fe0 --- /dev/null +++ b/src/common/tuklib_mbstr_nonprint.h @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: 0BSD + +/////////////////////////////////////////////////////////////////////////////// +// +/// \file tuklib_mbstr_nonprint.h +/// \brief Find and replace non-printable characters with question marks +/// +/// If mbrtowc(3) is available, it and iswprint(3) is used to check if all +/// characters are printable. Otherwise single-byte character set is assumed +/// and isprint(3) is used. +// +// Author: Lasse Collin +// +/////////////////////////////////////////////////////////////////////////////// + +#ifndef TUKLIB_MBSTR_NONPRINT_H +#define TUKLIB_MBSTR_NONPRINT_H + +#include "tuklib_common.h" +TUKLIB_DECLS_BEGIN + +#define tuklib_has_nonprint TUKLIB_SYMBOL(tuklib_has_nonprint) +extern bool tuklib_has_nonprint(const char *str); +///< +/// \brief Check if a string contains any non-printable characters +/// +/// \return false if str contains only valid multibyte characters and +/// iswprint(3) returns non-zero for all of them; true otherwise. +/// The value of errno is preserved. +/// +/// \note In case mbrtowc(3) isn't available, single-byte character set +/// is assumed and isprint(3) is used instead of iswprint(3). 
+ +#define tuklib_mask_nonprint_r TUKLIB_SYMBOL(tuklib_mask_nonprint_r) +extern const char *tuklib_mask_nonprint_r(const char *str, char **mem); +///< +/// \brief Replace non-printable characters with question marks +/// +/// \param str Untrusted string, for example, a filename +/// \param mem This function always calls free(*mem) to free the old +/// allocation and then sets *mem = NULL. Before the first +/// call, *mem should be initialized to NULL. If this +/// function needs to allocate memory for a modified +/// string, a pointer to the allocated memory will be +/// stored to *mem. Otherwise *mem will remain NULL. +/// +/// \return If tuklib_has_nonprint(str) returns false, this function +/// returns str. Otherwise memory is allocated to hold a modified +/// string and a pointer to that is returned. The pointer to the +/// allocated memory is also stored to *mem. A modified string +/// has the problematic characters replaced by '?'. If memory +/// allocation fails, "???" is returned and *mem is NULL. +/// The value of errno is preserved. + +#define tuklib_mask_nonprint TUKLIB_SYMBOL(tuklib_mask_nonprint) +extern const char *tuklib_mask_nonprint(const char *str); +///< +/// \brief Replace non-printable characters with question marks +/// +/// This is a convenience function for single-threaded use. This calls +/// tuklib_mask_nonprint_r() using an internal static variable to hold +/// the possible allocation. +/// +/// \param str Untrusted string, for example, a filename +/// +/// \return See tuklib_mask_nonprint_r(). +/// +/// \note This function is not thread safe! 
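Editor's note: the ownership rules documented above for tuklib_mask_nonprint_r() (the function frees *mem on every call, returns the input pointer unchanged when nothing needs masking, and falls back to "???" when allocation fails) are easiest to see in a small standalone sketch. It is not the vendored code. The name sketch_mask_nonprint_r is hypothetical, and the sketch assumes a single-byte character set checked with isprint(3), which corresponds to the path the patch uses when mbrtowc(3) is unavailable. Unlike the real function, it does not save and restore errno.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical single-byte stand-in for tuklib_mask_nonprint_r();
// NOT the vendored code. *mem must be NULL before the first call.
static const char *
sketch_mask_nonprint_r(const char *str, char **mem)
{
	assert(str != NULL && mem != NULL);

	// Always release the previous allocation, if any.
	free(*mem);
	*mem = NULL;

	// If every byte is printable, return the input itself.
	const size_t len = strlen(str);
	size_t i = 0;
	while (i < len && isprint((unsigned char)str[i]))
		++i;

	if (i == len)
		return str;

	// Masking is needed. Returning "???" on allocation failure is
	// safer than returning the unmasked string.
	*mem = malloc(len + 1);
	if (*mem == NULL)
		return "???";

	for (i = 0; i < len; ++i)
		(*mem)[i] = isprint((unsigned char)str[i]) ? str[i] : '?';

	(*mem)[len] = '\0';
	return *mem;
}
```

A caller keeps one char *mem = NULL per thread or per call site, passes &mem on every call, and calls free(mem) once when the masked strings are no longer needed. The returned pointer is only valid until the next call that uses the same mem.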
+ +TUKLIB_DECLS_END +#endif diff --git a/src/common/tuklib_mbstr_width.c b/src/common/tuklib_mbstr_width.c index 7a8bf0707518..98c611d8f38d 100644 --- a/src/common/tuklib_mbstr_width.c +++ b/src/common/tuklib_mbstr_width.c @@ -12,7 +12,7 @@ #include "tuklib_mbstr.h" #include <string.h> -#if defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH) +#ifdef HAVE_MBRTOWC # include <wchar.h> #endif @@ -24,9 +24,17 @@ tuklib_mbstr_width(const char *str, size_t *bytes) if (bytes != NULL) *bytes = len; -#if !(defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH)) + return tuklib_mbstr_width_mem(str, len); +} + + +extern size_t +tuklib_mbstr_width_mem(const char *str, size_t len) +{ +#ifndef HAVE_MBRTOWC // In single-byte mode, the width of the string is the same // as its length. + (void)str; return len; #else @@ -41,21 +49,35 @@ tuklib_mbstr_width(const char *str, size_t *bytes) while (i < len) { wchar_t wc; const size_t ret = mbrtowc(&wc, str + i, len - i, &state); - if (ret < 1 || ret > len) + if (ret < 1 || ret > len - i) return (size_t)-1; i += ret; +#ifdef HAVE_WCWIDTH const int wc_width = wcwidth(wc); if (wc_width < 0) return (size_t)-1; width += (size_t)wc_width; +#else + // Without wcwidth() (like in a native Windows build), + // assume that one multibyte char == one column. With + // UTF-8, this is less bad than one byte == one column. + // This way quite a few languages will be handled correctly + // in practice; CJK chars will be very wrong though. + ++width; +#endif } - // Require that the string ends in the initial shift state. - // This way the caller can be combine the string with other - // strings without needing to worry about the shift states. + // It's good to check that the string ended in the initial state. + // However, in practice this is redundant: + // + // - No one will use this code with character sets that have + // locking shift states. 
+ // + // - We already checked that mbrtowc() didn't return (size_t)-2 + // which would indicate a partial multibyte character. if (!mbsinit(&state)) return (size_t)-1; diff --git a/src/common/tuklib_mbstr_wrap.c b/src/common/tuklib_mbstr_wrap.c new file mode 100644 index 000000000000..8d906e004d75 --- /dev/null +++ b/src/common/tuklib_mbstr_wrap.c @@ -0,0 +1,294 @@ +// SPDX-License-Identifier: 0BSD + +/////////////////////////////////////////////////////////////////////////////// +// +/// \file tuklib_mbstr_wrap.c +/// \brief Word wraps a string and prints it to a FILE stream +/// +/// This depends on tuklib_mbstr_width.c. +// +// Author: Lasse Collin +// +/////////////////////////////////////////////////////////////////////////////// + +#include "tuklib_mbstr.h" +#include "tuklib_mbstr_wrap.h" +#include <stdarg.h> +#include <stdlib.h> +#include <stdio.h> +#include <string.h> + + +extern int +tuklib_wraps(FILE *outfile, const struct tuklib_wrap_opt *opt, const char *str) +{ + // left_cont may be less than left_margin. In that case, if the first + // word is extremely long, it will stay on the first line even if + // the line then gets overlong. + // + // On the other hand, left2_cont < left2_margin isn't allowed because + // it could result in inconsistent behavior when a very long word + // comes right after a \v. + // + // It is fine to have left2_margin < left_margin although it would be + // an odd use case. + if (!(opt->left_margin < opt->right_margin + && opt->left_cont < opt->right_margin + && opt->left2_margin <= opt->left2_cont + && opt->left2_cont < opt->right_margin)) + return TUKLIB_WRAP_ERR_OPT; + + // This is set to TUKLIB_WRAP_WARN_OVERLONG if one or more + // output lines extend past opt->right_margin columns. + int warn_overlong = 0; + + // Indentation of the first output line after \n or \r. + // \v sets this to opt->left2_margin. + // \r resets this back to the original value. 
+	size_t first_indent = opt->left_margin;
+
+	// Indentation of the output lines that occur due to word wrapping.
+	// \v sets this to opt->left2_cont and \r back to the original value.
+	size_t cont_indent = opt->left_cont;
+
+	// If word wrapping occurs, the newline isn't printed unless more
+	// text would be put on the continuation line. This is also used
+	// when \v needs to start on a new line.
+	bool pending_newline = false;
+
+	// Spaces are printed only when there is something else to put
+	// after the spaces on the line. This avoids unwanted empty lines
+	// in the output and makes it possible to ignore possible spaces
+	// before a \v character.
+	size_t pending_spaces = first_indent;
+
+	// Current output column. When cur_col == pending_spaces, nothing
+	// has been actually printed to the current output line.
+	size_t cur_col = pending_spaces;
+
+	while (true) {
+		// Number of bytes until the *next* line-break opportunity.
+		size_t len = 0;
+
+		// Number of columns until the *next* line-break opportunity.
+		size_t width = 0;
+
+		// Text between a pair of \b characters is treated as
+		// an unbreakable block even if it contains spaces.
+		// It must not contain any control characters before
+		// the closing \b.
+		bool unbreakable = false;
+
+		while (true) {
+			// Find the next character that we handle specially.
+			// In an unbreakable block, search only for the
+			// closing \b; if missing, the unbreakable block
+			// extends to the end of the string.
+			const size_t n = strcspn(str + len,
+					unbreakable ? "\b" : " \t\n\r\v\b");
+
+			// Calculate how many columns the characters need.
+			const size_t w = tuklib_mbstr_width_mem(str + len, n);
+			if (w == (size_t)-1)
+				return TUKLIB_WRAP_ERR_STR;
+
+			width += w;
+			len += n;
+
+			// \b isn't a line-break opportunity so it has to
+			// be handled here. For simplicity, empty blocks
+			// are treated as zero-width characters.
+			if (str[len] == '\b') {
+				++len;
+				unbreakable = !unbreakable;
+				continue;
+			}
+
+			break;
+		}
+
+		// Determine if adding this chunk of text would make the
+		// current output line exceed opt->right_margin columns.
+		const bool too_long = cur_col + width > opt->right_margin;
+
+		// Wrap the line if needed. However:
+		//
+		// - Don't wrap if the current column is less than where
+		//   the continuation line would begin. In that case
+		//   the chunk wouldn't fit on the next line either so
+		//   we just have to produce an overlong line.
+		//
+		// - Don't wrap if so far the line only contains spaces.
+		//   Wrapping in that case would leave a weird empty line.
+		//   NOTE: This "only contains spaces" condition is the
+		//   reason why left2_margin > left2_cont isn't allowed.
+		if (too_long && cur_col > cont_indent
+				&& cur_col > pending_spaces) {
+			// There might be trailing spaces or zero-width spaces
+			// which need to be ignored to keep the output pretty.
+			//
+			// Spaces need to be ignored because in some
+			// writing styles there are two spaces after
+			// a full stop. Example string:
+			//
+			//     "Foo bar.  Abc def."
+			//              ^
+			// If the first space after the first full stop
+			// triggers word wrapping, both spaces must be
+			// ignored. Otherwise the next line would be
+			// indented too much.
+			//
+			// Zero-width spaces are ignored the same way
+			// because they are meaningless if an adjacent
+			// character is a space.
+			while (*str == ' ' || *str == '\t')
+				++str;
+
+			// Don't print the newline here; only mark it as
+			// pending. This avoids an unwanted empty line if
+			// there is a \n or \r or \0 after the spaces have
+			// been ignored.
+			pending_newline = true;
+			pending_spaces = cont_indent;
+			cur_col = pending_spaces;
+
+			// Since str may have been incremented due to the
+			// ignored spaces, the loop needs to be restarted.
+			continue;
+		}
+
+		// Print the current chunk of text before the next
+		// line-break opportunity. If the chunk was empty,
+		// don't print anything so that the pending newline
+		// and pending spaces aren't printed on their own.
+		if (len > 0) {
+			if (pending_newline) {
+				pending_newline = false;
+				if (putc('\n', outfile) == EOF)
+					return TUKLIB_WRAP_ERR_IO;
+			}
+
+			while (pending_spaces > 0) {
+				if (putc(' ', outfile) == EOF)
+					return TUKLIB_WRAP_ERR_IO;
+
+				--pending_spaces;
+			}
+
+			for (size_t i = 0; i < len; ++i) {
+				// Ignore unbreakable block characters (\b).
+				const int c = (unsigned char)str[i];
+				if (c != '\b' && putc(c, outfile) == EOF)
+					return TUKLIB_WRAP_ERR_IO;
+			}
+
+			str += len;
+			cur_col += width;
+
+			// Remember if the line got overlong. If no other
+			// errors occur, we return warn_overlong. It might
+			// help in catching problematic strings.
+			if (too_long)
+				warn_overlong = TUKLIB_WRAP_WARN_OVERLONG;
+		}
+
+		// Handle the special character after the chunk of text.
+		switch (*str) {
+		case ' ':
+			// Regular space.
+			++cur_col;
+			++pending_spaces;
+			break;
+
+		case '\v':
+			// Set the alternative indentation settings.
+			first_indent = opt->left2_margin;
+			cont_indent = opt->left2_cont;
+
+			if (first_indent > cur_col) {
+				// Add one or more spaces to reach
+				// the column specified in first_indent.
+				pending_spaces += first_indent - cur_col;
+			} else {
+				// There is no room to add even one space
+				// before reaching the column first_indent.
+				pending_newline = true;
+				pending_spaces = first_indent;
+			}
+
+			cur_col = first_indent;
+			break;
+
+		case '\0': // Implicit newline at the end of the string.
+		case '\r': // Newline that also resets the effect of \v.
+		case '\n': // Newline without resetting the indentation mode.
+			if (putc('\n', outfile) == EOF)
+				return TUKLIB_WRAP_ERR_IO;
+
+			if (*str == '\0')
+				return warn_overlong;
+
+			if (*str == '\r') {
+				first_indent = opt->left_margin;
+				cont_indent = opt->left_cont;
+			}
+
+			pending_newline = false;
+			pending_spaces = first_indent;
+			cur_col = first_indent;
+			break;
+		}
+
+		// Skip the specially-handled character.
+		++str;
+	}
+}
+
+
+extern int
+tuklib_wrapf(FILE *stream, const struct tuklib_wrap_opt *opt,
+		const char *fmt, ...)
+{
+	va_list ap;
+	char *buf;
+
+#ifdef HAVE_VASPRINTF
+	va_start(ap, fmt);
+
+#ifdef __clang__
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wformat-nonliteral"
+#endif
+	const int n = vasprintf(&buf, fmt, ap);
+#ifdef __clang__
+# pragma GCC diagnostic pop
+#endif
+
+	va_end(ap);
+	if (n == -1)
+		return TUKLIB_WRAP_ERR_FORMAT;
+#else
+	// Fixed buffer size is dumb but in practice one shouldn't need
+	// huge strings for *formatted* output. This simple method is safe
+	// with pre-C99 vsnprintf() implementations too which don't return
+	// the required buffer size (they return -1 or buf_size - 1) or
+	// which might not null-terminate the buffer in case it's too small.
+	const size_t buf_size = 128 * 1024;
+	buf = malloc(buf_size);
+	if (buf == NULL)
+		return TUKLIB_WRAP_ERR_FORMAT;
+
+	va_start(ap, fmt);
+	const int n = vsnprintf(buf, buf_size, fmt, ap);
+	va_end(ap);
+
+	if (n <= 0 || n >= (int)(buf_size - 1)) {
+		free(buf);
+		return TUKLIB_WRAP_ERR_FORMAT;
+	}
+#endif
+
+	const int ret = tuklib_wraps(stream, opt, buf);
+	free(buf);
+	return ret;
+}
diff --git a/src/common/tuklib_mbstr_wrap.h b/src/common/tuklib_mbstr_wrap.h
new file mode 100644
index 000000000000..4e2f297dabb4
--- /dev/null
+++ b/src/common/tuklib_mbstr_wrap.h
@@ -0,0 +1,204 @@
+// SPDX-License-Identifier: 0BSD
+
+///////////////////////////////////////////////////////////////////////////////
+//
+/// \file tuklib_mbstr_wrap.h
+/// \brief Word wrapping for multibyte strings
+///
+/// The word wrapping functions are intended to be usable, for example,
+/// for printing --help text in command line tools. While manually-wrapped
+/// --help text allows precise formatting, such freedom requires translators
+/// to count spaces and determine where line breaks should occur. It's
+/// tedious and error prone, and experience has shown that only some
+/// translators do it well. Automatic word wrapping is less flexible but
+/// results in polished-enough look with less effort from everyone.
+/// Right-to-left languages and languages that don't use spaces between
+/// words will still need extra effort though.
+//
+// Author: Lasse Collin
+//
+///////////////////////////////////////////////////////////////////////////////
+
+#ifndef TUKLIB_MBSTR_WRAP_H
+#define TUKLIB_MBSTR_WRAP_H
+
+#include "tuklib_common.h"
+#include <stdio.h>
+
+TUKLIB_DECLS_BEGIN
+
+/// One or more output lines exceeded right_margin.
+/// This only a warning; everything was still printed successfully.
+#define TUKLIB_WRAP_WARN_OVERLONG 0x01
+
+/// Error writing to to the output FILE. The error flag in the FILE
+/// should have been set as well.
+#define TUKLIB_WRAP_ERR_IO 0x02
+
+/// Invalid options in struct tuklib_wrap_opt.
+/// Nothing was printed.
+#define TUKLIB_WRAP_ERR_OPT 0x04
+
+/// Invalid or unsupported multibyte character in the input string:
+/// either mbrtowc() failed or wcwidth() returned a negative value.
+#define TUKLIB_WRAP_ERR_STR 0x08
+
+/// Only tuklib_wrapf(): Error in converting the format string.
+/// It's either a memory allocation failure or something bad with the
+/// format string or arguments.
+#define TUKLIB_WRAP_ERR_FORMAT 0x10
+
+/// Options for tuklib_wraps() and tuklib_wrapf()
+struct tuklib_wrap_opt {
+	/// Indentation of the first output line after `\n` or `\r`.
+	/// This can be anything less than right_margin.
+	unsigned short left_margin;
+
+	/// Column where word-wrapped continuation lines start.
+	/// This can be anything less than right_margin.
+	unsigned short left_cont;
+
+	/// Column where the text after `\v` will start, either on the current
+	/// line (when there is room to add at least one space) or on a new
+	/// empty line.
+	unsigned short left2_margin;
+
+	/// Like left_cont but for text after a `\v`. However, this must
+	/// be greater than or equal to left2_margin in addition to being
+	/// less than right_margin.
+	unsigned short left2_cont;
+
+	/// For 80-column terminals, it is recommended to use 79 here for
+	/// maximum portability. 80 will work most of the time but it will
+	/// result in unwanted empty lines in the rare case where a terminal
+	/// moves the cursor to the beginning of the next line immediately
+	/// when the last column has been used.
+	unsigned short right_margin;
+};
+
+#define tuklib_wraps TUKLIB_SYMBOL(tuklib_wraps)
+extern int tuklib_wraps(FILE *stream, const struct tuklib_wrap_opt *opt,
+		const char *str);
+///<
+/// \brief Word wrap a multibyte string and write it to a FILE
+///
+/// Word wrapping is done only at spaces and at the special control characters
+/// described below. Multiple consecutive spaces are handled properly: strings
+/// that have two (or more) spaces after a full sentence will look good even
+/// when the spaces occur at a word wrapping boundary. Trailing spaces are
+/// ignored at the end of a line or at the end of a string.
+///
+/// The following control characters have been repurposed:
+///
+/// - `\t` = Zero-width space allows a line break without producing any
+///   output by itself. This can be useful after hard hyphens as
+///   hyphens aren't otherwise used for line breaking. This can also
+///   be useful in languages that don't use spaces between words.
+///   (The Unicode character U+200B isn't supported.)
+/// - `\b` = Text between a pair of `\b` characters is treated as an
+///   unbreakable block (not wrapped even if there are spaces).
+///   For example, a non-breaking space can be done like
+///   in `"123\b \bMiB"`. Control characters (like `\n` or `\t`)
+///   aren't allowed before the closing `\b`. If closing `\b` is
+///   missing, the block extends to the end of the string. Empty
+///   blocks are treated as zero-width characters. If line breaks
+///   are possible around an empty block (like in `"foo \b\b bar"`
+///   or `"foo \b"`), it can result in weird output.
+/// - `\v` = Change to alternative indentation (left2_margin).
+/// - `\r` = Reset back to the initial indentation and add a newline.
+///   The next line will be indented by left_margin.
+/// - `\n` = Add a newline without resetting the effect of `\v`. The
+///   next line will be indented by left_margin or left2_margin
+///   (not left_cont or left2_cont).
+///
+/// Only `\n` should appear in translatable strings. `\t` works too but
+/// even that might confuse some translators even if there is a TRANSLATORS
+/// comment explaining its meaning.
+///
+/// To use the other control characters in messages, one should use
+/// tuklib_wrapf() with appropriate printf format string to combine
+/// translatable strings with non-translatable portions. For example:
+///
+/// \code{.c}
+///     static const struct tuklib_wrap_opt wrap2 = { 2, 2, 22, 22, 79 };
+///     int e = 0;
+///     ...
+///     e |= tuklib_wrapf(stdout, &wrap2,
+///             "-h, --help\v%s\r"
+///             "    --version\v%s",
+///             W_("display this help and exit"),
+///             W_("display version information and exit"));
+///     ...
+///     if (e != 0) {
+///         // Handle warning or error.
+///         ...
+///     }
+/// \endcode
+///
+/// Control characters other than `\n` and `\t` are unusable in
+/// translatable strings:
+///
+/// - Gettext tools show annoying warnings if C escape sequences other
+///   than `\n` or `\t` are seen. (Otherwise they still work perfectly
+///   fine though.)
+///
+/// - While at least Poedit and Lokalize support all escapes, some
+///   editors only support `\n` and `\t`.
+///
+/// - They could confuse some translators, resulting in broken
+///   translations.
+///
+/// Using non-control characters would solve some issues but it wouldn't
+/// help with the unfortunate real-world issue that some translators would
+/// likely have trouble understanding a new syntax. The Gettext manual
+/// specifically warns about this, see the subheading "No unusual markup"
+/// in `info (gettext)Preparing Strings`. (While using `\t` for zero-width
+/// space is such custom markup, most translators will never need it.)
+///
+/// Translators can use the Unicode character U+00A0 (or U+202F) if they
+/// need a non-breaking space. For example, in French a non-breaking space
+/// may be needed before colons and question marks (U+00A0 is common in
+/// real-world French PO files).
+///
+/// Using a non-ASCII char in a string in the C code (like `"123\u00A0MiB"`)
+/// can work if one tells xgettext that input encoding is UTF-8, one
+/// ensures that the C compiler uses UTF-8 as the input charset, and one
+/// is certain that the program is *always* run under an UTF-8 locale.
+/// Unfortunately a portable program cannot make this kind of assumptions,
+/// which means that there is no pretty way to have a non-breaking space in
+/// a translatable string.
+///
+/// Optional: To tell translators which strings are automatically word
+/// wrapped, see the macro `W_` in tuklib_gettext.h.
+///
+/// \param stream Output FILE stream. For decent performance, it
+///               should be in buffered mode because this function
+///               writes the output one byte at a time with fputc().
+/// \param opt Word wrapping options.
+/// \param str Null-terminated multibyte string that is in
+///            the encoding used by the current locale.
+///
+/// \return Returns 0 on success. If an error or warning occurs, one of
+///         TUKLIB_WRAP_* codes is returned. Those codes are powers
+///         of two. When warning/error detection can be delayed, the
+///         return values can be accumulated from multiple calls using
+///         bitwise-or into a single variable which can be checked after
+///         all strings have (hopefully) been printed.
+
+#define tuklib_wrapf TUKLIB_SYMBOL(tuklib_wrapf)
+tuklib_attr_format_printf(3, 4)
+extern int tuklib_wrapf(FILE *stream, const struct tuklib_wrap_opt *opt,
+		const char *fmt, ...);
+///<
+/// \brief Format and word-wrap a multibyte string and write it to a FILE
+///
+/// This is like tuklib_wraps() except that this takes a printf
+/// format string.
+///
+/// \note On platforms that lack vasprintf(), the intermediate
+///       result from vsnprintf() must fit into a 128 KiB buffer.
+///       TUKLIB_WRAP_ERR_FORMAT is returned if it doesn't but
+///       only on platforms that lack vasprintf().
+
+TUKLIB_DECLS_END
+#endif
diff --git a/src/common/tuklib_physmem.c b/src/common/tuklib_physmem.c
index 1009df14d9d1..5988ba77a284 100644
--- a/src/common/tuklib_physmem.c
+++ b/src/common/tuklib_physmem.c
@@ -91,18 +91,11 @@ tuklib_physmem(void)
 		// supports reporting values greater than 4 GiB. To keep the
 		// code working also on older Windows versions, use
 		// GlobalMemoryStatusEx() conditionally.
-		HMODULE kernel32 = GetModuleHandle(TEXT("kernel32.dll"));
+		HMODULE kernel32 = GetModuleHandleA("kernel32.dll");
 		if (kernel32 != NULL) {
 			typedef BOOL (WINAPI *gmse_type)(LPMEMORYSTATUSEX);
-#ifdef CAN_DISABLE_WCAST_FUNCTION_TYPE
-# pragma GCC diagnostic push
-# pragma GCC diagnostic ignored "-Wcast-function-type"
-#endif
 			gmse_type gmse = (gmse_type)GetProcAddress(
 					kernel32, "GlobalMemoryStatusEx");
-#ifdef CAN_DISABLE_WCAST_FUNCTION_TYPE
-# pragma GCC diagnostic pop
-#endif
 			if (gmse != NULL) {
 				MEMORYSTATUSEX meminfo;
 				meminfo.dwLength = sizeof(meminfo);
@@ -155,7 +148,7 @@ tuklib_physmem(void)
 		ret += entries[i].end - entries[i].start + 1;
 
 #elif defined(TUKLIB_PHYSMEM_AIX)
-	ret = _system_configuration.physmem;
+	ret = (uint64_t)_system_configuration.physmem;
 
 #elif defined(TUKLIB_PHYSMEM_SYSCONF)
 	const long pagesize = sysconf(_SC_PAGESIZE);
diff --git a/src/common/w32_application.manifest.comments.txt b/src/common/w32_application.manifest.comments.txt
index ad0835ccb0b1..de5c2105acf9 100644
--- a/src/common/w32_application.manifest.comments.txt
+++ b/src/common/w32_application.manifest.comments.txt
@@ -67,6 +67,13 @@ This is useful for programs that use main():
     to the UTF-8 code page and aren't distinguishable from
     filenames that contain the actual replacement character U+FFFD.
 
+    FindFirstFileA() and FindFirstFileExA() also suffer from the above
+    issue where unpaired surrogates become U+FFFD. Another issue is
+    that filenames may require more bytes in UTF-8 than in a legacy
+    code page. In UTF-8, a very long filename may exceed MAX_PATH bytes
+    and thus these APIs cannot list such filenames anymore because
+    WIN32_FIND_DATAA has a member "CHAR cFileName[MAX_PATH]".
+
     If different programs use different code pages, compatibility issues
     are possible. For example, if one program produces a list of
     filenames and another program reads it, both programs should use
@@ -82,11 +89,18 @@ when writing to console with printf(). With UCRT it works.
 Long path names
 ---------------
 
-The manifest enables support for path names longer than 259
-characters if the feature has been enabled in the Windows registry.
-Omit the longPathAware element from the manifest if the application
-isn't compatible with it. For example, uses of MAX_PATH might be
-a sign of incompatibility.
+The manifest enables support for path names longer than 260 wide
+characters (UTF-16 code units) if the feature has been enabled in
+the Windows registry. Omit the longPathAware element from the manifest
+if the application isn't compatible with it. For example, some uses
+of MAX_PATH might be a sign of incompatibility.
+
+Note that UTF-8 encoded filenames can exceed MAX_PATH (260) bytes when
+the UTF-16 form is still within MAX_PATH wide characters. In this
+situation the application doesn't need to be long path aware: functions
+like _open() work with UTF-8 names that exceed MAX_PATH bytes if the
+wide character form stays within MAX_PATH wide characters. (MAX_PATH
+includes the terminating null character.)
 
 Documentation of the registry setting:
 https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry#enable-long-paths-in-windows-10-version-1607-and-later
@@ -123,9 +137,9 @@ trustInfo
 longPathAware
 
     Declare the application as long path aware. This way many file
-    system operations aren't limited by MAX_PATH (260 characters
-    including the terminating null character) if the feature has
-    also been enabled in the Windows registry.
+    system operations aren't limited to MAX_PATH (260) wide characters
+    (including the terminating null character). The feature has to be
+    enabled in the Windows registry too.
 
 activeCodePage
 
diff --git a/src/liblzma/api/lzma/bcj.h b/src/liblzma/api/lzma/bcj.h
index 7f6611feb325..fb737cbba49c 100644
--- a/src/liblzma/api/lzma/bcj.h
+++ b/src/liblzma/api/lzma/bcj.h
@@ -96,3 +96,100 @@ typedef struct {
 	uint32_t start_offset;
 
 } lzma_options_bcj;
+
+
+/**
+ * \brief Raw ARM64 BCJ encoder
+ *
+ * This is for special use cases only.
+ *
+ * \param start_offset The lowest 32 bits of the offset in the
+ *                     executable being filtered. For the ARM64
+ *                     filter, this must be a multiple of four.
+ *                     For the very best results, this should also
+ *                     be in sync with 4096-byte page boundaries
+ *                     in the executable due to how ARM64's ADRP
+ *                     instruction works.
+ * \param buf Buffer to be filtered in place
+ * \param size Size of the buffer
+ *
+ * \return Number of bytes that were processed in `buf`. This is at most
+ *         `size`. With the ARM64 filter, the return value is always
+ *         a multiple of 4, and at most 3 bytes are left unfiltered.
+ *
+ * \since 5.7.1alpha
+ */
+extern LZMA_API(size_t) lzma_bcj_arm64_encode(
+		uint32_t start_offset, uint8_t *buf, size_t size) lzma_nothrow;
+
+/**
+ * \brief Raw ARM64 BCJ decoder
+ *
+ * See lzma_bcj_arm64_encode().
+ *
+ * \since 5.7.1alpha
+ */
+extern LZMA_API(size_t) lzma_bcj_arm64_decode(
+		uint32_t start_offset, uint8_t *buf, size_t size) lzma_nothrow;
+
+
+/**
+ * \brief Raw RISC-V BCJ encoder
+ *
+ * This is for special use cases only.
+ *
+ * \param start_offset The lowest 32 bits of the offset in the
+ *                     executable being filtered. For the RISC-V
+ *                     filter, this must be a multiple of 2.
+ * \param buf Buffer to be filtered in place
+ * \param size Size of the buffer
+ *
+ * \return Number of bytes that were processed in `buf`. This is at most
+ *         `size`. With the RISC-V filter, the return value is always
+ *         a multiple of 2, and at most 7 bytes are left unfiltered.
+ *
+ * \since 5.7.1alpha
+ */
+extern LZMA_API(size_t) lzma_bcj_riscv_encode(
+		uint32_t start_offset, uint8_t *buf, size_t size) lzma_nothrow;
+
+/**
+ * \brief Raw RISC-V BCJ decoder
+ *
+ * See lzma_bcj_riscv_encode().
+ *
+ * \since 5.7.1alpha
+ */
+extern LZMA_API(size_t) lzma_bcj_riscv_decode(
+		uint32_t start_offset, uint8_t *buf, size_t size) lzma_nothrow;
+
+
+/**
+ * \brief Raw x86 BCJ encoder
+ *
+ * This is for special use cases only.
+ *
+ * \param start_offset The lowest 32 bits of the offset in the
+ *                     executable being filtered. For the x86
+ *                     filter, all values are valid.
+ * \param buf Buffer to be filtered in place
+ * \param size Size of the buffer
+ *
+ * \return Number of bytes that were processed in `buf`. This is at most
+ *         `size`. For the x86 filter, the return value is always
+ *         a multiple of 1, and at most 4 bytes are left unfiltered.
+ *
+ * \since 5.7.1alpha
+ */
+extern LZMA_API(size_t) lzma_bcj_x86_encode(
+		uint32_t start_offset, uint8_t *buf, size_t size) lzma_nothrow;
+
+/**
+ * \brief Raw x86 BCJ decoder
+ *
+ * See lzma_bcj_x86_encode().
+ *
+ * \since 5.7.1alpha
+ */
+extern LZMA_API(size_t) lzma_bcj_x86_decode(
+		uint32_t start_offset, uint8_t *buf, size_t size) lzma_nothrow;
diff --git a/src/liblzma/api/lzma/container.h b/src/liblzma/api/lzma/container.h
index ee5d77e4f1af..dbd414cbf8c0 100644
--- a/src/liblzma/api/lzma/container.h
+++ b/src/liblzma/api/lzma/container.h
@@ -573,7 +573,7 @@ extern LZMA_API(lzma_ret) lzma_stream_buffer_encode(
  * The action argument must be LZMA_FINISH and the return value will never be
  * LZMA_OK. Thus the encoding is always done with a single lzma_code() after
  * the initialization. The benefit of the combination of initialization
- * function and lzma_code() is that memory allocations can be re-used for
+ * function and lzma_code() is that memory allocations can be reused for
  * better performance.
  *
  * lzma_code() will try to encode as much input as is possible to fit into
diff --git a/src/liblzma/api/lzma/lzma12.h b/src/liblzma/api/lzma/lzma12.h
index 05f5b66eb56a..fec3e0dadb23 100644
--- a/src/liblzma/api/lzma/lzma12.h
+++ b/src/liblzma/api/lzma/lzma12.h
@@ -461,7 +461,7 @@ typedef struct {
 	 *
 	 * ext_size_low holds the least significant 32 bits of the
 	 * uncompressed size. The most significant 32 bits must be set
-	 * in ext_size_high. The macro lzma_ext_size_set(opt_lzma, u64size)
+	 * in ext_size_high. The macro lzma_set_ext_size(opt_lzma, u64size)
 	 * can be used to set these members.
 	 *
 	 * The 64-bit uncompressed size is split into two uint32_t variables
diff --git a/src/liblzma/api/lzma/version.h b/src/liblzma/api/lzma/version.h
index e86c0ea4c3d1..86b355635961 100644
--- a/src/liblzma/api/lzma/version.h
+++ b/src/liblzma/api/lzma/version.h
@@ -19,10 +19,10 @@
 #define LZMA_VERSION_MAJOR 5
 
 /** \brief Minor version number of the liblzma release. */
-#define LZMA_VERSION_MINOR 6
+#define LZMA_VERSION_MINOR 8
 
 /** \brief Patch version number of the liblzma release. */
-#define LZMA_VERSION_PATCH 3
+#define LZMA_VERSION_PATCH 1
 
 /**
  * \brief Version stability marker
diff --git a/src/liblzma/check/check.h b/src/liblzma/check/check.h
index f0eb1172d907..16a56334211a 100644
--- a/src/liblzma/check/check.h
+++ b/src/liblzma/check/check.h
@@ -95,24 +95,6 @@ typedef struct {
 } lzma_check_state;
 
 
-/// lzma_crc32_table[0] is needed by LZ encoder so we need to keep
-/// the array two-dimensional.
-#ifdef HAVE_SMALL
-lzma_attr_visibility_hidden
-extern uint32_t lzma_crc32_table[1][256];
-
-extern void lzma_crc32_init(void);
-
-#else
-
-lzma_attr_visibility_hidden
-extern const uint32_t lzma_crc32_table[8][256];
-
-lzma_attr_visibility_hidden
-extern const uint64_t lzma_crc64_table[4][256];
-#endif
-
-
 /// \brief Initialize *check depending on type
 extern void lzma_check_init(lzma_check_state *check, lzma_check type);
 
diff --git a/src/liblzma/check/crc32_arm64.h b/src/liblzma/check/crc32_arm64.h
index 39c1c63ec0ec..fb0e8f0105a9 100644
--- a/src/liblzma/check/crc32_arm64.h
+++ b/src/liblzma/check/crc32_arm64.h
@@ -7,7 +7,7 @@
 //
 // Authors: Chenxi Mao
 // Jia Tan
-// Hans Jansen
+// Lasse Collin
 //
 ///////////////////////////////////////////////////////////////////////////////
 
@@ -49,25 +49,50 @@ crc32_arch_optimized(const uint8_t *buf, size_t size, uint32_t crc)
 {
 	crc = ~crc;
 
-	// Align the input buffer because this was shown to be
-	// significantly faster than unaligned accesses.
-	const size_t align_amount = my_min(size, (0U - (uintptr_t)buf) & 7);
+	if (size >= 8) {
+		// Align the input buffer because this was shown to be
+		// significantly faster than unaligned accesses.
+		const size_t align = (0 - (uintptr_t)buf) & 7;
 
-	for (const uint8_t *limit = buf + align_amount; buf < limit; ++buf)
-		crc = __crc32b(crc, *buf);
+		if (align & 1)
+			crc = __crc32b(crc, *buf++);
+
+		if (align & 2) {
+			crc = __crc32h(crc, aligned_read16le(buf));
+			buf += 2;
+		}
+
+		if (align & 4) {
+			crc = __crc32w(crc, aligned_read32le(buf));
+			buf += 4;
+		}
 
-	size -= align_amount;
+		size -= align;
 
-	// Process 8 bytes at a time. The end point is determined by
-	// ignoring the least significant three bits of size to ensure
-	// we do not process past the bounds of the buffer. This guarantees
-	// that limit is a multiple of 8 and is strictly less than size.
-	for (const uint8_t *limit = buf + (size & ~(size_t)7);
-			buf < limit; buf += 8)
-		crc = __crc32d(crc, aligned_read64le(buf));
+		// Process 8 bytes at a time. The end point is determined by
+		// ignoring the least significant three bits of size to
+		// ensure we do not process past the bounds of the buffer.
+		// This guarantees that limit is a multiple of 8 and is
+		// strictly less than size.
+		for (const uint8_t *limit = buf + (size & ~(size_t)7);
+				buf < limit; buf += 8)
+			crc = __crc32d(crc, aligned_read64le(buf));
+
+		size &= 7;
+	}
 
 	// Process the remaining bytes that are not 8 byte aligned.
-	for (const uint8_t *limit = buf + (size & 7); buf < limit; ++buf)
+	if (size & 4) {
+		crc = __crc32w(crc, aligned_read32le(buf));
+		buf += 4;
+	}
+
+	if (size & 2) {
+		crc = __crc32h(crc, aligned_read16le(buf));
+		buf += 2;
+	}
+
+	if (size & 1)
 		crc = __crc32b(crc, *buf);
 
 	return ~crc;
diff --git a/src/liblzma/check/crc32_fast.c b/src/liblzma/check/crc32_fast.c
index 16dbb7467513..3c7cb95f57b7 100644
--- a/src/liblzma/check/crc32_fast.c
+++ b/src/liblzma/check/crc32_fast.c
@@ -7,7 +7,6 @@
 //
 // Authors: Lasse Collin
 // Ilya Kurdyukov
-// Hans Jansen
 //
 ///////////////////////////////////////////////////////////////////////////////
 
@@ -15,10 +14,12 @@
 #include "crc_common.h"
 
 #if defined(CRC_X86_CLMUL)
-# define BUILDING_CRC32_CLMUL
+# define BUILDING_CRC_CLMUL 32
 # include "crc_x86_clmul.h"
 #elif defined(CRC32_ARM64)
 # include "crc32_arm64.h"
+#elif defined(CRC32_LOONGARCH)
+# include "crc32_loongarch.h"
 #endif
 
 
@@ -28,8 +29,19 @@
 // Generic CRC32 //
 ///////////////////
 
+#ifdef WORDS_BIGENDIAN
+# include "crc32_table_be.h"
+#else
+# include "crc32_table_le.h"
+#endif
+
+
+#ifdef HAVE_CRC_X86_ASM
+extern uint32_t lzma_crc32_generic(
+		const uint8_t *buf, size_t size, uint32_t crc);
+#else
 static uint32_t
-crc32_generic(const uint8_t *buf, size_t size, uint32_t crc)
+lzma_crc32_generic(const uint8_t *buf, size_t size, uint32_t crc)
 {
 	crc = ~crc;
 
@@ -85,7 +97,8 @@ crc32_generic(const uint8_t *buf, size_t size, uint32_t crc)
 
 	return ~crc;
 }
-#endif
+#endif // HAVE_CRC_X86_ASM
+#endif // CRC32_GENERIC
 
 
 #if defined(CRC32_GENERIC) && defined(CRC32_ARCH_OPTIMIZED)
@@ -119,7 +132,7 @@ static crc32_func_type
 crc32_resolve(void)
 {
 	return is_arch_extension_supported()
-			? &crc32_arch_optimized : &crc32_generic;
+			? &crc32_arch_optimized : &lzma_crc32_generic;
 }
 
 
@@ -164,27 +177,6 @@ extern LZMA_API(uint32_t)
 lzma_crc32(const uint8_t *buf, size_t size, uint32_t crc)
 {
 #if defined(CRC32_GENERIC) && defined(CRC32_ARCH_OPTIMIZED)
-	// On x86-64, if CLMUL is available, it is the best for non-tiny
-	// inputs, being over twice as fast as the generic slice-by-four
-	// version. However, for size <= 16 it's different. In the extreme
-	// case of size == 1 the generic version can be five times faster.
-	// At size >= 8 the CLMUL starts to become reasonable. It
-	// varies depending on the alignment of buf too.
-	//
-	// The above doesn't include the overhead of mythread_once().
-	// At least on x86-64 GNU/Linux, pthread_once() is very fast but
-	// it still makes lzma_crc32(buf, 1, crc) 50-100 % slower. When
-	// size reaches 12-16 bytes the overhead becomes negligible.
-	//
-	// So using the generic version for size <= 16 may give better
-	// performance with tiny inputs but if such inputs happen rarely
-	// it's not so obvious because then the lookup table of the
-	// generic version may not be in the processor cache.
-#ifdef CRC_USE_GENERIC_FOR_SMALL_INPUTS
-	if (size <= 16)
-		return crc32_generic(buf, size, crc);
-#endif
-
 /*
 #ifndef HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR
 	// See crc32_dispatch(). This would be the alternative which uses
@@ -199,6 +191,6 @@ lzma_crc32(const uint8_t *buf, size_t size, uint32_t crc)
 	return crc32_arch_optimized(buf, size, crc);
 
 #else
-	return crc32_generic(buf, size, crc);
+	return lzma_crc32_generic(buf, size, crc);
 #endif
 }
diff --git a/src/liblzma/check/crc32_loongarch.h b/src/liblzma/check/crc32_loongarch.h
new file mode 100644
index 000000000000..ec738b83d70a
--- /dev/null
+++ b/src/liblzma/check/crc32_loongarch.h
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: 0BSD
+
+///////////////////////////////////////////////////////////////////////////////
+//
+/// \file crc32_loongarch.h
+/// \brief CRC32 calculation with LoongArch optimization
+//
+// Authors: Xi Ruoyao
+// Lasse Collin
+//
+///////////////////////////////////////////////////////////////////////////////
+
+#ifndef LZMA_CRC32_LOONGARCH_H
+#define LZMA_CRC32_LOONGARCH_H
+
+#include <larchintrin.h>
+
+
+static uint32_t
+crc32_arch_optimized(const uint8_t *buf, size_t size, uint32_t crc_unsigned)
+{
+	int32_t crc = (int32_t)~crc_unsigned;
+
+	if (size >= 8) {
+		const size_t align = (0 - (uintptr_t)buf) & 7;
+
+		if (align & 1)
+			crc = __crc_w_b_w((int8_t)*buf++, crc);
+
+		if (align & 2) {
+			crc = __crc_w_h_w((int16_t)aligned_read16le(buf), crc);
+			buf += 2;
+		}
+
+		if (align & 4) {
+			crc = __crc_w_w_w((int32_t)aligned_read32le(buf), crc);
+			buf += 4;
+		}
+
+		size -= align;
+
+		for (const uint8_t *limit = buf + (size & ~(size_t)7);
+				buf < limit; buf += 8)
+			crc = __crc_w_d_w((int64_t)aligned_read64le(buf), crc);
+
+		size &= 7;
+	}
+
+	if (size & 4) {
+		crc = __crc_w_w_w((int32_t)aligned_read32le(buf), crc);
+		buf += 4;
+	}
+
+	if (size & 2) {
+		crc = __crc_w_h_w((int16_t)aligned_read16le(buf), crc);
+		buf += 2;
+	}
+
+	if (size & 1)
+		crc = __crc_w_b_w((int8_t)*buf, crc);
+
+	return (uint32_t)~crc;
+}
+
+#endif // LZMA_CRC32_LOONGARCH_H
diff --git a/src/liblzma/check/crc32_small.c b/src/liblzma/check/crc32_small.c
index 6a1bd66185ea..4a62830c807a 100644
--- a/src/liblzma/check/crc32_small.c
+++ b/src/liblzma/check/crc32_small.c
@@ -10,8 +10,11 @@
 ///////////////////////////////////////////////////////////////////////////////
 
 #include "check.h"
+#include "crc_common.h"
 
 
+// The table is used by the LZ encoder too, thus it's not static like
+// in crc64_small.c.
 uint32_t lzma_crc32_table[1][256];
 
 
diff --git a/src/liblzma/check/crc32_table.c b/src/liblzma/check/crc32_table.c
deleted file mode 100644
index 56413eec336e..000000000000
--- a/src/liblzma/check/crc32_table.c
+++ /dev/null
@@ -1,42 +0,0 @@
-// SPDX-License-Identifier: 0BSD
-
-///////////////////////////////////////////////////////////////////////////////
-//
-/// \file crc32_table.c
-/// \brief Precalculated CRC32 table with correct endianness
-//
-// Author: Lasse Collin
-//
-///////////////////////////////////////////////////////////////////////////////
-
-#include "common.h"
-
-
-// FIXME: Compared to crc_common.h this has to check for __x86_64__ too
-// so that in 32-bit builds crc32_x86.S won't break due to a missing table.
-#if defined(HAVE_USABLE_CLMUL) && ((defined(__x86_64__) && defined(__SSSE3__) \
-			&& defined(__SSE4_1__) && defined(__PCLMUL__)) \
-		|| (defined(__e2k__) && __iset__ >= 6))
-# define NO_CRC32_TABLE
-
-#elif defined(HAVE_ARM64_CRC32) \
-		&& !defined(WORDS_BIGENDIAN) \
-		&& defined(__ARM_FEATURE_CRC32)
-# define NO_CRC32_TABLE
-#endif
-
-
-#if !defined(HAVE_ENCODERS) && defined(NO_CRC32_TABLE)
-// No table needed. Use a typedef to avoid an empty translation unit.
-typedef void lzma_crc32_dummy;
-
-#else
-// Having the declaration here silences clang -Wmissing-variable-declarations.
-extern const uint32_t lzma_crc32_table[8][256]; - -# ifdef WORDS_BIGENDIAN -# include "crc32_table_be.h" -# else -# include "crc32_table_le.h" -# endif -#endif diff --git a/src/liblzma/check/crc32_x86.S b/src/liblzma/check/crc32_x86.S index ddc3cee6ea5b..37ee063d1068 100644 --- a/src/liblzma/check/crc32_x86.S +++ b/src/liblzma/check/crc32_x86.S @@ -67,7 +67,7 @@ init_table(void) #endif #define MAKE_SYM_CAT(prefix, sym) prefix ## sym #define MAKE_SYM(prefix, sym) MAKE_SYM_CAT(prefix, sym) -#define LZMA_CRC32 MAKE_SYM(__USER_LABEL_PREFIX__, lzma_crc32) +#define LZMA_CRC32 MAKE_SYM(__USER_LABEL_PREFIX__, lzma_crc32_generic) #define LZMA_CRC32_TABLE MAKE_SYM(__USER_LABEL_PREFIX__, lzma_crc32_table) /* @@ -82,6 +82,9 @@ init_table(void) .text .globl LZMA_CRC32 +#ifdef __ELF__ + .hidden LZMA_CRC32 +#endif #if !defined(__APPLE__) && !defined(_WIN32) && !defined(__CYGWIN__) \ && !defined(__MSDOS__) @@ -290,14 +293,7 @@ LZMA_CRC32: .indirect_symbol LZMA_CRC32_TABLE .long 0 -#elif defined(_WIN32) || defined(__CYGWIN__) -# ifdef DLL_EXPORT - /* This is equivalent of __declspec(dllexport). 
*/ - .section .drectve - .ascii " -export:lzma_crc32" -# endif - -#elif !defined(__MSDOS__) +#elif !defined(_WIN32) && !defined(__CYGWIN__) && !defined(__MSDOS__) /* ELF */ .size LZMA_CRC32, .-LZMA_CRC32 #endif diff --git a/src/liblzma/check/crc64_fast.c b/src/liblzma/check/crc64_fast.c index 0ce83fe4ad36..8a6770a431e8 100644 --- a/src/liblzma/check/crc64_fast.c +++ b/src/liblzma/check/crc64_fast.c @@ -14,7 +14,7 @@ #include "crc_common.h" #if defined(CRC_X86_CLMUL) -# define BUILDING_CRC64_CLMUL +# define BUILDING_CRC_CLMUL 64 # include "crc_x86_clmul.h" #endif @@ -25,6 +25,18 @@ // Generic slice-by-four CRC64 // ///////////////////////////////// +#if defined(WORDS_BIGENDIAN) +# include "crc64_table_be.h" +#else +# include "crc64_table_le.h" +#endif + + +#ifdef HAVE_CRC_X86_ASM +extern uint64_t lzma_crc64_generic( + const uint8_t *buf, size_t size, uint64_t crc); +#else + #ifdef WORDS_BIGENDIAN # define A1(x) ((x) >> 56) #else @@ -34,7 +46,7 @@ // See the comments in crc32_fast.c. They aren't duplicated here. static uint64_t -crc64_generic(const uint8_t *buf, size_t size, uint64_t crc) +lzma_crc64_generic(const uint8_t *buf, size_t size, uint64_t crc) { crc = ~crc; @@ -78,7 +90,8 @@ crc64_generic(const uint8_t *buf, size_t size, uint64_t crc) return ~crc; } -#endif +#endif // HAVE_CRC_X86_ASM +#endif // CRC64_GENERIC #if defined(CRC64_GENERIC) && defined(CRC64_ARCH_OPTIMIZED) @@ -97,7 +110,7 @@ static crc64_func_type crc64_resolve(void) { return is_arch_extension_supported() - ? &crc64_arch_optimized : &crc64_generic; + ? 
&crc64_arch_optimized : &lzma_crc64_generic; } #ifdef HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR @@ -133,24 +146,24 @@ crc64_dispatch(const uint8_t *buf, size_t size, uint64_t crc) extern LZMA_API(uint64_t) lzma_crc64(const uint8_t *buf, size_t size, uint64_t crc) { -#if defined(CRC64_GENERIC) && defined(CRC64_ARCH_OPTIMIZED) - -#ifdef CRC_USE_GENERIC_FOR_SMALL_INPUTS - if (size <= 16) - return crc64_generic(buf, size, crc); +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && !defined(__clang__) \ + && defined(_M_IX86) && defined(CRC64_ARCH_OPTIMIZED) + // VS2015-2022 might corrupt the ebx register on 32-bit x86 when + // the CLMUL code is enabled. This hack forces MSVC to store and + // restore ebx. This is only needed here, not in lzma_crc32(). + __asm mov ebx, ebx #endif + +#if defined(CRC64_GENERIC) && defined(CRC64_ARCH_OPTIMIZED) return crc64_func(buf, size, crc); #elif defined(CRC64_ARCH_OPTIMIZED) // If arch-optimized version is used unconditionally without runtime // CPU detection then omitting the generic version and its 8 KiB // lookup table makes the library smaller. - // - // FIXME: Lookup table isn't currently omitted on 32-bit x86, - // see crc64_table.c. 
return crc64_arch_optimized(buf, size, crc); #else - return crc64_generic(buf, size, crc); + return lzma_crc64_generic(buf, size, crc); #endif } diff --git a/src/liblzma/check/crc64_table.c b/src/liblzma/check/crc64_table.c deleted file mode 100644 index 78e427597ce6..000000000000 --- a/src/liblzma/check/crc64_table.c +++ /dev/null @@ -1,37 +0,0 @@ -// SPDX-License-Identifier: 0BSD - -/////////////////////////////////////////////////////////////////////////////// -// -/// \file crc64_table.c -/// \brief Precalculated CRC64 table with correct endianness -// -// Author: Lasse Collin -// -/////////////////////////////////////////////////////////////////////////////// - -#include "common.h" - - -// FIXME: Compared to crc_common.h this has to check for __x86_64__ too -// so that in 32-bit builds crc64_x86.S won't break due to a missing table. -#if defined(HAVE_USABLE_CLMUL) && ((defined(__x86_64__) && defined(__SSSE3__) \ - && defined(__SSE4_1__) && defined(__PCLMUL__)) \ - || (defined(__e2k__) && __iset__ >= 6)) -# define NO_CRC64_TABLE -#endif - - -#ifdef NO_CRC64_TABLE -// No table needed. Use a typedef to avoid an empty translation unit. -typedef void lzma_crc64_dummy; - -#else -// Having the declaration here silences clang -Wmissing-variable-declarations. 
-extern const uint64_t lzma_crc64_table[4][256]; - -# if defined(WORDS_BIGENDIAN) -# include "crc64_table_be.h" -# else -# include "crc64_table_le.h" -# endif -#endif diff --git a/src/liblzma/check/crc64_x86.S b/src/liblzma/check/crc64_x86.S index 47f608181ea8..df50018653b4 100644 --- a/src/liblzma/check/crc64_x86.S +++ b/src/liblzma/check/crc64_x86.S @@ -57,7 +57,7 @@ init_table(void) #endif #define MAKE_SYM_CAT(prefix, sym) prefix ## sym #define MAKE_SYM(prefix, sym) MAKE_SYM_CAT(prefix, sym) -#define LZMA_CRC64 MAKE_SYM(__USER_LABEL_PREFIX__, lzma_crc64) +#define LZMA_CRC64 MAKE_SYM(__USER_LABEL_PREFIX__, lzma_crc64_generic) #define LZMA_CRC64_TABLE MAKE_SYM(__USER_LABEL_PREFIX__, lzma_crc64_table) /* @@ -72,6 +72,9 @@ init_table(void) .text .globl LZMA_CRC64 +#ifdef __ELF__ + .hidden LZMA_CRC64 +#endif #if !defined(__APPLE__) && !defined(_WIN32) && !defined(__CYGWIN__) \ && !defined(__MSDOS__) @@ -273,14 +276,7 @@ LZMA_CRC64: .indirect_symbol LZMA_CRC64_TABLE .long 0 -#elif defined(_WIN32) || defined(__CYGWIN__) -# ifdef DLL_EXPORT - /* This is equivalent of __declspec(dllexport). */ - .section .drectve - .ascii " -export:lzma_crc64" -# endif - -#elif !defined(__MSDOS__) +#elif !defined(_WIN32) && !defined(__CYGWIN__) && !defined(__MSDOS__) /* ELF */ .size LZMA_CRC64, .-LZMA_CRC64 #endif diff --git a/src/liblzma/check/crc_clmul_consts_gen.c b/src/liblzma/check/crc_clmul_consts_gen.c new file mode 100644 index 000000000000..5fe14bd6f042 --- /dev/null +++ b/src/liblzma/check/crc_clmul_consts_gen.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: 0BSD + +/////////////////////////////////////////////////////////////////////////////// +// +/// \file crc_clmul_consts_gen.c +/// \brief Generate constants for CLMUL CRC code +/// +/// Compiling: gcc -std=c99 -o crc_clmul_consts_gen crc_clmul_consts_gen.c +/// +/// This is for CRCs that use reversed bit order (bit reflection). 
+/// The same CLMUL CRC code can be used with CRC64 and smaller ones like +/// CRC32 apart from one special case: CRC64 needs an extra step in the +/// Barrett reduction to handle the 65th bit; the smaller ones don't. +/// Otherwise it's enough to just change the polynomial and the derived +/// constants and use the same code. +/// +/// See the Intel white paper "Fast CRC Computation for Generic Polynomials +/// Using PCLMULQDQ Instruction" from 2009. +// +// Author: Lasse Collin +// +/////////////////////////////////////////////////////////////////////////////// + +#include <inttypes.h> +#include <stdio.h> + + +/// CRC32 (Ethernet) polynomial in reversed representation +static const uint64_t p32 = 0xedb88320; + +// CRC64 (ECMA-182) polynomial in reversed representation +static const uint64_t p64 = 0xc96c5795d7870f42; + + +/// Calculates floor(x^128 / p) where p is a CRC64 polynomial in +/// reversed representation. The result is in reversed representation too. +static uint64_t +calc_cldiv(uint64_t p) +{ + // Quotient + uint64_t q = 0; + + // Align the x^64 term with the x^128 (the implied high bits of the + // divisor and the dividend) and do the first step of polynomial long + // division, calculating the first remainder. The variable q remains + // zero because the highest bit of the quotient is an implied bit 1 + // (we kind of set q = 1 << -1). + uint64_t r = p; + + // Then process the remaining 64 terms. Note that r has no implied + // high bit, only q and p do. (And remember that a high bit in the + // polynomial is stored at a low bit in the variable due to the + // reversed bit order.) + for (unsigned i = 0; i < 64; ++i) { + q |= (r & 1) << i; + r = (r >> 1) ^ (r & 1 ? p : 0); + } + + return q; +} + + +/// Calculate the remainder of carryless division: +/// +/// x^(bits + n - 1) % p, where n=64 (for CRC64) +/// +/// p must be in reversed representation which omits the bit of +/// the highest term of the polynomial. 
Instead, it is an implied bit +/// at kind of like "1 << -1" position, as if it had just been shifted out. +/// +/// The return value is in the reversed bit order. (There are no implied bits.) +static uint64_t +calc_clrem(uint64_t p, unsigned bits) +{ + // Do the first step of polynomial long division. + uint64_t r = p; + + // Then process the remaining terms. Start with i = 1 instead of i = 0 + // to account for the -1 in x^(bits + n - 1). This -1 is convenient + // with the reversed bit order. See the "Bit-Reflection" section in + // the Intel white paper. + for (unsigned i = 1; i < bits; ++i) + r = (r >> 1) ^ (r & 1 ? p : 0); + + return r; +} + + +extern int +main(void) +{ + puts("// CRC64"); + + // The order of the two 64-bit constants in a vector don't matter. + // It feels logical to put them in this order as it matches the + // order in which the input bytes are read. + printf("const __m128i fold512 = _mm_set_epi64x(" + "0x%016" PRIx64 ", 0x%016" PRIx64 ");\n", + calc_clrem(p64, 4 * 128 - 64), + calc_clrem(p64, 4 * 128)); + + printf("const __m128i fold128 = _mm_set_epi64x(" + "0x%016" PRIx64 ", 0x%016" PRIx64 ");\n", + calc_clrem(p64, 128 - 64), + calc_clrem(p64, 128)); + + // When we multiply by mu, we care about the high bits of the result + // (in reversed bit order!). It doesn't matter that the low bit gets + // shifted out because the affected output bits will be ignored. + // Below we add the implied high bit with "| 1" after the shifting + // so that the high bits of the multiplication will be correct. + // + // p64 is shifted left by one so that the final multiplication + // in Barrett reduction won't be misaligned by one bit. We could + // use "(p64 << 1) | 1" instead of "p64 << 1" too but it makes + // no difference as that bit won't affect the relevant output bits + // (we only care about the lowest 64 bits of the result, that is, + // lowest in the reversed bit order). + // + // NOTE: The 65rd bit of p64 gets shifted out. 
It needs to be + // compensated with 64-bit shift and xor in the CRC64 code. + printf("const __m128i mu_p = _mm_set_epi64x(" + "0x%016" PRIx64 ", 0x%016" PRIx64 ");\n", + (calc_cldiv(p64) << 1) | 1, + p64 << 1); + + puts(""); + + puts("// CRC32"); + + printf("const __m128i fold512 = _mm_set_epi64x(" + "0x%08" PRIx64 ", 0x%08" PRIx64 ");\n", + calc_clrem(p32, 4 * 128 - 64), + calc_clrem(p32, 4 * 128)); + + printf("const __m128i fold128 = _mm_set_epi64x(" + "0x%08" PRIx64 ", 0x%08" PRIx64 ");\n", + calc_clrem(p32, 128 - 64), + calc_clrem(p32, 128)); + + // CRC32 calculation is done by modulus scaling it to a CRC64. + // Since the CRC is in reversed representation, only the mu + // constant changes with the modulus scaling. This method avoids + // one additional constant and one additional clmul in the final + // reduction steps, making the code both simpler and faster. + // + // p32 is shifted left by one so that the final multiplication + // in Barrett reduction won't be misaligned by one bit. We could + // use "(p32 << 1) | 1" instead of "p32 << 1" too but it makes + // no difference as that bit won't affect the relevant output bits. + // + // NOTE: The 33-bit value fits in 64 bits so, unlike with CRC64, + // there is no need to compensate for any missing bits in the code. 
+ printf("const __m128i mu_p = _mm_set_epi64x(" + "0x%016" PRIx64 ", 0x%" PRIx64 ");\n", + (calc_cldiv(p32) << 1) | 1, + p32 << 1); + + return 0; +} diff --git a/src/liblzma/check/crc_common.h b/src/liblzma/check/crc_common.h index c15d4c675c8f..7ea1e60b043b 100644 --- a/src/liblzma/check/crc_common.h +++ b/src/liblzma/check/crc_common.h @@ -3,11 +3,10 @@ /////////////////////////////////////////////////////////////////////////////// // /// \file crc_common.h -/// \brief Some functions and macros for CRC32 and CRC64 +/// \brief Macros and declarations for CRC32 and CRC64 // // Authors: Lasse Collin // Ilya Kurdyukov -// Hans Jansen // Jia Tan // /////////////////////////////////////////////////////////////////////////////// @@ -18,6 +17,10 @@ #include "common.h" +///////////// +// Generic // +///////////// + #ifdef WORDS_BIGENDIAN # define A(x) ((x) >> 24) # define B(x) (((x) >> 16) & 0xFF) @@ -38,43 +41,63 @@ #endif -// CRC CLMUL code needs this because accessing input buffers that aren't -// aligned to the vector size will inherently trip the address sanitizer. -#if lzma_has_attribute(__no_sanitize_address__) -# define crc_attr_no_sanitize_address \ - __attribute__((__no_sanitize_address__)) +/// lzma_crc32_table[0] is needed by LZ encoder so we need to keep +/// the array two-dimensional. 
+#ifdef HAVE_SMALL +lzma_attr_visibility_hidden +extern uint32_t lzma_crc32_table[1][256]; + +extern void lzma_crc32_init(void); + #else -# define crc_attr_no_sanitize_address -#endif -// Keep this in sync with changes to crc32_arm64.h -#if defined(_WIN32) || defined(HAVE_GETAUXVAL) \ - || defined(HAVE_ELF_AUX_INFO) \ - || (defined(__APPLE__) && defined(HAVE_SYSCTLBYNAME)) -# define ARM64_RUNTIME_DETECTION 1 +lzma_attr_visibility_hidden +extern const uint32_t lzma_crc32_table[8][256]; + +lzma_attr_visibility_hidden +extern const uint64_t lzma_crc64_table[4][256]; #endif +/////////////////// +// Configuration // +/////////////////// + +// NOTE: This config isn't used if HAVE_SMALL is defined! + +// These are defined if the generic slicing-by-n implementations and their +// lookup tables are built. #undef CRC32_GENERIC #undef CRC64_GENERIC +// These are defined if an arch-specific version is built. If both this +// and matching _GENERIC is defined then runtime detection must be used. #undef CRC32_ARCH_OPTIMIZED #undef CRC64_ARCH_OPTIMIZED // The x86 CLMUL is used for both CRC32 and CRC64. #undef CRC_X86_CLMUL +// Many ARM64 processor have CRC32 instructions. +// CRC64 could be done with CLMUL but it's not implemented yet. #undef CRC32_ARM64 -#undef CRC64_ARM64_CLMUL -#undef CRC_USE_GENERIC_FOR_SMALL_INPUTS +// 64-bit LoongArch has CRC32 instructions. +#undef CRC32_LOONGARCH + + +// ARM64 +// +// Keep this in sync with changes to crc32_arm64.h +#if defined(_WIN32) || defined(HAVE_GETAUXVAL) \ + || defined(HAVE_ELF_AUX_INFO) \ + || (defined(__APPLE__) && defined(HAVE_SYSCTLBYNAME)) +# define CRC_ARM64_RUNTIME_DETECTION 1 +#endif // ARM64 CRC32 instruction is only useful for CRC32. Currently, only // little endian is supported since we were unable to test on a big // endian machine. 
-// -// NOTE: Keep this and the next check in sync with the macro -// NO_CRC32_TABLE in crc32_table.c #if defined(HAVE_ARM64_CRC32) && !defined(WORDS_BIGENDIAN) // Allow ARM64 CRC32 instruction without a runtime check if // __ARM_FEATURE_CRC32 is defined. GCC and Clang only define @@ -82,21 +105,40 @@ # if defined(__ARM_FEATURE_CRC32) # define CRC32_ARCH_OPTIMIZED 1 # define CRC32_ARM64 1 -# elif defined(ARM64_RUNTIME_DETECTION) +# elif defined(CRC_ARM64_RUNTIME_DETECTION) # define CRC32_ARCH_OPTIMIZED 1 # define CRC32_ARM64 1 # define CRC32_GENERIC 1 # endif #endif -#if defined(HAVE_USABLE_CLMUL) -// If CLMUL is allowed unconditionally in the compiler options then the -// generic version can be omitted. Note that this doesn't work with MSVC -// as I don't know how to detect the features here. + +// LoongArch // -// NOTE: Keep this in sync with the NO_CRC32_TABLE macro in crc32_table.c -// and NO_CRC64_TABLE in crc64_table.c. -# if (defined(__SSSE3__) && defined(__SSE4_1__) && defined(__PCLMUL__)) \ +// Only 64-bit LoongArch is supported for now. No runtime detection +// is needed because the LoongArch specification says that the CRC32 +// instructions are a part of the Basic Integer Instructions and +// they shall be implemented by 64-bit LoongArch implementations. +#ifdef HAVE_LOONGARCH_CRC32 +# define CRC32_ARCH_OPTIMIZED 1 +# define CRC32_LOONGARCH 1 +#endif + + +// x86 and E2K +#if defined(HAVE_USABLE_CLMUL) + // If CLMUL is allowed unconditionally in the compiler options then + // the generic version and the tables can be omitted. Exceptions: + // + // - If 32-bit x86 assembly files are enabled then those are always + // built and runtime detection is used even if compiler flags + // were set to allow CLMUL unconditionally. + // + // - This doesn't work with MSVC as I don't know how to detect + // the features here. 
+ // +# if (defined(__SSSE3__) && defined(__SSE4_1__) && defined(__PCLMUL__) \ + && !defined(HAVE_CRC_X86_ASM)) \ || (defined(__e2k__) && __iset__ >= 6) # define CRC32_ARCH_OPTIMIZED 1 # define CRC64_ARCH_OPTIMIZED 1 @@ -107,21 +149,12 @@ # define CRC32_ARCH_OPTIMIZED 1 # define CRC64_ARCH_OPTIMIZED 1 # define CRC_X86_CLMUL 1 - -/* - // The generic code is much faster with 1-8-byte inputs and - // has similar performance up to 16 bytes at least in - // microbenchmarks (it depends on input buffer alignment - // too). If both versions are built, this #define will use - // the generic version for inputs up to 16 bytes and CLMUL - // for bigger inputs. It saves a little in code size since - // the special cases for 0-16-byte inputs will be omitted - // from the CLMUL code. -# define CRC_USE_GENERIC_FOR_SMALL_INPUTS 1 -*/ # endif #endif + +// Fallback configuration +// // For CRC32 use the generic slice-by-eight implementation if no optimized // version is available. #if !defined(CRC32_ARCH_OPTIMIZED) && !defined(CRC32_GENERIC) diff --git a/src/liblzma/check/crc_x86_clmul.h b/src/liblzma/check/crc_x86_clmul.h index 50306e49a72a..b302d6cf7f51 100644 --- a/src/liblzma/check/crc_x86_clmul.h +++ b/src/liblzma/check/crc_x86_clmul.h @@ -8,26 +8,20 @@ /// The CRC32 and CRC64 implementations use 32/64-bit x86 SSSE3, SSE4.1, and /// CLMUL instructions. This is compatible with Elbrus 2000 (E2K) too. /// -/// They were derived from +/// See the Intel white paper "Fast CRC Computation for Generic Polynomials +/// Using PCLMULQDQ Instruction" from 2009. The original file seems to be +/// gone from Intel's website but a version is available here: /// https://www.researchgate.net/publication/263424619_Fast_CRC_computation -/// and the public domain code from https://github.com/rawrunprotected/crc -/// (URLs were checked on 2023-10-14). +/// (The link was checked on 2024-06-11.) 
/// /// While this file has both CRC32 and CRC64 implementations, only one -/// should be built at a time to ensure that crc_simd_body() is inlined -/// even with compilers with which lzma_always_inline expands to plain inline. -/// The version to build is selected by defining BUILDING_CRC32_CLMUL or -/// BUILDING_CRC64_CLMUL before including this file. +/// can be built at a time. The version to build is selected by defining +/// BUILDING_CRC_CLMUL to 32 or 64 before including this file. /// -/// FIXME: Builds for 32-bit x86 use the assembly .S files by default -/// unless configured with --disable-assembler. Even then the lookup table -/// isn't omitted in crc64_table.c since it doesn't know that assembly -/// code has been disabled. +/// NOTE: The x86 CLMUL CRC implementation was rewritten for XZ Utils 5.8.0. // -// Authors: Ilya Kurdyukov -// Hans Jansen -// Lasse Collin -// Jia Tan +// Authors: Lasse Collin +// Ilya Kurdyukov // /////////////////////////////////////////////////////////////////////////////// @@ -37,6 +31,10 @@ #endif #define LZMA_CRC_X86_CLMUL_H +#if BUILDING_CRC_CLMUL != 32 && BUILDING_CRC_CLMUL != 64 +# error BUILDING_CRC_CLMUL is undefined or has an invalid value +#endif + #include <immintrin.h> #if defined(_MSC_VER) @@ -59,330 +57,277 @@ #endif -#define MASK_L(in, mask, r) r = _mm_shuffle_epi8(in, mask) +// GCC and Clang would produce good code with _mm_set_epi64x +// but MSVC needs _mm_cvtsi64_si128 on x86-64. +#if defined(__i386__) || defined(_M_IX86) +# define my_set_low64(a) _mm_set_epi64x(0, (a)) +#else +# define my_set_low64(a) _mm_cvtsi64_si128(a) +#endif -#define MASK_H(in, mask, r) \ - r = _mm_shuffle_epi8(in, _mm_xor_si128(mask, vsign)) -#define MASK_LH(in, mask, low, high) \ - MASK_L(in, mask, low); \ - MASK_H(in, mask, high) +// Align it so that the whole array is within the same cache line. +// More than one unaligned load can be done from this during the +// same CRC function call. 
+// +// The bytes [0] to [31] are used with AND to clear the low bytes. (With ANDN +// those could be used to clear the high bytes too but it's not needed here.) +// +// The bytes [16] to [47] are for left shifts. +// The bytes [32] to [63] are for right shifts. +alignas(64) +static uint8_t vmasks[64] = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, + 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, + 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, + 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, +}; + + +// *Unaligned* 128-bit load +crc_attr_target +static inline __m128i +my_load128(const uint8_t *p) +{ + return _mm_loadu_si128((const __m128i *)p); +} +// Keep the highest "count" bytes as is and clear the remaining low bytes. crc_attr_target -crc_attr_no_sanitize_address -static lzma_always_inline void -crc_simd_body(const uint8_t *buf, const size_t size, __m128i *v0, __m128i *v1, - const __m128i vfold16, const __m128i initial_crc) +static inline __m128i +keep_high_bytes(__m128i v, size_t count) { - // Create a vector with 8-bit values 0 to 15. This is used to - // construct control masks for _mm_blendv_epi8 and _mm_shuffle_epi8. - const __m128i vramp = _mm_setr_epi32( - 0x03020100, 0x07060504, 0x0b0a0908, 0x0f0e0d0c); - - // This is used to inverse the control mask of _mm_shuffle_epi8 - // so that bytes that wouldn't be picked with the original mask - // will be picked and vice versa. - const __m128i vsign = _mm_set1_epi8(-0x80); + return _mm_and_si128(my_load128((vmasks + count)), v); +} - // Memory addresses A to D and the distances between them: - // - // A B C D - // [skip_start][size][skip_end] - // [ size2 ] - // - // A and D are 16-byte aligned. B and C are 1-byte aligned. - // skip_start and skip_end are 0-15 bytes. size is at least 1 byte. 
- // - // A = aligned_buf will initially point to this address. - // B = The address pointed by the caller-supplied buf. - // C = buf + size == aligned_buf + size2 - // D = buf + size + skip_end == aligned_buf + size2 + skip_end - const size_t skip_start = (size_t)((uintptr_t)buf & 15); - const size_t skip_end = (size_t)((0U - (uintptr_t)(buf + size)) & 15); - const __m128i *aligned_buf = (const __m128i *)( - (uintptr_t)buf & ~(uintptr_t)15); - - // If size2 <= 16 then the whole input fits into a single 16-byte - // vector. If size2 > 16 then at least two 16-byte vectors must - // be processed. If size2 > 16 && size <= 16 then there is only - // one 16-byte vector's worth of input but it is unaligned in memory. - // - // NOTE: There is no integer overflow here if the arguments - // are valid. If this overflowed, buf + size would too. - const size_t size2 = skip_start + size; - - // Masks to be used with _mm_blendv_epi8 and _mm_shuffle_epi8: - // The first skip_start or skip_end bytes in the vectors will have - // the high bit (0x80) set. _mm_blendv_epi8 and _mm_shuffle_epi8 - // will produce zeros for these positions. (Bitwise-xor of these - // masks with vsign will produce the opposite behavior.) - const __m128i mask_start - = _mm_sub_epi8(vramp, _mm_set1_epi8((char)skip_start)); - const __m128i mask_end - = _mm_sub_epi8(vramp, _mm_set1_epi8((char)skip_end)); - - // Get the first 1-16 bytes into data0. If loading less than 16 - // bytes, the bytes are loaded to the high bits of the vector and - // the least significant positions are filled with zeros. - const __m128i data0 = _mm_blendv_epi8(_mm_load_si128(aligned_buf), - _mm_setzero_si128(), mask_start); - aligned_buf++; - - __m128i v2, v3; - -#ifndef CRC_USE_GENERIC_FOR_SMALL_INPUTS - if (size <= 16) { - // Right-shift initial_crc by 1-16 bytes based on "size" - // and store the result in v1 (high bytes) and v0 (low bytes). 
- // - // NOTE: The highest 8 bytes of initial_crc are zeros so - // v1 will be filled with zeros if size >= 8. The highest - // 8 bytes of v1 will always become zeros. - // - // [ v1 ][ v0 ] - // [ initial_crc ] size == 1 - // [ initial_crc ] size == 2 - // [ initial_crc ] size == 15 - // [ initial_crc ] size == 16 (all in v0) - const __m128i mask_low = _mm_add_epi8( - vramp, _mm_set1_epi8((char)(size - 16))); - MASK_LH(initial_crc, mask_low, *v0, *v1); - - if (size2 <= 16) { - // There are 1-16 bytes of input and it is all - // in data0. Copy the input bytes to v3. If there - // are fewer than 16 bytes, the low bytes in v3 - // will be filled with zeros. That is, the input - // bytes are stored to the same position as - // (part of) initial_crc is in v0. - MASK_L(data0, mask_end, v3); - } else { - // There are 2-16 bytes of input but not all bytes - // are in data0. - const __m128i data1 = _mm_load_si128(aligned_buf); - - // Collect the 2-16 input bytes from data0 and data1 - // to v2 and v3, and bitwise-xor them with the - // low bits of initial_crc in v0. Note that the - // the second xor is below this else-block as it - // is shared with the other branch. - MASK_H(data0, mask_end, v2); - MASK_L(data1, mask_end, v3); - *v0 = _mm_xor_si128(*v0, v2); - } - *v0 = _mm_xor_si128(*v0, v3); - *v1 = _mm_alignr_epi8(*v1, *v0, 8); - } else -#endif - { - // There is more than 16 bytes of input. - const __m128i data1 = _mm_load_si128(aligned_buf); - const __m128i *end = (const __m128i*)( - (const char *)aligned_buf - 16 + size2); - aligned_buf++; - - MASK_LH(initial_crc, mask_start, *v0, *v1); - *v0 = _mm_xor_si128(*v0, data0); - *v1 = _mm_xor_si128(*v1, data1); - - while (aligned_buf < end) { - *v1 = _mm_xor_si128(*v1, _mm_clmulepi64_si128( - *v0, vfold16, 0x00)); - *v0 = _mm_xor_si128(*v1, _mm_clmulepi64_si128( - *v0, vfold16, 0x11)); - *v1 = _mm_load_si128(aligned_buf++); - } +// Shift the 128-bit value left by "amount" bytes (not bits). 
+crc_attr_target +static inline __m128i +shift_left(__m128i v, size_t amount) +{ + return _mm_shuffle_epi8(v, my_load128((vmasks + 32 - amount))); +} - if (aligned_buf != end) { - MASK_H(*v0, mask_end, v2); - MASK_L(*v0, mask_end, *v0); - MASK_L(*v1, mask_end, v3); - *v1 = _mm_or_si128(v2, v3); - } - *v1 = _mm_xor_si128(*v1, _mm_clmulepi64_si128( - *v0, vfold16, 0x00)); - *v0 = _mm_xor_si128(*v1, _mm_clmulepi64_si128( - *v0, vfold16, 0x11)); - *v1 = _mm_srli_si128(*v0, 8); - } +// Shift the 128-bit value right by "amount" bytes (not bits). +crc_attr_target +static inline __m128i +shift_right(__m128i v, size_t amount) +{ + return _mm_shuffle_epi8(v, my_load128((vmasks + 32 + amount))); } -///////////////////// -// x86 CLMUL CRC32 // -///////////////////// - -/* -// These functions were used to generate the constants -// at the top of crc32_arch_optimized(). -static uint64_t -calc_lo(uint64_t p, uint64_t a, int n) +crc_attr_target +static inline __m128i +fold(__m128i v, __m128i k) { - uint64_t b = 0; int i; - for (i = 0; i < n; i++) { - b = b >> 1 | (a & 1) << (n - 1); - a = (a >> 1) ^ ((0 - (a & 1)) & p); - } - return b; + __m128i a = _mm_clmulepi64_si128(v, k, 0x00); + __m128i b = _mm_clmulepi64_si128(v, k, 0x11); + return _mm_xor_si128(a, b); } -// same as ~crc(&a, sizeof(a), ~0) -static uint64_t -calc_hi(uint64_t p, uint64_t a, int n) + +crc_attr_target +static inline __m128i +fold_xor(__m128i v, __m128i k, const uint8_t *buf) { - int i; - for (i = 0; i < n; i++) - a = (a >> 1) ^ ((0 - (a & 1)) & p); - return a; + return _mm_xor_si128(my_load128(buf), fold(v, k)); } -*/ -#ifdef BUILDING_CRC32_CLMUL +#if BUILDING_CRC_CLMUL == 32 crc_attr_target -crc_attr_no_sanitize_address static uint32_t crc32_arch_optimized(const uint8_t *buf, size_t size, uint32_t crc) +#else +crc_attr_target +static uint64_t +crc64_arch_optimized(const uint8_t *buf, size_t size, uint64_t crc) +#endif { -#ifndef CRC_USE_GENERIC_FOR_SMALL_INPUTS - // The code assumes that there is at least one 
byte of input. + // We will assume that there is at least one byte of input. if (size == 0) return crc; -#endif - // uint32_t poly = 0xedb88320; - const int64_t p = 0x1db710640; // p << 1 - const int64_t mu = 0x1f7011641; // calc_lo(p, p, 32) << 1 | 1 - const int64_t k5 = 0x163cd6124; // calc_hi(p, p, 32) << 1 - const int64_t k4 = 0x0ccaa009e; // calc_hi(p, p, 64) << 1 - const int64_t k3 = 0x1751997d0; // calc_hi(p, p, 128) << 1 - - const __m128i vfold4 = _mm_set_epi64x(mu, p); - const __m128i vfold8 = _mm_set_epi64x(0, k5); - const __m128i vfold16 = _mm_set_epi64x(k4, k3); - - __m128i v0, v1, v2; - - crc_simd_body(buf, size, &v0, &v1, vfold16, - _mm_cvtsi32_si128((int32_t)~crc)); - - v1 = _mm_xor_si128( - _mm_clmulepi64_si128(v0, vfold16, 0x10), v1); // xxx0 - v2 = _mm_shuffle_epi32(v1, 0xe7); // 0xx0 - v0 = _mm_slli_epi64(v1, 32); // [0] - v0 = _mm_clmulepi64_si128(v0, vfold8, 0x00); - v0 = _mm_xor_si128(v0, v2); // [1] [2] - v2 = _mm_clmulepi64_si128(v0, vfold4, 0x10); - v2 = _mm_clmulepi64_si128(v2, vfold4, 0x00); - v0 = _mm_xor_si128(v0, v2); // [2] - return ~(uint32_t)_mm_extract_epi32(v0, 2); -} -#endif // BUILDING_CRC32_CLMUL + // See crc_clmul_consts_gen.c. +#if BUILDING_CRC_CLMUL == 32 + const __m128i fold512 = _mm_set_epi64x(0x1d9513d7, 0x8f352d95); + const __m128i fold128 = _mm_set_epi64x(0xccaa009e, 0xae689191); + const __m128i mu_p = _mm_set_epi64x( + (int64_t)0xb4e5b025f7011641, 0x1db710640); +#else + const __m128i fold512 = _mm_set_epi64x( + (int64_t)0x081f6054a7842df4, (int64_t)0x6ae3efbb9dd441f3); + const __m128i fold128 = _mm_set_epi64x( + (int64_t)0xdabe95afc7875f40, (int64_t)0xe05dd497ca393ae4); -///////////////////// -// x86 CLMUL CRC64 // -///////////////////// + const __m128i mu_p = _mm_set_epi64x( + (int64_t)0x9c3e466c172963d5, (int64_t)0x92d8af2baf0e1e84); +#endif -/* -// These functions were used to generate the constants -// at the top of crc64_arch_optimized(). 
-static uint64_t -calc_lo(uint64_t poly) -{ - uint64_t a = poly; - uint64_t b = 0; + __m128i v0, v1, v2, v3; - for (unsigned i = 0; i < 64; ++i) { - b = (b >> 1) | (a << 63); - a = (a >> 1) ^ (a & 1 ? poly : 0); - } + crc = ~crc; - return b; -} + if (size < 8) { + uint64_t x = crc; + size_t i = 0; -static uint64_t -calc_hi(uint64_t poly, uint64_t a) -{ - for (unsigned i = 0; i < 64; ++i) - a = (a >> 1) ^ (a & 1 ? poly : 0); + // Checking the bit instead of comparing the size means + // that we don't need to update the size between the steps. + if (size & 4) { + x ^= read32le(buf); + buf += 4; + i = 32; + } - return a; -} -*/ + if (size & 2) { + x ^= (uint64_t)read16le(buf) << i; + buf += 2; + i += 16; + } -#ifdef BUILDING_CRC64_CLMUL + if (size & 1) + x ^= (uint64_t)*buf << i; -// MSVC (VS2015 - VS2022) produces bad 32-bit x86 code from the CLMUL CRC -// code when optimizations are enabled (release build). According to the bug -// report, the ebx register is corrupted and the calculated result is wrong. -// Trying to workaround the problem with "__asm mov ebx, ebx" didn't help. -// The following pragma works and performance is still good. x86-64 builds -// and CRC32 CLMUL aren't affected by this problem. The problem does not -// happen in crc_simd_body() either (which is shared with CRC32 CLMUL anyway). -// -// NOTE: Another pragma after crc64_arch_optimized() restores -// the optimizations. If the #if condition here is updated, -// the other one must be updated too. -#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && !defined(__clang__) \ - && defined(_M_IX86) -# pragma optimize("g", off) -#endif + v0 = my_set_low64((int64_t)x); + v0 = shift_left(v0, 8 - size); -crc_attr_target -crc_attr_no_sanitize_address -static uint64_t -crc64_arch_optimized(const uint8_t *buf, size_t size, uint64_t crc) -{ -#ifndef CRC_USE_GENERIC_FOR_SMALL_INPUTS - // The code assumes that there is at least one byte of input. 
- if (size == 0) - return crc; -#endif - - // const uint64_t poly = 0xc96c5795d7870f42; // CRC polynomial - const uint64_t p = 0x92d8af2baf0e1e85; // (poly << 1) | 1 - const uint64_t mu = 0x9c3e466c172963d5; // (calc_lo(poly) << 1) | 1 - const uint64_t k2 = 0xdabe95afc7875f40; // calc_hi(poly, 1) - const uint64_t k1 = 0xe05dd497ca393ae4; // calc_hi(poly, k2) + } else if (size < 16) { + v0 = my_set_low64((int64_t)(crc ^ read64le(buf))); - const __m128i vfold8 = _mm_set_epi64x((int64_t)p, (int64_t)mu); - const __m128i vfold16 = _mm_set_epi64x((int64_t)k2, (int64_t)k1); + // NOTE: buf is intentionally left 8 bytes behind so that + // we can read the last 1-7 bytes with read64le(buf + size). + size -= 8; - __m128i v0, v1, v2; + // Handling 8-byte input specially is a speed optimization + // as the clmul can be skipped. A branch is also needed to + // avoid a too high shift amount. + if (size > 0) { + const size_t padding = 8 - size; + uint64_t high = read64le(buf + size) >> (padding * 8); #if defined(__i386__) || defined(_M_IX86) - crc_simd_body(buf, size, &v0, &v1, vfold16, - _mm_set_epi64x(0, (int64_t)~crc)); + // Simple but likely not the best code for 32-bit x86. + v0 = _mm_insert_epi32(v0, (int32_t)high, 2); + v0 = _mm_insert_epi32(v0, (int32_t)(high >> 32), 3); #else - // GCC and Clang would produce good code with _mm_set_epi64x - // but MSVC needs _mm_cvtsi64_si128 on x86-64. 
- crc_simd_body(buf, size, &v0, &v1, vfold16, - _mm_cvtsi64_si128((int64_t)~crc)); + v0 = _mm_insert_epi64(v0, (int64_t)high, 1); #endif - v1 = _mm_xor_si128(_mm_clmulepi64_si128(v0, vfold16, 0x10), v1); - v0 = _mm_clmulepi64_si128(v1, vfold8, 0x00); - v2 = _mm_clmulepi64_si128(v0, vfold8, 0x10); - v0 = _mm_xor_si128(_mm_xor_si128(v1, _mm_slli_si128(v0, 8)), v2); + v0 = shift_left(v0, padding); + + v1 = _mm_srli_si128(v0, 8); + v0 = _mm_clmulepi64_si128(v0, fold128, 0x10); + v0 = _mm_xor_si128(v0, v1); + } + } else { + v0 = my_set_low64((int64_t)crc); + + // To align or not to align the buf pointer? If the end of + // the buffer isn't aligned, aligning the pointer here would + // make us do an extra folding step with the associated byte + // shuffling overhead. The cost of that would need to be + // lower than the benefit of aligned reads. Testing on an old + // Intel Ivy Bridge processor suggested that aligning isn't + // worth the cost but it likely depends on the processor and + // buffer size. Unaligned loads (MOVDQU) should be fast on + // x86 processors that support PCLMULQDQ, so we don't align + // the buf pointer here. + + // Read the first (and possibly the only) full 16 bytes. + v0 = _mm_xor_si128(v0, my_load128(buf)); + buf += 16; + size -= 16; + + if (size >= 48) { + v1 = my_load128(buf); + v2 = my_load128(buf + 16); + v3 = my_load128(buf + 32); + buf += 48; + size -= 48; + + while (size >= 64) { + v0 = fold_xor(v0, fold512, buf); + v1 = fold_xor(v1, fold512, buf + 16); + v2 = fold_xor(v2, fold512, buf + 32); + v3 = fold_xor(v3, fold512, buf + 48); + buf += 64; + size -= 64; + } + + v0 = _mm_xor_si128(v1, fold(v0, fold128)); + v0 = _mm_xor_si128(v2, fold(v0, fold128)); + v0 = _mm_xor_si128(v3, fold(v0, fold128)); + } + + while (size >= 16) { + v0 = fold_xor(v0, fold128, buf); + buf += 16; + size -= 16; + } + + if (size > 0) { + // We want the last "size" number of input bytes to + // be at the high bits of v1. 
First do a full 16-byte + // load and then mask the low bytes to zeros. + v1 = my_load128(buf + size - 16); + v1 = keep_high_bytes(v1, size); + + // Shift high bytes from v0 to the low bytes of v1. + // + // Alternatively we could replace the combination + // keep_high_bytes + shift_right + _mm_or_si128 with + // _mm_shuffle_epi8 + _mm_blendv_epi8 but that would + // require larger tables for the masks. Now there are + // three loads (instead of two) from the mask tables + // but they all are from the same cache line. + v1 = _mm_or_si128(v1, shift_right(v0, size)); + + // Shift high bytes of v0 away, padding the + // low bytes with zeros. + v0 = shift_left(v0, 16 - size); + + v0 = _mm_xor_si128(v1, fold(v0, fold128)); + } + v1 = _mm_srli_si128(v0, 8); + v0 = _mm_clmulepi64_si128(v0, fold128, 0x10); + v0 = _mm_xor_si128(v0, v1); + } + + // Barrett reduction + +#if BUILDING_CRC_CLMUL == 32 + v1 = _mm_clmulepi64_si128(v0, mu_p, 0x10); // v0 * mu + v1 = _mm_clmulepi64_si128(v1, mu_p, 0x00); // v1 * p + v0 = _mm_xor_si128(v0, v1); + return ~(uint32_t)_mm_extract_epi32(v0, 2); +#else + // Because p is 65 bits but one bit doesn't fit into the 64-bit + // half of __m128i, finish the second clmul by shifting v1 left + // by 64 bits and xorring it to the final result. + v1 = _mm_clmulepi64_si128(v0, mu_p, 0x10); // v0 * mu + v2 = _mm_slli_si128(v1, 8); + v1 = _mm_clmulepi64_si128(v1, mu_p, 0x00); // v1 * p + v0 = _mm_xor_si128(v0, v2); + v0 = _mm_xor_si128(v0, v1); #if defined(__i386__) || defined(_M_IX86) return ~(((uint64_t)(uint32_t)_mm_extract_epi32(v0, 3) << 32) | (uint64_t)(uint32_t)_mm_extract_epi32(v0, 2)); #else return ~(uint64_t)_mm_extract_epi64(v0, 1); #endif -} - -#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && !defined(__clang__) \ - && defined(_M_IX86) -# pragma optimize("", on) #endif - -#endif // BUILDING_CRC64_CLMUL +} // Even though this is an inline function, compile it only when needed. 
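The small-input path of the new combined `crc32_arch_optimized()`/`crc64_arch_optimized()` gathers 1-7 input bytes by testing individual bits of `size` instead of comparing and updating it between steps. A minimal portable sketch of only that gathering step follows. It involves no SIMD, and `gather_tail_le()` and the two `sketch_read*le()` helpers are hypothetical names for illustration (liblzma uses its own `read16le()` and `read32le()`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-ins for liblzma's read16le() and read32le():
// byte-by-byte little endian loads that behave the same on any host.
static uint32_t
sketch_read16le(const uint8_t *buf)
{
	return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8);
}

static uint32_t
sketch_read32le(const uint8_t *buf)
{
	return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8)
			| ((uint32_t)buf[2] << 16)
			| ((uint32_t)buf[3] << 24);
}

// XOR "size" bytes (size < 8) from buf into x in little endian order.
// In the diff, x starts as the inverted CRC; here it is a parameter.
static uint64_t
gather_tail_le(uint64_t x, const uint8_t *buf, size_t size)
{
	assert(size < 8);

	// Bit position in x where the next chunk of input goes.
	size_t i = 0;

	// Each bit of size selects one chunk (4, 2, or 1 bytes), so size
	// itself never needs to be decremented between the steps.
	if (size & 4) {
		x ^= sketch_read32le(buf);
		buf += 4;
		i = 32;
	}

	if (size & 2) {
		x ^= (uint64_t)sketch_read16le(buf) << i;
		buf += 2;
		i += 16;
	}

	if (size & 1)
		x ^= (uint64_t)*buf << i;

	return x;
}
```

In the diff the gathered value is then placed into the low 64 bits of a vector and shifted left by `8 - size` bytes before the reduction; that part is not reproduced here.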
diff --git a/src/liblzma/common/alone_decoder.c b/src/liblzma/common/alone_decoder.c index 78af651578fc..e2b58e1f3758 100644 --- a/src/liblzma/common/alone_decoder.c +++ b/src/liblzma/common/alone_decoder.c @@ -134,8 +134,7 @@ alone_decode(void *coder_ptr, const lzma_allocator *allocator, coder->pos = 0; coder->sequence = SEQ_CODER_INIT; - - // Fall through + FALLTHROUGH; case SEQ_CODER_INIT: { if (coder->memusage > coder->memlimit) diff --git a/src/liblzma/common/auto_decoder.c b/src/liblzma/common/auto_decoder.c index fdd520f905c5..da49345f909d 100644 --- a/src/liblzma/common/auto_decoder.c +++ b/src/liblzma/common/auto_decoder.c @@ -79,7 +79,7 @@ auto_decode(void *coder_ptr, const lzma_allocator *allocator, return LZMA_GET_CHECK; } - // Fall through + FALLTHROUGH; case SEQ_CODE: { const lzma_ret ret = coder->next.code( @@ -91,10 +91,9 @@ auto_decode(void *coder_ptr, const lzma_allocator *allocator, return ret; coder->sequence = SEQ_FINISH; + FALLTHROUGH; } - // Fall through - case SEQ_FINISH: // When LZMA_CONCATENATED was used and we were decoding // a LZMA_Alone file, we need to check that there is no diff --git a/src/liblzma/common/block_decoder.c b/src/liblzma/common/block_decoder.c index 2e369d316bdf..bbc9f5566c8b 100644 --- a/src/liblzma/common/block_decoder.c +++ b/src/liblzma/common/block_decoder.c @@ -146,10 +146,9 @@ block_decode(void *coder_ptr, const lzma_allocator *allocator, coder->block->uncompressed_size = coder->uncompressed_size; coder->sequence = SEQ_PADDING; + FALLTHROUGH; } - // Fall through - case SEQ_PADDING: // Compressed Data is padded to a multiple of four bytes. 
while (coder->compressed_size & 3) { @@ -173,8 +172,7 @@ block_decode(void *coder_ptr, const lzma_allocator *allocator, lzma_check_finish(&coder->check, coder->block->check); coder->sequence = SEQ_CHECK; - - // Fall through + FALLTHROUGH; case SEQ_CHECK: { const size_t check_size = lzma_check_size(coder->block->check); diff --git a/src/liblzma/common/block_encoder.c b/src/liblzma/common/block_encoder.c index ce8c1de69442..eb7997a72aeb 100644 --- a/src/liblzma/common/block_encoder.c +++ b/src/liblzma/common/block_encoder.c @@ -94,10 +94,9 @@ block_encode(void *coder_ptr, const lzma_allocator *allocator, coder->block->uncompressed_size = coder->uncompressed_size; coder->sequence = SEQ_PADDING; + FALLTHROUGH; } - // Fall through - case SEQ_PADDING: // Pad Compressed Data to a multiple of four bytes. We can // use coder->compressed_size for this since we don't need @@ -117,8 +116,7 @@ block_encode(void *coder_ptr, const lzma_allocator *allocator, lzma_check_finish(&coder->check, coder->block->check); coder->sequence = SEQ_CHECK; - - // Fall through + FALLTHROUGH; case SEQ_CHECK: { const size_t check_size = lzma_check_size(coder->block->check); diff --git a/src/liblzma/common/common.c b/src/liblzma/common/common.c index cc0e06a51bee..6e031a56c888 100644 --- a/src/liblzma/common/common.c +++ b/src/liblzma/common/common.c @@ -96,6 +96,12 @@ lzma_bufcpy(const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out, size_t *restrict out_pos, size_t out_size) { + assert(in != NULL || *in_pos == in_size); + assert(out != NULL || *out_pos == out_size); + + assert(*in_pos <= in_size); + assert(*out_pos <= out_size); + const size_t in_avail = in_size - *in_pos; const size_t out_avail = out_size - *out_pos; const size_t copy_size = my_min(in_avail, out_avail); @@ -348,7 +354,7 @@ lzma_code(lzma_stream *strm, lzma_action action) else strm->internal->sequence = ISEQ_END; - // Fall through + FALLTHROUGH; case LZMA_NO_CHECK: case LZMA_UNSUPPORTED_CHECK: 
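The assertions added to `lzma_bufcpy()` in common.c state its preconditions: a buffer pointer may be NULL only when that buffer has no bytes available, and each position must not exceed its size (otherwise the unsigned subtractions would wrap around). A rough standalone sketch of such a function is below. `bufcpy_sketch()` is a hypothetical name, and the part after the `copy_size` calculation is an assumption about the rest of the function since the hunk doesn't show it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Copy as many bytes as fit from in[*in_pos..] to out[*out_pos..],
// advance both positions, and return the number of bytes copied.
static size_t
bufcpy_sketch(const uint8_t *in, size_t *in_pos, size_t in_size,
		uint8_t *out, size_t *out_pos, size_t out_size)
{
	// NULL is acceptable only for a buffer with nothing left in it.
	assert(in != NULL || *in_pos == in_size);
	assert(out != NULL || *out_pos == out_size);

	// Without these the subtractions below could wrap around.
	assert(*in_pos <= in_size);
	assert(*out_pos <= out_size);

	const size_t in_avail = in_size - *in_pos;
	const size_t out_avail = out_size - *out_pos;
	const size_t copy_size = in_avail < out_avail ? in_avail : out_avail;

	// Assumption of this sketch: skip memcpy() when there is nothing
	// to copy so that it is never called with a NULL pointer.
	if (copy_size > 0)
		memcpy(out + *out_pos, in + *in_pos, copy_size);

	*in_pos += copy_size;
	*out_pos += copy_size;

	return copy_size;
}
```

With these checks, a caller that passes a NULL input together with `*in_pos == in_size` gets a zero-byte copy, while a NULL pointer paired with available bytes trips an assertion in debug builds.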
diff --git a/src/liblzma/common/file_info.c b/src/liblzma/common/file_info.c index 7c85084a706e..4b2eb5d0400b 100644 --- a/src/liblzma/common/file_info.c +++ b/src/liblzma/common/file_info.c @@ -298,15 +298,13 @@ file_info_decode(void *coder_ptr, const lzma_allocator *allocator, // Start looking for Stream Padding and Stream Footer // at the end of the file. coder->file_target_pos = coder->file_size; - - // Fall through + FALLTHROUGH; case SEQ_PADDING_SEEK: coder->sequence = SEQ_PADDING_DECODE; return_if_error(reverse_seek( coder, in_start, in_pos, in_size)); - - // Fall through + FALLTHROUGH; case SEQ_PADDING_DECODE: { // Copy to coder->temp first. This keeps the code simpler if @@ -356,9 +354,9 @@ file_info_decode(void *coder_ptr, const lzma_allocator *allocator, if (coder->temp_size < LZMA_STREAM_HEADER_SIZE) return_if_error(reverse_seek( coder, in_start, in_pos, in_size)); - } - // Fall through + FALLTHROUGH; + } case SEQ_FOOTER: // Copy the Stream Footer field into coder->temp. @@ -414,7 +412,7 @@ file_info_decode(void *coder_ptr, const lzma_allocator *allocator, return LZMA_SEEK_NEEDED; } - // Fall through + FALLTHROUGH; case SEQ_INDEX_INIT: { // Calculate the amount of memory already used by the earlier @@ -444,10 +442,9 @@ file_info_decode(void *coder_ptr, const lzma_allocator *allocator, coder->index_remaining = coder->footer_flags.backward_size; coder->sequence = SEQ_INDEX_DECODE; + FALLTHROUGH; } - // Fall through - case SEQ_INDEX_DECODE: { // Decode (a part of) the Index. If the whole Index is already // in coder->temp, read it from there. Otherwise read from @@ -574,9 +571,9 @@ file_info_decode(void *coder_ptr, const lzma_allocator *allocator, return_if_error(reverse_seek(coder, in_start, in_pos, in_size)); } - } - // Fall through + FALLTHROUGH; + } case SEQ_HEADER_DECODE: // Copy the Stream Header field into coder->temp. 
@@ -596,8 +593,7 @@ file_info_decode(void *coder_ptr, const lzma_allocator *allocator, coder->temp + coder->temp_size))); coder->sequence = SEQ_HEADER_COMPARE; - - // Fall through + FALLTHROUGH; case SEQ_HEADER_COMPARE: // Compare Stream Header against Stream Footer. They must diff --git a/src/liblzma/common/index_decoder.c b/src/liblzma/common/index_decoder.c index 4bcb30692115..4eab56d942e1 100644 --- a/src/liblzma/common/index_decoder.c +++ b/src/liblzma/common/index_decoder.c @@ -93,8 +93,7 @@ index_decode(void *coder_ptr, const lzma_allocator *allocator, coder->pos = 0; coder->sequence = SEQ_MEMUSAGE; - - // Fall through + FALLTHROUGH; case SEQ_MEMUSAGE: if (lzma_index_memusage(1, coder->count) > coder->memlimit) { @@ -153,8 +152,7 @@ index_decode(void *coder_ptr, const lzma_allocator *allocator, case SEQ_PADDING_INIT: coder->pos = lzma_index_padding_size(coder->index); coder->sequence = SEQ_PADDING; - - // Fall through + FALLTHROUGH; case SEQ_PADDING: if (coder->pos > 0) { @@ -170,8 +168,7 @@ index_decode(void *coder_ptr, const lzma_allocator *allocator, *in_pos - in_start, coder->crc32); coder->sequence = SEQ_CRC32; - - // Fall through + FALLTHROUGH; case SEQ_CRC32: do { diff --git a/src/liblzma/common/index_encoder.c b/src/liblzma/common/index_encoder.c index ecc299c0159f..80f1be1e3aea 100644 --- a/src/liblzma/common/index_encoder.c +++ b/src/liblzma/common/index_encoder.c @@ -93,8 +93,7 @@ index_encode(void *coder_ptr, } coder->sequence = SEQ_UNPADDED; - - // Fall through + FALLTHROUGH; case SEQ_UNPADDED: case SEQ_UNCOMPRESSED: { @@ -127,8 +126,7 @@ index_encode(void *coder_ptr, *out_pos - out_start, coder->crc32); coder->sequence = SEQ_CRC32; - - // Fall through + FALLTHROUGH; case SEQ_CRC32: // We don't use the main loop, because we don't want diff --git a/src/liblzma/common/index_hash.c b/src/liblzma/common/index_hash.c index caa5967ca496..b7f1b6b58d1a 100644 --- a/src/liblzma/common/index_hash.c +++ b/src/liblzma/common/index_hash.c @@ -267,9 +267,9 @@ 
lzma_index_hash_decode(lzma_index_hash *index_hash, const uint8_t *in, index_hash->pos = (LZMA_VLI_C(4) - index_size_unpadded( index_hash->records.count, index_hash->records.index_list_size)) & 3; - index_hash->sequence = SEQ_PADDING; - // Fall through + index_hash->sequence = SEQ_PADDING; + FALLTHROUGH; case SEQ_PADDING: if (index_hash->pos > 0) { @@ -302,8 +302,7 @@ lzma_index_hash_decode(lzma_index_hash *index_hash, const uint8_t *in, *in_pos - in_start, index_hash->crc32); index_hash->sequence = SEQ_CRC32; - - // Fall through + FALLTHROUGH; case SEQ_CRC32: do { diff --git a/src/liblzma/common/lzip_decoder.c b/src/liblzma/common/lzip_decoder.c index 651a0ae712c8..4dff2d5889ea 100644 --- a/src/liblzma/common/lzip_decoder.c +++ b/src/liblzma/common/lzip_decoder.c @@ -150,10 +150,9 @@ lzip_decode(void *coder_ptr, const lzma_allocator *allocator, coder->member_size = sizeof(lzip_id_string); coder->sequence = SEQ_VERSION; + FALLTHROUGH; } - // Fall through - case SEQ_VERSION: if (*in_pos >= in_size) return LZMA_OK; @@ -173,7 +172,7 @@ lzip_decode(void *coder_ptr, const lzma_allocator *allocator, if (coder->tell_any_check) return LZMA_GET_CHECK; - // Fall through + FALLTHROUGH; case SEQ_DICT_SIZE: { if (*in_pos >= in_size) @@ -220,10 +219,9 @@ lzip_decode(void *coder_ptr, const lzma_allocator *allocator, // LZMA_MEMLIMIT_ERROR we need to be able to restart after // the memlimit has been increased. 
coder->sequence = SEQ_CODER_INIT; + FALLTHROUGH; } - // Fall through - case SEQ_CODER_INIT: { if (coder->memusage > coder->memlimit) return LZMA_MEMLIMIT_ERROR; @@ -243,10 +241,9 @@ lzip_decode(void *coder_ptr, const lzma_allocator *allocator, coder->crc32 = 0; coder->sequence = SEQ_LZMA_STREAM; + FALLTHROUGH; } - // Fall through - case SEQ_LZMA_STREAM: { const size_t in_start = *in_pos; const size_t out_start = *out_pos; @@ -273,10 +270,9 @@ lzip_decode(void *coder_ptr, const lzma_allocator *allocator, return ret; coder->sequence = SEQ_MEMBER_FOOTER; + FALLTHROUGH; } - // Fall through - case SEQ_MEMBER_FOOTER: { // The footer of .lz version 0 lacks the Member size field. // This is the only difference between version 0 and diff --git a/src/liblzma/common/memcmplen.h b/src/liblzma/common/memcmplen.h index 394a4856dd6a..82e908542295 100644 --- a/src/liblzma/common/memcmplen.h +++ b/src/liblzma/common/memcmplen.h @@ -58,15 +58,13 @@ lzma_memcmplen(const uint8_t *buf1, const uint8_t *buf2, #if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ && (((TUKLIB_GNUC_REQ(3, 4) || defined(__clang__)) \ - && (defined(__x86_64__) \ - || defined(__aarch64__))) \ + && SIZE_MAX == UINT64_MAX) \ || (defined(__INTEL_COMPILER) && defined(__x86_64__)) \ || (defined(__INTEL_COMPILER) && defined(_M_X64)) \ || (defined(_MSC_VER) && (defined(_M_X64) \ || defined(_M_ARM64) || defined(_M_ARM64EC)))) // This is only for x86-64 and ARM64 for now. This might be fine on - // other 64-bit processors too. On big endian one should use xor - // instead of subtraction and switch to __builtin_clzll(). + // other 64-bit processors too. // // Reasons to use subtraction instead of xor: // @@ -82,7 +80,11 @@ lzma_memcmplen(const uint8_t *buf1, const uint8_t *buf2, // version 2023-05-26. 
https://www.agner.org/optimize/ #define LZMA_MEMCMPLEN_EXTRA 8 while (len < limit) { +# ifdef WORDS_BIGENDIAN + const uint64_t x = read64ne(buf1 + len) ^ read64ne(buf2 + len); +# else const uint64_t x = read64ne(buf1 + len) - read64ne(buf2 + len); +# endif if (x != 0) { // MSVC or Intel C compiler on Windows # if defined(_MSC_VER) || defined(__INTEL_COMPILER) @@ -90,6 +92,8 @@ lzma_memcmplen(const uint8_t *buf1, const uint8_t *buf2, _BitScanForward64(&tmp, x); len += (uint32_t)tmp >> 3; // GCC, Clang, or Intel C compiler +# elif defined(WORDS_BIGENDIAN) + len += (uint32_t)__builtin_clzll(x) >> 3; # else len += (uint32_t)__builtin_ctzll(x) >> 3; # endif diff --git a/src/liblzma/common/stream_decoder.c b/src/liblzma/common/stream_decoder.c index 7f426841366a..94004b74a165 100644 --- a/src/liblzma/common/stream_decoder.c +++ b/src/liblzma/common/stream_decoder.c @@ -154,9 +154,9 @@ stream_decode(void *coder_ptr, const lzma_allocator *allocator, if (coder->tell_any_check) return LZMA_GET_CHECK; - } - // Fall through + FALLTHROUGH; + } case SEQ_BLOCK_HEADER: { if (*in_pos >= in_size) @@ -187,10 +187,9 @@ stream_decode(void *coder_ptr, const lzma_allocator *allocator, coder->pos = 0; coder->sequence = SEQ_BLOCK_INIT; + FALLTHROUGH; } - // Fall through - case SEQ_BLOCK_INIT: { // Checking memusage and doing the initialization needs // its own sequence point because we need to be able to @@ -252,10 +251,9 @@ stream_decode(void *coder_ptr, const lzma_allocator *allocator, return ret; coder->sequence = SEQ_BLOCK_RUN; + FALLTHROUGH; } - // Fall through - case SEQ_BLOCK_RUN: { const lzma_ret ret = coder->block_decoder.code( coder->block_decoder.coder, allocator, @@ -291,10 +289,9 @@ stream_decode(void *coder_ptr, const lzma_allocator *allocator, return ret; coder->sequence = SEQ_STREAM_FOOTER; + FALLTHROUGH; } - // Fall through - case SEQ_STREAM_FOOTER: { // Copy the Stream Footer to the internal buffer. 
lzma_bufcpy(in, in_pos, in_size, coder->buffer, &coder->pos, @@ -331,10 +328,9 @@ stream_decode(void *coder_ptr, const lzma_allocator *allocator, return LZMA_STREAM_END; coder->sequence = SEQ_STREAM_PADDING; + FALLTHROUGH; } - // Fall through - case SEQ_STREAM_PADDING: assert(coder->concatenated); diff --git a/src/liblzma/common/stream_decoder_mt.c b/src/liblzma/common/stream_decoder_mt.c index 244624a47900..271f9b07c4b8 100644 --- a/src/liblzma/common/stream_decoder_mt.c +++ b/src/liblzma/common/stream_decoder_mt.c @@ -23,15 +23,10 @@ typedef enum { THR_IDLE, /// Decoding is in progress. - /// Main thread may change this to THR_STOP or THR_EXIT. + /// Main thread may change this to THR_IDLE or THR_EXIT. /// The worker thread may change this to THR_IDLE. THR_RUN, - /// The main thread wants the thread to stop whatever it was doing - /// but not exit. Main thread may change this to THR_EXIT. - /// The worker thread may change this to THR_IDLE. - THR_STOP, - /// The main thread wants the thread to exit. THR_EXIT, @@ -346,27 +341,6 @@ worker_enable_partial_update(void *thr_ptr) } -/// Things do to at THR_STOP or when finishing a Block. -/// This is called with thr->mutex locked. -static void -worker_stop(struct worker_thread *thr) -{ - // Update memory usage counters. - thr->coder->mem_in_use -= thr->in_size; - thr->in_size = 0; // thr->in was freed above. - - thr->coder->mem_in_use -= thr->mem_filters; - thr->coder->mem_cached += thr->mem_filters; - - // Put this thread to the stack of free threads. 
- thr->next = thr->coder->threads_free; - thr->coder->threads_free = thr; - - mythread_cond_signal(&thr->coder->cond); - return; -} - - static MYTHREAD_RET_TYPE worker_decoder(void *thr_ptr) { @@ -397,17 +371,6 @@ next_loop_unlocked: return MYTHREAD_RET_VALUE; } - if (thr->state == THR_STOP) { - thr->state = THR_IDLE; - mythread_mutex_unlock(&thr->mutex); - - mythread_sync(thr->coder->mutex) { - worker_stop(thr); - } - - goto next_loop_lock; - } - assert(thr->state == THR_RUN); // Update progress info for get_progress(). @@ -472,8 +435,7 @@ next_loop_unlocked: } // Either we finished successfully (LZMA_STREAM_END) or an error - // occurred. Both cases are handled almost identically. The error - // case requires updating thr->coder->thread_error. + // occurred. // // The sizes are in the Block Header and the Block decoder // checks that they match, thus we know these: @@ -481,16 +443,30 @@ next_loop_unlocked: assert(ret != LZMA_STREAM_END || thr->out_pos == thr->block_options.uncompressed_size); - // Free the input buffer. Don't update in_size as we need - // it later to update thr->coder->mem_in_use. - lzma_free(thr->in, thr->allocator); - thr->in = NULL; - mythread_sync(thr->mutex) { + // Block decoder ensures this, but do a sanity check anyway + // because thr->in_filled < thr->in_size means that the main + // thread is still writing to thr->in. + if (ret == LZMA_STREAM_END && thr->in_filled != thr->in_size) { + assert(0); + ret = LZMA_PROG_ERROR; + } + if (thr->state != THR_EXIT) thr->state = THR_IDLE; } + // Free the input buffer. Don't update in_size as we need + // it later to update thr->coder->mem_in_use. + // + // This step is skipped if an error occurred because the main thread + // might still be writing to thr->in. The memory will be freed after + // threads_end() sets thr->state = THR_EXIT. 
+ if (ret == LZMA_STREAM_END) { + lzma_free(thr->in, thr->allocator); + thr->in = NULL; + } + mythread_sync(thr->coder->mutex) { // Move our progress info to the main thread. thr->coder->progress_in += thr->in_pos; @@ -510,7 +486,20 @@ next_loop_unlocked: && thr->coder->thread_error == LZMA_OK) thr->coder->thread_error = ret; - worker_stop(thr); + // Return the worker thread to the stack of available + // threads only if no errors occurred. + if (ret == LZMA_STREAM_END) { + // Update memory usage counters. + thr->coder->mem_in_use -= thr->in_size; + thr->coder->mem_in_use -= thr->mem_filters; + thr->coder->mem_cached += thr->mem_filters; + + // Put this thread to the stack of free threads. + thr->next = thr->coder->threads_free; + thr->coder->threads_free = thr; + } + + mythread_cond_signal(&thr->coder->cond); } goto next_loop_lock; @@ -544,17 +533,22 @@ threads_end(struct lzma_stream_coder *coder, const lzma_allocator *allocator) } +/// Tell worker threads to stop without doing any cleaning up. +/// The clean up will be done when threads_exit() is called; +/// it's not possible to reuse the threads after threads_stop(). +/// +/// This is called before returning an unrecoverable error code +/// to the application. It would be waste of processor time +/// to keep the threads running in such a situation. static void threads_stop(struct lzma_stream_coder *coder) { for (uint32_t i = 0; i < coder->threads_initialized; ++i) { + // The threads that are in the THR_RUN state will stop + // when they check the state the next time. There's no + // need to signal coder->threads[i].cond. mythread_sync(coder->threads[i].mutex) { - // The state must be changed conditionally because - // THR_IDLE -> THR_STOP is not a valid state change. 
- if (coder->threads[i].state != THR_IDLE) { - coder->threads[i].state = THR_STOP; - mythread_cond_signal(&coder->threads[i].cond); - } + coder->threads[i].state = THR_IDLE; } } @@ -1077,9 +1071,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, if (coder->tell_any_check) return LZMA_GET_CHECK; - } - // Fall through + FALLTHROUGH; + } case SEQ_BLOCK_HEADER: { const size_t in_old = *in_pos; @@ -1214,10 +1208,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, } coder->sequence = SEQ_BLOCK_INIT; + FALLTHROUGH; } - // Fall through - case SEQ_BLOCK_INIT: { // Check if decoding is possible at all with the current // memlimit_stop which we must never exceed. @@ -1303,10 +1296,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, } coder->sequence = SEQ_BLOCK_THR_INIT; + FALLTHROUGH; } - // Fall through - case SEQ_BLOCK_THR_INIT: { // We need to wait for a multiple conditions to become true // until we can initialize the Block decoder and let a worker @@ -1508,10 +1500,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, } coder->sequence = SEQ_BLOCK_THR_RUN; + FALLTHROUGH; } - // Fall through - case SEQ_BLOCK_THR_RUN: { if (action == LZMA_FINISH && coder->fail_fast) { // We know that we won't get more input and that @@ -1549,10 +1540,17 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, // Read output from the output queue. Just like in // SEQ_BLOCK_HEADER, we wait to fill the output buffer // only if waiting_allowed was set to true in the beginning - // of this function (see the comment there). + // of this function (see the comment there) and there is + // no input available. In SEQ_BLOCK_HEADER, there is never + // input available when read_output_and_wait() is called, + // but here there can be when LZMA_FINISH is used, thus we + // need to check if *in_pos == in_size. Otherwise we would + // wait here instead of using the available input to start + // a new thread. 
return_if_error(read_output_and_wait(coder, allocator, out, out_pos, out_size, - NULL, waiting_allowed, + NULL, + waiting_allowed && *in_pos == in_size, &wait_abs, &has_blocked)); if (coder->pending_error != LZMA_OK) { @@ -1561,6 +1559,10 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, } // Return if the input didn't contain the whole Block. + // + // NOTE: When we updated coder->thr->in_filled a few lines + // above, the worker thread might by now have finished its + // work and returned itself back to the stack of free threads. if (coder->thr->in_filled < coder->thr->in_size) { assert(*in_pos == in_size); return LZMA_OK; @@ -1613,10 +1615,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, coder->mem_direct_mode = coder->mem_next_filters; coder->sequence = SEQ_BLOCK_DIRECT_RUN; + FALLTHROUGH; } - // Fall through - case SEQ_BLOCK_DIRECT_RUN: { const size_t in_old = *in_pos; const size_t out_old = *out_pos; @@ -1652,8 +1653,7 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, return LZMA_OK; coder->sequence = SEQ_INDEX_DECODE; - - // Fall through + FALLTHROUGH; case SEQ_INDEX_DECODE: { // If we don't have any input, don't call @@ -1672,10 +1672,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, return ret; coder->sequence = SEQ_STREAM_FOOTER; + FALLTHROUGH; } - // Fall through - case SEQ_STREAM_FOOTER: { // Copy the Stream Footer to the internal buffer. const size_t in_old = *in_pos; @@ -1714,10 +1713,9 @@ stream_decode_mt(void *coder_ptr, const lzma_allocator *allocator, return LZMA_STREAM_END; coder->sequence = SEQ_STREAM_PADDING; + FALLTHROUGH; } - // Fall through - case SEQ_STREAM_PADDING: assert(coder->concatenated); @@ -1948,7 +1946,7 @@ stream_decoder_mt_init(lzma_next_coder *next, const lzma_allocator *allocator, // accounting from scratch, too. Changes in filter and block sizes may // affect number of threads. // - // FIXME? 
Reusing should be easy but unlike the single-threaded + // Reusing threads doesn't seem worth it. Unlike the single-threaded // decoder, with some types of input file combinations reusing // could leave quite a lot of memory allocated but unused (first // file could allocate a lot, the next files could use fewer diff --git a/src/liblzma/common/stream_encoder_mt.c b/src/liblzma/common/stream_encoder_mt.c index f0fef1523318..fd0eb98df682 100644 --- a/src/liblzma/common/stream_encoder_mt.c +++ b/src/liblzma/common/stream_encoder_mt.c @@ -731,8 +731,7 @@ stream_encode_mt(void *coder_ptr, const lzma_allocator *allocator, coder->header_pos = 0; coder->sequence = SEQ_BLOCK; - - // Fall through + FALLTHROUGH; case SEQ_BLOCK: { // Initialized to silence warnings. @@ -851,9 +850,9 @@ stream_encode_mt(void *coder_ptr, const lzma_allocator *allocator, // to be ready to be copied out. coder->progress_out += lzma_index_size(coder->index) + LZMA_STREAM_HEADER_SIZE; - } - // Fall through + FALLTHROUGH; + } case SEQ_INDEX: { // Call the Index encoder. It doesn't take any input, so @@ -873,10 +872,9 @@ stream_encode_mt(void *coder_ptr, const lzma_allocator *allocator, return LZMA_PROG_ERROR; coder->sequence = SEQ_STREAM_FOOTER; + FALLTHROUGH; } - // Fall through - case SEQ_STREAM_FOOTER: lzma_bufcpy(coder->header, &coder->header_pos, sizeof(coder->header), diff --git a/src/liblzma/common/string_conversion.c b/src/liblzma/common/string_conversion.c index c899783c642a..015acf225856 100644 --- a/src/liblzma/common/string_conversion.c +++ b/src/liblzma/common/string_conversion.c @@ -12,6 +12,11 @@ #include "filter_common.h" +// liblzma itself doesn't use gettext to translate messages. +// Mark the strings still so that xz can translate them. 
+#define N_(msgid) msgid + + ///////////////////// // String building // ///////////////////// @@ -317,6 +322,10 @@ parse_lzma12_preset(const char **const str, const char *str_end, uint32_t *preset) { assert(*str < str_end); + + if (!(**str >= '0' && **str <= '9')) + return N_("Unsupported preset"); + *preset = (uint32_t)(**str - '0'); // NOTE: Remember to update LZMA12_PRESET_STR if this is modified! @@ -327,7 +336,7 @@ parse_lzma12_preset(const char **const str, const char *str_end, break; default: - return "Unsupported preset flag"; + return N_("Unsupported flag in the preset"); } } @@ -346,7 +355,7 @@ set_lzma12_preset(const char **const str, const char *str_end, lzma_options_lzma *opts = filter_options; if (lzma_lzma_preset(opts, preset)) - return "Unsupported preset"; + return N_("Unsupported preset"); return NULL; } @@ -438,7 +447,7 @@ parse_lzma12(const char **const str, const char *str_end, void *filter_options) return errmsg; if (opts->lc + opts->lp > LZMA_LCLP_MAX) - return "The sum of lc and lp must not exceed 4"; + return N_("The sum of lc and lp must not exceed 4"); return NULL; } @@ -574,21 +583,21 @@ parse_options(const char **const str, const char *str_end, // Fail if the '=' wasn't found or the option name is missing // (the first char is '='). if (equals_sign == NULL || **str == '=') - return "Options must be 'name=value' pairs separated " - "with commas"; + return N_("Options must be 'name=value' pairs " + "separated with commas"); // Reject a too long option name so that the memcmp() // in the loop below won't read past the end of the // string in optmap[i].name. const size_t name_len = (size_t)(equals_sign - *str); if (name_len > NAME_LEN_MAX) - return "Unknown option name"; + return N_("Unknown option name"); // Find the option name from optmap[]. 
size_t i = 0; while (true) { if (i == optmap_size) - return "Unknown option name"; + return N_("Unknown option name"); if (memcmp(*str, optmap[i].name, name_len) == 0 && optmap[i].name[name_len] == '\0') @@ -605,7 +614,7 @@ parse_options(const char **const str, const char *str_end, // string so check it here. const size_t value_len = (size_t)(name_eq_value_end - *str); if (value_len == 0) - return "Option value cannot be empty"; + return N_("Option value cannot be empty"); // LZMA1/2 preset has its own parsing function. if (optmap[i].type == OPTMAP_TYPE_LZMA_PRESET) { @@ -626,14 +635,14 @@ parse_options(const char **const str, const char *str_end, // in the loop below won't read past the end of the // string in optmap[i].u.map[j].name. if (value_len > NAME_LEN_MAX) - return "Invalid option value"; + return N_("Invalid option value"); const name_value_map *map = optmap[i].u.map; size_t j = 0; while (true) { // The array is terminated with an empty name. if (map[j].name[0] == '\0') - return "Invalid option value"; + return N_("Invalid option value"); if (memcmp(*str, map[j].name, value_len) == 0 && map[j].name[value_len] @@ -647,7 +656,8 @@ parse_options(const char **const str, const char *str_end, } else if (**str < '0' || **str > '9') { // Note that "max" isn't supported while it is // supported in xz. It's not useful here. 
- return "Value is not a non-negative decimal integer"; + return N_("Value is not a non-negative " + "decimal integer"); } else { // strtoul() has locale-specific behavior so it cannot // be relied on to get reproducible results since we @@ -661,13 +671,13 @@ parse_options(const char **const str, const char *str_end, v = 0; do { if (v > UINT32_MAX / 10) - return "Value out of range"; + return N_("Value out of range"); v *= 10; const uint32_t add = (uint32_t)(*p - '0'); if (UINT32_MAX - add < v) - return "Value out of range"; + return N_("Value out of range"); v += add; ++p; @@ -692,8 +702,9 @@ parse_options(const char **const str, const char *str_end, if ((optmap[i].flags & OPTMAP_USE_BYTE_SUFFIX) == 0) { *str = multiplier_start; - return "This option does not support " - "any integer suffixes"; + return N_("This option does not " + "support any multiplier " + "suffixes"); } uint32_t shift; @@ -716,8 +727,13 @@ parse_options(const char **const str, const char *str_end, default: *str = multiplier_start; - return "Invalid multiplier suffix " - "(KiB, MiB, or GiB)"; + + // TRANSLATORS: Don't translate the + // suffixes "KiB", "MiB", or "GiB" + // because a user can only specify + // untranslated suffixes. + return N_("Invalid multiplier suffix " + "(KiB, MiB, or GiB)"); } ++p; @@ -736,19 +752,19 @@ parse_options(const char **const str, const char *str_end, // Now we must have no chars remaining. if (p < name_eq_value_end) { *str = multiplier_start; - return "Invalid multiplier suffix " - "(KiB, MiB, or GiB)"; + return N_("Invalid multiplier suffix " + "(KiB, MiB, or GiB)"); } if (v > (UINT32_MAX >> shift)) - return "Value out of range"; + return N_("Value out of range"); v <<= shift; } if (v < optmap[i].u.range.min || v > optmap[i].u.range.max) - return "Value out of range"; + return N_("Value out of range"); } // Set the value in filter_options. 
Enums are handled @@ -810,15 +826,15 @@ parse_filter(const char **const str, const char *str_end, lzma_filter *filter, // string in filter_name_map[i].name. const size_t name_len = (size_t)(name_end - *str); if (name_len > NAME_LEN_MAX) - return "Unknown filter name"; + return N_("Unknown filter name"); for (size_t i = 0; i < ARRAY_SIZE(filter_name_map); ++i) { if (memcmp(*str, filter_name_map[i].name, name_len) == 0 && filter_name_map[i].name[name_len] == '\0') { if (only_xz && filter_name_map[i].id >= LZMA_FILTER_RESERVED_START) - return "This filter cannot be used in " - "the .xz format"; + return N_("This filter cannot be used in " + "the .xz format"); // Allocate the filter-specific options and // initialize the memory with zeros. @@ -826,7 +842,7 @@ parse_filter(const char **const str, const char *str_end, lzma_filter *filter, filter_name_map[i].opts_size, allocator); if (options == NULL) - return "Memory allocation failed"; + return N_("Memory allocation failed"); // Filter name was found so the input string is good // at least this far. @@ -846,7 +862,7 @@ parse_filter(const char **const str, const char *str_end, lzma_filter *filter, } } - return "Unknown filter name"; + return N_("Unknown filter name"); } @@ -865,8 +881,8 @@ str_to_filters(const char **const str, lzma_filter *filters, uint32_t flags, ++*str; if (**str == '\0') - return "Empty string is not allowed, " - "try \"6\" if a default value is needed"; + return N_("Empty string is not allowed, " + "try '6' if a default value is needed"); // Detect the type of the string. // @@ -889,7 +905,7 @@ str_to_filters(const char **const str, lzma_filter *filters, uint32_t flags, // there are no chars other than spaces. for (size_t i = 1; str_end[i] != '\0'; ++i) if (str_end[i] != ' ') - return "Unsupported preset"; + return N_("Unsupported preset"); } else { // There are no trailing spaces. Use the whole string. 
str_end = *str + str_len; @@ -902,11 +918,11 @@ str_to_filters(const char **const str, lzma_filter *filters, uint32_t flags, lzma_options_lzma *opts = lzma_alloc(sizeof(*opts), allocator); if (opts == NULL) - return "Memory allocation failed"; + return N_("Memory allocation failed"); if (lzma_lzma_preset(opts, preset)) { lzma_free(opts, allocator); - return "Unsupported preset"; + return N_("Unsupported preset"); } filters[0].id = LZMA_FILTER_LZMA2; @@ -930,7 +946,7 @@ str_to_filters(const char **const str, lzma_filter *filters, uint32_t flags, size_t i = 0; do { if (i == LZMA_FILTERS_MAX) { - errmsg = "The maximum number of filters is four"; + errmsg = N_("The maximum number of filters is four"); goto error; } @@ -952,7 +968,7 @@ str_to_filters(const char **const str, lzma_filter *filters, uint32_t flags, // Inputs that have "--" at the end or "-- " in the middle // will result in an empty filter name. if (filter_end == *str) { - errmsg = "Filter name is missing"; + errmsg = N_("Filter name is missing"); goto error; } @@ -979,8 +995,8 @@ str_to_filters(const char **const str, lzma_filter *filters, uint32_t flags, const lzma_ret ret = lzma_validate_chain(temp_filters, &dummy); assert(ret == LZMA_OK || ret == LZMA_OPTIONS_ERROR); if (ret != LZMA_OK) { - errmsg = "Invalid filter chain " - "('lzma2' missing at the end?)"; + errmsg = N_("Invalid filter chain " + "('lzma2' missing at the end?)"); goto error; } } @@ -1008,17 +1024,26 @@ lzma_str_to_filters(const char *str, int *error_pos, lzma_filter *filters, if (error_pos != NULL) *error_pos = 0; - if (str == NULL || filters == NULL) + if (str == NULL || filters == NULL) { + // Don't translate this because it's only shown in case of + // a programming error. return "Unexpected NULL pointer argument(s) " "to lzma_str_to_filters()"; + } // Validate the flags. 
const uint32_t supported_flags = LZMA_STR_ALL_FILTERS | LZMA_STR_NO_VALIDATION; - if (flags & ~supported_flags) + if (flags & ~supported_flags) { + // This message is possible only if the caller uses flags + // that are only supported in a newer liblzma version (or + // the flags are simply buggy). Don't translate this at least + // when liblzma itself doesn't use gettext; xz and liblzma + // are usually upgraded at the same time. return "Unsupported flags to lzma_str_to_filters()"; + } const char *used = str; const char *errmsg = str_to_filters(&used, filters, flags, allocator); diff --git a/src/liblzma/liblzma_generic.map b/src/liblzma/liblzma_generic.map index f74c15484559..2bef27a8f7d7 100644 --- a/src/liblzma/liblzma_generic.map +++ b/src/liblzma/liblzma_generic.map @@ -126,3 +126,13 @@ XZ_5.6.0 { global: lzma_mt_block_size; } XZ_5.4; + +XZ_5.8 { +global: + lzma_bcj_arm64_encode; + lzma_bcj_arm64_decode; + lzma_bcj_riscv_encode; + lzma_bcj_riscv_decode; + lzma_bcj_x86_encode; + lzma_bcj_x86_decode; +} XZ_5.6.0; diff --git a/src/liblzma/liblzma_linux.map b/src/liblzma/liblzma_linux.map index 7e4b25e17620..50f1571de219 100644 --- a/src/liblzma/liblzma_linux.map +++ b/src/liblzma/liblzma_linux.map @@ -141,3 +141,13 @@ XZ_5.6.0 { global: lzma_mt_block_size; } XZ_5.4; + +XZ_5.8 { +global: + lzma_bcj_arm64_encode; + lzma_bcj_arm64_decode; + lzma_bcj_riscv_encode; + lzma_bcj_riscv_decode; + lzma_bcj_x86_encode; + lzma_bcj_x86_decode; +} XZ_5.6.0; diff --git a/src/liblzma/lz/lz_decoder.c b/src/liblzma/lz/lz_decoder.c index 92913f225a0d..1cb120ab3b09 100644 --- a/src/liblzma/lz/lz_decoder.c +++ b/src/liblzma/lz/lz_decoder.c @@ -53,9 +53,9 @@ typedef struct { static void lz_decoder_reset(lzma_coder *coder) { - coder->dict.pos = 2 * LZ_DICT_REPEAT_MAX; + coder->dict.pos = LZ_DICT_INIT_POS; coder->dict.full = 0; - coder->dict.buf[2 * LZ_DICT_REPEAT_MAX - 1] = '\0'; + coder->dict.buf[LZ_DICT_INIT_POS - 1] = '\0'; coder->dict.has_wrapped = false; coder->dict.need_reset = 
false; return; @@ -261,10 +261,12 @@ lzma_lz_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, // recommended to give aligned buffers to liblzma. // // Reserve 2 * LZ_DICT_REPEAT_MAX bytes of extra space which is - // needed for alloc_size. + // needed for alloc_size. Reserve also LZ_DICT_EXTRA bytes of extra + // space which is *not* counted in alloc_size or coder->dict.size. // // Avoid integer overflow. - if (lz_options.dict_size > SIZE_MAX - 15 - 2 * LZ_DICT_REPEAT_MAX) + if (lz_options.dict_size > SIZE_MAX - 15 - 2 * LZ_DICT_REPEAT_MAX + - LZ_DICT_EXTRA) return LZMA_MEM_ERROR; lz_options.dict_size = (lz_options.dict_size + 15) & ~((size_t)(15)); @@ -277,7 +279,13 @@ lzma_lz_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, // Allocate and initialize the dictionary. if (coder->dict.size != alloc_size) { lzma_free(coder->dict.buf, allocator); - coder->dict.buf = lzma_alloc(alloc_size, allocator); + + // The LZ_DICT_EXTRA bytes at the end of the buffer aren't + // included in alloc_size. These extra bytes allow + // dict_repeat() to read and write more data than requested. + // Otherwise this extra space is ignored. + coder->dict.buf = lzma_alloc(alloc_size + LZ_DICT_EXTRA, + allocator); if (coder->dict.buf == NULL) return LZMA_MEM_ERROR; @@ -320,5 +328,6 @@ lzma_lz_decoder_init(lzma_next_coder *next, const lzma_allocator *allocator, extern uint64_t lzma_lz_decoder_memusage(size_t dictionary_size) { - return sizeof(lzma_coder) + (uint64_t)(dictionary_size); + return sizeof(lzma_coder) + (uint64_t)(dictionary_size) + + 2 * LZ_DICT_REPEAT_MAX + LZ_DICT_EXTRA; } diff --git a/src/liblzma/lz/lz_decoder.h b/src/liblzma/lz/lz_decoder.h index cb61b6e24c78..2698e0167fcc 100644 --- a/src/liblzma/lz/lz_decoder.h +++ b/src/liblzma/lz/lz_decoder.h @@ -15,10 +15,40 @@ #include "common.h" +#ifdef HAVE_IMMINTRIN_H +# include <immintrin.h> +#endif + + +// dict_repeat() implementation variant: +// 0 = Byte-by-byte copying only. 
+// 1 = Use memcpy() for non-overlapping copies. +// 2 = Use x86 SSE2 for non-overlapping copies. +#ifndef LZMA_LZ_DECODER_CONFIG +# if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \ + && defined(HAVE_IMMINTRIN_H) \ + && (defined(__SSE2__) || defined(_M_X64) \ + || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)) +# define LZMA_LZ_DECODER_CONFIG 2 +# else +# define LZMA_LZ_DECODER_CONFIG 1 +# endif +#endif -/// Maximum length of a match rounded up to a nice power of 2 which is -/// a good size for aligned memcpy(). The allocated dictionary buffer will -/// be 2 * LZ_DICT_REPEAT_MAX bytes larger than the actual dictionary size: +/// Byte-by-byte and memcpy() copy exactly the amount needed. Other methods +/// can copy up to LZ_DICT_EXTRA bytes more than requested, and this amount +/// of extra space is needed at the end of the allocated dictionary buffer. +/// +/// NOTE: If this is increased, update LZMA_DICT_REPEAT_MAX too. +#if LZMA_LZ_DECODER_CONFIG >= 2 +# define LZ_DICT_EXTRA 32 +#else +# define LZ_DICT_EXTRA 0 +#endif + +/// Maximum number of bytes that dict_repeat() may copy. The allocated +/// dictionary buffer will be 2 * LZ_DICT_REPEAT_MAX + LZMA_DICT_EXTRA bytes +/// larger than the actual dictionary size: /// /// (1) Every time the decoder reaches the end of the dictionary buffer, /// the last LZ_DICT_REPEAT_MAX bytes will be copied to the beginning. @@ -27,14 +57,26 @@ /// /// (2) The other LZ_DICT_REPEAT_MAX bytes is kept as a buffer between /// the oldest byte still in the dictionary and the current write -/// position. This way dict_repeat(dict, dict->size - 1, &len) +/// position. This way dict_repeat() with the maximum valid distance /// won't need memmove() as the copying cannot overlap. /// +/// (3) LZ_DICT_EXTRA bytes are required at the end of the dictionary buffer +/// so that extra copying done by dict_repeat() won't write or read past +/// the end of the allocated buffer. This amount is *not* counted as part +/// of lzma_dict.size. 
+/// /// Note that memcpy() still cannot be used if distance < len. /// -/// LZMA's longest match length is 273 so pick a multiple of 16 above that. +/// LZMA's longest match length is 273 bytes. The LZMA decoder looks at +/// the lowest four bits of the dictionary position, thus 273 must be +/// rounded up to the next multiple of 16 (288). In addition, optimized +/// dict_repeat() copies 32 bytes at a time, thus this must also be +/// a multiple of 32. #define LZ_DICT_REPEAT_MAX 288 +/// Initial position in lzma_dict.buf when the dictionary is empty. +#define LZ_DICT_INIT_POS (2 * LZ_DICT_REPEAT_MAX) + typedef struct { /// Pointer to the dictionary buffer. @@ -158,7 +200,8 @@ dict_is_distance_valid(const lzma_dict *const dict, const size_t distance) /// Repeat *len bytes at distance. static inline bool -dict_repeat(lzma_dict *dict, uint32_t distance, uint32_t *len) +dict_repeat(lzma_dict *restrict dict, + uint32_t distance, uint32_t *restrict len) { // Don't write past the end of the dictionary. const size_t dict_avail = dict->limit - dict->pos; @@ -169,9 +212,17 @@ dict_repeat(lzma_dict *dict, uint32_t distance, uint32_t *len) if (distance >= dict->pos) back += dict->size - LZ_DICT_REPEAT_MAX; - // Repeat a block of data from the history. Because memcpy() is faster - // than copying byte by byte in a loop, the copying process gets split - // into two cases. +#if LZMA_LZ_DECODER_CONFIG == 0 + // Minimal byte-by-byte method. This might be the least bad choice + // if memcpy() isn't fast and there's no replacement for it below. + while (left-- > 0) { + dict->buf[dict->pos++] = dict->buf[back++]; + } + +#else + // Because memcpy() or a similar method can be faster than copying + // byte by byte in a loop, the copying process is split into + // two cases. if (distance < left) { // Source and target areas overlap, thus we can't use // memcpy() nor even memmove() safely. 
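Note on the lz_decoder.h hunks above: two points in the comments can be checked with a small standalone sketch. First, rounding the longest LZMA match length (273) up to a multiple of 16 gives 288, which is also a multiple of 32. Second, when the distance is smaller than the length, the source and target areas overlap and the copy has to go byte by byte, because bytes written earlier in the same copy are read again later. The helpers `round_up_pow2` and `lz_repeat_bytewise` below are hypothetical names invented for this illustration; they are not part of liblzma and only approximate what dict_repeat() does on a flat buffer with no wrap-around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of liblzma): round n up to the next
 * multiple of m, where m must be a power of two. */
static size_t
round_up_pow2(size_t n, size_t m)
{
	assert(m != 0 && (m & (m - 1)) == 0);
	return (n + m - 1) & ~(m - 1);
}

/* Hypothetical helper (not part of liblzma): LZ77-style repeat inside
 * one flat buffer. Copies len bytes from buf[pos - distance] to
 * buf[pos] one byte at a time, so that bytes written earlier in this
 * call can be read again later in the same call. Returns the new
 * write position. The caller must ensure that buf has room for
 * pos + len bytes. */
static size_t
lz_repeat_bytewise(uint8_t *buf, size_t pos, size_t distance, size_t len)
{
	assert(distance >= 1 && distance <= pos);

	size_t back = pos - distance;
	while (len-- > 0)
		buf[pos++] = buf[back++];

	return pos;
}
```

With a buffer that starts with "ab", repeating 6 bytes at distance 2 extends the pattern to "abababab". A single memcpy() of 6 bytes would not be guaranteed to do this, since its behavior is undefined for overlapping areas, and memmove() would copy the original source bytes instead of the newly written ones.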
@@ -179,32 +230,56 @@ dict_repeat(lzma_dict *dict, uint32_t distance, uint32_t *len) dict->buf[dict->pos++] = dict->buf[back++]; } while (--left > 0); } else { +# if LZMA_LZ_DECODER_CONFIG == 1 memcpy(dict->buf + dict->pos, dict->buf + back, left); dict->pos += left; + +# elif LZMA_LZ_DECODER_CONFIG == 2 + // This can copy up to 32 bytes more than required. + // (If left == 0, we still copy 32 bytes.) + size_t pos = dict->pos; + dict->pos += left; + do { + const __m128i x0 = _mm_loadu_si128( + (__m128i *)(dict->buf + back)); + const __m128i x1 = _mm_loadu_si128( + (__m128i *)(dict->buf + back + 16)); + back += 32; + _mm_storeu_si128( + (__m128i *)(dict->buf + pos), x0); + _mm_storeu_si128( + (__m128i *)(dict->buf + pos + 16), x1); + pos += 32; + } while (pos < dict->pos); + +# else +# error "Invalid LZMA_LZ_DECODER_CONFIG value" +# endif } +#endif // Update how full the dictionary is. if (!dict->has_wrapped) - dict->full = dict->pos - 2 * LZ_DICT_REPEAT_MAX; + dict->full = dict->pos - LZ_DICT_INIT_POS; return *len != 0; } static inline void -dict_put(lzma_dict *dict, uint8_t byte) +dict_put(lzma_dict *restrict dict, uint8_t byte) { dict->buf[dict->pos++] = byte; if (!dict->has_wrapped) - dict->full = dict->pos - 2 * LZ_DICT_REPEAT_MAX; + dict->full = dict->pos - LZ_DICT_INIT_POS; } /// Puts one byte into the dictionary. Returns true if the dictionary was /// already full and the byte couldn't be added. 
static inline bool -dict_put_safe(lzma_dict *dict, uint8_t byte) +dict_put_safe(lzma_dict *restrict dict, uint8_t byte) { if (unlikely(dict->pos == dict->limit)) return true; @@ -234,7 +309,7 @@ dict_write(lzma_dict *restrict dict, const uint8_t *restrict in, dict->buf, &dict->pos, dict->limit); if (!dict->has_wrapped) - dict->full = dict->pos - 2 * LZ_DICT_REPEAT_MAX; + dict->full = dict->pos - LZ_DICT_INIT_POS; return; } diff --git a/src/liblzma/lz/lz_encoder.c b/src/liblzma/lz/lz_encoder.c index 4af23e14c423..e5c4057dca53 100644 --- a/src/liblzma/lz/lz_encoder.c +++ b/src/liblzma/lz/lz_encoder.c @@ -15,7 +15,7 @@ // See lz_encoder_hash.h. This is a bit hackish but avoids making // endianness a conditional in makefiles. -#if defined(WORDS_BIGENDIAN) && !defined(HAVE_SMALL) +#ifdef LZMA_LZ_HASH_TABLE_IS_NEEDED # include "lz_encoder_hash_table.h" #endif diff --git a/src/liblzma/lz/lz_encoder_hash.h b/src/liblzma/lz/lz_encoder_hash.h index 8ace82b04c51..6d4bf837fd16 100644 --- a/src/liblzma/lz/lz_encoder_hash.h +++ b/src/liblzma/lz/lz_encoder_hash.h @@ -5,23 +5,37 @@ /// \file lz_encoder_hash.h /// \brief Hash macros for match finders // -// Author: Igor Pavlov +// Authors: Igor Pavlov +// Lasse Collin // /////////////////////////////////////////////////////////////////////////////// #ifndef LZMA_LZ_ENCODER_HASH_H #define LZMA_LZ_ENCODER_HASH_H -#if defined(WORDS_BIGENDIAN) && !defined(HAVE_SMALL) - // This is to make liblzma produce the same output on big endian - // systems that it does on little endian systems. lz_encoder.c - // takes care of including the actual table. +// We need to know if CRC32_GENERIC is defined and we may need the declaration +// of lzma_crc32_table[][]. +#include "crc_common.h" + +// If HAVE_SMALL is defined, then lzma_crc32_table[][] exists and +// it's little endian even on big endian systems. 
+// +// If HAVE_SMALL isn't defined, lzma_crc32_table[][] is in native endian +// but we want a little endian one so that the compressed output won't +// depend on the processor endianness. Big endian systems are less common +// so those get the burden of an extra 1 KiB table. +// +// If HAVE_SMALL isn't defined and CRC32_GENERIC isn't defined either, +// then lzma_crc32_table[][] doesn't exist. +#if defined(HAVE_SMALL) \ + || (defined(CRC32_GENERIC) && !defined(WORDS_BIGENDIAN)) +# define hash_table lzma_crc32_table[0] +#else + // lz_encoder.c takes care of including the actual table. lzma_attr_visibility_hidden extern const uint32_t lzma_lz_hash_table[256]; # define hash_table lzma_lz_hash_table -#else -# include "check.h" -# define hash_table lzma_crc32_table[0] +# define LZMA_LZ_HASH_TABLE_IS_NEEDED 1 #endif #define HASH_2_SIZE (UINT32_C(1) << 10) diff --git a/src/liblzma/lzma/lzma2_encoder.c b/src/liblzma/lzma/lzma2_encoder.c index e20b75b30037..71cfd9b4114e 100644 --- a/src/liblzma/lzma/lzma2_encoder.c +++ b/src/liblzma/lzma/lzma2_encoder.c @@ -159,8 +159,7 @@ lzma2_encode(void *coder_ptr, lzma_mf *restrict mf, coder->uncompressed_size = 0; coder->compressed_size = 0; coder->sequence = SEQ_LZMA_ENCODE; - - // Fall through + FALLTHROUGH; case SEQ_LZMA_ENCODE: { // Calculate how much more uncompressed data this chunk @@ -219,10 +218,9 @@ lzma2_encode(void *coder_ptr, lzma_mf *restrict mf, lzma2_header_lzma(coder); coder->sequence = SEQ_LZMA_COPY; + FALLTHROUGH; } - // Fall through - case SEQ_LZMA_COPY: // Copy the compressed chunk along its headers to the // output buffer. 
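Note on the FALLTHROUGH changes above: these hunks replace "// Fall through" comments with a FALLTHROUGH statement, whose definition is not part of the hunks shown here. The sketch below shows one common way such a macro can be built on the GCC/Clang `fallthrough` statement attribute. It is an assumption for illustration only, not necessarily how XZ Utils defines FALLTHROUGH; the names `SKETCH_FALLTHROUGH`, `enum step`, and `steps_run_from` are hypothetical.

```c
#include <assert.h>

/* Hypothetical macro, not the XZ Utils definition. Use the statement
 * attribute when the compiler reports support for it, otherwise
 * expand to a harmless no-op expression. The checks are nested
 * because __has_attribute(...) cannot be parsed by a preprocessor
 * that doesn't provide it. */
#if defined(__has_attribute)
#	if __has_attribute(__fallthrough__)
#		define SKETCH_FALLTHROUGH __attribute__((__fallthrough__))
#	endif
#endif
#ifndef SKETCH_FALLTHROUGH
#	define SKETCH_FALLTHROUGH ((void)0)
#endif

enum step { STEP_HEADER, STEP_BODY, STEP_FOOTER };

/* Counts how many steps run when starting from the given step.
 * Each case intentionally continues into the next one, in the same
 * way as the SEQ_* state machines in the encoder code. */
static int
steps_run_from(enum step start)
{
	int count = 0;

	switch (start) {
	case STEP_HEADER:
		++count;
		SKETCH_FALLTHROUGH;

	case STEP_BODY:
		++count;
		SKETCH_FALLTHROUGH;

	case STEP_FOOTER:
		++count;
		break;
	}

	return count;
}
```

Unlike a comment, an attribute-based marker is seen by the compiler itself, so the intent does not depend on how a particular compiler matches comment text when checking for implicit fall-through.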
@@ -244,8 +242,7 @@ lzma2_encode(void *coder_ptr, lzma_mf *restrict mf, return LZMA_OK; coder->sequence = SEQ_UNCOMPRESSED_COPY; - - // Fall through + FALLTHROUGH; case SEQ_UNCOMPRESSED_COPY: // Copy the uncompressed data as is from the dictionary diff --git a/src/liblzma/lzma/lzma_decoder.c b/src/liblzma/lzma/lzma_decoder.c index 0abed02b8154..2088a2faa54e 100644 --- a/src/liblzma/lzma/lzma_decoder.c +++ b/src/liblzma/lzma/lzma_decoder.c @@ -18,7 +18,7 @@ // The macros unroll loops with switch statements. // Silence warnings about missing fall-through comments. -#if TUKLIB_GNUC_REQ(7, 0) +#if TUKLIB_GNUC_REQ(7, 0) || defined(__clang__) # pragma GCC diagnostic ignored "-Wimplicit-fallthrough" #endif diff --git a/src/liblzma/simple/arm.c b/src/liblzma/simple/arm.c index 58acb2d11adf..f9d9c08b3c42 100644 --- a/src/liblzma/simple/arm.c +++ b/src/liblzma/simple/arm.c @@ -18,8 +18,10 @@ arm_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { + size &= ~(size_t)3; + size_t i; - for (i = 0; i + 4 <= size; i += 4) { + for (i = 0; i < size; i += 4) { if (buffer[i + 3] == 0xEB) { uint32_t src = ((uint32_t)(buffer[i + 2]) << 16) | ((uint32_t)(buffer[i + 1]) << 8) diff --git a/src/liblzma/simple/arm64.c b/src/liblzma/simple/arm64.c index 16c2f565f73d..2ec10d937fbd 100644 --- a/src/liblzma/simple/arm64.c +++ b/src/liblzma/simple/arm64.c @@ -28,6 +28,8 @@ arm64_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { + size &= ~(size_t)3; + size_t i; // Clang 14.0.6 on x86-64 makes this four times bigger and 40 % slower @@ -37,7 +39,7 @@ arm64_code(void *simple lzma_attribute((__unused__)), #ifdef __clang__ # pragma clang loop vectorize(disable) #endif - for (i = 0; i + 4 <= size; i += 4) { + for (i = 0; i < size; i += 4) { uint32_t pc = (uint32_t)(now_pos + i); uint32_t instr = read32le(buffer + i); @@ -122,6 +124,15 @@ 
lzma_simple_arm64_encoder_init(lzma_next_coder *next, { return arm64_coder_init(next, allocator, filters, true); } + + +extern LZMA_API(size_t) +lzma_bcj_arm64_encode(uint32_t start_offset, uint8_t *buf, size_t size) +{ + // start_offset must be a multiple of four. + start_offset &= ~UINT32_C(3); + return arm64_code(NULL, start_offset, true, buf, size); +} #endif @@ -133,4 +144,13 @@ lzma_simple_arm64_decoder_init(lzma_next_coder *next, { return arm64_coder_init(next, allocator, filters, false); } + + +extern LZMA_API(size_t) +lzma_bcj_arm64_decode(uint32_t start_offset, uint8_t *buf, size_t size) +{ + // start_offset must be a multiple of four. + start_offset &= ~UINT32_C(3); + return arm64_code(NULL, start_offset, false, buf, size); +} #endif diff --git a/src/liblzma/simple/armthumb.c b/src/liblzma/simple/armthumb.c index f1eeca9b80f1..368b51c7fea9 100644 --- a/src/liblzma/simple/armthumb.c +++ b/src/liblzma/simple/armthumb.c @@ -18,8 +18,13 @@ armthumb_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { + if (size < 4) + return 0; + + size -= 4; + size_t i; - for (i = 0; i + 4 <= size; i += 2) { + for (i = 0; i <= size; i += 2) { if ((buffer[i + 1] & 0xF8) == 0xF0 && (buffer[i + 3] & 0xF8) == 0xF8) { uint32_t src = (((uint32_t)(buffer[i + 1]) & 7) << 19) diff --git a/src/liblzma/simple/ia64.c b/src/liblzma/simple/ia64.c index 502501409977..2a4aaebb4720 100644 --- a/src/liblzma/simple/ia64.c +++ b/src/liblzma/simple/ia64.c @@ -25,8 +25,10 @@ ia64_code(void *simple lzma_attribute((__unused__)), 4, 4, 0, 0, 4, 4, 0, 0 }; + size &= ~(size_t)15; + size_t i; - for (i = 0; i + 16 <= size; i += 16) { + for (i = 0; i < size; i += 16) { const uint32_t instr_template = buffer[i] & 0x1F; const uint32_t mask = BRANCH_TABLE[instr_template]; uint32_t bit_pos = 5; diff --git a/src/liblzma/simple/powerpc.c b/src/liblzma/simple/powerpc.c index ba6cfbef3ab6..ea47d14d4c3f 100644 --- a/src/liblzma/simple/powerpc.c +++ 
b/src/liblzma/simple/powerpc.c @@ -18,8 +18,10 @@ powerpc_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { + size &= ~(size_t)3; + size_t i; - for (i = 0; i + 4 <= size; i += 4) { + for (i = 0; i < size; i += 4) { // PowerPC branch 6(48) 24(Offset) 1(Abs) 1(Link) if ((buffer[i] >> 2) == 0x12 && ((buffer[i + 3] & 3) == 1)) { diff --git a/src/liblzma/simple/riscv.c b/src/liblzma/simple/riscv.c index b18df8b637d0..bc97ebdbb0fb 100644 --- a/src/liblzma/simple/riscv.c +++ b/src/liblzma/simple/riscv.c @@ -617,6 +617,15 @@ lzma_simple_riscv_encoder_init(lzma_next_coder *next, return lzma_simple_coder_init(next, allocator, filters, &riscv_encode, 0, 8, 2, true); } + + +extern LZMA_API(size_t) +lzma_bcj_riscv_encode(uint32_t start_offset, uint8_t *buf, size_t size) +{ + // start_offset must be a multiple of two. + start_offset &= ~UINT32_C(1); + return riscv_encode(NULL, start_offset, true, buf, size); +} #endif @@ -752,4 +761,13 @@ lzma_simple_riscv_decoder_init(lzma_next_coder *next, return lzma_simple_coder_init(next, allocator, filters, &riscv_decode, 0, 8, 2, false); } + + +extern LZMA_API(size_t) +lzma_bcj_riscv_decode(uint32_t start_offset, uint8_t *buf, size_t size) +{ + // start_offset must be a multiple of two. 
+ start_offset &= ~UINT32_C(1); + return riscv_decode(NULL, start_offset, false, buf, size); +} #endif diff --git a/src/liblzma/simple/sparc.c b/src/liblzma/simple/sparc.c index e8ad285a1927..1fa4850458e8 100644 --- a/src/liblzma/simple/sparc.c +++ b/src/liblzma/simple/sparc.c @@ -18,9 +18,10 @@ sparc_code(void *simple lzma_attribute((__unused__)), uint32_t now_pos, bool is_encoder, uint8_t *buffer, size_t size) { - size_t i; - for (i = 0; i + 4 <= size; i += 4) { + size &= ~(size_t)3; + size_t i; + for (i = 0; i < size; i += 4) { if ((buffer[i] == 0x40 && (buffer[i + 1] & 0xC0) == 0x00) || (buffer[i] == 0x7F && (buffer[i + 1] & 0xC0) == 0xC0)) { diff --git a/src/liblzma/simple/x86.c b/src/liblzma/simple/x86.c index f216231f2d12..dffa7863131a 100644 --- a/src/liblzma/simple/x86.c +++ b/src/liblzma/simple/x86.c @@ -143,6 +143,18 @@ lzma_simple_x86_encoder_init(lzma_next_coder *next, { return x86_coder_init(next, allocator, filters, true); } + + +extern LZMA_API(size_t) +lzma_bcj_x86_encode(uint32_t start_offset, uint8_t *buf, size_t size) +{ + lzma_simple_x86 simple = { + .prev_mask = 0, + .prev_pos = (uint32_t)(-5), + }; + + return x86_code(&simple, start_offset, true, buf, size); +} #endif @@ -154,4 +166,16 @@ lzma_simple_x86_decoder_init(lzma_next_coder *next, { return x86_coder_init(next, allocator, filters, false); } + + +extern LZMA_API(size_t) +lzma_bcj_x86_decode(uint32_t start_offset, uint8_t *buf, size_t size) +{ + lzma_simple_x86 simple = { + .prev_mask = 0, + .prev_pos = (uint32_t)(-5), + }; + + return x86_code(&simple, start_offset, false, buf, size); +} #endif diff --git a/src/lzmainfo/lzmainfo.c b/src/lzmainfo/lzmainfo.c index d917f371c3ba..0b0b0d3d09a4 100644 --- a/src/lzmainfo/lzmainfo.c +++ b/src/lzmainfo/lzmainfo.c @@ -17,6 +17,8 @@ #include "getopt.h" #include "tuklib_gettext.h" #include "tuklib_progname.h" +#include "tuklib_mbstr_nonprint.h" +#include "tuklib_mbstr_wrap.h" #include "tuklib_exit.h" #ifdef TUKLIB_DOSLIKE @@ -29,17 +31,36 @@ 
tuklib_attr_noreturn static void help(void) { - printf( -_("Usage: %s [--help] [--version] [FILE]...\n" -"Show information stored in the .lzma file header"), progname); + // A few languages use so long strings that we need automatic + // wrapping. A few strings are the same as in xz/message.c and + // should be kept in sync. + static const struct tuklib_wrap_opt wrap0 = { 0, 0, 0, 0, 79 }; + int e = 0; - printf(_( -"\nWith no FILE, or when FILE is -, read standard input.\n")); - printf("\n"); + printf(_("Usage: %s [--help] [--version] [FILE]...\n"), progname); - printf(_("Report bugs to <%s> (in English or Finnish).\n"), + e |= tuklib_wraps(stdout, &wrap0, + W_("Show information stored in the .lzma file header.")); + e |= tuklib_wraps(stdout, &wrap0, + W_("With no FILE, or when FILE is -, read standard input.")); + + putchar('\n'); + + e |= tuklib_wrapf(stdout, &wrap0, + W_("Report bugs to <%s> (in English or Finnish)."), PACKAGE_BUGREPORT); - printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL); + + e |= tuklib_wrapf(stdout, &wrap0, + W_("%s home page: <%s>"), PACKAGE_NAME, PACKAGE_URL); + + if (e != 0) { + // Avoid new translatable strings by printing the message + // in pieces. + fprintf(stderr, _("%s: "), progname); + fprintf(stderr, _("Error printing the help text " + "(error code %d)"), e); + fprintf(stderr, "\n"); + } tuklib_exit(EXIT_SUCCESS, EXIT_FAILURE, true); } @@ -104,7 +125,8 @@ lzmainfo(const char *name, FILE *f) uint8_t buf[13]; const size_t size = fread(buf, 1, sizeof(buf), f); if (size != 13) { - fprintf(stderr, "%s: %s: %s\n", progname, name, + fprintf(stderr, "%s: %s: %s\n", progname, + tuklib_mask_nonprint(name), ferror(f) ? 
strerror(errno) : _("File is too small to be a .lzma file")); return true; @@ -118,7 +140,8 @@ lzmainfo(const char *name, FILE *f) break; case LZMA_OPTIONS_ERROR: - fprintf(stderr, "%s: %s: %s\n", progname, name, + fprintf(stderr, "%s: %s: %s\n", progname, + tuklib_mask_nonprint(name), _("Not a .lzma file")); return true; @@ -142,7 +165,7 @@ lzmainfo(const char *name, FILE *f) // this output and we don't want to break that when people move // from LZMA Utils to XZ Utils. if (f != stdin) - printf("%s\n", name); + printf("%s\n", tuklib_mask_nonprint(name)); printf("Uncompressed size: "); if (uncompressed_size == UINT64_MAX) @@ -200,9 +223,10 @@ main(int argc, char **argv) if (f == NULL) { ret = EXIT_FAILURE; fprintf(stderr, "%s: %s: %s\n", - progname, - argv[optind], - strerror(errno)); + progname, + tuklib_mask_nonprint( + argv[optind]), + strerror(errno)); continue; } diff --git a/src/xz/args.c b/src/xz/args.c index b3743ceaf205..8043c98e21c1 100644 --- a/src/xz/args.c +++ b/src/xz/args.c @@ -21,6 +21,7 @@ bool opt_stdout = false; bool opt_force = false; bool opt_keep_original = false; +bool opt_synchronous = true; bool opt_robot = false; bool opt_ignore_check = false; @@ -217,6 +218,7 @@ parse_real(args_info *args, int argc, char **argv) OPT_LZMA1, OPT_LZMA2, + OPT_NO_SYNC, OPT_SINGLE_STREAM, OPT_NO_SPARSE, OPT_FILES, @@ -249,6 +251,7 @@ parse_real(args_info *args, int argc, char **argv) { "force", no_argument, NULL, 'f' }, { "stdout", no_argument, NULL, 'c' }, { "to-stdout", no_argument, NULL, 'c' }, + { "no-sync", no_argument, NULL, OPT_NO_SYNC }, { "single-stream", no_argument, NULL, OPT_SINGLE_STREAM }, { "no-sparse", no_argument, NULL, OPT_NO_SPARSE }, { "suffix", required_argument, NULL, 'S' }, @@ -275,17 +278,17 @@ parse_real(args_info *args, int argc, char **argv) { "best", no_argument, NULL, '9' }, // Filters - { "filters", optional_argument, NULL, OPT_FILTERS}, - { "filters1", optional_argument, NULL, OPT_FILTERS1}, - { "filters2", optional_argument, 
NULL, OPT_FILTERS2}, - { "filters3", optional_argument, NULL, OPT_FILTERS3}, - { "filters4", optional_argument, NULL, OPT_FILTERS4}, - { "filters5", optional_argument, NULL, OPT_FILTERS5}, - { "filters6", optional_argument, NULL, OPT_FILTERS6}, - { "filters7", optional_argument, NULL, OPT_FILTERS7}, - { "filters8", optional_argument, NULL, OPT_FILTERS8}, - { "filters9", optional_argument, NULL, OPT_FILTERS9}, - { "filters-help", optional_argument, NULL, OPT_FILTERS_HELP}, + { "filters", required_argument, NULL, OPT_FILTERS}, + { "filters1", required_argument, NULL, OPT_FILTERS1}, + { "filters2", required_argument, NULL, OPT_FILTERS2}, + { "filters3", required_argument, NULL, OPT_FILTERS3}, + { "filters4", required_argument, NULL, OPT_FILTERS4}, + { "filters5", required_argument, NULL, OPT_FILTERS5}, + { "filters6", required_argument, NULL, OPT_FILTERS6}, + { "filters7", required_argument, NULL, OPT_FILTERS7}, + { "filters8", required_argument, NULL, OPT_FILTERS8}, + { "filters9", required_argument, NULL, OPT_FILTERS9}, + { "filters-help", no_argument, NULL, OPT_FILTERS_HELP}, { "lzma1", optional_argument, NULL, OPT_LZMA1 }, { "lzma2", optional_argument, NULL, OPT_LZMA2 }, @@ -612,6 +615,9 @@ parse_real(args_info *args, int argc, char **argv) case OPT_SINGLE_STREAM: opt_single_stream = true; + + // Since 5.7.1alpha --single-stream implies --keep. + opt_keep_original = true; break; case OPT_NO_SPARSE: @@ -621,7 +627,7 @@ parse_real(args_info *args, int argc, char **argv) case OPT_FILES: args->files_delim = '\n'; - // Fall through + FALLTHROUGH; case OPT_FILES0: if (args->files_name != NULL) @@ -655,6 +661,10 @@ parse_real(args_info *args, int argc, char **argv) optarg, 0, UINT64_MAX); break; + case OPT_NO_SYNC: + opt_synchronous = false; + break; + default: message_try_help(); tuklib_exit(E_ERROR, E_ERROR, false); @@ -823,6 +833,13 @@ args_parse(args_info *args, int argc, char **argv) opt_stdout = true; } + // Don't use fsync() if --keep is specified or implied. 
+	// However, don't document this as "--keep implies --no-sync"
+	// because if syncing support was added to --flush-timeout,
+	// it would sync even if --keep was specified.
+	if (opt_keep_original)
+		opt_synchronous = false;
+
 	// When compressing, if no --format flag was used, or it
 	// was --format=auto, we compress to the .xz format.
 	if (opt_mode == MODE_COMPRESS && opt_format == FORMAT_AUTO)
diff --git a/src/xz/args.h b/src/xz/args.h
index e693ecd62280..7fdf37f1420f 100644
--- a/src/xz/args.h
+++ b/src/xz/args.h
@@ -34,7 +34,7 @@ typedef struct {
 extern bool opt_stdout;
 extern bool opt_force;
 extern bool opt_keep_original;
-// extern bool opt_recursive;
+extern bool opt_synchronous;
 extern bool opt_robot;
 extern bool opt_ignore_check;
diff --git a/src/xz/coder.c b/src/xz/coder.c
index 5e41f0df6802..c28f874a25f7 100644
--- a/src/xz/coder.c
+++ b/src/xz/coder.c
@@ -168,16 +168,13 @@ str_to_filters(const char *str, uint32_t index, uint32_t flags)
 		if (index > 0)
 			filter_num[0] = '0' + index;

-		// FIXME? The message in err isn't translated.
-		// Including the translations in the xz translations is
-		// slightly ugly but possible. Creating a new domain for
-		// liblzma might not be worth it especially since on some
-		// OSes it adds extra dependencies to translation libraries.
+		// liblzma doesn't translate the error messages but
+		// the messages are included in xz's translations.
 		message(V_ERROR, _("Error in --filters%s=FILTERS option:"),
 				filter_num);
 		message(V_ERROR, "%s", str);
 		message(V_ERROR, "%*s^", error_pos, "");
-		message_fatal("%s", err);
+		message_fatal("%s", _(err));
 	}
 }

@@ -1003,8 +1000,9 @@ coder_init(file_pair *pair)
 		strm.avail_out = 0;
 		while ((ret = lzma_code(&strm, LZMA_RUN))
 				== LZMA_UNSUPPORTED_CHECK)
-			message_warning(_("%s: %s"), pair->src_name,
-					message_strm(ret));
+			message_warning(_("%s: %s"),
+				tuklib_mask_nonprint(pair->src_name),
+				message_strm(ret));

 		// With --single-stream lzma_code won't wait for
 		// LZMA_FINISH and thus it can return LZMA_STREAM_END
@@ -1019,7 +1017,9 @@ coder_init(file_pair *pair)
 	}

 	if (ret != LZMA_OK) {
-		message_error(_("%s: %s"), pair->src_name, message_strm(ret));
+		message_error(_("%s: %s"),
+				tuklib_mask_nonprint(pair->src_name),
+				message_strm(ret));
 		if (ret == LZMA_MEMLIMIT_ERROR)
 			message_mem_needed(V_ERROR, lzma_memusage(&strm));

@@ -1320,11 +1320,13 @@ coder_normal(file_pair *pair)
 			// wrong and we print an error. Otherwise it's just
 			// a warning and coding can continue.
 			if (stop) {
-				message_error(_("%s: %s"), pair->src_name,
-						message_strm(ret));
+				message_error(_("%s: %s"),
+					tuklib_mask_nonprint(pair->src_name),
+					message_strm(ret));
 			} else {
-				message_warning(_("%s: %s"), pair->src_name,
-						message_strm(ret));
+				message_warning(_("%s: %s"),
+					tuklib_mask_nonprint(pair->src_name),
+					message_strm(ret));

 				// When compressing, all possible errors set
 				// stop to true.
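
The hunks above pass every filename through tuklib_mask_nonprint() before it is printed. The diff shown here does not include that function's body, so the following is only a minimal sketch of the general idea, not the tuklib implementation. The helper name mask_nonprint_sketch(), the caller-supplied buffer, and the choice of '?' as the replacement character are all assumptions made for illustration; the real function may handle multibyte characters and memory differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical helper (NOT the tuklib implementation): copy src into dst,
// replacing every byte that isprint() rejects with '?'.
// At most dst_size - 1 bytes are copied and dst is always terminated.
// Returns dst so the call can be used directly as a printf() argument.
const char *
mask_nonprint_sketch(char *dst, size_t dst_size, const char *src)
{
	assert(dst != NULL && src != NULL && dst_size > 0);

	size_t i = 0;
	for (; src[i] != '\0' && i + 1 < dst_size; ++i) {
		// Cast to unsigned char: passing a negative char value
		// to isprint() is undefined behavior.
		const unsigned char c = (unsigned char)src[i];
		dst[i] = isprint(c) ? (char)c : '?';
	}

	dst[i] = '\0';
	return dst;
}
```

A caller would use it much like the calls in the diff, for example printf("%s\n", mask_nonprint_sketch(buf, sizeof(buf), name)), so that a filename containing a newline or an escape sequence cannot alter the terminal state or inject extra lines into the output.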
diff --git a/src/xz/file_io.c b/src/xz/file_io.c
index 678a9a5ca860..8c83269b13fa 100644
--- a/src/xz/file_io.c
+++ b/src/xz/file_io.c
@@ -17,6 +17,7 @@
 #	include <io.h>
 #else
 #	include <poll.h>
+#	include <libgen.h>
 static bool warn_fchown;
 #endif

@@ -56,6 +57,10 @@ static bool warn_fchown;
 #	define S_ISREG(m) (((m) & _S_IFMT) == _S_IFREG)
 #endif

+#if defined(_WIN32) && !defined(__CYGWIN__)
+#	define fsync _commit
+#endif
+
 #ifndef O_BINARY
 #	define O_BINARY 0
 #endif
@@ -64,6 +69,25 @@ static bool warn_fchown;
 #	define O_NOCTTY 0
 #endif

+// In musl 1.2.5, O_SEARCH is defined to O_PATH. As of Linux 6.12,
+// a file descriptor from open("dir", O_SEARCH | O_DIRECTORY) cannot be
+// used with fsync() (fails with EBADF). musl 1.2.5 doesn't emulate it
+// using /proc/self/fd. Even if it did, it might need to do it with
+// fd = open("/proc/...", O_RDONLY); fsync(fd); which fails if the
+// directory lacks read permission. Since we need a working fsync(),
+// O_RDONLY imitates O_SEARCH better than O_PATH.
+#if defined(O_SEARCH) && defined(O_PATH) && O_SEARCH == O_PATH
+#	undef O_SEARCH
+#endif
+
+#ifndef O_SEARCH
+#	define O_SEARCH O_RDONLY
+#endif
+
+#ifndef O_DIRECTORY
+#	define O_DIRECTORY 0
+#endif
+
 // Using this macro to silence a warning from gcc -Wlogical-op.
 #if EAGAIN == EWOULDBLOCK
 #	define IS_EAGAIN_OR_EWOULDBLOCK(e) ((e) == EAGAIN)
@@ -205,8 +229,9 @@ io_wait(file_pair *pair, int timeout, bool is_reading)
 				continue;

 			message_error(_("%s: poll() failed: %s"),
-					is_reading ? pair->src_name
-						: pair->dest_name,
+					tuklib_mask_nonprint(is_reading
+						? pair->src_name
+						: pair->dest_name),
 					strerror(errno));
 			return IO_WAIT_ERROR;
 		}
@@ -272,14 +297,15 @@ io_unlink(const char *name, const struct stat *known_st)
 		// of the original file, and in that case it obviously
 		// shouldn't be removed.
 		message_warning(_("%s: File seems to have been moved, "
-				"not removing"), name);
+				"not removing"), tuklib_mask_nonprint(name));
 	else
 #endif
 		// There's a race condition between lstat() and unlink()
 		// but at least we have tried to avoid removing wrong file.
 		if (unlink(name))
 			message_warning(_("%s: Cannot remove: %s"),
-					name, strerror(errno));
+					tuklib_mask_nonprint(name),
+					strerror(errno));

 	return;
 }
@@ -305,7 +331,8 @@ io_copy_attrs(const file_pair *pair)
 	if (fchown(pair->dest_fd, pair->src_st.st_uid, (gid_t)(-1))
 			&& warn_fchown)
 		message_warning(_("%s: Cannot set the file owner: %s"),
-				pair->dest_name, strerror(errno));
+				tuklib_mask_nonprint(pair->dest_name),
+				strerror(errno));

 	mode_t mode;

@@ -318,7 +345,8 @@ io_copy_attrs(const file_pair *pair)
 			&& fchown(pair->dest_fd, (uid_t)(-1),
 				pair->src_st.st_gid)) {
 		message_warning(_("%s: Cannot set the file group: %s"),
-				pair->dest_name, strerror(errno));
+				tuklib_mask_nonprint(pair->dest_name),
+				strerror(errno));
 		// We can still safely copy some additional permissions:
 		// 'group' must be at least as strict as 'other' and
 		// also vice versa.
@@ -337,7 +365,8 @@ io_copy_attrs(const file_pair *pair)

 	if (fchmod(pair->dest_fd, mode))
 		message_warning(_("%s: Cannot set the file permissions: %s"),
-				pair->dest_name, strerror(errno));
+				tuklib_mask_nonprint(pair->dest_name),
+				strerror(errno));
 #endif

 	// Copy the timestamps. We have several possible ways to do this, of
@@ -445,6 +474,39 @@ io_copy_attrs(const file_pair *pair)
 }


+/// \brief Synchronizes the destination file to permanent storage
+///
+/// \param pair File pair having the destination file open for writing
+///
+/// \return On success, false is returned. On error, error message
+/// is printed and true is returned.
+static bool
+io_sync_dest(file_pair *pair)
+{
+	assert(pair->dest_fd != -1);
+	assert(pair->dest_fd != STDOUT_FILENO);
+
+	if (fsync(pair->dest_fd)) {
+		message_error(_("%s: Synchronizing the file failed: %s"),
+				tuklib_mask_nonprint(pair->dest_name),
+				strerror(errno));
+		return true;
+	}
+
+#ifndef TUKLIB_DOSLIKE
+	if (fsync(pair->dir_fd)) {
+		message_error(_("%s: Synchronizing the directory of "
+				"the file failed: %s"),
+				tuklib_mask_nonprint(pair->dest_name),
+				strerror(errno));
+		return true;
+	}
+#endif
+
+	return false;
+}
+
+
 /// Opens the source file. Returns false on success, true on error.
 static bool
 io_open_src_real(file_pair *pair)
@@ -515,13 +577,15 @@ io_open_src_real(file_pair *pair)
 	if (!follow_symlinks) {
 		struct stat st;
 		if (lstat(pair->src_name, &st)) {
-			message_error(_("%s: %s"), pair->src_name,
+			message_error(_("%s: %s"),
+					tuklib_mask_nonprint(pair->src_name),
 					strerror(errno));
 			return true;

 		} else if (S_ISLNK(st.st_mode)) {
 			message_warning(_("%s: Is a symbolic link, "
-					"skipping"), pair->src_name);
+					"skipping"),
+					tuklib_mask_nonprint(pair->src_name));
 			return true;
 		}
 	}
@@ -583,13 +647,15 @@ io_open_src_real(file_pair *pair)

 			if (was_symlink)
 				message_warning(_("%s: Is a symbolic link, "
-						"skipping"), pair->src_name);
+						"skipping"),
+						tuklib_mask_nonprint(pair->src_name));
 			else
 #endif
 				// Something else than O_NOFOLLOW failing
 				// (assuming that the race conditions didn't
 				// confuse us).
-				message_error(_("%s: %s"), pair->src_name,
+				message_error(_("%s: %s"),
+						tuklib_mask_nonprint(pair->src_name),
 						strerror(errno));

 			return true;
@@ -612,13 +678,13 @@ io_open_src_real(file_pair *pair)

 	if (S_ISDIR(pair->src_st.st_mode)) {
 		message_warning(_("%s: Is a directory, skipping"),
-				pair->src_name);
+				tuklib_mask_nonprint(pair->src_name));
 		goto error;
 	}

 	if (reg_files_only && !S_ISREG(pair->src_st.st_mode)) {
 		message_warning(_("%s: Not a regular file, skipping"),
-				pair->src_name);
+				tuklib_mask_nonprint(pair->src_name));
 		goto error;
 	}

@@ -636,21 +702,21 @@ io_open_src_real(file_pair *pair)
 				// explicitly in io_copy_attr().
 				message_warning(_("%s: File has setuid or "
 						"setgid bit set, skipping"),
-						pair->src_name);
+						tuklib_mask_nonprint(pair->src_name));
 				goto error;
 			}

 			if (pair->src_st.st_mode & S_ISVTX) {
 				message_warning(_("%s: File has sticky bit "
 						"set, skipping"),
-						pair->src_name);
+						tuklib_mask_nonprint(pair->src_name));
 				goto error;
 			}

 			if (pair->src_st.st_nlink > 1) {
 				message_warning(_("%s: Input file has more "
-						"than one hard link, "
-						"skipping"), pair->src_name);
+						"than one hard link, skipping"),
+						tuklib_mask_nonprint(pair->src_name));
 				goto error;
 			}
 		}
@@ -679,7 +745,8 @@ io_open_src_real(file_pair *pair)
 	return false;

 error_msg:
-	message_error(_("%s: %s"), pair->src_name, strerror(errno));
+	message_error(_("%s: %s"), tuklib_mask_nonprint(pair->src_name),
+			strerror(errno));
 error:
 	(void)close(pair->src_fd);
 	return true;
@@ -707,6 +774,9 @@ io_open_src(const char *src_name)
 		.dest_name = NULL,
 		.src_fd = -1,
 		.dest_fd = -1,
+#ifndef TUKLIB_DOSLIKE
+		.dir_fd = -1,
+#endif
 		.src_eof = false,
 		.src_has_seen_input = false,
 		.flush_needed = false,
@@ -809,6 +879,56 @@ io_open_dest_real(file_pair *pair)
 		if (pair->dest_name == NULL)
 			return true;

+#ifndef TUKLIB_DOSLIKE
+		if (opt_synchronous) {
+			// Open the directory where the destination file will
+			// be created (the file descriptor is needed for
+			// fsync()). Do this before creating the destination
+			// file:
+			//
+			// - We currently have no files to clean up if
+			//   opening the directory fails. (We aren't
+			//   reading from stdin so there are no stdin_flags
+			//   to restore either.)
+			//
+			// - Allocating memory with xstrdup() is safe only
+			//   when we have nothing to clean up.
+			char *buf = xstrdup(pair->dest_name);
+			const char *dir_name = dirname(buf);
+
+			// O_NOCTTY and O_NONBLOCK are there in case
+			// O_DIRECTORY is 0 and dir_name doesn't refer
+			// to a directory. (We opened the source file
+			// already but directories might have been renamed
+			// after the source file was opened.)
+			pair->dir_fd = open(dir_name, O_SEARCH | O_DIRECTORY
+					| O_NOCTTY | O_NONBLOCK);
+			if (pair->dir_fd == -1) {
+				// Since we did open the source file
+				// successfully, we should rarely get here.
+				// Perhaps something has been renamed or
+				// had its permissions changed.
+				//
+				// In an odd case, the directory has write
+				// and search permissions but not read
+				// permission (d-wx------), and O_SEARCH is
+				// actually O_RDONLY. Then we would be able
+				// to create a new file and only the directory
+				// syncing would be impossible. But let's be
+				// strict about syncing and require users to
+				// explicitly disable it if they don't want it.
+				message_error(_("%s: Opening the directory "
+						"failed: %s"),
+						tuklib_mask_nonprint(dir_name),
+						strerror(errno));
+				free(buf);
+				goto error;
+			}
+
+			free(buf);
+		}
+#endif
+
 #ifdef __DJGPP__
 		struct stat st;
 		if (stat(pair->dest_name, &st) == 0) {
@@ -816,9 +936,9 @@ io_open_dest_real(file_pair *pair)
 			if (st.st_dev == -1) {
 				message_error("%s: Refusing to write to "
 						"a DOS special file",
-						pair->dest_name);
-				free(pair->dest_name);
-				return true;
+						tuklib_mask_nonprint(
+							pair->dest_name));
+				goto error;
 			}

 			// Check that we aren't overwriting the source file.
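
The file_io.c hunks above add two cooperating pieces: io_open_dest_real() opens the destination's directory (found with dirname()) so that io_sync_dest() can later fsync() first the file and then the directory. The sequence can be sketched in isolation as below. This is a simplified illustration for POSIX systems, not the xz code: sync_file_and_dir_sketch() is a hypothetical name, the directory is opened after the file exists rather than before, and plain O_RDONLY is used where xz uses its O_SEARCH fallback.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: flush an open, already-written file (fd) and then
// the directory that contains it (derived from path).
// Returns 0 on success, -1 on error with errno set by the failing call.
int
sync_file_and_dir_sketch(int fd, const char *path)
{
	assert(fd != -1 && path != NULL);

	// 1. Flush the file's own data and metadata.
	if (fsync(fd) != 0)
		return -1;

	// 2. dirname() may modify its argument, so work on a copy.
	char *copy = strdup(path);
	if (copy == NULL)
		return -1;

	// O_DIRECTORY makes open() fail if the name is not a directory.
	const int dir_fd = open(dirname(copy), O_RDONLY | O_DIRECTORY);
	const int open_errno = errno;
	free(copy);
	if (dir_fd == -1) {
		errno = open_errno;
		return -1;
	}

	// 3. Flush the directory so the new directory entry is durable too.
	if (fsync(dir_fd) != 0) {
		const int sync_errno = errno;
		(void)close(dir_fd);
		errno = sync_errno;
		return -1;
	}

	(void)close(dir_fd);
	return 0;
}
```

Syncing only the file would leave a window in which the data is on disk but the directory entry naming it is not. That is presumably why the patch treats a directory that cannot be opened as a hard error while syncing is enabled, leaving --no-sync as the explicit opt-out.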
@@ -826,9 +946,9 @@ io_open_dest_real(file_pair *pair) && st.st_ino == pair->src_st.st_ino) { message_error("%s: Output file is the same " "as the input file", - pair->dest_name); - free(pair->dest_name); - return true; + tuklib_mask_nonprint( + pair->dest_name)); + goto error; } } #endif @@ -836,9 +956,9 @@ io_open_dest_real(file_pair *pair) // If --force was used, unlink the target file first. if (opt_force && unlink(pair->dest_name) && errno != ENOENT) { message_error(_("%s: Cannot remove: %s"), - pair->dest_name, strerror(errno)); - free(pair->dest_name); - return true; + tuklib_mask_nonprint(pair->dest_name), + strerror(errno)); + goto error; } // Open the file. @@ -851,11 +971,15 @@ io_open_dest_real(file_pair *pair) pair->dest_fd = open(pair->dest_name, flags, mode); if (pair->dest_fd == -1) { - message_error(_("%s: %s"), pair->dest_name, + message_error(_("%s: %s"), + tuklib_mask_nonprint(pair->dest_name), strerror(errno)); - free(pair->dest_name); - return true; + goto error; } + + // We could sync dir_fd now and close it. However, performance + // can be better if this is delayed until dest_fd has been + // synced in io_sync_dest(). } if (fstat(pair->dest_fd, &pair->dest_st)) { @@ -881,15 +1005,13 @@ io_open_dest_real(file_pair *pair) // With fstat()/_fstat64() it works. else if (pair->dest_fd != STDOUT_FILENO && !S_ISREG(pair->dest_st.st_mode)) { - message_error("%s: Destination is not a regular file", - pair->dest_name); + message_error(_("%s: Destination is not a regular file"), + tuklib_mask_nonprint(pair->dest_name)); // dest_fd needs to be reset to -1 to keep io_close() working. (void)close(pair->dest_fd); pair->dest_fd = -1; - - free(pair->dest_name); - return true; + goto error; } #elif !defined(TUKLIB_DOSLIKE) else if (try_sparse && opt_mode == MODE_DECOMPRESS) { @@ -961,6 +1083,18 @@ io_open_dest_real(file_pair *pair) #endif return false; + +error: +#ifndef TUKLIB_DOSLIKE + // io_close() closes pair->dir_fd but let's do it here anyway. 
+ if (pair->dir_fd != -1) { + (void)close(pair->dir_fd); + pair->dir_fd = -1; + } +#endif + + free(pair->dest_name); + return true; } @@ -979,8 +1113,8 @@ io_open_dest(file_pair *pair) /// \param pair File whose dest_fd should be closed /// \param success If false, the file will be removed from the disk. /// -/// \return Zero if closing succeeds. On error, -1 is returned and -/// error message printed. +/// \return If closing succeeds, false is returned. On error, an error +/// message is printed and true is returned. static bool io_close_dest(file_pair *pair, bool success) { @@ -1003,9 +1137,17 @@ io_close_dest(file_pair *pair, bool success) if (pair->dest_fd == -1 || pair->dest_fd == STDOUT_FILENO) return false; +#ifndef TUKLIB_DOSLIKE + // dir_fd was only used for syncing the directory. + // Error checking was done when syncing. + if (pair->dir_fd != -1) + (void)close(pair->dir_fd); +#endif + if (close(pair->dest_fd)) { message_error(_("%s: Closing the file failed: %s"), - pair->dest_name, strerror(errno)); + tuklib_mask_nonprint(pair->dest_name), + strerror(errno)); // Closing destination file failed, so we cannot trust its // contents. Get rid of junk: @@ -1042,7 +1184,8 @@ io_close(file_pair *pair, bool success) SEEK_CUR) == -1) { message_error(_("%s: Seeking failed when trying " "to create a sparse file: %s"), - pair->dest_name, strerror(errno)); + tuklib_mask_nonprint(pair->dest_name), + strerror(errno)); success = false; } else { const uint8_t zero[1] = { '\0' }; @@ -1053,11 +1196,16 @@ io_close(file_pair *pair, bool success) signals_block(); - // Copy the file attributes. We need to skip this if destination - // file isn't open or it is standard output. - if (success && pair->dest_fd != -1 && pair->dest_fd != STDOUT_FILENO) + if (success && pair->dest_fd != -1 && pair->dest_fd != STDOUT_FILENO) { + // Copy the file attributes. This may produce warnings but + // not errors so "success" isn't affected. 
io_copy_attrs(pair); + // Synchronize the file and its directory if needed. + if (opt_synchronous) + success = !io_sync_dest(pair); + } + // Close the destination first. If it fails, we must not remove // the source file! if (io_close_dest(pair, success)) @@ -1141,7 +1289,8 @@ io_read(file_pair *pair, io_buf *buf, size_t size) #endif message_error(_("%s: Read error: %s"), - pair->src_name, strerror(errno)); + tuklib_mask_nonprint(pair->src_name), + strerror(errno)); return SIZE_MAX; } @@ -1171,7 +1320,8 @@ io_seek_src(file_pair *pair, uint64_t pos) if (lseek(pair->src_fd, (off_t)(pos), SEEK_SET) == -1) { message_error(_("%s: Error seeking the file: %s"), - pair->src_name, strerror(errno)); + tuklib_mask_nonprint(pair->src_name), + strerror(errno)); return true; } @@ -1195,7 +1345,7 @@ io_pread(file_pair *pair, io_buf *buf, size_t size, uint64_t pos) if (amount != size) { message_error(_("%s: Unexpected end of file"), - pair->src_name); + tuklib_mask_nonprint(pair->src_name)); return true; } @@ -1240,6 +1390,19 @@ io_write_buf(file_pair *pair, const uint8_t *buf, size_t size) } #endif +#if defined(_WIN32) && !defined(__CYGWIN__) + // On native Windows, broken pipe is reported as + // EINVAL. Don't show an error message in this case. + // Try: xz -dc bigfile.xz | head -n1 + if (errno == EINVAL + && pair->dest_fd == STDOUT_FILENO) { + // Emulate SIGPIPE by setting user_abort here. + user_abort = true; + set_exit_status(E_ERROR); + return true; + } +#endif + // Handle broken pipe specially. gzip and bzip2 // don't print anything on SIGPIPE. In addition, // gzip --quiet uses exit status 2 (warning) on @@ -1254,7 +1417,8 @@ io_write_buf(file_pair *pair, const uint8_t *buf, size_t size) // user_abort, and get EPIPE here. 
if (errno != EPIPE) message_error(_("%s: Write error: %s"), - pair->dest_name, strerror(errno)); + tuklib_mask_nonprint(pair->dest_name), + strerror(errno)); return true; } @@ -1304,7 +1468,9 @@ io_write(file_pair *pair, const io_buf *buf, size_t size) SEEK_CUR) == -1) { message_error(_("%s: Seeking failed when " "trying to create a sparse " - "file: %s"), pair->dest_name, + "file: %s"), + tuklib_mask_nonprint( + pair->dest_name), strerror(errno)); return true; } diff --git a/src/xz/file_io.h b/src/xz/file_io.h index ae7e2f38f520..9903f5a0adf8 100644 --- a/src/xz/file_io.h +++ b/src/xz/file_io.h @@ -55,6 +55,12 @@ typedef struct { /// File descriptor of the target file int dest_fd; +#ifndef TUKLIB_DOSLIKE + /// File descriptor of the directory of the target file (which is + /// also the directory of the source file) + int dir_fd; +#endif + /// True once end of the source file has been detected. bool src_eof; @@ -177,6 +183,6 @@ extern bool io_pread(file_pair *pair, io_buf *buf, size_t size, uint64_t pos); /// \param buf Buffer containing the data to be written /// \param size Size of the buffer; must be at most IO_BUFFER_SIZE /// -/// \return On success, zero is returned. On error, -1 is returned -/// and error message printed. +/// \return On success, false is returned. On error, error message +/// is printed and true is returned. 
extern bool io_write(file_pair *pair, const io_buf *buf, size_t size); diff --git a/src/xz/list.c b/src/xz/list.c index e4a64668c76e..6a71d01e437e 100644 --- a/src/xz/list.c +++ b/src/xz/list.c @@ -347,13 +347,14 @@ static bool parse_indexes(xz_file_info *xfi, file_pair *pair) { if (pair->src_st.st_size <= 0) { - message_error(_("%s: File is empty"), pair->src_name); + message_error(_("%s: File is empty"), + tuklib_mask_nonprint(pair->src_name)); return true; } if (pair->src_st.st_size < 2 * LZMA_STREAM_HEADER_SIZE) { message_error(_("%s: Too small to be a valid .xz file"), - pair->src_name); + tuklib_mask_nonprint(pair->src_name)); return true; } @@ -365,7 +366,9 @@ parse_indexes(xz_file_info *xfi, file_pair *pair) hardware_memlimit_get(MODE_LIST), (uint64_t)(pair->src_st.st_size)); if (ret != LZMA_OK) { - message_error(_("%s: %s"), pair->src_name, message_strm(ret)); + message_error(_("%s: %s"), + tuklib_mask_nonprint(pair->src_name), + message_strm(ret)); return true; } @@ -411,7 +414,8 @@ parse_indexes(xz_file_info *xfi, file_pair *pair) } default: - message_error(_("%s: %s"), pair->src_name, + message_error(_("%s: %s"), + tuklib_mask_nonprint(pair->src_name), message_strm(ret)); // If the error was too low memory usage limit, @@ -473,7 +477,8 @@ parse_block_header(file_pair *pair, const lzma_index_iter *iter, break; case LZMA_OPTIONS_ERROR: - message_error(_("%s: %s"), pair->src_name, + message_error(_("%s: %s"), + tuklib_mask_nonprint(pair->src_name), message_strm(LZMA_OPTIONS_ERROR)); return true; @@ -520,8 +525,7 @@ parse_block_header(file_pair *pair, const lzma_index_iter *iter, // If the above fails, the file is corrupt so // LZMA_DATA_ERROR is a good error code. - - // Fall through + FALLTHROUGH; case LZMA_DATA_ERROR: // Free the memory allocated by lzma_block_header_decode(). @@ -587,7 +591,8 @@ parse_block_header(file_pair *pair, const lzma_index_iter *iter, // Check if the stringification succeeded. 
if (str_ret != LZMA_OK) { - message_error(_("%s: %s"), pair->src_name, + message_error(_("%s: %s"), + tuklib_mask_nonprint(pair->src_name), message_strm(str_ret)); return true; } @@ -596,7 +601,8 @@ parse_block_header(file_pair *pair, const lzma_index_iter *iter, data_error: // Show the error message. - message_error(_("%s: %s"), pair->src_name, + message_error(_("%s: %s"), + tuklib_mask_nonprint(pair->src_name), message_strm(LZMA_DATA_ERROR)); return true; } @@ -744,7 +750,7 @@ print_info_basic(const xz_file_info *xfi, file_pair *pair) char checks[CHECKS_STR_SIZE]; get_check_names(checks, lzma_index_checks(xfi->idx), false); - const char *cols[7] = { + const char *cols[6] = { uint64_to_str(lzma_index_stream_count(xfi->idx), 0), uint64_to_str(lzma_index_block_count(xfi->idx), 1), uint64_to_nicestr(lzma_index_file_size(xfi->idx), @@ -754,7 +760,6 @@ print_info_basic(const xz_file_info *xfi, file_pair *pair) get_ratio(lzma_index_file_size(xfi->idx), lzma_index_uncompressed_size(xfi->idx)), checks, - pair->src_name, }; printf("%*s %*s %*s %*s %*s %-*s %s\n", tuklib_mbstr_fw(cols[0], 5), cols[0], @@ -763,7 +768,7 @@ print_info_basic(const xz_file_info *xfi, file_pair *pair) tuklib_mbstr_fw(cols[3], 11), cols[3], tuklib_mbstr_fw(cols[4], 5), cols[4], tuklib_mbstr_fw(cols[5], 7), cols[5], - cols[6]); + tuklib_mask_nonprint(pair->src_name)); return false; } @@ -1034,7 +1039,7 @@ print_info_adv(xz_file_info *xfi, file_pair *pair) printf(" %-*s %s\n", COLON_STR(COLON_STR_SIZES_IN_HEADERS), xfi->all_have_sizes ? 
_("Yes") : _("No")); //printf(" %-*s %s\n", COLON_STR(COLON_STR_MINIMUM_XZ_VERSION), - printf(_(" Minimum XZ Utils version: %s\n"), + printf(" %s %s\n", _("Minimum XZ Utils version:"), xz_ver_to_str(xfi->min_version)); } @@ -1048,7 +1053,11 @@ print_info_robot(xz_file_info *xfi, file_pair *pair) char checks[CHECKS_STR_SIZE]; get_check_names(checks, lzma_index_checks(xfi->idx), false); - printf("name\t%s\n", pair->src_name); + // Robot mode has to mask at least some control chars to prevent + // the output from getting out of sync if filename is malicious. + // Masking all non-printable chars is more than we need but + // perhaps this is good enough in practice. + printf("name\t%s\n", tuklib_mask_nonprint(pair->src_name)); printf("file\t%" PRIu64 "\t%" PRIu64 "\t%" PRIu64 "\t%" PRIu64 "\t%s\t%s\t%" PRIu64 "\n", @@ -1219,7 +1228,7 @@ print_totals_adv(void) printf(" %-*s %s\n", COLON_STR(COLON_STR_SIZES_IN_HEADERS), totals.all_have_sizes ? _("Yes") : _("No")); //printf(" %-*s %s\n", COLON_STR(COLON_STR_MINIMUM_XZ_VERSION), - printf(_(" Minimum XZ Utils version: %s\n"), + printf(" %s %s\n", _("Minimum XZ Utils version:"), xz_ver_to_str(totals.min_version)); } diff --git a/src/xz/main.c b/src/xz/main.c index 71b5ef7b7001..1b8b37881172 100644 --- a/src/xz/main.c +++ b/src/xz/main.c @@ -87,7 +87,8 @@ read_name(const args_info *args) continue; message_error(_("%s: Error reading filenames: %s"), - args->files_name, strerror(errno)); + tuklib_mask_nonprint(args->files_name), + strerror(errno)); return NULL; } @@ -95,7 +96,8 @@ read_name(const args_info *args) if (pos != 0) message_error(_("%s: Unexpected end of input " "when reading filenames"), - args->files_name); + tuklib_mask_nonprint( + args->files_name)); return NULL; } @@ -120,7 +122,9 @@ read_name(const args_info *args) message_error(_("%s: Null character found when " "reading filenames; maybe you meant " "to use '--files0' instead " - "of '--files'?"), args->files_name); + "of '--files'?"), + tuklib_mask_nonprint( + 
args->files_name)); return NULL; } diff --git a/src/xz/message.c b/src/xz/message.c index deafdb438320..7657e85648da 100644 --- a/src/xz/message.c +++ b/src/xz/message.c @@ -11,7 +11,7 @@ /////////////////////////////////////////////////////////////////////////////// #include "private.h" - +#include "tuklib_mbstr_wrap.h" #include <stdarg.h> @@ -196,10 +196,12 @@ print_filename(void) // If we don't know how many files there will be due // to usage of --files or --files0. if (files_total == 0) - fprintf(file, "%s (%u)\n", filename, + fprintf(file, "%s (%u)\n", + tuklib_mask_nonprint(filename), files_pos); else - fprintf(file, "%s (%u/%u)\n", filename, + fprintf(file, "%s (%u/%u)\n", + tuklib_mask_nonprint(filename), files_pos, files_total); signals_unblock(); @@ -648,7 +650,7 @@ progress_flush(bool finished) cols[4]); } else { // The filename is always printed. - fprintf(stderr, _("%s: "), filename); + fprintf(stderr, _("%s: "), tuklib_mask_nonprint(filename)); // Percentage is printed only if we didn't finish yet. if (!finished) { @@ -936,213 +938,360 @@ message_version(void) } -extern void -message_help(bool long_help) +static void +detect_wrapping_errors(int error_mask) { - printf(_("Usage: %s [OPTION]... [FILE]...\n" - "Compress or decompress FILEs in the .xz format.\n\n"), - progname); +#ifndef NDEBUG + // This might help in catching problematic strings in translations. + // It's a debug message so don't translate this. + if (error_mask & TUKLIB_WRAP_WARN_OVERLONG) + message_fatal("The help text contains overlong lines"); +#endif - // NOTE: The short help doesn't currently have options that - // take arguments. 
- if (long_help) - puts(_("Mandatory arguments to long options are mandatory " - "for short options too.\n")); + if (error_mask & ~TUKLIB_WRAP_WARN_OVERLONG) + message_fatal(_("Error printing the help text " + "(error code %d)"), error_mask); - if (long_help) - puts(_(" Operation mode:\n")); + return; +} - puts(_( -" -z, --compress force compression\n" -" -d, --decompress force decompression\n" -" -t, --test test compressed file integrity\n" -" -l, --list list information about .xz files")); - if (long_help) - puts(_("\n Operation modifiers:\n")); +extern void +message_help(bool long_help) +{ + static const struct tuklib_wrap_opt wrap0 = { 0, 0, 0, 0, 79 }; + static const struct tuklib_wrap_opt wrap1 = { 1, 1, 1, 1, 79 }; + static const struct tuklib_wrap_opt wrap2 = { 2, 2, 22, 22, 79 }; + static const struct tuklib_wrap_opt wrap3 = { 24, 24, 36, 36, 79 }; - puts(_( -" -k, --keep keep (don't delete) input files\n" -" -f, --force force overwrite of output file and (de)compress links\n" -" -c, --stdout write to standard output and don't delete input files")); - // NOTE: --to-stdout isn't included above because it's not - // the recommended spelling. It was copied from gzip but other - // compressors with gzip-like syntax don't support it. + // Accumulated error codes from tuklib_wraps() and tuklib_wrapf() + int e = 0; + + printf(_("Usage: %s [OPTION]... 
[FILE]...\n"), progname); + e |= tuklib_wraps(stdout, &wrap0, + W_("Compress or decompress FILEs in the .xz format.")); + putchar('\n'); + + e |= tuklib_wraps(stdout, &wrap0, + W_("Mandatory arguments to long options are " + "mandatory for short options too.")); + putchar('\n'); if (long_help) { - puts(_( -" --single-stream decompress only the first stream, and silently\n" -" ignore possible remaining input data")); - puts(_( -" --no-sparse do not create sparse files when decompressing\n" -" -S, --suffix=.SUF use the suffix '.SUF' on compressed files\n" -" --files[=FILE] read filenames to process from FILE; if FILE is\n" -" omitted, filenames are read from the standard input;\n" -" filenames must be terminated with the newline character\n" -" --files0[=FILE] like --files but use the null character as terminator")); + e |= tuklib_wraps(stdout, &wrap1, W_("Operation mode:")); + putchar('\n'); } + e |= tuklib_wrapf(stdout, &wrap2, + "-z, --compress\v%s\r" + "-d, --decompress\v%s\r" + "-t, --test\v%s\r" + "-l, --list\v%s", + W_("force compression"), + W_("force decompression"), + W_("test compressed file integrity"), + W_("list information about .xz files")); + if (long_help) { - puts(_("\n Basic file format and compression options:\n")); - puts(_( -" -F, --format=FMT file format to encode or decode; possible values are\n" -" 'auto' (default), 'xz', 'lzma', 'lzip', and 'raw'\n" -" -C, --check=CHECK integrity check type: 'none' (use with caution),\n" -" 'crc32', 'crc64' (default), or 'sha256'")); - puts(_( -" --ignore-check don't verify the integrity check when decompressing")); + putchar('\n'); + e |= tuklib_wraps(stdout, &wrap1, W_("Operation modifiers:")); + putchar('\n'); } - puts(_( -" -0 ... 
-9 compression preset; default is 6; take compressor *and*\n" -" decompressor memory usage into account before using 7-9!")); + e |= tuklib_wrapf(stdout, &wrap2, + "-k, --keep\v%s\r" + "-f, --force\v%s\r" + "-c, --stdout\v%s", + W_("keep (don't delete) input files"), + W_("force overwrite of output file and (de)compress links"), + W_("write to standard output and don't delete input files")); + // NOTE: --to-stdout isn't included above because it's not + // the recommended spelling. It was copied from gzip but other + // compressors with gzip-like syntax don't support it. - puts(_( -" -e, --extreme try to improve compression ratio by using more CPU time;\n" -" does not affect decompressor memory requirements")); + if (long_help) { + e |= tuklib_wrapf(stdout, &wrap2, + " --no-sync\v%s\r" + " --single-stream\v%s\r" + " --no-sparse\v%s\r" + "-S, --suffix=%s\v%s\r" + " --files[=%s]\v%s\r" + " --files0[=%s]\v%s\r", + W_("don't synchronize the output file to the storage " + "device before removing the input file"), + W_("decompress only the first stream, and silently " + "ignore possible remaining input data"), + W_("do not create sparse files when decompressing"), + _(".SUF"), + W_("use the suffix '.SUF' on compressed files"), + _("FILE"), + W_("read filenames to process from FILE; " + "if FILE is omitted, " + "filenames are read from the standard input; " + "filenames must be terminated with " + "the newline character"), + _("FILE"), + W_("like --files but use the null character as " + "terminator")); + + e |= tuklib_wraps(stdout, &wrap1, + W_("Basic file format and compression options:")); + + e |= tuklib_wrapf(stdout, &wrap2, + "\n" + "-F, --format=%s\v%s\r" + "-C, --check=%s\v%s\r" + " --ignore-check\v%s", + _("FORMAT"), + W_("file format to encode or decode; possible values " + "are 'auto' (default), 'xz', 'lzma', 'lzip', " + "and 'raw'"), + _("NAME"), + W_("integrity check type: 'none' (use with caution), " + "'crc32', 'crc64' (default), or 'sha256'"), + W_("don't 
verify the integrity check when " + "decompressing")); + } - puts(_( -" -T, --threads=NUM use at most NUM threads; the default is 0 which uses\n" -" as many threads as there are processor cores")); + e |= tuklib_wrapf(stdout, &wrap2, + "-0 ... -9\v%s\r" + "-e, --extreme\v%s\r" + "-T, --threads=%s\v%s", + W_("compression preset; default is 6; take compressor *and* " + "decompressor memory usage into account before " + "using 7-9!"), + W_("try to improve compression ratio by using more CPU time; " + "does not affect decompressor memory requirements"), + // TRANSLATORS: Short for NUMBER. A longer string is fine but + // wider than 5 columns makes --long-help a few lines longer. + _("NUM"), + W_("use at most NUM threads; the default is 0 which uses " + "as many threads as there are processor cores")); if (long_help) { - puts(_( -" --block-size=SIZE\n" -" start a new .xz block after every SIZE bytes of input;\n" -" use this to set the block size for threaded compression")); - puts(_( -" --block-list=BLOCKS\n" -" start a new .xz block after the given comma-separated\n" -" intervals of uncompressed data; optionally, specify a\n" -" filter chain number (0-9) followed by a ':' before the\n" -" uncompressed data size")); - puts(_( -" --flush-timeout=TIMEOUT\n" -" when compressing, if more than TIMEOUT milliseconds has\n" -" passed since the previous flush and reading more input\n" -" would block, all pending data is flushed out" - )); - puts(_( // xgettext:no-c-format -" --memlimit-compress=LIMIT\n" -" --memlimit-decompress=LIMIT\n" -" --memlimit-mt-decompress=LIMIT\n" -" -M, --memlimit=LIMIT\n" -" set memory usage limit for compression, decompression,\n" -" threaded decompression, or all of these; LIMIT is in\n" -" bytes, % of RAM, or 0 for defaults")); - - puts(_( -" --no-adjust if compression settings exceed the memory usage limit,\n" -" give an error instead of adjusting the settings downwards")); + e |= tuklib_wrapf(stdout, &wrap2, + " --block-size=%s\v%s\r" + " 
--block-list=%s\v%s\r" + " --flush-timeout=%s\v%s", + _("SIZE"), + W_("start a new .xz block after every SIZE bytes " + "of input; use this to set the block size " + "for threaded compression"), + _("BLOCKS"), + W_("start a new .xz block after the given " + "comma-separated intervals of uncompressed " + "data; optionally, specify a " + "filter chain number (0-9) followed by " + "a ':' before the uncompressed data size"), + _("NUM"), + W_("when compressing, if more than NUM " + "milliseconds has passed since the previous " + "flush and reading more input would block, " + "all pending data is flushed out")); + + e |= tuklib_wrapf(stdout, &wrap2, + " --memlimit-compress=%s\n" + " --memlimit-decompress=%s\n" + " --memlimit-mt-decompress=%s\n" + "-M, --memlimit=%s\v%s\r" + " --no-adjust\v%s", + _("LIMIT"), + _("LIMIT"), + _("LIMIT"), + _("LIMIT"), + // xgettext:no-c-format + W_("set memory usage limit for compression, " + "decompression, threaded decompression, " + "or all of these; LIMIT is in " + "bytes, % of RAM, or 0 for defaults"), + W_("if compression settings exceed the " + "memory usage limit, " + "give an error instead of adjusting " + "the settings downwards")); } if (long_help) { - puts(_( -"\n Custom filter chain for compression (alternative for using presets):")); - - puts(_( -"\n" -" --filters=FILTERS set the filter chain using the liblzma filter string\n" -" syntax; use --filters-help for more information" - )); - - puts(_( -" --filters1=FILTERS ... --filters9=FILTERS\n" -" set additional filter chains using the liblzma filter\n" -" string syntax to use with --block-list" - )); - - puts(_( -" --filters-help display more information about the liblzma filter string\n" -" syntax and exit." - )); + putchar('\n'); + + e |= tuklib_wraps(stdout, &wrap1, + W_("Custom filter chain for compression " + "(an alternative to using presets):")); + + e |= tuklib_wrapf(stdout, &wrap2, + "\n" + "--filters=%s\v%s\r" + "--filters1=%s ... 
--filters9=%s\v%s\r" + "--filters-help\v%s", + _("FILTERS"), + W_("set the filter chain using the " + "liblzma filter string syntax; " + "use --filters-help for more information"), + _("FILTERS"), + _("FILTERS"), + W_("set additional filter chains using the " + "liblzma filter string syntax to use " + "with --block-list"), + W_("display more information about the " + "liblzma filter string syntax and exit")); #if defined(HAVE_ENCODER_LZMA1) || defined(HAVE_DECODER_LZMA1) \ || defined(HAVE_ENCODER_LZMA2) || defined(HAVE_DECODER_LZMA2) - // TRANSLATORS: The word "literal" in "literal context bits" - // means how many "context bits" to use when encoding - // literals. A literal is a single 8-bit byte. It doesn't - // mean "literally" here. - puts(_( -"\n" -" --lzma1[=OPTS] LZMA1 or LZMA2; OPTS is a comma-separated list of zero or\n" -" --lzma2[=OPTS] more of the following options (valid values; default):\n" -" preset=PRE reset options to a preset (0-9[e])\n" -" dict=NUM dictionary size (4KiB - 1536MiB; 8MiB)\n" -" lc=NUM number of literal context bits (0-4; 3)\n" -" lp=NUM number of literal position bits (0-4; 0)\n" -" pb=NUM number of position bits (0-4; 2)\n" -" mode=MODE compression mode (fast, normal; normal)\n" -" nice=NUM nice length of a match (2-273; 64)\n" -" mf=NAME match finder (hc3, hc4, bt2, bt3, bt4; bt4)\n" -" depth=NUM maximum search depth; 0=automatic (default)")); + e |= tuklib_wrapf(stdout, &wrap2, + "\n" + "--lzma1[=%s]\n" + "--lzma2[=%s]\v%s", + // TRANSLATORS: Short for OPTIONS. + _("OPTS"), + _("OPTS"), + // TRANSLATORS: Use semicolon (or its fullwidth form) + // in "(valid values; default)" even if it is weird in + // your language. There are non-translatable strings + // that look like "(foo, bar, baz; foo)" which list + // the supported values and the default value. 
+ W_("LZMA1 or LZMA2; OPTS is a comma-separated list " + "of zero or more of the following options " + "(valid values; default):")); + + e |= tuklib_wrapf(stdout, &wrap3, + "preset=%s\v%s (0-9[e])\r" + "dict=%s\v%s \b(4KiB - 1536MiB; 8MiB)\b\r" + "lc=%s\v%s \b(0-4; 3)\b\r" + "lp=%s\v%s \b(0-4; 0)\b\r" + "pb=%s\v%s \b(0-4; 2)\b\r" + "mode=%s\v%s (fast, normal; normal)\r" + "nice=%s\v%s \b(2-273; 64)\b\r" + "mf=%s\v%s (hc3, hc4, bt2, bt3, bt4; bt4)\r" + "depth=%s\v%s", + // TRANSLATORS: Short for PRESET. A longer string is + // fine but wider than 4 columns makes --long-help + // one line longer. + _("PRE"), + W_("reset options to a preset"), + _("NUM"), W_("dictionary size"), + _("NUM"), + // TRANSLATORS: The word "literal" in "literal context + // bits" means how many "context bits" to use when + // encoding literals. A literal is a single 8-bit + // byte. It doesn't mean "literally" here. + W_("number of literal context bits"), + _("NUM"), W_("number of literal position bits"), + _("NUM"), W_("number of position bits"), + _("MODE"), W_("compression mode"), + _("NUM"), W_("nice length of a match"), + _("NAME"), W_("match finder"), + _("NUM"), W_("maximum search depth; " + "0=automatic (default)")); #endif - puts(_( -"\n" -" --x86[=OPTS] x86 BCJ filter (32-bit and 64-bit)\n" -" --arm[=OPTS] ARM BCJ filter\n" -" --armthumb[=OPTS] ARM-Thumb BCJ filter\n" -" --arm64[=OPTS] ARM64 BCJ filter\n" -" --powerpc[=OPTS] PowerPC BCJ filter (big endian only)\n" -" --ia64[=OPTS] IA-64 (Itanium) BCJ filter\n" -" --sparc[=OPTS] SPARC BCJ filter\n" -" --riscv[=OPTS] RISC-V BCJ filter\n" -" Valid OPTS for all BCJ filters:\n" -" start=NUM start offset for conversions (default=0)")); + e |= tuklib_wrapf(stdout, &wrap2, + "\n" + "--x86[=%s]\v%s\r" + "--arm[=%s]\v%s\r" + "--armthumb[=%s]\v%s\r" + "--arm64[=%s]\v%s\r" + "--powerpc[=%s]\v%s\r" + "--ia64[=%s]\v%s\r" + "--sparc[=%s]\v%s\r" + "--riscv[=%s]\v%s\r" + "\v%s", + _("OPTS"), + W_("x86 BCJ filter (32-bit and 64-bit)"), + _("OPTS"), 
+ W_("ARM BCJ filter"), + _("OPTS"), + W_("ARM-Thumb BCJ filter"), + _("OPTS"), + W_("ARM64 BCJ filter"), + _("OPTS"), + W_("PowerPC BCJ filter (big endian only)"), + _("OPTS"), + W_("IA-64 (Itanium) BCJ filter"), + _("OPTS"), + W_("SPARC BCJ filter"), + _("OPTS"), + W_("RISC-V BCJ filter"), + W_("Valid OPTS for all BCJ filters:")); + e |= tuklib_wrapf(stdout, &wrap3, + "start=%s\v%s", + _("NUM"), + W_("start offset for conversions (default=0)")); #if defined(HAVE_ENCODER_DELTA) || defined(HAVE_DECODER_DELTA) - puts(_( -"\n" -" --delta[=OPTS] Delta filter; valid OPTS (valid values; default):\n" -" dist=NUM distance between bytes being subtracted\n" -" from each other (1-256; 1)")); + e |= tuklib_wrapf(stdout, &wrap2, + "\n" + "--delta[=%s]\v%s", + _("OPTS"), + W_("Delta filter; valid OPTS " + "(valid values; default):")); + e |= tuklib_wrapf(stdout, &wrap3, + "dist=%s\v%s \b(1-256; 1)\b", + _("NUM"), + W_("distance between bytes being subtracted " + "from each other")); #endif } - if (long_help) - puts(_("\n Other options:\n")); + if (long_help) { + putchar('\n'); + e |= tuklib_wraps(stdout, &wrap1, W_("Other options:")); + putchar('\n'); + } - puts(_( -" -q, --quiet suppress warnings; specify twice to suppress errors too\n" -" -v, --verbose be verbose; specify twice for even more verbose")); + e |= tuklib_wrapf(stdout, &wrap2, + "-q, --quiet\v%s\r" + "-v, --verbose\v%s", + W_("suppress warnings; specify twice to suppress errors too"), + W_("be verbose; specify twice for even more verbose")); if (long_help) { - puts(_( -" -Q, --no-warn make warnings not affect the exit status")); - puts(_( -" --robot use machine-parsable messages (useful for scripts)")); - puts(""); - puts(_( -" --info-memory display the total amount of RAM and the currently active\n" -" memory usage limits, and exit")); - puts(_( -" -h, --help display the short help (lists only the basic options)\n" -" -H, --long-help display this long help and exit")); + e |= tuklib_wrapf(stdout, &wrap2, + "-Q, 
--no-warn\v%s\r" + " --robot\v%s\r" + "\n" + " --info-memory\v%s\r" + "-h, --help\v%s\r" + "-H, --long-help\v%s", + W_("make warnings not affect the exit status"), + W_("use machine-parsable messages (useful for scripts)"), + W_("display the total amount of RAM and the currently active " + "memory usage limits, and exit"), + W_("display the short help (lists only the basic options)"), + W_("display this long help and exit")); } else { - puts(_( -" -h, --help display this short help and exit\n" -" -H, --long-help display the long help (lists also the advanced options)")); + e |= tuklib_wrapf(stdout, &wrap2, + "-h, --help\v%s\r" + "-H, --long-help\v%s", + W_("display this short help and exit"), + W_("display the long help (lists also the advanced options)")); } - puts(_( -" -V, --version display the version number and exit")); + e |= tuklib_wrapf(stdout, &wrap2, "-V, --version\v%s", + W_("display the version number and exit")); + + putchar('\n'); + e |= tuklib_wraps(stdout, &wrap0, + W_("With no FILE, or when FILE is -, read standard input.")); + putchar('\n'); - puts(_("\nWith no FILE, or when FILE is -, read standard input.\n")); + e |= tuklib_wrapf(stdout, &wrap0, + // TRANSLATORS: This message indicates the bug reporting + // address for this package. Please add another line saying + // "\nReport translation bugs to <...>." with the email or WWW + // address for translation bugs. Thanks! + W_("Report bugs to <%s> (in English or Finnish)."), + PACKAGE_BUGREPORT); - // TRANSLATORS: This message indicates the bug reporting address - // for this package. Please add _another line_ saying - // "Report translation bugs to <...>\n" with the email or WWW - // address for translation bugs. Thanks. - printf(_("Report bugs to <%s> (in English or Finnish).\n"), - PACKAGE_BUGREPORT); - printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL); + e |= tuklib_wrapf(stdout, &wrap0, + // TRANSLATORS: The first %s is the name of this software. + // The second <%s> is an URL. 
+ W_("%s home page: <%s>"), PACKAGE_NAME, PACKAGE_URL); #if LZMA_VERSION_STABILITY != LZMA_VERSION_STABILITY_STABLE - puts(_( + e |= tuklib_wraps(stdout, &wrap0, W_( "THIS IS A DEVELOPMENT VERSION NOT INTENDED FOR PRODUCTION USE.")); #endif + detect_wrapping_errors(e); tuklib_exit(E_SUCCESS, E_ERROR, verbosity != V_SILENT); } @@ -1150,20 +1299,25 @@ message_help(bool long_help) extern void message_filters_help(void) { + static const struct tuklib_wrap_opt wrap = { .right_margin = 76 }; + char *encoder_options; if (lzma_str_list_filters(&encoder_options, LZMA_VLI_UNKNOWN, LZMA_STR_ENCODER, NULL) != LZMA_OK) message_bug(); if (!opt_robot) { - puts(_( -"Filter chains are set using the --filters=FILTERS or\n" -"--filters1=FILTERS ... --filters9=FILTERS options. Each filter in the chain\n" -"can be separated by spaces or '--'. Alternatively a preset <0-9>[e] can be\n" -"specified instead of a filter chain.\n" - )); - - puts(_("The supported filters and their options are:")); + int e = tuklib_wrapf(stdout, &wrap, +W_("Filter chains are set using the --filters=FILTERS or " +"--filters1=FILTERS ... --filters9=FILTERS options. " +"Each filter in the chain can be separated by spaces or '--'. " +"Alternatively a preset %s can be specified instead of a filter chain."), + "<0-9>[e]"); + putchar('\n'); + e |= tuklib_wraps(stdout, &wrap, + W_("The supported filters and their options are:")); + + detect_wrapping_errors(e); } puts(encoder_options); diff --git a/src/xz/options.c b/src/xz/options.c index bc8bc1a6c36c..c4f56b495609 100644 --- a/src/xz/options.c +++ b/src/xz/options.c @@ -82,15 +82,16 @@ parse_options(const char *str, const option_map *opts, *value++ = '\0'; if (value == NULL || value[0] == '\0') - message_fatal(_("%s: Options must be 'name=value' " - "pairs separated with commas"), str); + message_fatal(_("%s: %s"), tuklib_mask_nonprint(str), + _("Options must be 'name=value' " + "pairs separated with commas")); // Look for the option name from the option map. 
unsigned i = 0; while (true) { if (opts[i].name == NULL) message_fatal(_("%s: Invalid option name"), - name); + tuklib_mask_nonprint(name)); if (strcmp(name, opts[i].name) == 0) break; @@ -109,8 +110,9 @@ parse_options(const char *str, const option_map *opts, } if (opts[i].map[j].name == NULL) - message_fatal(_("%s: Invalid option value"), - value); + message_fatal(_("%s: %s"), + tuklib_mask_nonprint(value), + _("Invalid option value")); set(filter_options, i, opts[i].map[j].id, value); @@ -244,7 +246,8 @@ tuklib_attr_noreturn static void error_lzma_preset(const char *valuestr) { - message_fatal(_("Unsupported LZMA1/LZMA2 preset: %s"), valuestr); + message_fatal(_("Unsupported LZMA1/LZMA2 preset: %s"), + tuklib_mask_nonprint(valuestr)); } diff --git a/src/xz/private.h b/src/xz/private.h index b370472e32c8..d351a995eec4 100644 --- a/src/xz/private.h +++ b/src/xz/private.h @@ -28,6 +28,7 @@ #include "tuklib_gettext.h" #include "tuklib_progname.h" #include "tuklib_exit.h" +#include "tuklib_mbstr_nonprint.h" #include "tuklib_mbstr.h" #if defined(_WIN32) && !defined(__CYGWIN__) diff --git a/src/xz/sandbox.c b/src/xz/sandbox.c index 5bd227370751..f5576960d9aa 100644 --- a/src/xz/sandbox.c +++ b/src/xz/sandbox.c @@ -115,26 +115,7 @@ sandbox_enable_strict_if_allowed(int src_fd lzma_attribute((__unused__)), // Landlock // ////////////// -#include <linux/landlock.h> -#include <sys/syscall.h> -#include <sys/prctl.h> - - -// Highest Landlock ABI version supported by this file: -// - For ABI versions 1-3 we don't need anything from <linux/landlock.h> -// that isn't part of version 1. -// - For ABI version 4 we need the larger struct landlock_ruleset_attr -// with the handled_access_net member. That is bundled with the macros -// LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP. 
-#ifdef LANDLOCK_ACCESS_NET_BIND_TCP -# define LANDLOCK_ABI_MAX 4 -#else -# define LANDLOCK_ABI_MAX 3 -#endif - - -/// Landlock ABI version supported by the kernel -static int landlock_abi; +#include "my_landlock.h" // The required_rights should have those bits set that must not be restricted. @@ -144,40 +125,19 @@ static int landlock_abi; static void enable_landlock(uint64_t required_rights) { - assert(landlock_abi <= LANDLOCK_ABI_MAX); - - if (landlock_abi <= 0) + // Initialize the ruleset to forbid all actions that the available + // Landlock ABI version supports. Return if Landlock isn't supported + // at all. + struct landlock_ruleset_attr attr; + if (my_landlock_ruleset_attr_forbid_all(&attr) == -1) return; - // We want to set all supported flags in handled_access_fs. - // This way the ruleset will initially forbid access to all - // actions that the available Landlock ABI version supports. - // Exceptions can be added using landlock_add_rule(2) to - // allow certain actions on certain files or directories. - // - // The same flag values are used on all archs. ABI v2 and v3 - // both add one new flag. - // - // First in ABI v1: LANDLOCK_ACCESS_FS_EXECUTE = 1ULL << 0 - // Last in ABI v1: LANDLOCK_ACCESS_FS_MAKE_SYM = 1ULL << 12 - // Last in ABI v2: LANDLOCK_ACCESS_FS_REFER = 1ULL << 13 - // Last in ABI v3: LANDLOCK_ACCESS_FS_TRUNCATE = 1ULL << 14 - // - // This makes it simple to set the mask based on the ABI - // version and we don't need to care which flags are #defined - // in the installed <linux/landlock.h> for ABI versions 1-3. - const struct landlock_ruleset_attr attr = { - .handled_access_fs = ~required_rights - & ((1ULL << (12 + my_min(3, landlock_abi))) - 1), -#if LANDLOCK_ABI_MAX >= 4 - .handled_access_net = landlock_abi < 4 ? 0 : - (LANDLOCK_ACCESS_NET_BIND_TCP - | LANDLOCK_ACCESS_NET_CONNECT_TCP), -#endif - }; + // Allow the required rights. 
+ attr.handled_access_fs &= ~required_rights; - const int ruleset_fd = syscall(SYS_landlock_create_ruleset, - &attr, sizeof(attr), 0U); + // Create the ruleset in the kernel. This shouldn't fail. + const int ruleset_fd = my_landlock_create_ruleset( + &attr, sizeof(attr), 0); if (ruleset_fd < 0) message_fatal(_("Failed to enable the sandbox")); @@ -193,9 +153,10 @@ enable_landlock(uint64_t required_rights) // // prctl(PR_SET_NO_NEW_PRIVS, ...) was already called in // sandbox_init() so we don't do it here again. - if (syscall(SYS_landlock_restrict_self, ruleset_fd, 0U) != 0) + if (my_landlock_restrict_self(ruleset_fd, 0) != 0) message_fatal(_("Failed to enable the sandbox")); + (void)close(ruleset_fd); return; } @@ -213,19 +174,14 @@ sandbox_init(void) // fails here the error will still be detected when it matters. (void)prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); - // Get the highest Landlock ABI version supported by the kernel. - landlock_abi = syscall(SYS_landlock_create_ruleset, - (void *)NULL, 0, LANDLOCK_CREATE_RULESET_VERSION); - - // The kernel might support a newer ABI than this file. - if (landlock_abi > LANDLOCK_ABI_MAX) - landlock_abi = LANDLOCK_ABI_MAX; - // These are all in ABI version 1 already. We don't need truncate // rights because files are created with open() using O_EXCL and // without O_TRUNC. // - // LANDLOCK_ACCESS_FS_READ_DIR is included here to get a clear error + // LANDLOCK_ACCESS_FS_READ_DIR is required to synchronize the + // directory before removing the source file. + // + // LANDLOCK_ACCESS_FS_READ_DIR is also helpful to show a clear error // message if xz is given a directory name. Without this permission // the message would be "Permission denied" but with this permission // it's "Is a directory, skipping". 
It could be worked around with diff --git a/src/xz/suffix.c b/src/xz/suffix.c index 1d548e485b8c..2fd4c7fc9573 100644 --- a/src/xz/suffix.c +++ b/src/xz/suffix.c @@ -163,7 +163,7 @@ uncompressed_name(const char *src_name, const size_t src_len) if (new_len == 0) { message_warning(_("%s: Filename has an unknown suffix, " - "skipping"), src_name); + "skipping"), tuklib_mask_nonprint(src_name)); return NULL; } @@ -178,13 +178,14 @@ uncompressed_name(const char *src_name, const size_t src_len) } -/// This message is needed in multiple places in compressed_name(), -/// so the message has been put into its own function. static void msg_suffix(const char *src_name, const char *suffix) { + char *mem = NULL; message_warning(_("%s: File already has '%s' suffix, skipping"), - src_name, suffix); + tuklib_mask_nonprint(src_name), + tuklib_mask_nonprint_r(suffix, &mem)); + free(mem); return; } @@ -390,7 +391,8 @@ suffix_set(const char *suffix) // Empty suffix and suffixes having a directory separator are // rejected. Such suffixes would break things later. if (suffix[0] == '\0' || has_dir_sep(suffix)) - message_fatal(_("%s: Invalid filename suffix"), suffix); + message_fatal(_("%s: Invalid filename suffix"), + tuklib_mask_nonprint(suffix)); // Replace the old custom_suffix (if any) with the new suffix. free(custom_suffix); diff --git a/src/xz/util.c b/src/xz/util.c index 0d339aede675..e5485beef80d 100644 --- a/src/xz/util.c +++ b/src/xz/util.c @@ -25,7 +25,11 @@ static char bufs[4][128]; // for DJGPP builds. // // MSVC doesn't support thousand separators. -#if defined(__DJGPP__) || defined(_MSC_VER) +// +// MinGW-w64 supports thousand separators only with its own stdio functions +// which our sysdefs.h disables when _UCRT && HAVE_SMALL. 
+#if defined(__DJGPP__) || defined(_MSC_VER) \ + || (defined(__MINGW32__) && __USE_MINGW_ANSI_STDIO == 0) # define FORMAT_THOUSAND_SEP(prefix, suffix) prefix suffix # define check_thousand_sep(slot) do { } while (0) #else @@ -103,8 +107,8 @@ str_to_uint64(const char *name, const char *value, uint64_t min, uint64_t max) return max; if (*value < '0' || *value > '9') - message_fatal(_("%s: Value is not a non-negative " - "decimal integer"), value); + message_fatal(_("%s: %s"), value, + _("Value is not a non-negative decimal integer")); do { // Don't overflow. diff --git a/src/xz/xz.1 b/src/xz/xz.1 index 5b880e81e8c2..0bc30a9af384 100644 --- a/src/xz/xz.1 +++ b/src/xz/xz.1 @@ -4,7 +4,7 @@ .\" Authors: Lasse Collin .\" Jia Tan .\" -.TH XZ 1 "2024-04-08" "Tukaani" "XZ Utils" +.TH XZ 1 "2025-03-08" "Tukaani" "XZ Utils" . .SH NAME xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files @@ -237,6 +237,8 @@ The memory usage limiter can be enabled with the command line option \fB\-\-memlimit=\fIlimit\fR. Often it is more convenient to enable the limiter by default by setting the environment variable +.\" TRANSLATORS: Don't translate the uppercase XZ_DEFAULTS. +.\" It's a name of an environment variable. .BR XZ_DEFAULTS , for example, .BR XZ_DEFAULTS=\-\-memlimit=150MiB . @@ -351,9 +353,24 @@ the command name (for example, .B unxz implies .BR \-\-decompress ). +.IP "" +.\" The DESCRIPTION section already says this but it's good to repeat it +.\" here because the default behavior is a bit dangerous and new users +.\" in a hurry may skip reading the DESCRIPTION section. +After successful compression, the source file is removed +unless writing to standard output or +.B \-\-keep +was specified. .TP .BR \-d ", " \-\-decompress ", " \-\-uncompress Decompress. 
+.\" The DESCRIPTION section already says this but it's good to repeat it +.\" here because the default behavior is a bit dangerous and new users +.\" in a hurry may skip reading the DESCRIPTION section. +After successful decompression, the source file is removed +unless writing to standard output or +.B \-\-keep +was specified. .TP .BR \-t ", " \-\-test Test the integrity of compressed @@ -482,6 +499,13 @@ This option has no effect if the operation mode is not .B \-\-decompress or .BR \-\-test . +.IP "" +Since +.B xz +5.7.1alpha, +.B \-\-single\-stream +implies +.BR \-\-keep . .TP .B \-\-no\-sparse Disable creation of sparse files. @@ -553,6 +577,7 @@ Specify the file to compress or decompress: .RS .TP +.\" TRANSLATORS: Don't translate bold string B<auto>. .B auto This is the default. When compressing, @@ -639,6 +664,9 @@ Supported types: .RS .TP +.\" TRANSLATORS: Don't translate the bold strings B<none>, B<crc32>, +.\" B<crc64>, and B<sha256>. The command line option --check accepts +.\" only the untranslated strings. .B none Don't calculate an integrity check at all. This is usually a bad idea. @@ -1039,6 +1067,28 @@ is unsuitable for decompressing the stream in real time due to how .B xz does buffering. .TP +.B \-\-no\-sync +Do not synchronize the target file and its directory +to the storage device before removing the source file. +This can improve performance if compressing or decompressing +many small files. +However, if the system crashes soon after the deletion, +it is possible that the target file was not written +to the storage device but the delete operation was. +In that case neither the original source file +nor the target file is available. +.IP "" +This option has an effect only when +.B xz +is going to remove the source file. +In other cases synchronization is never done. +.IP "" +The synchronization and +.B \-\-no\-sync +were added in +.B xz +5.7.1alpha. +.TP .BI \-\-memlimit\-compress= limit Set a memory usage limit for compression. 
If this option is specified multiple times, @@ -1453,6 +1503,11 @@ LZMA1 and LZMA2 share the same set of .IR options : .RS .TP +.\" TRANSLATORS: Don't translate bold strings like B<preset>, B<dict>, +.\" B<mode>, B<nice>, B<fast>, or B<normal> because those are command line +.\" options. On the other hand, do translate the italic strings like +.\" I<preset>, I<size>, and I<mode>, because such italic strings are +.\" placeholders which a user replaces with an actual value. .BI preset= preset Reset all LZMA1 or LZMA2 .I options @@ -2103,6 +2158,11 @@ uses tab-separated output. The first column of every line has a string that indicates the type of the information found on that line: .TP +.\" TRANSLATORS: The bold strings B<name>, B<file>, B<stream>, B<block>, +.\" B<summary>, and B<totals> are produced by the xz tool for scripts to +.\" parse, thus the untranslated strings must be included in the translated +.\" man page. It may be useful to provide a translated string in parenthesis +.\" without bold, for example: "B<name> (nimi)" .B name This is always the first line when starting to list a file. The second column on the line is the filename. @@ -2181,6 +2241,9 @@ are displayed instead of the ratio. .IP 7. 4 Comma-separated list of integrity check names. The following strings are used for the known check types: +.\" TRANSLATORS: Don't translate the bold strings B<None>, B<CRC32>, +.\" B<CRC64>, B<SHA-256>, or B<Unknown-> here. In robot mode, xz produces +.\" them in untranslated form for scripts to parse. .BR None , .BR CRC32 , .BR CRC64 , @@ -2467,6 +2530,7 @@ prints the version number of .B xz and liblzma in the following format: .PP +.\" TRANSLATORS: Don't translate the uppercase XZ_VERSION or LIBLZMA_VERSION. .BI XZ_VERSION= XYYYZZZS .br .BI LIBLZMA_VERSION= XYYYZZZS @@ -2521,6 +2585,8 @@ don't affect the exit status. 
.B xz parses space-separated lists of options from the environment variables +.\" TRANSLATORS: Don't translate the uppercase XZ_DEFAULTS or XZ_OPT. +.\" They are names of environment variables. .B XZ_DEFAULTS and .BR XZ_OPT , @@ -2530,14 +2596,36 @@ all non-options are silently ignored. Parsing is done with .BR getopt_long (3) which is used also for the command line arguments. +.PP +.B Warning: +By setting these environment variables, +one is effectively modifying programs and scripts that run +.BR xz . +Most of the time it is safe to set memory usage limits, number of threads, +and compression options via the environment variables. +However, some options can break scripts. +An obvious example is +.B \-\-help +which makes +.B xz +show the help text instead of compressing or decompressing a file. +More subtle examples are +.B \-\-quiet +and +.BR \-\-verbose . +In many cases it works well to enable the progress indicator using +.BR \-\-verbose , +but in some situations the extra messages create problems. +The verbosity level also affects the behavior of +.BR \-\-list . .TP .B XZ_DEFAULTS User-specific or system-wide default options. Typically this is set in a shell initialization script to enable .BR xz 's -memory usage limiter by default. +memory usage limiter by default or set the default number of threads. Excluding shell initialization scripts -and similar special cases, scripts must never set or unset +and similar special cases, scripts should never set or unset .BR XZ_DEFAULTS . 
.TP .B XZ_OPT diff --git a/src/xzdec/xzdec.c b/src/xzdec/xzdec.c index a75ea42a52fb..96e2444438c2 100644 --- a/src/xzdec/xzdec.c +++ b/src/xzdec/xzdec.c @@ -14,6 +14,7 @@ #include <stdarg.h> #include <errno.h> +#include <locale.h> #include <stdio.h> #ifndef _MSC_VER @@ -25,14 +26,7 @@ #endif #ifdef HAVE_LINUX_LANDLOCK -# include <linux/landlock.h> -# include <sys/prctl.h> -# include <sys/syscall.h> -# ifdef LANDLOCK_ACCESS_NET_BIND_TCP -# define LANDLOCK_ABI_MAX 4 -# else -# define LANDLOCK_ABI_MAX 3 -# endif +# include "my_landlock.h" #endif #if defined(HAVE_CAP_RIGHTS_LIMIT) || defined(HAVE_PLEDGE) \ @@ -42,6 +36,7 @@ #include "getopt.h" #include "tuklib_progname.h" +#include "tuklib_mbstr_nonprint.h" #include "tuklib_exit.h" #ifdef TUKLIB_DOSLIKE @@ -209,7 +204,8 @@ uncompress(lzma_stream *strm, FILE *file, const char *filename) // an error occurred. ferror() doesn't // touch errno. my_errorf("%s: Error reading input file: %s", - filename, strerror(errno)); + tuklib_mask_nonprint(filename), + strerror(errno)); exit(EXIT_FAILURE); } @@ -234,8 +230,17 @@ uncompress(lzma_stream *strm, FILE *file, const char *filename) // Wouldn't be a surprise if writing to stderr // would fail too but at least try to show an // error message. - my_errorf("Cannot write to standard output: " +#if defined(_WIN32) && !defined(__CYGWIN__) + // On native Windows, broken pipe is reported + // as EINVAL. Don't show an error message + // in this case. 
+ if (errno != EINVAL) +#endif + { + my_errorf("Cannot write to " + "standard output: " "%s", strerror(errno)); + } exit(EXIT_FAILURE); } @@ -292,7 +297,8 @@ uncompress(lzma_stream *strm, FILE *file, const char *filename) break; } - my_errorf("%s: %s", filename, msg); + my_errorf("%s: %s", tuklib_mask_nonprint(filename), + msg); exit(EXIT_FAILURE); } } @@ -334,33 +340,20 @@ sandbox_enter(int src_fd) (void)src_fd; #elif defined(HAVE_LINUX_LANDLOCK) - int landlock_abi = syscall(SYS_landlock_create_ruleset, - (void *)NULL, 0, LANDLOCK_CREATE_RULESET_VERSION); - - if (landlock_abi > 0) { - if (landlock_abi > LANDLOCK_ABI_MAX) - landlock_abi = LANDLOCK_ABI_MAX; - - const struct landlock_ruleset_attr attr = { - .handled_access_fs = (1ULL - << (12 + my_min(3, landlock_abi))) - 1, -# if LANDLOCK_ABI_MAX >= 4 - .handled_access_net = landlock_abi < 4 ? 0 : - (LANDLOCK_ACCESS_NET_BIND_TCP - | LANDLOCK_ACCESS_NET_CONNECT_TCP), -# endif - }; - - const int ruleset_fd = syscall(SYS_landlock_create_ruleset, - &attr, sizeof(attr), 0U); + struct landlock_ruleset_attr attr; + if (my_landlock_ruleset_attr_forbid_all(&attr) > 0) { + const int ruleset_fd = my_landlock_create_ruleset( + &attr, sizeof(attr), 0); if (ruleset_fd < 0) goto error; // All files we need should have already been opened. Thus, // we don't need to add any rules using landlock_add_rule(2) // before activating the sandbox. - if (syscall(SYS_landlock_restrict_self, ruleset_fd, 0U) != 0) + if (my_landlock_restrict_self(ruleset_fd, 0) != 0) goto error; + + (void)close(ruleset_fd); } (void)src_fd; @@ -391,6 +384,9 @@ error: int main(int argc, char **argv) { + // Initialize progname which will be used in error messages. + tuklib_progname_init(argv); + #ifdef HAVE_PLEDGE // OpenBSD's pledge(2) sandbox. 
// Initially enable the sandbox slightly more relaxed so that @@ -416,8 +412,15 @@ main(int argc, char **argv) (void)prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); #endif - // Initialize progname which we will be used in error messages. - tuklib_progname_init(argv); + // We need to set the locale even though we don't have any + // translated messages: + // + // - tuklib_mask_nonprint() has locale-specific behavior (LC_CTYPE). + // + // - This is needed on Windows to make non-ASCII filenames display + // properly when the active code page has been set to UTF-8 + // in the application manifest. + setlocale(LC_ALL, ""); // Parse the command line options. parse_options(argc, argv); @@ -453,8 +456,10 @@ main(int argc, char **argv) src_name = argv[optind]; src_file = fopen(src_name, "rb"); if (src_file == NULL) { - my_errorf("%s: %s", src_name, - strerror(errno)); + my_errorf("%s: %s", + tuklib_mask_nonprint( + src_name), + strerror(errno)); exit(EXIT_FAILURE); } } |
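The Landlock comments removed from src/xz/sandbox.c above explain how the old code derived `handled_access_fs` from the kernel's ABI version: the flags are consecutive bits, ABI v1 ends at bit 12, and v2 and v3 each add one more. A minimal sketch of that arithmetic follows. The function name `fs_mask_for_abi` is hypothetical and is not the `my_landlock.h` helper that the new code calls; the cap at ABI 3 simply mirrors the removed expression.

```c
#include <stdint.h>

// Hypothetical helper for illustration only; not part of xz.
//
// Returns the handled_access_fs mask for a given Landlock ABI version,
// following the expression in the removed code:
//     ~required_rights & ((1ULL << (12 + min(3, abi))) - 1)
//
// Bits set in required_rights are left out of the mask so that those
// actions stay unrestricted.
static uint64_t
fs_mask_for_abi(int abi, uint64_t required_rights)
{
	// ABI <= 0 means Landlock isn't available: restrict nothing.
	if (abi <= 0)
		return 0;

	// The removed code only knew the filesystem flags of ABI
	// versions 1-3, so newer versions are treated like version 3.
	const int capped = abi < 3 ? abi : 3;

	const uint64_t all_known = (UINT64_C(1) << (12 + capped)) - 1;
	return ~required_rights & all_known;
}
```

With no required rights, ABI versions 1, 2, and 3 give the masks 0x1fff, 0x3fff, and 0x7fff, which matches the "last flag" bit positions 12, 13, and 14 listed in the removed comment.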

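Many hunks above wrap filenames and option strings in `tuklib_mask_nonprint()` before printing them, so that control characters in untrusted input cannot reach the terminal. The sketch below shows the general idea only. `mask_nonprint_ascii` is a hypothetical stand-in, and both the ASCII-only check and the choice of '?' as the replacement character are assumptions made for this illustration. The real function is locale-aware (the xzdec.c comment notes that it depends on LC_CTYPE) and has a different interface.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical, simplified illustration; not the tuklib implementation.
//
// Returns a newly allocated copy of str in which every byte outside
// the printable ASCII range (0x20-0x7E) is replaced with '?'.
// Returns NULL if memory allocation fails. The caller frees the result.
static char *
mask_nonprint_ascii(const char *str)
{
	const size_t len = strlen(str);
	char *out = malloc(len + 1);
	if (out == NULL)
		return NULL;

	for (size_t i = 0; i < len; ++i) {
		const unsigned char c = (unsigned char)str[i];
		out[i] = (c >= 0x20 && c <= 0x7E) ? (char)c : '?';
	}

	out[len] = '\0';
	return out;
}
```

For example, a filename containing an escape sequence such as "a\033[31mb\n" would be shown as "a?[31mb?", which keeps the message on one line and prevents the terminal from interpreting the sequence.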