author: Yarny0 <41838844+Yarny0@users.noreply.github.com> 2025-07-04 12:05:34 +0200
committer: Yarny0 <41838844+Yarny0@users.noreply.github.com> 2025-07-04 12:10:53 +0200
commit: 5e2baf54d48b025461698ea2023473af6753edd9
tree: ca2f43a704fa4bd5234f59fd5d1ad851c5204ea8 /pkgs/development/python-modules/rangehttpserver
parent: 605cfcce80bf61ff47815aa8a8d3e275c0655312
nixos/test-driver: fix race from filename clash in OCR

There is a race condition in the new parallelized OCR code. The race condition became "active" in commit 819d304a39b027a02da37bbf8956f18a0ca2483e (Use futures for OCR parallelization); however, the underlying bug had already slipped in with commit e6ea13f4ea9e4783a241f473aa3105b777bef3dc (Use proper `Path` instead of `str` in OCR code).

The OCR module applies tesseract to at most three variants of the screenshot: the original one, and two variants that are created by a preprocessing step (with ImageMagick). The preprocessing step needs an output filename to which the preprocessed image is written.

The "Path" commit broke the way the output file is named: the code still attempts to append ".negative" to *one* of the preprocessed output files, but the method `.with_suffix` is not suitable for that purpose. Later on, ".png" is also added with `.with_suffix`, *replacing* the ".negative" and thereby yielding *the same* output filename for both preprocessed files.

Without parallelization, this doesn't hurt; the preprocessed files are simply created and analyzed in order. But the parallelization commit makes these two tasks run in parallel (plus the third task that analyzes the original screenshot, but that does not cause any further harm here):

* Task 1: preprocess (non-negative), then tesseract the output
* Task 2: preprocess (negative), then tesseract the output

Both tasks use the same filename, and thus the same file, for the preprocessed image that is generated and then read by tesseract. This often produces a garbage file, since both preprocessing steps write to that one file at the same time. Tesseract consequently fails and complains about bad data in its input file.

The commit at hand simply fixes the file naming by adding ".negative.png" or ".positive.png" to the filename of the preprocessed image. This ensures that the two threads no longer clobber each other's data and can now coexist in peace.

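
The filename clash described above can be reproduced with `pathlib` alone. The following is a minimal sketch, not the test driver's actual code: the path `/tmp/ocr/screenshot` and the helper names `buggy_output_name` and `fixed_output_name` are hypothetical and only illustrate how `.with_suffix` replaces an existing suffix rather than appending to it.

```python
from pathlib import Path

# Hypothetical screenshot path; the real test driver's names may differ.
screenshot = Path("/tmp/ocr/screenshot")


def buggy_output_name(image: Path, negate: bool) -> Path:
    # Assumed shape of the bug: ".negative" is set via with_suffix ...
    out = image.with_suffix(".negative") if negate else image
    # ... and then *replaced* (not extended) by the second with_suffix call.
    return out.with_suffix(".png")


def fixed_output_name(image: Path, negate: bool) -> Path:
    # Append one complete, distinct suffix to the file name instead.
    tag = "negative" if negate else "positive"
    return image.with_name(f"{image.name}.{tag}.png")


# Both preprocessing variants collide on a single output file.
print(buggy_output_name(screenshot, negate=False))
print(buggy_output_name(screenshot, negate=True))

# With distinct names, parallel tasks no longer share a file.
print(fixed_output_name(screenshot, negate=False))
print(fixed_output_name(screenshot, negate=True))
```

With the buggy naming, both calls return the same path, so two concurrent tasks would write to and read from one file; with the fixed naming, each task gets its own output file.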
Diffstat (limited to 'pkgs/development/python-modules/rangehttpserver')
0 files changed, 0 insertions, 0 deletions