Compare commits
2 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 6f7cadfc63 | |
| | 985a33d3f9 | |
@@ -4,6 +4,7 @@ __pycache__/
|
||||
venv/
|
||||
env/
|
||||
.venv/
|
||||
.pytest_cache/
|
||||
*.egg-info/
|
||||
build/
|
||||
dist/
|
||||
|
||||
+47
-14
@@ -1,8 +1,8 @@
|
||||
# AI Agent Briefing — PDF OCR Hotfolder
|
||||
|
||||
**Last updated:** 2026-04-08
**Version:** 0.1.0
**Status:** Initial implementation, not tested in production
**Version:** 0.2.0
**Status:** Multi-instance support, not tested in production
|
||||
|
||||
## 🎯 Project goal
|
||||
|
||||
@@ -20,7 +20,7 @@ pdf-ocr-hotfolder/
|
||||
│ ├── processor.py # ocrmypdf + veraPDF
|
||||
│ └── uploaders.py # folder, nextcloud (WebDAV), sftp, email
|
||||
├── systemd/
|
||||
│ └── pdf-ocr-hotfolder.service # Template (Platzhalter __SERVICE_USER__/__SERVICE_GROUP__)
|
||||
│ └── pdf-ocr-hotfolder@.service # systemd Template-Unit (Instanz = %i)
|
||||
├── config.example.toml
|
||||
├── install.sh # Interaktiver Installer
|
||||
├── update.sh # Update aus Repo
|
||||
@@ -43,25 +43,58 @@ pdf-ocr-hotfolder/
|
||||
| Email | `smtplib` (stdlib) |
|
||||
| Service | systemd |
|
||||
|
||||
## 🖥️ Installation layout
## 🖥️ Installation layout (multi-instance)

| Path | Contents |
|------|----------|
| `/opt/pdf-ocr-hotfolder/` | Code + venv (`venv/bin/python`) |
| `/etc/pdf-ocr-hotfolder/config.toml` | Configuration (mode 640, root:<service-group>) |
| `/var/lib/pdf-ocr-hotfolder/{incoming,working,outgoing,error}/` | Data directories |
| `/var/log/pdf-ocr-hotfolder/` | Logs (in addition to journald) |
| `/etc/systemd/system/pdf-ocr-hotfolder.service` | systemd unit |
| `/opt/pdf-ocr-hotfolder/` | Code + venv (shared by all instances) |
| `/etc/pdf-ocr-hotfolder/<instanz>.toml` | Per-instance config (mode 640, root:<service-group>) |
| `/etc/systemd/system/pdf-ocr-hotfolder@.service` | Template unit |
| `/etc/systemd/system/pdf-ocr-hotfolder@<instanz>.service.d/user.conf` | Drop-in for a non-default user (optional) |
| `/var/lib/pdf-ocr-hotfolder/<instanz>/{incoming,working,outgoing,error}/` | Per-instance data |
| `/var/log/pdf-ocr-hotfolder/` | Logs |
| `/var/backups/pdf-ocr-hotfolder/` | Update backups |
|
||||
|
||||
## 👤 Service user

The installer asks interactively:
1. Username (default `pdfocr`)
2. If the user exists (local or AD via SSSD/Winbind): it is reused and its primary group is detected automatically
3. If not: asks whether to create it locally as a system user
- The base install creates the default user `pdfocr` (as a system user, unless it already exists)
- When creating an instance, the installer asks for the service user (default `pdfocr`)
- If a **different** user is chosen, a systemd drop-in (`pdf-ocr-hotfolder@<instanz>.service.d/user.conf`) is created that overrides `User=` and `Group=`
- Existing users (local or AD via SSSD/Winbind) are reused; the primary group is determined via `id -gn`
- For AD users with a local UID, file permissions are set via the UID, which works transparently

**Important:** For AD users with a local UID, file permissions are set via the UID, which works transparently.
|
||||
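The drop-in rule described above can be sketched in a few lines of Python. This is only an illustration: the installer itself writes the file from a shell heredoc, and the helper names `dropin_path` and `render_user_dropin` are hypothetical, not part of the project.

```python
from __future__ import annotations

DEFAULT_USER = "pdfocr"  # default service user named in the briefing


def dropin_path(instance: str) -> str:
    """Path of the per-instance drop-in, as listed in the layout table."""
    return f"/etc/systemd/system/pdf-ocr-hotfolder@{instance}.service.d/user.conf"


def render_user_dropin(user: str, group: str) -> str | None:
    """Return the drop-in content, or None when no override is needed.

    Hypothetical helper: a drop-in is only required when the chosen user
    differs from the default user of the template unit.
    """
    if user == DEFAULT_USER:
        return None
    return f"[Service]\nUser={user}\nGroup={group}\n"
```

For an instance that runs as the default user, `render_user_dropin("pdfocr", "pdfocr")` returns `None`, which matches the "optional" note in the layout table.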
## 🗂️ Instance management

`install.sh` is both **installer and instance manager**:

- First run: base install plus creation of the first instance (mandatory)
- Subsequent runs: the base install is skipped, existing instances are listed, and further instances can be added
- Inputs per instance: name (`[a-z0-9][a-z0-9-]*`), base path (default `/var/lib/pdf-ocr-hotfolder/<name>`), service user
- `<name>.toml` is generated from `config.example.toml` with the data paths substituted via sed
- The instance is started immediately with `enable --now`

Removing an instance manually:
```bash
systemctl disable --now pdf-ocr-hotfolder@<name>
rm /etc/pdf-ocr-hotfolder/<name>.toml
rm -rf /etc/systemd/system/pdf-ocr-hotfolder@<name>.service.d
systemctl daemon-reload
# Clean up the data directory /var/lib/pdf-ocr-hotfolder/<name> manually
```
|
||||
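As a rough illustration of the per-instance inputs above, the following Python sketch validates an instance name and rewrites the four default data paths of an example config. It is not the installer's code: `install.sh` does this in bash with `sed`, the function names are hypothetical, and unlike `sed` without the `g` flag, `str.replace` substitutes every occurrence.

```python
import re

# Same rule install.sh enforces: lowercase letters, digits and hyphens,
# and the name must not start with a hyphen.
INSTANCE_NAME = re.compile(r"[a-z0-9][a-z0-9-]*")
DEFAULT_ROOT = "/var/lib/pdf-ocr-hotfolder"
SUBDIRS = ("incoming", "outgoing", "working", "error")


def is_valid_instance_name(name: str) -> bool:
    """True if the whole string matches the instance-name pattern."""
    return INSTANCE_NAME.fullmatch(name) is not None


def render_instance_config(example_text: str, base: str) -> str:
    """Point the four default data paths in the example config at `base`."""
    result = example_text
    for sub in SUBDIRS:
        result = result.replace(f"{DEFAULT_ROOT}/{sub}", f"{base}/{sub}")
    return result
```

The key names inside `config.example.toml` are not shown in this diff, so the sketch only relies on the default path strings themselves.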
|
||||
## 🔄 Update behavior

`update.sh`:
1. Determines all active `pdf-ocr-hotfolder@*.service` units
2. Stops them
3. Writes a backup to `/var/backups/pdf-ocr-hotfolder/`
4. Copies code + requirements + VERSION + config.example from the repo
5. Runs `pip install --upgrade` in the venv
6. Updates the template unit and runs `daemon-reload`
7. Restarts all previously active instances
8. Exits with code 1 if an instance fails to come back up

Config files are **never** overwritten.
|
||||
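Step 1 of the update flow depends on knowing which instances are active. How `update.sh` collects them is not visible in this diff; the sketch below shows one possible approach in Python, parsing text in the format produced by `systemctl list-units --plain --no-legend --state=active 'pdf-ocr-hotfolder@*.service'`. The function name `active_instances` is hypothetical.

```python
from __future__ import annotations

import re

# Unit name in the first column, instance name between "@" and ".service".
UNIT_RE = re.compile(r"^pdf-ocr-hotfolder@(?P<name>[a-z0-9][a-z0-9-]*)\.service$")


def active_instances(list_units_output: str) -> list[str]:
    """Extract sorted instance names from `systemctl list-units` style text.

    Lines whose first column is not a pdf-ocr-hotfolder@ unit are ignored.
    """
    names = []
    for line in list_units_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        match = UNIT_RE.match(fields[0])
        if match:
            names.append(match.group("name"))
    return sorted(names)
```

Keeping the list from before the stop step is what allows the script to restart exactly the instances that were running, rather than every configured one.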
|
||||
## 🔄 Processing flow
|
||||
|
||||
|
||||
@@ -1,5 +1,32 @@
|
||||
# Changelog
|
||||
|
||||
## [0.2.1] - 2026-04-09

### Fixed
- **Issue #1**: The preflight check at startup now verifies `tesseract` and `gs` (Ghostscript). If a dependency is missing, the service exits immediately with exit code 2 and a clear error message instead of failing on the first file.
- **Issue #2**: `--once` mode now returns exit code `1` as soon as **at least one** PDF has failed. Exit code `0` only on complete success (including "no files present"). Exit code `2` on preflight failure.

### Added
- Public API: `HotfolderService.run_once()`, `.success_count`, `.error_count`, `.ensure_dirs()`
- `check_preflight()` / `PreflightError` in `pdf_ocr_hotfolder.service`
- pytest test suite (`tests/`) with 11 tests, covering all scenarios from Issue #1 and #2
- The `ocrmypdf` import in `processor.py` is now lazy (tests can run without ocrmypdf installed)
|
||||
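The exit-code contract from Issue #1 and Issue #2 can be summarised as a small mapping. The sketch below is self-contained and uses a stand-in `PreflightError` class plus a hypothetical helper `once_exit_code`; the real logic lives in the package's `main()`.

```python
from __future__ import annotations

from typing import Callable

EXIT_OK = 0          # every PDF succeeded, or there was nothing to do
EXIT_FAILED_PDF = 1  # at least one PDF failed
EXIT_PREFLIGHT = 2   # tesseract or gs is missing


class PreflightError(RuntimeError):
    """Stand-in for the PreflightError added in pdf_ocr_hotfolder.service."""


def once_exit_code(run_once: Callable[[], int]) -> int:
    """Map a run_once()-style callable (returns the error count) to an exit code."""
    try:
        errors = run_once()
    except PreflightError:
        return EXIT_PREFLIGHT
    return EXIT_FAILED_PDF if errors > 0 else EXIT_OK
```

A cron job or CI step wrapping `--once` can therefore distinguish "fix the host" (2) from "inspect the error folder" (1).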
|
||||
## [0.2.0] - 2026-04-08

### Added
- **Multi-instance support** via the systemd template unit `pdf-ocr-hotfolder@<name>.service`
- Per instance: its own config (`/etc/pdf-ocr-hotfolder/<name>.toml`), its own data directories (`/var/lib/pdf-ocr-hotfolder/<name>/…`), and optionally its own service user via drop-in
- **Instance manager in `install.sh`**: detects existing instances on re-run, asks for further ones, lists names + status
- `update.sh` automatically stops and starts **all** running instances

### Changed
- Single unit `pdf-ocr-hotfolder.service` replaced by the template unit `pdf-ocr-hotfolder@.service`
- The installer no longer asks for the service user once, but **per instance**

### Removed
- Old single config at `/etc/pdf-ocr-hotfolder/config.toml`: it is no longer created
|
||||
|
||||
## [0.1.0] - 2026-04-08
|
||||
|
||||
### Added
|
||||
|
||||
@@ -23,35 +23,56 @@ cd pdf-ocr-hotfolder
|
||||
sudo ./install.sh
|
||||
```
|
||||
|
||||
The installer asks for the service user. By default, a local system user `pdfocr` is created. If the user already exists (e.g. AD via SSSD), it is simply reused.
The installer:
1. Installs code + venv + systemd template unit once
2. Asks for instance name, base path, and service user
3. Creates as many hotfolder instances as you want (`Weitere Instanz anlegen? [j/N]`)

Then adjust the configuration:

```bash
sudo nano /etc/pdf-ocr-hotfolder/config.toml
sudo systemctl restart pdf-ocr-hotfolder
```
On every subsequent run, the installer detects existing instances and only asks about new ones.

Test:

```bash
cp irgendein-scan.pdf /var/lib/pdf-ocr-hotfolder/incoming/
journalctl -u pdf-ocr-hotfolder -f
cp irgendein-scan.pdf /var/lib/pdf-ocr-hotfolder/<instanz>/incoming/
journalctl -u pdf-ocr-hotfolder@<instanz> -f
```

After a few seconds the OCR PDF is at `/var/lib/pdf-ocr-hotfolder/outgoing/OCR_irgendein-scan.pdf`.
After a few seconds the OCR PDF is in the instance's `outgoing/` folder.
|
||||
|
||||
## Multi-instance operation

The tool is entirely **instance-based**, built on the systemd template unit `pdf-ocr-hotfolder@<name>.service`. Each instance has:

- its own config file: `/etc/pdf-ocr-hotfolder/<name>.toml`
- its own data directories: `/var/lib/pdf-ocr-hotfolder/<name>/{incoming,working,outgoing,error}/`
- its own systemd unit: `pdf-ocr-hotfolder@<name>.service`
- optionally its own service user (via the drop-in `/etc/systemd/system/pdf-ocr-hotfolder@<name>.service.d/user.conf`)

Example with 3 hotfolders:

```bash
sudo ./install.sh
# → creates e.g. kunde-a, kunde-b, buchhaltung

systemctl status 'pdf-ocr-hotfolder@*'
journalctl -u pdf-ocr-hotfolder@kunde-a -f
```

To add another instance later, run `install.sh` again; it will ask again.
|
||||
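The naming scheme above means everything about an instance can be derived from its name. A minimal Python sketch of that mapping, assuming the default base path (the installer also accepts a custom one) and using a hypothetical `InstanceLayout` class that does not exist in the project:

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

DATA_KINDS = ("incoming", "working", "outgoing", "error")


@dataclass(frozen=True)
class InstanceLayout:
    """Derives unit name and default paths for one hotfolder instance."""

    name: str

    @property
    def unit(self) -> str:
        # systemd substitutes the part after "@" for %i in the template unit.
        return f"pdf-ocr-hotfolder@{self.name}.service"

    @property
    def config(self) -> PurePosixPath:
        return PurePosixPath("/etc/pdf-ocr-hotfolder") / f"{self.name}.toml"

    def data_dir(self, kind: str) -> PurePosixPath:
        """Default data directory; raises ValueError for unknown kinds."""
        if kind not in DATA_KINDS:
            raise ValueError(f"unknown data directory: {kind}")
        return PurePosixPath("/var/lib/pdf-ocr-hotfolder") / self.name / kind
```

For example, `InstanceLayout("kunde-a").config` is the file that the template unit's `--config /etc/pdf-ocr-hotfolder/%i.toml` resolves to for that instance.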
|
||||
## Directories

| Path | Purpose |
|------|---------|
| `/etc/pdf-ocr-hotfolder/config.toml` | Configuration |
| `/var/lib/pdf-ocr-hotfolder/incoming` | Inbox (the scanner writes here) |
| `/var/lib/pdf-ocr-hotfolder/working` | Working directory during OCR |
| `/var/lib/pdf-ocr-hotfolder/outgoing` | Outbox (finished PDFs) |
| `/var/lib/pdf-ocr-hotfolder/error` | PDFs that could not be processed |
| `/opt/pdf-ocr-hotfolder/` | Code + venv |
| `/opt/pdf-ocr-hotfolder/` | Code + venv (shared by all instances) |
| `/etc/pdf-ocr-hotfolder/<instanz>.toml` | Per-instance config |
| `/etc/systemd/system/pdf-ocr-hotfolder@.service` | systemd template unit |
| `/var/lib/pdf-ocr-hotfolder/<instanz>/incoming` | Inbox (the scanner writes here) |
| `/var/lib/pdf-ocr-hotfolder/<instanz>/working` | Working directory during OCR |
| `/var/lib/pdf-ocr-hotfolder/<instanz>/outgoing` | Outbox (finished PDFs) |
| `/var/lib/pdf-ocr-hotfolder/<instanz>/error` | Failed PDFs |
| `/var/log/pdf-ocr-hotfolder/` | Logs (in addition to journald) |
| `/var/backups/pdf-ocr-hotfolder/` | Update backups |
|
||||
|
||||
## Konfiguration
|
||||
|
||||
@@ -101,9 +122,14 @@ on = "errors" # always | errors | never
|
||||
## Service management

```bash
sudo systemctl status pdf-ocr-hotfolder
sudo systemctl restart pdf-ocr-hotfolder
journalctl -u pdf-ocr-hotfolder -f
# A specific instance
sudo systemctl status pdf-ocr-hotfolder@kunde-a
sudo systemctl restart pdf-ocr-hotfolder@kunde-a
journalctl -u pdf-ocr-hotfolder@kunde-a -f

# All instances
sudo systemctl status 'pdf-ocr-hotfolder@*'
sudo systemctl restart 'pdf-ocr-hotfolder@*'
```
|
||||
|
||||
## Update
|
||||
@@ -114,15 +140,22 @@ git pull
|
||||
sudo ./update.sh
|
||||
```
|
||||
|
||||
`update.sh`:
1. Stops all running instances
2. Backs up the old code to `/var/backups/pdf-ocr-hotfolder/`
3. Updates code + venv in `/opt/pdf-ocr-hotfolder/` and the systemd template unit
4. Restarts all previously running instances

Config files under `/etc/pdf-ocr-hotfolder/` are **never** overwritten.
The repo must remain in place, because `update.sh` copies from it.
|
||||
|
||||
## Manual run (one-shot)

Process the PDFs already in the inbox once, then exit:
Process an instance's existing PDFs once, then exit:

```bash
sudo -u pdfocr /opt/pdf-ocr-hotfolder/venv/bin/python -m pdf_ocr_hotfolder \
    --config /etc/pdf-ocr-hotfolder/config.toml --once
    --config /etc/pdf-ocr-hotfolder/kunde-a.toml --once
```
|
||||
|
||||
## Troubleshooting
|
||||
@@ -172,5 +205,5 @@ MIT — © Sonith UG
|
||||
|
||||
---
|
||||
|
||||
**Version:** 0.1.0
|
||||
**Version:** 0.2.0
|
||||
**Repo:** https://gitea.sonith.de/sonith_ug/pdf-ocr-hotfolder
|
||||
|
||||
+175
-89
@@ -1,11 +1,13 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# PDF OCR Hotfolder — Installer für Debian 12/13
|
||||
# PDF OCR Hotfolder — Installer / Instanz-Manager für Debian 12/13
|
||||
#
|
||||
# Fragt interaktiv nach dem Service-User. Unterstützt:
|
||||
# - Lokal anlegen (neuer System-User)
|
||||
# - Bereits existierender lokaler User
|
||||
# - AD-User mit lokaler UID (z.B. via SSSD/Winbind)
|
||||
# Basis-Installation erfolgt einmalig (Code, venv, systemd-Template-Unit).
|
||||
# Danach werden Hotfolder-Instanzen verwaltet:
|
||||
# - Beim Erstlauf: mindestens eine Instanz wird angelegt
|
||||
# - Beim Folgelauf: bestehende Instanzen werden erkannt; neue können ergänzt werden
|
||||
#
|
||||
# Unterstützt lokale System-User und AD-User mit lokaler UID (SSSD/Winbind).
|
||||
#
|
||||
|
||||
set -euo pipefail
|
||||
@@ -14,7 +16,7 @@ RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC
|
||||
log_info() { echo -e "${GREEN}[INFO]${NC} $*"; }
|
||||
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
|
||||
log_error() { echo -e "${RED}[ERROR]${NC} $*"; }
|
||||
log_step() { echo -e "${BLUE}==>${NC} $*"; }
|
||||
log_step() { echo -e "\n${BLUE}==>${NC} $*"; }
|
||||
|
||||
if [ "${EUID}" -ne 0 ]; then
|
||||
log_error "Bitte als root ausführen: sudo ./install.sh"
|
||||
@@ -23,9 +25,10 @@ fi
|
||||
|
||||
INSTALL_DIR="/opt/pdf-ocr-hotfolder"
|
||||
CONFIG_DIR="/etc/pdf-ocr-hotfolder"
|
||||
DATA_DIR="/var/lib/pdf-ocr-hotfolder"
|
||||
DATA_ROOT="/var/lib/pdf-ocr-hotfolder"
|
||||
LOG_DIR="/var/log/pdf-ocr-hotfolder"
|
||||
SERVICE_NAME="pdf-ocr-hotfolder"
|
||||
SERVICE_TEMPLATE="pdf-ocr-hotfolder@.service"
|
||||
DEFAULT_USER="pdfocr"
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
REPO_DIR="$SCRIPT_DIR"
|
||||
@@ -35,123 +38,206 @@ if [ ! -f "$REPO_DIR/pdf_ocr_hotfolder/__init__.py" ]; then
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo
|
||||
echo "=========================================="
|
||||
echo " PDF OCR Hotfolder — Installation"
|
||||
echo "=========================================="
|
||||
echo
|
||||
|
||||
# ============ 1. System-Dependencies ============
|
||||
log_step "Installiere System-Pakete"
|
||||
# ============================================================
|
||||
# Basis-Installation (idempotent)
|
||||
# ============================================================
|
||||
|
||||
install_base() {
|
||||
log_step "System-Pakete installieren"
|
||||
apt-get update -qq
|
||||
apt-get install -y --no-install-recommends \
|
||||
python3 python3-venv python3-pip \
|
||||
tesseract-ocr tesseract-ocr-deu tesseract-ocr-eng \
|
||||
ghostscript qpdf unpaper pngquant \
|
||||
icc-profiles-free \
|
||||
ca-certificates curl
|
||||
icc-profiles-free ca-certificates curl
|
||||
log_info "System-Pakete ok ✓"
|
||||
|
||||
log_info "System-Pakete installiert ✓"
|
||||
|
||||
# ============ 2. Service-User ============
|
||||
log_step "Service-User konfigurieren"
|
||||
|
||||
read -r -p "Service-User-Name [pdfocr]: " SERVICE_USER
|
||||
SERVICE_USER="${SERVICE_USER:-pdfocr}"
|
||||
|
||||
if id "$SERVICE_USER" &>/dev/null; then
|
||||
log_info "User '$SERVICE_USER' existiert bereits (lokal oder via AD)."
|
||||
SERVICE_GROUP="$(id -gn "$SERVICE_USER")"
|
||||
log_info "Verwende bestehende primäre Gruppe: $SERVICE_GROUP"
|
||||
log_step "Default-User '$DEFAULT_USER' prüfen"
|
||||
if id "$DEFAULT_USER" &>/dev/null; then
|
||||
log_info "'$DEFAULT_USER' existiert bereits"
|
||||
else
|
||||
log_warn "User '$SERVICE_USER' existiert nicht."
|
||||
read -r -p "Lokal als System-User anlegen? [J/n]: " CREATE_USER
|
||||
CREATE_USER="${CREATE_USER:-J}"
|
||||
if [[ "$CREATE_USER" =~ ^[JjYy]$ ]]; then
|
||||
adduser --system --group --home "$DATA_DIR" --shell /usr/sbin/nologin "$SERVICE_USER"
|
||||
SERVICE_GROUP="$SERVICE_USER"
|
||||
log_info "Lokaler System-User '$SERVICE_USER' angelegt ✓"
|
||||
else
|
||||
log_error "User '$SERVICE_USER' muss vor der Installation existieren (z.B. via AD/SSSD)."
|
||||
log_error "Lege ihn an oder wähle einen existierenden Namen."
|
||||
exit 1
|
||||
fi
|
||||
adduser --system --group --home "$DATA_ROOT" --shell /usr/sbin/nologin "$DEFAULT_USER"
|
||||
log_info "System-User '$DEFAULT_USER' angelegt ✓"
|
||||
fi
|
||||
|
||||
# ============ 3. Verzeichnisse ============
|
||||
log_step "Verzeichnisse erstellen"
|
||||
|
||||
mkdir -p "$INSTALL_DIR" "$CONFIG_DIR" "$LOG_DIR"
|
||||
mkdir -p "$DATA_DIR"/{incoming,outgoing,working,error}
|
||||
log_step "Verzeichnisse anlegen"
|
||||
mkdir -p "$INSTALL_DIR" "$CONFIG_DIR" "$DATA_ROOT" "$LOG_DIR"
|
||||
chown root:"$DEFAULT_USER" "$CONFIG_DIR"
|
||||
chmod 750 "$CONFIG_DIR"
|
||||
|
||||
log_step "Code kopieren"
|
||||
rm -rf "$INSTALL_DIR/pdf_ocr_hotfolder"
|
||||
cp -r "$REPO_DIR/pdf_ocr_hotfolder" "$INSTALL_DIR/"
|
||||
cp "$REPO_DIR/requirements.txt" "$INSTALL_DIR/"
|
||||
cp "$REPO_DIR/VERSION" "$INSTALL_DIR/"
|
||||
cp "$REPO_DIR/config.example.toml" "$INSTALL_DIR/"
|
||||
echo "$REPO_DIR" > "$INSTALL_DIR/.repo_path"
|
||||
|
||||
if [ ! -f "$CONFIG_DIR/config.toml" ]; then
|
||||
cp "$REPO_DIR/config.example.toml" "$CONFIG_DIR/config.toml"
|
||||
log_info "Beispiel-Konfig nach $CONFIG_DIR/config.toml kopiert"
|
||||
else
|
||||
log_info "Bestehende Konfig $CONFIG_DIR/config.toml bleibt unverändert"
|
||||
fi
|
||||
|
||||
log_info "Verzeichnisse erstellt ✓"
|
||||
|
||||
# ============ 4. Python venv ============
|
||||
log_step "Python venv anlegen"
|
||||
|
||||
log_step "Python venv"
|
||||
if [ ! -d "$INSTALL_DIR/venv" ]; then
|
||||
python3 -m venv "$INSTALL_DIR/venv"
|
||||
fi
|
||||
"$INSTALL_DIR/venv/bin/pip" install --upgrade pip -q
|
||||
"$INSTALL_DIR/venv/bin/pip" install -r "$INSTALL_DIR/requirements.txt" -q
|
||||
log_info "venv ok ✓"
|
||||
|
||||
log_info "venv bereit ✓"
|
||||
log_step "systemd Template-Unit installieren"
|
||||
cp "$REPO_DIR/systemd/$SERVICE_TEMPLATE" "/etc/systemd/system/$SERVICE_TEMPLATE"
|
||||
systemctl daemon-reload
|
||||
log_info "Template-Unit installiert ✓"
|
||||
|
||||
# ============ 5. Berechtigungen ============
|
||||
log_step "Berechtigungen setzen"
|
||||
chown -R "$DEFAULT_USER":"$DEFAULT_USER" "$INSTALL_DIR" "$LOG_DIR"
|
||||
}
|
||||
|
||||
chown -R "$SERVICE_USER:$SERVICE_GROUP" "$INSTALL_DIR" "$DATA_DIR" "$LOG_DIR"
|
||||
chown root:"$SERVICE_GROUP" "$CONFIG_DIR"
|
||||
chmod 750 "$CONFIG_DIR"
|
||||
if [ -f "$CONFIG_DIR/config.toml" ]; then
|
||||
chown root:"$SERVICE_GROUP" "$CONFIG_DIR/config.toml"
|
||||
chmod 640 "$CONFIG_DIR/config.toml"
|
||||
# ============================================================
|
||||
# Instanz-Verwaltung
|
||||
# ============================================================
|
||||
|
||||
list_instances() {
|
||||
find "$CONFIG_DIR" -maxdepth 1 -name '*.toml' -type f 2>/dev/null \
|
||||
| sed 's|.*/||; s|\.toml$||' \
|
||||
| sort
|
||||
}
|
||||
|
||||
show_existing_instances() {
|
||||
local instances
|
||||
mapfile -t instances < <(list_instances)
|
||||
if [ "${#instances[@]}" -eq 0 ]; then
|
||||
log_info "Keine bestehenden Instanzen gefunden."
|
||||
return
|
||||
fi
|
||||
echo
|
||||
log_info "Bestehende Instanzen:"
|
||||
for name in "${instances[@]}"; do
|
||||
local active
|
||||
active=$(systemctl is-active "pdf-ocr-hotfolder@${name}.service" 2>/dev/null || echo inactive)
|
||||
printf " • %-30s [%s]\n" "$name" "$active"
|
||||
done
|
||||
echo
|
||||
}
|
||||
|
||||
create_instance() {
|
||||
echo
|
||||
read -r -p "Instanz-Name (nur a-z, 0-9, -): " INST
|
||||
if [[ ! "$INST" =~ ^[a-z0-9][a-z0-9-]*$ ]]; then
|
||||
log_error "Ungültiger Name. Abbruch."
|
||||
return 1
|
||||
fi
|
||||
if [ -f "$CONFIG_DIR/$INST.toml" ]; then
|
||||
log_error "Instanz '$INST' existiert bereits. Abbruch."
|
||||
return 1
|
||||
fi
|
||||
|
||||
log_info "Berechtigungen gesetzt ✓"
|
||||
local default_base="$DATA_ROOT/$INST"
|
||||
read -r -p "Basis-Pfad für Daten [$default_base]: " BASE
|
||||
BASE="${BASE:-$default_base}"
|
||||
|
||||
# ============ 6. systemd-Unit ============
|
||||
log_step "systemd-Unit installieren"
|
||||
read -r -p "Service-User [$DEFAULT_USER]: " SVC_USER
|
||||
SVC_USER="${SVC_USER:-$DEFAULT_USER}"
|
||||
|
||||
sed -e "s|__SERVICE_USER__|$SERVICE_USER|g" \
|
||||
-e "s|__SERVICE_GROUP__|$SERVICE_GROUP|g" \
|
||||
"$REPO_DIR/systemd/pdf-ocr-hotfolder.service" \
|
||||
> "/etc/systemd/system/${SERVICE_NAME}.service"
|
||||
local SVC_GROUP
|
||||
if id "$SVC_USER" &>/dev/null; then
|
||||
SVC_GROUP="$(id -gn "$SVC_USER")"
|
||||
log_info "User '$SVC_USER' existiert (Gruppe: $SVC_GROUP)"
|
||||
else
|
||||
log_warn "User '$SVC_USER' existiert nicht."
|
||||
read -r -p "Lokal als System-User anlegen? [J/n]: " CREATE_USER
|
||||
CREATE_USER="${CREATE_USER:-J}"
|
||||
if [[ "$CREATE_USER" =~ ^[JjYy]$ ]]; then
|
||||
adduser --system --group --home "$BASE" --shell /usr/sbin/nologin "$SVC_USER"
|
||||
SVC_GROUP="$SVC_USER"
|
||||
log_info "User '$SVC_USER' angelegt ✓"
|
||||
else
|
||||
log_error "User muss existieren (z.B. via AD/SSSD). Abbruch."
|
||||
return 1
|
||||
fi
|
||||
fi
|
||||
|
||||
log_info "Lege Datenverzeichnisse unter $BASE an..."
|
||||
mkdir -p "$BASE"/{incoming,outgoing,working,error}
|
||||
chown -R "$SVC_USER":"$SVC_GROUP" "$BASE"
|
||||
|
||||
log_info "Erstelle Config $CONFIG_DIR/$INST.toml..."
|
||||
sed \
|
||||
-e "s|/var/lib/pdf-ocr-hotfolder/incoming|$BASE/incoming|" \
|
||||
-e "s|/var/lib/pdf-ocr-hotfolder/outgoing|$BASE/outgoing|" \
|
||||
-e "s|/var/lib/pdf-ocr-hotfolder/working|$BASE/working|" \
|
||||
-e "s|/var/lib/pdf-ocr-hotfolder/error|$BASE/error|" \
|
||||
"$INSTALL_DIR/config.example.toml" > "$CONFIG_DIR/$INST.toml"
|
||||
chown root:"$SVC_GROUP" "$CONFIG_DIR/$INST.toml"
|
||||
chmod 640 "$CONFIG_DIR/$INST.toml"
|
||||
|
||||
# Drop-in für abweichenden Service-User
|
||||
if [ "$SVC_USER" != "$DEFAULT_USER" ]; then
|
||||
local DROPIN_DIR="/etc/systemd/system/pdf-ocr-hotfolder@${INST}.service.d"
|
||||
mkdir -p "$DROPIN_DIR"
|
||||
cat > "$DROPIN_DIR/user.conf" <<EOF
|
||||
[Service]
|
||||
User=$SVC_USER
|
||||
Group=$SVC_GROUP
|
||||
EOF
|
||||
log_info "Drop-in für User '$SVC_USER' erstellt"
|
||||
fi
|
||||
|
||||
systemctl daemon-reload
|
||||
systemctl enable "${SERVICE_NAME}.service"
|
||||
systemctl enable --now "pdf-ocr-hotfolder@${INST}.service"
|
||||
sleep 1
|
||||
if systemctl is-active --quiet "pdf-ocr-hotfolder@${INST}.service"; then
|
||||
log_info "✅ Instanz '$INST' läuft"
|
||||
else
|
||||
log_warn "Instanz '$INST' läuft nicht. Logs: journalctl -u pdf-ocr-hotfolder@${INST}"
|
||||
fi
|
||||
|
||||
log_info "systemd-Unit installiert & enabled ✓"
|
||||
echo
|
||||
echo " Config: $CONFIG_DIR/$INST.toml"
|
||||
echo " Eingang: $BASE/incoming"
|
||||
echo " Ausgang: $BASE/outgoing"
|
||||
echo " User: $SVC_USER ($SVC_GROUP)"
|
||||
echo
|
||||
}
|
||||
|
||||
# ============ 7. Start ============
|
||||
log_step "Service starten"
|
||||
systemctl restart "${SERVICE_NAME}.service"
|
||||
sleep 2
|
||||
systemctl --no-pager --lines=10 status "${SERVICE_NAME}.service" || true
|
||||
# ============================================================
|
||||
# Main
|
||||
# ============================================================
|
||||
|
||||
echo
|
||||
echo "=========================================="
|
||||
echo " Installation abgeschlossen"
|
||||
echo " PDF OCR Hotfolder — Installer"
|
||||
echo "=========================================="
|
||||
|
||||
if [ ! -d "$INSTALL_DIR/venv" ] || [ ! -f "/etc/systemd/system/$SERVICE_TEMPLATE" ]; then
|
||||
log_step "Basis-Installation"
|
||||
install_base
|
||||
else
|
||||
log_info "Basis-Installation bereits vorhanden ($INSTALL_DIR)"
|
||||
log_info "Überspringe Basis-Setup (nutze update.sh für Code-Updates)"
|
||||
fi
|
||||
|
||||
show_existing_instances
|
||||
|
||||
# Erste Instanz ist Pflicht, wenn noch keine vorhanden
|
||||
mapfile -t existing < <(list_instances)
|
||||
if [ "${#existing[@]}" -eq 0 ]; then
|
||||
log_info "Lege erste Hotfolder-Instanz an."
|
||||
create_instance || true
|
||||
fi
|
||||
|
||||
while true; do
|
||||
read -r -p "Weitere Instanz anlegen? [j/N]: " MORE
|
||||
MORE="${MORE:-N}"
|
||||
if [[ "$MORE" =~ ^[JjYy]$ ]]; then
|
||||
create_instance || true
|
||||
else
|
||||
break
|
||||
fi
|
||||
done
|
||||
|
||||
echo
|
||||
echo " Konfiguration: $CONFIG_DIR/config.toml"
|
||||
echo " Eingang: $DATA_DIR/incoming"
|
||||
echo " Ausgang: $DATA_DIR/outgoing"
|
||||
echo " Service-User: $SERVICE_USER ($SERVICE_GROUP)"
|
||||
echo
|
||||
echo " Logs: journalctl -u $SERVICE_NAME -f"
|
||||
echo "=========================================="
|
||||
echo " Fertig"
|
||||
echo "=========================================="
|
||||
show_existing_instances
|
||||
echo " Logs: journalctl -u pdf-ocr-hotfolder@<instanz> -f"
|
||||
echo " Neustart: systemctl restart pdf-ocr-hotfolder@<instanz>"
|
||||
echo " Update: sudo ./update.sh"
|
||||
echo
|
||||
|
||||
@@ -8,7 +8,7 @@ from pathlib import Path
|
||||
|
||||
from . import __version__
|
||||
from .config import load_config
|
||||
from .service import HotfolderService
|
||||
from .service import HotfolderService, PreflightError
|
||||
|
||||
|
||||
def _setup_logging(level: str) -> None:
|
||||
@@ -40,14 +40,20 @@ def main() -> int:
|
||||
_setup_logging(cfg.log_level)
|
||||
|
||||
service = HotfolderService(cfg)
|
||||
|
||||
if args.once:
|
||||
service._ensure_dirs() # noqa: SLF001
|
||||
service._scan_existing() # noqa: SLF001
|
||||
service._executor.shutdown(wait=True) # noqa: SLF001
|
||||
return 0
|
||||
try:
|
||||
errors = service.run_once()
|
||||
except PreflightError as e:
|
||||
print(f"FEHLER: {e}", file=sys.stderr)
|
||||
return 2
|
||||
return 1 if errors > 0 else 0
|
||||
|
||||
try:
|
||||
service.run()
|
||||
except PreflightError as e:
|
||||
print(f"FEHLER: {e}", file=sys.stderr)
|
||||
return 2
|
||||
except KeyboardInterrupt:
|
||||
pass
|
||||
return 0
|
||||
|
||||
@@ -7,8 +7,6 @@ import subprocess
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
|
||||
import ocrmypdf
|
||||
|
||||
from .config import OcrConfig, VeraPdfConfig
|
||||
|
||||
log = logging.getLogger(__name__)
|
||||
@@ -25,6 +23,8 @@ class ProcessResult:
|
||||
|
||||
def run_ocr(src: Path, dst: Path, cfg: OcrConfig) -> None:
    """Run ocrmypdf as a library call (no subprocess overhead)."""
    import ocrmypdf  # lazy, so tests can run without ocrmypdf

    kwargs: dict = {
        "language": cfg.languages,
        "jobs": cfg.jobs,
|
||||
|
||||
@@ -2,6 +2,7 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import shutil
|
||||
import signal
|
||||
import threading
|
||||
import time
|
||||
@@ -18,6 +19,27 @@ from .uploaders import notify_email, upload_folder, upload_nextcloud, upload_sft
|
||||
log = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class PreflightError(RuntimeError):
    """Required external binaries are missing."""


# Binaries required by ocrmypdf
_REQUIRED_BINARIES = ("tesseract", "gs")


def check_preflight() -> None:
    """Check that all external dependencies (Tesseract, Ghostscript) are installed.

    Raises PreflightError listing the missing binaries.
    """
    missing = [b for b in _REQUIRED_BINARIES if shutil.which(b) is None]
    if missing:
        raise PreflightError(
            "Fehlende Abhängigkeiten: " + ", ".join(missing)
            + ". Bitte installieren: sudo apt install tesseract-ocr ghostscript"
        )
|
||||
|
||||
|
||||
def _is_pdf(path: Path) -> bool:
|
||||
return path.suffix.lower() == ".pdf" and path.is_file()
|
||||
|
||||
@@ -70,10 +92,20 @@ class HotfolderService:
|
||||
self._stop = threading.Event()
|
||||
self._inflight: set[str] = set()
|
||||
self._lock = threading.Lock()
|
||||
self._success_count = 0
|
||||
self._error_count = 0
|
||||
|
||||
@property
|
||||
def success_count(self) -> int:
|
||||
return self._success_count
|
||||
|
||||
@property
|
||||
def error_count(self) -> int:
|
||||
return self._error_count
|
||||
|
||||
# ---- Setup ----
|
||||
|
||||
def _ensure_dirs(self) -> None:
|
||||
def ensure_dirs(self) -> None:
|
||||
for p in (self.cfg.paths.incoming, self.cfg.paths.outgoing,
|
||||
self.cfg.paths.working, self.cfg.paths.error):
|
||||
p.mkdir(parents=True, exist_ok=True)
|
||||
@@ -81,7 +113,8 @@ class HotfolderService:
|
||||
# ---- Lifecycle ----
|
||||
|
||||
def run(self) -> None:
|
||||
self._ensure_dirs()
|
||||
check_preflight()
|
||||
self.ensure_dirs()
|
||||
self._scan_existing()
|
||||
|
||||
self._observer = Observer()
|
||||
@@ -98,6 +131,20 @@ class HotfolderService:
|
||||
finally:
|
||||
self.shutdown()
|
||||
|
||||
    def run_once(self) -> int:
        """Process all PDFs already present in the incoming folder, then exit.

        Returns:
            Number of failed PDFs (0 = all ok).
        """
        check_preflight()
        self.ensure_dirs()
        self._scan_existing()
        self._executor.shutdown(wait=True)
        log.info("One-shot fertig: %d ok, %d Fehler",
                 self._success_count, self._error_count)
        return self._error_count
|
||||
|
||||
def shutdown(self) -> None:
|
||||
log.info("Shutdown läuft...")
|
||||
if self._observer:
|
||||
@@ -150,6 +197,12 @@ class HotfolderService:
|
||||
vera_cfg=self.cfg.verapdf,
|
||||
)
|
||||
|
||||
with self._lock:
|
||||
if result.success:
|
||||
self._success_count += 1
|
||||
else:
|
||||
self._error_count += 1
|
||||
|
||||
if result.success:
|
||||
self._dispatch_uploads(result.output)
|
||||
self._notify(result)
|
||||
|
||||
@@ -1,13 +1,13 @@
|
||||
[Unit]
|
||||
Description=PDF OCR Hotfolder
|
||||
Description=PDF OCR Hotfolder (Instance: %i)
|
||||
After=network-online.target
|
||||
Wants=network-online.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=__SERVICE_USER__
|
||||
Group=__SERVICE_GROUP__
|
||||
ExecStart=/opt/pdf-ocr-hotfolder/venv/bin/python -m pdf_ocr_hotfolder --config /etc/pdf-ocr-hotfolder/config.toml
|
||||
User=pdfocr
|
||||
Group=pdfocr
|
||||
ExecStart=/opt/pdf-ocr-hotfolder/venv/bin/python -m pdf_ocr_hotfolder --config /etc/pdf-ocr-hotfolder/%i.toml
|
||||
Restart=on-failure
|
||||
RestartSec=5
|
||||
KillMode=mixed
|
||||
@@ -0,0 +1,52 @@
+"""Shared pytest fixtures."""
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from pdf_ocr_hotfolder.config import (
+    Config,
+    EmailNotify,
+    FolderUpload,
+    NextcloudUpload,
+    OcrConfig,
+    Paths,
+    SftpUpload,
+    VeraPdfConfig,
+)
+
+
+@pytest.fixture
+def tmp_config(tmp_path: Path) -> Config:
+    """Minimal config with tmp_path directories, all uploads disabled."""
+    paths = Paths(
+        incoming=tmp_path / "incoming",
+        outgoing=tmp_path / "outgoing",
+        working=tmp_path / "working",
+        error=tmp_path / "error",
+    )
+    for p in (paths.incoming, paths.outgoing, paths.working, paths.error):
+        p.mkdir(parents=True, exist_ok=True)
+
+    return Config(
+        paths=paths,
+        ocr=OcrConfig(max_workers=1),
+        verapdf=VeraPdfConfig(enabled=False),
+        folder=FolderUpload(enabled=False),
+        nextcloud=NextcloudUpload(enabled=False),
+        sftp=SftpUpload(enabled=False),
+        email=EmailNotify(enabled=False),
+        log_level="DEBUG",
+    )
+
+
+@pytest.fixture
+def dummy_pdf(tmp_config: Config) -> Path:
+    """Places a file with a .pdf extension in the incoming folder.
+
+    Note: this is not a real PDF. Tests mock `process_pdf`.
+    """
+    pdf = tmp_config.paths.incoming / "test.pdf"
+    pdf.write_bytes(b"%PDF-1.4 fake\n")
+    return pdf
@@ -0,0 +1,96 @@
+"""Tests for issue #2: --once mode must return a non-zero exit code on errors."""
+from __future__ import annotations
+
+from pathlib import Path
+from unittest.mock import patch
+
+from pdf_ocr_hotfolder.processor import ProcessResult
+from pdf_ocr_hotfolder.service import HotfolderService
+
+
+def _fake_success(src: Path, working_dir, outgoing_dir, error_dir, ocr_cfg, vera_cfg):
+    out = outgoing_dir / f"OCR_{src.name}"
+    out.parent.mkdir(parents=True, exist_ok=True)
+    out.write_bytes(b"%PDF-1.4 ocr\n")
+    src.unlink(missing_ok=True)
+    return ProcessResult(src, out, True)
+
+
+def _fake_failure(src: Path, working_dir, outgoing_dir, error_dir, ocr_cfg, vera_cfg):
+    error_dir.mkdir(parents=True, exist_ok=True)
+    dest = error_dir / src.name
+    src.rename(dest)
+    return ProcessResult(src, outgoing_dir / f"OCR_{src.name}", False,
+                         error="fake ocr failure")
+
+
+def _run(tmp_config, fake_process):
+    """Helper: runs run_once() with process_pdf and preflight mocked."""
+    with patch("pdf_ocr_hotfolder.service.check_preflight", return_value=None), \
+         patch("pdf_ocr_hotfolder.service.process_pdf", side_effect=fake_process), \
+         patch("pdf_ocr_hotfolder.service._wait_until_stable", return_value=True):
+        service = HotfolderService(tmp_config)
+        try:
+            return service.run_once()
+        finally:
+            service._executor.shutdown(wait=False)
+
+
+def test_once_exit_0_when_no_files(tmp_config) -> None:
+    """Scenario: no PDFs present → exit 0."""
+    errors = _run(tmp_config, _fake_success)
+    assert errors == 0
+
+
+def test_once_exit_0_when_all_success(tmp_config) -> None:
+    """Scenario: all PDFs succeed → exit 0."""
+    (tmp_config.paths.incoming / "a.pdf").write_bytes(b"%PDF-1.4\n")
+    (tmp_config.paths.incoming / "b.pdf").write_bytes(b"%PDF-1.4\n")
+
+    errors = _run(tmp_config, _fake_success)
+    assert errors == 0
+
+
+def test_once_exit_nonzero_when_all_fail(tmp_config) -> None:
+    """Scenario: all PDFs fail → exit != 0 (issue #2)."""
+    (tmp_config.paths.incoming / "a.pdf").write_bytes(b"%PDF-1.4\n")
+    (tmp_config.paths.incoming / "b.pdf").write_bytes(b"%PDF-1.4\n")
+
+    errors = _run(tmp_config, _fake_failure)
+    assert errors == 2
+
+
+def test_once_exit_nonzero_when_some_fail(tmp_config) -> None:
+    """Scenario: some PDFs fail → exit != 0."""
+    (tmp_config.paths.incoming / "ok.pdf").write_bytes(b"%PDF-1.4\n")
+    (tmp_config.paths.incoming / "bad.pdf").write_bytes(b"%PDF-1.4\n")
+
+    def mixed(src, *args, **kwargs):
+        if "bad" in src.name:
+            return _fake_failure(src, *args, **kwargs)
+        return _fake_success(src, *args, **kwargs)
+
+    errors = _run(tmp_config, mixed)
+    assert errors == 1
+
+
+def test_counters_track_success_and_failure(tmp_config) -> None:
+    """success_count and error_count must be tracked correctly."""
+    (tmp_config.paths.incoming / "ok.pdf").write_bytes(b"%PDF-1.4\n")
+    (tmp_config.paths.incoming / "bad.pdf").write_bytes(b"%PDF-1.4\n")
+
+    def mixed(src, *args, **kwargs):
+        if "bad" in src.name:
+            return _fake_failure(src, *args, **kwargs)
+        return _fake_success(src, *args, **kwargs)
+
+    with patch("pdf_ocr_hotfolder.service.check_preflight", return_value=None), \
+         patch("pdf_ocr_hotfolder.service.process_pdf", side_effect=mixed), \
+         patch("pdf_ocr_hotfolder.service._wait_until_stable", return_value=True):
+        service = HotfolderService(tmp_config)
+        try:
+            service.run_once()
+            assert service.success_count == 1
+            assert service.error_count == 1
+        finally:
+            service._executor.shutdown(wait=False)
@@ -0,0 +1,75 @@
+"""Tests for issue #1: preflight check when Tesseract is missing."""
+from __future__ import annotations
+
+import sys
+from unittest.mock import patch
+
+import pytest
+
+from pdf_ocr_hotfolder.service import (
+    HotfolderService,
+    PreflightError,
+    check_preflight,
+)
+
+
+def test_preflight_passes_when_all_binaries_present() -> None:
+    """If tesseract and gs are on the PATH, no error may be raised."""
+    with patch("pdf_ocr_hotfolder.service.shutil.which", return_value="/usr/bin/fake"):
+        check_preflight()  # must not raise
+
+
+def test_preflight_fails_when_tesseract_missing() -> None:
+    """Missing tesseract → PreflightError with a matching message."""
+    def fake_which(name: str) -> str | None:
+        return None if name == "tesseract" else "/usr/bin/fake"
+
+    with patch("pdf_ocr_hotfolder.service.shutil.which", side_effect=fake_which):
+        with pytest.raises(PreflightError, match="tesseract"):
+            check_preflight()
+
+
+def test_preflight_fails_when_ghostscript_missing() -> None:
+    def fake_which(name: str) -> str | None:
+        return None if name == "gs" else "/usr/bin/fake"
+
+    with patch("pdf_ocr_hotfolder.service.shutil.which", side_effect=fake_which):
+        with pytest.raises(PreflightError, match="gs"):
+            check_preflight()
+
+
+def test_preflight_lists_all_missing_binaries() -> None:
+    """When several binaries are missing, all of them are listed."""
+    with patch("pdf_ocr_hotfolder.service.shutil.which", return_value=None):
+        with pytest.raises(PreflightError) as exc_info:
+            check_preflight()
+    msg = str(exc_info.value)
+    assert "tesseract" in msg
+    assert "gs" in msg
+
+
+def test_run_once_raises_preflight_error(tmp_config) -> None:
+    """HotfolderService.run_once() raises PreflightError when tesseract is missing."""
+    service = HotfolderService(tmp_config)
+    try:
+        with patch("pdf_ocr_hotfolder.service.shutil.which", return_value=None):
+            with pytest.raises(PreflightError):
+                service.run_once()
+    finally:
+        service._executor.shutdown(wait=False)
+
+
+def test_main_returns_2_on_preflight_error(tmp_config, tmp_path, monkeypatch) -> None:
+    """The CLI returns exit code 2 on a preflight error (issue #1 scenario)."""
+    cfg_file = tmp_path / "cfg.toml"
+    cfg_file.write_text(f"""
+[paths]
+incoming = "{tmp_config.paths.incoming}"
+outgoing = "{tmp_config.paths.outgoing}"
+working = "{tmp_config.paths.working}"
+error = "{tmp_config.paths.error}"
+""")
+    monkeypatch.setattr(sys, "argv", ["pdf-ocr-hotfolder", "--config", str(cfg_file), "--once"])
+    with patch("pdf_ocr_hotfolder.service.shutil.which", return_value=None):
+        from pdf_ocr_hotfolder.__main__ import main
+        assert main() == 2
@@ -2,6 +2,10 @@
 #
 # PDF OCR Hotfolder - Update Script
 #
+# Updates the code and venv under /opt/pdf-ocr-hotfolder/ as well as the
+# systemd template unit. All running instances are then restarted.
+# Config files under /etc/pdf-ocr-hotfolder/ are left unchanged.
+#
 set -euo pipefail
 
 RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; NC='\033[0m'
@@ -15,7 +19,8 @@ if [ "${EUID}" -ne 0 ]; then
 fi
 
 INSTALL_DIR="/opt/pdf-ocr-hotfolder"
-SERVICE_NAME="pdf-ocr-hotfolder"
+CONFIG_DIR="/etc/pdf-ocr-hotfolder"
+SERVICE_TEMPLATE="pdf-ocr-hotfolder@.service"
 
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 if [ -f "$SCRIPT_DIR/pdf_ocr_hotfolder/__init__.py" ]; then
@@ -42,12 +47,18 @@ log_info "Install: $INSTALL_DIR"
 log_info "Version: $OLD_VERSION → $NEW_VERSION"
 echo
 
-# Read the service user from the systemd unit
-SERVICE_USER="$(awk -F= '/^User=/{print $2}' /etc/systemd/system/${SERVICE_NAME}.service 2>/dev/null || echo pdfocr)"
-SERVICE_GROUP="$(awk -F= '/^Group=/{print $2}' /etc/systemd/system/${SERVICE_NAME}.service 2>/dev/null || echo pdfocr)"
+# Determine running instances
+mapfile -t RUNNING < <(systemctl list-units --no-legend --state=active 'pdf-ocr-hotfolder@*.service' 2>/dev/null | awk '{print $1}')
+if [ "${#RUNNING[@]}" -gt 0 ]; then
+    log_info "Running instances: ${RUNNING[*]}"
+else
+    log_info "No running instances."
+fi
 
-log_info "Stopping service..."
-systemctl stop "${SERVICE_NAME}.service" 2>/dev/null || true
+log_info "Stopping running instances..."
+for unit in "${RUNNING[@]}"; do
+    systemctl stop "$unit" || true
+done
 
 log_info "Creating backup..."
 BACKUP_DIR="/var/backups/pdf-ocr-hotfolder"
@@ -60,29 +71,40 @@ rm -rf "$INSTALL_DIR/pdf_ocr_hotfolder"
 cp -r "$REPO_DIR/pdf_ocr_hotfolder" "$INSTALL_DIR/"
 cp "$REPO_DIR/requirements.txt" "$INSTALL_DIR/"
 cp "$REPO_DIR/VERSION" "$INSTALL_DIR/"
 cp "$REPO_DIR/config.example.toml" "$INSTALL_DIR/"
 echo "$REPO_DIR" > "$INSTALL_DIR/.repo_path"
 
 log_info "Updating dependencies..."
 "$INSTALL_DIR/venv/bin/pip" install --upgrade pip -q
 "$INSTALL_DIR/venv/bin/pip" install --upgrade -r "$INSTALL_DIR/requirements.txt" -q
 
-log_info "Updating systemd unit..."
-sed -e "s|__SERVICE_USER__|$SERVICE_USER|g" \
-    -e "s|__SERVICE_GROUP__|$SERVICE_GROUP|g" \
-    "$REPO_DIR/systemd/pdf-ocr-hotfolder.service" \
-    > "/etc/systemd/system/${SERVICE_NAME}.service"
+log_info "Updating systemd template unit..."
+cp "$REPO_DIR/systemd/$SERVICE_TEMPLATE" "/etc/systemd/system/$SERVICE_TEMPLATE"
 systemctl daemon-reload
 
 log_info "Setting permissions..."
-chown -R "$SERVICE_USER:$SERVICE_GROUP" "$INSTALL_DIR"
+# The code stays owned by the primary user (pdfocr); instances may run
+# as a different user, but they only read the code.
+PRIMARY_USER="$(stat -c '%U' "$INSTALL_DIR/venv" 2>/dev/null || echo pdfocr)"
+chown -R "$PRIMARY_USER":"$PRIMARY_USER" "$INSTALL_DIR"
 
-log_info "Starting service..."
-systemctl start "${SERVICE_NAME}.service"
-sleep 2
-
-if systemctl is-active --quiet "${SERVICE_NAME}.service"; then
-    log_info "✅ Service is running (version $NEW_VERSION)"
+log_info "Restarting instances..."
+FAIL=0
+for unit in "${RUNNING[@]}"; do
+    systemctl start "$unit" || true
+    sleep 1
+    if systemctl is-active --quiet "$unit"; then
+        log_info "  ✅ $unit"
     else
-    log_error "Service is not running. journalctl -u $SERVICE_NAME -n 30"
+        log_error "  ❌ $unit: journalctl -u $unit -n 30"
+        FAIL=1
     fi
+done
+
+echo
+if [ "$FAIL" -eq 0 ]; then
+    log_info "Update to $NEW_VERSION complete ✓"
+else
+    log_warn "Update finished, but at least one instance is not running."
+    exit 1
+fi