I Do the Same 4 Steps Every Time I Scan a Document. So I Automated All of Them. [Devlog #5]
All tests run on an 8-year-old MacBook Air.

I do the same four steps every time I scan a document. Every. Single. Time. Not complex enough to write a script for. Too repetitive to keep doing manually.

So I built a pipeline engine into the app. Here's how the architecture works.

Each operation (OCR, compress, encrypt, rename, watermark, save) is a typed StepType. A pipeline is just an ordered list of steps, each of which can be switched on or off.

    use serde::{Deserialize, Serialize};
    use std::path::PathBuf;

    // CompressionLevel is not shown here.
    #[derive(Debug, Clone, Serialize, Deserialize)]
    pub enum StepType {
        Ocr,
        Compress { level: CompressionLevel },
        Encrypt { password: String },
        Rename { template: String },
        Save { destination: PathBuf },
        Watermark { text: String, opacity: f32 },
    }

    // One entry in a pipeline; disabled steps stay in the list but are skipped.
    #[derive(Debug, Clone, Serialize, Deserialize)]
    pub struct PipelineStep {
        pub step_type: StepType,
        pub enabled: bool,
    }

    #[derive(Debug, Clone, Serialize, Deserialize)]
    pub struct Pipeline {
        pub name: String,
        pub steps: Vec<PipelineStep>,
    }

Storing pipelines as serializable data means users can save, share, and reload them. The UI just edits a JSON blob.

Each step's output becomes the next step's input. If any step fails, the whole pipeline halts and temp files are cleaned up.

    use std::path::{Path, PathBuf};

    // The error type is an assumption (the type parameters of the original
    // signature were lost); substitute the app's own error type.
    pub async fn run_pipeline(
        pipeline: &Pipeline,
        input_path: &Path,
    ) -> Result<PathBuf, Box<dyn std::error::Error + Send + Sync>> {
        let mut current_path = input_path.to_path_buf();

        // Only enabled steps run; each one returns the path of the file it produced.
        for step in pipeline.steps.iter().filter(|s| s.enabled) {
            current_path = match &step.step_type {
                StepType::Ocr => run_ocr(&current_path).await?,
                StepType::Compress { level } => compress_pdf(&current_path, level).await?,
                StepType::Encrypt { password } => encrypt_pdf(&current_path, password).await?,
                StepType::Rename { template } => rename_file(&current_path, template).await?,
                StepType::Save { destination } => save_to(&current_path, destination).await?,
                StepType::Watermark { text, opacity } => {
                    add_watermark(&current_path, text, *opacity).await?
                }
            };
        }

        Ok(current_path)
    }

The ? operator handles error propagation cleanly: the first failing step returns its error and no later step runs. Each step function is independently testable.

Then there's the Hot Folder. Point the app at a folder, and any PDF that lands there triggers the pipeline immediately.
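The check that decides whether a new file counts as a PDF is small enough to test on its own. Here is a minimal sketch, assuming some scanners write upper-case extensions such as SCAN001.PDF. The name is_pdf is a hypothetical helper for illustration, not something from the app, and it uses only the standard library.

```rust
use std::path::Path;

/// Hypothetical helper (not from the app): returns true when `path` has a
/// `pdf` extension, compared case-insensitively so `SCAN001.PDF` also matches.
pub fn is_pdf(path: &Path) -> bool {
    path.extension()
        // Extensions that are not valid UTF-8 are treated as "not a PDF".
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"))
}
```

With this, is_pdf(Path::new("SCAN001.PDF")) is true, while a file named just pdf (no extension at all) is not matched. A plain comparison against lower-case "pdf" would skip the upper-case case.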
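    use notify::{watcher, DebouncedEvent, RecursiveMode, Watcher};
    use std::path::Path;
    use std::time::Duration;

    // Uses the notify 4.x debounced API. rx.recv() blocks, so call this from a
    // blocking thread inside a Tokio runtime (e.g. tokio::task::spawn_blocking).
    // The return type is an assumption; the loop only exits if the channel closes.
    pub fn watch_folder(folder: &Path, pipeline: Pipeline) -> notify::Result<()> {
        let (tx, rx) = std::sync::mpsc::channel();
        let mut watcher = watcher(tx, Duration::from_secs(1))?;
        watcher.watch(folder, RecursiveMode::NonRecursive)?;

        loop {
            match rx.recv() {
                Ok(DebouncedEvent::Create(path)) => {
                    if path.extension().map_or(false, |e| e == "pdf") {
                        // A spawned task must own its data, so clone the
                        // pipeline and move both values into the task.
                        let pipeline = pipeline.clone();
                        tokio::spawn(async move {
                            if let Err(e) = run_pipeline(&pipeline, &path).await {
                                eprintln!("pipeline failed for {}: {}", path.display(), e);
                            }
                        });
                    }
                }
                Ok(_) => {}
                // recv() only fails once the sender is gone, so stop instead of spinning.
                Err(e) => {
                    eprintln!("watch error: {:?}", e);
                    return Ok(());
                }
            }
        }
    }

Set your scanner's output folder as the Hot Folder. Scan the document, and by the time you walk back to your desk it's already OCR'd, compressed, and filed.

In the UI, steps are drag-and-drop reorderable. You can toggle individual steps on or off without deleting them, and save named pipelines for different workflows.

Forensic Deep Purge and Stealth Watermark: invisible security. How do you prove a document leaked without the leaker knowing you can prove it?

Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
@hiyoyok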
