From 205ab25d41623d4b97c7fbaa8c73003dd0ad080f Mon Sep 17 00:00:00 2001
From: m00d
Date: Wed, 1 Oct 2025 07:13:51 +0200
Subject: [PATCH] Sketch metadata-to-Rust generator plan

---
 ai/notes.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/ai/notes.md b/ai/notes.md
index fd2e428..8bf1323 100644
--- a/ai/notes.md
+++ b/ai/notes.md
@@ -13,3 +13,34 @@ auto-fill checksums when harvesting metadata for MLFS/BLFS/GLFS packages.
   sources.
 - Benefits: avoids fragile HTML tables, keeps URLs aligned with official build
   scripts, and ensures checksums are up-to-date.
+
+# Metadata → Rust Module Strategy
+
+Goal: emit Rust modules under `src/pkgs/by_name` directly from harvested
+metadata once MLFS/BLFS/GLFS records are validated.
+
+Outline:
+1. **Schema alignment** – Ensure harvested JSON carries everything the
+   `PackageDefinition` constructor expects (source URLs, checksums, build
+   commands, dependencies, optimisation flags, notes/stage metadata).
+2. **Translation layer** – Implement a converter (likely in a new module,
+   e.g. `src/pkgs/generator.rs`) that reads a metadata JSON file and produces a
+   `ScaffoldRequest` or directly writes the module source via the existing
+   scaffolder.
+3. **Naming/layout** – Derive module paths from `package.id` (e.g.
+   `mlfs/binutils-pass-1` → `src/pkgs/by_name/bi/binutils/pass_1/mod.rs`) while
+   preserving the prefix/slug conventions already used by the scaffolder.
+4. **CLI integration** – Add a subcommand (`metadata_indexer generate`) that
+   accepts a list of package IDs or a glob, feeds each through the translator,
+   and optionally stages the resulting Rust files.
+5. **Diff safety** – Emit modules to a temporary location first, compare
+   against existing files, and only overwrite when changes are detected; keep a
+   `--dry-run` mode for review.
+6. **Tests/checks** – After generation, run `cargo fmt` and `cargo check` to
+   ensure the new modules compile; optionally add schema fixtures covering edge
+   cases (variants, multiple URLs, absent checksums).
+
+Open questions:
+- How to represent optional post-install steps or multi-phase builds inside the
+  generated module (additional helper functions vs. raw command arrays).
+- Where to store PGO workload hints once the PGO infrastructure is defined.
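
The naming/layout and diff-safety steps of the outline could be sketched roughly as below. This is a minimal illustration, not the scaffolder's actual logic. The function names (`module_path_for`, `write_if_changed`), the `WriteOutcome` enum, and the rule that a variant begins at `-pass-` are all hypothetical; the only mapping taken from the notes is `mlfs/binutils-pass-1` → `src/pkgs/by_name/bi/binutils/pass_1/mod.rs`. For brevity the comparison is done in memory, and the temporary file is used only for the final write and rename, which simplifies the "emit to a temporary location, then compare" flow described above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical mapping from a package ID to a module path.
/// Assumptions: the book prefix (`mlfs/`, `blfs/`, ...) is dropped, the shard
/// directory is the first two characters of the package name, and a variant
/// such as `pass-1` starts at the first `-pass-` in the slug.
pub fn module_path_for(package_id: &str) -> Option<PathBuf> {
    // Keep only the part after the last '/', e.g. "binutils-pass-1".
    let slug = package_id.rsplit('/').next()?;
    // Split the slug into a package name and an optional variant.
    let (name, variant) = match slug.find("-pass-") {
        Some(idx) => (&slug[..idx], Some(&slug[idx + 1..])),
        None => (slug, None),
    };
    if name.is_empty() {
        return None;
    }
    let shard: String = name.chars().take(2).collect();
    let mut path = PathBuf::from("src/pkgs/by_name");
    path.push(shard.to_lowercase());
    path.push(sanitize(name));
    if let Some(v) = variant {
        path.push(sanitize(v));
    }
    path.push("mod.rs");
    Some(path)
}

/// Lowercase ASCII letters and digits are kept; everything else becomes '_'.
fn sanitize(segment: &str) -> String {
    segment
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '_' })
        .collect()
}

/// Result of a guarded write (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum WriteOutcome {
    Unchanged,
    WouldWrite,
    Written,
}

/// Writes `contents` to `target` only if it differs from what is on disk.
/// With `dry_run` set, nothing is written and the pending change is reported.
pub fn write_if_changed(target: &Path, contents: &str, dry_run: bool) -> io::Result<WriteOutcome> {
    // A missing file counts as a change; other I/O errors propagate.
    let existing = match fs::read_to_string(target) {
        Ok(text) => Some(text),
        Err(err) if err.kind() == io::ErrorKind::NotFound => None,
        Err(err) => return Err(err),
    };
    if existing.as_deref() == Some(contents) {
        return Ok(WriteOutcome::Unchanged);
    }
    if dry_run {
        return Ok(WriteOutcome::WouldWrite);
    }
    if let Some(parent) = target.parent() {
        fs::create_dir_all(parent)?;
    }
    // Write a sibling temp file, then rename it over the target so an
    // interrupted write never leaves a half-written module behind.
    let tmp = target.with_extension("rs.tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, target)?;
    Ok(WriteOutcome::Written)
}
```

A `generate` subcommand could call `module_path_for` on each package ID and pass the rendered source to `write_if_changed`, forwarding its `--dry-run` flag as `dry_run`. Package names that start with a digit would still need a rule for forming a valid Rust module identifier, which this sketch does not attempt.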