From 93fca09be1d5140de466bdc908e5f38310b6ea5a Mon Sep 17 00:00:00 2001 From: m00d Date: Wed, 1 Oct 2025 07:28:41 +0200 Subject: [PATCH] Document package generation workflow --- README.md | 2 ++ ai/context.md | 3 +- docs/ARCHITECTURE.md | 2 +- docs/PACKAGE_GENERATION.md | 61 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 66 insertions(+), 2 deletions(-) create mode 100644 docs/PACKAGE_GENERATION.md diff --git a/README.md b/README.md index 745642a..e239454 100644 --- a/README.md +++ b/README.md @@ -146,6 +146,8 @@ Add `--overwrite` to regenerate an existing module directory. layout, binaries, and supporting modules. - [Metadata Harvesting Pipeline](docs/METADATA_PIPELINE.md) – how the metadata indexer produces and validates the JSON records under `ai/metadata/`. +- [Package Module Generation](docs/PACKAGE_GENERATION.md) – end-to-end guide + for converting harvested metadata into Rust modules under `src/pkgs/by_name/`. - `ai/notes.md` – scratchpad for ongoing research tasks (e.g., deeper jhalfs integration). diff --git a/ai/context.md b/ai/context.md index bf67ce1..f3f35c1 100644 --- a/ai/context.md +++ b/ai/context.md @@ -7,7 +7,8 @@ `wget-list`/`md5sums` manifests into `ai/metadata/cache/` and the `harvest` command automatically draws URLs and checksums from those manifests. A `generate` subcommand consumes harvested metadata and scaffolds Rust modules - under `src/pkgs/by_name` (or a custom output directory). + under `src/pkgs/by_name` (or a custom output directory). See + `docs/PACKAGE_GENERATION.md` for the CLI flow. - AI state lives under `ai/`: - `ai/personas.json`, `ai/tasks.json`, `ai/bugs.json` track personas, outstanding work, and known issues. diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 8fd1a96..3af7875 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -9,7 +9,7 @@ outline the main entry points and how the supporting modules fit together. | Binary | Location | Purpose | | ------ | -------- | ------- | | `lpkg` | `src/main.rs` | Primary command-line interface with workflow automation and optional TUI integration. | -| `metadata_indexer` | `src/bin/metadata_indexer.rs` | Harvests LFS/BLFS/GLFS package metadata, validates it against the JSON schema, and keeps `ai/metadata/index.json` up to date. | +| `metadata_indexer` | `src/bin/metadata_indexer.rs` | Harvests LFS/BLFS/GLFS package metadata, validates it against the JSON schema, keeps `ai/metadata/index.json` up to date, and can scaffold Rust modules from harvested records. | ### `lpkg` workflows diff --git a/docs/PACKAGE_GENERATION.md b/docs/PACKAGE_GENERATION.md new file mode 100644 index 0000000..39279c9 --- /dev/null +++ b/docs/PACKAGE_GENERATION.md @@ -0,0 +1,61 @@ +# Package Module Generation + +This document explains how harvested metadata is transformed into concrete +Rust modules under `src/pkgs/by_name/`. + +## Overview + +1. **Harvest metadata** – Use `metadata_indexer harvest` to capture package data + from the LFS/BLFS/GLFS books. Each record is written to + `ai/metadata/packages//.json`. +2. **Refresh manifests** – Run + `metadata_indexer refresh` to ensure the jhalfs `wget-list` and `md5sums` + caches are up to date. Harvesting relies on these caches for canonical + source URLs and checksums. +3. **Generate modules** – Use + `metadata_indexer generate --metadata --output ` to turn a + metadata file into a full Rust module that exposes a `PackageDefinition`. + +Generated modules leverage the existing scaffolder logic, so the command will +create any missing prefix directories (e.g. `bi/mod.rs`) and populate the final +`mod.rs` file with the correct code template. + +## Command reference + +```bash +# Harvest metadata from a book page +cargo run --bin metadata_indexer -- --base-dir . harvest \ + --book mlfs \ + --page chapter05/binutils-pass1 \ + --output ai/metadata/packages/mlfs/binutils-pass-1.json + +# Refresh jhalfs manifests (optional but recommended) +cargo run --bin metadata_indexer -- --base-dir . refresh + +# Generate a module under the standard src tree +cargo run --bin metadata_indexer -- --base-dir . generate \ + --metadata ai/metadata/packages/mlfs/binutils-pass-1.json \ + --output src/pkgs/by_name \ + --overwrite +``` + +### Flags + +- `--output` defaults to `src/pkgs/by_name`. Point it to another directory if + you want to stage modules elsewhere (e.g. `target/generated/by_name`). +- `--overwrite` deletes the existing module directory before scaffolding a new + one. + +After generation, run `cargo fmt` and `cargo check` to ensure the crate compiles +with the new modules. + +## Implementation notes + +- Metadata fields such as `build`, `dependencies`, and `optimizations` are + mapped directly onto the scaffolder’s `ScaffoldRequest` type. +- Source URLs and MD5 checksums are sourced from the harvested metadata + (populated via the jhalfs manifests). +- The module slug is derived from `package.id` (e.g. + `mlfs/binutils-pass-1` → `src/pkgs/by_name/bi/binutils_pass_1/mod.rs`). + +See the code in `src/pkgs/generator.rs` for the full translation logic.