Document package generation workflow

This commit is contained in:
m00d 2025-10-01 07:28:41 +02:00
parent c19c5c21ab
commit 93fca09be1
4 changed files with 66 additions and 2 deletions

View file

@ -146,6 +146,8 @@ Add `--overwrite` to regenerate an existing module directory.
layout, binaries, and supporting modules.
- [Metadata Harvesting Pipeline](docs/METADATA_PIPELINE.md) how the metadata
indexer produces and validates the JSON records under `ai/metadata/`.
- [Package Module Generation](docs/PACKAGE_GENERATION.md) end-to-end guide
for converting harvested metadata into Rust modules under `src/pkgs/by_name/`.
- `ai/notes.md` scratchpad for ongoing research tasks (e.g., deeper jhalfs
integration).

View file

@ -7,7 +7,8 @@
`wget-list`/`md5sums` manifests into `ai/metadata/cache/` and the `harvest`
command automatically draws URLs and checksums from those manifests. A
`generate` subcommand consumes harvested metadata and scaffolds Rust modules
under `src/pkgs/by_name` (or a custom output directory).
under `src/pkgs/by_name` (or a custom output directory). See
`docs/PACKAGE_GENERATION.md` for the CLI flow.
- AI state lives under `ai/`:
- `ai/personas.json`, `ai/tasks.json`, `ai/bugs.json` track personas,
outstanding work, and known issues.

View file

@ -9,7 +9,7 @@ outline the main entry points and how the supporting modules fit together.
| Binary | Location | Purpose |
| ------ | -------- | ------- |
| `lpkg` | `src/main.rs` | Primary command-line interface with workflow automation and optional TUI integration. |
| `metadata_indexer` | `src/bin/metadata_indexer.rs` | Harvests LFS/BLFS/GLFS package metadata, validates it against the JSON schema, and keeps `ai/metadata/index.json` up to date. |
| `metadata_indexer` | `src/bin/metadata_indexer.rs` | Harvests LFS/BLFS/GLFS package metadata, validates it against the JSON schema, keeps `ai/metadata/index.json` up to date, and can scaffold Rust modules from harvested records. |
### `lpkg` workflows

View file

@ -0,0 +1,61 @@
# Package Module Generation
This document explains how harvested metadata is transformed into concrete
Rust modules under `src/pkgs/by_name/`.
## Overview
1. **Harvest metadata** Use `metadata_indexer harvest` to capture package data
from the LFS/BLFS/GLFS books. Each record is written to
`ai/metadata/packages/<book>/<slug>.json`.
2. **Refresh manifests** Run
`metadata_indexer refresh` to ensure the jhalfs `wget-list` and `md5sums`
caches are up to date. Harvesting relies on these caches for canonical
source URLs and checksums.
3. **Generate modules** Use
`metadata_indexer generate --metadata <path> --output <by_name_dir>` to turn a
metadata file into a full Rust module that exposes a `PackageDefinition`.
Generated modules leverage the existing scaffolder logic, so the command will
create any missing prefix directories (e.g. `bi/mod.rs`) and populate the final
`mod.rs` file with the correct code template.
## Command reference
```bash
# Harvest metadata from a book page
cargo run --bin metadata_indexer -- --base-dir . harvest \
--book mlfs \
--page chapter05/binutils-pass1 \
--output ai/metadata/packages/mlfs/binutils-pass-1.json
# Refresh jhalfs manifests (optional but recommended)
cargo run --bin metadata_indexer -- --base-dir . refresh
# Generate a module under the standard src tree
cargo run --bin metadata_indexer -- --base-dir . generate \
--metadata ai/metadata/packages/mlfs/binutils-pass-1.json \
--output src/pkgs/by_name \
--overwrite
```
### Flags
- `--output` defaults to `src/pkgs/by_name`. Point it to another directory if
you want to stage modules elsewhere (e.g. `target/generated/by_name`).
- `--overwrite` deletes the existing module directory before scaffolding a new
one.
After generation, run `cargo fmt` and `cargo check` to ensure the crate compiles
with the new modules.
## Implementation notes
- Metadata fields such as `build`, `dependencies`, and `optimizations` are
mapped directly onto the scaffolders `ScaffoldRequest` type.
- Source URLs and MD5 checksums are sourced from the harvested metadata
(populated via the jhalfs manifests).
- The module slug is derived from `package.id` (e.g.
`mlfs/binutils-pass-1``src/pkgs/by_name/bi/binutils_pass_1/mod.rs`).
See the code in `src/pkgs/generator.rs` for the full translation logic.