Integrating jhalfs Source Metadata
- Goal: reuse jhalfs `wget-list` and `md5sums` to populate package `source.urls` and auto-fill checksums when harvesting metadata for MLFS/BLFS/GLFS packages.
- Data source: https://anduin.linuxfromscratch.org/ hosts per-release `wget-list`/`md5sums` files already curated by the jhalfs project.
- Approach:
  - Fetch (and optionally cache under `ai/cache/`) the lists for each book.
  - When harvesting, map `<package>-<version>` against the list to gather all relevant URLs.
  - Pull matching checksum entries to populate `source.checksums`.
  - Keep the existing HTML scrape for chapter/stage text; jhalfs covers only sources.
- Benefits: avoids fragile HTML tables, keeps URLs aligned with official build scripts, and ensures checksums are up-to-date.
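
The matching step in the approach above can be sketched in plain Rust. This is a minimal sketch, assuming `wget-list` holds one URL per line and `md5sums` uses the usual `<digest>  <file>` layout. `SourceInfo`, `collect_sources`, and `matches_release` are hypothetical names, not part of the existing code base, and fetching/caching is left out.

```rust
/// URLs and checksums gathered for one package release (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct SourceInfo {
    pub urls: Vec<String>,
    /// (file name, md5 hex digest) pairs.
    pub checksums: Vec<(String, String)>,
}

/// Returns the last path segment of a URL, i.e. the file name.
fn file_name(url: &str) -> &str {
    url.rsplit('/').next().unwrap_or(url)
}

/// True when `file` is `<package>-<version>` followed by an extension
/// (".tar.xz") or a patch-style suffix ("-fix-1.patch").
/// Assumption: a '.' followed by a digit means a longer version number.
fn matches_release(file: &str, package: &str, version: &str) -> bool {
    let stem = format!("{package}-{version}");
    match file.strip_prefix(stem.as_str()) {
        Some(rest) => {
            let mut chars = rest.chars();
            match (chars.next(), chars.next()) {
                (None, _) => true,
                (Some('.'), Some(c)) => !c.is_ascii_digit(),
                (Some('-'), _) => true,
                _ => false,
            }
        }
        None => false,
    }
}

/// Scans both lists and keeps the entries that belong to one release.
pub fn collect_sources(
    wget_list: &str,
    md5sums: &str,
    package: &str,
    version: &str,
) -> SourceInfo {
    let mut info = SourceInfo::default();
    for url in wget_list.lines().map(str::trim).filter(|l| !l.is_empty()) {
        if matches_release(file_name(url), package, version) {
            info.urls.push(url.to_string());
        }
    }
    for line in md5sums.lines() {
        let mut parts = line.split_whitespace();
        if let (Some(digest), Some(file)) = (parts.next(), parts.next()) {
            // md5sum marks binary-mode entries with a leading '*'.
            let file = file.trim_start_matches('*');
            if matches_release(file, package, version) {
                info.checksums.push((file.to_string(), digest.to_string()));
            }
        }
    }
    info
}
```

Calling `collect_sources(wget, md5, "binutils", "2.41")` on the cached list contents would return the tarball URL plus any patches whose file names start with `binutils-2.41`, which can then be copied into `source.urls` and `source.checksums`.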
Metadata → Rust Module Strategy
Goal: emit Rust modules under `src/pkgs/by_name` directly from harvested
metadata once MLFS/BLFS/GLFS records are validated.
Outline:
- Schema alignment – Ensure harvested JSON carries everything the `PackageDefinition` constructor expects (source URLs, checksums, build commands, dependencies, optimisation flags, notes/stage metadata).
- Translation layer – Implement a converter (likely in a new module, e.g. `src/pkgs/generator.rs`) that reads a metadata JSON file and produces a `ScaffoldRequest` or directly writes the module source via the existing scaffolder.
- Naming/layout – Derive module paths from `package.id` (e.g. `mlfs/binutils-pass-1` → `src/pkgs/by_name/bi/binutils/pass_1/mod.rs`) while preserving the prefix/slug conventions already used by the scaffolder.
- CLI integration – Add a subcommand (`metadata_indexer generate`) that accepts a list of package IDs or a glob, feeds each through the translator, and optionally stages the resulting Rust files.
- Diff safety – Emit modules to a temporary location first, compare against existing files, and only overwrite when changes are detected; keep a `--dry-run` mode for review.
- Tests/checks – After generation, run `cargo fmt` and `cargo check` to ensure the new modules compile; optionally add schema fixtures covering edge cases (variants, multiple URLs, absent checksums).
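
The naming rule from the outline could look roughly like the following. It is a sketch only: `module_path` is a hypothetical helper, and the rules it applies (two-character shard directory, hyphens mapped to underscores, variants starting at `-pass-`) are inferred from the single example above rather than taken from the scaffolder.

```rust
use std::path::PathBuf;

/// Maps a package id such as "mlfs/binutils-pass-1" to a module path.
/// Returns `None` when the id has no book prefix or the name is too short.
pub fn module_path(package_id: &str) -> Option<PathBuf> {
    // Drop the book prefix ("mlfs/", "blfs/", ...).
    let (_book, slug) = package_id.split_once('/')?;

    // Assumption: a build variant, when present, starts at "-pass-".
    let (name, variant) = match slug.find("-pass-") {
        Some(idx) => (&slug[..idx], Some(slug[idx + 1..].replace('-', "_"))),
        None => (slug, None),
    };
    if name.chars().count() < 2 {
        return None;
    }

    // Two-character shard directory, e.g. "bi" for "binutils".
    let shard: String = name.chars().take(2).collect();

    let mut path = PathBuf::from("src/pkgs/by_name");
    path.push(shard);
    // Assumption: hyphens become underscores so the directory is a valid
    // Rust module name.
    path.push(name.replace('-', "_"));
    if let Some(variant) = variant {
        path.push(variant);
    }
    path.push("mod.rs");
    Some(path)
}
```

With these rules, `module_path("mlfs/binutils-pass-1")` yields `src/pkgs/by_name/bi/binutils/pass_1/mod.rs`, matching the example in the outline. The real scaffolder conventions should take precedence wherever they differ.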
Open questions:
- How to represent optional post-install steps or multi-phase builds inside the generated module (additional helper functions vs. raw command arrays).
- Where to store PGO workload hints once the PGO infrastructure is defined.
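
The diff-safety step from the outline can be illustrated with the standard library alone. This is a simplified sketch under stated assumptions: `write_if_changed` and `WriteOutcome` are hypothetical names, the comparison happens in memory rather than against a separate staging directory, and the temporary file is a sibling of the target.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// What happened (or would happen) to the target file.
#[derive(Debug, PartialEq, Eq)]
pub enum WriteOutcome {
    Unchanged,
    WouldWrite,
    Written,
}

/// Writes `generated` to `target` only when the content differs.
/// With `dry_run` set, nothing on disk is touched.
pub fn write_if_changed(
    target: &Path,
    generated: &str,
    dry_run: bool,
) -> io::Result<WriteOutcome> {
    match fs::read_to_string(target) {
        Ok(existing) if existing == generated => return Ok(WriteOutcome::Unchanged),
        Ok(_) => {}
        // A missing file simply means this is the first generation.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    if dry_run {
        return Ok(WriteOutcome::WouldWrite);
    }
    if let Some(parent) = target.parent() {
        fs::create_dir_all(parent)?;
    }
    // Write next to the target first, then rename over it, so a failed
    // write does not leave a half-written module behind.
    let tmp = target.with_extension("rs.tmp");
    fs::write(&tmp, generated)?;
    fs::rename(&tmp, target)?;
    Ok(WriteOutcome::Written)
}
```

A `generate` subcommand could call this once per module and report the `WouldWrite` entries when `--dry-run` is passed.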
Lightweight Networking Rewrite
- Motivation: remove heavy async stacks (tokio + reqwest) from the default feature set to keep clean builds fast and reduce binary size.
- HTTP stack baseline: `ureq` (blocking, TLS via rustls, small dependency footprint) plus `scraper` for DOM parsing.
- Migration checklist:
  - Replace `reqwest` usage in `src/html.rs`, `md5_utils.rs`, `wget_list.rs`, `mirrors.rs`, and the ingest pipelines.
  - Rework `binutils` cross toolchain workflow to operate synchronously, eliminating tokio runtime/bootstrap.
  - Drop `tokio` and `reqwest` from `Cargo.toml` once TUI workflows stop using tracing instrumentation hooks that pulled them in transitively.
  - Audit for remaining `tracing` dependencies and migrate to the lightweight logging facade (`log` + `env_logger` or custom adapter) for non-TUI code.
- Follow-up ideas:
  - Provide feature flag `full-net` that re-enables async clients when needed for high-concurrency mirror probing.
  - Benchmark `ureq` vs `reqwest` on `metadata_indexer harvest` to ensure we don’t regress throughput noticeably.
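
One way to keep call sites independent of the HTTP client during the migration is a small blocking trait, with the `full-net` feature selecting the backend at compile time. The sketch below is hypothetical throughout: `Fetch`, `StaticFetcher`, `count_urls`, and `backend_name` are illustrative names, and no real `ureq` or `reqwest` calls are shown.

```rust
/// Minimal blocking fetch interface (hypothetical) that harvest code could
/// depend on instead of a concrete HTTP client.
pub trait Fetch {
    fn get_text(&self, url: &str) -> Result<String, String>;
}

/// In-memory stand-in for tests and offline runs.
pub struct StaticFetcher {
    pages: Vec<(String, String)>,
}

impl StaticFetcher {
    pub fn new(pages: Vec<(&str, &str)>) -> Self {
        let pages = pages
            .into_iter()
            .map(|(url, body)| (url.to_string(), body.to_string()))
            .collect();
        Self { pages }
    }
}

impl Fetch for StaticFetcher {
    fn get_text(&self, url: &str) -> Result<String, String> {
        self.pages
            .iter()
            .find(|(known, _)| known == url)
            .map(|(_, body)| body.clone())
            .ok_or_else(|| format!("no such page: {url}"))
    }
}

/// Synchronous call site: no runtime or `block_on` is required.
pub fn count_urls(fetcher: &dyn Fetch, list_url: &str) -> Result<usize, String> {
    let body = fetcher.get_text(list_url)?;
    Ok(body
        .lines()
        .filter(|line| line.trim_start().starts_with("http"))
        .count())
}

/// Compile-time backend selection; `full-net` is the flag proposed above.
#[cfg(feature = "full-net")]
pub fn backend_name() -> &'static str {
    "async"
}

#[cfg(not(feature = "full-net"))]
pub fn backend_name() -> &'static str {
    "blocking"
}
```

A production implementation of `Fetch` would wrap the chosen client; keeping the trait synchronous means the default build never needs an async runtime.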
README Generation Framework (Markdown RFC)
- Goal: author the project README in Rust, using a small domain-specific builder that outputs GitHub-flavoured Markdown (GFM) from structured sections.
- Design sketch:
  - New crate/workspace member `readme_builder` under `tools/` exposing a fluent API (`Doc::new().section("Intro", |s| ...)`).
  - Source-of-truth lives in `tools/readme/src/main.rs`; running `cargo run -p readme_builder` writes to `README.md`.
  - Provide reusable primitives: `Heading`, `Paragraph`, `CodeBlock`, `Table::builder()`, `Callout::note("...")`, `Badge::docsrs()`, etc.
  - Keep rendering deterministic (sorted sections, stable wrapping) so diffs remain reviewable.
- Tasks:
  - Scaffold `tools/readme` crate with CLI that emits to stdout or specified path (`--output README.md`).
  - Model README sections as enums/structs with `Display` impls to enforce consistency.
  - Port current README structure into builder code, annotate with inline comments describing regeneration steps.
  - Add `make readme` (or `cargo xtask readme`) to rebuild documentation as part of release workflow.
  - Document in CONTRIBUTING how to edit the Rust source instead of the raw Markdown.
- Stretch goals:
  - Emit additional artefacts (e.g., `docs/CHANGELOG.md`) from the same source modules.
  - Allow embedding generated tables from Cargo metadata (dependency stats, feature lists).
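
To make the design sketch more concrete, here is a small, hedged illustration of the fluent builder idea. Only `Doc::new().section(...)`, `Paragraph`, and `CodeBlock` come from the notes above; the method names on `Section`, the `Block` enum, and the choice to keep sections in insertion order (rather than sorted) are assumptions made for this sketch.

```rust
use std::fmt;

/// Content primitives; a fuller version would add tables, callouts, badges.
pub enum Block {
    Paragraph(String),
    CodeBlock { lang: String, code: String },
}

impl fmt::Display for Block {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Block::Paragraph(text) => writeln!(f, "{text}"),
            Block::CodeBlock { lang, code } => {
                // Build the fence at runtime to keep this listing readable.
                let fence = "`".repeat(3);
                writeln!(f, "{fence}{lang}")?;
                writeln!(f, "{}", code.trim_end())?;
                writeln!(f, "{fence}")
            }
        }
    }
}

pub struct Section {
    title: String,
    blocks: Vec<Block>,
}

impl Section {
    pub fn paragraph(&mut self, text: &str) -> &mut Self {
        self.blocks.push(Block::Paragraph(text.to_string()));
        self
    }

    pub fn code(&mut self, lang: &str, code: &str) -> &mut Self {
        self.blocks.push(Block::CodeBlock {
            lang: lang.to_string(),
            code: code.to_string(),
        });
        self
    }
}

pub struct Doc {
    sections: Vec<Section>,
}

impl Doc {
    pub fn new() -> Self {
        Doc { sections: Vec::new() }
    }

    /// Adds a section and lets the closure fill it in.
    pub fn section(mut self, title: &str, build: impl FnOnce(&mut Section)) -> Self {
        let mut section = Section { title: title.to_string(), blocks: Vec::new() };
        build(&mut section);
        self.sections.push(section);
        self
    }
}

impl fmt::Display for Doc {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // A Vec keeps insertion order, so output is stable across runs.
        for (i, section) in self.sections.iter().enumerate() {
            if i > 0 {
                writeln!(f)?;
            }
            writeln!(f, "## {}", section.title)?;
            for block in &section.blocks {
                writeln!(f)?;
                write!(f, "{block}")?;
            }
        }
        Ok(())
    }
}
```

A call such as `Doc::new().section("Intro", |s| { s.paragraph("Hello."); }).to_string()` produces the Markdown for one section; a `main` in the proposed crate would write that string to `README.md` or stdout.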
Dependency Slimming Log
- 2025-03: Replaced `reqwest`/`tokio` async stack with `ureq`; default builds now avoid pulling in hyper/quinn/tower trees. GraphQL feature gate still pulls Actix/tokio, but only when enabled.
- Added `.cargo/config.toml` profiles: dev stays at `opt-level=0`, release uses LTO fat + `-O3`, and PGO profiles expose `cargo pgo-instrument`/`cargo pgo-build` aliases.
- All SVG artefacts (core logo, Nixette logo/mascot/wallpaper) are now generated by Rust binaries under `src/bin/*_gen.rs` using a shared `svg_builder` module. Regeneration steps:
  - `cargo run --bin logo_gen`
  - `cargo run --bin nixette_logo_gen`
  - `cargo run --bin nixette_mascot_gen`
  - `cargo run --bin nixette_wallpaper_gen`
- README is produced via `cargo run --bin readme_gen`; contributors should edit the builder source instead of the Markdown output.
- Remaining work: trim tracing/Actix dependencies inside the TUI path, investigate replacing `gptman` for non-critical disk UI builds, and pin a cargo `deny` audit to alert on large transitive graphs.