cteward-ng/MIGRATION_PLAN.md
2026-06-08 20:33:47 +02:00

Migration Plan: cteward-st-lexware (Node.js) → cteward-ng (Python/Flask)

Framework Decision

Chosen: Flask — not Django.

| Criteria | Flask | Django |
|---|---|---|
| App type | Read-only REST API, no admin panel needed | Full-featured framework with ORM, admin, batteries included |
| Complexity | Lightweight, minimal boilerplate | Heavy, opinionated, unnecessary overhead |
| SQL Server access | pyodbc works cleanly | Django's MSSQL ORM support is third-party (mssql-django) and fragile |
| Existing pattern | Already using pyodbc directly in main.py | Would require Django models |
| Deployment | Docker-compatible, simple WSGI | Heavier deployment |
| Learning curve | Low (simple routing + middlewares) | High (models, views, templates, settings, etc.) |

The app is a thin read-only REST API wrapper around MSSQL queries. Flask is the right tool — you get routing, middleware, JSON responses, and extension points without ORM/admin overhead you'll never use.


Architecture Overview of the Legacy App

HTTP Request (restify)
  → Auth Middleware (LDAP or bot password, Basic auth)
  → Permission Resolution (flag-based: _board_, _member_, _self_, etc.)
  → SQL Query Execution (mssql → MSSQL via connection pool)
  → Data Filter (e.g. active-only, self-only)
  → Data Mapping (raw DB columns → API response shape)
  → Renderer (JSON or CSV output)
  → HTTP Response
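
The query, filter, map, and render stages of this flow could be chained in Python roughly as follows. This is a minimal sketch, not the actual cteward-ng code: `run_pipeline`, the stage callables, and the row keys `raw_name` and `raw_status` are hypothetical, and the auth and permission stages are left out for brevity.

```python
import json
from typing import Any, Callable, Iterable

Row = dict[str, Any]


def run_pipeline(
    query: Callable[[], Iterable[Row]],
    row_filter: Callable[[Row], bool],
    mapper: Callable[[Row], Row],
    render: Callable[[list[Row]], str],
) -> str:
    """Chain query -> filter -> map -> render, mirroring the legacy flow."""
    rows = query()
    kept = [row for row in rows if row_filter(row)]
    mapped = [mapper(row) for row in kept]
    return render(mapped)


# Hypothetical stand-in for the SQL stage; real rows come from MSSQL.
def fake_query() -> list[Row]:
    return [
        {"raw_name": "alice", "raw_status": "crew"},
        {"raw_name": "bob", "raw_status": "ex-crew"},
    ]


body = run_pipeline(
    query=fake_query,
    row_filter=lambda row: row["raw_status"] == "crew",
    mapper=lambda row: {"crewname": row["raw_name"], "status": row["raw_status"]},
    render=lambda rows: json.dumps(rows, indent=2),
)
```

Keeping each stage a plain callable makes it easy to swap the filter, mapper, or renderer per endpoint, which is how the legacy pipeline appears to be configured.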

Phase 0: Project Scaffolding DONE

Created cteward-ng/cteward_ng/ with the following structure:

cteward-ng/
  cteward-st-lexware/   # ← old, untouched
  cteward_ng/
    __init__.py
    app.py              # Flask app factory + middleware
    config.py           # Config loading (JSON, env vars)
    auth.py             # Basic auth + LDAP + bot auth (stubs)
    permissions.py      # Flag-based permission resolution (stubs)
    database.py         # pyodbc pool + all SQL query defs (stubs)
    memberdata.py       # realstatus(), datum(), patenarray() (full)
    mappings.py         # Raw DB → API response transformers (stubs)
    filters.py          # Active-only, self-only filters (full)
    views.py            # Route handlers (stubs)
    renderers.py        # JSON / CSV response helpers (full)
    requirements.txt
    pytest.ini
    README.md
    tests/
      __init__.py
      conftest.py       # pytest fixtures
      test_memberdata.py
      test_config.py

Phase 1: Infrastructure & Configuration DONE

  • Config loading: Done in config.py, which ports the JSON config loading and follows the st-lexware-test.json layout (mssql credentials, auth bots, LDAP, logging)
  • Logging: Replaced bunyan with Python's logging module + BunyanFormatter that produces JSON-structured output matching bunyan format (name, hostname, pid, level, msg, time, v)
  • Docker: Updated Dockerfile with Flask + dependencies (pyodbc, ldap3, Flask, gunicorn, DBUtils). Updated podman-compose.yml with proper environment variables, volumes, and restart policy.
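
A formatter along the lines of the BunyanFormatter described above might look like the sketch below. It is not the project's implementation: the level mapping follows bunyan's numeric levels (debug 20, info 30, warn 40, error 50, fatal 60), and falling back to 30 for unknown levels is an assumption.

```python
import json
import logging
import os
import socket
from datetime import datetime, timezone

# Python logging levels mapped to bunyan's numeric levels.
_BUNYAN_LEVELS = {
    logging.DEBUG: 20,
    logging.INFO: 30,
    logging.WARNING: 40,
    logging.ERROR: 50,
    logging.CRITICAL: 60,
}


class BunyanFormatter(logging.Formatter):
    """Render each log record as one bunyan-style JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "name": record.name,
            "hostname": socket.gethostname(),
            "pid": os.getpid(),
            # Assumption: unknown levels are reported as info (30).
            "level": _BUNYAN_LEVELS.get(record.levelno, 30),
            "msg": record.getMessage(),
            # ISO 8601 in UTC with millisecond precision and a trailing "Z".
            "time": timestamp.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "v": 0,
        }
        return json.dumps(payload)
```

Attaching it is the usual `handler.setFormatter(BunyanFormatter())` call on a `logging.StreamHandler`, so existing bunyan log consumers can keep parsing the output.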

Phase 2: Database Layer

  • Connection pool: Ported database.init() from mssql/tedious to pyodbc + DBUtils.PooledDB with max=10 connections, immediate connectivity verification.
  • Health check: Ported checkBackendOkay() → verifies member count >= 7 and no duplicate crewnames.
  • Query execution: Ported runquery() with parameterized queries. All 14 SQL statements ported from T-SQL @param syntax to pyodbc ? syntax:
    • QUERY_CONTRACTLIST_BY_CREWNAME
    • QUERY_CONTRACT_BY_CREWNAME_AND_CONTRACT
    • QUERY_DEBITLIST_BY_CREWNAME
    • QUERY_DEBIT_BY_CREWNAME_AND_GUID
    • QUERY_MEMBERLIST
    • QUERY_MEMBERLIST_RAW
    • QUERY_MEMBER_BY_CREWNAME
    • QUERY_MEMBER_MEMO_BY_CREWNAME
    • QUERY_WITHDRAWALLIST_BY_CREWNAME
    • QUERY_WITHDRAWAL_BY_CREWNAME_AND_GUID
    • QUERY_PAYMENTLIST_BY_CREWNAME
    • QUERY_STATS_MEMBERS (special)
    • QUERY_STATS_CONTRACTS (special)
    • QUERY_STATS_GENDERS (special)
    • QUERY_STATS_AGES (special, with step/min/max params)
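
The switch from T-SQL `@param` placeholders to pyodbc's positional `?` markers can also be done mechanically. The helper below is a sketch of that idea only: `to_positional` and the table and column names in the example are hypothetical, and the regex does not try to skip `@` characters inside SQL string literals.

```python
import re

# Match @name but not @@name (T-SQL system variables such as @@ROWCOUNT).
_NAMED_PARAM = re.compile(r"(?<!@)@(\w+)")


def to_positional(sql: str, params: dict) -> tuple[str, tuple]:
    """Rewrite @name placeholders to ? and return the values in matching order."""
    ordered = []

    def _swap(match: re.Match) -> str:
        # A KeyError here means the query references a parameter nobody supplied.
        ordered.append(params[match.group(1)])
        return "?"

    return _NAMED_PARAM.sub(_swap, sql), tuple(ordered)


# Hypothetical query text; the real statements live in database.py.
sql, args = to_positional(
    "SELECT * FROM members WHERE crewname = @crewname AND contract = @contract",
    {"crewname": "alice", "contract": 7},
)
# sql  == "SELECT * FROM members WHERE crewname = ? AND contract = ?"
# args == ("alice", 7)
```

The resulting pair can be passed to a pyodbc cursor as `cursor.execute(sql, args)`. A parameter that appears twice in the SQL text is emitted twice in `args`, which is what positional markers require.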

Phase 3: Data Utilities

  • Port memberdata.js → memberdata.py:
    • realstatus() — determine crew/passive/ex-crew/raumfahrer status
    • datum() — parse YYYYMMDD strings to German date format
    • datum_parsed() — parse ISO date strings
    • patenarray() / cleanpaten() — comma-separated name parsing
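
Two of these helpers are small enough to sketch. The bodies below are guesses at the behaviour described above, not the ported code: the sketch assumes "German date format" means DD.MM.YYYY, that unparseable input yields `None`, and that empty name entries are dropped.

```python
from datetime import datetime
from typing import Optional


def datum(value: Optional[str]) -> Optional[str]:
    """Convert a YYYYMMDD string to DD.MM.YYYY, or None if it cannot be parsed."""
    if not value:
        return None
    try:
        parsed = datetime.strptime(value.strip(), "%Y%m%d")
    except ValueError:
        return None
    return parsed.strftime("%d.%m.%Y")


def patenarray(value: Optional[str]) -> list[str]:
    """Split a comma-separated name string into a list of trimmed, non-empty names."""
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]
```

For example, `datum("20260608")` gives `"08.06.2026"`, and `patenarray("alice, bob,,carol ")` gives `["alice", "bob", "carol"]`.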

Phase 4: Authentication & Authorization DONE

  • Port authprovider.js → auth.py:
    • check_password() — plaintext + apr1 MD5 via passlib.hash.apr_md5_crypt
    • find_botuser() — bot user lookup from config
    • find_ldapuser() — LDAP authentication via ldap3
    • Basic auth extraction + full pipeline in authorize()
  • Port permission resolution → permissions.py:
    • find_config_flags() — flag assignment + impersonation-limited stripping
    • find_database_flags() — DB-based flags (member, astronaut, passive)
    • impersonate(): ?impersonate= query param support
    • effective_permissions() — lowest level wins

Phase 5: Filters & Mappings DONE

  • Port filters.js → filters.py:
    • MEMBERLIST_ACTIVE_ONLY — filter to active members
    • MEMBERLIST_SELF_ONLY — filter to requesting user only
    • runfilter() — apply configured filter
  • Port mappings.js → mappings.py (~380 lines):
    • NONE, CONTRACT, CONTRACTLIST
    • DEBIT, DEBITLIST
    • CONTRIBUTIONS — aggregated billed/paid/unpaid
    • MEMBER, MEMO (with RTF fallback parser)
    • MEMBERLIST, MEMBERLIST_TO_LDAPCSV
    • WITHDRAWAL, WITHDRAWALLIST

Phase 6: API Routes DONE

All 11 endpoints implemented with full auth → query → filter → map → render pipeline:

  • GET /legacy/monitor
  • GET /legacy/memberlist-oldformat (CSV)
  • GET /legacy/stats/members, /contracts, /genders, /ages
  • GET /legacy/member/<crewname> (single or list based on ''/'*')
  • GET /legacy/member/<crewname>/raw
  • GET /legacy/member/<crewname>/memo (board-only)
  • GET /legacy/member/<crewname>/contributions (board-only)
  • GET /legacy/member/<crewname>/<contract|debit|withdrawal|payment>/[<id>]/raw/

Phase 9: Tests 103 passing

  • Config tests (4) — loading, defaults, missing file, invalid JSON
  • Database tests (16) — init, connected, health check, query execution, member lookup
  • Memberdata tests (20) — realstatus, datum, patenarray, cleanpaten
  • Auth tests (21) — check_password, basic auth parsing, bot/LDAP auth, pipeline
  • Permissions tests (16) — flag resolution, self-detection, impersonation gating
  • Mappings tests (19) — all 12 mappers with realistic data shapes
  • Views integration tests (10) — monitor, stats, member, memo, contributions, detail raw

Phase 7: Response Rendering

  • Port renderers.js → renderers.py:
    • JSON_OUTPUT — JSON with 2-decimal float formatting + JSONP callback support
    • CSV_OUTPUT — semicolon-delimited CSV

Phase 8: Middleware

  • Port request middleware (partially done in app.py):
    • Authorization header parsing + username extraction for logging
    • WWW-Authenticate header on unauthenticated requests
    • CORS / gzip (using flask-compress + flask-cors)

Phase 9: Tests

  • Port Mocha tests to pytest:
    • test/000-startup.js → app startup + logging test
    • test/authprovider-*.js → auth unit tests (6 files)
    • test/memberdata_*.js → memberdata unit tests (4 files merged into test_memberdata.py)
    • test/legacy_monitor.js → health check integration test
    • Use pytest fixtures for DB mocking, and responses or requests-mock for HTTP

Phase 10: Validation & Cutover

  • API parity testing: Hit every endpoint on both old and new with identical credentials; diff JSON responses byte-for-byte
  • Deployment: Update podman-compose.yml to point to new Python service, test in staging, cutover
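
For the API parity step, the comparison itself can be a small pure function that takes the two response bodies, leaving the HTTP fetching to whatever client is used. The sketch below is illustrative: `diff_responses` is a hypothetical helper, not part of the plan.

```python
import difflib


def diff_responses(old_body: bytes, new_body: bytes, endpoint: str) -> list[str]:
    """Return an empty list when the bodies are byte-identical, else a unified diff."""
    if old_body == new_body:
        return []
    old_lines = old_body.decode("utf-8", errors="replace").splitlines()
    new_lines = new_body.decode("utf-8", errors="replace").splitlines()
    diff = list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=f"old{endpoint}",
            tofile=f"new{endpoint}",
            lineterm="",
        )
    )
    # Bytes differ but the split lines match: only line endings changed.
    return diff or [f"{endpoint}: bodies differ only in line endings"]
```

Running this over every endpoint and failing on any non-empty result gives a simple pass/fail gate before cutover.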

Key Migration Notes

| Concern | Details |
|---|---|
| RTF parsing | unrtf (JS) → need Python equivalent. rtfparse or extract-msg may work. This is the riskiest conversion. |
| LDAP | ldapauth-fork → ldap3. ldap3 is the standard Python LDAP library. |
| Password hashing | apache-md5 → passlib for apr1 MD5 crypt. |
| Connection pooling | Use DBUtils.PooledDB with pyodbc to match the mssql pool behavior. |
| JSONP | The callback parameter for JSONP is legacy but must be preserved. |
| Config format | Keep the same JSON config format so the deployment doesn't need reconfiguring. |

Estimated Effort

| Phase | Complexity | Status |
|---|---|---|
| 0. Scaffolding | Trivial | Done |
| 1. Infrastructure | Low | Done (Dockerfile, podman-compose, BunyanFormatter) |
| 2. Database Layer | Medium | Done (PooledDB, all 14 queries + 4 stats aggregations) |
| 3. Data Utilities | Low | Done |
| 4. Auth & Permissions | Medium | Done (bot/LDAP auth, flag resolution, impersonation) |
| 5. Filters & Mappings | High (big file) | Done (all 12 mappers + 2 filters) |
| 6. API Routes | Medium | Done (all 11 endpoints with full auth→query→filter→map→render pipeline) |
| 7. Response Rendering | Low | Done |
| 8. Middleware | Low | Done (BunyanFormatter, WWW-Authenticate, CORS, gzip) |
| 9. Tests | High | 103 passing across config, database, memberdata, auth, permissions, mappings, views |
| 10. Validation | Medium | Pending |