cteward-ng/MIGRATION_PLAN.md
2026-06-06 22:21:20 +02:00

Migration Plan: cteward-st-lexware (Node.js) → cteward-ng (Python/Flask)

Framework Decision

Chosen: Flask — not Django.

| Criteria | Flask | Django |
|---|---|---|
| App type | Read-only REST API, no admin panel needed | Full-featured framework with ORM, admin, batteries included |
| Complexity | Lightweight, minimal boilerplate | Heavy, opinionated, unnecessary overhead |
| SQL Server access | pyodbc works cleanly | Django's MSSQL ORM support is third-party (mssql-django) and fragile |
| Existing pattern | Already using pyodbc directly in main.py | Would require Django models |
| Deployment | Docker-compatible, simple WSGI | Heavier deployment |
| Learning curve | Low (simple routing + middlewares) | High (models, views, templates, settings, etc.) |

The app is a thin read-only REST API wrapper around MSSQL queries. Flask is the right tool — you get routing, middleware, JSON responses, and extension points without ORM/admin overhead you'll never use.


Architecture Overview of the Legacy App

HTTP Request (restify)
  → Auth Middleware (LDAP or bot password, Basic auth)
  → Permission Resolution (flag-based: _board_, _member_, _self_, etc.)
  → SQL Query Execution (mssql → MSSQL via connection pool)
  → Data Filter (e.g. active-only, self-only)
  → Data Mapping (raw DB columns → API response shape)
  → Renderer (JSON or CSV output)
  → HTTP Response

Phase 0: Project Scaffolding DONE

Created cteward-ng/cteward_ng/ with the following structure:

cteward-ng/
  cteward-st-lexware/   # ← old, untouched
  cteward_ng/
    __init__.py
    app.py              # Flask app factory + middleware
    config.py           # Config loading (JSON, env vars)
    auth.py             # Basic auth + LDAP + bot auth (stubs)
    permissions.py      # Flag-based permission resolution (stubs)
    database.py         # pyodbc pool + all SQL query defs (stubs)
    memberdata.py       # realstatus(), datum(), patenarray() (full)
    mappings.py         # Raw DB → API response transformers (stubs)
    filters.py          # Active-only, self-only filters (full)
    views.py            # Route handlers (stubs)
    renderers.py        # JSON / CSV response helpers (full)
    requirements.txt
    pytest.ini
    README.md
    tests/
      __init__.py
      conftest.py       # pytest fixtures
      test_memberdata.py
      test_config.py

Phase 1: Infrastructure & Configuration DONE

  • Config loading: Done in config.py. It ports the JSON config loading pattern used with st-lexware-test.json (mssql creds, auth bots, LDAP, logging)
  • Logging: Replaced bunyan with Python's logging module + BunyanFormatter that produces JSON-structured output matching bunyan format (name, hostname, pid, level, msg, time, v)
  • Docker: Updated Dockerfile with Flask + dependencies (pyodbc, ldap3, Flask, gunicorn, DBUtils). Updated podman-compose.yml with proper environment variables, volumes, and restart policy.
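
As an illustration of the logging item above, a formatter that emits bunyan-style JSON lines could look roughly like the sketch below. The class name BunyanFormatter comes from this plan; the body is an assumption, not the code in the repository. The level mapping assumes bunyan's usual numeric levels (debug 20, info 30, warn 40, error 50, fatal 60) and a record version `v` of 0.

```python
import json
import logging
import os
import socket
from datetime import datetime, timezone

# Assumed mapping from Python log levels to bunyan's numeric levels.
BUNYAN_LEVELS = {
    logging.DEBUG: 20,
    logging.INFO: 30,
    logging.WARNING: 40,
    logging.ERROR: 50,
    logging.CRITICAL: 60,
}


class BunyanFormatter(logging.Formatter):
    """Format each log record as one bunyan-style JSON object."""

    def format(self, record):
        stamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "name": record.name,
            "hostname": socket.gethostname(),
            "pid": os.getpid(),
            # Unknown levels fall back to info (30); this is a sketch choice.
            "level": BUNYAN_LEVELS.get(record.levelno, 30),
            "msg": record.getMessage(),
            # ISO 8601 in UTC with millisecond precision.
            "time": stamp.strftime("%Y-%m-%dT%H:%M:%S.")
            + f"{stamp.microsecond // 1000:03d}Z",
            "v": 0,
        }
        return json.dumps(entry)
```

Attaching it is a matter of calling `handler.setFormatter(BunyanFormatter())` on whichever handler the app configures.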

Phase 2: Database Layer

  • Connection pool: Ported database.init() from mssql/tedious to pyodbc + DBUtils.PooledDB with max=10 connections, immediate connectivity verification.
  • Health check: Ported checkBackendOkay() → verifies member count >= 7 and no duplicate crewnames.
  • Query execution: Ported runquery() with parameterized queries. All SQL statements ported from T-SQL @param syntax to pyodbc ? placeholders:
    • QUERY_CONTRACTLIST_BY_CREWNAME
    • QUERY_CONTRACT_BY_CREWNAME_AND_CONTRACT
    • QUERY_DEBITLIST_BY_CREWNAME
    • QUERY_DEBIT_BY_CREWNAME_AND_GUID
    • QUERY_MEMBERLIST
    • QUERY_MEMBERLIST_RAW
    • QUERY_MEMBER_BY_CREWNAME
    • QUERY_MEMBER_MEMO_BY_CREWNAME
    • QUERY_WITHDRAWALLIST_BY_CREWNAME
    • QUERY_WITHDRAWAL_BY_CREWNAME_AND_GUID
    • QUERY_PAYMENTLIST_BY_CREWNAME
    • QUERY_STATS_MEMBERS (special)
    • QUERY_STATS_CONTRACTS (special)
    • QUERY_STATS_GENDERS (special)
    • QUERY_STATS_AGES (special, with step/min/max params)
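
To make the @param to ? conversion and the health rule concrete, here is a minimal sketch. The helper names to_qmark and backend_okay, and the table and column names in the usage note, are hypothetical and not taken from database.py. The regex is deliberately simple: it would also rewrite an @ inside a string literal, so it only suits queries without such literals.

```python
import re
from collections import Counter

# Matches T-SQL named parameters such as @crewname. The negative lookbehind
# skips system variables like @@ROWCOUNT.
_PARAM = re.compile(r"(?<!@)@([A-Za-z_][A-Za-z0-9_]*)")


def to_qmark(sql, params):
    """Rewrite @name placeholders to ? and return (sql, ordered values)."""
    values = []

    def replace(match):
        # pyodbc binds positionally, so a repeated name repeats its value.
        values.append(params[match.group(1)])
        return "?"

    return _PARAM.sub(replace, sql), tuple(values)


def backend_okay(crewnames, minimum=7):
    """Health rule from the plan: enough members and no duplicate crewnames."""
    duplicates = [name for name, count in Counter(crewnames).items() if count > 1]
    return len(crewnames) >= minimum and not duplicates
```

For example, `to_qmark("SELECT * FROM members WHERE crewname = @crewname", {"crewname": "alice"})` yields the rewritten statement plus the tuple `("alice",)`, which can be passed straight to `cursor.execute(sql, values)`.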

Phase 3: Data Utilities

  • Port memberdata.js → memberdata.py:
    • realstatus() — determine crew/passive/ex-crew/raumfahrer status
    • datum() — parse YYYYMMDD strings to German date format
    • datum_parsed() — parse ISO date strings
    • patenarray() / cleanpaten() — comma-separated name parsing
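
Two of these helpers are small enough to sketch. The versions below assume that "German date format" means DD.MM.YYYY and that unparseable input maps to None; the legacy memberdata.js may behave differently in both respects, so treat this as an illustration rather than the ported code.

```python
from datetime import datetime


def datum(value):
    """Convert a YYYYMMDD string to DD.MM.YYYY, or None if it does not parse."""
    try:
        return datetime.strptime(value, "%Y%m%d").strftime("%d.%m.%Y")
    except (TypeError, ValueError):
        # TypeError covers None, ValueError covers impossible dates.
        return None


def patenarray(value):
    """Split a comma-separated list of names, trimming blanks and empty items."""
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]
```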

Phase 4: Authentication & Authorization

  • Port authprovider.js → auth.py:
    • check_password() — plaintext path done, apr1 MD5 hash verification needs passlib
    • find_botuser() — bot user lookup from config
    • find_ldapuser() — LDAP authentication (use ldap3 Python library instead of ldapauth-fork)
    • Basic auth extraction from Authorization header (partially done in app.py for logging)
  • Port permission resolution → permissions.py:
    • find_config_flags() — flag assignment from config
    • find_database_flags() — DB-based flags (member, astronaut, passive)
    • impersonate(): support for the ?impersonate= query param
    • effective_permissions() — lowest-level permission wins
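
The header parsing, the plaintext password path, and the "lowest-level permission wins" rule can be sketched with the standard library alone. The function names parse_basic_auth, check_plaintext and effective_permission are hypothetical, and so is the ranking in LEVELS (including the `_anonymous_` flag); the real flag set and ordering come from the legacy permission code.

```python
import base64
import binascii
import hmac

# Hypothetical ranking, least privileged first.
LEVELS = ["_anonymous_", "_self_", "_member_", "_board_"]


def parse_basic_auth(header):
    """Return (username, password) from an Authorization header, or None."""
    if not header or not header.lower().startswith("basic "):
        return None
    try:
        decoded = base64.b64decode(header[6:].strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    # Only the first colon separates user and password.
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


def check_plaintext(supplied, expected):
    """Compare plaintext passwords in constant time."""
    return hmac.compare_digest(supplied.encode(), expected.encode())


def effective_permission(*flags):
    """Lowest-level permission wins: return the least privileged flag given."""
    return min(flags, key=LEVELS.index)
```

The apr1 path is not shown here; the plan delegates it to passlib.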

Phase 5: Filters & Mappings

  • Port filters.js → filters.py:
    • MEMBERLIST_ACTIVE_ONLY — filter to active members (done, with lazy import)
    • MEMBERLIST_SELF_ONLY — filter to requesting user only (done)
    • runfilter() — apply configured filter (done)
  • Port mappings.js → mappings.py (largest file, ~420 lines):
    • NONE — identity mapper (done)
    • CONTRACT — single contract data transformation
    • CONTRACTLIST — paginated contract list
    • DEBIT — single debit data
    • DEBITLIST — paginated debit list
    • CONTRIBUTIONS — aggregated contribution summaries (complex)
    • MEMBER — full member record (with board-only memo link)
    • MEMO — RTF parsing (need Python RTF library, e.g., rtfparse)
    • MEMBERLIST — paginated member list
    • MEMBERLIST_TO_LDAPCSV — CSV export format
    • WITHDRAWAL — single withdrawal data
    • WITHDRAWALLIST — paginated withdrawal list
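
The filter half of this phase is essentially a name-to-function lookup. A minimal sketch follows, assuming rows are dicts with `crewname` and `status` keys and that every status except "ex-crew" counts as active. Both assumptions are illustrative and may not match filters.py.

```python
# Hypothetical definition of "active"; the real rule lives in the legacy code.
ACTIVE_STATUSES = {"crew", "passive", "raumfahrer"}


def memberlist_active_only(rows, username):
    """Keep only rows whose status counts as active."""
    return [row for row in rows if row.get("status") in ACTIVE_STATUSES]


def memberlist_self_only(rows, username):
    """Keep only the row that belongs to the requesting user."""
    return [row for row in rows if row.get("crewname") == username]


FILTERS = {
    "MEMBERLIST_ACTIVE_ONLY": memberlist_active_only,
    "MEMBERLIST_SELF_ONLY": memberlist_self_only,
}


def runfilter(name, rows, username):
    """Apply the configured filter; no filter name means pass-through."""
    if name is None:
        return rows
    return FILTERS[name](rows, username)
```

Keeping one signature for every filter means a route only has to carry the filter name in its configuration.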

Phase 6: API Routes

  • Port startup.js routes → views.py (Flask blueprints):
    • GET /legacy/monitor — health check (returns OK placeholder)
    • GET /legacy/memberlist-oldformat — CSV member list (LDAP export)
    • GET /legacy/stats/members — member count over time
    • GET /legacy/stats/contracts — contract statistics
    • GET /legacy/stats/genders — gender demographics
    • GET /legacy/stats/ages — age demographics
    • GET /legacy/member/<crewname> — member details or list
    • GET /legacy/member/<crewname>/raw — raw DB record
    • GET /legacy/member/<crewname>/memo — RTF memo
    • GET /legacy/member/<crewname>/contributions — contribution summary
    • GET /legacy/member/<crewname>/<contract|debit|withdrawal|payment>/[<id>]/raw/ — raw detail records
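
A blueprint for two of these routes might be wired up as in the sketch below. It uses an in-memory dict as a stand-in for the database layer, and the 404 response shape is an assumption; the function names and FAKE_MEMBERS are hypothetical.

```python
from flask import Blueprint, Flask, jsonify

legacy = Blueprint("legacy", __name__, url_prefix="/legacy")

# Stand-in data; real handlers would go through the database layer.
FAKE_MEMBERS = {"alice": {"crewname": "alice", "status": "crew"}}


@legacy.route("/monitor")
def monitor():
    # Placeholder health check, as described in the route list.
    return "OK", 200, {"Content-Type": "text/plain; charset=utf-8"}


@legacy.route("/member/<crewname>")
def member(crewname):
    record = FAKE_MEMBERS.get(crewname)
    if record is None:
        return jsonify(error="not found"), 404
    return jsonify(record)


def create_app():
    """App factory: build the Flask app and mount the legacy blueprint."""
    app = Flask(__name__)
    app.register_blueprint(legacy)
    return app
```

With the app factory in place, `create_app().test_client()` gives pytest a client that exercises the routes without a running server.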

Phase 7: Response Rendering

  • Port renderers.js → renderers.py:
    • JSON_OUTPUT — JSON with 2-decimal float formatting + JSONP callback support
    • CSV_OUTPUT — semicolon-delimited CSV
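
Both renderers fit in a few lines of standard-library Python. The sketch below rounds floats to two decimals instead of forcing a fixed two-digit representation (1.5 stays 1.5), and it restricts JSONP callback names to identifier-like strings. Both choices are assumptions, as are the function names; the legacy renderer may differ.

```python
import csv
import io
import json
import re

# Conservative callback check so the name cannot inject script.
_CALLBACK = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$.]*$")


def _round_floats(value):
    """Recursively round every float in a JSON-like structure to 2 decimals."""
    if isinstance(value, float):
        return round(value, 2)
    if isinstance(value, dict):
        return {key: _round_floats(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [_round_floats(item) for item in value]
    return value


def render_json(data, callback=None):
    """Serialize to JSON, wrapped as JSONP when a callback name is given."""
    body = json.dumps(_round_floats(data))
    if callback is None:
        return body
    if not _CALLBACK.match(callback):
        raise ValueError("invalid JSONP callback name")
    return f"{callback}({body});"


def render_csv(rows):
    """Serialize rows as semicolon-delimited CSV."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=";").writerows(rows)
    return buffer.getvalue()
```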

Phase 8: Middleware

  • Port request middleware (partially done in app.py):
    • Authorization header parsing + username extraction for logging
    • WWW-Authenticate header on unauthenticated requests
    • CORS / gzip (using flask-compress + flask-cors)
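
The WWW-Authenticate item can be handled at the WSGI layer, independent of Flask. This is a swapped-in technique for illustration (the plan's app.py may use Flask hooks instead), and the class name and the realm value "cteward" are hypothetical.

```python
class AuthChallengeMiddleware:
    """Add a Basic auth challenge to every 401 response that lacks one."""

    def __init__(self, app, realm="cteward"):
        self.app = app
        self.challenge = f'Basic realm="{realm}"'

    def __call__(self, environ, start_response):
        def patched_start(status, headers, exc_info=None):
            names = {name.lower() for name, _ in headers}
            if status.startswith("401") and "www-authenticate" not in names:
                headers = list(headers) + [("WWW-Authenticate", self.challenge)]
            return start_response(status, headers, exc_info)

        # Delegate to the wrapped app with the patched start_response.
        return self.app(environ, patched_start)
```

On a Flask app it would be installed with `app.wsgi_app = AuthChallengeMiddleware(app.wsgi_app)`.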

Phase 9: Tests

  • Port Mocha tests to pytest:
    • test/000-startup.js → app startup + logging test
    • test/authprovider-*.js → auth unit tests (6 files)
    • test/memberdata_*.js → memberdata unit tests (4 files merged into test_memberdata.py)
    • test/legacy_monitor.js → health check integration test
    • Use pytest fixtures for DB mocking, and responses or requests-mock for HTTP

Phase 10: Validation & Cutover

  • API parity testing: Hit every endpoint on both old and new with identical credentials; diff JSON responses byte-for-byte
  • Deployment: Update podman-compose.yml to point to new Python service, test in staging, cutover
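
For the parity check, a byte comparison says whether two responses differ but not where. The sketch below keeps the byte-for-byte test as the pass criterion and, on failure, lists the JSON paths that differ. The function names and the "(formatting only)" marker are hypothetical.

```python
import json


def diff_responses(old_body, new_body):
    """Return [] if the bodies are byte-identical, else the differing JSON paths."""
    if old_body == new_body:
        return []
    paths = list(_walk(json.loads(old_body), json.loads(new_body), "$"))
    # Same data but different bytes means whitespace or key order changed.
    return paths or ["(formatting only)"]


def _walk(old, new, path):
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            if key not in old or key not in new:
                yield f"{path}.{key}"
            else:
                yield from _walk(old[key], new[key], f"{path}.{key}")
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        for index, (left, right) in enumerate(zip(old, new)):
            yield from _walk(left, right, f"{path}[{index}]")
    elif old != new:
        yield path
```

Fed with the raw bodies from the old and new service for the same URL and credentials, an empty list means the endpoint is at parity.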

Key Migration Notes

| Concern | Details |
|---|---|
| RTF parsing | unrtf (JS) → need Python equivalent. rtfparse or extract-msg may work. This is the riskiest conversion. |
| LDAP | ldapauth-fork → ldap3. ldap3 is the standard Python LDAP library. |
| Password hashing | apache-md5 → passlib for apr1 MD5 crypt. |
| Connection pooling | Use DBUtils.PooledDB with pyodbc to match the mssql pool behavior. |
| JSONP | The callback parameter for JSONP is legacy but must be preserved. |
| Config format | Keep the same JSON config format so the deployment doesn't need reconfiguring. |

Estimated Effort

| Phase | Complexity | Status |
|---|---|---|
| 0. Scaffolding | Trivial | Done |
| 1. Infrastructure | Low | Done (Dockerfile, podman-compose, BunyanFormatter) |
| 2. Database Layer | Medium | Done (PooledDB, all 14 queries + 4 stats aggregations) |
| 3. Data Utilities | Low | Done |
| 4. Auth & Permissions | Medium | Pending |
| 5. Filters & Mappings | High (big file) | Partial (filters done, mappings stubbed) |
| 6. API Routes | Medium | Pending |
| 7. Response Rendering | Low | Done |
| 8. Middleware | Low | Done (BunyanFormatter, WWW-Authenticate, CORS, gzip) |
| 9. Tests | High | Partial (memberdata, config, database tests done; 40 passing) |
| 10. Validation | Medium | Pending |