# Migration Plan: cteward-st-lexware (Node.js) → cteward-ng (Python/Flask)

## Framework Decision

**Chosen: Flask**, not Django.

| Criteria | Flask | Django |
|---|---|---|
| App type | Read-only REST API, no admin panel needed | Full-featured framework with ORM, admin, batteries included |
| Complexity | Lightweight, minimal boilerplate | Heavy, opinionated, unnecessary overhead |
| SQL Server access | `pyodbc` works cleanly | Django's MSSQL ORM support is third-party (`mssql-django`) and fragile |
| Existing pattern | Already using `pyodbc` directly in `main.py` | Would require Django models |
| Deployment | Docker-compatible, simple WSGI | Heavier deployment |
| Learning curve | Low (simple routing + middleware) | High (models, views, templates, settings, etc.) |

The app is a thin read-only REST API wrapper around MSSQL queries. Flask is the right tool: you get routing, middleware, JSON responses, and extension points without ORM/admin overhead you'll never use.

---

## Architecture Overview of the Legacy App

```
HTTP Request (restify)
  → Auth Middleware (LDAP or bot password, Basic auth)
  → Permission Resolution (flag-based: _board_, _member_, _self_, etc.)
  → SQL Query Execution (mssql → MSSQL via connection pool)
  → Data Filter (e.g. active-only, self-only)
  → Data Mapping (raw DB columns → API response shape)
  → Renderer (JSON or CSV output)
  → HTTP Response
```

---

## Phase 0: Project Scaffolding ✅ DONE

Created `cteward-ng/cteward-ng/` with the following structure:

```
cteward-ng/
  cteward-st-lexware/     # ← old, untouched
  cteward_ng/
    __init__.py
    app.py                # Flask app factory + middleware
    config.py             # Config loading (JSON, env vars)
    auth.py               # Basic auth + LDAP + bot auth (stubs)
    permissions.py        # Flag-based permission resolution (stubs)
    database.py           # pyodbc pool + all SQL query defs (stubs)
    memberdata.py         # realstatus(), datum(), patenarray() (full)
    mappings.py           # Raw DB → API response transformers (stubs)
    filters.py            # Active-only, self-only filters (full)
    views.py              # Route handlers (stubs)
    renderers.py          # JSON / CSV response helpers (full)
  requirements.txt
  pytest.ini
  README.md
  tests/
    __init__.py
    conftest.py           # pytest fixtures
    test_memberdata.py
    test_config.py
```

---

## Phase 1: Infrastructure & Configuration ✅ DONE

- [x] **Config loading**: Done in `config.py`; ports the JSON config loading following the `st-lexware-test.json` pattern (mssql creds, auth bots, LDAP, logging)
- [x] **Logging**: Replaced `bunyan` with Python's `logging` module + `BunyanFormatter` that produces JSON-structured output matching the bunyan format (`name`, `hostname`, `pid`, `level`, `msg`, `time`, `v`)
- [x] **Docker**: Updated `Dockerfile` with Flask + dependencies (`pyodbc`, `ldap3`, `Flask`, `gunicorn`, `DBUtils`). Updated `podman-compose.yml` with proper environment variables, volumes, and restart policy.

---

## Phase 2: Database Layer

- [ ] **Connection pool**: Port `database.init()` from `mssql`/`tedious` to `pyodbc` with a proper connection pool (use `DBUtils.PooledDB` or SQLAlchemy Core pool). The existing `main.py` has a basic `pyodbc` connection to build on.
- [ ] **Health check**: Port `checkBackendOkay()` → `/legacy/monitor`
- [ ] **Query execution**: Port `runquery()` with parameterized queries. All 15 SQL statements need to be ported from T-SQL `@param` syntax to pyodbc `?` syntax:
  - `QUERY_CONTRACTLIST_BY_CREWNAME` ✅ (definition stubbed)
  - `QUERY_CONTRACT_BY_CREWNAME_AND_CONTRACT` ✅
  - `QUERY_DEBITLIST_BY_CREWNAME` ✅
  - `QUERY_DEBIT_BY_CREWNAME_AND_GUID` ✅
  - `QUERY_MEMBERLIST` ✅
  - `QUERY_MEMBERLIST_RAW` ✅
  - `QUERY_MEMBER_BY_CREWNAME` ✅
  - `QUERY_MEMBER_MEMO_BY_CREWNAME` ✅
  - `QUERY_WITHDRAWALLIST_BY_CREWNAME` ✅
  - `QUERY_WITHDRAWAL_BY_CREWNAME_AND_GUID` ✅
  - `QUERY_PAYMENTLIST_BY_CREWNAME` ✅
  - `QUERY_STATS_MEMBERS` (special, complex aggregation) ⬅️ needs implementation
  - `QUERY_STATS_CONTRACTS` (special) ⬅️ needs implementation
  - `QUERY_STATS_GENDERS` (special) ⬅️ needs implementation
  - `QUERY_STATS_AGES` (special) ⬅️ needs implementation

---

## Phase 3: Data Utilities

- [x] **Port `memberdata.js`** → `memberdata.py`:
  - [x] `realstatus()`: determine crew/passive/ex-crew/raumfahrer status
  - [x] `datum()`: parse `YYYYMMDD` strings to German date format
  - [x] `datum_parsed()`: parse ISO date strings
  - [x] `patenarray()` / `cleanpaten()`: comma-separated name parsing

---

## Phase 4: Authentication & Authorization

- [ ] **Port `authprovider.js`** → `auth.py`:
  - [x] `check_password()`: plaintext path done; apr1 MD5 hash verification still needs `passlib`
  - [ ] `find_botuser()`: bot user lookup from config
  - [ ] `find_ldapuser()`: LDAP authentication (use the `ldap3` Python library instead of `ldapauth-fork`)
  - [ ] Basic auth extraction from the `Authorization` header (partially done in `app.py` for logging)
- [ ] **Port permission resolution** → `permissions.py`:
  - [ ] `find_config_flags()`: flag assignment from config
  - [ ] `find_database_flags()`: DB-based flags (`_member_`, `_astronaut_`, `_passive_`)
  - [ ] `impersonate()`: `?impersonate=` query param support
  - [ ] `effective_permissions()`: lowest-level permission wins

---

## Phase 5: Filters & Mappings

- [x] **Port `filters.js`** → `filters.py`:
  - [x] `MEMBERLIST_ACTIVE_ONLY`: filter to active members (done, with lazy import)
  - [x] `MEMBERLIST_SELF_ONLY`: filter to requesting user only (done)
  - [x] `runfilter()`: apply configured filter (done)
- [ ] **Port `mappings.js`** → `mappings.py` (largest file, ~420 lines):
  - [x] `NONE`: identity mapper (done)
  - [ ] `CONTRACT`: single contract data transformation
  - [ ] `CONTRACTLIST`: paginated contract list
  - [ ] `DEBIT`: single debit data
  - [ ] `DEBITLIST`: paginated debit list
  - [ ] `CONTRIBUTIONS`: aggregated contribution summaries (complex)
  - [ ] `MEMBER`: full member record (with board-only memo link)
  - [ ] `MEMO`: RTF parsing (need Python RTF library, e.g., `rtfparse`)
  - [ ] `MEMBERLIST`: paginated member list
  - [ ] `MEMBERLIST_TO_LDAPCSV`: CSV export format
  - [ ] `WITHDRAWAL`: single withdrawal data
  - [ ] `WITHDRAWALLIST`: paginated withdrawal list

---

## Phase 6: API Routes

- [ ] **Port `startup.js` routes** → `views.py` (Flask blueprints):
  - [x] `GET /legacy/monitor`: health check (returns OK placeholder)
  - [ ] `GET /legacy/memberlist-oldformat`: CSV member list (LDAP export)
  - [ ] `GET /legacy/stats/members`: member count over time
  - [ ] `GET /legacy/stats/contracts`: contract statistics
  - [ ] `GET /legacy/stats/genders`: gender demographics
  - [ ] `GET /legacy/stats/ages`: age demographics
  - [ ] `GET /legacy/member/`: member details or list
  - [ ] `GET /legacy/member//raw`: raw DB record
  - [ ] `GET /legacy/member//memo`: RTF memo
  - [ ] `GET /legacy/member//contributions`: contribution summary
  - [ ] `GET /legacy/member///[]/raw/`: raw detail records

---

## Phase 7: Response Rendering

- [x] **Port `renderers.js`** → `renderers.py`:
  - [x] `JSON_OUTPUT`: JSON with 2-decimal float formatting + JSONP callback support
  - [x] `CSV_OUTPUT`: semicolon-delimited CSV

---

## Phase 8: Middleware

- [x] **Port request middleware** (partially done in `app.py`):
  - [x] Authorization header parsing + username extraction for logging
  - [x] `WWW-Authenticate` header on unauthenticated requests
  - [x] CORS / gzip (using `flask-compress` + `flask-cors`)

---

## Phase 9: Tests

- [ ] **Port Mocha tests** to `pytest`:
  - [ ] `test/000-startup.js` → app startup + logging test
  - [ ] `test/authprovider-*.js` → auth unit tests (6 files)
  - [x] `test/memberdata_*.js` → memberdata unit tests (4 files merged into `test_memberdata.py`)
  - [ ] `test/legacy_monitor.js` → health check integration test
  - Use pytest fixtures (built into `pytest`, defined in `conftest.py`) for DB mocking, `responses` or `requests-mock` for HTTP

---

## Phase 10: Validation & Cutover

- [ ] **API parity testing**: Hit every endpoint on both old and new with identical credentials; diff JSON responses byte-for-byte
- [ ] **Deployment**: Update `podman-compose.yml` to point to the new Python service, test in staging, then cut over

---

## Key Migration Notes

| Concern | Details |
|---|---|
| **RTF parsing** | `unrtf` (JS) → need Python equivalent. `rtfparse` or `extract-msg` may work. This is the riskiest conversion. |
| **LDAP** | `ldapauth-fork` → `ldap3`, a widely used pure-Python LDAP client library. |
| **Password hashing** | `apache-md5` → `passlib` for `apr1` MD5 crypt. |
| **Connection pooling** | Use `DBUtils.PooledDB` with `pyodbc` to match the `mssql` pool behavior. |
| **JSONP** | The callback parameter for JSONP is legacy but must be preserved. |
| **Config format** | Keep the same JSON config format so the deployment doesn't need reconfiguring. |

---

## Estimated Effort

| Phase | Complexity | Status |
|---|---|---|
| 0. Scaffolding | Trivial | ✅ Done |
| 1. Infrastructure | Low | ✅ Done |
| 2. Database Layer | Medium | ⬜ Pending |
| 3. Data Utilities | Low | ✅ Done |
| 4. Auth & Permissions | Medium | ⬜ Pending |
| 5. Filters & Mappings | High (big file) | ⬜ Partial (filters done, mappings stubbed) |
| 6. API Routes | Medium | ⬜ Pending |
| 7. Response Rendering | Low | ✅ Done |
| 8. Middleware | Low | ✅ Done (BunyanFormatter, WWW-Authenticate, CORS, gzip) |
| 9. Tests | High | ⬜ Partial (memberdata + config tests done) |
| 10. Validation | Medium | ⬜ Pending |
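
---

## Example: Porting `@param` Queries to `?` Placeholders (Phase 2)

The Phase 2 task of moving the SQL statements from T-SQL `@param` syntax to pyodbc `?` syntax could be handled mechanically rather than by hand-editing each query. The following is a minimal sketch under stated assumptions, not the project's actual `runquery()` port: the helper name `to_qmark`, the table `members`, and the columns in the sample query are all hypothetical. It also assumes the queries contain no `@` inside string literals or comments and no `DECLARE`d local variables, since a plain regex cannot tell those apart from parameters.

```python
import re

# Matches T-SQL named parameters such as @crewname. The lookbehind skips
# system variables like @@ROWCOUNT and text like user@example inside words.
_PARAM = re.compile(r"(?<![@\w])@([A-Za-z_][A-Za-z0-9_]*)")


def to_qmark(sql, params):
    """Rewrite '@name' placeholders to '?' (hypothetical helper).

    Returns the rewritten SQL plus a tuple of values in the order the
    placeholders appear. A name used twice yields its value twice,
    because qmark parameters are purely positional.
    """
    values = []

    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing value for SQL parameter @{name}")
        values.append(params[name])
        return "?"

    return _PARAM.sub(replace, sql), tuple(values)


# Hypothetical query; the real statements live in the legacy database module.
EXAMPLE_SQL = (
    "SELECT * FROM members "
    "WHERE crewname = @crewname AND (guid = @guid OR owner = @crewname)"
)

if __name__ == "__main__":
    qmark_sql, values = to_qmark(EXAMPLE_SQL, {"crewname": "alice", "guid": "abc"})
    print(qmark_sql)
    print(values)
    # With a live pyodbc connection the result would be passed on as:
    #   cursor.execute(qmark_sql, values)
```

Keeping the query definitions in their named-parameter form and converting at execution time would leave the SQL text close to the legacy source, which may make the Phase 10 parity checks easier to reason about. Converting each statement by hand once is an equally valid choice if the team prefers no runtime rewriting.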