phase 1 completed

smile 2026-06-06 12:04:59 +02:00
parent a5f928f4e6
commit be5400d349
29 changed files with 318 additions and 24 deletions

@ -19,10 +19,13 @@ RUN echo "deb [arch=amd64 signed-by=/usr/share/keyrings/microsoft-prod.gpg] http
 RUN apt-get update && \
     ACCEPT_EULA=Y apt-get install -y msodbcsql18
 RUN pip install --no-cache-dir pyodbc
+COPY cteward_ng/requirements.txt /tmp/requirements.txt
+RUN pip install --no-cache-dir -r /tmp/requirements.txt
 WORKDIR /app
-COPY . .
+COPY cteward_ng/ ./
-CMD ["python", "main.py"]
+EXPOSE 5000
+CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "2", "--timeout", "120", "app:create_app()"]

MIGRATION_PLAN.md Normal file

@ -0,0 +1,223 @@
# Migration Plan: cteward-st-lexware (Node.js) → cteward-ng (Python/Flask)
## Framework Decision
**Chosen: Flask** — not Django.
| Criteria | Flask | Django |
|---|---|---|
| App type | Read-only REST API, no admin panel needed | Full-featured framework with ORM, admin, batteries included |
| Complexity | Lightweight, minimal boilerplate | Heavy, opinionated, unnecessary overhead |
| SQL Server access | `pyodbc` works cleanly | Django's MSSQL ORM support is third-party (`mssql-django`) and fragile |
| Existing pattern | Already using `pyodbc` directly in `main.py` | Would require Django models |
| Deployment | Docker-compatible, simple WSGI | Heavier deployment |
| Learning curve | Low (simple routing + middlewares) | High (models, views, templates, settings, etc.) |
The app is a thin read-only REST API wrapper around MSSQL queries. Flask is the right tool — you get routing, middleware, JSON responses, and extension points without ORM/admin overhead you'll never use.
---
## Architecture Overview of the Legacy App
```
HTTP Request (restify)
→ Auth Middleware (LDAP or bot password, Basic auth)
→ Permission Resolution (flag-based: _board_, _member_, _self_, etc.)
→ SQL Query Execution (mssql → MSSQL via connection pool)
→ Data Filter (e.g. active-only, self-only)
→ Data Mapping (raw DB columns → API response shape)
→ Renderer (JSON or CSV output)
→ HTTP Response
```
---
## Phase 0: Project Scaffolding ✅ DONE
Created `cteward-ng/cteward-ng/` with the following structure:
```
cteward-ng/
cteward-st-lexware/ # ← old, untouched
cteward_ng/
__init__.py
app.py # Flask app factory + middleware
config.py # Config loading (JSON, env vars)
auth.py # Basic auth + LDAP + bot auth (stubs)
permissions.py # Flag-based permission resolution (stubs)
database.py # pyodbc pool + all SQL query defs (stubs)
memberdata.py # realstatus(), datum(), patenarray() (full)
mappings.py # Raw DB → API response transformers (stubs)
filters.py # Active-only, self-only filters (full)
views.py # Route handlers (stubs)
renderers.py # JSON / CSV response helpers (full)
requirements.txt
pytest.ini
README.md
tests/
__init__.py
conftest.py # pytest fixtures
test_memberdata.py
test_config.py
```
---
## Phase 1: Infrastructure & Configuration ✅ DONE
- [x] **Config loading**: Done in `config.py` — ports the JSON config loading from `st-lexware-test.json` pattern (mssql creds, auth bots, LDAP, logging)
- [x] **Logging**: Replaced `bunyan` with Python's `logging` module + `BunyanFormatter` that produces JSON-structured output matching bunyan format (`name`, `hostname`, `pid`, `level`, `msg`, `time`, `v`)
- [x] **Docker**: Updated `Dockerfile` with Flask + dependencies (`pyodbc`, `ldap3`, `Flask`, `gunicorn`, `DBUtils`). Updated `podman-compose.yml` with proper environment variables, volumes, and restart policy.
---
## Phase 2: Database Layer
- [ ] **Connection pool**: Port `database.init()` from `mssql`/`tedious` to `pyodbc` with a proper connection pool (use `DBUtils.PooledDB` or SQLAlchemy Core pool). The existing `main.py` has a basic `pyodbc` connection to build on.
- [ ] **Health check**: Port `checkBackendOkay()` → `/legacy/monitor`
- [ ] **Query execution**: Port `runquery()` with parameterized queries. The 15 SQL statements listed below need to be ported from T-SQL `@param` syntax to pyodbc `?` syntax:
- `QUERY_CONTRACTLIST_BY_CREWNAME` ✅ (definition stubbed)
- `QUERY_CONTRACT_BY_CREWNAME_AND_CONTRACT`
- `QUERY_DEBITLIST_BY_CREWNAME`
- `QUERY_DEBIT_BY_CREWNAME_AND_GUID`
- `QUERY_MEMBERLIST`
- `QUERY_MEMBERLIST_RAW`
- `QUERY_MEMBER_BY_CREWNAME`
- `QUERY_MEMBER_MEMO_BY_CREWNAME`
- `QUERY_WITHDRAWALLIST_BY_CREWNAME`
- `QUERY_WITHDRAWAL_BY_CREWNAME_AND_GUID`
- `QUERY_PAYMENTLIST_BY_CREWNAME`
- `QUERY_STATS_MEMBERS` (special, complex aggregation) ⬅️ needs implementation
- `QUERY_STATS_CONTRACTS` (special) ⬅️ needs implementation
- `QUERY_STATS_GENDERS` (special) ⬅️ needs implementation
- `QUERY_STATS_AGES` (special) ⬅️ needs implementation
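
The `@param` to `?` rewrite described above could be mechanized roughly as follows. This is a minimal sketch, not the project's implementation: `to_qmark` and the sample SQL are hypothetical names introduced for illustration, and the regex deliberately ignores the case of an `@` inside a string literal or comment.

```python
import re

# Matches T-SQL style named parameters such as @crewname.
# The negative lookbehind skips @@ system variables such as @@ROWCOUNT.
_PARAM = re.compile(r"(?<!@)@([A-Za-z_][A-Za-z0-9_]*)")


def to_qmark(sql, params):
    """Rewrite @name placeholders to ? and return (sql, ordered values).

    A name that occurs several times contributes its value once per
    occurrence, because qmark placeholders are purely positional.
    """
    values = []

    def _replace(match):
        values.append(params[match.group(1)])
        return "?"

    return _PARAM.sub(_replace, sql), values


# Hypothetical query, only to show the shape of the result.
sql, values = to_qmark(
    "SELECT * FROM members WHERE crewname = @crewname AND id = @id",
    {"crewname": "smile", "id": 7},
)
```

The returned pair can then be passed to a pyodbc cursor as `cursor.execute(sql, values)`.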
---
## Phase 3: Data Utilities
- [x] **Port `memberdata.js`** → `memberdata.py`:
- [x] `realstatus()` — determine crew/passive/ex-crew/raumfahrer status
- [x] `datum()` — parse `YYYYMMDD` strings to German date format
- [x] `datum_parsed()` — parse ISO date strings
- [x] `patenarray()` / `cleanpaten()` — comma-separated name parsing
---
## Phase 4: Authentication & Authorization
- [ ] **Port `authprovider.js`** → `auth.py`:
- [x] `check_password()` — plaintext path done, apr1 MD5 hash verification needs `passlib`
- [ ] `find_botuser()` — bot user lookup from config
- [ ] `find_ldapuser()` — LDAP authentication (use `ldap3` Python library instead of `ldapauth-fork`)
- [ ] Basic auth extraction from `Authorization` header (partially done in `app.py` for logging)
- [ ] **Port permission resolution** → `permissions.py`:
- [ ] `find_config_flags()` — flag assignment from config
- [ ] `find_database_flags()` — DB-based flags (_member_, _astronaut_, _passive_)
- [ ] `impersonate()` → `?impersonate=` query param support
- [ ] `effective_permissions()` — lowest-level permission wins
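
The Basic auth extraction listed above could look roughly like this. It is a sketch under assumptions: `parse_basic_auth` is a hypothetical helper name, and it only decodes the header without checking the credentials against bots or LDAP.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from a Basic Authorization header, or None."""
    if not header_value:
        return None
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        # Malformed base64 or non-UTF-8 payload: treat as unauthenticated.
        return None
    # Only the first colon separates user and password; passwords may contain colons.
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password
```

Returning `None` for every malformed case lets the caller answer with a single `401` plus `WWW-Authenticate` path instead of distinguishing error types.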
---
## Phase 5: Filters & Mappings
- [ ] **Port `filters.js`** → `filters.py`:
- [x] `MEMBERLIST_ACTIVE_ONLY` — filter to active members (done, with lazy import)
- [x] `MEMBERLIST_SELF_ONLY` — filter to requesting user only (done)
- [x] `runfilter()` — apply configured filter (done)
- [ ] **Port `mappings.js`** → `mappings.py` (largest file, ~420 lines):
- [x] `NONE` — identity mapper (done)
- [ ] `CONTRACT` — single contract data transformation
- [ ] `CONTRACTLIST` — paginated contract list
- [ ] `DEBIT` — single debit data
- [ ] `DEBITLIST` — paginated debit list
- [ ] `CONTRIBUTIONS` — aggregated contribution summaries (complex)
- [ ] `MEMBER` — full member record (with board-only memo link)
- [ ] `MEMO` — RTF parsing (need Python RTF library, e.g., `rtfparse`)
- [ ] `MEMBERLIST` — paginated member list
- [ ] `MEMBERLIST_TO_LDAPCSV` — CSV export format
- [ ] `WITHDRAWAL` — single withdrawal data
- [ ] `WITHDRAWALLIST` — paginated withdrawal list
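
For the filter half of this phase, a name-to-function registry is one plausible shape. The sketch below is an assumption throughout: the row keys `crewname` and `status`, the meaning of "active" as `status == "crew"`, and the function signatures are all hypothetical and not taken from `filters.py`.

```python
# Hypothetical record shape: each row is a dict with "crewname" and "status".
def active_only(rows, username):
    """Keep rows whose status marks an active member (assumed to be 'crew')."""
    return [row for row in rows if row.get("status") == "crew"]


def self_only(rows, username):
    """Keep only the row that belongs to the requesting user."""
    return [row for row in rows if row.get("crewname") == username]


FILTERS = {
    "MEMBERLIST_ACTIVE_ONLY": active_only,
    "MEMBERLIST_SELF_ONLY": self_only,
}


def runfilter(name, rows, username):
    """Apply the named filter; an unknown or missing name leaves rows untouched."""
    func = FILTERS.get(name)
    return func(rows, username) if func else list(rows)
```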
---
## Phase 6: API Routes
- [ ] **Port `startup.js` routes** → `views.py` (Flask blueprints):
- [x] `GET /legacy/monitor` — health check (returns OK placeholder)
- [ ] `GET /legacy/memberlist-oldformat` — CSV member list (LDAP export)
- [ ] `GET /legacy/stats/members` — member count over time
- [ ] `GET /legacy/stats/contracts` — contract statistics
- [ ] `GET /legacy/stats/genders` — gender demographics
- [ ] `GET /legacy/stats/ages` — age demographics
- [ ] `GET /legacy/member/<crewname>` — member details or list
- [ ] `GET /legacy/member/<crewname>/raw` — raw DB record
- [ ] `GET /legacy/member/<crewname>/memo` — RTF memo
- [ ] `GET /legacy/member/<crewname>/contributions` — contribution summary
- [ ] `GET /legacy/member/<crewname>/<contract|debit|withdrawal|payment>/[<id>]/raw/` — raw detail records
---
## Phase 7: Response Rendering
- [x] **Port `renderers.js`** → `renderers.py`:
- [x] `JSON_OUTPUT` — JSON with 2-decimal float formatting + JSONP callback support
- [x] `CSV_OUTPUT` — semicolon-delimited CSV
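
The two output formats above can be approximated with the standard library alone. This is a sketch, not the code in `renderers.py`: `render_csv` and `render_json` are hypothetical names, the 2-decimal float formatting is left out, and the callback name is assumed to have been validated by the caller.

```python
import csv
import io
import json


def render_csv(rows, fieldnames):
    """Render a list of dicts as semicolon-delimited CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def render_json(payload, callback=None):
    """Serialize payload as JSON; wrap it in callback(...) for JSONP."""
    body = json.dumps(payload)
    if callback:
        return f"{callback}({body});"
    return body
```

Because JSONP echoes the callback name into executable script, a real implementation would typically restrict it to a safe identifier pattern before calling `render_json`.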
---
## Phase 8: Middleware
- [x] **Port request middleware** (partially done in `app.py`):
- [x] Authorization header parsing + username extraction for logging
- [x] `WWW-Authenticate` header on unauthenticated requests
- [x] CORS / gzip (using `flask-compress` + `flask-cors`)
---
## Phase 9: Tests
- [ ] **Port Mocha tests** to `pytest`:
- [ ] `test/000-startup.js` → app startup + logging test
- [ ] `test/authprovider-*.js` → auth unit tests (6 files)
- [x] `test/memberdata_*.js` → memberdata unit tests (4 files merged into `test_memberdata.py`)
- [ ] `test/legacy_monitor.js` → health check integration test
- Use pytest fixtures (built into `pytest`) for DB mocking, `responses` or `requests-mock` for HTTP
---
## Phase 10: Validation & Cutover
- [ ] **API parity testing**: Hit every endpoint on both old and new with identical credentials; diff JSON responses byte-for-byte
- [ ] **Deployment**: Update `podman-compose.yml` to point to new Python service, test in staging, cutover
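
When a byte-for-byte comparison of two responses fails, a structural diff helps locate the mismatch. The helper below is a sketch with a hypothetical name (`json_diff`); it compares already-decoded JSON values and reports the paths that differ.

```python
import json


def json_diff(old, new, path=""):
    """Return a sorted-by-key list of paths where two decoded JSON values differ."""
    if type(old) is not type(new):
        return [path or "/"]
    if isinstance(old, dict):
        diffs = []
        for key in sorted(set(old) | set(new)):
            child = f"{path}/{key}"
            if key not in old or key not in new:
                diffs.append(child)
            else:
                diffs.extend(json_diff(old[key], new[key], child))
        return diffs
    if isinstance(old, list):
        if len(old) != len(new):
            return [path or "/"]
        diffs = []
        for index, (a, b) in enumerate(zip(old, new)):
            diffs.extend(json_diff(a, b, f"{path}/{index}"))
        return diffs
    return [] if old == new else [path or "/"]


legacy = json.loads('{"a": 1, "b": [1, 2], "c": {"d": "x"}}')
ported = json.loads('{"a": 1, "b": [1, 3], "c": {"d": "y"}, "e": null}')
differences = json_diff(legacy, ported)
```

Note that the strict type check reports `1` versus `1.0` as a difference, which is intended here because the plan asks for exact parity.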
---
## Key Migration Notes
| Concern | Details |
|---|---|
| **RTF parsing** | `unrtf` (JS) → need Python equivalent. `rtfparse` or `striprtf` may work. This is the riskiest conversion. |
| **LDAP** | `ldapauth-fork` → `ldap3`. `ldap3` is a widely used pure-Python LDAP client library. |
| **Password hashing** | `apache-md5` → `passlib` for `apr1` MD5 crypt. |
| **Connection pooling** | Use `DBUtils.PooledDB` with `pyodbc` to match the `mssql` pool behavior. |
| **JSONP** | The callback parameter for JSONP is legacy but must be preserved. |
| **Config format** | Keep the same JSON config format so the deployment doesn't need reconfiguring. |
---
## Estimated Effort
| Phase | Complexity | Status |
|---|---|---|
| 0. Scaffolding | Trivial | ✅ Done |
| 1. Infrastructure | Low | ✅ Done |
| 2. Database Layer | Medium | ⬜ Pending |
| 3. Data Utilities | Low | ✅ Done |
| 4. Auth & Permissions | Medium | ⬜ Pending |
| 5. Filters & Mappings | High (big file) | ✅ Partial (filters done, mappings stubbed) |
| 6. API Routes | Medium | ⬜ Pending |
| 7. Response Rendering | Low | ✅ Done |
| 8. Middleware | Low | ✅ Done (BunyanFormatter, WWW-Authenticate, CORS, gzip) |
| 9. Tests | High | ⬜ Partial (memberdata + config tests done) |
| 10. Validation | Medium | ⬜ Pending |

Binary file not shown.

@ -1,7 +1,11 @@
"""Flask application factory and middleware setup."""
 import base64
+import json
 import logging
+import os
+import socket
+from datetime import datetime, timezone
 from logging.handlers import RotatingFileHandler
 from flask import Flask, request, Response
@ -10,6 +14,58 @@ from flask_compress import Compress
 from .config import load_config
+# Bunyan level mapping
+_BUNYAN_LEVELS = {
+    logging.DEBUG: 20,
+    logging.INFO: 30,
+    logging.WARNING: 40,
+    logging.ERROR: 50,
+    logging.CRITICAL: 60,
+}
+# Reverse mapping: string name → Python level
+_LEVEL_NAMES = {
+    'trace': logging.DEBUG,
+    'debug': logging.DEBUG,
+    'info': logging.INFO,
+    'warn': logging.WARNING,
+    'error': logging.ERROR,
+    'fatal': logging.CRITICAL,
+}
+class BunyanFormatter(logging.Formatter):
+    """Produce bunyan-style JSON log lines.
+    Each log line is a single JSON object with keys:
+    name, hostname, pid, level, msg, time, v
+    Matches the format expected by the existing test suite.
+    """
+    def __init__(self):
+        super().__init__()
+        self.hostname = socket.gethostname()
+        self.pid = os.getpid()
+    def format(self, record):
+        log_entry = {
+            'name': 'cteward-st-lexware',
+            'hostname': self.hostname,
+            'pid': self.pid,
+            'level': _BUNYAN_LEVELS.get(record.levelno, 30),
+            'msg': record.getMessage(),
+            'time': datetime.now(timezone.utc).isoformat(),
+            'v': 0,
+        }
+        # Attach request context if available
+        if hasattr(record, 'username'):
+            log_entry['username'] = record.username
+        if hasattr(record, 'method'):
+            log_entry['method'] = record.method
+        if hasattr(record, 'url'):
+            log_entry['url'] = record.url
+        return json.dumps(log_entry)
 def create_app(config_path=None):
     """Create and configure the Flask application.
@ -34,24 +90,22 @@ def create_app(config_path=None):
 def _setup_logging(app):
     """Setup structured JSON logging similar to bunyan."""
-    log_level = app.cteward_config.get('loglevel', 'info').upper()
+    log_level_name = app.cteward_config.get('loglevel', 'info').lower()
     logfile = app.cteward_config.get('logfile')
+    log_level = _LEVEL_NAMES.get(log_level_name, logging.INFO)
     handler = (
         RotatingFileHandler(logfile)
         if logfile
         else logging.StreamHandler()
     )
-    handler.setLevel(getattr(logging, log_level, logging.INFO))
-    formatter = logging.Formatter(
-        '%(asctime)s %(name)s %(levelname)s %(message)s'
-    )
-    handler.setFormatter(formatter)
+    handler.setLevel(log_level)
+    handler.setFormatter(BunyanFormatter())
     app.logger.handlers.clear()
     app.logger.addHandler(handler)
-    app.logger.setLevel(handler.level)
+    app.logger.setLevel(log_level)
 def _setup_extensions(app):
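
One detail worth checking against the bunyan format: `BunyanFormatter.format` builds `time` with `datetime.now(timezone.utc).isoformat()`, which yields microseconds and a `+00:00` suffix, whereas bunyan writes JavaScript's `toISOString()` form with milliseconds and a trailing `Z`. If consumers of the log compare timestamps textually, a helper along these lines may be needed. `bunyan_time` is a hypothetical name, and deriving the value from `record.created` is an assumption rather than what the commit does.

```python
from datetime import datetime, timezone


def bunyan_time(epoch_seconds):
    """Format a POSIX timestamp the way JavaScript's Date.toISOString() does."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    # isoformat() emits "+00:00" for UTC; bunyan uses the "Z" suffix instead.
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")
```

Inside a formatter this would be called as `bunyan_time(record.created)`, so the timestamp reflects when the record was created rather than when it was formatted.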
@ -81,7 +135,12 @@ def _register_prehandlers(app):
     @app.before_request
     def log_request():
         username = _extract_basic_username(request.headers)
-        app.logger.info('%s %s %s', username, request.method, request.url)
+        extra = {'username': username, 'method': request.method, 'url': request.url}
+        # Attach extra fields to the log record so BunyanFormatter picks them up
+        app.logger.info(
+            '%s %s %s', username, request.method, request.url,
+            extra=extra,
+        )
     @app.after_request
     def www_authenticate(response):
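
The formatter can read `username`, `method`, and `url` from the record because the standard `logging` module copies every key of `extra` onto the `LogRecord`. The standalone sketch below demonstrates that mechanism, and also that any keyword the logging call does not recognize, such as `extra_data`, raises `TypeError`. `ListHandler` and the logger name are hypothetical and exist only for this demonstration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect emitted records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("extra-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Keys in `extra` become attributes on the LogRecord.
logger.info(
    "%s %s", "GET", "/legacy/monitor",
    extra={"username": "smile", "method": "GET", "url": "/legacy/monitor"},
)
record = handler.records[0]

# Keywords other than exc_info, extra, stack_info and stacklevel are rejected.
try:
    logger.info("request", extra_data={"username": "smile"})
    rejected = False
except TypeError:
    rejected = True
```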

@ -64,7 +64,7 @@ def datum(isodate):
         return '1.1.1970'
     try:
         dt = datetime.strptime(isodate, '%Y%m%d')
-        return f'{dt.day}.{dt.month + 1}.{dt.year}'
+        return f'{dt.day}.{dt.month}.{dt.year}'
     except ValueError:
         return '1.1.1970'
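
The removed `+ 1` looks like a carry-over from JavaScript, where `Date.getMonth()` is zero-based; Python's `datetime.month` already runs from 1 to 12, so the offset produced month 13 for December dates. A standalone sketch of the corrected behaviour follows. The explicit length guard is an assumption about the part of `datum()` that the hunk does not show; it matters because `strptime` would otherwise accept a 7-character string such as `2023011`.

```python
from datetime import datetime


def datum(isodate):
    """Convert a YYYYMMDD string to German D.M.YYYY, falling back to 1.1.1970."""
    if not isinstance(isodate, str) or len(isodate) != 8:
        return "1.1.1970"
    try:
        dt = datetime.strptime(isodate, "%Y%m%d")
    except ValueError:
        return "1.1.1970"
    # datetime.month is 1-based, so no offset is needed.
    return f"{dt.day}.{dt.month}.{dt.year}"
```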

@ -1,5 +1,3 @@
"""pytest configuration."""
[pytest]
testpaths = cteward-ng/tests
pythonpath = .

@ -39,11 +39,13 @@ class TestLoadConfig:
         assert 'bots' in config['auth']
         assert 'flags' in config['auth']
-    def test_missing_file_returns_empty(self):
+    def test_missing_file_returns_defaults(self):
         config = load_config('/nonexistent/path.json')
-        assert config == {}
+        assert 'mssql' in config
+        assert 'server' in config
+        assert 'auth' in config
-    def test_invalid_json_returns_empty(self):
+    def test_invalid_json_returns_defaults(self):
         with tempfile.NamedTemporaryFile(
             mode='w', suffix='.json', delete=False
         ) as fh:
@ -52,4 +54,6 @@ class TestLoadConfig:
         config = load_config(fh.name)
         os.unlink(fh.name)
-        assert config == {}
+        assert 'mssql' in config
+        assert 'server' in config
+        assert 'auth' in config
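
The updated tests expect `load_config` to return a populated default structure rather than `{}` when the file is missing or unparsable. A loader with that contract could be sketched as below. The contents of `DEFAULTS` are hypothetical placeholders, and the shallow `update` merge is an assumption; the real keys, values, and merge rules live in `config.py`.

```python
import copy
import json

# Hypothetical defaults; only the three top-level keys are taken from the tests.
DEFAULTS = {
    "mssql": {},
    "server": {"port": 5000},
    "auth": {"bots": {}, "flags": {}},
}


def load_config(path):
    """Load a JSON config file, falling back to DEFAULTS on read or parse errors."""
    config = copy.deepcopy(DEFAULTS)
    try:
        with open(path, encoding="utf-8") as fh:
            loaded = json.load(fh)
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON: keep the defaults.
        return config
    if isinstance(loaded, dict):
        # Shallow merge: a top-level key in the file replaces the default wholesale.
        config.update(loaded)
    return config
```

Returning a deep copy keeps callers from mutating the shared defaults by accident.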

@ -16,7 +16,7 @@ from cteward_ng.memberdata import (
 class TestDatum:
     def test_valid_yyyy_mmdd(self):
-        assert datum('20230115') == '15.2.2023'
+        assert datum('20230115') == '15.1.2023'
     def test_invalid_length(self):
         assert datum('2023011') == '1.1.1970'

@ -1,5 +1,11 @@
 services:
-  cteward:
-    build: .
-    ports:
-      - "${APP_PORT}:5000"
+  cteward:
+    build: .
+    ports:
+      - "${APP_PORT:-5000}:5000"
+    environment:
+      - CTEWARD_ST_LEXWARE_CONFIG=/etc/cteward/st-lexware.json
+    volumes:
+      - /etc/cteward:/etc/cteward:ro
+      - /var/log/cteward:/var/log/cteward
+    restart: unless-stopped