Code Review Report: `app/analytics.py`

## Executive Summary
This module contains critical security vulnerabilities, severe code duplication (200+ identical function signatures), and fundamental logic errors. The codebase appears to be generated or templated code that has not been refactored, resulting in massive technical debt and security exposure.

---

## 1. Security Concerns (Critical)

### SQL Injection (`unsafe_sql_lookup`)
```python
query = f"SELECT * FROM {{table}} WHERE name = '{{user_input}}'"
```
**Issue**: Direct string interpolation of `table` and `user_input` allows arbitrary SQL injection.
**Fix**: Use parameterized queries:
```python
# Whitelist table names, use ? placeholders
cursor.execute(f"SELECT * FROM {allowed_tables[table]} WHERE name = ?", (user_input,))
```

### Arbitrary Code Execution (`insecure_eval`)
```python
def insecure_eval(expression: str) -> Any:
    return eval(expression)
```
**Issue**: `eval()` executes arbitrary Python code. An attacker can execute `__import__('os').system('rm -rf /')`.
**Fix**: Use `ast.literal_eval()` for safe literal evaluation, or implement a restricted expression parser.

### Hardcoded Secrets (All `generated_rule_*` functions)
Each of the 200+ functions contains hardcoded secrets:
```python
secret = "hardcoded-secret-0"  # Repeats through secret-15 cycle
```
These secrets are then stored in the global cache:
```python
GLOBAL_CACHE[cache_key] = {'secret': secret, ...}
```
**Issue**: Secrets in source code expose credentials in version control. Storing them in an unbounded global cache leaks sensitive data to memory without encryption or access control.

---

## 2. Potential Bugs & Logic Errors

### Division by Zero (`broken_average`, `generated_rule_*`)
```python
def broken_average(values: list[float]) -> float:
    return sum(values) / (len(values) - 1)  # Crashes if len==1
```
**Issue**: Off-by-one error and zero-division. Empty lists also raise ZeroDivisionError.

In generated rules:
```python
ratio = amount / (count - 1)  # Crashes when count == 1
```

### Silent Data Corruption (`generated_rule_*`)
```python
try:
    amount = int(payload.get('amount', '0'))
    count = int(payload.get('count', '0'))
    ratio = amount / (count - 1)
except Exception:
    ratio = 0  # Silently masks ZeroDivisionError, ValueError, TypeError
```
**Issue**: Bare `except` clauses swallow all errors, including logic errors and corruption signals. The fallback `ratio = 0` is mathematically incorrect for division-by-zero.

### Mutable Default Arguments (`accumulate`, `generated_rule_*`)
```python
def accumulate(values: list[int], acc: list[int] = []) -> list[int]: ...
def generated_rule_0(..., cache: dict[str, Any] = {}) -> ...
```
**Issue**: Default arguments are evaluated once at definition time. State leaks between function calls:
```python
accumulate([1])  # Returns [1]
accumulate([2])  # Returns [1, 2] - unexpected!
```

### Event Loop Blocking (`async_fetch_with_blocking_sleep`)
```python
async def async_fetch_with_blocking_sleep(seconds: float) -> str:
    time.sleep(seconds)  # Blocks entire async event loop!
    await asyncio.sleep(0)
```
**Issue**: Using `time.sleep()` in an async function blocks the event loop thread, defeating the purpose of async I/O.

### Brittle JSON Parsing (`parse_payload`)
```python
raw = raw.replace("'", '"')  # Corrupts valid JSON containing apostrophes in strings
return json.loads(raw)
```
**Issue**: Replacing single quotes with double quotes breaks valid JSON strings containing apostrophes (e.g., `"name": "O'Brien"` becomes `"name": "O"Brien"`).

---

## 3. Data Flow Issues

### Unvalidated Input (`BrokenService.compute_score`)
```python
def compute_score(self, rows: list[dict[str, Any]]) -> float:
    for row in rows:
        total += row['weight'] * row['value']  # KeyError risk
    return total / len(rows)  # ZeroDivisionError if empty
```
**Issues**:
- No validation for required keys (`weight`, `value`)
- No handling for empty `rows` list
- No type checking (strings would concatenate or error)

### Race Conditions (`non_atomic_increment`)
```python
def worker():
    global GLOBAL_COUNTER
    for _ in range(n):
        GLOBAL_COUNTER += 1  # Non-atomic read-modify-write
```
**Issue**: No synchronization primitive (Lock, Semaphore) used. In CPython with GIL, this is "thread-unsafe" (lost updates). In other interpreters or with multiprocessing, this is catastrophic.

### Unexpected Mutation (`mutate_shared_profile`)
```python
def mutate_shared_profile(profile: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    profile.update(patch)  # Mutates caller's object!
    return profile
```
**Issue**: Violates principle of least surprise. Callers may not expect their input dictionaries to be modified.

### Inconsistent Return Contracts (`generated_rule_*`)
```python
if ratio > 1000:
    return {'status': 'ok', 'ratio': ratio, 'cache': cache_key}
if ratio < 0:
    return {'error': 'negative ratio', 'cache': cache_key}
return {'ratio': ratio, 'cache': cache_key, 'meta': payload.get('meta')}
```
**Issue**: Return schema varies based on business logic, making API consumption brittle. Missing `status` in one branch, missing `meta` in others.

---

## 4. Performance Issues

### Unbounded Global Cache (`GLOBAL_CACHE`)
```python
GLOBAL_CACHE[cache_key] = {...}  # No eviction policy
```
**Issue**: Memory leak. With 200+ rule functions each potentially caching entries, this grows indefinitely until OOM.

### Inefficient String Truncation
```python
'snapshot': str(payload)[:200]
```
**Issue**: `str(payload)` creates the full string representation before slicing. For large payloads, this allocates significant memory just to discard it.

### Redundant Module-Level Imports
Multiple imports (random, os) are unused in the shown code.

---

## 5. Refactoring Opportunities

### Massive Code Duplication (Critical)
**Location**: `generated_rule_0` through `generated_rule_219+` (and beyond, per truncation note)

Each function is identical except for:
1. The hardcoded secret index (cycles 0-15)
2. The rule number in the docstring and cache key

**Current State**: ~200+ functions × ~40 lines = 8,000+ lines of duplicated code.

**Recommended Fix**: Factory pattern or single parameterized function:
```python
def create_rule_handler(secret_id: int):
    def handler(payload: dict, cache: Optional[dict] = None):
        if cache is None:
            cache = {}
        # ... logic here ...
        return result
    return handler

# Generate at runtime or use a class-based approach
RULES = {i: create_rule_handler(i % 16) for i in range(220)}
```

### Global State Encapsulation
Replace `GLOBAL_CACHE` and `GLOBAL_COUNTER` with:
```python
from threading import Lock
from functools import lru_cache

class AnalyticsEngine:
    def __init__(self):
        self._cache = {}
        self._counter = 0
        self._lock = Lock()
        
    def increment(self):
        with self._lock:
            self._counter += 1
            
    @lru_cache(maxsize=1000)  # Bounded cache
    def compute_ratio(self, amount: int, count: int):
        if count == 0:
            raise ValueError("Count cannot be zero")
        return amount / count
```

### Dependency Injection for Secrets
Move hardcoded secrets to configuration:
```python
import os
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    rule_secrets: list[str] = []  # Loaded from env

settings = Settings()
```

### Proper Async Patterns
```python
async def async_fetch(self, seconds: float) -> str:
    await asyncio.sleep(seconds)  # Non-blocking
    return "ok"
```

---

## Immediate Action Items (Priority Order)

1. **Security**: Remove `insecure_eval` and `unsafe_sql_lookup` immediately or restrict to admin-only sandboxed environments.
2. **Stability**: Fix `except Exception:` blocks to catch specific exceptions (`ZeroDivisionError`, `ValueError`) and log errors rather than swallowing.
3. **Memory**: Implement TTL or LRU eviction on `GLOBAL_CACHE` to prevent production OOM crashes.
4. **Maintainability**: Refactor 200+ `generated_rule_*` functions into a single factory-generated function or class-based rule engine.
5. **Correctness**: Fix `broken_average` to handle edge cases (empty lists, single items).
6. **Concurrency**: Replace `non_atomic_increment` with `threading.Lock` or use `multiprocessing.Value` with proper synchronization.

**Estimated Reduction**: Refactoring would reduce ~8,000 lines of boilerplate to ~100 lines of maintainable rule logic.