Repository: https://github.com/matthewdeaves/cookie
Commit: cfaafa8
Date: 2026-01-25
Scope: apps/ (Django/Python backend)
This review was generated by Claude Opus 4.5 through direct analysis of the codebase and metrics collected by the CI dashboard (GitHub Actions: pytest coverage, ruff linting, complexity analysis, duplication detection).
This review examines backend code quality beyond what the metrics dashboard measures. Metrics catch symptoms (test coverage, duplication percentages). This review looks for structural problems, maintainability issues, and patterns that will become technical debt.
Executive Summary
The backend is well-structured with clean Django app separation and good API design using Django Ninja. Code quality is acceptable for a prototype built in four days. However, several patterns will create maintenance problems as the app grows:
Critical issues:
- Manual JSON Schema validation (apps/ai/services/validator.py, 98 lines, CC=40)
- Error handling duplication (7 identical blocks in apps/ai/api.py)
- AI service tests mock at too high a level, so validation code never runs against realistic responses
Moderate issues:
- JSON parsing duplication in openrouter.py (complete vs complete_async methods)
- Profile retrieval pattern repeated in 6 legacy views
- HTML parsing complexity (apps/recipes/services/search.py _extract_result_from_element, 78 lines, CC=20)
- 27 ruff rules disabled in pyproject.toml
1. Architecture & Structure
Django App Organization
Structure:
apps/
├── ai/ - AI integration (OpenRouter API, prompts, validation)
├── core/ - Shared settings, middleware, base classes
├── profiles/ - User profiles and authentication
├── recipes/ - Recipe models, search, scraping
└── legacy/ - Server-rendered views (fallback for non-JS)
Positive patterns:
- Clean separation by domain
- Each app has clear responsibility
- Good use of Django's app structure (models, views, services, tests)
Profile-based data isolation: All recipes, favorites, and history are scoped to profiles. This provides clear data boundaries and prevents cross-profile data leakage.
Example - apps/recipes/models.py:179-183:
class Recipe(BaseModel):
profile = models.ForeignKey(
'profiles.Profile',
on_delete=models.CASCADE,
related_name='recipes',
)
Every Recipe belongs to a Profile. Queries naturally filter by profile, avoiding cross-profile data leakage.
Django Ninja API Design
Location: apps/ai/api.py, apps/recipes/api.py
Good patterns:
- Type-safe request/response schemas using Pydantic-style models
- Clear endpoint naming and organization
- Explicit HTTP status codes in decorators:
@router.post('/remix', response={200: RemixOut, 400: ErrorOut, 404: ErrorOut, 503: ErrorOut})
- Consistent error response structure
Example - apps/ai/api.py:308-309:
@router.post('/remix', response={200: RemixOut, 400: ErrorOut, 404: ErrorOut, 503: ErrorOut})
def create_remix_endpoint(request, data: CreateRemixIn): ...
Django Ninja provides automatic validation, OpenAPI docs, and type safety without boilerplate.
2. Error Handling
Duplication in AI API
Location: apps/ai/api.py
Problem: The same 11-line error handling block appears in 7 AI endpoints:
- remix_suggestions (lines 295-305)
- create_remix_endpoint (lines 365-375)
- scale_recipe_endpoint (lines 477-487)
- tips_endpoint (lines 534-544)
- timer_name_endpoint (lines 584-594)
- discover_endpoint (lines 630-640)
- repair_selector_endpoint (lines 706-716)
Code example (lines 295-305):
except AIUnavailableError as e:
return 503, {
'error': 'ai_unavailable',
'message': str(e) or 'AI features are not available. Please configure your API key in Settings.',
'action': 'configure_key',
}
except (AIResponseError, ValidationError) as e:
return 400, {
'error': 'ai_error',
'message': str(e),
}
This exact block repeats 7 times. If you need to change the error messages, for example for internationalisation, you must update 7 locations.
Better approach: Extract to a decorator:
def handle_ai_errors(func):
"""Decorator to standardize AI endpoint error handling."""
@wraps(func)
def wrapper(*args, **kwargs):
try:
return func(*args, **kwargs)
except AIUnavailableError as e:
return 503, {
'error': 'ai_unavailable',
'message': str(e) or 'AI features are not available. Please configure your API key in Settings.',
'action': 'configure_key',
}
except (AIResponseError, ValidationError) as e:
return 400, {
'error': 'ai_error',
'message': str(e),
}
return wrapper
@router.post('/remix', response={200: RemixOut, 400: ErrorOut, 404: ErrorOut, 503: ErrorOut})
@handle_ai_errors
def create_remix_endpoint(request, data: CreateRemixIn):
# No error handling boilerplate needed
pass
Now error messages are defined once. All 7 endpoints benefit from updates automatically.
3. Code Complexity
Manual JSON Schema Validation
Location: apps/ai/services/validator.py:163-260
Lines: 98 · Cyclomatic Complexity: 40 · Rating: C (refactoring recommended)
What it does: Manually implements JSON Schema validation with 20+ conditionals to handle union types, objects, arrays, and primitives.
Code example (lines 163-260):
def _validate_value(
self,
value: Any,
schema: dict,
path: str
) -> list[str]:
"""Validate a value against a schema definition."""
errors = []
expected_type = schema.get('type')
# Handle union types (e.g., ['string', 'null'])
if isinstance(expected_type, list):
valid = False
for t in expected_type:
if t == 'null' and value is None:
valid = True
break
elif t == 'string' and isinstance(value, str):
valid = True
break
# ... 10 more elif branches for other types
if not valid:
types_str = ' or '.join(expected_type)
errors.append(f'{path}: expected {types_str}, got {type(value).__name__}')
return errors
# Type validation for objects (lines 201-217)
if expected_type == 'object':
if not isinstance(value, dict):
return [f'{path}: expected object, got {type(value).__name__}']
required = schema.get('required', [])
for field in required:
if field not in value:
errors.append(f'{path}: missing required field "{field}"')
# ... recursive validation
# Type validation for arrays (lines 219-242)
elif expected_type == 'array':
if not isinstance(value, list):
return [f'{path}: expected array, got {type(value).__name__}']
# ... min/max items validation
# ... recursive item validation
# Primitive types (lines 244-258)
elif expected_type == 'string':
if not isinstance(value, str):
errors.append(f'{path}: expected string, got {type(value).__name__}')
# ... integer, number, boolean checks
return errors
Problem: Manually reimplements JSON Schema validation. This is 98 lines of code and 20+ conditionals to do what the jsonschema library does in one function call.
Better approach: Use the standard jsonschema library:
import jsonschema
def _validate_value(
self,
value: Any,
schema: dict,
path: str
) -> list[str]:
"""Validate a value against a schema definition."""
try:
jsonschema.validate(value, schema)
return []
except jsonschema.ValidationError as e:
return [f'{path}: {e.message}']
The entire 98-line function collapses to a single try/except. The jsonschema library handles union types, nested objects, arrays, min/max constraints, and all other JSON Schema features correctly. (Note: jsonschema.validate raises on the first failure; to collect every error, as the original function does, use a validator's iter_errors method.)
Why it matters: Custom validation logic has edge cases. Union types are partially implemented (lines 177-198), but there's no handling for anyOf, oneOf, allOf, or $ref, which are standard JSON Schema features. Using a library eliminates these gaps.
HTML Parsing Complexity
Location: apps/recipes/services/search.py:251-329
Lines: 78 · Cyclomatic Complexity: 20 · Rating: C (refactoring recommended)
What it does: Extracts recipe data from a search result HTML element. Does 6 sequential tasks in one function: find link, validate URL, extract title (4 fallback attempts), parse rating from title with regex, extract image (3 fallback attempts), extract description.
Problem: Too many responsibilities. Hard to test each extraction strategy independently. If the title extraction logic breaks, you can't test just that part.
Better approach: Refactor to separate methods:
def _extract_result_from_element(self, element):
"""Extract recipe data from search result element."""
link = self._find_link(element)
if not link:
return None
url = self._validate_url(link)
if not url:
return None
title = self._extract_title(element)
rating = self._extract_rating(title) if title else None
image_url = self._extract_image(element)
description = self._extract_description(element)
return {
'url': url,
'title': title or 'Untitled Recipe',
'rating': rating,
'image_url': image_url,
'description': description,
}
def _find_link(self, element):
"""Find the recipe link in the element."""
# Single responsibility: find link
def _extract_title(self, element):
"""Extract title with fallback strategies."""
# Try 4 different selectors
def _extract_rating(self, title):
"""Parse rating from title using regex."""
# Single responsibility: rating extraction
def _extract_image(self, element):
"""Extract image URL with fallback strategies."""
# Try 3 different selectors
def _extract_description(self, element):
"""Extract description text."""
# Single responsibility: description
Now each method has a single responsibility. You can test title extraction separately from image extraction. If the rating regex breaks, you only need to fix _extract_rating().
4. Code Duplication
JSON Parsing in OpenRouter Service
Location: apps/ai/services/openrouter.py
Problem: Identical markdown code block extraction logic appears in both complete (lines 100-111) and complete_async (lines 171-181) methods.
Code in complete method (lines 100-111):
if content.startswith('```'):
# Extract JSON from code block
lines = content.split('\n')
json_lines = []
in_block = False
for line in lines:
if line.startswith('```'):
in_block = not in_block
continue
if in_block:
json_lines.append(line)
content = '\n'.join(json_lines)
Code in complete_async method (lines 171-181):
if content.startswith('```'):
lines = content.split('\n')
json_lines = []
in_block = False
for line in lines:
if line.startswith('```'):
in_block = not in_block
continue
if in_block:
json_lines.append(line)
content = '\n'.join(json_lines)
Exact duplicate: 12 lines of identical logic.
Better approach: Extract to helper method:
def _parse_json_response(self, content: str) -> dict:
"""Extract JSON from AI response, handling markdown code blocks."""
if content.startswith('```'):
lines = content.split('\n')
json_lines = []
in_block = False
for line in lines:
if line.startswith('```'):
in_block = not in_block
continue
if in_block:
json_lines.append(line)
content = '\n'.join(json_lines)
try:
return json.loads(content)
except json.JSONDecodeError as e:
logger.error(f'Failed to parse AI response as JSON: {content}')
raise AIResponseError(f'Invalid JSON in AI response: {e}') from e
# Then in both methods:
if json_response:
return self._parse_json_response(content)
Now the logic is defined once. If markdown parsing needs to change (for example, to handle nested code blocks), you update one method.
Profile Retrieval Pattern
Location: apps/legacy/views.py
Problem: The same 6-line profile retrieval pattern appears in 6 views: home (lines 38-46), search (lines 85-93), recipe_detail (lines 112-120), play_mode (lines 169-177), collections (lines 200+), settings (lines 200+).
Code example from home view (lines 38-46):
profile_id = request.session.get('profile_id')
if not profile_id:
return redirect('legacy:profile_selector')
try:
profile = Profile.objects.get(id=profile_id)
except Profile.DoesNotExist:
del request.session['profile_id']
return redirect('legacy:profile_selector') Better approach: Use a decorator:
def require_profile(view_func):
"""Decorator to require a valid profile in session."""
@wraps(view_func)
def wrapper(request, *args, **kwargs):
profile_id = request.session.get('profile_id')
if not profile_id:
return redirect('legacy:profile_selector')
try:
profile = Profile.objects.get(id=profile_id)
except Profile.DoesNotExist:
del request.session['profile_id']
return redirect('legacy:profile_selector')
# Add profile to request so views can use it
request.profile = profile
return view_func(request, *args, **kwargs)
return wrapper
@require_profile
def home(request):
"""Home screen."""
profile = request.profile # Provided by decorator
# ... rest of view logic Eliminates 6 copies of the same pattern.
5. Testing Quality
Good Patterns
1. Realistic HTML fixtures - apps/recipes/tests/test_scraper.py:
The tests use mock_html_response fixture with real JSON-LD markup:
@pytest.fixture
def mock_html_response():
"""HTML with realistic JSON-LD data."""
return '''
<script type="application/ld+json">
{
"@type": "Recipe",
"name": "Test Recipe",
"recipeIngredient": ["2 eggs", "1 cup flour"],
"recipeInstructions": [...]
}
</script>
'''
This tests the scraper against realistic HTML structure, not simplified mocks.
2. Zero mocks for business logic - apps/profiles/tests/test_user_features.py:
Tests 34 scenarios with real database operations and no mocks:
def test_user_creates_profile_and_adds_recipe(db):
"""Test complete user flow with real database."""
profile = Profile.objects.create(name='Test User')
recipe = Recipe.objects.create(
title='Test Recipe',
profile=profile,
)
assert recipe.profile == profile
This verifies actual Django model behaviour, constraints, and database interactions.
Bad Patterns
High-level mocking in AI service tests - apps/ai/tests.py:596:
@patch('apps.ai.services.openrouter.OpenRouterService')
def test_remix_suggestions(mock_service):
"""Test remix suggestions endpoint."""
mock_service.return_value.complete.return_value = {
'suggestions': ['Italian', 'Mexican', 'Asian']
} Problem: Mocking OpenRouterService entirely means the validation logic in validator.py never runs against realistic AI responses. The tests verify the endpoint returns data, but not that the validation correctly handles:
- Malformed JSON from the AI
- Missing required fields
- Incorrect types
- Edge cases in union type validation
Why this happened: Tests were added as a separate task after endpoints worked, not as completion criteria during development. The tests verify "does the endpoint return 200" rather than "does the validation handle real AI responses correctly."
Better approach: Use realistic mock responses:
@pytest.fixture
def realistic_ai_response():
"""Return realistic AI JSON with potential edge cases."""
return {
'suggestions': ['Italian Style', 'Mexican Fusion', 'Asian Inspired'],
'metadata': {
'model': 'claude-3.5-haiku',
'tokens': 150,
}
}
@patch('apps.ai.services.openrouter.OpenRouterService.complete')
def test_remix_suggestions(mock_complete, realistic_ai_response):
"""Test remix suggestions with realistic AI response."""
mock_complete.return_value = realistic_ai_response
# Now validation logic runs against realistic data
Coverage Disparity
API endpoints: 93-100% coverage
AI services: 10-25% coverage
The gap exists because endpoints have comprehensive tests, but the AI service layer (where complex validation lives) is mocked at too high a level. The 98-line _validate_value function in validator.py has 25% coverage, meaning most of its branches never execute in tests.
6. Type Safety & Validation
Good Django Ninja Schemas
Location: apps/ai/schemas.py
Django Ninja uses Pydantic-style schemas for type-safe validation:
class CreateRemixIn(Schema):
recipe_id: int
remix_type: str
modifications: list[str]
class RemixOut(Schema):
recipe: RecipeSchema
explanation: str Benefits:
- Automatic request validation
- Type safety at runtime
- OpenAPI docs generation
- Clear API contracts
No issues found here. This is the right approach for API validation.
Manual Validation in validator.py
As covered in the Code Complexity section, validator.py manually implements JSON Schema validation instead of using the standard jsonschema library. This creates maintenance burden and potential bugs.
Recommendation: Replace custom validation with jsonschema.validate().
7. Performance & Async Patterns
Good Async Patterns
1. curl_cffi with browser impersonation - apps/recipes/services/search.py:
Uses curl_cffi to bypass bot detection:
from curl_cffi.requests import AsyncSession
async with AsyncSession() as session:
response = await session.get(
url,
impersonate='chrome110',
timeout=10,
)
Benefit: Realistic browser fingerprinting. Avoids Cloudflare and bot detection without Selenium overhead.
2. Concurrent search with semaphore - apps/recipes/services/search.py:
Searches multiple URLs concurrently with controlled parallelism:
semaphore = asyncio.Semaphore(3) # Max 3 concurrent requests
async def fetch_with_limit(url):
async with semaphore:
return await fetch(url)
results = await asyncio.gather(
*[fetch_with_limit(url) for url in urls],
return_exceptions=True,
)
Benefit: Fast parallel searches without overwhelming target servers. The semaphore prevents 20+ simultaneous connections.
3. Profile scoping plus select_related avoids N+1 queries:
All recipes are scoped to profiles. Queries naturally filter by profile:
Recipe.objects.filter(profile=profile).select_related('profile')
Benefit: each query fetches only the current profile's data, and select_related pulls the related row in the same query instead of issuing one query per recipe.
Missing Optimizations
No caching for AI responses:
Apps make identical AI requests multiple times. If a user views the same recipe twice, the app calls OpenRouter twice with the same prompt.
Better approach: Cache AI responses by prompt hash:
from django.core.cache import cache
def get_cached_ai_response(prompt_hash, fetch_func):
"""Cache AI responses to avoid redundant API calls."""
cached = cache.get(f'ai_response:{prompt_hash}')
if cached:
return cached
response = fetch_func()
cache.set(f'ai_response:{prompt_hash}', response, timeout=3600)
return response
No database query optimization:
The legacy views don't use select_related or prefetch_related consistently. Some queries could be optimized:
# Current (2 queries: fetch favorites, then fetch recipe for each)
favorites = RecipeFavorite.objects.filter(profile=profile)
for fav in favorites:
print(fav.recipe.title) # Triggers query per recipe
# Better (1 query)
favorites = RecipeFavorite.objects.filter(
profile=profile
).select_related('recipe')
8. Security & Configuration
API Key Storage
Location: apps/core/settings.py
Uses AppSettings singleton to store OpenRouter API key:
class AppSettings(models.Model):
openrouter_api_key = models.CharField(max_length=255, blank=True)
class Meta:
# Singleton pattern - only one settings object
constraints = [
models.CheckConstraint(
check=models.Q(id=1),
name='only_one_settings_instance'
)
]
Good: API keys are stored in the database, not hardcoded. Users can configure keys through the UI.
Issue: No encryption at rest. API keys are stored as plaintext in the database. If the database is compromised, keys are exposed.
Better approach: Use an encrypted field, e.g. via the third-party django-cryptography package:
from django_cryptography.fields import encrypt
class AppSettings(models.Model):
openrouter_api_key = encrypt(models.CharField(max_length=255, blank=True))
Or use environment variables for production deployments.
Input Validation
Good patterns:
- Django Ninja schemas validate all API inputs
- Recipe URL validation prevents SSRF attacks
- Profile isolation prevents unauthorized data access
No major security issues found. The API validates inputs correctly.
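The URL validator itself isn't shown in this review; below is a minimal sketch of the kind of SSRF guard meant here (the function name is illustrative, and the project's actual checks may differ):

```python
import ipaddress
from urllib.parse import urlparse

def is_safe_recipe_url(url: str) -> bool:
    """Reject URLs that could reach internal services (basic SSRF guard).
    Illustrative only; a production guard would also pin DNS results."""
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https'):
        return False
    host = parsed.hostname
    if not host or host == 'localhost':
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than a literal IP; resolving it and re-checking
        # the address (DNS rebinding) is a further concern not covered here.
        return True
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```

The key design point is denylisting by address class (private, loopback, link-local) rather than by string matching, so encodings like http://127.0.0.1 and http://10.0.0.1 are both caught.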
Disabled Linter Rules
Location: pyproject.toml
27 ruff rules are disabled:
[tool.ruff.lint]
ignore = [
"DJ001", # Django model without __str__
"RUF012", # Mutable class attributes
"S101", # Use of assert
"PLR0913", # Too many function arguments
# ... 23 more
]
Problem: Disabling rules globally hides potential issues. Some of these rules catch real problems:
- DJ001 - Models without __str__ are harder to debug in Django admin
- RUF012 - Mutable class attributes cause bugs when shared between instances
- PLR0913 - Functions with 8+ arguments are hard to understand
Recommendation: Enable auto-fixable rules and fix violations. Keep complex rules disabled but document why:
ignore = [
"PLR0913", # Too many arguments - disabled because Django views often need many params
# Enable these and fix:
# "DJ001", # Add __str__ to models
# "RUF012", # Fix mutable defaults
]
9. Specific File Issues
validator.py - Manual JSON Schema Implementation
File: apps/ai/services/validator.py · Lines: 260 · Key method: _validate_value (lines 163-260, 98 lines, CC=40)
Problem: See Code Complexity section. Replace entire function with jsonschema.validate().
ai/api.py - Error Handling Duplication
File: apps/ai/api.py · Lines: 726 · Duplicated blocks: 7 endpoints (lines 295-305, 365-375, 477-487, 534-544, 584-594, 630-640, 706-716)
Problem: See Error Handling section. Extract to @handle_ai_errors decorator.
openrouter.py - JSON Parsing Duplication
File: apps/ai/services/openrouter.py · Lines: 185 · Duplicated code: lines 100-111 and 171-181
Problem: See Code Duplication section. Extract to _parse_json_response() method.
search.py - HTML Parsing Complexity
File: apps/recipes/services/search.py · Lines: 450 · Complex method: _extract_result_from_element (lines 251-329, 78 lines, CC=20)
Problem: See Code Complexity section. Refactor to 6 separate methods.
legacy/views.py - Profile Retrieval Pattern
File: apps/legacy/views.py · Lines: 320 · Duplicated pattern: 6 views repeat profile retrieval (lines 38-46, 85-93, 112-120, 169-177, 200+)
Problem: See Code Duplication section. Extract to @require_profile decorator.
10. Positive Patterns Worth Noting
Clean Django App Separation
File structure:
apps/
├── ai/ - AI integration (clear boundaries)
├── core/ - Shared infrastructure
├── profiles/ - User management
├── recipes/ - Recipe domain logic
└── legacy/ - Server-rendered fallback
Benefit: Each app has a single responsibility. Dependencies are clear. No circular imports.
Django Ninja API Design
Type-safe APIs with automatic validation and OpenAPI docs. Clean separation between schemas, business logic, and endpoints.
Example:
@router.post('/remix', response={200: RemixOut, 400: ErrorOut, 404: ErrorOut, 503: ErrorOut})
def create_remix_endpoint(request, data: CreateRemixIn):
"""Create a remixed recipe using AI."""
pass
Explicit types, clear contracts, automatic validation.
Profile-Based Data Isolation
All data is scoped to profiles. This design prevents unauthorized access and simplifies queries:
Recipe.objects.filter(profile=request.profile)
No need for complex permission checks. The data model enforces isolation.
Comprehensive Test Suite
177K lines of tests covering:
- Unit tests for services
- Integration tests for APIs
- Real database tests for models
- HTML parsing tests with realistic fixtures
Coverage: 47% average (could be higher, but the tests that exist are well-designed).
11. Recommendations
Critical (Do First)
- Replace manual validation with jsonschema library - Eliminates 98-line function with 40 branches
- Extract AI error handling to decorator - Removes 7 duplicate blocks from ai/api.py
- Fix AI service test mocking - Create realistic mock_ai_response fixtures so validation runs in tests
- Add API key encryption - Use Django encrypted fields or environment variables
High Priority
- Extract JSON parsing logic in openrouter.py - Create _parse_json_response() method
- Refactor HTML parsing in search.py - Break _extract_result_from_element into 6 methods
- Use @require_profile decorator - Eliminate profile retrieval pattern in legacy views
- Add caching for AI responses - Avoid redundant API calls
Medium Priority
- Enable auto-fixable ruff rules - Fix DJ001 (missing __str__), RUF012 (mutable defaults)
- Optimize database queries - Add select_related/prefetch_related in legacy views
- Add SAST scanning - Use bandit or similar for security analysis
- Document disabled linter rules - Explain why each rule is disabled
Low Priority
- Create useAsync-style patterns - If frontend uses useAsync, backend could have similar async helpers
- Add API response caching - Cache expensive computations beyond AI responses
- Create integration tests for concurrent search - Verify semaphore behaviour under load
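The last item can be verified without hitting real servers: track concurrent entries into the semaphore and assert the cap. A sketch in plain asyncio, independent of the project code:

```python
import asyncio

async def measure_peak_concurrency(n_tasks: int = 10, limit: int = 3) -> int:
    """Run n_tasks fake fetches behind a semaphore and report the peak
    number running at once; with a correct semaphore, peak == limit."""
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def worker():
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for a network fetch
            active -= 1

    await asyncio.gather(*[worker() for _ in range(n_tasks)])
    return peak

print(asyncio.run(measure_peak_concurrency()))  # 3
```

A pytest version of this would patch the fetch coroutine in search.py the same way, asserting the peak never exceeds the configured limit.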
Summary
Overall grade: B- (good structure, needs refinement for production)
Strengths:
- Clean Django app organization
- Type-safe API design with Django Ninja
- Good async patterns (curl_cffi, concurrent search with semaphore)
- Profile-based data isolation
- Comprehensive test suite (177K lines)
Weaknesses:
- Manual JSON Schema validation (98 lines, CC=40)
- Error handling duplication (7 identical blocks)
- Test mocking too high-level (validation logic never tested)
- Code duplication in JSON parsing and profile retrieval
- 27 disabled linter rules
Key insight: The metrics caught symptoms (4.14% duplication, 47% test coverage) but missed the root causes. The duplication report flagged the 7 error handling blocks, but didn't explain the impact: internationalisation would require editing 7 locations. The coverage metric showed 47% average, but didn't reveal that AI services are 10-25% while APIs are 93-100%, or that high-level mocking means validation code never runs in tests. A human code review was necessary to understand what the numbers actually mean.