Regional Filtering & Company Aliasing
Status: 📋 PLANNED Priority: High (after refactoring) Effort: 3-5 days total
Problem Statement
Currently the system shows "Bosch" and "Bosch Gruppe" as separate companies, fragmenting data:
- "Bosch" → has vacancies (12)
- "Bosch Gruppe" → has reviews (4939)
Users expect unified company view with regional filtering capability.
Solution: Two-Layer Architecture
Layer 1: Company Aliasing (Grouping)
Merge company variants into canonical entities:
-- New table: company_aliases
CREATE TABLE company_aliases (
id SERIAL PRIMARY KEY,
canonical_name VARCHAR(255) NOT NULL, -- "Bosch"
variant_name VARCHAR(255) NOT NULL, -- "Bosch Gruppe", "Robert Bosch GmbH"
country_hint VARCHAR(5), -- Optional: "de", "at"
UNIQUE(variant_name)
);
-- Examples:
INSERT INTO company_aliases (canonical_name, variant_name) VALUES
('Bosch', 'Bosch'),
('Bosch', 'Bosch Gruppe'),
('Bosch', 'Robert Bosch GmbH'),
('Bosch', 'bosch-gmbh'),
('Lidl', 'Lidl'),
('Lidl', 'Lidl Deutschland'),
('Lidl', 'Lidl Stiftung');
Layer 2: Regional Filter
Add country/region dimension to all queries:
Dashboard UI:
┌──────────────────────────────────────────────┐
│ [Company: Bosch ▼] [Region: 🌍 Global ▼] │
│ 🌍 Global │
│ 🇩🇪 Germany │
│ 🇦🇹 Austria │
│ 🇨🇭 Switzerland │
└──────────────────────────────────────────────┘
Implementation Phases
Phase 1: Company Aliasing (1 day)
- Create
company_aliasestable - Seed with 20+ major companies
- Update
/api/companiesto return canonical names only - Update all API endpoints to join on aliases
Phase 2: Region Dropdown UI (1 day)
- Add region selector to Header.jsx
- Store selectedRegion in context
- Pass
?country=to all API calls
Phase 3: Backend Regional Filtering (1 day)
- Add
countryparameter to all stats/review endpoints - Filter reviews by
countrycolumn - Filter vacancies by location parsing
Phase 4: Global Mode (1 day)
- When region = "Global", aggregate all regions
- Compute global Trust Score
- Show regional breakdown chart
API Changes
Before:
GET /api/stats?company=Bosch
After:
GET /api/stats?company=Bosch®ion=de
GET /api/stats?company=Bosch®ion=global
Data Model Changes
-- Ensure reviews have country
ALTER TABLE reviews ADD COLUMN IF NOT EXISTS country VARCHAR(5);
-- Ensure vacancies have country
ALTER TABLE job_vacancies ADD COLUMN IF NOT EXISTS country VARCHAR(5);
-- Index for fast filtering
CREATE INDEX idx_reviews_country ON reviews(country);
CREATE INDEX idx_vacancies_country ON job_vacancies(country);
Priority Justification
Do AFTER refactoring because:
- Current code has hardcoded DB connections
- API endpoints duplicated ILIKE logic everywhere
- Frontend has multiple data fetching patterns
Clean architecture first → easier to add regions
Related Files
/backend/main.py- All API endpoints/dashboard_app/src/context/- State management/dashboard_app/src/components/Header.jsx- Company selector/vartovii/scrapers/- Country parameter in spiders