Skip to main content

Regional Filtering & Company Aliasing

Status: 📋 PLANNED Priority: High (after refactoring) Effort: 3-5 days total


Problem Statement

Currently the system shows "Bosch" and "Bosch Gruppe" as separate companies, fragmenting data:

  • "Bosch" → has vacancies (12)
  • "Bosch Gruppe" → has reviews (4939)

Users expect unified company view with regional filtering capability.


Solution: Two-Layer Architecture

Layer 1: Company Aliasing (Grouping)

Merge company variants into canonical entities:

-- New table: company_aliases
CREATE TABLE company_aliases (
id SERIAL PRIMARY KEY,
canonical_name VARCHAR(255) NOT NULL, -- "Bosch"
variant_name VARCHAR(255) NOT NULL, -- "Bosch Gruppe", "Robert Bosch GmbH"
country_hint VARCHAR(5), -- Optional: "de", "at"
UNIQUE(variant_name)
);

-- Examples:
INSERT INTO company_aliases (canonical_name, variant_name) VALUES
('Bosch', 'Bosch'),
('Bosch', 'Bosch Gruppe'),
('Bosch', 'Robert Bosch GmbH'),
('Bosch', 'bosch-gmbh'),
('Lidl', 'Lidl'),
('Lidl', 'Lidl Deutschland'),
('Lidl', 'Lidl Stiftung');

Layer 2: Regional Filter

Add country/region dimension to all queries:

Dashboard UI:
┌──────────────────────────────────────────────┐
│ [Company: Bosch ▼] [Region: 🌍 Global ▼] │
│ 🌍 Global │
│ 🇩🇪 Germany │
│ 🇦🇹 Austria │
│ 🇨🇭 Switzerland │
└──────────────────────────────────────────────┘

Implementation Phases

Phase 1: Company Aliasing (1 day)

  • Create company_aliases table
  • Seed with 20+ major companies
  • Update /api/companies to return canonical names only
  • Update all API endpoints to join on aliases

Phase 2: Region Dropdown UI (1 day)

  • Add region selector to Header.jsx
  • Store selectedRegion in context
  • Pass ?country= to all API calls

Phase 3: Backend Regional Filtering (1 day)

  • Add country parameter to all stats/review endpoints
  • Filter reviews by country column
  • Filter vacancies by location parsing

Phase 4: Global Mode (1 day)

  • When region = "Global", aggregate all regions
  • Compute global Trust Score
  • Show regional breakdown chart

API Changes

Before:

GET /api/stats?company=Bosch

After:

GET /api/stats?company=Bosch&region=de
GET /api/stats?company=Bosch&region=global

Data Model Changes

-- Ensure reviews have country
ALTER TABLE reviews ADD COLUMN IF NOT EXISTS country VARCHAR(5);

-- Ensure vacancies have country
ALTER TABLE job_vacancies ADD COLUMN IF NOT EXISTS country VARCHAR(5);

-- Index for fast filtering
CREATE INDEX idx_reviews_country ON reviews(country);
CREATE INDEX idx_vacancies_country ON job_vacancies(country);

Priority Justification

Do AFTER refactoring because:

  1. Current code has hardcoded DB connections
  2. API endpoints duplicated ILIKE logic everywhere
  3. Frontend has multiple data fetching patterns

Clean architecture first → easier to add regions


  • /backend/main.py - All API endpoints
  • /dashboard_app/src/context/ - State management
  • /dashboard_app/src/components/Header.jsx - Company selector
  • /vartovii/scrapers/ - Country parameter in spiders