ABSA (Aspect-Based Sentiment Analysis)
Overview
ABSA (Aspect-Based Sentiment Analysis) extracts the topics (aspects) that employee reviews mention, along with the sentiment expressed toward each one. It powers the "Topics" page in the dashboard, which shows what employees are talking about (salary, management, work-life balance, etc.) and whether sentiment on each topic is positive, neutral, or negative.
Architecture
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Reviews Table  │────▶│  ABSA Analyzer   │────▶│ review_aspects  │
│   (raw text)    │     │   (extraction)   │     │    (results)    │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                 │
                      ┌──────────┴──────────┐
                      │                     │
                 Rule-based             AI-powered
               (fast, offline)     (accurate, Gemini)
Database Schema
Table: review_aspects
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Primary key |
| review_id | VARCHAR | FK to reviews.review_id |
| aspect | VARCHAR | Topic category (e.g., "salary and benefits") |
| sentiment | VARCHAR | positive, neutral, or negative |
| confidence | FLOAT | 0.0-1.0 confidence score |
| snippet | TEXT | Optional text snippet |
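As a rough illustration of the row shape described in the table, the sketch below models one review_aspects row as a Python dataclass. The class name ReviewAspect, the VALID_SENTIMENTS set, and the validation rules are assumptions made for this example; the actual model in the codebase may look different.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed sentiment labels, taken from the schema table.
VALID_SENTIMENTS = {"positive", "neutral", "negative"}


@dataclass
class ReviewAspect:
    """Hypothetical in-memory mirror of one review_aspects row."""

    review_id: str
    aspect: str
    sentiment: str
    confidence: float
    snippet: Optional[str] = None  # the snippet column is optional
    id: Optional[int] = None       # primary key, assigned by the database

    def __post_init__(self) -> None:
        # Reject values the schema description does not allow.
        if self.sentiment not in VALID_SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0.0-1.0")


row = ReviewAspect("r-1", "management", "negative", 0.8, "boss never listens")
```

Validating at construction time is one possible design choice; it keeps out-of-range confidence scores from reaching the table.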
Predefined Aspect Categories
The system recognizes these topics:
- salary and benefits - Pay, bonuses, insurance, pension
- work-life balance - Hours, overtime, remote work, flexibility
- management - Leadership, bosses, supervision
- company culture - Atmosphere, values, team spirit
- career growth - Promotions, opportunities, development
- job security - Stability, layoffs, restructuring
- work environment - Office, facilities, equipment
- colleagues - Coworkers, team dynamics
- training and development - Learning, courses, skills
- communication - Transparency, information flow, feedback
Usage
Manual Run
cd /Users/vitaliiradionov/Desktop/Vartovii/backend
source venv/bin/activate
# Analyze specific company
python absa_analyzer.py --company "Audi" --limit 200
# Use rule-based only (faster, no AI costs)
python absa_analyzer.py --company "Audi" --no-ai
# Analyze all companies
python absa_analyzer.py --limit 500
Programmatic Usage
from absa_analyzer import ABSAAnalyzer
analyzer = ABSAAnalyzer()
# With AI (more accurate)
analyzer.analyze(company="Audi", limit=100, use_ai=True)
# Rule-based only (faster)
analyzer.analyze(company="Audi", limit=100, use_ai=False)
Automation
ABSA runs automatically after scraping:
- Scraping Service (scraping_service.py) completes a job
- Triggers AI sentiment analysis
- Triggers ABSA for the company (rule-based, 100 reviews)
- Refreshes materialized views
No manual intervention needed for new companies!
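The ordering of those steps can be sketched as a small orchestration function. This is not the code in scraping_service.py: the function name run_post_scrape_pipeline and the three callables passed into it are hypothetical stand-ins, used only to show the sequence and the rule-based, 100-review ABSA call.

```python
from typing import Callable, List


def run_post_scrape_pipeline(
    company: str,
    run_sentiment: Callable[[str], None],
    run_absa: Callable[[str, int, bool], None],
    refresh_views: Callable[[], None],
    absa_limit: int = 100,
) -> List[str]:
    """Run the post-scrape steps in the documented order and return a step log."""
    log: List[str] = []
    run_sentiment(company)
    log.append("sentiment")
    # Automation uses the rule-based extractor, so use_ai is False here.
    run_absa(company, absa_limit, False)
    log.append("absa")
    refresh_views()
    log.append("refresh_views")
    return log


# Stub callables record what would be invoked for a finished "Audi" job.
calls = []
steps = run_post_scrape_pipeline(
    "Audi",
    run_sentiment=lambda c: calls.append(("sentiment", c)),
    run_absa=lambda c, n, ai: calls.append(("absa", c, n, ai)),
    refresh_views=lambda: calls.append(("refresh",)),
)
```

Passing the steps in as callables keeps the sketch independent of the real service internals, which are not shown in this document.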
API Endpoints
GET /api/aspects
Returns aspect sentiment distribution.
Query Parameters:
company - Company name filter (case-insensitive)
Response:
{
"aspects": [
{ "aspect": "salary and benefits", "sentiment": "neutral", "count": 21 },
{ "aspect": "management", "sentiment": "neutral", "count": 18 },
{ "aspect": "work-life balance", "sentiment": "positive", "count": 14 }
]
}
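A client consuming this response will often want the rows grouped by aspect. The following is a minimal sketch of that step, assuming every row carries one of the three documented sentiment values; the helper name pivot_aspects is made up for this example and is not part of the API.

```python
from collections import defaultdict
from typing import Dict, List


def pivot_aspects(rows: List[dict]) -> Dict[str, Dict[str, int]]:
    """Group {aspect, sentiment, count} rows into per-aspect sentiment counts."""
    pivot: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"positive": 0, "neutral": 0, "negative": 0}
    )
    for row in rows:
        # Assumes row["sentiment"] is positive, neutral, or negative.
        pivot[row["aspect"]][row["sentiment"]] += row["count"]
    return dict(pivot)


# Sample payload copied from the response example.
response = {
    "aspects": [
        {"aspect": "salary and benefits", "sentiment": "neutral", "count": 21},
        {"aspect": "management", "sentiment": "neutral", "count": 18},
        {"aspect": "work-life balance", "sentiment": "positive", "count": 14},
    ]
}
by_aspect = pivot_aspects(response["aspects"])
```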
Extraction Methods
1. Rule-Based (default for automated runs)
Fast keyword matching:
- Scans review text for topic keywords
- Uses simple sentiment word detection
- ~60-70% accuracy
- No API costs
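To make the keyword-matching idea concrete, here is a simplified sketch of a rule-based extractor. It is not the logic in absa_analyzer.py: the keyword lists, the sentiment word sets, and the function name extract_aspects are all illustrative assumptions, and only three of the ten aspect categories are covered.

```python
import re
from typing import Dict, List

# Illustrative vocabulary only; the real analyzer's keywords are not documented here.
ASPECT_KEYWORDS: Dict[str, List[str]] = {
    "salary and benefits": ["salary", "pay", "bonus", "pension"],
    "work-life balance": ["overtime", "hours", "remote", "flexible"],
    "management": ["manager", "boss", "leadership"],
}
POSITIVE_WORDS = {"good", "great", "fair", "supportive"}
NEGATIVE_WORDS = {"bad", "poor", "low", "unfair"}


def extract_aspects(text: str) -> List[dict]:
    """Return one {aspect, sentiment} dict per aspect keyword hit, per sentence."""
    results = []
    # Score each sentence separately so sentiment words stay close to their topic.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        if not words:
            continue
        score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
        sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & set(keywords):
                results.append({"aspect": aspect, "sentiment": sentiment})
    return results


found = extract_aspects("The salary is low. My manager is supportive!")
```

A word-counting approach like this cannot handle negation ("not good") or sarcasm, which is one plausible reason for the accuracy gap relative to the AI-powered mode.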
2. AI-Powered (Gemini)
Uses Gemini AI for extraction:
- More accurate context understanding
- Better sentiment detection
- ~85-90% accuracy
- Costs ~$0.001 per review
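The per-review figure makes it easy to estimate the cost of a run before starting it. The helper below is a back-of-the-envelope sketch; the function name estimate_ai_cost is hypothetical, and the default rate is simply the approximate $0.001 quoted above, not a number taken from a pricing page.

```python
def estimate_ai_cost(review_count: int, cost_per_review: float = 0.001) -> float:
    """Approximate Gemini spend in USD for one ABSA run."""
    if review_count < 0:
        raise ValueError("review_count must be non-negative")
    return round(review_count * cost_per_review, 4)


# A 200-review run, matching the --limit 200 example in the Usage section.
cost_200 = estimate_ai_cost(200)
```

At the quoted rate, a 200-review run comes to roughly $0.20 and a 500-review run to roughly $0.50.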
Dashboard Integration
The Topics page (/app → Topics) displays:
- Top 10 Discussed Topics - Bar chart
- Topic Insights - Most positive, most negative, top mentioned
- Topic Details - Drill-down by company
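One way the Topic Insights panel could be derived from per-aspect sentiment counts is sketched below. The ranking rule (highest share of positive or negative mentions, highest total for top mentioned) is an assumption, as is the function name topic_insights; TopicAnalysis.jsx may compute these values differently.

```python
from typing import Dict, Optional


def topic_insights(by_aspect: Dict[str, Dict[str, int]]) -> Dict[str, Optional[str]]:
    """Pick the most positive, most negative, and most mentioned aspects."""
    if not by_aspect:
        return {"most_positive": None, "most_negative": None, "top_mentioned": None}

    def total(aspect: str) -> int:
        return sum(by_aspect[aspect].values())

    def share(aspect: str, sentiment: str) -> float:
        # Fraction of this aspect's mentions that carry the given sentiment.
        n = total(aspect)
        return by_aspect[aspect].get(sentiment, 0) / n if n else 0.0

    aspects = list(by_aspect)
    return {
        "most_positive": max(aspects, key=lambda a: share(a, "positive")),
        "most_negative": max(aspects, key=lambda a: share(a, "negative")),
        "top_mentioned": max(aspects, key=total),
    }


# Made-up counts, for illustration only.
sample = {
    "management": {"positive": 2, "neutral": 5, "negative": 13},
    "colleagues": {"positive": 12, "neutral": 3, "negative": 1},
    "salary and benefits": {"positive": 8, "neutral": 21, "negative": 6},
}
insights = topic_insights(sample)
```

Ranking by share rather than raw count keeps a heavily discussed but mostly neutral topic from dominating the positive and negative slots.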
Troubleshooting
No topics showing
- Check if reviews exist:
SELECT COUNT(*) FROM reviews WHERE company_name ILIKE '%CompanyName%';
- Check if aspects were extracted:
SELECT COUNT(*) FROM review_aspects ra
JOIN reviews r ON ra.review_id = r.review_id
WHERE r.company_name ILIKE '%CompanyName%';
- Run ABSA manually:
python absa_analyzer.py --company "CompanyName" --limit 200
Low accuracy
Switch to AI mode by running the analyzer without the --no-ai flag:
python absa_analyzer.py --company "CompanyName" # AI enabled by default
Related Files
- backend/absa_analyzer.py - Main analyzer script
- backend/api_repositories/aspect_repository.py - Data access layer
- backend/main.py - API endpoints (/api/aspects)
- dashboard_app/src/components/TopicAnalysis.jsx - Frontend component
Future Improvements
- Multilingual support - German keyword detection
- Custom aspects - User-defined topics
- Trend analysis - Aspect sentiment over time
- Aspect clustering - AI-discovered topics