Reddit Scraper

Collects company discussions from Reddit to provide social sentiment data.

Overview

Attribute	Value
Source	Reddit API (PRAW)
Auth	OAuth2
Rate Limit	60 requests/minute
Cache	6 hours
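
The 6-hour cache could be sketched as a small in-memory store with a time-to-live. This is an illustration only: the source does not describe the cache backend, and `TTLCache`, `CACHE_TTL_SECONDS`, and the injectable `clock` parameter are hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 6 * 60 * 60  # 6 hours, as listed in the overview table


class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl_seconds.

    Assumption: the real scraper may use a file, database, or other
    backend instead; only the 6-hour lifetime comes from the source.
    """

    def __init__(
        self,
        ttl_seconds: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        """Store a value together with the time it was cached."""
        self._store[key] = (self._clock(), value)
```

Keying the cache by company name would let repeated lookups for the same company within six hours skip the Reddit API entirely, which also helps stay under the request limit.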

Data Collected

Field	Description
`title`	Post title
`text`	Post body + top comments
`score`	Net score (upvotes minus downvotes)
`subreddit`	Source subreddit
`date`	Post creation date

Target Subreddits

Subreddit	Content
r/jobs	Job hunting, reviews
r/careerguidance	Career advice
r/cscareerquestions	Tech companies
r/antiwork	Workplace issues
r/germany	DACH-specific

Search Strategy

# Search across multiple subreddits
posts = search_company_posts("BMW", limit=100)
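
The body of `search_company_posts` is not shown in the source. A minimal sketch of what it might look like with PRAW follows, assuming the caller passes in an authenticated `praw.Reddit` client. The `reddit` and `subreddits` parameters and the `TARGET_SUBREDDITS` constant are hypothetical additions, and top comments are left out for brevity.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Sequence

# Subreddit names taken from the "Target Subreddits" table.
TARGET_SUBREDDITS = (
    "jobs",
    "careerguidance",
    "cscareerquestions",
    "antiwork",
    "germany",
)


def search_company_posts(
    company: str,
    limit: int = 100,
    *,
    reddit: Any,  # assumption: an authenticated praw.Reddit instance
    subreddits: Sequence[str] = TARGET_SUBREDDITS,
) -> List[Dict[str, Any]]:
    """Search the target subreddits for posts that match a company name."""
    # Joining names with "+" addresses several subreddits in one search.
    multi = reddit.subreddit("+".join(subreddits))
    posts = []
    for submission in multi.search(company, limit=limit):
        posts.append(
            {
                "title": submission.title,
                "text": submission.selftext,  # post body only in this sketch
                "score": submission.score,
                "subreddit": submission.subreddit.display_name,
                "date": datetime.fromtimestamp(
                    submission.created_utc, tz=timezone.utc
                ),
            }
        )
    return posts


# Usage (requires the praw package and Reddit API credentials):
#   import praw
#   reddit = praw.Reddit(client_id=..., client_secret=..., user_agent=...)
#   posts = search_company_posts("BMW", limit=100, reddit=reddit)
```

Taking the client as a parameter keeps credentials out of the function and makes it possible to test with a stand-in object.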

Relevance Filtering

Minimum 5 upvotes
Must mention company name
Exclude job listings
Exclude promotional posts
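
The four rules above could be combined into a single predicate, roughly as follows. The source does not say how job listings or promotional posts are recognized, so the marker phrases and the function names here are placeholder assumptions.

```python
import re
from typing import Any, Dict, Iterable, List

MIN_SCORE = 5  # "Minimum 5 upvotes"

# Hypothetical marker phrases; the real detection logic is not documented.
JOB_LISTING_MARKERS = ("we are hiring", "we're hiring", "apply now", "job opening")
PROMO_MARKERS = ("discount code", "sponsored", "use my referral")


def mentions_company(text: str, company: str) -> bool:
    """True if the company name appears as a whole word, ignoring case."""
    pattern = r"(?<!\w)" + re.escape(company) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def is_relevant(post: Dict[str, Any], company: str) -> bool:
    """Apply the four relevance rules to one post dict."""
    text = f"{post['title']} {post['text']}"
    lowered = text.lower()
    if post["score"] < MIN_SCORE:
        return False
    if not mentions_company(text, company):
        return False
    if any(marker in lowered for marker in JOB_LISTING_MARKERS):
        return False
    if any(marker in lowered for marker in PROMO_MARKERS):
        return False
    return True


def filter_relevant(posts: Iterable[Dict[str, Any]], company: str) -> List[Dict[str, Any]]:
    """Keep only the posts that pass every rule."""
    return [post for post in posts if is_relevant(post, company)]
```

The whole-word match avoids counting a name that only appears inside a longer token, at the cost of missing abbreviations or misspellings of the company name.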

Sentiment Derivation

Reddit posts carry no star ratings, so sentiment is derived from:

Post score (upvotes)
Comment tone
Keyword detection
Subreddit context
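
One way these signals might be combined is sketched below. It is not the documented method: the word lists, the `SUBREDDIT_BIAS` offsets, the log-based weighting by score, and all function names are assumptions chosen for illustration. Comment tone is covered only crudely, through keywords in the `text` field.

```python
import math
import re
from typing import Any, Dict, Iterable

# Hypothetical keyword lists; a real lexicon would be much larger.
POSITIVE_WORDS = {"great", "supportive", "flexible", "recommend", "fair"}
NEGATIVE_WORDS = {"toxic", "burnout", "underpaid", "layoffs", "micromanagement"}

# Hypothetical offsets that compensate for a subreddit's typical tone.
SUBREDDIT_BIAS = {"antiwork": 0.2}


def keyword_polarity(text: str) -> float:
    """Return a value in [-1, 1] from positive vs. negative keyword counts."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total


def post_sentiment(post: Dict[str, Any]) -> float:
    """Keyword polarity adjusted for subreddit context, clamped to [-1, 1]."""
    polarity = keyword_polarity(f"{post['title']} {post['text']}")
    polarity += SUBREDDIT_BIAS.get(post["subreddit"].lower(), 0.0)
    return max(-1.0, min(1.0, polarity))


def post_weight(post: Dict[str, Any]) -> float:
    """Weight a post by its score; the log damps very popular posts."""
    return math.log1p(max(post["score"], 0))


def company_sentiment(posts: Iterable[Dict[str, Any]]) -> float:
    """Score-weighted average sentiment; 0.0 when there is nothing to weigh."""
    posts = list(posts)
    total_weight = sum(post_weight(post) for post in posts)
    if total_weight == 0:
        return 0.0
    weighted = sum(post_sentiment(post) * post_weight(post) for post in posts)
    return weighted / total_weight
```

Here the post score acts as a weight rather than as a sentiment value in its own right, since a highly upvoted post can be strongly negative about a company.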

Privacy

Only public posts collected
Author names not stored
Collection complies with Reddit API terms
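
The rule that author names are not stored could be enforced with a field whitelist applied before anything is persisted. A minimal sketch, assuming raw posts arrive as dicts; `STORED_FIELDS` and `to_stored_record` are hypothetical names.

```python
from typing import Any, Dict

# Only the fields listed under "Data Collected" are kept.
STORED_FIELDS = ("title", "text", "score", "subreddit", "date")


def to_stored_record(raw_post: Dict[str, Any]) -> Dict[str, Any]:
    """Copy only whitelisted fields, dropping the author and anything else."""
    return {field: raw_post[field] for field in STORED_FIELDS}
```

A whitelist is safer than deleting an `author` key, because any new field added upstream stays out of storage until it is explicitly allowed.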

Reddit provides unfiltered employee perspectives.
