
Naman Vashistha



Automated Job Search: LinkedIn Jobs to Notion Board

notion board

A Python-based job scraping system that pulls LinkedIn listings into a structured Notion database. Repository: jobs-scrape-to-notion

Setup Steps

  1. Clone the repository:
git clone https://github.com/namanvashistha/jobs-scrape-to-notion
cd jobs-scrape-to-notion
  2. Install dependencies:
pip install -r requirements.txt
  3. Configure Notion:

    • Create a Notion integration at notion.so/my-integrations
    • Create a new Notion database
    • Share your database with the integration
    • Copy the database ID from its URL
  4. Set environment variables:

cp .env.example .env

Update .env with your credentials:

NOTION_API_KEY=your_integration_token
NOTION_DATABASE_ID=your_database_id
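As a rough sketch of how the two settings might be consumed, the helper below reads them from the process environment and fails early when one is missing. The function name load_notion_config is hypothetical, and loading the .env file itself (for example with a package such as python-dotenv) is left out because the post does not say how the project does it.

```python
import os


def load_notion_config(env=None):
    """Read the two Notion settings, failing early if either one is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    required = ("NOTION_API_KEY", "NOTION_DATABASE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Checking both variables up front gives one clear error message instead of a failed API call later in the run.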

Key Features

Job Scraping

def fetch_jobs(search_terms, location, results_wanted=20):
    """Scrape LinkedIn jobs for each search term.

    Returns a pandas DataFrame with job details.
    """
    ...  # body omitted here; see the repository for the implementation
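To illustrate the "multiple search terms" idea, here is a minimal sketch that runs one search per term and merges the results. It is not the repository's code: collect_jobs, the search_one callable, and the job_url key are all hypothetical stand-ins, and plain dicts are used instead of a DataFrame to keep the example dependency-free.

```python
def collect_jobs(search_terms, location, results_wanted, search_one):
    """Run one search per term and merge the results, keeping the first copy of each job URL.

    `search_one` is a hypothetical callable (term, location, limit) -> list of dicts,
    standing in for whatever scraping call the project really makes.
    """
    seen_urls = set()
    merged = []
    for term in search_terms:
        for job in search_one(term, location, results_wanted):
            url = job.get("job_url")  # assumed key name
            if not url or url in seen_urls:
                continue  # skip rows without a URL and jobs already collected
            seen_urls.add(url)
            merged.append(job)
    return merged
```

Overlapping terms such as "Backend" and "SDE" can return the same listing, so merging on the URL keeps the board free of repeats before anything is sent to Notion.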

Notion Integration

  • Creates structured database entries
  • Handles rich text, URLs, dates, and company logos
  • Prevents duplicate entries
  • Manages API rate limits
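A database entry of this kind could be assembled roughly as follows. This is a sketch under assumptions: the column names ("Title", "Company", "URL", "Date Posted") and the job dict keys are hypothetical, and the logo is attached as an external page icon, which may differ from how the project stores it. The property shapes follow the Notion API's page-creation format.

```python
def build_page_payload(database_id, job):
    """Build the JSON body for a Notion "create page" request from one job dict."""

    def text(value):
        # Notion represents text as a list of rich text objects.
        return [{"type": "text", "text": {"content": str(value)}}]

    payload = {
        "parent": {"database_id": database_id},
        "properties": {
            # Column names below are hypothetical and must match the database schema.
            "Title": {"title": text(job["title"])},
            "Company": {"rich_text": text(job["company"])},
            "URL": {"url": job["job_url"]},
            "Date Posted": {"date": {"start": job["date_posted"]}},  # ISO 8601 date string
        },
    }
    if job.get("company_logo"):
        # Assumption: the logo is shown as an external icon rather than an uploaded file.
        payload["icon"] = {"type": "external", "external": {"url": job["company_logo"]}}
    return payload
```

The resulting dict would be sent as the JSON body of a POST to the Notion pages endpoint, with the integration token in the Authorization header.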

Data Processing

  • Sanitizes input data
  • Formats salary ranges for Indian currency
  • Handles company metadata
  • Manages file attachments for logos
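For the salary formatting step, one possible approach is sketched below. It applies Indian digit grouping (the last three digits, then groups of two) to whole-rupee amounts. The function names and the handling of missing bounds are assumptions, not the project's actual code.

```python
RUPEE = "\u20b9"  # the rupee sign


def format_inr(amount):
    """Format a whole-rupee amount with Indian digit grouping, e.g. 1234567 -> 12,34,567."""
    value = int(amount)
    digits = str(abs(value))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])  # peel off two digits at a time (lakh, crore, ...)
        head = head[:-2]
    if head:
        groups.insert(0, head)
    sign = "-" if value < 0 else ""
    return sign + RUPEE + ",".join(groups + [tail])


def format_salary_range(min_amount, max_amount):
    """Render a salary range, falling back to a single figure when one bound is missing."""
    if min_amount is None and max_amount is None:
        return ""
    if min_amount is None or max_amount is None:
        return format_inr(min_amount if max_amount is None else max_amount)
    return format_inr(min_amount) + " - " + format_inr(max_amount)
```

With these helpers, a range of 600000 to 1200000 renders as ₹6,00,000 - ₹12,00,000.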

Running the Scraper

python main.py

Default configuration:

  • Search terms: ["Software Engineer", "Backend", "SDE"]
  • Location: India
  • Results per term: 20
  • Platform: LinkedIn

Customization

Modify main() in scraper.py:

search_terms = ["Your", "Preferred", "Terms"]
location = "Your Location"
results_wanted = 30  # Number of results per term

Error Handling

The system includes:

  • Comprehensive logging
  • Rate limit management
  • Duplicate prevention
  • Data validation
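Rate limit management combined with logging might look something like the sketch below. It assumes the caller's request function raises a RateLimited exception when the API answers HTTP 429. Both RateLimited and call_with_retry are hypothetical names, and the retry policy (exponential backoff, honoring a server-provided wait when available) is a common pattern rather than the project's confirmed behavior.

```python
import logging
import time

logger = logging.getLogger("notion_sync")  # hypothetical logger name


class RateLimited(Exception):
    """Raised by the caller's request function when the API answers HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def call_with_retry(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `request()` and retry on RateLimited, waiting longer after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimited as exc:
            if attempt == max_attempts:
                logger.error("Giving up after %d attempts", attempt)
                raise
            # Prefer the server's hint when present, else back off exponentially.
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** (attempt - 1)
            logger.warning("Rate limited, retrying in %.1fs (attempt %d)", delay, attempt)
            sleep(delay)
```

Passing sleep as a parameter keeps the helper testable without real delays; in normal use the default time.sleep applies.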

Visit the repository for source code and detailed documentation.
