Series Overview

This series applies data science techniques to understand the U.S. Congress through legislative data. From web scraping congressional records to building machine learning models for policy classification, we explore how data can illuminate the patterns and priorities of American democracy.

What You’ll Learn

The Journey

“Exploring the 117th U.S. Congress” establishes the data foundation through comprehensive web scraping and exploratory analysis. It covers how bills move through the legislative process, which parties introduce which types of legislation, and how success rates vary across policy areas.
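As a rough illustration of the last point, the sketch below computes a per-policy-area success rate from a list of bill records. The field names (`policy_area`, `became_law`), the definition of "success" as becoming law, and the toy records are all assumptions made for this example; they are not the schema or data used in the series.

```python
from collections import defaultdict

def success_rates(bills):
    """Return {policy_area: fraction of bills in that area that became law}."""
    totals = defaultdict(int)   # bills introduced per policy area
    passed = defaultdict(int)   # bills that became law per policy area
    for bill in bills:
        area = bill["policy_area"]
        totals[area] += 1
        if bill["became_law"]:
            passed[area] += 1
    return {area: passed[area] / totals[area] for area in totals}

# Made-up records for illustration only (not real bills).
TOY_BILLS = [
    {"policy_area": "healthcare", "became_law": True},
    {"policy_area": "healthcare", "became_law": False},
    {"policy_area": "defense", "became_law": False},
    {"policy_area": "defense", "became_law": False},
    {"policy_area": "economics", "became_law": True},
]

rates = success_rates(TOY_BILLS)
for area, rate in sorted(rates.items()):
    print(f"{area}: {rate:.0%}")
```

The same grouping could be keyed on sponsor party instead of policy area to compare what each party introduces.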

“Congressional Bill Policy Area Classification” applies machine learning to categorize bills by policy area automatically. Using 48,000+ bills from three Congresses, we build baseline models that distinguish between healthcare, defense, economics, and other policy domains.
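To give a sense of what a baseline text classifier can look like, here is a minimal sketch of a bag-of-words multinomial Naive Bayes model written with only the standard library. It is an assumption chosen for illustration, not necessarily the model family used in the series; the function names, the three labels, and the toy bill titles are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(examples):
    """Count documents per class and words per class from (text, label) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    # Words never seen in training carry no information, so drop them.
    tokens = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up titles for illustration only (not real bills).
TOY_BILLS = [
    ("A bill to expand Medicare coverage for hospital care", "healthcare"),
    ("A bill to lower prescription drug costs for patients", "healthcare"),
    ("A bill to authorize military construction funding", "defense"),
    ("A bill to improve veterans and military readiness programs", "defense"),
    ("A bill to reduce income tax rates for small businesses", "economics"),
    ("A bill to extend the child tax credit", "economics"),
]

model = train_naive_bayes(TOY_BILLS)
print(predict(model, "A bill to cap insulin costs for Medicare patients"))
print(predict(model, "A bill to fund military bases"))
```

With real data, the toy list would be replaced by bill titles or summaries and their policy-area labels, and predictions would be scored on a held-out split rather than on hand-picked examples.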

Technical Skills

This series demonstrates practical data science workflows:

Real-World Applications

These techniques enable broader applications in:

Broader Context

Understanding congressional data provides insights into:

Future Directions

This foundation enables more advanced analyses:

This series is aimed at data scientists interested in civic applications, political researchers seeking quantitative methods, and anyone curious about applying machine learning to understand democratic processes.