Project

Data Retrieval System for Talent Intelligence

A system and platform for creating data insights using open-source technologies.

May 13, 2026

View project repository on GitHub


The Project

Like most projects, this one began with an idea and a question that needed further exploration. The idea: could I query GitHub to create talent insights? The question: where in the world does AI talent reside?

Using Claude Code and GitHub Copilot, I prompted my way to querying and analyzing large amounts of data from the GitHub Archive project on Google BigQuery, which stores public activity on GitHub from 2011 to the present day.
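To give a feel for what such a query can look like, here is a minimal sketch that builds BigQuery Standard SQL against the public `githubarchive` monthly tables. It is not the project's actual query: the function name `build_contributor_query`, the choice of push events as the signal, and the example repositories are all assumptions made for illustration.

```python
import re

# Pattern for "owner/name" repository identifiers; used to reject unsafe
# input before it is interpolated into the SQL text.
_REPO_NAME = re.compile(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+")


def build_contributor_query(repos, year_month, limit=100):
    """Return SQL that counts push events per actor for the given repos.

    Hypothetical helper: the repo list and the PushEvent filter are
    illustrative assumptions, not the project's real selection logic.
    """
    if not re.fullmatch(r"\d{6}", year_month):
        raise ValueError("year_month must look like '202401'")
    invalid = [r for r in repos if not _REPO_NAME.fullmatch(r)]
    if not repos or invalid:
        raise ValueError(f"invalid repository names: {invalid}")
    repo_list = ", ".join(f"'{r}'" for r in repos)
    return (
        "SELECT actor.login AS login, COUNT(*) AS push_events\n"
        f"FROM `githubarchive.month.{year_month}`\n"
        "WHERE type = 'PushEvent'\n"
        f"  AND repo.name IN ({repo_list})\n"
        "GROUP BY login\n"
        "ORDER BY push_events DESC\n"
        f"LIMIT {int(limit)}"
    )


# Example repositories chosen only to illustrate the call.
sql = build_contributor_query(
    ["pytorch/pytorch", "huggingface/transformers"], "202401"
)
print(sql)
```

With the `google-cloud-bigquery` package installed and credentials configured, the resulting string could be run with something like `bigquery.Client().query(sql).result()`. Aggregating in SQL first keeps the amount of data pulled into Python small.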

Pipeline

Data: GitHub Archive · BigQuery SQL aggregation
Pipeline: Python ETL · scoring · enrichment · location fetch
Geocoding: geopy / Nominatim · GeoNames reference data
Database: PostgreSQL 15 + PostGIS · Hetzner VPS · Docker Compose
Dashboards: Metabase · Kepler.gl · Plotly exports
Access: Tailscale · Cloudflare edge
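
The hand-off from the geocoding stage to the database stage listed above could be sketched roughly as follows. This is an assumption about how such a step might be wired, not the project's code. The lookup function is injected so the example runs without network access; in a real run it would presumably wrap geopy's `Nominatim(user_agent=...).geocode`. The names `CachedGeocoder`, `normalize_location`, and `to_ewkt` are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def normalize_location(raw: str) -> str:
    """Lowercase and collapse whitespace so equivalent strings share a cache key."""
    return " ".join(raw.strip().lower().split())


class CachedGeocoder:
    """Wrap a lookup function and remember results per normalized string.

    Free-text profile locations repeat often, so caching avoids sending
    the same request to a rate-limited geocoding service more than once.
    """

    def __init__(self, lookup: Callable[[str], Optional[Coordinates]]):
        self._lookup = lookup
        self._cache: Dict[str, Optional[Coordinates]] = {}
        self.calls = 0  # number of times the underlying lookup was invoked

    def geocode(self, raw: str) -> Optional[Coordinates]:
        key = normalize_location(raw)
        if not key:
            return None
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._lookup(key)
        return self._cache[key]


def to_ewkt(coords: Coordinates) -> str:
    """Format coordinates as EWKT for a PostGIS geometry column (lon before lat)."""
    lat, lon = coords
    return f"SRID=4326;POINT({lon} {lat})"


# Stub lookup standing in for a real geocoder (assumption for this sketch).
known_places = {"berlin, germany": (52.52, 13.405)}
geocoder = CachedGeocoder(known_places.get)

for text in ["Berlin, Germany", "  berlin,   GERMANY ", "somewhere unknown"]:
    coords = geocoder.geocode(text)
    print(repr(text), "->", to_ewkt(coords) if coords else None)
```

Note that PostGIS point text puts longitude first, which is the reverse of the latitude-first order most geocoders return.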