
π¦PostHog Integration
Query Your Product Analytics with SQL and APIs
Overview
The PostHog integration allows you to query your PostHog analytics data using multiple approaches: HogQL (PostHog's SQL dialect), REST API calls, and the PostHog Query API. This integration makes it easy to analyze product analytics data alongside other databases in your data stack.
Quick Start
1. Get Your PostHog Credentials
# From your PostHog instance:
# 1. Get your Project API Key from Project Settings
# 2. Note your Project ID
# 3. Get your PostHog instance URL (e.g., https://app.posthog.com)
2. Connect to PostHog
{
"connection": {
"host": "https://us.posthog.com",
"port": "443",
"password": "YOUR_PROJECT_API_KEY",
"database": "YOUR_PROJECT_ID"
}
}
3. Run Your First Query
-- HogQL query to get recent events
SELECT event, count() as event_count
FROM events
WHERE timestamp > now() - interval 7 day
GROUP BY event
ORDER BY event_count DESC
LIMIT 10;
-- REST API call to get events
GET /api/projects/:project_id/events?limit=10
-- Query API call for insights
QUERY {"kind": "EventsQuery", "select": ["*"], "limit": 10}
Authentication Setup
PostHog Cloud
Get API Key
Go to your PostHog project settings
Navigate to "Project API Key"
Copy your Project API Key
Find Project ID
In PostHog, go to Project Settings
Copy the Project ID (numeric value)
Self-Hosted PostHog
Get Your Instance URL
# Your self-hosted PostHog URL https://posthog.yourcompany.com
Configure Connection
{ "type": "posthog", "host": "posthog.yourcompany.com", "port": "443", "username": "phc_your_project_api_key_here", "database": "12345" }
Query Types
The PostHog integration supports three types of queries:
1. HogQL Queries (Recommended)
PostHog's SQL dialect for querying event data:
SELECT event, properties.$browser, count()
FROM events
WHERE timestamp > '2024-01-01'
AND event = 'page_view'
GROUP BY event, properties.$browser
ORDER BY count() DESC;
2. REST API Calls
Direct API endpoint calls:
# Get events
GET /api/projects/:project_id/events
# Get insights
GET /api/projects/:project_id/insights
# Get persons
GET /api/projects/:project_id/persons
3. Query API Calls
Structured query requests:
QUERY {
"kind": "EventsQuery",
"select": ["event", "timestamp", "properties"],
"where": ["event", "=", "page_view"],
"limit": 100
}
HogQL Reference
HogQL is PostHog's SQL dialect optimized for event analytics.
Basic Syntax
SELECT columns
FROM events | persons | sessions | recordings
WHERE conditions
GROUP BY columns
ORDER BY columns
LIMIT number
Key Tables
events
All tracked events with properties
persons
User/person records with properties
sessions
Session-level data
recordings
Session recording metadata
Common Columns
Events Table
-- Core event fields
SELECT
event, -- Event name
timestamp, -- When the event occurred
distinct_id, -- User identifier
uuid, -- Unique event ID
properties, -- Event properties (JSON)
person_id -- Person UUID
FROM events;
Event Properties
-- Access event properties
SELECT
event,
properties.$browser, -- Browser name
properties.$current_url, -- Current URL
properties.$lib, -- Library used
properties.$os, -- Operating system
properties.custom_property -- Custom properties
FROM events;
Person Properties
-- Access person properties
SELECT
distinct_id,
properties.email,
properties.name,
properties.$initial_browser,
properties.$initial_current_url
FROM persons;
Date/Time Functions
-- Date filtering
WHERE timestamp > '2024-01-01'
WHERE timestamp >= now() - interval 7 day
WHERE timestamp between '2024-01-01' and '2024-01-31'
-- Date formatting
SELECT
toDate(timestamp) as date,
toStartOfWeek(timestamp) as week,
toStartOfMonth(timestamp) as month
FROM events;
Aggregation Functions
-- Counting
SELECT count() as total_events
SELECT count(distinct distinct_id) as unique_users
SELECT countIf(event = 'purchase') as purchases
-- Statistical functions
SELECT
avg(properties.session_duration) as avg_duration,
median(properties.session_duration) as median_duration,
percentile(properties.session_duration, 0.95) as p95_duration
FROM events;
REST API Usage
Event Queries
# Get recent events
GET /api/projects/:project_id/events?limit=100
# Filter events by name
GET /api/projects/:project_id/events?event=page_view
# Filter by date range
GET /api/projects/:project_id/events?after=2024-01-01&before=2024-01-31
Insights API
# Get existing insights
GET /api/projects/:project_id/insights
# Get specific insight
GET /api/projects/:project_id/insights/123
# Create new insight
POST /api/projects/:project_id/insights
{
"name": "Page Views This Week",
"filters": {
"events": [{"id": "page_view"}],
"date_from": "-7d"
}
}
Persons API
# Get persons
GET /api/projects/:project_id/persons
# Search persons
GET /api/projects/:project_id/persons?search=user@example.com
# Get person by ID
GET /api/projects/:project_id/persons/distinct_id
Query API Usage
The Query API provides structured querying capabilities.
Events Query
QUERY {
"kind": "EventsQuery",
"select": [
"event",
"timestamp",
"properties.$browser",
"properties.custom_field"
],
"where": [
"and",
["event", "=", "page_view"],
["timestamp", ">", "2024-01-01"]
],
"orderBy": ["timestamp desc"],
"limit": 100
}
Trends Query
QUERY {
"kind": "TrendsQuery",
"series": [
{
"event": "page_view",
"name": "Page Views"
}
],
"dateRange": {
"date_from": "-30d",
"date_to": null
},
"interval": "day"
}
Funnel Query
QUERY {
"kind": "FunnelsQuery",
"series": [
{"event": "page_view", "name": "Page View"},
{"event": "sign_up", "name": "Sign Up"},
{"event": "purchase", "name": "Purchase"}
],
"dateRange": {
"date_from": "-30d"
}
}
Query Examples
Event Analytics
-- Most popular events
SELECT event, count() as event_count
FROM events
WHERE timestamp > now() - interval 30 day
GROUP BY event
ORDER BY event_count DESC
LIMIT 20;
-- Events by hour of day
SELECT
toHour(timestamp) as hour,
count() as events
FROM events
WHERE timestamp > now() - interval 7 day
GROUP BY hour
ORDER BY hour;
-- Top pages viewed
SELECT
properties.$current_url as page,
count() as page_views,
count(distinct distinct_id) as unique_visitors
FROM events
WHERE event = 'page_view'
AND timestamp > now() - interval 7 day
GROUP BY properties.$current_url
ORDER BY page_views DESC
LIMIT 10;
User Analytics
-- Daily active users
SELECT
toDate(timestamp) as date,
count(distinct distinct_id) as daily_active_users
FROM events
WHERE timestamp > now() - interval 30 day
GROUP BY date
ORDER BY date;
-- User retention by signup date
SELECT
toStartOfWeek(first_seen) as signup_week,
count(distinct distinct_id) as signed_up,
countIf(last_seen > first_seen + interval 7 day) as retained_week1,
countIf(last_seen > first_seen + interval 30 day) as retained_month1
FROM (
SELECT
distinct_id,
min(timestamp) as first_seen,
max(timestamp) as last_seen
FROM events
GROUP BY distinct_id
)
GROUP BY signup_week
ORDER BY signup_week;
Product Analytics
-- Feature usage analysis
SELECT
properties.feature_name,
count() as usage_count,
count(distinct distinct_id) as unique_users
FROM events
WHERE event = 'feature_used'
AND timestamp > now() - interval 30 day
GROUP BY properties.feature_name
ORDER BY usage_count DESC;
-- Session duration analysis
SELECT
properties.$browser,
avg(properties.session_duration) as avg_duration,
median(properties.session_duration) as median_duration,
count(distinct distinct_id) as users
FROM events
WHERE event = 'session_end'
AND timestamp > now() - interval 7 day
GROUP BY properties.$browser
ORDER BY avg_duration DESC;
-- Conversion funnel
SELECT
step,
count(distinct user_id) as users,
count(distinct user_id) * 100.0 / first_value(count(distinct user_id)) OVER (ORDER BY step) as conversion_rate
FROM (
SELECT distinct_id as user_id, 1 as step FROM events WHERE event = 'page_view' AND timestamp > now() - interval 30 day
UNION ALL
SELECT distinct_id as user_id, 2 as step FROM events WHERE event = 'sign_up' AND timestamp > now() - interval 30 day
UNION ALL
SELECT distinct_id as user_id, 3 as step FROM events WHERE event = 'purchase' AND timestamp > now() - interval 30 day
)
GROUP BY step
ORDER BY step;
Performance Analytics
-- Page load times
SELECT
properties.$current_url as page,
avg(properties.load_time) as avg_load_time,
percentile(properties.load_time, 0.95) as p95_load_time,
count() as page_loads
FROM events
WHERE event = 'page_load'
AND timestamp > now() - interval 7 day
AND properties.load_time > 0
GROUP BY properties.$current_url
ORDER BY avg_load_time DESC
LIMIT 20;
-- Error tracking
SELECT
properties.error_type,
properties.error_message,
count() as error_count,
count(distinct distinct_id) as affected_users
FROM events
WHERE event = 'error'
AND timestamp > now() - interval 24 hour
GROUP BY properties.error_type, properties.error_message
ORDER BY error_count DESC;
Marketing Analytics
-- Traffic sources
SELECT
properties.$referring_domain as source,
count() as visits,
count(distinct distinct_id) as unique_visitors,
countIf(event = 'sign_up') as conversions
FROM events
WHERE event = 'page_view'
AND timestamp > now() - interval 30 day
GROUP BY properties.$referring_domain
ORDER BY visits DESC;
-- Campaign performance
SELECT
properties.utm_campaign,
properties.utm_source,
properties.utm_medium,
count(distinct distinct_id) as visitors,
countIf(event = 'purchase') as purchases,
sum(properties.revenue) as total_revenue
FROM events
WHERE timestamp > now() - interval 30 day
AND properties.utm_campaign IS NOT NULL
GROUP BY properties.utm_campaign, properties.utm_source, properties.utm_medium
ORDER BY total_revenue DESC;
REST API Examples
# Get events for specific user
GET /api/projects/:project_id/events?distinct_id=user123
# Get insights with filters
GET /api/projects/:project_id/insights?filter=events&events=[{"id":"page_view"}]
# Export events as CSV
GET /api/projects/:project_id/events/export?format=csv&after=2024-01-01
# Get cohorts
GET /api/projects/:project_id/cohorts
# Get feature flags
GET /api/projects/:project_id/feature_flags
Query API Examples
// User path analysis
QUERY {
"kind": "PathsQuery",
"dateRange": {"date_from": "-7d"},
"pathsFilter": {
"include_event_types": ["$pageview"],
"start_point": "/home"
}
}
// Session recording query
QUERY {
"kind": "SessionsTimelineQuery",
"dateRange": {"date_from": "-24h"},
"events": [
{"event": "error", "properties": {"error_type": "javascript"}}
]
}
// Property analysis
QUERY {
"kind": "EventsQuery",
"select": [
"properties.$browser",
"count()"
],
"where": [
"and",
["event", "=", "page_view"],
["timestamp", ">", "-7d"]
],
"groupBy": ["properties.$browser"]
}
Best Practices
1. Query Performance
-- β
Good: Use specific date ranges
WHERE timestamp > now() - interval 7 day
-- β
Good: Filter early and be specific
WHERE event = 'page_view'
AND timestamp > '2024-01-01'
AND properties.$current_url LIKE '/product%'
-- β
Good: Limit large result sets
LIMIT 1000
-- β Avoid: Querying all historical data
WHERE timestamp > '2020-01-01' -- Too broad
2. Property Access
-- β
Good: Use specific property names
SELECT properties.$browser, properties.custom_field
-- β
Good: Check for property existence
WHERE properties.email IS NOT NULL
-- β
Good: Use proper property types
WHERE toFloat64(properties.price) > 100
3. Aggregations
-- β
Good: Use appropriate aggregation functions
SELECT
count() as total_events,
count(distinct distinct_id) as unique_users,
avg(properties.session_duration) as avg_duration
-- β
Good: Group by meaningful dimensions
GROUP BY event, properties.$browser, toDate(timestamp)
4. API Usage
# β
Good: Use pagination for large datasets
GET /api/projects/:project_id/events?limit=100&offset=100
# β
Good: Filter at the API level
GET /api/projects/:project_id/events?event=page_view&after=2024-01-01
# β
Good: Use appropriate content types
Content-Type: application/json
Authorization: Bearer your_api_key
5. Error Handling
-- β
Good: Handle potential null values
WHERE properties.email IS NOT NULL AND properties.email != ''
-- β
Good: Use safe type conversions
WHERE toFloat64OrNull(properties.price) > 0
-- β
Good: Validate data before aggregating
WHERE timestamp > '2020-01-01' AND timestamp < now()
Troubleshooting
Common Errors
1. "Invalid API Key"
{
"error": "HTTP 401: {\"type\":\"authentication_error\",\"code\":\"invalid_api_key\"}"
}
Solution: Verify your Project API Key is correct and has proper permissions.
2. "Project not found"
{
"error": "HTTP 404: Project not found"
}
Solution: Check that your Project ID is correct and accessible with your API key.
3. "HogQL syntax error"
-- β Error: Invalid HogQL syntax
SELECT * FROM events WHERE timestamp > invalid_date
-- β
Fix: Use proper date format
SELECT * FROM events WHERE timestamp > '2024-01-01'
4. "Property not found"
-- β Error: Property doesn't exist
SELECT properties.nonexistent_field FROM events
-- β
Fix: Check property exists first
SELECT properties.nonexistent_field FROM events
WHERE properties.nonexistent_field IS NOT NULL
Debugging Tips
Start Simple: Begin with basic queries and add complexity gradually
Check Date Ranges: Ensure your date filters return data
Validate Properties: Use
DESCRIBE events
to see available propertiesTest with Limits: Use
LIMIT 10
for initial testingCheck API Responses: Look at raw JSON responses for debugging
Performance Optimization
Use Specific Date Ranges to reduce data scanned
Filter Early with WHERE clauses before aggregation
Limit Results with LIMIT clauses
Index on Properties that you query frequently
Use Sampling for large datasets when approximate results are acceptable
Connection Issues
# Test connection
curl -H "Authorization: Bearer your_api_key" \
https://app.posthog.com/api/projects/your_project_id/
# Verify API key format
# Should start with 'phc_' for project API keys
phc_abcdef1234567890...
# Check project ID is numeric
12345 # β
Correct
project_name # β Incorrect
Advanced Usage
Custom Event Properties
-- Track custom business metrics
SELECT
properties.plan_type,
count(distinct distinct_id) as customers,
sum(toFloat64(properties.monthly_revenue)) as total_revenue,
avg(toFloat64(properties.monthly_revenue)) as avg_revenue_per_customer
FROM events
WHERE event = 'subscription_created'
AND timestamp > now() - interval 30 day
GROUP BY properties.plan_type
ORDER BY total_revenue DESC;
Cohort Analysis
-- Monthly cohort retention
WITH cohorts AS (
SELECT
distinct_id,
toStartOfMonth(min(timestamp)) as cohort_month
FROM events
WHERE event = 'sign_up'
GROUP BY distinct_id
),
monthly_activity AS (
SELECT
distinct_id,
toStartOfMonth(timestamp) as activity_month
FROM events
WHERE event = 'page_view'
GROUP BY distinct_id, activity_month
)
SELECT
cohort_month,
count(distinct c.distinct_id) as cohort_size,
count(distinct ma.distinct_id) as retained_users,
count(distinct ma.distinct_id) * 100.0 / count(distinct c.distinct_id) as retention_rate
FROM cohorts c
LEFT JOIN monthly_activity ma ON c.distinct_id = ma.distinct_id
AND ma.activity_month = cohort_month + interval 1 month
GROUP BY cohort_month
ORDER BY cohort_month;
Real-time Dashboards
-- Live event stream (last 5 minutes)
SELECT
timestamp,
event,
distinct_id,
properties.$current_url
FROM events
WHERE timestamp > now() - interval 5 minute
ORDER BY timestamp DESC
LIMIT 100;
-- Current active users (last 5 minutes)
SELECT count(distinct distinct_id) as current_active_users
FROM events
WHERE timestamp > now() - interval 5 minute;
Getting Help
PostHog Documentation: PostHog Docs
HogQL Reference: HogQL Guide
API Reference: PostHog API Docs
Community: PostHog Slack
Start analyzing your product data with SyneHQ today! π¦β¨
Last updated