Daily Pipeline

The daily pipeline runs automatically on a fixed UTC schedule. Data flows from advertising platforms through Dataslayer (the ingestion tool) into BigQuery staging tables, then from staging into partitioned final tables via scheduled refresh scripts.

Schedule

| Time (UTC) | Step | Description |
|------------|------|-------------|
| 05:00 | Dataslayer | Reddit staging ingestion (30-day lookback window) |
| 06:00 | Dataslayer | Google staging ingestion (60-day lookback window) |
| 06:00 | Dataslayer | Microsoft staging ingestion (60-day lookback window) |
| 06:00 | Dataslayer | LinkedIn staging ingestion (15-day lookback window) |
| 06:00 | Dataslayer | LinkedIn creative staging ingestion |
| 07:30 | Creative Refresh Script | Merges creative staging into creative final tables |
| 08:00 | Adlevel V2 Refresh Script | Refreshes adlevel final tables from staging |
| 08:30 | Budget Refresh Script | Replaces budget final tables from Google Sheets external tables (see Step 4) |

Step-by-Step Flow

Step 1: Dataslayer Ingestion (05:00 - 06:00 UTC)

Dataslayer connects to each advertising platform's API and writes raw performance data into the corresponding staging tables.

Staging tables written:

| Platform | Staging Table | Lookback |
|----------|---------------|----------|
| Reddit | suse_reddit_adlevel_staging_v2 | 30 days |
| Google | suse_google_adlevel_staging_v2 | 60 days |
| Microsoft | suse_microsoft_adlevel_staging_v2 | 60 days |
| LinkedIn | suse_linkedin_adlevel_staging_v2 | 15 days |
| LinkedIn Creative | suse_linkedin_creative_staging_v2 | Full refresh |

What happens: Dataslayer overwrites the staging table with fresh data from the platform API for the configured lookback window. Staging tables are treated as ephemeral -- they reflect only the most recent ingestion.

Lookback windows differ by platform based on data freshness requirements and API rate limits. Reddit uses a shorter 30-day window, while Google and Microsoft use 60 days to capture delayed conversion attribution.
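
After an ingestion, a quick spot-check can confirm that a staging table actually covers its configured window (Google shown; this assumes Date is a DATE column, as in the verification queries later on this page):

```sql
-- Google staging should span roughly the 60-day lookback window.
SELECT
  MIN(Date) AS earliest_date,
  MAX(Date) AS latest_date,
  DATE_DIFF(MAX(Date), MIN(Date), DAY) + 1 AS days_covered
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_staging_v2`;
```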

Step 2: Creative Refresh Script (07:30 UTC)

The Creative Refresh Script runs a MERGE operation for each platform that has a creative table. It compares the creative staging table against the creative final table and:

  • Updates existing records that have changed
  • Inserts new records not yet in the final table
  • Preserves records in final that are no longer in staging (soft retention)
  • Adds _ingested_at timestamp to all inserted/updated rows
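
A minimal sketch of this MERGE for LinkedIn, assuming creative_id is the join key and creative_name stands in for the real column list (both are illustrative; the production script's schema may differ):

```sql
MERGE `paidteam-data-warehouse.customer_ads_suse.suse_linkedin_creative_final_v2` AS f
USING `paidteam-data-warehouse.customer_ads_suse.suse_linkedin_creative_staging_v2` AS s
ON f.creative_id = s.creative_id
WHEN MATCHED THEN
  UPDATE SET
    creative_name = s.creative_name,
    _ingested_at  = CURRENT_TIMESTAMP()
WHEN NOT MATCHED THEN
  INSERT (creative_id, creative_name, _ingested_at)
  VALUES (s.creative_id, s.creative_name, CURRENT_TIMESTAMP());
-- No WHEN NOT MATCHED BY SOURCE clause, so rows present only in the
-- final table are left untouched (the "soft retention" described above).
```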

Tables processed:

| Platform | Staging Source | Final Target |
|----------|----------------|--------------|
| LinkedIn | suse_linkedin_creative_staging_v2 | suse_linkedin_creative_final_v2 |
| Reddit | suse_reddit_creative_staging_v2 | suse_reddit_creative_final_v2 |

Logging: Each platform's refresh status is logged to creative_refresh_log_v2 with one of:

  • OK -- Merge completed successfully
  • ERROR -- An exception occurred (logged with error message)
  • SKIPPED -- Platform was skipped (e.g., staging table empty)
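
To surface only problems from recent runs, the log-check query later on this page can be filtered on status (the column name is an assumption based on the values above; run_timestamp matches the verification queries):

```sql
SELECT *
FROM `paidteam-data-warehouse.customer_ads_suse.creative_refresh_log_v2`
WHERE status != 'OK'  -- assumed column name; values per the list above
  AND run_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY run_timestamp DESC;
```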

Step 3: Adlevel V2 Refresh Script (08:00 UTC)

The Adlevel Refresh Script uses a DELETE + INSERT pattern for each platform:

  1. DELETE rows from the final table where the Date + ID combination matches rows in the staging table
  2. INSERT all rows from staging into the final table with an _ingested_at timestamp

This approach preserves historical data in the final table that falls outside the staging lookback window.
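
A minimal sketch of the pattern for Google, assuming Date and ID are the match keys named above and that the final table's schema is the staging columns plus a trailing _ingested_at:

```sql
-- 1. Remove final rows that will be superseded by staging rows.
DELETE FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2` f
WHERE EXISTS (
  SELECT 1
  FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_staging_v2` s
  WHERE s.Date = f.Date AND s.ID = f.ID
);

-- 2. Re-insert the full staging window with an ingestion timestamp.
INSERT INTO `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`
SELECT s.*, CURRENT_TIMESTAMP() AS _ingested_at
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_staging_v2` AS s;
```

Rows in final whose Date + ID keys no longer appear in staging (i.e., anything older than the lookback window) are never deleted, which is what preserves the history.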

Tables processed:

| Platform | Staging Source | Final Target |
|----------|----------------|--------------|
| Google | suse_google_adlevel_staging_v2 | suse_google_adlevel_final_v2 |
| Microsoft | suse_microsoft_adlevel_staging_v2 | suse_microsoft_adlevel_final_v2 |
| LinkedIn | suse_linkedin_adlevel_staging_v2 | suse_linkedin_adlevel_final_v2 |
| Reddit | suse_reddit_adlevel_staging_v2 | suse_reddit_adlevel_final_v2 |
| 6sense | suse_6sense_adlevel_staging_v2 | suse_6sense_adlevel_final_v2 |
| StackAdapt | suse_stackadapt_adlevel_staging_v2 | suse_stackadapt_adlevel_final_v2 |

Logging: Each platform's refresh status is logged to refresh_log_v2.

Failure Handling

Both the Creative and Adlevel refresh scripts use BEGIN...EXCEPTION...END blocks around each platform's logic. If one platform fails:

  1. The error is caught and logged with status = 'ERROR' and the error message
  2. The script continues to the next platform
  3. Other platforms are not affected by the failure

This means a Google API outage will not prevent LinkedIn, Reddit, or other platforms from refreshing normally.
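
A sketch of this isolation pattern in BigQuery scripting. The log columns other than run_timestamp (platform, status, error_message) are assumptions about the log schema, and a deliberate division by zero stands in for a platform failure:

```sql
BEGIN
  -- The platform's real refresh logic (e.g., the DELETE + INSERT pair)
  -- runs here; this placeholder simply raises an error.
  SELECT 1 / 0;
EXCEPTION WHEN ERROR THEN
  -- Catch the error, log it, and let the script continue.
  INSERT INTO `paidteam-data-warehouse.customer_ads_suse.refresh_log_v2`
    (platform, status, error_message, run_timestamp)
  VALUES ('Google', 'ERROR', @@error.message, CURRENT_TIMESTAMP());
END;
-- Execution proceeds to the next platform's BEGIN...EXCEPTION block.
```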

Verification

After the pipeline completes, verify successful execution:

Check Adlevel Refresh Log

```sql
SELECT *
FROM `paidteam-data-warehouse.customer_ads_suse.refresh_log_v2`
WHERE run_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY run_timestamp DESC;
```

Check Creative Refresh Log

```sql
SELECT *
FROM `paidteam-data-warehouse.customer_ads_suse.creative_refresh_log_v2`
WHERE run_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY run_timestamp DESC;
```

Verify Data Freshness

```sql
-- Check the most recent date in each final table
SELECT 'Google' AS platform, MAX(Date) AS latest_date
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`
UNION ALL
SELECT 'Microsoft', MAX(Date)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_microsoft_adlevel_final_v2`
UNION ALL
SELECT 'LinkedIn', MAX(Date)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_linkedin_adlevel_final_v2`
UNION ALL
SELECT 'Reddit', MAX(Date)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_reddit_adlevel_final_v2`
ORDER BY platform;
```

Verify Row Counts

```sql
-- Compare staging vs final row counts for a recent date range
SELECT
  'Staging' AS source,
  COUNT(*) AS row_count,
  MIN(Date) AS min_date,
  MAX(Date) AS max_date
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_staging_v2`

UNION ALL

SELECT
  'Final' AS source,
  COUNT(*) AS row_count,
  MIN(Date) AS min_date,
  MAX(Date) AS max_date
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`;
```

Step 4: Budget Refresh (08:30 UTC)

The Budget Refresh Script reads budget data from Google Sheets via external tables, UNPIVOTs the wide format into long format, and fully replaces the budget final tables. See the Budget Pacing Workflow for details.
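
A minimal sketch of the reshape for the platform budget table. The column names here (platform, and monthly columns jan_2025 onward) are hypothetical; the actual sheet layout is documented in the Budget Pacing Workflow:

```sql
-- Full replace: the final table is rebuilt from the Sheets-backed
-- external staging table on every run.
CREATE OR REPLACE TABLE
  `paidteam-data-warehouse.customer_ads_suse.suse_budget_platform_final_v2` AS
SELECT
  platform,
  month,   -- column name produced by the UNPIVOT below
  budget   -- value column produced by the UNPIVOT below
FROM `paidteam-data-warehouse.customer_ads_suse.suse_budget_platform_staging_v2`
UNPIVOT (budget FOR month IN (jan_2025, feb_2025, mar_2025));
```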

Tables refreshed:

| Source (External) | Target (Final) |
|-------------------|----------------|
| suse_budget_platform_staging_v2 | suse_budget_platform_final_v2 |
| suse_budget_region_staging_v2 | suse_budget_region_final_v2 |
| suse_budget_config_staging_v2 | suse_budget_config_v2 |

Logging: Results logged to suse_budget_refresh_log_v2.

Manual Platforms: 6sense & StackAdapt

6sense and StackAdapt are not supported by Dataslayer and are ingested manually via Google Sheets Apps Script. They are not part of the daily automated pipeline above.

To update 6sense or StackAdapt data:

  1. Export the CSV report from the platform dashboard
  2. Open the SUSE Ads Google Sheet
  3. Use the HoD Menu to trigger processing:
    • 6sense: HoD Menu > V3 Drive Processing > 6sense (or Complete All)
    • StackAdapt: HoD Menu > StackAdapt > Complete All
  4. The script parses the CSV, formats it, and uploads to BigQuery (both adlevel and status data)

These platforms' data freshness depends entirely on when the last manual upload was run.
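
To check how stale they currently are, query the latest Date in their final tables (same pattern as the data freshness check above):

```sql
SELECT '6sense' AS platform, MAX(Date) AS latest_date
FROM `paidteam-data-warehouse.customer_ads_suse.suse_6sense_adlevel_final_v2`
UNION ALL
SELECT 'StackAdapt', MAX(Date)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_stackadapt_adlevel_final_v2`;
```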

Timing Considerations

  • Dataslayer ingestion times are approximate. Platform API latency can cause delays.
  • The Creative Refresh Script starts at 07:30 to allow a buffer after Dataslayer completes.
  • The Adlevel Refresh Script starts at 08:00 to ensure creative data is ready before adlevel data is refreshed.
  • Looker Studio dashboards query the final tables, so they reflect updated adlevel and creative data after 08:00 UTC and updated budget data after 08:30 UTC.
  • 6sense and StackAdapt data is only as fresh as the last manual upload.