Skip to content

Debugging Refresh Failures

When the adlevel or creative refresh scripts encounter errors, they log the failure and continue processing other platforms. This guide covers how to identify, diagnose, and resolve refresh failures.

Log Tables

Adlevel Refresh Log

sql
SELECT *
FROM `paidteam-data-warehouse.customer_ads_suse.refresh_log_v2`
WHERE run_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY run_timestamp DESC;

Creative Refresh Log

sql
SELECT *
FROM `paidteam-data-warehouse.customer_ads_suse.creative_refresh_log_v2`
WHERE run_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY run_timestamp DESC;

Status Values

Both log tables use the following status codes:

StatusMeaningAction Required
OKRefresh completed successfullyNone
WARNINGRefresh completed with non-critical issuesReview the message for details
SKIPPEDPlatform was intentionally skippedExpected for platforms not yet configured
DUPLICATEDuplicate run detected, skipped to prevent double-processingNone -- normal behavior if script ran twice
ERRORAn exception occurred during refreshInvestigate immediately
NOT_CONFIGUREDPlatform does not have the required tables set upExpected for platforms in development

Common Failure Modes

1. Schema Mismatch Between Staging and Final

Symptom: ERROR status with a message like "Column not found" or "Type mismatch".

Cause: The ingestion tool (Dataslayer) added or changed a column in the staging table, but the final table was not updated to match.

Diagnosis:

sql
-- Compare staging vs final schemas
SELECT
  s.column_name,
  s.data_type AS staging_type,
  f.data_type AS final_type
FROM (
  SELECT column_name, data_type
  FROM `paidteam-data-warehouse.customer_ads_suse.INFORMATION_SCHEMA.COLUMNS`
  WHERE table_name = 'suse_{platform}_adlevel_staging_v2'
) s
FULL OUTER JOIN (
  SELECT column_name, data_type
  FROM `paidteam-data-warehouse.customer_ads_suse.INFORMATION_SCHEMA.COLUMNS`
  WHERE table_name = 'suse_{platform}_adlevel_final_v2'
) f
ON s.column_name = f.column_name
WHERE s.data_type != f.data_type
   OR s.column_name IS NULL
   OR f.column_name IS NULL;

Resolution: See the Schema Changes workflow to update the final table schema.

2. NULL Values in ID Columns

Symptom: ERROR status or unexpected row counts after refresh.

Cause: The advertising platform API returned rows with NULL values in ID columns (CampaignID, AdGroupID, AdID). These can cause issues with the DELETE + INSERT logic.

Diagnosis:

sql
-- Check for NULL IDs in staging
SELECT
  COUNT(*) AS total_rows,
  COUNTIF(CampaignID IS NULL) AS null_campaign_ids,
  COUNTIF(AdGroupID IS NULL) AS null_adgroup_ids,
  COUNTIF(AdID IS NULL) AS null_ad_ids
FROM `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_staging_v2`;

Resolution: If NULL IDs are legitimate (some platforms return summary rows), add WHERE filters to exclude them. If they represent data quality issues, contact the platform or Dataslayer support.

3. Stale Data in Staging

Symptom: OK status but final table shows outdated data. The MAX(Date) in the final table is older than expected.

Cause: Dataslayer ingestion failed or did not run before the refresh script executed.

Diagnosis:

sql
-- Check staging table freshness
SELECT
  MAX(Date) AS latest_date,
  COUNT(*) AS row_count
FROM `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_staging_v2`;

-- Check final table freshness
SELECT
  MAX(Date) AS latest_date,
  MAX(_ingested_at) AS latest_ingestion
FROM `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_final_v2`;

Resolution: Trigger a manual Dataslayer run, then re-run the refresh script. The refresh is idempotent, so running it again is safe.

4. Creative MERGE Conflict

Symptom: ERROR status in creative refresh log with messages about duplicate keys.

Cause: The staging table contains duplicate rows for the same primary key.

Diagnosis:

sql
-- Check for duplicate creative IDs in staging (single-key example)
SELECT CreativeID, COUNT(*) AS cnt
FROM `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_creative_staging_v2`
GROUP BY CreativeID
HAVING cnt > 1;

-- Check for duplicates with composite key (Reddit example)
SELECT CampaignID, AdGroupID, AdID, COUNT(*) AS cnt
FROM `paidteam-data-warehouse.customer_ads_suse.suse_reddit_creative_staging_v2`
GROUP BY CampaignID, AdGroupID, AdID
HAVING cnt > 1;

Resolution: Deduplicate the staging data before the MERGE runs, or add deduplication logic to the MERGE statement using a subquery with ROW_NUMBER().

5. Permission or Resource Errors

Symptom: ERROR status with messages about "Access Denied" or "Resources exceeded".

Cause: Service account permissions changed, or the query exceeded BigQuery resource limits.

Resolution:

  • For permissions: Verify the service account has bigquery.dataEditor and bigquery.jobUser roles
  • For resources: Optimize the query, add partition pruning, or increase slot allocation

Diagnostic Queries

Recent Error Summary

sql
-- All errors in the last 7 days
SELECT
  run_timestamp,
  platform,
  status,
  message
FROM `paidteam-data-warehouse.customer_ads_suse.refresh_log_v2`
WHERE status = 'ERROR'
  AND run_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY run_timestamp DESC;

Platform Health Dashboard

sql
-- Latest status for each platform
SELECT
  platform,
  status,
  message,
  run_timestamp
FROM (
  SELECT
    platform,
    status,
    message,
    run_timestamp,
    ROW_NUMBER() OVER (PARTITION BY platform ORDER BY run_timestamp DESC) AS rn
  FROM `paidteam-data-warehouse.customer_ads_suse.refresh_log_v2`
)
WHERE rn = 1
ORDER BY platform;

Data Freshness Across All Platforms

sql
SELECT 'Google' AS platform, MAX(Date) AS latest_date, MAX(_ingested_at) AS latest_ingestion
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`
UNION ALL
SELECT 'Microsoft', MAX(Date), MAX(_ingested_at)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_microsoft_adlevel_final_v2`
UNION ALL
SELECT 'LinkedIn', MAX(Date), MAX(_ingested_at)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_linkedin_adlevel_final_v2`
UNION ALL
SELECT 'Reddit', MAX(Date), MAX(_ingested_at)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_reddit_adlevel_final_v2`
UNION ALL
SELECT '6sense', MAX(Date), MAX(_ingested_at)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_6sense_adlevel_final_v2`
UNION ALL
SELECT 'StackAdapt', MAX(Date), MAX(_ingested_at)
FROM `paidteam-data-warehouse.customer_ads_suse.suse_stackadapt_adlevel_final_v2`
ORDER BY platform;

Escalation

If a refresh failure cannot be resolved using the steps above:

  1. Check the BigQuery job history in the Google Cloud Console for detailed error information
  2. Review the Dataslayer dashboard for ingestion failures
  3. Check the platform's API status page for outages
  4. Review recent schema or configuration changes in the Changelog

SUSE Paid Advertising Data Warehouse V2