Skip to content

Schema Changes

Modifying table schemas in the data warehouse requires care to avoid breaking the refresh pipeline or losing data. This workflow covers safe procedures for common schema modifications.

Core Rules

  1. ALTER COLUMN for type changes -- Use ALTER TABLE ... ALTER COLUMN to change column types. This preserves existing data.
  2. Update BOTH staging AND final -- When changing a schema, both the staging and final tables must be updated to match. A mismatch will cause the refresh script to fail.
  3. Test with the ingestion tool first -- Before deploying schema changes, verify that Dataslayer can still write to the staging table with the new schema.
  4. Never DROP final tables without confirmation -- Final tables contain historical data that may not exist anywhere else.

Common Schema Changes

Changing a Column Type

The most common change is correcting a column's data type, such as changing a rate column from INTEGER to FLOAT64.

Step 1: Alter the staging table

sql
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_staging_v2`
ALTER COLUMN {column_name} SET DATA TYPE FLOAT64;

Step 2: Alter the final table

sql
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_final_v2`
ALTER COLUMN {column_name} SET DATA TYPE FLOAT64;

Step 3: Verify the change

sql
SELECT column_name, data_type
FROM `paidteam-data-warehouse.customer_ads_suse.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name IN (
  'suse_{platform}_adlevel_staging_v2',
  'suse_{platform}_adlevel_final_v2'
)
AND column_name = '{column_name}'
ORDER BY table_name;

Step 4: Test ingestion

Trigger a manual Dataslayer run or insert a test row to confirm the ingestion tool works with the new type.

Real-World Example: Google VideoViewsrate Fix

On 2026-02-08, the VideoViewsrate column in the Google adlevel tables was changed from INTEGER to FLOAT64 because view rates are decimal percentages:

sql
-- Fix staging
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_staging_v2`
ALTER COLUMN VideoViewsrate SET DATA TYPE FLOAT64;

-- Fix final
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`
ALTER COLUMN VideoViewsrate SET DATA TYPE FLOAT64;

Adding a New Column

When a platform starts providing new metrics, you may need to add columns.

Step 1: Add to staging

sql
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_staging_v2`
ADD COLUMN {new_column_name} {data_type};

Step 2: Add to final

sql
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_final_v2`
ADD COLUMN {new_column_name} {data_type};

Step 3: Update the refresh script

Ensure the INSERT statement in the refresh script includes the new column. If it uses SELECT *, no change is needed, but explicit column lists must be updated.

Renaming a Column

BigQuery does not support ALTER TABLE ... RENAME COLUMN directly. The recommended approach:

  1. Add a new column with the desired name
  2. Backfill data from the old column: UPDATE ... SET new_col = old_col WHERE TRUE
  3. Update the refresh script to use the new column name
  4. After verifying, drop the old column (optional)

WARNING

Dropping a column is a destructive operation. Ensure no views, UDFs, or Looker Studio dashboards reference the old column name before dropping it.

Dropping a Column

sql
ALTER TABLE `paidteam-data-warehouse.customer_ads_suse.suse_{platform}_adlevel_final_v2`
DROP COLUMN {column_name};

DANGER

Only drop columns from final tables after confirming that:

  • No views reference the column
  • No UDFs use the column
  • No Looker Studio dashboards display the column
  • The column is not used in any refresh script logic

Pre-Change Checklist

Before making any schema change:

  • [ ] Identify all tables that need the change (staging + final)
  • [ ] Check if any views reference the affected column
  • [ ] Check if any UDFs use the affected column
  • [ ] Verify the target data type is compatible with existing data
  • [ ] Plan the order of operations (staging first, then final)
  • [ ] Prepare rollback steps in case of failure

Comparing Schemas

Use this query to compare the schemas of staging and final tables side by side:

sql
SELECT
  COALESCE(s.column_name, f.column_name) AS column_name,
  s.data_type AS staging_type,
  f.data_type AS final_type,
  CASE
    WHEN s.column_name IS NULL THEN 'MISSING FROM STAGING'
    WHEN f.column_name IS NULL THEN 'MISSING FROM FINAL'
    WHEN s.data_type != f.data_type THEN 'TYPE MISMATCH'
    ELSE 'OK'
  END AS status
FROM (
  SELECT column_name, data_type
  FROM `paidteam-data-warehouse.customer_ads_suse.INFORMATION_SCHEMA.COLUMNS`
  WHERE table_name = 'suse_{platform}_adlevel_staging_v2'
) s
FULL OUTER JOIN (
  SELECT column_name, data_type
  FROM `paidteam-data-warehouse.customer_ads_suse.INFORMATION_SCHEMA.COLUMNS`
  WHERE table_name = 'suse_{platform}_adlevel_final_v2'
) f
ON s.column_name = f.column_name
ORDER BY column_name;

Post-Change Verification

After making a schema change, verify:

  1. Schema comparison shows no mismatches (except _ingested_at in final only)
  2. Ingestion test succeeds with the new schema
  3. Refresh script runs without errors
  4. Views referencing the changed tables still work
  5. Looker Studio dashboards render correctly

SUSE Paid Advertising Data Warehouse V2