Skip to content

get_content_name_from_url

Extracts human-readable content names from utm_campaign URL parameters. This function is used in the unified_performance_v2 view to provide content name coverage for Google Ads and Microsoft Ads, where creative names are unavailable or contain placeholder values like "0".

Current Version: 1.0.0

Signature

sql
get_content_name_from_url(url STRING, platform STRING) RETURNS STRING

Parameters

ParameterTypeDescription
urlSTRINGThe full URL containing a utm_campaign parameter
platformSTRINGThe platform label (currently unused in logic, reserved for future platform-specific rules)

Return Value

Returns a STRING containing the human-readable content name, normalized to lowercase with underscores replaced by spaces. Returns "Unknown" if the URL is NULL, empty, or does not contain a utm_campaign parameter.

Extraction Pipeline

The function applies a 9-step regex pipeline to extract and clean the content name from the utm_campaign parameter value:

Step 0: Extract utm_campaign

Extracts the value of the utm_campaign parameter from the URL query string.

Input:  https://www.suse.com/lp/edge?utm_campaign=5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_en
Output: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_en

Step 1: Strip mediaplan suffix

Removes _mp_DIGITS and optional trailing _lang suffix (Google pattern).

Before: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_en
After:  5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB

Step 2: Strip trailing language code

Removes trailing 2-letter language codes (_en, _de, _fr, etc.).

Step 3: Strip trailing funnel stages

Removes trailing _TOFU, _MOFU, _BOFU, _All_Tiers, or _Global_Campaign.

Before: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB
After:  5_0003991_Gartner_Edge_Google_Search_RG_GLB

Step 4: Strip trailing geo codes

Removes trailing _RG_GLB, _RG, or _GLB.

Before: 5_0003991_Gartner_Edge_Google_Search_RG_GLB
After:  5_0003991_Gartner_Edge_Google_Search

Step 5: Strip leading campaign code prefix

Removes leading digit(s) + underscore + 4-7 digit campaign code + underscore.

Before: 5_0003991_Gartner_Edge_Google_Search
After:  Gartner_Edge_Google_Search

Step 6-7: Remove platform/media tokens (2 passes)

Removes tokens like Google, Bing, Microsoft, PaidSearch, textads, Search, Display, Paid_Ad, OA, Horizontal, Vertical. Two passes handle adjacent tokens.

Before: Gartner_Edge_Google_Search
After:  Gartner_Edge

Step 8: Clean up

Collapses multiple underscores to single, strips leading/trailing underscores.

Step 9: Format

Replaces underscores with spaces, lowercases, trims.

Final:  gartner edge

Examples

utm_campaign valueExtracted content name
5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_engartner edge
5_0003991_Control_Trust_Your_AI_Innovation_Google_TOFU_RG_GLB_mp_23456control trust your ai innovation
5_0003849_ECM_eBook_Google_Search_TOFU_RG_GLB_mp_34567_enecm ebook

Microsoft Ads URLs

utm_campaign valueExtracted content name
0010372_Control_Trust_Your_AI_Bing_TOFUcontrol trust your ai
0003849_ECM_PaidSearch_MOFUecm
0012367_Rancher_AWS_Bing_Searchrancher aws

Edge Cases

InputOutputReason
URL without utm_campaignUnknownNo parameter to extract
NULL or empty URLUnknownGuard clause
FY-format campaigns (e.g., 9_FY23_DM_ECM_...)Noisy resultNo standard campaign code prefix
Dual campaign codesPartial stripOnly first code removed

Usage in unified_performance_v2

sql
-- ContentName routing in the enrichment layer
CASE
  WHEN b.Platform IN ('Google Ads', 'Microsoft Ads')
  THEN `customer_ads_suse.get_content_name_from_url`(b.DestinationURL, b.Platform)
  ELSE `customer_ads_suse.get_content_name`(b.AdCreativeName, b.CampaignName)
END AS ContentName

This CASE-based routing gives ContentName coverage for all 6 platforms:

  • Google Ads and Microsoft Ads: extracted from DestinationURL via this UDF
  • LinkedIn, Reddit, 6sense, StackAdapt: parsed from AdCreativeName via get_content_name

Standalone Usage

sql
-- Test against Google URLs
SELECT
  CreativeFinalURL,
  `customer_ads_suse.get_content_name_from_url`(CreativeFinalURL, 'Google Ads') AS ContentName
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`
WHERE Date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
  AND CreativeFinalURL IS NOT NULL
LIMIT 20;

-- Test against Microsoft URLs
SELECT
  FinalUrl,
  `customer_ads_suse.get_content_name_from_url`(FinalUrl, 'Microsoft Ads') AS ContentName
FROM `paidteam-data-warehouse.customer_ads_suse.suse_microsoft_adlevel_final_v2`
WHERE Date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
  AND FinalUrl IS NOT NULL
LIMIT 20;

Known Limitations

  1. FY-format Google campaigns (e.g., 9_FY23_DM_ECM_...) produce noisy results because they lack a standard campaign code prefix
  2. Some Microsoft campaigns use generic names like PaidSearch which results in just the product code (e.g., ecm)
  3. Dual campaign codes (e.g., 3_0002977_0003489_...) only strip the first code
  • get_content_name -- Extracts content names from creative naming patterns (used for LinkedIn, Reddit, 6sense, StackAdapt)
  • get_campaign_code -- Extracts campaign codes from URLs or creative names

Version History

VersionDateChanges
1.0.02026-02-08Initial release. 9-step regex pipeline for utm_campaign extraction.

SUSE Paid Advertising Data Warehouse V2