Appearance
get_content_name_from_url
Extracts human-readable content names from utm_campaign URL parameters. This function is used in the unified_performance_v2 view to provide content name coverage for Google Ads and Microsoft Ads, where creative names are unavailable or contain placeholder values like "0".
Current Version: 1.0.0
Signature
sql
get_content_name_from_url(url STRING, platform STRING) RETURNS STRINGParameters
| Parameter | Type | Description |
|---|---|---|
url | STRING | The full URL containing a utm_campaign parameter |
platform | STRING | The platform label (currently unused in logic, reserved for future platform-specific rules) |
Return Value
Returns a STRING containing the human-readable content name, normalized to lowercase with underscores replaced by spaces. Returns "Unknown" if the URL is NULL, empty, or does not contain a utm_campaign parameter.
Extraction Pipeline
The function applies a 9-step regex pipeline to extract and clean the content name from the utm_campaign parameter value:
Step 0: Extract utm_campaign
Extracts the value of the utm_campaign parameter from the URL query string.
Input: https://www.suse.com/lp/edge?utm_campaign=5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_en
Output: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_enStep 1: Strip mediaplan suffix
Removes _mp_DIGITS and optional trailing _lang suffix (Google pattern).
Before: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_en
After: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLBStep 2: Strip trailing language code
Removes trailing 2-letter language codes (_en, _de, _fr, etc.).
Step 3: Strip trailing funnel stages
Removes trailing _TOFU, _MOFU, _BOFU, _All_Tiers, or _Global_Campaign.
Before: 5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB
After: 5_0003991_Gartner_Edge_Google_Search_RG_GLBStep 4: Strip trailing geo codes
Removes trailing _RG_GLB, _RG, or _GLB.
Before: 5_0003991_Gartner_Edge_Google_Search_RG_GLB
After: 5_0003991_Gartner_Edge_Google_SearchStep 5: Strip leading campaign code prefix
Removes leading digit(s) + underscore + 4-7 digit campaign code + underscore.
Before: 5_0003991_Gartner_Edge_Google_Search
After: Gartner_Edge_Google_SearchStep 6-7: Remove platform/media tokens (2 passes)
Removes tokens like Google, Bing, Microsoft, PaidSearch, textads, Search, Display, Paid_Ad, OA, Horizontal, Vertical. Two passes handle adjacent tokens.
Before: Gartner_Edge_Google_Search
After: Gartner_EdgeStep 8: Clean up
Collapses multiple underscores to single, strips leading/trailing underscores.
Step 9: Format
Replaces underscores with spaces, lowercases, trims.
Final: gartner edgeExamples
Google Ads URLs
| utm_campaign value | Extracted content name |
|---|---|
5_0003991_Gartner_Edge_Google_Search_TOFU_RG_GLB_mp_12345_en | gartner edge |
5_0003991_Control_Trust_Your_AI_Innovation_Google_TOFU_RG_GLB_mp_23456 | control trust your ai innovation |
5_0003849_ECM_eBook_Google_Search_TOFU_RG_GLB_mp_34567_en | ecm ebook |
Microsoft Ads URLs
| utm_campaign value | Extracted content name |
|---|---|
0010372_Control_Trust_Your_AI_Bing_TOFU | control trust your ai |
0003849_ECM_PaidSearch_MOFU | ecm |
0012367_Rancher_AWS_Bing_Search | rancher aws |
Edge Cases
| Input | Output | Reason |
|---|---|---|
URL without utm_campaign | Unknown | No parameter to extract |
| NULL or empty URL | Unknown | Guard clause |
FY-format campaigns (e.g., 9_FY23_DM_ECM_...) | Noisy result | No standard campaign code prefix |
| Dual campaign codes | Partial strip | Only first code removed |
Usage in unified_performance_v2
sql
-- ContentName routing in the enrichment layer
CASE
WHEN b.Platform IN ('Google Ads', 'Microsoft Ads')
THEN `customer_ads_suse.get_content_name_from_url`(b.DestinationURL, b.Platform)
ELSE `customer_ads_suse.get_content_name`(b.AdCreativeName, b.CampaignName)
END AS ContentNameThis CASE-based routing gives ContentName coverage for all 6 platforms:
- Google Ads and Microsoft Ads: extracted from
DestinationURLvia this UDF - LinkedIn, Reddit, 6sense, StackAdapt: parsed from
AdCreativeNameviaget_content_name
Standalone Usage
sql
-- Test against Google URLs
SELECT
CreativeFinalURL,
`customer_ads_suse.get_content_name_from_url`(CreativeFinalURL, 'Google Ads') AS ContentName
FROM `paidteam-data-warehouse.customer_ads_suse.suse_google_adlevel_final_v2`
WHERE Date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
AND CreativeFinalURL IS NOT NULL
LIMIT 20;
-- Test against Microsoft URLs
SELECT
FinalUrl,
`customer_ads_suse.get_content_name_from_url`(FinalUrl, 'Microsoft Ads') AS ContentName
FROM `paidteam-data-warehouse.customer_ads_suse.suse_microsoft_adlevel_final_v2`
WHERE Date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
AND FinalUrl IS NOT NULL
LIMIT 20;Known Limitations
- FY-format Google campaigns (e.g.,
9_FY23_DM_ECM_...) produce noisy results because they lack a standard campaign code prefix - Some Microsoft campaigns use generic names like
PaidSearchwhich results in just the product code (e.g.,ecm) - Dual campaign codes (e.g.,
3_0002977_0003489_...) only strip the first code
Related Functions
get_content_name-- Extracts content names from creative naming patterns (used for LinkedIn, Reddit, 6sense, StackAdapt)get_campaign_code-- Extracts campaign codes from URLs or creative names
Version History
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-02-08 | Initial release. 9-step regex pipeline for utm_campaign extraction. |