Before Cocoon
GenAI is overwhelmed by the messy source data ๐Ÿคฏ
It gets confused and starts making stuff up ๐Ÿคช
You
๐Ÿ™‚ What is the average revenue per user for each ad campaign?
GenAI
๐Ÿ˜ต Current RAG: Reading all the source data schema from databases... phew, it's a lot!
CREATE TABLE 'advertised_product_report_data' (
    ad_group_id BIGINT,
    ad_id BIGINT,
    campaign_id BIGINT,
    date_ VARCHAR,
    _fivetran_synced VARCHAR,
    campaign_budget_amount DOUBLE,
    campaign_budget_currency_code VARCHAR,
    campaign_budget_type VARCHAR,
    clicks BIGINT,
    cost DOUBLE,
    impressions BIGINT,
    __advertised_asin DOUBLE,
    __advertised_sku DOUBLE
);

CREATE TABLE 'ad_group_history_data' (
    id BIGINT,
    last_updated_date VARCHAR,
    _fivetran_synced VARCHAR,
    campaign_id BIGINT,
    creation_date VARCHAR,
    default_bid DOUBLE,
    name VARCHAR,
    serving_status VARCHAR,
    state VARCHAR
);

CREATE TABLE 'ad_group_level_report_data' (
    ad_group_id BIGINT,
    date_ VARCHAR,
    _fivetran_synced VARCHAR,
    campaign_bidding_strategy VARCHAR,
    clicks BIGINT,
    cost DOUBLE,
    impressions BIGINT
);

CREATE TABLE 'campaign_history_data' (
    id BIGINT,
    last_updated_date VARCHAR,
    _fivetran_synced VARCHAR,
    bidding_strategy VARCHAR,
    creation_date VARCHAR,
    budget BIGINT,
    end_date DOUBLE,
    name VARCHAR,
    portfolio_id BIGINT,
    profile_id BIGINT,
    serving_status VARCHAR,
    start_date VARCHAR,
    state VARCHAR,
    targeting_type VARCHAR,
    budget_type VARCHAR,
    effective_budget DOUBLE
);

CREATE TABLE 'campaign_level_report_data' (
    campaign_id BIGINT,
    date_ VARCHAR,
    _fivetran_synced VARCHAR,
    campaign_applicable_budget_rule_id DOUBLE,
    campaign_applicable_budget_rule_name DOUBLE,
    campaign_bidding_strategy VARCHAR,
    campaign_budget_amount DOUBLE,
    campaign_budget_currency_code VARCHAR,
    campaign_budget_type VARCHAR,
    clicks BIGINT,
    cost DOUBLE,
    impressions DOUBLE,
    campaign_rule_based_budget_amount DOUBLE
);

CREATE TABLE 'keyword_history_data' (
    id BIGINT,
    last_updated_date VARCHAR,
    _fivetran_synced VARCHAR,
    ad_group_id BIGINT,
    bid DOUBLE,
    campaign_id BIGINT,
    creation_date VARCHAR,
    keyword_text VARCHAR,
    match_type VARCHAR,
    native_language_keyword DOUBLE,
    serving_status VARCHAR,
    state VARCHAR,
    native_language_locale DOUBLE
);

CREATE TABLE 'portfolio_history_data' (
    id BIGINT,
    last_updated_date VARCHAR,
    _fivetran_synced VARCHAR,
    budget_amount BIGINT,
    budget_currency_code VARCHAR,
    budget_end_date VARCHAR,
    budget_policy VARCHAR,
    budget_start_date VARCHAR,
    creation_date VARCHAR,
    in_budget BOOLEAN,
    name VARCHAR,
    profile_id BIGINT,
    serving_status VARCHAR,
    state VARCHAR
);

CREATE TABLE 'product_ad_history_data' (
    id BIGINT,
    last_updated_date VARCHAR,
    _fivetran_synced VARCHAR,
    ad_group_id BIGINT,
    asin VARCHAR,
    campaign_id BIGINT,
    creation_date VARCHAR,
    serving_status VARCHAR,
    sku DOUBLE,
    state VARCHAR
);

CREATE TABLE 'profile_data' (
    id BIGINT,
    _fivetran_deleted BOOLEAN,
    _fivetran_synced VARCHAR,
    account_id VARCHAR,
    account_marketplace_string_id VARCHAR,
    account_name VARCHAR,
    account_sub_type VARCHAR,
    account_type VARCHAR,
    account_valid_payment_method BOOLEAN,
    country_code VARCHAR,
    currency_code VARCHAR,
    daily_budget BIGINT,
    timezone VARCHAR
);

CREATE TABLE 'search_term_ad_keyword_report_data' (
    ad_group_id BIGINT,
    campaign_id BIGINT,
    date_ VARCHAR,
    keyword_id BIGINT,
    _fivetran_synced VARCHAR,
    campaign_budget_amount DOUBLE,
    campaign_budget_currency_code VARCHAR,
    campaign_budget_type VARCHAR,
    clicks BIGINT,
    cost DOUBLE,
    impressions BIGINT,
    keyword_bid DOUBLE,
    search_term VARCHAR,
    targeting VARCHAR
);

CREATE TABLE 'targeting_keyword_report_data' (
    ad_group_id BIGINT,
    campaign_id BIGINT,
    date_ VARCHAR,
    keyword_id BIGINT,
    _fivetran_synced VARCHAR,
    ad_keyword_status VARCHAR,
    campaign_budget_amount DOUBLE,
    campaign_budget_currency_code VARCHAR,
    campaign_budget_type VARCHAR,
    clicks BIGINT,
    cost DOUBLE,
    impressions BIGINT,
    keyword_bid DOUBLE,
    keyword_type VARCHAR,
    match_type VARCHAR,
    targeting VARCHAR
);
GenAI
๐Ÿคจ I've done my best with the SQL... fingers crossed it's right!
SELECT 
    c.id AS campaign_id,
    c.name AS campaign_name,
    SUM(clr.cost) AS total_cost,
    SUM(clr.clicks) AS total_clicks,
    CASE 
        WHEN SUM(clr.clicks) > 0 THEN SUM(clr.cost) / SUM(clr.clicks)
        ELSE 0 
    END AS avg_cost_per_click
FROM 
    campaign_history_data c
JOIN 
    campaign_level_report_data clr ON c.id = clr.campaign_id
GROUP BY 
    c.id, c.name
ORDER BY 
    avg_cost_per_click DESC
GenAI
โœ… SQL runs successful! Here are the samples (first 5 rows):
campaign_id campaign_name total_cost total_clicks avg_cost_per_click
0 2187 Red Leader 1.67 2.0 0.835
1 2187 Gold Leader 8.35 10.0 0.835
You
๐Ÿ˜ฅ You've mixed up cost and revenue. Also, watch out for that sneaky double counting when joining with historical data!
After Cocoon
Cocoon prepares your data, making it RAG-ready โœจ
GenAI finds just what it needs, no more wild guesses! ๐ŸŽฏ
You
๐Ÿ™‚ What is the average revenue per user for each ad campaign?
GenAI
๐Ÿ˜Ž RAG from Cocoon: Putting the question in context using Cocoon's ER story... nice!
- relation_desc: Marketers create Account Profiles and organize Portfolios within
    them.
  relation_name: ProfilePortfolioManagement
- relation_desc: Marketers assign Campaigns to Portfolios within Account Profiles.
  relation_name: CampaignPortfolioProfileHierarchy
- relation_desc: Advertisers create Ad Groups within each Campaign.
  relation_name: CampaignAdGroups
- relation_desc: Advertisers add Keywords to Ad Groups in Campaigns.
  relation_name: CampaignAdGroupKeywordHierarchy
- relation_desc: Advertisers create Product Ads within Ad Groups and Campaigns.
  relation_name: AdCampaignHierarchy
- relation_desc: System tracks daily performance of targeted Keywords.
  relation_name: TargetedKeywordDailyPerformance
- relation_desc: System records daily performance data for search terms.
  relation_name: SearchTermAdKeywordPerformance
- relation_desc: System measures daily performance of Ads, Groups, and Campaigns.
  relation_name: CampaignAdPerformanceMetrics
๐Ÿค“ We've found the related relations: CampaignPortfolioProfileHierarchy, AdCampaignHierarchy, SearchTermAdKeywordPerformance, CampaignAdPerformanceMetrics

Story behind the relationships (only for those connecting >= 2 entities)

  1. [ProfilePortfolioManagement]: Marketers create Account Profiles and organize Portfolios within them.
  2. [CampaignPortfolioProfileHierarchy]: Marketers assign Campaigns to Portfolios within Account Profiles.
  3. [CampaignAdGroups]: Advertisers create Ad Groups within each Campaign.
  4. [CampaignAdGroupKeywordHierarchy]: Advertisers add Keywords to Ad Groups in Campaigns.
  5. [AdCampaignHierarchy]: Advertisers create Product Ads within Ad Groups and Campaigns.
  6. [TargetedKeywordDailyPerformance]: System tracks daily performance of targeted Keywords.
  7. [SearchTermAdKeywordPerformance]: System records daily performance data for search terms.
  8. [CampaignAdPerformanceMetrics]: System measures daily performance of Ads, Groups, and Campaigns.
%3 CampaignAdGroups CampaignAdGroups Ad Groups Ad Groups CampaignAdGroups--Ad Groups Campaigns Campaigns CampaignAdGroups--Campaigns AdCampaignHierarchy AdCampaignHierarchy AdCampaignHierarchy--Ad Groups AdCampaignHierarchy--Campaigns Product Ads Product Ads AdCampaignHierarchy--Product Ads CampaignPortfolioProfileHierarchy CampaignPortfolioProfileHierarchy CampaignPortfolioProfileHierarchy--Campaigns Portfolios Portfolios CampaignPortfolioProfileHierarchy--Portfolios Account Profiles Account Profiles CampaignPortfolioProfileHierarchy--Account Profiles CampaignAdGroupKeywordHierarchy CampaignAdGroupKeywordHierarchy CampaignAdGroupKeywordHierarchy--Ad Groups CampaignAdGroupKeywordHierarchy--Campaigns Keywords Keywords CampaignAdGroupKeywordHierarchy--Keywords ProfilePortfolioManagement ProfilePortfolioManagement ProfilePortfolioManagement--Portfolios ProfilePortfolioManagement--Account Profiles CampaignAdPerformanceMetrics CampaignAdPerformanceMetrics CampaignAdPerformanceMetrics--Ad Groups CampaignAdPerformanceMetrics--Campaigns CampaignAdPerformanceMetrics--Product Ads SearchTermAdKeywordPerformance SearchTermAdKeywordPerformance SearchTermAdKeywordPerformance--Ad Groups SearchTermAdKeywordPerformance--Campaigns SearchTermAdKeywordPerformance--Keywords TargetedKeywordDailyPerformance TargetedKeywordDailyPerformance TargetedKeywordDailyPerformance--Ad Groups TargetedKeywordDailyPerformance--Campaigns TargetedKeywordDailyPerformance--Keywords
Reasoning: The question is asking for the average revenue per user for each ad campaign. In the story, it is asking for a performance metric (revenue) at the campaign level, averaged across users. This implies we need information about campaign performance and user interaction with the campaigns. We'll need to look at relations that track campaign performance metrics and potentially user interactions or conversions.
GenAI
๐Ÿ˜Ž RAG from Cocoon: Checking out all the related tables Cocoon set up... cool stuff!
- table_name: stg_profile_data
  table_desc: The table is about account profiles. It contains details such as account
    ID, marketplace ID, name, type, and subtype. Financial information includes daily
    budget and currency. Location data covers country code and timezone. The table
    also tracks if the account has a valid payment method and if it has been deleted.
    Each profile has a unique identifier.
  attributes: '[''profile_id'', ''is_deleted'', ''account_id'', ''marketplace_id'',
    ''account_name'', ''account_subtype'', ''account_type'', ''has_valid_payment'',
    ''country_code'', ''currency_code'', ''timezone'', ''daily_budget'']'
- table_name: stg_campaign_level_report_data
  table_desc: The table is about advertising campaign performance. It contains daily
    data for each campaign, including campaign ID, date, budget details, bidding strategy,
    and performance metrics. Key fields are campaign_id, date_, campaign_budget_amount,
    campaign_budget_type, clicks, cost, and impressions. The data appears to be for
    a single campaign (ID 2187) over multiple dates, with consistent budget settings
    but no recorded performance (zero clicks, cost, and impressions).
  attributes: '[''campaign_id'', ''bidding_strategy'', ''daily_budget_amount'', ''budget_currency'',
    ''budget_type'', ''clicks'', ''cost'', ''budget_rule_id'', ''budget_rule_name'',
    ''date_'', ''impressions'']'
- table_name: stg_search_term_ad_keyword_report_data
  table_desc: The table is about search term ad keyword report data. It contains information
    on ad campaigns, including ad group ID, campaign ID, date, keyword ID, budget
    details, and performance metrics. The data includes clicks, cost, impressions,
    keyword bid, search term, and targeting. Each row represents a daily report for
    a specific keyword in an ad campaign.
  attributes: '[''ad_group_id'', ''campaign_id'', ''keyword_id'', ''daily_budget_amount'',
    ''budget_currency'', ''budget_type'', ''clicks'', ''cost'', ''impressions'', ''keyword_bid'',
    ''search_term'', ''targeting_criteria'', ''report_date'']'
- table_name: stg_ad_group_level_report_data
  table_desc: The table is about ad group level performance data. It includes details
    such as ad group ID, date, campaign bidding strategy, clicks, cost, and impressions.
    Each row represents daily performance metrics for a specific ad group. The data
    spans multiple dates and shows various performance indicators for advertising
    campaigns.
  attributes: '[''ad_group_id'', ''bidding_strategy'', ''clicks'', ''cost'', ''impressions'',
    ''date_'']'
- table_name: stg_advertised_product_report_data
  table_desc: The table is about advertised product report data. It contains details
    of ad campaigns including ad group ID, ad ID, campaign ID, date, budget amount,
    budget currency, budget type, clicks, cost, impressions, and advertised ASIN/SKU.
    The data shows daily performance metrics for different ads across various campaigns,
    tracking their budget, spend, and engagement.
  attributes: '[''ad_group_id'', ''ad_id'', ''campaign_id'', ''daily_budget_amount'',
    ''budget_currency'', ''budget_type'', ''clicks'', ''ad_spend'', ''impressions'',
    ''report_date'']'
- table_name: snapshot_ad_group_history_data
  table_desc: The table is about current ad group data. It tracks the most recent
    version of each ad group. The table includes details such as the ad group's ID,
    campaign ID, name, default bid, serving status, and state. Each row represents
    the latest state of an ad group. The creation_date shows when the ad group was
    initially created. This table provides a current snapshot of ad groups without
    historical changes.
  attributes: '[''ad_group_id'', ''campaign_id'', ''default_bid'', ''ad_group_name'',
    ''serving_status'', ''ad_group_state'', ''creation_date'']'
  history_table:
    attributes: '[''ad_group_id'', ''campaign_id'', ''default_bid'', ''ad_group_name'',
      ''serving_status'', ''ad_group_state'', ''creation_date'', ''last_updated_date'']'
    table_name: stg_ad_group_history_data
- table_name: snapshot_product_ad_history_data
  table_desc: The table is about current product ad status. It tracks the most recent
    version of each ad's information. Each row represents the latest state of a unique
    ad, identified by its ad_id. The table includes details like ad group ID, ASIN,
    campaign ID, current serving status, and state. It provides a snapshot of the
    current ad configurations without historical changes.
  attributes: '[''ad_id'', ''ad_group_id'', ''product_asin'', ''campaign_id'', ''ad_serving_status'',
    ''ad_state'', ''creation_timestamp'', ''product_sku'']'
  history_table:
    attributes: '[''ad_id'', ''ad_group_id'', ''product_asin'', ''campaign_id'', ''ad_serving_status'',
      ''ad_state'', ''creation_timestamp'', ''last_updated_timestamp'', ''product_sku'']'
    table_name: stg_product_ad_history_data
- table_name: snapshot_campaign_history_data
  table_desc: The table is about current campaign data. It tracks the most recent
    version of each campaign, including its bidding strategy, name, portfolio, profile,
    status, state, targeting type, budget details, and relevant dates. Each row represents
    the latest snapshot of a unique campaign, providing an up-to-date view of campaign
    configurations without historical versions.
  attributes: '[''campaign_id'', ''bidding_strategy'', ''campaign_name'', ''portfolio_id'',
    ''profile_id'', ''serving_status'', ''campaign_state'', ''targeting_type'', ''budget_type'',
    ''creation_datetime'', ''daily_budget'', ''end_date'', ''start_date'']'
  history_table:
    attributes: '[''campaign_id'', ''bidding_strategy'', ''campaign_name'', ''portfolio_id'',
      ''profile_id'', ''serving_status'', ''campaign_state'', ''targeting_type'',
      ''budget_type'', ''creation_datetime'', ''daily_budget'', ''end_date'', ''last_updated_datetime'',
      ''start_date'']'
    table_name: stg_campaign_history_data
- table_name: snapshot_keyword_history_data
  table_desc: The table is about current keyword data for advertising campaigns. It
    tracks the most recent version of each keyword, including its ID, ad group ID,
    campaign ID, bid amount, text, match type, and serving status. Each row represents
    the latest state of a unique keyword. The table provides up-to-date information
    on keyword settings and performance without historical data.
  attributes: '[''keyword_id'', ''ad_group_id'', ''keyword_bid'', ''campaign_id'',
    ''keyword_text'', ''match_type'', ''serving_status'', ''keyword_state'', ''creation_timestamp'',
    ''native_language_keyword'', ''native_language_locale'']'
  history_table:
    attributes: '[''keyword_id'', ''ad_group_id'', ''keyword_bid'', ''campaign_id'',
      ''keyword_text'', ''match_type'', ''serving_status'', ''keyword_state'', ''creation_timestamp'',
      ''last_updated_timestamp'', ''native_language_keyword'', ''native_language_locale'']'
    table_name: stg_keyword_history_data
- table_name: snapshot_portfolio_history_data
  table_desc: The table is about current portfolio data. It tracks the most recent
    version of each portfolio, including budget amounts, currencies, policies, and
    statuses. Key details include portfolio name, profile ID, serving status, state,
    and date ranges. The table provides a snapshot of active portfolios, their current
    budget allocations, and operational states without historical versions.
  attributes: '[''portfolio_id'', ''budget_amount'', ''currency'', ''budget_policy'',
    ''is_within_budget'', ''portfolio_name'', ''profile_id'', ''serving_status'',
    ''portfolio_state'', ''creation_timestamp'', ''end_date'', ''start_date'']'
  history_table:
    attributes: '[''portfolio_id'', ''budget_amount'', ''currency'', ''budget_policy'',
      ''is_within_budget'', ''portfolio_name'', ''profile_id'', ''serving_status'',
      ''portfolio_state'', ''creation_timestamp'', ''end_date'', ''last_updated_timestamp'',
      ''start_date'']'
    table_name: stg_portfolio_history_data
๐Ÿค” We couldn't find any related tables.
๐Ÿง  Explanation: The available data is not sufficient to directly answer the question "What is the average revenue per user for each ad campaign?". The main issue is that we don't have direct revenue data or user-level data in the provided schema. We have cost and impression data, but not revenue or user data. However, we can provide a close approximation if we make some assumptions: 1. We can use 'cost' as a proxy for revenue, assuming that advertisers are making a profit on their ad spend. 2. We can use 'clicks' as a proxy for users, assuming each click represents a unique user. With these assumptions, we could calculate "average cost per click" for each campaign, which would be an approximation of "average revenue per user". To do this, we would need to: 1. Use the 'stg_campaign_level_report_data' table, which contains campaign_id, cost, and clicks. 2. Group by campaign_id. 3. Sum the cost and clicks for each campaign. 4. Divide total cost by total clicks for each campaign. This approach doesn't give us the exact answer to the original question, but it provides a related metric that could be useful for campaign performance analysis.
You
๐Ÿซก Got it! Thanks a million!
โŸณ New chat results may be available. Refresh to check.