shopify_abandoned_checkout_data (first 100 rows)
id | _fivetran_synced | abandoned_checkout_url | applied_discount_amount | applied_discount_applicable | applied_discount_description | applied_discount_non_applicable_reason | applied_discount_title | applied_discount_value | applied_discount_value_type | billing_address_address_1 | billing_address_address_0 | billing_address_city | billing_address_company | billing_address_country | billing_address_country_code | billing_address_first_name | billing_address_last_name | billing_address_latitude | billing_address_longitude | billing_address_name | billing_address_phone | billing_address_province | billing_address_province_code | billing_address_zip | buyer_accepts_marketing | cart_token | closed_at | completed_at | created_at | credit_card_first_name | credit_card_last_name | credit_card_month | credit_card_number | credit_card_verification_value | credit_card_year | currency | customer_id | customer_locale | device_id | gateway | landing_site_base_url | location_id | name | note | phone | referring_site | shipping_address_address_1 | shipping_address_address_0 | shipping_address_city | shipping_address_company | shipping_address_country | shipping_address_country_code | shipping_address_first_name | shipping_address_last_name | shipping_address_latitude | shipping_address_longitude | shipping_address_name | shipping_address_phone | shipping_address_province | shipping_address_province_code | shipping_address_zip | shipping_line | shipping_rate_id | shipping_rate_price | shipping_rate_title | source | source_identifier | source_name | source_url | subtotal_price | taxes_included | token | total_discounts | total_line_items_price | total_price | total_tax | total_weight | updated_at | user_id | note_attribute_littledata_updated_at | note_attribute_segment_client_id | billing_address_id | billing_address_is_default | presentment_currency | shipping_address_id | shipping_address_is_default | total_duties | note_attribute_email_client_id | note_attributes | note_attribute_google_client_id | _fivetran_deleted | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 12111 | 2020-06-03 11:11:51.015110 | https://kitties.com/1111311610/checkouts/f050eda125a10cca513162f01101b261/recover?key=bd0fdf1dc1a1af01aecbdaa3101ec063 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | None | None | None | NaN | None | None | None | None | NaN | NaN | None | NaN | None | None | NaN | False | aaaa211622dfb133 | NaN | NaN | 2020-11-12 10:06:50.111111 | NaN | NaN | NaN | NaN | NaN | NaN | USD | 121 | en | NaN | tnyrnbs@hh.com | paypal | /collections/the-archive-sale | NaN | #10160311 | NaN | NaN | None | 123 main st | Apt 02 | Washington | NaN | United States | US | Pauly | D | 31.111511 | -26.112602 | DJ PAULY D | (115) 061-1012 | District of Columbia | DC | 12305 | NaN | NaN | NaN | NaN | NaN | NaN | web | NaN | 56.00 | False | f050eda12f111b261 | 1.00 | 560.0 | 501.36 | 13.36 | 1 | 2020-11-12 10:51:10.111111 | NaN | NaN | None | NaN | NaN | None | NaN | NaN | NaN | NaN | None | NaN | NaN |
1 | 11111 | 2020-01-11 06:01:35.021111 | https://kitties.com/1111311610/checkouts/6661ff02165dfd11b12db112f0111226/recover?key=51611efdff11e0caccc0fd30b0e1e202 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | village | Apt 0 | daytona Beach | NaN | Florida | US | ohio | Calles | 1.126113 | -21.502661 | hi | 5.026611e+10 | Healdsburg | PA-11 | NaN | False | 611faa630ce5e6bcc0bacc2a105c0126 | NaN | NaN | 2020-05-11 01:01:30.111111 | NaN | NaN | NaN | NaN | NaN | NaN | USD | 366525 | en | NaN | hyrehher@gmail.com | None | /collections/sale | NaN | #13311 | NaN | NaN | https://www.google.com/ | 123 main st | Pty 3 | ghreiuhtg | NaN | United States | US | ohio | Calle pty115 | NaN | NaN | ohio Calle pty115 | +12161115152 | Florida | FL | 33120 | NaN | NaN | NaN | NaN | NaN | NaN | web | NaN | 10.35 | False | a165dfd11226 | 16.65 | 111.0 | 10.35 | 1.00 | 1 | 2020-05-11 01:06:35.111111 | NaN | NaN | None | NaN | NaN | None | NaN | NaN | NaN | NaN | None | NaN | NaN |
2 | 66531 | 2021-11-11 11:02:30.112110 | https://kitties.com/1111311610/checkouts/0abddd111c0211f1e616ec0d0c32021c/recover?key=abed6505d26f1a60a50aa0c02e01be31 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | None | None | None | NaN | None | None | None | None | NaN | NaN | None | NaN | None | None | NaN | False | aaaaa61e1d11af3adfac1f0 | NaN | NaN | 2021-11-11 02:05:13.111111 | NaN | NaN | NaN | NaN | NaN | NaN | USD | 160363 | en | NaN | hernebbe@hr.com | None | /collections/new | NaN | #166531 | NaN | NaN | https://l.facebook.com/ | 11-01 01st St | apt 0C | Springfield | NaN | United States | US | dan | the man | NaN | NaN | dan the man | +13021115311 | New York | NY | 11111-020 | NaN | NaN | NaN | NaN | NaN | NaN | web | NaN | 191.00 | False | l1abddd111c0211f2021c | 1.00 | 111.0 | 111.00 | 1.00 | 1 | 2021-11-11 02:05:55.111111 | NaN | 125150.0 | a111c-30fc-0bb6-a25e-06f201c6035c | NaN | NaN | USD | NaN | NaN | NaN | NaN | [{"name":"segment-clientID","value":"610a111c-30fc-0bb6-a25e-06f201c6035c"},{"name":"_updatedAt","value":"1613121625150"}] | NaN | NaN |
shopify_abandoned_checkout_discount_code_data (first 100 rows)
checkout_id | index_ | _fivetran_synced | amount | discount_id | code | created_at | type | updated_at | usage_count | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 901163 | 0 | 2022-12-07 06:49:37.929000 | 0.0 | NaN | CYBER12 | NaN | percentage | NaN | NaN |
1 | 4334827 | 0 | 2022-12-07 06:49:37.926000 | 0.0 | NaN | CYBER12 | NaN | percentage | NaN | NaN |
2 | 4566403 | 0 | 2022-12-07 06:49:33.182000 | 0.0 | NaN | BONUS | NaN | percentage | NaN | NaN |
shopify_abandoned_checkout_shipping_line_data (first 100 rows)
checkout_id | index_ | _fivetran_synced | api_client_id | carrier_identifier | carrier_service_id | code | delivery_category | discounted_price | id | markup | phone | price | requested_fulfillment_service_id | source | title | validation_context | delivery_expectation_range | delivery_expectation_type | original_shop_markup | original_shop_price | presentment_title | delivery_expectation_range_min | delivery_expectation_range_max | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 653675 | 1 | 2023-01-09 06:48:18.093000 | NaN | NaN | NaN | Standard | NaN | NaN | c3ce0972c2e30eaf7001bea | 0.0 | NaN | 0.0 | NaN | shopify | Standard | NaN | NaN | NaN | 0.0 | 0.0 | Standard | NaN | NaN |
1 | 379 | 1 | 2023-01-09 06:48:23.540000 | NaN | NaN | NaN | Standard | NaN | NaN | bf7c90953344902c13 | 0.0 | NaN | 0.0 | NaN | shopify | Standard | NaN | NaN | NaN | 0.0 | 0.0 | Standard | NaN | NaN |
2 | 635 | 1 | 2023-01-09 06:48:24.243000 | NaN | NaN | NaN | Standard | NaN | NaN | 519ff4275cd972e282db | 0.0 | NaN | 0.0 | NaN | shopify | Standard | NaN | NaN | NaN | 0.0 | 0.0 | Standard | NaN | NaN |
3 | 3211 | 1 | 2023-01-09 06:48:18.068000 | NaN | NaN | NaN | Standard | NaN | NaN | 8d18671d481ad46a | 0.0 | NaN | 0.0 | NaN | shopify | Standard | NaN | NaN | NaN | 0.0 | 0.0 | Standard | NaN | NaN |
4 | 381227 | 1 | 2023-01-09 06:48:16.985000 | NaN | NaN | NaN | Standard | NaN | NaN | 8f2fab1b455ec9e597 | 0.0 | NaN | 0.0 | NaN | shopify | Standard | NaN | NaN | NaN | 0.0 | 0.0 | Standard | NaN | NaN |
shopify_collection_data (first 100 rows)
id | _fivetran_deleted | _fivetran_synced | handle | published_at | published_scope | title | updated_at | disjunctive | rules | sort_order | template_suffix | body_html | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 997355 | True | 2021-09-01 05:53:25.838000 | NaN | NaN | NaN | NaN | 1970-01-01 00:00:00.000000 | NaN | NaN | NaN | NaN | NaN |
1 | 9930779 | True | 2021-09-01 05:53:26.673000 | NaN | NaN | NaN | NaN | 1970-01-01 00:00:00.000000 | NaN | NaN | NaN | NaN | NaN |
2 | 99967 | True | 2022-04-08 06:52:19.524000 | NaN | NaN | NaN | NaN | 1970-01-01 00:00:00.000000 | NaN | NaN | NaN | NaN | NaN |
shopify_collection_product_data (first 100 rows)
collection_id | product_id | _fivetran_synced | |
---|---|---|---|
0 | 37124 | 789131 | 2022-11-18 21:32:43.188000 |
1 | 9037124 | 74353899 | 2022-11-18 21:32:43.188000 |
2 | 37124 | 8891 | 2022-11-18 21:32:43.188000 |
shopify_customer_data (first 100 rows)
id | first_name | last_name | phone | state | orders_count | total_spent | created_at | updated_at | accepts_marketing | tax_exempt | verified_email | default_address_id | _fivetran_synced | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3588998496353 | 29e00d3659d1c5e75f99e892f0c1a1f1 | 3f0e6a46fb84eb1e6f5f00d86aa53b1b | ab0bf25ab8b2a6b78af26a141dd6f455 | NaN | disabled | 0 | 0.00 | 2020-09-11 13:26:15.000 | 2020-09-11 13:26:15.000 | False | False | True | 3951726461025 | 2020-09-12 00:14:04.512 |
1 | 3589760876641 | f0962b7a185488ecb752cedac1038349 | aa35cb67c26e64bb81a1bf3f17e858ba | 021cb20b5c78751fc7ddc091b6b69b3e | NaN | invited | 1 | 2.80 | 2020-09-11 19:35:42.000 | 2020-09-11 19:41:04.000 | True | False | True | 3952669655137 | 2020-09-12 00:14:04.506 |
2 | 3584045351009 | d3bae70c9d49bb7cb5a74cdd0eae7fc4 | 0dd89cff60965dff8f9ea2bc952a5474 | dce90c7b4e52e045e5975836aff49cf1 | NaN | disabled | 2 | 9.18 | 2020-09-09 22:57:44.000 | 2020-09-09 23:01:55.000 | False | False | True | 3946055729249 | 2020-09-10 00:13:59.106 |
shopify_customer_tag_data (first 100 rows)
customer_id | index_ | _fivetran_synced | value_ | |
---|---|---|---|---|
0 | 9919268 | 1 | 2022-12-03 06:49:03.314000 | GGPP |
1 | 4404 | 1 | 2022-12-03 06:48:53.295000 | GGPP |
2 | 5509188 | 1 | 2022-12-03 06:48:55.067000 | GGPP |
shopify_discount_code_data (first 100 rows)
id | _fivetran_synced | code | created_at | price_rule_id | updated_at | usage_count | |
---|---|---|---|---|---|---|---|
0 | 4773499 | 2021-12-10 07:04:44.670000 | CHECKVB34DDBQ3VH | 2021-12-10 06:48:35.000000 | 32543 | 2021-12-10 06:48:35.000000 | 0.0 |
1 | 436267 | 2021-12-10 07:04:44.670000 | CHECKVBLJG22DDD | 2021-12-10 06:48:35.000000 | 12543 | 2021-12-10 06:48:35.000000 | 0.0 |
2 | 469035 | 2021-12-10 07:04:44.670000 | CHECKV44CCCBCWB7 | 2021-12-10 06:48:35.000000 | 12543 | 2021-12-10 06:48:35.000000 | 0.0 |
shopify_fulfillment_data (first 100 rows)
id | _fivetran_synced | created_at | location_id | order_id | status | tracking_company | tracking_number | updated_at | tracking_numbers | tracking_urls | shipment_status | service | name | receipt_authorization | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 423844 | 2022-11-22 08:06:32.902000 | 2019-07-13 01:17:22.000000 | 123548 | 1228100 | success | NaN | NaN | 2019-07-13 01:17:22.000000 | [] | [] | NaN | manual | #151212.1 | NaN |
1 | 8308 | 2022-11-22 08:06:33.863000 | 2019-07-13 01:17:21.000000 | 548 | 1274564 | success | NaN | NaN | 2019-07-13 01:17:22.000000 | [] | [] | NaN | manual | #152317.1 | NaN |
2 | 548932 | 2022-11-22 08:06:56.262000 | 2019-07-13 01:17:21.000000 | 12348 | 1284 | success | NaN | NaN | 2019-07-13 01:17:21.000000 | [] | [] | NaN | manual | #1555923.1 | NaN |
shopify_fulfillment_event_data (first 100 rows)
id | _fivetran_synced | address_1 | city | country | created_at | estimated_delivery_at | fulfillment_id | happened_at | latitude | longitude | message | order_id | province | shop_id | status | updated_at | zip | _fivetran_deleted | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 451435 | 2022-11-18 04:39:07.945000 | NaN | None | None | 2022-08-29 20:52:39.000000 | None | 40495 | 2022-08-29 20:52:39.000000 | NaN | NaN | None | 4502987 | None | 89440612 | delivered | 2022-08-29 20:52:39.000000 | None | False |
1 | 48779 | 2022-11-18 05:48:01.773000 | NaN | LONDON | GB | 2022-09-13 08:07:57.000000 | None | 4064737 | 2022-08-15 12:41:00.000000 | 101.349998 | -14.033300 | Delay | 4588203 | None | 320612 | out_for_delivery | 2022-09-13 08:07:57.000000 | CR0 | False |
2 | 1481515 | 2022-11-18 05:41:00.745000 | NaN | ECHO PARK | AU | 2022-09-14 14:16:52.000000 | 2022-09-14 08:00:00.000000 | 4019339 | 2022-09-14 01:26:00.000000 | -3.797699 | 190.783958 | Delay | 451915 | None | 89320612 | delayed | 2022-09-14 14:16:52.000000 | 2759 | False |
3 | 558955 | 2022-11-18 10:51:24.286000 | NaN | LAZYTOWN | US | 2022-08-13 12:40:26.000000 | None | 402947 | 2022-03-01 10:36:39.000000 | 22.337700 | -71.731003 | Delay | 429188587 | MA | 89420612 | in_transit | 2022-08-13 12:40:26.000000 | 01505 | False |
4 | 6904235 | 2022-11-18 08:58:00.458000 | NaN | LA | US | 2022-08-24 06:29:21.000000 | 2022-08-24 23:59:59.000000 | 4060491 | 2022-08-24 05:30:57.000000 | 12.287498 | -21.357399 | Delay | 4242667 | MA | 89420612 | in_transit | 2022-08-24 06:29:21.000000 | 01760 | False |
shopify_inventory_item_data (first 100 rows)
id | _fivetran_synced | cost | created_at | requires_shipping | sku | tracked | updated_at | country_code_of_origin | province_code_of_origin | _fivetran_deleted | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 4555 | 2021-12-18 06:56:22.877000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True |
1 | 501419 | 2022-02-25 06:52:29.767000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True |
2 | 851179 | 2022-02-24 06:52:33.361000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True |
shopify_inventory_level_data (first 100 rows)
inventory_item_id | location_id | _fivetran_synced | available | updated_at | |
---|---|---|---|---|---|
0 | 780939 | 287748 | 2021-11-13 08:02:21.760000 | NaN | NaN |
1 | 6027 | 287748 | 2021-11-13 08:02:21.760000 | NaN | NaN |
2 | 515 | 28748 | 2021-11-06 08:04:16.213000 | NaN | NaN |
shopify_location_data (first 100 rows)
id | _fivetran_synced | active | address_1 | address_2 | city | country | created_at | legacy | name | phone | province | updated_at | zip | country_code | country_name | localized_country_name | localized_province_name | province_code | _fivetran_deleted | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 8777748 | 2022-12-07 06:43:31.005000 | True | None | NaN | None | US | 2019-06-11 15:58:20.000000 | True | Plum | NaN | None | 2019-06-11 15:58:20.000000 | NaN | US | United States | United States | None | None | False |
1 | 7748 | 2022-12-07 06:43:31.005000 | True | 111 Tree Road | NaN | Tree | US | 2018-12-10 16:24:07.000000 | False | Plum Express | NaN | NY | 2019-05-16 13:37:39.000000 | 7394.0 | US | United States | United States | New Yorl | NY | False |
shopify_metafield_data (first 100 rows)
id | _fivetran_synced | created_at | description | key_ | namespace | owner_id | owner_resource | updated_at | value_ | value_type | type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 5445055 | 2022-11-19 10:06:09.531000 | 2019-10-28 20:06:39.000000 | NaN | returnAuthorizations | blade_runner | 390244 | order | 2019-10-28 20:06:39.000000 | [{"id":"ce95-49e4-9daf-41f29bbbb799","totalValue":44444,"status":"RECEIVED","payload":{"totalReturnValue":4444,"validReturnItems":[{"UPC":"19073825552","Quantity":"1","Reason":"changed-mind","LineItem":"40055558892132"}]},"createdAt":"2019-10-28T20:06:39.569Z","modifiedAt":"2019-10-28T20:06:39.569Z"}] | NaN | json_string |
1 | 6337647 | 2022-11-21 01:57:33.851000 | 2020-06-17 11:35:28.000000 | NaN | returnAuthorizations | blade_runner | 254671 | order | 2020-06-17 11:35:28.000000 | [{"id":"557ece73-658b-cf694dcd3f7e","totalValue":4444,"status":"RECEIVED","payload":{"totalReturnValue":444.77,"validReturnItems":[{"UPC":"19055550468","Quantity":"1","Reason":"fit-issues","LineItem":"4935555579471"}]},"createdAt":"2020-06-17T11:35:28.469Z","modifiedAt":"2020-06-17T11:35:28.470Z"}] | NaN | json_string |
2 | 576111 | 2022-11-21 03:19:59.064000 | 2020-06-10 18:35:44.000000 | NaN | returnAuthorizations | blade_runner | 22527 | order | 2020-06-10 18:35:44.000000 | [{"id":"e461c20a-9dc7-d38de1c9012a","totalValue":4444,"status":"RECEIVED","payload":{"totalReturnValue":444,"validReturnItems":[{"UPC":"190735551121","Quantity":"1","Reason":"too-big","LineItem":"4925555231"}]},"createdAt":"2020-06-10T18:35:44.043Z","modifiedAt":"2020-06-10T18:35:44.043Z"}] | NaN | json_string |
3 | 55241839 | 2022-11-21 01:29:09.347000 | 2020-07-15 21:24:16.000000 | NaN | returnAuthorizations | blade_runner | 2335775 | order | 2020-07-15 21:24:16.000000 | [{"id":"0c79163e-f55b56f50aff","totalValue":44478.000000000004,"status":"RECEIVED","payload":{"totalReturnValue":4444.78000000000003,"validReturnItems":[{"UPC":"190555325","Quantity":"1","Reason":"fit-issues","LineItem":"5555599407"}]},"createdAt":"2020-07-15T21:24:16.210Z","modifiedAt":"2020-07-15T21:24:16.210Z"}] | NaN | json_string |
4 | 4575 | 2022-11-21 03:07:20.669000 | 2020-06-24 17:23:12.000000 | NaN | returnAuthorizations | blade_runner | 220655 | order | 2020-06-24 17:23:12.000000 | [{"id":"3679-4811-94fd-555bf9846753","totalValue":44581,"status":"BACKEND_GENERATED","payload":{"totalReturnValue":4444.81,"validReturnItems":[{"UPC":"190735558","Quantity":1,"Reason":"Changed My Mind","LineItem":"455555711"}]},"createdAt":"2020-06-24T17:23:12.272Z","modifiedAt":"2020-06-24T17:23:12.272Z"}] | NaN | json_string |
shopify_order_adjustment_data (first 100 rows)
id | order_id | refund_id | amount | tax_amount | kind | reason | amount_set | tax_amount_set | _fivetran_synced | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 109271056455 | 2712175083591 | 675617407047 | -465 | 0.0 | shipping_refund | Shipping refund | NaN | NaN | 2020-11-14 07:52:56.522 |
1 | 109277085767 | 2773486501959 | 675634708551 | -95 | 0.0 | shipping_refund | Shipping refund | NaN | NaN | 2020-11-14 07:54:41.682 |
2 | 109245956167 | 2771757826119 | 675548168263 | -27 | -1.6 | shipping_refund | Shipping refund | NaN | NaN | 2020-11-14 07:44:24.602 |
3 | 109248118855 | 2771329908807 | 675555016775 | -35 | 0.0 | shipping_refund | Shipping refund | NaN | NaN | 2020-11-14 07:45:11.536 |
4 | 109275742279 | 2773429682247 | 675632644167 | -515 | 0.0 | refund_discrepancy | Refund discrepancy | NaN | NaN | 2020-11-14 07:54:31.054 |
shopify_order_data (first 100 rows)
id | note | taxes_included | currency | subtotal_price | total_tax | total_price | created_at | updated_at | name | shipping_address_name | shipping_address_first_name | shipping_address_last_name | shipping_address_company | shipping_address_phone | shipping_address_address_1 | shipping_address_address_2 | shipping_address_city | shipping_address_country | shipping_address_country_code | shipping_address_province | shipping_address_province_code | shipping_address_zip | shipping_address_latitude | shipping_address_longitude | billing_address_name | billing_address_first_name | billing_address_last_name | billing_address_company | billing_address_phone | billing_address_address_1 | billing_address_address_2 | billing_address_city | billing_address_country | billing_address_country_code | billing_address_province | billing_address_province_code | billing_address_zip | billing_address_latitude | billing_address_longitude | customer_id | location_id | user_id | number | order_number | financial_status | fulfillment_status | processed_at | processing_method | referring_site | cancel_reason | cancelled_at | closed_at | total_discounts | total_line_items_price | total_weight | source_name | browser_ip | buyer_accepts_marketing | token | cart_token | checkout_token | test | landing_site_base_url | _fivetran_synced | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2674098602081 | 71509c29301d2cc14e37ecb53f735608 | 021cb20b5c78751fc7ddc091b6b69b3e | True | GBP | 2.8 | 0 | 2.80 | 2020-09-11 19:35:42.000 | 2020-09-11 19:35:46.000 | d1743fc58a1e4d78769eaac49994a994 | 8b121314a4d97bc9dc15bfba8518ec88 | f0962b7a185488ecb752cedac1038349 | aa35cb67c26e64bb81a1bf3f17e858ba | d41d8cd98f00b204e9800998ecf8427e | d41d8cd98f00b204e9800998ecf8427e | d6f4a399883df85d9d4b3a02bf6e738a | bc9b8576178dcd886639ba718f1d45c8 | ac08c606d455cde42980f980524a8038 | 89f9c9f489be2a83cf57e53b9197d288 | 79cba1185463850dedba31f172f1dc5b | d41d8cd98f00b204e9800998ecf8427e | NaN | 00079ce435afddc28205639142773870 | d97319f64674c02595f2989019970fc8 | c08dae474c5d4d3326fd6764d2a0ebe6 | 8b121314a4d97bc9dc15bfba8518ec88 | f0962b7a185488ecb752cedac1038349 | aa35cb67c26e64bb81a1bf3f17e858ba | d41d8cd98f00b204e9800998ecf8427e | d41d8cd98f00b204e9800998ecf8427e | d6f4a399883df85d9d4b3a02bf6e738a | bc9b8576178dcd886639ba718f1d45c8 | ac08c606d455cde42980f980524a8038 | 89f9c9f489be2a83cf57e53b9197d288 | 79cba1185463850dedba31f172f1dc5b | d41d8cd98f00b204e9800998ecf8427e | NaN | 00079ce435afddc28205639142773870 | d97319f64674c02595f2989019970fc8 | c08dae474c5d4d3326fd6764d2a0ebe6 | 3589760876641 | NaN | NaN | 4135 | 5135 | paid | None | 2020-09-11 19:35:42.000 | None | None | NaN | NaN | None | 2.8 | 5.6 | 0 | 294517 | None | True | 0f9c2880de17f71511eee5542c29b999 | None | None | False | None | 2020-09-12 00:15:10.199 |
1 | 2669516488801 | None | dce90c7b4e52e045e5975836aff49cf1 | True | GBP | 2.8 | 0 | 3.79 | 2020-09-09 23:01:54.000 | 2020-09-10 15:38:26.000 | 4fcb884b5b46413bae526a6e7e49d706 | c8189c7add9755e66391b58ecc12b3e2 | d3bae70c9d49bb7cb5a74cdd0eae7fc4 | 0dd89cff60965dff8f9ea2bc952a5474 | d41d8cd98f00b204e9800998ecf8427e | d41d8cd98f00b204e9800998ecf8427e | 1ff1de774005f8da13f42943881c655f | 70111f8840ccbd8b1007cc3f387ced6b | 1ac412baeba98370017c73df41c98a07 | 89f9c9f489be2a83cf57e53b9197d288 | 79cba1185463850dedba31f172f1dc5b | None | NaN | 2357e65b582faa0a2da3603b16fa4a7f | 75c29d6dd29594a652fcbd7c4c279a29 | 75468fbebc28e02ec5d4f54f4cbd4099 | c8189c7add9755e66391b58ecc12b3e2 | d3bae70c9d49bb7cb5a74cdd0eae7fc4 | 0dd89cff60965dff8f9ea2bc952a5474 | d41d8cd98f00b204e9800998ecf8427e | d41d8cd98f00b204e9800998ecf8427e | 1ff1de774005f8da13f42943881c655f | 70111f8840ccbd8b1007cc3f387ced6b | 1ac412baeba98370017c73df41c98a07 | 89f9c9f489be2a83cf57e53b9197d288 | 79cba1185463850dedba31f172f1dc5b | None | NaN | 2357e65b582faa0a2da3603b16fa4a7f | 75c29d6dd29594a652fcbd7c4c279a29 | 75468fbebc28e02ec5d4f54f4cbd4099 | 3584045351009 | NaN | NaN | 4066 | 5066 | paid | fulfilled | 2020-09-09 23:01:53.000 | direct | 2cc983716a820bc713b793a6e8e73f42 | NaN | NaN | 2020-09-10 15:38:26.000 | 0.0 | 2.8 | 0 | web | 109.249.185.68 | False | fb489b3ccc0ae36ce47744d7595e9746 | b1ff04883dfeab658cd5211050476729 | 7bdb994e1196de3e4f34586e357613f9 | False | 8584e97b29b0802fb393fa453a8b6a7a | 2020-09-11 00:14:33.536 |
2 | 2669509541985 | None | dce90c7b4e52e045e5975836aff49cf1 | True | GBP | 4.4 | 0 | 5.39 | 2020-09-09 22:57:51.000 | 2020-09-10 15:38:25.000 | 9e346f2e912c60e16679f4a4c8d29422 | c8189c7add9755e66391b58ecc12b3e2 | d3bae70c9d49bb7cb5a74cdd0eae7fc4 | 0dd89cff60965dff8f9ea2bc952a5474 | d41d8cd98f00b204e9800998ecf8427e | d41d8cd98f00b204e9800998ecf8427e | 1ff1de774005f8da13f42943881c655f | 70111f8840ccbd8b1007cc3f387ced6b | 1ac412baeba98370017c73df41c98a07 | 89f9c9f489be2a83cf57e53b9197d288 | 79cba1185463850dedba31f172f1dc5b | None | NaN | 2357e65b582faa0a2da3603b16fa4a7f | 75c29d6dd29594a652fcbd7c4c279a29 | 75468fbebc28e02ec5d4f54f4cbd4099 | c8189c7add9755e66391b58ecc12b3e2 | d3bae70c9d49bb7cb5a74cdd0eae7fc4 | 0dd89cff60965dff8f9ea2bc952a5474 | d41d8cd98f00b204e9800998ecf8427e | d41d8cd98f00b204e9800998ecf8427e | 1ff1de774005f8da13f42943881c655f | 70111f8840ccbd8b1007cc3f387ced6b | 1ac412baeba98370017c73df41c98a07 | 89f9c9f489be2a83cf57e53b9197d288 | 79cba1185463850dedba31f172f1dc5b | None | NaN | 2357e65b582faa0a2da3603b16fa4a7f | 75c29d6dd29594a652fcbd7c4c279a29 | 75468fbebc28e02ec5d4f54f4cbd4099 | 3584045351009 | NaN | NaN | 4065 | 5065 | paid | fulfilled | 2020-09-09 22:57:50.000 | direct | 2cc983716a820bc713b793a6e8e73f42 | NaN | NaN | 2020-09-10 15:38:25.000 | 0.0 | 4.4 | 0 | web | 109.249.185.68 | False | e44b7f04610a8f4032530cc7f12663de | 9600543f4d4613db59ac58a1009ecbb9 | cf0a9fe2c7c606b86559007dbb890a62 | False | 8584e97b29b0802fb393fa453a8b6a7a | 2020-09-11 00:14:33.037 |
shopify_order_discount_code_data (first 100 rows)
index_ | order_id | _fivetran_synced | amount | code | type | |
---|---|---|---|---|---|---|
0 | 1 | 2674098602081 | 2022-11-20 08:14:52.957000 | 11.0 | GIFTCARD | percentage |
1 | 2 | 2674098602081 | 2022-11-20 08:14:52.957000 | 5.0 | SHIPPING2022 | shipping |
2 | 3 | 2674098602081 | 2022-11-20 08:14:52.957000 | 1.0 | FIXED | fixed_amount |
3 | 1 | 2669516488801 | 2022-11-19 11:59:50.040000 | 0.0 | SHIPPING2022 | shipping |
4 | 1 | 2669509541985 | 2022-11-20 10:22:23.877000 | 2.0 | GIFTCARD | percentage |
shopify_order_line_data (first 100 rows)
order_id | id | product_id | variant_id | name | title | vendor | price | quantity | grams | sku | fulfillable_quantity | fulfillment_service | gift_card | requires_shipping | taxable | index_ | total_discount | pre_tax_price | fulfillment_status | _fivetran_synced | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2669509541985 | 5699743678561 | 4526236893281 | 31879811629153 | 327ea22d0f91783418e519cb45a4a3e9 | 327ea22d0f91783418e519cb45a4a3e9 | 13aea892c8de2d62f2608c6191cfab1f | 4.4 | 1 | 0 | 854a136da51d43fb87c63c86a62ffad0 | 0 | manual | False | True | False | 1 | 0 | NaN | fulfilled | 2020-09-11 00:14:33.293 |
1 | 2669516488801 | 5699758784609 | 4506451050593 | 31814873481313 | 1fccbdc6ac5f6edabf76e56eb0460019 | 1fccbdc6ac5f6edabf76e56eb0460019 | 13aea892c8de2d62f2608c6191cfab1f | 2.8 | 1 | 0 | 198369004c95b2b35f480f9691b14178 | 0 | manual | False | True | False | 1 | 0 | NaN | fulfilled | 2020-09-11 00:14:33.767 |
2 | 2674098602081 | 5708321914977 | 4505775439969 | 31812476895329 | 74c574cc1e545fef2beeaf9bbb148fcc | 74c574cc1e545fef2beeaf9bbb148fcc | 57403999f78b01b3fd325ba256eafe94 | 2.8 | 2 | 0 | b988b358c81b47d3e438c99bfb1c4ee1 | 2 | manual | False | True | False | 1 | 0 | NaN | None | 2020-09-12 00:15:10.199 |
shopify_order_line_refund_data (first 100 rows)
id | location_id | refund_id | restock_type | quantity | order_line_id | _fivetran_synced | subtotal | total_tax_set | subtotal_set | total_tax | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 189012115527 | 3.213171e+10 | 679976206407 | return | 1 | 6113984839751 | 2020-11-14 07:52:56.522 | 415 | NaN | NaN | 19.74 |
1 | 289901510727 | 3.213171e+10 | 800919683143 | return | 1 | 9698959196231 | 2020-11-14 07:52:56.522 | 415 | NaN | NaN | 56.33 |
2 | 196428005447 | 3.213171e+10 | 686409187399 | return | 1 | 6423996530759 | 2020-11-14 07:52:56.522 | 415 | NaN | NaN | 16.18 |
3 | 286567268423 | NaN | 798222680135 | no_restock | 1 | 6367161483335 | 2020-11-14 07:52:56.522 | 415 | NaN | NaN | 26.17 |
4 | 185936773191 | NaN | 677359190087 | no_restock | 1 | 6009460064327 | 2020-11-14 07:52:56.522 | 415 | NaN | NaN | 13.75 |
shopify_order_note_attribute_data (first 100 rows)
name | order_id | _fivetran_synced | value_ | |
---|---|---|---|---|
0 | last_name | 34171115 | 2022-11-19 07:30:28.480000 | "1418143823.1643992155" |
1 | first_name | 34171115 | 2022-11-19 07:30:28.480000 | "fb.1.1643992155109.1110590605" |
2 | updated_at | 34171115 | 2022-11-19 07:30:28.480000 | "1643992163253" |
3 | clientID | 34171115 | 2022-11-19 07:30:28.480000 | "a03d3118-4048-4159-b5bb-1b90d8abb69b" |
4 | name | 34171115 | 2022-11-19 07:30:28.480000 | "22707603636395" |
shopify_order_shipping_line_data (first 100 rows)
id | order_id | _fivetran_synced | carrier_identifier | code | delivery_category | discounted_price | phone | price | requested_fulfillment_service_id | source | title | discounted_price_set | price_set | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 54475 | 55 | 2022-11-19 14:09:18.923000 | NaN | Standard | NaN | 0.0 | NaN | 0.0 | NaN | shopify | Standard | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
1 | 651 | 425579 | 2022-11-19 11:28:21.391000 | NaN | Standard | NaN | 0.0 | NaN | 0.0 | NaN | shopify | Standard | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
2 | 188139 | 4599 | 2022-11-19 16:03:15.430000 | NaN | Standard | NaN | 0.0 | NaN | 0.0 | NaN | shopify | Standard | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
shopify_order_shipping_tax_line_data (first 100 rows)
index_ | order_shipping_line_id | _fivetran_synced | price | rate | title | price_set | |
---|---|---|---|---|---|---|---|
0 | 4 | 321291 | 2022-11-19 15:05:15.847000 | 0.0 | 0.000 | GEIWIHG | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
1 | 3 | 5995 | 2022-11-19 11:24:24.596000 | 0.0 | 0.007 | BANANAN | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
2 | 3 | 309131 | 2022-11-19 16:52:35.685000 | 0.0 | 0.010 | TOMATO | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
shopify_order_tag_data (first 100 rows)
index_ | order_id | _fivetran_synced | value_ | |
---|---|---|---|---|
0 | 1 | 6411 | 2022-12-07 06:49:30.307000 | #33333 |
1 | 1 | 47195 | 2022-12-07 06:49:26.771000 | #22222 |
2 | 1 | 46553 | 2022-12-07 06:49:38.197000 | #771222 |
shopify_order_url_tag_data (first 100 rows)
key_ | order_id | _fivetran_synced | value_ | |
---|---|---|---|---|
0 | image | 40347 | 2022-11-19 10:29:18.624000 | Image |
1 | utm_medium | 4290347 | 2022-11-19 10:29:18.624000 | |
2 | prop_channel | 47 | 2022-11-19 10:29:18.624000 | flows |
shopify_price_rule_data (first 100 rows)
id | _fivetran_synced | allocation_limit | allocation_method | created_at | customer_selection | ends_at | once_per_customer | prerequisite_quantity_range | prerequisite_shipping_price_range | prerequisite_subtotal_range | quantity_ratio_entitled_quantity | quantity_ratio_prerequisite_quantity | starts_at | target_selection | target_type | title | updated_at | usage_limit | value_ | value_type | prerequisite_to_entitlement_purchase_prerequisite_amount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 11443 | 2021-03-22 05:43:56.784000 | NaN | across | 2021-03-09 18:57:54.000000 | all | 2021-03-22 07:00:59.000000 | False | NaN | NaN | 500.0 | NaN | NaN | 2021-03-17 04:00:57.000000 | all | line_item | GIFTCARD | 2021-03-22 04:20:03.000000 | NaN | 0.0 | percentage | NaN |
1 | 564075 | 2021-11-11 07:43:53.706000 | NaN | across | 2021-11-10 22:26:31.000000 | all | 2021-11-30 14:00:59.000000 | False | NaN | NaN | NaN | NaN | NaN | 2021-11-10 22:25:32.000000 | entitled | line_item | THANKS | 2021-11-10 22:26:31.000000 | NaN | 0.0 | percentage | NaN |
2 | 9339 | 2021-12-03 06:47:21.433000 | NaN | across | 2021-11-11 22:38:18.000000 | all | 2021-12-02 19:00:59.000000 | False | NaN | NaN | NaN | NaN | NaN | 2021-11-23 21:30:38.000000 | all | line_item | THANKS | 2021-12-02 19:21:47.000000 | NaN | 0.0 | percentage | NaN |
shopify_product_data (first 100 rows)
id | title | handle | product_type | vendor | created_at | updated_at | published_at | published_scope | _fivetran_deleted | _fivetran_synced | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 4506451050593 | 1fccbdc6ac5f6edabf76e56eb0460019 | f4b6d0e4413a19b2e7a291f0ef4dc98f | fdb42fcb90ecd31c015932ffcd313014 | 13aea892c8de2d62f2608c6191cfab1f | 2020-02-14 19:18:05.000 | 2020-09-10 18:16:42.000 | 2020-02-14 19:02:02.000 | web | False | 2020-09-11 00:14:09.592 |
1 | 4526236893281 | 327ea22d0f91783418e519cb45a4a3e9 | 129181bbc087330e216a6a4d7939f00b | ec3bb3dd6e9d1f348a040ee7b45f1a72 | 13aea892c8de2d62f2608c6191cfab1f | 2020-03-04 05:04:32.000 | 2020-09-10 15:06:03.000 | 2020-03-04 05:04:32.000 | web | False | 2020-09-11 00:14:07.989 |
2 | 4505775439969 | c6c6fea8419b94103b0b05d64a5bab10 | f0a656254aca08bf40181226ac13418c | fdb42fcb90ecd31c015932ffcd313014 | 57403999f78b01b3fd325ba256eafe94 | 2020-02-14 02:09:59.000 | 2020-09-11 21:21:21.000 | 2020-02-14 02:09:59.000 | global | False | 2020-09-12 00:14:11.721 |
shopify_product_image_data (first 100 rows)
(row) | id | product_id | _fivetran_deleted | _fivetran_synced | alt | created_at | height | position_ | src | updated_at | width | is_default | variant_ids |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 14180 | 38804 | False | 2022-12-01 06:51:36.660000 | NaN | 2019-06-13 04:06:07.000000 | 1200 | 4 | https://cdn.shopify.com/s/files/glassess-1784103173.jpg?v=1560398767 | 2019-06-13 04:06:07.000000 | 956 | False | [] |
1 | 748644 | 34804 | False | 2022-12-01 06:51:36.660000 | NaN | 2019-06-13 04:06:07.000000 | 1200 | 2 | https://cdn.shopify.com/s/files/1/smile.jpg?v=1560398767 | 2019-06-13 04:06:07.000000 | 956 | False | [] |
2 | 679716 | 34604 | False | 2022-12-01 06:51:36.660000 | NaN | 2019-06-13 04:06:07.000000 | 1200 | 6 | https://cdn.shopify.com/s/files/1/kitten.jpg?v=1560398767 | 2019-06-13 04:06:07.000000 | 956 | False | [2755330292,27559733,275597338,275597536,2755931364,2755973,2734989668] |
shopify_product_tag_data (first 100 rows)
(row) | index_ | product_id | _fivetran_synced | value_ |
---|---|---|---|---|
0 | 9 | 1234 | 2022-12-01 06:51:36.480000 | Type: Clothing |
1 | 5 | 1234 | 2022-12-01 06:51:36.480000 | Final Sale |
2 | 7 | 1234 | 2022-12-01 06:51:36.480000 | Sale |
3 | 8 | 1234 | 2022-12-01 06:51:36.480000 | StyleID:nice |
4 | 3 | 1234 | 2022-12-01 06:51:36.480000 | Collection: Bottoms |
shopify_product_variant_data (first 100 rows)
(row) | id | product_id | inventory_item_id | title | price | sku | position_ | inventory_policy | compare_at_price | fulfillment_service | inventory_management | created_at | updated_at | taxable | barcode | grams | image_id | inventory_quantity | weight | weight_unit | old_inventory_quantity | requires_shipping | _fivetran_synced | option_2 | tax_code | option_3 | option_1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 39262114414663 | 6540108431431 | 41356021661767 | my title here | 111 | NaN | 1 | deny | NaN | manual | None | 2021-03-08 16:30:15.000 | 2021-04-12 19:49:43.000 | False | NaN | 0 | NaN | 0 | 0 | lb | 0 | False | 2021-04-16 07:50:32.995 | NaN | None | NaN | my title here |
1 | 39273118957639 | 6544066379847 | 41367035936839 | my title here | 222 | NaN | 1 | deny | NaN | manual | None | 2021-03-17 16:39:45.000 | 2021-04-12 19:46:59.000 | False | NaN | 0 | NaN | 0 | 0 | lb | 0 | False | 2021-04-16 07:50:29.241 | NaN | None | NaN | my title here |
2 | 39290169262151 | 6548438188103 | 41384094924871 | my title here | 5 | NaN | 1 | deny | NaN | manual | inventory manager | 2021-03-30 19:48:15.000 | 2021-03-30 19:48:15.000 | True | NaN | 0 | NaN | 0 | 0 | lb | 0 | True | 2021-04-16 07:50:32.720 | NaN | None | NaN | my title here |
3 | 39262115397703 | 6540109250631 | 41356022644807 | my title here | 333 | NaN | 1 | deny | NaN | manual | None | 2021-03-08 16:31:31.000 | 2021-04-12 19:47:26.000 | False | NaN | 0 | NaN | -5 | 0 | lb | -5 | False | 2021-04-16 07:50:29.822 | NaN | None | NaN | my title here |
4 | 29217058947142 | 3879735590982 | 30309980143686 | my other title | 444 | NaN | 1 | deny | NaN | manual | inventory manager | 2019-06-25 18:32:03.000 | 2019-10-01 23:40:09.000 | True | NaN | 222 | NaN | 0 | 1 | lb | 0 | True | 2021-04-16 07:50:25.006 | NaN | TR9999 | NaN | my other title |
shopify_refund_data (first 100 rows)
(row) | id | created_at | processed_at | note | restock | user_id | _fivetran_synced | total_duties_set | order_id |
---|---|---|---|---|---|---|---|---|---|
0 | 801704738887 | 2021-04-17 20:25:08.000 | 2021-04-17 20:25:08.000 | None | False | 40467791943 | 2021-04-18 08:05:22.056 | NaN | 3726667481159 |
1 | 801695039559 | 2021-04-17 15:45:21.000 | 2021-04-17 15:45:21.000 | None | False | 40467791943 | 2021-04-18 07:52:19.104 | NaN | 3725521846343 |
2 | 801704181831 | 2021-04-17 20:15:01.000 | 2021-04-17 20:15:01.000 | None | False | 40467791943 | 2021-04-18 08:05:22.522 | NaN | 3726619476039 |
3 | 801703428167 | 2021-04-17 19:56:51.000 | 2021-04-17 19:56:51.000 | my refund note | False | 40467791943 | 2021-04-18 08:05:22.841 | NaN | 3726370996295 |
4 | 801707360327 | 2021-04-17 21:32:50.000 | 2021-04-17 21:32:50.000 | None | False | 40467791943 | 2021-04-18 08:02:24.256 | NaN | 3726858289223 |
shopify_shop_data (first 100 rows)
id | _fivetran_deleted | _fivetran_synced | address_1 | address_2 | auto_configure_tax_inclusivity | checkout_api_supported | city | cookie_consent_level | country | country_code | country_name | county_taxes | created_at | currency | customer_email | domain_ | eligible_for_card_reader_giveaway | eligible_for_payments | enabled_presentment_currencies | force_ssl | google_apps_domain | google_apps_login_enabled | has_discounts | has_gift_cards | has_storefront | iana_timezone | latitude | longitude | money_format | money_in_emails_format | money_with_currency_format | money_with_currency_in_emails_format | multi_location_enabled | myshopify_domain | name | password_enabled | phone | plan_display_name | plan_name | pre_launch_enabled | primary_locale | primary_location_id | province | province_code | requires_extra_payments_agreement | setup_required | shop_owner | source | tax_shipping | taxes_included | timezone | updated_at | visitor_tracking_consent_preference | weight_unit | zip | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 689 | False | 2022-12-07 06:49:41.652000 | 1 Main Street | 200th Floor | NaN | True | New York | implicit | US | US | United States | True | 2018-12-10 16:24:00.000000 | USD | noreply@kitties.com | kitties.com | True | True | abc@kitties.com | ["USD"] | True | NaN | NaN | True | True | True | America/New_York | 80.1234 | -123.12345 | ${{amount}} | ${{amount}} | ${{amount}} USD | ${{amount}} USD | True | kitties.myshopify.com | Garrett & Alfredo | False | 13373 | Shopify Plus | shopify_plus | False | en | 1234646345 | New York | NY | False | False | Garrett & Alfredo | NaN | NaN | False | (GMT-05:00) America/New_York | 2022-12-07 00:26:36.000000 | allow_all | lb | 10014 |
shopify_tax_line_data (first 100 rows)
(row) | index_ | order_line_id | _fivetran_synced | price | rate | title | price_set |
---|---|---|---|---|---|---|---|
0 | 1 | 29227 | 2022-11-19 05:30:34.023000 | 0.0 | 0.0 | VAT | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
1 | 1 | 1839083 | 2022-11-19 07:14:05.023000 | 0.0 | 0.0 | VAT | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
2 | 1 | 11995 | 2022-11-19 05:30:34.023000 | 0.0 | 0.0 | VAT | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
3 | 1 | 10751 | 2022-11-19 07:14:05.024000 | 0.0 | 0.0 | VAT | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
4 | 1 | 194763 | 2022-11-19 05:30:34.023000 | 0.0 | 0.0 | VAT | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
shopify_tender_transaction_data (first 100 rows)
(row) | id | _fivetran_synced | amount | currency | order_id | payment_details_credit_card_company | payment_details_credit_card_number | payment_method | processed_at | remote_reference | test | user_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 34283 | 2022-12-01 06:51:34.004000 | 2895.74 | USD | 45379 | NaN | NaN | other | 2022-11-30 18:14:37.000000 | NaN | False | NaN |
1 | 905707 | 2022-12-01 06:51:42.309000 | 5900.75 | USD | 45243 | NaN | NaN | other | 2022-12-01 02:00:39.000000 | NaN | False | NaN |
2 | 411 | 2022-12-01 06:51:29.718000 | -164.72 | USD | 4559467 | NaN | NaN | other | 2022-11-30 14:29:13.000000 | NaN | False | NaN |
3 | 55179 | 2022-12-01 06:51:41.198000 | 5180.19 | USD | 35 | NaN | NaN | other | 2022-11-30 23:55:45.000000 | NaN | False | NaN |
4 | 16923 | 2022-12-01 06:51:42.358000 | 3004.30 | USD | 45955 | NaN | NaN | other | 2022-12-01 02:09:47.000000 | NaN | False | NaN |
shopify_transaction_data (first 100 rows)
(row) | id | order_id | refund_id | amount | authorization_ | created_at | processed_at | device_id | gateway | source_name | message | currency | location_id | parent_id | payment_avs_result_code | kind | currency_exchange_id | currency_exchange_adjustment | currency_exchange_original_amount | currency_exchange_final_amount | currency_exchange_currency | error_code | status | test | user_id | _fivetran_synced | payment_credit_card_bin | payment_cvv_result_code | payment_credit_card_number | payment_credit_card_company | receipt |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2667417567303 | 2181743870023 | NaN | 415.00 | abcd999999 | 2020-02-27 16:05:37.000 | 2020-02-27 16:05:37.000 | NaN | gateway_here | source_name | message_here | USD | NaN | NaN | Z | sale | NaN | NaN | NaN | NaN | NaN | NaN | success | False | NaN | 2020-10-28 20:33:09.797 | NaN | NaN | NaN | NaN | { "charges": { "data": [ { "balance_transaction": { "exchange_rate": null } }] }} |
1 | 2572210896967 | 2089104834631 | NaN | 415.00 | abcd888888 | 2020-01-12 20:06:37.000 | 2020-01-12 20:06:37.000 | NaN | gateway_here | source_name | message_here | USD | NaN | NaN | Y | sale | NaN | NaN | NaN | NaN | NaN | NaN | success | False | NaN | 2020-10-28 17:05:27.756 | NaN | NaN | NaN | NaN | None |
2 | 2664325611591 | 2179107356743 | NaN | 415.00 | abcd77777 | 2020-02-26 00:12:37.000 | 2020-02-26 00:12:37.000 | NaN | gateway_here | source_name | message_here | USD | NaN | NaN | None | sale | NaN | NaN | NaN | NaN | NaN | NaN | success | False | NaN | 2020-10-28 20:23:50.344 | NaN | NaN | NaN | NaN | { "charges": { "data": [ { "balance_transaction": { "exchange_rate": "0.523" } }] }} |
3 | 2595729735751 | 2114590769223 | NaN | 15.95 | abcd66666 | 2020-01-26 11:04:41.000 | 2020-01-26 11:04:41.000 | NaN | gateway_here | source_name | message_here | USD | NaN | NaN | Y | sale | NaN | NaN | NaN | NaN | NaN | NaN | success | False | NaN | 2020-10-28 18:10:27.604 | NaN | NaN | NaN | NaN | None |
4 | 2705030512711 | 2214516916295 | NaN | 212.12 | abcd5555 | 2020-03-18 00:17:24.000 | 2020-03-18 00:17:24.000 | NaN | gateway_here | source_name | message_here | USD | NaN | NaN | None | sale | NaN | NaN | NaN | NaN | NaN | NaN | success | False | NaN | 2020-10-28 22:14:02.944 | NaN | NaN | NaN | NaN | { "charges": { "data": [ { "balance_transaction": { "exchange_rate": "0.96581" } }] }} |
stg_shopify_order_discount_code_data (first 100 rows)
(row) | discount_order | discount_value | discount_code | discount_type | order_id |
---|---|---|---|---|---|
0 | 1 | 11.0 | GIFTCARD | percentage | 2674098602081 |
1 | 2 | 5.0 | SHIPPING2022 | shipping | 2674098602081 |
2 | 3 | 1.0 | FIXED | fixed_amount | 2674098602081 |
3 | 1 | 0.0 | SHIPPING2022 | shipping | 2669516488801 |
4 | 1 | 2.0 | GIFTCARD | percentage | 2669509541985 |
stg_shopify_order_discount_code_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_discount_code_data_projected" AS (
-- Projection: Selecting 5 out of 6 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"index_",
"order_id",
"amount",
"code",
"type"
FROM "shopify_order_discount_code_data"
),
"shopify_order_discount_code_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> discount_order
-- amount -> discount_value
-- code -> discount_code
-- type -> discount_type
SELECT
"index_" AS "discount_order",
"order_id",
"amount" AS "discount_value",
"code" AS "discount_code",
"type" AS "discount_type"
FROM "shopify_order_discount_code_data_projected"
),
"shopify_order_discount_code_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- order_id: from INT to VARCHAR
SELECT
"discount_order",
"discount_value",
"discount_code",
"discount_type",
CAST("order_id" AS VARCHAR) AS "order_id"
FROM "shopify_order_discount_code_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_discount_code_data_projected_renamed_casted"
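The three CTEs above amount to one projection, one rename, and one cast. As a minimal sketch, the same steps can be run from Python against an in-memory SQLite database. SQLite is an assumption here (the target warehouse is not stated), so TEXT stands in for VARCHAR. The sample values are taken from the staged preview, and the unknown `_fivetran_synced` timestamps are left NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE "shopify_order_discount_code_data" (
        "index_" INTEGER,
        "order_id" INTEGER,
        "_fivetran_synced" TEXT,
        "amount" REAL,
        "code" TEXT,
        "type" TEXT
    )
    """
)

# Values copied from the staged preview; sync timestamps are unknown, so NULL.
raw_rows = [
    (1, 2674098602081, None, 11.0, "GIFTCARD", "percentage"),
    (2, 2674098602081, None, 5.0, "SHIPPING2022", "shipping"),
    (3, 2674098602081, None, 1.0, "FIXED", "fixed_amount"),
    (1, 2669516488801, None, 0.0, "SHIPPING2022", "shipping"),
    (1, 2669509541985, None, 2.0, "GIFTCARD", "percentage"),
]
conn.executemany(
    'INSERT INTO "shopify_order_discount_code_data" VALUES (?, ?, ?, ?, ?, ?)',
    raw_rows,
)

# Projection, rename, and cast collapsed into a single SELECT.
staged = conn.execute(
    """
    SELECT
        "index_" AS "discount_order",
        "amount" AS "discount_value",
        "code"   AS "discount_code",
        "type"   AS "discount_type",
        CAST("order_id" AS TEXT) AS "order_id"
    FROM "shopify_order_discount_code_data"
    ORDER BY rowid
    """
).fetchall()

for row in staged:
    print(row)
```

Casting `order_id` to text keeps the identifier from being treated as a quantity in later arithmetic or aggregation.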
stg_shopify_order_discount_code_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_discount_code_data
description: This table lists the discount codes applied to orders in a Shopify
store. Each row represents one discount applied to an order and records the order
ID, discount amount, code used, and discount type (percentage, shipping, or
fixed_amount). A single order can carry multiple discounts.
columns:
- name: discount_order
description: Order of application for multiple discounts
tests:
- not_null
- name: discount_value
description: Value of the discount applied
tests:
- not_null
- name: discount_code
description: Discount code used for the order
tests:
- not_null
- name: discount_type
description: Category of discount (percentage, shipping, or fixed_amount)
tests:
- not_null
- accepted_values:
values:
- percentage
- shipping
- fixed_amount
- name: order_id
description: Unique identifier for the order
tests:
- not_null
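The `not_null` and `accepted_values` tests declared above each come down to looking for offending rows. The following is a rough Python sketch of that idea, run over the five preview rows. The helper names `null_failures` and `accepted_value_failures` are hypothetical, and this is not how dbt itself executes its tests.

```python
ACCEPTED_DISCOUNT_TYPES = {"percentage", "shipping", "fixed_amount"}

# The five staged preview rows, reduced to the columns checked below.
staged_rows = [
    {"discount_code": "GIFTCARD", "discount_type": "percentage", "order_id": "2674098602081"},
    {"discount_code": "SHIPPING2022", "discount_type": "shipping", "order_id": "2674098602081"},
    {"discount_code": "FIXED", "discount_type": "fixed_amount", "order_id": "2674098602081"},
    {"discount_code": "SHIPPING2022", "discount_type": "shipping", "order_id": "2669516488801"},
    {"discount_code": "GIFTCARD", "discount_type": "percentage", "order_id": "2669509541985"},
]


def null_failures(rows, column):
    """Return the rows where `column` is missing (a not_null style check)."""
    return [r for r in rows if r[column] is None]


def accepted_value_failures(rows, column, accepted):
    """Return the rows whose non-null `column` value is outside `accepted`."""
    return [r for r in rows if r[column] is not None and r[column] not in accepted]


null_hits = null_failures(staged_rows, "order_id")
type_hits = accepted_value_failures(staged_rows, "discount_type", ACCEPTED_DISCOUNT_TYPES)
print(len(null_hits), len(type_hits))
```

A value spelled differently from the accepted list, for example `fixed amount` with a space, would be returned by `accepted_value_failures`, so the spelling in the YAML has to match the stored values exactly.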
stg_shopify_price_rule_data (first 100 rows)
(row) | price_rule_id | allocation_method | customer_eligibility | one_time_use | subtotal_prerequisite | discount_target | target_type | price_rule_name | discount_value | discount_type | allocation_limit | creation_date | expiration_date | last_updated | start_date | usage_limit |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 11443 | across | all | False | 500.0 | all | line_item | GIFTCARD | 0.0 | percentage | None | 2021-03-09 18:57:54 | 2021-03-22 07:00:59 | 2021-03-22 04:20:03 | 2021-03-17 04:00:57 | None |
1 | 564075 | across | all | False | NaN | entitled | line_item | THANKS | 0.0 | percentage | None | 2021-11-10 22:26:31 | 2021-11-30 14:00:59 | 2021-11-10 22:26:31 | 2021-11-10 22:25:32 | None |
2 | 9339 | across | all | False | NaN | all | line_item | THANKS | 0.0 | percentage | None | 2021-11-11 22:38:18 | 2021-12-02 19:00:59 | 2021-12-02 19:21:47 | 2021-11-23 21:30:38 | None |
stg_shopify_price_rule_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_price_rule_data_projected" AS (
-- Projection: Selecting 21 out of 22 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"allocation_limit",
"allocation_method",
"created_at",
"customer_selection",
"ends_at",
"once_per_customer",
"prerequisite_quantity_range",
"prerequisite_shipping_price_range",
"prerequisite_subtotal_range",
"quantity_ratio_entitled_quantity",
"quantity_ratio_prerequisite_quantity",
"starts_at",
"target_selection",
"target_type",
"title",
"updated_at",
"usage_limit",
"value_",
"value_type",
"prerequisite_to_entitlement_purchase_prerequisite_amount"
FROM "shopify_price_rule_data"
),
"shopify_price_rule_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> price_rule_id
-- created_at -> creation_date
-- customer_selection -> customer_eligibility
-- ends_at -> expiration_date
-- once_per_customer -> one_time_use
-- prerequisite_quantity_range -> quantity_prerequisite
-- prerequisite_shipping_price_range -> shipping_price_prerequisite
-- prerequisite_subtotal_range -> subtotal_prerequisite
-- quantity_ratio_entitled_quantity -> entitled_quantity_ratio
-- quantity_ratio_prerequisite_quantity -> prerequisite_quantity_ratio
-- starts_at -> start_date
-- target_selection -> discount_target
-- title -> price_rule_name
-- updated_at -> last_updated
-- value_ -> discount_value
-- value_type -> discount_type
-- prerequisite_to_entitlement_purchase_prerequisite_amount -> entitlement_purchase_prerequisite
SELECT
"id" AS "price_rule_id",
"allocation_limit",
"allocation_method",
"created_at" AS "creation_date",
"customer_selection" AS "customer_eligibility",
"ends_at" AS "expiration_date",
"once_per_customer" AS "one_time_use",
"prerequisite_quantity_range" AS "quantity_prerequisite",
"prerequisite_shipping_price_range" AS "shipping_price_prerequisite",
"prerequisite_subtotal_range" AS "subtotal_prerequisite",
"quantity_ratio_entitled_quantity" AS "entitled_quantity_ratio",
"quantity_ratio_prerequisite_quantity" AS "prerequisite_quantity_ratio",
"starts_at" AS "start_date",
"target_selection" AS "discount_target",
"target_type",
"title" AS "price_rule_name",
"updated_at" AS "last_updated",
"usage_limit",
"value_" AS "discount_value",
"value_type" AS "discount_type",
"prerequisite_to_entitlement_purchase_prerequisite_amount" AS "entitlement_purchase_prerequisite"
FROM "shopify_price_rule_data_projected"
),
"shopify_price_rule_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- allocation_limit: from DECIMAL to VARCHAR
-- creation_date: from VARCHAR to TIMESTAMP
-- entitled_quantity_ratio: from DECIMAL to VARCHAR
-- entitlement_purchase_prerequisite: from DECIMAL to VARCHAR
-- expiration_date: from VARCHAR to TIMESTAMP
-- last_updated: from VARCHAR to TIMESTAMP
-- prerequisite_quantity_ratio: from DECIMAL to VARCHAR
-- quantity_prerequisite: from DECIMAL to VARCHAR
-- shipping_price_prerequisite: from DECIMAL to VARCHAR
-- start_date: from VARCHAR to TIMESTAMP
-- usage_limit: from DECIMAL to VARCHAR
SELECT
"price_rule_id",
"allocation_method",
"customer_eligibility",
"one_time_use",
"subtotal_prerequisite",
"discount_target",
"target_type",
"price_rule_name",
"discount_value",
"discount_type",
CAST("allocation_limit" AS VARCHAR) AS "allocation_limit",
CAST("creation_date" AS TIMESTAMP) AS "creation_date",
CAST("entitled_quantity_ratio" AS VARCHAR) AS "entitled_quantity_ratio",
CAST("entitlement_purchase_prerequisite" AS VARCHAR) AS "entitlement_purchase_prerequisite",
CAST("expiration_date" AS TIMESTAMP) AS "expiration_date",
CAST("last_updated" AS TIMESTAMP) AS "last_updated",
CAST("prerequisite_quantity_ratio" AS VARCHAR) AS "prerequisite_quantity_ratio",
CAST("quantity_prerequisite" AS VARCHAR) AS "quantity_prerequisite",
CAST("shipping_price_prerequisite" AS VARCHAR) AS "shipping_price_prerequisite",
CAST("start_date" AS TIMESTAMP) AS "start_date",
CAST("usage_limit" AS VARCHAR) AS "usage_limit"
FROM "shopify_price_rule_data_projected_renamed"
),
"shopify_price_rule_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 5 columns with unacceptable missing values
-- entitled_quantity_ratio has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- entitlement_purchase_prerequisite has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- prerequisite_quantity_ratio has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- quantity_prerequisite has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_price_prerequisite has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"price_rule_id",
"allocation_method",
"customer_eligibility",
"one_time_use",
"subtotal_prerequisite",
"discount_target",
"target_type",
"price_rule_name",
"discount_value",
"discount_type",
"allocation_limit",
"creation_date",
"expiration_date",
"last_updated",
"start_date",
"usage_limit"
FROM "shopify_price_rule_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_price_rule_data_projected_renamed_casted_missing_handled"
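The final CTE above drops columns that are entirely missing. One way such a decision could be derived is to compute the share of missing values per column. The sketch below does that for three of the renamed columns, using only the three preview rows, so the percentages for the full table may differ. The helper `missing_percent` is a hypothetical name and not part of the generated pipeline.

```python
# Three preview rows, restricted to the prerequisite columns (NaN shown as None).
preview_rows = [
    {"quantity_prerequisite": None, "shipping_price_prerequisite": None, "subtotal_prerequisite": 500.0},
    {"quantity_prerequisite": None, "shipping_price_prerequisite": None, "subtotal_prerequisite": None},
    {"quantity_prerequisite": None, "shipping_price_prerequisite": None, "subtotal_prerequisite": None},
]


def missing_percent(rows, column):
    """Percentage of rows in which `column` has no value."""
    missing = sum(1 for r in rows if r[column] is None)
    return 100.0 * missing / len(rows)


percentages = {col: missing_percent(preview_rows, col) for col in preview_rows[0]}

# Only columns with no values at all are candidates for dropping.
fully_missing = [col for col, pct in percentages.items() if pct == 100.0]

for col, pct in percentages.items():
    print(f"{col}: {pct:.1f}% missing")
print("drop candidates:", fully_missing)
```

In this sample `subtotal_prerequisite` is partly missing but still carries a value, which is consistent with the model keeping it and documenting the gaps as acceptable.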
stg_shopify_price_rule_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_price_rule_data
description: The table is about Shopify price rules. It contains details of discount
configurations. Each rule has an ID, creation date, and expiration date. Rules
specify customer eligibility, discount type, and value. They can target specific
items or all products. Additional fields set prerequisites like minimum purchase
amounts. The table allows for flexible discount creation and management in the
Shopify platform.
columns:
- name: price_rule_id
description: Unique identifier for the price rule
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is a unique identifier for each price rule. For this
table, each row represents a distinct price rule configuration. price_rule_id
is unique across rows as it's designed to be the primary identifier for each
rule.
- name: allocation_method
description: Method for allocating discount across products
tests:
- not_null
- accepted_values:
values:
- proportional
- equal
- first item
- last item
- highest priced item
- lowest priced item
- random
- across
- name: customer_eligibility
description: Specifies which customers are eligible
tests:
- not_null
- accepted_values:
values:
- all
- new
- existing
- premium
- standard
- vip
- loyalty_program
- first_time
- returning
- age_18_plus
- age_21_plus
- students
- seniors
- military
- corporate
- name: one_time_use
description: Indicates if discount is one-time use
tests:
- not_null
- name: subtotal_prerequisite
description: Required subtotal range for discount eligibility
cocoon_meta:
missing_acceptable: Not applicable when no minimum purchase is required.
- name: discount_target
description: Specifies which items the discount applies to
tests:
- not_null
- accepted_values:
values:
- all
- entitled
- specific
- name: target_type
description: Type of target for the discount
tests:
- not_null
- accepted_values:
values:
- line_item
- order
- shipping_line
- product
- category
- customer
- customer_group
- name: price_rule_name
description: Name or title of the price rule
tests:
- not_null
- name: discount_value
description: Numerical value of the discount
tests:
- not_null
- name: discount_type
description: Type of discount value (percentage or fixed_amount)
tests:
- not_null
- accepted_values:
values:
- percentage
- fixed_amount
- name: allocation_limit
description: Limits how discount is allocated
cocoon_meta:
missing_acceptable: Not applicable when allocation method is 'across'.
- name: creation_date
description: Timestamp when the price rule was created
tests:
- not_null
- name: expiration_date
description: Timestamp when the price rule expires
tests:
- not_null
- name: last_updated
description: Timestamp of last update to the rule
tests:
- not_null
- name: start_date
description: Timestamp when the price rule becomes active
tests:
- not_null
- name: usage_limit
description: Maximum number of times rule can be used
cocoon_meta:
missing_acceptable: Not applicable when there's no limit on usage.
stg_shopify_order_shipping_line_data (first 100 rows)
(row) | shipping_line_id | order_id | shipping_code | discounted_price_numeric | price_numeric | shipping_source | shipping_method_title | carrier_id | discounted_price_details | price_details |
---|---|---|---|---|---|---|---|---|---|---|
0 | 54475 | 55 | Standard | 0.0 | 0.0 | shopify | Standard | None | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
1 | 651 | 425579 | Standard | 0.0 | 0.0 | shopify | Standard | None | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
2 | 188139 | 4599 | Standard | 0.0 | 0.0 | shopify | Standard | None | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
stg_shopify_order_shipping_line_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_shipping_line_data_projected" AS (
-- Projection: Selecting 13 out of 14 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"order_id",
"carrier_identifier",
"code",
"delivery_category",
"discounted_price",
"phone",
"price",
"requested_fulfillment_service_id",
"source",
"title",
"discounted_price_set",
"price_set"
FROM "shopify_order_shipping_line_data"
),
"shopify_order_shipping_line_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> shipping_line_id
-- carrier_identifier -> carrier_id
-- code -> shipping_code
-- delivery_category -> delivery_type
-- discounted_price -> discounted_price_numeric
-- phone -> shipping_phone
-- price -> price_numeric
-- requested_fulfillment_service_id -> fulfillment_service_id
-- source -> shipping_source
-- title -> shipping_method_title
-- discounted_price_set -> discounted_price_details
-- price_set -> price_details
SELECT
"id" AS "shipping_line_id",
"order_id",
"carrier_identifier" AS "carrier_id",
"code" AS "shipping_code",
"delivery_category" AS "delivery_type",
"discounted_price" AS "discounted_price_numeric",
"phone" AS "shipping_phone",
"price" AS "price_numeric",
"requested_fulfillment_service_id" AS "fulfillment_service_id",
"source" AS "shipping_source",
"title" AS "shipping_method_title",
"discounted_price_set" AS "discounted_price_details",
"price_set" AS "price_details"
FROM "shopify_order_shipping_line_data_projected"
),
"shopify_order_shipping_line_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- carrier_id: from DECIMAL to VARCHAR
-- delivery_type: from DECIMAL to VARCHAR
-- discounted_price_details: from VARCHAR to JSON
-- fulfillment_service_id: from DECIMAL to VARCHAR
-- price_details: from VARCHAR to JSON
-- shipping_phone: from DECIMAL to VARCHAR
SELECT
"shipping_line_id",
"order_id",
"shipping_code",
"discounted_price_numeric",
"price_numeric",
"shipping_source",
"shipping_method_title",
CAST("carrier_id" AS VARCHAR) AS "carrier_id",
CAST("delivery_type" AS VARCHAR) AS "delivery_type",
CAST("discounted_price_details" AS JSON) AS "discounted_price_details",
CAST("fulfillment_service_id" AS VARCHAR) AS "fulfillment_service_id",
CAST("price_details" AS JSON) AS "price_details",
CAST("shipping_phone" AS VARCHAR) AS "shipping_phone"
FROM "shopify_order_shipping_line_data_projected_renamed"
),
"shopify_order_shipping_line_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 3 columns with unacceptable missing values
-- delivery_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- fulfillment_service_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_phone has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"shipping_line_id",
"order_id",
"shipping_code",
"discounted_price_numeric",
"price_numeric",
"shipping_source",
"shipping_method_title",
"carrier_id",
"discounted_price_details",
"price_details"
FROM "shopify_order_shipping_line_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_shipping_line_data_projected_renamed_casted_missing_handled"
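The `price_details` and `discounted_price_details` columns are cast to JSON, and each value holds separate `shop_money` and `presentment_money` objects. As a small sketch of how such a value might be read downstream in Python, the example below parses the sample string from the preview. `money_amount` is a hypothetical helper name.

```python
import json
from decimal import Decimal

# Sample value copied from the preview rows.
price_details = (
    '{"shop_money":{"amount":"0.00","currency_code":"USD"},'
    '"presentment_money":{"amount":"0.00","currency_code":"USD"}}'
)


def money_amount(price_set_json, key="shop_money"):
    """Return (amount, currency_code) for one money object in a price set."""
    money = json.loads(price_set_json)[key]
    # Amounts arrive as strings; Decimal avoids float rounding on currency.
    return Decimal(money["amount"]), money["currency_code"]


shop_amount, shop_currency = money_amount(price_details)
presentment_amount, presentment_currency = money_amount(price_details, "presentment_money")
print(shop_amount, shop_currency, presentment_amount, presentment_currency)
```

Reading both objects matters when the shop currency and the currency shown to the buyer differ. In the sample rows they are both USD.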
stg_shopify_order_shipping_line_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_shipping_line_data
description: This table holds shipping information for Shopify orders. It includes
the order ID, shipping carrier, shipping method, and pricing. Each row represents
one shipping line item for a specific order. Discounted and regular prices are stored
both as numeric values and as JSON with shop and presentment currency amounts. All
sampled rows show standard shipping at zero cost, suggesting possible free shipping offers.
columns:
- name: shipping_line_id
description: Unique identifier for the shipping line item
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each shipping line
item. For this table, each row represents a single shipping line item for
a specific order. The shipping_line_id is likely to be unique across rows
as it's designed to identify each shipping line individually.
- name: order_id
description: Identifier of the associated order
tests:
- not_null
- name: shipping_code
description: Code representing the shipping method
tests:
- not_null
- accepted_values:
values:
- Standard
- Express
- Overnight
- Two-Day
- Ground
- Priority
- Economy
- Same-Day
- International
- Freight
- name: discounted_price_numeric
description: Discounted shipping price as a numeric value
tests:
- not_null
- name: price_numeric
description: Regular shipping price as a numeric value
tests:
- not_null
- name: shipping_source
description: Source of the shipping information
tests:
- not_null
- accepted_values:
values:
- shopify
- manual
- api
- csv_import
- third_party_logistics
- marketplace
- dropshipping
- erp_system
- order_management_system
- custom_integration
- name: shipping_method_title
description: Title or name of the shipping method
tests:
- not_null
- accepted_values:
values:
- Standard
- Express
- Overnight
- Two-Day
- Ground
- Economy
- Priority
- Same Day
- International
- Free Shipping
- Local Pickup
- Flat Rate
- name: carrier_id
description: Identifier for the shipping carrier
cocoon_meta:
missing_acceptable: Not applicable when shipping is handled by Shopify.
- name: discounted_price_details
description: Detailed discounted price information in JSON format
tests:
- not_null
- name: price_details
description: Detailed regular price information in JSON format
tests:
- not_null
stg_shopify_refund_data (first 100 rows)
(row) | refund_note | items_restocked | customer_id | original_order_id | refund_created_at | refund_duties | refund_id | refund_processed_at |
---|---|---|---|---|---|---|---|---|
0 | None | False | 40467791943 | 3726667481159 | 2021-04-17 20:25:08 | None | 801704738887 | 2021-04-17 20:25:08 |
1 | None | False | 40467791943 | 3725521846343 | 2021-04-17 15:45:21 | None | 801695039559 | 2021-04-17 15:45:21 |
2 | None | False | 40467791943 | 3726619476039 | 2021-04-17 20:15:01 | None | 801704181831 | 2021-04-17 20:15:01 |
3 | None | False | 40467791943 | 3726370996295 | 2021-04-17 19:56:51 | None | 801703428167 | 2021-04-17 19:56:51 |
4 | None | False | 40467791943 | 3726858289223 | 2021-04-17 21:32:50 | None | 801707360327 | 2021-04-17 21:32:50 |
stg_shopify_refund_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_refund_data_projected" AS (
-- Projection: Selecting 8 out of 9 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"created_at",
"processed_at",
"note",
"restock",
"user_id",
"total_duties_set",
"order_id"
FROM "shopify_refund_data"
),
"shopify_refund_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> refund_id
-- created_at -> refund_created_at
-- processed_at -> refund_processed_at
-- note -> refund_note
-- restock -> items_restocked
-- user_id -> customer_id
-- total_duties_set -> refund_duties
-- order_id -> original_order_id
SELECT
"id" AS "refund_id",
"created_at" AS "refund_created_at",
"processed_at" AS "refund_processed_at",
"note" AS "refund_note",
"restock" AS "items_restocked",
"user_id" AS "customer_id",
"total_duties_set" AS "refund_duties",
"order_id" AS "original_order_id"
FROM "shopify_refund_data_projected"
),
"shopify_refund_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- refund_note: The problem is that 'my refund note' appears to be a placeholder value rather than genuine refund notes. It's unusual because it's generic and doesn't provide any specific information about individual refunds. The correct values should be actual refund notes or an empty string if no specific note is available.
SELECT
"refund_id",
"refund_created_at",
"refund_processed_at",
CASE
WHEN "refund_note" = 'my refund note' THEN ''
ELSE "refund_note"
END AS "refund_note",
"items_restocked",
"customer_id",
"refund_duties",
"original_order_id"
FROM "shopify_refund_data_projected_renamed"
),
"shopify_refund_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- refund_note: ['']
SELECT
CASE
WHEN "refund_note" = '' THEN NULL
ELSE "refund_note"
END AS "refund_note",
"original_order_id",
"refund_created_at",
"refund_duties",
"customer_id",
"refund_processed_at",
"items_restocked",
"refund_id"
FROM "shopify_refund_data_projected_renamed_cleaned"
),
"shopify_refund_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- customer_id: from INT to VARCHAR
-- original_order_id: from INT to VARCHAR
-- refund_created_at: from VARCHAR to TIMESTAMP
-- refund_duties: from DECIMAL to VARCHAR
-- refund_id: from INT to VARCHAR
-- refund_processed_at: from VARCHAR to TIMESTAMP
SELECT
"refund_note",
"items_restocked",
CAST("customer_id" AS VARCHAR) AS "customer_id",
CAST("original_order_id" AS VARCHAR) AS "original_order_id",
CAST("refund_created_at" AS TIMESTAMP) AS "refund_created_at",
CAST("refund_duties" AS VARCHAR) AS "refund_duties",
CAST("refund_id" AS VARCHAR) AS "refund_id",
CAST("refund_processed_at" AS TIMESTAMP) AS "refund_processed_at"
FROM "shopify_refund_data_projected_renamed_cleaned_null"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_refund_data_projected_renamed_cleaned_null_casted"
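The two cleaning CTEs above first map the placeholder note to an empty string and then map empty strings to NULL. A minimal sketch of that two-step behavior follows, run against an in-memory SQLite table rather than the warehouse the model targets. The table name `refunds` and the sample rows are hypothetical; only the placeholder string 'my refund note' comes from the model.

```python
import sqlite3

# Hypothetical sample rows; only the placeholder 'my refund note'
# is taken from the staging model above.
rows = [
    (1, "my refund note"),
    (2, "customer returned item"),
    (3, ""),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE refunds (refund_id INTEGER, refund_note TEXT)")
conn.executemany("INSERT INTO refunds VALUES (?, ?)", rows)

# Step 1 maps the placeholder to '', step 2 maps '' to NULL,
# mirroring the "_cleaned" and "_cleaned_null" CTEs.
query = """
WITH cleaned AS (
    SELECT refund_id,
           CASE WHEN refund_note = 'my refund note' THEN ''
                ELSE refund_note END AS refund_note
    FROM refunds
),
cleaned_null AS (
    SELECT refund_id,
           CASE WHEN refund_note = '' THEN NULL
                ELSE refund_note END AS refund_note
    FROM cleaned
)
SELECT refund_id, refund_note FROM cleaned_null ORDER BY refund_id
"""
result = conn.execute(query).fetchall()
print(result)
```

Both the placeholder row and the already-empty row end up as NULL, so downstream consumers see one consistent representation of a missing note.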
stg_shopify_refund_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_refund_data
  description: The table is about Shopify refunds. It includes details such as refund
    ID, creation and processing timestamps, notes, restock status, user ID, total
    duties, and associated order ID. Each row represents a single refund transaction.
    The table allows tracking of refund activities, linking them to specific orders
    and users in the Shopify system.
  columns:
  - name: refund_note
    description: Optional note associated with the refund
    cocoon_meta:
      missing_acceptable: No additional notes were necessary for these refunds.
  - name: items_restocked
    description: Boolean indicating if items were restocked
    tests:
    - not_null
  - name: customer_id
    description: Identifier of the user associated with the refund
    tests:
    - not_null
  - name: original_order_id
    description: Identifier of the order being refunded
    tests:
    - not_null
  - name: refund_created_at
    description: Timestamp when the refund was created
    tests:
    - not_null
  - name: refund_duties
    description: Total duties set for the refund
    cocoon_meta:
      missing_acceptable: No duties charged or refunded for these transactions.
  - name: refund_id
    description: Unique identifier for the refund
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each refund. For
        this table, each row is for a single refund transaction. The refund_id is
        designed to be unique across all refunds in the Shopify system.
  - name: refund_processed_at
    description: Timestamp when the refund was processed
    tests:
    - not_null
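The `not_null` and `unique` entries in the YAML above are checks that pass when no offending rows exist. As a rough illustration of what such checks look for, the sketch below counts violations with plain SQL on an in-memory SQLite table. The table name `stg_refunds`, the helper names, and the duplicated and NULL rows are hypothetical, and the queries are an approximation of the idea rather than the exact SQL any tool generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_refunds (refund_id TEXT, items_restocked INTEGER)")
# Hypothetical rows: one duplicated refund_id and one NULL restock flag.
conn.executemany(
    "INSERT INTO stg_refunds VALUES (?, ?)",
    [("801704738887", 0), ("801695039559", 0), ("801695039559", None)],
)


def not_null_failures(table, column):
    """Count rows where the column is NULL."""
    sql = f'SELECT COUNT(*) FROM {table} WHERE "{column}" IS NULL'
    return conn.execute(sql).fetchone()[0]


def unique_failures(table, column):
    """Count distinct non-NULL values that appear more than once."""
    sql = (
        f"SELECT COUNT(*) FROM ("
        f'SELECT "{column}" FROM {table} '
        f'WHERE "{column}" IS NOT NULL '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1)'
    )
    return conn.execute(sql).fetchone()[0]


null_flags = not_null_failures("stg_refunds", "items_restocked")
duplicate_ids = unique_failures("stg_refunds", "refund_id")
print(null_flags, duplicate_ids)
```

A check passes when its count is zero; here both counts are non-zero because the sample rows were chosen to violate each rule once.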
stg_shopify_collection_product_data (first 100 rows)
 | collection_id | product_id |
---|---|---|
0 | 37124 | 789131 |
1 | 9037124 | 74353899 |
2 | 37124 | 8891 |
stg_shopify_collection_product_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_collection_product_data_projected" AS (
-- Projection: Selecting 2 out of 3 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"collection_id",
"product_id"
FROM "shopify_collection_product_data"
),
"shopify_collection_product_data_projected_casted" AS (
-- Column Type Casting:
-- collection_id: from INT to VARCHAR
-- product_id: from INT to VARCHAR
SELECT
CAST("collection_id" AS VARCHAR) AS "collection_id",
CAST("product_id" AS VARCHAR) AS "product_id"
FROM "shopify_collection_product_data_projected"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_collection_product_data_projected_casted"
stg_shopify_collection_product_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_collection_product_data
  description: The table represents the association between Shopify collections and
    products. Each row links a collection to a product. Collections can contain multiple
    products. Products can belong to multiple collections. The table uses IDs to uniquely
    identify each collection and product.
  columns:
  - name: collection_id
    description: Unique identifier for a Shopify collection
    tests:
    - not_null
  - name: product_id
    description: Unique identifier for a product in Shopify
    tests:
    - not_null
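Because collections can contain many products and products can belong to many collections, neither column carries a `unique` test on its own; the natural key of such a bridge table is the pair. The sketch below checks pair-level duplicates in plain Python. The first three pairs come from the preview table above, the fourth is a hypothetical duplicate added for illustration, and the helper name `duplicate_pairs` is an assumption.

```python
from collections import Counter

# First three pairs are from the preview; the last one is a
# hypothetical duplicate added to exercise the check.
pairs = [
    ("37124", "789131"),
    ("9037124", "74353899"),
    ("37124", "8891"),
    ("37124", "8891"),
]


def duplicate_pairs(rows):
    """Return (collection_id, product_id) pairs that occur more than once."""
    counts = Counter(rows)
    return sorted(pair for pair, n in counts.items() if n > 1)


dupes = duplicate_pairs(pairs)

# collection_id alone repeats even in the preview rows, which is
# expected for a many-to-many association.
collection_ids = [collection_id for collection_id, _ in pairs[:3]]
collection_id_is_unique = len(set(collection_ids)) == len(collection_ids)

print(dupes, collection_id_is_unique)
```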
stg_shopify_customer_data (first 100 rows)
 | encrypted_first_name | encrypted_last_name | encrypted_email | account_state | orders_count | total_spent | marketing_consent | tax_exempt | email_verified | account_creation_date | customer_id | default_address_id | last_updated_date | phone |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 29e00d3659d1c5e75f99e892f0c1a1f1 | 3f0e6a46fb84eb1e6f5f00d86aa53b1b | ab0bf25ab8b2a6b78af26a141dd6f455 | disabled | 0 | 0.00 | False | False | True | 2020-09-11 13:26:15 | 3588998496353 | 3951726461025 | 2020-09-11 13:26:15 | None |
1 | f0962b7a185488ecb752cedac1038349 | aa35cb67c26e64bb81a1bf3f17e858ba | 021cb20b5c78751fc7ddc091b6b69b3e | invited | 1 | 2.80 | True | False | True | 2020-09-11 19:35:42 | 3589760876641 | 3952669655137 | 2020-09-11 19:41:04 | None |
2 | d3bae70c9d49bb7cb5a74cdd0eae7fc4 | 0dd89cff60965dff8f9ea2bc952a5474 | dce90c7b4e52e045e5975836aff49cf1 | disabled | 2 | 9.18 | False | False | True | 2020-09-09 22:57:44 | 3584045351009 | 3946055729249 | 2020-09-09 23:01:55 | None |
stg_shopify_customer_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_customer_data_projected" AS (
-- Projection: Selecting 14 out of 15 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"first_name",
"last_name",
"email",
"phone",
"state",
"orders_count",
"total_spent",
"created_at",
"updated_at",
"accepts_marketing",
"tax_exempt",
"verified_email",
"default_address_id"
FROM "shopify_customer_data"
),
"shopify_customer_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> customer_id
-- first_name -> encrypted_first_name
-- last_name -> encrypted_last_name
-- email -> encrypted_email
-- state -> account_state
-- created_at -> account_creation_date
-- updated_at -> last_updated_date
-- accepts_marketing -> marketing_consent
-- verified_email -> email_verified
SELECT
"id" AS "customer_id",
"first_name" AS "encrypted_first_name",
"last_name" AS "encrypted_last_name",
"email" AS "encrypted_email",
"phone",
"state" AS "account_state",
"orders_count",
"total_spent",
"created_at" AS "account_creation_date",
"updated_at" AS "last_updated_date",
"accepts_marketing" AS "marketing_consent",
"tax_exempt",
"verified_email" AS "email_verified",
"default_address_id"
FROM "shopify_customer_data_projected"
),
"shopify_customer_data_projected_renamed_trimmed" AS (
-- Trim Leading and Trailing Spaces
SELECT
"customer_id",
"encrypted_first_name",
"encrypted_last_name",
"encrypted_email",
"phone",
"account_state",
"orders_count",
"total_spent",
"marketing_consent",
"tax_exempt",
"email_verified",
"default_address_id",
TRIM("account_creation_date") AS "account_creation_date",
TRIM("last_updated_date") AS "last_updated_date"
FROM "shopify_customer_data_projected_renamed"
),
"shopify_customer_data_projected_renamed_trimmed_casted" AS (
-- Column Type Casting:
-- account_creation_date: from VARCHAR to TIMESTAMP
-- customer_id: from INT to VARCHAR
-- default_address_id: from INT to VARCHAR
-- last_updated_date: from VARCHAR to TIMESTAMP
-- phone: from DECIMAL to VARCHAR
SELECT
"encrypted_first_name",
"encrypted_last_name",
"encrypted_email",
"account_state",
"orders_count",
"total_spent",
"marketing_consent",
"tax_exempt",
"email_verified",
CAST("account_creation_date" AS TIMESTAMP) AS "account_creation_date",
CAST("customer_id" AS VARCHAR) AS "customer_id",
CAST("default_address_id" AS VARCHAR) AS "default_address_id",
CAST("last_updated_date" AS TIMESTAMP) AS "last_updated_date",
CAST("phone" AS VARCHAR) AS "phone"
FROM "shopify_customer_data_projected_renamed_trimmed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_customer_data_projected_renamed_trimmed_casted"
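The customer model trims the two date columns before casting them to TIMESTAMP. A minimal sketch of that trim-then-cast order is shown below, using SQLite's `TRIM` for the first step and Python's `datetime.strptime` as a stand-in for the cast. The padded input value is hypothetical, the format string is inferred from the preview rows, and the helper name `trim_and_cast` is an assumption; how the target warehouse treats untrimmed strings is not shown here.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")


def trim_and_cast(raw):
    """Strip surrounding spaces in SQL, then parse the result as a timestamp."""
    trimmed = conn.execute("SELECT TRIM(?)", (raw,)).fetchone()[0]
    # Format assumed from the preview values such as '2020-09-11 13:26:15'.
    return datetime.strptime(trimmed, "%Y-%m-%d %H:%M:%S")


# Hypothetical padded value; the preview rows themselves show no padding.
parsed = trim_and_cast("  2020-09-11 13:26:15 ")
print(parsed.isoformat())
```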
stg_shopify_customer_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_customer_data
  description: The table is about Shopify customers. It contains customer details
    such as name, email, and phone. The table tracks customer order history, including
    order count and total spent. It also includes customer preferences like marketing
    acceptance and tax exemption status. Each customer has a unique ID and associated
    timestamps for creation and updates.
  columns:
  - name: encrypted_first_name
    description: Customer's first name (encrypted)
    tests:
    - not_null
  - name: encrypted_last_name
    description: Customer's last name (encrypted)
    tests:
    - not_null
  - name: encrypted_email
    description: Customer's email address (encrypted)
    tests:
    - not_null
  - name: account_state
    description: Current state of the customer account
    tests:
    - not_null
    - accepted_values:
        values:
        - disabled
        - invited
        - active
        - suspended
        - pending
        - closed
        - archived
  - name: orders_count
    description: Number of orders placed by the customer
    tests:
    - not_null
  - name: total_spent
    description: Total amount spent by the customer
    tests:
    - not_null
  - name: marketing_consent
    description: Indicates if customer agrees to receive marketing
    tests:
    - not_null
  - name: tax_exempt
    description: Indicates if the customer is exempt from taxes
    tests:
    - not_null
  - name: email_verified
    description: Indicates if the customer's email is verified
    tests:
    - not_null
  - name: account_creation_date
    description: Timestamp when the customer account was created
    tests:
    - not_null
  - name: customer_id
    description: Unique identifier for the customer
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each customer. For
        this table, each row is for a unique customer. Customer_id is designed to
        be unique across all customers and is typically used as a primary key in database
        systems.
  - name: default_address_id
    description: ID of the customer's default shipping address
    tests:
    - not_null
  - name: last_updated_date
    description: Timestamp of the last update to customer record
    tests:
    - not_null
  - name: phone
    description: Customer's phone number
    cocoon_meta:
      missing_acceptable: Phone number may not be required for all customers.
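The `accepted_values` entry on `account_state` restricts the column to a fixed list. The sketch below shows one way such a rule can be evaluated: select every value outside the list and count its rows. The list is copied from the YAML above; the table name `stg_customers`, the third row, and its state `enabled` are hypothetical and serve only to produce one out-of-list value.

```python
import sqlite3

# Copied from the accepted_values list in the YAML above.
ACCEPTED_STATES = [
    "disabled", "invited", "active", "suspended",
    "pending", "closed", "archived",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_customers (customer_id TEXT, account_state TEXT)")
conn.executemany(
    "INSERT INTO stg_customers VALUES (?, ?)",
    [
        ("3588998496353", "disabled"),
        ("3589760876641", "invited"),
        ("0000000000000", "enabled"),  # hypothetical row with an unlisted state
    ],
)

# Build one '?' placeholder per accepted value and bind the list as parameters.
placeholders = ",".join("?" for _ in ACCEPTED_STATES)
unexpected = conn.execute(
    f"SELECT account_state, COUNT(*) FROM stg_customers "
    f"WHERE account_state NOT IN ({placeholders}) "
    f"GROUP BY account_state",
    ACCEPTED_STATES,
).fetchall()
print(unexpected)
```

Any row returned here is a value the list did not anticipate, which is a prompt either to fix the data or to extend the list.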
stg_shopify_shop_data (first 100 rows)
 | setup_required | timezone | ssl_enforced | weight_unit | county_taxes_applied | plan_display_name | gift_cards_offered | cookie_consent_level | checkout_api_support | is_deleted | payment_processing_eligible | longitude | discounts_offered | shopify_domain | country_name | primary_address | shop_timezone | password_protection_enabled | shop_domain | storefront_active | state_province_code | owner_email | iso_country_code | multi_location_enabled | primary_language | money_with_currency_format | taxes_included | primary_currency | latitude | shop_owner | pre_launch_enabled | city | money_format | plan_name | email_currency_format | store_name | tracking_consent_preference | card_reader_promo_eligible | shop_id | country_code | customer_contact_email | extra_payment_agreement_required | email_currency_display_format | state_province | creation_timestamp | enabled_currencies | google_apps_domain | google_apps_login_enabled | last_updated | phone | postal_code | primary_location_id | tax_on_shipping |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | False | (GMT-05:00) America/New_York | True | lb | True | Shopify Plus | True | implicit | True | False | True | -123.12345 | True | kitties.myshopify.com | United States | 1 Main Street | America/New_York | False | kitties.com | True | NY | abc@kitties.com | US | True | en | ${{amount}} USD | False | USD | 80.1234 | Garrett & Alfredo | False | New York | ${{amount}} | shopify_plus | ${{amount}} | Garrett & Alfredo | allow_all | True | 689 | US | noreply@kitties.com | False | ${{amount}} USD | New York | 2018-12-10 16:24:00 | [USD] | None | NaN | 2022-12-07 00:26:36 | 13373 | 10014 | 1234646345 | None |
stg_shopify_shop_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_shop_data_projected" AS (
-- Projection: Selecting 56 out of 57 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"_fivetran_deleted",
"address_1",
"address_2",
"auto_configure_tax_inclusivity",
"checkout_api_supported",
"city",
"cookie_consent_level",
"country",
"country_code",
"country_name",
"county_taxes",
"created_at",
"currency",
"customer_email",
"domain_",
"eligible_for_card_reader_giveaway",
"eligible_for_payments",
"email",
"enabled_presentment_currencies",
"force_ssl",
"google_apps_domain",
"google_apps_login_enabled",
"has_discounts",
"has_gift_cards",
"has_storefront",
"iana_timezone",
"latitude",
"longitude",
"money_format",
"money_in_emails_format",
"money_with_currency_format",
"money_with_currency_in_emails_format",
"multi_location_enabled",
"myshopify_domain",
"name",
"password_enabled",
"phone",
"plan_display_name",
"plan_name",
"pre_launch_enabled",
"primary_locale",
"primary_location_id",
"province",
"province_code",
"requires_extra_payments_agreement",
"setup_required",
"shop_owner",
"source",
"tax_shipping",
"taxes_included",
"timezone",
"updated_at",
"visitor_tracking_consent_preference",
"weight_unit",
"zip"
FROM "shopify_shop_data"
),
"shopify_shop_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> shop_id
-- _fivetran_deleted -> is_deleted
-- address_1 -> primary_address
-- address_2 -> secondary_address
-- auto_configure_tax_inclusivity -> auto_tax_inclusivity
-- checkout_api_supported -> checkout_api_support
-- country -> country_code
-- country_code -> iso_country_code
-- county_taxes -> county_taxes_applied
-- created_at -> creation_timestamp
-- currency -> primary_currency
-- customer_email -> customer_contact_email
-- domain_ -> shop_domain
-- eligible_for_card_reader_giveaway -> card_reader_promo_eligible
-- eligible_for_payments -> payment_processing_eligible
-- email -> owner_email
-- enabled_presentment_currencies -> enabled_currencies
-- force_ssl -> ssl_enforced
-- has_discounts -> discounts_offered
-- has_gift_cards -> gift_cards_offered
-- has_storefront -> storefront_active
-- iana_timezone -> shop_timezone
-- money_in_emails_format -> email_currency_format
-- money_with_currency_in_emails_format -> email_currency_display_format
-- myshopify_domain -> shopify_domain
-- name -> store_name
-- password_enabled -> password_protection_enabled
-- primary_locale -> primary_language
-- province -> state_province
-- province_code -> state_province_code
-- requires_extra_payments_agreement -> extra_payment_agreement_required
-- source -> creation_source
-- tax_shipping -> tax_on_shipping
-- updated_at -> last_updated
-- visitor_tracking_consent_preference -> tracking_consent_preference
-- zip -> postal_code
SELECT
"id" AS "shop_id",
"_fivetran_deleted" AS "is_deleted",
"address_1" AS "primary_address",
"address_2" AS "secondary_address",
"auto_configure_tax_inclusivity" AS "auto_tax_inclusivity",
"checkout_api_supported" AS "checkout_api_support",
"city",
"cookie_consent_level",
"country" AS "country_code",
"country_code" AS "iso_country_code",
"country_name",
"county_taxes" AS "county_taxes_applied",
"created_at" AS "creation_timestamp",
"currency" AS "primary_currency",
"customer_email" AS "customer_contact_email",
"domain_" AS "shop_domain",
"eligible_for_card_reader_giveaway" AS "card_reader_promo_eligible",
"eligible_for_payments" AS "payment_processing_eligible",
"email" AS "owner_email",
"enabled_presentment_currencies" AS "enabled_currencies",
"force_ssl" AS "ssl_enforced",
"google_apps_domain",
"google_apps_login_enabled",
"has_discounts" AS "discounts_offered",
"has_gift_cards" AS "gift_cards_offered",
"has_storefront" AS "storefront_active",
"iana_timezone" AS "shop_timezone",
"latitude",
"longitude",
"money_format",
"money_in_emails_format" AS "email_currency_format",
"money_with_currency_format",
"money_with_currency_in_emails_format" AS "email_currency_display_format",
"multi_location_enabled",
"myshopify_domain" AS "shopify_domain",
"name" AS "store_name",
"password_enabled" AS "password_protection_enabled",
"phone",
"plan_display_name",
"plan_name",
"pre_launch_enabled",
"primary_locale" AS "primary_language",
"primary_location_id",
"province" AS "state_province",
"province_code" AS "state_province_code",
"requires_extra_payments_agreement" AS "extra_payment_agreement_required",
"setup_required",
"shop_owner",
"source" AS "creation_source",
"tax_shipping" AS "tax_on_shipping",
"taxes_included",
"timezone",
"updated_at" AS "last_updated",
"visitor_tracking_consent_preference" AS "tracking_consent_preference",
"weight_unit",
"zip" AS "postal_code"
FROM "shopify_shop_data_projected"
),
"shopify_shop_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- secondary_address: The problem is that '200th Floor' is an extremely unlikely value for a real address. Buildings with 200 floors are virtually non-existent, with the current tallest building in the world (Burj Khalifa) having only 163 floors. This value is likely either a data entry error or a placeholder/test value. The correct value would depend on the actual address, which we don't have information about. In the absence of correct information, it's best to map this to an empty string to indicate missing data.
SELECT
"shop_id",
"is_deleted",
"primary_address",
CASE
WHEN "secondary_address" = '200th Floor' THEN ''
ELSE "secondary_address"
END AS "secondary_address",
"auto_tax_inclusivity",
"checkout_api_support",
"city",
"cookie_consent_level",
"country_code",
"iso_country_code",
"country_name",
"county_taxes_applied",
"creation_timestamp",
"primary_currency",
"customer_contact_email",
"shop_domain",
"card_reader_promo_eligible",
"payment_processing_eligible",
"owner_email",
"enabled_currencies",
"ssl_enforced",
"google_apps_domain",
"google_apps_login_enabled",
"discounts_offered",
"gift_cards_offered",
"storefront_active",
"shop_timezone",
"latitude",
"longitude",
"money_format",
"email_currency_format",
"money_with_currency_format",
"email_currency_display_format",
"multi_location_enabled",
"shopify_domain",
"store_name",
"password_protection_enabled",
"phone",
"plan_display_name",
"plan_name",
"pre_launch_enabled",
"primary_language",
"primary_location_id",
"state_province",
"state_province_code",
"extra_payment_agreement_required",
"setup_required",
"shop_owner",
"creation_source",
"tax_on_shipping",
"taxes_included",
"timezone",
"last_updated",
"tracking_consent_preference",
"weight_unit",
"postal_code"
FROM "shopify_shop_data_projected_renamed"
),
"shopify_shop_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- secondary_address: ['']
SELECT
CASE
WHEN "secondary_address" = '' THEN NULL
ELSE "secondary_address"
END AS "secondary_address",
"setup_required",
"timezone",
"ssl_enforced",
"weight_unit",
"county_taxes_applied",
"plan_display_name",
"primary_location_id",
"gift_cards_offered",
"cookie_consent_level",
"enabled_currencies",
"checkout_api_support",
"google_apps_domain",
"last_updated",
"is_deleted",
"payment_processing_eligible",
"google_apps_login_enabled",
"longitude",
"creation_source",
"discounts_offered",
"shopify_domain",
"country_name",
"primary_address",
"postal_code",
"shop_timezone",
"password_protection_enabled",
"shop_domain",
"storefront_active",
"state_province_code",
"owner_email",
"iso_country_code",
"multi_location_enabled",
"primary_language",
"tax_on_shipping",
"money_with_currency_format",
"auto_tax_inclusivity",
"taxes_included",
"primary_currency",
"latitude",
"phone",
"shop_owner",
"pre_launch_enabled",
"city",
"money_format",
"plan_name",
"email_currency_format",
"store_name",
"tracking_consent_preference",
"card_reader_promo_eligible",
"shop_id",
"country_code",
"customer_contact_email",
"extra_payment_agreement_required",
"email_currency_display_format",
"creation_timestamp",
"state_province"
FROM "shopify_shop_data_projected_renamed_cleaned"
),
"shopify_shop_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- auto_tax_inclusivity: from DECIMAL to BOOLEAN
-- creation_source: from DECIMAL to VARCHAR
-- creation_timestamp: from VARCHAR to TIMESTAMP
-- enabled_currencies: from VARCHAR to ARRAY
-- google_apps_domain: from DECIMAL to VARCHAR
-- google_apps_login_enabled: from DECIMAL to BOOLEAN
-- last_updated: from VARCHAR to TIMESTAMP
-- phone: from INT to VARCHAR
-- postal_code: from INT to VARCHAR
-- primary_location_id: from INT to VARCHAR
-- tax_on_shipping: from DECIMAL to VARCHAR
SELECT
"secondary_address",
"setup_required",
"timezone",
"ssl_enforced",
"weight_unit",
"county_taxes_applied",
"plan_display_name",
"gift_cards_offered",
"cookie_consent_level",
"checkout_api_support",
"is_deleted",
"payment_processing_eligible",
"longitude",
"discounts_offered",
"shopify_domain",
"country_name",
"primary_address",
"shop_timezone",
"password_protection_enabled",
"shop_domain",
"storefront_active",
"state_province_code",
"owner_email",
"iso_country_code",
"multi_location_enabled",
"primary_language",
"money_with_currency_format",
"taxes_included",
"primary_currency",
"latitude",
"shop_owner",
"pre_launch_enabled",
"city",
"money_format",
"plan_name",
"email_currency_format",
"store_name",
"tracking_consent_preference",
"card_reader_promo_eligible",
"shop_id",
"country_code",
"customer_contact_email",
"extra_payment_agreement_required",
"email_currency_display_format",
"state_province",
CAST("auto_tax_inclusivity" AS BOOLEAN) AS "auto_tax_inclusivity",
CAST("creation_source" AS VARCHAR) AS "creation_source",
CAST("creation_timestamp" AS TIMESTAMP) AS "creation_timestamp",
from_json("enabled_currencies", '["VARCHAR"]') AS "enabled_currencies",
CAST("google_apps_domain" AS VARCHAR) AS "google_apps_domain",
CAST("google_apps_login_enabled" AS BOOLEAN) AS "google_apps_login_enabled",
CAST("last_updated" AS TIMESTAMP) AS "last_updated",
CAST("phone" AS VARCHAR) AS "phone",
CAST("postal_code" AS VARCHAR) AS "postal_code",
CAST("primary_location_id" AS VARCHAR) AS "primary_location_id",
CAST("tax_on_shipping" AS VARCHAR) AS "tax_on_shipping"
FROM "shopify_shop_data_projected_renamed_cleaned_null"
),
"shopify_shop_data_projected_renamed_cleaned_null_casted_missing_handled" AS (
-- Handling missing values: There are 3 columns with unacceptable missing values
-- auto_tax_inclusivity has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- creation_source has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- secondary_address has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"setup_required",
"timezone",
"ssl_enforced",
"weight_unit",
"county_taxes_applied",
"plan_display_name",
"gift_cards_offered",
"cookie_consent_level",
"checkout_api_support",
"is_deleted",
"payment_processing_eligible",
"longitude",
"discounts_offered",
"shopify_domain",
"country_name",
"primary_address",
"shop_timezone",
"password_protection_enabled",
"shop_domain",
"storefront_active",
"state_province_code",
"owner_email",
"iso_country_code",
"multi_location_enabled",
"primary_language",
"money_with_currency_format",
"taxes_included",
"primary_currency",
"latitude",
"shop_owner",
"pre_launch_enabled",
"city",
"money_format",
"plan_name",
"email_currency_format",
"store_name",
"tracking_consent_preference",
"card_reader_promo_eligible",
"shop_id",
"country_code",
"customer_contact_email",
"extra_payment_agreement_required",
"email_currency_display_format",
"state_province",
"creation_timestamp",
"enabled_currencies",
"google_apps_domain",
"google_apps_login_enabled",
"last_updated",
"phone",
"postal_code",
"primary_location_id",
"tax_on_shipping"
FROM "shopify_shop_data_projected_renamed_cleaned_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_shop_data_projected_renamed_cleaned_null_casted_missing_handled"
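The final CTE of the shop model drops three columns that are 100 percent missing, while other fully empty columns such as `google_apps_domain` are kept because the accompanying YAML marks their missing values as acceptable. A minimal sketch of that selection logic follows, in plain Python. The helper names, the `acceptable` parameter, and the reduced single-row sample are assumptions made for illustration; the column names are taken from the model.

```python
def missing_percent(rows, column):
    """Percentage of rows where the column is None (0.0 for an empty table)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return 100.0 * missing / len(rows)


def columns_to_drop(rows, columns, acceptable=(), threshold=100.0):
    """Columns at or above the missing threshold, excluding tolerated ones."""
    return [
        column
        for column in columns
        if column not in acceptable
        and missing_percent(rows, column) >= threshold
    ]


# Reduced, single-row stand-in for the shop table after NULL imputation.
shop_rows = [
    {
        "shop_id": "689",
        "store_name": "Garrett & Alfredo",
        "secondary_address": None,
        "creation_source": None,
        "auto_tax_inclusivity": None,
        "google_apps_domain": None,
    }
]

columns = list(shop_rows[0])
dropped = columns_to_drop(shop_rows, columns, acceptable={"google_apps_domain"})
print(dropped)
```

With a single-row table every NULL column is trivially 100 percent missing, so the `acceptable` set is what separates columns that are dropped from columns that are kept with a documented reason.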
stg_shopify_shop_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_shop_data
  description: The table is about Shopify shops. It contains shop information like
    ID, address, currency, and domain. It includes shop settings such as tax configuration,
    checkout options, and enabled features. The table also has owner details, plan
    information, and location data. It represents a comprehensive profile of a Shopify
    store with its configurations and operational details.
  columns:
  - name: setup_required
    description: Store setup completion status
    tests:
    - not_null
  - name: timezone
    description: Store's timezone
    tests:
    - not_null
  - name: ssl_enforced
    description: Indicates if SSL is enforced
    tests:
    - not_null
  - name: weight_unit
    description: Unit of weight measurement
    tests:
    - not_null
    - accepted_values:
        values:
        - lb
        - kg
        - g
        - oz
        - t
        - mg
        - stone
        - cwt
        - "\xB5g"
        - slug
  - name: county_taxes_applied
    description: Indicates if county taxes are applied
    tests:
    - not_null
  - name: plan_display_name
    description: Displayed name of the Shopify plan
    tests:
    - not_null
    - accepted_values:
        values:
        - Basic
        - Shopify
        - Advanced
        - Shopify Plus
        - Starter
        - Lite
  - name: gift_cards_offered
    description: Indicates if shop offers gift cards
    tests:
    - not_null
  - name: cookie_consent_level
    description: Level of cookie consent implemented
    tests:
    - not_null
    - accepted_values:
        values:
        - implicit
        - explicit
        - no_consent
        - partial
        - full
        - necessary_only
        - functional
        - analytical
        - marketing
  - name: checkout_api_support
    description: Indicates if checkout API is supported
    tests:
    - not_null
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: payment_processing_eligible
    description: Eligibility for payment processing
    tests:
    - not_null
  - name: longitude
    description: Longitude coordinate of shop location
    tests:
    - not_null
  - name: discounts_offered
    description: Indicates if shop offers discounts
    tests:
    - not_null
  - name: shopify_domain
    description: Shopify-provided domain for the store
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the Shopify-provided domain for each store.
        Each Shopify store has a unique myshopify.com domain, making this column unique
        across all rows.
  - name: country_name
    description: Full name of the country
    tests:
    - not_null
  - name: primary_address
    description: Primary address of the shop
    tests:
    - not_null
  - name: shop_timezone
    description: IANA timezone of the shop
    tests:
    - not_null
  - name: password_protection_enabled
    description: Store password protection status
    tests:
    - not_null
  - name: shop_domain
    description: Shop's domain name
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the custom domain name for the shop. Each
        shop is likely to have a unique domain name, making this column unique across
        all rows.
  - name: storefront_active
    description: Indicates if shop has a storefront
    tests:
    - not_null
  - name: state_province_code
    description: State or province code
    tests:
    - not_null
    - accepted_values:
        values:
        - AL
        - AK
        - AZ
        - AR
        - CA
        - CO
        - CT
        - DE
        - FL
        - GA
        - HI
        - ID
        - IL
        - IN
        - IA
        - KS
        - KY
        - LA
        - ME
        - MD
        - MA
        - MI
        - MN
        - MS
        - MO
        - MT
        - NE
        - NV
        - NH
        - NJ
        - NM
        - NY
        - NC
        - ND
        - OH
        - OK
        - OR
        - PA
        - RI
        - SC
        - SD
        - TN
        - TX
        - UT
        - VT
        - VA
        - WA
        - WV
        - WI
        - WY
        - DC
        - AS
        - GU
        - MP
        - PR
        - VI
        - AB
        - BC
        - MB
        - NB
        - NL
        - NS
        - NT
        - NU
        - 'ON'
        - PE
        - QC
        - SK
        - YT
  - name: owner_email
    description: Shop owner's email address
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column contains the shop owner's email address. For this table,
        each row is a unique Shopify shop. owner_email could be unique across rows
        as it's typically associated with a specific shop account.
  - name: iso_country_code
    description: ISO country code of shop location
    tests:
    - not_null
  - name: multi_location_enabled
    description: Multiple store locations enabled
    tests:
    - not_null
  - name: primary_language
    description: Primary language of the store
    tests:
    - not_null
  - name: money_with_currency_format
    description: ''
    tests:
    - not_null
  - name: taxes_included
    description: Prices include taxes status
    tests:
    - not_null
  - name: primary_currency
    description: Primary currency used by the shop
    tests:
    - not_null
  - name: latitude
    description: Latitude coordinate of shop location
    tests:
    - not_null
  - name: shop_owner
    description: Name of the store owner
    tests:
    - not_null
  - name: pre_launch_enabled
    description: Pre-launch mode status
    tests:
    - not_null
  - name: city
    description: City where the shop is located
    tests:
    - not_null
  - name: money_format
    description: ''
    tests:
    - not_null
  - name: plan_name
    description: Internal name of the Shopify plan
    tests:
    - not_null
    - accepted_values:
        values:
        - basic
        - shopify
        - advanced
        - shopify_plus
        - lite
        - starter
  - name: email_currency_format
    description: Email currency display format
    tests:
    - not_null
    - accepted_values:
        values:
        - ${{amount}}
        - '{{amount}} USD'
        - '{{symbol}}{{amount}}'
        - '{{amount}} {{code}}'
        - '{{symbol}} {{amount}}'
        - '{{amount}}'
  - name: store_name
    description: Store name
    tests:
    - not_null
  - name: tracking_consent_preference
    description: Visitor tracking consent setting
    tests:
    - not_null
    - accepted_values:
        values:
        - allow_all
        - allow_essential
        - deny_all
  - name: card_reader_promo_eligible
    description: Eligibility for card reader promotion
    tests:
    - not_null
  - name: shop_id
    description: Unique identifier for the shop
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the unique identifier for the shop. For this
        table, each row is a unique Shopify shop. The shop_id is designed to be a
        unique identifier for each shop, ensuring it's unique across all rows.
  - name: country_code
    description: Country code where the shop is located
    tests:
    - not_null
  - name: customer_contact_email
    description: Email for customer communications
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the email address for customer communications.
        For this table, each row is for a unique Shopify shop. Customer contact email
        could be unique across shops, as it's likely to be a shop-specific email address.
  - name: extra_payment_agreement_required
    description: Additional payment agreement required
    tests:
    - not_null
  - name: email_currency_display_format
    description: Email currency display format with symbol
    tests:
    - not_null
  - name: state_province
    description: Store's state or province
    tests:
    - not_null
  - name: creation_timestamp
    description: Timestamp of shop creation
    tests:
    - not_null
  - name: enabled_currencies
    description: List of enabled currencies for transactions
    tests:
    - not_null
  - name: google_apps_domain
    description: Google Apps domain if applicable
    cocoon_meta:
      missing_acceptable: Not applicable if Google Apps integration isn't used.
  - name: google_apps_login_enabled
    description: Status of Google Apps login
    cocoon_meta:
      missing_acceptable: Not applicable if Google Apps integration isn't used.
  - name: last_updated
    description: Last update timestamp
    tests:
    - not_null
  - name: phone
    description: Store contact phone number
    tests:
    - not_null
  - name: postal_code
    description: Store's ZIP or postal code
    tests:
    - not_null
  - name: primary_location_id
    description: ID of the main store location
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the ID of the main store location. For this
        table, each row is for a unique Shopify shop. This ID is likely to be unique
        for each shop as it represents a specific location.
  - name: tax_on_shipping
    description: Shipping tax application status
    cocoon_meta:
      missing_acceptable: Not applicable when no tax is charged on shipping.
stg_shopify_order_line_data (first 100 rows)
| | product_name | product_title | vendor_id | item_price | quantity | weight_grams | sku | fulfillable_quantity | fulfillment_service | is_gift_card | requires_shipping | is_taxable | item_position | fulfillment_status | line_item_id | order_id | product_id | total_discount | variant_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 327ea22d0f91783418e519cb45a4a3e9 | 327ea22d0f91783418e519cb45a4a3e9 | 13aea892c8de2d62f2608c6191cfab1f | 4.4 | 1 | 0 | 854a136da51d43fb87c63c86a62ffad0 | 0 | manual | False | True | False | 1 | fulfilled | 5699743678561 | 2669509541985 | 4526236893281 | 0.0 | 31879811629153 |
1 | 1fccbdc6ac5f6edabf76e56eb0460019 | 1fccbdc6ac5f6edabf76e56eb0460019 | 13aea892c8de2d62f2608c6191cfab1f | 2.8 | 1 | 0 | 198369004c95b2b35f480f9691b14178 | 0 | manual | False | True | False | 1 | fulfilled | 5699758784609 | 2669516488801 | 4506451050593 | 0.0 | 31814873481313 |
2 | 74c574cc1e545fef2beeaf9bbb148fcc | 74c574cc1e545fef2beeaf9bbb148fcc | 57403999f78b01b3fd325ba256eafe94 | 2.8 | 2 | 0 | b988b358c81b47d3e438c99bfb1c4ee1 | 2 | manual | False | True | False | 1 | None | 5708321914977 | 2674098602081 | 4505775439969 | 0.0 | 31812476895329 |
stg_shopify_order_line_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_line_data_projected" AS (
-- Projection: Selecting 20 out of 21 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"order_id",
"id",
"product_id",
"variant_id",
"name",
"title",
"vendor",
"price",
"quantity",
"grams",
"sku",
"fulfillable_quantity",
"fulfillment_service",
"gift_card",
"requires_shipping",
"taxable",
"index_",
"total_discount",
"pre_tax_price",
"fulfillment_status"
FROM "shopify_order_line_data"
),
"shopify_order_line_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> line_item_id
-- name -> product_name
-- title -> product_title
-- vendor -> vendor_id
-- price -> item_price
-- grams -> weight_grams
-- gift_card -> is_gift_card
-- taxable -> is_taxable
-- index_ -> item_position
SELECT
"order_id",
"id" AS "line_item_id",
"product_id",
"variant_id",
"name" AS "product_name",
"title" AS "product_title",
"vendor" AS "vendor_id",
"price" AS "item_price",
"quantity",
"grams" AS "weight_grams",
"sku",
"fulfillable_quantity",
"fulfillment_service",
"gift_card" AS "is_gift_card",
"requires_shipping",
"taxable" AS "is_taxable",
"index_" AS "item_position",
"total_discount",
"pre_tax_price",
"fulfillment_status"
FROM "shopify_order_line_data_projected"
),
"shopify_order_line_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- line_item_id: from INT to VARCHAR
-- order_id: from INT to VARCHAR
-- pre_tax_price: from DECIMAL to VARCHAR
-- product_id: from INT to VARCHAR
-- total_discount: from INT to DECIMAL
-- variant_id: from INT to VARCHAR
SELECT
"product_name",
"product_title",
"vendor_id",
"item_price",
"quantity",
"weight_grams",
"sku",
"fulfillable_quantity",
"fulfillment_service",
"is_gift_card",
"requires_shipping",
"is_taxable",
"item_position",
"fulfillment_status",
CAST("line_item_id" AS VARCHAR) AS "line_item_id",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("pre_tax_price" AS VARCHAR) AS "pre_tax_price",
CAST("product_id" AS VARCHAR) AS "product_id",
CAST("total_discount" AS DECIMAL) AS "total_discount",
CAST("variant_id" AS VARCHAR) AS "variant_id"
FROM "shopify_order_line_data_projected_renamed"
),
"shopify_order_line_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 1 columns with unacceptable missing values
-- pre_tax_price has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"product_name",
"product_title",
"vendor_id",
"item_price",
"quantity",
"weight_grams",
"sku",
"fulfillable_quantity",
"fulfillment_service",
"is_gift_card",
"requires_shipping",
"is_taxable",
"item_position",
"fulfillment_status",
"line_item_id",
"order_id",
"product_id",
"total_discount",
"variant_id"
FROM "shopify_order_line_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_line_data_projected_renamed_casted_missing_handled"
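The missing-value step above drops pre_tax_price because it is 100.0 percent missing in the sample. A minimal sketch of how such a decision could be computed is shown here. The function names, the sample rows, and the 100 percent threshold are assumptions made for illustration; this is not the tool's actual implementation.

```python
from typing import Any, Dict, List, Sequence


def missing_percent(rows: Sequence[Dict[str, Any]]) -> Dict[str, float]:
    """Return the percentage of None values per column across all rows."""
    if not rows:
        return {}
    total = len(rows)
    return {
        col: 100.0 * sum(1 for row in rows if row.get(col) is None) / total
        for col in rows[0].keys()
    }


def columns_to_drop(rows: Sequence[Dict[str, Any]], threshold: float = 100.0) -> List[str]:
    """List columns whose missing percentage reaches the (assumed) threshold."""
    return [col for col, pct in missing_percent(rows).items() if pct >= threshold]


# Hypothetical sample shaped like the order-line preview: pre_tax_price is always missing.
sample = [
    {"line_item_id": "5699743678561", "item_price": 4.4, "pre_tax_price": None},
    {"line_item_id": "5699758784609", "item_price": 2.8, "pre_tax_price": None},
    {"line_item_id": "5708321914977", "item_price": 2.8, "pre_tax_price": None},
]

dropped = columns_to_drop(sample)
kept = [col for col in sample[0] if col not in dropped]
```

With this sample, only pre_tax_price reaches the threshold, so the remaining columns are the ones that would be selected in the final step.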
stg_shopify_order_line_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_line_data
description: The table is about Shopify order line items. It contains details such
as order ID, product ID, variant ID, product name, price, quantity, SKU, fulfillment
status, and other order-specific information. Each row represents a single item
within an order, including its pricing, shipping requirements, and fulfillment
details.
columns:
- name: product_name
description: Name or identifier of the product
tests:
- not_null
- name: product_title
description: Title or name of the product
tests:
- not_null
- name: vendor_id
description: Identifier or name of the vendor
tests:
- not_null
- name: item_price
description: Price of the item
tests:
- not_null
- name: quantity
description: Number of items ordered
tests:
- not_null
- name: weight_grams
description: Weight of the item in grams
tests:
- not_null
- name: sku
description: Stock Keeping Unit identifier
tests:
- not_null
- name: fulfillable_quantity
description: Quantity of items available for fulfillment
tests:
- not_null
- name: fulfillment_service
description: Service used for order fulfillment
tests:
- not_null
- accepted_values:
values:
- manual
- amazon
- shipwire
- webgistix
- shipstation
- shopify_fulfillment
- third_party
- self_fulfilled
- drop_ship
- fba
- external
- name: is_gift_card
description: Indicates if the item is a gift card
tests:
- not_null
- name: requires_shipping
description: Indicates if the item needs shipping
tests:
- not_null
- name: is_taxable
description: Indicates if the item is taxable
tests:
- not_null
- name: item_position
description: Position of the item in the order
tests:
- not_null
- name: fulfillment_status
description: Current status of order fulfillment
tests:
- accepted_values:
values:
- fulfilled
- unfulfilled
- partially_fulfilled
- cancelled
- processing
- on_hold
- returned
cocoon_meta:
missing_acceptable: Not applicable for unfulfilled orders still in progress.
- name: line_item_id
description: Unique identifier for the line item
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each line item in
an order. For this table, where each row is a single item within an order,
line_item_id should be unique across all rows.
- name: order_id
description: Unique identifier for the order
tests:
- not_null
- name: product_id
description: Unique identifier for the product
tests:
- not_null
- name: total_discount
description: Total discount applied to the item
tests:
- not_null
- name: variant_id
description: Unique identifier for the product variant
tests:
- not_null
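The column tests declared above (not_null, unique, accepted_values) can be read as simple checks over a column's values. The sketch below restates them in plain Python to make the semantics concrete. The function names are hypothetical, and the null handling (nulls ignored by the unique and accepted_values checks) is an assumption about how such tests commonly behave, not a statement about any specific framework version.

```python
from collections import Counter
from typing import Any, Iterable, List


def not_null_failures(values: Iterable[Any]) -> int:
    """Count values that would fail a not_null-style check."""
    return sum(1 for v in values if v is None)


def unique_failures(values: Iterable[Any]) -> List[Any]:
    """Return non-null values that occur more than once."""
    counts = Counter(v for v in values if v is not None)
    return [v for v, n in counts.items() if n > 1]


def accepted_values_failures(values: Iterable[Any], accepted: Iterable[Any]) -> List[Any]:
    """Return distinct non-null values that are outside the accepted list."""
    allowed = set(accepted)
    unexpected: List[Any] = []
    for v in values:
        if v is not None and v not in allowed and v not in unexpected:
            unexpected.append(v)
    return unexpected


# Values taken from the three preview rows of stg_shopify_order_line_data.
line_item_ids = ["5699743678561", "5699758784609", "5708321914977"]
fulfillment_statuses = ["fulfilled", "fulfilled", None]

duplicate_ids = unique_failures(line_item_ids)
null_statuses = not_null_failures(fulfillment_statuses)
bad_statuses = accepted_values_failures(fulfillment_statuses, ["fulfilled", "unfulfilled"])
```

Under these assumptions the preview rows pass the unique check on line_item_id, and the missing fulfillment_status is only a problem if a not_null test is declared on that column, which it is not here.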
stg_shopify_order_url_tag_data (first 100 rows)
| | metadata_key | metadata_value | order_id |
---|---|---|---|
0 | image | Image | 40347 |
1 | utm_medium | 4290347 | |
2 | prop_channel | flows | 47 |
stg_shopify_order_url_tag_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_url_tag_data_projected" AS (
-- Projection: Selecting 3 out of 4 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"key_",
"order_id",
"value_"
FROM "shopify_order_url_tag_data"
),
"shopify_order_url_tag_data_projected_renamed" AS (
-- Rename: Renaming columns
-- key_ -> metadata_key
-- value_ -> metadata_value
SELECT
"key_" AS "metadata_key",
"order_id",
"value_" AS "metadata_value"
FROM "shopify_order_url_tag_data_projected"
),
"shopify_order_url_tag_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- order_id: from INT to VARCHAR
SELECT
"metadata_key",
"metadata_value",
CAST("order_id" AS VARCHAR) AS "order_id"
FROM "shopify_order_url_tag_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_url_tag_data_projected_renamed_casted"
stg_shopify_order_url_tag_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_url_tag_data
description: The table is about Shopify orders and their associated metadata. It
contains key-value pairs for each order, identified by an order_id. The keys represent
different types of data like image, utm_medium, and prop_channel. The values provide
specific information corresponding to each key for a given order.
columns:
- name: metadata_key
description: Identifier for the type of metadata
tests:
- not_null
- name: metadata_value
description: Specific information corresponding to the metadata key
tests:
- not_null
- name: order_id
description: Unique identifier for a Shopify order
tests:
- not_null
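Because this model stores one key-value pair per row, downstream code usually needs to regroup the pairs by order. The sketch below shows one way to do that. The function name and the sample rows are hypothetical; the keys (utm_medium, prop_channel, image) come from the model description, while the order IDs and the "email" value are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def pivot_metadata(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, Dict[str, str]]:
    """Group (order_id, metadata_key, metadata_value) rows into one dict per order."""
    by_order: Dict[str, Dict[str, str]] = defaultdict(dict)
    for order_id, key, value in rows:
        # If an order repeats a key, the later row overwrites the earlier one.
        by_order[order_id][key] = value
    return dict(by_order)


# Hypothetical rows; the order IDs are placeholders, not values from the source data.
rows = [
    ("1001", "utm_medium", "email"),
    ("1001", "prop_channel", "flows"),
    ("1002", "image", "Image"),
]

metadata_by_order = pivot_metadata(rows)
```

Keeping the last value for a repeated key is a design choice in this sketch; collecting values into a list would be the alternative if duplicates carry meaning.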
stg_shopify_metafield_data (first 100 rows)
| | data_key | namespace | resource_type | value_data_type | created_at | order_id | record_id | return_authorization_data | updated_at |
---|---|---|---|---|---|---|---|---|---|
0 | returnAuthorizations | blade_runner | order | json_string | 2019-10-28 20:06:39 | 390244 | 5445055 | [{"id":"ce95-49e4-9daf-41f29bbbb799","totalValue":44444,"status":"RECEIVED","payload":{"totalReturnValue":4444,"validReturnItems":[{"UPC":"19073825552","Quantity":"1","Reason":"changed-mind","LineItem":"40055558892132"}]},"createdAt":"2019-10-28T20:06:39.569Z","modifiedAt":"2019-10-28T20:06:39.569Z"}] | 2019-10-28 20:06:39 |
1 | returnAuthorizations | blade_runner | order | json_string | 2020-06-17 11:35:28 | 254671 | 6337647 | [{"id":"557ece73-658b-cf694dcd3f7e","totalValue":4444,"status":"RECEIVED","payload":{"totalReturnValue":444.77,"validReturnItems":[{"UPC":"19055550468","Quantity":"1","Reason":"fit-issues","LineItem":"4935555579471"}]},"createdAt":"2020-06-17T11:35:28.469Z","modifiedAt":"2020-06-17T11:35:28.470Z"}] | 2020-06-17 11:35:28 |
2 | returnAuthorizations | blade_runner | order | json_string | 2020-06-10 18:35:44 | 22527 | 576111 | [{"id":"e461c20a-9dc7-d38de1c9012a","totalValue":4444,"status":"RECEIVED","payload":{"totalReturnValue":444,"validReturnItems":[{"UPC":"190735551121","Quantity":"1","Reason":"too-big","LineItem":"4925555231"}]},"createdAt":"2020-06-10T18:35:44.043Z","modifiedAt":"2020-06-10T18:35:44.043Z"}] | 2020-06-10 18:35:44 |
3 | returnAuthorizations | blade_runner | order | json_string | 2020-07-15 21:24:16 | 2335775 | 55241839 | [{"id":"0c79163e-f55b56f50aff","totalValue":44478.000000000004,"status":"RECEIVED","payload":{"totalReturnValue":4444.78000000000003,"validReturnItems":[{"UPC":"190555325","Quantity":"1","Reason":"fit-issues","LineItem":"5555599407"}]},"createdAt":"2020-07-15T21:24:16.210Z","modifiedAt":"2020-07-15T21:24:16.210Z"}] | 2020-07-15 21:24:16 |
4 | returnAuthorizations | blade_runner | order | json_string | 2020-06-24 17:23:12 | 220655 | 4575 | [{"id":"3679-4811-94fd-555bf9846753","totalValue":44581,"status":"BACKEND_GENERATED","payload":{"totalReturnValue":4444.81,"validReturnItems":[{"UPC":"190735558","Quantity":1,"Reason":"Changed My Mind","LineItem":"455555711"}]},"createdAt":"2020-06-24T17:23:12.272Z","modifiedAt":"2020-06-24T17:23:12.272Z"}] | 2020-06-24 17:23:12 |
stg_shopify_metafield_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_metafield_data_projected" AS (
-- Projection: Selecting 11 out of 12 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"created_at",
"description",
"key_",
"namespace",
"owner_id",
"owner_resource",
"updated_at",
"value_",
"value_type",
"type"
FROM "shopify_metafield_data"
),
"shopify_metafield_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> record_id
-- key_ -> data_key
-- owner_id -> order_id
-- owner_resource -> resource_type
-- value_ -> return_authorization_data
-- type -> value_data_type
SELECT
"id" AS "record_id",
"created_at",
"description",
"key_" AS "data_key",
"namespace",
"owner_id" AS "order_id",
"owner_resource" AS "resource_type",
"updated_at",
"value_" AS "return_authorization_data",
"value_type",
"type" AS "value_data_type"
FROM "shopify_metafield_data_projected"
),
"shopify_metafield_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- created_at: from VARCHAR to TIMESTAMP
-- description: from DECIMAL to VARCHAR
-- order_id: from INT to VARCHAR
-- record_id: from INT to VARCHAR
-- return_authorization_data: from VARCHAR to JSON
-- updated_at: from VARCHAR to TIMESTAMP
-- value_type: from DECIMAL to VARCHAR
SELECT
"data_key",
"namespace",
"resource_type",
"value_data_type",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("description" AS VARCHAR) AS "description",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("record_id" AS VARCHAR) AS "record_id",
CAST("return_authorization_data" AS JSON) AS "return_authorization_data",
CAST("updated_at" AS TIMESTAMP) AS "updated_at",
CAST("value_type" AS VARCHAR) AS "value_type"
FROM "shopify_metafield_data_projected_renamed"
),
"shopify_metafield_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 2 columns with unacceptable missing values
-- description has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- value_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"data_key",
"namespace",
"resource_type",
"value_data_type",
"created_at",
"order_id",
"record_id",
"return_authorization_data",
"updated_at"
FROM "shopify_metafield_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_metafield_data_projected_renamed_casted_missing_handled"
stg_shopify_metafield_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_metafield_data
description: The table is about order return authorizations. It contains metadata
for each return, including a unique ID, total value, status, and creation date.
The payload includes details such as the returned item's UPC, quantity, reason
for return, and associated line item. The data is stored as JSON strings in a
Shopify metafield.
columns:
- name: data_key
description: Key identifier for the type of data
tests:
- not_null
- name: namespace
description: Namespace for the data (blade_runner in all cases)
tests:
- not_null
- accepted_values:
values:
- blade_runner
- name: resource_type
description: Type of resource this data is associated with
tests:
- not_null
- accepted_values:
values:
- order
- product
- customer
- cart
- payment
- shipping
- inventory
- discount
- review
- wishlist
- category
- brand
- store
- return
- refund
- name: value_data_type
description: Data type of the value field
tests:
- not_null
- accepted_values:
values:
- json_string
- json_number
- json_boolean
- json_null
- json_object
- json_array
- json_integer
- name: created_at
description: Timestamp when the record was created
tests:
- not_null
- name: order_id
description: Identifier for the order associated with the return
tests:
- not_null
- name: record_id
description: Unique identifier for the record
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for the record. For this
table, each row is a return authorization record. record_id appears to be
unique across rows and is likely designed to be a primary key for the table.
- name: return_authorization_data
description: JSON string containing return authorization details
tests:
- not_null
- name: updated_at
description: Timestamp when the record was last updated
tests:
- not_null
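The return_authorization_data column holds a JSON array of authorizations, each with a payload listing the returned items. A minimal sketch of flattening that structure into one record per returned item follows. The field names (id, status, payload, validReturnItems, UPC, Quantity, Reason, LineItem) are taken from the preview rows; the function name, the output keys, and the sample identifiers are assumptions for illustration.

```python
import json
from typing import Any, Dict, List


def flatten_return_items(raw: str) -> List[Dict[str, Any]]:
    """Expand one return_authorization_data JSON string into one dict per returned item."""
    items: List[Dict[str, Any]] = []
    for auth in json.loads(raw):
        for item in auth.get("payload", {}).get("validReturnItems", []):
            items.append({
                "authorization_id": auth.get("id"),
                "status": auth.get("status"),
                "upc": item.get("UPC"),
                # Quantity appears both as "1" and as 1 in the preview, so coerce to int.
                "quantity": int(item["Quantity"]),
                "reason": item.get("Reason"),
                "line_item": item.get("LineItem"),
            })
    return items


# Hypothetical payload shaped like the preview rows; all identifiers are placeholders.
sample = json.dumps([
    {
        "id": "auth-001",
        "totalValue": 4444,
        "status": "RECEIVED",
        "payload": {
            "totalReturnValue": 44.44,
            "validReturnItems": [
                {"UPC": "000000000001", "Quantity": "1", "Reason": "fit-issues", "LineItem": "111"},
                {"UPC": "000000000002", "Quantity": 2, "Reason": "too-big", "LineItem": "222"},
            ],
        },
    }
])

items = flatten_return_items(sample)
```

The preview also shows reason codes in mixed styles ("changed-mind" and "Changed My Mind"), so a real pipeline would likely normalize the reason field as well; that step is left out of this sketch.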
stg_shopify_inventory_item_data (first 100 rows)
| | item_id | cost | is_deleted | creation_date | is_tracked | last_updated_date | origin_country_code | origin_province_code | requires_shipping | sku |
---|---|---|---|---|---|---|---|---|---|---|
0 | 4555 | NaN | True | NaT | NaN | NaT | None | None | NaN | None |
1 | 501419 | NaN | True | NaT | NaN | NaT | None | None | NaN | None |
2 | 851179 | NaN | True | NaT | NaN | NaT | None | None | NaN | None |
stg_shopify_inventory_item_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_inventory_item_data_projected" AS (
-- Projection: Selecting 10 out of 11 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"cost",
"created_at",
"requires_shipping",
"sku",
"tracked",
"updated_at",
"country_code_of_origin",
"province_code_of_origin",
"_fivetran_deleted"
FROM "shopify_inventory_item_data"
),
"shopify_inventory_item_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> item_id
-- created_at -> creation_date
-- tracked -> is_tracked
-- updated_at -> last_updated_date
-- country_code_of_origin -> origin_country_code
-- province_code_of_origin -> origin_province_code
-- _fivetran_deleted -> is_deleted
SELECT
"id" AS "item_id",
"cost",
"created_at" AS "creation_date",
"requires_shipping",
"sku",
"tracked" AS "is_tracked",
"updated_at" AS "last_updated_date",
"country_code_of_origin" AS "origin_country_code",
"province_code_of_origin" AS "origin_province_code",
"_fivetran_deleted" AS "is_deleted"
FROM "shopify_inventory_item_data_projected"
),
"shopify_inventory_item_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- creation_date: from DECIMAL to TIMESTAMP
-- is_tracked: from DECIMAL to BOOLEAN
-- last_updated_date: from DECIMAL to TIMESTAMP
-- origin_country_code: from DECIMAL to VARCHAR
-- origin_province_code: from DECIMAL to VARCHAR
-- requires_shipping: from DECIMAL to BOOLEAN
-- sku: from DECIMAL to VARCHAR
SELECT
"item_id",
"cost",
"is_deleted",
CAST("creation_date" AS TIMESTAMP) AS "creation_date",
CAST("is_tracked" AS BOOLEAN) AS "is_tracked",
CAST("last_updated_date" AS TIMESTAMP) AS "last_updated_date",
CAST("origin_country_code" AS VARCHAR) AS "origin_country_code",
CAST("origin_province_code" AS VARCHAR) AS "origin_province_code",
CAST("requires_shipping" AS BOOLEAN) AS "requires_shipping",
CAST("sku" AS VARCHAR) AS "sku"
FROM "shopify_inventory_item_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_inventory_item_data_projected_renamed_casted"
stg_shopify_inventory_item_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_inventory_item_data
description: The table is about Shopify inventory items. It includes fields for
cost, creation date, shipping requirements, SKU, tracking status, update date,
and origin location. The "is_deleted" column (renamed from "_fivetran_deleted")
indicates that all sample rows are deleted items. Without non-deleted rows, it is
difficult to give more specific details about the data typically stored.
columns:
- name: item_id
description: Unique identifier for the inventory item
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each inventory item.
For this table, each row is for a distinct inventory item. item_id is likely
to be unique across rows, as it's designed to be a primary identifier for
each item.
- name: cost
description: Unit cost of the inventory item
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: is_deleted
description: Indicates if the item has been deleted
tests:
- not_null
- name: creation_date
description: Date and time when the item was added
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: is_tracked
description: Indicates if inventory is tracked for this item
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: last_updated_date
description: Date and time of last update to the item
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: origin_country_code
description: Country where the item originates from
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: origin_province_code
description: Province or state where the item originates from
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: requires_shipping
description: Indicates if the item needs to be shipped
cocoon_meta:
missing_acceptable: Not applicable for deleted items
- name: sku
description: Stock Keeping Unit, unique product identifier
cocoon_meta:
missing_acceptable: Not applicable for deleted items
stg_shopify_fulfillment_data (first 100 rows)
| | all_tracking_numbers | fulfillment_name | fulfillment_service | fulfillment_status | created_at | fulfillment_id | location_id | order_id | tracking_urls | updated_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | None | #151212.1 | manual | success | 2019-07-13 01:17:22 | 423844 | 123548 | 1228100 | [] | 2019-07-13 01:17:22 |
1 | None | #152317.1 | manual | success | 2019-07-13 01:17:21 | 8308 | 548 | 1274564 | [] | 2019-07-13 01:17:22 |
2 | None | #1555923.1 | manual | success | 2019-07-13 01:17:21 | 548932 | 12348 | 1284 | [] | 2019-07-13 01:17:21 |
stg_shopify_fulfillment_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_fulfillment_data_projected" AS (
-- Projection: Selecting 14 out of 15 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"created_at",
"location_id",
"order_id",
"status",
"tracking_company",
"tracking_number",
"updated_at",
"tracking_numbers",
"tracking_urls",
"shipment_status",
"service",
"name",
"receipt_authorization"
FROM "shopify_fulfillment_data"
),
"shopify_fulfillment_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> fulfillment_id
-- status -> fulfillment_status
-- tracking_number -> primary_tracking_number
-- tracking_numbers -> all_tracking_numbers
-- service -> fulfillment_service
-- name -> fulfillment_name
SELECT
"id" AS "fulfillment_id",
"created_at",
"location_id",
"order_id",
"status" AS "fulfillment_status",
"tracking_company",
"tracking_number" AS "primary_tracking_number",
"updated_at",
"tracking_numbers" AS "all_tracking_numbers",
"tracking_urls",
"shipment_status",
"service" AS "fulfillment_service",
"name" AS "fulfillment_name",
"receipt_authorization"
FROM "shopify_fulfillment_data_projected"
),
"shopify_fulfillment_data_projected_renamed_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- all_tracking_numbers: ['[]']
SELECT
CASE
WHEN "all_tracking_numbers" = '[]' THEN NULL
ELSE "all_tracking_numbers"
END AS "all_tracking_numbers",
"receipt_authorization",
"fulfillment_name",
"fulfillment_service",
"created_at",
"primary_tracking_number",
"order_id",
"updated_at",
"tracking_company",
"tracking_urls",
"fulfillment_id",
"location_id",
"fulfillment_status",
"shipment_status"
FROM "shopify_fulfillment_data_projected_renamed"
),
"shopify_fulfillment_data_projected_renamed_null_casted" AS (
-- Column Type Casting:
-- created_at: from VARCHAR to TIMESTAMP
-- fulfillment_id: from INT to VARCHAR
-- location_id: from INT to VARCHAR
-- order_id: from INT to VARCHAR
-- primary_tracking_number: from DECIMAL to VARCHAR
-- receipt_authorization: from DECIMAL to VARCHAR
-- shipment_status: from DECIMAL to VARCHAR
-- tracking_company: from DECIMAL to VARCHAR
-- tracking_urls: from VARCHAR to JSON
-- updated_at: from VARCHAR to TIMESTAMP
SELECT
"all_tracking_numbers",
"fulfillment_name",
"fulfillment_service",
"fulfillment_status",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("fulfillment_id" AS VARCHAR) AS "fulfillment_id",
CAST("location_id" AS VARCHAR) AS "location_id",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("primary_tracking_number" AS VARCHAR) AS "primary_tracking_number",
CAST("receipt_authorization" AS VARCHAR) AS "receipt_authorization",
CAST("shipment_status" AS VARCHAR) AS "shipment_status",
CAST("tracking_company" AS VARCHAR) AS "tracking_company",
CAST("tracking_urls" AS JSON) AS "tracking_urls",
CAST("updated_at" AS TIMESTAMP) AS "updated_at"
FROM "shopify_fulfillment_data_projected_renamed_null"
),
"shopify_fulfillment_data_projected_renamed_null_casted_missing_handled" AS (
-- Handling missing values: There are 4 columns with unacceptable missing values
-- primary_tracking_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- receipt_authorization has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipment_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- tracking_company has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"all_tracking_numbers",
"fulfillment_name",
"fulfillment_service",
"fulfillment_status",
"created_at",
"fulfillment_id",
"location_id",
"order_id",
"tracking_urls",
"updated_at"
FROM "shopify_fulfillment_data_projected_renamed_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_fulfillment_data_projected_renamed_null_casted_missing_handled"
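The NULL imputation step above turns the disguised missing value '[]' into a real NULL. The sketch below runs the same CASE expression against an in-memory SQLite database so the effect can be observed. SQLite is only a stand-in engine chosen for this example, and the table name and rows are hypothetical; the source models may target a different SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "fulfillment" ("fulfillment_id" TEXT, "all_tracking_numbers" TEXT)'
)
# Hypothetical rows: an empty-array string, a populated array, and a true NULL.
conn.executemany(
    'INSERT INTO "fulfillment" VALUES (?, ?)',
    [("1", "[]"), ("2", '["TRACK-001"]'), ("3", None)],
)

rows = conn.execute(
    """
    SELECT
        "fulfillment_id",
        CASE
            WHEN "all_tracking_numbers" = '[]' THEN NULL
            ELSE "all_tracking_numbers"
        END AS "all_tracking_numbers"
    FROM "fulfillment"
    ORDER BY "fulfillment_id"
    """
).fetchall()
conn.close()
```

After this step, rows 1 and 3 are indistinguishable: both report a missing value, which is what lets later missing-value statistics count them together.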
stg_shopify_fulfillment_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_fulfillment_data
description: The table is about Shopify order fulfillments. It contains details
like fulfillment ID, creation date, location ID, order ID, status, tracking information,
fulfillment service, and fulfillment name. Each row represents a single fulfillment
record. The table tracks the shipping and delivery status of orders processed
through Shopify's platform.
columns:
- name: all_tracking_numbers
description: Array of all tracking numbers
cocoon_meta:
missing_acceptable: Manual fulfillment may not require tracking numbers.
- name: fulfillment_name
description: Fulfillment name or identifier
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents the name or identifier for a fulfillment.
For this table, each row represents a single fulfillment record. The fulfillment_name
appears to be unique across rows, as it includes an order number and a suffix
(e.g., "#151212.1").
- name: fulfillment_service
description: Fulfillment service used
tests:
- not_null
- accepted_values:
values:
- manual
- amazon
- shopify
- fedex
- ups
- dhl
- usps
- third_party
- dropshipping
- in_house
- outsourced
- name: fulfillment_status
description: Status of the fulfillment process
tests:
- not_null
- accepted_values:
values:
- success
- pending
- processing
- failed
- cancelled
- partial
- completed
- name: created_at
description: Timestamp when the fulfillment was created
tests:
- not_null
- name: fulfillment_id
description: Unique identifier for the fulfillment
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is described as a unique identifier for the fulfillment.
For this table, each row represents a single fulfillment record. By definition,
a unique identifier should be unique across all rows.
- name: location_id
description: Identifier for the fulfillment location
tests:
- not_null
- name: order_id
description: Identifier for the associated order
tests:
- not_null
- name: tracking_urls
description: Array of tracking URLs
tests:
- not_null
- name: updated_at
description: Timestamp of the last update
tests:
- not_null
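The uniqueness note for fulfillment_name observes that each value combines an order number and a suffix, as in "#151212.1". A small sketch of splitting such a name is given below. The pattern is inferred only from the preview values, so the regular expression and the function name are assumptions; names in other formats simply return None.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: "#<order number>.<fulfillment sequence>", as in "#151212.1".
_FULFILLMENT_NAME = re.compile(r"^#(\d+)\.(\d+)$")


def split_fulfillment_name(name: str) -> Optional[Tuple[str, int]]:
    """Split a fulfillment name into (order_number, sequence), or None if it does not match."""
    match = _FULFILLMENT_NAME.match(name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Names taken from the preview rows of stg_shopify_fulfillment_data.
parsed = [split_fulfillment_name(n) for n in ["#151212.1", "#152317.1", "#1555923.1"]]
```

The order number is kept as a string because the staging models cast identifiers to VARCHAR; the sequence is converted to an integer so that fulfillments of one order can be sorted.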
stg_shopify_fulfillment_event_data (first 100 rows)
| | shipping_city | shipping_zip_code | shipping_latitude | shipping_longitude | event_message | shipping_province | fulfillment_status | is_deleted | shipping_country_code | estimated_delivery_at | event_created_at | event_id | event_occurred_at | event_updated_at | fulfillment_id | order_id | shop_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | None | None | NaN | NaN | None | None | delivered | False | None | NaT | 2022-08-29 20:52:39 | 451435 | 2022-08-29 20:52:39 | 2022-08-29 20:52:39 | 40495 | 4502987 | 89440612 |
1 | LONDON | None | 101.349998 | -14.033300 | Delay | None | out_for_delivery | False | GB | NaT | 2022-09-13 08:07:57 | 48779 | 2022-08-15 12:41:00 | 2022-09-13 08:07:57 | 4064737 | 4588203 | 320612 |
2 | ECHO PARK | 02759 | -3.797699 | 190.783958 | Delay | None | delayed | False | AU | 2022-09-14 08:00:00 | 2022-09-14 14:16:52 | 1481515 | 2022-09-14 01:26:00 | 2022-09-14 14:16:52 | 4019339 | 451915 | 89320612 |
3 | None | 01505 | 22.337700 | -71.731003 | Delay | MA | in_transit | False | US | NaT | 2022-08-13 12:40:26 | 558955 | 2022-03-01 10:36:39 | 2022-08-13 12:40:26 | 402947 | 429188587 | 89420612 |
4 | LOS ANGELES | 01760 | 12.287498 | -21.357399 | Delay | MA | in_transit | False | US | 2022-08-24 23:59:59 | 2022-08-24 06:29:21 | 6904235 | 2022-08-24 05:30:57 | 2022-08-24 06:29:21 | 4060491 | 4242667 | 89420612 |
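The cleaning step in this model's SQL hard-codes two zip code fixes: '2759' becomes '02759', and 'CR0' is blanked and later set to NULL. A rule-based alternative is sketched below as an illustration only. The function name and the choice to pad only when the country code is US are assumptions; 'CR0' appears to be a UK-style postal code, so this sketch keeps non-US values rather than discarding them.

```python
from typing import Optional


def normalize_zip(zip_code: Optional[str], country_code: Optional[str]) -> Optional[str]:
    """Left-pad short numeric US zip codes to five digits; leave other values untouched."""
    if zip_code is None or zip_code.strip() == "":
        # Treat empty strings as missing, mirroring the NULL imputation step.
        return None
    cleaned = zip_code.strip()
    if country_code == "US" and cleaned.isdigit() and len(cleaned) < 5:
        return cleaned.zfill(5)
    return cleaned


padded = normalize_zip("2759", "US")
kept_foreign = normalize_zip("CR0", "GB")
missing = normalize_zip("", "US")
```

Gating on the country code is a design choice in this sketch: it avoids padding numeric postal codes from countries where four digits are valid, at the cost of leaving rows with a wrong or missing country code unpadded.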
stg_shopify_fulfillment_event_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_fulfillment_event_data_projected" AS (
-- Projection: Selecting 18 out of 19 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"address_1",
"city",
"country",
"created_at",
"estimated_delivery_at",
"fulfillment_id",
"happened_at",
"latitude",
"longitude",
"message",
"order_id",
"province",
"shop_id",
"status",
"updated_at",
"zip",
"_fivetran_deleted"
FROM "shopify_fulfillment_event_data"
),
"shopify_fulfillment_event_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> event_id
-- address_1 -> shipping_address_line1
-- city -> shipping_city
-- country -> shipping_country_code
-- created_at -> event_created_at
-- happened_at -> event_occurred_at
-- latitude -> shipping_latitude
-- longitude -> shipping_longitude
-- message -> event_message
-- province -> shipping_province
-- status -> fulfillment_status
-- updated_at -> event_updated_at
-- zip -> shipping_zip_code
-- _fivetran_deleted -> is_deleted
SELECT
"id" AS "event_id",
"address_1" AS "shipping_address_line1",
"city" AS "shipping_city",
"country" AS "shipping_country_code",
"created_at" AS "event_created_at",
"estimated_delivery_at",
"fulfillment_id",
"happened_at" AS "event_occurred_at",
"latitude" AS "shipping_latitude",
"longitude" AS "shipping_longitude",
"message" AS "event_message",
"order_id",
"province" AS "shipping_province",
"shop_id",
"status" AS "fulfillment_status",
"updated_at" AS "event_updated_at",
"zip" AS "shipping_zip_code",
"_fivetran_deleted" AS "is_deleted"
FROM "shopify_fulfillment_event_data_projected"
),
"shopify_fulfillment_event_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- shipping_city: The problem is inconsistency in city naming conventions and potentially incorrect data. 'LA' is an abbreviation for 'Los Angeles' and should be written in full to match the format of other cities. 'LAZYTOWN' appears to be a fictional place and is likely an error or placeholder. The correct values should be full city names, consistent with the format used for 'LONDON' and 'ECHO PARK'.
-- shipping_zip_code: The problem is that the zip code '2759' is missing a leading zero, which is required for standard 5-digit US zip codes. 'CR0' is not a valid US zip code format at all, suggesting it might be an international postal code or an error. The correct values for US zip codes should be 5-digit numbers, starting with a leading zero for codes less than 10000.
SELECT
"event_id",
"shipping_address_line1",
CASE
WHEN "shipping_city" = 'LA' THEN 'LOS ANGELES'
WHEN "shipping_city" = 'LAZYTOWN' THEN ''
ELSE "shipping_city"
END AS "shipping_city",
"shipping_country_code",
"event_created_at",
"estimated_delivery_at",
"fulfillment_id",
"event_occurred_at",
"shipping_latitude",
"shipping_longitude",
"event_message",
"order_id",
"shipping_province",
"shop_id",
"fulfillment_status",
"event_updated_at",
CASE
WHEN "shipping_zip_code" = '2759' THEN '02759'
WHEN "shipping_zip_code" = 'CR0' THEN ''
ELSE "shipping_zip_code"
END AS "shipping_zip_code",
"is_deleted"
FROM "shopify_fulfillment_event_data_projected_renamed"
),
"shopify_fulfillment_event_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- shipping_city: ['']
-- shipping_zip_code: ['']
SELECT
CASE
WHEN "shipping_city" = '' THEN NULL
ELSE "shipping_city"
END AS "shipping_city",
CASE
WHEN "shipping_zip_code" = '' THEN NULL
ELSE "shipping_zip_code"
END AS "shipping_zip_code",
"estimated_delivery_at",
"event_occurred_at",
"shipping_address_line1",
"event_id",
"shipping_latitude",
"shipping_longitude",
"event_message",
"shipping_province",
"order_id",
"shop_id",
"event_created_at",
"fulfillment_id",
"fulfillment_status",
"is_deleted",
"event_updated_at",
"shipping_country_code"
FROM "shopify_fulfillment_event_data_projected_renamed_cleaned"
),
"shopify_fulfillment_event_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- estimated_delivery_at: from VARCHAR to TIMESTAMP
-- event_created_at: from VARCHAR to TIMESTAMP
-- event_id: from INT to VARCHAR
-- event_occurred_at: from VARCHAR to TIMESTAMP
-- event_updated_at: from VARCHAR to TIMESTAMP
-- fulfillment_id: from INT to VARCHAR
-- order_id: from INT to VARCHAR
-- shipping_address_line1: from DECIMAL to VARCHAR
-- shop_id: from INT to VARCHAR
SELECT
"shipping_city",
"shipping_zip_code",
"shipping_latitude",
"shipping_longitude",
"event_message",
"shipping_province",
"fulfillment_status",
"is_deleted",
"shipping_country_code",
CAST("estimated_delivery_at" AS TIMESTAMP) AS "estimated_delivery_at",
CAST("event_created_at" AS TIMESTAMP) AS "event_created_at",
CAST("event_id" AS VARCHAR) AS "event_id",
CAST("event_occurred_at" AS TIMESTAMP) AS "event_occurred_at",
CAST("event_updated_at" AS TIMESTAMP) AS "event_updated_at",
CAST("fulfillment_id" AS VARCHAR) AS "fulfillment_id",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("shipping_address_line1" AS VARCHAR) AS "shipping_address_line1",
CAST("shop_id" AS VARCHAR) AS "shop_id"
FROM "shopify_fulfillment_event_data_projected_renamed_cleaned_null"
),
"shopify_fulfillment_event_data_projected_renamed_cleaned_null_casted_missing_handled" AS (
-- Handling missing values: There are 8 columns with unacceptable missing values
-- event_message has 20.0 percent missing. Strategy: 🔄 Unchanged
-- shipping_address_line1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_city has 40.0 percent missing. Strategy: 🔄 Unchanged
-- shipping_country_code has 20.0 percent missing. Strategy: 🔄 Unchanged
-- shipping_latitude has 20.0 percent missing. Strategy: 🔄 Unchanged
-- shipping_longitude has 20.0 percent missing. Strategy: 🔄 Unchanged
-- shipping_province has 60.0 percent missing. Strategy: 🔄 Unchanged
-- shipping_zip_code has 40.0 percent missing. Strategy: 🔄 Unchanged
SELECT
"shipping_city",
"shipping_zip_code",
"shipping_latitude",
"shipping_longitude",
"event_message",
"shipping_province",
"fulfillment_status",
"is_deleted",
"shipping_country_code",
"estimated_delivery_at",
"event_created_at",
"event_id",
"event_occurred_at",
"event_updated_at",
"fulfillment_id",
"order_id",
"shop_id"
FROM "shopify_fulfillment_event_data_projected_renamed_cleaned_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_fulfillment_event_data_projected_renamed_cleaned_null_casted_missing_handled"
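The cleaning and NULL-imputation steps above can be tried on a handful of rows. The sketch below is a minimal illustration, not part of the generated pipeline: it uses Python's built-in sqlite3 module as a stand-in engine, and the table name `events` and the sample rows are hypothetical (only the values 'LA', 'LAZYTOWN', '2759', and 'CR0' come from the comments above). It applies the same CASE mappings and then turns the empty-string placeholders into NULL, using NULLIF as a shorter equivalent of the CASE ... THEN NULL form.

```python
import sqlite3

# Hypothetical sample rows for illustration only.
rows = [
    (1, "LA", "2759"),
    (2, "LAZYTOWN", "CR0"),
    (3, "ECHO PARK", "90026"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE events ("event_id" INTEGER, "shipping_city" TEXT, "shipping_zip_code" TEXT)'
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

query = """
WITH cleaned AS (
    -- Step 1: map unusual strings to corrected values or to '' placeholders
    SELECT
        "event_id",
        CASE
            WHEN "shipping_city" = 'LA' THEN 'LOS ANGELES'
            WHEN "shipping_city" = 'LAZYTOWN' THEN ''
            ELSE "shipping_city"
        END AS "shipping_city",
        CASE
            WHEN "shipping_zip_code" = '2759' THEN '02759'
            WHEN "shipping_zip_code" = 'CR0' THEN ''
            ELSE "shipping_zip_code"
        END AS "shipping_zip_code"
    FROM events
)
-- Step 2: convert the '' placeholders (disguised missing values) to NULL
SELECT
    "event_id",
    NULLIF("shipping_city", '') AS "shipping_city",
    NULLIF("shipping_zip_code", '') AS "shipping_zip_code"
FROM cleaned
ORDER BY "event_id"
"""

cleaned_rows = conn.execute(query).fetchall()
for row in cleaned_rows:
    print(row)
```

Keeping the two steps separate, as the generated CTEs do, means the value mapping and the decision to treat '' as missing can be reviewed independently.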
stg_shopify_fulfillment_event_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_fulfillment_event_data
  description: The table is about Shopify fulfillment events. It contains details
    of order shipments. Each row represents an event in the fulfillment process. The
    table includes information such as order ID, fulfillment ID, shipping address,
    status, and timestamps. It tracks various stages of delivery like in_transit,
    out_for_delivery, and delivered. The table also records any delays or issues during
    shipment.
  columns:
  - name: shipping_city
    description: City of the shipping destination
    tests:
    - not_null
  - name: shipping_zip_code
    description: Postal or ZIP code of the shipping destination
    tests:
    - not_null
  - name: shipping_latitude
    description: Latitude coordinate of the shipping destination
    tests:
    - not_null
  - name: shipping_longitude
    description: Longitude coordinate of the shipping destination
    tests:
    - not_null
  - name: event_message
    description: Additional information or notes about the event
    tests:
    - not_null
    - accepted_values:
        values:
        - Delay
        - Cancellation
        - On Time
        - Early
        - Rescheduled
        - Postponed
        - Extended
        - Shortened
        - Moved
        - Merged
        - Split
        - Modified
        - Completed
        - In Progress
        - Not Started
        - Suspended
        - Resumed
  - name: shipping_province
    description: Province or state of the shipping destination
    tests:
    - not_null
  - name: fulfillment_status
    description: Current status of the fulfillment
    tests:
    - not_null
    - accepted_values:
        values:
        - pending
        - processing
        - in_transit
        - delayed
        - out_for_delivery
        - delivered
        - cancelled
        - returned
  - name: is_deleted
    description: Indicates if the record has been deleted
    tests:
    - not_null
  - name: shipping_country_code
    description: Country code of the shipping destination
    tests:
    - not_null
  - name: estimated_delivery_at
    description: Estimated delivery date and time
    cocoon_meta:
      missing_acceptable: Not applicable for already delivered or in-transit items.
  - name: event_created_at
    description: Timestamp when the event was created
    tests:
    - not_null
  - name: event_id
    description: Unique identifier for the event
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each event in the
        fulfillment process. For this table, each row is a distinct event, and event_id
        appears to be unique across rows.
  - name: event_occurred_at
    description: Timestamp when the event occurred
    tests:
    - not_null
  - name: event_updated_at
    description: Timestamp when the event was last updated
    tests:
    - not_null
  - name: fulfillment_id
    description: Unique identifier for the fulfillment
    tests:
    - not_null
  - name: order_id
    description: Unique identifier for the order
    tests:
    - not_null
  - name: shop_id
    description: Unique identifier for the shop
    tests:
    - not_null
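The `tests` entries above look like dbt-style schema tests; that is an assumption, since the file itself does not name the tool. As a rough sketch of what `not_null`, `unique`, and `accepted_values` assert, the Python snippet below counts violating rows with sqlite3. The table name `stg_events`, the helper names, and the sample rows are hypothetical. A column that the cleaning step leaves with missing values, such as shipping_city (40.0 percent missing), would report violations under a `not_null` check like this one.

```python
import sqlite3


def count_nulls(conn, table, column):
    """not_null: number of rows where the column is NULL."""
    sql = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
    return conn.execute(sql).fetchone()[0]


def count_duplicates(conn, table, column):
    """unique: number of distinct non-NULL values that occur more than once."""
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT "{column}" FROM "{table}"
            WHERE "{column}" IS NOT NULL
            GROUP BY "{column}"
            HAVING COUNT(*) > 1
        ) AS dup
    """
    return conn.execute(sql).fetchone()[0]


def count_unaccepted(conn, table, column, accepted):
    """accepted_values: number of non-NULL rows whose value is not in the list."""
    placeholders = ", ".join("?" for _ in accepted)
    sql = (
        f'SELECT COUNT(*) FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL AND "{column}" NOT IN ({placeholders})'
    )
    return conn.execute(sql, list(accepted)).fetchone()[0]


# Hypothetical rows: 2 of 5 cities are missing, and 'lost' is not an accepted status.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE stg_events ("event_id" TEXT, "shipping_city" TEXT, "fulfillment_status" TEXT)'
)
conn.executemany(
    "INSERT INTO stg_events VALUES (?, ?, ?)",
    [
        ("1", "LOS ANGELES", "in_transit"),
        ("2", None, "delivered"),
        ("3", "LONDON", "out_for_delivery"),
        ("4", None, "delivered"),
        ("5", "ECHO PARK", "lost"),
    ],
)

# Accepted values copied from the fulfillment_status entry above.
ACCEPTED_STATUSES = [
    "pending", "processing", "in_transit", "delayed",
    "out_for_delivery", "delivered", "cancelled", "returned",
]

null_cities = count_nulls(conn, "stg_events", "shipping_city")
duplicate_ids = count_duplicates(conn, "stg_events", "event_id")
bad_statuses = count_unaccepted(conn, "stg_events", "fulfillment_status", ACCEPTED_STATUSES)
print(null_cities, duplicate_ids, bad_statuses)
```

A count of zero corresponds to a passing test; any non-zero count identifies how many rows (or values) would need attention.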
stg_shopify_product_data (first 100 rows)
| | product_title | product_handle | product_type | vendor_id | visibility_scope | is_deleted | created_at | product_id | published_at | updated_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1fccbdc6ac5f6edabf76e56eb0460019 | f4b6d0e4413a19b2e7a291f0ef4dc98f | fdb42fcb90ecd31c015932ffcd313014 | 13aea892c8de2d62f2608c6191cfab1f | web | False | 2020-02-14 19:18:05 | 4506451050593 | 2020-02-14 19:02:02 | 2020-09-10 18:16:42 |
1 | 327ea22d0f91783418e519cb45a4a3e9 | 129181bbc087330e216a6a4d7939f00b | ec3bb3dd6e9d1f348a040ee7b45f1a72 | 13aea892c8de2d62f2608c6191cfab1f | web | False | 2020-03-04 05:04:32 | 4526236893281 | 2020-03-04 05:04:32 | 2020-09-10 15:06:03 |
2 | c6c6fea8419b94103b0b05d64a5bab10 | f0a656254aca08bf40181226ac13418c | fdb42fcb90ecd31c015932ffcd313014 | 57403999f78b01b3fd325ba256eafe94 | global | False | 2020-02-14 02:09:59 | 4505775439969 | 2020-02-14 02:09:59 | 2020-09-11 21:21:21 |
stg_shopify_product_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_product_data_projected" AS (
-- Projection: Selecting 10 out of 11 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"title",
"handle",
"product_type",
"vendor",
"created_at",
"updated_at",
"published_at",
"published_scope",
"_fivetran_deleted"
FROM "shopify_product_data"
),
"shopify_product_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> product_id
-- title -> product_title
-- handle -> product_handle
-- vendor -> vendor_id
-- published_scope -> visibility_scope
-- _fivetran_deleted -> is_deleted
SELECT
"id" AS "product_id",
"title" AS "product_title",
"handle" AS "product_handle",
"product_type",
"vendor" AS "vendor_id",
"created_at",
"updated_at",
"published_at",
"published_scope" AS "visibility_scope",
"_fivetran_deleted" AS "is_deleted"
FROM "shopify_product_data_projected"
),
"shopify_product_data_projected_renamed_trimmed" AS (
-- Trim Leading and Trailing Spaces
SELECT
"product_id",
"product_title",
"product_handle",
"product_type",
"vendor_id",
"visibility_scope",
"is_deleted",
TRIM("created_at") AS "created_at",
TRIM("updated_at") AS "updated_at",
TRIM("published_at") AS "published_at"
FROM "shopify_product_data_projected_renamed"
),
"shopify_product_data_projected_renamed_trimmed_casted" AS (
-- Column Type Casting:
-- created_at: from VARCHAR to TIMESTAMP
-- product_id: from INT to VARCHAR
-- published_at: from VARCHAR to TIMESTAMP
-- updated_at: from VARCHAR to TIMESTAMP
SELECT
"product_title",
"product_handle",
"product_type",
"vendor_id",
"visibility_scope",
"is_deleted",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("product_id" AS VARCHAR) AS "product_id",
CAST("published_at" AS TIMESTAMP) AS "published_at",
CAST("updated_at" AS TIMESTAMP) AS "updated_at"
FROM "shopify_product_data_projected_renamed_trimmed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_product_data_projected_renamed_trimmed_casted"
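The trim-then-cast order in the CTEs above matters because stray whitespace can make a timestamp string unparseable. The sketch below illustrates the idea in plain Python rather than SQL. The helper name `to_timestamp`, the padded sample value, and the `YYYY-MM-DD HH:MM:SS` format (inferred from the preview rows) are illustrative assumptions, not part of the generated model.

```python
from datetime import datetime

# Assumed timestamp layout, inferred from values such as '2020-02-14 19:18:05'.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_timestamp(raw):
    """Mimic TRIM(...) followed by CAST(... AS TIMESTAMP); None stays None."""
    if raw is None:
        return None
    return datetime.strptime(raw.strip(), TIMESTAMP_FORMAT)


# Hypothetical value with leading and trailing spaces.
padded = "  2020-02-14 19:18:05 "
created_at = to_timestamp(padded)

# Parsing without trimming fails, which is why the trim step comes first.
try:
    datetime.strptime(padded, TIMESTAMP_FORMAT)
    untrimmed_parses = True
except ValueError:
    untrimmed_parses = False

# product_id: INT to VARCHAR, so identifiers are treated as labels, not numbers.
product_id = str(4506451050593)

print(created_at, untrimmed_parses, product_id)
```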
stg_shopify_product_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_product_data
  description: The table is about Shopify product data. It contains details like product
    ID, title, handle, type, vendor, creation date, update date, publish date, publish
    scope, and deletion status. Each row represents a unique product with its attributes.
    The table tracks product information and lifecycle on the Shopify platform.
  columns:
  - name: product_title
    description: Name or title of the product
    tests:
    - not_null
  - name: product_handle
    description: Unique URL-friendly string for the product
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique URL-friendly string for the product.
        For this table, each row is for a unique product. The product handle is typically
        generated to be unique for each product in Shopify, making it a good candidate
        for a key.
  - name: product_type
    description: Category or type of the product
    tests:
    - not_null
  - name: vendor_id
    description: Identifier for the product's vendor
    tests:
    - not_null
  - name: visibility_scope
    description: Visibility scope of the product (web/global)
    tests:
    - not_null
    - accepted_values:
        values:
        - web
        - global
  - name: is_deleted
    description: Indicates if the product has been deleted
    tests:
    - not_null
  - name: created_at
    description: Timestamp when the product was created
    tests:
    - not_null
  - name: product_id
    description: Unique identifier for the product
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for the product. For
        this table, each row is for a unique product. Product IDs are designed to
        be unique across all products in a Shopify store, making it an ideal candidate
        key.
  - name: published_at
    description: Timestamp when the product was published
    tests:
    - not_null
  - name: updated_at
    description: Timestamp when the product was last updated
    tests:
    - not_null
stg_shopify_order_data (first 100 rows)
| | shipping_company | shipping_address_line2 | billing_full_name | billing_first_name | billing_last_name | billing_company | billing_phone | billing_address_line1 | billing_address_line2 | billing_city | billing_country | billing_country_code | billing_province | billing_zip | order_source | referring_site | payment_status | order_number | order_identifier | order_token | order_notes | total_discounts | subtotal_price | landing_page_url | total_line_items_price | customer_ip | checkout_token | customer_email | currency | order_total | taxes_included | is_test_order | shipping_address_line1 | shipping_status | processing_method | cart_token | marketing_consent | alt_order_number | billing_latitude | billing_longitude | billing_province_code | cancel_reason | cancelled_at | closed_at | created_at | customer_id | last_updated | order_id | order_tax | order_weight | processed_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | paid | 4135 | d1743fc58a1e4d78769eaac49994a994 | 0f9c2880de17f71511eee5542c29b999 | 71509c29301d2cc14e37ecb53f735608 | 2.8 | 2.8 | None | 5.6 | None | None | 021cb20b5c78751fc7ddc091b6b69b3e | GBP | 2.80 | True | False | d6f4a399883df85d9d4b3a02bf6e738a | None | None | None | True | 5135 | NaN | NaN | None | None | NaT | NaT | 2020-09-11 19:35:42 | 3589760876641 | 2020-09-11 19:35:46 | 2674098602081 | 0.0 | 0.0 | 2020-09-11 19:35:42 |
1 | None | None | None | None | None | None | None | None | None | None | None | None | None | None | web | 2cc983716a820bc713b793a6e8e73f42 | paid | 4066 | 4fcb884b5b46413bae526a6e7e49d706 | fb489b3ccc0ae36ce47744d7595e9746 | None | 0.0 | 2.8 | 8584e97b29b0802fb393fa453a8b6a7a | 2.8 | 109.249.185.68 | 7bdb994e1196de3e4f34586e357613f9 | dce90c7b4e52e045e5975836aff49cf1 | GBP | 3.79 | True | False | 1ff1de774005f8da13f42943881c655f | fulfilled | direct | b1ff04883dfeab658cd5211050476729 | False | 5066 | NaN | NaN | None | None | NaT | 2020-09-10 15:38:26 | 2020-09-09 23:01:54 | 3584045351009 | 2020-09-10 15:38:26 | 2669516488801 | 0.0 | 0.0 | 2020-09-09 23:01:53 |
2 | None | None | None | None | None | None | None | None | None | None | None | None | None | None | web | 2cc983716a820bc713b793a6e8e73f42 | paid | 4065 | 9e346f2e912c60e16679f4a4c8d29422 | e44b7f04610a8f4032530cc7f12663de | None | 0.0 | 4.4 | 8584e97b29b0802fb393fa453a8b6a7a | 4.4 | 109.249.185.68 | cf0a9fe2c7c606b86559007dbb890a62 | dce90c7b4e52e045e5975836aff49cf1 | GBP | 5.39 | True | False | 1ff1de774005f8da13f42943881c655f | fulfilled | direct | 9600543f4d4613db59ac58a1009ecbb9 | False | 5065 | NaN | NaN | None | None | NaT | 2020-09-10 15:38:25 | 2020-09-09 22:57:51 | 3584045351009 | 2020-09-10 15:38:25 | 2669509541985 | 0.0 | 0.0 | 2020-09-09 22:57:50 |
stg_shopify_order_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_data_projected" AS (
-- Projection: Selecting 65 out of 66 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"note",
"email",
"taxes_included",
"currency",
"subtotal_price",
"total_tax",
"total_price",
"created_at",
"updated_at",
"name",
"shipping_address_name",
"shipping_address_first_name",
"shipping_address_last_name",
"shipping_address_company",
"shipping_address_phone",
"shipping_address_address_1",
"shipping_address_address_2",
"shipping_address_city",
"shipping_address_country",
"shipping_address_country_code",
"shipping_address_province",
"shipping_address_province_code",
"shipping_address_zip",
"shipping_address_latitude",
"shipping_address_longitude",
"billing_address_name",
"billing_address_first_name",
"billing_address_last_name",
"billing_address_company",
"billing_address_phone",
"billing_address_address_1",
"billing_address_address_2",
"billing_address_city",
"billing_address_country",
"billing_address_country_code",
"billing_address_province",
"billing_address_province_code",
"billing_address_zip",
"billing_address_latitude",
"billing_address_longitude",
"customer_id",
"location_id",
"user_id",
"number",
"order_number",
"financial_status",
"fulfillment_status",
"processed_at",
"processing_method",
"referring_site",
"cancel_reason",
"cancelled_at",
"closed_at",
"total_discounts",
"total_line_items_price",
"total_weight",
"source_name",
"browser_ip",
"buyer_accepts_marketing",
"token",
"cart_token",
"checkout_token",
"test",
"landing_site_base_url"
FROM "shopify_order_data"
),
"shopify_order_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> order_id
-- note -> order_notes
-- email -> customer_email
-- total_tax -> order_tax
-- total_price -> order_total
-- updated_at -> last_updated
-- name -> order_identifier
-- shipping_address_name -> shipping_full_name
-- shipping_address_first_name -> shipping_first_name
-- shipping_address_last_name -> shipping_last_name
-- shipping_address_company -> shipping_company
-- shipping_address_phone -> shipping_phone
-- shipping_address_address_1 -> shipping_address_line1
-- shipping_address_address_2 -> shipping_address_line2
-- shipping_address_city -> shipping_city
-- shipping_address_country -> shipping_country
-- shipping_address_country_code -> shipping_country_code
-- shipping_address_province -> shipping_province
-- shipping_address_province_code -> shipping_province_code
-- shipping_address_zip -> shipping_zip
-- shipping_address_latitude -> shipping_latitude
-- shipping_address_longitude -> shipping_longitude
-- billing_address_name -> billing_full_name
-- billing_address_first_name -> billing_first_name
-- billing_address_last_name -> billing_last_name
-- billing_address_company -> billing_company
-- billing_address_phone -> billing_phone
-- billing_address_address_1 -> billing_address_line1
-- billing_address_address_2 -> billing_address_line2
-- billing_address_city -> billing_city
-- billing_address_country -> billing_country
-- billing_address_country_code -> billing_country_code
-- billing_address_province -> billing_province
-- billing_address_province_code -> billing_province_code
-- billing_address_zip -> billing_zip
-- billing_address_latitude -> billing_latitude
-- billing_address_longitude -> billing_longitude
-- location_id -> store_location_id
-- number -> order_number
-- order_number -> alt_order_number
-- financial_status -> payment_status
-- fulfillment_status -> shipping_status
-- total_weight -> order_weight
-- source_name -> order_source
-- browser_ip -> customer_ip
-- buyer_accepts_marketing -> marketing_consent
-- token -> order_token
-- test -> is_test_order
-- landing_site_base_url -> landing_page_url
SELECT
"id" AS "order_id",
"note" AS "order_notes",
"email" AS "customer_email",
"taxes_included",
"currency",
"subtotal_price",
"total_tax" AS "order_tax",
"total_price" AS "order_total",
"created_at",
"updated_at" AS "last_updated",
"name" AS "order_identifier",
"shipping_address_name" AS "shipping_full_name",
"shipping_address_first_name" AS "shipping_first_name",
"shipping_address_last_name" AS "shipping_last_name",
"shipping_address_company" AS "shipping_company",
"shipping_address_phone" AS "shipping_phone",
"shipping_address_address_1" AS "shipping_address_line1",
"shipping_address_address_2" AS "shipping_address_line2",
"shipping_address_city" AS "shipping_city",
"shipping_address_country" AS "shipping_country",
"shipping_address_country_code" AS "shipping_country_code",
"shipping_address_province" AS "shipping_province",
"shipping_address_province_code" AS "shipping_province_code",
"shipping_address_zip" AS "shipping_zip",
"shipping_address_latitude" AS "shipping_latitude",
"shipping_address_longitude" AS "shipping_longitude",
"billing_address_name" AS "billing_full_name",
"billing_address_first_name" AS "billing_first_name",
"billing_address_last_name" AS "billing_last_name",
"billing_address_company" AS "billing_company",
"billing_address_phone" AS "billing_phone",
"billing_address_address_1" AS "billing_address_line1",
"billing_address_address_2" AS "billing_address_line2",
"billing_address_city" AS "billing_city",
"billing_address_country" AS "billing_country",
"billing_address_country_code" AS "billing_country_code",
"billing_address_province" AS "billing_province",
"billing_address_province_code" AS "billing_province_code",
"billing_address_zip" AS "billing_zip",
"billing_address_latitude" AS "billing_latitude",
"billing_address_longitude" AS "billing_longitude",
"customer_id",
"location_id" AS "store_location_id",
"user_id",
"number" AS "order_number",
"order_number" AS "alt_order_number",
"financial_status" AS "payment_status",
"fulfillment_status" AS "shipping_status",
"processed_at",
"processing_method",
"referring_site",
"cancel_reason",
"cancelled_at",
"closed_at",
"total_discounts",
"total_line_items_price",
"total_weight" AS "order_weight",
"source_name" AS "order_source",
"browser_ip" AS "customer_ip",
"buyer_accepts_marketing" AS "marketing_consent",
"token" AS "order_token",
"cart_token",
"checkout_token",
"test" AS "is_test_order",
"landing_site_base_url" AS "landing_page_url"
FROM "shopify_order_data_projected"
),
"shopify_order_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- shipping_full_name: The problem is that the shipping_full_name column contains hashed or encrypted strings instead of human-readable full names. These values are meaningless for practical use and do not represent actual customer names. The correct values should be the decrypted or unhashed full names, but without access to the decryption key or original data, it's impossible to recover the real names. In this case, the best approach is to map these encrypted values to empty strings to indicate that the real names are unknown or unavailable.
-- shipping_first_name: The problem is that the shipping_first_name column contains hashed or encrypted values instead of readable first names. These values are not meaningful or usable as actual names. The correct values should be decrypted first names, but without access to the decryption key, we cannot recover the original names.
-- shipping_last_name: The problem is that both values in the shipping_last_name column appear to be hashed or encrypted strings instead of actual last names. These values are likely placeholders or the result of data anonymization, and do not represent real last names. The correct values should be actual last names, but since we don't have access to the original data, we cannot map these to real names.
-- shipping_company: The problem is that the shipping_company column contains an MD5 hash value instead of an actual shipping company name. MD5 hash 'd41d8cd98f00b204e9800998ecf8427e' is known to be the hash of an empty string, which suggests that this column was likely left empty and then hashed, possibly as a placeholder or due to a data processing error. The correct values should be actual shipping company names, but since we don't have that information, the best approach is to map this to an empty string to indicate missing data.
-- shipping_phone: The problem is that the shipping_phone column contains an MD5 hash value instead of an actual phone number. MD5 hashes are 32-character hexadecimal strings, which is what we see here. This value 'd41d8cd98f00b204e9800998ecf8427e' is actually the MD5 hash of an empty string. It's likely that this hash was used as a placeholder or default value when no phone number was provided. The correct value for a missing phone number should be an empty string or null value, not an MD5 hash.
-- shipping_address_line2: The problem is that the shipping_address_line2 column contains hexadecimal strings instead of typical address information. These values appear to be some kind of hashed or encrypted data rather than actual address details. Since we don't have the means to decrypt these values and they don't represent valid address information, the correct approach is to map them to empty strings.
-- shipping_city: The problem is that the shipping_city column contains hashed or encrypted strings instead of readable city names. These values are meaningless for analysis or human interpretation. Since we don't have a way to decrypt these hashes back to the original city names, and we don't have any additional information about what cities they might represent, the correct approach is to map these to empty strings.
-- shipping_country: The problem is that the shipping_country column contains an encoded or hashed value instead of a proper country name. This value '89f9c9f489be2a83cf57e53b9197d288' appears to be a 32-character hexadecimal string, which is likely the result of a hashing algorithm (possibly MD5). This is unusual because we expect country names to be human-readable text. The correct values should be actual country names, but without additional information or a way to decode this hash, we cannot determine the intended country.
-- shipping_country_code: The problem is that the value '79cba1185463850dedba31f172f1dc5b' is not a valid country code. It appears to be a hash or some form of encoded data rather than a standard 2 or 3 letter country code. Without more context about the data source or what this value is supposed to represent, it's impossible to map it to a correct country code. The correct values for this column should be standard ISO 3166-1 alpha-2 or alpha-3 country codes.
-- shipping_province: The problem is that the shipping_province column contains an MD5 hash value ('d41d8cd98f00b204e9800998ecf8427e') instead of actual province names or abbreviations. This hash value is unusual and meaningless in the context of shipping provinces. The correct values should be actual province names or abbreviations, but since we don't have that information, we should map this to an empty string to indicate missing data.
-- shipping_zip: The problem is that both values in the shipping_zip column are hashed or encrypted strings instead of standard ZIP codes. ZIP codes in the United States are typically 5-digit numbers, sometimes followed by a hyphen and 4 additional digits (ZIP+4 code). The current values are clearly not in this format and appear to be some form of obfuscated data. Since we don't have access to the decryption method or original ZIP codes, we can't map these to actual ZIP codes. The correct approach would be to treat these as invalid or unknown ZIP codes.
-- shipping_latitude: The problem is that both values in the shipping_latitude column are hash-like strings instead of numerical latitude values. Latitude values should typically be decimal numbers ranging from -90 to 90 degrees. These hash-like strings are meaningless for geographical coordinates and appear to be some kind of encoding or error. Without additional information to decode these strings into actual latitude values, the correct approach is to map them to empty strings to indicate missing or invalid data.
-- shipping_longitude: The problem is that the shipping_longitude column contains hashed or encoded strings instead of actual longitude values. Longitude values should be numeric, typically ranging from -180 to 180 degrees. The current values appear to be MD5 hashes or some other form of encoded data, which are not meaningful for geographic coordinates. Since we don't have the actual longitude values and can't decode these hashes, the correct approach is to map these to empty strings to indicate missing data.
-- billing_full_name: The problem is that the billing_full_name column contains encrypted or hashed strings instead of human-readable names. These values are not meaningful for analysis or display purposes. Since we don't have access to the original names and cannot decrypt the hashes, the correct approach is to map these values to empty strings to indicate that the real names are not available.
-- billing_first_name: The problem is that both values in the billing_first_name column are hashed or encrypted strings instead of actual first names. These values are unusable for identifying individuals or for any meaningful analysis. The correct values should be actual first names, but since we don't have access to the original data or the decryption method, we can't recover the real names.
-- billing_last_name: The problem is that the billing_last_name column contains hashed or encrypted strings instead of readable last names. This is unusual because typically last names should be human-readable text, not cryptographic hashes. The correct values should be the actual last names of the customers, but since we don't have access to the decryption method or original data, we can't recover the real names. In this case, it's best to map these values to an empty string to indicate that the real last name is not available.
-- billing_company: The problem is that the billing_company column contains an MD5 hash value instead of a recognizable company name. This hash value ('d41d8cd98f00b204e9800998ecf8427e') is actually the MD5 hash of an empty string. This suggests that the column was likely encrypted or hashed for data protection purposes, or it's being used as a placeholder for missing data. The correct value in this case should be an empty string, as the hash represents no data.
-- billing_phone: The problem is that the billing_phone column contains an MD5 hash value instead of an actual phone number. The value 'd41d8cd98f00b204e9800998ecf8427e' is the MD5 hash of an empty string. This suggests that the phone numbers were hashed for privacy reasons, or there was an error in data processing that resulted in hashing empty values. The correct values should be actual phone numbers, but since we don't have access to the original data, we can't reconstruct them. In this case, it's best to map the hash to an empty string to represent missing or unknown data.
-- billing_address_line1: The problem is that both values in the billing_address_line1 column appear to be hashed or encrypted strings rather than readable address information. This is unusual because billing addresses are typically stored as plain text for practical use. The correct values should be the actual billing address lines, but since we don't have access to the original data or decryption method, we cannot recover the true addresses.
-- billing_address_line2: The problem is that both values in the billing_address_line2 column appear to be hashed or encrypted data rather than readable text for address lines. This suggests that the data has been obfuscated, possibly for privacy reasons. However, address line 2 is typically optional and often left blank. Since we cannot decrypt or reverse the hashing to obtain the original values, and address line 2 is commonly empty, the most appropriate action is to map these unusual values to an empty string.
-- billing_city: The problem is that the billing_city column contains hashed or encrypted strings instead of readable city names. These values are not meaningful for analysis or display purposes. Since we don't have access to the decryption key or the original city names, we cannot map these to actual city names. The correct approach would be to treat these as unknown or invalid data.
-- billing_country: The problem is that the billing_country column contains a single value that appears to be a 32-character alphanumeric hash instead of an actual country name. This is highly unusual and incorrect for a country field. The correct values should be actual country names or codes.
-- billing_country_code: The problem is that the value '79cba1185463850dedba31f172f1dc5b' is not a valid country code. It appears to be a hash or some form of encoded data, which is not appropriate for a country code field. Country codes are typically 2 or 3 letter abbreviations (e.g., 'US' for the United States, 'GB' for the United Kingdom). Since we don't have any information about what this value is supposed to represent or what the correct country code should be, we can't map it to a valid country code. In this case, it's best to map it to an empty string to indicate missing or invalid data.
-- billing_province: The problem is that the billing_province column contains a hash-like string instead of readable province names. The value 'd41d8cd98f00b204e9800998ecf8427e' is unusual because it appears to be an MD5 hash (the MD5 hash of an empty string), and MD5 is typically used for checksums or data verification, not for representing geographical locations. This value is meaningless in the context of a province name. The correct values should be actual province names or an empty string if the information is not available.
-- billing_zip: The problem is that both values in the billing_zip column are unusual because they are long alphanumeric strings, not standard ZIP code formats. ZIP codes in the United States are typically 5 digits, or sometimes 9 digits (ZIP+4 format). These values appear to be hashed or encrypted data, possibly due to a data processing error or security measure. Since we don't have access to the original ZIP codes and can't decode these values, we can't map them to correct ZIP codes. The most appropriate action is to map these unusual values to an empty string, indicating that the true ZIP code is unknown or unavailable.
-- billing_latitude: The problem is that the billing_latitude column contains hashed strings instead of numerical latitude values. Latitude values should typically be decimal numbers between -90 and 90 degrees. The hashed strings are meaningless in the context of geographical coordinates and cannot be directly converted to valid latitudes. Since we don't have access to the original data or the hashing algorithm, we cannot recover the actual latitude values.
-- billing_longitude: The problem is that both values in the billing_longitude column appear to be hashed or encrypted data rather than actual longitude values. Longitude values should be numeric, typically ranging from -180 to 180 degrees. The current values are clearly not valid longitude coordinates. Since we don't have the key to decrypt these values or any way to determine the actual longitudes they represent, the correct approach is to map them to empty strings to indicate missing data.
-- order_source: The problem is that '294517' is a numeric string that doesn't clearly represent an order source. It's unusual because it doesn't provide any meaningful information about the source of the order, unlike 'web' which is a clear and common order source. The correct values should all be descriptive of the order source, with 'web' being the only valid value in this dataset.
SELECT
"order_id",
"order_notes",
"customer_email",
"taxes_included",
"currency",
"subtotal_price",
"order_tax",
"order_total",
"created_at",
"last_updated",
"order_identifier",
CASE
WHEN "shipping_full_name" = 'c8189c7add9755e66391b58ecc12b3e2' THEN ''
WHEN "shipping_full_name" = '8b121314a4d97bc9dc15bfba8518ec88' THEN ''
ELSE "shipping_full_name"
END AS "shipping_full_name",
CASE
WHEN "shipping_first_name" = 'd3bae70c9d49bb7cb5a74cdd0eae7fc4' THEN ''
WHEN "shipping_first_name" = 'f0962b7a185488ecb752cedac1038349' THEN ''
ELSE "shipping_first_name"
END AS "shipping_first_name",
CASE
WHEN "shipping_last_name" = '0dd89cff60965dff8f9ea2bc952a5474' THEN ''
WHEN "shipping_last_name" = 'aa35cb67c26e64bb81a1bf3f17e858ba' THEN ''
ELSE "shipping_last_name"
END AS "shipping_last_name",
CASE
WHEN "shipping_company" = 'd41d8cd98f00b204e9800998ecf8427e' THEN ''
ELSE "shipping_company"
END AS "shipping_company",
CASE
WHEN "shipping_phone" = 'd41d8cd98f00b204e9800998ecf8427e' THEN ''
ELSE "shipping_phone"
END AS "shipping_phone",
"shipping_address_line1",
CASE
WHEN "shipping_address_line2" = '70111f8840ccbd8b1007cc3f387ced6b' THEN ''
WHEN "shipping_address_line2" = 'bc9b8576178dcd886639ba718f1d45c8' THEN ''
ELSE "shipping_address_line2"
END AS "shipping_address_line2",
CASE
WHEN "shipping_city" = '1ac412baeba98370017c73df41c98a07' THEN ''
WHEN "shipping_city" = 'ac08c606d455cde42980f980524a8038' THEN ''
ELSE "shipping_city"
END AS "shipping_city",
CASE
WHEN "shipping_country" = '89f9c9f489be2a83cf57e53b9197d288' THEN ''
ELSE "shipping_country"
END AS "shipping_country",
CASE
WHEN "shipping_country_code" = '79cba1185463850dedba31f172f1dc5b' THEN ''
ELSE "shipping_country_code"
END AS "shipping_country_code",
CASE
WHEN "shipping_province" = 'd41d8cd98f00b204e9800998ecf8427e' THEN ''
ELSE "shipping_province"
END AS "shipping_province",
"shipping_province_code",
CASE
WHEN "shipping_zip" = '2357e65b582faa0a2da3603b16fa4a7f' THEN ''
WHEN "shipping_zip" = '00079ce435afddc28205639142773870' THEN ''
ELSE "shipping_zip"
END AS "shipping_zip",
CASE
WHEN "shipping_latitude" = '75c29d6dd29594a652fcbd7c4c279a29' THEN ''
WHEN "shipping_latitude" = 'd97319f64674c02595f2989019970fc8' THEN ''
ELSE "shipping_latitude"
END AS "shipping_latitude",
CASE
WHEN "shipping_longitude" = '75468fbebc28e02ec5d4f54f4cbd4099' THEN ''
WHEN "shipping_longitude" = 'c08dae474c5d4d3326fd6764d2a0ebe6' THEN ''
ELSE "shipping_longitude"
END AS "shipping_longitude",
CASE
WHEN "billing_full_name" = 'c8189c7add9755e66391b58ecc12b3e2' THEN ''
WHEN "billing_full_name" = '8b121314a4d97bc9dc15bfba8518ec88' THEN ''
ELSE "billing_full_name"
END AS "billing_full_name",
CASE
WHEN "billing_first_name" = 'd3bae70c9d49bb7cb5a74cdd0eae7fc4' THEN ''
WHEN "billing_first_name" = 'f0962b7a185488ecb752cedac1038349' THEN ''
ELSE "billing_first_name"
END AS "billing_first_name",
CASE
WHEN "billing_last_name" = '0dd89cff60965dff8f9ea2bc952a5474' THEN ''
WHEN "billing_last_name" = 'aa35cb67c26e64bb81a1bf3f17e858ba' THEN ''
ELSE "billing_last_name"
END AS "billing_last_name",
CASE
WHEN "billing_company" = 'd41d8cd98f00b204e9800998ecf8427e' THEN ''
ELSE "billing_company"
END AS "billing_company",
CASE
WHEN "billing_phone" = 'd41d8cd98f00b204e9800998ecf8427e' THEN ''
ELSE "billing_phone"
END AS "billing_phone",
CASE
WHEN "billing_address_line1" = '1ff1de774005f8da13f42943881c655f' THEN ''
WHEN "billing_address_line1" = 'd6f4a399883df85d9d4b3a02bf6e738a' THEN ''
ELSE "billing_address_line1"
END AS "billing_address_line1",
CASE
WHEN "billing_address_line2" = '70111f8840ccbd8b1007cc3f387ced6b' THEN ''
WHEN "billing_address_line2" = 'bc9b8576178dcd886639ba718f1d45c8' THEN ''
ELSE "billing_address_line2"
END AS "billing_address_line2",
CASE
WHEN "billing_city" = '1ac412baeba98370017c73df41c98a07' THEN 'UNKNOWN'
WHEN "billing_city" = 'ac08c606d455cde42980f980524a8038' THEN 'UNKNOWN'
ELSE "billing_city"
END AS "billing_city",
CASE
WHEN "billing_country" = '89f9c9f489be2a83cf57e53b9197d288' THEN ''
ELSE "billing_country"
END AS "billing_country",
CASE
WHEN "billing_country_code" = '79cba1185463850dedba31f172f1dc5b' THEN ''
ELSE "billing_country_code"
END AS "billing_country_code",
CASE
WHEN "billing_province" = 'd41d8cd98f00b204e9800998ecf8427e' THEN ''
ELSE "billing_province"
END AS "billing_province",
"billing_province_code",
CASE
WHEN "billing_zip" = '2357e65b582faa0a2da3603b16fa4a7f' THEN ''
WHEN "billing_zip" = '00079ce435afddc28205639142773870' THEN ''
ELSE "billing_zip"
END AS "billing_zip",
CASE
WHEN "billing_latitude" = '75c29d6dd29594a652fcbd7c4c279a29' THEN ''
WHEN "billing_latitude" = 'd97319f64674c02595f2989019970fc8' THEN ''
ELSE "billing_latitude"
END AS "billing_latitude",
CASE
WHEN "billing_longitude" = '75468fbebc28e02ec5d4f54f4cbd4099' THEN ''
WHEN "billing_longitude" = 'c08dae474c5d4d3326fd6764d2a0ebe6' THEN ''
ELSE "billing_longitude"
END AS "billing_longitude",
"customer_id",
"store_location_id",
"user_id",
"order_number",
"alt_order_number",
"payment_status",
"shipping_status",
"processed_at",
"processing_method",
"referring_site",
"cancel_reason",
"cancelled_at",
"closed_at",
"total_discounts",
"total_line_items_price",
"order_weight",
CASE
WHEN "order_source" = '294517' THEN ''
ELSE "order_source"
END AS "order_source",
"customer_ip",
"marketing_consent",
"order_token",
"cart_token",
"checkout_token",
"is_test_order",
"landing_page_url"
FROM "shopify_order_data_projected_renamed"
),
"shopify_order_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- shipping_full_name: ['']
-- shipping_first_name: ['']
-- shipping_last_name: ['']
-- shipping_company: ['']
-- shipping_phone: ['']
-- shipping_address_line2: ['']
-- shipping_city: ['']
-- shipping_country: ['']
-- shipping_country_code: ['']
-- shipping_province: ['']
-- shipping_zip: ['']
-- shipping_latitude: ['']
-- shipping_longitude: ['']
-- billing_full_name: ['']
-- billing_first_name: ['']
-- billing_last_name: ['']
-- billing_company: ['']
-- billing_phone: ['']
-- billing_address_line1: ['']
-- billing_address_line2: ['']
-- billing_city: ['UNKNOWN']
-- billing_country: ['']
-- billing_country_code: ['']
-- billing_province: ['']
-- billing_zip: ['']
-- billing_latitude: ['']
-- billing_longitude: ['']
-- order_source: ['']
SELECT
CASE
WHEN "shipping_full_name" = '' THEN NULL
ELSE "shipping_full_name"
END AS "shipping_full_name",
CASE
WHEN "shipping_first_name" = '' THEN NULL
ELSE "shipping_first_name"
END AS "shipping_first_name",
CASE
WHEN "shipping_last_name" = '' THEN NULL
ELSE "shipping_last_name"
END AS "shipping_last_name",
CASE
WHEN "shipping_company" = '' THEN NULL
ELSE "shipping_company"
END AS "shipping_company",
CASE
WHEN "shipping_phone" = '' THEN NULL
ELSE "shipping_phone"
END AS "shipping_phone",
CASE
WHEN "shipping_address_line2" = '' THEN NULL
ELSE "shipping_address_line2"
END AS "shipping_address_line2",
CASE
WHEN "shipping_city" = '' THEN NULL
ELSE "shipping_city"
END AS "shipping_city",
CASE
WHEN "shipping_country" = '' THEN NULL
ELSE "shipping_country"
END AS "shipping_country",
CASE
WHEN "shipping_country_code" = '' THEN NULL
ELSE "shipping_country_code"
END AS "shipping_country_code",
CASE
WHEN "shipping_province" = '' THEN NULL
ELSE "shipping_province"
END AS "shipping_province",
CASE
WHEN "shipping_zip" = '' THEN NULL
ELSE "shipping_zip"
END AS "shipping_zip",
CASE
WHEN "shipping_latitude" = '' THEN NULL
ELSE "shipping_latitude"
END AS "shipping_latitude",
CASE
WHEN "shipping_longitude" = '' THEN NULL
ELSE "shipping_longitude"
END AS "shipping_longitude",
CASE
WHEN "billing_full_name" = '' THEN NULL
ELSE "billing_full_name"
END AS "billing_full_name",
CASE
WHEN "billing_first_name" = '' THEN NULL
ELSE "billing_first_name"
END AS "billing_first_name",
CASE
WHEN "billing_last_name" = '' THEN NULL
ELSE "billing_last_name"
END AS "billing_last_name",
CASE
WHEN "billing_company" = '' THEN NULL
ELSE "billing_company"
END AS "billing_company",
CASE
WHEN "billing_phone" = '' THEN NULL
ELSE "billing_phone"
END AS "billing_phone",
CASE
WHEN "billing_address_line1" = '' THEN NULL
ELSE "billing_address_line1"
END AS "billing_address_line1",
CASE
WHEN "billing_address_line2" = '' THEN NULL
ELSE "billing_address_line2"
END AS "billing_address_line2",
CASE
WHEN "billing_city" = 'UNKNOWN' THEN NULL
ELSE "billing_city"
END AS "billing_city",
CASE
WHEN "billing_country" = '' THEN NULL
ELSE "billing_country"
END AS "billing_country",
CASE
WHEN "billing_country_code" = '' THEN NULL
ELSE "billing_country_code"
END AS "billing_country_code",
CASE
WHEN "billing_province" = '' THEN NULL
ELSE "billing_province"
END AS "billing_province",
CASE
WHEN "billing_zip" = '' THEN NULL
ELSE "billing_zip"
END AS "billing_zip",
CASE
WHEN "billing_latitude" = '' THEN NULL
ELSE "billing_latitude"
END AS "billing_latitude",
CASE
WHEN "billing_longitude" = '' THEN NULL
ELSE "billing_longitude"
END AS "billing_longitude",
CASE
WHEN "order_source" = '' THEN NULL
ELSE "order_source"
END AS "order_source",
"shipping_province_code",
"user_id",
"referring_site",
"payment_status",
"order_number",
"order_identifier",
"cancel_reason",
"order_token",
"order_notes",
"total_discounts",
"subtotal_price",
"order_id",
"landing_page_url",
"last_updated",
"billing_province_code",
"total_line_items_price",
"order_weight",
"closed_at",
"customer_ip",
"checkout_token",
"customer_email",
"customer_id",
"currency",
"order_tax",
"order_total",
"cancelled_at",
"processed_at",
"taxes_included",
"is_test_order",
"alt_order_number",
"shipping_address_line1",
"shipping_status",
"processing_method",
"cart_token",
"created_at",
"store_location_id",
"marketing_consent"
FROM "shopify_order_data_projected_renamed_cleaned"
),
"shopify_order_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- alt_order_number: from INT to VARCHAR
-- billing_latitude: from VARCHAR to DECIMAL
-- billing_longitude: from VARCHAR to DECIMAL
-- billing_province_code: from DECIMAL to VARCHAR
-- cancel_reason: from DECIMAL to VARCHAR
-- cancelled_at: from DECIMAL to TIMESTAMP
-- closed_at: from VARCHAR to TIMESTAMP
-- created_at: from VARCHAR to TIMESTAMP
-- customer_id: from INT to VARCHAR
-- last_updated: from VARCHAR to TIMESTAMP
-- order_id: from INT to VARCHAR
-- order_tax: from INT to DECIMAL
-- order_weight: from INT to DECIMAL
-- processed_at: from VARCHAR to TIMESTAMP
-- shipping_latitude: from VARCHAR to DECIMAL
-- shipping_longitude: from VARCHAR to DECIMAL
-- shipping_province_code: from DECIMAL to VARCHAR
-- store_location_id: from DECIMAL to VARCHAR
-- user_id: from DECIMAL to VARCHAR
SELECT
"shipping_full_name",
"shipping_first_name",
"shipping_last_name",
"shipping_company",
"shipping_phone",
"shipping_address_line2",
"shipping_city",
"shipping_country",
"shipping_country_code",
"shipping_province",
"shipping_zip",
"billing_full_name",
"billing_first_name",
"billing_last_name",
"billing_company",
"billing_phone",
"billing_address_line1",
"billing_address_line2",
"billing_city",
"billing_country",
"billing_country_code",
"billing_province",
"billing_zip",
"order_source",
"referring_site",
"payment_status",
"order_number",
"order_identifier",
"order_token",
"order_notes",
"total_discounts",
"subtotal_price",
"landing_page_url",
"total_line_items_price",
"customer_ip",
"checkout_token",
"customer_email",
"currency",
"order_total",
"taxes_included",
"is_test_order",
"shipping_address_line1",
"shipping_status",
"processing_method",
"cart_token",
"marketing_consent",
CAST("alt_order_number" AS VARCHAR) AS "alt_order_number",
CAST("billing_latitude" AS DECIMAL) AS "billing_latitude",
CAST("billing_longitude" AS DECIMAL) AS "billing_longitude",
CAST("billing_province_code" AS VARCHAR) AS "billing_province_code",
CAST("cancel_reason" AS VARCHAR) AS "cancel_reason",
CAST("cancelled_at" AS TIMESTAMP) AS "cancelled_at",
CAST("closed_at" AS TIMESTAMP) AS "closed_at",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("customer_id" AS VARCHAR) AS "customer_id",
CAST("last_updated" AS TIMESTAMP) AS "last_updated",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("order_tax" AS DECIMAL) AS "order_tax",
CAST("order_weight" AS DECIMAL) AS "order_weight",
CAST("processed_at" AS TIMESTAMP) AS "processed_at",
CAST("shipping_latitude" AS DECIMAL) AS "shipping_latitude",
CAST("shipping_longitude" AS DECIMAL) AS "shipping_longitude",
CAST("shipping_province_code" AS VARCHAR) AS "shipping_province_code",
CAST("store_location_id" AS VARCHAR) AS "store_location_id",
CAST("user_id" AS VARCHAR) AS "user_id"
FROM "shopify_order_data_projected_renamed_cleaned_null"
),
"shopify_order_data_projected_renamed_cleaned_null_casted_missing_handled" AS (
-- Handling missing values: There are 23 columns with unacceptable missing values
-- cart_token has 33.33 percent missing. Strategy: 🔄 Unchanged
-- checkout_token has 33.33 percent missing. Strategy: 🔄 Unchanged
-- closed_at has 33.33 percent missing. Strategy: 🔄 Unchanged
-- customer_ip has 33.33 percent missing. Strategy: 🔄 Unchanged
-- landing_page_url has 33.33 percent missing. Strategy: 🔄 Unchanged
-- order_notes has 66.67 percent missing. Strategy: 🔄 Unchanged
-- order_source has 33.33 percent missing. Strategy: 🔄 Unchanged
-- processing_method has 33.33 percent missing. Strategy: 🔄 Unchanged
-- referring_site has 33.33 percent missing. Strategy: 🔄 Unchanged
-- shipping_city has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_country has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_country_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_first_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_full_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_last_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_latitude has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_longitude has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_phone has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_province has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_province_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- shipping_zip has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- store_location_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- user_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"shipping_company",
"shipping_address_line2",
"billing_full_name",
"billing_first_name",
"billing_last_name",
"billing_company",
"billing_phone",
"billing_address_line1",
"billing_address_line2",
"billing_city",
"billing_country",
"billing_country_code",
"billing_province",
"billing_zip",
"order_source",
"referring_site",
"payment_status",
"order_number",
"order_identifier",
"order_token",
"order_notes",
"total_discounts",
"subtotal_price",
"landing_page_url",
"total_line_items_price",
"customer_ip",
"checkout_token",
"customer_email",
"currency",
"order_total",
"taxes_included",
"is_test_order",
"shipping_address_line1",
"shipping_status",
"processing_method",
"cart_token",
"marketing_consent",
"alt_order_number",
"billing_latitude",
"billing_longitude",
"billing_province_code",
"cancel_reason",
"cancelled_at",
"closed_at",
"created_at",
"customer_id",
"last_updated",
"order_id",
"order_tax",
"order_weight",
"processed_at"
FROM "shopify_order_data_projected_renamed_cleaned_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_data_projected_renamed_cleaned_null_casted_missing_handled"
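The cleaning step above lists every hash-like value in its own CASE branch. The following is a minimal Python sketch, not part of the generated model, of how such values could be recognized by shape instead. It assumes the disguised values are 32-character lowercase hexadecimal strings (MD5-style digests); the names `looks_like_md5` and `blank_hashed` are hypothetical. It also confirms that 'd41d8cd98f00b204e9800998ecf8427e' is the MD5 digest of the empty string.

```python
import hashlib
import re

# Assumption: disguised values are 32-character lowercase hex strings (MD5-style digests).
MD5_PATTERN = re.compile(r"[0-9a-f]{32}")

# Digest of the empty string; it appears in shipping_company, shipping_phone
# and the province columns of the model above.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def looks_like_md5(value):
    """Return True when value is a string shaped like an MD5 hex digest."""
    return isinstance(value, str) and MD5_PATTERN.fullmatch(value) is not None


def blank_hashed(value):
    """Map hash-like values to None and leave everything else unchanged."""
    return None if looks_like_md5(value) else value


print(EMPTY_MD5)
print(blank_hashed("c8189c7add9755e66391b58ecc12b3e2"))
print(blank_hashed("web"))
```

A shape-based rule like this should only be applied to columns that are expected to hold human-readable values (names, cities, coordinates). Columns such as order_token or cart_token may legitimately contain hex strings and would be blanked by mistake.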
stg_shopify_order_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_data
description: The table is about Shopify orders. It includes order details like ID,
total price, currency, and timestamps. Customer information such as email and
shipping/billing addresses is provided. Order status, payment details, and fulfillment
information are also included. Each row represents a unique order with its associated
data.
columns:
- name: shipping_company
description: Company name in shipping address
cocoon_meta:
missing_acceptable: No company associated with this shipping address
- name: shipping_address_line2
description: Second line of shipping address
cocoon_meta:
missing_acceptable: No secondary shipping address line needed
- name: billing_full_name
description: Full name in billing address
cocoon_meta:
missing_acceptable: No billing name provided for this order
- name: billing_first_name
description: First name in billing address
cocoon_meta:
missing_acceptable: No billing name provided for this order
- name: billing_last_name
description: Last name in billing address
cocoon_meta:
missing_acceptable: No billing name provided for this order
- name: billing_company
description: Company name in billing address
cocoon_meta:
missing_acceptable: No company associated with this billing address
- name: billing_phone
description: Phone number in billing address
cocoon_meta:
missing_acceptable: No billing phone number provided for this order
- name: billing_address_line1
description: First line of billing address
cocoon_meta:
missing_acceptable: No billing address provided for this order
- name: billing_address_line2
description: Second line of billing address
cocoon_meta:
missing_acceptable: No secondary billing address line needed
- name: billing_city
description: City of billing address
cocoon_meta:
missing_acceptable: No billing city provided for this order
- name: billing_country
description: Country of billing address
cocoon_meta:
missing_acceptable: No billing country provided for this order
- name: billing_country_code
description: Country code of billing address
cocoon_meta:
missing_acceptable: No billing country code provided for this order
- name: billing_province
description: Province or state in billing address
cocoon_meta:
missing_acceptable: No billing province/state provided for this order
- name: billing_zip
description: Zip or postal code of billing address
cocoon_meta:
missing_acceptable: No billing zip/postal code provided for this order
- name: order_source
description: Source of the order
tests:
- not_null
- accepted_values:
values:
- web
- mobile_app
- phone
- in_store
- email
- fax
- mail_order
- social_media
- third_party_marketplace
- kiosk
- voice_assistant
- sms
- chatbot
- name: referring_site
description: Website that referred the order
tests:
- not_null
- name: payment_status
description: Payment status of the order
tests:
- not_null
- accepted_values:
values:
- paid
- pending
- failed
- refunded
- cancelled
- partially_paid
- name: order_number
description: Order number
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each order. For this
table, each row is for a distinct order, and order_number appears to be unique
across rows.
- name: order_identifier
description: Order name or identifier
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column seems to be a unique identifier for each order, possibly
in a different format. For this table, each row represents a distinct order,
and order_identifier appears to be unique across rows.
- name: order_token
description: Unique token for the order
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column appears to be a unique token associated with each order.
For this table, each row represents a distinct order, and order_token seems
to be unique across rows.
- name: order_notes
description: Additional notes for the order
tests:
- not_null
- name: total_discounts
description: Total discounts applied to the order
tests:
- not_null
- name: subtotal_price
description: Subtotal price of the order
tests:
- not_null
- name: landing_page_url
description: Base URL of the landing page
tests:
- not_null
- name: total_line_items_price
description: Total price of all line items
tests:
- not_null
- name: customer_ip
description: IP address of customer's browser
tests:
- not_null
- name: checkout_token
description: Unique identifier for checkout process
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for the checkout process.
For this table, each row represents a distinct order, and checkout_token appears
to be unique across rows.
- name: customer_email
description: Customer's email address
tests:
- not_null
- name: currency
description: Currency used for the order
tests:
- not_null
- accepted_values:
values:
- GBP
- USD
- EUR
- JPY
- CHF
- CAD
- AUD
- CNY
- HKD
- NZD
- SEK
- NOK
- DKK
- SGD
- MXN
- INR
- BRL
- ZAR
- RUB
- TRY
- name: order_total
description: Total price of the order
tests:
- not_null
- name: taxes_included
description: Indicates if taxes are included in price
tests:
- not_null
- name: is_test_order
description: Indicates if this is a test order
tests:
- not_null
- name: shipping_address_line1
description: First line of shipping address
tests:
- not_null
- name: shipping_status
description: Shipping status of the order
tests:
- accepted_values:
values:
- fulfilled
- pending
- shipped
- delivered
- cancelled
- returned
- processing
- on hold
- backordered
- partial
cocoon_meta:
missing_acceptable: Not applicable for orders that haven't been shipped yet.
- name: processing_method
description: Method used to process the order
tests:
- not_null
- accepted_values:
values:
- direct
- online
- phone
- mail
- in-store
- fax
- email
- mobile app
- third-party marketplace
- social media
- voice assistant
- text message
- name: cart_token
description: Unique identifier for shopping cart
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for the shopping cart.
For this table, each row represents a distinct order, and cart_token appears
to be unique across rows.
- name: marketing_consent
description: Customer's marketing preferences
tests:
- not_null
- name: alt_order_number
description: Alternative order number
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column seems to be an alternative order number. For this table,
each row represents a distinct order, and alt_order_number appears to be unique
across rows.
- name: billing_latitude
description: Latitude of billing address
cocoon_meta:
missing_acceptable: No billing location coordinates provided
- name: billing_longitude
description: Longitude of billing address
cocoon_meta:
missing_acceptable: No billing location coordinates provided
- name: billing_province_code
description: Province or state code in billing address
cocoon_meta:
missing_acceptable: No billing province/state code provided for this order
- name: cancel_reason
description: Reason for order cancellation
cocoon_meta:
missing_acceptable: Order not cancelled
- name: cancelled_at
description: Timestamp of order cancellation
cocoon_meta:
missing_acceptable: Order not cancelled
- name: closed_at
description: Timestamp when order was closed
tests:
- not_null
- name: created_at
description: Timestamp when order was created
tests:
- not_null
- name: customer_id
description: Unique identifier for the user who placed the order
tests:
- not_null
- name: last_updated
description: Timestamp of the last update to the order
tests:
- not_null
- name: order_id
description: Unique identifier for the order
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is a unique identifier for the order. For this table,
each row represents a unique order. Order IDs are typically designed to be
unique for each order in e-commerce systems. Based on the sample data, we
can see that each row has a different order_id, suggesting it's unique across
all rows.
- name: order_tax
description: Total tax amount for the order
tests:
- not_null
- name: order_weight
description: Total weight of the order
tests:
- not_null
- name: processed_at
description: Timestamp of order processing
tests:
- not_null
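The tests declared in the YAML above amount to simple assertions over the rows of the model. The sketch below restates `not_null`, `unique` and `accepted_values` in plain Python, assuming (as in dbt's built-in generic tests) that `unique` and `accepted_values` ignore NULLs. The function names and the sample rows are hypothetical and only illustrate the checks; they are not how dbt executes them.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Rows that a not_null test would flag: the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows, column):
    """Values that occur more than once; None is ignored."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


def accepted_values_failures(rows, column, accepted):
    """Distinct non-null values that fall outside the accepted list."""
    found = {row[column] for row in rows if row.get(column) is not None}
    return sorted(found - set(accepted))


# Hypothetical rows: the second order_source was set to NULL by the cleaning step.
rows = [
    {"order_id": "1001", "order_source": "web", "currency": "USD"},
    {"order_id": "1002", "order_source": None, "currency": "USD"},
    {"order_id": "1003", "order_source": "web", "currency": "GBP"},
]

print(len(not_null_failures(rows, "order_source")))
print(unique_failures(rows, "order_id"))
print(accepted_values_failures(rows, "currency", ["GBP", "USD", "EUR"]))
```

With one NULL in three rows, the sample matches the 33.33 percent missing rate reported for order_source in the SQL model, so a `not_null` test on that column would be expected to flag a row on data like this.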
stg_shopify_product_image_data (first 100 rows)
(index) | is_deleted | is_default | image_url | product_id | image_id | display_order | height | width | created_at | updated_at | variant_ids |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | False | False | https://cdn.shopify.com/s/files/glassess-1784103173.jpg?v=1560398767 | 38804 | 14180 | 4 | 1200 | 956 | 2019-06-13 04:06:07 | 2019-06-13 04:06:07 | NaN |
1 | False | False | https://cdn.shopify.com/s/files/1/smile.jpg?v=1560398767 | 34804 | 748644 | 2 | 1200 | 956 | 2019-06-13 04:06:07 | 2019-06-13 04:06:07 | NaN |
2 | False | False | https://cdn.shopify.com/s/files/1/kitten.jpg?v=1560398767 | 34604 | 679716 | 6 | 1200 | 956 | 2019-06-13 04:06:07 | 2019-06-13 04:06:07 | [None, 27559733, 275597338, 275597536, None, 2755973, None] |
stg_shopify_product_image_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_product_image_data_projected" AS (
-- Projection: Selecting 12 out of 13 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"product_id",
"_fivetran_deleted",
"alt",
"created_at",
"height",
"position_",
"src",
"updated_at",
"width",
"is_default",
"variant_ids"
FROM "shopify_product_image_data"
),
"shopify_product_image_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> image_id
-- _fivetran_deleted -> is_deleted
-- alt -> alt_text
-- position_ -> display_order
-- src -> image_url
SELECT
"id" AS "image_id",
"product_id",
"_fivetran_deleted" AS "is_deleted",
"alt" AS "alt_text",
"created_at",
"height",
"position_" AS "display_order",
"src" AS "image_url",
"updated_at",
"width",
"is_default",
"variant_ids"
FROM "shopify_product_image_data_projected"
),
"shopify_product_image_data_projected_renamed_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- variant_ids: ['[]']
SELECT
CASE
WHEN "variant_ids" = '[]' THEN NULL
ELSE "variant_ids"
END AS "variant_ids",
"updated_at",
"is_deleted",
"is_default",
"image_url",
"product_id",
"image_id",
"alt_text",
"display_order",
"height",
"created_at",
"width"
FROM "shopify_product_image_data_projected_renamed"
),
"shopify_product_image_data_projected_renamed_null_casted" AS (
-- Column Type Casting:
-- alt_text: from DECIMAL to VARCHAR
-- created_at: from VARCHAR to TIMESTAMP
-- updated_at: from VARCHAR to TIMESTAMP
-- variant_ids: from VARCHAR to ARRAY
SELECT
"is_deleted",
"is_default",
"image_url",
"product_id",
"image_id",
"display_order",
"height",
"width",
CAST("alt_text" AS VARCHAR) AS "alt_text",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("updated_at" AS TIMESTAMP) AS "updated_at",
from_json("variant_ids", '["INTEGER"]') AS "variant_ids"
FROM "shopify_product_image_data_projected_renamed_null"
),
"shopify_product_image_data_projected_renamed_null_casted_missing_handled" AS (
-- Handling missing values: There is 1 column with unacceptable missing values
-- alt_text has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"is_deleted",
"is_default",
"image_url",
"product_id",
"image_id",
"display_order",
"height",
"width",
"created_at",
"updated_at",
"variant_ids"
FROM "shopify_product_image_data_projected_renamed_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_product_image_data_projected_renamed_null_casted_missing_handled"
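The NULL imputation and casting steps for variant_ids in the model above can be mirrored outside SQL. This is a minimal Python sketch, assuming the raw column holds a JSON array string; the function name `parse_variant_ids` is hypothetical and this is not the implementation behind the `from_json` call.

```python
import json


def parse_variant_ids(raw):
    """Turn a JSON array string into a list of ints; '[]' and None become None."""
    # Mirrors the NULL imputation step: '[]' is treated as a disguised missing value.
    if raw is None or raw == "[]":
        return None
    values = json.loads(raw)
    # Keep JSON nulls as None so that positions in the array are preserved.
    return [None if v is None else int(v) for v in values]


print(parse_variant_ids("[]"))
print(parse_variant_ids("[null, 27559733, 275597338]"))
```

Keeping the nulls inside the array matches the preview row, where variant_ids contains None entries next to the integer IDs.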
stg_shopify_product_image_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_product_image_data
description: The table is about Shopify product images. It contains image details
such as ID, product ID, creation date, dimensions, URL, and position. Each row
represents one image. The table includes information on whether the image is default
and which product variants it's associated with. It also tracks if the image has
been deleted from the system.
columns:
- name: is_deleted
description: Indicates if the image has been deleted
tests:
- not_null
- name: is_default
description: Indicates if this is the default product image
tests:
- not_null
- name: image_url
description: URL source of the image
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains the URL source of the image. For this table,
each row represents a unique image. The image_url is likely to be unique across
rows as it points to a specific image file on Shopify's CDN.
- name: product_id
description: ID of the product associated with the image
tests:
- not_null
- name: image_id
description: Unique identifier for the image
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains a unique identifier for the image. For this
table, each row represents a unique image, and the image_id is designed to
be a unique identifier for each image.
- name: display_order
description: Order of the image in product gallery
tests:
- not_null
- name: height
description: Height of the image in pixels
tests:
- not_null
- name: width
description: Width of the image in pixels
tests:
- not_null
- name: created_at
description: Timestamp when the image was created
tests:
- not_null
- name: updated_at
description: Timestamp when the image was last updated
tests:
- not_null
- name: variant_ids
description: List of product variant IDs associated with the image
cocoon_meta:
missing_acceptable: Not all products have variants or multiple versions.
stg_shopify_tender_transaction_data (first 100 rows)
(index) | transaction_amount | currency | payment_method | is_test_transaction | credit_card_company | order_id | processing_timestamp | transaction_id |
---|---|---|---|---|---|---|---|---|
0 | 2895.74 | USD | other | False | None | 45379 | 2022-11-30 18:14:37 | 34283 |
1 | 5900.75 | USD | other | False | None | 45243 | 2022-12-01 02:00:39 | 905707 |
2 | -164.72 | USD | other | False | None | 4559467 | 2022-11-30 14:29:13 | 411 |
3 | 5180.19 | USD | other | False | None | 35 | 2022-11-30 23:55:45 | 55179 |
4 | 3004.30 | USD | other | False | None | 45955 | 2022-12-01 02:09:47 | 16923 |
stg_shopify_tender_transaction_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_tender_transaction_data_projected" AS (
-- Projection: Selecting 11 out of 12 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"amount",
"currency",
"order_id",
"payment_details_credit_card_company",
"payment_details_credit_card_number",
"payment_method",
"processed_at",
"remote_reference",
"test",
"user_id"
FROM "shopify_tender_transaction_data"
),
"shopify_tender_transaction_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> transaction_id
-- amount -> transaction_amount
-- payment_details_credit_card_company -> credit_card_company
-- payment_details_credit_card_number -> masked_card_number
-- processed_at -> processing_timestamp
-- remote_reference -> external_reference
-- test -> is_test_transaction
SELECT
"id" AS "transaction_id",
"amount" AS "transaction_amount",
"currency",
"order_id",
"payment_details_credit_card_company" AS "credit_card_company",
"payment_details_credit_card_number" AS "masked_card_number",
"payment_method",
"processed_at" AS "processing_timestamp",
"remote_reference" AS "external_reference",
"test" AS "is_test_transaction",
"user_id"
FROM "shopify_tender_transaction_data_projected"
),
"shopify_tender_transaction_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- credit_card_company: from DECIMAL to VARCHAR
-- external_reference: from DECIMAL to VARCHAR
-- masked_card_number: from DECIMAL to VARCHAR
-- order_id: from INT to VARCHAR
-- processing_timestamp: from VARCHAR to TIMESTAMP
-- transaction_id: from INT to VARCHAR
-- user_id: from DECIMAL to VARCHAR
SELECT
"transaction_amount",
"currency",
"payment_method",
"is_test_transaction",
CAST("credit_card_company" AS VARCHAR) AS "credit_card_company",
CAST("external_reference" AS VARCHAR) AS "external_reference",
CAST("masked_card_number" AS VARCHAR) AS "masked_card_number",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("processing_timestamp" AS TIMESTAMP) AS "processing_timestamp",
CAST("transaction_id" AS VARCHAR) AS "transaction_id",
CAST("user_id" AS VARCHAR) AS "user_id"
FROM "shopify_tender_transaction_data_projected_renamed"
),
"shopify_tender_transaction_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 3 columns with unacceptable missing values
-- external_reference has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- masked_card_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- user_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"transaction_amount",
"currency",
"payment_method",
"is_test_transaction",
"credit_card_company",
"order_id",
"processing_timestamp",
"transaction_id"
FROM "shopify_tender_transaction_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_tender_transaction_data_projected_renamed_casted_missing_handled"
stg_shopify_tender_transaction_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_tender_transaction_data
description: The table is about financial transactions in a Shopify store. It includes
details such as transaction ID, amount, currency, order ID, payment method, processing
time, and whether it was a test transaction. The table captures both positive
and negative amounts, suggesting it covers both sales and refunds. All transactions
are in USD and use the "other" payment method.
columns:
- name: transaction_amount
description: Transaction amount in USD
tests:
- not_null
- name: currency
description: Currency code of the transaction
tests:
- not_null
- name: payment_method
description: Method used for payment
tests:
- not_null
- accepted_values:
values:
- cash
- credit card
- debit card
- bank transfer
- check
- money order
- PayPal
- Apple Pay
- Google Pay
- cryptocurrency
- gift card
- store credit
- loyalty points
- installment plan
- wire transfer
- mobile payment
- contactless payment
- electronic wallet
- direct debit
- other
- name: is_test_transaction
description: Indicates if this is a test transaction
tests:
- not_null
- name: credit_card_company
description: Name of the credit card company
cocoon_meta:
missing_acceptable: Payment method is 'other', not a credit card.
- name: order_id
description: ID of the associated order
tests:
- not_null
- name: processing_timestamp
description: Date and time of transaction processing
tests:
- not_null
- name: transaction_id
description: Unique identifier for the transaction
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is a unique identifier for the transaction. For this
table, each row is a financial transaction. As it's designed to be a unique
identifier, it should be unique across all rows.
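The description above notes that amounts are both positive and negative. A minimal sketch of totalling the two per order follows, run through Python's built-in sqlite3 module. The rows are made up for illustration (only the 3004.30 amount appears in the sample). Treating positive amounts as sales and negative amounts as refunds is an assumption drawn from the description, not something the source confirms.

```python
import sqlite3

# Hypothetical rows shaped like stg_shopify_tender_transaction_data.
rows = [
    ("t1", 3004.30, "USD", "o1"),
    ("t2", -120.00, "USD", "o1"),
    ("t3", 50.25, "USD", "o2"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_shopify_tender_transaction_data" ('
    '"transaction_id" TEXT, "transaction_amount" REAL, '
    '"currency" TEXT, "order_id" TEXT)'
)
conn.executemany(
    'INSERT INTO "stg_shopify_tender_transaction_data" VALUES (?, ?, ?, ?)', rows
)

# Assumption: positive amounts are sales, negative amounts are refunds.
query = """
SELECT
    "order_id",
    SUM(CASE WHEN "transaction_amount" > 0 THEN "transaction_amount" ELSE 0 END) AS "sales_total",
    SUM(CASE WHEN "transaction_amount" < 0 THEN -"transaction_amount" ELSE 0 END) AS "refund_total"
FROM "stg_shopify_tender_transaction_data"
GROUP BY "order_id"
ORDER BY "order_id"
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Refunds are reported as positive totals here so that the two columns can be compared directly.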
stg_shopify_order_adjustment_data (first 100 rows)
| | adjustment_amount_cents | tax_amount_cents | adjustment_type | adjustment_reason | adjustment_id | order_id | refund_id |
---|---|---|---|---|---|---|---|
0 | -465 | 0.0 | shipping_refund | Shipping refund | 109271056455 | 2712175083591 | 675617407047 |
1 | -95 | 0.0 | shipping_refund | Shipping refund | 109277085767 | 2773486501959 | 675634708551 |
2 | -27 | -1.6 | shipping_refund | Shipping refund | 109245956167 | 2771757826119 | 675548168263 |
3 | -35 | 0.0 | shipping_refund | Shipping refund | 109248118855 | 2771329908807 | 675555016775 |
4 | -515 | 0.0 | refund_discrepancy | Refund discrepancy | 109275742279 | 2773429682247 | 675632644167 |
stg_shopify_order_adjustment_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_adjustment_data_projected" AS (
-- Projection: Selecting 9 out of 10 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"order_id",
"refund_id",
"amount",
"tax_amount",
"kind",
"reason",
"amount_set",
"tax_amount_set"
FROM "shopify_order_adjustment_data"
),
"shopify_order_adjustment_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> adjustment_id
-- amount -> adjustment_amount_cents
-- tax_amount -> tax_amount_cents
-- kind -> adjustment_type
-- reason -> adjustment_reason
-- amount_set -> currency_info
-- tax_amount_set -> tax_currency_info
SELECT
"id" AS "adjustment_id",
"order_id",
"refund_id",
"amount" AS "adjustment_amount_cents",
"tax_amount" AS "tax_amount_cents",
"kind" AS "adjustment_type",
"reason" AS "adjustment_reason",
"amount_set" AS "currency_info",
"tax_amount_set" AS "tax_currency_info"
FROM "shopify_order_adjustment_data_projected"
),
"shopify_order_adjustment_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- adjustment_id: from INT to VARCHAR
-- currency_info: from DECIMAL to VARCHAR
-- order_id: from INT to VARCHAR
-- refund_id: from INT to VARCHAR
-- tax_currency_info: from DECIMAL to VARCHAR
SELECT
"adjustment_amount_cents",
"tax_amount_cents",
"adjustment_type",
"adjustment_reason",
CAST("adjustment_id" AS VARCHAR) AS "adjustment_id",
CAST("currency_info" AS VARCHAR) AS "currency_info",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("refund_id" AS VARCHAR) AS "refund_id",
CAST("tax_currency_info" AS VARCHAR) AS "tax_currency_info"
FROM "shopify_order_adjustment_data_projected_renamed"
),
"shopify_order_adjustment_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 2 columns with unacceptable missing values
-- currency_info has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- tax_currency_info has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"adjustment_amount_cents",
"tax_amount_cents",
"adjustment_type",
"adjustment_reason",
"adjustment_id",
"order_id",
"refund_id"
FROM "shopify_order_adjustment_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_adjustment_data_projected_renamed_casted_missing_handled"
stg_shopify_order_adjustment_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_adjustment_data
description: The table is about Shopify order adjustments. It includes details such
as order ID, refund ID, adjustment amount, tax amount, kind of adjustment, and
reason. The main types of adjustments are shipping refunds and refund discrepancies.
Each row represents a specific adjustment made to an order, with associated amounts
and reasons.
columns:
- name: adjustment_amount_cents
description: Adjustment amount in cents
tests:
- not_null
- name: tax_amount_cents
description: Tax amount associated with the adjustment in cents
tests:
- not_null
- name: adjustment_type
description: Type of adjustment (e.g., shipping_refund, refund_discrepancy)
tests:
- not_null
- accepted_values:
values:
- shipping_refund
- refund_discrepancy
- price_adjustment
- tax_adjustment
- coupon_adjustment
- fee_adjustment
- partial_refund
- full_refund
- return_adjustment
- exchange_adjustment
- credit_adjustment
- promotional_adjustment
- loyalty_point_adjustment
- gift_card_adjustment
- handling_fee_adjustment
- currency_exchange_adjustment
- inventory_adjustment
- damaged_goods_adjustment
- miscellaneous_adjustment
- name: adjustment_reason
description: Explanation for the adjustment
tests:
- not_null
- accepted_values:
values:
- Shipping refund
- Refund discrepancy
- Price adjustment
- Damaged item
- Missing item
- Wrong item shipped
- Coupon/discount applied
- Customer satisfaction
- Bulk order discount
- Loyalty program credit
- Warranty claim
- Return processing fee
- Exchange difference
- Partial shipment adjustment
- Canceled order
- Promotional offer
- Tax adjustment
- Currency exchange rate
- Shipping upgrade
- Shipping downgrade
- Late delivery compensation
- Product recall
- Price match
- Inventory error
- Payment processing error
- name: adjustment_id
description: Unique identifier for the adjustment
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each adjustment.
For this table, each row is a specific adjustment made to an order. The adjustment_id
is likely to be unique across all rows as it's designed to distinctly identify
each adjustment.
- name: order_id
description: Unique identifier for the associated order
tests:
- not_null
- name: refund_id
description: Unique identifier for the associated refund
tests:
- not_null
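An accepted_values test such as the one on adjustment_type amounts to looking for values outside the listed set. A rough sketch of that kind of check follows, using Python's built-in sqlite3 module. The rows are hypothetical, 'goodwill_credit' is an invented value, and the accepted list is shortened to three of the values listed above. This illustrates the idea and is not the exact query dbt generates.

```python
import sqlite3

# Shortened accepted list (three of the values from the YAML above).
accepted = ["shipping_refund", "refund_discrepancy", "price_adjustment"]

# Hypothetical rows; 'goodwill_credit' is invented to trigger a violation.
rows = [
    ("a1", "shipping_refund"),
    ("a2", "refund_discrepancy"),
    ("a3", "goodwill_credit"),
    ("a4", "shipping_refund"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_shopify_order_adjustment_data" '
    '("adjustment_id" TEXT, "adjustment_type" TEXT)'
)
conn.executemany(
    'INSERT INTO "stg_shopify_order_adjustment_data" VALUES (?, ?)', rows
)

# One placeholder per accepted value keeps the query parameterized.
placeholders = ", ".join("?" for _ in accepted)
query = f"""
SELECT "adjustment_type", COUNT(*) AS "n_records"
FROM "stg_shopify_order_adjustment_data"
WHERE "adjustment_type" NOT IN ({placeholders})
GROUP BY "adjustment_type"
"""
violations = conn.execute(query, accepted).fetchall()
print(violations)
```

Any row returned is a violation. NULL values are not matched by NOT IN, so they would be left to the separate not_null test.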
stg_shopify_location_data (first 100 rows)
| | is_deleted | location_name | is_active | province_state | is_legacy | local_province_name | country_name | province_state_code | primary_address | iso_country_code | location_id | local_country_name | country_code | creation_timestamp | last_update_timestamp | postal_code | secondary_address |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | False | Plum | True | None | True | None | United States | None | None | US | 8777748 | United States | US | 2019-06-11 15:58:20 | 2019-06-11 15:58:20 | None | None |
1 | False | Plum Express | True | NY | False | New York | United States | NY | 111 Tree Road | US | 7748 | United States | US | 2018-12-10 16:24:07 | 2019-05-16 13:37:39 | 7394.0 | None |
stg_shopify_location_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_location_data_projected" AS (
-- Projection: Selecting 19 out of 20 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"active",
"address_1",
"address_2",
"city",
"country",
"created_at",
"legacy",
"name",
"phone",
"province",
"updated_at",
"zip",
"country_code",
"country_name",
"localized_country_name",
"localized_province_name",
"province_code",
"_fivetran_deleted"
FROM "shopify_location_data"
),
"shopify_location_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> location_id
-- active -> is_active
-- address_1 -> primary_address
-- address_2 -> secondary_address
-- country -> country_code
-- created_at -> creation_timestamp
-- legacy -> is_legacy
-- name -> location_name
-- phone -> phone_number
-- province -> province_state
-- updated_at -> last_update_timestamp
-- zip -> postal_code
-- country_code -> iso_country_code
-- localized_country_name -> local_country_name
-- localized_province_name -> local_province_name
-- province_code -> province_state_code
-- _fivetran_deleted -> is_deleted
SELECT
"id" AS "location_id",
"active" AS "is_active",
"address_1" AS "primary_address",
"address_2" AS "secondary_address",
"city",
"country" AS "country_code",
"created_at" AS "creation_timestamp",
"legacy" AS "is_legacy",
"name" AS "location_name",
"phone" AS "phone_number",
"province" AS "province_state",
"updated_at" AS "last_update_timestamp",
"zip" AS "postal_code",
"country_code" AS "iso_country_code",
"country_name",
"localized_country_name" AS "local_country_name",
"localized_province_name" AS "local_province_name",
"province_code" AS "province_state_code",
"_fivetran_deleted" AS "is_deleted"
FROM "shopify_location_data_projected"
),
"shopify_location_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- city: The problem is that 'Tree' is not a valid city name. It appears that this column has been mistakenly populated with data that should belong to a different column, likely one describing types of vegetation or natural features. Since we don't have the correct city information and 'Tree' is meaningless in this context, we should map it to an empty string to indicate missing data.
-- local_province_name: The problem is a misspelling in the local_province_name column. The value 'New Yorl' is a typo and should be corrected to 'New York'. This is likely a data entry error where the 'k' was accidentally typed as 'l'.
SELECT
"location_id",
"is_active",
"primary_address",
"secondary_address",
CASE
WHEN "city" = 'Tree' THEN ''
ELSE "city"
END AS "city",
"country_code",
"creation_timestamp",
"is_legacy",
"location_name",
"phone_number",
"province_state",
"last_update_timestamp",
"postal_code",
"iso_country_code",
"country_name",
"local_country_name",
CASE
WHEN "local_province_name" = 'New Yorl' THEN 'New York'
ELSE "local_province_name"
END AS "local_province_name",
"province_state_code",
"is_deleted"
FROM "shopify_location_data_projected_renamed"
),
"shopify_location_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- city: ['']
SELECT
CASE
WHEN "city" = '' THEN NULL
ELSE "city"
END AS "city",
"is_deleted",
"location_name",
"is_active",
"province_state",
"is_legacy",
"local_province_name",
"country_name",
"province_state_code",
"postal_code",
"primary_address",
"iso_country_code",
"location_id",
"secondary_address",
"local_country_name",
"phone_number",
"country_code",
"creation_timestamp",
"last_update_timestamp"
FROM "shopify_location_data_projected_renamed_cleaned"
),
"shopify_location_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- creation_timestamp: from VARCHAR to TIMESTAMP
-- last_update_timestamp: from VARCHAR to TIMESTAMP
-- phone_number: from DECIMAL to VARCHAR
-- postal_code: from DECIMAL to VARCHAR
-- secondary_address: from DECIMAL to VARCHAR
SELECT
"city",
"is_deleted",
"location_name",
"is_active",
"province_state",
"is_legacy",
"local_province_name",
"country_name",
"province_state_code",
"primary_address",
"iso_country_code",
"location_id",
"local_country_name",
"country_code",
CAST("creation_timestamp" AS TIMESTAMP) AS "creation_timestamp",
CAST("last_update_timestamp" AS TIMESTAMP) AS "last_update_timestamp",
CAST("phone_number" AS VARCHAR) AS "phone_number",
CAST("postal_code" AS VARCHAR) AS "postal_code",
CAST("secondary_address" AS VARCHAR) AS "secondary_address"
FROM "shopify_location_data_projected_renamed_cleaned_null"
),
"shopify_location_data_projected_renamed_cleaned_null_casted_missing_handled" AS (
-- Handling missing values: There are 7 columns with unacceptable missing values
-- city has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- local_province_name has 50.0 percent missing. Strategy: 🔄 Unchanged
-- phone_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- postal_code has 50.0 percent missing. Strategy: 🔄 Unchanged
-- primary_address has 50.0 percent missing. Strategy: 🔄 Unchanged
-- province_state has 50.0 percent missing. Strategy: 🔄 Unchanged
-- province_state_code has 50.0 percent missing. Strategy: 🔄 Unchanged
SELECT
"is_deleted",
"location_name",
"is_active",
"province_state",
"is_legacy",
"local_province_name",
"country_name",
"province_state_code",
"primary_address",
"iso_country_code",
"location_id",
"local_country_name",
"country_code",
"creation_timestamp",
"last_update_timestamp",
"postal_code",
"secondary_address"
FROM "shopify_location_data_projected_renamed_cleaned_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_location_data_projected_renamed_cleaned_null_casted_missing_handled"
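The model above cleans city in two steps: 'Tree' is first mapped to an empty string, and the empty string is then mapped to NULL. A small sketch follows, using Python's built-in sqlite3 module, that replays both steps and compares them with a nested NULLIF. The table name, the rows, and the 'Albany' value are hypothetical. The NULLIF form is an alternative shown for comparison, not what Cocoon generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "locations" ("location_id" TEXT, "city" TEXT)')
# Hypothetical rows: one bad value, one missing value, one valid value.
conn.executemany(
    'INSERT INTO "locations" VALUES (?, ?)',
    [("7748", "Tree"), ("8777748", None), ("1", "Albany")],
)

query = """
WITH "cleaned" AS (
    -- Step 1: map the unusual value to an empty string.
    SELECT "location_id", "city" AS "raw_city",
        CASE WHEN "city" = 'Tree' THEN '' ELSE "city" END AS "city"
    FROM "locations"
),
"nulled" AS (
    -- Step 2: map the disguised missing value to NULL.
    SELECT "location_id", "raw_city",
        CASE WHEN "city" = '' THEN NULL ELSE "city" END AS "city"
    FROM "cleaned"
)
SELECT "location_id", "city",
    NULLIF(NULLIF("raw_city", 'Tree'), '') AS "city_nullif"
FROM "nulled"
ORDER BY "location_id"
"""
cities = conn.execute(query).fetchall()
print(cities)
```

Both columns agree on every row, which suggests the two CTEs could be collapsed into one expression if the intermediate step is not needed for auditing.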
stg_shopify_location_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_location_data
description: The table contains details about Shopify store locations. It includes
information such as location ID, name, address, province, country, and status
(active, legacy, deleted). Each row represents a unique store location with its
associated attributes. Address fields are empty for some locations, and the city
and phone number columns are dropped during staging because they are entirely missing.
columns:
- name: is_deleted
description: Indicates if the record is deleted
tests:
- not_null
- name: location_name
description: Name of the store location
tests:
- not_null
- name: is_active
description: Indicates if the location is currently active
tests:
- not_null
- name: province_state
description: Province or state of the location
tests:
- not_null
- accepted_values:
values:
- AL
- AK
- AZ
- AR
- CA
- CO
- CT
- DE
- FL
- GA
- HI
- ID
- IL
- IN
- IA
- KS
- KY
- LA
- ME
- MD
- MA
- MI
- MN
- MS
- MO
- MT
- NE
- NV
- NH
- NJ
- NM
- NY
- NC
- ND
- OH
- OK
- OR
- PA
- RI
- SC
- SD
- TN
- TX
- UT
- VT
- VA
- WA
- WV
- WI
- WY
- name: is_legacy
description: Indicates if the location is a legacy entry
tests:
- not_null
- name: local_province_name
description: Province name in local language
tests:
- not_null
- name: country_name
description: Full name of the country
tests:
- not_null
- name: province_state_code
description: Code for the province or state
tests:
- not_null
- name: primary_address
description: Primary address line of the location
tests:
- not_null
- name: iso_country_code
description: ISO country code of the location
tests:
- not_null
- name: location_id
description: Unique identifier for the location
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is described as a unique identifier for the location.
For this table, each row is for a unique store location. Given it's explicitly
described as a unique identifier, it should be unique across rows.
- name: local_country_name
description: Country name in local language
tests:
- not_null
- name: country_code
description: Country code where the location is situated
tests:
- not_null
- name: creation_timestamp
description: Timestamp when the location was created
tests:
- not_null
- name: last_update_timestamp
description: Timestamp when the location was last updated
tests:
- not_null
- name: postal_code
description: Postal or ZIP code of the location
tests:
- not_null
- name: secondary_address
description: Secondary address line of the location
cocoon_meta:
missing_acceptable: Not all locations have or need a secondary address.
stg_shopify_product_tag_data (first 100 rows)
| | tag_id | tag_value | product_id |
---|---|---|---|
0 | 9 | Type: Clothing | 1234 |
1 | 5 | Final Sale | 1234 |
2 | 7 | Sale | 1234 |
3 | 8 | StyleID: Nice | 1234 |
4 | 3 | Collection: Bottoms | 1234 |
stg_shopify_product_tag_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_product_tag_data_projected" AS (
-- Projection: Selecting 3 out of 4 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"index_",
"product_id",
"value_"
FROM "shopify_product_tag_data"
),
"shopify_product_tag_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> tag_id
-- value_ -> tag_value
SELECT
"index_" AS "tag_id",
"product_id",
"value_" AS "tag_value"
FROM "shopify_product_tag_data_projected"
),
"shopify_product_tag_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- tag_value: The problem is inconsistent formatting and an outlier value. Most values use a colon followed by a space to separate categories, except for "StyleID:nice" which lacks a space after the colon. "Final Sale" and "Sale" don't follow the category:value pattern at all. The correct values should follow the "Category: Value" format consistently, or be a single descriptive term for sales items.
SELECT
"tag_id",
"product_id",
CASE
WHEN "tag_value" = 'StyleID:nice' THEN 'StyleID: Nice'
ELSE "tag_value"
END AS "tag_value"
FROM "shopify_product_tag_data_projected_renamed"
),
"shopify_product_tag_data_projected_renamed_cleaned_casted" AS (
-- Column Type Casting:
-- product_id: from INT to VARCHAR
SELECT
"tag_id",
"tag_value",
CAST("product_id" AS VARCHAR) AS "product_id"
FROM "shopify_product_tag_data_projected_renamed_cleaned"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_product_tag_data_projected_renamed_cleaned_casted"
stg_shopify_product_tag_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_product_tag_data
description: The table is about product tags in a Shopify system. It contains product
IDs and associated tag values. Each product can have multiple tags. Tags include
information like product type, sale status, style ID, and collection category.
The table allows for flexible categorization and labeling of products.
columns:
- name: tag_id
description: Unique identifier for each tag entry
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents an identifier for each tag entry. For
this table, each row represents a specific tag associated with a product.
The tag_id values in the sample are all distinct. The column is renamed from
index_, so uniqueness may hold only within a single product.
- name: tag_value
description: The actual tag content or description
tests:
- not_null
- name: product_id
description: Identifier for the product associated with the tag
tests:
- not_null
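Several tag values follow a "Category: Value" pattern, while "Sale" and "Final Sale" do not. A minimal sketch of splitting the two parts follows, using Python's built-in sqlite3 module with the five sample rows. The output column names tag_category and tag_label are hypothetical, and splitting on the first ": " is an assumption about how the tags are meant to be read.

```python
import sqlite3

# The five sample rows shown above (tag_id, tag_value, product_id).
rows = [
    (9, "Type: Clothing", "1234"),
    (5, "Final Sale", "1234"),
    (7, "Sale", "1234"),
    (8, "StyleID: Nice", "1234"),
    (3, "Collection: Bottoms", "1234"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_shopify_product_tag_data" '
    '("tag_id" INTEGER, "tag_value" TEXT, "product_id" TEXT)'
)
conn.executemany('INSERT INTO "stg_shopify_product_tag_data" VALUES (?, ?, ?)', rows)

# Split on the first ': '; tags without it keep a NULL category.
query = """
SELECT
    "tag_id",
    CASE WHEN instr("tag_value", ': ') > 0
         THEN substr("tag_value", 1, instr("tag_value", ': ') - 1)
    END AS "tag_category",
    CASE WHEN instr("tag_value", ': ') > 0
         THEN substr("tag_value", instr("tag_value", ': ') + 2)
         ELSE "tag_value"
    END AS "tag_label"
FROM "stg_shopify_product_tag_data"
ORDER BY "tag_id"
"""
tags = conn.execute(query).fetchall()
print(tags)
```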
stg_shopify_tax_line_data (first 100 rows)
| | row_id | tax_amount | tax_rate | tax_type | order_line_id | tax_price_set |
---|---|---|---|---|---|---|
0 | 1 | 0.0 | 0.0 | VAT | 29227 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
1 | 1 | 0.0 | 0.0 | VAT | 1839083 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
2 | 1 | 0.0 | 0.0 | VAT | 11995 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
3 | 1 | 0.0 | 0.0 | VAT | 10751 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
4 | 1 | 0.0 | 0.0 | VAT | 194763 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
stg_shopify_tax_line_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_tax_line_data_projected" AS (
-- Projection: Selecting 6 out of 7 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"index_",
"order_line_id",
"price",
"rate",
"title",
"price_set"
FROM "shopify_tax_line_data"
),
"shopify_tax_line_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> row_id
-- price -> tax_amount
-- rate -> tax_rate
-- title -> tax_type
-- price_set -> tax_price_set
SELECT
"index_" AS "row_id",
"order_line_id",
"price" AS "tax_amount",
"rate" AS "tax_rate",
"title" AS "tax_type",
"price_set" AS "tax_price_set"
FROM "shopify_tax_line_data_projected"
),
"shopify_tax_line_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- order_line_id: from INT to VARCHAR
-- tax_price_set: from VARCHAR to JSON
SELECT
"row_id",
"tax_amount",
"tax_rate",
"tax_type",
CAST("order_line_id" AS VARCHAR) AS "order_line_id",
CAST("tax_price_set" AS JSON) AS "tax_price_set"
FROM "shopify_tax_line_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_tax_line_data_projected_renamed_casted"
stg_shopify_tax_line_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_tax_line_data
description: The table is about tax information for Shopify order lines. It includes
details such as order line ID, tax price, tax rate, tax title (always "VAT" in
the samples), and a price set with shop and presentment money amounts. All sample
entries show zero tax, suggesting these may be tax-exempt transactions or orders
from regions without applicable taxes.
columns:
- name: row_id
description: Identifier for the table row
tests:
- not_null
- name: tax_amount
description: Tax amount for the order line
tests:
- not_null
- name: tax_rate
description: Tax rate applied to the order line
tests:
- not_null
- name: tax_type
description: Type of tax applied
tests:
- not_null
- accepted_values:
values:
- VAT
- Sales Tax
- Income Tax
- Property Tax
- Capital Gains Tax
- Corporate Tax
- Excise Tax
- Payroll Tax
- Estate Tax
- Gift Tax
- Customs Duty
- Stamp Duty
- Wealth Tax
- Carbon Tax
- Sin Tax
- Withholding Tax
- name: order_line_id
description: Unique identifier for the order line
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is a unique identifier for each order line. For this
table, each row represents a tax entry for an order line. The order_line_id
appears to be unique across rows, as each value in the sample is different.
- name: tax_price_set
description: Detailed price information in different currencies
tests:
- not_null
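The tax_price_set column holds a JSON object with shop_money and presentment_money entries. A short sketch of reading one entry with Python's standard json module follows. The helper name money is hypothetical, and the input string is copied from the sample rows above.

```python
import json
from decimal import Decimal

# One tax_price_set value, as it appears in the sample rows.
tax_price_set = (
    '{"shop_money":{"amount":"0.00","currency_code":"USD"},'
    '"presentment_money":{"amount":"0.00","currency_code":"USD"}}'
)


def money(price_set_json: str, key: str) -> tuple[Decimal, str]:
    """Return (amount, currency_code) for 'shop_money' or 'presentment_money'."""
    entry = json.loads(price_set_json)[key]
    # Amounts are stored as strings; Decimal avoids float rounding.
    return Decimal(entry["amount"]), entry["currency_code"]


shop_amount, shop_currency = money(tax_price_set, "shop_money")
presentment_amount, presentment_currency = money(tax_price_set, "presentment_money")
print(shop_amount, shop_currency, presentment_amount, presentment_currency)
```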
stg_shopify_inventory_level_data (first 100 rows)
| | inventory_item_id | location_id |
---|---|---|
0 | 780939 | 287748 |
1 | 6027 | 287748 |
2 | 515 | 28748 |
stg_shopify_inventory_level_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_inventory_level_data_projected" AS (
-- Projection: Selecting 4 out of 5 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"inventory_item_id",
"location_id",
"available",
"updated_at"
FROM "shopify_inventory_level_data"
),
"shopify_inventory_level_data_projected_renamed" AS (
-- Rename: Renaming columns
-- available -> quantity_available
-- updated_at -> last_updated
SELECT
"inventory_item_id",
"location_id",
"available" AS "quantity_available",
"updated_at" AS "last_updated"
FROM "shopify_inventory_level_data_projected"
),
"shopify_inventory_level_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- last_updated: from DECIMAL to TIMESTAMP
-- quantity_available: from DECIMAL to INT
SELECT
"inventory_item_id",
"location_id",
CAST("last_updated" AS TIMESTAMP) AS "last_updated",
CAST("quantity_available" AS INT) AS "quantity_available"
FROM "shopify_inventory_level_data_projected_renamed"
),
"shopify_inventory_level_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 2 columns with unacceptable missing values
-- last_updated has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- quantity_available has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"inventory_item_id",
"location_id"
FROM "shopify_inventory_level_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_inventory_level_data_projected_renamed_casted_missing_handled"
stg_shopify_inventory_level_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_inventory_level_data
description: The table is about inventory levels in a Shopify store. It identifies
inventory items and the locations that stock them. Each row represents a specific
inventory item at a particular location. The source columns for available quantity
and update timestamp are entirely missing and are dropped during staging.
columns:
- name: inventory_item_id
description: Unique identifier for the inventory item
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column identifies an inventory item. For this table, each row
represents a specific inventory item at a particular location. In the sample,
each inventory_item_id appears only once, so the column is unique across rows.
If the same item were stocked at multiple locations, only the pair
(inventory_item_id, location_id) would be unique.
- name: location_id
description: Unique identifier for the store location
tests:
- not_null
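Because each row pairs an inventory item with a location, uniqueness can be checked both on inventory_item_id alone and on the pair. A minimal sketch follows, using Python's built-in sqlite3 module. The first three rows come from the sample above. The fourth row is hypothetical, added to show what happens when one item is stocked at two locations.

```python
import sqlite3

rows = [
    ("780939", "287748"),
    ("6027", "287748"),
    ("515", "28748"),
    ("6027", "28748"),  # hypothetical: same item at a second location
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_shopify_inventory_level_data" '
    '("inventory_item_id" TEXT, "location_id" TEXT)'
)
conn.executemany('INSERT INTO "stg_shopify_inventory_level_data" VALUES (?, ?)', rows)

# Duplicates on the single column.
dup_items = conn.execute(
    'SELECT "inventory_item_id", COUNT(*) '
    'FROM "stg_shopify_inventory_level_data" '
    'GROUP BY "inventory_item_id" HAVING COUNT(*) > 1'
).fetchall()

# Duplicates on the composite key.
dup_pairs = conn.execute(
    'SELECT "inventory_item_id", "location_id", COUNT(*) '
    'FROM "stg_shopify_inventory_level_data" '
    'GROUP BY "inventory_item_id", "location_id" HAVING COUNT(*) > 1'
).fetchall()

print(dup_items, dup_pairs)
```

With the hypothetical fourth row, the single-column check reports a duplicate while the composite-key check stays empty.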
stg_shopify_abandoned_checkout_shipping_line_data (first 100 rows)
| | shipping_option_order | shipping_method_code | shipping_line_id | shipping_markup | shipping_price | shipping_option_source | shipping_option_title | original_shop_markup | original_shop_price | display_title | api_client_id | carrier_identifier | carrier_service_id | checkout_id | delivery_category | delivery_expectation_range | delivery_expectation_type | discounted_price | fulfillment_service_id | max_delivery_days | min_delivery_days | shipping_phone |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Standard | c3ce0972c2e30eaf7001bea | 0.0 | 0.0 | shopify | Standard | 0.0 | 0.0 | Standard | None | None | None | 653675 | None | None | None | None | None | None | None | None |
1 | 1 | Standard | bf7c90953344902c13 | 0.0 | 0.0 | shopify | Standard | 0.0 | 0.0 | Standard | None | None | None | 379 | None | None | None | None | None | None | None | None |
2 | 1 | Standard | 519ff4275cd972e282db | 0.0 | 0.0 | shopify | Standard | 0.0 | 0.0 | Standard | None | None | None | 635 | None | None | None | None | None | None | None | None |
3 | 1 | Standard | 8d18671d481ad46a | 0.0 | 0.0 | shopify | Standard | 0.0 | 0.0 | Standard | None | None | None | 3211 | None | None | None | None | None | None | None | None |
4 | 1 | Standard | 8f2fab1b455ec9e597 | 0.0 | 0.0 | shopify | Standard | 0.0 | 0.0 | Standard | None | None | None | 381227 | None | None | None | None | None | None | None | None |
stg_shopify_abandoned_checkout_shipping_line_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_abandoned_checkout_shipping_line_data_projected" AS (
-- Projection: Selecting 23 out of 24 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"checkout_id",
"index_",
"api_client_id",
"carrier_identifier",
"carrier_service_id",
"code",
"delivery_category",
"discounted_price",
"id",
"markup",
"phone",
"price",
"requested_fulfillment_service_id",
"source",
"title",
"validation_context",
"delivery_expectation_range",
"delivery_expectation_type",
"original_shop_markup",
"original_shop_price",
"presentment_title",
"delivery_expectation_range_min",
"delivery_expectation_range_max"
FROM "shopify_abandoned_checkout_shipping_line_data"
),
"shopify_abandoned_checkout_shipping_line_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> shipping_option_order
-- code -> shipping_method_code
-- id -> shipping_line_id
-- markup -> shipping_markup
-- phone -> shipping_phone
-- price -> shipping_price
-- requested_fulfillment_service_id -> fulfillment_service_id
-- source -> shipping_option_source
-- title -> shipping_option_title
-- presentment_title -> display_title
-- delivery_expectation_range_min -> min_delivery_days
-- delivery_expectation_range_max -> max_delivery_days
SELECT
"checkout_id",
"index_" AS "shipping_option_order",
"api_client_id",
"carrier_identifier",
"carrier_service_id",
"code" AS "shipping_method_code",
"delivery_category",
"discounted_price",
"id" AS "shipping_line_id",
"markup" AS "shipping_markup",
"phone" AS "shipping_phone",
"price" AS "shipping_price",
"requested_fulfillment_service_id" AS "fulfillment_service_id",
"source" AS "shipping_option_source",
"title" AS "shipping_option_title",
"validation_context",
"delivery_expectation_range",
"delivery_expectation_type",
"original_shop_markup",
"original_shop_price",
"presentment_title" AS "display_title",
"delivery_expectation_range_min" AS "min_delivery_days",
"delivery_expectation_range_max" AS "max_delivery_days"
FROM "shopify_abandoned_checkout_shipping_line_data_projected"
),
"shopify_abandoned_checkout_shipping_line_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- api_client_id: from DECIMAL to VARCHAR
-- carrier_identifier: from DECIMAL to VARCHAR
-- carrier_service_id: from DECIMAL to VARCHAR
-- checkout_id: from INT to VARCHAR
-- delivery_category: from DECIMAL to VARCHAR
-- delivery_expectation_range: from DECIMAL to VARCHAR
-- delivery_expectation_type: from DECIMAL to VARCHAR
-- discounted_price: from DECIMAL to VARCHAR
-- fulfillment_service_id: from DECIMAL to VARCHAR
-- max_delivery_days: from DECIMAL to VARCHAR
-- min_delivery_days: from DECIMAL to VARCHAR
-- shipping_phone: from DECIMAL to VARCHAR
-- validation_context: from DECIMAL to VARCHAR
SELECT
"shipping_option_order",
"shipping_method_code",
"shipping_line_id",
"shipping_markup",
"shipping_price",
"shipping_option_source",
"shipping_option_title",
"original_shop_markup",
"original_shop_price",
"display_title",
CAST("api_client_id" AS VARCHAR) AS "api_client_id",
CAST("carrier_identifier" AS VARCHAR) AS "carrier_identifier",
CAST("carrier_service_id" AS VARCHAR) AS "carrier_service_id",
CAST("checkout_id" AS VARCHAR) AS "checkout_id",
CAST("delivery_category" AS VARCHAR) AS "delivery_category",
CAST("delivery_expectation_range" AS VARCHAR) AS "delivery_expectation_range",
CAST("delivery_expectation_type" AS VARCHAR) AS "delivery_expectation_type",
CAST("discounted_price" AS VARCHAR) AS "discounted_price",
CAST("fulfillment_service_id" AS VARCHAR) AS "fulfillment_service_id",
CAST("max_delivery_days" AS VARCHAR) AS "max_delivery_days",
CAST("min_delivery_days" AS VARCHAR) AS "min_delivery_days",
CAST("shipping_phone" AS VARCHAR) AS "shipping_phone",
CAST("validation_context" AS VARCHAR) AS "validation_context"
FROM "shopify_abandoned_checkout_shipping_line_data_projected_renamed"
),
"shopify_abandoned_checkout_shipping_line_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 1 columns with unacceptable missing values
-- validation_context has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"shipping_option_order",
"shipping_method_code",
"shipping_line_id",
"shipping_markup",
"shipping_price",
"shipping_option_source",
"shipping_option_title",
"original_shop_markup",
"original_shop_price",
"display_title",
"api_client_id",
"carrier_identifier",
"carrier_service_id",
"checkout_id",
"delivery_category",
"delivery_expectation_range",
"delivery_expectation_type",
"discounted_price",
"fulfillment_service_id",
"max_delivery_days",
"min_delivery_days",
"shipping_phone"
FROM "shopify_abandoned_checkout_shipping_line_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_abandoned_checkout_shipping_line_data_projected_renamed_casted_missing_handled"
stg_shopify_abandoned_checkout_shipping_line_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_abandoned_checkout_shipping_line_data
description: Shipping lines for abandoned Shopify checkouts. Each row is one shipping
option on a checkout, with the checkout ID, shipping method, price and markup, and
delivery expectations. In the sampled rows every line is "Standard" shipping at no
cost, and the delivery expectation columns are empty.
columns:
- name: shipping_option_order
description: Order of the shipping option
tests:
- not_null
- name: shipping_method_code
description: Shipping method code
tests:
- not_null
- accepted_values:
values:
- Standard
- Express
- Overnight
- Two-Day
- Ground
- Priority
- Economy
- International
- Local
- Same-Day
- Freight
- name: shipping_line_id
description: Unique identifier for the shipping line
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each shipping line.
For this table, each row represents a shipping option for an abandoned checkout.
The shipping_line_id appears to be unique across rows, as it's a specific
identifier for each shipping line.
- name: shipping_markup
description: Additional charge on top of shipping cost
tests:
- not_null
- name: shipping_price
description: Price of the shipping option
tests:
- not_null
- name: shipping_option_source
description: Source of the shipping option
tests:
- not_null
- accepted_values:
values:
- shopify
- manual
- third_party_api
- carrier_calculated
- flat_rate
- weight_based
- local_delivery
- pickup
- free_shipping
- real_time
- custom
- name: shipping_option_title
description: Title of the shipping option
tests:
- not_null
- accepted_values:
values:
- Standard
- Express
- Overnight
- Two-Day
- Economy
- Priority
- Same-Day
- Free
- Flat Rate
- International
- Local Pickup
- name: original_shop_markup
description: Original markup set by the shop
tests:
- not_null
- name: original_shop_price
description: Original price set by the shop
tests:
- not_null
- name: display_title
description: Display title for the shipping option
tests:
- not_null
- accepted_values:
values:
- Standard
- Express
- Overnight
- Two-Day
- Economy
- Same-Day
- Priority
- First Class
- Ground
- International
- name: api_client_id
description: API client identifier
cocoon_meta:
missing_acceptable: Not needed for standard internal shipping method
- name: carrier_identifier
description: Shipping carrier identifier
cocoon_meta:
missing_acceptable: Not applicable for standard internal shipping
- name: carrier_service_id
description: Unique ID for carrier service
cocoon_meta:
missing_acceptable: Not used for standard internal shipping
- name: checkout_id
description: Unique identifier for the checkout
tests:
- not_null
- name: delivery_category
description: Category of delivery service
cocoon_meta:
missing_acceptable: Not relevant for standard shipping option
- name: delivery_expectation_range
description: Expected delivery timeframe
cocoon_meta:
missing_acceptable: Not specified for standard shipping
- name: delivery_expectation_type
description: Type of delivery expectation
cocoon_meta:
missing_acceptable: Not defined for standard shipping
- name: discounted_price
description: Price after applying discounts
cocoon_meta:
missing_acceptable: No discount applied to standard shipping
- name: fulfillment_service_id
description: ID of requested fulfillment service
cocoon_meta:
missing_acceptable: Not used for standard internal shipping
- name: max_delivery_days
description: Maximum days for expected delivery
cocoon_meta:
missing_acceptable: Not specified for standard shipping
- name: min_delivery_days
description: Minimum days for expected delivery
cocoon_meta:
missing_acceptable: Not specified for standard shipping
- name: shipping_phone
description: Contact phone number for shipping
cocoon_meta:
missing_acceptable: Not required for standard shipping method
stg_shopify_order_note_attribute_data (first 100 rows)
| | attribute_name | attribute_value | order_id |
|---|---|---|---|
| 0 | last_name | "1418143823.1643992155" | 34171115 |
| 1 | first_name | "fb.1.1643992155109.1110590605" | 34171115 |
| 2 | updated_at | "1643992163253" | 34171115 |
| 3 | clientID | "a03d3118-4048-4159-b5bb-1b90d8abb69b" | 34171115 |
| 4 | name | "22707603636395" | 34171115 |
stg_shopify_order_note_attribute_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_note_attribute_data_projected" AS (
-- Projection: Selecting 3 out of 4 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"name",
"order_id",
"value_"
FROM "shopify_order_note_attribute_data"
),
"shopify_order_note_attribute_data_projected_renamed" AS (
-- Rename: Renaming columns
-- name -> attribute_name
-- value_ -> attribute_value
SELECT
"name" AS "attribute_name",
"order_id",
"value_" AS "attribute_value"
FROM "shopify_order_note_attribute_data_projected"
),
"shopify_order_note_attribute_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- order_id: from INT to VARCHAR
SELECT
"attribute_name",
"attribute_value",
CAST("order_id" AS VARCHAR) AS "order_id"
FROM "shopify_order_note_attribute_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_note_attribute_data_projected_renamed_casted"
stg_shopify_order_note_attribute_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_note_attribute_data
description: Note attributes attached to Shopify orders, stored as name/value pairs.
Each row is one attribute of one order, identified by order_id, so a single order
can span several rows. In the sampled rows the attribute names are first_name,
last_name, name, updated_at, and clientID, and the values are opaque strings
rather than readable customer or product details.
columns:
- name: attribute_name
description: Attribute name or type of information
tests:
- not_null
- name: attribute_value
description: Corresponding value for the attribute
tests:
- not_null
- name: order_id
description: Unique identifier for the Shopify order
tests:
- not_null
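Because each row holds a single name/value pair, a downstream model usually has to pivot the attributes back to one row per order. The query below is a minimal sketch of that pivot, not part of the generated project. It assumes the attribute names `clientID` and `updated_at` seen in the preview are the ones of interest, and the output column names `client_id` and `attributes_updated_at` are hypothetical. In a dbt model the table name would normally be written as a `ref()` call.

```sql
-- Sketch: pivot name/value rows into one row per order.
-- Assumes the attribute names 'clientID' and 'updated_at' from the preview;
-- other stores may record different attribute names.
SELECT
    "order_id",
    -- MAX ignores the NULLs produced for non-matching rows,
    -- leaving the single matching value per order (if any).
    MAX(CASE WHEN "attribute_name" = 'clientID' THEN "attribute_value" END) AS "client_id",
    MAX(CASE WHEN "attribute_name" = 'updated_at' THEN "attribute_value" END) AS "attributes_updated_at"
FROM "stg_shopify_order_note_attribute_data"
GROUP BY "order_id";
```

If an order carries the same attribute name more than once, `MAX` silently keeps only the largest value, so check for duplicate pairs before relying on this shape.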
stg_shopify_product_variant_data (first 100 rows)
| | title | display_position | inventory_policy | fulfillment_service | inventory_management | is_taxable | weight_grams | stock_quantity | weight_unit | previous_stock_quantity | requires_shipping | tax_code | option_1 | created_at | image_id | inventory_item_id | price | product_id | updated_at | variant_id | weight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | my title here | 1 | deny | manual | None | False | 0 | 0 | lb | 0 | False | None | my title here | 2021-03-08 16:30:15 | None | 41356021661767 | 111 | 6540108431431 | 2021-04-12 19:49:43 | 39262114414663 | 0.0 |
| 1 | my title here | 1 | deny | manual | None | False | 0 | 0 | lb | 0 | False | None | my title here | 2021-03-17 16:39:45 | None | 41367035936839 | 222 | 6544066379847 | 2021-04-12 19:46:59 | 39273118957639 | 0.0 |
| 2 | my title here | 1 | deny | manual | inventory manager | True | 0 | 0 | lb | 0 | True | None | my title here | 2021-03-30 19:48:15 | None | 41384094924871 | 5 | 6548438188103 | 2021-03-30 19:48:15 | 39290169262151 | 0.0 |
| 3 | my title here | 1 | deny | manual | None | False | 0 | -5 | lb | -5 | False | None | my title here | 2021-03-08 16:31:31 | None | 41356022644807 | 333 | 6540109250631 | 2021-04-12 19:47:26 | 39262115397703 | 0.0 |
| 4 | my other title | 1 | deny | manual | inventory manager | True | 222 | 0 | lb | 0 | True | TR9999 | my other title | 2019-06-25 18:32:03 | None | 30309980143686 | 444 | 3879735590982 | 2019-10-01 23:40:09 | 29217058947142 | 1.0 |
stg_shopify_product_variant_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_product_variant_data_projected" AS (
-- Projection: Selecting 26 out of 27 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"product_id",
"inventory_item_id",
"title",
"price",
"sku",
"position_",
"inventory_policy",
"compare_at_price",
"fulfillment_service",
"inventory_management",
"created_at",
"updated_at",
"taxable",
"barcode",
"grams",
"image_id",
"inventory_quantity",
"weight",
"weight_unit",
"old_inventory_quantity",
"requires_shipping",
"option_2",
"tax_code",
"option_3",
"option_1"
FROM "shopify_product_variant_data"
),
"shopify_product_variant_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> variant_id
-- position_ -> display_position
-- compare_at_price -> original_price
-- taxable -> is_taxable
-- grams -> weight_grams
-- inventory_quantity -> stock_quantity
-- old_inventory_quantity -> previous_stock_quantity
SELECT
"id" AS "variant_id",
"product_id",
"inventory_item_id",
"title",
"price",
"sku",
"position_" AS "display_position",
"inventory_policy",
"compare_at_price" AS "original_price",
"fulfillment_service",
"inventory_management",
"created_at",
"updated_at",
"taxable" AS "is_taxable",
"barcode",
"grams" AS "weight_grams",
"image_id",
"inventory_quantity" AS "stock_quantity",
"weight",
"weight_unit",
"old_inventory_quantity" AS "previous_stock_quantity",
"requires_shipping",
"option_2",
"tax_code",
"option_3",
"option_1"
FROM "shopify_product_variant_data_projected"
),
"shopify_product_variant_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- barcode: from DECIMAL to VARCHAR
-- created_at: from VARCHAR to TIMESTAMP
-- image_id: from DECIMAL to VARCHAR
-- inventory_item_id: from INT to VARCHAR
-- option_2: from DECIMAL to VARCHAR
-- option_3: from DECIMAL to VARCHAR
-- original_price: from DECIMAL to VARCHAR
-- price: from INT to VARCHAR
-- product_id: from INT to VARCHAR
-- sku: from DECIMAL to VARCHAR
-- updated_at: from VARCHAR to TIMESTAMP
-- variant_id: from INT to VARCHAR
-- weight: from INT to DECIMAL
SELECT
"title",
"display_position",
"inventory_policy",
"fulfillment_service",
"inventory_management",
"is_taxable",
"weight_grams",
"stock_quantity",
"weight_unit",
"previous_stock_quantity",
"requires_shipping",
"tax_code",
"option_1",
CAST("barcode" AS VARCHAR) AS "barcode",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("image_id" AS VARCHAR) AS "image_id",
CAST("inventory_item_id" AS VARCHAR) AS "inventory_item_id",
CAST("option_2" AS VARCHAR) AS "option_2",
CAST("option_3" AS VARCHAR) AS "option_3",
CAST("original_price" AS VARCHAR) AS "original_price",
CAST("price" AS VARCHAR) AS "price",
CAST("product_id" AS VARCHAR) AS "product_id",
CAST("sku" AS VARCHAR) AS "sku",
CAST("updated_at" AS TIMESTAMP) AS "updated_at",
CAST("variant_id" AS VARCHAR) AS "variant_id",
CAST("weight" AS DECIMAL) AS "weight"
FROM "shopify_product_variant_data_projected_renamed"
),
"shopify_product_variant_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 7 columns with unacceptable missing values
-- barcode has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- inventory_management has 60.0 percent missing. Strategy: 🔄 Unchanged
-- option_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- option_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- original_price has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- sku has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- tax_code has 80.0 percent missing. Strategy: 🔄 Unchanged
SELECT
"title",
"display_position",
"inventory_policy",
"fulfillment_service",
"inventory_management",
"is_taxable",
"weight_grams",
"stock_quantity",
"weight_unit",
"previous_stock_quantity",
"requires_shipping",
"tax_code",
"option_1",
"created_at",
"image_id",
"inventory_item_id",
"price",
"product_id",
"updated_at",
"variant_id",
"weight"
FROM "shopify_product_variant_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_product_variant_data_projected_renamed_casted_missing_handled"
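The missing-value comments in the model above report that `inventory_management` is 60.0 percent missing and `tax_code` is 80.0 percent missing, and both columns are kept unchanged. A rough way to re-check those figures against the source table is sketched below. It is an illustration rather than part of the generated model, and the output column names are hypothetical.

```sql
-- Sketch: recompute the share of NULLs for the two columns kept as "Unchanged".
-- NULLIF guards against division by zero when the table is empty.
SELECT
    100.0 * SUM(CASE WHEN "inventory_management" IS NULL THEN 1 ELSE 0 END)
        / NULLIF(COUNT(*), 0) AS "inventory_management_pct_missing",
    100.0 * SUM(CASE WHEN "tax_code" IS NULL THEN 1 ELSE 0 END)
        / NULLIF(COUNT(*), 0) AS "tax_code_pct_missing"
FROM "shopify_product_variant_data";
```

A nonzero percentage for either column means a `not_null` test on it would fail.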
stg_shopify_product_variant_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_product_variant_data
description: Shopify product variants, one row per variant. Columns cover the variant,
product, and inventory item IDs, title, price, weight, inventory policy and stock
quantities, creation and update timestamps, shipping requirement, and tax status.
The sku, barcode, original_price, option_2, and option_3 columns were entirely
missing in the source and are dropped in staging.
columns:
- name: title
description: Title or name of the variant
tests:
- not_null
- name: display_position
description: Position of the variant in listings
tests:
- not_null
- name: inventory_policy
description: Policy for handling out-of-stock items
tests:
- not_null
- accepted_values:
values:
- deny
- backorder
- substitute
- notify
- waitlist
- name: fulfillment_service
description: Service used for order fulfillment
tests:
- not_null
- accepted_values:
values:
- manual
- amazon
- shipwire
- webgistix
- shipstation
- shopify_fulfillment
- third_party
- self_fulfilled
- drop_ship
- fba (Fulfillment by Amazon)
- external
- name: inventory_management
description: Method used for inventory management (about 60 percent missing, left
unchanged in staging)
tests:
- accepted_values:
values:
- inventory manager
- just-in-time (JIT)
- economic order quantity (EOQ)
- abc analysis
- first-in, first-out (FIFO)
- last-in, first-out (LIFO)
- safety stock
- vendor-managed inventory (VMI)
- consignment inventory
- dropshipping
- perpetual inventory system
- periodic inventory system
- barcode system
- radio-frequency identification (RFID)
- cycle counting
- min-max inventory method
- reorder point planning
- materials requirement planning (MRP)
- batch tracking
- demand forecasting
- name: is_taxable
description: Indicates if the variant is taxable
tests:
- not_null
- name: weight_grams
description: Weight of the product in grams
tests:
- not_null
- name: stock_quantity
description: Current quantity in stock
tests:
- not_null
- name: weight_unit
description: Unit of measurement for weight
tests:
- not_null
- accepted_values:
values:
- lb
- kg
- g
- oz
- stone
- ton
- metric ton
- mg
- name: previous_stock_quantity
description: Previous quantity in stock
tests:
- not_null
- name: requires_shipping
description: Indicates if shipping is required
tests:
- not_null
- name: tax_code
description: Tax code for the variant (about 80 percent missing, left unchanged in
staging)
- name: option_1
description: Primary product option
tests:
- not_null
- name: created_at
description: Timestamp when the variant was created
tests:
- not_null
- name: image_id
description: Identifier for the variant's image
cocoon_meta:
missing_acceptable: Not all products require an image.
- name: inventory_item_id
description: Identifier for inventory tracking
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is an identifier for inventory tracking. For this table,
each row is for a specific product variant. As it's an identifier specifically
for inventory items, it's likely to be unique for each variant.
- name: price
description: Current price of the variant
tests:
- not_null
- name: product_id
description: Identifier of the parent product
tests:
- not_null
- name: updated_at
description: Timestamp of last update
tests:
- not_null
- name: variant_id
description: Unique identifier for the variant
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is the unique identifier for the variant. For this table,
each row is for a specific product variant. As it's explicitly described as
a unique identifier, it should be unique across all rows.
- name: weight
description: Weight of the product
tests:
- not_null
stg_shopify_collection_data (first 100 rows)
| | collection_id | is_deleted | is_disjunctive | last_updated |
|---|---|---|---|---|
| 0 | 997355 | True | None | 1970-01-01 |
| 1 | 9930779 | True | None | 1970-01-01 |
| 2 | 99967 | True | None | 1970-01-01 |
stg_shopify_collection_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_collection_data_projected" AS (
-- Projection: Selecting 12 out of 13 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"_fivetran_deleted",
"handle",
"published_at",
"published_scope",
"title",
"updated_at",
"disjunctive",
"rules",
"sort_order",
"template_suffix",
"body_html"
FROM "shopify_collection_data"
),
"shopify_collection_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> collection_id
-- _fivetran_deleted -> is_deleted
-- handle -> url_slug
-- published_at -> publish_date
-- published_scope -> visibility_scope
-- title -> collection_name
-- updated_at -> last_updated
-- disjunctive -> is_disjunctive
-- rules -> product_rules
-- sort_order -> product_sort_order
-- template_suffix -> page_template
-- body_html -> description_html
SELECT
"id" AS "collection_id",
"_fivetran_deleted" AS "is_deleted",
"handle" AS "url_slug",
"published_at" AS "publish_date",
"published_scope" AS "visibility_scope",
"title" AS "collection_name",
"updated_at" AS "last_updated",
"disjunctive" AS "is_disjunctive",
"rules" AS "product_rules",
"sort_order" AS "product_sort_order",
"template_suffix" AS "page_template",
"body_html" AS "description_html"
FROM "shopify_collection_data_projected"
),
"shopify_collection_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- collection_name: from DECIMAL to VARCHAR
-- description_html: from DECIMAL to VARCHAR
-- is_disjunctive: from DECIMAL to VARCHAR
-- last_updated: from VARCHAR to TIMESTAMP
-- page_template: from DECIMAL to VARCHAR
-- product_rules: from DECIMAL to VARCHAR
-- product_sort_order: from DECIMAL to VARCHAR
-- publish_date: from DECIMAL to VARCHAR
-- url_slug: from DECIMAL to VARCHAR
-- visibility_scope: from DECIMAL to VARCHAR
SELECT
"collection_id",
"is_deleted",
CAST("collection_name" AS VARCHAR) AS "collection_name",
CAST("description_html" AS VARCHAR) AS "description_html",
CAST("is_disjunctive" AS VARCHAR) AS "is_disjunctive",
CAST("last_updated" AS TIMESTAMP) AS "last_updated",
CAST("page_template" AS VARCHAR) AS "page_template",
CAST("product_rules" AS VARCHAR) AS "product_rules",
CAST("product_sort_order" AS VARCHAR) AS "product_sort_order",
CAST("publish_date" AS VARCHAR) AS "publish_date",
CAST("url_slug" AS VARCHAR) AS "url_slug",
CAST("visibility_scope" AS VARCHAR) AS "visibility_scope"
FROM "shopify_collection_data_projected_renamed"
),
"shopify_collection_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 8 columns with unacceptable missing values
-- collection_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- description_html has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- page_template has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- product_rules has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- product_sort_order has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- publish_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- url_slug has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- visibility_scope has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"collection_id",
"is_deleted",
"is_disjunctive",
"last_updated"
FROM "shopify_collection_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_collection_data_projected_renamed_casted_missing_handled"
stg_shopify_collection_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_collection_data
description: Shopify collections, one row per collection. After staging only collection_id,
is_deleted, is_disjunctive, and last_updated remain, because the handle, title,
publication, rule, sort order, template, and body HTML columns were entirely missing
in the source and are dropped. Every sampled row has is_deleted set to True, so
the data appears to describe collections that were deleted from the store.
columns:
- name: collection_id
description: Unique identifier for the collection
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each Shopify collection.
For this table, each row represents a deleted collection. The collection_id
is likely to be unique across rows as it's typically assigned by Shopify to
uniquely identify each collection.
- name: is_deleted
description: Indicates if the collection has been deleted
tests:
- not_null
- name: is_disjunctive
description: Determines if products must match all or any rules
cocoon_meta:
missing_acceptable: Not applicable for non-filterable or single-category collections.
- name: last_updated
description: Date and time of last update to the collection
tests:
- not_null
stg_shopify_order_shipping_tax_line_data (first 100 rows)
| | tax_name | row_index | shipping_tax_amount | shipping_tax_rate | order_shipping_line_id | tax_amount_currencies |
|---|---|---|---|---|---|---|
| 0 | None | 4 | 0.0 | 0.000 | 321291 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
| 1 | BANANA | 3 | 0.0 | 0.007 | 5995 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
| 2 | TOMATO | 3 | 0.0 | 0.010 | 309131 | {"shop_money":{"amount":"0.00","currency_code":"USD"},"presentment_money":{"amount":"0.00","currency_code":"USD"}} |
stg_shopify_order_shipping_tax_line_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_shipping_tax_line_data_projected" AS (
-- Projection: Selecting 6 out of 7 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"index_",
"order_shipping_line_id",
"price",
"rate",
"title",
"price_set"
FROM "shopify_order_shipping_tax_line_data"
),
"shopify_order_shipping_tax_line_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> row_index
-- price -> shipping_tax_amount
-- rate -> shipping_tax_rate
-- title -> tax_name
-- price_set -> tax_amount_currencies
SELECT
"index_" AS "row_index",
"order_shipping_line_id",
"price" AS "shipping_tax_amount",
"rate" AS "shipping_tax_rate",
"title" AS "tax_name",
"price_set" AS "tax_amount_currencies"
FROM "shopify_order_shipping_tax_line_data_projected"
),
"shopify_order_shipping_tax_line_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- tax_name: The problem is that 'BANANAN' is a misspelling of 'BANANA', and 'GEIWIHG' is an unrecognizable term that doesn't appear to be a valid fruit or vegetable name. The correct values should be common fruit or vegetable names. 'TOMATO' is already correct and doesn't need to be changed.
SELECT
"row_index",
"order_shipping_line_id",
"shipping_tax_amount",
"shipping_tax_rate",
CASE
WHEN "tax_name" = 'BANANAN' THEN 'BANANA'
WHEN "tax_name" = 'GEIWIHG' THEN ''
ELSE "tax_name"
END AS "tax_name",
"tax_amount_currencies"
FROM "shopify_order_shipping_tax_line_data_projected_renamed"
),
"shopify_order_shipping_tax_line_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- tax_name: ['']
SELECT
CASE
WHEN "tax_name" = '' THEN NULL
ELSE "tax_name"
END AS "tax_name",
"order_shipping_line_id",
"row_index",
"tax_amount_currencies",
"shipping_tax_amount",
"shipping_tax_rate"
FROM "shopify_order_shipping_tax_line_data_projected_renamed_cleaned"
),
"shopify_order_shipping_tax_line_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- order_shipping_line_id: from INT to VARCHAR
-- tax_amount_currencies: from VARCHAR to JSON
SELECT
"tax_name",
"row_index",
"shipping_tax_amount",
"shipping_tax_rate",
CAST("order_shipping_line_id" AS VARCHAR) AS "order_shipping_line_id",
CAST("tax_amount_currencies" AS JSON) AS "tax_amount_currencies"
FROM "shopify_order_shipping_tax_line_data_projected_renamed_cleaned_null"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_shipping_tax_line_data_projected_renamed_cleaned_null_casted"
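The `_cleaned` and `_cleaned_null` steps in the model above work in two passes: unrecognized strings are first mapped to `''`, and `''` is then turned into NULL. The sketch below shows the combined effect in a single expression on a few inline rows. The sample values come from the cleaning comment, and the CTE name `raw_values` and the column name `clean_tax_name` are hypothetical.

```sql
-- Sketch: combined effect of the string-cleaning and NULL-imputation steps.
-- The inline rows are illustrative values, not real data.
WITH raw_values AS (
    SELECT 'BANANAN' AS "tax_name"
    UNION ALL SELECT 'GEIWIHG'
    UNION ALL SELECT 'TOMATO'
    UNION ALL SELECT ''
)
SELECT
    "tax_name" AS "raw_tax_name",
    CASE
        WHEN "tax_name" = 'BANANAN' THEN 'BANANA'       -- fix the misspelling
        WHEN "tax_name" IN ('GEIWIHG', '') THEN NULL    -- unrecognized or empty becomes NULL
        ELSE "tax_name"                                 -- valid names pass through
    END AS "clean_tax_name"
FROM raw_values;
```

Keeping the two passes separate, as the generated model does, makes each decision auditable on its own. The single expression is only a compact way to see the end result.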
stg_shopify_order_shipping_tax_line_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_shipping_tax_line_data
description: Shipping tax lines for Shopify orders. Each row is one tax line on an
order shipping line, with the tax name, tax rate, and tax amount. The
tax_amount_currencies column holds the same amount as JSON in both the shop
currency and the presentment currency.
columns:
- name: tax_name
description: Name or code of the tax applied
cocoon_meta:
missing_acceptable: No tax applied when shipping_tax_rate is 0.0.
- name: row_index
description: Index of the tax line (source column index_)
tests:
- not_null
- name: shipping_tax_amount
description: Tax amount for the shipping line
tests:
- not_null
- name: shipping_tax_rate
description: Tax rate applied to the shipping line
tests:
- not_null
- name: order_shipping_line_id
description: Unique identifier for the order shipping line
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents the unique identifier for the order shipping
line. For this table, each row is for a specific tax line associated with
a shipping line of an order. order_shipping_line_id is likely to be unique
across rows, as it should uniquely identify each shipping line.
- name: tax_amount_currencies
description: Tax amount in shop and presentment currencies
tests:
- not_null
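The `unique` test on `order_shipping_line_id` rests on the assumption that a shipping line never carries more than one tax line, which the three sampled rows cannot confirm. A quick check is sketched below as an illustration. It assumes the staged model is queryable under its model name, and the `tax_line_count` column name is hypothetical.

```sql
-- Sketch: list shipping lines that carry more than one tax line.
-- Any row returned means the unique test on order_shipping_line_id would fail.
SELECT
    "order_shipping_line_id",
    COUNT(*) AS "tax_line_count"
FROM "stg_shopify_order_shipping_tax_line_data"
GROUP BY "order_shipping_line_id"
HAVING COUNT(*) > 1;
```

If this returns rows, the pair (`order_shipping_line_id`, `row_index`) is the more likely candidate key, though that too would need to be verified against the data.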
stg_shopify_abandoned_checkout_data (first 100 rows)
billing_address_line2 | billing_first_name | billing_full_name | currency | display_currency | billing_latitude | payment_gateway | accepts_marketing | billing_address_line1 | billing_country | customer_locale | checkout_token | discount_amount | taxes_included | customer_id | order_number | landing_page_url | billing_province_code | referral_source | billing_country_code | recovery_url | source_name | billing_longitude | discount_value | cart_token | billing_city | subtotal | billing_province | billing_last_name | abandoned_at | billing_address_id | billing_company | billing_phone | billing_zip | cc_cvv | cc_exp_month | cc_exp_year | cc_first_name | cc_last_name | cc_number | checkout_id | custom_attributes | discount_description | discount_non_applicable_reason | discount_title | discount_value_type | last_updated_at | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | None | None | None | USD | None | NaN | paypal | False | None | None | en | tnyrnbs@hh.com | f050eda12f111b261 | NaN | False | 121 | #10160311 | /collections/the-archive-sale | None | None | None | https://kitties.com/1111311610/checkouts/f050eda125a10cca513162f01101b261/recover?key=bd0fdf1dc1a1af01aecbdaa3101ec063 | web | NaN | NaN | aaaa211622dfb133 | None | 56.00 | None | None | 2020-11-12 10:06:50.111111 | None | None | None | None | None | None | None | None | None | None | 12111 | None | None | None | None | None | 2020-11-12 10:51:10.111111 |
1 | None | None | None | USD | None | 1.126113 | None | False | Apt 0 | USA | en | hyrehher@gmail.com | a165dfd11226 | NaN | False | 366525 | #13311 | /collections/sale | PA-11 | https://www.google.com/ | US | https://kitties.com/1111311610/checkouts/6661ff02165dfd11b12db112f0111226/recover?key=51611efdff11e0caccc0fd30b0e1e202 | web | -21.502661 | NaN | 611faa630ce5e6bcc0bacc2a105c0126 | Daytona Beach | 10.35 | CA | Calles | 2020-05-11 01:01:30.111111 | None | None | 50266111110.0 | None | None | None | None | None | None | None | 11111 | None | None | None | None | None | 2020-05-11 01:06:35.111111 |
2 | None | None | None | USD | USD | NaN | None | False | None | None | en | hernebbe@hr.com | l1abddd111c0211f2021c | NaN | False | 160363 | #166531 | /collections/new | None | https://l.facebook.com/ | None | https://kitties.com/1111311610/checkouts/0abddd111c0211f1e616ec0d0c32021c/recover?key=abed6505d26f1a60a50aa0c02e01be31 | web | NaN | NaN | aaaaa61e1d11af3adfac1f0 | None | 191.00 | None | None | 2021-11-11 02:05:13.111111 | None | None | None | None | None | None | None | None | None | None | 66531 | [{"name":"segment-clientID","value":"610a111c-30fc-0bb6-a25e-06f201c6035c"},{"name":"_updatedAt","value":"1613121625150"}] | None | None | None | None | 2021-11-11 02:05:55.111111 |
stg_shopify_abandoned_checkout_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_abandoned_checkout_data_removeWideColumns" AS (
-- Remove wide columns with pattern. The regex and columns are:
-- ^shipping_address_.*$: shipping_address_address_0, shipping_address_address_1, shipping_address_city, shipping_address_company, shipping_address_country, shipping_address_country_code, shipping_address_first_name, shipping_address_id, shipping_address_is_default, shipping_address_last_name ...
-- ^shipping_rate_.*$: shipping_rate_id, shipping_rate_price, shipping_rate_title
-- ^note_attribute_.*$: note_attribute_email_client_id, note_attribute_google_client_id, note_attribute_littledata_updated_at, note_attribute_segment_client_id
-- ^total_.*$: total_discounts, total_duties, total_line_items_price, total_price, total_tax, total_weight
SELECT
"_fivetran_deleted",
"_fivetran_synced",
"abandoned_checkout_url",
"applied_discount_amount",
"applied_discount_applicable",
"applied_discount_description",
"applied_discount_non_applicable_reason",
"applied_discount_title",
"applied_discount_value",
"applied_discount_value_type",
"billing_address_address_0",
"billing_address_address_1",
"billing_address_city",
"billing_address_company",
"billing_address_country",
"billing_address_country_code",
"billing_address_first_name",
"billing_address_id",
"billing_address_is_default",
"billing_address_last_name",
"billing_address_latitude",
"billing_address_longitude",
"billing_address_name",
"billing_address_phone",
"billing_address_province",
"billing_address_province_code",
"billing_address_zip",
"buyer_accepts_marketing",
"cart_token",
"closed_at",
"completed_at",
"created_at",
"credit_card_first_name",
"credit_card_last_name",
"credit_card_month",
"credit_card_number",
"credit_card_verification_value",
"credit_card_year",
"currency",
"customer_id",
"customer_locale",
"device_id",
"email",
"gateway",
"id",
"landing_site_base_url",
"location_id",
"name",
"note",
"note_attributes",
"phone",
"presentment_currency",
"referring_site",
"shipping_line",
"source",
"source_identifier",
"source_name",
"source_url",
"subtotal_price",
"taxes_included",
"token",
"updated_at",
"user_id"
FROM "shopify_abandoned_checkout_data"
),
"shopify_abandoned_checkout_data_removeWideColumns_projected" AS (
-- Projection: Selecting 62 out of 63 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"_fivetran_deleted",
"abandoned_checkout_url",
"applied_discount_amount",
"applied_discount_applicable",
"applied_discount_description",
"applied_discount_non_applicable_reason",
"applied_discount_title",
"applied_discount_value",
"applied_discount_value_type",
"billing_address_address_0",
"billing_address_address_1",
"billing_address_city",
"billing_address_company",
"billing_address_country",
"billing_address_country_code",
"billing_address_first_name",
"billing_address_id",
"billing_address_is_default",
"billing_address_last_name",
"billing_address_latitude",
"billing_address_longitude",
"billing_address_name",
"billing_address_phone",
"billing_address_province",
"billing_address_province_code",
"billing_address_zip",
"buyer_accepts_marketing",
"cart_token",
"closed_at",
"completed_at",
"created_at",
"credit_card_first_name",
"credit_card_last_name",
"credit_card_month",
"credit_card_number",
"credit_card_verification_value",
"credit_card_year",
"currency",
"customer_id",
"customer_locale",
"device_id",
"email",
"gateway",
"id",
"landing_site_base_url",
"location_id",
"name",
"note",
"note_attributes",
"phone",
"presentment_currency",
"referring_site",
"shipping_line",
"source",
"source_identifier",
"source_name",
"source_url",
"subtotal_price",
"taxes_included",
"token",
"updated_at",
"user_id"
FROM "shopify_abandoned_checkout_data_removeWideColumns"
),
"shopify_abandoned_checkout_data_removeWideColumns_projected_renamed" AS (
-- Rename: Renaming columns
-- _fivetran_deleted -> is_deleted
-- abandoned_checkout_url -> recovery_url
-- applied_discount_amount -> discount_amount
-- applied_discount_applicable -> is_discount_applicable
-- applied_discount_description -> discount_description
-- applied_discount_non_applicable_reason -> discount_non_applicable_reason
-- applied_discount_title -> discount_title
-- applied_discount_value -> discount_value
-- applied_discount_value_type -> discount_value_type
-- billing_address_address_0 -> billing_address_line1
-- billing_address_address_1 -> billing_address_line2
-- billing_address_city -> billing_city
-- billing_address_company -> billing_company
-- billing_address_country -> billing_country
-- billing_address_country_code -> billing_country_code
-- billing_address_first_name -> billing_first_name
-- billing_address_is_default -> is_default_billing_address
-- billing_address_last_name -> billing_last_name
-- billing_address_latitude -> billing_latitude
-- billing_address_longitude -> billing_longitude
-- billing_address_name -> billing_full_name
-- billing_address_phone -> billing_phone
-- billing_address_province -> billing_province
-- billing_address_province_code -> billing_province_code
-- billing_address_zip -> billing_zip
-- buyer_accepts_marketing -> accepts_marketing
-- created_at -> abandoned_at
-- credit_card_first_name -> cc_first_name
-- credit_card_last_name -> cc_last_name
-- credit_card_month -> cc_exp_month
-- credit_card_number -> cc_number
-- credit_card_verification_value -> cc_cvv
-- credit_card_year -> cc_exp_year
-- gateway -> payment_gateway
-- id -> checkout_id
-- landing_site_base_url -> landing_page_url
-- name -> order_number
-- note -> order_notes
-- note_attributes -> custom_attributes
-- presentment_currency -> display_currency
-- referring_site -> referral_source
-- shipping_line -> shipping_details
-- source -> checkout_source
-- source_identifier -> source_id
-- subtotal_price -> subtotal
-- token -> checkout_token
-- updated_at -> last_updated_at
SELECT
"_fivetran_deleted" AS "is_deleted",
"abandoned_checkout_url" AS "recovery_url",
"applied_discount_amount" AS "discount_amount",
"applied_discount_applicable" AS "is_discount_applicable",
"applied_discount_description" AS "discount_description",
"applied_discount_non_applicable_reason" AS "discount_non_applicable_reason",
"applied_discount_title" AS "discount_title",
"applied_discount_value" AS "discount_value",
"applied_discount_value_type" AS "discount_value_type",
"billing_address_address_0" AS "billing_address_line1",
"billing_address_address_1" AS "billing_address_line2",
"billing_address_city" AS "billing_city",
"billing_address_company" AS "billing_company",
"billing_address_country" AS "billing_country",
"billing_address_country_code" AS "billing_country_code",
"billing_address_first_name" AS "billing_first_name",
"billing_address_id",
"billing_address_is_default" AS "is_default_billing_address",
"billing_address_last_name" AS "billing_last_name",
"billing_address_latitude" AS "billing_latitude",
"billing_address_longitude" AS "billing_longitude",
"billing_address_name" AS "billing_full_name",
"billing_address_phone" AS "billing_phone",
"billing_address_province" AS "billing_province",
"billing_address_province_code" AS "billing_province_code",
"billing_address_zip" AS "billing_zip",
"buyer_accepts_marketing" AS "accepts_marketing",
"cart_token",
"closed_at",
"completed_at",
"created_at" AS "abandoned_at",
"credit_card_first_name" AS "cc_first_name",
"credit_card_last_name" AS "cc_last_name",
"credit_card_month" AS "cc_exp_month",
"credit_card_number" AS "cc_number",
"credit_card_verification_value" AS "cc_cvv",
"credit_card_year" AS "cc_exp_year",
"currency",
"customer_id",
"customer_locale",
"device_id",
"email",
"gateway" AS "payment_gateway",
"id" AS "checkout_id",
"landing_site_base_url" AS "landing_page_url",
"location_id",
"name" AS "order_number",
"note" AS "order_notes",
"note_attributes" AS "custom_attributes",
"phone",
"presentment_currency" AS "display_currency",
"referring_site" AS "referral_source",
"shipping_line" AS "shipping_details",
"source" AS "checkout_source",
"source_identifier" AS "source_id",
"source_name",
"source_url",
"subtotal_price" AS "subtotal",
"taxes_included",
"token" AS "checkout_token",
"updated_at" AS "last_updated_at",
"user_id"
FROM "shopify_abandoned_checkout_data_removeWideColumns_projected"
),
"shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- billing_address_line2: 'village' is too generic for an address line 2, which normally holds an apartment number, suite number, or building name. It carries no usable information, so it is mapped to an empty string.
-- billing_city: 'daytona Beach' is not properly capitalized; the corrected value is 'Daytona Beach'.
-- billing_country: 'Florida' is a U.S. state, not a country. It is mapped to the country it belongs to, 'USA'.
-- billing_first_name: 'ohio' is a state name, not a plausible first name. No correct name can be inferred, so it is mapped to an empty string to mark the value as missing.
-- billing_full_name: 'hi' is a greeting or placeholder, not a full name (which normally has at least a first and a last name). No valid name is available, so it is mapped to an empty string.
-- billing_province: 'Healdsburg' is a city, not a province or state. Healdsburg is in California, so the value is mapped to the state abbreviation 'CA'.
SELECT
"is_deleted",
"recovery_url",
"discount_amount",
"is_discount_applicable",
"discount_description",
"discount_non_applicable_reason",
"discount_title",
"discount_value",
"discount_value_type",
"billing_address_line1",
CASE
WHEN "billing_address_line2" = 'village' THEN ''
ELSE "billing_address_line2"
END AS "billing_address_line2",
CASE
WHEN "billing_city" = 'daytona Beach' THEN 'Daytona Beach'
ELSE "billing_city"
END AS "billing_city",
"billing_company",
CASE
WHEN "billing_country" = 'Florida' THEN 'USA'
ELSE "billing_country"
END AS "billing_country",
"billing_country_code",
CASE
WHEN "billing_first_name" = 'ohio' THEN ''
ELSE "billing_first_name"
END AS "billing_first_name",
"billing_address_id",
"is_default_billing_address",
"billing_last_name",
"billing_latitude",
"billing_longitude",
CASE
WHEN "billing_full_name" = 'hi' THEN ''
ELSE "billing_full_name"
END AS "billing_full_name",
"billing_phone",
CASE
WHEN "billing_province" = 'Healdsburg' THEN 'CA'
ELSE "billing_province"
END AS "billing_province",
"billing_province_code",
"billing_zip",
"accepts_marketing",
"cart_token",
"closed_at",
"completed_at",
"abandoned_at",
"cc_first_name",
"cc_last_name",
"cc_exp_month",
"cc_number",
"cc_cvv",
"cc_exp_year",
"currency",
"customer_id",
"customer_locale",
"device_id",
"email",
"payment_gateway",
"checkout_id",
"landing_page_url",
"location_id",
"order_number",
"order_notes",
"custom_attributes",
"phone",
"display_currency",
"referral_source",
"shipping_details",
"checkout_source",
"source_id",
"source_name",
"source_url",
"subtotal",
"taxes_included",
"checkout_token",
"last_updated_at",
"user_id"
FROM "shopify_abandoned_checkout_data_removeWideColumns_projected_renamed"
),
"shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- billing_address_line2: ['']
-- billing_first_name: ['']
-- billing_full_name: ['']
SELECT
CASE
WHEN "billing_address_line2" = '' THEN NULL
ELSE "billing_address_line2"
END AS "billing_address_line2",
CASE
WHEN "billing_first_name" = '' THEN NULL
ELSE "billing_first_name"
END AS "billing_first_name",
CASE
WHEN "billing_full_name" = '' THEN NULL
ELSE "billing_full_name"
END AS "billing_full_name",
"phone",
"cc_last_name",
"currency",
"display_currency",
"cc_exp_month",
"billing_latitude",
"billing_zip",
"completed_at",
"cc_exp_year",
"payment_gateway",
"shipping_details",
"billing_address_id",
"accepts_marketing",
"billing_address_line1",
"billing_country",
"discount_description",
"customer_locale",
"email",
"checkout_token",
"billing_company",
"discount_non_applicable_reason",
"order_notes",
"cc_number",
"device_id",
"location_id",
"is_default_billing_address",
"discount_amount",
"abandoned_at",
"user_id",
"discount_value_type",
"last_updated_at",
"taxes_included",
"checkout_source",
"customer_id",
"order_number",
"landing_page_url",
"billing_province_code",
"discount_title",
"is_deleted",
"source_url",
"referral_source",
"billing_country_code",
"recovery_url",
"source_name",
"billing_longitude",
"billing_phone",
"closed_at",
"cc_cvv",
"source_id",
"is_discount_applicable",
"discount_value",
"cc_first_name",
"checkout_id",
"cart_token",
"billing_city",
"custom_attributes",
"subtotal",
"billing_province",
"billing_last_name"
FROM "shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned"
),
"shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- abandoned_at: from VARCHAR to TIMESTAMP
-- billing_address_id: from DECIMAL to VARCHAR
-- billing_company: from DECIMAL to VARCHAR
-- billing_phone: from DECIMAL to VARCHAR
-- billing_zip: from DECIMAL to VARCHAR
-- cc_cvv: from DECIMAL to VARCHAR
-- cc_exp_month: from DECIMAL to VARCHAR
-- cc_exp_year: from DECIMAL to VARCHAR
-- cc_first_name: from DECIMAL to VARCHAR
-- cc_last_name: from DECIMAL to VARCHAR
-- cc_number: from DECIMAL to VARCHAR
-- checkout_id: from INT to VARCHAR
-- checkout_source: from DECIMAL to VARCHAR
-- closed_at: from DECIMAL to TIMESTAMP
-- completed_at: from DECIMAL to TIMESTAMP
-- custom_attributes: from VARCHAR to JSON
-- device_id: from DECIMAL to VARCHAR
-- discount_description: from DECIMAL to VARCHAR
-- discount_non_applicable_reason: from DECIMAL to VARCHAR
-- discount_title: from DECIMAL to VARCHAR
-- discount_value_type: from DECIMAL to VARCHAR
-- is_default_billing_address: from DECIMAL to BOOLEAN
-- is_deleted: from DECIMAL to BOOLEAN
-- is_discount_applicable: from DECIMAL to BOOLEAN
-- last_updated_at: from VARCHAR to TIMESTAMP
-- location_id: from DECIMAL to VARCHAR
-- order_notes: from DECIMAL to VARCHAR
-- phone: from DECIMAL to VARCHAR
-- shipping_details: from DECIMAL to JSON
-- source_id: from DECIMAL to VARCHAR
-- source_url: from DECIMAL to VARCHAR
-- user_id: from DECIMAL to VARCHAR
SELECT
"billing_address_line2",
"billing_first_name",
"billing_full_name",
"currency",
"display_currency",
"billing_latitude",
"payment_gateway",
"accepts_marketing",
"billing_address_line1",
"billing_country",
"customer_locale",
"email",
"checkout_token",
"discount_amount",
"taxes_included",
"customer_id",
"order_number",
"landing_page_url",
"billing_province_code",
"referral_source",
"billing_country_code",
"recovery_url",
"source_name",
"billing_longitude",
"discount_value",
"cart_token",
"billing_city",
"subtotal",
"billing_province",
"billing_last_name",
CAST("abandoned_at" AS TIMESTAMP) AS "abandoned_at",
CAST("billing_address_id" AS VARCHAR) AS "billing_address_id",
CAST("billing_company" AS VARCHAR) AS "billing_company",
CAST("billing_phone" AS VARCHAR) AS "billing_phone",
CAST("billing_zip" AS VARCHAR) AS "billing_zip",
CAST("cc_cvv" AS VARCHAR) AS "cc_cvv",
CAST("cc_exp_month" AS VARCHAR) AS "cc_exp_month",
CAST("cc_exp_year" AS VARCHAR) AS "cc_exp_year",
CAST("cc_first_name" AS VARCHAR) AS "cc_first_name",
CAST("cc_last_name" AS VARCHAR) AS "cc_last_name",
CAST("cc_number" AS VARCHAR) AS "cc_number",
CAST("checkout_id" AS VARCHAR) AS "checkout_id",
CAST("checkout_source" AS VARCHAR) AS "checkout_source",
CAST("closed_at" AS TIMESTAMP) AS "closed_at",
CAST("completed_at" AS TIMESTAMP) AS "completed_at",
CAST("custom_attributes" AS JSON) AS "custom_attributes",
CAST("device_id" AS VARCHAR) AS "device_id",
CAST("discount_description" AS VARCHAR) AS "discount_description",
CAST("discount_non_applicable_reason" AS VARCHAR) AS "discount_non_applicable_reason",
CAST("discount_title" AS VARCHAR) AS "discount_title",
CAST("discount_value_type" AS VARCHAR) AS "discount_value_type",
CAST("is_default_billing_address" AS BOOLEAN) AS "is_default_billing_address",
CAST("is_deleted" AS BOOLEAN) AS "is_deleted",
CAST("is_discount_applicable" AS BOOLEAN) AS "is_discount_applicable",
CAST("last_updated_at" AS TIMESTAMP) AS "last_updated_at",
CAST("location_id" AS VARCHAR) AS "location_id",
CAST("order_notes" AS VARCHAR) AS "order_notes",
CAST("phone" AS VARCHAR) AS "phone",
CAST("shipping_details" AS JSON) AS "shipping_details",
CAST("source_id" AS VARCHAR) AS "source_id",
CAST("source_url" AS VARCHAR) AS "source_url",
CAST("user_id" AS VARCHAR) AS "user_id"
FROM "shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned_null"
),
"shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned_null_casted_missing_handled" AS (
-- Handling missing values: There are 17 columns with unacceptable missing values
-- checkout_source has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- closed_at has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- completed_at has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- custom_attributes has 66.67 percent missing. Strategy: 🔄 Unchanged
-- device_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- display_currency has 66.67 percent missing. Strategy: 🔄 Unchanged
-- is_default_billing_address has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- is_deleted has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- is_discount_applicable has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- location_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- order_notes has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- phone has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- referral_source has 33.33 percent missing. Strategy: 🔄 Unchanged
-- shipping_details has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- source_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- source_url has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- user_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"billing_address_line2",
"billing_first_name",
"billing_full_name",
"currency",
"display_currency",
"billing_latitude",
"payment_gateway",
"accepts_marketing",
"billing_address_line1",
"billing_country",
"customer_locale",
"email",
"checkout_token",
"discount_amount",
"taxes_included",
"customer_id",
"order_number",
"landing_page_url",
"billing_province_code",
"referral_source",
"billing_country_code",
"recovery_url",
"source_name",
"billing_longitude",
"discount_value",
"cart_token",
"billing_city",
"subtotal",
"billing_province",
"billing_last_name",
"abandoned_at",
"billing_address_id",
"billing_company",
"billing_phone",
"billing_zip",
"cc_cvv",
"cc_exp_month",
"cc_exp_year",
"cc_first_name",
"cc_last_name",
"cc_number",
"checkout_id",
"custom_attributes",
"discount_description",
"discount_non_applicable_reason",
"discount_title",
"discount_value_type",
"last_updated_at"
FROM "shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_abandoned_checkout_data_removeWideColumns_projected_renamed_cleaned_null_casted_missing_handled"
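The cleaning and NULL-imputation CTEs above work as a pair: a suspicious string is first mapped to '' and the following step turns '' into NULL. Below is a minimal sketch of that chain for one column, run through Python's built-in sqlite3 module rather than the warehouse this model targets. The table name "renamed", the "row_id" column, and every sample value except 'village' are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows; only 'village' comes from the cleaning comment above.
rows = [(1, "village"), (2, "Apt 4B"), (3, None)]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "renamed" ("row_id" INTEGER, "billing_address_line2" TEXT)')
conn.executemany('INSERT INTO "renamed" VALUES (?, ?)', rows)

query = """
WITH "cleaned" AS (
    -- Step 1: map the meaningless value to an empty string
    SELECT
        "row_id",
        CASE
            WHEN "billing_address_line2" = 'village' THEN ''
            ELSE "billing_address_line2"
        END AS "billing_address_line2"
    FROM "renamed"
),
"cleaned_null" AS (
    -- Step 2: turn the disguised missing value '' into a real NULL
    SELECT
        "row_id",
        CASE
            WHEN "billing_address_line2" = '' THEN NULL
            ELSE "billing_address_line2"
        END AS "billing_address_line2"
    FROM "cleaned"
)
SELECT "billing_address_line2" FROM "cleaned_null" ORDER BY "row_id"
"""

# SQLite NULLs come back as Python None.
result = [r[0] for r in conn.execute(query)]
print(result)
```

Rows that were already NULL pass through both CASE expressions untouched, because a comparison with NULL never matches a WHEN branch.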
stg_shopify_abandoned_checkout_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_abandoned_checkout_data
description: This table records abandoned checkouts on a Shopify store. It contains
details of incomplete orders including customer information, billing address,
discounts, pricing, and checkout URLs. Each row represents a single abandoned
checkout with data like email, currency, subtotal, and timestamps. The table tracks
customer behavior and potential sales that were not completed.
columns:
- name: billing_address_line2
description: Second line of billing address
cocoon_meta:
missing_acceptable: No secondary address line needed
- name: billing_first_name
description: First name in billing address
cocoon_meta:
missing_acceptable: No billing name provided for the transaction
- name: billing_full_name
description: Full name in billing address
cocoon_meta:
missing_acceptable: No billing name provided for the transaction
- name: currency
description: Currency used for the transaction
tests:
- not_null
- name: display_currency
description: Currency presented to the customer
tests:
- not_null
- name: billing_latitude
description: Latitude of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: payment_gateway
description: Payment gateway used
tests:
- accepted_values:
values:
- PayPal
- Stripe
- Square
- Authorize.Net
- Braintree
- 2Checkout
- Amazon Pay
- Google Pay
- Apple Pay
- Skrill
- Klarna
- Adyen
- WorldPay
- Sage Pay
- Dwolla
- WePay
- Payoneer
- BlueSnap
- Checkout.com
- Alipay
- paypal
cocoon_meta:
missing_acceptable: Not applicable when payment hasn't been processed yet.
- name: accepts_marketing
description: Indicates if buyer accepts marketing emails
tests:
- not_null
- name: billing_address_line1
description: First line of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: billing_country
description: Country of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: customer_locale
description: Language/region setting of the customer
tests:
- not_null
- name: email
description: Customer's email address
tests:
- not_null
- name: checkout_token
description: Unique token for the abandoned checkout
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains a unique token for each abandoned checkout.
For this table, each row is an abandoned checkout. Checkout token is designed
to be a unique identifier for each checkout session.
- name: discount_amount
description: Amount of discount applied to the order
cocoon_meta:
missing_acceptable: No discount applied to the transaction
- name: taxes_included
description: Whether taxes are included in the price
tests:
- not_null
- name: customer_id
description: Unique identifier for the customer who abandoned the cart
tests:
- not_null
- name: order_number
description: Order number or identifier
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains an order number or identifier. For this table,
each row is an abandoned checkout. Order numbers are typically unique for
each order or checkout attempt.
- name: landing_page_url
description: URL of the page where customer entered site
tests:
- not_null
- name: billing_province_code
description: Province or state code of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: referral_source
description: Website that referred the customer
tests:
- not_null
- name: billing_country_code
description: Country code of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: recovery_url
description: URL for recovering the abandoned checkout
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains the URL for recovering the abandoned checkout.
For this table, each row is an abandoned checkout. The recovery URL appears
to be unique for each abandoned checkout, as it contains a unique token.
- name: source_name
description: Name of the checkout source
tests:
- not_null
- accepted_values:
values:
- web
- mobile
- desktop
- tablet
- kiosk
- api
- in-store
- phone
- mail
- fax
- social_media
- voice_assistant
- smartwatch
- smart_tv
- game_console
- iot_device
- name: billing_longitude
description: Longitude of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: discount_value
description: Value of the applied discount
cocoon_meta:
missing_acceptable: No discount applied to the transaction
- name: cart_token
description: Unique identifier for the shopping cart
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains a unique identifier for the shopping cart.
For this table, each row is an abandoned checkout. The cart token is likely
to be unique for each abandoned cart.
- name: billing_city
description: City of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: subtotal
description: Subtotal of the order before taxes/shipping
tests:
- not_null
- name: billing_province
description: Province or state of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: billing_last_name
description: Last name in billing address
cocoon_meta:
missing_acceptable: No billing name provided for the transaction
- name: abandoned_at
description: Timestamp of when the checkout was abandoned
tests:
- not_null
- name: billing_address_id
description: Unique identifier for billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: billing_company
description: Company name in billing address
cocoon_meta:
missing_acceptable: No company associated with the billing
- name: billing_phone
description: Phone number in billing address
cocoon_meta:
missing_acceptable: No phone number provided for billing
- name: billing_zip
description: ZIP or postal code of billing address
cocoon_meta:
missing_acceptable: No billing address provided for the transaction
- name: cc_cvv
description: CVV of the credit card
cocoon_meta:
missing_acceptable: Credit card not used for the transaction
- name: cc_exp_month
description: Expiration month of the credit card
cocoon_meta:
missing_acceptable: Credit card not used for the transaction
- name: cc_exp_year
description: Expiration year of the credit card
cocoon_meta:
missing_acceptable: Credit card not used for the transaction
- name: cc_first_name
description: First name on the credit card
cocoon_meta:
missing_acceptable: Credit card not used for the transaction
- name: cc_last_name
description: Last name on the credit card
cocoon_meta:
missing_acceptable: Credit card not used for the transaction
- name: cc_number
description: Credit card number (likely masked)
cocoon_meta:
missing_acceptable: Credit card not used for the transaction
- name: checkout_id
description: Unique identifier for the abandoned checkout
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is a unique identifier for the abandoned checkout. For
this table, each row represents a unique abandoned cart. As it's designed
to be a unique identifier, it should be unique across all rows and can identify
each abandoned cart uniquely.
- name: custom_attributes
description: Custom attributes for the order
tests:
- not_null
- name: discount_description
description: Description of the applied discount
cocoon_meta:
missing_acceptable: No discount applied to the transaction
- name: discount_non_applicable_reason
description: Reason why discount is not applicable
cocoon_meta:
missing_acceptable: No discount applied to the transaction
- name: discount_title
description: Title of the applied discount
cocoon_meta:
missing_acceptable: No discount applied to the transaction
- name: discount_value_type
description: Type of discount value (percentage or fixed)
cocoon_meta:
missing_acceptable: No discount applied to the transaction
- name: last_updated_at
description: Timestamp of when the abandoned cart was last updated
tests:
- not_null
stg_shopify_order_tag_data (first 100 rows)
| | tag_group_id | order_id | color_tag |
---|---|---|---|
0 | 1 | 6411 | #333333 |
1 | 1 | 47195 | #222222 |
2 | 1 | 46553 | #771222 |
stg_shopify_order_tag_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_tag_data_projected" AS (
-- Projection: Selecting 3 out of 4 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"index_",
"order_id",
"value_"
FROM "shopify_order_tag_data"
),
"shopify_order_tag_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> tag_group_id
-- value_ -> color_tag
SELECT
"index_" AS "tag_group_id",
"order_id",
"value_" AS "color_tag"
FROM "shopify_order_tag_data_projected"
),
"shopify_order_tag_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- color_tag: '#22222' and '#33333' are invalid hex color codes: they have 5 hex digits, but a hex color code needs 6 (or 3 in shorthand notation). Assuming the last digit was accidentally omitted, it is duplicated to produce the valid codes '#222222' and '#333333'.
SELECT
"tag_group_id",
"order_id",
CASE
WHEN "color_tag" = '#22222' THEN '#222222'
WHEN "color_tag" = '#33333' THEN '#333333'
ELSE "color_tag"
END AS "color_tag"
FROM "shopify_order_tag_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_tag_data_projected_renamed_cleaned"
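The color_tag fix above can also be checked outside SQL. Here is a small Python sketch, assuming a tag counts as valid when it is '#' followed by exactly 3 or 6 hex digits. The helper names are hypothetical, and the mapping covers only the two values named in the cleaning comment; it is not a general repair rule.

```python
import re

# '#' followed by exactly 3 or exactly 6 hexadecimal digits.
HEX_COLOR = re.compile(r"#(?:[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})")

# Explicit mappings taken from the CASE expression above.
FIXES = {"#22222": "#222222", "#33333": "#333333"}


def is_valid_hex_color(value: str) -> bool:
    """Return True when the whole string is a 3- or 6-digit hex color code."""
    return HEX_COLOR.fullmatch(value) is not None


def clean_color_tag(value: str) -> str:
    """Apply the known fixes and leave every other value unchanged."""
    return FIXES.get(value, value)


raw_tags = ["#33333", "#22222", "#771222"]
cleaned_tags = [clean_color_tag(tag) for tag in raw_tags]
print(cleaned_tags)
```

A value that is invalid but not in the mapping stays as it is, which mirrors the ELSE branch of the SQL.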
stg_shopify_order_tag_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_tag_data
description: The table is about Shopify order tags. Each row represents a tag associated
with an order. The table contains an index, order ID, and a tag value. The tag
values appear to be color codes starting with '#'. This table likely allows attaching
additional metadata or categorization to Shopify orders.
columns:
- name: tag_group_id
description: Identifier for grouping related tags
tests:
- not_null
- name: order_id
description: Unique identifier for a Shopify order
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for a Shopify order.
For this table, each row represents a tag associated with an order. Since
each order can have only one tag in this table structure, order_id is likely
to be unique across rows.
- name: color_tag
description: Color code tag associated with the order
tests:
- not_null
stg_shopify_order_line_refund_data (first 100 rows)
| | store_location_id | restock_type | refunded_quantity | refund_tax_amount | original_order_line_id | refund_id | refund_line_item_id | refund_subtotal |
---|---|---|---|---|---|---|---|---|
0 | 3.213171e+10 | return | 1 | 19.74 | 6113984839751 | 679976206407 | 189012115527 | 415.0 |
1 | 3.213171e+10 | return | 1 | 56.33 | 9698959196231 | 800919683143 | 289901510727 | 415.0 |
2 | 3.213171e+10 | return | 1 | 16.18 | 6423996530759 | 686409187399 | 196428005447 | 415.0 |
3 | NaN | no_restock | 1 | 26.17 | 6367161483335 | 798222680135 | 286567268423 | 415.0 |
4 | NaN | no_restock | 1 | 13.75 | 6009460064327 | 677359190087 | 185936773191 | 415.0 |
stg_shopify_order_line_refund_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_order_line_refund_data_projected" AS (
-- Projection: Selecting 10 out of 11 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"location_id",
"refund_id",
"restock_type",
"quantity",
"order_line_id",
"subtotal",
"total_tax_set",
"subtotal_set",
"total_tax"
FROM "shopify_order_line_refund_data"
),
"shopify_order_line_refund_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> refund_line_item_id
-- location_id -> store_location_id
-- quantity -> refunded_quantity
-- order_line_id -> original_order_line_id
-- subtotal -> refund_subtotal
-- total_tax_set -> tax_amount_set
-- total_tax -> refund_tax_amount
SELECT
"id" AS "refund_line_item_id",
"location_id" AS "store_location_id",
"refund_id",
"restock_type",
"quantity" AS "refunded_quantity",
"order_line_id" AS "original_order_line_id",
"subtotal" AS "refund_subtotal",
"total_tax_set" AS "tax_amount_set",
"subtotal_set",
"total_tax" AS "refund_tax_amount"
FROM "shopify_order_line_refund_data_projected"
),
"shopify_order_line_refund_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- original_order_line_id: from INT to VARCHAR
-- refund_id: from INT to VARCHAR
-- refund_line_item_id: from INT to VARCHAR
-- refund_subtotal: from INT to DECIMAL
-- subtotal_set: from DECIMAL to VARCHAR
-- tax_amount_set: from DECIMAL to VARCHAR
SELECT
"store_location_id",
"restock_type",
"refunded_quantity",
"refund_tax_amount",
CAST("original_order_line_id" AS VARCHAR) AS "original_order_line_id",
CAST("refund_id" AS VARCHAR) AS "refund_id",
CAST("refund_line_item_id" AS VARCHAR) AS "refund_line_item_id",
CAST("refund_subtotal" AS DECIMAL) AS "refund_subtotal",
CAST("subtotal_set" AS VARCHAR) AS "subtotal_set",
CAST("tax_amount_set" AS VARCHAR) AS "tax_amount_set"
FROM "shopify_order_line_refund_data_projected_renamed"
),
"shopify_order_line_refund_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 2 columns with unacceptable missing values
-- subtotal_set has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- tax_amount_set has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"store_location_id",
"restock_type",
"refunded_quantity",
"refund_tax_amount",
"original_order_line_id",
"refund_id",
"refund_line_item_id",
"refund_subtotal"
FROM "shopify_order_line_refund_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_order_line_refund_data_projected_renamed_casted_missing_handled"
stg_shopify_order_line_refund_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_order_line_refund_data
description: The table is about Shopify order line refund data. It includes details
such as refund ID, location ID, restock type, quantity refunded, order line ID,
subtotal, and tax information. Each row represents a single refund line item associated
with an order. The table tracks both returns and no-restock refunds, providing
financial and operational information for each refunded item.
columns:
- name: store_location_id
description: Identifier for the store location
cocoon_meta:
missing_acceptable: Not applicable for 'no_restock' refund types.
- name: restock_type
description: Indicates if item is returned or not restocked
tests:
- not_null
- accepted_values:
values:
- return
- no_restock
- name: refunded_quantity
description: Number of items refunded
tests:
- not_null
- name: refund_tax_amount
description: Total tax amount refunded
tests:
- not_null
- name: original_order_line_id
description: Identifier for the original order line item
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents the identifier for the original order line
item. For this table, each row is a single refund line item associated with
an order. The original_order_line_id is likely to be unique across rows as
each refund typically corresponds to a unique order line.
- name: refund_id
description: Unique identifier for the overall refund
tests:
- not_null
- name: refund_line_item_id
description: Unique identifier for the refund line item
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is the unique identifier for the refund line item. For
this table, each row represents a distinct refund line item. Therefore, the
refund_line_item_id should be unique across all rows.
- name: refund_subtotal
description: Refunded amount before tax
tests:
- not_null
stg_shopify_discount_code_data (first 100 rows)
| | discount_id | discount_code | price_rule_id | usage_count | created_at | updated_at |
---|---|---|---|---|---|---|
0 | 4773499 | CHECKVB34DDBQ3VH | 32543 | 0.0 | 2021-12-10 06:48:35 | 2021-12-10 06:48:35 |
1 | 436267 | CHECKVBLJG22DDD | 12543 | 0.0 | 2021-12-10 06:48:35 | 2021-12-10 06:48:35 |
2 | 469035 | CHECKV44CCCBCWB7 | 12543 | 0.0 | 2021-12-10 06:48:35 | 2021-12-10 06:48:35 |
stg_shopify_discount_code_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_discount_code_data_projected" AS (
-- Projection: Selecting 6 out of 7 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"code",
"created_at",
"price_rule_id",
"updated_at",
"usage_count"
FROM "shopify_discount_code_data"
),
"shopify_discount_code_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> discount_id
-- code -> discount_code
SELECT
"id" AS "discount_id",
"code" AS "discount_code",
"created_at",
"price_rule_id",
"updated_at",
"usage_count"
FROM "shopify_discount_code_data_projected"
),
"shopify_discount_code_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- created_at: from VARCHAR to TIMESTAMP
-- updated_at: from VARCHAR to TIMESTAMP
SELECT
"discount_id",
"discount_code",
"price_rule_id",
"usage_count",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("updated_at" AS TIMESTAMP) AS "updated_at"
FROM "shopify_discount_code_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_discount_code_data_projected_renamed_casted"
stg_shopify_discount_code_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_discount_code_data
description: The table is about discount codes. It contains details such as the
unique identifier, the actual code, creation and update timestamps, associated
price rule ID, and usage count. Each row represents a specific discount code with
its properties. The table tracks information needed to manage and apply discounts
in an online store.
columns:
- name: discount_id
description: Unique identifier for the discount code entry
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each discount code
entry. For this table, each row represents a specific discount code, and discount_id
is unique across rows.
- name: discount_code
description: Unique discount code for customer use
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column contains the actual discount code that customers use.
For this table, each row represents a specific discount code, and discount_code
is unique across rows as each code is designed to be distinct.
- name: price_rule_id
description: ID of the associated pricing rule
tests:
- not_null
- name: usage_count
description: Number of times the discount code has been used
tests:
- not_null
- name: created_at
description: Timestamp when the discount code was created
tests:
- not_null
- name: updated_at
description: Timestamp of the last update to the entry
tests:
- not_null
stg_shopify_abandoned_checkout_discount_code_data (first 100 rows)
| | checkout_id | discount_index | discount_amount | discount_code | discount_type | discount_created_at | discount_updated_at |
---|---|---|---|---|---|---|---|
0 | 901163 | 0 | 0.0 | CYBER12 | percentage | NaT | NaT |
1 | 4334827 | 0 | 0.0 | CYBER12 | percentage | NaT | NaT |
2 | 4566403 | 0 | 0.0 | BONUS | percentage | NaT | NaT |
stg_shopify_abandoned_checkout_discount_code_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_abandoned_checkout_discount_code_data_projected" AS (
-- Projection: Selecting 9 out of 10 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"checkout_id",
"index_",
"amount",
"discount_id",
"code",
"created_at",
"type",
"updated_at",
"usage_count"
FROM "shopify_abandoned_checkout_discount_code_data"
),
"shopify_abandoned_checkout_discount_code_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> discount_index
-- amount -> discount_amount
-- code -> discount_code
-- created_at -> discount_created_at
-- type -> discount_type
-- updated_at -> discount_updated_at
-- usage_count -> discount_usage_count
SELECT
"checkout_id",
"index_" AS "discount_index",
"amount" AS "discount_amount",
"discount_id",
"code" AS "discount_code",
"created_at" AS "discount_created_at",
"type" AS "discount_type",
"updated_at" AS "discount_updated_at",
"usage_count" AS "discount_usage_count"
FROM "shopify_abandoned_checkout_discount_code_data_projected"
),
"shopify_abandoned_checkout_discount_code_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- discount_created_at: from DECIMAL to TIMESTAMP
-- discount_id: from DECIMAL to VARCHAR
-- discount_updated_at: from DECIMAL to TIMESTAMP
-- discount_usage_count: from DECIMAL to INT
SELECT
"checkout_id",
"discount_index",
"discount_amount",
"discount_code",
"discount_type",
CAST("discount_created_at" AS TIMESTAMP) AS "discount_created_at",
CAST("discount_id" AS VARCHAR) AS "discount_id",
CAST("discount_updated_at" AS TIMESTAMP) AS "discount_updated_at",
CAST("discount_usage_count" AS INT) AS "discount_usage_count"
FROM "shopify_abandoned_checkout_discount_code_data_projected_renamed"
),
"shopify_abandoned_checkout_discount_code_data_projected_renamed_casted_missing_handled" AS (
-- Handling missing values: There are 2 columns with unacceptable missing values
-- discount_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- discount_usage_count has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"checkout_id",
"discount_index",
"discount_amount",
"discount_code",
"discount_type",
"discount_created_at",
"discount_updated_at"
FROM "shopify_abandoned_checkout_discount_code_data_projected_renamed_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_abandoned_checkout_discount_code_data_projected_renamed_casted_missing_handled"
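The missing-value step above drops columns that are 100 percent missing. How Cocoon computes that figure is not shown here, so the following is only a minimal Python sketch of the idea: the helper names `missing_percent` and `columns_to_drop` and the `threshold` parameter are hypothetical, and the toy rows mirror the sample values with the two dropped columns set to `None`.

```python
from typing import Any, Dict, List


def missing_percent(rows: List[Dict[str, Any]]) -> Dict[str, float]:
    """Return the percentage of None values for each column."""
    if not rows:
        return {}
    total = len(rows)
    return {
        col: 100.0 * sum(row[col] is None for row in rows) / total
        for col in rows[0]
    }


def columns_to_drop(rows: List[Dict[str, Any]], threshold: float = 100.0) -> List[str]:
    """List columns whose missing percentage reaches the (assumed) threshold."""
    return [col for col, pct in missing_percent(rows).items() if pct >= threshold]


# Toy rows: discount_id and discount_usage_count are entirely missing.
rows = [
    {"checkout_id": 901163, "discount_code": "CYBER12", "discount_id": None, "discount_usage_count": None},
    {"checkout_id": 4334827, "discount_code": "CYBER12", "discount_id": None, "discount_usage_count": None},
    {"checkout_id": 4566403, "discount_code": "BONUS", "discount_id": None, "discount_usage_count": None},
]

dropped = columns_to_drop(rows)
print(dropped)  # ['discount_id', 'discount_usage_count']
```

Columns that are only partly missing, such as the timestamp columns in this table, fall below the threshold and are kept.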
stg_shopify_abandoned_checkout_discount_code_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_abandoned_checkout_discount_code_data
description: The table is about discount codes applied to abandoned checkouts in
Shopify. It includes details like checkout ID, discount position, discount code,
amount, type (e.g., percentage), and creation and update timestamps. Each row
represents a specific checkout with an applied discount code. The table tracks
information about discounts offered to encourage completion of abandoned carts.
columns:
- name: checkout_id
description: Unique identifier for the abandoned checkout
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each abandoned checkout.
For this table, each row represents a specific checkout with an applied discount
code. The checkout_id is unique across rows, as it's designed to uniquely
identify each abandoned cart.
- name: discount_index
description: Position or order of the discount
tests:
- not_null
- name: discount_amount
description: Discount amount applied to the checkout
tests:
- not_null
- name: discount_code
description: Discount code applied to the checkout
tests:
- not_null
- name: discount_type
description: Type of discount (e.g., percentage)
tests:
- not_null
- accepted_values:
values:
- percentage
- fixed amount
- buy one get one free (BOGO)
- free shipping
- bundle discount
- loyalty points
- seasonal discount
- first-time customer discount
- volume discount
- rebate
- name: discount_created_at
description: Timestamp when the discount was created
cocoon_meta:
missing_acceptable: Not applicable for discounts that haven't been modified.
- name: discount_updated_at
description: Timestamp when the discount was last updated
cocoon_meta:
missing_acceptable: Not applicable for discounts that haven't been updated.
stg_shopify_customer_tag_data (first 100 rows)
| | tag_index | tag_value | customer_id |
---|---|---|---|
0 | 1 | GGPP | 9919268 |
1 | 1 | GGPP | 4404 |
2 | 1 | GGPP | 5509188 |
stg_shopify_customer_tag_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_customer_tag_data_projected" AS (
-- Projection: Selecting 3 out of 4 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"customer_id",
"index_",
"value_"
FROM "shopify_customer_tag_data"
),
"shopify_customer_tag_data_projected_renamed" AS (
-- Rename: Renaming columns
-- index_ -> tag_index
-- value_ -> tag_value
SELECT
"customer_id",
"index_" AS "tag_index",
"value_" AS "tag_value"
FROM "shopify_customer_tag_data_projected"
),
"shopify_customer_tag_data_projected_renamed_casted" AS (
-- Column Type Casting:
-- customer_id: from INT to VARCHAR
SELECT
"tag_index",
"tag_value",
CAST("customer_id" AS VARCHAR) AS "customer_id"
FROM "shopify_customer_tag_data_projected_renamed"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_customer_tag_data_projected_renamed_casted"
stg_shopify_customer_tag_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_customer_tag_data
description: The table is about customer tags in a Shopify system. It contains customer
IDs and associated tag values. Each row represents a customer with their unique
identifier and a corresponding tag. The tag_index column (renamed from index_) suggests
there might be multiple tags per customer, but only one tag ('GGPP') is shown in the samples.
columns:
- name: tag_index
description: Position of the tag among the customer's tags
tests:
- not_null
- name: tag_value
description: The tag value associated with the customer
tests:
- not_null
- name: customer_id
description: Unique identifier for each customer
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is a unique identifier for each customer. For this table,
each row represents a tag associated with a customer. customer_id appears
to be unique across rows in the sample data, and it's described as a "Unique
identifier for each customer" in the given information.
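The description above notes that tag_index hints at several tags per customer, while customer_id carries a `unique` test. If a customer ever has a second tag, that test would fail. The sketch below is a hypothetical Python helper, not part of dbt or Cocoon, that compares candidate keys on toy rows; the `VIP` tag is invented for illustration.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence, Tuple


def duplicate_keys(rows: List[Dict[str, Any]], key_cols: Sequence[str]) -> List[Tuple[Any, ...]]:
    """Return the key tuples that occur in more than one row."""
    counts = Counter(tuple(row[col] for col in key_cols) for row in rows)
    return [key for key, n in counts.items() if n > 1]


rows = [
    {"customer_id": "9919268", "tag_index": 1, "tag_value": "GGPP"},
    {"customer_id": "4404", "tag_index": 1, "tag_value": "GGPP"},
    # Hypothetical second tag for the same customer.
    {"customer_id": "4404", "tag_index": 2, "tag_value": "VIP"},
]

by_customer = duplicate_keys(rows, ["customer_id"])
by_customer_and_index = duplicate_keys(rows, ["customer_id", "tag_index"])
print(by_customer)            # [('4404',)]
print(by_customer_and_index)  # []
```

On data shaped like this, the pair (customer_id, tag_index) would be the safer key to test for uniqueness.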
stg_shopify_transaction_data (first 100 rows)
| | transaction_type | amount | avs_result_code | transaction_status | authorization_code | currency_code | is_test_transaction | created_at | credit_card_bin | credit_card_company | credit_card_number | cvv_result_code | exchange_adjustment | exchange_currency | exchange_final_amount | exchange_id | exchange_original_amount | order_id | parent_transaction_id | processed_at | receipt_details | refund_id | transaction_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | sale | 415.00 | Z | success | abcd999999 | USD | False | 2020-02-27 16:05:37 | None | None | None | None | None | None | None | None | None | 2181743870023 | None | 2020-02-27 16:05:37 | { "charges": { "data": [ { "balance_transaction": { "exchange_rate": null } }] }} | None | 2667417567303 |
1 | sale | 415.00 | Y | success | abcd888888 | USD | False | 2020-01-12 20:06:37 | None | None | None | None | None | None | None | None | None | 2089104834631 | None | 2020-01-12 20:06:37 | None | None | 2572210896967 |
2 | sale | 415.00 | None | success | abcd77777 | USD | False | 2020-02-26 00:12:37 | None | None | None | None | None | None | None | None | None | 2179107356743 | None | 2020-02-26 00:12:37 | { "charges": { "data": [ { "balance_transaction": { "exchange_rate": "0.523" } }] }} | None | 2664325611591 |
3 | sale | 15.95 | Y | success | abcd66666 | USD | False | 2020-01-26 11:04:41 | None | None | None | None | None | None | None | None | None | 2114590769223 | None | 2020-01-26 11:04:41 | None | None | 2595729735751 |
4 | sale | 212.12 | None | success | abcd5555 | USD | False | 2020-03-18 00:17:24 | None | None | None | None | None | None | None | None | None | 2214516916295 | None | 2020-03-18 00:17:24 | { "charges": { "data": [ { "balance_transaction": { "exchange_rate": "0.96581" } }] }} | None | 2705030512711 |
stg_shopify_transaction_data.sql (clean the table)
-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
WITH
"shopify_transaction_data_projected" AS (
-- Projection: Selecting 30 out of 31 columns
-- Columns projected out: ['_fivetran_synced']
SELECT
"id",
"order_id",
"refund_id",
"amount",
"authorization_",
"created_at",
"processed_at",
"device_id",
"gateway",
"source_name",
"message",
"currency",
"location_id",
"parent_id",
"payment_avs_result_code",
"kind",
"currency_exchange_id",
"currency_exchange_adjustment",
"currency_exchange_original_amount",
"currency_exchange_final_amount",
"currency_exchange_currency",
"error_code",
"status",
"test",
"user_id",
"payment_credit_card_bin",
"payment_cvv_result_code",
"payment_credit_card_number",
"payment_credit_card_company",
"receipt"
FROM "shopify_transaction_data"
),
"shopify_transaction_data_projected_renamed" AS (
-- Rename: Renaming columns
-- id -> transaction_id
-- authorization_ -> authorization_code
-- gateway -> payment_gateway
-- message -> transaction_message
-- currency -> currency_code
-- parent_id -> parent_transaction_id
-- payment_avs_result_code -> avs_result_code
-- kind -> transaction_type
-- currency_exchange_id -> exchange_id
-- currency_exchange_adjustment -> exchange_adjustment
-- currency_exchange_original_amount -> exchange_original_amount
-- currency_exchange_final_amount -> exchange_final_amount
-- currency_exchange_currency -> exchange_currency
-- status -> transaction_status
-- test -> is_test_transaction
-- payment_credit_card_bin -> credit_card_bin
-- payment_cvv_result_code -> cvv_result_code
-- payment_credit_card_number -> credit_card_number
-- payment_credit_card_company -> credit_card_company
-- receipt -> receipt_details
SELECT
"id" AS "transaction_id",
"order_id",
"refund_id",
"amount",
"authorization_" AS "authorization_code",
"created_at",
"processed_at",
"device_id",
"gateway" AS "payment_gateway",
"source_name",
"message" AS "transaction_message",
"currency" AS "currency_code",
"location_id",
"parent_id" AS "parent_transaction_id",
"payment_avs_result_code" AS "avs_result_code",
"kind" AS "transaction_type",
"currency_exchange_id" AS "exchange_id",
"currency_exchange_adjustment" AS "exchange_adjustment",
"currency_exchange_original_amount" AS "exchange_original_amount",
"currency_exchange_final_amount" AS "exchange_final_amount",
"currency_exchange_currency" AS "exchange_currency",
"error_code",
"status" AS "transaction_status",
"test" AS "is_test_transaction",
"user_id",
"payment_credit_card_bin" AS "credit_card_bin",
"payment_cvv_result_code" AS "cvv_result_code",
"payment_credit_card_number" AS "credit_card_number",
"payment_credit_card_company" AS "credit_card_company",
"receipt" AS "receipt_details"
FROM "shopify_transaction_data_projected"
),
"shopify_transaction_data_projected_renamed_cleaned" AS (
-- Clean unusual string values:
-- payment_gateway: The problem is that 'gateway_here' is not a real payment gateway name but a placeholder. This indicates that the actual payment gateway information was not properly filled in or was intentionally obscured. In a real dataset, we would expect to see names of actual payment gateways such as PayPal, Stripe, Square, etc. Since we don't have any information about what the real gateway should be, we can't map it to a correct value. In this case, it's best to map it to an empty string to indicate missing data.
-- source_name: The problem is that 'source_name' appears to be a column header that has been mistakenly included in the data values, rather than actual source name data. This is unusual because column names should typically be separate from the data values. The correct values for a source_name column would be actual names of sources, not the column header itself. Since we don't have information about the correct source names, we should map this to an empty string to remove the erroneous data.
-- transaction_message: The problem is that 'message_here' is a placeholder value and not an actual transaction message. It appears to be the only value in the column, which suggests that real transaction messages are missing or were not properly recorded. The correct values should be actual transaction messages specific to each transaction, but since we don't have that information, we can't map it to a meaningful value.
SELECT
"transaction_id",
"order_id",
"refund_id",
"amount",
"authorization_code",
"created_at",
"processed_at",
"device_id",
CASE
WHEN "payment_gateway" = 'gateway_here' THEN ''
ELSE "payment_gateway"
END AS "payment_gateway",
CASE
WHEN "source_name" = 'source_name' THEN ''
ELSE "source_name"
END AS "source_name",
CASE
WHEN "transaction_message" = 'message_here' THEN ''
ELSE "transaction_message"
END AS "transaction_message",
"currency_code",
"location_id",
"parent_transaction_id",
"avs_result_code",
"transaction_type",
"exchange_id",
"exchange_adjustment",
"exchange_original_amount",
"exchange_final_amount",
"exchange_currency",
"error_code",
"transaction_status",
"is_test_transaction",
"user_id",
"credit_card_bin",
"cvv_result_code",
"credit_card_number",
"credit_card_company",
"receipt_details"
FROM "shopify_transaction_data_projected_renamed"
),
"shopify_transaction_data_projected_renamed_cleaned_null" AS (
-- NULL Imputation: Impute Null to Disguised Missing Values
-- payment_gateway: ['']
-- source_name: ['']
-- transaction_message: ['']
SELECT
CASE
WHEN "payment_gateway" = '' THEN NULL
ELSE "payment_gateway"
END AS "payment_gateway",
CASE
WHEN "source_name" = '' THEN NULL
ELSE "source_name"
END AS "source_name",
CASE
WHEN "transaction_message" = '' THEN NULL
ELSE "transaction_message"
END AS "transaction_message",
"exchange_currency",
"location_id",
"transaction_type",
"error_code",
"credit_card_company",
"amount",
"transaction_id",
"user_id",
"order_id",
"exchange_final_amount",
"credit_card_number",
"avs_result_code",
"cvv_result_code",
"parent_transaction_id",
"refund_id",
"transaction_status",
"authorization_code",
"credit_card_bin",
"currency_code",
"device_id",
"exchange_adjustment",
"exchange_original_amount",
"is_test_transaction",
"exchange_id",
"created_at",
"processed_at",
"receipt_details"
FROM "shopify_transaction_data_projected_renamed_cleaned"
),
"shopify_transaction_data_projected_renamed_cleaned_null_casted" AS (
-- Column Type Casting:
-- created_at: from VARCHAR to TIMESTAMP
-- credit_card_bin: from DECIMAL to VARCHAR
-- credit_card_company: from DECIMAL to VARCHAR
-- credit_card_number: from DECIMAL to VARCHAR
-- cvv_result_code: from DECIMAL to VARCHAR
-- device_id: from DECIMAL to VARCHAR
-- error_code: from DECIMAL to VARCHAR
-- exchange_adjustment: from DECIMAL to VARCHAR
-- exchange_currency: from DECIMAL to VARCHAR
-- exchange_final_amount: from DECIMAL to VARCHAR
-- exchange_id: from DECIMAL to VARCHAR
-- exchange_original_amount: from DECIMAL to VARCHAR
-- location_id: from DECIMAL to VARCHAR
-- order_id: from INT to VARCHAR
-- parent_transaction_id: from DECIMAL to VARCHAR
-- processed_at: from VARCHAR to TIMESTAMP
-- receipt_details: from VARCHAR to JSON
-- refund_id: from DECIMAL to VARCHAR
-- transaction_id: from INT to VARCHAR
-- user_id: from DECIMAL to VARCHAR
SELECT
"payment_gateway",
"source_name",
"transaction_message",
"transaction_type",
"amount",
"avs_result_code",
"transaction_status",
"authorization_code",
"currency_code",
"is_test_transaction",
CAST("created_at" AS TIMESTAMP) AS "created_at",
CAST("credit_card_bin" AS VARCHAR) AS "credit_card_bin",
CAST("credit_card_company" AS VARCHAR) AS "credit_card_company",
CAST("credit_card_number" AS VARCHAR) AS "credit_card_number",
CAST("cvv_result_code" AS VARCHAR) AS "cvv_result_code",
CAST("device_id" AS VARCHAR) AS "device_id",
CAST("error_code" AS VARCHAR) AS "error_code",
CAST("exchange_adjustment" AS VARCHAR) AS "exchange_adjustment",
CAST("exchange_currency" AS VARCHAR) AS "exchange_currency",
CAST("exchange_final_amount" AS VARCHAR) AS "exchange_final_amount",
CAST("exchange_id" AS VARCHAR) AS "exchange_id",
CAST("exchange_original_amount" AS VARCHAR) AS "exchange_original_amount",
CAST("location_id" AS VARCHAR) AS "location_id",
CAST("order_id" AS VARCHAR) AS "order_id",
CAST("parent_transaction_id" AS VARCHAR) AS "parent_transaction_id",
CAST("processed_at" AS TIMESTAMP) AS "processed_at",
CAST("receipt_details" AS JSON) AS "receipt_details",
CAST("refund_id" AS VARCHAR) AS "refund_id",
CAST("transaction_id" AS VARCHAR) AS "transaction_id",
CAST("user_id" AS VARCHAR) AS "user_id"
FROM "shopify_transaction_data_projected_renamed_cleaned_null"
),
"shopify_transaction_data_projected_renamed_cleaned_null_casted_missing_handled" AS (
-- Handling missing values: There are 7 columns with unacceptable missing values
-- device_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- error_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- location_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- payment_gateway has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- source_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- transaction_message has 100.0 percent missing. Strategy: 🗑️ Drop Column
-- user_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
SELECT
"transaction_type",
"amount",
"avs_result_code",
"transaction_status",
"authorization_code",
"currency_code",
"is_test_transaction",
"created_at",
"credit_card_bin",
"credit_card_company",
"credit_card_number",
"cvv_result_code",
"exchange_adjustment",
"exchange_currency",
"exchange_final_amount",
"exchange_id",
"exchange_original_amount",
"order_id",
"parent_transaction_id",
"processed_at",
"receipt_details",
"refund_id",
"transaction_id"
FROM "shopify_transaction_data_projected_renamed_cleaned_null_casted"
)
-- COCOON BLOCK END
SELECT * FROM "shopify_transaction_data_projected_renamed_cleaned_null_casted_missing_handled"
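The cleaning and NULL-imputation steps above map a placeholder to an empty string and then map the empty string to NULL. As an illustration only, the Python sketch below uses the built-in sqlite3 module to check on toy data that this two-step CASE chain gives the same result as a nested NULLIF. The table `t` and the value `shopify_payments` are made up; the generated model itself is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("payment_gateway" TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("gateway_here",), ("",), ("shopify_payments",), (None,)],
)

# Two steps, as in the generated model: placeholder -> '' -> NULL.
two_step = conn.execute(
    """
    WITH cleaned AS (
        SELECT
            rowid AS rid,
            CASE
                WHEN "payment_gateway" = 'gateway_here' THEN ''
                ELSE "payment_gateway"
            END AS "payment_gateway"
        FROM t
    )
    SELECT
        CASE
            WHEN "payment_gateway" = '' THEN NULL
            ELSE "payment_gateway"
        END
    FROM cleaned
    ORDER BY rid
    """
).fetchall()

# One expression: NULLIF returns NULL when both arguments are equal.
one_step = conn.execute(
    """
    SELECT NULLIF(NULLIF("payment_gateway", 'gateway_here'), '')
    FROM t
    ORDER BY rowid
    """
).fetchall()

print(two_step == one_step)  # True
```

Keeping the two steps separate, as the generated SQL does, records which values were judged to be placeholders and which were treated as disguised missing values.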
stg_shopify_transaction_data.yml (Document the table)
version: 2
models:
- name: stg_shopify_transaction_data
description: The table is about financial transactions. It contains details like
transaction ID, order ID, amount, currency, transaction type, and status. Each
row represents a single transaction. The table includes information on payment
processing, currency exchange, and credit card details. It also has timestamps
for when transactions were created and processed.
columns:
- name: transaction_type
description: Type of transaction
tests:
- not_null
- accepted_values:
values:
- sale
- purchase
- refund
- exchange
- rental
- subscription
- deposit
- withdrawal
- transfer
- payment
- name: amount
description: Transaction amount
tests:
- not_null
- name: avs_result_code
description: Address Verification System result code
tests:
- accepted_values:
values:
- A
- B
- C
- D
- E
- F
- G
- I
- M
- N
- P
- R
- S
- U
- W
- X
- Y
- Z
cocoon_meta:
missing_acceptable: Not applicable for transactions without address verification.
- name: transaction_status
description: Status of the transaction
tests:
- not_null
- accepted_values:
values:
- success
- failure
- pending
- cancelled
- rejected
- refunded
- expired
- authorized
- captured
- settled
- name: authorization_code
description: Authorization code for the transaction
tests:
- not_null
- name: currency_code
description: Currency code of the transaction
tests:
- not_null
- name: is_test_transaction
description: Indicates if transaction is a test
tests:
- not_null
- name: created_at
description: Timestamp of transaction creation
tests:
- not_null
- name: credit_card_bin
description: Bank Identification Number of credit card
cocoon_meta:
missing_acceptable: Not applicable for non-credit card transactions.
- name: credit_card_company
description: Credit card company
cocoon_meta:
missing_acceptable: Not applicable for non-credit card transactions.
- name: credit_card_number
description: Masked credit card number
cocoon_meta:
missing_acceptable: Not applicable for non-credit card transactions.
- name: cvv_result_code
description: Card Verification Value result code
cocoon_meta:
missing_acceptable: Not applicable for transactions without CVV verification.
- name: exchange_adjustment
description: Adjustment for currency exchange
cocoon_meta:
missing_acceptable: Not applicable for transactions without currency exchange.
- name: exchange_currency
description: Currency used in exchange
cocoon_meta:
missing_acceptable: Not applicable for transactions without currency exchange.
- name: exchange_final_amount
description: Final amount after currency exchange
cocoon_meta:
missing_acceptable: Not applicable for transactions without currency exchange.
- name: exchange_id
description: Identifier for currency exchange
cocoon_meta:
missing_acceptable: Not applicable for transactions without currency exchange.
- name: exchange_original_amount
description: Original amount before currency exchange
cocoon_meta:
missing_acceptable: Not applicable for transactions without currency exchange.
- name: order_id
description: Identifier for the associated order
tests:
- not_null
- name: parent_transaction_id
description: Identifier of parent transaction if applicable
cocoon_meta:
missing_acceptable: Not applicable for transactions without a parent transaction.
- name: processed_at
description: Timestamp of transaction processing
tests:
- not_null
- name: receipt_details
description: Receipt details in JSON format
cocoon_meta:
missing_acceptable: Not applicable for transactions without detailed receipt
information.
- name: refund_id
description: Identifier for associated refund
cocoon_meta:
missing_acceptable: Not applicable for transactions that are not refunds.
- name: transaction_id
description: Unique identifier for the transaction
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique identifier for each transaction.
For this table, each row is a unique transaction. Transaction IDs are typically
designed to be unique across all transactions in a system.
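The receipt_details column documented above holds JSON, and the sample rows show an exchange rate nested under `charges.data[].balance_transaction.exchange_rate`. The following is a minimal Python sketch of reading that value. It assumes the nesting seen in the samples and reads only the first charge; the function name `first_exchange_rate` is hypothetical.

```python
import json
from decimal import Decimal
from typing import Optional


def first_exchange_rate(receipt_details: Optional[str]) -> Optional[Decimal]:
    """Return the exchange rate of the first charge, or None if absent."""
    if receipt_details is None:
        return None
    receipt = json.loads(receipt_details)
    charges = (receipt.get("charges") or {}).get("data") or []
    if not charges:
        return None
    balance = charges[0].get("balance_transaction") or {}
    rate = balance.get("exchange_rate")
    # Rates appear as strings in the samples; Decimal avoids float rounding.
    return Decimal(rate) if rate is not None else None


sample = '{ "charges": { "data": [ { "balance_transaction": { "exchange_rate": "0.523" } }] }}'
rate = first_exchange_rate(sample)
print(rate)  # 0.523
```

Rows whose receipt is missing, or whose exchange_rate is JSON null, yield `None` rather than raising an error.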
snapshot_shopify_location_data (first 100 rows)
| | is_deleted | location_name | is_active | province_state | is_legacy | local_province_name | country_name | province_state_code | primary_address | iso_country_code | location_id | local_country_name | country_code | creation_timestamp | postal_code | secondary_address |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | False | Plum | True | None | True | None | United States | None | None | US | 8777748 | United States | US | 2019-06-11 15:58:20 | None | None |
1 | False | Plum Express | True | NY | False | New York | United States | NY | 111 Tree Road | US | 7748 | United States | US | 2018-12-10 16:24:07 | 7394.0 | None |
snapshot_shopify_location_data.sql (clean the table)
-- Slowly Changing Dimension: Dimension keys are "location_id"
-- Effective date columns are "last_update_timestamp"
-- We will create Type 1 SCD (latest snapshot)
SELECT
"is_deleted",
"location_name",
"is_active",
"province_state",
"is_legacy",
"local_province_name",
"country_name",
"province_state_code",
"primary_address",
"iso_country_code",
"location_id",
"local_country_name",
"country_code",
"creation_timestamp",
"postal_code",
"secondary_address"
FROM (
SELECT
"is_deleted",
"location_name",
"is_active",
"province_state",
"is_legacy",
"local_province_name",
"country_name",
"province_state_code",
"primary_address",
"iso_country_code",
"location_id",
"local_country_name",
"country_code",
"creation_timestamp",
"postal_code",
"secondary_address",
ROW_NUMBER() OVER (
PARTITION BY "location_id"
ORDER BY "last_update_timestamp" DESC
) AS "cocoon_rn"
FROM "stg_shopify_location_data"
) ranked
WHERE "cocoon_rn" = 1
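The latest-snapshot query above can be tried on toy data. The following is a minimal sketch using Python's built-in sqlite3 module, assuming it is linked against SQLite 3.25 or later, which window functions require. Only three columns are kept, and the second `7748` row with its timestamp is made up for illustration.

```python
import sqlite3

# Hypothetical toy rows standing in for stg_shopify_location_data.
rows = [
    ("7748", "Plum Express", "2018-12-10 16:24:07"),
    ("7748", "Plum Express NY", "2019-01-05 09:00:00"),
    ("8777748", "Plum", "2019-06-11 15:58:20"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_shopify_location_data" '
    '("location_id" TEXT, "location_name" TEXT, "last_update_timestamp" TEXT)'
)
conn.executemany('INSERT INTO "stg_shopify_location_data" VALUES (?, ?, ?)', rows)

# Keep only the most recent row per location_id (Type 1 SCD).
latest = conn.execute(
    """
    SELECT "location_id", "location_name"
    FROM (
        SELECT
            "location_id",
            "location_name",
            ROW_NUMBER() OVER (
                PARTITION BY "location_id"
                ORDER BY "last_update_timestamp" DESC
            ) AS "cocoon_rn"
        FROM "stg_shopify_location_data"
    ) ranked
    WHERE "cocoon_rn" = 1
    ORDER BY "location_id"
    """
).fetchall()

print(latest)  # [('7748', 'Plum Express NY'), ('8777748', 'Plum')]
```

The older `Plum Express` row is discarded because a Type 1 dimension keeps no history. Rows that tie on the timestamp would be ranked arbitrarily unless a tie-breaker column is added to the ORDER BY.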
snapshot_shopify_location_data.yml (Document the table)
version: 2
models:
- name: snapshot_shopify_location_data
description: The table contains the latest information about Shopify store locations.
It tracks the most recent version of each unique location, identified by location_id.
The table includes details such as location name, address, country, province/state,
postal code, and status (active/inactive). It covers both physical and online
store locations, omitting historical versions and update timestamps.
columns:
- name: is_deleted
description: Indicates if the record is deleted
tests:
- not_null
- name: location_name
description: Name of the store location
tests:
- not_null
- name: is_active
description: Indicates if the location is currently active
tests:
- not_null
- name: province_state
description: Province or state of the location
tests:
- not_null
- accepted_values:
values:
- AL
- AK
- AZ
- AR
- CA
- CO
- CT
- DE
- FL
- GA
- HI
- ID
- IL
- IN
- IA
- KS
- KY
- LA
- ME
- MD
- MA
- MI
- MN
- MS
- MO
- MT
- NE
- NV
- NH
- NJ
- NM
- NY
- NC
- ND
- OH
- OK
- OR
- PA
- RI
- SC
- SD
- TN
- TX
- UT
- VT
- VA
- WA
- WV
- WI
- WY
- name: is_legacy
description: Indicates if the location is a legacy entry
tests:
- not_null
- name: local_province_name
description: Province name in local language
tests:
- not_null
- name: country_name
description: Full name of the country
tests:
- not_null
- name: province_state_code
description: Code for the province or state
tests:
- not_null
- name: primary_address
description: Primary address line of the location
tests:
- not_null
- name: iso_country_code
description: ISO country code of the location
tests:
- not_null
- name: location_id
description: Unique identifier for the location
tests:
- not_null
- unique
cocoon_meta:
uniqueness: Unique dimension key, derived from the slowly changing dimension
- name: local_country_name
description: Country name in local language
tests:
- not_null
- name: country_code
description: Country code where the location is situated
tests:
- not_null
- name: creation_timestamp
description: Timestamp when the location was created
tests:
- not_null
- name: postal_code
description: Postal or ZIP code of the location
tests:
- not_null
- name: secondary_address
description: Secondary address line of the location
cocoon_meta:
missing_acceptable: Not all locations have or need a secondary address.
cocoon_meta:
scd_base_table: stg_shopify_location_data
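Several columns above carry `not_null` tests even though the first sample row shows `None` for some of them (for example province_state and primary_address). The sketch below is a hypothetical Python check, not the dbt test itself, that counts nulls per tested column on a subset of the two sample rows.

```python
from typing import Any, Dict, List, Sequence


def not_null_failures(rows: List[Dict[str, Any]], columns: Sequence[str]) -> Dict[str, int]:
    """Count None values per column, reporting only columns that have any."""
    counts = {col: sum(row.get(col) is None for row in rows) for col in columns}
    return {col: n for col, n in counts.items() if n > 0}


# Subset of the two sample rows shown in the table preview.
rows = [
    {"location_id": "8777748", "location_name": "Plum",
     "province_state": None, "primary_address": None},
    {"location_id": "7748", "location_name": "Plum Express",
     "province_state": "NY", "primary_address": "111 Tree Road"},
]

failures = not_null_failures(
    rows, ["location_id", "location_name", "province_state", "primary_address"]
)
print(failures)  # {'province_state': 1, 'primary_address': 1}
```

If the real data matches the preview, the `not_null` tests on those columns would report failures, so they may need a `missing_acceptable` note instead.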
snapshot_shopify_product_data (first 100 rows)
| | product_title | product_handle | product_type | vendor_id | visibility_scope | is_deleted | created_at | product_id | published_at |
---|---|---|---|---|---|---|---|---|---|
0 | 1fccbdc6ac5f6edabf76e56eb0460019 | f4b6d0e4413a19b2e7a291f0ef4dc98f | fdb42fcb90ecd31c015932ffcd313014 | 13aea892c8de2d62f2608c6191cfab1f | web | False | 2020-02-14 19:18:05 | 4506451050593 | 2020-02-14 19:02:02 |
1 | c6c6fea8419b94103b0b05d64a5bab10 | f0a656254aca08bf40181226ac13418c | fdb42fcb90ecd31c015932ffcd313014 | 57403999f78b01b3fd325ba256eafe94 | global | False | 2020-02-14 02:09:59 | 4505775439969 | 2020-02-14 02:09:59 |
2 | 327ea22d0f91783418e519cb45a4a3e9 | 129181bbc087330e216a6a4d7939f00b | ec3bb3dd6e9d1f348a040ee7b45f1a72 | 13aea892c8de2d62f2608c6191cfab1f | web | False | 2020-03-04 05:04:32 | 4526236893281 | 2020-03-04 05:04:32 |
snapshot_shopify_product_data.sql (clean the table)
-- Slowly Changing Dimension: the dimension key is "product_id"
-- The effective date column is "updated_at"
-- We create a Type 1 SCD (latest snapshot): keep only the most recent row per key
SELECT
"product_title",
"product_handle",
"product_type",
"vendor_id",
"visibility_scope",
"is_deleted",
"created_at",
"product_id",
"published_at"
FROM (
SELECT
"product_title",
"product_handle",
"product_type",
"vendor_id",
"visibility_scope",
"is_deleted",
"created_at",
"product_id",
"published_at",
ROW_NUMBER() OVER (
PARTITION BY "product_id"
ORDER BY "updated_at" DESC
) AS "cocoon_rn"
FROM "stg_shopify_product_data"
) ranked
WHERE "cocoon_rn" = 1
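The Type 1 logic above can be tried on a handful of rows. The sketch below is a minimal illustration and is not part of the generated project: it uses Python's built-in sqlite3 module, a hypothetical three-row stand-in for stg_shopify_product_data, and only product_id, product_title, and updated_at.

```python
import sqlite3

# Hypothetical toy rows standing in for stg_shopify_product_data.
# Product 1 has two versions; product 2 has one.
rows = [
    (1, "old title", "2020-01-01 00:00:00"),
    (1, "new title", "2020-02-01 00:00:00"),
    (2, "only version", "2020-01-15 00:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_shopify_product_data" '
    '("product_id" INTEGER, "product_title" TEXT, "updated_at" TEXT)'
)
conn.executemany('INSERT INTO "stg_shopify_product_data" VALUES (?, ?, ?)', rows)

# Same pattern as the model: rank the versions of each key, newest first,
# then keep only the row ranked 1.
latest = conn.execute(
    """
    SELECT "product_id", "product_title"
    FROM (
        SELECT
            "product_id",
            "product_title",
            ROW_NUMBER() OVER (
                PARTITION BY "product_id"
                ORDER BY "updated_at" DESC
            ) AS "cocoon_rn"
        FROM "stg_shopify_product_data"
    ) ranked
    WHERE "cocoon_rn" = 1
    ORDER BY "product_id"
    """
).fetchall()

print(latest)
```

If two versions of the same key share an identical updated_at value, ROW_NUMBER picks one of them arbitrarily; adding a secondary sort column would make the choice deterministic.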
snapshot_shopify_product_data.yml (Document the table)
version: 2
models:
- name: snapshot_shopify_product_data
description: The table contains the latest Shopify product data. It includes current
product details such as ID, title, handle, type, vendor, visibility, and deletion
status. The table tracks the most recent version of each unique product on the
Shopify platform. It provides a snapshot of up-to-date product information without
historical versions or update timestamps.
columns:
- name: product_title
description: Name or title of the product
tests:
- not_null
- name: product_handle
description: Unique URL-friendly string for the product
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column represents a unique URL-friendly string for the product.
For this table, each row is for a unique product. The product handle is typically
generated to be unique for each product in Shopify, making it a good candidate
for a key.
- name: product_type
description: Category or type of the product
tests:
- not_null
- name: vendor_id
description: Identifier for the product's vendor
tests:
- not_null
- name: visibility_scope
description: Visibility scope of the product (web/global)
tests:
- not_null
- accepted_values:
values:
- web
- global
- name: is_deleted
description: Indicates if the product has been deleted
tests:
- not_null
- name: created_at
description: Timestamp when the product was created
tests:
- not_null
- name: product_id
description: Unique identifier for the product
tests:
- not_null
- unique
cocoon_meta:
uniqueness: Unique dimension key, derived from the slowly changing dimension
- name: published_at
description: Timestamp when the product was published
tests:
- not_null
cocoon_meta:
scd_base_table: stg_shopify_product_data
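The not_null and unique tests declared above boil down to two row checks. The following sketch approximates them with Python's sqlite3 on a hypothetical products table; the helper names null_count and duplicated_values are illustrative only and are not dbt functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE products ("product_id" INTEGER, "product_handle" TEXT)')
# Hypothetical rows: one NULL key and one duplicated handle.
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, "tee"), (2, "mug"), (3, "mug"), (None, "cap")],
)


def null_count(conn, table, column):
    """Number of rows a not_null check would flag."""
    sql = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
    return conn.execute(sql).fetchone()[0]


def duplicated_values(conn, table, column):
    """Values a unique check would flag; NULLs are left out of the comparison."""
    sql = (
        f'SELECT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1 '
        f'ORDER BY "{column}"'
    )
    return [row[0] for row in conn.execute(sql)]


print(null_count(conn, "products", "product_id"))
print(duplicated_values(conn, "products", "product_handle"))
```

A column passes when the first helper returns 0 and the second returns an empty list, which is the condition product_id and product_handle are expected to meet in the snapshot.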
snapshot_shopify_price_rule_data (first 100 rows)
| | price_rule_id | allocation_method | customer_eligibility | one_time_use | subtotal_prerequisite | discount_target | target_type | price_rule_name | discount_value | discount_type | allocation_limit | creation_date | expiration_date | start_date | usage_limit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 564075 | across | all | False | NaN | entitled | line_item | THANKS | 0.0 | percentage | None | 2021-11-10 22:26:31 | 2021-11-30 14:00:59 | 2021-11-10 22:25:32 | None |
| 1 | 9339 | across | all | False | NaN | all | line_item | THANKS | 0.0 | percentage | None | 2021-11-11 22:38:18 | 2021-12-02 19:00:59 | 2021-11-23 21:30:38 | None |
| 2 | 11443 | across | all | False | 500.0 | all | line_item | GIFTCARD | 0.0 | percentage | None | 2021-03-09 18:57:54 | 2021-03-22 07:00:59 | 2021-03-17 04:00:57 | None |
snapshot_shopify_price_rule_data.sql (clean the table)
-- Slowly Changing Dimension: the dimension key is "price_rule_id"
-- The effective date column is "last_updated"
-- We create a Type 1 SCD (latest snapshot): keep only the most recent row per key
SELECT
"price_rule_id",
"allocation_method",
"customer_eligibility",
"one_time_use",
"subtotal_prerequisite",
"discount_target",
"target_type",
"price_rule_name",
"discount_value",
"discount_type",
"allocation_limit",
"creation_date",
"expiration_date",
"start_date",
"usage_limit"
FROM (
SELECT
"price_rule_id",
"allocation_method",
"customer_eligibility",
"one_time_use",
"subtotal_prerequisite",
"discount_target",
"target_type",
"price_rule_name",
"discount_value",
"discount_type",
"allocation_limit",
"creation_date",
"expiration_date",
"start_date",
"usage_limit",
ROW_NUMBER() OVER (
PARTITION BY "price_rule_id"
ORDER BY "last_updated" DESC
) AS "cocoon_rn"
FROM "stg_shopify_price_rule_data"
) ranked
WHERE "cocoon_rn" = 1
snapshot_shopify_price_rule_data.yml (Document the table)
version: 2
models:
- name: snapshot_shopify_price_rule_data
description: The table tracks the most recent version of each Shopify price rule. It
contains the current discount configuration, including rule IDs, customer
eligibility, and discount types. Each rule specifies its discount target, discount
value, and any prerequisites. The table omits historical versions and update
timestamps. It provides a snapshot of the latest state of every price rule, whether
or not the rule is currently active, for managing discounts in the Shopify platform.
columns:
- name: price_rule_id
description: Unique identifier for the price rule
tests:
- not_null
- unique
cocoon_meta:
uniqueness: Unique dimension key, derived from the slowly changing dimension
- name: allocation_method
description: Method for allocating discount across products
tests:
- not_null
- accepted_values:
values:
- proportional
- equal
- first item
- last item
- highest priced item
- lowest priced item
- random
- across
- name: customer_eligibility
description: Specifies which customers are eligible
tests:
- not_null
- accepted_values:
values:
- all
- new
- existing
- premium
- standard
- vip
- loyalty_program
- first_time
- returning
- age_18_plus
- age_21_plus
- students
- seniors
- military
- corporate
- name: one_time_use
description: Indicates if discount is one-time use
tests:
- not_null
- name: subtotal_prerequisite
description: Minimum order subtotal required for discount eligibility
cocoon_meta:
missing_acceptable: Not applicable when no minimum purchase is required.
- name: discount_target
description: Specifies which items the discount applies to
tests:
- not_null
- accepted_values:
values:
- all
- entitled
- specific
- name: target_type
description: Type of target for the discount
tests:
- not_null
- accepted_values:
values:
- line_item
- order
- shipping
- product
- category
- customer
- customer_group
- name: price_rule_name
description: Name or title of the price rule
tests:
- not_null
- name: discount_value
description: Numerical value of the discount
tests:
- not_null
- name: discount_type
description: Type of value (percentage or fixed amount)
tests:
- not_null
- accepted_values:
values:
- percentage
- fixed amount
- name: allocation_limit
description: Limits how discount is allocated
cocoon_meta:
missing_acceptable: Not applicable when allocation method is 'across'.
- name: creation_date
description: Timestamp when the price rule was created
tests:
- not_null
- name: expiration_date
description: Timestamp when the price rule expires
tests:
- not_null
- name: start_date
description: Timestamp when the price rule becomes active
tests:
- not_null
- name: usage_limit
description: Maximum number of times rule can be used
cocoon_meta:
missing_acceptable: Not applicable when there's no limit on usage.
cocoon_meta:
scd_base_table: stg_shopify_price_rule_data
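The accepted_values tests above check that a column only holds values from a fixed list. The sketch below approximates that check with Python's sqlite3; the table contents and the helper name unexpected_values are hypothetical, and the accepted list is the one declared for discount_target.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE price_rules ("price_rule_id" INTEGER, "discount_target" TEXT)')
# Hypothetical rows: one value outside the accepted list and one NULL.
conn.executemany(
    "INSERT INTO price_rules VALUES (?, ?)",
    [(1, "all"), (2, "entitled"), (3, "everything"), (4, None)],
)


def unexpected_values(conn, table, column, accepted):
    """Return (value, row_count) pairs for values outside the accepted list."""
    placeholders = ", ".join("?" for _ in accepted)
    sql = (
        f'SELECT "{column}", COUNT(*) FROM "{table}" '
        f'WHERE "{column}" NOT IN ({placeholders}) '
        f'GROUP BY "{column}" ORDER BY "{column}"'
    )
    return conn.execute(sql, list(accepted)).fetchall()


bad = unexpected_values(
    conn, "price_rules", "discount_target", ["all", "entitled", "specific"]
)
print(bad)
```

Because NULL NOT IN (...) evaluates to NULL in SQL, the NULL row is not reported here. Missing values are caught only by a separate not_null test, which is why the two tests appear side by side on the columns above.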
snapshot_shopify_product_variant_data (first 100 rows)
| | title | display_position | inventory_policy | fulfillment_service | inventory_management | is_taxable | weight_grams | stock_quantity | weight_unit | previous_stock_quantity | requires_shipping | tax_code | option_1 | created_at | image_id | inventory_item_id | price | product_id | variant_id | weight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | my title here | 1 | deny | manual | None | False | 0 | 0 | lb | 0 | False | None | my title here | 2021-03-17 16:39:45 | None | 41367035936839 | 222 | 6544066379847 | 39273118957639 | 0.0 |
| 1 | my other title | 1 | deny | manual | inventory manager | True | 222 | 0 | lb | 0 | True | TR9999 | my other title | 2019-06-25 18:32:03 | None | 30309980143686 | 444 | 3879735590982 | 29217058947142 | 1.0 |
| 2 | my title here | 1 | deny | manual | None | False | 0 | -5 | lb | -5 | False | None | my title here | 2021-03-08 16:31:31 | None | 41356022644807 | 333 | 6540109250631 | 39262115397703 | 0.0 |
| 3 | my title here | 1 | deny | manual | inventory manager | True | 0 | 0 | lb | 0 | True | None | my title here | 2021-03-30 19:48:15 | None | 41384094924871 | 5 | 6548438188103 | 39290169262151 | 0.0 |
| 4 | my title here | 1 | deny | manual | None | False | 0 | 0 | lb | 0 | False | None | my title here | 2021-03-08 16:30:15 | None | 41356021661767 | 111 | 6540108431431 | 39262114414663 | 0.0 |
snapshot_shopify_product_variant_data.sql (clean the table)
-- Slowly Changing Dimension: the dimension key is "variant_id"
-- The effective date column is "updated_at"
-- We create a Type 1 SCD (latest snapshot): keep only the most recent row per key
SELECT
"title",
"display_position",
"inventory_policy",
"fulfillment_service",
"inventory_management",
"is_taxable",
"weight_grams",
"stock_quantity",
"weight_unit",
"previous_stock_quantity",
"requires_shipping",
"tax_code",
"option_1",
"created_at",
"image_id",
"inventory_item_id",
"price",
"product_id",
"variant_id",
"weight"
FROM (
SELECT
"title",
"display_position",
"inventory_policy",
"fulfillment_service",
"inventory_management",
"is_taxable",
"weight_grams",
"stock_quantity",
"weight_unit",
"previous_stock_quantity",
"requires_shipping",
"tax_code",
"option_1",
"created_at",
"image_id",
"inventory_item_id",
"price",
"product_id",
"variant_id",
"weight",
ROW_NUMBER() OVER (
PARTITION BY "variant_id"
ORDER BY "updated_at" DESC
) AS "cocoon_rn"
FROM "stg_shopify_product_variant_data"
) ranked
WHERE "cocoon_rn" = 1
snapshot_shopify_product_variant_data.yml (Document the table)
version: 2
models:
- name: snapshot_shopify_product_variant_data
description: The table contains current Shopify product variants. It tracks the
most recent version of each variant, including its title, price, inventory status,
and shipping details. Each row represents a unique product variant identified
by its variant ID. The table excludes historical versions and update timestamps,
focusing on the latest information for each variant in the Shopify e-commerce
system.
columns:
- name: title
description: Title or name of the variant
tests:
- not_null
- name: display_position
description: Position of the variant in listings
tests:
- not_null
- name: inventory_policy
description: Policy for handling out-of-stock items
tests:
- not_null
- accepted_values:
values:
- deny
- backorder
- substitute
- notify
- waitlist
- name: fulfillment_service
description: Service used for order fulfillment
tests:
- not_null
- accepted_values:
values:
- manual
- amazon
- shipwire
- webgistix
- shipstation
- shopify_fulfillment
- third_party
- self_fulfilled
- drop_ship
- fba (Fulfillment by Amazon)
- external
- name: inventory_management
description: Method used for inventory management
tests:
- accepted_values:
values:
- inventory manager
- just-in-time (JIT)
- economic order quantity (EOQ)
- abc analysis
- first-in, first-out (FIFO)
- last-in, first-out (LIFO)
- safety stock
- vendor-managed inventory (VMI)
- consignment inventory
- dropshipping
- perpetual inventory system
- periodic inventory system
- barcode system
- radio-frequency identification (RFID)
- cycle counting
- min-max inventory method
- reorder point planning
- materials requirement planning (MRP)
- batch tracking
- demand forecasting
cocoon_meta:
missing_acceptable: Not applicable when inventory is not tracked for the variant.
- name: is_taxable
description: Indicates if the variant is taxable
tests:
- not_null
- name: weight_grams
description: Weight of the variant in grams
tests:
- not_null
- name: stock_quantity
description: Current quantity in stock
tests:
- not_null
- name: weight_unit
description: Unit of measurement for weight
tests:
- not_null
- accepted_values:
values:
- lb
- kg
- g
- oz
- stone
- ton
- metric ton
- mg
- name: previous_stock_quantity
description: Previous quantity in stock
tests:
- not_null
- name: requires_shipping
description: Indicates if shipping is required
tests:
- not_null
- name: tax_code
description: Tax code for the variant
cocoon_meta:
missing_acceptable: Not all variants have a tax code assigned.
- name: option_1
description: Primary product option
tests:
- not_null
- name: created_at
description: Timestamp when the variant was created
tests:
- not_null
- name: image_id
description: Identifier for the variant's image
cocoon_meta:
missing_acceptable: Not all products require an image.
- name: inventory_item_id
description: Identifier for inventory tracking
tests:
- not_null
- unique
cocoon_meta:
uniqueness: This column is an identifier for inventory tracking. For this table,
each row is for a specific product variant. As it's an identifier specifically
for inventory items, it's likely to be unique for each variant.
- name: price
description: Current price of the variant
tests:
- not_null
- name: product_id
description: Identifier of the parent product
tests:
- not_null
- name: variant_id
description: Unique identifier for the variant
tests:
- not_null
- unique
cocoon_meta:
uniqueness: Unique dimension key, derived from the slowly changing dimension
- name: weight
description: Weight of the variant, expressed in the unit given by weight_unit
tests:
- not_null
cocoon_meta:
scd_base_table: stg_shopify_product_variant_data
Join Graph (FK to PK)
cocoon_join.yml (Document the joins)
join_graph:
- table_name: stg_shopify_abandoned_checkout_data
primary_key: checkout_id
foreign_keys:
- column: customer_id
reference:
table_name: stg_shopify_customer_data
column: customer_id
- table_name: stg_shopify_abandoned_checkout_discount_code_data
foreign_keys:
- column: checkout_id
reference:
table_name: stg_shopify_abandoned_checkout_data
column: checkout_id
- table_name: stg_shopify_abandoned_checkout_shipping_line_data
foreign_keys:
- column: checkout_id
reference:
table_name: stg_shopify_abandoned_checkout_data
column: checkout_id
- table_name: stg_shopify_collection_data
primary_key: collection_id
foreign_keys: []
- table_name: stg_shopify_collection_product_data
foreign_keys:
- column: collection_id
reference:
table_name: stg_shopify_collection_data
column: collection_id
- column: product_id
reference:
table_name: snapshot_shopify_product_data
column: product_id
- table_name: stg_shopify_customer_data
primary_key: customer_id
foreign_keys: []
- table_name: stg_shopify_customer_tag_data
foreign_keys:
- column: customer_id
reference:
table_name: stg_shopify_customer_data
column: customer_id
- table_name: stg_shopify_order_data
foreign_keys:
- column: customer_id
reference:
table_name: stg_shopify_customer_data
column: customer_id
primary_key: order_id
- table_name: stg_shopify_refund_data
foreign_keys:
- column: customer_id
reference:
table_name: stg_shopify_customer_data
column: customer_id
primary_key: refund_id
- table_name: stg_shopify_fulfillment_data
primary_key: fulfillment_id
foreign_keys:
- column: location_id
reference:
table_name: snapshot_shopify_location_data
column: location_id
- table_name: stg_shopify_fulfillment_event_data
foreign_keys:
- column: fulfillment_id
reference:
table_name: stg_shopify_fulfillment_data
column: fulfillment_id
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- column: shop_id
reference:
table_name: stg_shopify_shop_data
column: shop_id
- table_name: stg_shopify_inventory_item_data
primary_key: item_id
foreign_keys: []
- table_name: stg_shopify_inventory_level_data
foreign_keys:
- column: inventory_item_id
reference:
table_name: stg_shopify_inventory_item_data
column: item_id
- column: location_id
reference:
table_name: snapshot_shopify_location_data
column: location_id
- table_name: snapshot_shopify_product_variant_data
foreign_keys:
- column: inventory_item_id
reference:
table_name: stg_shopify_inventory_item_data
column: item_id
- column: image_id
reference:
table_name: stg_shopify_product_image_data
column: image_id
- column: product_id
reference:
table_name: snapshot_shopify_product_data
column: product_id
primary_key: variant_id
- table_name: stg_shopify_order_adjustment_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- column: refund_id
reference:
table_name: stg_shopify_refund_data
column: refund_id
- table_name: stg_shopify_order_discount_code_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- table_name: stg_shopify_order_line_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- column: product_id
reference:
table_name: snapshot_shopify_product_data
column: product_id
- column: variant_id
reference:
table_name: snapshot_shopify_product_variant_data
column: variant_id
primary_key: line_item_id
- table_name: stg_shopify_order_note_attribute_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- table_name: stg_shopify_order_shipping_line_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
primary_key: shipping_line_id
- table_name: stg_shopify_order_tag_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- table_name: stg_shopify_order_url_tag_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- table_name: stg_shopify_metafield_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- table_name: stg_shopify_transaction_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
primary_key: transaction_id
- table_name: stg_shopify_tender_transaction_data
foreign_keys:
- column: order_id
reference:
table_name: stg_shopify_order_data
column: order_id
- column: transaction_id
reference:
table_name: stg_shopify_transaction_data
column: transaction_id
- table_name: stg_shopify_order_line_refund_data
foreign_keys:
- column: original_order_line_id
reference:
table_name: stg_shopify_order_line_data
column: line_item_id
- column: refund_id
reference:
table_name: stg_shopify_refund_data
column: refund_id
- table_name: stg_shopify_order_shipping_tax_line_data
foreign_keys:
- column: order_shipping_line_id
reference:
table_name: stg_shopify_order_shipping_line_data
column: shipping_line_id
- table_name: stg_shopify_product_image_data
primary_key: image_id
foreign_keys:
- column: product_id
reference:
table_name: snapshot_shopify_product_data
column: product_id
- table_name: stg_shopify_shop_data
primary_key: shop_id
foreign_keys:
- column: primary_location_id
reference:
table_name: snapshot_shopify_location_data
column: location_id
- table_name: snapshot_shopify_location_data
primary_key: location_id
foreign_keys: []
- table_name: snapshot_shopify_price_rule_data
primary_key: price_rule_id
foreign_keys: []
- table_name: stg_shopify_discount_code_data
foreign_keys:
- column: price_rule_id
reference:
table_name: snapshot_shopify_price_rule_data
column: price_rule_id
- table_name: snapshot_shopify_product_data
primary_key: product_id
foreign_keys: []
- table_name: stg_shopify_product_tag_data
foreign_keys:
- column: product_id
reference:
table_name: snapshot_shopify_product_data
column: product_id
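Each foreign_keys entry in the join graph above maps directly onto one join condition. As a rough sketch, the snippet below turns the stg_shopify_order_line_data entry into a SELECT with one LEFT JOIN per foreign key. The entry is retyped as a Python literal to avoid a YAML dependency, and build_join_sql is a hypothetical helper, not something Cocoon or dbt provides.

```python
# Excerpt of the join graph, copied from the stg_shopify_order_line_data entry.
order_line = {
    "table_name": "stg_shopify_order_line_data",
    "primary_key": "line_item_id",
    "foreign_keys": [
        {
            "column": "order_id",
            "reference": {"table_name": "stg_shopify_order_data", "column": "order_id"},
        },
        {
            "column": "product_id",
            "reference": {"table_name": "snapshot_shopify_product_data", "column": "product_id"},
        },
        {
            "column": "variant_id",
            "reference": {
                "table_name": "snapshot_shopify_product_variant_data",
                "column": "variant_id",
            },
        },
    ],
}


def build_join_sql(entry):
    """Build a SELECT that LEFT JOINs every referenced table onto the FK table."""
    base = entry["table_name"]
    lines = [f'SELECT "{base}".*', f'FROM "{base}"']
    for fk in entry["foreign_keys"]:
        ref = fk["reference"]
        lines.append(
            f'LEFT JOIN "{ref["table_name"]}" '
            f'ON "{base}"."{fk["column"]}" = "{ref["table_name"]}"."{ref["column"]}"'
        )
    return "\n".join(lines)


sql = build_join_sql(order_line)
print(sql)
```

A LEFT JOIN keeps order lines whose product or variant no longer appears in the snapshot. The helper assumes each referenced table appears only once per entry; a table referenced twice would need aliases.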
cocoon_er.yml (Document the ER model)
entities:
- entity_name: Abandoned Checkouts
entity_description: Represents incomplete orders or abandoned carts in a Shopify
store
table_name: stg_shopify_abandoned_checkout_data
primary_key: checkout_id
- entity_name: Collections
entity_description: Represents groups of products in a Shopify store
table_name: stg_shopify_collection_data
primary_key: collection_id
- entity_name: Customers
entity_description: Represents individual customers who have interacted with the
Shopify store
table_name: stg_shopify_customer_data
primary_key: customer_id
- entity_name: Fulfillments
entity_description: Represents the process of preparing and shipping orders to customers
table_name: stg_shopify_fulfillment_data
primary_key: fulfillment_id
- entity_name: Inventory Items
entity_description: Represents individual items in the store's inventory
table_name: stg_shopify_inventory_item_data
primary_key: item_id
- entity_name: Orders
entity_description: Represents customer purchases or transactions in the Shopify
store
table_name: stg_shopify_order_data
primary_key: order_id
- entity_name: Order Line Items
entity_description: Represents individual products within an order
table_name: stg_shopify_order_line_data
primary_key: line_item_id
- entity_name: Order Shipping Lines
entity_description: Represents shipping information for specific orders
table_name: stg_shopify_order_shipping_line_data
primary_key: shipping_line_id
- entity_name: Product Images
entity_description: Represents visual representations of products in the Shopify
store
table_name: stg_shopify_product_image_data
primary_key: image_id
- entity_name: Refunds
entity_description: Represents transactions where money is returned to customers
table_name: stg_shopify_refund_data
primary_key: refund_id
- entity_name: Shops
entity_description: Represents individual Shopify stores with their configurations
and details
table_name: stg_shopify_shop_data
primary_key: shop_id
- entity_name: Transactions
entity_description: Represents financial transactions associated with orders
table_name: stg_shopify_transaction_data
primary_key: transaction_id
- entity_name: Locations
entity_description: Represents physical or online locations associated with the
Shopify store
table_name: snapshot_shopify_location_data
primary_key: location_id
- entity_name: Price Rules
entity_description: Represents discount configurations and pricing rules in the
Shopify store
table_name: snapshot_shopify_price_rule_data
primary_key: price_rule_id
- entity_name: Products
entity_description: Represents items for sale in the Shopify store
table_name: snapshot_shopify_product_data
primary_key: product_id
- entity_name: Product Variants
entity_description: Represents specific versions or variations of products in the
Shopify store
table_name: snapshot_shopify_product_variant_data
primary_key: variant_id
relations:
- relation_name: CustomerAbandonedCheckouts
relation_description: This tracks Abandoned Checkouts initiated by Customers who
started but didn't complete the purchase process on the Shopify store.
table_name: stg_shopify_abandoned_checkout_data
entities:
- Abandoned Checkouts
- Customers
- relation_name: FulfillmentLocationAssociation
relation_description: Fulfillments are processed at specific Locations within Shopify's
platform for order shipping and delivery tracking.
table_name: stg_shopify_fulfillment_data
entities:
- Fulfillments
- Locations
- relation_name: CustomerOrders
relation_description: This stores the Orders placed by Customers, including details
of the purchase, shipping, and billing information.
table_name: stg_shopify_order_data
entities:
- Orders
- Customers
- relation_name: OrderLineItemDetails
relation_description: Order Line Items detail specific Products and their Variants
within Orders, connecting individual purchases to the broader Product catalog.
table_name: stg_shopify_order_line_data
entities:
- Order Line Items
- Orders
- Products
- Product Variants
- relation_name: OrderShippingDetails
relation_description: Order Shipping Lines provide detailed shipping information
for individual Orders, including pricing and carrier details.
table_name: stg_shopify_order_shipping_line_data
entities:
- Order Shipping Lines
- Orders
- relation_name: ProductImageAssociation
relation_description: Product Images are visual representations of Products, with
each Product potentially having multiple associated images.
table_name: stg_shopify_product_image_data
entities:
- Product Images
- Products
- relation_name: CustomerRefundDetails
relation_description: This tracks Refunds issued to Customers, detailing the reimbursement
process for specific orders in the Shopify system.
table_name: stg_shopify_refund_data
entities:
- Refunds
- Customers
- relation_name: ShopOperatesInLocations
relation_description: Shops operate in one or more Locations, with each shop having
a primary location and potentially multiple additional locations.
table_name: stg_shopify_shop_data
entities:
- Shops
- Locations
- relation_name: OrderTransactions
relation_description: Transactions record financial details of Orders, including
payment processing and currency information for each order.
table_name: stg_shopify_transaction_data
entities:
- Transactions
- Orders
- relation_name: ShopifyProductVariantDetails
relation_description: Product Variants are specific versions of Products, associated
with Inventory Items for stock management and optionally linked to Product Images
for visual representation.
table_name: snapshot_shopify_product_variant_data
entities:
- Product Variants
- Inventory Items
- Product Images
- Products
- relation_description: This table tracks discount codes applied to abandoned checkouts
in Shopify, providing details about each abandoned cart's associated discount.
table_name: stg_shopify_abandoned_checkout_discount_code_data
entities:
- Abandoned Checkouts
- relation_description: This table captures shipping line details for abandoned checkouts
in Shopify, representing unfulfilled purchase attempts.
table_name: stg_shopify_abandoned_checkout_shipping_line_data
entities:
- Abandoned Checkouts
- relation_name: CollectionProductAssociation
relation_description: This associates Collections with Products, indicating which
products are included in each collection and which collections contain each product.
table_name: stg_shopify_collection_product_data
entities:
- Collections
- Products
- relation_description: This stores the tags associated with Customers in a Shopify
system, representing customer attributes or classifications.
table_name: stg_shopify_customer_tag_data
entities:
- Customers
- relation_name: ShopOrderFulfillmentEvents
relation_description: Shops process Orders, which are then fulfilled through Fulfillments,
tracking the shipping and delivery status of each order.
table_name: stg_shopify_fulfillment_event_data
entities:
- Fulfillments
- Orders
- Shops
- relation_name: InventoryItemLocationQuantity
relation_description: This tracks the quantity of Inventory Items available at specific
Locations within a Shopify store.
table_name: stg_shopify_inventory_level_data
entities:
- Inventory Items
- Locations
- relation_name: OrderRefundAdjustments
relation_description: Orders can receive Refunds, which may include adjustments
for shipping or discrepancies, affecting the final order amount.
table_name: stg_shopify_order_adjustment_data
entities:
- Orders
- Refunds
- relation_description: This table stores discount information applied to Orders in
a Shopify store, including multiple discounts per order.
table_name: stg_shopify_order_discount_code_data
entities:
- Orders
- relation_description: This stores various attributes and details associated with
individual Shopify orders, including customer information and order-specific data.
table_name: stg_shopify_order_note_attribute_data
entities:
- Orders
- relation_description: This stores color tag metadata associated with Shopify orders,
allowing for additional categorization or visual identification of orders.
table_name: stg_shopify_order_tag_data
entities:
- Orders
- relation_description: This table stores metadata associated with individual Shopify
orders, including key-value pairs for various attributes like image, utm_medium,
and prop_channel.
table_name: stg_shopify_order_url_tag_data
entities:
- Orders
- relation_description: This table contains detailed metadata about return authorizations
for orders, including return reasons, quantities, and values.
table_name: stg_shopify_metafield_data
entities:
- Orders
- relation_description: This captures the Tender Transactions (direct money passing)
associated with Orders in a Shopify store, including sales and refunds.
table_name: stg_shopify_tender_transaction_data
entities:
- Orders
- Transactions
- relation_name: OrderLineItemRefunds
relation_description: This tracks Refunds applied to specific Order Line Items,
detailing the refund amount, quantity, and restock information.
table_name: stg_shopify_order_line_refund_data
entities:
- Order Line Items
- Refunds
- relation_description: This table represents the tax details associated with shipping
lines for individual Shopify orders.
table_name: stg_shopify_order_shipping_tax_line_data
entities:
- Order Shipping Lines
- relation_description: This table stores discount codes associated with specific
price rules, tracking their usage and creation details.
table_name: stg_shopify_discount_code_data
entities:
- Price Rules
- relation_description: This stores the tags associated with Products, allowing for
flexible categorization and labeling of individual products in a Shopify system.
table_name: stg_shopify_product_tag_data
entities:
- Products
story:
- relation_name: ShopOperatesInLocations
story_line: Shops establish primary and additional operating locations.
- relation_name: ProductImageAssociation
story_line: Shops upload multiple images for each product.
- relation_name: ShopifyProductVariantDetails
story_line: Shops create product variants and link to inventory.
- relation_name: InventoryItemLocationQuantity
story_line: Shops update inventory quantities across different locations.
- relation_name: CollectionProductAssociation
story_line: Shops organize products into themed collections.
- relation_name: CustomerAbandonedCheckouts
story_line: Customers add items but leave without completing purchase.
- relation_name: CustomerOrders
story_line: Customers place orders for desired products.
- relation_name: OrderLineItemDetails
story_line: Orders list specific products and variants purchased.
- relation_name: OrderShippingDetails
story_line: Orders include shipping information and carrier details.
- relation_name: OrderTransactions
story_line: Orders process payments and record financial details.
- relation_name: FulfillmentLocationAssociation
story_line: Shops assign orders to specific fulfillment locations.
- relation_name: ShopOrderFulfillmentEvents
story_line: Shops process and track order shipping status.
- relation_name: CustomerRefundDetails
story_line: Shops issue refunds to customers for returns.
- relation_name: OrderRefundAdjustments
story_line: Refunds adjust for shipping or pricing discrepancies.
- relation_name: OrderLineItemRefunds
story_line: Refunds detail specific items returned and restocked.