Fivetran SAP

The source tables from the data warehouse that are to be modeled are displayed below.

sap_bkpf_data (first 100 rows)

belnr bukrs gjahr mandt blart bldat budat monat cpudt cputm aedat upddt wwert usnam tcode bvorg xblnr dbblg stblg stjah bktxt waers kursf kzwrs kzkrs bstat xnetb frath xrueb glvor grpid dokid arcid iblar awtyp awkey fikrs hwaer hwae2 hwae3 kurs2 kurs3 basw2 basw3 umrd2 umrd3 xstov stodt xmwst curt2 curt3 kuty2 kuty3 xsnet ausbk xusvr duefl awsys txkrs ctxkrs lotkz xwvof stgrd ppnam brnch numpg adisc xref1_hd xref2_hd xreversal reindat rldnr ldgrp propmano xblnr_alt vatdate doccat xsplit cash_alloc follow_on xreorg subset kurst kursx kur2x kur3x xmca resubmission _sapf15_status psoty psoak psoks psosg psofn intform intdate psobt psozl psodt psotm fm_umart ccins ccnum ssblk batch sname sampled exclude_flag blind offset_status offset_refer_dat penrc knumv _fivetran_rowid _fivetran_deleted _fivetran_synced
0 200001076 3000 2006 800 sa 20060425 20060425 4 20060425 112823 0 0 20060425 d002766 fb50 NaN NaN NaN NaN 0 NaN usd 0.0 NaN 0.0 NaN NaN 0.0 NaN rfbu NaN NaN NaN NaN bkpf 10000107630002006 3000 usd eur usd -1.24 0.0 2 2 3 3 NaN 0 NaN 30 40 m m NaN NaN NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN 0 NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 NaN 0 NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 1 False 2023-03-28T15:53:27.753Z
1 200001077 3000 2006 800 sa 20060426 20060426 4 20060426 94020 0 0 20060426 d002766 fb50 NaN NaN NaN NaN 0 NaN usd 0.0 NaN 0.0 NaN NaN 0.0 NaN rfbu NaN NaN NaN NaN bkpf 10000107730002006 3000 usd eur usd -1.24 0.0 2 2 3 3 NaN 0 NaN 30 40 m m NaN NaN NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN 0 NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 NaN 0 NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 2 False 2023-03-28T15:53:27.753Z
2 200001078 3000 2006 800 sa 20060426 20060426 4 20060426 94135 0 0 20060426 d002766 fb50 NaN NaN NaN NaN 0 NaN usd 0.0 NaN 0.0 NaN NaN 0.0 NaN rfbu NaN NaN NaN NaN bkpf 10000107830002006 3000 usd eur usd -1.24 0.0 2 2 3 3 NaN 0 NaN 30 40 m m NaN NaN NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN 0 NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 NaN 0 NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 3 False 2023-03-28T15:53:27.753Z
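
The date and time columns in this preview (bldat, budat, cpudt, cputm and so on) arrive as plain integers. Dates read as YYYYMMDD, and cputm reads as HHMMSS with the leading zero dropped (94020 would be 09:40:20). The following is a minimal sketch of a conversion, assuming that reading holds for every row. The helper names parse_sap_date and parse_sap_time are hypothetical, and treating 0 as "not set" is an assumption based on the aedat and upddt values above.

```python
from datetime import date, time
from typing import Optional


def parse_sap_date(value: int) -> Optional[date]:
    """Turn a YYYYMMDD integer such as 20060425 into a date.

    Assumption: 0 means the field was never filled in.
    """
    if value == 0:
        return None
    year, rest = divmod(value, 10_000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


def parse_sap_time(value: int) -> time:
    """Turn an HHMMSS integer such as 94020 into a time (09:40:20)."""
    hours, rest = divmod(value, 10_000)
    minutes, seconds = divmod(rest, 100)
    return time(hours, minutes, seconds)


print(parse_sap_date(20060425))  # 2006-04-25
print(parse_sap_time(94020))     # 09:40:20
```

Splitting with divmod avoids string padding, so a time that lost its leading zero is still parsed correctly.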

sap_bseg_data (first 100 rows)

belnr bukrs buzei gjahr mandt buzid augdt augcp augbl bschl koart umskz umsks zumsk shkzg gsber pargb mwskz qsskz dmbtr wrbtr kzbtr pswbt pswsl txbhw txbfw mwsts wmwst hwbas fwbas hwzuz fwzuz shzuz stekz mwart txgrp ktosl qsshb kursr gbetr bdiff bdif2 valut zuonr sgtxt zinkz vbund bewar altkt vorgn fdlev fdgrp fdwbt fdtag fkont kokrs kostl projn aufnr vbeln vbel2 posn2 eten2 anln1 anln2 anbwa bzdat pernr xumsw xhres xkres xopvw xcpdd xskst xsauf xspro xserg xfakt xuman xanet xskrl xinve xpanz xauto xncop xzahl saknr hkont kunnr lifnr filkd xbilk gvtyp hzuon zfbdt zterm zbd1t zbd2t zbd3t zbd1p zbd2p skfbt sknto wskto zlsch zlspr zbfix hbkid bvtyp nebtr mwsk1 dmbt1 wrbt1 mwsk2 dmbt2 wrbt2 mwsk3 dmbt3 wrbt3 rebzg rebzj rebzz rebzt zollt zolld lzbkz landl diekz samnr abper vrskz vrsdt disbn disbj disbz wverw anfbn anfbj anfbu anfae blnbt blnkz blnpz mschl mansp madat manst maber esrnr esrre esrpz klibt qsznr qbshb qsfbt navhw navfw matnr werks menge meins erfmg erfme bpmng bprme ebeln ebelp zekkn elikz vprsv peinh bwkey bwtar bustw rewrt rewwr bonfb bualt psalt nprei tbtkz spgrp spgrm spgrt spgrg spgrv spgrq stceg egbld eglld rstgr ryacq rpacq rdiff rdif2 prctr xhkom vname recid egrup vptnr vertt vertn vbewa depot txjcd imkey dabrz popts fipos kstrg nplnr aufpl aplzl projk paobjnr pasubnr spgrs spgrc btype etype xegdr lnran hrkft dmbe2 dmbe3 dmb21 dmb22 dmb23 dmb31 dmb32 dmb33 mwst2 mwst3 navh2 navh3 sknt2 sknt3 bdif3 rdif3 hwmet glupm xragl uzawe lokkt fistl geber stbuk txbh2 txbh3 pprct xref1 xref2 kblnr kblpos sttax fkber obzei xnegp rfzei ccbtc kkber empfb xref3 dtws1 dtws2 dtws3 dtws4 gricd grirg gityp xpypr kidno absbt idxsp linfv kontt kontl txdat agzei pycur pyamt bupla secco lstar cession_kz prznr ppdiff ppdif2 ppdif3 penlc1 penlc2 penlc3 penfc pendays penrc grant_nbr sctax fkber_long gmvkz srtype intreno measure auggj ppa_ex_ind docln segment psegment pfkber hktid kstar xlgclr taxps pays_prov pays_tran mndid xfrge_bseg squan zzspreg zzbuspartn zzchan zzproduct zzloca zzlob zzuserfld1 zzuserfld2 zzuserfld3 zzstate zzregion re_bukrs re_account pgeber pgrant_nbr budget_pd pbudget_pd j_1tpbupl perop_beg perop_end fastpay ignr_ivref fmfgus_key fmxdocnr fmxyear fmxdocln fmxzekkn prodper recrf _fivetran_rowid _fivetran_deleted _fivetran_synced
0 100000795 3000 1 2006 800 NaN 0 0 NaN 50 s NaN NaN NaN h 9900 NaN NaN NaN 297.0 297.0 0.0 297.0 eur 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0 NaN 0.0 0.0 0.0 0.0 0.0 0 2300 soll-buchung NaN NaN NaN 320700 rfbu NaN NaN 0.0 0 0 1000 2300 NaN NaN NaN NaN 0 0 NaN NaN NaN 0 0 NaN NaN x NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN x NaN 483000 NaN NaN NaN NaN x NaN 0 NaN 0 0 0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 NaN 0.0 0.0 NaN 0.0 0.0 NaN 0.0 0.0 NaN 0 0 NaN NaN 0 NaN NaN NaN 0 0 NaN 0 NaN 0 0 NaN NaN 0 NaN 0 0.0 NaN 0.0 NaN NaN 0 0 NaN NaN NaN NaN 0.0 NaN 0.0 0.0 0.0 0.0 NaN NaN 0.0 NaN 0.0 NaN 0.0 NaN NaN 0 0 NaN NaN 0 NaN NaN NaN 0.0 0.0 0.0 0.0 NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0.0 0.0 1400 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0.0 1431 NaN NaN 0 0 0 0 0 NaN NaN NaN NaN NaN 0 NaN 297.0 368.70 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 483000 NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN 0 0.0 980 0 NaN 0 NaN NaN NaN NaN 0 0 0 0 NaN NaN NaN NaN NaN 0.0 NaN 0 NaN NaN 0 0 NaN 0.0 NaN NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 NaN NaN 0.0 980 NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN NaN NaN 0 0 0 0 NaN 495617 False 2023-03-28T15:56:08.415Z
1 100000798 3000 1 2006 800 NaN 0 0 NaN 50 s NaN NaN NaN h 4000 NaN NaN NaN 1001.0 1001.0 0.0 1001.0 eur 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0 NaN 0.0 0.0 0.0 0.0 0.0 0 3120 soll-buchung NaN NaN NaN 320700 rfbu NaN NaN 0.0 0 0 1000 3120 NaN NaN NaN NaN 0 0 NaN NaN NaN 0 0 NaN NaN x NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN x NaN 483000 NaN NaN NaN NaN x NaN 0 NaN 0 0 0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 NaN 0.0 0.0 NaN 0.0 0.0 NaN 0.0 0.0 NaN 0 0 NaN NaN 0 NaN NaN NaN 0 0 NaN 0 NaN 0 0 NaN NaN 0 NaN 0 0.0 NaN 0.0 NaN NaN 0 0 NaN NaN NaN NaN 0.0 NaN 0.0 0.0 0.0 0.0 NaN NaN 0.0 NaN 0.0 NaN 0.0 NaN NaN 0 0 NaN NaN 0 NaN NaN NaN 0.0 0.0 0.0 0.0 NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0.0 0.0 1100 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0.0 1431 NaN NaN 0 0 0 0 0 NaN NaN NaN NaN NaN 0 NaN 1001.0 1242.64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 483000 NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN 0 0.0 980 0 NaN 0 NaN NaN NaN NaN 0 0 0 0 NaN NaN NaN NaN NaN 0.0 NaN 0 NaN NaN 0 0 NaN 0.0 NaN NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 NaN NaN 0.0 980 NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN NaN NaN 0 0 0 0 NaN 495618 False 2023-03-28T15:56:08.415Z
2 100000806 3000 1 2006 800 NaN 0 0 NaN 50 s NaN NaN NaN h 9900 NaN NaN NaN 13.0 13.0 0.0 13.0 eur 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN 0 NaN 0.0 0.0 0.0 0.0 0.0 0 4110 soll-buchung NaN NaN NaN 320700 rfbu NaN NaN 0.0 0 0 1000 4110 NaN NaN NaN NaN 0 0 NaN NaN NaN 0 0 NaN NaN x NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN x NaN 483000 NaN NaN NaN NaN x NaN 0 NaN 0 0 0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 NaN 0.0 0.0 NaN 0.0 0.0 NaN 0.0 0.0 NaN 0 0 NaN NaN 0 NaN NaN NaN 0 0 NaN 0 NaN 0 0 NaN NaN 0 NaN 0 0.0 NaN 0.0 NaN NaN 0 0 NaN NaN NaN NaN 0.0 NaN 0.0 0.0 0.0 0.0 NaN NaN 0.0 NaN 0.0 NaN 0.0 NaN NaN 0 0 NaN NaN 0 NaN NaN NaN 0.0 0.0 0.0 0.0 NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0.0 0.0 1400 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0.0 1431 NaN NaN 0 0 0 0 0 NaN NaN NaN NaN NaN 0 NaN 13.0 16.14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 483000 NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN 0 0.0 980 0 NaN 0 NaN NaN NaN NaN 0 0 0 0 NaN NaN NaN NaN NaN 0.0 NaN 0 NaN NaN 0 0 NaN 0.0 NaN NaN NaN NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 NaN NaN 0.0 980 NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN NaN NaN 0 0 0 0 NaN 495619 False 2023-03-28T15:56:08.415Z

sap_faglflexa_data (first 100 rows)

docln docnr rbukrs rclnt rldnr ryear activ rmvct rtcur runit awtyp rrcty rvers logsys racct cost_elem rcntr prctr rfarea rbusa kokrs segment zzspreg scntr pprctr sfarea sbusa rassc psegment tsl hsl ksl osl msl wsl drcrk poper rwcur gjahr budat belnr buzei bschl bstat linetype xsplitmod usnam timestamp_ _fivetran_rowid _fivetran_deleted _fivetran_synced
0 2 100002655 3000 800 0l 2007 rfbu NaN usd NaN bkpf 0 1 NaN 113100 NaN NaN NaN NaN NaN 2000 NaN NaN NaN NaN NaN NaN NaN NaN -2949.00 -2949.00 -2286.05 -2949.00 0.0 -2949.00 h 6 usd 2006 20060601 200001076 2 50 NaN NaN NaN steiner 20070525092226 3388016 False 2023-03-28T15:56:32.286Z
1 2 100002658 3000 800 0l 2007 rfbu NaN usd NaN bkpf 0 1 NaN 113100 NaN NaN NaN NaN NaN 2000 NaN NaN NaN NaN NaN NaN NaN NaN -655.50 -655.50 -508.14 -655.50 0.0 -655.50 h 6 usd 2006 20060601 200001077 2 50 NaN NaN NaN steiner 20070525092228 3388017 False 2023-03-28T15:56:32.248Z
2 2 100002659 3000 800 0l 2007 rfbu NaN usd NaN bkpf 0 1 NaN 113100 NaN NaN NaN NaN NaN 2000 NaN NaN NaN NaN NaN NaN NaN NaN -1595.28 -1595.28 -1236.65 -1595.28 0.0 -1595.28 h 6 usd 2006 20060601 200001078 2 50 NaN NaN NaN steiner 20070525092228 3388018 False 2023-03-28T15:56:32.286Z
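
In the sample rows, the belnr, rbukrs, gjahr and rclnt values of sap_faglflexa_data (200001076, 3000, 2006, 800) line up with the belnr, bukrs, gjahr and mandt values of the document headers in sap_bkpf_data. The sketch below joins the two on that key. That these four columns form the join key is an inference from the sample values, not something the previews state. The function name join_lines_to_headers is hypothetical, and the rows are cut down to a few columns copied from the previews.

```python
# Document headers, reduced to the key columns plus two attributes (values from the bkpf preview).
bkpf_rows = [
    {"mandt": 800, "bukrs": 3000, "belnr": 200001076, "gjahr": 2006, "blart": "sa", "waers": "usd"},
    {"mandt": 800, "bukrs": 3000, "belnr": 200001077, "gjahr": 2006, "blart": "sa", "waers": "usd"},
]

# Ledger line items, reduced the same way (values from the faglflexa preview).
faglflexa_rows = [
    {"rclnt": 800, "rbukrs": 3000, "belnr": 200001076, "gjahr": 2006, "racct": 113100, "hsl": -2949.00},
    {"rclnt": 800, "rbukrs": 3000, "belnr": 200001077, "gjahr": 2006, "racct": 113100, "hsl": -655.50},
]


def join_lines_to_headers(lines, headers):
    """Inner-join line items to headers on (client, company code, document number, fiscal year)."""
    # Assumed key: mandt/bukrs/belnr/gjahr on the header side, rclnt/rbukrs/belnr/gjahr on the line side.
    index = {(h["mandt"], h["bukrs"], h["belnr"], h["gjahr"]): h for h in headers}
    joined = []
    for line in lines:
        header = index.get((line["rclnt"], line["rbukrs"], line["belnr"], line["gjahr"]))
        if header is not None:
            joined.append({**line, "blart": header["blart"], "waers": header["waers"]})
    return joined


for row in join_lines_to_headers(faglflexa_rows, bkpf_rows):
    print(row["belnr"], row["racct"], row["hsl"], row["blart"], row["waers"])
```

Building a dictionary index over the headers keeps the join linear in the number of rows; the same key would translate directly into an SQL join condition.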

sap_faglflext_data (first 100 rows)

drcrk objnr00 objnr01 objnr02 objnr03 objnr04 objnr05 objnr06 objnr07 objnr08 rclnt rpmax ryear activ rmvct rtcur runit awtyp rldnr rrcty rvers logsys racct cost_elem rbukrs rcntr prctr rfarea rbusa kokrs segment zzspreg scntr pprctr sfarea sbusa rassc psegment tslvt tsl01 tsl02 tsl03 tsl04 tsl05 tsl06 tsl07 tsl08 tsl09 tsl10 tsl11 tsl12 tsl13 tsl14 tsl15 tsl16 hslvt hsl01 hsl02 hsl03 hsl04 hsl05 hsl06 hsl07 hsl08 hsl09 hsl10 hsl11 hsl12 hsl13 hsl14 hsl15 hsl16 kslvt ksl01 ksl02 ksl03 ksl04 ksl05 ksl06 ksl07 ksl08 ksl09 ksl10 ksl11 ksl12 ksl13 ksl14 ksl15 ksl16 oslvt osl01 osl02 osl03 osl04 osl05 osl06 osl07 osl08 osl09 osl10 osl11 osl12 osl13 osl14 osl15 osl16 mslvt msl01 msl02 msl03 msl04 msl05 msl06 msl07 msl08 msl09 msl10 msl11 msl12 msl13 msl14 msl15 msl16 timestamp_ _fivetran_rowid _fivetran_deleted _fivetran_synced
0 h 7 1 76 73 0 0 0 0 0 800 16 2002 rfbu NaN usd NaN bkpf 0l 0 1 NaN 140000 NaN 3000 NaN NaN NaN 3000 2000 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0.00 0.00 0.0 0.00 0.00 0.00 0.00 0.00 0.0 0.00 -245194.66 -245194.66 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0.0 0.00 0.00 0.00 0.00 0.00 0.0 0.00 -245194.66 -245194.66 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -204328.07 -204328.07 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0.0 0.00 0.00 0.00 0.00 0.00 0.0 0.00 -245194.66 -245194.66 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 20060117132931 196611 False 2023-04-03T20:52:44.147Z
1 s 7 1 137 394 0 0 0 0 0 800 16 2002 rfbu NaN usd NaN bkpf 0l 0 1 NaN 403000 NaN 3000 4215.0 3005.0 100.0 3000 2000 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 5093.77 4992.89 4841.6 5245.07 4892.03 5093.77 5194.64 4942.46 5144.2 5043.34 4892.03 4791.17 0.0 0.0 0.0 0.0 0.0 5093.77 4992.89 4841.6 5245.07 4892.03 5093.77 5194.64 4942.46 5144.2 5043.34 4892.03 4791.17 0.0 0.0 0.0 0.0 0.0 4244.79 4160.73 4034.65 4370.87 4076.68 4244.79 4328.85 4118.70 4286.82 4202.77 4076.68 3992.63 0.0 0.0 0.0 0.0 0.0 5093.77 4992.89 4841.6 5245.07 4892.03 5093.77 5194.64 4942.46 5144.2 5043.34 4892.03 4791.17 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 20060117134333 196612 False 2023-04-03T20:52:43.849Z
2 s 7 1 137 395 0 0 0 0 0 800 16 2002 rfbu NaN usd NaN bkpf 0l 0 1 NaN 403000 NaN 3000 4216.0 3005.0 100.0 3000 2000 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 6009.50 5890.49 5712.0 6188.00 5771.49 6009.50 6128.50 5830.99 6069.0 5950.00 5771.49 5652.50 0.0 0.0 0.0 0.0 0.0 6009.50 5890.49 5712.0 6188.00 5771.49 6009.50 6128.50 5830.99 6069.0 5950.00 5771.49 5652.50 0.0 0.0 0.0 0.0 0.0 5007.90 4908.72 4759.98 5156.65 4809.56 5007.90 5107.06 4859.14 5057.48 4958.31 4809.56 4710.40 0.0 0.0 0.0 0.0 0.0 6009.50 5890.49 5712.0 6188.00 5771.49 6009.50 6128.50 5830.99 6069.0 5950.00 5771.49 5652.50 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 20060117134333 196613 False 2023-04-03T20:52:43.849Z
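
sap_faglflext_data is wide: each amount family (tsl, hsl, ksl, osl, msl) has one column per period, 01 through 16, plus a column with the vt suffix. For modeling it is often easier to work with one row per period. The sketch below unpivots one family into long form. The function name unpivot_periods and the output field names period and amount are hypothetical, and the sample row keeps only the first three hsl values from the second preview row.

```python
def unpivot_periods(row, prefix="hsl", max_period=16):
    """Turn columns like hsl01..hsl16 into one record per period.

    Columns missing from the row are skipped, so a trimmed-down row still works.
    """
    base = {key: row[key] for key in ("rbukrs", "ryear", "racct")}
    records = []
    for period in range(1, max_period + 1):
        column = f"{prefix}{period:02d}"
        if column in row:
            records.append({**base, "period": period, "amount": row[column]})
    return records


# Values copied from the second preview row (racct 403000), first three periods only.
sample_row = {
    "rbukrs": 3000, "ryear": 2002, "racct": 403000,
    "hsl01": 5093.77, "hsl02": 4992.89, "hsl03": 4841.6,
}

for record in unpivot_periods(sample_row):
    print(record)
```

The vt column is left out of the loop on purpose, since it does not correspond to a numbered period.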

sap_kna1_data (first 100 rows)

kunnr mandt land1 name1 name2 ort01 pstlz regio sortl stras telf1 telfx xcpdk adrnr mcod1 mcod2 mcod3 anred aufsd bahne bahns bbbnr bbsnr begru brsch bubkz datlt erdat ernam exabl faksd fiskn knazk knrza konzs ktokd kukla lifnr lifsd locco loevm name3 name4 niels ort02 pfach pstl2 counc cityc rpmkr sperr spras stcd1 stcd2 stkza stkzu telbx telf2 teltx telx1 lzone xzemp vbund stceg dear1 dear2 dear3 dear4 dear5 gform bran1 bran2 bran3 bran4 bran5 ekont umsat umjah uwaer jmzah jmjah katr1 katr2 katr3 katr4 katr5 katr6 katr7 katr8 katr9 katr10 stkzn umsa1 txjcd periv abrvw inspbydebi inspatdebi ktocd pfort werks dtams dtaws duefl hzuor sperz etikg civve milve kdkg1 kdkg2 kdkg3 kdkg4 kdkg5 xknza fityp stcdt stcd3 stcd4 stcd5 xicms xxipi xsubt cfopc txlw1 txlw2 ccc01 ccc02 ccc03 ccc04 cassd knurl j_1kfrepre j_1kftbus j_1kftind confs updat uptim nodel dear6 cvp_xblck suframa rg exp uf rgdate ric rne rnedate cnae legalnat crtn icmstaxpay indtyp tdt comsize decregpc _vso_r_palhgt _vso_r_pal_ul _vso_r_pk_mat _vso_r_matpal _vso_r_i_no_lyr _vso_r_one_mat _vso_r_one_sort _vso_r_uld_side _vso_r_load_pref _vso_r_dpoint _xlso_customer _xlso_sysid _xlso_client _xlso_partner _xlso_pref_pay alc pmt_office fee_schedule duns duns4 psofg psois pson1 pson2 pson3 psovn psotl psohs psost psoo1 psoo2 psoo3 psoo4 psoo5 oidrc oid_poreqd oipbl _fivetran_rowid _fivetran_deleted _fivetran_synced
0 CA301 800 USA Dunder Mifflin NaN Scranton 18503 130 CA301 3927 Saticoy St NaN NaN NaN 620981 DUNDER MIFFLIN NaN SCRANTON NaN NaN NaN NaN 0 0 NaN NaN 0 NaN 20111201 SCHRUTE NaN NaN NaN NaN NaN NaN DWIGHT NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0 NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN X 0 NaN NaN X NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN 0 NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 False 2023-01-16T20:59:25.282Z
1 CA302 800 PO Krusty Krab. NaN Bikini Bottom 000001 100 CA302 831 Bottom Feeder Lane NaN NaN NaN 620983 KRUSTY KRAB NaN BIKINI BOTTOM NaN NaN NaN NaN 0 0 NaN NaN 0 NaN 20111201 KRABS NaN NaN NaN NaN NaN NaN EUGENE NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0 NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN X 0 NaN NaN X NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN 0 NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2 False 2023-01-16T20:59:25.282Z
2 CA303 800 UK Holmes And Watson NaN London NW1 6XE 120 CA303 221B Baker Street NaN NaN NaN 620985 HOLMES AND WATSON NaN LONDON NaN NaN NaN NaN 0 0 NaN NaN 0 NaN 20111201 HOLMES NaN NaN NaN NaN NaN NaN WATSON NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0 NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN X 0 NaN NaN X NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN 0 NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 3 False 2023-01-16T20:59:25.283Z

sap_lfa1_data (first 100 rows)

lifnr mandt land1 name1 name2 name3 name4 ort01 ort02 pfach pstl2 pstlz regio sortl stras adrnr mcod1 mcod2 mcod3 anred bahns bbbnr bbsnr begru brsch bubkz datlt dtams dtaws erdat ernam esrnr konzs ktokk kunnr lnrza loevm sperr sperm spras stcd1 stcd2 stkza stkzu telbx telf1 telf2 telfx teltx telx1 xcpdk xzemp vbund fiskn stceg stkzn sperq gbort gbdat sexkz kraus revdb qssys ktock pfort werks ltsna werkr plkal duefl txjcd sperz scacd sfrgr lzone xlfza dlgrp fityp stcdt regss actss stcd3 stcd4 stcd5 ipisp taxbs profs stgdl emnfr lfurl j_1kfrepre j_1kftbus j_1kftind confs updat uptim nodel qssysdat podkzb fisku stenr carrier_conf min_comp term_li crc_num cvp_xblck rg exp uf rgdate ric rne rnedate cnae legalnat crtn icmstaxpay indtyp tdt comsize decregpc j_sc_capital j_sc_currency alc pmt_office ppa_relevant psofg psois pson1 pson2 pson3 psovn psotl psohs psost transport_chain staging_time scheduling_type submi_relevant _fivetran_rowid _fivetran_deleted _fivetran_synced
0 EWM_3001 800 US Willy Wonka Chocolate Factory NaN NaN NaN ITASCA DUPAGE NaN NaN 11223 IL EWM 1445 West Norwood Avenue 64202 WILLY WONKA CHOCOLATE FACTORY NaN ITASCA Company NaN 0 0 NaN None 0 NaN NaN NaN 20071127 C5093610 NaN NaN VEND NaN NaN NaN NaN NaN E NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN X 3.304720e+09 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN 0 NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 1 False 2023-02-27T15:34:02.636Z
1 EWM_3003 800 US Nakatomi Plaza NaN NaN NaN LOS ANGELES CENTURY CITY NaN NaN 60154 CA EWM 2121 Avenue of the Stars 64203 NAKATOMI PLAZA NaN LOS ANGELES Company NaN 0 0 NaN None 0 NaN NaN NaN 20071127 C5093610 NaN NaN VEND NaN NaN NaN NaN NaN E NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN X 1.403131e+09 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN 0 NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 2 False 2023-02-27T15:34:02.637Z
2 EXTERN 800 US Initech NaN NaN NaN AUSTIN AUSTIN NaN NaN 73301 TX FSC120 4120 Freidrich Lane 46098 INITECH NaN AUSTIN Firma NaN 0 0 NaN TRAD 0 NaN NaN NaN 20040324 D036964 NaN NaN VEND NaN NaN NaN NaN NaN D NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN X NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN 0 NaN 0 NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN 3 False 2023-02-27T15:34:02.637Z

sap_mara_data (first 100 rows)

mandt matnr ersda ernam laeda aenam vpsta pstat lvorm mtart mbrsh matkl bismt meins bstme zeinr zeiar zeivr zeifo aeszn blatt blanz ferth formt groes wrkst normt labor ekwsl brgew ntgew gewei volum voleh behvo raube tempb disst tragr stoff spart kunnr eannr wesch bwvor bwscl saiso etiar etifo entar ean11 numtp laeng breit hoehe meabm prdha aeklk cadkz qmpur ergew ergei ervol ervoe gewto volto vabme kzrev kzkfg xchpf vhart fuelg stfak magrv begru datab liqdt saisj plgtp mlgut extwg satnr attyp kzkup kznfm pmata mstae mstav mstde mstdv taklv rbnrm mhdrz mhdhb mhdlp inhme inhal vpreh etiag inhbr cmeth cuobf kzumw kosch sprof nrfhg mfrpn mfrnr bmatn mprof kzwsm saity profl ihivi iloos serlv kzgvh xgchp kzeff compl iprkz rdmhd przus mtpos_mara bflme nsnid _fivetran_rowid _fivetran_deleted _fivetran_synced
0 700 51066122 10230308 hvruser 10230308 hvruser k k NaN fert m NaN updated desc bag NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 NaN 0.0 NaN 0.0 0.0 NaN NaN NaN NaN NaN 0 0 NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN 0 0 0 NaN 0.0 0 NaN 0.0 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN norm NaN NaN 26047 False 2023-03-08T05:03:51.321Z
1 700 51066123 10230308 hvruser 10000000 None k k NaN zmdg m NaN None ea NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 NaN 0.0 NaN 0.0 0.0 NaN NaN NaN NaN NaN 0 0 NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN 0 0 0 NaN 0.0 0 NaN 0.0 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN norm NaN NaN 26048 False 2023-03-08T14:25:56.786Z
2 700 51066124 10230309 hvruser 10230309 hvruser k k NaN fert m NaN updated desc bag NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 NaN NaN NaN NaN NaN 0.0 NaN 0.0 NaN 0.0 0.0 NaN NaN NaN NaN NaN 0 0 NaN NaN 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 NaN NaN 0 0 0 NaN 0.0 0 NaN 0.0 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN norm NaN NaN 26053 False 2023-03-09T09:35:26.934Z

sap_pa0000_data (first 100 rows)

begda endda mandt objps pernr seqnr sprps subty aedtm uname histo itxex refex ordex itbld preas flag1 flag2 flag3 flag4 rese1 rese2 grpvl massn massg stat1 stat2 stat3 _fivetran_rowid _fivetran_deleted _fivetran_synced
0 20020101 99991231 800 NaN 10 0 NaN NaN 20030507 bobsponge NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 NaN NaN 3 1 1 False 2023-06-15T13:01:17.79Z
1 20030101 99991231 800 NaN 69 0 NaN NaN 20030917 wardsquid NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 NaN NaN 3 1 2 False 2023-06-15T13:01:17.79Z
2 20030101 99991231 800 NaN 70 0 NaN NaN 20030917 starpatrick NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 52 NaN NaN 3 1 3 False 2023-06-15T13:01:17.79Z
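
Each sap_pa00xx record carries a validity interval in begda and endda, with 99991231 used as the end date in every sample row. Because the values are YYYYMMDD integers, they sort chronologically and can be compared directly. The sketch below picks the records valid on a given day, assuming the interval is inclusive on both ends; the function name valid_on is hypothetical.

```python
def valid_on(records, as_of):
    """Return the records whose [begda, endda] interval contains as_of (a YYYYMMDD integer)."""
    return [r for r in records if r["begda"] <= as_of <= r["endda"]]


# Key columns copied from the sap_pa0000_data preview.
pa0000_rows = [
    {"pernr": 10, "begda": 20020101, "endda": 99991231},
    {"pernr": 69, "begda": 20030101, "endda": 99991231},
    {"pernr": 70, "begda": 20030101, "endda": 99991231},
]

print([r["pernr"] for r in valid_on(pa0000_rows, 20020615)])  # [10]
```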

sap_pa0001_data (first 100 rows)

begda endda mandt objps pernr seqnr sprps subty aedtm uname histo itxex refex ordex itbld preas flag1 flag2 flag3 flag4 rese1 rese2 grpvl bukrs werks persg persk vdsk1 gsber btrtl juper abkrs ansvh kostl orgeh plans stell mstbr sacha sachp sachz sname ename otype sbmod kokrs fistl geber fkber grant_nbr sgmnt budget_pd _fivetran_rowid _fivetran_deleted _fivetran_synced
0 20020101 99991231 800 NaN 10 0 NaN NaN 20030507 powersa NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2000 200 1 gc 200 NaN 2 NaN g1 NaN NaN 50001357 50005214 50016575 NaN NaN 3 NaN powers austin austin powers s 200 1000 NaN NaN NaN NaN NaN NaN 1 False 2023-06-15T13:01:26.498Z
1 20030101 99991231 800 NaN 69 0 NaN NaN 20111114 c5115457 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2000 200 1 gc 200 NaN 2 NaN g1 NaN NaN 50002214 50005687 50029038 NaN NaN 3 NaN bob sponge mr sponge bob s 200 1000 NaN NaN NaN NaN NaN NaN 2 False 2023-06-15T13:01:26.498Z
2 20030101 99991231 800 NaN 70 0 NaN NaN 20111114 c5115457 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2000 200 1 gc 200 NaN 2 NaN g1 NaN NaN 50002214 50005691 50043146 NaN NaN 3 NaN wayne bruce mr bruce wayne s 200 1000 NaN NaN NaN NaN NaN NaN 3 False 2023-06-15T13:01:26.498Z

sap_pa0007_data (first 100 rows)

begda endda mandt objps pernr seqnr sprps subty aedtm uname histo itxex refex ordex itbld preas flag1 flag2 flag3 flag4 rese1 rese2 grpvl schkz zterf empct mostd wostd arbst wkwdy jrstd teilk minta maxta minwo maxwo minmo maxmo minja maxja dysch kztim wweek awtyp _fivetran_rowid _fivetran_deleted _fivetran_synced
0 19911215 99991231 800 NaN 80052 0 NaN NaN 20121029 C5174732 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN norm 0 100.0 173.34 40.0 8.0 5.0 2080.0 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 1414 False 2023-06-15T13:12:49.091Z
1 19920315 99991231 800 NaN 80053 0 NaN NaN 20121029 C5174732 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN norm 0 100.0 173.34 40.0 8.0 5.0 2080.0 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 1415 False 2023-06-15T13:12:49.091Z
2 19940101 99991231 800 NaN 1003 0 NaN NaN 19950531 LIMPERT NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN flex 0 100.0 156.48 36.0 7.2 5.0 1879.2 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 12 False 2023-06-15T13:12:48.906Z

sap_pa0008_data (first 100 rows)

begda endda mandt objps pernr seqnr sprps subty aedtm uname histo itxex refex ordex itbld preas flag1 flag2 flag3 flag4 rese1 rese2 grpvl trfar trfgb trfgr trfst stvor orzst partn waers vglta vglgb vglgr vglst vglsv bsgrd divgv ansal falgk falgr lga01 bet01 anz01 ein01 opk01 lga02 bet02 anz02 ein02 opk02 lga03 bet03 anz03 ein03 opk03 lga04 bet04 anz04 ein04 opk04 lga05 bet05 anz05 ein05 opk05 lga06 bet06 anz06 ein06 opk06 lga07 bet07 anz07 ein07 opk07 lga08 bet08 anz08 ein08 opk08 lga09 bet09 anz09 ein09 opk09 lga10 bet10 anz10 ein10 opk10 lga11 bet11 anz11 ein11 opk11 lga12 bet12 anz12 ein12 opk12 lga13 bet13 anz13 ein13 opk13 lga14 bet14 anz14 ein14 opk14 lga15 bet15 anz15 ein15 opk15 lga16 bet16 anz16 ein16 opk16 lga17 bet17 anz17 ein17 opk17 lga18 bet18 anz18 ein18 opk18 lga19 bet19 anz19 ein19 opk19 lga20 bet20 anz20 ein20 opk20 lga21 bet21 anz21 ein21 opk21 lga22 bet22 anz22 ein22 opk22 lga23 bet23 anz23 ein23 opk23 lga24 bet24 anz24 ein24 opk24 lga25 bet25 anz25 ein25 opk25 lga26 bet26 anz26 ein26 opk26 lga27 bet27 anz27 ein27 opk27 lga28 bet28 anz28 ein28 opk28 lga29 bet29 anz29 ein29 opk29 lga30 bet30 anz30 ein30 opk30 lga31 bet31 anz31 ein31 opk31 lga32 bet32 anz32 ein32 opk32 lga33 bet33 anz33 ein33 opk33 lga34 bet34 anz34 ein34 opk34 lga35 bet35 anz35 ein35 opk35 lga36 bet36 anz36 ein36 opk36 lga37 bet37 anz37 ein37 opk37 lga38 bet38 anz38 ein38 opk38 lga39 bet39 anz39 ein39 opk39 lga40 bet40 anz40 ein40 opk40 ind01 ind02 ind03 ind04 ind05 ind06 ind07 ind08 ind09 ind10 ind11 ind12 ind13 ind14 ind15 ind16 ind17 ind18 ind19 ind20 ind21 ind22 ind23 ind24 ind25 ind26 ind27 ind28 ind29 ind30 ind31 ind32 ind33 ind34 ind35 ind36 ind37 ind38 ind39 ind40 ancur cpind flaga _fivetran_rowid _fivetran_deleted _fivetran_synced
0 19750401 99991231 800 NaN 22314 0 NaN 0 20140919 I026759 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 2 AGE 55 0 NaN NaN JPY NaN NaN AGE 55.0 0 0.0 0.00 0.0 NaN NaN M000 0.0 0.0 NaN NaN M001 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN I I NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN JPY T NaN 5969 False 2023-06-15T13:12:52.308Z
1 19911215 99991231 800 NaN 80052 0 NaN 0 20121029 C5174732 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 1 GRD10 3 0 NaN NaN USD NaN NaN None NaN 0 100.0 86.67 0.0 NaN NaN 1002 0.0 0.0 NaN NaN None 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN I None NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN USD T NaN 1855 False 2023-06-15T13:12:51.411Z
2 19920315 99991231 800 NaN 80053 0 NaN 0 20121029 C5174732 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 1 GRD10 1 0 NaN NaN USD NaN NaN None NaN 0 100.0 86.67 0.0 NaN NaN 1002 0.0 0.0 NaN NaN None 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN 0.0 0.0 NaN NaN I None NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN USD T NaN 1856 False 2023-06-15T13:12:51.411Z

sap_pa0031_data (first 100 rows)

aedtm begda endda flag1 flag2 flag3 flag4 grpvl histo itbld itxex mandt objps ordex pernr preas refex rese1 rese2 rfp01 rfp02 rfp03 rfp04 rfp05 rfp06 rfp07 rfp08 rfp09 rfp10 rfp11 rfp12 rfp13 rfp14 rfp15 rfp16 rfp17 rfp18 rfp19 rfp20 seqnr sprps subty uname _fivetran_rowid _fivetran_deleted _fivetran_synced
0 20140919 19750401 99991231 NaN NaN NaN NaN NaN NaN NaN NaN 800 NaN NaN 22314 NaN NaN NaN NaN 12345678 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN 0 I026759 1 False 2023-06-15T13:01:17.79Z
1 20121029 19911215 99991231 NaN NaN NaN NaN NaN NaN NaN NaN 800 NaN NaN 80052 NaN NaN NaN NaN 23456789 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN 0 I026759 2 False 2023-06-15T13:01:18.79Z
2 20140922 19920315 99991231 NaN NaN NaN NaN NaN NaN NaN NaN 800 NaN NaN 80053 NaN NaN NaN NaN 34567890 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 NaN 0 C5174732 3 False 2023-06-15T13:01:19.79Z

sap_ska1_data (first 100 rows)

mandt ktopl saknr bilkt gvtyp vbund xbilk sakan erdat ernam ktoks xloev xspea xspeb xspep func_area mustr _fivetran_rowid _fivetran_deleted _fivetran_synced
0 0 dabe 111000 NaN NaN NaN x 111000 19920624 sap sako NaN NaN NaN NaN NaN NaN 1 False 2023-03-02T15:28:38.925Z
1 0 dabe 112000 NaN NaN NaN x 112000 19920625 sap sako NaN NaN NaN NaN NaN NaN 2 False 2023-03-02T15:28:38.929Z
2 0 dabe 113000 NaN NaN NaN x 113000 19920626 sap sako NaN NaN NaN NaN NaN NaN 3 False 2023-03-02T15:28:38.929Z

sap_t001_data (first 100 rows)

bukrs mandt butxt ort01 land1 waers spras ktopl waabw periv kokfi rcomp adrnr stceg fikrs xfmco xfmcb xfmca txjcd fmhrdate buvar fdbuk xfdis xvalv xskfn kkber xmwsn mregl xgsbe xgjrv xkdft xprod xeink xjvaa xvvwa xslta xfdmm xfdsd xextb ebukr ktop2 umkrs bukrs_glob fstva opvar xcovr txkrs wfvar xbbbf xbbbe xbbba xbbko xstdt mwskv mwska impda xnegp xkkbi wt_newwt pp_pdate infmt fstvare kopim dkweg offsacct bapovar xcos xcession xsplt surccm dtprov dtamtc dttaxc dttdsp dtaxr xvatdate pst_per_var xbbsc fm_derive_acc _fivetran_rowid _fivetran_deleted _fivetran_synced
0 3000 811 novi grad sokovia so eur f cafr 10 k4 2 2223.0 NaN fr93341612697 2222.0 NaN NaN NaN NaN 0 2.0 NaN x x NaN 2222.0 NaN NaN None x x NaN x NaN NaN NaN x x NaN NaN None 2200.0 NaN 2222 2222 NaN NaN NaN NaN NaN NaN NaN NaN v0 a0 2.0 x NaN x NaN NaN None NaN NaN 0 NaN 2.0 x NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 336 False 2023-01-16T20:59:37.848Z
1 3000 0 florin city florin fn eur s cape 10 k4 2 NaN 21634.0 None NaN NaN NaN NaN NaN 0 2.0 NaN None x NaN NaN NaN NaN None x None NaN x NaN NaN NaN None None NaN NaN None NaN NaN 1 1 NaN NaN NaN NaN NaN NaN NaN NaN c0 d0 NaN None NaN x NaN NaN None NaN NaN 0 NaN NaN None NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 46 False 2023-01-16T20:59:37.828Z
2 3000 800 kingdom of genovia genovia ga eur o int 10 k4 2 1107.0 66216.0 None NaN NaN NaN NaN NaN 20000101 NaN NaN x x NaN 1000.0 NaN NaN x x x NaN None NaN NaN NaN None None NaN NaN gkr 1000.0 NaN 1000 1000 NaN NaN 1000.0 NaN NaN NaN NaN NaN v0 a0 NaN x NaN None NaN NaN fmre NaN NaN 0 NaN 2.0 x NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 114 False 2023-01-16T20:59:37.833Z

sap_t503_data (first 100 rows)

mandt persg persk abart abtyp antyp trfkz zeity aksta ansta austa konty burkz molga typsz inwid _fivetran_rowid _fivetran_deleted _fivetran_synced
0 800 1 A0 3 3 NaN 3 2 0 0 0 1 0 NaN NaN NaN 692 False 2023-06-15T14:31:17.313Z
1 800 1 A1 2 2 NaN 2 1 0 0 0 2 0 NaN NaN NaN 693 False 2023-06-15T14:31:17.313Z
2 800 1 A2 2 2 NaN 2 1 0 0 0 2 0 NaN NaN NaN 694 False 2023-06-15T14:31:17.313Z

sap_t880_data (first 100 rows)

mandt rcomp name1 cntry name2 langu stret pobox pstlc city curr modcp glsip resta rform zweig mcomp mclnt lccomp strt2 indpo _fivetran_rowid _fivetran_deleted _fivetran_synced
0 800 1 Willy Wonka Chocolate Factory US NaN D 1445 West Norwood Avenue NaN 11223 Walldorf USD NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN 5 False 2023-06-26T11:19:44.003Z
1 800 5 Holmes And Watson UK NaN D 221B Baker Street NaN NW1 6XE London GBP NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN 6 False 2023-06-26T11:19:44.004Z
2 800 6 Nakatomi Plaza US NaN E 2121 Avenue of the Stars NaN 60154 Los Angeles USD NaN NaN NaN NaN NaN NaN 0 NaN NaN NaN 7 False 2023-06-26T11:19:44.004Z

Source tables may contain typos, unclear column names, incorrect column types, and similar problems. We clean these tables and show each cleaned (staging) table with the SQL that produces it.

stg_sap_pa0007_data (first 100 rows)

client_id employee_id sequence_number last_modified_by schedule_type time_recording_indicator employment_percentage monthly_hours weekly_hours daily_hours workdays_per_week yearly_hours min_daily_hours max_daily_hours min_weekly_hours max_weekly_hours min_monthly_hours max_monthly_hours min_yearly_hours max_yearly_hours row_id is_deleted dynamic_scheduling last_modified_date valid_from_date valid_to_date
0 800 80052 0 C5174732 norm 0 100.0 173.34 40.0 8.0 5.0 2080.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1414 False None 2012-10-29 1991-12-15 9999-12-31
1 800 80053 0 C5174732 norm 0 100.0 173.34 40.0 8.0 5.0 2080.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1415 False None 2012-10-29 1992-03-15 9999-12-31
2 800 1003 0 LIMPERT flex 0 100.0 156.48 36.0 7.2 5.0 1879.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 12 False None 1995-05-31 1994-01-01 9999-12-31

stg_sap_pa0007_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 05:14:43.756140+00:00
WITH 
"sap_pa0007_data_projected" AS (
    -- Projection: Selecting 46 out of 47 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "begda",
        "endda",
        "mandt",
        "objps",
        "pernr",
        "seqnr",
        "sprps",
        "subty",
        "aedtm",
        "uname",
        "histo",
        "itxex",
        "refex",
        "ordex",
        "itbld",
        "preas",
        "flag1",
        "flag2",
        "flag3",
        "flag4",
        "rese1",
        "rese2",
        "grpvl",
        "schkz",
        "zterf",
        "empct",
        "mostd",
        "wostd",
        "arbst",
        "wkwdy",
        "jrstd",
        "teilk",
        "minta",
        "maxta",
        "minwo",
        "maxwo",
        "minmo",
        "maxmo",
        "minja",
        "maxja",
        "dysch",
        "kztim",
        "wweek",
        "awtyp",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_pa0007_data"
),

"sap_pa0007_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- begda -> valid_from_date
    -- endda -> valid_to_date
    -- mandt -> client_id
    -- objps -> personnel_calculation_object
    -- pernr -> employee_id
    -- seqnr -> sequence_number
    -- sprps -> lock_indicator
    -- subty -> subtype
    -- aedtm -> last_modified_date
    -- uname -> last_modified_by
    -- histo -> is_historical
    -- itxex -> integration_time_execution
    -- refex -> external_reference
    -- ordex -> execution_order
    -- itbld -> integration_time_building
    -- preas -> reason_code
    -- flag1 -> custom_flag_1
    -- flag2 -> custom_flag_2
    -- flag3 -> custom_flag_3
    -- flag4 -> custom_flag_4
    -- rese1 -> reserve_field_1
    -- rese2 -> reserve_field_2
    -- grpvl -> group_value
    -- schkz -> schedule_type
    -- zterf -> time_recording_indicator
    -- empct -> employment_percentage
    -- mostd -> monthly_hours
    -- wostd -> weekly_hours
    -- arbst -> daily_hours
    -- wkwdy -> workdays_per_week
    -- jrstd -> yearly_hours
    -- teilk -> part_time_indicator
    -- minta -> min_daily_hours
    -- maxta -> max_daily_hours
    -- minwo -> min_weekly_hours
    -- maxwo -> max_weekly_hours
    -- minmo -> min_monthly_hours
    -- maxmo -> max_monthly_hours
    -- minja -> min_yearly_hours
    -- maxja -> max_yearly_hours
    -- dysch -> dynamic_scheduling
    -- kztim -> time_management_indicator
    -- wweek -> work_week_definition
    -- awtyp -> work_time_type
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "begda" AS "valid_from_date",
        "endda" AS "valid_to_date",
        "mandt" AS "client_id",
        "objps" AS "personnel_calculation_object",
        "pernr" AS "employee_id",
        "seqnr" AS "sequence_number",
        "sprps" AS "lock_indicator",
        "subty" AS "subtype",
        "aedtm" AS "last_modified_date",
        "uname" AS "last_modified_by",
        "histo" AS "is_historical",
        "itxex" AS "integration_time_execution",
        "refex" AS "external_reference",
        "ordex" AS "execution_order",
        "itbld" AS "integration_time_building",
        "preas" AS "reason_code",
        "flag1" AS "custom_flag_1",
        "flag2" AS "custom_flag_2",
        "flag3" AS "custom_flag_3",
        "flag4" AS "custom_flag_4",
        "rese1" AS "reserve_field_1",
        "rese2" AS "reserve_field_2",
        "grpvl" AS "group_value",
        "schkz" AS "schedule_type",
        "zterf" AS "time_recording_indicator",
        "empct" AS "employment_percentage",
        "mostd" AS "monthly_hours",
        "wostd" AS "weekly_hours",
        "arbst" AS "daily_hours",
        "wkwdy" AS "workdays_per_week",
        "jrstd" AS "yearly_hours",
        "teilk" AS "part_time_indicator",
        "minta" AS "min_daily_hours",
        "maxta" AS "max_daily_hours",
        "minwo" AS "min_weekly_hours",
        "maxwo" AS "max_weekly_hours",
        "minmo" AS "min_monthly_hours",
        "maxmo" AS "max_monthly_hours",
        "minja" AS "min_yearly_hours",
        "maxja" AS "max_yearly_hours",
        "dysch" AS "dynamic_scheduling",
        "kztim" AS "time_management_indicator",
        "wweek" AS "work_week_definition",
        "awtyp" AS "work_time_type",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_pa0007_data_projected"
),

"sap_pa0007_data_projected_renamed_casted" AS (
    -- Column Type Casting: 
    -- custom_flag_1: from DECIMAL to VARCHAR
    -- custom_flag_2: from DECIMAL to VARCHAR
    -- custom_flag_3: from DECIMAL to VARCHAR
    -- custom_flag_4: from DECIMAL to VARCHAR
    -- dynamic_scheduling: from DECIMAL to VARCHAR
    -- execution_order: from DECIMAL to VARCHAR
    -- external_reference: from DECIMAL to VARCHAR
    -- group_value: from DECIMAL to VARCHAR
    -- integration_time_building: from DECIMAL to VARCHAR
    -- integration_time_execution: from DECIMAL to VARCHAR
    -- is_historical: from DECIMAL to VARCHAR
    -- last_modified_date: from INT to DATE
    -- lock_indicator: from DECIMAL to VARCHAR
    -- part_time_indicator: from DECIMAL to VARCHAR
    -- personnel_calculation_object: from DECIMAL to VARCHAR
    -- reason_code: from DECIMAL to VARCHAR
    -- reserve_field_1: from DECIMAL to VARCHAR
    -- reserve_field_2: from DECIMAL to VARCHAR
    -- subtype: from DECIMAL to VARCHAR
    -- time_management_indicator: from DECIMAL to VARCHAR
    -- valid_from_date: from INT to DATE
    -- valid_to_date: from INT to DATE
    -- work_time_type: from DECIMAL to VARCHAR
    -- work_week_definition: from DECIMAL to VARCHAR
    SELECT
        "client_id",
        "employee_id",
        "sequence_number",
        "last_modified_by",
        "schedule_type",
        "time_recording_indicator",
        "employment_percentage",
        "monthly_hours",
        "weekly_hours",
        "daily_hours",
        "workdays_per_week",
        "yearly_hours",
        "min_daily_hours",
        "max_daily_hours",
        "min_weekly_hours",
        "max_weekly_hours",
        "min_monthly_hours",
        "max_monthly_hours",
        "min_yearly_hours",
        "max_yearly_hours",
        "row_id",
        "is_deleted",
        CAST("custom_flag_1" AS VARCHAR) AS "custom_flag_1",
        CAST("custom_flag_2" AS VARCHAR) AS "custom_flag_2",
        CAST("custom_flag_3" AS VARCHAR) AS "custom_flag_3",
        CAST("custom_flag_4" AS VARCHAR) AS "custom_flag_4",
        CAST("dynamic_scheduling" AS VARCHAR) AS "dynamic_scheduling",
        CAST("execution_order" AS VARCHAR) AS "execution_order",
        CAST("external_reference" AS VARCHAR) AS "external_reference",
        CAST("group_value" AS VARCHAR) AS "group_value",
        CAST("integration_time_building" AS VARCHAR) AS "integration_time_building",
        CAST("integration_time_execution" AS VARCHAR) AS "integration_time_execution",
        CAST("is_historical" AS VARCHAR) AS "is_historical",
        CAST(strptime(CAST("last_modified_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "last_modified_date",
        CAST("lock_indicator" AS VARCHAR) AS "lock_indicator",
        CAST("part_time_indicator" AS VARCHAR) AS "part_time_indicator",
        CAST("personnel_calculation_object" AS VARCHAR) AS "personnel_calculation_object",
        CAST("reason_code" AS VARCHAR) AS "reason_code",
        CAST("reserve_field_1" AS VARCHAR) AS "reserve_field_1",
        CAST("reserve_field_2" AS VARCHAR) AS "reserve_field_2",
        CAST("subtype" AS VARCHAR) AS "subtype",
        CAST("time_management_indicator" AS VARCHAR) AS "time_management_indicator",
        CAST(strptime(CAST("valid_from_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "valid_from_date",
        CAST(strptime(CAST("valid_to_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "valid_to_date",
        CAST("work_time_type" AS VARCHAR) AS "work_time_type",
        CAST("work_week_definition" AS VARCHAR) AS "work_week_definition"
    FROM "sap_pa0007_data_projected_renamed"
),

"sap_pa0007_data_projected_renamed_casted_missing_handled" AS (
    -- Handling missing values: There are 20 columns with unacceptable missing values
    -- custom_flag_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- execution_order has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- external_reference has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_value has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- integration_time_building has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- integration_time_execution has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_historical has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lock_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- part_time_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- personnel_calculation_object has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserve_field_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserve_field_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- subtype has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- time_management_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- work_time_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- work_week_definition has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "client_id",
        "employee_id",
        "sequence_number",
        "last_modified_by",
        "schedule_type",
        "time_recording_indicator",
        "employment_percentage",
        "monthly_hours",
        "weekly_hours",
        "daily_hours",
        "workdays_per_week",
        "yearly_hours",
        "min_daily_hours",
        "max_daily_hours",
        "min_weekly_hours",
        "max_weekly_hours",
        "min_monthly_hours",
        "max_monthly_hours",
        "min_yearly_hours",
        "max_yearly_hours",
        "row_id",
        "is_deleted",
        "dynamic_scheduling",
        "last_modified_date",
        "valid_from_date",
        "valid_to_date"
    FROM "sap_pa0007_data_projected_renamed_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_pa0007_data_projected_renamed_casted_missing_handled"
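
The casting step above converts SAP's integer-encoded dates (for example 20121029) by rendering the integer as text and parsing it with the '%Y%m%d' pattern. A minimal Python sketch of the same idea follows. The helper name `parse_sap_date` is hypothetical and not part of the generated model, and treating 0 as "no date" is an assumption, since the generated SQL does not say how such values should be handled.

```python
from datetime import date, datetime
from typing import Optional


def parse_sap_date(value: Optional[int]) -> Optional[date]:
    """Convert an integer such as 20121029 into a date (hypothetical helper)."""
    # Assumption: None and 0 both mean "no date"; the generated SQL is silent on this.
    if value is None or value == 0:
        return None
    # Same two steps as the SQL: render the integer as text, then parse with '%Y%m%d'.
    return datetime.strptime(str(value), "%Y%m%d").date()


if __name__ == "__main__":
    # Values taken from the staging preview: a last-change date and the
    # open-ended validity date used for valid_to_date.
    print(parse_sap_date(20121029))
    print(parse_sap_date(99991231))
```

The open-ended date 99991231 parses without special handling because year 9999 is still inside the range Python's `date` type supports.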

stg_sap_pa0007_data.yml (document the table)

version: 2
models:
- name: stg_sap_pa0007_data
  description: This table stores employee work schedules. It contains details such
    as employee number (pernr), start and end dates (begda, endda), work schedule
    type (schkz), employment percentage (empct), weekly hours (wostd), daily hours
    (arbst), and yearly hours (jrstd). The table also includes fields for minimum
    and maximum hours per day, week, month, and year. Additional fields store administrative
    information such as the date of last change (aedtm) and the user who made it (uname).
  columns:
  - name: client_id
    description: Client identifier
    tests:
    - not_null
  - name: employee_id
    description: Employee personnel number
    tests:
    - not_null
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: last_modified_by
    description: User who last changed the record
    tests:
    - not_null
  - name: schedule_type
    description: Work schedule type
    tests:
    - not_null
    - accepted_values:
        values:
        - norm
        - flex
        - part-time
        - shift
  - name: time_recording_indicator
    description: Time recording indicator
    tests:
    - not_null
  - name: employment_percentage
    description: Employment percentage
    tests:
    - not_null
  - name: monthly_hours
    description: Monthly working hours
    tests:
    - not_null
  - name: weekly_hours
    description: Weekly working hours
    tests:
    - not_null
  - name: daily_hours
    description: Daily working hours
    tests:
    - not_null
  - name: workdays_per_week
    description: Workdays per week
    tests:
    - not_null
  - name: yearly_hours
    description: Yearly working hours
    tests:
    - not_null
  - name: min_daily_hours
    description: Minimum daily hours
    tests:
    - not_null
  - name: max_daily_hours
    description: Maximum daily hours
    tests:
    - not_null
  - name: min_weekly_hours
    description: Minimum weekly hours
    tests:
    - not_null
  - name: max_weekly_hours
    description: Maximum weekly hours
    tests:
    - not_null
  - name: min_monthly_hours
    description: Minimum monthly hours
    tests:
    - not_null
  - name: max_monthly_hours
    description: Maximum monthly hours
    tests:
    - not_null
  - name: min_yearly_hours
    description: Minimum yearly hours
    tests:
    - not_null
  - name: max_yearly_hours
    description: Maximum yearly hours
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a unique identifier for each row in the
        table. For this table, each row represents a distinct work schedule entry.
        The row_id is likely to be unique across all rows, making it a suitable candidate
        key.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: dynamic_scheduling
    description: Dynamic scheduling indicator
    cocoon_meta:
      missing_acceptable: Not applicable for standard or fixed scheduling types.
  - name: last_modified_date
    description: Date of last change
    tests:
    - not_null
  - name: valid_from_date
    description: Start date of validity
    tests:
    - not_null
  - name: valid_to_date
    description: End date of validity
    tests:
    - not_null
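
The tests declared above (not_null, unique, accepted_values) amount to simple row-level checks. The sketch below shows what each one verifies, using two rows taken from the staging preview. The function names and the list-of-dicts representation are assumptions made for illustration. This is not how dbt implements its tests, since dbt compiles them to SQL queries that return the failing rows.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def not_null_failures(rows: List[Row], column: str) -> List[Row]:
    """Return the rows in which the column is missing (None)."""
    return [r for r in rows if r.get(column) is None]


def unique_failures(rows: List[Row], column: str) -> List[Any]:
    """Return each non-null value that occurs more than once in the column."""
    seen = set()
    dupes: List[Any] = []
    for r in rows:
        value = r.get(column)
        if value is None:
            continue  # nulls are ignored, as in a SQL uniqueness check
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes


def accepted_values_failures(
    rows: List[Row], column: str, allowed: Iterable[Any]
) -> List[Any]:
    """Return the non-null values that fall outside the allowed set."""
    allowed_set = set(allowed)
    return [
        r[column]
        for r in rows
        if r.get(column) is not None and r[column] not in allowed_set
    ]


# Two rows copied from the stg_sap_pa0007_data preview (subset of columns).
rows = [
    {"row_id": 1414, "employee_id": 80052, "schedule_type": "norm", "dynamic_scheduling": None},
    {"row_id": 12, "employee_id": 1003, "schedule_type": "flex", "dynamic_scheduling": None},
]

if __name__ == "__main__":
    print(not_null_failures(rows, "schedule_type"))
    print(unique_failures(rows, "row_id"))
    print(accepted_values_failures(rows, "schedule_type", ["norm", "flex", "part-time", "shift"]))
```

In this sketch a not_null check on dynamic_scheduling would flag both rows, which is consistent with that column carrying a missing_acceptable note instead of a not_null test.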

stg_sap_t001_data (first 100 rows)

company_name city country_key currency_key language_key chart_of_accounts fiscal_year_variant rcomp vat_registration_number business_transaction_variant distribution_flag valuation_flag credit_control_area business_area_flag fiscal_year_variant_flag customer_down_payment_flag purchasing_company_code mm_flag sd_flag interest_calculation_profit_center sales_organization cash_flow_variant output_tax_category input_tax_category implementation_date negative_postings_flag new_withholding_tax extended_funds_management_variant xcos factoring_indicator row_id is_deleted address_number client company_code controlling_to_fi_interface exchange_rate_tolerance financial_management_area fiscal_year_variant_change_date funds_management_variant mgmt_consolidation_flag offsetting_account open_period_variant pl_consolidation_flag pl_statement_account
0 novi grad sokovia so eur f cafr k4 2223.0 fr93341612697 2.0 x x 2222.0 None x x x x x None 2200.0 NaN v0 a0 2.0 x x None 2.0 x 336 False NaN 811 3000 2 10 2222.0 NaT 2222 None 0 2222 None None
1 florin city florin fn eur s cape k4 NaN None 2.0 None x NaN None x None x None None None NaN NaN c0 d0 NaN None x None NaN None 46 False 21634.0 0 3000 2 10 None NaT 1 None 0 1 None None
2 kingdom of genovia genovia ga eur o int k4 1107.0 None NaN x x 1000.0 x x x None None None gkr 1000.0 1000.0 v0 a0 NaN x None fmre 2.0 x 114 False 66216.0 800 3000 2 10 None 2000-01-01 1000 None 0 1000 None None

stg_sap_t001_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 14:54:30.885040+00:00
WITH 
"sap_t001_data_projected" AS (
    -- Projection: Selecting 81 out of 82 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "bukrs",
        "mandt",
        "butxt",
        "ort01",
        "land1",
        "waers",
        "spras",
        "ktopl",
        "waabw",
        "periv",
        "kokfi",
        "rcomp",
        "adrnr",
        "stceg",
        "fikrs",
        "xfmco",
        "xfmcb",
        "xfmca",
        "txjcd",
        "fmhrdate",
        "buvar",
        "fdbuk",
        "xfdis",
        "xvalv",
        "xskfn",
        "kkber",
        "xmwsn",
        "mregl",
        "xgsbe",
        "xgjrv",
        "xkdft",
        "xprod",
        "xeink",
        "xjvaa",
        "xvvwa",
        "xslta",
        "xfdmm",
        "xfdsd",
        "xextb",
        "ebukr",
        "ktop2",
        "umkrs",
        "bukrs_glob",
        "fstva",
        "opvar",
        "xcovr",
        "txkrs",
        "wfvar",
        "xbbbf",
        "xbbbe",
        "xbbba",
        "xbbko",
        "xstdt",
        "mwskv",
        "mwska",
        "impda",
        "xnegp",
        "xkkbi",
        "wt_newwt",
        "pp_pdate",
        "infmt",
        "fstvare",
        "kopim",
        "dkweg",
        "offsacct",
        "bapovar",
        "xcos",
        "xcession",
        "xsplt",
        "surccm",
        "dtprov",
        "dtamtc",
        "dttaxc",
        "dttdsp",
        "dtaxr",
        "xvatdate",
        "pst_per_var",
        "xbbsc",
        "fm_derive_acc",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_t001_data"
),

"sap_t001_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- bukrs -> company_code
    -- mandt -> client
    -- butxt -> company_name
    -- ort01 -> city
    -- land1 -> country_key
    -- waers -> currency_key
    -- spras -> language_key
    -- ktopl -> chart_of_accounts
    -- waabw -> exchange_rate_tolerance
    -- periv -> fiscal_year_variant
    -- kokfi -> controlling_to_fi_interface
    -- adrnr -> address_number
    -- stceg -> vat_registration_number
    -- fikrs -> financial_management_area
    -- xfmco -> pl_consolidation_flag
    -- xfmcb -> balance_sheet_consolidation_flag
    -- xfmca -> mgmt_consolidation_flag
    -- txjcd -> tax_jurisdiction_code
    -- fmhrdate -> fiscal_year_variant_change_date
    -- buvar -> business_transaction_variant
    -- fdbuk -> fm_company_code
    -- xfdis -> distribution_flag
    -- xvalv -> valuation_flag
    -- xskfn -> special_gl_transactions_flag
    -- kkber -> credit_control_area
    -- xmwsn -> vat_flag
    -- mregl -> material_ledger_regulation
    -- xgsbe -> business_area_flag
    -- xgjrv -> fiscal_year_variant_flag
    -- xkdft -> customer_down_payment_flag
    -- xprod -> production_orders_flag
    -- xeink -> purchasing_company_code
    -- xjvaa -> joint_venture_accounting_flag
    -- xvvwa -> foreign_currency_valuation_flag
    -- xslta -> contract_management_flag
    -- xfdmm -> mm_flag
    -- xfdsd -> sd_flag
    -- xextb -> extended_bookkeeping
    -- ktop2 -> interest_calculation_profit_center
    -- umkrs -> sales_organization
    -- bukrs_glob -> global_company_code
    -- fstva -> funds_management_variant
    -- opvar -> open_period_variant
    -- xcovr -> coverage_indicator
    -- txkrs -> tax_calculation_procedure
    -- wfvar -> cash_flow_variant
    -- xbbbf -> cash_flow_account
    -- xbbbe -> pl_statement_account
    -- xbbba -> balance_sheet_account
    -- xbbko -> cost_accounting_account
    -- xstdt -> statistical_postings_flag
    -- mwskv -> output_tax_category
    -- mwska -> input_tax_category
    -- impda -> implementation_date
    -- xnegp -> negative_postings_flag
    -- xkkbi -> vendor_down_payment_flag
    -- wt_newwt -> new_withholding_tax
    -- pp_pdate -> posting_period_end_date
    -- infmt -> information_system_format
    -- fstvare -> extended_funds_management_variant
    -- dkweg -> dunning_procedure
    -- offsacct -> offsetting_account
    -- bapovar -> business_area_posting_variant
    -- xcession -> factoring_indicator
    -- xsplt -> splitting_flag
    -- surccm -> surcharge_calculation_method
    -- dtprov -> provisions_doc_type
    -- dtamtc -> auto_clearing_doc_type
    -- dttaxc -> tax_clearing_doc_type
    -- dttdsp -> down_payment_doc_type
    -- dtaxr -> auto_tax_reporting_doc_type
    -- xvatdate -> vat_reporting_date_flag
    -- pst_per_var -> posting_period_variant
    -- xbbsc -> equity_changes_account
    -- fm_derive_acc -> fm_derive_accounts
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "bukrs" AS "company_code",
        "mandt" AS "client",
        "butxt" AS "company_name",
        "ort01" AS "city",
        "land1" AS "country_key",
        "waers" AS "currency_key",
        "spras" AS "language_key",
        "ktopl" AS "chart_of_accounts",
        "waabw" AS "exchange_rate_tolerance",
        "periv" AS "fiscal_year_variant",
        "kokfi" AS "controlling_to_fi_interface",
        "rcomp",
        "adrnr" AS "address_number",
        "stceg" AS "vat_registration_number",
        "fikrs" AS "financial_management_area",
        "xfmco" AS "pl_consolidation_flag",
        "xfmcb" AS "balance_sheet_consolidation_flag",
        "xfmca" AS "mgmt_consolidation_flag",
        "txjcd" AS "tax_jurisdiction_code",
        "fmhrdate" AS "fiscal_year_variant_change_date",
        "buvar" AS "business_transaction_variant",
        "fdbuk" AS "fm_company_code",
        "xfdis" AS "distribution_flag",
        "xvalv" AS "valuation_flag",
        "xskfn" AS "special_gl_transactions_flag",
        "kkber" AS "credit_control_area",
        "xmwsn" AS "vat_flag",
        "mregl" AS "material_ledger_regulation",
        "xgsbe" AS "business_area_flag",
        "xgjrv" AS "fiscal_year_variant_flag",
        "xkdft" AS "customer_down_payment_flag",
        "xprod" AS "production_orders_flag",
        "xeink" AS "purchasing_company_code",
        "xjvaa" AS "joint_venture_accounting_flag",
        "xvvwa" AS "foreign_currency_valuation_flag",
        "xslta" AS "contract_management_flag",
        "xfdmm" AS "mm_flag",
        "xfdsd" AS "sd_flag",
        "xextb" AS "extended_bookkeeping",
        "ebukr",
        "ktop2" AS "interest_calculation_profit_center",
        "umkrs" AS "sales_organization",
        "bukrs_glob" AS "global_company_code",
        "fstva" AS "funds_management_variant",
        "opvar" AS "open_period_variant",
        "xcovr" AS "coverage_indicator",
        "txkrs" AS "tax_calculation_procedure",
        "wfvar" AS "cash_flow_variant",
        "xbbbf" AS "cash_flow_account",
        "xbbbe" AS "pl_statement_account",
        "xbbba" AS "balance_sheet_account",
        "xbbko" AS "cost_accounting_account",
        "xstdt" AS "statistical_postings_flag",
        "mwskv" AS "output_tax_category",
        "mwska" AS "input_tax_category",
        "impda" AS "implementation_date",
        "xnegp" AS "negative_postings_flag",
        "xkkbi" AS "vendor_down_payment_flag",
        "wt_newwt" AS "new_withholding_tax",
        "pp_pdate" AS "posting_period_end_date",
        "infmt" AS "information_system_format",
        "fstvare" AS "extended_funds_management_variant",
        "kopim",
        "dkweg" AS "dunning_procedure",
        "offsacct" AS "offsetting_account",
        "bapovar" AS "business_area_posting_variant",
        "xcos",
        "xcession" AS "factoring_indicator",
        "xsplt" AS "splitting_flag",
        "surccm" AS "surcharge_calculation_method",
        "dtprov" AS "provisions_doc_type",
        "dtamtc" AS "auto_clearing_doc_type",
        "dttaxc" AS "tax_clearing_doc_type",
        "dttdsp" AS "down_payment_doc_type",
        "dtaxr" AS "auto_tax_reporting_doc_type",
        "xvatdate" AS "vat_reporting_date_flag",
        "pst_per_var" AS "posting_period_variant",
        "xbbsc" AS "equity_changes_account",
        "fm_derive_acc" AS "fm_derive_accounts",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_t001_data_projected"
),

"sap_t001_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values: 
    -- country_key: The problem is that 'fn', 'ga', and 'so' are not standard ISO 3166-1 alpha-2 country codes, which are written in uppercase. 'fn' likely stands for Finland, which should be 'FI'. 'ga' likely stands for Gabon ('GA') and 'so' for Somalia ('SO'); both only need uppercasing. The correct values should be standardized ISO 3166-1 alpha-2 country codes.
    -- distribution_flag: The problem is that the distribution_flag column only contains the value 'x', which is not descriptive and doesn't clearly indicate the purpose of the flag. In flag columns, it's generally more informative to use boolean values (True/False) or more descriptive terms (e.g., 'distributed', 'not_distributed'). Since we don't have additional context about what 'x' specifically represents, we can't determine a more precise meaning. However, we can infer that the presence of 'x' likely indicates some form of distribution or selection. 
    -- business_area_flag: The problem is that the business_area_flag column contains only one value, 'x', which is not descriptive and doesn't represent a clear business area category. This value is unusual because it doesn't provide any meaningful information about the business area. The correct values for a business area flag should be more descriptive and represent actual business areas or categories. However, without additional context or information about the intended use of this column, it's difficult to suggest specific correct values. 
    -- customer_down_payment_flag: The problem is that 'x' is an unusual value for a flag field. Typically, flag fields use more descriptive values like 'true'/'false' or '1'/'0'. In this case, 'x' appears to be used to indicate a positive flag (i.e., the customer made a down payment). The correct values should be boolean 'true' or 'false', or numeric '1' or '0'. 
    -- purchasing_company_code: The problem is that 'x' is not a typical company code format. Company codes are usually more descriptive or follow a standardized format, such as alphanumeric combinations or abbreviations of company names. The value 'x' seems to be a placeholder or an error, rather than a meaningful company code. Without more context or information about the correct company codes, it's difficult to map this to a specific correct value. In this case, it might be best to mark this data as missing or invalid. 
    -- sd_flag: The problem is that the sd_flag column uses 'x' as its only value, which is unusual for a flag column. Typically, flag columns use more meaningful and explicit values such as 'true'/'false', 'yes'/'no', or '1'/'0'. The 'x' value doesn't clearly indicate what the flag represents or what its absence means. The correct values for a flag column should be more descriptive and follow a boolean logic. 
    -- new_withholding_tax: The problem is that 'x' is not a valid numeric value for a tax rate. Typically, withholding tax rates are expressed as percentages or decimal values. The 'x' likely represents missing or unknown data. In this case, since we don't have any other information about what the correct tax rate should be, the most appropriate action is to map this to an empty string to indicate missing data. 
    SELECT
        "company_code",
        "client",
        "company_name",
        "city",
        CASE
            WHEN "country_key" = 'fn' THEN 'FI'
            WHEN "country_key" = 'ga' THEN 'GA'
            WHEN "country_key" = 'so' THEN 'SO'
            ELSE "country_key"
        END AS "country_key",
        "currency_key",
        "language_key",
        "chart_of_accounts",
        "exchange_rate_tolerance",
        "fiscal_year_variant",
        "controlling_to_fi_interface",
        "rcomp",
        "address_number",
        "vat_registration_number",
        "financial_management_area",
        "pl_consolidation_flag",
        "balance_sheet_consolidation_flag",
        "mgmt_consolidation_flag",
        "tax_jurisdiction_code",
        "fiscal_year_variant_change_date",
        "business_transaction_variant",
        "fm_company_code",
        CASE
            WHEN "distribution_flag" = 'x' THEN 'distributed'
            ELSE "distribution_flag"
        END AS "distribution_flag",
        "valuation_flag",
        "special_gl_transactions_flag",
        "credit_control_area",
        "vat_flag",
        "material_ledger_regulation",
        CASE
            WHEN "business_area_flag" = 'x' THEN ''
            ELSE "business_area_flag"
        END AS "business_area_flag",
        "fiscal_year_variant_flag",
        CASE
            WHEN "customer_down_payment_flag" = 'x' THEN 'true'
            ELSE "customer_down_payment_flag"
        END AS "customer_down_payment_flag",
        "production_orders_flag",
        CASE
            WHEN "purchasing_company_code" = 'x' THEN ''
            ELSE "purchasing_company_code"
        END AS "purchasing_company_code",
        "joint_venture_accounting_flag",
        "foreign_currency_valuation_flag",
        "contract_management_flag",
        "mm_flag",
        CASE
            WHEN "sd_flag" = 'x' THEN 'true'
            ELSE "sd_flag"
        END AS "sd_flag",
        "extended_bookkeeping",
        "ebukr",
        "interest_calculation_profit_center",
        "sales_organization",
        "global_company_code",
        "funds_management_variant",
        "open_period_variant",
        "coverage_indicator",
        "tax_calculation_procedure",
        "cash_flow_variant",
        "cash_flow_account",
        "pl_statement_account",
        "balance_sheet_account",
        "cost_accounting_account",
        "statistical_postings_flag",
        "output_tax_category",
        "input_tax_category",
        "implementation_date",
        "negative_postings_flag",
        "vendor_down_payment_flag",
        CASE
            WHEN "new_withholding_tax" = 'x' THEN ''
            ELSE "new_withholding_tax"
        END AS "new_withholding_tax",
        "posting_period_end_date",
        "information_system_format",
        "extended_funds_management_variant",
        "kopim",
        "dunning_procedure",
        "offsetting_account",
        "business_area_posting_variant",
        "xcos",
        "factoring_indicator",
        "splitting_flag",
        "surcharge_calculation_method",
        "provisions_doc_type",
        "auto_clearing_doc_type",
        "tax_clearing_doc_type",
        "down_payment_doc_type",
        "auto_tax_reporting_doc_type",
        "vat_reporting_date_flag",
        "posting_period_variant",
        "equity_changes_account",
        "fm_derive_accounts",
        "row_id",
        "is_deleted"
    FROM "sap_t001_data_projected_renamed"
),

"sap_t001_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- address_number: from DECIMAL to INT
    -- auto_clearing_doc_type: from DECIMAL to VARCHAR
    -- auto_tax_reporting_doc_type: from DECIMAL to VARCHAR
    -- balance_sheet_account: from DECIMAL to VARCHAR
    -- balance_sheet_consolidation_flag: from DECIMAL to VARCHAR
    -- business_area_posting_variant: from DECIMAL to VARCHAR
    -- cash_flow_account: from DECIMAL to VARCHAR
    -- client: from INT to VARCHAR
    -- company_code: from INT to VARCHAR
    -- contract_management_flag: from DECIMAL to VARCHAR
    -- controlling_to_fi_interface: from INT to VARCHAR
    -- cost_accounting_account: from DECIMAL to VARCHAR
    -- coverage_indicator: from DECIMAL to VARCHAR
    -- down_payment_doc_type: from DECIMAL to VARCHAR
    -- dunning_procedure: from DECIMAL to VARCHAR
    -- ebukr: from DECIMAL to VARCHAR
    -- equity_changes_account: from DECIMAL to VARCHAR
    -- exchange_rate_tolerance: from INT to VARCHAR
    -- extended_bookkeeping: from DECIMAL to VARCHAR
    -- financial_management_area: from DECIMAL to VARCHAR
    -- fiscal_year_variant_change_date: from INT to DATE
    -- fm_company_code: from DECIMAL to VARCHAR
    -- fm_derive_accounts: from DECIMAL to VARCHAR
    -- foreign_currency_valuation_flag: from DECIMAL to VARCHAR
    -- funds_management_variant: from INT to VARCHAR
    -- global_company_code: from DECIMAL to VARCHAR
    -- information_system_format: from DECIMAL to VARCHAR
    -- joint_venture_accounting_flag: from DECIMAL to VARCHAR
    -- kopim: from DECIMAL to VARCHAR
    -- material_ledger_regulation: from DECIMAL to VARCHAR
    -- mgmt_consolidation_flag: from DECIMAL to VARCHAR
    -- offsetting_account: from INT to VARCHAR
    -- open_period_variant: from INT to VARCHAR
    -- pl_consolidation_flag: from DECIMAL to VARCHAR
    -- pl_statement_account: from DECIMAL to VARCHAR
    -- posting_period_end_date: from DECIMAL to VARCHAR
    -- posting_period_variant: from DECIMAL to VARCHAR
    -- production_orders_flag: from DECIMAL to VARCHAR
    -- provisions_doc_type: from DECIMAL to VARCHAR
    -- special_gl_transactions_flag: from DECIMAL to VARCHAR
    -- splitting_flag: from DECIMAL to VARCHAR
    -- statistical_postings_flag: from DECIMAL to VARCHAR
    -- surcharge_calculation_method: from DECIMAL to VARCHAR
    -- tax_calculation_procedure: from DECIMAL to VARCHAR
    -- tax_clearing_doc_type: from DECIMAL to VARCHAR
    -- tax_jurisdiction_code: from DECIMAL to VARCHAR
    -- vat_flag: from DECIMAL to VARCHAR
    -- vat_reporting_date_flag: from DECIMAL to VARCHAR
    -- vendor_down_payment_flag: from DECIMAL to VARCHAR
    SELECT
        "company_name",
        "city",
        "country_key",
        "currency_key",
        "language_key",
        "chart_of_accounts",
        "fiscal_year_variant",
        "rcomp",
        "vat_registration_number",
        "business_transaction_variant",
        "distribution_flag",
        "valuation_flag",
        "credit_control_area",
        "business_area_flag",
        "fiscal_year_variant_flag",
        "customer_down_payment_flag",
        "purchasing_company_code",
        "mm_flag",
        "sd_flag",
        "interest_calculation_profit_center",
        "sales_organization",
        "cash_flow_variant",
        "output_tax_category",
        "input_tax_category",
        "implementation_date",
        "negative_postings_flag",
        "new_withholding_tax",
        "extended_funds_management_variant",
        "xcos",
        "factoring_indicator",
        "row_id",
        "is_deleted",
        CAST("address_number" AS INT) AS "address_number",
        CAST("auto_clearing_doc_type" AS VARCHAR) AS "auto_clearing_doc_type",
        CAST("auto_tax_reporting_doc_type" AS VARCHAR) AS "auto_tax_reporting_doc_type",
        CAST("balance_sheet_account" AS VARCHAR) AS "balance_sheet_account",
        CAST("balance_sheet_consolidation_flag" AS VARCHAR) AS "balance_sheet_consolidation_flag",
        CAST("business_area_posting_variant" AS VARCHAR) AS "business_area_posting_variant",
        CAST("cash_flow_account" AS VARCHAR) AS "cash_flow_account",
        CAST("client" AS VARCHAR) AS "client",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST("contract_management_flag" AS VARCHAR) AS "contract_management_flag",
        CAST("controlling_to_fi_interface" AS VARCHAR) AS "controlling_to_fi_interface",
        CAST("cost_accounting_account" AS VARCHAR) AS "cost_accounting_account",
        CAST("coverage_indicator" AS VARCHAR) AS "coverage_indicator",
        CAST("down_payment_doc_type" AS VARCHAR) AS "down_payment_doc_type",
        CAST("dunning_procedure" AS VARCHAR) AS "dunning_procedure",
        CAST("ebukr" AS VARCHAR) AS "ebukr",
        CAST("equity_changes_account" AS VARCHAR) AS "equity_changes_account",
        CAST("exchange_rate_tolerance" AS VARCHAR) AS "exchange_rate_tolerance",
        CAST("extended_bookkeeping" AS VARCHAR) AS "extended_bookkeeping",
        CAST("financial_management_area" AS VARCHAR) AS "financial_management_area",
        CASE 
            WHEN "fiscal_year_variant_change_date" = 0 THEN NULL
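            -- strptime yields a TIMESTAMP, so cast the result to DATE to match the documented target type
            ELSE CAST(strptime(CAST("fiscal_year_variant_change_date" AS VARCHAR), '%Y%m%d') AS DATE)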
        END AS "fiscal_year_variant_change_date",
        CAST("fm_company_code" AS VARCHAR) AS "fm_company_code",
        CAST("fm_derive_accounts" AS VARCHAR) AS "fm_derive_accounts",
        CAST("foreign_currency_valuation_flag" AS VARCHAR) AS "foreign_currency_valuation_flag",
        CAST("funds_management_variant" AS VARCHAR) AS "funds_management_variant",
        CAST("global_company_code" AS VARCHAR) AS "global_company_code",
        CAST("information_system_format" AS VARCHAR) AS "information_system_format",
        CAST("joint_venture_accounting_flag" AS VARCHAR) AS "joint_venture_accounting_flag",
        CAST("kopim" AS VARCHAR) AS "kopim",
        CAST("material_ledger_regulation" AS VARCHAR) AS "material_ledger_regulation",
        CAST("mgmt_consolidation_flag" AS VARCHAR) AS "mgmt_consolidation_flag",
        CAST("offsetting_account" AS VARCHAR) AS "offsetting_account",
        CAST("open_period_variant" AS VARCHAR) AS "open_period_variant",
        CAST("pl_consolidation_flag" AS VARCHAR) AS "pl_consolidation_flag",
        CAST("pl_statement_account" AS VARCHAR) AS "pl_statement_account",
        CAST("posting_period_end_date" AS VARCHAR) AS "posting_period_end_date",
        CAST("posting_period_variant" AS VARCHAR) AS "posting_period_variant",
        CAST("production_orders_flag" AS VARCHAR) AS "production_orders_flag",
        CAST("provisions_doc_type" AS VARCHAR) AS "provisions_doc_type",
        CAST("special_gl_transactions_flag" AS VARCHAR) AS "special_gl_transactions_flag",
        CAST("splitting_flag" AS VARCHAR) AS "splitting_flag",
        CAST("statistical_postings_flag" AS VARCHAR) AS "statistical_postings_flag",
        CAST("surcharge_calculation_method" AS VARCHAR) AS "surcharge_calculation_method",
        CAST("tax_calculation_procedure" AS VARCHAR) AS "tax_calculation_procedure",
        CAST("tax_clearing_doc_type" AS VARCHAR) AS "tax_clearing_doc_type",
        CAST("tax_jurisdiction_code" AS VARCHAR) AS "tax_jurisdiction_code",
        CAST("vat_flag" AS VARCHAR) AS "vat_flag",
        CAST("vat_reporting_date_flag" AS VARCHAR) AS "vat_reporting_date_flag",
        CAST("vendor_down_payment_flag" AS VARCHAR) AS "vendor_down_payment_flag"
    FROM "sap_t001_data_projected_renamed_cleaned"
),

"sap_t001_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 50 columns with unacceptable missing values
    -- address_number has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- auto_clearing_doc_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- auto_tax_reporting_doc_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- balance_sheet_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- balance_sheet_consolidation_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_area_posting_variant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_transaction_variant has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- cash_flow_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cash_flow_variant has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- contract_management_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cost_accounting_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- coverage_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- credit_control_area has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- down_payment_doc_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- dunning_procedure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ebukr has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- equity_changes_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- extended_bookkeeping has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- extended_funds_management_variant has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- financial_management_area has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- fm_company_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fm_derive_accounts has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- foreign_currency_valuation_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- global_company_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- implementation_date has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- information_system_format has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- interest_calculation_profit_center has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- joint_venture_accounting_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kopim has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_ledger_regulation has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- mm_flag has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- negative_postings_flag has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- new_withholding_tax has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- posting_period_end_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- posting_period_variant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- production_orders_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- provisions_doc_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- purchasing_company_code has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- rcomp has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- special_gl_transactions_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- splitting_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statistical_postings_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- surcharge_calculation_method has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_calculation_procedure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_clearing_doc_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_jurisdiction_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vat_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vat_reporting_date_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_down_payment_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xcos has 33.33 percent missing. Strategy: 🔄 Unchanged
    SELECT
        "company_name",
        "city",
        "country_key",
        "currency_key",
        "language_key",
        "chart_of_accounts",
        "fiscal_year_variant",
        "rcomp",
        "vat_registration_number",
        "business_transaction_variant",
        "distribution_flag",
        "valuation_flag",
        "credit_control_area",
        "business_area_flag",
        "fiscal_year_variant_flag",
        "customer_down_payment_flag",
        "purchasing_company_code",
        "mm_flag",
        "sd_flag",
        "interest_calculation_profit_center",
        "sales_organization",
        "cash_flow_variant",
        "output_tax_category",
        "input_tax_category",
        "implementation_date",
        "negative_postings_flag",
        "new_withholding_tax",
        "extended_funds_management_variant",
        "xcos",
        "factoring_indicator",
        "row_id",
        "is_deleted",
        "address_number",
        "client",
        "company_code",
        "controlling_to_fi_interface",
        "exchange_rate_tolerance",
        "financial_management_area",
        "fiscal_year_variant_change_date",
        "funds_management_variant",
        "mgmt_consolidation_flag",
        "offsetting_account",
        "open_period_variant",
        "pl_consolidation_flag",
        "pl_statement_account"
    FROM "sap_t001_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_t001_data_projected_renamed_cleaned_casted_missing_handled"
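
The casting step above turns fiscal_year_variant_change_date from a YYYYMMDD integer into a date and treats 0 as "no date". A minimal Python sketch of the same rule follows. The function name parse_sap_date is hypothetical and is not part of the generated model.

```python
from datetime import date, datetime
from typing import Optional


def parse_sap_date(raw: Optional[int]) -> Optional[date]:
    """Convert a YYYYMMDD integer to a date; 0 and None mean 'no date'."""
    if raw is None or raw == 0:
        return None
    # Same format string as the SQL model: four-digit year, month, day
    return datetime.strptime(str(raw), "%Y%m%d").date()


print(parse_sap_date(20060425))  # 2006-04-25
print(parse_sap_date(0))         # None
```

Returning None for 0 mirrors the CASE branch that maps 0 to NULL before parsing.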

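The missing-value listing above suggests a simple rule: a column that is missing in every row is dropped, and any other column is left unchanged. The sketch below reproduces that rule in Python. The function name missing_plan and the three sample rows are hypothetical, and the actual rule used by the generator may consider more than the missing percentage.

```python
from typing import Dict, List, Optional, Tuple


def missing_plan(rows: List[Dict[str, Optional[str]]]) -> Dict[str, Tuple[float, str]]:
    """Map each column to (percent missing, strategy)."""
    plan: Dict[str, Tuple[float, str]] = {}
    if not rows:
        return plan
    for col in rows[0]:
        missing = sum(1 for row in rows if row.get(col) is None)
        percent = round(100 * missing / len(rows), 2)
        # Assumption: only fully missing columns are dropped
        plan[col] = (percent, "drop" if missing == len(rows) else "unchanged")
    return plan


# Hypothetical three-row sample, not taken from the source table
sample = [
    {"rcomp": "1000", "kopim": None, "mm_flag": None},
    {"rcomp": None, "kopim": None, "mm_flag": None},
    {"rcomp": "3000", "kopim": None, "mm_flag": "x"},
]
for column, (percent, strategy) in missing_plan(sample).items():
    print(column, percent, strategy)
# rcomp 33.33 unchanged
# kopim 100.0 drop
# mm_flag 66.67 unchanged
```
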
stg_sap_t001_data.yml (Document the table)

version: 2
models:
- name: stg_sap_t001_data
  description: The table is about company codes in an SAP system. It contains details
    like company code (bukrs), client (mandt), company name (butxt), city (ort01),
    country (land1), and currency (waers). It also includes various configuration
    flags and settings related to financial accounting, controlling, and other business
    processes for each company code.
  columns:
  - name: company_name
    description: Company Name
    tests:
    - not_null
  - name: city
    description: City
    tests:
    - not_null
  - name: country_key
    description: Country Key
    tests:
    - not_null
  - name: currency_key
    description: Currency key
    tests:
    - not_null
  - name: language_key
    description: Language key
    tests:
    - not_null
    - accepted_values:
        values:
        - a
        - b
        - c
        - d
        - e
        - f
        - g
        - h
        - i
        - j
        - k
        - l
        - m
        - n
        - o
        - p
        - q
        - r
        - s
        - t
        - u
        - v
        - w
        - x
        - y
        - z
  - name: chart_of_accounts
    description: Chart of Accounts
    tests:
    - not_null
  - name: fiscal_year_variant
    description: Fiscal year variant
    tests:
    - not_null
    - accepted_values:
        values:
        - K4
        - K1
        - K2
        - K3
        - V1
        - V2
        - V3
        - V4
        - W1
        - A1
        - A2
        - A3
        - A4
        - B1
        - B2
        - B3
        - B4
        - C1
        - C2
        - C3
        - C4
        - D1
        - D2
        - D3
        - D4
        - E1
        - E2
        - E3
        - E4
        - k4
  - name: rcomp
    description: Company (SAP field RCOMP)
    tests:
    - not_null
  - name: vat_registration_number
    description: VAT registration number
    cocoon_meta:
      missing_acceptable: Not applicable for entities not registered for VAT
  - name: business_transaction_variant
    description: Business Transaction Variant
    tests:
    - not_null
  - name: distribution_flag
    description: Flag for distribution
    tests:
    - accepted_values:
        values:
        - x
        - ''
    cocoon_meta:
      missing_acceptable: Not applicable if not involved in distribution
  - name: valuation_flag
    description: Flag for valuation
    tests:
    - not_null
    - accepted_values:
        values:
        - x
        - ''
  - name: credit_control_area
    description: Credit Control Area
    tests:
    - not_null
  - name: business_area_flag
    description: Flag for business area
    tests:
    - accepted_values:
        values:
        - x
        - ''
    cocoon_meta:
      missing_acceptable: Not applicable if business areas aren't used
  - name: fiscal_year_variant_flag
    description: Flag for fiscal year variant
    tests:
    - not_null
    - accepted_values:
        values:
        - x
        - ''
  - name: customer_down_payment_flag
    description: Flag for customer down payment
    tests:
    - accepted_values:
        values:
        - Y
        - N
        - x
    cocoon_meta:
      missing_acceptable: Not applicable if down payments aren't accepted
  - name: purchasing_company_code
    description: Indicator for purchasing company code
    tests:
    - not_null
  - name: mm_flag
    description: Flag for materials management
    tests:
    - not_null
    - accepted_values:
        values:
        - x
        - ' '
  - name: sd_flag
    description: Flag for sales and distribution
    tests:
    - accepted_values:
        values:
        - x
        - ''
        - n
    cocoon_meta:
      missing_acceptable: Not applicable for non-sales related entries
  - name: interest_calculation_profit_center
    description: Profit Center for Interest Calculation
    tests:
    - not_null
  - name: sales_organization
    description: Sales organization
    cocoon_meta:
      missing_acceptable: Not applicable for companies without sales operations
  - name: cash_flow_variant
    description: Cash flow variant
    tests:
    - not_null
  - name: output_tax_category
    description: Tax category for output tax
    tests:
    - not_null
  - name: input_tax_category
    description: Tax category for input tax
    tests:
    - not_null
  - name: implementation_date
    description: Implementation Date
    tests:
    - not_null
  - name: negative_postings_flag
    description: Flag for negative postings
    tests:
    - not_null
    - accepted_values:
        values:
        - Y
        - N
        - x
  - name: new_withholding_tax
    description: New withholding tax indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - x
        - ' '
  - name: extended_funds_management_variant
    description: Extended Funds Management Variant
    tests:
    - not_null
    - accepted_values:
        values:
        - fmre
        - fmco
        - fmse
        - fmgp
  - name: xcos
    description: Cost of sales accounting status (SAP field XCOS)
    tests:
    - not_null
  - name: factoring_indicator
    description: Indicator for factoring
    tests:
    - accepted_values:
        values:
        - x
        - ''
    cocoon_meta:
      missing_acceptable: Not applicable if factoring isn't used
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: address_number
    description: Address number
    tests:
    - not_null
  - name: client
    description: Client
    tests:
    - not_null
  - name: company_code
    description: Company Code
    tests:
    - not_null
  - name: controlling_to_fi_interface
    description: Controlling to FI Interface
    tests:
    - not_null
  - name: exchange_rate_tolerance
    description: Exchange rate difference tolerance percentage
    tests:
    - not_null
  - name: financial_management_area
    description: Financial Management Area
    tests:
    - not_null
  - name: fiscal_year_variant_change_date
    description: Fiscal Year Variant Change Date
    cocoon_meta:
      missing_acceptable: Not applicable if fiscal year variant hasn't changed
  - name: funds_management_variant
    description: Funds Management Variant
    tests:
    - not_null
  - name: mgmt_consolidation_flag
    description: Flag for management consolidation
    cocoon_meta:
      missing_acceptable: Not applicable for non-consolidated management accounts
  - name: offsetting_account
    description: Offsetting account number
    tests:
    - not_null
  - name: open_period_variant
    description: Open period variant
    tests:
    - not_null
  - name: pl_consolidation_flag
    description: Flag for profit and loss consolidation
    cocoon_meta:
      missing_acceptable: Not applicable for non-consolidated profit and loss accounts
  - name: pl_statement_account
    description: Indicator for P&L statement account
    cocoon_meta:
      missing_acceptable: Not applicable if not a profit and loss account
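
The not_null and accepted_values entries above are dbt generic tests, which dbt compiles to SQL. As a rough illustration of what they assert, here is a Python sketch that returns the rows failing either check for one column. The helper failing_rows and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional


def failing_rows(
    rows: Iterable[Dict[str, Any]],
    column: str,
    not_null: bool = False,
    accepted_values: Optional[List[Any]] = None,
) -> List[Dict[str, Any]]:
    """Return the rows that would fail the given checks on one column."""
    failures = []
    for row in rows:
        value = row.get(column)
        if value is None:
            # NULLs fail only the not_null check; accepted_values skips them
            if not_null:
                failures.append(row)
            continue
        if accepted_values is not None and value not in accepted_values:
            failures.append(row)
    return failures


# Hypothetical rows checked against the mm_flag tests listed above
rows = [{"mm_flag": "x"}, {"mm_flag": " "}, {"mm_flag": None}, {"mm_flag": "y"}]
print(failing_rows(rows, "mm_flag", not_null=True, accepted_values=["x", " "]))
# [{'mm_flag': None}, {'mm_flag': 'y'}]
```

A column that carries both tests, such as mm_flag, fails on a NULL because of not_null and on any value outside the list because of accepted_values.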

stg_sap_faglflext_data (first 100 rows)

debit_credit_indicator object_number objnr01 max_periods fiscal_year activity_type currency document_type ledger record_type version rbukrs profit_center tslvt january_amount february_amount march_amount april_amount may_amount june_amount july_amount august_amount september_amount october_amount november_amount december_amount tsl13 tsl14 tsl15 tsl16 amount_previous_year amount_period_01 amount_period_02 amount_period_03 amount_period_04 amount_period_05 amount_period_06 amount_period_07 amount_period_08 amount_period_09 amount_period_10 amount_period_11 amount_period_12 amount_period_13 amount_period_14 amount_period_15 amount_period_16 cost_element_total group_amount_period_01 group_amount_period_02 group_amount_period_03 group_amount_period_04 group_amount_period_05 group_amount_period_06 cost_element_july cost_element_august cost_element_september cost_element_october cost_element_november cost_element_december cost_element_january_next cost_element_february_next cost_element_march_next cost_element_april_next oslvt period_01_value period_02_value period_03_value period_04_value period_05_value period_06_value period_07_value period_08_value period_09_value period_10_value period_11_value period_12_value osl13 osl14 osl15 osl16 stat_key_figure_total stat_key_figure_january stat_key_figure_february stat_key_figure_march stat_key_figure_april stat_key_figure_may stat_key_figure_june stat_key_figure_july stat_key_figure_august stat_key_figure_september stat_key_figure_october stat_key_figure_november stat_key_figure_december stat_key_figure_january_next stat_key_figure_february_next stat_key_figure_march_next stat_key_figure_april_next row_id is_deleted account_group account_number business_area client controlling_area cost_center functional_area gl_account record_timestamp unit_of_measure unused_object_4 unused_object_5 unused_object_6 unused_object_7 unused_object_8
0 h 7 1 16 2002 rfbu usd bkpf 0l 0 1 3000 NaN 0.0 0.00 0.00 0.0 0.00 0.00 0.00 0.00 0.00 0.0 0.00 -245194.66 -245194.66 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0.0 0.00 0.00 0.00 0.00 0.00 0.0 0.00 -245194.66 -245194.66 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -204328.07 -204328.07 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0.0 0.00 0.00 0.00 0.00 0.00 0.0 0.00 -245194.66 -245194.66 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 196611 False 76 73 3000 800 2000 None None 140000 2006-01-17 13:29:31 None b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00'
1 s 7 1 16 2002 rfbu usd bkpf 0l 0 1 3000 3005.0 0.0 5093.77 4992.89 4841.6 5245.07 4892.03 5093.77 5194.64 4942.46 5144.2 5043.34 4892.03 4791.17 0.0 0.0 0.0 0.0 0.0 5093.77 4992.89 4841.6 5245.07 4892.03 5093.77 5194.64 4942.46 5144.2 5043.34 4892.03 4791.17 0.0 0.0 0.0 0.0 0.0 4244.79 4160.73 4034.65 4370.87 4076.68 4244.79 4328.85 4118.70 4286.82 4202.77 4076.68 3992.63 0.0 0.0 0.0 0.0 0.0 5093.77 4992.89 4841.6 5245.07 4892.03 5093.77 5194.64 4942.46 5144.2 5043.34 4892.03 4791.17 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 196612 False 137 394 3000 800 2000 4215.0 100.0 403000 2006-01-17 13:43:33 None b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00'
2 s 7 1 16 2002 rfbu usd bkpf 0l 0 1 3000 3005.0 0.0 6009.50 5890.49 5712.0 6188.00 5771.49 6009.50 6128.50 5830.99 6069.0 5950.00 5771.49 5652.50 0.0 0.0 0.0 0.0 0.0 6009.50 5890.49 5712.0 6188.00 5771.49 6009.50 6128.50 5830.99 6069.0 5950.00 5771.49 5652.50 0.0 0.0 0.0 0.0 0.0 5007.90 4908.72 4759.98 5156.65 4809.56 5007.90 5107.06 4859.14 5057.48 4958.31 4809.56 4710.40 0.0 0.0 0.0 0.0 0.0 6009.50 5890.49 5712.0 6188.00 5771.49 6009.50 6128.50 5830.99 6069.0 5950.00 5771.49 5652.50 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 196613 False 137 395 3000 800 2000 4216.0 100.0 403000 2006-01-17 13:43:33 None b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00\x00'
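
The debit_credit_indicator column in the preview holds 'h' and 's'. In SAP these conventionally stand for Haben (credit) and Soll (debit). A small sketch of a readable label mapping follows. The function name debit_credit_label is hypothetical, and the lowercase values are assumed from the preview rows.

```python
def debit_credit_label(indicator: str) -> str:
    """Translate the SAP debit/credit indicator into a readable label."""
    labels = {"s": "debit", "h": "credit"}
    key = indicator.strip().lower()
    if key not in labels:
        raise ValueError(f"unexpected debit/credit indicator: {indicator!r}")
    return labels[key]


print(debit_credit_label("h"))  # credit
print(debit_credit_label("s"))  # debit
```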

stg_sap_faglflext_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 04:44:35.524411+00:00
WITH 
"sap_faglflext_data_projected" AS (
    -- Projection: Selecting 126 out of 127 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "drcrk",
        "objnr00",
        "objnr01",
        "objnr02",
        "objnr03",
        "objnr04",
        "objnr05",
        "objnr06",
        "objnr07",
        "objnr08",
        "rclnt",
        "rpmax",
        "ryear",
        "activ",
        "rmvct",
        "rtcur",
        "runit",
        "awtyp",
        "rldnr",
        "rrcty",
        "rvers",
        "logsys",
        "racct",
        "cost_elem",
        "rbukrs",
        "rcntr",
        "prctr",
        "rfarea",
        "rbusa",
        "kokrs",
        "segment",
        "zzspreg",
        "scntr",
        "pprctr",
        "sfarea",
        "sbusa",
        "rassc",
        "psegment",
        "tslvt",
        "tsl01",
        "tsl02",
        "tsl03",
        "tsl04",
        "tsl05",
        "tsl06",
        "tsl07",
        "tsl08",
        "tsl09",
        "tsl10",
        "tsl11",
        "tsl12",
        "tsl13",
        "tsl14",
        "tsl15",
        "tsl16",
        "hslvt",
        "hsl01",
        "hsl02",
        "hsl03",
        "hsl04",
        "hsl05",
        "hsl06",
        "hsl07",
        "hsl08",
        "hsl09",
        "hsl10",
        "hsl11",
        "hsl12",
        "hsl13",
        "hsl14",
        "hsl15",
        "hsl16",
        "kslvt",
        "ksl01",
        "ksl02",
        "ksl03",
        "ksl04",
        "ksl05",
        "ksl06",
        "ksl07",
        "ksl08",
        "ksl09",
        "ksl10",
        "ksl11",
        "ksl12",
        "ksl13",
        "ksl14",
        "ksl15",
        "ksl16",
        "oslvt",
        "osl01",
        "osl02",
        "osl03",
        "osl04",
        "osl05",
        "osl06",
        "osl07",
        "osl08",
        "osl09",
        "osl10",
        "osl11",
        "osl12",
        "osl13",
        "osl14",
        "osl15",
        "osl16",
        "mslvt",
        "msl01",
        "msl02",
        "msl03",
        "msl04",
        "msl05",
        "msl06",
        "msl07",
        "msl08",
        "msl09",
        "msl10",
        "msl11",
        "msl12",
        "msl13",
        "msl14",
        "msl15",
        "msl16",
        "timestamp_",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_faglflext_data"
),

"sap_faglflext_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- drcrk -> debit_credit_indicator
    -- objnr00 -> object_number
    -- objnr02 -> account_group
    -- objnr03 -> account_number
    -- objnr04 -> unused_object_4
    -- objnr05 -> unused_object_5
    -- objnr06 -> unused_object_6
    -- objnr07 -> unused_object_7
    -- objnr08 -> unused_object_8
    -- rclnt -> client
    -- rpmax -> max_periods
    -- ryear -> fiscal_year
    -- activ -> activity_type
    -- rmvct -> movement_type
    -- rtcur -> currency
    -- runit -> unit_of_measure
    -- awtyp -> document_type
    -- rldnr -> ledger
    -- rrcty -> record_type
    -- rvers -> version
    -- logsys -> logical_system
    -- racct -> gl_account
    -- cost_elem -> cost_element
    -- rcntr -> cost_center
    -- prctr -> profit_center
    -- rfarea -> functional_area
    -- rbusa -> business_area
    -- kokrs -> controlling_area
    -- zzspreg -> special_region_code
    -- scntr -> sender_cost_center
    -- pprctr -> partner_profit_center
    -- sfarea -> sender_functional_area
    -- sbusa -> sender_business_area
    -- rassc -> asset_class
    -- psegment -> profit_segment
    -- tsl01 -> january_amount
    -- tsl02 -> february_amount
    -- tsl03 -> march_amount
    -- tsl04 -> april_amount
    -- tsl05 -> may_amount
    -- tsl06 -> june_amount
    -- tsl07 -> july_amount
    -- tsl08 -> august_amount
    -- tsl09 -> september_amount
    -- tsl10 -> october_amount
    -- tsl11 -> november_amount
    -- tsl12 -> december_amount
    -- hslvt -> amount_previous_year
    -- hsl01 -> amount_period_01
    -- hsl02 -> amount_period_02
    -- hsl03 -> amount_period_03
    -- hsl04 -> amount_period_04
    -- hsl05 -> amount_period_05
    -- hsl06 -> amount_period_06
    -- hsl07 -> amount_period_07
    -- hsl08 -> amount_period_08
    -- hsl09 -> amount_period_09
    -- hsl10 -> amount_period_10
    -- hsl11 -> amount_period_11
    -- hsl12 -> amount_period_12
    -- hsl13 -> amount_period_13
    -- hsl14 -> amount_period_14
    -- hsl15 -> amount_period_15
    -- hsl16 -> amount_period_16
    -- kslvt -> cost_element_total
    -- ksl01 -> group_amount_period_01
    -- ksl02 -> group_amount_period_02
    -- ksl03 -> group_amount_period_03
    -- ksl04 -> group_amount_period_04
    -- ksl05 -> group_amount_period_05
    -- ksl06 -> group_amount_period_06
    -- ksl07 -> cost_element_july
    -- ksl08 -> cost_element_august
    -- ksl09 -> cost_element_september
    -- ksl10 -> cost_element_october
    -- ksl11 -> cost_element_november
    -- ksl12 -> cost_element_december
    -- ksl13 -> cost_element_january_next
    -- ksl14 -> cost_element_february_next
    -- ksl15 -> cost_element_march_next
    -- ksl16 -> cost_element_april_next
    -- osl01 -> period_01_value
    -- osl02 -> period_02_value
    -- osl03 -> period_03_value
    -- osl04 -> period_04_value
    -- osl05 -> period_05_value
    -- osl06 -> period_06_value
    -- osl07 -> period_07_value
    -- osl08 -> period_08_value
    -- osl09 -> period_09_value
    -- osl10 -> period_10_value
    -- osl11 -> period_11_value
    -- osl12 -> period_12_value
    -- mslvt -> stat_key_figure_total
    -- msl01 -> stat_key_figure_january
    -- msl02 -> stat_key_figure_february
    -- msl03 -> stat_key_figure_march
    -- msl04 -> stat_key_figure_april
    -- msl05 -> stat_key_figure_may
    -- msl06 -> stat_key_figure_june
    -- msl07 -> stat_key_figure_july
    -- msl08 -> stat_key_figure_august
    -- msl09 -> stat_key_figure_september
    -- msl10 -> stat_key_figure_october
    -- msl11 -> stat_key_figure_november
    -- msl12 -> stat_key_figure_december
    -- msl13 -> stat_key_figure_january_next
    -- msl14 -> stat_key_figure_february_next
    -- msl15 -> stat_key_figure_march_next
    -- msl16 -> stat_key_figure_april_next
    -- timestamp_ -> record_timestamp
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "drcrk" AS "debit_credit_indicator",
        "objnr00" AS "object_number",
        "objnr01",
        "objnr02" AS "account_group",
        "objnr03" AS "account_number",
        "objnr04" AS "unused_object_4",
        "objnr05" AS "unused_object_5",
        "objnr06" AS "unused_object_6",
        "objnr07" AS "unused_object_7",
        "objnr08" AS "unused_object_8",
        "rclnt" AS "client",
        "rpmax" AS "max_periods",
        "ryear" AS "fiscal_year",
        "activ" AS "activity_type",
        "rmvct" AS "movement_type",
        "rtcur" AS "currency",
        "runit" AS "unit_of_measure",
        "awtyp" AS "document_type",
        "rldnr" AS "ledger",
        "rrcty" AS "record_type",
        "rvers" AS "version",
        "logsys" AS "logical_system",
        "racct" AS "gl_account",
        "cost_elem" AS "cost_element",
        "rbukrs",
        "rcntr" AS "cost_center",
        "prctr" AS "profit_center",
        "rfarea" AS "functional_area",
        "rbusa" AS "business_area",
        "kokrs" AS "controlling_area",
        "segment",
        "zzspreg" AS "special_region_code",
        "scntr" AS "sender_cost_center",
        "pprctr" AS "partner_profit_center",
        "sfarea" AS "sender_functional_area",
        "sbusa" AS "sender_business_area",
        "rassc" AS "asset_class",
        "psegment" AS "profit_segment",
        "tslvt",
        "tsl01" AS "january_amount",
        "tsl02" AS "february_amount",
        "tsl03" AS "march_amount",
        "tsl04" AS "april_amount",
        "tsl05" AS "may_amount",
        "tsl06" AS "june_amount",
        "tsl07" AS "july_amount",
        "tsl08" AS "august_amount",
        "tsl09" AS "september_amount",
        "tsl10" AS "october_amount",
        "tsl11" AS "november_amount",
        "tsl12" AS "december_amount",
        "tsl13",
        "tsl14",
        "tsl15",
        "tsl16",
        "hslvt" AS "amount_previous_year",
        "hsl01" AS "amount_period_01",
        "hsl02" AS "amount_period_02",
        "hsl03" AS "amount_period_03",
        "hsl04" AS "amount_period_04",
        "hsl05" AS "amount_period_05",
        "hsl06" AS "amount_period_06",
        "hsl07" AS "amount_period_07",
        "hsl08" AS "amount_period_08",
        "hsl09" AS "amount_period_09",
        "hsl10" AS "amount_period_10",
        "hsl11" AS "amount_period_11",
        "hsl12" AS "amount_period_12",
        "hsl13" AS "amount_period_13",
        "hsl14" AS "amount_period_14",
        "hsl15" AS "amount_period_15",
        "hsl16" AS "amount_period_16",
        "kslvt" AS "cost_element_total",
        "ksl01" AS "group_amount_period_01",
        "ksl02" AS "group_amount_period_02",
        "ksl03" AS "group_amount_period_03",
        "ksl04" AS "group_amount_period_04",
        "ksl05" AS "group_amount_period_05",
        "ksl06" AS "group_amount_period_06",
        "ksl07" AS "cost_element_july",
        "ksl08" AS "cost_element_august",
        "ksl09" AS "cost_element_september",
        "ksl10" AS "cost_element_october",
        "ksl11" AS "cost_element_november",
        "ksl12" AS "cost_element_december",
        "ksl13" AS "cost_element_january_next",
        "ksl14" AS "cost_element_february_next",
        "ksl15" AS "cost_element_march_next",
        "ksl16" AS "cost_element_april_next",
        "oslvt",
        "osl01" AS "period_01_value",
        "osl02" AS "period_02_value",
        "osl03" AS "period_03_value",
        "osl04" AS "period_04_value",
        "osl05" AS "period_05_value",
        "osl06" AS "period_06_value",
        "osl07" AS "period_07_value",
        "osl08" AS "period_08_value",
        "osl09" AS "period_09_value",
        "osl10" AS "period_10_value",
        "osl11" AS "period_11_value",
        "osl12" AS "period_12_value",
        "osl13",
        "osl14",
        "osl15",
        "osl16",
        "mslvt" AS "stat_key_figure_total",
        "msl01" AS "stat_key_figure_january",
        "msl02" AS "stat_key_figure_february",
        "msl03" AS "stat_key_figure_march",
        "msl04" AS "stat_key_figure_april",
        "msl05" AS "stat_key_figure_may",
        "msl06" AS "stat_key_figure_june",
        "msl07" AS "stat_key_figure_july",
        "msl08" AS "stat_key_figure_august",
        "msl09" AS "stat_key_figure_september",
        "msl10" AS "stat_key_figure_october",
        "msl11" AS "stat_key_figure_november",
        "msl12" AS "stat_key_figure_december",
        "msl13" AS "stat_key_figure_january_next",
        "msl14" AS "stat_key_figure_february_next",
        "msl15" AS "stat_key_figure_march_next",
        "msl16" AS "stat_key_figure_april_next",
        "timestamp_" AS "record_timestamp",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_faglflext_data_projected"
),

"sap_faglflext_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values: no values are remapped in this step.
    -- debit_credit_indicator: 's' and 'h' are the standard SAP debit/credit codes (S = Soll, debit; H = Haben, credit), so they are kept as-is.
    -- activity_type: 'rfbu' is the SAP business transaction code for FI postings (the same value appears as glvor in sap_bkpf_data), so it is kept rather than blanked out.
    -- ledger: '0l' is the lowercased ID of the SAP leading ledger 0L, not a typo for '01', so it is kept as-is.
    SELECT
        "debit_credit_indicator",
        "object_number",
        "objnr01",
        "account_group",
        "account_number",
        "unused_object_4",
        "unused_object_5",
        "unused_object_6",
        "unused_object_7",
        "unused_object_8",
        "client",
        "max_periods",
        "fiscal_year",
        "activity_type",
        "movement_type",
        "currency",
        "unit_of_measure",
        "document_type",
        "ledger",
        "record_type",
        "version",
        "logical_system",
        "gl_account",
        "cost_element",
        "rbukrs",
        "cost_center",
        "profit_center",
        "functional_area",
        "business_area",
        "controlling_area",
        "segment",
        "special_region_code",
        "sender_cost_center",
        "partner_profit_center",
        "sender_functional_area",
        "sender_business_area",
        "asset_class",
        "profit_segment",
        "tslvt",
        "january_amount",
        "february_amount",
        "march_amount",
        "april_amount",
        "may_amount",
        "june_amount",
        "july_amount",
        "august_amount",
        "september_amount",
        "october_amount",
        "november_amount",
        "december_amount",
        "tsl13",
        "tsl14",
        "tsl15",
        "tsl16",
        "amount_previous_year",
        "amount_period_01",
        "amount_period_02",
        "amount_period_03",
        "amount_period_04",
        "amount_period_05",
        "amount_period_06",
        "amount_period_07",
        "amount_period_08",
        "amount_period_09",
        "amount_period_10",
        "amount_period_11",
        "amount_period_12",
        "amount_period_13",
        "amount_period_14",
        "amount_period_15",
        "amount_period_16",
        "cost_element_total",
        "group_amount_period_01",
        "group_amount_period_02",
        "group_amount_period_03",
        "group_amount_period_04",
        "group_amount_period_05",
        "group_amount_period_06",
        "cost_element_july",
        "cost_element_august",
        "cost_element_september",
        "cost_element_october",
        "cost_element_november",
        "cost_element_december",
        "cost_element_january_next",
        "cost_element_february_next",
        "cost_element_march_next",
        "cost_element_april_next",
        "oslvt",
        "period_01_value",
        "period_02_value",
        "period_03_value",
        "period_04_value",
        "period_05_value",
        "period_06_value",
        "period_07_value",
        "period_08_value",
        "period_09_value",
        "period_10_value",
        "period_11_value",
        "period_12_value",
        "osl13",
        "osl14",
        "osl15",
        "osl16",
        "stat_key_figure_total",
        "stat_key_figure_january",
        "stat_key_figure_february",
        "stat_key_figure_march",
        "stat_key_figure_april",
        "stat_key_figure_may",
        "stat_key_figure_june",
        "stat_key_figure_july",
        "stat_key_figure_august",
        "stat_key_figure_september",
        "stat_key_figure_october",
        "stat_key_figure_november",
        "stat_key_figure_december",
        "stat_key_figure_january_next",
        "stat_key_figure_february_next",
        "stat_key_figure_march_next",
        "stat_key_figure_april_next",
        "record_timestamp",
        "row_id",
        "is_deleted"
    FROM "sap_faglflext_data_projected_renamed"
),

"sap_faglflext_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- account_group: from INT to VARCHAR
    -- account_number: from INT to VARCHAR
    -- asset_class: from DECIMAL to VARCHAR
    -- business_area: from INT to VARCHAR
    -- client: from INT to VARCHAR
    -- controlling_area: from INT to VARCHAR
    -- cost_center: from DECIMAL to VARCHAR
    -- cost_element: from DECIMAL to VARCHAR
    -- functional_area: from DECIMAL to VARCHAR
    -- gl_account: from INT to VARCHAR
    -- logical_system: from DECIMAL to VARCHAR
    -- movement_type: from DECIMAL to VARCHAR
    -- partner_profit_center: from DECIMAL to VARCHAR
    -- profit_segment: from DECIMAL to VARCHAR
    -- record_timestamp: from INT to TIMESTAMP
    -- segment: from DECIMAL to VARCHAR
    -- sender_business_area: from DECIMAL to VARCHAR
    -- sender_cost_center: from DECIMAL to VARCHAR
    -- sender_functional_area: from DECIMAL to VARCHAR
    -- special_region_code: from DECIMAL to VARCHAR
    -- unit_of_measure: from DECIMAL to VARCHAR
    -- unused_object_4: from INT to BIT
    -- unused_object_5: from INT to BIT
    -- unused_object_6: from INT to BIT
    -- unused_object_7: from INT to BIT
    -- unused_object_8: from INT to BIT
    SELECT
        "debit_credit_indicator",
        "object_number",
        "objnr01",
        "max_periods",
        "fiscal_year",
        "activity_type",
        "currency",
        "document_type",
        "ledger",
        "record_type",
        "version",
        "rbukrs",
        "profit_center",
        "tslvt",
        "january_amount",
        "february_amount",
        "march_amount",
        "april_amount",
        "may_amount",
        "june_amount",
        "july_amount",
        "august_amount",
        "september_amount",
        "october_amount",
        "november_amount",
        "december_amount",
        "tsl13",
        "tsl14",
        "tsl15",
        "tsl16",
        "amount_previous_year",
        "amount_period_01",
        "amount_period_02",
        "amount_period_03",
        "amount_period_04",
        "amount_period_05",
        "amount_period_06",
        "amount_period_07",
        "amount_period_08",
        "amount_period_09",
        "amount_period_10",
        "amount_period_11",
        "amount_period_12",
        "amount_period_13",
        "amount_period_14",
        "amount_period_15",
        "amount_period_16",
        "cost_element_total",
        "group_amount_period_01",
        "group_amount_period_02",
        "group_amount_period_03",
        "group_amount_period_04",
        "group_amount_period_05",
        "group_amount_period_06",
        "cost_element_july",
        "cost_element_august",
        "cost_element_september",
        "cost_element_october",
        "cost_element_november",
        "cost_element_december",
        "cost_element_january_next",
        "cost_element_february_next",
        "cost_element_march_next",
        "cost_element_april_next",
        "oslvt",
        "period_01_value",
        "period_02_value",
        "period_03_value",
        "period_04_value",
        "period_05_value",
        "period_06_value",
        "period_07_value",
        "period_08_value",
        "period_09_value",
        "period_10_value",
        "period_11_value",
        "period_12_value",
        "osl13",
        "osl14",
        "osl15",
        "osl16",
        "stat_key_figure_total",
        "stat_key_figure_january",
        "stat_key_figure_february",
        "stat_key_figure_march",
        "stat_key_figure_april",
        "stat_key_figure_may",
        "stat_key_figure_june",
        "stat_key_figure_july",
        "stat_key_figure_august",
        "stat_key_figure_september",
        "stat_key_figure_october",
        "stat_key_figure_november",
        "stat_key_figure_december",
        "stat_key_figure_january_next",
        "stat_key_figure_february_next",
        "stat_key_figure_march_next",
        "stat_key_figure_april_next",
        "row_id",
        "is_deleted",
        CAST("account_group" AS VARCHAR) AS "account_group",
        CAST("account_number" AS VARCHAR) AS "account_number",
        CAST("asset_class" AS VARCHAR) AS "asset_class",
        CAST("business_area" AS VARCHAR) AS "business_area",
        CAST("client" AS VARCHAR) AS "client",
        CAST("controlling_area" AS VARCHAR) AS "controlling_area",
        CAST("cost_center" AS VARCHAR) AS "cost_center",
        CAST("cost_element" AS VARCHAR) AS "cost_element",
        CAST("functional_area" AS VARCHAR) AS "functional_area",
        CAST("gl_account" AS VARCHAR) AS "gl_account",
        CAST("logical_system" AS VARCHAR) AS "logical_system",
        CAST("movement_type" AS VARCHAR) AS "movement_type",
        CAST("partner_profit_center" AS VARCHAR) AS "partner_profit_center",
        CAST("profit_segment" AS VARCHAR) AS "profit_segment",
        strptime(CAST("record_timestamp" AS VARCHAR), '%Y%m%d%H%M%S') AS "record_timestamp",
        CAST("segment" AS VARCHAR) AS "segment",
        CAST("sender_business_area" AS VARCHAR) AS "sender_business_area",
        CAST("sender_cost_center" AS VARCHAR) AS "sender_cost_center",
        CAST("sender_functional_area" AS VARCHAR) AS "sender_functional_area",
        CAST("special_region_code" AS VARCHAR) AS "special_region_code",
        CAST("unit_of_measure" AS VARCHAR) AS "unit_of_measure",
        CAST("unused_object_4" AS BIT) AS "unused_object_4",
        CAST("unused_object_5" AS BIT) AS "unused_object_5",
        CAST("unused_object_6" AS BIT) AS "unused_object_6",
        CAST("unused_object_7" AS BIT) AS "unused_object_7",
        CAST("unused_object_8" AS BIT) AS "unused_object_8"
    FROM "sap_faglflext_data_projected_renamed_cleaned"
),

"sap_faglflext_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 13 columns with unacceptable missing values
    -- asset_class has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cost_element has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- functional_area has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- logical_system has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- movement_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_profit_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- profit_center has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- profit_segment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- segment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sender_business_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sender_cost_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sender_functional_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- special_region_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "debit_credit_indicator",
        "object_number",
        "objnr01",
        "max_periods",
        "fiscal_year",
        "activity_type",
        "currency",
        "document_type",
        "ledger",
        "record_type",
        "version",
        "rbukrs",
        "profit_center",
        "tslvt",
        "january_amount",
        "february_amount",
        "march_amount",
        "april_amount",
        "may_amount",
        "june_amount",
        "july_amount",
        "august_amount",
        "september_amount",
        "october_amount",
        "november_amount",
        "december_amount",
        "tsl13",
        "tsl14",
        "tsl15",
        "tsl16",
        "amount_previous_year",
        "amount_period_01",
        "amount_period_02",
        "amount_period_03",
        "amount_period_04",
        "amount_period_05",
        "amount_period_06",
        "amount_period_07",
        "amount_period_08",
        "amount_period_09",
        "amount_period_10",
        "amount_period_11",
        "amount_period_12",
        "amount_period_13",
        "amount_period_14",
        "amount_period_15",
        "amount_period_16",
        "cost_element_total",
        "group_amount_period_01",
        "group_amount_period_02",
        "group_amount_period_03",
        "group_amount_period_04",
        "group_amount_period_05",
        "group_amount_period_06",
        "cost_element_july",
        "cost_element_august",
        "cost_element_september",
        "cost_element_october",
        "cost_element_november",
        "cost_element_december",
        "cost_element_january_next",
        "cost_element_february_next",
        "cost_element_march_next",
        "cost_element_april_next",
        "oslvt",
        "period_01_value",
        "period_02_value",
        "period_03_value",
        "period_04_value",
        "period_05_value",
        "period_06_value",
        "period_07_value",
        "period_08_value",
        "period_09_value",
        "period_10_value",
        "period_11_value",
        "period_12_value",
        "osl13",
        "osl14",
        "osl15",
        "osl16",
        "stat_key_figure_total",
        "stat_key_figure_january",
        "stat_key_figure_february",
        "stat_key_figure_march",
        "stat_key_figure_april",
        "stat_key_figure_may",
        "stat_key_figure_june",
        "stat_key_figure_july",
        "stat_key_figure_august",
        "stat_key_figure_september",
        "stat_key_figure_october",
        "stat_key_figure_november",
        "stat_key_figure_december",
        "stat_key_figure_january_next",
        "stat_key_figure_february_next",
        "stat_key_figure_march_next",
        "stat_key_figure_april_next",
        "row_id",
        "is_deleted",
        "account_group",
        "account_number",
        "business_area",
        "client",
        "controlling_area",
        "cost_center",
        "functional_area",
        "gl_account",
        "record_timestamp",
        "unit_of_measure",
        "unused_object_4",
        "unused_object_5",
        "unused_object_6",
        "unused_object_7",
        "unused_object_8"
    FROM "sap_faglflext_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_faglflext_data_projected_renamed_cleaned_casted_missing_handled"
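
When value mappings are written in a cleaning step, string literals need care: in SQL, a doubled single quote inside a literal is an escaped quote, so `'''s'''` denotes the three-character string `'s'` (quotes included), not the bare code `s`. The following is a minimal sketch of that behavior, using Python's built-in `sqlite3` as a stand-in for the warehouse engine; the table `demo` and its rows are hypothetical and not part of the pipeline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A doubled single quote inside a SQL string literal is one escaped quote,
# so '''s''' is the three-character string 's' (quotes included).
quoted, bare = conn.execute("SELECT '''s''', 's'").fetchone()

# Hypothetical sample of raw indicator values as they appear in the data.
conn.execute("CREATE TABLE demo (indicator TEXT)")
conn.executemany("INSERT INTO demo VALUES (?)", [("s",), ("h",)])

# Count the rows matched by each form of the literal.
matches_quoted = conn.execute(
    "SELECT COUNT(*) FROM demo WHERE indicator = '''s'''"
).fetchone()[0]
matches_bare = conn.execute(
    "SELECT COUNT(*) FROM demo WHERE indicator = 's'"
).fetchone()[0]

print(repr(quoted), repr(bare), matches_quoted, matches_bare)
```

A comparison against the triple-quoted form matches no rows, so a `CASE` built on it would silently pass every value through unchanged.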

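The missing-value step above drops columns that are entirely missing and leaves partially missing ones unchanged. The following is a minimal sketch of how such a per-column profile could be computed, again using Python's built-in `sqlite3` as a stand-in; the helper `missing_percent`, the table `demo`, and its three rows are hypothetical and only illustrate the idea.

```python
import sqlite3


def missing_percent(conn, table, columns):
    """Return {column: percent of rows where the column IS NULL}."""
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    result = {}
    for col in columns:
        nulls = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        ).fetchone()[0]
        result[col] = round(100.0 * nulls / total, 2) if total else 0.0
    return result


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE demo ("profit_center" TEXT, "segment" TEXT, "client" TEXT)'
)
# Hypothetical rows: profit_center is missing once, segment always, client never.
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?)",
    [("1000", None, "800"), ("1010", None, "800"), (None, None, "800")],
)

profile = missing_percent(conn, "demo", ["profit_center", "segment", "client"])
# Only columns that are missing in every row are candidates for dropping.
to_drop = [col for col, pct in profile.items() if pct == 100.0]
print(profile)
print(to_drop)
```

A column that is only partially missing stays in the model under this rule, so a `not_null` test on it would be expected to fail.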
stg_sap_faglflext_data.yml (Document the table)

version: 2
models:
- name: stg_sap_faglflext_data
  description: General ledger totals from an SAP system (table FAGLFLEXT). Each row
    holds the balance carried forward and the per-period totals for one combination
    of ledger, fiscal year, company code, account, and related dimensions such as
    cost center. Amounts are stored per currency type across period columns. Key
    fields are currency, fiscal year, account, company code, and cost center.
  columns:
  - name: debit_credit_indicator
    description: Debit/credit indicator using SAP codes, where s (Soll) is debit and h (Haben) is credit
    tests:
    - not_null
    - accepted_values:
        values:
        - d
        - c
        - s
        - h
  - name: object_number
    description: Object number
    tests:
    - not_null
  - name: objnr01
    description: Object number 1
    tests:
    - not_null
  - name: max_periods
    description: Maximum number of periods
    tests:
    - not_null
  - name: fiscal_year
    description: Fiscal year
    tests:
    - not_null
  - name: activity_type
    description: Business transaction code from SAP field ACTIV, for example rfbu for FI postings
    tests:
    - not_null
  - name: currency
    description: Transaction currency
    tests:
    - not_null
  - name: document_type
    description: Reference procedure from SAP field AWTYP, for example bkpf for accounting documents
    tests:
    - not_null
    - accepted_values:
        values:
        - AB
        - AF
        - AN
        - AZ
        - BA
        - BB
        - BK
        - DA
        - DG
        - DZ
        - EF
        - KA
        - KG
        - KN
        - KR
        - KZ
        - PR
        - SA
        - SK
        - SU
        - WA
        - WE
        - WL
        - bkpf
  - name: ledger
    description: Ledger
    tests:
    - not_null
  - name: record_type
    description: Record type
    tests:
    - not_null
  - name: version
    description: Version
    tests:
    - not_null
  - name: rbukrs
    description: Company code
    tests:
    - not_null
  - name: profit_center
    description: Profit center (33.33 percent of the profiled rows have no value, so not_null is not tested)
  - name: tslvt
    description: Balance carried forward in transaction currency
    tests:
    - not_null
  - name: january_amount
    description: January amount
    tests:
    - not_null
  - name: february_amount
    description: February amount
    tests:
    - not_null
  - name: march_amount
    description: March amount
    tests:
    - not_null
  - name: april_amount
    description: April amount
    tests:
    - not_null
  - name: may_amount
    description: May amount
    tests:
    - not_null
  - name: june_amount
    description: June amount
    tests:
    - not_null
  - name: july_amount
    description: July amount
    tests:
    - not_null
  - name: august_amount
    description: August amount
    tests:
    - not_null
  - name: september_amount
    description: September amount
    tests:
    - not_null
  - name: october_amount
    description: October amount
    tests:
    - not_null
  - name: november_amount
    description: November amount
    tests:
    - not_null
  - name: december_amount
    description: December amount
    tests:
    - not_null
  - name: tsl13
    description: Amount for period 13 in transaction currency
    tests:
    - not_null
  - name: tsl14
    description: Amount for period 14 in transaction currency
    tests:
    - not_null
  - name: tsl15
    description: Amount for period 15 in transaction currency
    tests:
    - not_null
  - name: tsl16
    description: Amount for period 16 in transaction currency
    tests:
    - not_null
  - name: amount_previous_year
    description: Balance carried forward from the previous year in local currency
    tests:
    - not_null
  - name: amount_period_01
    description: Amount for period 01 in local currency
    tests:
    - not_null
  - name: amount_period_02
    description: Amount for period 02 in local currency
    tests:
    - not_null
  - name: amount_period_03
    description: Amount for period 03 in local currency
    tests:
    - not_null
  - name: amount_period_04
    description: Amount for period 04 in local currency
    tests:
    - not_null
  - name: amount_period_05
    description: Amount for period 05 in local currency
    tests:
    - not_null
  - name: amount_period_06
    description: Amount for period 06 in local currency
    tests:
    - not_null
  - name: amount_period_07
    description: Amount for period 07 in local currency
    tests:
    - not_null
  - name: amount_period_08
    description: Amount for period 08 in local currency
    tests:
    - not_null
  - name: amount_period_09
    description: Amount for period 09 in local currency
    tests:
    - not_null
  - name: amount_period_10
    description: Amount for period 10 in local currency
    tests:
    - not_null
  - name: amount_period_11
    description: Amount for period 11 in local currency
    tests:
    - not_null
  - name: amount_period_12
    description: Amount for period 12 in local currency
    tests:
    - not_null
  - name: amount_period_13
    description: Amount for period 13 in local currency
    tests:
    - not_null
  - name: amount_period_14
    description: Amount for period 14 in local currency
    tests:
    - not_null
  - name: amount_period_15
    description: Amount for period 15 in local currency
    tests:
    - not_null
  - name: amount_period_16
    description: Amount for period 16 in local currency
    tests:
    - not_null
  - name: cost_element_total
    description: Balance carried forward in group currency (source field kslvt)
    tests:
    - not_null
  - name: group_amount_period_01
    description: Amount for period 01 in group currency
    tests:
    - not_null
  - name: group_amount_period_02
    description: Amount for period 02 in group currency
    tests:
    - not_null
  - name: group_amount_period_03
    description: Amount for period 03 in group currency
    tests:
    - not_null
  - name: group_amount_period_04
    description: Amount for period 04 in group currency
    tests:
    - not_null
  - name: group_amount_period_05
    description: Amount for period 05 in group currency
    tests:
    - not_null
  - name: group_amount_period_06
    description: Amount for period 06 in group currency
    tests:
    - not_null
  - name: cost_element_july
    description: Amount for period 07 in group currency
    tests:
    - not_null
  - name: cost_element_august
    description: Amount for period 08 in group currency
    tests:
    - not_null
  - name: cost_element_september
    description: Amount for period 09 in group currency
    tests:
    - not_null
  - name: cost_element_october
    description: Amount for period 10 in group currency
    tests:
    - not_null
  - name: cost_element_november
    description: Amount for period 11 in group currency
    tests:
    - not_null
  - name: cost_element_december
    description: Amount for period 12 in group currency
    tests:
    - not_null
  - name: cost_element_january_next
    description: Amount for period 13 in group currency
    tests:
    - not_null
  - name: cost_element_february_next
    description: Amount for period 14 in group currency
    tests:
    - not_null
  - name: cost_element_march_next
    description: Amount for period 15 in group currency
    tests:
    - not_null
  - name: cost_element_april_next
    description: Amount for period 16 in group currency
    tests:
    - not_null
  - name: oslvt
    description: ''
    tests:
    - not_null
  - name: period_01_value
    description: Financial value for period 1 (e.g., January)
    tests:
    - not_null
  - name: period_02_value
    description: Financial value for period 2 (e.g., February)
    tests:
    - not_null
  - name: period_03_value
    description: Financial value for period 3 (e.g., March)
    tests:
    - not_null
  - name: period_04_value
    description: Financial value for period 4 (e.g., April)
    tests:
    - not_null
  - name: period_05_value
    description: Financial value for period 5 (e.g., May)
    tests:
    - not_null
  - name: period_06_value
    description: Financial value for period 6 (e.g., June)
    tests:
    - not_null
  - name: period_07_value
    description: Financial value for period 7 (e.g., July)
    tests:
    - not_null
  - name: period_08_value
    description: Financial value for period 8 (e.g., August)
    tests:
    - not_null
  - name: period_09_value
    description: Financial value for period 9 (e.g., September)
    tests:
    - not_null
  - name: period_10_value
    description: Financial value for period 10 (e.g., October)
    tests:
    - not_null
  - name: period_11_value
    description: Financial value for period 11 (e.g., November)
    tests:
    - not_null
  - name: period_12_value
    description: Financial value for period 12 (e.g., December)
    tests:
    - not_null
  - name: osl13
    description: ''
    tests:
    - not_null
  - name: osl14
    description: ''
    tests:
    - not_null
  - name: osl15
    description: ''
    tests:
    - not_null
  - name: osl16
    description: ''
    tests:
    - not_null
  - name: stat_key_figure_total
    description: Statistical key figure total
    tests:
    - not_null
  - name: stat_key_figure_january
    description: Statistical key figure for January
    tests:
    - not_null
  - name: stat_key_figure_february
    description: Statistical key figure for February
    tests:
    - not_null
  - name: stat_key_figure_march
    description: Statistical key figure for March
    tests:
    - not_null
  - name: stat_key_figure_april
    description: Statistical key figure for April
    tests:
    - not_null
  - name: stat_key_figure_may
    description: Statistical key figure for May
    tests:
    - not_null
  - name: stat_key_figure_june
    description: Statistical key figure for June
    tests:
    - not_null
  - name: stat_key_figure_july
    description: Statistical key figure for July
    tests:
    - not_null
  - name: stat_key_figure_august
    description: Statistical key figure for August
    tests:
    - not_null
  - name: stat_key_figure_september
    description: Statistical key figure for September
    tests:
    - not_null
  - name: stat_key_figure_october
    description: Statistical key figure for October
    tests:
    - not_null
  - name: stat_key_figure_november
    description: Statistical key figure for November
    tests:
    - not_null
  - name: stat_key_figure_december
    description: Statistical key figure for December
    tests:
    - not_null
  - name: stat_key_figure_january_next
    description: Statistical key figure for January next year
    tests:
    - not_null
  - name: stat_key_figure_february_next
    description: Statistical key figure for February next year
    tests:
    - not_null
  - name: stat_key_figure_march_next
    description: Statistical key figure for March next year
    tests:
    - not_null
  - name: stat_key_figure_april_next
    description: Statistical key figure for April next year
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a unique identifier for each row in the
        table. For this financial data table, each row represents a distinct financial
        record, and the row_id seems to be incremental and unique for each entry.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: account_group
    description: Object number 2, possibly account group
    tests:
    - not_null
  - name: account_number
    description: Object number 3, possibly specific account
    tests:
    - not_null
  - name: business_area
    description: Business area
    tests:
    - not_null
  - name: client
    description: Client
    tests:
    - not_null
    - accepted_values:
        values:
        - '800'
        - '888'
        - '877'
        - '866'
        - '855'
        - '844'
        - '833'
        - '822'
        - '880'
        - '881'
        - '882'
        - '883'
        - '884'
  - name: controlling_area
    description: Controlling area or company code
    tests:
    - not_null
  - name: cost_center
    description: Cost center
    cocoon_meta:
      missing_acceptable: Not applicable for non-cost center related accounts.
  - name: functional_area
    description: Functional area
    tests:
    - not_null
  - name: gl_account
    description: G/L Account number
    tests:
    - not_null
  - name: record_timestamp
    description: Timestamp of record
    tests:
    - not_null
  - name: unit_of_measure
    description: Unit of measure
    cocoon_meta:
      missing_acceptable: Measurement values are all 0.0, so unit not needed.
  - name: unused_object_4
    description: Object number 4, unused in this dataset
    tests:
    - not_null
  - name: unused_object_5
    description: Object number 5, unused in this dataset
    tests:
    - not_null
  - name: unused_object_6
    description: Object number 6, unused in this dataset
    tests:
    - not_null
  - name: unused_object_7
    description: Object number 7, unused in this dataset
    tests:
    - not_null
  - name: unused_object_8
    description: Object number 8, unused in this dataset
    tests:
    - not_null
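
The accepted_values test declared on the client column above can be approximated outside dbt. The following is a minimal Python sketch, not dbt's implementation: the helper name accepted_values_failures and the sample inputs are hypothetical, and missing values are skipped on the assumption that the separate not_null test covers them.

```python
# Accepted client codes, copied from the accepted_values list in the YAML.
ACCEPTED_CLIENTS = {
    "800", "888", "877", "866", "855", "844", "833",
    "822", "880", "881", "882", "883", "884",
}


def accepted_values_failures(values, accepted):
    """Return the distinct non-missing values that are not in `accepted`."""
    return sorted({v for v in values if v is not None and v not in accepted})


# Hypothetical inputs: '999' is not an accepted client code, None is ignored.
sample_clients = ["800", "800", "999", None]
failures = accepted_values_failures(sample_clients, ACCEPTED_CLIENTS)
```

The accepted values are strings, which matches the staging models casting the client column to VARCHAR before testing it.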

stg_sap_t503_data (first 100 rows)

employee_group employee_subgroup payroll_area payroll_type tariff_indicator time_management_status payroll_status employment_status termination_status account_type posting_indicator row_id is_deleted client_code
0 1 A0 3 3 3 2 0 0 0 1 0 692 False 800
1 1 A1 2 2 2 1 0 0 0 2 0 693 False 800
2 1 A2 2 2 2 1 0 0 0 2 0 694 False 800

stg_sap_t503_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 14:55:10.122332+00:00
WITH 
"sap_t503_data_projected" AS (
    -- Projection: Selecting 18 out of 19 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "mandt",
        "persg",
        "persk",
        "abart",
        "abtyp",
        "antyp",
        "trfkz",
        "zeity",
        "aksta",
        "ansta",
        "austa",
        "konty",
        "burkz",
        "molga",
        "typsz",
        "inwid",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_t503_data"
),

"sap_t503_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- mandt -> client_code
    -- persg -> employee_group
    -- persk -> employee_subgroup
    -- abart -> payroll_area
    -- abtyp -> payroll_type
    -- antyp -> employment_type
    -- trfkz -> tariff_indicator
    -- zeity -> time_management_status
    -- aksta -> payroll_status
    -- ansta -> employment_status
    -- austa -> termination_status
    -- konty -> account_type
    -- burkz -> posting_indicator
    -- molga -> country_grouping
    -- typsz -> special_payment_type
    -- inwid -> in_house_pay_scale
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "mandt" AS "client_code",
        "persg" AS "employee_group",
        "persk" AS "employee_subgroup",
        "abart" AS "payroll_area",
        "abtyp" AS "payroll_type",
        "antyp" AS "employment_type",
        "trfkz" AS "tariff_indicator",
        "zeity" AS "time_management_status",
        "aksta" AS "payroll_status",
        "ansta" AS "employment_status",
        "austa" AS "termination_status",
        "konty" AS "account_type",
        "burkz" AS "posting_indicator",
        "molga" AS "country_grouping",
        "typsz" AS "special_payment_type",
        "inwid" AS "in_house_pay_scale",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_t503_data_projected"
),

"sap_t503_data_projected_renamed_casted" AS (
    -- Column Type Casting: 
    -- client_code: from INT to VARCHAR
    -- country_grouping: from DECIMAL to VARCHAR
    -- employment_type: from DECIMAL to VARCHAR
    -- in_house_pay_scale: from DECIMAL to VARCHAR
    -- special_payment_type: from DECIMAL to VARCHAR
    SELECT
        "employee_group",
        "employee_subgroup",
        "payroll_area",
        "payroll_type",
        "tariff_indicator",
        "time_management_status",
        "payroll_status",
        "employment_status",
        "termination_status",
        "account_type",
        "posting_indicator",
        "row_id",
        "is_deleted",
        CAST("client_code" AS VARCHAR) AS "client_code",
        CAST("country_grouping" AS VARCHAR) AS "country_grouping",
        CAST("employment_type" AS VARCHAR) AS "employment_type",
        CAST("in_house_pay_scale" AS VARCHAR) AS "in_house_pay_scale",
        CAST("special_payment_type" AS VARCHAR) AS "special_payment_type"
    FROM "sap_t503_data_projected_renamed"
),

"sap_t503_data_projected_renamed_casted_missing_handled" AS (
    -- Handling missing values: There are 4 columns with unacceptable missing values
    -- country_grouping has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employment_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- in_house_pay_scale has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- special_payment_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "employee_group",
        "employee_subgroup",
        "payroll_area",
        "payroll_type",
        "tariff_indicator",
        "time_management_status",
        "payroll_status",
        "employment_status",
        "termination_status",
        "account_type",
        "posting_indicator",
        "row_id",
        "is_deleted",
        "client_code"
    FROM "sap_t503_data_projected_renamed_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_t503_data_projected_renamed_casted_missing_handled"
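
The missing-value step above drops columns that are 100 percent missing. A minimal sketch of how such columns could be detected is shown below. It uses Python's built-in sqlite3 module as a stand-in engine; the table t503_demo, its rows, and the helper all_null_columns are hypothetical and are not part of the generated pipeline.

```python
import sqlite3


def all_null_columns(conn, table, columns):
    """Return the columns of `table` whose value is NULL in every row."""
    empty = []
    for col in columns:
        # COUNT(col) counts only non-NULL values, so 0 means fully missing.
        (non_null,) = conn.execute(
            f'SELECT COUNT("{col}") FROM "{table}"'
        ).fetchone()
        if non_null == 0:
            empty.append(col)
    return empty


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t503_demo "
    "(employee_group TEXT, country_grouping TEXT, employment_type TEXT)"
)
conn.executemany(
    "INSERT INTO t503_demo VALUES (?, ?, ?)",
    [("1", None, None), ("1", None, None)],
)

columns = ["employee_group", "country_grouping", "employment_type"]
to_drop = all_null_columns(conn, "t503_demo", columns)
kept = [c for c in columns if c not in to_drop]
```

The final SELECT then lists only the kept columns, which is the shape of the missing_handled CTE above.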

stg_sap_t503_data.yml (Document the table)

version: 2
models:
- name: stg_sap_t503_data
  description: The table is about employee grouping and payroll settings. It contains
    fields for employee group (persg), employee subgroup (persk), payroll area (abart),
    time management status (zeity), and various other payroll-related flags and indicators.
    The table likely serves as a reference for HR and payroll systems to categorize
    employees and determine their specific payroll processing rules.
  columns:
  - name: employee_group
    description: Employee group
    tests:
    - not_null
  - name: employee_subgroup
    description: Employee subgroup
    tests:
    - not_null
  - name: payroll_area
    description: Payroll area code
    tests:
    - not_null
  - name: payroll_type
    description: Payroll type
    tests:
    - not_null
  - name: tariff_indicator
    description: Tariff indicator
    tests:
    - not_null
  - name: time_management_status
    description: Time management status
    tests:
    - not_null
  - name: payroll_status
    description: Payroll status indicator
    tests:
    - not_null
  - name: employment_status
    description: Employment status indicator
    tests:
    - not_null
  - name: termination_status
    description: Termination status indicator
    tests:
    - not_null
  - name: account_type
    description: Account type
    tests:
    - not_null
  - name: posting_indicator
    description: Posting indicator
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is described as a unique identifier for the row. For
        this table, each row represents a distinct combination of employee grouping
        and payroll settings. The row_id is designed to be unique across all rows
        in the table.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: client_code
    description: Client (source field mandt)
    tests:
    - not_null
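
The not_null and unique tests declared above reduce to simple queries. The sketch below illustrates the idea with Python's built-in sqlite3 module. It is an approximation rather than dbt's own code; the table t503_demo and both helper functions are hypothetical, and the three rows reuse values from the sample shown earlier.

```python
import sqlite3


def not_null_failures(conn, table, column):
    """Count rows where `column` is NULL (the check fails if this is > 0)."""
    query = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
    return conn.execute(query).fetchone()[0]


def unique_failures(conn, table, column):
    """Return the non-NULL values of `column` that occur more than once."""
    query = (
        f'SELECT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1 '
        f'ORDER BY "{column}"'
    )
    return [row[0] for row in conn.execute(query)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t503_demo (row_id INTEGER, employee_group TEXT)")
conn.executemany(
    "INSERT INTO t503_demo VALUES (?, ?)",
    [(692, "1"), (693, "1"), (694, "1")],
)

row_id_nulls = not_null_failures(conn, "t503_demo", "row_id")
row_id_dupes = unique_failures(conn, "t503_demo", "row_id")
group_dupes = unique_failures(conn, "t503_demo", "employee_group")
```

In this demo row_id passes both checks, while employee_group repeats and would fail a unique test, which is consistent with only row_id carrying that test in the YAML.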

stg_sap_pa0001_data (first 100 rows)

employee_id sequence_number user_name personnel_area persg employee_subgroup personnel_subarea work_schedule_rule sachp sname ename object_type payroll_modifier row_id is_deleted client company_code controlling_area distribution_key is_historical job_id last_changed_date lock_indicator org_unit_id position_id processing_reason valid_from valid_to
0 10 0 powersa 200 1 gc 2 g1 3 powers austin Mr. Austin Powers s 200 1 False 800 2000 1000 200 None 50016575 2003-05-07 None 50001357 50005214 None 2002-01-01 9999-12-31
1 69 0 c5115457 200 1 gc 2 g1 3 bob sponge Mr. Sponge Bob s 200 2 False 800 2000 1000 200 None 50029038 2011-11-14 None 50002214 50005687 None 2003-01-01 9999-12-31
2 70 0 c5115457 200 1 gc 2 g1 3 wayne bruce Mr. Bruce Wayne s 200 3 False 800 2000 1000 200 None 50043146 2011-11-14 None 50002214 50005691 None 2003-01-01 9999-12-31

stg_sap_pa0001_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 05:13:22.987849+00:00
WITH 
"sap_pa0001_data_projected" AS (
    -- Projection: Selecting 54 out of 55 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "begda",
        "endda",
        "mandt",
        "objps",
        "pernr",
        "seqnr",
        "sprps",
        "subty",
        "aedtm",
        "uname",
        "histo",
        "itxex",
        "refex",
        "ordex",
        "itbld",
        "preas",
        "flag1",
        "flag2",
        "flag3",
        "flag4",
        "rese1",
        "rese2",
        "grpvl",
        "bukrs",
        "werks",
        "persg",
        "persk",
        "vdsk1",
        "gsber",
        "btrtl",
        "juper",
        "abkrs",
        "ansvh",
        "kostl",
        "orgeh",
        "plans",
        "stell",
        "mstbr",
        "sacha",
        "sachp",
        "sachz",
        "sname",
        "ename",
        "otype",
        "sbmod",
        "kokrs",
        "fistl",
        "geber",
        "fkber",
        "grant_nbr",
        "sgmnt",
        "budget_pd",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_pa0001_data"
),

"sap_pa0001_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- begda -> valid_from
    -- endda -> valid_to
    -- mandt -> client
    -- objps -> object_specification
    -- pernr -> employee_id
    -- seqnr -> sequence_number
    -- sprps -> lock_indicator
    -- subty -> subtype
    -- aedtm -> last_changed_date
    -- uname -> user_name
    -- histo -> is_historical
    -- itxex -> tax_exemption
    -- refex -> reference_id
    -- ordex -> order_id
    -- itbld -> it_planning_layout
    -- preas -> processing_reason
    -- flag1 -> custom_flag_1
    -- flag2 -> custom_flag_2
    -- flag3 -> custom_flag_3
    -- flag4 -> custom_flag_4
    -- rese1 -> reserved_field_1
    -- rese2 -> reserved_field_2
    -- bukrs -> company_code
    -- werks -> personnel_area
    -- persk -> employee_subgroup
    -- vdsk1 -> distribution_key
    -- gsber -> business_area
    -- btrtl -> personnel_subarea
    -- juper -> payroll_period
    -- abkrs -> work_schedule_rule
    -- ansvh -> action_reason
    -- kostl -> cost_center
    -- orgeh -> org_unit_id
    -- plans -> position_id
    -- stell -> job_id
    -- sacha -> wage_type
    -- sachz -> time_constraint
    -- otype -> object_type
    -- sbmod -> payroll_modifier
    -- kokrs -> controlling_area
    -- fistl -> funds_center
    -- geber -> fund
    -- fkber -> functional_area
    -- grant_nbr -> grant_number
    -- sgmnt -> segment_id
    -- budget_pd -> budget_period
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "begda" AS "valid_from",
        "endda" AS "valid_to",
        "mandt" AS "client",
        "objps" AS "object_specification",
        "pernr" AS "employee_id",
        "seqnr" AS "sequence_number",
        "sprps" AS "lock_indicator",
        "subty" AS "subtype",
        "aedtm" AS "last_changed_date",
        "uname" AS "user_name",
        "histo" AS "is_historical",
        "itxex" AS "tax_exemption",
        "refex" AS "reference_id",
        "ordex" AS "order_id",
        "itbld" AS "it_planning_layout",
        "preas" AS "processing_reason",
        "flag1" AS "custom_flag_1",
        "flag2" AS "custom_flag_2",
        "flag3" AS "custom_flag_3",
        "flag4" AS "custom_flag_4",
        "rese1" AS "reserved_field_1",
        "rese2" AS "reserved_field_2",
        "grpvl",
        "bukrs" AS "company_code",
        "werks" AS "personnel_area",
        "persg",
        "persk" AS "employee_subgroup",
        "vdsk1" AS "distribution_key",
        "gsber" AS "business_area",
        "btrtl" AS "personnel_subarea",
        "juper" AS "payroll_period",
        "abkrs" AS "work_schedule_rule",
        "ansvh" AS "action_reason",
        "kostl" AS "cost_center",
        "orgeh" AS "org_unit_id",
        "plans" AS "position_id",
        "stell" AS "job_id",
        "mstbr",
        "sacha" AS "wage_type",
        "sachp",
        "sachz" AS "time_constraint",
        "sname",
        "ename",
        "otype" AS "object_type",
        "sbmod" AS "payroll_modifier",
        "kokrs" AS "controlling_area",
        "fistl" AS "funds_center",
        "geber" AS "fund",
        "fkber" AS "functional_area",
        "grant_nbr" AS "grant_number",
        "sgmnt" AS "segment_id",
        "budget_pd" AS "budget_period",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_pa0001_data_projected"
),

"sap_pa0001_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values:
    -- employee_subgroup: 'gc' is the only value present. It is a coded value whose meaning cannot be determined without a data dictionary, so no replacement can be justified from the data alone.
    -- ename: Capitalization and title formatting are inconsistent. 'austin powers' lacks capitalization and a title, while 'mr bruce wayne' and 'mr sponge bob' use a lowercase 'mr' without a period. The correct values use title case for names and a capitalized 'Mr.'.
    -- object_type: 's' is the only value present. It is a single-letter code; in SAP organizational management, object type S conventionally denotes a position. Without a data dictionary this cannot be confirmed, so no replacement can be justified from the data alone.
    SELECT
        "valid_from",
        "valid_to",
        "client",
        "object_specification",
        "employee_id",
        "sequence_number",
        "lock_indicator",
        "subtype",
        "last_changed_date",
        "user_name",
        "is_historical",
        "tax_exemption",
        "reference_id",
        "order_id",
        "it_planning_layout",
        "processing_reason",
        "custom_flag_1",
        "custom_flag_2",
        "custom_flag_3",
        "custom_flag_4",
        "reserved_field_1",
        "reserved_field_2",
        "grpvl",
        "company_code",
        "personnel_area",
        "persg",
        -- Left unchanged: 'gc' is a coded source value, not a data error.
        "employee_subgroup",
        "distribution_key",
        "business_area",
        "personnel_subarea",
        "payroll_period",
        "work_schedule_rule",
        "action_reason",
        "cost_center",
        "org_unit_id",
        "position_id",
        "job_id",
        "mstbr",
        "wage_type",
        "sachp",
        "time_constraint",
        "sname",
        CASE
            WHEN "ename" = 'austin powers' THEN 'Mr. Austin Powers'
            WHEN "ename" = 'mr bruce wayne' THEN 'Mr. Bruce Wayne'
            WHEN "ename" = 'mr sponge bob' THEN 'Mr. Sponge Bob'
            ELSE "ename"
        END AS "ename",
        -- Left unchanged: 's' is a coded source value, not a data error.
        "object_type",
        "payroll_modifier",
        "controlling_area",
        "funds_center",
        "fund",
        "functional_area",
        "grant_number",
        "segment_id",
        "budget_period",
        "row_id",
        "is_deleted"
    FROM "sap_pa0001_data_projected_renamed"
),

"sap_pa0001_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- action_reason: from DECIMAL to VARCHAR
    -- budget_period: from DECIMAL to VARCHAR
    -- business_area: from DECIMAL to VARCHAR
    -- client: from INT to VARCHAR
    -- company_code: from INT to VARCHAR
    -- controlling_area: from INT to VARCHAR
    -- cost_center: from DECIMAL to VARCHAR
    -- custom_flag_1: from DECIMAL to VARCHAR
    -- custom_flag_2: from DECIMAL to VARCHAR
    -- custom_flag_3: from DECIMAL to VARCHAR
    -- custom_flag_4: from DECIMAL to VARCHAR
    -- distribution_key: from INT to VARCHAR
    -- functional_area: from DECIMAL to VARCHAR
    -- fund: from DECIMAL to VARCHAR
    -- funds_center: from DECIMAL to VARCHAR
    -- grant_number: from DECIMAL to VARCHAR
    -- grpvl: from DECIMAL to VARCHAR
    -- is_historical: from DECIMAL to VARCHAR
    -- it_planning_layout: from DECIMAL to VARCHAR
    -- job_id: from INT to VARCHAR
    -- last_changed_date: from INT to DATE
    -- lock_indicator: from DECIMAL to VARCHAR
    -- mstbr: from DECIMAL to VARCHAR
    -- object_specification: from DECIMAL to VARCHAR
    -- order_id: from DECIMAL to VARCHAR
    -- org_unit_id: from INT to VARCHAR
    -- payroll_period: from DECIMAL to VARCHAR
    -- position_id: from INT to VARCHAR
    -- processing_reason: from DECIMAL to VARCHAR
    -- reference_id: from DECIMAL to VARCHAR
    -- reserved_field_1: from DECIMAL to VARCHAR
    -- reserved_field_2: from DECIMAL to VARCHAR
    -- segment_id: from DECIMAL to VARCHAR
    -- subtype: from DECIMAL to VARCHAR
    -- tax_exemption: from DECIMAL to VARCHAR
    -- time_constraint: from DECIMAL to VARCHAR
    -- valid_from: from INT to DATE
    -- valid_to: from INT to DATE
    -- wage_type: from DECIMAL to VARCHAR
    SELECT
        "employee_id",
        "sequence_number",
        "user_name",
        "personnel_area",
        "persg",
        "employee_subgroup",
        "personnel_subarea",
        "work_schedule_rule",
        "sachp",
        "sname",
        "ename",
        "object_type",
        "payroll_modifier",
        "row_id",
        "is_deleted",
        CAST("action_reason" AS VARCHAR) AS "action_reason",
        CAST("budget_period" AS VARCHAR) AS "budget_period",
        CAST("business_area" AS VARCHAR) AS "business_area",
        CAST("client" AS VARCHAR) AS "client",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST("controlling_area" AS VARCHAR) AS "controlling_area",
        CAST("cost_center" AS VARCHAR) AS "cost_center",
        CAST("custom_flag_1" AS VARCHAR) AS "custom_flag_1",
        CAST("custom_flag_2" AS VARCHAR) AS "custom_flag_2",
        CAST("custom_flag_3" AS VARCHAR) AS "custom_flag_3",
        CAST("custom_flag_4" AS VARCHAR) AS "custom_flag_4",
        CAST("distribution_key" AS VARCHAR) AS "distribution_key",
        CAST("functional_area" AS VARCHAR) AS "functional_area",
        CAST("fund" AS VARCHAR) AS "fund",
        CAST("funds_center" AS VARCHAR) AS "funds_center",
        CAST("grant_number" AS VARCHAR) AS "grant_number",
        CAST("grpvl" AS VARCHAR) AS "grpvl",
        CAST("is_historical" AS VARCHAR) AS "is_historical",
        CAST("it_planning_layout" AS VARCHAR) AS "it_planning_layout",
        CAST("job_id" AS VARCHAR) AS "job_id",
        CAST(strptime(CAST("last_changed_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "last_changed_date",
        CAST("lock_indicator" AS VARCHAR) AS "lock_indicator",
        CAST("mstbr" AS VARCHAR) AS "mstbr",
        CAST("object_specification" AS VARCHAR) AS "object_specification",
        CAST("order_id" AS VARCHAR) AS "order_id",
        CAST("org_unit_id" AS VARCHAR) AS "org_unit_id",
        CAST("payroll_period" AS VARCHAR) AS "payroll_period",
        CAST("position_id" AS VARCHAR) AS "position_id",
        CAST("processing_reason" AS VARCHAR) AS "processing_reason",
        CAST("reference_id" AS VARCHAR) AS "reference_id",
        CAST("reserved_field_1" AS VARCHAR) AS "reserved_field_1",
        CAST("reserved_field_2" AS VARCHAR) AS "reserved_field_2",
        CAST("segment_id" AS VARCHAR) AS "segment_id",
        CAST("subtype" AS VARCHAR) AS "subtype",
        CAST("tax_exemption" AS VARCHAR) AS "tax_exemption",
        CAST("time_constraint" AS VARCHAR) AS "time_constraint",
        CAST(strptime(CAST("valid_from" AS VARCHAR), '%Y%m%d') AS DATE) AS "valid_from",
        CAST(strptime(CAST("valid_to" AS VARCHAR), '%Y%m%d') AS DATE) AS "valid_to",
        CAST("wage_type" AS VARCHAR) AS "wage_type"
    FROM "sap_pa0001_data_projected_renamed_cleaned"
),

"sap_pa0001_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 26 columns with unacceptable missing values
    -- action_reason has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- budget_period has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cost_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- functional_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fund has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- funds_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- grant_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- grpvl has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- it_planning_layout has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- mstbr has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- object_specification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- order_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payroll_period has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_field_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_field_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- segment_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- subtype has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_exemption has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- time_constraint has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "employee_id",
        "sequence_number",
        "user_name",
        "personnel_area",
        "persg",
        "employee_subgroup",
        "personnel_subarea",
        "work_schedule_rule",
        "sachp",
        "sname",
        "ename",
        "object_type",
        "payroll_modifier",
        "row_id",
        "is_deleted",
        "client",
        "company_code",
        "controlling_area",
        "distribution_key",
        "is_historical",
        "job_id",
        "last_changed_date",
        "lock_indicator",
        "org_unit_id",
        "position_id",
        "processing_reason",
        "valid_from",
        "valid_to"
    FROM "sap_pa0001_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_pa0001_data_projected_renamed_cleaned_casted_missing_handled"
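
A note on SQL string quoting, which matters for value-replacement rules of the kind used in the cleaning step: inside a single-quoted literal, a doubled single quote is an escaped quote character. A literal written as '''gc''' therefore denotes the four-character string 'gc' including the quote marks, and it does not equal the plain value gc. The sketch below checks this with Python's built-in sqlite3 module, used here only as a convenient stand-in; the escaping rule is standard SQL, and it is assumed rather than verified here that the engine running this model applies it the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# '''gc''' is a literal whose content is: quote, g, c, quote.
quoted_literal = conn.execute("SELECT '''gc'''").fetchone()[0]

# Comparing the plain value gc with the quoted literal yields false (0).
matches_plain = conn.execute("SELECT 'gc' = '''gc'''").fetchone()[0]

# '''' is a one-character string holding a single quote, not an empty string.
lone_quote = conn.execute("SELECT ''''").fetchone()[0]

# The plain literal 'gc' is what matches the stored value gc.
matches_unquoted = conn.execute("SELECT 'gc' = 'gc'").fetchone()[0]
```

A WHEN branch that compares a column against '''gc''' will only match rows whose stored value literally contains the quote marks.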

stg_sap_pa0001_data.yml (Document the table)

version: 2
models:
- name: stg_sap_pa0001_data
  description: The table holds employee information from an SAP HR system, covering
    personal and organizational details. Key fields include employee number (pernr),
    name (ename), position (plans), organizational unit (orgeh), and validity dates
    (begda/endda). It also stores company code (bukrs), personnel area (werks), and
    employee group (persg). Changes are tracked through the last changed date (aedtm)
    and the user name (uname).
  columns:
  - name: employee_id
    description: Personnel number (employee ID)
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the unique personnel number for each employee.
        For this table, each row represents an employee's record. The employee_id
        is likely to be unique across rows as it's a standard practice in HR systems
        to assign unique identifiers to employees.
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: user_name
    description: User name (creator or modifier)
    tests:
    - not_null
  - name: personnel_area
    description: Personnel area (often plant or location)
    tests:
    - not_null
  - name: persg
    description: Employee group (source field persg, not renamed)
    tests:
    - not_null
  - name: employee_subgroup
    description: Employee subgroup
    tests:
    - not_null
  - name: personnel_subarea
    description: Personnel subarea
    tests:
    - not_null
  - name: work_schedule_rule
    description: Work schedule rule
    tests:
    - not_null
  - name: sachp
    description: Administrator for HR master data (source field sachp, not renamed)
    tests:
    - not_null
  - name: sname
    description: Employee name in sortable form, surname first (source field sname, not renamed)
    tests:
    - not_null
  - name: ename
    description: Formatted employee name including title (source field ename, not renamed)
    tests:
    - not_null
  - name: object_type
    description: Object type
    tests:
    - not_null
  - name: payroll_modifier
    description: Payroll modifier
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is explicitly described as a unique identifier for the
        row. In a properly designed database, this would be a unique value for each
        record, making it an excellent candidate key.
  - name: is_deleted
    description: Indicates if the record has been deleted
    tests:
    - not_null
  - name: client
    description: Client
    tests:
    - not_null
    - accepted_values:
        values:
        - '800'
        - '888'
        - '877'
        - '866'
        - '855'
        - '844'
        - '833'
        - '822'
        - '880'
        - '881'
        - '882'
        - '883'
        - '884'
  - name: company_code
    description: Company code
    tests:
    - not_null
  - name: controlling_area
    description: Controlling area
    tests:
    - not_null
  - name: distribution_key
    description: Distribution key
    tests:
    - not_null
  - name: is_historical
    description: Historical record indicator
    cocoon_meta:
      missing_acceptable: Current records are not historical by definition.
  - name: job_id
    description: Job identifier
    tests:
    - not_null
  - name: last_changed_date
    description: Last changed date
    tests:
    - not_null
  - name: lock_indicator
    description: Lock indicator
    cocoon_meta:
      missing_acceptable: Unlocked records don't need a lock indicator.
  - name: org_unit_id
    description: Organizational unit identifier
    tests:
    - not_null
  - name: position_id
    description: Position identifier
    tests:
    - not_null
  - name: processing_reason
    description: Processing reason
    cocoon_meta:
      missing_acceptable: No special processing needed for these standard records.
  - name: valid_from
    description: Start date of validity
    tests:
    - not_null
  - name: valid_to
    description: End date of validity
    tests:
    - not_null
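
The valid_from, valid_to, and last_changed_date columns arrive as YYYYMMDD integers and are converted to dates in the casting step. The conversion can be sketched in plain Python as follows. The helper names are hypothetical, and reading 9999-12-31 as an open-ended validity end is an assumption based on the sample rows, not something the source states.

```python
from datetime import date, datetime

# Assumed sentinel: the sample rows use 9999-12-31 as valid_to.
OPEN_ENDED = date(9999, 12, 31)


def parse_sap_date(raw):
    """Convert a YYYYMMDD integer to a date; a missing value stays None."""
    if raw is None:
        return None
    return datetime.strptime(str(raw), "%Y%m%d").date()


def is_open_ended(valid_to):
    """True when the validity end equals the assumed open-ended sentinel."""
    return valid_to == OPEN_ENDED


valid_from = parse_sap_date(20020101)
valid_to = parse_sap_date(99991231)
last_changed = parse_sap_date(20030507)
```

Zero or otherwise malformed dates are not handled by this sketch and would raise a ValueError.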

stg_sap_lfa1_data (first 100 rows)

vendor_number client country_key name1 ort01 district region sort_field street_address address_number mcod1 mcod3 title industry_key cash_management_indicator created_by vendor_account_group language_key birth_date revision_indicator data_transfer_status tax_jurisdiction_code tax_base_amount qm_system_validity_date reference_group_date risk_indicator risk_notification_date registered_capital staging_time row_id is_deleted alternative_payee_account alternative_payee_account_indicator alternative_payee_allowed birth_place creation_date dunning_level export_indicator foreign_representative gender_indicator international_location_number ipi_taxpayer last_update_date last_update_time legal_nature location_number_check_digit natural_person_indicator one_time_account_indicator payment_block_indicator plant_calendar_key po_box po_box_city po_box_postal_code po_box_postal_code_2 post_office_branch postal_code profession risk_notification_email street_2 street_3 street_4 street_abbreviation telebox_number teletex_number telex_number train_station
0 EWM_3001 800 US Willy Wonka Chocolate Factory ITASCA DUPAGE IL EWM 1445 West Norwood Avenue 64202 WILLY WONKA CHOCOLATE FACTORY ITASCA Company None 0 C5093610 VEND E 0 0 X 3.304720e+09 0 0 0 0 0 0.0 0 1 False None None None None 2007-11-27 None None None None 0 None 0 0 0 0 None None None None None None None None None 11223 None None None None None None None None None None
1 EWM_3003 800 US Nakatomi Plaza LOS ANGELES CENTURY CITY CA EWM 2121 Avenue of the Stars 64203 NAKATOMI PLAZA LOS ANGELES Company None 0 C5093610 VEND E 0 0 X 1.403131e+09 0 0 0 0 0 0.0 0 2 False None None None None 2007-11-27 None None None None 0 None 0 0 0 0 None None None None None None None None None 60154 None None None None None None None None None None
2 EXTERN 800 US Initech AUSTIN AUSTIN TX FSC120 4120 Freidrich Lane 46098 INITECH AUSTIN Firma TRAD 0 D036964 VEND D 0 0 X NaN 0 0 0 0 0 0.0 0 3 False None None None None 2004-03-24 None None None None 0 None 0 0 0 0 None None None None None None None None None 73301 None None None None None None None None None None
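
The creation_date values in the preview above (for example 2007-11-27) start out in the source table as integers in YYYYMMDD form, which is why the cleaning query for this model parses them with the format string '%Y%m%d'. A minimal Python sketch of that conversion follows. The function name parse_yyyymmdd is hypothetical, and treating 0 as a missing date is an assumption made for illustration, not something the generated pipeline is known to do.

```python
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(raw: Optional[int]) -> Optional[date]:
    """Convert an integer such as 20071127 into a date.

    Assumption (not taken from the pipeline): 0 and None both mean
    "no date recorded" and are returned as None.
    """
    if raw is None or raw == 0:
        return None
    # Same format string that the cleaning query passes to strptime.
    return datetime.strptime(str(raw), "%Y%m%d").date()


print(parse_yyyymmdd(20071127))
print(parse_yyyymmdd(0))
```

Several date-like columns in the preview (birth_date, qm_system_validity_date) hold 0, so any real conversion of those columns would need an explicit rule for that value.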

stg_sap_lfa1_data.sql (clean the table)
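
When reading the cleaning step of this query, it helps to recall how SQL string literals are quoted: inside a literal, two consecutive single quotes stand for one quote character. The literal 'X' is therefore the one-character value X, while '''X''' is the three-character value 'X' with the quotes included, and a comparison against the latter does not match a bare X. The sketch below demonstrates this with Python's built-in sqlite3 module. SQLite is an assumption chosen only because it ships with Python, and the helper names literal_value and clean_status are hypothetical.

```python
import sqlite3


def literal_value(literal_sql: str) -> str:
    """Return the string value that a SQL literal evaluates to."""
    conn = sqlite3.connect(":memory:")
    try:
        # literal_sql is interpolated verbatim on purpose: the point is to
        # compare literal spellings. Never do this with untrusted input.
        return conn.execute(f"SELECT {literal_sql}").fetchone()[0]
    finally:
        conn.close()


def clean_status(value: str, literal_sql: str) -> str:
    """Blank out `value` when it equals the given SQL literal."""
    conn = sqlite3.connect(":memory:")
    try:
        query = f"SELECT CASE WHEN ? = {literal_sql} THEN '' ELSE ? END"
        return conn.execute(query, (value, value)).fetchone()[0]
    finally:
        conn.close()


print(repr(literal_value("'X'")))        # the bare value X
print(repr(literal_value("'''X'''")))    # X wrapped in quote characters
print(repr(clean_status("X", "'X'")))      # matches, so it is blanked
print(repr(clean_status("X", "'''X'''")))  # no match, value passes through
```

Both clean_status calls receive the same input value; only the literal in the WHEN clause differs.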

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 05:02:19.038665+00:00
WITH 
"sap_lfa1_data_projected" AS (
    -- Projection: Selecting 141 out of 142 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "lifnr",
        "mandt",
        "land1",
        "name1",
        "name2",
        "name3",
        "name4",
        "ort01",
        "ort02",
        "pfach",
        "pstl2",
        "pstlz",
        "regio",
        "sortl",
        "stras",
        "adrnr",
        "mcod1",
        "mcod2",
        "mcod3",
        "anred",
        "bahns",
        "bbbnr",
        "bbsnr",
        "begru",
        "brsch",
        "bubkz",
        "datlt",
        "dtams",
        "dtaws",
        "erdat",
        "ernam",
        "esrnr",
        "konzs",
        "ktokk",
        "kunnr",
        "lnrza",
        "loevm",
        "sperr",
        "sperm",
        "spras",
        "stcd1",
        "stcd2",
        "stkza",
        "stkzu",
        "telbx",
        "telf1",
        "telf2",
        "telfx",
        "teltx",
        "telx1",
        "xcpdk",
        "xzemp",
        "vbund",
        "fiskn",
        "stceg",
        "stkzn",
        "sperq",
        "gbort",
        "gbdat",
        "sexkz",
        "kraus",
        "revdb",
        "qssys",
        "ktock",
        "pfort",
        "werks",
        "ltsna",
        "werkr",
        "plkal",
        "duefl",
        "txjcd",
        "sperz",
        "scacd",
        "sfrgr",
        "lzone",
        "xlfza",
        "dlgrp",
        "fityp",
        "stcdt",
        "regss",
        "actss",
        "stcd3",
        "stcd4",
        "stcd5",
        "ipisp",
        "taxbs",
        "profs",
        "stgdl",
        "emnfr",
        "lfurl",
        "j_1kfrepre",
        "j_1kftbus",
        "j_1kftind",
        "confs",
        "updat",
        "uptim",
        "nodel",
        "qssysdat",
        "podkzb",
        "fisku",
        "stenr",
        "carrier_conf",
        "min_comp",
        "term_li",
        "crc_num",
        "cvp_xblck",
        "rg",
        "exp",
        "uf",
        "rgdate",
        "ric",
        "rne",
        "rnedate",
        "cnae",
        "legalnat",
        "crtn",
        "icmstaxpay",
        "indtyp",
        "tdt",
        "comsize",
        "decregpc",
        "j_sc_capital",
        "j_sc_currency",
        "alc",
        "pmt_office",
        "ppa_relevant",
        "psofg",
        "psois",
        "pson1",
        "pson2",
        "pson3",
        "psovn",
        "psotl",
        "psohs",
        "psost",
        "transport_chain",
        "staging_time",
        "scheduling_type",
        "submi_relevant",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_lfa1_data"
),

"sap_lfa1_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- lifnr -> vendor_number
    -- mandt -> client
    -- land1 -> country_key
    -- name3 -> vendor_name_3
    -- name4 -> vendor_name_4
    -- ort02 -> district
    -- pfach -> po_box
    -- pstl2 -> po_box_postal_code_2
    -- pstlz -> postal_code
    -- regio -> region
    -- sortl -> sort_field
    -- stras -> street_address
    -- adrnr -> address_number
    -- anred -> title
    -- bahns -> train_station
    -- bbbnr -> international_location_number
    -- bbsnr -> location_number_check_digit
    -- begru -> authorization_group
    -- brsch -> industry_key
    -- bubkz -> cash_management_indicator
    -- datlt -> communication_line_type
    -- dtams -> data_medium_exchange_indicator
    -- dtaws -> data_medium_exchange_instruction
    -- erdat -> creation_date
    -- ernam -> created_by
    -- esrnr -> isr_subscriber_number
    -- konzs -> group_key
    -- ktokk -> vendor_account_group
    -- kunnr -> customer_number
    -- lnrza -> alternative_payee_account
    -- sperr -> central_block
    -- sperm -> permitted_functions
    -- spras -> language_key
    -- stcd2 -> tax_number_2
    -- stkzu -> alternative_payee_account_indicator
    -- telbx -> telebox_number
    -- telf1 -> telephone_number_1
    -- telf2 -> telephone_number_2
    -- telfx -> fax_number
    -- teltx -> teletex_number
    -- telx1 -> telex_number
    -- xcpdk -> one_time_account_indicator
    -- xzemp -> payment_block_indicator
    -- vbund -> trading_partner_company_id
    -- fiskn -> fiscal_number
    -- stceg -> vat_registration_number
    -- stkzn -> natural_person_indicator
    -- sperq -> blocked_functions
    -- gbort -> birth_place
    -- gbdat -> birth_date
    -- sexkz -> gender_indicator
    -- kraus -> credit_info_number
    -- revdb -> revision_indicator
    -- qssys -> quality_management_system
    -- ktock -> payment_tolerance_group
    -- pfort -> po_box_city
    -- werks -> plant
    -- werkr -> reference_plant
    -- plkal -> plant_calendar_key
    -- duefl -> data_transfer_status
    -- txjcd -> tax_jurisdiction_code
    -- scacd -> supply_chain_activity_code
    -- sfrgr -> group_status
    -- lzone -> transportation_zone
    -- dlgrp -> dunning_level
    -- fityp -> tax_type
    -- stcdt -> tax_number_type
    -- regss -> regional_subdivision
    -- actss -> active_sales_service_status
    -- stcd3 -> tax_number_3
    -- stcd4 -> tax_number_4
    -- stcd5 -> tax_number_5
    -- ipisp -> ipi_taxpayer
    -- taxbs -> tax_base_amount
    -- profs -> profession
    -- stgdl -> statistics_group
    -- emnfr -> external_manufacturer
    -- lfurl -> vendor_url
    -- j_1kfrepre -> foreign_representative
    -- j_1kftbus -> business_type
    -- j_1kftind -> industry_classification
    -- confs -> confirmation_status
    -- updat -> last_update_date
    -- uptim -> last_update_time
    -- qssysdat -> qm_system_validity_date
    -- podkzb -> po_box_postal_code
    -- stenr -> statistical_number
    -- carrier_conf -> carrier_confirmation
    -- min_comp -> minimum_comparison
    -- term_li -> delivery_terms
    -- crc_num -> crc_number
    -- rg -> reference_group
    -- exp -> export_indicator
    -- uf -> user_field
    -- rgdate -> reference_group_date
    -- ric -> risk_indicator
    -- rne -> risk_notification_email
    -- rnedate -> risk_notification_date
    -- cnae -> brazilian_industry_code
    -- legalnat -> legal_nature
    -- crtn -> tax_related_number
    -- icmstaxpay -> icms_taxpayer
    -- indtyp -> industry_type
    -- tdt -> transaction_datetime
    -- comsize -> company_size
    -- decregpc -> declaration_regimen
    -- j_sc_capital -> registered_capital
    -- j_sc_currency -> capital_currency
    -- alc -> alternative_payee_allowed
    -- pmt_office -> payment_office
    -- ppa_relevant -> pci_relevant
    -- psofg -> post_office_branch
    -- psois -> street
    -- pson1 -> street_2
    -- pson2 -> street_3
    -- pson3 -> street_4
    -- psovn -> house_number_supplement
    -- psotl -> street_number
    -- psohs -> house_number
    -- psost -> street_abbreviation
    -- submi_relevant -> submission_relevance
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "lifnr" AS "vendor_number",
        "mandt" AS "client",
        "land1" AS "country_key",
        "name1",
        "name2",
        "name3" AS "vendor_name_3",
        "name4" AS "vendor_name_4",
        "ort01",
        "ort02" AS "district",
        "pfach" AS "po_box",
        "pstl2" AS "po_box_postal_code_2",
        "pstlz" AS "postal_code",
        "regio" AS "region",
        "sortl" AS "sort_field",
        "stras" AS "street_address",
        "adrnr" AS "address_number",
        "mcod1",
        "mcod2",
        "mcod3",
        "anred" AS "title",
        "bahns" AS "train_station",
        "bbbnr" AS "international_location_number",
        "bbsnr" AS "location_number_check_digit",
        "begru" AS "authorization_group",
        "brsch" AS "industry_key",
        "bubkz" AS "cash_management_indicator",
        "datlt" AS "communication_line_type",
        "dtams" AS "data_medium_exchange_indicator",
        "dtaws" AS "data_medium_exchange_instruction",
        "erdat" AS "creation_date",
        "ernam" AS "created_by",
        "esrnr" AS "isr_subscriber_number",
        "konzs" AS "group_key",
        "ktokk" AS "vendor_account_group",
        "kunnr" AS "customer_number",
        "lnrza" AS "alternative_payee_account",
        "loevm",
        "sperr" AS "central_block",
        "sperm" AS "permitted_functions",
        "spras" AS "language_key",
        "stcd1",
        "stcd2" AS "tax_number_2",
        "stkza",
        "stkzu" AS "alternative_payee_account_indicator",
        "telbx" AS "telebox_number",
        "telf1" AS "telephone_number_1",
        "telf2" AS "telephone_number_2",
        "telfx" AS "fax_number",
        "teltx" AS "teletex_number",
        "telx1" AS "telex_number",
        "xcpdk" AS "one_time_account_indicator",
        "xzemp" AS "payment_block_indicator",
        "vbund" AS "trading_partner_company_id",
        "fiskn" AS "fiscal_number",
        "stceg" AS "vat_registration_number",
        "stkzn" AS "natural_person_indicator",
        "sperq" AS "blocked_functions",
        "gbort" AS "birth_place",
        "gbdat" AS "birth_date",
        "sexkz" AS "gender_indicator",
        "kraus" AS "credit_info_number",
        "revdb" AS "revision_indicator",
        "qssys" AS "quality_management_system",
        "ktock" AS "payment_tolerance_group",
        "pfort" AS "po_box_city",
        "werks" AS "plant",
        "ltsna",
        "werkr" AS "reference_plant",
        "plkal" AS "plant_calendar_key",
        "duefl" AS "data_transfer_status",
        "txjcd" AS "tax_jurisdiction_code",
        "sperz",
        "scacd" AS "supply_chain_activity_code",
        "sfrgr" AS "group_status",
        "lzone" AS "transportation_zone",
        "xlfza",
        "dlgrp" AS "dunning_level",
        "fityp" AS "tax_type",
        "stcdt" AS "tax_number_type",
        "regss" AS "regional_subdivision",
        "actss" AS "active_sales_service_status",
        "stcd3" AS "tax_number_3",
        "stcd4" AS "tax_number_4",
        "stcd5" AS "tax_number_5",
        "ipisp" AS "ipi_taxpayer",
        "taxbs" AS "tax_base_amount",
        "profs" AS "profession",
        "stgdl" AS "statistics_group",
        "emnfr" AS "external_manufacturer",
        "lfurl" AS "vendor_url",
        "j_1kfrepre" AS "foreign_representative",
        "j_1kftbus" AS "business_type",
        "j_1kftind" AS "industry_classification",
        "confs" AS "confirmation_status",
        "updat" AS "last_update_date",
        "uptim" AS "last_update_time",
        "nodel",
        "qssysdat" AS "qm_system_validity_date",
        "podkzb" AS "po_box_postal_code",
        "fisku",
        "stenr" AS "statistical_number",
        "carrier_conf" AS "carrier_confirmation",
        "min_comp" AS "minimum_comparison",
        "term_li" AS "delivery_terms",
        "crc_num" AS "crc_number",
        "cvp_xblck",
        "rg" AS "reference_group",
        "exp" AS "export_indicator",
        "uf" AS "user_field",
        "rgdate" AS "reference_group_date",
        "ric" AS "risk_indicator",
        "rne" AS "risk_notification_email",
        "rnedate" AS "risk_notification_date",
        "cnae" AS "brazilian_industry_code",
        "legalnat" AS "legal_nature",
        "crtn" AS "tax_related_number",
        "icmstaxpay" AS "icms_taxpayer",
        "indtyp" AS "industry_type",
        "tdt" AS "transaction_datetime",
        "comsize" AS "company_size",
        "decregpc" AS "declaration_regimen",
        "j_sc_capital" AS "registered_capital",
        "j_sc_currency" AS "capital_currency",
        "alc" AS "alternative_payee_allowed",
        "pmt_office" AS "payment_office",
        "ppa_relevant" AS "pci_relevant",
        "psofg" AS "post_office_branch",
        "psois" AS "street",
        "pson1" AS "street_2",
        "pson2" AS "street_3",
        "pson3" AS "street_4",
        "psovn" AS "house_number_supplement",
        "psotl" AS "street_number",
        "psohs" AS "house_number",
        "psost" AS "street_abbreviation",
        "transport_chain",
        "staging_time",
        "scheduling_type",
        "submi_relevant" AS "submission_relevance",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_lfa1_data_projected"
),

"sap_lfa1_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values: 
    -- data_transfer_status: 'X' is the only value present in this column. It does not describe an actual transfer state (such as 'Completed', 'In Progress', 'Failed', or 'Pending'), so it carries no usable information about the status of data transfers.
    SELECT
        "vendor_number",
        "client",
        "country_key",
        "name1",
        "name2",
        "vendor_name_3",
        "vendor_name_4",
        "ort01",
        "district",
        "po_box",
        "po_box_postal_code_2",
        "postal_code",
        "region",
        "sort_field",
        "street_address",
        "address_number",
        "mcod1",
        "mcod2",
        "mcod3",
        "title",
        "train_station",
        "international_location_number",
        "location_number_check_digit",
        "authorization_group",
        "industry_key",
        "cash_management_indicator",
        "communication_line_type",
        "data_medium_exchange_indicator",
        "data_medium_exchange_instruction",
        "creation_date",
        "created_by",
        "isr_subscriber_number",
        "group_key",
        "vendor_account_group",
        "customer_number",
        "alternative_payee_account",
        "loevm",
        "central_block",
        "permitted_functions",
        "language_key",
        "stcd1",
        "tax_number_2",
        "stkza",
        "alternative_payee_account_indicator",
        "telebox_number",
        "telephone_number_1",
        "telephone_number_2",
        "fax_number",
        "teletex_number",
        "telex_number",
        "one_time_account_indicator",
        "payment_block_indicator",
        "trading_partner_company_id",
        "fiscal_number",
        "vat_registration_number",
        "natural_person_indicator",
        "blocked_functions",
        "birth_place",
        "birth_date",
        "gender_indicator",
        "credit_info_number",
        "revision_indicator",
        "quality_management_system",
        "payment_tolerance_group",
        "po_box_city",
        "plant",
        "ltsna",
        "reference_plant",
        "plant_calendar_key",
        CASE
            WHEN "data_transfer_status" = 'X' THEN ''
            ELSE "data_transfer_status"
        END AS "data_transfer_status",
        "tax_jurisdiction_code",
        "sperz",
        "supply_chain_activity_code",
        "group_status",
        "transportation_zone",
        "xlfza",
        "dunning_level",
        "tax_type",
        "tax_number_type",
        "regional_subdivision",
        "active_sales_service_status",
        "tax_number_3",
        "tax_number_4",
        "tax_number_5",
        "ipi_taxpayer",
        "tax_base_amount",
        "profession",
        "statistics_group",
        "external_manufacturer",
        "vendor_url",
        "foreign_representative",
        "business_type",
        "industry_classification",
        "confirmation_status",
        "last_update_date",
        "last_update_time",
        "nodel",
        "qm_system_validity_date",
        "po_box_postal_code",
        "fisku",
        "statistical_number",
        "carrier_confirmation",
        "minimum_comparison",
        "delivery_terms",
        "crc_number",
        "cvp_xblck",
        "reference_group",
        "export_indicator",
        "user_field",
        "reference_group_date",
        "risk_indicator",
        "risk_notification_email",
        "risk_notification_date",
        "brazilian_industry_code",
        "legal_nature",
        "tax_related_number",
        "icms_taxpayer",
        "industry_type",
        "transaction_datetime",
        "company_size",
        "declaration_regimen",
        "registered_capital",
        "capital_currency",
        "alternative_payee_allowed",
        "payment_office",
        "pci_relevant",
        "post_office_branch",
        "street",
        "street_2",
        "street_3",
        "street_4",
        "house_number_supplement",
        "street_number",
        "house_number",
        "street_abbreviation",
        "transport_chain",
        "staging_time",
        "scheduling_type",
        "submission_relevance",
        "row_id",
        "is_deleted"
    FROM "sap_lfa1_data_projected_renamed"
),

"sap_lfa1_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- active_sales_service_status: from DECIMAL to VARCHAR
    -- alternative_payee_account: from DECIMAL to VARCHAR
    -- alternative_payee_account_indicator: from DECIMAL to VARCHAR
    -- alternative_payee_allowed: from DECIMAL to VARCHAR
    -- authorization_group: from DECIMAL to VARCHAR
    -- birth_place: from DECIMAL to VARCHAR
    -- blocked_functions: from DECIMAL to VARCHAR
    -- brazilian_industry_code: from DECIMAL to VARCHAR
    -- business_type: from DECIMAL to VARCHAR
    -- capital_currency: from DECIMAL to VARCHAR
    -- carrier_confirmation: from DECIMAL to VARCHAR
    -- central_block: from DECIMAL to VARCHAR
    -- communication_line_type: from DECIMAL to VARCHAR
    -- company_size: from DECIMAL to VARCHAR
    -- confirmation_status: from DECIMAL to VARCHAR
    -- crc_number: from DECIMAL to VARCHAR
    -- creation_date: from INT to DATE
    -- credit_info_number: from DECIMAL to VARCHAR
    -- customer_number: from DECIMAL to VARCHAR
    -- cvp_xblck: from DECIMAL to VARCHAR
    -- data_medium_exchange_indicator: from DECIMAL to VARCHAR
    -- data_medium_exchange_instruction: from DECIMAL to VARCHAR
    -- declaration_regimen: from DECIMAL to VARCHAR
    -- delivery_terms: from DECIMAL to VARCHAR
    -- dunning_level: from DECIMAL to VARCHAR
    -- export_indicator: from DECIMAL to VARCHAR
    -- external_manufacturer: from DECIMAL to VARCHAR
    -- fax_number: from DECIMAL to VARCHAR
    -- fiscal_number: from DECIMAL to VARCHAR
    -- fisku: from DECIMAL to VARCHAR
    -- foreign_representative: from DECIMAL to VARCHAR
    -- gender_indicator: from DECIMAL to VARCHAR
    -- group_key: from DECIMAL to VARCHAR
    -- group_status: from DECIMAL to VARCHAR
    -- house_number: from DECIMAL to VARCHAR
    -- house_number_supplement: from DECIMAL to VARCHAR
    -- icms_taxpayer: from DECIMAL to VARCHAR
    -- industry_classification: from DECIMAL to VARCHAR
    -- industry_type: from DECIMAL to VARCHAR
    -- international_location_number: from INT to VARCHAR
    -- ipi_taxpayer: from DECIMAL to VARCHAR
    -- isr_subscriber_number: from DECIMAL to VARCHAR
    -- last_update_date: from INT to VARCHAR
    -- last_update_time: from INT to VARCHAR
    -- legal_nature: from INT to VARCHAR
    -- location_number_check_digit: from INT to VARCHAR
    -- loevm: from DECIMAL to VARCHAR
    -- ltsna: from DECIMAL to VARCHAR
    -- mcod2: from DECIMAL to VARCHAR
    -- minimum_comparison: from DECIMAL to VARCHAR
    -- name2: from DECIMAL to VARCHAR
    -- natural_person_indicator: from DECIMAL to VARCHAR
    -- nodel: from DECIMAL to VARCHAR
    -- one_time_account_indicator: from DECIMAL to VARCHAR
    -- payment_block_indicator: from DECIMAL to VARCHAR
    -- payment_office: from DECIMAL to VARCHAR
    -- payment_tolerance_group: from DECIMAL to VARCHAR
    -- pci_relevant: from DECIMAL to VARCHAR
    -- permitted_functions: from DECIMAL to VARCHAR
    -- plant: from DECIMAL to VARCHAR
    -- plant_calendar_key: from DECIMAL to VARCHAR
    -- po_box: from DECIMAL to VARCHAR
    -- po_box_city: from DECIMAL to VARCHAR
    -- po_box_postal_code: from DECIMAL to VARCHAR
    -- po_box_postal_code_2: from DECIMAL to VARCHAR
    -- post_office_branch: from DECIMAL to VARCHAR
    -- postal_code: from INT to VARCHAR
    -- profession: from DECIMAL to VARCHAR
    -- quality_management_system: from DECIMAL to VARCHAR
    -- reference_group: from DECIMAL to VARCHAR
    -- reference_plant: from DECIMAL to VARCHAR
    -- regional_subdivision: from DECIMAL to VARCHAR
    -- risk_notification_email: from DECIMAL to VARCHAR
    -- scheduling_type: from DECIMAL to VARCHAR
    -- sperz: from DECIMAL to VARCHAR
    -- statistical_number: from DECIMAL to VARCHAR
    -- statistics_group: from DECIMAL to VARCHAR
    -- stcd1: from DECIMAL to VARCHAR
    -- stkza: from DECIMAL to VARCHAR
    -- street: from DECIMAL to VARCHAR
    -- street_2: from DECIMAL to VARCHAR
    -- street_3: from DECIMAL to VARCHAR
    -- street_4: from DECIMAL to VARCHAR
    -- street_abbreviation: from DECIMAL to VARCHAR
    -- street_number: from DECIMAL to VARCHAR
    -- submission_relevance: from DECIMAL to VARCHAR
    -- supply_chain_activity_code: from DECIMAL to VARCHAR
    -- tax_number_2: from DECIMAL to VARCHAR
    -- tax_number_3: from DECIMAL to VARCHAR
    -- tax_number_4: from DECIMAL to VARCHAR
    -- tax_number_5: from DECIMAL to VARCHAR
    -- tax_number_type: from DECIMAL to VARCHAR
    -- tax_related_number: from DECIMAL to VARCHAR
    -- tax_type: from DECIMAL to VARCHAR
    -- telebox_number: from DECIMAL to VARCHAR
    -- telephone_number_1: from DECIMAL to VARCHAR
    -- telephone_number_2: from DECIMAL to VARCHAR
    -- teletex_number: from DECIMAL to VARCHAR
    -- telex_number: from DECIMAL to VARCHAR
    -- trading_partner_company_id: from DECIMAL to VARCHAR
    -- train_station: from DECIMAL to VARCHAR
    -- transaction_datetime: from DECIMAL to TIMESTAMP
    -- transport_chain: from DECIMAL to VARCHAR
    -- transportation_zone: from DECIMAL to VARCHAR
    -- user_field: from DECIMAL to VARCHAR
    -- vat_registration_number: from DECIMAL to VARCHAR
    -- vendor_name_3: from DECIMAL to VARCHAR
    -- vendor_name_4: from DECIMAL to VARCHAR
    -- vendor_url: from DECIMAL to VARCHAR
    -- xlfza: from DECIMAL to VARCHAR
    SELECT
        "vendor_number",
        "client",
        "country_key",
        "name1",
        "ort01",
        "district",
        "region",
        "sort_field",
        "street_address",
        "address_number",
        "mcod1",
        "mcod3",
        "title",
        "industry_key",
        "cash_management_indicator",
        "created_by",
        "vendor_account_group",
        "language_key",
        "birth_date",
        "revision_indicator",
        "data_transfer_status",
        "tax_jurisdiction_code",
        "tax_base_amount",
        "qm_system_validity_date",
        "reference_group_date",
        "risk_indicator",
        "risk_notification_date",
        "registered_capital",
        "staging_time",
        "row_id",
        "is_deleted",
        CAST("active_sales_service_status" AS VARCHAR) AS "active_sales_service_status",
        CAST("alternative_payee_account" AS VARCHAR) AS "alternative_payee_account",
        CAST("alternative_payee_account_indicator" AS VARCHAR) AS "alternative_payee_account_indicator",
        CAST("alternative_payee_allowed" AS VARCHAR) AS "alternative_payee_allowed",
        CAST("authorization_group" AS VARCHAR) AS "authorization_group",
        CAST("birth_place" AS VARCHAR) AS "birth_place",
        CAST("blocked_functions" AS VARCHAR) AS "blocked_functions",
        CAST("brazilian_industry_code" AS VARCHAR) AS "brazilian_industry_code",
        CAST("business_type" AS VARCHAR) AS "business_type",
        CAST("capital_currency" AS VARCHAR) AS "capital_currency",
        CAST("carrier_confirmation" AS VARCHAR) AS "carrier_confirmation",
        CAST("central_block" AS VARCHAR) AS "central_block",
        CAST("communication_line_type" AS VARCHAR) AS "communication_line_type",
        CAST("company_size" AS VARCHAR) AS "company_size",
        CAST("confirmation_status" AS VARCHAR) AS "confirmation_status",
        CAST("crc_number" AS VARCHAR) AS "crc_number",
        CAST(strptime(CAST("creation_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "creation_date",
        CAST("credit_info_number" AS VARCHAR) AS "credit_info_number",
        CAST("customer_number" AS VARCHAR) AS "customer_number",
        CAST("cvp_xblck" AS VARCHAR) AS "cvp_xblck",
        CAST("data_medium_exchange_indicator" AS VARCHAR) AS "data_medium_exchange_indicator",
        CAST("data_medium_exchange_instruction" AS VARCHAR) AS "data_medium_exchange_instruction",
        CAST("declaration_regimen" AS VARCHAR) AS "declaration_regimen",
        CAST("delivery_terms" AS VARCHAR) AS "delivery_terms",
        CAST("dunning_level" AS VARCHAR) AS "dunning_level",
        CAST("export_indicator" AS VARCHAR) AS "export_indicator",
        CAST("external_manufacturer" AS VARCHAR) AS "external_manufacturer",
        CAST("fax_number" AS VARCHAR) AS "fax_number",
        CAST("fiscal_number" AS VARCHAR) AS "fiscal_number",
        CAST("fisku" AS VARCHAR) AS "fisku",
        CAST("foreign_representative" AS VARCHAR) AS "foreign_representative",
        CAST("gender_indicator" AS VARCHAR) AS "gender_indicator",
        CAST("group_key" AS VARCHAR) AS "group_key",
        CAST("group_status" AS VARCHAR) AS "group_status",
        CAST("house_number" AS VARCHAR) AS "house_number",
        CAST("house_number_supplement" AS VARCHAR) AS "house_number_supplement",
        CAST("icms_taxpayer" AS VARCHAR) AS "icms_taxpayer",
        CAST("industry_classification" AS VARCHAR) AS "industry_classification",
        CAST("industry_type" AS VARCHAR) AS "industry_type",
        CAST("international_location_number" AS VARCHAR) AS "international_location_number",
        CAST("ipi_taxpayer" AS VARCHAR) AS "ipi_taxpayer",
        CAST("isr_subscriber_number" AS VARCHAR) AS "isr_subscriber_number",
        CAST("last_update_date" AS VARCHAR) AS "last_update_date",
        CAST("last_update_time" AS VARCHAR) AS "last_update_time",
        CAST("legal_nature" AS VARCHAR) AS "legal_nature",
        CAST("location_number_check_digit" AS VARCHAR) AS "location_number_check_digit",
        CAST("loevm" AS VARCHAR) AS "loevm",
        CAST("ltsna" AS VARCHAR) AS "ltsna",
        CAST("mcod2" AS VARCHAR) AS "mcod2",
        CAST("minimum_comparison" AS VARCHAR) AS "minimum_comparison",
        CAST("name2" AS VARCHAR) AS "name2",
        CAST("natural_person_indicator" AS VARCHAR) AS "natural_person_indicator",
        CAST("nodel" AS VARCHAR) AS "nodel",
        CAST("one_time_account_indicator" AS VARCHAR) AS "one_time_account_indicator",
        CAST("payment_block_indicator" AS VARCHAR) AS "payment_block_indicator",
        CAST("payment_office" AS VARCHAR) AS "payment_office",
        CAST("payment_tolerance_group" AS VARCHAR) AS "payment_tolerance_group",
        CAST("pci_relevant" AS VARCHAR) AS "pci_relevant",
        CAST("permitted_functions" AS VARCHAR) AS "permitted_functions",
        CAST("plant" AS VARCHAR) AS "plant",
        CAST("plant_calendar_key" AS VARCHAR) AS "plant_calendar_key",
        CAST("po_box" AS VARCHAR) AS "po_box",
        CAST("po_box_city" AS VARCHAR) AS "po_box_city",
        CAST("po_box_postal_code" AS VARCHAR) AS "po_box_postal_code",
        CAST("po_box_postal_code_2" AS VARCHAR) AS "po_box_postal_code_2",
        CAST("post_office_branch" AS VARCHAR) AS "post_office_branch",
        CAST("postal_code" AS VARCHAR) AS "postal_code",
        CAST("profession" AS VARCHAR) AS "profession",
        CAST("quality_management_system" AS VARCHAR) AS "quality_management_system",
        CAST("reference_group" AS VARCHAR) AS "reference_group",
        CAST("reference_plant" AS VARCHAR) AS "reference_plant",
        CAST("regional_subdivision" AS VARCHAR) AS "regional_subdivision",
        CAST("risk_notification_email" AS VARCHAR) AS "risk_notification_email",
        CAST("scheduling_type" AS VARCHAR) AS "scheduling_type",
        CAST("sperz" AS VARCHAR) AS "sperz",
        CAST("statistical_number" AS VARCHAR) AS "statistical_number",
        CAST("statistics_group" AS VARCHAR) AS "statistics_group",
        CAST("stcd1" AS VARCHAR) AS "stcd1",
        CAST("stkza" AS VARCHAR) AS "stkza",
        CAST("street" AS VARCHAR) AS "street",
        CAST("street_2" AS VARCHAR) AS "street_2",
        CAST("street_3" AS VARCHAR) AS "street_3",
        CAST("street_4" AS VARCHAR) AS "street_4",
        CAST("street_abbreviation" AS VARCHAR) AS "street_abbreviation",
        CAST("street_number" AS VARCHAR) AS "street_number",
        CAST("submission_relevance" AS VARCHAR) AS "submission_relevance",
        CAST("supply_chain_activity_code" AS VARCHAR) AS "supply_chain_activity_code",
        CAST("tax_number_2" AS VARCHAR) AS "tax_number_2",
        CAST("tax_number_3" AS VARCHAR) AS "tax_number_3",
        CAST("tax_number_4" AS VARCHAR) AS "tax_number_4",
        CAST("tax_number_5" AS VARCHAR) AS "tax_number_5",
        CAST("tax_number_type" AS VARCHAR) AS "tax_number_type",
        CAST("tax_related_number" AS VARCHAR) AS "tax_related_number",
        CAST("tax_type" AS VARCHAR) AS "tax_type",
        CAST("telebox_number" AS VARCHAR) AS "telebox_number",
        CAST("telephone_number_1" AS VARCHAR) AS "telephone_number_1",
        CAST("telephone_number_2" AS VARCHAR) AS "telephone_number_2",
        CAST("teletex_number" AS VARCHAR) AS "teletex_number",
        CAST("telex_number" AS VARCHAR) AS "telex_number",
        CAST("trading_partner_company_id" AS VARCHAR) AS "trading_partner_company_id",
        CAST("train_station" AS VARCHAR) AS "train_station",
        CAST("transaction_datetime" AS TIMESTAMP) AS "transaction_datetime",
        CAST("transport_chain" AS VARCHAR) AS "transport_chain",
        CAST("transportation_zone" AS VARCHAR) AS "transportation_zone",
        CAST("user_field" AS VARCHAR) AS "user_field",
        CAST("vat_registration_number" AS VARCHAR) AS "vat_registration_number",
        CAST("vendor_name_3" AS VARCHAR) AS "vendor_name_3",
        CAST("vendor_name_4" AS VARCHAR) AS "vendor_name_4",
        CAST("vendor_url" AS VARCHAR) AS "vendor_url",
        CAST("xlfza" AS VARCHAR) AS "xlfza"
    FROM "sap_lfa1_data_projected_renamed_cleaned"
),

"sap_lfa1_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 77 columns with unacceptable missing values
    -- active_sales_service_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- authorization_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- blocked_functions has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- brazilian_industry_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- capital_currency has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- carrier_confirmation has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- central_block has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- communication_line_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- company_size has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- confirmation_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- crc_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- credit_info_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cvp_xblck has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- data_medium_exchange_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- data_medium_exchange_instruction has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- declaration_regimen has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- delivery_terms has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- external_manufacturer has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fax_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fiscal_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fisku has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- house_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- house_number_supplement has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- icms_taxpayer has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_classification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_key has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- industry_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- isr_subscriber_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- loevm has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ltsna has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- mcod2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- minimum_comparison has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- name2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- nodel has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payment_office has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payment_tolerance_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pci_relevant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- permitted_functions has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- plant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- quality_management_system has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_plant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- regional_subdivision has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- scheduling_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sperz has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statistical_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statistics_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- stcd1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- stkza has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- street has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- street_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- submission_relevance has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- supply_chain_activity_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_jurisdiction_code has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- tax_number_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_related_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- telephone_number_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- telephone_number_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- trading_partner_company_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transaction_datetime has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transport_chain has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transportation_zone has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- user_field has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vat_registration_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_name_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_name_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_url has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xlfza has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "vendor_number",
        "client",
        "country_key",
        "name1",
        "ort01",
        "district",
        "region",
        "sort_field",
        "street_address",
        "address_number",
        "mcod1",
        "mcod3",
        "title",
        "industry_key",
        "cash_management_indicator",
        "created_by",
        "vendor_account_group",
        "language_key",
        "birth_date",
        "revision_indicator",
        "data_transfer_status",
        "tax_jurisdiction_code",
        "tax_base_amount",
        "qm_system_validity_date",
        "reference_group_date",
        "risk_indicator",
        "risk_notification_date",
        "registered_capital",
        "staging_time",
        "row_id",
        "is_deleted",
        "alternative_payee_account",
        "alternative_payee_account_indicator",
        "alternative_payee_allowed",
        "birth_place",
        "creation_date",
        "dunning_level",
        "export_indicator",
        "foreign_representative",
        "gender_indicator",
        "international_location_number",
        "ipi_taxpayer",
        "last_update_date",
        "last_update_time",
        "legal_nature",
        "location_number_check_digit",
        "natural_person_indicator",
        "one_time_account_indicator",
        "payment_block_indicator",
        "plant_calendar_key",
        "po_box",
        "po_box_city",
        "po_box_postal_code",
        "po_box_postal_code_2",
        "post_office_branch",
        "postal_code",
        "profession",
        "risk_notification_email",
        "street_2",
        "street_3",
        "street_4",
        "street_abbreviation",
        "telebox_number",
        "teletex_number",
        "telex_number",
        "train_station"
    FROM "sap_lfa1_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_lfa1_data_projected_renamed_cleaned_casted_missing_handled"
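
The missing-value step above keeps or drops each column according to its share of missing values. A minimal Python sketch of that decision follows. The function names, the 100 percent drop threshold, and the sample rows are assumptions made for illustration; this is not Cocoon's actual implementation.

```python
from typing import Any, Dict, List


def missing_percent(rows: List[Dict[str, Any]], column: str) -> float:
    """Return the percentage of rows where `column` is None, rounded to 2 places."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return round(100.0 * missing / len(rows), 2)


def choose_strategy(percent: float, drop_threshold: float = 100.0) -> str:
    """Hypothetical rule: drop only columns that are entirely missing."""
    return "Drop Column" if percent >= drop_threshold else "Unchanged"


# Three made-up vendor rows; None stands for a missing value.
rows = [
    {"vendor_number": "1000", "industry_key": "TRAD", "fax_number": None},
    {"vendor_number": "1001", "industry_key": None, "fax_number": None},
    {"vendor_number": "1002", "industry_key": None, "fax_number": None},
]

# Missing percentage and resulting strategy for every column.
report = {col: missing_percent(rows, col) for col in rows[0]}
strategies = {col: choose_strategy(pct) for col, pct in report.items()}
```

On these sample rows, `industry_key` is partly missing and stays in the model, while `fax_number` is entirely missing and would be dropped, mirroring the two strategies listed in the SQL comments.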

stg_sap_lfa1_data.yml (Document the table)

version: 2
models:
- name: stg_sap_lfa1_data
  description: Vendor master data from SAP table LFA1, one row per vendor. It holds
    the vendor number, name, address, contact details, and classification. Key fields
    include LIFNR (vendor number), NAME1 (vendor name), LAND1 (country key), and KTOKK
    (vendor account group). The table stores both general and SAP-specific vendor
    attributes used in procurement and financial processes.
  columns:
  - name: vendor_number
    description: Vendor number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each vendor. For
        this table, each row represents a distinct vendor, and the vendor_number is
        likely to be unique across rows.
  - name: client
    description: Client
    tests:
    - not_null
  - name: country_key
    description: Country key
    tests:
    - not_null
  - name: name1
    description: Vendor name
    tests:
    - not_null
  - name: ort01
    description: City
    tests:
    - not_null
  - name: district
    description: District of the vendor
    tests:
    - not_null
  - name: region
    description: Region (state, province, county)
    tests:
    - not_null
    - accepted_values:
        values:
        - AL
        - AK
        - AZ
        - AR
        - CA
        - CO
        - CT
        - DE
        - FL
        - GA
        - HI
        - ID
        - IL
        - IN
        - IA
        - KS
        - KY
        - LA
        - ME
        - MD
        - MA
        - MI
        - MN
        - MS
        - MO
        - MT
        - NE
        - NV
        - NH
        - NJ
        - NM
        - NY
        - NC
        - ND
        - OH
        - OK
        - OR
        - PA
        - RI
        - SC
        - SD
        - TN
        - TX
        - UT
        - VT
        - VA
        - WA
        - WV
        - WI
        - WY
  - name: sort_field
    description: Sort field
    tests:
    - not_null
  - name: street_address
    description: Street and house number
    tests:
    - not_null
  - name: address_number
    description: Address number
    tests:
    - not_null
  - name: mcod1
    description: Search term for matchcode search 1
    tests:
    - not_null
  - name: mcod3
    description: Search term for matchcode search 3
    tests:
    - not_null
  - name: title
    description: Title or form of address
    tests:
    - not_null
    - accepted_values:
        values:
        - Company
        - Firma
        - Corporation
        - Inc.
        - LLC
        - Ltd.
        - GmbH
        - AG
        - SA
        - Pty Ltd
        - PLC
        - LLP
        - SARL
        - BV
  - name: industry_key
    description: Industry key
    tests:
    - not_null
    - accepted_values:
        values:
        - TRAD
        - TECH
        - FIN
        - HEALTH
        - MANUF
        - RETAIL
        - ENERGY
        - AGRI
        - CONST
        - TRANS
        - EDU
        - MEDIA
        - HOSP
        - REAL
        - GOVT
  - name: cash_management_indicator
    description: Cash management indicator
    tests:
    - not_null
  - name: created_by
    description: Name of person who created the object
    tests:
    - not_null
  - name: vendor_account_group
    description: Vendor account group
    tests:
    - not_null
  - name: language_key
    description: Language key
    tests:
    - not_null
    - accepted_values:
        values:
        - E
        - D
        - F
        - S
        - I
        - P
        - R
        - C
        - J
        - A
        - H
        - K
        - N
        - G
        - T
        - W
        - L
        - Z
        - M
        - U
  - name: birth_date
    description: Date of birth
    tests:
    - not_null
  - name: revision_indicator
    description: Revision database indicator
    tests:
    - not_null
  - name: data_transfer_status
    description: Status of data transfer
    tests:
    - not_null
    - accepted_values:
        values:
        - Pending
        - In Progress
        - Completed
        - Failed
        - Paused
        - Cancelled
        - Queued
        - Retrying
        - Partial
        - Unknown
        - X
  - name: tax_jurisdiction_code
    description: Tax jurisdiction code
    tests:
    - not_null
  - name: tax_base_amount
    description: Tax base amount
    tests:
    - not_null
  - name: qm_system_validity_date
    description: QM system validity date
    tests:
    - not_null
  - name: reference_group_date
    description: Reference group date
    tests:
    - not_null
  - name: risk_indicator
    description: Risk indicator
    tests:
    - not_null
  - name: risk_notification_date
    description: Risk notification email date
    tests:
    - not_null
  - name: registered_capital
    description: Registered capital of the vendor
    tests:
    - not_null
  - name: staging_time
    description: Time for staging
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a unique identifier for each row in the
        table. For this table, each row represents a unique vendor. The row_id is
        likely to be unique across all rows, as it's a common practice to use such
        identifiers to uniquely identify records in a database.
  - name: is_deleted
    description: Indicates if the record was deleted
    tests:
    - not_null
  - name: alternative_payee_account
    description: Account number of the alternative payee
    cocoon_meta:
      missing_acceptable: Not applicable if no alternative payee is used.
  - name: alternative_payee_account_indicator
    description: 'Indicator: alternative payee using account number'
    cocoon_meta:
      missing_acceptable: Not applicable if no alternative payee is used.
  - name: alternative_payee_allowed
    description: Alternative payee allowed
    cocoon_meta:
      missing_acceptable: Not applicable if alternative payees are not considered.
  - name: birth_place
    description: Place of birth
    cocoon_meta:
      missing_acceptable: Not applicable for non-individual (company) records.
  - name: creation_date
    description: Date on which the record was created
    tests:
    - not_null
  - name: dunning_level
    description: Dunning level
    cocoon_meta:
      missing_acceptable: Not applicable if no overdue payments exist.
  - name: export_indicator
    description: Export indicator
    cocoon_meta:
      missing_acceptable: Not applicable for businesses not involved in exports.
  - name: foreign_representative
    description: Representative of foreign company
    cocoon_meta:
      missing_acceptable: Not applicable for domestic businesses without foreign representation.
  - name: gender_indicator
    description: Gender indicator
    cocoon_meta:
      missing_acceptable: Not applicable for non-individual (company) records.
  - name: international_location_number
    description: International location number
    tests:
    - not_null
  - name: ipi_taxpayer
    description: IPI taxpayer indicator
    cocoon_meta:
      missing_acceptable: Not applicable for businesses not subject to IPI tax.
  - name: last_update_date
    description: Date of last update
    tests:
    - not_null
  - name: last_update_time
    description: Time of last update
    tests:
    - not_null
  - name: legal_nature
    description: Legal nature
    tests:
    - not_null
    - accepted_values:
        values:
        - Individual
        - Corporation
        - Partnership
        - Limited Liability Company (LLC)
        - Non-Profit Organization
        - Sole Proprietorship
        - Cooperative
        - Trust
        - Association
        - Joint Venture
        - Public Limited Company (PLC)
        - Limited Partnership (LP)
        - Limited Liability Partnership (LLP)
        - Government Entity
        - Foundation
        - '0'
  - name: location_number_check_digit
    description: Check digit for international location number
    tests:
    - not_null
    - accepted_values:
        values:
        - '0'
        - '1'
        - '2'
        - '3'
        - '4'
        - '5'
        - '6'
        - '7'
        - '8'
        - '9'
  - name: natural_person_indicator
    description: Natural person indicator
    cocoon_meta:
      missing_acceptable: Not applicable for business entities
  - name: one_time_account_indicator
    description: One-time account indicator
    cocoon_meta:
      missing_acceptable: Not applicable for regular, recurring accounts
  - name: payment_block_indicator
    description: Payment block indicator
    cocoon_meta:
      missing_acceptable: Not applicable if payments are allowed
  - name: plant_calendar_key
    description: Plant calendar key
    cocoon_meta:
      missing_acceptable: Not applicable if no specific plant calendar exists
  - name: po_box
    description: PO Box number
    cocoon_meta:
      missing_acceptable: Not all addresses have PO boxes
  - name: po_box_city
    description: PO Box city
    cocoon_meta:
      missing_acceptable: Not applicable if no PO box exists
  - name: po_box_postal_code
    description: PO Box postal code
    cocoon_meta:
      missing_acceptable: Not applicable if no PO box exists
  - name: po_box_postal_code_2
    description: P.O. Box postal code
    cocoon_meta:
      missing_acceptable: Not applicable if no PO box or extended zip code
  - name: post_office_branch
    description: Post office branch
    cocoon_meta:
      missing_acceptable: Not all addresses have specific post office branches
  - name: postal_code
    description: Postal code
    tests:
    - not_null
  - name: profession
    description: Profession (free text)
    cocoon_meta:
      missing_acceptable: Not applicable for business entities
  - name: risk_notification_email
    description: Risk notification email
    cocoon_meta:
      missing_acceptable: Not applicable if no risk notifications are set up
  - name: street_2
    description: Street 2
    cocoon_meta:
      missing_acceptable: Additional street information may not be needed for all
        addresses.
  - name: street_3
    description: Street 3
    cocoon_meta:
      missing_acceptable: Additional street information may not be needed for all
        addresses.
  - name: street_4
    description: Street 4
    cocoon_meta:
      missing_acceptable: Additional street information may not be needed for all
        addresses.
  - name: street_abbreviation
    description: Street abbreviation
    cocoon_meta:
      missing_acceptable: Not all streets have commonly used abbreviations.
  - name: telebox_number
    description: Telebox number
    cocoon_meta:
      missing_acceptable: Not all vendors may use telebox services.
  - name: teletex_number
    description: Teletex number
    cocoon_meta:
      missing_acceptable: Outdated technology, likely not used by most vendors.
  - name: telex_number
    description: Telex number
    cocoon_meta:
      missing_acceptable: Outdated technology, likely not used by most vendors.
  - name: train_station
    description: Train station
    cocoon_meta:
      missing_acceptable: Not all vendor locations are near train stations.
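
The `tests` entries in the YAML above (`not_null`, `unique`, `accepted_values`) are dbt generic tests, each of which passes when no row violates its rule. The sketch below mimics those semantics in plain Python on made-up rows. It only illustrates the rules, since dbt itself compiles each test to a SQL query, and every helper name here is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def not_null_failures(rows: List[Row], column: str) -> List[Row]:
    """Rows where the column is missing."""
    return [r for r in rows if r.get(column) is None]


def unique_failures(rows: List[Row], column: str) -> List[Any]:
    """Non-null values that occur more than once."""
    counts = Counter(r[column] for r in rows if r.get(column) is not None)
    return [value for value, n in counts.items() if n > 1]


def accepted_values_failures(
    rows: List[Row], column: str, values: Iterable[Any]
) -> List[Any]:
    """Distinct non-null values that fall outside the accepted set."""
    allowed = set(values)
    found = {r[column] for r in rows if r.get(column) is not None}
    return sorted(found - allowed)


# Made-up rows: a duplicate vendor_number, an unexpected language_key,
# and two missing industry_key values.
rows = [
    {"vendor_number": "1000", "language_key": "E", "industry_key": "TRAD"},
    {"vendor_number": "1001", "language_key": "D", "industry_key": None},
    {"vendor_number": "1001", "language_key": "Q", "industry_key": None},
]

null_rows = not_null_failures(rows, "industry_key")
duplicates = unique_failures(rows, "vendor_number")
bad_languages = accepted_values_failures(rows, "language_key", ["E", "D", "F"])
```

A test passes when its failure list is empty. Note that `industry_key` and `tax_jurisdiction_code` were left unchanged despite missing values in the profiled sample, so their `not_null` tests would be expected to report failures on that data.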

stg_sap_pa0008_data (first 100 rows)

sequence_number subtype uname wage_type wage_area pay_scale_group pay_scale_level pay_scale_type waers comparison_pay_scale_group comparison_pay_scale_level comparison_collective_agreement bsgrd dividend_percentage annual_salary lga01 wage_component_1 wage_component_01 lga02 wage_component_2 wage_component_02 wage_component_3 wage_component_03 wage_component_4 wage_component_04 wage_component_5 wage_component_05 wage_component_6 wage_component_06 wage_component_7 wage_component_07 wage_component_8 wage_component_08 wage_component_9 wage_component_09 bet10 anz10 bet11 anz11 bet12 anz12 bet13 anz13 bet14 anz14 salary_component_15 wage_component_15 salary_component_16 wage_component_16 salary_component_17 wage_component_17 salary_component_18 anz18 salary_component_19 anz19 salary_component_20 anz20 salary_component_21 anz21 salary_component_22 anz22 salary_component_23 anz23 salary_component_24 anz24 salary_component_25 anz25 salary_component_26 unknown_value_26 salary_component_27 unknown_value_27 salary_component_28 unknown_value_28 salary_component_29 unknown_value_29 salary_component_30 unknown_value_30 salary_component_31 unknown_value_31 salary_component_32 unknown_value_32 salary_component_33 unknown_value_33 salary_component_34 unknown_value_34 salary_component_35 unknown_value_35 salary_component_36 unknown_value_36 salary_component_37 unknown_value_37 salary_component_38 unknown_value_38 salary_component_39 unknown_value_39 salary_component_40 unknown_value_40 indicator_1 indicator_2 salary_currency currency_indicator row_id is_deleted adjustment_reason allowance allowances begda benefits benefits_eligible client_code commission commission_rate contract_end_date endda historical_indicator pernr salary_record_date salary_review_date
0 0 0 I026759 1 2 AGE 55 0 JPY AGE 55.0 0 0.0 0.00 0.0 M000 0.0 0.0 M001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 I I JPY T 5969 False None None None 1975-04-01 None None 800 None None NaT 9999-12-31 None 22314 2014-09-19 NaT
1 0 0 C5174732 1 1 GRD10 3 0 USD None NaN 0 100.0 86.67 0.0 1002 0.0 0.0 None 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 I None USD T 1855 False None None None 1991-12-15 None None 800 None None NaT 9999-12-31 None 80052 2012-10-29 NaT
2 0 0 C5174732 1 1 GRD10 1 0 USD None NaN 0 100.0 86.67 0.0 1002 0.0 0.0 None 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 I None USD T 1856 False None None None 1992-03-15 None None 800 None None NaT 9999-12-31 None 80053 2012-10-29 NaT

stg_sap_pa0008_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 14:44:00.378541+00:00
WITH 
"sap_pa0008_data_projected" AS (
    -- Projection: Selecting 286 out of 287 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "begda",
        "endda",
        "mandt",
        "objps",
        "pernr",
        "seqnr",
        "sprps",
        "subty",
        "aedtm",
        "uname",
        "histo",
        "itxex",
        "refex",
        "ordex",
        "itbld",
        "preas",
        "flag1",
        "flag2",
        "flag3",
        "flag4",
        "rese1",
        "rese2",
        "grpvl",
        "trfar",
        "trfgb",
        "trfgr",
        "trfst",
        "stvor",
        "orzst",
        "partn",
        "waers",
        "vglta",
        "vglgb",
        "vglgr",
        "vglst",
        "vglsv",
        "bsgrd",
        "divgv",
        "ansal",
        "falgk",
        "falgr",
        "lga01",
        "bet01",
        "anz01",
        "ein01",
        "opk01",
        "lga02",
        "bet02",
        "anz02",
        "ein02",
        "opk02",
        "lga03",
        "bet03",
        "anz03",
        "ein03",
        "opk03",
        "lga04",
        "bet04",
        "anz04",
        "ein04",
        "opk04",
        "lga05",
        "bet05",
        "anz05",
        "ein05",
        "opk05",
        "lga06",
        "bet06",
        "anz06",
        "ein06",
        "opk06",
        "lga07",
        "bet07",
        "anz07",
        "ein07",
        "opk07",
        "lga08",
        "bet08",
        "anz08",
        "ein08",
        "opk08",
        "lga09",
        "bet09",
        "anz09",
        "ein09",
        "opk09",
        "lga10",
        "bet10",
        "anz10",
        "ein10",
        "opk10",
        "lga11",
        "bet11",
        "anz11",
        "ein11",
        "opk11",
        "lga12",
        "bet12",
        "anz12",
        "ein12",
        "opk12",
        "lga13",
        "bet13",
        "anz13",
        "ein13",
        "opk13",
        "lga14",
        "bet14",
        "anz14",
        "ein14",
        "opk14",
        "lga15",
        "bet15",
        "anz15",
        "ein15",
        "opk15",
        "lga16",
        "bet16",
        "anz16",
        "ein16",
        "opk16",
        "lga17",
        "bet17",
        "anz17",
        "ein17",
        "opk17",
        "lga18",
        "bet18",
        "anz18",
        "ein18",
        "opk18",
        "lga19",
        "bet19",
        "anz19",
        "ein19",
        "opk19",
        "lga20",
        "bet20",
        "anz20",
        "ein20",
        "opk20",
        "lga21",
        "bet21",
        "anz21",
        "ein21",
        "opk21",
        "lga22",
        "bet22",
        "anz22",
        "ein22",
        "opk22",
        "lga23",
        "bet23",
        "anz23",
        "ein23",
        "opk23",
        "lga24",
        "bet24",
        "anz24",
        "ein24",
        "opk24",
        "lga25",
        "bet25",
        "anz25",
        "ein25",
        "opk25",
        "lga26",
        "bet26",
        "anz26",
        "ein26",
        "opk26",
        "lga27",
        "bet27",
        "anz27",
        "ein27",
        "opk27",
        "lga28",
        "bet28",
        "anz28",
        "ein28",
        "opk28",
        "lga29",
        "bet29",
        "anz29",
        "ein29",
        "opk29",
        "lga30",
        "bet30",
        "anz30",
        "ein30",
        "opk30",
        "lga31",
        "bet31",
        "anz31",
        "ein31",
        "opk31",
        "lga32",
        "bet32",
        "anz32",
        "ein32",
        "opk32",
        "lga33",
        "bet33",
        "anz33",
        "ein33",
        "opk33",
        "lga34",
        "bet34",
        "anz34",
        "ein34",
        "opk34",
        "lga35",
        "bet35",
        "anz35",
        "ein35",
        "opk35",
        "lga36",
        "bet36",
        "anz36",
        "ein36",
        "opk36",
        "lga37",
        "bet37",
        "anz37",
        "ein37",
        "opk37",
        "lga38",
        "bet38",
        "anz38",
        "ein38",
        "opk38",
        "lga39",
        "bet39",
        "anz39",
        "ein39",
        "opk39",
        "lga40",
        "bet40",
        "anz40",
        "ein40",
        "opk40",
        "ind01",
        "ind02",
        "ind03",
        "ind04",
        "ind05",
        "ind06",
        "ind07",
        "ind08",
        "ind09",
        "ind10",
        "ind11",
        "ind12",
        "ind13",
        "ind14",
        "ind15",
        "ind16",
        "ind17",
        "ind18",
        "ind19",
        "ind20",
        "ind21",
        "ind22",
        "ind23",
        "ind24",
        "ind25",
        "ind26",
        "ind27",
        "ind28",
        "ind29",
        "ind30",
        "ind31",
        "ind32",
        "ind33",
        "ind34",
        "ind35",
        "ind36",
        "ind37",
        "ind38",
        "ind39",
        "ind40",
        "ancur",
        "cpind",
        "flaga",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_pa0008_data"
),

"sap_pa0008_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- mandt -> client_code
    -- objps -> position_id
    -- seqnr -> sequence_number
    -- sprps -> special_purpose
    -- subty -> subtype
    -- aedtm -> salary_record_date
    -- histo -> historical_indicator
    -- itxex -> record_id
    -- refex -> reference_id
    -- ordex -> order_index
    -- itbld -> it_build
    -- preas -> personnel_reason_code
    -- flag1 -> flag_1
    -- flag2 -> flag_2
    -- flag3 -> flag_3
    -- flag4 -> flag_4
    -- rese1 -> reserve_field_1
    -- rese2 -> reserved_field_2
    -- grpvl -> group_value
    -- trfar -> wage_type
    -- trfgb -> wage_area
    -- trfgr -> pay_scale_group
    -- trfst -> pay_scale_level
    -- stvor -> pay_scale_type
    -- orzst -> organizational_status
    -- partn -> partner_code
    -- vglta -> comparison_date
    -- vglgb -> comparison_wage_area
    -- vglgr -> comparison_pay_scale_group
    -- vglst -> comparison_pay_scale_level
    -- vglsv -> comparison_collective_agreement
    -- divgv -> dividend_percentage
    -- ansal -> annual_salary
    -- falgk -> compensation_flag
    -- falgr -> reason_flag
    -- bet01 -> wage_component_1
    -- anz01 -> wage_component_01
    -- opk01 -> org_key_01
    -- bet02 -> wage_component_2
    -- anz02 -> wage_component_02
    -- opk02 -> org_key_02
    -- bet03 -> wage_component_3
    -- anz03 -> wage_component_03
    -- opk03 -> org_key_03
    -- bet04 -> wage_component_4
    -- anz04 -> wage_component_04
    -- opk04 -> org_key_04
    -- bet05 -> wage_component_5
    -- anz05 -> wage_component_05
    -- opk05 -> org_key_05
    -- bet06 -> wage_component_6
    -- anz06 -> wage_component_06
    -- opk06 -> org_key_06
    -- bet07 -> wage_component_7
    -- anz07 -> wage_component_07
    -- opk07 -> org_key_07
    -- bet08 -> wage_component_8
    -- anz08 -> wage_component_08
    -- opk08 -> org_key_08
    -- lga09 -> location
    -- bet09 -> wage_component_9
    -- anz09 -> wage_component_09
    -- opk09 -> org_key_09
    -- opk10 -> org_key_10
    -- opk11 -> org_key_11
    -- ein12 -> allowance
    -- opk12 -> org_key_12
    -- lga13 -> overtime
    -- ein13 -> overtime_pay
    -- opk13 -> org_key_13
    -- lga14 -> allowances
    -- ein14 -> commission
    -- opk14 -> org_key_14
    -- lga15 -> benefits
    -- bet15 -> salary_component_15
    -- anz15 -> wage_component_15
    -- opk15 -> org_key_15
    -- bet16 -> salary_component_16
    -- anz16 -> wage_component_16
    -- opk16 -> org_key_16
    -- bet17 -> salary_component_17
    -- anz17 -> wage_component_17
    -- opk17 -> org_key_17
    -- bet18 -> salary_component_18
    -- bet19 -> salary_component_19
    -- lga20 -> manager_id
    -- bet20 -> salary_component_20
    -- ein20 -> salary_review_date
    -- bet21 -> salary_component_21
    -- ein21 -> benefits_eligible
    -- lga22 -> next_review_date
    -- bet22 -> salary_component_22
    -- lga23 -> adjustment_reason
    -- bet23 -> salary_component_23
    -- ein23 -> retirement_contribution
    -- lga24 -> union_status
    -- bet24 -> salary_component_24
    -- ein24 -> health_insurance_plan
    -- lga25 -> work_schedule
    -- bet25 -> salary_component_25
    -- ein25 -> vacation_days
    -- lga26 -> commission_rate
    -- bet26 -> salary_component_26
    -- anz26 -> unknown_value_26
    -- ein26 -> sick_leave
    -- opk26 -> wage_component_26
    -- bet27 -> salary_component_27
    -- anz27 -> unknown_value_27
    -- ein27 -> training_budget
    -- opk27 -> wage_component_27
    -- lga28 -> retirement_contributions
    -- bet28 -> salary_component_28
    -- anz28 -> unknown_value_28
    -- opk28 -> wage_component_28
    -- lga29 -> compensation_notes
    -- bet29 -> salary_component_29
    -- anz29 -> unknown_value_29
    -- ein29 -> next_pay_raise_date
    -- opk29 -> wage_component_29
    -- lga30 -> org_attribute_30
    -- bet30 -> salary_component_30
    -- anz30 -> unknown_value_30
    -- ein30 -> probation_end_date
    -- opk30 -> wage_component_30
    -- lga31 -> org_attribute_31
    -- bet31 -> salary_component_31
    -- anz31 -> unknown_value_31
    -- ein31 -> contract_end_date
    -- opk31 -> wage_component_31
    -- lga32 -> org_attribute_32
    -- bet32 -> salary_component_32
    -- anz32 -> unknown_value_32
    -- ein32 -> employee_id_32
    -- opk32 -> wage_component_32
    -- lga33 -> org_attribute_33
    -- bet33 -> salary_component_33
    -- anz33 -> unknown_value_33
    -- ein33 -> employee_id_33
    -- opk33 -> wage_component_33
    -- lga34 -> org_attribute_34
    -- bet34 -> salary_component_34
    -- anz34 -> unknown_value_34
    -- ein34 -> employee_id_34
    -- opk34 -> wage_component_34
    -- lga35 -> org_attribute_35
    -- bet35 -> salary_component_35
    -- anz35 -> unknown_value_35
    -- ein35 -> employee_id_35
    -- opk35 -> wage_component_35
    -- lga36 -> org_attribute_36
    -- bet36 -> salary_component_36
    -- anz36 -> unknown_value_36
    -- ein36 -> employee_id_36
    -- opk36 -> wage_component_36
    -- lga37 -> org_attribute_37
    -- bet37 -> salary_component_37
    -- anz37 -> unknown_value_37
    -- ein37 -> employee_id_37
    -- opk37 -> wage_component_37
    -- lga38 -> org_attribute_38
    -- bet38 -> salary_component_38
    -- anz38 -> unknown_value_38
    -- ein38 -> employee_id_38
    -- opk38 -> wage_component_38
    -- lga39 -> org_attribute_39
    -- bet39 -> salary_component_39
    -- anz39 -> unknown_value_39
    -- ein39 -> employee_id_39
    -- opk39 -> wage_component_39
    -- lga40 -> org_attribute_40
    -- bet40 -> salary_component_40
    -- anz40 -> unknown_value_40
    -- ein40 -> employee_id_40
    -- opk40 -> wage_component_40
    -- ind01 -> indicator_1
    -- ind02 -> indicator_2
    -- ind03 -> indicator_3
    -- ind04 -> indicator_4
    -- ind05 -> indicator_5
    -- ind06 -> indicator_6
    -- ind07 -> indicator_7
    -- ind08 -> indicator_8
    -- ind09 -> indicator_9
    -- ind10 -> indicator_10
    -- ind11 -> indicator_11
    -- ind12 -> indicator_12
    -- ind13 -> indicator_13
    -- ind14 -> indicator_14
    -- ind15 -> indicator_15
    -- ind16 -> indicator_16
    -- ind17 -> indicator_17
    -- ind18 -> indicator_18
    -- ind19 -> indicator_19
    -- ind20 -> indicator_20
    -- ind21 -> indicator_21
    -- ind22 -> indicator_22
    -- ind23 -> indicator_23
    -- ind24 -> indicator_24
    -- ind25 -> indicator_25
    -- ind26 -> indicator_26
    -- ind27 -> indicator_27
    -- ind28 -> indicator_28
    -- ind29 -> indicator_29
    -- ind30 -> indicator_30
    -- ind31 -> indicator_31
    -- ind32 -> indicator_32
    -- ind33 -> indicator_33
    -- ind34 -> indicator_34
    -- ind35 -> indicator_35
    -- ind36 -> indicator_36
    -- ind37 -> indicator_37
    -- ind38 -> indicator_38
    -- ind39 -> indicator_39
    -- ind40 -> indicator_40
    -- ancur -> salary_currency
    -- cpind -> currency_indicator
    -- flaga -> flag_additional
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT
        "begda",
        "endda",
        "mandt" AS "client_code",
        "objps" AS "position_id",
        "pernr",
        "seqnr" AS "sequence_number",
        "sprps" AS "special_purpose",
        "subty" AS "subtype",
        "aedtm" AS "salary_record_date",
        "uname",
        "histo" AS "historical_indicator",
        "itxex" AS "record_id",
        "refex" AS "reference_id",
        "ordex" AS "order_index",
        "itbld" AS "it_build",
        "preas" AS "personnel_reason_code",
        "flag1" AS "flag_1",
        "flag2" AS "flag_2",
        "flag3" AS "flag_3",
        "flag4" AS "flag_4",
        "rese1" AS "reserve_field_1",
        "rese2" AS "reserved_field_2",
        "grpvl" AS "group_value",
        "trfar" AS "wage_type",
        "trfgb" AS "wage_area",
        "trfgr" AS "pay_scale_group",
        "trfst" AS "pay_scale_level",
        "stvor" AS "pay_scale_type",
        "orzst" AS "organizational_status",
        "partn" AS "partner_code",
        "waers",
        "vglta" AS "comparison_date",
        "vglgb" AS "comparison_wage_area",
        "vglgr" AS "comparison_pay_scale_group",
        "vglst" AS "comparison_pay_scale_level",
        "vglsv" AS "comparison_collective_agreement",
        "bsgrd",
        "divgv" AS "dividend_percentage",
        "ansal" AS "annual_salary",
        "falgk" AS "compensation_flag",
        "falgr" AS "reason_flag",
        "lga01",
        "bet01" AS "wage_component_1",
        "anz01" AS "wage_component_01",
        "ein01",
        "opk01" AS "org_key_01",
        "lga02",
        "bet02" AS "wage_component_2",
        "anz02" AS "wage_component_02",
        "ein02",
        "opk02" AS "org_key_02",
        "lga03",
        "bet03" AS "wage_component_3",
        "anz03" AS "wage_component_03",
        "ein03",
        "opk03" AS "org_key_03",
        "lga04",
        "bet04" AS "wage_component_4",
        "anz04" AS "wage_component_04",
        "ein04",
        "opk04" AS "org_key_04",
        "lga05",
        "bet05" AS "wage_component_5",
        "anz05" AS "wage_component_05",
        "ein05",
        "opk05" AS "org_key_05",
        "lga06",
        "bet06" AS "wage_component_6",
        "anz06" AS "wage_component_06",
        "ein06",
        "opk06" AS "org_key_06",
        "lga07",
        "bet07" AS "wage_component_7",
        "anz07" AS "wage_component_07",
        "ein07",
        "opk07" AS "org_key_07",
        "lga08",
        "bet08" AS "wage_component_8",
        "anz08" AS "wage_component_08",
        "ein08",
        "opk08" AS "org_key_08",
        "lga09" AS "location",
        "bet09" AS "wage_component_9",
        "anz09" AS "wage_component_09",
        "ein09",
        "opk09" AS "org_key_09",
        "lga10",
        "bet10",
        "anz10",
        "ein10",
        "opk10" AS "org_key_10",
        "lga11",
        "bet11",
        "anz11",
        "ein11",
        "opk11" AS "org_key_11",
        "lga12",
        "bet12",
        "anz12",
        "ein12" AS "allowance",
        "opk12" AS "org_key_12",
        "lga13" AS "overtime",
        "bet13",
        "anz13",
        "ein13" AS "overtime_pay",
        "opk13" AS "org_key_13",
        "lga14" AS "allowances",
        "bet14",
        "anz14",
        "ein14" AS "commission",
        "opk14" AS "org_key_14",
        "lga15" AS "benefits",
        "bet15" AS "salary_component_15",
        "anz15" AS "wage_component_15",
        "ein15",
        "opk15" AS "org_key_15",
        "lga16",
        "bet16" AS "salary_component_16",
        "anz16" AS "wage_component_16",
        "ein16",
        "opk16" AS "org_key_16",
        "lga17",
        "bet17" AS "salary_component_17",
        "anz17" AS "wage_component_17",
        "ein17",
        "opk17" AS "org_key_17",
        "lga18",
        "bet18" AS "salary_component_18",
        "anz18",
        "ein18",
        "opk18",
        "lga19",
        "bet19" AS "salary_component_19",
        "anz19",
        "ein19",
        "opk19",
        "lga20" AS "manager_id",
        "bet20" AS "salary_component_20",
        "anz20",
        "ein20" AS "salary_review_date",
        "opk20",
        "lga21",
        "bet21" AS "salary_component_21",
        "anz21",
        "ein21" AS "benefits_eligible",
        "opk21",
        "lga22" AS "next_review_date",
        "bet22" AS "salary_component_22",
        "anz22",
        "ein22",
        "opk22",
        "lga23" AS "adjustment_reason",
        "bet23" AS "salary_component_23",
        "anz23",
        "ein23" AS "retirement_contribution",
        "opk23",
        "lga24" AS "union_status",
        "bet24" AS "salary_component_24",
        "anz24",
        "ein24" AS "health_insurance_plan",
        "opk24",
        "lga25" AS "work_schedule",
        "bet25" AS "salary_component_25",
        "anz25",
        "ein25" AS "vacation_days",
        "opk25",
        "lga26" AS "commission_rate",
        "bet26" AS "salary_component_26",
        "anz26" AS "unknown_value_26",
        "ein26" AS "sick_leave",
        "opk26" AS "wage_component_26",
        "lga27",
        "bet27" AS "salary_component_27",
        "anz27" AS "unknown_value_27",
        "ein27" AS "training_budget",
        "opk27" AS "wage_component_27",
        "lga28" AS "retirement_contributions",
        "bet28" AS "salary_component_28",
        "anz28" AS "unknown_value_28",
        "ein28",
        "opk28" AS "wage_component_28",
        "lga29" AS "compensation_notes",
        "bet29" AS "salary_component_29",
        "anz29" AS "unknown_value_29",
        "ein29" AS "next_pay_raise_date",
        "opk29" AS "wage_component_29",
        "lga30" AS "org_attribute_30",
        "bet30" AS "salary_component_30",
        "anz30" AS "unknown_value_30",
        "ein30" AS "probation_end_date",
        "opk30" AS "wage_component_30",
        "lga31" AS "org_attribute_31",
        "bet31" AS "salary_component_31",
        "anz31" AS "unknown_value_31",
        "ein31" AS "contract_end_date",
        "opk31" AS "wage_component_31",
        "lga32" AS "org_attribute_32",
        "bet32" AS "salary_component_32",
        "anz32" AS "unknown_value_32",
        "ein32" AS "employee_id_32",
        "opk32" AS "wage_component_32",
        "lga33" AS "org_attribute_33",
        "bet33" AS "salary_component_33",
        "anz33" AS "unknown_value_33",
        "ein33" AS "employee_id_33",
        "opk33" AS "wage_component_33",
        "lga34" AS "org_attribute_34",
        "bet34" AS "salary_component_34",
        "anz34" AS "unknown_value_34",
        "ein34" AS "employee_id_34",
        "opk34" AS "wage_component_34",
        "lga35" AS "org_attribute_35",
        "bet35" AS "salary_component_35",
        "anz35" AS "unknown_value_35",
        "ein35" AS "employee_id_35",
        "opk35" AS "wage_component_35",
        "lga36" AS "org_attribute_36",
        "bet36" AS "salary_component_36",
        "anz36" AS "unknown_value_36",
        "ein36" AS "employee_id_36",
        "opk36" AS "wage_component_36",
        "lga37" AS "org_attribute_37",
        "bet37" AS "salary_component_37",
        "anz37" AS "unknown_value_37",
        "ein37" AS "employee_id_37",
        "opk37" AS "wage_component_37",
        "lga38" AS "org_attribute_38",
        "bet38" AS "salary_component_38",
        "anz38" AS "unknown_value_38",
        "ein38" AS "employee_id_38",
        "opk38" AS "wage_component_38",
        "lga39" AS "org_attribute_39",
        "bet39" AS "salary_component_39",
        "anz39" AS "unknown_value_39",
        "ein39" AS "employee_id_39",
        "opk39" AS "wage_component_39",
        "lga40" AS "org_attribute_40",
        "bet40" AS "salary_component_40",
        "anz40" AS "unknown_value_40",
        "ein40" AS "employee_id_40",
        "opk40" AS "wage_component_40",
        "ind01" AS "indicator_1",
        "ind02" AS "indicator_2",
        "ind03" AS "indicator_3",
        "ind04" AS "indicator_4",
        "ind05" AS "indicator_5",
        "ind06" AS "indicator_6",
        "ind07" AS "indicator_7",
        "ind08" AS "indicator_8",
        "ind09" AS "indicator_9",
        "ind10" AS "indicator_10",
        "ind11" AS "indicator_11",
        "ind12" AS "indicator_12",
        "ind13" AS "indicator_13",
        "ind14" AS "indicator_14",
        "ind15" AS "indicator_15",
        "ind16" AS "indicator_16",
        "ind17" AS "indicator_17",
        "ind18" AS "indicator_18",
        "ind19" AS "indicator_19",
        "ind20" AS "indicator_20",
        "ind21" AS "indicator_21",
        "ind22" AS "indicator_22",
        "ind23" AS "indicator_23",
        "ind24" AS "indicator_24",
        "ind25" AS "indicator_25",
        "ind26" AS "indicator_26",
        "ind27" AS "indicator_27",
        "ind28" AS "indicator_28",
        "ind29" AS "indicator_29",
        "ind30" AS "indicator_30",
        "ind31" AS "indicator_31",
        "ind32" AS "indicator_32",
        "ind33" AS "indicator_33",
        "ind34" AS "indicator_34",
        "ind35" AS "indicator_35",
        "ind36" AS "indicator_36",
        "ind37" AS "indicator_37",
        "ind38" AS "indicator_38",
        "ind39" AS "indicator_39",
        "ind40" AS "indicator_40",
        "ancur" AS "salary_currency",
        "cpind" AS "currency_indicator",
        "flaga" AS "flag_additional",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_pa0008_data_projected"
),

"sap_pa0008_data_projected_renamed_casted" AS (
    -- Column Type Casting:
    -- adjustment_reason: from DECIMAL to VARCHAR
    -- allowance: from DECIMAL to VARCHAR
    -- allowances: from DECIMAL to VARCHAR
    -- begda: from INT to DATE
    -- benefits: from DECIMAL to VARCHAR
    -- benefits_eligible: from DECIMAL to VARCHAR
    -- client_code: from INT to VARCHAR
    -- commission: from DECIMAL to VARCHAR
    -- commission_rate: from DECIMAL to VARCHAR
    -- comparison_date: from DECIMAL to DATE
    -- comparison_wage_area: from DECIMAL to VARCHAR
    -- compensation_flag: from DECIMAL to VARCHAR
    -- compensation_notes: from DECIMAL to VARCHAR
    -- contract_end_date: from DECIMAL to DATE
    -- ein01: from DECIMAL to VARCHAR
    -- ein02: from DECIMAL to VARCHAR
    -- ein03: from DECIMAL to VARCHAR
    -- ein04: from DECIMAL to VARCHAR
    -- ein05: from DECIMAL to VARCHAR
    -- ein06: from DECIMAL to VARCHAR
    -- ein07: from DECIMAL to VARCHAR
    -- ein08: from DECIMAL to VARCHAR
    -- ein09: from DECIMAL to VARCHAR
    -- ein10: from DECIMAL to VARCHAR
    -- ein11: from DECIMAL to VARCHAR
    -- ein15: from DECIMAL to VARCHAR
    -- ein16: from DECIMAL to VARCHAR
    -- ein17: from DECIMAL to VARCHAR
    -- ein18: from DECIMAL to VARCHAR
    -- ein19: from DECIMAL to VARCHAR
    -- ein22: from DECIMAL to VARCHAR
    -- ein28: from DECIMAL to VARCHAR
    -- employee_id_32: from DECIMAL to VARCHAR
    -- employee_id_33: from DECIMAL to VARCHAR
    -- employee_id_34: from DECIMAL to VARCHAR
    -- employee_id_35: from DECIMAL to VARCHAR
    -- employee_id_36: from DECIMAL to VARCHAR
    -- employee_id_37: from DECIMAL to VARCHAR
    -- employee_id_38: from DECIMAL to VARCHAR
    -- employee_id_39: from DECIMAL to VARCHAR
    -- employee_id_40: from DECIMAL to VARCHAR
    -- endda: from INT to DATE
    -- flag_1: from DECIMAL to VARCHAR
    -- flag_2: from DECIMAL to VARCHAR
    -- flag_3: from DECIMAL to VARCHAR
    -- flag_4: from DECIMAL to VARCHAR
    -- flag_additional: from DECIMAL to VARCHAR
    -- group_value: from DECIMAL to VARCHAR
    -- health_insurance_plan: from DECIMAL to VARCHAR
    -- historical_indicator: from DECIMAL to VARCHAR
    -- indicator_10: from DECIMAL to VARCHAR
    -- indicator_11: from DECIMAL to VARCHAR
    -- indicator_12: from DECIMAL to VARCHAR
    -- indicator_13: from DECIMAL to VARCHAR
    -- indicator_14: from DECIMAL to VARCHAR
    -- indicator_15: from DECIMAL to VARCHAR
    -- indicator_16: from DECIMAL to VARCHAR
    -- indicator_17: from DECIMAL to VARCHAR
    -- indicator_18: from DECIMAL to VARCHAR
    -- indicator_19: from DECIMAL to VARCHAR
    -- indicator_20: from DECIMAL to VARCHAR
    -- indicator_21: from DECIMAL to VARCHAR
    -- indicator_22: from DECIMAL to VARCHAR
    -- indicator_23: from DECIMAL to VARCHAR
    -- indicator_24: from DECIMAL to VARCHAR
    -- indicator_25: from DECIMAL to VARCHAR
    -- indicator_26: from DECIMAL to VARCHAR
    -- indicator_27: from DECIMAL to VARCHAR
    -- indicator_28: from DECIMAL to VARCHAR
    -- indicator_29: from DECIMAL to VARCHAR
    -- indicator_3: from DECIMAL to VARCHAR
    -- indicator_30: from DECIMAL to VARCHAR
    -- indicator_31: from DECIMAL to VARCHAR
    -- indicator_32: from DECIMAL to VARCHAR
    -- indicator_33: from DECIMAL to VARCHAR
    -- indicator_34: from DECIMAL to VARCHAR
    -- indicator_35: from DECIMAL to VARCHAR
    -- indicator_36: from DECIMAL to VARCHAR
    -- indicator_37: from DECIMAL to VARCHAR
    -- indicator_38: from DECIMAL to VARCHAR
    -- indicator_39: from DECIMAL to VARCHAR
    -- indicator_4: from DECIMAL to VARCHAR
    -- indicator_40: from DECIMAL to VARCHAR
    -- indicator_5: from DECIMAL to VARCHAR
    -- indicator_6: from DECIMAL to VARCHAR
    -- indicator_7: from DECIMAL to VARCHAR
    -- indicator_8: from DECIMAL to VARCHAR
    -- indicator_9: from DECIMAL to VARCHAR
    -- it_build: from DECIMAL to VARCHAR
    -- lga03: from DECIMAL to VARCHAR
    -- lga04: from DECIMAL to VARCHAR
    -- lga05: from DECIMAL to VARCHAR
    -- lga06: from DECIMAL to VARCHAR
    -- lga07: from DECIMAL to VARCHAR
    -- lga08: from DECIMAL to VARCHAR
    -- lga10: from DECIMAL to VARCHAR
    -- lga11: from DECIMAL to VARCHAR
    -- lga12: from DECIMAL to VARCHAR
    -- lga16: from DECIMAL to VARCHAR
    -- lga17: from DECIMAL to VARCHAR
    -- lga18: from DECIMAL to VARCHAR
    -- lga19: from DECIMAL to VARCHAR
    -- lga21: from DECIMAL to VARCHAR
    -- lga27: from DECIMAL to VARCHAR
    -- location: from DECIMAL to VARCHAR
    -- manager_id: from DECIMAL to INT
    -- next_pay_raise_date: from DECIMAL to DATE
    -- next_review_date: from DECIMAL to DATE
    -- opk18: from DECIMAL to VARCHAR
    -- opk19: from DECIMAL to VARCHAR
    -- opk20: from DECIMAL to VARCHAR
    -- opk21: from DECIMAL to VARCHAR
    -- opk22: from DECIMAL to VARCHAR
    -- opk23: from DECIMAL to VARCHAR
    -- opk24: from DECIMAL to VARCHAR
    -- opk25: from DECIMAL to VARCHAR
    -- order_index: from DECIMAL to INT
    -- org_attribute_30: from DECIMAL to VARCHAR
    -- org_attribute_31: from DECIMAL to VARCHAR
    -- org_attribute_32: from DECIMAL to VARCHAR
    -- org_attribute_33: from DECIMAL to VARCHAR
    -- org_attribute_34: from DECIMAL to VARCHAR
    -- org_attribute_35: from DECIMAL to VARCHAR
    -- org_attribute_36: from DECIMAL to VARCHAR
    -- org_attribute_37: from DECIMAL to VARCHAR
    -- org_attribute_38: from DECIMAL to VARCHAR
    -- org_attribute_39: from DECIMAL to VARCHAR
    -- org_attribute_40: from DECIMAL to VARCHAR
    -- org_key_01: from DECIMAL to VARCHAR
    -- org_key_02: from DECIMAL to VARCHAR
    -- org_key_03: from DECIMAL to VARCHAR
    -- org_key_04: from DECIMAL to VARCHAR
    -- org_key_05: from DECIMAL to VARCHAR
    -- org_key_06: from DECIMAL to VARCHAR
    -- org_key_07: from DECIMAL to VARCHAR
    -- org_key_08: from DECIMAL to VARCHAR
    -- org_key_09: from DECIMAL to VARCHAR
    -- org_key_10: from DECIMAL to VARCHAR
    -- org_key_11: from DECIMAL to VARCHAR
    -- org_key_12: from DECIMAL to VARCHAR
    -- org_key_13: from DECIMAL to VARCHAR
    -- org_key_14: from DECIMAL to VARCHAR
    -- org_key_15: from DECIMAL to VARCHAR
    -- org_key_16: from DECIMAL to VARCHAR
    -- org_key_17: from DECIMAL to VARCHAR
    -- organizational_status: from DECIMAL to VARCHAR
    -- overtime: from DECIMAL to VARCHAR
    -- overtime_pay: from DECIMAL to VARCHAR
    -- partner_code: from DECIMAL to VARCHAR
    -- pernr: from INT to VARCHAR
    -- personnel_reason_code: from DECIMAL to VARCHAR
    -- position_id: from DECIMAL to VARCHAR
    -- probation_end_date: from DECIMAL to DATE
    -- reason_flag: from DECIMAL to VARCHAR
    -- record_id: from DECIMAL to VARCHAR
    -- reference_id: from DECIMAL to VARCHAR
    -- reserve_field_1: from DECIMAL to VARCHAR
    -- reserved_field_2: from DECIMAL to VARCHAR
    -- retirement_contribution: from DECIMAL to VARCHAR
    -- retirement_contributions: from DECIMAL to VARCHAR
    -- salary_record_date: from INT to DATE
    -- salary_review_date: from DECIMAL to DATE
    -- sick_leave: from DECIMAL to VARCHAR
    -- special_purpose: from DECIMAL to VARCHAR
    -- training_budget: from DECIMAL to VARCHAR
    -- union_status: from DECIMAL to VARCHAR
    -- vacation_days: from DECIMAL to VARCHAR
    -- wage_component_26: from DECIMAL to VARCHAR
    -- wage_component_27: from DECIMAL to VARCHAR
    -- wage_component_28: from DECIMAL to VARCHAR
    -- wage_component_29: from DECIMAL to VARCHAR
    -- wage_component_30: from DECIMAL to VARCHAR
    -- wage_component_31: from DECIMAL to VARCHAR
    -- wage_component_32: from DECIMAL to VARCHAR
    -- wage_component_33: from DECIMAL to VARCHAR
    -- wage_component_34: from DECIMAL to VARCHAR
    -- wage_component_35: from DECIMAL to VARCHAR
    -- wage_component_36: from DECIMAL to VARCHAR
    -- wage_component_37: from DECIMAL to VARCHAR
    -- wage_component_38: from DECIMAL to VARCHAR
    -- wage_component_39: from DECIMAL to VARCHAR
    -- wage_component_40: from DECIMAL to VARCHAR
    -- work_schedule: from DECIMAL to VARCHAR
    SELECT
        "sequence_number",
        "subtype",
        "uname",
        "wage_type",
        "wage_area",
        "pay_scale_group",
        "pay_scale_level",
        "pay_scale_type",
        "waers",
        "comparison_pay_scale_group",
        "comparison_pay_scale_level",
        "comparison_collective_agreement",
        "bsgrd",
        "dividend_percentage",
        "annual_salary",
        "lga01",
        "wage_component_1",
        "wage_component_01",
        "lga02",
        "wage_component_2",
        "wage_component_02",
        "wage_component_3",
        "wage_component_03",
        "wage_component_4",
        "wage_component_04",
        "wage_component_5",
        "wage_component_05",
        "wage_component_6",
        "wage_component_06",
        "wage_component_7",
        "wage_component_07",
        "wage_component_8",
        "wage_component_08",
        "wage_component_9",
        "wage_component_09",
        "bet10",
        "anz10",
        "bet11",
        "anz11",
        "bet12",
        "anz12",
        "bet13",
        "anz13",
        "bet14",
        "anz14",
        "salary_component_15",
        "wage_component_15",
        "salary_component_16",
        "wage_component_16",
        "salary_component_17",
        "wage_component_17",
        "salary_component_18",
        "anz18",
        "salary_component_19",
        "anz19",
        "salary_component_20",
        "anz20",
        "salary_component_21",
        "anz21",
        "salary_component_22",
        "anz22",
        "salary_component_23",
        "anz23",
        "salary_component_24",
        "anz24",
        "salary_component_25",
        "anz25",
        "salary_component_26",
        "unknown_value_26",
        "salary_component_27",
        "unknown_value_27",
        "salary_component_28",
        "unknown_value_28",
        "salary_component_29",
        "unknown_value_29",
        "salary_component_30",
        "unknown_value_30",
        "salary_component_31",
        "unknown_value_31",
        "salary_component_32",
        "unknown_value_32",
        "salary_component_33",
        "unknown_value_33",
        "salary_component_34",
        "unknown_value_34",
        "salary_component_35",
        "unknown_value_35",
        "salary_component_36",
        "unknown_value_36",
        "salary_component_37",
        "unknown_value_37",
        "salary_component_38",
        "unknown_value_38",
        "salary_component_39",
        "unknown_value_39",
        "salary_component_40",
        "unknown_value_40",
        "indicator_1",
        "indicator_2",
        "salary_currency",
        "currency_indicator",
        "row_id",
        "is_deleted",
        CAST("adjustment_reason" AS VARCHAR) AS "adjustment_reason",
        CAST("allowance" AS VARCHAR) AS "allowance",
        CAST("allowances" AS VARCHAR) AS "allowances",
        CAST(strptime(CAST("begda" AS VARCHAR), '%Y%m%d') AS DATE) AS "begda",
        CAST("benefits" AS VARCHAR) AS "benefits",
        CAST("benefits_eligible" AS VARCHAR) AS "benefits_eligible",
        CAST("client_code" AS VARCHAR) AS "client_code",
        CAST("commission" AS VARCHAR) AS "commission",
        CAST("commission_rate" AS VARCHAR) AS "commission_rate",
        CAST("comparison_date" AS DATE) AS "comparison_date",
        CAST("comparison_wage_area" AS VARCHAR) AS "comparison_wage_area",
        CAST("compensation_flag" AS VARCHAR) AS "compensation_flag",
        CAST("compensation_notes" AS VARCHAR) AS "compensation_notes",
        CAST("contract_end_date" AS DATE) AS "contract_end_date",
        CAST("ein01" AS VARCHAR) AS "ein01",
        CAST("ein02" AS VARCHAR) AS "ein02",
        CAST("ein03" AS VARCHAR) AS "ein03",
        CAST("ein04" AS VARCHAR) AS "ein04",
        CAST("ein05" AS VARCHAR) AS "ein05",
        CAST("ein06" AS VARCHAR) AS "ein06",
        CAST("ein07" AS VARCHAR) AS "ein07",
        CAST("ein08" AS VARCHAR) AS "ein08",
        CAST("ein09" AS VARCHAR) AS "ein09",
        CAST("ein10" AS VARCHAR) AS "ein10",
        CAST("ein11" AS VARCHAR) AS "ein11",
        CAST("ein15" AS VARCHAR) AS "ein15",
        CAST("ein16" AS VARCHAR) AS "ein16",
        CAST("ein17" AS VARCHAR) AS "ein17",
        CAST("ein18" AS VARCHAR) AS "ein18",
        CAST("ein19" AS VARCHAR) AS "ein19",
        CAST("ein22" AS VARCHAR) AS "ein22",
        CAST("ein28" AS VARCHAR) AS "ein28",
        CAST("employee_id_32" AS VARCHAR) AS "employee_id_32",
        CAST("employee_id_33" AS VARCHAR) AS "employee_id_33",
        CAST("employee_id_34" AS VARCHAR) AS "employee_id_34",
        CAST("employee_id_35" AS VARCHAR) AS "employee_id_35",
        CAST("employee_id_36" AS VARCHAR) AS "employee_id_36",
        CAST("employee_id_37" AS VARCHAR) AS "employee_id_37",
        CAST("employee_id_38" AS VARCHAR) AS "employee_id_38",
        CAST("employee_id_39" AS VARCHAR) AS "employee_id_39",
        CAST("employee_id_40" AS VARCHAR) AS "employee_id_40",
        CAST(strptime(CAST("endda" AS VARCHAR), '%Y%m%d') AS DATE) AS "endda",
        CAST("flag_1" AS VARCHAR) AS "flag_1",
        CAST("flag_2" AS VARCHAR) AS "flag_2",
        CAST("flag_3" AS VARCHAR) AS "flag_3",
        CAST("flag_4" AS VARCHAR) AS "flag_4",
        CAST("flag_additional" AS VARCHAR) AS "flag_additional",
        CAST("group_value" AS VARCHAR) AS "group_value",
        CAST("health_insurance_plan" AS VARCHAR) AS "health_insurance_plan",
        CAST("historical_indicator" AS VARCHAR) AS "historical_indicator",
        CAST("indicator_10" AS VARCHAR) AS "indicator_10",
        CAST("indicator_11" AS VARCHAR) AS "indicator_11",
        CAST("indicator_12" AS VARCHAR) AS "indicator_12",
        CAST("indicator_13" AS VARCHAR) AS "indicator_13",
        CAST("indicator_14" AS VARCHAR) AS "indicator_14",
        CAST("indicator_15" AS VARCHAR) AS "indicator_15",
        CAST("indicator_16" AS VARCHAR) AS "indicator_16",
        CAST("indicator_17" AS VARCHAR) AS "indicator_17",
        CAST("indicator_18" AS VARCHAR) AS "indicator_18",
        CAST("indicator_19" AS VARCHAR) AS "indicator_19",
        CAST("indicator_20" AS VARCHAR) AS "indicator_20",
        CAST("indicator_21" AS VARCHAR) AS "indicator_21",
        CAST("indicator_22" AS VARCHAR) AS "indicator_22",
        CAST("indicator_23" AS VARCHAR) AS "indicator_23",
        CAST("indicator_24" AS VARCHAR) AS "indicator_24",
        CAST("indicator_25" AS VARCHAR) AS "indicator_25",
        CAST("indicator_26" AS VARCHAR) AS "indicator_26",
        CAST("indicator_27" AS VARCHAR) AS "indicator_27",
        CAST("indicator_28" AS VARCHAR) AS "indicator_28",
        CAST("indicator_29" AS VARCHAR) AS "indicator_29",
        CAST("indicator_3" AS VARCHAR) AS "indicator_3",
        CAST("indicator_30" AS VARCHAR) AS "indicator_30",
        CAST("indicator_31" AS VARCHAR) AS "indicator_31",
        CAST("indicator_32" AS VARCHAR) AS "indicator_32",
        CAST("indicator_33" AS VARCHAR) AS "indicator_33",
        CAST("indicator_34" AS VARCHAR) AS "indicator_34",
        CAST("indicator_35" AS VARCHAR) AS "indicator_35",
        CAST("indicator_36" AS VARCHAR) AS "indicator_36",
        CAST("indicator_37" AS VARCHAR) AS "indicator_37",
        CAST("indicator_38" AS VARCHAR) AS "indicator_38",
        CAST("indicator_39" AS VARCHAR) AS "indicator_39",
        CAST("indicator_4" AS VARCHAR) AS "indicator_4",
        CAST("indicator_40" AS VARCHAR) AS "indicator_40",
        CAST("indicator_5" AS VARCHAR) AS "indicator_5",
        CAST("indicator_6" AS VARCHAR) AS "indicator_6",
        CAST("indicator_7" AS VARCHAR) AS "indicator_7",
        CAST("indicator_8" AS VARCHAR) AS "indicator_8",
        CAST("indicator_9" AS VARCHAR) AS "indicator_9",
        CAST("it_build" AS VARCHAR) AS "it_build",
        CAST("lga03" AS VARCHAR) AS "lga03",
        CAST("lga04" AS VARCHAR) AS "lga04",
        CAST("lga05" AS VARCHAR) AS "lga05",
        CAST("lga06" AS VARCHAR) AS "lga06",
        CAST("lga07" AS VARCHAR) AS "lga07",
        CAST("lga08" AS VARCHAR) AS "lga08",
        CAST("lga10" AS VARCHAR) AS "lga10",
        CAST("lga11" AS VARCHAR) AS "lga11",
        CAST("lga12" AS VARCHAR) AS "lga12",
        CAST("lga16" AS VARCHAR) AS "lga16",
        CAST("lga17" AS VARCHAR) AS "lga17",
        CAST("lga18" AS VARCHAR) AS "lga18",
        CAST("lga19" AS VARCHAR) AS "lga19",
        CAST("lga21" AS VARCHAR) AS "lga21",
        CAST("lga27" AS VARCHAR) AS "lga27",
        CAST("location" AS VARCHAR) AS "location",
        CAST("manager_id" AS INT) AS "manager_id",
        CAST("next_pay_raise_date" AS DATE) AS "next_pay_raise_date",
        CAST("next_review_date" AS DATE) AS "next_review_date",
        CAST("opk18" AS VARCHAR) AS "opk18",
        CAST("opk19" AS VARCHAR) AS "opk19",
        CAST("opk20" AS VARCHAR) AS "opk20",
        CAST("opk21" AS VARCHAR) AS "opk21",
        CAST("opk22" AS VARCHAR) AS "opk22",
        CAST("opk23" AS VARCHAR) AS "opk23",
        CAST("opk24" AS VARCHAR) AS "opk24",
        CAST("opk25" AS VARCHAR) AS "opk25",
        CAST("order_index" AS INT) AS "order_index",
        CAST("org_attribute_30" AS VARCHAR) AS "org_attribute_30",
        CAST("org_attribute_31" AS VARCHAR) AS "org_attribute_31",
        CAST("org_attribute_32" AS VARCHAR) AS "org_attribute_32",
        CAST("org_attribute_33" AS VARCHAR) AS "org_attribute_33",
        CAST("org_attribute_34" AS VARCHAR) AS "org_attribute_34",
        CAST("org_attribute_35" AS VARCHAR) AS "org_attribute_35",
        CAST("org_attribute_36" AS VARCHAR) AS "org_attribute_36",
        CAST("org_attribute_37" AS VARCHAR) AS "org_attribute_37",
        CAST("org_attribute_38" AS VARCHAR) AS "org_attribute_38",
        CAST("org_attribute_39" AS VARCHAR) AS "org_attribute_39",
        CAST("org_attribute_40" AS VARCHAR) AS "org_attribute_40",
        CAST("org_key_01" AS VARCHAR) AS "org_key_01",
        CAST("org_key_02" AS VARCHAR) AS "org_key_02",
        CAST("org_key_03" AS VARCHAR) AS "org_key_03",
        CAST("org_key_04" AS VARCHAR) AS "org_key_04",
        CAST("org_key_05" AS VARCHAR) AS "org_key_05",
        CAST("org_key_06" AS VARCHAR) AS "org_key_06",
        CAST("org_key_07" AS VARCHAR) AS "org_key_07",
        CAST("org_key_08" AS VARCHAR) AS "org_key_08",
        CAST("org_key_09" AS VARCHAR) AS "org_key_09",
        CAST("org_key_10" AS VARCHAR) AS "org_key_10",
        CAST("org_key_11" AS VARCHAR) AS "org_key_11",
        CAST("org_key_12" AS VARCHAR) AS "org_key_12",
        CAST("org_key_13" AS VARCHAR) AS "org_key_13",
        CAST("org_key_14" AS VARCHAR) AS "org_key_14",
        CAST("org_key_15" AS VARCHAR) AS "org_key_15",
        CAST("org_key_16" AS VARCHAR) AS "org_key_16",
        CAST("org_key_17" AS VARCHAR) AS "org_key_17",
        CAST("organizational_status" AS VARCHAR) AS "organizational_status",
        CAST("overtime" AS VARCHAR) AS "overtime",
        CAST("overtime_pay" AS VARCHAR) AS "overtime_pay",
        CAST("partner_code" AS VARCHAR) AS "partner_code",
        CAST("pernr" AS VARCHAR) AS "pernr",
        CAST("personnel_reason_code" AS VARCHAR) AS "personnel_reason_code",
        CAST("position_id" AS VARCHAR) AS "position_id",
        CAST("probation_end_date" AS DATE) AS "probation_end_date",
        CAST("reason_flag" AS VARCHAR) AS "reason_flag",
        CAST("record_id" AS VARCHAR) AS "record_id",
        CAST("reference_id" AS VARCHAR) AS "reference_id",
        CAST("reserve_field_1" AS VARCHAR) AS "reserve_field_1",
        CAST("reserved_field_2" AS VARCHAR) AS "reserved_field_2",
        CAST("retirement_contribution" AS VARCHAR) AS "retirement_contribution",
        CAST("retirement_contributions" AS VARCHAR) AS "retirement_contributions",
        CAST(strptime(CAST("salary_record_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "salary_record_date",
        CAST("salary_review_date" AS DATE) AS "salary_review_date",
        CAST("sick_leave" AS VARCHAR) AS "sick_leave",
        CAST("special_purpose" AS VARCHAR) AS "special_purpose",
        CAST("training_budget" AS VARCHAR) AS "training_budget",
        CAST("union_status" AS VARCHAR) AS "union_status",
        CAST("vacation_days" AS VARCHAR) AS "vacation_days",
        CAST("wage_component_26" AS VARCHAR) AS "wage_component_26",
        CAST("wage_component_27" AS VARCHAR) AS "wage_component_27",
        CAST("wage_component_28" AS VARCHAR) AS "wage_component_28",
        CAST("wage_component_29" AS VARCHAR) AS "wage_component_29",
        CAST("wage_component_30" AS VARCHAR) AS "wage_component_30",
        CAST("wage_component_31" AS VARCHAR) AS "wage_component_31",
        CAST("wage_component_32" AS VARCHAR) AS "wage_component_32",
        CAST("wage_component_33" AS VARCHAR) AS "wage_component_33",
        CAST("wage_component_34" AS VARCHAR) AS "wage_component_34",
        CAST("wage_component_35" AS VARCHAR) AS "wage_component_35",
        CAST("wage_component_36" AS VARCHAR) AS "wage_component_36",
        CAST("wage_component_37" AS VARCHAR) AS "wage_component_37",
        CAST("wage_component_38" AS VARCHAR) AS "wage_component_38",
        CAST("wage_component_39" AS VARCHAR) AS "wage_component_39",
        CAST("wage_component_40" AS VARCHAR) AS "wage_component_40",
        CAST("work_schedule" AS VARCHAR) AS "work_schedule"
    FROM "sap_pa0008_data_projected_renamed"
),

"sap_pa0008_data_projected_renamed_casted_missing_handled" AS (
    -- Handling missing values: There are 170 columns with unacceptable missing values
    -- comparison_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- comparison_pay_scale_group has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- comparison_pay_scale_level has 66.67 percent missing. Strategy: 🔄 Unchanged
    -- comparison_wage_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- compensation_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- compensation_notes has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein01 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein02 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein03 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein04 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein05 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein06 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein07 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein08 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein09 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein10 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein11 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein15 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein16 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein17 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein18 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein19 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein22 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ein28 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_32 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_33 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_34 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_35 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_36 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_37 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_38 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_39 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- employee_id_40 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- flag_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- flag_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- flag_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- flag_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- flag_additional has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_value has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- health_insurance_plan has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_10 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_11 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_12 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_13 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_14 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_15 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_16 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_17 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_18 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_19 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_20 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_21 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_22 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_23 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_24 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_25 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_26 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_27 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_28 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_29 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_30 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_31 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_32 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_33 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_34 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_35 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_36 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_37 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_38 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_39 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_40 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_6 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_7 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_8 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- indicator_9 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- it_build has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga03 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga04 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga05 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga06 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga07 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga08 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga10 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga11 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga12 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga16 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga17 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga18 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga19 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga21 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lga27 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- location has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- manager_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- next_pay_raise_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- next_review_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk18 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk19 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk20 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk21 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk22 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk23 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk24 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- opk25 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- order_index has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_30 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_31 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_32 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_33 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_34 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_35 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_36 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_37 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_38 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_39 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_attribute_40 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_01 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_02 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_03 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_04 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_05 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_06 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_07 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_08 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_09 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_10 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_11 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_12 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_13 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_14 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_15 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_16 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- org_key_17 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- organizational_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- overtime has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- overtime_pay has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- personnel_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- position_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- probation_end_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reason_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- record_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserve_field_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_field_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- retirement_contribution has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- retirement_contributions has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sick_leave has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- special_purpose has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- training_budget has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- union_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vacation_days has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_26 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_27 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_28 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_29 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_30 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_31 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_32 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_33 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_34 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_35 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_36 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_37 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_38 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_39 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wage_component_40 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- work_schedule has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "sequence_number",
        "subtype",
        "uname",
        "wage_type",
        "wage_area",
        "pay_scale_group",
        "pay_scale_level",
        "pay_scale_type",
        "waers",
        "comparison_pay_scale_group",
        "comparison_pay_scale_level",
        "comparison_collective_agreement",
        "bsgrd",
        "dividend_percentage",
        "annual_salary",
        "lga01",
        "wage_component_1",
        "wage_component_01",
        "lga02",
        "wage_component_2",
        "wage_component_02",
        "wage_component_3",
        "wage_component_03",
        "wage_component_4",
        "wage_component_04",
        "wage_component_5",
        "wage_component_05",
        "wage_component_6",
        "wage_component_06",
        "wage_component_7",
        "wage_component_07",
        "wage_component_8",
        "wage_component_08",
        "wage_component_9",
        "wage_component_09",
        "bet10",
        "anz10",
        "bet11",
        "anz11",
        "bet12",
        "anz12",
        "bet13",
        "anz13",
        "bet14",
        "anz14",
        "salary_component_15",
        "wage_component_15",
        "salary_component_16",
        "wage_component_16",
        "salary_component_17",
        "wage_component_17",
        "salary_component_18",
        "anz18",
        "salary_component_19",
        "anz19",
        "salary_component_20",
        "anz20",
        "salary_component_21",
        "anz21",
        "salary_component_22",
        "anz22",
        "salary_component_23",
        "anz23",
        "salary_component_24",
        "anz24",
        "salary_component_25",
        "anz25",
        "salary_component_26",
        "unknown_value_26",
        "salary_component_27",
        "unknown_value_27",
        "salary_component_28",
        "unknown_value_28",
        "salary_component_29",
        "unknown_value_29",
        "salary_component_30",
        "unknown_value_30",
        "salary_component_31",
        "unknown_value_31",
        "salary_component_32",
        "unknown_value_32",
        "salary_component_33",
        "unknown_value_33",
        "salary_component_34",
        "unknown_value_34",
        "salary_component_35",
        "unknown_value_35",
        "salary_component_36",
        "unknown_value_36",
        "salary_component_37",
        "unknown_value_37",
        "salary_component_38",
        "unknown_value_38",
        "salary_component_39",
        "unknown_value_39",
        "salary_component_40",
        "unknown_value_40",
        "indicator_1",
        "indicator_2",
        "salary_currency",
        "currency_indicator",
        "row_id",
        "is_deleted",
        "adjustment_reason",
        "allowance",
        "allowances",
        "begda",
        "benefits",
        "benefits_eligible",
        "client_code",
        "commission",
        "commission_rate",
        "contract_end_date",
        "endda",
        "historical_indicator",
        "pernr",
        "salary_record_date",
        "salary_review_date"
    FROM "sap_pa0008_data_projected_renamed_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_pa0008_data_projected_renamed_casted_missing_handled"
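
The missing-value comments in the block above pair each column's missing percentage with a strategy. As a rough illustration only, the sketch below shows how such a report could be derived. The function names, the sample rows, and the rule "drop only when 100 percent is missing" are assumptions made for this example; the source does not show how Cocoon actually picks a strategy.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def missing_percent(rows: List[Row], column: str) -> float:
    """Return the percentage of rows where `column` is None, rounded to 2 places."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return round(100.0 * missing / len(rows), 2)


def choose_strategy(percent: float) -> str:
    """Hypothetical rule: drop fully empty columns, leave the rest unchanged."""
    return "Drop Column" if percent == 100.0 else "Unchanged"


def missing_report(rows: List[Row], columns: List[str]) -> Dict[str, Tuple[float, str]]:
    """Map each column that has at least one missing value to (percent, strategy)."""
    report: Dict[str, Tuple[float, str]] = {}
    for column in columns:
        percent = missing_percent(rows, column)
        if percent > 0.0:
            report[column] = (percent, choose_strategy(percent))
    return report


# Made-up sample rows; only the column names come from the model above.
sample_rows: List[Row] = [
    {"pernr": "1001", "comparison_date": None, "comparison_pay_scale_group": "AGE"},
    {"pernr": "1002", "comparison_date": None, "comparison_pay_scale_group": None},
    {"pernr": "1003", "comparison_date": None, "comparison_pay_scale_group": None},
]

report = missing_report(
    sample_rows, ["pernr", "comparison_date", "comparison_pay_scale_group"]
)
for column, (percent, strategy) in report.items():
    print(f"{column} has {percent} percent missing. Strategy: {strategy}")
```

With three sample rows, two missing values give 66.67 percent, the same figure reported for comparison_pay_scale_group and comparison_pay_scale_level.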

stg_sap_pa0008_data.yml (Document the table)

version: 2
models:
- name: stg_sap_pa0008_data
  description: Employee compensation data. The table contains the employee ID,
    start and end dates, pay scale group and level, currency, and wage components.
    The source has multiple amount fields (bet01-bet40), suggesting separate pay
    elements or time periods. Organizational fields such as position are empty in
    the source and are dropped from this model. The data appears to track salary
    history and structure for employees.
  columns:
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: subtype
    description: Subtype
    tests:
    - not_null
  - name: uname
    description: Name of the user who last changed the record
    tests:
    - not_null
  - name: wage_type
    description: Wage type
    tests:
    - not_null
  - name: wage_area
    description: Wage area
    tests:
    - not_null
  - name: pay_scale_group
    description: Pay scale group
    tests:
    - not_null
  - name: pay_scale_level
    description: Pay scale level
    tests:
    - not_null
  - name: pay_scale_type
    description: Pay scale type
    tests:
    - not_null
  - name: waers
    description: Currency key
    tests:
    - not_null
  - name: comparison_pay_scale_group
    description: Comparison pay scale group
    tests:
    - not_null
    - accepted_values:
        values:
        - Under 18
        - 18-24
        - 25-34
        - 35-44
        - 45-54
        - 55-64
        - 65 and over
        - AGE
  - name: comparison_pay_scale_level
    description: Comparison pay scale level
    tests:
    - not_null
  - name: comparison_collective_agreement
    description: Comparison collective agreement
    tests:
    - not_null
  - name: bsgrd
    description: Capacity utilization level
    tests:
    - not_null
  - name: dividend_percentage
    description: Dividend or bonus percentage
    tests:
    - not_null
  - name: annual_salary
    description: Annual salary amount
    tests:
    - not_null
  - name: lga01
    description: Wage type code for component 1
    tests:
    - not_null
  - name: wage_component_1
    description: Wage component 1
    tests:
    - not_null
  - name: wage_component_01
    description: Wage component 1
    tests:
    - not_null
  - name: lga02
    description: Wage type code for component 2
    cocoon_meta:
      missing_acceptable: May be a specific attribute not applicable to all entries.
  - name: wage_component_2
    description: Wage component 2
    tests:
    - not_null
  - name: wage_component_02
    description: Wage component 2
    tests:
    - not_null
  - name: wage_component_3
    description: Wage component 3
    tests:
    - not_null
  - name: wage_component_03
    description: Wage component 3
    tests:
    - not_null
  - name: wage_component_4
    description: Wage component 4
    tests:
    - not_null
  - name: wage_component_04
    description: Wage component 4
    tests:
    - not_null
  - name: wage_component_5
    description: Wage component 5
    tests:
    - not_null
  - name: wage_component_05
    description: Wage component 5
    tests:
    - not_null
  - name: wage_component_6
    description: Wage component 6
    tests:
    - not_null
  - name: wage_component_06
    description: Wage component 6
    tests:
    - not_null
  - name: wage_component_7
    description: Wage component 7
    tests:
    - not_null
  - name: wage_component_07
    description: Wage component 7
    tests:
    - not_null
  - name: wage_component_8
    description: Wage component 8
    tests:
    - not_null
  - name: wage_component_08
    description: Wage component 8
    tests:
    - not_null
  - name: wage_component_9
    description: Wage component 9
    tests:
    - not_null
  - name: wage_component_09
    description: Wage component 9
    tests:
    - not_null
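  - name: bet10
    description: Wage type amount for component 10
    tests:
    - not_null
  - name: anz10
    description: Number of units for component 10
    tests:
    - not_null
  - name: bet11
    description: Wage type amount for component 11
    tests:
    - not_null
  - name: anz11
    description: Number of units for component 11
    tests:
    - not_null
  - name: bet12
    description: Wage type amount for component 12
    tests:
    - not_null
  - name: anz12
    description: Number of units for component 12
    tests:
    - not_null
  - name: bet13
    description: Wage type amount for component 13
    tests:
    - not_null
  - name: anz13
    description: Number of units for component 13
    tests:
    - not_null
  - name: bet14
    description: Wage type amount for component 14
    tests:
    - not_null
  - name: anz14
    description: Number of units for component 14
    tests:
    - not_null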
  - name: salary_component_15
    description: Salary component 15
    tests:
    - not_null
  - name: wage_component_15
    description: Wage component 15
    tests:
    - not_null
  - name: salary_component_16
    description: Salary component 16
    tests:
    - not_null
  - name: wage_component_16
    description: Wage component 16
    tests:
    - not_null
  - name: salary_component_17
    description: Salary component 17
    tests:
    - not_null
  - name: wage_component_17
    description: Wage component 17
    tests:
    - not_null
  - name: salary_component_18
    description: Salary component 18
    tests:
    - not_null
  - name: anz18
    description: Number of units for component 18
    tests:
    - not_null
  - name: salary_component_19
    description: Salary component 19
    tests:
    - not_null
  - name: anz19
    description: Number of units for component 19
    tests:
    - not_null
  - name: salary_component_20
    description: Salary component 20
    tests:
    - not_null
  - name: anz20
    description: Number of units for component 20
    tests:
    - not_null
  - name: salary_component_21
    description: Salary component 21
    tests:
    - not_null
  - name: anz21
    description: Number of units for component 21
    tests:
    - not_null
  - name: salary_component_22
    description: Salary component 22
    tests:
    - not_null
  - name: anz22
    description: Number of units for component 22
    tests:
    - not_null
  - name: salary_component_23
    description: Salary component 23
    tests:
    - not_null
  - name: anz23
    description: Number of units for component 23
    tests:
    - not_null
  - name: salary_component_24
    description: Salary component 24
    tests:
    - not_null
  - name: anz24
    description: Number of units for component 24
    tests:
    - not_null
  - name: salary_component_25
    description: Salary component 25
    tests:
    - not_null
  - name: anz25
    description: Number of units for component 25
    tests:
    - not_null
  - name: salary_component_26
    description: Salary component 26
    tests:
    - not_null
  - name: unknown_value_26
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_27
    description: Salary component 27
    tests:
    - not_null
  - name: unknown_value_27
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_28
    description: Salary component 28
    tests:
    - not_null
  - name: unknown_value_28
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_29
    description: Salary component 29
    tests:
    - not_null
  - name: unknown_value_29
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_30
    description: Salary component 30
    tests:
    - not_null
  - name: unknown_value_30
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_31
    description: Salary component 31
    tests:
    - not_null
  - name: unknown_value_31
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_32
    description: Salary component 32
    tests:
    - not_null
  - name: unknown_value_32
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_33
    description: Salary component 33
    tests:
    - not_null
  - name: unknown_value_33
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_34
    description: Salary component 34
    tests:
    - not_null
  - name: unknown_value_34
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_35
    description: Salary component 35
    tests:
    - not_null
  - name: unknown_value_35
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_36
    description: Salary component 36
    tests:
    - not_null
  - name: unknown_value_36
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_37
    description: Salary component 37
    tests:
    - not_null
  - name: unknown_value_37
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_38
    description: Salary component 38
    tests:
    - not_null
  - name: unknown_value_38
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_39
    description: Salary component 39
    tests:
    - not_null
  - name: unknown_value_39
    description: Unknown numeric value
    tests:
    - not_null
  - name: salary_component_40
    description: Salary component 40
    tests:
    - not_null
  - name: unknown_value_40
    description: Unknown numeric value
    tests:
    - not_null
  - name: indicator_1
    description: Indicator field 1
    tests:
    - not_null
    - accepted_values:
        values:
        - A
        - B
        - C
        - D
        - E
        - F
        - G
        - H
        - I
        - J
        - K
        - L
        - M
        - N
        - O
        - P
        - Q
        - R
        - S
        - T
        - U
        - V
        - W
        - X
        - Y
        - Z
  - name: indicator_2
    description: Indicator field 2
    tests:
    - accepted_values:
        values:
        - I
        - B
    cocoon_meta:
      missing_acceptable: Partially filled, might be conditional on other factors.
  - name: salary_currency
    description: Currency code for the salary
    tests:
    - not_null
  - name: currency_indicator
    description: Currency indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - $
        - "€"
        - "£"
        - T
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is explicitly described as a unique identifier for the
        row. In a table where each row represents a salary record, this would indeed
        be a unique value for each record.
  - name: is_deleted
    description: Indicates if the record has been deleted
    tests:
    - not_null
  - name: adjustment_reason
    description: Salary adjustment reason
    cocoon_meta:
      missing_acceptable: No salary adjustment made
  - name: allowance
    description: Allowance amount
    cocoon_meta:
      missing_acceptable: No additional allowances provided
  - name: allowances
    description: Allowances
    cocoon_meta:
      missing_acceptable: No additional allowances provided
  - name: begda
    description: Start date of the record validity period
    tests:
    - not_null
  - name: benefits
    description: Benefits
    cocoon_meta:
      missing_acceptable: No additional benefits provided
  - name: benefits_eligible
    description: Benefits eligibility flag
    cocoon_meta:
      missing_acceptable: Not eligible for benefits
  - name: client_code
    description: Client or company code
    tests:
    - not_null
  - name: commission
    description: Commission amount
    cocoon_meta:
      missing_acceptable: No commission-based compensation
  - name: commission_rate
    description: Commission rate or structure
    cocoon_meta:
      missing_acceptable: No commission-based compensation
  - name: contract_end_date
    description: Employment contract end date
    cocoon_meta:
      missing_acceptable: Permanent or ongoing employment without fixed end date
  - name: endda
    description: End date of the record validity period
    tests:
    - not_null
  - name: historical_indicator
    description: Historical data indicator
    cocoon_meta:
      missing_acceptable: Not applicable for current, non-historical records.
  - name: pernr
    description: Employee ID number
    tests:
    - not_null
  - name: salary_record_date
    description: Date of the salary record
    tests:
    - not_null
  - name: salary_review_date
    description: Salary review date
    cocoon_meta:
      missing_acceptable: May not be scheduled yet for recent hires
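
The YAML above attaches three kinds of tests to columns: not_null, unique, and accepted_values. The sketch below is a plain-Python approximation of what each test checks, intended only to make the intent concrete. The helper names and sample rows are hypothetical, and treating missing values as passing the unique and accepted_values checks is an assumption of this sketch rather than something the source states.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def not_null_failures(rows: List[Row], column: str) -> List[Row]:
    """Rows that would fail a not_null test: the column value is missing."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows: List[Row], column: str) -> List[Any]:
    """Non-missing values that occur more than once in the column."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


def accepted_values_failures(
    rows: List[Row], column: str, allowed: Iterable[Any]
) -> List[Any]:
    """Distinct non-missing values that are outside the allowed set."""
    allowed_set = set(allowed)
    return sorted(
        {
            row[column]
            for row in rows
            if row.get(column) is not None and row[column] not in allowed_set
        }
    )


# Made-up rows; the column names and the allowed values I and B come from the YAML.
sample_rows: List[Row] = [
    {"row_id": 1, "indicator_2": "I"},
    {"row_id": 2, "indicator_2": None},
    {"row_id": 2, "indicator_2": "Z"},
]

print(len(not_null_failures(sample_rows, "indicator_2")))
print(unique_failures(sample_rows, "row_id"))
print(accepted_values_failures(sample_rows, "indicator_2", ["I", "B"]))
```

A column such as indicator_2, which carries accepted_values but no not_null test, tolerates missing entries under this reading, in line with its missing_acceptable note.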

stg_sap_kna1_data (first 100 rows)

customer_number country_code name_1 city postal_code customer_id street_address search_term_1 search_term_3 international_location_number location_number_check_digit creator_name ktokd monthly_turnover last_turnover_year last_dunning_number last_dunning_year annual_turnover due_diligence_flag dunning_level civil_servant last_update_date last_update_time registration_date risk_classification registration_number_date legal_nature vso_palette_height vso_integer_value vso_unload_side vso_loading_preference row_id is_deleted address_number alcohol_info business_region_school central_billing_block central_payment_allowed certificate_registration_number client company_code company_code_indicator company_size creation_date language_key one_time_customer one_time_customer_flag po_box po_box_city railway_station_code railway_station_name region suframa_number vat_number vendor_number vso_data_point vso_material_palette vso_one_material_flag vso_one_sort_flag vso_packing_material vso_palette_unit_load
0 CA301 USA Dunder Mifflin Scranton 18503 CA301 3927 Saticoy St DUNDER MIFFLIN SCRANTON 0 0 SCHRUTE DWIGHT 0.0 0 0 0 0.0 X 0 X 0 0 0 0 0 0 0.0 0 0 0 1 False 620981 None None None None None 800 None 0 None 2011-12-01 1 None None None None None None 130 None None None None None None None None None
1 CA302 PO Krusty Krab. Bikini Bottom 000001 CA302 831 Bottom Feeder Lane KRUSTY KRAB BIKINI BOTTOM 0 0 KRABS EUGENE 0.0 0 0 0 0.0 X 0 X 0 0 0 0 0 0 0.0 0 0 0 2 False 620983 None None None None None 800 None 0 None 2011-12-01 1 None None None None None None 100 None None None None None None None None None
2 CA303 UK Holmes And Watson London NW1 6XE CA303 221B Baker Street HOLMES AND WATSON LONDON 0 0 HOLMES WATSON 0.0 0 0 0 0.0 X 0 X 0 0 0 0 0 0 0.0 0 0 0 3 False 620985 None None None None None 800 None 0 None 2011-12-01 1 None None None None None None 120 None None None None None None None None None

stg_sap_kna1_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 04:54:40.061772+00:00
WITH 
"sap_kna1_data_projected" AS (
    -- Projection: Selecting 196 out of 197 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "kunnr",
        "mandt",
        "land1",
        "name1",
        "name2",
        "ort01",
        "pstlz",
        "regio",
        "sortl",
        "stras",
        "telf1",
        "telfx",
        "xcpdk",
        "adrnr",
        "mcod1",
        "mcod2",
        "mcod3",
        "anred",
        "aufsd",
        "bahne",
        "bahns",
        "bbbnr",
        "bbsnr",
        "begru",
        "brsch",
        "bubkz",
        "datlt",
        "erdat",
        "ernam",
        "exabl",
        "faksd",
        "fiskn",
        "knazk",
        "knrza",
        "konzs",
        "ktokd",
        "kukla",
        "lifnr",
        "lifsd",
        "locco",
        "loevm",
        "name3",
        "name4",
        "niels",
        "ort02",
        "pfach",
        "pstl2",
        "counc",
        "cityc",
        "rpmkr",
        "sperr",
        "spras",
        "stcd1",
        "stcd2",
        "stkza",
        "stkzu",
        "telbx",
        "telf2",
        "teltx",
        "telx1",
        "lzone",
        "xzemp",
        "vbund",
        "stceg",
        "dear1",
        "dear2",
        "dear3",
        "dear4",
        "dear5",
        "gform",
        "bran1",
        "bran2",
        "bran3",
        "bran4",
        "bran5",
        "ekont",
        "umsat",
        "umjah",
        "uwaer",
        "jmzah",
        "jmjah",
        "katr1",
        "katr2",
        "katr3",
        "katr4",
        "katr5",
        "katr6",
        "katr7",
        "katr8",
        "katr9",
        "katr10",
        "stkzn",
        "umsa1",
        "txjcd",
        "periv",
        "abrvw",
        "inspbydebi",
        "inspatdebi",
        "ktocd",
        "pfort",
        "werks",
        "dtams",
        "dtaws",
        "duefl",
        "hzuor",
        "sperz",
        "etikg",
        "civve",
        "milve",
        "kdkg1",
        "kdkg2",
        "kdkg3",
        "kdkg4",
        "kdkg5",
        "xknza",
        "fityp",
        "stcdt",
        "stcd3",
        "stcd4",
        "stcd5",
        "xicms",
        "xxipi",
        "xsubt",
        "cfopc",
        "txlw1",
        "txlw2",
        "ccc01",
        "ccc02",
        "ccc03",
        "ccc04",
        "cassd",
        "knurl",
        "j_1kfrepre",
        "j_1kftbus",
        "j_1kftind",
        "confs",
        "updat",
        "uptim",
        "nodel",
        "dear6",
        "cvp_xblck",
        "suframa",
        "rg",
        "exp",
        "uf",
        "rgdate",
        "ric",
        "rne",
        "rnedate",
        "cnae",
        "legalnat",
        "crtn",
        "icmstaxpay",
        "indtyp",
        "tdt",
        "comsize",
        "decregpc",
        "_vso_r_palhgt",
        "_vso_r_pal_ul",
        "_vso_r_pk_mat",
        "_vso_r_matpal",
        "_vso_r_i_no_lyr",
        "_vso_r_one_mat",
        "_vso_r_one_sort",
        "_vso_r_uld_side",
        "_vso_r_load_pref",
        "_vso_r_dpoint",
        "_xlso_customer",
        "_xlso_sysid",
        "_xlso_client",
        "_xlso_partner",
        "_xlso_pref_pay",
        "alc",
        "pmt_office",
        "fee_schedule",
        "duns",
        "duns4",
        "psofg",
        "psois",
        "pson1",
        "pson2",
        "pson3",
        "psovn",
        "psotl",
        "psohs",
        "psost",
        "psoo1",
        "psoo2",
        "psoo3",
        "psoo4",
        "psoo5",
        "oidrc",
        "oid_poreqd",
        "oipbl",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_kna1_data"
),

"sap_kna1_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- kunnr -> customer_number
    -- mandt -> client
    -- land1 -> country_code
    -- name1 -> name_1
    -- name2 -> name_2
    -- ort01 -> city
    -- pstlz -> postal_code
    -- regio -> region
    -- sortl -> customer_id
    -- stras -> street_address
    -- telf1 -> telephone_1
    -- telfx -> fax_number
    -- xcpdk -> one_time_customer_flag
    -- adrnr -> address_number
    -- mcod1 -> search_term_1
    -- mcod2 -> search_term_2
    -- mcod3 -> search_term_3
    -- anred -> form_of_address
    -- aufsd -> order_block
    -- bahne -> railway_station_name
    -- bahns -> railway_station_code
    -- bbbnr -> international_location_number
    -- bbsnr -> location_number_check_digit
    -- begru -> authorization_group
    -- brsch -> business_region_school
    -- bubkz -> company_code_indicator
    -- datlt -> data_line_type
    -- erdat -> creation_date
    -- ernam -> creator_name
    -- exabl -> credit_limit
    -- faksd -> central_billing_block
    -- fiskn -> fiscal_number
    -- knazk -> customer_category
    -- knrza -> reference_number
    -- konzs -> customer_classification
    -- kukla -> customer_class
    -- lifnr -> vendor_number
    -- lifsd -> vendor_lockout_status
    -- locco -> location_code
    -- loevm -> deletion_flag
    -- name3 -> name_3
    -- name4 -> customer_name_4
    -- niels -> nielsen_code
    -- ort02 -> district
    -- pfach -> po_box
    -- pstl2 -> district_code
    -- counc -> county_code
    -- cityc -> city_code
    -- rpmkr -> risk_profile_marker
    -- sperr -> blocking_indicator
    -- spras -> language_key
    -- stcd1 -> tax_number_1
    -- stcd2 -> tax_number_2
    -- stkza -> statistical_group
    -- stkzu -> sales_tax_liability
    -- telbx -> telebox_number
    -- telf2 -> telephone_2
    -- teltx -> teletex_number
    -- telx1 -> telex_number
    -- lzone -> transportation_zone
    -- xzemp -> central_payment_allowed
    -- vbund -> company_code
    -- stceg -> vat_number
    -- dear1 -> address_form_1
    -- dear2 -> address_form_2
    -- dear3 -> address_form_3
    -- dear4 -> address_form_4
    -- dear5 -> address_form_5
    -- gform -> legal_form
    -- bran1 -> industry_sector_1
    -- bran2 -> industry_sector_2
    -- bran3 -> industry_sector_3
    -- bran4 -> branch_code_4
    -- bran5 -> branch_code_5
    -- ekont -> purchasing_account
    -- umsat -> monthly_turnover
    -- umjah -> last_turnover_year
    -- uwaer -> currency_code
    -- jmzah -> last_dunning_number
    -- jmjah -> last_dunning_year
    -- katr6 -> customer_attribute_6
    -- katr7 -> customer_attribute_7
    -- katr8 -> custom_attribute_1
    -- katr9 -> custom_attribute_2
    -- katr10 -> customer_attribute_10
    -- stkzn -> natural_person_indicator
    -- umsa1 -> annual_turnover
    -- txjcd -> tax_jurisdiction
    -- periv -> fiscal_year_variant
    -- abrvw -> communication_preference
    -- inspbydebi -> payment_instruction_key_alt
    -- inspatdebi -> payment_instruction_key
    -- ktocd -> account_code
    -- pfort -> po_box_city
    -- werks -> plant_code
    -- dtams -> data_medium_exchange
    -- dtaws -> data_transfer_method
    -- duefl -> due_diligence_flag
    -- hzuor -> dunning_level
    -- sperz -> central_blocking_indicator
    -- etikg -> delivery_priority
    -- civve -> civil_servant
    -- milve -> military_verification
    -- kdkg4 -> customer_group_4
    -- kdkg5 -> customer_group_5
    -- xknza -> alternative_payee_allowed
    -- fityp -> tax_type
    -- stcdt -> tax_number_type
    -- stcd3 -> tax_number_3
    -- stcd4 -> tax_number_4
    -- stcd5 -> tax_number_5
    -- xicms -> icms_tax_liable
    -- xxipi -> ipi_tax_liable
    -- xsubt -> subtotal_per_delivery
    -- cfopc -> financial_operations_center
    -- txlw1 -> tax_law_classification
    -- txlw2 -> tax_law_indicator
    -- ccc01 -> customer_classification_1
    -- ccc02 -> customer_classification_2
    -- ccc03 -> customer_classification_3
    -- ccc04 -> customer_classification_4
    -- cassd -> central_address_service
    -- knurl -> customer_url
    -- j_1kfrepre -> representative_name
    -- j_1kftbus -> business_type
    -- j_1kftind -> industry_category
    -- confs -> statement_confirmation
    -- updat -> last_update_date
    -- uptim -> last_update_time
    -- nodel -> delivery_block
    -- dear6 -> address_form_6
    -- cvp_xblck -> vendor_master_block
    -- suframa -> suframa_number
    -- rg -> account_number_prefix
    -- exp -> export_indicator
    -- uf -> state_code
    -- rgdate -> registration_date
    -- ric -> risk_classification
    -- rne -> registration_number
    -- rnedate -> registration_number_date
    -- cnae -> economic_activity_classification
    -- legalnat -> legal_nature
    -- crtn -> certificate_registration_number
    -- icmstaxpay -> icms_taxpayer
    -- indtyp -> industry_type
    -- tdt -> transit_time
    -- comsize -> company_size
    -- decregpc -> decree_registration_code
    -- _vso_r_palhgt -> vso_palette_height
    -- _vso_r_pal_ul -> vso_palette_unit_load
    -- _vso_r_pk_mat -> vso_packing_material
    -- _vso_r_matpal -> vso_material_palette
    -- _vso_r_i_no_lyr -> vso_integer_value
    -- _vso_r_one_mat -> vso_one_material_flag
    -- _vso_r_one_sort -> vso_one_sort_flag
    -- _vso_r_uld_side -> vso_unload_side
    -- _vso_r_load_pref -> vso_loading_preference
    -- _vso_r_dpoint -> vso_data_point
    -- _xlso_customer -> xlso_customer_id
    -- _xlso_sysid -> xlso_system_id
    -- _xlso_client -> xlso_client_id
    -- _xlso_partner -> xlso_partner_id
    -- _xlso_pref_pay -> xlso_preferred_payment
    -- alc -> alcohol_info
    -- pmt_office -> payment_office
    -- psofg -> sales_office
    -- psois -> sales_district
    -- psovn -> customer_account_group
    -- psotl -> order_probability
    -- psohs -> sales_group
    -- psost -> statistics_group
    -- oidrc -> risk_category
    -- oid_poreqd -> one_time_customer
    -- oipbl -> payment_block
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "kunnr" AS "customer_number",
        "mandt" AS "client",
        "land1" AS "country_code",
        "name1" AS "name_1",
        "name2" AS "name_2",
        "ort01" AS "city",
        "pstlz" AS "postal_code",
        "regio" AS "region",
        "sortl" AS "customer_id",
        "stras" AS "street_address",
        "telf1" AS "telephone_1",
        "telfx" AS "fax_number",
        "xcpdk" AS "one_time_customer_flag",
        "adrnr" AS "address_number",
        "mcod1" AS "search_term_1",
        "mcod2" AS "search_term_2",
        "mcod3" AS "search_term_3",
        "anred" AS "form_of_address",
        "aufsd" AS "order_block",
        "bahne" AS "railway_station_name",
        "bahns" AS "railway_station_code",
        "bbbnr" AS "international_location_number",
        "bbsnr" AS "location_number_check_digit",
        "begru" AS "authorization_group",
        "brsch" AS "business_region_school",
        "bubkz" AS "company_code_indicator",
        "datlt" AS "data_line_type",
        "erdat" AS "creation_date",
        "ernam" AS "creator_name",
        "exabl" AS "credit_limit",
        "faksd" AS "central_billing_block",
        "fiskn" AS "fiscal_number",
        "knazk" AS "customer_category",
        "knrza" AS "reference_number",
        "konzs" AS "customer_classification",
        "ktokd",
        "kukla" AS "customer_class",
        "lifnr" AS "vendor_number",
        "lifsd" AS "vendor_lockout_status",
        "locco" AS "location_code",
        "loevm" AS "deletion_flag",
        "name3" AS "name_3",
        "name4" AS "customer_name_4",
        "niels" AS "nielsen_code",
        "ort02" AS "district",
        "pfach" AS "po_box",
        "pstl2" AS "district_code",
        "counc" AS "county_code",
        "cityc" AS "city_code",
        "rpmkr" AS "risk_profile_marker",
        "sperr" AS "blocking_indicator",
        "spras" AS "language_key",
        "stcd1" AS "tax_number_1",
        "stcd2" AS "tax_number_2",
        "stkza" AS "statistical_group",
        "stkzu" AS "sales_tax_liability",
        "telbx" AS "telebox_number",
        "telf2" AS "telephone_2",
        "teltx" AS "teletex_number",
        "telx1" AS "telex_number",
        "lzone" AS "transportation_zone",
        "xzemp" AS "central_payment_allowed",
        "vbund" AS "company_code",
        "stceg" AS "vat_number",
        "dear1" AS "address_form_1",
        "dear2" AS "address_form_2",
        "dear3" AS "address_form_3",
        "dear4" AS "address_form_4",
        "dear5" AS "address_form_5",
        "gform" AS "legal_form",
        "bran1" AS "industry_sector_1",
        "bran2" AS "industry_sector_2",
        "bran3" AS "industry_sector_3",
        "bran4" AS "branch_code_4",
        "bran5" AS "branch_code_5",
        "ekont" AS "purchasing_account",
        "umsat" AS "monthly_turnover",
        "umjah" AS "last_turnover_year",
        "uwaer" AS "currency_code",
        "jmzah" AS "last_dunning_number",
        "jmjah" AS "last_dunning_year",
        "katr1",
        "katr2",
        "katr3",
        "katr4",
        "katr5",
        "katr6" AS "customer_attribute_6",
        "katr7" AS "customer_attribute_7",
        "katr8" AS "custom_attribute_1",
        "katr9" AS "custom_attribute_2",
        "katr10" AS "customer_attribute_10",
        "stkzn" AS "natural_person_indicator",
        "umsa1" AS "annual_turnover",
        "txjcd" AS "tax_jurisdiction",
        "periv" AS "fiscal_year_variant",
        "abrvw" AS "communication_preference",
        "inspbydebi" AS "payment_instruction_key_alt",
        "inspatdebi" AS "payment_instruction_key",
        "ktocd" AS "account_code",
        "pfort" AS "po_box_city",
        "werks" AS "plant_code",
        "dtams" AS "data_medium_exchange",
        "dtaws" AS "data_transfer_method",
        "duefl" AS "due_diligence_flag",
        "hzuor" AS "dunning_level",
        "sperz" AS "central_blocking_indicator",
        "etikg" AS "delivery_priority",
        "civve" AS "civil_servant",
        "milve" AS "military_verification",
        "kdkg1",
        "kdkg2",
        "kdkg3",
        "kdkg4" AS "customer_group_4",
        "kdkg5" AS "customer_group_5",
        "xknza" AS "alternative_payee_allowed",
        "fityp" AS "tax_type",
        "stcdt" AS "tax_number_type",
        "stcd3" AS "tax_number_3",
        "stcd4" AS "tax_number_4",
        "stcd5" AS "tax_number_5",
        "xicms" AS "icms_tax_liable",
        "xxipi" AS "ipi_tax_liable",
        "xsubt" AS "subtotal_per_delivery",
        "cfopc" AS "financial_operations_center",
        "txlw1" AS "tax_law_classification",
        "txlw2" AS "tax_law_indicator",
        "ccc01" AS "customer_classification_1",
        "ccc02" AS "customer_classification_2",
        "ccc03" AS "customer_classification_3",
        "ccc04" AS "customer_classification_4",
        "cassd" AS "central_address_service",
        "knurl" AS "customer_url",
        "j_1kfrepre" AS "representative_name",
        "j_1kftbus" AS "business_type",
        "j_1kftind" AS "industry_category",
        "confs" AS "statement_confirmation",
        "updat" AS "last_update_date",
        "uptim" AS "last_update_time",
        "nodel" AS "delivery_block",
        "dear6" AS "address_form_6",
        "cvp_xblck" AS "vendor_master_block",
        "suframa" AS "suframa_number",
        "rg" AS "account_number_prefix",
        "exp" AS "export_indicator",
        "uf" AS "state_code",
        "rgdate" AS "registration_date",
        "ric" AS "risk_classification",
        "rne" AS "registration_number",
        "rnedate" AS "registration_number_date",
        "cnae" AS "economic_activity_classification",
        "legalnat" AS "legal_nature",
        "crtn" AS "certificate_registration_number",
        "icmstaxpay" AS "icms_taxpayer",
        "indtyp" AS "industry_type",
        "tdt" AS "transit_time",
        "comsize" AS "company_size",
        "decregpc" AS "decree_registration_code",
        "_vso_r_palhgt" AS "vso_palette_height",
        "_vso_r_pal_ul" AS "vso_palette_unit_load",
        "_vso_r_pk_mat" AS "vso_packing_material",
        "_vso_r_matpal" AS "vso_material_palette",
        "_vso_r_i_no_lyr" AS "vso_integer_value",
        "_vso_r_one_mat" AS "vso_one_material_flag",
        "_vso_r_one_sort" AS "vso_one_sort_flag",
        "_vso_r_uld_side" AS "vso_unload_side",
        "_vso_r_load_pref" AS "vso_loading_preference",
        "_vso_r_dpoint" AS "vso_data_point",
        "_xlso_customer" AS "xlso_customer_id",
        "_xlso_sysid" AS "xlso_system_id",
        "_xlso_client" AS "xlso_client_id",
        "_xlso_partner" AS "xlso_partner_id",
        "_xlso_pref_pay" AS "xlso_preferred_payment",
        "alc" AS "alcohol_info",
        "pmt_office" AS "payment_office",
        "fee_schedule",
        "duns",
        "duns4",
        "psofg" AS "sales_office",
        "psois" AS "sales_district",
        "pson1",
        "pson2",
        "pson3",
        "psovn" AS "customer_account_group",
        "psotl" AS "order_probability",
        "psohs" AS "sales_group",
        "psost" AS "statistics_group",
        "psoo1",
        "psoo2",
        "psoo3",
        "psoo4",
        "psoo5",
        "oidrc" AS "risk_category",
        "oid_poreqd" AS "one_time_customer",
        "oipbl" AS "payment_block",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_kna1_data_projected"
),

"sap_kna1_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values: 
    -- country_code: The codes are inconsistent and do not follow ISO 3166-1 alpha-2. 'PO' is not a standard ISO country code and likely stands for Poland ('PL'); 'UK' is commonly used, but the official code is 'GB'; 'USA' is the three-letter form of 'US'. The intended result is the two-letter code in each case.
    -- postal_code: '000001' is an unusual value: it consists almost entirely of zeros, which is atypical for most postal systems. The other two values ('18503' and 'NW1 6XE') appear to be valid US and UK postal codes. Because '000001' is the most frequent value, it is likely a placeholder or default rather than a real postal code, so the intended fix is to replace it with an empty string to indicate missing data.
    -- civil_servant: The column contains only the value 'X', which does not clearly indicate civil servant status. A binary column like this should use explicit values such as 'Yes'/'No' or 'True'/'False', so 'X' is mapped to 'Yes' and any other value is left unchanged.
    SELECT
        "customer_number",
        "client",
        CASE
            WHEN "country_code" = 'PO' THEN 'PL'
            WHEN "country_code" = 'UK' THEN 'GB'
            WHEN "country_code" = 'USA' THEN 'US'
            ELSE "country_code"
        END AS "country_code",
        "name_1",
        "name_2",
        "city",
        CASE
            WHEN "postal_code" = '000001' THEN ''
            ELSE "postal_code"
        END AS "postal_code",
        "region",
        "customer_id",
        "street_address",
        "telephone_1",
        "fax_number",
        "one_time_customer_flag",
        "address_number",
        "search_term_1",
        "search_term_2",
        "search_term_3",
        "form_of_address",
        "order_block",
        "railway_station_name",
        "railway_station_code",
        "international_location_number",
        "location_number_check_digit",
        "authorization_group",
        "business_region_school",
        "company_code_indicator",
        "data_line_type",
        "creation_date",
        "creator_name",
        "credit_limit",
        "central_billing_block",
        "fiscal_number",
        "customer_category",
        "reference_number",
        "customer_classification",
        "ktokd",
        "customer_class",
        "vendor_number",
        "vendor_lockout_status",
        "location_code",
        "deletion_flag",
        "name_3",
        "customer_name_4",
        "nielsen_code",
        "district",
        "po_box",
        "district_code",
        "county_code",
        "city_code",
        "risk_profile_marker",
        "blocking_indicator",
        "language_key",
        "tax_number_1",
        "tax_number_2",
        "statistical_group",
        "sales_tax_liability",
        "telebox_number",
        "telephone_2",
        "teletex_number",
        "telex_number",
        "transportation_zone",
        "central_payment_allowed",
        "company_code",
        "vat_number",
        "address_form_1",
        "address_form_2",
        "address_form_3",
        "address_form_4",
        "address_form_5",
        "legal_form",
        "industry_sector_1",
        "industry_sector_2",
        "industry_sector_3",
        "branch_code_4",
        "branch_code_5",
        "purchasing_account",
        "monthly_turnover",
        "last_turnover_year",
        "currency_code",
        "last_dunning_number",
        "last_dunning_year",
        "katr1",
        "katr2",
        "katr3",
        "katr4",
        "katr5",
        "customer_attribute_6",
        "customer_attribute_7",
        "custom_attribute_1",
        "custom_attribute_2",
        "customer_attribute_10",
        "natural_person_indicator",
        "annual_turnover",
        "tax_jurisdiction",
        "fiscal_year_variant",
        "communication_preference",
        "payment_instruction_key_alt",
        "payment_instruction_key",
        "account_code",
        "po_box_city",
        "plant_code",
        "data_medium_exchange",
        "data_transfer_method",
        "due_diligence_flag",
        "dunning_level",
        "central_blocking_indicator",
        "delivery_priority",
        CASE
            WHEN "civil_servant" = 'X' THEN 'Yes'
            ELSE "civil_servant"
        END AS "civil_servant",
        "military_verification",
        "kdkg1",
        "kdkg2",
        "kdkg3",
        "customer_group_4",
        "customer_group_5",
        "alternative_payee_allowed",
        "tax_type",
        "tax_number_type",
        "tax_number_3",
        "tax_number_4",
        "tax_number_5",
        "icms_tax_liable",
        "ipi_tax_liable",
        "subtotal_per_delivery",
        "financial_operations_center",
        "tax_law_classification",
        "tax_law_indicator",
        "customer_classification_1",
        "customer_classification_2",
        "customer_classification_3",
        "customer_classification_4",
        "central_address_service",
        "customer_url",
        "representative_name",
        "business_type",
        "industry_category",
        "statement_confirmation",
        "last_update_date",
        "last_update_time",
        "delivery_block",
        "address_form_6",
        "vendor_master_block",
        "suframa_number",
        "account_number_prefix",
        "export_indicator",
        "state_code",
        "registration_date",
        "risk_classification",
        "registration_number",
        "registration_number_date",
        "economic_activity_classification",
        "legal_nature",
        "certificate_registration_number",
        "icms_taxpayer",
        "industry_type",
        "transit_time",
        "company_size",
        "decree_registration_code",
        "vso_palette_height",
        "vso_palette_unit_load",
        "vso_packing_material",
        "vso_material_palette",
        "vso_integer_value",
        "vso_one_material_flag",
        "vso_one_sort_flag",
        "vso_unload_side",
        "vso_loading_preference",
        "vso_data_point",
        "xlso_customer_id",
        "xlso_system_id",
        "xlso_client_id",
        "xlso_partner_id",
        "xlso_preferred_payment",
        "alcohol_info",
        "payment_office",
        "fee_schedule",
        "duns",
        "duns4",
        "sales_office",
        "sales_district",
        "pson1",
        "pson2",
        "pson3",
        "customer_account_group",
        "order_probability",
        "sales_group",
        "statistics_group",
        "psoo1",
        "psoo2",
        "psoo3",
        "psoo4",
        "psoo5",
        "risk_category",
        "one_time_customer",
        "payment_block",
        "row_id",
        "is_deleted"
    FROM "sap_kna1_data_projected_renamed"
),

"sap_kna1_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- account_code: from DECIMAL to VARCHAR
    -- account_number_prefix: from DECIMAL to VARCHAR
    -- address_form_1: from DECIMAL to VARCHAR
    -- address_form_2: from DECIMAL to VARCHAR
    -- address_form_3: from DECIMAL to VARCHAR
    -- address_form_4: from DECIMAL to VARCHAR
    -- address_form_5: from DECIMAL to VARCHAR
    -- address_form_6: from DECIMAL to VARCHAR
    -- address_number: from INT to VARCHAR
    -- alcohol_info: from DECIMAL to VARCHAR
    -- alternative_payee_allowed: from DECIMAL to VARCHAR
    -- authorization_group: from DECIMAL to VARCHAR
    -- blocking_indicator: from DECIMAL to VARCHAR
    -- branch_code_4: from DECIMAL to VARCHAR
    -- branch_code_5: from DECIMAL to VARCHAR
    -- business_region_school: from DECIMAL to VARCHAR
    -- business_type: from DECIMAL to VARCHAR
    -- central_address_service: from DECIMAL to VARCHAR
    -- central_billing_block: from DECIMAL to VARCHAR
    -- central_blocking_indicator: from DECIMAL to VARCHAR
    -- central_payment_allowed: from DECIMAL to VARCHAR
    -- certificate_registration_number: from DECIMAL to VARCHAR
    -- city_code: from DECIMAL to VARCHAR
    -- client: from INT to VARCHAR
    -- communication_preference: from DECIMAL to VARCHAR
    -- company_code: from DECIMAL to VARCHAR
    -- company_code_indicator: from INT to VARCHAR
    -- company_size: from DECIMAL to VARCHAR
    -- county_code: from DECIMAL to VARCHAR
    -- creation_date: from INT to DATE
    -- currency_code: from DECIMAL to VARCHAR
    -- custom_attribute_1: from DECIMAL to VARCHAR
    -- custom_attribute_2: from DECIMAL to VARCHAR
    -- customer_account_group: from DECIMAL to VARCHAR
    -- customer_attribute_10: from DECIMAL to VARCHAR
    -- customer_attribute_6: from DECIMAL to VARCHAR
    -- customer_attribute_7: from DECIMAL to VARCHAR
    -- customer_category: from DECIMAL to VARCHAR
    -- customer_class: from DECIMAL to VARCHAR
    -- customer_classification: from DECIMAL to VARCHAR
    -- customer_classification_1: from DECIMAL to VARCHAR
    -- customer_classification_2: from DECIMAL to VARCHAR
    -- customer_classification_3: from DECIMAL to VARCHAR
    -- customer_classification_4: from DECIMAL to VARCHAR
    -- customer_group_4: from DECIMAL to VARCHAR
    -- customer_group_5: from DECIMAL to VARCHAR
    -- customer_name_4: from DECIMAL to VARCHAR
    -- customer_url: from DECIMAL to VARCHAR
    -- data_line_type: from DECIMAL to VARCHAR
    -- data_medium_exchange: from DECIMAL to VARCHAR
    -- data_transfer_method: from DECIMAL to VARCHAR
    -- decree_registration_code: from DECIMAL to VARCHAR
    -- deletion_flag: from DECIMAL to VARCHAR
    -- delivery_block: from DECIMAL to VARCHAR
    -- delivery_priority: from DECIMAL to VARCHAR
    -- district: from DECIMAL to VARCHAR
    -- district_code: from DECIMAL to VARCHAR
    -- duns: from DECIMAL to VARCHAR
    -- duns4: from DECIMAL to VARCHAR
    -- economic_activity_classification: from DECIMAL to VARCHAR
    -- export_indicator: from DECIMAL to VARCHAR
    -- fax_number: from DECIMAL to VARCHAR
    -- fee_schedule: from DECIMAL to VARCHAR
    -- financial_operations_center: from DECIMAL to VARCHAR
    -- fiscal_number: from DECIMAL to VARCHAR
    -- fiscal_year_variant: from DECIMAL to VARCHAR
    -- form_of_address: from DECIMAL to VARCHAR
    -- icms_tax_liable: from DECIMAL to VARCHAR
    -- icms_taxpayer: from DECIMAL to VARCHAR
    -- industry_category: from DECIMAL to VARCHAR
    -- industry_sector_1: from DECIMAL to VARCHAR
    -- industry_sector_2: from DECIMAL to VARCHAR
    -- industry_sector_3: from DECIMAL to VARCHAR
    -- industry_type: from DECIMAL to VARCHAR
    -- ipi_tax_liable: from DECIMAL to VARCHAR
    -- katr1: from DECIMAL to VARCHAR
    -- katr2: from DECIMAL to VARCHAR
    -- katr3: from DECIMAL to VARCHAR
    -- katr4: from DECIMAL to VARCHAR
    -- katr5: from DECIMAL to VARCHAR
    -- kdkg1: from DECIMAL to VARCHAR
    -- kdkg2: from DECIMAL to VARCHAR
    -- kdkg3: from DECIMAL to VARCHAR
    -- language_key: from INT to VARCHAR
    -- legal_form: from DECIMAL to VARCHAR
    -- location_code: from DECIMAL to VARCHAR
    -- military_verification: from DECIMAL to VARCHAR
    -- name_2: from DECIMAL to VARCHAR
    -- name_3: from DECIMAL to VARCHAR
    -- natural_person_indicator: from DECIMAL to VARCHAR
    -- nielsen_code: from DECIMAL to VARCHAR
    -- one_time_customer: from DECIMAL to VARCHAR
    -- one_time_customer_flag: from DECIMAL to VARCHAR
    -- order_block: from DECIMAL to VARCHAR
    -- order_probability: from DECIMAL to VARCHAR
    -- payment_block: from DECIMAL to VARCHAR
    -- payment_instruction_key: from DECIMAL to VARCHAR
    -- payment_instruction_key_alt: from DECIMAL to VARCHAR
    -- payment_office: from DECIMAL to VARCHAR
    -- plant_code: from DECIMAL to VARCHAR
    -- po_box: from DECIMAL to VARCHAR
    -- po_box_city: from DECIMAL to VARCHAR
    -- pson1: from DECIMAL to VARCHAR
    -- pson2: from DECIMAL to VARCHAR
    -- pson3: from DECIMAL to VARCHAR
    -- psoo1: from DECIMAL to VARCHAR
    -- psoo2: from DECIMAL to VARCHAR
    -- psoo3: from DECIMAL to VARCHAR
    -- psoo4: from DECIMAL to VARCHAR
    -- psoo5: from DECIMAL to VARCHAR
    -- purchasing_account: from DECIMAL to VARCHAR
    -- railway_station_code: from DECIMAL to VARCHAR
    -- railway_station_name: from DECIMAL to VARCHAR
    -- reference_number: from DECIMAL to VARCHAR
    -- region: from INT to VARCHAR
    -- registration_number: from DECIMAL to VARCHAR
    -- representative_name: from DECIMAL to VARCHAR
    -- risk_category: from DECIMAL to VARCHAR
    -- risk_profile_marker: from DECIMAL to VARCHAR
    -- sales_district: from DECIMAL to VARCHAR
    -- sales_group: from DECIMAL to VARCHAR
    -- sales_office: from DECIMAL to VARCHAR
    -- sales_tax_liability: from DECIMAL to VARCHAR
    -- search_term_2: from DECIMAL to VARCHAR
    -- state_code: from DECIMAL to VARCHAR
    -- statement_confirmation: from DECIMAL to VARCHAR
    -- statistical_group: from DECIMAL to VARCHAR
    -- statistics_group: from DECIMAL to VARCHAR
    -- suframa_number: from DECIMAL to VARCHAR
    -- tax_jurisdiction: from DECIMAL to VARCHAR
    -- tax_law_classification: from DECIMAL to VARCHAR
    -- tax_law_indicator: from DECIMAL to VARCHAR
    -- tax_number_1: from DECIMAL to VARCHAR
    -- tax_number_2: from DECIMAL to VARCHAR
    -- tax_number_3: from DECIMAL to VARCHAR
    -- tax_number_4: from DECIMAL to VARCHAR
    -- tax_number_5: from DECIMAL to VARCHAR
    -- tax_number_type: from DECIMAL to VARCHAR
    -- tax_type: from DECIMAL to VARCHAR
    -- telebox_number: from DECIMAL to VARCHAR
    -- telephone_1: from DECIMAL to VARCHAR
    -- telephone_2: from DECIMAL to VARCHAR
    -- teletex_number: from DECIMAL to VARCHAR
    -- telex_number: from DECIMAL to VARCHAR
    -- transit_time: from DECIMAL to INT
    -- transportation_zone: from DECIMAL to VARCHAR
    -- vat_number: from DECIMAL to VARCHAR
    -- vendor_lockout_status: from DECIMAL to VARCHAR
    -- vendor_master_block: from DECIMAL to VARCHAR
    -- vendor_number: from DECIMAL to VARCHAR
    -- vso_data_point: from DECIMAL to VARCHAR
    -- vso_material_palette: from DECIMAL to VARCHAR
    -- vso_one_material_flag: from DECIMAL to VARCHAR
    -- vso_one_sort_flag: from DECIMAL to VARCHAR
    -- vso_packing_material: from DECIMAL to VARCHAR
    -- vso_palette_unit_load: from DECIMAL to VARCHAR
    -- xlso_client_id: from DECIMAL to VARCHAR
    -- xlso_customer_id: from DECIMAL to VARCHAR
    -- xlso_partner_id: from DECIMAL to VARCHAR
    -- xlso_preferred_payment: from DECIMAL to VARCHAR
    -- xlso_system_id: from DECIMAL to VARCHAR
    SELECT
        "customer_number",
        "country_code",
        "name_1",
        "city",
        "postal_code",
        "customer_id",
        "street_address",
        "search_term_1",
        "search_term_3",
        "international_location_number",
        "location_number_check_digit",
        "creator_name",
        "credit_limit",
        "ktokd",
        "monthly_turnover",
        "last_turnover_year",
        "last_dunning_number",
        "last_dunning_year",
        "annual_turnover",
        "due_diligence_flag",
        "dunning_level",
        "civil_servant",
        "subtotal_per_delivery",
        "last_update_date",
        "last_update_time",
        "registration_date",
        "risk_classification",
        "registration_number_date",
        "legal_nature",
        "vso_palette_height",
        "vso_integer_value",
        "vso_unload_side",
        "vso_loading_preference",
        "row_id",
        "is_deleted",
        CAST("account_code" AS VARCHAR) AS "account_code",
        CAST("account_number_prefix" AS VARCHAR) AS "account_number_prefix",
        CAST("address_form_1" AS VARCHAR) AS "address_form_1",
        CAST("address_form_2" AS VARCHAR) AS "address_form_2",
        CAST("address_form_3" AS VARCHAR) AS "address_form_3",
        CAST("address_form_4" AS VARCHAR) AS "address_form_4",
        CAST("address_form_5" AS VARCHAR) AS "address_form_5",
        CAST("address_form_6" AS VARCHAR) AS "address_form_6",
        CAST("address_number" AS VARCHAR) AS "address_number",
        CAST("alcohol_info" AS VARCHAR) AS "alcohol_info",
        CAST("alternative_payee_allowed" AS VARCHAR) AS "alternative_payee_allowed",
        CAST("authorization_group" AS VARCHAR) AS "authorization_group",
        CAST("blocking_indicator" AS VARCHAR) AS "blocking_indicator",
        CAST("branch_code_4" AS VARCHAR) AS "branch_code_4",
        CAST("branch_code_5" AS VARCHAR) AS "branch_code_5",
        CAST("business_region_school" AS VARCHAR) AS "business_region_school",
        CAST("business_type" AS VARCHAR) AS "business_type",
        CAST("central_address_service" AS VARCHAR) AS "central_address_service",
        CAST("central_billing_block" AS VARCHAR) AS "central_billing_block",
        CAST("central_blocking_indicator" AS VARCHAR) AS "central_blocking_indicator",
        CAST("central_payment_allowed" AS VARCHAR) AS "central_payment_allowed",
        CAST("certificate_registration_number" AS VARCHAR) AS "certificate_registration_number",
        CAST("city_code" AS VARCHAR) AS "city_code",
        CAST("client" AS VARCHAR) AS "client",
        CAST("communication_preference" AS VARCHAR) AS "communication_preference",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST("company_code_indicator" AS VARCHAR) AS "company_code_indicator",
        CAST("company_size" AS VARCHAR) AS "company_size",
        CAST("county_code" AS VARCHAR) AS "county_code",
        strptime(CAST("creation_date" AS VARCHAR), '%Y%m%d') AS "creation_date",
        CAST("currency_code" AS VARCHAR) AS "currency_code",
        CAST("custom_attribute_1" AS VARCHAR) AS "custom_attribute_1",
        CAST("custom_attribute_2" AS VARCHAR) AS "custom_attribute_2",
        CAST("customer_account_group" AS VARCHAR) AS "customer_account_group",
        CAST("customer_attribute_10" AS VARCHAR) AS "customer_attribute_10",
        CAST("customer_attribute_6" AS VARCHAR) AS "customer_attribute_6",
        CAST("customer_attribute_7" AS VARCHAR) AS "customer_attribute_7",
        CAST("customer_category" AS VARCHAR) AS "customer_category",
        CAST("customer_class" AS VARCHAR) AS "customer_class",
        CAST("customer_classification" AS VARCHAR) AS "customer_classification",
        CAST("customer_classification_1" AS VARCHAR) AS "customer_classification_1",
        CAST("customer_classification_2" AS VARCHAR) AS "customer_classification_2",
        CAST("customer_classification_3" AS VARCHAR) AS "customer_classification_3",
        CAST("customer_classification_4" AS VARCHAR) AS "customer_classification_4",
        CAST("customer_group_4" AS VARCHAR) AS "customer_group_4",
        CAST("customer_group_5" AS VARCHAR) AS "customer_group_5",
        CAST("customer_name_4" AS VARCHAR) AS "customer_name_4",
        CAST("customer_url" AS VARCHAR) AS "customer_url",
        CAST("data_line_type" AS VARCHAR) AS "data_line_type",
        CAST("data_medium_exchange" AS VARCHAR) AS "data_medium_exchange",
        CAST("data_transfer_method" AS VARCHAR) AS "data_transfer_method",
        CAST("decree_registration_code" AS VARCHAR) AS "decree_registration_code",
        CAST("deletion_flag" AS VARCHAR) AS "deletion_flag",
        CAST("delivery_block" AS VARCHAR) AS "delivery_block",
        CAST("delivery_priority" AS VARCHAR) AS "delivery_priority",
        CAST("district" AS VARCHAR) AS "district",
        CAST("district_code" AS VARCHAR) AS "district_code",
        CAST("duns" AS VARCHAR) AS "duns",
        CAST("duns4" AS VARCHAR) AS "duns4",
        CAST("economic_activity_classification" AS VARCHAR) AS "economic_activity_classification",
        CAST("export_indicator" AS VARCHAR) AS "export_indicator",
        CAST("fax_number" AS VARCHAR) AS "fax_number",
        CAST("fee_schedule" AS VARCHAR) AS "fee_schedule",
        CAST("financial_operations_center" AS VARCHAR) AS "financial_operations_center",
        CAST("fiscal_number" AS VARCHAR) AS "fiscal_number",
        CAST("fiscal_year_variant" AS VARCHAR) AS "fiscal_year_variant",
        CAST("form_of_address" AS VARCHAR) AS "form_of_address",
        CAST("icms_tax_liable" AS VARCHAR) AS "icms_tax_liable",
        CAST("icms_taxpayer" AS VARCHAR) AS "icms_taxpayer",
        CAST("industry_category" AS VARCHAR) AS "industry_category",
        CAST("industry_sector_1" AS VARCHAR) AS "industry_sector_1",
        CAST("industry_sector_2" AS VARCHAR) AS "industry_sector_2",
        CAST("industry_sector_3" AS VARCHAR) AS "industry_sector_3",
        CAST("industry_type" AS VARCHAR) AS "industry_type",
        CAST("ipi_tax_liable" AS VARCHAR) AS "ipi_tax_liable",
        CAST("katr1" AS VARCHAR) AS "katr1",
        CAST("katr2" AS VARCHAR) AS "katr2",
        CAST("katr3" AS VARCHAR) AS "katr3",
        CAST("katr4" AS VARCHAR) AS "katr4",
        CAST("katr5" AS VARCHAR) AS "katr5",
        CAST("kdkg1" AS VARCHAR) AS "kdkg1",
        CAST("kdkg2" AS VARCHAR) AS "kdkg2",
        CAST("kdkg3" AS VARCHAR) AS "kdkg3",
        CAST("language_key" AS VARCHAR) AS "language_key",
        CAST("legal_form" AS VARCHAR) AS "legal_form",
        CAST("location_code" AS VARCHAR) AS "location_code",
        CAST("military_verification" AS VARCHAR) AS "military_verification",
        CAST("name_2" AS VARCHAR) AS "name_2",
        CAST("name_3" AS VARCHAR) AS "name_3",
        CAST("natural_person_indicator" AS VARCHAR) AS "natural_person_indicator",
        CAST("nielsen_code" AS VARCHAR) AS "nielsen_code",
        CAST("one_time_customer" AS VARCHAR) AS "one_time_customer",
        CAST("one_time_customer_flag" AS VARCHAR) AS "one_time_customer_flag",
        CAST("order_block" AS VARCHAR) AS "order_block",
        CAST("order_probability" AS VARCHAR) AS "order_probability",
        CAST("payment_block" AS VARCHAR) AS "payment_block",
        CAST("payment_instruction_key" AS VARCHAR) AS "payment_instruction_key",
        CAST("payment_instruction_key_alt" AS VARCHAR) AS "payment_instruction_key_alt",
        CAST("payment_office" AS VARCHAR) AS "payment_office",
        CAST("plant_code" AS VARCHAR) AS "plant_code",
        CAST("po_box" AS VARCHAR) AS "po_box",
        CAST("po_box_city" AS VARCHAR) AS "po_box_city",
        CAST("pson1" AS VARCHAR) AS "pson1",
        CAST("pson2" AS VARCHAR) AS "pson2",
        CAST("pson3" AS VARCHAR) AS "pson3",
        CAST("psoo1" AS VARCHAR) AS "psoo1",
        CAST("psoo2" AS VARCHAR) AS "psoo2",
        CAST("psoo3" AS VARCHAR) AS "psoo3",
        CAST("psoo4" AS VARCHAR) AS "psoo4",
        CAST("psoo5" AS VARCHAR) AS "psoo5",
        CAST("purchasing_account" AS VARCHAR) AS "purchasing_account",
        CAST("railway_station_code" AS VARCHAR) AS "railway_station_code",
        CAST("railway_station_name" AS VARCHAR) AS "railway_station_name",
        CAST("reference_number" AS VARCHAR) AS "reference_number",
        CAST("region" AS VARCHAR) AS "region",
        CAST("registration_number" AS VARCHAR) AS "registration_number",
        CAST("representative_name" AS VARCHAR) AS "representative_name",
        CAST("risk_category" AS VARCHAR) AS "risk_category",
        CAST("risk_profile_marker" AS VARCHAR) AS "risk_profile_marker",
        CAST("sales_district" AS VARCHAR) AS "sales_district",
        CAST("sales_group" AS VARCHAR) AS "sales_group",
        CAST("sales_office" AS VARCHAR) AS "sales_office",
        CAST("sales_tax_liability" AS VARCHAR) AS "sales_tax_liability",
        CAST("search_term_2" AS VARCHAR) AS "search_term_2",
        CAST("state_code" AS VARCHAR) AS "state_code",
        CAST("statement_confirmation" AS VARCHAR) AS "statement_confirmation",
        CAST("statistical_group" AS VARCHAR) AS "statistical_group",
        CAST("statistics_group" AS VARCHAR) AS "statistics_group",
        CAST("suframa_number" AS VARCHAR) AS "suframa_number",
        CAST("tax_jurisdiction" AS VARCHAR) AS "tax_jurisdiction",
        CAST("tax_law_classification" AS VARCHAR) AS "tax_law_classification",
        CAST("tax_law_indicator" AS VARCHAR) AS "tax_law_indicator",
        CAST("tax_number_1" AS VARCHAR) AS "tax_number_1",
        CAST("tax_number_2" AS VARCHAR) AS "tax_number_2",
        CAST("tax_number_3" AS VARCHAR) AS "tax_number_3",
        CAST("tax_number_4" AS VARCHAR) AS "tax_number_4",
        CAST("tax_number_5" AS VARCHAR) AS "tax_number_5",
        CAST("tax_number_type" AS VARCHAR) AS "tax_number_type",
        CAST("tax_type" AS VARCHAR) AS "tax_type",
        CAST("telebox_number" AS VARCHAR) AS "telebox_number",
        CAST("telephone_1" AS VARCHAR) AS "telephone_1",
        CAST("telephone_2" AS VARCHAR) AS "telephone_2",
        CAST("teletex_number" AS VARCHAR) AS "teletex_number",
        CAST("telex_number" AS VARCHAR) AS "telex_number",
        CAST("transit_time" AS INT) AS "transit_time",
        CAST("transportation_zone" AS VARCHAR) AS "transportation_zone",
        CAST("vat_number" AS VARCHAR) AS "vat_number",
        CAST("vendor_lockout_status" AS VARCHAR) AS "vendor_lockout_status",
        CAST("vendor_master_block" AS VARCHAR) AS "vendor_master_block",
        CAST("vendor_number" AS VARCHAR) AS "vendor_number",
        CAST("vso_data_point" AS VARCHAR) AS "vso_data_point",
        CAST("vso_material_palette" AS VARCHAR) AS "vso_material_palette",
        CAST("vso_one_material_flag" AS VARCHAR) AS "vso_one_material_flag",
        CAST("vso_one_sort_flag" AS VARCHAR) AS "vso_one_sort_flag",
        CAST("vso_packing_material" AS VARCHAR) AS "vso_packing_material",
        CAST("vso_palette_unit_load" AS VARCHAR) AS "vso_palette_unit_load",
        CAST("xlso_client_id" AS VARCHAR) AS "xlso_client_id",
        CAST("xlso_customer_id" AS VARCHAR) AS "xlso_customer_id",
        CAST("xlso_partner_id" AS VARCHAR) AS "xlso_partner_id",
        CAST("xlso_preferred_payment" AS VARCHAR) AS "xlso_preferred_payment",
        CAST("xlso_system_id" AS VARCHAR) AS "xlso_system_id"
    FROM "sap_kna1_data_projected_renamed_cleaned"
),

"sap_kna1_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 135 columns with unacceptable missing values
    -- account_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- account_number_prefix has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- address_form_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- address_form_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- address_form_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- address_form_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- address_form_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- address_form_6 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- alternative_payee_allowed has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- authorization_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- blocking_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- branch_code_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- branch_code_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- central_address_service has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- central_blocking_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- city_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- communication_preference has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- county_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- credit_limit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- currency_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_attribute_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_attribute_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_account_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_attribute_10 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_attribute_6 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_attribute_7 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_class has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_classification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_classification_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_classification_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_classification_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_classification_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_group_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_group_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_name_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_url has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- data_line_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- data_medium_exchange has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- data_transfer_method has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- decree_registration_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- deletion_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- delivery_block has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- delivery_priority has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- district has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- district_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- duns has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- duns4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- economic_activity_classification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- export_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fax_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fee_schedule has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- financial_operations_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fiscal_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fiscal_year_variant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- form_of_address has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- icms_tax_liable has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- icms_taxpayer has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_sector_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_sector_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_sector_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ipi_tax_liable has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- katr1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- katr2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- katr3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- katr4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- katr5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kdkg1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kdkg2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kdkg3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- legal_form has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- location_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- military_verification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- name_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- name_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- natural_person_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- nielsen_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- order_block has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- order_probability has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payment_block has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payment_instruction_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payment_instruction_key_alt has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- payment_office has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- plant_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pson1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pson2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pson3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psoo1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psoo2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psoo3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psoo4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psoo5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- purchasing_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- registration_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- representative_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- risk_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- risk_profile_marker has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_district has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_office has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_tax_liability has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- search_term_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- state_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statement_confirmation has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statistical_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statistics_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- subtotal_per_delivery has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_jurisdiction has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_law_classification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_law_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_number_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- telebox_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- telephone_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- telephone_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- teletex_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- telex_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transit_time has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transportation_zone has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_lockout_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_master_block has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xlso_client_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xlso_customer_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xlso_partner_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xlso_preferred_payment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xlso_system_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "customer_number",
        "country_code",
        "name_1",
        "city",
        "postal_code",
        "customer_id",
        "street_address",
        "search_term_1",
        "search_term_3",
        "international_location_number",
        "location_number_check_digit",
        "creator_name",
        "ktokd",
        "monthly_turnover",
        "last_turnover_year",
        "last_dunning_number",
        "last_dunning_year",
        "annual_turnover",
        "due_diligence_flag",
        "dunning_level",
        "civil_servant",
        "last_update_date",
        "last_update_time",
        "registration_date",
        "risk_classification",
        "registration_number_date",
        "legal_nature",
        "vso_palette_height",
        "vso_integer_value",
        "vso_unload_side",
        "vso_loading_preference",
        "row_id",
        "is_deleted",
        "address_number",
        "alcohol_info",
        "business_region_school",
        "central_billing_block",
        "central_payment_allowed",
        "certificate_registration_number",
        "client",
        "company_code",
        "company_code_indicator",
        "company_size",
        "creation_date",
        "language_key",
        "one_time_customer",
        "one_time_customer_flag",
        "po_box",
        "po_box_city",
        "railway_station_code",
        "railway_station_name",
        "region",
        "suframa_number",
        "vat_number",
        "vendor_number",
        "vso_data_point",
        "vso_material_palette",
        "vso_one_material_flag",
        "vso_one_sort_flag",
        "vso_packing_material",
        "vso_palette_unit_load"
    FROM "sap_kna1_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_kna1_data_projected_renamed_cleaned_casted_missing_handled"
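
The missing-value step above drops every column that is 100.0 percent missing. The following is a minimal sketch of how such a decision could be reproduced outside the warehouse; it is not Cocoon's implementation. The helper names (`missing_percent`, `columns_to_drop`) and the sample rows are hypothetical, and only the column names come from the SQL above.

```python
from typing import Dict, List, Optional


def missing_percent(rows: List[Dict[str, Optional[object]]]) -> Dict[str, float]:
    """Return the percentage of rows in which each column is None."""
    if not rows:
        return {}
    total = len(rows)
    # Column order follows the first row (dicts preserve insertion order).
    return {
        col: 100.0 * sum(1 for row in rows if row.get(col) is None) / total
        for col in rows[0].keys()
    }


def columns_to_drop(rows: List[Dict[str, Optional[object]]],
                    threshold: float = 100.0) -> List[str]:
    """List columns whose missing percentage reaches the threshold."""
    return [col for col, pct in missing_percent(rows).items() if pct >= threshold]


# Hypothetical sample rows, invented for illustration only.
sample_rows = [
    {"customer_number": "0000001", "account_code": None, "vat_number": None},
    {"customer_number": "0000002", "account_code": None, "vat_number": "X1"},
]

print(missing_percent(sample_rows))
print(columns_to_drop(sample_rows))
```

With these invented rows, `account_code` is entirely missing and would be dropped, while `vat_number` is only partly missing and would be kept, which mirrors how `vat_number` survives into the final SELECT above.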

stg_sap_kna1_data.yml (Document the table)

version: 2
models:
- name: stg_sap_kna1_data
  description: Customer master data staged from the SAP KNA1 table. Each row represents
    one unique customer, with the customer number, name, address, country, and contact
    information, plus business-related information, tax details, and system-specific
    data used for managing customer relationships and transactions.
  columns:
  - name: customer_number
    description: Customer number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each customer. For
        this table, each row represents a distinct customer, and the customer_number
        appears to be unique across rows.
  - name: country_code
    description: Country code
    tests:
    - not_null
  - name: name_1
    description: Name 1
    tests:
    - not_null
  - name: city
    description: City
    tests:
    - not_null
  - name: postal_code
    description: Postal code
    tests:
    - not_null
  - name: customer_id
    description: Sort field or customer identifier
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be another unique identifier for each customer.
        For this table, each row represents a distinct customer, and the customer_id
        seems to be unique across rows.
  - name: street_address
    description: Street address
    tests:
    - not_null
  - name: search_term_1
    description: Search term 1
    tests:
    - not_null
  - name: search_term_3
    description: Search term 3
    tests:
    - not_null
  - name: international_location_number
    description: Customer's international location number
    tests:
    - not_null
  - name: location_number_check_digit
    description: Check digit for the customer's international location number
    tests:
    - not_null
  - name: creator_name
    description: Name of person who created the record
    tests:
    - not_null
  - name: ktokd
    description: Customer account group
    tests:
    - not_null
  - name: monthly_turnover
    description: Monthly turnover amount
    tests:
    - not_null
  - name: last_turnover_year
    description: Year of last turnover
    tests:
    - not_null
  - name: last_dunning_number
    description: Number of last dunning notice
    tests:
    - not_null
  - name: last_dunning_year
    description: Year of last dunning notice
    tests:
    - not_null
  - name: annual_turnover
    description: Annual turnover amount
    tests:
    - not_null
  - name: due_diligence_flag
    description: Due diligence flag
    tests:
    - not_null
    - accepted_values:
        values:
        - Y
        - N
        - X
  - name: dunning_level
    description: Dunning level
    tests:
    - not_null
  - name: civil_servant
    description: Civil servant indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - X
        - ' '
  - name: last_update_date
    description: Last update date
    tests:
    - not_null
  - name: last_update_time
    description: Last update time
    tests:
    - not_null
  - name: registration_date
    description: Date of registration or creation
    tests:
    - not_null
  - name: risk_classification
    description: Customer's risk classification
    tests:
    - not_null
  - name: registration_number_date
    description: Date of registration number issuance
    tests:
    - not_null
  - name: legal_nature
    description: Legal nature of the customer
    tests:
    - not_null
  - name: vso_palette_height
    description: VSO palette height
    tests:
    - not_null
  - name: vso_integer_value
    description: VSO-related integer value
    tests:
    - not_null
  - name: vso_unload_side
    description: VSO unload side
    tests:
    - not_null
  - name: vso_loading_preference
    description: VSO loading preference
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is a unique identifier for the row. For this table,
        each row represents a unique customer. row_id appears to be a sequential number
        that is unique for each row.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: address_number
    description: Customer's address number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the customer's address number. For this table,
        each row is for a unique customer, and address_number appears to identify
        that customer's address, so it is expected to be unique across rows.
  - name: alcohol_info
    description: Customer's alcohol-related information
    cocoon_meta:
      missing_acceptable: May not be applicable for non-alcohol related businesses.
  - name: business_region_school
    description: Business region school code
    cocoon_meta:
      missing_acceptable: May not apply to businesses that aren't schools.
  - name: central_billing_block
    description: Central billing block
    cocoon_meta:
      missing_acceptable: May only apply to businesses with multiple locations.
  - name: central_payment_allowed
    description: Central payment allowed indicator
    cocoon_meta:
      missing_acceptable: May only apply to businesses with multiple locations.
  - name: certificate_registration_number
    description: Certificate registration number
    cocoon_meta:
      missing_acceptable: May not be required for all types of businesses.
  - name: client
    description: Client
    tests:
    - not_null
    - accepted_values:
        values:
        - '800'
        - '888'
        - '877'
        - '866'
        - '855'
        - '844'
        - '833'
        - '822'
        - '880'
        - '881'
        - '882'
        - '883'
        - '884'
  - name: company_code
    description: Company code
    cocoon_meta:
      missing_acceptable: May not apply to individual or sole proprietor accounts.
  - name: company_code_indicator
    description: Company code indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - '0'
        - '1'
  - name: company_size
    description: Company size
    cocoon_meta:
      missing_acceptable: May not apply to individual or non-company accounts.
  - name: creation_date
    description: Date of record creation
    tests:
    - not_null
  - name: language_key
    description: Language key
    tests:
    - not_null
  - name: one_time_customer
    description: One-time customer indicator
    cocoon_meta:
      missing_acceptable: May not be applicable for regular customers
  - name: one_time_customer_flag
    description: One-time customer indicator
    cocoon_meta:
      missing_acceptable: May not be applicable for regular customers
  - name: po_box
    description: PO Box
    cocoon_meta:
      missing_acceptable: Not all addresses include a PO Box
  - name: po_box_city
    description: PO Box city
    cocoon_meta:
      missing_acceptable: Not applicable if no PO Box exists
  - name: railway_station_code
    description: Customer's railway station code
    cocoon_meta:
      missing_acceptable: Not all locations are near a railway station
  - name: railway_station_name
    description: Customer's railway station name
    cocoon_meta:
      missing_acceptable: Not all locations are near a railway station
  - name: region
    description: Region (state, province, county)
    tests:
    - not_null
  - name: suframa_number
    description: SUFRAMA number (Brazilian tax)
    cocoon_meta:
      missing_acceptable: Only applicable for businesses in Brazilian Free Trade Zone.
  - name: vat_number
    description: VAT registration number
    cocoon_meta:
      missing_acceptable: Only applicable in countries with VAT system.
  - name: vendor_number
    description: Vendor number
    cocoon_meta:
      missing_acceptable: May not be applicable if not a vendor relationship.
  - name: vso_data_point
    description: VSO-specific data point
    cocoon_meta:
      missing_acceptable: Only applicable for specific vendor-managed inventory systems.
  - name: vso_material_palette
    description: VSO material palette
    cocoon_meta:
      missing_acceptable: Only applicable for specific vendor-managed inventory systems.
  - name: vso_one_material_flag
    description: VSO one material flag
    cocoon_meta:
      missing_acceptable: Only applicable for specific vendor-managed inventory systems.
  - name: vso_one_sort_flag
    description: VSO one sort flag
    cocoon_meta:
      missing_acceptable: Only applicable for specific vendor-managed inventory systems.
  - name: vso_packing_material
    description: VSO packing material
    cocoon_meta:
      missing_acceptable: Only applicable for specific vendor-managed inventory systems.
  - name: vso_palette_unit_load
    description: VSO palette unit load
    cocoon_meta:
      missing_acceptable: Only applicable for specific vendor-managed inventory systems.
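
The YAML above attaches three kinds of tests to columns: `not_null`, `unique`, and `accepted_values`. dbt runs these as SQL against the warehouse; the sketch below only illustrates, in plain Python, what each test is looking for. The function names and the sample rows are hypothetical. Treating NULLs as ignored by the `unique` and `accepted_values` checks is an assumption about how the tests behave, so a separate `not_null` test is what catches missing values.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[str]]


def failing_not_null(rows: List[Row], column: str) -> List[Row]:
    """Rows where the column is missing."""
    return [row for row in rows if row.get(column) is None]


def failing_unique(rows: List[Row], column: str) -> List[str]:
    """Non-null values that occur more than once."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return [value for value, n in counts.items() if n > 1]


def failing_accepted_values(rows: List[Row], column: str,
                            values: Iterable[str]) -> List[Row]:
    """Rows whose non-null value is outside the allowed set."""
    allowed = set(values)
    return [
        row for row in rows
        if row.get(column) is not None and row[column] not in allowed
    ]


# Hypothetical rows, invented for illustration only.
example_rows: List[Row] = [
    {"customer_number": "A", "due_diligence_flag": "Y"},
    {"customer_number": "A", "due_diligence_flag": "Q"},
    {"customer_number": None, "due_diligence_flag": None},
]

print(failing_not_null(example_rows, "customer_number"))
print(failing_unique(example_rows, "customer_number"))
print(failing_accepted_values(example_rows, "due_diligence_flag", ["Y", "N", "X"]))
```

A test passes when its list of failures is empty. The accepted values `Y`, `N`, `X` are the ones declared for `due_diligence_flag` above.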

stg_sap_bkpf_data (first 100 rows)

gjahr blart posting_month last_change_date update_date usnam transaction_code stjah currency_code fixed_exchange_rate reciprocal_exchange_rate_indicator freight_charges business_transaction reference_type primary_currency currency_2 currency_3 exchange_rate_currency_2 exchange_rate_currency_3 withholding_tax_base_method extended_withholding_tax_base unknown_field_2 unknown_field_3 reverse_posting_date currency_type_2 currency_type_3_alt exchange_rate local_currency_exchange_rate page_count reindat vat_date primary_exchange_rate exchange_rate_2 exchange_rate_3 resubmission_date interest_calculation_date psobt psodt document_time offset_reference_date row_id is_deleted belnr bldat budat bukrs client_number credit_card_issuer credit_card_number currency_type currency_type_3 document_reference_key entry_date entry_time financial_management_area invoice_correction_indicator is_blind_document is_cash_allocation line_item_split_indicator reversal_reason reverse_document_number special_gl_indicator value_date
0 2006 sa 4 0 0 d002766 fb50 0 usd 0.0 0.0 0.0 rfbu bkpf usd eur usd -1.24 0.0 2 2 3 3 0 m m 0.0 0.0 0 0 0 0.0 0.0 0.0 0 0 0 0 0 0 1 False 200001076 2006-04-25 2006-04-25 3000 800 None None 30 40 10000107630002006 2006-04-25 11:28:23 3000 None None None None None None None 2006-04-25
1 2006 sa 4 0 0 d002766 fb50 0 usd 0.0 0.0 0.0 rfbu bkpf usd eur usd -1.24 0.0 2 2 3 3 0 m m 0.0 0.0 0 0 0 0.0 0.0 0.0 0 0 0 0 0 0 2 False 200001077 2006-04-26 2006-04-26 3000 800 None None 30 40 10000107730002006 2006-04-26 09:40:20 3000 None None None None None None None 2006-04-26
2 2006 sa 4 0 0 d002766 fb50 0 usd 0.0 0.0 0.0 rfbu bkpf usd eur usd -1.24 0.0 2 2 3 3 0 m m 0.0 0.0 0 0 0 0.0 0.0 0.0 0 0 0 0 0 0 3 False 200001078 2006-04-26 2006-04-26 3000 800 None None 30 40 10000107830002006 2006-04-26 09:41:35 3000 None None None None None None None 2006-04-26

stg_sap_bkpf_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 04:24:14.810339+00:00
WITH 
"sap_bkpf_data_projected" AS (
    -- Projection: Selecting 115 out of 116 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "belnr",
        "bukrs",
        "gjahr",
        "mandt",
        "blart",
        "bldat",
        "budat",
        "monat",
        "cpudt",
        "cputm",
        "aedat",
        "upddt",
        "wwert",
        "usnam",
        "tcode",
        "bvorg",
        "xblnr",
        "dbblg",
        "stblg",
        "stjah",
        "bktxt",
        "waers",
        "kursf",
        "kzwrs",
        "kzkrs",
        "bstat",
        "xnetb",
        "frath",
        "xrueb",
        "glvor",
        "grpid",
        "dokid",
        "arcid",
        "iblar",
        "awtyp",
        "awkey",
        "fikrs",
        "hwaer",
        "hwae2",
        "hwae3",
        "kurs2",
        "kurs3",
        "basw2",
        "basw3",
        "umrd2",
        "umrd3",
        "xstov",
        "stodt",
        "xmwst",
        "curt2",
        "curt3",
        "kuty2",
        "kuty3",
        "xsnet",
        "ausbk",
        "xusvr",
        "duefl",
        "awsys",
        "txkrs",
        "ctxkrs",
        "lotkz",
        "xwvof",
        "stgrd",
        "ppnam",
        "brnch",
        "numpg",
        "adisc",
        "xref1_hd",
        "xref2_hd",
        "xreversal",
        "reindat",
        "rldnr",
        "ldgrp",
        "propmano",
        "xblnr_alt",
        "vatdate",
        "doccat",
        "xsplit",
        "cash_alloc",
        "follow_on",
        "xreorg",
        "subset",
        "kurst",
        "kursx",
        "kur2x",
        "kur3x",
        "xmca",
        "resubmission",
        "_sapf15_status",
        "psoty",
        "psoak",
        "psoks",
        "psosg",
        "psofn",
        "intform",
        "intdate",
        "psobt",
        "psozl",
        "psodt",
        "psotm",
        "fm_umart",
        "ccins",
        "ccnum",
        "ssblk",
        "batch",
        "sname",
        "sampled",
        "exclude_flag",
        "blind",
        "offset_status",
        "offset_refer_dat",
        "penrc",
        "knumv",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_bkpf_data"
),

"sap_bkpf_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- mandt -> client_number
    -- monat -> posting_month
    -- cpudt -> entry_date
    -- cputm -> entry_time
    -- aedat -> last_change_date
    -- upddt -> update_date
    -- wwert -> value_date
    -- tcode -> transaction_code
    -- bvorg -> accounting_transaction
    -- xblnr -> reference_document_number
    -- stblg -> reverse_document_number
    -- bktxt -> document_header_text
    -- waers -> currency_code
    -- kursf -> fixed_exchange_rate
    -- kzwrs -> currency_indicator
    -- kzkrs -> reciprocal_exchange_rate_indicator
    -- bstat -> document_status
    -- xnetb -> net_amount
    -- frath -> freight_charges
    -- xrueb -> invoice_correction_indicator
    -- glvor -> business_transaction
    -- grpid -> group_id
    -- dokid -> document_id
    -- arcid -> archive_id
    -- iblar -> clearing_document_number
    -- awtyp -> reference_type
    -- awkey -> document_reference_key
    -- fikrs -> financial_management_area
    -- hwaer -> primary_currency
    -- hwae2 -> currency_2
    -- hwae3 -> currency_3
    -- kurs2 -> exchange_rate_currency_2
    -- kurs3 -> exchange_rate_currency_3
    -- basw2 -> withholding_tax_base_method
    -- basw3 -> extended_withholding_tax_base
    -- umrd2 -> unknown_field_2
    -- umrd3 -> unknown_field_3
    -- xstov -> other_period_reversal_indicator
    -- stodt -> reverse_posting_date
    -- xmwst -> tax_code
    -- curt2 -> currency_type
    -- curt3 -> currency_type_3
    -- kuty2 -> currency_type_2
    -- kuty3 -> currency_type_3_alt
    -- xsnet -> statistical_posting_indicator
    -- xusvr -> eu_sales_list_indicator
    -- duefl -> due_flag
    -- awsys -> origin_system
    -- txkrs -> exchange_rate
    -- ctxkrs -> local_currency_exchange_rate
    -- xwvof -> tax_determination_date_indicator
    -- stgrd -> reversal_reason
    -- ppnam -> parked_by
    -- brnch -> branch_number
    -- numpg -> page_count
    -- adisc -> additional_discount
    -- xref1_hd -> reference_key_1
    -- xref2_hd -> reference_key_2
    -- rldnr -> ledger
    -- ldgrp -> ledger_group
    -- propmano -> property_management_object
    -- xblnr_alt -> alternative_reference_number
    -- vatdate -> vat_date
    -- doccat -> document_category
    -- xsplit -> line_item_split_indicator
    -- cash_alloc -> is_cash_allocation
    -- follow_on -> follow_on_indicator
    -- xreorg -> reorganization_status
    -- kurst -> exchange_rate_type
    -- kursx -> primary_exchange_rate
    -- kur2x -> exchange_rate_2
    -- kur3x -> exchange_rate_3
    -- xmca -> special_gl_indicator
    -- resubmission -> resubmission_date
    -- _sapf15_status -> sap_f15_status
    -- psoty -> transaction_type
    -- psoak -> accounting_clerk
    -- psosg -> posting_key
    -- intform -> interest_calculation_form
    -- intdate -> interest_calculation_date
    -- psozl -> line_item_number
    -- psotm -> document_time
    -- fm_umart -> funds_management_type
    -- ccins -> credit_card_issuer
    -- ccnum -> credit_card_number
    -- ssblk -> blocking_reason
    -- batch -> batch_number
    -- sampled -> sampled_indicator
    -- blind -> is_blind_document
    -- offset_refer_dat -> offset_reference_date
    -- penrc -> penalty_calculation_rule
    -- knumv -> condition_record_number
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "belnr",
        "bukrs",
        "gjahr",
        "mandt" AS "client_number",
        "blart",
        "bldat",
        "budat",
        "monat" AS "posting_month",
        "cpudt" AS "entry_date",
        "cputm" AS "entry_time",
        "aedat" AS "last_change_date",
        "upddt" AS "update_date",
        "wwert" AS "value_date",
        "usnam",
        "tcode" AS "transaction_code",
        "bvorg" AS "accounting_transaction",
        "xblnr" AS "reference_document_number",
        "dbblg",
        "stblg" AS "reverse_document_number",
        "stjah",
        "bktxt" AS "document_header_text",
        "waers" AS "currency_code",
        "kursf" AS "fixed_exchange_rate",
        "kzwrs" AS "currency_indicator",
        "kzkrs" AS "reciprocal_exchange_rate_indicator",
        "bstat" AS "document_status",
        "xnetb" AS "net_amount",
        "frath" AS "freight_charges",
        "xrueb" AS "invoice_correction_indicator",
        "glvor" AS "business_transaction",
        "grpid" AS "group_id",
        "dokid" AS "document_id",
        "arcid" AS "archive_id",
        "iblar" AS "clearing_document_number",
        "awtyp" AS "reference_type",
        "awkey" AS "document_reference_key",
        "fikrs" AS "financial_management_area",
        "hwaer" AS "primary_currency",
        "hwae2" AS "currency_2",
        "hwae3" AS "currency_3",
        "kurs2" AS "exchange_rate_currency_2",
        "kurs3" AS "exchange_rate_currency_3",
        "basw2" AS "withholding_tax_base_method",
        "basw3" AS "extended_withholding_tax_base",
        "umrd2" AS "unknown_field_2",
        "umrd3" AS "unknown_field_3",
        "xstov" AS "other_period_reversal_indicator",
        "stodt" AS "reverse_posting_date",
        "xmwst" AS "tax_code",
        "curt2" AS "currency_type",
        "curt3" AS "currency_type_3",
        "kuty2" AS "currency_type_2",
        "kuty3" AS "currency_type_3_alt",
        "xsnet" AS "statistical_posting_indicator",
        "ausbk",
        "xusvr" AS "eu_sales_list_indicator",
        "duefl" AS "due_flag",
        "awsys" AS "origin_system",
        "txkrs" AS "exchange_rate",
        "ctxkrs" AS "local_currency_exchange_rate",
        "lotkz",
        "xwvof" AS "tax_determination_date_indicator",
        "stgrd" AS "reversal_reason",
        "ppnam" AS "parked_by",
        "brnch" AS "branch_number",
        "numpg" AS "page_count",
        "adisc" AS "additional_discount",
        "xref1_hd" AS "reference_key_1",
        "xref2_hd" AS "reference_key_2",
        "xreversal",
        "reindat",
        "rldnr" AS "ledger",
        "ldgrp" AS "ledger_group",
        "propmano" AS "property_management_object",
        "xblnr_alt" AS "alternative_reference_number",
        "vatdate" AS "vat_date",
        "doccat" AS "document_category",
        "xsplit" AS "line_item_split_indicator",
        "cash_alloc" AS "is_cash_allocation",
        "follow_on" AS "follow_on_indicator",
        "xreorg" AS "reorganization_status",
        "subset",
        "kurst" AS "exchange_rate_type",
        "kursx" AS "primary_exchange_rate",
        "kur2x" AS "exchange_rate_2",
        "kur3x" AS "exchange_rate_3",
        "xmca" AS "special_gl_indicator",
        "resubmission" AS "resubmission_date",
        "_sapf15_status" AS "sap_f15_status",
        "psoty" AS "transaction_type",
        "psoak" AS "accounting_clerk",
        "psoks",
        "psosg" AS "posting_key",
        "psofn",
        "intform" AS "interest_calculation_form",
        "intdate" AS "interest_calculation_date",
        "psobt",
        "psozl" AS "line_item_number",
        "psodt",
        "psotm" AS "document_time",
        "fm_umart" AS "funds_management_type",
        "ccins" AS "credit_card_issuer",
        "ccnum" AS "credit_card_number",
        "ssblk" AS "blocking_reason",
        "batch" AS "batch_number",
        "sname",
        "sampled" AS "sampled_indicator",
        "exclude_flag",
        "blind" AS "is_blind_document",
        "offset_status",
        "offset_refer_dat" AS "offset_reference_date",
        "penrc" AS "penalty_calculation_rule",
        "knumv" AS "condition_record_number",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_bkpf_data_projected"
),

"sap_bkpf_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values:
    -- blart: 'sa' is the only value in the column. It is most likely the SAP document type SA (G/L account document) stored in lowercase, so it is kept unchanged.
    -- currency_type_2: 'm' is the only value and is not an ISO 4217 currency code. The source field kuty2 holds an exchange rate type in SAP, where M usually denotes the average rate, so the value may well be valid. Confirm against the source system before remapping it.
    -- currency_type_3_alt: 'm' is the only value here as well (source field kuty3). It is kept unchanged pending the same confirmation.
    SELECT
        "belnr",
        "bukrs",
        "gjahr",
        "client_number",
        "blart",
        "bldat",
        "budat",
        "posting_month",
        "entry_date",
        "entry_time",
        "last_change_date",
        "update_date",
        "value_date",
        "usnam",
        "transaction_code",
        "accounting_transaction",
        "reference_document_number",
        "dbblg",
        "reverse_document_number",
        "stjah",
        "document_header_text",
        "currency_code",
        "fixed_exchange_rate",
        "currency_indicator",
        "reciprocal_exchange_rate_indicator",
        "document_status",
        "net_amount",
        "freight_charges",
        "invoice_correction_indicator",
        "business_transaction",
        "group_id",
        "document_id",
        "archive_id",
        "clearing_document_number",
        "reference_type",
        "document_reference_key",
        "financial_management_area",
        "primary_currency",
        "currency_2",
        "currency_3",
        "exchange_rate_currency_2",
        "exchange_rate_currency_3",
        "withholding_tax_base_method",
        "extended_withholding_tax_base",
        "unknown_field_2",
        "unknown_field_3",
        "other_period_reversal_indicator",
        "reverse_posting_date",
        "tax_code",
        "currency_type",
        "currency_type_3",
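        -- 'm' is passed through unchanged. The generated CASE compared against the
        -- over-escaped literal '''m''' (the three-character string 'm', quotes included),
        -- so it never matched and the value was never remapped.
        "currency_type_2",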
        "currency_type_3_alt",
        "statistical_posting_indicator",
        "ausbk",
        "eu_sales_list_indicator",
        "due_flag",
        "origin_system",
        "exchange_rate",
        "local_currency_exchange_rate",
        "lotkz",
        "tax_determination_date_indicator",
        "reversal_reason",
        "parked_by",
        "branch_number",
        "page_count",
        "additional_discount",
        "reference_key_1",
        "reference_key_2",
        "xreversal",
        "reindat",
        "ledger",
        "ledger_group",
        "property_management_object",
        "alternative_reference_number",
        "vat_date",
        "document_category",
        "line_item_split_indicator",
        "is_cash_allocation",
        "follow_on_indicator",
        "reorganization_status",
        "subset",
        "exchange_rate_type",
        "primary_exchange_rate",
        "exchange_rate_2",
        "exchange_rate_3",
        "special_gl_indicator",
        "resubmission_date",
        "sap_f15_status",
        "transaction_type",
        "accounting_clerk",
        "psoks",
        "posting_key",
        "psofn",
        "interest_calculation_form",
        "interest_calculation_date",
        "psobt",
        "line_item_number",
        "psodt",
        "document_time",
        "funds_management_type",
        "credit_card_issuer",
        "credit_card_number",
        "blocking_reason",
        "batch_number",
        "sname",
        "sampled_indicator",
        "exclude_flag",
        "is_blind_document",
        "offset_status",
        "offset_reference_date",
        "penalty_calculation_rule",
        "condition_record_number",
        "row_id",
        "is_deleted"
    FROM "sap_bkpf_data_projected_renamed"
),

"sap_bkpf_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- accounting_clerk: from DECIMAL to VARCHAR
    -- accounting_transaction: from DECIMAL to VARCHAR
    -- additional_discount: from DECIMAL to VARCHAR
    -- alternative_reference_number: from DECIMAL to VARCHAR
    -- archive_id: from DECIMAL to VARCHAR
    -- ausbk: from DECIMAL to VARCHAR
    -- batch_number: from DECIMAL to VARCHAR
    -- belnr: from INT to VARCHAR
    -- bldat: from INT to DATE
    -- blocking_reason: from DECIMAL to VARCHAR
    -- branch_number: from DECIMAL to VARCHAR
    -- budat: from INT to DATE
    -- bukrs: from INT to VARCHAR
    -- clearing_document_number: from DECIMAL to VARCHAR
    -- client_number: from INT to VARCHAR
    -- condition_record_number: from DECIMAL to VARCHAR
    -- credit_card_issuer: from DECIMAL to VARCHAR
    -- credit_card_number: from DECIMAL to VARCHAR
    -- currency_indicator: from DECIMAL to VARCHAR
    -- currency_type: from INT to VARCHAR
    -- currency_type_3: from INT to VARCHAR
    -- dbblg: from DECIMAL to VARCHAR
    -- document_category: from DECIMAL to VARCHAR
    -- document_header_text: from DECIMAL to VARCHAR
    -- document_id: from DECIMAL to VARCHAR
    -- document_reference_key: from INT to VARCHAR
    -- document_status: from DECIMAL to VARCHAR
    -- due_flag: from DECIMAL to VARCHAR
    -- entry_date: from INT to DATE
    -- entry_time: from INT to TIME
    -- eu_sales_list_indicator: from DECIMAL to VARCHAR
    -- exchange_rate_type: from DECIMAL to VARCHAR
    -- exclude_flag: from DECIMAL to VARCHAR
    -- financial_management_area: from INT to VARCHAR
    -- follow_on_indicator: from DECIMAL to VARCHAR
    -- funds_management_type: from DECIMAL to VARCHAR
    -- group_id: from DECIMAL to VARCHAR
    -- interest_calculation_form: from DECIMAL to VARCHAR
    -- invoice_correction_indicator: from DECIMAL to VARCHAR
    -- is_blind_document: from DECIMAL to VARCHAR
    -- is_cash_allocation: from DECIMAL to VARCHAR
    -- ledger: from DECIMAL to VARCHAR
    -- ledger_group: from DECIMAL to VARCHAR
    -- line_item_number: from DECIMAL to VARCHAR
    -- line_item_split_indicator: from DECIMAL to VARCHAR
    -- lotkz: from DECIMAL to VARCHAR
    -- net_amount: from DECIMAL to VARCHAR
    -- offset_status: from DECIMAL to VARCHAR
    -- origin_system: from DECIMAL to VARCHAR
    -- other_period_reversal_indicator: from DECIMAL to VARCHAR
    -- parked_by: from DECIMAL to VARCHAR
    -- penalty_calculation_rule: from DECIMAL to VARCHAR
    -- posting_key: from DECIMAL to VARCHAR
    -- property_management_object: from DECIMAL to VARCHAR
    -- psofn: from DECIMAL to VARCHAR
    -- psoks: from DECIMAL to VARCHAR
    -- reference_document_number: from DECIMAL to VARCHAR
    -- reference_key_1: from DECIMAL to VARCHAR
    -- reference_key_2: from DECIMAL to VARCHAR
    -- reorganization_status: from DECIMAL to VARCHAR
    -- reversal_reason: from DECIMAL to VARCHAR
    -- reverse_document_number: from DECIMAL to VARCHAR
    -- sampled_indicator: from DECIMAL to VARCHAR
    -- sap_f15_status: from DECIMAL to VARCHAR
    -- sname: from DECIMAL to VARCHAR
    -- special_gl_indicator: from DECIMAL to VARCHAR
    -- statistical_posting_indicator: from DECIMAL to VARCHAR
    -- subset: from DECIMAL to VARCHAR
    -- tax_code: from DECIMAL to VARCHAR
    -- tax_determination_date_indicator: from DECIMAL to VARCHAR
    -- transaction_type: from DECIMAL to VARCHAR
    -- value_date: from INT to DATE
    -- xreversal: from DECIMAL to VARCHAR
    SELECT
        "gjahr",
        "blart",
        "posting_month",
        "last_change_date",
        "update_date",
        "usnam",
        "transaction_code",
        "stjah",
        "currency_code",
        "fixed_exchange_rate",
        "reciprocal_exchange_rate_indicator",
        "freight_charges",
        "business_transaction",
        "reference_type",
        "primary_currency",
        "currency_2",
        "currency_3",
        "exchange_rate_currency_2",
        "exchange_rate_currency_3",
        "withholding_tax_base_method",
        "extended_withholding_tax_base",
        "unknown_field_2",
        "unknown_field_3",
        "reverse_posting_date",
        "currency_type_2",
        "currency_type_3_alt",
        "exchange_rate",
        "local_currency_exchange_rate",
        "page_count",
        "reindat",
        "vat_date",
        "primary_exchange_rate",
        "exchange_rate_2",
        "exchange_rate_3",
        "resubmission_date",
        "interest_calculation_date",
        "psobt",
        "psodt",
        "document_time",
        "offset_reference_date",
        "row_id",
        "is_deleted",
        CAST("accounting_clerk" AS VARCHAR) AS "accounting_clerk",
        CAST("accounting_transaction" AS VARCHAR) AS "accounting_transaction",
        CAST("additional_discount" AS VARCHAR) AS "additional_discount",
        CAST("alternative_reference_number" AS VARCHAR) AS "alternative_reference_number",
        CAST("archive_id" AS VARCHAR) AS "archive_id",
        CAST("ausbk" AS VARCHAR) AS "ausbk",
        CAST("batch_number" AS VARCHAR) AS "batch_number",
        CAST("belnr" AS VARCHAR) AS "belnr",
        CAST(strptime(CAST("bldat" AS VARCHAR), '%Y%m%d') AS DATE) AS "bldat",
        CAST("blocking_reason" AS VARCHAR) AS "blocking_reason",
        CAST("branch_number" AS VARCHAR) AS "branch_number",
        CAST(strptime(CAST("budat" AS VARCHAR), '%Y%m%d') AS DATE) AS "budat",
        CAST("bukrs" AS VARCHAR) AS "bukrs",
        CAST("clearing_document_number" AS VARCHAR) AS "clearing_document_number",
        CAST("client_number" AS VARCHAR) AS "client_number",
        CAST("condition_record_number" AS VARCHAR) AS "condition_record_number",
        CAST("credit_card_issuer" AS VARCHAR) AS "credit_card_issuer",
        CAST("credit_card_number" AS VARCHAR) AS "credit_card_number",
        CAST("currency_indicator" AS VARCHAR) AS "currency_indicator",
        CAST("currency_type" AS VARCHAR) AS "currency_type",
        CAST("currency_type_3" AS VARCHAR) AS "currency_type_3",
        CAST("dbblg" AS VARCHAR) AS "dbblg",
        CAST("document_category" AS VARCHAR) AS "document_category",
        CAST("document_header_text" AS VARCHAR) AS "document_header_text",
        CAST("document_id" AS VARCHAR) AS "document_id",
        CAST("document_reference_key" AS VARCHAR) AS "document_reference_key",
        CAST("document_status" AS VARCHAR) AS "document_status",
        CAST("due_flag" AS VARCHAR) AS "due_flag",
        CAST(strptime(CAST("entry_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "entry_date",
        CAST(
            strptime(
                LPAD(CAST("entry_time" AS VARCHAR), 6, '0'),
                '%H%M%S'
            ) AS TIME
        ) AS "entry_time",
        CAST("eu_sales_list_indicator" AS VARCHAR) AS "eu_sales_list_indicator",
        CAST("exchange_rate_type" AS VARCHAR) AS "exchange_rate_type",
        CAST("exclude_flag" AS VARCHAR) AS "exclude_flag",
        CAST("financial_management_area" AS VARCHAR) AS "financial_management_area",
        CAST("follow_on_indicator" AS VARCHAR) AS "follow_on_indicator",
        CAST("funds_management_type" AS VARCHAR) AS "funds_management_type",
        CAST("group_id" AS VARCHAR) AS "group_id",
        CAST("interest_calculation_form" AS VARCHAR) AS "interest_calculation_form",
        CAST("invoice_correction_indicator" AS VARCHAR) AS "invoice_correction_indicator",
        CAST("is_blind_document" AS VARCHAR) AS "is_blind_document",
        CAST("is_cash_allocation" AS VARCHAR) AS "is_cash_allocation",
        CAST("ledger" AS VARCHAR) AS "ledger",
        CAST("ledger_group" AS VARCHAR) AS "ledger_group",
        CAST("line_item_number" AS VARCHAR) AS "line_item_number",
        CAST("line_item_split_indicator" AS VARCHAR) AS "line_item_split_indicator",
        CAST("lotkz" AS VARCHAR) AS "lotkz",
        CAST("net_amount" AS VARCHAR) AS "net_amount",
        CAST("offset_status" AS VARCHAR) AS "offset_status",
        CAST("origin_system" AS VARCHAR) AS "origin_system",
        CAST("other_period_reversal_indicator" AS VARCHAR) AS "other_period_reversal_indicator",
        CAST("parked_by" AS VARCHAR) AS "parked_by",
        CAST("penalty_calculation_rule" AS VARCHAR) AS "penalty_calculation_rule",
        CAST("posting_key" AS VARCHAR) AS "posting_key",
        CAST("property_management_object" AS VARCHAR) AS "property_management_object",
        CAST("psofn" AS VARCHAR) AS "psofn",
        CAST("psoks" AS VARCHAR) AS "psoks",
        CAST("reference_document_number" AS VARCHAR) AS "reference_document_number",
        CAST("reference_key_1" AS VARCHAR) AS "reference_key_1",
        CAST("reference_key_2" AS VARCHAR) AS "reference_key_2",
        CAST("reorganization_status" AS VARCHAR) AS "reorganization_status",
        CAST("reversal_reason" AS VARCHAR) AS "reversal_reason",
        CAST("reverse_document_number" AS VARCHAR) AS "reverse_document_number",
        CAST("sampled_indicator" AS VARCHAR) AS "sampled_indicator",
        CAST("sap_f15_status" AS VARCHAR) AS "sap_f15_status",
        CAST("sname" AS VARCHAR) AS "sname",
        CAST("special_gl_indicator" AS VARCHAR) AS "special_gl_indicator",
        CAST("statistical_posting_indicator" AS VARCHAR) AS "statistical_posting_indicator",
        CAST("subset" AS VARCHAR) AS "subset",
        CAST("tax_code" AS VARCHAR) AS "tax_code",
        CAST("tax_determination_date_indicator" AS VARCHAR) AS "tax_determination_date_indicator",
        CAST("transaction_type" AS VARCHAR) AS "transaction_type",
        CAST(strptime(CAST("value_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "value_date",
        CAST("xreversal" AS VARCHAR) AS "xreversal"
    FROM "sap_bkpf_data_projected_renamed_cleaned"
),

"sap_bkpf_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 52 columns with unacceptable missing values
    -- accounting_clerk has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- accounting_transaction has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- additional_discount has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- alternative_reference_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- archive_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ausbk has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- batch_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- blocking_reason has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- branch_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- clearing_document_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- condition_record_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- currency_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- dbblg has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- document_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- document_header_text has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- document_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- document_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- due_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- eu_sales_list_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- exchange_rate_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- exclude_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- follow_on_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- funds_management_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- interest_calculation_form has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ledger has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ledger_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- line_item_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lotkz has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- net_amount has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- offset_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- origin_system has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- other_period_reversal_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- parked_by has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- penalty_calculation_rule has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- posting_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- property_management_object has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psofn has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psoks has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_document_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_key_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_key_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reorganization_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sampled_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sap_f15_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sname has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- statistical_posting_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- subset has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_determination_date_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transaction_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xreversal has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "gjahr",
        "blart",
        "posting_month",
        "last_change_date",
        "update_date",
        "usnam",
        "transaction_code",
        "stjah",
        "currency_code",
        "fixed_exchange_rate",
        "reciprocal_exchange_rate_indicator",
        "freight_charges",
        "business_transaction",
        "reference_type",
        "primary_currency",
        "currency_2",
        "currency_3",
        "exchange_rate_currency_2",
        "exchange_rate_currency_3",
        "withholding_tax_base_method",
        "extended_withholding_tax_base",
        "unknown_field_2",
        "unknown_field_3",
        "reverse_posting_date",
        "currency_type_2",
        "currency_type_3_alt",
        "exchange_rate",
        "local_currency_exchange_rate",
        "page_count",
        "reindat",
        "vat_date",
        "primary_exchange_rate",
        "exchange_rate_2",
        "exchange_rate_3",
        "resubmission_date",
        "interest_calculation_date",
        "psobt",
        "psodt",
        "document_time",
        "offset_reference_date",
        "row_id",
        "is_deleted",
        "belnr",
        "bldat",
        "budat",
        "bukrs",
        "client_number",
        "credit_card_issuer",
        "credit_card_number",
        "currency_type",
        "currency_type_3",
        "document_reference_key",
        "entry_date",
        "entry_time",
        "financial_management_area",
        "invoice_correction_indicator",
        "is_blind_document",
        "is_cash_allocation",
        "line_item_split_indicator",
        "reversal_reason",
        "reverse_document_number",
        "special_gl_indicator",
        "value_date"
    FROM "sap_bkpf_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_bkpf_data_projected_renamed_cleaned_casted_missing_handled"
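
The casting step above turns integer-encoded SAP dates (YYYYMMDD) and times (HHMMSS) into proper date and time values. The time conversion left-pads to six digits because storing the value as an integer drops leading zeros, so 94020 stands for 09:40:20. The same logic can be sketched in Python for checking individual values by hand. The helper names parse_sap_date and parse_sap_time are hypothetical and are not part of the generated model. As in the SQL, a placeholder value of 0 is not handled and would raise an error.

```python
from datetime import date, datetime, time


def parse_sap_date(value: int) -> date:
    """Parse an SAP date stored as an integer in YYYYMMDD form."""
    return datetime.strptime(str(value), "%Y%m%d").date()


def parse_sap_time(value: int) -> time:
    """Parse an SAP time stored as an integer in HHMMSS form.

    Integers lose leading zeros, so the digits are left-padded
    to six characters before parsing.
    """
    return datetime.strptime(str(value).zfill(6), "%H%M%S").time()


# Values taken from the second sample row (cpudt and cputm).
print(parse_sap_date(20060426))
print(parse_sap_time(94020))
```

Running the sketch on the sample values gives 2006-04-26 and 09:40:20, which matches the entry_date and entry_time shown for that row in the staged table.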

stg_sap_bkpf_data.yml (Document the table)

version: 2
models:
- name: stg_sap_bkpf_data
  description: SAP financial document headers (source table BKPF). Each row holds
    a document's number, company code, fiscal year, posting date, currency, and transaction
    code, together with financial attributes such as exchange rates, reversal information,
    and VAT-related fields. It also has system-related fields for processing and tracking
    purposes.
  columns:
  - name: gjahr
    description: Fiscal year
    tests:
    - not_null
  - name: blart
    description: Document type
    tests:
    - not_null
    - accepted_values:
        values:
        - sa
  - name: posting_month
    description: Posting period (month)
    tests:
    - not_null
  - name: last_change_date
    description: Date of last change
    tests:
    - not_null
  - name: update_date
    description: Update date
    tests:
    - not_null
  - name: usnam
    description: User name of the document creator
    tests:
    - not_null
  - name: transaction_code
    description: SAP transaction code
    tests:
    - not_null
  - name: stjah
    description: Fiscal year of the reversal document
    tests:
    - not_null
  - name: currency_code
    description: Currency code
    tests:
    - not_null
  - name: fixed_exchange_rate
    description: Exchange rate (fixed)
    tests:
    - not_null
  - name: reciprocal_exchange_rate_indicator
    description: Indicator for reciprocal exchange rate
    tests:
    - not_null
  - name: freight_charges
    description: Freight charges
    tests:
    - not_null
  - name: business_transaction
    description: Business transaction
    tests:
    - not_null
    - accepted_values:
        values:
        - rfbu
  - name: reference_type
    description: Reference type
    tests:
    - not_null
    - accepted_values:
        values:
        - bkpf
  - name: primary_currency
    description: Currency
    tests:
    - not_null
  - name: currency_2
    description: Currency 2
    tests:
    - not_null
    - accepted_values:
        values:
        - eur
        - usd
        - gbp
        - jpy
        - chf
        - cad
        - aud
        - nzd
        - cny
        - hkd
        - sgd
        - sek
        - nok
        - mxn
        - zar
        - inr
        - brl
        - rub
        - krw
        - try
  - name: currency_3
    description: Currency 3
    tests:
    - not_null
  - name: exchange_rate_currency_2
    description: Exchange rate for currency 2
    tests:
    - not_null
  - name: exchange_rate_currency_3
    description: Exchange rate for currency 3
    tests:
    - not_null
  - name: withholding_tax_base_method
    description: Base method for withholding tax
    tests:
    - not_null
  - name: extended_withholding_tax_base
    description: Extended withholding tax base
    tests:
    - not_null
  - name: unknown_field_2
    description: Unknown field 2
    tests:
    - not_null
  - name: unknown_field_3
    description: Unknown field 3
    tests:
    - not_null
  - name: reverse_posting_date
    description: Reverse posting date
    tests:
    - not_null
  - name: currency_type_2
    description: Currency type 2
    tests:
    - not_null
    - accepted_values:
        values:
        - m
  - name: currency_type_3_alt
    description: Currency type 3
    tests:
    - not_null
    - accepted_values:
        values:
        - m
  - name: exchange_rate
    description: Exchange rate
    tests:
    - not_null
  - name: local_currency_exchange_rate
    description: Exchange rate for local currency
    tests:
    - not_null
  - name: page_count
    description: Number of pages
    tests:
    - not_null
  - name: reindat
    description: Invoice receipt date
    tests:
    - not_null
  - name: vat_date
    description: VAT date
    tests:
    - not_null
  - name: primary_exchange_rate
    description: Exchange rate
    tests:
    - not_null
  - name: exchange_rate_2
    description: Exchange rate 2
    tests:
    - not_null
  - name: exchange_rate_3
    description: Exchange rate 3
    tests:
    - not_null
  - name: resubmission_date
    description: Resubmission date
    tests:
    - not_null
  - name: interest_calculation_date
    description: Interest calculation date
    tests:
    - not_null
  - name: psobt
    description: ''
    tests:
    - not_null
  - name: psodt
    description: ''
    tests:
    - not_null
  - name: document_time
    description: Document time
    tests:
    - not_null
  - name: offset_reference_date
    description: Offset reference date
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is a unique identifier for each row in the table. For
        this table, each row represents a SAP financial document header. The row_id
        appears to be unique across rows, incrementing by 1 for each new entry.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: belnr
    description: Accounting document number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a document number. For this table, each
        row represents a unique SAP financial document header. The belnr values are
        different for each row in the sample data, suggesting it could be unique for
        each document.
  - name: bldat
    description: Document date
    tests:
    - not_null
  - name: budat
    description: Posting date
    tests:
    - not_null
  - name: bukrs
    description: Company code
    tests:
    - not_null
  - name: client_number
    description: Client number
    tests:
    - not_null
  - name: credit_card_issuer
    description: Credit card issuer
    cocoon_meta:
      missing_acceptable: Not applicable if transaction doesn't involve credit card.
  - name: credit_card_number
    description: Credit card number
    cocoon_meta:
      missing_acceptable: Not applicable if transaction doesn't involve credit card.
  - name: currency_type
    description: Currency type
    tests:
    - not_null
    - accepted_values:
        values:
        - '30'
  - name: currency_type_3
    description: Currency type 3
    tests:
    - not_null
  - name: document_reference_key
    description: Reference key for SAP document
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column seems to be a composite key combining multiple fields
        (possibly including belnr, bukrs, and gjahr). For this table, each row represents
        a unique SAP financial document header. The document_reference_key values
        are different for each row in the sample data, suggesting it could be unique
        for each document.
  - name: entry_date
    description: Entry date
    tests:
    - not_null
  - name: entry_time
    description: Entry time
    tests:
    - not_null
  - name: financial_management_area
    description: Financial management area
    tests:
    - not_null
  - name: invoice_correction_indicator
    description: Invoice correction indicator
    cocoon_meta:
      missing_acceptable: Not applicable if the document is not an invoice correction
  - name: is_blind_document
    description: Blind document indicator
    cocoon_meta:
      missing_acceptable: Not applicable if the document is not a blind document
  - name: is_cash_allocation
    description: Cash allocation indicator
    cocoon_meta:
      missing_acceptable: Not applicable if the transaction doesn't involve cash allocation
  - name: line_item_split_indicator
    description: Line item split indicator
    cocoon_meta:
      missing_acceptable: Not applicable if the line item is not split
  - name: reversal_reason
    description: Reversal reason
    cocoon_meta:
      missing_acceptable: Not applicable if the transaction is not a reversal
  - name: reverse_document_number
    description: Reverse document number
    cocoon_meta:
      missing_acceptable: Not applicable if the transaction is not a reversal
  - name: special_gl_indicator
    description: Special G/L indicator
    cocoon_meta:
      missing_acceptable: Not applicable for standard general ledger transactions
  - name: value_date
    description: Value date
    tests:
    - not_null
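
The unique test on belnr above rests on the sampled rows only. In SAP, the key of BKPF is the client, company code, document number and fiscal year together, so the same belnr can legitimately recur under another company code or year. The following is a minimal sketch of how one might compare the two checks, using Python's built-in sqlite3 module and made-up rows; the helper name duplicate_keys, the in-memory table, and the third row are illustrative assumptions, not part of the generated model.

```python
import sqlite3


def duplicate_keys(conn, table, key_columns):
    """Return (key..., count) tuples for keys that occur more than once."""
    cols = ", ".join(f'"{c}"' for c in key_columns)
    sql = (
        f'SELECT {cols}, COUNT(*) AS n FROM "{table}" '
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stg_sap_bkpf_data "
    "(client_number TEXT, bukrs TEXT, belnr TEXT, gjahr TEXT)"
)
conn.executemany(
    "INSERT INTO stg_sap_bkpf_data VALUES (?, ?, ?, ?)",
    [
        ("800", "3000", "200001076", "2006"),
        ("800", "3000", "200001077", "2006"),
        # Hypothetical row: same document number under another company code.
        ("800", "1000", "200001076", "2006"),
    ],
)

# belnr alone collides, while the full composite key does not.
belnr_dupes = duplicate_keys(conn, "stg_sap_bkpf_data", ["belnr"])
composite_dupes = duplicate_keys(
    conn, "stg_sap_bkpf_data", ["client_number", "bukrs", "belnr", "gjahr"]
)
print(belnr_dupes)
print(composite_dupes)
```

If the full table behaves like the hypothetical third row, a uniqueness test over the combination of columns would be the more robust choice than a test on belnr alone.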

stg_sap_mara_data (first 100 rows)

client creator_name last_changed_by complete_material_maintenance_status maintenance_status material_type industry_sector old_material_number base_unit_of_measure number_of_sheets gross_weight ntgew volum material_weight length width height ergew ervol gross_weight_tolerance volume_tolerance fill_quantity stfak valid_from_date material_specific_status material_specific_status_usage min_remaining_shelf_life shelf_life_expiration_date storage_percentage net_contents comparison_price_unit gross_contents internal_object_number completion_level general_item_category_group_mara row_id is_deleted container_requirements creation_date deletion_flag hazardous_material_number hazmat_form hazmat_label_group hazmat_packaging_type last_change_date manufacturer_number manufacturer_part_number material_number
0 700 hvruser hvruser k k fert m updated desc bag 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0.0 0 0.0 0 0 norm 26047 False None 1023-03-08 b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' None None None None 1023-03-08 None None 51066122
1 700 hvruser None k k zmdg m None ea 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0.0 0 0.0 0 0 norm 26048 False None 1023-03-08 b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' None None None None NaT None None 51066123
2 700 hvruser hvruser k k fert m updated desc bag 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0.0 0 0.0 0 0 norm 26053 False None 1023-03-09 b'\x00\x00\x00\x00\x00\x00\x00\x00\x00' None None None None 1023-03-09 None None 51066124
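
The creation_date and last_change_date values above come from SAP date fields, which the cleaning step casts from INT to DATE. A year of 1023 looks implausible for a material master record, so the conversion may deserve a second look. The following is a minimal sketch in Python of parsing a YYYYMMDD integer such as 20060425 (the bldat value in the sap_bkpf_data sample) and flagging suspicious years. The names parse_sap_date and looks_plausible and the 1900 cutoff are illustrative assumptions; 0 is treated as a missing date because the sample shows 0 in aedat and upddt.

```python
from datetime import date, datetime
from typing import Optional


def parse_sap_date(value: int) -> Optional[date]:
    """Convert a YYYYMMDD integer to a date; 0 is treated as 'no date'."""
    if value == 0:
        return None
    return datetime.strptime(f"{value:08d}", "%Y%m%d").date()


def looks_plausible(d: Optional[date], earliest_year: int = 1900) -> bool:
    """Flag dates before an assumed cutoff year; missing dates pass."""
    return d is None or d.year >= earliest_year


print(parse_sap_date(20060425))                   # 2006-04-25
print(parse_sap_date(0))                          # None
print(looks_plausible(parse_sap_date(10230308)))  # False
```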

stg_sap_mara_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 05:08:48.627318+00:00
WITH 
"sap_mara_data_projected" AS (
    -- Projection: Selecting 128 out of 129 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "mandt",
        "matnr",
        "ersda",
        "ernam",
        "laeda",
        "aenam",
        "vpsta",
        "pstat",
        "lvorm",
        "mtart",
        "mbrsh",
        "matkl",
        "bismt",
        "meins",
        "bstme",
        "zeinr",
        "zeiar",
        "zeivr",
        "zeifo",
        "aeszn",
        "blatt",
        "blanz",
        "ferth",
        "formt",
        "groes",
        "wrkst",
        "normt",
        "labor",
        "ekwsl",
        "brgew",
        "ntgew",
        "gewei",
        "volum",
        "voleh",
        "behvo",
        "raube",
        "tempb",
        "disst",
        "tragr",
        "stoff",
        "spart",
        "kunnr",
        "eannr",
        "wesch",
        "bwvor",
        "bwscl",
        "saiso",
        "etiar",
        "etifo",
        "entar",
        "ean11",
        "numtp",
        "laeng",
        "breit",
        "hoehe",
        "meabm",
        "prdha",
        "aeklk",
        "cadkz",
        "qmpur",
        "ergew",
        "ergei",
        "ervol",
        "ervoe",
        "gewto",
        "volto",
        "vabme",
        "kzrev",
        "kzkfg",
        "xchpf",
        "vhart",
        "fuelg",
        "stfak",
        "magrv",
        "begru",
        "datab",
        "liqdt",
        "saisj",
        "plgtp",
        "mlgut",
        "extwg",
        "satnr",
        "attyp",
        "kzkup",
        "kznfm",
        "pmata",
        "mstae",
        "mstav",
        "mstde",
        "mstdv",
        "taklv",
        "rbnrm",
        "mhdrz",
        "mhdhb",
        "mhdlp",
        "inhme",
        "inhal",
        "vpreh",
        "etiag",
        "inhbr",
        "cmeth",
        "cuobf",
        "kzumw",
        "kosch",
        "sprof",
        "nrfhg",
        "mfrpn",
        "mfrnr",
        "bmatn",
        "mprof",
        "kzwsm",
        "saity",
        "profl",
        "ihivi",
        "iloos",
        "serlv",
        "kzgvh",
        "xgchp",
        "kzeff",
        "compl",
        "iprkz",
        "rdmhd",
        "przus",
        "mtpos_mara",
        "bflme",
        "nsnid",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_mara_data"
),

"sap_mara_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- mandt -> client
    -- matnr -> material_number
    -- ersda -> creation_date
    -- ernam -> creator_name
    -- laeda -> last_change_date
    -- aenam -> last_changed_by
    -- vpsta -> complete_material_maintenance_status
    -- pstat -> maintenance_status
    -- lvorm -> deletion_block_flag
    -- mtart -> material_type
    -- mbrsh -> industry_sector
    -- matkl -> material_group
    -- bismt -> old_material_number
    -- meins -> base_unit_of_measure
    -- bstme -> purchase_order_uom
    -- zeinr -> rework_time
    -- zeiar -> production_procurement_time
    -- zeivr -> administrative_lead_time
    -- zeifo -> inhouse_production_time
    -- aeszn -> change_document_number
    -- blatt -> specification_page_number
    -- blanz -> number_of_sheets
    -- ferth -> production_memo
    -- formt -> material_format
    -- groes -> dimensions
    -- wrkst -> material_composition
    -- normt -> industry_standard_description
    -- labor -> lab_office
    -- ekwsl -> purchasing_value_key
    -- brgew -> gross_weight
    -- gewei -> weight_unit
    -- voleh -> volume_unit
    -- behvo -> container_requirements
    -- raube -> shelf_life_expiration
    -- tempb -> temperature_conditions
    -- disst -> distribution_status
    -- tragr -> transportation_group
    -- stoff -> hazardous_material_number
    -- spart -> division
    -- kunnr -> customer_number
    -- eannr -> ean_category
    -- wesch -> material_weight
    -- bwvor -> valuation_procedure
    -- bwscl -> valuation_class
    -- saiso -> season
    -- etiar -> hazmat_packaging_type
    -- etifo -> hazmat_form
    -- ean11 -> ean
    -- numtp -> number_type
    -- laeng -> length
    -- breit -> width
    -- hoehe -> height
    -- meabm -> max_storage_period
    -- prdha -> product_hierarchy
    -- aeklk -> purchase_order_text_key
    -- cadkz -> cad_indicator
    -- qmpur -> qm_procurement_active
    -- ergei -> allowed_packaging_weight
    -- ervoe -> proposer_name
    -- gewto -> gross_weight_tolerance
    -- volto -> volume_tolerance
    -- vabme -> variable_purchase_order_unit
    -- kzrev -> revision_level_indicator
    -- xchpf -> batch_management_required
    -- vhart -> packaging_material_type
    -- fuelg -> fill_quantity
    -- magrv -> packaging_material_group
    -- begru -> authorization_group
    -- datab -> valid_from_date
    -- liqdt -> deletion_flag
    -- saisj -> season_year
    -- plgtp -> planning_type
    -- mlgut -> storage_conditions
    -- extwg -> external_material_group
    -- satnr -> cross_plant_configurable_material
    -- attyp -> material_category
    -- kzkup -> co_product_indicator
    -- kznfm -> follow_up_material_indicator
    -- pmata -> pricing_reference_material
    -- mstae -> cross_plant_material_status
    -- mstav -> general_item_category_group
    -- mstde -> material_specific_status
    -- mstdv -> material_specific_status_usage
    -- taklv -> tax_classification
    -- rbnrm -> catalog_profile
    -- mhdrz -> min_remaining_shelf_life
    -- mhdhb -> shelf_life_expiration_date
    -- mhdlp -> storage_percentage
    -- inhme -> contents_unit
    -- inhal -> net_contents
    -- vpreh -> comparison_price_unit
    -- etiag -> hazmat_label_group
    -- inhbr -> gross_contents
    -- cmeth -> consumption_mode
    -- cuobf -> internal_object_number
    -- kzumw -> environmentally_relevant_indicator
    -- kosch -> product_allocation_procedure
    -- sprof -> sales_price_factor
    -- nrfhg -> central_article_number
    -- mfrpn -> manufacturer_part_number
    -- mfrnr -> manufacturer_number
    -- bmatn -> base_material_number
    -- mprof -> pricing_profile
    -- saity -> season_category
    -- profl -> profile
    -- ihivi -> hierarchy_category
    -- serlv -> serial_number_profile
    -- kzgvh -> packaging_material_indicator
    -- xgchp -> cross_plant_batch_management
    -- kzeff -> effectivity_parameter_indicator
    -- compl -> completion_level
    -- iprkz -> shelf_life_period_indicator
    -- rdmhd -> round_lot_size
    -- przus -> price_control_indicator
    -- mtpos_mara -> general_item_category_group_mara
    -- bflme -> lead_time_offset
    -- nsnid -> nato_stock_number
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "mandt" AS "client",
        "matnr" AS "material_number",
        "ersda" AS "creation_date",
        "ernam" AS "creator_name",
        "laeda" AS "last_change_date",
        "aenam" AS "last_changed_by",
        "vpsta" AS "complete_material_maintenance_status",
        "pstat" AS "maintenance_status",
        "lvorm" AS "deletion_block_flag",
        "mtart" AS "material_type",
        "mbrsh" AS "industry_sector",
        "matkl" AS "material_group",
        "bismt" AS "old_material_number",
        "meins" AS "base_unit_of_measure",
        "bstme" AS "purchase_order_uom",
        "zeinr" AS "rework_time",
        "zeiar" AS "production_procurement_time",
        "zeivr" AS "administrative_lead_time",
        "zeifo" AS "inhouse_production_time",
        "aeszn" AS "change_document_number",
        "blatt" AS "specification_page_number",
        "blanz" AS "number_of_sheets",
        "ferth" AS "production_memo",
        "formt" AS "material_format",
        "groes" AS "dimensions",
        "wrkst" AS "material_composition",
        "normt" AS "industry_standard_description",
        "labor" AS "lab_office",
        "ekwsl" AS "purchasing_value_key",
        "brgew" AS "gross_weight",
        "ntgew",
        "gewei" AS "weight_unit",
        "volum",
        "voleh" AS "volume_unit",
        "behvo" AS "container_requirements",
        "raube" AS "shelf_life_expiration",
        "tempb" AS "temperature_conditions",
        "disst" AS "distribution_status",
        "tragr" AS "transportation_group",
        "stoff" AS "hazardous_material_number",
        "spart" AS "division",
        "kunnr" AS "customer_number",
        "eannr" AS "ean_category",
        "wesch" AS "material_weight",
        "bwvor" AS "valuation_procedure",
        "bwscl" AS "valuation_class",
        "saiso" AS "season",
        "etiar" AS "hazmat_packaging_type",
        "etifo" AS "hazmat_form",
        "entar",
        "ean11" AS "ean",
        "numtp" AS "number_type",
        "laeng" AS "length",
        "breit" AS "width",
        "hoehe" AS "height",
        "meabm" AS "max_storage_period",
        "prdha" AS "product_hierarchy",
        "aeklk" AS "purchase_order_text_key",
        "cadkz" AS "cad_indicator",
        "qmpur" AS "qm_procurement_active",
        "ergew",
        "ergei" AS "allowed_packaging_weight",
        "ervol",
        "ervoe" AS "proposer_name",
        "gewto" AS "gross_weight_tolerance",
        "volto" AS "volume_tolerance",
        "vabme" AS "variable_purchase_order_unit",
        "kzrev" AS "revision_level_indicator",
        "kzkfg",
        "xchpf" AS "batch_management_required",
        "vhart" AS "packaging_material_type",
        "fuelg" AS "fill_quantity",
        "stfak",
        "magrv" AS "packaging_material_group",
        "begru" AS "authorization_group",
        "datab" AS "valid_from_date",
        "liqdt" AS "deletion_flag",
        "saisj" AS "season_year",
        "plgtp" AS "planning_type",
        "mlgut" AS "storage_conditions",
        "extwg" AS "external_material_group",
        "satnr" AS "cross_plant_configurable_material",
        "attyp" AS "material_category",
        "kzkup" AS "co_product_indicator",
        "kznfm" AS "follow_up_material_indicator",
        "pmata" AS "pricing_reference_material",
        "mstae" AS "cross_plant_material_status",
        "mstav" AS "general_item_category_group",
        "mstde" AS "material_specific_status",
        "mstdv" AS "material_specific_status_usage",
        "taklv" AS "tax_classification",
        "rbnrm" AS "catalog_profile",
        "mhdrz" AS "min_remaining_shelf_life",
        "mhdhb" AS "shelf_life_expiration_date",
        "mhdlp" AS "storage_percentage",
        "inhme" AS "contents_unit",
        "inhal" AS "net_contents",
        "vpreh" AS "comparison_price_unit",
        "etiag" AS "hazmat_label_group",
        "inhbr" AS "gross_contents",
        "cmeth" AS "consumption_mode",
        "cuobf" AS "internal_object_number",
        "kzumw" AS "environmentally_relevant_indicator",
        "kosch" AS "product_allocation_procedure",
        "sprof" AS "sales_price_factor",
        "nrfhg" AS "central_article_number",
        "mfrpn" AS "manufacturer_part_number",
        "mfrnr" AS "manufacturer_number",
        "bmatn" AS "base_material_number",
        "mprof" AS "pricing_profile",
        "kzwsm",
        "saity" AS "season_category",
        "profl" AS "profile",
        "ihivi" AS "hierarchy_category",
        "iloos",
        "serlv" AS "serial_number_profile",
        "kzgvh" AS "packaging_material_indicator",
        "xgchp" AS "cross_plant_batch_management",
        "kzeff" AS "effectivity_parameter_indicator",
        "compl" AS "completion_level",
        "iprkz" AS "shelf_life_period_indicator",
        "rdmhd" AS "round_lot_size",
        "przus" AS "price_control_indicator",
        "mtpos_mara" AS "general_item_category_group_mara",
        "bflme" AS "lead_time_offset",
        "nsnid" AS "nato_stock_number",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_mara_data_projected"
),

"sap_mara_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values:
    -- complete_material_maintenance_status: 'k' is the only value in the column and its meaning is not documented here, so it is mapped to an empty string.
    -- maintenance_status: 'k' is also the only value here; without a data dictionary its meaning is unclear, so it is kept as is and flagged for clarification with the data source.
    -- industry_sector: 'm' is the only value and is not self-explanatory; it is mapped to NULL.
    -- old_material_number: the only value is 'updated desc', which looks like a placeholder rather than a material number; it is mapped to an empty string.
    SELECT
        "client",
        "material_number",
        "creation_date",
        "creator_name",
        "last_change_date",
        "last_changed_by",
        CASE
            WHEN "complete_material_maintenance_status" = 'k' THEN ''
            ELSE "complete_material_maintenance_status"
        END AS "complete_material_maintenance_status",
        "maintenance_status",
        "deletion_block_flag",
        "material_type",
        CASE
            WHEN "industry_sector" = 'm' THEN NULL
            ELSE "industry_sector"
        END AS "industry_sector",
        "material_group",
        CASE
            WHEN "old_material_number" = 'updated desc' THEN ''
            ELSE "old_material_number"
        END AS "old_material_number",
        "base_unit_of_measure",
        "purchase_order_uom",
        "rework_time",
        "production_procurement_time",
        "administrative_lead_time",
        "inhouse_production_time",
        "change_document_number",
        "specification_page_number",
        "number_of_sheets",
        "production_memo",
        "material_format",
        "dimensions",
        "material_composition",
        "industry_standard_description",
        "lab_office",
        "purchasing_value_key",
        "gross_weight",
        "ntgew",
        "weight_unit",
        "volum",
        "volume_unit",
        "container_requirements",
        "shelf_life_expiration",
        "temperature_conditions",
        "distribution_status",
        "transportation_group",
        "hazardous_material_number",
        "division",
        "customer_number",
        "ean_category",
        "material_weight",
        "valuation_procedure",
        "valuation_class",
        "season",
        "hazmat_packaging_type",
        "hazmat_form",
        "entar",
        "ean",
        "number_type",
        "length",
        "width",
        "height",
        "max_storage_period",
        "product_hierarchy",
        "purchase_order_text_key",
        "cad_indicator",
        "qm_procurement_active",
        "ergew",
        "allowed_packaging_weight",
        "ervol",
        "proposer_name",
        "gross_weight_tolerance",
        "volume_tolerance",
        "variable_purchase_order_unit",
        "revision_level_indicator",
        "kzkfg",
        "batch_management_required",
        "packaging_material_type",
        "fill_quantity",
        "stfak",
        "packaging_material_group",
        "authorization_group",
        "valid_from_date",
        "deletion_flag",
        "season_year",
        "planning_type",
        "storage_conditions",
        "external_material_group",
        "cross_plant_configurable_material",
        "material_category",
        "co_product_indicator",
        "follow_up_material_indicator",
        "pricing_reference_material",
        "cross_plant_material_status",
        "general_item_category_group",
        "material_specific_status",
        "material_specific_status_usage",
        "tax_classification",
        "catalog_profile",
        "min_remaining_shelf_life",
        "shelf_life_expiration_date",
        "storage_percentage",
        "contents_unit",
        "net_contents",
        "comparison_price_unit",
        "hazmat_label_group",
        "gross_contents",
        "consumption_mode",
        "internal_object_number",
        "environmentally_relevant_indicator",
        "product_allocation_procedure",
        "sales_price_factor",
        "central_article_number",
        "manufacturer_part_number",
        "manufacturer_number",
        "base_material_number",
        "pricing_profile",
        "kzwsm",
        "season_category",
        "profile",
        "hierarchy_category",
        "iloos",
        "serial_number_profile",
        "packaging_material_indicator",
        "cross_plant_batch_management",
        "effectivity_parameter_indicator",
        "completion_level",
        "shelf_life_period_indicator",
        "round_lot_size",
        "price_control_indicator",
        "general_item_category_group_mara",
        "lead_time_offset",
        "nato_stock_number",
        "row_id",
        "is_deleted"
    FROM "sap_mara_data_projected_renamed"
),

"sap_mara_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- administrative_lead_time: from DECIMAL to INT
    -- authorization_group: from DECIMAL to VARCHAR
    -- base_material_number: from DECIMAL to VARCHAR
    -- batch_management_required: from DECIMAL to BOOLEAN
    -- cad_indicator: from DECIMAL to BOOLEAN
    -- catalog_profile: from DECIMAL to VARCHAR
    -- central_article_number: from DECIMAL to VARCHAR
    -- change_document_number: from DECIMAL to VARCHAR
    -- co_product_indicator: from DECIMAL to BOOLEAN
    -- consumption_mode: from DECIMAL to VARCHAR
    -- container_requirements: from DECIMAL to VARCHAR
    -- contents_unit: from DECIMAL to VARCHAR
    -- creation_date: from INT to DATE
    -- cross_plant_batch_management: from DECIMAL to BOOLEAN
    -- cross_plant_configurable_material: from DECIMAL to BOOLEAN
    -- cross_plant_material_status: from DECIMAL to VARCHAR
    -- customer_number: from DECIMAL to VARCHAR
    -- deletion_block_flag: from DECIMAL to BOOLEAN
    -- deletion_flag: from INT to BIT
    -- dimensions: from DECIMAL to VARCHAR
    -- distribution_status: from DECIMAL to VARCHAR
    -- division: from DECIMAL to VARCHAR
    -- ean: from DECIMAL to VARCHAR
    -- ean_category: from DECIMAL to VARCHAR
    -- effectivity_parameter_indicator: from DECIMAL to VARCHAR
    -- entar: from DECIMAL to VARCHAR
    -- environmentally_relevant_indicator: from DECIMAL to VARCHAR
    -- external_material_group: from DECIMAL to VARCHAR
    -- follow_up_material_indicator: from DECIMAL to VARCHAR
    -- general_item_category_group: from DECIMAL to VARCHAR
    -- hazardous_material_number: from DECIMAL to VARCHAR
    -- hazmat_form: from DECIMAL to VARCHAR
    -- hazmat_label_group: from DECIMAL to VARCHAR
    -- hazmat_packaging_type: from DECIMAL to VARCHAR
    -- hierarchy_category: from DECIMAL to VARCHAR
    -- iloos: from DECIMAL to VARCHAR
    -- industry_standard_description: from DECIMAL to VARCHAR
    -- inhouse_production_time: from DECIMAL to VARCHAR
    -- kzkfg: from DECIMAL to VARCHAR
    -- kzwsm: from DECIMAL to VARCHAR
    -- lab_office: from DECIMAL to VARCHAR
    -- last_change_date: from INT to DATE
    -- lead_time_offset: from DECIMAL to VARCHAR
    -- manufacturer_number: from DECIMAL to VARCHAR
    -- manufacturer_part_number: from DECIMAL to VARCHAR
    -- material_category: from DECIMAL to VARCHAR
    -- material_composition: from DECIMAL to VARCHAR
    -- material_format: from DECIMAL to VARCHAR
    -- material_group: from DECIMAL to VARCHAR
    -- material_number: from INT to VARCHAR
    -- max_storage_period: from DECIMAL to VARCHAR
    -- nato_stock_number: from DECIMAL to VARCHAR
    -- number_type: from DECIMAL to VARCHAR
    -- packaging_material_group: from DECIMAL to VARCHAR
    -- packaging_material_indicator: from DECIMAL to VARCHAR
    -- packaging_material_type: from DECIMAL to VARCHAR
    -- planning_type: from DECIMAL to VARCHAR
    -- price_control_indicator: from DECIMAL to VARCHAR
    -- pricing_profile: from DECIMAL to VARCHAR
    -- pricing_reference_material: from DECIMAL to VARCHAR
    -- product_allocation_procedure: from DECIMAL to VARCHAR
    -- product_hierarchy: from DECIMAL to VARCHAR
    -- production_memo: from DECIMAL to VARCHAR
    -- production_procurement_time: from DECIMAL to VARCHAR
    -- profile: from DECIMAL to VARCHAR
    -- proposer_name: from DECIMAL to VARCHAR
    -- purchase_order_text_key: from DECIMAL to VARCHAR
    -- purchase_order_uom: from DECIMAL to VARCHAR
    -- purchasing_value_key: from DECIMAL to VARCHAR
    -- qm_procurement_active: from DECIMAL to VARCHAR
    -- revision_level_indicator: from DECIMAL to VARCHAR
    -- rework_time: from DECIMAL to VARCHAR
    -- round_lot_size: from DECIMAL to VARCHAR
    -- sales_price_factor: from DECIMAL to VARCHAR
    -- season: from DECIMAL to VARCHAR
    -- season_category: from DECIMAL to VARCHAR
    -- season_year: from DECIMAL to VARCHAR
    -- serial_number_profile: from DECIMAL to VARCHAR
    -- shelf_life_expiration: from DECIMAL to VARCHAR
    -- shelf_life_period_indicator: from DECIMAL to VARCHAR
    -- specification_page_number: from DECIMAL to VARCHAR
    -- storage_conditions: from DECIMAL to VARCHAR
    -- tax_classification: from DECIMAL to VARCHAR
    -- temperature_conditions: from DECIMAL to VARCHAR
    -- transportation_group: from DECIMAL to VARCHAR
    -- valuation_class: from DECIMAL to VARCHAR
    -- valuation_procedure: from DECIMAL to VARCHAR
    -- variable_purchase_order_unit: from DECIMAL to VARCHAR
    -- volume_unit: from DECIMAL to VARCHAR
    -- weight_unit: from DECIMAL to VARCHAR
    SELECT
        "client",
        "creator_name",
        "last_changed_by",
        "complete_material_maintenance_status",
        "maintenance_status",
        "material_type",
        "industry_sector",
        "old_material_number",
        "base_unit_of_measure",
        "number_of_sheets",
        "gross_weight",
        "ntgew",
        "volum",
        "material_weight",
        "length",
        "width",
        "height",
        "ergew",
        "allowed_packaging_weight",
        "ervol",
        "gross_weight_tolerance",
        "volume_tolerance",
        "fill_quantity",
        "stfak",
        "valid_from_date",
        "material_specific_status",
        "material_specific_status_usage",
        "min_remaining_shelf_life",
        "shelf_life_expiration_date",
        "storage_percentage",
        "net_contents",
        "comparison_price_unit",
        "gross_contents",
        "internal_object_number",
        "completion_level",
        "general_item_category_group_mara",
        "row_id",
        "is_deleted",
        CAST("administrative_lead_time" AS INT) AS "administrative_lead_time",
        CAST("authorization_group" AS VARCHAR) AS "authorization_group",
        CAST("base_material_number" AS VARCHAR) AS "base_material_number",
        CAST("batch_management_required" AS BOOLEAN) AS "batch_management_required",
        CAST("cad_indicator" AS BOOLEAN) AS "cad_indicator",
        CAST("catalog_profile" AS VARCHAR) AS "catalog_profile",
        CAST("central_article_number" AS VARCHAR) AS "central_article_number",
        CAST("change_document_number" AS VARCHAR) AS "change_document_number",
        CAST("co_product_indicator" AS BOOLEAN) AS "co_product_indicator",
        CAST("consumption_mode" AS VARCHAR) AS "consumption_mode",
        CAST("container_requirements" AS VARCHAR) AS "container_requirements",
        CAST("contents_unit" AS VARCHAR) AS "contents_unit",
        CAST(strptime(CAST("creation_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "creation_date",
        CAST("cross_plant_batch_management" AS BOOLEAN) AS "cross_plant_batch_management",
        CAST("cross_plant_configurable_material" AS BOOLEAN) AS "cross_plant_configurable_material",
        CAST("cross_plant_material_status" AS VARCHAR) AS "cross_plant_material_status",
        CAST("customer_number" AS VARCHAR) AS "customer_number",
        CAST("deletion_block_flag" AS BOOLEAN) AS "deletion_block_flag",
        CAST("deletion_flag" AS BIT) AS "deletion_flag",
        CAST("dimensions" AS VARCHAR) AS "dimensions",
        CAST("distribution_status" AS VARCHAR) AS "distribution_status",
        CAST("division" AS VARCHAR) AS "division",
        CAST("ean" AS VARCHAR) AS "ean",
        CAST("ean_category" AS VARCHAR) AS "ean_category",
        CAST("effectivity_parameter_indicator" AS VARCHAR) AS "effectivity_parameter_indicator",
        CAST("entar" AS VARCHAR) AS "entar",
        CAST("environmentally_relevant_indicator" AS VARCHAR) AS "environmentally_relevant_indicator",
        CAST("external_material_group" AS VARCHAR) AS "external_material_group",
        CAST("follow_up_material_indicator" AS VARCHAR) AS "follow_up_material_indicator",
        CAST("general_item_category_group" AS VARCHAR) AS "general_item_category_group",
        CAST("hazardous_material_number" AS VARCHAR) AS "hazardous_material_number",
        CAST("hazmat_form" AS VARCHAR) AS "hazmat_form",
        CAST("hazmat_label_group" AS VARCHAR) AS "hazmat_label_group",
        CAST("hazmat_packaging_type" AS VARCHAR) AS "hazmat_packaging_type",
        CAST("hierarchy_category" AS VARCHAR) AS "hierarchy_category",
        CAST("iloos" AS VARCHAR) AS "iloos",
        CAST("industry_standard_description" AS VARCHAR) AS "industry_standard_description",
        CAST("inhouse_production_time" AS VARCHAR) AS "inhouse_production_time",
        CAST("kzkfg" AS VARCHAR) AS "kzkfg",
        CAST("kzwsm" AS VARCHAR) AS "kzwsm",
        CAST("lab_office" AS VARCHAR) AS "lab_office",
        CASE
            WHEN "last_change_date" BETWEEN 10000101 AND 99991231
            THEN CAST(strptime(CAST("last_change_date" AS VARCHAR), '%Y%m%d') AS DATE)
            ELSE NULL
        END AS "last_change_date",
        CAST("lead_time_offset" AS VARCHAR) AS "lead_time_offset",
        CAST("manufacturer_number" AS VARCHAR) AS "manufacturer_number",
        CAST("manufacturer_part_number" AS VARCHAR) AS "manufacturer_part_number",
        CAST("material_category" AS VARCHAR) AS "material_category",
        CAST("material_composition" AS VARCHAR) AS "material_composition",
        CAST("material_format" AS VARCHAR) AS "material_format",
        CAST("material_group" AS VARCHAR) AS "material_group",
        CAST("material_number" AS VARCHAR) AS "material_number",
        CAST("max_storage_period" AS VARCHAR) AS "max_storage_period",
        CAST("nato_stock_number" AS VARCHAR) AS "nato_stock_number",
        CAST("number_type" AS VARCHAR) AS "number_type",
        CAST("packaging_material_group" AS VARCHAR) AS "packaging_material_group",
        CAST("packaging_material_indicator" AS VARCHAR) AS "packaging_material_indicator",
        CAST("packaging_material_type" AS VARCHAR) AS "packaging_material_type",
        CAST("planning_type" AS VARCHAR) AS "planning_type",
        CAST("price_control_indicator" AS VARCHAR) AS "price_control_indicator",
        CAST("pricing_profile" AS VARCHAR) AS "pricing_profile",
        CAST("pricing_reference_material" AS VARCHAR) AS "pricing_reference_material",
        CAST("product_allocation_procedure" AS VARCHAR) AS "product_allocation_procedure",
        CAST("product_hierarchy" AS VARCHAR) AS "product_hierarchy",
        CAST("production_memo" AS VARCHAR) AS "production_memo",
        CAST("production_procurement_time" AS VARCHAR) AS "production_procurement_time",
        CAST("profile" AS VARCHAR) AS "profile",
        CAST("proposer_name" AS VARCHAR) AS "proposer_name",
        CAST("purchase_order_text_key" AS VARCHAR) AS "purchase_order_text_key",
        CAST("purchase_order_uom" AS VARCHAR) AS "purchase_order_uom",
        CAST("purchasing_value_key" AS VARCHAR) AS "purchasing_value_key",
        CAST("qm_procurement_active" AS VARCHAR) AS "qm_procurement_active",
        CAST("revision_level_indicator" AS VARCHAR) AS "revision_level_indicator",
        CAST("rework_time" AS VARCHAR) AS "rework_time",
        CAST("round_lot_size" AS VARCHAR) AS "round_lot_size",
        CAST("sales_price_factor" AS VARCHAR) AS "sales_price_factor",
        CAST("season" AS VARCHAR) AS "season",
        CAST("season_category" AS VARCHAR) AS "season_category",
        CAST("season_year" AS VARCHAR) AS "season_year",
        CAST("serial_number_profile" AS VARCHAR) AS "serial_number_profile",
        CAST("shelf_life_expiration" AS VARCHAR) AS "shelf_life_expiration",
        CAST("shelf_life_period_indicator" AS VARCHAR) AS "shelf_life_period_indicator",
        CAST("specification_page_number" AS VARCHAR) AS "specification_page_number",
        CAST("storage_conditions" AS VARCHAR) AS "storage_conditions",
        CAST("tax_classification" AS VARCHAR) AS "tax_classification",
        CAST("temperature_conditions" AS VARCHAR) AS "temperature_conditions",
        CAST("transportation_group" AS VARCHAR) AS "transportation_group",
        CAST("valuation_class" AS VARCHAR) AS "valuation_class",
        CAST("valuation_procedure" AS VARCHAR) AS "valuation_procedure",
        CAST("variable_purchase_order_unit" AS VARCHAR) AS "variable_purchase_order_unit",
        CAST("volume_unit" AS VARCHAR) AS "volume_unit",
        CAST("weight_unit" AS VARCHAR) AS "weight_unit"
    FROM "sap_mara_data_projected_renamed_cleaned"
),

"sap_mara_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 82 columns with unacceptable missing values
    -- administrative_lead_time has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- allowed_packaging_weight has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- authorization_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- base_material_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- batch_management_required has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cad_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- catalog_profile has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- central_article_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- change_document_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- co_product_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- consumption_mode has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- contents_unit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cross_plant_batch_management has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cross_plant_configurable_material has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cross_plant_material_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- deletion_block_flag has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- dimensions has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- distribution_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- division has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ean has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ean_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- effectivity_parameter_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- entar has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- environmentally_relevant_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- external_material_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- follow_up_material_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- general_item_category_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- hierarchy_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- iloos has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_standard_description has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- inhouse_production_time has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kzkfg has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kzwsm has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- lab_office has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- last_change_date has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- last_changed_by has 33.33 percent missing. Strategy: 🔄 Unchanged
    -- lead_time_offset has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_composition has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_format has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- max_storage_period has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- nato_stock_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- number_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- packaging_material_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- packaging_material_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- packaging_material_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- planning_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- price_control_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pricing_profile has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pricing_reference_material has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- product_allocation_procedure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- product_hierarchy has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- production_memo has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- production_procurement_time has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- profile has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- proposer_name has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- purchase_order_text_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- purchase_order_uom has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- purchasing_value_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- qm_procurement_active has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- revision_level_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- rework_time has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- round_lot_size has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_price_factor has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- season has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- season_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- season_year has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- serial_number_profile has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- shelf_life_expiration has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- shelf_life_period_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- specification_page_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- storage_conditions has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_classification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- temperature_conditions has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- transportation_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- valuation_class has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- valuation_procedure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- variable_purchase_order_unit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- volume_unit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- weight_unit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "client",
        "creator_name",
        "last_changed_by",
        "complete_material_maintenance_status",
        "maintenance_status",
        "material_type",
        "industry_sector",
        "old_material_number",
        "base_unit_of_measure",
        "number_of_sheets",
        "gross_weight",
        "ntgew",
        "volum",
        "material_weight",
        "length",
        "width",
        "height",
        "ergew",
        "ervol",
        "gross_weight_tolerance",
        "volume_tolerance",
        "fill_quantity",
        "stfak",
        "valid_from_date",
        "material_specific_status",
        "material_specific_status_usage",
        "min_remaining_shelf_life",
        "shelf_life_expiration_date",
        "storage_percentage",
        "net_contents",
        "comparison_price_unit",
        "gross_contents",
        "internal_object_number",
        "completion_level",
        "general_item_category_group_mara",
        "row_id",
        "is_deleted",
        "container_requirements",
        "creation_date",
        "deletion_flag",
        "hazardous_material_number",
        "hazmat_form",
        "hazmat_label_group",
        "hazmat_packaging_type",
        "last_change_date",
        "manufacturer_number",
        "manufacturer_part_number",
        "material_number"
    FROM "sap_mara_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_mara_data_projected_renamed_cleaned_casted_missing_handled"
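
The guard around last_change_date in the casting step only parses integers that fall in the eight-digit YYYYMMDD range and maps everything else to NULL. Placeholder values such as the 0 seen in date-like columns of the raw samples are the likely reason for it. The same rule as a minimal Python sketch, where parse_yyyymmdd is a hypothetical helper name and not part of the generated model:

```python
from datetime import date, datetime
from typing import Optional

# Bounds copied from the BETWEEN clause applied to last_change_date.
MIN_YYYYMMDD = 10000101
MAX_YYYYMMDD = 99991231


def parse_yyyymmdd(value: Optional[int]) -> Optional[date]:
    """Hypothetical helper: turn an integer like 20060425 into a date."""
    if value is None or not (MIN_YYYYMMDD <= value <= MAX_YYYYMMDD):
        # NULLs and placeholders such as 0 fall through, mirroring ELSE NULL.
        return None
    # In-range values that are not real calendar dates (for example 20061332)
    # raise ValueError here rather than being silently dropped.
    return datetime.strptime(str(value), "%Y%m%d").date()


print(parse_yyyymmdd(20060425))
print(parse_yyyymmdd(0))
```

creation_date is parsed without this guard, so a 0 in that column would presumably make the parse fail; applying the same range check there is one option if that ever happens.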

stg_sap_mara_data.yml (Document the table)

version: 2
models:
- name: stg_sap_mara_data
  description: Material master data from the SAP MARA table. Each row is one material
    record with attributes such as material number, creation date, material type,
    base unit of measure, weight, and volume, plus flags and indicators used in
    procurement, sales, and inventory management.
  columns:
  - name: client
    description: Client
    tests:
    - not_null
  - name: creator_name
    description: Name of person who created the record
    tests:
    - not_null
  - name: last_changed_by
    description: Name of person who changed the record
    tests:
    - not_null
  - name: complete_material_maintenance_status
    description: Maintenance status of complete material
    tests:
    - not_null
    - accepted_values:
        values:
        - k
        - a
        - i
        - p
        - r
        - c
        - n
        - d
  - name: maintenance_status
    description: Maintenance status
    tests:
    - not_null
    - accepted_values:
        values:
        - OK
        - Needs Maintenance
        - Under Maintenance
        - Critical
        - Scheduled
        - Deferred
        - Completed
        - Pending
        - N/A
        - k
  - name: material_type
    description: Material type
    tests:
    - not_null
  - name: industry_sector
    description: Industry sector
    tests:
    - not_null
    - accepted_values:
        values:
        - a
        - b
        - c
        - d
        - e
        - f
        - g
        - h
        - i
        - j
        - k
        - l
        - m
        - n
        - o
        - p
        - q
        - r
        - s
        - t
        - u
  - name: old_material_number
    description: Old material number
    cocoon_meta:
      missing_acceptable: Not applicable for newly created materials without previous
        numbers.
  - name: base_unit_of_measure
    description: Base unit of measure
    tests:
    - not_null
    - accepted_values:
        values:
        - bag
        - ea
        - kg
        - g
        - lb
        - oz
        - l
        - ml
        - m
        - cm
        - mm
        - in
        - ft
        - yd
        - pc
        - box
        - case
        - pack
        - set
        - pair
        - roll
        - sheet
        - unit
  - name: number_of_sheets
    description: Number of sheets
    tests:
    - not_null
  - name: gross_weight
    description: Gross weight
    tests:
    - not_null
  - name: ntgew
    description: Net weight (SAP field NTGEW)
    tests:
    - not_null
  - name: volum
    description: Volume (SAP field VOLUM)
    tests:
    - not_null
  - name: material_weight
    description: Weight of the material
    tests:
    - not_null
  - name: length
    description: Length of the material
    tests:
    - not_null
  - name: width
    description: Width of the material
    tests:
    - not_null
  - name: height
    description: Height
    tests:
    - not_null
  - name: ergew
    description: Allowed packaging weight (SAP field ERGEW)
    tests:
    - not_null
  - name: ervol
    description: Allowed packaging volume (SAP field ERVOL)
    tests:
    - not_null
  - name: gross_weight_tolerance
    description: Gross weight tolerance
    tests:
    - not_null
  - name: volume_tolerance
    description: Volume tolerance
    tests:
    - not_null
  - name: fill_quantity
    description: Fill quantity
    tests:
    - not_null
  - name: stfak
    description: Stacking factor (SAP field STFAK)
    tests:
    - not_null
  - name: valid_from_date
    description: Valid-from date
    tests:
    - not_null
  - name: material_specific_status
    description: Material-specific status
    tests:
    - not_null
  - name: material_specific_status_usage
    description: Material-specific status usage
    tests:
    - not_null
  - name: min_remaining_shelf_life
    description: Minimum remaining shelf life
    tests:
    - not_null
  - name: shelf_life_expiration_date
    description: Shelf life expiration date
    tests:
    - not_null
  - name: storage_percentage
    description: Storage percentage
    tests:
    - not_null
  - name: net_contents
    description: Net contents
    tests:
    - not_null
  - name: comparison_price_unit
    description: Comparison price unit
    tests:
    - not_null
  - name: gross_contents
    description: Gross contents
    tests:
    - not_null
  - name: internal_object_number
    description: Internal object number
    tests:
    - not_null
  - name: completion_level
    description: Material completion level
    tests:
    - not_null
  - name: general_item_category_group_mara
    description: General item category group
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a unique identifier for each row in the
        table. For this table, each row represents a unique material record. The row_id
        seems to be unique across rows based on the sample data.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: container_requirements
    description: Container requirements
    cocoon_meta:
      missing_acceptable: Only applicable for materials requiring specific containers
  - name: creation_date
    description: Date when the record was created
    tests:
    - not_null
  - name: deletion_flag
    description: Deletion flag
    tests:
    - not_null
  - name: hazardous_material_number
    description: Hazardous material number
    cocoon_meta:
      missing_acceptable: Only applicable for hazardous materials
  - name: hazmat_form
    description: Form of hazardous materials
    cocoon_meta:
      missing_acceptable: Only applicable for hazardous materials
  - name: hazmat_label_group
    description: Labeling group for hazardous materials
    cocoon_meta:
      missing_acceptable: Only applicable for hazardous materials
  - name: hazmat_packaging_type
    description: Packaging requirements for hazardous materials
    cocoon_meta:
      missing_acceptable: Only applicable for hazardous materials
  - name: last_change_date
    description: Date of last change
    tests:
    - not_null
  - name: manufacturer_number
    description: Manufacturer number
    cocoon_meta:
      missing_acceptable: Not applicable for in-house produced items.
  - name: manufacturer_part_number
    description: Manufacturer part number
    cocoon_meta:
      missing_acceptable: Not applicable for in-house produced items.
  - name: material_number
    description: Material number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the unique identifier for each material in
        the SAP system. For this table, each row is a distinct material, and the material
        number should be unique across all materials.
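
The tests listed in this file are dbt generic tests. As a rough sketch of what not_null and unique check (not dbt's exact compiled SQL), the queries below run against a hypothetical three-row stand-in table using Python's built-in sqlite3 module. The table name stg and all values are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for stg_sap_mara_data; the rows are invented.
conn.execute("CREATE TABLE stg (material_number TEXT, last_changed_by TEXT)")
conn.executemany(
    "INSERT INTO stg VALUES (?, ?)",
    [("23", "user_a"), ("38", None), ("43", "user_a")],
)

# not_null: the check fails if any row has a NULL in the column.
null_rows = conn.execute(
    "SELECT COUNT(*) FROM stg WHERE last_changed_by IS NULL"
).fetchone()[0]

# unique: the check fails if any non-NULL value occurs more than once.
duplicates = conn.execute(
    "SELECT material_number FROM stg WHERE material_number IS NOT NULL "
    "GROUP BY material_number HAVING COUNT(*) > 1"
).fetchall()

print(null_rows)   # 1
print(duplicates)  # []
```

A not_null test on a column that still has missing rows, such as last_changed_by with 33.33 percent missing according to the missing-value step of the SQL model, would be expected to report failures until those rows are handled.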

stg_sap_ska1_data (first 100 rows)

chart_of_accounts creator_username account_group row_id is_deleted balance_sheet_account_type company_code creation_date gl_account_group_currency gl_account_number is_balance_sheet_account pl_account_type
0 dabe sap sako 1 False None 0 1992-06-24 111000 111000 True None
1 dabe sap sako 2 False None 0 1992-06-25 112000 112000 True None
2 dabe sap sako 3 False None 0 1992-06-26 113000 113000 True None

stg_sap_ska1_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 14:47:45.973739+00:00
WITH 
"sap_ska1_data_projected" AS (
    -- Projection: Selecting 19 out of 20 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "mandt",
        "ktopl",
        "saknr",
        "bilkt",
        "gvtyp",
        "vbund",
        "xbilk",
        "sakan",
        "erdat",
        "ernam",
        "ktoks",
        "xloev",
        "xspea",
        "xspeb",
        "xspep",
        "func_area",
        "mustr",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_ska1_data"
),

"sap_ska1_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- mandt -> company_code
    -- ktopl -> chart_of_accounts
    -- saknr -> gl_account_number
    -- bilkt -> balance_sheet_account_type
    -- gvtyp -> pl_account_type
    -- vbund -> partner_company_id
    -- xbilk -> is_balance_sheet_account
    -- sakan -> gl_account_group_currency
    -- erdat -> creation_date
    -- ernam -> creator_username
    -- ktoks -> account_group
    -- xloev -> is_marked_for_deletion
    -- xspea -> is_posting_blocked_client
    -- xspeb -> is_posting_blocked_company
    -- xspep -> is_posting_blocked_profit_center
    -- func_area -> functional_area
    -- mustr -> sample_account_indicator
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "mandt" AS "company_code",
        "ktopl" AS "chart_of_accounts",
        "saknr" AS "gl_account_number",
        "bilkt" AS "balance_sheet_account_type",
        "gvtyp" AS "pl_account_type",
        "vbund" AS "partner_company_id",
        "xbilk" AS "is_balance_sheet_account",
        "sakan" AS "gl_account_group_currency",
        "erdat" AS "creation_date",
        "ernam" AS "creator_username",
        "ktoks" AS "account_group",
        "xloev" AS "is_marked_for_deletion",
        "xspea" AS "is_posting_blocked_client",
        "xspeb" AS "is_posting_blocked_company",
        "xspep" AS "is_posting_blocked_profit_center",
        "func_area" AS "functional_area",
        "mustr" AS "sample_account_indicator",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_ska1_data_projected"
),

"sap_ska1_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values: none of the flagged values needs rewriting, so both columns pass through unchanged.
    -- chart_of_accounts: 'dabe' is the only value present. It comes from ktopl, which holds a chart of accounts key, not an accounting term, so it is not a typo for 'debit' and is kept as-is.
    -- is_balance_sheet_account: 'x' is the only value present. SAP indicator fields store 'x' when the flag is set and leave it blank otherwise; the casting step below converts this to BOOLEAN by comparing against 'x'.
    SELECT
        "company_code",
        "chart_of_accounts",
        "gl_account_number",
        "balance_sheet_account_type",
        "pl_account_type",
        "partner_company_id",
        "is_balance_sheet_account",
        "gl_account_group_currency",
        "creation_date",
        "creator_username",
        "account_group",
        "is_marked_for_deletion",
        "is_posting_blocked_client",
        "is_posting_blocked_company",
        "is_posting_blocked_profit_center",
        "functional_area",
        "sample_account_indicator",
        "row_id",
        "is_deleted"
    FROM "sap_ska1_data_projected_renamed"
),

"sap_ska1_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- balance_sheet_account_type: from DECIMAL to VARCHAR
    -- company_code: from INT to VARCHAR
    -- creation_date: from INT to DATE
    -- functional_area: from DECIMAL to VARCHAR
    -- gl_account_group_currency: from INT to VARCHAR
    -- gl_account_number: from INT to VARCHAR
    -- is_balance_sheet_account: from VARCHAR to BOOLEAN
    -- is_marked_for_deletion: from DECIMAL to VARCHAR
    -- is_posting_blocked_client: from DECIMAL to VARCHAR
    -- is_posting_blocked_company: from DECIMAL to VARCHAR
    -- is_posting_blocked_profit_center: from DECIMAL to VARCHAR
    -- partner_company_id: from DECIMAL to VARCHAR
    -- pl_account_type: from DECIMAL to VARCHAR
    -- sample_account_indicator: from DECIMAL to VARCHAR
    SELECT
        "chart_of_accounts",
        "creator_username",
        "account_group",
        "row_id",
        "is_deleted",
        CAST("balance_sheet_account_type" AS VARCHAR) AS "balance_sheet_account_type",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST(strptime(CAST("creation_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "creation_date",
        CAST("functional_area" AS VARCHAR) AS "functional_area",
        CAST("gl_account_group_currency" AS VARCHAR) AS "gl_account_group_currency",
        CAST("gl_account_number" AS VARCHAR) AS "gl_account_number",
        CAST("is_balance_sheet_account" = 'x' AS BOOLEAN) AS "is_balance_sheet_account",
        CAST("is_marked_for_deletion" AS VARCHAR) AS "is_marked_for_deletion",
        CAST("is_posting_blocked_client" AS VARCHAR) AS "is_posting_blocked_client",
        CAST("is_posting_blocked_company" AS VARCHAR) AS "is_posting_blocked_company",
        CAST("is_posting_blocked_profit_center" AS VARCHAR) AS "is_posting_blocked_profit_center",
        CAST("partner_company_id" AS VARCHAR) AS "partner_company_id",
        CAST("pl_account_type" AS VARCHAR) AS "pl_account_type",
        CAST("sample_account_indicator" AS VARCHAR) AS "sample_account_indicator"
    FROM "sap_ska1_data_projected_renamed_cleaned"
),

"sap_ska1_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 7 columns with unacceptable missing values
    -- functional_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_marked_for_deletion has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_posting_blocked_client has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_posting_blocked_company has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_posting_blocked_profit_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_company_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sample_account_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "chart_of_accounts",
        "creator_username",
        "account_group",
        "row_id",
        "is_deleted",
        "balance_sheet_account_type",
        "company_code",
        "creation_date",
        "gl_account_group_currency",
        "gl_account_number",
        "is_balance_sheet_account",
        "pl_account_type"
    FROM "sap_ska1_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_ska1_data_projected_renamed_cleaned_casted_missing_handled"
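
The missing-value step above drops every column that is empty in all rows. A minimal sketch of that rule in Python follows, assuming rows are plain dictionaries and a missing value is `None`. The helper names `missing_percent` and `drop_fully_missing` and the sample values are hypothetical; how the generator decides that other missing values are acceptable is not shown in the source and is not modeled here.

```python
from typing import Optional

Row = dict[str, Optional[str]]


def missing_percent(rows: list[Row], column: str) -> float:
    """Return the percentage of rows in which `column` is None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return 100.0 * missing / len(rows)


def drop_fully_missing(rows: list[Row]) -> list[Row]:
    """Drop every column that is missing in 100 percent of the rows."""
    columns = list(rows[0]) if rows else []
    keep = [c for c in columns if missing_percent(rows, c) < 100.0]
    return [{c: row[c] for c in keep} for row in rows]


# Hypothetical sample rows; the values are illustrative only.
sample = [
    {"gl_account_number": "1000", "functional_area": None, "pl_account_type": None},
    {"gl_account_number": "2000", "functional_area": None, "pl_account_type": "x"},
]
cleaned = drop_fully_missing(sample)
print(list(cleaned[0]))  # functional_area is dropped, pl_account_type is kept
```

Partially missing columns such as `pl_account_type` survive this rule, which matches the staged output keeping them.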

stg_sap_ska1_data.yml (Document the table)

version: 2
models:
- name: stg_sap_ska1_data
  description: This table holds G/L account master records at the chart-of-accounts
    level. It contains account numbers (saknr) and their properties. Key fields include
    client (mandt), chart of accounts (ktopl), and account number (saknr). Other fields
    provide details such as creation date (erdat), creator (ernam), account group
    (ktoks), and various flags (xbilk, xloev, etc.). The table serves as master data
    for financial accounting.
  columns:
  - name: chart_of_accounts
    description: Chart of accounts identifier
    tests:
    - not_null
  - name: creator_username
    description: Username of the record creator
    tests:
    - not_null
  - name: account_group
    description: Account group
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is a unique identifier for each row in the table. For
        this table, each row represents a specific GL account entry. The row_id is
        likely to be unique across rows as it's designed to be a distinct identifier.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: balance_sheet_account_type
    description: Balance sheet account type
    cocoon_meta:
      missing_acceptable: Not applicable for non-balance sheet accounts.
  - name: company_code
    description: Company code
    tests:
    - not_null
  - name: creation_date
    description: Creation date of the record
    tests:
    - not_null
  - name: gl_account_group_currency
    description: G/L account number in group currency
    tests:
    - not_null
  - name: gl_account_number
    description: G/L account number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the G/L account number. For this table, each
        row is a GL account entry. In a chart of accounts, each account number is
        typically unique. However, without more context, we can't be certain that
        this number is never reused across different charts of accounts or company
        codes.
  - name: is_balance_sheet_account
    description: Balance sheet account flag
    tests:
    - not_null
  - name: pl_account_type
    description: Profit and loss account type
    cocoon_meta:
      missing_acceptable: Not applicable for balance sheet accounts.
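
The uniqueness note on `gl_account_number` leaves open whether a number could repeat across charts of accounts. One way to check this is to compare duplicates on the single column with duplicates on a composite key. The sketch below assumes rows as dictionaries; `duplicate_keys` and the sample rows are hypothetical and not part of the generated model.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key tuples that occur in more than one row."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical rows: the same account number in two charts of accounts.
rows = [
    {"chart_of_accounts": "int", "gl_account_number": "113100"},
    {"chart_of_accounts": "caus", "gl_account_number": "113100"},
    {"chart_of_accounts": "int", "gl_account_number": "113101"},
]

single_column = duplicate_keys(rows, ["gl_account_number"])
composite = duplicate_keys(rows, ["chart_of_accounts", "gl_account_number"])
print(single_column)
print(composite)
```

If the single-column check reports duplicates while the composite check does not, a `unique` test on `gl_account_number` alone would fail even though the data is consistent.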

stg_sap_faglflexa_data (first 100 rows)

document_line_number ledger_number ryear activity_type transaction_currency document_type record_type version_number transaction_amount_local local_currency_amount group_currency_amount transaction_currency_amount material_ledger_amount amount_document_currency debit_credit_indicator posting_period reporting_currency gjahr line_item_number user_name row_id is_deleted client_code company_code controlling_area cost_center document_number functional_area gl_account_number posting_date posting_key reference_document_number transaction_timestamp
0 2 0l 2007 rfbu usd bkpf 0 1 -2949.00 -2949.00 -2286.05 -2949.00 0.0 -2949.00 h 6 usd 2006 2 steiner 3388016 False 800 3000 2000 None 200001076 None 113100 2006-06-01 50 100002655 2007-05-25 09:22:26
1 2 0l 2007 rfbu usd bkpf 0 1 -655.50 -655.50 -508.14 -655.50 0.0 -655.50 h 6 usd 2006 2 steiner 3388017 False 800 3000 2000 None 200001077 None 113100 2006-06-01 50 100002658 2007-05-25 09:22:28
2 2 0l 2007 rfbu usd bkpf 0 1 -1595.28 -1595.28 -1236.65 -1595.28 0.0 -1595.28 h 6 usd 2006 2 steiner 3388018 False 800 3000 2000 None 200001078 None 113100 2006-06-01 50 100002659 2007-05-25 09:22:28

stg_sap_faglflexa_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 04:40:40.227636+00:00
WITH 
"sap_faglflexa_data_projected" AS (
    -- Projection: Selecting 50 out of 51 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "docln",
        "docnr",
        "rbukrs",
        "rclnt",
        "rldnr",
        "ryear",
        "activ",
        "rmvct",
        "rtcur",
        "runit",
        "awtyp",
        "rrcty",
        "rvers",
        "logsys",
        "racct",
        "cost_elem",
        "rcntr",
        "prctr",
        "rfarea",
        "rbusa",
        "kokrs",
        "segment",
        "zzspreg",
        "scntr",
        "pprctr",
        "sfarea",
        "sbusa",
        "rassc",
        "psegment",
        "tsl",
        "hsl",
        "ksl",
        "osl",
        "msl",
        "wsl",
        "drcrk",
        "poper",
        "rwcur",
        "gjahr",
        "budat",
        "belnr",
        "buzei",
        "bschl",
        "bstat",
        "linetype",
        "xsplitmod",
        "usnam",
        "timestamp_",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_faglflexa_data"
),

"sap_faglflexa_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- docln -> document_line_number
    -- docnr -> reference_document_number
    -- rbukrs -> company_code
    -- rclnt -> client_code
    -- rldnr -> ledger_number
    -- activ -> activity_type
    -- rmvct -> movement_type
    -- rtcur -> transaction_currency
    -- runit -> unit_of_measure
    -- awtyp -> document_type
    -- rrcty -> record_type
    -- rvers -> version_number
    -- logsys -> logical_system
    -- racct -> gl_account_number
    -- cost_elem -> cost_element
    -- rcntr -> cost_center
    -- prctr -> profit_center
    -- rfarea -> functional_area
    -- kokrs -> controlling_area
    -- zzspreg -> special_region
    -- scntr -> sender_cost_center
    -- pprctr -> partner_profit_center
    -- sfarea -> sender_functional_area
    -- rassc -> asset_class
    -- psegment -> profit_segment
    -- tsl -> transaction_amount_local
    -- hsl -> local_currency_amount
    -- ksl -> group_currency_amount
    -- osl -> transaction_currency_amount
    -- msl -> material_ledger_amount
    -- wsl -> amount_document_currency
    -- drcrk -> debit_credit_indicator
    -- poper -> posting_period
    -- rwcur -> reporting_currency
    -- budat -> posting_date
    -- belnr -> document_number
    -- buzei -> line_item_number
    -- bschl -> posting_key
    -- bstat -> document_status
    -- linetype -> line_item_type
    -- xsplitmod -> split_modification
    -- usnam -> user_name
    -- timestamp_ -> transaction_timestamp
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "docln" AS "document_line_number",
        "docnr" AS "reference_document_number",
        "rbukrs" AS "company_code",
        "rclnt" AS "client_code",
        "rldnr" AS "ledger_number",
        "ryear",
        "activ" AS "activity_type",
        "rmvct" AS "movement_type",
        "rtcur" AS "transaction_currency",
        "runit" AS "unit_of_measure",
        "awtyp" AS "document_type",
        "rrcty" AS "record_type",
        "rvers" AS "version_number",
        "logsys" AS "logical_system",
        "racct" AS "gl_account_number",
        "cost_elem" AS "cost_element",
        "rcntr" AS "cost_center",
        "prctr" AS "profit_center",
        "rfarea" AS "functional_area",
        "rbusa",
        "kokrs" AS "controlling_area",
        "segment",
        "zzspreg" AS "special_region",
        "scntr" AS "sender_cost_center",
        "pprctr" AS "partner_profit_center",
        "sfarea" AS "sender_functional_area",
        "sbusa",
        "rassc" AS "asset_class",
        "psegment" AS "profit_segment",
        "tsl" AS "transaction_amount_local",
        "hsl" AS "local_currency_amount",
        "ksl" AS "group_currency_amount",
        "osl" AS "transaction_currency_amount",
        "msl" AS "material_ledger_amount",
        "wsl" AS "amount_document_currency",
        "drcrk" AS "debit_credit_indicator",
        "poper" AS "posting_period",
        "rwcur" AS "reporting_currency",
        "gjahr",
        "budat" AS "posting_date",
        "belnr" AS "document_number",
        "buzei" AS "line_item_number",
        "bschl" AS "posting_key",
        "bstat" AS "document_status",
        "linetype" AS "line_item_type",
        "xsplitmod" AS "split_modification",
        "usnam" AS "user_name",
        "timestamp_" AS "transaction_timestamp",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_faglflexa_data_projected"
),

"sap_faglflexa_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values:
    -- ledger_number: '0l' was flagged as a likely typo for '01'. In SAP, however, '0L' is the standard leading ledger, and the value appears lowercased here like the other codes in this extract.
    -- activity_type: 'rfbu' is the only value present and was flagged as an unclear acronym. It corresponds to the SAP business transaction RFBU (FI postings), so it is a valid code.
    -- debit_credit_indicator: 'h' was flagged as non-standard. SAP uses 'S' (Soll, debit) and 'H' (Haben, credit) rather than 'D' and 'C', so 'h' marks a credit line.
    -- Note: the tripled quotes in the CASE expressions below produce literals that include the quote characters themselves, so none of the conditions match and all three columns pass through unchanged, as the sample rows show.
    SELECT
        "document_line_number",
        "reference_document_number",
        "company_code",
        "client_code",
        CASE
            WHEN "ledger_number" = '''0l''' THEN '''01'''
            ELSE "ledger_number"
        END AS "ledger_number",
        "ryear",
        CASE
            WHEN "activity_type" = '''rfbu''' THEN ''''
            ELSE "activity_type"
        END AS "activity_type",
        "movement_type",
        "transaction_currency",
        "unit_of_measure",
        "document_type",
        "record_type",
        "version_number",
        "logical_system",
        "gl_account_number",
        "cost_element",
        "cost_center",
        "profit_center",
        "functional_area",
        "rbusa",
        "controlling_area",
        "segment",
        "special_region",
        "sender_cost_center",
        "partner_profit_center",
        "sender_functional_area",
        "sbusa",
        "asset_class",
        "profit_segment",
        "transaction_amount_local",
        "local_currency_amount",
        "group_currency_amount",
        "transaction_currency_amount",
        "material_ledger_amount",
        "amount_document_currency",
        CASE
            WHEN "debit_credit_indicator" = '''h''' THEN ''''
            ELSE "debit_credit_indicator"
        END AS "debit_credit_indicator",
        "posting_period",
        "reporting_currency",
        "gjahr",
        "posting_date",
        "document_number",
        "line_item_number",
        "posting_key",
        "document_status",
        "line_item_type",
        "split_modification",
        "user_name",
        "transaction_timestamp",
        "row_id",
        "is_deleted"
    FROM "sap_faglflexa_data_projected_renamed"
),

"sap_faglflexa_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- asset_class: from DECIMAL to VARCHAR
    -- client_code: from INT to VARCHAR
    -- company_code: from INT to VARCHAR
    -- controlling_area: from INT to VARCHAR
    -- cost_center: from DECIMAL to VARCHAR
    -- cost_element: from DECIMAL to VARCHAR
    -- document_number: from INT to VARCHAR
    -- document_status: from DECIMAL to VARCHAR
    -- functional_area: from DECIMAL to VARCHAR
    -- gl_account_number: from INT to VARCHAR
    -- line_item_type: from DECIMAL to VARCHAR
    -- logical_system: from DECIMAL to VARCHAR
    -- movement_type: from DECIMAL to VARCHAR
    -- partner_profit_center: from DECIMAL to VARCHAR
    -- posting_date: from INT to DATE
    -- posting_key: from INT to VARCHAR
    -- profit_center: from DECIMAL to VARCHAR
    -- profit_segment: from DECIMAL to VARCHAR
    -- rbusa: from DECIMAL to VARCHAR
    -- reference_document_number: from INT to VARCHAR
    -- sbusa: from DECIMAL to VARCHAR
    -- segment: from DECIMAL to VARCHAR
    -- sender_cost_center: from DECIMAL to VARCHAR
    -- sender_functional_area: from DECIMAL to VARCHAR
    -- special_region: from DECIMAL to VARCHAR
    -- split_modification: from DECIMAL to VARCHAR
    -- transaction_timestamp: from INT to TIMESTAMP
    -- unit_of_measure: from DECIMAL to VARCHAR
    SELECT
        "document_line_number",
        "ledger_number",
        "ryear",
        "activity_type",
        "transaction_currency",
        "document_type",
        "record_type",
        "version_number",
        "transaction_amount_local",
        "local_currency_amount",
        "group_currency_amount",
        "transaction_currency_amount",
        "material_ledger_amount",
        "amount_document_currency",
        "debit_credit_indicator",
        "posting_period",
        "reporting_currency",
        "gjahr",
        "line_item_number",
        "user_name",
        "row_id",
        "is_deleted",
        CAST("asset_class" AS VARCHAR) AS "asset_class",
        CAST("client_code" AS VARCHAR) AS "client_code",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST("controlling_area" AS VARCHAR) AS "controlling_area",
        CAST("cost_center" AS VARCHAR) AS "cost_center",
        CAST("cost_element" AS VARCHAR) AS "cost_element",
        CAST("document_number" AS VARCHAR) AS "document_number",
        CAST("document_status" AS VARCHAR) AS "document_status",
        CAST("functional_area" AS VARCHAR) AS "functional_area",
        CAST("gl_account_number" AS VARCHAR) AS "gl_account_number",
        CAST("line_item_type" AS VARCHAR) AS "line_item_type",
        CAST("logical_system" AS VARCHAR) AS "logical_system",
        CAST("movement_type" AS VARCHAR) AS "movement_type",
        CAST("partner_profit_center" AS VARCHAR) AS "partner_profit_center",
        strptime(CAST("posting_date" AS VARCHAR), '%Y%m%d') AS "posting_date",
        CAST("posting_key" AS VARCHAR) AS "posting_key",
        CAST("profit_center" AS VARCHAR) AS "profit_center",
        CAST("profit_segment" AS VARCHAR) AS "profit_segment",
        CAST("rbusa" AS VARCHAR) AS "rbusa",
        CAST("reference_document_number" AS VARCHAR) AS "reference_document_number",
        CAST("sbusa" AS VARCHAR) AS "sbusa",
        CAST("segment" AS VARCHAR) AS "segment",
        CAST("sender_cost_center" AS VARCHAR) AS "sender_cost_center",
        CAST("sender_functional_area" AS VARCHAR) AS "sender_functional_area",
        CAST("special_region" AS VARCHAR) AS "special_region",
        CAST("split_modification" AS VARCHAR) AS "split_modification",
        strptime(CAST("transaction_timestamp" AS VARCHAR), '%Y%m%d%H%M%S') AS "transaction_timestamp",
        CAST("unit_of_measure" AS VARCHAR) AS "unit_of_measure"
    FROM "sap_faglflexa_data_projected_renamed_cleaned"
),

"sap_faglflexa_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 17 columns with unacceptable missing values
    -- asset_class has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cost_element has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- document_status has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- line_item_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- logical_system has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- movement_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_profit_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- profit_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- profit_segment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- rbusa has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sbusa has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- segment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sender_cost_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sender_functional_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- special_region has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- split_modification has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- unit_of_measure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "document_line_number",
        "ledger_number",
        "ryear",
        "activity_type",
        "transaction_currency",
        "document_type",
        "record_type",
        "version_number",
        "transaction_amount_local",
        "local_currency_amount",
        "group_currency_amount",
        "transaction_currency_amount",
        "material_ledger_amount",
        "amount_document_currency",
        "debit_credit_indicator",
        "posting_period",
        "reporting_currency",
        "gjahr",
        "line_item_number",
        "user_name",
        "row_id",
        "is_deleted",
        "client_code",
        "company_code",
        "controlling_area",
        "cost_center",
        "document_number",
        "functional_area",
        "gl_account_number",
        "posting_date",
        "posting_key",
        "reference_document_number",
        "transaction_timestamp"
    FROM "sap_faglflexa_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_faglflexa_data_projected_renamed_cleaned_casted_missing_handled"
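
The CASE expressions in the cleaning step compare against literals written with tripled single quotes, such as `'''0l'''`. In standard SQL a doubled quote inside a string is an escaped quote, so that literal is the four-character string `'0l'` with the quotes included. The sketch below illustrates the effect using Python's built-in `sqlite3` as a stand-in for DuckDB (an assumption: both follow the standard quote-doubling rule); the table `t` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ledger_number TEXT)")
conn.execute("INSERT INTO t VALUES ('0l')")

# Tripled quotes: the literal is the four-character string '0l', quotes included,
# so the comparison with the stored value 0l is false and the ELSE branch is used.
tripled = conn.execute(
    "SELECT CASE WHEN ledger_number = '''0l''' THEN 'matched' "
    "ELSE ledger_number END FROM t"
).fetchone()[0]

# Single quotes: the literal is the two-character string 0l, which matches.
single = conn.execute(
    "SELECT CASE WHEN ledger_number = '0l' THEN 'matched' "
    "ELSE ledger_number END FROM t"
).fetchone()[0]

literal_length = conn.execute("SELECT length('''0l''')").fetchone()[0]
print(tripled, single, literal_length)
```

This is consistent with the sample rows of `stg_sap_faglflexa_data`, where `0l`, `rfbu`, and `h` still appear after the cleaning step.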

stg_sap_faglflexa_data.yml (Document the table)

version: 2
models:
- name: stg_sap_faglflexa_data
  description: The table is about financial transactions in an SAP system. It includes
    details like document number, company code, fiscal year, account, amounts in different
    currencies, posting date, and line item. Each row represents an individual accounting
    entry with financial values and associated metadata. The table captures various
    aspects of financial postings across different organizational units and time periods.
  columns:
  - name: document_line_number
    description: Document line number
    tests:
    - not_null
  - name: ledger_number
    description: Ledger number
    tests:
    - not_null
  - name: ryear
    description: Fiscal year of the ledger posting
    tests:
    - not_null
  - name: activity_type
    description: Activity type or category
    tests:
    - not_null
  - name: transaction_currency
    description: Transaction currency
    tests:
    - not_null
    - accepted_values:
        values:
        - usd
        - eur
        - gbp
        - jpy
        - cny
        - chf
        - cad
        - aud
        - hkd
        - sgd
        - krw
        - inr
        - mxn
        - brl
        - rub
        - zar
        - nzd
        - sek
        - nok
        - dkk
  - name: document_type
    description: Document type
    tests:
    - not_null
    - accepted_values:
        values:
        - AB
        - AF
        - AN
        - AZ
        - BA
        - BB
        - BK
        - DA
        - DG
        - DZ
        - EF
        - KA
        - KG
        - KN
        - KR
        - KZ
        - PR
        - SA
        - SK
        - SU
        - WA
        - WE
        - WL
        - bkpf
  - name: record_type
    description: Record type
    tests:
    - not_null
  - name: version_number
    description: Version number
    tests:
    - not_null
  - name: transaction_amount_local
    description: Total transaction amount in local currency
    tests:
    - not_null
  - name: local_currency_amount
    description: Amount in local currency
    tests:
    - not_null
  - name: group_currency_amount
    description: Amount in group currency
    tests:
    - not_null
  - name: transaction_currency_amount
    description: Amount in transaction currency
    tests:
    - not_null
  - name: material_ledger_amount
    description: Amount in material ledger currency
    tests:
    - not_null
  - name: amount_document_currency
    description: Amount in document currency
    tests:
    - not_null
  - name: debit_credit_indicator
    description: Debit/Credit indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - s
        - h
  - name: posting_period
    description: Posting period
    tests:
    - not_null
  - name: reporting_currency
    description: Currency for reporting
    tests:
    - not_null
    - accepted_values:
        values:
        - usd
        - eur
        - gbp
        - jpy
        - cny
        - chf
        - cad
        - aud
        - hkd
        - sgd
        - inr
        - krw
        - mxn
        - brl
        - zar
        - rub
        - sek
        - nok
        - nzd
        - try
  - name: gjahr
    description: Fiscal year of the accounting document
    tests:
    - not_null
  - name: line_item_number
    description: Line item number
    tests:
    - not_null
  - name: user_name
    description: User name
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a unique identifier for each row in the
        table. For this table, each row represents a distinct financial transaction
        or accounting entry. row_id is likely to be unique across rows.
  - name: is_deleted
    description: Indicates if the row was deleted
    tests:
    - not_null
  - name: client_code
    description: Client
    tests:
    - not_null
  - name: company_code
    description: Company code
    tests:
    - not_null
  - name: controlling_area
    description: Controlling area
    tests:
    - not_null
  - name: cost_center
    description: Receiving/sending cost center
    cocoon_meta:
      missing_acceptable: May not be applicable for certain transaction types.
  - name: document_number
    description: Accounting document number
    tests:
    - not_null
  - name: functional_area
    description: Functional area
    cocoon_meta:
      missing_acceptable: Might not be relevant for all financial activities.
  - name: gl_account_number
    description: G/L account number
    tests:
    - not_null
  - name: posting_date
    description: Posting date
    tests:
    - not_null
  - name: posting_key
    description: Posting key
    tests:
    - not_null
  - name: reference_document_number
    description: Document number
    tests:
    - not_null
  - name: transaction_timestamp
    description: Timestamp of transaction
    tests:
    - not_null
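
The `accepted_values` lists above mix upper-case codes (for example `SA`) with lower-case ones (`bkpf`), while the sample rows hold lower-case values throughout. The sketch below shows what an exact-match membership check would report; it assumes case-sensitive comparison and that nulls are ignored, which may differ by warehouse collation. `unexpected_values` is a hypothetical helper, not dbt's implementation.

```python
def unexpected_values(values, accepted):
    """Return distinct non-null values that are not in `accepted` (exact match)."""
    accepted_set = set(accepted)
    return sorted({v for v in values if v is not None and v not in accepted_set})


# document_type values as they appear in the sample rows.
observed_document_type = ["bkpf", "bkpf", "bkpf"]
# A shortened version of the accepted list from the YAML above.
accepted_document_type = ["SA", "KR", "bkpf"]

passing = unexpected_values(observed_document_type, accepted_document_type)
# Hypothetical case: a lower-case 'sa' checked against the upper-case 'SA'.
failing = unexpected_values(["sa"], accepted_document_type)
print(passing)
print(failing)
```

Under these assumptions a lower-case `sa` would be reported as unexpected, so the case of the listed values should match the case of the staged data.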

stg_sap_pa0000_data (first 100 rows)

employee_id sequence_number username row_id is_deleted action_type client_code end_date last_change_date lock_indicator start_date status_2 status_3
0 10 0 bobsponge 1 False 1 800 9999-12-31 2003-05-07 None 2002-01-01 3 1
1 69 0 wardsquid 2 False 1 800 9999-12-31 2003-09-17 None 2003-01-01 3 1
2 70 0 starpatrick 3 False 52 800 9999-12-31 2003-09-17 None 2003-01-01 3 1

stg_sap_pa0000_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 05:10:27.159864+00:00
WITH 
"sap_pa0000_data_projected" AS (
    -- Projection: Selecting 30 out of 31 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "begda",
        "endda",
        "mandt",
        "objps",
        "pernr",
        "seqnr",
        "sprps",
        "subty",
        "aedtm",
        "uname",
        "histo",
        "itxex",
        "refex",
        "ordex",
        "itbld",
        "preas",
        "flag1",
        "flag2",
        "flag3",
        "flag4",
        "rese1",
        "rese2",
        "grpvl",
        "massn",
        "massg",
        "stat1",
        "stat2",
        "stat3",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_pa0000_data"
),

"sap_pa0000_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- begda -> start_date
    -- endda -> end_date
    -- mandt -> client_code
    -- objps -> object_spec
    -- pernr -> employee_id
    -- seqnr -> sequence_number
    -- sprps -> lock_indicator
    -- subty -> record_subtype
    -- aedtm -> last_change_date
    -- uname -> username
    -- histo -> is_historical
    -- itxex -> external_system_id
    -- refex -> external_reference
    -- ordex -> execution_order
    -- itbld -> it_location
    -- preas -> process_reason
    -- flag1 -> custom_flag_1
    -- flag2 -> custom_flag_2
    -- flag3 -> custom_flag_3
    -- flag4 -> custom_flag_4
    -- rese1 -> reserved_1
    -- rese2 -> reserved_2
    -- grpvl -> group_value
    -- massn -> action_type
    -- massg -> measure_group
    -- stat1 -> status_1
    -- stat2 -> status_2
    -- stat3 -> status_3
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "begda" AS "start_date",
        "endda" AS "end_date",
        "mandt" AS "client_code",
        "objps" AS "object_spec",
        "pernr" AS "employee_id",
        "seqnr" AS "sequence_number",
        "sprps" AS "lock_indicator",
        "subty" AS "record_subtype",
        "aedtm" AS "last_change_date",
        "uname" AS "username",
        "histo" AS "is_historical",
        "itxex" AS "external_system_id",
        "refex" AS "external_reference",
        "ordex" AS "execution_order",
        "itbld" AS "it_location",
        "preas" AS "process_reason",
        "flag1" AS "custom_flag_1",
        "flag2" AS "custom_flag_2",
        "flag3" AS "custom_flag_3",
        "flag4" AS "custom_flag_4",
        "rese1" AS "reserved_1",
        "rese2" AS "reserved_2",
        "grpvl" AS "group_value",
        "massn" AS "action_type",
        "massg" AS "measure_group",
        "stat1" AS "status_1",
        "stat2" AS "status_2",
        "stat3" AS "status_3",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_pa0000_data_projected"
),

"sap_pa0000_data_projected_renamed_casted" AS (
    -- Column Type Casting: 
    -- action_type: from INT to VARCHAR
    -- client_code: from INT to VARCHAR
    -- custom_flag_1: from DECIMAL to VARCHAR
    -- custom_flag_2: from DECIMAL to VARCHAR
    -- custom_flag_3: from DECIMAL to VARCHAR
    -- custom_flag_4: from DECIMAL to VARCHAR
    -- end_date: from INT to DATE
    -- execution_order: from DECIMAL to VARCHAR
    -- external_reference: from DECIMAL to VARCHAR
    -- external_system_id: from DECIMAL to VARCHAR
    -- group_value: from DECIMAL to VARCHAR
    -- is_historical: from DECIMAL to VARCHAR
    -- it_location: from DECIMAL to VARCHAR
    -- last_change_date: from INT to DATE
    -- lock_indicator: from DECIMAL to VARCHAR
    -- measure_group: from DECIMAL to VARCHAR
    -- object_spec: from DECIMAL to VARCHAR
    -- process_reason: from DECIMAL to VARCHAR
    -- record_subtype: from DECIMAL to VARCHAR
    -- reserved_1: from DECIMAL to VARCHAR
    -- reserved_2: from DECIMAL to VARCHAR
    -- start_date: from INT to DATE
    -- status_1: from DECIMAL to VARCHAR
    -- status_2: from INT to VARCHAR
    -- status_3: from INT to VARCHAR
    SELECT
        "employee_id",
        "sequence_number",
        "username",
        "row_id",
        "is_deleted",
        CAST("action_type" AS VARCHAR) AS "action_type",
        CAST("client_code" AS VARCHAR) AS "client_code",
        CAST("custom_flag_1" AS VARCHAR) AS "custom_flag_1",
        CAST("custom_flag_2" AS VARCHAR) AS "custom_flag_2",
        CAST("custom_flag_3" AS VARCHAR) AS "custom_flag_3",
        CAST("custom_flag_4" AS VARCHAR) AS "custom_flag_4",
        strptime(CAST("end_date" AS VARCHAR), '%Y%m%d') AS "end_date",
        CAST("execution_order" AS VARCHAR) AS "execution_order",
        CAST("external_reference" AS VARCHAR) AS "external_reference",
        CAST("external_system_id" AS VARCHAR) AS "external_system_id",
        CAST("group_value" AS VARCHAR) AS "group_value",
        CAST("is_historical" AS VARCHAR) AS "is_historical",
        CAST("it_location" AS VARCHAR) AS "it_location",
        strptime(CAST("last_change_date" AS VARCHAR), '%Y%m%d') AS "last_change_date",
        CAST("lock_indicator" AS VARCHAR) AS "lock_indicator",
        CAST("measure_group" AS VARCHAR) AS "measure_group",
        CAST("object_spec" AS VARCHAR) AS "object_spec",
        CAST("process_reason" AS VARCHAR) AS "process_reason",
        CAST("record_subtype" AS VARCHAR) AS "record_subtype",
        CAST("reserved_1" AS VARCHAR) AS "reserved_1",
        CAST("reserved_2" AS VARCHAR) AS "reserved_2",
        strptime(CAST("start_date" AS VARCHAR), '%Y%m%d') AS "start_date",
        CAST("status_1" AS VARCHAR) AS "status_1",
        CAST("status_2" AS VARCHAR) AS "status_2",
        CAST("status_3" AS VARCHAR) AS "status_3"
    FROM "sap_pa0000_data_projected_renamed"
),

"sap_pa0000_data_projected_renamed_casted_missing_handled" AS (
    -- Handling missing values: There are 17 columns with unacceptable missing values
    -- custom_flag_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- custom_flag_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- execution_order has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- external_reference has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- external_system_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_value has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_historical has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- it_location has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- measure_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- object_spec has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- process_reason has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- record_subtype has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- status_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "employee_id",
        "sequence_number",
        "username",
        "row_id",
        "is_deleted",
        "action_type",
        "client_code",
        "end_date",
        "last_change_date",
        "lock_indicator",
        "start_date",
        "status_2",
        "status_3"
    FROM "sap_pa0000_data_projected_renamed_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_pa0000_data_projected_renamed_casted_missing_handled"
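
The missing-value step above can be sketched outside SQL. This is a minimal Python illustration, not Cocoon's implementation: the function names, the acceptable_missing parameter, and the sample rows are assumptions made for this example. It drops every column that is empty in all rows unless the column is marked as acceptable, which would explain why lock_indicator is kept in the final SELECT while status_1 is not.

```python
from typing import Any

Row = dict[str, Any]


def columns_to_drop(rows: list[Row], acceptable_missing: set[str]) -> list[str]:
    """Return columns that are None in every row and not marked acceptable."""
    if not rows:
        return []
    return [
        col
        for col in rows[0]
        if col not in acceptable_missing
        and all(row.get(col) is None for row in rows)
    ]


def drop_columns(rows: list[Row], to_drop: list[str]) -> list[Row]:
    """Return copies of the rows without the given columns."""
    dropped = set(to_drop)
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


# Hypothetical sample rows; only a few of the model's columns are shown.
sample = [
    {"employee_id": 1001, "status_1": None, "status_2": "3", "lock_indicator": None},
    {"employee_id": 1002, "status_1": None, "status_2": "3", "lock_indicator": None},
]

to_drop = columns_to_drop(sample, acceptable_missing={"lock_indicator"})
cleaned = drop_columns(sample, to_drop)
print(to_drop)
print(list(cleaned[0]))
```

Keeping the allow-list as an explicit argument makes the decision visible: a column that is empty for a known reason stays in the model, and everything else that is fully empty is removed.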

stg_sap_pa0000_data.yml (Document the table)

version: 2
models:
- name: stg_sap_pa0000_data
  description: Employee records from the SAP table PA0000. Each row holds one employee
    record with a personnel number, a validity period (start date and end date),
    an action type, and status codes. The table also includes audit fields such as
    the last change date and the username of the person who changed the record. It
    appears to track changes in employee status over time.
  columns:
  - name: employee_id
    description: Personnel number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the personnel number for each employee. For
        this table, each row represents an employee record. The employee_id appears
        to be unique across rows, as it's a common practice to assign unique identifiers
        to employees.
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: username
    description: Username of the person who created or modified the record
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is described as a unique identifier for the row. For
        this table, each row is an employee record. By definition, if it's a unique
        identifier for the row, it would be unique across all rows.
  - name: is_deleted
    description: Indicates if the record was deleted
    tests:
    - not_null
  - name: action_type
    description: Action or measure type
    tests:
    - not_null
  - name: client_code
    description: Client or company code
    tests:
    - not_null
  - name: end_date
    description: End date of the record
    tests:
    - not_null
  - name: last_change_date
    description: Date of last change
    tests:
    - not_null
  - name: lock_indicator
    description: Lock indicator
    cocoon_meta:
      missing_acceptable: No locks applied to these records.
  - name: start_date
    description: Start date of the record
    tests:
    - not_null
  - name: status_2
    description: Status 2
    tests:
    - not_null
    - accepted_values:
        values:
        - '1'
        - '2'
        - '3'
        - '4'
        - '5'
  - name: status_3
    description: Status 3
    tests:
    - not_null
    - accepted_values:
        values:
        - '1'
        - '2'
        - '3'
        - '4'
        - '5'
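
The tests declared above (not_null, unique, accepted_values) can be approximated on a handful of rows before running dbt. The sketch below is an assumption-laden illustration in plain Python: the helper names and the sample rows are hypothetical, and skipping missing values in the unique and accepted_values checks is a choice of this sketch that follows SQL NULL semantics, not a statement about how any tool implements them.

```python
from collections import Counter
from typing import Any

Row = dict[str, Any]


def not_null_failures(rows: list[Row], column: str) -> int:
    """Count rows where the column is missing (None)."""
    return sum(1 for row in rows if row.get(column) is None)


def unique_failures(rows: list[Row], column: str) -> list[Any]:
    """Return non-null values that occur more than once, in first-seen order."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return [value for value, n in counts.items() if n > 1]


def accepted_values_failures(
    rows: list[Row], column: str, accepted: list[str]
) -> list[Any]:
    """Return distinct non-null values outside the accepted list."""
    allowed = set(accepted)
    bad = {
        row[column]
        for row in rows
        if row.get(column) is not None and row[column] not in allowed
    }
    return sorted(bad, key=str)


# Hypothetical rows: employee 1001 appears twice, as it would if the table
# kept one row per status change.
sample = [
    {"employee_id": 1001, "row_id": 1, "status_2": "3"},
    {"employee_id": 1001, "row_id": 2, "status_2": "0"},
    {"employee_id": 1002, "row_id": 3, "status_2": None},
]

print(unique_failures(sample, "employee_id"))
print(unique_failures(sample, "row_id"))
print(not_null_failures(sample, "status_2"))
print(accepted_values_failures(sample, "status_2", ["1", "2", "3", "4", "5"]))
```

Two points are worth checking against real data. The accepted values are quoted strings because status_2 and status_3 are cast to VARCHAR, so an integer 3 would not match '3'. And if the table does hold several rows per employee, the unique test on employee_id would fail while the one on row_id would still pass; this section does not show enough rows to tell.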

stg_sap_bseg_data (first 100 rows)

buzei fiscal_year clearing_date clearing_fiscal_period posting_key koart debit_credit_indicator total_local_amount wrbtr document_currency_amount transaction_amount transaction_currency transaction_currency_tax_base tax_base_amount foreign_tax_amount tax_amount home_currency_amount functional_area_amount home_currency_assignment_amount functional_area_assignment_amount tax_group accounting_value_3 kursr gbetr valuation_difference valuation_difference_2 value_date line_item_text transaction_type planned_price factory_calendar_date position_number delivery_date asset_value_date personnel_number document_reversal_indicator payment_indicator gvtyp due_date_baseline cash_discount_days_1 cash_discount_days_2 net_payment_terms_days cash_discount_percent_1 cash_discount_percent_2 skfbt transaction_currency_amount cash_discount_amount net_amount local_amount_1 withholding_tax_amount_1 local_amount_2 withholding_tax_amount_2 local_amount_3 withholding_tax_amount_3 reference_fiscal_year reference_line_item customs_amount sample_number depreciation_period insurance_date discount_base_year discount_base_period asset_acquisition_year apc_area document_amount_local balance_carryforward klibt accounting_value_1 accounting_value_2 foreign_non_deductible_tax_base non_deductible_input_tax menge erfmg bpmng purchase_order_item_number account_assignment_sequence peinh reference_amount reference_exchange_rate investment_support_amount bualt net_price accounting_value_4 difference_value_3 difference_value_1 days_in_arrears option_selection order_item_number asset_sequence_number project_key profitability_segment_number profitability_subsegment_number dmbe2 group_currency_amount dmb21 dmb22 dmb23 group_currency_amount_1 group_currency_amount_2 group_currency_amount_3 local_tax_amount document_tax_amount local_non_deductible_tax_base document_non_deductible_tax_base second_local_currency_amount third_local_currency_amount valuation_difference_3 difference_value_2 second_local_currency_tax_base third_local_currency_tax_base kblpos local_currency_tax_amount original_line_item_number reference_line_indicator tax_amount_1 tax_amount_2 tax_amount_3 tax_amount_4 write_off_amount tax_reporting_date settlement_period payment_amount price_difference price_difference_2 price_difference_3 penlc1 penlc2 penlc3 foreign_currency_amount number_of_days sales_use_tax_amount clearing_fiscal_year tax_splitting original_fiscal_year original_document_line_number original_line_item_sequence production_period row_id is_deleted alternative_account asset_acquisition_period asset_subnumber asset_transaction_type asset_valuation_type belnr bill_of_exchange_procedure billing_block billing_type business_area cession_indicator clearing_document_number clearing_reversal_indicator clearing_with_down_payment client_id company_code contract_number contract_type controlling_area delivery_block delivery_completed dunning_level fast_pay_indicator financial_position fixed_payment_terms_indicator foreign_currency_valuation_type funds_center funds_center_description funds_reservation_number gl_account gr_ir_clearing_number grir_clearing_reversal_indicator insurance_indicator insurance_reason_code kostl line_item_reference line_number_range lokkt main_asset_number mandate_id manual_stats_update material_document_date network_activity_number order_number original_document_number payment_block payment_currency payment_method payment_method_supplement payment_processing_indicator payment_provider_transaction_id payment_service_provider payment_term payment_terms payment_terms_key prctr tax_code_1 tax_code_2 tax_code_3 tax_code_change_indicator tax_exempt_indicator tax_indicator tax_jurisdiction_code tax_posting_reversal_indicator vat_registration_number vat_tax_code zuonr
0 1 2006 0 0 50 s h 297.0 297.0 0.0 297.0 eur 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0 soll-buchung rfbu 0.0 0 0 0 0 0 x x x 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 0 0.0 0 0 0 0 0 297.0 368.70 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0 0 0 0 0 0 0.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0 0 0 0 0 0 495617 False 320700 None None None None 100000795 None None None 9900 None None None None 800 3000 None None 1000 None None None None 1431 None None 980 980 0 483000 None None None None 2300 0 0 483000 None None 0 0 None None None None None None None None None None None None None 1400 None None None None None None None None None None 2300
1 1 2006 0 0 50 s h 1001.0 1001.0 0.0 1001.0 eur 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0 soll-buchung rfbu 0.0 0 0 0 0 0 x x x 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 0 0.0 0 0 0 0 0 1001.0 1242.64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0 0 0 0 0 0 0.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0 0 0 0 0 0 495618 False 320700 None None None None 100000798 None None None 4000 None None None None 800 3000 None None 1000 None None None None 1431 None None 980 980 0 483000 None None None None 3120 0 0 483000 None None 0 0 None None None None None None None None None None None None None 1100 None None None None None None None None None None 3120
2 1 2006 0 0 50 s h 13.0 13.0 0.0 13.0 eur 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0 soll-buchung rfbu 0.0 0 0 0 0 0 x x x 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 0 0.0 0 0 0 0 0 13.0 16.14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0 0 0 0 0 0 0.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0.0 0 0 0 0 0 0 495619 False 320700 None None None None 100000806 None None None 9900 None None None None 800 3000 None None 1000 None None None None 1431 None None 980 980 0 483000 None None None None 4110 0 0 483000 None None 0 0 None None None None None None None None None None None None None 1400 None None None None None None None None None None 4110
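
In the preview above, date-like columns such as clearing_date and value_date hold 0 where no date is set, while populated SAP dates follow the YYYYMMDD layout used elsewhere in this report (for example 20060425). A strict parse with the '%Y%m%d' format would reject the zero value, so a tolerant parser may be useful when these columns are converted. The sketch below is an illustration only: parse_sap_date is a hypothetical helper, and treating an all-zero value as "no date" is an assumption based on the preview rows.

```python
from datetime import date, datetime
from typing import Optional, Union


def parse_sap_date(value: Union[int, str, None]) -> Optional[date]:
    """Parse a YYYYMMDD value; return None for missing or all-zero input."""
    if value is None:
        return None
    text = str(value).strip()
    # Assumption: 0 or 00000000 means that no date was recorded.
    if text == "" or set(text) == {"0"}:
        return None
    return datetime.strptime(text, "%Y%m%d").date()


print(parse_sap_date(20060425))
print(parse_sap_date(0))
print(parse_sap_date("00000000"))
```

Values that are neither empty nor a valid YYYYMMDD date still raise ValueError, so malformed input is not silently turned into a missing date.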

stg_sap_bseg_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 04:38:06.086647+00:00
WITH 
"sap_bseg_data_projected" AS (
    -- Projection: Selecting 349 out of 350 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "belnr",
        "bukrs",
        "buzei",
        "gjahr",
        "mandt",
        "buzid",
        "augdt",
        "augcp",
        "augbl",
        "bschl",
        "koart",
        "umskz",
        "umsks",
        "zumsk",
        "shkzg",
        "gsber",
        "pargb",
        "mwskz",
        "qsskz",
        "dmbtr",
        "wrbtr",
        "kzbtr",
        "pswbt",
        "pswsl",
        "txbhw",
        "txbfw",
        "mwsts",
        "wmwst",
        "hwbas",
        "fwbas",
        "hwzuz",
        "fwzuz",
        "shzuz",
        "stekz",
        "mwart",
        "txgrp",
        "ktosl",
        "qsshb",
        "kursr",
        "gbetr",
        "bdiff",
        "bdif2",
        "valut",
        "zuonr",
        "sgtxt",
        "zinkz",
        "vbund",
        "bewar",
        "altkt",
        "vorgn",
        "fdlev",
        "fdgrp",
        "fdwbt",
        "fdtag",
        "fkont",
        "kokrs",
        "kostl",
        "projn",
        "aufnr",
        "vbeln",
        "vbel2",
        "posn2",
        "eten2",
        "anln1",
        "anln2",
        "anbwa",
        "bzdat",
        "pernr",
        "xumsw",
        "xhres",
        "xkres",
        "xopvw",
        "xcpdd",
        "xskst",
        "xsauf",
        "xspro",
        "xserg",
        "xfakt",
        "xuman",
        "xanet",
        "xskrl",
        "xinve",
        "xpanz",
        "xauto",
        "xncop",
        "xzahl",
        "saknr",
        "hkont",
        "kunnr",
        "lifnr",
        "filkd",
        "xbilk",
        "gvtyp",
        "hzuon",
        "zfbdt",
        "zterm",
        "zbd1t",
        "zbd2t",
        "zbd3t",
        "zbd1p",
        "zbd2p",
        "skfbt",
        "sknto",
        "wskto",
        "zlsch",
        "zlspr",
        "zbfix",
        "hbkid",
        "bvtyp",
        "nebtr",
        "mwsk1",
        "dmbt1",
        "wrbt1",
        "mwsk2",
        "dmbt2",
        "wrbt2",
        "mwsk3",
        "dmbt3",
        "wrbt3",
        "rebzg",
        "rebzj",
        "rebzz",
        "rebzt",
        "zollt",
        "zolld",
        "lzbkz",
        "landl",
        "diekz",
        "samnr",
        "abper",
        "vrskz",
        "vrsdt",
        "disbn",
        "disbj",
        "disbz",
        "wverw",
        "anfbn",
        "anfbj",
        "anfbu",
        "anfae",
        "blnbt",
        "blnkz",
        "blnpz",
        "mschl",
        "mansp",
        "madat",
        "manst",
        "maber",
        "esrnr",
        "esrre",
        "esrpz",
        "klibt",
        "qsznr",
        "qbshb",
        "qsfbt",
        "navhw",
        "navfw",
        "matnr",
        "werks",
        "menge",
        "meins",
        "erfmg",
        "erfme",
        "bpmng",
        "bprme",
        "ebeln",
        "ebelp",
        "zekkn",
        "elikz",
        "vprsv",
        "peinh",
        "bwkey",
        "bwtar",
        "bustw",
        "rewrt",
        "rewwr",
        "bonfb",
        "bualt",
        "psalt",
        "nprei",
        "tbtkz",
        "spgrp",
        "spgrm",
        "spgrt",
        "spgrg",
        "spgrv",
        "spgrq",
        "stceg",
        "egbld",
        "eglld",
        "rstgr",
        "ryacq",
        "rpacq",
        "rdiff",
        "rdif2",
        "prctr",
        "xhkom",
        "vname",
        "recid",
        "egrup",
        "vptnr",
        "vertt",
        "vertn",
        "vbewa",
        "depot",
        "txjcd",
        "imkey",
        "dabrz",
        "popts",
        "fipos",
        "kstrg",
        "nplnr",
        "aufpl",
        "aplzl",
        "projk",
        "paobjnr",
        "pasubnr",
        "spgrs",
        "spgrc",
        "btype",
        "etype",
        "xegdr",
        "lnran",
        "hrkft",
        "dmbe2",
        "dmbe3",
        "dmb21",
        "dmb22",
        "dmb23",
        "dmb31",
        "dmb32",
        "dmb33",
        "mwst2",
        "mwst3",
        "navh2",
        "navh3",
        "sknt2",
        "sknt3",
        "bdif3",
        "rdif3",
        "hwmet",
        "glupm",
        "xragl",
        "uzawe",
        "lokkt",
        "fistl",
        "geber",
        "stbuk",
        "txbh2",
        "txbh3",
        "pprct",
        "xref1",
        "xref2",
        "kblnr",
        "kblpos",
        "sttax",
        "fkber",
        "obzei",
        "xnegp",
        "rfzei",
        "ccbtc",
        "kkber",
        "empfb",
        "xref3",
        "dtws1",
        "dtws2",
        "dtws3",
        "dtws4",
        "gricd",
        "grirg",
        "gityp",
        "xpypr",
        "kidno",
        "absbt",
        "idxsp",
        "linfv",
        "kontt",
        "kontl",
        "txdat",
        "agzei",
        "pycur",
        "pyamt",
        "bupla",
        "secco",
        "lstar",
        "cession_kz",
        "prznr",
        "ppdiff",
        "ppdif2",
        "ppdif3",
        "penlc1",
        "penlc2",
        "penlc3",
        "penfc",
        "pendays",
        "penrc",
        "grant_nbr",
        "sctax",
        "fkber_long",
        "gmvkz",
        "srtype",
        "intreno",
        "measure",
        "auggj",
        "ppa_ex_ind",
        "docln",
        "segment",
        "psegment",
        "pfkber",
        "hktid",
        "kstar",
        "xlgclr",
        "taxps",
        "pays_prov",
        "pays_tran",
        "mndid",
        "xfrge_bseg",
        "squan",
        "zzspreg",
        "zzbuspartn",
        "zzchan",
        "zzproduct",
        "zzloca",
        "zzlob",
        "zzuserfld1",
        "zzuserfld2",
        "zzuserfld3",
        "zzstate",
        "zzregion",
        "re_bukrs",
        "re_account",
        "pgeber",
        "pgrant_nbr",
        "budget_pd",
        "pbudget_pd",
        "j_1tpbupl",
        "perop_beg",
        "perop_end",
        "fastpay",
        "ignr_ivref",
        "fmfgus_key",
        "fmxdocnr",
        "fmxyear",
        "fmxdocln",
        "fmxzekkn",
        "prodper",
        "recrf",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_bseg_data"
),

"sap_bseg_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- bukrs -> company_code
    -- gjahr -> fiscal_year
    -- mandt -> client_id
    -- buzid -> line_item_identifier
    -- augdt -> clearing_date
    -- augcp -> clearing_fiscal_period
    -- augbl -> clearing_document_number
    -- bschl -> posting_key
    -- umsks -> special_gl_transaction_type
    -- shkzg -> debit_credit_indicator
    -- gsber -> business_area
    -- pargb -> partner_business_area
    -- mwskz -> vat_tax_code
    -- qsskz -> accounting_code_1
    -- dmbtr -> total_local_amount
    -- kzbtr -> document_currency_amount
    -- pswbt -> transaction_amount
    -- pswsl -> transaction_currency
    -- txbhw -> transaction_currency_tax_base
    -- txbfw -> tax_base_amount
    -- mwsts -> foreign_tax_amount
    -- wmwst -> tax_amount
    -- hwbas -> home_currency_amount
    -- fwbas -> functional_area_amount
    -- hwzuz -> home_currency_assignment_amount
    -- fwzuz -> functional_area_assignment_amount
    -- stekz -> tax_exempt_indicator
    -- mwart -> foreign_currency_valuation_type
    -- txgrp -> tax_group
    -- ktosl -> account_key
    -- qsshb -> accounting_value_3
    -- bdiff -> valuation_difference
    -- bdif2 -> valuation_difference_2
    -- valut -> value_date
    -- sgtxt -> line_item_text
    -- zinkz -> internal_code
    -- vbund -> trading_partner
    -- bewar -> asset_valuation_type
    -- altkt -> alternative_account
    -- vorgn -> transaction_type
    -- fdlev -> planning_level
    -- fdgrp -> planning_group
    -- fdwbt -> planned_price
    -- fdtag -> factory_calendar_date
    -- fkont -> funds_reservation_number
    -- kokrs -> controlling_area
    -- projn -> project_number
    -- aufnr -> order_number
    -- vbeln -> sales_document_number
    -- posn2 -> position_number
    -- eten2 -> delivery_date
    -- anln1 -> main_asset_number
    -- anln2 -> asset_subnumber
    -- anbwa -> asset_transaction_type
    -- bzdat -> asset_value_date
    -- pernr -> personnel_number
    -- xumsw -> tax_code_change_indicator
    -- xhres -> clearing_reversal_indicator
    -- xkres -> document_reversal_indicator
    -- xopvw -> open_item_management_indicator
    -- xcpdd -> clearing_with_down_payment
    -- xskst -> tax_posting_reversal_indicator
    -- xspro -> sample_document_indicator
    -- xserg -> recurring_entry_original_indicator
    -- xuman -> manual_entry_indicator
    -- xanet -> net_payment_indicator
    -- xskrl -> grir_clearing_reversal_indicator
    -- xinve -> invoice_indicator
    -- xpanz -> partial_payment_indicator
    -- xauto -> automatic_posting_indicator
    -- xncop -> noted_item_indicator
    -- xzahl -> payment_indicator
    -- saknr -> gl_account_number
    -- hkont -> gl_account
    -- kunnr -> customer_number
    -- lifnr -> vendor_number
    -- filkd -> billing_type
    -- xbilk -> balance_sheet_update_indicator
    -- zfbdt -> due_date_baseline
    -- zterm -> payment_terms
    -- zbd1t -> cash_discount_days_1
    -- zbd2t -> cash_discount_days_2
    -- zbd3t -> net_payment_terms_days
    -- zbd1p -> cash_discount_percent_1
    -- zbd2p -> cash_discount_percent_2
    -- sknto -> transaction_currency_amount
    -- wskto -> cash_discount_amount
    -- zlsch -> payment_method
    -- zlspr -> payment_block
    -- zbfix -> fixed_payment_terms_indicator
    -- hbkid -> house_bank_id
    -- bvtyp -> partner_bank_type
    -- nebtr -> net_amount
    -- mwsk1 -> tax_code_1
    -- dmbt1 -> local_amount_1
    -- wrbt1 -> withholding_tax_amount_1
    -- mwsk2 -> tax_code_2
    -- dmbt2 -> local_amount_2
    -- wrbt2 -> withholding_tax_amount_2
    -- mwsk3 -> tax_code_3
    -- dmbt3 -> local_amount_3
    -- wrbt3 -> withholding_tax_amount_3
    -- rebzj -> reference_fiscal_year
    -- rebzz -> reference_line_item
    -- rebzt -> reference_document_type
    -- zollt -> customs_tariff
    -- zolld -> customs_amount
    -- lzbkz -> payment_terms_key
    -- landl -> country_key
    -- diekz -> service_indicator
    -- samnr -> sample_number
    -- abper -> depreciation_period
    -- vrskz -> insurance_indicator
    -- vrsdt -> insurance_date
    -- disbn -> discount_base_period_number
    -- disbj -> discount_base_year
    -- disbz -> discount_base_period
    -- anfbn -> asset_acquisition_period
    -- anfbj -> asset_acquisition_year
    -- anfbu -> acquisition_date
    -- anfae -> apc_area
    -- blnbt -> document_amount_local
    -- blnkz -> balance_indicator
    -- blnpz -> balance_carryforward
    -- mschl -> dunning_level
    -- mansp -> manual_split_indicator
    -- madat -> material_document_date
    -- manst -> manual_stats_update
    -- esrnr -> gr_ir_clearing_number
    -- esrre -> bill_of_exchange_procedure
    -- esrpz -> payment_term
    -- qsznr -> accounting_number_1
    -- qbshb -> accounting_value_1
    -- qsfbt -> accounting_value_2
    -- navhw -> foreign_non_deductible_tax_base
    -- navfw -> non_deductible_input_tax
    -- matnr -> material_number
    -- werks -> plant
    -- meins -> base_unit_of_measure
    -- bprme -> partner_measurement_unit
    -- ebeln -> purchase_order_number
    -- ebelp -> purchase_order_item_number
    -- zekkn -> account_assignment_sequence
    -- elikz -> delivery_completed
    -- vprsv -> price_control_indicator
    -- bwtar -> valuation_type
    -- bustw -> tax_indicator
    -- rewrt -> reference_amount
    -- rewwr -> reference_exchange_rate
    -- bonfb -> investment_support_amount
    -- nprei -> net_price
    -- tbtkz -> subsequent_billing_indicator
    -- spgrp -> price_reason_code
    -- spgrm -> material_reason_code
    -- spgrt -> text_reason_code
    -- spgrg -> goods_movement_reason_code
    -- spgrv -> insurance_reason_code
    -- spgrq -> quantity_reason_code
    -- stceg -> vat_registration_number
    -- egbld -> billing_block
    -- eglld -> delivery_block
    -- rstgr -> reason_code
    -- ryacq -> accounting_value_5
    -- rpacq -> accounting_value_4
    -- rdiff -> difference_value_3
    -- rdif2 -> difference_value_1
    -- xhkom -> header_comment_indicator
    -- vname -> name
    -- recid -> record_id
    -- egrup -> item_group
    -- vptnr -> partner_account_number
    -- vertt -> contract_type
    -- vertn -> contract_number
    -- vbewa -> sales_movement_type
    -- depot -> securities_account
    -- txjcd -> tax_jurisdiction_code
    -- imkey -> item_key
    -- dabrz -> days_in_arrears
    -- popts -> option_selection
    -- fipos -> financial_position
    -- kstrg -> cost_object
    -- nplnr -> network_activity_number
    -- aufpl -> order_item_number
    -- aplzl -> asset_sequence_number
    -- projk -> project_key
    -- paobjnr -> profitability_segment_number
    -- pasubnr -> profitability_subsegment_number
    -- spgrs -> reservation_reason_code
    -- spgrc -> blocking_reason_code
    -- btype -> balance_type
    -- etype -> po_history_category
    -- xegdr -> single_statement_indicator
    -- lnran -> line_number_range
    -- hrkft -> profitability_segment
    -- dmbe3 -> group_currency_amount
    -- dmb31 -> group_currency_amount_1
    -- dmb32 -> group_currency_amount_2
    -- dmb33 -> group_currency_amount_3
    -- mwst2 -> local_tax_amount
    -- mwst3 -> document_tax_amount
    -- navh2 -> local_non_deductible_tax_base
    -- navh3 -> document_non_deductible_tax_base
    -- sknt2 -> second_local_currency_amount
    -- sknt3 -> third_local_currency_amount
    -- bdif3 -> valuation_difference_3
    -- rdif3 -> difference_value_2
    -- hwmet -> base_unit_quantity
    -- glupm -> consolidation_business_area
    -- xragl -> balance_carryforward_indicator
    -- uzawe -> payment_method_supplement
    -- stbuk -> tax_reporting_company_code
    -- txbh2 -> second_local_currency_tax_base
    -- txbh3 -> third_local_currency_tax_base
    -- pprct -> profit_center
    -- xref1 -> reference_key_1
    -- xref2 -> reference_key_2
    -- sttax -> local_currency_tax_amount
    -- fkber -> funds_center
    -- obzei -> original_line_item_number
    -- xnegp -> negative_posting_indicator
    -- rfzei -> reference_line_indicator
    -- ccbtc -> coding_block
    -- kkber -> credit_control_area
    -- empfb -> goods_recipient
    -- xref3 -> reference_key_3
    -- dtws1 -> tax_amount_1
    -- dtws2 -> tax_amount_2
    -- dtws3 -> tax_amount_3
    -- dtws4 -> tax_amount_4
    -- gricd -> consolidation_functional_area
    -- grirg -> consolidation_region
    -- gityp -> grant_type
    -- xpypr -> payment_processing_indicator
    -- kidno -> customer_id
    -- absbt -> write_off_amount
    -- idxsp -> special_index
    -- linfv -> line_item_reference
    -- kontt -> account_assignment_type
    -- kontl -> account_assignment
    -- txdat -> tax_reporting_date
    -- agzei -> settlement_period
    -- pycur -> payment_currency
    -- pyamt -> payment_amount
    -- bupla -> business_place
    -- secco -> section_code
    -- lstar -> activity_type
    -- cession_kz -> cession_indicator
    -- prznr -> business_process
    -- ppdiff -> price_difference
    -- ppdif2 -> price_difference_2
    -- ppdif3 -> price_difference_3
    -- penfc -> foreign_currency_amount
    -- pendays -> number_of_days
    -- penrc -> reporting_currency
    -- sctax -> sales_use_tax_amount
    -- fkber_long -> funds_center_description
    -- gmvkz -> reporting_value_type
    -- srtype -> reversal_transaction_type
    -- intreno -> internal_renumbering
    -- auggj -> clearing_fiscal_year
    -- ppa_ex_ind -> pa_exchange_rate_indicator
    -- docln -> document_line_number
    -- hktid -> account_id
    -- kstar -> cost_element
    -- xlgclr -> legacy_clearing_indicator
    -- taxps -> tax_splitting
    -- pays_prov -> payment_service_provider
    -- pays_tran -> payment_provider_transaction_id
    -- mndid -> mandate_id
    -- xfrge_bseg -> ready_for_input_indicator
    -- zzspreg -> sales_promotion_region
    -- zzbuspartn -> business_partner
    -- zzchan -> distribution_channel
    -- zzproduct -> product
    -- zzloca -> location
    -- zzlob -> line_of_business
    -- zzuserfld1 -> user_field_1
    -- zzuserfld2 -> user_field_2
    -- zzuserfld3 -> user_field_3
    -- zzstate -> state
    -- zzregion -> region
    -- re_bukrs -> re_company_code
    -- j_1tpbupl -> brazil_tax_upload
    -- perop_beg -> operation_begin_date
    -- perop_end -> operation_end_date
    -- fastpay -> fast_pay_indicator
    -- ignr_ivref -> ignore_invoice_reference
    -- fmfgus_key -> us_federal_grant_key
    -- fmxdocnr -> original_document_number
    -- fmxyear -> original_fiscal_year
    -- fmxdocln -> original_document_line_number
    -- fmxzekkn -> original_line_item_sequence
    -- prodper -> production_period
    -- recrf -> record_reference
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "belnr",
        "bukrs" AS "company_code",
        "buzei",
        "gjahr" AS "fiscal_year",
        "mandt" AS "client_id",
        "buzid" AS "line_item_identifier",
        "augdt" AS "clearing_date",
        "augcp" AS "clearing_fiscal_period",
        "augbl" AS "clearing_document_number",
        "bschl" AS "posting_key",
        "koart",
        "umskz",
        "umsks" AS "special_gl_transaction_type",
        "zumsk",
        "shkzg" AS "debit_credit_indicator",
        "gsber" AS "business_area",
        "pargb" AS "partner_business_area",
        "mwskz" AS "vat_tax_code",
        "qsskz" AS "accounting_code_1",
        "dmbtr" AS "total_local_amount",
        "wrbtr",
        "kzbtr" AS "document_currency_amount",
        "pswbt" AS "transaction_amount",
        "pswsl" AS "transaction_currency",
        "txbhw" AS "transaction_currency_tax_base",
        "txbfw" AS "tax_base_amount",
        "mwsts" AS "foreign_tax_amount",
        "wmwst" AS "tax_amount",
        "hwbas" AS "home_currency_amount",
        "fwbas" AS "functional_area_amount",
        "hwzuz" AS "home_currency_assignment_amount",
        "fwzuz" AS "functional_area_assignment_amount",
        "shzuz",
        "stekz" AS "tax_exempt_indicator",
        "mwart" AS "foreign_currency_valuation_type",
        "txgrp" AS "tax_group",
        "ktosl" AS "account_key",
        "qsshb" AS "accounting_value_3",
        "kursr",
        "gbetr",
        "bdiff" AS "valuation_difference",
        "bdif2" AS "valuation_difference_2",
        "valut" AS "value_date",
        "zuonr",
        "sgtxt" AS "line_item_text",
        "zinkz" AS "internal_code",
        "vbund" AS "trading_partner",
        "bewar" AS "asset_valuation_type",
        "altkt" AS "alternative_account",
        "vorgn" AS "transaction_type",
        "fdlev" AS "planning_level",
        "fdgrp" AS "planning_group",
        "fdwbt" AS "planned_price",
        "fdtag" AS "factory_calendar_date",
        "fkont" AS "funds_reservation_number",
        "kokrs" AS "controlling_area",
        "kostl",
        "projn" AS "project_number",
        "aufnr" AS "order_number",
        "vbeln" AS "sales_document_number",
        "vbel2",
        "posn2" AS "position_number",
        "eten2" AS "delivery_date",
        "anln1" AS "main_asset_number",
        "anln2" AS "asset_subnumber",
        "anbwa" AS "asset_transaction_type",
        "bzdat" AS "asset_value_date",
        "pernr" AS "personnel_number",
        "xumsw" AS "tax_code_change_indicator",
        "xhres" AS "clearing_reversal_indicator",
        "xkres" AS "document_reversal_indicator",
        "xopvw" AS "open_item_management_indicator",
        "xcpdd" AS "clearing_with_down_payment",
        "xskst" AS "tax_posting_reversal_indicator",
        "xsauf",
        "xspro" AS "sample_document_indicator",
        "xserg" AS "recurring_entry_original_indicator",
        "xfakt",
        "xuman" AS "manual_entry_indicator",
        "xanet" AS "net_payment_indicator",
        "xskrl" AS "grir_clearing_reversal_indicator",
        "xinve" AS "invoice_indicator",
        "xpanz" AS "partial_payment_indicator",
        "xauto" AS "automatic_posting_indicator",
        "xncop" AS "noted_item_indicator",
        "xzahl" AS "payment_indicator",
        "saknr" AS "gl_account_number",
        "hkont" AS "gl_account",
        "kunnr" AS "customer_number",
        "lifnr" AS "vendor_number",
        "filkd" AS "billing_type",
        "xbilk" AS "balance_sheet_update_indicator",
        "gvtyp",
        "hzuon",
        "zfbdt" AS "due_date_baseline",
        "zterm" AS "payment_terms",
        "zbd1t" AS "cash_discount_days_1",
        "zbd2t" AS "cash_discount_days_2",
        "zbd3t" AS "net_payment_terms_days",
        "zbd1p" AS "cash_discount_percent_1",
        "zbd2p" AS "cash_discount_percent_2",
        "skfbt",
        "sknto" AS "transaction_currency_amount",
        "wskto" AS "cash_discount_amount",
        "zlsch" AS "payment_method",
        "zlspr" AS "payment_block",
        "zbfix" AS "fixed_payment_terms_indicator",
        "hbkid" AS "house_bank_id",
        "bvtyp" AS "partner_bank_type",
        "nebtr" AS "net_amount",
        "mwsk1" AS "tax_code_1",
        "dmbt1" AS "local_amount_1",
        "wrbt1" AS "withholding_tax_amount_1",
        "mwsk2" AS "tax_code_2",
        "dmbt2" AS "local_amount_2",
        "wrbt2" AS "withholding_tax_amount_2",
        "mwsk3" AS "tax_code_3",
        "dmbt3" AS "local_amount_3",
        "wrbt3" AS "withholding_tax_amount_3",
        "rebzg",
        "rebzj" AS "reference_fiscal_year",
        "rebzz" AS "reference_line_item",
        "rebzt" AS "reference_document_type",
        "zollt" AS "customs_tariff",
        "zolld" AS "customs_amount",
        "lzbkz" AS "payment_terms_key",
        "landl" AS "country_key",
        "diekz" AS "service_indicator",
        "samnr" AS "sample_number",
        "abper" AS "depreciation_period",
        "vrskz" AS "insurance_indicator",
        "vrsdt" AS "insurance_date",
        "disbn" AS "discount_base_period_number",
        "disbj" AS "discount_base_year",
        "disbz" AS "discount_base_period",
        "wverw",
        "anfbn" AS "asset_acquisition_period",
        "anfbj" AS "asset_acquisition_year",
        "anfbu" AS "acquisition_date",
        "anfae" AS "apc_area",
        "blnbt" AS "document_amount_local",
        "blnkz" AS "balance_indicator",
        "blnpz" AS "balance_carryforward",
        "mschl" AS "dunning_level",
        "mansp" AS "manual_split_indicator",
        "madat" AS "material_document_date",
        "manst" AS "manual_stats_update",
        "maber",
        "esrnr" AS "gr_ir_clearing_number",
        "esrre" AS "bill_of_exchange_procedure",
        "esrpz" AS "payment_term",
        "klibt",
        "qsznr" AS "accounting_number_1",
        "qbshb" AS "accounting_value_1",
        "qsfbt" AS "accounting_value_2",
        "navhw" AS "foreign_non_deductible_tax_base",
        "navfw" AS "non_deductible_input_tax",
        "matnr" AS "material_number",
        "werks" AS "plant",
        "menge",
        "meins" AS "base_unit_of_measure",
        "erfmg",
        "erfme",
        "bpmng",
        "bprme" AS "partner_measurement_unit",
        "ebeln" AS "purchase_order_number",
        "ebelp" AS "purchase_order_item_number",
        "zekkn" AS "account_assignment_sequence",
        "elikz" AS "delivery_completed",
        "vprsv" AS "price_control_indicator",
        "peinh",
        "bwkey",
        "bwtar" AS "valuation_type",
        "bustw" AS "tax_indicator",
        "rewrt" AS "reference_amount",
        "rewwr" AS "reference_exchange_rate",
        "bonfb" AS "investment_support_amount",
        "bualt",
        "psalt",
        "nprei" AS "net_price",
        "tbtkz" AS "subsequent_billing_indicator",
        "spgrp" AS "price_reason_code",
        "spgrm" AS "material_reason_code",
        "spgrt" AS "text_reason_code",
        "spgrg" AS "goods_movement_reason_code",
        "spgrv" AS "insurance_reason_code",
        "spgrq" AS "quantity_reason_code",
        "stceg" AS "vat_registration_number",
        "egbld" AS "billing_block",
        "eglld" AS "delivery_block",
        "rstgr" AS "reason_code",
        "ryacq" AS "accounting_value_5",
        "rpacq" AS "accounting_value_4",
        "rdiff" AS "difference_value_3",
        "rdif2" AS "difference_value_1",
        "prctr",
        "xhkom" AS "header_comment_indicator",
        "vname" AS "name",
        "recid" AS "record_id",
        "egrup" AS "item_group",
        "vptnr" AS "partner_account_number",
        "vertt" AS "contract_type",
        "vertn" AS "contract_number",
        "vbewa" AS "sales_movement_type",
        "depot" AS "securities_account",
        "txjcd" AS "tax_jurisdiction_code",
        "imkey" AS "item_key",
        "dabrz" AS "days_in_arrears",
        "popts" AS "option_selection",
        "fipos" AS "financial_position",
        "kstrg" AS "cost_object",
        "nplnr" AS "network_activity_number",
        "aufpl" AS "order_item_number",
        "aplzl" AS "asset_sequence_number",
        "projk" AS "project_key",
        "paobjnr" AS "profitability_segment_number",
        "pasubnr" AS "profitability_subsegment_number",
        "spgrs" AS "reservation_reason_code",
        "spgrc" AS "blocking_reason_code",
        "btype" AS "balance_type",
        "etype" AS "po_history_category",
        "xegdr" AS "single_statement_indicator",
        "lnran" AS "line_number_range",
        "hrkft" AS "profitability_segment",
        "dmbe2",
        "dmbe3" AS "group_currency_amount",
        "dmb21",
        "dmb22",
        "dmb23",
        "dmb31" AS "group_currency_amount_1",
        "dmb32" AS "group_currency_amount_2",
        "dmb33" AS "group_currency_amount_3",
        "mwst2" AS "local_tax_amount",
        "mwst3" AS "document_tax_amount",
        "navh2" AS "local_non_deductible_tax_base",
        "navh3" AS "document_non_deductible_tax_base",
        "sknt2" AS "second_local_currency_amount",
        "sknt3" AS "third_local_currency_amount",
        "bdif3" AS "valuation_difference_3",
        "rdif3" AS "difference_value_2",
        "hwmet" AS "base_unit_quantity",
        "glupm" AS "consolidation_business_area",
        "xragl" AS "balance_carryforward_indicator",
        "uzawe" AS "payment_method_supplement",
        "lokkt",
        "fistl",
        "geber",
        "stbuk" AS "tax_reporting_company_code",
        "txbh2" AS "second_local_currency_tax_base",
        "txbh3" AS "third_local_currency_tax_base",
        "pprct" AS "profit_center",
        "xref1" AS "reference_key_1",
        "xref2" AS "reference_key_2",
        "kblnr",
        "kblpos",
        "sttax" AS "local_currency_tax_amount",
        "fkber" AS "funds_center",
        "obzei" AS "original_line_item_number",
        "xnegp" AS "negative_posting_indicator",
        "rfzei" AS "reference_line_indicator",
        "ccbtc" AS "coding_block",
        "kkber" AS "credit_control_area",
        "empfb" AS "goods_recipient",
        "xref3" AS "reference_key_3",
        "dtws1" AS "tax_amount_1",
        "dtws2" AS "tax_amount_2",
        "dtws3" AS "tax_amount_3",
        "dtws4" AS "tax_amount_4",
        "gricd" AS "consolidation_functional_area",
        "grirg" AS "consolidation_region",
        "gityp" AS "grant_type",
        "xpypr" AS "payment_processing_indicator",
        "kidno" AS "customer_id",
        "absbt" AS "write_off_amount",
        "idxsp" AS "special_index",
        "linfv" AS "line_item_reference",
        "kontt" AS "account_assignment_type",
        "kontl" AS "account_assignment",
        "txdat" AS "tax_reporting_date",
        "agzei" AS "settlement_period",
        "pycur" AS "payment_currency",
        "pyamt" AS "payment_amount",
        "bupla" AS "business_place",
        "secco" AS "section_code",
        "lstar" AS "activity_type",
        "cession_kz" AS "cession_indicator",
        "prznr" AS "business_process",
        "ppdiff" AS "price_difference",
        "ppdif2" AS "price_difference_2",
        "ppdif3" AS "price_difference_3",
        "penlc1",
        "penlc2",
        "penlc3",
        "penfc" AS "foreign_currency_amount",
        "pendays" AS "number_of_days",
        "penrc" AS "reporting_currency",
        "grant_nbr",
        "sctax" AS "sales_use_tax_amount",
        "fkber_long" AS "funds_center_description",
        "gmvkz" AS "reporting_value_type",
        "srtype" AS "reversal_transaction_type",
        "intreno" AS "internal_renumbering",
        "measure",
        "auggj" AS "clearing_fiscal_year",
        "ppa_ex_ind" AS "pa_exchange_rate_indicator",
        "docln" AS "document_line_number",
        "segment",
        "psegment",
        "pfkber",
        "hktid" AS "account_id",
        "kstar" AS "cost_element",
        "xlgclr" AS "legacy_clearing_indicator",
        "taxps" AS "tax_splitting",
        "pays_prov" AS "payment_service_provider",
        "pays_tran" AS "payment_provider_transaction_id",
        "mndid" AS "mandate_id",
        "xfrge_bseg" AS "ready_for_input_indicator",
        "squan",
        "zzspreg" AS "sales_promotion_region",
        "zzbuspartn" AS "business_partner",
        "zzchan" AS "distribution_channel",
        "zzproduct" AS "product",
        "zzloca" AS "location",
        "zzlob" AS "line_of_business",
        "zzuserfld1" AS "user_field_1",
        "zzuserfld2" AS "user_field_2",
        "zzuserfld3" AS "user_field_3",
        "zzstate" AS "state",
        "zzregion" AS "region",
        "re_bukrs" AS "re_company_code",
        "re_account",
        "pgeber",
        "pgrant_nbr",
        "budget_pd",
        "pbudget_pd",
        "j_1tpbupl" AS "brazil_tax_upload",
        "perop_beg" AS "operation_begin_date",
        "perop_end" AS "operation_end_date",
        "fastpay" AS "fast_pay_indicator",
        "ignr_ivref" AS "ignore_invoice_reference",
        "fmfgus_key" AS "us_federal_grant_key",
        "fmxdocnr" AS "original_document_number",
        "fmxyear" AS "original_fiscal_year",
        "fmxdocln" AS "original_document_line_number",
        "fmxzekkn" AS "original_line_item_sequence",
        "prodper" AS "production_period",
        "recrf" AS "record_reference",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_bseg_data_projected"
),

"sap_bseg_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values:
    -- koart: The column holds only the single value 's'. In SAP, KOART is the account type and 'S' conventionally denotes a G/L account, so the value is left unchanged.
    -- debit_credit_indicator: The column holds the value 'h'. In SAP, the debit/credit indicator conventionally takes 'S' (debit) or 'H' (credit) rather than 'D'/'C', so 'h' most likely marks a credit posting. This step maps 'h' to an empty string, which discards that information; review the mapping before relying on this column.
    -- transaction_type: 'rfbu' is the only value in the column. It is kept as is, on the assumption that it has a specific meaning in the source system.
    -- payment_indicator: The column holds the value 'x', which does not state what the flag means. This step maps 'x' to 'paid'. SAP checkbox fields conventionally use 'X' for "set", so 'paid' is an interpretation rather than a source value.
    -- gvtyp: The column holds the single-character value 'x', whose meaning is not clear from the data alone. This step maps it to an empty string.
    SELECT
        "belnr",
        "company_code",
        "buzei",
        "fiscal_year",
        "client_id",
        "line_item_identifier",
        "clearing_date",
        "clearing_fiscal_period",
        "clearing_document_number",
        "posting_key",
        "koart",
        "umskz",
        "special_gl_transaction_type",
        "zumsk",
        CASE
            WHEN "debit_credit_indicator" = 'h' THEN ''
            ELSE "debit_credit_indicator"
        END AS "debit_credit_indicator",
        "business_area",
        "partner_business_area",
        "vat_tax_code",
        "accounting_code_1",
        "total_local_amount",
        "wrbtr",
        "document_currency_amount",
        "transaction_amount",
        "transaction_currency",
        "transaction_currency_tax_base",
        "tax_base_amount",
        "foreign_tax_amount",
        "tax_amount",
        "home_currency_amount",
        "functional_area_amount",
        "home_currency_assignment_amount",
        "functional_area_assignment_amount",
        "shzuz",
        "tax_exempt_indicator",
        "foreign_currency_valuation_type",
        "tax_group",
        "account_key",
        "accounting_value_3",
        "kursr",
        "gbetr",
        "valuation_difference",
        "valuation_difference_2",
        "value_date",
        "zuonr",
        "line_item_text",
        "internal_code",
        "trading_partner",
        "asset_valuation_type",
        "alternative_account",
        "transaction_type",
        "planning_level",
        "planning_group",
        "planned_price",
        "factory_calendar_date",
        "funds_reservation_number",
        "controlling_area",
        "kostl",
        "project_number",
        "order_number",
        "sales_document_number",
        "vbel2",
        "position_number",
        "delivery_date",
        "main_asset_number",
        "asset_subnumber",
        "asset_transaction_type",
        "asset_value_date",
        "personnel_number",
        "tax_code_change_indicator",
        "clearing_reversal_indicator",
        "document_reversal_indicator",
        "open_item_management_indicator",
        "clearing_with_down_payment",
        "tax_posting_reversal_indicator",
        "xsauf",
        "sample_document_indicator",
        "recurring_entry_original_indicator",
        "xfakt",
        "manual_entry_indicator",
        "net_payment_indicator",
        "grir_clearing_reversal_indicator",
        "invoice_indicator",
        "partial_payment_indicator",
        "automatic_posting_indicator",
        "noted_item_indicator",
        CASE
            WHEN "payment_indicator" = 'x' THEN 'paid'
            ELSE "payment_indicator"
        END AS "payment_indicator",
        "gl_account_number",
        "gl_account",
        "customer_number",
        "vendor_number",
        "billing_type",
        "balance_sheet_update_indicator",
        CASE
            WHEN "gvtyp" = 'x' THEN ''
            ELSE "gvtyp"
        END AS "gvtyp",
        "hzuon",
        "due_date_baseline",
        "payment_terms",
        "cash_discount_days_1",
        "cash_discount_days_2",
        "net_payment_terms_days",
        "cash_discount_percent_1",
        "cash_discount_percent_2",
        "skfbt",
        "transaction_currency_amount",
        "cash_discount_amount",
        "payment_method",
        "payment_block",
        "fixed_payment_terms_indicator",
        "house_bank_id",
        "partner_bank_type",
        "net_amount",
        "tax_code_1",
        "local_amount_1",
        "withholding_tax_amount_1",
        "tax_code_2",
        "local_amount_2",
        "withholding_tax_amount_2",
        "tax_code_3",
        "local_amount_3",
        "withholding_tax_amount_3",
        "rebzg",
        "reference_fiscal_year",
        "reference_line_item",
        "reference_document_type",
        "customs_tariff",
        "customs_amount",
        "payment_terms_key",
        "country_key",
        "service_indicator",
        "sample_number",
        "depreciation_period",
        "insurance_indicator",
        "insurance_date",
        "discount_base_period_number",
        "discount_base_year",
        "discount_base_period",
        "wverw",
        "asset_acquisition_period",
        "asset_acquisition_year",
        "acquisition_date",
        "apc_area",
        "document_amount_local",
        "balance_indicator",
        "balance_carryforward",
        "dunning_level",
        "manual_split_indicator",
        "material_document_date",
        "manual_stats_update",
        "maber",
        "gr_ir_clearing_number",
        "bill_of_exchange_procedure",
        "payment_term",
        "klibt",
        "accounting_number_1",
        "accounting_value_1",
        "accounting_value_2",
        "foreign_non_deductible_tax_base",
        "non_deductible_input_tax",
        "material_number",
        "plant",
        "menge",
        "base_unit_of_measure",
        "erfmg",
        "erfme",
        "bpmng",
        "partner_measurement_unit",
        "purchase_order_number",
        "purchase_order_item_number",
        "account_assignment_sequence",
        "delivery_completed",
        "price_control_indicator",
        "peinh",
        "bwkey",
        "valuation_type",
        "tax_indicator",
        "reference_amount",
        "reference_exchange_rate",
        "investment_support_amount",
        "bualt",
        "psalt",
        "net_price",
        "subsequent_billing_indicator",
        "price_reason_code",
        "material_reason_code",
        "text_reason_code",
        "goods_movement_reason_code",
        "insurance_reason_code",
        "quantity_reason_code",
        "vat_registration_number",
        "billing_block",
        "delivery_block",
        "reason_code",
        "accounting_value_5",
        "accounting_value_4",
        "difference_value_3",
        "difference_value_1",
        "prctr",
        "header_comment_indicator",
        "name",
        "record_id",
        "item_group",
        "partner_account_number",
        "contract_type",
        "contract_number",
        "sales_movement_type",
        "securities_account",
        "tax_jurisdiction_code",
        "item_key",
        "days_in_arrears",
        "option_selection",
        "financial_position",
        "cost_object",
        "network_activity_number",
        "order_item_number",
        "asset_sequence_number",
        "project_key",
        "profitability_segment_number",
        "profitability_subsegment_number",
        "reservation_reason_code",
        "blocking_reason_code",
        "balance_type",
        "po_history_category",
        "single_statement_indicator",
        "line_number_range",
        "profitability_segment",
        "dmbe2",
        "group_currency_amount",
        "dmb21",
        "dmb22",
        "dmb23",
        "group_currency_amount_1",
        "group_currency_amount_2",
        "group_currency_amount_3",
        "local_tax_amount",
        "document_tax_amount",
        "local_non_deductible_tax_base",
        "document_non_deductible_tax_base",
        "second_local_currency_amount",
        "third_local_currency_amount",
        "valuation_difference_3",
        "difference_value_2",
        "base_unit_quantity",
        "consolidation_business_area",
        "balance_carryforward_indicator",
        "payment_method_supplement",
        "lokkt",
        "fistl",
        "geber",
        "tax_reporting_company_code",
        "second_local_currency_tax_base",
        "third_local_currency_tax_base",
        "profit_center",
        "reference_key_1",
        "reference_key_2",
        "kblnr",
        "kblpos",
        "local_currency_tax_amount",
        "funds_center",
        "original_line_item_number",
        "negative_posting_indicator",
        "reference_line_indicator",
        "coding_block",
        "credit_control_area",
        "goods_recipient",
        "reference_key_3",
        "tax_amount_1",
        "tax_amount_2",
        "tax_amount_3",
        "tax_amount_4",
        "consolidation_functional_area",
        "consolidation_region",
        "grant_type",
        "payment_processing_indicator",
        "customer_id",
        "write_off_amount",
        "special_index",
        "line_item_reference",
        "account_assignment_type",
        "account_assignment",
        "tax_reporting_date",
        "settlement_period",
        "payment_currency",
        "payment_amount",
        "business_place",
        "section_code",
        "activity_type",
        "cession_indicator",
        "business_process",
        "price_difference",
        "price_difference_2",
        "price_difference_3",
        "penlc1",
        "penlc2",
        "penlc3",
        "foreign_currency_amount",
        "number_of_days",
        "reporting_currency",
        "grant_nbr",
        "sales_use_tax_amount",
        "funds_center_description",
        "reporting_value_type",
        "reversal_transaction_type",
        "internal_renumbering",
        "measure",
        "clearing_fiscal_year",
        "pa_exchange_rate_indicator",
        "document_line_number",
        "segment",
        "psegment",
        "pfkber",
        "account_id",
        "cost_element",
        "legacy_clearing_indicator",
        "tax_splitting",
        "payment_service_provider",
        "payment_provider_transaction_id",
        "mandate_id",
        "ready_for_input_indicator",
        "squan",
        "sales_promotion_region",
        "business_partner",
        "distribution_channel",
        "product",
        "location",
        "line_of_business",
        "user_field_1",
        "user_field_2",
        "user_field_3",
        "state",
        "region",
        "re_company_code",
        "re_account",
        "pgeber",
        "pgrant_nbr",
        "budget_pd",
        "pbudget_pd",
        "brazil_tax_upload",
        "operation_begin_date",
        "operation_end_date",
        "fast_pay_indicator",
        "ignore_invoice_reference",
        "us_federal_grant_key",
        "original_document_number",
        "original_fiscal_year",
        "original_document_line_number",
        "original_line_item_sequence",
        "production_period",
        "record_reference",
        "row_id",
        "is_deleted"
    FROM "sap_bseg_data_projected_renamed"
),

"sap_bseg_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- account_assignment: from DECIMAL to VARCHAR
    -- account_assignment_type: from DECIMAL to VARCHAR
    -- account_id: from DECIMAL to VARCHAR
    -- account_key: from DECIMAL to VARCHAR
    -- accounting_code_1: from DECIMAL to VARCHAR
    -- accounting_number_1: from DECIMAL to VARCHAR
    -- accounting_value_5: from DECIMAL to VARCHAR
    -- acquisition_date: from DECIMAL to DATE
    -- activity_type: from DECIMAL to VARCHAR
    -- alternative_account: from INT to VARCHAR
    -- asset_acquisition_period: from DECIMAL to VARCHAR
    -- asset_subnumber: from DECIMAL to VARCHAR
    -- asset_transaction_type: from DECIMAL to VARCHAR
    -- asset_valuation_type: from DECIMAL to VARCHAR
    -- automatic_posting_indicator: from DECIMAL to VARCHAR
    -- balance_carryforward_indicator: from DECIMAL to VARCHAR
    -- balance_indicator: from DECIMAL to VARCHAR
    -- balance_sheet_update_indicator: from DECIMAL to VARCHAR
    -- balance_type: from DECIMAL to VARCHAR
    -- base_unit_of_measure: from DECIMAL to VARCHAR
    -- base_unit_quantity: from DECIMAL to VARCHAR
    -- belnr: from INT to VARCHAR
    -- bill_of_exchange_procedure: from DECIMAL to VARCHAR
    -- billing_block: from DECIMAL to VARCHAR
    -- billing_type: from DECIMAL to VARCHAR
    -- blocking_reason_code: from DECIMAL to VARCHAR
    -- brazil_tax_upload: from DECIMAL to VARCHAR
    -- budget_pd: from DECIMAL to VARCHAR
    -- business_area: from INT to VARCHAR
    -- business_partner: from DECIMAL to VARCHAR
    -- business_place: from DECIMAL to VARCHAR
    -- business_process: from DECIMAL to VARCHAR
    -- bwkey: from DECIMAL to VARCHAR
    -- cession_indicator: from DECIMAL to VARCHAR
    -- clearing_document_number: from DECIMAL to VARCHAR
    -- clearing_reversal_indicator: from DECIMAL to VARCHAR
    -- clearing_with_down_payment: from DECIMAL to VARCHAR
    -- client_id: from INT to VARCHAR
    -- coding_block: from DECIMAL to VARCHAR
    -- company_code: from INT to VARCHAR
    -- consolidation_business_area: from DECIMAL to VARCHAR
    -- consolidation_functional_area: from DECIMAL to VARCHAR
    -- consolidation_region: from DECIMAL to VARCHAR
    -- contract_number: from DECIMAL to VARCHAR
    -- contract_type: from DECIMAL to VARCHAR
    -- controlling_area: from INT to VARCHAR
    -- cost_element: from DECIMAL to VARCHAR
    -- cost_object: from DECIMAL to VARCHAR
    -- country_key: from DECIMAL to VARCHAR
    -- credit_control_area: from DECIMAL to VARCHAR
    -- customer_id: from DECIMAL to VARCHAR
    -- customer_number: from DECIMAL to VARCHAR
    -- customs_tariff: from DECIMAL to VARCHAR
    -- delivery_block: from DECIMAL to VARCHAR
    -- delivery_completed: from DECIMAL to VARCHAR
    -- discount_base_period_number: from DECIMAL to VARCHAR
    -- distribution_channel: from DECIMAL to VARCHAR
    -- document_line_number: from DECIMAL to VARCHAR
    -- dunning_level: from DECIMAL to VARCHAR
    -- erfme: from DECIMAL to VARCHAR
    -- fast_pay_indicator: from DECIMAL to VARCHAR
    -- financial_position: from INT to VARCHAR
    -- fistl: from DECIMAL to VARCHAR
    -- fixed_payment_terms_indicator: from DECIMAL to VARCHAR
    -- foreign_currency_valuation_type: from DECIMAL to VARCHAR
    -- funds_center: from INT to VARCHAR
    -- funds_center_description: from INT to VARCHAR
    -- funds_reservation_number: from INT to VARCHAR
    -- geber: from DECIMAL to VARCHAR
    -- gl_account: from INT to VARCHAR
    -- gl_account_number: from DECIMAL to VARCHAR
    -- goods_movement_reason_code: from DECIMAL to VARCHAR
    -- goods_recipient: from DECIMAL to VARCHAR
    -- gr_ir_clearing_number: from DECIMAL to VARCHAR
    -- grant_nbr: from DECIMAL to VARCHAR
    -- grant_type: from DECIMAL to VARCHAR
    -- grir_clearing_reversal_indicator: from DECIMAL to VARCHAR
    -- header_comment_indicator: from DECIMAL to VARCHAR
    -- house_bank_id: from DECIMAL to VARCHAR
    -- hzuon: from DECIMAL to VARCHAR
    -- ignore_invoice_reference: from DECIMAL to VARCHAR
    -- insurance_indicator: from DECIMAL to VARCHAR
    -- insurance_reason_code: from DECIMAL to VARCHAR
    -- internal_code: from DECIMAL to VARCHAR
    -- internal_renumbering: from DECIMAL to VARCHAR
    -- invoice_indicator: from DECIMAL to VARCHAR
    -- item_group: from DECIMAL to VARCHAR
    -- item_key: from DECIMAL to VARCHAR
    -- kblnr: from DECIMAL to VARCHAR
    -- kostl: from INT to VARCHAR
    -- legacy_clearing_indicator: from DECIMAL to VARCHAR
    -- line_item_identifier: from DECIMAL to VARCHAR
    -- line_item_reference: from INT to VARCHAR
    -- line_number_range: from INT to VARCHAR
    -- line_of_business: from DECIMAL to VARCHAR
    -- location: from DECIMAL to VARCHAR
    -- lokkt: from INT to VARCHAR
    -- maber: from DECIMAL to VARCHAR
    -- main_asset_number: from DECIMAL to VARCHAR
    -- mandate_id: from DECIMAL to VARCHAR
    -- manual_entry_indicator: from DECIMAL to VARCHAR
    -- manual_split_indicator: from DECIMAL to VARCHAR
    -- manual_stats_update: from INT to VARCHAR
    -- material_document_date: from INT to VARCHAR
    -- material_number: from DECIMAL to VARCHAR
    -- material_reason_code: from DECIMAL to VARCHAR
    -- measure: from DECIMAL to VARCHAR
    -- name: from DECIMAL to VARCHAR
    -- negative_posting_indicator: from DECIMAL to VARCHAR
    -- net_payment_indicator: from DECIMAL to VARCHAR
    -- network_activity_number: from DECIMAL to VARCHAR
    -- noted_item_indicator: from DECIMAL to VARCHAR
    -- open_item_management_indicator: from DECIMAL to VARCHAR
    -- operation_begin_date: from INT to DATE
    -- operation_end_date: from INT to DATE
    -- order_number: from DECIMAL to VARCHAR
    -- original_document_number: from DECIMAL to VARCHAR
    -- pa_exchange_rate_indicator: from DECIMAL to VARCHAR
    -- partial_payment_indicator: from DECIMAL to VARCHAR
    -- partner_account_number: from DECIMAL to VARCHAR
    -- partner_bank_type: from DECIMAL to VARCHAR
    -- partner_business_area: from DECIMAL to VARCHAR
    -- partner_measurement_unit: from DECIMAL to VARCHAR
    -- payment_block: from DECIMAL to VARCHAR
    -- payment_currency: from DECIMAL to VARCHAR
    -- payment_method: from DECIMAL to VARCHAR
    -- payment_method_supplement: from DECIMAL to VARCHAR
    -- payment_processing_indicator: from DECIMAL to VARCHAR
    -- payment_provider_transaction_id: from DECIMAL to VARCHAR
    -- payment_service_provider: from DECIMAL to VARCHAR
    -- payment_term: from DECIMAL to VARCHAR
    -- payment_terms: from DECIMAL to VARCHAR
    -- payment_terms_key: from DECIMAL to VARCHAR
    -- pbudget_pd: from DECIMAL to VARCHAR
    -- pfkber: from DECIMAL to VARCHAR
    -- pgeber: from DECIMAL to VARCHAR
    -- pgrant_nbr: from DECIMAL to VARCHAR
    -- planning_group: from DECIMAL to VARCHAR
    -- planning_level: from DECIMAL to VARCHAR
    -- plant: from DECIMAL to VARCHAR
    -- po_history_category: from DECIMAL to VARCHAR
    -- prctr: from INT to VARCHAR
    -- price_control_indicator: from DECIMAL to VARCHAR
    -- price_reason_code: from DECIMAL to VARCHAR
    -- product: from DECIMAL to VARCHAR
    -- profit_center: from DECIMAL to VARCHAR
    -- profitability_segment: from DECIMAL to VARCHAR
    -- project_number: from DECIMAL to VARCHAR
    -- psalt: from DECIMAL to VARCHAR
    -- psegment: from DECIMAL to VARCHAR
    -- purchase_order_number: from DECIMAL to VARCHAR
    -- quantity_reason_code: from DECIMAL to VARCHAR
    -- re_account: from DECIMAL to VARCHAR
    -- re_company_code: from DECIMAL to VARCHAR
    -- ready_for_input_indicator: from DECIMAL to VARCHAR
    -- reason_code: from DECIMAL to VARCHAR
    -- rebzg: from DECIMAL to VARCHAR
    -- record_id: from DECIMAL to VARCHAR
    -- record_reference: from DECIMAL to VARCHAR
    -- recurring_entry_original_indicator: from DECIMAL to VARCHAR
    -- reference_document_type: from DECIMAL to VARCHAR
    -- reference_key_1: from DECIMAL to VARCHAR
    -- reference_key_2: from DECIMAL to VARCHAR
    -- reference_key_3: from DECIMAL to VARCHAR
    -- region: from DECIMAL to VARCHAR
    -- reporting_currency: from DECIMAL to VARCHAR
    -- reporting_value_type: from DECIMAL to VARCHAR
    -- reservation_reason_code: from DECIMAL to VARCHAR
    -- reversal_transaction_type: from DECIMAL to VARCHAR
    -- sales_document_number: from DECIMAL to VARCHAR
    -- sales_movement_type: from DECIMAL to VARCHAR
    -- sales_promotion_region: from DECIMAL to VARCHAR
    -- sample_document_indicator: from DECIMAL to VARCHAR
    -- section_code: from DECIMAL to VARCHAR
    -- securities_account: from DECIMAL to VARCHAR
    -- segment: from DECIMAL to VARCHAR
    -- service_indicator: from DECIMAL to VARCHAR
    -- shzuz: from DECIMAL to VARCHAR
    -- single_statement_indicator: from DECIMAL to VARCHAR
    -- special_gl_transaction_type: from DECIMAL to VARCHAR
    -- special_index: from DECIMAL to VARCHAR
    -- squan: from DECIMAL to VARCHAR
    -- state: from DECIMAL to VARCHAR
    -- subsequent_billing_indicator: from DECIMAL to VARCHAR
    -- tax_code_1: from DECIMAL to VARCHAR
    -- tax_code_2: from DECIMAL to VARCHAR
    -- tax_code_3: from DECIMAL to VARCHAR
    -- tax_code_change_indicator: from DECIMAL to VARCHAR
    -- tax_exempt_indicator: from DECIMAL to VARCHAR
    -- tax_indicator: from DECIMAL to VARCHAR
    -- tax_jurisdiction_code: from DECIMAL to VARCHAR
    -- tax_posting_reversal_indicator: from DECIMAL to VARCHAR
    -- tax_reporting_company_code: from DECIMAL to VARCHAR
    -- text_reason_code: from DECIMAL to VARCHAR
    -- trading_partner: from DECIMAL to VARCHAR
    -- umskz: from DECIMAL to VARCHAR
    -- us_federal_grant_key: from DECIMAL to VARCHAR
    -- user_field_1: from DECIMAL to VARCHAR
    -- user_field_2: from DECIMAL to VARCHAR
    -- user_field_3: from DECIMAL to VARCHAR
    -- valuation_type: from DECIMAL to VARCHAR
    -- vat_registration_number: from DECIMAL to VARCHAR
    -- vat_tax_code: from DECIMAL to VARCHAR
    -- vbel2: from DECIMAL to VARCHAR
    -- vendor_number: from DECIMAL to VARCHAR
    -- wverw: from DECIMAL to VARCHAR
    -- xfakt: from DECIMAL to VARCHAR
    -- xsauf: from DECIMAL to VARCHAR
    -- zumsk: from DECIMAL to VARCHAR
    -- zuonr: from INT to VARCHAR
    SELECT
        "buzei",
        "fiscal_year",
        "clearing_date",
        "clearing_fiscal_period",
        "posting_key",
        "koart",
        "debit_credit_indicator",
        "total_local_amount",
        "wrbtr",
        "document_currency_amount",
        "transaction_amount",
        "transaction_currency",
        "transaction_currency_tax_base",
        "tax_base_amount",
        "foreign_tax_amount",
        "tax_amount",
        "home_currency_amount",
        "functional_area_amount",
        "home_currency_assignment_amount",
        "functional_area_assignment_amount",
        "tax_group",
        "accounting_value_3",
        "kursr",
        "gbetr",
        "valuation_difference",
        "valuation_difference_2",
        "value_date",
        "line_item_text",
        "transaction_type",
        "planned_price",
        "factory_calendar_date",
        "position_number",
        "delivery_date",
        "asset_value_date",
        "personnel_number",
        "document_reversal_indicator",
        "payment_indicator",
        "gvtyp",
        "due_date_baseline",
        "cash_discount_days_1",
        "cash_discount_days_2",
        "net_payment_terms_days",
        "cash_discount_percent_1",
        "cash_discount_percent_2",
        "skfbt",
        "transaction_currency_amount",
        "cash_discount_amount",
        "net_amount",
        "local_amount_1",
        "withholding_tax_amount_1",
        "local_amount_2",
        "withholding_tax_amount_2",
        "local_amount_3",
        "withholding_tax_amount_3",
        "reference_fiscal_year",
        "reference_line_item",
        "customs_amount",
        "sample_number",
        "depreciation_period",
        "insurance_date",
        "discount_base_year",
        "discount_base_period",
        "asset_acquisition_year",
        "apc_area",
        "document_amount_local",
        "balance_carryforward",
        "klibt",
        "accounting_value_1",
        "accounting_value_2",
        "foreign_non_deductible_tax_base",
        "non_deductible_input_tax",
        "menge",
        "erfmg",
        "bpmng",
        "purchase_order_item_number",
        "account_assignment_sequence",
        "peinh",
        "reference_amount",
        "reference_exchange_rate",
        "investment_support_amount",
        "bualt",
        "net_price",
        "accounting_value_4",
        "difference_value_3",
        "difference_value_1",
        "days_in_arrears",
        "option_selection",
        "order_item_number",
        "asset_sequence_number",
        "project_key",
        "profitability_segment_number",
        "profitability_subsegment_number",
        "dmbe2",
        "group_currency_amount",
        "dmb21",
        "dmb22",
        "dmb23",
        "group_currency_amount_1",
        "group_currency_amount_2",
        "group_currency_amount_3",
        "local_tax_amount",
        "document_tax_amount",
        "local_non_deductible_tax_base",
        "document_non_deductible_tax_base",
        "second_local_currency_amount",
        "third_local_currency_amount",
        "valuation_difference_3",
        "difference_value_2",
        "second_local_currency_tax_base",
        "third_local_currency_tax_base",
        "kblpos",
        "local_currency_tax_amount",
        "original_line_item_number",
        "reference_line_indicator",
        "tax_amount_1",
        "tax_amount_2",
        "tax_amount_3",
        "tax_amount_4",
        "write_off_amount",
        "tax_reporting_date",
        "settlement_period",
        "payment_amount",
        "price_difference",
        "price_difference_2",
        "price_difference_3",
        "penlc1",
        "penlc2",
        "penlc3",
        "foreign_currency_amount",
        "number_of_days",
        "sales_use_tax_amount",
        "clearing_fiscal_year",
        "tax_splitting",
        "original_fiscal_year",
        "original_document_line_number",
        "original_line_item_sequence",
        "production_period",
        "row_id",
        "is_deleted",
        CAST("account_assignment" AS VARCHAR) AS "account_assignment",
        CAST("account_assignment_type" AS VARCHAR) AS "account_assignment_type",
        CAST("account_id" AS VARCHAR) AS "account_id",
        CAST("account_key" AS VARCHAR) AS "account_key",
        CAST("accounting_code_1" AS VARCHAR) AS "accounting_code_1",
        CAST("accounting_number_1" AS VARCHAR) AS "accounting_number_1",
        CAST("accounting_value_5" AS VARCHAR) AS "accounting_value_5",
        CAST("acquisition_date" AS DATE) AS "acquisition_date",
        CAST("activity_type" AS VARCHAR) AS "activity_type",
        CAST("alternative_account" AS VARCHAR) AS "alternative_account",
        CAST("asset_acquisition_period" AS VARCHAR) AS "asset_acquisition_period",
        CAST("asset_subnumber" AS VARCHAR) AS "asset_subnumber",
        CAST("asset_transaction_type" AS VARCHAR) AS "asset_transaction_type",
        CAST("asset_valuation_type" AS VARCHAR) AS "asset_valuation_type",
        CAST("automatic_posting_indicator" AS VARCHAR) AS "automatic_posting_indicator",
        CAST("balance_carryforward_indicator" AS VARCHAR) AS "balance_carryforward_indicator",
        CAST("balance_indicator" AS VARCHAR) AS "balance_indicator",
        CAST("balance_sheet_update_indicator" AS VARCHAR) AS "balance_sheet_update_indicator",
        CAST("balance_type" AS VARCHAR) AS "balance_type",
        CAST("base_unit_of_measure" AS VARCHAR) AS "base_unit_of_measure",
        CAST("base_unit_quantity" AS VARCHAR) AS "base_unit_quantity",
        CAST("belnr" AS VARCHAR) AS "belnr",
        CAST("bill_of_exchange_procedure" AS VARCHAR) AS "bill_of_exchange_procedure",
        CAST("billing_block" AS VARCHAR) AS "billing_block",
        CAST("billing_type" AS VARCHAR) AS "billing_type",
        CAST("blocking_reason_code" AS VARCHAR) AS "blocking_reason_code",
        CAST("brazil_tax_upload" AS VARCHAR) AS "brazil_tax_upload",
        CAST("budget_pd" AS VARCHAR) AS "budget_pd",
        CAST("business_area" AS VARCHAR) AS "business_area",
        CAST("business_partner" AS VARCHAR) AS "business_partner",
        CAST("business_place" AS VARCHAR) AS "business_place",
        CAST("business_process" AS VARCHAR) AS "business_process",
        CAST("bwkey" AS VARCHAR) AS "bwkey",
        CAST("cession_indicator" AS VARCHAR) AS "cession_indicator",
        CAST("clearing_document_number" AS VARCHAR) AS "clearing_document_number",
        CAST("clearing_reversal_indicator" AS VARCHAR) AS "clearing_reversal_indicator",
        CAST("clearing_with_down_payment" AS VARCHAR) AS "clearing_with_down_payment",
        CAST("client_id" AS VARCHAR) AS "client_id",
        CAST("coding_block" AS VARCHAR) AS "coding_block",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST("consolidation_business_area" AS VARCHAR) AS "consolidation_business_area",
        CAST("consolidation_functional_area" AS VARCHAR) AS "consolidation_functional_area",
        CAST("consolidation_region" AS VARCHAR) AS "consolidation_region",
        CAST("contract_number" AS VARCHAR) AS "contract_number",
        CAST("contract_type" AS VARCHAR) AS "contract_type",
        CAST("controlling_area" AS VARCHAR) AS "controlling_area",
        CAST("cost_element" AS VARCHAR) AS "cost_element",
        CAST("cost_object" AS VARCHAR) AS "cost_object",
        CAST("country_key" AS VARCHAR) AS "country_key",
        CAST("credit_control_area" AS VARCHAR) AS "credit_control_area",
        CAST("customer_id" AS VARCHAR) AS "customer_id",
        CAST("customer_number" AS VARCHAR) AS "customer_number",
        CAST("customs_tariff" AS VARCHAR) AS "customs_tariff",
        CAST("delivery_block" AS VARCHAR) AS "delivery_block",
        CAST("delivery_completed" AS VARCHAR) AS "delivery_completed",
        CAST("discount_base_period_number" AS VARCHAR) AS "discount_base_period_number",
        CAST("distribution_channel" AS VARCHAR) AS "distribution_channel",
        CAST("document_line_number" AS VARCHAR) AS "document_line_number",
        CAST("dunning_level" AS VARCHAR) AS "dunning_level",
        CAST("erfme" AS VARCHAR) AS "erfme",
        CAST("fast_pay_indicator" AS VARCHAR) AS "fast_pay_indicator",
        CAST("financial_position" AS VARCHAR) AS "financial_position",
        CAST("fistl" AS VARCHAR) AS "fistl",
        CAST("fixed_payment_terms_indicator" AS VARCHAR) AS "fixed_payment_terms_indicator",
        CAST("foreign_currency_valuation_type" AS VARCHAR) AS "foreign_currency_valuation_type",
        CAST("funds_center" AS VARCHAR) AS "funds_center",
        CAST("funds_center_description" AS VARCHAR) AS "funds_center_description",
        CAST("funds_reservation_number" AS VARCHAR) AS "funds_reservation_number",
        CAST("geber" AS VARCHAR) AS "geber",
        CAST("gl_account" AS VARCHAR) AS "gl_account",
        CAST("gl_account_number" AS VARCHAR) AS "gl_account_number",
        CAST("goods_movement_reason_code" AS VARCHAR) AS "goods_movement_reason_code",
        CAST("goods_recipient" AS VARCHAR) AS "goods_recipient",
        CAST("gr_ir_clearing_number" AS VARCHAR) AS "gr_ir_clearing_number",
        CAST("grant_nbr" AS VARCHAR) AS "grant_nbr",
        CAST("grant_type" AS VARCHAR) AS "grant_type",
        CAST("grir_clearing_reversal_indicator" AS VARCHAR) AS "grir_clearing_reversal_indicator",
        CAST("header_comment_indicator" AS VARCHAR) AS "header_comment_indicator",
        CAST("house_bank_id" AS VARCHAR) AS "house_bank_id",
        CAST("hzuon" AS VARCHAR) AS "hzuon",
        CAST("ignore_invoice_reference" AS VARCHAR) AS "ignore_invoice_reference",
        CAST("insurance_indicator" AS VARCHAR) AS "insurance_indicator",
        CAST("insurance_reason_code" AS VARCHAR) AS "insurance_reason_code",
        CAST("internal_code" AS VARCHAR) AS "internal_code",
        CAST("internal_renumbering" AS VARCHAR) AS "internal_renumbering",
        CAST("invoice_indicator" AS VARCHAR) AS "invoice_indicator",
        CAST("item_group" AS VARCHAR) AS "item_group",
        CAST("item_key" AS VARCHAR) AS "item_key",
        CAST("kblnr" AS VARCHAR) AS "kblnr",
        CAST("kostl" AS VARCHAR) AS "kostl",
        CAST("legacy_clearing_indicator" AS VARCHAR) AS "legacy_clearing_indicator",
        CAST("line_item_identifier" AS VARCHAR) AS "line_item_identifier",
        CAST("line_item_reference" AS VARCHAR) AS "line_item_reference",
        CAST("line_number_range" AS VARCHAR) AS "line_number_range",
        CAST("line_of_business" AS VARCHAR) AS "line_of_business",
        CAST("location" AS VARCHAR) AS "location",
        CAST("lokkt" AS VARCHAR) AS "lokkt",
        CAST("maber" AS VARCHAR) AS "maber",
        CAST("main_asset_number" AS VARCHAR) AS "main_asset_number",
        CAST("mandate_id" AS VARCHAR) AS "mandate_id",
        CAST("manual_entry_indicator" AS VARCHAR) AS "manual_entry_indicator",
        CAST("manual_split_indicator" AS VARCHAR) AS "manual_split_indicator",
        CAST("manual_stats_update" AS VARCHAR) AS "manual_stats_update",
        CAST("material_document_date" AS VARCHAR) AS "material_document_date",
        CAST("material_number" AS VARCHAR) AS "material_number",
        CAST("material_reason_code" AS VARCHAR) AS "material_reason_code",
        CAST("measure" AS VARCHAR) AS "measure",
        CAST("name" AS VARCHAR) AS "name",
        CAST("negative_posting_indicator" AS VARCHAR) AS "negative_posting_indicator",
        CAST("net_payment_indicator" AS VARCHAR) AS "net_payment_indicator",
        CAST("network_activity_number" AS VARCHAR) AS "network_activity_number",
        CAST("noted_item_indicator" AS VARCHAR) AS "noted_item_indicator",
        CAST("open_item_management_indicator" AS VARCHAR) AS "open_item_management_indicator",
        -- A value of 0 marks an empty date and maps to NULL; any other value is parsed as YYYYMMDD.
        CASE
            WHEN "operation_begin_date" = '0' THEN NULL
            ELSE strptime(CAST("operation_begin_date" AS VARCHAR), '%Y%m%d')
        END AS "operation_begin_date",
        CASE
            WHEN "operation_end_date" = '0' THEN NULL
            ELSE strptime(CAST("operation_end_date" AS VARCHAR), '%Y%m%d')
        END AS "operation_end_date",
        CAST("order_number" AS VARCHAR) AS "order_number",
        CAST("original_document_number" AS VARCHAR) AS "original_document_number",
        CAST("pa_exchange_rate_indicator" AS VARCHAR) AS "pa_exchange_rate_indicator",
        CAST("partial_payment_indicator" AS VARCHAR) AS "partial_payment_indicator",
        CAST("partner_account_number" AS VARCHAR) AS "partner_account_number",
        CAST("partner_bank_type" AS VARCHAR) AS "partner_bank_type",
        CAST("partner_business_area" AS VARCHAR) AS "partner_business_area",
        CAST("partner_measurement_unit" AS VARCHAR) AS "partner_measurement_unit",
        CAST("payment_block" AS VARCHAR) AS "payment_block",
        CAST("payment_currency" AS VARCHAR) AS "payment_currency",
        CAST("payment_method" AS VARCHAR) AS "payment_method",
        CAST("payment_method_supplement" AS VARCHAR) AS "payment_method_supplement",
        CAST("payment_processing_indicator" AS VARCHAR) AS "payment_processing_indicator",
        CAST("payment_provider_transaction_id" AS VARCHAR) AS "payment_provider_transaction_id",
        CAST("payment_service_provider" AS VARCHAR) AS "payment_service_provider",
        CAST("payment_term" AS VARCHAR) AS "payment_term",
        CAST("payment_terms" AS VARCHAR) AS "payment_terms",
        CAST("payment_terms_key" AS VARCHAR) AS "payment_terms_key",
        CAST("pbudget_pd" AS VARCHAR) AS "pbudget_pd",
        CAST("pfkber" AS VARCHAR) AS "pfkber",
        CAST("pgeber" AS VARCHAR) AS "pgeber",
        CAST("pgrant_nbr" AS VARCHAR) AS "pgrant_nbr",
        CAST("planning_group" AS VARCHAR) AS "planning_group",
        CAST("planning_level" AS VARCHAR) AS "planning_level",
        CAST("plant" AS VARCHAR) AS "plant",
        CAST("po_history_category" AS VARCHAR) AS "po_history_category",
        CAST("prctr" AS VARCHAR) AS "prctr",
        CAST("price_control_indicator" AS VARCHAR) AS "price_control_indicator",
        CAST("price_reason_code" AS VARCHAR) AS "price_reason_code",
        CAST("product" AS VARCHAR) AS "product",
        CAST("profit_center" AS VARCHAR) AS "profit_center",
        CAST("profitability_segment" AS VARCHAR) AS "profitability_segment",
        CAST("project_number" AS VARCHAR) AS "project_number",
        CAST("psalt" AS VARCHAR) AS "psalt",
        CAST("psegment" AS VARCHAR) AS "psegment",
        CAST("purchase_order_number" AS VARCHAR) AS "purchase_order_number",
        CAST("quantity_reason_code" AS VARCHAR) AS "quantity_reason_code",
        CAST("re_account" AS VARCHAR) AS "re_account",
        CAST("re_company_code" AS VARCHAR) AS "re_company_code",
        CAST("ready_for_input_indicator" AS VARCHAR) AS "ready_for_input_indicator",
        CAST("reason_code" AS VARCHAR) AS "reason_code",
        CAST("rebzg" AS VARCHAR) AS "rebzg",
        CAST("record_id" AS VARCHAR) AS "record_id",
        CAST("record_reference" AS VARCHAR) AS "record_reference",
        CAST("recurring_entry_original_indicator" AS VARCHAR) AS "recurring_entry_original_indicator",
        CAST("reference_document_type" AS VARCHAR) AS "reference_document_type",
        CAST("reference_key_1" AS VARCHAR) AS "reference_key_1",
        CAST("reference_key_2" AS VARCHAR) AS "reference_key_2",
        CAST("reference_key_3" AS VARCHAR) AS "reference_key_3",
        CAST("region" AS VARCHAR) AS "region",
        CAST("reporting_currency" AS VARCHAR) AS "reporting_currency",
        CAST("reporting_value_type" AS VARCHAR) AS "reporting_value_type",
        CAST("reservation_reason_code" AS VARCHAR) AS "reservation_reason_code",
        CAST("reversal_transaction_type" AS VARCHAR) AS "reversal_transaction_type",
        CAST("sales_document_number" AS VARCHAR) AS "sales_document_number",
        CAST("sales_movement_type" AS VARCHAR) AS "sales_movement_type",
        CAST("sales_promotion_region" AS VARCHAR) AS "sales_promotion_region",
        CAST("sample_document_indicator" AS VARCHAR) AS "sample_document_indicator",
        CAST("section_code" AS VARCHAR) AS "section_code",
        CAST("securities_account" AS VARCHAR) AS "securities_account",
        CAST("segment" AS VARCHAR) AS "segment",
        CAST("service_indicator" AS VARCHAR) AS "service_indicator",
        CAST("shzuz" AS VARCHAR) AS "shzuz",
        CAST("single_statement_indicator" AS VARCHAR) AS "single_statement_indicator",
        CAST("special_gl_transaction_type" AS VARCHAR) AS "special_gl_transaction_type",
        CAST("special_index" AS VARCHAR) AS "special_index",
        CAST("squan" AS VARCHAR) AS "squan",
        CAST("state" AS VARCHAR) AS "state",
        CAST("subsequent_billing_indicator" AS VARCHAR) AS "subsequent_billing_indicator",
        CAST("tax_code_1" AS VARCHAR) AS "tax_code_1",
        CAST("tax_code_2" AS VARCHAR) AS "tax_code_2",
        CAST("tax_code_3" AS VARCHAR) AS "tax_code_3",
        CAST("tax_code_change_indicator" AS VARCHAR) AS "tax_code_change_indicator",
        CAST("tax_exempt_indicator" AS VARCHAR) AS "tax_exempt_indicator",
        CAST("tax_indicator" AS VARCHAR) AS "tax_indicator",
        CAST("tax_jurisdiction_code" AS VARCHAR) AS "tax_jurisdiction_code",
        CAST("tax_posting_reversal_indicator" AS VARCHAR) AS "tax_posting_reversal_indicator",
        CAST("tax_reporting_company_code" AS VARCHAR) AS "tax_reporting_company_code",
        CAST("text_reason_code" AS VARCHAR) AS "text_reason_code",
        CAST("trading_partner" AS VARCHAR) AS "trading_partner",
        CAST("umskz" AS VARCHAR) AS "umskz",
        CAST("us_federal_grant_key" AS VARCHAR) AS "us_federal_grant_key",
        CAST("user_field_1" AS VARCHAR) AS "user_field_1",
        CAST("user_field_2" AS VARCHAR) AS "user_field_2",
        CAST("user_field_3" AS VARCHAR) AS "user_field_3",
        CAST("valuation_type" AS VARCHAR) AS "valuation_type",
        CAST("vat_registration_number" AS VARCHAR) AS "vat_registration_number",
        CAST("vat_tax_code" AS VARCHAR) AS "vat_tax_code",
        CAST("vbel2" AS VARCHAR) AS "vbel2",
        CAST("vendor_number" AS VARCHAR) AS "vendor_number",
        CAST("wverw" AS VARCHAR) AS "wverw",
        CAST("xfakt" AS VARCHAR) AS "xfakt",
        CAST("xsauf" AS VARCHAR) AS "xsauf",
        CAST("zumsk" AS VARCHAR) AS "zumsk",
        CAST("zuonr" AS VARCHAR) AS "zuonr"
    FROM "sap_bseg_data_projected_renamed_cleaned"
),

"sap_bseg_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: 143 columns are 100.0 percent missing and are dropped
    -- account_assignment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- account_assignment_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- account_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- account_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- accounting_code_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- accounting_number_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- accounting_value_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- acquisition_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- activity_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- automatic_posting_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- balance_carryforward_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- balance_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- balance_sheet_update_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- balance_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- base_unit_of_measure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- base_unit_quantity has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- blocking_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- brazil_tax_upload has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- budget_pd has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_partner has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_place has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- business_process has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- bwkey has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- coding_block has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- consolidation_business_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- consolidation_functional_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- consolidation_region has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cost_element has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- cost_object has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- country_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- credit_control_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customer_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- customs_tariff has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- discount_base_period_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- distribution_channel has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- document_line_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- erfme has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- fistl has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- geber has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- gl_account_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- goods_movement_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- goods_recipient has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- grant_nbr has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- grant_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- header_comment_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- house_bank_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- hzuon has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ignore_invoice_reference has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- internal_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- internal_renumbering has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- invoice_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- item_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- item_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- kblnr has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- legacy_clearing_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- line_item_identifier has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- line_of_business has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- location has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- maber has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- manual_entry_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- manual_split_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- material_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- measure has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- name has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- negative_posting_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- net_payment_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- noted_item_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- open_item_management_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- operation_begin_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- operation_end_date has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pa_exchange_rate_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partial_payment_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_account_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_bank_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_business_area has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- partner_measurement_unit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pbudget_pd has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pfkber has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pgeber has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- pgrant_nbr has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- planning_group has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- planning_level has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- plant has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- po_history_category has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- price_control_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- price_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- product has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- profit_center has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- profitability_segment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- project_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psalt has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- psegment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- purchase_order_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- quantity_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- re_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- re_company_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- ready_for_input_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- rebzg has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- record_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- record_reference has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- recurring_entry_original_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_document_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_key_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_key_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_key_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- region has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reporting_currency has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reporting_value_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reservation_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reversal_transaction_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_document_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_movement_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sales_promotion_region has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- sample_document_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- section_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- securities_account has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- segment has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- service_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- shzuz has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- single_statement_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- special_gl_transaction_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- special_index has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- squan has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- state has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- subsequent_billing_indicator has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- tax_reporting_company_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- text_reason_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- trading_partner has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- umskz has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- us_federal_grant_key has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- user_field_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- user_field_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- user_field_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- valuation_type has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vbel2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- vendor_number has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- wverw has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xfakt has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- xsauf has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- zumsk has 100.0 percent missing. Strategy: 🗑️ Drop Column
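    -- Sketch only, not part of the generated pipeline: one way to re-check a reported
    -- missing percentage before dropping a column. The input CTE name is an assumption
    -- inferred from the naming pattern of the surrounding steps, and the query would
    -- have to run inside the same WITH chain.
    --   SELECT
    --       100.0 * (COUNT(*) - COUNT("account_assignment")) / NULLIF(COUNT(*), 0) AS pct_missing
    --   FROM "sap_bseg_data_projected_renamed_cleaned_casted";
    -- COUNT(column) skips NULLs, so a column that is entirely NULL yields 100.0;
    -- NULLIF guards against division by zero on an empty input.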
    SELECT
        "buzei",
        "fiscal_year",
        "clearing_date",
        "clearing_fiscal_period",
        "posting_key",
        "koart",
        "debit_credit_indicator",
        "total_local_amount",
        "wrbtr",
        "document_currency_amount",
        "transaction_amount",
        "transaction_currency",
        "transaction_currency_tax_base",
        "tax_base_amount",
        "foreign_tax_amount",
        "tax_amount",
        "home_currency_amount",
        "functional_area_amount",
        "home_currency_assignment_amount",
        "functional_area_assignment_amount",
        "tax_group",
        "accounting_value_3",
        "kursr",
        "gbetr",
        "valuation_difference",
        "valuation_difference_2",
        "value_date",
        "line_item_text",
        "transaction_type",
        "planned_price",
        "factory_calendar_date",
        "position_number",
        "delivery_date",
        "asset_value_date",
        "personnel_number",
        "document_reversal_indicator",
        "payment_indicator",
        "gvtyp",
        "due_date_baseline",
        "cash_discount_days_1",
        "cash_discount_days_2",
        "net_payment_terms_days",
        "cash_discount_percent_1",
        "cash_discount_percent_2",
        "skfbt",
        "transaction_currency_amount",
        "cash_discount_amount",
        "net_amount",
        "local_amount_1",
        "withholding_tax_amount_1",
        "local_amount_2",
        "withholding_tax_amount_2",
        "local_amount_3",
        "withholding_tax_amount_3",
        "reference_fiscal_year",
        "reference_line_item",
        "customs_amount",
        "sample_number",
        "depreciation_period",
        "insurance_date",
        "discount_base_year",
        "discount_base_period",
        "asset_acquisition_year",
        "apc_area",
        "document_amount_local",
        "balance_carryforward",
        "klibt",
        "accounting_value_1",
        "accounting_value_2",
        "foreign_non_deductible_tax_base",
        "non_deductible_input_tax",
        "menge",
        "erfmg",
        "bpmng",
        "purchase_order_item_number",
        "account_assignment_sequence",
        "peinh",
        "reference_amount",
        "reference_exchange_rate",
        "investment_support_amount",
        "bualt",
        "net_price",
        "accounting_value_4",
        "difference_value_3",
        "difference_value_1",
        "days_in_arrears",
        "option_selection",
        "order_item_number",
        "asset_sequence_number",
        "project_key",
        "profitability_segment_number",
        "profitability_subsegment_number",
        "dmbe2",
        "group_currency_amount",
        "dmb21",
        "dmb22",
        "dmb23",
        "group_currency_amount_1",
        "group_currency_amount_2",
        "group_currency_amount_3",
        "local_tax_amount",
        "document_tax_amount",
        "local_non_deductible_tax_base",
        "document_non_deductible_tax_base",
        "second_local_currency_amount",
        "third_local_currency_amount",
        "valuation_difference_3",
        "difference_value_2",
        "second_local_currency_tax_base",
        "third_local_currency_tax_base",
        "kblpos",
        "local_currency_tax_amount",
        "original_line_item_number",
        "reference_line_indicator",
        "tax_amount_1",
        "tax_amount_2",
        "tax_amount_3",
        "tax_amount_4",
        "write_off_amount",
        "tax_reporting_date",
        "settlement_period",
        "payment_amount",
        "price_difference",
        "price_difference_2",
        "price_difference_3",
        "penlc1",
        "penlc2",
        "penlc3",
        "foreign_currency_amount",
        "number_of_days",
        "sales_use_tax_amount",
        "clearing_fiscal_year",
        "tax_splitting",
        "original_fiscal_year",
        "original_document_line_number",
        "original_line_item_sequence",
        "production_period",
        "row_id",
        "is_deleted",
        "alternative_account",
        "asset_acquisition_period",
        "asset_subnumber",
        "asset_transaction_type",
        "asset_valuation_type",
        "belnr",
        "bill_of_exchange_procedure",
        "billing_block",
        "billing_type",
        "business_area",
        "cession_indicator",
        "clearing_document_number",
        "clearing_reversal_indicator",
        "clearing_with_down_payment",
        "client_id",
        "company_code",
        "contract_number",
        "contract_type",
        "controlling_area",
        "delivery_block",
        "delivery_completed",
        "dunning_level",
        "fast_pay_indicator",
        "financial_position",
        "fixed_payment_terms_indicator",
        "foreign_currency_valuation_type",
        "funds_center",
        "funds_center_description",
        "funds_reservation_number",
        "gl_account",
        "gr_ir_clearing_number",
        "grir_clearing_reversal_indicator",
        "insurance_indicator",
        "insurance_reason_code",
        "kostl",
        "line_item_reference",
        "line_number_range",
        "lokkt",
        "main_asset_number",
        "mandate_id",
        "manual_stats_update",
        "material_document_date",
        "network_activity_number",
        "order_number",
        "original_document_number",
        "payment_block",
        "payment_currency",
        "payment_method",
        "payment_method_supplement",
        "payment_processing_indicator",
        "payment_provider_transaction_id",
        "payment_service_provider",
        "payment_term",
        "payment_terms",
        "payment_terms_key",
        "prctr",
        "tax_code_1",
        "tax_code_2",
        "tax_code_3",
        "tax_code_change_indicator",
        "tax_exempt_indicator",
        "tax_indicator",
        "tax_jurisdiction_code",
        "tax_posting_reversal_indicator",
        "vat_registration_number",
        "vat_tax_code",
        "zuonr"
    FROM "sap_bseg_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_bseg_data_projected_renamed_cleaned_casted_missing_handled"

stg_sap_bseg_data.yml (Document the table)

version: 2
models:
- name: stg_sap_bseg_data
  description: Accounting document line items from SAP table BSEG. Each row represents
    a single line item in an accounting document and carries the document number, company
    code, fiscal year, posting key, account type, amounts, currencies, cost center, and
    other financial dimensions of the posting.
  columns:
  - name: buzei
    description: Line item number within the accounting document (SAP field BUZEI)
    tests:
    - not_null
  - name: fiscal_year
    description: Fiscal year
    tests:
    - not_null
  - name: clearing_date
    description: Clearing date
    tests:
    - not_null
  - name: clearing_fiscal_period
    description: Clearing fiscal period
    tests:
    - not_null
  - name: posting_key
    description: Posting key
    tests:
    - not_null
  - name: koart
    description: Account type (SAP field KOART)
    tests:
    - not_null
  - name: debit_credit_indicator
    description: Debit/Credit indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - D
        - C
        - h
  - name: total_local_amount
    description: Amount in local currency
    tests:
    - not_null
  - name: wrbtr
    description: Amount in document currency (SAP field WRBTR)
    tests:
    - not_null
  - name: document_currency_amount
    description: Amount in document currency
    tests:
    - not_null
  - name: transaction_amount
    description: Transaction amount
    tests:
    - not_null
  - name: transaction_currency
    description: Transaction currency
    tests:
    - not_null
  - name: transaction_currency_tax_base
    description: Tax base amount in transaction currency
    tests:
    - not_null
  - name: tax_base_amount
    description: Tax base amount
    tests:
    - not_null
  - name: foreign_tax_amount
    description: Tax amount in foreign currency
    tests:
    - not_null
  - name: tax_amount
    description: Tax amount in document currency
    tests:
    - not_null
  - name: home_currency_amount
    description: Amount in home currency
    tests:
    - not_null
  - name: functional_area_amount
    description: Amount in functional area currency
    tests:
    - not_null
  - name: home_currency_assignment_amount
    description: Assignment amount in home currency
    tests:
    - not_null
  - name: functional_area_assignment_amount
    description: Assignment amount in functional area currency
    tests:
    - not_null
  - name: tax_group
    description: Tax group
    tests:
    - not_null
  - name: accounting_value_3
    description: Unknown accounting value
    tests:
    - not_null
  - name: kursr
    description: Hedged exchange rate (SAP field KURSR)
    tests:
    - not_null
  - name: gbetr
    description: Hedged amount in foreign currency (SAP field GBETR)
    tests:
    - not_null
  - name: valuation_difference
    description: Valuation difference
    tests:
    - not_null
  - name: valuation_difference_2
    description: Valuation difference 2
    tests:
    - not_null
  - name: value_date
    description: Value date
    tests:
    - not_null
  - name: line_item_text
    description: Line item text
    tests:
    - not_null
    - accepted_values:
        values:
        - soll-buchung
        - haben-buchung
  - name: transaction_type
    description: Transaction type
    tests:
    - not_null
    - accepted_values:
        values:
        - rfbu
        - rfsl
        - buy
        - sell
        - dep
        - wth
        - div
        - int
        - fee
        - tfr
        - adj
  - name: planned_price
    description: Planned price
    tests:
    - not_null
  - name: factory_calendar_date
    description: Factory calendar date
    tests:
    - not_null
  - name: position_number
    description: Position number
    tests:
    - not_null
  - name: delivery_date
    description: Delivery date
    tests:
    - not_null
  - name: asset_value_date
    description: Asset value date
    tests:
    - not_null
  - name: personnel_number
    description: Personnel number
    tests:
    - not_null
  - name: document_reversal_indicator
    description: Indicator for document reversal
    tests:
    - not_null
    - accepted_values:
        values:
        - x
        - ''
  - name: payment_indicator
    description: Indicator for payment
    tests:
    - not_null
    - accepted_values:
        values:
        - x
        - ''
        - None
  - name: gvtyp
    description: P&L statement account type (SAP field GVTYP)
    tests:
    - not_null
    - accepted_values:
        values:
        - x
  - name: due_date_baseline
    description: Baseline date for due date calculation
    tests:
    - not_null
  - name: cash_discount_days_1
    description: Cash discount days 1
    tests:
    - not_null
  - name: cash_discount_days_2
    description: Cash discount days 2
    tests:
    - not_null
  - name: net_payment_terms_days
    description: Net payment terms in days
    tests:
    - not_null
  - name: cash_discount_percent_1
    description: Cash discount percentage 1
    tests:
    - not_null
  - name: cash_discount_percent_2
    description: Cash discount percentage 2
    tests:
    - not_null
  - name: skfbt
    description: Amount eligible for cash discount in document currency (SAP field SKFBT)
    tests:
    - not_null
  - name: transaction_currency_amount
    description: Amount in transaction currency
    tests:
    - not_null
  - name: cash_discount_amount
    description: Cash discount amount in document currency
    tests:
    - not_null
  - name: net_amount
    description: Net amount in document currency
    tests:
    - not_null
  - name: local_amount_1
    description: Local currency amount 1
    tests:
    - not_null
  - name: withholding_tax_amount_1
    description: Withholding tax amount 1
    tests:
    - not_null
  - name: local_amount_2
    description: Local currency amount 2
    tests:
    - not_null
  - name: withholding_tax_amount_2
    description: Withholding tax amount 2
    tests:
    - not_null
  - name: local_amount_3
    description: Local currency amount 3
    tests:
    - not_null
  - name: withholding_tax_amount_3
    description: Withholding tax amount 3
    tests:
    - not_null
  - name: reference_fiscal_year
    description: Reference fiscal year
    tests:
    - not_null
  - name: reference_line_item
    description: Reference line item
    tests:
    - not_null
  - name: customs_amount
    description: Customs amount or duty
    tests:
    - not_null
  - name: sample_number
    description: Sample number
    tests:
    - not_null
  - name: depreciation_period
    description: Depreciation period
    tests:
    - not_null
  - name: insurance_date
    description: Insurance date
    tests:
    - not_null
  - name: discount_base_year
    description: Discount base year
    tests:
    - not_null
  - name: discount_base_period
    description: Discount base period
    tests:
    - not_null
  - name: asset_acquisition_year
    description: Asset acquisition year
    tests:
    - not_null
  - name: apc_area
    description: Acquisition and production costs (APC) area
    tests:
    - not_null
  - name: document_amount_local
    description: Document amount in local currency
    tests:
    - not_null
  - name: balance_carryforward
    description: Balance carryforward
    tests:
    - not_null
  - name: klibt
    description: Credit control amount (SAP field KLIBT)
    tests:
    - not_null
  - name: accounting_value_1
    description: Unknown accounting value
    tests:
    - not_null
  - name: accounting_value_2
    description: Unknown accounting value
    tests:
    - not_null
  - name: foreign_non_deductible_tax_base
    description: Non-deductible tax base in foreign currency
    tests:
    - not_null
  - name: non_deductible_input_tax
    description: Non-deductible input tax amount
    tests:
    - not_null
  - name: menge
    description: Quantity (SAP field MENGE)
    tests:
    - not_null
  - name: erfmg
    description: Quantity in unit of entry (SAP field ERFMG)
    tests:
    - not_null
  - name: bpmng
    description: Quantity in purchase order price unit (SAP field BPMNG)
    tests:
    - not_null
  - name: purchase_order_item_number
    description: Purchase order item number
    tests:
    - not_null
  - name: account_assignment_sequence
    description: Sequential number of account assignment
    tests:
    - not_null
  - name: peinh
    description: Price unit (SAP field PEINH)
    tests:
    - not_null
  - name: reference_amount
    description: Reference amount
    tests:
    - not_null
  - name: reference_exchange_rate
    description: Reference exchange rate
    tests:
    - not_null
  - name: investment_support_amount
    description: Investment support amount
    tests:
    - not_null
  - name: bualt
    description: Amount posted in alternative price control (SAP field BUALT)
    tests:
    - not_null
  - name: net_price
    description: Net price
    tests:
    - not_null
  - name: accounting_value_4
    description: Unknown accounting value
    tests:
    - not_null
  - name: difference_value_3
    description: Unknown difference value
    tests:
    - not_null
  - name: difference_value_1
    description: Unknown difference value
    tests:
    - not_null
  - name: days_in_arrears
    description: Days in arrears
    tests:
    - not_null
  - name: option_selection
    description: Option selection
    tests:
    - not_null
  - name: order_item_number
    description: Order item number
    tests:
    - not_null
  - name: asset_sequence_number
    description: Asset sequential number
    tests:
    - not_null
  - name: project_key
    description: Project key
    tests:
    - not_null
  - name: profitability_segment_number
    description: Profitability segment number
    tests:
    - not_null
  - name: profitability_subsegment_number
    description: Subnumber of profitability segment
    tests:
    - not_null
  - name: dmbe2
    description: Amount in second local currency (SAP field DMBE2)
    tests:
    - not_null
  - name: group_currency_amount
    description: Group currency amount
    tests:
    - not_null
  - name: dmb21
    description: ''
    tests:
    - not_null
  - name: dmb22
    description: ''
    tests:
    - not_null
  - name: dmb23
    description: ''
    tests:
    - not_null
  - name: group_currency_amount_1
    description: Group currency amount 1
    tests:
    - not_null
  - name: group_currency_amount_2
    description: Group currency amount 2
    tests:
    - not_null
  - name: group_currency_amount_3
    description: Group currency amount 3
    tests:
    - not_null
  - name: local_tax_amount
    description: Tax amount in local currency
    tests:
    - not_null
  - name: document_tax_amount
    description: Tax amount in document currency
    tests:
    - not_null
  - name: local_non_deductible_tax_base
    description: Non-deductible tax base amount in local currency
    tests:
    - not_null
  - name: document_non_deductible_tax_base
    description: Non-deductible tax base in document currency
    tests:
    - not_null
  - name: second_local_currency_amount
    description: Amount in second local currency
    tests:
    - not_null
  - name: third_local_currency_amount
    description: Amount in third local currency
    tests:
    - not_null
  - name: valuation_difference_3
    description: Valuation difference 3
    tests:
    - not_null
  - name: difference_value_2
    description: Unknown difference value
    tests:
    - not_null
  - name: second_local_currency_tax_base
    description: Tax base amount in second local currency
    tests:
    - not_null
  - name: third_local_currency_tax_base
    description: Tax base amount in third local currency
    tests:
    - not_null
  - name: kblpos
    description: Earmarked funds document item (SAP field KBLPOS)
    tests:
    - not_null
  - name: local_currency_tax_amount
    description: Tax amount in local currency
    tests:
    - not_null
  - name: original_line_item_number
    description: Original line item number
    tests:
    - not_null
  - name: reference_line_indicator
    description: Reference line indicator
    tests:
    - not_null
  - name: tax_amount_1
    description: Tax amount 1
    tests:
    - not_null
  - name: tax_amount_2
    description: Tax amount 2
    tests:
    - not_null
  - name: tax_amount_3
    description: Tax amount 3
    tests:
    - not_null
  - name: tax_amount_4
    description: Tax amount 4
    tests:
    - not_null
  - name: write_off_amount
    description: Write-off amount
    tests:
    - not_null
  - name: tax_reporting_date
    description: Tax reporting date
    tests:
    - not_null
  - name: settlement_period
    description: Settlement period
    tests:
    - not_null
  - name: payment_amount
    description: Payment amount
    tests:
    - not_null
  - name: price_difference
    description: Price difference
    tests:
    - not_null
  - name: price_difference_2
    description: Price difference 2
    tests:
    - not_null
  - name: price_difference_3
    description: Price difference 3
    tests:
    - not_null
  - name: penlc1
    description: Penalty charge amount in first local currency (SAP field PENLC1)
    tests:
    - not_null
  - name: penlc2
    description: Penalty charge amount in second local currency (SAP field PENLC2)
    tests:
    - not_null
  - name: penlc3
    description: Penalty charge amount in third local currency (SAP field PENLC3)
    tests:
    - not_null
  - name: foreign_currency_amount
    description: Foreign currency amount
    tests:
    - not_null
  - name: number_of_days
    description: Number of days
    tests:
    - not_null
  - name: sales_use_tax_amount
    description: Sales and use tax amount
    tests:
    - not_null
  - name: clearing_fiscal_year
    description: Clearing fiscal year
    tests:
    - not_null
  - name: tax_splitting
    description: Tax splitting
    tests:
    - not_null
  - name: original_fiscal_year
    description: Fiscal year of original entry
    tests:
    - not_null
  - name: original_document_line_number
    description: Line item number in original document
    tests:
    - not_null
  - name: original_line_item_sequence
    description: Sequential number of original line item
    tests:
    - not_null
  - name: production_period
    description: Production period
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is likely a unique identifier for each row in the table.
        For this table, each row represents a single line item in an accounting document.
        row_id is likely to be unique across rows as it's designed to uniquely identify
        each record.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: alternative_account
    description: Alternative account number
    tests:
    - not_null
  - name: asset_acquisition_period
    description: Asset acquisition period
    cocoon_meta:
      missing_acceptable: Not applicable for non-asset transactions
  - name: asset_subnumber
    description: Asset subnumber
    cocoon_meta:
      missing_acceptable: Not applicable for non-asset or single-asset transactions
  - name: asset_transaction_type
    description: Asset transaction type
    cocoon_meta:
      missing_acceptable: Not applicable for non-asset transactions
  - name: asset_valuation_type
    description: Asset valuation type
    cocoon_meta:
      missing_acceptable: Not applicable for non-asset transactions
  - name: belnr
    description: Accounting document number (SAP field BELNR)
    tests:
    - not_null
  - name: bill_of_exchange_procedure
    description: Bill of exchange procedure
    cocoon_meta:
      missing_acceptable: Not applicable for transactions not involving bills of exchange
  - name: billing_block
    description: Billing block
    cocoon_meta:
      missing_acceptable: Not applicable if billing is not relevant
  - name: billing_type
    description: Billing type
    cocoon_meta:
      missing_acceptable: Not applicable if transaction doesn't involve billing
  - name: business_area
    description: Business area
    tests:
    - not_null
  - name: cession_indicator
    description: Cession indicator
    cocoon_meta:
      missing_acceptable: May not apply to all types of financial transactions.
  - name: clearing_document_number
    description: Clearing document number
    cocoon_meta:
      missing_acceptable: Only applicable for cleared transactions.
  - name: clearing_reversal_indicator
    description: Indicator for clearing reversal
    cocoon_meta:
      missing_acceptable: Only relevant for reversed clearing entries.
  - name: clearing_with_down_payment
    description: Clearing with down payment indicator
    cocoon_meta:
      missing_acceptable: Only applicable for transactions involving down payments.
  - name: client_id
    description: Client identifier in SAP system
    tests:
    - not_null
  - name: company_code
    description: Company code
    tests:
    - not_null
  - name: contract_number
    description: Contract number
    cocoon_meta:
      missing_acceptable: Not all transactions are associated with a contract.
  - name: contract_type
    description: Contract type
    cocoon_meta:
      missing_acceptable: Only applicable if a contract exists.
  - name: controlling_area
    description: Controlling area
    tests:
    - not_null
  - name: delivery_block
    description: Delivery block
    cocoon_meta:
      missing_acceptable: Only relevant for transactions involving physical deliveries.
  - name: delivery_completed
    description: Delivery completed indicator
    cocoon_meta:
      missing_acceptable: Only applicable for transactions involving deliveries.
  - name: dunning_level
    description: Dunning level
    cocoon_meta:
      missing_acceptable: Only applicable for overdue payments or accounts.
  - name: fast_pay_indicator
    description: Fast pay indicator
    cocoon_meta:
      missing_acceptable: May not apply if no fast payment option exists.
  - name: financial_position
    description: Financial position
    tests:
    - not_null
  - name: fixed_payment_terms_indicator
    description: Indicator for fixed payment terms
    cocoon_meta:
      missing_acceptable: May not apply if payment terms are variable.
  - name: foreign_currency_valuation_type
    description: Type of foreign currency valuation
    cocoon_meta:
      missing_acceptable: Not applicable if only dealing in local currency.
  - name: funds_center
    description: Funds center
    tests:
    - not_null
  - name: funds_center_description
    description: Long text for funds center
    tests:
    - not_null
  - name: funds_reservation_number
    description: Funds reservation number
    tests:
    - not_null
  - name: gl_account
    description: General ledger account
    tests:
    - not_null
  - name: gr_ir_clearing_number
    description: GR/IR clearing number
    cocoon_meta:
      missing_acceptable: May not apply if no goods receipt/invoice receipt clearing.
  - name: grir_clearing_reversal_indicator
    description: Indicator for reversal of GR/IR clearing
    cocoon_meta:
      missing_acceptable: Not applicable if no GR/IR clearing reversal process.
  - name: insurance_indicator
    description: Insurance indicator
    cocoon_meta:
      missing_acceptable: May not apply if insurance is not relevant.
  - name: insurance_reason_code
    description: Insurance-related reason code
    cocoon_meta:
      missing_acceptable: Not applicable if insurance is not involved.
  - name: kostl
    description: Cost center (SAP field KOSTL)
    tests:
    - not_null
  - name: line_item_reference
    description: Line item reference
    tests:
    - not_null
  - name: line_number_range
    description: Line number range
    tests:
    - not_null
  - name: lokkt
    description: Alternative account number in company code (SAP field LOKKT)
    tests:
    - not_null
  - name: main_asset_number
    description: Main asset number
    cocoon_meta:
      missing_acceptable: Only applicable for asset-related transactions.
  - name: mandate_id
    description: Mandate identification
    cocoon_meta:
      missing_acceptable: Only applicable for transactions involving mandates.
  - name: manual_stats_update
    description: Manual statistics update indicator
    tests:
    - not_null
    - accepted_values:
        values:
        - '0'
        - '1'
  - name: material_document_date
    description: Material document date
    tests:
    - not_null
  - name: network_activity_number
    description: Network activity number
    cocoon_meta:
      missing_acceptable: Only applicable for network-related activities.
  - name: order_number
    description: Order number
    cocoon_meta:
      missing_acceptable: Only applicable for order-related transactions.
  - name: original_document_number
    description: Document number of original entry
    cocoon_meta:
      missing_acceptable: Only applicable for transactions referencing original documents.
  - name: payment_block
    description: Payment block indicator
    cocoon_meta:
      missing_acceptable: Not applicable for transactions without payment restrictions
  - name: payment_currency
    description: Payment currency
    cocoon_meta:
      missing_acceptable: Not applicable for non-monetary or internal transactions
  - name: payment_method
    description: Payment method or terms code
    cocoon_meta:
      missing_acceptable: Not applicable for non-payment transactions
  - name: payment_method_supplement
    description: Payment method supplement
    cocoon_meta:
      missing_acceptable: Not applicable if no specific payment method used
  - name: payment_processing_indicator
    description: Indicator for payment processing
    cocoon_meta:
      missing_acceptable: Not applicable for non-payment transactions
  - name: payment_provider_transaction_id
    description: Payment service provider transaction ID
    cocoon_meta:
      missing_acceptable: Not applicable for internal or non-provider transactions
  - name: payment_service_provider
    description: Payment service provider
    cocoon_meta:
      missing_acceptable: Not applicable for transactions without external payment
        providers
  - name: payment_term
    description: Payment term
    cocoon_meta:
      missing_acceptable: Not applicable for immediate or non-payment transactions
  - name: payment_terms
    description: Payment terms key
    cocoon_meta:
      missing_acceptable: Not applicable for immediate or non-payment transactions
  - name: payment_terms_key
    description: Payment terms key
    cocoon_meta:
      missing_acceptable: Not applicable for transactions without specific payment
        terms
  - name: prctr
    description: Profit center (SAP field PRCTR)
    tests:
    - not_null
  - name: tax_code_1
    description: Tax code 1
    cocoon_meta:
      missing_acceptable: Not applicable for non-taxable transactions
  - name: tax_code_2
    description: Tax code 2
    cocoon_meta:
      missing_acceptable: Not applicable for transactions with single or no tax
  - name: tax_code_3
    description: Tax code 3
    cocoon_meta:
      missing_acceptable: Not applicable for transactions with two or fewer taxes
  - name: tax_code_change_indicator
    description: Indicator for tax code changes
    cocoon_meta:
      missing_acceptable: Not applicable if tax code hasn't changed
  - name: tax_exempt_indicator
    description: Tax exempt indicator
    cocoon_meta:
      missing_acceptable: Not applicable for non-exempt transactions
  - name: tax_indicator
    description: Tax indicator
    cocoon_meta:
      missing_acceptable: Not applicable for non-taxable transactions
  - name: tax_jurisdiction_code
    description: Tax jurisdiction code
    cocoon_meta:
      missing_acceptable: Not applicable for transactions without specific tax jurisdiction
  - name: tax_posting_reversal_indicator
    description: Indicator for reversal of tax posting
    cocoon_meta:
      missing_acceptable: Not applicable for non-reversed tax postings
  - name: vat_registration_number
    description: VAT registration number
    cocoon_meta:
      missing_acceptable: Not applicable for non-VAT transactions or entities
  - name: vat_tax_code
    description: Tax code for VAT
    cocoon_meta:
      missing_acceptable: Not applicable for non-VAT transactions
  - name: zuonr
    description: Assignment number (SAP field ZUONR)
    tests:
    - not_null
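
The not_null, unique, and accepted_values tests declared above each come down to a query that returns the offending rows; a test passes when that query returns nothing. The sketch below illustrates the idea on a hypothetical three-row stand-in for stg_sap_bseg_data. The sample values, the reduced column set, and the helper failing_rows are assumptions made for illustration, sqlite3 stands in for the warehouse only because it ships with Python, and the queries are rough equivalents rather than the exact SQL that dbt compiles.

```python
import sqlite3

# Hypothetical stand-in for stg_sap_bseg_data; the values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stg_sap_bseg_data "
    "(row_id INTEGER, belnr TEXT, manual_stats_update TEXT)"
)
conn.executemany(
    "INSERT INTO stg_sap_bseg_data VALUES (?, ?, ?)",
    [(1, "100000001", "0"), (2, "100000002", "1"), (2, None, "7")],
)


def failing_rows(sql):
    """Run a check query; every returned row is one violation."""
    return conn.execute(sql).fetchall()


# not_null on belnr: rows where the column is NULL.
not_null_failures = failing_rows(
    "SELECT row_id FROM stg_sap_bseg_data WHERE belnr IS NULL"
)

# unique on row_id: values that occur more than once.
unique_failures = failing_rows(
    "SELECT row_id, COUNT(*) FROM stg_sap_bseg_data "
    "GROUP BY row_id HAVING COUNT(*) > 1"
)

# accepted_values on manual_stats_update: values outside ('0', '1').
accepted_failures = failing_rows(
    "SELECT DISTINCT manual_stats_update FROM stg_sap_bseg_data "
    "WHERE manual_stats_update NOT IN ('0', '1')"
)

print(not_null_failures, unique_failures, accepted_failures)
```

In this made-up sample the third row violates all three checks, so each query returns one row. Against the real model the same checks would normally be run with dbt test rather than by hand.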

stg_sap_t880_data (first 100 rows)

company_name country language_code street_address postal_code city currency row_id is_deleted branch_code company_code company_name_secondary main_client_code mandate_number restaurant_code street_address_secondary
0 Willy Wonka Chocolate Factory US D 1445 West Norwood Avenue 11223 Walldorf USD 5 False None 1 None 0 800 None None
1 Holmes And Watson UK D 221B Baker Street NW1 6XE London GBP 6 False None 5 None 0 800 None None
2 Nakatomi Plaza US E 2121 Avenue of the Stars 60154 Los Angeles USD 7 False None 6 None 0 800 None None

stg_sap_t880_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 14:57:07.957386+00:00
WITH 
"sap_t880_data_projected" AS (
    -- Projection: Selecting 23 out of 24 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "mandt",
        "rcomp",
        "name1",
        "cntry",
        "name2",
        "langu",
        "stret",
        "pobox",
        "pstlc",
        "city",
        "curr",
        "modcp",
        "glsip",
        "resta",
        "rform",
        "zweig",
        "mcomp",
        "mclnt",
        "lccomp",
        "strt2",
        "indpo",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_t880_data"
),

"sap_t880_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- mandt -> mandate_number
    -- rcomp -> company_code
    -- name1 -> company_name
    -- cntry -> country
    -- name2 -> company_name_secondary
    -- langu -> language_code
    -- stret -> street_address
    -- pobox -> po_box
    -- pstlc -> postal_code
    -- curr -> currency
    -- modcp -> model_company
    -- glsip -> global_site_id
    -- resta -> restaurant_code
    -- rform -> company_form
    -- zweig -> branch_code
    -- mcomp -> main_company_code
    -- mclnt -> main_client_code
    -- lccomp -> local_company_code
    -- strt2 -> street_address_secondary
    -- indpo -> industry_position
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "mandt" AS "mandate_number",
        "rcomp" AS "company_code",
        "name1" AS "company_name",
        "cntry" AS "country",
        "name2" AS "company_name_secondary",
        "langu" AS "language_code",
        "stret" AS "street_address",
        "pobox" AS "po_box",
        "pstlc" AS "postal_code",
        "city",
        "curr" AS "currency",
        "modcp" AS "model_company",
        "glsip" AS "global_site_id",
        "resta" AS "restaurant_code",
        "rform" AS "company_form",
        "zweig" AS "branch_code",
        "mcomp" AS "main_company_code",
        "mclnt" AS "main_client_code",
        "lccomp" AS "local_company_code",
        "strt2" AS "street_address_secondary",
        "indpo" AS "industry_position",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_t880_data_projected"
),

"sap_t880_data_projected_renamed_cleaned" AS (
    -- Clean unusual string values: 
    -- language_code: 'D' and 'E' are not standard language codes; standard codes use two letters (ISO 639-1) or three letters (ISO 639-2/3). 'D' likely stands for 'Deutsch' (German) and 'E' for 'English', so they are mapped to the ISO 639-1 codes 'de' and 'en'.
    SELECT
        "mandate_number",
        "company_code",
        "company_name",
        "country",
        "company_name_secondary",
        CASE
            WHEN "language_code" = 'D' THEN 'de'
            WHEN "language_code" = 'E' THEN 'en'
            ELSE "language_code"
        END AS "language_code",
        "street_address",
        "po_box",
        "postal_code",
        "city",
        "currency",
        "model_company",
        "global_site_id",
        "restaurant_code",
        "company_form",
        "branch_code",
        "main_company_code",
        "main_client_code",
        "local_company_code",
        "street_address_secondary",
        "industry_position",
        "row_id",
        "is_deleted"
    FROM "sap_t880_data_projected_renamed"
),

"sap_t880_data_projected_renamed_cleaned_casted" AS (
    -- Column Type Casting: 
    -- branch_code: from DECIMAL to VARCHAR
    -- company_code: from INT to VARCHAR
    -- company_form: from DECIMAL to VARCHAR
    -- company_name_secondary: from DECIMAL to VARCHAR
    -- global_site_id: from DECIMAL to VARCHAR
    -- industry_position: from DECIMAL to VARCHAR
    -- local_company_code: from DECIMAL to VARCHAR
    -- main_client_code: from INT to VARCHAR
    -- main_company_code: from DECIMAL to VARCHAR
    -- mandate_number: from INT to VARCHAR
    -- model_company: from DECIMAL to VARCHAR
    -- po_box: from DECIMAL to VARCHAR
    -- restaurant_code: from DECIMAL to VARCHAR
    -- street_address_secondary: from DECIMAL to VARCHAR
    SELECT
        "company_name",
        "country",
        "language_code",
        "street_address",
        "postal_code",
        "city",
        "currency",
        "row_id",
        "is_deleted",
        CAST("branch_code" AS VARCHAR) AS "branch_code",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST("company_form" AS VARCHAR) AS "company_form",
        CAST("company_name_secondary" AS VARCHAR) AS "company_name_secondary",
        CAST("global_site_id" AS VARCHAR) AS "global_site_id",
        CAST("industry_position" AS VARCHAR) AS "industry_position",
        CAST("local_company_code" AS VARCHAR) AS "local_company_code",
        CAST("main_client_code" AS VARCHAR) AS "main_client_code",
        CAST("main_company_code" AS VARCHAR) AS "main_company_code",
        CAST("mandate_number" AS VARCHAR) AS "mandate_number",
        CAST("model_company" AS VARCHAR) AS "model_company",
        CAST("po_box" AS VARCHAR) AS "po_box",
        CAST("restaurant_code" AS VARCHAR) AS "restaurant_code",
        CAST("street_address_secondary" AS VARCHAR) AS "street_address_secondary"
    FROM "sap_t880_data_projected_renamed_cleaned"
),

"sap_t880_data_projected_renamed_cleaned_casted_missing_handled" AS (
    -- Handling missing values: There are 7 columns with unacceptable missing values
    -- company_form has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- global_site_id has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- industry_position has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- local_company_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- main_company_code has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- model_company has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- po_box has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "company_name",
        "country",
        "language_code",
        "street_address",
        "postal_code",
        "city",
        "currency",
        "row_id",
        "is_deleted",
        "branch_code",
        "company_code",
        "company_name_secondary",
        "main_client_code",
        "mandate_number",
        "restaurant_code",
        "street_address_secondary"
    FROM "sap_t880_data_projected_renamed_cleaned_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_t880_data_projected_renamed_cleaned_casted_missing_handled"
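
A note on the string literals used when recoding language_code: in standard SQL, two consecutive single quotes inside a literal stand for one quote character, so '''D''' denotes the three-character string 'D' (quotes included) rather than the single letter D. The sketch below uses only constant expressions and no project tables. It is an illustration of the quoting rule, not part of the generated model.

```sql
-- Constant-only query: compares a triple-quoted literal with a plain one.
SELECT
    '''D'''          AS "triple_quoted",   -- the 3-character string 'D' (quotes included)
    length('''D''')  AS "triple_len",      -- 3
    'D'              AS "single_quoted",   -- the 1-character string D
    length('D')      AS "single_len",      -- 1
    '''D''' = 'D'    AS "are_equal";       -- false
```

If the source column holds the bare letters D and E, the comparison literals in the CASE expression need to be written as 'D' and 'E' for the recoding to match.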

stg_sap_t880_data.yml (Document the table)

version: 2
models:
- name: stg_sap_t880_data
  description: The table is about company details. It includes company codes, names,
    addresses, and other attributes. Each row represents a distinct company or branch.
    Key fields are company code (rcomp), name (name1), country (cntry), street (stret),
    postal code (pstlc), city, and currency (curr). The table appears to store international
    company data, possibly for financial or organizational purposes.
  columns:
  - name: company_name
    description: Primary name of the company
    tests:
    - not_null
  - name: country
    description: Country where the company is located
    tests:
    - not_null
  - name: language_code
    description: Language code
    tests:
    - not_null
  - name: street_address
    description: Street address
    tests:
    - not_null
  - name: postal_code
    description: Postal code
    tests:
    - not_null
  - name: city
    description: City where the company is located
    tests:
    - not_null
  - name: currency
    description: Currency used by the company
    tests:
    - not_null
    - accepted_values:
        values:
        - USD
        - GBP
        - EUR
        - JPY
        - CHF
        - CAD
        - AUD
        - CNY
        - HKD
        - NZD
        - SEK
        - KRW
        - SGD
        - NOK
        - MXN
        - INR
        - RUB
        - ZAR
        - TRY
        - BRL
        - TWD
        - DKK
        - PLN
        - THB
        - IDR
  - name: row_id
    description: Unique identifier for each row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is described as a unique identifier for each row. For
        this table, each row is for a distinct company or branch. By definition, a
        unique identifier would be distinct for each entry.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: branch_code
    description: Purpose unclear, possibly branch code
    cocoon_meta:
      missing_acceptable: Main office locations may not have branch codes.
  - name: company_code
    description: Company code
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents the company code. For this table, each row
        is for a distinct company or branch. Company codes are typically unique identifiers
        assigned to each company or branch within an organization's system.
  - name: company_name_secondary
    description: Secondary name or additional info
    cocoon_meta:
      missing_acceptable: Companies may not have secondary names.
  - name: main_client_code
    description: Purpose unclear, possibly main client code
    tests:
    - not_null
  - name: mandate_number
    description: Purpose unclear, possibly mandate or client number
    tests:
    - not_null
  - name: restaurant_code
    description: Purpose unclear, possibly restaurant code
    cocoon_meta:
      missing_acceptable: Not all companies are restaurants.
  - name: street_address_secondary
    description: Additional street address information
    cocoon_meta:
      missing_acceptable: Companies may not have secondary addresses.

stg_sap_pa0031_data (first 100 rows)

employee_number seqnr subty uname row_id is_deleted company_code end_date flag_1 flag_2 flag_3 flag_4 last_change_date reference_field_1 start_date
0 22314 0 0 I026759 1 False 800 9999-12-31 None None None None 2014-09-19 12345678 1975-04-01
1 80052 0 0 I026759 2 False 800 9999-12-31 None None None None 2012-10-29 23456789 1991-12-15
2 80053 0 0 C5174732 3 False 800 9999-12-31 None None None None 2014-09-22 34567890 1992-03-15

stg_sap_pa0031_data.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-07-06 14:46:10.017022+00:00
WITH 
"sap_pa0031_data_projected" AS (
    -- Projection: Selecting 45 out of 46 columns
    -- Columns projected out: ['_fivetran_synced']
    SELECT 
        "aedtm",
        "begda",
        "endda",
        "flag1",
        "flag2",
        "flag3",
        "flag4",
        "grpvl",
        "histo",
        "itbld",
        "itxex",
        "mandt",
        "objps",
        "ordex",
        "pernr",
        "preas",
        "refex",
        "rese1",
        "rese2",
        "rfp01",
        "rfp02",
        "rfp03",
        "rfp04",
        "rfp05",
        "rfp06",
        "rfp07",
        "rfp08",
        "rfp09",
        "rfp10",
        "rfp11",
        "rfp12",
        "rfp13",
        "rfp14",
        "rfp15",
        "rfp16",
        "rfp17",
        "rfp18",
        "rfp19",
        "rfp20",
        "seqnr",
        "sprps",
        "subty",
        "uname",
        "_fivetran_rowid",
        "_fivetran_deleted"
    FROM "memory"."main"."sap_pa0031_data"
),

"sap_pa0031_data_projected_renamed" AS (
    -- Rename: Renaming columns
    -- aedtm -> last_change_date
    -- begda -> start_date
    -- endda -> end_date
    -- flag1 -> flag_1
    -- flag2 -> flag_2
    -- flag3 -> flag_3
    -- flag4 -> flag_4
    -- grpvl -> group_value
    -- histo -> is_historical
    -- itbld -> info_type_build
    -- itxex -> info_type_exit
    -- mandt -> company_code
    -- objps -> object_spec
    -- ordex -> record_order
    -- pernr -> employee_number
    -- preas -> processing_reason
    -- refex -> external_reference
    -- rese1 -> reserved_1
    -- rese2 -> reserved_2
    -- rfp01 -> reference_field_1
    -- rfp02 -> reference_field_2
    -- rfp03 -> reference_field_3
    -- rfp04 -> reference_field_4
    -- rfp05 -> reference_field_5
    -- rfp06 -> reference_field_6
    -- rfp07 -> reference_field_7
    -- rfp08 -> reference_field_8
    -- rfp09 -> reference_field_9
    -- rfp10 -> reference_field_10
    -- rfp11 -> reference_field_11
    -- rfp12 -> reference_field_12
    -- rfp13 -> reference_field_13
    -- rfp14 -> reference_field_14
    -- rfp15 -> reference_field_15
    -- rfp16 -> reference_field_16
    -- rfp17 -> reference_field_17
    -- rfp18 -> reference_field_18
    -- rfp19 -> reference_field_19
    -- rfp20 -> reference_field_20
    -- sprps -> specific_purpose
    -- _fivetran_rowid -> row_id
    -- _fivetran_deleted -> is_deleted
    SELECT 
        "aedtm" AS "last_change_date",
        "begda" AS "start_date",
        "endda" AS "end_date",
        "flag1" AS "flag_1",
        "flag2" AS "flag_2",
        "flag3" AS "flag_3",
        "flag4" AS "flag_4",
        "grpvl" AS "group_value",
        "histo" AS "is_historical",
        "itbld" AS "info_type_build",
        "itxex" AS "info_type_exit",
        "mandt" AS "company_code",
        "objps" AS "object_spec",
        "ordex" AS "record_order",
        "pernr" AS "employee_number",
        "preas" AS "processing_reason",
        "refex" AS "external_reference",
        "rese1" AS "reserved_1",
        "rese2" AS "reserved_2",
        "rfp01" AS "reference_field_1",
        "rfp02" AS "reference_field_2",
        "rfp03" AS "reference_field_3",
        "rfp04" AS "reference_field_4",
        "rfp05" AS "reference_field_5",
        "rfp06" AS "reference_field_6",
        "rfp07" AS "reference_field_7",
        "rfp08" AS "reference_field_8",
        "rfp09" AS "reference_field_9",
        "rfp10" AS "reference_field_10",
        "rfp11" AS "reference_field_11",
        "rfp12" AS "reference_field_12",
        "rfp13" AS "reference_field_13",
        "rfp14" AS "reference_field_14",
        "rfp15" AS "reference_field_15",
        "rfp16" AS "reference_field_16",
        "rfp17" AS "reference_field_17",
        "rfp18" AS "reference_field_18",
        "rfp19" AS "reference_field_19",
        "rfp20" AS "reference_field_20",
        "seqnr",
        "sprps" AS "specific_purpose",
        "subty",
        "uname",
        "_fivetran_rowid" AS "row_id",
        "_fivetran_deleted" AS "is_deleted"
    FROM "sap_pa0031_data_projected"
),

"sap_pa0031_data_projected_renamed_casted" AS (
    -- Column Type Casting: 
    -- company_code: from INT to VARCHAR
    -- end_date: from INT to DATE
    -- external_reference: from DECIMAL to VARCHAR
    -- flag_1: from DECIMAL to VARCHAR
    -- flag_2: from DECIMAL to VARCHAR
    -- flag_3: from DECIMAL to VARCHAR
    -- flag_4: from DECIMAL to VARCHAR
    -- group_value: from DECIMAL to VARCHAR
    -- info_type_build: from DECIMAL to VARCHAR
    -- info_type_exit: from DECIMAL to VARCHAR
    -- is_historical: from DECIMAL to VARCHAR
    -- last_change_date: from INT to DATE
    -- object_spec: from DECIMAL to VARCHAR
    -- processing_reason: from DECIMAL to VARCHAR
    -- record_order: from DECIMAL to VARCHAR
    -- reference_field_1: from INT to VARCHAR
    -- reference_field_10: from DECIMAL to VARCHAR
    -- reference_field_11: from DECIMAL to VARCHAR
    -- reference_field_12: from DECIMAL to VARCHAR
    -- reference_field_13: from DECIMAL to VARCHAR
    -- reference_field_14: from DECIMAL to VARCHAR
    -- reference_field_15: from DECIMAL to VARCHAR
    -- reference_field_16: from DECIMAL to VARCHAR
    -- reference_field_17: from DECIMAL to VARCHAR
    -- reference_field_18: from DECIMAL to VARCHAR
    -- reference_field_19: from DECIMAL to VARCHAR
    -- reference_field_2: from DECIMAL to VARCHAR
    -- reference_field_20: from DECIMAL to VARCHAR
    -- reference_field_3: from DECIMAL to VARCHAR
    -- reference_field_4: from DECIMAL to VARCHAR
    -- reference_field_5: from DECIMAL to VARCHAR
    -- reference_field_6: from DECIMAL to VARCHAR
    -- reference_field_7: from DECIMAL to VARCHAR
    -- reference_field_8: from DECIMAL to VARCHAR
    -- reference_field_9: from DECIMAL to VARCHAR
    -- reserved_1: from DECIMAL to VARCHAR
    -- reserved_2: from DECIMAL to VARCHAR
    -- specific_purpose: from DECIMAL to VARCHAR
    -- start_date: from INT to DATE
    SELECT
        "employee_number",
        "seqnr",
        "subty",
        "uname",
        "row_id",
        "is_deleted",
        CAST("company_code" AS VARCHAR) AS "company_code",
        CAST(strptime(CAST("end_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "end_date",
        CAST("external_reference" AS VARCHAR) AS "external_reference",
        CAST("flag_1" AS VARCHAR) AS "flag_1",
        CAST("flag_2" AS VARCHAR) AS "flag_2",
        CAST("flag_3" AS VARCHAR) AS "flag_3",
        CAST("flag_4" AS VARCHAR) AS "flag_4",
        CAST("group_value" AS VARCHAR) AS "group_value",
        CAST("info_type_build" AS VARCHAR) AS "info_type_build",
        CAST("info_type_exit" AS VARCHAR) AS "info_type_exit",
        CAST("is_historical" AS VARCHAR) AS "is_historical",
        CAST(strptime(CAST("last_change_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "last_change_date",
        CAST("object_spec" AS VARCHAR) AS "object_spec",
        CAST("processing_reason" AS VARCHAR) AS "processing_reason",
        CAST("record_order" AS VARCHAR) AS "record_order",
        CAST("reference_field_1" AS VARCHAR) AS "reference_field_1",
        CAST("reference_field_10" AS VARCHAR) AS "reference_field_10",
        CAST("reference_field_11" AS VARCHAR) AS "reference_field_11",
        CAST("reference_field_12" AS VARCHAR) AS "reference_field_12",
        CAST("reference_field_13" AS VARCHAR) AS "reference_field_13",
        CAST("reference_field_14" AS VARCHAR) AS "reference_field_14",
        CAST("reference_field_15" AS VARCHAR) AS "reference_field_15",
        CAST("reference_field_16" AS VARCHAR) AS "reference_field_16",
        CAST("reference_field_17" AS VARCHAR) AS "reference_field_17",
        CAST("reference_field_18" AS VARCHAR) AS "reference_field_18",
        CAST("reference_field_19" AS VARCHAR) AS "reference_field_19",
        CAST("reference_field_2" AS VARCHAR) AS "reference_field_2",
        CAST("reference_field_20" AS VARCHAR) AS "reference_field_20",
        CAST("reference_field_3" AS VARCHAR) AS "reference_field_3",
        CAST("reference_field_4" AS VARCHAR) AS "reference_field_4",
        CAST("reference_field_5" AS VARCHAR) AS "reference_field_5",
        CAST("reference_field_6" AS VARCHAR) AS "reference_field_6",
        CAST("reference_field_7" AS VARCHAR) AS "reference_field_7",
        CAST("reference_field_8" AS VARCHAR) AS "reference_field_8",
        CAST("reference_field_9" AS VARCHAR) AS "reference_field_9",
        CAST("reserved_1" AS VARCHAR) AS "reserved_1",
        CAST("reserved_2" AS VARCHAR) AS "reserved_2",
        CAST("specific_purpose" AS VARCHAR) AS "specific_purpose",
        CAST(strptime(CAST("start_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "start_date"
    FROM "sap_pa0031_data_projected_renamed"
),

"sap_pa0031_data_projected_renamed_casted_missing_handled" AS (
    -- Handling missing values: There are 30 columns with unacceptable missing values
    -- external_reference has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- group_value has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- info_type_build has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- info_type_exit has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- is_historical has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- object_spec has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- processing_reason has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- record_order has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_10 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_11 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_12 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_13 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_14 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_15 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_16 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_17 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_18 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_19 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_20 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_3 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_4 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_5 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_6 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_7 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_8 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reference_field_9 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_1 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- reserved_2 has 100.0 percent missing. Strategy: 🗑️ Drop Column
    -- specific_purpose has 100.0 percent missing. Strategy: 🗑️ Drop Column
    SELECT
        "employee_number",
        "seqnr",
        "subty",
        "uname",
        "row_id",
        "is_deleted",
        "company_code",
        "end_date",
        "flag_1",
        "flag_2",
        "flag_3",
        "flag_4",
        "last_change_date",
        "reference_field_1",
        "start_date"
    FROM "sap_pa0031_data_projected_renamed_casted"
)

-- COCOON BLOCK END
SELECT * FROM "sap_pa0031_data_projected_renamed_casted_missing_handled"
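
The casting step above converts SAP's integer dates (YYYYMMDD) with strptime. As a minimal sketch of that conversion, assuming DuckDB: strptime and try_strptime return a TIMESTAMP, so an outer CAST is needed to obtain a DATE, and try_strptime returns NULL instead of raising an error when a value cannot be parsed. The "raw_date" column and its three values are hypothetical; 0 stands in for an unset date.

```sql
-- Hypothetical sample of SAP-style integer dates; 0 represents "no date".
SELECT
    "raw_date",
    -- try_strptime yields a TIMESTAMP (or NULL on parse failure);
    -- the outer CAST narrows it to DATE.
    CAST(try_strptime(CAST("raw_date" AS VARCHAR), '%Y%m%d') AS DATE) AS "parsed_date"
FROM (VALUES (20140919), (99991231), (0)) AS t("raw_date");
```

The first two values should parse to 2014-09-19 and 9999-12-31, while the unparseable 0 should come back as NULL rather than aborting the query. Whether NULL is the right treatment for unset dates is a modeling decision this sketch does not settle.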

stg_sap_pa0031_data.yml (Document the table)

version: 2
models:
- name: stg_sap_pa0031_data
  description: The table is about employee personal data. It contains fields for employee
    number (pernr), start and end dates (begda, endda), various flags and reference
    fields (rfp01-rfp20). Other attributes include the SAP client (mandt, exposed here
    as company_code), subtypes (subty), and user information (uname). The data tracks
    changes over time with fields for last change date (aedtm) and sequence number (seqnr).
  columns:
  - name: employee_number
    description: Employee number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each employee. For
        this table, each row appears to represent an employee's record. The employee_number
        is likely to be unique across rows as it's a standard practice in HR systems.
  - name: seqnr
    description: Sequence number for tracking changes
    tests:
    - not_null
  - name: subty
    description: Subtype of employee data
    tests:
    - not_null
  - name: uname
    description: Username or user identifier
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is explicitly described as a unique identifier for the
        row. For this table, each row represents an employee record, and row_id is
        designed to be unique across all rows.
  - name: is_deleted
    description: Indicates if the row was deleted
    tests:
    - not_null
  - name: company_code
    description: SAP client (mandt), exposed in this model as company_code
    tests:
    - not_null
  - name: end_date
    description: End date of the record
    tests:
    - not_null
  - name: flag_1
    description: Generic flag field 1
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: flag_2
    description: Generic flag field 2
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: flag_3
    description: Generic flag field 3
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: flag_4
    description: Generic flag field 4
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: last_change_date
    description: Date of last change
    tests:
    - not_null
  - name: reference_field_1
    description: Reference field 1
    tests:
    - not_null
  - name: start_date
    description: Start date of the record
    tests:
    - not_null

Some tables log change events, so querying them directly returns redundant historical rows. Instead, we take a snapshot that keeps only the latest record for each key.
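
As a minimal sketch of the latest-snapshot idea, assuming DuckDB and a hypothetical "change_log" relation (its columns and values are made up for illustration), rows are ranked per key by effective date and only the top-ranked row is kept. The extra "row_id" sort key is an assumption added here so that ties on the date resolve deterministically; the generated models below order by the date alone.

```sql
-- Hypothetical toy change log: two versions of employee 1, one of employee 2.
WITH "change_log" AS (
    SELECT * FROM (VALUES
        (1, 'norm', DATE '2020-01-01', 10),
        (1, 'flex', DATE '2021-06-01', 11),
        (2, 'norm', DATE '2019-03-15', 12)
    ) AS t("employee_id", "schedule_type", "last_modified_date", "row_id")
)
SELECT "employee_id", "schedule_type", "last_modified_date", "row_id"
FROM (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY "employee_id"
            -- newest change first; "row_id" is an assumed tie-breaker
            ORDER BY "last_modified_date" DESC, "row_id" DESC
        ) AS "rn"
    FROM "change_log"
) ranked
WHERE "rn" = 1
ORDER BY "employee_id";
```

With this toy input the query keeps the 'flex' row for employee 1 and the single row for employee 2.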

snapshot_sap_pa0007_data (first 100 rows)

client_id employee_id sequence_number last_modified_by schedule_type time_recording_indicator employment_percentage monthly_hours weekly_hours daily_hours workdays_per_week yearly_hours min_daily_hours max_daily_hours min_weekly_hours max_weekly_hours min_monthly_hours max_monthly_hours min_yearly_hours max_yearly_hours row_id is_deleted dynamic_scheduling valid_from_date valid_to_date
0 800 1003 0 LIMPERT flex 0 100.0 156.48 36.0 7.2 5.0 1879.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 12 False None 1994-01-01 9999-12-31
1 800 80052 0 C5174732 norm 0 100.0 173.34 40.0 8.0 5.0 2080.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1414 False None 1991-12-15 9999-12-31
2 800 80053 0 C5174732 norm 0 100.0 173.34 40.0 8.0 5.0 2080.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1415 False None 1992-03-15 9999-12-31

snapshot_sap_pa0007_data.sql (clean the table)

-- Slowly Changing Dimension: the dimension keys are "client_id", "employee_id"
-- The effective date column is "last_modified_date"
-- We create a Type 1 SCD (latest snapshot)
SELECT 
    "client_id",
    "employee_id",
    "sequence_number",
    "last_modified_by",
    "schedule_type",
    "time_recording_indicator",
    "employment_percentage",
    "monthly_hours",
    "weekly_hours",
    "daily_hours",
    "workdays_per_week",
    "yearly_hours",
    "min_daily_hours",
    "max_daily_hours",
    "min_weekly_hours",
    "max_weekly_hours",
    "min_monthly_hours",
    "max_monthly_hours",
    "min_yearly_hours",
    "max_yearly_hours",
    "row_id",
    "is_deleted",
    "dynamic_scheduling",
    "valid_from_date",
    "valid_to_date"
FROM (
     SELECT 
            "client_id",
            "employee_id",
            "sequence_number",
            "last_modified_by",
            "schedule_type",
            "time_recording_indicator",
            "employment_percentage",
            "monthly_hours",
            "weekly_hours",
            "daily_hours",
            "workdays_per_week",
            "yearly_hours",
            "min_daily_hours",
            "max_daily_hours",
            "min_weekly_hours",
            "max_weekly_hours",
            "min_monthly_hours",
            "max_monthly_hours",
            "min_yearly_hours",
            "max_yearly_hours",
            "row_id",
            "is_deleted",
            "dynamic_scheduling",
            "valid_from_date",
            "valid_to_date",
            ROW_NUMBER() OVER (
                PARTITION BY "client_id", "employee_id"
                ORDER BY "last_modified_date" DESC
            ) AS "cocoon_rn"
    FROM "stg_sap_pa0007_data"
) ranked
WHERE "cocoon_rn" = 1

snapshot_sap_pa0007_data.yml (Document the table)

version: 2
models:
- name: snapshot_sap_pa0007_data
  description: The table is about current employee work schedules. It tracks the most
    recent version of each employee's schedule details. Key information includes employee
    ID, schedule type, employment percentage, and various time allocations (daily,
    weekly, monthly, yearly hours). The table also contains fields for minimum and
    maximum hours across different time periods and administrative data like last
    modified details.
  columns:
  - name: client_id
    description: Client identifier
    tests:
    - not_null
  - name: employee_id
    description: Employee personnel number
    tests:
    - not_null
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: last_modified_by
    description: User who last changed the record
    tests:
    - not_null
  - name: schedule_type
    description: Work schedule type
    tests:
    - not_null
    - accepted_values:
        values:
        - norm
        - flex
        - part-time
        - shift
  - name: time_recording_indicator
    description: Time recording indicator
    tests:
    - not_null
  - name: employment_percentage
    description: Employment percentage
    tests:
    - not_null
  - name: monthly_hours
    description: Monthly working hours
    tests:
    - not_null
  - name: weekly_hours
    description: Weekly working hours
    tests:
    - not_null
  - name: daily_hours
    description: Daily working hours
    tests:
    - not_null
  - name: workdays_per_week
    description: Workdays per week
    tests:
    - not_null
  - name: yearly_hours
    description: Yearly working hours
    tests:
    - not_null
  - name: min_daily_hours
    description: Minimum daily hours
    tests:
    - not_null
  - name: max_daily_hours
    description: Maximum daily hours
    tests:
    - not_null
  - name: min_weekly_hours
    description: Minimum weekly hours
    tests:
    - not_null
  - name: max_weekly_hours
    description: Maximum weekly hours
    tests:
    - not_null
  - name: min_monthly_hours
    description: Minimum monthly hours
    tests:
    - not_null
  - name: max_monthly_hours
    description: Maximum monthly hours
    tests:
    - not_null
  - name: min_yearly_hours
    description: Minimum yearly hours
    tests:
    - not_null
  - name: max_yearly_hours
    description: Maximum yearly hours
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column appears to be a unique identifier for each row in the
        table. For this table, each row represents a distinct work schedule entry.
        The row_id is likely to be unique across all rows, making it a suitable candidate
        key.
  - name: is_deleted
    description: Indicates if the record is deleted
    tests:
    - not_null
  - name: dynamic_scheduling
    description: Dynamic scheduling indicator
    cocoon_meta:
      missing_acceptable: Not applicable for standard or fixed scheduling types.
  - name: valid_from_date
    description: Start date of validity
    tests:
    - not_null
  - name: valid_to_date
    description: End date of validity
    tests:
    - not_null
cocoon_meta:
  scd_base_table: stg_sap_pa0007_data

snapshot_sap_pa0031_data (first 100 rows)

employee_number seqnr subty uname row_id is_deleted company_code end_date flag_1 flag_2 flag_3 flag_4 reference_field_1 start_date
0 80053 0 0 C5174732 3 False 800 9999-12-31 None None None None 34567890 1992-03-15
1 22314 0 0 I026759 1 False 800 9999-12-31 None None None None 12345678 1975-04-01
2 80052 0 0 I026759 2 False 800 9999-12-31 None None None None 23456789 1991-12-15

snapshot_sap_pa0031_data.sql (clean the table)

-- Slowly Changing Dimension: the dimension keys are "employee_number", "subty"
-- The effective date column is "last_change_date"
-- We create a Type 1 SCD (latest snapshot)
SELECT 
    "employee_number",
    "seqnr",
    "subty",
    "uname",
    "row_id",
    "is_deleted",
    "company_code",
    "end_date",
    "flag_1",
    "flag_2",
    "flag_3",
    "flag_4",
    "reference_field_1",
    "start_date"
FROM (
     SELECT 
            "employee_number",
            "seqnr",
            "subty",
            "uname",
            "row_id",
            "is_deleted",
            "company_code",
            "end_date",
            "flag_1",
            "flag_2",
            "flag_3",
            "flag_4",
            "reference_field_1",
            "start_date",
            ROW_NUMBER() OVER (
                PARTITION BY "employee_number", "subty"
                ORDER BY "last_change_date" DESC
            ) AS "cocoon_rn"
    FROM "stg_sap_pa0031_data"
) ranked
WHERE "cocoon_rn" = 1

snapshot_sap_pa0031_data.yml (Document the table)

version: 2
models:
- name: snapshot_sap_pa0031_data
  description: The table is about current employee personal data. It tracks the most
    recent version of employee information for each employee number and subtype combination.
    The table includes details like company code, end date, various flags, and reference
    fields. It omits historical changes and version-related columns, focusing on the
    latest data for each employee.
  columns:
  - name: employee_number
    description: Employee number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column represents a unique identifier for each employee. For
        this table, each row appears to represent an employee's record. The employee_number
        is likely to be unique across rows as it's a standard practice in HR systems.
  - name: seqnr
    description: Sequence number for tracking changes
    tests:
    - not_null
  - name: subty
    description: Subtype of employee data
    tests:
    - not_null
  - name: uname
    description: Username or user identifier
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is explicitly described as a unique identifier for the
        row. For this table, each row represents an employee record, and row_id is
        designed to be unique across all rows.
  - name: is_deleted
    description: Indicates if the row was deleted
    tests:
    - not_null
  - name: company_code
    description: SAP client (mandt), exposed in this model as company_code
    tests:
    - not_null
  - name: end_date
    description: End date of the record
    tests:
    - not_null
  - name: flag_1
    description: Generic flag field 1
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: flag_2
    description: Generic flag field 2
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: flag_3
    description: Generic flag field 3
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: flag_4
    description: Generic flag field 4
    cocoon_meta:
      missing_acceptable: Flag may not be relevant for all records.
  - name: reference_field_1
    description: Reference field 1
    tests:
    - not_null
  - name: start_date
    description: Start date of the record
    tests:
    - not_null
cocoon_meta:
  scd_base_table: stg_sap_pa0031_data

snapshot_sap_pa0001_data (first 100 rows)

employee_id sequence_number user_name personnel_area persg employee_subgroup personnel_subarea work_schedule_rule sachp sname ename object_type payroll_modifier row_id is_deleted client company_code controlling_area distribution_key is_historical job_id lock_indicator org_unit_id position_id processing_reason valid_from valid_to
0 70 0 c5115457 200 1 gc 2 g1 3 wayne bruce Mr. Bruce Wayne s 200 3 False 800 2000 1000 200 None 50043146 None 50002214 50005691 None 2003-01-01 9999-12-31
1 69 0 c5115457 200 1 gc 2 g1 3 bob sponge Mr. Sponge Bob s 200 2 False 800 2000 1000 200 None 50029038 None 50002214 50005687 None 2003-01-01 9999-12-31
2 10 0 powersa 200 1 gc 2 g1 3 powers austin Mr. Austin Powers s 200 1 False 800 2000 1000 200 None 50016575 None 50001357 50005214 None 2002-01-01 9999-12-31

snapshot_sap_pa0001_data.sql (clean the table)

-- Slowly Changing Dimension: the dimension key is "employee_id"
-- The effective date column is "last_changed_date"
-- We create a Type 1 SCD (latest snapshot)
SELECT 
    "employee_id",
    "sequence_number",
    "user_name",
    "personnel_area",
    "persg",
    "employee_subgroup",
    "personnel_subarea",
    "work_schedule_rule",
    "sachp",
    "sname",
    "ename",
    "object_type",
    "payroll_modifier",
    "row_id",
    "is_deleted",
    "client",
    "company_code",
    "controlling_area",
    "distribution_key",
    "is_historical",
    "job_id",
    "lock_indicator",
    "org_unit_id",
    "position_id",
    "processing_reason",
    "valid_from",
    "valid_to"
FROM (
     SELECT 
            "employee_id",
            "sequence_number",
            "user_name",
            "personnel_area",
            "persg",
            "employee_subgroup",
            "personnel_subarea",
            "work_schedule_rule",
            "sachp",
            "sname",
            "ename",
            "object_type",
            "payroll_modifier",
            "row_id",
            "is_deleted",
            "client",
            "company_code",
            "controlling_area",
            "distribution_key",
            "is_historical",
            "job_id",
            "lock_indicator",
            "org_unit_id",
            "position_id",
            "processing_reason",
            "valid_from",
            "valid_to",
            ROW_NUMBER() OVER (
                PARTITION BY "employee_id"
                ORDER BY "last_changed_date" DESC
            ) AS "cocoon_rn"
    FROM "stg_sap_pa0001_data"
) ranked
WHERE "cocoon_rn" = 1
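
The query above keeps one row per "employee_id" by ranking rows on the effective date, newest first, and keeping rank 1. A minimal sketch of the same pattern, run from Python against an in-memory SQLite database. The sample rows and their "last_changed_date" values are invented for illustration, only a subset of columns is used, and the sketch assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample rows: (employee_id, position_id, last_changed_date).
# The dates are invented; the real staging table has many more columns.
rows = [
    (70, 50005000, "2003-01-01"),
    (70, 50005691, "2004-06-15"),  # later change for the same employee
    (69, 50005687, "2003-01-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stg_sap_pa0001_data" '
    '("employee_id" INTEGER, "position_id" INTEGER, "last_changed_date" TEXT)'
)
conn.executemany('INSERT INTO "stg_sap_pa0001_data" VALUES (?, ?, ?)', rows)

# Same shape as the model: rank rows per dimension key, newest first, keep rank 1.
latest = conn.execute(
    '''
    SELECT "employee_id", "position_id"
    FROM (
        SELECT
            "employee_id",
            "position_id",
            ROW_NUMBER() OVER (
                PARTITION BY "employee_id"
                ORDER BY "last_changed_date" DESC
            ) AS "cocoon_rn"
        FROM "stg_sap_pa0001_data"
    ) ranked
    WHERE "cocoon_rn" = 1
    ORDER BY "employee_id"
    '''
).fetchall()

print(latest)  # one row per employee, carrying the most recently changed position
```

If two rows for the same employee share the same effective date, this ordering alone does not determine which one receives rank 1; adding a tiebreaker column to the ORDER BY would make the result deterministic.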

snapshot_sap_pa0001_data.yml (Document the table)

version: 2
models:
- name: snapshot_sap_pa0001_data
  description: This table holds current employee information from an SAP HR system.
    It keeps only the most recent version of each employee's personal and organizational
    details. Key fields include employee ID, name, position, organizational unit,
    and validity dates. It also stores company code, personnel area, and employee
    group. Each employee appears once, with no historical versions.
  columns:
  - name: employee_id
    description: Personnel number (employee ID)
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: Unique dimension key, derived from the slowly changing dimension
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: user_name
    description: User name (creator or modifier)
    tests:
    - not_null
  - name: personnel_area
    description: Personnel area (often plant or location)
    tests:
    - not_null
  - name: persg
    description: Employee group
    tests:
    - not_null
  - name: employee_subgroup
    description: Employee subgroup
    tests:
    - not_null
  - name: personnel_subarea
    description: Personnel subarea
    tests:
    - not_null
  - name: work_schedule_rule
    description: Work schedule rule
    tests:
    - not_null
  - name: sachp
    description: Administrator for HR master data
    tests:
    - not_null
  - name: sname
    description: Sortable employee name (last name, first name)
    tests:
    - not_null
  - name: ename
    description: Formatted name of the employee
    tests:
    - not_null
  - name: object_type
    description: Object type
    tests:
    - not_null
  - name: payroll_modifier
    description: Payroll modifier
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is explicitly described as a unique identifier for the
        row. In a properly designed database, this would be a unique value for each
        record, making it an excellent candidate key.
  - name: is_deleted
    description: Indicates if the record has been deleted
    tests:
    - not_null
  - name: client
    description: Client
    tests:
    - not_null
    - accepted_values:
        values:
        - '800'
        - '888'
        - '877'
        - '866'
        - '855'
        - '844'
        - '833'
        - '822'
        - '880'
        - '881'
        - '882'
        - '883'
        - '884'
  - name: company_code
    description: Company code
    tests:
    - not_null
  - name: controlling_area
    description: Controlling area
    tests:
    - not_null
  - name: distribution_key
    description: Distribution key
    tests:
    - not_null
  - name: is_historical
    description: Historical record indicator
    cocoon_meta:
      missing_acceptable: Current records are not historical by definition.
  - name: job_id
    description: Job identifier
    tests:
    - not_null
  - name: lock_indicator
    description: Lock indicator
    cocoon_meta:
      missing_acceptable: Unlocked records don't need a lock indicator.
  - name: org_unit_id
    description: Organizational unit identifier
    tests:
    - not_null
  - name: position_id
    description: Position identifier
    tests:
    - not_null
  - name: processing_reason
    description: Processing reason
    cocoon_meta:
      missing_acceptable: No special processing needed for these standard records.
  - name: valid_from
    description: Start date of validity
    tests:
    - not_null
  - name: valid_to
    description: End date of validity
    tests:
    - not_null
cocoon_meta:
  scd_base_table: stg_sap_pa0001_data
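
The `tests` entries in the YAML above declare column-level checks. A rough Python sketch of what `not_null`, `unique`, and `accepted_values` verify, applied to values from the three preview rows of this table. The helper names are hypothetical, the client values are assumed to be strings as in the YAML, and this is not dbt's implementation, which compiles each test into a SQL query.

```python
from collections import Counter

# Values taken from the three preview rows of snapshot_sap_pa0001_data.
rows = [
    {"employee_id": 70, "client": "800", "lock_indicator": None},
    {"employee_id": 69, "client": "800", "lock_indicator": None},
    {"employee_id": 10, "client": "800", "lock_indicator": None},
]


def not_null_failures(rows, column):
    """Return the rows whose value in `column` is missing."""
    return [r for r in rows if r[column] is None]


def unique_failures(rows, column):
    """Return the non-null values that occur more than once in `column`."""
    counts = Counter(r[column] for r in rows if r[column] is not None)
    return [value for value, n in counts.items() if n > 1]


def accepted_values_failures(rows, column, values):
    """Return the non-null values in `column` that are outside `values`."""
    return sorted(
        {r[column] for r in rows if r[column] is not None and r[column] not in values}
    )


# A check passes when it returns no failures.
print(not_null_failures(rows, "employee_id"))
print(unique_failures(rows, "employee_id"))
print(accepted_values_failures(rows, "client", {"800", "888"}))

# lock_indicator is empty in every preview row, which is why the YAML marks
# it as missing_acceptable instead of attaching a not_null test.
print(len(not_null_failures(rows, "lock_indicator")))
```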

snapshot_sap_pa0000_data (first 100 rows)

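| | employee_id | sequence_number | username | row_id | is_deleted | action_type | client_code | end_date | lock_indicator | start_date | status_2 | status_3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 70 | 0 | starpatrick | 3 | False | 52 | 800 | 9999-12-31 | None | 2003-01-01 | 3 | 1 |
| 1 | 10 | 0 | bobsponge | 1 | False | 1 | 800 | 9999-12-31 | None | 2002-01-01 | 3 | 1 |
| 2 | 69 | 0 | wardsquid | 2 | False | 1 | 800 | 9999-12-31 | None | 2003-01-01 | 3 | 1 |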

snapshot_sap_pa0000_data.sql (clean the table)

-- Slowly Changing Dimension: Dimension keys are "employee_id"
-- Effective date columns are "last_change_date"
-- We create a Type 1 SCD (latest snapshot): one row per dimension key, the most recent by effective date
SELECT 
    "employee_id",
    "sequence_number",
    "username",
    "row_id",
    "is_deleted",
    "action_type",
    "client_code",
    "end_date",
    "lock_indicator",
    "start_date",
    "status_2",
    "status_3"
FROM (
     SELECT 
            "employee_id",
            "sequence_number",
            "username",
            "row_id",
            "is_deleted",
            "action_type",
            "client_code",
            "end_date",
            "lock_indicator",
            "start_date",
            "status_2",
            "status_3",
            ROW_NUMBER() OVER (
                PARTITION BY "employee_id"
                ORDER BY "last_change_date" DESC
            ) AS "cocoon_rn"
    FROM "stg_sap_pa0000_data"
) ranked
WHERE "cocoon_rn" = 1

snapshot_sap_pa0000_data.yml (Document the table)

version: 2
models:
- name: snapshot_sap_pa0000_data
  description: This table holds the current record for each employee. Each row
    carries an employee's most recent employment status, including action type,
    status codes, and the start and end dates of the record. Historical versions
    and change-tracking columns are excluded, so the table is a snapshot of the
    current state of all employees in the system.
  columns:
  - name: employee_id
    description: Personnel number
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: Unique dimension key, derived from the slowly changing dimension
  - name: sequence_number
    description: Sequence number
    tests:
    - not_null
  - name: username
    description: Username of the person who created/modified
    tests:
    - not_null
  - name: row_id
    description: Unique identifier for the row
    tests:
    - not_null
    - unique
    cocoon_meta:
      uniqueness: This column is described as a unique identifier for the row. For
        this table, each row is an employee record. By definition, if it's a unique
        identifier for the row, it would be unique across all rows.
  - name: is_deleted
    description: Indicates if the record was deleted
    tests:
    - not_null
  - name: action_type
    description: Action or measure type
    tests:
    - not_null
  - name: client_code
    description: Client code (SAP client number)
    tests:
    - not_null
  - name: end_date
    description: End date of the record
    tests:
    - not_null
  - name: lock_indicator
    description: Lock indicator
    cocoon_meta:
      missing_acceptable: No locks applied to these records.
  - name: start_date
    description: Start date of the record
    tests:
    - not_null
  - name: status_2
    description: Status 2
    tests:
    - not_null
    - accepted_values:
        values:
        - '1'
        - '2'
        - '3'
        - '4'
        - '5'
  - name: status_3
    description: Status 3
    tests:
    - not_null
    - accepted_values:
        values:
        - '1'
        - '2'
        - '3'
        - '4'
        - '5'
cocoon_meta:
  scd_base_table: stg_sap_pa0000_data

We identify the primary key (PK) and foreign keys (FK) of each table, then build a join graph that connects each FK to the PK it references.

Join Graph (FK to PK)

[Figure: join graph. Each edge points from the table holding the FK to the table holding the referenced PK.]
stg_sap_bseg_data -> stg_sap_t001_data
stg_sap_bseg_data -> stg_sap_bkpf_data
stg_sap_bseg_data -> stg_sap_ska1_data
stg_sap_t880_data -> stg_sap_t001_data
stg_sap_faglflext_data -> stg_sap_ska1_data
stg_sap_ska1_data -> stg_sap_t001_data
snapshot_sap_pa0001_data -> stg_sap_t001_data
snapshot_sap_pa0000_data -> stg_sap_pa0008_data
stg_sap_faglflexa_data -> stg_sap_bseg_data
stg_sap_faglflexa_data -> stg_sap_t001_data
stg_sap_faglflexa_data -> stg_sap_bkpf_data
stg_sap_faglflexa_data -> stg_sap_ska1_data

cocoon_join.yml (Document the joins)

join_graph:
- table_name: stg_sap_t001_data
  primary_key: row_id
  foreign_keys: []
- table_name: stg_sap_bseg_data
  foreign_keys:
  - column: company_code
    reference:
      table_name: stg_sap_t001_data
      column: row_id
  - column: gl_account
    reference:
      table_name: stg_sap_ska1_data
      column: row_id
  - column: original_document_number
    reference:
      table_name: stg_sap_bkpf_data
      column: row_id
  primary_key: row_id
- table_name: stg_sap_faglflexa_data
  foreign_keys:
  - column: company_code
    reference:
      table_name: stg_sap_t001_data
      column: row_id
  - column: gl_account_number
    reference:
      table_name: stg_sap_ska1_data
      column: row_id
  - column: document_number
    reference:
      table_name: stg_sap_bkpf_data
      column: row_id
  - column: document_line_number
    reference:
      table_name: stg_sap_bseg_data
      column: row_id
- table_name: stg_sap_ska1_data
  foreign_keys:
  - column: company_code
    reference:
      table_name: stg_sap_t001_data
      column: row_id
  primary_key: row_id
- table_name: stg_sap_t880_data
  foreign_keys:
  - column: company_code
    reference:
      table_name: stg_sap_t001_data
      column: row_id
- table_name: snapshot_sap_pa0001_data
  foreign_keys:
  - column: company_code
    reference:
      table_name: stg_sap_t001_data
      column: row_id
- table_name: stg_sap_faglflext_data
  foreign_keys:
  - column: gl_account
    reference:
      table_name: stg_sap_ska1_data
      column: row_id
- table_name: stg_sap_bkpf_data
  primary_key: row_id
  foreign_keys: []
- table_name: stg_sap_pa0008_data
  primary_key: row_id
  foreign_keys: []
- table_name: snapshot_sap_pa0000_data
  foreign_keys:
  - column: employee_id
    reference:
      table_name: stg_sap_pa0008_data
      column: row_id
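
Because every edge points from a referencing table to the table it references, the join graph also implies an order in which tables could be loaded or validated, with referenced tables first. A minimal sketch using Python's standard-library `graphlib` (Python 3.9 or later). The dictionary is transcribed by hand from the YAML above, and using the graph for ordering is an illustration, not something the report states the tool does.

```python
from graphlib import TopologicalSorter

# FK -> PK edges from cocoon_join.yml: table -> tables it references.
references = {
    "stg_sap_t001_data": [],
    "stg_sap_bkpf_data": [],
    "stg_sap_pa0008_data": [],
    "stg_sap_ska1_data": ["stg_sap_t001_data"],
    "stg_sap_t880_data": ["stg_sap_t001_data"],
    "stg_sap_faglflext_data": ["stg_sap_ska1_data"],
    "stg_sap_bseg_data": [
        "stg_sap_t001_data",
        "stg_sap_ska1_data",
        "stg_sap_bkpf_data",
    ],
    "stg_sap_faglflexa_data": [
        "stg_sap_t001_data",
        "stg_sap_ska1_data",
        "stg_sap_bkpf_data",
        "stg_sap_bseg_data",
    ],
    "snapshot_sap_pa0001_data": ["stg_sap_t001_data"],
    "snapshot_sap_pa0000_data": ["stg_sap_pa0008_data"],
}

# TopologicalSorter expects node -> predecessors, so each referenced table
# is emitted before any table that points at it.
order = list(TopologicalSorter(references).static_order())

for position, table in enumerate(order, start=1):
    print(position, table)
```

The exact order among unrelated tables is not significant; only the relative order of a table and the tables it references is guaranteed.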

We identify the entities and relationships behind the tables, then tell the story that connects these relationships.

cocoon_er.yml (Document the ER model)

entities:
- entity_name: CompanyCodes
  entity_description: Represents company codes in an SAP system, containing details
    about each company's configuration and settings.
  table_name: stg_sap_t001_data
  primary_key: row_id
- entity_name: ChartOfAccounts
  entity_description: Represents the chart of accounts entries, containing account
    numbers and their properties for financial accounting purposes.
  table_name: stg_sap_ska1_data
  primary_key: row_id
- entity_name: FinancialDocumentHeaders
  entity_description: Represents the headers of financial documents in SAP, containing
    high-level information about financial transactions.
  table_name: stg_sap_bkpf_data
  primary_key: row_id
- entity_name: AccountingDocumentLineItems
  entity_description: Represents individual line items in accounting documents, containing
    detailed information about each financial transaction.
  table_name: stg_sap_bseg_data
  primary_key: row_id
- entity_name: EmployeeCompensation
  entity_description: Represents employee compensation data, including salary history,
    wage components, and organizational information.
  table_name: stg_sap_pa0008_data
  primary_key: row_id
relations:
- relation_name: CompanyAccountStructure
  relation_description: ChartOfAccounts defines the structure of accounts used by
    CompanyCodes for financial reporting and management.
  table_name: stg_sap_ska1_data
  entities:
  - ChartOfAccounts
  - CompanyCodes
- relation_name: AccountingTransactionDetails
  relation_description: AccountingDocumentLineItems belong to FinancialDocumentHeaders,
    reference accounts in ChartOfAccounts, and are associated with specific CompanyCodes.
  table_name: stg_sap_bseg_data
  entities:
  - AccountingDocumentLineItems
  - CompanyCodes
  - ChartOfAccounts
  - FinancialDocumentHeaders
- relation_name: FinancialAccountingTransactionDetails
  relation_description: CompanyCodes define the organizational structure, ChartOfAccounts
    provides account classifications, FinancialDocumentHeaders capture transaction
    summaries, and AccountingDocumentLineItems detail individual financial postings
    within documents.
  table_name: stg_sap_faglflexa_data
  entities:
  - CompanyCodes
  - ChartOfAccounts
  - FinancialDocumentHeaders
  - AccountingDocumentLineItems
- relation_description: This table stores detailed information about various companies
    or branches, including their codes, names, addresses, and operational attributes.
  table_name: stg_sap_t880_data
  entities:
  - CompanyCodes
- relation_description: This table contains employee information associated with specific
    CompanyCodes within an organization's SAP HR system.
  table_name: snapshot_sap_pa0001_data
  entities:
  - CompanyCodes
- relation_description: This table contains general ledger totals, with balances
    per account and period, for accounts defined in the ChartOfAccounts.
  table_name: stg_sap_faglflext_data
  entities:
  - ChartOfAccounts
- relation_description: This table stores the current employment status and personal
    data for each employee in the organization.
  table_name: snapshot_sap_pa0000_data
  entities:
  - EmployeeCompensation
story:
- relation_name: CompanyAccountStructure
  story_line: Company establishes chart of accounts for financial reporting.
- relation_name: FinancialAccountingTransactionDetails
  story_line: Company creates financial documents to record business transactions.
- relation_name: AccountingTransactionDetails
  story_line: Accountants post detailed line items to financial documents.
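
The ER file links three layers by name: the story refers to relations, and relations refer to entities. A small sketch of a consistency check over those names, transcribed by hand from the YAML above. The helper `find_problems` is hypothetical and not part of the tool.

```python
# Entity names declared under `entities` in cocoon_er.yml.
entities = {
    "CompanyCodes",
    "ChartOfAccounts",
    "FinancialDocumentHeaders",
    "AccountingDocumentLineItems",
    "EmployeeCompensation",
}

# Named relations only: relation_name -> entity names it links.
relations = {
    "CompanyAccountStructure": ["ChartOfAccounts", "CompanyCodes"],
    "AccountingTransactionDetails": [
        "AccountingDocumentLineItems",
        "CompanyCodes",
        "ChartOfAccounts",
        "FinancialDocumentHeaders",
    ],
    "FinancialAccountingTransactionDetails": [
        "CompanyCodes",
        "ChartOfAccounts",
        "FinancialDocumentHeaders",
        "AccountingDocumentLineItems",
    ],
}

# Relation names in the order the story presents them.
story = [
    "CompanyAccountStructure",
    "FinancialAccountingTransactionDetails",
    "AccountingTransactionDetails",
]


def find_problems(entities, relations, story):
    """Hypothetical helper: list references to names that are not declared."""
    problems = []
    for name, linked in relations.items():
        for entity in linked:
            if entity not in entities:
                problems.append(f"relation {name} uses undeclared entity {entity}")
    for name in story:
        if name not in relations:
            problems.append(f"story refers to undeclared relation {name}")
    return problems


print(find_problems(entities, relations, story))
```

An empty list means every story line resolves to a named relation and every relation resolves to declared entities.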