OMOP Common Data Model

We display the source tables from the data warehouses to model.

care_site

Table Preview(first 5 rows)

care_site_id care_site_name location_id care_site_source_value place_of_service_source_value
0 1 City Hospital 1 CS001 Hospital
1 2 Rural Clinic 2 CS002 Clinic

condition_occurrence

Table Preview(first 5 rows)

condition_occurrence_id person_id condition_start_date condition_start_datetime condition_end_date condition_end_datetime stop_reason provider_id visit_occurrence_id visit_detail_id condition_source_value condition_status_source_value
0 1 101 2021-01-10 2021-01-10 08:30:00 2021-01-20 2021-01-20 17:00:00 Resolved 150 200 300 Diabetes Active
1 2 102 2021-02-15 2021-02-15 09:45:00 2021-02-25 2021-02-25 16:30:00 Improved 151 201 301 Hypertension Controlled

cost

Table Preview(first 5 rows)

cost_id person_id total_charge total_paid amount_allowed paid_by_payer paid_by_patient paid_patient_copay paid_patient_coinsurance paid_patient_deductible paid_by_primary paid_ingredient_cost paid_dispensing_fee
0 1001 1 500.0 450.0 450.0 400.0 50.0 25.0 15.0 10.0 400.0 300.0 20.0
1 1002 2 1500.0 1400.0 1400.0 1300.0 100.0 50.0 40.0 10.0 1300.0 1000.0 50.0

death

Table Preview(first 5 rows)

person_id death_date death_datetime cause_source_value
0 1 2023-05-15 2023-05-15 14:30:00 I21.9
1 2 2023-06-20 2023-06-20 08:45:00 C50.9

device

Table Preview(first 5 rows)

device_id person_id device_exposure_start_date device_exposure_start_datetime device_exposure_end_date device_exposure_end_datetime unique_device_id quantity provider_id visit_occurrence_id device_source_value
0 1001 2001 2023-01-15 2023-01-15 08:00:00 2023-01-20 2023-01-20 18:00:00 UD1001 1 5001 200 DeviceA
1 1002 2002 2023-02-20 2023-02-20 09:30:00 2023-02-25 2023-02-25 16:30:00 UD1002 2 5002 201 DeviceB

drug_exposure

Table Preview(first 5 rows)

drug_exposure_id person_id drug_exposure_start_date drug_exposure_end_date verbatim_end_date stop_reason refills quantity days_supply sig lot_number provider_id visit_occurrence_id drug_source_value route_source_value dose_unit_source_value
0 1 1001 2023-01-01 2023-01-10 2023-01-10 Completed course 0 10 10 Take 1 tablet daily LOT1001 12345 111 DrugA Oral mg
1 2 1002 2023-02-15 2023-03-01 2023-03-01 Adverse reaction 1 30 15 Take 2 tablets twice daily LOT1002 67890 222 DrugB Oral mg

location

Table Preview(first 5 rows)

location_id address_1 address_2 city state zip county location_source_value latitude longitude
0 1 123 Main St Suite 100 Springfield NY 12345 Hampden L123 40.7128 -74.0060
1 2 456 Elm St Apt 202 Riverdale CA 67890 Archie L456 34.0522 -118.2437

measurement

Table Preview(first 5 rows)

measurement_id person_id measurement_date measurement_datetime value_as_number provider_id visit_occurrence_id measurement_source_value unit_source_value value_source_value
0 1001 2001 2023-01-15 2023-01-15 08:00:00 5.6 8001 9001 BP_SYS mmHg 120
1 1002 2002 2023-01-16 2023-01-16 09:30:00 7.8 8002 9002 BP_DIA mmHg 80

observation

Table Preview(first 5 rows)

observation_id person_id observation_date observation_datetime value_as_number value_as_string provider_id visit_occurrence_id observation_source_value unit_source_value qualifier_source_value
0 1001 123 2023-01-01 2023-01-01 08:00:00 98.6 Normal 7001 200 Blood Pressure mmHg Resting
1 1002 456 2023-01-02 2023-01-02 09:30:00 99.5 Elevated 7002 201 Heart Rate beats/min After Exercise

payer_plan_period

Table Preview(first 5 rows)

payer_plan_period_id person_id payer_plan_period_start_date payer_plan_period_end_date stop_reason payer_source_value plan_source_value sponsor_source_value family_plan_source_value
0 1001 123 2023-01-01 2023-12-31 End of contract PayerA PlanA SponsorA FamilyPlanA
1 1002 456 2023-06-01 2023-12-31 Change of employment PayerB PlanB SponsorB FamilyPlanB

person

Table Preview(first 5 rows)

person_id year_of_birth month_of_birth day_of_birth birth_datetime location_id provider_id care_site_id person_source_value gender_source_value race_source_value ethnicity_source_value
0 1 1985 5 20 1985-05-20 1 1 1 12345 M White Not Hispanic or Latino
1 2 1990 8 10 1990-08-10 2 2 2 67890 F Black or African American Hispanic or Latino

procedure_occurrence

Table Preview(first 5 rows)

procedure_occurrence_id person_id procedure_date procedure_datetime quantity provider_id visit_occurrence_id visit_detail_id procedure_source_value qualifier_source_value procedure_cost
0 1 101 2023-01-15 2023-01-15 10:00:00 1 101 200 0 PROC1 NaN 200.0
1 2 102 2023-02-20 2023-02-20 14:30:00 2 102 200 0 PROC2 NaN 450.0

provider

Table Preview(first 5 rows)

provider_id provider_name npi dea care_site_id provider_source_value specialty_source_value gender_source_value
0 101 Dr. Jane Smith 1234567890 AB1234567 10 P101 Cardiology F
1 102 Dr. John Doe 987654321 CD7654321 20 P102 Neurology M

specimen

Table Preview(first 5 rows)

specimen_id person_id specimen_date specimen_datetime quantity specimen_source_value unit_source_value anatomic_site_source_value disease_status_source_value
0 101 2001 2023-01-15 2023-01-15 10:00:00 1.5 Blood Sample ml Arm Healthy
1 102 2002 2023-01-20 2023-01-20 11:30:00 2.0 Tissue Sample g Liver Diseased

visit_detail

Table Preview(first 5 rows)

visit_detail_id person_id visit_detail_start_date visit_detail_start_datetime visit_detail_end_date visit_detail_end_datetime provider_id care_site_id visit_detail_source_value admitting_source_value discharge_to_source_value visit_occurrence_id
0 1 1001 2023-01-15 2023-01-15 09:00:00 2023-01-15 2023-01-15 10:30:00 101 1 OP12345 NaN NaN 10001
1 2 1002 2023-02-01 2023-02-01 14:00:00 2023-02-03 2023-02-03 11:00:00 102 2 IP67890 NaN NaN 10002
2 3 1003 2023-03-10 2023-03-10 08:30:00 2023-03-10 2023-03-10 09:15:00 103 1 ER45678 NaN NaN 10003
3 4 1004 2023-04-05 2023-04-05 11:00:00 2023-04-05 2023-04-05 12:30:00 104 3 OP23456 NaN NaN 10004

visit_occurrence

Table Preview(first 5 rows)

visit_occurrence_id person_id visit_start_date visit_start_datetime visit_end_date visit_end_datetime provider_id care_site_id visit_source_value admitting_source_value discharge_to_source_value preceding_visit_occurrence_id
0 200 101 2023-01-10 2023-01-10 08:00:00 2023-01-12 2023-01-12 15:00:00 501 601 VS123 AS123 DS123 111
1 201 102 2023-02-15 2023-02-15 09:30:00 2023-02-16 2023-02-16 10:30:00 502 602 VS456 AS456 DS456 222
We display the stage tables, their SQL transformations, and their configurations.

stg_condition_occurrence

Table Preview (first 5 rows)

condition_occurrence_id person_id condition_start_date condition_start_datetime condition_end_date condition_end_datetime stop_reason provider_id visit_occurrence_id visit_detail_id condition_source_value condition_status_source_value
0 1 101 2021-01-10 2021-01-10 08:30:00 2021-01-20 2021-01-20 17:00:00 Resolved 150 200 300 Diabetes Active
1 2 102 2021-02-15 2021-02-15 09:45:00 2021-02-25 2021-02-25 16:30:00 Improved 151 201 301 Hypertension Controlled

stg_condition_occurrence.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:34:36.957786+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."condition_occurrence"

stg_condition_occurrence.yml (Document the table)

version: 2
models:
- name: stg_condition_occurrence
  description: The table is about condition occurrences for patients. It records details
    of medical conditions, including start and end dates/times, provider information,
    visit details, and condition status. Each row represents a specific condition
    occurrence for a patient, identified by a unique condition_occurrence_id and linked
    to a person_id. The table captures the timeline, context, and status of medical
    conditions experienced by patients.
  columns:
  - name: condition_occurrence_id
    description: Unique identifier for each condition occurrence
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each condition occurrence.
        For this table, each row represents a specific condition occurrence for a
        patient. condition_occurrence_id is unique across rows.
  - name: person_id
    description: Unique identifier for each patient
    tests:
    - not_null
  - name: condition_start_date
    description: Date when the condition was first recorded
    tests:
    - not_null
  - name: condition_start_datetime
    description: Precise date and time when condition was recorded
    tests:
    - not_null
  - name: condition_end_date
    description: Date when the condition was resolved
    tests:
    - not_null
  - name: condition_end_datetime
    description: Precise date and time when condition ended
    tests:
    - not_null
  - name: stop_reason
    description: Reason for condition ending or treatment stopping
    tests:
    - not_null
    - accepted_values:
        values:
        - Improved
        - Resolved
    cocoon_meta:
      future_accepted_values:
      - No improvement
      - Side effects
      - Treatment completed
      - Patient request
      - Allergic reaction
      - Ineffective
      - Contraindication
      - Condition worsened
      - Treatment goal achieved
      - Financial reasons
      - Lost to follow-up
      - New treatment available
      - Drug interaction
      - Pregnancy
      - Dosage issues
      - Compliance problems
      - Insurance denial
      - Transferred care
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
  - name: visit_occurrence_id
    description: Unique identifier for the patient visit
    tests:
    - not_null
  - name: visit_detail_id
    description: Identifier for specific details of the visit
    tests:
    - not_null
  - name: condition_source_value
    description: Original value or code for the condition
    tests:
    - not_null
  - name: condition_status_source_value
    description: Status of the condition as reported
    tests:
    - not_null
    - accepted_values:
        values:
        - Active
        - Controlled
    cocoon_meta:
      future_accepted_values:
      - Inactive
      - Resolved
      - In Remission
      - Chronic
      - Acute
      - Recurrent
      - Worsening
      - Improving
      - Stable
      - Unknown
      - Not Applicable
  tests: []

stg_death

Table Preview (first 5 rows)

person_id death_date death_datetime cause_source_value
0 1 2023-05-15 2023-05-15 14:30:00 I21.9
1 2 2023-06-20 2023-06-20 08:45:00 C50.9

stg_death.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:35:10.660283+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."death"

stg_death.yml (Document the table)

version: 2
models:
- name: stg_death
  description: The table is about death records. It contains details of individual
    deaths. Each record includes a person ID, death date, precise death time, and
    cause of death. The cause of death is coded using a standardized system. This
    table likely serves as a comprehensive record of mortality data within a healthcare
    or population database.
  columns:
  - name: person_id
    description: Unique identifier for each individual
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each individual.
        For this table, each row represents a single death record. person_id is likely
        to be unique across rows as it's typically assigned to uniquely identify a
        person in a database.
  - name: death_date
    description: Date of death
    tests:
    - not_null
  - name: death_datetime
    description: Precise date and time of death
    tests:
    - not_null
  - name: cause_source_value
    description: Coded value representing cause of death
    tests:
    - not_null
  tests: []

stg_provider

Table Preview (first 5 rows)

provider_id provider_name npi dea care_site_id provider_source_value specialty_source_value gender_source_value
0 101 Dr. Jane Smith 1234567890 AB1234567 10 P101 Cardiology F
1 102 Dr. John Doe 987654321 CD7654321 20 P102 Neurology M

stg_provider.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:39:36.706351+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."provider"

stg_provider.yml (Document the table)

version: 2
models:
- name: stg_provider
  description: The table is about healthcare providers. It contains details like provider
    ID, name, NPI number, DEA number, care site ID, provider source value, specialty,
    and gender. Each row represents a unique provider with their identifying information
    and professional details. The table likely serves as a reference for provider
    information in a healthcare system.
  columns:
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each healthcare
        provider. For this table, each row represents a distinct provider, and provider_id
        is expected to be unique across rows.
  - name: provider_name
    description: Full name of the healthcare provider
    tests:
    - not_null
  - name: npi
    description: National Provider Identifier number
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: The National Provider Identifier is a unique 10-digit identification
        number issued to health care providers in the United States. Each provider
        should have a unique NPI, making it a potential candidate key.
  - name: dea
    description: Drug Enforcement Administration registration number
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: The Drug Enforcement Administration registration number is unique
        to each provider authorized to prescribe controlled substances. It should
        be unique for each provider in the table.
  - name: care_site_id
    description: Identifier for the provider's care site
    tests:
    - not_null
  - name: provider_source_value
    description: Source system's unique identifier for the provider
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This is described as the source system's unique identifier for
        the provider. If it's truly unique in the source system, it should be unique
        for each provider in this table.
  - name: specialty_source_value
    description: Provider's medical specialty
    tests:
    - not_null
  - name: gender_source_value
    description: Gender of the healthcare provider
    tests:
    - not_null
    - accepted_values:
        values:
        - F
        - M
    cocoon_meta:
      future_accepted_values:
      - Other
      - Non-binary
      - Transgender
      - Prefer not to say
      - Unknown
  tests: []

stg_drug_exposure

Table Preview (first 5 rows)

drug_exposure_id person_id drug_exposure_start_date drug_exposure_end_date verbatim_end_date stop_reason refills quantity days_supply sig lot_number provider_id visit_occurrence_id drug_source_value route_source_value dose_unit_source_value
0 1 1001 2023-01-01 2023-01-10 2023-01-10 Completed course 0 10 10 Take 1 tablet daily LOT1001 12345 111 DrugA Oral mg
1 2 1002 2023-02-15 2023-03-01 2023-03-01 Adverse reaction 1 30 15 Take 2 tablets twice daily LOT1002 67890 222 DrugB Oral mg

stg_drug_exposure.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:36:23.784338+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."drug_exposure"

stg_drug_exposure.yml (Document the table)

version: 2
models:
- name: stg_drug_exposure
  description: The table is about drug exposures. It contains details like drug exposure
    ID, person ID, start and end dates, reason for stopping, quantity, days supply,
    and dosage instructions. It also includes information on the drug source, route,
    provider, and associated visit. Each row represents a single drug exposure event
    for a patient.
  columns:
  - name: drug_exposure_id
    description: Unique identifier for each drug exposure event
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each drug exposure event.
        For this table, each row represents a single drug exposure event, so drug_exposure_id
        should be unique across rows.
  - name: person_id
    description: Unique identifier for the patient
    tests:
    - not_null
  - name: drug_exposure_start_date
    description: Date when the drug exposure began
    tests:
    - not_null
  - name: drug_exposure_end_date
    description: Date when the drug exposure ended
    tests:
    - not_null
  - name: verbatim_end_date
    description: Exact end date as recorded
    tests:
    - not_null
  - name: stop_reason
    description: Reason for discontinuing the drug
    tests:
    - not_null
    - accepted_values:
        values:
        - Adverse reaction
        - Completed course
    cocoon_meta:
      future_accepted_values:
      - Ineffective
      - Patient decision
      - Cost
      - Drug interaction
      - Contraindication
      - Dosage adjustment
      - Allergic reaction
      - Side effects
      - Pregnancy
      - Change in treatment plan
      - Drug shortage
      - Physician decision
      - Recovery
      - Toxicity
      - Drug resistance
      - Non-compliance
      - Scheduled discontinuation
      - Other
  - name: refills
    description: Number of refills prescribed
    tests:
    - not_null
  - name: quantity
    description: Amount of drug dispensed
    tests:
    - not_null
  - name: days_supply
    description: Number of days the dispensed quantity covers
    tests:
    - not_null
  - name: sig
    description: Instructions for drug administration
    tests:
    - not_null
  - name: lot_number
    description: Manufacturer's lot number for the drug
    tests:
    - not_null
  - name: provider_id
    description: Unique identifier for the prescribing provider
    tests:
    - not_null
  - name: visit_occurrence_id
    description: Identifier for the associated healthcare visit
    tests:
    - not_null
  - name: drug_source_value
    description: Original value for the drug name
    tests:
    - not_null
  - name: route_source_value
    description: Method of drug administration
    tests:
    - not_null
    - accepted_values:
        values:
        - Oral
    cocoon_meta:
      future_accepted_values:
      - Intravenous
      - Intramuscular
      - Subcutaneous
      - Topical
      - Inhalation
      - Sublingual
      - Transdermal
      - Rectal
      - Ophthalmic
      - Otic
      - Nasal
      - Buccal
      - Vaginal
  - name: dose_unit_source_value
    description: Unit of measurement for the dose
    tests:
    - not_null
    - accepted_values:
        values:
        - mg
    cocoon_meta:
      future_accepted_values:
      - g
      - mcg
      - kg
      - mL
      - L
      - IU
      - mmol
      - mol
      - meq
      - U
      - '%'
      - mcg/mL
      - mg/mL
      - mg/L
      - ng/mL
      - ug/mL
      - mmol/L
      - units
      - tablets
      - capsules
      - drops
      - puffs
      - patches
  tests: []

stg_measurement

Table Preview (first 5 rows)

measurement_id person_id measurement_date measurement_datetime value_as_number provider_id visit_occurrence_id measurement_source_value unit_source_value value_source_value
0 1001 2001 2023-01-15 2023-01-15 08:00:00 5.6 8001 9001 BP_SYS mmHg 120
1 1002 2002 2023-01-16 2023-01-16 09:30:00 7.8 8002 9002 BP_DIA mmHg 80

stg_measurement.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:37:15.254105+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."measurement"

stg_measurement.yml (Document the table)

version: 2
models:
- name: stg_measurement
  description: The table is about medical measurements. It includes unique measurement
    IDs, person IDs, measurement dates and times, measured values, provider IDs, visit
    IDs, measurement types, units, and source values. Each row represents a single
    measurement taken for a specific person during a healthcare visit. The table captures
    both numeric values and their corresponding units for various medical measurements.
  columns:
  - name: measurement_id
    description: Unique identifier for each measurement
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each measurement. For
        this table, each row represents a single measurement, so measurement_id should
        be unique across rows.
  - name: person_id
    description: Unique identifier for each patient
    tests:
    - not_null
  - name: measurement_date
    description: Date when the measurement was taken
    tests:
    - not_null
  - name: measurement_datetime
    description: Date and time when the measurement was taken
    tests:
    - not_null
  - name: value_as_number
    description: Numeric value of the measurement
    tests:
    - not_null
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
  - name: visit_occurrence_id
    description: Unique identifier for the healthcare visit
    tests:
    - not_null
  - name: measurement_source_value
    description: Type or source of the measurement
    tests:
    - not_null
  - name: unit_source_value
    description: Unit of measurement
    tests:
    - not_null
    - accepted_values:
        values:
        - mmHg
    cocoon_meta:
      future_accepted_values:
      - cm
      - m
      - km
      - in
      - ft
      - yd
      - mi
      - g
      - kg
      - lb
      - oz
      - L
      - mL
      - gal
      - qt
      - pt
      - fl oz
      - "\xB0C"
      - "\xB0F"
      - K
      - mol
      - A
      - V
      - W
      - J
      - N
      - Pa
      - Hz
  - name: value_source_value
    description: Original source value of the measurement
    tests:
    - not_null
  tests: []

stg_payer_plan_period

Table Preview (first 5 rows)

payer_plan_period_id person_id payer_plan_period_start_date payer_plan_period_end_date stop_reason payer_source_value plan_source_value sponsor_source_value family_plan_source_value
0 1001 123 2023-01-01 2023-12-31 End of contract PayerA PlanA SponsorA FamilyPlanA
1 1002 456 2023-06-01 2023-12-31 Change of employment PayerB PlanB SponsorB FamilyPlanB

stg_payer_plan_period.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:38:22.759729+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."payer_plan_period"

stg_payer_plan_period.yml (Document the table)

version: 2
models:
- name: stg_payer_plan_period
  description: The table is about payer plan periods. It contains details of insurance
    coverage for individuals. Each record includes a unique ID, person ID, start and
    end dates of coverage, reason for stopping, payer information, plan details, sponsor,
    and family plan information. The table tracks insurance coverage periods and associated
    details for each person.
  columns:
  - name: payer_plan_period_id
    description: Unique identifier for each payer plan period
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each payer plan period.
        For this table, each row represents a distinct insurance coverage period.
        payer_plan_period_id is unique across rows.
  - name: person_id
    description: Unique identifier for each individual
    tests:
    - not_null
  - name: payer_plan_period_start_date
    description: Start date of the insurance coverage period
    tests:
    - not_null
  - name: payer_plan_period_end_date
    description: End date of the insurance coverage period
    tests:
    - not_null
  - name: stop_reason
    description: Reason for termination of the insurance coverage
    tests:
    - not_null
    - accepted_values:
        values:
        - Change of employment
        - End of contract
    cocoon_meta:
      future_accepted_values:
      - Retirement
      - Death
      - Voluntary cancellation
      - Non-payment of premiums
      - Policy expiration
      - Fraud
      - Loss of eligibility
      - Relocation
      - Divorce
      - Company policy change
      - Switching to different insurance
      - Bankruptcy of insurer
      - Legal requirements
  - name: payer_source_value
    description: Identifier or name of the insurance payer
    tests:
    - not_null
  - name: plan_source_value
    description: Identifier or name of the insurance plan
    tests:
    - not_null
  - name: sponsor_source_value
    description: Identifier or name of the plan sponsor
    tests:
    - not_null
  - name: family_plan_source_value
    description: Identifier or name of the family plan
    tests:
    - not_null
  tests: []

stg_visit_occurrence

Table Preview (first 5 rows)

visit_occurrence_id person_id visit_start_date visit_start_datetime visit_end_date visit_end_datetime provider_id care_site_id visit_source_value admitting_source_value discharge_to_source_value preceding_visit_occurrence_id
0 200 101 2023-01-10 2023-01-10 08:00:00 2023-01-12 2023-01-12 15:00:00 501 601 VS123 AS123 DS123 111
1 201 102 2023-02-15 2023-02-15 09:30:00 2023-02-16 2023-02-16 10:30:00 502 602 VS456 AS456 DS456 222

stg_visit_occurrence.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:41:06.097924+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."visit_occurrence"

stg_visit_occurrence.yml (Document the table)

version: 2
models:
- name: stg_visit_occurrence
  description: The table is about patient visit occurrences. It includes details such
    as visit ID, patient ID, visit dates and times, provider and care site IDs, and
    source values for various aspects of the visit. The table also tracks preceding
    visits, allowing for the establishment of visit histories for patients.
  columns:
  - name: visit_occurrence_id
    description: Unique identifier for each visit
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each visit. For
        this table, each row represents a single patient visit. visit_occurrence_id
        is unique across rows.
  - name: person_id
    description: Unique identifier for each patient
    tests:
    - not_null
  - name: visit_start_date
    description: Date when the visit started
    tests:
    - not_null
  - name: visit_start_datetime
    description: Date and time when the visit started
    tests:
    - not_null
  - name: visit_end_date
    description: Date when the visit ended
    tests:
    - not_null
  - name: visit_end_datetime
    description: Date and time when the visit ended
    tests:
    - not_null
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
  - name: care_site_id
    description: Unique identifier for the care site
    tests:
    - not_null
  - name: visit_source_value
    description: Original code for the visit type
    tests:
    - not_null
  - name: admitting_source_value
    description: Original code for the admitting source
    tests:
    - not_null
  - name: discharge_to_source_value
    description: Original code for the discharge destination
    tests:
    - not_null
  - name: preceding_visit_occurrence_id
    description: ID of the patient's previous visit
    tests:
    - not_null
  tests: []

stg_specimen

Table Preview (first 5 rows)

specimen_id person_id specimen_date specimen_datetime quantity specimen_source_value unit_source_value anatomic_site_source_value disease_status_source_value
0 101 2001 2023-01-15 2023-01-15 10:00:00 1.5 Blood Sample ml Arm Healthy
1 102 2002 2023-01-20 2023-01-20 11:30:00 2.0 Tissue Sample g Liver Diseased

stg_specimen.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:40:03.186834+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."specimen"

stg_specimen.yml (Document the table)

version: 2
models:
- name: stg_specimen
  description: The table is about biological specimens. It contains details of sample
    collection. Each specimen has an ID, associated person ID, collection date and
    time. The table includes quantity, source, unit, anatomical site, and health status.
    This data allows tracking and analysis of biological samples from different individuals.
  columns:
  - name: specimen_id
    description: Unique identifier for each biological specimen
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each biological
        specimen. For this table, each row is for a distinct specimen, and specimen_id
        is unique across rows.
  - name: person_id
    description: Identifier for the person providing the specimen
    tests:
    - not_null
  - name: specimen_date
    description: Date when the specimen was collected
    tests:
    - not_null
  - name: specimen_datetime
    description: Date and time when the specimen was collected
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents the date and time when the specimen was
        collected. For this table, each row is for a distinct specimen, and it's highly
        unlikely (but not impossible) for two specimens to be collected at the exact
        same second. However, it could potentially be unique across rows.
  - name: quantity
    description: Amount of specimen collected
    tests:
    - not_null
  - name: specimen_source_value
    description: Type or origin of the biological specimen
    tests:
    - not_null
  - name: unit_source_value
    description: Unit of measurement for the specimen quantity
    tests:
    - not_null
    - accepted_values:
        values:
        - g
        - ml
    cocoon_meta:
      future_accepted_values:
      - mg
      - "\xB5g"
      - ng
      - kg
      - L
      - dL
      - cL
      - mL
      - "\xB5L"
      - oz
      - lb
      - fl oz
      - pt
      - qt
      - gal
      - cc
      - "m\xB3"
      - "cm\xB3"
      - "mm\xB3"
      - IU
      - mol
      - mmol
      - "\xB5mol"
      - nmol
  - name: anatomic_site_source_value
    description: Body part from which the specimen was collected
    tests:
    - not_null
  - name: disease_status_source_value
    description: Health condition of the specimen source
    tests:
    - not_null
    - accepted_values:
        values:
        - Diseased
        - Healthy
    cocoon_meta:
      future_accepted_values:
      - Normal
      - Control
      - Asymptomatic
      - Pre-symptomatic
      - Acute
      - Chronic
      - Remission
      - Recovered
      - Unknown
  tests: []

stg_observation

Table Preview (first 5 rows)

observation_id person_id observation_date observation_datetime value_as_number value_as_string provider_id visit_occurrence_id observation_source_value unit_source_value qualifier_source_value
0 1001 123 2023-01-01 2023-01-01 08:00:00 98.6 Normal 7001 200 Blood Pressure mmHg Resting
1 1002 456 2023-01-02 2023-01-02 09:30:00 99.5 Elevated 7002 201 Heart Rate beats/min After Exercise

stg_observation.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:37:52.917338+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."observation"

stg_observation.yml (Document the table)

version: 2
models:
- name: stg_observation
  description: The table is about medical observations. It contains details of individual
    observations made on patients. Each observation has an ID, patient ID, date, time,
    numeric value, string value, provider ID, visit ID, and source information. The
    table includes specifics like observation type, unit of measurement, and qualifiers.
    It allows tracking of various medical measurements and assessments for patients
    over time.
  columns:
  - name: observation_id
    description: Unique identifier for each medical observation
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column appears to be a unique identifier for each observation.
        For this table, where each row is an individual observation, observation_id
        is likely to be unique across rows.
  - name: person_id
    description: Unique identifier for each patient
    tests:
    - not_null
  - name: observation_date
    description: Date when the observation was made
    tests:
    - not_null
  - name: observation_datetime
    description: Date and time when the observation was made
    tests:
    - not_null
  - name: value_as_number
    description: Numeric value of the observation
    tests:
    - not_null
  - name: value_as_string
    description: Textual description of the observation result
    tests:
    - not_null
    - accepted_values:
        values:
        - Elevated
        - Normal
    cocoon_meta:
      future_accepted_values:
      - Low
      - Critical
      - Inconclusive
      - Within Reference Range
      - Abnormal
      - Borderline
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
  - name: visit_occurrence_id
    description: Unique identifier for the patient visit
    tests:
    - not_null
  - name: observation_source_value
    description: Type or name of the observation
    tests:
    - not_null
  - name: unit_source_value
    description: Unit of measurement for the observation
    tests:
    - not_null
    - accepted_values:
        values:
        - beats/min
        - mmHg
    cocoon_meta:
      future_accepted_values:
      - kg
      - cm
      - m
      - g
      - mg
      - mL
      - L
      - "\xB0C"
      - "\xB0F"
      - "\u03BCg/dL"
      - mg/dL
      - mmol/L
      - U/L
      - mEq/L
      - ng/mL
      - IU/mL
      - '%'
      - mm
      - in
      - lb
      - oz
      - mIU/L
      - pg/mL
      - breaths/min
  - name: qualifier_source_value
    description: Additional context or condition for the observation
    tests:
    - not_null
    - accepted_values:
        values:
        - After Exercise
        - Resting
    cocoon_meta:
      future_accepted_values:
      - Fasting
      - Post-meal
      - During Sleep
      - Standing
      - Sitting
      - Lying Down
      - Under Stress
      - After Medication
      - During Illness
      - Pre-exercise
      - During Exercise
      - Post-surgery
      - Pregnancy
      - Menstruation
      - Dehydrated
      - Well-hydrated
      - Cold Exposure
      - Heat Exposure
      - High Altitude
  tests: []

stg_care_site

Table Preview (first 5 rows)

care_site_id care_site_name location_id care_site_source_value place_of_service_source_value
0 1 City Hospital 1 CS001 Hospital
1 2 Rural Clinic 2 CS002 Clinic

stg_care_site.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:33:58.873124+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."care_site"

stg_care_site.yml (Document the table)

version: 2
models:
- name: stg_care_site
  description: The table is about care sites. It contains details of healthcare facilities.
    Each care site has a unique ID, name, and location. The table includes source
    values for the care site and place of service. Care sites can be hospitals, clinics,
    or other healthcare locations. The table links care sites to their physical locations
    and categorizes them by type of service.
  columns:
  - name: care_site_id
    description: Unique identifier for each care site
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each care site.
        For this table, each row represents a distinct care site. care_site_id is
        designed to be unique across rows.
  - name: care_site_name
    description: Name of the healthcare facility
    tests:
    - not_null
  - name: location_id
    description: Identifier for the physical location of the care site
    tests:
    - not_null
  - name: care_site_source_value
    description: Original source code or identifier for the care site
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents the original source code or identifier
        for the care site. If this is consistently used as a unique identifier in
        the source system, it could be unique across rows.
  - name: place_of_service_source_value
    description: Type or category of healthcare service provided
    tests:
    - not_null
    - accepted_values:
        values:
        - Clinic
        - Hospital
    cocoon_meta:
      future_accepted_values:
      - Emergency Room
      - Urgent Care
      - Doctor's Office
      - Outpatient Facility
      - Inpatient Facility
      - Ambulatory Surgery Center
      - Nursing Home
      - Home Health
      - Telehealth
      - Pharmacy
      - Laboratory
      - Imaging Center
      - Rehabilitation Center
  tests: []

stg_visit_detail

Table Preview (first 5 rows)

visit_detail_id person_id visit_detail_start_date visit_detail_start_datetime visit_detail_end_date visit_detail_end_datetime provider_id care_site_id visit_detail_source_value admitting_source_value discharge_to_source_value visit_occurrence_id
0 1 1001 2023-01-15 2023-01-15 09:00:00 2023-01-15 2023-01-15 10:30:00 101 1 OP12345 NaN NaN 10001
1 2 1002 2023-02-01 2023-02-01 14:00:00 2023-02-03 2023-02-03 11:00:00 102 2 IP67890 NaN NaN 10002
2 3 1003 2023-03-10 2023-03-10 08:30:00 2023-03-10 2023-03-10 09:15:00 103 1 ER45678 NaN NaN 10003
3 4 1004 2023-04-05 2023-04-05 11:00:00 2023-04-05 2023-04-05 12:30:00 104 3 OP23456 NaN NaN 10004

stg_visit_detail.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:40:34.903624+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."visit_detail"

stg_visit_detail.yml (Document the table)

version: 2
models:
- name: stg_visit_detail
  description: The table is about visit details for patients. It includes information
    on visit start and end times, care providers, care sites, and visit types. Each
    visit detail is linked to a specific person and a broader visit occurrence. The
    table captures both outpatient and inpatient visits, as well as emergency room
    visits. It provides a comprehensive record of each patient's interaction with
    the healthcare system.
  columns:
  - name: visit_detail_id
    description: Unique identifier for each visit detail
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each visit detail. For
        this table, each row represents a distinct visit detail. visit_detail_id is
        unique across rows.
  - name: person_id
    description: Unique identifier for each patient
    tests:
    - not_null
  - name: visit_detail_start_date
    description: Date when the visit started
    tests:
    - not_null
  - name: visit_detail_start_datetime
    description: Precise date and time when visit started
    tests:
    - not_null
  - name: visit_detail_end_date
    description: Date when the visit ended
    tests:
    - not_null
  - name: visit_detail_end_datetime
    description: Precise date and time when visit ended
    tests:
    - not_null
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
  - name: care_site_id
    description: Identifier for the location of care
    tests:
    - not_null
  - name: visit_detail_source_value
    description: Original code for the visit type
    tests:
    - not_null
    - accepted_values:
        values:
        - ER45678
        - IP67890
        - OP12345
        - OP23456
    cocoon_meta:
      future_accepted_values:
      - ER%
      - IP%
      - OP%
      - UC%
      - SC%
      - PC%
      - HH%
      - TH%
      - RC%
      - DC%
      - IC%
      - OB%
      - PD%
      - GE%
      - ON%
  - name: admitting_source_value
    description: Source from which patient was admitted
    cocoon_meta:
      missing_reason: Unknown
  - name: discharge_to_source_value
    description: Destination to which patient was discharged
    cocoon_meta:
      missing_reason: Unknown
  - name: visit_occurrence_id
    description: ID linking to broader visit occurrence
    tests:
    - not_null
  tests: []

stg_person

Table Preview (first 5 rows)

person_id year_of_birth month_of_birth day_of_birth birth_datetime location_id provider_id care_site_id person_source_value gender_source_value race_source_value ethnicity_source_value
0 1 1985 5 20 1985-05-20 1 1 1 12345 M White Not Hispanic or Latino
1 2 1990 8 10 1990-08-10 2 2 2 67890 F Black or African American Hispanic or Latino

stg_person.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:38:52.434116+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."person"

stg_person.yml (Document the table)

version: 2
models:
- name: stg_person
  description: The table is about individual persons. It contains demographic details
    like birth date, gender, race, and ethnicity. It also includes identifiers like
    person_id and source values. The table links persons to locations, providers,
    and care sites through foreign keys. This suggests it's likely part of a healthcare
    or medical database system.
  columns:
  - name: person_id
    description: Unique identifier for each person
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each person. For this
        table, each row represents a distinct person, and person_id is unique across
        rows.
  - name: year_of_birth
    description: Year the person was born
    tests:
    - not_null
  - name: month_of_birth
    description: Month the person was born
    tests:
    - not_null
  - name: day_of_birth
    description: Day of the month the person was born
    tests:
    - not_null
  - name: birth_datetime
    description: Full date and time of birth
    tests:
    - not_null
  - name: location_id
    description: Foreign key linking to a location table
    tests:
    - not_null
  - name: provider_id
    description: Foreign key linking to a healthcare provider table
    tests:
    - not_null
  - name: care_site_id
    description: Foreign key linking to a care site table
    tests:
    - not_null
  - name: person_source_value
    description: Original identifier from the data source
    tests:
    - not_null
  - name: gender_source_value
    description: Gender of the person
    tests:
    - not_null
    - accepted_values:
        values:
        - F
        - M
    cocoon_meta:
      future_accepted_values:
      - O
      - U
      - N
      - X
  - name: race_source_value
    description: Racial background of the person
    tests:
    - not_null
    - accepted_values:
        values:
        - Black or African American
        - White
    cocoon_meta:
      future_accepted_values:
      - Asian
      - American Indian or Alaska Native
      - Native Hawaiian or Other Pacific Islander
      - Hispanic or Latino
      - Two or More Races
      - Other
      - Unknown
      - Declined to Answer
  - name: ethnicity_source_value
    description: Ethnic background of the person
    tests:
    - not_null
    - accepted_values:
        values:
        - Hispanic or Latino
        - Not Hispanic or Latino
    cocoon_meta:
      future_accepted_values:
      - Unknown
      - Declined to answer
      - Other
  tests: []

stg_device

Table Preview (first 5 rows)

device_id person_id device_exposure_start_date device_exposure_start_datetime device_exposure_end_date device_exposure_end_datetime unique_device_id quantity provider_id visit_occurrence_id device_source_value
0 1001 2001 2023-01-15 2023-01-15 08:00:00 2023-01-20 2023-01-20 18:00:00 UD1001 1 5001 200 DeviceA
1 1002 2002 2023-02-20 2023-02-20 09:30:00 2023-02-25 2023-02-25 16:30:00 UD1002 2 5002 201 DeviceB

stg_device.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:35:41.749218+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."device"

stg_device.yml (Document the table)

version: 2
models:
- name: stg_device
  description: The table is about device exposures. It records when a device was used
    by a person. The table includes device identifiers, person identifiers, start
    and end dates/times of exposure, quantity, provider information, visit details,
    and device type. Each row represents a single device exposure event with its associated
    details.
  columns:
  - name: device_id
    description: Unique identifier for each device exposure record
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each device exposure
        record. For this table, each row is a single device exposure event. device_id
        is unique across rows as it's designed to identify each specific exposure
        event.
  - name: person_id
    description: Unique identifier for the person using the device
    tests:
    - not_null
  - name: device_exposure_start_date
    description: Date when device exposure started
    tests:
    - not_null
  - name: device_exposure_start_datetime
    description: Exact date and time when device exposure started
    tests:
    - not_null
  - name: device_exposure_end_date
    description: Date when device exposure ended
    tests:
    - not_null
  - name: device_exposure_end_datetime
    description: Exact date and time when device exposure ended
    tests:
    - not_null
  - name: unique_device_id
    description: Unique identifier for the specific device used
    tests:
    - not_null
  - name: quantity
    description: Number of devices used in this exposure
    tests:
    - not_null
  - name: provider_id
    description: Identifier for the healthcare provider overseeing the exposure
    tests:
    - not_null
  - name: visit_occurrence_id
    description: Identifier for the associated healthcare visit
    tests:
    - not_null
  - name: device_source_value
    description: Original value or code for the device type
    tests:
    - not_null
  tests: []

stg_cost

Table Preview (first 5 rows)

cost_id person_id total_charge total_paid amount_allowed paid_by_payer paid_by_patient paid_patient_copay paid_patient_coinsurance paid_patient_deductible paid_by_primary paid_ingredient_cost paid_dispensing_fee
0 1001 1 500.0 450.0 450.0 400.0 50.0 25.0 15.0 10.0 400.0 300.0 20.0
1 1002 2 1500.0 1400.0 1400.0 1300.0 100.0 50.0 40.0 10.0 1300.0 1000.0 50.0

stg_cost.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:34:53.984808+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."cost"

stg_cost.yml (Document the table)

version: 2
models:
- name: stg_cost
  description: The table is about medical or prescription costs. It contains details
    of charges, payments, and allocations. Each row represents a single cost entry.
    It includes total charges, amounts paid, and breakdowns of payments by payer,
    patient, and primary insurance. The table also specifies copay, coinsurance, deductible,
    ingredient cost, and dispensing fee amounts.
  columns:
  - name: cost_id
    description: Unique identifier for each cost entry
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column represents a unique identifier for each cost entry.
        For this table, each row is for a single cost entry. cost_id is unique across
        rows.
  - name: person_id
    description: Unique identifier for the person associated with the cost
    tests:
    - not_null
  - name: total_charge
    description: Total amount charged for the medical service or prescription
    tests:
    - not_null
  - name: total_paid
    description: Total amount paid for the medical service or prescription
    tests:
    - not_null
  - name: amount_allowed
    description: Maximum amount allowed by the insurance plan
    tests:
    - not_null
  - name: paid_by_payer
    description: Amount paid by the insurance company
    tests:
    - not_null
  - name: paid_by_patient
    description: Total amount paid by the patient
    tests:
    - not_null
  - name: paid_patient_copay
    description: Fixed amount paid by patient for the service
    tests:
    - not_null
  - name: paid_patient_coinsurance
    description: Percentage of cost paid by patient after deductible
    tests:
    - not_null
  - name: paid_patient_deductible
    description: Amount paid by patient towards their deductible
    tests:
    - not_null
  - name: paid_by_primary
    description: Amount paid by primary insurance
    tests:
    - not_null
  - name: paid_ingredient_cost
    description: Cost of ingredients for prescription drugs
    tests:
    - not_null
  - name: paid_dispensing_fee
    description: Fee charged for dispensing prescription drugs
    tests:
    - not_null
  tests: []

stg_location

Table Preview (first 5 rows)

location_id address_1 address_2 city state zip county location_source_value latitude longitude
0 1 123 Main St Suite 100 Springfield NY 12345 Hampden L123 40.7128 -74.0060
1 2 456 Elm St Apt 202 Riverdale CA 67890 Archie L456 34.0522 -118.2437

stg_location.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:36:52.091248+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."location"

stg_location.yml (Document the table)

version: 2
models:
- name: stg_location
  description: The table is about location information. It contains details such as
    address, city, state, zip code, county, and geographical coordinates. Each location
    has a unique identifier. The table includes both street address components and
    latitude/longitude coordinates. It also has a source value field, possibly for
    tracking data origin.
  columns:
  - name: location_id
    description: Unique identifier for each location
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each location. For this
        table, each row represents a distinct location, and location_id is unique
        across rows.
  - name: address_1
    description: Primary street address
    tests:
    - not_null
  - name: address_2
    description: Secondary address information (e.g., apartment, suite)
    tests:
    - not_null
  - name: city
    description: Name of the city
    tests:
    - not_null
  - name: state
    description: State abbreviation
    tests:
    - not_null
    - accepted_values:
        values:
        - CA
        - NY
    cocoon_meta:
      future_accepted_values:
      - AL
      - AK
      - AZ
      - AR
      - CO
      - CT
      - DE
      - FL
      - GA
      - HI
      - ID
      - IL
      - IN
      - IA
      - KS
      - KY
      - LA
      - ME
      - MD
      - MA
      - MI
      - MN
      - MS
      - MO
      - MT
      - NE
      - NV
      - NH
      - NJ
      - NM
      - NC
      - ND
      - OH
      - OK
      - OR
      - PA
      - RI
      - SC
      - SD
      - TN
      - TX
      - UT
      - VT
      - VA
      - WA
      - WV
      - WI
      - WY
  - name: zip
    description: Postal code
    tests:
    - not_null
  - name: county
    description: Name of the county
    tests:
    - not_null
  - name: location_source_value
    description: Source identifier for the location data
    tests:
    - not_null
  - name: latitude
    description: Geographic coordinate specifying north-south position
    tests:
    - not_null
  - name: longitude
    description: Geographic coordinate specifying east-west position
    tests:
    - not_null
  tests: []

stg_procedure_occurrence

Table Preview (first 5 rows)

procedure_occurrence_id person_id procedure_date procedure_datetime quantity provider_id visit_occurrence_id visit_detail_id procedure_source_value qualifier_source_value procedure_cost
0 1 101 2023-01-15 2023-01-15 10:00:00 1 101 200 0 PROC1 NaN 200.0
1 2 102 2023-02-20 2023-02-20 14:30:00 2 102 200 0 PROC2 NaN 450.0

stg_procedure_occurrence.sql (clean the table)

-- COCOON BLOCK START: PLEASE DO NOT MODIFY THIS BLOCK FOR SELF-MAINTENANCE
-- Generated at 2024-08-23 22:39:15.135495+00:00
-- COCOON BLOCK END
SELECT *
FROM "memory"."main"."procedure_occurrence"

stg_procedure_occurrence.yml (Document the table)

version: 2
models:
- name: stg_procedure_occurrence
  description: The table is about procedure occurrences. It contains details of medical
    procedures performed on patients. Each row represents a single procedure event.
    The table includes information such as the patient's ID, procedure date and time,
    quantity, provider ID, visit details, procedure source value, and cost. It captures
    the essential data related to medical procedures in a healthcare setting.
  columns:
  - name: procedure_occurrence_id
    description: Unique identifier for each procedure occurrence
    tests:
    - not_null
    - unique
    cocoon_meta:
      unique_reason: This column is a unique identifier for each procedure occurrence.
        For this table, each row represents a single procedure event. procedure_occurrence_id
        is unique across rows.
  - name: person_id
    description: Unique identifier for the patient
    tests:
    - not_null
  - name: procedure_date
    description: Date when the procedure was performed
    tests:
    - not_null
  - name: procedure_datetime
    description: Date and time when the procedure was performed
    tests:
    - not_null
  - name: quantity
    description: Number of times the procedure was performed
    tests:
    - not_null
  - name: provider_id
    description: Unique identifier for the healthcare provider
    tests:
    - not_null
  - name: visit_occurrence_id
    description: Identifier for the associated visit
    tests:
    - not_null
  - name: visit_detail_id
    description: Identifier for specific details of the visit
    tests:
    - not_null
  - name: procedure_source_value
    description: Original code or identifier for the procedure
    tests:
    - not_null
  - name: qualifier_source_value
    description: Additional qualifying information for the procedure
    cocoon_meta:
      missing_reason: Additional qualifiers may not be needed for all procedures.
  - name: procedure_cost
    description: Cost of the procedure
    tests:
    - not_null
  tests: []
We identify the primary key (PK) and foreign key (FK) from tables. We build a join graph that connects FK to PK.

Join Graph (FK to PK)

%3 stg_observation_0 stg_observation stg_visit_occurrence_3 stg_visit_occurrence stg_observation_0->stg_visit_occurrence_3 stg_provider_8 stg_provider stg_observation_0->stg_provider_8 stg_person_14 stg_person stg_observation_0->stg_person_14 stg_payer_plan_period_1 stg_payer_plan_period stg_payer_plan_period_1->stg_person_14 stg_death_2 stg_death stg_death_2->stg_person_14 stg_visit_occurrence_3->stg_visit_occurrence_3 stg_visit_occurrence_3->stg_provider_8 stg_care_site_13 stg_care_site stg_visit_occurrence_3->stg_care_site_13 stg_visit_occurrence_3->stg_person_14 stg_measurement_4 stg_measurement stg_measurement_4->stg_visit_occurrence_3 stg_measurement_4->stg_provider_8 stg_measurement_4->stg_person_14 stg_visit_detail_5 stg_visit_detail stg_visit_detail_5->stg_visit_occurrence_3 stg_visit_detail_5->stg_provider_8 stg_visit_detail_5->stg_care_site_13 stg_visit_detail_5->stg_person_14 stg_specimen_6 stg_specimen stg_specimen_6->stg_person_14 stg_procedure_occurrence_7 stg_procedure_occurrence stg_procedure_occurrence_7->stg_visit_occurrence_3 stg_procedure_occurrence_7->stg_visit_detail_5 stg_procedure_occurrence_7->stg_provider_8 stg_procedure_occurrence_7->stg_person_14 stg_provider_8->stg_care_site_13 stg_device_9 stg_device stg_device_9->stg_visit_occurrence_3 stg_device_9->stg_provider_8 stg_device_9->stg_person_14 stg_location_10 stg_location stg_condition_occurrence_11 stg_condition_occurrence stg_condition_occurrence_11->stg_visit_occurrence_3 stg_condition_occurrence_11->stg_visit_detail_5 stg_condition_occurrence_11->stg_provider_8 stg_condition_occurrence_11->stg_person_14 stg_drug_exposure_12 stg_drug_exposure stg_drug_exposure_12->stg_visit_occurrence_3 stg_drug_exposure_12->stg_provider_8 stg_drug_exposure_12->stg_person_14 stg_care_site_13->stg_location_10 stg_person_14->stg_provider_8 stg_person_14->stg_location_10 stg_person_14->stg_care_site_13 stg_cost_15 stg_cost stg_cost_15->stg_person_14

cocoon_join.yml (Document the joins)

join_graph:
- table_name: stg_location
  primary_key: location_id
  foreign_keys: []
- table_name: stg_care_site
  foreign_keys:
  - column: location_id
    reference:
      table_name: stg_location
      column: location_id
  primary_key: care_site_id
- table_name: stg_person
  foreign_keys:
  - column: location_id
    reference:
      table_name: stg_location
      column: location_id
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: care_site_id
    reference:
      table_name: stg_care_site
      column: care_site_id
  primary_key: person_id
- table_name: stg_condition_occurrence
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_detail_id
    reference:
      table_name: stg_visit_detail
      column: visit_detail_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
- table_name: stg_cost
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
- table_name: stg_death
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
- table_name: stg_device
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
- table_name: stg_drug_exposure
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
- table_name: stg_measurement
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
- table_name: stg_observation
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
- table_name: stg_payer_plan_period
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
- table_name: stg_procedure_occurrence
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
  - column: visit_detail_id
    reference:
      table_name: stg_visit_detail
      column: visit_detail_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
- table_name: stg_specimen
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
- table_name: stg_visit_detail
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 50.00%
      explanation: ''
  - column: visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: care_site_id
    reference:
      table_name: stg_care_site
      column: care_site_id
    referential_integrity:
      '% orphan': 33.33%
      explanation: ''
  primary_key: visit_detail_id
- table_name: stg_visit_occurrence
  foreign_keys:
  - column: person_id
    reference:
      table_name: stg_person
      column: person_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: provider_id
    reference:
      table_name: stg_provider
      column: provider_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: preceding_visit_occurrence_id
    reference:
      table_name: stg_visit_occurrence
      column: visit_occurrence_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  - column: care_site_id
    reference:
      table_name: stg_care_site
      column: care_site_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
  primary_key: visit_occurrence_id
- table_name: stg_provider
  primary_key: provider_id
  foreign_keys:
  - column: care_site_id
    reference:
      table_name: stg_care_site
      column: care_site_id
    referential_integrity:
      '% orphan': 100.00%
      explanation: ''
We illustrate the step-by-step process behind the data.

All

Process Story

cocoon_er.yml (Document the entity relationships)

groups: []
entities:
- entity_name: Locations
  entity_description: Physical locations including address details and geographical
    coordinates
  table_name: stg_location
  primary_key: location_id
- entity_name: Patients
  entity_description: Individual patients with demographic information and links to
    other healthcare entities
  table_name: stg_person
  primary_key: person_id
- entity_name: Healthcare Providers
  entity_description: Medical professionals or entities providing healthcare services
  table_name: stg_provider
  primary_key: provider_id
- entity_name: Visit Details
  entity_description: Specific details of patient visits, including timing and care
    information
  table_name: stg_visit_detail
  primary_key: visit_detail_id
- entity_name: Visit Occurrences
  entity_description: Overall patient visits, encompassing multiple visit details
    and broader visit information
  table_name: stg_visit_occurrence
  primary_key: visit_occurrence_id
- entity_name: Care Sites
  entity_description: Healthcare facilities or locations where care is provided
  table_name: stg_care_site
  primary_key: care_site_id
relations:
- relation_name: PatientCareNetwork
  relation_description: Patients receive healthcare services from Providers at Care
    Sites, which are located in specific Locations.
  table_name: stg_person
  entities:
  - Patients
  - Locations
  - Healthcare Providers
  - Care Sites
- relation_name: ProviderCareSiteAffiliation
  relation_description: Healthcare Providers are affiliated with specific Care Sites
    where they practice and provide medical services.
  table_name: stg_provider
  entities:
  - Healthcare Providers
  - Care Sites
- relation_name: PatientVisitDetailRecord
  relation_description: Visit Details record specific interactions between Patients
    and Healthcare Providers at Care Sites during broader Visit Occurrences.
  table_name: stg_visit_detail
  entities:
  - Visit Details
  - Patients
  - Healthcare Providers
  - Visit Occurrences
  - Care Sites
- relation_name: PatientVisitOccurrenceDetails
  relation_description: Patients undergo Visit Occurrences at Care Sites, attended
    by Healthcare Providers, tracking details of each visit and its sequence.
  table_name: stg_visit_occurrence
  entities:
  - Visit Occurrences
  - Patients
  - Healthcare Providers
  - Care Sites
- relation_name: CareSiteLocation
  relation_description: Care Sites are physically situated at specific Locations,
    providing healthcare services in various settings.
  table_name: stg_care_site
  entities:
  - Care Sites
  - Locations
- relation_name: PatientConditionOccurrenceRecord
  relation_description: Patients experience condition occurrences, which are recorded
    during Visit Occurrences and Visit Details, often involving Healthcare Providers
    for diagnosis and treatment.
  table_name: stg_condition_occurrence
  entities:
  - Patients
  - Healthcare Providers
  - Visit Details
  - Visit Occurrences
- relation_description: This table records detailed cost information for medical or
    prescription charges associated with individual patients.
  table_name: stg_cost
  entities:
  - Patients
- relation_description: This table records death information for Patients, including
    their death date, time, and cause of death.
  table_name: stg_death
  entities:
  - Patients
- relation_name: PatientDeviceExposureDetails
  relation_description: Patients receive device exposures from Healthcare Providers
    during specific Visit Occurrences, tracking usage details and durations.
  table_name: stg_device
  entities:
  - Patients
  - Healthcare Providers
  - Visit Occurrences
- relation_name: PatientDrugExposures
  relation_description: Patients receive drug exposures prescribed by Healthcare Providers
    during specific Visit Occurrences, tracking detailed medication information and
    administration.
  table_name: stg_drug_exposure
  entities:
  - Patients
  - Healthcare Providers
  - Visit Occurrences
- relation_name: PatientMeasurementEncounters
  relation_description: Patients receive medical measurements from Healthcare Providers
    during specific Visit Occurrences, capturing detailed measurement data.
  table_name: stg_measurement
  entities:
  - Patients
  - Healthcare Providers
  - Visit Occurrences
- relation_name: PatientProviderVisitObservations
  relation_description: Patients receive medical observations from Healthcare Providers
    during Visit Occurrences, documenting various health measurements and assessments.
  table_name: stg_observation
  entities:
  - Patients
  - Healthcare Providers
  - Visit Occurrences
- relation_description: This table stores insurance coverage periods and plan details
    for individual Patients over time.
  table_name: stg_payer_plan_period
  entities:
  - Patients
- relation_name: PatientProcedureEvents
  relation_description: Patients undergo medical procedures performed by Healthcare
    Providers during Visit Occurrences, with specific Visit Details recorded for each
    procedure.
  table_name: stg_procedure_occurrence
  entities:
  - Patients
  - Healthcare Providers
  - Visit Details
  - Visit Occurrences
- relation_description: This stores biological specimens collected from patients for
    medical analysis and research purposes.
  table_name: stg_specimen
  entities:
  - Patients
story:
- name: PatientCareNetwork
  description: Healthcare system establishes network of providers and facilities.
  type: relation
- name: CareSiteLocation
  description: Care sites are set up in various locations.
  type: relation
- name: ProviderCareSiteAffiliation
  description: Healthcare providers join specific care sites.
  type: relation
- name: PatientVisitOccurrenceDetails
  description: Patient schedules and attends a visit.
  type: relation
- name: PatientVisitDetailRecord
  description: Provider documents specific interactions during the visit.
  type: relation
- name: PatientConditionOccurrenceRecord
  description: Provider diagnoses patient's condition during visit.
  type: relation
- name: PatientMeasurementEncounters
  description: Provider takes medical measurements to assess condition.
  type: relation
- name: PatientProviderVisitObservations
  description: Provider records additional health observations.
  type: relation
- name: PatientProcedureEvents
  description: Provider performs necessary medical procedures.
  type: relation
- name: PatientDrugExposures
  description: Provider prescribes medications for patient's condition.
  type: relation
- name: PatientDeviceExposureDetails
  description: Provider administers medical devices if required.
  type: relation