Gretel Relational
Synthetic Report

Generated 2024-08-01
Generation ID my-mysql-workflow-model-train-run-20240801071721
Composite Synthetic Data Quality Score: Good
Composite Privacy Protection Level: Normal

Table Relationships

The primary and foreign keys for each table in the synthesized database and their relationships are displayed below.

Table Name                 Primary Key          Foreign Keys
Branch_Expenses            ID                   Branch_ID
Branches                   Branch_ID
Detailed_Patients_Visits   ID                   Patients_visits_ID
Doctors                    Doctor_ID            Specialization_ID
Doctors_Contacts           ID                   Doctor_ID
Patients                   Patient_ID
Patients_Visits            Patients_visits_ID   Doctor_ID, Patient_ID, Visit_ID
Specializations            Specialization_ID    Branch_ID
Visits_Type                Visit_ID
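
The key relationships listed above can be explored programmatically. The following is a minimal sketch, not part of the Gretel report or API: it transcribes the keys into a plain dictionary and resolves each foreign key to a parent table. The report lists foreign key columns but not the tables they reference, so the sketch assumes that a foreign key column points to the table whose primary key has the same name. The names `SCHEMA` and `resolve_parents` are hypothetical.

```python
from collections import Counter

# Keys transcribed from the report: table -> (primary key, foreign key columns).
SCHEMA = {
    "Branch_Expenses": ("ID", ["Branch_ID"]),
    "Branches": ("Branch_ID", []),
    "Detailed_Patients_Visits": ("ID", ["Patients_visits_ID"]),
    "Doctors": ("Doctor_ID", ["Specialization_ID"]),
    "Doctors_Contacts": ("ID", ["Doctor_ID"]),
    "Patients": ("Patient_ID", []),
    "Patients_Visits": ("Patients_visits_ID", ["Doctor_ID", "Patient_ID", "Visit_ID"]),
    "Specializations": ("Specialization_ID", ["Branch_ID"]),
    "Visits_Type": ("Visit_ID", []),
}


def resolve_parents(schema):
    """Map each table to the parent tables its foreign keys appear to reference.

    Assumption: a foreign key column references the table whose primary key
    has the same name. Primary key names shared by several tables (here the
    generic "ID") are ambiguous and are never used as a match target.
    """
    pk_counts = Counter(pk for pk, _ in schema.values())
    pk_owner = {
        pk: table for table, (pk, _) in schema.items() if pk_counts[pk] == 1
    }
    return {
        table: [pk_owner[fk] for fk in fks if fk in pk_owner]
        for table, (_, fks) in schema.items()
    }


parents = resolve_parents(SCHEMA)

# Tables with no foreign keys have no parent tables in the database.
root_tables = sorted(table for table, p in parents.items() if not p)

for table, parent_list in parents.items():
    print(f"{table}: {', '.join(parent_list) or '(no parents)'}")
```

Under this naming assumption, the tables without foreign keys are Branches, Patients, and Visits_Type. These are the same three tables for which this report lists the cross-table SQS as unavailable.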

Synthetic Data Quality Results

For each table, Gretel generates an individual and a cross-table Synthetic Report, each of which includes a Synthetic Data Quality Score (SQS). The individual Synthetic Report evaluates the statistical accuracy of a single synthetic table against the real-world data it is based on, treating the table as stand-alone. The individual SQS therefore does not account for statistical correlations across related tables. The cross-table Synthetic Report evaluates the statistical accuracy of a table's synthetic data while taking into account correlations with data in related tables, so the cross-table SQS indicates how accurate the table is in the context of the database as a whole. More information about the Gretel Synthetic Report and the Synthetic Data Quality Score is available in the Gretel documentation.

Synthetic Data Quality Scores

For each table, individual and cross-table synthetic data quality scores (SQS) are computed and displayed below.

Table Name                 Individual SQS    Cross-table SQS
Branches                   68 (Good)         None (Unavailable)
Specializations            100 (Excellent)   57 (Moderate)
Branch_Expenses            34 (Poor)         40 (Moderate)
Doctors_Contacts           68 (Good)         65 (Good)
Detailed_Patients_Visits   71 (Good)         45 (Moderate)
Patients_Visits            72 (Good)         39 (Poor)
Doctors                    69 (Good)         64 (Good)
Patients                   66 (Good)         None (Unavailable)
Visits_Type                100 (Excellent)   None (Unavailable)
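
To make the grade labels easier to work with, the sketch below maps numeric scores to grades and lists the tables that have at least one score below "Good". The band boundaries (80, 60, 40, 20) are an assumption: this report does not state them, although they are consistent with every row above (for example, 39 is labeled Poor and 40 is labeled Moderate). The names `GRADE_BANDS`, `grade`, and `SCORES` are hypothetical and not part of any Gretel API.

```python
# Assumed lower bound of each grade band, highest first (not stated in the report).
GRADE_BANDS = [
    (80, "Excellent"),
    (60, "Good"),
    (40, "Moderate"),
    (20, "Poor"),
    (0, "Very Poor"),
]


def grade(score):
    """Return the grade label for a 0-100 score, or "Unavailable" for None."""
    if score is None:
        return "Unavailable"
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, label in GRADE_BANDS:
        if score >= lower_bound:
            return label


# Scores transcribed from the report: table -> (individual SQS, cross-table SQS).
SCORES = {
    "Branches": (68, None),
    "Specializations": (100, 57),
    "Branch_Expenses": (34, 40),
    "Doctors_Contacts": (68, 65),
    "Detailed_Patients_Visits": (71, 45),
    "Patients_Visits": (72, 39),
    "Doctors": (69, 64),
    "Patients": (66, None),
    "Visits_Type": (100, None),
}

GOOD_THRESHOLD = 60  # assumed lower bound of the "Good" band

# Tables where any available score falls below the assumed "Good" threshold.
below_good = sorted(
    table
    for table, pair in SCORES.items()
    if any(s is not None and s < GOOD_THRESHOLD for s in pair)
)

for table in below_good:
    individual, cross = SCORES[table]
    print(f"{table}: individual={grade(individual)}, cross-table={grade(cross)}")
```

With these assumed bands, the flagged tables are Branch_Expenses, Detailed_Patients_Visits, Patients_Visits, and Specializations. In three of the four, only the cross-table score falls below "Good".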

The Synthetic Data Quality Score estimates how well the generated synthetic data preserves the statistical properties of the original dataset. In this sense, it can be read as a utility or confidence score: an indication of whether scientific conclusions drawn from the synthetic dataset would match those drawn from the original dataset. If you do not require statistical symmetry, as might be the case in a testing or demo environment, a lower score may be acceptable.

If your Synthetic Data Quality Score isn't as high as you'd like it to be, see the Gretel documentation for ideas on improving your model.