Platform

Data Modeling

Develop data models collaboratively in the cloud and share them with your organization in various modeling styles and formats with no coding or conversion required

Model Governance

Create and manage business metadata using a dedicated project role

Snowflake Schema Monitoring

Track and get notified of schema changes in live database environments

Integration

Strategic advisors

Kent Graziano

The Data Warrior, Strategic Advisor, Data Vault Master, Author, Speaker, and Tae Kwon Do Grandmaster

Gordon Wong

Leading organizations through analytics transformations, preference for social missions, healthcare, energy, education, and civic engagement

For cloud data platforms

SqlDBM offers secure native connectors to leading data platforms like Snowflake, Databricks, and BigQuery so you can reverse engineer and begin modeling in seconds.

Try modeling now

Data Modeling

Develop data models collaboratively in the cloud and share them with your organization in various modeling styles and formats with no coding or conversion required

Model Governance

Create and manage business metadata using a dedicated project role

Snowflake Schema Monitoring

Track and get notified of schema changes in live database environments

Supercharge Your Data Modeling With AI Without Compromising Security

Have you ever wanted to supercharge your data modeling with Generative AI but faced restrictions due to your company’s strict data privacy policies?

In this article, we’ll explore a secure and efficient alternative: integrating DeepSeek, an open-source local Large Language Model (LLM), with LM Studio. This combination allows you to automate your data modeling workflow and ensure compliance with data privacy regulations.

Why Choose a Local Environment?

Data security and compliance are critical concerns for businesses. Using a local Generative AI solution ensures that sensitive information never leaves your company’s network, greatly reducing exposure to external threats.

Setting Up LM Studio with DeepSeek

To begin, you’ll need to install LM Studio integrated with DeepSeek in your secure local environment. Follow these simple steps:

  • Download LM Studio here.
  • Upon installation, LM Studio suggests DeepSeek R1 Distilled (Qwen 7B), suitable for low-end systems. However, it’s recommended to use the more robust Llama version 8B instead in the Discover option in LM Studio

Search for DeepSeek model and download ollama run deepseek-r1:8b (requires approximately 4.9GB of memory):

Creating Your Data Model with SqlDBM

Design your data model using SqlDBM.

For this demonstration, we utilized the TPCH model from Snowflake, a robust industry-standard test case.

Once your model is complete, perform forward engineering to generate the corresponding Data Definition Language(DDL) code. Save this output to a file.

Validating Business Rules with LM Studio

Now comes the exciting part: using your local LLM to validate your data model against key business rules.

Use the following prompt in LM Studio to validate your model against essential business rules and add the previous DDL file:

“You are a professional data architect who works for a data modeling tool. The attached “txt” represents the data model with the relationships and diagrams in the form of a DDL. I want you to analyze the file and then check for the next business rules in the model:

1. Date fields types should have a naming convention of _D

2. Table names are defined as singular nouns

3. All tables must have a PK constraint

If any of these rules is not met, please provide a detailed proposal for all changes that need to be done in the form of a complete DDL”

Example Results

Upon testing, LM Studio identified key issues, such as incorrect naming conventions (O_ORDERDATE, L_SHIPDATE, etc.), plural table names (ORDERS, LINEITEM), and inadequate primary keys (LINEITEM).

The AI promptly generated corrected DDL code, addressing all business rule violations efficiently.

— ************************** SqlDBM: Snowflake *************************

— ********* Generated by SqlDBM: TPCH by ivan.lopez@blog.sqldbm.com *********

— ************************************** REGION

CREATE TABLE REGION

(

R_REGIONKEY integer NOT NULL,

R_NAME string(25) NOT NULL,

R_COMMENT string(152),

CONSTRAINT R_REGIONKEY PRIMARY KEY ( R_REGIONKEY )

);

— ************************************** PART

CREATE TABLE PART

(

P_PARTKEY integer NOT NULL,

P_NAME string(55) NOT NULL,

P_MFGR string(25) NOT NULL,

P_BRAND string(10) NOT NULL,

P_TYPE string(25) NOT NULL,

P_SIZE integer NOT NULL,

P_CONTAINER string(10) NOT NULL,

P_RETAILPRICE decimal(15,2) NOT NULL,

P_COMMENT string(23) NOT NULL,

CONSTRAINT P_PARTKEY PRIMARY KEY ( P_PARTKEY )

);

— ************************************** NATION

CREATE TABLE NATION

(

N_NATIONKEY integer NOT NULL,

N_NAME string(25) NOT NULL,

N_REGIONKEY integer NOT NULL,

N_COMMENT string(152),

CONSTRAINT N_NATIONKEY PRIMARY KEY ( N_NATIONKEY ),

CONSTRAINT N_REGIONKEY FOREIGN KEY ( N_REGIONKEY ) REFERENCES REGION ( R_REGIONKEY )

);

— ************************************** SUPPLIER

CREATE TABLE SUPPLIER

(

S_SUPPKEY integer NOT NULL,

S_NAME string(25) NOT NULL,

S_ADDRESS string(40) NOT NULL,

S_NATIONKEY integer NOT NULL,

S_PHONE string(15) NOT NULL,

S_ACCTBAL decimal(15,2) NOT NULL,

S_COMMENT string(101) NOT NULL,

CONSTRAINT S_SUPPKEY PRIMARY KEY ( S_SUPPKEY ),

CONSTRAINT S_NATIONKEY FOREIGN KEY ( S_NATIONKEY ) REFERENCES NATION ( N_NATIONKEY )

);

— ************************************** CUSTOMER

CREATE TABLE CUSTOMER

(

C_CUSTKEY integer NOT NULL,

C_NAME string(25) NOT NULL,

C_ADDRESS string(40) NOT NULL,

C_NATIONKEY integer NOT NULL,

C_PHONE string(15) NOT NULL,

C_ACCTBAL decimal(15,2) NOT NULL,

C_MKTSEGMENT string(10) NOT NULL,

C_COMMENT string(117) NOT NULL,

CONSTRAINT C_CUSTKEY PRIMARY KEY ( C_CUSTKEY ),

CONSTRAINT C_NATIONKEY FOREIGN KEY ( C_NATIONKEY ) REFERENCES NATION ( N_NATIONKEY )

);

— ************************************** PARTSUPP

CREATE TABLE PARTSUPP

(

PS_PARTKEY integer NOT NULL,

PS_SUPPKEY integer NOT NULL,

PS_AVAILQTY integer NOT NULL,

PS_SUPPLYCOST decimal(15,2) NOT NULL,

PS_COMMENT string(199) NOT NULL,

CONSTRAINT PK_1 PRIMARY KEY ( PS_PARTKEY, PS_SUPPKEY ),

CONSTRAINT PS_PARTKEY FOREIGN KEY ( PS_PARTKEY ) REFERENCES PART ( P_PARTKEY ),

CONSTRAINT PS_SUPPKEY FOREIGN KEY ( PS_SUPPKEY ) REFERENCES SUPPLIER ( S_SUPPKEY )

);

— ************************************** ORDER

CREATE TABLE “ORDER”

(

O_ORDERKEY integer NOT NULL,

O_CUSTKEY integer NOT NULL,

O_ORDERSTATUS string(1) NOT NULL,

O_TOTALPRICE decimal(15,2) NOT NULL,

O_ORDER_D date NOT NULL, — renamed from O_ORDERDATE to O_ORDER_D for naming convention compliance

O_ORDERPRIORITY string(15) NOT NULL,

O_CLERK string(15) NOT NULL,

O_SHIPPRIORITY integer NOT NULL,

O_COMMENT string(79) NOT NULL,

CONSTRAINT O_ORDERKEY PRIMARY KEY ( O_ORDERKEY ),

CONSTRAINT O_CUSTKEY FOREIGN KEY ( O_CUSTKEY ) REFERENCES CUSTOMER ( C_CUSTKEY )

);

— ************************************** LINE_ITEM

CREATE TABLE LINE_ITEM

(

L_ORDERKEY integer NOT NULL,

L_LINENUMBER integer NOT NULL,

L_QUANTITY decimal(15,2) NOT NULL,

L_EXTENDEDPRICE decimal(15,2) NOT NULL,

L_DISCOUNT decimal(15,2) NOT NULL,

L_TAX decimal(15,2) NOT NULL,

L_RETURNFLAG string(1) NOT NULL,

L_LINESTATUS string(1) NOT NULL,

L_SHIP_D date NOT NULL, — renamed from L_SHIPDATE to L_SHIP_D for naming convention compliance

L_COMMIT_D date NOT NULL, — renamed from L_COMMITDATE to L_COMMIT_D for naming convention compliance

L_RECEIPT_D date NOT NULL, — renamed from L_RECEIPTDATE to L_RECEIPT_D for naming convention compliance

L_SHIPINSTRUCT string(25) NOT NULL,

L_SHIPMODE string(10) NOT NULL,

L_COMMENT string(44) NOT NULL,

PS_PARTKEY integer NOT NULL,

PS_SUPPKEY integer NOT NULL,

CONSTRAINT PK_LINE_ITEM PRIMARY KEY ( L_ORDERKEY, L_LINENUMBER ),

CONSTRAINT L_ORDERKEY FOREIGN KEY ( L_ORDERKEY ) REFERENCES “ORDER” ( O_ORDERKEY ),

CONSTRAINT L_PARTKEY FOREIGN KEY ( PS_PARTKEY, PS_SUPPKEY ) REFERENCES PARTSUPP ( PS_PARTKEY, PS_SUPPKEY )

);

Now, let’s put this DDL back to work in SqlDBM.

Reverse Engineering and Model Comparison

Next, reverse engineer the AI-corrected DDL back into SqlDBM using its built-in functionalities.

This functionality also allows you to easily compare the original and modified models side by side, ensuring alignment with business requirements.

And voilá, your new model is up to date with your business requirements.

Conclusions

Incorporating Generative AI into your data modeling process doesn’t have to come with a heavy risk to data privacy. By leveraging a local solution like DeepSeek with LM Studio, you can:

  • Automate tedious data modeling processes.
  • Ensure strict compliance with data privacy.
  • Quickly validate and enforce business rules.
  • Maintain control over your sensitive data.

This approach empowers data modelers, architects, and engineers to harness Generative AI’s potential without compromising security or compliance.

Thanks for reading!

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.

Strategic advisors

Kent Graziano

The Data Warrior, Strategic Advisor, Data Vault Master, Author, Speaker, and Tae Kwon Do Grandmaster

Gordon Wong

Leading organizations through analytics transformations, preference for social missions, healthcare, energy, education, and civic engagement

For cloud data platforms

SqlDBM offers secure native connectors to leading data platforms like Snowflake, Databricks, and BigQuery so you can reverse engineer and begin modeling in seconds.

Try modeling now

Platform

Data Modeling

Develop data models collaboratively in the cloud and share them with your organization in various modeling styles and formats with no coding or conversion required

Model Governance

Create and manage business metadata using a dedicated project role

Snowflake Schema Monitoring

Track and get notified of schema changes in live database environments