The database containing information about other databases sounds pretty cool and not only that. It can be also extremely useful if you know how to use it properly. Today we’ll talk about the MySQL INFOMRATION_SCHEMA database.
As already stated, the INFORMATION_SCHEMA database is not just another database. You’ll find it on your MySQL server after installation and before you have performed any action. When we create or modify objects (database, table, keys, contains, procedures, functions, triggers etc.) in any database on the server, we need to store these definitions somehow. Since databases are intended to store data, is there a better way to do it than in the “master” database? That is the main idea behind the INFORMATION_SCHEMA database. You can easily query this database, just like any other database. These queries will not return data users entered using applications but rather definitions we have “entered” while creating our database.
Other SQL DBMSs also have their own version of “master” database.
Our data model
To be honest, there are numerous ways you could use the INFORMATION_SCHEMA and trying to explain everything more detailed would simply go out of the scope of this article. Therefore, I’ll focus on a data model I’ve created two example models to show how to retrieve some important tables and constraints descriptions.
They are almost the same with minor changes in the keys. Both of them are meant to store data about products and services we sold to our clients as well as invoices and invoice details.
On the picture above is the model with everything set properly. Each table has its’ own primary key called id, of an INTEGER data type with auto increment set to Yes.
Foreign keys are used to create relations equivalent to those in the real world. The client could have 0 to more invoices, each invoice will have 1 or more invoice items. Each item is related to exactly one product or service, but same product or service will be used for multiple items on many different invoices. Therefore, all relations in our model are one-to-many.
There are also some additional rules – UNIQUE KEYS defined on the product_service. ps_name and invoice. invoice_code attributes.
I made two changes that we should be able to detect using our queries.
First, one is removing a primary key from the invoice table. We still have attribute id in the table. It can’t longer have the AUTO_INCREMENT property set. Also, as a result of removing the primary key, the foreign key relationship between tables invoice and invoice_item no longer exists. The attribute id_invoice in the invoice_item table was added manually.
The second one is removing relation between tables client and invoice. The client table is now completely unrelated to the rest of the model and we should be able to detect it.
Creating a database from the model
I’ve used the Forward Engineer – Generate SQL option in the sqlDBM to generate a series of SQL statements. After that, I ran these statements on my local MySQL Server installation.
The database with everything set properly is named sales_ok, and the other one is named sales_not_ok.
Now we’re ready to run few queries and analyze results.
The first thing we’ll do is running two queries to determine a list of all tables in both our databases. The query used is:
WHERE INFORMATION_SCHEMA.TABLES.TABLE_SCHEMA = “sales_ok”;
The first thing I want to emphasize is that we could use * to get all columns from this table, but for the sake of the readability, I’ve decided to select only the TABLE_NAME and the AUTO_INCREMENT columns.
It’s obvious that we have all 4 tables created and that each of them had the auto increment column with value 1. This implies that the next record we’ll insert will have value 1 in this column (notice that all 4 tables are without any data now).
Now we’ll do the same for the database with removed keys.
WHERE INFORMATION_SCHEMA.TABLES.TABLE_SCHEMA = “sales_not_ok”;
Once again, we have all 4 tables created but we can see that the invoice table lacks auto increment value. This is due to the fact that id column is not a primary key and therefore can’t have AUTO_INCREMENT property set.
If we decide to have a single-column primary key, with AUTO_INCREMENT property set, then this is the first check telling us something is not as we wanted it to be.
Columns, primary and unique keys
Now we’ll list all columns from all tables in our database. The query below:
WHERE INFORMATION_SCHEMA.COLUMNS.TABLE_SCHEMA = ‘sales_ok’;
returns the following result:
We can easily notice that all columns from all tables in our database are listed. And not only that. We can see column data types as well the info if the column is part of the primary key (PRI) or unique key (UNI). In the EXTRA we’ll see the auto_increment property once again.
Now, let’s run the same query for the database where we removed some keys and properties.
WHERE INFORMATION_SCHEMA.COLUMNS.TABLE_SCHEMA = ‘sales_not_ok’;
As expected, the invoice.id column lacks PRI and auto_increment because these are the rules we have removed. It’s worth noticing that invoice.invoice_code acts as PRI in the absence of the primary key, unique key acts as one (because of the fact that unique is actually an alternate primary key in the table).
Taking this into account, we could easily check for the number of PRI (having auto_increment) in each table. If we decide to use single-column integer auto_increment values as primary keys, we should have exactly 1 such column per each table. Otherwise, we have an error in our db. And yes, using naming convention should definitely be a rule here (always calling the primary key column- id).
The last thing we’ll check today are foreign keys constraints. We can simply list all foreign keys in the database using. We’ll use this query:
WHERE INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_TABLE_SCHEMA = ‘sales_ok’;
The result returned is:
As expected, we have all 3 relationships from our model in this result set.
Now we’ll run the same query on the database where we deleted keys.
WHERE INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_TABLE_SCHEMA = ‘sales_not_ok’;
We have only one foreign key because relations client <-> invoice and invoice <-> invoice_item are missing.
With minor changes, we could modify queries to list all tables and a number of times they have been referenced. Tables that are not referenced anywhere in the model should be somewhere in the middle of the model. In our database, one such example is the invoice_item table.
One thing is definitely sure – each table should be referenced or used as a reference. Otherwise, it’s an island in our model and that screams to be an unwanted behavior.
Using the INFORMATION_SCHEMA database properly could be of a great benefit to you, especially if you’re in DBA role. Building a set of queries to monitor changes is definitely a great plus. Today we just scratched the surface of possibilities you have available. There is no better advice than to continue exploring yourself and stay tuned to our blog.