This guide deals with writing database scripts which can be rerun (‘rerunnable’). Creating rerunnable scripts makes deployments easier and avoids any unwanted behavior if the code were to run more than once. When modifying or updating a database, running a normal script once will work as expected without errors. However, if the script were to run again, the database may return an error message, halt execution or even worse, create unwanted duplicates. This could drastically slow down the deployment process.
This play will show you how to add simple checks to your database scripts so that they can be run more than once without errors.
The types of statements that can have such checks to ensure they are rerunnable are:
Most of these steps should be taken on every script and every project because they are such simple steps. However, ‘ALTER TABLE’ (5) and ‘INSERT INTO’ without Primary Keys (7) require significant additional work. As such, they are more appropriate for large projects where the need to make deployments as easy as possible outweighs the additional development effort.
When creating a table, using “IF NOT EXISTS” in the script will check to see if the table you are trying to create already exists before executing the create statement.
For Example:
CREATE TABLE `yourtable` ( `id` INT NOT NULL , `column1` VARCHAR(255) NULL DEFAULT NULL, `column2` VARCHAR(255) NULL DEFAULT NULL ) ENGINE = InnoDB ;
can be rewritten as:
CREATE TABLE IF NOT EXISTS `yourtable` ( `id` INT NOT NULL , `column1` VARCHAR(255) NULL DEFAULT NULL, `column2` VARCHAR(255) NULL DEFAULT NULL ) ENGINE = InnoDB ;
A rerunnable view creation script can be written using “CREATE OR REPLACE VIEW” instead of “CREATE VIEW”
CREATE VIEW `yourview` AS SELECT `column1` FROM `yourtable` ;
CREATE OR REPLACE VIEW `yourview` AS SELECT `column1` FROM `yourtable` ;
Using “IF EXISTS” when dropping a table or a view from will drop the table/view/stored procedure only if it exists. If the table/view/stored procedure has already been dropped, no error message will be thrown.
For example, the following code for dropping table,
DROP TABLE `yourtable` ;
DROP TABLE IF EXISTS `yourtable` ;
Similarly, the SQL for dropping a view,
DROP VIEW `yourview` ;
DROP VIEW IF EXISTS `yourview` ;
For a stored procedure,
DROP PROCEDURE `yourprocedure` ;
DROP PROCEDURE IF EXISTS `yourprocedure` ;
Finally, for triggers and functions you can write
DROP TRIGGER IF EXISTS `yourtrigger` ;
and
DROP FUNCTION IF EXISTS `yourfunction` ;
Creating a stored procedure works a little differently than creating a table. You will first have to drop a stored procedure if it exists and then create another one with the same name. You would take the same approach to alter a stored procedure as well.
CREATE PROCEDURE `yourprocedure`...
DROP PROCEDURE IF EXISTS `yourprocedure` ; CREATE PROCEDURE `yourprocedure`...
The same approach works for creating and modifying functions and triggers. You can add a DROP IF EXISTS in front of the code which creates the function/trigger:
DROP TRIGGER IF EXISTS `yourtrigger` ; DROP FUNCTION IF EXISTS `yourfunction` ;
There is no single approach to creating a rerunnable ALTER TABLE script. Some ALTER TABLE scripts are always rerunnable but in other cases you have to use a stored procedure to make the query rerunnable (as shown in the example below).
For example, the following line modifies the datatype of a column and will always be possible to rerun:
ALTER TABLE `yourtable` MODIFY `column1` VARCHAR(255) ;
However for adding a column the situation is more difficult. You can use “IF NOT EXISTS” inside a temporary stored procedure to make the script possible to rerun.
Below are the high level steps for running a rerunnable ALTER statement:
We also use all the rerunning techniques we have already learned when carrying out each step. For example, we drop the procedure if it exists before creating it.
DROP PROCEDURE IF EXISTS sp_alter_table ; DELIMITER $$ CREATE PROCEDURE sp_alter_table() BEGIN DECLARE _count INT; SET _count = ( SELECT count(1) FROM information_schema.columns WHERE table_name = 'yourtable' AND table_schema = 'Appian' AND column_name = 'column3' ) ; IF _count = 0 THEN ALTER TABLE yourtable ADD `column3` VARCHAR(255) ; END IF; END $$ DELIMITER ; CALL sp_alter_table(); DROP PROCEDURE sp_alter_table ;
Note: Using “IF NOT EXISTS” on the INFORMATION_SCHEMA.COLUMNS table will check to see if the column exists before adding the column.
You may be wondering about how you could use this to drop columns. A similar method would work, but Appian recommends that you do NOT drop columns from a table to avoid any issues that may arise due to backward compatibility.
Adding or removing constraints follows the same high level steps as adding a column, as explained above. However, you should check for the constraint by querying INFORMATION_SCHEMA.STATISTICS instead of INFORMATION_SCHEMA.COLUMNS. See the example below:
DROP PROCEDURE IF EXISTS sp_addconstraint ; DELIMITER $$ CREATE PROCEDURE sp_addconstraint() begin DECLARE _count INT; SET _count = ( SELECT count(1) FROM information_schema.statistics WHERE table_name = 'yourtable' AND table_schema = 'Appian' AND column_name = 'id' AND index_name = 'PRIMARY' ) ; IF _count = 0 THEN ALTER TABLE `yourtable` ADD PRIMARY KEY (id) ; END IF; END $$ DELIMITER ; CALL sp_addconstraint(); DROP PROCEDURE sp_addconstraint ;
For inserts, if you are inserting values where you are providing the primary key (or unique identifiers), you can use the IGNORE statement to make the insert rerunnable.
INSERT INTO `yourtable` (`id` , `column1`, `column2`, `column3` ) VALUES ('1' , 'A', 'B', 'C' ) , ('2' , 'D', 'E', 'F' ) ;
INSERT IGNORE INTO `yourtable` (`id` , `column1`, `column2`, `column3` ) VALUES ('1' , 'A', 'B', 'C' ) , ('2' , 'D', 'E', 'F' ) ;
Assuming that `id` is the primary key, if a value with `1` or `2` in `id` already exists, this statement will not insert this row of data in the table. However, if you want to update the row to the values in the query, the statement can be rewritten as:
INSERT INTO `yourtable` (`id` , `column1`, `column2`, `column3` ) VALUES ('1' , 'A2', 'B2', 'C2' ) , ('2' , 'D2', 'E2', 'F2' ) ON DUPLICATE KEY UPDATE column1 = VALUES (column1 ) , column2 = VALUES (column2 ) , column3 = VALUES (column3 ) ;
If you are not providing any primary keys with the values you wish to insert, you will have to use another method for inserting while avoiding duplicate inserts on reruns. By using a temporary table, we can define what values already exist in the target table before inserting them (use the LIKE argument to construct a mirror of the target table). Each row that already exists in the target table can be flagged as duplicate and avoided when inserting. Below is an example of using a temporary table to insert values in a way that can be rerun.
High Level Steps
Code to achieve this is included below - if you have been following along, the table `yourtable` will need to have auto increment added to the `id` column for this script to run (in addition to the other changes from the ALTER TABLE section). This is a longer example, so we will break it down in a moment. Note that temporary tables only exist for the duration of the transaction, so this code cannot be run a bit at a time, you must run the whole thing in one go.
-- create temporary table with extra column to flag duplicates CREATE TEMPORARY TABLE `tmp_yourtable` LIKE `yourtable` ; ALTER TABLE `tmp_yourtable` ADD COLUMN duplicate tinyint(1) DEFAULT 0 ; -- insert values into temp table INSERT INTO `tmp_yourtable` ( `column1` , `column2`, `column3` ) VALUES ( 'A2' , 'B2', 'C2' ) , ( 'D2' , 'E2', 'F2' ) , ( 'G' , 'H', 'I' ) ; -- join on compound keys and update matches as existing UPDATE `tmp_yourtable` tmp INNER JOIN `yourtable` your ON your.column1 = tmp.column1 AND your.column2 = tmp.column2 AND your.column3 = tmp.column3 SET tmp.duplicate = 1 ; -- insert only those that do not exist already INSERT INTO `yourtable` (column1 , column2, column3 ) SELECT column1, column2, column3 FROM `tmp_yourtable` WHERE duplicate <> 1 ; DROP TEMPORARY TABLE `tmp_yourtable` ;
Firstly, we are creating a temporary table, which has an extra column in it called ‘duplicate` which we use to mark out duplicate rows. The additional column is a tinyint(1) which is what we use in Appian for booleans, and defaults to 0 (false). We will later set duplicate rows to 1 (true).
CREATE TEMPORARY TABLE `tmp_yourtable` LIKE `yourtable` ; ALTER TABLE `tmp_yourtable` ADD COLUMN duplicate tinyint(1) DEFAULT 0 ;
Next, we insert the rows we want to add into the temporary table first. Like so:
INSERT INTO `tmp_yourtable` ( `column1` , `column2`, `column3` ) VALUES ( 'A2' , 'B2', 'C2' ) , ( 'D2' , 'E2', 'F2' ) , ( 'G' , 'H', 'I' ) ;
In the next stage, we look for duplicates by comparing the temporary table with our actual table. For now, a duplicate is any row where all columns contain the same value except for the primary key.
UPDATE `tmp_yourtable` tmp INNER JOIN `yourtable` your ON your.column1 = tmp.column1 AND your.column2 = tmp.column2 AND your.column3 = tmp.column3 SET tmp.duplicate = 1 ;
Now we have a temporary table containing all the data we wish to insert and we know which rows are duplicates because they have a 1 in the duplicate column. All that remains is copy the non-duplicate rows into our actual table - and of course to drop the temporary table in case another developer needs one like yours when this change goes to your production environment.
INSERT INTO `yourtable` (column1 , column2, column3 ) SELECT column1, column2, column3 FROM `tmp_yourtable` WHERE duplicate <> 1 ; DROP TEMPORARY TABLE `tmp_yourtable` ;
What is a duplicated row? It is tempting to say any row in the table which contains the same data in every column apart from the primary key is a duplicate. However, the situation may not be as simple as this. Consider a table of books. Suppose we store the author, title and the artist who designed the cover for the current edition. In this case, we likely consider a row to be a duplicate if the author and title are the same. If we have different artists in data we want to insert, we will want to update the artist.
To handle this type of situation, we modify step 3 to do an update of existing values in the target table as well as mark rows as duplicate. We will mark the rows as duplicate if the appropriate columns contain the same value. In the book example, this will only be the author and title. If we find a match here we would update the cover artist.
For our abstract table, we will suppose rows are the same if columns 1 and 3 are the same, and update column 2 if a match is found. So the code for the third step becomes:
UPDATE `tmp_yourtable` tmp INNER JOIN `yourtable` your ON your.column1 = tmp.column1 AND your.column3 = tmp.column3 SET tmp.duplicate = 1, your.column2 = tmp.column2 ;