How to restore a database with check constraints

I have found an edge case where pg_dump and pg_restore fails to restore databases with check constraints that points to data in another table that is restored after the table containing the check constraint. This has been tested and found to be a problem on PostgreSQL 9.5.2, 10.3 and 10.4.

In this how to, we will go over the following information:

  1. Create a database with this problem
  2. Backup the database
  3. Normal restoration the database - Fail's
  4. Customized restoration the database - Work's

Creating a database with this problem:

Let setup create our test database and then log into that test database.

# Create our source database
createdb test_db_1

# Start psql to create the data within the database
psql -d test_db_1

Now we need to create two schemas, with one table each and also create a function that can be used by a check constraint to check the contents of one table's field against the other table. The key in causing this failure is to place your key table in a schema alphabetically before your the schema where you use the data. This check constraint acts like a multi foreign key constraint but with a one field constrained by two fields.

-- Create the schema for the lookup table
CREATE SCHEMA schema_b;

-- Create the lookup table
CREATE TABLE schema_b.table_b (
  type VARCHAR NOT NULL,
  "desc" VARCHAR NOT NULL,
  CONSTRAINT table_b_pk PRIMARY KEY(type, "desc")
);

-- Create the schema for our main table
CREATE SCHEMA schema_a;

-- Create the check constraint function
CREATE OR REPLACE FUNCTION schema_b.check_lu_table_b (varchar, varchar)
RETURNS boolean AS'
SELECT CASE WHEN b."desc" IS NULL THEN FALSE ELSE TRUE END 
FROM (SELECT $2 AS "desc", $1 AS type) a LEFT JOIN
schema_b.table_b b USING (type, "desc")
'LANGUAGE 'sql'
VOLATILE
RETURNS NULL ON NULL INPUT;

-- Create the main table, using the check constraint function
CREATE TABLE schema_a.table_a (
  tp_id SERIAL,
  billing_acct_type VARCHAR(20),
  CONSTRAINT "table_a_pk" PRIMARY KEY(tp_id),
  CONSTRAINT "table_a_chk_billing_acct_type" CHECK (schema_b.check_lu_table_b('BillAcctType', billing_acct_type))
);

-- Insert the test data
INSERT INTO schema_b.table_b VALUES ('BillAcctType', 'Invoice');
INSERT INTO schema_a.table_a (billing_acct_type) VALUES ('Invoice');

Backing up the database

We now can perform a normal backup of our database.

#Backup the test_db_1 database to test the restoration
pg_dump -Fc test_db_1 -f test_db_1.pg_dump

Restoring the database

When we restore this database we will run into problems because the data does not exist in table_b at the time the data is being loaded into table_a.

# Create the Database into
createdb test_db_2

# Failed Load
pg_restore -d test_db_2 test_db_1.pg_dump
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2770; 0 20015425 TABLE DATA table_a postgres
pg_restore: [archiver (db)] COPY failed for table "table_a": ERROR:  new row for relation "table_a" violates check constraint "table_a_chk_billing_acct_type"
DETAIL:  Failing row contains (1, Invoice).
CONTEXT:  COPY table_a, line 1: "1      Invoice"
WARNING: errors ignored on restore: 1

Customized restoration the database - Work's

When we restore this database we need to restore items in a specific order. First we need to load the pre-data section of the backup, aka DDL for the tables, views, etc. Before we load the data section of the backup, we need to load the table_b lookup table. Now we can load the data section for the backup, but this will also double fill the table_b lookup table. The simple solution is to truncate table_b and load it's data again. We are now ready to load the post-data section of the backup and the database is ready for use.

# Re-Create Database
dropdb test_db_2
createdb test_db_2

# Load pre-data (DDL)
pg_restore -d test_db_2 --section=pre-data test_db_1.pg_dump

# Load tatype.lu_tblTelco so Data load won't fail
pg_restore -d test_db_2 -a -t table_b test_db_1.pg_dump

# Load Data
pg_restore -d test_db_2 --section=data test_db_1.pg_dump

# Truncate duplicated data
psql -d test_db_2 -c 'TRUNCATE  schema_b.table_b;'
# Reload single set of data
pg_restore -d test_db_2 -a -t table_b test_db_1.pg_dump

# Load Post Data (Indexes, Foreign Keys, etc.)
pg_restore -d test_db_2 --section=post-data test_db_1.pg_dump

You can also create a several pg_restore list files, but I believe that the above solution is simpler to implement.

Even simpler bug demo

In this simpler version we have both tables inside the same schema. The lookup table needs to be alphabetically after the table that uses the data. This simplified test returns a different error, using Postgres 10.3

-- Create the lookup table
CREATE TABLE public.table_b (
  type VARCHAR NOT NULL,
  "desc" VARCHAR NOT NULL,
  CONSTRAINT table_b_pk PRIMARY KEY(type, "desc")
);

-- Create the check constraint function
CREATE OR REPLACE FUNCTION public.check_lu_table_b (varchar, varchar)
RETURNS boolean AS'
SELECT CASE WHEN b."desc" IS NULL THEN FALSE ELSE TRUE END 
FROM (SELECT $2 AS "desc", $1 AS type) a LEFT JOIN
public.table_b b USING (type, "desc")
'LANGUAGE 'sql'
VOLATILE
RETURNS NULL ON NULL INPUT;

-- Create the main table, using the check constraint function
CREATE TABLE public.table_a (
  tp_id SERIAL,
  billing_acct_type VARCHAR(20),
  CONSTRAINT "table_a_pk" PRIMARY KEY(tp_id),
  CONSTRAINT "table_a_chk_billing_acct_type" CHECK (public.check_lu_table_b('BillAcctType', billing_acct_type))
);

-- Insert the test data
INSERT INTO public.table_b VALUES ('BillAcctType', 'Invoice');
INSERT INTO public.table_a (billing_acct_type) VALUES ('Invoice');

Backup and Restoration of simpler bug demo

Even though this is the same problem we get a different error message. Instead of the "violates check constraint" we now see a "relation "table_b" does not exist" error message.

#Backup the test_db_1 database to test the restoration
pg_dump -Fc test_db_1 -f test_db_1.pg_dump

# Create the Database into
createdb test_db_2

# Failed Load
pg_restore -d test_db_2 test_db_1.pg_dump
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2774; 0 1322703910 TABLE DATA table_a postgres
pg_restore: [archiver (db)] COPY failed for table "table_a": ERROR:  relation "table_b" does not exist
LINE 4: table_b b USING (type, "desc")
        ^
QUERY:
SELECT CASE WHEN b."desc" IS NULL THEN FALSE ELSE TRUE END
FROM (SELECT $2 AS "desc", $1 AS type) a LEFT JOIN
table_b b USING (type, "desc")

CONTEXT:  SQL function "check_lu_table_b" during inlining
COPY table_a, line 1: "1        Invoice"
WARNING: errors ignored on restore: 1

The cause of the Problem

The problem is not truly caused by the order the tables are restored in. It is cause by where the check constraints are being restored. When we look at the sql being generated by pg_restore, we can see that when table_a is created it includes the check constraint. Foreign key constraints are restored in the post-data section, but not shown in this example. This is why they don't have this same problem.

--
-- PostgreSQL database dump
--

-- Dumped from database version 10.3
-- Dumped by pg_dump version 10.3

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: schema_a; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA schema_a;


ALTER SCHEMA schema_a OWNER TO postgres;

--
-- Name: schema_b; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA schema_b;


ALTER SCHEMA schema_b OWNER TO postgres;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


--
-- Name: check_lu_table_b(character varying, character varying); Type: FUNCTION; Schema: schema_b; Owner: postgres
--

CREATE FUNCTION schema_b.check_lu_table_b(character varying, character varying) RETURNS boolean
    LANGUAGE sql STRICT
    AS $_$
SELECT CASE WHEN b."desc" IS NULL THEN FALSE ELSE TRUE END
FROM (SELECT $2 AS "desc", $1 AS type) a LEFT JOIN
schema_b.table_b b USING (type, "desc")
$_$;


ALTER FUNCTION schema_b.check_lu_table_b(character varying, character varying) OWNER TO postgres;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: table_a; Type: TABLE; Schema: schema_a; Owner: postgres
--

CREATE TABLE schema_a.table_a (
    tp_id integer NOT NULL,
    billing_acct_type character varying(20),
    CONSTRAINT table_a_chk_billing_acct_type CHECK (schema_b.check_lu_table_b('BillAcctType'::character varying, billing_acct_type))
);


ALTER TABLE schema_a.table_a OWNER TO postgres;

--
-- Name: table_a_tp_id_seq; Type: SEQUENCE; Schema: schema_a; Owner: postgres
--

CREATE SEQUENCE schema_a.table_a_tp_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;


ALTER TABLE schema_a.table_a_tp_id_seq OWNER TO postgres;

--
-- Name: table_a_tp_id_seq; Type: SEQUENCE OWNED BY; Schema: schema_a; Owner: postgres
--

ALTER SEQUENCE schema_a.table_a_tp_id_seq OWNED BY schema_a.table_a.tp_id;


--
-- Name: table_b; Type: TABLE; Schema: schema_b; Owner: postgres
--

CREATE TABLE schema_b.table_b (
    type character varying NOT NULL,
    "desc" character varying NOT NULL
);


ALTER TABLE schema_b.table_b OWNER TO postgres;

--
-- Name: table_a tp_id; Type: DEFAULT; Schema: schema_a; Owner: postgres
--

ALTER TABLE ONLY schema_a.table_a ALTER COLUMN tp_id SET DEFAULT nextval('schema_a.table_a_tp_id_seq'::regclass);


--
-- Data for Name: table_a; Type: TABLE DATA; Schema: schema_a; Owner: postgres
--

COPY schema_a.table_a (tp_id, billing_acct_type) FROM stdin;
1       Invoice
\.


--
-- Data for Name: table_b; Type: TABLE DATA; Schema: schema_b; Owner: postgres
--

COPY schema_b.table_b (type, "desc") FROM stdin;
BillAcctType    Invoice
\.


--
-- Name: table_a_tp_id_seq; Type: SEQUENCE SET; Schema: schema_a; Owner: postgres
--

SELECT pg_catalog.setval('schema_a.table_a_tp_id_seq', 1, true);


--
-- Name: table_a table_a_pk; Type: CONSTRAINT; Schema: schema_a; Owner: postgres
--

ALTER TABLE ONLY schema_a.table_a
    ADD CONSTRAINT table_a_pk PRIMARY KEY (tp_id);


--
-- Name: table_b table_b_pk; Type: CONSTRAINT; Schema: schema_b; Owner: postgres
--

ALTER TABLE ONLY schema_b.table_b
    ADD CONSTRAINT table_b_pk PRIMARY KEY (type, "desc");


--
-- PostgreSQL database dump complete
--
--
-- PostgreSQL database dump
--

-- Dumped from database version 10.3
-- Dumped by pg_dump version 10.3

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


--
-- Name: check_lu_table_b(character varying, character varying); Type: FUNCTION; Schema: public; Owner: postgres
--

CREATE FUNCTION public.check_lu_table_b(character varying, character varying) RETURNS boolean
    LANGUAGE sql STRICT
    AS $_$
SELECT CASE WHEN b."desc" IS NULL THEN FALSE ELSE TRUE END
FROM (SELECT $2 AS "desc", $1 AS type) a LEFT JOIN
table_b b USING (type, "desc")
$_$;


ALTER FUNCTION public.check_lu_table_b(character varying, character varying) OWNER TO postgres;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: table_a; Type: TABLE; Schema: public; Owner: postgres
--

CREATE TABLE public.table_a (
    tp_id integer NOT NULL,
    billing_acct_type character varying(20),
    CONSTRAINT table_a_chk_billing_acct_type CHECK (public.check_lu_table_b('BillAcctType'::character varying, billing_acct_type))
);


ALTER TABLE public.table_a OWNER TO postgres;

--
-- Name: table_a_tp_id_seq; Type: SEQUENCE; Schema: public; Owner: postgres
--

CREATE SEQUENCE public.table_a_tp_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;


ALTER TABLE public.table_a_tp_id_seq OWNER TO postgres;

--
-- Name: table_a_tp_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: postgres
--

ALTER SEQUENCE public.table_a_tp_id_seq OWNED BY public.table_a.tp_id;


--
-- Name: table_b; Type: TABLE; Schema: public; Owner: postgres
--

CREATE TABLE public.table_b (
    type character varying NOT NULL,
    "desc" character varying NOT NULL
);


ALTER TABLE public.table_b OWNER TO postgres;

--
-- Name: table_a tp_id; Type: DEFAULT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY public.table_a ALTER COLUMN tp_id SET DEFAULT nextval('public.table_a_tp_id_seq'::regclass);


--
-- Data for Name: table_a; Type: TABLE DATA; Schema: public; Owner: postgres
--

COPY public.table_a (tp_id, billing_acct_type) FROM stdin;
1       Invoice
\.


--
-- Data for Name: table_b; Type: TABLE DATA; Schema: public; Owner: postgres
--

COPY public.table_b (type, "desc") FROM stdin;
BillAcctType    Invoice
\.


--
-- Name: table_a_tp_id_seq; Type: SEQUENCE SET; Schema: public; Owner: postgres
--

SELECT pg_catalog.setval('public.table_a_tp_id_seq', 1, true);


--
-- Name: table_a table_a_pk; Type: CONSTRAINT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY public.table_a
    ADD CONSTRAINT table_a_pk PRIMARY KEY (tp_id);


--
-- Name: table_b table_b_pk; Type: CONSTRAINT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY public.table_b
    ADD CONSTRAINT table_b_pk PRIMARY KEY (type, "desc");


--
-- PostgreSQL database dump complete
--

Patching PostgreSQL Ideas

If the creation of the check constraint is moved from the pre-data section to the post-data section of the backup, this would fix the restoration problem.

Conclusion

While this is an edge case, I think this should be a simple fix by the Postgresql hackers. This bug has been submitted as PostgreSQL Bug #15316 by Lloyd Albin

Thank You's
Julie Bernhardt - Corilate, Inc. - Managing Director

Lloyd Albin
Database Administrator
Statistical Center for HIV/AIDS Research and Prevention (SCHARP)
Fred Hutchinson Cancer Research Center (FHCRC / Fred Hutch)