PostgreSQL Server Programming(Second Edition)
上QQ阅读APP看书,第一时间看更新

Conditional expressions

Conditional expressions allow developers to control the action of the function, based on a defined criteria. PostgreSQL provides the CASE and IF statements to execute different commands based on conditions. The following is an example of the usage of a CASE statement to control how a string is treated based on its value. If the value is null or contains a zero-length string, it is treated the same as null:

CREATE OR REPLACE FUNCTION format_us_full_name(
                 prefix text, firstname text, 
                 mi text, lastname text, 
                 suffix text)
  RETURNS text AS
$$
DECLARE
  fname_mi text;
  fmi_lname text;
  prefix_fmil text;
  pfmil_suffix text;
BEGIN
  fname_mi := CONCAT_WS(' ',
                 CASE trim(firstname) 
                 WHEN '' 
                 THEN NULL 
                 ELSE firstname 
                 END, 
                 CASE trim(mi) 
                 WHEN '' 
                 THEN NULL 
                 ELSE mi 
                 END || '.');
  fmi_lname := CONCAT_WS(' ',
                 CASE fname_mi 
                 WHEN '' 
                 THEN NULL 
                 ELSE fname_mi 
                 END,
                 CASE trim(lastname) 
                 WHEN '' 
                 THEN NULL 
                 ELSE lastname 
                 END);
  prefix_fmil := CONCAT_WS('. ',
                 CASE trim(prefix) 
                 WHEN '' 
                 THEN NULL 
                 ELSE prefix 
                 END, 
                 CASE fmi_lname 
                 WHEN '' 
                 THEN NULL 
                 ELSE fmi_lname 
                 END);
  pfmil_suffix := CONCAT_WS(', ',
                 CASE prefix_fmil 
                 WHEN '' 
                 THEN NULL 
                 ELSE prefix_fmil 
                 END, 
                 CASE trim(suffix) 
                 WHEN '' 
                 THEN NULL 
                 ELSE suffix || '.' 
                 END);
  RETURN pfmil_suffix;
END;
$$
LANGUAGE plpgsql;

The idea here, is that when any element of a full name is missing, the surrounding punctuation and white spaces should also be missing. This function returns a well-formatted full name of a person in the USA, with as much of the name filled in as possible. When running this function, you will see the following:

postgres=# SELECT format_us_full_name('Mr', 'Martin', 'L', 'King', 'Jr');

   format_us_full_name   
-------------------------
 Mr. Martin L. King, Jr.

(1 row)

Now, let's try with a name missing:

postgres=# SELECT format_us_full_name('', 'Martin', 'L', 'King', 'Jr');

 format_us_full_name 
---------------------
 Martin L. King, Jr.

(1 row)

Another way to use conditional expressions, is using the IF/THEN/ELSE blocks. The following is the same function again, written using IF statements, rather than CASE statements:

CREATE OR REPLACE FUNCTION format_us_full_name(
                 prefix text, firstname text, 
                 mi text, lastname text, 
                 suffix text)
RETURNS text AS
$$
DECLARE
  fname_mi text;
  fmi_lname text;
  prefix_fmil text;
  pfmil_suffix text;
BEGIN
  fname_mi := CONCAT_WS(' ',
                 IF(trim(firstname) ='',NULL,firstname), 
                 IF(trim(mi) = '', NULL, mi || '.')
                 );
  fmi_lname := CONCAT_WS(' ',
                 IF(fname_mi = '',NULL, fname_mi),
                 IF(trim(lastname) =  '', NULL, lastname)
                 );
  prefix_fmil := CONCAT_WS('. ',
                 IF(trim(prefix) = '', NULL, prefix), 
                 IF(fmi_lname = '', NULL, fmi_lname)
                 );
  pfmil_suffix := CONCAT_WS(', ',
                 IF (prefix_fmil = '', NULL, prefix_fmil), 
                 IF (trim(suffix) = '', NULL, suffix || '.')
                 );
  RETURN pfmil_suffix;
END;
$$
LANGUAGE plpgsql;

PostgreSQL's PL/pgSQL provides several other syntactical variants of these conditional expressions. This introduction focuses on the most commonly used ones. For a more complete discussion of the topic, visit http://www.postgresql.org/docs/current/static/functions-conditional.html.

Loops with counters

The PL/pgSQL language provides a simple way to loop through some elements. The following is a function that returns the nth Fibonacci sequence number:

CREATE OR REPLACE FUNCTION fib(n integer) 
RETURNS INTEGER AS $$

DECLARE 
  counter integer := 0;
  a integer := 0;
  b integer := 1;
BEGIN
  IF (n < 1) THEN
    RETURN 0;
  END IF;
  LOOP    
    EXIT WHEN counter = n;
    counter := counter + 1;
    SELECT  b,a+b INTO a,b;
     END LOOP;
  RETURN a;
END;
$$
LANGUAGE plpgsql;

SELECT fib(4);

The preceding code gives 3 as the output.

Just for the record, each element in the Fibonacci sequence is the sum of the previous two elements. Thus, the first few elements of the sequence should be 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and so on. There are a few PostgreSQL Fibonacci sequence functions out there on the interwebs, but they use the dreaded recursive method. In this case, recursion is a bad thing. Our example uses the iterative method that avoids the use of stacks for recursion, as this might result in the lack of stack space for large numbers.

In this function, we also introduced default values to the variables in the declarations section. When the function is invoked, the variables will be initially set to these values.

Also, take a quick gander at the statement SELECT b,a+b INTO a,b. This statement makes two variable assignments at the same time. It avoids the use of a third variable while acting on both a and b.

Another slight variation of the function using a FOR loop, is as follows:

CREATE OR REPLACE FUNCTION fib(n integer) 
RETURNS INTEGER 
AS $$
DECLARE 
counter integer := 0;
  a integer := 0;
   b integer := 1;
BEGIN
  IF (n < 1) THEN
    RETURN 0;
  END IF;
  FOR counter IN 1..n
  LOOP
    SELECT  b,a+b INTO a,b;
  END LOOP;
  RETURN a;
END;
$$
  LANGUAGE plpgsql;

For some additional looping syntax, take a look at the PostgreSQL documentation page at http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html.

Statement termination

In PL/pgSQL, all blocks and the statements within the blocks, must end with a semicolon. The exceptions are the statements that start a block with IF or BEGIN. Block-starting statements are not complete statements; therefore, the semicolon is always after the block-ending statement, such as END; or END IF;.

Looping through query results

Before we embark on this journey of query result loops, I think I should warn you, that if you are using this method, you are probably doing it wrong. This is one of the most processor- and memory-intensive operations that PostgreSQL offers. There are exceedingly few reasons to iterate through a result set on the database server that offset this cost. I would encourage you to think hard about how to implement the same idea using a VALUES list in a query, temporary table, and permanent table, or to precompute the values in any way possible, in order to avoid this operation. So, do you still think you have an overwhelming reason to use this technique? Okay, then read on. The following is the simple version:

   FOR row IN 
EXECUTE 'SELECT * FROM job_queue q WHERE NOT processed LIMIT 100' 
LOOP 
    CASE row.process_type
      WHEN 'archive_point_of_sale'
        THEN  INSERT INTO hist_orders (...) 
              SELECT ... FROM orders 
                INNER JOIN order_detail ... 
                INNER JOIN item ...;
      WHEN 'prune_archived_orders'
        THEN DELETE FROM order_detail 
              WHERE order_id in (SELECT order_id FROM hist_orders);
             DELETE FROM orders 
              WHERE order_id IN (SELECT order_id FROM hist_orders);
      ELSE
        RAISE NOTICE 'Unknown process_type: %', row.process_type;
    END;
    UPDATE job_queue SET processed = TRUE WHERE id = q.id;
END LOOP;

The preceding example shows a basic strategy pattern of processing messages in a job queue. Using this technique, the rows in a table contain a list of jobs that need to be processed.

We introduce the EXECUTE statement here as well. The SELECT statement is a string value. Using EXECUTE, we can dynamically build PL/pgSQL commands as strings and then invoke them as statements against the database. This technique comes in handy, when we want to change the table name or other SQL keywords that make up our statement. These parts of the SQL statement cannot be stored in variables and are not generally changeable. With EXECUTE, we can change any part of the statement we jolly well please. We must mention that EXECUTE has a cost associated with it: the queries are prepared each time before running.

The following is an example from the PostgreSQL documentation that shows dynamic commands running inside a loop:

CREATE FUNCTION cs_refresh_mviews() RETURNS integer AS $$
DECLARE
    mviews RECORD;
BEGIN
    PERFORM cs_log('Refreshing materialized views...');
    FOR mviews IN SELECT * FROM cs_materialized_views ORDER BY sort_key LOOP
        -- Now "mviews" has one record from cs_materialized_views
        PERFORM cs_log('Refreshing materialized view ' || quote_ident(mviews.mv_name) || ' ...');
        EXECUTE 'TRUNCATE TABLE ' || quote_ident(mviews.mv_name);
        EXECUTE 'INSERT INTO ' || quote_ident(mviews.mv_name) || ' ' || mviews.mv_query;
    END LOOP;
    PERFORM cs_log('Done refreshing materialized views.');
    RETURN 1;
END;
$$ LANGUAGE plpgsql;

The preceding looping example, shows a more complex function that refreshes the data in some staging tables. These staging tables are designated materialized views because the data is actually physically transferred to the staging tables. This method was a common way to reduce query execution overhead for many presentations of the same data, before materialized views were officially supported in PostgreSQL 9.3. In this case, the inefficiency of looping is trivial compared to the continued cost of repeated queries to the same data.

PERFORM versus SELECT

You may have noticed a statement in the previous example that we haven't covered yet. Use the PERFORM command when you want to just discard the results of a statement. Change the previous example to the following line:

SELECT cs_log("Done refreshing materialized views");

The query engine will return No destination for result data.

We can retrieve the results into variables, and then proceed to ignore the variables, but that's just a little too sloppy for my taste. By using the PERFORM statement, we have indicated that ignoring the results was not accidental. We were happy with the fact that the log was appended to blindly, and if it wasn't, oh well, we didn't fail to continue the execution because of a log entry issue.

Looping Through Arrays

There is a very convenient loop construct called FOREACH, which allows you to loop through the elements of an array. Let's dive into an example:

CREATE FUNCTION findmax(int[]) RETURNS int8 AS $$
DECLARE
  max int8 := 0;
  x int;
BEGIN
  FOREACH x IN ARRAY $1
  LOOP
    IF x > max THEN
      max := x;
    END IF;
  END LOOP;
  RETURN max;
END;
$$ LANGUAGE plpgsql;

This function is quite self-explanatory. It finds the maximum values from a given integer array. The point to be noted here is that unlike a normal FOR loop, a FOREACH loop can only use a counter variable that is already declared. You can see that we have declared x before we used it as a counter in the loop. You can run this function to see if it works:

postgres=# select findmax(ARRAY[1,2,3,4,5, -1]);
 findmax 
---------
    5
(1 row)