Wednesday, April 28, 2010

PostgreSQL simple dblink cursor example

In PostgreSQL, dblink works fine to query the contents of a table remotely and insert into a new table. However, sometimes that table is too big to fit into memory. Here is an example function which uses cursors to do just that:

CREATE OR REPLACE FUNCTION get_data() RETURNS void AS $$
BEGIN
PERFORM dblink_connect('dbname=db hostaddr=host port=5432
user=user password=password');
PERFORM dblink_open('curs', 'select * from table');
LOOP
INSERT INTO table
SELECT data.*
FROM dblink_fetch('curs', 1)
AS data();
IF NOT FOUND THEN
EXIT;
END IF;
END LOOP;
PERFORM dblink_close('curs');
PERFORM dblink_disconnect();
END;
$$ LANGUAGE plpgsql;

Tuesday, April 27, 2010

PostgreSQL Arrays and Java JDBC (always quote on insert)

I ran into a problem recently using PostgreSQL arrays with Java. I had a table where each row represented a sentence in a web page. One column contained the entire sentence. Another column contained an array of tokens that make up that sentence (as parsed by GATE).

While parsing a Wikipedia page, it contained the character: '†' (\u2020). This inserted fine as a sentence, but became {"?"} in the array. I used the class PostgreSQLTextArray from Valentine's tech log (thanks Google). The contents would not display in pgAdmin3. How in the world could the same character work to insert into a character varying field but not a character varying[] field?!

What I found out is that this character and many others need to be quoted to properly insert into a PostgreSQL array. I changed the Array class I was using to always quote characters.