Universal Unique Identifiers PostgreSQL SQL Server Compare

Thursday, October 07. 2010

Universal Unique Identifiers PostgreSQL SQL Server Compare

Recommended Books: Beginning Databases with PostgreSQL SQL Server MVP Deep Dives

Universal Unique Identifiers are 16-byte / 32-hexadecimal digit (with 4 -s for separation) identifiers standardized by the Open Software Foundation.

The main use as far as databases go is to ensure uniqueness of keys across databases. This is important if you have multiple servers or disperate systems that need to replicate or share data and each can generate data on its own end. You want some non-centralized mechanism to ensure the ids generated from each server will never overlap. There are various open standards for generating these ids and each standard will tie the id based on some unique identifier of the computer or a namespace or just a purely random generator algorithm not tied to anything. Since this is a question often asked by users coming from Microsoft SQL Server, we will demonstrate in this article the same concept in Microsoft SQL Server and how you would achieve similar functionality in PostgreSQL.

I'm not sure how popular they are to use in other databases, but in SQL Server (and even in Microsoft Access), they are very popular, particularly if you need to synchronize data with various offices. If you come from Microsoft world, you probably know UUIDs by the term GUIDs, RowGUID, or data type uniqueidentifier or the thing that the NEWID() function in SQL Server generates, or in Microsoft Access when you choose AutoNumber and FieldSize = Replication ID. All the aforementioned use the standard 32-hexadecimal digit (with -) unique identifier. You may be wondering how you could get the same functionality in PostgreSQL.

PostgreSQL offers similar functionality via the uuid datatype. In additon there is a contrib module uuid-osp which provides various functions for auto-generating uuids. You can use these functions as default values in your table for columns that you need to have unique identifiers or just call them directly similar to what you do with SQL Server's NEWID() function. The main difference between the two is that SQL Server's text representation of the GUID is sometimes shown with {} (though not always) and sometimes as uppercase letters for presentation. PostgreSQL's native display representation doesn't have {} and usually uses lowercase letters. These are merely display representations (and can even be different depending on which display tool you are using). The display representation vs the actual type causes all sorts of confusion when people try to allocate space for them in there systems. SQL Server calls the type UniqueIdentifier and PostgreSQL calls the type uuid. Both types occupy 16-bytes of storage. For compatibility reasons with other software or databases, many use some stanardized text representation of them particularly for transport rather than using the native type. The text representation may require anywhere from 32 to 38 bytes (characters) contingent on if you add dashes or {}. Thus the confusion in storage requirements particularly when transporting to databases that don't have an equivalent of this type.

The GUID/UUID is not a text in either PostgreSQL or SQL Server system. They are native typs in both systems and are stored using 16 bytes. PostgreSQL does although have auto casts built in for converting uuids to text where as SQL Server requires an explicit cast to varchar.

As a side note: SQL Server does not support casting these to text even in SQL Server 2008 but PostgreSQL will happily cast to text or varchar. Just something to keep in mind if you need to create code that is compatible in both systems.

To demonstrate :

In SQL Server: If we try to do this:

SELECT NEWID() + '1234';

We get an error:
The data types uniqueidentifier and varchar are incompatible in the add operator.

However we can do this:

SELECT CAST(NEWID() As varchar(50)) + '1234'; which yields:
372F2ED9-37B0-4719-BC06-7ED9D730052A1234

As with most PostgreSQL contribs, the uuid-osp module can be installed in your PostgreSQL databases by running the SQL script share/contrib/uuid-ossp.sql which is located in your PostgreSQL install folder.

Similar example in PostgreSQL using the uuid-ossp contrib module and the ANSI-SQL standard || concatenate operator (NOTE: SQL Server uses + for concatenation):

SELECT uuid_generate_v1() || '1234'

Yields: c3eeeb64-cd6c-11df-a41f-1b8d709f51b91234

We try to avoid relying on automatic casts whereever possible, just because you may get caught off guard if there are two automatic casts that are equally acceptible or there is no automatic cast at all in place. So to write the PostgreSQL in a more safe way and also to behave more like the SQL Server implementation, we would do

SELECT CAST(uuid_generate_v1() AS varchar(50)) || '1234'

Using Unique identifiers as row ids

There are 2 general ways of assigning unique identifiers to rows in SQL Server and PostgreSQL, and a 3rd way specific to SQL Server.

Explicit assignment in a stored proc or code: using SQL Server NEWID() function or one of PostgreSQL's uuid-osp contrib functions take your pick: uuid_generate_v1(), uuid_generate_v1mc(), uuid_generate_v4() functions or build your own.
Setting the default value of the column to in SQL Server - SQL Server NEWID() or one of PostgreSQL's menu of uuid generator functions or build your own for either system.
In SQL Server -- setting data type to uniqueidentifier and is RowGUID property of the column to Yes (similar in MS Access -- choosing data type AutoNumber and Size to ReplicationID): This last option only apply's to SQL Server and is really syntactic sugar for option 2.

Now I'm not sure what algorithm SQL Server uses in its NEWID(). Could be completely random in which case its most like PostgreSQL's uuid_generate_v4() function. At anyrate it doesn't really matter too much since all approaches will more or less guarantee unique identifiers across systems following the standard Unique Identifier form. PostgreSQL's uuid_generate_v1() generator is not completely random and generates a UUID that contains part of the Mac Address of the generating machine. This is nice in some sense because you could tie a record back to the machine that generated it, or not so nice from a security standpoint because it exposes a bit the identity of the machine that generated it. I tend to think of it as a nice informational feature particularly if you have several office servers and want to know the office that generated the record.

Below are some examples of using the UUID/GUID to assign identifiers in both systems. We'll cover both using the native type as well as using a text representation for ease of portability across systems that don't support the native types.

Operation PostgreSQL SQL Server

Create a table that has a UUID/GUID type automatically assigned

CREATE TABLE equipment(
    equip_id uuid PRIMARY KEY
        DEFAULT uuid_generate_v1() , 
    equip_name varchar(255));

-- SQL Server approach 1 identical 
-- to PostgreSQL approach --
CREATE TABLE equipment(
    equip_id uniqueidentifier PRIMARY KEY
        DEFAULT NEWID() , 
    equip_name varchar(255));
    
-- SQL Server approach 2 
-- syntactic sugar of first --
CREATE TABLE equipment(
    equip_id uniqueidentifier 
 PRIMARY KEY ROWGUIDCOL ,
    equip_name varchar(255) );

Create a table that has text representation of UUID/GUID as primary for easier transportation

CREATE TABLE equipment(
    equip_id char(32) PRIMARY KEY
        DEFAULT LOWER(
            REPLACE(
                CAST(uuid_generate_v1() As varchar(50))
                , '-','')
            ) , 
    equip_name varchar(255));

CREATE TABLE equipment(
    equip_id char(32) PRIMARY KEY
        DEFAULT LOWER(
            REPLACE(
                CAST(NEWID() As varchar(50)), '-','')
            ) , 
    equip_name varchar(255));

Encapsulate text id generator in a function

--Define our own functions
-- with same name
-- return text representations
-- PostgreSQL --
CREATE SCHEMA myfuncs;
CREATE FUNCTION myfuncs.fn_mynewid() 
RETURNS 
char(32) AS
$$
    SELECT LOWER(
        REPLACE(
        CAST(uuid_generate_v1() As varchar(50))
            , '-','')
    );
$$
language 'sql';

CREATE TABLE equipment(
    equip_id char(32) PRIMARY KEY
        DEFAULT myfuncs.fn_mynewid(), 
    equip_name varchar(255));

-- SQL Server equivalent;
-- a lot trickier since if you try to do it the naive
-- way even in SQL Server 2008,
-- you get a
-- Invalid use of a side-effecting 
operator 'newid' within a function.
-- here is a hack as described in 
-- http://www.sqlservercentral.com/Forums/Topic459237-8-1.aspx
CREATE SCHEMA myfuncs
GO
CREATE VIEW myfuncs.MyNewID 
     AS 
 SELECT LOWER(
            REPLACE(
                CAST(NEWID() As varchar(50))
                , '-','')
            )  AS MyNewID
GO
-- create function using the trick view
CREATE FUNCTION myfuncs.fn_MyNewID ()
RETURNS char(32)
     AS
  BEGIN
 RETURN (SELECT MyNewID FROM myfuncs.MyNewID)
    END
GO

CREATE TABLE equipment(
    equip_id char(32) PRIMARY KEY
        DEFAULT myfuncs.fn_mynewid(), 
    equip_name varchar(255))
GO

In the very last example, since we created wrapper functions with the same name, we can use the same application code base for both systems even if we want the identifier to be generated in our application for easier retrieval.

Now to take each for a test drive using our last variant:

Action PostgreSQL Output SQL Server Output

SELECT myfuncs.fn_mynewid(); 8a0be412d21511df8757a3f328265df0 2b80fe3543e44fe9a378215d6e10fe4a

INSERT INTO equipment(equip_name)
    VALUES('Fax');
SELECT * FROM equipment;

             equip_id             | equip_name
----------------------------------+------------
 51d4ca36d21611df8448a3b219d463c5 | Fax

equip_id                         equip_name
-------------------------------- -------------
9c2f44bfa60546b7ab1d95253c6070a2 Fax

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, 9.0, beginner, contrib spotlight, ms access, postgresql versions, sql server, uuid_osp at 18:55 | Comment (1) | Trackback (1)

Trackbacks

Trackback specific URI for this entry

PingBack

Weblog: sql.darmowe-blogi.pisz.pl
Tracked: Oct 07, 21:50

Comments

Display comments as (Linear | Threaded)

No comments

Add Comment

Name
Email
Homepage
In reply to
Comment	E-Mail addresses will not be displayed and will only be used for E-Mail notifications. To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly. Enter the string from the spam-prevention image above: Phone* What is three plus two?
	Remember Information? Subscribe to this entry

Universal Unique Identifiers PostgreSQL SQL Server Compare

Postgres OnLine Journal

PostGIS in Action About the Authors Consulting

Thursday, October 07. 2010

Universal Unique Identifiers PostgreSQL SQL Server Compare

Using Unique identifiers as row ids

Entry's Links

Quicksearch

Calendar

Categories

Archives

Subscribe

Blog Administration

Universal Unique Identifiers PostgreSQL SQL Server Compare

Postgres OnLine Journal PostGIS in Action About the Authors Consulting

Thursday, October 07. 2010

Universal Unique Identifiers PostgreSQL SQL Server Compare

Using Unique identifiers as row ids

Entry's Links

Quicksearch

Calendar

Categories

Archives

Subscribe

Blog Administration

Postgres OnLine Journal

PostGIS in Action About the Authors Consulting