The goal of this page — which is a work in progress — is to gather information relevant for people who are porting SQL from one product to another and/or are interested in possibilities and limits of 'cross-product' SQL.
The following tables compare how different RDBMS products handle various SQL (and related) features. If possible, the tables also state how the implementations should do things, according to the SQL standard.
I will only write about subjects that I've worked with personally, or subjects which I anticipate to find use for in the near future. Subjects on which there are no significant implementation variances are not covered. Beta-versions of software are not examined. Furthermore, I will not mention all the cases of MySQL not supporting standard SQL features.
I'm sorry about the colors. They are a result of wanting to mark each DBMS differently and at the same time wanting to be relatively nice to printers.
If you have corrections or suggestions, please contact me; even notifications about spelling errors are welcome.
The following SQL standard and implementations have been examined, if not otherwise stated:
| Standard |
The latest official version of SQL is SQL:2003.
I don't have access to the official ISO standard text, but
Whitemarsh Information Systems Corporation
provides a rather final draft
as a zip-archive, containing several files. Most important to this
page is the file SQL:2003 is very new, and the only book currently covering the subject is in German which I was never any good at.
Therefore, I also use the following book as reference:
|
| PostgreSQL |
PostgreSQL 8.0.0 on Fedora Core Linux. Documentation |
| DB2 |
DB2 Universal Database Personal Edition v. 8.1FP7 (AKA v. 8.2) on Fedora Core Linux. Documentation (takes a while to render properly) |
| MS SQL Server |
MS SQL Server 2000 SP3a on Windows 2000.
Microsoft's SQL implementation is sometimes named Transact-SQL.
In this document, I'll generally write MSSQL as a short-hand for
Microsoft's SQL Server product. Documentation (takes a while to render properly) |
| MySQL |
MySQL Database Server 4.1.9 on Fedora Core Linux (i.e. MySQL AB's "classic" DBMS product—not MaxDB). Documentation (Note: The online MySQL manual documents MySQL beta-releases, i.e. sometimes it doesn't reflect the situation in the current MySQL production versions.) |
| Oracle |
Oracle Database 10g Release 1 Standard Edition on Fedora Core release 1 (Linux). The tables should
hold for version 9i, as well (Oracle 10g contains remarkably few improvements/changes in Oracle's SQL
standard compliance). Documentation (requires OTN-membership) |
The products are running with their default settings. This is important for MySQL and MSSQL: Their interpretation of SQL may be changed rather drastically by adjusting certain configuration options, potentially increasing the level of standard compliance. However, such non-default configuration options are not of great value for people writing SQL applications because the developer often cannot rely on non-default configuration settings.
| Standard | Views are part of the standard, and they may be updated, as long as it 'makes sense'.
SQL:2003 has a rather complicated set of rules governing when a view is updatable, basically saying that a view is updatable, as long as the update-operation translates into an unambiguous change. SQL-92 was more restrictive, specifying that updatable views cannot be derived from more than one base table. |
| PostgreSQL | Has views. Breaks that standard by not allowing updates to views; offers the non-standard 'rules'-system as a work-around. |
| DB2 | Conforms to at least SQL-92. |
| MSSQL | Conforms to at least SQL-92. |
| MySQL | Breaks the standard by not offering views. |
| Oracle | Conforms to at least SQL-92. |
Peter Gulutzan has written an article about the implementation of views in three major products.
All the RDBMSes support basic INNER JOINs, but vary in their support for other join types.
In the following feature chart, a
means yes; an empty table cell means no.
| Join type/feature | PostgreSQL | DB2 | MSSQL | MySQL | Oracle |
|---|---|---|---|---|---|
Natural joins (only tested: NATURAL LEFT JOIN) |
|||||
USING-clause |
( |
||||
FULL joins3 (tested: SELECT...FULL JOIN...ON...=...) |
|||||
Explicit CROSS JOIN (cartesian product) |
Remarks:
NATURAL join implementing DBMSes, making MySQL's NATURAL joins less practical.USING seems to have a bug; hence the parenthesized check-mark.FULL joins may be emulated with a union of a left and a right join.| Standard | The SQL-standard states that relations are unordered, but
result sets may be ordered when returned to the user through a cursor:
As such,
the standard doesn't allow The standard doesn't specify how NULLs should be ordered in comparison with non-NULL values, except that NULLs are to be considered equal in the ordering, and that NULLs should sort either above or below all non-NULL values. |
| PostgreSQL | Allows ORDER BY in contexts other
than cursor definitions. NULLs are considered higher than any non-NULL value.
|
| DB2 | Allows ORDER BY in contexts other
than cursor definitions. NULLs are considered higher than any non-NULL value.
|
| MSSQL | Allows ORDER BY in contexts other
than cursor definitions. NULLs are considered lower than any non-NULL value.
|
| MySQL | Allows ORDER BY in contexts other
than cursor definitions.
NULLs are considered lower than any non-NULL value,
except if a |
| Oracle | Allows ORDER BY in contexts other
contexts than cursor definitions.
By default, NULLs are considered higher
than any non-NULL value; however, this sorting behaviour may be changed
by adding Beware of Oracle's strange treatment of emtpy strings and NULLs as the same 'value'. |
Objective: Want to only get n rows in the result set.
Usually only makes sense in connection with an ORDER BY expression.
Note: This is not the same as a top-n query — see next section.
Note also: Some of the queries below may not be legal in all situations, such as in views or sub-queries.
| Standard | Non-core Feature ID T611 specifies window functions, of which
one is ROW_NUMBER() OVER:
SELECT * FROM (
If your application is stateful (in contrast to web applications which normally have to be seen as stateless), then you might look at cursors (core feature ID E121) instead. This involves:
|
| PostgreSQL |
Doesn't support ROW_NUMBER(). Supports cursors (in all contexts, not only
in embedded, dynamic SQL).
Alternative to using
Note that |
| DB2 |
Supports both standards-based approaches. An alternative to using ROW_NUMBER(), which
may be quicker in some situations(?):
|
| MSSQL |
Doesn't support ROW_NUMBER(); supports cursors.
Alternative to using
|
| MySQL |
Doesn't support the standard. Alternative solution:
|
| Oracle |
Doesn't support ROW_NUMBER(); seems to have non-compliant cursor facilities.
An alternative to using
|
Objective: Like the simple limit-query above, but include rows with tie conditions. Thus, the query may return more than n rows.
Some call this a quota-query.
The following examples are based on this table:
SELECT * FROM person ORDER BY age ASC; +----------+-------------+-----+ |PERSON_ID | PERSON_NAME | AGE | +----------+-------------+-----+ | 7 | Hilda | 12 | | 8 | Bill | 12 | | 4 | Joe | 23 | | 2 | Veronica | 23 | | 3 | Michael | 27 | | 9 | Marianne | 27 | | 1 | Ben | 50 | | 10 | Michelle | 50 | | 5 | Irene | 77 | | 6 | Vivian | 77 | +----------+-------------+-----+
Now, we only want the three (n=3) youngest persons displayed, i.e. a result set like this:
+----------+-------------+-----+ |PERSON_ID | PERSON_NAME | AGE | +----------+-------------+-----+ | 7 | Hilda | 12 | | 8 | Bill | 12 | | 4 | Joe | 23 | | 2 | Veronica | 23 | +----------+-------------+-----+
| Standard | With standard SQL, there are two principal ways to obtain the wanted data:
In the article Going To Extremes by Joe Celko, there is a description of yet another principle for performing quota queries, using scalar subqueries. Scalar subqueries are more tedious to write but might yield better performance on your system. |
| PostgreSQL |
Supports the slow standard SQL
query variant. In practice, a PostgreSQL-only method should be used, in order to obtain
acceptable query performance.
(Change |
| DB2 | Supports the fast standard SQL variant. |
| MSSQL |
Supports the slow standard SQL variant. In practice, a MSSQL-only
expression should be used, in order to obtain acceptable query performance:
(Change |
| MySQL |
Supports the slow standard SQL
solution. In practice, this MySQL-specific solution should be used,
in order to obtain acceptable query performance:
(Change The offset-value 2 is the result of n-1 (remember: n is 3 in these examples). The second argument to the |
| Oracle |
Supports the fast standard SQL variant.
However, as Oracle doesn't like "AS ..." after subqueries
(and doesn't require naming of subqueries), the query has to be
paraphrased slightly:
(Change |
Objective: Want to only get n rows in the result set,
and we want the first skip rows in the result set discarded.
Usually only makes sense in connection with an ORDER BY expression.
In the recipes below, basic ordering is ASCending, i.e.
lowest-first queries. If you want the opposite, then change
ASC->DESC and DESC->ASC
at the places emphasized like this.
| Standard |
Non-core Feature ID T611 specifies window functions, one of which is ROW_NUMBER() OVER:
SELECT * FROM (
Alternatively, you may use a cursor (core feature ID E121), if the programming environment permits it. This involves:
|
| PostgreSQL |
Doesn't support ROW_NUMBER(). Supports cursors.
Alternative to
|
| DB2 | Supports both standard approaches. |
| MSSQL |
Doesn't support ROW_NUMBER(); supports cursors.
Alternative to ROW_NUMBER():
|
| MySQL |
Doesn't support the standard approaches. Alternative solution:
SELECT columns
In older versions of MySQL, the LIMIT-syntax is less clear:
|
| Oracle |
Doesn't support ROW_NUMBER(), and Oracle's cursor support doesn't look
standards-compliant. An alternative to the ROW_NUMBER()-solution:
|
LIMIT/TOP/FIRST queries with offset are often used in a result presentation context: To retrieve only—say—30 rows at a time so that the end-user isn't overwhelmed by the complete result set, but instead is offered a paginated result presentation. In this case, be careful not to (only) sort on a non-unique column.
Consider the following example (where PostgreSQL is used):
SELECT * FROM person ORDER BY age ASC;
person_id | person_name | age
-----------+-------------+-----
7 | Hilda | 12
8 | Bill | 12
4 | Joe | 23
2 | Veronica | 23
3 | Michael | 27
9 | Marianne | 27
1 | Ben | 50
10 | Michelle | 50
5 | Irene | 77
6 | Vivian | 77
When ordering is performed on the non-unique age-value, ties may occur and it's not guaranteed that the DBMS will fetch the rows in the same order every time.
Instead of the above listing, the DBMS is allowed to return the following display order where Michael and Marianne are displayed in the opposite order compared to above:
SELECT * FROM person ORDER BY age ASC;
person_id | person_name | age
-----------+-------------+-----
7 | Hilda | 12
8 | Bill | 12
4 | Joe | 23
2 | Veronica | 23
9 | Marianne | 27
3 | Michael | 27
1 | Ben | 50
10 | Michelle | 50
5 | Irene | 77
6 | Vivian | 77
Now, suppose the end-user wants the results displayed five rows at a time. The result set is fetched in two queries where the DBMS happens to sort differently, as above. We will use PostgreSQL's syntax in the example:
SELECT * FROM person ORDER BY age ASC LIMIT 5;
person_id | person_name | age
-----------+-------------+-----
7 | Hilda | 12
8 | Bill | 12
4 | Joe | 23
2 | Veronica | 23
3 | Michael | 27
SELECT * FROM person ORDER BY age ASC LIMIT 5 OFFSET 5;
person_id | person_name | age
-----------+-------------+-----
3 | Michael | 27
1 | Ben | 50
10 | Michelle | 50
5 | Irene | 77
6 | Vivian | 77
Notice that Marianne was not displayed in any of the two split result set presentations.
The problem could be avoided if the result set ordering had been done in
a deterministic way, i.e. where the unique person_id value was considered
in case of a tie:
SELECT * FROM person ORDER BY age ASC, person_id ASC ...
This is safer than to pray for the DBMS to behave in a predictable way when
handling non-unique values.
Note: If the table is updated between parts of the result set pagination, then the user might still get an inconsistent presentation. If you want to guard against this, too, then you should see if use of an insensitive cursor is an option in your application. Use of cursors to paginate result sets usually require that your application is stateful, which is not the case in many web-application settings.
Alternatively, you could copy the result set to a table and use the copied table for pagination; with this solution, you might accumulate a vast amount of result-set generated tables if you don't remember to clean up. If you end up playing with such ideas, then it's probably a good idea to start considering if the entire result set may be cached by the application (e.g. in a session if your web application environment provides for sessions).
| Standard | An optional SQL feature is row value constructors (feature ID F641). One handy use
of row value constructors is when inserting several rows at a time, such as:
— which can be seen as a shorthand for
|
| PostgreSQL | Not supported. |
| DB2 | Supported. |
| MSSQL | Not supported. |
| MySQL | Supported. |
| Oracle | Not supported. |
| Standard | The BOOLEAN type is optional (has feature ID T031), which is
a bit surprising for such a basic type. However, it seems that endless discussions of how
NULL is to be interpreted for a boolean value is holding BOOLEAN from becoming
a core type.
The standard says that a BOOLEAN may be one of the following literals:
The DBMS may interpret NULL as equivalent to UNKNOWN. It is unclear from the specification if the DBMS must support UNKNOWN, NULL or both as boolean literals. In this author's opinion, you should forget about the UNKNOWN literal in order to simplify the situation and let the normal SQL three-way logic apply. It's defined that TRUE > FALSE (true larger than false). |
| PostgreSQL | Follows the standard.
Accepts NULL as a boolean literal; doesn't accept UNKNOWN as a boolean literal. |
| DB2 | Doesn't support the BOOLEAN type.
Judging from various JDBC-documentation, it seems that IBM recommends a CHAR(1) field constrained to values '0' and '1' (and perhaps NULL) as the way to store boolean values. |
| MSSQL | Doesn't support the BOOLEAN type.
Possible alternative type: the BIT type which may have 0 or 1 (or NULL) as value. If you insert an integer value other than these into a field of type BIT, then the inserted value will silently be converted to 1. Rudy Limeback has some notes about oddities with the MSSQL BIT type. |
| MySQL | Offers a non-conforming BOOLEAN type. MySQL's BOOLEAN
is one of many aliases to its TINYINT(1) type.
MySQL accepts the literals TRUE and FALSE as aliases to 1 and 0, respectively. However, you may also assign a value of - e.g. - 9 to a column of type BOOLEAN (which is non-conforming). If you use JDBC with MySQL, then BOOLEAN is the preferred type for booleans: MySQL's JDBC-driver implictly converts between Java's boolean and MySQL's pseudo-BOOLEAN type. |
| Oracle | Doesn't support the BOOLEAN type.
Judging from various JDBC documentation, it seems that Oracle recommends NUMBER(1) as the way to store boolean values; it's probably wise to constrain such columns to values 0 and 1 (and perhaps NULL). |
Warning to JDBC users:
According to the JDBC standard, getBoolean() must convert a
SQL-'value' of NULL to the false Java value. To check if
the database-value was really NULL, use wasNull().
For the following section, I have used this test-SQL to try to illuminate differences (unfortunately, even standard SQL as simple as this has to be adjusted for some products):
Test steps:
CREATE TABLE chartest (
charval1 CHAR(10) NOT NULL,
charval2 CHAR(10) NOT NULL,
varcharval VARCHAR(30) NOT NULL
);
INSERT INTO chartest VALUES ('aaa','aaa','aaa');
INSERT INTO chartest
VALUES ('aaaaaa ','aaa','aaa'); -- should truncate to 'aaaaaa '
INSERT INTO chartest
VALUES ('aaaaaaaaaaaa','aaa','aaa'); -- should raise error
SELECT * FROM chartest; -- should show two rows
DELETE FROM chartest WHERE charval1='aaaaaa';
SELECT * FROM chartest; -- should show one row
SELECT * FROM chartest WHERE charval1=varcharval;
SELECT charval1 || 'X' AS res FROM chartest;
SELECT CHAR_LENGTH(charval1 || charval2) AS res FROM chartest;
SELECT CHAR_LENGTH(charval1) + CHAR_LENGTH(charval2)
AS res
FROM chartest;
Expected results, after CREATE and INSERTs:
SELECT * FROM chartest; -- should show two rows
CHARVAL1 CHARVAL2 VARCHARVAL
========== ========== ==============================
aaa aaa aaa
aaaaaa aaa aaa
DELETE FROM chartest WHERE charval1='aaaaaa';
SELECT * FROM chartest; -- should show one row
CHARVAL1 CHARVAL2 VARCHARVAL
========== ========== ==============================
aaa aaa aaa
SELECT * FROM chartest WHERE charval1=varcharval;
CHARVAL1 CHARVAL2 VARCHARVAL
========== ========== ==============================
aaa aaa aaa
SELECT charval1 || 'X' FROM chartest AS res;
res
===========
aaa X
SELECT CHAR_LENGTH(charval1 || charval2) AS res FROM chartest;
res
===========
20
SELECT character_length(charval1) + character_length(charval2)
AS res
FROM chartest;
res
============
20
| Standard |
|
| PostgreSQL | Generally follows standard, but (conceptually) truncates
trailing white-space before performing some functions
(like the CHARACTER_LENGTH-function).
PostgreSQL's handling of the CHAR-type has changed in every recent major release. Consider using its VARCHAR type instead, unless you know for sure that your CHAR values will always be of a certain length and without trailing spaces; VARCHARs souldn't have worse performance than CHAR in PostgreSQL. |
| DB2 | Follows the standard. |
| MSSQL | Generally follows standard, but (conceptually) truncates
trailing white-space before performing some functions
(at least before LEN()).
|
| MySQL | Breaks the standard by silently inserting the string, truncated to specified column CHAR-length. (It's actually not completely silent, as it issues warnings if values were truncated: If you manually check for warnings, you will know that something bad happened, but not which of the rows are now invalid.) Note that some MySQL functions (at least the CHARACTER_LENGTH-function)
remove trailing spaces before operating.
|
| Oracle | Follows the standard, with a minor exception: Oracle doesn't remove trailing spaces which exceed the specified CHAR length, but raises an exception. |
| Standard |
Part of the Core requirements, feature ID F051-03. Stores year, month, day, hour, minute, second (with fractional seconds; default is 6 fractional digits). Extension to Core SQL (feature ID 411): TIMESTAMP WITH TIME ZONE which also stores the time zone. Examples of TIMESTAMP literals:
Examples of TIMESTAMP WITH TIME ZONE literals:
It's strange that TIMESTAMP WITH TIME ZONE literals
are not represented as, e.g.,
|
| PostgreSQL |
Follows that standard with one exception:TIMESTAMP '2003-08-23 01:02:03 +02:00' is interpreted
as a TIMEZONE WITHOUT TIME ZONE
(discarding the '+02:00' part)—not
as a TIMESTAMP WITH TIME ZONE value. The standard
may be illogical regarding this, but a standard is a standard...
Performs good sanity checks on inserted timestamp values; e.g. this will work: |
| DB2 |
DB2 has the TIMESTAMP data type, but not the extended TIMESTAMP WITH TIME ZONE type. DB2 accepts TIMESTAMP literals like '2003-07-23 00:00:00', however
it doesn't accept the typed TIMESTAMP '2003-07-23 00:00:00' variant.
Performs good sanity checks on inserted timestamp values; e.g. this will work: |
| MSSQL |
Note that MSSQL's choice of words related to date and time is
confusing: In MSSQL's vocabulary, datetime is a concrete data type,
whereas in the SQL standard, datetime is a general term covering the
DATE, TIME and TIMESTAMP types.
MSSQL has a strange pseudo-type called TIMESTAMP, but has deprecated it; don't use it in new code. The closest match to the SQL standard's TIMESTAMP type is DATETIME. This type stores the combination of date and time. It has a maximum of three fractional digits for seconds. Performs good sanity checks on inserted timestamp values; e.g. this will work: |
| MySQL |
In general: No matter what date/time data type chosen in MySQL, storage of fractional
seconds and time zones are not supported. You will have to invent your
own systems for such information.
Note also, that MySQL's choice of words related to date and time is confusing: In MySQL's vocabulary, datetime is a concrete data type, whereas in the SQL standard, datetime is a general term covering the DATE, TIME and TIMESTAMP types. MySQL has a type called TIMESTAMP, but it is quite different from
the standard TIMESTAMP: It's automatically updated to the current date and
time, if some criteria are fulfilled. This strange data
type with side effects seems to exist because MySQL doesn't support
normal DDL expressions involving 'dynamic' default values, like MySQL has a type called DATETIME. Like MySQL's TIMESTAMP type, it stores a combination of date and time without fractional seconds. There are no side effects associated with the DATETIME type—which makes it the closest match to the SQL standard's TIMESTAMP type. MySQL's sanity checks with regard to dates and time are poor. For example, MySQL gladly accepts a DATETIME value of '2003-02-29 00:05:00'. DATETIME values with errors that are easier to detect (like a value of '2003-01-32 00:00:00') yield a warning (which you must check for if you want to be warned) and inserts a value of zero. This means that if you want to be able to prevent the insertion of invalid DATETIME values in MySQL, your application must either use transactions (and hope that the tables support them) or record the state of the relevant rows before doing an update/insert, so that changes may manually be un-done, in case a warning should be issued by MySQL. |
| Oracle |
Follows the standard. Oracle has both the TIMESTAMP and the extended
TIMESTAMP WITH TIME ZONE types.
A special gotcha applies, though: Oracle forbids columns of type TIMESTAMP WITH TIME ZONE as part of a unique key; this includes primary and foreign keys. Timestamps without time zone (and Oracle's special TIMESTAMP WITH LOCAL TIME ZONE) are accepted. Performs good sanity checks on inserted timestamp values; e.g. this will work: |
See also this link to an article timestamp handling in the 'Big Three' (DB2, MSSQL, Oracle).
| Standard | CHARACTER_LENGTH(argument); returns NUMERIC. Returns NULL if the input is NULL. Alias: CHAR_LENGTH. The argument may be of type CHAR or VARCHAR. Part of the Core SQL requirements (feature ID E021-04). Related function: OCTET_LENGTH. |
| PostgreSQL |
Follows the standard, providing CHARACTER_LENGTH (and CHAR_LENGTH).
Note that PostgreSQL removes trailing (not leading) space from from CHAR values before counting. Note also that the behaviour of CHARACTER_LENGTH with regard to CHAR values has changed between versions 7.4 and 8.0 of PostgreSQL. |
| DB2 |
Doesn't have CHARACTER_LENGTH. Provides the LENGTH function instead.
Note that CHAR values are space-padded (like the standard
says they should be), so the length of
|
| MSSQL |
Doesn't have CHARACTER_LENGTH. Provides the LEN function instead.Note that MSSQL removes trailing (not leading) spaces from CHAR values before counting. |
| MySQL |
Provides CHARACTER_LENGTH. Aliases: CHAR_LENGTH, LENGTH. Note that MySQL removes trailing (not leading) spaces from CHAR values before counting. |
| Oracle |
Doesn't have CHARACTER_LENGTH. Provides the LENGTH function instead.
Behaves in strange ways if the input is the empty string or NULL, because of Oracles non-standard NULL handling (it considers NULL and the empty string identical 'values'). Note that CHAR values are space-padded (like the standard
says they should be), so the length of
|
| Standard | The standard defines two variants of the SUBSTRING function:
|
| PostgreSQL | PostgreSQL provides three SUBSTRING flavors:
|
| DB2 | Doesn't provide the standard SUBSTRING function. Provides SUBSTR(input,start-pos[,length]) instead (i.e. length is optional).
A start-pos of 1 is the first charater in the input string. DB2 is less permissive than the standard: Out-of-range values are not permitted, and NULL cannot be the value for start-pos or length. DB2 doesn't seem to provide regular expression facilities for queries. |
| MSSQL | MSSQL has a SUBSTRING function, but its syntax differs from that of the standard. The syntax is:
where start is an integer specifying the beginning of the string, and length is a non-negative integer indicating how many characters to return. |
| MySQL | MySQL supports the standard's ordinary SUBSTRING function, with a twist (see below). No regular expression based substring extraction is supported.
MySQL breaks the standard by always returning the empty string if start-position is set below 1, no matter the length asked for. |
| Oracle | Doesn't provide the standard SUBSTRING function. Provides SUBSTR(input,start-pos[,length]) instead (i.e. length is optional).
Oracle provides a number of SUBSTR-variants (SUBSTRB, SUBSTRC, SUBSTR2, SUBSTR4, same syntax as for SUBSTR), mainly for handling various kinds of non-latin-only string-types. Oracle doesn't have support for string-extraction with the special SQL-style regular expressions. Instead, it has the REGEXP_SUBSTR function which offers string extraction, using POSIX-style regular expression pattern matching. Documentation: SUBSTR and REGEXP_SUBSTR. |
Note: If you find yourself using SUBSTRING in a WHERE-expression, then consider if LIKE could be used instead: The use of LIKE will typically make your DBMS try to use an index, whereas it will typically not try to do so in connection with functions.
By REPLACE is meant a string-function which searches a source string (haystack) for occurrences of a string to be replaced (needle) and replaces it with a new string (replacement).
| Standard | Not mentioned. May be obtained
through a combination of other functions (have a look at the
OVERLAY, POSITION and CHARACTER_LENGTH functions).
A de facto standard seems to have emerged with regard to REPLACE: REPLACE (haystack:string,needle:string,replacement:string)which means 'replace needle with replacement in the string haystack'. Replacement is done case-sensitively unless otherwise stated. The REPLACE function may be handy for correcting spelling errors (and other situations):
UPDATE tablename
|
| PostgreSQL | Follows de facto standard.
Documentation |
| DB2 | Follows de facto standard.
Documentation |
| MSSQL | Follows de facto standard with the exception that MSSQL by default works case insensitively.
Documentation |
| MySQL | Follows de facto standard.
MySQL even works case sensitively.1 Note that the REPLACE-function is different from MySQL's
non-standard REPLACE INTO expression.
Documentation |
| Oracle | Follows de facto standard.
Documentation |
Note 1:
In this author's opinion, it's confusing that most (if not all)
string-related functions in MySQL work case sensitively,
while MySQL's default
behaviour is to work case insensitively in
plain WHERE-clauses involving string comparisons.
| Standard |
Core SQL feature ID E021-09:
TRIM(where characters FROM string_to_be_trimmed)
where may be one of LEADING, TRAILING or BOTH—or omitted which implies BOTH. characters indicates what character(s) to remove from the head and/or tail of the string. It may be omitted which implies the value ' ' (space character). In other words, the shortest form is Trimming NULL returns NULL. |
| PostgreSQL | Follows the standard. |
| DB2 |
Doesn't support the standard TRIM function.
Provides |
| MSSQL |
Doesn't support the standard TRIM function.
Provides |
| MySQL | Follows the standard. |
| Oracle | Follows the standard with two exceptions:
|
It's often important to get the value of current date and time. Below are the functions used to do that in the different implementations.
| Standard |
The current timestamp (without time zone) is retrieved
with the LOCALTIMESTAMP function which may be used as:
Note that " If the DBMS supports the non-core time zone features (feature ID F411), then
it must also provide the functions |
| PostgreSQL | Follows the standard. |
| DB2 |
Doesn't have the LOCALTIMESTAMP function.
Instead, it provides a special, magic value ('special register' in IBM language), CURRENT_TIMESTAMP (alias to 'CURRENT TIMESTAMP') which may be used as though it were a function without arguments. However, since DB2 doesn't provide TIMESTAMP WITH TIME ZONE support, the availability of CURRENT_TIMESTAMP could be said to be against the standard—at least confusing. |
| MSSQL |
Doesn't have the LOCALTIMESTAMP function.
Instead, it has CURRENT_TIMESTAMP which—however—doesn't return a value of TIMESTAMP WITH TIME ZONE, but rather a value of MSSQL's DATETIME type (which doesn't contain time zone information). |
| MySQL | Follows the standard. |
| Oracle | Follows the standard. |
| Standard |
Core feature ID E021-07: Concatenating two strings is done with the ||
operator:
If at least operand is NULL, then the result is NULL. It's unclear to me if the DBMS is allowed to try to automatically cast the operands to concatenation-compatible types. |
| PostgreSQL |
Follows the standard.
Automatically casts the concatenated values into types compatible with concatenation. If an operand is NULL then the result is NULL. |
| DB2 |
Follows the standard.
Does not automatically cast concanated values into compatible types. Throws exception if an operand is NULL. |
| MSSQL |
Breaks the standard by using the '+' operator instead of '||'.
Does not automatically cast operands to compatible types. If an operand is NULL, then the result is NULL. |
| MySQL |
Badly breaks the standard by redefining || to mean OR.
Offers instead a function, Automatically casts values into types which can be concatenated. If an operand is NULL, then the result is NULL. |
| Oracle |
Follows the standard.
Automatically casts values into types which can be concatenated. As Oracle interprets NULL as the empty string, it doesn't return NULL if an operand is NULL. |
| Standard | As the constraint name indicates,
a (set of) column(s) with a UNIQUE constraint may only contain
unique (combinations of) values.
A column—or a set of columns—which is subject to a UNIQUE constraint must also be subject to a not NULL constraint, unless the RDBMS implements an optional "NULLs allowed" feature (Feature ID 591). The optional feature adds some additional characteristics to the UNIQUE constraint:
|
| PostgreSQL | Follows the standard, including the optional NULLs allowed feature. |
| DB2 | Follows the non-optional parts of the UNIQUE-constraint. Doesn't
implement the optional NULLs allowed
feature.
Documentation (see the unique-constraint section of the page). |
| MSSQL | Follows the standard—with a twist:
MSSQL offers the NULLs allowed feature, but allows at most one instance of a NULL-'value', if NULLs are allowed; i.e. breaks characteristic 2 in the above description of the standard. |
| MySQL | Follows the standard, including the optional NULLs allowed feature. |
| Oracle | Follows the standard—with a twist:.
The optional NULLs allowed feature is implemented: If the UNIQUE-constraint is imposed on a single column, then the column may contain any number of NULLs (as expected from characteristic 2 in the above description of the standard). However, if the UNIQUE-constraint is specified for multiple columns, then Oracle sees the constraint as violated if any two rows
|
It's sometimes handy to have the DBMS handle generation of keys. The DBMSes offer various means for this. Note, however, that some database authorities warn against—at least some variants of—auto-generated keys; this is a classic database discourse.
| Standard |
The standard specifies a column attribute of: GENERATED ... AS IDENTITY (non-core feature ID T174+T175). When creating a table, an IDENTITY clause may be declared for certain types of columns (INTEGER being one):
or
The column with the IDENTITY attribute will be given values in increasing order, possibly with 'holes' (...,3,4,7,...). A base table may at most contain one column with the IDENTITY attribute. NOT NULL is implied for an IDENTITY column. Normally, a column declared with IDENTITY will also be declared PRIMARY KEY, but it's not implied. The examples differ in their 'ALWAYS' vs. 'BY DEFAULT' clauses:
The standard specifies several extended options which may be declared for a generated IDENTITY column. |
| PostgreSQL |
PostgreSQL doesn't support the standard's IDENTITY attribute.
By default, PostgreSQL 'secretly' assigns an Object ID to every row in a base table and that value may be cast to an integer and used as identity, but it's very non-standard, and therefore not a good idea as an identity source in most situations. PostgreSQL's best offerering for a column with auto-generated values is to declare a column of 'type' SERIAL:
'SERIAL' is a short-hand for creating a sequence and using that sequence to create unique integers for a column. If the table is dropped, PostgreSQL seems to remember to drop the sequence which was created as a side-effect of using the SERIAL type. As a user may manually
insert or update a value in a column created as SERIAL, this comes
closest to the standard's
Documentation: OIDs and the SERIAL types. |
| DB2 |
Follows standard, albeit with some
restrictions on how identity columns may (not)
be added to an existing table, etc.
Documentation: CREATE TABLE syntax and description of identity columns. |
| MSSQL |
MSSQL offers IDENTITY as a column property, but with a different
syntax (not as intuitive and with less options) than the standard's specification.
An example of creating a table with an IDENTITY column:
With MSSQL's IDENTITY attribute, the user cannot manually
insert/change the value, unless the user has first run I.e., MSSQL's IDENTITY type is closest to the standard's
Documentation: The IDENTITY property and SET IDENTITY_INSERT. |
| MySQL |
MySQL doesn't support the standard's IDENTITY attribute.
As an alternative, an integer column may be assigned the
non-standard
Columns with the AUTO_INCREMENT attribute will—under certain conditions—automatically be assigned a value of <largest value in column>+1. A table can have at most one column with the AUTO_INCREMENT attribute; that column must be indexed (it doesn't have to be a primary key, as in the example SQL above) and cannot have a DEFAULT value attribute. It's probably
not too far fetched to think of MySQL's AUTO_INCREMENT feature as this equivalence: The nice thing about this approach is that the automatic
value insertion should never fail, even though some of the column's
values might have been manually set—i.e. the combined advantages
of the standard's The drawback is that it might result in more house-keeping: The system may need extra table locks when performing row updates/insertions to protect against ghost updates in concurrent transactions—thus slowing down the system in case of many concurrent updates/insertions. |
| Oracle |
Oracle doesn't support the standard's IDENTITY attribute.
If you want an auto-incrementing column in Oracle, then create a sequence and use that sequence in a trigger associated to the table. Example: For the table mytable, you want the mytable_id column to be of integer type, with an auto-incrementing values:
This will create an auto-incrementing column resembling the
Documentation: |
Note: IBM has a page comparing IDENTITY columns and sequences.
The following are not necessarily SQL operations, but rather a description of how different operations are performed in the command line interface provided by each product.
The shape of the command line interfaces in the commercial products is depressing. Vendors, please do something about it: Not all database developers like to use slow GUIs for technical stuff. And sometimes, DBMS work is performed over slow Internet lines which makes a decent command line interface vital.
Fortunately, a tool like HenPlus exists. It can be a pain to install, but once working, it's nice to work with.
| Standard | Not defined. |
| PostgreSQL |
Run:psqlwhich should be in the PATH in any sensible installation. PostgreSQL's command line interface is very user friendly. It has command history (press arrow-up for previous commands) and a faily well-working command completion feature. |
| DB2 |
Run:db2 -t
The The 'utility' doesn't seem to have anything resembling useful
command history or command completion. Fortunately, queries may be
sent to the DB2 also has 'utilities' called |
| MSSQL |
The command line interface is started by runningosql
An alternative to osql—apart from HenPlus, mentioned above—is SQSH which should work on any modern open source operating system. |
| MySQL |
Run:mysql
If you need help on the optional command line options, see the man page. On platforms like Linux and FreeBSD (which have decent readline-capabilities), MySQL's command line interface is simply great; not much else to say. MySQL's command line interface is said to be rather poor on Windows, though. |
| Oracle |
Run:sqlplus
|
| Standard |
Part 11 of the SQL standard specifies the INFORMATION_SCHEMA schema which must
be part of all database catalogues. The schema may be used like this:
or (often more relevant):
See a warning about potential case sensitivity problems below. |
| PostgreSQL |
Follows the standard, except for
some gotchas mentioned below; also, PostgreSQL's INFORMATION_SCHEMA views don't contain all SQL:2003's columns (an example: PostgreSQL's INFORMATION_SCHEMA.COLUMNS view does not contain the IS_IDENTITY column).
In command-line context, it's easier to use the following non-SQL command
instead of querying the INFORMATION_SCHEMA: Documentation: The |
| DB2 |
Doesn't provide the standard INFORMATION_SCHEMA. Instead, DB2 offers
the SYSCAT
schema (catalog) which is somewhat compatible.
Offers what is probably a shorthand to some system catalog query: |
| MSSQL | Follows that standard, except that MSSQL's INFORMATION_SCHEMA doesn't have all SQL:2003's columns (an example: MSSQL's INFORMATION_SCHEMA.COLUMNS view does not contain the IS_IDENTITY column). Sometimes, the SP_TABLES system stored procedure is
easier to use.
|
| MySQL |
Doesn't provide the standard INFORMATION_SCHEMA.
Use: |
| Oracle |
Doesn't provide the standard INFORMATION_SCHEMA. Provides a data dictionary system instead.
The quickest way to get a usable list of 'normal' tables: |
Note that there may be case sensitivity issues involved when using meta-data
views like those in the INFORMATION_SCHEMA. Generally,
the standard states that the name of an identifier (such as table names) are
implicitly converted to uppercase, unless double-quotes are used when referring
to the identifier. The same goes for identiers used inqueries: A query like
SELECT foo FROM tablename is implicitly converted to
SELECT FOO FROM TABLENAME.
If you create your table as
CREATE TABLE testtab (id INTEGER PRIMARY KEY)
then a query like
SELECT * FROM testtab
should work fine, and
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME='TESTTAB'
should work, while the following query will probably fail:
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME='testtab'
Warning: PostgreSQL's case-conversion rules for unquoted identifiers (such as table names) are non-standard: PostgreSQL converts the identifiers to lower case, instead of converting to upper case. This means that you may try altering the case of identifier names used for queries in the INFORMATION_SCHEMA if you experience unexpected, empty metadata queries.
Note also that due to PostgreSQL's handling of constraint names, the INFORMATION_SCHEMA cannot safely be used to deduce referential constraints; for this, you have to use PostgreSQL's pg_catalog system-schema.
| Standard |
Part 11 of the SQL standard specifies the INFORMATION_SCHEMA schema which must
be part of all database catalogues. The schema may be used like this:
—or like this (more verbose):
To get information about constraints, involved columns and (possibly)
referenced columns, a query like this may be used: If you don't care about potential namespace conflicts, you may leave out the lines commented with " See also: Warning about potential case sensitivity problems above. |
| PostgreSQL |
Follows the standard, except for
some gotchas mentioned above.
In command-line context it's easier to use this non-SQL command: |
| DB2 |
Doesn't provide the standard INFORMATION_SCHEMA.
To obtain (very) basic information about a table: To get information about constraints, including involved/referred columns,
a query like the following may be used, although the Documentation:
|
| MSSQL | Follows the standard. Sometimes, the SP_COLUMNS tablename
system stored procedure is easier to use,
perhaps in concert with SP_PKEYS tablename,
SP_FKEYS tablename, and/or
SP_HELPCONSTRAINT tablename.
|
| MySQL |
Doesn't provide the standard INFORMATION_SCHEMA.
Use: |
| Oracle |
Doesn't provide the standard INFORMATION_SCHEMA. Offers data dictionary views instead.
To get (very) basic information: To get information on constraints, including foreign (referred) table/column
information, a query like this may be used (adjust tablename
in one of the last lines): To get information
on indexes on a table, a query like this may be
used (adjust tablename in one of the last lines): Documentation:
|
In most DBMSes, it's possible to enable automatic statistics gathering, but sometimes, it's nice to be able to manually tell the DBMS to gather statistics for a table (or a number of tables).
| Standard | Not standardized. |
| PostgreSQL | ANALYZE tablename
If the tablename parameter is left out, then statistics are gathered for all tables in the current database. |
| DB2 | RUNSTATS ON TABLE schema-nametable-name AND INDEXES ALL
The |
| MSSQL | First, you have to add statistics to the table:
CREATE STATISTICS stats_name
The statistics may then be updated when needed:
Having to explicitly mention tables and columns can be tedious, and in many cases, the Documentation: CREATE STATISTICS, UPDATE STATISTICS, sp_createstats, sp_updatestats |
| MySQL | ANALYZE TABLE tablename
|
| Oracle | Oracle offers to estimate (quick) or compute (thorough)
statistics for a database object. The quick way to do this is to use the deprecated
ANALYZE command which can be used in various ways, e.g.
—Or: If you want to stay way from deprecated features (although I doubt that Oracle will remove ANALYZE...STATISTICS... any time soon), you need to use the DBMS_STATS package. |
| Standard | Not standardized. |
| PostgreSQL | EXPLAIN <query>
|
| DB2 | The easiest way to get a query explanation is to save the query in a file (without a terminating semicolon), and then run a special command-line utility:db2expln -database databasename -stmtfile query.sql -terminalIn the above example, the query has been saved to a file called "query.sql". In some versions of DB2, you need to use the If you prefer to get the explanation through SQL:
|
| MSSQL | MSSQL can be put in a query explanation mode where queries
are not actually executed, but a query explanation is returned instead:SET SHOWPLAN_TEXT ON
The query explanation mode is turned off by running |
| MySQL | EXPLAIN <query>
|
| Oracle | After having set up a plan table, runningDELETE FROM plan_table WHERE statement_id='explanationX';will place an explanation into your PLAN_TABLE. Substitute explanationX with a suitable name for the explanation (and make sure you don't delete other users' explanation plans in the DELETE-line above). The plan explanation may now be viewed by a query like this: A bit of documentation reading can probably not be avoided: |
| Standard | Not standardized. |
| PostgreSQL | \timing
|
| DB2 | Run the query in the "db2batch" command line processor; db2batch prints the elapsed time of each query. |
| MSSQL | SET STATISTICS TIME ON
|
| MySQL | MySQL's command line interface prints query times by default. |
| Oracle | SET TIMING ON
|
Some DBMSes let you perform a query like this:
SELECT 1+1
answering
2
With other DBMSes, you need to insert a dummy-table expression to obtain the same result:
SELECT 1+1 FROM dummy-table
| Standard | On my TODO. |
| PostgreSQL | No need for dummy-table. |
| DB2 | Dummy-table: SYSIBM.SYSDUMMY1.
In addition, the |
| MSSQL | No need for dummy-table. |
| MySQL | No need for dummy-table. |
| Oracle | Dummy-table: DUAL. |
| Standard | SELECT CHARACTER_VALUE
|
| PostgreSQL | Follows the standard. An alternative, non-standard function may be used:
SELECT VERSION()
|
| DB2 | Run the special db2level program.
|
| MSSQL | MSSQL's implementation of the IMPLEMENTATION_SCHEMA doesn't seem to include the SQL_IMPLEMENTATION_INFO view. In stead, you may useSELECT SERVERPROPERTY('ProductVersion')
(just the version), or SELECT @@VERSION
(verbose, harder to parse). |
| MySQL | SELECT VERSION()
|
| Oracle | SELECT banner FROM v$version
|
The following people have provided comments, suggestions and/or fixes, resulting in changes on this page:
(In chronological order.)