Redshift is the large-scale data warehouse service for use with business intelligence tools, developed by Amazon based on PostgreSQL.

PostgreSQL is the most advanced open source database management system, developed and maintained by PostgreSQL Global Development Group.

Even though Amazon Redshift was built based on PostgreSQL 8.4, the two databases have a number of very important differences that you must be aware of as you design and develop your data warehouse applications. This post is going to point out some of them.

Business model

  • Redshift: Commercial
  • PostgreSQL: Free and open source

Server operating systems

Redshift is cloud-hosted while PostgreSQL is available on various platforms: FreeBSD, HP-UX, Linux, NetBSD, OpenBSD, OS X, Solaris, Unix, Windows

Connection Methods

  • Redshift uses standard open-source PostgreSQL JDBC and ODBC drivers.
  • PostgreSQL uses native C library, streaming API for large objects, ADO.NET, JDBC, ODBC

In-memory capabilities

  • Redshift: Yes
  • PostgreSQL: No

Data storage

Perhaps the biggest difference is that Redshift is column-oriented storage whereas Postgres is row-oriented. Or in other words, for those queries involving many rows and few columns, Redshift performs significantly faster, and vice versa. Columnar storage not only improves query speed, but save on I/O operations, and can reduce storage size required for the same data sets.

Therefore, Amazon Redshift is better suited for online analytic processing (OLAP) and business intelligence (BI) applications, which require complex queries against large datasets.

Supported data types

  • Redshift supports a great list a data types: SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE PRECISION, BOOLEAN, CHAR, VARCHAR, DATE, TIMESTAMP, TIMESTAMPTZ
  • And PostgreSQL supports almost every data type that you can think of.

Syntaxes

There are some syntaxes were implemented differently in Redshift:

  • CREATE TABLE
. Redshift doesn’t support tablespaces, table partitioning, inheritance, and certain constraints. The Redshift implementation of CREATE TABLE enables you to define the sort and distribution algorithms for tables to optimize parallel processing.
  • ALTER TABLE
. Redshift doesn’t support ALTER COLUMN actions, and ADD COLUMN supports adding only one column in each ALTER TABLE statement.
  • COPY
. The Redshift COPY command is specialized to enable loading of data from Amazon S3 buckets and Amazon DynamoDB tables and to facilitate automatic compression.
  • INSERT, UPDATE, and DELETE
. Redshift doesn’t support the WITH clause.
  • VACUUM
. The parameters for VACUUM are different between the two databases. For example, the default VACUUM operation in PostgreSQL reclaims space and makes it available for reuse. The default VACUUM operation in Redshift is VACUUM FULL, which reclaims disk space and resorts all rows.

Features

There are some PostgreSQL features that are not implemented in Redshift:

  • Table partitioning (range and list partitioning)
  • Tablespaces
  • Constraints
    • Unique
    • Foreign key
    • Primary key
    • Check constraints
    • Exclusion constraints
  • Database Roles
  • Inheritance
  • Postgres system columns. Amazon Redshift SQL does not implicitly define system columns. However, the PostgreSQL system column names cannot be used as names of user-defined columns.
  • Indexes
  • NULLS clause in Window functions
  • Collations. Amazon Redshift does not support locale-specific or user-defined collation sequences.
  • Value expressions
    • Subscripted expressions
    • Array constructors
    • Row constructors
  • Stored procedures
  • Triggers
  • Management of External Data (SQL/MED)
  • Table functions
  • VALUES list used as constant tables
  • Recursive common table expressions
  • Sequences
  • Full-text search

Functions

Some PostgreSQL functions that are suited to smaller-scale OLTP processing have been omitted in Redshift to improve performance:

  • Access privilege inquiry functions
  • Advisory lock functions
  • Aggregate functions
    • STRING_AGG()
    • ARRAY_AGG()
    • EVERY()
    • XML_AGG()
    • CORR()
    • COVAR_POP()
    • COVAR_SAMP()
    • REGR_AVGX(), REGR_AVGY()
    • REGR_COUNT()
    • REGR_INTERCEPT()
    • REGR_R2()
    • REGR_SLOPE()
    • REGR_SXX(), REGR_SXY(), REGR_SYY()
  • Array functions and operators
  • Backup control functions
  • Comment information functions
  • Database object location functions
  • Database object size functions
  • Date/Time functions and operators
    • CLOCK_TIMESTAMP()
    • JUSTIFY_DAYS(), JUSTIFY_HOURS(), JUSTIFY_INTERVAL()
    • PG_SLEEP()
    • TRANSACTION_TIMESTAMP()
  • ENUM support functions
  • Geometric functions and operators
  • Generic file access functions
  • IS DISTINCT FROM
  • Network address functions and operators
  • Mathematical functions
    • DIV()
    • SETSEED()
    • WIDTH_BUCKET()
  • Set returning functions
    • GENERATE_SERIES()
    • GENERATE_SUBSCRIPTS()
  • Range functions and operators
  • Recovery control functions
  • Recovery information functions
  • ROLLBACK TO SAVEPOINT function
  • Schema visibility inquiry functions
  • Server signaling functions
  • Snapshot synchronization functions
  • Sequence manipulation functions
  • String functions
    • BIT_LENGTH()
    • OVERLAY()
    • CONVERT(), CONVERT_FROM(), CONVERT_TO()
    • ENCODE()
    • FORMAT()
    • QUOTE_NULLABLE()
    • REGEXP_MATCHES()
    • REGEXP_SPLIT_TO_ARRAY()
    • REGEXP_SPLIT_TO_TABLE()
  • System catalog information functions
  • System information functions
    • CURRENT_CATALOG CURRENT_QUERY()
    • INET_CLIENT_ADDR()
    • INET_CLIENT_PORT()
    • INET_SERVER_ADDR() INET_SERVER_PORT()
    • PG_CONF_LOAD_TIME()
    • PG_IS_OTHER_TEMP_SCHEMA()
    • PG_LISTENING_CHANNELS()
    • PG_MY_TEMP_SCHEMA()
    • PG_POSTMASTER_START_TIME()
    • PG_TRIGGER_DEPTH()
  • Text search functions and operators
  • Transaction IDs and snapshots functions
  • Trigger functions
  • XML functions


To summarize

If you are already familiar with PostgreSQL, you’ll be able to pick up Amazon Redshift pretty well. Which database is better to use is just a matter of purpose, what are you using it for. Redshift is optimized for analytical queries, and its commands and syntax reflect that, while PostgreSQL is better suited for transactional data.


Need a good GUI Tool for PostgreSQL or Redshift? Try TablePlus. It’s is a modern, native GUI that allows you to simultaneously manage multiple databases such as MySQL, MariaDB, PostgreSQL, SQLite, Microsoft SQL Server, CockroachDB… in a fast and easy way.


Download TablePlus for Mac.

Not on Mac? Download TablePlus for Windows.

TablePlus GUI Tool in Dark theme