Wednesday, 29 January 2020

How does the high-level architecture of Teradata compare to Amazon Redshift?

No, but let's be honest: How often do you use a NUSI or USI in Teradata?

And when you do, isn't it always hard to design the index so that it is actually used? In Teradata, the statistics must be correct, the selectivity must be right, etc.
Amazon Redshift uses an ingenious method for performance tuning:

For each data block, the value range (minimum and maximum) is saved in metadata. This allows Amazon Redshift to restrict the search to the data blocks whose value range matches the WHERE condition.
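To make the idea concrete, here is a minimal sketch of block pruning by value range. It is not Redshift's actual implementation; the function names (`build_zone_map`, `blocks_to_scan`, `range_query`) and the tiny sample blocks are made up for illustration only.

```python
# Illustrative sketch only: all names and the sample data are hypothetical.

def build_zone_map(blocks):
    """Return one (min, max) pair per block, i.e. the per-block metadata."""
    return [(min(block), max(block)) for block in blocks]


def blocks_to_scan(zone_map, low, high):
    """Indexes of the blocks whose value range overlaps [low, high]."""
    return [
        i for i, (block_min, block_max) in enumerate(zone_map)
        if block_max >= low and block_min <= high
    ]


def range_query(blocks, zone_map, low, high):
    """Read only the candidate blocks and filter their rows."""
    hits = []
    for i in blocks_to_scan(zone_map, low, high):
        hits.extend(v for v in blocks[i] if low <= v <= high)
    return hits


# Three small "blocks" of one column.
blocks = [[1, 4, 7], [10, 12, 19], [20, 25, 31]]
zone_map = build_zone_map(blocks)

# WHERE col BETWEEN 11 AND 21: block 0 is skipped without being read.
candidates = blocks_to_scan(zone_map, 11, 21)
result = range_query(blocks, zone_map, 11, 21)
```

The pruning works best when the rows are stored in sorted order, because then the value ranges of the blocks barely overlap and most blocks can be skipped.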
What do Amazon Redshift data blocks look like?
As in Teradata, the size of the data blocks is dynamic.

In Amazon Redshift, a data block grows up to a size of one megabyte, then the data block is divided into two blocks of equal size.
How do joins work in Amazon Redshift?
In this respect, Redshift is not very different from Teradata: the rows to be joined must be on the same slice.
If both tables have the same distkey and are joined on it, then the matching rows of both tables are already on the same slice.
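The effect of a common distkey can be sketched in a few lines. This is only a simulation under simple assumptions: the number of slices, the CRC32-based hash, and the helper names `slice_for`, `distribute` and `colocated_join` are hypothetical and do not reflect Redshift's internal hashing.

```python
import zlib

NUM_SLICES = 4  # hypothetical slice count for the sketch


def slice_for(key):
    """Map a distkey value to a slice with a deterministic hash."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_SLICES


def distribute(rows, distkey):
    """Place every row on the slice determined by its distkey value."""
    slices = [[] for _ in range(NUM_SLICES)]
    for row in rows:
        slices[slice_for(row[distkey])].append(row)
    return slices


def colocated_join(left_slices, right_slices, key):
    """Join slice by slice; no row has to leave its slice."""
    result = []
    for left_rows, right_rows in zip(left_slices, right_slices):
        lookup = {}
        for r in right_rows:
            lookup.setdefault(r[key], []).append(r)
        for l in left_rows:
            for r in lookup.get(l[key], []):
                result.append({**r, **l})
    return result


customers = [{"customer_id": i, "name": f"c{i}"} for i in range(1, 6)]
orders = [{"order_id": 100 + i, "customer_id": (i % 5) + 1} for i in range(10)]

joined = colocated_join(
    distribute(orders, "customer_id"),
    distribute(customers, "customer_id"),
    "customer_id",
)
```

Because both tables are hashed on `customer_id`, an order and its customer always end up on the same slice, so each slice can join its own rows independently.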
But how can you prevent data from being copied during the join when the distkeys differ? Amazon Redshift allows you to place a full copy of a table on every node in advance (distribution style ALL).
While Teradata may choose this strategy during the join to bring the rows onto a common AMP, this can be pre-defined in Redshift.
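The pre-defined copy can be simulated as follows. Again, this is a simplified sketch and not Redshift code: "unit" stands for whatever location holds a share of the large table, and `distribute_round_robin`, `replicate` and `local_join` are hypothetical helper names.

```python
NUM_UNITS = 3  # hypothetical number of storage locations


def distribute_round_robin(rows):
    """Spread the rows of the large table evenly, ignoring any key."""
    units = [[] for _ in range(NUM_UNITS)]
    for i, row in enumerate(rows):
        units[i % NUM_UNITS].append(row)
    return units


def replicate(rows):
    """Store a full copy of the small table at every location, in advance."""
    return [list(rows) for _ in range(NUM_UNITS)]


def local_join(fact_units, dim_copies, key):
    """Each location joins its share of the large table with its local copy."""
    result = []
    for fact_rows, dim_rows in zip(fact_units, dim_copies):
        lookup = {d[key]: d for d in dim_rows}
        for f in fact_rows:
            if f[key] in lookup:
                result.append({**lookup[f[key]], **f})
    return result


countries = [
    {"country_code": "DE", "country": "Germany"},
    {"country_code": "AT", "country": "Austria"},
]
sales = [
    {"sale_id": 1, "country_code": "DE"},
    {"sale_id": 2, "country_code": "AT"},
    {"sale_id": 3, "country_code": "DE"},
    {"sale_id": 4, "country_code": "AT"},
]

joined = local_join(distribute_round_robin(sales), replicate(countries), "country_code")
```

The trade-off is storage and load time: the small table is stored several times and every change has to be applied to every copy, which is why this approach suits small, rarely changing tables.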
