Name is required.
Email address is required.
Invalid email address
Answer is required.
Exceeding max length of 5KB

Transpose rows component

Hi,

I saw the tutorial for pivoting tables that uses Transpose rows followed by Split.

The issue with this approach is that it is highly inefficient and does not scale well. LIST_AGG followed by SPLIT_PART is a computationally intensive operation which can become quite slow when concatenating / parsing long text values.

Instead, would it be possible to implement a pivot approach using GROUP BY with MAX(CASE...)

For instance for the following data:

key col val
1 A 1A
1 B 1B
2 A 2A
2 B 2B

You can use:

SELECT
key,
MAX(CASE WHEN col = 'A' THEN val END) AS val_a,
MAX(CASE WHEN col = 'B' THEN val END) AS val_b
GROUP BY 1

This scale super well and does the pivot + split steps in 1 go resulting:

key . val_a . val_b
1 . 1A . 1B
2 . 2A . 2B

2 Community Answers

Matillion Agent  

Ian Funnell —

Hi Dobo,

Thanks for this interesting suggestion, which we’ve been considering.

I agree that the SQL you’ve suggested is orders of magnitude faster than the LISTAGG/SPLIT_PART approach (and also similarly pushes down well on Spectrum).
But the two things actually achieve slightly different results, and so aren’t strictly comparable. LISTAGG does not deduplicate: it simply concatenates all the matching values into a long string, which then needs to be split (with careful ordering!) to create new columns. You would therefore need to do some pre-filtering and pre-aggregation anyway in order to use LISTAGG (in your example, filter WHERE col IN (‘A’, ‘B’) and GROUP BY col).

Currently we don’t have a transformation component which does exactly what you’re trying to do, so you’ll need to stick to using either the Calculator or the SQL components. But I agree it would be a nice addition to our functionality, and have added it as a feature request – internal reference EMD-4454.

Best regards,
Ian


Dobo Radichkov —

Awesome, thanks!

Post Your Community Answer

To add an answer please login