adamlob
New Contributor II

Hash Function for Row Compare

Hi,

 

I'm working on a Data Pipeline that loads data into a Dataverse table. I do a row compare to detect changes between loads, so I am only loading rows that have changed.

 

Is there any way to hash the concatenated column values of each row? At the moment it seems I can only build the plain-text concatenation and then convert it to Binary. Hashing would help save space.

1 ACCEPTED SOLUTION
AntoineW
Valued Contributor

Hi @adamlob,

 

In Dataflow Gen2 / Fabric Data Pipeline (Data Factory), there’s no native “Hash” transformation yet.

You can use a Notebook to do that:

 

from pyspark.sql.functions import sha2, concat_ws

# Join all column values with "|" and store the SHA-256 hex digest as "row_hash"
df_hashed = df.withColumn(
    "row_hash",
    sha2(concat_ws("|", *df.columns), 256)
)

 

Then save it back to your Lakehouse table and use that hash for change-detection.
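To make the change-detection step concrete, here is a minimal plain-Python sketch of the comparison logic, using `hashlib` rather than Spark so it runs anywhere. The helper names (`row_hash`, `rows_to_load`), the `id` key and the sample rows are all hypothetical and only illustrate the idea: keep the hash from the previous load per key, and load a row only if its key is new or its hash differs.

```python
import hashlib

def row_hash(row, columns, sep="|"):
    """SHA-256 hex digest of the selected column values joined by `sep`."""
    # None values are skipped here, mirroring how concat_ws treats nulls.
    parts = [str(row[c]) for c in columns if row[c] is not None]
    return hashlib.sha256(sep.join(parts).encode("utf-8")).hexdigest()

def rows_to_load(incoming_rows, stored_hashes, key, columns):
    """Return incoming rows that are new or whose hash differs from the stored one."""
    changed = []
    for row in incoming_rows:
        h = row_hash(row, columns)
        if stored_hashes.get(row[key]) != h:
            changed.append({**row, "row_hash": h})
    return changed

# Hypothetical data: hashes saved by the previous load, keyed by "id".
columns = ["id", "name", "city"]
previous = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Rome"},
]
stored_hashes = {r["id"]: row_hash(r, columns) for r in previous}

incoming = [
    {"id": 1, "name": "Ann", "city": "Oslo"},   # unchanged
    {"id": 2, "name": "Bob", "city": "Paris"},  # changed
    {"id": 3, "name": "Cy", "city": "Lima"},    # new
]

to_load = rows_to_load(incoming, stored_hashes, "id", columns)
print([r["id"] for r in to_load])  # [2, 3]
```

In a notebook, the same comparison would typically be a join between the new DataFrame and the stored table on the key column, keeping rows where the stored hash is missing or different.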

✅ Benefits:

  • Very fast and scalable,

  • Produces fixed-length SHA-256 hex strings (64 characters),

  • Easy to use as a comparison key.
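One caveat when using the hash as a comparison key: `concat_ws` skips null values, so two rows whose values differ only in which column is null can produce the same concatenated string and therefore the same hash. The plain-Python sketch below reproduces that behaviour with `hashlib`; the function names and the `<NULL>` placeholder are assumptions chosen for illustration, not part of any Fabric or Spark API.

```python
import hashlib

def naive_hash(values, sep="|"):
    """Hash the values joined by `sep`, skipping None (as concat_ws skips nulls)."""
    joined = sep.join(str(v) for v in values if v is not None)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def null_safe_hash(values, sep="|", null_marker="<NULL>"):
    """Hash the values joined by `sep`, replacing None with a visible placeholder."""
    joined = sep.join(null_marker if v is None else str(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

# Hypothetical rows that differ only in which column is empty.
a = ("x", None, "y")
b = ("x", "y", None)

print(naive_hash(a) == naive_hash(b))          # True: both hash the string "x|y"
print(null_safe_hash(a) == null_safe_hash(b))  # False: the placeholder keeps positions distinct
print(len(null_safe_hash(a)))                  # 64
```

In PySpark, one way to get the same effect would be to wrap each column in `coalesce(col(c).cast("string"), lit("<NULL>"))` before passing it to `concat_ws`, provided the placeholder cannot occur as a real value in the data.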

Doc:

https://spark.apache.org/docs/latest/api/sql/index.html#sha2

 

 

Hope this helps!

Best regards,

Antoine
