RenatoDM
New Contributor

Spark Data Lineage

I see that the OpenLineage libraries are included by default as a built-in library in Spark. When a notebook reads from and writes to OneLake, does it emit lineage events automatically? According to Copilot it does, and lineage visualization in Purview is optional. Where are those events stored? I see a SparkLineage folder in OneLake, but it is always empty. I have not been able to find clear documentation on this topic. Any comments are appreciated. Thank you.

1 ACCEPTED SOLUTION
nilendraFabric
Honored Contributor

Hi @RenatoDM 

 

The `SparkLineage` folder in OneLake is not populated by default. Its presence suggests compatibility with OpenLineage standards, but explicit configuration is required.
• To emit granular OpenLineage events (e.g., column-level lineage), you must:
    • Implement a SparkListener to intercept Spark execution plans.
    • Configure diagnostic emitters to route logs to Azure Storage or Log Analytics.
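As a rough illustration of the two steps above, here is a minimal Python sketch that only builds the Spark property dictionaries; it does not touch a cluster. The `spark.synapse.diagnostic.*` names follow the pattern in the Synapse diagnostic emitter documentation linked below, and the `spark.openlineage.*` names and listener class come from the open-source OpenLineage Spark integration. Whether a given Fabric runtime honors each of them is an assumption to verify. The emitter name `MyStorageBlob`, the URLs, the namespace, and both helper functions are hypothetical placeholders.

```python
# Sketch only: assembles Spark properties as plain dicts so they can be
# reviewed before being pasted into an environment or session config.
# All concrete values (emitter name, URLs, secret, namespace) are placeholders.

DEFAULT_CATEGORIES = ("DriverLog", "ExecutorLog", "EventLog", "Metrics")


def diagnostic_emitter_conf(name, storage_uri, secret, categories=DEFAULT_CATEGORIES):
    """Properties for one Azure Storage diagnostic emitter (hypothetical helper)."""
    prefix = f"spark.synapse.diagnostic.emitter.{name}"
    return {
        "spark.synapse.diagnostic.emitters": name,
        f"{prefix}.type": "AzureStorage",
        f"{prefix}.categories": ",".join(categories),
        f"{prefix}.uri": storage_uri,
        f"{prefix}.auth": "AccessKey",
        f"{prefix}.secret": secret,
    }


def openlineage_listener_conf(transport_url, namespace):
    """Properties that register the OpenLineage listener (hypothetical helper).

    Assumes the listener class shipped with the runtime matches the
    open-source one and that an HTTP transport is reachable.
    """
    return {
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
        "spark.openlineage.transport.type": "http",
        "spark.openlineage.transport.url": transport_url,
        "spark.openlineage.namespace": namespace,
    }


if __name__ == "__main__":
    conf = {}
    conf.update(
        diagnostic_emitter_conf(
            "MyStorageBlob",
            "https://<my-blob-storage>.blob.core.windows.net/<container>/<folder>",
            "<storage-access-key>",
        )
    )
    conf.update(openlineage_listener_conf("http://<lineage-backend>:5000", "<namespace>"))
    # Print in "key value" form for review.
    for key in sorted(conf):
        print(key, conf[key])
```

In practice, keep the storage key in a secret store rather than inline, and check the property names against the documentation for your runtime before relying on them.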

 

 

Native Purview integration captures basic item-level lineage (e.g., notebook → Lakehouse table) but doesn't populate `SparkLineage`.

 

https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/azure-synapse-diagnostic-emitters-az...

 

 

 

 

