I see the OpenLineage libraries are included by default as a built-in library in Spark. When a notebook reads from and writes to OneLake, does it emit lineage events automatically? According to Copilot it does, and lineage visualization in Purview is optional. Where are those events stored? I see a `SparkLineage` folder in OneLake, but it is always empty. I am not able to find clear documentation on this topic. I appreciate any comments. Thank you.
Hi @RenatoDM
The `SparkLineage` folder in OneLake is not populated by default. Its presence suggests compatibility with OpenLineage standards, but explicit configuration is required.
To emit granular OpenLineage events (e.g., column-level lineage), you must:
• Implement a SparkListener to intercept Spark execution plans.
• Configure diagnostic emitters to route logs to Azure Storage or Log Analytics.

Native Purview integration captures basic item-level lineage (e.g., notebook → Lakehouse table) but doesn't populate `SparkLineage`.
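As a sketch of what that configuration might look like: the standard open-source `openlineage-spark` integration registers `OpenLineageSparkListener` and an HTTP transport via Spark properties. The collector URL and namespace below are placeholders, and whether these exact properties are honored in a Fabric environment (e.g., set at the environment or session level) is an assumption you would need to verify:

```properties
# Register the OpenLineage listener on the Spark session (assumes the
# openlineage-spark JAR is on the classpath; placeholder endpoint below)
spark.extraListeners                io.openlineage.spark.agent.OpenLineageSparkListener
spark.openlineage.transport.type    http
spark.openlineage.transport.url     https://<your-lineage-collector-endpoint>
spark.openlineage.namespace         <your-workspace-name>
```

With a configuration like this, lineage events are pushed to the endpoint you specify rather than written to the `SparkLineage` folder, which is consistent with that folder remaining empty out of the box.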