smpa01
Esteemed Contributor

clusterBy does not work in the DataFrame API?

The following works in Databricks but not in Fabric. It does work through the DeltaTable API, though. But why does it not work through the DataFrame API?

# write load at t - create the table with clustering enabled from the start
(df.write.format("delta")
         .mode("overwrite")
         .clusterBy("id")           # enable clustering at table creation
         .saveAsTable(table_name)   # fully qualified name for consistency
)

# AttributeError: 'DataFrameWriter' object has no attribute 'clusterBy'

DataFrameWriter Doc
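For reference, the DeltaTable API route that does work looks roughly like this (a sketch only, assuming the Delta Lake 3.x builder API available in the Fabric Spark runtime; it reuses df and table_name from above):

from delta.tables import DeltaTable

# create the table with the clustering column declared up front
(DeltaTable.createOrReplace(spark)
    .tableName(table_name)      # same fully qualified name as above
    .addColumns(df.schema)      # reuse the DataFrame's schema
    .clusterBy("id")            # clustering defined at creation time
    .execute())

# then load the data with a plain writer - clustering is a table property
df.write.format("delta").mode("append").saveAsTable(table_name)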

1 ACCEPTED SOLUTION
v-prasare
Honored Contributor II

Hi @smpa01 ,

The .clusterBy() method on DataFrameWriter is not supported because Fabric uses a customized Spark runtime that limits certain APIs to keep its managed environment simple and compatible. Unlike Databricks, which exposes extended Delta Lake features directly through the PySpark DataFrameWriter, Fabric restricts clustering capabilities to SQL DDL and the DeltaTable API.
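For example, the SQL DDL route looks roughly like this (a minimal sketch; the table name and schema below are placeholders, and the load is a plain append because clustering is a property of the table rather than of the writer):

# placeholder table name and schema - adjust to your Lakehouse
spark.sql("""
    CREATE TABLE IF NOT EXISTS my_clustered_table (id BIGINT, payload STRING)
    USING DELTA
    CLUSTER BY (id)
""")

# normal DataFrame append; no clusterBy() needed on the writer
df.write.format("delta").mode("append").saveAsTable("my_clustered_table")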

Thanks,

Prashanth Are

MS Fabric community support

smpa01
Esteemed Contributor

@v-prasare without clusterBy in the DataFrame writer, I am guessing clustered files can't be written when one only wants to write the raw files to a path with the intention of creating an external table on top of them.

# what I would want to do, but DataFrameWriter has no clusterBy
df.write\
    .format("delta")\
    .mode("append")\
    .clusterBy("id")\
    .save(file_path)
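One way that might still cover this case (a sketch only, not confirmed in this thread; it assumes the same Delta Lake 3.x builder API and reuses df, file_path and the id column from the examples above) is to define the clustered table at the external location with the DeltaTable builder first, and then append with a plain writer:

from delta.tables import DeltaTable

# sketch only - file_path, df and the id clustering column are placeholders
(DeltaTable.createIfNotExists(spark)
    .location(file_path)        # external location instead of a managed table
    .addColumns(df.schema)
    .clusterBy("id")
    .execute())

# plain append to the same path; the clustering definition lives on the table
df.write.format("delta").mode("append").save(file_path)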

 

