PraveenVeli
New Contributor III

Max iterations (100) reached for batch Resolution, please set 'spark.sql.analyzer.maxIterations' to a larger value

Hi,

I'm encountering the error 'Max iterations (100) reached for batch Resolution, please set 'spark.sql.analyzer.maxIterations' to a larger value.' while executing a Spark SQL script from a notebook. The script is not complex: it queries 1K records from a delta table in Lakehouse A in workspace A, compares them with a delta table in Lakehouse B in workspace B, and writes the differences to the delta table in Lakehouse B.

[Screenshot attachment: PraveenVeli_0-1738255423524.png]

1 ACCEPTED SOLUTION
nilendraFabric
Honored Contributor

Hello @PraveenVeli 


In Spark SQL's context, "iterations" refers to the number of passes the query analyzer makes through the logical query plan to resolve references, infer types, and apply optimizations.
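As a quick check (a minimal sketch; spark is the notebook's pre-created SparkSession, and 100 is the default when the setting is not configured):

# Read the current analyzer iteration limit; it falls back to the default of 100,
# which matches the "Max iterations (100)" in the error message.
print(spark.conf.get("spark.sql.analyzer.maxIterations", "100"))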

 

 

Why This Applies to Your Fabric Scenario

1. Workspace Boundary Resolution
Fabric treats Lakehouses in different workspaces as separate catalogs, forcing Spark to:
• Verify table existence in both environments
• Reconcile schemas across workspaces
• Handle potential credential handoffs (see the read sketch just below)
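For illustration, a minimal sketch of reading across the workspace boundary. The workspace, lakehouse, and table names are placeholders, the path assumes the standard OneLake ABFS layout, and Lakehouse B is assumed to be attached to the notebook:

# Hypothetical names throughout. OneLake exposes each workspace as an ABFS container,
# so a Delta table in another workspace can be read directly by path instead of
# through the local catalog.
source_path = "abfss://WorkspaceA@onelake.dfs.fabric.microsoft.com/LakehouseA.Lakehouse/Tables/source_table"
df_a = spark.read.format("delta").load(source_path)

# The table in the notebook's own workspace (Lakehouse B) can be read from the catalog.
df_b = spark.read.table("LakehouseB.target_table")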

 

Even for a 1K-row comparison, the query implicitly creates nested plans for (sketched below):
1) Data fetch from Lakehouse A
2) Data fetch from Lakehouse B
3) Join operation
4) Delta transaction log checks
5) Insert operation
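Continuing the read sketch above, those steps might look roughly like this in PySpark (the key column "id" and the table names are assumptions; your script may be pure Spark SQL):

# Steps 1-2: df_a and df_b are the two reads from the sketch above.
# Step 3: a left anti join keeps rows from A that have no match in B, i.e. the differences.
diff_df = df_a.join(df_b, on="id", how="left_anti")

# Inspect the logical and physical plans the analyzer has to resolve.
diff_df.explain(mode="extended")

# Steps 4-5: the Delta writer consults the transaction log and appends the
# differences into the target table in Lakehouse B.
diff_df.write.format("delta").mode("append").saveAsTable("LakehouseB.target_table")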

 

Try raising the analyzer iteration limit:

spark.conf.set("spark.sql.analyzer.maxIterations", "200")

 

Then run df.explain(mode="extended") on your comparison DataFrame.

Look for Cartesian products or complex subquery patterns in the plan output.
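For instance (a toy illustration only, reusing the df_a/df_b placeholders from above), a join with no join condition shows up in the physical plan as a CartesianProduct or BroadcastNestedLoopJoin node:

# A cross join forces a Cartesian product; explain() makes it visible in the plan.
df_a.crossJoin(df_b).explain(mode="extended")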

Also try optimizing the Delta tables involved:

OPTIMIZE delta_table ZORDER BY (primary_key);
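From a notebook cell this can be run through spark.sql; delta_table and primary_key above (and the names below) are placeholders for your actual table and key column:

# Compact small files and cluster rows by the join key; placeholder names.
spark.sql("OPTIMIZE LakehouseB.target_table ZORDER BY (id)")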

 

Please give it a try and let me know if this works.

 

 

 

 


4 REPLIES

v-nmadadi-msft
Honored Contributor II

Hi @PraveenVeli 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems solve them faster.

Thank you.

nilendraFabric
Honored Contributor

@PraveenVeli, if this answered your question, please accept this solution.

PraveenVeli
New Contributor III

Thank you @nilendraFabric, the information you provided helps greatly. In my case (along the same lines as what you said), the issue was that I had provided incorrect column names in my CTE, which caused this error. I'm a bit surprised it doesn't generate the relevant error message. I have a CTE (which retrieves data from Lakehouse A in Workspace A) and then use it within the MERGE statement to integrate data into Lakehouse B in Workspace B. If a column name does not match the destination (Lakehouse B), it throws an actual error indicating that it can't find the field name; a wrong column name inside the CTE, however, surfaces as the max-iterations error instead. I'm running my notebook in Workspace B.
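For anyone hitting the same thing, a minimal sketch of the pattern described (all table, column, and key names are hypothetical, Lakehouse A is assumed to be reachable from the notebook, and "amnt" is a deliberately misspelled source column):

# Hypothetical reproduction of the mistake: the CTE selects a column that does not
# exist in the source table ("amnt" instead of "amount"), which in this case showed
# up as the max-iterations analyzer error rather than a clear "column not found".
spark.sql("""
    WITH source_cte AS (
        SELECT id, amnt                      -- misspelled column in the CTE
        FROM LakehouseA.source_table
    )
    MERGE INTO LakehouseB.target_table AS t
    USING source_cte AS s
        ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET t.amount = s.amnt
    WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amnt)
""")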
