malimahesh
New Contributor II

Urgent! Issue starting Spark sessions, unable to run notebooks at all

 

Hello everyone,

I’m encountering a strange issue while trying to connect to a Standard or High Concurrency Spark session in our Fabric workspace. Because of this, I’m unable to execute notebooks manually — although pipelines are still running successfully.

Here’s the relevant error message snippet:

 

 
"synapseController": { "id": "xxxxx", "enabled": true, "activeKernelHandler": "sparkLivy", "kernelMetadata": { "kernel": "synapse_pyspark", "language": "python" }, "state": "error", "sessionId": "xxxxx", "applicationId": null, "applicationName": "", "sessionErrors": [ "[TooManyRequestsForCapacity] This spark job can't be run because you have hit a spark compute or API rate limit. To run this spark job, cancel an active Spark job through the Monitoring hub, choose a larger capacity SKU, or try again later. HTTP status code: 430 {Learn more} HTTP status code: 430." ] }
 

Initially, I assumed it was a capacity limit issue — but the Fabric capacity metrics show usage below 20%, and the Monitoring Hub confirms that no Spark sessions or pipelines are actively running.

We even left it idle for two full days, but the issue persists. Moreover, it’s affecting all users in the workspace, not just me.

The workspace has been active for about two months, and this problem only started recently.

2 REPLIES
AntoineW
Contributor III

Hello @malimahesh,

 

Here’s what typically causes this specific behavior:

1. Orphaned Spark sessions still “counting” against the capacity

  • Even though the Monitoring Hub shows no active sessions, Fabric’s backend may still have ghost sessions that didn’t clean up correctly.

  • These orphaned sessions consume Spark concurrency slots, so the controller refuses new sessions.

  • Pipelines can still run because they’re using queued Fabric Jobs, not interactive Spark controllers.

🧠 Clue: The error persists across users and restarts but capacity % is low.


2. Spark API throttling (internal rate limiting)

  • Each Fabric capacity enforces API throttles for Spark session management (number of session start/stop calls per minute).

  • If you’ve had multiple users (or automated retries) launching sessions, the rate limiter can block all new session requests for a few hours.

  • These throttles aren’t visible in capacity metrics (which show CPU/memory CU usage, not API call limits).
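
If anything in the workspace retries session starts automatically (schedulers, CI jobs, custom scripts), an exponential back-off keeps those retries from extending the throttling window. A minimal sketch, assuming a generic REST call; the endpoint and token below are placeholders, not documented Fabric URLs:

# Minimal sketch: back off instead of hammering the session API after a 430.
# SESSION_START_URL and TOKEN are placeholders; substitute whatever call your
# automation actually makes (notebook run API, Livy session start, etc.).
import time
import requests

SESSION_START_URL = "https://<your-session-start-endpoint>"   # placeholder
TOKEN = "<aad-bearer-token>"                                   # placeholder

def start_session_with_backoff(max_attempts: int = 5) -> requests.Response:
    delay = 60  # seconds; throttling windows are typically minutes, not seconds
    for attempt in range(1, max_attempts + 1):
        resp = requests.post(
            SESSION_START_URL,
            headers={"Authorization": f"Bearer {TOKEN}"},
            timeout=120,
        )
        if resp.status_code != 430:          # 430 = TooManyRequestsForCapacity
            resp.raise_for_status()
            return resp
        print(f"Attempt {attempt}: throttled (430); sleeping {delay}s")
        time.sleep(delay)
        delay *= 2                           # exponential back-off
    raise RuntimeError("Still throttled after retries; check for orphaned sessions")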

 

Resolution Steps

1. Restart the Fabric Capacity (Admin action)

This is the most reliable fix.
In the Admin Portal → Capacity Settings → Fabric Capacity → Refresh or Restart.
This forces a reset of Spark controllers and cleans up orphaned sessions.

🕒 After restart, wait ~10 minutes before retrying.
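
If the capacity is an F SKU provisioned in Azure, the pause/resume can also be scripted against the Azure resource instead of clicking through the portal. This is only a sketch: the Microsoft.Fabric/capacities suspend and resume operations exist in Azure Resource Manager, but the api-version string and the two-minute wait below are assumptions to verify, and pausing interrupts anything still running on the capacity.

# Minimal sketch: pause/resume an F-SKU capacity through Azure Resource Manager,
# which forces the same Spark controller reset as the portal buttons.
import time
import requests
from azure.identity import DefaultAzureCredential  # pip install azure-identity

SUB = "<subscription-id>"        # placeholders
RG = "<resource-group>"
CAP = "<capacity-name>"
API = "2023-11-01"               # assumed api-version; check the current reference
BASE = (f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
        f"/providers/Microsoft.Fabric/capacities/{CAP}")

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
headers = {"Authorization": f"Bearer {token}"}

requests.post(f"{BASE}/suspend?api-version={API}", headers=headers).raise_for_status()
time.sleep(120)  # give the capacity a moment to fully pause before resuming
requests.post(f"{BASE}/resume?api-version={API}", headers=headers).raise_for_status()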


2. Check for Ghost Sessions

Go to:

Fabric Home → Monitoring Hub → Spark Jobs
Filter by last 7 days and all statuses.
If you see “Starting” or “Queued” jobs stuck indefinitely — cancel them manually.
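
If you prefer to check outside the UI, the Fabric REST API exposes Livy session listings that can surface sessions the Monitoring hub filters hide. A minimal sketch; the endpoint path and response field names are assumptions based on the Livy Sessions operations in the Fabric REST reference, so verify them before relying on this.

# Minimal sketch: list Spark/Livy sessions for a workspace via the Fabric REST API
# to spot sessions stuck in a non-terminal state.
import requests

WORKSPACE_ID = "<workspace-id>"          # placeholder
TOKEN = "<fabric-api-bearer-token>"      # e.g. from Azure CLI or an AAD app

url = f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/spark/livySessions"
resp = requests.get(url, headers={"Authorization": f"Bearer {TOKEN}"}, timeout=60)
resp.raise_for_status()

for session in resp.json().get("value", []):
    # Field names may differ; inspect the raw JSON. Anything lingering in a
    # non-terminal state is a candidate to cancel from the Monitoring hub.
    print(session.get("livyId"), session.get("state"), session.get("itemName"))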


3. Clear and Reassign the Workspace

If the restart doesn’t help:

  1. Move the affected workspace temporarily to another Fabric capacity (even a Trial or Low SKU).

  2. Wait 5–10 minutes for propagation.

  3. Move it back to the original capacity.

This rebinds the workspace’s Spark controller and resets the job queue association.
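
For admins who would rather script the move, the same three steps can be driven through the Fabric REST API's Assign To Capacity operation. A minimal sketch under that assumption; verify the endpoint and payload against the current API reference before using it.

# Minimal sketch: move the workspace to another capacity, wait, then move it back.
import time
import requests

WORKSPACE_ID = "<workspace-id>"          # placeholders
TEMP_CAPACITY = "<other-capacity-id>"
HOME_CAPACITY = "<original-capacity-id>"
HEADERS = {"Authorization": "Bearer <fabric-api-bearer-token>"}

def assign(capacity_id: str) -> None:
    url = f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/assignToCapacity"
    requests.post(url, headers=HEADERS, json={"capacityId": capacity_id}, timeout=60).raise_for_status()

assign(TEMP_CAPACITY)   # step 1: move to another capacity
time.sleep(600)         # step 2: wait ~10 minutes for propagation
assign(HOME_CAPACITY)   # step 3: move back to the original capacity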

 

4. Contact Microsoft Support with Session IDs

If the issue persists after a capacity restart, open a Microsoft support ticket and include the sessionId values from the error payload (plus the capacity and workspace IDs) so support can trace the throttled requests on the backend.

 

Documentation:

https://learn.microsoft.com/en-us/fabric/admin/capacity-settings?tabs=power-bi-premium

https://learn.microsoft.com/en-us/fabric/data-engineering/spark-job-concurrency-and-queueing

https://learn.microsoft.com/en-us/fabric/data-engineering/spark-detail-monitoring

 

Hope this helps!

Best regards,

Antoine

tayloramy
Contributor

Hi @malimahesh,

 

That 430 TooManyRequestsForCapacity error means you’ve hit Spark’s concurrency/queue limits for your capacity or the session API, not necessarily a CPU/memory shortage. It’s common to see this when the overall capacity graphs are low but there are lingering sessions or a burst of submissions. Microsoft documents the behavior, including the exact 430 message, here: Spark concurrency and queueing and Job queueing.

 

  1. Stop any stuck Spark sessions.
    – Open any notebook and use the toolbar Stop session button, or add a final cell with mssparkutils.session.stop() (or spark.stop()) to proactively release the session (see the sketch after this list). Ref: community tip.
    – In Monitoring hub, switch to All items and filter Workload: Apache Spark, Status: In progress/Queued. Cancel anything lingering. How to monitor/cancel: Monitor hub guide.
  2. Enable High Concurrency for notebooks (workspace admin setting) so multiple notebooks can share a session and reduce session churn: workspace high concurrency setting and overview: High concurrency mode.
  3. Reduce parallel submissions temporarily. If pipelines trigger multiple notebooks in parallel, lower the degree of parallelism (ForEach batch size, fewer simultaneous activities) to free up session slots. Each activity can start its own session (and pipelines can queue even when interactive notebooks fail): see community discussion: Concurrent sessions notes.
  4. Check capacity telemetry to locate the culprits.
    – Capacity Metrics app > Compute page shows which items are consuming/queuing Spark work: Metrics app compute page.
    – Note some system Spark jobs don’t show up in the metrics app, which explains “low usage but blocked” situations: Spark capacity consumption not reported.
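
For point 1, here is a minimal sketch of that final cell. mssparkutils is preinstalled in Fabric notebooks; treat the import line as an assumption and use whichever form your runtime exposes (it is also available as a global).

# Minimal sketch of a "last cell" for notebooks run by pipelines or schedules:
# releasing the session explicitly frees a concurrency slot instead of holding
# it until the idle timeout expires.
from notebookutils import mssparkutils  # assumption: also exposed as a global in Fabric

# ... your notebook cells run above this point ...

mssparkutils.session.stop()  # ends the interactive Spark session and frees its slot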

If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.
