I'm currently trying to run a script in an MS Fabric notebook environment that is attached to a Lakehouse using a table shortcut:
spark.conf.set("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
spark.conf.set("spark.hadoop.fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
spark.conf.set("spark.hadoop.google.cloud.auth.service.account.enable", "true")
spark.conf.set("spark.hadoop.google.cloud.auth.service.account.json.keyfile", "/lakehouse/default/Files/gcs_key.json")
spark.conf.set("spark.hadoop.google.cloud.auth.null.enable", "false")
spark.conf.set("google.cloud.auth.service.account.enable", "true")
spark.conf.set("google.cloud.auth.service.account.json.keyfile", "/lakehouse/default/Files/gcs_key.json")
# Read the shortcut table and export it to the GCS path as JSON
query = "SELECT * FROM dbo_Geography_shortcut"
max_records_per_file = 50000
mode = "append"
format = "json"
gcs_path = "gs://qa_dmgr_audience-prod-replica-1_eu_6bf6/15070/raw_test_5/"
df = spark.sql(query)
df.write.option("maxRecordsPerFile", max_records_per_file).mode(mode).format(format).save(gcs_path)

This script creates staging files in the GCS path while uploading and then tries to delete them once the final files are written. The service account doesn't have any delete permission assigned to it, so the Spark job fails. I can't get delete permission added, because that is restricted by my company.
Error:
Py4JJavaError: An error occurred while calling o5174.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=27, partition=0) failed; but task commit success, data duplication may happen. reason=ExceptionFailure(org.apache.spark.SparkException,[TASK_WRITE_FAILED] Task failed while writing rows to gs://qa_dmgr_audience-prod-replica-1_eu_6bf6/15070/raw_test.,[Ljava.lang.StackTraceElement;@7e3f1bb3,org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to gs://qa_dmgr_audience-prod-replica-1_eu_6bf6/15070/raw_test.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:776)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:499) ..........
.........
Caused by: java.io.IOException: Error deleting 'gs://qa_dmgr_audience-prod-replica-1_eu_6bf6/15070/raw_test/_temporary/0/_temporary/attempt_202507291107556685858892966191353_0027_m_000000_129/part-00000-bb5e042b-8f4d-4e6d-84a5-91149d3429ad-c000.json', stage 2 with generation 1753787276139795

I need to somehow avoid the creation of these staging files in the first place and upload directly to the GCS path using the same script.
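For completeness, here is how the missing permission could be confirmed directly against the bucket. This is only a sketch, it is not part of the script above, and it assumes the google-cloud-storage Python package is available in the notebook session (otherwise install it with %pip install google-cloud-storage):

from google.cloud import storage

# Sketch: list which object permissions the service-account key actually holds
# on the target bucket (bucket name and keyfile path are the same ones used above).
client = storage.Client.from_service_account_json("/lakehouse/default/Files/gcs_key.json")
bucket = client.bucket("qa_dmgr_audience-prod-replica-1_eu_6bf6")

granted = bucket.test_iam_permissions(
    ["storage.objects.create", "storage.objects.list", "storage.objects.delete"]
)
print(granted)  # storage.objects.delete is expected to be missing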
Please help me on this!
Thanks in advance!
Hello @Harsha_k_111,
Thank you for reaching out to the Microsoft Fabric Community Forum.
We understand that the script is failing because Spark writes temporary staging files under a _temporary directory in the GCS path and then tries to delete them when the job commits, and your service account lacks delete permissions due to company restrictions.
To resolve this, you can configure the job to write to the GCS path more directly, so that it does not depend on the default staging-and-cleanup behavior that requires delete permissions. Specifically, the GCS connector exposes output-stream / direct-upload settings for this, and Spark's file output committer can be tuned as well; a sketch of these settings is shown below. This should allow the job to complete successfully while adhering to your permission constraints.
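For illustration, here is a minimal sketch of the kind of settings meant above. The exact property names are assumptions based on the GCS Hadoop connector documentation and may differ with the connector and Spark versions bundled in the Fabric runtime, so please verify them in your environment rather than treating this as a confirmed fix:

# Assumption: the connector supports direct uploads, so objects are written
# straight to their destination instead of being staged first.
spark.conf.set("spark.hadoop.fs.gs.outputstream.direct.upload.enable", "true")

# Commit algorithm v2 moves each task's output to the final path at task-commit
# time, reducing the rename work done under the _temporary directory. Note the
# committer may still attempt to clean up _temporary at the end of the job.
spark.conf.set("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")

# Then write exactly as before:
df.write.option("maxRecordsPerFile", max_records_per_file).mode(mode).format(format).save(gcs_path)

If the committer still needs delete access for its cleanup step, an alternative that only requires create permission is to write the JSON output to the Lakehouse Files area first and then upload the resulting part files to GCS with the google-cloud-storage client.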
Best regards,
Ganesh Singamshetty
Hello @Harsha_k_111,
We hope you're doing well. Could you please confirm whether your issue has been resolved or if you're still facing challenges? Your update will be valuable to the community and may assist others with similar concerns.
Thank you.
Hello @Harsha_k_111,
Hope everything's going great with you. Just checking in: has the issue been resolved, or are you still running into problems? Sharing an update can really help others facing the same thing.
Thank you.
Hello @Harsha_k_111,
Could you please confirm if your query has been resolved by the provided solutions? This would be helpful for other members who may encounter similar issues.
Thank you for being part of the Microsoft Fabric Community.