meiwah
Contributor

Spark Config for PySpark notebook using SAS

I would like to read a Parquet file in ADLS Gen2 from a Fabric PySpark notebook. How should I set the Spark config?

The configuration below results in an error:

# Set the authentication type to SAS
spark.conf.set(
    f"fs.azure.account.auth.type.{CONFIG_PREFIX}",
    "SAS"
)

# Specify the FixedSASTokenProvider class
spark.conf.set(
    f"fs.azure.sas.token.provider.type.{CONFIG_PREFIX}",
    "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider"
)

# Set the actual SAS token
spark.conf.set(
    f"fs.azure.sas.fixed.token.{CONFIG_PREFIX}",
    SAS_TOKEN
)

The error is:
"Unable to load SAS token provider class: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider not found"
5 REPLIES
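The question's snippet does not show how CONFIG_PREFIX or SAS_TOKEN are defined. As a minimal sketch, assuming CONFIG_PREFIX is the storage account's DFS endpoint host name (`<account>.dfs.core.windows.net`), the three settings can be collected in one place like this. The function names `build_sas_conf` and `apply_conf` and the account name in the usage comment are hypothetical. This only makes the keys explicit; it does not resolve the ClassNotFoundException, which indicates that the provider class is not on the classpath of the runtime in use.

```python
PROVIDER_CLASS = "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider"


def build_sas_conf(storage_account: str, sas_token: str) -> dict:
    """Return the three ABFS settings from the question as a dict."""
    # Assumption: CONFIG_PREFIX is the account's DFS endpoint host name.
    config_prefix = f"{storage_account}.dfs.core.windows.net"
    return {
        # Set the authentication type to SAS
        f"fs.azure.account.auth.type.{config_prefix}": "SAS",
        # Specify the SAS token provider class
        f"fs.azure.sas.token.provider.type.{config_prefix}": PROVIDER_CLASS,
        # Set the actual SAS token
        f"fs.azure.sas.fixed.token.{config_prefix}": sas_token,
    }


def apply_conf(spark, conf: dict) -> None:
    """Apply every key/value pair to the session via spark.conf.set."""
    for key, value in conf.items():
        spark.conf.set(key, value)


# Usage inside a notebook, where `spark` and SAS_TOKEN already exist
# ("mystorageaccount" is a placeholder):
#   apply_conf(spark, build_sas_conf("mystorageaccount", SAS_TOKEN))
```

Printing the keys of the returned dict before applying them is a quick way to confirm that the prefix matches the host name used in the `abfss://` path being read.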
BhaveshPatel
Honored Contributor

Hi @meiwah

Is it Azure Spark (PySpark) or Delta Lake? Both store Parquet files, but Delta Lake adds a transaction log on top of the Parquet files.

Also, please explain why you need to connect to ADLS Gen2 (storage). You first need to connect with Azure Storage Explorer.

You also need a GroupID and WorkspaceID to connect Microsoft Fabric to Azure Storage Explorer.

 

https://app.fabric.microsoft.com/groups/08946547d-e0b7-6578-b5ff-ff4a96567753056/synapsenotebooks/5a...

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.

Hi Bhavesh, I'm running the code in a PySpark notebook in the Fabric environment, not in Azure. It is a plain Parquet file, not Delta. The reason for connecting to ADLS is to receive the data that lands there. I'm using a SAS token because the setup is cross-tenant: Fabric and ADLS are in different tenants.

BhaveshPatel
Honored Contributor

You first need to connect with Azure Storage Explorer (multi-tenant).

You also need a GroupID and WorkspaceID to connect Microsoft Fabric to Azure Storage Explorer.

Apache Spark doesn't have ACID transactions, whereas Delta Lake does.

 

https://app.fabric.microsoft.com/groups/08946547d-e0b7-6578-b5ff-ff4a96567753056/synapsenotebooks/5a

Thanks & Regards,
Bhavesh

v-achippa
Honored Contributor

Hi @meiwah,

Thank you for reaching out to Microsoft Fabric Community.

Thank you @BhaveshPatel for the prompt response.

As we haven't heard back from you, we wanted to follow up and check whether the solution provided by the user worked. Please let us know if you need any further assistance.

Thanks and regards,

Anjan Kumar Chippa

v-achippa
Honored Contributor

Hi @meiwah,

We wanted to follow up and check whether the solution provided by the user worked. Please let us know if you need any further assistance.

Thanks and regards,

Anjan Kumar Chippa
