Solved: Error reading datawarehouse table in a notebook using pyspark python

RafaelTerzoni · ‎09-26-2024

I am very frustrated. I developed an integration in a notebook (Python) that reads the rows from a DW table and sends them to an API. It was working, but now it is broken, and the problem affects not just this table but all tables.

I CAN'T read any table from my DW in a notebook through PySpark. Is anyone else having the same issue?

delta_table_path = "abfss://MRQ@onelake.dfs.fabric.microsoft.com/MRQ_Int.datawarehouse/Tables/xxx/mrq.Contact"

df = spark.read.format("delta").load(delta_table_path)

df.show()

I am getting the following error message:

Py4JJavaError: An error occurred while calling o6025.load. : java.util.concurrent.ExecutionException: Operation failed: "Bad Request", 400, HEAD, http://onelake.dfs.fabric.microsoft.com/MRQ/MRQ_Integration.datawarehouse/Tables/mrq/mrq.DimDate/_de... at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)

FabianSchut · ‎09-26-2024

Hi @RafaelTerzoni,

I made a test setup to see the abfss path of a table in a warehouse. It seems to me that you do not need the schema before the table name. In your case, mrq.Contact should become Contact. The schema does need to be in the path, but you already have that, so it seems.

Your full path from your example would become (where I suppose mrq is the schema):
delta_table_path = "abfss://MRQ@onelake.dfs.fabric.microsoft.com/MRQ_Int.datawarehouse/Tables/mrq/Contact"
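A small sketch of that split, in case it helps. The helper name table_subpath is hypothetical (it is not part of any Fabric or Spark API); it only does string handling and turns a schema-qualified name such as mrq.Contact into the Tables/<schemaName>/<tableName> part of the path. It does not check whether the table exists in OneLake.

```python
def table_subpath(qualified_name: str) -> str:
    """Turn 'schema.table' into 'Tables/schema/table' (string handling only)."""
    # Split on the first dot: left side is the schema, right side is the table.
    schema, sep, table = qualified_name.partition(".")
    if not sep or not schema or not table:
        raise ValueError(f"expected 'schema.table', got {qualified_name!r}")
    return f"Tables/{schema}/{table}"


print(table_subpath("mrq.Contact"))
```

The schema ends up as its own folder in the path, and the last segment is the bare table name without the schema prefix.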

frithjof_v · ‎09-29-2024

This doesn't work for me.

I tried this:

df_wh = spark.read.load("abfss://<workspaceName>@onelake.dfs.fabric.microsoft.com/<warehouseName>.datawarehouse/Tables/<schemaName>/<tableName>")

It gives me 400 Bad Request.

It worked when I replaced <workspaceName> with the ID of the workspace, and replaced the <warehouseName>.datawarehouse with the ID of the warehouse.

Anonymous · ‎09-27-2024

Hi, @RafaelTerzoni

Thanks to FabianSchut for the suggestion. He has analyzed your problem and proposed a fix. Please try it, and if it resolves the issue, consider accepting it as the solution so that others with a similar problem can find it more easily.

Best Regards,
Yang
Community Support Team

frithjof_v · ‎09-29-2024

This works for me:

Replace the workspace name and the warehouse name with their GUIDs (you can find them in the URL in the web browser when you are inside your Warehouse).

Like this:

abfss://<workspaceID>@onelake.dfs.fabric.microsoft.com/<warehouseID>/Tables/<schemaName>/<tableName>
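Here is a minimal sketch of building that path in Python. The function name warehouse_table_path and the all-zero and all-one GUIDs are placeholders I made up for illustration; substitute the IDs from your own URL. The GUID check with uuid.UUID is just a guard I added so that passing a name instead of an ID fails early, before Spark is involved.

```python
import uuid

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def warehouse_table_path(workspace_id: str, warehouse_id: str,
                         schema: str, table: str) -> str:
    """Build the ID-based abfss path for a Warehouse table."""
    # uuid.UUID raises ValueError if a name was passed instead of a GUID.
    for value in (workspace_id, warehouse_id):
        uuid.UUID(value)
    return (f"abfss://{workspace_id}@{ONELAKE_HOST}/"
            f"{warehouse_id}/Tables/{schema}/{table}")


# Placeholder GUIDs, not real workspace or warehouse IDs.
path = warehouse_table_path(
    "00000000-0000-0000-0000-000000000000",
    "11111111-1111-1111-1111-111111111111",
    "mrq",
    "Contact",
)
print(path)

# In a Fabric notebook, where `spark` is already defined:
# df = spark.read.format("delta").load(path)
# df.show()
```

Note that the warehouse segment is just the ID, with no .datawarehouse suffix, matching the pattern above.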

rterzoni · ‎09-29-2024

That is it. Thank you! It worked using the IDs.
