PeteSpillane
New Contributor II

How do you send mssparkutils.fs.ls() output to a dataframe?

I can loop through the output using:

files = mssparkutils.fs.ls('Files/orders/')
for file in files:
    print(file.name, file.isDir, file.isFile, file.path, file.size)

But how do I send the output to a dataframe instead?

1 ACCEPTED SOLUTION
kriscoupe
Contributor II

Hi @PeteSpillane ,

You can do this with the following code:

from notebookutils import mssparkutils

# Initialise variables
data = []
columns = ["File Name", "Is Dir", "Is File", "File Path", "File Size"]
files = mssparkutils.fs.ls('Files/orders/')

# Append one row per file with its name, type flags, path and size
for file in files:
    data.append([file.name, file.isDir, file.isFile, file.path, file.size])

# Create a Spark dataframe from the collected rows and column names
dataframe = spark.createDataFrame(data, columns)

# Show the dataframe
dataframe.show()

 

Tested on my side in a Fabric notebook and it all seemed to work okay.
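
If you'd rather pin the column types down than rely on Spark's inference, a variant with an explicit schema works too. This is just a sketch, assuming the same 'Files/orders/' path and the notebook's built-in spark session:

from notebookutils import mssparkutils
from pyspark.sql.types import StructType, StructField, StringType, BooleanType, LongType

# Explicit schema so the columns get fixed types instead of inferred ones
schema = StructType([
    StructField("File Name", StringType(), False),
    StructField("Is Dir", BooleanType(), False),
    StructField("Is File", BooleanType(), False),
    StructField("File Path", StringType(), False),
    StructField("File Size", LongType(), False),
])

# List the files and build one tuple per entry
files = mssparkutils.fs.ls('Files/orders/')
rows = [(f.name, f.isDir, f.isFile, f.path, f.size) for f in files]

# Create and display the dataframe
dataframe = spark.createDataFrame(rows, schema)
dataframe.show()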

 

Hope it helps,

Kris


PeteSpillane
New Contributor II

Works perfectly. Thanks, Kris!
