I'd like to create a data pipeline and run PySpark code directly from a GitHub repo. Is that possible?
Do you mean run a notebook from a GitHub repo using a GitHub workflow? If so then absolutely.
I wrote a post that shows how you can do it with Azure DevOps; you can port the logic over:
https://www.kevinrchant.com/2025/01/31/authenticate-as-a-service-principal-to-run-a-microsoft-fabric...
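For anyone porting this to a GitHub workflow, here is a minimal sketch of the REST request a workflow step could send to start a notebook run. It assumes Fabric's on-demand job endpoint (`POST .../workspaces/{workspaceId}/items/{itemId}/jobs/instances?jobType=RunNotebook`) and a bearer token already obtained for the service principal. The function name, the IDs, and the empty request body are illustrative assumptions, so check them against the current Fabric REST API documentation.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_run_notebook_request(workspace_id: str, notebook_id: str,
                               access_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks Fabric to run a notebook.

    Hypothetical helper: the endpoint shape is an assumption based on the
    Fabric job scheduler API, not something stated in this thread.
    """
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/items/{notebook_id}/jobs/instances?jobType=RunNotebook"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # assumed: no parameters passed to the notebook
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# In a workflow step, with real IDs and a token read from secrets:
# request = build_run_notebook_request(workspace_id, notebook_id, token)
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping the request construction separate from the network call makes it easy to check the URL and headers locally before wiring the step into a workflow.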
As far as I know, you cannot run program code such as PySpark directly from a GitHub repo; the GitHub repo integration is meant for CI/CD. By the way, why do you need to do this?
Instead, I would use Dataflow Gen2 or Python notebooks.
For better version control, and to import modules from other folders.
Hello @lchinelli,
Thank you for reaching out to the Microsoft Fabric Forum Community.
I've reproduced your scenario in Microsoft Fabric and achieved the desired outcome. You can run PySpark code directly from a GitHub repo by using a Fabric notebook that fetches the script with a requests.get() call and runs it with exec(). This notebook can then be triggered from a Data Factory pipeline using a Notebook activity.
How It Works:
Example GitHub Code Used:
data = [
    ("Microsoft Fabric", 2025),
    ("Power BI", 2024),
    ("Synapse", 2023),
]
columns = ["Product", "Year"]
# `spark` is the SparkSession that the Fabric notebook session predefines
df = spark.createDataFrame(data, columns)
df.show()
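A notebook cell that fetches and runs a script like this could look roughly like the sketch below. The helper names and the URL are hypothetical and the post does not show the actual cell, so treat this as one possible shape of the requests.get() plus exec() approach rather than the exact code used.

```python
def fetch_script(url: str, timeout: float = 30.0) -> str:
    """Download a script's source text from a raw-file URL."""
    # Third-party package; imported here so run_script works without it installed.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail on 404/403 instead of exec'ing an error page
    return response.text


def run_script(source: str, namespace: dict, name: str = "<remote script>") -> dict:
    """Execute script source inside `namespace` and return that namespace."""
    code = compile(source, name, "exec")  # a meaningful name makes tracebacks readable
    exec(code, namespace)
    return namespace


# Hypothetical URL; replace the placeholders with your own repo details.
SCRIPT_URL = "https://raw.githubusercontent.com/<owner>/<repo>/<branch>/main.py"

# In a Fabric notebook cell, where `spark` already exists, one would call:
# run_script(fetch_script(SCRIPT_URL), {"spark": spark})
```

Because exec() runs whatever the URL returns, it is worth pinning the URL to a specific commit rather than a moving branch, and a private repo would additionally need an authenticated request.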
Here's a successful pipeline run in Microsoft Fabric using a notebook that fetches a PySpark script from GitHub:
If this information is helpful, please "Accept as solution" and give a "kudos" to help other community members resolve similar issues more efficiently.
Thank you.
Is it possible to run code from other folders by importing it into a main.py file or a main.ipynb? I ask because my code is object-oriented.
Hello @lchinelli,
Yes, it is possible to run modular, object-oriented PySpark code across multiple files and folders (just like in OOP projects), either within Microsoft Fabric notebooks or from a main.py.
Thank you.
How?
Hi @lchinelli,
Thank you for the follow-up.
I have reproduced your scenario in practice: modular, object-oriented PySpark code structured across multiple folders (like models/ and utils/) and executed from a main driver script.
What I Did:
/main.py
/models/
├── cleaner.py
└── validator.py
/utils/
└── formatter.py
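The post does not show the file contents or how the files reached the notebook, so the sketch below is only an illustration of how a layout like this can be imported from a driver script. It writes hypothetical versions of cleaner.py, validator.py, and formatter.py to a temporary folder, puts that folder on sys.path, and imports from it the way a main.py would. The class and function names (Cleaner, Validator, format_row) are invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents standing in for the real repo files.
FILES = {
    "models/__init__.py": "",
    "models/cleaner.py": (
        "class Cleaner:\n"
        "    def clean(self, rows):\n"
        "        return [r.strip() for r in rows if r.strip()]\n"
    ),
    "models/validator.py": (
        "class Validator:\n"
        "    def is_valid(self, row):\n"
        "        return len(row) > 3\n"
    ),
    "utils/__init__.py": "",
    "utils/formatter.py": (
        "def format_row(row):\n"
        "    return row.title()\n"
    ),
}


def materialize(root: Path, files: dict) -> None:
    """Write each relative path in `files` under `root`, creating folders as needed."""
    for relative_path, source in files.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)


# Stand-in for a cloned or downloaded copy of the repo.
repo_root = Path(tempfile.mkdtemp())
materialize(repo_root, FILES)

# Make the repo root importable, as main.py would need it to be.
sys.path.insert(0, str(repo_root))
importlib.invalidate_caches()

from models.cleaner import Cleaner
from models.validator import Validator
from utils.formatter import format_row

rows = ["  power bi ", "", "synapse", "sql"]
cleaned = Cleaner().clean(rows)
valid = [row for row in cleaned if Validator().is_valid(row)]
result = [format_row(row) for row in valid]
print(result)
```

This covers code that runs on the driver. If the classes are also used inside UDFs, the modules need to reach the executors as well, for example by zipping the folders and passing the archive to spark.sparkContext.addPyFile.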
The screenshot below shows the successful execution and the expected output:
If you have any further questions, please don't hesitate to contact us through the community. We are happy to assist you.
Best Regards,
Ganesh singamshetty.
Hello @lchinelli,
We hope you're doing well. Could you please confirm whether your issue has been resolved or if you're still facing challenges? Your update will be valuable to the community and may assist others with similar concerns.
Thank you.
Hello @lchinelli,
Hope everything's going great on your end. Just checking in: has the issue been resolved, or are you still running into problems? Sharing an update can really help others facing the same thing.
Thank you.
Hello @lchinelli,
Could you please confirm if your query has been resolved by the provided solutions? This would be helpful for other members who may encounter similar issues.
Thank you for being part of the Microsoft Fabric Community.