This is directly related to this problem: Spark & Scala: saveAsTextFile() exception
I also get the same error when I attempt to save a DataFrame to a CSV from a Jupyter Notebook using PySpark. I created a very simple CSV to load and immediately save back (I can display it in its entirety using show()), but when I attempt to save it I get the UnsatisfiedLinkError.
I followed all the suggestions in the StackOverflow question above, but none of them helped. However, when I load and save the same CSV in spark-shell from CMD, everything works fine.
It also seems that the PySpark in Jupyter that I installed using Anaconda (Python 3.8) doesn't recognize the HADOOP_HOME environment variable, and I have to set it manually:

import os
os.environ['HADOOP_HOME'] = r"C:\apps\hadoop-2.7.3"
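For reference, a common suggestion for this error (though it hasn't fixed it for me) is to extend that snippet so the Hadoop bin folder, where winutils.exe lives, is also on PATH before the SparkSession is created. The install path below is specific to my machine:

```python
import os

# Local Hadoop install -- adjust to your machine.
hadoop_home = r"C:\apps\hadoop-2.7.3"

# Spark's Hadoop layer reads HADOOP_HOME, and winutils.exe must be
# reachable via PATH, so prepend %HADOOP_HOME%\bin as well.
os.environ['HADOOP_HOME'] = hadoop_home
os.environ['PATH'] = (
    os.path.join(hadoop_home, 'bin')
    + os.pathsep
    + os.environ.get('PATH', '')
)
```

This has to run before any SparkSession is constructed, since the JVM only reads the environment once at startup.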
I have already tried all the suggestions I could find on Stack Overflow, and I'm confused as to why it works in spark-shell but not in PySpark in a notebook. I am also able to run hadoop from PowerShell with no problem.
Source: Windows Questions