Py4JJavaError: An error occurred while calling lemmatizer = LemmatizerModel.pretrained() #5774
Hi, first of all, thank you for the detailed report; this really helps to narrow down the issue.
By default, Spark NLP downloads the models/pipelines into the home directory of the user. Here is
It seems you put that model right in the root, and it doesn't have enough permissions to read and execute it. This may also be related to the configuration on Windows, but it would be best to keep the directory somewhere you have sufficient permissions. In the end, it is really good that you have PySpark working correctly, but Spark NLP needs a couple more things:
Please use this: spark = sparknlp.start(). And remove this part: (if you prefer to construct your own SparkSession instead of
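The two ways of starting a session described above can be sketched as follows. This is a minimal sketch, not an official recipe: the Maven coordinate and config values are assumptions based on the versions reported later in this thread (Spark NLP 3.1.1, Spark 3.1.2), so adjust them to your setup.

```python
# Config a hand-built SparkSession would need for Spark NLP. These values are
# assumptions matching the versions mentioned in this thread, not mandates.
SPARKNLP_CONF = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryoserializer.buffer.max": "2000M",
    "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:3.1.1",
}

def start_with_helper():
    # Preferred: let Spark NLP configure the session itself.
    import sparknlp
    return sparknlp.start()

def start_manually(conf=SPARKNLP_CONF):
    # Only if you really want to construct your own SparkSession.
    from pyspark.sql import SparkSession
    builder = SparkSession.builder.appName("Spark NLP").master("local[*]")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

The imports are deferred into the functions so the sketch can be read (and the config inspected) without Spark installed.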
Hi Maziyar, thanks so much for your prompt reply; this is very helpful! I tried to follow your steps; in particular, I now use the spark = sparknlp.start() setup as you suggest. Furthermore, I completed some steps, outlined in the link you sent (#1022), that I had not done before. In particular, I
Regarding the permissions to read/write/execute: that is strange, because I should be working in folders where I have all these permissions (by the way, C:/Users/dkaenzig/cache_pretrained exists). To double-check, I moved the folder from my Dropbox to my desktop, but unfortunately the problem remained. Is there a way to check the permissions within spark-nlp? After all these changes, I tried again, but unfortunately I still get an error:
Similarly, when trying to load the model locally:
The error seems to be different now, though; it looks related to loading Java? Do you have any idea what the problem could be? Many thanks for your help! Best, Diego
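As for the permissions question above: spark-nlp itself doesn't expose a permissions check as far as I know, but the read/write/execute status of the cache directory can be inspected from plain Python. A small sketch (the cache path follows the default ~/cache_pretrained location mentioned in this thread; note that on Windows, os.access mostly reflects the read-only attribute rather than full ACLs):

```python
import os

def check_access(path):
    # Report what Spark NLP's downloader would need on the cache directory:
    # existence plus read/write/execute access for the current user.
    return {
        "exists": os.path.exists(path),
        "read": os.access(path, os.R_OK),
        "write": os.access(path, os.W_OK),
        "execute": os.access(path, os.X_OK),
    }

# Default cache location: ~/cache_pretrained
# (e.g. C:/Users/<user>/cache_pretrained on Windows)
cache_dir = os.path.join(os.path.expanduser("~"), "cache_pretrained")
print(check_access(cache_dir))
```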
Update: I am not sure what solved the problem but:
And after all that, it seems to work now! A bit disappointing that I still do not know what exactly the problem was, but at least it's working. Thanks so much again for all the help! Best, Diego
I am glad it worked out; best of luck!
Hi, I am new to spark-nlp. As my first project, I tried to replicate the analysis here: https://towardsdatascience.com/natural-language-processing-with-pyspark-and-spark-nlp-b5b29f8faba. I was able to set up Spark, following the instructions here:
https://phoenixnap.com/kb/install-spark-on-windows-10, and am able to run Spark in a Python Jupyter notebook. However, when I try to load a pretrained model, e.g. lemmatizer = LemmatizerModel.pretrained(), I run into errors. Other tasks, e.g. loading a .parquet file, work well.
Description
My code is the following:
Unfortunately, this does not work as expected. I run into the following error:
Interestingly, when I run it one more time, I still get an error but the error changes:
I found a similar thread here: #846 but none of the fixes there worked for me. In particular, I have verified that I have Java 8, that the environment variables are (to the best of my knowledge) set correctly, and I updated sparknlp to the newest version.
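The environment-variable check mentioned above can also be scripted. A hedged sketch: the variable names below are assumptions based on the Windows Spark install guide linked earlier, not something Spark NLP itself mandates.

```python
import os

# Variables the Windows Spark setup guide asks for; names are assumptions
# taken from that guide, adjust to your own installation.
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME")

def missing_env_vars(required=REQUIRED_VARS, env=None):
    # Return the variables that are unset or empty in the given environment.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

print(missing_env_vars())
```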
I also followed the advice there, downloaded the model, and tried to load it manually, but to no avail. This leads to another error:
This error makes me think that the issue may be related to Hadoop and file permissions. Do you have any idea what the problem could be and how to fix it?
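For the manual-load attempt described above, one common Windows tripwire is backslashes in the model path when it is handed to Hadoop. A minimal sketch, assuming a locally extracted model directory (the path handling is the point here; LemmatizerModel.load needs an active Spark session):

```python
def normalize_model_path(model_dir):
    # Hadoop's local-filesystem handling tends to be happier with forward
    # slashes, even on Windows.
    return model_dir.replace("\\", "/")

def load_local_lemmatizer(model_dir):
    # Requires a running Spark session (e.g. via sparknlp.start()) first.
    from sparknlp.annotator import LemmatizerModel
    return LemmatizerModel.load(normalize_model_path(model_dir))
```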
My Environment
sparknlp.version(): '3.1.1'
spark.version: '3.1.2'
java -version:
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (AdoptOpenJDK) (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (AdoptOpenJDK) (build 25.292-b10, mixed mode)
Environment variables:
PS: I tried with the Spark build for Hadoop 3.2, but the problem was the same.
Thanks so much for your help in advance! Best wishes,
Diego