Unable to use some packages on Databricks R Notebook
I have tried to solve these problems using earlier Stack Overflow questions and other sites, without success. My problems are as follows:
I am trying to install the ggmap package:
install.packages("ggmap", lib="/databricks/spark/R/lib")
but I get this error:
rjcommon.h:11:10: fatal error: jpeglib.h: No such file or directory
Maybe useful info:
x64 Windows 10
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
[1] "/local_disk0/.ephemeral_nfs/envs/rEnv-adfa3b9f-34f8-4494-af74-4cf4c85bece3"
[2] "/databricks/spark/R/lib"
[3] "/local_disk0/.ephemeral_nfs/cluster_libraries/r"
[4] "/usr/local/lib/R/site-library"
[5] "/usr/lib/R/site-library"
[6] "/usr/lib/R/library"
I was also trying to install the GADMTools package:
install.packages("GADMTools", lib="/databricks/spark/R/lib")
and the errors were as follows:
Configuration failed because libudunits2.so was not found and
configure: error: gdal-config not found or not executable
For the ggmap package, I also tried installing the missing library through the terminal:
PS C:\Users\olthpor\scoop\buckets\main> sudo apt-get install libjpeg-dev
and the result was:
Start-process: This command cannot be run due to the error: The system cannot find the file specified.
At C:\Users\olthpor\Documents\Scripts\sudo.ps1:1 char:103
+...ngth -gt 1){start-process arg[0]-ArgumentList args[1...args.Lengt...
+ CategoryInfo : InvalidOperation:(:)[Start-process], InvalidOperationException
+ FullyQualifiedErrorId: InvalidOperationException,
Microsoft.PowerShell.Commands.StartProcessCommand
I'm not used to running Linux commands on Windows.
Solution 1:
The problem is that some system packages required to compile your R packages aren't installed, for example libjpeg-dev for ggmap. They have to be installed on the Databricks cluster, which runs Ubuntu, not on your local Windows machine.
You can solve the problem as follows:
- On Community Edition, or if you're using a Single Node cluster, it may be enough to run the following in a notebook cell (you need to find out which Ubuntu packages your libraries require):
%sh
apt-get update
apt-get -y install libjpeg-dev
- If you use a multi-node cluster, you need a cluster init script that installs the dependencies on all nodes of the cluster (as the library needs to be compiled on each node as well). The content of the script is similar to the commands above; you just need to add the so-called shebang line:
#!/bin/bash
apt-get update
apt-get -y install libjpeg-dev
Install this script as described in the documentation, and restart the cluster.
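To cover the GADMTools errors from the question as well, the init script can list more packages. The following is a minimal sketch that writes such a script to a file and syntax-checks it without running it. The file name install-r-sysdeps.sh is hypothetical, and the extra package names are an assumption based on the error messages (libudunits2-dev for libudunits2.so, libgdal-dev for gdal-config); later compile errors may call for further packages.

```shell
# Hypothetical file name; upload the result to wherever your workspace keeps init scripts.
SCRIPT_PATH="${SCRIPT_PATH:-install-r-sysdeps.sh}"

# Write the init script. The quoted 'EOF' keeps the body literal (no variable expansion).
cat > "$SCRIPT_PATH" <<'EOF'
#!/bin/bash
set -e
apt-get update
# libjpeg-dev      -> jpeglib.h       (needed when compiling for ggmap)
# libudunits2-dev  -> libudunits2.so  (assumed fix for the GADMTools error)
# libgdal-dev      -> gdal-config     (assumed fix for the GADMTools error)
apt-get -y install libjpeg-dev libudunits2-dev libgdal-dev
EOF

# Check the generated script for syntax errors without executing it.
bash -n "$SCRIPT_PATH" && echo "wrote $SCRIPT_PATH"
```

Once the cluster has restarted with the script attached, the install.packages calls from the question should be able to compile against these headers.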