Installation
A step-by-step guide to setting up Dataverse
Dataverse can be installed using pip:
pip install dataverse
Dataverse requires Python, Java (JDK), and Spark (PySpark). The steps below explain how to install the JDK and PySpark.
Prerequisites
Python (version 3.10 or 3.11)
JDK (version 11) & PySpark
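The Python requirement can be checked before installing anything else. The following is a minimal sketch, assuming the stated range means any 3.10.x or 3.11.x release; the helper name `python_version_ok` is hypothetical and not part of Dataverse.

```python
import sys

# Assumption: the prerequisite covers every 3.10.x and 3.11.x interpreter.
SUPPORTED_MINORS = {(3, 10), (3, 11)}


def python_version_ok(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) version is 3.10 or 3.11."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


if __name__ == "__main__":
    # Report the running interpreter and whether it falls in the range.
    print("Python", sys.version.split()[0], "supported:", python_version_ok())
```

Run it with the same interpreter you plan to use for `pip install dataverse`, since several Python versions may be installed side by side.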
1. Install JDK
1-1. Install JDK
sudo apt-get update
sudo apt-get install openjdk-11-jdk
1-2. Set Java environment variable
echo "export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64" >> ~/.bashrc
source ~/.bashrc
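After the JDK steps above, it can help to confirm that the environment variable took effect. This is a rough sketch rather than an official check; `check_java` is a hypothetical helper that only inspects `JAVA_HOME` and looks for a `java` executable on the current `PATH`.

```python
import os
import shutil


def check_java(env=os.environ):
    """Report whether JAVA_HOME is set to an existing directory
    and whether a `java` executable is reachable on the process PATH."""
    java_home = env.get("JAVA_HOME")
    return {
        "JAVA_HOME": java_home,
        # True only when the variable is set and the directory exists.
        "java_home_exists": bool(java_home) and os.path.isdir(java_home),
        # shutil.which searches the PATH of the running process.
        "java_on_path": shutil.which("java") is not None,
    }


if __name__ == "__main__":
    for key, value in check_java().items():
        print(f"{key}: {value}")
```

If `java_home_exists` is False, the JDK may have been installed under a different directory than the one written to `~/.bashrc` (for example on a non-amd64 machine); listing `/usr/lib/jvm` shows the actual directory name.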
2. Install PySpark
2-1. Install PySpark
pip install pyspark
2-2. Set PySpark environment variables
echo "export SPARK_HOME=$(pip show pyspark | grep Location | awk '{print $2 "/pyspark"}')" >> ~/.bashrc
echo "export PYSPARK_PYTHON=python3" >> ~/.bashrc
source ~/.bashrc
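The PySpark setup can be checked in the same spirit. The sketch below assumes nothing beyond the variables set above; `check_pyspark` is a hypothetical helper, and it only tests whether the `pyspark` package can be found, without starting Spark.

```python
import importlib.util
import os


def check_pyspark(env=os.environ):
    """Report whether pyspark is importable and whether the
    PySpark-related environment variables are set."""
    spark_home = env.get("SPARK_HOME")
    return {
        # find_spec returns None when the top-level package is not installed.
        "pyspark_installed": importlib.util.find_spec("pyspark") is not None,
        "SPARK_HOME": spark_home,
        "spark_home_exists": bool(spark_home) and os.path.isdir(spark_home),
        "PYSPARK_PYTHON": env.get("PYSPARK_PYTHON"),
    }


if __name__ == "__main__":
    for key, value in check_pyspark().items():
        print(f"{key}: {value}")
```

Run it from a shell that has already sourced `~/.bashrc`; otherwise `SPARK_HOME` and `PYSPARK_PYTHON` will appear unset even though the lines were written to the file.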