Installation

A step-by-step guide to setting up Dataverse

Dataverse can be installed using pip:

pip install dataverse

Dataverse requires Python, Java, and Spark. The sections below explain how to install the JDK and PySpark.

Prerequisites

  • Python (version 3.10 or 3.11)

  • JDK (version 11) & PySpark
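Before installing anything, it can help to confirm that the interpreter on your PATH falls in the supported range. The following is a minimal sketch, not part of Dataverse: `python_version_ok` is a hypothetical helper name, and the check assumes `python3` is the interpreter you intend to use.

```shell
# Hypothetical helper: succeeds only for Python 3.10.x or 3.11.x.
python_version_ok() {
    case "$1" in
        3.10|3.10.*|3.11|3.11.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the interpreter on PATH for its version (empty if python3 is missing).
current="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || true)"

if python_version_ok "$current"; then
    echo "Python $current is in the supported range"
else
    echo "Python ${current:-not found} is outside the supported 3.10-3.11 range"
fi
```

If the check reports an unsupported version, switch to a 3.10 or 3.11 interpreter before running the pip commands.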

1. Install JDK

1-1. Install JDK

sudo apt-get update
sudo apt-get install openjdk-11-jdk

1-2. Set Java environment variable

echo "export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64" >> ~/.bashrc 
source ~/.bashrc
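To confirm that the variable took effect, a quick check along these lines may be useful. This is a sketch under assumptions: `java_home_ok` is a hypothetical helper, and the path written to `~/.bashrc` above is the amd64 location used by Debian and Ubuntu packages, so on other architectures or distributions the directory under `/usr/lib/jvm` may differ.

```shell
# Hypothetical helper: succeeds if the given directory contains an executable bin/java.
java_home_ok() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if java_home_ok "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks valid: $JAVA_HOME"
    # Prints the JDK version banner; it should report version 11.
    "$JAVA_HOME/bin/java" -version
else
    echo "JAVA_HOME is unset or does not contain bin/java"
fi
```

If the check fails, list `/usr/lib/jvm` to find the directory that was actually installed and adjust the export accordingly.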

2. Install PySpark

2-1. Install PySpark

pip install pyspark

2-2. Set PySpark environment variables

echo "export SPARK_HOME=$(pip show pyspark | grep Location | awk '{print $2 "/pyspark"}')" >> ~/.bashrc
echo "export PYSPARK_PYTHON=python3" >> ~/.bashrc
source ~/.bashrc
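The PySpark setup can be checked in a similar way. The sketch below assumes the two variables were exported as shown above; `spark_home_ok` is a hypothetical helper, and the import test only confirms that the package is visible to the chosen interpreter, not that a Spark session can start.

```shell
# Hypothetical helper: succeeds if the argument is a non-empty path to an existing directory.
spark_home_ok() {
    [ -n "$1" ] && [ -d "$1" ]
}

if spark_home_ok "${SPARK_HOME:-}"; then
    echo "SPARK_HOME points to: $SPARK_HOME"
else
    echo "SPARK_HOME is unset or is not a directory"
fi

# Confirm that the interpreter named by PYSPARK_PYTHON can import pyspark.
py="${PYSPARK_PYTHON:-python3}"
if "$py" -c 'import pyspark; print("pyspark", pyspark.__version__)' 2>/dev/null; then
    echo "pyspark import succeeded"
else
    echo "pyspark could not be imported by $py"
fi
```

Note that the `SPARK_HOME` line written to `~/.bashrc` is resolved once, at the time the `echo` command runs, so it should be regenerated if PySpark is later reinstalled into a different environment.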
