IQSS / dataverse-sample-data

Scripts and sample data for demo purposes

Dataverse Sample Data

Populate your Dataverse installation with sample data.

Requirements

  • Python 3.4 or higher

Installation

Clone this repo.

git clone https://github.com/IQSS/dataverse-sample-data.git

Change directories into the repo that you cloned.

cd dataverse-sample-data

Create a virtual environment for this project.

python3 -m venv venv

Activate the virtual environment you just created.

source venv/bin/activate

Install dependencies into the virtual environment, especially pyDataverse.

pip3 install -r requirements.txt

Copy dvconfig.py.sample to dvconfig.py and add your API token using your preferred text editor (vi is shown below only as an example). The config file also specifies which sample data will be created.

cp dvconfig.py.sample dvconfig.py
vi dvconfig.py

Note that the environment variable $API_TOKEN will override api_token in dvconfig.py.
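
The override described above can be pictured with a minimal sketch. The function name resolve_api_token is hypothetical and not taken from the repository, and treating an empty $API_TOKEN as unset is an assumption:

```python
import os


def resolve_api_token(config_token, environ=None):
    """Return the API token, preferring $API_TOKEN over the config value.

    Hypothetical helper for illustration; not part of this repository.
    """
    env = os.environ if environ is None else environ
    # Assumption: an unset or empty variable falls back to the value
    # that would come from api_token in dvconfig.py.
    return env.get("API_TOKEN") or config_token
```

With this sketch, resolve_api_token("from-config", {"API_TOKEN": "from-env"}) yields the environment value, while an empty mapping yields the config value.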

Adding a custom dataset with a specific number of files

You can add a specific number of files to the dataset "Dataverse performance test dataset" with:

python create_sample_custom_dataset.py

You'll be prompted for the number of files to create. The script then generates that many PNG files, each showing the Dataverse logo in a randomly chosen color. Complete this step before adding sample data; otherwise the dataset will be empty.
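
As a rough illustration of the generation step, the sketch below writes a requested number of PNG files, each in a randomly chosen color. It is not the repository's implementation: solid-color squares stand in for the Dataverse logo, the PNG bytes are assembled with the standard library only, and the names write_random_pngs, solid_png, png_chunk, and the placeholder_NNNN.png file pattern are all hypothetical.

```python
import os
import random
import struct
import zlib


def png_chunk(kind, data):
    """Frame one PNG chunk: length, type, data, CRC32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def solid_png(rgb, size=16):
    """Encode a size x size 8-bit RGB image filled with a single color."""
    # IHDR: width, height, bit depth 8, color type 2 (RGB), no interlace.
    header = struct.pack(">IIBBBBB", size, size, 8, 2, 0, 0, 0)
    row = b"\x00" + bytes(rgb) * size  # filter byte 0, then the pixels
    pixels = zlib.compress(row * size)
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", header)
            + png_chunk(b"IDAT", pixels)
            + png_chunk(b"IEND", b""))


def write_random_pngs(count, out_dir, seed=None):
    """Write `count` placeholder PNGs, each in a random color."""
    rng = random.Random(seed)
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index in range(count):
        rgb = tuple(rng.randrange(256) for _ in range(3))
        path = os.path.join(out_dir, "placeholder_{:04d}.png".format(index))
        with open(path, "wb") as handle:
            handle.write(solid_png(rgb))
        paths.append(path)
    return paths
```

Calling write_random_pngs(5, "generated_files") would create five small PNG files in a generated_files directory (a directory name chosen here only for the example).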

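If you encounter the error OSError: no library called "cairo-2" was found, set the following environment variable, as documented at https://github.com/Kozea/CairoSVG/issues/392#issuecomment-1927435606 (the path shown assumes cairo was installed with Homebrew on macOS):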

export DYLD_LIBRARY_PATH="/opt/homebrew/opt/cairo/lib:$DYLD_LIBRARY_PATH"
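
To check whether the variable took effect before rerunning the script, a small sketch like the following may help. It relies on ctypes.util.find_library from the standard library, which on macOS should take DYLD_LIBRARY_PATH into account. The candidate library names and the helper name find_cairo are assumptions for illustration, not something this repository defines.

```python
import ctypes.util

# Assumption: these are plausible names for the cairo shared library;
# adjust them to match whatever the error message reports.
CANDIDATE_NAMES = ("cairo-2", "cairo", "libcairo-2")


def find_cairo(names=CANDIDATE_NAMES):
    """Return the first cairo library located by ctypes, or None."""
    for name in names:
        found = ctypes.util.find_library(name)
        if found:
            return found
    return None


if __name__ == "__main__":
    location = find_cairo()
    print(location or "cairo not found; check DYLD_LIBRARY_PATH")
```

Run it in the same shell session in which you exported the variable, since the setting does not carry over to new terminals.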

Adding sample data

Assuming you have already run the source and cd commands above, you should be able to run the following command to create sample data.

python create_sample_data.py

All of the steps above can be automated in a fresh installation of Dataverse on an EC2 instance on AWS by downloading ec2-create-instance.sh and main.yml. Edit main.yml to set dataverse.sampledata.enabled: true and adjust any other settings to your liking, then run the script with the config file like this:

curl -O https://raw.githubusercontent.com/GlobalDataverseCommunityConsortium/dataverse-ansible/master/ec2/ec2-create-instance.sh
chmod 755 ec2-create-instance.sh
./ec2-create-instance.sh -g main.yml

For more information on spinning up Dataverse in AWS (especially if you don't have the aws executable installed), see http://guides.dataverse.org/en/latest/developers/deployment.html

Contributing

We love contributors! Please see our Contributing Guide for ways you can help.
