heyhellomila / soen363-databasesystems-phase2

Phase II: NoSQL (MongoDB)

SOEN 363: Database Systems

Phase II: NoSQL Databases (MongoDB)

Banner image taken from here.

Contributors:

  • Mila Roisin
  • Michelle Choi
  • Collin Zhou

Software Engineering, Gina Cody School of Engineering, Concordia University, 2020

Database System

We chose MongoDB, a document-based NoSQL database.

Download MongoDB Community Edition from the official MongoDB download page.

Dataset

We selected the following dataset from Kaggle: Safebooru - Anime Image Metadata

  • Metadata: 2,736,037 rows of tag-based anime image metadata
  • Contains 2,736,034 unique values
  • Size: 1.24 GB, 9 columns
  • Filetype: .csv (comma separated values)
  • No. of files: 1
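
Before importing, it can be worth confirming that the downloaded file matches the figures above. The following is a minimal sketch using only the Python standard library; the function name `summarize_csv` is hypothetical and not part of the project, and it assumes the file is UTF-8 encoded with a header row.

```python
import csv


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file with a header row.

    Assumption: the file is UTF-8 encoded and its first line is the header.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # first record holds the column names
        # Stream the remaining records one at a time so the large file
        # is never loaded into memory at once.
        row_count = sum(1 for _ in reader)
    return header, row_count
```

Calling `summarize_csv("all_data.csv")` should return a list of 9 column names and a row count close to the figure listed above, if the download is complete.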

Steps

  1. Download the all_data.csv dataset.
  2. Create the folder path data/db on the drive where you installed MongoDB.
  3. Open a command line, cd to the bin directory of your MongoDB installation, and run mongod.
  4. In a second command line, cd to the same bin directory and run the mongoimport command as shown below:

[Screenshot: mongoimport command]

Let this run for a few minutes; when it finishes, all the documents will have been imported into your collection.
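
The exact command used in the project is only available as a screenshot, so the following is a sketch of what a typical CSV import could look like, not the authors' command. The database name `safebooru` and collection name `images` are assumptions chosen for illustration, as are the helper names `build_mongoimport_command` and `run_import`; the `mongoimport` options used (`--db`, `--collection`, `--type`, `--headerline`, `--file`) are standard.

```python
import subprocess


def build_mongoimport_command(csv_path, db="safebooru", collection="images"):
    """Build the argument list for importing a CSV file with mongoimport.

    Assumption: the db and collection names are placeholders, not the
    names used in the project.
    """
    return [
        "mongoimport",
        "--db", db,                  # target database
        "--collection", collection,  # target collection
        "--type", "csv",             # input format
        "--headerline",              # use the first CSV line as field names
        "--file", str(csv_path),     # path to the dataset
    ]


def run_import(csv_path):
    """Run the import.

    Requires a running mongod on the default host and port, and
    mongoimport available on PATH (or run from the bin directory).
    """
    subprocess.run(build_mongoimport_command(csv_path), check=True)
```

With `mongod` running, `run_import("all_data.csv")` would start the import. Joining the list returned by `build_mongoimport_command` with spaces gives the equivalent command to type directly into the command line.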

OR

Install the Studio3T GUI to streamline importing the data and running queries.

Documentation: Studio3T GUI

Running Queries and Code

For the queries themselves and instructions on how to execute them, see the documentation report ProjectPhase2-29575774_40033295_26307647.pdf.

License: MIT License