Data Visualizer
Data Visualizer is used only to visualize the dataset due to
(Requirement) Please install an SQL-based database (MySQL or SQLite) before running the scripts provided in this repo. If you have already installed the libraries listed in requirements.txt from the main project, no other libraries need to be installed.
This sub-repo is a simple web application, built with Python and Flask, for visualizing the dataset provided in the JSON files.
Setting up the database: Open the file db_cursor.py and update the database details to point to where you want to create and store the data. This connection is used throughout the application for accessing data.

Inserting into the database: Run the file create_and_insert_into_db.py to create the tables and insert the data. This may take a while, since 450K records are being inserted. Remember to specify the path to the JSON files in the script before executing it.

Visualize Data: Copy the train/val/test images to the static folder, or create a symlink to the images there. Loading the entire dataset in one go is inefficient and time-consuming, so the visualizer uses simple pagination. The code is configured to show 100 entries per page; to change this, edit the per_page parameter in the visualizer files. To visualize training/validation data, use train_val_visualizer.py; to view test data, use test_visualizer.py. The exact commands to run from a terminal are:

Train Data - python train_val_visualizer.py -m train
Val Data - python train_val_visualizer.py -m val
Test Data - python test_visualizer.py
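The insertion step described above can be slow when rows are written one at a time. The following is a minimal sketch of batched insertion, not the actual contents of create_and_insert_into_db.py. It assumes SQLite via Python's built-in sqlite3 module, and the table name (samples), the column names (image_path, label), and the record layout are all hypothetical placeholders for whatever the JSON files really contain.

```python
import sqlite3


def insert_records(conn, records, batch_size=1000):
    """Insert records in batches and return the number of rows written.

    `records` is a list of dicts with the hypothetical keys
    "image_path" and "label"; adapt these to the real JSON schema.
    """
    # Hypothetical table layout, created only if it does not exist yet.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "id INTEGER PRIMARY KEY, image_path TEXT, label TEXT)"
    )
    rows = [(r["image_path"], r["label"]) for r in records]
    for start in range(0, len(rows), batch_size):
        # executemany sends one batch per call instead of one row per call.
        conn.executemany(
            "INSERT INTO samples (image_path, label) VALUES (?, ?)",
            rows[start:start + batch_size],
        )
        # Commit once per batch so a failure loses at most one batch.
        conn.commit()
    return len(rows)


if __name__ == "__main__":
    # In-memory database and made-up records, for demonstration only.
    demo_conn = sqlite3.connect(":memory:")
    demo = [{"image_path": f"train/{i}.jpg", "label": "demo"} for i in range(5)]
    print(insert_records(demo_conn, demo, batch_size=2))
```

In a real run, the records would come from json.load on the dataset files, and the connection would be the one configured in db_cursor.py.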
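The pagination mentioned above can be illustrated with a LIMIT/OFFSET query. This is only a sketch under assumptions, not the code in the visualizer files: it uses SQLite through the built-in sqlite3 module, a hypothetical samples table with hypothetical id, image_path, and label columns, and the helper names fetch_page and page_count are invented for illustration. Only the default of 100 entries per page comes from the description above.

```python
import sqlite3

PER_PAGE = 100  # mirrors the per_page default described above


def fetch_page(conn, page, per_page=PER_PAGE):
    """Return the rows for a 1-based page number from the hypothetical table."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    offset = (page - 1) * per_page
    # ORDER BY keeps the page contents stable between requests.
    cursor = conn.execute(
        "SELECT id, image_path, label FROM samples "
        "ORDER BY id LIMIT ? OFFSET ?",
        (per_page, offset),
    )
    return cursor.fetchall()


def page_count(total_rows, per_page=PER_PAGE):
    """Number of pages needed to show total_rows entries (ceiling division)."""
    return (total_rows + per_page - 1) // per_page
```

With this scheme, only one page of rows is read per request, so memory use and response time stay roughly constant regardless of how large the table is. For very deep pages, filtering on the last seen id instead of using OFFSET is a common alternative.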