rid17pawar / CollegeCutoffExplorer

Our project's primary purpose is to provide consolidated information about engineering colleges, together with the caste-wise cut-off list for every available branch. We used the pdfplumber and openpyxl Python libraries for data extraction and Power BI reports to build the dashboard.

CollegeCutoffExplorer

YouTube Video Link: https://youtu.be/SwE4mxQxhEI

PPT Presentation: click here

Introduction

This project is designed for admission counseling departments that need to analyze cut-off data and generate customized lists of colleges and branches based on a student's percentage, location, and branch preferences. By eliminating the need to go through multiple lengthy PDF files containing college cut-off details, we aim to reduce counselors' workload and to improve student satisfaction and decision-making during the admission process. The project is developed exclusively for admission counseling departments in Maharashtra that handle student queries about admission cut-offs for engineering colleges in Maharashtra.
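The kind of customized list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the project itself builds these lists in Power BI, and the `Cutoff` record, its field names, and the `shortlist` function are hypothetical names introduced here, not part of the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cutoff:
    """One cut-off entry (hypothetical schema, assumed for illustration)."""
    college: str
    city: str
    branch: str
    category: str
    percentile: float


def shortlist(cutoffs, score, category, cities=None, branches=None, top_n=5):
    """Return up to top_n entries the student qualifies for, highest cut-off first."""
    eligible = [
        c for c in cutoffs
        if c.category == category
        and c.percentile <= score                   # student meets the cut-off
        and (not cities or c.city in cities)        # optional location preference
        and (not branches or c.branch in branches)  # optional branch preference
    ]
    return sorted(eligible, key=lambda c: c.percentile, reverse=True)[:top_n]
```

Passing `cities=None` or `branches=None` leaves that preference open, so a counselor can start with a broad list and narrow it step by step.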

About the technologies used in this project

We used the pdfplumber and openpyxl Python libraries, combined with regular expressions, to extract cut-off information from the PDF files. To prepare the data, we used the Power BI Power Query Editor for data cleaning, transformation, and merging. We then used Power BI reports to visualize the data.

Data extraction from PDF is the process of pulling relevant information out of PDF documents. PDF (Portable Document Format) is a widely used file format for storing and sharing documents, but it describes how a page looks rather than how its data is organized, so extracting data from it can be challenging. Extraction techniques automatically identify specific elements, such as text, tables, or images, and convert them into a structured format, such as a spreadsheet or a database. This is typically done with specialized software (for example, RPA tools like UiPath) or with programming scripts that analyze the PDF content and locate the desired data. Automated extraction is particularly useful when large amounts of data need to be processed and analyzed.
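As a minimal sketch of how pdfplumber, regular expressions, and openpyxl can fit together, the example below reads the text of each page, matches cut-off lines with a regular expression, and writes the matches to a workbook. The line layout (`<category code> <merit rank> (<percentile>)`), the pattern, and the function names are assumptions made for illustration; the layout of the actual cut-off PDFs and the project's own scripts may differ.

```python
import re

# Assumed (hypothetical) line layout: "<category code> <merit rank> (<percentile>)"
CUTOFF_LINE = re.compile(
    r"^(?P<category>[A-Z]+)\s+(?P<rank>\d+)\s+\((?P<percentile>\d+(?:\.\d+)?)\)$"
)


def parse_cutoff_lines(text):
    """Return (category, rank, percentile) tuples for every matching line."""
    rows = []
    for line in text.splitlines():
        match = CUTOFF_LINE.match(line.strip())
        if match:
            rows.append(
                (match["category"], int(match["rank"]), float(match["percentile"]))
            )
    return rows


def extract_pdf_to_xlsx(pdf_path, xlsx_path):
    """Parse every page of a PDF and save the matched rows to an Excel file."""
    # Imported here so the parser above works without these packages installed.
    import pdfplumber
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["category", "rank", "percentile"])
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            text = page.extract_text() or ""  # pages without text yield nothing
            for row in parse_cutoff_lines(text):
                sheet.append(row)
    workbook.save(xlsx_path)
```

Keeping the regular-expression parser separate from the file handling makes it easy to check the pattern against a few sample lines before running it over a full PDF.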

Data preparation follows the extract, transform, and load (ETL) process. Before loading the data for visualization, we transformed it so that it is well organized, user-friendly, properly formatted, and validated. This improves data quality and guards against issues such as unexpected duplicates, null values, incompatible formats, and incorrect indexing.
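The project performs these steps in the Power Query Editor. Purely to illustrate the same checks in code, here is a small Python sketch that drops rows with null values, rejects percentiles in an incompatible format, normalizes text fields, and removes duplicates. The four-column row layout and the `clean_rows` name are assumptions, not the project's actual schema.

```python
def clean_rows(rows):
    """Drop incomplete rows, normalize formats, and remove exact duplicates.

    Each row is assumed to be (college, branch, category, percentile).
    """
    seen = set()
    cleaned = []
    for college, branch, category, percentile in rows:
        if None in (college, branch, category, percentile):
            continue  # skip rows with null values
        try:
            value = float(percentile)
        except (TypeError, ValueError):
            continue  # skip percentiles in an incompatible format
        key = (college.strip(), branch.strip(), category.strip().upper(), value)
        if key not in seen:  # skip unexpected duplicates
            seen.add(key)
            cleaned.append(key)
    return cleaned
```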

Data visualization is the process of representing information visually. It plays a vital role in data analysis because graphs and charts communicate complex data clearly and concisely, making it easier to understand and interpret. The insights gained this way support better decision-making. In our case, we combined several visualizations into a Power BI report.

System Architecture

System_Diagram

Technologies Used-

1. Front end Technologies:

  • Power BI Desktop
  • Power Query Editor

2. Back end Technologies:

  • Python Libraries,
    • pdfplumber
    • openpyxl

Snapshots-

Power BI Reports

PowerBI_Report_TopN

PowerBI_Report_Details

Thank you!

Languages

Language:Python 100.0%