This system is built to handle the seamless ingestion, storage, and retrieval of extensive log data. It consists of a Log Ingestor, which manages the acceptance of log data via HTTP, and a Query Interface that empowers users to conduct full-text searches and apply filters to various log attributes.
- Programming Language: Python
- Database: MySQL
- Technologies: Kafka, Kafka REST Proxy, Kafka Schema Registry
- Frontend: HTML, CSS, JavaScript
- Backend: Flask
Log Ingestor: Ingests logs in the provided JSON format via HTTP on port 3000, scales to handle high log volumes, and optimizes I/O operations and database write speed.
Query Interface: Offers a user-friendly interface (Web UI/CLI) for full-text search, with filters for level, message, resourceId, timestamp, traceId, spanId, commit, and metadata.parentResourceId, backed by efficient search algorithms for quick results.
Advanced Features (Bonus): Search within specific date ranges, regular-expression search, combining multiple filters for precise queries, real-time log ingestion and searching, and role-based access control to the query interface. (The query-translation sketch under the Log Search Service below shows how the date-range, regex, and combined filters can map to SQL.)
Log Ingestor I - Log Produce Service: Uses an HTTP server to receive logs, parses the incoming JSON, and publishes it to a Kafka topic.
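Although the stack lists the Kafka REST Proxy, a minimal sketch of the produce service using the kafka-python client could look like the following; the broker address (localhost:9092), the topic name (logs), and the route are assumptions for illustration, not values taken from the repository:

```python
# produce_service.py -- illustrative sketch, not the repository's actual code
import json

from flask import Flask, jsonify, request
from kafka import KafkaProducer  # pip install kafka-python

app = Flask(__name__)

# Assumed broker address and topic name; adjust to your deployment.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# The root route matches the curl example below, which POSTs to localhost:3000.
@app.route("/", methods=["POST"])
def ingest():
    log = request.get_json(silent=True)
    if log is None:
        return jsonify({"error": "expected a JSON body"}), 400
    # Hand the log to Kafka and return immediately; the consume
    # service persists it to MySQL asynchronously.
    producer.send("logs", log)
    return jsonify({"status": "queued"}), 202

if __name__ == "__main__":
    app.run(port=3000)
```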
Log Ingestor II - Log Consume Service: Subscribes to the Kafka topic, consumes logs, and stores them in the primary read database instance.
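A correspondingly minimal consume-service sketch, assuming the same illustrative topic and a MySQL table named logs (credentials and column names are placeholders):

```python
# consume_service.py -- illustrative sketch, not the repository's actual code
import json

import mysql.connector  # pip install mysql-connector-python
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "logs",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
db = mysql.connector.connect(
    host="localhost", user="root", password="root", database="logs_db"
)
cursor = db.cursor()

INSERT_SQL = """
    INSERT INTO logs (level, message, resourceId, timestamp,
                      traceId, spanId, commit_hash, parentResourceId)
    VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
"""

# Block forever, writing each consumed log to the read database.
# Timestamps arrive as ISO-8601 strings; validation/conversion is
# omitted here for brevity.
for record in consumer:
    log = record.value
    cursor.execute(INSERT_SQL, (
        log.get("level"),
        log.get("message"),
        log.get("resourceId"),
        log.get("timestamp"),
        log.get("traceId"),
        log.get("spanId"),
        log.get("commit"),
        (log.get("metadata") or {}).get("parentResourceId"),
    ))
    db.commit()
```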
Query Interface - Log Search Service: Provides a user interface for search and filtering, translates user queries into database queries, and relies on optimized indexing for faster search results.
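The filter-to-SQL translation might look roughly like this sketch (column names are assumptions; the repository's real query code may differ). It also covers the bonus date-range, regex, and combined-filter searches:

```python
# search_service.py -- illustrative filter-to-SQL translation
ALLOWED_FIELDS = ("level", "resourceId", "traceId", "spanId",
                  "commit_hash", "parentResourceId")

def build_query(filters: dict):
    """Turn a dict of UI filters into a parameterized MySQL query."""
    clauses, params = [], []
    # Exact-match filters; the whitelist keeps user input out of the
    # SQL text itself, so only values travel as parameters.
    for field in ALLOWED_FIELDS:
        if field in filters:
            clauses.append(f"{field} = %s")
            params.append(filters[field])
    if "message" in filters:        # substring / full-text-style match
        clauses.append("message LIKE %s")
        params.append(f"%{filters['message']}%")
    if "regex" in filters:          # bonus: regular-expression search
        clauses.append("message REGEXP %s")
        params.append(filters["regex"])
    if "start" in filters and "end" in filters:  # bonus: date-range search
        clauses.append("timestamp BETWEEN %s AND %s")
        params.extend((filters["start"], filters["end"]))
    where = " AND ".join(clauses) or "1=1"
    return f"SELECT * FROM logs WHERE {where} ORDER BY timestamp DESC", params
```

Usage would be along the lines of `sql, params = build_query(request.args.to_dict())` followed by `cursor.execute(sql, params)`.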
Database Structure:
- MySQL (Relational Database): Stores structured log data, optimized for structured queries and joins.
- NoSQL Database (e.g., Elasticsearch): Would facilitate efficient full-text search and complex queries (to be implemented).
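One plausible shape for the logs table (names, types, and indexes are assumptions rather than the project's actual schema; commit is stored as commit_hash here to sidestep the SQL keyword):

```sql
-- Illustrative schema; secondary indexes back the filterable attributes,
-- and the FULLTEXT index supports full-text search on the message field.
CREATE TABLE logs (
    id               BIGINT AUTO_INCREMENT PRIMARY KEY,
    level            VARCHAR(16),
    message          TEXT,
    resourceId       VARCHAR(64),
    timestamp        DATETIME,
    traceId          VARCHAR(64),
    spanId           VARCHAR(64),
    commit_hash      VARCHAR(40),
    parentResourceId VARCHAR(64),
    INDEX idx_level (level),
    INDEX idx_resource (resourceId),
    INDEX idx_timestamp (timestamp),
    FULLTEXT INDEX idx_message (message)
);
```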
Scalability and Performance:
- Caching Mechanism: Caches frequently accessed data to cut database round trips.
- Load Balancing: Distributes incoming requests across multiple servers for better throughput.
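The caching strategy itself is not pinned down here; as one minimal sketch, an in-process LRU cache can absorb repeated queries (a production deployment would more likely put Redis or memcached in front of MySQL):

```python
# cache_layer.py -- illustrative read-through cache for hot queries
from functools import lru_cache

import mysql.connector

db = mysql.connector.connect(
    host="localhost", user="root", password="root", database="logs_db"
)

@lru_cache(maxsize=1024)
def cached_search(sql: str, params: tuple):
    # lru_cache keys on the arguments, so params must be hashable (a tuple).
    cursor = db.cursor(dictionary=True)
    cursor.execute(sql, params)
    rows = cursor.fetchall()
    cursor.close()
    return rows
```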
How to Run the Project:
- Clone the repository: git clone https://github.com/anushkaspatil/Log-Ingestor-and-Query-Interface.git
- Navigate to the project directory: cd Log-Ingestor-and-Query-Interface
- Run: docker-compose up -d
- Wait 1-2 minutes while Kafka and MySQL finish creating their resources.
- To ingest logs from the UI, open http://localhost:3000/consumer in a browser.
- To start the consumer service, open http://localhost:3000/consumer.
- To search logs, open http://localhost:3000/search in a browser.
- You can also POST JSON data directly to the HTTP endpoint:
curl --location 'localhost:3000' \
--header 'Content-Type: application/json' \
--data '{
"level": "error",
"message": "Failed to connect to DB",
"resourceId": "server-1234",
"timestamp": "2023-09-15T08:00:00Z",
"traceId": "abc-xyz-123",
"spanId": "span-456",
"commit": "5e5342f",
"metadata": {
"parentResourceId": "boy server-0987"
}
}'
Optimization: Continuously optimize database queries and indexing strategies for better performance.
- Volume: Efficiently handles massive log volumes.
- Speed: Provides quick search results.
- Scalability: Adaptable to increasing log volumes and query rates.
- Usability: Offers an intuitive interface for users.
- Advanced Features: Implements the bonus functionalities.
- Readability: Maintains a clean and structured codebase.
This system effectively manages log data ingestion and provides a seamless query interface for users to retrieve specific logs based on various attributes. Continuous improvements can enhance its performance and capabilities.
This project was developed with inspiration from, and reference to, existing sources.