The Markdown File Converter was born out of a need for a more reliable and efficient way to convert Markdown files into PDF and HTML formats.
As a developer, I frequently encountered challenges with existing VSCode extensions for Markdown conversion. The primary issue was that these extensions often failed to convert Markdown files into PDFs with selectable text and proper styling tags. This limitation not only hindered readability but also affected the overall presentation of the documents.
Determined to find a solution, I embarked on creating a custom Markdown converter. This project aims to address the shortcomings of existing tools by ensuring high-quality conversions that maintain text selectability and styling fidelity in the output files. Our converter is not just a tool; it's a response to a real-world problem, offering a practical and user-friendly solution for developers and content creators alike.
Below is an overview of the converter's main functions, along with pseudocode that shows how each one works.
- Purpose: Converts a Markdown file to HTML.
- Algorithm:
  1. Read the Markdown file.
  2. Convert the Markdown content to HTML using `markdown_it`.
  3. Append CSS styling.
  4. Write the HTML content to a new file with the `.html` extension.
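
The Markdown to HTML steps can be sketched in Python as follows. This is a minimal sketch, not the project's actual code: the names `md_to_html` and `wrap_html` and the `css_path` parameter are illustrative assumptions, and it assumes the `markdown_it` module comes from the markdown-it-py package.

```python
from pathlib import Path


def wrap_html(body: str, css: str = "") -> str:
    """Wrap a rendered HTML fragment in a full page with inline CSS."""
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n"
        f"<style>\n{css}\n</style>\n</head>\n<body>\n{body}\n</body>\n</html>\n"
    )


def md_to_html(md_path: str, css_path: str = "Styles.css") -> Path:
    """Convert a Markdown file to an .html file placed next to it."""
    # Imported lazily so wrap_html stays usable without the dependency.
    from markdown_it import MarkdownIt  # third-party: markdown-it-py

    source = Path(md_path)
    body = MarkdownIt().render(source.read_text(encoding="utf-8"))

    # The stylesheet is optional in this sketch; a missing file means no CSS.
    css_file = Path(css_path)
    css = css_file.read_text(encoding="utf-8") if css_file.exists() else ""

    out_path = source.with_suffix(".html")
    out_path.write_text(wrap_html(body, css), encoding="utf-8")
    return out_path
```

Inlining the stylesheet in a `<style>` element keeps the HTML file self-contained, which may also simplify a later PDF conversion step.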
- Purpose: Converts a Markdown file to PDF.
- Algorithm:
  1. Check if the HTML file exists; if not, call `mdToHtml()`.
  2. Configure PDF conversion settings.
  3. Convert the HTML file to PDF using `pdfkit`.
  4. Handle exceptions during conversion.
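
A possible shape for the Markdown to PDF flow is shown below. It is a sketch under assumptions: `md_to_pdf`, `render_with_pdfkit`, and the `make_html` and `render` parameters are hypothetical names, the option values are examples only, and `make_html` stands in for `mdToHtml()`.

```python
from pathlib import Path
from typing import Callable, Optional

# wkhtmltopdf options passed through by pdfkit; these values are examples.
PDF_OPTIONS = {"page-size": "A4", "encoding": "UTF-8"}


def render_with_pdfkit(html_path: str, pdf_path: str) -> None:
    """Render an HTML file to PDF with pdfkit (requires wkhtmltopdf)."""
    import pdfkit  # third-party, imported lazily

    pdfkit.from_file(html_path, pdf_path, options=PDF_OPTIONS)


def md_to_pdf(
    md_path: str,
    make_html: Callable[[str], object],
    render: Callable[[str, str], None] = render_with_pdfkit,
) -> Optional[Path]:
    """Convert Markdown to PDF, creating the intermediate HTML if needed."""
    html_path = Path(md_path).with_suffix(".html")
    if not html_path.exists():
        make_html(md_path)  # stands in for mdToHtml()

    pdf_path = html_path.with_suffix(".pdf")
    try:
        render(str(html_path), str(pdf_path))
    except OSError as exc:
        # pdfkit reports a missing or failing wkhtmltopdf as IOError,
        # which is an alias of OSError in Python 3.
        print(f"PDF conversion failed: {exc}")
        return None
    return pdf_path
```

Passing the renderer as a parameter is a design choice of this sketch; it lets the existence check and error handling be exercised without wkhtmltopdf installed.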
- Purpose: Converts a PDF file to Markdown.
- Algorithm:
  1. Extract text from the PDF using `pdfminer`.
  2. Format the text into Markdown syntax.
  3. Write the formatted text to a new Markdown file.
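
The PDF to Markdown steps might look like the sketch below. The formatting rules are an assumption, since the README does not say how text is mapped to Markdown: here, page breaks become horizontal rules and extra blank lines are collapsed. The names `pdf_to_md` and `text_to_markdown` are hypothetical, and the sketch assumes the pdfminer.six distribution, which provides `pdfminer.high_level.extract_text`.

```python
import re
from pathlib import Path


def text_to_markdown(text: str) -> str:
    """Tidy raw extracted text into simple Markdown paragraphs."""
    # Drop leading/trailing whitespace, including a trailing form feed.
    text = text.strip()
    # Assumption: form feeds mark page breaks; show them as horizontal rules.
    text = text.replace("\x0c", "\n\n---\n\n")
    lines = [line.rstrip() for line in text.splitlines()]
    cleaned = "\n".join(lines)
    # Collapse three or more consecutive newlines into one blank line.
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip() + "\n"


def pdf_to_md(pdf_path: str) -> Path:
    """Extract text from a PDF and write it to a .md file next to it."""
    from pdfminer.high_level import extract_text  # third-party: pdfminer.six

    out_path = Path(pdf_path).with_suffix(".md")
    out_path.write_text(text_to_markdown(extract_text(pdf_path)), encoding="utf-8")
    return out_path
```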
- Purpose: Provides insights into the content of a file.
- Algorithm:
  1. Open and read the file content.
  2. Convert it to plain text if it is HTML.
  3. Calculate words, characters, tab spaces, and line breaks.
  4. Return these details.
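
The file analysis can be sketched with the standard library alone. Note the substitution: this sketch uses `html.parser` as a stand-in for `html2text` so that it runs without extra packages, and the function names and dictionary keys are illustrative assumptions.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collect text nodes and drop tags (style/script text is kept too)."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text content of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def count_details(text: str) -> dict:
    """Count words, characters, tabs, and line breaks in a string."""
    return {
        "words": len(text.split()),
        "characters": len(text),
        "tabs": text.count("\t"),
        "line_breaks": text.count("\n"),
    }


def details_of_file(path: str) -> dict:
    """Read a file, reduce HTML to plain text, and return its counts."""
    file = Path(path)
    content = file.read_text(encoding="utf-8")
    if file.suffix.lower() in {".html", ".htm"}:
        content = html_to_text(content)
    return count_details(content)
```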
- Start: The program initiates with the `main` function, which presents the user with a menu of options.
- User Input: The user selects an option to perform one of the following tasks:
  - Convert Markdown to HTML (`mdToHtml`)
  - Convert Markdown to PDF (`mdToPdf`)
  - Convert PDF to Markdown (`pdfToMd`)
  - Get file details (`detailsOfFile`)
- Processing:
  - If the user chooses to convert Markdown to HTML or PDF, the program first checks if the necessary files exist and then proceeds with the conversion.
  - For PDF to Markdown conversion, the program extracts text from the PDF and formats it into Markdown syntax.
  - If the user opts to get file details, the program analyzes the specified file and returns information like word count, character count, etc.
- Output Generation:
  - For conversions, the output is a new file in the chosen format (HTML, PDF, or Markdown) saved in the specified folder.
  - For file details, the output is a summary of the file’s contents displayed to the user.
- Loop for Continual Operation: The program loops back to the menu, allowing users to perform additional tasks or exit the program.
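
The menu loop described in this flow could be structured roughly as below. This is a sketch, not the code in `main.py`: `run_menu`, the `entries` mapping, the `read` parameter, and the use of `0` as the exit option are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Each menu key maps to a (label, handler) pair.
MenuEntry = Tuple[str, Callable[[], None]]


def run_menu(entries: Dict[str, MenuEntry], read: Callable[[str], str] = input) -> None:
    """Show the menu repeatedly and dispatch choices until the user exits."""
    while True:
        for key, (label, _) in entries.items():
            print(f"{key}. {label}")
        print("0. Exit")

        choice = read("Select an option: ").strip()
        if choice == "0":
            break
        if choice not in entries:
            print("Unknown option, please try again.")
            continue
        entries[choice][1]()  # call the handler, then loop back to the menu
```

In the real program the handlers would be the four functions named above, for example `{"1": ("Convert Markdown to HTML", ...)}` with a handler that calls `mdToHtml`.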
To ensure a smooth experience with our Markdown File Converter, it's important to understand the prerequisites and steps to run the program. Here's a guide:
Before running the program, make sure you have the following installed:
- Python 3.x: The programming language used to write the converter.
- Libraries: `markdown_it`, `pdfkit`, `html2text`, `pdfminer`, and `dotenv`. These can be installed via pip (the `markdown_it` and `dotenv` modules are published on PyPI as `markdown-it-py` and `python-dotenv`): `pip install markdown-it-py pdfkit html2text pdfminer python-dotenv`
- wkhtmltopdf: This is required for converting HTML to PDF. Download and install it from wkhtmltopdf.org.
Follow these steps to run the Markdown File Converter:
- Clone the Repository: Clone this repository to your local machine.
- Environment Variables: Set up environment variables for your folder path and filename in a `.env` file.
- Launch: Run the program by executing the main Python script: `python3 main.py`
- Use the Menu: Interact with the menu in the console to choose the desired conversion or file analysis operation.
- Check Output: The converted files or file details will be available in the specified folder or displayed in the console.
Understanding the architecture and organization of the Markdown File Converter is essential for developers who may want to contribute or customize the code. Here are insights into the program's structure:
The project is structured as follows:
Markdown-File-Converter/
│
├── main.py -> Main script to run the program
├── .env -> Environment variables file
├── Styles.css -> CSS file for styling the HTML output
├── README.md -> Documentation (you are here)
├── your_markdown_file.md -> Input Markdown file
└── your_output_file.pdf -> Output PDF file (generated by the program)
- `main.py`: The entry point of the program that handles user interactions and calls relevant functions.
- `.env`: Environment variables file containing folder path and filename.
- `Styles.css`: CSS file used to style the HTML output.
- `README.md`: This documentation file.
- `your_markdown_file.md`: Input Markdown file (replace with your own).
- `your_output_file.pdf`: Output PDF file (generated by the program).
- `mdToHtml`, `mdToPdf`, `pdfToMd`, and `detailsOfFile`: These functions perform specific tasks as described earlier.
- External libraries such as `markdown_it`, `pdfkit`, `html2text`, and `pdfminer` are utilized for Markdown processing, PDF generation, and text extraction.
- The program leverages environment variables to dynamically set input and output file paths.
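
Reading the folder path and filename from the environment might look like the sketch below. The variable names `FOLDER_PATH` and `FILENAME` and the function `load_paths` are hypothetical, since the README does not name the keys used in `.env`; the sketch assumes python-dotenv is what provides the `dotenv` module.

```python
import os
from pathlib import Path


def load_paths(env_file: str = ".env") -> Path:
    """Build the input file path from FOLDER_PATH and FILENAME."""
    try:
        from dotenv import load_dotenv  # third-party: python-dotenv

        load_dotenv(env_file)  # copies entries from the file into os.environ
    except ImportError:
        pass  # fall back to variables already present in the environment

    folder = os.environ.get("FOLDER_PATH")  # hypothetical key
    filename = os.environ.get("FILENAME")  # hypothetical key
    if not folder or not filename:
        raise RuntimeError("FOLDER_PATH and FILENAME must be set")
    return Path(folder) / filename
```

With this layout, a `.env` file would contain two lines such as `FOLDER_PATH=/path/to/folder` and `FILENAME=your_markdown_file.md`.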
This test checks the Markdown to HTML conversion using `Test1.md`. This file contains headings, lists, and subheadings. The aim is to ensure accurate conversion, maintaining structure and styling.
Steps:
- Input: `Test1.md` with complex Markdown.
- Convert to HTML using the Markdown File Converter.
- Check for proper structure, formatting, and styling.
- Verify readability and visual coherence.
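
Part of the structure check can be automated by counting the tags in the generated HTML and comparing them with what the Markdown source should produce. The sketch below uses only the standard library; `TagCounter` and `count_tags` are illustrative names and are not part of the converter.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each opening tag appears in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html: str) -> Counter:
    """Return a Counter mapping tag names to their number of occurrences."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.counts
```

For example, if `Test1.md` has one top-level heading and a two-item list, `count_tags` on the output should report one `h1` and two `li` tags. Styling and readability still need a visual check.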
This test case converts a complex Markdown file named `Test2.md` into HTML format using the Markdown File Converter. `Test2.md` contains a variety of Markdown elements, including headings, lists, sublists, and code blocks.
- Open the Markdown File Converter program.
- Select the option to convert Markdown to HTML.
- Specify the path to the `Test2.md` file when prompted.
- Observe the program's conversion process.
- `Test3.md` contains a complex Markdown document with various elements such as headings, lists, paragraphs, and more.
- The aim is to verify that the Markdown File Converter accurately converts this complex Markdown file to both HTML and PDF formats while preserving formatting and styling. Here the intermediate HTML file is generated but then deleted from the respective directory, because `mdToPdf` is invoked directly.
Steps:
- Input the `Test3.md` file into the Markdown File Converter.
- Choose the option to convert Markdown to HTML.
- Observe the HTML output for correctness, ensuring that all elements (headings, lists, paragraphs) are properly rendered and styled.
- Choose the option to convert Markdown to PDF.
- Examine the PDF output for accuracy, checking that text selection is enabled and the styling matches the Markdown document.