theohbrothers / ConvertOneNote2MarkDown

Ready to make the step to Markdown and say farewell to OneNote, EverNote or whatever proprietary note-taking tool you are using? Nothing beats clear text, right? Read on!

Failed to Convert ... because specified file doesn't exist

med44600 opened this issue · comments

Bug

Failed to convert page: Nouvelle-section-1\6-9-2022---Expert-comptable. Exception: Error while converting docx file c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx to markdown file c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md. Exception: This command cannot be run due to the error: The system cannot find the file specified.
Convert-OneNotePage : Failed to convert page: Nouvelle-section-1\6-9-2022---Expert-comptable. Exception: Error while converting docx file
c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx to markdown file
c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md. Exception: This command cannot be run due to the error: The system cannot find the file specified.
At C:\Users\MédéricGUERIN\Downloads\ConvertOneNote2MarkDown-master\ConvertOneNote2MarkDown-master\ConvertOneNote2MarkDown-v2.ps1:1422 char:257
+ ... onConfigs | Convert-OneNotePage -OneNoteConnection $OneNote -Config $ ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Convert-OneNotePage

Expectation

Conversion from .docx to .md stops because the script cannot find the .docx file, even though it exists.

Discussion

It is strange because the conversion from OneNote to docx works. The docx file is exactly what it should be.
The file is correctly named, and the file reported as missing in the error message exists!
There seems to be no path-length error and no directory error.

The only thing is that I don't use the VS environment, because I have no other use for it and it would be heavy for my computer. I run the script directly from an admin PowerShell console. The log file mentions the PIA assembly: could that be the source of the error? How can I check whether it is missing and fix it (without installing the VS environment :) )?

Environment

-- Start of Logfile --

PS C:\Users\MédéricGUERIN\Downloads\ConvertOneNote2MarkDown-master\ConvertOneNote2MarkDown-master> .\ConvertOneNote2MarkDown-v2.ps1 -Verbose
Configuration:
dryRun: 0
notesdestpath: c:\notes
targetNotebook:
usedocx: 1
keepdocx: 2
docxNamingConvention: 1
prefixFolders: 1
mdFileNameAndFolderNameMaxLength: 60
medialocation: 1
conversion: markdown_mmd-simple_tables-multiline_tables-grid_tables+pipe_tables+task_lists-mmd_link_attributes-raw_html
headerTimestampEnabled: 1
keepspaces: 1
keepescape: 1
newlineCharacter: 2
exportPdf: 1
VERBOSE: The object written to the pipeline is an instance of the type "Microsoft.Office.Interop.OneNote.Application2Class" from the component's primary interop assembly (PIA). If this type exposes
different members than the IDispatch members, scripts written to work with this object might not work if the primary interop assembly is not installed.

Notebooks to convert:
Reprise

Converting notebook 'Reprise'... (Ignoring deleted notes)

Building conversion configuration for Reprise [Notebook]
# Building conversion configuration for Nouvelle section 1 [Section]
## Building conversion configuration for 6/9/2022 - Expert-comptable [Page]
## 6/9/2022 - Expert-comptable [Page]
VERBOSE: Uri: https://d.docs.live.net/1f9b287e04932cd7/Documents/Reprise/Nouvelle section 1.one/6/9/2022 - Expert-comptable
VERBOSE: Directory: c:\notes\Reprise\docx
VERBOSE: Directory: c:\notes\Reprise
VERBOSE: Directory: C:\Users\MDRICG~1\AppData\Local\Temp\Reprise\2023-08-02-18-16-42-0695980
VERBOSE: Directory: c:\notes\Reprise\Nouvelle-section-1
VERBOSE: Directory: c:\notes\Reprise\media
VERBOSE: Removing existing docx file: c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx
VERBOSE: Publishing new docx file: c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx
VERBOSE: Converting docx file to markdown file: \\?\c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md
VERBOSE: Command line: pandoc.exe -f docx -t markdown_mmd-simple_tables-multiline_tables-grid_tables+pipe_tables+task_lists-mmd_link_attributes-raw_html -i
c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx -o c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md
--wrap=none --markdown-headings=atx --extract-media=C:/Users/MDRICG~1/AppData/Local/Temp/Reprise/2023-08-02-18-16-42-0695980
Failed to convert page: Nouvelle-section-1\6-9-2022---Expert-comptable. Exception: Error while converting docx file c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx to markdown file c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md. Exception: This command cannot be run due to the error: The system cannot find the file specified.
Convert-OneNotePage : Failed to convert page: Nouvelle-section-1\6-9-2022---Expert-comptable. Exception: Error while converting docx file
c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx to markdown file
c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md. Exception: This command cannot be run due to the error: The system cannot find the file specified.
At C:\Users\MédéricGUERIN\Downloads\ConvertOneNote2MarkDown-master\ConvertOneNote2MarkDown-master\ConvertOneNote2MarkDown-v2.ps1:1422 char:257
+ ... onConfigs | Convert-OneNotePage -OneNoteConnection $OneNote -Config $ ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Convert-OneNotePage


Done converting notebook 'Reprise' with 1 notes.
Cleaning up...
Conversion errors:
Failed to convert page: Nouvelle-section-1\6-9-2022---Expert-comptable. Exception: Error while converting docx file c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx to markdown file c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md. Exception: This command cannot be run due to the error: The system cannot find the file specified.
Exiting.

-- End of Log File ---

Output of $PSVersionTable

PS > $PSVersionTable

Name                           Value
----                           -----
PSVersion                      5.1.22621.1778
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.22621.1778
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1

commented

This is strange. It seems the error is either with Start-Process or with pandoc (there is no stack trace because the original exception is not rethrown; this needs improvement).

What do you get when you execute the command line directly (identical to the one in the -Verbose logs above):

pandoc.exe -f docx -t markdown_mmd-simple_tables-multiline_tables-grid_tables+pipe_tables+task_lists-mmd_link_attributes-raw_html -i c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx -o c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md --wrap=none --markdown-headings=atx --extract-media=C:/Users/MDRICG~1/AppData/Local/Temp/Reprise/2023-08-02-18-16-42-0695980

versus the same command with a simplified --extract-media path, e.g.

pandoc.exe -f docx -t markdown_mmd-simple_tables-multiline_tables-grid_tables+pipe_tables+task_lists-mmd_link_attributes-raw_html -i c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx -o c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md --wrap=none --markdown-headings=atx --extract-media=C:/notes

If there's no error with the second one, then the error might have something to do with pandoc not being able to parse the ~ in C:/Users/MDRICG~1.

commented

Looking at this again, alternatively the problem might be with Start-Process.

The full path of your home directory is C:\Users\MédéricGUERIN, but the value of $env:TEMP is C:\Users\MDRICG~1\AppData\Local\Temp, so Start-Process -RedirectStandardError $stderrFile may be erroring out because PowerShell itself cannot resolve the path.

I'm not experienced enough with Windows to know why and when it uses the shortened home directory name (e.g. MDRICG~1 with the ~1 suffix instead of the full name), but my guess is that it is due to an old Windows NT compatibility convention of using only ASCII characters in the home directory name when the username contains non-ASCII characters (e.g. MédéricGUERIN). Regardless, such a path could cause hiccups in any application that does not recognize the shortened home directory.
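A minimal sketch of how a shortened (8.3) path such as the $env:TEMP value above could be expanded to its long form on Windows. It calls the Win32 GetLongPathName function through P/Invoke. The names Win32Sketch, PathApi and Resolve-LongPath are made up for this illustration and are not part of the script, and nothing in this thread confirms that the script or #168 works this way.

```powershell
# Expose the Win32 GetLongPathName function to PowerShell (Windows only).
# Win32Sketch.PathApi is a hypothetical type name used only in this sketch.
Add-Type -Namespace Win32Sketch -Name PathApi -MemberDefinition @'
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern uint GetLongPathName(string shortPath, System.Text.StringBuilder longPath, uint bufferSize);
'@

function Resolve-LongPath {
    param([Parameter(Mandatory)][string]$Path)

    $buffer = New-Object System.Text.StringBuilder 1024
    $length = [Win32Sketch.PathApi]::GetLongPathName($Path, $buffer, [uint32]$buffer.Capacity)

    # 0 means the call failed (e.g. the path does not exist); a value larger
    # than the capacity means the buffer was too small. In both cases this
    # sketch simply returns the input unchanged.
    if ($length -eq 0 -or $length -gt $buffer.Capacity) {
        return $Path
    }
    return $buffer.ToString()
}

# Example: show the long form of the temp directory
Resolve-LongPath -Path $env:TEMP
```

On a machine where $env:TEMP already holds a long path, the function returns it unchanged.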

commented

After a little reading: the problem is with the value of $env:TEMP, which uses the shortened MS-DOS 8.3 path that pandoc --extract-media doesn't seem to recognize. Credit goes to the last link.

Hints:

Let me open a PR to fix this.

commented

Could you try #168 and let me know if it works?

I tried #168 with the new conversion script but got the same error, so I went back to your direct tests with pandoc.exe.
The first ones were run in PowerShell and didn't work (as I said in the #168 issue), which was to be expected!
Pandoc works via the command line.
So I ran this command in an admin MS-DOS window and IT WORKS!

"C:\Program Files\Pandoc\pandoc.exe" -f docx -t markdown_mmd-simple_tables-multiline_tables-grid_tables+pipe_tables+task_lists-mmd_link_attributes-raw_html -i c:\notes\Reprise\docx\{76279337-9B48-0B09-34AF-30DA87D5B196}{1}{E185189248259008087511978443734194607593311}-1662474788.docx -o c:\notes\Reprise\Nouvelle-section-1\6-9-2022---Expert-comptable.md --wrap=none --markdown-headings=atx --extract-media=C:/notes/temp

It generates a good, fully converted .md file from the .docx source.

I finally found what was wrong: the PATH environment variable was not correctly set for pandoc.exe!
I installed pandoc via the .msi installer from the pandoc website, but it didn't set the PATH variable.
That's why I had to use the full path in your direct tests with pandoc.exe.

You can easily change it by searching for "environment" in the Start menu.
image
Don't forget to start a new PowerShell session to pick up the new PATH.

> $env:PATH
C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files\dotnet\;C:\Program Files\PowerShell\7\;C:\Program Files\Pandoc\;C:\Users\MédéricGUERIN\AppData\Local\Microsoft\WindowsApps
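The PATH check described above can be sketched in PowerShell as follows. Test-PathEntry is a hypothetical helper written for this illustration, and C:\Program Files\Pandoc is the install location reported in this thread; adjust it if pandoc lives elsewhere.

```powershell
# Hypothetical helper: report whether a directory is listed in a
# semicolon-separated PATH string (case-insensitive, trailing backslash ignored).
function Test-PathEntry {
    param(
        [Parameter(Mandatory)][string]$PathValue,
        [Parameter(Mandatory)][string]$Directory
    )
    $wanted = $Directory.TrimEnd('\')
    foreach ($entry in $PathValue.Split(';')) {
        if ($entry.Trim().TrimEnd('\') -ieq $wanted) {
            return $true
        }
    }
    return $false
}

$pandocDir = 'C:\Program Files\Pandoc'   # location reported in this thread
if (-not (Test-PathEntry -PathValue $env:PATH -Directory $pandocDir)) {
    # This changes PATH for the current session only; a permanent change
    # still has to be made in the Environment Variables dialog.
    $env:PATH = $env:PATH.TrimEnd(';') + ';' + $pandocDir
}
```

The session-only change is handy for a quick retry of the script without opening a new console.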

@leojonathanoh I also retried previous versions of your script (7cc35a8, "https://github.com/theohbrothers/ConvertOneNote2MarkDown/blob/docs/ci-cleanup-bug-report-issue-template/ConvertOneNote2MarkDown-v2.ps1")
and #165, and they all work fine.

Sorry for sending you on a search; the problem was in my environment.
Thanks for your help.

commented

Thank you for your research. It's odd that the script didn't actually validate that pandoc.exe is available in PATH, because it should. Let me find out what this is about.
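A minimal sketch of the kind of up-front check mentioned here, using Get-Command to confirm that an executable resolves through PATH before any conversion starts. Assert-CommandAvailable is a hypothetical name; this is not the validation the script actually contains.

```powershell
# Hypothetical pre-flight check: throw a readable error when an executable
# cannot be resolved through PATH, otherwise return its full path.
function Assert-CommandAvailable {
    param([Parameter(Mandatory)][string]$Name)

    $command = Get-Command -Name $Name -CommandType Application -ErrorAction SilentlyContinue |
        Select-Object -First 1
    if (-not $command) {
        throw "$Name was not found in PATH. Install it system-wide or add its folder to PATH, then start a new PowerShell session."
    }
    return $command.Path
}

# Usage: fail early with a clear message instead of a bare
# "file not found" error from Start-Process later on.
try {
    $pandocPath = Assert-CommandAvailable -Name 'pandoc.exe'
    Write-Verbose "Using pandoc at $pandocPath"
} catch {
    Write-Warning $_.Exception.Message
}
```

Checking before the first page is exported would have pointed at pandoc rather than at the .docx file, which is what made the error in this issue misleading.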

commented

It does seem that the pandoc .msi installer doesn't always set PATH, but a very reliable solution is outlined in jgm/pandoc#1054 (comment). Since pandoc must be available to Administrator for this script, pandoc must be installed system-wide (screenshot). At the end, the installer notifies the user that in some cases PowerShell or the computer must be restarted for PATH to be set correctly (screenshot).

To verify this, I installed pandoc on my machine using the .msi package. The installer set the environment variables correctly for both user-specific and system-wide installations, and no restart was needed. In some environments, however, a re-login or restart may be needed before the environment variables take effect.

Let me add the solution to improve on the docs. Thanks again 😄

Super job 👌
Actually, I had installed pandoc for the local user and not system-wide. That is where the original error came from!
Your script is awesome!
Now I have to figure out how to carry list tags over from OneNote to MD!

commented

Oh yes, list tags. I think they're part of the XML object of the page, but this script does not convert them to Markdown front matter or to references in the Markdown. Hope you manage to get it working 😃
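As a rough illustration of where tags sit in the page XML, the sketch below reads tag names from a small XML sample. It assumes the OneNote page schema, in which one:TagDef elements define tags and one:Tag elements reference them by index. The sample XML is hand-written for this example, Get-OneNoteTagName is a hypothetical helper, and none of this is taken from the script; real page XML would come from the OneNote COM object.

```powershell
# Hypothetical helper: list the names of the tags used on a page,
# given the page XML. Elements are matched by local name so the
# sketch does not depend on a particular namespace prefix.
function Get-OneNoteTagName {
    param([Parameter(Mandatory)][xml]$PageXml)

    # Map each TagDef index to its display name
    $names = @{}
    foreach ($def in $PageXml.SelectNodes("//*[local-name()='TagDef']")) {
        $names[$def.GetAttribute('index')] = $def.GetAttribute('name')
    }

    # Emit the name behind every Tag occurrence on the page
    foreach ($tag in $PageXml.SelectNodes("//*[local-name()='Tag']")) {
        $names[$tag.GetAttribute('index')]
    }
}

# Hand-written, trimmed-down sample for illustration only
$sample = @'
<one:Page xmlns:one="http://schemas.microsoft.com/office/onenote/2013/onenote">
  <one:TagDef index="0" name="To Do" />
  <one:Outline><one:OEChildren><one:OE>
    <one:Tag index="0" completed="false" />
    <one:T><![CDATA[Call the accountant]]></one:T>
  </one:OE></one:OEChildren></one:Outline>
</one:Page>
'@

Get-OneNoteTagName -PageXml $sample
```

The returned names could then be written into front matter or inline markers, which is the part the script does not do today.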

To get tags into the MD file, I made a OneTastic macro. It adds a Markdown marker to each tag so that the tags end up in the .docx file (during the first step of your script) before the conversion to MD.
It's quite simple but it works for me. It can be modified for each use case :)
image

commented

This is cool stuff, I didn't know you could do such things with OneTastic. Perhaps this could be added to the readme for people who want to use such an approach for preserving tags etc. 😄