realdennis / md2pdf

Offline markdown to pdf, choose -> edit -> transform 🥂

Home Page:https://md2pdf.netlify.com/

Use github.io for hosting to improve trust

hacklschorsch opened this issue · comments

It's nice that I could in principle run your code offline, but I only need a quick conversion of my markdown doc to a PDF without installing anything anywhere.

As it stands, this app is hosted by Netlify, and I don't know or have any reason to trust Netlify (or you). (not meaning to be rude)

I would find the online version of this much more trustworthy if it was a static github.io page.

  • I could be sure md2pdf.github.io actually runs the code I can see in the repo
  • I know github.io doesn't allow dynamic server side components guzzling up my sweet document and storing it for whoever's viewing pleasure
  • AFAIK github.io doesn't even have shitty analytics, which is great if you want this to be a GDPR conforming tool that people can also use at the workplace w/o having their ass busted by IT, legal, or whatever
  • You'd get continuous delivery (CD) for free.

That way, cool hackers that write everything in markdown instantly know you won't collect their data on your server - because (hopefully) you can't. Probably that's not even true, haven't looked into GH pages for a long while, and you always could come up with some way to collect your users' data. But it would make it harder.

Make it extra hard to collect data, even for people who use this software on their own servers. Do set CORS and whatever you can so people have to explicitly disable that if they want to run an instance that does collect data.

While you're at it, also remove that shitty Google Analytics, it smells like vanity and puts an otherwise cool tool to shame. Metadata is data too.

Let me answer; there are two versions of the answer:

Short ver.

You can actually clone and build this app on your own machine. It is a static page with only client-side logic, so no server is needed.

It's under the MIT License, so feel free to change it for your specific case, and send a PR if the patch is generally useful 😜

Long ver.

I could be sure md2pdf.github.io actually runs the code I can see in the repo

There's no single correct answer on where to host. When I created this project there was no convenient GitHub Actions setup to use (auto-pushing to a gh-pages branch, say),
so static hosting services such as netlify.app or vercel.app were really convenient for me: a merge into the master branch triggers them to install the dependencies, build, and release automatically.

But it's a cool idea, I'll take a look at it.


I know github.io doesn't allow dynamic server side components guzzling up my sweet document and storing it for whoever's viewing pleasure

Quick question: github.io doesn't allow running a dynamic server, so it would be safe?
No, definitely not. If a developer wants to collect or store your data, they can still make client-side calls with fetch or XHR to send it to another server.


AFAIK github.io doesn't even have shitty analytics, which is great if you want this to be a GDPR conforming tool that people can also use at the workplace w/o having their ass busted by IT, legal, or whatever

Netlify or any other third-party host might collect data, but only from the first server-side request, the moment you load the page: they might log your user agent or something like your geolocation. You can't know without seeing the HTTP server code, and the same is true of GitHub. Moreover, tracking and analytics can easily be done on the client side with a beacon.

The point is that md2pdf.netlify.app only runs my logic on the client side.

I hadn't considered the GDPR case; is it affecting you? If so, the quick workaround is to build your own copy.


You'd get continuous delivery (CD) for free.

Actually, what I use is also free, ha.


That way, cool hackers that write everything in markdown instantly know you won't collect their data on your server - because (hopefully) you can't.

Hmm, the best way to stay private and secure is to trust no one (not even me, not even GitHub), especially on the web: do a code review, look at your network panel, set up a sandbox.
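For anyone who wants to script the "look at your network panel" check, here is a minimal TypeScript sketch. It is not part of md2pdf; `auditFetch`, `FetchLike`, and `LoggedCall` are hypothetical names introduced only for illustration. It wraps a fetch-like function so that every outgoing call is recorded before being forwarded unchanged.

```typescript
// Minimal request shape; kept local so the sketch does not need DOM typings.
interface RequestLike {
  method?: string;
  body?: string;
}

// Any function that takes a URL and optional request options.
type FetchLike<R> = (url: string, init?: RequestLike) => R;

// One record per outgoing call.
interface LoggedCall {
  url: string;
  method: string;
  bodyLength: number; // number of characters in the request body
}

// Returns a wrapper that logs each call and then forwards it unchanged.
function auditFetch<R>(original: FetchLike<R>, log: LoggedCall[]): FetchLike<R> {
  return (url, init) => {
    const method = init && init.method ? init.method : "GET";
    const body = init && init.body ? init.body : "";
    log.push({ url: url, method: method, bodyLength: body.length });
    return original(url, init);
  };
}
```

A wrapper like this only sees calls that go through the wrapped function. XHR, `navigator.sendBeacon`, and plain image or script requests would bypass it, so the browser's network panel remains the more complete check.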


Do set CORS and whatever you can so people have to explicitly disable that if they want to run an instance that does collect data.

It's a client-side-only app, so I'm not sure what you mean by setting CORS.

A suggestion: always unplug the network or run in a sandbox when using an untrusted tool site.


While you're at it, also remove that shitty Google Analytics, it smells like vanity and puts an otherwise cool tool to shame. Metadata is data too.

Feel free to send a pull request to remove it. It was added when md2pdf was just a web toy for me and my friend:
https://github.com/realdennis/md2pdf/blob/master/public/index.html#L4-L17

Thank you for your prompt reply!

Short ver. - You can actually clone and build this app on your own machine [...]

Yes, thank you for that. It's a great piece of software that you and your friend put together, thank you for making it available for free. (as in freedom, not as in gratis)

It's a client-side-only app, so I'm not sure what you mean by setting CORS.

CORS (and today probably other technologies, I am not fully up to date with web tech) controls where a web app can connect to. I propose setting CORS headers that are as restrictive as possible (see POLA) and by so doing making it clear which domains the web app is allowed to communicate with. For an offline web app that should ideally be: none.

You'd get continuous delivery (CD) for free.

Actually, what I use is also free, ha.

I meant built-in to the service, but let's not split hairs ^^

I hadn't considered the GDPR case; is it affecting you? If so, the quick workaround is to build your own copy.

In the EU one is basically legally not allowed to use non-GDPR software in a professional setting IIRC. This is not widely understood, and mostly not enforced currently, which is a bit of a shame. GDPR is not perfect, but a good thing all in all IMHO.

If so, the quick workaround is to build your own copy.

That's not quick enough, I would instead just install pandoc. Yours is a very handy tool to save many people a little time. "Performance is work per time" it is said; if I build your tool myself, this means the performance greatly suffers.

Quick question: github.io doesn't allow running a dynamic server, so it would be safe?

It is much safer in my eyes, yes. Of course, there is no such thing as secure when it comes to computers.

Hi @hacklschorsch, I actually know a bit about CORS and GDPR. The point of my question is:

CORS: it's a client-side application. Whether it is hosted on GitHub or Netlify, this project is just JavaScript and HTML. If you mean the Access-Control-Allow-Origin header in a server response, you may have slightly misunderstood CORS/CORB.
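To make the CORS point concrete, here is a deliberately simplified TypeScript model of the check a browser applies to a cross-origin response, assuming a request without credentials. `pageMayReadResponse` is a hypothetical name, and real browsers do considerably more (preflights, credentials, header allow-lists). The point it illustrates is that the header comes from the server receiving the request, so the site hosting the page cannot use it to stop the page from sending data.

```typescript
// Simplified model: may a page read a cross-origin response?
// pageOrigin: origin of the page making the request.
// allowOriginHeader: Access-Control-Allow-Origin value sent by the TARGET
// server, or null if that server sent no such header.
function pageMayReadResponse(
  pageOrigin: string,
  allowOriginHeader: string | null
): boolean {
  if (allowOriginHeader === null) {
    return false; // target server did not opt in
  }
  if (allowOriginHeader === "*") {
    return true; // target server lets any origin read the response
  }
  return allowOriginHeader === pageOrigin; // exact origin match required
}
```

Even when this check fails, a simple request has typically already reached the target server; only reading the response is blocked. Restricting where a page may send requests is a job for a different mechanism.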

GDPR: as you know, it's a static web app, and any data collection happens on the client side, so my question is: are you not allowed to visit the site as it currently is? I believe you can see there's no server-side storage. The closest thing to a problem is probably GA, but I believe Google handles that by geolocation; again, I'm happy to remove it.

If you look at the repo description, you'll see the word Offline. That means no server is involved in the rendering and transform process, so you can simply shut the network off while using the tab. I guess that might be what you want, ha.

Thanks for loving this.

CORS: it's a client-side application. Whether it is hosted on GitHub or Netlify, this project is just JavaScript and HTML. If you mean the Access-Control-Allow-Origin header in a server response, you may have slightly misunderstood CORS/CORB.

This or the other.

No, definitely not. If a developer wants to collect or store your data, they can still make client-side calls with fetch or XHR to send it to another server.

I thought exactly this is what a CORS policy can prevent. But maybe it's CSP. Or X-XSS. You're the web developer 😸
Of course nothing is ever secure in IT. But I know the GitHub static pages lets a developer do very little besides hosting static pages in their CDN. Which is exactly right for a true "offline" web app.

Basically I mean: Not only make a client side / offline application, also codify that by preventing the app from communicating with the outside world. Make it offline by default, without telling me to pull the plug.

GDPR: Google Analytics is the problem here currently that I can see, if the app is not transferring anything else. If you want your use of GA to be legal in the EU, you would need to get consent from your visitors. I suggest not storing data and using no cookies at all; then you also don't have to get consent.

I believe you can see there's no server-side storage.

I haven't read the source, and I don't intend to. If the app doesn't require network access, make it clear by equipping it with proper policy directives.

Google turned up this: https://developers.google.com/web/fundamentals/security/csp#use_case_2_lockdown

By default, directives are wide open. If you don't set a specific policy for a directive, let's say font-src, then that directive behaves by default as though you'd specified * as the valid source (for example, you could load fonts from anywhere, without restriction).

You can override this default behavior by specifying a default-src directive. This directive defines the defaults for most directives that you leave unspecified.

And yes, it is CSP, not CORS, so you were right.
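A lockdown policy along the lines of that quote could be sketched in TypeScript as below. This is an illustration only: `buildCsp` is a hypothetical helper, and the list of sources is an assumption, since I have not checked which resources md2pdf actually loads (inline styles or blob URLs, for example, would need extra entries).

```typescript
// Directive name -> allowed sources, e.g. "'self'" or "'none'".
type CspDirectives = { [directive: string]: string[] };

// Serializes directives into a Content-Security-Policy value.
function buildCsp(directives: CspDirectives): string {
  return Object.keys(directives)
    .map((name) => name + " " + directives[name].join(" "))
    .join("; ");
}

// Assumed lockdown for a purely client-side app: deny everything by
// default, then allow only same-origin scripts, styles, and images.
// connect-src 'none' blocks fetch, XHR, and beacons; it repeats the
// default but makes the intent explicit.
const offlinePolicy = buildCsp({
  "default-src": ["'none'"],
  "script-src": ["'self'"],
  "style-src": ["'self'"],
  "img-src": ["'self'", "data:"],
  "connect-src": ["'none'"],
});

// The same policy as a <meta> tag, for hosts where response headers
// cannot be configured.
const metaTag =
  '<meta http-equiv="Content-Security-Policy" content="' + offlinePolicy + '">';
```

The resulting string is `default-src 'none'; script-src 'self'; style-src 'self'; img-src 'self' data:; connect-src 'none'`. Whoever writes the page also writes the policy, so this documents intent and narrows what injected code can do rather than proving anything about the author.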

CSP is more about preventing XSS attacks than about defending against a malicious user or developer (since I can add the policy, I can also leave myself a backdoor that bypasses it). You can see that github.com itself sends a CSP header, but github.io does not, although we can define one in a meta tag in the HTML. I don't think it's the first priority, since we make no third-party or service calls, don't store the markdown data, and don't apply query string values in JavaScript. But it's still a cool idea for later (yeah, like you, I don't like third-party services), thanks!

As for the migration to github.io, it's a possible milestone; again, I'll take a look at it, since GitHub Actions is more mature now than it was back then.

Again, I say security is a gradient, and using CSP et al. would improve it, not make it bulletproof. Nothing will ever be bulletproof in IT. But it's still worth investing a little time out of respect for your users and their data.

Strange. Seems my PR removing Google Analytics as per your suggestion has vanished. I will issue it again.

Feel free to send a pull request to remove that [GA tag]

You can now access the latest version at https://realdennis.github.io/md2pdf/
It's served from the gh-pages branch: https://github.com/realdennis/md2pdf/tree/gh-pages
It's built and deployed by GitHub Actions: https://github.com/realdennis/md2pdf/blob/master/.github/workflows/deploy.yaml#L37

#45
#44

Excellent, thank you! I have just confirmed https://realdennis.github.io/md2pdf/ works for me.