codsen / codsen

a monorepo of npm packages

[detergent] How to use it with MDX?

ajitid opened this issue

commented

Package's name
detergent

Describe the situation
I'm not sure how familiar you are with MDX and Next.js, so briefly: MDX takes a superset of Markdown and converts it into React components. I'm trying to figure out a way to use detergent with MDX.

MDX accepts two kinds of plugins: remark plugins for Markdown and rehype plugins for HTML. I believe both of them work on an AST when parsing and transforming, rather than on raw text, which makes it non-trivial to use detergent with them.

Hi Ajit!

Sorry for the late response.

As I see it, detergent has three purposes: 1. fix English style; 2. remove invisible characters; and 3. strip some or all HTML tags. For MDX, only the first one, fixing English style, is relevant, right? Generally, if MDX is authored in VS Code, invisible characters get highlighted anyway, and HTML stripping is a niche task.

I've recently released remark-typography, try that. It's bundled and shipped as CJS, so it should be no problem to require(); I tested it on Remix and Gatsby. The readme on codsen.com is not ready yet, but the plugin works: it actually fixes the typography on that same website.

commented

Happy New Year! I was anticipating a late reply anyway because it was the end of the year, so I don't mind at all.

I've recently released remark-typography, try that.

This is exactly what I wanted. Thank you! It worked for me in Next.js as well.

commented

Hey! One issue I can see is that remark-typography might not play well alongside plugins that do title casing or assign IDs to headings. These plugins usually split words on the assumption that there is always a regular space between them, not the non-breaking space that remark-typography sometimes inserts, so:

  • A title case plugin might render "See Changes Done in a Merge Commit to Resolve conflicts" rather than "See Changes Done in a Merge Commit to Resolve Conflicts" (notice the last word is lowercase: it is treated as a widow and joined to the previous word with a non-breaking space, so the plugin sees the two as one word)
  • A plugin that generates slugs would create #see-changes-done-in-a-merge-commit-to-resolveconflicts rather than #see-changes-done-in-a-merge-commit-to-resolve-conflicts (notice the missing dash before the last word)
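
To make the two failure modes above concrete, here is a minimal sketch in TypeScript. The helpers `naiveSlug`, `naiveTitleCase` and `normalizeSpaces` are hypothetical stand-ins, not the code of any real plugin; the only assumption is that words are split on a literal space and that characters outside letters, digits and dashes are dropped.

```typescript
const NBSP = "\u00a0";

// Hypothetical slugger: lowercases, turns literal spaces into dashes,
// then drops every character that is not a-z, 0-9 or a dash.
function naiveSlug(title: string): string {
  return title
    .toLowerCase()
    .replace(/ /g, "-")
    .replace(/[^a-z0-9-]/g, "");
}

// Hypothetical title-caser: splits on a literal space only and
// capitalises the first letter of each chunk.
function naiveTitleCase(title: string): string {
  return title
    .split(" ")
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}

// Turning no-break spaces back into regular spaces restores word boundaries.
function normalizeSpaces(title: string): string {
  return title.replace(/\u00a0/g, " ");
}

// The last two words are joined by a no-break space.
const heading = `to resolve${NBSP}conflicts`;

console.log(naiveSlug(heading)); // "to-resolveconflicts"
console.log(naiveSlug(normalizeSpaces(heading))); // "to-resolve-conflicts"
console.log(naiveTitleCase(normalizeSpaces(heading))); // "To Resolve Conflicts"
```

Without normalisation, `naiveTitleCase(heading)` leaves "conflicts" in lowercase, because "resolve" and "conflicts" arrive as a single chunk.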

I think we should mention in the readme how to avoid such cases. This is what I do:

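import remarkTypography from "remark-typography";
import { visit } from "unist-util-visit";

// Remark plugin: turns non-breaking spaces in headings back into regular
// spaces, so that title-case and slug plugins see normal word boundaries.
const removeNonBreakingSpaceFromHeadings = () => {
  return (tree: any) => {
    visit(tree, "text", (node, _, parent) => {
      // Only text nodes that are direct children of a heading are touched.
      if (parent?.type === "heading")
        node.value = node.value.replace(/\u00a0/g, " ");
    });
  };
};

const mdxOptions = {
  // Order matters: the cleanup has to run after remark-typography.
  remarkPlugins: [remarkTypography, removeNonBreakingSpaceFromHeadings],
};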

I couldn't figure out the right type for tree here; if you can, that would be better.
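
On the typing question: the tree that remark plugins receive is an mdast tree, so the `Root` type from the `mdast` type definitions (`@types/mdast`) is the likely candidate; treat that as an assumption to check against your installed versions. The sketch below avoids external dependencies by declaring a minimal, hypothetical node shape of its own (`TreeNode`). It also walks nested nodes, so text inside emphasis or links within a heading is covered too, whereas a `parent?.type === "heading"` check only matches direct children.

```typescript
// Minimal, hypothetical node shape standing in for the real mdast types.
interface TreeNode {
  type: string;
  value?: string;
  children?: TreeNode[];
}

// Replace no-break spaces in every text node at or below `node`.
function replaceNbspInText(node: TreeNode): void {
  if (node.type === "text" && typeof node.value === "string") {
    node.value = node.value.replace(/\u00a0/g, " ");
  }
  for (const child of node.children ?? []) replaceNbspInText(child);
}

// Walk the tree; once a heading is found, clean everything inside it.
function normalizeHeadingSpaces(tree: TreeNode): void {
  if (tree.type === "heading") {
    replaceNbspInText(tree);
    return;
  }
  for (const child of tree.children ?? []) normalizeHeadingSpaces(child);
}

// Same plugin shape as before: a function returning a tree transformer.
const removeNbspFromHeadings = () => (tree: TreeNode): void => {
  normalizeHeadingSpaces(tree);
};

// Usage on a hand-built tree: one heading with nested emphasis, one paragraph.
const sampleTree: TreeNode = {
  type: "root",
  children: [
    {
      type: "heading",
      children: [
        { type: "text", value: "to " },
        {
          type: "emphasis",
          children: [{ type: "text", value: "resolve\u00a0conflicts" }],
        },
      ],
    },
    {
      type: "paragraph",
      children: [{ type: "text", value: "body\u00a0text" }],
    },
  ],
};

removeNbspFromHeadings()(sampleTree);
```

After the call, the heading's nested text reads "resolve conflicts" with a regular space, while the paragraph keeps its no-break space.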

Very good point! MDX plugins could and should be aware of the semantics. I'll have a look tonight, or at the weekend at the latest. Thank you for the suggestion. I'll review everything properly.

@ajitid I raised a PR for gouch/to-title-case#26 to support no-break spaces. For the record, that package does not follow current best practice: it mutates the String prototype (maybe that was acceptable circa 2003, but not today in 2023). In theory, an npm package should simply export a function which takes a string and returns a string. If I were you, for titles, I'd switch to a more modern alternative, for example, https://www.npmjs.com/package/ap-style-title-case by wooorm or title by the Vercel team.
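
As a rough illustration of the "string in, string out" shape, here is a minimal sketch. The function `titleCase` and its short small-word list are hypothetical and far simpler than the packages named above (for instance, it lowercases acronyms); the point is only that nothing on `String.prototype` is touched and that a no-break space counts as a word separator.

```typescript
// Hypothetical, deliberately short list of words kept in lowercase.
const SMALL_WORDS = new Set(["a", "an", "and", "in", "of", "on", "the", "to"]);

// Pure function: takes a string, returns a new string, mutates nothing.
function titleCase(input: string): string {
  // The capture group keeps the separators in the result, so regular and
  // no-break spaces are preserved exactly where they were.
  const parts = input.split(/([ \u00a0]+)/);
  let wordIndex = 0;
  return parts
    .map((part) => {
      // Separators and empty chunks pass through unchanged.
      if (part === "" || /^[ \u00a0]+$/.test(part)) return part;
      const isFirstWord = wordIndex === 0;
      wordIndex += 1;
      const lower = part.toLowerCase();
      // Small words stay lowercase unless they open the title.
      if (!isFirstWord && SMALL_WORDS.has(lower)) return lower;
      return lower.charAt(0).toUpperCase() + lower.slice(1);
    })
    .join("");
}

const cased = titleCase(
  "see changes done in a merge commit to resolve\u00a0conflicts"
);
console.log(cased === "See Changes Done in a Merge Commit to Resolve\u00a0Conflicts"); // true
```

Because the separators are kept as they are, the no-break space that prevents the widow survives while the last word is still capitalised.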

I'm still working on our plugin.

I've also started a discussion on rehype-slug. I actually need that myself; I solved it, but the solution is complex and not elegant. If we could force rehype-slug to treat raw no-break spaces as whitespace, that would solve 50% of the problem (the other 50% is fixing title casing).

It's been a dead end, so I'll close this, but don't hesitate to create a new ticket if you see something. Sorry about this.