
Expand the godoc of pkg/md to include a lot more context.

Also link to it from the godoc of the commands that use it.
Qi Xiao 1 year ago
parent
commit
810b3486d8
3 changed files with 126 additions and 12 deletions
  1. 9 0
      cmd/elvmdfmt/main.go
  2. 114 12
      pkg/md/md.go
  3. 3 0
      website/cmd/md2html/main.go

+ 9 - 0
cmd/elvmdfmt/main.go

@@ -1,3 +1,12 @@
+// Command elvmdfmt reformats Markdown sources.
+//
+// This command is used to reformat all Markdown files in this repo; see the
+// [contributor's manual] on how to use it.
+//
+// For general information about the Markdown implementation used by this
+// command, see [src.elv.sh/pkg/md].
+//
+// [contributor's manual]: https://github.com/elves/elvish/blob/master/CONTRIBUTING.md#formatting
 package main
 
 import (

+ 114 - 12
pkg/md/md.go

@@ -1,7 +1,19 @@
-// Package md implements a Markdown renderer.
+// Package md implements a Markdown parser.
 //
-// This package implements most of the CommonMark spec, with the following
-// omissions:
+// To use this package, call [Render] with one of the [Codec] implementations:
+//
+//   - [HTMLCodec] converts Markdown to HTML. This is used in
+//     [src.elv.sh/website/cmd/md2html], part of Elvish's website toolchain.
+//
+//   - [FmtCodec] formats Markdown. This is used in [src.elv.sh/cmd/elvmdfmt],
+//     used for formatting Markdown files in the Elvish repo.
+//
+// Another Codec for rendering Markdown in the terminal will be added in future.
+//
+// # Which Markdown variant does this package implement?
+//
+// This package implements a large subset of the [CommonMark] spec, with the
+// following omissions:
 //
 //   - "\r" and "\r\n" are not supported as line endings. This can be easily
 //     worked around by converting them to "\n" first.
@@ -9,9 +21,15 @@
 //   - Tabs are not supported for defining block structures; use spaces instead.
 //     Tabs in other context are supported.
 //
-//   - Only entities that are necessary for writing valid HTML (&lt; &gt;
-//     &quote; &apos; &amp;) are supported. This aspect can be controlled by
-//     overriding the UnescapeEntities variable.
+//   - Among HTML entities, only a few are supported: &lt; &gt; &quote; &apos;
+//     &amp;. This is because the full list of HTML entities is very large and
+//     will inflate the binary size.
+//
+//     If full support for HTML entities is desirable, this can be done by
+//     overriding the [UnescapeHTML] variable with [html.UnescapeString].
+//
+//     Note that numeric character references like 	 and   are fully
+//     supported.
 //
 //   - Setext headings are not supported; use ATX headings instead.
 //
@@ -19,13 +37,97 @@
 //
 //   - Lists are always considered loose.
 //
-// All other features are supported, with CommonMark spec tests passing; see
-// test file for which tests are skipped. The spec tests are taken from the HEAD
-// of the CommonMark spec in https://github.com/commonmark/commonmark-spec,
-// which may differ slightly from the latest released version.
+// These omitted features are never used in Elvish's Markdown sources.
+//
+// All implemented features pass their relevant CommonMark spec tests. See
+// [testutils_test.go] for a complete list of which spec tests are skipped.
+//
+// However, note that the spec tests were taken from the HEAD of the CommonMark
+// spec in https://github.com/commonmark/commonmark-spec on 2022-09-26. This is
+// almost the same as CommonMark 0.30 with one difference.
+//
+// # Is this package relevant if I don't contribute to Elvish?
+//
+// You may still find this package interesting for the following reasons:
+//
+//   - The implementation is small.
+//
+//     A rough test shows that including the code to render Markdown into HTML
+//     adds about 150KB to the binary size, while including just the parser of
+//     [github.com/yuin/goldmark] adds more than 1MB to the binary size. (The
+//     binary size increase depends on which packages the binary is already
+//     including though, so your mileage may vary.)
+//
+//   - The formatter implemented by [FmtCodec] is heavily fuzz-tested to ensure
+//     that it does not alter the semantics of the Markdown, as judged by the
+//     HTML output. It can correctly handle a lot of corner cases, such as
+//     not reformatting "* --" to "- --" (the latter becomes a thematic break).
+//     If you are writing a Markdown formatter, this can be interesting even if
+//     you are implementing it in a different language.
+//
+// # Why another Markdown implementation?
+//
+// The Elvish project uses Markdown in the documentation ("[elvdoc]") for the
+// functions and variables defined in builtin modules. These docs are then
+// converted to HTML as part of the website; for example, you can read the docs
+// for builtin functions and variables at https://elv.sh/ref/builtin.html.
+//
+// We used to use [Pandoc] to convert the docs from their Markdown sources to
+// HTML. However, we would also like to expand the elvdoc system in two ways:
+//
+//   - We would like to support elvdocs in user-defined modules, not just
+//     builtin modules.
+//
+//   - We would like to be able to read elvdocs directly from the Elvish
+//     program, without requiring a browser.
+//
+// With these requirements, Elvish itself needs to know how to parse and render
+// Markdown sources, so we need a Go implementation instead. There is a good Go
+// implementation, [github.com/yuin/goldmark], but it is quite large: linking it
+// into Elvish will increase the binary size by more than 1MB.
+//
+// By having a narrower focus, this package is much smaller than goldmark,
+// and can be easily optimized for Elvish's use cases. That said, the
+// functionalities provided by this package still try to be as general as
+// possible, and can potentially be used by other people interested in a small
+// Markdown implementation.
+//
+// Besides elvdocs, all the other content on the Elvish website (https://elv.sh)
+// is also converted to HTML using Pandoc; additionally, they are formatted with
+// [Prettier]. Now that Elvish has its own Markdown implementation, we can use
+// it not just to render elvdocs, but also to replace Pandoc and
+// Prettier. These external tools are decent, but using them still came with
+// some friction:
+//
+//   - Even though both are relatively easy to set up, they can still be a
+//     hindrance to casual contributors.
+//
+//   - Since different versions of the same tool can behave differently,
+//     we explicitly specify their versions in both CI configurations and
+//     [contributing instructions]. But this creates another problem: every time
+//     these tools release new versions, we have to manually bump the versions,
+//     and every contributor also needs to manually update them in their
+//     development environments.
+//
+// Replacing external tools with this package removes this friction.
+//
+// Additionally, this package is very easy to extend and optimize to suit
+// Elvish's needs:
+//
+//   - We used to customize Pandoc using a mix of shell scripts, templates and
+//     Lua scripts. While these customization options of Pandoc are well
+//     documented, they are not something people are likely to be familiar with.
+//
+//     With this implementation, everything is now done with Go code.
+//
+//   - The Markdown formatter is much faster than Prettier, so it's now feasible
+//     to run the formatter every time a Markdown file is saved.
 //
-// This package is not used anywhere in Elvish right now. It is intended to be
-// used for rendering the elvdoc of builtin modules inside terminals.
+// [testutils_test.go]: https://github.com/elves/elvish/blob/master/pkg/md/testutils_test.go
+// [elvdoc]: https://github.com/elves/elvish/blob/master/CONTRIBUTING.md#reference-docs
+// [Pandoc]: https://pandoc.org
+// [Prettier]: https://prettier.io
+// [CommonMark]: https://spec.commonmark.org
 package md
 
 //go:generate stringer -type=OpType,InlineOpType -output=zstring.go
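
The package doc above names two workarounds for its omissions: converting "\r" and "\r\n" line endings to "\n" before parsing, and using html.UnescapeString when full HTML entity support is wanted. The following is a minimal sketch of both using only the Go standard library. It does not import pkg/md; the names normalizeNewlines and unescapeHTML are hypothetical, and unescapeHTML is only a local stand-in that assumes the package's UnescapeHTML variable has the type func(string) string.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// newlineReplacer rewrites "\r\n" and lone "\r" to "\n". strings.NewReplacer
// compares old strings in argument order, so "\r\n" is tried before "\r" and
// a CRLF pair becomes a single "\n" rather than two.
var newlineReplacer = strings.NewReplacer("\r\n", "\n", "\r", "\n")

// normalizeNewlines is a hypothetical helper to run on Markdown source before
// handing it to a parser that only accepts "\n" line endings.
func normalizeNewlines(src string) string {
	return newlineReplacer.Replace(src)
}

// unescapeHTML is a local stand-in (an assumption, not the package's own
// variable) showing the kind of function one would assign to get full entity
// support: html.UnescapeString knows the complete list of named entities.
var unescapeHTML func(string) string = html.UnescapeString

func main() {
	src := "# Title\r\n\r\nOld-style line\rLast line\n"
	fmt.Printf("%q\n", normalizeNewlines(src))

	// &copy; is outside the small built-in set described in the package doc.
	fmt.Println(unescapeHTML("&copy; &lt;b&gt;"))
}
```

Pulling in html.UnescapeString links the full entity table into the binary, which is the size cost the package doc describes avoiding by default.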

+ 3 - 0
website/cmd/md2html/main.go

@@ -15,6 +15,9 @@
 //   - toc: Generate a table of contents
 //
 //   - number-sections: Generate section numbers for headings
+//
+// For general information about the Markdown implementation used by this
+// command, see [src.elv.sh/pkg/md].
 package main
 
 import (