Edit syntax diagrams

How to edit syntax diagrams

In this kind of work, the nature of your documentation change means that you'll need to modify existing syntax diagrams or add new ones.

The following covers diagram maintenance and creation for the YSQL documentation. The YCQL documentation still uses an old method for this. You must take advice from colleagues if you need to work on YCQL diagrams.

After you understand this area, modifying existing syntax diagrams or adding new ones will seem to be very straightforward. However, you must embrace a fairly large mental model. This, in turn, depends on several terms of art. All this is covered in the dedicated section Creating and maintaining syntax diagrams for YSQL Docs.

Here is a terse step-by-step summary:

Get set up

  1. Download the latest RRDiagram JAR file (rrdiagram.jar). See the section on the diagram generator.

  2. Run the generator. This regenerates every diagram file that is jointly specified by the diagram definition file and the set of all free-standing generated grammar-diagram pairs. Of course, not all of these newly generated md files will differ from their previous versions; git status will show you only the ones that actually changed.

Modify or add content pages or syntax rules

The nature of your documentation change will determine which selection of the following tasks you will need to do and which of these you will do during each local edit-and-review cycle.

Very occasionally, you might want to reorganize the hierarchical structure of a part of the overall documentation as the user sees it. (This is the hierarchy that you see and navigate in the left-hand navigation panel.) Such changes involve moving existing humanly typed documentation source file(s) in the directory tree that starts at the content directory. If you do this, then you must use your favorite editor (a generic plain text editor is best for this purpose) to do manually driven global search and replace to update URL references to moved files at their old locations. However, because you will do this only in the scope of the files that you worked on (most likely, the /preview/ subtree), you must also establish URL redirects in each moved file to avoid breaking links in other Yugabyte Internet properties, or in third party sites. The frontmatter allows this easily. Here is an example.

title: SELECT statement [YSQL]
headerTitle: SELECT
linkTitle: SELECT
description: Use the SELECT statement to retrieve rows of specified columns that meet a given condition from a table.
menu:
  preview:
    identifier: dml_select
    parent: statements
aliases:
  - /preview/api/ysql/commands/dml_select/

The aliases page property allows a list of many URLs. Notice that these are relative to the content directory. The .md file in this example used to be here:

<your path>/yugabyte-db/docs/content/preview/api/ysql/commands/dml_select.md

It was moved to here:

<your path>/yugabyte-db/docs/content/preview/api/ysql/the-sql-language/statements/dml_select.md
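The manual search for references to a moved file's old URL can be supported from the command line. The following is only a sketch, not an established project tool: find_old_refs is a hypothetical helper name, and it assumes a grep that supports -r and --include (GNU and BSD grep both do). It lists the .md files under a subtree that still mention the old path, so that you can then update each one in your editor.

```shell
# Hypothetical helper (not part of the Yugabyte tooling): list the .md files
# under a subtree that still contain the old URL path of a moved file.
find_old_refs() (
  old_path="$1"   # for example: /preview/api/ysql/commands/dml_select
  root="$2"       # for example: content/preview
  # -r recurse, -l print file names only, -F treat the path as a fixed string
  grep -rlF --include='*.md' -- "$old_path" "$root"
)

# Example, run from the docs directory:
#   find_old_refs "/preview/api/ysql/commands/dml_select" content/preview
```

The helper only reports candidates. It does not edit anything, and it does not replace the aliases page property, which is still needed for links from outside the files that you searched.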

Your specific documentation enhancement will determine if you need only to change existing content pages or to add new ones. You will decide, in turn, if you need to modify any existing syntax rules or to define new ones. And if you do define new ones, you will decide in which files you need to include the "grammar" and "diagram" depictions. (See the section about free-standing generated grammar-diagram pairs.)

Follow this general flow on each local editing cycle.

  1. Modify content pages, or if necessary create new ones.
  2. If necessary, edit the diagram definition file.
  3. If necessary, create one or more new free-standing generated grammar-diagram pairs, using touch, on the appropriate directories in the syntax resources directory tree.
  4. If necessary, run the generator.
  5. If necessary, add diagram inclusion HTML to one or more of the content files that you modified.
  6. If necessary, add URL references in other content files to link to your new work. For example, you might need to update an _index.md file that implements a table of contents to a set of files to which you've just added one.
  7. If necessary, add (or update) the aliases page property in any humanly typed documentation source file(s) that you relocated.
  8. Manually check the new docs page(s) and index page(s) to make sure that there are no broken links.

Note: There is no tool for link checking within your local git. Broken links are detected when a Yugabyte staff member periodically runs a link checker on the published doc on the Internet. The scope of this check includes all of the Yugabyte properties (for example, the Blogs site). Broken links that this check finds are reported and fixed manually.

Mental model and glossary: creating and maintaining syntax diagrams for YSQL Docs

This section describes the mental model for the creation and maintenance of syntax diagrams. The description relies on a number of terms of art. These definitions must be in place in order to explain, efficiently and unambiguously, the mental model of how it all works. Each glossary term is italicized, and set up as a link to the term's definition in this essay, whenever it is used to advertise its status as a defined term of art.

The ordering of the glossary terms is insignificant. There's a fair amount of term-on-term dependency. You therefore have to internalize the whole account by ordinary study. When the mental model that they reflect is firmly in place, you can produce any outcome that you want in this space without step-by-step instructions. In particular, you can base the diagnosis of author-created bugs (for example, broken links) on the mental model.

Note: The account that follows distinguishes between directory and file in the usual way: a directory contains files and/or directories. And a file has actual content.

Note: The two terms, "grammar" and "syntax", mean pretty much the same thing. But one or the other of these is used by convention in the spellings and definitions of the terms of art that this glossary defines.

Note: Users of the Internet-facing YugabyteDB documentation typically access the sub-corpus that starts here:

docs.yugabyte.com/preview/

Users with more specific requirements will start at .../stable/ or, maybe, something like .../v2.1/. This account assumes that users will work on content only in the /preview/ subtree.

Syntax rule

  • The formal definition of the grammar of a SQL statement, or a component of a SQL statement.

Every syntax rule is defined textually in the single diagram definition file. The set of all these rules is intended to define the entirety of the YSQL grammar—but nothing beyond this. Presently, the definitions of some syntax rules (while these are implemented in the YSQL subsystem of YugabyteDB) remain to be written down.

Sometimes, the grammar of an entire SQL statement can be comfortably described by a single, self-contained syntax rule. The Syntax section of the account of the COMMIT statement provides an example. More commonly, the grammar of a SQL statement includes references (by name) to the definition(s) of one or more other rule(s). And often such referenced syntax rules are the targets of references from many other syntax rules. The complete account of a very flexible SQL statement can end up as a very large closure of multiply referenced rules. SELECT is the canonical example of complexity. For example, a terminal like integer can end up as the reference target in very many distinct syntax spots within the total definition of the SELECT statement, and of other statements.

A syntax rule is specified using EBNF notation. EBNF stands for "extended Backus–Naur form". See this Wikipedia article. Here is an example for the PREPARE statement:

prepare_statement ::= 'PREPARE' name [ '(' data_type { ',' data_type } ')' ] 'AS' statement ;

Note: When this is presented on the "Grammar" tab in the syntax diagram, it is transformed into the PostgreSQL notation. This is widely used in database documentation. But it is not suitable as input for a program that generates diagrams.

Notice the following:

  • The LHS of ::= is the rule's name. The RHS is its definition. The definition has three kinds of element:

      • EBNF syntax elements like these: [ ] ( ) ' { } ,

      • Keywords and punctuation that are part of the grammar that is being defined. Keywords are conventionally spelled in upper case. Each individual keyword is surrounded by single quotes. This conveys the meaning that whitespace between these, and other elements in the target grammar, is insignificant. Punctuation characters in the target grammar are surrounded by single quotes to distinguish them from punctuation characters in the EBNF grammar.

      • References to other rule names. These become clickable links in the generated diagrams (but not in the generated grammars). Rule names are spelled in lower case. Notice how underscore is used between the individual English words.

  • The single space before the ; terminator is significant.

Diagram definition file

<your path>/yugabyte-db/docs/content/preview/api/ysql/syntax_resources/ysql_grammar.ebnf

Located directly on the syntax resources directory, this file, uniquely under the whole of the ysql directory, is not a .md file. And uniquely under the syntax resources directory, it is typed up humanly. It holds the definition, written in EBNF notation, of every syntax rule that is processed by the diagram generator.

The diagram generator doesn't care about the order of the rules in the file. But the order is reproduced in the grammar diagrams file. This, in turn, reflects decisions made by authors about what makes a convenient reading order. Try to spot what informs the present ordering and insert new rules in a way that respects this. In the limit, insert a new rule at the end as the last new properly-defined rule and before this comment:

(* Supporting rules *)
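Before typing up a new rule, it can help to check whether a rule with that name already exists in the diagram definition file. The sketch below assumes that every rule definition starts at the beginning of a line with the rule's name followed by ::= (as in the prepare_statement example). This is an assumption about the file's layout, not something that the generator is known to require. Both function names are hypothetical.

```shell
# Hypothetical helpers; they assume each rule starts a line as "name ::= ...".

# Succeeds (exit status 0) if the named rule is defined in the given file.
rule_defined() (
  rule="$1"
  ebnf_file="$2"
  grep -qE "^${rule}[[:space:]]*::=" "$ebnf_file"
)

# Prints the rule names in file order, one per line.
list_rule_names() (
  sed -nE 's/^([A-Za-z0-9_-]+)[[:space:]]*::=.*/\1/p' "$1"
)

# Example, run from the docs directory:
#   rule_defined frame_clause content/preview/api/ysql/syntax_resources/ysql_grammar.ebnf \
#     && echo "already defined"
```

Because list_rule_names preserves the file order, its output can also help you to see the present ordering before you decide where to insert a new rule.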

Syntax diagram

The syntax diagram and the syntax rule bear a one-to-one mutual relationship.

The Syntax section of the account of the PREPARE statement provides a short, but sufficient, example. The syntax diagram appears as a (member of a) tabbed pair which gives the reader the choice to see a syntax rule as either the "Grammar" form (in the syntax used in the PostgreSQL documentation and the documentation for several other databases) or the "Diagram" form (a so-called "railroad diagram" that again is used commonly in the documentation for several other databases).

Docs directory

cd <your path>/yugabyte-db/docs

You install the diagram generator here. And you stand here when you run the generator. You also typically stand here when you start "hugo"; and when you prepare for, and then do, a git push to your personal GitHub fork.

Content directory

This directory holds the entire humanly typed documentation source (not just for YSQL), the free-standing generated grammar-diagram pairs, and the diagram definition file:

cd <your path>/yugabyte-db/docs/content

"hugo" uses the tree starting at the content directory to generate the documentation site (under the influence of the supporting doc infrastructure) as a set of static files.

Notice that, when your current directory is the docs directory, git status shows the paths of what it reports starting with the content directory like this:

modified:   content/preview/api/ysql/syntax_resources/grammar_diagrams.md
modified:   content/preview/api/ysql/syntax_resources/ysql_grammar.ebnf
modified:   content/preview/api/ysql/the-sql-language/statements/cmd_do.md

Both the /preview/ and the /stable/ subtrees are direct children of the content directory.

YSQL directory

cd <your path>/yugabyte-db/docs/content/preview/api/ysql

Supporting doc infrastructure

This material is not relevant for the present account.

Humanly typed documentation source

  • (with a few exceptions) The set of ".md" files located under the ysql directory.

Singleton exception: The diagram definition file, on the syntax resources directory, is a plain text file that is typed up manually.

Exception class: The grammar diagrams file and the free-standing generated grammar-diagram pairs. These are all located in the syntax resources directory tree. They are all .md files. And none is humanly typed. Do not manually edit any of these.

With just a few on the ysql directory itself, the humanly typed documentation source files are located in directory trees that start at the ysql directory. The top-of-section _index.md for the YSQL subsystem is among the few humanly typed documentation source files that are located directly on the ysql directory.

Syntax resources directory

cd <your path>/yugabyte-db/docs/content/preview/api/ysql/syntax_resources

With the one exception of the diagram definition file, every file within the syntax resources directory tree is generated.

Generated documentation source

All generated documentation source files are located within the syntax resources directory tree. The set minimally includes the grammar diagrams file. It also includes all of the free-standing generated grammar-diagram pairs. These are included within ordinary content .md files by URL reference from the diagram inclusion HTML located in the ordinary content .md file that wants the syntax diagram.

Notice that there is no garbage collection scheme for unreferenced generated documentation source files. Content authors must do this task manually.
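Because this cleanup is manual, a rough first pass at finding candidates can be scripted. The following is only a sketch under stated assumptions: find_unreferenced_pairs is a hypothetical name; the search assumes that every inclusion spells the path below syntax_resources literally (as the diagram inclusion HTML does); and it looks only in .md files under the ysql directory. It also assumes a grep that supports -r, --include, and --exclude-dir. Treat its output as a list to review, not as a list to delete.

```shell
# Hypothetical helper: print each *.grammar.md file under syntax_resources
# whose path (relative to syntax_resources) is not mentioned in any .md file
# elsewhere under the ysql directory.
find_unreferenced_pairs() (
  ysql_dir="$1"   # for example: content/preview/api/ysql
  res="$ysql_dir/syntax_resources"
  find "$res" -type f -name '*.grammar.md' | sort | while IFS= read -r f; do
    rel=${f#"$res"/}   # for example: dir_1/dir_2/<rule set>.grammar.md
    if ! grep -rqF --include='*.md' --exclude-dir=syntax_resources -- "$rel" "$ysql_dir"; then
      printf '%s\n' "$f"
    fi
  done
)

# Example, run from the docs directory:
#   find_unreferenced_pairs content/preview/api/ysql
```

Each reported .grammar.md file has a sibling .diagram.md file with the same rule-set name, so the two would be removed together.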

Grammar diagrams file

<your path>/yugabyte-db/docs/content/preview/api/ysql/syntax_resources/grammar_diagrams.md

The grammar diagrams file contains every syntax diagram that is generated from all of the syntax rules that are found in the diagram definition file.

Generated grammar-diagram pair

  • The Markdown source that produces the result, in the human-readable documentation, that the user sees as a syntax diagram.

Use this term when you want to focus on the fact that a syntax diagram is presented for the user as a tabbed "Grammar" and "Diagram" pair.

Always found in the grammar diagrams file (but, uniquely, not here as a tabbed pair). Optionally found as a free-standing generated grammar-diagram pair.

Free-standing generated grammar-diagram pair

  • Such a pair asks the diagram generator to populate its two files automatically with, respectively, the "grammar" and the "diagram" presentations of the set of syntax rules that the pair's name specifies.

    They are defined by a pair of .md files with names that follow a purpose-designed syntax:

<rule-name>[, <rule-name>, ...].grammar.md
               ~               .diagram.md

Each pair is initially created empty (for example using touch) to express the content author's desire to have a diagram set for a particular humanly typed documentation source file.

Such a pair should be placed in the exact mirror sibling directory, in the syntax resources directory tree, of the directory where the humanly typed documentation source file that wants to include the specified syntax diagram is located.

Here is an example. Suppose that the file wants-to-include.md wants to include the syntax diagram with a rule set denoted by the appropriately spelled identifier <rule set X>. And suppose that the humanly typed documentation source file is here:

<your path>/yugabyte-db/docs/content/preview/api/ysql                 /dir_1/dir_2/dir_3/wants-to-include.md

The white space between /ysql and /dir_1 has been introduced as a device to advertise the convention. It doesn't exist in the actual file path.

The free-standing generated grammar-diagram pair must be placed here.

<your path>/yugabyte-db/docs/content/preview/api/ysql/syntax_resources/dir_1/dir_2/dir_3/<rule set X>.grammar.md
<your path>/yugabyte-db/docs/content/preview/api/ysql/syntax_resources/dir_1/dir_2/dir_3/<rule set X>.diagram.md
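The mirror-directory convention is mechanical, so it can be expressed as a small path calculation. This sketch uses a hypothetical function name, mirror_dir, and assumes that the content file's path contains /api/ysql/ and that the file sits at least one directory below the ysql directory, as in the example above.

```shell
# Hypothetical helper: print the directory, in the syntax resources tree,
# that mirrors the directory of a content file under the ysql directory.
mirror_dir() (
  src="$1"
  prefix="${src%%/api/ysql/*}"    # everything before /api/ysql/
  rest="${src#*/api/ysql/}"       # dir_1/dir_2/dir_3/wants-to-include.md
  rel_dir="$(dirname "$rest")"    # dir_1/dir_2/dir_3
  printf '%s/api/ysql/syntax_resources/%s\n' "$prefix" "$rel_dir"
)

# Example:
#   mirror_dir content/preview/api/ysql/dir_1/dir_2/dir_3/wants-to-include.md
#   prints: content/preview/api/ysql/syntax_resources/dir_1/dir_2/dir_3
```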

Suppose that a syntax rule includes a reference to another syntax rule. If the referenced syntax rule is included (by virtue of the name of the diagram-grammar file pair) in the same syntax diagram set, then the name of the syntax rule in the referring syntax diagram becomes a link to the syntax rule in that same syntax diagram set. Otherwise the generated link target of the referring rule is within the grammar diagrams file. The way that this link is spelled depends on the location, within the ysql directory tree, of the .md file that includes the generated syntax diagram.

If you don't follow this rule, then (as long as you specify the right path to the free-standing generated grammar-diagram pair in the diagram inclusion HTML) you will get the diagram in the wants-to-include.md, just as you want it. But the links to diagrams in the grammar diagrams file for rules that are not defined in the same included diagram will be broken. You should therefore always check manually that these links work when you first confirm that you see the generated syntax diagram that you want at the location that you want it. And you should check again just before creating a Pull Request.

Syntax diagram set

Diagram inclusion HTML

Consider this example:

<your path>/yugabyte-db/docs/content/preview/api/ysql/dir_1/dir_2/dir_3/wants-to-include.md

You must include this boilerplate text in wants-to-include.md at the location in this file where you want the syntax diagram to be seen:

<ul class="nav nav-tabs nav-tabs-yb">
  <li >
    <a href="#grammar" class="nav-link active" id="grammar-tab" data-toggle="tab" role="tab" aria-controls="grammar" aria-selected="true">
      <i class="fas fa-file-alt" aria-hidden="true"></i>
      Grammar
    </a>
  </li>
  <li>
    <a href="#diagram" class="nav-link" id="diagram-tab" data-toggle="tab" role="tab" aria-controls="diagram" aria-selected="false">
      <i class="fas fa-project-diagram" aria-hidden="true"></i>
      Diagram
    </a>
  </li>
</ul>

<div class="tab-content">
  <div id="grammar" class="tab-pane fade show active" role="tabpanel" aria-labelledby="grammar-tab">
  {{% includeMarkdown "../../../syntax_resources/dir_1/dir_2/dir_3/<rule set X>.grammar.md" %}}
  </div>
  <div id="diagram" class="tab-pane fade" role="tabpanel" aria-labelledby="diagram-tab">
  {{% includeMarkdown "../../../syntax_resources/dir_1/dir_2/dir_3/<rule set X>.diagram.md" %}}
  </div>
</div>

You take responsibility for spelling this

../../../syntax_resources/dir_1/dir_2/dir_3/<rule set X>.grammar.md

and this:

../../../syntax_resources/dir_1/dir_2/dir_3/<rule set X>.diagram.md

properly. The rest is entirely boilerplate. You mustn't touch it. The construct ../../../ must climb up the proper number of levels (three in this example) to reach the parent ysql directory from wants-to-include.md.
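Counting the ../ levels by hand is easy to get wrong, so here is a sketch of the calculation. The function name include_path is hypothetical, and the logic simply follows the example above: one ../ for each directory level between the ysql directory and the content file, followed by the mirror path under syntax_resources. It assumes that the content file's path contains /api/ysql/ and that the file sits at least one directory below the ysql directory.

```shell
# Hypothetical helper: print the relative path to use in includeMarkdown.
#   $1  path of the content file that wants the diagram
#   $2  the rule-set part of the pair's file name
#   $3  grammar or diagram
include_path() (
  src="$1"; pair_name="$2"; kind="$3"
  rest="${src#*/api/ysql/}"       # dir_1/dir_2/dir_3/wants-to-include.md
  rel_dir="$(dirname "$rest")"    # dir_1/dir_2/dir_3
  ups=""
  remaining="$rel_dir/"
  # Add one "../" for each directory level below the ysql directory.
  while [ -n "$remaining" ]; do
    ups="../$ups"
    remaining="${remaining#*/}"
  done
  printf '%ssyntax_resources/%s/%s.%s.md\n' "$ups" "$rel_dir" "$pair_name" "$kind"
)

# Example:
#   include_path content/preview/api/ysql/dir_1/dir_2/dir_3/wants-to-include.md rule_x grammar
#   prints: ../../../syntax_resources/dir_1/dir_2/dir_3/rule_x.grammar.md
```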

Diagram generator

It's implemented by rrdiagram.jar on the docs directory.

Here are the download instructions. You need to do this just once in a newly created local git (typically created from your personal fork of the master YB repo).

cd yugabyte-db/docs/

wget $(curl -s https://api.github.com/repos/Yugabyte/RRDiagram/releases/latest \
       | grep browser_download_url | cut -d \" -f 4)

Use ls rrdiagram.jar to check that it is now present. (It isn't shown by "git status" because it's explicitly excluded.)

Run the generator

  • first "cd..." then "java..."

Specifically:

cd yugabyte-db/docs
java -jar rrdiagram.jar content/preview/api/ysql/syntax_resources/ysql_grammar.ebnf content/preview/api/ysql/syntax_resources/

This will (re)generate all of the files that it ought to. You can run this at any time. In the worst case, a typing error somewhere, especially in the diagram inclusion HTML, can crash hugo, resulting in the notorious blanked out screen in the browser.

Note: To see help, run java -jar rrdiagram.jar (without arguments).
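The run-and-review step can be wrapped so that it fails early when the JAR is missing and then shows which generated files changed. This is a minimal sketch, assuming that java and git are on your PATH. The function name regenerate_diagrams is hypothetical, and the generator arguments are exactly those shown above.

```shell
# Hypothetical wrapper: run the diagram generator from the docs directory and
# then list what changed under the syntax resources directory.
regenerate_diagrams() (
  docs_dir="${1:-.}"   # path to yugabyte-db/docs; defaults to the current directory
  res="content/preview/api/ysql/syntax_resources"
  if [ ! -f "$docs_dir/rrdiagram.jar" ]; then
    echo "rrdiagram.jar not found in $docs_dir; download it first" >&2
    exit 1
  fi
  cd "$docs_dir" || exit 1
  java -jar rrdiagram.jar "$res/ysql_grammar.ebnf" "$res/" || exit 1
  # Show only the files under syntax_resources that git reports as changed.
  git status --short -- "$res"
)

# Example:
#   regenerate_diagrams yugabyte-db/docs
```

The body runs in a subshell, so the cd does not change the directory of the shell that calls the function.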

About the redundant use of ( ... )

In at least one case, the use of semantically redundant notation in the diagram definition file conveys the author's intent about how the diagram should be constructed. Here's an example of semantically significant use of ( ... ) in algebra:

  (a/b) - c differs in meaning from a/(b - c)

And while the rules do give a/b - c an unambiguous meaning (advertised by the absence of space surrounding the / operator), it's generally considered unhelpful to write this, thereby forcing the reader to rehearse the rules in order to discern the meaning. Rather, the semantically redundant orthography (a/b) - c is preferred.

In EBNF, ( ... ) plays a similar role as it does in algebra. In some situations, just which subset of items is grouped changes the meaning. But in some situations (like a + b + c in algebra) the use of ( ... ) conveys no semantics. Look for this at the end of the diagram definition file:

(*
  Notice that the "demo-2" rule uses ( ... ) redundantly.
  The two rules are semantically identical but they produce the different diagrams.
  The diagrams do express the same meaning but they are drawn using different conventions.
  Uncomment the two rule definitions and look at the end of the grammar diagrams file to
  see the two semantically equivalent, but differently drawn, diagrams.

  Make sure that you comment these out again before creating a Pull request.
*)

(*
demo-1-irrelevant-for-ysql-syntax ::= ( a 'K' b { ',' a 'K' b } ) ;

demo-2-irrelevant-for-ysql-syntax ::= ( ( a 'K' b ) { ',' ( a 'K' b ) } ) ;
*)

Example use case

Suppose that this humanly typed documentation source file:

<your path>/yugabyte-db/docs/content/preview/api/ysql/exprs/window_functions/window-definition.md

wants to include a syntax diagram set with these syntax rules:

frame_clause, frame_bounds, and frame_bound

Step 1: If they all already exist in the diagram definition file, then go to Step 2. Else, type up any rules that don't yet exist.

Step 2: Create these empty files (for example, with touch).

touch yugabyte-db/docs/content/preview/api/ysql/syntax_resources/exprs/window_functions/frame_clause,frame_bounds,frame_bound.diagram.md

touch yugabyte-db/docs/content/preview/api/ysql/syntax_resources/exprs/window_functions/frame_clause,frame_bounds,frame_bound.grammar.md

Step 3: Copy a reliable example of the diagram inclusion HTML and paste it into window-definition.md at the location where you want the syntax diagram set to appear. Edit the two includeMarkdown paths (the only non-boilerplate text, as described in the section on the diagram inclusion HTML) appropriately.

Step 4: Run the generator.

Step 5: Make sure that you inspect the result immediately in the "hugo" rendering of the humanly typed documentation source into which you just included the diagram inclusion HTML.