Comark AST

Complete reference for the Comark AST (Abstract Syntax Tree) format, a lightweight array-based structure for efficient processing.

Overview

Comark represents a document as nested arrays (tuples) rather than the object nodes used by most ASTs. Each element is a single array holding its tag, its attributes, and its children, which keeps trees compact and simple to walk.

AST Structure

ComarkTree

The root container for all parsed content:

interface ComarkTree {
  nodes: ComarkNode[]                   // Parsed AST nodes
  frontmatter: Record<string, any>      // YAML frontmatter data
  meta: {
    toc?: any                           // Table of contents (from toc plugin)
    summary?: ComarkNode[]              // Summary content (from summary plugin)
    [key: string]: any                  // Other plugin metadata
  }
}
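
To make the shape concrete, here is a hand-written tree for a tiny document with frontmatter, a heading, and a paragraph. It is a sketch, not captured parser output: the types are redeclared locally so the snippet runs without the library, and the heading id and the empty meta object are assumptions.

```typescript
// Local copies of the types described in this section.
type ComarkElementAttributes = { [key: string]: unknown }
type ComarkElement = [string, ComarkElementAttributes, ...ComarkNode[]]
type ComarkNode = ComarkElement | string

interface ComarkTree {
  nodes: ComarkNode[]
  frontmatter: Record<string, any>
  meta: { toc?: any; summary?: ComarkNode[]; [key: string]: any }
}

// Hand-written example with hypothetical values (not real parser output).
const exampleTree: ComarkTree = {
  frontmatter: { title: 'Example Document' },
  meta: {},
  nodes: [
    ['h1', { id: 'welcome' }, 'Welcome'],
    ['p', {}, 'This is a ', ['strong', {}, 'sample'], ' document.'],
  ],
}

console.log(exampleTree.nodes.length) // 2
```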

ComarkNode

Each node is either a text string or an element tuple:

type ComarkNode = ComarkElement | ComarkText

ComarkText

Plain string representing text content:

type ComarkText = string

ComarkElement

An element is a tuple of tag, attributes, and zero or more children:

type ComarkElement = [string, ComarkElementAttributes, ...ComarkNode[]]

ComarkElementAttributes

Attributes are a key-value record:

type ComarkElementAttributes = {
  [key: string]: unknown
}

Node Format

Text Nodes

Plain strings represent text content:

"Hello, World!"

Element Nodes

Elements are arrays with the format [tag, props, ...children]:

["p", {}, "This is a paragraph"]

Tuple positions:

  • tag (index 0): Element name or component tag
  • props (index 1): Object with attributes/properties
  • children (index 2+): Child nodes (strings or nested arrays)
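
The positional layout above maps directly onto array destructuring. The sketch below builds an element and reads it back by position. The helper h is hypothetical (it is not part of Comark), and the types are redeclared locally so the snippet is self-contained.

```typescript
type ComarkElementAttributes = { [key: string]: unknown }
type ComarkElement = [string, ComarkElementAttributes, ...ComarkNode[]]
type ComarkNode = ComarkElement | string

// Hypothetical helper for building element tuples; not a Comark API.
function h(tag: string, props: ComarkElementAttributes = {}, ...children: ComarkNode[]): ComarkElement {
  return [tag, props, ...children]
}

const paragraph = h('p', { class: 'lead' }, 'Read the ', h('a', { href: 'https://example.com' }, 'docs'))

// Destructure by position: index 0 is the tag, index 1 the props, the rest are children.
const [tagName, attrs, ...kids] = paragraph

console.log(tagName)        // "p"
console.log(attrs['class']) // "lead"
console.log(kids.length)    // 2
```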

Common Elements

Headings

# Heading 1
## Heading 2

Paragraphs

This is a paragraph with **bold** and *italic* text.
[Link text](https://example.com)

Images

![Alt text](image.png)

Lists

- Item 1
- Item 2
  - Nested item

Ordered Lists

1. First
2. Second

Code Blocks

```javascript [app.js] {1-2}
const a = 1
const b = 2
```

Inline Code

Use `const` for constants

Span Attributes

Span syntax wraps inline text with custom attributes:

This is [highlighted]{.highlight .yellow} text.

With multiple attributes:

[Important]{#notice .badge style="color: red" data-priority="high"}

With nested markdown:

[**Bold** text]{.emphasized}

Blockquotes

> This is a quote

Tables

| Header 1 | Header 2 |
| -------- | -------- |
| Cell 1   | Cell 2   |

Horizontal Rule

---

Comments

HTML comments are represented with null in the tag position (index 0) instead of a tag name:

<!-- This is a comment -->

Inline comment:

Text before <!-- comment --> text after

Multi-line comment:

<!--
Multi-line
comment text
-->

Comment between elements:

## First Section

<!-- Separator -->

## Second Section
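
Because comment nodes carry null at index 0, they do not fit the [string, ...] element type shown earlier, and code that walks the tree has to check for them explicitly. The sketch below assumes a comment looks like [null, {}, 'text']; the props object and the text child are assumptions, and CommentNode, isComment, and stripComments are hypothetical names rather than Comark exports.

```typescript
type Attrs = { [key: string]: unknown }
type ElementNode = [string, Attrs, ...AnyNode[]]
type CommentNode = [null, Attrs, ...string[]] // assumed shape of a comment node
type AnyNode = ElementNode | CommentNode | string

// A node is a comment when it is an array whose tag slot holds null.
function isComment(node: AnyNode): node is CommentNode {
  return Array.isArray(node) && node[0] === null
}

// Returns a new node list with every comment removed, at any depth.
function stripComments(nodes: AnyNode[]): AnyNode[] {
  const kept: AnyNode[] = []
  for (const node of nodes) {
    if (isComment(node)) continue
    if (typeof node === 'string') {
      kept.push(node)
      continue
    }
    const [tag, props, ...children] = node
    kept.push([tag, props, ...stripComments(children)])
  }
  return kept
}

const withComments: AnyNode[] = [
  ['h2', { id: 'first-section' }, 'First Section'],
  [null, {}, ' Separator '],
  ['p', {}, 'Text before ', [null, {}, ' comment '], ' text after'],
]

const cleaned = stripComments(withComments)
console.log(cleaned.length) // 2
```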

Comark Components

Block Components

::alert{type="info"}
This is an alert message
::

Inline Components

Check out this :badge[New]{color="blue"} feature

Components with Slots

::card
#header
## Card Title

#content
Main content here
::

Nested Components

:::outer
::inner
Content
::
:::

Property Types

String Properties

::component{title="Hello"}

Boolean Properties

::component{disabled}

Note: Boolean props are stored in the AST with a : prefix on the key.

Number Properties

::component{:count="5"}

Object/Array Properties

::component{:data='{"key": "value"}'}

ID and Class Properties

::component{#my-id .class-one .class-two}
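
The : prefix in the examples above marks values that are meant to be read as data rather than plain strings. The sketch below shows one way a consumer might decode such props. It assumes, without confirmation from this reference, that the prefixed key is kept in the AST and that its value is a string holding JSON. decodeProps is a hypothetical helper, not a Comark API.

```typescript
type ComarkElementAttributes = { [key: string]: unknown }

// Strips the ":" prefix and JSON-parses the value; plain props pass through unchanged.
function decodeProps(props: ComarkElementAttributes): Record<string, unknown> {
  const decoded: Record<string, unknown> = {}
  for (const key of Object.keys(props)) {
    const value = props[key]
    if (key.charAt(0) === ':' && typeof value === 'string') {
      try {
        decoded[key.slice(1)] = JSON.parse(value)
      } catch (_err) {
        decoded[key.slice(1)] = value // not valid JSON: keep the raw string
      }
    } else {
      decoded[key] = value
    }
  }
  return decoded
}

// Assumed AST props for the string, number, boolean, and object examples above.
const decodedProps = decodeProps({
  title: 'Hello',
  ':count': '5',
  ':disabled': 'true',
  ':data': '{"key": "value"}',
})

console.log(decodedProps)
// { title: 'Hello', count: 5, disabled: true, data: { key: 'value' } }
```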

Complete AST Example

---
title: Example Document
author: John Doe
---

# Welcome

This is a **sample** document with [links](https://example.com).

::alert{type="info"}
Important notice
::

## Code Example

```javascript [demo.js]
console.log("Hello")
```

Working with AST

Traversing Nodes

traverse.ts
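import { parse } from 'comark' // assumption: parse is exported from the package root
import type { ComarkNode } from 'comark'

// Depth-first, pre-order walk: the callback sees a node before its children
function traverse(node: ComarkNode, callback: (node: ComarkNode) => void) {
  callback(node)

  if (Array.isArray(node)) {
    // Skip the tag (index 0) and props (index 1); everything after is a child
    const [, , ...children] = node
    for (const child of children) {
      traverse(child, callback)
    }
  }
}

// Usage: walk every top-level node of a parsed document
const content = '# Hello **World**'
const result = await parse(content)
for (const root of result.nodes) {
  traverse(root, (node) => {
    if (Array.isArray(node)) {
      console.log('Element:', node[0])
    } else {
      console.log('Text:', node)
    }
  })
}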

Finding Elements

find.ts
function findByTag(tree: ComarkTree, tag: string): ComarkNode[] {
  const results: ComarkNode[] = []

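  function search(node: ComarkNode) {
    // Text nodes have no tag and no children
    if (!Array.isArray(node)) return

    if (node[0] === tag) {
      results.push(node)
    }
    // Children start at index 2; destructuring keeps them typed as ComarkNode[]
    const [, , ...children] = node
    children.forEach(search)
  }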

  tree.nodes.forEach(search)
  return results
}

// Find all level-1 headings
const headings = findByTag(result, 'h1')
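
findByTag matches one tag name at a time, so collecting every heading level takes several calls. A predicate-based variant handles that in one pass. The sketch below is an illustration under stated assumptions: findAll and isHeading are hypothetical names, and the types are redeclared locally so the snippet is self-contained.

```typescript
type ComarkElementAttributes = { [key: string]: unknown }
type ComarkElement = [string, ComarkElementAttributes, ...ComarkNode[]]
type ComarkNode = ComarkElement | string

// Collects every element (at any depth) for which the predicate returns true.
function findAll(nodes: ComarkNode[], predicate: (el: ComarkElement) => boolean): ComarkElement[] {
  const results: ComarkElement[] = []

  const visit = (node: ComarkNode): void => {
    if (typeof node === 'string') return
    if (predicate(node)) results.push(node)
    const [, , ...children] = node
    children.forEach((child) => visit(child))
  }

  nodes.forEach((node) => visit(node))
  return results
}

// Matches h1 through h6.
const isHeading = (el: ComarkElement): boolean => /^h[1-6]$/.test(el[0])

const sample: ComarkNode[] = [
  ['h1', {}, 'Title'],
  ['p', {}, 'Intro'],
  ['div', {}, ['h2', {}, 'Nested']],
]

const headingTags = findAll(sample, isHeading).map((el) => el[0])
console.log(headingTags) // [ 'h1', 'h2' ]
```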

Modifying AST

transform.ts
function addClassToLinks(node: ComarkNode): ComarkNode {
  if (Array.isArray(node)) {
    const [tag, props, ...children] = node

    if (tag === 'a') {
      return [tag, { ...props, class: 'external-link' }, ...children]
    }

    return [tag, props, ...children.map(addClassToLinks)]
  }

  return node
}

// Transform all links
const transformed: ComarkTree = {
  nodes: result.nodes.map(addClassToLinks),
  frontmatter: result.frontmatter,
  meta: result.meta
}
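
Note that spreading props and then setting class replaces any class value a link already has. Where existing classes should be kept, a variant can append instead. This is a sketch under the assumption that class is stored as a space-separated string, as in the example above. appendClass is a hypothetical helper, and the types are redeclared locally so the snippet is self-contained.

```typescript
type ComarkElementAttributes = { [key: string]: unknown }
type ComarkElement = [string, ComarkElementAttributes, ...ComarkNode[]]
type ComarkNode = ComarkElement | string

// Returns new props with className appended to any existing class string.
function appendClass(props: ComarkElementAttributes, className: string): ComarkElementAttributes {
  const current = props['class']
  const classes = typeof current === 'string' ? current.split(/\s+/).filter(Boolean) : []
  if (classes.indexOf(className) === -1) classes.push(className)
  return { ...props, class: classes.join(' ') }
}

// Rebuilds the tree instead of mutating it, and also descends into link children.
function addClassToLinks(node: ComarkNode): ComarkNode {
  if (typeof node === 'string') return node
  const [tag, props, ...children] = node
  const nextProps = tag === 'a' ? appendClass(props, 'external-link') : props
  return [tag, nextProps, ...children.map((child) => addClassToLinks(child))]
}

const original: ComarkNode = ['p', {}, ['a', { href: '/', class: 'btn' }, 'Home']]
const updated = addClassToLinks(original)

console.log(JSON.stringify(updated))
// ["p",{},["a",{"href":"/","class":"btn external-link"},"Home"]]
```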

Extracting Text Content

extract.ts
function extractText(node: ComarkNode): string {
  if (typeof node === 'string') {
    return node
  }

  if (Array.isArray(node)) {
    // Skip tag and props, then concatenate the text of every child
    const [, , ...children] = node
    return children.map(extractText).join('')
  }

  return ''
}

// Get all text from a heading
const heading: ComarkNode = ['h1', { id: 'hello' }, 'Hello ', ['strong', {}, 'World']]
console.log(extractText(heading)) // "Hello World"

Rendering AST

To HTML

render.ts
import { parse } from 'comark' // assumption: parse is exported from the package root
import { renderHTML } from 'comark/string'

const tree = await parse('# Hello **World**')
const html = renderHTML(tree)
// <h1 id="hello-world">Hello <strong>World</strong></h1>

To Markdown

render.ts
import { parse } from 'comark' // assumption: parse is exported from the package root
import { renderMarkdown } from 'comark/string'

const tree = await parse('# Hello **World**')
const markdown = renderMarkdown(tree)
// # Hello **World**

Custom Rendering

render.ts
function renderToPlainText(node: ComarkNode): string {
  if (typeof node === 'string') {
    return node
  }

  if (Array.isArray(node)) {
    const [tag, props, ...children] = node
    const content = children.map(renderToPlainText).join('')

    // Add formatting based on tag
    switch (tag) {
      case 'h1':
      case 'h2':
      case 'h3':
        return `${content}\n\n`
      case 'p':
        return `${content}\n\n`
      case 'li':
        return `${content}\n`
      default:
        return content
    }
  }

  return ''
}

TypeScript Types

All types are exported from comark and comark/ast:

types.ts
import type {
  ComarkTree,
  ComarkNode,
  ComarkElement,
  ComarkText,
  ComarkElementAttributes
} from 'comark'

// Type guard for element nodes
function isElement(node: ComarkNode): node is ComarkElement {
  return Array.isArray(node) && typeof node[0] === 'string'
}

// Type guard for text nodes
function isText(node: ComarkNode): node is ComarkText {
  return typeof node === 'string'
}

// Usage
function processNode(node: ComarkNode) {
  if (isText(node)) {
    console.log('Text:', node)
  } else if (isElement(node)) {
    const [tag, props, ...children] = node
    console.log('Element:', tag, props)
  }
}

Performance Considerations

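  1. Array-based format: Each element is one array rather than an object with named fields, which keeps trees compact
  2. Predictable layout: Tag at index 0, props at index 1, children from index 2 onward, so iterating children is a plain index loop
  3. Immutable by convention: Build new arrays when modifying a tree instead of mutating nodes in place
  4. Lazy processing: Visit only the subtrees you need and stop traversing once you have your result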
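
The second and fourth points can be combined in a search that stops at the first match and reads children by index without copying them. This is a minimal sketch, not a Comark API: findFirst is a hypothetical helper, and the types are redeclared locally so the snippet is self-contained.

```typescript
type ComarkElementAttributes = { [key: string]: unknown }
type ComarkElement = [string, ComarkElementAttributes, ...ComarkNode[]]
type ComarkNode = ComarkElement | string

// Returns the first element with the given tag, in document order.
function findFirst(nodes: ComarkNode[], tag: string): ComarkElement | undefined {
  // Explicit stack seeded in reverse so that pop() yields nodes in document order.
  const stack: ComarkNode[] = nodes.slice().reverse()

  while (stack.length > 0) {
    const node = stack.pop()
    if (node === undefined || typeof node === 'string') continue
    if (node[0] === tag) return node // early exit: the rest of the tree is never visited

    // Children start at index 2; push them right to left without copying the array.
    for (let i = node.length - 1; i >= 2; i--) {
      stack.push(node[i] as ComarkNode)
    }
  }
  return undefined
}

const sampleNodes: ComarkNode[] = [
  ['p', {}, 'Intro'],
  ['div', {}, ['h2', { id: 'x' }, 'First'], ['h2', { id: 'y' }, 'Second']],
]

const firstH2 = findFirst(sampleNodes, 'h2')
console.log(JSON.stringify(firstH2)) // ["h2",{"id":"x"},"First"]
```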
