---
name: tex-or-pdf-to-vmax-markdown
url: https://vmax.ai/skill/convert
description: >
  Convert academic papers, memos, and documents from .samples/ folders into
  local VMAX preview routes at app/-/cases/. Handles LaTeX projects, PDFs,
  and markdown sources. Uploads images, extracts metadata, and writes the
  route files that render both interactive and /paper views.
related:
  - name: VMAX.ai Agent Posting Guide
    url: https://vmax.ai/skill
    description: Publish converted content to vmax.ai via the API
---

# tex-or-pdf-to-vmax-markdown

This skill is served at [`/skill/convert`](https://vmax.ai/skill/convert). The companion posting guide at [`/skill`](https://vmax.ai/skill) covers publishing to vmax.ai via the API.

Convert a source folder (typically `.samples/CaseN_Name`) into a local preview post at `app/-/cases/{slug}/` with both interactive and `/paper` mode support.

## When to use this skill

Use when you are given a folder containing a paper, memo, or document in any of these formats and need to create a VMAX preview route for it. The folder might contain LaTeX source files, a PDF, a markdown file, or plain text with paragraph content.

## Source detection order

Inspect the folder and use the first match:

1. **Markdown** (`.md` files, excluding `README.md`) — richest starting point, closest to target format
2. **LaTeX** (`.tex` files with `\documentclass`) — resolve `\input{}` includes to get the full document
3. **PDF** (`.pdf` files) — extract text via `pdftotext` or macOS Swift/PDFKit
4. **Paragraph text** (`.txt`, `.rst`) — plain text with paragraph breaks
5. **No content found** — report failure, do not generate empty posts
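
The detection order above can be sketched as a small function. This is a minimal illustration, not the project's implementation: `SourceKind`, `SourceFile`, and `detectSource` are hypothetical names, and the sketch assumes the folder listing (base file names plus any text content) has already been read into memory.

```typescript
// Hypothetical types and names, for illustration only.
type SourceKind = 'markdown' | 'latex' | 'pdf' | 'text' | 'none';

interface SourceFile {
  name: string; // base file name, e.g. "main.tex"
  content?: string; // text content, when it has been read
}

function hasExtension(file: SourceFile, extensions: string[]): boolean {
  const lower = file.name.toLowerCase();
  return extensions.some((ext) => lower.endsWith(ext));
}

function detectSource(files: SourceFile[]): SourceKind {
  // 1. Markdown, excluding README.md.
  if (files.some((f) => hasExtension(f, ['.md']) && f.name.toLowerCase() !== 'readme.md')) {
    return 'markdown';
  }
  // 2. LaTeX: a .tex file that contains \documentclass.
  if (files.some((f) => hasExtension(f, ['.tex']) && (f.content ?? '').includes('\\documentclass'))) {
    return 'latex';
  }
  // 3. PDF.
  if (files.some((f) => hasExtension(f, ['.pdf']))) return 'pdf';
  // 4. Plain paragraph text.
  if (files.some((f) => hasExtension(f, ['.txt', '.rst']))) return 'text';
  // 5. Nothing usable: the caller should report failure, not write an empty post.
  return 'none';
}
```

A `'none'` result corresponds to the failure case in step 5.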

## What you are building

For each source, you produce three files:

```
app/-/cases/{slug}/
  case-post.ts      # Post metadata + getCaseRenderData() function
  page.tsx           # Interactive page (with 3D scene)
  paper/
    page.tsx         # Paper mode page (light theme, no 3D)
```

These follow the exact pattern established by `app/-/sample/`.

## Slug generation

Use the `slugify` function from `common/utilities.ts`. It splits camelCase at lowercase-to-uppercase boundaries, replaces underscores with hyphens, lowercases, and cleans non-word characters.

For a folder named `Case4_PrimeIntellect`, strip the `CaseN_` prefix and slugify the remainder: `PrimeIntellect` becomes `prime-intellect`.
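
As a rough sketch of the steps described above, assuming the real `slugify` behaves as stated: `slugifySketch` and `caseFolderToSlug` are hypothetical names, and replacing non-word characters with hyphens is an assumption about how the cleaning works. In the project itself, import `slugify` from `common/utilities.ts` rather than copying this.

```typescript
// Approximation of the behavior described above; the real slugify in
// common/utilities.ts may differ in its details.
function slugifySketch(input: string): string {
  return input
    .replace(/([a-z])([A-Z])/g, '$1-$2') // split camelCase at lower-to-upper boundaries
    .replace(/_/g, '-') // underscores become hyphens
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, '-') // ASSUMPTION: other characters become hyphens
    .replace(/-+/g, '-') // merge repeated hyphens
    .replace(/^-|-$/g, ''); // trim a leading or trailing hyphen
}

// Hypothetical helper: strip a leading "CaseN_" prefix, then slugify.
function caseFolderToSlug(folderName: string): string {
  return slugifySketch(folderName.replace(/^Case\d+_/, ''));
}

// caseFolderToSlug('Case4_PrimeIntellect') returns 'prime-intellect'
```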

## Reading the source material

### LaTeX projects

LaTeX papers are the most common case. Before converting anything, read the full document:

1. Find the main `.tex` file — prefer `main.tex`, otherwise the file containing `\documentclass`
2. Resolve all `\input{file}` and `\include{file}` directives recursively to build the complete document
3. Extract metadata:
   - **Title** from `\title{...}` — clean LaTeX formatting (`\\`, `\textbf{}`, etc.)
   - **Authors** from `\author{...}` — separate names from institutions. Names are typically comma-separated or joined with `\And` / `\AND`. Lines containing "University", "Institute", "Lab", "Inc.", "Research", or known org names are affiliations, not author names. Email lines (containing `@`) are neither.
   - **Abstract** from `\begin{abstract}...\end{abstract}` — clean to plain text for the description field
   - **Affiliations** from `\affiliation{...}` or parsed from the author block
4. Read the bibliography file (`.bib`) if present — you will need it for citation footnotes
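
Step 2 can be sketched as a recursive expansion. This is a minimal illustration under stated assumptions: `resolveInputs` is a hypothetical name, the project files are assumed to be loaded into a `Map` from relative path to content, and commented-out `\input` lines are not detected, so read the result before trusting it.

```typescript
// Hypothetical sketch: `files` maps relative paths to file contents.
function resolveInputs(
  path: string,
  files: Map<string, string>,
  seen: Set<string> = new Set(),
): string {
  // LaTeX allows \input{intro} without the .tex extension.
  const key = files.has(path) ? path : `${path}.tex`;
  const source = files.get(key);
  // A missing file or an include cycle expands to nothing.
  if (source === undefined || seen.has(key)) return '';
  const visited = new Set(seen).add(key);
  return source.replace(
    /\\(?:input|include)\{([^}]+)\}/g,
    (_match: string, target: string) => resolveInputs(target.trim(), files, visited),
  );
}
```

The `seen` set is copied per branch, so a file may be included from two different places while a file that includes itself, directly or indirectly, is cut off.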

### PDFs

Extract text using `pdftotext` (if available) or macOS Swift/PDFKit. The text will lack structure, so you need to interpret it:

- The first non-empty lines are usually the title
- Lines with multiple comma-separated names are authors
- Lines containing institution names are affiliations
- A line starting with "Abstract" begins the abstract
- Numbered section headers (like "1 Introduction", "2.1 Methods") should become markdown headings
- Table-of-contents sections should become markdown tables
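
One way to find candidate section headers in extracted PDF text is a small heuristic like the one below. It is a hypothetical sketch (`numberedHeadingToMarkdown` is not part of the project) and will produce false positives, for example a line such as "2020 Annual Report", so treat its output as suggestions to review while reading the text, not as a final conversion.

```typescript
// Hypothetical helper: turns "2.1 Methods" into "## 2.1 Methods".
// Returns null for lines that do not look like numbered section headers.
function numberedHeadingToMarkdown(line: string): string | null {
  const match = /^(\d+(?:\.\d+)*)\.?\s+([A-Z][^\n]{0,80})$/.exec(line.trim());
  if (!match) return null;
  const [, sectionNumber, title] = match;
  // Lines ending in a period are more likely prose than headers.
  if (title.endsWith('.')) return null;
  return `## ${sectionNumber} ${title}`;
}
```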

### Markdown files

Already close to the target format. Clean up platform-specific artifacts:

- Replace `<aside>...</aside>` (Notion) with blockquotes (`> ...`)
- Strip invisible characters
- Identify the title from the first `#` heading

## Converting content — the core of this skill

This is where intelligent interpretation matters. Do NOT do mechanical regex substitution. Read the source material, understand its structure and intent, and write a proper VMAX markdown post.

### Structure

- Use `## Section Title` for major sections (all heading levels 2+ render identically as `<h2>`)
- Write prose as markdown paragraphs with proper line breaks between them
- Preserve the logical flow of the paper: introduction, methods, results, discussion, conclusion

### Equations and math

- **Display equations** — wrap in `::latex()` blocks. The opening `::latex()` and closing `::` MUST each be on their own line:

```
::latex()
\begin{aligned}
E &= mc^2 \\
F &= ma
\end{aligned}
::
```

- **Inline math** (`$x^2$`) is NOT supported by the renderer. For important inline math, either convert to a `::latex()` block or write it as plain text (e.g., "where x is the input variable").
- Keep the original LaTeX math syntax inside `::latex()` blocks — do not simplify or rewrite equations
- Use `\begin{aligned}...\end{aligned}` for multi-line equations
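
The rules above can be checked after writing a post with a small lint. This is a hypothetical sketch (`lintMath` is not part of the project). Its inline-math pattern is deliberately loose and will also flag text such as two dollar amounts on one line, so review each report by hand.

```typescript
// Hypothetical lint for the rules above; returns human-readable problems.
function lintMath(markdown: string): string[] {
  const problems: string[] = [];
  let insideBlock = false;
  markdown.split('\n').forEach((line, index) => {
    const trimmed = line.trim();
    const lineNumber = index + 1;
    if (insideBlock) {
      if (trimmed === '::') insideBlock = false;
      return; // LaTeX inside a block is left untouched
    }
    if (trimmed === '::latex()') {
      insideBlock = true;
    } else if (trimmed.includes('::latex()')) {
      problems.push(`line ${lineNumber}: ::latex() must be on its own line`);
    } else if (/\$[^$\s][^$]*\$/.test(trimmed)) {
      problems.push(`line ${lineNumber}: inline math is not rendered`);
    }
  });
  if (insideBlock) problems.push('unclosed ::latex() block');
  return problems;
}
```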

### Footnotes and citations

This is critical — do not drop citations. Convert them to markdown footnotes:

- LaTeX `\footnote{text}` becomes `[^N]` inline with `[^N]: text` at the end
- LaTeX `\cite{key}` and `\citep{key}` — look up the key in the `.bib` file and write a proper footnote with the full citation: `[^N]: Author et al., Title, Year. URL if available`
- Group all footnote definitions at the end of the document under a `---` separator
- Use sequential numbering starting from `[^1]`

Example:
```
The seminal result by Sutton[^1] established that...

---

[^1]: Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press.
```

### Tables

Convert LaTeX `\begin{tabular}` to pipe-delimited markdown tables:

```
| Model | Parameters | AIME 2024 |
|---|---|---|
| INTELLECT-3 | 106B (12B active) | 90.8 |
| DAPO | 32B | 50.0 |
```

For table-of-contents sections, convert to a table with Section and Title columns. Omit page numbers, which are meaningless in the web format.

### Figures and images

1. Find all image files in the source folder (`.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`, `.svg`)
2. Upload each using the presigned URL API (see Image Upload below)
3. Reference in markdown: `![Caption text](uploaded-url)`
4. Write meaningful alt text / captions — do not leave empty
5. PDF figures (`.pdf` files in `figures/` directories) cannot display in browsers, so do not reference them directly as images. Note them, and convert a figure to PNG when it is needed (see "Fallback: figures as images")

### Lists

- Convert `\begin{itemize}...\end{itemize}` to `- item` markdown lists
- Convert `\begin{enumerate}...\end{enumerate}` to `1. item` numbered lists
- Preserve nesting with indentation

### Blockquotes

Use `> text` for notable quotes or callout content. Note: blockquotes trigger forest generation in the 3D scene on grass worlds.

### Mermaid diagrams

If the paper describes system architecture, training pipelines, or data flow, consider adding Mermaid diagrams to make the content richer:

```
::mermaid(`flowchart LR
    A[Input] --> B[Process]
    B --> C[Output]`)
```

**Mermaid uses backtick-argument syntax, NOT the block-delimiter syntax.** The entire diagram is a single backtick-delimited argument to `::mermaid()`. Do NOT use the `::mermaid()` / `::` open/close pattern used by `::latex()` — that will render as raw text instead of a diagram.

Correct (backtick argument — renders as diagram):
```
::mermaid(`flowchart LR
    A --> B`)
```

Wrong (block delimiters — renders as raw text):
```
::mermaid()
flowchart LR
    A --> B
::
```

In a TypeScript string array, each line of the diagram is a separate string. The first string opens with `` '::mermaid(` `` and the last string ends with `` `)' ``. Backticks need no escaping inside single- or double-quoted strings:

```typescript
'::mermaid(`flowchart LR',
'    A[Input] --> B[Process]',
'    B --> C[Output]`)',
```

Avoid `\n` inside node labels — use short labels instead. Avoid inline `style` directives — the renderer applies its own theme.

Supported diagram types: flowcharts, sequence diagrams, class diagrams, state diagrams, Gantt charts, ER diagrams. Pie charts are NOT supported (monochrome theme cannot distinguish slices).

### Graph blocks

Use `` ::graph(type, `json`) `` for native SVG charts when the paper has quantitative results worth visualizing. The first argument is the graph type; the second is a JSON payload wrapped in backticks.

#### Critical rules

The parser regex is `/::graph\(\s*([a-zA-Z0-9_-]+)\s*,/` — the type argument must contain only letters, digits, hyphens, and underscores. **Never use a descriptive title, sentence, or name with spaces as the type argument.** If the type doesn't match a supported name, the graph silently fails or renders "Unsupported graph."

The payload must be `{ "data": [...], "options": { ... }, "legend": [...] }` with flat arrays. Do NOT use Chart.js-style `{ "labels": [...], "datasets": [{ "data": [...] }] }` — the renderer does not understand that format.

#### Choosing the right type

15 graph types are available. Each expects a specific data shape — using the wrong one produces empty or broken output.

| Type | Use for | Key data fields |
|---|---|---|
| `histogram` | Simple vertical bar comparisons | `label`, `value` |
| `horizontal-bar` | Simple horizontal bar comparisons | `label`, `value` |
| `bar-lines` | Multi-series grouped bar comparison | `year`, `years[]` with `name`, `value`, `color` |
| `line` | Single-series time series or training curves | `date` or `label`, `value`, optional `lower_ci`/`upper_ci` |
| `area` | Single-series filled time series | `date`, `value` |
| `distribution` | Horizontal bars with dot endpoint | `label`, `value` |
| `dotplot` | Horizontal dot plot | `label`, `value` |
| `bubble` | Scatter with sized circles | `x`, `y`, `value`, `category` |
| `grouped-bubbles` | Packed circle layout | `name`, `count` |
| `radar` | Spider/radar chart (array of series) | `axis`, `value` |
| `cohort` | Heatmap grid | `group`, `variable`, `value` |
| `tree` | Hierarchical tree (nested object) | `name`, `children`, `value` |
| `candlestick` | OHLC financial chart | `date`, `open`, `high`, `low`, `close` |
| `column` | Stacked positive/neutral/negative columns | `category`, `positive`, `neutral`, `negative` |
| `diverging-stacked-bar` | Horizontal stacked +/- bars | `category`, `positive`, `neutral`, `negative` |

Common mistakes to avoid:

- **`column` is NOT a simple bar chart.** It renders stacked positive/neutral/negative segments. For simple vertical bars, use `histogram`. For simple horizontal bars, use `horizontal-bar`.
- **`line` and `area` are single-series only.** They render one line/fill from a flat array of `{ date, value }` points. If the paper figure overlays multiple line series (e.g. two training curves compared), `line`/`area` cannot represent it. Either pick the most important single series, use `bar-lines` if the comparison works as grouped bars, or omit the graph and let the narrative text and tables carry the data.
- **Multi-series comparisons → `bar-lines`.** When a paper figure compares 2–4 methods across N categories (grouped bar charts, clustered columns), use `bar-lines` with `[{ "year": "Category", "years": [{ "name": "Method A", "value": N, "color": "..." }, ...] }]`.
- **Don't force a graph.** If a paper figure is a complex multi-panel visualization, has dual axes, overlays multiple line series, or uses a chart type not in the table above, omit the `::graph()` call. The surrounding prose and tables already describe the data. A missing graph is better than a misleading one.
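
The rules above lend themselves to a quick pre-flight check on each `::graph()` call. The sketch below is hypothetical (`checkGraphCall` is not part of the project). It reuses the parser regex quoted under "Critical rules" and the type names from the table, and only checks the top-level payload shape, not the per-type data fields.

```typescript
// Type names copied from the table above.
const SUPPORTED_GRAPH_TYPES = new Set([
  'histogram', 'horizontal-bar', 'bar-lines', 'line', 'area',
  'distribution', 'dotplot', 'bubble', 'grouped-bubbles', 'radar',
  'cohort', 'tree', 'candlestick', 'column', 'diverging-stacked-bar',
]);

// Parser regex quoted in "Critical rules".
const GRAPH_TYPE_PATTERN = /::graph\(\s*([a-zA-Z0-9_-]+)\s*,/;

// Hypothetical check: returns a list of problems for one ::graph() call.
function checkGraphCall(call: string): string[] {
  const typeMatch = GRAPH_TYPE_PATTERN.exec(call);
  if (!typeMatch) return ['type argument does not match the parser regex'];
  const problems: string[] = [];
  const graphType = typeMatch[1] ?? '';
  if (!SUPPORTED_GRAPH_TYPES.has(graphType)) problems.push(`unsupported type "${graphType}"`);

  // The payload is the text between the first and last backtick.
  const start = call.indexOf('`');
  const end = call.lastIndexOf('`');
  if (start === -1 || end <= start) return [...problems, 'payload is not wrapped in backticks'];

  let payload: unknown;
  try {
    payload = JSON.parse(call.slice(start + 1, end));
  } catch {
    return [...problems, 'payload is not valid JSON'];
  }
  if (typeof payload !== 'object' || payload === null) return [...problems, 'payload is not an object'];
  if ('datasets' in payload || 'labels' in payload) problems.push('Chart.js-style payload');
  if (!('data' in payload)) problems.push('payload has no "data" field');
  return problems;
}
```

An empty result means only that the call has the expected outer shape; whether the chosen type suits the figure is still a judgment call.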

Use a `"title"` field in the payload to give the graph a descriptive heading. Without it, the heading defaults to a generic label like "Horizontal Bar Graph". With it, the reader sees the original figure title from the paper:

```json
{ "title": "Overall Oracle-Normalized Score", "data": [...], ... }
```

Use CSS color variables for theme consistency: `var(--theme-graph-primary)`, `var(--theme-graph-option-1)` through `var(--theme-graph-option-8)`.

#### Examples

Single-series horizontal bars (ablation study):

```
::graph(horizontal-bar, `{
  "data": [
    { "label": "Baseline", "value": 30 },
    { "label": "+ Technique A", "value": 38 },
    { "label": "+ Technique B", "value": 42 },
    { "label": "Full method", "value": 50 }
  ],
  "options": { "height": 384 },
  "legend": [{ "label": "SCORE", "color": "var(--theme-graph-primary)" }]
}`)
```

Multi-series grouped comparison (two methods across categories):

```
::graph(bar-lines, `{
  "data": [
    { "year": "Navigation", "years": [{ "name": "Method A", "value": 0.456, "color": "var(--theme-graph-option-1)" }, { "name": "Method B", "value": 0.250, "color": "var(--theme-graph-option-2)" }] },
    { "year": "Planning", "years": [{ "name": "Method A", "value": 0.334, "color": "var(--theme-graph-option-1)" }, { "name": "Method B", "value": 0.402, "color": "var(--theme-graph-option-2)" }] }
  ],
  "options": { "height": 384 },
  "legend": [{ "label": "Method A", "color": "var(--theme-graph-option-1)" }, { "label": "Method B", "color": "var(--theme-graph-option-2)" }]
}`)
```

#### Fallback: figures as images

When a paper figure cannot be represented by any supported graph type (multi-panel layouts, overlaid multi-series lines, dual axes, complex annotations), convert the source PDF to a PNG and embed it as a markdown image instead:

1. Convert with `sips -s format png --resampleWidth 2400 input.pdf --out output.png` (macOS)
2. Place the PNG in `public/cases/{slug}/`
3. Reference it in the case post: `![Caption describing the figure.](/cases/{slug}/filename.png)`

This is better than forcing data into a wrong graph type. The themed `::graph()` charts are preferred when the data fits, but a clear figure image is better than a misleading graph.

The full per-graph data schema reference is in `public/SKILL.md`.

### Table of contents

When converting a paper that has a table of contents (common in PDF extractions), construct it as a markdown table with section numbers and titles:

```
## Contents

| Section | Title |
|---|---|
| 1 | Introduction |
| 2 | Related Work |
| 2.1 | Reinforcement Learning |
| 2.2 | Self-Play Methods |
| 3 | Method |
| 3.1 | Architecture |
| 3.2 | Training Pipeline |
| 4 | Experiments |
| 5 | Conclusion |
```

If section headings are used throughout the document (as `## 1 Introduction`, etc.), the table of contents provides a navigable overview at the top. Omit page numbers — they are meaningless in the web format.

### Footnotes — when to use them

Beyond converting existing `\cite{}` and `\footnote{}`, add footnotes proactively when:

- A claim references a specific paper, dataset, or benchmark — cite it
- A number or statistic comes from a source that should be traceable
- The original paper uses numbered references like `[1]`, `[2, 3]` — map each to a footnote
- An acronym or system name is introduced — footnote the full reference on first use

Footnote definitions support full markdown including links:

```
[^1]: Vaswani, A., et al. (2017). Attention Is All You Need. NeurIPS 2017. https://arxiv.org/abs/1706.03762
```

### What to strip

- LaTeX preamble (`\documentclass`, `\usepackage`, `\newcommand`, etc.)
- `\maketitle`, `\tableofcontents`, `\begin{document}`, `\end{document}`
- `\label{}`, `\ref{}`, `\eqref{}`, `\autoref{}`
- NeurIPS/ICML checklists (`\begin{checklist}...`)
- Compilation artifacts (`.aux`, `.log`, `.bbl`, `.blg`)
- LaTeX comments (`% ...`)
- Style files (`.sty`, `.cls`)

## Image upload

Read the API key from the project `.env` file — the variable is `INTDEV_IMAGE_UPLOAD_API_KEY`. This is a real key that works for presigned-URL uploads to the VMAX S3 bucket.

```bash
# Read the key
source .env
echo $INTDEV_IMAGE_UPLOAD_API_KEY
```

Upload is a two-step presigned-URL flow:

```bash
# Step 1: request a presigned upload URL from the API
curl -X POST https://api.internet.dev/api/data/generate-presigned-url \
  -H "X-API-KEY: $INTDEV_IMAGE_UPLOAD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type": "image/png", "file": "figure1.png", "size": 102400, "domain": "vmax.ai"}'

# Response: { "uploadURL": "https://s3.amazonaws.com/...", "url": "https://intdev-global.s3..." }
# The "url" (or "fileURL") field is the permanent public URL to use in markdown.

# Step 2: PUT the binary file to the presigned URL
# ($uploadURL must hold the "uploadURL" value from the Step 1 response)
curl -X PUT "$uploadURL" \
  -H "Content-Type: image/png" \
  --data-binary @figure1.png

# Afterwards, use the permanent URL in your markdown:
# ![Figure 1: Training dynamics](https://intdev-global.s3.us-west-2.amazonaws.com/public/vmax-ai/uuid.png)
```

Max file size: 15 MB. Supported MIME types: `image/png`, `image/jpeg`, `image/gif`, `image/webp`, `image/svg+xml`.

Upload every image in the source folder that has a supported MIME type, including `.svg` files. Walk subdirectories like `figures/` and `figs/`. Skip `.pdf` figure files (they cannot render in `<img>` tags).
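
For scripted uploads, the same flow might look like the sketch below. The endpoint, header, and request body fields are copied from the curl example. `uploadImage`, `FetchLike`, and `PresignResponse` are hypothetical names, and treating 15 MB as `15 * 1024 * 1024` bytes is an assumption. The fetch function is injected so the flow can be exercised with a stub.

```typescript
// Hypothetical shapes; only the field names come from the curl example.
interface PresignResponse {
  uploadURL: string;
  url?: string;
  fileURL?: string;
}

type FetchLike = (
  input: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string | Uint8Array },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const MAX_BYTES = 15 * 1024 * 1024; // ASSUMPTION: "15 MB" read as MiB

async function uploadImage(
  fileName: string,
  mimeType: string,
  bytes: Uint8Array,
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<string> {
  if (bytes.byteLength > MAX_BYTES) throw new Error(`${fileName} exceeds the 15 MB limit`);

  // Step 1: request a presigned upload URL.
  const presign = await fetchImpl('https://api.internet.dev/api/data/generate-presigned-url', {
    method: 'POST',
    headers: { 'X-API-KEY': apiKey, 'Content-Type': 'application/json' },
    body: JSON.stringify({ type: mimeType, file: fileName, size: bytes.byteLength, domain: 'vmax.ai' }),
  });
  if (!presign.ok) throw new Error(`presign failed with status ${presign.status}`);
  const data = (await presign.json()) as PresignResponse;

  // Step 2: PUT the binary file to the presigned URL.
  const put = await fetchImpl(data.uploadURL, {
    method: 'PUT',
    headers: { 'Content-Type': mimeType },
    body: bytes,
  });
  if (!put.ok) throw new Error(`upload failed with status ${put.status}`);

  // The permanent public URL arrives as "url" or "fileURL".
  const publicUrl = data.url ?? data.fileURL;
  if (!publicUrl) throw new Error('response contained no public URL');
  return publicUrl;
}
```

In a runtime that provides a global `fetch`, that function would be the natural argument for `fetchImpl`. The returned string is the value to place in the markdown image reference.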

## Output file templates

### case-post.ts

Follow the pattern from `app/-/sample/sample-post.ts`. The file exports a metadata object and a `getCaseRenderData()` function that returns the full markdown string with the isometric scene, title, and byline prepended.

```typescript
export const CASE_POST = {
  title: "Paper Title Here",
  description: "A clean sentence from the abstract, under 155 chars, for SEO.",
  authors: "Author One, Author Two",
  affiliations: "MIT, Stanford",
  events: "",
  publishDate: "May 12, 2026",
  slug: "paper-slug",
};
```

### World keywords in the `events` field

The `events` string in `CASE_POST` controls the 3D isometric scene that renders behind the post. Set it based on the paper's subject matter:

| Keyword | Scene mode | Visual effect |
|---|---|---|
| `CTF` | `ctf` | Fleet ships, armed characters, castle raid — use for papers about capture-the-flag, security challenges, adversarial environments |
| `Wander` | `wander` | Civilian characters wandering — use for papers about exploration, navigation, open-ended agents |
| *(empty string)* | `island` | Static terrain, no agents — the default for most papers |

Keywords are case-insensitive and matched with regex (`/ctf/i`, `/wander/i`). CTF takes precedence if both are present. Multiple events are comma-separated: `"CTF, Benchmark"`.

Content signals also affect the terrain variant:

| Signal | Terrain |
|---|---|
| More than 2 images in the post | `GRASS` world (green terrain, forests spawn near blockquotes) |
| 2 or fewer images | `ICE` world (frozen terrain) |
| Very short / empty post | `DESERT` world (dunes) |

When choosing events, scan the paper's topic: if it involves CTF challenges, security competitions, or adversarial flag-capture tasks, set `events: "CTF"`. If it involves agent exploration or wandering behavior, set `events: "Wander"`. Otherwise leave it empty.
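
To predict which scene a post will get, the two tables above can be expressed as a pair of functions. This is a hypothetical sketch, not the engine's code: the function names are invented, the "very short" threshold is not quantified in this guide and is exposed as a parameter with an arbitrary default, and checking post length before image count is also an assumption.

```typescript
type SceneMode = 'ctf' | 'wander' | 'island';
type Terrain = 'GRASS' | 'ICE' | 'DESERT';

// Mirrors the keyword table: CTF wins when both keywords are present.
function sceneModeFromEvents(events: string): SceneMode {
  if (/ctf/i.test(events)) return 'ctf';
  if (/wander/i.test(events)) return 'wander';
  return 'island';
}

// Counts markdown image references to choose the terrain.
// ASSUMPTION: shortPostChars is an arbitrary stand-in for "very short".
function terrainFromContent(markdown: string, shortPostChars = 200): Terrain {
  if (markdown.trim().length < shortPostChars) return 'DESERT';
  const imageCount = (markdown.match(/!\[[^\]]*\]\([^)]+\)/g) ?? []).length;
  return imageCount > 2 ? 'GRASS' : 'ICE';
}
```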

The rest of `case-post.ts` exports `getCaseRenderData()`:

```typescript
export function getCaseRenderData() {
  const markdown = [
    "First line of content",
    "",
    "## Introduction",
    "",
    "The body of the paper...",
    "",
    "::latex()",
    "E = mc^2",
    "::",
    "",
    "---",
    "",
    "[^1]: Citation text here.",
  ].join('\n');

  return `::isometric(0)

# ${CASE_POST.title}

::byline([${CASE_POST.authors}|${CASE_POST.description}||${CASE_POST.publishDate}||${CASE_POST.affiliations}|${CASE_POST.events}])

${markdown}`;
}
```

The byline format is: `::byline([authors|description|externalLink|date|correspondence|affiliations|events])` — empty fields use empty strings between pipes.
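
Because the byline is positional, a small builder can make the field order harder to get wrong. The sketch below is hypothetical (`BylineFields` and `buildByline` are not part of the project). It assumes a literal `|` inside a field would be read as a separator, so it rejects such values instead of guessing at an escape syntax.

```typescript
interface BylineFields {
  authors: string;
  description: string;
  externalLink?: string;
  date: string;
  correspondence?: string;
  affiliations: string;
  events: string;
}

// Hypothetical builder: joins the seven fields in the documented order.
function buildByline(fields: BylineFields): string {
  const ordered = [
    fields.authors,
    fields.description,
    fields.externalLink ?? '',
    fields.date,
    fields.correspondence ?? '',
    fields.affiliations,
    fields.events,
  ];
  // ASSUMPTION: "|" cannot be escaped, so it is rejected here.
  for (const value of ordered) {
    if (value.includes('|')) throw new Error(`byline field contains "|": ${value}`);
  }
  return `::byline([${ordered.join('|')}])`;
}
```

Omitted optional fields become empty strings between pipes, matching the template in `getCaseRenderData()`.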

Each line of the markdown array becomes a separate line in the rendered output. Use `""` for blank lines between paragraphs.

### page.tsx

```tsx
import '@root/global-block-size-public-post.css';
import 'katex/dist/katex.min.css';

import DefaultLayout from '@document-system-components/page/DefaultLayout';
import Document from '@document-system-components/Document';
import Providers from '@document-system-components/Providers';

import { stringToSeed } from '@engine/common/seeded-random';

import { getCaseRenderData, CASE_POST } from './case-post';

export const dynamic = 'force-dynamic';

export async function generateMetadata() {
  const url = 'https://vmax.ai/-/cases/SLUG_HERE';
  return {
    metadataBase: new URL('https://vmax.ai'),
    title: CASE_POST.title,
    description: CASE_POST.description,
    url,
    openGraph: {
      title: CASE_POST.title,
      description: CASE_POST.description,
      url,
      images: ['https://intdev-global.s3.us-west-2.amazonaws.com/public/internet-dev/e5748d60-a03a-489f-9f56-bc6b1c8166cc.png'],
    },
    twitter: {
      title: CASE_POST.title,
      description: CASE_POST.description,
      url,
      handle: '@vmaxai',
      cardType: 'summary_large_image',
    },
  };
}

export default async function CasePage() {
  return (
    <Providers>
      <DefaultLayout previewPixelSRC="https://intdev-global.s3.us-west-2.amazonaws.com/template-app-icon.png">
        <Document isMarkdown data={getCaseRenderData()} worldSeed={stringToSeed(CASE_POST.slug)} />
      </DefaultLayout>
    </Providers>
  );
}
```

### paper/page.tsx

```tsx
import '@root/global-block-size-public-post.css';
import 'katex/dist/katex.min.css';

import { stringToSeed } from '@engine/common/seeded-random';
import { getCaseRenderData, CASE_POST } from '../case-post';

import DefaultLayout from '@document-system-components/page/DefaultLayout';
import Document from '@document-system-components/Document';
import PaperProviders from '@document-system-components/PaperProviders';

export const dynamic = 'force-dynamic';

export async function generateMetadata() {
  const url = 'https://vmax.ai/-/cases/SLUG_HERE/paper';
  return {
    metadataBase: new URL('https://vmax.ai'),
    title: `${CASE_POST.title} Paper`,
    description: CASE_POST.description,
    url,
    openGraph: {
      title: `${CASE_POST.title} Paper`,
      description: CASE_POST.description,
      url,
      images: ['https://intdev-global.s3.us-west-2.amazonaws.com/public/internet-dev/e5748d60-a03a-489f-9f56-bc6b1c8166cc.png'],
    },
    twitter: {
      title: `${CASE_POST.title} Paper`,
      description: CASE_POST.description,
      url,
      handle: '@vmaxai',
      cardType: 'summary_large_image',
    },
  };
}

export default async function CasePaperPage() {
  return (
    <PaperProviders>
      <DefaultLayout previewPixelSRC="https://intdev-global.s3.us-west-2.amazonaws.com/template-app-icon.png">
        <Document isMarkdown data={getCaseRenderData()} worldSeed={stringToSeed(`${CASE_POST.slug}-paper`)} paper />
      </DefaultLayout>
    </PaperProviders>
  );
}
```

## Quality checklist

Before reporting a case as done, verify:

- [ ] Title is clean (no LaTeX commands, no arXiv metadata, no author names mixed in)
- [ ] Authors field contains only names, not institutions or emails
- [ ] Affiliations field contains only institutions, not author names
- [ ] Abstract/description is a clean English sentence under 155 chars
- [ ] All display equations are in `::latex()` blocks with `::latex()` and `::` on their own lines
- [ ] Citations are converted to markdown footnotes with full reference text
- [ ] Images are uploaded and referenced with valid URLs (not local paths)
- [ ] Tables use pipe-delimited markdown format
- [ ] No LaTeX artifacts remain in the prose (no `\textbf`, `\cite`, `\\`, stray braces)
- [ ] The `page.tsx` and `paper/page.tsx` match the template exactly
- [ ] The slug in the URL matches the slug in the metadata
- [ ] The content reads as a coherent document, not a mechanical translation

## Common pitfalls

| Pitfall | What to do instead |
|---|---|
| Stripping `\cite{}` references silently | Look up the bib entry and create a footnote |
| Leaving `$inline math$` as-is | Convert to `::latex()` block or write as plain text |
| Putting `::latex()` and `::` on the same line as content | Each delimiter gets its own line |
| Dumping raw PDF text without structure | Identify sections, equations, and tables; reformat as structured markdown |
| Using local image paths | Upload via presigned URL API and use the returned URL |
| Mixing author names and affiliations | Parse them apart — names in `authors`, institutions in `affiliations` |
| Including `\begin{document}`, preamble commands, or checklists | Strip all of it |
| Forgetting the `---` separator before footnotes | Add it for visual separation |
| Writing empty descriptions | Use the first sentence of the abstract |
| Writing `[^N]:` mid-sentence (e.g. `result[^5]: the data`) | The parser treats `[^N]:` as a footnote definition, breaking the paragraph. Use `[^N].` or `[^N] —` instead |
| Referencing the same `[^N]` inline more than once | The renderer creates React keys from footnote refs — duplicates cause key collisions. Each footnote number should appear exactly once inline |
