ProcStack's AI Metadata Specification v0.2

Updated : August 1st 2025

This document defines custom metadata and link conventions used across the ProcStack project pages to expose structured data to bots, crawlers, and language models.

It covers the meta tags, link tags, discovery files, and JSON schemas that AI systems can use to locate and read that data.

Meta Tags

Link Tags

Discovery Files

JSON Data Schema

Individual Page Data (e.g., /bots/PageName.htm.json)

{
  "title": "string",
  "description": "string",
  "lastModified": "YYYY-MM-DD",
  "media": [
    {
      "type": "video|image",
      "src": "relative/path/to/media",
      "alt": "alt text",
      "caption": "media description"
    }
  ]
}
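
As an illustration of how a consumer might check a page file against the schema above, here is a minimal Python sketch. The field names come from the schema; the function name `validate_page_data` and the specific checks (which fields are required, how strictly dates are parsed) are assumptions, not part of this specification.

```python
from datetime import date

# Field names are taken from the schema above. Treating all three text
# fields as required is an assumption made for this sketch.
REQUIRED_TEXT_FIELDS = ("title", "description", "lastModified")
MEDIA_TYPES = {"video", "image"}


def validate_page_data(page: dict) -> list[str]:
    """Return a list of problems found in a page JSON object (empty if none)."""
    problems = []
    for field in REQUIRED_TEXT_FIELDS:
        if not isinstance(page.get(field), str):
            problems.append(f"{field}: expected a string")
    last_modified = page.get("lastModified")
    if isinstance(last_modified, str):
        try:
            date.fromisoformat(last_modified)  # parses YYYY-MM-DD
        except ValueError:
            problems.append("lastModified: expected YYYY-MM-DD")
    for index, item in enumerate(page.get("media") or []):
        if not isinstance(item, dict):
            problems.append(f"media[{index}]: expected an object")
            continue
        if item.get("type") not in MEDIA_TYPES:
            problems.append(f"media[{index}].type: expected 'video' or 'image'")
        if not isinstance(item.get("src"), str):
            problems.append(f"media[{index}].src: expected a string")
    return problems
```

An empty list means the object matched every check in the sketch; each string otherwise names the offending field.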
  

Site Content Manifest (/bots/siteContent.json)

{
  "[pageKey]": {
    "jsonURL": "absolute URL to page JSON",
    "lastModified": "YYYY-MM-DD",
    "title": "page title",
    "description": "page description",
    "media": [...],
    "content": "full HTML content",
    "pageURL": "absolute page URL",
    "relativeURL": "relative page path"
  }
}
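
A short Python sketch of reading the manifest follows. It assumes the manifest is a single JSON object keyed by page, as shown above; the function name `list_pages` and the choice to sort newest first are illustrative only.

```python
import json


def list_pages(manifest_text: str) -> list[tuple[str, str, str]]:
    """Return (pageKey, pageURL, lastModified) tuples, newest first."""
    manifest = json.loads(manifest_text)
    rows = [
        (key, entry.get("pageURL", ""), entry.get("lastModified", ""))
        for key, entry in manifest.items()
    ]
    # YYYY-MM-DD strings sort chronologically when compared as plain text.
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Because the manifest also embeds each page's full HTML in `content`, a consumer that only needs titles and dates may prefer to read these summary fields and fetch `jsonURL` later.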
  

Use Cases

Example 1 - Site Root

URL : https://procstack.github.io/index.htm

<!-- AI/LLM Data Discovery -->
<meta name="ai:data-source" content="https://procstack.github.io/bots/siteContent.json">
<meta name="ai:data-manifest" content="https://procstack.github.io/data-manifest.json">
<link rel="alternate" type="application/json" href="https://procstack.github.io/bots/siteContent.json" title="Full JSON of all Page's Content Data">
<link rel="data-manifest" type="application/json" href="https://procstack.github.io/data-manifest.json" title="Data Sources Manifest" />
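
A crawler could pick these tags out of a page with Python's standard `html.parser` module. The sketch below is one possible approach; the class name `AIDiscoveryParser` and the decision to collect every `ai:`-prefixed meta tag and every `application/json` link are assumptions for illustration.

```python
from html.parser import HTMLParser


class AIDiscoveryParser(HTMLParser):
    """Collect ai:* meta tags and JSON link tags from a page's markup."""

    def __init__(self):
        super().__init__()
        self.meta = {}   # maps meta name (e.g. "ai:data-source") to its content
        self.links = []  # (rel, href) pairs for application/json links

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").startswith("ai:"):
            self.meta[attributes["name"]] = attributes.get("content")
        elif tag == "link" and attributes.get("type") == "application/json":
            self.links.append((attributes.get("rel"), attributes.get("href")))


def discover(html_text: str) -> AIDiscoveryParser:
    """Parse markup and return the parser holding the collected tags."""
    parser = AIDiscoveryParser()
    parser.feed(html_text)
    return parser
```

Calling `discover(page_html).meta.get("ai:data-source")` on the markup above would return the `siteContent.json` URL.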
  

Example 2 - Individual Page

URL : https://procstack.github.io/ProjectsLinks/currentsOfWar.htm

<!-- AI/LLM Data Discovery -->
<meta name="ai:data-source" content="https://procstack.github.io/bots/ProjectsLinks_currentsOfWar.htm.json">
<meta name="ai:data-manifest" content="https://procstack.github.io/data-manifest.json">
<link rel="alternate" type="application/json" href="https://procstack.github.io/bots/ProjectsLinks_currentsOfWar.htm.json" title="Single Page Content Data">
<link rel="data-manifest" type="application/json" href="https://procstack.github.io/data-manifest.json" title="Data Sources Manifest" />
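
The page URL and its JSON URL in this example suggest a naming pattern: the page path with `/` replaced by `_`, placed under `/bots/`, with `.json` appended. The Python sketch below encodes that pattern. It is inferred from this single example rather than stated by the specification, and the function name `page_json_url` is hypothetical; reading the `ai:data-source` tag remains the reliable way to find the file.

```python
from urllib.parse import urlsplit


def page_json_url(page_url: str) -> str:
    """Guess a page's JSON URL from its page URL (pattern inferred, not specified)."""
    parts = urlsplit(page_url)
    # "/ProjectsLinks/currentsOfWar.htm" -> "ProjectsLinks_currentsOfWar.htm"
    flattened = parts.path.lstrip("/").replace("/", "_")
    return f"{parts.scheme}://{parts.netloc}/bots/{flattened}.json"
```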
  

Example 3 - API Endpoint

<meta name="ai:content-api" content="https://procstack.github.io/bots/">
  

Example 4 - Enhanced robots.txt

User-agent: *
Allow: /

# AI/LLM-specific directives
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /

# Data endpoints
AI-Data-Manifest: https://procstack.github.io/data-manifest.json
AI-Content-API: https://procstack.github.io/bots/
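
`AI-Data-Manifest` and `AI-Content-API` are custom fields defined by this specification, so general-purpose robots.txt parsers will typically ignore them. A consumer can read them with a few lines of Python; the function name `find_ai_directives` and the rule of collecting every field that starts with `AI-` are assumptions for this sketch.

```python
def find_ai_directives(robots_txt: str) -> dict[str, str]:
    """Return the AI-* fields found in a robots.txt body as {name: value}."""
    directives = {}
    for raw_line in robots_txt.splitlines():
        # Drop comments. This would also truncate a URL containing "#",
        # which is acceptable for this sketch.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so the "https://" in the value survives.
        name, value = line.split(":", 1)
        if name.strip().lower().startswith("ai-"):
            directives[name.strip()] = value.strip()
    return directives
```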
  

Discovery Workflow for AI Systems

  1. Check robots.txt for the AI-Data-Manifest directive.
  2. Fetch data-manifest.json for an overview of all data sources.
  3. Use siteContent.json for bulk content, or an individual page's JSON file for a single page.
  4. Parse llms.txt for a human-readable content summary.
  5. Compare lastModified dates with previously fetched copies and skip pages that have not changed.
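
Step 5 can be sketched in Python as a comparison between the manifest and a local cache. The cache shape (a mapping from page key to the `lastModified` value seen on the previous crawl) and the function name `pages_to_refetch` are assumptions; the manifest fields are those of siteContent.json.

```python
def pages_to_refetch(manifest: dict, cache: dict[str, str]) -> list[str]:
    """Return the jsonURL of every page that is new or newer than the cache."""
    stale = []
    for key, entry in manifest.items():
        modified = entry.get("lastModified", "")
        # YYYY-MM-DD strings compare chronologically as plain text.
        if key not in cache or modified > cache[key]:
            stale.append(entry["jsonURL"])
    return stale
```

Pages whose date matches the cached value are skipped, so a repeat crawl only requests the JSON files that changed.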