Home

Semem

License Node.js Status

Semantic Web Memory for Intelligent Agents

...or, Graph RAG on steroids for the global knowledgebase

tl;dr - flipping ideas (mostly) from the LLM world over to the Semantic Web for massively simplified integration, at global scale

Semem Workbench

🐳 Quick Start with Docker: A complete Docker Compose setup is available for easy deployment with all dependencies included. See Docker Setup Guide for one-command installation.

Interactive Chat Interface: The workbench includes a natural language chat interface with slash commands (/ask, /tell, /help) and automatic URL/file ingestion. See Chat Documentation for details.

The hints page might help.

Semem Documentation - sprawling

click the triangles to expand the text

Semem is an experimental Node.js toolkit for AI memory management that integrates large language models (LLMs) with Semantic Web technologies (RDF/SPARQL). It offers knowledge graph retrieval and augmentation algorithms within a conceptual model based on the Ragno (knowledge graph description) and ZPT (knowledge graph navigation) ontologies. It is a tensegrity project.

The intuition is that while LLMs and associated techniques have massively advanced the field of AI and offer considerable utility, the typical approach is missing the elephant in the room: the Web - the biggest known knowledgebase in our universe. Semantic Web technologies offer data integration at a global scale, with tried & tested conceptual models for knowledge representation. There is a lot of low-hanging fruit.

Status 2025-08-28

Most of the system is functional but some systematic changes will be needed to deal with performance issues. Currently tidying up and consolidating. There was another explosion of cruft, and also the setup is fragile, minor changes can cause breakage. I've updated the tests but a little more structural work is needed.

previously...

MCP functionality focused down on 7 core verbs : ask, tell, augment, zoom, pan, tilt, inspect. See below and this blog post (in which Claude demonstrates an inability to count).

The UI has been totally re-written to reflect this, see workbench-howto. Currently testing.

Most recent direct workflow experiment : PDF ingestion

SPARQL Document Ingestion: Use examples/ingestion/SPARQLIngest.js for importing documents from SPARQL endpoints with configurable query templates (blog-articles, generic-documents, wikidata-entities). Supports batch processing, authentication, and direct integration with the semantic memory system.

See also : blog (co-written with Claude Code)

Architecture Diagrams

Core Operation Flows: Semem's primary operations follow well-defined information flow patterns:

  • πŸ“₯ Tell Operation Flow - Shows how content is ingested, processed through embedding generation and concept extraction, then stored in the dual SPARQL+FAISS system
  • πŸ“€ Ask Operation Flow - Illustrates the multi-pass adaptive search process, context synthesis, and LLM-powered response generation
Mostly functional but very, very sketchy. It has an MCP server, HTTP API, a crude browser UI and code APIs. A lot to do before much will be remotely useful. It is in active development in June 2025. The codebase is big and chaotic, it is not for the fainthearted.

The codebase is registered as the npm package semem though there hasn't been much time spent on this angle, currently it's pretty much essential to use this repo.

The dev process has involved pushing out in various directions with spikes, then circling back to ensure the core is still functional, then consolidation. To date it's been a one-man + various AI assistants (and a dog) operation. Despite me trying to keep things modular so they can be worked on in isolation, it's still complex enough that Claude (and I) struggle. Collaborators would be very welcome.

It is feature-complete as originally conceived, in the sense of all the right notes, but not necessarily in the right order. There is a lot of cruft and numerous bugs. Right now it's in a consolidation phase.

System Overview

The SPARQL store, chat LLMs and embeddings service are all external. SPARQL uses the standard HTTP interfaces. There are also in-memory and JSON file storage subsystems but these are an artifact of dev history, though they can be useful as a fallback durin testing. LLMs use the hyperdata-clients library to simplify configuration.

The system is layered in a couple of dimensions: interfacing may be direct (SDK-style) API, via the HTTP server or MCP server. Functionality is grouped by purpose broadly into Basic, Ragno and ZPT.

There are fairly comprehensive demos under examples which exercise the different parts of the system (think manual integration tests).

Basic

This contains the low-level operations. It covers basic SPARQL store interactions, embeddings/semantic search and chat. There are also some minimal temporal/relevance-related parts that overlap with Ragno.

Internally the system relies on RDF-Ext and other RDFJS libraries for its graph model, FAISS for its primary vector-oriented functionality.

Ragno

This layer is concerned with the knowledgebase model as described by the Ragno Ontology. On top of the model are a set of algorithms that offer various knowledge retrieval and augmentation facilities. Most are lifted from the NodeRAG paper, with additions such as HyDE Hypothetical Document Embeddings and Vectorised Self-Organising Maps.

ZPT

This layer is concerned with knowledgegraph navigation built on the ZPT Ontology following an analogy from the film world, Zoom, Pan, Tilt. Algorithms have been created to handle parameterisation of filters/selection and corpus decomposition and chunking.

UI

Semem has a browser-based UI in progress. This won't be useful for actual knowledge work any time soon (if ever) but it will have a role in checking system behaviour and experimenting.


The description below is very AI-sloppy.

πŸš€ Quick Start - Simple Verbs Interface

Get started with Semem's natural language interface in 5 minutes:

Prerequisites

  • Node.js 20.11.0 or higher
  • npm (comes with Node.js)

Installation & Setup

  1. Clone and install:

    git clone https://github.com/danja/semem.git
    cd semem
    npm install
    
  2. Start the MCP server:

    npm run mcp:http
    # OR
    node mcp/http-server.js
    

    The server starts on http://localhost:4105 with Simple Verbs REST endpoints.

Try the 7 Simple Verbs

Store knowledge with tell:

curl -X POST http://localhost:4105/tell \
  -H "Content-Type: application/json" \
  -d '{"content": "Machine learning is a subset of AI that enables computers to learn", "type": "concept"}'

Query knowledge with ask:

curl -X POST http://localhost:4105/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "What is machine learning?"}'

Set context with zoom and pan:

# Set abstraction level
curl -X POST http://localhost:4105/zoom \
  -d '{"level": "entity"}'

# Apply domain filtering  
curl -X POST http://localhost:4105/pan \
  -d '{"domains": ["AI", "technology"], "keywords": ["machine learning"]}'

Check your ZPT state:

curl http://localhost:4105/state

Alternative Startup Options

Workbench Development:

npm run start:workbench  # Start workbench server (port 8081)
npm run dev              # Start webpack dev server with hot reload (port 9000)

Legacy API servers:

npm start  # Starts both API server (port 4100) and legacy UI server (port 4120)

MCP Server for Claude Desktop:

# Run MCP server for Claude Desktop integration (local dev)
npm run mcp

# Run MCP HTTP server (local dev)
npm run mcp:http

# Or via published package (most reliable method)
git clone https://github.com/danja/semem.git
cd semem
npm install
npm run mcp                      # Stdio MCP server
npm run mcp:http                 # HTTP MCP server on port 4105

# Alternative: Direct node execution
node mcp/index.js                # Stdio MCP server  
node mcp/http-server.js          # HTTP MCP server on port 4105

Development mode:

npm run dev              # Webpack dev server for workbench UI
npm run start:mcp        # MCP server backend

πŸ€– Claude Desktop Integration

Add Semem to your Claude Desktop MCP configuration:

{
  "mcpServers": {
    "semem": {
      "command": "node",
      "args": ["/path/to/semem/mcp/index.js"]
    }
  }
}

Alternative setup (after cloning repository):

{
  "mcpServers": {
    "semem": {
      "command": "npm",
      "args": ["run", "mcp"],
      "cwd": "/path/to/semem"
    }
  }
}

Then use the 7 Simple Verbs directly in Claude Desktop conversations!

πŸ–₯️ Workbench UI Features

Semantic Memory Workbench

The modern web-based interface implementing the 7 Simple Verbs:

Core Panels:

  • Tell: Store content with lazy/immediate processing options
  • Ask: Query knowledge with HyDE, Wikipedia, and Wikidata enhancements
  • Augment: Extract concepts and process lazy content
  • Navigate: ZPT (Zoom/Pan/Tilt) knowledge space navigation with real-time feedback
  • Inspect: Debug system state and session cache
  • Console: Real-time log viewing with filtering and search

Key Features:

  • Real-time ZPT State Display: Current zoom/pan/tilt settings with descriptions
  • Visual Navigation Feedback: Execute navigation with loading states and results
  • Session Statistics: Track interactions, concepts, and session duration
  • Connection Status: Live backend connectivity monitoring

Development Access:

npm run start:workbench  # Direct workbench server (port 8081)
npm run dev              # Webpack dev server with hot reload (port 9000)

Legacy UI Components

The original UI system includes VSOM visualization and SPARQL browser (via npm start).

πŸš€ Key Features

  • πŸ—£οΈ Simple Verbs Interface: 7-verb natural language API (tell, ask, augment, zoom, pan, tilt, inspect) for intuitive semantic operations
  • 🧠 Semantic Memory: Intelligent context retrieval and memory organization with vector embeddings and SPARQL
  • πŸ•ΈοΈ Knowledge Graph Processing: End-to-end Ragno pipeline for entity extraction and relationship modeling
  • 🎯 Zoom, Pan Tilt (ZPT): Knowledge navigation and processing with persistent state management
  • πŸ”Œ Model Context Protocol (MCP): JSON-RPC 2.0 API for seamless LLM and agent integration with workflow orchestration
  • πŸš€ MCP Prompts: 8 pre-built workflow templates for complex multi-step operations
  • πŸ” Advanced Algorithms: HyDE, VSOM, graph analytics, community detection, and Personal PageRank
  • πŸ“Š Interactive Visualizations: VSOM (Vector Self-Organizing Maps) for high-dimensional data exploration
  • πŸ”— Multi-Provider LLM Support: Ollama, Claude, Mistral, and other providers via unified connector system
  • πŸ“Š Multiple Storage Backends: In-memory, JSON, and SPARQL/RDF with caching optimization

πŸ—£οΈ Simple Verbs Interface

Semem features a 7-verb natural language interface that simplifies complex knowledge operations into conversational commands:

The 7 Simple Verbs

Verb Purpose Example
tell Store information with automatic embeddings tell: "Machine learning uses neural networks"
ask Query stored knowledge with semantic search ask: "What is machine learning?"
augment Extract concepts and enhance content augment: {"target": "text to analyze", "operation": "concepts"}
zoom Set abstraction level (entity/unit/text/community/corpus) zoom: {"level": "entity"}
pan Apply domain/temporal/keyword filtering pan: {"domains": ["AI"], "keywords": ["neural networks"]}
tilt Choose view perspective (keywords/embedding/graph/temporal) tilt: {"style": "embedding"}
inspect Debug and examine stored memories and session cache inspect: {"what": "session", "details": true}

Quick Example Workflow

# Store knowledge
curl -X POST http://localhost:4105/tell \
  -d '{"content": "The 7 Simple Verbs simplify semantic operations"}'

# Set context  
curl -X POST http://localhost:4105/zoom -d '{"level": "entity"}'
curl -X POST http://localhost:4105/pan -d '{"domains": ["MCP"], "keywords": ["verbs"]}'

# Query with context
curl -X POST http://localhost:4105/ask \
  -d '{"question": "What are the Simple Verbs?"}'

The system maintains persistent ZPT state across operations, enabling contextual conversations with your knowledge base. All verbs work via REST API, MCP protocol, or direct SDK calls.

See docs/PROMPT.md for detailed usage instructions.

πŸ“Š Data Visualization

Semem includes an advanced VSOM (Vector Self-Organizing Map) visualization system for exploring high-dimensional data:

Key Features

  • Interactive SOM grid visualization with zoom/pan
  • Real-time training visualization
  • Feature map exploration (U-Matrix, component planes)
  • Interactive clustering of SOM nodes
  • Responsive design for all screen sizes

Getting Started

  1. Navigate to the VSOM tab in the Semem UI
  2. Load or train a SOM model
  3. Explore the visualization and interact with nodes
  4. Use the feature maps to understand data relationships

For more details, see the VSOM Documentation.

πŸ“ Project Structure

semem/
β”œβ”€β”€ src/                           # Core library code
β”‚   β”œβ”€β”€ handlers/                 # LLM and embedding handlers
β”‚   β”œβ”€β”€ stores/                   # Storage backends (JSON, SPARQL, etc.)
β”‚   β”œβ”€β”€ connectors/               # LLM provider connectors
β”‚   β”œβ”€β”€ servers/                  # HTTP server implementations
β”‚   β”œβ”€β”€ ragno/                    # Knowledge graph algorithms
β”‚   β”œβ”€β”€ zpt/                      # Zero-Point Traversal system
β”‚   └── frontend/
β”‚       β”œβ”€β”€ workbench/            # Modern workbench UI (primary)
β”‚       β”‚   β”œβ”€β”€ public/           # Workbench source files
β”‚       β”‚   └── server.js         # Workbench development server
β”‚       └── [legacy]/             # Legacy UI components
β”œβ”€β”€ dist/                         # Build outputs
β”‚   └── workbench/               # Built workbench application
β”œβ”€β”€ examples/                     # Organized examples by category
β”‚   β”œβ”€β”€ basic/                   # Core functionality examples
β”‚   β”œβ”€β”€ ragno/                   # Knowledge graph examples
β”‚   β”œβ”€β”€ mcp/                     # MCP integration examples
β”‚   β”œβ”€β”€ zpt/                     # ZPT processing examples
β”‚   └── pending/                 # Work-in-progress examples
β”œβ”€β”€ mcp/                          # MCP server implementation
β”‚   β”œβ”€β”€ http-server.js           # Primary MCP HTTP server
β”‚   β”œβ”€β”€ tools/                   # MCP tool implementations
β”‚   └── prompts/                 # MCP workflow templates
β”œβ”€β”€ config/                       # Configuration files
β”œβ”€β”€ webpack.config.js             # Workbench build configuration
└── docs/                        # Comprehensive documentation

🌐 Server Architecture

Semem provides a complete HTTP server infrastructure with a modern workbench-first approach:

Primary Components

🧠 MCP Server (mcp/http-server.js)

The primary server providing both MCP protocol and Simple Verbs REST API:

  • Simple Verbs REST API: 7-verb natural language interface (tell, ask, augment, zoom, pan, tilt, inspect)
  • MCP Protocol: JSON-RPC 2.0 API for LLM integration (32 tools + 15 resources + 8 prompts)
  • Session Management: Persistent ZPT state across operations
  • Error Handling: Robust error recovery and partial results

Key Simple Verbs Endpoints:

POST /tell               # Store content with embeddings
POST /ask                # Query knowledge with semantic search
POST /augment            # Extract concepts and enhance content
POST /zoom               # Set abstraction level
POST /pan                # Apply domain/temporal filtering  
POST /tilt               # Choose view perspective
POST /inspect            # Debug system state
GET  /state              # Get current ZPT state

πŸ–₯️ Workbench (src/frontend/workbench/)

Modern web-based interface implementing the 7 Simple Verbs:

  • Interactive Panels: Tell, Ask, Augment, Navigate, Inspect, Console
  • Real-time Feedback: ZPT state display and navigation results
  • Development Server: Hot reload with webpack dev server
  • Session Tracking: Live statistics and connection monitoring

πŸ”₯ Legacy API Server (api-server.js)

REST API server for backward compatibility:

  • Memory Operations: Store, search, and retrieve semantic memories
  • Chat Interface: Conversational AI with context awareness
  • Embedding Services: Vector embedding generation and management
  • Configuration Management: Dynamic provider and storage configuration

πŸš€ Server Manager (server-manager.js)

Process management system for coordinating multiple server instances:

  • Process Lifecycle: Start, monitor, and gracefully stop servers
  • Port Management: Automatic port conflict resolution
  • Health Monitoring: Real-time process status tracking
  • Signal Handling: Graceful shutdown coordination
  • Logging: Centralized output management with timestamps

🎯 Start All (start-all.js)

Orchestration script for launching the complete server ecosystem:

  • Configuration Loading: Unified config system integration
  • Multi-Server Startup: Coordinated API and UI server launch
  • Interactive Control: Keyboard shortcuts for shutdown (Ctrl+C, 'q')
  • Error Handling: Robust startup failure recovery

Quick Server Deployment

Recommended - Workbench Development:

# Start MCP server (backend)
npm run start:mcp                 # MCP server (port 4105)

# Start workbench (frontend) 
npm run start:workbench           # Workbench server (port 8081)
# OR for development with hot reload
npm run dev                       # Webpack dev server (port 9000)

Legacy - Full Server Stack:

# Start all legacy servers
./start.sh                        # API server (4100) + UI server (4120)
# OR
npm start

# Individual server startup
node src/servers/api-server.js    # API only (port 4100)
node mcp/http-server.js           # MCP server (port 4105)

Server Configuration

Servers are configured via config/config.json:

{
  "servers": {
    "api": 4100,      # API server port
    "ui": 4120,       # UI server port
    "redirect": 4110, # Optional redirect port
    "redirectTarget": 4120
  },
  "storage": {
    "type": "sparql",  # or "json", "memory"
    "options": { /* storage-specific config */ }
  },
  "llmProviders": [
    { /* provider configurations */ }
  ]
}

🐳 Docker Deployment

Semem provides comprehensive Docker support for both development and production deployments with a multi-service architecture.

πŸ“– Complete Docker Guide - Detailed installation, configuration, and troubleshooting instructions

Quick Start with Docker

1. Production Deployment:

# Clone the repository
git clone https://github.com/danja/semem.git
cd semem

# Copy your existing .env or create from template
cp .env.docker.example .env
# Edit .env with your API keys (same format as local)

# Start all services
docker compose up -d

# Check service status
docker compose ps
docker compose logs -f semem

2. Development Deployment:

# Use your existing .env file (works as-is)
docker compose -f docker-compose.dev.yml up -d

# View logs
docker compose -f docker-compose.dev.yml logs -f semem-dev

ℹ️ Need Help? See the Docker Guide for detailed installation steps, troubleshooting, and advanced configuration options.

Services Architecture

The Docker deployment includes the following services:

Service Description Ports Purpose
semem Main application container 4100, 4101, 4102 API, MCP, Workbench servers
fuseki Apache Jena SPARQL database 3030 RDF/SPARQL storage backend
nginx Reverse proxy (optional) 80, 443 SSL termination, load balancing

Environment Configuration

Production Environment (.env.docker)

# Core Configuration
NODE_ENV=production
SEMEM_API_KEY=your-secure-api-key

# SPARQL Store Configuration
SPARQL_HOST=localhost         # SPARQL server hostname (localhost for local, fuseki for Docker)
SPARQL_PORT=3030             # SPARQL server port (3030 for local, 4050 for Docker external access)
SPARQL_USER=admin
SPARQL_PASSWORD=your-secure-password

# LLM Provider API Keys (configure at least one)
MISTRAL_API_KEY=your-mistral-key
CLAUDE_API_KEY=your-claude-key
OPENAI_API_KEY=your-openai-key
NOMIC_API_KEY=your-nomic-key

Development Environment (.env.docker.dev)

The development environment uses simplified configuration with optional external API keys.

Volume Management

Persistent Data:

  • fuseki_data: SPARQL database storage
  • semem_data: Application data and cache
  • semem_logs: Application logs

Configuration Management:

# Production: mount configuration
./config/config.docker.json:/app/config/config.json:ro

# Development: live code editing
./src:/app/src:ro
./mcp:/app/mcp:ro

Advanced Deployment Options

1. Production with SSL (using nginx profile)

# Generate SSL certificates
mkdir -p nginx/ssl
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout nginx/ssl/semem.key -out nginx/ssl/semem.crt

# Start with reverse proxy
docker compose --profile proxy up -d

2. Multi-architecture builds

# Build for multiple architectures
docker buildx build --platform linux/amd64,linux/arm64 -t semem:latest .

Health Monitoring

Service Health Checks:

# Check all services
docker compose ps

# Individual service health
curl http://localhost:4100/health    # API server
curl http://localhost:4102/health    # Workbench
curl http://localhost:3030/$/ping    # Fuseki

Application Logs:

# Follow all logs
docker compose logs -f

# Specific service logs
docker compose logs -f semem
docker compose logs -f fuseki

Troubleshooting

Common Issues

1. Port conflicts:

# Check port usage
sudo lsof -i :4100,4101,4102,3030,11434

# Modify ports in docker-compose.yml if needed

2. Permission issues:

# Fix volume permissions
sudo chown -R 1001:1001 ./data ./logs

3. SPARQL connection issues:

# Check Fuseki status
curl -f http://localhost:3030/$/ping

# Restart Fuseki
docker compose restart fuseki

Performance Tuning

Resource Limits:

# In docker-compose.yml
deploy:
  resources:
    limits:
      memory: 4G
      cpus: '2.0'

JVM Settings for Fuseki:

# Environment variable in docker-compose.yml
JVM_ARGS=-Xmx2g -Xms1g

Development Workflow

Live Development with Docker:

# Start development stack
docker compose -f docker-compose.dev.yml up -d

# Make changes to source code (auto-reloaded via volumes)
# Rebuild only when dependencies change
docker compose -f docker-compose.dev.yml build semem-dev

# Debug with logs
docker compose -f docker-compose.dev.yml logs -f semem-dev

Development Tools:

# Access development container
docker compose -f docker-compose.dev.yml exec semem-dev bash

# Run tests inside container
docker compose -f docker-compose.dev.yml exec semem-dev npm test

# Debug Node.js (port 9229 exposed)
# Connect your IDE debugger to localhost:9229

Security Considerations

Production Security:

  • Use strong passwords for SPARQL_PASSWORD
  • Generate secure SEMEM_API_KEY
  • Keep API keys in secure environment files
  • Use nginx with proper SSL configuration
  • Regularly update base images

Network Security:

# Production: restrict external access
# Only expose necessary ports (80, 443 via nginx)
# Keep internal services (fuseki) on internal network

Backup and Migration

Data Backup:

# Backup Fuseki data
docker run --rm -v semem_fuseki_data:/data -v $(pwd):/backup alpine tar czf /backup/fuseki-backup.tar.gz /data

# Backup application data
docker run --rm -v semem_semem_data:/data -v $(pwd):/backup alpine tar czf /backup/semem-data-backup.tar.gz /data

Migration:

# Export configuration
docker compose exec semem cat /app/config/config.json > config-backup.json

# Migrate to new deployment
# 1. Copy volumes or restore from backup
# 2. Update docker-compose.yml with new configuration
# 3. Start services

Development and Production

Development Mode:

# Workbench development with hot reload
npm run dev                       # Webpack dev server (port 9000)
npm run start:mcp                 # MCP backend (port 4105)

# Legacy development
LOG_LEVEL=debug ./start.sh        # Legacy servers with debug logging

Production Deployment:

# Build workbench for production
npm run build:workbench

# Start production servers
NODE_ENV=production npm run start:mcp

# With process management (PM2)
pm2 start mcp/http-server.js --name semem-mcp
pm2 start src/frontend/workbench/server.js --name semem-workbench

Server Monitoring

The server infrastructure includes comprehensive monitoring:

  • Health Checks: /api/health endpoint with component status
  • Metrics: /api/metrics endpoint with performance data
  • Process Monitoring: Real-time process status in server manager
  • Graceful Shutdown: Proper cleanup on SIGTERM/SIGINT signals

⚑ Quick Start

Installation

# Clone and install
git clone https://github.com/your-org/semem.git
cd semem
npm install

# Configure environment
cp example.env .env
# Edit .env with your API keys and settings

Prerequisites

  1. Ollama (recommended for local processing):

    # Install required models
    ollama pull qwen2:1.5b         # For chat/text generation
    ollama pull nomic-embed-text   # For embeddings
    
  2. Optional - SPARQL Endpoint (for advanced features):

    # Using Docker
    docker run -d --name fuseki -p 3030:3030 stain/jena-fuseki
    

Running Servers

# Start HTTP API and UI servers
./start.sh

# Access web interface
open http://localhost:4120

# Test API endpoints
curl http://localhost:4100/api/health

πŸ“– See Server Architecture section for detailed server documentation.

Running Examples

# Basic memory operations
node examples/basic/MemoryEmbeddingJSON.js

# Knowledge graph processing  
node examples/ragno/RagnoPipelineDemo.js

# MCP server integration (32 tools + 15 resources + 8 prompt workflows)
npm run mcp               # Start MCP server
node examples/mcp/SememCoreDemo.js           # Core memory operations
node examples/mcp/RagnoCorpusDecomposition.js # Knowledge graphs
node examples/mcp/ZPTBasicNavigation.js      # 3D navigation

# Complete ZPT suite (5 comprehensive demos)
node examples/mcp/ZPTBasicNavigation.js      # Navigation fundamentals
node examples/mcp/ZPTAdvancedFiltering.js    # Multi-dimensional filtering
node examples/mcp/ZPTUtilityTools.js         # Schema and validation
node examples/mcp/ZPTPerformanceOptimization.js # Performance tuning
node examples/mcp/ZPTIntegrationWorkflows.js # Cross-system integration

# MCP Prompts workflows (NEW!)
# Start MCP server first: npm run mcp
# Then use Claude Desktop or other MCP clients to execute:
# - semem-research-analysis: Analyze research documents
# - semem-memory-qa: Q&A with semantic memory
# - ragno-corpus-to-graph: Build knowledge graphs from text
# - semem-full-pipeline: Complete memory+graph+navigation workflows

🧠 Core Components

Semantic Memory

  • Vector embeddings for semantic similarity
  • Context window management with intelligent chunking
  • Multi-backend storage (JSON, SPARQL, in-memory)
  • Intelligent retrieval with relevance scoring

Knowledge Graph (Ragno)

  • Corpus decomposition into semantic units and entities
  • Relationship extraction and RDF modeling
  • Community detection using Leiden algorithm
  • Graph analytics (centrality, k-core, PageRank)

Zero-Point Traversal (ZPT)

  • Zoom/Pan/Tilt navigation paradigm
  • Content chunking strategies (semantic, fixed, adaptive)
  • Corpuscle selection algorithms
  • Transformation pipelines for content processing

Model Context Protocol (MCP)

  • 32 comprehensive tools covering all Semem capabilities
  • 15 specialized resources for documentation and data access
  • 8 MCP Prompts for workflow orchestration and multi-step operations
  • Complete ZPT integration with 6 navigation tools
  • Cross-system workflows combining Memory + Ragno + ZPT
  • Standardized API for LLM integration with schema validation

MCP Prompts - Workflow Orchestration

Transform complex multi-step operations into simple, guided workflows:

Memory Workflows:

  • semem-research-analysis - Research document analysis with semantic memory context
  • semem-memory-qa - Q&A using semantic memory retrieval and context assembly
  • semem-concept-exploration - Deep concept exploration through memory relationships

Knowledge Graph Construction:

  • ragno-corpus-to-graph - Transform text corpus to structured RDF knowledge graph
  • ragno-entity-analysis - Analyze and enrich entities with contextual relationships

3D Navigation:

  • zpt-navigate-explore - Interactive 3D knowledge space navigation and analysis

Integrated Workflows:

  • semem-full-pipeline - Complete memory β†’ graph β†’ navigation processing pipeline
  • research-workflow - Academic research document processing and insight generation

Key Features:

  • Multi-step Coordination: Chain multiple tools with context passing
  • Dynamic Arguments: Type validation, defaults, and requirement checking
  • Conditional Execution: Skip workflow steps based on conditions
  • Error Recovery: Graceful handling of failures with partial results
  • Execution Tracking: Unique execution IDs and detailed step results

πŸ€– Advanced Algorithms

HyDE (Hypothetical Document Embeddings)

Enhances retrieval by generating hypothetical answers using LLMs, with uncertainty modeling via ragno:maybe properties.

node examples/ragno/Hyde.js

VSOM (Vectorized Self-Organizing Maps)

Provides entity clustering and semantic organization with support for multiple topologies.

node examples/ragno/VSOM.js

Graph Analytics Suite

  • K-core decomposition for dense cluster identification
  • Betweenness centrality for bridge node discovery
  • Community detection (Leiden algorithm)
  • Personalized PageRank for semantic traversal
node examples/ragno/AnalyseGraph.js
node examples/ragno/Communities.js
node examples/ragno/PPR.js

πŸ“š Examples Documentation

The examples/ directory contains comprehensive demonstrations organized by functionality:

  • 🧠 Basic Examples (examples/basic/): Core memory operations, embedding generation, search
  • πŸ•ΈοΈ Ragno Examples (examples/ragno/): Knowledge graph processing, entity extraction, RDF
  • πŸ”Œ MCP Examples (examples/mcp/): Complete MCP integration with 32 tools + 15 resources + 8 prompt workflows
    • ZPT Suite: 5 comprehensive demos covering all ZPT navigation capabilities βœ… COMPLETE
    • Memory Integration: Core semantic memory with context management
    • Knowledge Graphs: Ragno corpus decomposition and RDF processing
    • Cross-System Workflows: Advanced integration patterns
    • πŸš€ MCP Prompts: 8 workflow templates for orchestrating complex multi-step operations βœ… NEW!
  • 🎯 ZPT Examples (examples/zpt/): Content processing and navigation

See examples/README.md and examples/mcp/README.md for detailed documentation and usage instructions.

πŸ”§ Configuration

Storage Backends

JSON Storage (simple persistence):

{
  "storage": {
    "type": "json",
    "options": {
      "filePath": "./data/memories.json"
    }
  }
}

SPARQL Storage (semantic web integration):

{
  "storage": {
    "type": "sparql",
    "options": {
      "query": "http://${SPARQL_HOST:-localhost}:${SPARQL_PORT:-3030}/semem/sparql",
      "update": "http://${SPARQL_HOST:-localhost}:${SPARQL_PORT:-3030}/semem/update",
      "data": "http://${SPARQL_HOST:-localhost}:${SPARQL_PORT:-3030}/semem/data",
      "graphName": "http://hyperdata.it/content",
      "user": "${SPARQL_USER}",
      "password": "${SPARQL_PASSWORD}"
    }
  }
}

Environment Variable Substitution

Semem supports environment variable substitution in configuration files using ${VARIABLE_NAME:-default} syntax:

SPARQL Connection Variables:

  • SPARQL_HOST: SPARQL server hostname (defaults to localhost)
  • SPARQL_PORT: SPARQL server port (defaults to 3030)
  • SPARQL_USER: SPARQL database username
  • SPARQL_PASSWORD: SPARQL database password

Usage Examples:

# Local development (uses defaults: localhost:3030)
npm start

# Custom SPARQL endpoint
export SPARQL_HOST=fuseki.example.com
export SPARQL_PORT=8080
npm start

# Docker automatically sets: SPARQL_HOST=fuseki, SPARQL_PORT=4050
docker compose up -d

Session-Level Memory Cache

Semem implements a hybrid storage strategy that combines persistent storage with session-level caching for immediate semantic retrieval:

How it works:

  • tell operations store content in both persistent storage (SPARQL/JSON) AND a session-level cache
  • ask operations search session cache first, then persistent storage, combining results by semantic similarity
  • Immediate availability: Recently stored concepts are immediately available for retrieval within the same session
  • Semantic similarity: Uses cosine similarity on embeddings for intelligent result ranking

Session cache features:

  • In-memory vector search with similarity caching for performance
  • Concept tracking - maintains a set of all concepts from the session
  • Debugging support - use inspect tool to examine cache contents:
    curl -X POST http://localhost:4105/inspect \
      -H "Content-Type: application/json" \
      -d '{"what": "session", "details": true}'
    

This solves the common issue where tell β†’ ask operations couldn't find recently stored content due to indexing delays in persistent storage.

LLM Providers

Configure multiple providers in config/config.json:

{
  "llmProviders": [
    {
      "type": "ollama",
      "baseUrl": "http://localhost:11434",
      "chatModel": "qwen2:1.5b",
      "embeddingModel": "nomic-embed-text",
      "capabilities": ["chat", "embedding"]
    },
    {
      "type": "claude",
      "apiKey": "${CLAUDE_API_KEY}",
      "chatModel": "claude-3-sonnet-20240229",
      "capabilities": ["chat"]
    }
  ]
}

πŸ”Œ MCP Integration

Semem implements Anthropic's Model Context Protocol (MCP) for seamless LLM integration:

Using from NPM Package

If you've installed Semem as an npm package, you can run the MCP server directly:

# Install globally
npm install -g semem

# Run MCP server via npx (recommended)
npx semem mcp

# Run HTTP MCP server
npx semem mcp-http --port=3000

# Or if installed globally
semem mcp
semem mcp --transport http --port 3000

Using from Source

# Start MCP server
npm run mcp

# Connect from Claude Desktop or other MCP clients
# Server provides 32 tools + 15 resources + 8 prompt workflows covering all Semem capabilities

Claude Desktop Configuration

Add to your Claude Desktop MCP configuration:

{
  "mcpServers": {
    "semem": {
      "command": "npx",
      "args": ["semem mcp"]
    }
  }
}

Or for HTTP transport:

{
  "mcpServers": {
    "semem": {
      "command": "npx", 
      "args": ["semem mcp-http", "--port=3000"],
      "env": {
        "MCP_PORT": "3000"
      }
    }
  }
}

### Available MCP Tools (32 Total)
- **Memory Operations** (5 tools): Store, retrieve, generate responses, embeddings, concepts
- **Storage Management** (6 tools): Backend switching, backup/restore, migration, statistics
- **Context Management** (4 tools): Context windows, configuration, pruning, summarization
- **System Monitoring** (4 tools): Configuration, metrics, health checks, system status
- **Knowledge Graphs** (8 tools): Ragno corpus decomposition, entity extraction, SPARQL, analytics
- **ZPT Navigation** (6 tools): 3D navigation, filtering, validation, schema, optimization

### Available MCP Prompts (8 Workflows)
- **Memory Workflows** (3): Research analysis, memory Q&A, concept exploration
- **Knowledge Graph** (2): Corpus-to-graph, entity analysis  
- **3D Navigation** (1): Interactive exploration
- **Integrated** (2): Full pipeline, research workflow

### Available MCP Resources (15 Total)
- **System Resources** (7): Status, API docs, schemas, configuration, metrics
- **Ragno Resources** (4): Ontology, pipeline guide, examples, SPARQL templates
- **ZPT Resources** (4): Navigation schema, examples, concepts guide, performance optimization

## πŸ§ͺ Testing

```bash
# Run core tests
npm test

# Run LLM-dependent tests
npm run test:llms

# Generate coverage report
npm run test:coverage

# Run with specific test file
npm test -- tests/unit/Config.spec.js

πŸ› οΈ Development

Project Scripts

# Development - Workbench
npm run dev                # Webpack dev server with hot reload (port 9000)
npm run start:workbench    # Workbench server (port 8081)
npm run start:mcp          # MCP server (port 4105)

# Build
npm run build              # Full build (types + workbench)
npm run build:workbench    # Build workbench for production
npm run build:dev          # Development build

# Testing
npm test                   # Run unit and integration tests
npm run test:unit          # Run unit tests only
npm run test:coverage      # Generate coverage report
npm run test:e2e           # End-to-end tests

# Documentation
npm run docs               # Generate JSDoc documentation

# Legacy Servers
./start.sh                 # Start all legacy servers (API + UI)
node src/servers/api-server.js     # API server only (port 4100)

# MCP Server
npm run mcp                # Start MCP server (stdio mode)
npm run mcp:http           # Start MCP HTTP server (port 4105)

Workbench Development Workflow

Quick Start:

# Terminal 1: Start MCP backend
npm run start:mcp

# Terminal 2: Start development server  
npm run dev
# Opens http://localhost:9000 with hot reload

Architecture:

  • Frontend: Webpack dev server (port 9000) with hot module replacement
  • Backend: MCP server (port 4105) with Simple Verbs REST API
  • Proxy: Webpack automatically proxies API calls to MCP server
  • Build Output: dist/workbench/ contains production build

Key Features:

  • Hot Reload: Changes to workbench files trigger automatic browser refresh
  • Source Maps: Full debugging support in development mode
  • Module Aliases: Clean imports with @workbench, @services, @components
  • Live Backend: MCP server provides real semantic memory functionality

Production Build:

npm run build:workbench  # Creates optimized bundle in dist/workbench/

Adding New Examples

  1. Place in appropriate category directory (basic/, ragno/, mcp/, zpt/)
  2. Follow naming convention: PascalCase.js
  3. Include comprehensive documentation
  4. Add error handling and cleanup
  5. Update examples/README.md

πŸ“– Documentation

πŸ” Troubleshooting

Common Issues

Ollama Connection:

# Check Ollama status
ollama list
curl http://localhost:11434/api/tags

SPARQL Endpoint:

# Test connectivity
curl -X POST http://localhost:3030/dataset/query \
  -H "Content-Type: application/sparql-query" \
  -d "SELECT * WHERE { ?s ?p ?o } LIMIT 1"

Memory Issues:

# Increase Node.js memory limit
export NODE_OPTIONS="--max-old-space-size=4096"

Debug Mode

Enable detailed logging:

LOG_LEVEL=debug node examples/basic/MemoryEmbeddingJSON.js

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Update documentation
  5. Submit a pull request

Code Style

  • Use ES modules
  • Follow existing patterns
  • Include JSDoc comments
  • Add comprehensive error handling

πŸ“„ License

MIT License - see LICENSE for details.

πŸ”— Links


Semem - Intelligent semantic memory for the AI age.