Semem
Semantic Web Memory for Intelligent Agents
...or, Graph RAG on steroids for the global knowledgebase
tl;dr - flipping ideas (mostly) from the LLM world over to the Semantic Web for massively simplified integration, at global scale
Semem is an experimental Node.js toolkit for AI memory management
that integrates large language models (LLMs) with Semantic Web technologies (RDF/SPARQL). It offers knowledge graph retrieval and augmentation algorithms within a conceptual model based on the [Ragno](https://github.com/danja/ragno) (knowledge graph description) and [ZPT](https://github.com/danja/zpt) (knowledge graph navigation) ontologies. It is a [Tensegrity](https://github.com/danja/tensegrity) subproject.

The intuition is that while LLMs and associated techniques have massively advanced the field of AI and offer considerable utility, the typical approach is missing the elephant in the room: the Web - the biggest known knowledgebase in our universe. Semantic Web technologies offer data integration at a global scale, with tried & tested conceptual models for knowledge representation. There is a lot of low-hanging fruit.
Status 2025-06-21
See also: the (in-progress) manual, CURRENT-ACTIVITIES.md and the blog
Mostly functional but very, very sketchy. It has an MCP server, HTTP API, a crude browser UI and code APIs. A lot to do before much will be remotely useful. It is in active development in June 2025. The codebase is big and chaotic, it is not for the fainthearted.
The codebase is registered as the npm package semem, though not much time has been spent on that angle; for now it's pretty much essential to work from this repo.
The dev process has involved pushing out in various directions with spikes, then circling back to ensure the core is still functional, then consolidation. To date it's been a one-man + various AI assistants (and a dog) operation. Despite me trying to keep things modular so they can be worked on in isolation, it's still complex enough that Claude (and I) struggle. Collaborators would be very welcome.
System Overview
The SPARQL store, chat LLMs and embeddings service are all external. SPARQL uses the standard HTTP interfaces. There are also in-memory and JSON file storage subsystems; these are an artifact of dev history, though they can be useful as a fallback during testing. LLMs use the hyperdata-clients library to simplify configuration.
The system is layered in a couple of dimensions: interfacing may be via the direct (SDK-style) API, the HTTP server or the MCP server. Functionality is grouped by purpose, broadly into Basic, Ragno and ZPT.
There are fairly comprehensive demos under examples which exercise the different parts of the system (think manual integration tests).
Basic
This contains the low-level operations. It covers basic SPARQL store interactions, embeddings/semantic search and chat. There are also some minimal temporal/relevance-related parts that overlap with Ragno.
Internally the system relies on RDF-Ext and other RDFJS libraries for its graph model, FAISS for its primary vector-oriented functionality.
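As a rough illustration of what the embeddings/semantic search layer does (the general technique, not Semem's actual API): memories are stored alongside embedding vectors and ranked against a query vector by cosine similarity, with FAISS doing the same job at scale.

```javascript
// Conceptual sketch of embedding-based retrieval, not the Semem API.
// Given a query embedding and stored memory embeddings, rank memories
// by cosine similarity and keep the top k.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(queryEmbedding, memories, k = 5) {
  return memories
    .map(m => ({ ...m, score: cosine(queryEmbedding, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```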
Ragno
This layer is concerned with the knowledgebase model as described by the Ragno Ontology. On top of the model sits a set of algorithms offering various knowledge retrieval and augmentation facilities. Most are lifted from the NodeRAG paper, with additions such as HyDE (Hypothetical Document Embeddings) and Vectorised Self-Organising Maps (VSOM).
ZPT
This layer is concerned with knowledge graph navigation built on the ZPT Ontology, following an analogy from the film world: Zoom, Pan, Tilt. Algorithms have been created to handle parameterisation of filters/selection, corpus decomposition and chunking.
UI
Semem has a browser-based UI in progress. This won't be useful for actual knowledge work any time soon (if ever) but it will have a role in checking system behaviour and experimenting.
The description below is very AI-sloppy.
Quick Start
Prerequisites
- Node.js 20.11.0 or higher
- npm (comes with Node.js)
Installation
- Clone the repository:

  git clone https://github.com/danja/semem.git
  cd semem

- Install dependencies:

  npm install
Starting the Application
Option 1: Start all services (recommended)
npm start
This will start both the API server and UI server with a single command.
Option 2: Start services individually
# Start API server
npm run start:api
# In a new terminal, start the UI server
npm run start:ui
Development Mode
For development with hot-reloading:
npm run dev
Accessing the UI
Once the servers are running, open your browser and navigate to:
http://localhost:3000
UI Features
Interactive Console
Access the developer console by clicking the tab on the right side of the screen. The console provides:
- Real-time log viewing
- Log level filtering (Error, Warn, Info, Debug, Trace)
- Search functionality
- Pause/Resume logging
- Copy logs to clipboard
VSOM Visualization
Explore high-dimensional data with the Vector Self-Organizing Map visualization:
- Navigate to the VSOM tab
- Load or train a SOM model
- Interact with the visualization
- Explore feature maps and clustering
SPARQL Browser
Query and explore your knowledge graph using the built-in SPARQL browser.
Key Features
- Semantic Memory: Intelligent context retrieval and memory organization with vector embeddings and SPARQL
- Knowledge Graph Processing: End-to-end Ragno pipeline for entity extraction and relationship modeling
- Zoom, Pan, Tilt (ZPT): Knowledge navigation and processing built on a cinematic analogy
- Model Context Protocol (MCP): JSON-RPC 2.0 API for seamless LLM and agent integration with workflow orchestration
- MCP Prompts: 8 pre-built workflow templates for complex multi-step operations
- Advanced Algorithms: HyDE, VSOM, graph analytics, community detection, and Personalized PageRank
- Interactive Visualizations: VSOM (Vector Self-Organizing Maps) for high-dimensional data exploration
- Multi-Provider LLM Support: Ollama, Claude, Mistral, and other providers via unified connector system
- Multiple Storage Backends: In-memory, JSON, and SPARQL/RDF with caching optimization
Data Visualization
Semem includes an advanced VSOM (Vector Self-Organizing Map) visualization system for exploring high-dimensional data:
Key Features
- Interactive SOM grid visualization with zoom/pan
- Real-time training visualization
- Feature map exploration (U-Matrix, component planes)
- Interactive clustering of SOM nodes
- Responsive design for all screen sizes
Getting Started
- Navigate to the VSOM tab in the Semem UI
- Load or train a SOM model
- Explore the visualization and interact with nodes
- Use the feature maps to understand data relationships
For more details, see the VSOM Documentation.
Project Structure
semem/
├── src/                   # Core library code
│   ├── handlers/          # LLM and embedding handlers
│   ├── stores/            # Storage backends (JSON, SPARQL, etc.)
│   ├── connectors/        # LLM provider connectors
│   ├── servers/           # HTTP server implementations
│   ├── ragno/             # Knowledge graph algorithms
│   └── zpt/               # Zoom/Pan/Tilt navigation system
├── examples/              # Organized examples by category
│   ├── basic/             # Core functionality examples
│   ├── ragno/             # Knowledge graph examples
│   ├── mcp/               # MCP integration examples
│   ├── zpt/               # ZPT processing examples
│   └── pending/           # Work-in-progress examples
├── mcp/                   # MCP server implementation
├── config/                # Configuration files
└── docs/                  # Comprehensive documentation
Server Architecture
Semem provides a complete HTTP server infrastructure for deploying AI memory and knowledge graph services. The server system consists of four main components located in src/servers/:
Core Server Components
API Server (api-server.js)
The main REST API server providing HTTP endpoints for all Semem functionality:
- Memory Operations: Store, search, and retrieve semantic memories
- Chat Interface: Conversational AI with context awareness
- Embedding Services: Vector embedding generation and management
- Configuration Management: Dynamic provider and storage configuration
- Health Monitoring: System status and metrics endpoints
Key Endpoints:
POST /api/memory # Store new memories
GET /api/memory/search # Search existing memories
POST /api/chat # Chat with context
POST /api/chat/stream # Streaming chat responses
GET /api/health # System health check
GET /api/config # Server configuration
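A minimal client sketch against these endpoints using Node's built-in fetch (Node 20+, run as an ES module). The paths and port are the ones documented here; the request body fields and query parameter name are assumptions, so check the API docs for the actual schemas.

```javascript
// Assumed request shapes; only the endpoint paths and port come from this README.
const API = 'http://localhost:4100';

// Store a new memory
await fetch(`${API}/api/memory`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ content: 'RDF is a graph data model.' })
});

// Search existing memories
const search = await fetch(
  `${API}/api/memory/search?query=${encodeURIComponent('graph data model')}`
);
console.log(await search.json());

// Chat with context
const chat = await fetch(`${API}/api/chat`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt: 'What is RDF?' })
});
console.log(await chat.json());
```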
UI Server (ui-server.js)
Web interface server for interactive access to Semem capabilities:
- Provider Selection: Choose from configured LLM providers
- Memory Browser: Visual interface for memory exploration
- Chat Interface: Web-based conversational UI
- Configuration UI: Visual configuration management
Server Manager (server-manager.js)
Process management system for coordinating multiple server instances:
- Process Lifecycle: Start, monitor, and gracefully stop servers
- Port Management: Automatic port conflict resolution
- Health Monitoring: Real-time process status tracking
- Signal Handling: Graceful shutdown coordination (see the sketch below)
- Logging: Centralized output management with timestamps
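The shutdown coordination follows the standard Node signal-handling pattern. A minimal sketch of that pattern (not the server-manager.js code):

```javascript
// Standard Node graceful-shutdown pattern, shown for illustration only.
import http from 'node:http';

const server = http.createServer((req, res) => res.end('ok'));
server.listen(4100);

function shutdown(signal) {
  console.log(`${signal} received, closing server...`);
  server.close(() => process.exit(0));              // stop accepting connections, then exit
  setTimeout(() => process.exit(1), 5000).unref();  // force exit if close() hangs
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
```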
Start All (start-all.js)
Orchestration script for launching the complete server ecosystem:
- Configuration Loading: Unified config system integration
- Multi-Server Startup: Coordinated API and UI server launch
- Interactive Control: Keyboard shortcuts for shutdown (Ctrl+C, 'q')
- Error Handling: Robust startup failure recovery
Quick Server Deployment
# Start all servers (recommended)
./start.sh
# OR
npm run start-servers
# Individual server startup
node src/servers/api-server.js # API only (port 4100)
node src/servers/ui-server.js # UI only (port 4120)
# Stop all servers
./stop.sh
# OR
npm run stop-servers
Server Configuration
Servers are configured via config/config.json:
{
"servers": {
"api": 4100, # API server port
"ui": 4120, # UI server port
"redirect": 4110, # Optional redirect port
"redirectTarget": 4120
},
"storage": {
"type": "sparql", # or "json", "memory"
"options": { /* storage-specific config */ }
},
"llmProviders": [
{ /* provider configurations */ }
]
}
Development and Production
Development Mode:
# Start with hot reload and debug logging
LOG_LEVEL=debug ./start.sh
# Watch mode with automatic restarts
npm run dev
Production Deployment:
# Set production environment
NODE_ENV=production ./start.sh
# With process management (PM2)
pm2 start src/servers/start-all.js --name semem-servers
Server Monitoring
The server infrastructure includes comprehensive monitoring:
- Health Checks: /api/health endpoint with component status
- Metrics: /api/metrics endpoint with performance data
- Process Monitoring: Real-time process status in server manager
- Graceful Shutdown: Proper cleanup on SIGTERM/SIGINT signals
Quick Start
Installation
# Clone and install
git clone https://github.com/your-org/semem.git
cd semem
npm install
# Configure environment
cp example.env .env
# Edit .env with your API keys and settings
Prerequisites
- Ollama (recommended for local processing):

  # Install required models
  ollama pull qwen2:1.5b        # For chat/text generation
  ollama pull nomic-embed-text  # For embeddings

- Optional - SPARQL Endpoint (for advanced features):

  # Using Docker
  docker run -d --name fuseki -p 3030:3030 stain/jena-fuseki
Running Servers
# Start HTTP API and UI servers
./start.sh
# Access web interface
open http://localhost:4120
# Test API endpoints
curl http://localhost:4100/api/health
See the Server Architecture section for detailed server documentation.
Running Examples
# Basic memory operations
node examples/basic/MemoryEmbeddingJSON.js
# Knowledge graph processing
node examples/ragno/RagnoPipelineDemo.js
# MCP server integration (32 tools + 15 resources + 8 prompt workflows)
npm run mcp-server-new # Start MCP server
node examples/mcp/SememCoreDemo.js # Core memory operations
node examples/mcp/RagnoCorpusDecomposition.js # Knowledge graphs
node examples/mcp/ZPTBasicNavigation.js # 3D navigation
# Complete ZPT suite (5 comprehensive demos)
node examples/mcp/ZPTBasicNavigation.js # Navigation fundamentals
node examples/mcp/ZPTAdvancedFiltering.js # Multi-dimensional filtering
node examples/mcp/ZPTUtilityTools.js # Schema and validation
node examples/mcp/ZPTPerformanceOptimization.js # Performance tuning
node examples/mcp/ZPTIntegrationWorkflows.js # Cross-system integration
# MCP Prompts workflows (NEW!)
# Start MCP server first: npm run mcp-server-new
# Then use Claude Desktop or other MCP clients to execute:
# - semem-research-analysis: Analyze research documents
# - semem-memory-qa: Q&A with semantic memory
# - ragno-corpus-to-graph: Build knowledge graphs from text
# - semem-full-pipeline: Complete memory+graph+navigation workflows
Core Components
Semantic Memory
- Vector embeddings for semantic similarity
- Context window management with intelligent chunking (see the sketch below)
- Multi-backend storage (JSON, SPARQL, in-memory)
- Intelligent retrieval with relevance scoring
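For a rough idea of what chunking involves (Semem's own strategies are semantic and adaptive; this only shows the simplest case), a fixed-size chunker with overlap looks like:

```javascript
// Illustrative fixed-size chunking with overlap; not Semem's chunking code.
function chunkText(text, size = 1000, overlap = 100) {
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

console.log(chunkText('a'.repeat(2500)).length); // 3 chunks for 2500 characters
```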
Knowledge Graph (Ragno)
- Corpus decomposition into semantic units and entities
- Relationship extraction and RDF modeling
- Community detection using Leiden algorithm
- Graph analytics (centrality, k-core, PageRank)
Zoom, Pan, Tilt (ZPT)
- Zoom/Pan/Tilt navigation paradigm (see the sketch after this list)
- Content chunking strategies (semantic, fixed, adaptive)
- Corpuscle selection algorithms
- Transformation pipelines for content processing
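To make the paradigm concrete, a hypothetical navigation request might look like the following. The field names and values are illustrative assumptions, not terms taken from the ZPT ontology:

```javascript
// Hypothetical shape of a ZPT navigation request (illustrative only).
const navigation = {
  zoom: 'unit',                          // level of detail, e.g. corpus, community, unit, entity
  pan: {                                 // filters selecting which part of the corpus is in view
    topic: 'semantic web',
    temporal: { after: '2024-01-01' }
  },
  tilt: 'embedding'                      // representation style, e.g. keywords, embedding, graph
};

console.log(JSON.stringify(navigation, null, 2));
```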
Model Context Protocol (MCP)
- 32 comprehensive tools covering all Semem capabilities
- 15 specialized resources for documentation and data access
- 8 MCP Prompts for workflow orchestration and multi-step operations
- Complete ZPT integration with 6 navigation tools
- Cross-system workflows combining Memory + Ragno + ZPT
- Standardized API for LLM integration with schema validation
MCP Prompts - Workflow Orchestration
Transform complex multi-step operations into simple, guided workflows:
Memory Workflows:
- semem-research-analysis - Research document analysis with semantic memory context
- semem-memory-qa - Q&A using semantic memory retrieval and context assembly
- semem-concept-exploration - Deep concept exploration through memory relationships
Knowledge Graph Construction:
- ragno-corpus-to-graph - Transform text corpus to structured RDF knowledge graph
- ragno-entity-analysis - Analyze and enrich entities with contextual relationships
3D Navigation:
- zpt-navigate-explore - Interactive 3D knowledge space navigation and analysis
Integrated Workflows:
- semem-full-pipeline - Complete memory → graph → navigation processing pipeline
- research-workflow - Academic research document processing and insight generation
Key Features:
- Multi-step Coordination: Chain multiple tools with context passing
- Dynamic Arguments: Type validation, defaults, and requirement checking
- Conditional Execution: Skip workflow steps based on conditions
- Error Recovery: Graceful handling of failures with partial results
- Execution Tracking: Unique execution IDs and detailed step results
Advanced Algorithms
HyDE (Hypothetical Document Embeddings)
Enhances retrieval by generating hypothetical answers using LLMs, with uncertainty modeling via ragno:maybe properties.
node examples/ragno/Hyde.js
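Conceptually the flow is: ask an LLM for a hypothetical answer, embed that text, then search with the resulting vector so retrieval matches answer-shaped documents. A hedged sketch of that flow (not the Hyde.js implementation; generateText, embed and searchByVector stand in for whatever LLM, embedding and vector-store calls are configured):

```javascript
// Conceptual HyDE flow; the three injected functions are placeholders.
async function hydeRetrieve(question, { generateText, embed, searchByVector }) {
  const hypothetical = await generateText(
    `Write a short passage that plausibly answers: ${question}`
  );
  const vector = await embed(hypothetical);      // embed the hypothetical document
  return searchByVector(vector, { limit: 10 });  // retrieve real documents near it
}
```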
VSOM (Vectorized Self-Organizing Maps)
Provides entity clustering and semantic organization with support for multiple topologies.
node examples/ragno/VSOM.js
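For intuition, a bare-bones self-organising map training loop is sketched below; this is the textbook algorithm on a square grid, not the VSOM.js implementation:

```javascript
// Textbook SOM sketch: pull the best-matching unit and its neighbours
// towards each input vector, with decaying learning rate and radius.
function trainSOM(data, gridSize = 10, epochs = 20) {
  const dim = data[0].length;
  const nodes = [];
  for (let x = 0; x < gridSize; x++)
    for (let y = 0; y < gridSize; y++)
      nodes.push({ x, y, w: Array.from({ length: dim }, () => Math.random()) });

  const dist2 = (a, b) => a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);

  for (let epoch = 0; epoch < epochs; epoch++) {
    const lr = 0.5 * (1 - epoch / epochs);
    const radius = (gridSize / 2) * (1 - epoch / epochs) + 1;
    for (const v of data) {
      // best-matching unit: the node whose weights are closest to the input
      const bmu = nodes.reduce((a, b) => (dist2(a.w, v) < dist2(b.w, v) ? a : b));
      for (const n of nodes) {
        const d = Math.hypot(n.x - bmu.x, n.y - bmu.y);
        if (d > radius) continue;
        const influence = Math.exp(-(d ** 2) / (2 * radius ** 2));
        n.w = n.w.map((wi, i) => wi + lr * influence * (v[i] - wi));
      }
    }
  }
  return nodes;
}
```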
Graph Analytics Suite
- K-core decomposition for dense cluster identification
- Betweenness centrality for bridge node discovery
- Community detection (Leiden algorithm)
- Personalized PageRank for semantic traversal (sketched below)
node examples/ragno/AnalyseGraph.js
node examples/ragno/Communities.js
node examples/ragno/PPR.js
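For intuition, a minimal Personalized PageRank (power iteration over an adjacency list) is sketched below. This is the general algorithm with rank teleporting back to the seed nodes, not the PPR.js code:

```javascript
// Personalized PageRank sketch: scores measure proximity to the seeds.
function personalizedPageRank(graph, seeds, { alpha = 0.85, iterations = 50 } = {}) {
  const nodes = Object.keys(graph);
  const seedSet = new Set(seeds);
  let rank = Object.fromEntries(nodes.map(n => [n, seedSet.has(n) ? 1 / seeds.length : 0]));

  for (let i = 0; i < iterations; i++) {
    // teleport mass goes only to the seed nodes
    const next = Object.fromEntries(
      nodes.map(n => [n, seedSet.has(n) ? (1 - alpha) / seeds.length : 0])
    );
    for (const n of nodes) {
      const out = graph[n];
      if (!out.length) continue;
      const share = (alpha * rank[n]) / out.length;
      for (const m of out) next[m] += share;  // spread rank along outgoing edges
    }
    rank = next;
  }
  return rank;
}

// Tiny worked example: which nodes are closest to 'rdf'?
const g = { rdf: ['sparql', 'owl'], sparql: ['rdf'], owl: ['rdf'], shacl: ['owl'] };
console.log(personalizedPageRank(g, ['rdf']));
```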
Examples Documentation
The examples/ directory contains comprehensive demonstrations organized by functionality:
- Basic Examples (examples/basic/): Core memory operations, embedding generation, search
- Ragno Examples (examples/ragno/): Knowledge graph processing, entity extraction, RDF
- MCP Examples (examples/mcp/): Complete MCP integration with 32 tools + 15 resources + 8 prompt workflows
  - ZPT Suite: 5 comprehensive demos covering all ZPT navigation capabilities (COMPLETE)
  - Memory Integration: Core semantic memory with context management
  - Knowledge Graphs: Ragno corpus decomposition and RDF processing
  - Cross-System Workflows: Advanced integration patterns
  - MCP Prompts: 8 workflow templates for orchestrating complex multi-step operations (NEW!)
- ZPT Examples (examples/zpt/): Content processing and navigation
See examples/README.md and examples/mcp/README.md for detailed documentation and usage instructions.
Configuration
Storage Backends
JSON Storage (simple persistence):
{
"storage": {
"type": "json",
"options": {
"filePath": "./data/memories.json"
}
}
}
SPARQL Storage (semantic web integration):
{
"storage": {
"type": "sparql",
"options": {
"endpoint": "https://fuseki.hyperdata.it/semem",
"graphName": "http://example.org/graph",
"user": "admin",
"password": "admin123"
}
}
}
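Under the hood this is plain SPARQL 1.1 protocol traffic against the configured endpoint. A hedged sketch of what that looks like over HTTP (illustrative only; many stores, including Fuseki, expose separate /query and /update service paths depending on how the dataset is configured):

```javascript
// Illustrative SPARQL 1.1 protocol requests; endpoint and credentials
// are the sample values from the config above.
const endpoint = 'https://fuseki.hyperdata.it/semem';
const auth = 'Basic ' + Buffer.from('admin:admin123').toString('base64');

// Insert a triple into the configured graph
await fetch(endpoint, {
  method: 'POST',
  headers: { 'Content-Type': 'application/sparql-update', Authorization: auth },
  body: `INSERT DATA { GRAPH <http://example.org/graph> {
           <http://example.org/memory/1> <http://purl.org/dc/terms/created> "2025-06-21" } }`
});

// Query it back
const res = await fetch(endpoint, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/sparql-query',
    Accept: 'application/sparql-results+json',
    Authorization: auth
  },
  body: 'SELECT * WHERE { GRAPH <http://example.org/graph> { ?s ?p ?o } } LIMIT 5'
});
console.log(await res.json());
```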
LLM Providers
Configure multiple providers in config/config.json:
{
"llmProviders": [
{
"type": "ollama",
"baseUrl": "http://localhost:11434",
"chatModel": "qwen2:1.5b",
"embeddingModel": "nomic-embed-text",
"capabilities": ["chat", "embedding"]
},
{
"type": "claude",
"apiKey": "${CLAUDE_API_KEY}",
"chatModel": "claude-3-sonnet-20240229",
"capabilities": ["chat"]
}
]
}
MCP Integration
Semem implements Anthropic's Model Context Protocol (MCP) for seamless LLM integration:
Using from NPM Package
If you've installed Semem as an npm package, you can run the MCP server directly:
# Install globally
npm install -g semem
# Run MCP server via npx (recommended)
npx semem-mcp
# Run HTTP MCP server
npx semem-mcp-http --port=3000
# Or if installed globally
semem-mcp
semem-mcp-http --port=3000
Using from Source
# Start MCP server
npm run mcp-server-new
# Connect from Claude Desktop or other MCP clients
# Server provides 32 tools + 15 resources + 8 prompt workflows covering all Semem capabilities
Claude Desktop Configuration
Add to your Claude Desktop MCP configuration:
{
"mcpServers": {
"semem": {
"command": "npx",
"args": ["semem-mcp"]
}
}
}
Or for HTTP transport:
{
"mcpServers": {
"semem": {
"command": "npx",
"args": ["semem-mcp-http", "--port=3000"],
"env": {
"MCP_PORT": "3000"
}
}
}
}
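Beyond Claude Desktop, any MCP client can connect programmatically. A hedged sketch using the official MCP TypeScript SDK from Node, assuming the npx semem-mcp stdio entry point shown above; tool and prompt names are discovered at runtime rather than hard-coded:

```javascript
// Sketch of an MCP client session; requires `npm install @modelcontextprotocol/sdk`.
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

const transport = new StdioClientTransport({ command: 'npx', args: ['semem-mcp'] });
const client = new Client({ name: 'semem-example-client', version: '0.1.0' }, { capabilities: {} });
await client.connect(transport);

const { tools } = await client.listTools();      // the 32 tools
const { prompts } = await client.listPrompts();  // the 8 workflow prompts
console.log(tools.map(t => t.name));
console.log(prompts.map(p => p.name));

// Individual tools and prompts are then invoked with
// client.callTool({ name, arguments }) and client.getPrompt({ name, arguments }).
await client.close();
```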
### Available MCP Tools (32 Total)
- **Memory Operations** (5 tools): Store, retrieve, generate responses, embeddings, concepts
- **Storage Management** (6 tools): Backend switching, backup/restore, migration, statistics
- **Context Management** (4 tools): Context windows, configuration, pruning, summarization
- **System Monitoring** (4 tools): Configuration, metrics, health checks, system status
- **Knowledge Graphs** (8 tools): Ragno corpus decomposition, entity extraction, SPARQL, analytics
- **ZPT Navigation** (6 tools): 3D navigation, filtering, validation, schema, optimization
### Available MCP Prompts (8 Workflows)
- **Memory Workflows** (3): Research analysis, memory Q&A, concept exploration
- **Knowledge Graph** (2): Corpus-to-graph, entity analysis
- **3D Navigation** (1): Interactive exploration
- **Integrated** (2): Full pipeline, research workflow
### Available MCP Resources (15 Total)
- **System Resources** (7): Status, API docs, schemas, configuration, metrics
- **Ragno Resources** (4): Ontology, pipeline guide, examples, SPARQL templates
- **ZPT Resources** (4): Navigation schema, examples, concepts guide, performance optimization
## Testing
```bash
# Run core tests
npm test
# Run LLM-dependent tests
npm run test:llms
# Generate coverage report
npm run test:coverage
# Run with specific test file
npm test -- tests/unit/Config.spec.js
```
Development
Project Scripts
# Development
npm run dev # Start dev server with hot reload
npm run build:watch # Build and watch for changes
# Testing
npm test # Run unit tests
npm run test:coverage # Generate coverage report
# Documentation
npm run docs # Generate JSDoc documentation
# HTTP Servers
./start.sh # Start all servers (API + UI)
./stop.sh # Stop all servers
node src/servers/api-server.js # Start API server only
node src/servers/ui-server.js # Start UI server only
# MCP Server
npm run mcp-server-new # Start new MCP server
npm run mcp-example # Run MCP client example
Adding New Examples
- Place in appropriate category directory (basic/, ragno/, mcp/, zpt/)
- Follow naming convention: PascalCase.js
- Include comprehensive documentation
- Add error handling and cleanup
- Update examples/README.md
Documentation
- Examples Documentation: Comprehensive examples guide
- API Documentation: REST API and SDK reference
- MCP Documentation: Model Context Protocol integration
- MCP Prompts Guide: Complete workflow orchestration guide
- MCP Prompts Examples: Real-world usage patterns
- Architecture Guide: System design and components
- Algorithm Documentation: Advanced algorithms guide
Troubleshooting
Common Issues
Ollama Connection:
# Check Ollama status
ollama list
curl http://localhost:11434/api/tags
SPARQL Endpoint:
# Test connectivity
curl -X POST http://localhost:3030/dataset/query \
-H "Content-Type: application/sparql-query" \
-d "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
Memory Issues:
# Increase Node.js memory limit
export NODE_OPTIONS="--max-old-space-size=4096"
Debug Mode
Enable detailed logging:
LOG_LEVEL=debug node examples/basic/MemoryEmbeddingJSON.js
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Update documentation
- Submit a pull request
Code Style
- Use ES modules
- Follow existing patterns
- Include JSDoc comments
- Add comprehensive error handling
License
MIT License - see LICENSE for details.
Links
- Documentation: docs/
- Examples: examples/
- MCP Server: mcp/
- Issue Tracker: GitHub Issues
Semem - Intelligent semantic memory for the AI age.