Upload files to "docs"
This commit is contained in:
parent
1902fe9baf
commit
1a96de505b
124 docs/architecture.md Normal file
@@ -0,0 +1,124 @@
# Memory System Architecture

## Overview

The memory system architecture is designed with a layered approach, separating concerns between:

1. Storage layer
2. Memory integration layer
3. Assistant layer (LLM integration)

This modular design allows for flexibility, extensibility, and ease of maintenance.

## System Layers

```
┌──────────────────────────────────────────────────┐
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │           OllamaMemoryAssistant            │  │
│  │                                            │  │
│  │  - User interface for memory interaction  │  │
│  │  - LLM integration via Ollama              │  │
│  │  - Query processing with context           │  │
│  └─────────────────────┬──────────────────────┘  │
│                        │                         │
│  ┌─────────────────────▼──────────────────────┐  │
│  │              IntegratedMemory              │  │
│  │                                            │  │
│  │  - Combines memory types                   │  │
│  │  - Manages context generation              │  │
│  │  - Coordinates memory operations           │  │
│  └─────────────────────┬──────────────────────┘  │
│                        │                         │
│  ┌─────────────────────▼──────────────────────┐  │
│  │             SimpleMemoryStore              │  │
│  │                                            │  │
│  │  - Basic storage implementation            │  │
│  │  - Persistence with JSON files             │  │
│  │  - Namespaced key-value storage            │  │
│  └────────────────────────────────────────────┘  │
│                                                  │
└──────────────────────────────────────────────────┘
```

## Key Components

### 1. SimpleMemoryStore

The foundation of the system, providing basic storage capabilities:

- **Responsibilities**:
  - Store and retrieve data by namespace and key
  - Provide basic search functionality
  - Handle persistence to disk (optional)
  - Manage file I/O for memory files

- **Key Methods** (usage sketch below):
  - `put(namespace, key, value)`: Store data
  - `get(namespace, key)`: Retrieve data
  - `search(namespace, query)`: Find relevant information
  - `save()` and `load()`: Handle persistence

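For orientation, here is how the store is exercised directly; a minimal usage sketch assuming the `storage_path` constructor keyword, which matches its use elsewhere in these docs:

```python
from src.memory_model import SimpleMemoryStore

# Create a store that persists to disk (persistence is optional)
store = SimpleMemoryStore(storage_path="./memories/demo_memory.json")

# Store and retrieve data by namespace and key
store.put("semantic", "albert_einstein", {"birth": "1879"})
fact = store.get("semantic", "albert_einstein")

# Keyword-based search within a namespace
matches = store.search("semantic", "einstein")

# Persist to, and restore from, the JSON file
store.save()
store.load()
```
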
### 2. IntegratedMemory

The integration layer that combines different memory types:

- **Responsibilities**:
  - Provide typed access to memory (semantic, episodic, procedural)
  - Generate context for LLM queries
  - Manage memory-specific operations
  - Coordinate between memory types

- **Key Methods** (usage sketch below):
  - `add_fact(concept, details)`: Add semantic memory
  - `add_procedure(name, steps)`: Add procedural memory
  - `add_interaction(query, response)`: Add episodic memory
  - `generate_context(query)`: Create LLM context from memories

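The layers compose naturally; a minimal sketch using only the methods listed above:

```python
from src.memory_model import IntegratedMemory, SimpleMemoryStore

memory = IntegratedMemory(SimpleMemoryStore(storage_path="./memories/demo.json"))

# Populate the three memory types
memory.add_fact("napoleon_bonaparte", {"birth": "1769, Corsica"})
memory.add_procedure("analyze_historical_figure", ["Research early life", "Examine achievements"])
memory.add_interaction("Who was Napoleon?", "Napoleon Bonaparte was a French military leader...")

# Build an LLM-ready context string for a new query
context = memory.generate_context("Tell me about Napoleon's early life")
```
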
### 3. OllamaMemoryAssistant

The application layer that interfaces with the Ollama LLM:

- **Responsibilities**:
  - Process user queries with memory context
  - Send requests to the Ollama API
  - Update memory with new interactions
  - Present responses to the user

- **Key Methods** (usage sketch below):
  - `process_query(query)`: Process a query with memory context
  - `learn_fact(concept, details)`: Add to semantic memory
  - `learn_procedure(name, steps)`: Add to procedural memory

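At the top layer, usage reduces to a few calls; a sketch assuming the constructor parameters (`user_id`, `model_name`) used elsewhere in these docs:

```python
from src.memory_model import OllamaMemoryAssistant

assistant = OllamaMemoryAssistant(user_id="demo_user", model_name="llama3")

# Teach the assistant, then ask a question answered with memory-backed context
assistant.learn_fact("theory_of_relativity", {"developed_by": "Albert Einstein"})
print(assistant.process_query("Who developed the theory of relativity?"))
```
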
## Data Flow

1. **Query Processing**:

   ```
   User Query → OllamaMemoryAssistant → IntegratedMemory (context) → Ollama API → Response
   ```

2. **Memory Addition**:

   ```
   New Knowledge → OllamaMemoryAssistant → IntegratedMemory → SimpleMemoryStore
   ```

3. **Persistence**:

   ```
   Memory Update → SimpleMemoryStore → JSON File
   ```

## Extensibility

The system architecture allows for several extension points:

1. **Storage Backends**: Replace SimpleMemoryStore with other implementations (e.g., database, vector store); see the sketch after this list
2. **LLM Integration**: Swap Ollama for other LLM providers
3. **Memory Types**: Add new memory types beyond the current three
4. **Context Generation**: Customize how context is generated for different query types

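Because the other layers only talk to the store through its interface, a replacement backend just implements the same five methods. A sketch of what that might look like; `SqliteMemoryStore` and its internals are hypothetical, not part of the codebase:

```python
import json
import sqlite3


class SqliteMemoryStore:
    """Hypothetical drop-in replacement exposing the SimpleMemoryStore interface."""

    def __init__(self, db_path="memories.db"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory (namespace TEXT, key TEXT, value TEXT, "
            "PRIMARY KEY (namespace, key))"
        )

    def put(self, namespace, key, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO memory VALUES (?, ?, ?)",
            (namespace, key, json.dumps(value)),
        )
        self.conn.commit()

    def get(self, namespace, key):
        row = self.conn.execute(
            "SELECT value FROM memory WHERE namespace = ? AND key = ?", (namespace, key)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def search(self, namespace, query):
        # Keyword match against stored JSON text, mirroring the simple search above
        rows = self.conn.execute(
            "SELECT key, value FROM memory WHERE namespace = ? AND value LIKE ?",
            (namespace, f"%{query}%"),
        ).fetchall()
        return {key: json.loads(value) for key, value in rows}

    def save(self):
        self.conn.commit()  # SQLite persists as it goes

    def load(self):
        pass  # nothing to do; data lives in the database file
```
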
## Configuration Options

- **Persistence**: Enable/disable memory persistence
- **Memory Directory**: Configure where memory files are stored
- **Model Selection**: Choose which LLM model to use via Ollama
- **Base URL**: Configure the Ollama API endpoint

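Taken together, these options are supplied at construction time. A sketch only; `model_name` and `memory_dir` match their use elsewhere in these docs, while `base_url` and `persist` are assumed parameter names:

```python
# Hypothetical construction combining the options above;
# base_url and persist are assumed parameter names
assistant = OllamaMemoryAssistant(
    user_id="demo_user",
    model_name="llama3",                # model selection
    base_url="http://localhost:11434",  # Ollama API endpoint (assumed)
    memory_dir="./memories",            # where memory files are stored
    persist=True,                       # enable persistence (assumed)
)
```
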
273 docs/fine_tuning_guide.md Normal file
@@ -0,0 +1,273 @@
# Fine-Tuning Guide

This guide explains how to fine-tune a language model to adopt a specific persona (like a domain expert) and integrate it with the memory system.

## Overview

Fine-tuning a language model allows you to customize its behavior for specific use cases:

1. Adopt a particular persona or character
2. Specialize in domain-specific knowledge
3. Follow consistent patterns of interaction
4. Improve reliability for specific tasks

## Prerequisites

- Memory System installed
- A chosen base model (e.g., llama3, mistral, gemma3)
- Training data for fine-tuning
- Compute resources (GPU recommended)
- Ollama installed locally

## Process Overview

1. **Collect Training Data**
2. **Format Data for Fine-Tuning**
3. **Perform Fine-Tuning**
4. **Integrate with Memory System**
5. **Test and Iterate**

## Step 1: Collect Training Data

For a domain expert persona, you'll need:

- **Primary Sources**: Books, papers, and articles written by experts in the field
- **Secondary Sources**: Interviews, lectures, and presentations by experts
- **Synthetic Data**: Generated Q&A pairs demonstrating expert responses to domain questions

**Example Data Sources for a Physics Expert:**
- Academic papers on physics topics
- Transcripts of lectures by renowned physicists
- Books explaining complex physics concepts
- Interviews with physics experts

**Data Collection Tips:**
- Focus on material that captures the expert's voice and perspective
- Include a variety of topics within the domain
- Aim for at least 50-100 high-quality examples
- Balance technical knowledge with communication style

## Step 2: Format Data for Fine-Tuning

Most fine-tuning requires data in a specific format:

```json
[
  {
    "role": "system",
    "content": "You are a physics expert with deep knowledge of quantum mechanics, relativity, and particle physics. You explain complex concepts clearly and precisely, using analogies when helpful. You cite relevant research and provide context for your explanations."
  },
  {
    "role": "user",
    "content": "Can you explain quantum entanglement in simple terms?"
  },
  {
    "role": "assistant",
    "content": "Quantum entanglement is like having two coins that are magically connected. When you flip one coin and it lands on heads, the other coin instantly lands on tails, no matter how far apart they are. This happens faster than any signal could travel between them, even at the speed of light. In quantum physics, particles become 'entangled' when they interact in ways that their quantum states can't be described independently. When you measure one particle, you instantly know information about its entangled partner. Einstein called this 'spooky action at a distance' because it seemed to violate locality principles. Experiments by Alain Aspect in the 1980s and many others since have confirmed this bizarre property of quantum mechanics, which has applications in quantum computing and quantum cryptography."
  },
  {
    "role": "user",
    "content": "What's the difference between general and special relativity?"
  },
  {
    "role": "assistant",
    "content": "Special relativity and general relativity are Einstein's two revolutionary theories that fundamentally changed our understanding of space, time, and gravity. Special relativity (1905) deals with physics in the absence of gravity and acceleration. It introduces two key principles: the laws of physics are the same for all non-accelerating observers, and the speed of light is constant regardless of the observer's motion. This leads to effects like time dilation and length contraction. General relativity (1915) extends these concepts to include gravity and acceleration. Instead of viewing gravity as a force, Einstein reconceived it as a curvature of spacetime caused by mass and energy. Massive objects like stars and planets create a kind of 'dent' in the fabric of spacetime, causing other objects to follow curved paths. Special relativity is a special case of general relativity that applies when gravity is negligible. While special relativity unified space and time, general relativity unified space, time, and gravity."
  }
]
```

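If your collected material lives in a simpler structure, such as plain question/answer pairs, a short script can emit this format. A minimal sketch; the input file layout and the `SYSTEM_PROMPT` text are assumptions:

```python
import json

SYSTEM_PROMPT = "You are a physics expert with deep knowledge of quantum mechanics, relativity, and particle physics."

# Assumed input: a JSON list of {"question": ..., "answer": ...} objects
with open("qa_pairs.json") as f:
    pairs = json.load(f)

messages = [{"role": "system", "content": SYSTEM_PROMPT}]
for pair in pairs:
    messages.append({"role": "user", "content": pair["question"]})
    messages.append({"role": "assistant", "content": pair["answer"]})

with open("training_data.json", "w") as f:
    json.dump(messages, f, indent=2)
```
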
## Step 3: Perform Fine-Tuning

### Option 1: Using Ollama (Local Fine-Tuning)

Ollama provides a simple way to create a custom model. Note that a Modelfile shapes behavior through the system prompt, template, and sampling parameters; it does not update the model's weights the way true fine-tuning does.

1. Create a `Modelfile`:

```
FROM gemma3:12b
SYSTEM "You are a physics expert with deep knowledge of quantum mechanics, relativity, and particle physics. You explain complex concepts clearly and precisely, using analogies when helpful. You cite relevant research and provide context for your explanations."

# Prompt template
TEMPLATE """{{ if .System }}{{ .System }}{{ end }}
{{ if .Prompt }}{{ .Prompt }}{{ end }}

{{ .Response }}
"""

# Sampling and stop parameters
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER stop "Human:"
PARAMETER stop "Expert:"
```

2. Build the model:

   ```bash
   ollama create physics-expert -f /path/to/Modelfile
   ```

3. Test your model:

   ```bash
   ollama run physics-expert "What is the uncertainty principle?"
   ```

### Option 2: Using Third-Party Services (More Advanced)

For more advanced fine-tuning:

1. Choose a service:
   - [Hugging Face](https://huggingface.co/)
   - [Google Vertex AI](https://cloud.google.com/vertex-ai)

2. Upload your training data and configure the fine-tuning job
3. Monitor training progress and evaluate results
4. Export or deploy the model

## Step 4: Integrate with Memory System

After fine-tuning, integrate the model with the memory system:

1. Create a specialized version of `OllamaMemoryAssistant`:

```python
from src.memory_model import OllamaMemoryAssistant, IntegratedMemory, SimpleMemoryStore


class ExpertMemoryAssistant(OllamaMemoryAssistant):
    """Memory assistant that embodies a domain expert"""

    def process_query(self, query):
        """Process a query using the expert persona"""
        # Generate context from memory
        context = self.memory.generate_context(query)

        # Combine context and query with an expert-specific system prompt
        prompt = f"""You are a physics expert with deep knowledge of quantum mechanics, relativity, and particle physics.
You have access to your memories:

Memory Context:
{context}

User Question: {query}

Respond as a physics expert, drawing on your memories where relevant.
"""

        # Use the Ollama API to generate a response
        response = self._generate_response(prompt)

        # Add to episodic memory
        self.memory.add_interaction(query, response)

        return response
```

2. Initialize with your fine-tuned model:

```python
import os

# Setup memory directory
memory_dir = "expert_memories"
os.makedirs(memory_dir, exist_ok=True)

# Create expert assistant
assistant = ExpertMemoryAssistant(
    user_id="physics_expert",
    model_name="physics-expert:latest",  # Your fine-tuned model
    memory_dir=memory_dir
)

# Pre-populate with relevant knowledge
assistant.learn_fact("quantum_mechanics", {
    "founders": "Max Planck, Niels Bohr, Werner Heisenberg",
    "key_principles": ["Wave-particle duality", "Uncertainty principle", "Quantum superposition"],
    "applications": ["Quantum computing", "Quantum cryptography"]
})

# Run interactive session
while True:
    query = input("You: ")
    if query.lower() in ["exit", "quit"]:
        break

    response = assistant.process_query(query)
    print(f"Expert: {response}")
```

## Step 5: Test and Iterate

After integration, evaluate and refine your model:

1. **Evaluation Criteria**:
   - Does the model consistently maintain the expert persona?
   - Does it accurately reflect domain knowledge?
   - Does it effectively use the memory system?
   - Is the interaction natural and engaging?

2. **Improvement Process**:
   - Add more training examples for areas where the model is weak
   - Adjust system prompts to better guide the model's behavior
   - Fine-tune hyperparameters to balance creativity and accuracy
   - Add more memories for common topics of discussion

## Alternative: Prompt Engineering Approach

If full fine-tuning is not feasible, a simpler approach is to use prompt engineering:

```python
from src.memory_model import OllamaMemoryAssistant


class ExpertPromptAssistant(OllamaMemoryAssistant):
    """Uses prompt engineering to emulate an expert without fine-tuning"""

    def process_query(self, query):
        """Process a query using prompt engineering for the expert persona"""
        # Generate context from memory
        context = self.memory.generate_context(query)

        # Detailed persona description in the system prompt
        prompt = f"""You are roleplaying as a physics expert with a specialization in quantum mechanics.

Key traits of this expert:
1. Deep knowledge of theoretical physics
2. Explains complex concepts clearly
3. Uses analogies to make difficult concepts accessible
4. Cites relevant research and theories
5. Acknowledges areas of ongoing research or debate
6. Maintains scientific accuracy
7. Balances technical detail with understandable explanations
8. Never breaks character

Memory Context:
{context}

User Question: {query}

Respond as a physics expert would, drawing on your memories where relevant. Maintain a helpful and educational tone throughout.
"""

        # Use the Ollama API to generate a response
        response = self._generate_response(prompt)

        # Add to episodic memory
        self.memory.add_interaction(query, response)

        return response
```

## Comparing Approaches

| Approach | Advantages | Disadvantages |
|----------|------------|---------------|
| **Fine-tuning** | More consistent character portrayal; better domain knowledge; deeper integration of expertise | Requires significant data; computationally expensive; takes time to train |
| **Prompt Engineering** | Quick to implement; no special hardware required; easy to adjust and iterate | Less consistent characterization; uses more tokens per request; may occasionally break character |

## Conclusion

For the most authentic and robust expert emulation, a combination of both approaches is ideal:

1. Start with prompt engineering to quickly test and refine the persona
2. Collect interaction data from your prompt-engineered sessions
3. Use this data to create a fine-tuning dataset
4. Fine-tune a model with the collected data
5. Integrate the fine-tuned model with the memory system
6. Continue gathering data from user interactions for future fine-tuning iterations

This iterative approach allows you to continuously improve the expert emulation while providing value to users at each stage of development.

209 docs/langchain_integration.md Normal file
@@ -0,0 +1,209 @@
# LangChain Integration Guide

This guide explains how to integrate the Memory System with LangChain, allowing you to use LangChain's models and capabilities with our memory system.

## Prerequisites

- Memory System installed
- LangChain installed: `pip install langchain`
- LangChain integration packages (e.g., `pip install langchain-community`)

## Integration Approach

There are two main ways to integrate LangChain with our memory system:

1. **Use LangChain models with our memory system**
2. **Use our memory system as a memory component for LangChain**

## Method 1: Using LangChain Models

You can adapt the `OllamaMemoryAssistant` to work with LangChain models:

```python
from langchain_core.language_models import BaseLLM
from src.memory_model import IntegratedMemory, SimpleMemoryStore


class LangChainMemoryAssistant:
    """Memory assistant that uses LangChain models"""

    def __init__(self, user_id, llm: BaseLLM, memory_dir="./memories"):
        """Initialize with a LangChain model"""
        self.user_id = user_id
        self.llm = llm

        # Setup memory storage
        memory_file = f"{memory_dir}/{user_id}_memory.json"
        store = SimpleMemoryStore(storage_path=memory_file)
        self.memory = IntegratedMemory(store)

    def process_query(self, query):
        """Process a query using the LangChain model and memory context"""
        # Generate context from memory
        context = self.memory.generate_context(query)

        # Combine context and query
        prompt = f"""You are an AI assistant with memory capabilities.

Memory Context:
{context}

User Question: {query}

Please respond to the user's question using the provided memory context where relevant.
"""

        # Use the LangChain model to generate a response
        response = self.llm.invoke(prompt)

        # Add to episodic memory
        self.memory.add_interaction(query, response)

        return response

    def learn_fact(self, concept, details):
        """Add to semantic memory"""
        self.memory.add_fact(concept, details)

    def learn_procedure(self, name, steps):
        """Add to procedural memory"""
        self.memory.add_procedure(name, steps)
```

### Example Usage:

```python
from langchain_community.llms import HuggingFaceHub

# Using Hugging Face with Gemma3
gemma_assistant = LangChainMemoryAssistant(
    user_id="gemma_user",
    llm=HuggingFaceHub(
        repo_id="google/gemma3-12b",
        model_kwargs={"temperature": 0.7, "max_length": 2048}
    )
)

# Using other models
llama_assistant = LangChainMemoryAssistant(
    user_id="llama_user",
    llm=HuggingFaceHub(
        repo_id="meta-llama/Meta-Llama-3-8B",
        model_kwargs={"temperature": 0.7}
    )
)

# Add knowledge
gemma_assistant.learn_fact("albert_einstein", {
    "birth": "1879",
    "theories": ["General Relativity", "Special Relativity"]
})

# Process queries
response = gemma_assistant.process_query("What do we know about Einstein?")
print(response)
```

## Method 2: Using Memory System in LangChain Chains

You can also use our memory system as a custom memory component in LangChain chains:

```python
from typing import Any, Dict, List

from langchain_core.memory import BaseMemory
from src.memory_model import IntegratedMemory, SimpleMemoryStore


class IntegratedLangChainMemory(BaseMemory):
    """Adapter to use IntegratedMemory with LangChain chains"""

    # BaseMemory is a Pydantic model, so fields are declared at class level
    memory: Any = None
    return_messages: bool = True

    def __init__(self, user_id, memory_dir="./memories", **kwargs):
        """Initialize with user ID"""
        super().__init__(**kwargs)
        # Setup memory storage
        memory_file = f"{memory_dir}/{user_id}_langchain_memory.json"
        store = SimpleMemoryStore(storage_path=memory_file)
        self.memory = IntegratedMemory(store)

    @property
    def memory_variables(self) -> List[str]:
        """The variables this memory provides"""
        return ["memory_context"]

    def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, str]:
        """Load memory context based on input"""
        query = inputs.get("input", "")
        context = self.memory.generate_context(query)
        return {"memory_context": context}

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        """Save interaction to memory"""
        query = inputs.get("input", "")
        response = outputs.get("response", "")
        self.memory.add_interaction(query, response)

    def clear(self) -> None:
        """Clear memory (not implemented; would need direct access to the store)"""
        pass
```

### Example Usage with LangChain Chain:

```python
from langchain_community.llms import HuggingFaceHub
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate

# Create the LangChain memory adapter
memory = IntegratedLangChainMemory(user_id="langchain_user")

# Create a prompt template with memory context
template = """You are an AI assistant with memory.

Memory Context:
{memory_context}

Current conversation:
Human: {input}
AI: """

prompt = PromptTemplate(
    input_variables=["memory_context", "input"],
    template=template
)

# Create a conversation chain with the Gemma model
chain = ConversationChain(
    llm=HuggingFaceHub(
        repo_id="google/gemma3-12b",
        model_kwargs={"temperature": 0.7, "max_length": 2048}
    ),
    memory=memory,
    prompt=prompt,
    verbose=True
)

# Add knowledge directly to memory
memory.memory.add_fact("neural_networks", {
    "definition": "Computational systems inspired by biological neural networks",
    "types": ["CNN", "RNN", "Transformer"]
})

# Use the chain
response = chain.predict(input="What are neural networks?")
print(response)
```

## Considerations

1. **API Differences**: LangChain models have different APIs than Ollama, requiring adaptation
2. **Prompt Engineering**: Different models may require different prompt templates
3. **Token Limits**: Be aware of token limits when providing memory context to models (see the sketch below)
4. **Model Availability**: Ensure you have access to the models you want to use
5. **Persistence**: Ensure paths are set correctly for persistent memory storage

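For the token-limit consideration above, a crude guard can keep the memory context within budget. A character-based sketch; a real implementation might count tokens with the model's own tokenizer, and `MAX_CONTEXT_CHARS` is an arbitrary choice:

```python
# Character-based budget; a token-aware version would use the model's tokenizer
MAX_CONTEXT_CHARS = 4000


def truncate_context(context: str, limit: int = MAX_CONTEXT_CHARS) -> str:
    """Keep the most recent portion of the memory context within the budget."""
    return context if len(context) <= limit else context[-limit:]
```
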
## Extended Use Cases

- **Agent Integration**: Use the memory system with LangChain agents
- **Tool Use**: Combine the memory system with LangChain tools
- **Retrieval**: Use the memory system alongside LangChain retrievers for enhanced context
- **Output Parsers**: Use LangChain output parsers to structure memory additions

147 docs/memory_structure.md Normal file
@@ -0,0 +1,147 @@
# Memory Structure Documentation

This document provides detailed information about the memory system's structure and implementation.

## Memory Types

The memory system is built around three cognitive memory types inspired by human memory:

### 1. Semantic Memory

**Description**: Stores factual knowledge, similar to general world knowledge.

**Implementation Details**:
- Stored as key-value pairs in a dictionary structure
- Keys represent concepts (e.g., "albert_einstein", "napoleonic_wars")
- Values contain detailed factual information about the concept
- Access pattern is direct retrieval by concept key
- Search capability allows finding concepts by keywords

**Example Structure**:
```json
{
  "semantic": {
    "napoleon_bonaparte": {
      "birth": "1769, Corsica",
      "death": "1821, Saint Helena",
      "achievements": ["Napoleonic Code", "Military conquests"]
    },
    "theory_of_relativity": {
      "developed_by": "Albert Einstein",
      "year": "1915",
      "key_concepts": ["Space-time curvature", "Mass-energy equivalence"]
    }
  }
}
```

### 2. Episodic Memory

**Description**: Records experiences and conversations, similar to autobiographical memory.

**Implementation Details**:
- Stored as timestamped entries in a list
- Each entry contains the interaction content and metadata
- Chronologically ordered for sequential retrieval
- Can be filtered by time periods or content keywords
- Provides context about past interactions

**Example Structure**:
```json
{
  "episodic": [
    {
      "timestamp": "2023-07-15T14:32:45",
      "query": "Tell me about Napoleon's early life",
      "response": "Napoleon Bonaparte was born on August 15, 1769, in Corsica...",
      "metadata": {
        "session_id": "abc123",
        "concepts_referenced": ["napoleon_bonaparte", "corsica"]
      }
    },
    {
      "timestamp": "2023-07-15T14:35:12",
      "query": "What were his major achievements?",
      "response": "Napoleon's major achievements include the Napoleonic Code...",
      "metadata": {
        "session_id": "abc123",
        "concepts_referenced": ["napoleon_bonaparte", "napoleonic_code"]
      }
    }
  ]
}
```

### 3. Procedural Memory

**Description**: Stores knowledge about how to perform tasks or follow procedures.

**Implementation Details**:
- Stored as named procedures with structured steps
- Each procedure has a unique identifier
- Steps are ordered in sequence
- Can include contextual information about when/how to apply the procedure
- May reference semantic concepts

**Example Structure**:
```json
{
  "procedural": {
    "analyze_historical_figure": {
      "steps": [
        "1. Research early life and background",
        "2. Examine key achievements and contributions",
        "3. Analyze leadership style and decision-making",
        "4. Evaluate historical impact and legacy",
        "5. Compare with contemporaries"
      ],
      "context": "Use this procedure when conducting a comprehensive analysis of any significant historical figure."
    },
    "evaluate_scientific_theory": {
      "steps": [
        "1. Identify the core principles",
        "2. Review the empirical evidence",
        "3. Examine predictions and confirmations",
        "4. Consider criticisms and limitations",
        "5. Assess historical and current relevance"
      ]
    }
  }
}
```

## Memory Storage

The underlying storage system uses a simple key-value store implementation, sketched after this list:

1. **In-Memory Storage**: Primary storage is an in-memory dictionary for fast access
2. **Persistence Layer**: Optional JSON file storage for maintaining memory across sessions
3. **Namespacing**: Each memory type has its own namespace to prevent collisions
4. **Simple Search**: Keyword-based search without requiring embeddings

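A minimal sketch of what such a store might look like internally; the class and attribute names here are illustrative, not the actual implementation in `src.memory_model`:

```python
import json


class NamespacedStoreSketch:
    """Illustrative namespaced key-value store with optional JSON persistence."""

    def __init__(self, storage_path=None):
        self.storage_path = storage_path
        self.data = {}  # {namespace: {key: value}}

    def put(self, namespace, key, value):
        self.data.setdefault(namespace, {})[key] = value
        if self.storage_path:
            self.save()  # persist after each modification

    def get(self, namespace, key):
        return self.data.get(namespace, {}).get(key)

    def search(self, namespace, query):
        # Keyword match against keys and stringified values; no embeddings
        q = query.lower()
        return {
            key: value
            for key, value in self.data.get(namespace, {}).items()
            if q in key.lower() or q in json.dumps(value).lower()
        }

    def save(self):
        with open(self.storage_path, "w") as f:
            json.dump(self.data, f, indent=2)

    def load(self):
        with open(self.storage_path) as f:
            self.data = json.load(f)
```
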
## Memory Integration

The `IntegratedMemory` class combines all memory types into a unified interface:

1. **Access Methods**: Provides methods to add/retrieve from each memory type
2. **Context Generation**: Creates context summaries from all memory types for LLM queries (see the sketch below)
3. **Memory Management**: Handles storage, retrieval, and updating of memories
4. **Persistence**: Manages saving and loading memory from disk storage

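Context generation might assemble the three memory types into one prompt-ready string along these lines; an illustrative sketch, not the actual `generate_context` implementation, and the `"interactions"` key for episodic history is an assumption:

```python
def generate_context_sketch(store, query, max_interactions=3):
    """Assemble a context string from semantic, episodic, and procedural memory."""
    sections = []

    # Facts whose keys or contents match the query keywords
    facts = store.search("semantic", query)
    if facts:
        sections.append("Relevant facts:\n" + "\n".join(f"- {k}: {v}" for k, v in facts.items()))

    # The most recent conversational history (storage key assumed for illustration)
    recent = store.get("episodic", "interactions") or []
    if recent:
        lines = [f"Q: {e['query']}\nA: {e['response']}" for e in recent[-max_interactions:]]
        sections.append("Recent interactions:\n" + "\n\n".join(lines))

    # Procedures whose names or steps match the query
    procedures = store.search("procedural", query)
    if procedures:
        sections.append("Known procedures:\n" + "\n".join(f"- {name}" for name in procedures))

    return "\n\n".join(sections)
```
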
## LLM Integration

The `OllamaMemoryAssistant` class connects the memory system to Ollama:

1. **Query Processing**: Formats user queries with appropriate memory context
2. **Response Generation**: Sends context-enriched queries to the LLM via Ollama
3. **Memory Updates**: Automatically stores interactions in episodic memory
4. **Error Handling**: Manages API errors and connection issues

## Persistence Implementation

Memory persistence is implemented using JSON file storage:

1. **File Path**: Each user gets a dedicated file based on their user ID (see the helper below)
2. **Loading**: Memory is loaded from disk when initializing with an existing user ID
3. **Saving**: Memory is automatically saved after modifications
4. **Format**: Stored as a structured JSON document maintaining the memory hierarchy

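The per-user file convention can be captured in a small helper; a sketch mirroring the path pattern used in the LangChain integration guide:

```python
import os


def memory_path(memory_dir: str, user_id: str) -> str:
    """Each user gets a dedicated JSON file keyed by their user ID."""
    os.makedirs(memory_dir, exist_ok=True)
    return os.path.join(memory_dir, f"{user_id}_memory.json")
```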