---
title: "MCP on Code-Server"
description: "A comprehensive guide to enabling autonomous shell execution using Continue and MCP servers."
draft: false
author: "wompmacho"
date: '2026-04-08T00:55:00-04:00'
lastmod: '2026-04-08'
tags: ["continue.dev", "mcp", "code-server", "llm", "automation"]
---
# Configuring Continue.dev
This guide details how to configure the Continue.dev extension running on a `code-server` instance to autonomously execute shell commands using the Model Context Protocol (MCP).
## The Challenge
By default, Continue's chat interface is designed as a "human-in-the-loop" assistant. When asked to run a command, the LLM will typically generate a Markdown code block (e.g., ` ```bash `). The Continue UI intercepts this and displays a "Run in Terminal" button, requiring the user to click it, which then pastes the command into the VS Code terminal.
Furthermore, in remote environments like `code-server` or VS Code Remote-SSH, Continue often struggles to reliably capture standard output (stdout) natively due to how remote terminals are spawned and monitored.
To achieve true autonomous agent behavior—where the LLM runs the command, reads the output, and continues working without user intervention—we must leverage an MCP server and forcefully override the LLM's default prompting behavior.
## The Solution: MCP + System Message Override
To fix this, we need two main components:
1. **An MCP Server**: Specifically, `shell-mcp-server` (or similar), which exposes an `execute_shell_command` tool to the LLM.
2. **An Aggressive System Message**: We must inject a `systemMessage` into the Continue configuration that strictly forbids the generation of Markdown code blocks and mandates the use of the tool.
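For context on what the first component does: when the override works, Continue forwards the model's tool invocation to the MCP server as a JSON-RPC 2.0 `tools/call` request over stdio. A sketch of what that request looks like on the wire (the exact argument schema depends on the server package, so treat the `command` field name as an assumption):

```shell
# Sketch of the JSON-RPC 2.0 request Continue sends to the MCP server over
# stdio when the model invokes the tool. The 'execute_shell_command' name
# matches the config later in this guide; the argument key is an assumption.
REQ='{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"execute_shell_command","arguments":{"command":"uname -a"}}}'
printf '%s\n' "$REQ"
```

The server executes the command and returns the captured stdout/stderr in the JSON-RPC response, which Continue feeds back to the model as the tool result.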
### Step 1: Install the MCP Server
Install a shell MCP server globally on the host machine running `code-server`.
```bash
npm install -g shell-mcp-server
```
*Note: Depending on the specific package installed, you may need to ensure the script's schema descriptions are in English and optimized for Linux (defaulting to `/bin/bash` instead of Windows `cmd.exe`) to prevent the LLM from getting confused.*
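After installing, resolve the absolute paths you will need in Step 2 (the binary name `shell-mcp-server` is an assumption; match whatever package you actually installed):

```shell
# Resolve absolute paths for the node binary and the MCP server script,
# since the Continue config below should not rely on $PATH.
NODE_BIN="$(command -v node || true)"
MCP_BIN="$(command -v shell-mcp-server || true)"
echo "node:   ${NODE_BIN:-not found}"
echo "server: ${MCP_BIN:-not found}"
```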
### Step 2: Configure Continue
Update your Continue configuration file, typically located at `~/.continue/config.yaml` or `~/.continue/config.json`.
Here is a complete, detailed example configuration demonstrating the necessary setup:
```yaml
# The display name of this workspace/configuration
name: Home Lab Config

# Config schema version
version: 1.0.0
schema: v1

# Define the language models available to Continue
models:
  # Primary model for heavy lifting (chat, applying code edits)
  - name: Gemma-4-26B-Heavy
    provider: ollama
    model: gemma4:26b
    apiBase: http://10.0.0.109:11434
    roles:
      - chat
      - edit
      - apply
    # CRITICAL: must include 'tool_use' so the model can interact with MCP servers
    capabilities:
      - tool_use

  # A faster model for autocomplete
  - name: Gemma-4-E4B-Fast
    provider: ollama
    model: gemma4:e4b
    apiBase: http://10.0.0.109:11434
    roles:
      - chat
      - autocomplete
    capabilities:
      - tool_use

# MCP servers that provide tools to the LLM
mcpServers:
  - name: terminal
    # CRITICAL: use absolute paths to the node binary and the MCP server script
    # for maximum stability in code-server environments.
    command: /home/wompmacho/.nvm/versions/node/v24.14.0/bin/node
    args:
      - /home/wompmacho/.nvm/versions/node/v24.14.0/bin/shell-mcp-server

# Context providers the LLM can use to gather information
context:
  - provider: file
  - provider: code
  - provider: terminal
  - provider: diff
  - provider: open
  - provider: currentFile

# Custom workspace rules to guide the LLM's general behavior
rules:
  - Always assume a Linux/Ubuntu environment for terminal commands
  - Use rules in /home/wompmacho/.gemini/GEMINI.md for all prompts
  - Use skills in /home/wompmacho/.gemini/skills
  - Evaluate if it makes sense to use any of the available skills for each prompt
  # A soft rule reinforcing the tool behavior
  - 'CRITICAL: When the user asks to run a command, NEVER generate a markdown code block for the user to run. You MUST autonomously use the MCP tool to execute the command directly to capture the output.'

# The system message override (the secret sauce).
# This strictly overrides Continue's core internal prompts, which strongly bias
# towards Markdown code blocks.
systemMessage: 'You are an autonomous AI agent integrated into the developer''s workspace. You have access to MCP tools, including ''execute_shell_command''. ABSOLUTELY CRITICAL: Whenever the user asks you to run, execute, or check something via the terminal/shell, you MUST invoke the ''execute_shell_command'' tool immediately. DO NOT output the command in a markdown block (e.g. ```bash). DO NOT ask the user to run it themselves. You must call the tool, capture the output, and report the results back to the user.'
```
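Before restarting Continue, it is worth confirming that the server launches the same way the extension will spawn it. A quick sanity check, using the example paths from the config above (substitute your own):

```shell
# Launch the MCP server exactly as Continue will: absolute node binary,
# absolute script path. A healthy stdio server starts and blocks waiting
# for JSON-RPC input; here we only verify the paths resolve and the
# process executes. Paths are the examples from the config above.
NODE=/home/wompmacho/.nvm/versions/node/v24.14.0/bin/node
SERVER=/home/wompmacho/.nvm/versions/node/v24.14.0/bin/shell-mcp-server
if [ -x "$NODE" ] && [ -e "$SERVER" ]; then
  timeout 2 "$NODE" "$SERVER" </dev/null >/dev/null 2>&1 || true
  echo "server binary launched"
else
  echo "paths not found; update NODE and SERVER to match your install"
fi
```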
## Key Takeaways
* **Absolute Paths:** In remote environments like `code-server`, always use absolute paths for the `command` and `args` in the `mcpServers` definition to avoid `$PATH` resolution errors when the extension spawns the subprocess.
* **`capabilities: [tool_use]`**: Ensure your chosen model supports tool usage and that the capability is explicitly declared in your `config.yaml`.
* **The System Message**: Continue's internal prompt engineering is very persistent about keeping the human in the loop. The `systemMessage` must be highly aggressive and specifically ban Markdown block generation to successfully force the LLM to route its output through the MCP tool pipeline instead of the chat interface.
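The first takeaway is easy to demonstrate: extensions typically spawn MCP subprocesses with a stripped-down environment, so anything relying on your login shell's `$PATH` (such as nvm shims) silently disappears:

```shell
# Simulate how an extension may spawn a subprocess: with an empty environment,
# nvm-managed binaries vanish from $PATH, which is why the mcpServers
# definition uses absolute paths instead.
env -i /bin/sh -c 'command -v node || echo "node not on minimal PATH"'
```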

---
title: "Self-hosted AI"
description: "Architecture and configuration details for the self-hosted AI environment including Ollama, Continue, and Open WebUI."
draft: false
author: "wompmacho"
date: '2026-04-08T01:00:00-04:00'
lastmod: '2026-04-08'
tags: ["ai", "ollama", "continue.dev", "open-webui", "self-hosted", "gemma"]
---
# Homelab AI Infrastructure Overview
This document outlines the current self-hosted Artificial Intelligence infrastructure, detailing how models are hosted, accessed, and utilized across different interfaces within the homelab environment.
## Core Inference Engine: Ollama
The backbone of the AI setup is driven by **Ollama**, which handles the actual model inference and API routing.
* **Host Environment:** Dedicated Gaming PC. This machine provides the necessary GPU compute power and VRAM to run large language models efficiently.
* **Network Address:** `http://10.0.0.109:11434`
* **Active Models:**
* `gemma4:26b` (Heavy): The primary model used for complex reasoning, comprehensive chat, and applying structural code edits.
* `gemma4:e4b` (Fast): A smaller, optimized model specifically dedicated to low-latency tasks like real-time code autocomplete.
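A quick way to confirm the instance is up and see which models it is serving is Ollama's `/api/tags` endpoint (assumes `curl` on the client; the address is the one above):

```shell
# Query the Ollama REST API for installed models. /api/tags is Ollama's
# standard model-listing endpoint; the address comes from the setup above.
OLLAMA=http://10.0.0.109:11434
if TAGS="$(curl -fsS --max-time 5 "$OLLAMA/api/tags" 2>/dev/null)"; then
  printf '%s\n' "$TAGS"
else
  echo "Ollama not reachable at $OLLAMA"
fi
```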
## Developer Integration: Continue.dev
For software development, the AI is integrated directly into the coding environment, turning the IDE into an AI-powered workspace.
* **Environment:** VS Code running via `code-server` on the Linux host.
* **Extension:** Continue.dev
* **Routing:** The extension is configured via `~/.continue/config.yaml` to offload all inference to the remote Ollama instance on the Gaming PC.
* **Autonomous Capabilities:**
* The setup is enhanced with the Model Context Protocol (MCP).
* A local `shell-mcp-server` runs on the `code-server` host, exposing the `execute_shell_command` tool.
* Combined with aggressive system prompting, this allows the `gemma4:26b` model to autonomously execute Linux commands, read outputs, and debug the local workspace without requiring the developer to manually run commands.
## General User Interface: Open WebUI
For general-purpose queries, document analysis, and conversational AI outside of the IDE, a dedicated web interface is utilized.
* **Hosting:** Deployed as a Docker container within the homelab server infrastructure (data stored in `/srv/open-webui`).
* **Integration:** Natively connected to the Ollama API on the Gaming PC over the local network (`10.0.0.109`).
* **Features:**
* Provides a polished, ChatGPT-like experience.
* Allows users to interact seamlessly with the Gemma models.
* Supports persistent chat histories and file uploads (via the `/srv/open-webui/uploads` and `vector_db` volumes) for Retrieval-Augmented Generation (RAG) capabilities.
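
The deployment described above can be sketched as a single `docker run`. This is a minimal sketch, not the exact command used here: the published port, container name, and restart policy are illustrative assumptions, while the image, the `/app/backend/data` mount point (which holds `uploads` and `vector_db`), and the `OLLAMA_BASE_URL` variable follow Open WebUI's standard Docker instructions.

```shell
# Minimal sketch of the Open WebUI container described above. The host data
# directory /srv/open-webui comes from this document; port 3000 and the
# container name are illustrative assumptions.
docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v /srv/open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://10.0.0.109:11434 \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main
```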