<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[LangDB]]></title><description><![CDATA[The Fastest Enterprise AI Gateway. Secure, govern, and optimize AI Traffic across 250+ LLMs using OpenAI-Compatible APIs]]></description><link>https://blog.langdb.ai</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1734672041277/766b819f-87d0-4bb4-a8be-f5c6359803fa.png</url><title>LangDB</title><link>https://blog.langdb.ai</link></image><generator>RSS for Node</generator><lastBuildDate>Wed, 08 Apr 2026 01:08:42 GMT</lastBuildDate><atom:link href="https://blog.langdb.ai/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Deep Integration: Building Multi-Agent Workflows with CrewAI & LangDB AI Gateway]]></title><description><![CDATA[In the world of AI agent development, the challenge isn't just building individual agents—it's creating systems where agents can seamlessly collaborate, access the right tools, and leverage the best models for each task. 
What if you could orchestrate...]]></description><link>https://blog.langdb.ai/deep-integration-building-multi-agent-workflows-with-crewai-and-langdb-ai-gateway</link><guid isPermaLink="true">https://blog.langdb.ai/deep-integration-building-multi-agent-workflows-with-crewai-and-langdb-ai-gateway</guid><category><![CDATA[CrewAI]]></category><category><![CDATA[LangDB]]></category><category><![CDATA[llm]]></category><category><![CDATA[AI]]></category><category><![CDATA[agentic AI]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Wed, 30 Jul 2025 11:29:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1753874550719/a0d6d555-40c4-4882-a12e-9c9506111a7f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the world of AI agent development, the challenge isn't just building individual agents—it's creating systems where agents can seamlessly collaborate, access the right tools, and leverage the best models for each task. What if you could orchestrate complex multi-agent workflows with built-in model access, dynamic tool management, and complete observability?</p>
<p>In this post, we'll show you how CrewAI and LangDB work together to create a deeply integrated agent development platform.</p>
<h2 id="heading-tldr">TL;DR:</h2>
<p>This guide demonstrates the deep integration between <a target="_blank" href="https://www.crewai.com/">CrewAI</a> and the <a target="_blank" href="https://langdb.ai/">LangDB AI Gateway</a>, showing how to build sophisticated multi-agent workflows with seamless model access, dynamic tool management, and built-in observability. We'll use a report generation system as a practical example to showcase how these technologies work together as a unified platform, giving you access to 350+ LLMs.</p>
<p><a target="_blank" href="https://app.langdb.ai/sharing/threads/3becbfed-a1be-ae84-ea3c-4942867a3e22"><img src="https://docs.langdb.ai/~gitbook/image?url=https%3A%2F%2F2821389904-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FjwXvc8GvLP6kUvOASaLP%252Fuploads%252Fs2mLVieySjZEiEe0Vk7N%252FScreenshot%2520from%25202025-06-16%252020-50-02.png%3Falt%3Dmedia%26token%3D32c732db-a610-4447-81ff-fbee27209d62&amp;width=768&amp;dpr=4&amp;quality=100&amp;sign=3f229ad9&amp;sv=2" alt="Final LangDB Thread view for the CrewAI workflow" /></a></p>
<blockquote>
<p>Full Conversation: <a target="_blank" href="https://app.langdb.ai/sharing/threads/3becbfed-a1be-ae84-ea3c-4942867a3e22">https://app.langdb.ai/sharing/threads/3becbfed-a1be-ae84-ea3c-4942867a3e22</a></p>
</blockquote>
<p>This team of agents collaborates to deliver comprehensive reports by combining web research for current information with analysis and professional writing. You can see a full trace of the entire workflow execution.</p>
<h2 id="heading-the-code">The Code</h2>
<p>You can find the complete source code for this project on GitHub:</p>
<ul>
<li><strong>LangDB Samples</strong>: <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/crewai/report-writing-agent">https://github.com/langdb/langdb-samples/tree/main/examples/crewai/report-writing-agent</a></li>
</ul>
<h2 id="heading-the-integration-crewai-langdb-ai-gateway-as-a-unified-platform">The Integration: CrewAI + LangDB AI Gateway as a Unified Platform</h2>
<p>Our system demonstrates how CrewAI and the LangDB AI Gateway work together as a deeply integrated platform:</p>
<ul>
<li><p><strong>CrewAI</strong>: Provides the orchestration framework for multi-agent workflows</p>
</li>
<li><p><strong>LangDB AI Gateway</strong>: Delivers the AI Gateway capabilities, including access to 350+ models, model management, tool management, and observability</p>
</li>
<li><p><strong>Unified Experience</strong>: Both technologies work seamlessly together, appearing as a single development platform</p>
</li>
</ul>
<p>This integration enables you to build sophisticated agent systems without worrying about the underlying infrastructure complexity.</p>
<blockquote>
<p>Check out: <a target="_blank" href="https://docs.langdb.ai/guides/building-agents/building-reporting-writing-agent-using-crewai">https://docs.langdb.ai/guides/building-agents/building-reporting-writing-agent-using-crewai</a> and <a target="_blank" href="https://docs.langdb.ai/features/tracing">https://docs.langdb.ai/features/tracing</a></p>
</blockquote>
<h2 id="heading-seamless-integration-setup">Seamless Integration Setup</h2>
<p>The integration between CrewAI and the LangDB AI Gateway is designed to be as simple as possible. With just a few lines of code, you get access to over 350 models, dynamic tooling, and complete observability.</p>
<h3 id="heading-initialize-the-integration">Initialize the Integration</h3>
<p>The first step is to initialize the LangDB integration with CrewAI. This single call enables all the advanced features:</p>
<pre><code class="lang-python"><span class="hljs-comment"># main.py</span>
<span class="hljs-keyword">from</span> pylangdb.crewai <span class="hljs-keyword">import</span> init
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

<span class="hljs-comment"># Load environment variables and initialize LangDB integration</span>
load_dotenv()
init()  <span class="hljs-comment"># This enables deep integration with the LangDB AI Gateway</span>
</code></pre>
<h3 id="heading-configure-the-langdb-ai-gateway">Configure the LangDB AI Gateway</h3>
<p>Set up your LangDB AI Gateway credentials to enable model access and tool management:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">export</span> LANGDB_API_KEY=<span class="hljs-string">"&lt;your_langdb_api_key&gt;"</span>
<span class="hljs-built_in">export</span> LANGDB_PROJECT_ID=<span class="hljs-string">"&lt;your_langdb_project_id&gt;"</span>
<span class="hljs-built_in">export</span> LANGDB_API_BASE_URL=<span class="hljs-string">'https://api.us-east-1.langdb.ai'</span>
</code></pre>
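<p>Because a missing credential only surfaces later as an authentication error, it can help to fail fast at startup. Here is a small convenience helper of our own (a sketch, not part of the <code>pylangdb</code> SDK):</p>

```python
import os

# The three settings the gateway integration expects, per the exports above
REQUIRED_VARS = ["LANGDB_API_KEY", "LANGDB_PROJECT_ID", "LANGDB_API_BASE_URL"]

def check_langdb_env(env=os.environ):
    """Raise early if any LangDB setting is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing LangDB settings: {', '.join(missing)}")
```

<p>Calling <code>check_langdb_env()</code> right after <code>load_dotenv()</code> turns a confusing runtime failure into a clear configuration error.</p>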
<h3 id="heading-create-langdb-ai-gateway-enabled-llms">Create LangDB AI Gateway-Enabled LLMs</h3>
<p>Define a helper function that creates LLMs with full LangDB AI Gateway integration:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> crewai <span class="hljs-keyword">import</span> LLM
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_llm</span>(<span class="hljs-params">model</span>):</span>
    <span class="hljs-keyword">return</span> LLM(
        model=model,
        api_key=os.environ.get(<span class="hljs-string">"LANGDB_API_KEY"</span>),
        base_url=os.environ.get(<span class="hljs-string">"LANGDB_API_BASE_URL"</span>),
        extra_headers={
            <span class="hljs-string">"x-project-id"</span>: os.environ.get(<span class="hljs-string">"LANGDB_PROJECT_ID"</span>)
        }
    )
</code></pre>
<h2 id="heading-deep-integration-features">Deep Integration Features</h2>
<h3 id="heading-model-selection-across-350-models">Model Selection across 350+ Models</h3>
<p>The LangDB AI Gateway's model access capabilities are fully integrated with CrewAI. You can specify any of the 350+ supported models, and the LangDB AI Gateway will route the request for you:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Each agent can use different models seamlessly</span>
<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">researcher</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'researcher'</span>],
        llm=create_llm(<span class="hljs-string">"openai/langdb/reportresearcher_9wzgx5n5"</span>) <span class="hljs-comment"># Virtual Model with tools</span>
    )

<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">analyst</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'analyst'</span>],
        llm=create_llm(<span class="hljs-string">"openai/anthropic/claude-3.7-sonnet"</span>) <span class="hljs-comment"># Direct model access</span>
    )

<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">report_writer</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'report_writer'</span>],
        llm=create_llm(<span class="hljs-string">"openai/gpt-4o"</span>) <span class="hljs-comment"># Another model provider</span>
    )
</code></pre>
<h3 id="heading-dynamic-tool-management">Dynamic Tool Management</h3>
<p>The LangDB AI Gateway's <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-models">Virtual Models</a> and <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-mcp-servers">Virtual MCPs</a> integrate seamlessly with CrewAI agents. Tools are managed centrally in the LangDB AI Gateway but appear natively to your CrewAI agents:</p>
<pre><code class="lang-python"><span class="hljs-comment"># The researcher agent automatically gets access to web search tools</span>
<span class="hljs-comment"># through the LangDB Virtual Model, without any additional configuration</span>
<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">researcher</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'researcher'</span>],
        llm=create_llm(<span class="hljs-string">"openai/langdb/reportresearcher_9wzgx5n5"</span>) <span class="hljs-comment"># Tools included automatically</span>
    )
</code></pre>
<h3 id="heading-built-in-observability">Built-in Observability</h3>
<p>Every interaction is automatically traced and observable through the LangDB AI Gateway's integrated tracing system:</p>
<pre><code class="lang-python"><span class="hljs-comment"># No additional tracing code needed - it's all automatic</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_report</span>(<span class="hljs-params">topic</span>):</span>
    crew_instance = ReportGenerationCrew()
    result = crew_instance.crew().kickoff(inputs={<span class="hljs-string">"topic"</span>: topic})
    <span class="hljs-keyword">return</span> result  <span class="hljs-comment"># Full trace automatically captured in the LangDB AI Gateway</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753873041808/81dc834e-b175-46ae-9e74-cb4645287494.png" alt="Screenshot of interface displaying a task management dashboard. It shows a list of processes with their durations, a visual timeline of task execution, and detailed metadata about a specific run, including trace and thread IDs and timestamps. The background is dark, and the layout is divided into sections for easy navigation." class="image--center mx-auto" /></p>
<h2 id="heading-advanced-integration-capabilities">Advanced Integration Capabilities</h2>
<h3 id="heading-virtual-model-integration">Virtual Model Integration</h3>
<p>LangDB AI Gateway <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-models">Virtual Models</a> work seamlessly with CrewAI agents. You can create models with specific capabilities and use them directly:</p>
<pre><code class="lang-python"><span class="hljs-comment"># This agent automatically gets web search capabilities</span>
<span class="hljs-comment"># through the LangDB Virtual Model configuration</span>
<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">researcher</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'researcher'</span>],
        llm=create_llm(<span class="hljs-string">"openai/langdb/reportresearcher_9wzgx5n5"</span>)
    )
</code></pre>
<p>The Virtual Model is configured in the LangDB AI Gateway UI to include:</p>
<ul>
<li><p>Base model (e.g., GPT-4.1)</p>
</li>
<li><p>Attached MCP tools (e.g., Tavily Search)</p>
</li>
<li><p>Custom instructions and parameters</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753872721466/a3678e9f-a7fd-4a54-ad76-362feea24b19.png" alt="LangDB UI showing the Virtual Model configuration with an attached MCP Server" class="image--center mx-auto" /></p>
<h3 id="heading-mcp-tool-integration">MCP Tool Integration</h3>
<p>The LangDB AI Gateway's <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-mcp-servers">MCP (Model Context Protocol)</a> tools integrate with Virtual Models:</p>
<p><img src="https://docs.langdb.ai/~gitbook/image?url=https%3A%2F%2F2821389904-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FjwXvc8GvLP6kUvOASaLP%252Fuploads%252Fma2UJXqlZwy2NY7hb9bQ%252FSettingVM.gif%3Falt%3Dmedia%26token%3D2a69ebbd-89ba-4b7f-b67f-488607fdbfb6&amp;width=768&amp;dpr=4&amp;quality=100&amp;sign=8261b999&amp;sv=2" alt="Setting Up Virtual MCP Server With Virtual Model" /></p>
<p>This means:</p>
<ul>
<li><p><strong>No Tool Configuration</strong>: Tools are managed in the LangDB AI Gateway UI</p>
</li>
<li><p><strong>Dynamic Updates</strong>: Change tools without redeploying agents</p>
</li>
<li><p><strong>Automatic Tracing</strong>: All tool calls are traced in the LangDB AI Gateway</p>
</li>
<li><p><strong>Cost Tracking</strong>: Tool usage costs are tracked automatically</p>
</li>
</ul>
<h3 id="heading-multi-provider-model-access">Multi-Provider Model Access</h3>
<p>The LangDB AI Gateway's model access capabilities are fully integrated with CrewAI, giving you access to 350+ models across providers:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Each agent can use different models seamlessly</span>
<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">researcher</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'researcher'</span>],
        llm=create_llm(<span class="hljs-string">"openai/langdb/reportresearcher_9wzgx5n5"</span>) <span class="hljs-comment"># OpenAI + tools</span>
    )

<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">analyst</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'analyst'</span>],
        llm=create_llm(<span class="hljs-string">"openai/anthropic/claude-3.7-sonnet"</span>) <span class="hljs-comment"># Anthropic</span>
    )

<span class="hljs-meta">@agent</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">report_writer</span>(<span class="hljs-params">self</span>) -&gt; Agent:</span>
    <span class="hljs-keyword">return</span> Agent(
        config=self.agents_config[<span class="hljs-string">'report_writer'</span>],
        llm=create_llm(<span class="hljs-string">"openai/gpt-4o"</span>) <span class="hljs-comment"># OpenAI</span>
    )
</code></pre>
<h2 id="heading-integrated-workflow-management">Integrated Workflow Management</h2>
<h3 id="heading-crew-definition-with-langdb-ai-gateway-integration">Crew Definition with LangDB AI Gateway Integration</h3>
<p>The crew definition remains simple while leveraging all LangDB AI Gateway capabilities:</p>
<pre><code class="lang-python"><span class="hljs-meta">@crew</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">crew</span>(<span class="hljs-params">self</span>) -&gt; Crew:</span>
    <span class="hljs-keyword">return</span> Crew(
        agents=[self.researcher(), self.analyst(), self.report_writer()],
        tasks=[self.research_task(), self.analysis_task(), self.report_writing_task()],
        process=Process.sequential
    )
</code></pre>
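<p>Under <code>Process.sequential</code>, tasks run in order and each task's output becomes context for the next. Stripped of the framework, the control flow amounts to something like this (a plain-Python illustration with stand-in task functions, not CrewAI internals):</p>

```python
def run_sequential(tasks, topic):
    """Run tasks in order, feeding each output into the next task's context."""
    context = ""
    for task in tasks:
        context = task(topic, context)
    return context

# Stand-ins for the researcher, analyst, and report-writer steps
research = lambda topic, ctx: f"findings on {topic}"
analyze = lambda topic, ctx: f"analysis of ({ctx})"
write = lambda topic, ctx: f"report based on ({ctx})"

result = run_sequential([research, analyze, write], "AI marketing")
```

<p>The real crew adds model calls, tool access, and tracing at every hop, but the sequential handoff is the same shape.</p>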
<h3 id="heading-task-execution-with-built-in-observability">Task Execution with Built-in Observability</h3>
<p>Tasks execute with full LangDB AI Gateway integration:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_report</span>(<span class="hljs-params">topic</span>):</span>
    crew_instance = ReportGenerationCrew()
    result = crew_instance.crew().kickoff(inputs={<span class="hljs-string">"topic"</span>: topic})
    <span class="hljs-keyword">return</span> result  <span class="hljs-comment"># Full trace automatically available in the LangDB AI Gateway</span>
</code></pre>
<h2 id="heading-integration-benefits">Integration Benefits</h2>
<h3 id="heading-simplified-development">Simplified Development</h3>
<ul>
<li><p><strong>Single Setup</strong>: One initialization call enables all features</p>
</li>
<li><p><strong>No Tool Management</strong>: Tools are managed centrally in the LangDB AI Gateway</p>
</li>
<li><p><strong>Automatic Tracing</strong>: No additional observability code needed</p>
</li>
</ul>
<h3 id="heading-dynamic-capabilities">Dynamic Capabilities</h3>
<ul>
<li><p><strong>Model Switching</strong>: Switch among 350+ models without code changes</p>
</li>
<li><p><strong>Tool Updates</strong>: Add/remove tools through the LangDB AI Gateway UI</p>
</li>
<li><p><strong>Model Flexibility</strong>: Mix and match models from different providers seamlessly</p>
</li>
</ul>
<h3 id="heading-production-ready">Production Ready</h3>
<ul>
<li><p><strong>Built-in Observability</strong>: Complete traces for every execution</p>
</li>
<li><p><strong>Cost Tracking</strong>: Automatic cost and usage monitoring</p>
</li>
<li><p><strong>Performance Monitoring</strong>: Latency and performance metrics</p>
</li>
<li><p><strong>Error Handling</strong>: Integrated error tracking and debugging</p>
</li>
</ul>
<h2 id="heading-running-the-integrated-system">Running the Integrated System</h2>
<p>Execute the workflow with full LangDB integration:</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    generate_report(<span class="hljs-string">"The Impact of AI on Social Media Marketing in 2024"</span>)
</code></pre>
<p>The system automatically:</p>
<ul>
<li><p>Provides access to the appropriate models from 350+ choices</p>
</li>
<li><p>Manages tool access</p>
</li>
<li><p>Captures complete traces</p>
</li>
<li><p>Tracks costs and performance</p>
</li>
</ul>
<h2 id="heading-real-world-integration-example">Real-World Integration Example</h2>
<p>Here's what the integration looks like in practice:</p>
<pre><code class="lang-markdown"><span class="hljs-section"># The Impact of AI on Social Media Marketing in 2024</span>

<span class="hljs-section">## Executive Summary</span>

Artificial Intelligence has fundamentally transformed social media marketing in 2024, creating new opportunities and challenges for businesses worldwide. This report examines the current state of AI integration in social media marketing, key trends, and strategic implications for marketers.

<span class="hljs-section">## Key Findings</span>

<span class="hljs-section">### 1. AI-Powered Content Creation</span>
<span class="hljs-bullet">-</span> <span class="hljs-strong">**Automated Content Generation**</span>: 73% of marketers now use AI tools for content creation
<span class="hljs-bullet">-</span> <span class="hljs-strong">**Personalization at Scale**</span>: AI enables hyper-personalized content delivery to specific audience segments
<span class="hljs-bullet">-</span> <span class="hljs-strong">**Real-time Optimization**</span>: Dynamic content adjustment based on performance metrics

<span class="hljs-section">### 2. Advanced Analytics and Insights</span>
<span class="hljs-bullet">-</span> <span class="hljs-strong">**Predictive Analytics**</span>: AI models forecast campaign performance with 85% accuracy
<span class="hljs-bullet">-</span> <span class="hljs-strong">**Sentiment Analysis**</span>: Real-time brand sentiment monitoring across platforms
<span class="hljs-bullet">-</span> <span class="hljs-strong">**Competitive Intelligence**</span>: Automated tracking of competitor strategies and performance
</code></pre>
<p>Every step of this report generation was powered by the deep integration between CrewAI and LangDB, with complete observability into the process.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>The integration between CrewAI and the LangDB AI Gateway creates a unified platform for building sophisticated multi-agent systems. This deep integration provides:</p>
<ul>
<li><p><strong>Seamless Development</strong>: Build complex workflows with simple, clean code</p>
</li>
<li><p><strong>Dynamic Capabilities</strong>: Change models and tools without redeployment</p>
</li>
<li><p><strong>Built-in Observability</strong>: Complete visibility into every aspect of your workflows</p>
</li>
<li><p><strong>Production Ready</strong>: Enterprise-grade monitoring and management</p>
</li>
</ul>
<p>This architecture enables rapid development and iteration, allowing you to build truly powerful and intelligent agentic systems for any domain.</p>
<p>Ready to build your own? Start building for free on the <a target="_blank" href="https://docs.langdb.ai/quick-start">LangDB AI Gateway</a> or explore <a target="_blank" href="https://docs.crewai.com/">CrewAI</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Kimi K2 vs Grok 4: Open-Source Challenger vs Premium Powerhouse]]></title><description><![CDATA[Choosing the right large-language model (LLM) has moved beyond “GPT-4o or bust.” In 2025, Kimi K2 (Moonshot AI) and Grok 4 (xAI) give developers two very different yet highly capable options: an open-source trillion-parameter Mixture-of-Experts model...]]></description><link>https://blog.langdb.ai/kimi-k2-vs-grok-4-open-source-challenger-vs-premium-powerhouse</link><guid isPermaLink="true">https://blog.langdb.ai/kimi-k2-vs-grok-4-open-source-challenger-vs-premium-powerhouse</guid><category><![CDATA[LangDB]]></category><category><![CDATA[grok]]></category><category><![CDATA[Kimi]]></category><category><![CDATA[llm]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Thu, 24 Jul 2025 05:59:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1753336692381/f02c8b84-25c0-4450-a2b5-69b6f5537777.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Choosing the right large-language model (LLM) has moved beyond “GPT-4o or bust.” In 2025, <a target="_blank" href="https://langdb.ai/app/providers/openrouter/kimi-k2"><strong>Kimi K2</strong></a> (Moonshot AI) and <a target="_blank" href="https://langdb.ai/app/providers/xai/grok-4"><strong>Grok 4</strong></a> (xAI) give developers two very different yet highly capable options: an open-source trillion-parameter Mixture-of-Experts model on one side and a premium, real-time, multi-agent powerhouse on the other. This article walks through their architectures, benchmark results, practical use cases, and how you can access both through LangDB AI gateway.</p>
<h2 id="heading-tldr"><strong>TL;DR:</strong></h2>
<p>In a nutshell, <strong>Kimi K2</strong> is an open-source MoE with 1 T parameters and a 128 K token context, self-hostable and priced at just $0.15/$2.50 per million tokens—ideal for high-volume or agentic workflows—while <strong>Grok 4</strong> is a proprietary dense model with 1.7 T parameters, a 256 K token window plus live web/X hooks, costing $3/$15 per million tokens and excelling at deep reasoning and real-time data. Benchmarks show Grok leading on live-execution and toughest reasoning tasks, with Kimi matching on static coding and general-knowledge tests at one-tenth the cost. In a real-world LangGraph run, Kimi K2 completed the pipeline in half the time (86 s vs 168 s) at one-tenth the cost ($0.012 vs $0.128).</p>
<h2 id="heading-architecture">Architecture</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Model</td><td>Core design</td><td>Params (total / active)</td><td>Context window</td><td>Stand-out features</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Kimi K2</strong></td><td>Mixture-of-Experts</td><td>1 T / 32 B active</td><td>128 K tokens (up to 1 M offline)</td><td>MuonClip optimizer, open weights</td></tr>
<tr>
<td><strong>Grok 4</strong></td><td>Dense + RL-tuned; “Heavy” = multi-agent</td><td>≈ 1.7 T</td><td>256 K via API</td><td>Real-time X/Twitter &amp; web search, Colossus-scale training</td></tr>
</tbody>
</table>
</div><h3 id="heading-kimi-k2">Kimi K2</h3>
<p>Moonshot’s MoE activates just 32 B parameters per token, giving near-GPT-4o performance at far lower compute. The open Apache 2.0 license plus 128 K context makes it attractive for self-hosting and agentic workflows.</p>
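<p>The compute savings are easy to sanity-check: with 32 B of 1 T parameters active, only a small slice of the network fires per token (back-of-envelope arithmetic that ignores routing overhead):</p>

```python
total_params = 1_000_000_000_000  # 1 T parameters in total
active_params = 32_000_000_000    # 32 B active per token

fraction = active_params / total_params
print(f"{fraction:.1%} of parameters active per token")
```
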
<h3 id="heading-grok-4">Grok 4</h3>
<p>xAI trained Grok 4 on 200 K H100 GPUs; the Heavy variant federates multiple Groks that “debate” their answers, boosting deep reasoning. Real-time data hooks mean answers stay current without extra retrieval plumbing.</p>
<h2 id="heading-benchmarks">Benchmarks</h2>
<p><img src="https://ppl-ai-code-interpreter-files.s3.amazonaws.com/web/direct-files/11b5068937ec17eed2f955ede19e7ac3/f128064b-6c9d-44c2-b6b8-7741ca5012df/c95f5002.png" alt="Bar chart titled &quot;LLM Benchmarks Comparison&quot; showing performance of four models: Kimi K2, Grok 4, Claude 4 Sonnet, and Gemini 2.5 Pro. Benchmarks include GPQA Diamond, MMLU Score, LiveCodeBench, and SWE-bench Verified, with scores ranging from about 55% to 95%." /></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Suite</td><td>Kimi K2</td><td>Grok 4</td><td>Notes</td></tr>
</thead>
<tbody>
<tr>
<td><strong>SWE-bench Verified</strong></td><td>65.8 % (71.6 % w/ parallel)</td><td><strong>73 %</strong></td><td>Real-world GitHub bug-fixing</td></tr>
<tr>
<td><strong>LiveCodeBench</strong></td><td>53.7 %</td><td><strong>79.4 %</strong></td><td>Code must compile &amp; run</td></tr>
<tr>
<td><strong>MMLU</strong></td><td><strong>89.5 %</strong></td><td>86.6 %</td><td>General knowledge</td></tr>
<tr>
<td><strong>GPQA Diamond</strong></td><td>75.1 %</td><td><strong>88.4 %</strong></td><td>Grad-level physics</td></tr>
</tbody>
</table>
</div><p><em>Take-away:</em> Grok 4 dominates the hardest reasoning and live-execution tasks; Kimi stays neck-and-neck on static coding and actually wins broad knowledge tests—all while being roughly an order of magnitude cheaper.</p>
<h2 id="heading-use-cases">Use Cases</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Scenario</td><td>Best fit</td><td>Rationale</td><td>Self-hostable?</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Autonomous agents &amp; CI/CD</strong></td><td><strong>Kimi K2</strong></td><td>Native sandboxed tool-calling + open plugin ecosystem</td><td>✅ Yes</td></tr>
<tr>
<td><strong>Whole-repo deep debugging</strong></td><td><strong>Grok 4 Heavy</strong></td><td>256 K context + multi-agent reasoning spots elusive bugs</td><td>❌ No</td></tr>
<tr>
<td><strong>Budget-constrained startups</strong></td><td><strong>Kimi K2</strong></td><td>$0.15 / $2.50 per M tokens vs $3 / $15 per M tokens; self-host option</td><td>✅ Yes</td></tr>
<tr>
<td><strong>Regulated enterprise, live data</strong></td><td><strong>Grok 4</strong></td><td>SOC 2/GDPR compliance; real-time search; enterprise support</td><td>❌ No</td></tr>
</tbody>
</table>
</div><p>Both models provide correct solutions, but Kimi K2’s open-source nature and lower cost make it more accessible for high-volume or repetitive tasks, while Grok 4’s premium features justify its higher price when you need complex reasoning or real-time data.</p>
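<p>Those per-token prices compound quickly at volume. A quick cost model using the prices from the table above (USD per million tokens; the workload numbers are purely illustrative):</p>

```python
def cost_usd(input_tokens, output_tokens, in_price, out_price):
    """Cost in USD given per-million-token input and output prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Illustrative workload: 1 M input tokens, 200 K output tokens
kimi = cost_usd(1_000_000, 200_000, 0.15, 2.50)
grok = cost_usd(1_000_000, 200_000, 3.00, 15.00)
print(f"Kimi K2: ${kimi:.2f}  Grok 4: ${grok:.2f}  ratio: {grok / kimi:.1f}x")
```

<p>At this workload the list prices put Grok 4 at roughly nine times the spend, before factoring in Kimi K2's self-hosting option.</p>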
<h2 id="heading-accessibility-through-langdb">Accessibility through LangDB</h2>
<p>Both models (alongside Claude 4, Gemini 2.5 Pro, and 300+ others) are available through <strong>LangDB’s OpenAI-compatible API</strong>.</p>
<p><strong>LangDB</strong> is the fastest enterprise AI gateway—fully built in Rust—to <strong>secure, govern, and optimize</strong> AI traffic across <strong>250+ LLMs</strong> via a single OpenAI-compatible API. Key features include:</p>
<ul>
<li><p><strong>Unified access</strong> to Kimi K2, Grok 4, Claude 4, Gemini 2.5 Pro, and hundreds more</p>
</li>
<li><p><strong>Observability &amp; tracing</strong> for every request and agent step</p>
</li>
<li><p><strong>Guardrails</strong> to enforce policy and compliance</p>
</li>
<li><p><strong>Cost control</strong> without changing your code</p>
</li>
<li><p><strong>Framework-agnostic</strong>—works seamlessly with LangChain, LangGraph, and any OpenAI-compatible library</p>
</li>
</ul>
<p>Integrate in minutes and let LangDB handle model management, metrics, and governance so you can focus on building.</p>
<h2 id="heading-real-world-langgraph-performance">Real-World LangGraph Performance</h2>
<p>To see these differences in action, we ran the same LangGraph data-extraction pipeline against both models (full traces linked below):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753331774748/a4a58bcf-87a2-4e7c-a747-487bf925fc0b.png" alt="Interface showing processing details of a complex meeting transcript. The screen displays task names, their execution times, and a visual timeline of activities. On the right, detailed logs provide trace and run IDs, start and finish times, and JSON input/output data." class="image--center mx-auto" /></p>
<p>Grok 4: <a target="_blank" href="https://app.langdb.ai/sharing/threads/4d25db11-e011-41be-b7bc-c12f7edee2fb">https://app.langdb.ai/sharing/threads/4d25db11-e011-41be-b7bc-c12f7edee2fb</a></p>
<p>Kimi K2: <a target="_blank" href="https://app.langdb.ai/sharing/threads/82403cde-533a-41b5-bf03-92abceb2b018">https://app.langdb.ai/sharing/threads/82403cde-533a-41b5-bf03-92abceb2b018</a></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Model</td><td>Cost (USD)</td><td>Time Taken (s)</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Grok 4</strong></td><td>0.128</td><td>167.87</td></tr>
<tr>
<td><strong>Kimi K2</strong></td><td>0.012</td><td>86.00</td></tr>
</tbody>
</table>
</div><blockquote>
<p><strong>See it in action:</strong></p>
<ul>
<li><p>LangGraph data-extraction guide → <a target="_blank" href="https://docs.langdb.ai/guides/building-agents/building-complex-data-extraction-with-langgraph">https://docs.langdb.ai/guides/building-agents/building-complex-data-extraction-with-langgraph</a></p>
</li>
<li><p>Full code examples → <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/langchain/langchain-data-extraction">https://github.com/langdb/langdb-samples/tree/main/examples/langchain/langchain-data-extraction</a></p>
</li>
</ul>
</blockquote>
<p>On the same LangGraph pipeline, Kimi K2 ran in roughly half the time and at one-tenth the cost of Grok 4. This real-world test underlines the cost-efficiency and speed advantages of an open-source MoE model for typical data-extraction workflows.</p>
<p>However, if your pipeline demands the deepest reasoning chains or the freshest real-time web results, Grok 4’s premium features may still be worth the extra spend and latency. Evaluate your throughput and SLA requirements to pick the best fit.</p>
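<p>A quick back-of-the-envelope projection helps with that evaluation. The figures below come straight from the benchmark table above (USD and seconds per pipeline run); the monthly-spend helper is illustrative arithmetic, not a pricing API.</p>

```python
# Per-run cost and wall-clock time from the LangGraph comparison above.
runs = {
    "grok-4": {"cost_usd": 0.128, "seconds": 167.87},
    "kimi-k2": {"cost_usd": 0.012, "seconds": 86.00},
}

def monthly_cost(model, runs_per_day, days=30):
    """Project monthly spend for a given daily pipeline volume."""
    return runs[model]["cost_usd"] * runs_per_day * days

cost_ratio = runs["grok-4"]["cost_usd"] / runs["kimi-k2"]["cost_usd"]
speedup = runs["grok-4"]["seconds"] / runs["kimi-k2"]["seconds"]
print(round(cost_ratio, 1), round(speedup, 1))
```

<p>At 100 runs per day the gap compounds quickly, which is why per-run numbers matter more than per-token sticker prices for batch workloads.</p>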
<h2 id="heading-conclusion">Conclusion</h2>
<p>AI’s future isn’t one-size-fits-all. <strong>Kimi K2</strong> democratizes near-SOTA coding for pennies and full control, while <strong>Grok 4</strong> pushes the reasoning ceiling and keeps answers current—at a premium. With LangDB, you can seamlessly plug both into your stack and choose the right model per task, without rewriting your integration. Pick your path, optimize your costs, and get building!</p>
]]></content:encoded></item><item><title><![CDATA[Designing Smart Multi-Agent Workflows with Agno & LangDB]]></title><description><![CDATA[Build a multi-agent financial analysis team with LangDB and Agno that can reason, research, and report on complex financial data.
In the world of finance, staying ahead requires more than just data; it requires deep analysis, contextual awareness, an...]]></description><link>https://blog.langdb.ai/designing-smart-multi-agent-workflows-with-agno-and-langdb</link><guid isPermaLink="true">https://blog.langdb.ai/designing-smart-multi-agent-workflows-with-agno-and-langdb</guid><category><![CDATA[LangDB]]></category><category><![CDATA[Agno]]></category><category><![CDATA[llm]]></category><category><![CDATA[agents]]></category><category><![CDATA[mcp]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Thu, 17 Jul 2025 11:32:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1753073265965/6627505a-2b00-4c2d-bfc9-c310e68f9902.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Build a multi-agent financial analysis team with <a target="_blank" href="https://langdb.ai/">LangDB</a> and <a target="_blank" href="https://www.agno.com?utm_source=langdb&amp;utm_medium=partner-content&amp;utm_campaign=partner-technical&amp;utm_content=langdb">Agno</a> that can reason, research, and report on complex financial data.</p>
<p>In the world of finance, staying ahead requires more than just data; it requires deep analysis, contextual awareness, and collaborative reasoning. What if you could build a team of AI agents to do this for you? In this post, we'll show you how to build a sophisticated, multi-agent financial analysis team using LangDB and Agno.</p>
<h2 id="heading-tldr">TL;DR:</h2>
<p>This guide walks you through building a multi-agent workflow using Agno for orchestration and LangDB as the AI Gateway. We'll use a financial analysis team as a practical example to show how you can build sophisticated agent systems that are easy to manage and debug, thanks to LangDB's end-to-end tracing, dynamic tooling, and access to over 350 LLMs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1752745323378/85a6a9c0-3fcf-4a98-a5e3-800de54e674c.png" alt="Full Conversation with Agno" class="image--center mx-auto" /></p>
<p>This team of agents collaborates to deliver in-depth insights on publicly traded companies by combining web research for market sentiment with hard financial data analysis. You can see a <a target="_blank" href="https://app.langdb.ai/sharing/threads/630b2ded-15ae-43d9-8a7a-d6dd9d649655">full trace of the final agent's execution</a>.</p>
<h2 id="heading-the-code">The Code</h2>
<p>You can find the complete source code for this project on GitHub:</p>
<ul>
<li><strong>LangDB Samples</strong>: <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/agno/reasoning-finance-team">https://github.com/langdb/langdb-samples/tree/main/examples/agno/reasoning-finance-team</a></li>
</ul>
<h2 id="heading-the-architecture-a-trio-of-financial-experts">The Architecture: A Trio of Financial Experts</h2>
<p>Our system is composed of two specialist agents orchestrated by a coordinating team:</p>
<ol>
<li><p><strong>Web Search Agent</strong>: Gathers the latest news and market sentiment from the internet.</p>
</li>
<li><p><strong>Finance Agent</strong>: Equipped with <code>YFinanceTools</code> to fetch and analyze quantitative stock data, including pricing, fundamentals, and analyst recommendations.</p>
</li>
<li><p><strong>Reasoning Finance Team</strong>: A coordinator that directs the two agents, synthesizes their findings, and produces a final, comprehensive report.</p>
</li>
</ol>
<p>LangDB provides the backbone for this system. As an AI Gateway, it enables seamless access to over 350 LLMs, simplifies tool integration, and provides full end-to-end tracing and observability into each agent's actions and the team's collaborative process.</p>
<blockquote>
<p>Check out: <a target="_blank" href="https://docs.agno.com/models/langdb">https://docs.agno.com/models/langdb</a> and <a target="_blank" href="https://docs.agno.com/observability/langdb">https://docs.agno.com/observability/langdb</a></p>
</blockquote>
<h2 id="heading-enhanced-tracing-with-pylangdbagnoinit">Enhanced Tracing with <code>pylangdb.agno.init()</code></h2>
<p>While you can use LangDB as a provider in Agno directly, calling <code>pylangdb.agno.init()</code> unlocks deeper, end-to-end tracing. This function provides additional metadata and observability by automatically instrumenting the entire Agno framework, giving you complete visibility into your agent's workflows.</p>
<pre><code class="lang-python"><span class="hljs-comment"># main.py</span>
<span class="hljs-keyword">from</span> pylangdb.agno <span class="hljs-keyword">import</span> init

<span class="hljs-comment"># Initialize LangDB for enhanced tracing *before* importing any Agno modules.</span>
init()

<span class="hljs-keyword">from</span> agno.agent <span class="hljs-keyword">import</span> Agent
<span class="hljs-keyword">from</span> agno.team <span class="hljs-keyword">import</span> Team
<span class="hljs-comment"># ... other imports</span>
</code></pre>
<p>As an official provider integrated with Agno, LangDB requires you to set up your credentials. You'll need to export your LangDB API key and Project ID as environment variables. You can find these in your <a target="_blank" href="https://app.langdb.ai/settings/api_keys">LangDB project settings</a>.</p>
<pre><code class="lang-bash"><span class="hljs-built_in">export</span> LANGDB_API_KEY=<span class="hljs-string">"&lt;your_langdb_api_key&gt;"</span>
<span class="hljs-built_in">export</span> LANGDB_PROJECT_ID=<span class="hljs-string">"&lt;your_langdb_project_id&gt;"</span>
</code></pre>
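<p>A missing credential tends to surface as a confusing failure deep inside a run, so it can help to fail fast at startup. This small helper is our own defensive sketch, not part of <code>pylangdb</code>; call it before <code>init()</code>.</p>

```python
import os

def require_env(*names):
    """Raise a clear error at startup if any required variable is missing."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {n: os.environ[n] for n in names}

# Example (run before pylangdb.agno.init() so a missing key fails loudly):
# creds = require_env("LANGDB_API_KEY", "LANGDB_PROJECT_ID")
```
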
<h2 id="heading-code-walkthrough-building-the-team">Code Walkthrough: Building the Team</h2>
<p>Let's look at how the agents and the team are defined.</p>
<h3 id="heading-the-web-search-agent-decoupled-and-dynamic">The Web Search Agent: Decoupled and Dynamic</h3>
<p>Instead of hard-coding a search tool, we assign the <code>web_agent</code> a LangDB Virtual Model. This decouples the agent's logic from the specific tools it uses.</p>
<pre><code class="lang-python">web_agent = Agent(
    name=<span class="hljs-string">"Web Search Agent"</span>,
    role=<span class="hljs-string">"Search the web for the information"</span>,
    model=LangDB(id=<span class="hljs-string">"langdb/search_agent_xmf4v5jk"</span>),
    instructions=<span class="hljs-string">"Always include sources"</span>
)
</code></pre>
<p>This virtual model is configured in the LangDB UI to provide search capabilities, which we'll cover in the next section.</p>
<h3 id="heading-the-finance-agent-the-quantitative-analyst">The Finance Agent: The Quantitative Analyst</h3>
<p>This agent is equipped with <code>YFinanceTools</code> to access a wide range of financial data. It runs on a capable model like Grok 4 and carries specific instructions to format its output professionally.</p>
<pre><code class="lang-python">finance_agent = Agent(
    name=<span class="hljs-string">"Finance AI Agent"</span>,
    role=<span class="hljs-string">"Analyse the given stock"</span>,
    model=LangDB(id=<span class="hljs-string">"xai/grok-4"</span>),
    tools=[YFinanceTools(
        stock_price=<span class="hljs-literal">True</span>,
        stock_fundamentals=<span class="hljs-literal">True</span>,
        analyst_recommendations=<span class="hljs-literal">True</span>,
        company_info=<span class="hljs-literal">True</span>,
        company_news=<span class="hljs-literal">True</span>
    )],
    instructions=[
        <span class="hljs-string">"Use tables to display stock prices, fundamentals (P/E, Market Cap), and recommendations."</span>,
        <span class="hljs-string">"Clearly state the company name and ticker symbol."</span>,
        <span class="hljs-string">"Focus on delivering actionable financial insights."</span>
    ]
)
</code></pre>
<h3 id="heading-the-coordinating-team-the-orchestrator">The Coordinating Team: The Orchestrator</h3>
<p>The <a target="_blank" href="https://docs.agno.com/teams/introduction"><code>ReasoningFinanceTeam</code></a> orchestrates the two specialist agents. It operates in <code>coordinate</code> mode, allowing it to delegate tasks, synthesize information, and ensure the final output is a comprehensive report.</p>
<pre><code class="lang-python">reasoning_finance_team = Team(
    name=<span class="hljs-string">"Reasoning Finance Team"</span>,
    mode=<span class="hljs-string">"coordinate"</span>,
    model=LangDB(id=<span class="hljs-string">"xai/grok-4"</span>),
    members=[web_agent, finance_agent],
    tools=[ReasoningTools(add_instructions=<span class="hljs-literal">True</span>)],
    instructions=[
        <span class="hljs-string">"Collaborate to provide comprehensive financial and investment insights"</span>,
        <span class="hljs-string">"Consider both fundamental analysis and market sentiment"</span>,
        <span class="hljs-string">"Present findings in a structured, easy-to-follow format"</span>,
    ],
    success_criteria=<span class="hljs-string">"The team has provided a complete financial analysis with data, visualizations, risk assessment, and actionable investment recommendations."</span>
)
</code></pre>
<h2 id="heading-dynamic-tooling-with-virtual-models-and-virtual-mcpshttpsdocslangdbaiconceptsvirtual-models"><a target="_blank" href="https://docs.langdb.ai/concepts/virtual-models">Dynamic Tooling with Virtual Models and Virtual MCPs</a></h2>
<p>To empower the <code>web_agent</code> with live web search capabilities without hard-coding tools, we configure a Virtual Model in LangDB. This model is backed by a Virtual MCP Server that provides the actual search functionality.</p>
<ol>
<li><p><strong>Create a Virtual MCP Server</strong>: In the LangDB UI, create a new Virtual MCP Server named <code>web-search-mcp</code> that uses the Tavily Search MCP.</p>
</li>
<li><p><strong>Create and Configure the Virtual Model</strong>: Create a new virtual model (e.g., <code>search-agent</code>) and attach the <code>web-search-mcp</code> to it.</p>
</li>
<li><p><strong>Use the Virtual Model ID</strong>: Copy the ID of your new virtual model and use it in the <code>web_agent</code> definition.</p>
</li>
</ol>
<p>This setup allows you to change the tools and models your agents use on the fly from the LangDB UI, without changing a single line of code.</p>
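<p>You can push the same idea into the one remaining hard-coded value, the virtual model ID itself, by reading it from configuration. This is a hypothetical helper of our own (the environment-variable naming scheme is invented); the default is the ID used in this post.</p>

```python
import os

def model_id_for(agent_name, default):
    """Resolve an agent's model ID from the environment, falling back to a default."""
    env_key = agent_name.upper().replace(" ", "_") + "_MODEL_ID"
    return os.environ.get(env_key, default)

# Falls back to the virtual model ID from this post unless overridden, e.g. by
# exporting WEB_SEARCH_AGENT_MODEL_ID before starting the process.
web_model_id = model_id_for("web search agent", "langdb/search_agent_xmf4v5jk")
print(web_model_id)
```

<p>With this in place, both the tools (via the Virtual MCP) and the model routing can change without a redeploy.</p>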
<h2 id="heading-running-the-team-and-observing-the-results">Running the Team and Observing the Results</h2>
<p>To run the team, simply call the <code>print_response</code> method with a detailed prompt:</p>
<pre><code class="lang-python">reasoning_finance_team.print_response(
    <span class="hljs-string">"""Compare the tech sector giants (AAPL, GOOGL, MSFT) performance:\n
    1. Get financial data for all three companies\n
    2. Analyze recent news affecting the tech sector\n
    3. Calculate comparative metrics and correlations\n
    4. Recommend portfolio allocation weights"""</span>
)
</code></pre>
<p>Every execution is captured in LangDB, providing a complete trace of the team's operations. This includes the initial prompt, each agent's contributions, the tools they used, and the final synthesized output. You can explore the <a target="_blank" href="https://app.langdb.ai/sharing/threads/73c91c58-eab7-4c6b-afe1-5ab6324f1ada">full, shareable trace</a>.</p>
<p>Here is a snippet of the final report generated by the agent team:</p>
<pre><code class="lang-text">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                   Comprehensive Comparative Analysis of Tech Giants: AAPL, GOOGL, and MSFT                    ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

As the Reasoning Finance Team, we've conducted a thorough analysis of Apple Inc. (AAPL), Alphabet Inc. (GOOGL), and Microsoft Corporation (MSFT) based on the user's request. This includes financial data retrieval, recent news analysis, comparative metrics and correlations, and portfolio allocation recommendations. Our evaluation incorporates fundamental analysis (e.g., valuations, growth), market sentiment (e.g., news and analyst views), quantitative metrics (e.g., betas, correlations), and risk assessments. Data is current as of July 2025.

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────

                                             1. Financial Data Overview                                             

Key financial data for each company, sourced from reliable APIs.                                                  

                                                   Stock Prices                                                   

   Metric               AAPL      GOOGL     MSFT                                                                   
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                                                               
   Current Price        $210.16   $182.97   $505.62                                                                
   52-Week High         $260.10   $207.05   $508.30                                                                
   52-Week Low          $169.21   $140.53   $344.79                                                                
   50-Day Moving Avg    $203.87   $170.88   $472.41                                                                
   200-Day Moving Avg   $222.55   $173.43   $427.18
</code></pre>
<p>And here's a view of the full trace in the LangDB UI, showing how the agents collaborated to produce the report.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1752745257507/743526c0-6437-4602-b21d-6e7fde8cbe82.png" alt="LangDB trace view for the financial agent" class="image--center mx-auto" /></p>
<h2 id="heading-full-observability-with-langdb-tracing">Full Observability with LangDB Tracing</h2>
<p>The "full observability" promised in our subtitle is delivered through LangDB's detailed tracing capabilities. When you run your Agno team, every action is captured, providing a transparent, hierarchical view of the entire workflow. Here’s what you can see in the trace:</p>
<ul>
<li><p><strong>Hierarchical Span View</strong>: The trace isn't a flat list of events but a tree of "spans." The top-level span represents the entire team's execution, with child spans for each agent's turn, tool call, and model invocation. This shows the exact flow of control and delegation.</p>
</li>
<li><p><strong>Input/Output for Each Step</strong>: For every span, you can inspect the exact inputs and outputs. This means you can see the precise query sent to the <code>Web Search Agent</code>, the articles it returned, the data requested by the <code>Finance Agent</code>, and the final synthesized response from the team. This level of detail is crucial for debugging.</p>
</li>
<li><p><strong>Latency and Performance</strong>: Each span is timestamped and includes latency information, allowing you to instantly identify bottlenecks. You can see exactly how long each tool call, model response, or agent deliberation took.</p>
</li>
<li><p><strong>Cost and Token Usage</strong>: For every LLM call, the trace displays the number of input and output tokens and the associated cost, giving you full transparency into your operational expenses.</p>
</li>
<li><p><strong>Automatic Metadata</strong>: Thanks to <code>pylangdb.agno.init()</code>, traces are automatically enriched with metadata, including agent names, the team name (<code>Reasoning Finance Team</code>), and the models used, making it easy to filter and search for specific traces in the LangDB UI.</p>
</li>
</ul>
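<p>To make the span hierarchy concrete, here is a small sketch of how cost and token totals roll up from child spans to the root. The span records are hypothetical and simplified; LangDB's actual trace schema will differ.</p>

```python
# Hypothetical span records: each span points at its parent, mirroring the
# team -> agent -> tool tree described above.
spans = [
    {"id": "team", "parent": None, "cost": 0.004, "tokens": 1200},
    {"id": "web_agent", "parent": "team", "cost": 0.001, "tokens": 300},
    {"id": "finance_agent", "parent": "team", "cost": 0.006, "tokens": 2100},
    {"id": "yfinance_tool", "parent": "finance_agent", "cost": 0.0, "tokens": 0},
]

def subtree_total(spans, root_id, field):
    """Sum a numeric field over a span and all of its descendants."""
    children = [s for s in spans if s["parent"] == root_id]
    own = next(s[field] for s in spans if s["id"] == root_id)
    return own + sum(subtree_total(spans, c["id"], field) for c in children)

print(subtree_total(spans, "team", "tokens"))  # total tokens for the whole run
print(subtree_total(spans, "team", "cost"))    # total spend for the whole run
```

<p>This is exactly the aggregation the trace UI performs for you: the root span's totals are the sum of everything beneath it.</p>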
<p>This granular, end-to-end visibility is what makes building, debugging, and managing complex agentic workflows with LangDB and Agno so powerful.</p>
<p>You can check out the full conversation with tracing here: <a target="_blank" href="https://app.langdb.ai/sharing/threads/630b2ded-15ae-43d9-8a7a-d6dd9d649655">https://app.langdb.ai/sharing/threads/630b2ded-15ae-43d9-8a7a-d6dd9d649655</a></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>By combining Agno with the LangDB AI Gateway, we've built a financial analysis team that is:</p>
<ul>
<li><p><strong>Modular</strong>: Each agent has a specific, firewalled responsibility.</p>
</li>
<li><p><strong>Dynamic</strong>: We can change models and grant new tools on the fly from the LangDB UI without redeploying our agent.</p>
</li>
<li><p><strong>Observable</strong>: We get detailed traces of every interaction, making debugging and performance analysis straightforward.</p>
</li>
</ul>
<p>This architecture allows for rapid development and iteration, enabling you to build truly powerful and intelligent agentic systems for any domain.</p>
<p>Ready to build your own? <a target="_blank" href="https://app.langdb.ai/">Start building for free on LangDB</a> or Explore <a target="_blank" href="https://www.agno.com?utm_source=langdb&amp;utm_medium=partner-content&amp;utm_campaign=partner-technical&amp;utm_content=langdb">Agno</a> to orchestrate your agent workflows.</p>
]]></content:encoded></item><item><title><![CDATA[Grok 4: Fast Start Guide for Developers]]></title><description><![CDATA[Elon Musk’s xAI just dropped Grok 4, its most powerful large-language model yet. With perfect or near-perfect scores on top academic benchmarks, Grok 4 redefines what “general-purpose assistant” means for developers.
Grok 4 is xAI’s flagship LLM, opt...]]></description><link>https://blog.langdb.ai/grok-4-fast-start-guide-for-developers</link><guid isPermaLink="true">https://blog.langdb.ai/grok-4-fast-start-guide-for-developers</guid><category><![CDATA[Agno]]></category><category><![CDATA[grok]]></category><category><![CDATA[LangDB]]></category><category><![CDATA[tavily]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Sun, 13 Jul 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752585732059/af8ab824-b6d7-4872-be11-77ac7326cca4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Elon Musk’s xAI just dropped <strong>Grok 4</strong>, its most powerful large-language model yet. With perfect or near-perfect scores on top academic benchmarks, Grok 4 redefines what “general-purpose assistant” means for developers.</p>
<p>Grok 4 is xAI’s flagship LLM, optimized for deep reasoning, long-context understanding, and agentic workflows.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1752584797092/b88bdfc2-f19a-4747-a7eb-542038636b82.webp" alt class="image--center mx-auto" /></p>
<h2 id="heading-why-grok-4-matters">Why Grok 4 matters</h2>
<ul>
<li><p><strong>Graduate-level reasoning</strong> across STEM &amp; humanities</p>
</li>
<li><p><strong>Ultra-long context (256K tokens)</strong> — more than Anthropic Claude 4 Sonnet &amp; Opus (200K), o3 (200K), and DeepSeek R1 0528 (128K), but below Google Gemini 2.5 Pro (1M tokens) — ideal for large codebases and documents</p>
</li>
<li><p><strong>Multi-agent “Heavy” tier</strong> that coordinates five Grok instances for tough problems (roughly doubling accuracy on the hardest tests)</p>
</li>
</ul>
<p>The benchmark results speak for themselves, with Grok-4 Heavy achieving perfect and near-perfect scores, outperforming its best rivals in several categories.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Benchmark</td><td>Grok 4</td><td>Grok 4 Heavy</td><td>Best rival*</td></tr>
</thead>
<tbody>
<tr>
<td><strong>AIME 25</strong> (math)</td><td>91.7%</td><td><strong>100%</strong></td><td>88.9% (OpenAI o3)</td></tr>
<tr>
<td><strong>HMMT 25</strong> (math)</td><td>90.0%</td><td><strong>96.7%</strong></td><td>82.5% (Gemini 2.5)</td></tr>
<tr>
<td><strong>GPQA</strong> (grad QA)</td><td>87.5%</td><td>88.9%</td><td>86.4% (Gemini 2.5)</td></tr>
<tr>
<td><strong>Humanity’s Last Exam</strong> (HLE)</td><td>25.4%</td><td><strong>44.4%</strong></td><td>≈22% (GPT-4 / Gemini)</td></tr>
<tr>
<td><strong>ARC-AGI-2</strong> (reasoning)</td><td>16.2%</td><td>—</td><td>≈8% (Claude Opus 4)</td></tr>
</tbody>
</table>
</div><blockquote>
<p>Capability highlight: Grok 4 Heavy’s multi-agent architecture doubles down on complex problem solving at scale.</p>
</blockquote>
<p>Grok 4’s combination of a large context window, multi‑agent “Heavy” tier, and tool integration consistently places it at or near the top across a spectrum of advanced reasoning tasks.</p>
<h2 id="heading-building-with-grok-4-the-developers-stack">Building with Grok-4: The Developer's Stack</h2>
<p>A powerful model like Grok-4 is a fantastic tool, but building reliable, scalable, and observable AI applications requires a robust development stack. This is where frameworks like Agno and observability platforms like LangDB come into play.</p>
<p><strong>Agno</strong>: An open-source Python framework for building AI agents. It provides a clean, composable, and "Pythonic" way to structure your agent's logic, tools, and memory. Instead of wrestling with boilerplate code, you can declaratively define what your agent can do.</p>
<p><strong>LangDB</strong>: An AI gateway that acts as a unified control panel for over 350+ LLMs. With a single line of code, you can instrument your entire agent workflow for complete observability.</p>
<h2 id="heading-example-multi-agent-financial-reasoning-with-grok-4-amp-langdb">Example: Multi-Agent Financial Reasoning with Grok 4 &amp; LangDB</h2>
<p>Here's how you can build a real-world financial analysis team using Agno, with Grok 4 as your core reasoning model and LangDB for observability:</p>
<blockquote>
<p>The <code>Web Search Agent</code> below uses a LangDB Virtual Model with Tavily search built-in. No custom search integration or setup needed—just reference your Virtual Model. <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-models">Learn more about Virtual Models.</a></p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

<span class="hljs-keyword">from</span> pylangdb.agno <span class="hljs-keyword">import</span> init
init()

<span class="hljs-keyword">from</span> agno.agent <span class="hljs-keyword">import</span> Agent
<span class="hljs-keyword">from</span> agno.team.team <span class="hljs-keyword">import</span> Team
<span class="hljs-keyword">from</span> agno.tools.yfinance <span class="hljs-keyword">import</span> YFinanceTools
<span class="hljs-keyword">from</span> agno.models.langdb <span class="hljs-keyword">import</span> LangDB

load_dotenv()

<span class="hljs-comment"># Web Search Agent with Tavily via LangDB Virtual Model</span>
web_agent = Agent(
    name=<span class="hljs-string">"Web Search Agent"</span>,
    role=<span class="hljs-string">"Search the web for the information"</span>,
    model=LangDB(id=<span class="hljs-string">"langdb/search_agent_xmf4v5jk"</span>),
    instructions=<span class="hljs-string">"Always include sources"</span>
)

<span class="hljs-comment"># Finance Agent powered by Grok 4</span>
finance_agent = Agent(
    name=<span class="hljs-string">"Finance AI Agent"</span>,
    role=<span class="hljs-string">"Analyse the given stock"</span>,
    model=LangDB(id=<span class="hljs-string">"xai/grok-4"</span>),
    tools=[YFinanceTools(
        stock_price=<span class="hljs-literal">True</span>,
        stock_fundamentals=<span class="hljs-literal">True</span>,
        analyst_recommendations=<span class="hljs-literal">True</span>,
        company_info=<span class="hljs-literal">True</span>,
        company_news=<span class="hljs-literal">True</span>
    )],
    instructions=[
        <span class="hljs-string">"Use tables to display stock prices, fundamentals (P/E, Market Cap), and recommendations."</span>,
        <span class="hljs-string">"Clearly state the company name and ticker symbol."</span>,
        <span class="hljs-string">"Focus on delivering actionable financial insights."</span>
    ]
)

<span class="hljs-comment"># Multi-agent team for collaborative financial analysis</span>
reasoning_finance_team = Team(
    name=<span class="hljs-string">"Reasoning Finance Team"</span>,
    mode=<span class="hljs-string">"coordinate"</span>,
    model=LangDB(id=<span class="hljs-string">"xai/grok-4"</span>),
    members=[web_agent, finance_agent],
    instructions=[
        <span class="hljs-string">"Collaborate to provide comprehensive financial and investment insights"</span>,
        <span class="hljs-string">"Consider both fundamental analysis and market sentiment"</span>,
        <span class="hljs-string">"Use tables and charts to display data clearly and professionally"</span>,
        <span class="hljs-string">"Present findings in a structured, easy-to-follow format"</span>,
        <span class="hljs-string">"Only output the final consolidated analysis, not individual agent responses"</span>
    ],
    markdown=<span class="hljs-literal">True</span>,
    show_members_responses=<span class="hljs-literal">True</span>,
    success_criteria=<span class="hljs-string">"The team has provided a complete financial analysis with data, visualizations, risk assessment, and actionable investment recommendations supported by quantitative analysis and market research."</span>
)

reasoning_finance_team.print_response(
    <span class="hljs-string">"""Compare the tech sector giants (AAPL, GOOGL, MSFT) performance:\n    1. Get financial data for all three companies\n    2. Analyze recent news affecting the tech sector\n    3. Calculate comparative metrics and correlations\n    4. Recommend portfolio allocation weights"""</span>
)
</code></pre>
<h2 id="heading-observability-in-action-what-langdb-adds">Observability in Action: What LangDB Adds</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1752660116636/796e4bc8-8d1f-47d7-9dd1-cb55b608bfc2.png" alt class="image--center mx-auto" /></p>
<p>With LangDB, every part of your multi-agent workflow becomes transparent and easy to debug:</p>
<ul>
<li><p><strong>Visualize each step in your workflow:</strong> Instantly see how the prompt flows through every agent and tool. Whether it’s Tavily search, YFinance, or Grok 4 itself, you get a single unified trace.</p>
</li>
<li><p><strong>Pinpoint latency and costs:</strong> Track response time and token usage for every call at every layer. No more guesswork. Easily spot bottlenecks and unexpected cost spikes.</p>
</li>
<li><p><strong>Troubleshoot faster:</strong> Errors and slowdowns are highlighted with detailed step-by-step spans. You can optimize your pipeline without digging through logs.</p>
</li>
</ul>
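<p>For instance, pinpointing a bottleneck is just a matter of sorting spans by latency. The data below is hypothetical (the span names and field layout are illustrative, not LangDB's actual export format), but it shows the shape of the analysis the trace view gives you:</p>

```python
# Hypothetical latency figures for one run of the finance team.
trace_spans = [
    {"name": "tavily_search", "latency_ms": 840},
    {"name": "yfinance.stock_price", "latency_ms": 310},
    {"name": "xai/grok-4 completion", "latency_ms": 4120},
    {"name": "team synthesis", "latency_ms": 1960},
]

def slowest(spans, top_n=3):
    """Rank spans by latency so bottlenecks stand out at a glance."""
    return sorted(spans, key=lambda s: s["latency_ms"], reverse=True)[:top_n]

for span in slowest(trace_spans):
    print(span["name"], span["latency_ms"], "ms")
```

<p>In a typical agentic run the model calls dominate, which is why per-span latency (rather than end-to-end totals) is what tells you whether to swap models, parallelize agents, or cache tool results.</p>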
<p>Check out the full conversation: <a target="_blank" href="https://app.langdb.ai/sharing/threads/73c91c58-eab7-4c6b-afe1-5ab6324f1ada">https://app.langdb.ai/sharing/threads/73c91c58-eab7-4c6b-afe1-5ab6324f1ada</a></p>
<h2 id="heading-wrap-up">Wrap-up</h2>
<p>Grok 4 sets a new bar for reasoning, math, and large-context tasks. Paired with Agno for flexible agent design and LangDB as your AI gateway, developers can easily build, debug, and scale high-performance LLM-powered applications. Drop Grok 4 into your own agents or start from the template above, and benefit from full workflow visibility and model management from day one.</p>
<p>Happy building!</p>
<h2 id="heading-further-reading-amp-references">Further Reading &amp; References</h2>
<ul>
<li><p><strong>LangDB Documentation</strong>: Getting started guide and API reference: <a target="_blank" href="https://docs.langdb.ai/getting-started/quick-start">https://docs.langdb.ai/getting-started/quick-start</a></p>
</li>
<li><p><strong>LangDB Virtual Models</strong>: Concept guide: <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-models">https://docs.langdb.ai/concepts/virtual-models</a></p>
</li>
<li><p><strong>Agno Official Documentation</strong>: <a target="_blank" href="https://docs.agno.com">https://docs.agno.com</a></p>
</li>
<li><p><strong>Agno GitHub Repository</strong>: <a target="_blank" href="https://github.com/agno-agi/agno">https://github.com/agno-agi/agno</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Discover End-to-End Tracing on Google ADK with LangDB]]></title><description><![CDATA[Before diving into the code, watch this 2-minute video to see a complete demonstration of what we'll be building. You'll learn how to integrate LangDB tracing into the Google ADK Travel Concierge sample with no code changes.
https://youtu.be/5tyDi5xzO...]]></description><link>https://blog.langdb.ai/discover-end-to-end-tracing-on-google-adk-with-langdb</link><guid isPermaLink="true">https://blog.langdb.ai/discover-end-to-end-tracing-on-google-adk-with-langdb</guid><category><![CDATA[google adk]]></category><category><![CDATA[agentic AI]]></category><category><![CDATA[gemini]]></category><category><![CDATA[LangDB]]></category><category><![CDATA[tracing]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Thu, 03 Jul 2025 06:43:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1751522824240/ce710310-012f-40e8-967d-ac3d1c11cc77.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before diving into the code, watch this 2-minute video to see a complete demonstration of what we'll be building. You'll learn how to integrate LangDB tracing into the Google ADK Travel Concierge sample with no code changes.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/5tyDi5xzOUE">https://youtu.be/5tyDi5xzOUE</a></div>
<p> </p>
<p>In this quick demo you’ll see:</p>
<ul>
<li><p>How to install and initialize the <code>pylangdb[adk]</code> package.</p>
</li>
<li><p>The single line of code that enables full observability for every ADK agent and tool.</p>
</li>
<li><p>Running a sample prompt like “Find me flights from JFK to London”.</p>
</li>
<li><p>Inspecting your workflow in the LangDB AI Gateway dashboard, including:</p>
<ul>
<li><p>Threads view for step-by-step conversation logs.</p>
</li>
<li><p>Traces view for Gantt charts, cost &amp; token breakdowns, and dependency graphs.</p>
</li>
</ul>
</li>
<li><p>Drilling into any agent or tool (like the <code>planning_agent</code> on Claude 3 Sonnet) for full observability.</p>
</li>
</ul>
<p>In this tutorial, we'll walk through the architecture of a sophisticated Travel Concierge agent built with Google's Agent Development Kit (ADK). We'll explore how to leverage the LangDB AI Gateway to use any LLM—from OpenAI, Google, Anthropic, and more—and harness powerful features like Virtual Models and Virtual MCPs (Model Context Protocol) to create a dynamic, observable, and easily maintainable agent system.</p>
<p>Our <code>travel_concierge</code> agent is not just a single agent; it's a hierarchy of specialized sub-agents that handle everything from vacation inspiration to booking and in-trip assistance. Here's a look at the overall architecture:</p>
<p><img src="https://github.com/google/adk-samples/raw/main/python/agents/travel-concierge/travel-concierge-arch.png" alt="Travel Concierge's Multi-Agent Architecture" /></p>
<p>This project is based on the official <a target="_blank" href="https://github.com/google/adk-samples/tree/main/python/agents/travel-concierge">Google ADK Travel Concierge sample</a> and has been modified to showcase the integration with the LangDB AI Gateway.</p>
<blockquote>
<p>You can find the complete source code for this agent on GitHub: <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/google-adk/travel-concierge">LangDB Samples</a></p>
</blockquote>
<h2 id="heading-the-magic-behind-the-curtain-pylangdbadkinit">The Magic Behind the Curtain: <code>pylangdb.adk.init()</code></h2>
<p>First, let's talk about the most important line of code in this integration:</p>
<pre><code class="lang-python"><span class="hljs-comment"># travel_concierge/agent.py</span>
<span class="hljs-keyword">from</span> pylangdb.adk <span class="hljs-keyword">import</span> init
<span class="hljs-comment"># Initialize LangDB *before* importing any ADK modules.</span>
init()
</code></pre>
<p>This single function call is the key to unlocking the LangDB AI Gateway's observability features. By placing it at the very top of our script, before any <code>google.adk</code> modules are imported, we enable automatic instrumentation for the entire agent framework.</p>
<p>Here’s what <code>init()</code> does automatically:</p>
<ul>
<li><p><strong>Discovers Agents</strong>: It recursively finds all agent and sub-agent definitions within your project.</p>
</li>
<li><p><strong>Patches Runtimes</strong>: It automatically patches the necessary ADK components to emit traces.</p>
</li>
<li><p><strong>Links Sessions</strong>: It intelligently links all the interactions—from the root agent's initial processing to the deepest sub-agent and tool calls—into a single, cohesive trace in the LangDB tracing view.</p>
</li>
</ul>
<p>This "zero-instrumentation" approach means you get complete, end-to-end visibility into your agent's complex workflows just by adding that one line of code.</p>
<h2 id="heading-the-architecture-root-agent-and-sub-agents">The Architecture: Root Agent and Sub-Agents</h2>
<p>Our <code>travel_concierge</code> is a hierarchical agent. At the top is the <code>root_agent</code>, which acts as a smart router or orchestrator. Its job is not to answer queries directly, but to delegate them to a specialized sub-agent.</p>
<p>Here's its actual definition:</p>
<pre><code class="lang-python"><span class="hljs-comment"># travel_concierge/agent.py</span>
root_agent = Agent(
    model=<span class="hljs-string">"openai/gpt-4.1"</span>,
    name=<span class="hljs-string">"root_agent"</span>,
    description=<span class="hljs-string">"A Travel Concierge using the services of multiple sub-agents"</span>,
    instruction=prompt.ROOT_AGENT_INSTR,
    sub_agents=[
        inspiration_agent,
        planning_agent,
        <span class="hljs-comment"># ... and other sub-agents</span>
    ],
    <span class="hljs-comment"># ...</span>
)
</code></pre>
<p>As you can see, it uses a standard model (<code>"openai/gpt-4.1"</code>) and has a list of <code>sub_agents</code>. It doesn't have any tools of its own. The real power comes from the sub-agents.</p>
<h2 id="heading-dynamic-tooling-with-virtual-models-and-virtual-mcps">Dynamic Tooling with Virtual Models and Virtual MCPs</h2>
<p>A LangDB Virtual Model is a powerful abstraction that decouples your agent's code from its runtime configuration. It acts as a pointer to a configuration that you can manage entirely from the LangDB UI.</p>
<p>This is where the <strong>Model Context Protocol (MCP)</strong> comes in. MCP is a standard that allows language models to interact with external tools and services in a uniform way. However, managing connections to multiple MCP-enabled tools can be complex.</p>
<p>The LangDB AI Gateway simplifies this with <a target="_blank" href="https://blog.langdb.ai/what-are-virtual-mcp-servers"><strong>Virtual MCP Servers</strong></a>. A Virtual MCP is a single, managed endpoint that you configure in the UI. It can bundle multiple tools (like Google Maps, Tavily Search, or your own custom APIs), handle their authentication securely, and lock them to specific versions.</p>
<p>You then connect this Virtual MCP to your agent's Virtual Model. This is how you can dynamically grant new capabilities to your agents without changing a single line of code.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751521868673/8fb8fc6f-8bd7-4745-bda4-5bf690f50e36.png" alt class="image--center mx-auto" /></p>
<p>Here are all the virtual models for our project, as seen in the LangDB AI Gateway dashboard. You can see the <code>inspiration_agent</code>, <code>google_search_agent</code>, and <code>planning_agent</code> all configured here, ready to be assigned to our agents.</p>
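<p>Because the gateway exposes OpenAI-compatible APIs, a virtual model is addressed like any other model name. A sketch of what such a request looks like; the base URL and header names are assumptions based on typical OpenAI-compatible gateways, and the payload is only constructed here, not sent:</p>

```python
import json

# Hypothetical values: substitute your own gateway URL, API key, and the
# virtual model name shown in your LangDB dashboard.
BASE_URL = "https://api.us-east-1.langdb.ai/v1"  # assumed endpoint shape
API_KEY = "YOUR_LANGDB_API_KEY"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat-completion request for a virtual model."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            # The virtual model name stands in for a concrete provider model.
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

req = build_chat_request("langdb/inspiration_agent_z73m3wmd",
                         "Suggest museums to visit in Paris")
print(req["url"])
```

<p>Swapping the underlying provider model or attached tools then happens in the LangDB UI; the request shape above never changes.</p>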
<h3 id="heading-example-the-inspirationagent-and-google-maps">Example: The <code>inspiration_agent</code> and Google Maps</h3>
<p>Let's look at our <code>inspiration_agent</code>. It needs access to location data to give travel ideas. Instead of hardcoding a <a target="_blank" href="https://app.langdb.ai/mcp-servers/google-maps">Google Maps MCP</a>, we use a Virtual Model.</p>
<p>Here's the agent's definition:</p>
<pre><code class="lang-python"><span class="hljs-comment"># travel_concierge/sub_agents/inspiration/agent.py</span>
inspiration_agent = Agent(
    model=<span class="hljs-string">"langdb/inspiration_agent_z73m3wmd"</span>,
    name=<span class="hljs-string">"inspiration_agent"</span>,
    description=<span class="hljs-string">"A travel inspiration agent..."</span>,
    <span class="hljs-comment"># ...</span>
)
</code></pre>
<p>Notice its model is <code>langdb/inspiration_agent_z73m3wmd</code>. In the LangDB AI Gateway UI, we've configured this virtual model to use a <strong>Virtual MCP server</strong> that has the Google Maps API attached as a tool. Now, when the <code>inspiration_agent</code> is active, it can seamlessly query Google Maps, even though the tool isn't explicitly listed in its code.</p>
<h3 id="heading-example-grounding-with-google-search">Example: Grounding with Google Search</h3>
<p>We also have a specialized agent tool for web searches, <code>google_search_grounding</code>.</p>
<pre><code class="lang-python"><span class="hljs-comment"># travel_concierge/tools/search.py</span>
_search_agent = Agent(
    model=<span class="hljs-string">"langdb/google_search_agent_hsz7lf9q"</span>,
    name=<span class="hljs-string">"google_search_grounding"</span>,
    description=<span class="hljs-string">"An agent providing Google-search grounding capability"</span>,
    <span class="hljs-comment"># ... instruction ...</span>
)

google_search_grounding = AgentTool(agent=_search_agent)
</code></pre>
<p>Just like our <code>inspiration_agent</code>, the <code>_search_agent</code> uses a virtual model, <code>langdb/google_search_agent_hsz7lf9q</code>. We've attached a <strong>Virtual MCP server</strong> that provides the Tavily Search tool to this model in LangDB.</p>
<h3 id="heading-example-the-planningagent-for-flights-and-hotels">Example: The <code>planning_agent</code> for Flights and Hotels</h3>
<p>Finally, let's look at the <code>planning_agent</code>, which handles the core booking tasks.</p>
<pre><code class="lang-python"><span class="hljs-comment"># travel_concierge/sub_agents/planning/agent.py</span>
planning_agent = Agent(
    model=<span class="hljs-string">"langdb/planning_agent_w1l8sygt"</span>,
    name=<span class="hljs-string">"planning_agent"</span>,
    description=<span class="hljs-string">"Helps users with travel planning..."</span>,
    <span class="hljs-comment"># ...</span>
)
</code></pre>
<p>This agent's virtual model, <code>langdb/planning_agent_w1l8sygt</code>, is connected to a Virtual MCP that provides an Airbnb search tool. This allows the agent to handle complex booking-related queries by leveraging this external service, all without having the tool logic hardcoded in the agent's definition.</p>
<h2 id="heading-the-flow-from-query-to-answer">The Flow: From Query to Answer</h2>
<ol>
<li><p>A user asks the <code>travel_concierge</code>: "What are some good museums to visit in Paris?"</p>
</li>
<li><p>The <code>root_agent</code> receives the query and, based on its instructions, delegates the task to the <code>inspiration_agent</code>.</p>
</li>
<li><p>The <code>inspiration_agent</code> is activated. Its virtual model configuration is loaded from the LangDB AI Gateway.</p>
</li>
<li><p>The agent now knows it has access to the Google Maps tool (via its Virtual MCP).</p>
</li>
<li><p>It uses the tool to find museums in Paris and provides a list to the user.</p>
</li>
<li><p>All of these steps—the delegation, the model calls, the tool usage—are automatically captured as traces in the LangDB AI Gateway, giving us complete observability into our agent's behavior.</p>
</li>
</ol>
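<p>The delegation step above can be sketched as a simple router: the root agent matches a query against sub-agent descriptions and hands the work off. This is a toy illustration of the pattern, not ADK's actual routing logic, and the keyword matching stands in for the LLM's instruction-driven delegation:</p>

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SubAgent:
    name: str
    keywords: List[str]           # crude stand-in for the LLM's routing decision
    handle: Callable[[str], str]  # the sub-agent's own model + tools

def make_root_agent(sub_agents: List[SubAgent]) -> Callable[[str], str]:
    """Return a router that delegates to the first matching sub-agent."""
    def route(query: str) -> str:
        q = query.lower()
        for agent in sub_agents:
            if any(keyword in q for keyword in agent.keywords):
                return agent.handle(query)
        return "Sorry, no sub-agent can handle that."
    return route

inspiration = SubAgent("inspiration_agent", ["museum", "visit", "ideas"],
                       lambda q: "inspiration_agent: here are some museums...")
planning = SubAgent("planning_agent", ["flight", "hotel", "book"],
                    lambda q: "planning_agent: searching options...")

root = make_root_agent([inspiration, planning])
print(root("What are some good museums to visit in Paris?"))
```

<p>In the real system, each <code>handle</code> is itself a model call through the gateway, which is why every hop shows up as a span in the trace.</p>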
<p>You can explore a complete, shareable trace of a conversation with this agent here: <a target="_blank" href="https://app.langdb.ai/sharing/threads/8425e068-77de-4f41-8aa9-d1111fc7d2b7">https://app.langdb.ai/sharing/threads/8425e068-77de-4f41-8aa9-d1111fc7d2b7</a></p>
<p>When you open the trace, you'll see a detailed breakdown of the entire workflow. This includes:</p>
<ul>
<li><p><strong>A Gantt chart</strong> visualizing the sequence and duration of each agent and tool invocation.</p>
</li>
<li><p><strong>Cost and token counts</strong> for every LLM call, helping you monitor usage and optimize performance.</p>
</li>
<li><p><strong>Detailed input/output payloads</strong> for each step, allowing you to inspect the exact data being passed between components.</p>
</li>
<li><p><strong>A dependency graph</strong> showing how agents and tools are interconnected, making it easy to debug complex interactions.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751524184116/8d140ee7-d2d2-482e-a008-a01ff928ba2a.png" alt="Screenshot of a LangDB trace showing the root_agent delegating to inspiration_agent and the tool call to Google Maps" class="image--center mx-auto" /></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>By combining Google ADK with the LangDB AI Gateway's virtual models and MCPs, we've built a <code>travel_concierge</code> agent that is:</p>
<ul>
<li><p><strong>Modular</strong>: Each sub-agent has a specific responsibility.</p>
</li>
<li><p><strong>Dynamic</strong>: We can change models and grant new tools on the fly from the LangDB UI without redeploying our agent.</p>
</li>
<li><p><strong>Observable</strong>: We get detailed traces of every interaction, making debugging and performance analysis easy.</p>
</li>
</ul>
<p>This architecture allows for rapid development and iteration, enabling us to build truly powerful and intelligent agentic systems.</p>
<hr />
<p><em>Ready to build your own?</em> <a target="_blank" href="https://docs.langdb.ai/"><em>Check out the LangDB AI Gateway documentation to get started</em></a></p>
]]></content:encoded></item><item><title><![CDATA[Empower Your AI Agents with Virtual MCP Servers]]></title><description><![CDATA[Integrating AI agents with external tools such as GitHub, Slack, or ClickHouse often requires custom integration code, careful credential management, and handling differences between multiple APIs. The Model Context Protocol (MCP), initiated by Anthr...]]></description><link>https://blog.langdb.ai/what-are-virtual-mcp-servers</link><guid isPermaLink="true">https://blog.langdb.ai/what-are-virtual-mcp-servers</guid><category><![CDATA[mcp]]></category><category><![CDATA[#anthropic]]></category><category><![CDATA[agents]]></category><category><![CDATA[tools]]></category><category><![CDATA[mcp server]]></category><category><![CDATA[windsurf]]></category><category><![CDATA[claude.ai]]></category><category><![CDATA[cursor]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Fri, 02 May 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1746380842071/b36cd792-e45d-45fa-8e17-0acc5f06daba.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Integrating AI agents with external tools such as GitHub, Slack, or ClickHouse often requires custom integration code, careful credential management, and handling differences between multiple APIs. The <strong>Model Context Protocol (MCP)</strong>, initiated by Anthropic, addresses this by providing a standard protocol that lets language models interact with external services uniformly.</p>
<p>However, using MCP across multiple services introduces additional complexity; each tool has its own endpoint, potentially different versions, and varying credential requirements. Ensuring consistent behavior across these tools can become challenging.</p>
<p><strong>Virtual MCP Servers</strong> from LangDB streamline this by allowing you to consolidate multiple MCP-based tools into a single interface. You choose only the tools you need, configure their authentication, specify exact versions to avoid unexpected changes, and gain visibility into all interactions through comprehensive logs.</p>
<p><img src="https://docs.langdb.ai/~gitbook/image?url=https%3A%2F%2F2130779299-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252F0peUeEHQMLSxYq8BpQGt%252Fuploads%252FHu9iztwWZNn98u7BtCD1%252Fscreenrun-04-27-2025-20-20-30%2520%281%29.gif%3Falt%3Dmedia%26token%3Dac3c3da5-1cb8-42b5-9ae5-06bcc1e750eb&amp;width=768&amp;dpr=1&amp;quality=100&amp;sign=6e11f01d&amp;sv=2" alt class="image--center mx-auto" /></p>
<h2 id="heading-what-is-an-mcp-server">What is an MCP Server?</h2>
<p>The <strong>Model Context Protocol (MCP)</strong> defines a clear client-server architecture to standardize interactions between language models and external services:</p>
<ul>
<li><p><strong>Host</strong>: LLM applications like Cursor or Claude.</p>
</li>
<li><p><strong>Client</strong>: Built-in connectors within the host application communicating with MCP servers.</p>
</li>
<li><p><strong>Server</strong>: Service providers that expose tools, resources, or actions.</p>
</li>
</ul>
<h1 id="heading-what-are-virtual-mcp-servers">What are Virtual MCP Servers</h1>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746383097199/84319e46-4bc5-431f-8ddc-7909d0b13de7.png" alt="Creating a Virtual MCP server on LangDB AI Gateway" class="image--center mx-auto" /></p>
<p><strong>Virtual MCP Servers</strong> from LangDB enhance MCP usage by providing a customizable, unified access point to external tools tailored specifically for your workflow. They simplify tool management by:</p>
<ul>
<li><p>Merging multiple tools (e.g., Slack, GitHub, Airtable) into one endpoint.</p>
</li>
<li><p>Ensuring compatibility with various platforms like Windsurf, Cursor, Claude, Python, and TypeScript.</p>
</li>
</ul>
<p>For instance, if your agent needs to create GitHub issues, send Slack messages, and retrieve Airtable records, traditionally, you would set up and maintain three separate MCP connections—each with its credentials, endpoint, and potential for incompatibility. With a Virtual MCP Server, these tools integrate into a single endpoint with centralized configuration, streamlined authentication, and stable versioning.</p>
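<p>Conceptually, a Virtual MCP is a curated bundle: selected tools, per-tool credentials, and pinned versions behind one endpoint. A rough sketch of that configuration as data; the field names and values are illustrative, not LangDB's actual schema:</p>

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ToolBinding:
    server: str    # upstream MCP server the tool comes from
    tool: str      # the specific tool exposed from it
    version: str   # pinned, so upstream changes can't silently break the agent
    api_key: Optional[str] = None  # credential scoped to this tool only

# One endpoint bundling three tools from three different upstream MCPs.
virtual_mcp = {
    "endpoint": "https://api.langdb.ai/mcp/your-virtual-server",  # illustrative
    "tools": [
        ToolBinding("github", "create_issue", "1.4.0", api_key="YOUR_GITHUB_KEY"),
        ToolBinding("slack", "send_message", "2.0.1", api_key="YOUR_SLACK_KEY"),
        ToolBinding("airtable", "list_records", "0.9.2"),  # stateless, no key
    ],
}

exposed = [f"{t.server}.{t.tool}" for t in virtual_mcp["tools"]]
print(exposed)
```

<p>The agent sees only the three <code>server.tool</code> names; credentials and version pins stay on the gateway side.</p>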
<h2 id="heading-why-you-need-virtual-mcp-servers">Why You Need Virtual MCP Servers</h2>
<p>While MCPs standardize tool access, LangDB's <strong>Virtual MCP Servers</strong> offer a higher-level abstraction to help developers manage these tools in production.</p>
<h3 id="heading-selective-tool-exposure">Selective Tool Exposure</h3>
<p>With Virtual MCPs, you don’t expose an entire tool suite. You pick the specific tools you need for your use case:</p>
<ul>
<li><p><code>airtable.list_records</code></p>
</li>
<li><p><code>github.create_issue</code></p>
</li>
<li><p><code>slack.send_message</code></p>
</li>
</ul>
<p>This keeps your agent’s surface area small and secure, and avoids unnecessary access.</p>
<p>Additionally, some environments like <strong>Windsurf</strong> enforce hard limits on the number of tools an agent can use (e.g., 50 tools). Virtual MCPs let you <strong>curate and trim</strong> your toolset down to just what's needed — ensuring compatibility and efficiency without sacrificing capabilities.</p>
<h3 id="heading-clean-auth-handling">Clean Auth Handling</h3>
<p>LangDB allows you to:</p>
<ul>
<li><p>Add API keys <em>only</em> for selected tools</p>
</li>
<li><p>Use built-in OAuth where supported</p>
</li>
<li><p>Leave other tools stateless</p>
</li>
</ul>
<p>You avoid global tokens and minimize the risk of credential leaks.</p>
<h3 id="heading-version-lock-in-poisoning-protection">Version Lock-in + Poisoning Protection</h3>
<p>Breakages happen when upstream tools change. Virtual MCPs prevent this:</p>
<ul>
<li><p><strong>Pinned versions</strong> ensure consistent behavior</p>
</li>
<li><p><strong>Frozen tool definitions</strong> guard against future mutations</p>
</li>
<li><p><strong>Poisoning safety</strong> prevents LLMs from ingesting unsafe or malicious schema updates</p>
</li>
</ul>
<p>You control exactly what the model sees and uses.</p>
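<p>Freezing a tool definition can be thought of as recording a fingerprint of its schema at deploy time and rejecting anything that drifts. A simplified sketch of that idea, not the gateway's actual mechanism:</p>

```python
import hashlib
import json

def fingerprint(tool_schema: dict) -> str:
    """Stable hash of a tool's JSON schema (sorted keys for determinism)."""
    canonical = json.dumps(tool_schema, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Schema captured when the Virtual MCP was created (hypothetical example).
pinned = {"name": "create_issue", "params": {"title": "string", "body": "string"}}
pinned_hash = fingerprint(pinned)

def validate_upstream(upstream_schema: dict) -> bool:
    """Reject upstream definitions that no longer match the pinned version."""
    return fingerprint(upstream_schema) == pinned_hash

# A mutated upstream definition (e.g. an injected parameter) is blocked.
tampered = {"name": "create_issue",
            "params": {"title": "string", "body": "string", "exfiltrate": "string"}}
print("unchanged accepted:", validate_upstream(pinned))
print("tampered accepted:", validate_upstream(tampered))
```

<p>The same check covers both accidental upstream breakage and deliberate schema poisoning: anything the model would see must match what was reviewed at pin time.</p>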
<h2 id="heading-why-use-a-virtual-mcp">Why Use a Virtual MCP?</h2>
<blockquote>
<p>Full Tracing: Every input/output, latency, and error is logged in your LangDB dashboard.</p>
<p>Poisoning Safety: Prevents injection or override by malicious or unverified tool definitions from source MCPs.</p>
</blockquote>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Feature</td><td>What It Solves</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Selective Tools</strong></td><td>Use only what’s needed from broader MCPs</td></tr>
<tr>
<td><strong>Scoped Auth</strong></td><td>Modular and safe credential management</td></tr>
<tr>
<td><strong>Full Tracing</strong></td><td>Inputs, outputs, latencies, and failures are logged</td></tr>
<tr>
<td><strong>Plug-and-Play</strong></td><td>Compatible with Cursor, Claude, Windsurf, and more</td></tr>
<tr>
<td><strong>Version Control</strong></td><td>Prevents breaking changes</td></tr>
<tr>
<td><strong>Poisoning Safety</strong></td><td>Blocks tampered or unsafe tool definitions</td></tr>
</tbody>
</table>
</div><h2 id="heading-how-to-set-up-a-virtual-mcp-server">How to Set Up a Virtual MCP Server</h2>
<p>Creating your own Virtual MCP is simple:</p>
<ol>
<li><p>Open your <a target="_blank" href="https://app.langdb.ai/mcp-servers">LangDB MCP Servers</a> Listing</p>
</li>
<li><p>Choose the Server you want to start with</p>
</li>
<li><p>Click “<strong>Deploy MCP Server</strong>” and select the <strong>Project</strong>.</p>
</li>
<li><p>Select the tools you need, or add more MCP Servers from the list</p>
</li>
<li><p>(Optional) Add API keys or let LangDB manage auth</p>
</li>
<li><p>Click on “<strong>Create Virtual MCP Server</strong>”</p>
</li>
<li><p>Click <strong>Generate MCP URL</strong></p>
</li>
</ol>
<p>Use the URL directly in Cursor, Claude, Windsurf, or any other MCP-compatible client.</p>
<p>Here is a quick two-minute video on how to use Virtual MCP Servers on LangDB:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/P7aIpi9kS28">https://youtu.be/P7aIpi9kS28</a></div>
<p> </p>
<p>For usage check out: <a target="_blank" href="https://docs.langdb.ai/features/mcp-support#using-virtual-mcps">Virtual MCP Usage Docs</a></p>
<h2 id="heading-who-is-this-for">Who Is This For?</h2>
<p>Virtual MCP Servers are ideal for:</p>
<ul>
<li><p>Cursor / Claude / Windsurf users tired of tool setup</p>
</li>
<li><p>Developers managing complex toolchains</p>
</li>
<li><p>Teams needing secure, scoped, auditable MCP workflows</p>
</li>
<li><p>Builders of AI agents using Slack, GitHub, Airtable, and more</p>
</li>
</ul>
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>Virtual MCP Servers are designed to solve three core challenges faced in AI tool orchestration:</p>
<ul>
<li><p><strong>Scoped Authentication</strong>: Add API keys only for the tools you use. Everything else stays stateless and secure.</p>
</li>
<li><p><strong>Tool Safety &amp; Poisoning Protection</strong>: Lock in tool versions and guard against malicious or unverified definitions.</p>
</li>
<li><p><strong>Selective Exposure</strong>: Curate your toolset by combining just what your workflow needs across multiple MCPs.</p>
</li>
</ul>
<p>This is how you build tool-aware agents that are safe, reliable, and production-ready.</p>
<h2 id="heading-explore-more">Explore More</h2>
<ul>
<li><p>📄 <a target="_blank" href="https://app.langdb.ai/mcp-servers">LangDB MCP Servers</a></p>
</li>
<li><p>📄 <a target="_blank" href="https://docs.langdb.ai/concepts/virtual-mcp-servers">LangDB Docs: Virtual MCPs</a></p>
</li>
<li><p>📖 <a target="_blank" href="https://blog.langdb.ai/smarter-coding-workflows-with-context7-sequential-thinking">Previous Blog: Smarter Coding Workflows with Context7 + Sequential Thinking</a></p>
</li>
<li><p>📄 <a target="_blank" href="https://docs.anthropic.com/en/docs/agents-and-tools/mcp">Anthropic's MCP Specification</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Smarter Coding Workflows with Context7 + Sequential Thinking]]></title><description><![CDATA[Imagine you're building a Next.js 14 app. You want to implement a "New Todo" form using modern features like Server Actions, useFormState, and form validation — but you're not sure how to wire everything together cleanly.
https://youtu.be/90XzHuJETEk...]]></description><link>https://blog.langdb.ai/smarter-coding-workflows-with-context7-sequential-thinking</link><guid isPermaLink="true">https://blog.langdb.ai/smarter-coding-workflows-with-context7-sequential-thinking</guid><category><![CDATA[mcp server]]></category><category><![CDATA[mcp]]></category><category><![CDATA[cursor]]></category><category><![CDATA[windsurf]]></category><category><![CDATA[#anthropic]]></category><category><![CDATA[coding]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Wed, 30 Apr 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1746118121636/6f8ced46-18e3-456c-8989-05a9414da455.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine you're building a <strong>Next.js 14</strong> app. You want to implement a "New Todo" form using modern features like <strong>Server Actions</strong>, <code>useFormState</code>, and form validation — but you're not sure how to wire everything together cleanly.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/90XzHuJETEk">https://youtu.be/90XzHuJETEk</a></div>
<p> </p>
<p>Normally, you'd have to:</p>
<ul>
<li><p>Search documentation manually.</p>
</li>
<li><p>Context-switch between browser and editor.</p>
</li>
<li><p>Piece together examples from random blog posts.</p>
</li>
</ul>
<p>Instead, using <strong>Context7</strong> and <strong>Sequential Thinking</strong> inside your LLM-native code editor (like <strong>Cursor</strong> or <strong>Windsurf</strong>), you simply ask:</p>
<pre><code class="lang-plaintext">I’m adding a “New Todo” form in this Next.js 14 app.  
Please implement:

1. A `createTodo` server action (with Zod validation) that returns `{ id, title, completed: false }`.
2. A `toggleTodo` server action to flip the `completed` boolean.
3. In `NewTodoForm.tsx`, wire up `useFormState(createTodo)`:
   • Render an `&lt;input name="title"&gt;` and “Add Todo” button.
   • Display the todo list with a checkbox next to each.
   • Strikethrough the todo title when `completed === true`, updating when clicked.
   • On checkbox change, call `toggleTodo` with optimistic UI updates.
   • Show inline validation errors from Zod.

Break the solution into clear steps — sequentialThinking — and pull fresh docs on Next.js Server Actions, `use server`, and `useFormState` — context7.
</code></pre>
<p><strong>What happens behind the scenes:</strong></p>
<ul>
<li><p><strong>Context7</strong> injects the latest, version-specific documentation for Next.js Server Actions and React form utilities directly into your prompt.</p>
</li>
<li><p><strong>Sequential Thinking</strong> structures the AI's response into clear steps:</p>
<ul>
<li><p>What each server action should do</p>
</li>
<li><p>How to validate using Zod</p>
</li>
<li><p>How to wire up the form with <code>useFormState</code></p>
</li>
<li><p>Best practices for optimistic UI and error handling</p>
</li>
<li><p>A complete implementation with ready-to-use code</p>
</li>
</ul>
</li>
</ul>
<h2 id="heading-how-to-set-it-up">How to Set It Up</h2>
<ol>
<li><p>Go to your LangDB Project.</p>
</li>
<li><p>Create a <strong>Virtual MCP Server</strong>.</p>
</li>
<li><p>Add these two MCPs to it:</p>
<ul>
<li><p><a target="_blank" href="https://app.langdb.ai/mcp-servers/context7">Context7 MCP</a> — injects live documentation.</p>
</li>
<li><p><a target="_blank" href="https://app.langdb.ai/mcp-servers/sequential-thinking">Sequential Thinking MCP</a> — enables structured step-by-step reasoning.</p>
</li>
</ul>
</li>
<li><p>Choose the MCP client of your choice and generate secure MCP URL.</p>
</li>
<li><p>After running the command, start using the MCP Server in your LLM-native editor by mentioning <code>use context7 and sequentialThinking</code> in your prompt.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745951867592/16e139ac-00d2-477b-923c-460371e23006.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-tracing-the-workflow">Tracing the Workflow</h2>
<p>Every tool call — from fetching documentation to reasoning through logic — is fully traceable in LangDB:</p>
<ul>
<li><p>See inputs and outputs.</p>
</li>
<li><p>View each MCP server call (Context7, Sequential Thinking) as a distinct trace event.</p>
</li>
<li><p>Debug, inspect, and optimize tool chains just like you would trace API pipelines.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745951922321/f32af9c7-b42b-471b-aa63-6b727dc24a72.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-why-this-workflow-matters">Why This Workflow Matters</h2>
<ul>
<li><p><strong>Prevents hallucinations</strong>: By pulling live, versioned documentation into your coding context.</p>
</li>
<li><p><strong>Builds cleaner logic</strong>: Step-by-step structured reasoning makes complex implementations manageable.</p>
</li>
<li><p><strong>Keeps you focused</strong>: No need to context-switch for documentation or architecture planning.</p>
</li>
</ul>
<hr />
<h2 id="heading-why-langdb-mcps-are-needed">Why LangDB + MCPs Are Needed</h2>
<p>Today's LLMs are powerful, but they often hallucinate, miss subtle API changes, or lose track of reasoning across steps. Developers need a system that can:</p>
<ul>
<li><p>Inject fresh knowledge dynamically into prompts.</p>
</li>
<li><p>Guide structured thinking, not just code generation.</p>
</li>
<li><p>Track and debug every tool invocation like a real API pipeline.</p>
</li>
</ul>
<p>LangDB's Model Context Protocol (MCP) architecture and full tracing support provide exactly this foundation. With Virtual MCPs, you can stitch together best-in-class tools like Context7 and Sequential Thinking.</p>
<h2 id="heading-try-it-out-yourself">Try It Out Yourself</h2>
<p>Want to see this in action?</p>
<ul>
<li><p>👉 <strong>Demo Repo</strong>: <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/mcp/nextjs-server-actions-demo">nextjs-server-actions-demo</a></p>
</li>
<li><p>🔌 <strong>MCPs to Add</strong>:</p>
<ul>
<li><p><a target="_blank" href="https://app.langdb.ai/mcp-servers/context7">Context7 MCP</a></p>
</li>
<li><p><a target="_blank" href="https://app.langdb.ai/mcp-servers/sequential-thinking">Sequential Thinking MCP</a></p>
</li>
</ul>
</li>
</ul>
<p>Once installed in <strong>Cursor</strong>, <strong>Claude</strong>, or <strong>Windsurf</strong>, paste the prompt, and let your AI editor reason, implement, and patch your repo with fully traceable steps.</p>
]]></content:encoded></item><item><title><![CDATA[March Product Update]]></title><description><![CDATA[In March, we focused on making production-grade LLM workflows faster, smarter, and easier to manage. We rolled out major updates to the LangDB AI Gateway: a redesigned tracing view with span-level insights, robust cost controls for scaling safely, an...]]></description><link>https://blog.langdb.ai/march-2025-product-update</link><guid isPermaLink="true">https://blog.langdb.ai/march-2025-product-update</guid><category><![CDATA[product update]]></category><category><![CDATA[LangDB]]></category><category><![CDATA[gateway]]></category><category><![CDATA[openai]]></category><category><![CDATA[llm]]></category><category><![CDATA[Rust]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Fri, 04 Apr 2025 11:31:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1743763330107/36a43d8f-ae31-4217-9956-e150a65388a1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In March, we focused on making production-grade LLM workflows faster, smarter, and easier to manage. We rolled out major updates to the <a target="_blank" href="https://langdb.ai/">LangDB AI Gateway</a>: a redesigned tracing view with span-level insights, robust cost controls for scaling safely, and a smarter models interface for better deployment decisions. We also expanded our model catalog with releases like Gemini 2.5 Pro and Claude 3.7 Sonnet, and added native integrations with ClickHouse, Vercel AI SDK, and Langroid. On the community front, we connected with developers at events in Delhi and Mumbai.</p>
<h2 id="heading-langdb-ai-gateway">🚀 LangDB AI Gateway</h2>
<h3 id="heading-tracing-view-revamp">Tracing View Revamp</h3>
<p>The tracing view has been fully redesigned to give you more visibility into what's happening under the hood:</p>
<ul>
<li><p><strong>Dependency Graph</strong>: Visually map the flow of your entire request, including nested calls, tool usage, and internal routing logic</p>
</li>
<li><p><strong>Timeline View</strong>: See every span and model call across time with accurate duration and sequence</p>
</li>
<li><p><strong>Details Panel</strong>: Click any span to explore headers, messages, tool definitions, usage &amp; cost, outputs, and raw JSON—all in one place</p>
</li>
<li><p><strong>Token, Cost, Duration Overview</strong>: Get high-level metrics across the top of every trace session</p>
</li>
</ul>
<p>This makes it dramatically easier to debug multi-step agent flows, identify bottlenecks, and understand how your models are behaving in production.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743759093214/5d417c78-22ee-4fbd-bb94-684b7a086d42.png" alt /></p>
<h3 id="heading-cost-control-ui-enhancements">Cost Control UI Enhancements</h3>
<p>Budget management is now easier than ever with LangDB's upgraded cost control experience:</p>
<ul>
<li><p><strong>Project-Level Controls</strong>: Set daily, monthly, or total cost limits per project</p>
</li>
<li><p><strong>Per-User &amp; Group Limits</strong>: Restrict usage for individual users or teams (e.g., devs group) to prevent surprise bills.</p>
</li>
<li><p><strong>Real-Time Visibility</strong>: Instantly view how much has been used, what remains, and what limits are active</p>
</li>
<li><p><strong>No-Code Setup</strong>: Everything can be configured from the UI.</p>
</li>
</ul>
<p>This gives teams peace of mind while scaling, and keeps infra costs transparent and under control.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743759098694/644a6c9c-6a8e-4219-a4bc-cde070ac1da3.png" alt /></p>
<h3 id="heading-redesigned-models-section">Redesigned Models Section</h3>
<p>The <strong>Models Section</strong> now includes:</p>
<ul>
<li><p>Filters by provider, type, input/output format, and capabilities</p>
</li>
<li><p>Visible pricing, context length, and input types</p>
</li>
<li><p>Searchable, sortable layout with inline model details</p>
</li>
</ul>
<p>Whether you're deploying high-efficiency models like <code>o3-mini</code> or multi-modal powerhouses like <code>gemini-2.5-pro</code>, this updated section helps you pick the best tool for the job with confidence.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743759103122/35b907ab-89d1-4f56-a7fc-91f2b686618f.png" alt /></p>
<h3 id="heading-ai-gateway-v021">AI Gateway v0.2.1</h3>
<p>Our gateway has been updated to v0.2.1 with:</p>
<ul>
<li><p>Improved tracing integration</p>
</li>
<li><p>Enhanced config validation</p>
</li>
<li><p>Better support for routing across private and public LLMs</p>
</li>
</ul>
<p><a target="_blank" href="https://github.com/langdb/ai-gateway">🔗 Explore on GitHub</a></p>
<h3 id="heading-7-day-professional-tier-trial">7-Day Professional Tier Trial</h3>
<p>You can now activate a free 7-day trial of LangDB's Professional Tier from the dashboard. Get access to full tracing, usage visibility, routing controls, and more—no commitment required.</p>
<h2 id="heading-new-models">🤖 New Models</h2>
<p>We've expanded LangDB's model lineup to include:</p>
<ul>
<li><p><strong>Gemma 3</strong> variants for lightweight tasks</p>
</li>
<li><p><strong>Gemini 2.5 Pro</strong> for advanced multi-modal reasoning</p>
</li>
<li><p><strong>Claude 3.7 Sonnet</strong> with high accuracy and long context support</p>
</li>
</ul>
<p>All models are available instantly via LangDB's unified API.</p>
<p><a target="_blank" href="https://app.langdb.ai/models">🔗 Explore All Models</a></p>
<h2 id="heading-integrations">🔌 Integrations</h2>
<h3 id="heading-new-integrations">New Integrations</h3>
<ul>
<li><p><strong>ClickHouse</strong>: LangDB uses ClickHouse as its data store for traces, metrics, and analytics. Native UDF support lets you run LLM calls directly from SQL. <a target="_blank" href="https://docs.langdb.ai/self-hosted/clickhouse-udfs">Learn about ClickHouse UDFs</a></p>
</li>
<li><p><strong>Vercel AI SDK</strong>: LangDB is now available as a provider via the Vercel AI SDK. <a target="_blank" href="https://sdk.vercel.ai/providers/community-providers/langdb">Try it on Vercel AI SDK</a></p>
</li>
<li><p><strong>Langroid</strong>: Use LangDB with Langroid to build multi-agent applications with routing, memory, and modular agent logic. <a target="_blank" href="https://github.com/langroid/langroid/tree/main/examples/langdb">View examples</a></p>
</li>
<li><p><strong>OpenAI Agents SDK Support</strong>: Learn how to integrate Gemini, Claude, and DeepSeek into the Agents SDK by OpenAI. <a target="_blank" href="https://blog.langdb.ai/integrate-gemini-claude-deepseek-into-agents-sdk-by-openai">Read the article</a></p>
</li>
</ul>
<h2 id="heading-community">👥 Community</h2>
<h3 id="heading-delhi-meetup-with-clickhouse">Delhi Meetup with ClickHouse</h3>
<p>We co-hosted a developer meetup in Delhi alongside ClickHouse to showcase how LangDB powers scalable AI infrastructure using real-time trace analytics.</p>
<p>ClickHouse also recognized LangDB co-founder <strong>Matteo Pelati</strong> for his contributions to AI infrastructure. Matteo presented LangDB at the ClickHouse Singapore meetup, highlighting how LangDB enables secure, observable LLM deployments while leveraging ClickHouse for storage and analytics.</p>
<p><a target="_blank" href="https://www.linkedin.com/posts/matteopelati_databaseengineer-dataengineer-observability-activity-7309190568838844416-kPgk?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAC7bga4BsktY_mG6rNe2Ukzo0s1rKKZa29I">🔗 See the post</a></p>
<h3 id="heading-global-ai-bootcamp-mumbai">Global AI Bootcamp (Mumbai)</h3>
<p>We participated in the Global AI Bootcamp in Mumbai and hosted a talk titled <strong>"Introduction to AI Gateway"</strong>, where we covered how LangDB helps teams manage multi-model routing, observability, and real-time debugging with minimal config.</p>
<p><a target="_blank" href="https://www.linkedin.com/posts/hackerspace-mumbai_gaib25mum-mumbai-ai-activity-7313183900246147072-Z92f?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAC7bga4BsktY_mG6rNe2Ukzo0s1rKKZa29I">🔗 See the post</a></p>
<h2 id="heading-stay-connected">Stay Connected</h2>
<p>Join our community to stay updated on the latest LangDB features, best practices, and AI infrastructure insights.</p>
<p><a target="_blank" href="https://join.slack.com/t/langdbcommunity/shared_invite/zt-2haf5kj6a-d7NX6TFJUPX45w~Ag4dzlg">Join us on Slack</a></p>
]]></content:encoded></item><item><title><![CDATA[Integrate Gemini, Claude, DeepSeek into Agents SDK by OpenAI]]></title><description><![CDATA[OpenAI recently introduced the Agents SDK, a lightweight, Python-first toolkit for building agentic AI apps. It’s built around three primitives:

Agents: LLMs paired with tools and instructions to complete tasks autonomously.

Handoffs: Let agents de...]]></description><link>https://blog.langdb.ai/integrate-gemini-claude-deepseek-into-agents-sdk-by-openai</link><guid isPermaLink="true">https://blog.langdb.ai/integrate-gemini-claude-deepseek-into-agents-sdk-by-openai</guid><category><![CDATA[agents]]></category><category><![CDATA[openai]]></category><category><![CDATA[agentic workflow]]></category><category><![CDATA[tracing]]></category><category><![CDATA[gemini]]></category><category><![CDATA[#anthropic]]></category><category><![CDATA[grok]]></category><category><![CDATA[langchain]]></category><category><![CDATA[CrewAI]]></category><category><![CDATA[Deepseek]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Mon, 31 Mar 2025 15:00:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1743426457443/f6cfb629-dd94-4e28-ac92-a29b4e13805c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>OpenAI recently introduced the <strong>Agents SDK</strong>, a lightweight, Python-first toolkit for building agentic AI apps. It’s built around three primitives:</p>
<ul>
<li><p><strong>Agents</strong>: LLMs paired with tools and instructions to complete tasks autonomously.</p>
</li>
<li><p><strong>Handoffs</strong>: Let agents delegate tasks to other agents.</p>
</li>
<li><p><strong>Guardrails</strong>: Validate inputs/outputs to keep workflows safe and reliable.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743421022887/50f3946e-198a-4c57-ab4a-0a43d9ff3e26.gif" alt class="image--center mx-auto" /></p>
<h3 id="heading-tldr"><strong>TLDR;</strong></h3>
<p>OpenAI’s Agents SDK is great for building tool-using agents with handoffs. But what if you want to use Gemini's latest reasoning model or take advantage of DeepSeek’s 90 percent lower cost?</p>
<p>With LangDB, you can run the same agent logic across more than 350 models, including Claude, Gemini, DeepSeek, and Grok, without changing your code. It works out of the box with frameworks like CrewAI and LangChain, and provides full trace visibility into every agent, model, and tool call.</p>
<h2 id="heading-building-a-multi-agent-customer-support-system">Building a Multi-Agent Customer Support System</h2>
<p>Let’s create a production-grade AI agent system using the <strong>OpenAI Agents SDK</strong>, with <strong>LangDB</strong> providing multi-model flexibility and runtime control.</p>
<p>We’ll build four specialized agents, each powered by a different model to demonstrate LangDB's multi-model capabilities:</p>
<ul>
<li><p><strong>OrderAgent</strong> (claude-3.7-sonnet): checks order status</p>
</li>
<li><p><strong>FAQAgent</strong> (gemini-2.5-pro): answers common customer questions</p>
</li>
<li><p><strong>ComplaintAgent</strong> (grok-2): handles complaints with empathy</p>
</li>
<li><p><strong>ReservationAgent</strong> (gpt-4o): manages table bookings</p>
</li>
</ul>
<p>Then, we use a <strong>Classifier Agent</strong> to route user queries to the appropriate sub-agent using built-in handoffs.</p>
<h3 id="heading-overview">Overview</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743404929236/6422b5d4-728d-438c-8ff6-8941f856a95e.png" alt class="image--center mx-auto" /></p>
<p>Each leaf agent is powered by a tool — a simple Python function decorated with <code>@function_tool</code>.</p>
<h4 id="heading-example-tool-for-order-status">Example: Tool for Order Status</h4>
<pre><code class="lang-python"><span class="hljs-meta">@function_tool</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_order_status</span>(<span class="hljs-params">order_id: str</span>):</span>
    order_statuses = {
        <span class="hljs-string">"12345"</span>: <span class="hljs-string">"Being prepared."</span>,
        <span class="hljs-string">"67890"</span>: <span class="hljs-string">"Dispatched."</span>,
        <span class="hljs-string">"11121"</span>: <span class="hljs-string">"Still processing."</span>
    }
    <span class="hljs-keyword">return</span> order_statuses.get(order_id, <span class="hljs-string">"Order ID not found."</span>)
</code></pre>
<h3 id="heading-defining-the-agent">Defining the agent:</h3>
<pre><code class="lang-python">order_agent = Agent(
    name=<span class="hljs-string">"OrderAgent"</span>,
    model=<span class="hljs-string">"anthropic/claude-3.7-sonnet"</span>,
    instructions=<span class="hljs-string">"Help customers with their order status."</span>,
    tools=[check_order_status]
)
</code></pre>
<p>Each of the other agents follows the same structure, varying only in tools, instructions, and model selection.</p>
<h3 id="heading-routing-user-queries-with-handoffs">Routing User Queries with Handoffs</h3>
<pre><code class="lang-python">classifier_agent = Agent(
    name=<span class="hljs-string">"User Interface Agent"</span>,
    model=<span class="hljs-string">"openai/gpt-4o-mini"</span>,
    instructions=<span class="hljs-string">"You are a restaurant customer support agent. Handoff to the appropriate agent based on the user query."</span>,
    handoffs=[order_agent, faq_agent, complaint_agent, reservation_agent]
)
</code></pre>
<p>This agent functions as a controller, deciding which specialized agent should handle the user's request.</p>
<h2 id="heading-running-multi-model-agents-with-langdb">Running Multi-Model Agents with LangDB</h2>
<p>To run the agents through LangDB and switch between different providers, first configure the OpenAI-compatible client like this:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> agents <span class="hljs-keyword">import</span> Agent, set_default_openai_client
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> AsyncOpenAI
<span class="hljs-keyword">from</span> uuid <span class="hljs-keyword">import</span> uuid4

langDBProjectID = <span class="hljs-string">"your-project-id"</span>  <span class="hljs-comment"># from the LangDB dashboard</span>

client = AsyncOpenAI(
    api_key=<span class="hljs-string">"your-langdb-api-key"</span>,
    base_url=<span class="hljs-string">f"https://api.us-east-1.langdb.ai/<span class="hljs-subst">{langDBProjectID}</span>/v1"</span>,
    default_headers={<span class="hljs-string">"x-thread-id"</span>: str(uuid4()), <span class="hljs-string">"x-run-id"</span>: str(uuid4())})
set_default_openai_client(client, use_for_tracing=<span class="hljs-literal">False</span>)
</code></pre>
<p>LangDB allows you to test your agents using models such as <strong>GPT</strong>, <strong>Claude</strong>, <strong>Gemini</strong>, or <strong>Grok</strong>, while keeping the agent logic unchanged.</p>
<p>To switch models, simply change the configuration:</p>
<pre><code class="lang-python">faq_agent = Agent(
    name=<span class="hljs-string">"FAQAgent"</span>,
    model=<span class="hljs-string">"gemini/gemini-2.5-pro-exp-03-25"</span>, <span class="hljs-comment"># or claude-3.7-sonnet, gpt-4o,</span>
    instructions=<span class="hljs-string">"Answer common customer questions about hours, menu, and location."</span>
    ...)
</code></pre>
<p>This approach enables you to evaluate multiple providers and optimize for quality, cost, or latency.</p>
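<p>Because the model is just a string, a provider comparison can be driven from a single lookup table. The sketch below is purely illustrative (plain Python, no SDK calls; the table and helper are hypothetical, with model IDs taken from this post):</p>

```python
# Illustrative only: centralizing each agent's model choice makes swapping
# providers a one-line change instead of editing every Agent(...) call.
AGENT_MODELS = {
    "OrderAgent": "anthropic/claude-3.7-sonnet",
    "FAQAgent": "gemini/gemini-2.5-pro-exp-03-25",
    "ComplaintAgent": "grok-2",
    "ReservationAgent": "gpt-4o",
}

def agent_kwargs(name: str, instructions: str) -> dict:
    """Build the keyword arguments you would pass to Agent(...) for a role."""
    return {"name": name, "model": AGENT_MODELS[name], "instructions": instructions}

print(agent_kwargs("FAQAgent", "Answer common customer questions.")["model"])
# gemini/gemini-2.5-pro-exp-03-25
```

<p>Swapping <code>FAQAgent</code> to another provider then means editing one entry in the table, leaving the agent logic untouched.</p>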
<h3 id="heading-observing-the-agents-execution">Observing the Agents Execution</h3>
<p>LangDB provides a unified trace view that helps developers inspect agent flows across different models. Each span in the trace shows the model used, any tools called, and the time taken at each step.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743422195740/fcb3cd12-1261-4f1a-ab87-22f164ed0ca9.png" alt="Each row in the timeline corresponds to a different model and agent interaction." class="image--center mx-auto" /></p>
<p>In the example above, you can see:</p>
<ul>
<li><p>Multiple agents running on different models in a single request</p>
</li>
<li><p>Visual breakdown of agent handoffs and tool calls</p>
</li>
<li><p>Timeline and cost details for each model involved</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743424176537/134459a1-8dd1-407a-800e-c09272955d24.png" alt class="image--center mx-auto" /></p>
<p>This trace graph shows a single user query routed through multiple agents, each using a different model. It captures the classifier decision, tool calls, and model usage in one place, making it easy to inspect and debug the full interaction flow.</p>
<p>The visibility helps in debugging behavior, verifying tool usage, and understanding model performance across complex workflows.</p>
<h2 id="heading-why-this-matters">Why this matters?</h2>
<p>In real-world applications, different models can excel depending on the type of task or interaction style required:</p>
<ul>
<li><p>Some are better suited for fast, low-latency tasks</p>
</li>
<li><p>Others handle nuanced, empathetic, or creative responses well</p>
</li>
<li><p>Certain models are optimized for summarization or structured formatting</p>
</li>
<li><p>Others provide strong performance for general-purpose conversations</p>
</li>
</ul>
<p>LangDB lets you assign the most suitable model to each agent, giving you task-specific control while maintaining a unified development experience.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>The OpenAI Agents SDK provides a clean way to define agent workflows. Paired with LangDB, it becomes possible to run the same agent setup across multiple model providers without changing your application code.</p>
<p>LangDB gives you visibility into agent execution through trace views and lets you switch between over 350 supported models using a consistent interface. This makes it easier to compare performance, debug behavior, and adapt to evolving requirements.</p>
<h2 id="heading-try-it-yourself">Try It Yourself</h2>
<p>To explore this setup, check out the sample project in the repository. It contains the necessary code to run agents locally, modify model configurations, and observe how requests are routed across different models using LangDB.</p>
<p>Visit <a target="_blank" href="https://langdb.ai/">LangDB</a> and sign up to get $10 in credits.</p>
<p>View the sample code: <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/openai/customer-support">GitHub Repository</a>.</p>
<hr />
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://openai.github.io/openai-agents-python/">OpenAI Agents SDK Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://docs.langdb.ai/">LangDB Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://app.langdb.ai/models">LangDB Model List</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/langdb/langdb-samples">LangDB Samples Repository</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Update: Cost Control UI, Models Section & Open Source v0.2.1]]></title><description><![CDATA[We've made some improvements to the platform:

Cost Control UI Enhancements – Managing budgets is now more intuitive. You can set limits for projects, users, and groups with a simple click and direct input. The UI now also provides clear usage breakd...]]></description><link>https://blog.langdb.ai/update-cost-control-ui</link><guid isPermaLink="true">https://blog.langdb.ai/update-cost-control-ui</guid><category><![CDATA[opensource]]></category><category><![CDATA[Cost Control]]></category><category><![CDATA[Gemma3]]></category><category><![CDATA[gemma]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Fri, 21 Mar 2025 12:12:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1742559900835/929f70d9-f938-494a-b010-99e6dd2a2f0c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We've made some improvements to the platform:</p>
<ul>
<li><p><strong>Cost Control UI Enhancements</strong> – Managing budgets is now more intuitive. You can set limits for projects, users, and groups with a simple click and direct input. The UI now also provides clear usage breakdowns and remaining budget at a glance. Learn more in the <a target="_blank" href="https://docs.langdb.ai/features/cost-control">docs</a>.</p>
</li>
<li><p><strong>Models Section Improvement</strong> – Easily navigate through models with a redesigned layout. The left-side filters allow you to refine searches by provider, model type, input/output types, capabilities, and more. Explore the <a target="_blank" href="https://app.langdb.ai/models">models section</a>.</p>
</li>
<li><p><strong>7-Day Trial for Professional Tier</strong> – Get a hands-on experience with advanced features before committing.</p>
</li>
<li><p><strong>New Models Added</strong> – The model roster now includes <strong>Gemma 3 variants</strong> and additional models, providing more options for different use cases.</p>
</li>
</ul>
<p>Also, <strong>Open Source v0.2.1</strong> is out! Check it here → <a target="_blank" href="https://github.com/langdb/ai-gateway/releases/tag/0.2.1">ai-gateway</a></p>
<p>Check out the updated UI and feel free to share your feedback!</p>
]]></content:encoded></item><item><title><![CDATA[AI Without Downtime]]></title><description><![CDATA[Recently, the DeepSeek-R1 model gained massive popularity, leading to an overwhelming surge in traffic. As demand skyrocketed, the model began experiencing slowdowns and even outages, making it unreliable for production applications.
LangDB’s Fallbac...]]></description><link>https://blog.langdb.ai/ai-without-downtime</link><guid isPermaLink="true">https://blog.langdb.ai/ai-without-downtime</guid><category><![CDATA[Deepseek]]></category><category><![CDATA[openai]]></category><category><![CDATA[llm]]></category><category><![CDATA[routing]]></category><category><![CDATA[fallback mechanism]]></category><category><![CDATA[Zero Downtime]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Thu, 06 Mar 2025 07:04:45 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741244646784/86d38d94-45fc-42d9-be86-ed65e2efb434.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently, the <strong>DeepSeek-R1</strong> model gained massive popularity, leading to an overwhelming surge in traffic. As demand skyrocketed, the model began experiencing slowdowns and even outages, making it unreliable for production applications.</p>
<p>LangDB’s <strong>Fallback Routing</strong> ensures uninterrupted AI service by automatically switching to a backup model when the primary model struggles with high traffic. Instead of facing downtime, applications leveraging DeepSeek-Reasoner could seamlessly reroute requests to alternative models like GPT-4o or even other providers of DeepSeek-R1, ensuring smooth operations even during peak demand.</p>
<h2 id="heading-why-fallback-matters">Why Fallback Matters</h2>
<p>AI reliability is crucial, but high traffic, model failures, or server outages can cause disruptions. Instead of leaving applications vulnerable, LangDB’s <strong>Fallback Routing</strong> ensures continuity by rerouting requests to a backup model in real time.</p>
<h3 id="heading-how-it-works">How it Works</h3>
<p>When a request is sent to LangDB’s routing system, it first attempts to process it using the preferred model. If that model is down, experiencing delays, or overloaded, the system seamlessly reroutes the request to a predefined backup model. This <strong>prevents downtime, reduces latency issues, and improves reliability</strong>.</p>
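<p>The routing semantics described above amount to a try-then-fall-back loop. Here is a minimal plain-Python sketch of that idea (an illustration of the behavior, not LangDB's implementation, which runs server-side in the gateway):</p>

```python
# Toy model of fallback routing: try each target in order and return the
# first successful response; raise only if every target fails.
def route_with_fallback(request, targets):
    errors = []
    for call_model in targets:
        try:
            return call_model(request)
        except Exception as exc:  # timeout, overload, outage, ...
            errors.append(exc)
    raise RuntimeError(f"all {len(targets)} targets failed: {errors}")

# Stub "models": the primary is overloaded, the backup answers.
def primary(req):
    raise TimeoutError("deepseek-reasoner overloaded")

def backup(req):
    return f"gpt-4o answer to: {req}"

print(route_with_fallback("hello", [primary, backup]))
# gpt-4o answer to: hello
```

<p>The caller never sees the primary's failure; it simply receives the backup model's response.</p>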
<h2 id="heading-setting-up-fallback-routing-in-langdb">Setting Up Fallback Routing in LangDB</h2>
<p>LangDB provides an easy way to configure <strong>Fallback Routing</strong> via the UI or API.</p>
<p>Here’s how you can use the UI to set it up:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1741243449470/a157381d-2ec7-47d5-9eaf-d8161ee56a10.gif" alt class="image--center mx-auto" /></p>
<p>Here’s how you can set it up programmatically:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"model"</span>: <span class="hljs-string">"router/dynamic"</span>,
    <span class="hljs-attr">"router"</span>: {
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"fallback-router"</span>,
        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"fallback"</span>,
        <span class="hljs-attr">"targets"</span>: [
            { <span class="hljs-attr">"model"</span>: <span class="hljs-string">"deepseek-reasoner"</span>, <span class="hljs-attr">"temperature"</span>: <span class="hljs-number">0.7</span>, <span class="hljs-attr">"max_tokens"</span>: <span class="hljs-number">400</span> },
            { <span class="hljs-attr">"model"</span>: <span class="hljs-string">"gpt-4o"</span>, <span class="hljs-attr">"temperature"</span>: <span class="hljs-number">0.8</span>, <span class="hljs-attr">"max_tokens"</span>: <span class="hljs-number">500</span> }
        ]
    }
}
</code></pre>
<p>This configuration ensures that if <strong>DeepSeek-Reasoner</strong> is overloaded or unavailable, requests automatically switch to <strong>GPT-4o</strong>, maintaining uninterrupted service.</p>
<h3 id="heading-fallback-router-with-percentage">Fallback router with Percentage</h3>
<p>In the previous example, we implemented a simple fallback mechanism. However, a more robust solution would be to <strong>distribute queries across multiple providers of DeepSeek-R1</strong> while maintaining a fallback to GPT-4o if both providers fail. This method helps balance traffic efficiently while ensuring uninterrupted AI services.</p>
<p>Here’s how you can configure <strong>Fallback Routing with Percentage-Based Load Balancing</strong>:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"model"</span>: <span class="hljs-string">"router/dynamic"</span>,
    <span class="hljs-attr">"router"</span>: {
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"fallback-percentage-router"</span>,
        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"fallback"</span>,
        <span class="hljs-attr">"targets"</span>: [
            {
                <span class="hljs-attr">"model"</span>: <span class="hljs-string">"router/dynamic"</span>,
                <span class="hljs-attr">"router"</span>: {
                    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"percentage-balanced"</span>,
                    <span class="hljs-attr">"type"</span>: <span class="hljs-string">"percentage"</span>,
                    <span class="hljs-attr">"model_a"</span>: [
                        { <span class="hljs-attr">"model"</span>: <span class="hljs-string">"fireworksai/deepseek-r1"</span>, <span class="hljs-attr">"temperature"</span>: <span class="hljs-number">0.7</span>, <span class="hljs-attr">"max_tokens"</span>: <span class="hljs-number">400</span> },
                        <span class="hljs-number">0.5</span>
                    ],
                    <span class="hljs-attr">"model_b"</span>: [
                        { <span class="hljs-attr">"model"</span>: <span class="hljs-string">"deepseek/deepseek-reasoner"</span>, <span class="hljs-attr">"temperature"</span>: <span class="hljs-number">0.7</span>, <span class="hljs-attr">"max_tokens"</span>: <span class="hljs-number">400</span> },
                        <span class="hljs-number">0.5</span>
                    ]
                }
            },
            { <span class="hljs-attr">"model"</span>: <span class="hljs-string">"gpt-4o"</span>, <span class="hljs-attr">"temperature"</span>: <span class="hljs-number">0.8</span>, <span class="hljs-attr">"max_tokens"</span>: <span class="hljs-number">500</span> }
        ]
    }
}
</code></pre>
<h3 id="heading-how-this-works">How This Works:</h3>
<ul>
<li><p><strong>Primary Route:</strong> The system distributes requests <strong>evenly (50-50%)</strong> between two providers of DeepSeek-R1 to balance the load.</p>
</li>
<li><p><strong>Fallback Route:</strong> If both DeepSeek-R1 providers are unavailable or fail, all requests are automatically rerouted to <strong>GPT-4o</strong>, ensuring continuous service.</p>
</li>
</ul>
<p>This approach provides <strong>load balancing and reliable failover protection</strong>, making it ideal for AI applications facing high demand and occasional model unavailability.</p>
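<p>The percentage split itself is just a weighted random choice over the inner targets. A self-contained sketch of that selection step (illustrative only, mirroring the 50/50 weights in the config above):</p>

```python
import random

# Toy weighted selection between two DeepSeek-R1 providers; in the config
# above, the gateway falls back to gpt-4o only if the chosen provider fails.
PROVIDERS = [("fireworksai/deepseek-r1", 0.5), ("deepseek/deepseek-reasoner", 0.5)]

def pick_provider(rng: random.Random) -> str:
    r, cumulative = rng.random(), 0.0
    for model, weight in PROVIDERS:
        cumulative += weight
        if r < cumulative:
            return model
    return PROVIDERS[-1][0]  # guard against floating-point rounding

rng = random.Random(0)  # seeded for reproducibility
counts = {model: 0 for model, _ in PROVIDERS}
for _ in range(1000):
    counts[pick_provider(rng)] += 1
print(counts)  # close to a 50/50 split
```

<p>Adjusting the weights (say, 0.8/0.2) shifts traffic toward the cheaper or faster provider without any change to the fallback behavior.</p>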
<p>In more complex scenarios, you can configure a <strong>multi-level fallback system</strong> with percentage-based distribution. This approach allows requests to be routed dynamically based on pricing, performance, or reliability, ensuring efficiency while preventing downtime. Check out <a target="_blank" href="https://docs.langdb.ai/features/routing#routing-strategies">Routing Strategies</a> for more details.</p>
<p>By leveraging <strong>dynamic routing</strong>, you can:</p>
<ul>
<li><p><strong>Prevent downtime</strong> by automatically switching to backup models.</p>
</li>
<li><p><strong>Optimize performance and cost</strong> with smart load balancing.</p>
</li>
<li><p><strong>Ensure scalability</strong> without manual intervention.</p>
</li>
</ul>
<p>With LangDB’s flexible and powerful routing capabilities, <strong>you can build AI applications that are not only intelligent but also robust and fail-safe</strong>.</p>
<h3 id="heading-get-started-today">Get Started Today</h3>
<p>Ready to implement <strong>fallback routing</strong> in your AI stack? Check out the <a target="_blank" href="https://docs.langdb.ai/features/routing">LangDB Routing Docs</a> and deploy your AI applications on <a target="_blank" href="https://app.langdb.ai/">LangDB</a> to ensure reliability, scalability, and seamless failover.</p>
]]></content:encoded></item><item><title><![CDATA[Empowering LLMs with Real-Time Search]]></title><description><![CDATA[Traditional LLMs rely on static training data, making them prone to outdated responses, hallucinations, and missing critical context. LangDB's built-in Search tool solves this by fetching real-time data, improving accuracy and contextual relevance.
T...]]></description><link>https://blog.langdb.ai/empowering-llms-with-real-time-search</link><guid isPermaLink="true">https://blog.langdb.ai/empowering-llms-with-real-time-search</guid><category><![CDATA[Model Context Protocol]]></category><category><![CDATA[search]]></category><category><![CDATA[llm]]></category><category><![CDATA[openai]]></category><category><![CDATA[agentic AI]]></category><category><![CDATA[tools]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Fri, 28 Feb 2025 08:25:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740731055196/470a106c-ab53-41e3-95a0-6185718b5ea6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Traditional LLMs rely on static training data, making them prone to outdated responses, hallucinations, and missing critical context. LangDB's <strong>built-in Search tool</strong> solves this by fetching real-time data, improving accuracy and contextual relevance.</p>
<h2 id="heading-the-challenge-stale-or-incomplete-knowledge">The Challenge: Stale or Incomplete Knowledge</h2>
<ul>
<li><p><strong>Static Corpus</strong>: Most LLMs are trained on large datasets, but that training is typically a snapshot in time. Once trained, the model doesn’t automatically update its knowledge.</p>
</li>
<li><p><strong>Inaccurate or Outdated Information</strong>: Without a method to query current data, an LLM may provide answers that were correct at the time of training but are no longer valid.</p>
</li>
<li><p><strong>Limited Context</strong>: Even if the model has relevant data, it might not surface the best context without a guided search mechanism.</p>
</li>
</ul>
<h2 id="heading-introducing-langdb-search-tool">Introducing LangDB Search Tool</h2>
<p>LangDB’s built-in Search tool addresses these challenges by allowing real-time querying of databases, documents, or external sources:</p>
<ul>
<li><p><strong>On-Demand Queries</strong>: Instead of relying solely on the LLM’s training data, the Search tool can fetch the latest information at query time.</p>
</li>
<li><p><strong>Integrated with LangDB</strong>: The search functionality is seamlessly woven into the LangDB, ensuring that developers can use it without additional overhead.</p>
</li>
<li><p><strong>API-Ready</strong>: LangDB’s search tool can be accessed via API too.</p>
</li>
</ul>
<h2 id="heading-search-vs-no-search">Search vs No-Search</h2>
<p>Below is a side-by-side comparison of using LangDB’s search tool versus relying on static model knowledge. The left image shows results <strong>with search enabled</strong>, pulling real-time, up-to-date information. The right image shows the <strong>same query without search</strong>, leading to more generic and potentially outdated responses.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740713259619/6567a076-df61-43a6-9205-e9bbdd92e132.png" alt class="image--center mx-auto" /></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Feature</strong></td><td><strong>No Search</strong></td><td><strong>With LangDB Search</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Data Freshness</strong></td><td>Static, based on training corpus</td><td>Dynamic, fetches real-time information</td></tr>
<tr>
<td><strong>Accuracy</strong></td><td>Prone to outdated or incorrect responses</td><td>Pulls from latest sources, improving reliability</td></tr>
<tr>
<td><strong>Context Depth</strong></td><td>Limited by internal model memory</td><td>Integrates external sources for better insights</td></tr>
<tr>
<td><strong>Hallucination Risk</strong></td><td>Higher</td><td>Lower, as responses are backed by retrieved data</td></tr>
</tbody>
</table>
</div><h2 id="heading-using-search-through-api">Using Search through API</h2>
<p>LangDB’s search tool can be integrated via API to fetch real-time data. The following call enables real-time search for an LLM request:</p>
<pre><code class="lang-bash">curl <span class="hljs-string">'https://api.us-east-1.langdb.ai/{LangDB_ProjectID}/v1/chat/completions'</span> \
-H <span class="hljs-string">'authorization: Bearer LangDBAPIKey'</span> \
-H <span class="hljs-string">'Content-Type: application/json'</span> \
-d <span class="hljs-string">'{
  "model": "openai/gpt-4o-mini",
  "mcp_servers": [{ "name": "websearch", "type": "in-memory"}],
  "messages": [
    {
      "role": "user",
      "content": "what are the latest models by anthropic and what are its standout features?"
    }
  ]
}'</span>
</code></pre>
<p>This allows the LLM to enhance responses with live data, ensuring greater accuracy and relevance.</p>
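<p>For Python applications, the same request can be issued with just the standard library. The sketch below mirrors the curl call above; the <code>PROJECT_ID</code> and <code>API_KEY</code> placeholders are assumptions you replace with your own values, and the payload shape is taken directly from the curl example:</p>

```python
import json
import os
import urllib.request

# Placeholders -- substitute your own LangDB project ID and API key.
PROJECT_ID = os.getenv("LANGDB_PROJECT_ID", "your-project-id")
API_KEY = os.getenv("LANGDB_API_KEY", "your-api-key")


def build_search_payload(question: str) -> dict:
    """Same body as the curl example: attach the in-memory websearch
    MCP server so the model can pull live results."""
    return {
        "model": "openai/gpt-4o-mini",
        "mcp_servers": [{"name": "websearch", "type": "in-memory"}],
        "messages": [{"role": "user", "content": question}],
    }


def ask_with_search(question: str) -> str:
    """POST the payload to the LangDB chat-completions endpoint and
    return the assistant's reply text."""
    req = urllib.request.Request(
        f"https://api.us-east-1.langdb.ai/{PROJECT_ID}/v1/chat/completions",
        data=json.dumps(build_search_payload(question)).encode(),
        headers={
            "authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

<p>Because the gateway exposes an OpenAI-compatible API, the same request body should also work through the official OpenAI SDK by pointing its <code>base_url</code> at your LangDB project endpoint.</p>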
<h2 id="heading-conclusion">Conclusion</h2>
<p>LangDB’s built-in Search tool eliminates the limitations of static LLMs by integrating <strong>real-time web search</strong>, ensuring your AI retrieves the most relevant, up-to-date, and accurate information. Whether you're building <strong>chatbots, research tools, or automation systems</strong>, dynamic search enhances responses with verifiable data, reducing hallucinations and improving decision-making.</p>
<p>With <strong>seamless API integration and chat playground support</strong>, LangDB makes it easy to <strong>query, validate, and optimize</strong> your LLM calls.</p>
<p>🔹 <strong>Want to build smarter LLM-powered applications?</strong> Start using <a target="_blank" href="https://langdb.ai/">LangDB</a> Search tool today and take your AI to the next level.</p>
]]></content:encoded></item><item><title><![CDATA[AI-Powered Banking Chatbot: Build with LangChain, LangDB.ai & RAG (Part 2)]]></title><description><![CDATA[In Part 1 of our series, we built a LangChain-powered conversational AI for banking FAQs using LangDB AI Gateway. Now, in Part 2, we'll integrate ChromaDB for Retrieval-Augmented Generation (RAG), enhancing the chatbot's ability to provide precise an...]]></description><link>https://blog.langdb.ai/ai-powered-banking-chatbot-build-with-langchain-langdbai-and-rag-part-2</link><guid isPermaLink="true">https://blog.langdb.ai/ai-powered-banking-chatbot-build-with-langchain-langdbai-and-rag-part-2</guid><category><![CDATA[openai]]></category><category><![CDATA[Deepseek]]></category><category><![CDATA[LangDB]]></category><category><![CDATA[langchain]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[AI]]></category><category><![CDATA[chatbot]]></category><dc:creator><![CDATA[Dishant Gandhi]]></dc:creator><pubDate>Wed, 26 Feb 2025 08:44:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740417643224/a27b8354-3d8a-48a1-b78a-ac541db68d27.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Part 1 of our series, we built a LangChain-powered conversational AI for banking FAQs using <a target="_blank" href="https://langdb.ai/">LangDB AI Gateway</a>. Now, in <strong>Part 2</strong>, we'll integrate <strong>ChromaDB</strong> for Retrieval-Augmented Generation (RAG), enhancing the chatbot's ability to provide precise answers based on uploaded documents.</p>
<p>🚀 <strong>What We'll Cover:</strong></p>
<ol>
<li><p>Understanding the RAG Pipeline.</p>
</li>
<li><p>Setting up ChromaDB for vector storage.</p>
</li>
<li><p>Embedding and storing documents.</p>
</li>
<li><p>Retrieving context from documents for accurate responses.</p>
</li>
<li><p>Querying the chatbot for contextually rich answers.</p>
</li>
</ol>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Find all source code and starter pack by clicking </strong><a target="_self" href="https://github.com/langdb/langdb-samples/blob/main/examples/langchain/langchain-rag-bot/main.py"><strong>here</strong></a><strong>.</strong></div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Don’t forget to star us⭐</strong></div>
</div>

<p>Alternatively, you can follow our <strong>YouTube tutorial</strong></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/jTZ00qz-O3A">https://youtu.be/jTZ00qz-O3A</a></div>
<hr />
<h2 id="heading-what-is-retrieval-augmented-generation-rag">🤖 <strong>What is Retrieval-Augmented Generation (RAG)?</strong></h2>
<p><strong>Retrieval-Augmented Generation (RAG)</strong> is a powerful approach that enhances the capabilities of large language models (LLMs) by providing them with <strong>external knowledge</strong>. Instead of relying solely on pre-trained knowledge, RAG retrieves relevant documents from a vector database and uses them as context to generate accurate and informed responses.</p>
<h3 id="heading-key-components-of-rag"><strong>Key Components of RAG:</strong></h3>
<ol>
<li><p><strong>Retriever:</strong> Searches for relevant documents based on user queries.</p>
</li>
<li><p><strong>Generator:</strong> Generates a response using both the retrieved context and the LLM's internal knowledge.</p>
</li>
<li><p><strong>Memory:</strong> Retains conversation history for continuity.</p>
</li>
</ol>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">This approach ensures that the chatbot can answer user queries with up-to-date, domain-specific information, such as banking FAQs or interest rates.</div>
</div>

<hr />
<h2 id="heading-what-are-vector-databases"><strong>What Are Vector Databases?</strong></h2>
<p>A <strong>vector database</strong> stores information as <strong>vector embeddings</strong>—numerical representations of text, images, or other data. These embeddings allow for <strong>efficient similarity searches</strong>, enabling the retriever to find the most relevant documents based on user queries.</p>
<h3 id="heading-why-use-a-vector-database-like-chromadb"><strong>Why Use a Vector Database like ChromaDB?</strong></h3>
<ol>
<li><p><strong>Fast Retrieval:</strong> Quickly finds relevant information, even in large datasets.</p>
</li>
<li><p><strong>Contextual Matching:</strong> Retrieves documents based on semantic meaning, not just keywords.</p>
</li>
<li><p><strong>Efficient Storage:</strong> Compact and scalable storage of embeddings.</p>
</li>
</ol>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">In our project, we'll use <strong>ChromaDB</strong> to store and retrieve banking-related documents.</div>
</div>

<hr />
<h2 id="heading-how-does-embedding-work"><strong>How Does Embedding Work?</strong></h2>
<p><strong>Embedding</strong> converts text into <strong>high-dimensional vectors</strong> that capture semantic meaning. For example, the phrases <em>"home loan interest rate"</em> and <em>"mortgage rate"</em> would have <strong>similar vector representations</strong>, enabling efficient retrieval.</p>
<h3 id="heading-embedding-process"><strong>Embedding Process:</strong></h3>
<ol>
<li><p><strong>Text Input:</strong> Extracted from uploaded documents (e.g., PDF FAQs).</p>
</li>
<li><p><strong>Vectorization:</strong> Text is converted into embeddings using a model like <code>sentence-transformers</code>.</p>
</li>
<li><p><strong>Storage:</strong> Embeddings are stored in ChromaDB for future retrieval.</p>
</li>
</ol>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">This process ensures that the chatbot can <strong>search and find relevant information</strong> based on user queries.</div>
</div>
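<p>To make the “similar vector representations” idea concrete, here is a minimal, self-contained sketch using toy 3-dimensional vectors. Real embedding models such as <code>all-MiniLM-L6-v2</code> produce 384 dimensions; the numbers below are purely illustrative:</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 mean
    the same direction (semantically similar), near 0 mean unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (illustrative values, not produced by a real model).
home_loan = [0.90, 0.80, 0.10]   # "home loan interest rate"
mortgage  = [0.85, 0.75, 0.20]   # "mortgage rate"
weather   = [0.05, 0.10, 0.95]   # "weather forecast"

# Semantically close phrases score much higher than unrelated ones,
# which is exactly what the retriever exploits at query time.
assert cosine_similarity(home_loan, mortgage) > cosine_similarity(home_loan, weather)
```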

<hr />
<h2 id="heading-setting-up-chromadb"><strong>Setting Up ChromaDB</strong></h2>
<p>First, ensure ChromaDB and PyPDF are installed:</p>
<pre><code class="lang-bash">pip install chromadb pypdf
</code></pre>
<p>Import necessary modules and initialize ChromaDB:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.vectorstores <span class="hljs-keyword">import</span> Chroma
<span class="hljs-keyword">from</span> langchain_community.embeddings <span class="hljs-keyword">import</span> HuggingFaceEmbeddings

CHROMA_DB_DIR = <span class="hljs-string">"chroma"</span>

<span class="hljs-comment"># Initialize ChromaDB and Embeddings</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">initialize_chromadb</span>():</span>
    embeddings = HuggingFaceEmbeddings(model_name=<span class="hljs-string">"sentence-transformers/all-MiniLM-L6-v2"</span>, model_kwargs={<span class="hljs-string">'device'</span>: <span class="hljs-string">'cpu'</span>})
    vector_store = Chroma(persist_directory=CHROMA_DB_DIR, embedding_function=embeddings)
    <span class="hljs-keyword">return</span> vector_store

<span class="hljs-comment"># Initialize ChromaDB</span>
vector_db = initialize_chromadb()
</code></pre>
<p><strong>What This Does:</strong></p>
<ul>
<li><p><strong>Embeddings:</strong> Converts text into searchable vectors.</p>
</li>
<li><p><strong>Vector Store:</strong> Stores these vectors for retrieval.</p>
</li>
<li><p><strong>Persistence:</strong> Ensures data isn't lost after restarting the app.</p>
</li>
</ul>
<hr />
<h2 id="heading-uploading-and-processing-pdf"><strong>Uploading and Processing PDF</strong></h2>
<p>Let's allow users to upload a PDF containing interest rates and banking FAQs.</p>
<pre><code class="lang-python">st.sidebar.title(<span class="hljs-string">"Options"</span>)
uploaded_file = st.sidebar.file_uploader(<span class="hljs-string">"Upload PDF"</span>, type=<span class="hljs-string">"pdf"</span>)
</code></pre>
<p><strong>File Upload:</strong> Users upload PDFs via Streamlit's sidebar.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader
<span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> tempfile

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_pdf</span>(<span class="hljs-params">file</span>):</span>
    <span class="hljs-keyword">with</span> tempfile.TemporaryDirectory() <span class="hljs-keyword">as</span> temp_dir:
        temp_file_path = os.path.join(temp_dir, file.name)
        <span class="hljs-keyword">with</span> open(temp_file_path, <span class="hljs-string">"wb"</span>) <span class="hljs-keyword">as</span> temp_file:
            temp_file.write(file.getbuffer())

        pdf_loader = PyPDFLoader(temp_file_path)
        documents = pdf_loader.load()

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">800</span>, chunk_overlap=<span class="hljs-number">80</span>, length_function=len)
        chunks = text_splitter.split_documents(documents)

        <span class="hljs-keyword">return</span> chunks
</code></pre>
<p><strong>Explanation:</strong></p>
<ul>
<li><p><strong>PDF Loading:</strong> <code>PyPDFLoader</code> extracts the text content from the uploaded PDF.</p>
</li>
<li><p><strong>Text Splitting:</strong> <code>RecursiveCharacterTextSplitter</code> splits the documents into 800-character chunks with an 80-character overlap, keeping each chunk small enough to embed while preserving context across boundaries.</p>
</li>
</ul>
<h2 id="heading-storing-document-in-chromadb"><strong>Storing Document in ChromaDB</strong></h2>
<p>Once the document is processed, we'll convert it into vector embeddings and store it in ChromaDB.</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> uploaded_file:
    user_chunks = process_pdf(uploaded_file)
    vector_db.add_documents(user_chunks)
    st.sidebar.success(<span class="hljs-string">f"Processed <span class="hljs-subst">{len(user_chunks)}</span> chunks from uploaded PDF."</span>)
</code></pre>
<ul>
<li><p><strong>Embedding:</strong> <code>add_documents</code> converts each chunk into a vector embedding using the configured embedding model.</p>
</li>
<li><p><strong>Vector Storage:</strong> The embeddings are persisted in ChromaDB so they can be retrieved later by similarity search.</p>
</li>
</ul>
<hr />
<h2 id="heading-querying-chromadb-for-contextual-answers"><strong>Querying ChromaDB for Contextual Answers</strong></h2>
<p>Now, let's update the chatbot to search the vector store for relevant context when a user asks a question.</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> send_button:
    user_input = st.session_state.user_input.strip()  <span class="hljs-comment"># Ensure the input is not empty or just whitespace    </span>
    <span class="hljs-keyword">if</span> user_input:
        context = <span class="hljs-string">""</span>
        <span class="hljs-comment"># Retrieve relevant context from ChromaDB</span>
        <span class="hljs-keyword">try</span>:
            search_results = vector_db.similarity_search(user_input, k=<span class="hljs-number">3</span>)
            <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> search_results:
                context += result.page_content + <span class="hljs-string">"\n\n"</span>
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            st.error(<span class="hljs-string">f"Error retrieving context from ChromaDB: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
<p><strong>What This Does:</strong></p>
<ul>
<li><p><strong>Similarity Search:</strong> Retrieves the top 3 relevant document chunks from ChromaDB.</p>
</li>
<li><p><strong>Contextual Response:</strong> Uses retrieved content to generate a precise, context-aware answer.</p>
</li>
</ul>
<hr />
<h3 id="heading-now-we-will-be-able-to-chat-with-our-banking-assistant-freely">Now we can chat freely with our banking assistant</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1740252953817/aac08ae9-389c-411e-98cd-3641a6107dee.gif" alt class="image--center mx-auto" /></p>
<h2 id="heading-whole-code-snippet-with-rag">Whole code snippet with RAG</h2>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> tempfile
<span class="hljs-keyword">from</span> os <span class="hljs-keyword">import</span> getenv

<span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> LLMChain
<span class="hljs-keyword">from</span> langchain.prompts <span class="hljs-keyword">import</span> PromptTemplate
<span class="hljs-keyword">from</span> langchain.memory <span class="hljs-keyword">import</span> ConversationBufferMemory
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader
<span class="hljs-keyword">from</span> langchain_community.embeddings <span class="hljs-keyword">import</span> HuggingFaceEmbeddings
<span class="hljs-keyword">from</span> langchain_community.vectorstores <span class="hljs-keyword">import</span> Chroma
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter

<span class="hljs-comment"># Constants</span>
PROMPT_TEMPLATE = <span class="hljs-string">"""
You are a banking assistant specializing in answering FAQs about loans, interest rates, and general banking services.
If the user greets, respond with a greeting. If the user asks a question, provide an answer.
Use the following context too for answering questions:

{context}

Conversation History: 
{history}

---


Answer the question based on the above context: {query}

"""</span>

CHROMA_DB_DIR = <span class="hljs-string">"chroma"</span>
LANGDB_API_URL = <span class="hljs-string">"https://api.us-east-1.langdb.ai/your-project-id/v1"</span>  <span class="hljs-comment"># Replace with your LANGDB project id</span>
os.environ[<span class="hljs-string">"LANGDB_API_KEY"</span>] = <span class="hljs-string">"your-api-key"</span>

st.set_page_config(page_title=<span class="hljs-string">"Banking Assistant"</span>, layout=<span class="hljs-string">"wide"</span>)
st.title(<span class="hljs-string">"Banking FAQ Assistant"</span>)
st.write(<span class="hljs-string">"Ask questions about banking services, loan options, and interest rates!"</span>)

<span class="hljs-comment"># Initialize ChromaDB and Embeddings</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">initialize_chromadb</span>():</span>
    embeddings = HuggingFaceEmbeddings(model_name=<span class="hljs-string">"sentence-transformers/all-MiniLM-L6-v2"</span>, model_kwargs={<span class="hljs-string">'device'</span>: <span class="hljs-string">'cpu'</span>})
    vector_store = Chroma(persist_directory=CHROMA_DB_DIR, embedding_function=embeddings)
    <span class="hljs-keyword">return</span> vector_store

<span class="hljs-comment"># Initialize ChromaDB and LangChain LLM</span>
vector_db = initialize_chromadb()
<span class="hljs-comment"># Initialize LangChain LLM</span>
llm = ChatOpenAI(
    base_url=LANGDB_API_URL,
    api_key=getenv(<span class="hljs-string">"LANGDB_API_KEY"</span>),
    model=<span class="hljs-string">"gpt-3.5-turbo"</span>,  <span class="hljs-comment"># Replace with the specific model name you are using</span>
    timeout=<span class="hljs-number">10</span>  <span class="hljs-comment"># Add a timeout of 10 seconds</span>
)

<span class="hljs-comment"># Memory for conversation history</span>
memory = ConversationBufferMemory(
    memory_key=<span class="hljs-string">"history"</span>,
    return_messages=<span class="hljs-literal">True</span>,
    input_key=<span class="hljs-string">"query"</span>,
)

<span class="hljs-comment"># Prompt Template for LangChain</span>
prompt_template = PromptTemplate(
    input_variables=[<span class="hljs-string">"context"</span>, <span class="hljs-string">"history"</span>, <span class="hljs-string">"query"</span>],
    template=PROMPT_TEMPLATE
)

<span class="hljs-comment"># LangChain LLM Chain</span>
chain = LLMChain(llm=llm, prompt=prompt_template, memory=memory)

st.sidebar.title(<span class="hljs-string">"Options"</span>)
uploaded_file = st.sidebar.file_uploader(<span class="hljs-string">"Upload PDF"</span>, type=<span class="hljs-string">"pdf"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_pdf</span>(<span class="hljs-params">file</span>):</span>
    <span class="hljs-keyword">with</span> tempfile.TemporaryDirectory() <span class="hljs-keyword">as</span> temp_dir:
        temp_file_path = os.path.join(temp_dir, file.name)
        <span class="hljs-keyword">with</span> open(temp_file_path, <span class="hljs-string">"wb"</span>) <span class="hljs-keyword">as</span> temp_file:
            temp_file.write(file.getbuffer())

        pdf_loader = PyPDFLoader(temp_file_path)
        documents = pdf_loader.load()

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">800</span>, chunk_overlap=<span class="hljs-number">80</span>, length_function=len)
        chunks = text_splitter.split_documents(documents)

        <span class="hljs-keyword">return</span> chunks

<span class="hljs-keyword">if</span> uploaded_file:
    user_chunks = process_pdf(uploaded_file)
    vector_db.add_documents(user_chunks)
    st.sidebar.success(<span class="hljs-string">f"Processed <span class="hljs-subst">{len(user_chunks)}</span> chunks from uploaded PDF."</span>)

<span class="hljs-comment"># Chatbox implementation</span>
st.subheader(<span class="hljs-string">"Chatbox"</span>)

<span class="hljs-comment"># Container for chat messages</span>
chat_container = st.container()

<span class="hljs-comment"># Function to display chat messages</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_message</span>(<span class="hljs-params">message, is_user=True</span>):</span>
    <span class="hljs-keyword">if</span> is_user:
        chat_container.markdown(<span class="hljs-string">f"&lt;div style='text-align: right; padding: 10px; border-radius: 10px; margin: 5px;'&gt;<span class="hljs-subst">{message}</span>&lt;/div&gt;"</span>, unsafe_allow_html=<span class="hljs-literal">True</span>)
    <span class="hljs-keyword">else</span>:
        chat_container.markdown(<span class="hljs-string">f"&lt;div style='text-align: left; padding: 10px; border-radius: 10px; margin: 5px;'&gt;<span class="hljs-subst">{message}</span>&lt;/div&gt;"</span>, unsafe_allow_html=<span class="hljs-literal">True</span>)

<span class="hljs-comment"># Initialize chat history in session state</span>
<span class="hljs-keyword">if</span> <span class="hljs-string">"messages"</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> st.session_state:
    st.session_state.messages = []

<span class="hljs-comment"># Display chat history</span>
<span class="hljs-keyword">with</span> chat_container:
    <span class="hljs-keyword">for</span> chat <span class="hljs-keyword">in</span> st.session_state.messages:
        display_message(chat[<span class="hljs-string">'content'</span>], is_user=chat[<span class="hljs-string">'is_user'</span>])

<span class="hljs-comment"># User Input Section</span>
user_input = st.text_input(<span class="hljs-string">"Enter your query:"</span>, key=<span class="hljs-string">"user_input"</span>)
send_button = st.button(<span class="hljs-string">"Send"</span>)

<span class="hljs-keyword">if</span> send_button:
    user_input = st.session_state.user_input.strip()  <span class="hljs-comment"># Ensure the input is not empty or just whitespace</span>
    <span class="hljs-keyword">if</span> user_input:
        context = <span class="hljs-string">""</span>
        <span class="hljs-comment"># Retrieve relevant context from ChromaDB</span>
        <span class="hljs-keyword">try</span>:
            search_results = vector_db.similarity_search(user_input, k=<span class="hljs-number">3</span>)
            <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> search_results:
                context += result.page_content + <span class="hljs-string">"\n\n"</span>
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            st.error(<span class="hljs-string">f"Error retrieving context from ChromaDB: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">try</span>:
            response = chain.run(context=context, query=user_input)
            <span class="hljs-comment"># Update conversation memory</span>
            st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: user_input, <span class="hljs-string">"is_user"</span>:<span class="hljs-literal">True</span>})
            st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response, <span class="hljs-string">"is_user"</span>:<span class="hljs-literal">False</span>})
            st.rerun()
        <span class="hljs-keyword">except</span> requests.exceptions.Timeout:
            st.error(<span class="hljs-string">"The request to the LLM timed out. Please try again."</span>)
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            st.error(<span class="hljs-string">f"Error generating response: <span class="hljs-subst">{e}</span>"</span>)
    <span class="hljs-keyword">else</span>:
        st.warning(<span class="hljs-string">"Please enter a valid query."</span>)
</code></pre>
<h2 id="heading-final-thoughts-smarter-banking-faq-chatbot-with-rag"><strong>Final Thoughts: Smarter Banking FAQ Chatbot with RAG</strong></h2>
<p>With ChromaDB integrated, our chatbot can now answer questions based on uploaded documents, ensuring accurate, contextually relevant responses. This powerful RAG pipeline makes the chatbot adaptable for real-world banking use cases.</p>
<p>💡 <strong>Key Takeaways:</strong></p>
<ol>
<li><p><strong>Enhanced Accuracy:</strong> Queries are answered based on real-time context from documents.</p>
</li>
<li><p><strong>Efficient Retrieval:</strong> ChromaDB ensures fast and relevant search results.</p>
</li>
<li><p><strong>Seamless User Experience:</strong> Users receive precise answers without delays.</p>
</li>
</ol>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Bonus:</strong> Get started with the complete <a target="_self" href="https://github.com/langdb/langdb-samples/blob/main/examples/langchain/langchain-rag-bot/main.py">source code</a> and experiment with LangChain’s advanced features!</div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Don’t forget to star our GitHub repo!</strong></div>
</div>]]></content:encoded></item><item><title><![CDATA[AI-Powered Banking Chatbot: Build with LangChain, LangDB.ai & RAG (Part 1)]]></title><description><![CDATA[In the fast-paced world of AI innovation, crafting a Chat Assistant isn’t just about coding - it’s about engineering an intelligent ecosystem that delivers dynamic responses, integrates seamlessly with vector databases, and maintains conversational m...]]></description><link>https://blog.langdb.ai/ai-powered-banking-chatbot-build-with-langchain-langdbai-and-rag-part-1</link><guid isPermaLink="true">https://blog.langdb.ai/ai-powered-banking-chatbot-build-with-langchain-langdbai-and-rag-part-1</guid><category><![CDATA[Banking Bot]]></category><category><![CDATA[langchain]]></category><category><![CDATA[LangDB]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[vector database]]></category><category><![CDATA[openai]]></category><category><![CDATA[Deepseek]]></category><category><![CDATA[chromadb]]></category><category><![CDATA[streamlit]]></category><dc:creator><![CDATA[Dishant Gandhi]]></dc:creator><pubDate>Wed, 19 Feb 2025 13:32:19 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739815385854/42ac04e7-e677-4c49-86c1-b7136ea89ea3.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the fast-paced world of AI innovation, crafting a Chat Assistant isn’t just about coding - it’s about engineering an intelligent ecosystem that delivers dynamic responses, integrates seamlessly with vector databases, and maintains conversational memory for enhanced user interactions. Today, we’re diving into how you can build a <strong>LangChain-powered RAG (Retrieval-Augmented Generation) Conversational AI</strong> using <strong>Streamlit, LangChain,</strong> <a target="_blank" href="https://langdb.ai/"><strong>LangDB</strong></a><strong>, and ChromaDB</strong>.</p>
<p><strong>This is just the beginning!</strong> Stay tuned for this two-part series, where we will guide you step-by-step in building a robust AI-powered Chat Assistant. And yes—you get a <strong>free starter pack</strong> with all source code ready to go!</p>
<h2 id="heading-whats-on-our-agenda">What’s on Our Agenda?</h2>
<p>Here’s a sneak peek at what we’ll cover today:</p>
<ul>
<li><p><strong>Installation and Setup</strong>: Get your environment ready with the necessary dependencies.</p>
</li>
<li><p><strong>Building a Simple Chatbot</strong>: Integrate <strong>LangChain, LangDB.ai, and ChromaDB</strong> for intelligent responses.</p>
</li>
<li><p><strong>Adding Memory &amp; Enhancing User Experience</strong>: Implement <strong>conversation history</strong> for a more natural flow.</p>
</li>
<li><p><strong>Deploying with Streamlit</strong>: Run your chatbot with an intuitive UI.</p>
</li>
</ul>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Find all source code and starter pack by clicking </strong><a target="_self" href="https://github.com/langdb/langdb-samples"><strong>here</strong></a><strong>.</strong></div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Don’t forget to star us⭐</strong></div>
</div>

<p>Alternatively, you can follow our <strong>YouTube tutorial</strong></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/Q-TnQ23RgMU">https://youtu.be/Q-TnQ23RgMU</a></div>
<hr />
<h2 id="heading-whats-the-theme-of-our-ai">What’s the Theme of Our AI?</h2>
<p>To keep things practical, we are building a Banking FAQ Assistant chatbot that answers user queries about loan options, interest rates, and general banking FAQs.</p>
<p>🏦 <strong>Conversational AI Theme</strong>: Banking FAQ &amp; Loan Inquiry Bot</p>
<p><strong>Capabilities:</strong></p>
<ul>
<li><p>Answer frequently asked questions about banking services</p>
</li>
<li><p>Provide details on various loan types and interest rates</p>
</li>
<li><p>Retain conversational memory for personalized banking guidance</p>
</li>
</ul>
<hr />
<h2 id="heading-step-by-step-guide-to-get-started">Step-by-Step Guide to Get Started</h2>
<h3 id="heading-installation-and-setup"><strong>Installation and Setup</strong></h3>
<p>Before we dive in, let’s set up our development environment.</p>
<h3 id="heading-install-dependencies"><strong>Install Dependencies</strong></h3>
<p>Ensure you have Python installed, then proceed with the following:</p>
<pre><code class="lang-bash">pip install streamlit langchain openai langchain-community requests
</code></pre>
<hr />
<h2 id="heading-building-the-langchain-conversational-ai"><strong>Building the LangChain Conversational AI</strong></h2>
<h3 id="heading-setting-up-the-core-components"><strong>Setting Up the Core Components</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> LLMChain
<span class="hljs-keyword">from</span> langchain.prompts <span class="hljs-keyword">import</span> PromptTemplate
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain.memory <span class="hljs-keyword">import</span> ConversationBufferMemory
<span class="hljs-keyword">import</span> requests
</code></pre>
<p>Here, we import the necessary modules:</p>
<ul>
<li><p><strong>Streamlit</strong> for the assistant’s UI.</p>
</li>
<li><p><strong>LangChain components</strong> (LLMChain, PromptTemplate) to manage the AI model and prompts.</p>
</li>
<li><p><strong>ChatOpenAI</strong> as the OpenAI-compatible client for routing requests through the LangDB.ai AI Gateway.</p>
</li>
<li><p><strong>ConversationBufferMemory</strong> for maintaining chat history.</p>
</li>
</ul>
<hr />
<h3 id="heading-load-environment-variables">Load Environment Variables</h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os

os.environ[<span class="hljs-string">"LANGDB_API_KEY"</span>] = <span class="hljs-string">"your-langdb-api-key"</span>
</code></pre>
<ul>
<li><p>You will also need your <strong>project-id</strong> in a later step.</p>
</li>
<li><p>Since we are using LangDB, set <strong>LANGDB_API_KEY</strong> to your LangDB API key instead of setting <strong>OPENAI_API_KEY</strong>.</p>
</li>
</ul>
<h3 id="heading-creating-the-prompt-template"><strong>Creating the Prompt Template</strong></h3>
<pre><code class="lang-python">PROMPT_TEMPLATE = <span class="hljs-string">"""
You are a banking assistant specializing in answering FAQs about loans, interest rates, and general banking services.
If the user greets, respond with a greeting. If the user asks a question, provide an answer.
Use the following context too for answering questions:

{context}

Conversation History: 
{history}

---

Answer the question based on the above context: {query}
"""</span>
</code></pre>
<p>The Prompt Template provides structure to the assistant’s responses:</p>
<ul>
<li><p>It greets users when necessary.</p>
</li>
<li><p>It uses contextual memory to fetch relevant banking information.</p>
</li>
<li><p>It provides structured responses based on the query and available context.</p>
</li>
</ul>
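<p>As a quick sanity check, you can render the template locally before wiring it into LangChain. The sketch below uses plain <code>str.format</code> in place of <code>PromptTemplate</code>; the sample context, history, and query are made up for illustration:</p>

```python
# Minimal sketch: fill the template's three placeholders with str.format.
# (LangChain's PromptTemplate performs the same substitution at runtime.)
PROMPT_TEMPLATE = """
You are a banking assistant specializing in answering FAQs about loans, interest rates, and general banking services.
If the user greets, respond with a greeting. If the user asks a question, provide an answer.
Use the following context too for answering questions:

{context}

Conversation History:
{history}

---

Answer the question based on the above context: {query}
"""

rendered = PROMPT_TEMPLATE.format(
    context="Home loans start at 6.5% APR.",  # hypothetical context snippet
    history="User: Hi\nAssistant: Hello, how may I help you?",
    query="What are your home loan rates?",
)
print(rendered)
```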
<hr />
<h3 id="heading-initializing-the-core-ai-components"><strong>Initializing the Core AI Components</strong></h3>
<pre><code class="lang-python">LANGDB_API_URL = <span class="hljs-string">"https://api.us-east-1.langdb.ai/your-project-id/v1"</span>

llm = ChatOpenAI(
    base_url=LANGDB_API_URL,
    api_key=getenv(<span class="hljs-string">"LANGDB_API_KEY"</span>),
    model=<span class="hljs-string">"gpt-4o-mini"</span>,  <span class="hljs-comment"># Replace with the specific model name you are using</span>
    timeout=<span class="hljs-number">10</span>  <span class="hljs-comment"># Add a timeout of 10 seconds</span>
)

memory = ConversationBufferMemory(
    memory_key=<span class="hljs-string">"history"</span>, 
    return_messages=<span class="hljs-literal">True</span>,
    input_key=<span class="hljs-string">"query"</span>,
)
</code></pre>
<p>Here’s what each component does:</p>
<ul>
<li><p><strong>ChatOpenAI (</strong><code>llm</code>): The chat model client, configured to send requests through the LangDB AI Gateway.</p>
</li>
<li><p><strong>Memory (</strong><code>ConversationBufferMemory</code>): Retains chat history for continuity.</p>
</li>
</ul>
<p>Replace <strong><em>your-project-id</em></strong> with your actual project ID. Follow the GIF below to fetch it:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739945911570/53271703-90c7-4b42-8aa2-d68779433bca.gif" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-building-the-chatbot-chain"><strong>Building the Chatbot Chain</strong></h3>
<pre><code class="lang-python">prompt_template = PromptTemplate(
    input_variables=[<span class="hljs-string">"context"</span>, <span class="hljs-string">"history"</span>, <span class="hljs-string">"query"</span>],
    template=PROMPT_TEMPLATE
)

chain = LLMChain(llm=llm, prompt=prompt_template, memory=memory)
</code></pre>
<p>Here we chain the model with the prompt template and memory, allowing it to generate responses dynamically.</p>
<hr />
<h2 id="heading-deploying-with-streamlit"><strong>Deploying with Streamlit</strong></h2>
<h3 id="heading-setting-up-the-ui"><strong>Setting Up the UI</strong></h3>
<pre><code class="lang-python">st.set_page_config(page_title=<span class="hljs-string">"Banking Assistant"</span>, layout=<span class="hljs-string">"wide"</span>)
st.title(<span class="hljs-string">"Banking FAQ Assistant"</span>)
st.write(<span class="hljs-string">"Ask questions about banking services, loan options, and interest rates!"</span>)
</code></pre>
<p>The code above sets up our Streamlit UI with a title and a short description.</p>
<hr />
<h3 id="heading-handling-user-queries"><strong>Handling User Queries</strong></h3>
<pre><code class="lang-python">user_input = st.text_input(<span class="hljs-string">"Enter your query:"</span>)
send_button = st.button(<span class="hljs-string">"Send"</span>)
</code></pre>
<p>Users can input their banking questions, and responses are triggered by clicking the Send button.</p>
<hr />
<h3 id="heading-processing-the-query"><strong>Processing the Query</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> send_button:
    <span class="hljs-keyword">if</span> user_input:
        <span class="hljs-keyword">try</span>:
            context = <span class="hljs-string">""</span>
            response = chain.run(context=context, query=user_input)

            st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: user_input, <span class="hljs-string">"is_user"</span>:<span class="hljs-literal">True</span>})
            st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response, <span class="hljs-string">"is_user"</span>:<span class="hljs-literal">False</span>})
            st.rerun()
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            st.error(<span class="hljs-string">f"Error generating response: <span class="hljs-subst">{e}</span>"</span>)
    <span class="hljs-keyword">else</span>:
        st.warning(<span class="hljs-string">"Please enter a valid query."</span>)
</code></pre>
<p>The snippet above does the following:</p>
<ul>
<li><p>Generates a response using LangChain.</p>
</li>
<li><p>Updates chat history in Streamlit’s session state.</p>
</li>
</ul>
<p>Let’s look at the complete code:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> os <span class="hljs-keyword">import</span> getenv

<span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> LLMChain
<span class="hljs-keyword">from</span> langchain.prompts <span class="hljs-keyword">import</span> PromptTemplate
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain.memory <span class="hljs-keyword">import</span> ConversationBufferMemory
<span class="hljs-keyword">import</span> requests

<span class="hljs-comment"># Constants</span>
PROMPT_TEMPLATE = <span class="hljs-string">"""
You are a banking assistant specializing in answering FAQs about loans, interest rates, and general banking services.
If the user greets, respond with a greeting. If the user asks a question, provide an answer.
Use the following context too for answering questions:

{context}

Conversation History: 
{history}

---


Answer the question based on the above context: {query}

"""</span>

LANGDB_API_URL = <span class="hljs-string">"https://api.us-east-1.langdb.ai/your-project-id/v1"</span>  <span class="hljs-comment"># Replace with your LangDB project id</span>
os.environ[<span class="hljs-string">"LANGDB_API_KEY"</span>] = <span class="hljs-string">"your-api-key"</span>

st.set_page_config(page_title=<span class="hljs-string">"Banking Assistant"</span>, layout=<span class="hljs-string">"wide"</span>)
st.title(<span class="hljs-string">"Banking FAQ Assistant"</span>)
st.write(<span class="hljs-string">"Ask questions about banking services, loan options, and interest rates!"</span>)

<span class="hljs-comment"># Initialize LangChain LLM</span>
llm = ChatOpenAI(
    base_url=LANGDB_API_URL,
    api_key=getenv(<span class="hljs-string">"LANGDB_API_KEY"</span>),
    model=<span class="hljs-string">"gpt-4o-mini"</span>,  <span class="hljs-comment"># Replace with the specific model name you are using</span>
    timeout=<span class="hljs-number">10</span>  <span class="hljs-comment"># Add a timeout of 10 seconds</span>
)

<span class="hljs-comment"># Memory for conversation history</span>
memory = ConversationBufferMemory(
    memory_key=<span class="hljs-string">"history"</span>, 
    return_messages=<span class="hljs-literal">True</span>,
    input_key=<span class="hljs-string">"query"</span>,
)

<span class="hljs-comment"># Prompt Template for LangChain</span>
prompt_template = PromptTemplate(
    input_variables=[<span class="hljs-string">"context"</span>, <span class="hljs-string">"history"</span>, <span class="hljs-string">"query"</span>],
    template=PROMPT_TEMPLATE
)

<span class="hljs-comment"># LangChain LLM Chain</span>
chain = LLMChain(llm=llm, prompt=prompt_template, memory=memory)

<span class="hljs-comment"># Chatbox implementation</span>
st.subheader(<span class="hljs-string">"Chatbox"</span>)

<span class="hljs-comment"># Container for chat messages</span>
chat_container = st.container()

<span class="hljs-comment"># Function to display chat messages</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_message</span>(<span class="hljs-params">message, is_user=True</span>):</span>
    <span class="hljs-keyword">if</span> is_user:
        chat_container.markdown(<span class="hljs-string">f"&lt;div style='text-align: right; padding: 10px; border-radius: 10px; margin: 5px;'&gt;<span class="hljs-subst">{message}</span>&lt;/div&gt;"</span>, unsafe_allow_html=<span class="hljs-literal">True</span>)
    <span class="hljs-keyword">else</span>:
        chat_container.markdown(<span class="hljs-string">f"&lt;div style='text-align: left; padding: 10px; border-radius: 10px; margin: 5px;'&gt;<span class="hljs-subst">{message}</span>&lt;/div&gt;"</span>, unsafe_allow_html=<span class="hljs-literal">True</span>)

<span class="hljs-comment"># Initialize chat history in session state</span>
<span class="hljs-keyword">if</span> <span class="hljs-string">"messages"</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> st.session_state:
    st.session_state.messages = []

<span class="hljs-comment"># Display chat history</span>
<span class="hljs-keyword">with</span> chat_container:
    <span class="hljs-keyword">for</span> chat <span class="hljs-keyword">in</span> st.session_state.messages:
        display_message(chat[<span class="hljs-string">'content'</span>], is_user=chat[<span class="hljs-string">'is_user'</span>])

<span class="hljs-comment"># User Input Section</span>
user_input = st.text_input(<span class="hljs-string">"Enter your query:"</span>, key=<span class="hljs-string">"user_input"</span>)
send_button = st.button(<span class="hljs-string">"Send"</span>)

<span class="hljs-keyword">if</span> send_button:
    user_input = st.session_state.user_input.strip()  <span class="hljs-comment"># Ensure the input is not empty or just whitespace</span>
    <span class="hljs-keyword">if</span> user_input:
        <span class="hljs-keyword">try</span>:
            context = <span class="hljs-string">""</span> <span class="hljs-comment"># to be used in next tutorial</span>
            response = chain.run(context=context, query=user_input)
            <span class="hljs-comment"># Update conversation memory</span>
            st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: user_input, <span class="hljs-string">"is_user"</span>:<span class="hljs-literal">True</span>})
            st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response, <span class="hljs-string">"is_user"</span>:<span class="hljs-literal">False</span>})
            st.rerun()
        <span class="hljs-keyword">except</span> requests.exceptions.Timeout:
            st.error(<span class="hljs-string">"The request to the LLM timed out. Please try again."</span>)
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            st.error(<span class="hljs-string">f"Error generating response: <span class="hljs-subst">{e}</span>"</span>)
    <span class="hljs-keyword">else</span>:
        st.warning(<span class="hljs-string">"Please enter a valid query."</span>)
</code></pre>
<hr />
<h2 id="heading-final-thoughts-scale-your-ai-with-rag"><strong>Final Thoughts: Scale Your AI with RAG!</strong></h2>
<p>Building a Banking FAQ chatbot with LangChain and LangDB enables users to access essential banking information effortlessly. By integrating memory and contextual awareness, this conversational AI delivers clear and helpful responses.</p>
<p>🚀 What’s Next? In Part 2 of this series, we’ll dive into <strong>Building a RAG pipeline</strong> for more refined banking FAQs.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong>Bonus:</strong> Get started for free with the complete source code and experiment with LangChain’s advanced features! <strong>Don’t forget to star</strong>⭐<strong> our </strong><a target="_self" href="https://github.com/langdb/langdb-samples"><strong>GitHub repo</strong></a><strong>!</strong></div>
</div>]]></content:encoded></item><item><title><![CDATA[How to Build an Epic Conversational AI with LangDB.ai and Rasa-pro (and Get $10 Free Credit!)]]></title><description><![CDATA[In the ever-evolving landscape of AI analytical innovations, building a next-level chatbot is more than just coding, it’s about creating an ecosystem of AI integration solutions that drive enterprise AI governance, scaling AI, and superior AI observa...]]></description><link>https://blog.langdb.ai/how-to-build-an-epic-conversational-ai-with-langdbai-and-rasa-pro-and-get-10-free-credit</link><guid isPermaLink="true">https://blog.langdb.ai/how-to-build-an-epic-conversational-ai-with-langdbai-and-rasa-pro-and-get-10-free-credit</guid><category><![CDATA[rasa]]></category><category><![CDATA[llm]]></category><category><![CDATA[Deepseek]]></category><category><![CDATA[openai]]></category><category><![CDATA[chatbot]]></category><category><![CDATA[conversational-ai]]></category><category><![CDATA[LangDB]]></category><dc:creator><![CDATA[Dishant Gandhi]]></dc:creator><pubDate>Wed, 12 Feb 2025 12:02:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739426421251/447b2688-f9f2-4eee-b664-814633e7c83b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the ever-evolving landscape of AI analytical innovations, building a next-level chatbot is more than just coding, it’s about creating an ecosystem of AI integration solutions that drive enterprise AI governance, scaling AI, and superior AI observability. Today, we’re diving deep into how you can merge the formidable powers of <a target="_blank" href="https://langdb.ai">LangDB</a> and Rasa-pro to create a conversational AI system that’s both smart and scalable—all while enjoying a $10 free credit to kickstart your journey for free!</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Find all source code and starter pack by clicking <a target="_self" href="http://github.com/langdb/rasa-starter-pack">here</a>.</div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Don’t forget to star us⭐</div>
</div>

<h2 id="heading-alternatively-you-can-also-follow-our-youtube-tutorial">Alternatively, you can follow our YouTube tutorial</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/Se_pEedNe_4">https://youtu.be/Se_pEedNe_4</a></div>
<h3 id="heading-whats-the-buzz-about-langdb-ai-gateway">What’s the Buzz About LangDB AI Gateway?</h3>
<p>LangDB AI Gateway is a powerful AI platform designed to help enterprises securely govern, optimize, and scale their AI solutions. As an AI gateway, it provides a seamless connection with over 150 large language models (LLMs) using OpenAI-compatible APIs.</p>
<p>By offering AI governance, cost efficiency, and enterprise AI solutions, LangDB ensures businesses can deploy AI models with enhanced security, performance, and reliability.</p>
<p>With LangDB, organizations benefit from AI as a service, enabling smooth AI for business intelligence operations while optimizing resources through intelligent model routing and observability. Whether you're building AI software solutions or integrating AI for enterprises, LangDB AI Gateway is the best artificial intelligence platform to streamline AI deployments at scale.</p>
<h3 id="heading-what-is-rasa">What is Rasa?</h3>
<p>Rasa is a leading open-source framework for building conversational AI solutions. It is celebrated for its ability to empower developers with <strong>AI governance tools</strong> and <strong>ai management</strong> capabilities that streamline the process of creating engaging, intelligent chatbots. By incorporating Rasa into your workflow, you gain access to advanced policies and pipelines—ideal for <strong>enterprise AI governance</strong> and <strong>AI integration</strong>—that help manage and scale AI applications effectively.</p>
<h3 id="heading-whats-on-our-agenda">What’s on Our Agenda?</h3>
<p>Here’s a sneak peek into the roadmap of our ultimate conversational AI guide:</p>
<ul>
<li><p><strong>Installation and Setup:</strong> Learn how to create a Conda environment and install all the necessary packages.</p>
</li>
<li><p><strong>Building Your Chatbot:</strong> Discover how to add engaging flows, configure responses, and fine-tune your Rasa project.</p>
</li>
<li><p><strong>Integrating LangDB</strong>: Set up LangDB to bring powerful AI infrastructure tools into your project</p>
</li>
<li><p><strong>Training and Deployment:</strong> Train your model and launch your AI server for a live demo</p>
</li>
</ul>
<h2 id="heading-step-by-step-guide-to-get-started">Step-by-Step Guide to Get Started</h2>
<h3 id="heading-installation">Installation</h3>
<p>Kick off by setting up a dedicated Conda environment to keep your project clean and dependencies in check:</p>
<ul>
<li><p>Download and Install <a target="_blank" href="https://www.anaconda.com/download/">Miniconda</a>.</p>
</li>
<li><p>Once installed, follow the steps below to install Rasa in your environment.</p>
</li>
<li><p>Create your conda environment with</p>
</li>
</ul>
<pre><code class="lang-bash">conda create -n rasa-env python=3.10
</code></pre>
<ul>
<li>Activate your conda environment</li>
</ul>
<pre><code class="lang-bash">conda activate rasa-env
</code></pre>
<ul>
<li><strong>Turbocharge your setup:</strong> speed up installation by installing <code>uv</code>:</li>
</ul>
<pre><code class="lang-bash">pip install uv
</code></pre>
<ul>
<li>Now install Rasa-pro:</li>
</ul>
<pre><code class="lang-bash">uv pip install rasa-pro --extra-index-url=https://europe-west3-python.pkg.dev/rasa-releases/rasa-pro-python/simple/
</code></pre>
<h2 id="heading-setting-up-your-rasa-project">Setting Up Your Rasa Project</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Omit this step if you already have Rasa project and Rasa-pro key</div>
</div>

<p>If you’re new to Rasa or need a Rasa-pro license key, no sweat: grab yours from Rasa’s <a target="_blank" href="https://rasa.com/docs/rasa-pro/developer-edition">developer portal</a> and set it in your environment.</p>
<p>You will receive an email from Rasa containing your license key, similar to the screenshot below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738918314536/ff1b83ad-b351-4dcb-b40b-600f8cfb7105.png" alt class="image--center mx-auto" /></p>
<p>Open your terminal and activate your conda environment <code>rasa-env</code></p>
<p>In the same terminal, set your Rasa Pro license as follows (use <code>export</code> instead of <code>set</code> on macOS/Linux):</p>
<pre><code class="lang-bash"><span class="hljs-built_in">set</span> RASA_PRO_LICENSE=your-key
</code></pre>
<p>Then, initialize your Rasa CALM project:</p>
<pre><code class="lang-bash">rasa init --template calm
</code></pre>
<p>This command scaffolds your project structure, priming it for some serious AI integration and AI application governance.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738918642976/59247e73-1c58-4a88-80cd-f3e3bb7f8929.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-building-your-rasa-chatbot-nerd-out-with-cool-flows-and-responses">Building Your Rasa Chatbot: Nerd Out with Cool Flows and Responses</h2>
<h3 id="heading-adding-flows">Adding flows</h3>
<ul>
<li>Let’s create a flow that greets users with some serious nerd cred. In the <code>data/flows</code> folder, create a file called <code>greet.yml</code>:</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-attr">flows:</span>
  <span class="hljs-attr">greet:</span>
    <span class="hljs-attr">description:</span> <span class="hljs-string">always</span> <span class="hljs-string">run</span> <span class="hljs-string">when</span> <span class="hljs-string">the</span> <span class="hljs-string">user</span> <span class="hljs-string">first</span> <span class="hljs-string">greets</span> <span class="hljs-string">the</span> <span class="hljs-string">bot</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">greet</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">action:</span> <span class="hljs-string">utter_greet</span>
</code></pre>
<ul>
<li>This simple yet effective flow ensures your bot springs into action the moment someone drops a “Hi.”</li>
</ul>
<h3 id="heading-crafting-witty-responses">Crafting Witty Responses</h3>
<ul>
<li>Next, update the <code>shared.yml</code> file in the domain folder to add your greeting response:</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-attr">version:</span> <span class="hljs-string">"3.1"</span>

<span class="hljs-attr">slots:</span>
  <span class="hljs-attr">return_value:</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">any</span>
    <span class="hljs-attr">mappings:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">type:</span> <span class="hljs-string">custom</span>
        <span class="hljs-attr">action:</span> <span class="hljs-string">add_contact</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">type:</span> <span class="hljs-string">custom</span>
        <span class="hljs-attr">action:</span> <span class="hljs-string">remove_contact</span>

<span class="hljs-attr">responses:</span>
  <span class="hljs-attr">utter_greet:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">text:</span> <span class="hljs-string">"Hello, how may I help you?"</span>
</code></pre>
<h3 id="heading-configuring-rasa-for-langdb-integration">Configuring Rasa for LangDB Integration</h3>
<ul>
<li>Integrate your Rasa project with LangDB by updating your <code>config.yml</code>:</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-attr">recipe:</span> <span class="hljs-string">default.v1</span>
<span class="hljs-attr">language:</span> <span class="hljs-string">en</span>
<span class="hljs-attr">pipeline:</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">SingleStepLLMCommandGenerator</span>
  <span class="hljs-attr">llm:</span>
    <span class="hljs-attr">model_group:</span> <span class="hljs-string">openai-gpt-4</span>

<span class="hljs-attr">policies:</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">FlowPolicy</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">IntentlessPolicy</span>
<span class="hljs-attr">assistant_id:</span> <span class="hljs-number">20250207</span>
</code></pre>
<ul>
<li><p>Next step is to add this model group inside <code>endpoints.yml</code> with our LangDB Integration</p>
</li>
<li><p>Add below configuration at the very end of your <code>endpoints.yml</code></p>
</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-attr">model_groups:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">id:</span> <span class="hljs-string">openai-gpt-4</span>
    <span class="hljs-attr">models:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">provider:</span> <span class="hljs-string">openai</span>
        <span class="hljs-attr">model:</span> <span class="hljs-string">gpt-4o</span>
        <span class="hljs-attr">api-base:</span> <span class="hljs-string">"https://api.us-east-1.langdb.ai/your-project-id/v1"</span>
        <span class="hljs-attr">request_timeout:</span> <span class="hljs-number">7</span>
        <span class="hljs-attr">max_tokens:</span> <span class="hljs-number">256</span>
</code></pre>
<ul>
<li><p><strong>id:</strong> your custom name for the model group (make sure it matches the one in your <code>config.yml</code>)</p>
</li>
<li><p><strong>model</strong>: add your LangDB model name</p>
</li>
<li><p>Remember to replace <code>your-project-id</code> with your actual project id to enjoy seamless AI integration and smart AI functionalities.</p>
</li>
</ul>
<p><strong>Follow below steps to fetch Project-id:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739012815936/b512b956-fbc5-4389-8006-98b6c21255c3.gif" alt class="image--center mx-auto" /></p>
<h3 id="heading-training-your-rasa-model-with-langdb">Training Your Rasa Model with LangDB</h3>
<ul>
<li><p>Before you train, set these environment variables to direct all your API calls through LangDB, ensuring robust AI operational visibility:</p>
</li>
<li><p>Use <code>set</code> (Command Prompt) or <code>$env:</code> (PowerShell) on Windows and <code>export</code> on macOS/Linux</p>
</li>
</ul>
<pre><code class="lang-bash">OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.us-east-1.langdb.ai/your-project-id/v1
RASA_PRO_LICENSE=your-rasa-pro-license-key
</code></pre>
<ul>
<li><p>Make sure to update your <strong>project-id</strong></p>
</li>
<li><p>Why have we used OPENAI_BASE_URL environment variable?</p>
<ul>
<li><p>Rasa sends all LLM queries to the OpenAI API URL by default.</p>
</li>
<li><p>Since we are using LangDB, we override <strong>OPENAI_BASE_URL</strong> with LangDB’s API URL.</p>
</li>
</ul>
</li>
<li><p>Everything is set; let’s train the model by running <code>rasa train</code></p>
</li>
</ul>
<h3 id="heading-running-our-conversational-ai">Running our Conversational AI</h3>
<ul>
<li>Time to see your creation in action! Open two terminal windows:</li>
</ul>
<pre><code class="lang-bash">OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.us-east-1.langdb.ai/your-project-id/v1
RASA_PRO_LICENSE=your-rasa-pro-license-key
</code></pre>
<ul>
<li><p>Ensure both terminals have the necessary environment variables</p>
</li>
<li><p><strong>Terminal 1:</strong> <strong>Actions server</strong> by running <code>rasa run actions</code></p>
</li>
<li><p><strong>Terminal 2:</strong> open Rasa’s interactive chat UI by running <code>rasa inspect --debug</code></p>
</li>
<li><p>Watch as your Chat UI lights up with AI integration that delivers unparalleled AI and analytics performance.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739014632119/38b7b8a4-dd10-4b83-898e-be2e0c38bee1.gif" alt class="image--center mx-auto" /></p>
<h2 id="heading-final-thoughts-nerd-out-and-optimize-your-ai">Final Thoughts: Nerd Out and Optimize Your AI!</h2>
<p>Integrating LangDB with Rasa-pro is more than just building a chatbot; it's about constructing a dynamic, analytical AI ecosystem that brings AI observability, AI governance tools, and end-to-end AI integrations to life. Whether you're diving into AI for managers, tinkering with AI in management, or simply passionate about AI integrations, this guide sets you up for success.</p>
<p><strong>Bonus Alert:</strong> Get started for Free! with a <strong>$10 free credit</strong> to experiment with these cutting-edge tools without any upfront cost! Embrace your inner tech nerd, explore the endless possibilities of AI management tools and AI infrastructure tools, and join the revolution in AI solutions observability.</p>
<p>Don’t forget to check out our <a target="_blank" href="http://github.com/langdb/rasa-starter-pack">source code and starter pack</a>, and if you love what you see, hit that star button on GitHub. Happy coding, and may your AI always be as smart, scalable, and observable as possible!</p>
]]></content:encoded></item><item><title><![CDATA[How Your LLM Interaction Data is a Goldmine for AI Innovation]]></title><description><![CDATA[If you’re deploying AI agents or LLM-powered applications in your business, you might be sitting on an untapped goldmine—one that could revolutionize your product, customer experience, and competitive edge. The secret? The raw, unstructured data gene...]]></description><link>https://blog.langdb.ai/how-your-llm-interaction-data-is-a-goldmine-for-ai-innovation</link><guid isPermaLink="true">https://blog.langdb.ai/how-your-llm-interaction-data-is-a-goldmine-for-ai-innovation</guid><category><![CDATA[llm]]></category><category><![CDATA[agents]]></category><category><![CDATA[finetuning]]></category><dc:creator><![CDATA[Matteo Pelati]]></dc:creator><pubDate>Wed, 12 Feb 2025 08:02:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739347236401/24f3be3b-251b-4623-9281-67851714a386.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you’re deploying AI agents or LLM-powered applications in your business, you might be sitting on an untapped goldmine—one that could revolutionize your product, customer experience, and competitive edge. The secret? The raw, unstructured data generated every time users interact with your AI.</p>
<p>Unlike traditional software, where user inputs are confined to buttons and dropdown menus, LLM applications invite free-form conversations. This creates a treasure trove of unfiltered insights into what users truly want, need, and struggle with. Add to this the layer of explicit feedback (e.g., thumbs up/down on responses), and you have a proprietary dataset that most companies overlook—but forward-thinking AI leaders are already leveraging to dominate their markets.</p>
<p>Here’s why this data matters, how to use it, and real-world examples to inspire your strategy.</p>
<h2 id="heading-the-goldmine-youre-sitting-on"><strong>The Goldmine You’re Sitting On</strong></h2>
<p>When users interact with an LLM-powered chatbot or a virtual assistant, they ask questions in their own words. This organic, unstructured data reveals:</p>
<ul>
<li><p><strong>Pain points</strong> (e.g., “How do I reset my device without losing data?”)</p>
</li>
<li><p><strong>Intent</strong> (e.g., “Is this product compatible with X?”)</p>
</li>
<li><p><strong>Sentiment</strong> (e.g., frustration, confusion, satisfaction)</p>
</li>
</ul>
<p>But the real magic happens when you combine this with <strong>preference data</strong>—explicit or inferred signals about what users value. This data can be captured in two powerful ways:</p>
<ol>
<li><p><strong>Direct User Feedback</strong>: (e.g., thumbs-down rating on a chatbot response)</p>
</li>
<li><p><strong>LLM-as-Analyst</strong>: Use a secondary LLM to “read” conversations and infer preferences, even when users don’t explicitly rate responses.</p>
<ul>
<li><p><em>Example</em>: An LLM flags frustration in a user’s message: <em>“I’ve tried this three times already!”</em></p>
</li>
<li><p><em>Example</em>: An LLM detects unresolved issues: <em>“Wait, that didn’t fix it.”</em></p>
</li>
</ul>
</li>
</ol>
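<p>To make this concrete, here is a tiny illustrative sketch (the function and field names are our own, not from any particular SDK) of how the two signal sources can be combined: an explicit rating, when present, takes precedence over the LLM-inferred score:</p>

```python
# Toy sketch: prefer direct user feedback; fall back to LLM-inferred sentiment.
# Both scores use a hypothetical 1-5 scale (1 = very negative).
def effective_score(direct_rating, inferred_score):
    """Return the preference signal to aggregate on."""
    return direct_rating if direct_rating is not None else inferred_score

# A user who clicked thumbs-down (rating 1) overrides any inference;
# a silent but frustrated user falls back to the inferred score.
print(effective_score(1, 4))     # explicit rating wins
print(effective_score(None, 2))  # inferred sentiment used
```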
<p>Together, these datasets let you:</p>
<ol>
<li><p><strong>Identify patterns</strong> (What do users <em>actually</em> care about?)</p>
</li>
<li><p><strong>Improve your AI</strong> (Fine-tune models to fix weaknesses)</p>
</li>
<li><p><strong>Drive business decisions</strong> (Which features are missing? Where is documentation lacking?)</p>
</li>
</ol>
<h2 id="heading-how-it-works-combining-direct-feedback-llm-inferred-signals"><strong>How It Works: Combining Direct Feedback + LLM-Inferred Signals</strong></h2>
<p>Let’s break down a real conversation and its structured analytical output:</p>
<p><strong>Raw Chat</strong>:</p>
<p><em>User</em>: “Your instructions for resetting the AirPurifier Pro didn’t work. The red light is still on. What now?”</p>
<p><em>Chatbot</em>: “Try unplugging the device for 60 seconds.”</p>
<p><em>User</em>: “Still no luck. This is a waste of time.”</p>
<p><strong>Structured Data Extracted</strong>:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"product"</span>: <span class="hljs-string">"AirPurifier Pro"</span>,
  <span class="hljs-attr">"issue_type"</span>: <span class="hljs-string">"Hardware Troubleshooting / Reset"</span>,
  <span class="hljs-attr">"resolution_status"</span>: <span class="hljs-string">"No"</span>,
  <span class="hljs-attr">"direct_feedback"</span>: <span class="hljs-literal">null</span>,  <span class="hljs-comment">// User didn’t click a rating</span>
  <span class="hljs-attr">"inferred_sentiment"</span>: {
    <span class="hljs-attr">"score"</span>: <span class="hljs-number">1</span>,  <span class="hljs-comment">// LLM-as-analyst assigned based on tone</span>
    <span class="hljs-attr">"triggers"</span>: [<span class="hljs-string">"“Didn’t work”"</span>, <span class="hljs-string">"“Waste of time”"</span>]
  }
}
</code></pre>
<p><strong>Why This Matters</strong>:</p>
<ul>
<li><p>Even without a direct rating, the LLM-as-analyst identified failure and frustration.</p>
</li>
<li><p>Aggregating these signals shows that “Hardware Troubleshooting / Reset” issues have a 70% unresolved rate, prompting a product team investigation.</p>
</li>
</ul>
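<p>Aggregating these structured signals is straightforward once each chat is reduced to a record. A small sketch with toy records — the counts are illustrative, chosen to reproduce the 70% figure:</p>

```python
from collections import Counter

# Toy records in the structured shape shown above; counts are illustrative.
records = (
    [{"issue_type": "Hardware Troubleshooting / Reset", "resolution_status": "No"}] * 7
    + [{"issue_type": "Hardware Troubleshooting / Reset", "resolution_status": "Yes"}] * 3
    + [{"issue_type": "Filter Replacement", "resolution_status": "Yes"}] * 4
)

total = Counter(r["issue_type"] for r in records)
unresolved = Counter(
    r["issue_type"] for r in records if r["resolution_status"] == "No"
)

# Percentage of chats per issue type that the bot failed to resolve.
unresolved_rate = {
    issue: round(100 * unresolved[issue] / total[issue]) for issue in total
}
print(unresolved_rate)
# {'Hardware Troubleshooting / Reset': 70, 'Filter Replacement': 0}
```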
<h2 id="heading-an-example-how-a-smart-devices-manufacturer-scaled-ai-powered-support"><strong>An Example: How a Smart Device Manufacturer Scaled AI-Powered Support</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739345089031/c43c812a-e539-4b78-ba32-4875502b78d2.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-phase-1-uncovering-hidden-patterns-in-chat-data"><strong>Phase 1: Uncovering Hidden Patterns in Chat Data</strong></h3>
<p>A smart device manufacturer noticed that 40% of customer support chats were being escalated to human agents. Their chatbot struggled with ambiguous queries, and user feedback averaged just 2.8/5.</p>
<h3 id="heading-step-1-post-process-chats-with-an-llm-analyst"><strong>Step 1: Post-Process Chats with an LLM Analyst</strong></h3>
<p>They ran 10,000+ historical human-operated chats through an LLM to extract structured data:</p>
<ul>
<li><p><strong>Product</strong>: Device mentioned (thermostat, camera, etc.)</p>
</li>
<li><p><strong>Issue Type</strong>: Connectivity, documentation, hardware, etc.</p>
</li>
<li><p><strong>Sentiment</strong>: Inferred frustration/confusion from phrases like <em>“This isn’t working!”</em></p>
</li>
<li><p><strong>Resolution Status</strong>: Resolved by bot? (Yes/No)</p>
</li>
</ul>
<p><strong>Sample Insights from the Dashboard</strong>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Product</td><td>Top Issue</td><td>Escalation Rate</td><td>Avg Sentiment</td></tr>
</thead>
<tbody>
<tr>
<td>SmartThermo X3</td><td>Wi-Fi Setup</td><td>55%</td><td>2.1</td></tr>
<tr>
<td>AirPurifier Pro</td><td>Filter Replacement</td><td>35%</td><td>3.4</td></tr>
<tr>
<td>SecureCam 4K</td><td>Night Vision</td><td>60%</td><td>1.9</td></tr>
</tbody>
</table>
</div><p><strong>Key Findings</strong>:</p>
<ul>
<li><p><strong>Wi-Fi Setup</strong> for thermostats had the highest escalation rate (55%) and lowest sentiment.</p>
</li>
<li><p>Users asking about <strong>filter replacements</strong> often complained about <em>“unclear instructions.”</em></p>
</li>
</ul>
<h3 id="heading-phase-2-fine-tuning-a-specialized-support-model"><strong>Phase 2: Fine-Tuning a Specialized Support Model</strong></h3>
<p>Using the structured data, they built a tailored dataset to fix their chatbot’s weaknesses.</p>
<p><strong>Step 1: Create Training Data from Failed Interactions</strong></p>
<p>For high-escalation queries like <em>“Why won’t my thermostat connect to Wi-Fi?”</em>, they paired raw chats with ideal resolutions from human agents:</p>
<p><strong>Training Example</strong>:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"prompt"</span>: <span class="hljs-string">"User: My SmartThermo X3 won’t connect to Wi-Fi. I’ve rebooted it twice. Chatbot: Have you checked your router? User: Yes, other devices work. This is so frustrating!"</span>,
  <span class="hljs-attr">"completion"</span>: <span class="hljs-string">"Apologies for the trouble! The ‘network unavailable’ error often occurs due to a firmware mismatch. Go to Settings &gt; About &gt; Check Updates. If none appear, manually download the latest firmware [here]."</span>
}
</code></pre>
<p><strong>Step 2: Fine-Tune a Domain-Specific Model</strong></p>
<p>They fine-tuned GPT-3.5 on 2,000+ examples like the above, focusing on:</p>
<ul>
<li><p><strong>Troubleshooting clarity</strong>: Step-by-step fixes instead of generic advice.</p>
</li>
<li><p><strong>Proactive solutions</strong>: Linking to firmware updates or video guides.</p>
</li>
<li><p><strong>Tone adjustments</strong>: Acknowledging frustration (e.g., <em>“I understand this is urgent—let’s fix it quickly.”</em>).</p>
</li>
</ul>
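<p>For reference, prompt/completion pairs like the one above are typically converted to chat-format JSONL before fine-tuning a chat model. A minimal sketch — the system message and file name are assumptions, and shortened text stands in for the full transcripts:</p>

```python
import json

# Shortened stand-ins for the full prompt/completion pair shown above.
examples = [
    {
        "prompt": "User: My SmartThermo X3 won’t connect to Wi-Fi. I’ve rebooted it twice.",
        "completion": "The ‘network unavailable’ error often comes from a firmware "
                      "mismatch. Go to Settings > About > Check Updates.",
    }
]

def to_chat_record(example: dict) -> dict:
    """Wrap one pair in the messages format used for chat-model fine-tuning."""
    return {
        "messages": [
            {"role": "system", "content": "You are a device-support assistant."},
            {"role": "user", "content": example["prompt"]},
            {"role": "assistant", "content": example["completion"]},
        ]
    }

# One JSON object per line: each line is a self-contained training example.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(to_chat_record(example), ensure_ascii=False) + "\n")

lines = open("train.jsonl", encoding="utf-8").read().splitlines()
print(len(lines))  # 1
```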
<h3 id="heading-phase-3-results-amp-impact"><strong>Phase 3: Results &amp; Impact</strong></h3>
<p>After deploying the fine-tuned model:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>Before Fine-Tuning</td><td>After Fine-Tuning</td></tr>
</thead>
<tbody>
<tr>
<td>Escalation Rate</td><td>40%</td><td>12%</td></tr>
<tr>
<td>Avg. Feedback Score</td><td>2.8</td><td>4.5</td></tr>
<tr>
<td>Avg. Resolution Time</td><td>8.2 mins</td><td>3.1 mins</td></tr>
</tbody>
</table>
</div><p><strong>Real Conversations Post-Fine-Tuning</strong>:</p>
<p><strong>User</strong>: <em>“Night vision on my SecureCam 4K stopped working after the update.”</em></p>
<p><strong>Old Bot</strong>: <em>“Check your settings. If issues persist, contact support.”</em></p>
<p><strong>Fine-Tuned Bot</strong>:</p>
<p>“This is a known issue with firmware v2.1. Let’s fix it now:</p>
<ol>
<li><p>Go to Settings &gt; Night Vision &gt; Toggle ‘Infrared Mode.’</p>
</li>
<li><p>If the issue persists, roll back to v2.0 [guide here].</p>
</li>
<li><p>Our team will notify you when v2.2 (fixing this bug) launches next week.”</p>
</li>
</ol>
<p><strong>User Feedback</strong>: <em>“Finally, a bot that actually solves problems!”</em> (Rated 5/5)</p>
<h3 id="heading-why-this-worked"><strong>Why This Worked</strong></h3>
<ol>
<li><p><strong>Closed-Loop Learning</strong>: Failed chats became training data.</p>
</li>
<li><p><strong>Preference Data Fusion</strong>: Used both direct ratings (“thumbs down”) and LLM-inferred frustration (“This is so frustrating!”).</p>
</li>
<li><p><strong>Business Alignment</strong>: The product team fixed recurring firmware bugs flagged by the chatbot’s analytics (e.g., night vision errors).</p>
</li>
</ol>
<h3 id="heading-your-playbook-for-success"><strong>Your Playbook for Success</strong></h3>
<ol>
<li><p><strong>Mine Raw Chats</strong>: Use an LLM to structure conversations (product, issue, sentiment).</p>
</li>
<li><p><strong>Build a “Failure Dataset”</strong>: Focus on escalated queries and low-rated responses.</p>
</li>
<li><p><strong>Fine-Tune Iteratively</strong>: Start with 500–1,000 high-impact examples.</p>
</li>
<li><p><strong>Measure Religiously</strong>: Track escalations, feedback, and resolution time.</p>
</li>
</ol>
<p>By treating user interactions as a feedback loop—not just a cost center—they turned their chatbot into a competitive asset. The result? Happier customers, fewer support tickets, and a model that keeps improving itself.</p>
]]></content:encoded></item><item><title><![CDATA[Working with MCP Servers on LangDB]]></title><description><![CDATA[In the evolving landscape of AI-powered applications, integrating multiple sources of data efficiently is key to building robust and intelligent systems. Model Context Protocol (MCP) provides a standardized framework that enables AI models to connect...]]></description><link>https://blog.langdb.ai/working-with-mcp-servers-on-langdb</link><guid isPermaLink="true">https://blog.langdb.ai/working-with-mcp-servers-on-langdb</guid><category><![CDATA[smithery]]></category><category><![CDATA[Model Context Protocol]]></category><category><![CDATA[#anthropic]]></category><category><![CDATA[openai]]></category><category><![CDATA[LangDB]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Mon, 10 Feb 2025 04:40:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738777996514/ef0ba587-5e8e-489c-bb11-9d162c86cb2f.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the evolving landscape of AI-powered applications, integrating multiple sources of data efficiently is key to building robust and intelligent systems. <strong>Model Context Protocol (MCP)</strong> provides a standardized framework that enables AI models to connect to various external services seamlessly while maintaining <strong>flexibility, security, and scalability</strong>.</p>
<p>LangDB makes it incredibly simple to work with MCP servers by enabling OpenAI-compatible API access to different LLMs and external search providers with minimal configuration.</p>
<h2 id="heading-what-is-mcp">What is MCP?</h2>
<p>MCP (Model Context Protocol) provides a framework for AI models to connect to multiple external services through a standardized protocol. With native tool integrations, MCP connects AI models to APIs, databases, local files, automation tools, and remote services. This allows developers to effortlessly integrate MCP with IDEs, business workflows, and cloud platforms, while retaining the flexibility to switch between LLM providers. This enables the creation of intelligent, multi-modal workflows where AI securely interacts with real-world data and tools.</p>
<h2 id="heading-where-is-mcp-used">Where is MCP Used?</h2>
<p>MCP is widely used across various fields to enhance AI functionality:</p>
<ul>
<li><p><strong>Data and Storage</strong>: Enables structured data management and retrieval.</p>
</li>
<li><p><strong>Cloud &amp; Infrastructure</strong>: Supports distributed computing and secure content delivery.</p>
</li>
<li><p><strong>Development Tools</strong>: Automates repository management and error tracking.</p>
</li>
<li><p><strong>Content and Search</strong>: Enhances search, indexing, and geolocation services.</p>
</li>
<li><p><strong>AI &amp; Memory</strong>: Powers vector search and machine learning applications.</p>
</li>
<li><p><strong>Productivity</strong>: Integrates AI into collaboration and document management tools.</p>
</li>
<li><p><strong>System &amp; Utilities</strong>: Improves system automation and workflow efficiency.</p>
</li>
</ul>
<p>LangDB simplifies MCP integration by sending requests to the appropriate MCP server based on the LLM's response, ensuring seamless retrieval of relevant data while enabling full tracing of interactions. This allows developers to monitor requests, analyze performance, and optimize responses efficiently.</p>
<h2 id="heading-using-mcp-with-langdb">Using MCP with LangDB</h2>
<p>By specifying an MCP server in the request, developers can ensure seamless interactions between AI models and external data sources.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
client = OpenAI(
    api_key=os.getenv(<span class="hljs-string">"LANGDB_API_KEY"</span>),
    base_url=os.getenv(<span class="hljs-string">"LANGDB_API_URL"</span>)
)
extra_body = {
    <span class="hljs-string">"mcp_servers"</span>: [
        {
            <span class="hljs-string">"server_url"</span>: <span class="hljs-string">"wss://your-mcp-server.com/ws?config=your_encoded_config"</span>,
            <span class="hljs-string">"type"</span>: <span class="hljs-string">"ws"</span>
        }
    ]
}
response = client.chat.completions.create(
    model=<span class="hljs-string">"gpt-4o-mini"</span>,
    messages=[{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"what is langdb?"</span>}],
    extra_body=extra_body
)
</code></pre>
<h3 id="heading-1-exa-search-integration">1. EXA Search Integration</h3>
<p>EXA Search is a powerful search engine that provides structured and contextualized search results. Using LangDB, we can integrate EXA Search with minimal configuration.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738777903777/60e9831a-5d4c-4112-9c4b-77cfe0f68cca.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-2-github">2. GitHub</h3>
<p>GitHub MCP Server enables seamless access to the GitHub API, allowing developers to perform file operations, repository management, search functionality, and more.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738777907924/1398cefc-bc3d-44a6-937b-2ed4ab7c6299.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-mcp-tracing-and-observability-in-langdb">MCP Tracing and Observability in LangDB</h2>
<p>One of the key benefits of using LangDB for MCP integration is full tracing and observability. With a simple change, developers can track requests, optimize response times, and analyze usage across different providers.</p>
<p>LangDB automatically manages these integrations, ensuring seamless transitions between different MCP servers without additional infrastructure overhead.</p>
<h2 id="heading-using-smithery-for-mcp-deployments"><strong>Using Smithery for MCP Deployments</strong></h2>
<p>We leveraged <strong>Smithery</strong> to streamline MCP deployments. To learn more:</p>
<ul>
<li><p>Visit the <a target="_blank" href="https://github.com/langdb/langdb-samples/tree/main/examples/mcp"><strong>LangDB Samples</strong></a> Repository for setup instructions and examples.</p>
</li>
<li><p>Check out <a target="_blank" href="https://docs.langdb.ai/integrations/smithery">Smithery</a> in the documentation.</p>
</li>
</ul>
<p>For more details, visit the <a target="_blank" href="https://modelcontextprotocol.io/introduction"><strong>Model Context Protocol official page</strong></a> and explore <a target="_blank" href="https://docs.anthropic.com/en/docs/build-with-claude/mcp"><strong>Anthropic MCP documentation</strong></a>.</p>
]]></content:encoded></item><item><title><![CDATA[Building a Specialized RAG-Based AI Agent with LangDB and LlamaIndex]]></title><description><![CDATA[Introduction
In today's GenAI-driven landscape, enterprises increasingly integrate multiple AI agents to optimize decision-making, automate workflows, and enhance productivity. A key challenge is managing diverse AI models and libraries in a seamless...]]></description><link>https://blog.langdb.ai/building-a-specialized-rag-based-ai-agent-with-langdbai-and-llamaindex</link><guid isPermaLink="true">https://blog.langdb.ai/building-a-specialized-rag-based-ai-agent-with-langdbai-and-llamaindex</guid><category><![CDATA[LangDB]]></category><category><![CDATA[LlamaIndex]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[AI Workflow]]></category><category><![CDATA[openai]]></category><category><![CDATA[ai gateway]]></category><category><![CDATA[VectorSearch]]></category><category><![CDATA[llm]]></category><dc:creator><![CDATA[Dishant Gandhi]]></dc:creator><pubDate>Tue, 04 Feb 2025 05:13:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738327708102/917d2ff1-3a47-4113-87c4-012c37b2abcb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p>In today's GenAI-driven landscape, enterprises increasingly integrate multiple AI agents to optimize decision-making, automate workflows, and enhance productivity. A key challenge is managing diverse AI models and libraries in a seamless, scalable way.</p>
<p>LangDB AI Gateway simplifies this by offering an intelligent <strong>AI Gateway</strong> that integrates effortlessly with popular AI frameworks.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Learn more about AI Gateway and LangDB in our <a target="_self" href="https://blog.langdb.ai/introduction-to-ai-gateway">blog</a>.</div>
</div>

<p>In this guide, we'll explore how to leverage <strong>LangDB with LlamaIndex</strong> to build a powerful, enterprise-ready <strong>AI solution</strong> that can store and retrieve knowledge efficiently.</p>
<p>By the end of this blog, you will have a fully functional Python application that queries a <strong>Large Language Model (LLM)</strong> and retrieves structured responses, paving the way for advanced <strong>enterprise AI</strong> workflows.</p>
<h2 id="heading-table-of-contents"><strong>Table of Contents:</strong></h2>
<ul>
<li><p>Pre-requisites</p>
</li>
<li><p>Installation</p>
</li>
<li><p>Building a LlamaIndex-based Knowledge Store</p>
</li>
<li><p>Configuring LangDB for Scalable AI Workflows</p>
</li>
<li><p>Do More with LangDB</p>
</li>
</ul>
<p>Feel free to jump to any section as needed. Let's dive in!</p>
<h2 id="heading-pre-requisites"><strong>Pre-requisites</strong></h2>
<p>To follow along, ensure you have:</p>
<ul>
<li><p>A <strong>LangDB account</strong> (<a target="_blank" href="https://app.langdb.ai/login">Sign up here</a>)</p>
</li>
<li><p>A <strong>LangDB API Key</strong></p>
</li>
<li><p>Basic understanding of <strong>LlamaIndex</strong></p>
</li>
</ul>
<h2 id="heading-installation"><strong>Installation</strong></h2>
<p>First, install the necessary dependencies:</p>
<pre><code class="lang-bash">pip install llama-index
pip install openai
</code></pre>
<h2 id="heading-building-a-llamaindex-based-knowledge-store"><strong>Building a LlamaIndex-based Knowledge Store</strong></h2>
<h3 id="heading-step-1-set-up-your-data-storage"><strong>Step 1: Set Up Your Data Storage</strong></h3>
<p>Create a data directory in your root folder and add relevant documents.</p>
<h3 id="heading-step-2-import-required-libraries"><strong>Step 2: Import Required Libraries</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> llama_index.core <span class="hljs-keyword">import</span> VectorStoreIndex, SimpleDirectoryReader
<span class="hljs-keyword">from</span> llama_index.core <span class="hljs-keyword">import</span> Settings
<span class="hljs-keyword">from</span> llama_index.llms.openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> os
</code></pre>
<p><strong>Step 3: Load and Index Documents</strong></p>
<pre><code class="lang-python">documents = SimpleDirectoryReader(<span class="hljs-string">"data"</span>).load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(<span class="hljs-string">'storage'</span>)
query_engine = index.as_query_engine()
response = query_engine.query(<span class="hljs-string">"What are the features of LangDB?"</span>)
</code></pre>
<p><strong>Explanation:</strong></p>
<ul>
<li><p>The <strong>SimpleDirectoryReader</strong> loads all files from the data directory.</p>
</li>
<li><p>Documents are transformed into vector embeddings and stored persistently.</p>
</li>
<li><p><strong>Query Engine</strong> enables us to retrieve relevant documents using an <strong>LLM-powered AI gateway</strong>.</p>
</li>
</ul>
<p>Now, let’s supercharge this setup with <strong>LangDB.ai, one of the best artificial intelligence software solutions</strong>!</p>
<h2 id="heading-configuring-langdb-for-scalable-ai-workflows"><strong>Configuring LangDB for Scalable AI Workflows</strong></h2>
<p>LangDB acts as a <strong>high-performance AI governance</strong> system, ensuring that our AI agents have structured and persistent memory.</p>
<h3 id="heading-step-1-create-a-langdb-project"><strong>Step 1: Create a LangDB Project</strong></h3>
<ol>
<li>Log in to LangDB and <strong>create a new project</strong>.</li>
</ol>
<p><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXewrfOQVOjLph-2DLe0zxI3ovlixHmDdi3tXPXbSuY1EHtEgMUzDxKrUOh5ijhkmoLmAwsWUwlqGdX3GwXJZSc6RqTwwxrxzvz4eQ9ZxAD-QSfMU_O9rmDkn7zFZrRRNr4a0Irt?key=3JTjkKczjoU2OTwiC9VX_1af" alt /></p>
<ol start="2">
<li>Navigate to <strong>Manage API Keys</strong> and generate a key.</li>
</ol>
<p><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXeR05TvGaNPEQVAqrQLoTrLdEp92OzNbz_QOv1EWLiqvT5IK3CGDJxW3L0MjfGzALWRv2j60dGXt040QFG2Imql9ALUGz30hOGaBKynW4IaOR9h2A4qOFGDATTDkMKI2-c83H_y?key=3JTjkKczjoU2OTwiC9VX_1af" alt /></p>
<h3 id="heading-step-2-load-environment-variables-in-python"><strong>Step 2: Load Environment Variables in Python</strong></h3>
<pre><code class="lang-python">os.environ[<span class="hljs-string">"OPENAI_API_KEY"</span>] = <span class="hljs-string">"your-langdb-api-key"</span>
os.environ[<span class="hljs-string">"OPENAI_API_BASE"</span>] = <span class="hljs-string">"https://api.us-east-1.langdb.ai"</span>
</code></pre>
<h3 id="heading-step-3-configure-langdb-with-llamaindex"><strong>Step 3: Configure LangDB with LlamaIndex</strong></h3>
<pre><code class="lang-python">Settings.llm = OpenAI(
   base_url=os.getenv(<span class="hljs-string">"OPENAI_API_BASE"</span>),  <span class="hljs-comment"># fetching variables declared on step 2</span>
   api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>),  <span class="hljs-comment"># fetching variables declared on step 2</span>
   model=<span class="hljs-string">"gpt-4o-mini"</span>  <span class="hljs-comment"># Your preferred model</span>
)
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">A newer version of LlamaIndex uses Settings to add custom OpenAI model configurations.</div>
</div>

<p>Our connection and configuration are complete.</p>
<ul>
<li>Let's have a look at the whole code</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> llama_index.core <span class="hljs-keyword">import</span> VectorStoreIndex, SimpleDirectoryReader
<span class="hljs-keyword">from</span> llama_index.core <span class="hljs-keyword">import</span> Settings
<span class="hljs-keyword">from</span> llama_index.llms.openai <span class="hljs-keyword">import</span> OpenAI

langdb_api_base = os.getenv(<span class="hljs-string">"LANGDB_URL"</span>, <span class="hljs-string">"https://api.us-east-1.langdb.ai"</span>) <span class="hljs-comment">### LangDB API base URL</span>
LANGDB_API_KEY = os.getenv(<span class="hljs-string">"LANGDB_API_KEY"</span>)

Settings.llm = OpenAI(
   base_url=langdb_api_base,
   api_key=LANGDB_API_KEY,
   model=<span class="hljs-string">"gpt-4o-mini"</span>
)

documents = SimpleDirectoryReader(<span class="hljs-string">"data"</span>).load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(<span class="hljs-string">'storage'</span>)
query_engine = index.as_query_engine()
response = query_engine.query(<span class="hljs-string">"what are features of langdb?"</span>)
print(response)
</code></pre>
<h3 id="heading-output">Output:</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738234938448/d30ec317-106e-40ab-8c91-a73d4bd6d6e7.gif" alt class="image--center mx-auto" /></p>
<h2 id="heading-for-multiple-projects">For multiple projects</h2>
<p>Use the project ID in your <code>langdb_api_base</code> URL if you have multiple projects in your LangDB account.<br />Below is an example:</p>
<pre><code class="lang-python">langdb_api_base = <span class="hljs-string">"https://api.us-east-1.langdb.ai/your-project-id/v1"</span>
</code></pre>
<h2 id="heading-do-more-with-langdb"><strong>Do more with LangDB</strong></h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">🔗 LangDB AI Gateway is Open Source! Check out the repo &amp; contribute: <a target="_self" href="https://github.com/langdb/ai-gateway">click here</a></div>
</div>

<h3 id="heading-checkout-our-youtube-video">Check out our YouTube video</h3>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/YIhqgcPKJFw?si=fPZQC8wJk9n1Cd55">https://youtu.be/YIhqgcPKJFw?si=fPZQC8wJk9n1Cd55</a></div>
<p> </p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">💬 Join the conversation in our <a target="_self" href="https://langdbcommunity.slack.com">Slack community!</a></div>
</div>

<p>Now that we have a <strong>structured AI-powered Specialized RAG-Based AI Agent</strong>, here’s what we will bring next:</p>
<ul>
<li><p><strong>Build a multi-agent system</strong> with LlamaIndex + LangDB Dynamic Routing</p>
</li>
<li><p>Integrate LangChain <strong>and build a chat application with</strong> LlamaIndex and LangDB</p>
</li>
</ul>
<p>You can also read our next guide: <a target="_blank" href="https://blog.langdb.ai/how-to-integrate-langchain-with-langdb"><strong>How to use LangChain with LangDB</strong></a></p>
]]></content:encoded></item><item><title><![CDATA[Tracing Multiple Agents with LangDB]]></title><description><![CDATA[When working with AI-powered workflows, efficiency, scalability, and cost control are. In many scenarios, a single-agent architecture often fails to meet all operational requirements, leading developers to adopt multi-agent workflows.
LangDB simplifi...]]></description><link>https://blog.langdb.ai/tracing-multiple-agents-with-langdb</link><guid isPermaLink="true">https://blog.langdb.ai/tracing-multiple-agents-with-langdb</guid><category><![CDATA[AI]]></category><category><![CDATA[Multi-Agent Systems (MAS)]]></category><category><![CDATA[openai]]></category><category><![CDATA[langchain]]></category><category><![CDATA[langgraph]]></category><category><![CDATA[llm]]></category><category><![CDATA[gemini]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Wed, 29 Jan 2025 04:13:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738047399116/a4a5e995-ecdd-4bf4-8a8a-22561eded93e.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When working with AI-powered workflows, efficiency, scalability, and cost control are. In many scenarios, a single-agent architecture often fails to meet all operational requirements, leading developers to adopt multi-agent workflows.</p>
<p>LangDB simplifies this by offering seamless tracing, analytics, and optimization tools for multi-agent workflows.</p>
<p><strong>TL;DR:</strong> Combining Gemini and GPT-4o offers the lowest cost, though it’s slightly slower than using Gemini alone.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Tag</strong></td><td><strong>Total Cost</strong></td><td><strong>Total Requests</strong></td><td><strong>Avg Duration (ms)</strong></td><td><strong>Total Duration (ms)</strong></td><td><strong>Total Input Tokens</strong></td><td><strong>Total Output Tokens</strong></td><td><strong>Avg TTFT (ms)</strong></td><td><strong>TPS</strong></td><td><strong>TPOT</strong></td></tr>
</thead>
<tbody>
<tr>
<td>[gemini]</td><td>0.037731</td><td>4</td><td>2390.0</td><td>9559.802</td><td>11197</td><td>345</td><td>2390.0</td><td>1207.35</td><td>0.03</td></tr>
<tr>
<td>[gpt-4o]</td><td>0.057021</td><td>4</td><td>4574.9</td><td>18299.686</td><td>17235</td><td>443</td><td>4574.9</td><td>966.03</td><td>0.04</td></tr>
<tr>
<td>[combined]</td><td>0.034887</td><td>4</td><td>2765.5</td><td>11061.997</td><td>10009</td><td>405</td><td>2765.5</td><td>941.42</td><td>0.03</td></tr>
</tbody>
</table>
</div><p>Let’s break down a typical <strong>multi-agent workflow</strong>:</p>
<h4 id="heading-example-generating-a-chart-for-a-given-question">Example: Generating a Chart for a Given Question</h4>
<ul>
<li><p><strong>User Request</strong>: <em>"Generate a chart of average temperature in Alaska over the past decade."</em></p>
</li>
<li><p><strong>Agent Roles</strong>:</p>
<ol>
<li><p><strong>Researcher Agent</strong>: Gathers raw temperature data by calling a search tool.</p>
</li>
<li><p><strong>Chart Generator Agent</strong>: Processes the data and creates a visualization using a code execution tool.</p>
</li>
</ol>
</li>
</ul>
<p>Here’s how this workflow is structured across multiple agents.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737990703446/10e71bbb-859a-4fe5-a3b6-94462d54f7cf.png" alt class="image--center mx-auto" /></p>
<p>To evaluate the efficiency of multi-agent workflows, we ran three different setups in LangDB. These scenarios illustrate how using the same model across agents compares to combining models.</p>
<h3 id="heading-scenarios-we-tested">Scenarios We Tested</h3>
<p>To evaluate the efficiency of multi-agent workflows, we tested three setups. In the first scenario, both the Researcher Agent and Chart Generator Agent used Gemini, focusing on speed and cost. In the second scenario, both agents used GPT-4o, prioritizing accuracy but incurring higher costs. Finally, in the third scenario, we combined models: the Researcher Agent used Gemini for data gathering, while the Chart Generator Agent used GPT-4o for visualization. This allowed us to balance speed, cost, and accuracy.</p>
<p><img src="https://www.gitbook.com/cdn-cgi/image/dpr=2,width=760,onerror=redirect,format=auto/https%3A%2F%2Ffiles.gitbook.com%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FjwXvc8GvLP6kUvOASaLP%252Fuploads%252FyzCMqXVLcpDJLYXPMzRj%252Fimage.png%3Falt%3Dmedia%26token%3D769b7d97-e4c9-418d-a9de-03291c5c7369" alt /></p>
<h3 id="heading-tracing-and-analytics-with-langdb">Tracing and Analytics with LangDB</h3>
<p>After running these scenarios, we used LangDB's tracing capabilities to monitor task routing, response times, and token usage. The tracing interface allowed us to see exactly how tasks were distributed between agents and measure the efficiency of each workflow.</p>
<p>Using LangDB's <strong>analytics</strong>, we evaluated the performance data from these runs to generate the table presented earlier. This included:</p>
<ul>
<li><p><strong>Cost Analysis</strong>: Tracking how much each agent contributed to the overall expense.</p>
</li>
<li><p><strong>Time Analysis</strong>: Measuring the average response time across different workflows.</p>
</li>
<li><p><strong>Number of Requests</strong>: Recording how many requests each run processed.</p>
</li>
<li><p><strong>Average Time to First Token (TTFT)</strong>: Capturing the time taken for the first token to appear.</p>
</li>
<li><p><strong>Tokens Per Second (TPS)</strong>: Analyzing the efficiency of token generation per second.</p>
</li>
<li><p><strong>Time Per Output Token (TPOT)</strong>: Evaluating the time taken per output token to assess response efficiency.</p>
</li>
</ul>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Tag</strong></td><td><strong>Total Cost</strong></td><td><strong>Total Requests</strong></td><td><strong>Avg Duration (ms)</strong></td><td><strong>Total Duration (ms)</strong></td><td><strong>Total Input Tokens</strong></td><td><strong>Total Output Tokens</strong></td><td><strong>Avg TTFT (ms)</strong></td><td><strong>TPS</strong></td><td><strong>TPOT</strong></td></tr>
</thead>
<tbody>
<tr>
<td>[gemini]</td><td>0.037731</td><td>4</td><td>2390.0</td><td>9559.802</td><td>11197</td><td>345</td><td>2390.0</td><td>1207.35</td><td>0.03</td></tr>
<tr>
<td>[gpt-4o]</td><td>0.057021</td><td>4</td><td>4574.9</td><td>18299.686</td><td>17235</td><td>443</td><td>4574.9</td><td>966.03</td><td>0.04</td></tr>
<tr>
<td>[combined]</td><td>0.034887</td><td>4</td><td>2765.5</td><td>11061.997</td><td>10009</td><td>405</td><td>2765.5</td><td>941.42</td><td>0.03</td></tr>
</tbody>
</table>
</div><h3 id="heading-key-takeaways-from-the-scenarios">Key Takeaways from the Scenarios</h3>
<h4 id="heading-1-gemini-only-workflow"><strong>1. Gemini-Only Workflow</strong></h4>
<ul>
<li><p><strong>Cost</strong>: Slightly higher than combined but cheaper than GPT-4o.</p>
</li>
<li><p><strong>Speed</strong>: Fastest average duration (2390 ms) and highest TPS (1207.35).</p>
</li>
</ul>
<h4 id="heading-2-gpt-4o-only-workflow"><strong>2. GPT-4o-Only Workflow</strong></h4>
<ul>
<li><p><strong>Cost</strong>: Highest overall.</p>
</li>
<li><p><strong>Speed</strong>: Slowest duration (4574.9 ms) and lower TPS (966.03).</p>
</li>
</ul>
<h4 id="heading-3-combined-workflow"><strong>3. Combined Workflow</strong></h4>
<ul>
<li><p><strong>Cost</strong>: Lowest of all workflows.</p>
</li>
<li><p><strong>Speed</strong>: Moderate duration (2765.5 ms) and TPS (941.42).</p>
</li>
</ul>
<hr />
<h3 id="heading-why-tracing-matters-for-multi-agent-workflows">Why Tracing Matters for Multi-Agent Workflows</h3>
<p>In a multi-agent workflow, different models handle specialized tasks. Without proper observability, <strong>you don't know how efficiently each model is performing</strong>. This lack of visibility can lead to:</p>
<ul>
<li><p><strong>Escalating Costs</strong>: Premium models being overused for trivial tasks.</p>
</li>
<li><p><strong>Performance Bottlenecks</strong>: Delays caused by slow or inefficient task execution.</p>
</li>
<li><p><strong>Operational Inefficiency</strong>: No clear insights on how to optimize workflows.</p>
</li>
</ul>
<h3 id="heading-why-langdb-is-essential-for-multi-agent-workflows">Why LangDB is Essential for Multi-Agent Workflows</h3>
<p>With LangDB, you can:</p>
<ol>
<li><p><strong>Monitor Key Metrics</strong>: Track costs, latency, number of requests, and token usage for each workflow.</p>
</li>
<li><p><strong>Analyze Performance Trends</strong>: Understand response times, time to first token, and processing speeds across different models.</p>
</li>
<li><p><strong>Gain Deeper Insights</strong>: Use detailed analytics to identify inefficiencies and optimize workflows accordingly.</p>
</li>
</ol>
<hr />
<h3 id="heading-next-steps">Next Steps</h3>
<p>To explore multi-agent workflows with LangDB:</p>
<ol>
<li><p>Visit the <a target="_blank" href="https://github.com/langdb/langdb-samples">LangDB Samples</a> Repository for setup instructions and examples.</p>
</li>
<li><p>Try out <a target="_blank" href="https://app.langdb.ai/">LangDB</a> to monitor and optimize your workflows in real time.</p>
</li>
<li><p>Experiment with combining different models to achieve the best results for your unique use case.</p>
</li>
</ol>
<p>Explore LangDB’s capabilities by visiting the <a target="_blank" href="https://github.com/langdb/langdb-samples">LangDB Samples Repository</a> or <a target="_blank" href="https://app.langdb.ai/">trying the platform</a> for multi-agent tracing and analytics.</p>
]]></content:encoded></item><item><title><![CDATA[Choosing the Right LLM for the Job with LangDB’s Dynamic Routing]]></title><description><![CDATA[In 2025, LLMs range from powerful reasoning models to lightweight, cost-efficient options. Choosing the best fit within a complex workflow can be challenging, especially when you want to balance quality and cost.
TL...]]></description><link>https://blog.langdb.ai/choosing-the-right-llm-for-the-job-with-langdb</link><guid isPermaLink="true">https://blog.langdb.ai/choosing-the-right-llm-for-the-job-with-langdb</guid><category><![CDATA[AI]]></category><category><![CDATA[openai]]></category><category><![CDATA[Google]]></category><category><![CDATA[llm]]></category><category><![CDATA[routing]]></category><category><![CDATA[cost-optimisation]]></category><category><![CDATA[agentic AI]]></category><category><![CDATA[workflow]]></category><dc:creator><![CDATA[Mrunmay Shelar]]></dc:creator><pubDate>Wed, 22 Jan 2025 06:51:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737527255786/66f3b0b8-615b-4b1e-8368-b4870ff3fe85.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We got different kinds of LLMs in 2025 ranging from powerful reasoning models to lightweight, cost-efficient option. It can be challenging to one best fit in an complex workflow, especially when you want to have a balance between quality and cost.</p>
<p><strong>TLDR</strong>; Using dynamic routing saves costs upto 55% in a Multi-agent RAG workflow</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metrics</td><td>Cost for 10 Queries</td><td>Cost for 20 Queries</td></tr>
</thead>
<tbody>
<tr>
<td><strong>GPT-4o (No Routing)</strong></td><td>$0.2085</td><td>$0.6143</td></tr>
<tr>
<td><strong>GPT-4o + Gemini-1.5-flash-8b (with Routing)</strong></td><td>$0.1517</td><td>$0.2738</td></tr>
<tr>
<td><strong>Cost Difference</strong></td><td><strong>27.25%</strong></td><td><strong>55.42%</strong></td></tr>
</tbody>
</table>
</div><h2 id="heading-the-challenge-of-balancing-cost-and-performance">The Challenge of Balancing Cost and Performance</h2>
<p>In a multi-agent workflow, tasks can vary significantly in complexity:</p>
<ul>
<li><p><strong>Complex questions</strong>: Require deep reasoning and nuanced outputs, best suited for high-powered models like <code>gpt-4o</code>.</p>
</li>
<li><p><strong>Simpler questions</strong>: Involve basic text generation or classification, which lightweight models like <code>gemini-1.5-flash-8b</code> can handle efficiently.</p>
</li>
</ul>
<p>The challenge is ensuring that complex tasks get the power they need while avoiding unnecessary costs for simpler tasks. Over-reliance on a single model can either inflate costs or compromise quality.</p>
<h2 id="heading-how-langdbs-dynamic-routing-helps">How LangDB’s Dynamic Routing Helps</h2>
<p>Instead of manually deciding which model to use for each query, LangDB lets you define a routing strategy that automates these decisions. For example, you can prioritize:</p>
<ul>
<li><p><strong>Cost-efficiency</strong>: Using lighter models whenever possible to reduce expenses.</p>
</li>
<li><p><strong>Quality-first workflows</strong>: Using high-powered models for critical or complex tasks.</p>
</li>
<li><p><strong>Balanced approaches</strong>: Combining cost and quality priorities by dynamically assessing task requirements.</p>
</li>
</ul>
<h3 id="heading-example">Example:</h3>
<p>Here’s an example of a simple cost-optimization strategy using LangDB:</p>
<pre><code class="lang-python">routing_body = {
    <span class="hljs-string">"extra"</span>: {
        <span class="hljs-string">"strategy"</span>: {
            <span class="hljs-string">"type"</span>: <span class="hljs-string">"cost"</span>,
            <span class="hljs-string">"willingness_to_pay"</span>: <span class="hljs-number">0.5</span>
        },
        <span class="hljs-string">"models"</span>: [<span class="hljs-string">"gpt-4o"</span>, <span class="hljs-string">"gemini-1.5-flash-8b"</span>],
    },
}
</code></pre>
<p><strong>Key Points</strong>:</p>
<ul>
<li><p>The <code>"type": "cost"</code> strategy ensures that LangDB evaluates each query for its complexity and routes it to the least expensive model that can handle it.</p>
</li>
<li><p>The <code>"willingness_to_pay"</code> parameter allows you to define your tolerance for higher costs. For instance, <code>0.5</code> means the system will only select the more expensive <code>gpt-4o</code> model if the complexity justifies the additional cost.</p>
</li>
</ul>
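<p>To see how this strategy plugs into an actual request, here is a minimal sketch that merges the routing configuration into an OpenAI-style chat payload. The helper name <code>routed_chat_payload</code> is hypothetical, and the exact shape LangDB expects for the <code>extra</code> block should be confirmed against LangDB's documentation:</p>

```python
import json

def routed_chat_payload(messages, models, willingness_to_pay=0.5):
    """Build an OpenAI-style chat payload carrying LangDB's
    cost-routing strategy (hypothetical helper, not a LangDB API)."""
    return {
        "messages": messages,
        "extra": {
            "strategy": {
                "type": "cost",
                "willingness_to_pay": willingness_to_pay,
            },
            "models": models,
        },
    }

payload = routed_chat_payload(
    [{"role": "user", "content": "Classify the sentiment of: 'Great product!'"}],
    ["gpt-4o", "gemini-1.5-flash-8b"],
)
print(json.dumps(payload, indent=2))
```

<p>With an OpenAI-compatible client pointed at the gateway, the <code>extra</code> block would typically be passed through the client's <code>extra_body</code> parameter; the endpoint URL and model naming are specific to your LangDB project.</p>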
<h2 id="heading-experiment-comparing-direct-and-routed-workflows">Experiment: Comparing Direct and Routed Workflows</h2>
<p>To demonstrate the impact of LangDB’s Dynamic Routing, we conducted two tests using a Retrieval-Augmented Generation (RAG) workflow:</p>
<ol>
<li><p><strong>Direct Workflow</strong>: All queries were routed to the high-powered <code>gpt-4o</code> model.</p>
</li>
<li><p><strong>Routed Workflow</strong>: Queries were dynamically routed between <code>gpt-4o</code> and <code>gemini-1.5-flash-8b</code> based on a cost-optimization strategy.</p>
</li>
</ol>
<p>The tests included two sets of queries:</p>
<ul>
<li><p><strong>10 questions</strong> (Test 1)</p>
</li>
<li><p><strong>20 questions</strong> (Test 2)</p>
</li>
</ul>
<h2 id="heading-results-cost-comparison">Results: Cost Comparison</h2>
<h3 id="heading-run-1-10-questions">Run 1: 10 Questions</h3>
<p>For the first test, we compared the total costs of processing 10 questions using the direct and routed workflows.</p>
<p><strong>Insights</strong>:</p>
<ul>
<li><p>The <strong>Direct Workflow</strong> consistently used the high-powered <code>gpt-4o</code> model for all queries, resulting in higher overall costs.</p>
</li>
<li><p>The <strong>Routed Workflow</strong> leveraged <code>gemini-1.5-flash-8b</code> for simpler queries, reducing costs significantly while still using <code>gpt-4o</code> for complex tasks.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737524826874/13e99244-8def-4374-959f-a095405d6931.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-run-2-20-questions">Run 2: 20 Questions</h3>
<p>In the second test with 20 questions, the cost differences became even more pronounced.</p>
<p><strong>Insights</strong>:</p>
<ul>
<li><p>The <strong>Direct Workflow</strong> incurred a linear cost increase, as every query was routed to the expensive <code>gpt-4o</code> model.</p>
</li>
<li><p>The <strong>Routed Workflow</strong> scaled far more efficiently by handling a larger proportion of queries with the cost-effective <code>gemini-1.5-flash-8b</code> model.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737524855502/225416c5-ce8d-4825-b897-78a80e380f78.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-quality-analysis">Quality Analysis</h3>
<p>To ensure that cost savings did not compromise the quality of the responses, we evaluated the outputs using an LLM-based scoring scheme. Each answer was rated on a scale of 1-10 using the following criteria:</p>
<ul>
<li><p><strong>Accuracy and correctness</strong> (4 points)</p>
</li>
<li><p><strong>Completeness and depth</strong> (3 points)</p>
</li>
<li><p><strong>Clarity and coherence</strong> (3 points)</p>
</li>
</ul>
<p><strong>Results</strong>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Test</strong></td><td><strong>Mean Score (Direct Workflow)</strong></td><td><strong>Mean Score (Routed Workflow)</strong></td><td><strong>Cases Routed Scored Higher</strong></td><td><strong>Cases No Routing Scored Higher</strong></td><td><strong>Cases with Equal Scores</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Test 1 (10 questions)</strong></td><td>9.0</td><td>8.5</td><td>0</td><td>4</td><td>6</td></tr>
<tr>
<td><strong>Test 2 (20 questions)</strong></td><td>8.65</td><td>8.9</td><td>8</td><td>3</td><td>9</td></tr>
</tbody>
</table>
</div><p>These results demonstrate that LangDB’s routing strategy did not compromise the overall quality of responses, even as it significantly reduced costs.</p>
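<p>The per-question comparison in the table above reduces to a few lines of bookkeeping, given two parallel lists of scores. The per-question scores below are made up — chosen only so the summary reproduces the Test 1 row — since the actual experiment data is not published here:</p>

```python
def compare_scores(direct, routed):
    """Summarize two parallel lists of 1-10 answer-quality scores."""
    assert len(direct) == len(routed)
    return {
        "mean_direct": sum(direct) / len(direct),
        "mean_routed": sum(routed) / len(routed),
        "routed_higher": sum(r > d for d, r in zip(direct, routed)),
        "direct_higher": sum(d > r for d, r in zip(direct, routed)),
        "equal": sum(d == r for d, r in zip(direct, routed)),
    }

# Illustrative scores, constructed to match the Test 1 summary
summary = compare_scores(
    direct=[9, 9, 8, 10, 9, 9, 8, 9, 10, 9],
    routed=[7, 8, 8, 10, 8, 9, 8, 9, 9, 9],
)
print(summary)
```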
<hr />
<p>Whether you’re managing small-scale projects or scaling to thousands of queries, LangDB’s Dynamic Routing ensures the best balance between cost and quality. It allows you to optimize workflows, achieve scalability, and focus resources where they’re needed most.</p>
<p>The experiment clearly shows that automating model selection can not only reduce costs but also maintain high standards of performance, making LangDB’s Dynamic Routing an invaluable tool for any organization leveraging LLMs.</p>
<p>The detailed experiment notebook will be published soon, offering deeper insights and allowing you to replicate our results.</p>
<p>Start optimizing your workflows today with <a target="_blank" href="https://app.langdb.ai/">LangDB AI</a> and make smarter decisions with every query!</p>
]]></content:encoded></item></channel></rss>