<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[lanarkite99]]></title><description><![CDATA[Jack of all, master of none]]></description><link>https://lanarkite99.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!GYsG!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36152a10-594a-4b04-8f8d-317d1e83881d_144x144.png</url><title>lanarkite99</title><link>https://lanarkite99.substack.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 12 May 2026 19:50:31 GMT</lastBuildDate><atom:link href="https://lanarkite99.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[lanarkite99]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[lanarkite99@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[lanarkite99@substack.com]]></itunes:email><itunes:name><![CDATA[lanarkite99]]></itunes:name></itunes:owner><itunes:author><![CDATA[lanarkite99]]></itunes:author><googleplay:owner><![CDATA[lanarkite99@substack.com]]></googleplay:owner><googleplay:email><![CDATA[lanarkite99@substack.com]]></googleplay:email><googleplay:author><![CDATA[lanarkite99]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[I Built a Retrieval-First RAG System for Factory Documents. 
Here’s Why I Didn’t Build a Chatbot.]]></title><description><![CDATA[How I designed an evidence-first document search system for invoices, BOMs, e-way bills, and financial PDFs and why traditional &#8220;RAG chatbot&#8221; thinking was the wrong starting point for this problem.]]></description><link>https://lanarkite99.substack.com/p/i-built-a-retrieval-first-rag-system</link><guid isPermaLink="false">https://lanarkite99.substack.com/p/i-built-a-retrieval-first-rag-system</guid><dc:creator><![CDATA[lanarkite99]]></dc:creator><pubDate>Thu, 23 Apr 2026 09:20:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-XpK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-XpK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-XpK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-XpK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-XpK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-XpK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-XpK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1052766,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/195211406?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-XpK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-XpK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!-XpK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!-XpK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9e399b4-c0eb-4361-96d7-5fc6caea19a6_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">High-level architecture: retrieval-first document 
intelligence</figcaption></figure></div><h2><strong>Introduction</strong></h2><blockquote><p>We&#8217;ve all seen the standard RAG demo.</p><p>Upload a few PDFs. Ask a question in natural language. Get back a polished answer in a chatbot window.</p><p>It looks impressive for about five minutes.</p><p>Then someone asks a real business question:</p></blockquote><ul><li><p>&#8220;What&#8217;s the vehicle number for e-way bill <code>FM-GST-2026-2001</code>?&#8221;</p></li><li><p>&#8220;Show me the part number for <code>Seat Foam Cushion</code>.&#8221;</p></li><li><p>&#8220;What is the total amount for invoice <code>TF/2026-27/001</code>?&#8221;</p></li><li><p>&#8220;Where is the registered office of RIL?&#8221;</p></li></ul><p>And suddenly the problem is not &#8220;generate a nice answer.&#8221;</p><p>The problem is <strong>retrieve the right document, the right page and the right evidence with high precision</strong>.</p><p>That is the gap this project was built to solve. 
This is not a generic chatbot for PDFs.</p><p>It is a <strong>retrieval-first document intelligence system</strong> for industrial and operational workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WNnI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WNnI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 424w, https://substackcdn.com/image/fetch/$s_!WNnI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 848w, https://substackcdn.com/image/fetch/$s_!WNnI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 1272w, https://substackcdn.com/image/fetch/$s_!WNnI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WNnI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png" width="1348" height="620" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1348,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83806,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/195211406?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WNnI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 424w, https://substackcdn.com/image/fetch/$s_!WNnI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 848w, https://substackcdn.com/image/fetch/$s_!WNnI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 1272w, https://substackcdn.com/image/fetch/$s_!WNnI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe24f9b38-8618-40c8-9099-ade3ebb7bd88_1348x620.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Thin Streamlit frontend</figcaption></figure></div><p>The goal is simple:</p><blockquote><p>Given a business-style query over invoices, bills of materials, e-way bills, and financial PDFs, return the correct document and evidence quickly, reliably, and in a format a real user can trust.</p></blockquote><div><hr></div><h2><strong>Follow Here:</strong></h2><blockquote><p><strong>GitHub: <a href="https://github.com/lanarkite99/Retrieval-first-document-RAG-system">Click Here</a></strong><br><strong>X/Twitter:</strong> <strong><a href="https://x.com/MeetSiddhapura">Click Here</a></strong><br><strong>LinkedIn: <a href="https://www.linkedin.com/in/meet-siddhapura/">Click Here</a></strong></p></blockquote><div><hr></div><h2><strong>The Actual Problem</strong></h2><p>In a factory or operations setting, most document 
questions are not open-ended. They are structured, messy, and annoyingly specific.</p><p>A finance person does not want &#8220;a summary of the invoice.&#8221;<br>They want:</p><ul><li><p>the invoice matching a number</p></li><li><p>the line item matching a description</p></li><li><p>the page containing the evidence</p></li><li><p>the exact value, code, or supplier</p></li></ul><p>Likewise, a procurement or production user searching a BOM is not asking for broad semantic understanding.</p><p>They are asking:</p><ul><li><p>&#8220;Which material code matches this component?&#8221;</p></li><li><p>&#8220;Which BOM contains this assembly?&#8221;</p></li><li><p>&#8220;What revision is this part on?&#8221;</p></li></ul><p>Traditional RAG discussions usually start with generation. This problem starts with <strong>retrieval quality</strong>.<br>If retrieval is wrong, generation just makes the wrong answer sound more confident.<br>So I deliberately framed this as:</p><p><strong>evidence-backed retrieval first, optional generation second.<br></strong>That one design choice shaped almost everything else in the project.</p><div><hr></div><h2><strong>Why I Didn&#8217;t Build &#8220;Traditional RAG&#8221;</strong></h2><p>The typical RAG mental model looks something like this:</p><ol><li><p>Chunk documents</p></li><li><p>Embed them</p></li><li><p>Put them in a vector database</p></li><li><p>Retrieve top-k chunks</p></li><li><p>Ask an LLM to answer</p></li></ol><p>That pattern is useful, but it breaks down quickly on operational documents.<br>Why?<br>Because factory and finance PDFs contain:</p><ul><li><p>identifiers like invoice numbers and e-way bill numbers</p></li><li><p>amounts, tax values, and totals</p></li><li><p>table rows and line items</p></li><li><p>material codes and part numbers</p></li><li><p>short layout-bound evidence instead of long narrative passages</p></li></ul><p>Those are not purely semantic retrieval problems.</p><p>In fact, many of them are <strong>lexical, 
structured, or exact-match problems disguised as natural-language queries</strong>.<br>That is why this system does <strong>not</strong> rely on embeddings alone.<br>Instead, it combines:</p><ul><li><p>exact identifier routing</p></li><li><p>lexical chunk search</p></li><li><p>keyword retrieval</p></li><li><p>vector retrieval for fuzzy queries</p></li><li><p>reranking on top of merged evidence</p></li></ul><p>The LLM is optional and comes <em>after</em> retrieval.</p><p>That means the system behaves more like a document search engine with AI support than a chat interface pretending to be one.</p><div><hr></div><h2><strong>What I Built</strong></h2><blockquote><p>At a high level, the system supports:</p></blockquote><ul><li><p>PDF ingestion from files, folders, or a shared inbox</p></li><li><p>text extraction and lightweight metadata enrichment</p></li><li><p>evidence chunking across text blocks, lines, table rows, and header fields</p></li><li><p>multi-store retrieval using PostgreSQL, OpenSearch, and Qdrant</p></li><li><p>query routing into exact, lexical, mixed, and hybrid paths</p></li><li><p>evidence-first responses through FastAPI and Streamlit</p></li><li><p>optional Gemini-based summarization on top of retrieved evidence</p></li><li><p>evaluation for both retrieval and extraction quality</p></li></ul><p>The stack includes:</p><ul><li><p><strong>FastAPI</strong> for the backend API</p></li><li><p><strong>Streamlit</strong> for the business-facing UI</p></li><li><p><strong>PostgreSQL</strong> for metadata, chunk storage, and query logs</p></li><li><p><strong>OpenSearch</strong> for keyword retrieval</p></li><li><p><strong>Qdrant</strong> for vector retrieval</p></li><li><p><strong>Redis</strong> for query-result caching</p></li><li><p><strong>SentenceTransformers</strong> as the default embedding backend</p></li><li><p><strong>Gemini</strong> as an optional post-retrieval answer layer</p></li><li><p><strong>Docker Compose</strong> for local 
deployment</p></li></ul><div><hr></div><h2><strong>The Architecture</strong></h2><p>The system has three main flows:</p><h3><strong>1. Ingestion</strong></h3><p>When a PDF enters the system</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yv1u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yv1u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 424w, https://substackcdn.com/image/fetch/$s_!yv1u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 848w, https://substackcdn.com/image/fetch/$s_!yv1u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 1272w, https://substackcdn.com/image/fetch/$s_!yv1u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yv1u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png" width="1456" height="1190" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1190,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:803663,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/195211406?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yv1u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 424w, https://substackcdn.com/image/fetch/$s_!yv1u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 848w, https://substackcdn.com/image/fetch/$s_!yv1u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 1272w, https://substackcdn.com/image/fetch/$s_!yv1u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bf800e3-6411-4f1e-b745-8cb1198fc23f_4800x3924.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Each file then goes through the following steps:</p><ol><li><p>The file is hashed for checksum-based duplicate detection</p></li><li><p>Text is extracted with PyMuPDF</p></li><li><p>The document is classified into a type such as invoice, BOM, or e-way bill</p></li><li><p>Lightweight metadata is extracted</p></li><li><p>The original PDF is copied into local storage</p></li><li><p>Multiple evidence units are built:</p><ul><li><p>text chunks</p></li><li><p>line chunks</p></li><li><p>table-row chunks</p></li><li><p>header-field chunks</p></li></ul></li><li><p>Chunks are written to PostgreSQL</p></li><li><p>Keyword representations are indexed into OpenSearch</p></li><li><p>Vector embeddings are indexed into Qdrant</p></li></ol><p>The key idea here is that not all evidence in business PDFs should be treated as generic text chunks.</p><p>A line item and a header field behave very differently 
in search.</p><h3><strong>2. Query Routing</strong></h3><p>When a user submits a query, the router decides whether the search should behave like:</p><ul><li><p><strong>exact_match</strong><br>for strong identifiers like invoice numbers or document IDs</p></li><li><p><strong>lexical</strong><br>for row-level, numeric-heavy, or highly specific queries</p></li><li><p><strong>mixed</strong><br>for queries that benefit from both lexical and semantic signals</p></li><li><p><strong>hybrid</strong><br>for fuzzier text queries</p></li></ul><p>This is one of the most important design choices in the system.</p><p>I did not want every query to be treated as &#8220;vector search + prompt.&#8221;<br><code>FM-GST-2026-2001</code> should not go through the same retrieval path as &#8220;registered office for RIL&#8221;.</p><h3><strong>3. Retrieval and Answering</strong></h3><p>Depending on the route, the system combines signals from:</p><ul><li><p>PostgreSQL chunk search</p></li><li><p>OpenSearch keyword search</p></li><li><p>Qdrant vector search</p></li></ul><p>Then it performs weighted score fusion and reranking.<br>The answer layer does not blindly generate text.</p><p>It follows a priority cascade:</p><ol><li><p>deterministic exact-match answer when possible</p></li><li><p>extractive row answer for line-item style queries</p></li><li><p>optional Gemini summary if enabled</p></li><li><p>extractive fallback using top-hit evidence</p></li></ol><p>That keeps the system grounded in evidence instead of turning retrieval misses into fluent hallucinations.</p><div><hr></div><h2><strong>A More Practical Ingestion Workflow</strong></h2><p>One part I cared about beyond the search itself was <strong>how non-technical users would actually use the system</strong>.</p><p>In a real office, users are not going to open an IDE and run CLI commands to ingest PDFs.</p><p>So I added an inbox-style workflow:</p><ul><li><p>users drop PDFs into a shared <code>incoming</code> 
folder</p></li><li><p>Streamlit shows the configured inbox path</p></li><li><p>the UI triggers bulk ingestion through the API</p></li><li><p>duplicates are skipped automatically using checksum-based deduplication</p></li></ul><p>This is much closer to how the system would be used in a small factory or finance team.</p><p>Long term, I would evolve this into a folder lifecycle:</p><ul><li><p><code>incoming/</code></p></li><li><p><code>processed/</code></p></li><li><p><code>failed/</code></p></li></ul><p>plus a scheduled or background ingestion worker.<br>That is a better operational pattern than telling users to manually manage file paths.</p><div><hr></div><h2><strong>What the Benchmark Says</strong></h2><p>I evaluated the system on a held-out benchmark covering invoices, BOMs, e-way bills, and financial PDFs.</p><h3><strong>Retrieval Results</strong></h3><ul><li><p><strong>Recall@1:</strong> <code>86.1%</code></p></li><li><p><strong>Recall@3:</strong> <code>97.2%</code></p></li><li><p><strong>Recall@5:</strong> <code>100%</code></p></li><li><p><strong>MRR:</strong> <code>0.922</code></p></li></ul><p>Those numbers are strong enough for a practical retrieval-first v1.<br>They suggest the system is doing what it was designed to do: find the correct document and supporting evidence reliably.</p><h3><strong>Extraction Results</strong></h3><ul><li><p><strong>Documents passed:</strong> <code>0/18</code></p></li><li><p><strong>Field accuracy:</strong> <code>0.5591</code></p></li><li><p><strong>Line item accuracy:</strong> <code>0.7222</code></p></li></ul><p>These numbers are much weaker than the retrieval results, and that matters because it forced an important correction in how I describe the project.<br>This is <strong>not</strong> a universal, format-independent document extraction engine.</p><p>It is strongest as:</p><ul><li><p>an evidence-backed retrieval system</p></li><li><p>a production-style document search application</p></li><li><p>a realistic RAG architecture for operational PDFs</p></li></ul><p>That distinction makes the 
project more honest and, in my opinion, more useful.</p><div><hr></div><h2><strong>What Worked Better Than Expected</strong></h2><blockquote><p>A few things ended up mattering more than I initially expected:</p></blockquote><h3><strong>1. Exact-match routing</strong></h3><p>Business document search is full of identifiers.</p><p>The moment I generalized document-ID detection and routed those queries into exact-match behavior, retrieval quality improved sharply on those cases.</p><h3><strong>2. Row-level evidence</strong></h3><p>Generic chunking is not enough for invoices and BOMs.<br>Adding table-row and header-field evidence made the system much better at line-item style retrieval.</p><h3><strong>3. Retrieval fusion</strong></h3><p>Lexical and semantic retrieval are not enemies.<br>They solve different parts of the problem.</p><p>The best behavior came from routing queries appropriately and then fusing signals rather than betting everything on one retrieval mode.</p><div><hr></div><h2><strong>What Was Harder Than Expected</strong></h2><p>Three things were more painful than the happy-path architecture diagram suggests.</p><h3><strong>1. Layout-sensitive extraction</strong></h3><p>Metadata extraction on operational PDFs is surprisingly fragile.</p><p>Even when the system retrieves the right document, robustly extracting supplier, buyer, amount, and line items across unseen layouts is still hard.</p><p>That is why the retrieval story is much stronger than the extraction story right now.</p><h3><strong>2. Docker and dependency behavior</strong></h3><p>The app side was easier than the packaging side.</p><p>I hit repeated issues with Docker rebuild latency and dependency resolution. Fixing layer invalidation, caching behavior, and CPU-only Torch resolution took real effort.</p><p>It was a useful reminder that &#8220;production-style&#8221; work is often more about the connective tissue than the main algorithm.</p><h3><strong>3. 
Synchronous ingestion</strong></h3><p>The inbox ingestion flow works, but synchronous ingestion through a UI is not the end state.</p><p>Cold starts, model load time, and indexing latency mean that background ingestion is the right next step.</p><div><hr></div><h2><strong>Why This Project Matters to Me</strong></h2><blockquote><p>I did not want to build another generic &#8220;chat with your PDF&#8221; demo.</p><p>I wanted to build something that reflects how messy business AI systems actually are:</p></blockquote><ul><li><p>real documents</p></li><li><p>mixed retrieval modes</p></li><li><p>evidence and citations</p></li><li><p>deployment trade-offs</p></li><li><p>evaluation that exposes weaknesses instead of hiding them</p></li></ul><p>This project ended up teaching me something I think applies well beyond RAG:</p><p>The hardest part is usually not the model. It is figuring out what the product should actually do, what part of it must be trustworthy, and what trade-offs you are willing to make.</p><p>In this case, the right answer was:</p><ul><li><p>build a strong retrieval system first</p></li><li><p>measure it honestly</p></li><li><p>let generation stay downstream of evidence</p></li></ul><div><hr></div><h2><strong>What I&#8217;d Improve Next</strong></h2><p>If I keep pushing this forward, the next improvements are clear:</p><ul><li><p>asynchronous scheduled ingestion</p></li><li><p><code>incoming / processed / failed</code> folder lifecycle</p></li><li><p>richer ingestion manifests and operational reporting</p></li><li><p>stronger OCR support for scanned documents</p></li><li><p>better financial statement retrieval</p></li><li><p>confidence-aware extraction fallback instead of purely deterministic extraction</p></li><li><p>stronger integration testing across all stores and APIs</p></li></ul><p>In other words: less demo polish, more operational depth.</p><div><hr></div><h2><strong>Final Takeaway</strong></h2><p>If you are building document AI for a real workflow, start by 
asking:</p><blockquote><p><strong>What is the actual task?</strong></p></blockquote><p>If the task is:</p><ul><li><p>retrieve the right document</p></li><li><p>show the right page</p></li><li><p>ground the answer in evidence</p></li><li><p>handle identifiers, amounts, and line items correctly</p></li></ul><p>then a traditional &#8220;vector search + prompt&#8221; RAG setup is often the wrong mental model.</p><p>This project worked better once I stopped treating it as a chatbot problem and started treating it as a <strong>document retrieval system with AI-enhanced evidence handling</strong>.</p><p>That framing changed the architecture, the evaluation, and the final product.<br>And I think it made the system much better.</p><div><hr></div>]]></content:encoded></item><item><title><![CDATA[Phase 2: Taking the Pipeline to the Cloud: AWS, EKS and the Reality of Production]]></title><description><![CDATA[Deploying the system to the cloud]]></description><link>https://lanarkite99.substack.com/p/phase-2-taking-the-pipeline-to-the</link><guid isPermaLink="false">https://lanarkite99.substack.com/p/phase-2-taking-the-pipeline-to-the</guid><dc:creator><![CDATA[lanarkite99]]></dc:creator><pubDate>Thu, 09 Apr 2026 10:07:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uCIr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5a16133-fac3-417c-b907-260afcb6435a_4905x2392.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong><a href="https://open.substack.com/pub/lanarkite99/p/phase-1-production-grade-stock-prediction?utm_campaign=post-expanded-share&amp;utm_medium=post%20viewer">Phase 1: Full ML system design implementation for stock prediction</a></strong> covered the &#8216;what&#8217;; Phase 2 is about the &#8216;how&#8217;. 
In Phase 1, we built the ML system that predicts stock prices for the next 5 days and generates reports covering general stock sentiment, recent news, and more.</p><p>Here&#8217;s the <strong><a href="https://github.com/lanarkite99/stock_pred_pipeline/">GitHub repo</a></strong>. Note: there are two branches, <strong>master</strong> and <strong>cloud_deploy</strong>.</p><p>Moving from <strong>Docker Compose</strong> to <strong>AWS EKS</strong> was a massive shift in mindset. You quickly realize that in the cloud, everything is a moving part. Nodes disappear, IP addresses change and permissions are the difference between a working system and a wall of 403 errors.</p><p></p><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5a16133-fac3-417c-b907-260afcb6435a_4905x2392.png&quot;}],&quot;caption&quot;:&quot;Stock Prediction project with AWS deployment&quot;,&quot;alt&quot;:&quot;EC2, ECR, EKS, Bedrock, Terraform, CI/CD, Grafana, 
Prometheus&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5a16133-fac3-417c-b907-260afcb6435a_4905x2392.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p></p><blockquote><p><strong><a href="https://github.com/lanarkite99/stock_pred_pipeline/blob/cloud_deploy/ref_docs/cloud.md">AWS Design and config doc</a></strong></p><p><strong><a href="https://github.com/lanarkite99/stock_pred_pipeline/blob/cloud_deploy/ref_docs/system_design.md">Complete System Design</a></strong></p></blockquote><h2><strong>1. The Cloud Stack</strong></h2><h4>The final cloud version used:</h4><ul><li><p><strong>Amazon EKS</strong> for Kubernetes runtime</p></li><li><p><strong>Terraform</strong> for infrastructure provisioning</p></li><li><p><strong>Amazon ECR</strong> for Docker images</p></li><li><p><strong>GitHub Actions</strong> for CI/CD</p></li><li><p><strong>Amazon Bedrock</strong> for chat and embeddings</p></li><li><p><strong>Redis inside the cluster</strong> for cache and task state</p></li><li><p><strong>Chroma inside the cluster</strong> for semantic recall</p></li><li><p><strong>Amazon S3</strong> for durable artifacts</p></li><li><p><strong>EBS-backed PVCs</strong> for persistence</p></li></ul><p>One important design choice: I did not go for every managed service AWS offers. I kept the stack lean so it could be built, explained and torn down without becoming a never-ending cloud bill.</p><p>When you run things locally with Docker Compose, life is simple. You share a folder between containers, and they talk to each other like they are in the same room. But Kubernetes is more like a crowded city. Pods are ephemeral; they are born and die constantly. 
If your <code>/train-child</code> task saves a model to a local folder and that pod gets rescheduled, your model is gone forever.</p><p>To bridge this gap, we had to re-engineer how the system holds onto its state:</p><ul><li><p><strong>The Entryway (AWS ALB):</strong> We used the AWS Load Balancer Controller to set up an <strong>Application Load Balancer</strong>. This gives us a single, stable DNS name. It handles incoming traffic and routes it to our FastAPI services, no matter which node they are currently sitting on.</p></li><li><p><strong>Storage that Survives (Amazon S3):</strong> We swapped local output folders for S3 buckets. Now, every time a <code>StockLSTM</code> model finishes training or a monitor report is generated, the output gets pushed to S3. This means our inference engine can pull the latest weights from anywhere in the cluster without worrying about disk persistence.</p></li><li><p><strong>Redis</strong> runs as a Kubernetes Deployment plus Service in <code>k8s/redis.yaml</code>. <strong>Chroma</strong> runs as a Kubernetes Deployment plus Service in <code>k8s/chroma.yaml</code>. Their persistent data is backed by PVCs, which in the final cloud setup use EBS, not an AWS-managed cache service.</p></li></ul><h2><strong>2. The Compute Strategy: Managing the Cloud Tax</strong></h2><p>One of the biggest traps in cloud ML is compute cost. You do not want to burn money running a lightweight Streamlit dashboard on a giant GPU node just because it sounds impressive.</p><p>For this project, we kept the AWS footprint intentionally lean:</p><ul><li><p>one EKS cluster</p></li><li><p>one managed node group</p></li><li><p>a small number of general-purpose EC2 instances</p></li><li><p>only the services we actually needed</p></li><li><p>no unnecessary managed extras that would complicate the stack or inflate the bill</p></li></ul><p>That simplicity mattered. 
It let us focus on whether the application actually worked in a cloud environment instead of getting buried in infrastructure sprawl.</p><p>The key idea was this: every part of the system should justify its existence.</p><p>If a service was only useful locally, it had to be replaced or removed.<br>If a folder only existed because Docker Compose made it easy, we had to think harder about persistence.<br>If a component could run in-cluster without a managed AWS service, we preferred that path for the first version.</p><p><strong>That is why Redis and Chroma stayed in Kubernetes, and why we used EBS-backed PVCs instead of jumping straight to more expensive managed services.</strong></p><h2><strong>3. Rebuilding The Runtime For Kubernetes</strong></h2><p>Docker Compose gives you a comforting illusion. Everything is on one machine, the network is flat, and storage feels permanent enough.</p><p>Kubernetes strips that comfort away.</p><p>Pods are ephemeral. Nodes can be replaced. IPs change. Service names matter. 
Storage has to be treated as part of the architecture, not an afterthought.</p><p>That meant the runtime had to be rethought in a few important ways.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G4cU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G4cU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 424w, https://substackcdn.com/image/fetch/$s_!G4cU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 848w, https://substackcdn.com/image/fetch/$s_!G4cU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 1272w, https://substackcdn.com/image/fetch/$s_!G4cU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G4cU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png" width="325" height="155" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:155,&quot;width&quot;:325,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7557,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/193655260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G4cU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 424w, https://substackcdn.com/image/fetch/$s_!G4cU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 848w, https://substackcdn.com/image/fetch/$s_!G4cU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 1272w, https://substackcdn.com/image/fetch/$s_!G4cU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c16bb06-8655-425d-a865-4f74ac2dfeb7_325x155.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><h3><strong>The Front Door</strong></h3><p>For the UI, we used a Kubernetes LoadBalancer service for the frontend. 
That gave us a stable public entry point in AWS without overengineering the exposure layer.</p><p>The frontend became the public face of the system. From there, it talked to FastAPI inside the cluster.</p><h3><strong>FastAPI As The Control Plane</strong></h3><p>FastAPI became the central orchestration layer for:</p><ul><li><p>training</p></li><li><p>prediction</p></li><li><p>analysis</p></li><li><p>monitoring</p></li></ul><p>That was a good fit because the backend was already the brain of the project. It knew how to train child models, produce forecasts, pull news, evaluate sentiment, and run the custom monitoring pipeline.</p><p>In the cloud version, FastAPI did not just serve endpoints. It became the coordinator for the whole workflow.</p><h3><strong>Redis For State</strong></h3><p>Redis stayed in-cluster.</p><p>This made sense for the first AWS version because the application only needed a practical cache and task state store. We were not trying to solve distributed systems theory on day one. We just needed the system to keep working reliably enough for a demo and a real test.</p><h3><strong>Chroma For Semantic Recall</strong></h3><p>Chroma also stayed in-cluster, with persistent storage backed by PVCs.</p><p>That decision kept the architecture simple while still preserving the semantic cache behavior that the app relied on.</p><p>The important part was persistence. If the pod restarted, we still wanted the semantic store to survive. That is exactly why we paired Chroma with storage backed by EBS.</p><h3><strong>S3 For Durable Artifacts</strong></h3><p>This was one of the most important cloud decisions.</p><p>Local folders are fine until the pod disappears. In Kubernetes, local filesystem state is fragile. 
So for anything that mattered beyond the lifetime of a single container, we moved to S3.</p><p>That included:</p><ul><li><p>trained model artifacts</p></li><li><p>scalers</p></li><li><p>analysis outputs</p></li><li><p>monitoring summaries</p></li><li><p>other durable files that needed to survive restarts</p></li></ul><p>This gave the system a proper cloud-native memory for important outputs.</p><h2><strong>4. Bedrock Replaced Local Model Serving</strong></h2><p>The other big change was the AI layer.</p><p>Instead of relying on a local model server, the cloud version used Amazon Bedrock for:</p><ul><li><p>chat model calls</p></li><li><p>embeddings</p></li></ul><p>That solved a lot of operational problems at once.</p><p>There was no need to run a separate local inference stack.<br>There was no need to keep a model server alive in the cluster.<br>And there was no need to depend on one more container that could fail independently.</p><p>The FastAPI app just needed AWS credentials and the right Bedrock model IDs. Once that was in place, the rest of the system could call the model layer through a normal AWS path.</p><p>That made the architecture much cleaner for the cloud version.</p><h2><strong>5. 
Terraform Made The Whole Thing Reproducible</strong></h2><p></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W2Bh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W2Bh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 424w, https://substackcdn.com/image/fetch/$s_!W2Bh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 848w, https://substackcdn.com/image/fetch/$s_!W2Bh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 1272w, https://substackcdn.com/image/fetch/$s_!W2Bh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W2Bh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png" width="640" height="336.51612903225805" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:163,&quot;width&quot;:310,&quot;resizeWidth&quot;:640,&quot;bytes&quot;:3478,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/193655260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W2Bh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 424w, https://substackcdn.com/image/fetch/$s_!W2Bh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 848w, https://substackcdn.com/image/fetch/$s_!W2Bh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 1272w, https://substackcdn.com/image/fetch/$s_!W2Bh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5d4df64-3d7a-401e-ab52-9197b9515c70_310x163.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2></h2><p>Once the runtime shape was clear, Terraform took over the infrastructure side.</p><p><strong><a 
href="https://github.com/lanarkite99/stock_pred_pipeline/tree/cloud_deploy/terraform">Check out the Terraform directory</a></strong></p><p>This was important because cloud deployment gets messy fast if you create things by hand and then try to remember how you did it later.</p><p>We used Terraform to provision:</p><ul><li><p>the VPC</p></li><li><p>public and private subnets</p></li><li><p>the EKS cluster</p></li><li><p>the managed node group</p></li><li><p>ECR repositories</p></li><li><p>the S3 artifact bucket</p></li><li><p>the IAM pieces needed for EKS and GitHub Actions</p></li><li><p>the EBS CSI driver path for persistent volumes</p></li></ul><p>That gave us something much better than a one-off cloud setup.</p><p>It gave us a repeatable deployment path.</p><p>If I wanted to recreate the stack, I did not need a memory dump from my browser tabs. I could read the Terraform and know what would happen.</p><p>That is a huge difference when you are trying to turn a prototype into something real.</p><h2><strong>6. 
GitHub Actions Became The Deployment Conductor</strong></h2><p>Once the infrastructure existed, GitHub Actions became the thing that tied everything together.</p><p>We ended up with three workflows:</p><ul><li><p><strong>ci.yml</strong></p></li><li><p><strong>build-and-push.yml</strong></p></li><li><p><strong>deploy-eks.yml</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hKTt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hKTt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 424w, https://substackcdn.com/image/fetch/$s_!hKTt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 848w, https://substackcdn.com/image/fetch/$s_!hKTt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 1272w, https://substackcdn.com/image/fetch/$s_!hKTt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hKTt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png" width="1325" height="592" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:592,&quot;width&quot;:1325,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57496,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/193655260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hKTt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 424w, https://substackcdn.com/image/fetch/$s_!hKTt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 848w, https://substackcdn.com/image/fetch/$s_!hKTt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 1272w, https://substackcdn.com/image/fetch/$s_!hKTt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc28a9d-e66a-4180-b56f-e060ade8303d_1325x592.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">CI/CD from GitHub Actions</figcaption></figure></div><p></p></li></ul><p>The <strong>ci.yml</strong> workflow handled sanity checks on the cloud_deploy branch.<br>The <strong>build-and-push.yml</strong> workflow built the Docker images and pushed them to ECR.<br>The <strong>deploy-eks.yml</strong> workflow applied the Kubernetes manifests to the cluster.</p><p>That sounds simple, but the actual path had a few rough edges.</p><p>For example:</p><ul><li><p>the Dockerfile path had to point to the right subdirectory</p></li><li><p>the feature_store/ folder had to exist in Git for the image build to work</p></li><li><p>the deploy workflow had to generate secrets carefully</p></li><li><p>the EKS access entry had to be created so the GitHub role could actually deploy</p></li></ul><p>In other words, the CI/CD pipeline was 
not just &#8220;automation.&#8221; It was the glue that made the whole cloud setup operational.</p><h2><strong>7. Where Monitoring Fit In</strong></h2><p>This project has two kinds of monitoring, and it is worth separating them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aRnh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aRnh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 424w, https://substackcdn.com/image/fetch/$s_!aRnh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 848w, https://substackcdn.com/image/fetch/$s_!aRnh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 1272w, https://substackcdn.com/image/fetch/$s_!aRnh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aRnh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png" width="1221" height="608" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:608,&quot;width&quot;:1221,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:330056,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/193655260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aRnh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 424w, https://substackcdn.com/image/fetch/$s_!aRnh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 848w, https://substackcdn.com/image/fetch/$s_!aRnh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 1272w, https://substackcdn.com/image/fetch/$s_!aRnh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb179fea0-633a-431e-8b95-b0d7e27f6a82_1221x608.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Streamlit frontend completely hosted on AWS</figcaption></figure></div><p></p><h3><strong>App-level monitoring</strong></h3><p>This is the custom ticker monitoring logic in the backend.</p><p>It checks things like:</p><ul><li><p>system health</p></li><li><p>regime drift</p></li><li><p>response quality</p></li><li><p>analysis quality</p></li></ul><p>This is the real product logic behind the /monitor/{ticker} endpoint.</p><h3><strong>Infrastructure monitoring</strong></h3><p>This is the Prometheus and Grafana side.</p><p>Prometheus scrapes backend metrics.<br>Grafana can visualize them.</p><p>That part is for observability, not the actual financial monitoring logic.</p><p>So when I say the system has monitoring, I do not mean just dashboards. 
The app itself has a built-in monitoring layer that reasons about ticker behavior, regime changes and report quality.</p><p>That distinction matters because it reflects the difference between &#8220;<em>the service is up</em>&#8221; and &#8220;<em>the model output is trustworthy</em>.&#8221;</p><h2><strong>8. The Debugging Was Part Of The Build</strong></h2><p>A cloud migration is never just a deployment exercise. It is also a long conversation with everything you accidentally assumed.</p><p>Some of the issues we hit were small, but each one taught something real:</p><ul><li><p><strong>the FastAPI image initially missed the monitoring/ package</strong></p></li><li><p><strong>Chroma crashed because service environment variables leaked into its config</strong></p></li><li><p><strong>the first build workflow pointed at the wrong Dockerfile path</strong></p></li><li><p><strong>the second build failed because feature_store/ had not been committed</strong></p></li><li><p><strong>the deploy workflow failed because the generated secrets file path did not exist yet</strong></p></li><li><p><strong>FastAPI could not call Bedrock until AWS credentials were available inside the pod</strong></p></li></ul><p>Each issue was a reminder that cloud systems fail at the seams.</p><p>And those seams are where the important engineering happens.</p><h2><strong>9. Important Learnings</strong></h2><ul><li><p>Build and validate the ML system locally first, then move it to AWS.</p></li><li><p>Keep the first cloud setup small. Add nodes and services only after the base path works.</p></li><li><p>Set a billing alert early, even for PoC work.</p></li><li><p>Treat monitoring as part of the product, not an afterthought.</p></li><li><p>Use caching to reduce inference latency and cloud cost.</p></li><li><p>Terraform is worth it because it makes the infrastructure repeatable.</p></li><li><p>MLOps is not just the model. 
It includes the backend, infrastructure, deployment, and observability.</p></li><li><p>Tear down AWS resources when you are done to avoid surprise charges.</p></li><li><p>Do not chase a perfect system on day one. Ship a working version, then improve it.</p></li></ul><p><strong><a href="https://github.com/lanarkite99/stock_pred_pipeline">Visit my GitHub</a></strong></p><p><strong><a href="https://x.com/MeetSiddhapura">Visit my Twitter</a></strong></p><p><strong><a href="https://www.linkedin.com/in/meet-siddhapura/">Visit my LinkedIn</a></strong></p>]]></content:encoded></item><item><title><![CDATA[Phase 1: Production-Grade Stock Prediction System Design]]></title><description><![CDATA[An end-to-end, real-world ML system that uses an LSTM to predict stock prices and generate reports.]]></description><link>https://lanarkite99.substack.com/p/phase-1-production-grade-stock-prediction</link><guid isPermaLink="false">https://lanarkite99.substack.com/p/phase-1-production-grade-stock-prediction</guid><dc:creator><![CDATA[lanarkite99]]></dc:creator><pubDate>Fri, 03 Apr 2026 09:07:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4QDh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f6ec67a-68b4-447f-98cc-fdd9a3b4aa6a_925x1002.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;ve all seen the standard &#8220;Stock Price Prediction&#8221; tutorials. They usually involve a single LSTM model, a CSV file from Yahoo Finance and a plot that looks suspiciously accurate until you realize it&#8217;s just lagging the actual price by one day.</p><p>But if you want to build something that actually works in a production-like environment, you need more than a model. 
You need a production-grade ML system.</p><p>Over the last few weeks, I&#8217;ve been building exactly that: an end-to-end system that handles everything from parent-child transfer learning to agentic news analysis. Here is how it&#8217;s put together.</p><p>This phase focuses on system design and how the pieces fit together. The next phase will likely cover Kubernetes and cloud deployment.</p><blockquote><p>Note:<br>This project is an exercise in <strong>ML Systems Engineering</strong>, not a quest for the perfect financial signal.</p><p>While I&#8217;ve paid attention to model quality and performance, the primary goal here is to demonstrate a POC for an end-to-end pipeline. The emphasis is on the engineering decisions, the system design and the &#8220;<em>connective tissue</em>&#8221; that make an ML model functional in a production-like environment, rather than on maximizing prediction accuracy.</p></blockquote><h3><strong><a href="https://github.com/lanarkite99/stock_pred_pipeline">GitHub Repo</a></strong></h3><h4>Introduction:</h4><p>If you have ever tried to deploy a machine learning model, you know the &#8220;Notebook to Production&#8221; gap is more like a canyon. 
For this project, I set out to bridge that gap by building a pipeline that manages itself.</p><p>The idea is simple: a user asks for a forecast on a ticker, and the system coordinates a small army of services to give an answer that is both grounded in the numbers and backed by a narrative.</p><p>It&#8217;s not trying to beat hedge funds (good luck with that). It&#8217;s meant for learning, experimentation and personal trading ideas, or as a starting point for something bigger.</p><h4>The Stack: More Than Just Python</h4><p>The core of this system is a distributed architecture designed to handle data ingestion, training and real-time inference.</p><ul><li><p><strong>The Brain (FastAPI):</strong> Every heavy operation runs as an asynchronous task. You hit <code>/train-child</code>, get a task ID, and let the backend do the heavy lifting in the background.</p></li><li><p><strong>The Memory (Redis &amp; Chroma):</strong> Compute is expensive. If someone asks for the same analysis twice, <strong>Redis</strong> returns the exact cached result. If they ask for something <em>similar</em>, <strong>Chroma</strong> serves as a semantic cache of previous LLM insights.</p></li><li><p><strong>The Context (Ollama &amp; LangGraph):</strong> A price chart is blind to news. 
I built an agent layer that fetches headlines from NewsAPI and Finnhub, then uses an LLM to &#8220;read&#8221; the sentiment and weigh it against the numerical forecast.</p></li></ul><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f6ec67a-68b4-447f-98cc-fdd9a3b4aa6a_925x1002.png&quot;}],&quot;caption&quot;:&quot;High Level System Design&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f6ec67a-68b4-447f-98cc-fdd9a3b4aa6a_925x1002.png&quot;}},&quot;isEditorNode&quot;:true}"></div><h4>Why the &#8220;<em>Parent-Child</em>&#8221; Strategy?</h4><p>A common approach is to train one model per stock, which becomes a maintenance nightmare as the list of tickers grows. Instead, I implemented a hierarchical training flow.</p><p>I start with a <strong>Parent Model</strong>: a robust model trained on broad market sector data to capture general &#8220;<em>market physics</em>.&#8221; When I want to predict a specific ticker like <code>INFY.NS</code> or <code>AAPL</code>, I fine-tune a <strong>Child Model</strong> on that specific asset. Fine-tuning is faster and more data-efficient than training from scratch, and the child inherits a <em>baseline</em> understanding of how markets move.</p><p><strong>But why use an LSTM for this?</strong> In a world obsessed with Transformers (so am I), LSTMs remain a pragmatic choice for time series forecasting for a few reasons:</p><ol><li><p><strong>Memory over Attention:</strong> Price data is sequential. 
LSTMs are explicitly designed to maintain a <em>hidden state</em> that carries information across time steps, making them naturally suited for the <em>lead-lag</em> effects in price action.</p></li><li><p><strong>Data Efficiency:</strong> Transformers are notoriously data-hungry. 
Since we often have only a limited number of high-quality daily bars for a specific ticker, an LSTM tends to converge faster and is less prone to overfitting on small datasets.</p></li><li><p><strong>The </strong><em><strong>Vanishing Gradient</strong></em><strong> Solution:</strong> Older RNNs struggle with long-range dependencies because gradients shrink as they are propagated back through time. The LSTM&#8217;s gated architecture (input, forget and output gates) mitigates this and lets the model learn which historical events, like a massive earnings beat three months ago, are still relevant to today&#8217;s price.</p></li></ol><h4>Keeping an Eye on the <em>Regime</em></h4><p>The rules of the market change. A model built for a bull market is a liability during a crash. I didn&#8217;t want to just <em>hope</em> the model was still accurate, so I built a <strong>Monitoring Runner</strong>.</p><p>Every time you hit <code>/monitor/{ticker}</code>, the system runs a check for <strong>Regime Drift</strong>. It compares current market volatility and returns against what the model saw during training. If the market <em>regime</em> has shifted significantly, the system flags it in a JSON artifact. Simply put, it monitors its own relevance.</p><h4>Observability</h4><p>If you aren&#8217;t measuring it, you aren&#8217;t managing it. 
The whole pipeline exports metrics to <strong>Prometheus</strong>, and I visualize them in <strong>Grafana</strong>.</p><p>On top of that, a custom <strong>monitoring</strong> layer checks:</p><ul><li><p>Overall system health</p></li><li><p>Market regime drift (is the current market behaving differently from training?)</p></li><li><p>Quality of the generated analysis</p></li></ul><p>All of this runs together with Docker Compose, so spinning up FastAPI, Redis, Prometheus, Grafana, and the Streamlit dashboard takes just one command.</p><h4>Frontend</h4><p>There&#8217;s a simple <strong>Streamlit</strong> app that lets you:</p><ul><li><p>Trigger training, prediction, analysis, or monitoring</p></li><li><p>View the latest forecast chart</p></li><li><p>Read the generated analysis</p></li><li><p>Check monitoring results in cards and tables</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1k9d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1k9d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 424w, https://substackcdn.com/image/fetch/$s_!1k9d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 848w, https://substackcdn.com/image/fetch/$s_!1k9d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1k9d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1k9d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png" width="1346" height="636" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:636,&quot;width&quot;:1346,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:393197,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/192926530?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1k9d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 424w, https://substackcdn.com/image/fetch/$s_!1k9d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 848w, 
https://substackcdn.com/image/fetch/$s_!1k9d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 1272w, https://substackcdn.com/image/fetch/$s_!1k9d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9802b3b-c81f-494e-a967-8b076576e9e3_1346x636.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!L2Ig!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L2Ig!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 424w, https://substackcdn.com/image/fetch/$s_!L2Ig!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 848w, https://substackcdn.com/image/fetch/$s_!L2Ig!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 1272w, https://substackcdn.com/image/fetch/$s_!L2Ig!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L2Ig!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png" width="595" height="583" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:595,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:257896,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://lanarkite99.substack.com/i/192926530?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L2Ig!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 424w, https://substackcdn.com/image/fetch/$s_!L2Ig!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 848w, https://substackcdn.com/image/fetch/$s_!L2Ig!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 1272w, https://substackcdn.com/image/fetch/$s_!L2Ig!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb32465b5-88e1-49cc-b70c-e793d8c2e4b8_595x583.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h4>What can be done better?</h4><ul><li><p>Train a larger LSTM model.</p></li><li><p>Use Evidently AI for more professional and accurate drift detection.</p></li><li><p>Deployment to cloud (Next phase, stay tuned!)</p></li></ul><h4>Major Learnings:</h4><ul><li><p><strong>Every Decision Starts with &#8220;</strong><em><strong>It Depends</strong></em><strong>&#8221;:</strong> There is no perfect architecture, only trade-offs. We chose an LSTM over a Transformer and Redis over a standard DB because those tools fit the specific latency and data constraints of this project.</p></li><li><p><strong>Solve the Business Problem, Not the Hype:</strong> At the end of the day, what matters is the problem you are solving. 
For this pipeline, the problem wasn&#8217;t just <em>predicting a price</em>; it was providing <em>actionable, monitored financial insights.</em></p></li><li><p><strong>Stack Alignment:</strong> Your current stack must support your ultimate goal. Using FastAPI and Docker keeps this a portable, scalable service that actually works in a production-like environment.</p></li><li><p><strong>The </strong><em><strong>Invisible</strong></em><strong> Infrastructure:</strong> Trust me, training the ML model will be the easiest part of your job. The real value, and often the real difficulty, lies in the &#8220;<em>connective tissue</em>&#8221;: the data flow, the caching strategy and the observability are what keep the system from flying blind in production.</p></li></ul><h4>What&#8217;s Next?</h4><p>We are not done yet. This is a foundation. The repo is designed so you can swap out the LSTM for a Transformer or change the LLM provider without breaking the plumbing. The next phase will involve deploying this whole thing to the cloud, so stay tuned!</p><p>If you found this useful, follow me on <strong>GitHub, X and LinkedIn</strong>. You can find the full repo and setup instructions at the link below. Go ahead and fork it, clone it, build it, break it and build it again yourself. 
Because the truth is, until you&#8217;ve built the thing yourself, you don&#8217;t really understand it.</p><p><strong><a href="https://github.com/lanarkite99/stock_pred_pipeline">Visit my GitHub</a></strong></p><p><strong><a href="https://x.com/MeetSiddhapura">Visit my Twitter</a></strong></p><p><strong><a href="https://www.linkedin.com/in/meet-siddhapura/">Visit my LinkedIn</a></strong></p>]]></content:encoded></item></channel></rss>