Case Study: LLM Chat Platform (LibreChat)
Deployed and operated a private, self-hosted LibreChat instance, with model access through Azure AI Foundry, for coursework and experimentation with document workflows and retrieval‑augmented generation (RAG).
Last updated: January 2026
Summary
- What it is: A private LLM chat workspace with file uploads and RAG-backed document search.
- Who it’s for: A small group (classmates + personal use) for study and experimentation.
- My role: Deployment, configuration, operations, and documenting the system end‑to‑end.
Architecture (high level)
- Compute: Azure VM running Docker Compose.
- LLM access: Azure AI Foundry (model deployments/configuration).
- Storage: Azure Blob Storage for file storage.
- Database: MongoDB Atlas for application data.
- RAG workflow: Document uploads + vector search (implementation details still in progress; a sketch of the intended flow follows below).
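
For context, here is a minimal sketch of the intended upload, embed, and search flow. It assumes the `openai` Python SDK's AzureOpenAI client and an embedding deployment named `text-embedding-3-small` (a hypothetical deployment name), and an in-memory array stands in for whatever vector index the final pipeline uses.

```python
# Minimal sketch of the upload -> embed -> search flow; not the production pipeline.
# Assumes the `openai` SDK's AzureOpenAI client and an embedding deployment named
# "text-embedding-3-small" (hypothetical name). The in-memory store is for brevity.
import os

import numpy as np
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts with the Foundry embedding deployment."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# "Index" a document: chunk it, embed the chunks, keep vectors alongside the text.
chunks = ["LibreChat runs on an Azure VM.", "Files are stored in Blob Storage."]
vectors = embed(chunks)

def search(query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (cosine similarity)."""
    q = embed([query])[0]
    sims = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

print(search("Where are uploaded files kept?"))
```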
What I built / configured
- Deployed the stack and established a repeatable deployment process.
- Configured authentication, file storage, and document ingestion workflows (the upload path is sketched after this list).
- Built/maintained runbooks and operational documentation for stability.
- Set up usage/cost monitoring workflows to keep spending predictable (see the usage rollup sketch below).
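
The upload side of ingestion, sketched with the `azure-storage-blob` SDK. The container name `librechat-uploads` and the metadata keys are illustrative, not the instance's actual configuration.

```python
# Sketch of pushing an uploaded document into Blob Storage ahead of ingestion.
# Assumes the `azure-storage-blob` package; "librechat-uploads" and the metadata
# keys are illustrative names, not this instance's actual configuration.
import os
from pathlib import Path

from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"]
)
container = service.get_container_client("librechat-uploads")

def upload_document(path: str, owner: str) -> str:
    """Upload a local file and tag it with metadata used later by ingestion."""
    blob_name = Path(path).name
    with open(path, "rb") as f:
        container.upload_blob(
            name=blob_name,
            data=f,
            overwrite=True,
            metadata={"owner": owner, "status": "pending-ingestion"},
        )
    return blob_name

# Example call (assumes the file exists locally):
# upload_document("notes/lecture-03.pdf", owner="me")
```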
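
For usage monitoring, one option is to roll up token spend straight from the application database. The sketch below assumes LibreChat's `transactions` collection with `tokenValue` and `createdAt` fields; verify those names against the schema of your LibreChat version.

```python
# Sketch of a daily token-usage rollup from LibreChat's MongoDB data.
# Assumes a `transactions` collection with `tokenValue` and `createdAt` fields;
# collection/field names are assumptions about LibreChat's schema.
import os
from datetime import datetime, timedelta, timezone

from pymongo import MongoClient

db = MongoClient(os.environ["MONGO_URI"])["LibreChat"]

since = datetime.now(timezone.utc) - timedelta(days=7)
pipeline = [
    {"$match": {"createdAt": {"$gte": since}}},
    {
        # One bucket per calendar day, summing token spend within it.
        "$group": {
            "_id": {"$dateToString": {"format": "%Y-%m-%d", "date": "$createdAt"}},
            "tokens": {"$sum": "$tokenValue"},
        }
    },
    {"$sort": {"_id": 1}},
]
for day in db.transactions.aggregate(pipeline):
    print(day["_id"], day["tokens"])
```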
Key learnings
- Ops is product work: good runbooks reduce downtime and cognitive load.
- RAG quality depends heavily on ingestion, chunking, and metadata, not just the model (see the chunking sketch after this list).
- Small-group “production” still needs guardrails (auth, backups, cost limits).
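
To make the chunking point concrete, here is a minimal overlap chunker that keeps provenance metadata with every chunk. The window and overlap sizes are illustrative, not tuned values from this deployment.

```python
# Minimal overlap chunker that keeps source metadata with every chunk.
# Chunk/overlap sizes are illustrative, not tuned values from this deployment.
def chunk_with_metadata(text: str, source: str, size: int = 800, overlap: int = 100):
    """Split text into overlapping windows, each carrying provenance metadata."""
    chunks = []
    step = size - overlap
    for i, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        chunks.append(
            {
                "text": text[start : start + size],
                "source": source,      # which document this came from
                "chunk_index": i,      # position, useful for reassembly/citations
            }
        )
    return chunks

doc = "LibreChat runs on an Azure VM behind Docker Compose. " * 40
pages = chunk_with_metadata(doc, source="lecture-03")
print(len(pages), pages[0]["source"], pages[0]["chunk_index"])
```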
Appendix: configured models (as of Oct–Dec 2025)
This list is included for completeness, not as the main value proposition.
- gpt-5.2-chat
- gpt-5.2
- gpt-5.2-codex
- gpt-5-pro
- gpt-5-mini
- Mistral-Large-3
- mistral-medium-2505
- DeepSeek-R1-0528
- MAI-DS-R1
- Kimi-K2-Thinking