Feng's Notes
Isn't coding fun?

A Subtle NestJS Dependency Injection Pitfall: Request-Scoped Providers in Background Workers

Recently our team ran into a production incident caused by a subtle interaction in NestJS’s dependency injection system.

The root cause was a request-scoped provider being injected into a service that was also used by a background worker. At first glance the code looked perfectly valid, but the runtime behavior was very different from what we expected.

This article explains:

  • what happened
  • why it happens in NestJS
  • how dependency scope propagation works
  • how to design services to avoid this class of issue

Building Smarter Rate Limits in NestJS with Redis

When you build APIs that bill per token—like AI workloads—rate limiting stops being just a traffic control feature.

It becomes a revenue-protection mechanism.

We learned this the hard way: if you let users run multiple concurrent AI tasks before their token usage is reconciled, you can lose real money.

So we started from NestJS’s built-in throttler, explored Redis-based options, and eventually built our own token-bucket limiter with Lua.

This post walks through that decision process—what works, what doesn’t, and how to evolve your rate limiting when you move from simple backend requests to token-based billing.
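The token-bucket idea at the heart of that limiter is language-agnostic. Here is a minimal in-memory sketch in Python; the production version described above moves this refill-and-consume logic into a Redis Lua script so it runs atomically across instances (the class and method names here are illustrative, not the post's actual code):

```python
import time

class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens/sec, capped at `capacity`."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated = time.monotonic()

    def try_consume(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=5, rate=1.0)
results = [bucket.try_consume() for _ in range(6)]
# Back-to-back calls: the first 5 succeed, the 6th is rejected until the bucket refills.
```

The same refill math translates directly to Lua: read the stored token count and timestamp, refill, consume, and write back, all inside one `EVAL` so concurrent requests can't double-spend.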

Running ComfyUI in the Cloud

If you want to experiment with ComfyUI but don’t want to invest in an expensive GPU, you still have options thanks to various cloud providers. However, you’ll need to carefully consider the trade-offs. Here’s what I’ve learned from my own experience (and some research with GPT):

  1. Google Colab

Google Colab was my initial go-to for quick experiments. It provides free access to GPUs through Jupyter notebooks, so you can get ComfyUI running quickly by following the official instructions. The downside? Sessions are time-limited and can disconnect after a few hours, and free GPUs aren’t always available. Customizing the environment can also be tricky since everything resets when your session ends. Still, if you want to try ComfyUI without entering your credit card, Colab is a solid starting point.

How to RAG: A Case Study from Dify

RAG is a core component of LLM applications. The idea is to index your data in a vector database, retrieve the most relevant chunks for each query, and have the LLM generate responses grounded in that retrieved context.

The concept seems simple, but the implementation can be complex. I've recently been researching Dify, a popular LLM application platform, and found that its RAG engine is a good case study for understanding how to implement a comprehensive RAG system.
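The index-then-retrieve loop can be sketched with a toy in-memory vector store. The character-frequency "embedding" below is a deliberately crude stand-in; a real system would use an embedding model and one of the vector databases Dify supports:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class VectorIndex:
    def __init__(self, embed):
        self.embed = embed  # function: text -> vector
        self.docs = []      # list of (vector, text)

    def add(self, text):
        self.docs.append((self.embed(text), text))

    def search(self, query, k=2):
        qv = self.embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(d[0], qv), reverse=True)
        return [text for _, text in ranked[:k]]

# Dummy "embedding": letter-frequency vector, purely for illustration.
def toy_embed(text):
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

index = VectorIndex(toy_embed)
index.add("Redis is an in-memory data store")
index.add("Loki aggregates logs for Grafana")
hits = index.search("in-memory store", k=1)
# `hits` would then be stuffed into the LLM prompt as grounding context.
```

Everything hard in a production RAG engine (chunking, hybrid search, reranking) lives in the pieces this sketch stubs out, which is exactly what makes Dify's implementation worth studying.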

Run Multiple Asyncio Frameworks: Is It Possible?

The answer is yes, but it's not a trivial task. I ran into this question while building a Telegram bot with the python-telegram-bot (PTB) framework, when I also wanted to run a FastAPI server under uvicorn. PTB is built on top of asyncio, and calling Application.run_polling() blocks the event loop, so I had to find a way to run both without one blocking the other.

Option 1: Embed the other asyncio frameworks in one event loop

This is actually the recommended way to run multiple asyncio frameworks (including a web server or another bot) side by side: instead of letting one framework own the loop with a blocking run_* helper, you drive each framework's startup and shutdown yourself inside a single event loop.
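Stripped of PTB and uvicorn specifics, the pattern looks like the sketch below: two long-lived services sharing one loop, each started and stopped explicitly. With the real libraries you would replace the dummy `Service` with `application.start()` / `application.updater.start_polling()` on the PTB side and a `uvicorn.Server(config).serve()` task on the web side:

```python
import asyncio

class Service:
    """Stand-in for a framework with an explicit start/stop lifecycle."""

    def __init__(self, name):
        self.name = name
        self.handled = 0
        self._running = asyncio.Event()

    async def start(self):
        self._running.set()
        while self._running.is_set():
            self.handled += 1          # pretend to process one update/request
            await asyncio.sleep(0.01)

    def stop(self):
        self._running.clear()

async def main():
    bot = Service("bot")
    web = Service("web")
    # One loop, two long-lived tasks: neither blocks the other.
    tasks = [asyncio.create_task(bot.start()), asyncio.create_task(web.start())]
    await asyncio.sleep(0.05)
    bot.stop()
    web.stop()
    await asyncio.gather(*tasks)
    return bot.handled, web.handled

handled = asyncio.run(main())
```

The key design choice is that no framework ever calls its own `run_forever`-style entry point; the one event loop created by `asyncio.run()` owns everything.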

LLM Evaluation Frameworks

As Large Language Models (LLMs) become increasingly critical in production systems, robust evaluation frameworks are essential for ensuring their reliability and performance. This article walks through modern LLM evaluation approaches, examining key frameworks and their specialized capabilities.

Core Evaluation Dimensions

It’s important to understand that LLM evaluation is not a one-size-fits-all task. The evaluation framework you choose should align with your specific use case and evaluation requirements. In general, there are three core dimensions to consider:

Telegram Bot With AWS Lambda

Serverless architecture offers a great way to build and deploy applications without managing servers. In this post, we’ll walk through the process of creating a Telegram bot using AWS Lambda for serverless execution, Terraform for infrastructure as code, and Python for the bot’s logic. We’ll also use a Lambda Layer to manage our dependencies efficiently.

Project Structure

Let’s start with our project structure:

Initial Project Structure

telegram-bot/
├── terraform/
│   ├── main.tf
│   └── variables.tf
├── src/
│   └── bot.py
├── requirements.txt
├── build_layer.sh
└── README.md
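To make the structure concrete, here is a minimal sketch of what src/bot.py might contain: a handler that parses the Telegram webhook payload delivered through API Gateway and echoes the message back via the Bot API's sendMessage method. The environment-variable name and echo behavior are illustrative assumptions, not the post's exact code:

```python
import json
import os
import urllib.request

TELEGRAM_API = "https://api.telegram.org/bot{token}/{method}"

def send_message(chat_id, text, token):
    """Call Telegram's sendMessage method via the Bot API."""
    url = TELEGRAM_API.format(token=token, method="sendMessage")
    data = json.dumps({"chat_id": chat_id, "text": text}).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def lambda_handler(event, context):
    # API Gateway delivers the Telegram update as a JSON string in `body`.
    update = json.loads(event["body"])
    message = update.get("message")
    if message and "text" in message:
        send_message(
            message["chat"]["id"],
            f"You said: {message['text']}",
            os.environ["TELEGRAM_TOKEN"],
        )
    # Telegram only needs a 200 so it stops retrying the webhook.
    return {"statusCode": 200, "body": "ok"}
```

Note there are no third-party imports here: keeping bot.py to the standard library is what makes the Lambda Layer (built by build_layer.sh) optional for a bot this small.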

Project Structure After Building

After running our build scripts, the structure will look like this:

GPT Researcher Deep Dive

I recently discovered GPT Researcher, an impressive project that’s revolutionizing how we conduct online research. Its ability to generate comprehensive reports quickly and cost-effectively caught my attention, so I decided to dive deeper into its inner workings. In this article, I’ll explore the architecture behind GPT Researcher, why it’s so fast, discuss considerations for deploying it as a service, and look at potential future developments.

1. How GPT Researcher Works: The Architecture

GPT Researcher employs a sophisticated multi-agent architecture that’s both efficient and effective. Here’s a breakdown of its key components:

Multi-agents for long article generation

It’s not too hard these days to generate a research report or an article with the help of AI: figure out the topic, prompt ChatGPT, and it will generate a draft you can polish into a decent article. However, I found myself even lazier than that. I just want to give the AI a topic and some key points and have it generate a long, comprehensive article for me.

How to install Loki, Grafana and Prometheus on Kubernetes 2024

Loki + Grafana + Prometheus is a powerful combination for monitoring and logging. I found there were very few up-to-date instructions on how to set up all three components with Helm on Kubernetes, so I decided to write down my experience here.

Prerequisites

You need to have Helm installed; if not, you can follow the instructions here. Then add the corresponding Helm repositories:

for Grafana and Loki:

helm repo add grafana https://grafana.github.io/helm-charts

for Prometheus:
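Assuming the standard community charts, the Prometheus repository is added the same way (repository name and URL per the prometheus-community helm-charts project):

```shell
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```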