Feng's Notes
Isn't coding fun?

A Subtle NestJS Dependency Injection Pitfall: Request-Scoped Providers in Background Workers

Recently our team ran into a production incident caused by a subtle interaction in NestJS’s dependency injection system.

The root cause was a request-scoped provider being injected into a service that was also used by a background worker. At first glance the code looked perfectly valid, but the runtime behavior was very different from what we expected.

This article explains:

  • what happened
  • why it happens in NestJS
  • how dependency scope propagation works
  • how to design services to avoid this class of issue

Building Smarter Rate Limits in NestJS with Redis

When you build APIs that bill per token—like AI workloads—rate limiting stops being just a traffic control feature.

It becomes a revenue-protection mechanism.

We learned this the hard way: if you let users run multiple concurrent AI tasks before their token usage is reconciled, you can lose real money.

So we started from NestJS’s built-in throttler, explored Redis-based options, and eventually built our own token-bucket limiter with Lua.

This post walks through that decision process—what works, what doesn’t, and how to evolve your rate limiting when you move from simple backend requests to token-based billing.
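The token-bucket idea at the heart of that limiter is language-agnostic. Here is a minimal in-memory sketch in Python; the production version described above moves this refill-and-consume logic into a Redis Lua script so it runs atomically across instances (the class and method names here are illustrative, not the post's actual code):

```python
import time

class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens/sec, capped at `capacity`."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated = time.monotonic()

    def try_consume(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=5, rate=1.0)
results = [bucket.try_consume() for _ in range(6)]
# Back-to-back calls: the first 5 succeed, the 6th is rejected until the bucket refills.
```

The same refill math translates directly to Lua: read the stored token count and timestamp, refill, consume, and write back, all inside one `EVAL` so concurrent requests can't double-spend.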

Running ComfyUI in the Cloud

If you want to experiment with ComfyUI but don’t want to invest in an expensive GPU, you still have options thanks to various cloud providers. However, you’ll need to carefully consider the trade-offs. Here’s what I’ve learned from my own experience (and some research with GPT):

  1. Google Colab

Google Colab was my initial go-to for quick experiments. It provides free access to GPUs through Jupyter notebooks, so you can get ComfyUI running quickly by following the official instructions. The downside? Sessions are time-limited and can disconnect after a few hours, and free GPUs aren’t always available. Customizing the environment can also be tricky since everything resets when your session ends. Still, if you want to try ComfyUI without entering your credit card, Colab is a solid starting point.

How to RAG: A Case Study from Dify

RAG is a core component of LLM applications. The idea is to index your data in a vector database, retrieve the most relevant chunks for each query, and have the LLM generate responses grounded in that retrieved context.

The concept seems simple, but the implementation can be complex. I've recently been researching Dify, a popular LLM application platform, and found that its RAG engine is a good case study for understanding how to implement a comprehensive RAG system.
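The index-then-retrieve loop can be sketched with a toy in-memory vector store. The character-frequency "embedding" below is a deliberately crude stand-in; a real system would use an embedding model and one of the vector databases Dify supports:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class VectorIndex:
    def __init__(self, embed):
        self.embed = embed  # function: text -> vector
        self.docs = []      # list of (vector, text)

    def add(self, text):
        self.docs.append((self.embed(text), text))

    def search(self, query, k=2):
        qv = self.embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(d[0], qv), reverse=True)
        return [text for _, text in ranked[:k]]

# Dummy "embedding": letter-frequency vector, purely for illustration.
def toy_embed(text):
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

index = VectorIndex(toy_embed)
index.add("Redis is an in-memory data store")
index.add("Loki aggregates logs for Grafana")
hits = index.search("in-memory store", k=1)
# `hits` would then be stuffed into the LLM prompt as grounding context.
```

Everything hard in a production RAG engine (chunking, hybrid search, reranking) lives in the pieces this sketch stubs out, which is exactly what makes Dify's implementation worth studying.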

Run Multiple Asyncio Frameworks: Is It Possible?

The answer is yes, but it's not a trivial task. I ran into this question while building a Telegram bot with the python-telegram-bot (PTB) framework, when I also wanted to run a FastAPI server under uvicorn. PTB is built on top of asyncio, and calling Application.run_polling() blocks the event loop, so I had to find a way to run both without one blocking the other.

Option 1: Embed the other asyncio frameworks in one event loop

This is actually the recommended way to run multiple asyncio frameworks (including a web server or another bot) side by side: instead of letting one framework own the loop with a blocking run_* helper, you drive each framework's startup and shutdown yourself inside a single event loop.
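Stripped of PTB and uvicorn specifics, the pattern looks like the sketch below: two long-lived services sharing one loop, each started and stopped explicitly. With the real libraries you would replace the dummy `Service` with `application.start()` / `application.updater.start_polling()` on the PTB side and a `uvicorn.Server(config).serve()` task on the web side:

```python
import asyncio

class Service:
    """Stand-in for a framework with an explicit start/stop lifecycle."""

    def __init__(self, name):
        self.name = name
        self.handled = 0
        self._running = asyncio.Event()

    async def start(self):
        self._running.set()
        while self._running.is_set():
            self.handled += 1          # pretend to process one update/request
            await asyncio.sleep(0.01)

    def stop(self):
        self._running.clear()

async def main():
    bot = Service("bot")
    web = Service("web")
    # One loop, two long-lived tasks: neither blocks the other.
    tasks = [asyncio.create_task(bot.start()), asyncio.create_task(web.start())]
    await asyncio.sleep(0.05)
    bot.stop()
    web.stop()
    await asyncio.gather(*tasks)
    return bot.handled, web.handled

handled = asyncio.run(main())
```

The key design choice is that no framework ever calls its own `run_forever`-style entry point; the one event loop created by `asyncio.run()` owns everything.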

LLM Evaluation Frameworks

As Large Language Models (LLMs) become increasingly critical in production systems, robust evaluation frameworks are essential for ensuring their reliability and performance. This article walks through modern LLM evaluation approaches, examining key frameworks and their specialized capabilities.

Core Evaluation Dimensions

It’s important to understand that LLM evaluation is not a one-size-fits-all task. The evaluation framework you choose should align with your specific use case and evaluation requirements. In general, there are three core dimensions to consider:

Telegram Bot With AWS Lambda

Serverless architecture offers a great way to build and deploy applications without managing servers. In this post, we’ll walk through the process of creating a Telegram bot using AWS Lambda for serverless execution, Terraform for infrastructure as code, and Python for the bot’s logic. We’ll also use a Lambda Layer to manage our dependencies efficiently.

Project Structure

Let’s start with our project structure:

Initial Project Structure

telegram-bot/
├── terraform/
│   ├── main.tf
│   └── variables.tf
├── src/
│   └── bot.py
├── requirements.txt
├── build_layer.sh
└── README.md
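To make the structure concrete, here is a minimal sketch of what src/bot.py might contain: a handler that parses the Telegram webhook payload delivered through API Gateway and echoes the message back via the Bot API's sendMessage method. The environment-variable name and echo behavior are illustrative assumptions, not the post's exact code:

```python
import json
import os
import urllib.request

TELEGRAM_API = "https://api.telegram.org/bot{token}/{method}"

def send_message(chat_id, text, token):
    """Call Telegram's sendMessage method via the Bot API."""
    url = TELEGRAM_API.format(token=token, method="sendMessage")
    data = json.dumps({"chat_id": chat_id, "text": text}).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def lambda_handler(event, context):
    # API Gateway delivers the Telegram update as a JSON string in `body`.
    update = json.loads(event["body"])
    message = update.get("message")
    if message and "text" in message:
        send_message(
            message["chat"]["id"],
            f"You said: {message['text']}",
            os.environ["TELEGRAM_TOKEN"],
        )
    # Telegram only needs a 200 so it stops retrying the webhook.
    return {"statusCode": 200, "body": "ok"}
```

Note there are no third-party imports here: keeping bot.py to the standard library is what makes the Lambda Layer (built by build_layer.sh) optional for a bot this small.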

Project Structure After Building

After running our build scripts, the structure will look like this:

GPT Researcher Deep Dive

I recently discovered GPT Researcher, an impressive project that’s revolutionizing how we conduct online research. Its ability to generate comprehensive reports quickly and cost-effectively caught my attention, so I decided to dive deeper into its inner workings. In this article, I’ll explore the architecture behind GPT Researcher, why it’s so fast, discuss considerations for deploying it as a service, and look at potential future developments.

1. How GPT Researcher Works: The Architecture

GPT Researcher employs a sophisticated multi-agent architecture that’s both efficient and effective. Here’s a breakdown of its key components:

Multi-agents for long article generation

It’s not too hard these days to generate a research report or an article with the help of AI: figure out the topic, prompt ChatGPT, and it will generate a draft you can polish into a decent article. However, I found myself even lazier than that. I just want to give the AI a topic and some key points and have it generate a long, comprehensive article for me.

How to install Loki, Grafana and Prometheus on Kubernetes 2024

Loki + Grafana + Prometheus is a powerful combination for monitoring and logging. I found there were very few up-to-date instructions on how to set up all three components with Helm on Kubernetes, so I decided to write down my experience here.

Prerequisites

You need to have Helm installed; if not, you can follow the instructions here. Then add the corresponding Helm repositories:

for Grafana and Loki:

helm repo add grafana https://grafana.github.io/helm-charts

for Prometheus:
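Assuming the standard community charts, the Prometheus repository is added the same way (repository name and URL per the prometheus-community helm-charts project):

```shell
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```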