This is the start of a series of posts based on the notes I made while reading the source code of popular open source projects. I have learned a lot from digging into open source code, and I hope these notes help you too.

I will start with Loki, a log aggregation system inspired by Prometheus.

Overview

grafana/loki consists of two components:

  • Loki: the log aggregation system
  • Promtail: the agent that collects logs and sends them to Loki

In a nutshell, Loki and Promtail form a client-server architecture. Both are written in Go. Since we are reading the code to learn, it’s better to read with questions in mind. Here are some of mine, which I would like to discuss over a couple of posts:

  1. How is the Loki server designed?
  2. Where and how does Loki store data?
  3. How does Promtail collect logs and send them to Loki?

The loki backbone

It’s not too hard to find the entry point of the Loki server: it’s in cmd/loki/main.go. The main function is quite simple. It first reads the config file, then initializes a Loki instance, then runs it.

Here is how a Loki instance is initialized:

func New(cfg Config) (*Loki, error) {
	loki := &Loki{
		Cfg:                 cfg,
		clientMetrics:       storage.NewClientMetrics(),
		deleteClientMetrics: deletion.NewDeleteRequestClientMetrics(prometheus.DefaultRegisterer),
		Codec:               queryrange.DefaultCodec,
	}
	analytics.Edition("oss")
	loki.setupAuthMiddleware()
	loki.setupGRPCRecoveryMiddleware()
	if err := loki.setupModuleManager(); err != nil {
		return nil, err
	}

	return loki, nil
}

What we can spot here is that Loki has a middleware system and a module manager.

If we dig deeper, every middleware implements this interface:

type Interface interface {
	Wrap(http.Handler) http.Handler
}

The Wrap method takes a handler and returns a handler, so middlewares can be chained together: each one wraps the next, and the request passes through all of them before it reaches the final handler.
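To make the chaining concrete, here is a minimal sketch that uses only the standard library. The Interface type is the one quoted above. The Func adapter, the Merge helper, and the tag middleware are written here for illustration and are not quoted from Loki.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Interface is the middleware interface quoted above.
type Interface interface {
	Wrap(http.Handler) http.Handler
}

// Func is an illustrative adapter that lets a plain function act as a middleware.
type Func func(http.Handler) http.Handler

// Wrap implements Interface by calling the function itself.
func (f Func) Wrap(next http.Handler) http.Handler { return f(next) }

// Merge chains middlewares so that the first one listed is the outermost,
// which means it sees the request first.
func Merge(mws ...Interface) Interface {
	return Func(func(next http.Handler) http.Handler {
		for i := len(mws) - 1; i >= 0; i-- {
			next = mws[i].Wrap(next)
		}
		return next
	})
}

// tag is a hypothetical middleware that records its name in a response header
// before passing the request on.
func tag(name string) Interface {
	return Func(func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Add("X-Trace", name)
			next.ServeHTTP(w, r)
		})
	})
}

// trace sends one request through a chain of tag middlewares and returns
// the order in which they ran.
func trace(names ...string) []string {
	mws := make([]Interface, 0, len(names))
	for _, n := range names {
		mws = append(mws, tag(n))
	}
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	Merge(mws...).Wrap(final).ServeHTTP(rec, req)
	return rec.Header().Values("X-Trace")
}

func main() {
	fmt.Println(trace("auth", "recovery")) // prints "[auth recovery]"
}
```

The point of the single-method interface is that auth, recovery, logging and so on can be written independently and composed in any order.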

The module manager, on the other hand, is set up in pkg/loki/loki.go, and it is where Loki registers its modules. A module is a component that can be enabled or disabled. For example, Loki has a module called queryfrontend, which is responsible for handling query requests. The module struct is:

// module is the basic building block of the application
type module struct {
	// dependencies of this module
	deps []string

	// initFn for this module (can return nil)
	initFn func() (services.Service, error)

	// is this module user visible
	userVisible bool

	// is the module allowed to be selected as a target
	targetable bool
}
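The deps field is what lets a manager bring modules up in the right order. Below is a rough sketch of that idea, not the actual module manager: the manager type, initTarget, and the server/ring/distributor wiring are all hypothetical, and initFn returns only an error so the example needs nothing outside the standard library.

```go
package main

import "fmt"

// module is a trimmed-down version of the struct above.
// Assumption: initFn returns only an error instead of a service.
type module struct {
	deps   []string
	initFn func() error
}

// manager is a hypothetical registry of modules keyed by name.
type manager struct {
	modules map[string]*module
}

// initTarget initializes the target after all of its transitive dependencies
// and returns the order in which modules were initialized.
func (m *manager) initTarget(target string) ([]string, error) {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := map[string]int{}
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		mod, ok := m.modules[name]
		if !ok {
			return fmt.Errorf("unknown module %q", name)
		}
		switch state[name] {
		case done:
			return nil // already initialized through another dependency path
		case inProgress:
			return fmt.Errorf("dependency cycle through %q", name)
		}
		state[name] = inProgress
		for _, dep := range mod.deps {
			if err := visit(dep); err != nil {
				return err
			}
		}
		if mod.initFn != nil {
			if err := mod.initFn(); err != nil {
				return fmt.Errorf("init %q: %w", name, err)
			}
		}
		state[name] = done
		order = append(order, name)
		return nil
	}

	if err := visit(target); err != nil {
		return nil, err
	}
	return order, nil
}

// demoManager wires three modules together; the dependency graph is made up.
func demoManager() *manager {
	noop := func() error { return nil }
	return &manager{modules: map[string]*module{
		"server":      {initFn: noop},
		"ring":        {deps: []string{"server"}, initFn: noop},
		"distributor": {deps: []string{"ring", "server"}, initFn: noop},
	}}
}

func main() {
	order, err := demoManager().initTarget("distributor")
	fmt.Println(order, err) // prints "[server ring distributor] <nil>"
}
```

Each module is initialized once, even when several other modules depend on it.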

Loki Run

Loki’s Run method is at pkg/loki/loki.go#Run. Because the module manager takes care of dozens of modules, and each module configures how it starts by providing its initFn, the Run method itself stays clean. Here is what it does:

  1. Initialize the service manager and start the services asynchronously
  2. Start the HTTP server, which by the way uses gorilla/mux as its router
  3. Start the gRPC server

Service manager

The service manager is a struct defined in grafana/dskit, a kit for building distributed systems. In dskit/services/service.go, the comments include a diagram of the service states:

(figure: service state diagram)

So a service can be in one of the following states: New, Starting, Running, Stopping, Terminated, or Failed.
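To get a feel for the lifecycle, here is a small sketch of a state type with a transition check. The six state names come from the list above. The transition table is my own assumption about a typical lifecycle, it is not copied from dskit, and the real implementation may allow transitions that are missing here.

```go
package main

import "fmt"

// State is a service lifecycle state, named after the list above.
type State int

const (
	New State = iota
	Starting
	Running
	Stopping
	Terminated
	Failed
)

var stateNames = [...]string{"New", "Starting", "Running", "Stopping", "Terminated", "Failed"}

// String returns the state name, or "Unknown" for an out-of-range value.
func (s State) String() string {
	if s < 0 || int(s) >= len(stateNames) {
		return "Unknown"
	}
	return stateNames[s]
}

// allowed is an ASSUMED transition table, not taken from dskit.
// Terminated and Failed have no entries, so they are final states here.
var allowed = map[State][]State{
	New:      {Starting, Terminated},
	Starting: {Running, Failed},
	Running:  {Stopping},
	Stopping: {Terminated, Failed},
}

// canTransition reports whether the assumed table allows moving from one state to another.
func canTransition(from, to State) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(New, "->", Starting, canTransition(New, Starting))
	fmt.Println(Terminated, "->", Running, canTransition(Terminated, Running))
}
```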

This is how the service manager is initialized:

func NewManager(services ...Service) (*Manager, error) {
	if len(services) == 0 {
		return nil, errors.New("no services")
	}

	m := &Manager{
		services:  services,
		byState:   map[State][]Service{},
		healthyCh: make(chan struct{}),
		stoppedCh: make(chan struct{}),
	}

	for _, s := range services {
		st := s.State()
		if st != New {
			return nil, fmt.Errorf("unexpected service state: %v", st)
		}

		m.byState[st] = append(m.byState[st], s)
	}

	for _, s := range services {
		s.AddListener(newManagerServiceListener(m, s))
	}
	return m, nil
}

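So the service manager keeps its services grouped by state in a map, and it requires every service to be in the New state when the manager is created. It also registers a listener on each service, so that callbacks run whenever that service changes state.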
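Here is a simplified sketch of that bookkeeping, to show why the listener matters: it is what keeps the per-state map up to date. Everything in it (service, manager, moved, count) is a hypothetical stand-in written for this post, not the dskit types.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a simplified lifecycle state; only two are needed for the demo.
type State string

const (
	New      State = "New"
	Starting State = "Starting"
)

// service is a hypothetical stand-in for a managed service.
type service struct {
	name      string
	state     State
	listeners []func(from, to State)
}

// addListener registers a callback that fires on every state change.
func (s *service) addListener(fn func(from, to State)) {
	s.listeners = append(s.listeners, fn)
}

// setState changes the state and notifies all listeners.
func (s *service) setState(to State) {
	from := s.state
	s.state = to
	for _, fn := range s.listeners {
		fn(from, to)
	}
}

// manager groups services by their current state.
type manager struct {
	mu      sync.Mutex
	byState map[State][]*service
}

// newManager records each service under its current state and subscribes
// to its state changes.
func newManager(svcs ...*service) *manager {
	m := &manager{byState: map[State][]*service{}}
	for _, s := range svcs {
		s := s // capture a per-iteration copy for the closure
		m.byState[s.state] = append(m.byState[s.state], s)
		s.addListener(func(from, to State) { m.moved(s, from, to) })
	}
	return m
}

// moved is the listener callback: it moves the service between state buckets.
func (m *manager) moved(s *service, from, to State) {
	m.mu.Lock()
	defer m.mu.Unlock()
	old := m.byState[from]
	for i, x := range old {
		if x == s {
			m.byState[from] = append(old[:i], old[i+1:]...)
			break
		}
	}
	m.byState[to] = append(m.byState[to], s)
}

// count returns how many services are currently in the given state.
func (m *manager) count(st State) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.byState[st])
}

func main() {
	a := &service{name: "a", state: New}
	b := &service{name: "b", state: New}
	m := newManager(a, b)
	a.setState(Starting)
	fmt.Println(m.count(New), m.count(Starting)) // prints "1 1"
}
```

With this in place the manager can answer questions such as "are all services running?" by looking at the map, without polling each service.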

Server

Loki wraps the HTTP server and the gRPC server in a struct called Server. The server is itself registered as a service and managed by the service manager.

You can find the Server definition in the dskit package. It is defined like this:


type Server struct {
	cfg          Config
	handler      SignalHandler
	grpcListener net.Listener
	httpListener net.Listener

	grpchttpmux        cmux.CMux
	grpcOnHTTPListener net.Listener
	GRPCOnHTTPServer   *grpc.Server

	HTTP       *mux.Router
	HTTPServer *http.Server
	GRPC       *grpc.Server
	Log        gokit_log.Logger
	Registerer prometheus.Registerer
	Gatherer   prometheus.Gatherer
}

It is worth noting that the server uses gorilla/mux as its HTTP router, and that it uses cmux so that gRPC and HTTP can be multiplexed on the same port.

In addition, it offers instrumentation facilities through Prometheus.

(to be continued…)