# 目录 [−]

## Boker 架构

### network layer

Kafka使用NIO自己实现了网络层的代码， 而不是采用netty, mina等第三方的网络框架。从性能上来讲，这一块的代码不是性能的瓶颈。

Kafka的服务器由SocketServer实现,它是一个NIO的服务器，线程模型如下：

• 1个Acceptor线程负责处理新连接
• N个Processor线程， 每个processor都有自己的selector，负责从socket中读取请求和发送response
• M个Handler线程处理请求，并产生response给processor线程

#### b. 它为每个Processor生成一个线程并启动，然后启动一个Acceptor线程。

Acceptor是一个典型NIO 处理新连接的方法类：

#### d. Processor的accept方法将新连接加入它的新连接待处理队列中

configureNewConnections方法中注册OP_READ

### API layer

API层的主要功能是由KafkaApis类实现的。

KafkaRequestHandler不断的从requestChannel队列里面取出request交给apis处理。

apis根据不同的请求类型调用不同的方法进行处理。

### Replication subsystem

To publish a message to a partition, the client first finds the leader of the partition from Zookeeper and sends the message to the leader. The leader writes the message to its local log. Each follower constantly pulls new messages from the leader using a single socket channel. That way, the follower receives all messages in the same order as written in the leader. The follower writes each received message to its own log and sends an acknowledgment back to the leader. Once the leader receives the acknowledgment from all replicas in ISR, the message is committed. The leader advances the HW and sends an acknowledgment to the client. For better performance, each follower sends an acknowledgment after the message is written to memory. So, for each committed message, we guarantee that the message is stored in multiple replicas in memory. However, there is no guarantee that any replica has persisted the commit message to disks though. Given that correlated failures are relatively rare, this approach gives us a good balance between response time and durability. In the future, we may consider adding options that provide even stronger guarantees. The leader also periodically broadcasts the HW to all followers. The broadcasting can be piggybacked on the return value of the fetch requests from the followers. From time to time, each replica checkpoints its HW to its disk.

### Log subsystem

LogManager负责管理Kafka的Log(Kafka消息)， 包括log/Log文件夹的创建，获取和清理。它也会通过定时器检查内存中的log是否要缓存到磁盘中。

## Metrics

Kafka使用metrics进行性能的度量。原先是yammer metrics,现在独立成dropwizard metrics.目前这个框架的package名字比较乱，但是性能监控的功能却是非常的强大。
metrics提供了几种reporter,可以将性能报告显示在哪里， 比如控制台，JMX, Slf4j,Ganglia，Graphite等。
Kafka实现了一个CSV文件报告类KafkaCSVMetricsReporter，它调用metrics的CsvReporter生成报告。

## Producer

kafka.producer.Producer定义了两种类型的Producer: sync和async。基本上都是通过 eventHandler.handle(messages)处理消息, 只不过async会通过一个线程， 以LinkedBlockingQueue为缓冲发送消息。

kafka.javaapi.producer.Producer则提供了java接口。

## Consumer

kafka.consumer.SimpleConsumer提供了Simple Consumer API.它通过一个BlockingChannel发送消息，接收Response完成任务。
kafka.javaapi.consumer.SimpleConsumer则提供了java接口。

High level consumer实际由ZookeeperConsumerConnector完成，它将consumer信息记录在zookeeper中，提供KafkaStream获取Kafka消息。

## 参考文档

1. https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Internals