一个线程安全的泛型支持map库

orcaman/concurrent-map是一个非常高效的线程安全的map库，正如它的文档中所说的那样，标准库sync.Map更适合append-only的场景，或者说少写大量的读的场景，如果针对多读多写的场景，concurrent-map可能会更有优势。它是通过分片的方式，将锁的粒度减少，从而提高性能。

今年初的时候，这个库做了改造，开始支持泛型，但是不幸的是，它只支持Value值泛型，它的key只能是string类型，这就限制了它的应用场景。

我fork了这个项目，创建了一个新的线程安全的库smallnest/safemap,同时支持Key和Value的泛型，可以应用更多的sh使用场合。

我能理解为什么concurrent-map的key不支持泛型，因为不好实现。因为对于一个泛型的Key,我们不太好计算它的hash值，要么转换成string再计算hash值性能底下，要么提供一个hash函数变量，让用户自己实现hash算法，用户使用起来又不方便。

我最近看到一个新的项目alphadose/haxmap,它也是一个线程安全的map, 基于Harris lock-free list算法，参考它的Key的hash的计算方法，实现了对concurrent-map打补丁，以便Key和Value同时支持泛型，这就促成了smallnest/safemap库。

当然，在复制haxmap的hash算法的过程中，也发现了它的hash算法对一些类型有问题，做了修复。

大家既可以使用safemap处理并发读写的场景，也可以通过safemap了解如何对一个既有的项目进行泛型支持的改造。

Map Key/Value同时支持泛型的改造

concurrent-map的定义如下:

type ConcurrentMap[V any] []*ConcurrentMapShared[V]

type ConcurrentMapShared[V any] struct {
	items        map[string]V
	sync.RWMutex // Read Write mutex, guards access to internal map.
}

可以看到，它只定义了V any类型约束，用来做Value的类型，而它的Key类型只能是string类型: items map[string]V。
这个好处是它可以只实现一个特定的对string类型的hash算法，不用考虑各种key类型:

func fnv32(key string) uint32 {
	hash := uint32(2166136261)
	const prime32 = uint32(16777619)
	keyLength := len(key)
	for i := 0; i < keyLength; i++ {
		hash *= prime32
		hash ^= uint32(key[i])
	}
	return hash
}

为啥不使用内建map的hash算法，针对所有的comparable类型计算hash值呢？这个问题也有过讨论#21195,但最后也不了了之了。虽然在Go 1.19 Go官方把内部的memhash算法暴露出来，但是针对本文的场景，我们期望的是能暴露一个泛型的Hash算法。

不管怎样，我们假定有一个办法，我们自己实现了一个针对泛型的hash算法。那么我们现在就需要对ConcurrentMap进行改造。

我首先把它命名为SafeMap,更简短一些，同时支持Key和Value的类型约束:

type SafeMap[K comparable, V any] struct {
	shared []*SafeMapShared[K, V]
	hasher func(K) uintptr
}

type SafeMapShared[K comparable, V any] struct {
	items        map[K]V
	sync.RWMutex 
}

这里定义了一个泛型的hasher: hasher func(K) uintptr,用来计算一个泛型的Key的hash值。下面我们去实现它。

接着就是修复程序的标红的地方。也就是原来硬编码string的地方，我们都要用K进行替换，比如:

func (m *SafeMap[K, V]) GetShard(key K) *SafeMapShared[K, V] {
	k := uint(m.hasher(key))
	return m.shared[uint(k)%uint(ShardCount)]
}

// Sets the given value under the specified key.
func (m *SafeMap[K, V]) Set(key K, value V) {
	// Get map shard.
	shard := m.GetShard(key)
	shard.Lock()
	shard.items[key] = value
	shard.Unlock()
}

Receiver要进行修改，因为目前它有两个类型约束(K、V)。涉及到Key string的地方都要使用Key K进行替换。

这个的hash计算使用hasher方法。接下来就看看我们默认的hasher算法。

hasher

这个hasher方法实现的难点在于要支持泛型。如果要是特定的类型，比如int64、string，我们可以直接使用它底层的值计算hash值。如果要是interface{}呢,也好办，利用type switch也可以方便的机型处理。如果是泛型呢，它是在编译期执行的，也不能直接使用type switch，那怎么办呢？

首先，我们在初始化的时候去判断Key的类型，这可以通过反射实现(先把Key转为interface{},再使用type switch)，然后根据不同的类型返回特定的hasher算法，这样因为hasher实在初始化的时候确定下来的，也不影响后续的Set、Get方法的性能。

下面是生成hasher的片段：

var （
	// qword hasher, key size -> 8 bytes
	qwordHasher = func(key uint64) uintptr {
		k1 := key * prime2
		k1 = bits.RotateLeft64(k1, 31)
		k1 *= prime1
		h := (prime5 + 8) ^ k1
		h = bits.RotateLeft64(h, 27)*prime1 + prime4
		h ^= h >> 33
		h *= prime2
		h ^= h >> 29
		h *= prime3
		h ^= h >> 32
		return uintptr(h)
	}

    	float64Hasher = func(key float64) uintptr {
		k := *(*uint64)(unsafe.Pointer(&key))
		h := prime5 + 4
		h ^= uint64(k) * prime1
		h = bits.RotateLeft64(h, 23)*prime2 + prime3
		h ^= h >> 33
		h *= prime2
		h ^= h >> 29
		h *= prime3
		h ^= h >> 32
		return uintptr(h)
	}
）
    
func genHasher[K comparable]() func(K) uintptr {
	var hasher func(K) uintptr

	switch any(*new(K)).(type) {
	case int64, uint64:
		hasher = *(*func(K) uintptr)(unsafe.Pointer(&qwordHasher))
	case float64:
		hasher = *(*func(K) uintptr)(unsafe.Pointer(&float64Hasher))

首先把泛型的值转换成interface{}，这样才能type switch: switch any(*new(K)).(type) {。

如果是int64、uint64类型, 我们可以使用qwordHasher这个预定义的hasher,不过它的类型是func(key uint64) uintptr，对于int64类型来说，可以使用unsafe强制转一下。

这样，最难的一点我们就攻克了，各种类型都可以独立的去处理，只需要进一步针对不同的类型定义相应的hasher就可以了。你可以使用各种hash算法去实现。

这样，safemap就改造完了。性能怎么样，我建议你实际进行测试，而且最好根据你的实际的场景进行测试。项目中提供了一些基本的benchmark测试。

另外，我还建议你关注一下alphadose/haxmap,我的初步测试它的性能会更好，但是我还需要等待它发展一段时间，变得更稳定。