Go泛型是怎么实现的?

Go 1.17中你就可以使用泛型了,可以参考我3月份的文章:Go 泛型尝鲜, 编译的时候需要加-gcflags=-G=3参数,而当前master分支,默认已经支持泛型,不需要加-G=3参数了。

你可以通过下面的步骤尝试go最新分支:

1
2
go get golang.org/dl/gotip
gotip download

编译代码的时候使用gotip替换go命令即可。

随着Go 1.17的发布,最近也涌现了很多的介绍Go泛型的文章,基本上都是简单介绍的文章。

最近Go泛型的变化是增加了两个操作符: ~|

  • an approximation element ~T restricts to all types whose underlying type is T: 代表底层类型是T
  • a union element T1 | T2 | ... restricts to any of the listed elements: 代表,类型列表之一。

这些不是我想介绍的内容,今天我肝一篇介绍Go泛型实现原理的文章,介绍Go泛型实现的方案。

对于一个函数func Echo[T any](t T){},Go编译器到底编译成了什么代码?

简单的说,当前Go泛型实现的方案和下图中的方案一样:

在国内的老破小小区的楼道中常见的一种高科技印刷技术,通过一个镂花模板,为每一种类型生成特化的类型,这个术语叫做stenciling

但是如果再说多一点,那么就应该从 Taylor和Griesemer说起。

Go泛型提案中关于泛型实现的介绍

Go的泛型有别于其它语言的方案,在Go语言中泛型叫做Type Parameter(类型参数).

Taylor和Griesemer的提案Type Parameters Proposal更多的是泛型呈现形式和影响的思考,对具体的实现涉及甚少。

无论什么编程语言,根据Russ Cox的观察,实现泛型至少要面对下面三条困境之一,那还是在2009年:

  • 拖累程序员:比如C语言,增加了程序员的负担,需要曲折的实现,但是不对增加语言的复杂性
  • 拖累编译器: 比如C++编程语言,增加了编译器的负担,可能会产生很多冗余的代码,重复的代码还需要编译器斟酌删除,编译的文件可能非常大。Rust的泛型也属于这一类。
  • 拖累执行时间:比如Java,将一些装箱成Object,进行类型擦除。虽然代码没啥冗余了,空间节省了,但是需要装箱拆箱操作,代码效率低。

很显然, Go语言至简的设计哲学让它的泛型实现不会选择增加程序员的负担的道路,所以它会在第二和第三种方案中做选择。虽然提案中没有最终说明它选择了哪种方案,但是从实际编译的代码可以看出,它选择的是第二种方案。

三个方案

Keith H. Randall, MIT的博士,现在在Google/Go team做泛型方面的开发,提出了Go泛型实现的三个方案:

字典

在编译时生成一组实例化的字典,在实例话一个泛型函数的时候会使用字典进行蜡印(stencile)。

当为泛型函数生成代码的时候,会生成唯一的一块代码,并且会在参数列表中增加一个字典做参数,就像方法会把receiver当成一个参数传入。字典包含为类型参数实例化的类型信息。

字典在编译时生成,存放在只读的data section中。

当然字段可以当成第一个参数,或者最后一个参数,或者放入一个独占的寄存器。

当然这种方案还有依赖问题,比如字典递归的问题,更重要的是,它对性能可能有比较大的影响,比如一个实例化类型int, x=y可能通过寄存器复制就可以了,但是泛型必须通过memmove

蜡印(Stenciling)

或者翻译成用模板印等。

就像下面的动图一样,同一个泛型函数,为每一个实例化的类型参数生成一套独立的代码,感觉和rust的泛型特化一样。

这种方案和上面的字典方案正好相反。

比如下面一个泛型方法:

1
2
3
func f[T1, T2 any](x int, y T1) T2 {
...
}

如果有两个不同的类型实例化的调用:

1
2
var a float64 = f[int, float64](7, 8.0)
var b struct{f int} = f[complex128, struct{f int}](3, 1+1i)

那么这个方案会生成两套代码:

1
2
3
4
5
6
func f1(x int, y int) float64 {
... identical bodies ...
}
func f2(x int, y complex128) struct{f int} {
... identical bodies ...
}

因为编译f时是不知道它的实例化类型的,只有在调用它时才知道它的实例化的类型,所以需要在调用时编译f。对于相同实例化类型的多个调用,同一个package下编译器可以识别出来是一样的,只生成一个代码就可以了,但是不同的package就不简单了,这些函数表标记为DUPOK,所以链接器会丢掉重复的函数实现。

这种策略需要更多的编译时间,因为需要编译泛型函数多次。因为对于同一个泛型函数,每种类型需要单独的一份编译的代码,如果类型非常多,编译的文件可能非常大,而且性能也比较差。

混合方案(GC Shape Stenciling)

混合前面的两种方案。

对于实例类型的shape相同的情况,只生成一份代码,对于shape类型相同的类型,使用字典区分类型的不同行为。

这种方案介于前两者之间。

啥叫shape?

类型的shape是它对内存分配器/垃圾回收器呈现的方式,包括它的大小、所需的对齐方式、以及类型哪些部分包含指针。

每一个唯一的shape会产生一份代码,每份代码携带一个字典,包含了实例化类型的信息。

这种方案的问题是到底能带来多大的收益,它会变得有多慢,以及其它的一些问题。

从当前的反编译的代码看,当前Go采用的是第二种方案,尽管名称中已经带了shapedict的标志,或许,Go的泛型方案还在进化之中,进化到第三种方案或者其它方案也不是没有可能。

接下来我们看一个例子,看看Go泛型的方案是怎么实现的。

例子

下面是一个简单的例子,有一个泛型函数func echo[T any](t T) string {return fmt.Sprintf("%v", t)},使用不同的几种实例化类型去调用它,并且使用shape一样的int32uint32做为实例化类型。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
package generic
import (
"fmt"
"time"
)
func echo[T any](t T) string {
return fmt.Sprintf("%v", t)
}
func Test() {
echo(0)
echo(int32(0))
echo(uint32(0))
echo(uint64(0))
echo("hello")
echo(struct{}{})
echo(time.Now())
}

反编译后代码非常长,精简如下。编译的时候禁止优化和内联,否则实例化的代码内联后看不到效果了。

可以看到函数echo编译成了不同的函数:"".echo[.shape.int]"".echo[.shape.int32]"".echo[.shape.uint32]"".echo[.shape.uint64]"".echo[.shape.string]"".echo[.shape.struct{}]"".echo[.shape.struct{ time.wall uint64; time.ext int64; time.loc *time.Location }]不同的函数,即使shape一样的类型(int32uint32)。 调用这些函数时,是通过""..dict.echo[uint64]这种方式调用的。

所以我谨慎怀疑,Go的泛型方式在逐步的向第三种方案进化。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
# command-line-arguments
"".Test STEXT size=185 args=0x0 locals=0x48 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) TEXT "".Test(SB), ABIInternal, $72-0
"".Test STEXT size=185 args=0x0 locals=0x48 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) TEXT "".Test(SB), ABIInternal, $72-0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) CMPQ SP, 16(R14)
0x0004 00004 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) PCDATA $0, $-2
0x0004 00004 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) JLS 175
0x000a 00010 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) PCDATA $0, $-1
0x000a 00010 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) SUBQ $72, SP
0x000e 00014 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) MOVQ BP, 64(SP)
0x0013 00019 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) LEAQ 64(SP), BP
0x0018 00024 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0018 00024 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) FUNCDATA $1, gclocals·54241e171da8af6ae173d69da0236748(SB)
0x0018 00024 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:13) LEAQ ""..dict.echo[int](SB), AX
0x001f 00031 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:13) XORL BX, BX
0x0021 00033 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:13) PCDATA $1, $0
0x0021 00033 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:13) CALL "".echo[.shape.int](SB)
0x0026 00038 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:14) LEAQ ""..dict.echo[int32](SB), AX
0x002d 00045 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:14) XORL BX, BX
0x002f 00047 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:14) CALL "".echo[.shape.int32](SB)
0x0034 00052 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:15) LEAQ ""..dict.echo[uint32](SB), AX
0x003b 00059 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:15) XORL BX, BX
0x003d 00061 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:15) NOP
0x0040 00064 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:15) CALL "".echo[.shape.uint32](SB)
0x0045 00069 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:16) LEAQ ""..dict.echo[uint64](SB), AX
0x004c 00076 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:16) XORL BX, BX
0x004e 00078 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:16) CALL "".echo[.shape.uint64](SB)
0x0053 00083 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:17) LEAQ ""..dict.echo[string](SB), AX
0x005a 00090 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:17) LEAQ go.string."hello"(SB), BX
0x0061 00097 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:17) MOVL $5, CX
0x0066 00102 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:17) CALL "".echo[.shape.string](SB)
0x006b 00107 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:18) LEAQ ""..dict.echo[struct{}](SB), AX
0x0072 00114 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:18) CALL "".echo[.shape.struct{}](SB)
0x0077 00119 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) CALL time.Now(SB)
0x007c 00124 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) MOVQ AX, ""..autotmp_0+40(SP)
0x0081 00129 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) MOVQ BX, ""..autotmp_0+48(SP)
0x0086 00134 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) MOVQ CX, ""..autotmp_0+56(SP)
0x008b 00139 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) MOVQ CX, DI
0x008e 00142 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) MOVQ BX, CX
0x0091 00145 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) MOVQ AX, BX
0x0094 00148 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) LEAQ ""..dict.echo[time.Time](SB), AX
0x009b 00155 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) NOP
0x00a0 00160 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:19) CALL "".echo[.shape.struct{ time.wall uint64; time.ext int64; time.loc *time.Location }](SB)
0x00a5 00165 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:20) MOVQ 64(SP), BP
0x00aa 00170 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:20) ADDQ $72, SP
0x00ae 00174 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:20) RET
0x00af 00175 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:20) NOP
0x00af 00175 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) PCDATA $1, $-1
0x00af 00175 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) PCDATA $0, $-2
0x00af 00175 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) CALL runtime.morestack_noctxt(SB)
0x00b4 00180 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) PCDATA $0, $-1
0x00b4 00180 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:12) JMP 0
.................
"".echo[.shape.int] STEXT dupok size=268 args=0x10 locals=0x88 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.int](SB), DUPOK|ABIInternal, $136-16
.................
0x00c2 00194 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) MOVQ DI, SI
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) PCDATA $1, $0
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL fmt.Sprintf(SB)
.................
"".echo[.shape.int32] STEXT dupok size=266 args=0x10 locals=0x88 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.int32](SB), DUPOK|ABIInternal, $136-16
.................
0x00bd 00189 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) MOVL $1, DI
0x00c2 00194 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) MOVQ DI, SI
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) PCDATA $1, $0
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL fmt.Sprintf(SB)
.................
"".echo[.shape.uint32] STEXT dupok size=266 args=0x10 locals=0x88 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.uint32](SB), DUPOK|ABIInternal, $136-16
.................
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) PCDATA $1, $0
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL fmt.Sprintf(SB)
.................
"".echo[.shape.uint64] STEXT dupok size=268 args=0x10 locals=0x88 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.uint64](SB), DUPOK|ABIInternal, $136-16
.................
0x00c2 00194 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) MOVQ DI, SI
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) PCDATA $1, $0
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL fmt.Sprintf(SB)
.................
"".echo[.shape.string] STEXT dupok size=295 args=0x18 locals=0x88 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.string](SB), DUPOK|ABIInternal, $136-24
.................
0x00d6 00214 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) PCDATA $1, $2
0x00d6 00214 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL fmt.Sprintf(SB)
.................
"".echo[.shape.struct{}] STEXT dupok size=208 args=0x8 locals=0x88 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.struct{}](SB), DUPOK|ABIInternal, $136-8
.................
0x0093 00147 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) PCDATA $1, $0
0x0093 00147 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL fmt.Sprintf(SB)
.................
0x00cb 00203 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) JMP 0
.................
"".echo[.shape.struct{ time.wall uint64; time.ext int64; time.loc *time.Location }] STEXT dupok size=364 args=0x20 locals=0xa0 funcid=0x0
0x0000 00000 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) TEXT "".echo[.shape.struct{ time.wall uint64; time.ext int64; time.loc *time.Location }](SB), DUPOK|ABIInternal, $160-32
.................
0x00c5 00197 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CMPL runtime.writeBarrier(SB), $0
0x00cc 00204 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) JEQ 208
0x00ce 00206 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) JMP 214
0x00d0 00208 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) MOVQ AX, 8(CX)
0x00d4 00212 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) JMP 221
0x00d6 00214 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:9) CALL runtime.gcWriteBarrier(SB)
.................
0x0167 00359 (/Users/chaoyuepan/go/src/github.com/smallnest/study/type_parameter/generic/generic.go:8) JMP 0
...............
"".echo[.shape.string].stkobj SRODATA static size=32
.......
"".echo[.shape.string].arginfo1 SRODATA static dupok size=9
....... ..........
"".echo[.shape.struct{}].stkobj SRODATA static size=32
.......
"".echo[.shape.struct{}].arginfo1 SRODATA static dupok size=5
.......
"".echo[.shape.struct{ time.wall uint64; time.ext int64; time.loc *time.Location }].stkobj SRODATA static size=56
......
"".echo[.shape.struct{ time.wall uint64; time.ext int64; time.loc *time.Location }].arginfo1 SRODATA static dupok size=11
0x0000 00 08 fe 08 08 10 08 18 08 fd ff ...........

泛型的性能

写一个简单的benchmark程序,没看到明显的性能变化。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
package bench_test
import (
"fmt"
"testing"
)
func BenchmarkAdd_Generic(b *testing.B) {
for i := 0; i < b.N; i++ {
add(i, i)
}
}
func BenchmarkAdd_NonGeneric(b *testing.B) {
for i := 0; i < b.N; i++ {
addInt(i, i)
}
}
type Addable interface {
int
}
func add[T Addable](a, b T) T {
return a + b
}
func addInt(a, b int) int {
return a + b
}
func main() {
fmt.Println(add(1, 2))
fmt.Println(addInt(1, 2))
}

参考文档

  1. https://github.com/golang/proposal/blob/master/design/generics-implementation-dictionaries.md
  2. https://github.com/golang/proposal/blob/master/design/generics-implementation-gcshape.md
  3. https://github.com/golang/proposal/blob/master/design/generics-implementation-stenciling.md
  4. https://github.com/golang/proposal/blob/master/design/43651-type-parameters.md
  5. https://docs.google.com/document/d/1vrAy9gMpMoS3uaVphB32uVXX4pi-HnNjkMEgyAHX4N4/view#