About Core Dumps

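A core dump is produced when a running process is terminated by certain signals: the contents of the process's address space at that moment, together with other information about the process state, are written to a file on disk.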

Signals whose default action produces a core dump include:

Signal    Action   Comment
SIGQUIT   Core     Quit from keyboard
SIGILL    Core     Illegal Instruction
SIGABRT   Core     Abort signal from abort
SIGSEGV   Core     Invalid memory reference
SIGTRAP   Core     Trace/breakpoint trap
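
To see the "Core" action from the parent's side, the sketch below forks a child that calls abort() and then inspects the wait status. The names child_result and run_aborting_child are made up for this illustration. WCOREDUMP is not part of POSIX, so it is guarded with #ifdef, and whether a core file is actually written still depends on the core size limit discussed later.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* How the child died (illustrative struct, not from the original post). */
struct child_result {
    int signaled;     /* 1 if the child was killed by a signal */
    int signal_no;    /* terminating signal, valid when signaled is 1 */
    int core_dumped;  /* 1 if the kernel reports that a core was written */
};

/* Fork a child that aborts, then report how it terminated.
 * Returns 0 on success, -1 if fork or waitpid failed. */
int run_aborting_child(struct child_result *out)
{
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        abort();  /* raises SIGABRT, whose default action is "Core" */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    out->signaled = WIFSIGNALED(status) ? 1 : 0;
    out->signal_no = out->signaled ? WTERMSIG(status) : 0;
#ifdef WCOREDUMP
    out->core_dumped = (out->signaled && WCOREDUMP(status)) ? 1 : 0;
#else
    out->core_dumped = 0;  /* macro unavailable on this platform */
#endif
    return 0;
}
```

Calling run_aborting_child(&r) should leave r.signal_no equal to SIGABRT; r.core_dumped is 1 only when the limit and core_pattern allow a dump to be written.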

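We can open the core dump file with gdb and inspect the state at the moment of the crash in order to debug the problem.
To make core dump files easier to find and read, the following configuration is needed on Linux.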

Set where core files are written (the default is the current directory)

Edit /proc/sys/kernel/core_pattern to have core files written to a specific directory:

mkdir /cores
echo "/cores/core.%t.%e.%p" | sudo tee /proc/sys/kernel/core_pattern
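
Before relying on the new location, it can help to read the pattern back. If the first character of core_pattern is '|', the kernel pipes the dump to a program instead of writing a file, and some distributions configure such a handler by default. The sketch below reads a pattern file and checks for that case; the helper names read_core_pattern and core_pattern_is_pipe are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a pattern file (normally
 * /proc/sys/kernel/core_pattern) into buf, without the trailing newline.
 * Returns 0 on success, -1 on failure. */
int read_core_pattern(const char *path, char *buf, size_t size)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    if (size == 0 || fgets(buf, (int)size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';  /* strip the newline, if any */
    return 0;
}

/* A leading '|' means the dump is piped to a program, not saved as a file. */
int core_pattern_is_pipe(const char *pattern)
{
    return pattern[0] == '|';
}
```

With the pattern set above, read_core_pattern("/proc/sys/kernel/core_pattern", buf, sizeof buf) should yield /cores/core.%t.%e.%p, and core_pattern_is_pipe(buf) should return 0.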

The pattern supports specifiers including:

%e  Executable name
%h Hostname
%p PID of dumped process
%s Signal causing dump
%t Time of dump
%u UID
%g GID
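
To make the substitution concrete, here is a small sketch that expands a template using the specifiers listed above. It only imitates what the kernel does; the struct dump_info, the function expand_core_pattern, and the choice to drop unknown specifiers are assumptions of this sketch, not kernel code.

```c
#include <stdio.h>
#include <string.h>

/* Values to substitute; every field here is illustrative. */
struct dump_info {
    const char *exe;   /* %e */
    const char *host;  /* %h */
    long pid;          /* %p */
    int sig;           /* %s */
    long time;         /* %t, seconds since the epoch */
    long uid;          /* %u */
    long gid;          /* %g */
};

/* Expand pattern into out (size bytes).
 * Returns 0 on success, -1 if the result does not fit. */
int expand_core_pattern(const char *pattern, const struct dump_info *d,
                        char *out, size_t size)
{
    size_t used = 0;
    if (size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (const char *p = pattern; *p != '\0'; ++p) {
        char piece[256];  /* longer values are truncated in this sketch */
        if (*p != '%') {
            snprintf(piece, sizeof piece, "%c", *p);
        } else {
            ++p;
            if (*p == '\0') {
                break;  /* a single trailing % is dropped */
            }
            switch (*p) {
            case 'e': snprintf(piece, sizeof piece, "%s", d->exe); break;
            case 'h': snprintf(piece, sizeof piece, "%s", d->host); break;
            case 'p': snprintf(piece, sizeof piece, "%ld", d->pid); break;
            case 's': snprintf(piece, sizeof piece, "%d", d->sig); break;
            case 't': snprintf(piece, sizeof piece, "%ld", d->time); break;
            case 'u': snprintf(piece, sizeof piece, "%ld", d->uid); break;
            case 'g': snprintf(piece, sizeof piece, "%ld", d->gid); break;
            case '%': snprintf(piece, sizeof piece, "%%"); break;
            default:  piece[0] = '\0'; break;  /* unknown: dropped here */
            }
        }
        size_t len = strlen(piece);
        if (used + len >= size) {
            return -1;  /* no room for the piece plus the terminator */
        }
        memcpy(out + used, piece, len + 1);
        used += len;
    }
    return 0;
}
```

With exe set to "pixelpai", pid 19 and time 1630405848, the template /cores/core.%t.%e.%p expands to /cores/core.1630405848.pixelpai.19, the file name used later in this post.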

Set the system ulimit core size

Run ulimit -c to see the current core size limit. The default is usually 0, which means no core dump file is generated.

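  • Run ulimit -c unlimited to set the limit for the current session; the setting is lost when you exit or open a new session
  • For Docker, containers follow the dockerd configuration by default; you can also set the limit at run time: docker run --ulimit core=-1 --security-opt seccomp=unconfined -v /cores:/cores <rest of the command>
  • Set it directly in the program; here is an example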
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>

/* RLIM_INFINITY means "no limit", like ulimit -c unlimited */
#define CORE_SIZE RLIM_INFINITY

int main(void)
{
    struct rlimit rlmt;
    if (getrlimit(RLIMIT_CORE, &rlmt) == -1) {
        return -1;
    }
    /* rlim_t is unsigned; cast to long long so "unlimited" prints as -1 on Linux */
    printf("Before set rlimit CORE dump current is:%lld, max is:%lld\n", (long long)rlmt.rlim_cur, (long long)rlmt.rlim_max);

    rlmt.rlim_cur = (rlim_t)CORE_SIZE;
    rlmt.rlim_max = (rlim_t)CORE_SIZE;

#if DEBUG
    /* This is the key call: set the core size limit */
    if (setrlimit(RLIMIT_CORE, &rlmt) == -1) {
        return -1;
    }
#endif

    if (getrlimit(RLIMIT_CORE, &rlmt) == -1) {
        return -1;
    }
    printf("After set rlimit CORE dump current is:%lld, max is:%lld\n", (long long)rlmt.rlim_cur, (long long)rlmt.rlim_max);

    /* Write through a null pointer to trigger a core dump */
    int *ptr = NULL;
    *ptr = 10;

    return 0;
}
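
One caveat with the example above: raising the hard limit (rlim_max) needs privileges when the current hard limit is lower, so setrlimit can fail for an ordinary user. A more conservative variant, sketched below under the assumption that the hard limit is already large enough, raises only the soft limit up to the hard limit. The function name raise_core_soft_limit is made up for this illustration.

```c
#include <sys/resource.h>

/* Raise the soft RLIMIT_CORE to the current hard limit.
 * An unprivileged process is allowed to do this.
 * Returns 0 on success, -1 on failure. */
int raise_core_soft_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) == -1) {
        return -1;
    }
    lim.rlim_cur = lim.rlim_max;  /* the hard limit is left unchanged */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

If the hard limit itself is 0, this still leaves core dumps disabled, and the limit has to be raised from outside the process (for example via ulimit or the Docker option above).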

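Most of our test runs happen in a CLion + WSL environment. When CLion launches WSL, it starts a separate sh shell through wsl.exe, so the ulimit we set inside the system does not take effect. That is why our project sets the limit inside the program. You can verify this with C:\Windows\system32\wsl.exe --distribution Ubuntu-18.04 --exec /bin/sh -c "ulimit -c"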

Add compiler flags so that more detailed information is available when inspecting the core dump file

CMakeLists.txt
add_definitions(-DDEBUG=true)
add_definitions(-DRELEASE=false)

# core dump config
add_compile_options(-O0 -g)
SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -O0 -g")

With that in place, whenever the program exits abnormally in the debug environment, we get a core file.

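Assuming the core file is /cores/core.1630405848.pixelpai.19, we can analyze it with gdb:
bt # print the backtrace of the stack at the time of the crash
frame 3 # short form: f 3; switch to frame 3 and print the corresponding code
p value # print the value of the object value in the current frame
up # move up one frame (toward the caller)
down # move down one frame (toward the callee)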

gdb main /cores/core.1630405848.pixelpai.19
...
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from cmake-build-debug/build/bin/main...done.
[New LWP 17765]
[New LWP 2336]
[New LWP 29510]
[New LWP 11379]
[New LWP 2335]
[New LWP 2334]
[New LWP 20475]
[New LWP 20476]
[New LWP 20477]

warning: Could not load shared library symbols for 2 libraries, e.g. ./cjson.so.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/never/work/pixelpai_server/cmake-build-debug/build/bin/main -props /home/'.
Program terminated with signal SIGABRT, Aborted.
#0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7fca28917700 (LWP 17765))]
(gdb) bt
#0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000562abaf64e47 in console::handler (sig=11) at /home/never/work/pixelpai_server/src/server/main/console_linux.cpp:244
#2 <signal handler called>
#3 0x00007fca316e0108 in Sprite::getSpriteSerialize (this=0x7fca00a66d00) at /home/never/work/pixelpai_server/src/server/world/virtualworld/scene/sprite.cpp:595
#4 0x00007fca3169a87a in Scene::sendEnterSceneToAll (this=0x7fca0217ba60, actorId=1685436067) at /home/never/work/pixelpai_server/src/server/world/virtualworld/scene/scene.cpp:535
...
(gdb) frame 3
#3 0x00007fca316e0108 in Sprite::getSpriteSerialize (this=0x7fca00a66d00) at /home/never/work/pixelpai_server/src/server/world/virtualworld/scene/sprite.cpp:595
(gdb) p animationSptr
$1 = std::shared_ptr<IAnimation> (empty) = {get() = 0x0}
(gdb) p spriteSptr
$2 = std::shared_ptr<op_client::Sprite> (use count 1, weak count 0) = {get() = 0x7fca0d84cea0}
(gdb) up
#4 0x00007fca3169a87a in Scene::sendEnterSceneToAll (this=0x7fca0217ba60, actorId=1685436067) at /home/never/work/pixelpai_server/src/server/world/virtualworld/scene/scene.cpp:535
535 auto characterProto = pCharacter->getSpriteSerialize();
(gdb) down
#3 0x00007fca316e0108 in Sprite::getSpriteSerialize (this=0x7fca00a66d00) at /home/never/work/pixelpai_server/src/server/world/virtualworld/scene/sprite.cpp:595
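
In the session above, gdb reports SIGABRT even though frame #1 shows console::handler being called with sig=11 (SIGSEGV), and the real fault is in frame #3. That shape is what you would expect when a SIGSEGV handler ends by raising SIGABRT. The code of console::handler is not shown in this post, so the following is only a guess at the pattern; crash_handler and run_crashing_child are hypothetical names.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler: a real one might log first, using only
 * async-signal-safe calls. It then terminates through SIGABRT so that
 * the core file still contains the faulting frames below the handler. */
static void crash_handler(int sig)
{
    (void)sig;
    signal(SIGABRT, SIG_DFL);  /* make sure the default "Core" action applies */
    raise(SIGABRT);
    _exit(1);                  /* only reached if raise somehow returns */
}

/* Fork a child that installs the handler and writes through NULL.
 * Returns the signal that terminated the child, or -1 on error. */
int run_crashing_child(void)
{
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        signal(SIGSEGV, crash_handler);
        volatile int *ptr = NULL;
        *ptr = 10;   /* invalid write: SIGSEGV, then the handler runs */
        _exit(0);    /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The child ends with SIGABRT rather than SIGSEGV, which is why the interesting frame in such a backtrace sits below the <signal handler called> marker.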

References

https://ctring.github.io/blog/2021/how-to-get-core-dump-in-a-Docker-container/
https://le.qun.ch/en/blog/core-dump-file-in-docker/
https://zhuanlan.zhihu.com/p/24311785
https://www.cnblogs.com/hazir/p/linxu_core_dump.html

Author: Nevermore
Posted on: 2022-01-07
Updated on: 2024-02-21
