Question

我已经创建了一个共享库（在OSX上为“ dylib”，在Ubuntu上为“ so”）和一个加载该库的可执行文件。如果我只是将共享库链接到可执行文件（cmake中的link_libraries），则一切正常。

现在，我不链接它，而是使用dlopen / dlsym打开库。在可以工作的OSX上，可执行文件可以平稳运行，但是在Linux上，它在特定时刻崩溃。这是valgrind跟踪：

==7253== Jump to the invalid address stated on the next line
 ==7253==    at 0x0: ???
==7253==    by 0x61DB539: void __gnu_cxx::new_allocator<std::thread>::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (new_allocator.h:136)
==7253==    by 0x61D7780: void std::allocator_traits<std::allocator<std::thread> >::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::allocator<std::thread>&, std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (alloc_traits.h:475)
==7253==    by 0x61D7840: void std::vector<std::thread, std::allocator<std::thread> >::_M_realloc_insert<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread, std::allocator<std::thread> > >, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:415)
==7253==    by 0x61D371D: void std::vector<std::thread, std::allocator<std::thread> >::emplace_back<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:105)
==7253==    by 0x61D19F5: ThreadPool::ThreadPool(unsigned long) (ThreadPool.h:38)
==7253==    by 0x112545: main (testexecutable.cpp:216)
==7253==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==7253== Process terminating with default action of signal 11 (SIGSEGV)
==7253==  Bad permissions for mapped region at address 0x0
==7253==    at 0x0: ???
==7253==    by 0x61DB539: void __gnu_cxx::new_allocator<std::thread>::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (new_allocator.h:136)
==7253==    by 0x61D7780: void std::allocator_traits<std::allocator<std::thread> >::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::allocator<std::thread>&, std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (alloc_traits.h:475)
==7253==    by 0x61D7840: void std::vector<std::thread, std::allocator<std::thread> >::_M_realloc_insert<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread, std::allocator<std::thread> > >, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:415)
==7253==    by 0x61D371D: void std::vector<std::thread, std::allocator<std::thread> >::emplace_back<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:105)
==7253==    by 0x61D19F5: ThreadPool::ThreadPool(unsigned long) (ThreadPool.h:38)
==7253==    by 0x112545: main (testexecutable.cpp:216)

代码实际上是这样的：

...
// need to keep track of threads so we can join them
std::vector< std::thread > workers;
// the task queue
std::queue< std::function<void()> > tasks;
...

// the constructor just launches some amount of workers
inline ThreadPool::ThreadPool(size_t threads)
: stop(false)
{
for (size_t i = 0; i<threads; ++i)
    workers.emplace_back(
        [this]
   {
...

，崩溃恰在emplace_back调用处。任何想法为什么会发生这种情况？ GCC是7.3.0，Ubuntu 18.04。

编辑1

Link to github repo with code

编辑2

好，所以这是解决方案的一部分。我的同事指出，这可能是由于将函数指针（lambda）放置在可执行文件和共享库的不同堆栈上，引起了混淆-我无法验证，但这是我发现的：

ldd test
linux-vdso.so.1 (0x00007ffd6bdc7000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd8766de000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fd876350000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd875f5f000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd876ae5000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd875bc1000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fd8759a9000)

未将 pthread 显示为必需的库。但是，共享库引用了pthread。

ldd liblibrary.so 
linux-vdso.so.1 (0x00007ffc97b74000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007efce4d30000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007efce49a2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007efce478a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007efce4399000)
/lib64/ld-linux-x86-64.so.2 (0x00007efce515f000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007efce3ffb000)

尽管引用了它，但对共享库中需要 pthread 的函数的任何调用都会导致应用程序崩溃-看来， pthread 库未加载完全没有

如果我对主线程进行调用，即

void dummyfunction() {}

int main(int argc, char* argv[]) {
   std::thread dummy(&dummyfunction);
   dummy.join();
   ...
   // dlopen/dlsym here...
   ...
   initFunction();
   ...
   // dlclose
   return 0;
}

pthread 被添加到库列表中，

ldd test
linux-vdso.so.1 (0x00007ffdc7bd0000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5d13777000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5d13573000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5d131e5000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5d12fcd000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5d12bdc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5d13b9c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5d1283e000)

它被加载，并且一切都在共享库中起作用。

但是为什么不能从共享库中加载pthread库？

也尝试在 pthread 上的共享库中使用dlopen，但这没用。

Answer 1

向@o11c致谢以指出这一点。解决该问题的一种方法是在可执行文件的链接器中添加一个标志，并将 pthread 显式添加到库列表中

SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--no-as-needed")
target_link_libraries(test pthread dl)

Linux可执行文件通过dlopen打开共享库在emplace_back上崩溃

1 个答案: