Debugging runtime differences from the same code base

Asked: 2016-03-04 18:06:34

Tags: c++ c++11 omnet++ veins mixim

I'm currently using the VEINS library and simulation package to run some experiments. Because these have very long run times, I'm trying to use the university cluster servers (KITE 2.0/RHEL6.6/Lustre 2.5.29.ddnpf3). However, I've now encountered several different runtime bugs there, with the same code that runs perfectly fine on my local machine (Fedora 23). I'm looking for a way to easily debug this problem. I suspect that the cause lies in the different gcc version, or perhaps in some other system-level library that I can't change remotely (but I'm not sure). I'm certain that the OMNeT++ version is the same; the VEINS library is provided by me and is the same locally and remotely.

An example of the issues I've encountered is discussed here, which I eventually fixed like this (as far as I can tell, both versions have the same semantics... DimensionSet extends std::set, and DimensionSet::timeFreqDomain is a static const initialized with (Dimension::time, Dimension::frequency) as in the fix).

What is a good approach to look for the cause? Is there a simple way to "cross-compile" between these machines, or some way to diff the binaries to look for the cause? Where do I look for common ways to deal with problems like these?

1 answer:

Answer 0 (score: 3)

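I may have tracked the bug down to an instance of the static initialization order fiasco: MiXiM's Dimension::time is a static member, so it should not be used to initialize other static members. The order in which static objects in different translation units are initialized is unspecified, which is why the same code can work with one compiler or linker and crash with another. Unfortunately, initializing one static from another is exactly what MiXiM (and thus Veins) did, leading to crashes like this one.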
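To illustrate the problem and one common way around it, here is a minimal sketch of the construct-on-first-use idiom. The `Dimension` and `DimensionSet` classes below are simplified, hypothetical stand-ins written for this example, not the real MiXiM classes, and this is not necessarily how the Veins code base resolved the issue. The idea is that a function-local static is initialized the first time the function is called, so whoever needs the object gets a fully constructed one regardless of translation unit order.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for MiXiM's Dimension (assumption).
class Dimension {
public:
    explicit Dimension(std::string name) : name_(std::move(name)) {}

    bool operator<(const Dimension& other) const { return name_ < other.name_; }
    const std::string& name() const { return name_; }

    // Instead of "static const Dimension time;" (a static data member whose
    // initialization order relative to other translation units is
    // unspecified), expose an accessor with a function-local static.
    // It is constructed on first call; C++11 makes this thread-safe.
    static const Dimension& time() {
        static const Dimension instance("time");
        return instance;
    }

    static const Dimension& frequency() {
        static const Dimension instance("frequency");
        return instance;
    }

private:
    std::string name_;
};

// Hypothetical stand-in for DimensionSet, which extends std::set.
class DimensionSet : public std::set<Dimension> {
public:
    DimensionSet(const Dimension& a, const Dimension& b) {
        insert(a);
        insert(b);
    }

    // Safe even if first called during static initialization of another
    // translation unit: the accessors above construct their objects on demand.
    static const DimensionSet& timeFreqDomain() {
        static const DimensionSet instance(Dimension::time(),
                                           Dimension::frequency());
        return instance;
    }
};

// A namespace-scope static that depends on the set. With plain static data
// members this could observe an unconstructed object; with the accessors
// it cannot.
static const std::size_t kDomainSize = DimensionSet::timeFreqDomain().size();
```

Callers then write `DimensionSet::timeFreqDomain()` instead of `DimensionSet::timeFreqDomain`. The trade-off is a function call (plus a guard check) on each access, which is usually negligible compared with a crash that depends on link order.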

I pushed commit 7807f47c (part of Veins 4.4), which eliminates almost all static members, so the whole framework should be safer now.