Commit f7fe68e

Many improvements to die interpretations given our explorations into high_pc (#89)
* many improvements to DWARF die interpretations given our explorations into `high_pc`
* fixing up tests
* adding dtor supplement until we get the high_pc shenanigans figured out there
* additional details around the qualified name of a symbol
1 parent 6a2840c commit f7fe68e

8 files changed: +454 -126 lines changed


README.md

+20
@@ -255,3 +255,23 @@ The following flags are not currently in use or will undergo heavy changes as th
 - `[orc_test_flags]`: A series of runtime settings to pass to the test app for this test.
 
 - `[orc_flags]`: A series of runtime settings to pass to the ORC engine for this test.
+
+# Appendix A: Destructor Implementations
+
+A given class's destructor has been observed to differ in size across translation units. Part of the confusion may stem from the Itanium ABI, which defines [several destructor types](https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-ctor). [Mark Rowe](https://www.linkedin.com/in/bdash/) provides an excellent synopsis:
+
+> Presumably since you're seeing these destructors in multiple object files, they are declared in the header. The definitions have what is referred to as "vague linkage". For Mach-O this typically means they're emitted as weak definitions.
+>
+> At link time, the linker will use a strong definition for the symbol if it exists, otherwise it'll pick one of the weak definitions to use. If the symbol is not exported from the binary it'll be converted to a strong definition, otherwise it'll remain weak and the dynamic loader will do the same resolution process at load time (strong definition if one exists, otherwise pick one of the available weak definitions).
+>
+> If all of the various definitions are equivalent from an ABI point of view, it should not matter if they compile to slightly different code. However, if they are not ABI compatible, you'll have bugs that can be very hard to track down.
+>
+> `-fomit-frame-pointer` being used in some translation units and not others will result in different code being generated for the same function. Different optimization levels will as well. Those should still be ABI-compatible though.
+>
+> Things like class members or virtual functions that are conditionally included or can change types based on `#if`s are a common source of problems.
+>
+> The other thing to be aware of is that for classes with virtual member functions, the compiler will often generate two destructors: the regular destructor, and the so-called "deleting" destructor. The deleting destructor is effectively a call to the regular destructor followed by a call to the appropriate `operator delete` implementation. If you're not distinguishing between these two types of destructors that may lead you to believe they're different sizes.
+>
+> The different destructor types are described in the Itanium ABI and can be distinguished via their mangled names.
+
+(This appendix should be kept around until there is reasonable confidence that ORC is discerning between the various types and minimizing false positives.)
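
The appendix notes that the destructor types can be told apart by their mangled names. In the Itanium ABI the encodings are `D0` (deleting destructor), `D1` (complete object destructor), and `D2` (base object destructor). The following is a minimal sketch of that distinction, not ORC's implementation: `dtor_kind` and `classify_dtor` are hypothetical names, and the parser only handles simple nested names made of length-prefixed identifiers (no templates or substitutions).

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical names for illustration; not part of ORC.
enum class dtor_kind { deleting, complete, base };

// Classifies mangled names of the simple form
//   _ZN <len><identifier>... D{0,1,2} E ...
// and returns std::nullopt for anything else.
std::optional<dtor_kind> classify_dtor(std::string_view s) {
    // Mach-O symbol tables prepend one extra underscore.
    if (s.substr(0, 3) == "__Z") s.remove_prefix(1);
    if (s.substr(0, 3) != "_ZN") return std::nullopt;
    s.remove_prefix(3);

    // Skip each length-prefixed identifier (namespaces, then the class).
    while (!s.empty() && std::isdigit(static_cast<unsigned char>(s.front()))) {
        std::size_t len = 0;
        while (!s.empty() && std::isdigit(static_cast<unsigned char>(s.front()))) {
            len = len * 10 + static_cast<std::size_t>(s.front() - '0');
            s.remove_prefix(1);
        }
        if (len > s.size()) return std::nullopt;
        s.remove_prefix(len);
    }

    // What remains must start with the destructor code and the closing 'E'.
    if (s.size() < 3 || s[0] != 'D' || s[2] != 'E') return std::nullopt;
    switch (s[1]) {
        case '0': return dtor_kind::deleting;
        case '1': return dtor_kind::complete;
        case '2': return dtor_kind::base;
        default: return std::nullopt;
    }
}
```

Walking the length prefixes, rather than searching for a `D1Ev` suffix, avoids misreading a member function that happens to be named `D1` (mangled as `_ZN3Foo2D1Ev`) as a destructor.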

include/orc/dwarf_constants.hpp

+38
@@ -106,6 +106,8 @@ enum class at : std::uint16_t {
     abstract_origin = 0x31,
     accessibility = 0x32,
     address_class = 0x33,
+    // DW_AT_artificial attribute indicates that the associated entity (e.g., a function, variable, or parameter)
+    // is compiler-generated rather than explicitly written by the programmer in the source code.
     artificial = 0x34,
     base_types = 0x35,
     calling_convention = 0x36,
@@ -114,9 +116,27 @@ enum class at : std::uint16_t {
     decl_column = 0x39,
     decl_file = 0x3a,
     decl_line = 0x3b,
+    // DW_AT_declaration indicates that the associated entity is a declaration rather than a definition.
+    // A function declaration is typically represented as a DW_TAG_subprogram entry with the attribute
+    // DW_AT_declaration set to true (or 1).
+    // It does not have attributes like DW_AT_low_pc or DW_AT_high_pc, as it does not correspond to actual code.
+    // Example:
+    // <1><0x0000003a> DW_TAG_subprogram
+    // DW_AT_name ("myFunction")
+    // DW_AT_declaration (true)
+
+    // A function implementation is also represented as a DW_TAG_subprogram entry but does not have the DW_AT_declaration attribute.
+    // Instead, it includes attributes like DW_AT_low_pc and DW_AT_high_pc (or DW_AT_ranges),
+    // which specify the address range of the function's code in memory.
+    // Example:
+    // <1><0x0000003a> DW_TAG_subprogram
+    // DW_AT_name ("myFunction")
+    // DW_AT_low_pc (0x0000000000401000)
+    // DW_AT_high_pc (0x0000000000401020)
     declaration = 0x3c,
     discr_list = 0x3d,
     encoding = 0x3e,
+    // DW_AT_external attribute indicates that the corresponding entity (e.g., a variable, function, or type) has external linkage.
     external = 0x3f,
     frame_base = 0x40,
     friend_ = 0x41,
@@ -125,6 +145,10 @@ enum class at : std::uint16_t {
     namelist_item = 0x44,
     priority = 0x45,
     segment = 0x46,
+    // DW_AT_specification attribute is used to reference a declaration of an entity
+    // (such as a function, variable, or type) that is defined elsewhere.
+    // It essentially links the current entry to its corresponding declaration,
+    // which is typically represented by a DW_TAG_subprogram, DW_TAG_variable, or similar tag.
     specification = 0x47,
     static_link = 0x48,
     type = 0x49,
@@ -525,6 +549,20 @@ enum class tag : std::uint16_t {
 
 const char* to_string(tag t);
 
+/**
+ * @brief Determines if a given DWARF tag represents a type
+ *
+ * This function classifies whether a given DWARF tag represents a type definition
+ * or declaration. This is used to identify type-related DIEs in the DWARF debug
+ * information.
+ *
+ * @param t The DWARF tag to check
+ *
+ * @return true if the tag represents a type, false otherwise
+ *
+ * @pre The tag must be a valid DWARF tag
+ * @post The return value will be true for all type-related tags and false for all others
+ */
 bool is_type(tag t);
 
 /**************************************************************************************************/
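
The `DW_AT_declaration` comments in this header describe how a declaration DIE differs from a definition DIE. As a rough sketch of that rule, assuming a DIE has been reduced to a plain list of attribute names: the trimmed-down `at` enum and the `is_definition` helper below are hypothetical stand-ins, not ORC code. The numeric values are the standard DWARF attribute codes.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Trimmed-down stand-in for ORC's dw::at (hypothetical; standard DWARF codes).
enum class at : std::uint16_t {
    low_pc = 0x11,
    high_pc = 0x12,
    declaration = 0x3c,
    ranges = 0x55,
};

// Hypothetical helper: a subprogram DIE is treated as a definition when it
// lacks DW_AT_declaration and carries an address range, either as
// DW_AT_low_pc (usually paired with DW_AT_high_pc) or as DW_AT_ranges.
bool is_definition(const std::vector<at>& attrs) {
    auto has = [&](at name) {
        return std::find(attrs.begin(), attrs.end(), name) != attrs.end();
    };
    if (has(at::declaration)) return false;
    return has(at::low_pc) || has(at::ranges);
}
```

This sketch checks only for the presence of attributes. It ignores the value of `DW_AT_declaration`, and it does not cover DIEs that carry no address range for other reasons, such as abstract instances of inlined functions.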

include/orc/dwarf_structs.hpp

+61-8
@@ -14,6 +14,9 @@
 #include <string>
 #include <vector>
 
+// adobe contract checks
+#include "adobe/contract_checks.hpp"
+
 // application
 #include "orc/dwarf_constants.hpp"
 #include "orc/hash.hpp"
@@ -54,7 +57,7 @@ struct attribute_value {
     }
 
     auto uint() const {
-        assert(has(type::uint));
+        ADOBE_PRECONDITION(has(type::uint));
         return _uint;
     }
 
@@ -64,22 +67,35 @@ struct attribute_value {
     }
 
     auto sint() const {
-        assert(has(type::sint));
+        ADOBE_PRECONDITION(has(type::sint));
         return _int;
     }
 
+    // Return _either_ sint or uint; some attributes
+    // may be one or the other, but in some cases the
+    // valid values could be represented by either type
+    // (e.g., the number cannot be negative or larger
+    // than the largest possible signed value.)
+    // This routine is useful when the caller doesn't
+    // care how it was stored and just wants the value.
+    // If this attribute value has _both_, it is assumed
+    // they are equal.
+    int number() const {
+        return has(type::sint) ? sint() : uint();
+    }
+
     void string(pool_string x) {
         _type |= type::string;
         _string = x;
     }
 
     const auto& string() const {
-        assert(has(type::string));
+        ADOBE_PRECONDITION(has(type::string));
         return _string;
     }
 
     auto string_hash() const {
-        assert(has(type::string));
+        ADOBE_PRECONDITION(has(type::string));
         return _string.hash();
     }
 
@@ -89,7 +105,7 @@ struct attribute_value {
     }
 
     auto reference() const {
-        assert(has(type::reference));
+        ADOBE_PRECONDITION(has(type::reference));
         return _uint;
     }
 
@@ -146,6 +162,7 @@ struct attribute {
     auto reference() const { return _value.reference(); }
     const auto& string() const { return _value.string(); }
     auto uint() const { return _value.uint(); }
+    auto sint() const { return _value.sint(); }
     auto string_hash() const { return _value.string_hash(); }
 };
 
@@ -181,7 +198,7 @@ struct attribute_sequence {
 
     bool has(dw::at name, enum attribute_value::type t) const {
         auto [valid, iterator] = find(name);
-        return valid ? iterator->has(t) : false;
+        return valid && iterator->has(t);
     }
 
     bool has_uint(dw::at name) const {
@@ -198,13 +215,13 @@ struct attribute_sequence {
 
     auto& get(dw::at name) {
         auto [valid, iterator] = find(name);
-        assert(valid);
+        ADOBE_INVARIANT(valid);
         return *iterator;
     }
 
     const auto& get(dw::at name) const {
         auto [valid, iterator] = find(name);
-        assert(valid);
+        ADOBE_INVARIANT(valid);
         return *iterator;
     }
 
@@ -216,6 +233,14 @@ struct attribute_sequence {
         return get(name).uint();
     }
 
+    int number(dw::at name) const {
+        return get(name)._value.number();
+    }
+
+    std::int64_t sint(dw::at name) const {
+        return get(name).sint();
+    }
+
     pool_string string(dw::at name) const {
         return get(name).string();
     }
@@ -237,14 +262,26 @@ struct attribute_sequence {
     auto end() { return _attributes.end(); }
     auto end() const { return _attributes.end(); }
 
+    void erase(dw::at name) {
+        auto [valid, iterator] = find(name);
+        ADOBE_INVARIANT(valid);
+        _attributes.erase(iterator);
+    }
+
+    void move_append(attribute_sequence&& rhs) {
+        _attributes.insert(_attributes.end(), std::move_iterator(rhs.begin()), std::move_iterator(rhs.end()));
+    }
+
 private:
+    /// NOTE: Consider sorting these attributes by `dw::at` to improve performance.
     std::tuple<bool, iterator> find(dw::at name) {
         auto result = std::find_if(_attributes.begin(), _attributes.end(), [&](const auto& attr){
             return attr._name == name;
         });
         return std::make_tuple(result != _attributes.end(), result);
     }
 
+    /// NOTE: Consider sorting these attributes by `dw::at` to improve performance.
     std::tuple<bool, const_iterator> find(dw::at name) const {
         auto result = std::find_if(_attributes.begin(), _attributes.end(), [&](const auto& attr){
             return attr._name == name;
@@ -423,6 +460,22 @@ using dies = std::vector<die>;
 
 /**************************************************************************************************/
 
+/**
+ * @brief Determines if a DWARF attribute is considered non-fatal for ODRV purposes
+ *
+ * This function identifies attributes that can be safely ignored when checking for
+ * One Definition Rule Violations (ODRVs). These attributes typically contain
+ * information that doesn't affect the actual definition of a symbol, such as
+ * debug-specific metadata or compiler-specific extensions.
+ *
+ * @param at The DWARF attribute to check
+ *
+ * @return true if the attribute is non-fatal and can be ignored for ODRV checks,
+ *         false if the attribute must be considered when checking for ODRVs
+ *
+ * @pre The attribute must be a valid DWARF attribute
+ * @post The return value will be consistent with the internal list of nonfatal attributes
+ */
 bool nonfatal_attribute(dw::at at);
 inline bool fatal_attribute(dw::at at) { return !nonfatal_attribute(at); }
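
The `number()` accessor added in this diff returns "either sint or uint" as an `int`. If the stored fields are 64-bit (an assumption; the diff does not show their declarations), that return type could narrow large values. The sketch below models the same either-or lookup with a 64-bit return type. `number_value` is a hypothetical stand-in for `attribute_value`, using `std::optional` in place of the type bitmask.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for attribute_value; not ORC's actual layout.
struct number_value {
    std::optional<std::int64_t> _sint;
    std::optional<std::uint64_t> _uint;

    // Prefer the signed value when present, else fall back to the unsigned one.
    // Returning std::int64_t keeps values above INT_MAX intact; an unsigned
    // value above INT64_MAX would still not be representable.
    std::int64_t number() const {
        if (_sint) return *_sint;
        return static_cast<std::int64_t>(_uint.value());  // throws if neither is set
    }
};
```

Whether the wider return type matters in practice depends on which attributes callers read through `number()`, which this commit page does not show.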