Friday, December 27, 2019

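Implementing C++ exception handling (3) - __gxx_personality_v0

C++ exception handling is a remarkable piece of engineering. How did anyone first come up with an implementation mechanism this complicated?

Hard, and then harder still
As before, this covers GCC's implementation, which uses the DWARF format rather than the setjmp/longjmp scheme.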

eh.cpp
135 int one()
136 {
145   throw 100;
146   #if 0
147   void *throw_obj = __cxa_allocate_exception(sizeof(int));
148   *(int *)throw_obj = 98;
149   __cxa_throw (throw_obj, &typeid(int), 0);
150   #endif
151 }
170 
171 void main()
172 {
173   try
174   {
180     one();
182   }
184   catch (std::exception &e)
185   {
186     printf("got exception: %s\n", e.what());
187   }
189   catch (int a)
191   {
192     printf("got exception: %d\n", a);
193   }
198 }

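eh.cpp is a very simple example. The try/catch statement looks trivial on the surface, but a very complex implementation sits behind it. The disassembly in list 2 shows the extra calls the compiler inserted: _Unwind_Resume, __cxa_begin_catch, and __cxa_end_catch. Because the example is so simple, not much extra code is inserted.

This post covers __gxx_personality_v0(). A lot happens between throw 100 and catch (int). Stack unwinding is probably the part most people know; __gxx_personality_v0() may be less familiar.

During stack unwinding, one piece of code calls __gxx_personality_v0(), and it is called 2 times, roughly as in list 1.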
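The #if 0 block in eh.cpp hints at what throw 100 is lowered to. Below is a minimal sketch of doing the same thing by hand. It assumes a GCC or Clang toolchain whose <cxxabi.h> declares abi::__cxa_allocate_exception and abi::__cxa_throw; throw_by_hand and catch_by_hand are names made up for this illustration. Note the const_cast: typeid yields a const std::type_info, while __cxa_throw takes a non-const pointer.

```cpp
#include <cassert>
#include <cxxabi.h>   // abi::__cxa_allocate_exception, abi::__cxa_throw
#include <typeinfo>

// Hand-written equivalent of `throw 98;` (hypothetical helper name).
[[noreturn]] void throw_by_hand()
{
  // Ask the runtime for storage that outlives this stack frame.
  void *obj = abi::__cxa_allocate_exception(sizeof(int));
  *static_cast<int *>(obj) = 98;
  // No destructor is needed for an int, so the third argument is null.
  abi::__cxa_throw(obj, const_cast<std::type_info *>(&typeid(int)), nullptr);
}

// Catches the hand-thrown int and returns its value (hypothetical helper name).
int catch_by_hand()
{
  try
  {
    throw_by_hand();
  }
  catch (int a)
  {
    return a;
  }
  return -1;  // not reached
}
```

Calling catch_by_hand() should take the catch (int a) path, the same path eh.cpp takes for throw 100.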

list 1
1 code = (*fs.personality) (1, _UA_SEARCH_PHASE, exc->exception_class, exc, &cur_context);

2 code = (*fs.personality) (1, _UA_CLEANUP_PHASE | match_handler, exc->exception_class, exc, context);

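In eh_throw.cc L83, _Unwind_RaiseException() calls __gxx_personality_v0() 2 times. The first call, list 1 L1, passes _UA_SEARCH_PHASE. Its purpose is to find the landing_pad, which here is 0x10048f at list 2 L358 (0x10048a at L357 is only the return address of the call to one(), and it sits on the normal, non-throwing path). What is a landing_pad? Put simply, it is the address where execution continues after throw 100. Computing that address is not the end of it: the thrown object, an int, must also be compared against the types in the catch statements. eh.cpp has 2 catch statements, and clearly only the second one, catch (int a), matches.

The second call, list 1 L2, passes _UA_CLEANUP_PHASE. Its purpose is to jump to the landing_pad, so once it finishes, the program jumps to 0x10048f. Magic!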
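The two calls can be pictured with a toy model. This is only a sketch under heavy assumptions: ToyFrame, toy_personality and toy_raise are invented names, frames are a plain vector instead of a real stack, and the real unwinder only calls a personality routine for frames that have one. The flag values follow the _Unwind_Action values of the Itanium C++ ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Same numeric values as _UA_SEARCH_PHASE, _UA_CLEANUP_PHASE, _UA_HANDLER_FRAME.
enum : int { UA_SEARCH_PHASE = 1, UA_CLEANUP_PHASE = 2, UA_HANDLER_FRAME = 4 };
enum Reason { CONTINUE_UNWIND, HANDLER_FOUND, INSTALL_CONTEXT };

struct ToyFrame {            // hypothetical stand-in for one stack frame
  std::string name;
  bool has_cleanup;          // frame owns objects that need destructors run
  bool has_matching_catch;   // frame has a catch clause for the thrown type
};

// Toy personality routine: answers "is there a handler?" in phase 1 and
// "is there a landing pad to run?" in phase 2.
Reason toy_personality(int actions, const ToyFrame &f)
{
  if (actions & UA_SEARCH_PHASE)
    return f.has_matching_catch ? HANDLER_FOUND : CONTINUE_UNWIND;
  if ((actions & UA_HANDLER_FRAME) || f.has_cleanup)
    return INSTALL_CONTEXT;
  return CONTINUE_UNWIND;
}

// Frames are ordered innermost first. Returns a log of what happened.
std::vector<std::string> toy_raise(const std::vector<ToyFrame> &frames)
{
  std::vector<std::string> log;
  std::size_t handler = frames.size();

  // Phase 1: search only, nothing is unwound yet.
  for (std::size_t i = 0; i < frames.size(); ++i) {
    log.push_back("search " + frames[i].name);
    if (toy_personality(UA_SEARCH_PHASE, frames[i]) == HANDLER_FOUND) {
      handler = i;
      break;
    }
  }
  if (handler == frames.size()) {   // end of stack, no handler
    log.push_back("terminate");
    return log;
  }

  // Phase 2: walk again, running landing pads up to the handler frame.
  for (std::size_t i = 0; i <= handler; ++i) {
    int actions = UA_CLEANUP_PHASE | (i == handler ? UA_HANDLER_FRAME : 0);
    if (toy_personality(actions, frames[i]) == INSTALL_CONTEXT)
      log.push_back((i == handler ? "handler " : "cleanup ") + frames[i].name);
  }
  return log;
}
```

For eh.cpp the frames would be one() with nothing to do and main() with a matching catch, so the log is a search of both frames followed by the handler in main.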

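The jump itself is done by the unwind library (the unwinder that ships in libgcc). It is complicated and I did not manage to understand the implementation, but it goes roughly like this:

_Unwind_RaiseException() calls uw_install_context (&this_context, &cur_context, frames);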

#define uw_install_context(CURRENT, TARGET, FRAMES)                     \
  do                                                                    \
    {                                                                   \
      long offset = uw_install_context_1 ((CURRENT), (TARGET));         \
      void *handler = uw_frob_return_addr ((CURRENT), (TARGET));        \
      _Unwind_DebugHook ((TARGET)->cfa, handler);                       \
      _Unwind_Frames_Extra (FRAMES);                                    \
      __builtin_eh_return (offset, handler);                            \
    }                                                                   \
  while (0)

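uw_install_context is a macro; once it runs, execution jumps to the landing_pad address.

After the jump to the landing_pad, the cmp instructions decide whether the catch (int a) code runs; see list 2 L358 to L361. When rdx is 2, the catch (int a) code runs, and when rdx is 1, the catch (std::exception) code runs.

This is of course no coincidence; it is the result of a carefully arranged chain of steps that starts at throw 100. Again, I did not manage to fully understand this part. In any case, it is a fine piece of work by the C++ compiler.

What if there is no catch (int)? Then __gxx_personality_v0() runs only once, because no matching catch handler is found, and std::terminate() at eh_throw.cc L88 runs and ends the whole program.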
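Where the value in rdx comes from can be read off list 5 L468 to L471: the personality routine stores ue_header in __builtin_eh_return_data_regno (0) and handler_switch_value in __builtin_eh_return_data_regno (1). Assuming those two map to rax and rdx on x86-64 (which matches how list 2 uses them), the compare chain at the landing pad behaves like the sketch below; dispatch is a made-up name and the strings only label the branches.

```cpp
#include <cassert>
#include <string>

// Mirrors the cmp/je chain at list 2 L358 to L363.
// `selector` plays the role of rdx (handler_switch_value).
std::string dispatch(long selector)
{
  if (selector == 1)
    return "catch (std::exception &e)";   // je 1004a3
  if (selector == 2)
    return "catch (int a)";               // je 1004df
  return "_Unwind_Resume";                // no clause matched: keep unwinding
}
```

With throw 100, the selector is 2, so the catch (int a) branch is taken.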

eh_throw.cc
  1 // -*- C++ -*- Exception handling routines for throwing.
 70 
 71 extern "C" void
 72 __cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
 73                       void (*dest) (void *))
 74 {
 75   __cxa_eh_globals *globals = __cxa_get_globals ();
 76   globals->uncaughtExceptions += 1;
 77 
 78   // Definitely a primary.
 79   __cxa_refcounted_exception *header =
 80     __cxa_init_primary_exception(obj, tinfo, dest);
 81   header->referenceCount = 1;
 82 
 83   _Unwind_RaiseException (&header->exc.unwindHeader);
 84 
 85   // Some sort of unwinding error.  Note that terminate is a handler.
 86   __cxa_begin_catch (&header->exc.unwindHeader);
 87 
 88   std::terminate ();
 89 }
 90 

list 2. objdump -DC eh.elf
   301 
   302 00000000001003d2 <one()>:
   303   1003d2: 55                    push   %rbp
   304   1003d3: 48 89 e5              mov    %rsp,%rbp
   305   1003d6: bf 04 00 00 00        mov    $0x4,%edi
   306   1003db: e8 80 77 00 00        callq  107b60 <__cxa_allocate_exception>
   307   1003e0: c7 00 64 00 00 00     movl   $0x64,(%rax)
   308   1003e6: ba 00 00 00 00        mov    $0x0,%edx
   309   1003eb: be 48 84 10 00        mov    $0x108448,%esi
   310   1003f0: 48 89 c7              mov    %rax,%rdi
   311   1003f3: e8 6c 24 00 00        callq  102864 <__cxa_throw>
   337 
   338 0000000000100441 <main>:
   339   100441: 55                    push   %rbp
   340   100442: 48 89 e5              mov    %rsp,%rbp
   341   100445: 53                    push   %rbx
   342   100446: 48 83 ec 28           sub    $0x28,%rsp
   343   10044a: 48 89 7d d8           mov    %rdi,-0x28(%rbp)
   344   10044e: c7 45 e0 63 00 00 00  movl   $0x63,-0x20(%rbp)
   345   100455: 8b 5d e0              mov    -0x20(%rbp),%ebx
   346   100458: e8 13 01 00 00        callq  100570 <f()>
   347   10045d: 89 c1                 mov    %eax,%ecx
   348   10045f: 8b 15 ef 2b 03 00     mov    0x32bef(%rip),%edx        # 133054 <def>
   349   100465: 8b 05 e5 2b 03 00     mov    0x32be5(%rip),%eax        # 133050 <abc>
   350   10046b: 41 89 d9              mov    %ebx,%r9d
   351   10046e: 41 b8 00 00 00 00     mov    $0x0,%r8d
   352   100474: 89 c6                 mov    %eax,%esi
   353   100476: bf 68 80 10 00        mov    $0x108068,%edi
   354   10047b: b8 00 00 00 00        mov    $0x0,%eax
   355   100480: e8 20 05 00 00        callq  1009a5 <printf(char const*, ...)>
   356   100485: e8 48 ff ff ff        callq  1003d2 <one()>
   357   10048a: e9 9e 00 00 00        jmpq   10052d <main+0xec>
   358   10048f: 48 83 fa 01           cmp    $0x1,%rdx
   359   100493: 74 0e                 je     1004a3 <main+0x62>
   360   100495: 48 83 fa 02           cmp    $0x2,%rdx
   361   100499: 74 44                 je     1004df <main+0x9e>
   362   10049b: 48 89 c7              mov    %rax,%rdi
   363   10049e: e8 5d 5a 00 00        callq  105f00 <_Unwind_Resume>
   364   1004a3: 48 89 c7              mov    %rax,%rdi
   365   1004a6: e8 66 25 00 00        callq  102a11 <__cxa_begin_catch>
   366   1004ab: 48 89 45 e8           mov    %rax,-0x18(%rbp)
   367   1004af: 48 8b 45 e8           mov    -0x18(%rbp),%rax
   368   1004b3: 48 8b 00              mov    (%rax),%rax
   369   1004b6: 48 83 c0 10           add    $0x10,%rax
   370   1004ba: 48 8b 00              mov    (%rax),%rax
   371   1004bd: 48 8b 55 e8           mov    -0x18(%rbp),%rdx
   372   1004c1: 48 89 d7              mov    %rdx,%rdi
   373   1004c4: ff d0                 callq  *%rax
   374   1004c6: 48 89 c6              mov    %rax,%rsi
   375   1004c9: bf a4 80 10 00        mov    $0x1080a4,%edi
   376   1004ce: b8 00 00 00 00        mov    $0x0,%eax
   377   1004d3: e8 cd 04 00 00        callq  1009a5 <printf(char const*, ...)>
   378   1004d8: e8 19 26 00 00        callq  102af6 <__cxa_end_catch>
   379   1004dd: eb 4e                 jmp    10052d <main+0xec>
   380   1004df: 48 89 c7              mov    %rax,%rdi
   381   1004e2: e8 2a 25 00 00        callq  102a11 <__cxa_begin_catch>
   382   1004e7: 8b 00                 mov    (%rax),%eax
   383   1004e9: 89 45 e4              mov    %eax,-0x1c(%rbp)
   384   1004ec: 8b 45 e4              mov    -0x1c(%rbp),%eax
   385   1004ef: 89 c6                 mov    %eax,%esi
   386   1004f1: bf b8 80 10 00        mov    $0x1080b8,%edi
   387   1004f6: b8 00 00 00 00        mov    $0x0,%eax
   388   1004fb: e8 a5 04 00 00        callq  1009a5 <printf(char const*, ...)>
   389   100500: e8 f1 25 00 00        callq  102af6 <__cxa_end_catch>
   390   100505: eb 26                 jmp    10052d <main+0xec>
   391   100507: 48 89 c3              mov    %rax,%rbx
   392   10050a: e8 e7 25 00 00        callq  102af6 <__cxa_end_catch>
   393   10050f: 48 89 d8              mov    %rbx,%rax
   394   100512: 48 89 c7              mov    %rax,%rdi
   395   100515: e8 e6 59 00 00        callq  105f00 <_Unwind_Resume>
   396   10051a: 48 89 c3              mov    %rax,%rbx
   397   10051d: e8 d4 25 00 00        callq  102af6 <__cxa_end_catch>
   398   100522: 48 89 d8              mov    %rbx,%rax
   399   100525: 48 89 c7              mov    %rax,%rdi
   400   100528: e8 d3 59 00 00        callq  105f00 <_Unwind_Resume>
   401   10052d: 48 83 c4 28           add    $0x28,%rsp
   402   100531: 5b                    pop    %rbx
   403   100532: 5d                    pop    %rbp
   404   100533: c3                    retq   
   405 
 15543 
 15548 Disassembly of section .gcc_except_table:
 15549 
 15550 000000000010b704 <.gcc_except_table>:
 15551   10b704: ff                    (bad)  
 15552   10b705: ff 01                 incl   (%rcx)
 15553   10b707: 00 ff                 add    %bh,%bh
 15554   10b709: ff 01                 incl   (%rcx)
 15555   10b70b: 0c 10                 or     $0x10,%al
 15556   10b70d: 05 00 00 15 05        add    $0x5150000,%eax
 15557   10b712: 28 00                 sub    %al,(%rax)
 15558   10b714: 3d 05 00 00 ff        cmp    $0xff000005,%eax
 15559   10b719: 03 29                 add    (%rcx),%ebp
 15560   10b71b: 01 19                 add    %ebx,(%rcx)
 15561   10b71d: 3f                    (bad)  
 15562   10b71e: 0a 4e 03              or     0x3(%rsi),%cl
 15563   10b721: 5d                    pop    %rbp
 15564   10b722: 05 00 00 92 01        add    $0x1920000,%eax
 15565   10b727: 05 c6 01 00 ba        add    $0xba0001c6,%eax
 15566   10b72c: 01 05 d9 01 00 d4     add    %eax,-0x2bfffe27(%rip)
 15567   10b732: 01 18                 add    %ebx,(%rax)
 15568   10b734: 00 00                 add    %al,(%rax)
 15569   10b736: 02 00                 add    (%rax),%al
 15570   10b738: 01 7d 00              add    %edi,0x0(%rbp)
 15571   10b73b: 00 48 84              add    %cl,-0x7c(%rax)
 15572   10b73e: 10 00                 adc    %al,(%rax)
 15573   10b740: 78 8d                 js     10b6cf 
 15574   10b742: 10 00                 adc    %al,(%rax)
 15575   10b744: ff 03                 incl   (%rbx)
 15576   10b746: 1d 01 12 de 01        sbb    $0x1de1201,%eax
 15577   10b74b: c9                    leaveq 
 15578   10b74c: 06                    (bad)  
 15579   10b74d: 00 00                 add    %al,(%rax)
 15580   10b74f: d5                    (bad)  
 15581   10b750: 0a 05 b4 0c 01 91     or     -0x6efef34c(%rip),%al  
 15582   10b756: 0b 9c 01 00 00 01 00  or     0x10000(%rcx,%rax,1),%ebx
 15583   10b75d: 00 00                 add    %al,(%rax)
 15584   10b75f: 00 00                 add    %al,(%rax)
 15585   10b761: 00 00                 add    %al,(%rax)
 15586   10b763: 00 ff                 add    %bh,%bh
 15587   10b765: ff 01                 incl   (%rcx)
 15588   10b767: 00 ff                 add    %bh,%bh
 15589   10b769: 03 1d 01 12 82 01     add    0x1821201(%rip),%ebx
 15590   10b76f: 05 87 01 01 cb        add    $0xcb010187,%eax
 15591   10b774: 01 86 01 dd 02 00     add    %eax,0x2dd01(%rsi)
 15592   10b77a: f7 02 05 00 00 01     testl  $0x1000005,(%rdx)
 15593  ...
 15594   10b788: ff 03                 incl   (%rbx)
 15595   10b78a: 0d 01 04 10 02        or     $0x2100401,%eax
 15596   10b78f: 17                    (bad)  
 15597   10b790: 01 01                 add    %eax,(%rcx)
 15598   10b792: 00 00                 add    %al,(%rax)
 15599   10b794: 00 00                 add    %al,(%rax)
 15600  ...
 15601 

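Now for how the landing_pad is found. It requires the data in the .gcc_except_table section at list 2 L15550, which has a technical name: the language specific data area (LSDA). The data cannot be read directly off the hexadecimal digits, though; at run time a small interpreter has to decode it, which piles complexity on top of complexity.

The code looks something like list 3; the real thing is far more complicated than what list 3 shows.

list 3. Code that decodes the .gcc_except_table section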
p = read_encoded_value (0, info.call_site_encoding, p, &cs_start);
p = read_encoded_value (0, info.call_site_encoding, p, &cs_len);
p = read_encoded_value (0, info.call_site_encoding, p, &cs_lp);
p = read_uleb128 (p, &cs_action);

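For the data format of the .gcc_except_table section, see "c++ 異常處理 (2)" and "exception handling tables". I did not fully understand it either; roughly, there are 3 tables.

  1. call site table: each call site record holds 4 pieces of information, the same 4 that list 3 reads, but I am not clear on how they relate or how they are used to locate the landing_pad.

    From the article "c++ 異常處理 (2)":
    The call site table immediately follows the LSDA header. It records which instructions in the program may throw an exception. Each record has 4 fields:
    1) cs_start: the address of the instructions that may throw, as an offset from the landing pad base address; the encoding is given by the first field of the LSDA header.
    2) cs_len: the length of the region of instructions that may throw. Together with 1) it describes a run of consecutive instructions; it is encoded the same way as 1).
    3) cs_lp: the offset of the landing pad that handles the instructions above; 0 means there is no corresponding landing pad.
    4) cs_action: which actions to take. This is an unsigned LEB128 value; subtract 1 and use the result as an index into the action table to get the corresponding record.

    The article ".gcc_except_table" has a similar description: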
    1. The start of the instructions for the current call site, a byte offset from the landing pad base. This is encoded using the encoding from the header.
    2. The length of the instructions for the current call site, in bytes. This is encoded using the encoding from the header.
    3. A pointer to the landing pad for this sequence of instructions, or 0 if there isn’t one. This is a byte offset from the landing pad base. This is encoded using the encoding from the header.
    4. The action to take, an unsigned LEB128. This is 1 plus a byte offset into the action table. The value zero means that there is no action.

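    Still pretty vague, right? I read it and still did not get it. cs_lp can be added to info.LPStart to get the landing_pad, and cs_action can be combined with info.action_table to compute the address of the action_record. I do not understand what cs_start and cs_len are for.

    Code: list 5 L294
  2. action table: its contents are used to get the types of all the catch clauses. eh.cpp has 2 catch statements, catch (std::exception &e) and catch (int a), so the action table leads to the type_info for std::exception and for int.

    These are then compared with the thrown object (the integer 100 in this example) to tell whether this landing_pad is a catch handler. If nothing matches, the landing_pad may exist only to call some object's destructor and clean that object up.
  3. type table: records the types of all the catch clauses.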
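Judging from list 5 L289 to L298, cs_start and cs_len appear to bound the range of instruction addresses that one record covers, and the record whose range contains the ip supplies the landing pad. The sketch below replays that loop on the bytes of main's call-site table copied from list 2 (0x10b71d to 0x10b735). It assumes, from the header bytes ff 03 29 01 19 just before them, that every field is an unsigned LEB128 value and that the base address is the start of main, 0x100441; read_uleb, CallSite and find_call_site are names invented here.

```cpp
#include <cassert>
#include <cstdint>

// Decode one unsigned LEB128 value at p and advance p past it.
static std::uint64_t read_uleb(const unsigned char *&p)
{
  std::uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;
  do {
    byte = *p++;
    result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return result;
}

struct CallSite {
  std::uint64_t landing_pad;  // 0 when the record has no landing pad
  std::uint64_t action;       // cs_action, 0 when there is no action
  bool found;                 // false when no record covers the ip
};

// main's call-site table, one record per line.
static const unsigned char kCallSites[] = {
  0x3f, 0x0a, 0x4e, 0x03,
  0x5d, 0x05, 0x00, 0x00,
  0x92, 0x01, 0x05, 0xc6, 0x01, 0x00,
  0xba, 0x01, 0x05, 0xd9, 0x01, 0x00,
  0xd4, 0x01, 0x18, 0x00, 0x00,
};

CallSite find_call_site(std::uint64_t func_start, std::uint64_t ip)
{
  const unsigned char *p = kCallSites;
  const unsigned char *end = kCallSites + sizeof kCallSites;
  while (p < end) {
    std::uint64_t cs_start = read_uleb(p);
    std::uint64_t cs_len = read_uleb(p);
    std::uint64_t cs_lp = read_uleb(p);
    std::uint64_t cs_action = read_uleb(p);
    if (ip < func_start + cs_start)
      break;  // the table is sorted, so no later record can match
    if (ip < func_start + cs_start + cs_len)
      return { cs_lp ? func_start + cs_lp : 0, cs_action, true };
  }
  return { 0, 0, false };
}
```

The call to one() returns to 0x10048a, and the personality routine looks up ip - 1. find_call_site(0x100441, 0x10048a - 1) lands in the first record (start 0x3f, length 0x0a), whose cs_lp of 0x4e gives 0x100441 + 0x4e = 0x10048f, the cmp at list 2 L358, with cs_action 3.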
list 5 L365 fetches the type_info of a catch statement. It is complicated, and goes through
  1. info->ttype_encoding
  2. info->ttype_base
  3. info->TType
as well as the ar_filter of the action_record
353 p = action_record;
354 p = read_sleb128 (p, &ar_filter);

p = read_encoded_value (0, info.call_site_encoding, p, &cs_lp);
p = read_uleb128 (p, &cs_action);
p = read_sleb128 (p, &ar_filter);

All of these functions read the contents of the .gcc_except_table section. The values are stored in a compressed encoding, so they have to be decoded first.
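As a small illustration of that decoding, here is a sketch of a signed LEB128 reader and of the action-record walk from list 5 L351 to L401, run on what look like the 4 action-table bytes of main in list 2 (02 00 01 7d at 0x10b736). That these bytes are main's action table is my reading of the dump, and read_sleb and filters_for are invented names, not the libgcc functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Decode one signed LEB128 value at p and advance p past it.
static std::int64_t read_sleb(const unsigned char *&p)
{
  std::uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;
  do {
    byte = *p++;
    result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  // Bit 6 of the last byte is the sign bit: extend it if set.
  if (shift < 64 && (byte & 0x40))
    result |= ~static_cast<std::uint64_t>(0) << shift;
  return static_cast<std::int64_t>(result);
}

// Assumed to be main's action table, copied from list 2.
static const unsigned char kActions[] = { 0x02, 0x00, 0x01, 0x7d };

// Collect the ar_filter values in the order the personality routine visits them.
std::vector<std::int64_t> filters_for(std::uint64_t cs_action)
{
  std::vector<std::int64_t> filters;
  if (cs_action == 0)
    return filters;                       // no action record at all
  const unsigned char *record = kActions + cs_action - 1;
  for (;;) {
    const unsigned char *p = record;
    std::int64_t ar_filter = read_sleb(p);
    const unsigned char *q = p;           // ar_disp is relative to its own address
    std::int64_t ar_disp = read_sleb(q);
    filters.push_back(ar_filter);
    if (ar_disp == 0)
      break;                              // end of the chain
    record = p + ar_disp;
  }
  return filters;
}
```

With cs_action 3 the walk starts at the third byte: filter 1, then a displacement of 0x7d, which decodes to -3 and leads back to the first record with filter 2. That order, 1 then 2, matches catch (std::exception &e) being tried before catch (int a).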

list 5. libstdc++-v3/libsupc++/eh_personality.cc
  1 // -*- C++ -*- The GNU C++ exception personality routine.
  2 // Copyright (C) 2001-2018 Free Software Foundation, Inc.
  3 //
  4 // This file is part of GCC.
 83 // Return an element from a type table.
 84 
 85 static const std::type_info *
 86 get_ttype_entry (lsda_header_info *info, _uleb128_t i)
 87 {
 88   _Unwind_Ptr ptr;
 89 
 90   i *= size_of_encoded_value (info->ttype_encoding);
 91   read_encoded_value_with_base (info->ttype_encoding, info->ttype_base, info->TType - i, &ptr);
 93 
 94   return reinterpret_cast<const std::type_info *>(ptr);
 95 }
 96 
213 namespace __cxxabiv1
214 {
215 
216 extern "C"
217 _Unwind_Reason_Code
218 __gxx_personality_v0 (int version,
219         _Unwind_Action actions,
220         _Unwind_Exception_Class exception_class,
221         struct _Unwind_Exception *ue_header,
222         struct _Unwind_Context *context)
223 {
224   enum found_handler_type
225   {
226     found_nothing,
227     found_terminate,
228     found_cleanup,
229     found_handler
230   } found_type;
231 
232   lsda_header_info info;
233   const unsigned char *language_specific_data;
234   const unsigned char *action_record;
235   const unsigned char *p;
236   _Unwind_Ptr landing_pad, ip;
237   int handler_switch_value;
238   void* thrown_ptr = 0;
239   bool foreign_exception;
240   int ip_before_insn = 0;
241 
242   __cxa_exception* xh = __get_exception_header_from_ue(ue_header);
243 
244   // Interface version check.
245   if (version != 1)
246     return _URC_FATAL_PHASE1_ERROR;
247   foreign_exception = !__is_gxx_exception_class(exception_class);
248 
249   // Shortcut for phase 2 found handler for domestic exception.
250   if (actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME)
251       && !foreign_exception)
252     {
253       restore_caught_exception(ue_header, handler_switch_value,
254           language_specific_data, landing_pad);
255       found_type = (landing_pad == 0 ? found_terminate : found_handler);
256       goto install_context;
257     }
258 
259   language_specific_data = (const unsigned char *)
260     _Unwind_GetLanguageSpecificData (context);
261 
262   // If no LSDA, then there are no handlers or cleanups.
263   if (! language_specific_data)
264     CONTINUE_UNWINDING;
265 
266   // Parse the LSDA header.
267   p = parse_lsda_header (context, language_specific_data, &info);
268   info.ttype_base = base_of_encoded_value (info.ttype_encoding, context);
269   ip = _Unwind_GetIPInfo (context, &ip_before_insn);
270   if (! ip_before_insn)
271     --ip;
272   landing_pad = 0;
273   action_record = 0;
274   handler_switch_value = 0;
275 
276   // Search the call-site table for the action associated with this IP.
277   while (p < info.action_table)
278     {
279       _Unwind_Ptr cs_start, cs_len, cs_lp;
280       _uleb128_t cs_action;
281 
282       // Note that all call-site encodings are "absolute" displacements.
283       p = read_encoded_value (0, info.call_site_encoding, p, &cs_start);
284       p = read_encoded_value (0, info.call_site_encoding, p, &cs_len);
285       p = read_encoded_value (0, info.call_site_encoding, p, &cs_lp);
286       p = read_uleb128 (p, &cs_action);
287 
288       // The table is sorted, so if we've passed the ip, stop.
289       if (ip < info.Start + cs_start)
290  p = info.action_table;
291       else if (ip < info.Start + cs_start + cs_len)
292  {
293    if (cs_lp)
294      landing_pad = info.LPStart + cs_lp;
295    if (cs_action)
296      action_record = info.action_table + cs_action - 1;
297    goto found_something;
298  }
299     }
300 
301   // If ip is not present in the table, call terminate.  This is for
302   // a destructor inside a cleanup, or a library routine the compiler
303   // was not expecting to throw.
304   found_type = found_terminate;
305   goto do_something;
306 
307  found_something:
308   if (landing_pad == 0)
309     {
310       // If ip is present, and has a null landing pad, there are
311       // no cleanups or handlers to be run.
312       found_type = found_nothing;
313     }
314   else if (action_record == 0)
315     {
316       // If ip is present, has a non-null landing pad, and a null
317       // action table offset, then there are only cleanups present.
318       // Cleanups use a zero switch value, as set above.
319       found_type = found_cleanup;
320     }
321   else
322     {
323       // Otherwise we have a catch handler or exception specification.
324 
325       _sleb128_t ar_filter, ar_disp;
326       const std::type_info* catch_type;
327       _throw_typet* throw_type;
328       bool saw_cleanup = false;
329       bool saw_handler = false;
330 
331 #if __cpp_rtti
332       // During forced unwinding, match a magic exception type.
333       if (actions & _UA_FORCE_UNWIND)
334  {
335    throw_type = &typeid(abi::__forced_unwind);
336  }
337       // With a foreign exception class, there's no exception type.
338       // ??? What to do about GNU Java and GNU Ada exceptions?
339       else if (foreign_exception)
340  {
341    throw_type = &typeid(abi::__foreign_exception);
342  }
343       else
344 #endif
345         {
346           thrown_ptr = __get_object_from_ue (ue_header);
347           throw_type = __get_exception_header_from_obj
348             (thrown_ptr)->exceptionType;
349         }
350 
351       while (1)
352  {
353    p = action_record;
354    p = read_sleb128 (p, &ar_filter);
355    read_sleb128 (p, &ar_disp);
356 
357    if (ar_filter == 0)
358      {
359        // Zero filter values are cleanups.
360        saw_cleanup = true;
361      }
362    else if (ar_filter > 0)
363      {
364        // Positive filter values are handlers.
365        catch_type = get_ttype_entry (&info, ar_filter);
366 
367        // Null catch type is a catch-all handler; we can catch foreign
368        // exceptions with this.  Otherwise we must match types.
369        if (! catch_type
370     || (throw_type
371         && get_adjusted_ptr (catch_type, throw_type,
372         &thrown_ptr)))
373   {
374     saw_handler = true;
375     break;
376   }
377      }
378    else
379      {
380        // Negative filter values are exception specifications.
381        // ??? How do foreign exceptions fit in?  As far as I can
382        // see we can't match because there's no __cxa_exception
383        // object to stuff bits in for __cxa_call_unexpected to use.
384        // Allow them iff the exception spec is non-empty.  I.e.
385        // a throw() specification results in __unexpected.
386        if ((throw_type
387      && !(actions & _UA_FORCE_UNWIND)
388      && !foreign_exception)
389     ? ! check_exception_spec (&info, throw_type, thrown_ptr,
390          ar_filter)
391     : empty_exception_spec (&info, ar_filter))
392   {
393     saw_handler = true;
394     break;
395   }
396      }
397 
398    if (ar_disp == 0)
399      break;
400    action_record = p + ar_disp;
401  }
402 
403       if (saw_handler)
404  {
405    handler_switch_value = ar_filter;
406    found_type = found_handler;
407  }
408       else
409  found_type = (saw_cleanup ? found_cleanup : found_nothing);
410     }
411 
412  do_something:
413    if (found_type == found_nothing)
414      CONTINUE_UNWINDING;
415 
416   if (actions & _UA_SEARCH_PHASE)
417     {
418       if (found_type == found_cleanup)
419  CONTINUE_UNWINDING;
420 
421       // For domestic exceptions, we cache data from phase 1 for phase 2.
422       if (!foreign_exception)
423         {
424    save_caught_exception(ue_header, context, thrown_ptr,
425     handler_switch_value, language_specific_data,
426     landing_pad, action_record);
427  }
428       return _URC_HANDLER_FOUND;
429     }
430 
431  install_context:
432   
433   // We can't use any of the cxa routines with foreign exceptions,
434   // because they all expect ue_header to be a struct __cxa_exception.
435   // So in that case, call terminate or unexpected directly.
436   if ((actions & _UA_FORCE_UNWIND)
437       || foreign_exception)
438     {
439       if (found_type == found_terminate)
440  std::terminate ();
441       else if (handler_switch_value < 0)
442  {
443    __try 
444      { std::unexpected (); } 
445    __catch(...) 
446      { std::terminate (); }
447  }
448     }
449   else
450     {
451       if (found_type == found_terminate)
452  __cxa_call_terminate(ue_header);
453 
454       // Cache the TType base value for __cxa_call_unexpected, as we won't
455       // have an _Unwind_Context then.
456       if (handler_switch_value < 0)
457  {
458    parse_lsda_header (context, language_specific_data, &info);
459    info.ttype_base = base_of_encoded_value (info.ttype_encoding,
460          context);
461 
462    xh->catchTemp = base_of_encoded_value (info.ttype_encoding, context);
463  }
464     }
465 
466   /* For targets with pointers smaller than the word size, we must extend the
467      pointer, and this extension is target dependent.  */
468   _Unwind_SetGR (context, __builtin_eh_return_data_regno (0),
469    __builtin_extend_pointer (ue_header));
470   _Unwind_SetGR (context, __builtin_eh_return_data_regno (1),
471    handler_switch_value);
472   _Unwind_SetIP (context, landing_pad);
473   return _URC_INSTALL_CONTEXT;
474 }
475 

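After writing this post, it suddenly feels a little less difficult. I could not find any formal documentation for gcc_except_table, so all I could do was infer the purpose of the table fields from the code, which is extremely hard. I traced through it with gdb more than 20 times and still do not have a solid picture.

Next, where is the type table hiding? list 6 is part of the disassembled .gcc_except_table.

list 6. The type info table in .gcc_except_table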
10c980:       68 94 10 00 
10c984:       98 9d 10 00           

The get_ttype_entry function looks up the catch_type. eh.cpp has catch (std::exception &e) and catch (int a), so there should be 2 records, which are the 2 in list 6.

catch_type = get_ttype_entry (&info, ar_filter);

The computation yields a pointer p, 0x000000000010c984. Then, following list 7, reading the u4 field of the unaligned union gives one of the types in the type table.

const union unaligned *u = (const union unaligned *) p;
result = u->u4;

There are several ways this computation can go; this is just the one I traced.

(gdb) x/32xb 0x10c984
0x10c984: 0x98 0x9d 0x10 0x00 0xff 0x03 0x19 0x01
0x10c98c: 0x11 0x78 0xf6 0x0a 0x00 0x00 0x9c 0x0e
0x10c994: 0x05 0xfb 0x0f 0x01 0xd8 0x0e 0x9c 0x01
0x10c99c: 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00

These bytes, 0x98 0x9d 0x10 0x00 -> 0x109d98, are
0000000000109d98 <typeinfo for std::exception>

The other one, 0x109468, is
0000000000109468 <typeinfo for int>

They correspond to the 2 type_info objects for std::exception and int.

list 7. libgcc/unwind-pe.h
 1 const unsigned char * read_encoded_value_with_base (unsigned char encoding, _Unwind_Ptr base, const unsigned char *p, _Unwind_Ptr *val)
 2 {
 3   union unaligned
 4     {
 5       void *ptr;
 6       unsigned u2 __attribute__ ((mode (HI)));
 7       unsigned u4 __attribute__ ((mode (SI)));
 8       unsigned u8 __attribute__ ((mode (DI)));
 9       signed s2 __attribute__ ((mode (HI)));
10       signed s4 __attribute__ ((mode (SI)));
11       signed s8 __attribute__ ((mode (DI)));
12     } __attribute__((__packed__));

Relevant values
ar_filter: 1
value derived from ar_filter: 4
info->ttype_encoding: 3
info->ttype_base: 0
info->TType: 10c988

p is info->TType minus the value derived from ar_filter = 10c988 - 4 = 10c984
catch_type: 0x0000000000109d98
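The arithmetic above can be replayed on the 8 bytes of list 6. This is a sketch, not get_ttype_entry itself: it assumes ttype_encoding 3 means a 4-byte unsigned little-endian value with no base added (ttype_base is 0 here), and ttype_entry, kTypeTable, kTableAddr and kTType are names made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Type table bytes from list 6.
static const unsigned char kTypeTable[] = {
  0x68, 0x94, 0x10, 0x00,   // at 0x10c980
  0x98, 0x9d, 0x10, 0x00,   // at 0x10c984
};
static const std::uint64_t kTableAddr = 0x10c980;  // address of kTypeTable[0]
static const std::uint64_t kTType = 0x10c988;      // info->TType, just past the table

// Entries are indexed backwards from TType: filter 1 is the last entry.
std::uint32_t ttype_entry(std::uint64_t ar_filter)
{
  const std::uint64_t entry_size = 4;               // size of one 4-byte entry
  assert(ar_filter >= 1 && ar_filter * entry_size <= sizeof kTypeTable);
  const std::uint64_t addr = kTType - ar_filter * entry_size;
  const unsigned char *p = kTypeTable + (addr - kTableAddr);
  // Assemble the little-endian value byte by byte, independent of host order.
  return static_cast<std::uint32_t>(p[0])
       | (static_cast<std::uint32_t>(p[1]) << 8)
       | (static_cast<std::uint32_t>(p[2]) << 16)
       | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

ttype_entry(1) reads the entry at 0x10c984 and gives 0x109d98, the typeinfo for std::exception; ttype_entry(2) reads 0x10c980 and gives 0x109468, the typeinfo for int.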

eh1.cpp adds a two().

eh1.cpp
 1 #include <exception>
 2 
 3 class Exception: public std::exception {
 4 public:
 5   Exception() {
 6     printf("Construct test exception\n");
 7   }
 8   ~Exception() {
 9     printf("Destruct test exception\n");
10   }
11 
12   virtual const char *what() const noexcept override 
13   {
14     return "Test eh";
15   }
16 };
17 
18 int two()
19 {
20   Exception ex;
21   one();
22 }
23 
24 int one() 
25 {
26   throw 100;
27 }
28 
29 void main()
30 {
31   try
32   {
33     two();
34   }
35   catch (std::exception &e)
36   {
37     printf("got exception: %s\n", e.what());
38   }
39   catch (int a)
40   {
41     printf("got exception: %d\n", a);
42   }
43 }

What does the exception handling flow of eh1.cpp look like? eh1 has 2 landing_pads, namely ...

Ah ... this post is already too long. More next time (if there is one) ...
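Until then, the observable order of events in eh1.cpp can at least be checked without looking at any landing pad. The sketch below is a variant of eh1.cpp for a hosted compiler, not the original file: it records events in a vector instead of calling printf, and run_eh1 and g_events are invented names.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <vector>

static std::vector<std::string> g_events;  // log of what ran, in order

class Exception : public std::exception {
public:
  Exception() { g_events.push_back("construct"); }
  ~Exception() override { g_events.push_back("destruct"); }
  const char *what() const noexcept override { return "Test eh"; }
};

static void one()
{
  throw 100;
}

static void two()
{
  Exception ex;  // must be destroyed while the stack is unwound
  one();
}

// Runs the eh1 scenario and returns the recorded events.
std::vector<std::string> run_eh1()
{
  g_events.clear();
  try
  {
    two();
  }
  catch (std::exception &e)
  {
    g_events.push_back(std::string("std::exception: ") + e.what());
  }
  catch (int a)
  {
    g_events.push_back("int: " + std::to_string(a));
  }
  return g_events;
}
```

The destructor of ex runs after the throw and before the catch (int a) body, which is the work a cleanup landing pad in two() would have to do.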

ref:
Understanding the .gcc_except_table section in ELF binaries (GCC)
