Skip to content

Fix exception handling (throw) in xeus-cpp-lite. #222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
anutosh491 opened this issue Jan 15, 2025 · 11 comments
Open

Fix exception handling (throw) in xeus-cpp-lite. #222

anutosh491 opened this issue Jan 15, 2025 · 11 comments
Labels
bug Something isn't working

Comments

@anutosh491
Copy link
Collaborator

anutosh491 commented Jan 15, 2025

I would expect this to work with what is there as of now but I see the following doesn't work

image

So out of all the steps (code -> PTU -> LLVM IR -> wasm object -> wasm binary -> loaded on top of main module using dlopen) we need to think which one is going wrong.

  1. The linking step through clang-repl is successful here and hence a shared wasm module has been created.

  2. But the dlopen step to load it on top of the main module is not working as expected. Hence the trace

image
@anutosh491 anutosh491 added the bug Something isn't working label Jan 15, 2025
@anutosh491
Copy link
Collaborator Author

The trace indicates that the loadDynamicLibrary function call fails from dlopenInternal leading to the error

        var dlopenInternal = (handle, jsflags) => {
            // void *dlopen(const char *file, int mode);
            // http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
            var filename = UTF8ToString(handle + 36);
            var flags = HEAP32[(((handle) + (4)) >> 2)];
            filename = PATH.normalize(filename);
            var searchpaths = [];

            var global = Boolean(flags & 256);
            var localScope = global ? null : {};

            // We don't care about RTLD_NOW and RTLD_LAZY.
            var combinedFlags = {
                global,
                nodelete: Boolean(flags & 4096),
                loadAsync: jsflags.loadAsync,
            }

            if (jsflags.loadAsync) {
                return loadDynamicLibrary(filename, combinedFlags, localScope, handle);
            }

            try { // THIS TRY BLOCK FAILS !!!!
                return loadDynamicLibrary(filename, combinedFlags, localScope, handle)
            } catch (e) {
                err(`Error in loading dynamic library ${filename}: ${e}`);
                dlSetError(`Could not load dynamic lib: ${filename}\n${e}`);
                return 0;
            }
        }
        

@github-actions github-actions bot added the Needs triage Used in auto labelling of new issues label Jan 15, 2025
@anutosh491 anutosh491 removed the Needs triage Used in auto labelling of new issues label Jan 15, 2025
@anutosh491
Copy link
Collaborator Author

Also this appears in the console

image

@anutosh491
Copy link
Collaborator Author

Going through emscripten's flags they also provide an error like this. We might be missing the flag responsible for enabling exceptions somewhere.

Assertion failed: Exception thrown, but exception catching is not enabled. Compile with -sNO_DISABLE_EXCEPTION_CATCHING or -sEXCEPTION_CATCHING_ALLOWED=[..] to catch. (note: in dynamic linking, if a side module wants exceptions, the main module must be built with that support))

@mcbarton
Copy link
Collaborator

@anutosh491 As far as I can tell from the linked repo exception handling isn't currently supported by WebAssembly https://github.com/WebAssembly/exception-handling/tree/master , but the repo contains a proposal on how they hope to add support for it.

@anutosh491
Copy link
Collaborator Author

Hmm, weird. That shouldn't be the case cause as I wrote about the linking step is fine which means we are creating a wasm module out of the "throw" command. It's the loading that is not working as expected. So I am guessing we should be able to do this !

@anutosh491
Copy link
Collaborator Author

This is the wasm module being generated for throw("error")

(module $incr_module_2.wasm
  (memory $env.memory (;0;) (import "env" "memory") 1)
  (table $env.__indirect_function_table (;0;) (import "env" "__indirect_function_table") 0 funcref)
  (global $__memory_base (;0;) (import "env" "__memory_base") i32)
  (global $__table_base (;1;) (import "env" "__table_base") i32)
  (func $__cxa_allocate_exception (;0;) (import "env" "__cxa_allocate_exception") (param i32) (result i32))
  (func $__cxa_throw (;1;) (import "env" "__cxa_throw") (param i32 i32 i32))
  (global $typeinfo for char const* (;2;) (import "GOT.mem" "_ZTIPKc") (mut i32))
  (func $__wasm_call_ctors (;2;) (export "__wasm_call_ctors")
    call $_GLOBAL__sub_I_incr_module_2
  )
  (func $__stmts__0 (;3;)
    (local $var0 i32)
    (local $var1 i32)
    global.get $__memory_base
    local.set $var0
    i32.const 4
    call $__cxa_allocate_exception
    local.tee $var1
    local.get $var0
    i32.const 0
    i32.add
    i32.store
    local.get $var1
    global.get $typeinfo for char const*
    i32.const 0
    call $__cxa_throw
    unreachable
  )
  (func $_GLOBAL__sub_I_incr_module_2 (;4;)
    call $__stmts__0
  )
  (data (global.get $__memory_base) "error\00")
)

@mcbarton
Copy link
Collaborator

@anutosh491 The -fexceptions and -fwasm-exceptions flags may be relevant this PR (saw them in this issue emscripten-core/emscripten#22587 )

@anutosh491
Copy link
Collaborator Author

Hi,

I was able to find the solution for this but might need some guidance on where it should be placed.

Let me briefly explain what is happening here.

  1. So if you start with a try-catch block on master you would see the following

Image

  1. Now the steps we have are these

code -> PTU -> llvm ir -> incr_module_xx.so -> incr_module_xx.wasm -> loaded on top of the main module using dlopen

So it is one of these steps that is going wrong.

  1. The LLVM IR is exactly what we want (as you can see we have the landingpad, __cxa_end_catch and other constructs required for exception handling)
Generated LLVM IR:
; ModuleID = 'incr_module_2'
source_filename = "incr_module_2"
target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-f128:64-n32:64-S128-ni:1:10:20"
target triple = "wasm32-unknown-emscripten"
@_ZTIi = external constant ptr
@__main_void = hidden alias i32 (), ptr @main
; Function Attrs: mustprogress noinline norecurse optnone
define noundef i32 @main() #0 personality ptr @__gxx_personality_v0 {
entry:
  %exn.slot = alloca ptr, align 4
  %ehselector.slot = alloca i32, align 4
  %exception = call ptr @__cxa_allocate_exception(i32 4) #1
  store i32 1, ptr %exception, align 16
  invoke void @__cxa_throw(ptr %exception, ptr @_ZTIi, ptr null) #2
          to label %unreachable unwind label %lpad
lpad:                                             ; preds = %entry
  %0 = landingpad { ptr, i32 }
          catch ptr null
  %1 = extractvalue { ptr, i32 } %0, 0
  store ptr %1, ptr %exn.slot, align 4
  %2 = extractvalue { ptr, i32 } %0, 1
  store i32 %2, ptr %ehselector.slot, align 4
  br label %catch
catch:                                            ; preds = %lpad
  %exn = load ptr, ptr %exn.slot, align 4
  %3 = call ptr @__cxa_begin_catch(ptr %exn) #1
  call void @__cxa_end_catch()
  br label %try.cont
try.cont:                                         ; preds = %catch
  ret i32 0
unreachable:                                      ; preds = %entry
  unreachable
}
declare ptr @__cxa_allocate_exception(i32)
declare void @__cxa_throw(ptr, ptr, ptr)
declare i32 @__gxx_personality_v0(...)
declare ptr @__cxa_begin_catch(ptr)
declare void @__cxa_end_catch()
  1. The incr_module_xx.wasm generated is wrong (notice that we don't generate any wasm related to the "catch" block and we only generate wasm will "throw" ..... as you can see we call cxx_throw)

Image

This establishes the fact that the step going from LLVM IR -> incr_module_xx.so is going wrong. Let's move on to the Reasoning now

@anutosh491
Copy link
Collaborator Author

REASONING

  1. So the invokes in the LLVM IR are generated in Clang but lowered down to calls in the backend. That's because -enable-emscripten-cxx-exceptions is false by default. In case no exception handling mechanism is specified, the invokes are lowered to calls and landingpads are removed can be seen here

  2. So basically the option we need to pass to the backend is -mllvm -enable-emscripten-cxx-exceptions. We might also like to have -mllvm -enable-emscripten-sjlj too for setjmp-longjmp handling.

  3. So technically to solve our problem if we basically change the default for -enable-emscripten-cxx-exceptions to true and build llvm and run the same thing we're good. Obviously a cheap trick and should pass it as an LLVM command line option through -mllvm

But yeah you can see the correct wasm being generated for a try-catch block

Image

Let's move onto the solution now

@anutosh491
Copy link
Collaborator Author

anutosh491 commented Mar 21, 2025

SOLUTION

  1. Well this means we just need to have -mllvm -enable-emscripten-cxx-exceptions in our cc1 command and that should be enough correct ?

This can be done either through

i) Adding it straightaway in CreateCpp
ii) Or maybe if we don't want to introduce this in llvm itself, we can either add it through xeus-cpp if we want (https://github.com/compiler-research/xeus-cpp/blob/main/src/xinterpreter.cpp#L32)

But to my surprise this doesn't solve the problem.

This is what happens

For usual clang
i) in the cc1_main command, once we create the CompilerInstance, we end up calling ExecuteCompilerInvocation
ii) Which deals with the LLVMArgs from FrontEndOpts first and then moves onto calling ExecuteAction

For clang-repl

i) Everything starts with BuildCompilation which does the addClangTargetOptions step ..... followed by creating the CompilerInstance
ii) Once the CompileInstance is created is it passed to the Interpreter that we create on our side
iii) But if you move to the constructor for the interpreter you would see this

Interpreter::Interpreter(std::unique_ptr<CompilerInstance> Instance,
...........

  Act = std::make_unique<IncrementalAction>(*CI, *TSCtx->getContext(), ErrOut,
                                            *this, std::move(Consumer));
  if (ErrOut)
    return;
  CI->ExecuteAction(*Act);
........
}

iv) Now just like clang normally would do, we need a call to ExecuteAction which is done here too !!!
But how clang does it is through ExecuteCompilerInvocation where we first process the LLVMArgs and then make a call to ExecuteAction.

So basically the point is clang-repl doesn't deal with FrontEndOpts before calling ExecuteAction and even if wehave -mllvm -enable-emscripten-cxx-exceptions in our CC1 command, it is not being respected !

@anutosh491
Copy link
Collaborator Author

anutosh491 commented Mar 21, 2025

SOLUTION Contd

Now if you look at how clang is dealing with this, the following happens

  1. ExecuteCompilerInvocation is called

  2. It looks into some 3-4 frontendopts like

    i) help
    ii) version
    iii) Plugins
    iv) -mllvm args

  3. Finally calling ExecuteAction

We need to replicate something similar in clang-repl. Once we create the CompileInstance and before we make the call to ExecuteAction we need to handle these frontendopts !

For eg talking about version. If we pass version as a flag to the cc1 command using (-XClang -version) we would see this

Image

This is because ExecuteAction wants version to be handled by the Client itself

bool CompilerInstance::ExecuteAction(FrontendAction &Act) {
  assert(hasDiagnostics() && "Diagnostics engine is not initialized!");
  assert(!getFrontendOpts().ShowHelp && "Client must handle '-help'!");
  assert(!getFrontendOpts().ShowVersion && "Client must handle '-version'!");
..........

My understanding tells me that "client" here means the object making use of the CompileInstance which in our case ends up being the Interpreter class. I say this because the solution i propose can also end up in CreateCI if we want but if go by what client means, interpreter class should be responsible for handling this and we handle the frontend opts just before we create the interpreter.

Hence we can start with something like this

diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index fa4c1439c926..f37bdbdcfcbd 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -141,6 +141,49 @@ CreateCI(const llvm::opt::ArgStringList &Argv) {
   return std::move(Clang);
 }
 
+static llvm::Error HandleFrontendOptions(const CompilerInstance &CI) {
+  const auto &FrontendOpts = CI.getFrontendOpts();
+
+  if (FrontendOpts.ShowHelp) {
+    driver::getDriverOptTable().printHelp(
+        llvm::outs(), "clang -cc1 [options] file...",
+        "LLVM 'Clang' Compiler: http://clang.llvm.org",
+        /*ShowHidden=*/false, /*ShowAllAliases=*/false,
+        llvm::opt::Visibility(driver::options::CC1Option));
+    return llvm::createStringError(llvm::errc::not_supported, "Help displayed");
+  }
+
+  if (FrontendOpts.ShowVersion) {
+    llvm::cl::PrintVersionMessage();
+    return llvm::createStringError(llvm::errc::not_supported, "Version displayed");
+  }
+
+
+  if (!FrontendOpts.LLVMArgs.empty()) {
+    unsigned NumArgs = FrontendOpts.LLVMArgs.size();
+    auto Args = std::make_unique<const char*[]>(NumArgs + 2);
+    Args[0] = "clang-repl (LLVM option parsing)";
+    for (unsigned i = 0; i != NumArgs; ++i)
+      Args[i + 1] = FrontendOpts.LLVMArgs[i].c_str();
+    Args[NumArgs + 1] = nullptr;
+    llvm::errs() << "Parsing LLVM backend options via cl::ParseCommandLineOptions...\n";
+    llvm::cl::ParseCommandLineOptions(NumArgs + 1, Args.get());
+  }
+
+  return llvm::Error::success();
+}
+
 } // anonymous namespace
 
 namespace clang {
@@ -451,7 +494,12 @@ const char *const Runtimes = R"(
 
 llvm::Expected<std::unique_ptr<Interpreter>>
 Interpreter::create(std::unique_ptr<CompilerInstance> CI) {
-  llvm::Error Err = llvm::Error::success();
+
+  llvm::Error Err = HandleFrontendOptions(*CI);
+  if (Err) {
+    return std::move(Err);
+  }
+
   auto Interp =
       std::unique_ptr<Interpreter>(new Interpreter(std::move(CI), Err));
   if (Err)

This should get us started with handling things like clang is doing.

  1. We handle the exception handling correctly as we now respect the -mllvm args (and we can pass the -mllvm -enable-emscripten-cxx-exceptions flag from the top ... from xeus-cpp itself I guess)

  2. We also address other frontend args for eg version

(base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --Xcc -Xclang --Xcc -version                                      
LLVM (http://llvm.org/):
  LLVM version 21.0.0git
  Optimized build.
clang-repl: Version displayed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants