terralang / terra

Terra is a low-level system programming language that is embedded in and meta-programmed by the Lua programming language.

Home Page: terralang.org

sprintf standard library core dump

OneArb opened this issue:

On Arch Linux:

line 2: 143220 Segmentation fault (core dumped) ~/terra/bin/terra $1

local C = terralib.includecstring [[

extern int sprintf (char *__restrict __s,
  const char *__restrict __format, ...);

]]

terra sprintfTest()

  var i : int = 5
  var s : rawstring
  C.sprintf(s, "second%d ", i)
end

sprintfTest()

You didn't initialize s, so sprintf writes through a garbage pointer.

This works for me:

local C = terralib.includecstring([[
#include <stdio.h>
#include <stdlib.h>
]])
        
terra sprintfTest()

  var i : int = 5
  var s : rawstring = [rawstring](C.malloc(16))
  C.sprintf(s, "second%d ", i)
end

sprintfTest()
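
A slightly more defensive variant of the snippet above, offered as a sketch rather than the canonical fix: it checks the malloc result, bounds the write with snprintf, and frees the buffer. BUF_SIZE is a name introduced here for illustration, and the return value is only there to make the result easy to inspect.

```terra
local C = terralib.includecstring([[
#include <stdio.h>
#include <stdlib.h>
]])

-- BUF_SIZE is a hypothetical constant chosen for this sketch.
local BUF_SIZE = 16

terra sprintfTest() : int
  var i : int = 5
  -- malloc returns &opaque, so it needs an explicit cast to rawstring.
  var s : rawstring = [rawstring](C.malloc(BUF_SIZE))
  if s == nil then
    return -1
  end
  -- snprintf writes at most BUF_SIZE bytes, including the terminating zero,
  -- and returns the number of characters the full output needs.
  var n = C.snprintf(s, BUF_SIZE, "second%d ", i)
  C.printf("%s\n", s)
  C.free(s)
  return n
end

print(sprintfTest())
```

The point is only that s must refer to real storage before sprintf or snprintf writes through it; Terra does not allocate it for you.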

Thanks.

I had gotten it into my head that Terra automates string allocation:

local C = terralib.includecstring([[
#include <stdio.h>
]])

terra printfTest()

  var s : rawstring = "anything"
  C.printf("%s\n", s)

  s = "anything 3.14"
  C.printf("%s\n", s)
end

printfTest()

Is this undefined behavior (UB) in Terra?

Is this safe code:

local C = terralib.includecstring([[
#include <stdio.h>
]])

local s = "test"
terra printfTest(s : rawstring)
  C.printf("%s\n", s)
end

printfTest(s)

This time, does LuaJIT manage the variable s?

It sounds like you're asking about the interaction between the compilation and execution environments.

It might help to modify your examples to print out the generated LLVM IR:

print(terralib.saveobj(nil, "llvmir", {printfTest=printfTest}, nil, nil, false))

For your first printfTest example, that prints out something like:

; ModuleID = 'terra'
source_filename = "terra"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-darwin22.6.0"

@"$string" = private unnamed_addr constant [9 x i8] c"anything\00", align 1
@"$string.1" = private unnamed_addr constant [4 x i8] c"%s\0A\00", align 1
@"$string.2" = private unnamed_addr constant [14 x i8] c"anything 3.14\00", align 1

define dso_local void @printfTest() {
entry:
  %s = alloca i8*, align 8
  store i8* getelementptr inbounds ([9 x i8], [9 x i8]* @"$string", i32 0, i32 0), i8** %s, align 8
  %0 = load i8*, i8** %s, align 8
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @"$string.1", i32 0, i32 0), i8* %0)
  store i8* getelementptr inbounds ([14 x i8], [14 x i8]* @"$string.2", i32 0, i32 0), i8** %s, align 8
  %2 = load i8*, i8** %s, align 8
  %3 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @"$string.1", i32 0, i32 0), i8* %2)
  ret void
}

declare dso_local i32 @printf(i8*, ...)

The string constants get embedded in the LLVM IR. They are also held by references from the Terra AST for printfTest, so as long as that function is live, Lua's GC will not collect them. However, this is irrelevant: the values that get embedded in the IR are distinct memory allocations, so even if the Lua strings were collected, there would be no way to corrupt the IR or the compilation environment.
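
To illustrate that point, here is a minimal sketch. It assumes a LuaJIT-based Terra build so that the ffi module is available, and greeting is a name made up for the example. The function returns a pointer to a string literal, and the pointer is read back from Lua after a GC cycle:

```terra
local ffi = require("ffi")

-- greeting is a hypothetical function used only for this sketch.
terra greeting() : rawstring
  -- The literal is compiled into a constant in the generated module,
  -- so the returned pointer does not refer to a Lua-managed string.
  return "anything"
end

local p = greeting()
collectgarbage()        -- run a Lua GC cycle before reading through the pointer
print(ffi.string(p))    -- copies the C string into a fresh Lua string
```

This only works because the literal lives in the compiled module; a pointer into a buffer that was freed, or into a local array of a function that has returned, would not be safe to read this way.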

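In your second example, yes, there is a module-local variable that holds the LuaJIT string "test". Note that, as written, the parameter s shadows that local inside printfTest, so the string is passed as an ordinary argument at call time. It stays alive for the duration of the call because the caller still holds a reference to it, and printf does not keep the pointer. If you instead reference a Lua string directly from Terra code, it becomes a constant, and the same saveobj code will show it embedded in the LLVM IR. That memory is backed by a new allocation which is managed by LLVM (and the LLVM context is managed by Terra). In that case, whatever LuaJIT does matters only up to the point where the AST gets consumed to generate the LLVM IR.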
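
A minimal sketch of the constant case, assuming the Lua local is referenced directly from the Terra body instead of being passed as a parameter (this variant is not taken from the original question). The IR dump should then contain "test" as a module-level string constant:

```terra
local C = terralib.includecstring([[
#include <stdio.h>
]])

local s = "test"

-- There is no parameter named s here, so `s` resolves to the Lua local
-- above and its value is converted to a rawstring constant.
terra printfTest()
  C.printf("%s\n", s)
end

printfTest()

-- Dump the unoptimized IR as a string, using the same call as above.
local ir = terralib.saveobj(nil, "llvmir", {printfTest=printfTest}, nil, nil, false)
print(ir)
```

Comparing this dump with the one for the parameter version makes the difference visible: with a parameter, the function takes an i8* argument and no "test" constant appears in the module.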

Does that answer your question?

Thanks for suggesting how to access the LLVM IR output.

terralib.saveobj does clarify how Terra code is converted into machine-specific IR.

I appreciate being able to look under the hood.