koka-lang / koka

Koka language compiler and interpreter

Home Page: http://koka-lang.org

segfault in vector-init

chtenb opened this issue

Reproduction

pub fun vector/append(first : vector<a>, second : vector<a>) : _ vector<a>
  vector-init(first.length + second.length, fn(i) index-two(first, second, i))
  
fun vector/index-two(first : vector<a>, second : vector<a>, i: int) : _ a
  if i < first.length then first[i] else second[i]

pub fun main()
  val x = [1,2,3].vector
  val y = x.append(x)
  ()

Don't you want second[i - first.length]?

Yes, that's right, but I don't think it should segfault. It should throw the appropriate exception and print an error message to the console.

Yeah, that should have been fixed in this commit: 03793a9

With the splitting of the core.kk file, that code is now here:

pub fun @index( ^v : vector<a>, ^index : int ) : exn a

here:

inline extern lengthz( ^v : vector<a> ) : ssize_t

and here:

static inline kk_decl_pure kk_ssize_t kk_vector_len_borrow(const kk_vector_t v, kk_context_t* ctx) {

Hm, I'm on the latest origin/dev, which contains that commit.

Ahh, I know the issue. Throwing while a vector is being initialized causes the partially initialized vector to be freed, including each element in the vector (if it has only a single reference). But not all of its elements exist yet, so the free tries to ref-count invalid memory.

Are ints inside a vector refcounted individually?

Yes, vectors are not currently specialized to non-refcounted types (as far as I know). Even if they were, ints are arbitrary-precision integers that use an efficient pointer-tagged representation until they overflow (see the paper on integers in Koka). Since they can overflow, they do need ref-counting, but while they have not overflowed they are a value type and don't actually do any ref-counting in practice. int32 and int64, in contrast, are non-refcounted primitive machine types, and I believe Koka boxes these types when they are used in a generic type.

As such, it is best to just use ints: they are designed to be efficient at ref-counting (no ref-counting unless they require allocation), and they do not need to be boxed or allocated thanks to the pointer tagging.
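To make the pointer-tagging point concrete, here is a minimal sketch in C of a tagged word whose small values skip ref-counting entirely. It is not Koka's runtime code: the type `tagged_t`, the helpers `from_small`, `tagged_dup`, `tagged_drop`, and the choice of the low bit as the tag are all assumptions made for illustration, and Koka's real encoding may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical tagged word (not Koka's actual layout):
//   low bit 1 -> small integer stored inline, never ref-counted
//   low bit 0 -> pointer to a heap-allocated, ref-counted big integer
typedef uintptr_t tagged_t;

typedef struct bigint_s {
  int refcount;  // number of live references; digits would follow in a real bigint
} bigint_t;

static int live_bigints = 0;  // allocation counter, for illustration only

static bool is_small(tagged_t t) { return (t & 1) == 1; }

// Encode a small non-negative integer inline (negatives omitted for brevity).
static tagged_t from_small(intptr_t n) {
  assert(n >= 0);
  return ((uintptr_t)n << 1) | 1;
}

static intptr_t to_small(tagged_t t) {
  assert(is_small(t));
  return (intptr_t)(t >> 1);
}

// Allocate a "big" integer; malloc alignment guarantees the low bit is 0.
static tagged_t new_big(void) {
  bigint_t* b = malloc(sizeof(bigint_t));
  assert(b != NULL);
  b->refcount = 1;
  live_bigints++;
  return (tagged_t)b;
}

// Duplicate a reference: a no-op for inline values.
static tagged_t tagged_dup(tagged_t t) {
  if (!is_small(t)) ((bigint_t*)t)->refcount++;
  return t;
}

// Drop a reference: a no-op for inline values, frees a big integer at zero.
static void tagged_drop(tagged_t t) {
  if (is_small(t)) return;
  bigint_t* b = (bigint_t*)t;
  if (--b->refcount == 0) {
    free(b);
    live_bigints--;
  }
}
```

In this sketch, dropping a small value only tests one bit, which is why a vector of ints that never overflow pays almost nothing for ref-counting. The drop routine still has to inspect every slot, though, so each slot must hold a well-formed word.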

This vector issue does need fixing, but I don't know which approach is best. We could specialize drop / free for vectors. The initializing function could check for yielding in a final manner and fill the remaining slots with value representations that don't require allocation but can be ref-counted without causing a segfault. Or we could initialize the memory with some value-type representation that doesn't need ref-counting before filling it by executing the user's function. For the integer example, this last option would essentially be the same as just initializing with

pub fun vector( ^n : int, default : a) : vector<a>
prior to mapping over it with the user's function, but for non-value types it would be more expensive unless we do it at a lower level.

Thanks! Tim was right: the vector was half initialized and freeing it would fail. I fixed it in the latest dev by always initializing such arrays with dummy values first; since the vector is not reachable during initialization, this should be safe.
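The idea behind that fix can be sketched in C as follows. This is an illustrative model only, not the actual change in Koka's runtime: `box_t`, `vec_t`, `vec_init`, `vec_drop`, and the use of NULL as the dummy value are assumptions, and a `false` return from the generator stands in for a thrown exception. The generator `fail_at_three` mirrors the reproduction, where the out-of-bounds access happens at index 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical ref-counted element.
typedef struct box_s { int refcount; int value; } box_t;

static int live_boxes = 0;  // allocation counter, for illustration only

static box_t* box_new(int value) {
  box_t* b = malloc(sizeof(box_t));
  assert(b != NULL);
  b->refcount = 1;
  b->value = value;
  live_boxes++;
  return b;
}

// Dropping the dummy value (NULL) is a safe no-op.
static void box_drop(box_t* b) {
  if (b == NULL) return;
  if (--b->refcount == 0) { free(b); live_boxes--; }
}

typedef struct vec_s { size_t len; box_t** items; } vec_t;

// Element generator; returning false models an exception in user code.
typedef bool (*init_fn)(size_t i, box_t** out);

// Drops every slot. Safe only because every slot is always well-formed.
static void vec_drop(vec_t* v) {
  for (size_t i = 0; i < v->len; i++) box_drop(v->items[i]);
  free(v->items);
  free(v);
}

static vec_t* vec_init(size_t len, init_fn f) {
  vec_t* v = malloc(sizeof(vec_t));
  assert(v != NULL);
  v->len = len;
  v->items = malloc((len > 0 ? len : 1) * sizeof(box_t*));
  assert(v->items != NULL);
  // Step 1: fill every slot with a dummy before running any user code.
  for (size_t i = 0; i < len; i++) v->items[i] = NULL;
  // Step 2: run the user's function; a failure can now free the vector safely.
  for (size_t i = 0; i < len; i++) {
    box_t* b = NULL;
    if (!f(i, &b)) { vec_drop(v); return NULL; }
    v->items[i] = b;
  }
  return v;
}

// Mirrors the reproduction: indices 0..2 succeed, index 3 fails.
static bool fail_at_three(size_t i, box_t** out) {
  if (i >= 3) return false;
  *out = box_new((int)i + 1);
  return true;
}
```

Calling `vec_init(6, fail_at_three)` returns NULL after releasing the three elements that were created, without touching uninitialized memory. Without step 1, `vec_drop` would read indeterminate pointers in slots 3 to 5, which is the kind of access that produced the segfault.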