JuliaLang / julia

The Julia Programming Language

Home Page:https://julialang.org/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

automatic recompilation of dependent functions

vtjnash opened this issue · comments

if I define a function d2 before a function d1 which calls d2 then change d2, d1 uses the old definition for d2.
I assume this is because it is all precompiled, but maybe there should be a note warning of this? Or would it be possible to replace the old definition with a longjmp to the new one?
(Mostly important for the REPL, since I don't always do a full load)

julia> function d2()
       a
       end

julia> function d()
         d2()
       end

julia> d()
in d: a not defined

julia> function d2()
       b=2
       end

julia> d()
in d: a not defined

julia> d2
Methods for generic function d2
d2() at prompt:2

julia> function d()
         d2()
       end

julia> d()
2

I believe this is because the code for d2() is generated when the function is defined so the d() method is resolved at that time. As you noted, you can redefine d2() as it was originally defined to make it work as expected. Ideally, we would automatically re-define anything that depends on d() when it is changed like this, but we don't have any of the metadata in place that would allow that. I guess we could put a warning about this behavior in the manual. Better still would be to fix it, but I think it's a bit tricky to do :-/

I was hoping nobody would notice this, but I guess that wasn't realistic of me :)

It's more fun than that; it really lets you see the JIT at work, since it isn't resolved until execution:

julia> function f2()
       a
       end

julia> function f1()
       f2()
       end

julia> function f2()
       2
       end

julia> f1()
2

However, I fear this can lead to some unexpected results at the REPL, although I don't really know the best way to deal with it. Although I used an error to show the difference in dramatic fashion, it could also happens if I was trying to see the effect of changing an inner equation or constant!

There are two types of locations that may need to be updated when a method is altered:

  • places where the method is called explicitly as a function (call/ret)
  • places where the method has been inlined

The former could be updated just by changing the function body: it could be altered in place, or the calling sites could be actively updated to call a new location, or the old function body could be replaced with a stub that jumps to the new version, optionally patching the calling site. For inlining, we need to track the callers that have inlined the function and re-JIT them.

Really all we can do is discard code, because even if a method wasn't inlined any function that calls it might depend on type behavior that can change if the method changes. This is mostly for interactive cases so re-compiling code is not a huge deal.
Requires same work as issue #47.

Bummer. Well, since it's primarily the repl that's affected, doing the right thing slowly is fine. At run-time, this should almost never kick in — why would someone define something and then redefine it again except interactively? If the solution ends up inducing a lot of overhead maybe turning it one should be optional and automatically done in the repl?

We need to document that this is undefined currently.

Technically, this isn't bug, it's an undefined behavior. When you redefine a method, the resulting behavior is undefined. So Julia can do whatever it likes, including what it currently does. Providing a well-defined behavior upon method redefinition is a feature, not a bug fix. I'm also not convinced that this is a v1.0 issue since going from an undefined behavior to providing a well-defined behavior is not a breaking change. This could be implemented in v1.1 without breaking any valid v1.0 code.

Greg Clayton from Apple's LLVM/LLDB staff was kind enough to document how to elicit (with lldb libraries, a subproject of llvm) the necessary information to determine a binary's dependencies from the embedded symbol info (symbol imports); as well as those symbols exported by a binary (necessary to construct the complete dependency graph).

  • Jason

    On Mar 31, 2012, at 11:02 PM, Jason E. Aten wrote:

    Dear LLDB enthusiasts,

    I'm wondering if I can use the lldb library/libraries to replace the certain code running on OSX that now returns two lists of symbols-- similar to the output of (dyldinfo -lazy_bind -exports ); i.e. I need to list the symbols imported and exported by a binary shared object or bundle.

    My hope was that by using an lldb library, I would be able to use the same client code on OSX as on linux. (The linux version of the code currently uses libbfd and libdld to do the same thing, but the later is getting little love/maintenance).

    I'm looking through include/lldb/, as it seems like lldb would need this same info (imported symbol list, and exported symbol list for a Mach-O file) to function, but it's not clear which API to use. All suggestions/pointers to example code in lldb would be welcome!

    Thank you.
    Jason

    In case it is unclear what dyldinfo does, here is an example: (but I only need the symbol names; not the addresses or segments or sections):

    $ file /tmp/sample_bundle
    /tmp/sample_bundle: Mach-O 64-bit bundle x86_64

    $dyldinfo -lazy_bind -export /tmp/sample_bundle

    lazy binding information (from lazy_bind part of dyld info):
    segment section address index dylib symbol
    __DATA __la_symbol_ptr 0x00001030 0x0000 flat-namespace __mm_pop_chunk
    __DATA __la_symbol_ptr 0x00001038 0x0015 flat-namespace _dh_define
    export information (from trie):
    0x000008A0 _C_ipair
    0x00000920 _init_ipair
    0x00000BC0 _C_iprot
    0x00000C40 _C_ipi2
    0x00000CC0 _C_ipi1
    0x00001040 _K_ipair_R43808f40
    0x00001160 _K_ipi1_R5cb4475d
    0x00001260 _K_ipi2_R5cb4475d
    0x00001360 _K_iprot_Rfc8fe739
    0x00001460 _majver_ipair
    0x00001464 _minver_ipair

On Mon, Apr 2, 2012 at 3:13 PM, Greg Clayton gclayton@apple.com wrote:

Yes you can do this with LLDB. If you load a binary and dump its symbol table, you will see the information you want. For symbols that are lazily bound, you can look for "Trampoline" symbols:

cd lldb/test/lang/objc/foundation
make
lldb a.out
(lldb) target modules dump symtab a.out
Symtab, file = .../lldb/test/lang/objc/foundation/a.out, num_symbols = 54:
              Debug symbol
              |Synthetic symbol
              ||Externally Visible
              |||
Index   UserID DSX Type         File Address/Value Load Address       Size               Flags      Name
------- ------ --- ------------ ------------------ ------------------ ------------------ ---------- ----------------------------------
[    0]      0 D   SourceFile   0x0000000000000000                    Sibling -> [   15] 0x00640000 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/main.m
[    1]      2 D   ObjectFile   0x000000004f79f1ca                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/main.o
[    2]      4 D   Code         0x00000001000010f0                    0x00000000000000c0 0x000e0000 -[MyString initWithNSString:]
[    3]      8 D   Code         0x00000001000011b0                    0x0000000000000090 0x000e0000 -[MyString dealloc]
[    4]     12 D   Code         0x0000000100001240                    0x00000000000000a0 0x000e0000 -[MyString description]
[    5]     16 D   Code         0x00000001000012e0                    0x0000000000000020 0x000e0000 -[MyString descriptionPauses]
[    6]     20 D   Code         0x0000000100001300                    0x0000000000000030 0x000e0000 -[MyString setDescriptionPauses:]
[    7]     24 D   Code         0x0000000100001330                    0x0000000000000030 0x000e0000 -[MyString str_property]
[    8]     28 D   Code         0x0000000100001360                    0x0000000000000050 0x000e0000 -[MyString setStr_property:]
[    9]     32 D   Code         0x00000001000013b0                    0x0000000000000040 0x000f0000 Test_Selector
[   10]     36 D   Code         0x00000001000013f0                    0x0000000000000130 0x000f0000 Test_NSString
[   11]     40 D   Code         0x0000000100001520                    0x0000000000000120 0x000f0000 Test_MyString
[   12]     44 D   Code         0x0000000100001640                    0x00000000000001b0 0x000f0000 Test_NSArray
[   13]     48 D   Code         0x00000001000017f0                    0x00000000000000e1 0x000f0000 main
[   14]     56 D X Data         0x0000000100002680                    0x0000000000000000 0x00200000 my_global_str
[   15]     58 D   SourceFile   0x0000000000000000                    Sibling -> [   19] 0x00640000 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/my-base.m
[   16]     60 D   ObjectFile   0x000000004f79f1ca                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/my-base.o
[   17]     62 D   Code         0x00000001000018e0                    0x0000000000000020 0x000e0000 -[MyBase propertyMovesThings]
[   18]     66 D   Code         0x0000000100001900                    0x000000000000001f 0x000e0000 -[MyBase setPropertyMovesThings:]
[   19]     82     Data         0x0000000100002000                    0x0000000000000460 0x000e0000 pvars
[   20]     83     ObjCIVar     0x0000000100002518                    0x0000000000000148 0x001e0000 MyBase.propertyMovesThings
[   21]     84   X Data         0x0000000100002660                    0x0000000000000008 0x000f0000 NXArgc
[   22]     85   X Data         0x0000000100002668                    0x0000000000000008 0x000f0000 NXArgv
[   23]     86   X ObjCClass    0x00000001000024d8                    0x0000000000000028 0x000f0000 MyBase
[   24]     87   X ObjCClass    0x0000000100002460                    0x0000000000000028 0x000f0000 MyString
[   25]     88   X ObjCIVar     0x0000000100002510                    0x0000000000000008 0x000f0000 MyString._desc_pauses
[   26]     89   X ObjCIVar     0x0000000100002508                    0x0000000000000008 0x000f0000 MyString.date
[   27]     90   X ObjCIVar     0x0000000100002500                    0x0000000000000008 0x000f0000 MyString.str
[   28]     91   X ObjCMetaClass 0x00000001000024b0                    0x0000000000000028 0x000f0000 MyBase
[   29]     92   X ObjCMetaClass 0x0000000100002488                    0x0000000000000028 0x000f0000 MyString
[   30]     97   X Data         0x0000000100002678                    0x0000000000000008 0x000f0000 __progname
[   31]     98   X Data         0x0000000100000000                    0x00000000000010b0 0x000f0010 _mh_execute_header
[   32]     99   X Data         0x0000000100002670                    0x0000000000000008 0x000f0000 environ
[   33]    101   X Data         0x0000000100002680                    0x0000000000000000 0x000f0000 my_global_str
[   34]    102   X Code         0x00000001000010b0                    0x0000000000000040 0x000f0000 start
[   35]    103     Trampoline   0x0000000100001938                    0x0000000000000006 0x00010200 NSLog
[   36]    104   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSArray
[   37]    105   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010200 OBJC_CLASS_$_NSAutoreleasePool
[   38]    106   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSDate
[   39]    107   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSObject
[   40]    108   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010200 OBJC_CLASS_$_NSString
[   41]    109   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_METACLASS_$_NSObject
[   42]    110   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 __CFConstantStringClassReference
[   43]    111   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 _objc_empty_cache
[   44]    112   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 _objc_empty_vtable
[   45]    113     Trampoline   0x000000010000193e                    0x0000000000000006 0x00010300 exit
[   46]    114     Trampoline   0x0000000100001920                    0x0000000000000006 0x00010100 objc_getProperty
[   47]    115     Trampoline   0x0000000100001926                    0x0000000000000006 0x00010100 objc_msgSend
[   48]    116     Trampoline   0x000000010000192c                    0x0000000000000006 0x00010100 objc_msgSendSuper2
[   49]    117   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 objc_msgSend_fixup
[   50]    118     Trampoline   0x0000000100001932                    0x0000000000000006 0x00010100 objc_setProperty
[   51]    119     Trampoline   0x0000000100001944                    0x0000000000000006 0x00010300 printf
[   52]    120     Trampoline   0x000000010000194a                    0x0000000000000006 0x00010300 usleep
[   53]    121   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010300 dyld_stub_binder
(lldb)


All lazily bound symbols will have type Trampoline:

[   45]    113     Trampoline   0x000000010000193e                    0x0000000000000006 0x00010300 exit
[   46]    114     Trampoline   0x0000000100001920                    0x0000000000000006 0x00010100 objc_getProperty
[   47]    115     Trampoline   0x0000000100001926                    0x0000000000000006 0x00010100 objc_msgSend
[   48]    116     Trampoline   0x000000010000192c                    0x0000000000000006 0x00010100 objc_msgSendSuper2
[   50]    118     Trampoline   0x0000000100001932                    0x0000000000000006 0x00010100 objc_setProperty
[   51]    119     Trampoline   0x0000000100001944                    0x0000000000000006 0x00010300 printf
[   52]    120     Trampoline   0x000000010000194a                    0x0000000000000006 0x00010300 usleep

The other symbols that are exernal are marked with an "X" (which is a boolean flag on each symbol).

The symbols can be accessed via the SBModule:

   size_t
   SBModule::GetNumSymbols ();

   lldb::SBSymbol
   SBModule::GetSymbolAtIndex (size_t idx);

And then you can get the symbol type from each SBSymbol:

   SymbolType
   SBSymbol::GetType ();


I just added the ability to see if a symbol is externally visible:
% svn commit
Sending        include/lldb/API/SBSymbol.h
Sending        scripts/Python/interface/SBSymbol.i
Sending        source/API/SBSymbol.cpp
Transmitting file data ...
Committed revision 153893.


   bool
   SBSymbol::IsExternal();


So your flow should be:

SBDebugger::Initialize();
SBDebugger debugger(SBDebugger::Create());
SBTarget target (debugger.CreateTarget (const char *filename,
                                       const char *target_triple,
                                       const char *platform_name,
                                       bool add_dependent_modules,
                                       lldb::SBError& error));

SBFileSpec exe_file_spec (filename);
SBModule exe_module (target.FindModule(exe_file_spec));
if (exe_module.IsValid()
{
   const size_t num_symbols = exe_module. GetNumSymbols();
   for (size_t i=0; i<num_symbols; ++i)
   {
       SBSymbol symbol (exe_module. GetSymbolAtIndex(i));
       if (symbol.IsExternal())
       {
       }

       if (symbol.GetType() == lldb::eSymbolTypeTrampoline)
       {
       }
   }
}




> _______________________________________________
> lldb-dev mailing list
> lldb-dev@cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

On Mon, Apr 2, 2012 at 4:05 PM, Greg Clayton gclayton@apple.com wrote:

A quick clarification on the args to CreateTarget:

"filename" is just a full path to the local object file you want to observer. "target_triple" is your <arch>-<vendor>-<os>, or something like "x86_64-apple-darwin" or "i386-pc-linux" and can be NULL if the file is only a single architecture. "platform_name" can be NULL. "add_dependent_modules" should be false, since you are only interested in seeing the one object file itself, an SBError instance  should be created and passed in.

This is dredging up an amazingly old thread, but I realized I am possibly flirting around the edges of this problem with my codespeed benchmarking code. I repeatedly Core.include() files that contain two functions, listTests() and runTests(). I then simply call listTests() and runTests(), thereby redefining and then calling them every time I load a new benchmark file. Because I'm not redefining 2nd-tier functions I don't think I'll run into the problems listed here, but is redefining things like this a bad way to go about loading in multiple files with the same "API"?

Might be better to include them in a module repeatedly, but I realize that bumps up against the anonymous module problem that's also been raised on the dev list. In this case, it may be fine, however, since you can just define the same module multiple times and ignore the warning that emits.

Perhaps an old thread, but as relevant as ever.
Maybe another approach is to use evalfile and have the file "return" a tuple of functions. What's the anonymous module problem?

I was referring to this: #3661.

Anyone have additional thoughts on implementation strategies? I was thinking of coding up a simple scheme that maintains a inverted call graph which is updated whenever a new method is created (based on the unoptimized AST). It then walks that graph when a method is redefined, removing the compiled version of each of the redefined function's descendants.

That's as reasonable an approach as any to try. If you're feeling motivated to do it, I'd say go ahead and let's see what happens!

What happens if execution of the callee triggers recompilation of the caller?

I ask because I've belatedly realized that resolving this issue may have interesting consequences that go beyond the user experience in the REPL: it might allow the ultimate answer to a metaprogramming problem, that of creating efficient "staged" functions. As background for those who may not know: some functions are currently implemented by metaprogramming, in particular those for which some aspect of the algorithm depends on the types of the inputs in a nontrivial way---the canonical example is one in which the number of loops in the function body is equal to the dimensionality of an array. The way we usually handle this is to define the function explicitly for, e.g., dimensions 1 and 2, and then have a wrapper which looks something like this:

_method_cache = Dict()
function myfunction(A::AbstractArray)
    N = ndims(A)
    if !haskey(_method_cache, N)
        func = eval(<an expression that generates the function definition for N dimensions>)
        _method_cache[N] = func
    else
        func = _method_cache[N]
    end
    return func(A)
end

So the first time myfunction gets executed for a 4-dimensional array, it first defines a version specific for 4 dimensions, adds it to _method_cache, and then evaluates the new function on the input. On future calls to myfunction with a 4-dimensional array, it just retrieves the definition from the dictionary and evaluates it. _method_cache is a kind of "shadow method table" in parallel to Julia's own internal method table, one that is used just for this particular function. (To keep it private, it's usually encapsulated by a let.)

While the approach in this example works well for function bodies that take some time to execute, it's not very well-suited for functions that execute in less time than required for a dictionary-lookup, and it's especially bad for functions that you want to be able to inline.

A better way to do this might be the following:

function myfunction(A::AbstractArray)
    bodyexpr = <an expression for the body of the function specific for N dimensions>
    f = @eval begin
        function myfunction(A::$(typeof(A)))
            $bodyexpr
        end
    end
    return f(A)
end

Here, execution of myfunction generates a more specific version of myfunction for these particular input types. Code that is compiled after this new definition becomes available will use this new version when applicable; the version above becomes the generic "fallback" for those cases where you haven't already defined something more specific. Consequently, this would be a way of creating staged functions that exploits Julia's own internal method-table mechanisms, and would therefore allow one to generate efficient code.

Currently, this doesn't work for one crucial reason: whatever just called myfunction has already been compiled, so it doesn't know about the new definition. Consequently, this particular caller will always generate a fresh new version of myfunction using eval, which will be god-awful slow.

So the trick would be to recompile the caller, but notice that this needs to occur while it is in the middle of execution. What's going to happen?

See also #5395.

There is a related problem with abstract types that does not actually involve method redefinition but needs similar treatment. Consider:

abstract A
immutable B <: A; end
immutable C <: A; end

g(x::Vector{A}) = f(x[1])

f(::B) = 1
g(A[B()])

f(::C) = 0.5
g(A[C()])

The last line gives 4602678819172646912. We need to throw out the code for g because the type inference for f(::A) is no longer valid.

Yes, that is quite clear. We know replacing methods is only one special case. But it always involves new method definitions.

It seems that current situation can be somehow ambiguous since

f() = x()
x() = 1
println(f())
x() = 2
println(f())

gives

1
1

while

g() = y()
precompile(g, ())
y() = 1
println(g())
y() = 2
println(g())

gives

1
2

It seems that the later case can possible be used as a workaround to emulate caller recompilation (I know its not really recompilation).

I have a feeling that I'm about to say something stupid, but couldn't this problem be eliminated with a single level of indirection? Instead of hard-coding a function address, look it up from a fixed location and call that. Wouldn't this have similar performance to C++ or Java virtual functions? You could then annotate functions to have a static or dynamic address, and get a warning/error when you try to redefine a function with a static address. There could even be a switch to set the default function behavior. I am new to the language and completely unfamiliar with the code base, but if this sounds feasible, I suppose I could give it a jab.

@omer4d one problem with that idea is that Julia does inline a lot of smaller functions, so we still need a way to look up all "dependents" on a certain function.

It could inline only the ones with static addresses, like C++ does. This shouldn't cause any inconvenience.
People who don't care about interactive development won't be affected at all, since the default behavior would be to use static addresses for all functions.
People who don't care about performance but want interactive development can use a switch to make function addresses dynamic by default.
People who want both interactivity and performance could just annotate the relevant functions, which shouldn't be a lot of work, since inlining is only relevant for short low-level functions that get called a lot.
Perhaps the default behavior could even be set per module.

I believe there are several reasons that approach is not viable. First,
annotations really are a nuisance, especially when they are needed to hack
around implementation details like this. If annotations are used to solve 2
or 3 problems, they start to pile up significantly. Second, there is no
good way to pick the right annotation for a function. There is no
connection between whether you want something inlined, and whether you are
developing it interactively. Third, type deductions in other functions
might need to change, so it's not just call sites. In fact we are currently
deriving less type information than we potentially could to make things
safe in the face of this issue.

Well, as someone who does most of his work in languages with dozens of qualifier keywords and repeated manual recompilation, having to occasionally qualify a function or two and recompile after a few cycles of interactive development doesn't sound so bad, but that's just me. I don't know enough to address the third point, though, so I'll step down. I think it is unfortunate that such a solution isn't viable, since recompilation means that if the root of a modified function's call tree contains a game or animation loop, it will need to be terminated and restarted. =(

@JeffBezanson I would probably be most interested in the easy case, at least at first. I don't care as much running code gets updated, as opposed to things accessed through eval (e.g. entered at the REPL prompt). Eventually, it will also be important to e.g. get the right version of display called when redefined. But I believe that should rapidly get us closer to having incremental compiles.

I was a little bit surprized to know this problem, not because I think it is easy to handle this, but because the following works as expected.

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(args...) = f(args...)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
-1

Edit: Actually this looks like a special case for Vararg input ..... Does this mean that a vararg function is recompiled everytime it is called? Or is it smart enough to only recompile when necessary? And is it possible to just treat other functions in the same way?

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(a, b) = f(a, b)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
3

Actually, the function returns the right result but the @code_typed returns wrong result.....

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(args...) = f(args...)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> @code_typed g(1, 2)
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(Int64,Int64),0]],Any[]], :(begin  # none, line 1:
        return (top(box))(Int64,(top(add_int))((top(tupleref))(args::(Int64,Int64),1)::Int64,(top(tupleref))(args::(Int64,Int64),2)::Int64))
    end::Int64))))

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
-1

julia> @code_typed g(1, 2)
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(Int64,Int64),0]],Any[]], :(begin  # none, line 1:
        return (top(box))(Int64,(top(add_int))((top(tupleref))(args::(Int64,Int64),1)::Int64,(top(tupleref))(args::(Int64,Int64),2)::Int64))
    end::Int64))))

And

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(args...) = f(args...)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
-1

julia> @code_typed g(1, 2)
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(Int64,Int64),0]],Any[]], :(begin  # none, line 1:
        return (top(box))(Int64,(top(sub_int))((top(tupleref))(args::(Int64,Int64),1)::Int64,(top(tupleref))(args::(Int64,Int64),2)::Int64))
    end::Int64))))

would it work to index methods of functions (or only the function types/objects itself) with a number that changes everytime anything is assigned and thus becoming able to check whether the number stored with the compiled (caller) function (i. e. using the new structure of functions, saving these numbers as a field etc) still matches the one that is "current" on the function that was inlined (called)?
(without a big performance penalty of course)
And in case of change consider to compile the new version or complete by explicit call.

That would move the performance impact to runtime (and it would be everywhere), which is not desirable. What's needed is keeping track of the web of dependencies and recompiling anything that might be affected when a method is partially redefined. It's probably doable, it's just a massive pain.

I don't know if the solution to this issue would easily generalise to macros, but if so that'd be a nice to have.

So one rather should go the other way around and store all methods that inlined the function and on change recompile them. (of course checking for current execution etc)

The hard problems here are :

  • code invalidation as you say, find out all the places that uses old functions (in generated code). That probably implies being able to re-relocate the code that calls functions that changed but for which the calling convention & return type did not change
  • harder : what happens if another task is running while you define a new method ? what state does it see ? You want to solve that without on-stack replacement because that is very hard to implement efficiently. We think we have a solution to that which does not kill performance and I think @vtjnash will come up with a little write up once it's in decent shape

While it is worked on a solution for the problem I'm curious whether it is possible and a good idea to add a flag to every function that is set if (and only if) the given function was used in any of the upper ways (inlining etc) in order to produce a warning that an inlined method got redefined. (Similar to the way the ambiguous warning works)
If there is no better way saving the flag then again maybe by using the new structure of methods/functions in julia 0.5

It is now even easier to hit this in the repl and this behaviour should maybe be documented for map?

julia> function f(x)
         1
       end 
f (generic function with 1 method)

julia> map(f, 1:10)
10-element Array{Int64,1}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1

julia> function f(x)
         2
       end
WARNING: Method definition f(Any) in module Main at REPL[9]:2 overwritten at REPL[11]:2.
f (generic function with 1 method)

julia> map(f, 1:10)
10-element Array{Int64,1}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1

That's still just vanilla #265. I think you'll appreciate that this'll be fixed soon in v0.6-dev :)

@vtjnash is there any runtime performance or memory tradeoff?

it requires a bit of memory (140MB -> 170 MB), but shouldn't have much affect on performance (compile or runtime). And I haven't really attempted much optimization yet.

The demos so far are fun:

julia> f() = 1
f (generic function with 1 method)

julia> function g(x)
    @eval f() = $x # this is the correct way to write `global f() = x` (which should be a syntax error, but isn't currently)
    return @eval(f)() # use `eval` to hide the value from optimization / inlining, but the call is not inside eval
end
g (generic function with 1 method)

julia> g(2)
WARNING: Method definition f() in module Main at REPL[1]:1 overwritten at REPL[2]:2.
1

julia> g(3)
WARNING: Method definition f() in module Main at REPL[2]:2 overwritten at REPL[2]:2.
2

julia> g(4)
WARNING: Method definition f() in module Main at REPL[2]:2 overwritten at REPL[2]:2.
3

Is the reason that doesn't return 2, 3, 4 due to the order in which compilation of g vs execution that redefines f happen?

That's roughly right. Note that the language semantic don't define compilation, so it's more correct to say that the order in which g is interpreted vs. the time that the redefinition of f becomes visible to the interpreter

alright, here's another fun little demo redefining the + primitive to keep count of how much it gets used:

julia> add_ctr = UInt(0)
0x0000000000000000

julia> Base.:+(a::Int, b::Int) = (global add_ctr += 1; Core.Intrinsics.add_int(a, b))

julia> add_ctr
0x0000000000000016

julia> last = 0;

julia> println(Int(add_ctr - last)); last = add_ctr;
287

julia> println(Int(add_ctr - last)); last = add_ctr;
17

That one's cheating: I don't think + gets used by Base at all, so there wasn't anything to recompile. Redefine a real function, like svd, that you know must get called at least 100 times before the REPL shows up. Then we'll be impressed.

The opportunities for live re-instrumentation of code for profiling/tracing are blowing me away.

Indeed. And here I was about to buy a new keyboard, because the sequence Ctrl-D; julia<Enter> is just about worn out. Looks like I might not have to.

It might be a bit arcane to put in the release notes, but should it be noted somewhere that comprehensions are now syntax for collect-ing a Generator (as opposed to something maybe at the parsing level in 0.4? -- @which doesn't work there)? This example, similar to vchuravy's above, is a bit of a gotcha even considering that each function is now a type in 0.5:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.5.0-rc3+0 (2016-08-22 23:43 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-linux-gnu

julia> f(x) = 1
f (generic function with 1 method)

julia> [f(x) for x in 1:5]
5-element Array{Int64,1}:
 1
 1
 1
 1
 1

julia> f(x) = 2
WARNING: Method definition f(Any) in module Main at REPL[1]:1 overwritten at REPL[3]:1.
f (generic function with 1 method)

julia> [f(x) for x in 1:5]
5-element Array{Int64,1}:
 1
 1
 1
 1
 1

julia> @which [f(x) for x in 1:5]
collect(itr::Base.Generator) at array.jl:295

not quite, right?

useful for fix #265

Is this really fixed 😲

Is this solution still planned to be backported to 0.5.x?

No.

@rapus95, this is a very invasive change, and getting it wrong could destabilize julia for a lot of users. Since the whole point of releases is to provide more stability than following master, it's far better to have it in 0.6. (You can always follow master if you want it earlier than the release of 0.6.)

I'm so happy to see progress on this issue! It would be amazing if this also allowed the compiler to optimize the case described in https://groups.google.com/forum/#!topic/julia-users/OBs0fmNmjCU.

Wow. What a great moment!

Hi guys!

Is there any possibility to backport this fix to 0.5 serie? Or does it only work in what will become 0.6?

This is a breaking change and will only be available in 0.6

It's hard to overstate how awesome it is to have this fixed. Old habits die hard so I sometimes restart and/or rebuild julia unnecessarily, but when I remember it's a complete game-changer for fixing problems.

I'm still getting this on Julia 1.5 and master on occasion, where code_typed changes but code_llvm and observed behavior do not. The behavior seems to be dependent on package-load order.
Here, I am redefining a couple functions from VectorizationBase. Some other functions in VectorizationBase depend on these, and are then called in VectorizedRNG. I get the correct behavior if I load both VectorizationBase and VectorizedRNG before redefining:

julia> using VectorizationBase, VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 514)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1785"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1790"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

We start with a 514, and then after the redefinitions, the offset is 130 (the offset equals 2 + 8 * number of integers it can process efficiently with SIMD instructions).
But now, we'll redefine the functions before loading VectorizedRNG:

julia> using VectorizationBase

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizationBase, VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1660"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

@code_typed is correct in the sense that it updated. It shows the correct new constant, 130.
@code_llvm is correct in the sense that this is the actual behavior the function shows. It shows the old constant, 514, which is still being used.

Might be related: https://github.com/chriselrod/VectorizationBase.jl/search?q=pure

@pure is the mechanism that opts out of this fix. Does it still occur without those annotations?

Oops, might be my fault! Thanks -- I'll try taking them out.

Without @pure Julia won't optimize them, but LLVM shouldn't have a problem.

Unfortunately not:

(@v1.7) pkg> st VectorizationBase
      Status `~/.julia/environments/v1.7/Project.toml`
  [3d5dd08c] VectorizationBase v0.18.3 `~/.julia/dev/VectorizationBase`

julia> using VectorizationBase

julia> run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`) # all instances of `pure` are commented out
/home/chriselrod/.julia/dev/VectorizationBase/src/VectorizationBase.jl:14:# Base.@pure asvalbool(r) = Val(map(Bool, r))
/home/chriselrod/.julia/dev/VectorizationBase/src/VectorizationBase.jl:15:# Base.@pure asvalint(r) = Val(map(Int, r))
/home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/vbroadcast.jl:54:        # $(Expr(:meta,:pure,:inline))
Process(`grep -nr pure /home/chriselrod/.julia/dev/VectorizationBase/src`, ProcessExited(0))

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG
[ Info: Precompiling VectorizedRNG [33b4df10-0173-11e9-2a0c-851a7edac40e]

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("    \n\n    define void @entry(i64, i8, i64) alwaysinline {\n    top:\n        %ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void\n    }\n", "entry"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8}, UInt8, Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1414"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

EDIT: I just pushed the commit to remove all the pures to master, to remove that as a possible cause in case anyone wants to try the example.

EDIT: Although ArrrayInterface.jl is also using it at some locations. I'll try taking those out too.

EDIT: Also removing them from ArrayInterface solved the problem.

On 1.5, removing all @pure fixed it:

julia> using VectorizationBase

julia> !success(run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`, wait=false))
true

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1692"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> using ArrayInterface

julia> !success(run(`grep -nr "pure" $(dirname(pathof(ArrayInterface)))`, wait=false))
true

julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake-avx512)
Environment:
  JULIA_NUM_THREADS = auto

But on master:

julia> using VectorizationBase

julia> !success(run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`, wait=false))
true

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1%1 = Base.getfield(rng, :ptr)::Ptr{UInt64}%2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}%3 = VectorizationBase.llvmcall(("    \n\n    define void @entry(i64, i8, i64) alwaysinline {\n    top:\n        %ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void\n    }\n", "entry"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8}, UInt8, Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1161"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> using ArrayInterface

julia> !success(run(`grep -nr "pure" $(dirname(pathof(ArrayInterface)))`, wait=false))
true

julia> versioninfo()
Julia Version 1.7.0-DEV.421
Commit 22858a0d29* (2021-02-01 19:04 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake-avx512)
Environment:
  JULIA_NUM_THREADS = auto

EDIT: No longer works on 1.5.

Note that @generated also may opt out of the fix, and it is similarly your responsibility to avoid writing them in such a way that the effect is likely to be visible to your users.

Could that explain why @code_typed is updated, but @code_llvm and the actual behavior are not?
I've been working on a minimal example, but haven't gotten a reproducer yet.

While there is a lot of @generated, I think that part of the code mostly relies on dispatch.

Also, redefining after the module has been loaded does trigger recompilation:

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1161"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> VectorizationBase.has_feature(Val{:x86_64_avx2}())
False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1287"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

The code created the bodies of generated functions shouldn't depend on any of these methods being redefined

julia> VectorizedRNG.getoffset()
514

julia> @code_typed VectorizedRNG.getoffset()
CodeInfo(
1return 130
) => Int64

julia> VectorizedRNG.getoffset()
514

julia> @code_warntype VectorizedRNG.getoffset()
MethodInstance for VectorizedRNG.getoffset()
  from getoffset() in VectorizedRNG at /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97
Arguments
  #self#::Core.Const(VectorizedRNG.getoffset)
Body::Int64
1%1 = VectorizedRNG.simd_integer_register_size()::Core.Const(static(16))
│   %2 = (4 * %1)::Core.Const(64)
│   %3 = (%2 * 2)::Core.Const(128)
│   %4 = (%3 + 2)::Core.Const(130)
└──      return %4


julia> VectorizedRNG.simd_integer_register_size()
static(16)

Downstream of simd_integer_register_size, all we have is multiplication.

By checking out the master branches, there is this simpler reproducer:

julia> using VectorizationBase

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG
[ Info: Precompiling VectorizedRNG [33b4df10-0173-11e9-2a0c-851a7edac40e]

julia> VectorizedRNG.getoffset()
514

julia> @code_warntype VectorizedRNG.getoffset()
MethodInstance for VectorizedRNG.getoffset()
  from getoffset() in VectorizedRNG at /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98
Arguments
  #self#::Core.Const(VectorizedRNG.getoffset)
Body::Int64
1%1 = VectorizedRNG.sirs()::Core.Const(static(16))
│   %2 = (4 * %1)::Core.Const(64)
│   %3 = (%2 * 2)::Core.Const(128)
│   %4 = (%3 + 2)::Core.Const(130)
└──      return %4


julia> VectorizedRNG.sirs()
static(16)

julia> VectorizedRNG.getoffset()
514

sirs is updated above. But the downstream multiplciation by integers is not.