automatic recompilation of dependent functions

Question

vtjnash opened this issue 13 years ago · comments

if I define a function d2 before a function d1 which calls d2 then change d2, d1 uses the old definition for d2.
I assume this is because it is all precompiled, but maybe there should be a note warning of this? Or would it be possible to replace the old definition with a longjmp to the new one?
(Mostly important for the REPL, since I don't always do a full load)

julia> function d2()
       a
       end

julia> function d()
         d2()
       end

julia> d()
in d: a not defined

julia> function d2()
       b=2
       end

julia> d()
in d: a not defined

julia> d2
Methods for generic function d2
d2() at prompt:2

julia> function d()
         d2()
       end

julia> d()
2

Yichao Yu commented 8 years ago

No.

Stefan Karpinski · Answer 1 · Mon Nov 28 2011 05:51:32 GMT+0800 (China Standard Time)

I believe this is because the code for d() is generated when the function is defined, so the call to d2() is resolved at that time. As you noted, you can redefine d() as it was originally defined to make it work as expected. Ideally, we would automatically re-define anything that depends on d2() when it is changed like this, but we don't have any of the metadata in place that would allow that. I guess we could put a warning about this behavior in the manual. Better still would be to fix it, but I think it's a bit tricky to do :-/

Jeff Bezanson · Answer 2 · Mon Nov 28 2011 11:29:43 GMT+0800 (China Standard Time)

I was hoping nobody would notice this, but I guess that wasn't realistic of me :)

Jameson Nash · Answer 3 · Tue Nov 29 2011 00:40:01 GMT+0800 (China Standard Time)

It's more fun than that; it really lets you see the JIT at work, since it isn't resolved until execution:

julia> function f2()
       a
       end

julia> function f1()
       f2()
       end

julia> function f2()
       2
       end

julia> f1()
2

However, I fear this can lead to some unexpected results at the REPL, although I don't really know the best way to deal with it. Although I used an error to show the difference in dramatic fashion, it could also happen if I were trying to see the effect of changing an inner equation or constant!

Stefan Karpinski · Answer 4 · Tue Nov 29 2011 02:59:47 GMT+0800 (China Standard Time)

There are two types of locations that may need to be updated when a method is altered:

places where the method is called explicitly as a function (call/ret)
places where the method has been inlined

The former could be updated just by changing the function body: it could be altered in place, or the calling sites could be actively updated to call a new location, or the old function body could be replaced with a stub that jumps to the new version, optionally patching the calling site. For inlining, we need to track the callers that have inlined the function and re-JIT them.

Jeff Bezanson · Answer 5 · Tue Nov 29 2011 11:53:15 GMT+0800 (China Standard Time)

Really all we can do is discard code, because even if a method wasn't inlined any function that calls it might depend on type behavior that can change if the method changes. This is mostly for interactive cases so re-compiling code is not a huge deal.
Requires same work as issue #47.
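
To make the point about type behavior concrete, here is a minimal sketch. It assumes a Julia version in which this issue has been fixed (a later comment in this thread points to v0.6-dev), and the names callee and caller are made up for illustration. Redefining callee with the same signature changes its return type, so any code compiled for caller under the old assumption has to be discarded rather than merely re-pointed at a new address.

```julia
callee() = 1                 # returns an Int
caller() = callee() + 1      # compiled code for caller may assume an Int result

before = caller()

callee() = 0.5               # same signature, but the return type is now Float64
after = caller()             # only correct if caller was recompiled

println(typeof(before), " -> ", typeof(after))
```

On versions affected by this issue, the second call could keep using the stale Int-based code, which is the behavior reported at the top of the thread.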

Stefan Karpinski · Answer 6 · Tue Nov 29 2011 16:53:14 GMT+0800 (China Standard Time)

Bummer. Well, since it's primarily the repl that's affected, doing the right thing slowly is fine. At run-time, this should almost never kick in — why would someone define something and then redefine it again except interactively? If the solution ends up inducing a lot of overhead, maybe turning it on should be optional and automatically done in the repl?

Stefan Karpinski · Answer 7 · Thu Mar 15 2012 13:18:50 GMT+0800 (China Standard Time)

We need to document that this is undefined currently.

Stefan Karpinski · Answer 8 · Mon Mar 26 2012 05:25:13 GMT+0800 (China Standard Time)

Technically, this isn't a bug, it's an undefined behavior. When you redefine a method, the resulting behavior is undefined. So Julia can do whatever it likes, including what it currently does. Providing a well-defined behavior upon method redefinition is a feature, not a bug fix. I'm also not convinced that this is a v1.0 issue since going from an undefined behavior to providing a well-defined behavior is not a breaking change. This could be implemented in v1.1 without breaking any valid v1.0 code.

Jason E. Aten, Ph.D. · Answer 9 · Thu Apr 19 2012 06:22:18 GMT+0800 (China Standard Time)

Greg Clayton from Apple's LLVM/LLDB staff was kind enough to document how to elicit (with lldb libraries, a subproject of llvm) the necessary information to determine a binary's dependencies from the embedded symbol info (symbol imports); as well as those symbols exported by a binary (necessary to construct the complete dependency graph).

Jason

On Mar 31, 2012, at 11:02 PM, Jason E. Aten wrote:

Dear LLDB enthusiasts,

I'm wondering if I can use the lldb library/libraries to replace certain code running on OSX that now returns two lists of symbols-- similar to the output of (dyldinfo -lazy_bind -exports ); i.e. I need to list the symbols imported and exported by a binary shared object or bundle.

My hope was that by using an lldb library, I would be able to use the same client code on OSX as on linux. (The linux version of the code currently uses libbfd and libdld to do the same thing, but the latter is getting little love/maintenance).

I'm looking through include/lldb/, as it seems like lldb would need this same info (imported symbol list, and exported symbol list for a Mach-O file) to function, but it's not clear which API to use. All suggestions/pointers to example code in lldb would be welcome!

Thank you.
Jason

In case it is unclear what dyldinfo does, here is an example: (but I only need the symbol names; not the addresses or segments or sections):

$ file /tmp/sample_bundle
/tmp/sample_bundle: Mach-O 64-bit bundle x86_64

$dyldinfo -lazy_bind -export /tmp/sample_bundle

lazy binding information (from lazy_bind part of dyld info):
segment section address index dylib symbol
__DATA __la_symbol_ptr 0x00001030 0x0000 flat-namespace __mm_pop_chunk
__DATA __la_symbol_ptr 0x00001038 0x0015 flat-namespace _dh_define
export information (from trie):
0x000008A0 _C_ipair
0x00000920 _init_ipair
0x00000BC0 _C_iprot
0x00000C40 _C_ipi2
0x00000CC0 _C_ipi1
0x00001040 _K_ipair_R43808f40
0x00001160 _K_ipi1_R5cb4475d
0x00001260 _K_ipi2_R5cb4475d
0x00001360 _K_iprot_Rfc8fe739
0x00001460 _majver_ipair
0x00001464 _minver_ipair

On Mon, Apr 2, 2012 at 3:13 PM, Greg Clayton gclayton@apple.com wrote:

Yes you can do this with LLDB. If you load a binary and dump its symbol table, you will see the information you want. For symbols that are lazily bound, you can look for "Trampoline" symbols:

cd lldb/test/lang/objc/foundation
make
lldb a.out
(lldb) target modules dump symtab a.out
Symtab, file = .../lldb/test/lang/objc/foundation/a.out, num_symbols = 54:
              Debug symbol
              |Synthetic symbol
              ||Externally Visible
              |||
Index   UserID DSX Type         File Address/Value Load Address       Size               Flags      Name
------- ------ --- ------------ ------------------ ------------------ ------------------ ---------- ----------------------------------
[    0]      0 D   SourceFile   0x0000000000000000                    Sibling -> [   15] 0x00640000 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/main.m
[    1]      2 D   ObjectFile   0x000000004f79f1ca                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/main.o
[    2]      4 D   Code         0x00000001000010f0                    0x00000000000000c0 0x000e0000 -[MyString initWithNSString:]
[    3]      8 D   Code         0x00000001000011b0                    0x0000000000000090 0x000e0000 -[MyString dealloc]
[    4]     12 D   Code         0x0000000100001240                    0x00000000000000a0 0x000e0000 -[MyString description]
[    5]     16 D   Code         0x00000001000012e0                    0x0000000000000020 0x000e0000 -[MyString descriptionPauses]
[    6]     20 D   Code         0x0000000100001300                    0x0000000000000030 0x000e0000 -[MyString setDescriptionPauses:]
[    7]     24 D   Code         0x0000000100001330                    0x0000000000000030 0x000e0000 -[MyString str_property]
[    8]     28 D   Code         0x0000000100001360                    0x0000000000000050 0x000e0000 -[MyString setStr_property:]
[    9]     32 D   Code         0x00000001000013b0                    0x0000000000000040 0x000f0000 Test_Selector
[   10]     36 D   Code         0x00000001000013f0                    0x0000000000000130 0x000f0000 Test_NSString
[   11]     40 D   Code         0x0000000100001520                    0x0000000000000120 0x000f0000 Test_MyString
[   12]     44 D   Code         0x0000000100001640                    0x00000000000001b0 0x000f0000 Test_NSArray
[   13]     48 D   Code         0x00000001000017f0                    0x00000000000000e1 0x000f0000 main
[   14]     56 D X Data         0x0000000100002680                    0x0000000000000000 0x00200000 my_global_str
[   15]     58 D   SourceFile   0x0000000000000000                    Sibling -> [   19] 0x00640000 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/my-base.m
[   16]     60 D   ObjectFile   0x000000004f79f1ca                    0x0000000000000000 0x00660001 /Volumes/work/gclayton/Documents/src/lldb/test/lang/objc/foundation/my-base.o
[   17]     62 D   Code         0x00000001000018e0                    0x0000000000000020 0x000e0000 -[MyBase propertyMovesThings]
[   18]     66 D   Code         0x0000000100001900                    0x000000000000001f 0x000e0000 -[MyBase setPropertyMovesThings:]
[   19]     82     Data         0x0000000100002000                    0x0000000000000460 0x000e0000 pvars
[   20]     83     ObjCIVar     0x0000000100002518                    0x0000000000000148 0x001e0000 MyBase.propertyMovesThings
[   21]     84   X Data         0x0000000100002660                    0x0000000000000008 0x000f0000 NXArgc
[   22]     85   X Data         0x0000000100002668                    0x0000000000000008 0x000f0000 NXArgv
[   23]     86   X ObjCClass    0x00000001000024d8                    0x0000000000000028 0x000f0000 MyBase
[   24]     87   X ObjCClass    0x0000000100002460                    0x0000000000000028 0x000f0000 MyString
[   25]     88   X ObjCIVar     0x0000000100002510                    0x0000000000000008 0x000f0000 MyString._desc_pauses
[   26]     89   X ObjCIVar     0x0000000100002508                    0x0000000000000008 0x000f0000 MyString.date
[   27]     90   X ObjCIVar     0x0000000100002500                    0x0000000000000008 0x000f0000 MyString.str
[   28]     91   X ObjCMetaClass 0x00000001000024b0                    0x0000000000000028 0x000f0000 MyBase
[   29]     92   X ObjCMetaClass 0x0000000100002488                    0x0000000000000028 0x000f0000 MyString
[   30]     97   X Data         0x0000000100002678                    0x0000000000000008 0x000f0000 __progname
[   31]     98   X Data         0x0000000100000000                    0x00000000000010b0 0x000f0010 _mh_execute_header
[   32]     99   X Data         0x0000000100002670                    0x0000000000000008 0x000f0000 environ
[   33]    101   X Data         0x0000000100002680                    0x0000000000000000 0x000f0000 my_global_str
[   34]    102   X Code         0x00000001000010b0                    0x0000000000000040 0x000f0000 start
[   35]    103     Trampoline   0x0000000100001938                    0x0000000000000006 0x00010200 NSLog
[   36]    104   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSArray
[   37]    105   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010200 OBJC_CLASS_$_NSAutoreleasePool
[   38]    106   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSDate
[   39]    107   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_CLASS_$_NSObject
[   40]    108   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010200 OBJC_CLASS_$_NSString
[   41]    109   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 OBJC_METACLASS_$_NSObject
[   42]    110   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010400 __CFConstantStringClassReference
[   43]    111   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 _objc_empty_cache
[   44]    112   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 _objc_empty_vtable
[   45]    113     Trampoline   0x000000010000193e                    0x0000000000000006 0x00010300 exit
[   46]    114     Trampoline   0x0000000100001920                    0x0000000000000006 0x00010100 objc_getProperty
[   47]    115     Trampoline   0x0000000100001926                    0x0000000000000006 0x00010100 objc_msgSend
[   48]    116     Trampoline   0x000000010000192c                    0x0000000000000006 0x00010100 objc_msgSendSuper2
[   49]    117   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010100 objc_msgSend_fixup
[   50]    118     Trampoline   0x0000000100001932                    0x0000000000000006 0x00010100 objc_setProperty
[   51]    119     Trampoline   0x0000000100001944                    0x0000000000000006 0x00010300 printf
[   52]    120     Trampoline   0x000000010000194a                    0x0000000000000006 0x00010300 usleep
[   53]    121   X Undefined    0x0000000000000000                    0x0000000000000000 0x00010300 dyld_stub_binder
(lldb)


All lazily bound symbols will have type Trampoline:

[   45]    113     Trampoline   0x000000010000193e                    0x0000000000000006 0x00010300 exit
[   46]    114     Trampoline   0x0000000100001920                    0x0000000000000006 0x00010100 objc_getProperty
[   47]    115     Trampoline   0x0000000100001926                    0x0000000000000006 0x00010100 objc_msgSend
[   48]    116     Trampoline   0x000000010000192c                    0x0000000000000006 0x00010100 objc_msgSendSuper2
[   50]    118     Trampoline   0x0000000100001932                    0x0000000000000006 0x00010100 objc_setProperty
[   51]    119     Trampoline   0x0000000100001944                    0x0000000000000006 0x00010300 printf
[   52]    120     Trampoline   0x000000010000194a                    0x0000000000000006 0x00010300 usleep

The other symbols that are external are marked with an "X" (which is a boolean flag on each symbol).

The symbols can be accessed via the SBModule:

   size_t
   SBModule::GetNumSymbols ();

   lldb::SBSymbol
   SBModule::GetSymbolAtIndex (size_t idx);

And then you can get the symbol type from each SBSymbol:

   SymbolType
   SBSymbol::GetType ();


I just added the ability to see if a symbol is externally visible:
% svn commit
Sending        include/lldb/API/SBSymbol.h
Sending        scripts/Python/interface/SBSymbol.i
Sending        source/API/SBSymbol.cpp
Transmitting file data ...
Committed revision 153893.


   bool
   SBSymbol::IsExternal();


So your flow should be:

SBDebugger::Initialize();
SBDebugger debugger(SBDebugger::Create());
SBTarget target (debugger.CreateTarget (const char *filename,
                                       const char *target_triple,
                                       const char *platform_name,
                                       bool add_dependent_modules,
                                       lldb::SBError& error));

SBFileSpec exe_file_spec (filename);
SBModule exe_module (target.FindModule(exe_file_spec));
if (exe_module.IsValid())
{
   const size_t num_symbols = exe_module. GetNumSymbols();
   for (size_t i=0; i<num_symbols; ++i)
   {
       SBSymbol symbol (exe_module. GetSymbolAtIndex(i));
       if (symbol.IsExternal())
       {
       }

       if (symbol.GetType() == lldb::eSymbolTypeTrampoline)
       {
       }
   }
}




On Mon, Apr 2, 2012 at 4:05 PM, Greg Clayton gclayton@apple.com wrote:

A quick clarification on the args to CreateTarget:

"filename" is just a full path to the local object file you want to observe. "target_triple" is your <arch>-<vendor>-<os>, or something like "x86_64-apple-darwin" or "i386-pc-linux" and can be NULL if the file is only a single architecture. "platform_name" can be NULL. "add_dependent_modules" should be false, since you are only interested in seeing the one object file itself. An SBError instance should be created and passed in.

Stefan Karpinski · Answer 10 · Thu Mar 21 2013 01:39:46 GMT+0800 (China Standard Time)

Dev-list discussion here: https://groups.google.com/forum/?fromgroups=#!topic/julia-dev/snnGKJul4vg.

Elliot Saba · Answer 11 · Wed Jul 10 2013 13:25:53 GMT+0800 (China Standard Time)

This is dredging up an amazingly old thread, but I realized I am possibly flirting around the edges of this problem with my codespeed benchmarking code. I repeatedly Core.include() files that contain two functions, listTests() and runTests(). I then simply call listTests() and runTests(), thereby redefining and then calling them every time I load a new benchmark file. Because I'm not redefining 2nd-tier functions I don't think I'll run into the problems listed here, but is redefining things like this a bad way to go about loading in multiple files with the same "API"?

Stefan Karpinski · Answer 12 · Wed Jul 10 2013 13:30:34 GMT+0800 (China Standard Time)

Might be better to include them in a module repeatedly, but I realize that bumps up against the anonymous module problem that's also been raised on the dev list. In this case, it may be fine, however, since you can just define the same module multiple times and ignore the warning that emits.

Jeff Bezanson · Answer 13 · Wed Jul 10 2013 13:33:29 GMT+0800 (China Standard Time)

Perhaps an old thread, but as relevant as ever.
Maybe another approach is to use evalfile and have the file "return" a tuple of functions. What's the anonymous module problem?

Stefan Karpinski · Answer 14 · Wed Jul 10 2013 17:37:39 GMT+0800 (China Standard Time)

I was referring to this: #3661.

Jonathan Malmaud · Answer 15 · Tue Jan 07 2014 02:33:15 GMT+0800 (China Standard Time)

Anyone have additional thoughts on implementation strategies? I was thinking of coding up a simple scheme that maintains an inverted call graph which is updated whenever a new method is created (based on the unoptimized AST). It then walks that graph when a method is redefined, removing the compiled version of each of the redefined function's descendants.
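
The scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: CALLERS, COMPILED, record_call! and invalidate! are hypothetical names, not Julia internals, and a real implementation would work on methods and compiled specializations rather than bare symbols.

```julia
# Hypothetical bookkeeping, not Julia's internal data structures.
const CALLERS  = Dict{Symbol,Set{Symbol}}()   # callee => direct callers (inverted call graph)
const COMPILED = Set{Symbol}()                # functions that currently have compiled code

# Record that `caller` calls `callee` (done whenever a new method is created).
function record_call!(caller::Symbol, callee::Symbol)
    push!(get!(Set{Symbol}, CALLERS, callee), caller)
    return nothing
end

# Discard compiled code for `name` and for everything that transitively calls it.
function invalidate!(name::Symbol, seen = Set{Symbol}())
    name in seen && return seen        # guard against cycles such as recursion
    push!(seen, name)
    delete!(COMPILED, name)
    for caller in get(CALLERS, name, Set{Symbol}())
        invalidate!(caller, seen)
    end
    return seen
end

record_call!(:d, :d2)                  # d calls d2
record_call!(:main, :d)                # main calls d
union!(COMPILED, [:d2, :d, :main, :unrelated])

dropped = invalidate!(:d2)             # d2 was redefined
```

After the last line, dropped contains d2, d and main, while unrelated keeps its compiled code. In the inverted graph, the "descendants" of a redefined function are its direct and indirect callers.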

Jeff Bezanson · Answer 16 · Thu Jan 09 2014 15:02:02 GMT+0800 (China Standard Time)

That's as reasonable an approach as any to try. If you're feeling motivated to do it, I'd say go ahead and let's see what happens!

Tim Holy · Answer 17 · Wed Jan 15 2014 18:53:14 GMT+0800 (China Standard Time)

What happens if execution of the callee triggers recompilation of the caller?

I ask because I've belatedly realized that resolving this issue may have interesting consequences that go beyond the user experience in the REPL: it might allow the ultimate answer to a metaprogramming problem, that of creating efficient "staged" functions. As background for those who may not know: some functions are currently implemented by metaprogramming, in particular those for which some aspect of the algorithm depends on the types of the inputs in a nontrivial way---the canonical example is one in which the number of loops in the function body is equal to the dimensionality of an array. The way we usually handle this is to define the function explicitly for, e.g., dimensions 1 and 2, and then have a wrapper which looks something like this:

_method_cache = Dict()
function myfunction(A::AbstractArray)
    N = ndims(A)
    if !haskey(_method_cache, N)
        func = eval(<an expression that generates the function definition for N dimensions>)
        _method_cache[N] = func
    else
        func = _method_cache[N]
    end
    return func(A)
end

So the first time myfunction gets executed for a 4-dimensional array, it first defines a version specific for 4 dimensions, adds it to _method_cache, and then evaluates the new function on the input. On future calls to myfunction with a 4-dimensional array, it just retrieves the definition from the dictionary and evaluates it. _method_cache is a kind of "shadow method table" in parallel to Julia's own internal method table, one that is used just for this particular function. (To keep it private, it's usually encapsulated by a let.)

While the approach in this example works well for function bodies that take some time to execute, it's not very well-suited for functions that execute in less time than required for a dictionary-lookup, and it's especially bad for functions that you want to be able to inline.
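
For readers who want something runnable, the dictionary-cache pattern can be made concrete along these lines. This is a sketch, not the code any package actually uses: loopsum_expr is a made-up generator that builds an N-loop summation, and the call goes through Base.invokelatest on the assumption of a Julia version with the world-age mechanism (0.6 and later), where a function created by eval is newer than the call that is already running.

```julia
# Hypothetical generator: an expression for an anonymous function that sums an
# N-dimensional array using N nested loops.
function loopsum_expr(N::Int)
    idx  = [Symbol(:i, d) for d in 1:N]
    body = :(s += A[$(idx...)])
    for d in 1:N
        # Wrap the body in one more loop; i1 ends up innermost.
        body = Expr(:for, Expr(:(=), idx[d], :(axes(A, $d))), Expr(:block, body))
    end
    return :(A -> begin
        s = zero(eltype(A))
        $body
        s
    end)
end

const _method_cache = Dict{Int,Function}()

function myfunction(A::AbstractArray)
    N = ndims(A)
    # Generate and cache the N-dimensional version on first use.
    func = get!(() -> eval(loopsum_expr(N)), _method_cache, N)
    # The generated function may be newer than this running call,
    # so it is reached through invokelatest.
    return Base.invokelatest(func, A)
end

total = myfunction(reshape(1:24, 2, 3, 4))
```

The dictionary lookup and the dynamic call are the per-call overhead criticized above.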

A better way to do this might be the following:

function myfunction(A::AbstractArray)
    bodyexpr = <an expression for the body of the function specific for N dimensions>
    f = @eval begin
        function myfunction(A::$(typeof(A)))
            $bodyexpr
        end
    end
    return f(A)
end

Here, execution of myfunction generates a more specific version of myfunction for these particular input types. Code that is compiled after this new definition becomes available will use this new version when applicable; the version above becomes the generic "fallback" for those cases where you haven't already defined something more specific. Consequently, this would be a way of creating staged functions that exploits Julia's own internal method-table mechanisms, and would therefore allow one to generate efficient code.

Currently, this doesn't work for one crucial reason: whatever just called myfunction has already been compiled, so it doesn't know about the new definition. Consequently, this particular caller will always generate a fresh new version of myfunction using eval, which will be god-awful slow.

So the trick would be to recompile the caller, but notice that this needs to occur while it is in the middle of execution. What's going to happen?

See also #5395.

Simon Kornblith · Answer 18 · Sat May 03 2014 00:00:53 GMT+0800 (China Standard Time)

There is a related problem with abstract types that does not actually involve method redefinition but needs similar treatment. Consider:

abstract A
immutable B <: A; end
immutable C <: A; end

g(x::Vector{A}) = f(x[1])

f(::B) = 1
g(A[B()])

f(::C) = 0.5
g(A[C()])

The last line gives 4602678819172646912. We need to throw out the code for g because the type inference for f(::A) is no longer valid.

Jeff Bezanson · Answer 19 · Sat May 03 2014 00:07:47 GMT+0800 (China Standard Time)

Yes, that is quite clear. We know replacing methods is only one special case. But it always involves new method definitions.

Cristóvão Duarte Sousa · Answer 20 · Fri Sep 05 2014 20:22:02 GMT+0800 (China Standard Time)

It seems that the current situation is somewhat ambiguous, since

f() = x()
x() = 1
println(f())
x() = 2
println(f())

gives

1
1

while

g() = y()
precompile(g, ())
y() = 1
println(g())
y() = 2
println(g())

gives

1
2

It seems that the latter case can possibly be used as a workaround to emulate caller recompilation (I know it's not really recompilation).

omer4d · Answer 21 · Sun Sep 14 2014 00:21:24 GMT+0800 (China Standard Time)

I have a feeling that I'm about to say something stupid, but couldn't this problem be eliminated with a single level of indirection? Instead of hard-coding a function address, look it up from a fixed location and call that. Wouldn't this have similar performance to C++ or Java virtual functions? You could then annotate functions to have a static or dynamic address, and get a warning/error when you try to redefine a function with a static address. There could even be a switch to set the default function behavior. I am new to the language and completely unfamiliar with the code base, but if this sounds feasible, I suppose I could give it a jab.
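
A minimal sketch of the single-level-of-indirection idea, with made-up names (SLOT, caller): the caller never binds to a fixed target but reads the current one from a fixed location on every call. This only illustrates the lookup itself and says nothing about how Julia dispatches calls internally.

```julia
# The "fixed location": a mutable slot holding the current implementation.
const SLOT = Ref{Function}(() -> 1)

# The caller looks up whatever is in the slot each time it runs.
caller() = SLOT[]()

first_result = caller()

SLOT[] = () -> 2          # "redefine" the function by swapping the slot contents
second_result = caller()  # the caller picks up the new target without recompilation
```

Because the slot is typed as an abstract Function, the compiler cannot know what the call returns or inline it, which is the kind of cost raised in the replies that follow.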

Elliot Saba · Answer 22 · Sun Sep 14 2014 02:10:45 GMT+0800 (China Standard Time)

@omer4d one problem with that idea is that Julia does inline a lot of smaller functions, so we still need a way to look up all "dependents" on a certain function.

omer4d · Answer 23 · Sun Sep 14 2014 04:11:09 GMT+0800 (China Standard Time)

It could inline only the ones with static addresses, like C++ does. This shouldn't cause any inconvenience.
People who don't care about interactive development won't be affected at all, since the default behavior would be to use static addresses for all functions.
People who don't care about performance but want interactive development can use a switch to make function addresses dynamic by default.
People who want both interactivity and performance could just annotate the relevant functions, which shouldn't be a lot of work, since inlining is only relevant for short low-level functions that get called a lot.
Perhaps the default behavior could even be set per module.

Jeff Bezanson · Answer 24 · Sun Sep 14 2014 05:58:18 GMT+0800 (China Standard Time)

I believe there are several reasons that approach is not viable. First, annotations really are a nuisance, especially when they are needed to hack around implementation details like this. If annotations are used to solve 2 or 3 problems, they start to pile up significantly. Second, there is no good way to pick the right annotation for a function. There is no connection between whether you want something inlined, and whether you are developing it interactively. Third, type deductions in other functions might need to change, so it's not just call sites. In fact we are currently deriving less type information than we potentially could to make things safe in the face of this issue.

omer4d · Answer 25 · Sun Sep 14 2014 08:16:26 GMT+0800 (China Standard Time)

Well, as someone who does most of his work in languages with dozens of qualifier keywords and repeated manual recompilation, having to occasionally qualify a function or two and recompile after a few cycles of interactive development doesn't sound so bad, but that's just me. I don't know enough to address the third point, though, so I'll step down. I think it is unfortunate that such a solution isn't viable, since recompilation means that if the root of a modified function's call tree contains a game or animation loop, it will need to be terminated and restarted. =(

Jameson Nash · Answer 26 · Wed Sep 17 2014 11:08:02 GMT+0800 (China Standard Time)

@JeffBezanson I would probably be most interested in the easy case, at least at first. I don't care as much whether running code gets updated, as opposed to things accessed through eval (e.g. entered at the REPL prompt). Eventually, it will also be important to e.g. get the right version of display called when redefined. But I believe that should rapidly get us closer to having incremental compiles.

Yichao Yu · Answer 27 · Sun Jan 04 2015 10:56:50 GMT+0800 (China Standard Time)

I was a little bit surprised to learn of this problem, not because I think it is easy to handle, but because the following works as expected.

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(args...) = f(args...)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
-1

Edit: Actually this looks like a special case for Vararg input ..... Does this mean that a vararg function is recompiled every time it is called? Or is it smart enough to only recompile when necessary? And is it possible to just treat other functions in the same way?

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(a, b) = f(a, b)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
3

Yichao Yu · Answer 28 · Sun Jan 04 2015 11:01:53 GMT+0800 (China Standard Time)

Actually, the function returns the right result but @code_typed returns the wrong result.....

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(args...) = f(args...)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> @code_typed g(1, 2)
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(Int64,Int64),0]],Any[]], :(begin  # none, line 1:
        return (top(box))(Int64,(top(add_int))((top(tupleref))(args::(Int64,Int64),1)::Int64,(top(tupleref))(args::(Int64,Int64),2)::Int64))
    end::Int64))))

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
-1

julia> @code_typed g(1, 2)
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(Int64,Int64),0]],Any[]], :(begin  # none, line 1:
        return (top(box))(Int64,(top(add_int))((top(tupleref))(args::(Int64,Int64),1)::Int64,(top(tupleref))(args::(Int64,Int64),2)::Int64))
    end::Int64))))

And

julia> f(a, b) = a + b
f (generic function with 1 method)

julia> g(args...) = f(args...)
g (generic function with 1 method)

julia> g(1, 2)
3

julia> f(a::Int, b::Int) = a - b
f (generic function with 2 methods)

julia> g(1, 2)
-1

julia> @code_typed g(1, 2)
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:(args::(top(apply_type))(Vararg,Any)::Any::Any)], Any[Any[],Any[Any[:args,(Int64,Int64),0]],Any[]], :(begin  # none, line 1:
        return (top(box))(Int64,(top(sub_int))((top(tupleref))(args::(Int64,Int64),1)::Int64,(top(tupleref))(args::(Int64,Int64),2)::Int64))
    end::Int64))))

Aaron Matthis · Answer 29 · Fri Apr 01 2016 19:15:03 GMT+0800 (China Standard Time)

Would it work to tag the methods of a function (or only the function types/objects themselves) with a number that changes every time anything is assigned? A compiled caller could then store that number (e.g. as a field, using the new structure of functions) and check whether it still matches the "current" number on the function that was inlined or called.
(without a big performance penalty of course)
And in case of change consider to compile the new version or complete by explicit call.
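
The version-number proposal can be sketched as follows, assuming current Julia syntax. Everything here (VERSION_OF, CompiledCaller, redefine!, call!) is hypothetical bookkeeping invented for the illustration; the point is only to show where the check sits.

```julia
# Hypothetical bookkeeping; none of these names exist in Julia itself.
const VERSION_OF = Dict{Symbol,Int}()   # current version number of each function

mutable struct CompiledCaller
    callee::Symbol
    seen_version::Int    # callee version when this caller was compiled
    recompiles::Int      # how often the caller had to be refreshed
end

# Bump the version whenever a function is (re)defined.
function redefine!(name::Symbol)
    VERSION_OF[name] = get(VERSION_OF, name, 0) + 1
    return VERSION_OF[name]
end

compile_caller(callee::Symbol) = CompiledCaller(callee, VERSION_OF[callee], 0)

# Every single call pays for this comparison.
function call!(c::CompiledCaller)
    current = VERSION_OF[c.callee]
    if c.seen_version != current
        c.seen_version = current     # stand-in for "recompile the caller"
        c.recompiles += 1
    end
    return c
end

redefine!(:f)                # first definition of f
g = compile_caller(:f)
call!(g)                     # versions match, nothing to do
redefine!(:f)                # f is redefined
call!(g)                     # mismatch detected, caller refreshed
call!(g)                     # versions match again
```

Only the first call after a redefinition triggers a refresh, but the comparison itself runs on every call.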

Stefan Karpinski · Answer 30 · Fri Apr 01 2016 23:55:07 GMT+0800 (China Standard Time)

That would move the performance impact to runtime (and it would be everywhere), which is not desirable. What's needed is keeping track of the web of dependencies and recompiling anything that might be affected when a method is partially redefined. It's probably doable, it's just a massive pain.

Mike J Innes · Answer 31 · Sat Apr 02 2016 00:05:59 GMT+0800 (China Standard Time)

I don't know if the solution to this issue would easily generalise to macros, but if so that'd be a nice to have.

Aaron Matthis · Answer 32 · Sat Apr 02 2016 00:28:55 GMT+0800 (China Standard Time)

So one rather should go the other way around and store all methods that inlined the function and on change recompile them. (of course checking for current execution etc)

Oscar Blumberg · Answer 33 · Sat Apr 02 2016 00:53:18 GMT+0800 (China Standard Time)

The hard problems here are :

code invalidation as you say, find out all the places that use old functions (in generated code). That probably implies being able to re-relocate the code that calls functions that changed but for which the calling convention & return type did not change
harder : what happens if another task is running while you define a new method ? what state does it see ? You want to solve that without on-stack replacement because that is very hard to implement efficiently. We think we have a solution to that which does not kill performance and I think @vtjnash will come up with a little write up once it's in decent shape

Aaron Matthis · Answer 34 · Tue May 17 2016 00:39:17 GMT+0800 (China Standard Time)

While a solution to the problem is being worked on, I'm curious whether it would be possible, and a good idea, to add a flag to every function that is set if (and only if) the function was used in any of the ways above (inlining etc.), so that a warning can be produced when an inlined method gets redefined (similar to the way the ambiguity warning works).
If there is no better way to store the flag, then again maybe the new structure of methods/functions in Julia 0.5 could be used.

Valentin Churavy · Answer 35 · Wed Aug 24 2016 23:19:37 GMT+0800 (China Standard Time)

It is now even easier to hit this in the REPL, and this behaviour should maybe be documented for map?

julia> function f(x)
         1
       end 
f (generic function with 1 method)

julia> map(f, 1:10)
10-element Array{Int64,1}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1

julia> function f(x)
         2
       end
WARNING: Method definition f(Any) in module Main at REPL[9]:2 overwritten at REPL[11]:2.
f (generic function with 1 method)

julia> map(f, 1:10)
10-element Array{Int64,1}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1

Jameson Nash · Answer 36 · Thu Aug 25 2016 00:15:18 GMT+0800 (China Standard Time)

That's still just vanilla #265. I think you'll appreciate that this'll be fixed soon in v0.6-dev :)
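
For comparison, here is a minimal sketch of the same experiment as a top-level script, assuming a Julia release that includes the fix (0.6 or later according to this thread). The variable names `before` and `after` are only for illustration.

```julia
f(x) = 1
before = map(f, 1:3)   # uses the first definition

f(x) = 2               # overwrite the only method of f
after = map(f, 1:3)    # with the fix, map is recompiled against the new f
```

On such a release `before` holds ones and `after` holds twos, because each top-level statement runs against the latest method definitions.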

datnamer · Answer 37 · Thu Aug 25 2016 00:52:03 GMT+0800 (China Standard Time)

@vtjnash is there any runtime performance or memory tradeoff?

Jameson Nash · Answer 38 · Thu Aug 25 2016 06:35:35 GMT+0800 (China Standard Time)

It requires a bit of memory (140 MB -> 170 MB), but shouldn't have much effect on performance (compile or runtime). And I haven't really attempted much optimization yet.

The demos so far are fun:

julia> f() = 1
f (generic function with 1 method)

julia> function g(x)
    @eval f() = $x # this is the correct way to write `global f() = x` (which should be a syntax error, but isn't currently)
    return @eval(f)() # use `eval` to hide the value from optimization / inlining, but the call is not inside eval
end
g (generic function with 1 method)

julia> g(2)
WARNING: Method definition f() in module Main at REPL[1]:1 overwritten at REPL[2]:2.
1

julia> g(3)
WARNING: Method definition f() in module Main at REPL[2]:2 overwritten at REPL[2]:2.
2

julia> g(4)
WARNING: Method definition f() in module Main at REPL[2]:2 overwritten at REPL[2]:2.
3

Tony Kelman · Answer 39 · Thu Aug 25 2016 13:47:57 GMT+0800 (China Standard Time)

Is the reason that doesn't return 2, 3, 4 due to the order in which compilation of g vs execution that redefines f happen?

Jameson Nash · Answer 40 · Thu Aug 25 2016 14:15:09 GMT+0800 (China Standard Time)

That's roughly right. Note that the language semantics don't define compilation, so it's more correct to say that it is due to the order in which g is interpreted vs. the time at which the redefinition of f becomes visible to the interpreter.
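
The visibility rule can be sketched with `Base.invokelatest`, which exists in released versions of Julia with this fix. The names `h` and `redefine_and_call` are hypothetical and chosen only for this illustration; the example assumes Julia 1.x semantics.

```julia
h() = 1

function redefine_and_call(x)
    @eval h() = $x                  # define a new method while this function is running
    direct = h()                    # ordinary call: still sees the definitions from call time
    latest = Base.invokelatest(h)   # explicitly asks for the newest definition
    return (direct, latest)
end

result = redefine_and_call(2)
```

The ordinary call returns the old value because the running function keeps the set of definitions that was visible when it was entered, while `Base.invokelatest` looks the method up again.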

Jameson Nash · Answer 41 · Sat Aug 27 2016 02:02:20 GMT+0800 (China Standard Time)

Alright, here's another fun little demo: redefining the + primitive to keep count of how much it gets used:

julia> add_ctr = UInt(0)
0x0000000000000000

julia> Base.:+(a::Int, b::Int) = (global add_ctr += 1; Core.Intrinsics.add_int(a, b))

julia> add_ctr
0x0000000000000016

julia> last = 0;

julia> println(Int(add_ctr - last)); last = add_ctr;
287

julia> println(Int(add_ctr - last)); last = add_ctr;
17

Tim Holy · Answer 42 · Sat Aug 27 2016 02:41:02 GMT+0800 (China Standard Time)

That one's cheating: I don't think + gets used by Base at all, so there wasn't anything to recompile. Redefine a real function, like svd, that you know must get called at least 100 times before the REPL shows up. Then we'll be impressed.

Elliot Saba · Answer 43 · Sat Aug 27 2016 02:46:51 GMT+0800 (China Standard Time)

The opportunities for live re-instrumentation of code for profiling/tracing are blowing me away.

Tim Holy · Answer 44 · Sat Aug 27 2016 02:49:54 GMT+0800 (China Standard Time)

Indeed. And here I was about to buy a new keyboard, because the sequence Ctrl-D; julia<Enter> is just about worn out. Looks like I might not have to.

Ed Schmerling · Answer 45 · Mon Aug 29 2016 15:55:29 GMT+0800 (China Standard Time)

It might be a bit arcane to put in the release notes, but should it be noted somewhere that comprehensions are now syntax for collect-ing a Generator (as opposed to something maybe at the parsing level in 0.4? -- @which doesn't work there)? This example, similar to vchuravy's above, is a bit of a gotcha even considering that each function is now a type in 0.5:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.5.0-rc3+0 (2016-08-22 23:43 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-linux-gnu

julia> f(x) = 1
f (generic function with 1 method)

julia> [f(x) for x in 1:5]
5-element Array{Int64,1}:
 1
 1
 1
 1
 1

julia> f(x) = 2
WARNING: Method definition f(Any) in module Main at REPL[1]:1 overwritten at REPL[3]:1.
f (generic function with 1 method)

julia> [f(x) for x in 1:5]
5-element Array{Int64,1}:
 1
 1
 1
 1
 1

julia> @which [f(x) for x in 1:5]
collect(itr::Base.Generator) at array.jl:295
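
The relationship between a comprehension and `collect` on a `Base.Generator` can be checked directly. This is a small sketch; the function name `k` is arbitrary and not taken from the example above.

```julia
k(x) = x + 1

gen = (k(x) for x in 1:5)                    # generator expression
is_generator = gen isa Base.Generator        # the object that collect receives
same = [k(x) for x in 1:5] == collect(gen)   # comprehension and collect agree
```

Because the function is wrapped inside the generator object, the call happens inside `collect`, which is why the stale result showed up there on 0.5.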

Tony Kelman · Answer 46 · Wed Sep 14 2016 22:11:23 GMT+0800 (China Standard Time)

not quite, right?

useful for fix #265

Eric Davies · Answer 47 · Wed Sep 14 2016 22:12:12 GMT+0800 (China Standard Time)

Is this really fixed 😲

Aaron Matthis · Answer 48 · Wed Sep 28 2016 02:39:06 GMT+0800 (China Standard Time)

Is this solution still planned to be backported to 0.5.x?

Tim Holy · Answer 49 · Wed Sep 28 2016 18:00:28 GMT+0800 (China Standard Time)

@rapus95, this is a very invasive change, and getting it wrong could destabilize julia for a lot of users. Since the whole point of releases is to provide more stability than following master, it's far better to have it in 0.6. (You can always follow master if you want it earlier than the release of 0.6.)

Twan Koolen · Answer 50 · Fri Sep 30 2016 10:05:47 GMT+0800 (China Standard Time)

I'm so happy to see progress on this issue! It would be amazing if this also allowed the compiler to optimize the case described in https://groups.google.com/forum/#!topic/julia-users/OBs0fmNmjCU.

Jeff Bezanson · Answer 51 · Fri Dec 23 2016 13:26:33 GMT+0800 (China Standard Time)

Wow. What a great moment!

Ronan Arraes Jardim Chagas · Answer 52 · Tue Jan 10 2017 05:09:01 GMT+0800 (China Standard Time)

Hi guys!

Is there any possibility of backporting this fix to the 0.5 series? Or does it only work in what will become 0.6?

Simon Kornblith · Answer 53 · Tue Jan 10 2017 05:09:52 GMT+0800 (China Standard Time)

This is a breaking change and will only be available in 0.6

Tim Holy · Answer 54 · Tue Jan 10 2017 18:12:14 GMT+0800 (China Standard Time)

It's hard to overstate how awesome it is to have this fixed. Old habits die hard so I sometimes restart and/or rebuild julia unnecessarily, but when I remember it's a complete game-changer for fixing problems.

Chris Elrod · Answer 55 · Wed Feb 03 2021 22:35:34 GMT+0800 (China Standard Time)

I'm still getting this on Julia 1.5 and master on occasion, where code_typed changes but code_llvm and observed behavior do not. The behavior seems to be dependent on package-load order.
Here, I am redefining a couple functions from VectorizationBase. Some other functions in VectorizationBase depend on these, and are then called in VectorizedRNG. I get the correct behavior if I load both VectorizationBase and VectorizedRNG before redefining:

julia> using VectorizationBase, VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
CodeInfo(
1 ─ %1 = Base.getfield(rng, :ptr)::Ptr{UInt64}
│   %2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}
│   %3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 514)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1785"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1 ─ %1 = Base.getfield(rng, :ptr)::Ptr{UInt64}
│   %2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}
│   %3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1790"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

We start with a 514, and then after the redefinitions, the offset is 130 (the offset equals 2 + 8 * number of integers it can process efficiently with SIMD instructions).
But now, we'll redefine the functions before loading VectorizedRNG:

julia> using VectorizationBase

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizationBase, VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1 ─ %1 = Base.getfield(rng, :ptr)::Ptr{UInt64}
│   %2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}
│   %3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1660"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

@code_typed is correct in the sense that it updated. It shows the correct new constant, 130.
@code_llvm is correct in the sense that this is the actual behavior the function shows. It shows the old constant, 514, which is still being used.

Matt Bauman · Answer 56 · Wed Feb 03 2021 23:18:17 GMT+0800 (China Standard Time)

Might be related: https://github.com/chriselrod/VectorizationBase.jl/search?q=pure

@pure is the mechanism that opts out of this fix. Does it still occur without those annotations?

Chris Elrod · Answer 57 · Wed Feb 03 2021 23:22:51 GMT+0800 (China Standard Time)

Oops, might be my fault! Thanks -- I'll try taking them out.

Without @pure Julia won't optimize them, but LLVM shouldn't have a problem.

Chris Elrod · Answer 58 · Wed Feb 03 2021 23:32:58 GMT+0800 (China Standard Time)

Unfortunately not:

(@v1.7) pkg> st VectorizationBase
      Status `~/.julia/environments/v1.7/Project.toml`
  [3d5dd08c] VectorizationBase v0.18.3 `~/.julia/dev/VectorizationBase`

julia> using VectorizationBase

julia> run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`) # all instances of `pure` are commented out
/home/chriselrod/.julia/dev/VectorizationBase/src/VectorizationBase.jl:14:# Base.@pure asvalbool(r) = Val(map(Bool, r))
/home/chriselrod/.julia/dev/VectorizationBase/src/VectorizationBase.jl:15:# Base.@pure asvalint(r) = Val(map(Int, r))
/home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/vbroadcast.jl:54:        # $(Expr(:meta,:pure,:inline))
Process(`grep -nr pure /home/chriselrod/.julia/dev/VectorizationBase/src`, ProcessExited(0))

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG
[ Info: Precompiling VectorizedRNG [33b4df10-0173-11e9-2a0c-851a7edac40e]

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1 ─ %1 = Base.getfield(rng, :ptr)::Ptr{UInt64}
│   %2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}
│   %3 = VectorizationBase.llvmcall(("    \n\n    define void @entry(i64, i8, i64) alwaysinline {\n    top:\n        %ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void\n    }\n", "entry"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8}, UInt8, Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1414"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

EDIT: I just pushed the commit to remove all the pures to master, to remove that as a possible cause in case anyone wants to try the example.

EDIT: Although ArrayInterface.jl is also using it at some locations. I'll try taking those out too.

EDIT: Also removing them from ArrayInterface solved the problem.

Chris Elrod · Answer 59 · Thu Feb 04 2021 02:56:17 GMT+0800 (China Standard Time)

On 1.5, removing all @pure fixed it:

julia> using VectorizationBase

julia> !success(run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`, wait=false))
true

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1 ─ %1 = Base.getfield(rng, :ptr)::Ptr{UInt64}
│   %2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}
│   %3 = VectorizationBase.llvmcall(("", "%ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8},UInt8,Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130

;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1692"([1 x i64]* nocapture nonnull readonly dereferenceable(8), i8) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:392 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> using ArrayInterface

julia> !success(run(`grep -nr "pure" $(dirname(pathof(ArrayInterface)))`, wait=false))
true

julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake-avx512)
Environment:
  JULIA_NUM_THREADS = auto

But on master:

julia> using VectorizationBase

julia> !success(run(`grep -nr "pure" $(dirname(pathof(VectorizationBase)))`, wait=false))
true

julia> VectorizationBase.has_feature(::Val{:x86_64_avx512f}) = VectorizationBase.False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG

julia> @code_typed VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
CodeInfo(
1 ─ %1 = Base.getfield(rng, :ptr)::Ptr{UInt64}
│   %2 = Base.bitcast(Ptr{UInt8}, %1)::Ptr{UInt8}
│   %3 = VectorizationBase.llvmcall(("    \n\n    define void @entry(i64, i8, i64) alwaysinline {\n    top:\n        %ptr.0 = inttoptr i64 %0 to i8*\n%ptr.1 = getelementptr inbounds i8, i8* %ptr.0, i64 %2\nstore i8 %1, i8* %ptr.1, align 1\nret void\n    }\n", "entry"), VectorizationBase.Cvoid, Tuple{Ptr{UInt8}, UInt8, Int64}, %2, v, 130)::Nothing
└──      return %3
) => Nothing

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1161"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> using ArrayInterface

julia> !success(run(`grep -nr "pure" $(dirname(pathof(ArrayInterface)))`, wait=false))
true

julia> versioninfo()
Julia Version 1.7.0-DEV.421
Commit 22858a0d29* (2021-02-01 19:04 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake-avx512)
Environment:
  JULIA_NUM_THREADS = auto

EDIT: No longer works on 1.5.

Jameson Nash · Answer 60 · Thu Feb 04 2021 04:26:19 GMT+0800 (China Standard Time)

Note that @generated also may opt out of the fix, and it is similarly your responsibility to avoid writing them in such a way that the effect is likely to be visible to your users.

Chris Elrod · Answer 61 · Thu Feb 04 2021 04:28:05 GMT+0800 (China Standard Time)

Could that explain why @code_typed is updated, but @code_llvm and the actual behavior are not?
I've been working on a minimal example, but haven't gotten a reproducer yet.

While there is a lot of @generated, I think that part of the code mostly relies on dispatch.

Chris Elrod · Answer 62 · Thu Feb 04 2021 04:48:27 GMT+0800 (China Standard Time)

Also, redefining after the module has been loaded does trigger recompilation:

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 514
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1161"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 514
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

julia> VectorizationBase.has_feature(Val{:x86_64_avx2}())
False()

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> @code_llvm VectorizedRNG.setrand64counter!(local_rng(), 0x01) # we have a constant offset of 130
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97 within `setrand64counter!'
define void @"julia_setrand64counter!_1287"([1 x i64]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i8 zeroext %1) {
top:
;  @ /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98 within `setrand64counter!'
; ┌ @ essentials.jl:402 within `unsafe_convert'
; │┌ @ pointer.jl:30 within `convert'
    %2 = bitcast [1 x i64]* %0 to i8**
    %3 = load i8*, i8** %2, align 8
; └└
; ┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:782 within `vstore!' @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638
; │┌ @ /home/chriselrod/.julia/dev/VectorizationBase/src/llvm_intrin/memory_addr.jl:638 within `macro expansion'
    %ptr.1.i = getelementptr inbounds i8, i8* %3, i64 130
    store i8 %1, i8* %ptr.1.i, align 1
; └└
  ret void
}

The code that creates the bodies of the generated functions shouldn't depend on any of the methods being redefined:

julia> VectorizedRNG.getoffset()
514

julia> @code_typed VectorizedRNG.getoffset()
CodeInfo(
1 ─     return 130
) => Int64

julia> VectorizedRNG.getoffset()
514

julia> @code_warntype VectorizedRNG.getoffset()
MethodInstance for VectorizedRNG.getoffset()
  from getoffset() in VectorizedRNG at /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:97
Arguments
  #self#::Core.Const(VectorizedRNG.getoffset)
Body::Int64
1 ─ %1 = VectorizedRNG.simd_integer_register_size()::Core.Const(static(16))
│   %2 = (4 * %1)::Core.Const(64)
│   %3 = (%2 * 2)::Core.Const(128)
│   %4 = (%3 + 2)::Core.Const(130)
└──      return %4


julia> VectorizedRNG.simd_integer_register_size()
static(16)

Downstream of simd_integer_register_size, all we have is multiplication.
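
The arithmetic in the `@code_warntype` output above can be replayed by hand. This is a sketch only: `offset_for` is a hypothetical stand-in, not the package's `getoffset`, and the input of 64 is an assumption inferred from the stale result of 514.

```julia
# Hypothetical stand-in mirroring the steps %2, %3, %4 in the typed code above.
offset_for(register_size) = (4 * register_size) * 2 + 2

new_offset = offset_for(16)   # register size reported after the redefinition
old_offset = offset_for(64)   # assumed register size before the redefinition
```

With 16 the result matches the constant that inference reports, and with 64 it matches the value the compiled function still returns.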

By checking out the master branches, there is this simpler reproducer:

julia> using VectorizationBase

julia> VectorizationBase.has_feature(::Val{:x86_64_avx2}) = VectorizationBase.False()

julia> using VectorizedRNG
[ Info: Precompiling VectorizedRNG [33b4df10-0173-11e9-2a0c-851a7edac40e]

julia> VectorizedRNG.getoffset()
514

julia> @code_warntype VectorizedRNG.getoffset()
MethodInstance for VectorizedRNG.getoffset()
  from getoffset() in VectorizedRNG at /home/chriselrod/.julia/dev/VectorizedRNG/src/xoshiro.jl:98
Arguments
  #self#::Core.Const(VectorizedRNG.getoffset)
Body::Int64
1 ─ %1 = VectorizedRNG.sirs()::Core.Const(static(16))
│   %2 = (4 * %1)::Core.Const(64)
│   %3 = (%2 * 2)::Core.Const(128)
│   %4 = (%3 + 2)::Core.Const(130)
└──      return %4


julia> VectorizedRNG.sirs()
static(16)

julia> VectorizedRNG.getoffset()
514

sirs is updated above. But the downstream multiplication by integers is not.


In case it is unclear what dyldinfo does, here is an example: (but I only need the symbol names; not the addresses or segments or sections):