New to Inform - Have a few questions

Question

New to Inform - Have a few questions

aleksejs-fomins opened this issue 5 years ago · comments

Dear Developers,

I currently actively use a code from your colleagues called JIDT/IDTxl, I'm sure you are familiar with it. Currently my code works well, but has prohibitively high computation time and memory footprint (even after I have parallelized it to multiple cores). I am looking for potential alternatives, and your code seems to be implemented in C, which gives me hope that it might be a good alternative for my tasks.

I am interested in computing Entropy, Mutual Information and Transfer Entropy for 3D matrices of [Channel x Time x Trial], where Trial stands for repetitions of the same experiment.

I have read through your documentation and some part of the source code, and still have unanswered questions. Would you be so kind to answer, or direct me to the correct place to ask the questions:

Is it currently possible to use real data? All examples seem to be using integer time-series
In the source code I have only found histogram-based estimators. Are there currently other estimators available (such as Kraskov)? Is the histogram estimator bias-corrected?
What exactly does block-entropy for k>1 do? Does it split time-steps into subsets of length k, or does is sweep the timesteps with a window of length k?
I am not able to figure out from the documentation what is an initial condition. Could you explain this concept or direct me to literature? Is this the same as what I call Trial? In that case, is it possible to, for example, find mutual information between two variables, for which only one time step, but many trials are given?
Transfer Entropy operates with lags. Questions of interest are "what is TE for X->Y at lag=7" or "what is TE for X->Y given all lags={1,2,3,4,5}". Can a lag parameter be provided? What is the current convention?
JIDT provides multivatiate TE estimators, which allow (to some extent) to eliminate spurious connections such as those due to common ancestry and intermediate link. Is such functionality present or foreseen in the nearest future?
For TE and MI, another super valuable measure is test against zero. Currently, JIDT performs such tests and returns p-values along with the estimates, allowing the user to estimate if there is at all a relationship between variables above chance? Is such functionality implemented or intended?

In principle, I would be interested in contributing, if I can achieve my goals with your toolbox given a few weeks of coding.

Best,
Aleksejs

mwibral · Answer 1 · Tue Dec 03 2019 23:21:35 GMT+0800 (China Standard Time)

Dear Aleksejs, let me give some (partial) answers to your questions: Is it currently possible to use real data? Yes, in IDTxl just use the estimators have in Kraskov in their names (this should also answer the question for the Kraskov estimators. Also, most funtionality of JIDT should be accessible from IDTxl. Then long run-times come from two reasons: most likely you are using a multivariate TE approach. These approaches come with unavoidable internal statistical testing against surrogate data, to recursively include the source candidates (see Novelli,..,Lizier, Network Neuroscience, 2019). This process scales as n³s log(s) where n are the source candidates (sources x lags you want to consider), and where s are the samples accumulated across all trials. In our lab, computing multivariate TE for a single target in a network of 7 nodes with 5-6 lags each and with 100 trials of 500ms at 1.2 kHz will take 5-7 days (depending how deep the recursive inclusion algorithms find additional sources) on a four core latest generation intel processor (scaling to more cores is definitely sub-linear but not too bad). This is when using 200 statistical surrogate data for the inclusion and pruning runs and 500 surrogate datasets for final testing. To get the whole network for that dataset you'll have to multiply this by the numbers of nodes (i.e. 7 here). So yes, this is painfully slow - due to surrogate based statistics, using the Kraskov-estimator, the scaling of the multivariate problem (the full problem is actually NP-hard). Giving up on any of these setting, nodes, etc., will make things orders of magnitude faster, but may or may not be acceptable in your case. Your best option is to reduce the number of nodes or past lags under consideration (n³) or the samples (s log(s)). Best, Michael

…

________________________________ From: Aleksejs Fomins <notifications@github.com> Sent: Tuesday, December 3, 2019 3:34 PM To: ELIFE-ASU/Inform Cc: Subscribed Subject: [ELIFE-ASU/Inform] New to Inform - Have a few questions (#84) Dear Developers, I currently actively use a code from your colleagues called JIDT/IDTxl, I'm sure you are familiar with it. Currently my code works well, but has prohibitively high computation time and memory footprint (even after I have parallelized it to multiple cores). I am looking for potential alternatives, and your code seems to be implemented in C, which gives me hope that it might be a good alternative for my tasks. I am interested in computing Entropy, Mutual Information and Transfer Entropy for 3D matrices of [Channel x Time x Trial], where Trial stands for repetitions of the same experiment. I have read through your documentation and some part of the source code, and still have unanswered questions. Would you be so kind to answer, or direct me to the correct place to ask the questions: * Is it currently possible to use real data? All examples seem to be using integer time-series * In the source code I have only found histogram-based estimators. Are there currently other estimators available (such as Kraskov)? Is the histogram estimator bias-corrected? * What exactly does block-entropy for k>1 do? Does it split time-steps into subsets of length k, or does is sweep the timesteps with a window of length k? * I am not able to figure out from the documentation what is an initial condition. Could you explain this concept or direct me to literature? Is this the same as what I call Trial? In that case, is it possible to, for example, find mutual information between two variables, for which only one time step, but many trials are given? * Transfer Entropy operates with lags. Questions of interest are "what is TE for X->Y at lag=7" or "what is TE for X->Y given all lags={1,2,3,4,5}". Can a lag parameter be provided? What is the current convention? * JIDT provides multivatiate TE estimators, which allow (to some extent) to eliminate spurious connections such as those due to common ancestry and intermediate link. Is such functionality present or foreseen in the nearest future? * For TE and MI, another super valuable measure is test against zero. Currently, JIDT performs such tests and returns p-values along with the estimates, allowing the user to estimate if there is at all a relationship between variables above chance? Is such functionality implemented or intended? In principle, I would be interested in contributing, if I can achieve my goals with your toolbox given a few weeks of coding. Best, Aleksejs — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub<#84?email_source=notifications&email_token=ACFJQGRIOHRCYBDLNPMN2Q3QWZU7FA5CNFSM4JUZASPKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4H5WDUVA>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ACFJQGTF2FMONY343ARFHB3QWZU7FANCNFSM4JUZASPA>.

Aleksejs Fomins · Answer 2 · Wed Dec 04 2019 00:16:05 GMT+0800 (China Standard Time)

Dear Michael,

Thank you very much for your answer. Would you be so kind to elaborate on a few of them?

First a bit of details:
We are trying to compute dynamic changes in functional connectivity (FC) by sweeping the dataset with a rather small time-window, and relying on extracting the FC from many repetitions of the same experiment. A single dataset is approx (Nodes, Samples, Trials) = (12, 200, 400), which we sweep with a window of 4 timesteps, resulting in 197 runs of network analysis for the shape (12, 4, 400). Naturally, we also parallelize over targets. My main problem is not even the speed, but also the memory use. A single use of analyse_single_target for Kraskov estimator can exceed 4Gb of ram for the data shape above, which prevents me from efficiently running the parallelized code on the cluster. I have already contacted Joseph and Patricia about it, we will try to sort it out.

Questions then:

Is it possible to accelerate the testing procedure by extracting the surrogates only once from the entire dataset, and then testing each window with the same set of surrogates? Or, for example, if the same experiment has been performed on multiple days, extracting the surrogates from all days simultaneously? I understand that, theoretically, the most rigorous approach is to assume that the probability distribution and hence the testing threshold is different for every timestep and every day. But this way, there is also fewest points to estimate that probability distribution and that threshold. I am wondering if there is an alternative way of making a compromise somewhere there?
I do not doubt that the developers of JIDT have spent a lot of time in optimizing it. But, if I recall correctly, the use of Java over C was originally motivated by a compromise between speed and portability. Do you believe that there is merit in accelerating entropy estimation by moving its implementation to a lower level language such as C? It seems to me that portability is not much of a issue since installing a Python or R wrapper for C code that does not use any crazy pre-requisite libraries seems to be a single line command.

Aleksejs Fomins · Answer 3 · Wed Dec 04 2019 00:26:58 GMT+0800 (China Standard Time)

Ah, yes, while I am here, another tiny question. I went through the IDTxl and JIDT documentation, and mostly saw estimators for MI and TE. Can I use JIDT/IDTxl to just estimate (conditional) entropy of my data? For example, a question I want to address is the information capacity of each channel excluding its immediate past. Given that we are using calcium imaging neuronal data at 20Hz, there is a certain concern that the signals may not change fast enough to provide information above noise. I would like to estimate the size of this effect

mwibral · Answer 4 · Wed Dec 04 2019 00:41:23 GMT+0800 (China Standard Time)

Hi Aleksejs, I wonder if it is a good idea to estimate (multivariate, or vene bivariate) TE from just 4*400=1600 samples for continuous valued data (see Patricia's paper on the nonstationary case for sample lengths that worked for us in very simple cases). But you would know your data better than I do. Memory consumption may be high due to the number of surrogate datasets that are being created, but 4GB for such a small dataset seems too much (?). Do you compute for each target separately and then fuse the results? Best, Michael Gesendet: Dienstag, 03. Dezember 2019 um 17:16 Uhr Von: "Aleksejs Fomins" <notifications@github.com> An: ELIFE-ASU/Inform <Inform@noreply.github.com> Cc: "Wibral, Michael" <michael.wibral@uni-goettingen.de>, Comment <comment@noreply.github.com> Betreff: Re: [ELIFE-ASU/Inform] New to Inform - Have a few questions (#84) Dear Michael, Thank you very much for your answer. Would you be so kind to elaborate on a few of them? First a bit of details: We are trying to compute dynamic changes in functional connectivity (FC) by sweeping the dataset with a rather small time-window, and relying on extracting the FC from many repetitions of the same experiment. A single dataset is approx (Nodes, Samples, Trials) = (12, 200, 400), which we sweep with a window of 4 timesteps, resulting in 197 runs of network analysis for the shape (12, 4, 400). Naturally, we also parallelize over targets. My main problem is not even the speed, but also the memory use. A single use of analyse_single_target for Kraskov estimator can exceed 4Gb of ram for the data shape above, which prevents me from efficiently running the parallelized code on the cluster. I have already contacted Joseph and Patricia about it, we will try to sort it out. Questions then: Is it possible to accelerate the testing procedure by extracting the surrogates only once from the entire dataset, and then testing each window with the same set of surrogates? Or, for example, if the same experiment has been performed on multiple days, extracting the surrogates from all days simultaneously? I understand that, theoretically, the most rigorous approach is to assume that the probability distribution and hence the testing threshold is different for every timestep and every day. But this way, there is also fewest points to estimate that probability distribution and that threshold. I am wondering if there is an alternative way of making a compromise somewhere there? I do not doubt that the developers of JIDT have spent a lot of time in optimizing it. But, if I recall correctly, the use of Java over C was originally motivated by a compromise between speed and portability. Do you believe that there is merit in accelerating entropy estimation by moving its implementation to a lower level language such as C? It seems to me that portability is not much of a issue since installing a Python or R wrapper for C code that does not use any crazy pre-requisite libraries seems to be a single line command. — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

Aleksejs Fomins · Answer 5 · Wed Dec 04 2019 01:31:11 GMT+0800 (China Standard Time)

Dear Michael,

I don't know either if 1600 datapoints is enough. From one side, more data is better. From the other, the macroscopic FC in mouse brain changes pretty rapidly it seems, so for large windows I get quite dense matrices. There is probably an optimum somewhere, so I am playing around with different window lengths.

Thanks for recommending the paper, I will have a look.

Yes, I compute single targets and fuse results. The shocking thing is that when I run analyse_single_target in a loop alongside with a memory profiler the memory consumption increases over time. I have done lots of benchmarking and can guarantee it is inside IDTxl. When I generate dummy results and comment out IDTxl, I only use 20Mb of memory for the same task. An even more weird thing is that it does not behave like a proper memory leak. It decreases over time. So the first run of analyse_single_target reserves most of the memory, the second a little less, the third even less. The total memory consumption approaches a certain maximum, after which it no longer grows. The memory does not get freed up if I close JVM, but it does get freed up if I close python. If I were to guess it looks like some sort of dynamic memory allocation which does anticipatory allocation based on current use. I am working on a minimal example right now, then I will send it around.

Douglas G. Moore · Answer 6 · Wed Dec 04 2019 03:21:22 GMT+0800 (China Standard Time)

Hey now, that's enough talk about JIDT and IDTxl in Inform's issues sections. 😜

@aleksejs-fomins, thanks for your interest in Inform. I'll try to answer your questions, but right out of the gate I need to say that JIDT and IDTxl are much more developed than Inform is. Inform has high hopes, but still has a long way to go. To my knowledge, Inform has no features in common with IDTxl; it's much closer to — though still a long way from — JIDT.

Currently my code works well, but has prohibitively high computation time and memory footprint (even after I have parallelized it to multiple cores). I am looking for potential alternatives, and your code seems to be implemented in C, which gives me hope that it might be a good alternative for my tasks.

It will largely depend on your particular use case. You might get a bit of a performance bump, but it probably won't be earth shattering — maybe a 1-3x speedup and slightly better memory overhead. We included a (very naive) benchmark comparison of Inform and JIDT in our conference paper. JIDT and Inform are pretty even footed until you get to longer history lengths (though you probably shouldn't be looking at history lengths that long anyway). As these things go, I find the only way to know for sure is to give it a try. My hunch is that Inform will be bit speedier and lighter on memory, but whether it's worth the effort of converting your code from Java to C is for you to decide. Inform is fairly low-level. It doesn't offer anything like network inference, as IDTxl does, and you'd have to manually implement any parallel processing.

On to your questions and comments!

I am interested in computing Entropy, Mutual Information and Transfer Entropy for 3D matrices of [Channel x Time x Trial], where Trial stands for repetitions of the same experiment.

We have all three implemented, though you'll need to be careful with the array formatting. Inform expects the time series to be a continuous array. I can elaborate on this if you'd like.

I have read through your documentation and some part of the source code, and still have unanswered questions.

Sorry the documentation is lacking. I'll update it to reflect these questions and answers.

Is it currently possible to use real data? All examples seem to be using integer time-series
In the source code I have only found histogram-based estimators. Are there currently other estimators available (such as Kraskov)? Is the histogram estimator bias-corrected?

Well, a lot of the "real" data we deal with in-house is discrete, so that's what Inform has focused on. We do not currently have continuously-valued estimators (though we plan to some day #21). We do have functions for binning data. Of course, binning isn't always appropriate. The histogram estimators are not biased-corrected, though I'm sure it'd be easy enough to implement bias correction.

What exactly does block-entropy for k>1 do? Does it split time-steps into subsets of length k, or does is sweep the timesteps with a window of length k?

It sweeps. I don't know of many use cases wherein you want to split the time series, but I think you could use the inform_black_box to accomplish that. That function seems to confuse people, but I think it's one of the nicest features of Inform. It turns out to be very useful for all kinds of things, e.g. you can use it together with any of the information measures to make the measure multivariate. Again, I'm happy to elaborate.

I am not able to figure out from the documentation what is an initial condition. Could you explain this concept or direct me to literature? Is this the same as what I call Trial? In that case, is it possible to, for example, find mutual information between two variables, for which only one time step, but many trials are given?

Yeah, I'm pretty sure we're using the same word for the same thing. Since there is no history length involved in mutual information, and because we estimate the distributions based on all trials (as JIDT does, I believe — @mwibral might know better than I do), I think you'd get the same answer regardless of how many trials you have. A better example would be something like transfer entropy:

int xs[6] = {0, 1,   // Trial 1
             1, 1,   // Trial 2
             0, 1};  // Trial 3
int ys[6] = {0, 1,
             1, 0,
             1, 1};
double te = inform_transfer_entropy(xs,     // source data
                                    ys,     // target data
                                    NULL,   // the 3rd argument is series against which to condition
                                    0,      // no conditions
                                    3,      // 3 trials
                                    2,      // 2 time steps per trial
                                    2,      // base-2 time series
                                    1,      // k = 1
                                    NULL);  // Ignore any errors because this is just an example and I'm lazy
assert(te == 0.6666666666666666);

Transfer Entropy operates with lags. Questions of interest are "what is TE for X->Y at lag=7" or "what is TE for X->Y given all lags={1,2,3,4,5}". Can a lag parameter be provided? What is the current convention?

Unfortunately, we don't have this implemented in Inform. It would be very simple to do, but we just haven't had the time. That said, if your time series only have one trial in them, then you can simply offset the arrays, something like

int xs[9] = {0, 0, 1, 0, 0, 1, 0, 0};
int ys[9] = {0, 0, 1, 1, 0, 1, 1, 0};

inform_transfer_entropy(xs, ys + 1, // this is the same as lagging xs by 1
                        NULL, 0, 1, 8, // your time series are now 1 element shorter
                        2, 1, NULL);

Of course, this is a bit unsatisfying. We've been toying with ideas for how to implement it. The issue we have is that C's not the nicest language for API design. The transfer entropy function takes 9 argument! That's a lot of information to keep straight. The direction I'm leaning is to create a function, similar to inform_black_box, which will introduce lags. You could then pass the resulting time series as arguments to the various information functions. I really like this kind of composable design and it helps us avoid monolithic and redundant implementation (though there's still plenty of that in Inform 😉 ).

JIDT provides multivatiate TE estimators, which allow (to some extent) to eliminate spurious connections such as those due to common ancestry and intermediate link. Is such functionality present or foreseen in the nearest future?

Well, you could mean one of two things here: multivariate in the sense of "what is the TE for (X × Y) →Z", or conditional in the sense of "what is the TE for X → Z conditioned on Y". Inform can do both, but the former requires inform_black_box. For the former, you can do something like...

int xs_ys[18] = {1, 0, 0, 1, 0, 1, 0, 0, 1,  // variable X
                 0, 1, 1, 0, 1, 1, 0, 0, 1}; // variable Y
int xs_times_ys[9];
inform_black_box(xs_ys,         // variables to black-box/coarse-grain/product
                 2,             // the number of variables
                 1,             // the number of trials per variable
                 9,             // the number of observations per trial
                 (int[2]){2,2}, // the base of each variable (both binary)
                 NULL,          // do not use a history length
                 NULL,          // do not use a future length
                 xs_times_ys,   // the array into which to place X × Y
                 NULL);         // Ignore any errors

// xs_times_ys == {2, 1, 1, 2, 1, 3, 0, 0, 3}

int zs[9] = {1, 0, 1, 0, 0, 1, 1, 0, 0};

// (X × Y) → Z
double te = inform_transfer_entropy(xs_times_ys, // source data
                                    zs,          // target data
                                    NULL,        // the 3rd argument is series against which to condition
                                    0,           // no conditions
                                    1,           // 1 trial per variable
                                    9,           // 9 time steps per trial
                                    4,           // base-4 timeseries (the largest base of the inputs)
                                    1,           // k = 1
                                    NULL);       // Ignore any errors because this is just an example and I'm lazy
printf("(X × Y) → Z: %.16lf\n", te);
assert(te == 0.9056390622295665);

// Z → (X × Y)
te = inform_transfer_entropy(zs, xs_times_ys, NULL, 0, 1, 9, 4, 1, NULL);
printf("Z → (X × Y): %.16lf\n", te);
assert(te == 0.5943609377704335);

Conditional transfer entropy (which we just fixed #78), is a little simpler:

int xs[9] = {1, 0, 0, 1, 0, 1, 0, 0, 1};
int ys[9] = {0, 1, 1, 0, 1, 1, 0, 0, 1};
int zs[9] = {1, 0, 1, 0, 0, 1, 1, 0, 0};

// X → Y | Z
double cte = inform_transfer_entropy(xs,    // source data
                                     ys,    // target data
                                     zs,    // condition data
                                     1,     // 1 condition
                                     1,     // 1 trial per variable
                                     9,     // 9 time steps per trial
                                     2,     // base-2 timeseries
                                     1,     // k = 1
                                     NULL); // Ignore any errors because this is just an example and I'm lazy
printf("X → Y | Z: %.16lf\n", cte);
assert(cte == 0.25);

// X → Z | Y
cte = inform_transfer_entropy(xs, zs, ys, 1, 1, 9, 2, 1, NULL);
printf("X → Z | Y: %.16lf\n", cte);
assert(cte == 0.25);

// Z → X | Y
cte = inform_transfer_entropy(zs, xs, ys, 1, 1, 9, 2, 1, NULL);
printf("Z → X | Y: %.16lf\n", cte);
assert(cte == 0.34436093777043353);

For TE and MI, another super valuable measure is test against zero. Currently, JIDT performs such tests and returns p-values along with the estimates, allowing the user to estimate if there is at all a relationship between variables above chance? Is such functionality implemented or intended?

This isn't currently implemented in Inform, but we have already done it in-house for a few different measures. It's actually pretty easy to do permutation tests, and will probably be one of the next features added.

Anyway, I hope this helps. Given that your data is real-valued, I suspect JIDT/IDTxl is the best fit for you — at least for now. That said, if you are still interested in Inform, we'd welcome any contributions you might have: code, comment and suggestions alike.

I personally like Inform, but I'm biased 😀.

mwibral · Answer 7 · Wed Dec 04 2019 06:30:54 GMT+0800 (China Standard Time)

Hi Douglas, sorry I completely overlooked that this was not from an IDTxl forum thread. Sorry for hijacking the thread and my best wishes for your inform project. Best, Michael Wibral

…

On 03.12.19 20:21, Douglas G. Moore wrote: Hey now, that's enough talk about JIDT and IDTxl in Inform's issues sections. 😜 @aleksejs-fomins <https://github.com/aleksejs-fomins>, thanks for your interest in Inform. I'll try to answer your questions, but right out of the gate I need to say that JIDT and IDTxl are much more developed than Inform is. Inform has high hopes, but still has a long way to go. To my knowledge, Inform has no features in common with IDTxl; it's much closer to — though still a long way from — JIDT. Currently my code works well, but has prohibitively high computation time and memory footprint (even after I have parallelized it to multiple cores). I am looking for potential alternatives, and your code seems to be implemented in C, which gives me hope that it might be a good alternative for my tasks. It will largely depend on your particular use case. You /might/ get a bit of a performance bump, but it probably won't be earth shattering — maybe a 1-3x speedup and slightly better memory overhead. We included a (very naive) benchmark comparison of Inform and JIDT in our conference paper <https://ieeexplore.ieee.org/abstract/document/8285197>. JIDT and Inform are pretty even footed until you get to longer history lengths (/though you probably shouldn't be looking at history lengths that long anyway/). As these things go, I find the only way to know for sure is to give it a try. My hunch is that Inform will be bit speedier and lighter on memory, but whether it's worth the effort of converting your code from Java to C is for you to decide. Inform is fairly low-level. It doesn't offer anything like network inference, as IDTxl does, and you'd have to manually implement any parallel processing. On to your questions and comments! I am interested in computing Entropy, Mutual Information and Transfer Entropy for 3D matrices of [Channel x Time x Trial], where Trial stands for repetitions of the same experiment. We have all three implemented, though you'll need to be careful with the array formatting. Inform expects the time series to be a continuous array. I can elaborate on this if you'd like. I have read through your documentation and some part of the source code, and still have unanswered questions. Sorry the documentation is lacking. I'll update it to reflect these questions and answers. * Is it currently possible to use real data? All examples seem to be using integer time-series In the source code I have only found histogram-based estimators. Are there currently other estimators available (such as Kraskov)? Is the histogram estimator bias-corrected? Well, a lot of the "real" data we deal with in-house is discrete, so that's what Inform has focused on. We /do not/ currently have continuously-valued estimators (though we plan to some day #21 <#21>). We do have functions for binning <https://elife-asu.github.io/Inform/#binning-time-series> data. Of course, binning isn't always appropriate. The histogram estimators /are not/ biased-corrected, though I'm sure it'd be easy enough to implement bias correction. * What exactly does block-entropy for k>1 do? Does it split time-steps into subsets of length k, or does is sweep the timesteps with a window of length k? It sweeps. I don't know of many use cases wherein you want to split the time series, but I think you could use the |inform_black_box| <https://elife-asu.github.io/Inform/#black-boxing-time-series> to accomplish that. /That function seems to confuse people, but I think it's one of the nicest features of Inform. It turns out to be very useful for all kinds of things, e.g. you can use it together with any of the information measures to make the measure multivariate./ Again, I'm happy to elaborate. * I am not able to figure out from the documentation what is an initial condition. Could you explain this concept or direct me to literature? Is this the same as what I call Trial? In that case, is it possible to, for example, find mutual information between two variables, for which only one time step, but many trials are given? Yeah, I'm pretty sure we're using the same word for the same thing. Since there is no history length involved in mutual information, and because we estimate the distributions based on all trials (as JIDT does, I believe — @mwibral <https://github.com/mwibral> might know better than I do), I think you'd get the same answer regardless of how many trials you have. A better example would be something like transfer entropy: int xs[6] = {0,1,// Trial 1 1,1,// Trial 2 0,1};// Trial 3 int ys[6] = {0,1, 1,0, 1,1}; double te = inform_transfer_entropy(xs,// source data ys,// target data NULL,// the 3rd argument is series against which to condition 0,// no conditions 3,// 3 trials 2,// 2 time steps per trial 2,// base-2 time series 1,// k = 1 NULL);// Ignore any errors because this is just an example and I'm lazy assert(te ==0.6666666666666666); * Transfer Entropy operates with lags. Questions of interest are "what is TE for X->Y at lag=7" or "what is TE for X->Y given all lags={1,2,3,4,5}". Can a lag parameter be provided? What is the current convention? Unfortunately, we don't have this implemented in Inform. It would be very simple to do, but we just haven't had the time. That said, if your time series only have one trial in them, then you can simply offset the arrays, something like int xs[9] = {0,0,1,0,0,1,0,0}; int ys[9] = {0,0,1,1,0,1,1,0}; inform_transfer_entropy(xs, ys +1,// this is the same as lagging xs by 1 NULL,0,1,8,// your time series are now 1 element shorter 2,1,NULL); Of course, this is a bit unsatisfying. We've been toying with ideas for how to implement it. The issue we have is that C's not the nicest language for API design. The transfer entropy function takes 9 argument! That's a lot of information to keep straight. The direction I'm leaning is to create a function, similar to |inform_black_box|, which will introduce lags. You could then pass the resulting time series as arguments to the various information functions. I really like this kind of composable design and it helps us avoid monolithic and redundant implementation (though there's still plenty of that in Inform 😉 ). * JIDT provides multivatiate TE estimators, which allow (to some extent) to eliminate spurious connections such as those due to common ancestry and intermediate link. Is such functionality present or foreseen in the nearest future? Well, you could mean one of two things here: multivariate in the sense of "what is the TE for (X × Y) →Z", or conditional in the sense of "what is the TE for X → Z conditioned on Y". Inform can do both, but the former requires |inform_black_box| <https://elife-asu.github.io/Inform/#black-boxing-time-series>. For the former, you can do something like... int xs_ys[18] = {1,0,0,1,0,1,0,0,1,// variable X 0,1,1,0,1,1,0,0,1};// variable Y int xs_times_ys[9]; inform_black_box(xs_ys,// variables to black-box/coarse-grain/product 2,// the number of variables 1,// the number of trials per variable 9,// the number of observations per trial (int[2]){2,2},// the base of each variable (both binary) NULL,// do not use a history length NULL,// do not use a future length xs_times_ys,// the array into which to place X × Y NULL);// Ignore any errors // xs_times_ys == {2, 1, 1, 2, 1, 3, 0, 0, 3} int zs[9] = {1,0,1,0,0,1,1,0,0}; // (X × Y) → Z double te = inform_transfer_entropy(xs_times_ys,// source data zs,// target data NULL,// the 3rd argument is series against which to condition 0,// no conditions 1,// 1 trial per variable 9,// 9 time steps per trial 4,// base-4 timeseries (the largest base of the inputs) 1,// k = 1 NULL);// Ignore any errors because this is just an example and I'm lazy printf("(X × Y) → Z: %.16lf\n", te); assert(te ==0.9056390622295665); // Z → (X × Y) te = inform_transfer_entropy(zs, xs_times_ys,NULL,0,1,9,4,1,NULL); printf("Z → (X × Y): %.16lf\n", te); assert(te ==0.5943609377704335); Conditional transfer entropy (which we just fixed #78 <#78>), is a little simpler: int xs[9] = {1,0,0,1,0,1,0,0,1}; int ys[9] = {0,1,1,0,1,1,0,0,1}; int zs[9] = {1,0,1,0,0,1,1,0,0}; // X → Y | Z double cte = inform_transfer_entropy(xs,// source data ys,// target data zs,// condition data 1,// 1 condition 1,// 1 trial per variable 9,// 9 time steps per trial 2,// base-2 timeseries 1,// k = 1 NULL);// Ignore any errors because this is just an example and I'm lazy printf("X → Y | Z: %.16lf\n", cte); assert(cte ==0.25); // X → Z | Y cte = inform_transfer_entropy(xs, zs, ys,1,1,9,2,1,NULL); printf("X → Z | Y: %.16lf\n", cte); assert(cte ==0.25); // Z → X | Y cte = inform_transfer_entropy(zs, xs, ys,1,1,9,2,1,NULL); printf("Z → X | Y: %.16lf\n", cte); assert(cte ==0.34436093777043353); * For TE and MI, another super valuable measure is test against zero. Currently, JIDT performs such tests and returns p-values along with the estimates, allowing the user to estimate if there is at all a relationship between variables above chance? Is such functionality implemented or intended? This isn't /currently/ implemented in Inform, but we have already done it in-house for a few different measures. It's actually pretty easy to do permutation tests, and will probably be one of the next features added. Anyway, I hope this helps. Given that your data is real-valued, I suspect JIDT/IDTxl is the best fit for you — at least for now. That said, if you are still interested in Inform, we'd welcome any contributions you might have: code, comment and suggestions alike. I personally like Inform, but I'm biased 😀. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#84?email_source=notifications&email_token=ACFJQGUDBSDBG4F3M4YRXITQW2WTHA5CNFSM4JUZASPKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEF2QJVI#issuecomment-561317077>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACFJQGVJJ4ROD2WWQD73T6TQW2WTHANCNFSM4JUZASPA>.

Douglas G. Moore · Answer 8 · Wed Dec 04 2019 07:35:09 GMT+0800 (China Standard Time)

@mwibral Ha. No worries. I was just giving you a hard time. The details might be different, but I think your comments are useful for Inform users too.

Aleksejs Fomins · Answer 9 · Wed Dec 04 2019 23:11:55 GMT+0800 (China Standard Time)

Dear Douglas,

Thanks a lot for your broad reply. I apologise for using the word "real data". I meant "real-valued", not "real-world" :D. I would in principle consider contributing some continuous -variable estimators. Not with the purpose of competing with JIDT/IDTxl, but more as an exercise for myself to understand how they really work.

Cheers,
Aleksejs