XENON1T / pax

The XENON1T raw data processor [deprecated]

binom_test MemoryError

pdeperio opened this issue

Discovered by @lucrlom processing 170316_0210 (event number?):

Traceback (most recent call last):
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/bin/cax", line 9, in <module>
    load_entry_point('cax==5.0.12', 'console_scripts', 'cax')()
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/cax-5.0.12-py3.4.egg/cax/main.py", line 142, in main
    task.go(args.run)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/cax-5.0.12-py3.4.egg/cax/task.py", line 65, in go
    self.each_run()
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/cax-5.0.12-py3.4.egg/cax/tasks/process.py", line 217, in each_run
    ncpus)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/cax-5.0.12-py3.4.egg/cax/tasks/process.py", line 104, in _process
    core.Processor(**pax_kwargs).run()
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/pax-6.6.2-py3.4.egg/pax/core.py", line 315, in run
    self.process_event(event)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/pax-6.6.2-py3.4.egg/pax/core.py", line 276, in process_event
    event = plugin.process_event(event)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/pax-6.6.2-py3.4.egg/pax/plugin.py", line 91, in process_event
    event = self._process_event(event)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/pax-6.6.2-py3.4.egg/pax/plugin.py", line 108, in _process_event
    return self.transform_event(event)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/pax-6.6.2-py3.4.egg/pax/plugins/interaction_processing/S1AreaFractionTopProbability.py", line 34, in transform_event
    ia.s1_area_fraction_top_probability = binom_test(size_top, size_tot, aft)
  File "/project/lgrandi/anaconda3/envs/pax_v6.6.2/lib/python3.4/site-packages/scipy/stats/morestats.py", line 2050, in binom_test
    i = np.arange(np.floor(p*n) + 1)
MemoryError

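Probably a very large S1 is overloading the combinatorial calculation: the last frame of the traceback shows `binom_test` building an `np.arange` array whose length grows with p*n. See http://stackoverflow.com/questions/3056179/binomial-test-in-python-for-very-large-numbers and https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation for possible solutions.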
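For reference, here is a minimal sketch of the normal-approximation route mentioned in the Wikipedia link. The function name `binom_test_normal_approx` is made up for illustration and is not part of pax or scipy. It needs constant memory regardless of n, but it is only an approximation: scipy's exact two-sided test sums the probabilities of all outcomes no more likely than the observed one, so the two will differ somewhat, especially for small n or p close to 0 or 1.

```python
import math


def binom_test_normal_approx(k, n, p):
    """Approximate two-sided binomial p-value via the normal approximation.

    k: observed successes (here: area in the top array),
    n: number of trials (here: total area),
    p: expected success probability (here: expected area fraction top).
    Hypothetical helper, not the pax or scipy implementation.
    """
    if n <= 0 or not 0 < p < 1:
        raise ValueError("need n > 0 and 0 < p < 1")
    mean = n * p
    sigma = math.sqrt(n * p * (1 - p))
    # Continuity correction: shrink the deviation by 0.5, but never below 0.
    z = max(abs(k - mean) - 0.5, 0.0) / sigma
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(z)).
    return math.erfc(z / math.sqrt(2))
```

Something like `binom_test_normal_approx(size_top, size_tot, aft)` would then return immediately even for a size_tot of 1e9 or more, since no array is allocated.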

I suspect the p-value isn't very good at high energies: eventually the statistical error becomes so small that any remaining systematic (e.g. a small error in the map) will dominate. We could consider skipping the test for very high-energy peaks or, probably better, clipping the n in this test to some large number.
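One possible reading of "clipping the n" is to keep the observed top fraction but cap the number of trials, so the statistical error cannot shrink below what the cap allows. A rough sketch under that assumption follows; `clip_binomial_inputs` and `n_max` are hypothetical names, and the default of 1e4 is just a placeholder value, not a tuned number.

```python
def clip_binomial_inputs(size_top, size_tot, n_max=1e4):
    """Rescale (size_top, size_tot) so the trial count never exceeds n_max.

    The observed fraction size_top / size_tot is preserved; only the
    effective sample size is reduced. Hypothetical helper, not pax code.
    """
    if size_tot <= n_max:
        # Small peaks pass through unchanged.
        return size_top, size_tot
    fraction_top = size_top / size_tot
    return fraction_top * n_max, n_max
```

The clipped pair could then be passed to `binom_test` in place of the raw `size_top` and `size_tot`, which bounds the size of the array it allocates.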

I still need to look into this, but my guess is that once we start hitting massive saturation the calculation is inaccurate. I'm also not entirely sure above what energy the p-value stops being useful. We could artificially cap it at ~1e4 pe or something.

ok. can you implement a fix soon @darrylmasson? i think this may be killing all our MC jobs currently.

From what the code seems to say, it's doing some fairly memory-inefficient operations that break down for a large S1. The p-value isn't particularly useful for alpha studies (or higher energies), so until I get around to implementing all the floating-point operation overloads for this, I'll cap it at 1e4 pe. There are ways to rewrite these so they don't chew up all the memory (such as using the normal approximation at high energies), which I'll work on implementing.
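To put a rough number on the memory use: the failing line in the traceback is `np.arange(np.floor(p*n) + 1)`, which with a float argument yields a float64 array of floor(p*n) + 1 elements at 8 bytes each. The sketch below only estimates that one allocation (the helper name `arange_nbytes` is made up, and p = 0.5 is an arbitrary illustrative value), so it should be read as a lower bound on what `binom_test` needs.

```python
import math


def arange_nbytes(n, p, itemsize=8):
    """Bytes needed for an array of floor(p*n) + 1 elements.

    itemsize=8 assumes float64, which is what np.arange returns for a
    float argument. Hypothetical helper for back-of-the-envelope estimates.
    """
    return (math.floor(p * n) + 1) * itemsize


if __name__ == "__main__":
    # Illustrative total areas in pe; p = 0.5 is an arbitrary choice.
    for size_tot in (1e4, 1e9, 1e12):
        gib = arange_nbytes(size_tot, 0.5) / 2**30
        print("n = {:.0e} pe -> about {:.3g} GiB".format(size_tot, gib))
```

At the proposed 1e4 pe cap the array is a few tens of kilobytes, so the cap keeps this allocation negligible.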

Closed in #558