ComputationalRadiationPhysics / particle_reduction


Particle Patches: Fallback

ax3l opened this issue · comments

Hi @KseniaBastrakova,

just a minor issue discovered by @NeilZaim: "particle patches" are an optional feature in openPMD:

SCALAR = openpmd_api.Mesh_Record_Component.SCALAR
num_particles = particle_species.particle_patches["numParticles"][SCALAR].load()
series_hdf.flush()

Although PIConGPU always writes them, these patches might not exist in other data sets such as the examples here:
https://github.com/openPMD/openPMD-example-datasets/
This currently leads to a segfault when using the scripts with those examples.

Can we please add the following simple fallback logic? If a particle species does not declare particle patches, just chunk the particle read by a fixed number of particles, e.g. 1e7 particles at a time.
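Roughly, a minimal sketch of what such a fallback could look like, assuming the openPMD-api Python bindings; the helper name read_in_chunks and the CHUNK_SIZE constant are just placeholders, not existing code in this repo:

CHUNK_SIZE = int(1e7)  # particles per read, as suggested above

def read_in_chunks(series, component):
    """Yield the data of one openpmd_api.Record_Component in fixed-size chunks."""
    total = component.shape[0]       # number of particles in this component
    offset = 0
    while offset < total:
        extent = min(CHUNK_SIZE, total - offset)
        chunk = component.load_chunk([offset], [extent])
        series.flush()               # fills `chunk` with the requested data
        yield chunk
        offset += extent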

Thanks for pointing this out, I will try to fix it now.

Fixed and tested on the openPMD example datasets. It now seems to work correctly.

Thanks a lot, this is great!
@NeilZaim does this work for you, too? :)

Thanks a lot! It goes further in the code but I still observe a couple of issues.

I'm trying to run the Voronoi algorithm with the following command:
python reduction_main.py -algorithm=voronoi -hdf=../simData_0008000.h5 -hdf_re=../reduced.h5 -momentum_tol=0.1 -momentum_pos=0.1 (these are random values for the tolerances)

The first issue I encounter is that I think the dimensions attribute of the algorithm is never set, which, for the Voronoi algorithm, results in a failure of the check_needs_subdivision function.

Since I'm testing the code with data from a 2D simulation, I guess that I can temporarily fix this by adding algorithm.dimensions = Dimensions(2,3) more or less anywhere in the code, right?

Now if I do this and try the above command line, I obtain the following error:

Traceback (most recent call last):
  File "reduction_main.py", line 891, in <module>
    base_reduction_function(args.hdf, args.hdf_re, "voronoi", parameters)
  File "reduction_main.py", line 133, in base_reduction_function
    process_iteration_group(algorithm, current_iteration, series_hdf, series_hdf_reduction, reduction_iteration)
  File "reduction_main.py", line 160, in process_iteration_group
    process_patches_in_group_v2(iteration.particles[name_group], series_hdf,
  File "reduction_main.py", line 737, in process_patches_in_group_v2
    write_draft_copy(reduced_weight, relative_result,
  File "reduction_main.py", line 537, in write_draft_copy
    particle_species["weighting_copy"][SCALAR][previos_idx:current_idx] = reduced_weight
TypeError: __setitem__(): incompatible function arguments. The following argument types are supported:
    1. (self: openpmd_api.Record_Component, tuple of index slices: tuple, array with values to assign: array) -> None
    2. (self: openpmd_api.Record_Component, slice: slice, array with values to assign: array) -> None
    3. (self: openpmd_api.Record_Component, axis index: int_, array with values to assign: array) -> None

Invoked with: <openPMD.Record_Component of dimensionality '1'>, slice(0, 1, None), [9120000000.0]
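From the signatures above, it looks like the bindings expect a NumPy array on the right-hand side, whereas the invoked value [9120000000.0] is a plain Python list. Purely as an illustration (I don't know whether this is actually what happens inside write_draft_copy), converting before the assignment would match signature 2:

import numpy as np

# illustrative only: the indices and the species object come from write_draft_copy
reduced_weight = np.asarray([9120000000.0], dtype=np.float64)  # shape (1,)
# particle_species["weighting_copy"][SCALAR][previos_idx:current_idx] = reduced_weight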

Do you know where this is coming from? I can run more tests if needed.

Thanks again!

@NeilZaim thanks, I reproduced it.
Working on a fix.

@NeilZaim now it seems to work. I also added some information about the algorithms to the command arguments description.

Thanks a lot @KseniaBastrakova !

I am now able to produce h5 output that I can read (tested with the Voronoi algorithm) so I'll start toying around and see what I can get :) .

I've encountered a few issues, listed below:

  • On the reduced dataset, the mass and charge still have the shape of the old dataset (if I had 1000 particles before and 10 particles now, reading the mass on the reduced dataset still returns a vector of size 1000, at least with openPMD-viewer). This causes openPMD-viewer to crash when reading the momentum during the renormalization step. It is, however, easy to work around by removing the momentum normalization in openPMD-viewer and doing it manually afterwards.

  • I still encounter some random segmentation faults (core dumped) during some openpmd-api calls. More specifically, this can occur at the line reduction_series.flush() at the very end of copy_meshes in reduction_main.py. If I comment out copy_meshes (which is fine for me, since I only need particle data in the reduced dataset, not grid data), I can still get a segfault during the call series_hdf_reduction.flush() on line 536 of reduction_main.py (in the function write_draft_copy). These segfaults do not occur every time, so I guess they are harder to debug, but I can provide more info if needed. When there is no segfault, the code runs to the end.

  • I think that we can remove line 895 in reduction_main.py (base_reduction_function(args.hdf, args.hdf_re, "voronoi", parameters)). The base_reduction_function is already called in the function voronoi_algorithm so this makes the code run twice.

Thanks again!

@NeilZaim Thank you very much for the detailed description of the errors. Working on reproducing and fixing them.

@NeilZaim, I fixed the first issue and probably the second one (I found a possible problem that could cause it), so now it seems to work for all algorithms.

And I assume the third one was fixed with PR by @NeilZaim ?

And I assume the third one was fixed with PR by @NeilZaim ?

yes, I merged it already

Thanks a lot! I've just given it a try but I get the following error message

Traceback (most recent call last):
  File "reduction_main.py", line 746, in <module>
    voronoi_algorithm(args.hdf, args.hdf_re, args.momentum_tol, args.position_lol)
  File "reduction_main.py", line 708, in voronoi_algorithm
    base_reduction_function(hdf_file_name, hdf_file_reduction_name, "voronoi", parameters)
  File "reduction_main.py", line 203, in base_reduction_function
    process_iteration_group(algorithm, current_iteration, series_hdf, series_hdf_reduction, reduction_iteration)
  File "reduction_main.py", line 222, in process_iteration_group
    process_patches_in_group_v2(iteration.particles[name_group], series_hdf,
  File "reduction_main.py", line 654, in process_patches_in_group_v2
    write_draft_copy(relative_data, dict_data_indexes, series_hdf_reduction, particle_species_reduction,
  File "reduction_main.py", line 463, in write_draft_copy
    write_record_copy(current_record, values, series_hdf_reduction, previos_idx, current_idx)
  File "reduction_main.py", line 435, in write_record_copy
    current_reduced_data = values_to_write[:, pos_vector_in_reduction_data].astype(current_type)
IndexError: index 1 is out of bounds for axis 1 with size 1

I don't have time to look at it in more detail today, but if you want I can run more tests / provide more details tomorrow.

@NeilZaim Do you use 1D coordinates? I think the problem is my improper handling of this kind of dataset (I only tested on 2D and 3D).
Working on a fix.

No, this is data from a 2D simulation, although in this case the coordinates are named x and z (there is no y coordinate). Could this be causing the issue?

On a completely unrelated note, what is the current status of the Vranic algorithm? I see that there is a file in the Algorithm folder but the algorithm is not included in reduction_main.py. Is the algorithm working except for IO? Or is there still work to do on the algorithm itself? In both cases I can try to make it work, I'd like to test it.

No, this is data from a 2D simulation, although in this case the coordinates are named x and z (there is no y coordinate). Could this be causing the issue?

Could you provide a file? I made a file with only x and z coordinates, but, sadly, I still can't reproduce the bug.

Ok thanks. I think the data is a bit large but I'll see if I can produce smaller data with the same error message.

On a completely unrelated note, what is the current status of the Vranic algorithm? I see that there is a file in the Algorithm folder but the algorithm is not included in reduction_main.py. Is the algorithm working except for IO? Or is there still work to do on the algorithm itself? In both cases I can try to make it work, I'd like to test it.

I'm still working on the algorithm itself, so it works for a few simple cases, but I have not tested it completely or measured the metrics yet.

No, this is data from a 2D simulation, although in this case the coordinates are named x and z (there is no y coordinate). Could this be causing the issue?

Could you provide a file? I made a file with only x and z coordinates, but, sadly, I still can't reproduce the bug.

Hello Ksenia. I attach below a fairly small h5 file (~3 MB) which reproduces the issue.
simData_0000040.h5.zip

The error appears when record_name in write_draft_copy is E, so maybe it comes from the fact that the particle data I use contains the E and B fields at the particle positions?

In any case, I've also noticed that the line values = data[particle_index_range[0]:particle_index_range[1]][0] in write_draft_copy seems to be equivalent to values = data[particle_index_range[0]], which could be different from what you initially wanted, no? (Maybe you meant values = data[particle_index_range[0]:particle_index_range[1],0]?) If so, could this cause the issue, i.e. values not having the expected shape?
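A quick NumPy check of that difference, with made-up shapes:

import numpy as np

data = np.arange(12).reshape(6, 2)   # pretend: 6 particles, 2 components
lo, hi = 1, 4                        # stand-ins for particle_index_range

print(data[lo:hi][0])    # first row of the slice -> same as data[lo] -> [2 3]
print(data[lo])          # [2 3]
print(data[lo:hi, 0])    # first component of each row in the slice -> [2 4 6]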

@NeilZaim, thank you for the small example. Yes, I think it can cause this problem. I'm working on a small fix.

@NeilZaim, I fixed it in #19. Now your file works correctly on my side, could you try once again?

So I've tested it and it works now, thanks a lot!

I still have some random segfaults during some of the calls to series_hdf_reduction.flush(), but I can live with that.

I've also noticed that the code feeds many values to the merging algorithms (all the components in particle_species.items(), for instance the E and B values at the particle positions), which are for the most part unused by the algorithms. Typically, I've added something like

if record_component_name != "momentum":
    continue

in get_data, and it can make the scripts much faster when working on larger data arrays.

I also now have a working version of the Vranic algorithm, I'll write a PR soon.

Thanks for your feedback @NeilZaim .

Regarding the loading, I agree that many things are not needed in the algorithms. Unfortunately, they still need to be loaded from and written back to the file, as the structure changes when some macroparticles are deleted. Also, when macroWeighted = 1, the values need to be changed when the weight changes. However, it may be that the implementation does more data shuffling than necessary. Could you look into that, @KseniaBastrakova?
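To spell out the macroWeighted point with a hedged sketch (this reflects the openPMD attribute semantics as I understand them, not code from this repo): a record stored with macroWeighted = 1 carries the macroparticle weight raised to weightingPower, so it has to be rescaled when the weights change:

import numpy as np

def rescale_macro_weighted(values, old_weights, new_weights, weighting_power=1.0):
    """Illustrative helper: rescale a macroWeighted = 1 record after the particle
    weights changed, assuming the stored values scale as weight**weightingPower."""
    ratio = (np.asarray(new_weights) / np.asarray(old_weights)) ** weighting_power
    return np.asarray(values) * ratio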

I see, thanks. (Although I guess that, in the general case, values other than momentum, position and weight are undefined anyway when we create new particles, with Voronoi for instance.) Note that what I have currently written for the Vranic algorithm just returns position/momentum/weight; I can modify it to include other values if needed.

In any case you don't need to worry too much about further optimizing the code as far as I am concerned, what is currently here is fine for me.

Actually Vranic currently crashes if we don't add the lines

if record_component_name != "momentum":
    continue

😬
I'll write a fix

@NeilZaim will you provide this as another commit to your PR #20?

Yes, just pushed it