Crashes of parasail_aligner running under Windows cause an error message box to be displayed

Question

Elmohound opened this issue 7 years ago

Sometimes crashes of parasail_aligner under Windows cause an error message dialog box to be displayed. Clearing the dialog box requires user interaction, which is a problem when a script runs parasail_aligner many times.

Jeff Daily · Answer 1 · Sat Feb 17 2018 03:29:10 GMT+0800 (China Standard Time)

I've experienced this as well. I'm not sure how to avoid the dialog box. I managed to cause a crash when I wasn't properly aligning memory. It would be nice to know what you were running to cause this, if possible: for example, your input files, which alignment routine was used, which Visual Studio version you used, the CMake version, the parasail version/branch, etc. Unless this is related to other posted issues.

I will look into how to keep Windows from popping up error message dialogs, but I have very little experience as a Windows developer. That's why I used CMake. I'm primarily a Linux developer.

Elmohound · Answer 2 · Sat Feb 17 2018 03:35:15 GMT+0800 (China Standard Time)

Build environment: Microsoft Visual Studio 15 2017 Win64 and its CMake GUI.

The command was nothing special; the crashes were triggered by using any of the avx2_256 functions (I tried 36 of them).

I've also seen it when trying very large alignments and suspect that it was triggered by memory problems with traceback. I'll see if I can generate these with and without traceback.

Elmohound · Answer 3 · Sat Feb 17 2018 05:28:34 GMT+0800 (China Standard Time)

Does anything mentioned here offer a useful approach to suppressing the error box?

https://stackoverflow.com/questions/3561545/how-to-terminate-a-program-when-it-crashes-which-should-just-fail-a-unit-test/3637710#3637710

A crash due to, for example, memory problems is tolerable, but requiring user interaction in such cases is less so.

Jeff Daily · Answer 4 · Sat Feb 17 2018 05:41:16 GMT+0800 (China Standard Time)

I will try and reproduce the error window popping up using your guidance from earlier.

As for what the link suggests, Option 1 isn't what we're after. You shouldn't set global registry settings just to silence these popups. Option 2, disabling it just for the application might be okay, but I'd rather not mask the symptom and instead fix the problem. It would be nice if there was a solution that reported to stdout and exited cleanly.

If we went with Option 3, SetUnhandledExceptionFilter, I would worry about any tools using parasail as a library. First off, I'm not familiar with exceptions on Windows. More generically, I understand C++ exceptions. libparasail, being a C library, does not have exceptions. I could certainly change the exception handler for the aligner app and see if that helps.

(I once wrote a C library that was wrapped in MATLAB, and the C library encountered an error and exited, taking down the MATLAB GUI with it. I'd like to avoid similar cases with parasail.)

Elmohound · Answer 5 · Sat Feb 17 2018 05:58:29 GMT+0800 (China Standard Time)

I agree, it's one thing to handle exceptions in parasail_aligner, quite another in the library.

I am herewith uploading two more DNA files of 60 and 50 k so that you can generate memory-dependent crashes.

bigger-fasta-files.zip

Commands like the following crash

parasail_aligner -a sw_trace_scan_sat -f 60K.fa -q 50k.fa -OEMBOSS -mnuc44

but using 50k.fas for both succeeds (with SG/SW/NW)

Incidentally, using a 100K or even 1MB sequence as the target and something considerably smaller for the query (e.g. 10K) succeeds as well, which is fine. It's just those cases where both sequences are largish.

Jeff Daily · Answer 6 · Sat Mar 03 2018 07:52:00 GMT+0800 (China Standard Time)

Thanks again for discovering a bug and providing an easy reproducer. The traceback routine was overflowing some integer indexes. In the process of further debugging I found some more integer overflows in the non-vectorized routines. Looks like I failed to test large alignments. Should have a fix for all soon.
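To illustrate the kind of overflow described above, here is a minimal Python sketch. It assumes, hypothetically, a flat traceback table addressed by a 32-bit signed index and nominal sequence lengths of 60,000 and 50,000. The thread does not say which indexes actually overflowed or how the table is laid out, so both the layout and the lengths are assumptions.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Return the value a 32-bit two's-complement integer would hold."""
    return (value + 2**31) % 2**32 - 2**31


def last_cell_index(query_len, target_len):
    """Index of the last cell in a hypothetical flat query_len x target_len table."""
    return query_len * target_len - 1


idx = last_cell_index(60_000, 50_000)
print(idx)              # 2999999999
print(idx > INT32_MAX)  # True: the index no longer fits in a signed 32-bit int
print(wrap_int32(idx))  # -1294967297: what a wrapped 32-bit index would read
```

A negative or wrapped index like this is one way a traceback could end up reading or writing outside its buffer.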

Jeff Daily · Answer 7 · Sat Mar 03 2018 15:46:33 GMT+0800 (China Standard Time)

3b585c3 ended up being a bigger fix than anticipated. I went through all the code trying to figure out where I might be overflowing on any integer index. I think I got them all. Do try to break it and get back to me.

As an aside, we might be reaching the limits of the effectiveness of parasail. If both the query and target are as large as or larger than the ones you are aligning (approximately 50k-100k bp), the scores start to be in the 32-bit range. We start to lose the benefit of vectorization.

What's the largest target and/or query you think you need to support?
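As a rough illustration of the point about scores reaching the 32-bit range, the sketch below bounds the best possible score by the shorter sequence length times the largest match score, then picks the narrowest signed integer width that holds it. The match score of 5 is an assumption about the nuc44 matrix, and the bound ignores any headroom a real vectorized implementation may need.

```python
def smallest_score_width(query_len, target_len, max_match_score):
    """Narrowest signed width (in bits) that can hold a crude upper bound on the score."""
    upper_bound = min(query_len, target_len) * max_match_score
    for bits in (8, 16, 32, 64):
        if upper_bound <= 2 ** (bits - 1) - 1:
            return bits
    raise OverflowError("score bound does not fit in 64 bits")


def lanes_per_vector(vector_bits, score_bits):
    """Number of scores processed at once in one SIMD register."""
    return vector_bits // score_bits


width = smallest_score_width(50_000, 60_000, max_match_score=5)
print(width)                         # 32
print(lanes_per_vector(256, width))  # 8 lanes in a 256-bit register
print(lanes_per_vector(256, 8))      # 32 lanes when 8-bit scores suffice
```

Under these assumptions, going from 8-bit to 32-bit scores cuts the number of lanes per 256-bit register from 32 to 8, which is the loss of vectorization benefit mentioned above.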

Jeff Daily · Answer 8 · Tue Mar 06 2018 03:01:48 GMT+0800 (China Standard Time)

Sincere apologies but I labeled this as "wontfix" because I'd rather a user be rightfully notified that something wrong happened. I noticed that sometimes I was not getting any error message whatsoever even though the aligner wasn't running properly.

I suppose the fact that our discussion helped me uncover the bug/reason why you were seeing the popup message in the first place gives me more justification to keep the behavior the way it is - the popup indicated an error that needed fixing.

Elmohound · Answer 9 · Tue Mar 06 2018 03:02:17 GMT+0800 (China Standard Time)

I would say that a 50K maximum alignment length would be acceptable as long as problems with longer alignments are handled gracefully. With the hotfix of 2018-03-03, I was able to align a 100K sequence to itself in a reasonable amount of time (under 3 minutes) using {NW,SG,SW}_{scan,striped}, so from that perspective things are going well.

The question is, how will failed alignments look? Right now, for example, when I tried using diag functions with a 100Kb x 100Kb alignment I got a 0 length alignment, which is fine because it avoided the error dialog box that was seen with earlier versions of parasail_aligner.

When I used a really long target sequence (1M) with a sufficiently long query (60kb) I got the dreaded "Program has stopped working" dialog box, whereas a 20kb query succeeded with a memory budget of 17GB. It would be useful to have a heuristic that would enable the aligner to predict failure, since it would be better to complain and refuse to return an alignment than it would be to have an ugly crash.
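A heuristic of the kind suggested here could be sketched as follows. It assumes, without confirmation from the thread, that the dominant cost is a traceback table with one entry per query/target cell pair. The function name and the `bytes_per_cell` parameter are hypothetical, and the real per-cell cost in parasail is not stated here.

```python
def fits_in_memory(query_len, target_len, budget_bytes, bytes_per_cell=1):
    """Return (ok, needed_bytes) for a hypothetical query_len x target_len table."""
    needed = query_len * target_len * bytes_per_cell
    return needed <= budget_bytes, needed


# Small, made-up numbers purely for illustration.
ok, needed = fits_in_memory(1_000, 2_000, budget_bytes=1_000_000)
if not ok:
    print(f"refusing alignment: needs about {needed} bytes, over the budget")
```

A check like this would let the aligner complain and skip the alignment up front instead of crashing part-way through.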

As I recall, SSW's aligner application returns a partial alignment when the alignment is too long. The problem with that approach is that there is no indication that the alignment is not full length. Given the choice, I'd take a failed alignment which gives some indication of failure over a partial alignment.

Elmohound · Answer 10 · Tue Mar 06 2018 03:05:56 GMT+0800 (China Standard Time)

Is the problem in the aligner or in the library?
If the pop-up cannot be suppressed, it's going to make it difficult for me to wrap something around the aligner, because I don't know how I can make my wrapper respond to (and clear) the pop-up window.
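One wrapper-side possibility, offered only as a sketch: the Win32 `SetErrorMode` call changes the error mode of the calling process, and child processes inherit that mode by default. The Python helper below (the name `suppress_crash_dialogs` is hypothetical) sets the flags commonly used to silence crash dialogs before the wrapper launches anything. Whether this fully hides the dialog for parasail_aligner has not been tested here.

```python
import sys

# Win32 error-mode flags (values from the Windows SDK headers).
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002


def suppress_crash_dialogs():
    """Ask Windows not to show crash dialogs for this process and its children.

    Returns the previous error mode on Windows, or None on other platforms.
    """
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily; ctypes.windll only exists on Windows

    flags = SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX
    return ctypes.windll.kernel32.SetErrorMode(flags)


previous_mode = suppress_crash_dialogs()
```

This only hides the symptom in the wrapper; the crash itself still has to be detected some other way, for example through the exit code.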

Jeff Daily · Answer 11 · Tue Mar 06 2018 03:25:25 GMT+0800 (China Standard Time)

Alright, I'll revisit this one. I can try and capture as many exceptions as possible and return a non-zero exit code in such cases. Hopefully that will keep the dialog from appearing while still providing an indication of failure. I'll experiment with that and see how it goes.
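From the calling script's side, a non-zero exit code could be consumed roughly like this. The helper name `run_checked` and the timeout are illustrative assumptions, and a child Python process stands in for a failing aligner run.

```python
import subprocess
import sys


def run_checked(argv, timeout_s=None):
    """Run a command and return (ok, detail) instead of raising on failure."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # A run blocked on a dialog box would end up here if a timeout is set.
        return False, "timed out after %s s" % timeout_s
    if proc.returncode != 0:
        return False, "exit code %d" % proc.returncode
    return True, proc.stdout


# Stand-in for a failing aligner run: a child process that exits with code 3.
ok, detail = run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok, detail)  # False exit code 3
```

The timeout is a fallback for the case where a dialog still appears and the process never exits on its own.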

As for the really long sequences, and specifically the scan or striped vectorized approaches, I think what we're seeing is a lot of memory usage due to the query profile. The query profile is created for each query sequence and is proportional in size to the query length and alphabet size.

As for failed alignments, I'll separate the reasons. If it's a lack of memory, hopefully the above solution about catching the unhandled exceptions will work.

If it's integer saturation for the given solution size, I would like your suggestion. It can be expensive w.r.t. memory to use the "_sat" routines because they create query profiles for 8-, 16-, and 32-bit solutions, but the benefit is that they at least try each alignment solution size. If you know you will need a 32-bit number range for your alignment score, you are better off selecting that alignment routine directly; it saves both memory and computation time. If an alignment errors due to score saturation, should I abort the entire run of the aligner? Should I somehow report the failed alignment and move on to other sequences?
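The "try each solution size" idea can be sketched with a toy scorer. This is not parasail's code: it is a plain, non-vectorized Smith-Waterman with a linear gap penalty and made-up scoring values, whose cells are clamped to a signed maximum so that saturation can be detected and the next width tried.

```python
def sw_score(query, target, bits, match=2, mismatch=-1, gap=-2):
    """Toy Smith-Waterman score with cells clamped to a signed `bits`-wide maximum.

    Returns (best_score, saturated).
    """
    limit = 2 ** (bits - 1) - 1
    prev = [0] * (len(target) + 1)
    best, saturated = 0, False
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if cell >= limit:  # conservatively treat reaching the limit as saturation
                cell, saturated = limit, True
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best, saturated


def sw_score_sat(query, target):
    """Try 8-, 16-, then 32-bit scoring; return (score, bits) or (None, None)."""
    for bits in (8, 16, 32):
        score, saturated = sw_score(query, target, bits)
        if not saturated:
            return score, bits
    return None, None


seq = "ACGT" * 25  # 100 bases; a self-alignment scores 200, too big for 8 bits
print(sw_score_sat(seq, seq))  # (200, 16)
```

The `(None, None)` result is the case the question above is about: every width saturated, so the caller has to decide whether to abort or to report the failure and continue.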

Elmohound · Answer 12 · Tue Mar 06 2018 06:21:14 GMT+0800 (China Standard Time)

I think that it would be reasonable to simply report the failure and move on to other sequences. Otherwise, the user might get frustrated having to re-run many searches when only some combinations failed.

One reason why the _sat routines are useful is that the user (in this case me) has no specific way to know ahead of time if an alignment is going to fail due to small word size and, if it does, whether it is worth trying a larger word size.

Similarly, if there were some rules of thumb as to what constitutes a maximum "safe" alignment size, then those could be used to prevent the user from even attempting such alignments. I guess that would be hard to estimate with absolute certainty.

Jeff Daily · Answer 13 · Tue Mar 06 2018 06:53:13 GMT+0800 (China Standard Time)

What would you like the reported failed alignment message to look like? Index number of the query in the query file? Index of the target sequence? Sequence names?

Elmohound · Answer 14 · Tue Mar 06 2018 07:26:00 GMT+0800 (China Standard Time)

How about this, if a run's command line includes "-q", it's just two sequences so report their names.

Otherwise, assume it's a batch job and report the target and query by either names or index values (or both). Reporting the sequence lengths might also be useful, but it's not critical.
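A message along these lines could be assembled as in the sketch below. The function name, the wording, and the field order are all hypothetical; the thread does not show the format the aligner actually prints.

```python
def format_failure(query_name, target_name, query_index=None, target_index=None,
                   query_len=None, target_len=None):
    """Build a one-line warning naming the query/target pair that failed."""
    def describe(role, name, index, length):
        parts = [f"{role} '{name}'"]
        if index is not None:
            parts.append(f"index {index}")
        if length is not None:
            parts.append(f"length {length}")
        return ", ".join(parts)

    return ("warning: alignment failed: "
            + describe("query", query_name, query_index, query_len)
            + " vs "
            + describe("target", target_name, target_index, target_len))


# Two-sequence run ("-q" given): names only.
print(format_failure("seqA", "seqB"))
# Batch run: names plus index values.
print(format_failure("seqA", "seqB", query_index=3, target_index=7))
```

Leaving the indices and lengths optional lets the same helper cover both the two-sequence case and the batch case.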

Jeff Daily · Answer 15 · Tue Mar 06 2018 15:05:03 GMT+0800 (China Standard Time)

v2.1.1 is tagged and adds the warning when alignments have saturated.