michaelmarty / UniDec

Universal Deconvolution of Mass and Ion Mobility Spectra

trying to create txt input

animesh opened this issue

commented

I am trying to create a txt input for UniDec using a Thermo raw file reader (https://github.com/animesh/RawRead/tree/deConv), but I am not sure how to represent the m/z and intensity values as described in http://michaelmarty.github.io/UniDecDocumentation/index.html. Currently I am representing the base peak and intensity (max) across the scan, but I am getting a confusing mz/5 scatter:
image
So I am wondering: what is the right way to represent the txt input?

Hi @animesh,
I think the issue with these files is that the m/z values are out of order. It looks like it is reading the data properly, but it expects a sorted list of m/z. Try applying a simple sort and see if that works. However, there is already a raw reader in the software, so you might not need to write your own. Thanks, MTM
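
A minimal sketch of the sort suggested above, assuming the data is a two-column table of m/z and intensity. The function name `sort_by_mz` and the file names in the comments are hypothetical, and the load/save lines assume a whitespace-separated text file:

```python
import numpy as np

def sort_by_mz(data):
    """Return a copy of an (N, 2) array of [m/z, intensity] rows in ascending m/z order."""
    data = np.asarray(data, dtype=float)
    # A stable sort keeps the original order of rows that share the same m/z.
    order = np.argsort(data[:, 0], kind="stable")
    return data[order]

# Small in-memory example with out-of-order m/z values.
unsorted = np.array([[1500.2, 10.0],
                     [800.5, 30.0],
                     [1200.1, 20.0]])
sorted_data = sort_by_mz(unsorted)

# With real files (hypothetical names; set skiprows to the number of header lines):
# data = np.loadtxt("input.txt", skiprows=1)
# np.savetxt("input_sorted.txt", sort_by_mz(data), fmt="%.6f")
```

Each intensity stays attached to its m/z value because whole rows are reordered, not the columns separately.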

commented

Thanks @michaelmarty for such a quick response! I tried sorting the values, and at least the display no longer looks so messy:
210112__solveig_AN0.raw.intensityThreshold1000.PPM10.errTolDecimalPlace3.Time20210128191052.MS.txt

image
but I am now facing an error:
screenshot
image
console

c:\python37\lib\site-packages\PyInstaller\loader\pyimod03_importers.py:493: MatplotlibDeprecationWarning:
The MATPLOTLIBDATA environment variable was deprecated in Matplotlib 3.1 and will be removed in 3.3.
Required multiplierz data file unimod.sqlite not found!  Creating it.
WARNING: No unimod.sqlite found in F:\UniDec_Windows_210118\UniDec_Windows (F:\UniDec_Windows_210118\UniDec_Windows\GUI_UniDec.exe)

UniDec Engine v.4.4.0

UniDec Path: F:\UniDec_Windows_210118\UniDec_Windows\UniDec.exe
Launching UniDec

UniDec Engine v.4.4.0

UniDec Path: F:\UniDec_Windows_210118\UniDec_Windows\UniDec.exe
Display Size  (1920, 1200)
Opening:  210112__solveig_AN0.raw.intensityThreshold1000.PPM10.errTolDecimalPlace3.Time20210128191052.MS.txt
Opening File:  F:\SK\210112__solveig_AN0.raw.intensityThreshold1000.PPM10.errTolDecimalPlace3.Time20210128191052.MS.txt
Header Length: 1
Loading Time: 14s
Plot 1: 0.19s
Linear False
Data Prep Time: 11s
F:\UniDec_Windows_210118\UniDec_Windows\scipy\optimize\minpack.py:829: OptimizeWarning: Covariance of the parameters could not be estimated
Automatic Peak Width: 0.00294
Plot 1: 0.23s
Data Prep Done. Time: 15s
infile = F:\SK\210112__solveig_AN0.raw.intensityThreshold1000.PPM10.errTolDecimalPlace3.Time20210128191052.MS_unidecfiles\210112__solveig_AN0.raw.intensityThreshold1000.PPM10.errTolDecimalPlace3.Time20210128191052.MS_input.dat
UniDec Run
Length of Data: 846412
Threshold: 2.165771     UniDec run 22s
UniDec Run Error: 3221225477
Error  3221225477

Probably the input [210112__solveig_AN0.raw.intensityThreshold1000.PPM10.errTolDecimalPlace3.Time20210128191052.MS.txt] is still incorrect, or needs some more preprocessing?

Hi @animesh,
Yep! The data import looks good. I am pretty sure the error is coming from the fact that your data is really large, and the computer is running out of memory in a critical part. I think you can fix this by setting the Peak FWHM to 0. I am planning to put in a test for this. Thanks, MTM
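
As a side note, the number in `UniDec Run Error: 3221225477` is easier to read in hexadecimal. A minimal sketch of the conversion:

```python
# Exit code reported in the console log above.
code = 3221225477

# Format as an 8-digit hexadecimal value, the usual notation for Windows status codes.
hex_code = f"0x{code:08X}"
print(hex_code)  # 0xC0000005
```

On Windows, 0xC0000005 is the status code for an access violation, meaning the process touched memory it was not allowed to. That fits a crash in native code after a failed allocation on very large data, although the code alone does not prove that the process ran out of memory.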

commented

Where is this FWHM setting in UniDec, @michaelmarty? By the way, if I convert to mzML and then import, I get a totally different picture:
image
So it could just be an issue with how I am presenting the ions in the txt file, which is essentially all the ions and their summed intensities sorted by m/z. Does every scan perhaps have to go into the txt file separately?

The FWHM is under the additional Deconvolution Parameters. Yeah, each scan will have slightly different m/z values, so if you just flatten all the scans, it will give you crazy big data. For mzML data, we use a merge scan function that determines the average resolution and then sums the data to match it. For Waters and Thermo data, we actually use the scan averaging built into their DLL files. Not sure how it works, though...
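
To illustrate why summing scans onto a shared m/z axis shrinks the data, here is a minimal sketch that uses plain fixed-width binning. This is not UniDec's merge scan function or the vendor DLL averaging; the function name `bin_scans` and the `bin_width` parameter are hypothetical, and the bin width is chosen by hand here instead of being derived from the average resolution:

```python
import numpy as np

def bin_scans(scans, bin_width):
    """Sum several (mz, intensity) scans into shared fixed-width m/z bins.

    scans: list of (mz_array, intensity_array) pairs, one pair per scan.
    Returns (bin_centers, summed_intensity).
    """
    mz_all = np.concatenate([np.asarray(mz, dtype=float) for mz, _ in scans])
    int_all = np.concatenate([np.asarray(inten, dtype=float) for _, inten in scans])

    lo, hi = mz_all.min(), mz_all.max()
    # One extra bin so that the largest m/z falls inside the last bin.
    n_bins = int(np.floor((hi - lo) / bin_width)) + 1
    edges = lo + bin_width * np.arange(n_bins + 1)

    # Weighted histogram: each point adds its intensity to the bin holding its m/z.
    summed, _ = np.histogram(mz_all, bins=edges, weights=int_all)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, summed

# Two scans whose m/z values differ slightly, as described above.
scan_a = (np.array([500.000, 500.010, 500.130]), np.array([1.0, 2.0, 3.0]))
scan_b = (np.array([500.002, 500.011, 500.132]), np.array([10.0, 20.0, 30.0]))

centers, summed = bin_scans([scan_a, scan_b], bin_width=0.05)
print(len(centers), summed)  # 3 bins holding 33, 0 and 33
```

Six flattened points collapse into three bins, and the output is already sorted by m/z. Empty bins can be dropped before writing the file if a sparse list is preferred.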