Evaluation of chromatogram with only two peaks with high intensity differences difficult to match #18

Closed
sebastian-hogeweg opened this issue Jun 12, 2024 · 2 comments · Fixed by #20
Assignees: gchure
Labels: bug, troubleshooting

Comments

@sebastian-hogeweg

Applying the presented workflow to this data set (chromatogram1.csv) gives unsatisfactory results. I already tried modifying the data (chromatogram1_mod.csv) so that all values are positive; however, the peaks and the corresponding areas still look strange to me. Changing specific parameters, such as the baseline window, improved the result only slightly. It would be great to get some help selecting parameters that improve the fit.
I am looking forward to any help.

Example code:
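```python
import matplotlib.pyplot as plt
from hplc.io import load_chromatogram
from hplc.quant import Chromatogram

# Load the modified (all-positive) data and plot the raw chromatogram.
chromatogram = load_chromatogram('chromatogram1_mod.csv', cols=['time', 'signal'])
chrom = Chromatogram(chromatogram)
chrom.show()
plt.savefig("chromatogram.svg", bbox_inches="tight", transparent=False)
plt.close()

# Subtract the baseline with default settings and plot the result.
chrom = Chromatogram(chromatogram)
chrom.correct_baseline()
chrom.show()
plt.savefig("chromatogram_baseline_correction.svg", bbox_inches="tight", transparent=False)
plt.close()

# Fit peaks on the already-corrected signal and plot the reconstruction.
peaks = chrom.fit_peaks(correct_baseline=False, prominence=0.01)
chrom.show()
plt.savefig("chromatogram_peaks.svg", bbox_inches="tight", transparent=False)
plt.show()
```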

[Attached figures: chromatogram, chromatogram_peaks, chromatogram_peaks_window100]

@gchure gchure self-assigned this Jun 15, 2024
@gchure gchure added bug Something isn't working troubleshooting labels Jun 15, 2024
@gchure
Member

gchure commented Jun 17, 2024

Hi @sebastian-hogeweg. Thanks for the issue. This is a known problem for chromatograms with very large-valued time dimensions (see #15). I think I know what the issue is, but it will take me some time to rework how the windowing and inference operate.

In the meantime, you can manually adjust the fitting parameter bounds (see `param_bounds` on `deconvolve_peaks`). I suspect broadening the location and amplitude bounds will help.
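As a rough sketch of what passing broader bounds could look like: the snippet below assumes `param_bounds` is a dict of the form `parameter: [lower, upper]` with keys such as `'amplitude'` and `'location'`, and that `fit_peaks` forwards it to `deconvolve_peaks`. Both are assumptions to check against the docstring of your installed version. The numbers are purely illustrative, not tuned for this data set, and `fit_with_wide_bounds` is a hypothetical helper name.

```python
# Illustrative bounds only; confirm the expected format and semantics in the
# deconvolve_peaks docstring before relying on them.
PARAM_BOUNDS = {
    "amplitude": [0.01, 100],  # assumed: wider range than the defaults
    "location": [500, 500],    # assumed: wider range than the defaults
}


def fit_with_wide_bounds(chrom, approx_peak_width=500):
    """Call fit_peaks on an existing Chromatogram with broadened bounds.

    `chrom` is expected to be an hplc.quant.Chromatogram instance.
    """
    return chrom.fit_peaks(
        approx_peak_width=approx_peak_width,
        param_bounds=PARAM_BOUNDS,
    )
```

With a `Chromatogram` built as in the example code above, this would be called as `peaks = fit_with_wide_bounds(chrom)`.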

Additionally, you will need to adjust `approx_peak_width` in the call to `fit_peaks` for the background subtraction. The default value there is 2, whereas in your case it should be something more like 500 since your time dimension is large.
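If it is unclear what order of magnitude to use, one option is to measure the width of the tallest peak directly from the data, in the same units as the time column. The helper below is a hypothetical heuristic (not part of hplc-py) that returns a rough full width at half maximum; the synthetic trace is made up for illustration.

```python
import math


def estimate_peak_width(time, signal):
    """Rough full width at half maximum of the tallest peak, in time units."""
    base = min(signal)
    apex = max(range(len(signal)), key=lambda i: signal[i])
    half = base + 0.5 * (signal[apex] - base)

    # Walk outward from the apex until the signal drops to half height.
    left = apex
    while left > 0 and signal[left] > half:
        left -= 1
    right = apex
    while right < len(signal) - 1 and signal[right] > half:
        right += 1
    return time[right] - time[left]


# Synthetic chromatogram: one Gaussian peak on a large time axis.
time = [10.0 * i for i in range(1001)]
signal = [1e5 * math.exp(-0.5 * ((t - 5000.0) / 200.0) ** 2) for t in time]

width = estimate_peak_width(time, signal)
print(f"approximate peak width: {width:.0f} time units")
```

A value of this magnitude is a reasonable starting guess for `approx_peak_width`; it only needs to be in the right range, not exact.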

@gchure
Member

gchure commented Aug 14, 2024

Hi @sebastian-hogeweg, hope you're doing well! Sorry for the late response on this. I've now taken a stab at addressing this issue; the fix is in #20 and available in hplc-py v0.2.7.

Running a near-default script like the following on your modified chromatogram data now gives a sensible fit:

```python
from hplc.quant import Chromatogram
from hplc.io import load_chromatogram

data = load_chromatogram('chromatogram1_mod.csv', cols=['time', 'signal'])
chrom = Chromatogram(data)
peaks = chrom.fit_peaks(approx_peak_width=500)
chrom.show()
```
[Screenshot, 2024-08-14: output of `chrom.show()`]

I modified the default parameter bounds to be more permissive of very large signal intensities, such as those present in your chromatogram.

Also note that the approximate peak width is set to 500 rather than the default of 2. This is important when your chromatograms have a large time dimension. I've put in a new check that will yell if your peak width is too small, which would otherwise result in a blank chromatogram.
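To make the idea of such a check concrete, here is a minimal sketch of one way to test a peak width against the sampling interval. It is not the code from #20; `check_peak_width` and its one-sample threshold are assumptions chosen for illustration.

```python
def check_peak_width(time, approx_peak_width):
    """Return the peak width in samples, raising if it is under one sample.

    Hypothetical check: compares approx_peak_width (in time units) with the
    mean sampling interval of the time axis.
    """
    if len(time) < 2:
        raise ValueError("need at least two time points")
    dt = (time[-1] - time[0]) / (len(time) - 1)  # mean sampling interval
    n_points = approx_peak_width / dt
    if n_points < 1:
        raise ValueError(
            f"approx_peak_width={approx_peak_width} is smaller than the "
            f"sampling interval ({dt:g}); try a larger value"
        )
    return n_points


# Made-up time axis sampled every 10 time units.
time = [10.0 * i for i in range(1001)]
n_points = check_peak_width(time, 500)  # passes: the width spans many samples

try:
    check_peak_width(time, 2)  # a width of 2 is below the sampling interval
except ValueError as err:
    print(err)
```

On this made-up axis a width of 2 covers less than one sample, which is the kind of mismatch that makes a small default unsuitable for large time dimensions.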

If you're still having problems after updating, feel free to reopen the issue.

@gchure gchure closed this as completed Aug 14, 2024