On the MIMIC-II query builder there were a couple of tables related to the MIMIC waveform database. Is this something that will be implemented in MIMIC-III?
Also, are there any plans to update the waveform database with more matched patients and new waveforms?
There are plans to update the matched database. The matching process is still ongoing.
Regarding the waveform tables, I'm not convinced that was the simplest method of distributing the matches. While we will release a map of some form, it may not be in the form of relational database tables.
I will leave this issue open for now and re-address it when there is an update about the waveforms.
I've attached a sample of matched headers for MIMIC-III patients here: if you have time, could you comment on whether this is a useful format, and whether you think additional information (HADM_ID, ICUSTAY_ID) would make things easier. See here for how to use matched waveform headers: http://physionet.org/physiobank/database/mimic2wdb/matched/
We do not currently plan to add tables to the MIMIC-III clinical database to match to the waveforms, but we do plan on releasing headers, such as those in the above file.
Thanks Alistair, I think the most important thing for the header would be the ICUSTAY_ID, as that indicates when the patient was admitted to the hospital. The current date listed in the headers is when the actual recording starts as opposed to the date of ICU admission. So if we have the ICUSTAY_ID, I should be able to link the rest of the patient data from there.
Could there be any cases where there is a recording but no ICUSTAY_ID associated with it?
Yes, there are. ICUSTAY_ID and the waveform records are collected independently. We have to map them back and that's not always trivial. There is a host of issues that can happen (different clocks, waveform records with erroneous medical record numbers, alignment issues, ...). Also, minor correction, the ICUSTAY_ID starts when the patient enters the ICU, not the hospital. The HADM_ID is associated with the hospital.
From my calculations around 73% of records have an ICUSTAY_ID, and 87% have an HADM_ID.
Here's a map of the above headers to ICUSTAY_ID/HADM_ID: mimic-iii-matched-waveforms-sample.xlsx
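To make the linking step discussed above concrete, here is a minimal sketch of picking the ICU stay whose time window overlaps a recording the most. It is not the matching procedure used to build the map, which also has to deal with clock differences and erroneous record numbers. The function name `best_overlapping_stay` and the tuple layout are hypothetical; the times are assumed to come from the record header and from the stay's in and out times.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical layout: (icustay_id, intime, outtime) for ONE subject.
Stay = Tuple[int, datetime, datetime]


def best_overlapping_stay(
    record_start: datetime,
    record_end: datetime,
    stays: Iterable[Stay],
) -> Optional[int]:
    """Return the icustay_id whose stay overlaps the recording the longest.

    Returns None when no stay overlaps the recording at all, which the
    thread says does happen for some records.
    """
    best_id: Optional[int] = None
    best_overlap = timedelta(0)
    for icustay_id, intime, outtime in stays:
        # Length of the intersection of the two intervals (negative if disjoint).
        overlap = min(record_end, outtime) - max(record_start, intime)
        if overlap > best_overlap:
            best_id, best_overlap = icustay_id, overlap
    return best_id
```

Choosing the largest overlap rather than requiring the recording start to fall inside the stay tolerates a recording that begins shortly before the documented ICU admission time.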
Hi,
I work with @parisni at APHP on MIMIC3 data.
I just found this CSV file: https://physionet.org/physiobank/database/mimic3wdb/matched/matched_waveform_info.csv and I would like to know if it is the definitive version of the matches between the waveforms and the HADM_ID/ICUSTAY_ID. Also, can you explain what the 'hadm_overlap', 'icustay_overlap', 'rih' and 'rii' columns are?
The page https://mimic.physionet.org/mimicdata/waveforms/ indicates that the work is not finished yet but it seems to be finished.
In issue #166, @tompollard states that "The waveform database for MIMIC-III has not yet been released, but we are working on it."; however, it seems to be available at /mimic3wdb.
Thanks! :)
Thanks for highlighting this @Dubrzr. Essentially @alistairewj created a header file to match previously released waveforms to the MIMIC-III clinical data, but no additional waveforms have been released yet. We'll update documentation etc to clarify this point.
Thanks for your answer and also for your work! :D
I am working on extracting all the data from the .hea header files into a database, and I would like to know whether it would be of interest to merge this work into this repository.
It works like this:
mimic3wdb/
    s00020/
        3544749_0001.hea
        3544749_0002.hea
        3544749_0003.hea
        3544749_0004.hea
        3544749_0005.hea
        3544749_0006.hea
        3544749_0007.hea
        3544749_0008.hea
        3544749_layout.hea
        s00020-2183-04-28-17-47.hea
        s00020-2183-04-28-17-47n.hea
    s00033/
        ....
    ....
> wfr.csv: record_id, subject_id, starttime, endtime, starting_hadm, ending_hadm, starting_icustay, ending_icustay, hadmmatch, icumatch, rih, rii, hadm_overlap, icustay_overlap, comments
> wfe.csv: record_id, type, segment_index, start_datedatetime, end_datedatetime, nsamp, nsig, fs, fmt, sampsperframe, skew, byteoffset, gain, units, baseline, initvalue, signame, comments
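For readers who want to walk a tree like the one listed above, here is a small sketch that sorts header files into kinds by filename alone. It is not taken from the linked scripts. The four kind labels and the name `classify_header` are hypothetical, and the patterns are only inferred from the filenames shown in this thread (numbered segment headers, a `_layout` header, and per-subject master headers with an `n` suffix for numerics).

```python
import re

# Patterns inferred from the example listing; order matters only for clarity.
_PATTERNS = [
    ("numerics", re.compile(r"^s\d+-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}n\.hea$")),
    ("waveform", re.compile(r"^s\d+-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}\.hea$")),
    ("layout", re.compile(r"^\d+_layout\.hea$")),
    ("segment", re.compile(r"^\d+_\d{4}\.hea$")),
]


def classify_header(filename: str) -> str:
    """Return 'numerics', 'waveform', 'layout', 'segment', or 'unknown'."""
    for kind, pattern in _PATTERNS:
        if pattern.match(filename):
            return kind
    return "unknown"
```

Anything that comes back as `unknown` is worth logging rather than silently skipping, since it may point at a naming variant this sketch does not cover.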
My scripts are available here: https://github.com/Dubrzr/mimic3-scripts
If you are interested in the resulting files, ask me.
Hi,
While exploring the data gathered with my script, I found erroneous dates in header files.
Only headers of numerics (s*n.hea) have this problem. For example, in https://physionet.org/physiobank/database/mimic3wdb/matched/s00052/s00052-2191-01-10-02-21n.hea the date is 14/03/3036, while the filename indicates that the date is 10/01/2191.
There are 888 numerics headers with this problem.
For the files concerned, can I assume that the date in the filename is the correct one? It seems to be concordant with the admission table.
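The kind of check described here can be sketched as follows. This is not the linked `headers_checker.py`, only an illustration. It assumes the record name encodes the start as `sNNNNN-YYYY-MM-DD-hh-mm` with an optional trailing `n`, and that the first line of the header ends with a base date written as DD/MM/YYYY, as in the WFDB header format. All function names are hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Record names such as "s00052-2191-01-10-02-21n" (no ".hea" extension).
_NAME_RE = re.compile(r"^s\d+-(\d{4})-(\d{2})-(\d{2})-\d{2}-\d{2}n?$")
# A whitespace-delimited DD/MM/YYYY token on the header's record line.
_DATE_RE = re.compile(r"(?<!\S)(\d{2})/(\d{2})/(\d{4})(?!\S)")


def date_from_name(record_name: str) -> date:
    """Date encoded in the record name; raises ValueError if it does not parse."""
    m = _NAME_RE.match(record_name)
    if m is None:
        raise ValueError(f"unexpected record name: {record_name!r}")
    year, month, day = map(int, m.groups())
    return date(year, month, day)


def date_from_record_line(record_line: str) -> Optional[date]:
    """Base date on the record line, or None if the line carries no date.

    An impossible date (for example month 13) raises ValueError.
    """
    m = _DATE_RE.search(record_line)
    if m is None:
        return None
    day, month, year = map(int, m.groups())
    return date(year, month, day)


def has_date_mismatch(record_name: str, record_line: str) -> bool:
    """True when the header carries a date that disagrees with the filename."""
    header_date = date_from_record_line(record_line)
    return header_date is not None and header_date != date_from_name(record_name)
```

Only the dates are compared, so a header whose time of day differs from the filename would not be flagged by this sketch.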
There are also header files that are totally wrong:
You can see all the files with those problems here: https://gist.github.com/Dubrzr/6a22ae48980a549cc5883f3750ec0578
The script that generated this output is here: https://github.com/Dubrzr/mimic3-scripts/blob/master/headers_checker.py
Thanks!
Thanks for the bug report. I will be fixing the data later today - it was a sloppy regex! The date in the filename is the correct one. I'll post again when the data is updated on PhysioNet.
Regarding the crazy years, there are four of them to my knowledge:
No idea why the years are ridiculous. Probably a bad setting on a monitor. I would just exclude them like you're doing.
The matched header files on PhysioNet should be updated. Specifically, you should only need to redownload the s#####*.hea files. Let me know if you succeed with your next iteration of the script!
Regarding your scripts, I do think they'd be of interest to the community, but we'd have to think about where best to put them. For now I would tag your repository with mimic-iii and physionet, which should help some.
Just a quick update: we are pleased to say that a new batch of matched waveforms is being uploaded to PhysioNet right now (~10k patients in total). Once the waveforms are uploaded and checked, they will be made available for analysis.
This is a super-exciting announcement! Thanks a lot for both of your work!
@bemoody and @cx1111 are the guys to thank for this - we'll pass on your praise!