I would be very pleased if pandas supported multiple comment characters when reading data from files. For example:
import pandas as pd
df = pd.read_table("data.dat", comment=("#","@"), delim_whitespace=True)
I don't know whether this would require a minor or major implementation effort.
Best,
Erik
This would be a bit of an effort. The reader basically works byte by byte (with some backreference capability), so it would have to check each byte against a buffer of comment characters, and do so in a performant manner. (Right now it just checks against a single character, and only if that character is not NULL.) It could be done.
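Until something like that exists in the parser, one possible workaround is to strip the comments in Python before handing the text to pandas. The following is only a rough sketch, not anything pandas provides: `strip_comments` and `read_multi_comment` are hypothetical helper names, and the sketch assumes the file is small enough to hold in memory and that the comment characters never appear inside quoted fields.

```python
import io
import re

import pandas as pd


def strip_comments(lines, comment_chars=("#", "@")):
    """Yield each line with everything from the first comment marker removed.

    Hypothetical helper: lines that are empty after stripping are dropped.
    """
    pattern = re.compile("|".join(re.escape(c) for c in comment_chars))
    for line in lines:
        kept = pattern.split(line, maxsplit=1)[0].rstrip()
        if kept:
            yield kept + "\n"


def read_multi_comment(lines, comment_chars=("#", "@"), **kwargs):
    """Hypothetical wrapper: clean the lines, then let pandas parse them."""
    cleaned = "".join(strip_comments(lines, comment_chars))
    return pd.read_csv(io.StringIO(cleaned), **kwargs)


sample = [
    "# a comment line\n",
    "@ another kind of comment\n",
    "a b\n",
    "1 2 # trailing comment\n",
    "3 4\n",
]
df = read_multi_comment(sample, sep=r"\s+")
```

For a real file, something like `with open("data.dat") as fh: df = read_multi_comment(fh, sep=r"\s+")` should work the same way, since an open text file iterates line by line. This trades the speed of the C parser's built-in comment handling for flexibility.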
Related:
It would be great if a comment marker could also be two characters, e.g. "##". For example, in VCF files, some metadata is specified at the beginning of the file with "##" before the actual table starts:
http://www.internationalgenome.org/wiki/Analysis/vcf4.0/
Often one just wants to ignore these, but:
df = pd.read_csv("data.vcf", comment="##")
doesn't work. Note that for VCF it won't work to just use comment="#"
since the header line actually starts with a single "#".
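As a stopgap for the VCF case, the "##" lines can be filtered out before parsing, which keeps the single-"#" header line intact. This is just a sketch under a few assumptions: `read_vcf_table` is a made-up helper name, the table is assumed to be tab-separated, and the whole file is read into memory.

```python
import io

import pandas as pd


def read_vcf_table(lines, meta_prefix="##", **kwargs):
    """Hypothetical helper: drop lines starting with meta_prefix, parse the rest."""
    body = "".join(line for line in lines if not line.startswith(meta_prefix))
    kwargs.setdefault("sep", "\t")  # assumption: tab-separated table
    return pd.read_csv(io.StringIO(body), **kwargs)


sample = [
    "##fileformat=VCFv4.0\n",
    "##source=example\n",
    "#CHROM\tPOS\tID\n",
    "20\t14370\trs6054257\n",
    "20\t17330\t.\n",
]
df = read_vcf_table(sample)
```

With a file on disk this would be called as `with open("data.vcf") as fh: df = read_vcf_table(fh)`. Note that the first column keeps its literal name "#CHROM", since the leading "#" is part of the header line.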
This would be difficult. I'm closing this for now.