Dataclasses were added in Python 3.7.
It would be nice for pandas to support dataclasses. For example, it could be possible to construct a DataFrame by calling .from_dataclasses or just .DataFrame(data=dataclass_list). There should also be a way to go the other direction with .to_dataclasses.
from dataclasses import dataclass
import pandas as pd
@dataclass
class SimpleDataObject(object):
    field_a: int
    field_b: str
dataclass_object1 = SimpleDataObject(1, 'a')
dataclass_object2 = SimpleDataObject(2, 'b')
# Dataclasses to DataFrame
df = pd.from_dataclasses([dataclass_object1, dataclass_object2])
list(df.columns) == ['field_a', 'field_b']
>>> True
df.dtypes == ['int', 'str']
>>> True
# Dataclasses to DataFrame via the constructor
df = pd.DataFrame(data=[dataclass_object1, dataclass_object2])
list(df.columns) == ['field_a', 'field_b']
>>> True
df.dtypes == ['int', 'str']
>>> True
# DataFrame to Dataclasses
df = pd.DataFrame(columns=['field_a', 'field_b'], data=[[1, 'a'], [2, 'b']])
dataclass_list = df.to_dataclasses()
dataclass_list == [dataclass_object1, dataclass_object2]
>>> True
AFAIK it is not guaranteed that you can know that a certain instance is a dataclass. E.g. classes do not inherit from dataclass.
From your example:
@dataclass
class SimpleDataObject(object):
    field_a: int
    field_b: str
x = SimpleDataObject(field_a=2, field_b='f')
I don't think you could even tell from introspection that x is a dataclass, correct? If that's the case, this isn't possible to do.
The dataclasses module has is_dataclass and fields introspection functions, so that part shouldn't be an issue.
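As a minimal sketch of what that introspection looks like, reusing SimpleDataObject from the issue (this is plain standard library, not pandas API):

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class SimpleDataObject:
    field_a: int
    field_b: str


x = SimpleDataObject(field_a=2, field_b="f")

# is_dataclass() is True for dataclass types and for their instances,
# so rule out types when only instances are wanted.
is_instance = is_dataclass(x) and not isinstance(x, type)

# fields() returns Field objects in definition order,
# carrying each field's name and declared type.
field_names = [f.name for f in fields(x)]
field_types = [f.type for f in fields(x)]

print(is_instance, field_names, field_types)
```

So the column names, and at least the annotated types, are recoverable from an instance without it inheriting from anything.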
That said, I'm not sure we should quickly commit to any specific API/support here. For now the asdict helper from the dataclasses module can help with the ingest use case.
In [18]: from dataclasses import asdict
In [19]: pd.DataFrame([asdict(x) for x in [dataclass_object1, dataclass_object2]])
Out[19]:
   field_a field_b
0        1       a
1        2       b
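The reverse direction can probably be handled in a similar spirit today, without any new pandas API. A rough sketch follows; the helper name frame_to_dataclasses is hypothetical, and it assumes the column names match the dataclass field names exactly:

```python
from dataclasses import dataclass, fields

import pandas as pd


@dataclass
class SimpleDataObject:
    field_a: int
    field_b: str


def frame_to_dataclasses(df, cls):
    """Hypothetical helper: build one `cls` instance per row of `df`."""
    names = [f.name for f in fields(cls)]
    # Select the columns in field order; a missing column raises KeyError.
    records = df[names].to_dict("records")
    return [cls(**record) for record in records]


df = pd.DataFrame(columns=["field_a", "field_b"], data=[[1, "a"], [2, "b"]])
objects = frame_to_dataclasses(df, SimpleDataObject)
print(objects)
```

Note that dataclasses do not enforce their annotations at runtime, so the values keep whatever types pandas hands back.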
I put together a solution in this PR that checks the data provided during __init__. However, it looks like the testing pipeline is set up to support multiple Python versions, so I may need a bit more time to make this happen.