# Property Data Sets

A PhysicalPropertyDataSet is a collection of measured physical properties, each encapsulated as a physical property object. Data sets may be created from scratch:

# Define a density measurement
density = Density(
    substance=Substance.from_components("O"),
    thermodynamic_state=ThermodynamicState(
        pressure=1.0*unit.atmospheres, temperature=298.15*unit.kelvin
    ),
    phase=PropertyPhase.Liquid,
    value=1.0*unit.gram/unit.millilitre,
    uncertainty=0.0001*unit.gram/unit.millilitre
)

# Add the property to a data set
data_set = PhysicalPropertyDataSet()
data_set.add_properties(density)


can be saved to and loaded from JSON files:

# Save the data set as a JSON file.
data_set.json(file_path="data_set.json", format=True)
# Load the data set from a JSON file
data_set = PhysicalPropertyDataSet.from_json(file_path="data_set.json")


and may be converted to pandas DataFrame objects:

data_set.to_pandas()


The framework implements specific data set objects for extracting measurements directly from a number of open data sources. One example is the ThermoMLDataSet (see ThermoML Data Sets), which provides utilities for extracting data from the NIST ThermoML Archive and converting it into the standard framework objects.

Data set objects are directly iterable:

for physical_property in data_set:
    ...


or can be iterated over for a specific substance:

for physical_property in data_set.properties_by_substance(substance):
    ...


or for a specific type of property:

for physical_property in data_set.properties_by_type("Density"):
    ...
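
The three iteration styles above can be illustrated with a small standalone sketch. It does not use the framework: `ToyProperty`, `ToyDensity`, `ToyEnthalpyOfVaporization` and `ToyDataSet` are hypothetical stand-ins, the numeric values are placeholders, and the filtering logic is only an assumption about how such methods might behave.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class ToyProperty:
    """Hypothetical stand-in for a measured property (not the framework class)."""

    substance: str  # identifier for the substance, e.g. a SMILES string
    value: float  # placeholder value, units omitted for brevity


class ToyDensity(ToyProperty):
    """Hypothetical density record."""


class ToyEnthalpyOfVaporization(ToyProperty):
    """Hypothetical enthalpy of vaporization record."""


class ToyDataSet:
    """Hypothetical container that mimics the three iteration styles."""

    def __init__(self) -> None:
        self._properties: List[ToyProperty] = []

    def add_properties(self, *properties: ToyProperty) -> None:
        self._properties.extend(properties)

    def __iter__(self) -> Iterator[ToyProperty]:
        # Direct iteration yields every stored property.
        return iter(self._properties)

    def properties_by_substance(self, substance: str) -> Iterator[ToyProperty]:
        # Lazily yield only the properties recorded for the given substance.
        return (p for p in self._properties if p.substance == substance)

    def properties_by_type(self, type_name: str) -> Iterator[ToyProperty]:
        # Match on the class name, so the filter is passed as a string.
        return (p for p in self._properties if type(p).__name__ == type_name)


toy_set = ToyDataSet()
toy_set.add_properties(
    ToyDensity("O", 1.0),
    ToyDensity("CO", 0.8),
    ToyEnthalpyOfVaporization("O", 40.0),
)

all_values = [p.value for p in toy_set]
water_values = [p.value for p in toy_set.properties_by_substance("O")]
density_values = [p.value for p in toy_set.properties_by_type("ToyDensity")]
```

Returning generators from the filter methods keeps the filtering lazy, which matters little here but avoids copying when a data set holds many measurements.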


## Physical Properties

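The PhysicalProperty object is a base class for any object which describes a measured property of a substance, and is defined by a combination of:

• the observed value of the property.

• a Substance specifying the substance that the measurement was collected for.

• a PropertyPhase specifying the phase that the measurement was collected in.

• a ThermodynamicState specifying the thermodynamic conditions under which the measurement was performed.

as well as optionally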

• the uncertainty in the value of the property.

• a list of ParameterGradient which defines the gradient of the property with respect to the model parameters if it was computationally estimated.

• a Source specifying the source (either experimental or computational) and provenance of the measurement.

Each type of property supported by the framework, such as a density or an enthalpy of vaporization, must have its own class representation which inherits from PhysicalProperty:

# Define a density measurement
density = Density(
    substance=Substance.from_components("O"),
    thermodynamic_state=ThermodynamicState(
        pressure=1.0*unit.atmospheres, temperature=298.15*unit.kelvin
    ),
    phase=PropertyPhase.Liquid,
    value=1.0*unit.gram/unit.millilitre,
    uncertainty=0.0001*unit.gram/unit.millilitre
)
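
The split between required and optional information, and the one-class-per-property-type pattern, can be sketched with plain dataclasses. This is an illustration only: `SketchProperty` and `SketchDensity` are hypothetical names, plain strings and floats stand in for the framework's substance, state, phase and unit-carrying objects, and the real base class may be structured quite differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchProperty:
    """Hypothetical base record; field names mirror the keyword arguments above."""

    # Required: what was measured, for which substance, in which phase and state.
    substance: str
    thermodynamic_state: str
    phase: str
    value: float
    # Optional: information that may not be available for every data point.
    uncertainty: Optional[float] = None
    gradients: Optional[List[float]] = None
    source: Optional[str] = None


@dataclass
class SketchDensity(SketchProperty):
    """Hypothetical concrete property type; it only inherits from the base."""


with_uncertainty = SketchDensity(
    "O", "298.15 K, 1 atm", "Liquid", 1.0, uncertainty=0.0001
)
without_uncertainty = SketchDensity("O", "298.15 K, 1 atm", "Liquid", 1.0)
```

Giving the optional fields a default of `None` lets a property be constructed from the required items alone, while code that consumes it can check explicitly whether an uncertainty or source was recorded.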


## Substances

A Substance is defined by a number of components (which may have specific roles assigned to them such as being solutes in the system) and the amount of each component in the substance.

To create a pure substance containing only water:

water_substance = Substance.from_components("O")


To create a binary mixture of water and methanol in a 20:80 ratio:

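binary_mixture = Substance()
binary_mixture.add_component(Component(smiles="O"), MoleFraction(value=0.2))
binary_mixture.add_component(Component(smiles="CO"), MoleFraction(value=0.8))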
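
A ratio such as 20:80 has to be turned into fractions that sum to one before it can be used as a set of mole fractions. The helper below is a minimal sketch of that arithmetic, assuming the ratio is a molar one; `ratio_to_mole_fractions` is a hypothetical function and not part of the framework.

```python
from typing import Dict


def ratio_to_mole_fractions(parts: Dict[str, float]) -> Dict[str, float]:
    """Normalize relative molar amounts so that they sum to one."""
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("The relative amounts must sum to a positive number.")
    return {smiles: amount / total for smiles, amount in parts.items()}


# Water ("O") and methanol ("CO") in a 20:80 molar ratio.
fractions = ratio_to_mole_fractions({"O": 20, "CO": 80})
print(fractions)  # {'O': 0.2, 'CO': 0.8}
```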


To create a substance of an infinitely dilute paracetamol solute dissolved in water:

solution = Substance()
solution.add_component(
    Component(smiles="O", role=Component.Role.Solvent), MoleFraction(value=1.0)
)
solution.add_component(
    Component(smiles="CC(=O)Nc1ccc(O)cc1", role=Component.Role.Solute), ExactAmount(value=1)
)


## Property Phases

The PropertyPhase enum describes the possible phases in which a measurement was performed.

While the enum only defines three phases (Solid, Liquid and Gas), a combined phase can be formed by OR’ing (|) several phases together. As an example, to define a phase for a liquid and gas coexisting:

liquid_gas_phase = PropertyPhase.Liquid | PropertyPhase.Gas
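
The way OR'ing combines phases can be reproduced with Python's standard `enum.Flag`. The sketch below is standalone: `SketchPhase` is a hypothetical enum written for illustration, and the framework's own PropertyPhase definition may differ in its values and base class.

```python
from enum import Flag, auto


class SketchPhase(Flag):
    """Hypothetical flag enum; each member occupies its own bit."""

    Solid = auto()
    Liquid = auto()
    Gas = auto()


liquid_gas = SketchPhase.Liquid | SketchPhase.Gas

# Membership tests check whether a single phase is part of the combination.
has_liquid = SketchPhase.Liquid in liquid_gas
has_solid = SketchPhase.Solid in liquid_gas

# A bitwise AND gives the phases shared by two combinations.
shared = liquid_gas & (SketchPhase.Solid | SketchPhase.Liquid)
```

Because each member is a distinct bit, any combination of the three phases is representable without defining extra enum members.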


## Thermodynamic States

A ThermodynamicState specifies a combination of the temperature and (optionally) the pressure at which a measurement is performed:

thermodynamic_state = ThermodynamicState(
    temperature=298.15*unit.kelvin, pressure=1.0*unit.atmosphere
)