PhysicalPropertyDataSet

class propertyestimator.datasets.PhysicalPropertyDataSet[source]

An object for storing and curating data sets of both measured and estimated physical properties. This class defines a number of convenience functions for filtering out unwanted properties, and for generating general statistics (such as the number of properties per substance) about the set.

__init__()[source]

Constructs a new PhysicalPropertyDataSet object.

Methods

__init__()

Constructs a new PhysicalPropertyDataSet object.

filter_by_components(number_of_components)

Filter the data set based on the number of components in the substance.

filter_by_elements(*allowed_elements)

Filters out those properties which were estimated for compounds which contain elements outside of those defined in allowed_elements.

filter_by_function(filter_function)

Filter the data set using a given filter function.

filter_by_phases(phases)

Filter the data set based on the phase of the property (e.g. liquid).

filter_by_pressure(min_pressure, max_pressure)

Filter the data set based on a minimum and maximum pressure.

filter_by_property_types(*property_type)

Filter the data set based on the type of property (e.g. Density).

filter_by_smiles(*allowed_smiles)

Filters out those properties which were estimated for compounds which do not appear in the allowed smiles list.

filter_by_temperature(min_temperature, …)

Filter the data set based on a minimum and maximum temperature.

json()

Creates a JSON representation of this class.

merge(data_set)

Merge another data set into the current one.

parse_json(string_contents[, encoding])

Parses a typed json string into the corresponding class structure.

to_pandas()

Converts a PhysicalPropertyDataSet to a pandas.DataFrame object.

Attributes

number_of_properties

The number of properties in the data set.

properties

A list of all of the properties within this set, partitioned by substance identifier.

sources

The list of sources from which the properties were gathered.

property properties

A list of all of the properties within this set, partitioned by substance identifier.

See also

Substance.identifier

Type

dict of str and list of PhysicalProperty

property sources

The list of sources from which the properties were gathered.

Type

list of Source

property number_of_properties

The number of properties in the data set.

Type

int
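As a rough illustration of how this count relates to the properties dictionary, the sketch below uses plain strings as hypothetical stand-ins for PhysicalProperty objects. It assumes the count is the total over all substances; the actual implementation may differ.

```python
# Hypothetical stand-in for the `properties` attribute: a dict mapping a
# substance identifier to the list of properties recorded for it.
properties = {
    "O": ["density_1", "density_2"],
    "CCO": ["dielectric_1"],
}

# Assumed relationship: the total is the sum of the per-substance list lengths.
number_of_properties = sum(len(values) for values in properties.values())
print(number_of_properties)
```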

merge(data_set)[source]

Merge another data set into the current one.

Parameters

data_set (PhysicalPropertyDataSet) – The secondary data set to merge into this one.
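The following is a minimal sketch of what merging two substance-partitioned collections could look like, using plain dictionaries rather than the real class. The helper name merge_properties is hypothetical, and the real method may additionally handle sources or duplicate entries.

```python
def merge_properties(current, other):
    """Merge `other` into `current` in place (hypothetical helper).

    Both arguments map a substance identifier to a list of properties.
    """
    for substance_id, new_properties in other.items():
        # Create the list on first sight of a substance, then append to it.
        current.setdefault(substance_id, []).extend(new_properties)
    return current


current = {"O": ["density_a"]}
other = {"O": ["density_b"], "CCO": ["density_c"]}

merged = merge_properties(current, other)
print(merged)
```

Note that, in this sketch, the first data set is modified in place, which mirrors the "merge into the current one" wording above.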

filter_by_function(filter_function)[source]

Filter the data set using a given filter function.

Parameters

filter_function (lambda) – The filter function.
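The sketch below illustrates the general idea of a predicate-based filter over a substance-partitioned dictionary. It does not use the real class: Measurement and filter_properties are hypothetical stand-ins, and it assumes that a truthy return value means the property is retained.

```python
from collections import namedtuple

# Hypothetical stand-in for PhysicalProperty with only the fields needed here.
Measurement = namedtuple("Measurement", ["value", "uncertainty"])


def filter_properties(properties, filter_function):
    """Keep only entries for which `filter_function` returns a truthy value.

    Substances left with no properties are dropped from the result.
    """
    filtered = {}
    for substance_id, values in properties.items():
        kept = [value for value in values if filter_function(value)]
        if kept:
            filtered[substance_id] = kept
    return filtered


properties = {
    "O": [Measurement(0.997, 0.001), Measurement(0.990, 0.050)],
    "CCO": [Measurement(0.789, 0.100)],
}

# Keep only measurements with an uncertainty below 0.01.
result = filter_properties(properties, lambda m: m.uncertainty < 0.01)
print(result)
```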

filter_by_property_types(*property_type)[source]

Filter the data set based on the type of property (e.g. Density).

Parameters

property_type (PropertyType or str) – The type of property which should be retained.

Examples

Filter the dataset to only contain densities and static dielectric constants:

>>> # Load in the data set of properties which will be used for comparisons
>>> from propertyestimator.datasets import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> # Filter the dataset to only include densities and dielectric constants.
>>> from propertyestimator.properties import Density, DielectricConstant
>>> data_set.filter_by_property_types(Density, DielectricConstant)

or

>>> data_set.filter_by_property_types('Density', 'DielectricConstant')

filter_by_phases(phases)[source]

Filter the data set based on the phase of the property (e.g. liquid).

Parameters

phases (PropertyPhase) – The phase of property which should be retained.

Examples

Filter the dataset to only include liquid properties.

>>> # Load in the data set of properties which will be used for comparisons
>>> from propertyestimator.datasets import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> from propertyestimator.properties import PropertyPhase
>>> data_set.filter_by_phases(PropertyPhase.Liquid)

filter_by_temperature(min_temperature, max_temperature)[source]

Filter the data set based on a minimum and maximum temperature.

Parameters
  • min_temperature (unit.Quantity) – The minimum temperature.

  • max_temperature (unit.Quantity) – The maximum temperature.

Examples

Filter the dataset to only include properties measured between 130 K and 260 K.

>>> # Load in the data set of properties which will be used for comparisons
>>> from propertyestimator.datasets import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> from propertyestimator import unit
>>> data_set.filter_by_temperature(min_temperature=130*unit.kelvin, max_temperature=260*unit.kelvin)

filter_by_pressure(min_pressure, max_pressure)[source]

Filter the data set based on a minimum and maximum pressure.

Parameters
  • min_pressure (unit.Quantity) – The minimum pressure.

  • max_pressure (unit.Quantity) – The maximum pressure.

Examples

Filter the dataset to only include properties measured between 70 kPa and 150 kPa.

>>> # Load in the data set of properties which will be used for comparisons
>>> from propertyestimator.datasets import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> from propertyestimator import unit
>>> data_set.filter_by_pressure(min_pressure=70*unit.kilopascal, max_pressure=150*unit.kilopascal)

filter_by_components(number_of_components)[source]

Filter the data set based on the number of components in the substance.

Parameters

number_of_components (int) – The allowed number of components in the mixture.

Examples

Filter the dataset to only include pure substance properties.

>>> # Load in the data set of properties which will be used for comparisons
>>> from propertyestimator.datasets import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> data_set.filter_by_components(number_of_components=1)

filter_by_elements(*allowed_elements)[source]

Filters out those properties which were estimated for compounds which contain elements outside of those defined in allowed_elements.

Parameters

allowed_elements (str) – The symbols (e.g. C, H, Cl) of the elements to retain.

filter_by_smiles(*allowed_smiles)[source]

Filters out those properties which were estimated for compounds which do not appear in the allowed smiles list.

Parameters

allowed_smiles (str) – The smiles identifiers of the compounds to keep after filtering.
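A minimal sketch of the allow-list idea follows, using plain dictionaries in place of real property objects. The helper keep_allowed and the "components" key are hypothetical, and the sketch compares SMILES strings literally; whether the library canonicalises SMILES before comparison is not stated here.

```python
def keep_allowed(entries, *allowed_smiles):
    """Keep entries whose components all appear in `allowed_smiles`.

    Hypothetical helper: each entry is a dict with a "components" list of
    SMILES strings, compared by exact string match.
    """
    allowed = set(allowed_smiles)
    return [entry for entry in entries if set(entry["components"]) <= allowed]


entries = [
    {"components": ["O"], "value": 0.997},
    {"components": ["O", "CCO"], "value": 0.950},
    {"components": ["CO"], "value": 0.792},
]

# Retain only data for water, ethanol, and their mixtures.
kept = keep_allowed(entries, "O", "CCO")
print([entry["components"] for entry in kept])
```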

to_pandas()[source]

Converts a PhysicalPropertyDataSet to a pandas.DataFrame object with columns of

  • ‘Temperature’

  • ‘Pressure’

  • ‘Phase’

  • ‘Number Of Components’

  • ‘Component 1’

  • ‘Mole Fraction 1’

  • ‘Component N’

  • ‘Mole Fraction N’

  • ‘<Property 1> Value’

  • ‘<Property 1> Uncertainty’

  • ‘<Property N> Value’

  • ‘<Property N> Uncertainty’

  • ‘Source’

where ‘Component X’ is a column containing the SMILES representation of component X.

Returns

The created data frame.

Return type

pandas.DataFrame
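To make the column layout listed above concrete, the sketch below builds the expected header list for a given number of components and set of property names. The helper expected_columns is hypothetical, and both the column ordering and the use of the property class name as the ‘<Property>’ prefix are assumptions based on the list above.

```python
def expected_columns(number_of_components, property_names):
    """Build the column headers described above (hypothetical helper)."""
    columns = ["Temperature", "Pressure", "Phase", "Number Of Components"]

    # One SMILES column and one mole fraction column per component.
    for index in range(1, number_of_components + 1):
        columns.append(f"Component {index}")
        columns.append(f"Mole Fraction {index}")

    # One value column and one uncertainty column per property type.
    for name in property_names:
        columns.append(f"{name} Value")
        columns.append(f"{name} Uncertainty")

    columns.append("Source")
    return columns


columns = expected_columns(2, ["Density", "DielectricConstant"])
print(columns)
```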

json()

Creates a JSON representation of this class.

Returns

The JSON representation of this class.

Return type

str

classmethod parse_json(string_contents, encoding='utf8')

Parses a typed json string into the corresponding class structure.

Parameters
  • string_contents (str or bytes) – The typed json string.

  • encoding (str) – The encoding of the string_contents.

Returns

The parsed class.

Return type

Any