Uncertainty distributions, parameters, and the Pedigree Matrix¶
Variables and formulas¶
Ecospold 2 supports parameterized datasets, where numeric values for exchanges and production volumes can be calculated using a chain of formulas and variables with uncertainty distributions. Formulas and variables can be present in four different places (see also the Internal data format):
- An exchange in the list dataset['exchanges'].
- A property of an exchange in the list dataset['exchanges'][some_index]['properties'][another_index]. Not all exchanges have properties.
- A parameter in the list dataset['parameters']. Again, not all datasets have parameters.
- A technosphere exchange production volume dataset['exchanges'][some_index]['production volume']. Only production exchanges (type reference product or byproduct) have production volumes.
Conventions and standards¶
A variable in an exchange, property, parameter, or production volume is defined with the dictionary key variable. The value for this key will be the string name of the variable, e.g. {'variable': 'some_name'}. Variable names must be valid Python identifiers, so some_name instead of some name.
A formula in an exchange, property, parameter, or production volume is defined with the dictionary key formula. The value for this key will be the formula as a string, e.g. {'formula': 'some_name * 2'}.
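As an illustration, the two conventions might look like this inside a dataset. This is only a sketch: the variable names, the amounts, and the helper is_valid_variable_name are hypothetical and not part of Ocelot; only the keys exchanges, parameters, amount, variable, and formula come from this page.

```python
# Hypothetical dataset fragment; names and numbers are made up for illustration.
dataset = {
    'exchanges': [
        # A named variable: other formulas can refer to `fuel_input`.
        {'amount': 2.0, 'variable': 'fuel_input'},
        # A formula that depends on the variable above.
        {'amount': 4.0, 'formula': 'fuel_input * 2'},
    ],
    'parameters': [
        {'amount': 0.5, 'variable': 'loss_factor'},
    ],
}


def is_valid_variable_name(name):
    """Return True if `name` is a valid Python identifier (not an Ocelot function)."""
    return name.isidentifier()


# Collect every variable name defined in exchanges and parameters.
names = [obj['variable']
         for obj in dataset['exchanges'] + dataset['parameters']
         if 'variable' in obj]
```

Here is_valid_variable_name('some_name') is true, while is_valid_variable_name('some name') is false because of the space.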
Variables can be uncertain. If an uncertainty distribution is present in the same object as a variable, and no formula is present, then the given uncertainty is the uncertainty distribution for the variable. If both a formula and an uncertainty dictionary are present, the behaviour is undefined: the uncertainty distribution can be interpreted in more than one way, and ecoinvent, for example, does not apply any single interpretation consistently.
The Ecospold standard places no real limits on which variables can depend on other variables, so arbitrarily complex relationships are possible.
Formulas that have division by zero errors are evaluated to be zero. However, most of these cases will be rewritten during the data cleaning step.
Evaluation of parameterized datasets¶
Evaluation of parameterized datasets is done with the bw2parameters library, which in turn relies on asteval.
After making changes in a parameterized dataset, you can use the following utility function to reevaluate all formula and variable values:

ocelot.transformations.parameterization.recalculation.recalculate(dataset)¶
Recalculate parameterized relationships within a dataset.
Modifies values in place.
Creates a TolerantParameterSet, populates it with the named parameters from the dataset, and then gets the evaluation order from the graph of parameter relationships. After reevaluating all named parameters, creates an Interpreter with named parameters and all of numpy in its namespace. This interpreter is used to evaluate all other formulas in the dataset.
Formulas that divide by zero are evaluated to zero.
Returns the modified dataset.
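The idea of evaluating named parameters in dependency order can be illustrated with a small standard-library sketch. It does not use bw2parameters or asteval, and it is not the code Ocelot runs: names_in and evaluate_parameters are hypothetical helpers, graphlib requires Python 3.9 or later, and plain eval stands in for the interpreter only because the example input is trusted.

```python
import ast
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def names_in(formula):
    """Return the set of variable names referenced in a formula string."""
    tree = ast.parse(formula, mode='eval')
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def evaluate_parameters(parameters):
    """Evaluate named parameters in dependency order (toy stand-in).

    `parameters` has the form {'name': {'amount': number, 'formula': string}},
    where 'formula' is optional.
    """
    known = set(parameters)
    # Map each parameter to the named parameters its formula depends on.
    graph = {
        name: (names_in(obj['formula']) & known) if 'formula' in obj else set()
        for name, obj in parameters.items()
    }
    values = {}
    # static_order() yields dependencies before the parameters that use them.
    for name in TopologicalSorter(graph).static_order():
        obj = parameters[name]
        if 'formula' in obj:
            try:
                values[name] = eval(obj['formula'], {'__builtins__': {}}, values)
            except ZeroDivisionError:
                values[name] = 0  # division by zero evaluates to zero
        else:
            values[name] = obj['amount']
    return values


example = {
    'a': {'amount': 2},
    'b': {'amount': 0, 'formula': 'a * 3'},
    'c': {'amount': 1, 'formula': 'b / (a - 2)'},
}
result = evaluate_parameters(example)
```

In this example c divides by zero, so it ends up as 0, while b is recalculated from a.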
You may also be interested in this utility function for extracting parameters:

ocelot.transformations.parameterization.recalculation.extract_named_parameters(dataset)¶
Extract named parameters from dataset.
Each named parameter must have a name, and should have either a numeric value (amount) or a formula string. Parameters without names (variable) are not extracted, as they don’t contribute to dataset recalculation; they only get updated afterwards.
Returns a dictionary with form: {'name': {'amount': number, 'formula': string}}.
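To make the returned form concrete, here is a minimal sketch that builds such a dictionary from a flat list of objects. The function extract_named and the flat input list are assumptions made for illustration; how the real function traverses a dataset is not shown here.

```python
def extract_named(objects):
    """Collect named objects into {'name': {'amount': ..., 'formula': ...}}.

    `objects` is a flat list of exchange- or parameter-like dictionaries
    (a simplification chosen for this sketch).
    """
    named = {}
    for obj in objects:
        if 'variable' not in obj:
            continue  # objects without a name are skipped
        entry = {}
        if 'amount' in obj:
            entry['amount'] = obj['amount']
        if 'formula' in obj:
            entry['formula'] = obj['formula']
        named[obj['variable']] = entry
    return named


extracted = extract_named([
    {'variable': 'a', 'amount': 2},
    {'amount': 5, 'formula': 'a * 2'},            # no name: not extracted
    {'variable': 'b', 'amount': 4, 'formula': 'a * 2'},
])
```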
Implicit references¶
To make things extra spicy, some variables can be implicit, and instead of being given a name, they are referred to by the id of their containing reference element. So, the formula Ref('aaaaaaaabbbbccccddddeeeeeeeeeeee') means get the numeric value (amount) of the exchange whose id is aaaaaaaabbbbccccddddeeeeeeeeeeee, and substitute in that amount. Datasets with these implicit variables only occur four times in ecoinvent 3.2 and three times in ecoinvent 3.3. Implicit variables can have three forms:
- Ref('some id'): Get amount value for exchange or parameter with id some id.
- Ref('some id', 'ProductionVolume'): Get production volume for exchange with id some id.
- Ref('some id', 'some other id'): Get amount for property with id some other id in exchange some id. This isn’t used in ecoinvent 3.2 or 3.3, and isn’t supported in the current version of Ocelot.
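The three forms can be told apart with a regular expression. The sketch below is an illustration only: REF_PATTERN and parse_refs are hypothetical names, and the pattern assumes single-quoted ids exactly as written in the examples above.

```python
import re

# Matches Ref('id') and Ref('id', 'second'); assumes single-quoted arguments.
REF_PATTERN = re.compile(r"Ref\('([^']+)'(?:\s*,\s*'([^']+)')?\)")


def parse_refs(formula):
    """Return a list of (kind, id, second_id) tuples for each Ref(...) in `formula`."""
    found = []
    for match in REF_PATTERN.finditer(formula):
        target, second = match.group(1), match.group(2)
        if second is None:
            found.append(('amount', target, None))
        elif second == 'ProductionVolume':
            found.append(('production volume', target, None))
        else:
            found.append(('property', target, second))
    return found


refs = parse_refs("Ref('abc') * 2 + Ref('def', 'ProductionVolume')")
```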
A cleanup function will replace these implicit relationships with named variables.

ocelot.transformations.parameterization.implicit_references.replace_implicit_references(data)¶
Replace Ref( with actual variables.
Uses existing variables if possible, or else creates new variables in the elements that are referred to.
Generic transformations for parameters and formulas¶
After replacing implicit references (see above), we manually fix a couple of known problems in certain formula strings, such as numbers with leading zeros that are not understood by Python.

ocelot.transformations.parameterization.known_ecoinvent_issues.fix_known_bad_formula_strings(data)¶
Change certain known bad text elements in formulas.
The Ecospold 2 formula syntax is similar to Python in some ways, but we still need to use several functions to get formulas that Python can understand. Ocelot is still not 100% compatible with the entire Ecospold 2 formula spec.

ocelot.transformations.parameterization.python_compatibility.lowercase_all_parameters(data)¶
Convert all formulas and parameters to lower case.
Ecoinvent formulas and variable names are case-insensitive, and often provided in many variants, e.g. clinker_PV and clinker_pv. There are too many of these to fix manually, so we use a sledgehammer approach to guarantee consistency within datasets.

ocelot.transformations.parameterization.python_compatibility.fix_math_formulas(data)¶
Fix some special cases in formulas needed for correct parsing.

ocelot.transformations.parameterization.python_compatibility.replace_reserved_words(data)¶
Replace Python reserved words in variable names and formulas.
For variable names, this is relatively simple: we just check whether the variable name is a Python reserved word. For formulas, we use the check_and_fix_formula function.
Changes datasets in place.
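The variable-name half of this check can be sketched with the standard keyword module. The function rename_if_reserved and the trailing-underscore suffix are assumptions for illustration; the replacement Ocelot actually uses is not specified here, and formulas are not handled by this sketch.

```python
import keyword


def rename_if_reserved(name, suffix='_'):
    """Append `suffix` to `name` if it is a Python reserved word (hypothetical rule)."""
    return name + suffix if keyword.iskeyword(name) else name


renamed = [rename_if_reserved(n) for n in ('lambda', 'yield', 'clinker_pv')]
```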
Finally, in cases where we can’t fix problems with formulas, we remove them from the dataset.

ocelot.transformations.parameterization.python_compatibility.delete_unparsable_formulas(data)¶
Uses the AST parser to find unparsable formulas, which are deleted.
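A parse check of this kind can be sketched with the standard ast module. The helpers is_parsable and drop_unparsable are hypothetical and only show the general technique on a single dictionary; they are not the Ocelot implementation.

```python
import ast


def is_parsable(formula):
    """Return True if `formula` parses as a single Python expression."""
    try:
        ast.parse(formula, mode='eval')
    except SyntaxError:
        return False
    return True


def drop_unparsable(obj):
    """Delete the 'formula' key from `obj` in place if it cannot be parsed."""
    if 'formula' in obj and not is_parsable(obj['formula']):
        del obj['formula']
    return obj


good = drop_unparsable({'amount': 1, 'formula': 'some_name * 2'})
bad = drop_unparsable({'amount': 1, 'formula': '2 * (3 +'})
```

A number with a leading zero, such as 01 * 2, also fails this check in Python 3, which is why such strings need to be fixed beforehand if the formula is to be kept.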
Production volumes¶
Production volumes are specified for exchanges which produce reference product and allocatable byproduct flows. These volumes are used only to calculate the contribution of different transforming activities to markets. As such, production volumes are fixed during the evaluation of a system model in Ocelot. In order to stop an evaluation of the dataset's formulas and variables from changing the value of the production volume, we move all such parameterization information to a new parameter, outside of the production volume definition.

ocelot.transformations.parameterization.production_volumes.create_pv_parameters(dataset)¶
Remove all production volume parameterization.
Production volumes are fixed, like reference production exchange amounts. This function will do one of three things:
- If there is no formula or variable in the production volume, do nothing.
- If there is only a formula, delete the formula.
- If there is a variable, move the variable to a new parameter.
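The three cases can be sketched as a function on a single production volume dictionary. Everything here is illustrative: move_pv_parameterization is a hypothetical name, the shape of the new parameter is assumed, and carrying the formula along with the variable is also an assumption.

```python
def move_pv_parameterization(pv):
    """Strip parameterization from a production volume dictionary `pv` in place.

    Returns a new parameter dictionary if a variable was moved, else None.
    """
    formula = pv.pop('formula', None)    # a formula never stays in `pv`
    variable = pv.pop('variable', None)
    if variable is None:
        return None                      # cases 1 and 2: nothing to move
    # Case 3: the variable moves to a new parameter (assumed shape).
    parameter = {'variable': variable, 'amount': pv.get('amount')}
    if formula is not None:
        parameter['formula'] = formula   # assumption: formula travels with the variable
    return parameter


plain = {'amount': 10}
only_formula = {'amount': 10, 'formula': 'a * 2'}
with_variable = {'amount': 10, 'variable': 'clinker_pv'}
moved = [move_pv_parameterization(pv) for pv in (plain, only_formula, with_variable)]
```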
Uncertainty distributions¶
Each uncertainty distribution in Ocelot is parsed and manipulated using a specific class. However, most of the time it is more convenient to use one of the following generic functions, which are not distribution-specific:

ocelot.transformations.uncertainty.scale_exchange(exchange, factor)¶
Scale an exchange and its uncertainty by a constant numeric factor.
Modifies the exchange in place. Returns the modified exchange.

ocelot.transformations.uncertainty.adjust_pedigree_matrix_time(ds, exchange, year)¶
As each uncertainty distribution class provides the same API, you can also use the get_uncertainty_class function to get the correct distribution for an exchange, and then call a class method, e.g. for any exchange exc:
exc = get_uncertainty_class(exc).repair(exc)
Note that this also works on exchanges which don’t have an uncertainty dictionary; the NoUncertainty class will still do the right thing (which is normally nothing :).
Uncertainty distribution classes¶

class ocelot.transformations.uncertainty.distributions.NoUncertainty¶

static recalculate(obj)¶
Adjusting pedigree matrix values for no uncertainty has no effect.

static repair(obj)¶
No-op for no uncertainty.

static rescale(obj, factor)¶
Rescale uncertainty distribution by a numeric factor.

classmethod sample(obj, size=1)¶
Draw size samples from this uncertainty distribution.

classmethod to_stats_arrays(obj)¶
Returns a stats_arrays compatible dictionary.

class ocelot.transformations.uncertainty.distributions.Undefined¶
Undefined uncertainty distribution.
This distribution has an uncertainty dictionary, including minimum and maximum values. However, as there is no given way to understand these values, they are not checked or used in Ocelot.

distribution¶
alias of UndefinedUncertainty

static rescale(obj, factor)¶
Rescale uncertainty distribution by a numeric factor.

classmethod to_stats_arrays(obj)¶
Returns a stats_arrays compatible dictionary.

class ocelot.transformations.uncertainty.distributions.Lognormal¶
Lognormal distribution, defined by the mean (\(\mu\), called mu) and variance (\(\sigma^{2}\), called variance) of the distribution’s natural logarithm.

static recalculate(obj)¶
Recalculate uncertainty values based on new pedigree matrix values.

static repair(obj, fix_extremes=True)¶
Fix some common failures in lognormal distributions.
obj is an object with a lognormal uncertainty distribution.
If fix_extremes, will adjust variance values which are almost physically impossible.
- If mean is negative, set to positive, and add negative = True.
- Make mean the same as amount, and set mu to log(amount).
- Resolve any conflicts between variance and variance with pedigree matrix by preferring values in variance with pedigree uncertainty and pedigree matrix.
- If fix_extremes, adjust clearly wrong uncertainties, using arbitrary rules I just made up:
  - If 1 <= variance <= e, then the variance is set to ln(variance).
  - If the variance is greater than e, then the variance is set to 0.25.
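The two fix_extremes rules can be written as a small function. This is a sketch of the rules as stated in the list above, not the code Ocelot runs; fix_extreme_variance is a hypothetical name.

```python
import math


def fix_extreme_variance(variance):
    """Apply the two `fix_extremes` rules to a lognormal variance."""
    if variance > math.e:
        return 0.25                # larger than e: replaced by a fixed value
    if variance >= 1:
        return math.log(variance)  # between 1 and e: take the natural log
    return variance                # below 1: left unchanged


fixed = [fix_extreme_variance(v) for v in (0.04, 2.0, 10.0)]
```

Values below 1 pass through untouched, so only implausibly large variances are changed.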

static rescale(obj, factor)¶
Rescale uncertainty distribution by a numeric factor.

classmethod to_stats_arrays(obj)¶
Returns a stats_arrays compatible dictionary.
As negative lognormal distributions are not defined using the normal distribution functions, this method sets a negative flag. stats_arrays will adjust any results to have the correct sign.
Uses the standard deviation instead of the variance for compatibility with scipy and numpy.

class ocelot.transformations.uncertainty.distributions.Normal¶
Normal distribution, defined by mean and variance.

static recalculate(obj)¶
TODO: This is currently not functioning correctly.
Use new pedigree matrix values to adjust the variance based on "The application of the pedigree approach to the distributions foreseen in ecoinvent v3" by Müller, et al.
Adjusting the pedigree matrix for the normal distribution should lead to the same change in the coefficient of variation as it would for the lognormal distribution.
For the lognormal distribution, the coefficient of variation is defined by:
\[CV = \sqrt{e^{\sigma^{2}} - 1}\]
For the normal distribution, the coefficient of variation is simply \(\sigma / \mu\). Additionally, we note that:
- Recalculating the pedigree matrix should not change the mean, i.e. \(\mu\).
- The pedigree matrix factors operate directly on the variance of the lognormal, so no manipulation is needed on that score.
So, our calculation algorithm is:
- Find the difference in variance if the recalculation was applied to the lognormal distribution
- Find the relative change in the coefficient of variation
- Calculate the new variance with pedigree matrix
\[CV_{ratio} = \frac{\sqrt{e^{\sigma_{pm}^{2}} - 1}}{\sqrt{e^{\sigma_{withoutpm}^{2}} - 1}}\]
\[\sigma_{withoutpm} = 0\]
\[CV_{ratio} = \sqrt{e^{\sigma_{pm}^{2}} - 1}\]
\[\frac{\sigma_{new}}{\mu_{new}} = \frac{\sigma_{old}}{\mu_{old}} CV_{ratio}\]
\[\mu_{new} = \mu_{old}\]
\[\sigma_{new}^{2} = \sigma_{old}^{2} ( e^{\sigma_{pm}^{2}} - 1 )\]

static repair(obj)¶
Fix some common failures in normal distributions.
obj is an object with a normal uncertainty distribution.
- Make mean the same as amount
- Resolve any conflicts between variance and variance with pedigree matrix by preferring values in variance with pedigree uncertainty and pedigree matrix

static rescale(obj, factor)¶
Rescale uncertainty distribution by a numeric factor.
Following Müller et al, rescaling should preserve the coefficient of variation, i.e. \(\sigma / \mu\). We are given the original variance, \(\sigma^{2}\). Therefore, we can find the new variance using:
\[\frac{\sigma_{old}}{\mu_{old}} = \frac{\sigma_{new}}{\mu_{new}}\]
\[\frac{\sigma_{old}^{2}}{\mu_{old}^{2}} = \frac{\sigma_{new}^{2}}{\mu_{new}^{2}}\]
\[\sigma_{new}^{2} = \frac{\mu_{new}^{2}}{\mu_{old}^{2}} \sigma_{old}^{2}\]
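As a worked example of the last equation, the sketch below rescales a mean and variance while keeping the ratio \(\sigma / \mu\) fixed. The function rescale_normal is hypothetical and works on plain numbers rather than on an exchange; it assumes the original mean is not zero.

```python
import math


def rescale_normal(mean, variance, factor):
    """Scale the mean by `factor` and keep sigma / mu constant (mean must be nonzero)."""
    new_mean = mean * factor
    new_variance = (new_mean / mean) ** 2 * variance
    return new_mean, new_variance


old_mean, old_variance = 10.0, 4.0
new_mean, new_variance = rescale_normal(old_mean, old_variance, 3.0)

# The ratio sigma / mu before and after rescaling.
ratio_before = math.sqrt(old_variance) / old_mean
ratio_after = math.sqrt(new_variance) / new_mean
```

Because the new mean is the old mean times the factor, the variance is simply multiplied by the factor squared.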


class ocelot.transformations.uncertainty.distributions.Triangular¶
Triangular distribution, defined by minimum, mode, and maximum.

static recalculate(obj)¶
This is currently a no-op, as pedigree matrices are not used for this distribution. However, it would be nice to have it in the future for completeness.

static repair(obj)¶
Make sure the provided values are a valid triangular distribution.
- Set mode to amount.
- Erases uncertainty if minimum == maximum == mode.
- Flips minimum and maximum if necessary.
- Raises ValueError if mode is outside (minimum, maximum)
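The repair steps listed above can be sketched on plain numbers. This is an illustration under assumptions: repair_triangular is a hypothetical function, returning None stands in for erasing the uncertainty, and the bounds are treated as inclusive.

```python
def repair_triangular(amount, minimum, maximum):
    """Return a valid (minimum, mode, maximum) tuple, or None if there is no spread."""
    mode = amount                        # the mode is set to the amount
    if minimum == maximum == mode:
        return None                      # degenerate case: no uncertainty left
    if minimum > maximum:
        minimum, maximum = maximum, minimum  # flip swapped bounds
    if not minimum <= mode <= maximum:
        raise ValueError("mode is outside (minimum, maximum)")
    return minimum, mode, maximum


flipped = repair_triangular(2, 5, 1)
erased = repair_triangular(3, 3, 3)
```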

static rescale(obj, factor)¶
Rescale the exchange by a constant numeric factor.

classmethod to_stats_arrays(obj)¶
Returns a stats_arrays compatible dictionary.

class ocelot.transformations.uncertainty.distributions.Uniform¶
Uniform distribution, defined by minimum and maximum.

distribution¶
alias of UniformUncertainty

static recalculate(obj)¶
This is currently a no-op, as pedigree matrices are not used for this distribution. However, it would be nice to have it in the future for completeness.

static repair(obj)¶
Make sure the provided values are a valid uniform distribution.
- Erases uncertainty if minimum == maximum == amount.
- Flips minimum and maximum if necessary.
- Raises ValueError if mode is outside (minimum, maximum)
- If amount is not close to halfway between minimum and maximum, change to triangular distribution.

static rescale(obj, factor)¶
Rescale the exchange by a constant numeric factor.

classmethod to_stats_arrays(obj)¶
Returns a stats_arrays compatible dictionary.

Pedigree Matrix¶
Pedigree matrices are stored as dictionaries (see the data format). Currently, Ocelot only adjusts the temporal correlation, to bring datasets to the reference year, but other adjustments are possible.
To adjust uncertainty values for a new pedigree matrix, call the method recalculate for the correct uncertainty distribution, i.e. one of the following:
# Works always
get_uncertainty_class(exc).recalculate(exc)
# If you know the specific distribution
Lognormal.recalculate(exc)
To adjust the pedigree matrix value for temporal correlation to a given reference year, use the following utility function (which will already recalculate the uncertainty values):

ocelot.transformations.uncertainty.adjust_pedigree_matrix_time(ds, exchange, year)
Ocelot includes pedigree matrix values for the original pedigree matrix from ecoinvent 2, as well as the revised values from Empirically based uncertainty factors for the pedigree matrix in ecoinvent. However, there is not yet an API to use these updated factors.