Data Storage

Different storage backends for the data of a simulation.

class openfisca_core.data_storage.InMemoryStorage(is_eternal=False)[source]

Storing and retrieving calculated vectors in memory.

Parameters:

is_eternal (bool) – Whether the storage is eternal, that is, whether it holds values that do not vary over time.

delete(period=None)[source]

Delete the data for the specified Period from memory.

Parameters:

period (None | Period) – The Period for which data should be deleted.

Return type:

None

Note

If period is not specified, all data will be deleted.
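The two modes of delete can be pictured with a small dict-backed stand-in. This is only a sketch: ToyStorage and its string period keys are hypothetical and are not the openfisca_core implementation, which may, for example, also handle sub-periods.

```python
# Toy stand-in for illustration only; not the openfisca_core implementation.
class ToyStorage:
    def __init__(self):
        self._arrays = {}  # maps a period key to its stored values

    def put(self, value, period):
        self._arrays[period] = value

    def get(self, period):
        return self._arrays.get(period)  # None when nothing is stored

    def delete(self, period=None):
        if period is None:
            self._arrays = {}  # no period given: drop everything
        else:
            self._arrays.pop(period, None)  # drop only the requested period


storage = ToyStorage()
storage.put([1, 2, 3], "2017")
storage.put([4, 5, 6], "2018")

# Deleting one period leaves the other untouched.
storage.delete("2017")
after_single = (storage.get("2017"), storage.get("2018"))

# Deleting without a period empties the storage.
storage.delete()
after_all = (storage.get("2017"), storage.get("2018"))
```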

Examples

>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> storage = data_storage.InMemoryStorage()
>>> value = numpy.array([1, 2, 3])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> storage.put(value, period)
>>> storage.get(period)
array([1, 2, 3])
>>> storage.delete(period)
>>> storage.get(period)
>>> storage.put(value, period)
>>> storage.delete()
>>> storage.get(period)

get(period=None)[source]

Retrieve the data for the specified Period from memory.

Parameters:

period (None | Period) – The Period for which data should be retrieved.

Returns:
  • None – If no data is available.

  • EnumArray – The data for the specified Period.

  • ndarray[generic] – The data for the specified Period.

Return type:

None | ndarray[Any, dtype[generic]]

Examples

>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> storage = data_storage.InMemoryStorage()
>>> value = numpy.array([1, 2, 3])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> storage.put(value, period)
>>> storage.get(period)
array([1, 2, 3])

get_known_periods()[source]

List of storage’s known periods.

Returns:

KeysView[Period] – A view of the storage’s known periods.

Return type:

KeysView[Period]
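A KeysView is a set-like view over a dictionary's keys rather than a list. The sketch below uses a plain dict with string keys as a stand-in for the storage, which is an assumption suggested by the dict_keys output in the examples. Whether a view obtained from a real storage stays in sync after later put or delete calls depends on the storage internals, so take a list() snapshot when a stable result is needed.

```python
from collections.abc import KeysView

# A plain dict stands in for the storage's internal mapping (assumption).
periods_to_values = {}
known = periods_to_values.keys()  # a view, not a copy

is_keys_view = isinstance(known, KeysView)
size_before = len(known)

# The view reflects changes made to the same dict afterwards.
periods_to_values["2017"] = [1, 2, 3]
size_after = len(known)
is_member = "2017" in known

# A list() snapshot is independent of later changes.
snapshot = list(known)
periods_to_values["2018"] = [4, 5, 6]
```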

Examples

>>> from openfisca_core import data_storage, periods
>>> storage = data_storage.InMemoryStorage()
>>> storage.get_known_periods()
dict_keys([])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> storage.put([], period)
>>> storage.get_known_periods()
dict_keys([Period(('year', Instant((2017, 1, 1)), 1))])

get_memory_usage()[source]

Memory usage of the storage.

Returns:

MemoryUsage – A dictionary representing the storage’s memory usage.

Return type:

MemoryUsage
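The three keys shown in the example output can be read as a count of arrays, a total size in bytes, and the size of one cell. The following sketch computes such a summary with the standard library only. The helper memory_usage is hypothetical, and it assumes that every stored array has the same length and item size; the library's own computation may differ.

```python
import math
from array import array


def memory_usage(arrays):
    """Summarise a dict of equally sized arrays (hypothetical helper)."""
    if not arrays:
        # Nothing stored: the cell size is undefined, hence NaN.
        return {"nb_arrays": 0, "total_nb_bytes": 0, "cell_size": math.nan}
    first = next(iter(arrays.values()))
    nbytes = first.itemsize * len(first)  # bytes used by one array
    return {
        "nb_arrays": len(arrays),
        "total_nb_bytes": nbytes * len(arrays),
        "cell_size": first.itemsize,
    }


empty = memory_usage({})
usage = memory_usage(
    {
        "2017": array("d", [1.0, 2.0, 3.0]),
        "2018": array("d", [4.0, 5.0, 6.0]),
    }
)
```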

Examples

>>> from openfisca_core import data_storage
>>> storage = data_storage.InMemoryStorage()
>>> storage.get_memory_usage()
{'nb_arrays': 0, 'total_nb_bytes': 0, 'cell_size': nan}

is_eternal: bool

Whether the storage is eternal.
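One way to picture an eternal storage is as a storage that ignores the requested period and keeps a single entry valid for all time. The sketch below makes that assumption explicit. ToyStorage, the ETERNITY key, and the string periods are hypothetical names used for illustration and are not taken from openfisca_core.

```python
ETERNITY = "eternity"  # hypothetical key standing for "all of time"


class ToyStorage:
    """Toy stand-in; not the openfisca_core implementation."""

    def __init__(self, is_eternal=False):
        self.is_eternal = is_eternal
        self._arrays = {}

    def _key(self, period):
        # An eternal storage maps every period onto the same key.
        return ETERNITY if self.is_eternal else period

    def put(self, value, period):
        self._arrays[self._key(period)] = value

    def get(self, period=None):
        return self._arrays.get(self._key(period))


regular = ToyStorage()
regular.put([1, 2, 3], "2017")
regular_other_year = regular.get("2018")  # nothing stored for 2018

eternal = ToyStorage(is_eternal=True)
eternal.put([1, 2, 3], "2017")
eternal_other_year = eternal.get("2018")  # same data for any period
```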

put(value, period)[source]

Store the specified data in memory for the specified Period.

Parameters:
  • value – The data to store.

  • period – The Period for which the data should be stored.

Return type:

None

Examples

>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> storage = data_storage.InMemoryStorage()
>>> value = numpy.array([1, "2", "salary"])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> storage.put(value, period)
>>> storage.get(period)
array(['1', '2', 'salary'], dtype='<U21')

class openfisca_core.data_storage.OnDiskStorage(storage_dir, is_eternal=False, preserve_storage_dir=False)[source]

Storing and retrieving calculated vectors on disk.

Parameters:
  • storage_dir (str) – Path to store calculated vectors.

  • is_eternal (bool) – Whether the storage is eternal.

  • preserve_storage_dir (bool) – Whether to preserve the storage directory.

_decode_file(file)[source]

Decode a file by loading its contents as a numpy array.

Parameters:

file (str) – Path to the file to be decoded.

Returns:
  • EnumArray – Representing the data in the file.

  • ndarray[generic] – Representing the data in the file.

Return type:

ndarray[Any, dtype[generic]]

Note

If the file is associated with Enum values, the array is converted back to an EnumArray object.
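The round trip described in the note can be sketched with the standard library alone: enum members are stored as integer indices and mapped back to members on decoding. The encode and decode helpers are hypothetical, and plain lists replace numpy arrays and EnumArray here.

```python
import enum


class Housing(enum.Enum):
    OWNER = "Owner"
    TENANT = "Tenant"
    FREE_LODGER = "Free lodger"
    HOMELESS = "Homeless"


def encode(members, enum_class):
    """Replace each member by its position in the enum (hypothetical helper)."""
    order = list(enum_class)
    return [order.index(member) for member in members]


def decode(indices, enum_class=None):
    """Map indices back to members when an enum is associated with the data."""
    if enum_class is None:
        return indices  # plain data is returned unchanged
    order = list(enum_class)
    return [order[index] for index in indices]


stored = encode([Housing.TENANT, Housing.HOMELESS], Housing)
restored = decode(stored, Housing)
plain = decode([1, 2, 3])
```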

Examples

>>> import tempfile
>>> import numpy
>>> from openfisca_core import data_storage, indexed_enums, periods
>>> class Housing(indexed_enums.Enum):
...     OWNER = "Owner"
...     TENANT = "Tenant"
...     FREE_LODGER = "Free lodger"
...     HOMELESS = "Homeless"
>>> array = numpy.array([1])
>>> value = indexed_enums.EnumArray(array, Housing)
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put(value, period)
...     storage._decode_file(storage._files[period])
EnumArray([Housing.TENANT])

delete(period=None)[source]

Delete the data for the specified period from disk.

Parameters:

period (None | Period) – The period for which data should be deleted. If not specified, all data will be deleted.

Return type:

None

Examples

>>> import tempfile
>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> value = numpy.array([1, 2, 3])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put(value, period)
...     storage.get(period)
array([1, 2, 3])
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put(value, period)
...     storage.delete(period)
...     storage.get(period)
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put(value, period)
...     storage.delete()
...     storage.get(period)

get(period=None)[source]

Retrieve the data for the specified period from disk.

Parameters:

period (None | Period) – The period for which data should be retrieved.

Returns:
  • None – If no data is available.

  • EnumArray – Representing the data for the specified period.

  • ndarray[generic] – Representing the data for the specified period.

Return type:

None | ndarray[Any, dtype[generic]]

Examples

>>> import tempfile
>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> value = numpy.array([1, 2, 3])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put(value, period)
...     storage.get(period)
array([1, 2, 3])

get_known_periods()[source]

List of storage’s known periods.

Returns:

KeysView[Period] – A view of the storage’s known periods.

Return type:

KeysView[Period]

Examples

>>> import tempfile
>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.get_known_periods()
dict_keys([])
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put([], period)
...     storage.get_known_periods()
dict_keys([Period(('year', Instant((2017, 1, 1)), 1))])

is_eternal: bool

Whether the storage is eternal.

preserve_storage_dir: bool

Whether to preserve the storage directory.

put(value, period)[source]

Store the specified data on disk for the specified period.

Parameters:
  • value – The data to store.

  • period – The period for which the data should be stored.

Return type:

None

Examples

>>> import tempfile
>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> value = numpy.array([1, "2", "salary"])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> with tempfile.TemporaryDirectory() as directory:
...     storage = data_storage.OnDiskStorage(directory)
...     storage.put(value, period)
...     storage.get(period)
array(['1', '2', 'salary'], dtype='<U21')

restore()[source]

Restore the storage from disk.

Return type:

None
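Restoring amounts to rebuilding the in-memory index of files from what is already present in the storage directory. The sketch below shows the idea with a toy class that writes JSON files named after the period. ToyDiskStorage, the JSON format, and the string periods are assumptions made for illustration; the real class stores numpy arrays and parses Period objects.

```python
import json
import os
import tempfile


class ToyDiskStorage:
    """Toy stand-in; not the openfisca_core implementation."""

    def __init__(self, storage_dir):
        self.storage_dir = storage_dir
        self._files = {}  # maps a period key to the path of its file

    def put(self, value, period):
        path = os.path.join(self.storage_dir, f"{period}.json")
        with open(path, "w") as handle:
            json.dump(value, handle)
        self._files[period] = path

    def get(self, period):
        path = self._files.get(period)
        if path is None:
            return None
        with open(path) as handle:
            return json.load(handle)

    def restore(self):
        # Rebuild the index by scanning the directory for stored files.
        self._files = {}
        for name in os.listdir(self.storage_dir):
            stem, extension = os.path.splitext(name)
            if extension == ".json":
                self._files[stem] = os.path.join(self.storage_dir, name)


with tempfile.TemporaryDirectory() as directory:
    first = ToyDiskStorage(directory)
    first.put([1, 2, 3], "2017")

    # A second storage on the same directory starts with an empty index.
    second = ToyDiskStorage(directory)
    before_restore = second.get("2017")
    second.restore()
    after_restore = second.get("2017")
```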

Examples

>>> import tempfile
>>> import numpy
>>> from openfisca_core import data_storage, periods
>>> value = numpy.array([1, 2, 3])
>>> instant = periods.Instant((2017, 1, 1))
>>> period = periods.Period(("year", instant, 1))
>>> directory = tempfile.TemporaryDirectory()
>>> storage1 = data_storage.OnDiskStorage(directory.name)
>>> storage1.put(value, period)
>>> storage1._files
{Period(('year', Instant((2017, 1, 1)), 1)): '.../2017.npy'}
>>> storage2 = data_storage.OnDiskStorage(directory.name)
>>> storage2._files
{}
>>> storage2.restore()
>>> storage2._files
{Period((<DateUnit.YEAR: 'year'>, Instant((2017, 1, 1.../2017.npy'}
>>> directory.cleanup()

storage_dir: str

Path to store calculated vectors.