
W-Data format for superfluid dynamics and the W-SLDA Toolkit.


W-data Format


This project contains tools for working with and manipulating the W-data format used for analyzing superfluid data generated by the W-SLDA Toolkit.

This format was originally derived from the W-SLDA project led by Gabriel Wlazlowski, where it is documented.

Here we augment this format slightly to facilitate working with Python.

Generalizations

The original format required a .wtxt file containing most of the relevant metadata. Here we generalize the format so that this information can instead be specified in the data files themselves, which may be stored in the NPY format.

Installation

python3 -m pip install wdata

Basic Usage

The W-data format stores arrays representing physical quantities such as the density (real), pairing field (complex), and currents (3-component real vectors) on a regular lattice of shape Nxyz = (Nx, Ny, Nz) at Nt times.
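As a rough illustration of these array shapes, here is a sketch using plain NumPy only. It assumes time is the leading axis and that vector components come directly after it, as in the longer example later in this document. The variable names are placeholders.

```python
import numpy as np

Nt = 10
Nxyz = (4, 8, 16)

# Real scalar field: one value per lattice site per frame.
density = np.zeros((Nt,) + Nxyz)

# Complex scalar field: same shape, complex dtype.
delta = np.zeros((Nt,) + Nxyz, dtype=complex)

# 3-component real vector field: the component axis follows the time axis.
current = np.zeros((Nt, 3) + Nxyz)

for name, arr in [("density", density), ("delta", delta), ("current", current)]:
    print(name, arr.shape, arr.dtype)
```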

The data is represented by two classes:

  • Var: These are the data variables such as density, currents, etc. with additional metadata (see the wdata.io.IVar interface for details):

    • Var.name: Name of variable as it will appear in VisIt for example.
    • Var.data: The actual data as a NumPy array.
    • Var.description: Description.
    • Var.filename: The file where the data is stored on disk.
    • Var.unit: Unit (mainly for use in VisIt; it does not affect the data).

    Additionally, the following can be provided explicitly, or inferred from Var.data when that is given:

    • Var.descr: NumPy data descriptor (float, complex, etc.)
    • Var.shape: Shape of the array.
  • WData: This represents a complete dataset. Some relevant attributes are (see wdata.io.IWData for details):

    • WData.infofile: Location of the infofile (see below). This is where the metadata will be stored or loaded from.
    • WData.variables: List of Var variables.
    • WData.xyz: Abscissa (x, y, z) shaped so that they can be used with broadcasting, e.g. r = np.sqrt(x**2+y**2+z**2).
    • WData.t: Array of times.
    • WData.dim: Dimension of dataset, e.g. dim==1 for 1D simulations, dim==3 for 3D simulations.
    • WData.aliases: Dictionary of aliases. Convenience for providing alternative data access in VisIt.
    • WData.constants: Dictionary of constants such as kF, eF.

    Normally, the WData constructor will check that the data exists. If you are missing data, you can suppress this check by calling WData(..., check_data=False) or WData.load(..., check_data=False).
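The broadcasting behaviour of the abscissa can be sketched with NumPy alone. Here `xyz` is a hypothetical stand-in for WData.xyz, built the same way as in the longer example later in this document (a sparse meshgrid with 'ij' indexing). The exact arrays stored by WData are not shown here.

```python
import numpy as np

Nxyz = (4, 8, 16)
dxyz = (0.3, 0.2, 0.1)

# Stand-in for WData.xyz (an assumption): one array per axis, each with
# singleton dimensions everywhere except along its own axis.
xyz = np.meshgrid(
    *[(np.arange(N) - N / 2) * dx for N, dx in zip(Nxyz, dxyz)],
    sparse=True,
    indexing="ij",
)
x, y, z = xyz
print(x.shape, y.shape, z.shape)

# Broadcasting expands the three thin arrays to the full lattice shape.
r = np.sqrt(x**2 + y**2 + z**2)
print(r.shape)
```

Only the result `r` occupies the full Nx*Ny*Nz storage; the abscissa themselves stay small.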

Minimal Example:

Here is a minimal set of data:

import numpy as np
np.random.seed(3)
from wdata.io import WData, Var

Nt = 10 
Nxyz = (4, 8, 16)
dxyz = (0.3, 0.2, 0.1)
dt = 0.1
Ntxyz = (Nt,) + Nxyz

density = np.random.random(Ntxyz)

data = WData(prefix='dataset', data_dir='_example_wdata',
             Nxyz=Nxyz, dxyz=dxyz,
             variables=[Var(density=density)],
             Nt=Nt)
data.save(force=True)

This will make a directory _example_wdata with infofile _example_wdata/dataset.wtxt:

$ tree _example_wdata
_example_wdata
|-- dataset.wtxt
`-- dataset_density.wdat

0 directories, 2 files
$ cat _example_wdata/dataset.wtxt
# Generated by wdata.io: [2020-12-18 06:41:29 UTC+0000 = 2020-12-17 22:41:29 PST-0800]

NX               4    # Lattice size in X direction
NY               8    #             ... Y ...
NZ              16    #             ... Z ...
DX             0.3    # Spacing in X direction
DY             0.2    #        ... Y ...
DZ             0.1    #        ... Z ...
prefix     dataset    # datafile prefix: <prefix>_<var>.<format>
datadim          3    # Block size: 1:NX, 2:NX*NY, 3:NX*NY*NZ
cycles          10    # Number Nt of frames/cycles per dataset
t0               0    # Time value of first frame
dt               1    # Time interval between frames

# variables
# tag       name    type    unit    format    # description
var      density    real    none      wdat    # density
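To make the layout of the infofile concrete, the following is a minimal sketch of a reader for the kinds of lines shown above. It is not the parser used by wdata: `parse_infofile` and `_convert` are hypothetical helpers, and the sketch only handles `key value` parameter lines and `var` lines, treating everything after `#` as a comment.

```python
def _convert(value):
    """Convert a token to int or float where possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_infofile(text):
    """Return (params, variables) parsed from the text of an infofile."""
    params = {}
    variables = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, tokenize
        if len(fields) < 2:
            continue  # blank or comment-only line
        if fields[0] == "var" and len(fields) >= 5:
            _tag, name, type_, unit, fmt = fields[:5]
            variables.append(
                {"name": name, "type": type_, "unit": unit, "format": fmt}
            )
        else:
            # Parameter names are stored lower-cased (case insensitive).
            params[fields[0].lower()] = _convert(fields[1])
    return params, variables


sample = """
NX               4    # Lattice size in X direction
DX             0.3    # Spacing in X direction
prefix     dataset    # datafile prefix: <prefix>_<var>.<format>
cycles          10    # Number Nt of frames/cycles per dataset

# variables
# tag       name    type    unit    format    # description
var      density    real    none      wdat    # density
"""
params, variables = parse_infofile(sample)
print(params)
print(variables)
```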

The data can be loaded by specifying the infofile:

from wdata.io import WData
data = WData.load('_example_wdata/dataset.wtxt')

The data could be plotted using PyVista for example (the random data will not look so good...):

import numpy as np
import pyvista as pv
from wdata.io import WData

data = WData.load('_example_wdata/dataset.wtxt')
n = data.density[0]

grid = pv.StructuredGrid(*np.meshgrid(*data.xyz))
grid["vol"] = n.flatten(order="F")
contours = grid.contour(np.linspace(n.min(), n.max(), 5))

p = pv.Plotter()
p.add_mesh(contours, scalars=contours.points[:, 2])
p.show()

The recommended way to save data is to create variables for the data, times, and abscissa, then store them:

import numpy as np
from wdata.io import WData, Var

np.random.seed(3)

Nt = 10
Nxyz = (32, 32, 32)
dxyz = (10.0/32, 10.0/32, 10.0/32)
dt = 0.1

# Abscissa.  Not strictly needed, but if you have them, pass them
# instead of Nxyz and dxyz.
t = np.arange(Nt)*dt
xyz = np.meshgrid(*[(np.arange(_N)-_N/2)*_dx
                    for _N, _dx in zip(Nxyz, dxyz)],
                  sparse=True, indexing='ij')

# Now make the WData object and save the data.
Ntxyz = (Nt,) + Nxyz
w = np.pi/t.max()
ws = [1.0 + 0.5*np.cos(w*t), 
      1.0 + 0.5*np.sin(w*t),
      1.0 + 0*t]
density = np.exp(-sum((_x[None,...].T*_w).T**2/2 for _x, _w in zip(xyz, ws)))
delta = np.random.random(Ntxyz) + np.random.random(Ntxyz)*1j - 0.5 - 0.5j
current = np.random.random((Nt, 3,) + Nxyz) - 0.5

variables = [
    Var(density=density),
    Var(delta=delta),
    Var(current=current)
]
    
data = WData(prefix='dataset2', 
             data_dir='_example_wdata/',
             xyz=xyz, t=t,
             variables=variables)
data.save()

Now load and plot the data:

import numpy as np
import pyvista as pv

from wdata.io import WData
data = WData.load(infofile='_example_wdata/dataset2.wtxt')

n = data.density[0]

grid = pv.StructuredGrid(*np.meshgrid(*data.xyz))
grid["vol"] = n.flatten(order="F")
contours = grid.contour(np.linspace(n.min(), n.max(), 5))

p = pv.Plotter()
p.add_mesh(contours, scalars=contours.points[:, 2])
p.show()

Note: the actual data is loaded into Python using memory-mapped arrays. This lets you refer to very large datasets without loading them entirely into memory: loading is delayed until a copy of the array is made. For example:

import numpy as np
from wdata.io import WData
data = WData.load(infofile='_example_wdata/dataset2.wtxt')

# At this point, the data has not been fully loaded.  You can
# work with subsets efficiently.  For example, the following will
# only load the first frame of data:

n = data.density[0]

# Beware: if you make a copy of the data, explicitly *or implicitly* then it will get
# loaded.  The following will load the full array into memory so that np.cos can do its
# computations.

sum_cos_n = np.sum(np.cos(data.density))

# If this is too big, you may want to process each slice independently.  The previous
# example could be more efficiently computed using the following loop:

sum_cos_n = sum(np.cos(_n).sum() for _n in data.density)

# The Dask package may be useful for such processing in more complicated settings.
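The same pattern can be tried without wdata, using a plain NPY file and NumPy's memory mapping. This is only a sketch of the general technique; how wdata opens its files internally is not shown here, and the file name is a placeholder.

```python
import os
import tempfile

import numpy as np

Nt, Nxyz = 5, (4, 8, 16)
rng = np.random.default_rng(3)
density = rng.random((Nt,) + Nxyz)

with tempfile.TemporaryDirectory() as tmpdir:
    filename = os.path.join(tmpdir, "density.npy")
    np.save(filename, density)

    # mmap_mode="r" maps the file read-only instead of reading it all at once.
    lazy = np.load(filename, mmap_mode="r")
    is_memmap = isinstance(lazy, np.memmap)

    # Applying np.cos to the whole array materializes a full-size result.
    sum_full = np.sum(np.cos(lazy))

    # Iterating over the leading (time) axis touches one frame at a time.
    sum_by_frame = sum(np.cos(frame).sum() for frame in lazy)

    del lazy  # release the mapping before the temporary directory is removed

print(is_memmap, np.isclose(sum_full, sum_by_frame))
```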

See Also

Developer Notes

Testing

For distribution we use poetry and for testing we use nox. To test the code:

nox

Documentation

For documentation, we use Sphinx. To build it, run:

poetry install  # Install all of the developer dependencies
poetry run make -C docs html
  • __init__(): The default behavior of autodoc is to merge the documentation of __init__ methods with the class since the user never directly calls __init__(). Keep this in mind when writing the docstrings.

Changes

0.2.0

  • Drop support for Python 3.6 and 3.7 (both are end of life; they probably still work, but we no longer test them).
  • Add support for python 3.12
  • Resolve issue #15. Allow for incomplete data in scalar types.

0.1.7

  • Resolve issue #3: Document that WData(..., check_data=False) allows one to skip check of data. (Also added better support for saving WData() objects with partial data.)

0.1.6

  • Resolve issue #10: Provide working abscissa. This allows the user to provide abscissa like x that are not equally spaced. These will be stored as data.
  • Resolve issue #14: More flexible loading, providing defaults for missing optional values, and allowing for extra new but unused values (particularly, units provided for consts).
  • Update to new W-Data format which specifies that all special parameters (nx, dx, t0, etc.) should be case insensitive.
  • Changed default value of nt to 0 so that we can load and test empty datasets.
  • Update and include poetry.lock.

0.1.5

  • Resolve issue #13: WData can now load read-only files.

0.1.4

  • Resolve issue #8. Vectors can have Nv <= dim. Also, keep Nxyz info even if dim<3: this is how plane-wave approximations are used sometimes.
  • Fixed many small bugs discovered by 100% coverage testing.
  • Pass-through kwargs from io.WData.load() etc. to constructor.
  • Added check_data flag to optionally disable testing of data.
  • Remove item-access. Use attribute access instead: data.x or getattr(data, 'x').

0.1.3

  • Address issue #4 for loading large datasets. We now use memory mapped files.
  • Started adding Sphinx documentation. Not complete (sphinxcontrib.zopeext needs updating... something is wrong.)

0.1.2

  • Fixed issue #2. datadim < 3 now works properly.
  • Started working on documentation (incomplete).
