postmd.utils package
- postmd.utils.calc_replicas_mean_std(data_arrays, ddof=0)[source]
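Compute the mean and standard deviation across replicate data arrays.
- Parameters:
data_arrays (list or np.ndarray) – a list of data arrays, one per replica.
ddof (int, optional) – the delta degrees of freedom used for the standard deviation. Defaults to 0.
- Returns:
the mean and the standard deviation across the replicas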
- Return type:
np.ndarray
Examples

    >>> import numpy as np
    >>> from postmd.utils import calc_replicas_mean_std
    >>> data_arrays = [
    ...     np.array([4.3, 5.6, 3.8, 5.1, 4.9]),  # First dataset
    ...     np.array([3.2, 4.5, 4.1, 3.7, 4.3]),  # Second dataset
    ...     np.array([5.5, 6.2, 5.9, 6.1, 5.8]),  # Third dataset
    ... ]
    >>> # Calculate the mean and std across the datasets at each index.
    >>> mean, std = calc_replicas_mean_std(data_arrays)
    >>> print(f"replicas mean: {mean}")
    replicas mean: [4.33333333 5.43333333 4.6        4.96666667 5.        ]
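The reduction in the example above can be reproduced with plain NumPy. The following is a minimal sketch, not postmd's actual implementation: the helper name `replicas_mean_std_sketch` is hypothetical, and it assumes that all replicas have the same length and that `ddof` is simply passed through to the standard deviation.

```python
import numpy as np


def replicas_mean_std_sketch(data_arrays, ddof=0):
    """Hypothetical stand-in: mean and std across replicas at each index."""
    # Stack the replicas into shape (n_replicas, n_points).
    stacked = np.asarray(data_arrays, dtype=float)
    # Reduce over the replica axis, so each output entry summarises one index.
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0, ddof=ddof)
    return mean, std


data_arrays = [
    np.array([4.3, 5.6, 3.8, 5.1, 4.9]),
    np.array([3.2, 4.5, 4.1, 3.7, 4.3]),
    np.array([5.5, 6.2, 5.9, 6.1, 5.8]),
]
mean, std = replicas_mean_std_sketch(data_arrays)
print(mean)
print(std)
```

With only a few replicas, `ddof=1` (the sample standard deviation) is often the more appropriate choice than the default `ddof=0`.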
- postmd.utils.calc_box_length(num, density=1.0, NA=None)[source]
Calculate the edge length of a cubic box that holds a given number of water molecules at a given density.
Warning
The Avogadro constant built into LAMMPS (units real or metal) is 6.02214129e23; see lammps/src/update.cpp, which sets “force->mv2d = 1.0 / 0.602214129” for both unit styles. By default, however, this function uses the Avogadro constant from scipy.constants, quoted here as 6.022140857e23 (the CODATA 2014 recommended value).
- Parameters:
num (int) – the number of water molecules.
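density (float, optional) – the density of water, in g/cm^3. Defaults to 1.0.
NA (float, optional) – the Avogadro constant, in 1/mol. Defaults to None, in which case the value from scipy.constants is used.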
- Returns:
the length of a cubic water box, in Angstrom
- Return type:
float
Examples

    >>> import postmd.utils as utils
    >>> utils.calc_box_length(1000, density=1.0)
    The length of a cubic water box for 1000 water molecules and 1.0 g/cm^3 is 31.043047 Angstrom
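The box length follows from the total mass of the molecules and the target density: L = (N · M / (N_A · ρ))^(1/3), converted from cm to Angstrom. The sketch below works through that arithmetic with the standard library only. It is an illustration, not postmd's code: the helper name `box_length_sketch` is hypothetical, and the molar mass of water (18.0154 g/mol) is an assumption, since this page does not state the value postmd uses.

```python
NA_SCIPY_DOC = 6.022140857e23  # Avogadro constant quoted above for scipy.constants
NA_LAMMPS = 6.02214129e23      # Avogadro constant quoted above for LAMMPS
WATER_MOLAR_MASS = 18.0154     # g/mol, an assumption made for this sketch


def box_length_sketch(num, density=1.0, NA=NA_SCIPY_DOC):
    """Hypothetical helper: edge length (Angstrom) of a cubic water box."""
    mass_g = num * WATER_MOLAR_MASS / NA  # total mass of the molecules, in g
    volume_cm3 = mass_g / density         # density is given in g/cm^3
    volume_A3 = volume_cm3 * 1e24         # 1 cm = 1e8 Angstrom, so 1 cm^3 = 1e24 A^3
    return volume_A3 ** (1.0 / 3.0)


length = box_length_sketch(1000, density=1.0)
shift = box_length_sketch(1000, density=1.0, NA=NA_LAMMPS) - length
print(f"box length: {length:.4f} Angstrom")
print(f"shift when using the LAMMPS constant: {shift:.2e} Angstrom")
```

For 1000 molecules, swapping between the two Avogadro constants changes the box length by less than 1e-5 Angstrom, so the choice matters mainly when results must match LAMMPS unit conversions exactly.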
- postmd.utils.create_dir(path, backup=True)[source]
Create a directory at the specified ‘path’. If the directory already exists and ‘backup’ is True, the existing directory is first renamed by appending ‘.bkXXX’, where XXX is a zero-padded counter (.bk000, .bk001, and so on).
- Parameters:
path (str) – The path of the directory to be created.
backup (bool, optional) – Whether to back up an existing directory. Defaults to True.
Examples

    >>> import os
    >>> import postmd.utils as utils
    >>>
    >>> print(os.listdir())
    ['createdir.py']
    >>> utils.create_dir("test")
    >>> print(os.listdir())  # create a new "test" dir
    ['createdir.py', 'test']
    >>> utils.create_dir("test")
    >>> print(os.listdir())  # move the original "test" dir to "test.bk000"
    ['createdir.py', 'test', 'test.bk000']
    >>> utils.create_dir("test")
    >>> print(os.listdir())  # move the original "test" dir to "test.bk001"
    ['createdir.py', 'test', 'test.bk000', 'test.bk001']
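The backup naming shown above can be mimicked with the standard library. This is a sketch of the described behaviour rather than postmd's implementation: `create_dir_sketch` is a hypothetical name, the search for the first free `.bkXXX` suffix is an assumption about how the counter is chosen, and leaving an existing directory untouched when `backup` is False is likewise assumed.

```python
import os
import tempfile


def create_dir_sketch(path, backup=True):
    """Hypothetical helper: create `path`, backing up an existing directory."""
    if os.path.isdir(path) and backup:
        # Find the first unused suffix: .bk000, .bk001, ...
        index = 0
        while os.path.exists(f"{path}.bk{index:03d}"):
            index += 1
        os.rename(path, f"{path}.bk{index:03d}")
    # Assumption: with backup=False an existing directory is left in place.
    os.makedirs(path, exist_ok=True)


# Run inside a throwaway directory so nothing in the working tree is touched.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "test")
    for _ in range(3):
        create_dir_sketch(target)
    entries = sorted(os.listdir(workdir))

print(entries)
```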
- postmd.utils.cummean(data)[source]
Calculate the cumulative (running) average.
- Parameters:
data (1d list) – the data to average cumulatively
- Returns:
the cumulative average
- Return type:
np.ndarray
Examples

    >>> import postmd.utils as utils
    >>> import numpy as np
    >>> array = np.arange(9)
    >>> print(array)
    [0 1 2 3 4 5 6 7 8]
    >>> utils.cummean(array)
    array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
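Entry k of a cumulative average is the mean of the first k values, that is, the running sum divided by the running count. A minimal NumPy sketch of that definition follows; the name `cummean_sketch` is hypothetical and the code is not taken from postmd.

```python
import numpy as np


def cummean_sketch(data):
    """Hypothetical helper: running mean of a 1-D sequence."""
    data = np.asarray(data, dtype=float)
    counts = np.arange(1, data.size + 1)  # 1, 2, ..., N
    return np.cumsum(data) / counts       # running sum / running count


result = cummean_sketch(np.arange(9))
print(result)
```

Applied to a time series from a simulation, the running mean is a quick way to judge whether an averaged quantity has converged.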
- postmd.utils.stats_mean_std_bins(x, y, bins=10, range=None)[source]
Compute the mean and standard deviation (ddof=0) of x and y in each bin, using the [scipy.stats.binned_statistic](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html) function.
- Parameters:
x – (N,) array_like. A sequence of values to be binned.
y – (N,) array_like. The data on which the statistic will be computed. This must be the same shape as x, or a set of sequences, each the same shape as x. If y is a set of sequences, the statistic will be computed on each independently.
bins (int or sequence of scalars, optional) – If bins is an int, it defines the number of equal-width bins in the given range (10 by default). If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths. Values in x that are smaller than the lowest bin edge are assigned to bin number 0, and values beyond the highest bin are assigned to bins[-1]. If the bin edges are specified, the number of bins will be nx = len(bins)-1. Defaults to 10.
range ((float, float) or [(float, float)], optional) – The lower and upper range of the bins. If not provided, range is simply (x.min(), x.max()). Values outside the range are ignored. Defaults to None.
- Returns:
(x_mean, x_std, y_mean, y_std)
- Return type:
tuple
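Because this page gives no example for this function, here is a hedged sketch of how the four returned arrays could be obtained from scipy.stats.binned_statistic. The helper name `mean_std_bins_sketch` and its `bin_range` argument are hypothetical, and the real function may differ in details such as how empty bins are handled.

```python
import numpy as np
from scipy.stats import binned_statistic


def mean_std_bins_sketch(x, y, bins=10, bin_range=None):
    """Hypothetical helper: per-bin mean and std of x and y, binned by x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    results = []
    for values in (x, y):
        for statistic in ("mean", "std"):  # scipy's "std" uses ddof=0
            stat, _edges, _binnumber = binned_statistic(
                x, values, statistic=statistic, bins=bins, range=bin_range
            )
            results.append(stat)
    x_mean, x_std, y_mean, y_std = results
    return x_mean, x_std, y_mean, y_std


x = [0.1, 0.2, 0.8, 0.9]
y = [1.0, 3.0, 5.0, 9.0]
x_mean, x_std, y_mean, y_std = mean_std_bins_sketch(x, y, bins=2, bin_range=(0.0, 1.0))
print(x_mean, x_std)
print(y_mean, y_std)
```

Here the two bins cover [0, 0.5) and [0.5, 1], so each bin receives two points.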
- postmd.utils.calc_acf(data, nlag=None)[source]
Calculate the autocorrelation function used in the Green-Kubo formula.
- Parameters:
data (_type_) – _description_
nlag (_type_, optional) – _description_. Defaults to None.
- Returns:
_description_
- Return type:
_type_
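In Green-Kubo relations, a transport coefficient is proportional to the time integral of the autocorrelation function of a flux. Since the parameters above are undocumented, the following is only a sketch of one common estimator, C(k) = ⟨A(t) A(t+k)⟩ averaged over all available time origins. The name `acf_sketch`, the absence of mean subtraction and normalisation, and the meaning of `nlag=None` (taken here as "all lags") are assumptions and may not match what postmd does.

```python
import numpy as np


def acf_sketch(data, nlag=None):
    """Hypothetical helper: unnormalised autocorrelation for lags 0..nlag."""
    data = np.asarray(data, dtype=float)
    n = data.size
    if nlag is None:
        nlag = n - 1  # assumption: use every available lag
    acf = np.empty(nlag + 1)
    for k in range(nlag + 1):
        # Average the product A(t) * A(t + k) over all valid time origins t.
        acf[k] = np.mean(data[: n - k] * data[k:])
    return acf


acf = acf_sketch([1.0, 2.0, 3.0, 4.0])
print(acf)
```

The direct loop costs O(N · nlag); for long trajectories an FFT-based estimator is the usual replacement.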
Submodules
postmd.utils.utils module
- postmd.utils.utils.calc_replicas_mean_std(data_arrays, ddof=0)[source]
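Compute the mean and standard deviation across replicate data arrays.
- Parameters:
data_arrays (list or np.ndarray) – a list of data arrays, one per replica.
ddof (int, optional) – the delta degrees of freedom used for the standard deviation. Defaults to 0.
- Returns:
the mean and the standard deviation across the replicas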
- Return type:
np.ndarray
Examples

    >>> import numpy as np
    >>> from postmd.utils import calc_replicas_mean_std
    >>> data_arrays = [
    ...     np.array([4.3, 5.6, 3.8, 5.1, 4.9]),  # First dataset
    ...     np.array([3.2, 4.5, 4.1, 3.7, 4.3]),  # Second dataset
    ...     np.array([5.5, 6.2, 5.9, 6.1, 5.8]),  # Third dataset
    ... ]
    >>> # Calculate the mean and std across the datasets at each index.
    >>> mean, std = calc_replicas_mean_std(data_arrays)
    >>> print(f"replicas mean: {mean}")
    replicas mean: [4.33333333 5.43333333 4.6        4.96666667 5.        ]
- postmd.utils.utils.calc_box_length(num, density=1.0, NA=None)[source]
Calculate the edge length of a cubic box that holds a given number of water molecules at a given density.
Warning
The Avogadro constant built into LAMMPS (units real or metal) is 6.02214129e23; see lammps/src/update.cpp, which sets “force->mv2d = 1.0 / 0.602214129” for both unit styles. By default, however, this function uses the Avogadro constant from scipy.constants, quoted here as 6.022140857e23 (the CODATA 2014 recommended value).
- Parameters:
num (int) – the number of water molecules.
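density (float, optional) – the density of water, in g/cm^3. Defaults to 1.0.
NA (float, optional) – the Avogadro constant, in 1/mol. Defaults to None, in which case the value from scipy.constants is used.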
- Returns:
the length of a cubic water box, in Angstrom
- Return type:
float
Examples

    >>> import postmd.utils as utils
    >>> utils.calc_box_length(1000, density=1.0)
    The length of a cubic water box for 1000 water molecules and 1.0 g/cm^3 is 31.043047 Angstrom
- postmd.utils.utils.create_dir(path, backup=True)[source]
Create a directory at the specified ‘path’. If the directory already exists and ‘backup’ is True, the existing directory is first renamed by appending ‘.bkXXX’, where XXX is a zero-padded counter (.bk000, .bk001, and so on).
- Parameters:
path (str) – The path of the directory to be created.
backup (bool, optional) – Whether to back up an existing directory. Defaults to True.
Examples

    >>> import os
    >>> import postmd.utils as utils
    >>>
    >>> print(os.listdir())
    ['createdir.py']
    >>> utils.create_dir("test")
    >>> print(os.listdir())  # create a new "test" dir
    ['createdir.py', 'test']
    >>> utils.create_dir("test")
    >>> print(os.listdir())  # move the original "test" dir to "test.bk000"
    ['createdir.py', 'test', 'test.bk000']
    >>> utils.create_dir("test")
    >>> print(os.listdir())  # move the original "test" dir to "test.bk001"
    ['createdir.py', 'test', 'test.bk000', 'test.bk001']
- postmd.utils.utils.cummean(data)[source]
Calculate the cumulative (running) average.
- Parameters:
data (1d list) – the data to average cumulatively
- Returns:
the cumulative average
- Return type:
np.ndarray
Examples

    >>> import postmd.utils as utils
    >>> import numpy as np
    >>> array = np.arange(9)
    >>> print(array)
    [0 1 2 3 4 5 6 7 8]
    >>> utils.cummean(array)
    array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
- postmd.utils.utils.stats_mean_std_bins(x, y, bins=10, range=None)[source]
Compute the mean and standard deviation (ddof=0) of x and y in each bin, using the [scipy.stats.binned_statistic](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html) function.
- Parameters:
x – (N,) array_like. A sequence of values to be binned.
y – (N,) array_like. The data on which the statistic will be computed. This must be the same shape as x, or a set of sequences, each the same shape as x. If y is a set of sequences, the statistic will be computed on each independently.
bins (int or sequence of scalars, optional) – If bins is an int, it defines the number of equal-width bins in the given range (10 by default). If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths. Values in x that are smaller than the lowest bin edge are assigned to bin number 0, and values beyond the highest bin are assigned to bins[-1]. If the bin edges are specified, the number of bins will be nx = len(bins)-1. Defaults to 10.
range ((float, float) or [(float, float)], optional) – The lower and upper range of the bins. If not provided, range is simply (x.min(), x.max()). Values outside the range are ignored. Defaults to None.
- Returns:
(x_mean, x_std, y_mean, y_std)
- Return type:
tuple