adds read_npz and write_npz, convenience wrappers #46
… and `NpzReader` similar to `read_npy` and `write_npy`
Thanks for pointing this out. I've created #48 to track this issue. Thanks also for the PR. There are a few things about the proposed API which are unsatisfying to me:
Creating a

```rust
use flate2::{bufread::GzDecoder, write::GzEncoder, Compression};
use ndarray::{array, Array2};
use ndarray_npy::{ReadNpyError, ReadNpyExt, WriteNpyError, WriteNpyExt};
use std::fs::File;
use std::io::{BufReader, BufWriter, Write};
use std::path::Path;

fn write_npy_gz<P, T>(path: P, array: &T) -> Result<(), WriteNpyError>
where
    P: AsRef<Path>,
    T: WriteNpyExt,
{
    // Note: I'm not sure if the `BufWriter` actually helps or not.
    let mut writer = GzEncoder::new(BufWriter::new(File::create(path)?), Compression::default());
    array.write_npy(&mut writer)?;
    writer.finish()?.flush()?;
    Ok(())
}

fn read_npy_gz<P, T>(path: P) -> Result<T, ReadNpyError>
where
    P: AsRef<Path>,
    T: ReadNpyExt,
{
    // Note: I'm not sure if the `BufReader` actually helps or not.
    T::read_npy(GzDecoder::new(BufReader::new(File::open(path)?)))
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let arr1 = array![[1, 2, 3], [4, 5, 6]];
    // Write the array.
    write_npy_gz("foo.npy.gz", &arr1)?;
    // Read it back.
    let arr2: Array2<i32> = read_npy_gz("foo.npy.gz")?;
    println!("arr1:\n{}", arr1);
    println!("arr2:\n{}", arr2);
    assert_eq!(arr1, arr2);
    Ok(())
}
```

To read it with NumPy, you could do this:

```python
import numpy as np
import gzip

def load_npy_gz(path):
    with gzip.open(path) as f:
        return np.load(f)

arr = load_npy_gz('foo.npy.gz')
print(arr)
```

(You could also decompress
these work like `read_npy` and `write_npy` but read and write compressed `.npz` files instead. I wanted this functionality for writing ephemeral array files, so I can check something later if needed without taking up too much disk space.

compared to `read_npy`/`write_npy`, there is one major difference: since an `npz` file can contain multiple named arrays/files, `write_npz` picks a default name for the single array it writes, while `read_npz` lets the user specify the name to extract. this may not be the best choice, but it seemed less than ideal to not permit specifying the name in `read_npz`, and I wanted `write_npz` to remain as simple as possible.

I picked the default name for `write_npz` based on what numpy does in `savez_compressed` (`"arr_0.npy"`). however, I think there is a divergence there. using `np.load`, you get a dict-like object that lets you access the arrays without the `.npy` extension (i.e. at key `arr_0`). however, using `NpzReader`, you need to use the full `arr_0.npy` name to retrieve the same array. just wanted to flag this, as it tripped me up a bit.

thanks for your consideration of this pull request.
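
for anyone who wants to see the NumPy side of that naming divergence, here is a minimal sketch. it assumes NumPy is installed; the file name `example.npz` is just an illustrative choice. it saves one positional array with `savez_compressed`, then compares the entry name stored in the zip archive with the key that `np.load` exposes.

```python
import os
import tempfile
import zipfile

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp:
    # "example.npz" is an arbitrary name chosen for this sketch.
    path = os.path.join(tmp, "example.npz")

    # An array passed positionally gets NumPy's default name, "arr_0".
    np.savez_compressed(path, arr)

    # The zip archive itself stores the entry with the ".npy" extension.
    with zipfile.ZipFile(path) as zf:
        entry_names = zf.namelist()

    # np.load exposes the same entry with the extension stripped.
    with np.load(path) as npz:
        numpy_keys = list(npz.files)
        loaded = npz["arr_0"]

print(entry_names)  # names as stored in the archive
print(numpy_keys)   # keys as exposed by np.load
```

the first list holds the archive entry name (`arr_0.npy`), which is the form `NpzReader` expects according to the description above. the second holds the shorter key (`arr_0`) that NumPy users typically see.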