Interpret bytes as packed binary data #621

Open
lewisfish opened this issue Jan 21, 2022 · 6 comments
Labels
idea Proposition of an idea and opening an issue to discuss it

Comments

@lewisfish
Contributor

Motivation

I've recently had cause to write Fortran code that communicates over a socket, and code that reads a structured binary file format. For both of these I needed routines that could read n bytes and convert them into characters, integers, etc., or write integers, characters, etc. into n bytes.

Prior Art

Python's struct module, particularly the pack and unpack routines.
Szaghi's BeFoR64 pack_data routine.

Additional Information

No response

@lewisfish lewisfish added the idea Proposition of an idea and opening an issue to discuss it label Jan 21, 2022
@jvdp1
Member

jvdp1 commented Jan 21, 2022

I am not sure I understand the details.
Is the intrinsic transfer not what you are looking for?
Or maybe this bitset module?

@lewisfish
Contributor Author

Sorry if I was not clear.
What I mean is essentially a user-friendly wrapper around the intrinsic transfer to read or write bytes to/from a binary file.
For example, in Python, using the struct library, one can read an arbitrary packed binary file (example from voxwriter) like so:
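
A minimal sketch of this kind of struct usage, assuming a hypothetical record layout (a 4-byte ASCII tag followed by a little-endian int32 and a little-endian float32). The layout and the names `HEADER_FORMAT` and `read_header` are made up for illustration; this is not the voxwriter code:

```python
import struct

# Hypothetical record layout (an assumption for illustration):
#   4-byte ASCII tag, little-endian int32 version, little-endian float32 scale
HEADER_FORMAT = "<4sif"


def read_header(raw: bytes):
    """Unpack the hypothetical 12-byte header from the start of raw."""
    tag, version, scale = struct.unpack_from(HEADER_FORMAT, raw, 0)
    return tag.decode("ascii"), version, scale


# Round trip: pack three values into 12 bytes, then read them back.
packed = struct.pack(HEADER_FORMAT, b"DEMO", 150, 0.5)
tag, version, scale = read_header(packed)
```

The format string carries the byte order and the size of each field, which is the part that has to be spelled out by hand when using transfer directly.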

Currently in Fortran it is a lot more verbose to do the same thing with bare intrinsics.
Does this make more sense? I don't think the bitset module does this, though my understanding of what a bitset is remains limited.
If you still think that transfer suffices, please feel free to close the issue 😃

@nncarlson
Contributor

I believe he's interested in utilities for serialization/deserialization of a data structure. At its heart it is the transfer function, but that function is really clunky to use directly and it is very useful to have some simple-to-use wrapper procedures. For example,

call copy_to_bytes(var, buffer, offset)

would turn the variable var into a sequence of bytes, add them to the int8 array buffer starting at the given offset, and update offset accordingly. Serializing a data structure then just amounts to a sequence of clean, understandable calls to copy_to_bytes.
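
The offset-advancing behaviour described here can be sketched as follows, in Python for brevity since struct is the prior art named in this issue. The function name mirrors the call above, but the signature is an assumption: it takes an explicit struct format and returns the new offset, whereas a Fortran version would presumably update offset in place and dispatch on the type of var.

```python
import struct


def copy_to_bytes(fmt: str, value, buffer: bytearray, offset: int) -> int:
    """Write value into buffer at offset using struct format fmt.

    Returns the offset just past the bytes written (hypothetical signature).
    """
    struct.pack_into(fmt, buffer, offset, value)
    return offset + struct.calcsize(fmt)


# Serialize three fields back to back into a 10-byte buffer.
buffer = bytearray(10)
offset = 0
offset = copy_to_bytes("<i", 42, buffer, offset)   # 4-byte integer
offset = copy_to_bytes("<h", -1, buffer, offset)   # 2-byte integer
offset = copy_to_bytes("<f", 1.5, buffer, offset)  # 4-byte real
```

Because each call advances the offset by the size of what it wrote, the caller never computes field positions by hand.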

PS: I have such procedures, but for some reason I implemented them using c_loc and c_f_pointer to effectively equivalence storage instead of using transfer. I'm not sure why now.

@lewisfish
Contributor Author

I believe he's interested in utilities for serialization/deserialization of a data structure. At its heart it is the transfer function, but that function is really clunky to use directly and it is very useful to have some simple-to-use wrapper procedures. For example,

call copy_to_bytes(var, buffer, offset)

would turn the variable var into a sequence of bytes, add them to the int8 array buffer starting at the given offset, and update offset accordingly. Serializing a data structure then just amounts to a sequence of clean, understandable calls to copy_to_bytes.

PS: I have such procedures, but for some reason I implemented them using c_loc and c_f_pointer to effectively equivalence storage instead of using transfer. I'm not sure why now.

Yes this is exactly what I mean.

@ivan-pi
Member

ivan-pi commented Jan 22, 2022

Is this task related to the FD thread: Bytearray for socket packets?

My gut feeling is that stdlib_bitset is not applicable because we use our own bitset literal format for I/O.


For the buffer there is the possibility to use int8 or character(len=1). I believe the former is better since it is guaranteed to have the correct size. For text data, the character has the advantage it can be printed easily.

@lewisfish
Contributor Author

Is this task related to the FD thread: Bytearray for socket packets?

Yes it is, thanks again for your help on that. Code from that question is here, though it needs a tidy up.

For the buffer there is the possibility to use int8 or character(len=1). I believe the former is better since it is guaranteed to have the correct size. For text data, the character has the advantage it can be printed easily.

I suppose you could have it as int8, and then have a to_char or print routine?
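
A rough sketch of what such a to_char routine might do, written in Python to match the struct prior art. The name `to_char` comes from the suggestion above; the signature and the choice of Latin-1 decoding are assumptions. The input models an int8 buffer as signed values in the range -128 to 127, so each value is masked back to an unsigned byte before decoding:

```python
def to_char(buffer):
    """Convert a sequence of signed 8-bit integers (-128..127) into text.

    Each value is masked to its unsigned byte (0..255), then the bytes are
    decoded as Latin-1, which maps every byte value to exactly one character.
    """
    return bytes(b & 0xFF for b in buffer).decode("latin-1")


greeting = to_char([72, 101, 108, 108, 111])
```

Keeping the buffer as int8 and converting only for display leaves the storage size guaranteed while still allowing text fields to be printed.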

4 participants