Reading binary files is an important skill for working with non-textual data such as images, audio, and video. By opening a file in binary mode and calling the read() method, you can load its raw bytes into Python. Whether you are dealing with multimedia files, compressed data, or custom binary formats, Python's support for binary data lets you build applications for a wide range of use cases. In this article, you will learn what binary files are, how to read their data into a byte array, how to read binary data in chunks, and more.
What are Binary files?
Generally, binary means two. In computer science, binary files store data as sequences of the digits 0 and 1. For example, the number 9 is represented in binary as '1001'. Our computer stores every file in this machine-readable form, as a sequence of binary digits. The structure and format of a binary file depend on the type of file: image files are structured differently from audio files. How hard a binary file is to decode therefore depends on the complexity of its format. In this article, let's understand how to read binary files.
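As a small illustration of the binary representation described above, the sketch below prints the number 9 in binary and builds a bytes object by hand. The values chosen are only examples.

```python
# The number 9 written in binary, as in the example above
print(bin(9))            # 0b1001
print(format(9, "08b"))  # 00001001, padded to one full byte (8 bits)

# A bytes object holds a sequence of integers from 0 to 255
data = bytes([9, 65, 255])
print(list(data))        # [9, 65, 255]
print(data)              # b'\tA\xff'
```

Python shows printable bytes such as 65 as their ASCII character ('A') and everything else as an escape sequence.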
Python Read A Binary File
To read a binary file, follow these steps:
Step 1: Open the binary file in binary mode
To read a binary file in Python, first, we need to open it in binary mode ("rb"). We can use the open() function to achieve this.
Step 2: Create a binary file
If you do not already have a binary file to read, you can create one by opening a file in binary write mode ("wb"). For more, refer to the article Python – Write Bytes to File.
Step 3: Read the binary data
After opening the binary file in binary mode, we can use the read() method to read its content into a variable. The read() method returns a bytes object, a sequence of bytes that represents the binary data.
Step 4: Process the binary data
Once we have read the binary data into a variable, we can process it according to our specific requirements. Processing the binary data could involve various tasks such as decoding binary data, analyzing the content, or writing the data to another binary file.
Step 5: Close the file
After reading and processing the binary data, it is essential to close the file using the “close()” method to release system resources and avoid potential issues with file access.
# Opening the binary file in binary mode as rb (read binary)
f = open("files.zip", mode="rb")

# Reading file data with read() method
data = f.read()

# Knowing the type of our data
print(type(data))

# Printing our byte-sequenced data
print(data)

# Closing the opened file
f.close()
Output:
In the output, we see the type of the data followed by a sequence of bytes, since bytes are the fundamental unit of binary representation.
<class 'bytes'>
b'PK\x03\x04\x14\x00\x00\x00\x08\x00U\xbd\xebV\xc2=j\x87\x1e\x00\x00\x00!\x00\x00\x00\n\x00\x00\x00TODO11.txt\xe3\xe5JN,\xceH-/\xe6\xe5\x82\xc0\xcc\xbc\x92\xd4\x9c\x9c\xcc\x82\xc4\xc4\x12^.w7w\x00PK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00U\xbd\xebV\xc2=j\x87\x1e\x00\x00\x00!\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x01\x00 \x00\x00\x00\x00\x00\x00\x00TODO11.txtPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x008\x00\x00\x00F\x00\x00\x00\x00\x00'
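The open, read, process, and close steps can also be written with a with block, which closes the file automatically even if an error occurs. The sketch below is an illustration, not part of the original example: it writes a small stand-in file (the name sample.bin and the helper looks_like_zip are hypothetical) and then processes the bytes by checking whether they begin with the PK\x03\x04 signature seen at the start of the output above.

```python
import os
import tempfile

ZIP_SIGNATURE = b"PK\x03\x04"  # first four bytes of the ZIP output shown above


def looks_like_zip(path):
    """Return True if the file starts with the ZIP local-file signature."""
    # The with block closes the file automatically, even if read() raises
    with open(path, "rb") as f:
        header = f.read(len(ZIP_SIGNATURE))
    return header == ZIP_SIGNATURE


# Create a small stand-in file so the sketch runs on its own (hypothetical name)
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(ZIP_SIGNATURE + b"\x00" * 8)

print(looks_like_zip(path))
```

Only the first four bytes are read, so the check stays cheap even for very large files.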
Reading binary data into a byte array
The following code demonstrates how to read binary data from a file into bytes objects, three bytes at a time, and print the data using a while loop. Let's explain the code step by step:
Open the Binary File
This line opens the binary file named "string.bin" in binary mode ("rb"). The file is opened for reading, and the file object is stored in the variable 'file'.
# Open the binary file
file = open("string.bin", "rb")
Reading the first three bytes
This line reads the first three bytes from the binary file and stores them in the variable "data". The read(3) method reads up to three bytes from the file and advances the file pointer accordingly.
data = file.read(3)
Print data using a "while" loop
The loop will keep reading and printing three bytes at a time until the end of the file is reached. Once the end of the file is reached, the read() method will return an empty bytes object, which evaluates to False in the while loop condition, and the loop will terminate.
while data:
    print(data)
    data = file.read(3)
Close the Binary File
Finally, after the loop has finished reading and printing the data, we close the binary file using the ‘close()’ method to release system resources.
file.close()
Putting the above steps together, we get:
# Open the binary file
file = open("string.bin", "rb")

# Reading the first three bytes from the binary file
data = file.read(3)

# Printing data by iterating with while loop
while data:
    print(data)
    data = file.read(3)

# Close the binary file
file.close()
The output depends on the content of the "string.bin" binary file. The code reads and prints the data in chunks of three bytes until the end of the file is reached, so each iteration of the loop prints the bytes just read. The last chunk may be shorter than three bytes.
For example, if the content of "string.bin" is b'GeeksForGeeks' (a sequence of 13 bytes), the output will be:
Output:
b'Gee'
b'ksF'
b'orG'
b'eek'
b's'
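The same three-byte loop can be written more compactly with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. This is an alternative sketch rather than the article's own method. It first writes a stand-in "string.bin" to a temporary directory so that it runs on its own.

```python
import os
import tempfile
from functools import partial

# Create a stand-in for "string.bin" so the sketch runs on its own
path = os.path.join(tempfile.mkdtemp(), "string.bin")
with open(path, "wb") as f:
    f.write(b"GeeksForGeeks")

chunks = []
with open(path, "rb") as file:
    # iter(callable, sentinel) calls file.read(3) until it returns b""
    for data in iter(partial(file.read, 3), b""):
        chunks.append(data)
        print(data)
```

Because the empty bytes object b"" is the sentinel, the loop ends at the end of the file, just like the while loop.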
Read Binary files in Chunks
To read binary file data in chunks, we use a while loop that reads the binary data from the file in chunks of the specified size (chunk_size). The loop continues until the end of the file is reached, and each chunk of data is processed accordingly.
Here, chunk_size = 10 specifies the size of each chunk to read from the binary file. The line file = open("binary_file.bin", "rb") opens the binary file named "binary_file.bin" in binary mode ("rb"). while True sets up an infinite loop that keeps reading the file in chunks until the end of the file is reached. Inside the loop, chunk = file.read(chunk_size) reads a chunk of binary data from the file, and the loop breaks when read() returns an empty bytes object.
# Specify the size of each chunk to read
chunk_size = 10
file = open("binary_file.bin", "rb")

# Using while loop to iterate the file data
while True:
    chunk = file.read(chunk_size)
    if not chunk:
        break
    # Processing the chunk of binary data
    print(f"Read {len(chunk)} bytes: {chunk}")

# Close the binary file
file.close()
The output of the code will depend on the content of the "binary_file.bin" binary file and the specified chunk_size. For example, if the file contains the binary data b'Hello, this is binary data!' and chunk_size is set to 10, the output will be:
Output:
Read 10 bytes: b'Hello, thi'
Read 10 bytes: b's is binar'
Read 7 bytes: b'y data!'
Outputs vary depending on the binary file data we are reading and also on the chunk size we are specifying.
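Chunked reading is most useful when the file is too large to hold in memory at once. One way to package the loop above is a generator function, sketched below; the helper name read_in_chunks and the SHA-256 checksum are illustrative choices, not part of the original example. The sketch writes its own stand-in "binary_file.bin" to a temporary directory.

```python
import hashlib
import os
import tempfile


def read_in_chunks(path, chunk_size=1024):
    """Yield successive chunks of at most chunk_size bytes from a file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Create a stand-in for "binary_file.bin" so the sketch runs on its own
path = os.path.join(tempfile.mkdtemp(), "binary_file.bin")
with open(path, "wb") as f:
    f.write(b"Hello, this is binary data!")

# Feed the file to a hash object one chunk at a time
digest = hashlib.sha256()
for chunk in read_in_chunks(path, chunk_size=10):
    digest.update(chunk)
print(digest.hexdigest())
```

The checksum is the same as hashing the whole file in one call, but only chunk_size bytes are held in memory at any moment.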
Read Binary file Data into Array
To read binary data into an array, we first need a file that contains one. We open a file named "array" in "wb" mode and define the list num = [3, 6, 9, 12, 18]. To get the list in byte format, we use bytearray().
To write the array to the file we use:
file = open("array", "wb")
num = [3, 6, 9, 12, 18]
array = bytearray(num)
file.write(array)
file.close()
To read the written array back, we open the same file in "rb" mode with file = open("array", "rb"). file.read(3) reads the first three bytes from the file, and list() converts them into a list of integers: number = list(file.read(3)).
file = open("array", "rb")
number = list(file.read(3))
print(number)
file.close()
Output:
[3, 6, 9]
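If the data needs to be modified after reading, the whole file can be loaded into a bytearray, which is mutable, instead of a list. The following is a minimal sketch under that assumption; it writes its own copy of the "array" file to a temporary directory.

```python
import os
import tempfile

# Write the same five values used above to a stand-in "array" file
path = os.path.join(tempfile.mkdtemp(), "array")
with open(path, "wb") as f:
    f.write(bytearray([3, 6, 9, 12, 18]))

with open(path, "rb") as f:
    numbers = bytearray(f.read())  # mutable copy of every byte in the file

numbers[0] = 255       # bytearray items can be reassigned; bytes items cannot
print(list(numbers))   # [255, 6, 9, 12, 18]
```

Each item must stay in the range 0 to 255, because a bytearray stores one byte per element.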
Read Binary files in Python using NumPy
To read a binary file into a NumPy array, import the NumPy module and use np.fromfile(). The dtype is np.uint8, which stands for "unsigned 8-bit integer". This means that each item in the array is an 8-bit (1-byte) integer, with values that can range from 0 to 255.
import numpy as np

# Open the file in binary mode
with open('myfile.bin', 'rb') as f:
    # Read the data into a NumPy array
    array = np.fromfile(f, dtype=np.uint8)  # Change dtype according to your data
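Remember to replace 'myfile.bin' with the path to your own binary file. For example, if the file contains the ten bytes 1 through 10, the array will be:
Output:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], dtype=uint8)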
Reading binary files in Python – FAQs
How to Read a Python Binary File
To read a binary file in Python, you use the open() function with the 'rb' mode, which stands for "read binary." This approach ensures that the file is read as is, without any transformations that might occur if the file were opened in text mode.
with open('file.bin', 'rb') as file:
    binary_data = file.read()
    print(binary_data)  # This will print the binary content of the file
How to Convert Binary Data to Readable Format in Python
Converting binary data to a readable format often depends on the type of data stored in the binary. For simple data like strings, you can directly decode the binary using an appropriate character encoding:
text = binary_data.decode('utf-8')  # Decoding binary data to a string using UTF-8 encoding
print(text)
For complex data structures, you might need to use modules like struct to unpack the binary data properly.
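As a rough illustration of the struct approach, the sketch below packs two integers into six bytes and unpacks them again. The record layout (a 16-bit version followed by a 32-bit length, both little-endian) is a made-up example, not a real file format.

```python
import struct

# "<" = little-endian, "H" = unsigned 16-bit integer, "I" = unsigned 32-bit integer
record = struct.pack("<HI", 513, 1024)
print(record)            # b'\x01\x02\x00\x04\x00\x00'

# unpack() reverses the process and returns a tuple of Python integers
version, length = struct.unpack("<HI", record)
print(version, length)   # 513 1024
```

The format string must match the layout of the bytes exactly; struct.calcsize("<HI") reports how many bytes a given format expects.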
What is the Difference Between seek() and tell() Methods?
seek() method: This method is used to change the current file position in a file stream. The seek() method takes a parameter that specifies the position (in bytes) in the file to move to. This is useful for "jumping" to a certain part of the file to start reading or writing from there.
tell() method: This method is used to find out the current position in a file. When you call tell(), it returns an integer that represents the current position of the file pointer in bytes.
with open('file.bin', 'rb') as file:
    print(file.tell())  # Initially, it will print 0 as the file pointer is at the beginning
    file.seek(10)       # Move the pointer to 10 bytes from the beginning
    print(file.tell())  # Now it will print 10
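seek() also accepts an optional second argument that selects the reference point: the start of the file (the default), the current position (os.SEEK_CUR), or the end of the file (os.SEEK_END). The sketch below shows this on a 20-byte stand-in file created in a temporary directory; the file contents are only an example.

```python
import os
import tempfile

# Create a 20-byte stand-in file containing the values 0, 1, ..., 19
path = os.path.join(tempfile.mkdtemp(), "file.bin")
with open(path, "wb") as f:
    f.write(bytes(range(20)))

with open(path, "rb") as file:
    file.seek(-4, os.SEEK_END)   # 4 bytes before the end of the file
    tail_start = file.tell()     # 16
    tail = file.read()           # the last four bytes
    file.seek(2)                 # absolute position 2
    file.seek(3, os.SEEK_CUR)    # 3 bytes forward from the current position
    position = file.tell()       # 5

print(tail_start, list(tail), position)  # 16 [16, 17, 18, 19] 5
```

Seeking relative to the end is a common way to read a trailer or footer without loading the rest of the file.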
How to Convert Binary File to Text
Converting a binary file to text involves reading the binary file and then decoding it to a string using the appropriate encoding, as binary files can contain anything from image data to encoded text.
with open('file.bin', 'rb') as file:
    binary_data = file.read()
    text_data = binary_data.decode('utf-8')  # Ensure the encoding matches the file content
    print(text_data)
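If the bytes are not valid in the chosen encoding, decode() raises UnicodeDecodeError. The sketch below shows one way to handle that with the errors parameter of decode(). The sample bytes are invented for illustration.

```python
# Valid UTF-8 for "café", a space, then a byte (0xff) that is never valid in UTF-8
raw = b"caf\xc3\xa9 \xff"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

# errors="replace" substitutes U+FFFD for bytes that cannot be decoded
text = raw.decode("utf-8", errors="replace")
print(text)
```

Replacing undecodable bytes loses information, so it suits display or logging better than round-tripping data.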
How to Make a Binary File Readable
Making a binary file readable can mean two things: converting it to a text format (as shown above) or interpreting its contents properly based on its structure. For non-text data (like images or compiled data), you would typically use a specific library or tool that understands the format:
- For images, you might use a library like PIL/Pillow to read and display the image.
- For serialized objects, you might use pickle to deserialize them.
Here's an example using pickle to deserialize and make readable a binary file containing Python objects:
import pickle

with open('data.pkl', 'rb') as file:
    data = pickle.load(file)
    print(data)  # Assuming the data is a dictionary, list, etc.
These methods cover various scenarios involving binary files, from direct reading and displaying as text to more complex deserialization or decoding depending on the specific type of binary data.