Serialization is the process of converting an object into a series of bytes, called a byte stream, so that it can be stored or sent for later use. For example, the bytes could be sent across a network, written to a file, or passed along in some other way. In Python, this is called pickling.
Now, we’ve seen how to convert numbers to strings so we can write them to a text file. But what about other, more complex types of data, like a list or a set? This is where serialization comes into play.
Serializing Data to Write to a File
There are several steps you’ll take to serialize your data:
- Import the pickle module
- Open the file for binary writing
- Call the dump() function of the pickle module
- Close the output file
Step 1 only needs to occur once in your application. Steps 2 through 4 may occur several times, depending on how your application is structured and whether more than one file needs to be written.
Let’s look at an example where you do this:
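import pickle
phonebook = {'James': '555-1234', 'Mary': '555-5678', 'John': '555-2468', 'Martha': '555-1357'}
out_file = open('phonebook.dat', 'wb')  # 'wb' opens the file for binary writing
pickle.dump(phonebook, out_file)        # serialize the dictionary into the file
out_file.close()                        # close the file so the data is flushed to disk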
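The same steps can also be written with a with statement, which closes the file for you. This is a minimal sketch rather than the only way to do it; the shortened phonebook is just sample data.

```python
import pickle

# Sample data for illustration (a shortened version of the phonebook).
phonebook = {'James': '555-1234', 'Mary': '555-5678'}

# The with statement closes the file automatically when the block ends,
# even if pickle.dump() raises an exception.
with open('phonebook.dat', 'wb') as out_file:
    pickle.dump(phonebook, out_file)

print(out_file.closed)  # True: no explicit close() call was needed
```

With this form, steps 2 and 4 collapse into a single line, so there is no close() call to forget.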
Binary Files
As you can see in the example, we open out_file with ‘wb’, which opens the file for writing in binary format. This is different from ‘w’, which opens a file for writing text.
The resulting file is a binary file. If you open it with a text editor (like Notepad++), the data won’t be very readable and might look something like what you see below.
€}q (X JamesqX 555-1234qX MaryqX 555-5678qX JohnqX 555-2468qX MarthaqX 555-1357qu.
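To see why a text editor struggles with this data, you can inspect the raw bytes directly. The sketch below uses pickle.dumps(), which returns the serialized bytes instead of writing them to a file. The one-entry phonebook is sample data, and the exact bytes depend on your Python version.

```python
import pickle

phonebook = {'James': '555-1234'}  # sample data for illustration

# dumps() serializes to a bytes object in memory instead of a file.
data = pickle.dumps(phonebook)

print(type(data))   # <class 'bytes'>
print(data[:1])     # b'\x80', the marker byte used by pickle protocol 2 and later
print(data)         # the full byte string, with the names and numbers embedded

# loads() is the in-memory counterpart of load().
print(pickle.loads(data) == phonebook)  # True
```

The leading 0x80 byte is likely what shows up as the € character in the sample above, since some Windows text encodings display that byte as €.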
Reading Serialized Data (Deserialization)
Once you have written a file, you will eventually need to read that data back in. The process is similar:
- Import the pickle module
- Open the file for binary reading
- Load the data using the pickle.load() function
- Close the data file
Here is an example of that.
import pickle
in_file = open('phonebook.dat', 'rb')  # 'rb' opens the file for binary reading
phonebook = pickle.load(in_file)       # rebuild the dictionary from the bytes
in_file.close()
print(phonebook)
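Pickling is not limited to a flat dictionary. The following sketch writes and then reads back a structure that mixes a list, a set, and a nested dictionary. The record variable and the record.dat file name are made up for this illustration.

```python
import pickle

# Hypothetical sample data mixing several container types.
record = {
    'names': ['James', 'Mary'],
    'area_codes': {555, 556},
    'numbers': {'James': '555-1234', 'Mary': '555-5678'},
}

# Serialize the whole structure in one call.
with open('record.dat', 'wb') as out_file:
    pickle.dump(record, out_file)

# Read it back; load() rebuilds the same types (list, set, dict).
with open('record.dat', 'rb') as in_file:
    restored = pickle.load(in_file)

print(restored == record)            # True: same contents
print(restored is record)            # False: a new, separate object
print(type(restored['area_codes']))  # <class 'set'>
```

Note that the restored object is a copy: changing it does not affect the original.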
I will note here that pickling data is not considered secure: a maliciously crafted pickle can run arbitrary code when it is loaded. So you should only unpickle data that you get from a trusted source.
Another option is to use JSON (JavaScript Object Notation), which is safer to load and lets you view the data, since it is stored in a plain-text format.
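As a rough comparison, here is the same phonebook idea using Python's built-in json module. This is only a sketch: the phonebook.json file name and the shortened phonebook are assumptions for the example. Notice that the file is opened in text mode ('w' and 'r') rather than binary mode.

```python
import json

phonebook = {'James': '555-1234', 'Mary': '555-5678'}  # sample data

# JSON is text, so the file is opened with 'w' instead of 'wb'.
with open('phonebook.json', 'w') as out_file:
    json.dump(phonebook, out_file, indent=2)

with open('phonebook.json', 'r') as in_file:
    restored = json.load(in_file)

print(restored == phonebook)  # True

# JSON supports fewer types than pickle; a set, for example, is rejected.
try:
    json.dumps({'area_codes': {555, 556}})
except TypeError as err:
    print('Cannot serialize:', err)
```

The trade-off is that JSON only handles basic types such as dictionaries, lists, strings, numbers, booleans, and None, so data like sets must be converted (for example, to a list) before saving.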
Serializing Objects in Python was originally found on Access 2 Learn