In just 10 minutes, you’ll learn how to create an audiobook by converting text to audio files in Python. The best part? Create as many audiobooks as you want for free!
I like to learn new things and always try to keep myself updated on the latest news. This requires a lot of reading. However, I only have a few hours of leisure time every day to sit down and do my own things. I can’t spend them all on reading. This is why I turn to YouTube. I always play something in the background while doing other things (like cleaning up after dinner). However, there are always good articles online that are still in text format, and I still want to read them.
Python has benefited me big time, both personally and professionally, and this is one of those occasions. I can use Python to create audiobooks from online articles or PDFs and listen to them whenever I’m doing other things!
A friend recently introduced Paul Graham’s blog to me. In case you’ve never heard of him, he’s one of the founders of Y Combinator and Hacker News. He has published many interesting articles on his blog, and my friend suggested that I read them all.
Convert text to Audio using gTTS
I want to start this tutorial with a simple program, and we’ll expand the functionalities in later tutorials.
We’ll take the article “Billionaires Build” and convert it into an audio file. The library for converting text to audio is called gTTS, or Google Text-to-Speech, which interacts with Google Translate’s text-to-speech API. So you might have guessed it already: this library requires an Internet connection. That shouldn’t be a problem if you are already reading this article. 😉
Let’s grab all text from that article, including the notes at the bottom, then put them into a text file. I’ll also add the author’s name at the beginning of the file. Save it and exit.
Load text into Python
Let’s first grab the gTTS library from pip.
pip install gTTS
Read the text file into Python.
readlines() will read every single line from the text file all the way to the end of the file.
>>> with open('billion.txt', 'r') as f:
...     src = f.readlines()
...
>>> type(src)
<class 'list'>
>>> len(src)
117
As the list object type shows, readlines() reads all the lines and stores them in a list. Let’s check the first few items in the list.
\n is the newline character in Python; each \n marks the end of a line. Although those \n look annoying, we don’t actually have to remove them. They are only there so Python knows where each line ends.
>>> src[:5]
['By Paul Graham\n', '\n', 'December 2020\n', '\n', "As I was deciding what to write about next, I was surprised to find that two separate essays I'd been planning to write were actually the same.\n"]
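If you’d rather try the loading step as a script before touching the real file, here is a minimal sketch. It assumes nothing about billion.txt: the three-line sample text and the temporary file name are made up for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical three-line sample standing in for billion.txt.
sample_text = "By Paul Graham\n\nDecember 2020\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    path.write_text(sample_text, encoding="utf-8")

    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()  # one list item per line, trailing \n kept

print(type(lines).__name__, len(lines))
print(lines)
```

Passing encoding="utf-8" explicitly is a design choice here, not something the article requires; it helps when the source text contains curly quotes or other non-ASCII characters.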
However, we need to remove two things to improve the listening experience:
- The footnote numbers in the main text body
- The “Notes” section at the bottom
Clean up the text to improve the listening experience
It’s easy to remove the whole Notes section, so we’ll do that first. By observation, “Notes” is also an element in our list, so we can find that element’s location and remove it together with everything after it. With some trial and error, we can see that “Notes” looks like this in the list:
>>> src[-20:]  # displaying the last 20 items
['\n', '\n', '\n', '\n', '\n', 'Notes\n', '\n', " The YC partners have so much practice doing this that they sometimes see paths that the founders themselves haven't seen yet. The partners don.................]
Using the index() function, we can find that “Notes\n” sits at index 102. Slicing with src[:102] keeps only the items before that index, which removes “Notes” and everything after it. Let’s quickly check the last few items to make sure the “Notes” section is gone.
>>> src.index('Notes\n')
102
>>> src = src[:102]
>>> src[-10:]
["Users are what the partners want to know about in YC interviews, and what I want to know about when I talk to founders that we funded ten years ago and who are billionaires now. What do users want? What new things could you build for them? Founders who've become billionaires are always eager to talk about that topic. That's how they became billionaires.\n", '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n']
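The same truncation can be wrapped in a small function so it doesn’t depend on a hard-coded index. This is only a sketch: drop_notes and strip_trailing_blank are hypothetical helper names, and the sample list is invented. Trimming the trailing blank lines is an optional extra step that the article itself doesn’t perform.

```python
def drop_notes(lines, marker="Notes\n"):
    """Return the lines that come before the first occurrence of marker."""
    try:
        cut = lines.index(marker)
    except ValueError:
        # No Notes section found: return an unchanged copy.
        return list(lines)
    return lines[:cut]


def strip_trailing_blank(lines):
    """Remove whitespace-only lines from the end of the list."""
    trimmed = list(lines)
    while trimmed and trimmed[-1].strip() == "":
        trimmed.pop()
    return trimmed


# Invented sample that mimics the shape of the real list.
sample = ["Body text.\n", "\n", "\n", "Notes\n", "\n", "[1] A footnote.\n"]
body = strip_trailing_blank(drop_notes(sample))
print(body)
```

Catching ValueError matters because list.index() raises it when the marker is missing, for example in an article that has no Notes section.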
The next things we need to remove are the footnote numbers in the main text body, for example the bracketed marker of the form “[n]” at the end of element #26. Because the numbers change, we have to find a generic form. For this purpose, we can use the regular expression module re, a built-in Python module for finding generic text patterns. Note that re.sub() works on a string, not a list, so we test it on a single element first.
>>> import re
>>> test = re.sub(r'\[\d\]', '', src[26])
>>> test
"The first thing the partners will try to figure out, usually, is whether what you're making will ever be something a lot of people want. It doesn't have to be something a lot of people want now. The product and the market will both evolve, and will influence each other's evolution. But in the end there has to be something with a huge market. That's what the partners will be trying to figure out: is there a path to a huge market? \n"
In the above code, the re.sub() function replaces every match of the pattern [n] with an empty string, which removes it. In regular expression syntax, the pattern “\[\d\]” means a literal “[”, a single digit from 0 to 9, and a literal “]”. A two-digit footnote such as [12] would need “\[\d+\]” instead.
Now that we’ve sorted out how to remove the footnote number from a string, let’s apply that to every item by looping through the list:
clean_src = []
for i in src:
    temp = re.sub(r'\[\d\]', '', i)
    clean_src.append(temp)
And if you are familiar with list comprehensions, you can write the loop in a single line instead of the above 4 lines:
clean_src = [re.sub(r'\[\d\]', '', i) for i in src]
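As a variation, here is a sketch that precompiles the pattern and uses \d+ so that footnotes with two or more digits are caught as well. The function name remove_footnote_markers and the sample sentences are made up for illustration. One caveat: this pattern also removes any other bracketed number in the text, such as a year written as [2020].

```python
import re

# \d+ matches one or more digits, so [1] and [12] are both removed.
FOOTNOTE = re.compile(r"\[\d+\]")


def remove_footnote_markers(lines):
    """Return a new list with bracketed footnote numbers stripped from each line."""
    return [FOOTNOTE.sub("", line) for line in lines]


# Invented sample lines.
sample = [
    "Is there a path to a huge market? [1]\n",
    "Another claim. [12]\n",
    "No marker here.\n",
]
cleaned = remove_footnote_markers(sample)
print(cleaned)
```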
Convert text to audio
Now we have a list with cleaned text. Let’s convert that list into a single string, then convert it to audio format using gTTS.
from gtts import gTTS

txt = ''.join(clean_src)
tts = gTTS(txt, lang='en', tld='ca')
tts.save('billion.mp3')
The gTTS library provides access to 70 different languages, plus several accents for a few languages. To see all the supported languages, type gtts-cli --all in the command prompt.
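To tie the last step together, here is a rough sketch that separates the text preparation from the network call. build_text and save_audio are hypothetical helper names. Dropping blank lines before joining is my own assumption, not something the article does, and I haven’t checked whether it changes how the audio sounds. The gTTS import lives inside save_audio so the rest of the script runs even without the library or an Internet connection.

```python
def build_text(lines):
    """Join cleaned lines into one string, skipping whitespace-only lines."""
    return "".join(line for line in lines if line.strip())


def save_audio(text, path, lang="en", tld="ca"):
    """Send text to Google Translate's TTS service and save the MP3 to path."""
    # Imported here so build_text() can be used without gTTS installed.
    from gtts import gTTS

    gTTS(text, lang=lang, tld=tld).save(path)


# Invented sample lines standing in for clean_src.
sample = ["By Paul Graham\n", "\n", "December 2020\n"]
text = build_text(sample)
print(repr(text))

# Uncomment to create the audio file (needs gTTS and an Internet connection):
# save_audio(text, "sample.mp3")
```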
We quickly walked through how to convert a text file into an audio file. You can apply this technique to any text, even to a full book! Now enjoy your free audio books. 😊