Python for Office 101

Last Updated on May 24, 2020 by Jay

Many people are scared of learning a programming language due to its learning curve. Indeed, even the basics of a programming language can cover a lot of content. In this guide, I will cover only the basics to get you started using Python in an office environment, thus Python for Office 101! By the end of this short tutorial, you will be able to use Python for simple office tasks, such as reading an Excel table into Python.

Data types

The most common data types we encounter in office are number and text. There are two number data types: int and float, which means integers and decimal numbers. The value 6 is an integer, and the value 3.14 is a float. Python provides a convenient way to check data types by using type() function.

The text data type is called string, which can be represented by one of the three ways:

single quotes
double quotes
three single/double quotes

Single quotes and double quotes are mostly used for string variables. Whereas the three quotes are mostly used as Python docstrings.

Another data type is array, which is a collection of items. There are four different types of arrays in Python, they are list, tuple, dictionary, and set. You can store any data type into them, below are examples of each:

Lists are written in square brackets: [1,2,3,4].

tuples are similar to lists, but are written in parenthesis: (‘python’, ‘in’, ‘office’).

dictionaries are written in curly brackets: {‘item_1’: ‘apple’, ‘item_2′:’banana’}. “item_1” and “item_2” are called dictionary keys, and “apple” and “banana” are called values.

sets are written in curly brackets as well {‘apple’, ‘banana’}, but it’s different from dictionary, where sets don’t have keys.

Lists, tuples, and dictionaries are mostly used, whereas sets are less common. We will talk about the differences among these data types in another tutorial, now let’s just focus on learning the basics.

Variable assignment

Variable assignment in Python is super easy. You don’t have to declare a variable or indicate its type like in VBA, all you need is type a variable name, followed by an equal sign, the followed by a value. Note we use single equal sign here.

a = 10
b = 2
c = 'hello world'
li = [1,2,3,4]

Once you assign a variable, Python automatically determines the data type stored in a specific variable.

Simple math operations

You can use either values or variables in a math operation. You can think about the variables as if they are “cells” or “named range” if you are familiar with Microsoft Excel.

10 + 2
a + b

Python variable assignment and math operation

Basic operators in Python

Sometimes referred to as the “arithmetic operators”. The following are the basic operators.

Operator	Example	Description
+	a + b = 13	Add a and b together
–	a – b = 7	Subtract b from a
*	a * b = 30	Multiply a and b
/	a / b = 3.33…	Divide b from a
**	a ** b = 1000	a to the power of b note it is not “^” like in Excel!
%	a % b = 1	Modulus – returns the remainder from the division
//	a // b = 3	Returns the integer part of the division. Almost like rounddown(a/b,0)

Basic Python Operators

Special operator (operation assignment)

There’s a special operator where you combine the basic arithmetic operator and the equal sign, e.g. += in Python, which first performs an operation (addition in this case) then assign the value back to the original variable itself. This is useful when you want to increment a variable itself by a certain value, for example, maybe you want to increase a counter by 1, normally you would do i = i + 1. The Pythonic way of doing it is i += 1.

i = 0
i += 1

You can use any basic operator with the equal sign together, so -=, *=, /=, etc. will also work similarly, so I won’t talk about them, but feel free to play around with these operators.

Conditional statement

The conditional statement is quite simple in Python, it goes like the following:

if <condition_1>:
    <action_1>
elif <condition_2>:
    <action_2>
...
...
else:
    <action_3>

You can have as many elif as you need, but there can be only one else condition.

Simple conditions can be written in a short form, sometimes referred to as the “one line if statement”, goes like the following:

<value_when_true> if <condition> else <value_when_false>

print('a is bigger than b') if a>b else print('b is bigger than a')

Python comparison operators

To compare if two values are equal, double equal signs are used instead of the single equal sign (the latter is used for variable assignment). For example, to check whether a and b are equal, it is a == b.

The most commonly used comparison operators are listed here:

Comparison Operator	Example	Description
==	a == b (false)	Check if a and b are equal
!=	a != b (true)	Check if a and b are not equal
>	a > b (true)	Check if a is greater than b
>=	a >= b (true)	Check if a is greater than or equal b
<	a < b (false)	Check if a is smaller than b
<=	a <= b (false)	Check if a is smaller than or equal b
<>	a <> b (true)	Check if a and b are not equal. Same as !=

Because Python doesn’t force users to declare a variable data type (it does so automatically), so when you use Python to load CSV or Excel data, it’s possible that Python thinks some data are integer, while others are strings.

You can’t compare data of different types. For example, comparing an integer and a string will sometimes make Python mad. See the below example. Note it’s interesting that Python allows the == comparison, but not the > comparison. This is because Python knows whether an integer 100 is the same as a text “100”, but it doesn’t know which one is greater.

Python sometimes doesn’t like you comparing an integer and a string

This kind of problem happens all the time in data analysis. You can fix the problem by forcing Python to convert a string to an integer.

You can also reverse the conversion from integer to string.

Looping

You can use looping to go through a collection of items (i.e. lists), or perform a number of repetitive actions. Looping starts with the keyword for. Let’s use our list li as an example, if we want to loop through each item in it:

for item in li:
   print(item)

In the above looping statement, “item” just refers to an individual element in the list (li). You can use almost any word or letter in place of “item”. For example, the following would work just fine:

for x in li:
    print(x)

Installing libraries

We’ll touch on installing Python libraries briefly here. If you need a more in-depth guide on how to install Python libraries, check out this guide here. There are several ways to install 3rd party Python libraries. I will go through my personal favorite, which I believe is also the easiest and the standard method – pip install.

pip is a program that comes with your standard Python installation, so you should always have it available. However, first time Python user might experience difficulty and encounter the following error message when doing a pip install:

This is probably because you didn’t check the “add PATH” during the Python installation. This guide will help you fix the problem.

pip install is preferred because you can just open a Command Prompt and type pip install library_name. As long as you know the library name, you will be able to download it. You don’t need to know where to download it, or where to save the library on your hard-drive. pip install takes care of all that, remember, all you need to know is the library name to install it! For example, for data analysis, we’ll always be using pandas library. To get it, simple type pip install pandas in your Command Prompt window.

Working with strings in Python

Strings are basically text data. We’ll work with strings a lot. For example, a file path is a string. Opening a file in Python will require you knowing the file path.

Python Raw String (r-String)

If you are using a Windows system like me, you might experience the following problem when trying to access a file path. Here I’m trying to assign a folder path to a variable a.

It turns out that, the backslash sign “\” is a special character in Python, which is called “escape character”. It’s used with other letters to represent certain whitespace characters. For example,

\n   ### starts a new line
\t   ### tab
\r   ### return

In order to avoid this problem, you can do the following. All three methods will work fine.

Use double backslash in the file path
Use slash instead of the backslash
Use Python raw string

My favorite is the Python raw string method, which means you add an “r” right in front of the string. The first two approaches require you to manually modify the backslashes inside the file path, which means extra work. By simply adding an “r”, you are telling Python, “hey, treat whatever follows as a pure text, ignore all the special characters”.

Python f-String

The Python f-string is a very powerful feature available starting in Python v3.6. f-string is used for formatting a string.

Since Python is written in C, the string formatting in C is also applicable in Python. However, it’s confusing so I don’t want to mention it here. I will show two different string formatting here. One is the .format() method, the other is the f-string method. Consider the following code.

Python string .format()

In the below example, curly brackets {} indicate a placeholder for variables inside a string.

a = 'apple'
b = 'banana'
s1 = 'i have a {}, but i want a {}'.format(a,b)

Below is a graphical explanation, the .format(a,b) function places the value of a into the first {}, and places the value of b into the second {}. This method has many disadvantages. For example, if you want to include many variables in string, you need to remember the position of each variable.

You can also insert the variables inside the curly brackets, with this method, the order of a and b doesn’t matter anymore.

Much easier, also you can control the variables a little bit better. However, you still need to type out quite some code .format(...).

My personal favorite is f-string for string formatting. Similar to the raw string (or r-string), we put an “f” right in front of the the string. We’ll keep the curly brackets {} to indicate it’s a place holder for variables, and then we insert the variables directly into those curly brackets. The above example becomes:

s1_f = f'i have a {a}, but i want a {b}'
s2_f = f'i have a {b}, but i want a {a}'

The f-string is so much better than the .format() method. The code is not only shorter and more readable, but it’s also more concise. You know where each variable is placed at almost intuitively.

Combining r-string and f-string

Yes, you can combine r-string and f-string into a powerful string formatting tool! Let’s revisit the previous example with file path. To combine them, it’s literally just putting “rf” right in front of a string!

file_1 = 'book1.xlsx'
file_path = rf'C:\Users\JZ\Desktop\PythonInOffice\{file_1}'

Combination of Python f-string and r-string

Loading an Excel file into Python

Okay, it’s practice time. We are going to load an Excel file into Python. Feel free to follow along using the sample Excel file here. pandas is the gold standard for data analysis work in Python, so we’ll use that. If it makes an Excel user feel more comfortable, a pandas dataframe is basically same as an Excel table. If you can accept that, working with pandas will come as natural as working with Excel files!

import pandas as pd

file_name = 'sheet1.xlsx'
file_path = rf'C:\Users\JZ\Desktop\PythonInOffice\Python for office 101\{file_name}'

df = pd.read_excel(file_path)