Sometimes we want to work with a list of integers, a list of strings, or even a list of variables. Such containers are called data structures, or collections. Python offers several collection types that can help:

Data structures

List

A list holds items of any kind and can be of any size. It is mutable and ordered. Think of it as a way to store, for example, the grades of several students on a test.

grades = [13, 15, 11, 9, 20]

A list can hold any kind of item, for example:

a = 3
b = "Hello"
var_a = [a, b]

A list can also contain other lists:

var_a = [[2,0], [1,1]]

Once you have created a collection, you might want to retrieve the value of a certain element, e.g. the grade of a certain student. This is called indexing. To index a list, put the index of the item in square brackets after the list name:

grades = [13, 15, 11, 9, 20]
grades[0]
13

In Python, indexing starts at 0, which means the first value sits at position 0. The second element is therefore at index 1.

grades[1]
15

To retrieve values between indexes 1 (included) and 3 (excluded), you can simply type:

grades[1:3]
[15, 11]

The output is a sub-list that contains only the values we are interested in. You can retrieve the elements from the 2nd position (included) until the end using:

grades[1:]
[15, 11, 9, 20]

All you need to do is leave out the end index. To take the elements from the beginning up to, but not including, the 3rd element, leave out the start index instead:

grades[:2]
[13, 15]

You can also count backwards from the end by using a negative index: -1 is the last element, -2 is the one before it, and so on.

grades[-1]
20
grades[-2]
9

Negative indexes work in slices too. To get the last two elements:

grades[-2:]
[9, 20]
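As a recap, the short sketch below gathers the indexing and slicing forms above in one place. It also illustrates two points of standard Python behaviour that the text does not state: a slice is a new list, so changing it leaves the original untouched, and chained indexes reach into a nested list. The names middle, last_two and nested are only illustrations.

```python
grades = [13, 15, 11, 9, 20]

# Single elements: positive indexes count from the start, negative from the end.
first = grades[0]      # 13
last = grades[-1]      # 20

# Slices: the start index is included, the end index is excluded.
middle = grades[1:3]       # [15, 11]
last_two = grades[-2:]     # [9, 20]

# A slice is a new list, so changing it does not affect the original.
middle[0] = 0
print(middle)    # [0, 11]
print(grades)    # [13, 15, 11, 9, 20]

# Chained indexes reach into a nested list (hypothetical example name).
nested = [[2, 0], [1, 1]]
print(nested[1][0])    # 1
```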

You can replace an element of a list by assigning to its index:

grades[0] = 14
grades
[14, 15, 11, 9, 20]

To add a new element at the end of a list, we say that we append it, and use the list's append method:

grades.append(15)

If you want to specify the position of the new element, use the insert method, which takes the index first and the value second:

grades.insert(2, 11)

You don’t need to write grades = grades.append(...). Both methods modify the original list in place and return None, so that assignment would actually replace your list with None.
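To make the in-place behaviour concrete, here is a minimal sketch that starts from the original grades and applies append and insert in turn. The variable result is a hypothetical name, used only to show what append returns.

```python
grades = [13, 15, 11, 9, 20]

# append adds the value at the end and changes grades itself.
result = grades.append(15)
print(result)    # None
print(grades)    # [13, 15, 11, 9, 20, 15]

# insert(index, value) places the value at that index
# and shifts the following elements one position to the right.
grades.insert(2, 11)
print(grades)    # [13, 15, 11, 11, 9, 20, 15]
```

Because append returns None, writing grades = grades.append(15) would leave grades equal to None rather than to the updated list.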

A list has a length, and you can use the built-in function len to count its elements. Starting again from the original grades:

grades = [13, 15, 11, 9, 20]
len(grades)
5

To get the maximum of a list, use the built-in function max:

max(grades)
20

The same goes for the minimum with min. You can also compute the sum with the built-in function sum. There are only a few built-in functions to remember in Python for data analytics, and don’t worry, we’ve already covered a good part of them.

To compute the average of a list, you can divide the sum by the number of elements:

sum(grades) / len(grades)
13.6
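The sketch below puts these summary functions side by side on the original grades. The variable names (lowest, highest, total, average) are only illustrations.

```python
grades = [13, 15, 11, 9, 20]

lowest = min(grades)             # smallest value: 9
highest = max(grades)            # largest value: 20
total = sum(grades)              # 13 + 15 + 11 + 9 + 20 = 68
average = total / len(grades)    # 68 / 5 = 13.6

print(lowest, highest, total, average)
```

Note that the division fails with a ZeroDivisionError on an empty list, since len would then return 0.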

Tuple

A tuple is like a list, except that it cannot be modified once created (it is immutable).

var_a = (1,3,4,5)

You cannot add an element to a tuple or modify it. You have to create a second tuple or convert it into a list first.
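Here is a small sketch of what immutability means in practice. Reading from a tuple works as it does for a list, assigning to an index raises a TypeError, and converting to a list gives a way around it. The names as_list and var_b are hypothetical.

```python
var_a = (1, 3, 4, 5)

# Indexing and slicing work the same way as for lists.
print(var_a[0])      # 1
print(var_a[1:3])    # (3, 4)

# Assigning to an index is not allowed on a tuple.
try:
    var_a[0] = 10
except TypeError as error:
    print("Cannot modify a tuple:", error)

# To "change" a tuple, convert it to a list, edit the list, and build a new tuple.
as_list = list(var_a)
as_list.append(6)
var_b = tuple(as_list)
print(var_b)         # (1, 3, 4, 5, 6)
```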

Set

A set is an unordered collection made of unique elements only. If you add duplicates, the set keeps each value only once.

var_a = set([1, 2, 3, 4])
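As a quick illustration of the uniqueness rule, the sketch below builds a set from a list with repeated values. The name unique_grades is hypothetical, and sorted is used only to print the elements in a predictable order, since a set has no order of its own.

```python
# Duplicates in the input are dropped when the set is created.
unique_grades = set([13, 15, 13, 9, 15])
print(len(unique_grades))       # 3

# Adding a value that is already present changes nothing.
unique_grades.add(9)
# Adding a new value grows the set.
unique_grades.add(20)
print(sorted(unique_grades))    # [9, 13, 15, 20]

# Membership tests use the in operator.
print(15 in unique_grades)      # True
```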

Dictionary

A dictionary stores each item as a key with an associated value. You can use it, for example, to store the id of a customer together with the value attached to that id.

dict_a = {'id_1':13, 'id_2': 15}

A dictionary can also contain other dictionaries as values. For example:

dict_b = {'id_1':{'address':'London', 'paid':180}, 'id_2':{'address':'Paris', 'paid':220}}

You can retrieve the value in dict_a for the key id_1 by simply using square brackets:

dict_a['id_1']
13
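The same square-bracket lookup can be chained to reach into a nested dictionary such as dict_b. The sketch below also shows what happens with a key that does not exist, using the made-up key 'id_9': square brackets raise a KeyError, while the dictionary's get method returns None or a default of your choice. The names missing and fallback are only illustrations.

```python
dict_a = {'id_1': 13, 'id_2': 15}
dict_b = {
    'id_1': {'address': 'London', 'paid': 180},
    'id_2': {'address': 'Paris', 'paid': 220},
}

# Chain two keys to reach a value inside a nested dictionary.
print(dict_b['id_1']['address'])    # London
print(dict_b['id_2']['paid'])       # 220

# Square brackets raise KeyError when the key is missing.
try:
    dict_a['id_9']
except KeyError:
    print("id_9 is not in dict_a")

# get() returns None for a missing key, or the default you pass in.
missing = dict_a.get('id_9')
fallback = dict_a.get('id_9', 0)
print(missing, fallback)            # None 0
```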

You can add a new element to the dictionary by assigning a value to a new key:

dict_a['id_3'] = 12
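One detail worth seeing in action: the same assignment syntax adds an entry when the key is new and overwrites the value when the key already exists. A minimal sketch, where the value 14 is made up for the example:

```python
dict_a = {'id_1': 13, 'id_2': 15}

# New key: an entry is added.
dict_a['id_3'] = 12
print(len(dict_a))         # 3

# Existing key: the old value is replaced, no new entry is created.
dict_a['id_1'] = 14
print(len(dict_a))         # 3
print(dict_a)              # {'id_1': 14, 'id_2': 15, 'id_3': 12}

# The in operator checks whether a key is present.
print('id_3' in dict_a)    # True
```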

That’s it for data structures. You’ll manipulate them a lot in the next sections.

If you found the article useful or see ways in which it could be improved, please leave a comment :)