Around 85% of the world data is unstructured, and a good part of it is text data. As a data analyst or data scientist, it is a common job to handle text data, and therfore strings.

Character Strings

Character strings are sequences of individual letters which all together form a string.

a = "My cat is 8"

Now, we’ll dive into some key functions that you can use on strings.

You can retrieve the length of a string by simply using the keyword len :

len(a)

11
``

You can count the number of occurences of a specific letter by using `.count`:

```python
a.count("M")

You can apply a lower, upper or title transformation to a string (upper, lower, title).

a.upper()

MY CAT IS 8

You can replace a letter or several letters by another letter or group of letters.

a.replace("My", "my")

my cat is 8

What if I now want to split my string into a list of words, that is to obtain a list of words? You can use the split keyword:

a.split()

["My", "cat", "is", "8"]

You can specify in the parenthesis the character on which it should split, by default it is the space.

You can also index a string to extract a sub-string:

a[1:6]

y cat

We start the indexing in 0, and exclude the 6th element. Notice how the space counts as a character too.

If you found the article useful or see ways in which it could be improved, please leave a comment :)

Manipulating strings

Maël Fabien

Character Strings