Series

How to change a Series type?

import pandas as pd

series = pd.Series([1, 2, 3, 4])
series.astype(float)
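
One detail that is easy to miss: astype returns a new Series and leaves the original untouched, so the result has to be assigned to something. A minimal sketch (the name as_float is only for illustration):

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4])

# astype does not modify `series` in place; it returns a converted copy.
as_float = series.astype(float)

print(series.dtype)    # still an integer dtype
print(as_float.dtype)  # float64
```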

How to apply a function to every item of my Series?

import pandas as pd

series = pd.Series(['a', 'b', 'b', 'a'])
series.apply(lambda x: 0 if x == 'a' else 1)

How to prepare my DataFrame to apply get_dummies?

import pandas as pd

X = pd.read_csv(..)
categorical = ['x1', 'x2', 'x4']  # columns that have categorical features in your X

for cat in categorical: 
    X[cat] = X[cat].astype(object)

X_dummy = pd.get_dummies(X)
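
The cast matters because, by default, get_dummies only encodes columns with object, string or category dtype; a categorical feature stored as integers is passed through unchanged. Here is a small sketch with a toy DataFrame standing in for the CSV (the data and the x3 column are made up for this example):

```python
import pandas as pd

# Toy data standing in for the CSV file; values are invented.
X = pd.DataFrame({
    'x1': [1, 2, 1],        # integer codes that are really categories
    'x3': [0.5, 1.5, 2.5],  # a genuinely numeric column
})

# Without the cast, the integer column is left as it is.
untouched = pd.get_dummies(X)

# After the cast, x1 is expanded into one indicator column per value.
X['x1'] = X['x1'].astype(object)
X_dummy = pd.get_dummies(X)

print(list(untouched.columns))
print(list(X_dummy.columns))
```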

Encoding errors with read_csv

Usually you can read a csv just by doing something like:

pd.read_csv('file.csv')

Sometimes an encoding error appears. The first thing to try is passing 'utf8' as the value of the encoding parameter.

pd.read_csv('file.csv', encoding='utf8')

But there are some cases where this is not enough and the following error keeps appearing:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 4: invalid continuation byte

The only thing that resolved it for me was switching the encoding to 'latin-1':

pd.read_csv('file.csv', encoding='latin-1')
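
If you do not know the encoding in advance, the two attempts can be wrapped in a small helper. This is only a sketch: read_csv_with_fallback is a hypothetical name, not part of pandas, and the list of encodings is an assumption. Keep in mind that 'latin-1' maps every possible byte to a character, so it never raises; if the file is really in another encoding you will get wrong characters instead of an error.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=('utf-8', 'latin-1')):
    """Hypothetical helper: try each encoding in order until one works."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            # Remember the failure and move on to the next candidate.
            last_error = err
    # Every candidate failed: surface the last decoding error.
    raise last_error
```

You would call it as read_csv_with_fallback('file.csv') in place of pd.read_csv('file.csv').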

Sum values across all columns (one total per row)

df.sum(axis=1)
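
The axis argument is what decides the direction of the sum, and it is easy to mix up. A small sketch with a made-up DataFrame:

```python
import pandas as pd

# Invented data, just to make the directions visible.
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

row_totals = df.sum(axis=1)        # adds across the columns: one value per row
column_totals = df.sum(axis=0)     # the default: one value per column
grand_total = df.to_numpy().sum()  # a single number for the whole table

print(row_totals.tolist())     # [11, 22]
print(column_totals.tolist())  # [3, 30]
print(grand_total)             # 33
```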

Use apply with multiple columns

def my_function(a, b):
  return a + b


df.apply(lambda row: my_function(row['a'], row['b']), axis=1)
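
To see it end to end, here is the same call on a made-up DataFrame, with the result stored in a new column (the column name total is my own choice). When the function only uses operations that also work on whole columns, as this one does, you can usually pass the columns directly and skip the row-by-row apply.

```python
import pandas as pd


def my_function(a, b):
    return a + b


# Invented data for illustration.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# Row by row: the lambda receives each row as a Series.
df['total'] = df.apply(lambda row: my_function(row['a'], row['b']), axis=1)

# Whole columns at once: works here because + is defined for Series.
vectorized = my_function(df['a'], df['b'])

print(df['total'].tolist())  # [11, 22, 33]
print(vectorized.tolist())   # [11, 22, 33]
```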

Cheers!
Letícia