How to do a complex selection of rows in a pandas DataFrame in Python


Question:

I have a large DataFrame like the one below (only the first rows are shown; the real one has more than 60000k rows):

Id  Name    Age  Friends
0   Will    33   385
1   Jean    26   2
2   Hugh    55   221
3   Deanna  40   465
4   Quark   68   21
5   Weyoun  59   318
6   Gowron  37   220
7   Will    54   307
8   Jadzia  38   380
9   Hugh    27   181
10  Odo     53   191
11  Ben     57   372
........

I would like to store in another DataFrame the first 12 rows out of every 100. I know that with .loc and .iloc you can take 1 row out of every n (100 in the example below):

df1 = df.loc[::100] 

I am trying to avoid iterating over the DataFrame with a for loop, because it is so large that the process slows down considerably. Is there a way to achieve this more complex row selection with .loc?


Answer 1:

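You can take each index value modulo 100, which strips off everything above the last two digits, so e.g. 200-299 becomes 0-99, 123000-123099 becomes 0-99, etc., and then keep only the rows where the result is less than 12. This assumes the DataFrame has the default integer index (0, 1, 2, ...):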

filtered = df[df.index % 100 < 12]
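
To see what the filter keeps, here is a minimal sketch on made-up data. The 250-row frame and its column values are invented for illustration and are not the asker's data. It also shows a position-based variant (my addition, not part of the answer above) that builds the same mask from row positions with np.arange, which should help when the index is not the default 0, 1, 2, ...

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 250 rows with the default RangeIndex.
df = pd.DataFrame({
    "Age": np.arange(250) % 60 + 18,
    "Friends": np.arange(250) * 2,
})

# Label-based filter from the answer: relies on the index being 0, 1, 2, ...
filtered = df[df.index % 100 < 12]

# Position-based variant: the mask is built from row positions,
# so it does not depend on the index labels.
mask = np.arange(len(df)) % 100 < 12
filtered_by_position = df.iloc[mask]

# Rows 0-11, 100-111 and 200-211 are kept.
print(filtered.index.tolist())
```

Both versions are vectorized, so no Python-level loop runs over the rows.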