Pandas: MultiIndex DataFrame block-wise common timestamps


In a data file, I have samples from 3 devices (first column) which measure 3 different parameters (last 3 columns) at 6 different heights (second column). The file looks like:

0 2 2018-09-06T08:38:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:38:41Z 8.90 5.98 3.43
0 4 2018-09-06T08:38:52Z 8.92 6.17 3.62
0 5 2018-09-06T08:39:03Z 8.96 6.13 3.56
1 0 2018-09-06T08:39:35Z 8.96 6.23 3.50
1 1 2018-09-06T08:39:45Z 9.01 6.45 3.43
1 2 2018-09-06T08:39:56Z 9.07 6.44 3.56
1 3 2018-09-06T08:40:06Z 9.08 6.49 3.56
1 4 2018-09-06T08:40:17Z 8.81 6.21 3.43
1 5 2018-09-06T08:40:28Z 9.05 6.43 3.31
2 0 2018-09-06T08:41:00Z 9.19 6.35 3.50
2 1 2018-09-06T08:41:10Z 8.83 6.31 3.50
2 2 2018-09-06T08:41:21Z 8.87 6.08 3.37
2 3 2018-09-06T08:41:31Z 9.01 6.39 3.43
2 4 2018-09-06T08:41:42Z 8.68 6.20 3.37
2 5 2018-09-06T08:41:52Z 8.87 6.39 3.43
0 2 2018-09-06T08:43:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:43:41Z 8.90 5.98 3.43
0 4 2018-09-06T08:43:52Z 8.92 6.17 3.62
0 5 2018-09-06T08:43:03Z 8.96 6.13 3.56
1 0 2018-09-06T08:46:35Z 8.96 6.23 3.50
1 1 2018-09-06T08:46:45Z 9.01 6.45 3.43
1 2 2018-09-06T08:46:56Z 9.07 6.44 3.56
1 3 2018-09-06T08:46:06Z 9.08 6.49 3.56
1 4 2018-09-06T08:46:17Z 8.81 6.21 3.43
1 5 2018-09-06T08:46:28Z 9.05 6.43 3.31
2 0 2018-09-06T08:48:00Z 9.19 6.35 3.50
2 1 2018-09-06T08:48:10Z 8.83 6.31 3.50
2 2 2018-09-06T08:48:21Z 8.87 6.08 3.37
2 3 2018-09-06T08:48:31Z 9.01 6.39 3.43
2 4 2018-09-06T08:48:42Z 8.68 6.20 3.37
2 5 2018-09-06T08:48:52Z 8.87 6.39 3.43

The parameters get read per height per device. I would now like to modify the data frame such that I can select a parameter and be able to plot the time series for each height of each device.

My first approach was to give every row within a "device block" the same timestamp, i.e.

0 2 2018-09-06T08:38:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:38:31Z 8.90 5.98 3.43
0 4 2018-09-06T08:38:31Z 8.92 6.17 3.62
0 5 2018-09-06T08:38:31Z 8.96 6.13 3.56
1 0 2018-09-06T08:39:35Z 8.96 6.23 3.50
1 1 2018-09-06T08:39:35Z 9.01 6.45 3.43
1 2 2018-09-06T08:39:35Z 9.07 6.44 3.56
1 3 2018-09-06T08:39:35Z 9.08 6.49 3.56
1 4 2018-09-06T08:39:35Z 8.81 6.21 3.43
1 5 2018-09-06T08:39:35Z 9.05 6.43 3.31
2 0 2018-09-06T08:41:00Z 9.19 6.35 3.50
2 1 2018-09-06T08:41:00Z 8.83 6.31 3.50
2 2 2018-09-06T08:41:00Z 8.87 6.08 3.37
2 3 2018-09-06T08:41:00Z 9.01 6.39 3.43
2 4 2018-09-06T08:41:00Z 8.68 6.20 3.37
2 5 2018-09-06T08:41:00Z 8.87 6.39 3.43
0 2 2018-09-06T08:43:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:43:31Z 8.90 5.98 3.43
0 4 2018-09-06T08:43:31Z 8.92 6.17 3.62
0 5 2018-09-06T08:43:31Z 8.96 6.13 3.56
1 0 2018-09-06T08:46:35Z 8.96 6.23 3.50
1 1 2018-09-06T08:46:35Z 9.01 6.45 3.43
1 2 2018-09-06T08:46:35Z 9.07 6.44 3.56
1 3 2018-09-06T08:46:35Z 9.08 6.49 3.56
1 4 2018-09-06T08:46:35Z 8.81 6.21 3.43
1 5 2018-09-06T08:46:35Z 9.05 6.43 3.31
2 0 2018-09-06T08:48:00Z 9.19 6.35 3.50
2 1 2018-09-06T08:48:00Z 8.83 6.31 3.50
2 2 2018-09-06T08:48:00Z 8.87 6.08 3.37
2 3 2018-09-06T08:48:00Z 9.01 6.39 3.43
2 4 2018-09-06T08:48:00Z 8.68 6.20 3.37
2 5 2018-09-06T08:48:00Z 8.87 6.39 3.43
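
One possible way to get this per-block layout, as a sketch rather than a verified solution: label each run of consecutive rows with the same device number, then broadcast the first timestamp of each run. The `block` column and the shortened inline sample are assumptions made for illustration only.

```python
import io
import pandas as pd

# A few rows in the same layout as the file (two devices, two passes).
raw = """\
0 2 2018-09-06T08:38:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:38:41Z 8.90 5.98 3.43
1 0 2018-09-06T08:39:35Z 8.96 6.23 3.50
1 1 2018-09-06T08:39:45Z 9.01 6.45 3.43
0 2 2018-09-06T08:43:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:43:41Z 8.90 5.98 3.43
"""
col_names = ['device', 'height', 'timestamp', 'p1', 'p2', 'p3']
df = pd.read_csv(io.StringIO(raw), sep=' ', names=col_names)
df['timestamp'] = pd.to_datetime(df['timestamp'])

# A new block starts whenever the device number differs from the previous row.
# `block` is a hypothetical helper column, not part of the original data.
df['block'] = (df['device'] != df['device'].shift()).cumsum()

# Overwrite every timestamp with the first timestamp of its block.
df['timestamp'] = df.groupby('block')['timestamp'].transform('first')
```

This relies on the rows being in file order, so that each device block is contiguous.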

Alternatively, every row in a "group" of devices (one full pass over devices 0 to 2) could share the same timestamp, i.e.:

0 2 2018-09-06T08:38:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:38:31Z 8.90 5.98 3.43
0 4 2018-09-06T08:38:31Z 8.92 6.17 3.62
0 5 2018-09-06T08:38:31Z 8.96 6.13 3.56
1 0 2018-09-06T08:38:31Z 8.96 6.23 3.50
1 1 2018-09-06T08:38:31Z 9.01 6.45 3.43
1 2 2018-09-06T08:38:31Z 9.07 6.44 3.56
1 3 2018-09-06T08:38:31Z 9.08 6.49 3.56
1 4 2018-09-06T08:38:31Z 8.81 6.21 3.43
1 5 2018-09-06T08:38:31Z 9.05 6.43 3.31
2 0 2018-09-06T08:38:31Z 9.19 6.35 3.50
2 1 2018-09-06T08:38:31Z 8.83 6.31 3.50
2 2 2018-09-06T08:38:31Z 8.87 6.08 3.37
2 3 2018-09-06T08:38:31Z 9.01 6.39 3.43
2 4 2018-09-06T08:38:31Z 8.68 6.20 3.37
2 5 2018-09-06T08:38:31Z 8.87 6.39 3.43
0 2 2018-09-06T08:43:20Z 9.01 6.20 3.56
0 3 2018-09-06T08:43:20Z 8.90 5.98 3.43
0 4 2018-09-06T08:43:20Z 8.92 6.17 3.62
0 5 2018-09-06T08:43:20Z 8.96 6.13 3.56
1 0 2018-09-06T08:43:20Z 8.96 6.23 3.50
1 1 2018-09-06T08:43:20Z 9.01 6.45 3.43
1 2 2018-09-06T08:43:20Z 9.07 6.44 3.56
1 3 2018-09-06T08:43:20Z 9.08 6.49 3.56
1 4 2018-09-06T08:43:20Z 8.81 6.21 3.43
1 5 2018-09-06T08:43:20Z 9.05 6.43 3.31
2 0 2018-09-06T08:43:20Z 9.19 6.35 3.50
2 1 2018-09-06T08:43:20Z 8.83 6.31 3.50
2 2 2018-09-06T08:43:20Z 8.87 6.08 3.37
2 3 2018-09-06T08:43:20Z 9.01 6.39 3.43
2 4 2018-09-06T08:43:20Z 8.68 6.20 3.37
2 5 2018-09-06T08:43:20Z 8.87 6.39 3.43
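
A sketch of the per-group variant, assuming the devices are always read in ascending order so that a drop in the device number marks the start of a new pass. The `cycle` column is a hypothetical helper, and this version reuses the first timestamp actually recorded in each pass.

```python
import io
import pandas as pd

# One row per device for two passes, in the same layout as the file.
raw = """\
0 5 2018-09-06T08:39:03Z 8.96 6.13 3.56
1 0 2018-09-06T08:39:35Z 8.96 6.23 3.50
2 0 2018-09-06T08:41:00Z 9.19 6.35 3.50
0 2 2018-09-06T08:43:31Z 9.01 6.20 3.56
1 0 2018-09-06T08:46:35Z 8.96 6.23 3.50
2 0 2018-09-06T08:48:00Z 9.19 6.35 3.50
"""
col_names = ['device', 'height', 'timestamp', 'p1', 'p2', 'p3']
df = pd.read_csv(io.StringIO(raw), sep=' ', names=col_names)
df['timestamp'] = pd.to_datetime(df['timestamp'])

# The device number only decreases when a new pass over all devices begins.
df['cycle'] = (df['device'].diff() < 0).cumsum()

# Overwrite every timestamp with the first timestamp of its pass.
df['timestamp'] = df.groupby('cycle')['timestamp'].transform('first')
```

If the device order is not guaranteed, a different rule for detecting the start of a pass would be needed.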

Then I think I could further group and index the data frame according to my needs. Unfortunately, I am already stuck on the task just described :-)
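
For the later grouping and indexing step, a rough sketch could look like this, assuming the timestamps have already been made common per pass: pick one parameter and spread it into one column per (device, height) pair, so that each column is a time series. The name `wide` and the shortened sample are illustrative only.

```python
import io
import pandas as pd

# Sample rows whose timestamps are already identical within each pass.
raw = """\
0 2 2018-09-06T08:38:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:38:31Z 8.90 5.98 3.43
1 0 2018-09-06T08:38:31Z 8.96 6.23 3.50
0 2 2018-09-06T08:43:20Z 9.01 6.20 3.56
0 3 2018-09-06T08:43:20Z 8.90 5.98 3.43
1 0 2018-09-06T08:43:20Z 8.96 6.23 3.50
"""
col_names = ['device', 'height', 'timestamp', 'p1', 'p2', 'p3']
df = pd.read_csv(io.StringIO(raw), sep=' ', names=col_names)
df['timestamp'] = pd.to_datetime(df['timestamp'])

# Rows: timestamps. Columns: MultiIndex of (device, height). Values: p1.
wide = (df.set_index(['timestamp', 'device', 'height'])['p1']
          .unstack(['device', 'height']))

# Time series of p1 for device 0 at height 3.
series = wide[(0, 3)]
```

Calling `wide.plot()` should then draw one line per (device, height) column, provided matplotlib is installed. Note that `unstack` raises an error if a (timestamp, device, height) combination occurs more than once.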

What I have so far in terms of code:

import pandas as pd

col_names = ['device', 'height', 'timestamp', 'p1', 'p2', 'p3']
# on_bad_lines='skip' (pandas >= 1.3) replaces the removed error_bad_lines=False
df = pd.read_csv(file, names=col_names, index_col=False,
                 sep=' ', engine='python', on_bad_lines='skip')

df['timestamp'] = pd.to_datetime(df['timestamp'], errors='coerce')

I can achieve a MultiIndex data frame via

df.set_index(['device', 'height', 'timestamp'], inplace=True)

But then I have no idea how to set the timestamps in each "block" to a common value (e.g. the timestamp of the block's first row).
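
If the timestamp is already part of the MultiIndex, one option (a sketch under the same contiguous-block assumption, not a tested recipe for the full file) is to move just that level back to a column, overwrite it, and append it to the index again. The `device` and `block` names are hypothetical helpers.

```python
import io
import pandas as pd

raw = """\
0 2 2018-09-06T08:38:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:38:41Z 8.90 5.98 3.43
1 0 2018-09-06T08:39:35Z 8.96 6.23 3.50
1 1 2018-09-06T08:39:45Z 9.01 6.45 3.43
0 2 2018-09-06T08:43:31Z 9.01 6.20 3.56
0 3 2018-09-06T08:43:41Z 8.90 5.98 3.43
"""
col_names = ['device', 'height', 'timestamp', 'p1', 'p2', 'p3']
df = pd.read_csv(io.StringIO(raw), sep=' ', names=col_names)
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.set_index(['device', 'height', 'timestamp'])

# Turn only the timestamp level back into a column; device/height stay indexed.
df = df.reset_index('timestamp')

# Label runs of consecutive rows that share a device number.
device = pd.Series(df.index.get_level_values('device'))
block = (device != device.shift()).cumsum().to_numpy()

# Use the first timestamp of each block, then restore the three-level index.
df['timestamp'] = df.groupby(block)['timestamp'].transform('first')
df = df.set_index('timestamp', append=True)
```

Doing the replacement before the first `set_index` call avoids the round trip and is probably simpler when that is possible.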

I already searched StackOverflow and the pandas docs. Maybe I am just using the wrong search terms, but so far I could not find anything helpful. Any help would be highly appreciated! I hope my description of the issue is understandable.

Thanks a lot in advance!
