Interactively exploring pandas dataframes

This script lets you explore a pandas dataframe interactively. It displays the dataframe one page at a time and prompts you to move between pages by typing “n” (next) or “p” (previous) and pressing Enter.

First, we need to define the function that displays the dataframe in pages:

import pandas as pd
from IPython.display import clear_output

def display_df(df, start_index, page_size):
    """
    Displays a pandas dataframe one page at a time. Enter "n" or "p" to move to the next or previous page; any other input exits.
    """
    n_rows = len(df)
    while True:
        # Recompute the end of the page on every pass so it never runs past the last row.
        end_index = min(start_index + page_size, n_rows)
        clear_output(wait=True)
        print(df.iloc[start_index:end_index])
        print(f"Showing rows {start_index + 1} to {end_index} of {n_rows}")
        print("")
        print("Enter 'n' for next page, 'p' for previous page, or anything else to exit...")
        key = input()
        if key == "n":
            start_index += page_size
            if start_index >= n_rows:
                # Past the last row: wrap around to the first page.
                start_index = 0
        elif key == "p":
            start_index -= page_size
            if start_index < 0:
                # Before the first row: wrap around to the last page_size rows.
                # max() keeps the index non-negative for short dataframes.
                start_index = max(0, n_rows - page_size)
        else:
            break
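
The wrap-around arithmetic in the loop can be checked without any keyboard input by pulling it into a pure function. The following is only a sketch: `next_start()` is a hypothetical helper that is not part of the script, and it assumes that moving back from the first page should land on the last `page_size` rows, with a guard for dataframes shorter than one page.

```python
def next_start(start_index, page_size, n_rows, key):
    """Return the start row of the page to show after `key` is entered.

    Hypothetical helper for illustration; returns None when the viewer should exit.
    """
    if key == "n":
        new_start = start_index + page_size
        # Moving past the last row wraps around to the first page.
        return 0 if new_start >= n_rows else new_start
    if key == "p":
        new_start = start_index - page_size
        # Moving before the first row wraps around to the last `page_size` rows.
        return max(0, n_rows - page_size) if new_start < 0 else new_start
    return None


# With 10 rows and 5 rows per page there are two pages, starting at rows 0 and 5.
print(next_start(0, 5, 10, "n"))  # 5
print(next_start(5, 5, 10, "n"))  # 0 (wrapped to the first page)
print(next_start(0, 5, 10, "p"))  # 5 (wrapped to the last page)
print(next_start(0, 5, 10, "q"))  # None (exit)
```

Keeping the arithmetic separate from `input()` and `print()` makes edge cases, such as a dataframe with fewer rows than one page, easy to test.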

Now we can use the `display_df()` function to interactively explore a pandas dataframe. Let’s start by creating a small sample dataframe:

df = pd.DataFrame({
    'col1': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
    'col2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
})

To display the dataframe in pages, we can call the `display_df()` function with the dataframe, the starting index, and the page size:

start_index = 0
page_size = 5
display_df(df, start_index, page_size)

This will display the first five rows and prompt you to move between pages with the “n” and “p” keys (type the key and press Enter).

You can modify the `display_df()` function and the example code above to work with your own pandas dataframes.
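
When adapting the code to a larger dataframe, it can help to preview the page boundaries without the interactive prompt. Here is a minimal sketch, assuming a hypothetical `page_bounds()` helper that is not part of the script above:

```python
def page_bounds(n_rows, page_size):
    """Yield (start, end) row positions for each page, in order.

    Hypothetical helper; `end` is exclusive, matching `df.iloc[start:end]`.
    """
    if page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    for start in range(0, n_rows, page_size):
        # Clamp the end so the last page stops at the final row.
        yield start, min(start + page_size, n_rows)


# 12 rows at 5 rows per page: the last page is shorter than the others.
print(list(page_bounds(12, 5)))  # [(0, 5), (5, 10), (10, 12)]
```

Each pair can be passed straight to `iloc`, for example `for start, end in page_bounds(len(df), page_size): print(df.iloc[start:end])`.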

Written on April 14, 2023