How do you compare the error between two trajectories? A ChatGPT's answer

after caliberating

The MAPE value between the two datasets is: 6.60% The SMAPE value between the two datasets is: 3.24%

Ask:

I want to calibrate my model parameters. Now I have the simulation data "sumo.csv" and real world data "01_tracks.csv" they all have header "frame, id, x "  which represents the trajectory of different vehicles with their ids over frame (Note that the names and number of ids in the two files are different). How do I calculate the similarity between the trajectories? My advisor suggested calculating MAPE value, or you could do it the other way and just give the code. NOTICE: only compute the first 1000 frame of each table

Answer

To calculate the similarity between the trajectories in "sumo.csv" and "01_tracks.csv", you can use Mean Absolute Percentage Error (MAPE) as suggested by your advisor. Here's a step-by-step process with Python code using the Pandas library:

Import the necessary libraries:
```
import pandas as pd
import numpy as np
```
Read the CSV files and filter the data to the first 1000 frames:

sumo_df = pd.read_csv("sumo.csv")
real_df = pd.read_csv("01_tracks.csv")

sumo_df = sumo_df[sumo_df['frame'] < 1000]
real_df = real_df[real_df['frame'] < 1000]

Normalize the IDs in both datasets by subtracting the minimum ID value:

sumo_df['id'] = sumo_df['id'] - sumo_df['id'].min()
real_df['id'] = real_df['id'] - real_df['id'].min()

Calculate the MAPE value:

def mape(real_df, sumo_df):
    merged_df = real_df.merge(sumo_df, on=['frame', 'id'], suffixes=('_real', '_sumo'))
    if merged_df.empty:
        raise ValueError("No matching ids found in both datasets.")

    absolute_errors = np.abs(merged_df['x_real'] - merged_df['x_sumo'])
    percentage_errors = absolute_errors / np.abs(merged_df['x_real'])
    mape_value = np.mean(percentage_errors) * 100
    return mape_value

mape_value = mape(real_df, sumo_df)

But

the id is in different naming system in two table

Answer

If the IDs in the two tables are in different naming systems and you cannot directly match them, you can calculate the similarity between the trajectories by finding the closest matching pairs of IDs. You can do this by calculating the Euclidean distance between the coordinates of the same frame in both datasets.

Here's a step-by-step process with Python code using the Pandas library:

def euclidean_distance(point1, point2):
    return np.sqrt(np.sum((point1 - point2) ** 2))
# Calculate the MAPE value by finding the closest matching pairs of IDs:
def mape(real_df, sumo_df):
    total_percentage_errors = []
    for frame in real_df['frame'].unique():
        real_frame_data = real_df[real_df['frame'] == frame]
        sumo_frame_data = sumo_df[sumo_df['frame'] == frame]

        for _, real_row in real_frame_data.iterrows():
            real_point = np.array([real_row['x']])
            min_distance = float('inf')
            min_sumo_point = None

            for _, sumo_row in sumo_frame_data.iterrows():
                sumo_point = np.array([sumo_row['x']])
                distance = euclidean_distance(real_point, sumo_point)

                if distance < min_distance:
                    min_distance = distance
                    min_sumo_point = sumo_point

            percentage_error = np.abs(real_point - min_sumo_point) / np.abs(real_point)
            total_percentage_errors.append(percentage_error)

    mape_value = np.mean(total_percentage_errors) * 100
    return mape_value

mape_value = mape(real_df, sumo_df)