Converts pandas DataFrames and Series, NumPy arrays and recarrays, or a dictionary of individual timeseries into a pandas DataFrame with a single datetime index. For all arrays, DataFrames, and Series, the first column is assumed to contain the timestamps.
Convert various tabular data formats to a timeseries DataFrame.
Args:
    data (Union[pd.DataFrame, pd.Series, dict, np.ndarray, np.recarray]): The input data to be converted.
    timezone (str, optional): The timezone to set for the index of the DataFrame. Defaults to ‘UTC’.
    columnnames (Optional[List[str]]): The column names to use for the DataFrame. Defaults to None.
Returns:
    pd.DataFrame: The converted timeseries DataFrame with the index set to the specified timezone.
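As a rough illustration of the "first column contains the timestamps" convention described above, the sketch below turns a NumPy array into a timezone-aware DataFrame. The helper name `array_to_timeseries` and the assumption that timestamps are Unix epoch seconds are hypothetical; they are not part of the library and only mirror the documented arguments.

```python
import numpy as np
import pandas as pd

def array_to_timeseries(data, timezone="UTC", columnnames=None):
    """Hypothetical helper: column 0 holds epoch seconds, the rest hold values."""
    arr = np.asarray(data, dtype=float)
    # First column is taken as the timestamps (assumed to be Unix epoch seconds).
    index = pd.to_datetime(arr[:, 0], unit="s", utc=True).tz_convert(timezone)
    if columnnames is None:
        columnnames = [f"value_{i}" for i in range(arr.shape[1] - 1)]
    df = pd.DataFrame(arr[:, 1:], index=index, columns=columnnames)
    df.index.name = "time"
    return df

example = array_to_timeseries(
    np.array([[0, 1.5], [60, 2.5]]),
    timezone="Europe/Amsterdam",
    columnnames=["temp"],
)
```

The result has a single `DatetimeIndex` named `time` in the requested timezone and one value column per remaining input column.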
Converts a data dict into a pandas DataFrame based on the specified record format.

Parameters:
- data: A dictionary containing the data to convert.
- timecolumns: A list of column names to be treated as time columns.
- recordformat: A string specifying the format of the data records (‘records’, ‘table’, ‘split’, ‘index’, ‘tight’).

Returns:
- df: A pandas DataFrame with a DatetimeIndex representing the converted data.
Exported source
def timeseries_dataframe_from_datadict(
        data: dict,
        timecolumns=None,
        recordformat='records',
        nested=False):
    """
    Converts a data dict into a pandas DataFrame based on the specified record format.

    Parameters:
    - data: A dictionary containing the data to convert.
    - timecolumns: A list of column names to be treated as time columns.
    - recordformat: A string specifying the format of the data records ('records', 'table', 'split', 'index', 'tight').

    Returns:
    - df: A pandas DataFrame with a DatetimeIndex representing the converted data.
    """

    orient = recordformat.lower()
    assert orient in ['records', 'table', 'split', 'index', 'tight']
    assert timecolumns, 'No time columns specified'

    #print(f"Converting {'nested' if nested else 'flat'} data dict to DataFrame with orient={orient} and timecolumns={timecolumns}")
    if orient == 'records':
        if nested:
            # data is a nested structure, we need to normalize it
            df = pd.json_normalize(data, sep='.', errors='ignore')  # type: ignore
        else:
            # data is a structured ndarray, sequence of tuples or dicts, or DataFrame
            df = pd.DataFrame.from_records(data, coerce_float=True)  # type: ignore
        time_columns_in_df = [C for C in df.columns if C in timecolumns]
        if not time_columns_in_df:
            time_column = df.columns[0]
        else:
            time_column = time_columns_in_df[0]

    elif orient == 'table':
        # data is in pandas table format
        time_column = data['schema']['primaryKey'][0]
        df = pd.DataFrame.from_dict(data['data'], coerce_float=True).set_index(data['schema']['primaryKey'])
        df.index.name = 'time'
    else:
        # data is formatted according to 'orient' parameter (pandas)
        df = pd.DataFrame.from_dict(data, orient=orient, coerce_float=True)  # type: ignore
        time_column = df.index.name

    df.columns = list(df.columns)
    df[time_column] = pd.to_datetime(df[time_column], utc=True, format='ISO8601')
    df.set_index(time_column, inplace=True)
    df.index = pd.DatetimeIndex(df.index).round('ms')
    df.index.name = 'time'

    return df
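To make the `records` branch with `nested=True` more concrete, the following sketch walks through the same pandas calls on a small payload, outside the function. The sample keys (`time`, `sensor`, `temp`) and values are made up for illustration, and the snippet assumes pandas 2.0 or later, where `format='ISO8601'` is available.

```python
import pandas as pd

# Made-up nested records; json_normalize flattens nested keys with the "." separator.
records = [
    {"time": "2024-01-01T00:00:00.1234Z", "sensor": {"temp": 20.5}},
    {"time": "2024-01-01T00:01:00Z", "sensor": {"temp": 21.0}},
]

df = pd.json_normalize(records, sep=".")  # columns: "time", "sensor.temp"

# Parse ISO 8601 strings (with or without fractional seconds) as UTC timestamps.
df["time"] = pd.to_datetime(df["time"], utc=True, format="ISO8601")
df = df.set_index("time")

# Round the index to whole milliseconds, as the function does.
df.index = pd.DatetimeIndex(df.index).round("ms")
```

After these steps the frame has a UTC `DatetimeIndex` named `time` and a single flattened column `sensor.temp`.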
Convert a timeseries DataFrame or Series into a dictionary representation.
Args:
    data (Union[pd.DataFrame, pd.Series, dict]): The input data to be converted. It can be a pandas DataFrame, Series, or a dictionary.
    recordformat (str, optional): The format of the output records. Defaults to ‘records’.
    timezone (str, optional): The timezone to use for the DataFrame index. Defaults to ‘UTC’.
Returns:
    Union[dict, list]: The converted dictionary representation of the input data, a dictionary or a list of dictionaries depending on the recordformat parameter.
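The difference between a list result and a dictionary result can be seen with plain pandas. The sketch below is not the library's implementation; it only assumes that the time index is moved into a `time` column and rendered as ISO 8601 strings before calling `DataFrame.to_dict`.

```python
import pandas as pd

df = pd.DataFrame(
    {"temp": [20.5, 21.0]},
    index=pd.DatetimeIndex(
        ["2024-01-01T00:00:00Z", "2024-01-01T00:01:00Z"], name="time"
    ),
)

# Move the index into a regular column and render it as ISO 8601 strings
# (an assumption made for this illustration).
flat = df.reset_index()
flat["time"] = flat["time"].map(lambda ts: ts.isoformat())

records = flat.to_dict(orient="records")  # list of dicts, one per row
split = flat.to_dict(orient="split")      # dict with "index", "columns", "data"
```

With `orient="records"` the result is a list of row dictionaries, while `orient="split"` yields a single dictionary, which matches the `Union[dict, list]` return type described above.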
Resamples a time-series DataFrame on the specified period and method.
Parameters:
    df (pd.DataFrame): The input time-series DataFrame.
    period (str): The resampling period.
    method (str): The resampling method. Can be a string of multiple methods separated by ‘;’.
    method_args (dict, optional): Additional arguments for the resampling method.
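One way a ‘;’-separated method string could be handled is sketched below. The function `resample_sketch` is a hypothetical stand-in, not the library's code; in particular, passing the same `method_args` to every method and suffixing column names with the method are assumptions made for this example.

```python
import pandas as pd

def resample_sketch(df, period, method, method_args=None):
    """Hypothetical stand-in: apply each ';'-separated method per resampling bin."""
    methods = [m.strip() for m in method.split(";") if m.strip()]
    resampler = df.resample(period)
    frames = []
    for m in methods:
        # Assumption: the same keyword arguments are passed to every method.
        result = getattr(resampler, m)(**(method_args or {}))
        frames.append(result.add_suffix(f"_{m}"))
    return pd.concat(frames, axis=1)

idx = pd.date_range("2024-01-01", periods=4, freq="30min", tz="UTC")
source = pd.DataFrame({"temp": [1.0, 3.0, 5.0, 7.0]}, index=idx)
hourly = resample_sketch(source, "60min", "mean;max")
```

Here the four half-hourly samples are combined into two hourly bins, with one output column per requested method (`temp_mean` and `temp_max`).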