{ "cells": [ { "cell_type": "markdown", "id": "c820ad4a-f8fc-4337-be3d-8e5b38279ad0", "metadata": {}, "source": [ "# Create Whatif Dataset\n", "\n", "You may already have collected data in a tabular format, e.g., `.csv` or `.parquet` files. However,\n", "you also need to *store* the *sar* information together with your tabular data. This is where\n", "`whatif` dataset helps you: it provides a consistent format to organize your data files, the *sar*\n", "information, and your custom metadata, as a single unit.\n", "\n", "This notebook shows an example on how to convert your existing `.csv` data to a `whatif` dataset.\n", "It covers the following steps:\n", "\n", "- load source data into a pandas data frame\n", "- preprocess the data frame to ensure the *sar* information is accurate\n", "- wrap the pandas dataframe into a `WiDataFrame`\n", "- save the `WiDataFrame` into the `whatif` dataset format\n", "\n", "**NOTE**: as of this writing, `whatif` internally stores datasets as `.csv` files. Additional\n", "formats will be added in the future.\n", "\n", "**Pre-requisite:** this example requires [psychrolib](https://github.com/psychrometrics/psychrolib).\n", "To quickly install this library, you may uncomment and execute the next cell. For more details,\n", "please refer to its [documentation](https://github.com/psychrometrics/psychrolib#installation)." ] }, { "cell_type": "code", "execution_count": null, "id": "871da26b", "metadata": {}, "outputs": [], "source": [ "# %pip install pandas psychrolib" ] }, { "cell_type": "markdown", "id": "bf807d76-6b2b-45f5-bff1-398a0030f7f0", "metadata": {}, "source": [ "## Scenario\n", "\n", "The source data has an existing root top unit (RTU) dataset which has the following states and\n", "actions:\n", "\n", "*States*\n", "1) `outside_humidity`\n", "2) `outside_temperature`\n", "3) `return_humidity`\n", "4) `return_temperature`\n", "\n", "*Actions*\n", "\n", "5) `economizer_enthalpy_setpoint`\n", "6) `economizer_temperature_setpoint`\n", "\n", "However, the source data does not have the reward column `power`. Hence, once we load the source\n", "data, we pre-process it to add the `reward` column. Once the source data frame has completed its\n", "*sar* columns, we can then save it as a `whatif` dataset." ] }, { "cell_type": "code", "execution_count": null, "id": "412c197c-0e59-48c9-88ef-9e97d3921edd", "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "from pathlib import Path\n", "from tempfile import mkdtemp\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import psychrolib\n", "\n", "import a2rl as wi\n", "from a2rl.nbtools import pprint, print # Enable color outputs when rich is installed.\n", "\n", "psychrolib.SetUnitSystem(psychrolib.IP)\n", "\n", "# Fixed constant atm pressure, pressure data is not reliable with value (-0.1 to 0.1), unit: psi\n", "PRESSURE = 14.696\n", "# Assumption on design spec, can change to align with customer setting\n", "SUPPLY_TEMP = 55\n", "SUPPLY_REL_HUMIDITY = 0.5\n", "SUPPLY_AIRFLOW = 8000\n", "MIN_OUTSIDE_AIR_RATIO = 0.1" ] }, { "cell_type": "markdown", "id": "e554cea0", "metadata": {}, "source": [ "## Load source data\n", "\n", "Let's start by loading a raw data source, which is a single `.csv` file. For simplicity, we re-use\n", "the `data.csv` file from the sample `rtu` dataset. We won't load all the columns in this sample\n", "`.csv` files, to simulate the missing reward columns." ] }, { "cell_type": "code", "execution_count": null, "id": "17b0899f", "metadata": { "tags": [] }, "outputs": [], "source": [ "source_csv_file = wi.sample_dataset_path(\"rtu\") / \"data.csv\"\n", "print(f\"Load {source_csv_file} into a pandas dataframe, but without the power column.\")\n", "df = pd.read_csv(source_csv_file, usecols=lambda x: x != \"power\")\n", "\n", "print(df.shape)\n", "df.head()" ] }, { "cell_type": "markdown", "id": "3ad85898", "metadata": {}, "source": [ "## Calculate reward `power` column." ] }, { "cell_type": "code", "execution_count": null, "id": "9dff4247", "metadata": { "tags": [] }, "outputs": [], "source": [ "def get_enthalpy_from_temp_rh(temp, rh):\n", " humidity_ratio = psychrolib.GetHumRatioFromRelHum(temp, rh, PRESSURE)\n", " enthalpy = psychrolib.GetMoistAirEnthalpy(temp, humidity_ratio)\n", " return enthalpy\n", "\n", "\n", "SUPPLY_ENTHALPY = get_enthalpy_from_temp_rh(SUPPLY_TEMP, SUPPLY_REL_HUMIDITY)\n", "\n", "outside_enthalpy_list = []\n", "return_enthalpy_list = []\n", "for _, row in df.iterrows():\n", " out_ent = get_enthalpy_from_temp_rh(row[\"outside_temperature\"], row[\"outside_humidity\"])\n", " outside_enthalpy_list.append(out_ent)\n", "\n", " ret_ent = get_enthalpy_from_temp_rh(row[\"return_temperature\"], row[\"return_humidity\"])\n", " return_enthalpy_list.append(ret_ent)\n", "\n", "df[\"outside_enthalpy\"] = outside_enthalpy_list\n", "df[\"return_enthalpy\"] = return_enthalpy_list\n", "\n", "\n", "def cal_power(\n", " outside_temperature,\n", " outside_enthalpy,\n", " return_temperature,\n", " return_enthalpy,\n", " max_enthalpy,\n", " max_temp,\n", "):\n", " \"\"\"\n", " data dict keys:\n", " max_enthalpy (agent's action/setting)\n", " max_temp (agent's action/setting)\n", " outside_temperature\n", " outside_enthalpy\n", " return_temperature\n", " return_enthalpy\n", " \"\"\"\n", " data = {}\n", " data[\"outside_temperature\"] = outside_temperature\n", " data[\"outside_enthalpy\"] = outside_enthalpy\n", " data[\"return_temperature\"] = return_temperature\n", " data[\"return_enthalpy\"] = return_enthalpy\n", "\n", " # Only need to do when outside temp <= 55\n", " if data[\"outside_temperature\"] <= 55:\n", " # print(\"Outside temperature is below cooling enable setpoint, no mechanical cooling ...\")\n", " # raise ValueError(\"Outside temperature is below cooling enable setpoint\")\n", " return 0\n", "\n", " # Determine econ on/off and get air ratio\n", " if data[\"outside_enthalpy\"] > max_enthalpy or data[\"outside_temperature\"] > max_temp:\n", " outside_air_ratio = MIN_OUTSIDE_AIR_RATIO\n", " else:\n", " outside_air_ratio = (data[\"return_temperature\"] - SUPPLY_TEMP) / (\n", " data[\"return_temperature\"] - data[\"outside_temperature\"] + 1e-6\n", " )\n", " outside_air_ratio = np.clip(outside_air_ratio, MIN_OUTSIDE_AIR_RATIO, 1)\n", "\n", " # Determine enthaly to reach supply temp/RH\n", " economiser_mixed_air_enthalpy = (\n", " outside_air_ratio * data[\"outside_enthalpy\"]\n", " + (1 - outside_air_ratio) * data[\"return_enthalpy\"]\n", " )\n", "\n", " power = (economiser_mixed_air_enthalpy - SUPPLY_ENTHALPY) * 4.5 * SUPPLY_AIRFLOW\n", "\n", " return power\n", "\n", "\n", "df[\"power\"] = df.apply(\n", " lambda x: cal_power(\n", " x.outside_temperature,\n", " x.outside_enthalpy,\n", " x.return_temperature,\n", " x.return_enthalpy,\n", " x.economizer_enthalpy_setpoint,\n", " x.economizer_temperature_setpoint,\n", " ),\n", " axis=1,\n", ")\n", "\n", "print(df.shape)\n", "df.head()" ] }, { "cell_type": "markdown", "id": "bc71e6fc-670f-4837-a1db-6cac67560ea8", "metadata": {}, "source": [ "We can also plot the rewards." ] }, { "cell_type": "code", "execution_count": null, "id": "cca01b22-892a-4eb9-a1a2-a070cb90c4d6", "metadata": {}, "outputs": [], "source": [ "df[\"power\"].plot();" ] }, { "cell_type": "markdown", "id": "b857259f", "metadata": {}, "source": [ "## Save pandas Dataframe as `whatif` Dataset\n", "\n", "Converting a pandas dataframe to a `whatif` dataset is straight forward: we just need to create a\n", "`WiDataFrame`, then call its `to_csv_dataset()` method." ] }, { "cell_type": "code", "execution_count": null, "id": "1f7f1f5b-f598-42a6-822a-90aa3ba84f8f", "metadata": {}, "outputs": [], "source": [ "wdf = wi.WiDataFrame(\n", " df,\n", " states=[\"outside_humidity\", \"outside_temperature\", \"return_humidity\", \"return_temperature\"],\n", " actions=[\"economizer_enthalpy_setpoint\", \"economizer_temperature_setpoint\"],\n", " rewards=[\"power\"],\n", ")\n", "\n", "display(wdf.sar_d, wdf.head())" ] }, { "cell_type": "markdown", "id": "25a56886", "metadata": {}, "source": [ "Now, we can directly save the `WiDataFrame` into a `whatif` dataset. For this example, we're going\n", "to write the output to a temp directory, and our new dataset is going to include only the\n", "`timestamp` and the *sar* columns." ] }, { "cell_type": "code", "execution_count": null, "id": "d5c97a7b", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Will save to a temporary directory. Feel free to change to another location\n", "outdir = Path(mkdtemp()) / \"my-rtu-dataset\"\n", "print(\"Will save output dataset to\", outdir)\n", "\n", "# Let the dataset contains only the timestamp and sar columns.\n", "wdf[[\"timestamp\", *wdf.sar]].to_csv_dataset(outdir, index=False)" ] }, { "cell_type": "markdown", "id": "416fae80-f432-4690-86f1-898325fb3779", "metadata": {}, "source": [ "As shown below, a `whatif` dataset is a directory with a `metadata.yaml` file and a `data.csv` file." ] }, { "cell_type": "code", "execution_count": null, "id": "ee2e032f-126f-4ece-80a2-18b9f5aa9821", "metadata": {}, "outputs": [], "source": [ "!command -v tree &> /dev/null && tree -C --noreport {outdir} || ls -al {outdir}/*" ] }, { "cell_type": "markdown", "id": "3629e1aa", "metadata": {}, "source": [ "## Load the new Dataset\n", "\n", "Now you can load the directory into a `WiDataFrame`." ] }, { "cell_type": "code", "execution_count": null, "id": "64fd2be2", "metadata": {}, "outputs": [], "source": [ "df2 = wi.read_csv_dataset(outdir)\n", "display(df2.shape, df2.sar, df2.head())" ] }, { "cell_type": "markdown", "id": "aa3b2ec4-7664-4395-9439-45047e165890", "metadata": {}, "source": [ "We can also plot the rewards." ] }, { "cell_type": "code", "execution_count": null, "id": "5280fbdc-b9ab-40b8-a9f0-393eb5ba5c0c", "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [], "source": [ "df2[\"power\"].plot();" ] }, { "cell_type": "markdown", "id": "a8872d24-884a-41fe-a4c9-a7b8723cfcd3", "metadata": {}, "source": [ "## Summary\n", "\n", "Congratulations! You've completed the tutorial on `whatif` dataset. We encourage you to further\n", "explore the remaining examples." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "toc-autonumbering": true, "vscode": { "interpreter": { "hash": "22f92e4608f34d3393fc5e7884f8906c6794e2d0198ea9b43992c442775a4328" } } }, "nbformat": 4, "nbformat_minor": 5 }