This tool gives users a way to bring their own data onto the platform. Each time series a user wishes to use must be uploaded with this tool. Once uploaded, the data can be used all across Chronos, for example in a forecast or as the base series in a new scenario.
Forecasting algorithms require data. There are various ways of organizing time series data, but Chronos requires a formatted 'date' column plus one or more columns containing the series values.
Before uploading the data, it is advisable to manually scan it for errors. Sometimes when data is converted to new formats, the values are truncated or corrupted. Checking a few summary figures, particularly the min, mean, max, and count of the data, is usually sufficient to detect any major problems. Review the data again after upload.
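The summary check described above can be sketched with the Python standard library. This is only an illustration done outside Chronos: the function name `summarize_column` and the sample data are assumptions, not part of the platform.

```python
import csv
import io
import statistics

def summarize_column(csv_text, column):
    """Return count, min, mean, and max for one numeric column, skipping blank cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader if (row[column] or "").strip()]
    return {
        "count": len(values),
        "min": min(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }

# Hypothetical sample in the 'date' + values layout
sample = "date,values\n2022-01-01,42.1\n2022-01-02,40.8\n2022-01-03,38.6\n2022-01-04,42.8\n"
summary = summarize_column(sample, "values")
print(summary)
```

Comparing these figures with the same figures from the original data source is a quick way to spot truncated or missing rows.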
Add metadata to uploads when possible. It is useful for remembering just what that series was when reviewed months later. It is also important to be aware of as much context as possible about the series. Where do these numbers come from?
Two of the harder-to-detect data problems are incomplete recent data and data definition changes. Incomplete recent data is common where data has to be reviewed or is gradually arriving as requests are processed. Incompleteness also occurs based on when a forecast is performed. Performing a monthly forecast mid-month will often draw in incomplete data for the current month, appearing like a drastic and unexpected drop. The solution is usually to adjust the date range to remove the more recent data, or to manually adjust to expectations.
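As a rough sketch of adjusting the date range, the snippet below drops every row that falls in the current, still incomplete month before the data is prepared for upload. The helper `drop_incomplete_month` and the sample values are hypothetical; Chronos itself may offer its own date range controls.

```python
from datetime import date

def drop_incomplete_month(rows, today):
    """Keep only rows dated before the first day of the month containing `today`."""
    cutoff = today.replace(day=1)
    return [(d, v) for d, v in rows if d < cutoff]

rows = [
    (date(2022, 2, 1), 40.8),
    (date(2022, 3, 1), 38.6),
    (date(2022, 4, 1), 12.3),  # partial month: looks like a sudden drop
]
trimmed = drop_incomplete_month(rows, today=date(2022, 4, 15))
print(trimmed)
```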
Data definition changes are another frequent challenge. Business units get combined or separated and the resulting data suddenly is much larger or smaller than before. Again there is no automatic approach, and the solution is usually to adjust the date range to drop the data, or to manually adjust the data if possible.
We currently support .csv and .xlsx file uploads, which are the primary way for users to bring their own data onto the platform. This is usually the first stop a user makes before starting to use the rest of the platform.
Formatting your data before bringing it onto the platform is essential. We've done as much as we can to accept many different formats and variations of data, but this step is nonetheless important. Tools like Excel or Google Sheets are good places to work with your time series data to ensure the formatting is correct. These tools often have utilities to transform a whole column at once, so the process should be quick and easy.
Data should look like the tables below. A column named "date" must be present; the "values" column can have any name.
date | values |
---|---|
2022-01-01 | 42.1 |
2022-01-02 | 40.8 |
2022-01-03 | 38.6 |
2022-01-04 | 42.8 |
... | ... |
date | values |
---|---|
2022-01 | 42.1 |
2022-02 | 40.8 |
2022-03 | 38.6 |
2022-04 | 42.8 |
... | ... |
date | values |
---|---|
2018 | 42.1 |
2019 | 40.8 |
2020 | 38.6 |
2021 | 42.8 |
... | ... |
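A small pre-upload check can confirm that every entry in the date column uses the same granularity. The sketch below assumes only the three layouts shown in the tables above (daily, monthly, yearly); Chronos may accept other date formats, and the names `PATTERNS` and `detect_granularity` are illustrative.

```python
import re

# Assumed layouts, taken from the example tables: YYYY-MM-DD, YYYY-MM, YYYY
PATTERNS = {
    "daily": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "monthly": re.compile(r"^\d{4}-\d{2}$"),
    "yearly": re.compile(r"^\d{4}$"),
}

def detect_granularity(date_strings):
    """Return the one granularity shared by all date strings, or raise ValueError."""
    found = set()
    for s in date_strings:
        matches = [name for name, pat in PATTERNS.items() if pat.match(s.strip())]
        if not matches:
            raise ValueError(f"unrecognized date format: {s!r}")
        found.add(matches[0])
    if len(found) != 1:
        raise ValueError(f"expected exactly one granularity, found: {sorted(found)}")
    return found.pop()

print(detect_granularity(["2022-01", "2022-02", "2022-03"]))
```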
In some cases you'll want to bring in more than one series at a time, for example with Batch forecasting. To specify additional series, add the values as a new column while maintaining a single date column. It is okay if the series' date ranges do not line up. Include all the dates needed; cells where a series doesn't have values can be left blank.
An example of this could look like the following:
date | series_one | series_two |
---|---|---|
2022-01-01 | 42.1 | |
2022-01-02 | 40.8 | |
2022-01-03 | 38.6 | 115.2 |
2022-01-04 | 42.8 | 110.0 |
2022-01-05 | | 110.0 |
2022-01-06 | | 110.0 |
... | ... | ... |
The table above shows multiple series with differing date ranges. series_one spans from 2022-01-01 to 2022-01-04 and series_two spans from 2022-01-03 to 2022-01-06. Chronos will handle these cases without a problem as long as the date frequency of all the series being uploaded is the same. For example, mixing monthly data and daily data will not work.
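One way to build such a file is to merge the individual series on the union of their dates and leave blanks where a series has no value. The following is a minimal sketch using the standard library; `combine_series` is a hypothetical helper, and the values mirror the example table.

```python
import csv
import io

def combine_series(series_by_name):
    """Write several {date: value} series as one CSV with a shared date column."""
    # ISO-formatted date strings sort chronologically as plain text
    all_dates = sorted({d for s in series_by_name.values() for d in s})
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", *series_by_name])
    for d in all_dates:
        # Missing dates become empty cells
        writer.writerow([d, *(s.get(d, "") for s in series_by_name.values())])
    return out.getvalue()

series_one = {"2022-01-01": 42.1, "2022-01-02": 40.8, "2022-01-03": 38.6, "2022-01-04": 42.8}
series_two = {"2022-01-03": 115.2, "2022-01-04": 110.0, "2022-01-05": 110.0, "2022-01-06": 110.0}
combined_csv = combine_series({"series_one": series_one, "series_two": series_two})
print(combined_csv)
```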
Once you've successfully uploaded some data through the file upload process, there are a few preprocessing options available before the final submission of the data.
Transformations like converting to log scale should be used with particular attention because they shift the values into a different range. Users have to be aware that the forecast values will need to be shifted back, using an exponential function, to be interpreted in the original space.
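To illustrate the round trip, the sketch below applies a natural log to a series and then maps values back with the exponential function. The natural log base is an assumption here (the base Chronos uses is not stated), and the function names are illustrative. Log scale requires strictly positive values.

```python
import math

def to_log_scale(values):
    """Apply the natural log; every value must be strictly positive."""
    return [math.log(v) for v in values]

def from_log_scale(log_values):
    """Invert the natural log so values can be read in the original space."""
    return [math.exp(v) for v in log_values]

original = [42.1, 40.8, 38.6, 42.8]
logged = to_log_scale(original)
restored = from_log_scale(logged)
print(logged)
print(restored)
```

A forecast produced on the log-scaled series would be passed through `from_log_scale` in the same way before being compared with the original data.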
Graphs in Chronos will often have a normalize option. Viewing two related series at once, say an exchange rate with values around 1 and total sales with values in the hundreds of thousands, would be impossible on the same graph without normalization, as there isn't enough room to give both the necessary resolution. Normalization shifts the values into a range where they can be viewed at the same time without losing the basic shape and pattern.
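The exact normalization Chronos applies is not specified here. As one common possibility, min-max scaling maps each series onto the range 0 to 1 while preserving its shape; the sketch below assumes that method, and the sample numbers are made up.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no shape to preserve; map it to zeros
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

exchange_rate = [0.98, 1.02, 1.00, 1.05]
total_sales = [100000, 120000, 110000, 135000]

norm_rate = min_max_normalize(exchange_rate)
norm_sales = min_max_normalize(total_sales)
print(norm_rate)
print(norm_sales)
```

After scaling, both series occupy the same vertical range and can be drawn on one set of axes.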