Optimizing timestamps for graphing
Before I jump into the specifics of how I generate timestamps for my visualizations, I should first explain what I actually want to achieve and how it relates to the site. If you have read any of the other project posts, you might know that this project is all about projecting your future economic situation from today's state of things, based on input you provide: savings, loans and the transactions (one-time or recurring) that you make towards either of these.
While I might end up visualizing this in many different ways in the future, I decided to prototype my idea by first creating a master view that encompasses all profits and all losses projected many years ahead. I want this master view to be a stacked area graph where all profits are displayed above the x-axis and all losses are displayed below. All I have as a basis for these calculations is what the user has provided in terms of information about the savings, loans and transactions they have.
In the figure above, I have illustrated the simplest example of what is needed to generate such a graph. Given some initial state provided by the user, the application should, on a best-effort basis, project it into the future to let users better understand how their saving strategies and down payments play out over the long term. The x-axis represents time and the y-axis represents profits / losses. In order to generate such a graph, we need to be able to calculate or estimate the y-values. I won't get into the details of this in this post, as what I currently have is based on some pretty basic calculations not worthy of a post yet. I will however share some experiences on how I generate the timestamps required to project the y-values, that is, the x-values for the graph.
Here are some qualities that I think are important to keep in mind when designing this projector:
- It should be performant enough to not bother the user. No loading indicators, no slow interfaces etc.
- It should include all transaction dates. These dates are special to the user as they expect something to happen, or at least start happening, on these dates and they should therefore be included in the graph.
- The resulting graph should look smooth.
Besides the performance requirements I have for the calculations, there are also some other technical considerations I need to take into account. Most importantly, if the stacked area graph consists of multiple areas, all of these areas must contain the exact same x-values. This means that any timestamp I deem worthy of including because area A requires it will also have to be included in all other areas. Before we jump into things, let's quickly review what timestamps and transactions are.
Transactions
Transactions are, as briefly mentioned, deposits or withdrawals that a user has set up. In the figure above, these are marked with a green dot. A transaction must be linked to something, be it a loan or a saving. If the transaction is going towards a loan, it is not possible to make a withdrawal, as I imagine the banks would get rather angry about it. For savings, both withdrawals and deposits are allowed. A transaction may be set up as a one-time or a recurring event.
Blank timestamps
Any x-value that I wish to use for calculating the user's projected values, and that does not relate to any specific event like a transaction, is called a blank timestamp. These serve the sole purpose of ensuring there are enough x-values in my graph to satisfy my desire for a smooth graph. In the figure above, these are indicated by black circles.
Iterations of the algorithm
When I initially started developing the projection pipeline for the site, I began working on the different projectors for savings accounts, loans and funds / stocks. These boiled down to the methods `projectFund`, `projectSavings` and `projectLoans`, each of which took a `fromDate` and a `toDate` as input parameters. It started out like this because I first wanted to look at projecting only funds as a proof of concept. Expanding this to multiple types of projectors, the pipeline looked something like this:
Expanding this system to work on multiple types of projectors highlighted some issues with the approach. Since every projector was written in isolation and only receives `fromDate` and `toDate` as input parameters, the x-values they produce can differ from one projector to the next, especially once the transactions going into each of them are taken into account as well.
As illustrated in the figure, my first approach was to collect all of these values and ensure that all projections are made on the same x-values, by creating a union set of the values and interpolating all missing y-values for each projected set. This turned out to be a rather poor decision: after testing the solution on some realistic data, I had to throw in the towel and go back to the drawing board to revisit the pipeline.
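To make that first approach concrete, here is a minimal sketch of the union-and-interpolate step. The `Point` type and the helpers `unionXValues`, `interpolateAt` and `alignSeries` are hypothetical names used for illustration, not the site's actual code, and linear interpolation is an assumption since the kind of interpolation is not specified here.

```typescript
// Hypothetical shape of one projected data point: x is epoch millis, y the projected value.
type Point = { x: number; y: number };

// Collect every x-value that appears in any series, sorted ascending.
const unionXValues = (series: Point[][]): number[] => {
  const xs = new Set<number>();
  for (const points of series) {
    for (const point of points) xs.add(point.x);
  }
  return Array.from(xs).sort((a, b) => a - b);
};

// Linearly interpolate the y-value of a series (sorted by x) at position x.
// Outside the range of the series, the nearest endpoint value is reused.
const interpolateAt = (points: Point[], x: number): number => {
  if (points.length === 0) return 0;
  if (x <= points[0].x) return points[0].y;
  const last = points[points.length - 1];
  if (x >= last.x) return last.y;
  const i = points.findIndex((p) => p.x >= x);
  const left = points[i - 1];
  const right = points[i];
  const t = (x - left.x) / (right.x - left.x);
  return left.y + t * (right.y - left.y);
};

// Re-sample every series on the shared x-values so the stacked areas line up.
const alignSeries = (series: Point[][]): Point[][] => {
  const xs = unionXValues(series);
  return series.map((points) =>
    xs.map((x) => ({ x, y: interpolateAt(points, x) }))
  );
};
```

Every series has to be re-sampled on every x-value in the union, so the work grows with the number of series times the size of the union, on top of the projections themselves.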
I ended up with a system that calculates all the necessary x-values up front and passes them into the projectors, instead of giving the projectors a date range. This way, I can guarantee that the rather expensive calculations are bounded, rather than having to compensate afterwards for an unnecessarily large number of differing x-values, which in itself was a rather expensive operation. The figure below illustrates this new approach:
The details
In state, the transaction typing looks like this:
```typescript
export type TransactionInterval = {
  every: number;
  period: "days" | "weeks" | "months" | "years";
  on: string;
};

type TransactionFrequency =
  | {
      paymentFrequency: "recurring";
      interval: TransactionInterval;
    }
  | {
      paymentFrequency: "once";
      on: string;
    };

type BaseTransaction = {
  id: string;
  name: string;
  amount: number;
} & TransactionFrequency;

export type Withdrawal = BaseTransaction & {
  type: "withdrawal";
  savingsId: string;
};

export type Deposit = BaseTransaction & {
  type: "deposit";
} & NullableEither<{ savingsId: string }, { loanId: string }>;

export type Transaction = Withdrawal | Deposit;
```
In essence, a transaction can be either a withdrawal or a deposit. A withdrawal can only be related to a saving, be it a savings account, a fund or stocks while a deposit can also be related to a loan.
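As a quick illustration of how the `paymentFrequency` discriminant narrows the union, here is a small sketch. The two frequency types are copied from the typing above; the helper `firstOccurrence` and the sample values are hypothetical and only there for the example.

```typescript
type TransactionInterval = {
  every: number;
  period: "days" | "weeks" | "months" | "years";
  on: string;
};

type TransactionFrequency =
  | { paymentFrequency: "recurring"; interval: TransactionInterval }
  | { paymentFrequency: "once"; on: string };

// Hypothetical helper: the date string a transaction's schedule refers to.
const firstOccurrence = (frequency: TransactionFrequency): string =>
  frequency.paymentFrequency === "once"
    ? frequency.on // narrowed to the one-time variant
    : frequency.interval.on; // narrowed to the recurring variant

// Sample value: a deposit schedule repeating every month from the 15th of January.
const monthlySchedule: TransactionFrequency = {
  paymentFrequency: "recurring",
  interval: { every: 1, period: "months", on: "2024-01-15" },
};
```

The compiler rejects `frequency.on` on the recurring branch and `frequency.interval` on the one-time branch, which keeps the two shapes from being mixed up.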
From this single state object, we first want to create a list of `TransactionFrame` elements. Each transaction frame represents an actual transaction on a single date. Our first method generates this list of transaction frames in a time window between `fromDate` and `toDate`:
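```typescript
// The date helpers used here match date-fns by name; this import is an
// assumption (`interval` requires date-fns v3 or later).
import {
  addDays,
  addMonths,
  addWeeks,
  addYears,
  interval,
  isAfter,
  isWithinInterval,
} from "date-fns";

export const getTransactionFramesInRange = (
  transaction: Transaction,
  fromDate: DateLike,
  toDate: DateLike
): TransactionFrame[] => {
  // One-time transactions yield at most a single frame.
  if (transaction.paymentFrequency === "once") {
    if (!isWithinInterval(transaction.on, interval(fromDate, toDate))) {
      return [];
    }
    const transactionFrame = createTransactionFrame(
      transaction,
      new Date(transaction.on)
    );
    return [transactionFrame];
  }

  // Pick the date-shifting helper that matches the interval's period.
  const periodMethodMap: Record<
    typeof transaction.interval.period,
    (date: DateLike, amount: number) => Date
  > = {
    days: addDays,
    weeks: addWeeks,
    months: addMonths,
    years: addYears,
  };
  const shiftPeriod = periodMethodMap[transaction.interval.period];

  // Find the date to start stepping from: `fromDate` itself when `on` is not
  // after it, otherwise `on` stepped backwards one interval at a time until
  // it is no longer after `fromDate`.
  const onDate = new Date(transaction.interval.on);
  let firstDate = !isAfter(onDate, fromDate) ? fromDate : onDate;
  while (isAfter(firstDate, fromDate)) {
    firstDate = shiftPeriod(firstDate, -transaction.interval.every);
  }

  // Step forward one interval at a time until we pass `toDate`.
  const transactionFrames: TransactionFrame[] = [];
  let shiftedDate = firstDate;
  do {
    const transactionFrame = createTransactionFrame(
      transaction,
      new Date(shiftedDate)
    );
    transactionFrames.push(transactionFrame);
    shiftedDate = shiftPeriod(shiftedDate, transaction.interval.every);
  } while (!isAfter(shiftedDate, toDate));

  return transactionFrames;
};
```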
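For comparison, here is a dependency-free sketch of the recurring case, restricted to day-based intervals so that plain millisecond arithmetic is enough. It is not the function above: `occurrencesInRange` is a hypothetical name, and it assumes that `on` marks the first occurrence, so nothing is produced before `on` or outside the window, and every occurrence stays in phase with `on`.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumption: `on` is the first occurrence, and intervals are whole days.
const occurrencesInRange = (
  on: Date,
  everyDays: number,
  from: Date,
  to: Date
): Date[] => {
  const stepMs = everyDays * DAY_MS;
  // Number of whole steps to skip so that we start at or after `from`.
  const stepsToSkip = Math.max(
    0,
    Math.ceil((from.getTime() - on.getTime()) / stepMs)
  );
  const result: Date[] = [];
  for (
    let t = on.getTime() + stepsToSkip * stepMs;
    t <= to.getTime();
    t += stepMs
  ) {
    result.push(new Date(t));
  }
  return result;
};

// Weekly from 1 January, viewed through a window of 10 to 31 January (UTC).
const weekly = occurrencesInRange(
  new Date("2024-01-01T00:00:00Z"),
  7,
  new Date("2024-01-10T00:00:00Z"),
  new Date("2024-01-31T00:00:00Z")
);
```

In this example, `weekly` holds 15, 22 and 29 January. Month and year intervals would need calendar-aware shifting, which is what the date helpers in the function above provide.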
Now that we have a list of specific transactions that fall between two dates, we can insert blank timestamps in between them so that we acquire the desired smoothness of our x-values. For this, we are going to need two components:
- Functionality for generating even points between two dates
- A model to store all of the blank timestamps and transactions
As for the former, we create a class `EvenTimestampsGenerator` that takes a `maxMillisBetween` constructor parameter and provides a `generate` method that gives us timestamps spaced `maxMillisBetween` apart between `fromDate` and `toDate`:
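```typescript
// Helper names match date-fns; the import is an assumption.
import { addMilliseconds, differenceInMilliseconds } from "date-fns";

export class EvenTimestampsGenerator {
  private readonly maxMillisBetween: number;

  constructor(maxMillisBetween: number) {
    this.maxMillisBetween = maxMillisBetween;
  }

  // Returns timestamps starting at `fromDate`, each `maxMillisBetween` apart.
  generate(fromDate: DateLike, toDate: DateLike): Date[] {
    // Number of whole steps that fit into the window.
    const numberOfPoints = Math.floor(
      differenceInMilliseconds(toDate, fromDate) / this.maxMillisBetween
    );
    if (numberOfPoints < 1) return [];
    return Array.from({ length: numberOfPoints }, (_, i) =>
      addMilliseconds(fromDate, i * this.maxMillisBetween)
    );
  }
}
```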
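To see what `generate` yields, here is a dependency-free sketch of the same arithmetic on plain `Date` objects. The function `evenTimestamps` is a hypothetical stand-in written for this example, not a replacement for the class.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Same arithmetic as `generate`, using epoch milliseconds directly.
const evenTimestamps = (
  from: Date,
  to: Date,
  maxMillisBetween: number
): Date[] => {
  const numberOfPoints = Math.floor(
    (to.getTime() - from.getTime()) / maxMillisBetween
  );
  if (numberOfPoints < 1) return [];
  return Array.from(
    { length: numberOfPoints },
    (_, i) => new Date(from.getTime() + i * maxMillisBetween)
  );
};

// A 10-day window with 3 days between points gives floor(10 / 3) = 3 points:
// day 0, day 3 and day 6. The start date is included; the end date is not.
const points = evenTimestamps(
  new Date("2024-01-01T00:00:00Z"),
  new Date("2024-01-11T00:00:00Z"),
  3 * DAY_MS
);
```

Note that the last point lands on day 6, which leaves a 4-day gap up to the end of the window. If the spacing should hold all the way to `toDate`, the end date has to be added separately.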
As for the model that will contain all of this data, I implemented a map-like class `HorizonTimestampTransactionMap` that allows adding both blank timestamps and transactions, and that casts all dates to the start of the day to avoid a finer granularity than users need in the app:
```typescript
// Helper name matches date-fns; the import is an assumption.
import { startOfDay } from "date-fns";

export class HorizonTimestampTransactionMap {
  private map = new ArrayMap<string, TransactionFrame>();

  // All dates on the same day share one key.
  private convertToKey(date: Date): string {
    return startOfDay(date).toISOString();
  }

  // Registers the date without overwriting transactions already stored on it.
  addBlankTimestamp(date: Date) {
    const key = this.convertToKey(date);
    if (!this.map.has(key)) {
      this.map.set(key, []);
    }
  }

  addTransaction(transaction: TransactionFrame) {
    const key = this.convertToKey(transaction.date);
    this.map.push(key, transaction);
  }

  getEventsOnDate(date: Date): TransactionFrame[] {
    return this.map.get(this.convertToKey(date)) ?? [];
  }

  toString(): string {
    return JSON.stringify(Object.fromEntries(this.map.entries()), null, 2);
  }
}
```
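`ArrayMap` is not shown in this post. A minimal version, assuming it is a map whose values are arrays and whose `push` appends to the array stored under a key, could look like the sketch below. The `Frame` type, the `dayKey` helper and the sample data are hypothetical, and the sketch keys by UTC calendar day (rather than the local start of day used above) so that the result does not depend on the machine's time zone.

```typescript
// Assumed behaviour: a map of arrays where `push` appends to the array under
// `key`, creating it first when the key is new.
class ArrayMap<K, V> {
  private readonly inner = new Map<K, V[]>();

  has(key: K): boolean {
    return this.inner.has(key);
  }
  get(key: K): V[] | undefined {
    return this.inner.get(key);
  }
  set(key: K, values: V[]): void {
    this.inner.set(key, values);
  }
  push(key: K, value: V): void {
    const existing = this.inner.get(key);
    if (existing) existing.push(value);
    else this.inner.set(key, [value]);
  }
  keys(): K[] {
    return Array.from(this.inner.keys());
  }
}

// Hypothetical stand-in for TransactionFrame, reduced to what the example needs.
type Frame = { name: string; date: Date };

// Key by UTC calendar day, e.g. "2024-01-04".
const dayKey = (date: Date): string => date.toISOString().slice(0, 10);

const frames = new ArrayMap<string, Frame>();

const addBlankTimestamp = (date: Date): void => {
  const key = dayKey(date);
  if (!frames.has(key)) frames.set(key, []);
};
const addTransaction = (frame: Frame): void =>
  frames.push(dayKey(frame.date), frame);

addBlankTimestamp(new Date("2024-01-01T00:00:00Z"));
addBlankTimestamp(new Date("2024-01-04T00:00:00Z"));
addTransaction({ name: "rent", date: new Date("2024-01-04T12:00:00Z") });
addTransaction({ name: "salary", date: new Date("2024-01-02T08:30:00Z") });

// ISO day strings sort chronologically, giving the shared x-values.
const xValues = frames.keys().sort();
```

Because a blank timestamp and a transaction on the same day collapse into one key, each day appears only once among the x-values, and the transactions for that day remain available when the y-values are projected.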