Time-series databases – sMAP and Apache Cassandra
Note: This post discusses my experience migrating the front end of an IoT integration from the sMAP time-series data store to Apache Cassandra.
If you have recently used open source time-series databases to store data from physical devices such as IoT devices, you have probably come across sMAP. sMAP, also known as ‘Simple Measurement and Actuation Profile’, comes with the standard package. It was developed by UCB and has been widely used among Volttron users. Since our application uses Volttron, we decided to use it too.
Being the GUI person in the team, I had to come up with a strategy to efficiently pull data from sMAP and plot it dynamically as time-series plots for the user. I looked at two ways to do it. One was to follow sMAP and develop a module similar to their sMAP Plotter. However, I found that messier to integrate into our application. Instead, I decided to use their metadata structure and pull data through the Archiver interface using its query language. This worked out pretty well for me the first time. I was designing the page prototypes at the same time, so I decided to keep the UI part (HTML/CSS) stable irrespective of how I populate the data. I pulled the last 24 hours of data and plotted it dynamically in the UI as per my design. I also provided an option to auto-update the data.
Recently, because of the difficulty of using sMAP on single-board computers like the Raspberry Pi and Odroid, we decided to migrate to a different time-series database. After researching many different databases, we chose Apache Cassandra for our distributed time-series data storage. I am intentionally keeping some choices and information abstract since this version is still under development.
We had two principles to stick to: keep the migration simple, and do not disturb existing components. Both were possible because of our modular application design. The Operating Systems layer was migrated to Cassandra by creating a new module that sends data to Cassandra instead of sMAP. Bear in mind that our system is still under development, so we were not worried about losing real data (which is typically the concern for big organizations at this stage). We did, however, back up whatever data we had collected over time using sMAP for our analysis purposes.
The UI platform had a slightly different migration story. What I dread as a UI developer is disturbing the base design of the UI as part of a migration like this. Having a consistent layout is always a good design principle, and changing the layout might lead to a string of tests to see whether users understand the new structure. With this in mind, I decided to point the UI pages to a new backend app that fetches data from Cassandra. I built this app to return data in the same style as I had been getting from sMAP, and I optimized my queries to Cassandra. I also implemented a few lambda expressions to simplify the parsing process. (As a side note here, lambda expressions are life savers!)
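To give a rough idea of what "the same style of data" can look like, here is a minimal Python sketch. It is not our actual backend code: the row layout (device_id, ts, value), the sample rows, and the function name to_smap_style are all made up for illustration, and the output shape (a "Readings" list of [timestamp in milliseconds, value] pairs) is my assumption of what an sMAP-style response looks like. No Cassandra driver is used here; the rows are plain dictionaries standing in for query results.

```python
from datetime import datetime, timezone

# Hypothetical rows, standing in for what a Cassandra query might return.
rows = [
    {"device_id": "thermostat-1",
     "ts": datetime(2016, 1, 1, 0, 5, tzinfo=timezone.utc), "value": 21.5},
    {"device_id": "thermostat-1",
     "ts": datetime(2016, 1, 1, 0, 0, tzinfo=timezone.utc), "value": 21.0},
    {"device_id": "plug-7",
     "ts": datetime(2016, 1, 1, 0, 0, tzinfo=timezone.utc), "value": 3.2},
]

# Small lambdas that keep the parsing short.
to_epoch_ms = lambda dt: int(dt.timestamp() * 1000)
to_reading = lambda row: [to_epoch_ms(row["ts"]), row["value"]]


def to_smap_style(rows, device_id):
    """Reshape rows for one device into an assumed sMAP-style dictionary."""
    selected = filter(lambda r: r["device_id"] == device_id, rows)
    # Sort by timestamp so the plot receives points in time order.
    readings = sorted(map(to_reading, selected), key=lambda pair: pair[0])
    return {"id": device_id, "Readings": readings}


result = to_smap_style(rows, "thermostat-1")
```

Because the UI only ever sees the reshaped dictionary, the pages do not need to know which database the points came from.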
I didn’t have to change anything in my front end (HTML/CSS), which was a relief. I adjusted the data and date formats to work with the new data set, and the UI was good to go for the first version with Cassandra. It is easier to work with a simplified function call to the Cassandra API than to go through umpteen rounds of metadata parsing in the sMAP interface.
My next step is to simplify the auto-update process by adding the partial-update feature to the plot. I am sure that is going to be a challenging task.
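One way a partial update could work is sketched below in Python. This is only an idea under stated assumptions, not something I have built yet: the plot data is assumed to live in a buffer of (timestamp in milliseconds, value) pairs covering the last 24 hours, and the function name apply_partial_update is hypothetical. Each refresh appends only the points newer than the last one already plotted and drops the points that have fallen out of the window.

```python
from collections import deque

WINDOW_MS = 24 * 60 * 60 * 1000  # 24 hours in milliseconds


def apply_partial_update(buffer, new_points, now_ms, window_ms=WINDOW_MS):
    """Add only unseen points to the buffer, then drop expired ones."""
    last_ts = buffer[-1][0] if buffer else float("-inf")
    for ts, value in sorted(new_points):
        # Skip anything that is already plotted.
        if ts > last_ts:
            buffer.append((ts, value))
            last_ts = ts
    # Remove points older than the start of the window.
    cutoff = now_ms - window_ms
    while buffer and buffer[0][0] < cutoff:
        buffer.popleft()
    return buffer


# Example: one point is already plotted, one is new, one has expired.
buffer = deque([(0, 1.0), (1000, 2.0)])
apply_partial_update(buffer, [(1000, 2.0), (2000, 3.0)], now_ms=WINDOW_MS + 500)
```

The hard part in practice will probably be on the plotting side, where the chart has to accept new points without redrawing everything.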
I should say that being open source is our strength. We get to evaluate and decide on various tools and technologies and choose the perfect set of interfaces for our application.