Pulsar Time Series Data Base for Smart Metering Solutions
PulsarTSDB for Smart Metering is a highly efficient, distributed Time Series Data Base. The product is designed to meet the demands of Meter Data Management (MDM) providers:
- storing and accessing the Time Series Data of millions of meters,
- selecting meter information from Big Data in milliseconds,
- cost efficiency (licence cost and a low number of commodity servers),
- a single commodity node serving a large number of meters (2 million),
- straightforward administration.
PulsarTSDB grew out of the requirement for fast, reliable access to data arriving from a large number of meters at 15-minute intervals. We faced the problem of handling 2 million meters, each producing up to 16 pieces of information every quarter of an hour, and storing the data for at least one year. This yields about 6 Terabytes of data, which must be accessible very quickly for computation and consumer portal purposes. At the same time, the system has to absorb 3 Gigabytes of newly arriving information every 15 minutes.
All existing DB software proved too slow to handle such an installation, making it practically impossible to give the user information that is even close to up to date. Studying the issue, we observed that a general-purpose DB can be optimized for either read or write as its key operation. If both have to be performed at the same time, efficiency drops drastically, in some cases by a factor of more than 1000.
Through its smart data organization, PulsarTSDB avoids this obstacle, making data processing many times faster and bringing it close to real-time processing.
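The sizing figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this text, and assumes decimal units (1 Gigabyte = 10^9 bytes) and a 365-day year. It shows that the 3 Gigabytes of input per interval and the 6 Terabytes per year imply very different sizes per value, which would be consistent with verbose input files and a compact stored form. The text does not say how either figure was measured, so that reading is only an assumption.

```python
# Back-of-the-envelope check of the sizing figures quoted in the text.
METERS = 2_000_000
VALUES_PER_METER = 16            # upper bound per 15-minute interval
INTERVAL_MINUTES = 15
DAYS_PER_YEAR = 365              # assumption: non-leap year

INPUT_BYTES_PER_INTERVAL = 3 * 10**9    # "3 Gigabytes", decimal units assumed
STORED_BYTES_PER_YEAR = 6 * 10**12      # "6 Terabytes", decimal units assumed

intervals_per_year = DAYS_PER_YEAR * 24 * 60 // INTERVAL_MINUTES   # 35040
values_per_interval = METERS * VALUES_PER_METER                    # 32,000,000
values_per_year = values_per_interval * intervals_per_year

# Implied size of one value in the incoming stream and in one year of storage.
input_bytes_per_value = INPUT_BYTES_PER_INTERVAL / values_per_interval   # 93.75
stored_bytes_per_value = STORED_BYTES_PER_YEAR / values_per_year         # about 5.35

print(f"intervals per year:      {intervals_per_year}")
print(f"values per interval:     {values_per_interval:,}")
print(f"input bytes per value:   {input_bytes_per_value:.2f}")
print(f"stored bytes per value:  {stored_bytes_per_value:.2f}")
```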
Disc Storage DB
For security reasons, all metering data is held on disc. The data is organized so that the readings collected for a given meter are stored close together on the disc, giving the customer very efficient read access to the information relevant to them. For writes, a dedicated buffer is used to make data storage equally fast. As a result, reads and writes are both fast and do not disturb each other. The size of the database does not influence the performance of the system, whether it stores 1 year or, for example, 10 years of data.
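The idea of a per-meter layout combined with a write buffer can be illustrated with a toy model. This is a minimal sketch and not the PulsarTSDB implementation: the class `MeterStore`, its methods and the `flush_threshold` parameter are hypothetical names, and Python lists stand in for contiguous regions on disc.

```python
from collections import defaultdict


class MeterStore:
    """Toy model: readings arrive interval by interval, but are stored meter by meter."""

    def __init__(self, flush_threshold=4):
        self.flush_threshold = flush_threshold   # hypothetical tuning parameter
        self.buffer = []                         # incoming readings, in arrival order
        self.segments = defaultdict(list)        # meter_id -> readings kept together

    def write(self, meter_id, timestamp, value):
        # Writes only append to the buffer, so they stay cheap.
        self.buffer.append((meter_id, timestamp, value))
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sort by meter, then by time, so each meter's segment is extended in one run.
        for meter_id, timestamp, value in sorted(self.buffer):
            self.segments[meter_id].append((timestamp, value))
        self.buffer.clear()

    def read(self, meter_id):
        # A read touches one meter's segment plus that meter's pending buffer entries.
        pending = sorted((t, v) for m, t, v in self.buffer if m == meter_id)
        return self.segments.get(meter_id, []) + pending


store = MeterStore(flush_threshold=4)
store.write("m1", 0, 1.0)
store.write("m2", 0, 5.0)
store.write("m1", 900, 1.5)
print(store.read("m1"))   # served partly from the buffer, before any flush
```

In this model a read for one meter never scans other meters' stored segments, which is one way to picture why read time need not grow with the total size of the database.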
In Memory DB
PulsarTSDB uses In-Memory storage, which is 100 times faster than the Disc Storage DB, to handle the most recent data. Data is exchanged between the two types of storage periodically by a complex and efficient mechanism. The full set of historical data held In-Memory is synchronized through the Cache Storage to ensure full data content security and data flow.
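The text does not describe the exchange mechanism in detail, so the following is only a simplified picture of a two-tier store with periodic synchronization. All names (`TieredStore`, `window_seconds`, `sync`) are hypothetical, and plain dictionaries stand in for both the In-Memory and the disc tier.

```python
class TieredStore:
    """Toy two-tier store: recent readings in memory, everything persisted on sync."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds   # how much recent data stays in memory
        self.memory = {}                       # (meter_id, timestamp) -> value
        self.disc = {}                         # stand-in for the disc tier
        self.unsynced = set()                  # keys written since the last sync

    def write(self, meter_id, timestamp, value):
        key = (meter_id, timestamp)
        self.memory[key] = value
        self.unsynced.add(key)

    def sync(self, now):
        # Step 1: copy everything written since the last sync to the disc tier.
        for key in self.unsynced:
            self.disc[key] = self.memory[key]
        self.unsynced.clear()
        # Step 2: evict readings older than the window; they remain on disc.
        cutoff = now - self.window_seconds
        for key in [k for k in self.memory if k[1] < cutoff]:
            del self.memory[key]

    def read(self, meter_id, timestamp):
        key = (meter_id, timestamp)
        if key in self.memory:          # fast path for recent data
            return self.memory[key]
        return self.disc.get(key)       # older data is served from disc


store = TieredStore(window_seconds=3600)
store.write("m1", 0, 1.0)
store.write("m1", 4000, 2.0)
store.sync(now=4500)
print(store.read("m1", 0), store.read("m1", 4000))
```

Because eviction happens only after the copy step, a reading is never dropped from memory before it exists on the disc tier in this sketch.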
Computation Engine VEE Aggregation
PulsarTSDB includes a Computation Engine that provides several computation functions performed on the data received from the Smart Metering System.
The set consists of:
- Validation, Estimation and Editing (VEE),
- Threshold Check,
- Outage Detection,
- Virtual Grouping,
- Blackout Warning,
- Energy and Water Theft Detection,
- Wrong Billing Detection,
- Meter Performance Tracking.
(The product can be extended with other functionalities on customer demand.)
Some of the functionalities require access to, and analysis of, all relevant data collected over past time periods (e.g. 1 month) while delivering results on the fly. Computation procedures can also be run in batch mode.
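To make the Validation and Estimation steps from the list above more concrete, here is a minimal sketch for one meter's interval readings. The rules are assumptions chosen for illustration, not the rules used by the product: a reading is treated as invalid when it is missing, negative or above `max_value`, and invalid readings are estimated by linear interpolation between the nearest valid neighbours. The function name and flags are hypothetical.

```python
def validate_and_estimate(readings, max_value):
    """Flag each interval reading as 'valid', 'estimated' or 'missing'.

    readings: list of floats (or None for a gap) at regular intervals.
    Returns a list of (value, flag) tuples of the same length.
    """
    valid = [r is not None and 0 <= r <= max_value for r in readings]
    result = []
    for i, reading in enumerate(readings):
        if valid[i]:
            result.append((reading, "valid"))
            continue
        # Find the nearest valid reading on each side of the gap.
        left = next((j for j in range(i - 1, -1, -1) if valid[j]), None)
        right = next((j for j in range(i + 1, len(readings)) if valid[j]), None)
        if left is None or right is None:
            # No neighbour on one side: leave the value open instead of guessing.
            result.append((None, "missing"))
        else:
            fraction = (i - left) / (right - left)
            estimate = readings[left] + fraction * (readings[right] - readings[left])
            result.append((estimate, "estimated"))
    return result


for value, flag in validate_and_estimate([1.0, None, 3.0, 999.0, 5.0, None], max_value=100.0):
    print(value, flag)
```

Keeping the flag next to each value lets a later Editing step, or a billing check, distinguish measured values from estimated ones.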
CSV bulk meter input
Data is collected as an external Input Data Stream of bulk CSV files containing meter readings. The input may originate from different meter types and be received from many concentrators at the same time. Statistical methods are used to queue the input stream efficiently for storage in PulsarTSDB, avoiding overload of the network connections.
PulsarTSDB is fully integrated with the Oracle Database system: although it is physically separate from Oracle, Oracle sees it as part of the system. SOAP and Java interfaces are provided and can be used to access the database.
Data Base Management encompasses functionalities such as Backup Management, Data Base Monitoring and Queue Management, and enables information storage mapping.
DB Management is also responsible for the distribution and replication/synchronization of data between the system's nodes. If a node fails, the other node of its replication pair is able to take over without any data loss. All nodes are equally responsible for the distribution of data. Because the data is distributed, a large number of consumers can access their own information at the same time.
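As an illustration of bulk CSV intake, the sketch below parses one file and groups the readings by meter. The column layout (`meter_id`, `timestamp`, `channel`, `value`) and the sample rows are invented for this example, since the text does not specify the file format. The statistical queueing of the input stream is not modelled here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical file content; the real column layout is not given in the text.
SAMPLE = """meter_id,timestamp,channel,value
M001,2024-01-01T00:00:00,active_energy_kwh,1.25
M002,2024-01-01T00:00:00,active_energy_kwh,n/a
M001,2024-01-01T00:15:00,active_energy_kwh,1.40
M002,2024-01-01T00:15:00,active_energy_kwh,0.80
"""


def parse_bulk_csv(text):
    """Return (readings grouped by meter, rows rejected as unparseable)."""
    grouped = defaultdict(list)
    rejected = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            value = float(row["value"])
        except (TypeError, ValueError):
            # Missing or non-numeric value: keep the row for later inspection.
            rejected.append(row)
            continue
        grouped[row["meter_id"]].append((row["timestamp"], row["channel"], value))
    return dict(grouped), rejected


grouped, rejected = parse_bulk_csv(SAMPLE)
print(sorted(grouped), len(rejected))
```

Grouping by meter at intake matches the per-meter organization of the stored data, and collecting rejected rows rather than discarding them leaves room for a later validation step.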
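The behaviour of a replication pair can be pictured with a small model. This sketch assumes that each meter is assigned to one pair of nodes by a stable hash of its identifier and that every write goes to both nodes of the pair. Neither detail is stated in the text, and the names `Node`, `Cluster` and `_pair_for` are hypothetical.

```python
import zlib


class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}   # meter_id -> {timestamp: value}


class Cluster:
    """Toy cluster made of replication pairs."""

    def __init__(self, pairs):
        self.pairs = pairs   # list of (Node, Node) tuples

    def _pair_for(self, meter_id):
        # Stable hash, so a given meter always maps to the same pair (assumed scheme).
        return self.pairs[zlib.crc32(meter_id.encode()) % len(self.pairs)]

    def write(self, meter_id, timestamp, value):
        # Every live node of the pair receives the write.
        for node in self._pair_for(meter_id):
            if node.alive:
                node.data.setdefault(meter_id, {})[timestamp] = value

    def read(self, meter_id, timestamp):
        # The first live node of the pair answers; its partner takes over on failure.
        for node in self._pair_for(meter_id):
            if node.alive:
                return node.data.get(meter_id, {}).get(timestamp)
        raise RuntimeError("both nodes of the replication pair are down")


cluster = Cluster([(Node("a1"), Node("a2")), (Node("b1"), Node("b2"))])
cluster.write("m1", 0, 1.0)
cluster._pair_for("m1")[0].alive = False   # simulate the failure of one node
print(cluster.read("m1", 0))               # still answered by the partner node
```

The sketch leaves out what a real system also needs, such as bringing a recovered node back in sync with the writes it missed.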