Date/Time: Mon, 06 May 2024 01:20:59 +0000



Intraday and historical data downloading robustness

View Count: 799

[2015-12-04 16:45:09]
i960 - Posts: 360
Countless times I've seen situations where an interrupted download results in partial data being inserted into intraday and historical data files. This has happened for multiple reasons for me (resuming a laptop and having the battery die mid-download on a bunch of charts as an example). The fix is not fun - it requires nuking every intraday and daily file since the downloads started as they're all effectively missing data and it won't be noticed (at least in the current session).

Is Sierra blindly inserting data right into the data file as it's downloading data for it? Surely that's not going to work so well if these files aren't checksummed in any way and just blindly trusted by SC to have complete data when they might be missing sections of it (particularly intraday data files). IMO the issue here is in writing interim data directly to files without any strong error checking available for them.

Based on what I see here:

http://www.sierrachart.com/index.php?page=doc/doc_IntradayDataFileFormat.html

I see no provision for actual error detection or defense against multiple intraday records being written out with any kind of gap detection (gap as in data wasn't downloaded, but new data is still appended). Are there any plans to increase the robustness of how this data is updated from the server side?
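As a rough illustration of the kind of gap detection being asked about, here is a minimal Python sketch. It is not based on Sierra Chart's code or its file format: `find_gaps`, the in-memory list of timestamps, and the five-minute threshold are all assumptions made for the example. A real check would also have to account for session breaks and quiet markets, where long pauses between records are legitimate.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, max_gap):
    """Return (previous, next) pairs of consecutive record times that are
    further apart than max_gap. Assumes timestamps are in ascending order."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > max_gap:
            gaps.append((prev, cur))
    return gaps

# Hypothetical record times: a download stalls after 09:30:10 and
# real-time data starts being appended again at 11:00:00.
records = [
    datetime(2015, 12, 4, 9, 30, 0),
    datetime(2015, 12, 4, 9, 30, 5),
    datetime(2015, 12, 4, 9, 30, 10),
    datetime(2015, 12, 4, 11, 0, 0),
    datetime(2015, 12, 4, 11, 0, 5),
]

# The threshold is an arbitrary assumption chosen for this example.
gaps = find_gaps(records, timedelta(minutes=5))
for start, end in gaps:
    print(f"possible gap between {start} and {end}")
```

A scan like this only reports suspicious intervals; deciding whether an interval is really missing data still needs a comparison against the server.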
[2015-12-05 12:55:50]
Sierra Chart Engineering - Posts: 104368
We do not understand or agree with what you are saying. Sierra Chart handles historical Intraday data in the most reliable way possible. There can be historical Intraday data download failures and when that is the case, real-time data after the failure has to be maintained and the chart has to be updated. The user can see this in the chart and can attempt to re-download the missing data from before that point if they want.

However, this kind of case is different and definitely will not result in missing data:
(resuming a laptop and having the battery die mid-download on a bunch of charts as an example).

When historical data is downloaded, the download starts from the last Date-Time in the Intraday data file. The data is written to the file in ascending time order. In some cases, the very last record or records that were in the file before the download started are overwritten with fresh data. If the historical Intraday data download is interrupted, only what was received is written to the Intraday data file, so there is not going to be missing data.
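The behavior described above can be sketched roughly as follows. This is only an illustration of the described logic, not Sierra Chart's implementation: `resume_point`, `merge_download`, the integer timestamps, and the in-memory lists standing in for the data file are all assumptions for the example.

```python
def resume_point(existing):
    """Return the last timestamp in the file, or None if the file is empty.
    The next download request would start from this point."""
    return existing[-1][0] if existing else None

def merge_download(existing, downloaded):
    """Merge downloaded records into the existing ones.

    Both arguments are lists of (timestamp, value) in ascending time order.
    Existing records at or after the first downloaded timestamp are
    overwritten, and the rest of the download is appended."""
    if not downloaded:
        return list(existing)
    start = downloaded[0][0]
    kept = [rec for rec in existing if rec[0] < start]
    return kept + list(downloaded)

# Hypothetical file contents; the last record is stale or partial.
existing = [(1, "a"), (2, "b"), (3, "c-partial")]

# The server would return records from timestamp 3 onward, but the
# download is interrupted after only two records arrive.
full_response = [(3, "c"), (4, "d"), (5, "e")]
received = full_response[:2]

merged = merge_download(existing, received)
print(merged)
print(resume_point(merged))
```

Because only a contiguous prefix of the response is written, the file still ends at a well-defined Date-Time, and the next session can request data from that point onward.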

In the scenario you gave where there is a power loss to the computer, when Sierra Chart is run again, the data missing after the last Date-Time in the data file will be downloaded.

Real-time data would not have been written in this scenario so there will not be a time gap preventing the downloading of the missing data.

Why you would do this, we have no idea and we have no comment on this:
The fix is not fun - it requires nuking every intraday and daily file since the downloads started as they're all effectively missing data and it won't be noticed (at least in the current session).

There certainly is never a need to do the above unless a file were to be damaged at the file system level which is a very rare case.

In the hypothetical case that data were missing in an Intraday chart data file somewhere other than at the end, there is a procedure to re-download data for multiple symbols with a single command.

We have no time to be spending on this. It has only been responded to in order to maintain the accuracy of public information. We have far more important things to be doing than spending time on nonexistent issues or trying to solve something that has no reasonable solution.

Also, the concept of a checksum on an Intraday data file is extremely impractical, not to mention the significant time and overhead of doing the calculations, which would interrupt real-time processing.

Consider all the various sources of historical data: some of them do not come from central sources and vary between sources; data can be and is dropped with high-frequency UDP data feeds; data is dropped in real-time transmission from numerous sources, including Interactive Brokers; the transition from historical data to real-time data is sometimes imperfect; and historical Intraday data downloads can fail. Given all this, it is lunacy to try to maintain perfect integrity of an Intraday database and to try to recover that data automatically.

Even the Chicago Mercantile Exchange does not transmit data using a checksum. They use 2 parallel UDP data feeds to minimize packet loss, but they know packet loss will still occur even when the feeds are combined. There simply is no checksum to ensure integrity. But there is a sequence number to know when there are dropped packets, although certainly nothing can be done to recover those in real time.
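To illustrate the idea of combining two redundant feeds and using sequence numbers to detect drops, here is a simplified Python sketch. It does not reflect the exchange's actual protocol or message layout: `arbitrate`, the `(seq, payload)` tuples, and the batch processing (instead of streaming) are assumptions for the example.

```python
def arbitrate(feed_a, feed_b):
    """Combine two redundant feeds of (seq, payload) packets.

    Returns (messages, missing): the payloads in sequence order with
    duplicates removed, and the sequence numbers received on neither feed."""
    seen = {}
    for seq, payload in list(feed_a) + list(feed_b):
        # Keep the first copy of each sequence number, ignore the duplicate.
        seen.setdefault(seq, payload)
    if not seen:
        return [], []
    first, last = min(seen), max(seen)
    messages = [seen[s] for s in sorted(seen)]
    missing = [s for s in range(first, last + 1) if s not in seen]
    return messages, missing

# Hypothetical packets: feed A drops 3 and 4, feed B drops 2 and 4.
feed_a = [(1, "p1"), (2, "p2"), (5, "p5")]
feed_b = [(1, "p1"), (3, "p3"), (5, "p5")]

messages, missing = arbitrate(feed_a, feed_b)
print(messages)
print(missing)
```

Packet 4 was lost on both feeds, so the sequence numbers reveal the drop, but nothing in the two feeds themselves can restore it.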
Sierra Chart Support - Engineering Level
Date Time Of Last Edit: 2015-12-05 13:42:13