Support Board


Date/Time: Sat, 20 Apr 2024 08:03:44 +0000



[User Discussion] - Allow Disabling of Compression or Do It in Another Thread

View Count: 1863

[2013-11-18 01:27:37]
bfalk - Posts: 33
As the topic says. It is painful to wait for NTFS compression to complete, especially when Sierra Chart's rendering thread is locked while it is doing so. Please allow for this to be disabled or do it in another thread. It's no fun to get stuck in a position because I decided to download historical data in another chart.

-B
[2013-11-18 01:31:51]
Sierra Chart Engineering - Posts: 104368
It is the operating system which performs the compression and is actually blocking. Normally this is not an issue because the compression is started when the file size is small.


Did you copy an uncompressed chart data file into the Sierra Chart data folder and then open it? This really is the only scenario where there would be a potential problem.

Also, historical data is downloaded on a background thread in many cases, depending upon the particular service.
Sierra Chart Support - Engineering Level

Your definitive source for support. Other responses are from users. Try to keep your questions brief and to the point. Be aware of support policy:
https://www.sierrachart.com/index.php?l=PostingInformation.php#GeneralInformation

For the most reliable, advanced, and zero cost futures order routing, *change* to the Teton service:
Sierra Chart Teton Futures Order Routing
Date Time Of Last Edit: 2013-11-18 01:33:17
[2013-11-18 01:34:29]
bfalk - Posts: 33
I'm using IQFeed, and when I fetch data for the first time SC seems to download it uncompressed and then compress it all at once. I'm not sure if NTFS supports streaming compression, which would get rid of the problem (compress while you download).

It's not terribly urgent, but it certainly is a bit annoying to have to deal with. Personally I would prefer to just have the files uncompressed as I do not have a disk space limitation.
[2013-11-18 01:42:59]
Sierra Chart Engineering - Posts: 104368
We have verified that when the Intraday data file is opened during a historical data download, the compression state is set at that time, before any data is written to it.

This means that if the file is empty, the compression is instantaneous. Data written after that is compressed on the fly.

We recommend disabling any anti-malware/antivirus software you may be running. That could be causing the problem.
Date Time Of Last Edit: 2013-11-18 01:44:13
[2013-11-18 04:36:46]
bfalk - Posts: 33
I guess I was wrong about it being compression: I disabled compression globally on my system and got no performance gain. That brings me to the question of what is so slow in the Sierra Chart historical data fetch process. The only issue I saw was that SC writes each record one by one (40 bytes each) which has a huge overhead of constantly calling into WriteFile(). SC should at least buffer 1 MB worth of data internally and write it to the disk in chunks.
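
The buffering idea in the paragraph above can be sketched roughly as follows. This is only an illustration and not Sierra Chart's code: the class name BatchedRecordWriter, the placeholder record contents, and the in-memory sink are assumptions made for the example. Only the 40-byte record size and the 1 MB buffer come from the post.

```python
import io
import struct

RECORD_SIZE = 40             # bytes per record, as stated in the post
BUFFER_LIMIT = 1024 * 1024   # flush after roughly 1 MB, the size suggested in the post


class BatchedRecordWriter:
    """Hypothetical helper: collects fixed-size records and writes them in large chunks."""

    def __init__(self, stream, buffer_limit=BUFFER_LIMIT):
        self._stream = stream
        self._buffer = bytearray()
        self._buffer_limit = buffer_limit
        self.write_calls = 0  # number of writes issued to the underlying stream

    def append(self, record):
        if len(record) != RECORD_SIZE:
            raise ValueError("record must be exactly %d bytes" % RECORD_SIZE)
        self._buffer += record
        # Only touch the underlying stream once enough data has accumulated.
        if len(self._buffer) >= self._buffer_limit:
            self.flush()

    def flush(self):
        if self._buffer:
            self._stream.write(bytes(self._buffer))
            self.write_calls += 1
            self._buffer.clear()


def demo(record_count=100_000):
    sink = io.BytesIO()  # stands in for the data file
    writer = BatchedRecordWriter(sink)
    record = struct.pack("<5d", 0.0, 1.0, 2.0, 3.0, 4.0)  # 40 placeholder bytes
    for _ in range(record_count):
        writer.append(record)
    writer.flush()  # write whatever is left over
    return len(sink.getvalue()), writer.write_calls
```

With these assumptions, 100,000 records reach the sink in 4 writes instead of 100,000. Whether that matters in practice depends on how much caching the runtime and the operating system already do.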

The other problem I've noticed is that the number of entries written to disk is twice the number of entries actually downloaded. I'm assuming this takes place in the *.temp.scid to *.scid copy.

Here's my fetch:

HD Request # 2 - Downloading Intraday chart data for @CD# to the file @CD#.temp.scid. Service: dtn | 2013-11-17 23:16:38
Compressing file: M:\SierraChart\Data\@CD#.temp.scid. This may take a while... | 2013-11-17 23:16:38
HD Request # 2 - Requesting Minute data beginning at 1986-07-04 00:16:38 US Eastern Time. | 2013-11-17 23:16:38
IQ Feed: Received Server Connected message from Level 2 port. | 2013-11-17 23:16:38
IQ Feed: Level 2 port indicates market is opened. | 2013-11-17 23:16:38
HD Request # 2 - Receiving Intraday Minute data for @CD# | 2013-11-17 23:16:40
HD Request # 2 - Receiving Intraday data for @CD# starting at 2006-04-21 03:35:00 | 2013-11-17 23:16:40
HD Request # 2 - Timestamp of first Intraday data file record written: 2006-04-21 03:35:00 | 2013-11-17 23:16:40
HD Request # 2 - Received 2111150 records from 2006-04-21 03:35:00 to 2013-11-17 23:16:00 (7.6 years) and wrote 2111150 records for @CD# | 2013-11-17 23:17:22
HD Request # 2 - Intraday download COMPLETE for @CD# | 2013-11-17 23:17:22
HD Request # 3 - Downloading Intraday chart data for @CD# to the file @CD#.temp.scid. Service: dtn | 2013-11-17 23:17:22
HD Request # 3 - Requesting Tick data beginning at 1986-07-04 00:17:22 US Eastern Time. | 2013-11-17 23:17:22
HD Request # 3 - Receiving historical Intraday Tick data for @CD# | 2013-11-17 23:17:23
HD Request # 3 - Receiving Intraday data for @CD# starting at 2013-05-22 00:03:02 | 2013-11-17 23:17:23
HD Request # 3 - Timestamp of first Intraday data file record written: 2013-05-22 00:03:02 | 2013-11-17 23:17:23
HD Request # 3 - Received 4103446 records from 2013-05-22 00:03:02 to 2013-11-17 23:16:49 (180.0 days) and wrote 4103446 records for @CD# | 2013-11-17 23:18:53
HD Request # 3 - Copying downloaded Intraday chart data from @CD#.temp.scid to @CD#.scid | 2013-11-17 23:18:53
Compressing file: M:\SierraChart\Data\@CD#.scid. This may take a while... | 2013-11-17 23:18:53
HD Request # 3 - Timestamp of first Intraday data file record written: 2006-04-21 03:35:00 | 2013-11-17 23:18:53
HD Request # 3 - Intraday download COMPLETE for @CD# | 2013-11-17 23:21:02

The fact that it took 4.5 minutes just to download @CD# is a little bit shocking, especially as I have about 15 different futures I follow and @CD# is probably one of the least active ones.
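
The timestamps in the log above allow a rough breakdown of where the time went. The sketch below is just arithmetic on those log lines. The phase labels are informal names chosen here, not Sierra Chart terminology.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (phase label, start, end, records) copied from the log lines above.
PHASES = [
    ("minute download", "2013-11-17 23:16:38", "2013-11-17 23:17:22", 2111150),
    ("tick download", "2013-11-17 23:17:22", "2013-11-17 23:18:53", 4103446),
    ("copy to @CD#.scid", "2013-11-17 23:18:53", "2013-11-17 23:21:02", None),
]


def phase_seconds(start, end):
    """Elapsed seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


def summarize(phases=PHASES):
    """Return (label, seconds, records per second or None) for each phase."""
    rows = []
    for label, start, end, records in phases:
        seconds = phase_seconds(start, end)
        rate = records / seconds if records else None
        rows.append((label, seconds, rate))
    return rows


if __name__ == "__main__":
    for label, seconds, rate in summarize():
        rate_text = "" if rate is None else " (%.0f records/s)" % rate
        print("%-20s %5.0f s%s" % (label, seconds, rate_text))
```

The phases come to 44 s, 91 s, and 129 s, for a total of 264 s (about 4.4 minutes). The two downloads run at roughly 48,000 and 45,000 records per second, while the final copy alone accounts for nearly half of the elapsed time.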

My initial mention of putting it in a different thread still stands, as there is no reason that the file downloading, parsing, and saving should be running on the same thread as the main message loop. I hate when the GUIs of programs become unresponsive because they have a task that should be in the background in the foreground.

-B
[2013-11-18 06:30:30]
Sierra Chart Engineering - Posts: 104368
> The only issue I saw was that SC writes each record one by one (40 bytes each) which has a huge overhead of constantly calling into WriteFile().
If this were really true, this would have been resolved 15 years ago. This is not correct. This is a very efficient function and the OS does very good caching. You can actually see, that there have been more than 6 million records written in a reasonable amount of time. The problem is at the very end with the final data copy.

This is the source of the problem:
HD Request # 3 - Copying downloaded Intraday chart data from @CD#.temp.scid to @CD#.scid | 2013-11-17 23:18:53
Compressing file: M:\SierraChart\Data\@CD#.scid. This may take a while... | 2013-11-17 23:18:53
HD Request # 3 - Timestamp of first Intraday data file record written: 2006-04-21 03:35:00 | 2013-11-17 23:18:53
HD Request # 3 - Intraday download COMPLETE for @CD# | 2013-11-17 23:21:02
Update to the latest prerelease. This is more efficient now.

If you still have a problem, there is going to be an issue with the file system on your side. This process only takes us a couple of seconds, even with older versions, whereas on your side it is taking about two minutes. This is not normal, even with the way it was designed.



Nevertheless, we are going to write data in batches during a historical data download. I will try to get that out by tomorrow.

We do agree that historical data downloading should be on a background thread. It is done that way for most services that Sierra Chart supports. It is not done this way for IQ Feed but will be once IQ Feed is moved into a separate process through a DTC Bridge program. This is something we will be doing next year.
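
The general pattern of keeping a download off the main loop can be sketched as below. This is a generic illustration, not how Sierra Chart or the planned DTC Bridge works: download_worker and run_with_responsive_loop are hypothetical names, and the "download" is a placeholder loop.

```python
import queue
import threading


def download_worker(symbol, record_count, results):
    """Runs on a background thread; the loop stands in for receive, parse, and write."""
    written = 0
    for _ in range(record_count):
        written += 1
    results.put((symbol, written))  # hand the outcome back to the main loop


def run_with_responsive_loop(symbol="@CD#", record_count=50_000):
    results = queue.Queue()
    worker = threading.Thread(
        target=download_worker, args=(symbol, record_count, results), daemon=True
    )
    worker.start()

    loop_passes = 0
    while True:
        loop_passes += 1  # stands in for one pass of GUI message handling
        try:
            # Wait briefly for a result, then go back to servicing the loop.
            finished = results.get(timeout=0.01)
            break
        except queue.Empty:
            continue

    worker.join()
    return finished, loop_passes
```

The main loop never blocks for longer than the short timeout, so it can keep handling input while the worker runs.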

> I hate when the GUIs of programs become unresponsive because they have a task that should be in the background in the foreground.
Yes. We agree, but this is not a normal case. The main problem you are having appears to be a system problem on your side. However, the latest prerelease should make a significant difference. Try again.

And this is also why we have developed the DTC protocol and an overall Data / Trading service integration architecture by using separate process servers to take the load of interfacing with those services into another isolated process and thread.
Date Time Of Last Edit: 2013-11-18 06:35:41
[2013-11-18 06:58:12]
bfalk - Posts: 33
I really appreciate the positive response. I'm quite a nut about optimization and performance (even sometimes when unnecessary) which is one of the reasons I use SC and not some other bloated .NET software.

I will try out the prerelease tomorrow, as I'm heading off to sleep. The previous benchmark I did was all done on a ramdisk, so disk I/O was not an issue there, and I'm certainly not memory or CPU constrained. I'm also well aware of everything in my kernel FS filter stack, and there is nothing stalling my I/O in there (no AV, no Windows Defender, etc). Hopefully the prerelease gives better results!

For comparison, I have a small piece of C that I wrote up a while back that downloads and writes all of @CD# (tick level, 3,782,500 points) in 16.984 seconds. I can pull with multiple threads increasing performance until I finally saturate my internet bandwidth. I can send you the source if you're interested.

One thing that I would like to see in the DTC Bridge version of IQFeed would be parallel pulling of multiple symbols. One issue with IQFeed (I've brought it up with them previously) is that they fetch on a single thread, meaning iqconnect bottlenecks on decompression on the CPU and not download speed (a 3.2 GHz single core gives about 10 Mbit/s of throughput). However, IQFeed puts each unique socket connection on its own thread, meaning if you open one session for each symbol you can saturate bandwidth.
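
The one-session-per-symbol idea can be sketched as follows. This is an illustration under stated assumptions: fetch_history is a hypothetical stand-in that sleeps instead of talking to IQFeed, the record counts are fake, and SYMBOL_B and SYMBOL_C are placeholder names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_history(symbol):
    """Hypothetical stand-in for one connection that downloads one symbol."""
    time.sleep(0.05)                   # simulates waiting on the feed
    return symbol, len(symbol) * 1000  # fake record count, for illustration only


def fetch_all(symbols, max_workers=4):
    """Fetch every symbol on its own worker and collect the record counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch_history, symbols))


if __name__ == "__main__":
    print(fetch_all(["@CD#", "SYMBOL_B", "SYMBOL_C"]))
```

In this sketch the workers only wait, so threads overlap well. In CPython, real CPU-bound decompression inside the client would not speed up with threads alone. The gain described in the post comes from the feed software handling each connection on its own thread.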

Once again, I greatly appreciate the response. Thanks, and have a great day!
[2013-11-18 07:12:03]
Sierra Chart Engineering - Posts: 104368
Sure thing.

Not sure exactly what you require from IQ Feed. However, you might want to look into our new futures data service:
Announcement: New Sierra Chart Real-Time Futures Data Feed (CME, CBOT, NYMEX, COMEX data)

The data downloading with this service is done with a background thread. And will be further improved when we write in blocks. Probably will write about 16,000 records at a time. Will try to get that out tomorrow or the next day.
[2013-11-18 07:53:06]
bfalk - Posts: 33
I'm very interested in the SC Futures data. I'll probably give it a try for a month, especially as it would eliminate the issues of needing to run two copies of SC and also it would be much better supported as you guys would be in complete control of the API. However, there are a few things that still have me hooked in IQFeed. Mainly the fact that I need to have a real-time feed with an API for my own custom applications.

Things I need IQFeed for:

I need IQFeed because it has a lightweight data provider API (iqconnect) and available protocol specifications. This allows me to personally develop real-time trading and analysis tools which I already depend on. From SC I would want to have a way of accessing the data without having to run SC. If this means I would have to sign an NDA to get access to a developer API, I would be willing. I would be even more interested in protocol specifications for a direct connection to the SC data provider (no bridge, no SC running, I would directly connect to the server). I have very little tolerance for crashes, and I have found crashes in both SC and IQFeed by this point in time and thus I would prefer to be in control of parsing and decompression of data.

Things I would like to get:

One big issue with IQFeed is the fact that it provides data as CSV. It receives it as binary, converts it to CSV locally, then I convert it right back to binary... which is a huge waste of CPU time. I want my feed to come in as binary data, and preferably without any floats, as floating point operations are more expensive than integer operations.

I would like to get more than 120 days of historical tick data.

I would like to get historical BBO or market depth data.

Better than millisecond level resolution, however I'm not sure if CME offers this to anyone but huge organizations.
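
The request above for a binary feed without floats can be illustrated with a fixed-point record. The layout, the price scale of 10,000, and the function names are assumptions made for this sketch. They do not describe the IQFeed, DTC, or .scid formats.

```python
import struct

PRICE_SCALE = 10_000  # assumed fixed-point scale: four decimal places

# Hypothetical layout: timestamp in ms (int64), scaled price (int64), volume (uint32).
TICK = struct.Struct("<qqI")


def encode_tick(timestamp_ms, price_text, volume):
    """Pack one tick, converting a non-negative decimal price string without floats."""
    whole, _, frac = price_text.partition(".")
    frac = (frac + "0000")[:4]  # pad or truncate to four decimal places
    scaled_price = int(whole) * PRICE_SCALE + int(frac)
    return TICK.pack(timestamp_ms, scaled_price, volume)


def decode_tick(payload):
    """Unpack one tick into (timestamp_ms, scaled_price, volume), all integers."""
    return TICK.unpack(payload)


if __name__ == "__main__":
    payload = encode_tick(1384730209000, "0.9575", 3)  # arbitrary example values
    print(len(payload), decode_tick(payload))
```

Each record is 20 bytes and every field stays an integer, so comparisons and arithmetic on prices avoid floating point entirely.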

Things that have already made me happy for SC's feed:

SC's dev team has full control, thus better integration with SC.

Ability to have a high quality feed for my trading service without running two copies of SC (which at times seems buggy).

-B
[2013-11-18 08:43:17]
Sierra Chart Engineering - Posts: 104368
The Sierra Chart real-time futures data feed uses DTC, although some of the logon details are not public. We have to be a little careful about that because the connection is only allowed if someone gets a connection to their trading service. We can provide you the specifications, but give us until January so that we can get more work done in this area.



> I would like to get more than 120 days of historical tick data.
This is currently available. Nearly 3 years.

> I would like to get historical BBO or market depth data.
We certainly do not have historical market depth data, and I cannot imagine ever offering that. You can indirectly get the best bid and offer by looking at the Bid volume and Ask volume of every tick.


> Better than millisecond level resolution, however I'm not sure if CME offers this to anyone but huge organizations.

CME does not offer this to anyone, from what we can see, and they do not even offer millisecond resolution. It does not exist. And we question the validity of even trying to analyze the milliseconds. Unless you fully understand how the CME computers and networks work, we do not see how you can get reliably accurate information from looking at the sending or receiving time of FIX messages.

A data provider claiming to have millisecond time stamping from the CME is, honestly, stretching the truth. They are not completely forthcoming with what they are giving you. Perhaps they disagree, but the fact is that all they are giving you is the sending time of a FIX message, which can contain a large number of trades. The individual trades are time stamped to the second. The CME does not provide millisecond resolution. It is only the FIX protocol which supports milliseconds in the sending time of a message, into which the CME has bundled multiple trades and which it has decided to send out at a particular moment in time.
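
The point about bundled trades can be illustrated with a small sketch. It assumes, as the post describes, that a vendor stamps every trade with the sending time of the message that carried it. The trade values and function names are made up for the example. The timestamp format is the FIX UTC timestamp with milliseconds.

```python
from datetime import datetime


def parse_sending_time(value):
    """Parse a FIX-style UTC timestamp such as 20131118-08:43:17.123."""
    return datetime.strptime(value, "%Y%m%d-%H:%M:%S.%f")


def stamp_trades(sending_time, trades):
    """Attach the message sending time to every (price, size) trade in the bundle."""
    sent = parse_sending_time(sending_time)
    return [(price, size, sent) for price, size in trades]


if __name__ == "__main__":
    bundle = [(9575, 2), (9575, 1), (9576, 5)]  # made-up trades in one message
    for price, size, sent in stamp_trades("20131118-08:43:17.123", bundle):
        print(price, size, sent.isoformat(timespec="milliseconds"))
```

Every trade in the bundle ends up with the same millisecond value, which reflects when the message was sent rather than when each trade occurred.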




Date Time Of Last Edit: 2013-11-18 08:50:21
[2013-11-18 08:54:36]
bfalk - Posts: 33
Thanks for the insight. I've submitted for activation of the SC data feed. I'm quite excited to trade with it tomorrow!

I'm not too worried about historical market depth data, it would have just been neat to have (but I certainly understand how expensive it is to store and distribute).

I was always a bit skeptical about the millisecond level timestamping, what you said makes sense. Thanks for clearing that up.

Thanks for everything!

-B
