
Support Board

Date/Time: Sun, 21 Oct 2018 09:16:09 +0000

[User Discussion] - Offering To The Community: Enhanced Intraday Data File Compression with Large Speedups

Support Request:
[2016-02-04 19:14:19]
bjohnson777 (Brett Johnson) - Posts: 237
In 1bar mode, this program does essentially the same thing as the built-in Intraday Data File Compression, but 10-20x faster.

Where it gets interesting is 4bar mode, where each bar is split into 4 ticks for Open, High, Low, and Close. For some reason, charts load about 6x faster from this output on my system than from the usual 1bar compression, even though the compressed file is 4x larger.

Note that this is an external command-line program (EXE) and not an SC plugin study (DLL), so it's a bit more complicated to run. Details below. The source code compiles cleanly on my 32bit and 64bit systems (Linux and Windows). If you don't want to compile it yourself, I've attached 2 EXEs for Windows. They are self-contained and should run cleanly. If you're using an ancient computer that is 32bit only, choose the 32bit file. If you're running a multi-core system bought within the past several years, choose the 64bit file. If there's a problem, the 32bit file should run on both.

Like the SC built-in function, this program preserves the up/down volume ratios from the tick by tick data.

Support didn't think this was possible, but it works just fine with Ask/Bid Volume (SC_ASKVOL and SC_BIDVOL). I'll be updating my other studies I've posted to make this default in a few days.

I also programmed a 2bar version where up counts are one bar and down counts are the other. For some reason this causes SC to hang and barf. This is probably an SC bug that needs to be looked at. DO NOT use the 2bar output for now.



Have a look at SCIDRecordWrite4Bars() just above main(). This is what I'm using to create the 4 OHLC bars. I've already provided the code, so this speed up needs to be integrated into the SC compress function. It should be easy.
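For readers who don't want to open the source, here is a minimal sketch of the 4bar idea: one consolidated bar becomes four single-price ticks in Open, High, Low, Close order. This is not SCIDRecordWrite4Bars() itself. The BarSketch struct and splitIntoOhlcTicks() are hypothetical names, and putting all trade and volume counts on the Close tick is an assumption made only so the totals stay unchanged.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory bar; not the on-disk SCID record.
struct BarSketch {
    double        time;
    float         open, high, low, close;
    std::uint32_t numTrades, totalVolume, bidVolume, askVolume;
};

// Split one consolidated bar into four single-price ticks (O, H, L, C).
std::array<BarSketch, 4> splitIntoOhlcTicks(const BarSketch& bar) {
    assert(bar.low <= bar.high);  // sanity check on the input bar
    const float prices[4] = {bar.open, bar.high, bar.low, bar.close};
    std::array<BarSketch, 4> ticks{};  // all fields start at zero
    for (std::size_t i = 0; i < ticks.size(); ++i) {
        ticks[i].time = bar.time;  // all four ticks share the bar's timestamp
        ticks[i].open = ticks[i].high = ticks[i].low = ticks[i].close = prices[i];
    }
    // ASSUMPTION: carry every count on the Close tick so sums are preserved.
    ticks[3].numTrades   = bar.numTrades;
    ticks[3].totalVolume = bar.totalVolume;
    ticks[3].bidVolume   = bar.bidVolume;
    ticks[3].askVolume   = bar.askVolume;
    return ticks;
}
```

The real function may distribute volume differently; check the attached source for what it actually does.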

The SCID file format page also needs some updating:

The data types should be size specific and not generic anymore. A long int may be 32bit or 64bit depending on the compiler. An int32_t or uint32_t will be the same on all compilers. Have a look at the top of my source file for what I'm using.
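A minimal sketch of what size-specific types look like in practice. The struct name and the field layout are my assumption of a 40-byte Version 1 record, not a copy of the doc page or of the attached source, so verify against both before relying on it.

```cpp
#include <cstddef>
#include <cstdint>

// ASSUMED layout for illustration only; ScidRecordSketch is a hypothetical name.
#pragma pack(push, 1)
struct ScidRecordSketch {
    double        DateTime;     // assumed 8-byte timestamp
    float         Open;
    float         High;
    float         Low;
    float         Close;
    std::uint32_t NumTrades;    // always 32 bits, unlike "unsigned long"
    std::uint32_t TotalVolume;
    std::uint32_t BidVolume;
    std::uint32_t AskVolume;
};
#pragma pack(pop)

// Fail the build instead of silently writing a wrong-sized record.
static_assert(sizeof(float) == 4, "float must be 32 bits");
static_assert(sizeof(double) == 8, "double must be 64 bits");
static_assert(sizeof(ScidRecordSketch) == 40, "record must be 40 bytes");
```

With generic types, the same struct could change size between a 32bit and a 64bit compiler, and the static_assert lines would catch that at compile time.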

Also s_Record doesn't exist anymore on the doc page.

Pieces from my file notes:

This program is also an example of a working SCID reader. If
Version 2 comes out, this program will likely need updating. Using
this program on a file of another version will probably corrupt that file.

Building and running: This program uses specific sized data types and should
compile cleanly on 32bit and 64bit systems.

Linux:
g++ -O3 --static -o SC_CompressDataUnitSize SC_CompressDataUnitSize.cpp
Copy SC_CompressDataUnitSize to the SC Data directory.
Example: ./SC_CompressDataUnitSize -c4 -tm -u1 GBPUSD.scid.old GBPUSD.scid

Windows:
Change directory to C:\SierraChart\CPPCompiler\bin
Copy the SC_CompressDataUnitSize.cpp source file here.
Open a DOS window.
g++.exe -O3 --static -o SC_CompressDataUnitSize.exe SC_CompressDataUnitSize.cpp
Copy SC_CompressDataUnitSize.exe to the SC Data directory.
If you haven't already, fully exit SC to avoid causing data corruption.
While in the Data directory, rename whatever files you want to compress with
a ".old" extension. In this example GBPUSD.scid gets renamed to GBPUSD.scid.old.
If there is a problem, delete the bad SCID file (GBPUSD.scid in this example)
and rename GBPUSD.scid.old back to GBPUSD.scid.
Open a DOS window and run from the Data directory:
SC_CompressDataUnitSize.exe -c4 -tm -u1 GBPUSD.scid.old GBPUSD.scid

After this program finishes, use SC to export the data to CSV if necessary.
This program includes CSV exports (mainly for debugging), but SC will probably
export cleaner and more usable files.

This program aligns the output bars to the beginning of each time block.
This keeps bars aligned in non-market graphing programs and spreadsheets.
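Aligning to the beginning of a time block amounts to rounding each timestamp down to a multiple of the bar length. A minimal sketch, assuming timestamps held as whole seconds (the SCID file itself may store time differently); alignToBlockStart() is a hypothetical helper, not a function from the attached source.

```cpp
#include <cassert>
#include <cstdint>

// Round a timestamp (in seconds) down to the start of its time block.
std::int64_t alignToBlockStart(std::int64_t seconds, std::int64_t blockSeconds) {
    assert(blockSeconds > 0);
    std::int64_t remainder = seconds % blockSeconds;
    if (remainder < 0) {
        remainder += blockSeconds;  // keep rounding downward for negative times
    }
    return seconds - remainder;
}
```

With 1min bars (blockSeconds = 60), a tick at second 125 lands in the block starting at second 120.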

My ChartBook Load Speed Test:
Original 1 Tick SCID Size: 6 GB (around 4-5 min to load)
smashed by SC = 80 sec (49 MB).
1bar = 80 sec (48 MB). This is essentially the same as smashed by SC.
2bar = ? sec (95 MB). Hangs SC.
4bar = 14 sec (191 MB).

Usage Screen:

Usage: SC_CompressDataUnitSize.exe -opts InFile.scid OutFile.scid
Program to compress Sierra Chart Version 1 SCID tick by tick files down
in size with different methods while preserving up and down volume counts.
This version is 10-20x faster than the built in SC function. The "-c4"
option also offers a 6x run time speed up than the traditional compression.

Options (-opts) start with the dash (-) character followed by a letter and
a number (replaces the #) with some options.
-c#: Bar Consolidation Type. 1 is similar to SC's built in function. 2 hangs SC
for some reason. It will produce an up and down bar for each time unit. 4 will
give the 6x speed up. It produces 4 bars for Open, High, Low, and Close ticks.
-t#: Time Prefix. Options are s for seconds, m for minutes, h for hours, and d
for days.
-u#: Time Units. The number of units for -t. "-tm -u1" would give 1min bars.
Note SCID intraday files can have a maximum bar length of 1 day. Anything over
that will be rounded down to 1 day. Use the daily CSV text format for anything
higher than a day. Watch out for uneven dividing of the time units into 1
trading day. This program does not convert time zones. Be careful with larger
time units.
-x#: Cut days older than # back. This is used for trimming down SCID files.
-y#: Do not process (just pass them through) ticks from the last # days.
Days back are CALENDAR days, not trading days. Watch out for weekends and
holidays. Usually give 3-4 extra days to account for that.
-r: Write out CSV file from the SCID input. Watch out for file size.
-R: Same as -r except more human readable for debugging.
-w: Write out CSV file from the SCID output. Watch out for file size.
-W: Same as -w except more human readable for debugging.
-d: Enable debugging mode. More output is given.
-B: Batch mode. Doesn't display the warning. Use with caution.

Time Options Examples: 1sec bars: -ts -u1. 30sec bars: -ts -u30.
1min bars: -ts -u60 or -tm -u1. 10min bars: -tm -u10.
45min bars: -tm -u45. 1hr bars: -tm -u60 or -th -u1.
4hr bars: -th -u4. 6hr bars: -th -u6. 1day bars: -td -u1.

Convert forex EUR/USD tick by tick data to fast 1min bars discarding anything
older than 30 days and not converting the past 7 days with debug CSV files:
SC_CompressDataUnitSize.exe -c4 -tm -u1 -x30 -y7 -R -W EURUSD.scid.old EURUSD.scid

Version 0.9 2016-02-03 GPL'd and Open Sourced by Brett Johnson
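
The -t/-u rules from the usage screen can be sketched as follows. This is an illustration under assumptions, not code from the attached source: barLengthSeconds() and dividesDayEvenly() are hypothetical names, and returning 0 for bad input is a choice made only for this sketch.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kSecondsPerDay = 86400;

// Convert a -t prefix (s, m, h, d) and a -u count into a bar length in
// seconds, capped at 1 day. Returns 0 for an unknown prefix or units <= 0.
std::int64_t barLengthSeconds(char prefix, std::int64_t units) {
    if (units <= 0) return 0;
    std::int64_t perUnit = 0;
    switch (prefix) {
        case 's': perUnit = 1;     break;
        case 'm': perUnit = 60;    break;
        case 'h': perUnit = 3600;  break;
        case 'd': perUnit = 86400; break;
        default:  return 0;
    }
    // Anything longer than a day is rounded down to 1 day (also avoids overflow).
    if (units > kSecondsPerDay / perUnit) return kSecondsPerDay;
    return units * perUnit;
}

// True when the bar length splits a 24-hour day into equal blocks.
bool dividesDayEvenly(std::int64_t barSeconds) {
    assert(barSeconds > 0);
    return kSecondsPerDay % barSeconds == 0;
}
```

For example, "-tm -u60" and "-th -u1" both come out to 3600 seconds, and a 7 hour bar would be flagged as dividing a 24-hour day unevenly. A real trading session may be shorter than 24 hours, so the even-division check here is only a rough guide.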

List of my programs available on "Brett Johnson's Standard Tool Kit" DLL page.
Date Time Of Last Edit: 2016-03-21 20:35:23
Attachment: SC_CompressDataUnitSize_32bit.exe - Attached On 2016-02-04 18:29:05 UTC - Size: 121.5 KB - 51 views
Attachment: SC_CompressDataUnitSize_64bit.exe - Attached On 2016-02-04 18:29:10 UTC - Size: 189 KB - 61 views
Attachment: SC_CompressDataUnitSize.cpp - Attached On 2016-02-04 19:01:35 UTC - Size: 45.4 KB - 112 views
[2016-02-24 16:39:31]
sigmadict - Posts: 91
I am interested in your post but I don't really understand what this will speed up.
Is it the chart loading the data? Replay mode too?

[2016-02-25 07:19:00]
bjohnson777 (Brett Johnson) - Posts: 237
When the chart is first opened, the 4bar mode loads much faster. Recalcs are also faster. For some reason, 4 separate bars as ticks load much faster than a single consolidated bar of the same data. The developers haven't commented on it.

If you have a lot of older tick data you want to compress down to a smaller size (like to 1min), it's worth looking into.

The developers fixed part of the doc page, but the specific data type sizes should be: (missing 't' and underscore is in the wrong place)
//As seen from:
#include <stdint.h>

[2016-02-26 05:12:21]
sigmadict - Posts: 91
Thanks for your answer.
I am not a programmer, so I don't understand the code.
I was interested in making Sierra Chart faster and I was curious reading your post.
Do you only need to install the .exe to make it work, or do you need advanced skills?
Also, is this going to reduce the accuracy of Tick by Tick Simulation Replays, or of order entries?
Sorry if I ask simple questions.

[2016-02-26 09:58:46]
bjohnson777 (Brett Johnson) - Posts: 237
The code block is for the developers. They need to fix that page.

The program does the same thing as File >> Data/Trade Service Settings >> Data File Management. Long description here:

Both methods take multiple ticks and combine them into a single unit. The less tick data there is to compute through, the faster it will be.
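To show what "combine them into a single unit" means, here is a minimal sketch of merging ticks into time-based bars while summing bid and ask volume. It is not the SC implementation or my program's actual code: TickSketch, MergedBar, and consolidate() are hypothetical names, and it assumes ticks sorted by time with timestamps in whole, non-negative seconds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct TickSketch {
    std::int64_t  time;       // seconds, assumed non-negative and sorted
    float         price;
    std::uint32_t bidVolume;
    std::uint32_t askVolume;
};

struct MergedBar {
    std::int64_t  time;       // start of the time block
    float         open, high, low, close;
    std::uint32_t numTrades, bidVolume, askVolume;
};

// Merge consecutive ticks that fall into the same time block into one bar.
std::vector<MergedBar> consolidate(const std::vector<TickSketch>& ticks,
                                   std::int64_t blockSeconds) {
    assert(blockSeconds > 0);
    std::vector<MergedBar> bars;
    for (const TickSketch& t : ticks) {
        assert(t.time >= 0);
        const std::int64_t start = t.time - t.time % blockSeconds;
        if (bars.empty() || bars.back().time != start) {
            // First tick of a new block sets all four prices.
            bars.push_back({start, t.price, t.price, t.price, t.price, 0, 0, 0});
        }
        MergedBar& b = bars.back();
        if (t.price > b.high) b.high = t.price;
        if (t.price < b.low)  b.low  = t.price;
        b.close      = t.price;      // last tick seen in the block
        b.numTrades += 1;
        b.bidVolume += t.bidVolume;  // volume counts are summed, not dropped
        b.askVolume += t.askVolume;
    }
    return bars;
}
```

Four ticks inside two different minutes come out as two bars, so a chart has far fewer records to read while the summed volume counts stay intact.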

If you have to have tick by tick data for your replays, this won't work very well and neither method is recommended. Simulated orders during the replay will change their entry/exit points a little. Live orders are unaffected.

For what I'm interested in, I just need the results from the tick by tick data for accurate up/down volume counts. Those will be preserved. In my case, I convert individual ticks into 1 minute bars. It converted a 6g intraday data file into 200megs.

I ran across another post where the developers mention they are working on increasing the data file speed. I'm not sure what they're doing, but they think it will be out in a couple months.
