Support Board


Date/Time: Thu, 01 Jan 2026 15:14:54 +0000



[User Discussion] - Python for Sierra Chart


[2024-02-13 16:44:01]
User150671 - Posts: 70
If you don't need the convenience of a dataframe, you can save the extra overhead by just reading into and writing from a np.array.
Example:

read a scid file to a np.array:

import sys
from pathlib import Path
import numpy as np

def get_scid_np(scidFile, limitsize=sys.maxsize):
  # Check Path Exists
  f = Path(scidFile)
  assert f.exists(), "SCID file not found"
  # set the offset from limitsize
  stat = f.stat()
  offset = 56 if stat.st_size < limitsize else stat.st_size - (
    (limitsize // 40) * 40)

  ### Setup the SCID dtype
  sciddtype = np.dtype([
    ('Time', '<u8'),
    ('Open', '<f4'),
    ('High', '<f4'),
    ('Low', '<f4'),
    ('Close', '<f4'),
    ('Trades','<i4'),
    ('Volume', '<i4'),
    ('BidVolume', '<i4'),
    ('AskVolume', '<i4')
  ])
  scid = np.fromfile(scidFile, dtype=sciddtype, offset=offset)
  return scid



Write a np.array to a scid file:

def write_scid(scidarray, outfile, mode='w'):
  ## Write the 56-byte header first in write mode
  if mode == 'w':
    with open(outfile, 'wb') as fscid:
      header = b'SCID8\x00\x00\x00(\x00\x00\x00\x01\x00'
      fscid.write(header)
      for i in range(21):
        fscid.write(b'\x00\x00')
      fscid.write(scidarray.tobytes())

  # Otherwise append the scid array
  else:
    with open(outfile, 'ab') as fscid:
      fscid.write(scidarray.tobytes())

  return
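For reference, the 56-byte read offset used in get_scid_np matches what write_scid emits: a 14-byte stub plus 21 two-byte zero pads. A quick sanity check (my reading of the stub bytes, per the Intraday Data File Format docs):

```python
# The 14-byte stub written above: b'SCID' magic, then what appear to encode the
# header size (0x38 = 56) and record size (0x28 = 40) as little-endian ints.
header = b'SCID8\x00\x00\x00(\x00\x00\x00\x01\x00'
padding = b'\x00\x00' * 21  # the zero fill emitted by the range(21) loop
full_header = header + padding
```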

Also, to whoever wrote this piece of code originally, which I see is still being used above:
BCOLS = ['datetime', 'open', 'high', 'low', 'close', 'volume', 'trades', 'bidVolume', 'askVolume']

rectype = np.dtype([
  (BCOLS[0], '<u8'), (BCOLS[1], '<f4'), (BCOLS[2], '<f4'),
  (BCOLS[3], '<f4'), (BCOLS[4], '<f4'), (BCOLS[6], '<i4'),
  (BCOLS[5], '<i4'), (BCOLS[7], '<i4'), (BCOLS[8], '<i4')
])


I didn't notice what you had done and just took your BCOLS and rectype and made them one variable... because why define something only to use it to define another variable?
Problem is, the BCOLS order DOES NOT MATCH THE ORDER THE FIELDS ARE DEFINED IN THE DTYPE (volume and trades are switched around). Took me about 40 minutes to figure out how I had broken my scid file generation. Well played, sir.

You can define the np.dtype in one variable as follows:

sciddtype = np.dtype([
('Time', '<u8'),
('Open', '<f4'),
('High', '<f4'),
('Low', '<f4'),
('Close', '<f4'),
('Trades','<i4'),
('Volume', '<i4'),
('BidVolume', '<i4'),
('AskVolume', '<i4')
])
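The underlying gotcha, for anyone else: numpy structured dtypes map fields to bytes by declaration order, not by name. A minimal made-up demo:

```python
import numpy as np

# Write two ints labelled Trades=7, Volume=9...
raw = np.array([(7, 9)], dtype=np.dtype([("Trades", "<i4"), ("Volume", "<i4")])).tobytes()

# ...then read the same bytes with the fields declared in the opposite order:
# the names silently attach to the wrong values.
swapped = np.frombuffer(raw, dtype=np.dtype([("Volume", "<i4"), ("Trades", "<i4")]))
print(swapped["Volume"][0], swapped["Trades"][0])  # 7 9 - swapped!
```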

Date Time Of Last Edit: 2024-02-13 16:47:25
[2024-02-13 19:30:19]
rudy5 - Posts: 5
Write a np.array to a scid file:
Nice, thanks. I've been using this to plot my custom stuff: https://github.com/louisnw01/lightweight-charts-python/tree/main
[2024-02-19 22:08:45]
Deiter - Posts: 14
Many thanks to everyone who has contributed to this thread. I have borrowed from your great work and added some Polars (rather than Pandas) work to further my own work with Sierra Chart data. I present this work here (it's an extract of a Jupyter Notebook that I am running in VS Code), in the hope of repaying some of the goodness y'all have provided.

# Purpose: Getting Sierra Chart .scid file information into a Polars dataframe for analysis and backtesting

# Approach:
# - Use Polars, rather than Pandas, to reap increased speed and large-file handling benefits.
# - Use Numpy to extract binary data from scid file and cast into appropriate data types in a Polars dataframe.
# - Handling the SCDateTime data is the hardest part, as it cleverly contains both timestamp and trade count information in a single 64-bit integer, so I use some string methods and regex to extract the desired information.

# Outcome Notes:
# - My final dataframe (df_extract) requires further work to get the timestamp info (in column "SCDateTime_ms", signifying that it is an integer of milliseconds), into a Python datatype for time analysis. I also need to account for the cases where the trade counter information maxes out at 999 (see the Reddit discussion referenced below for details).
# - According to the Jupyter Notebook I ran this in, getting to my final dataframe (thus far) took 4.4 seconds for a 41 million row ES futures .scid file.
# - I'm not interested in comparing different approaches to see speed differences (there are many articles online about the speed of vectorized Numpy operations vs loops and Polars vs Pandas), so I don't plan to run a bunch of speed tests. 41M rows in 4.4s is good enough for me right now.

# References:
# - Sierra Chart Support Board thread Python for Sierra Chart
# - Reddit post on reading scid data https://www.reddit.com/r/algotrading/comments/169ax2x/the_hello_world_guide_to_reading_execution_data/
# - Awesome Sierra Chart documentation Intraday Data File Format
# - The Polars website https://pola.rs/

import sys
from pathlib import Path

import numpy as np
import polars as pl


# The following section is borrowed, with one small change to the data type definition, from the Sierra Chart Support Board referenced above.


def get_scid_np(scidFile, limitsize=sys.maxsize):
    # Check Path Exists
    f = Path(scidFile)
    assert f.exists(), "SCID file not found"

    # set the offset from limitsize
    stat = f.stat()
    offset = 56 if stat.st_size < limitsize else stat.st_size - ((limitsize // 40) * 40)

    # Setup the SCID dtype
    sciddtype = np.dtype(
        [
            ("SCDateTime", "<u8"),
            ("Open", "<f4"),
            ("High", "<f4"),
            ("Low", "<f4"),
            ("Close", "<f4"),
            ("NumTrades", "<u4"),
            ("TotalVolume", "<u4"),
            ("BidVolume", "<u4"),
            ("AskVolume", "<u4"),
        ]
    )

    scid_as_np_array = np.fromfile(scidFile, dtype=sciddtype, offset=offset)
    return scid_as_np_array

intermediate_np_array = get_scid_np("ESH24_FUT_CME.scid")
df_raw = pl.DataFrame(intermediate_np_array)


# Now convert the SCDateTime 64-bit integer into a string.
df_SCDateTime_as_string = df_raw.with_columns(
    pl.col("SCDateTime").cast(pl.Utf8).alias("SCDateTime_string"),
)


# Trickiest part thus far. Use regex to extract the timestamp info and trade counter
# info and cast them back into integers for further analysis.
df_extract = df_SCDateTime_as_string.with_columns(
    pl.col("SCDateTime_string")
    .str.extract(r"^.{13}", 0)
    .cast(pl.Int64)
    .alias("SCDateTime_ms"),
    pl.col("SCDateTime_string")
    .str.extract(r".{3}$", 0)
    .cast(pl.Int32)
    .alias("SCDateTime_trade_counter"),
)

Date Time Of Last Edit: 2024-02-19 22:19:11
[2024-02-19 23:15:07]
User150671 - Posts: 70
Handling the SCDateTime data is the hardest part, as it cleverly contains both timestamp and trade count information in a single 64-bit integer, so I use some string methods and regex to extract the desired information.

I was not tracking that it contains any trade count info whatsoever, ref:
Intraday Data File Format: DateTime

I have the following functions for converting sierra time integers:

import datetime as dt
import pandas as pd

# takes a sierra time integer and returns it as a dt.datetime
def unSierraTime(timeint):
  exceldate = -2209161600.0  # 1899-12-30 expressed as seconds before the unix epoch
  timeint = timeint/1000000
  delta = timeint + exceldate
  delta = dt.datetime.fromtimestamp(delta).replace(tzinfo=dt.timezone.utc)
  return delta

# takes a Pandas Timestamp or a DatetimeIndex and turns it into a sierra time integer
def sierraTime(dtg):
  excelDate = (pd.to_datetime('1899-12-30')).tz_localize(tz="utc")
  delta = dtg - excelDate

  if isinstance(dtg, pd.Timestamp):
    delta = int(delta.total_seconds()*1000000)
  else:
    delta = (delta.total_seconds()*1000000).astype('int64')

  return delta

[2024-02-20 16:54:53]
Deiter - Posts: 14
Yeah, I was surprised as well, but I think this is the case, based on three things:
- The page in the SC docs you linked has this section: "When writing tick by tick data to an Intraday data file, where each tick is a single record in the file, and multiple ticks/trades have the same timestamp, then it is acceptable for the Date-Time of the record to be the same timestamp as prior records. What Sierra Chart itself does in this case, is increment the microsecond portion of the Date-Time to create unique timestamps for each trade within a millisecond."
- The Reddit article that first got me thinking about this: https://www.reddit.com/r/algotrading/comments/169ax2x/the_hello_world_guide_to_reading_execution_data/
- and finally my own read of the .scid file. I have attached a screenshot of the raw SCDateTime data that I imported into my dataframe, showing two runs of tick data where the least significant digit/byte of the data is incrementing by one, while the rest of the SCDateTime is not changing.
Attachment: ES tick by tick SCDateTime extract.png - Attached On 2024-02-20 16:53:46 UTC - Size: 109.09 KB
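If that reading is right, the millisecond timestamp and the per-millisecond counter can also be split with plain integer arithmetic rather than a string/regex round trip; a sketch with a made-up 16-digit SCDateTime value:

```python
scdt = 3_944_417_939_123_456  # hypothetical SCDateTime: microseconds since 1899-12-30

# divmod by 1000 yields milliseconds plus the sub-millisecond remainder in one step;
# for tick data that remainder is the per-millisecond trade counter discussed above.
ms, counter = divmod(scdt, 1000)
print(ms, counter)  # 3944417939123 456
```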
[2024-02-20 23:03:17]
sgne - Posts: 109
One thing I noticed: if you use a custom chart bar function loaded on a chart, the extracted datetime follows the time zone of the chart, but when you do a direct extract from the scid file, the time is GMT/UTC.

The C++ commands to use in a {SCCustomChartBarInterfaceRef ChartBarInterface} function could be these:

SCDateTime tTime=ChartBarInterface.NewFileRecord.DateTime;
long long tickTime=reinterpret_cast<long long&>(tTime);
tickTime/=1000; // to get rid of microseconds, which are trade numbers

tTime can be used to get all sorts of data out of the SCDateTime constant reference, e.g.:

double xlTime=tTime.GetAsDouble();


And this is all easily done in ahk as well, with commands like this:

hdr:=56,numBytes:=40
scDateTimePosition:=0
scPricePosition:=12

; scDateTime position (int64) - 0 bytes in
fo.Pos := hdr+(A_Index-1)*numBytes+scDateTimePosition
scDateTime := fo.ReadInt64()//1000 ; get rid of trade number
dtetme := 18991230000000
dtetme += scDateTime//1000,seconds
dte := SubStr(dtetme,1,8)

; scPrice position (float) - 12 bytes in
fo.Pos := hdr+(A_Index-1)*numBytes+scPricePosition
floatPrice := fo.ReadFloat()
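The same direct-seek pattern translates to Python with the struct module; a sketch against an in-memory fake file (hypothetical values, assuming the 56-byte header and 40-byte records as above):

```python
import io
import struct

HDR, REC = 56, 40  # header size and record size, matching the ahk constants

# Fake one-record scid file in memory: 56 zero bytes of header plus one record
# (Time as int64, then Open/High/Low/Close floats, then four volume ints).
rec = struct.pack("<q4f4i", 3944417939123456, 0.0, 18212.0, 18210.0, 18211.0, 1, 1, 0, 1)
buf = io.BytesIO(b"\x00" * HDR + rec)

i = 0  # record index (A_Index - 1 in the ahk version)
buf.seek(HDR + i * REC + 0)                                # scDateTime at offset 0
sc_datetime = struct.unpack("<q", buf.read(8))[0] // 1000  # drop the trade-number digits
buf.seek(HDR + i * REC + 12)                               # float price at offset 12
price = struct.unpack("<f", buf.read(4))[0]
```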
[2024-03-16 20:03:08]
User726340 - Posts: 30
Updated the original function, for users interested in a pandas df.


import sys
from pathlib import Path
import numpy as np
import pandas as pd

def get_scid_df(filename, limitsize=sys.maxsize):
    f = Path(filename)
    assert f.exists(), f"{f} file not found"
    stat = f.stat()
    offset = 56 if stat.st_size < limitsize else stat.st_size - ((limitsize // 40) * 40)
    sciddtype = np.dtype(
        [
            ("Time", "<u8"),
            ("Open", "<f4"),
            ("High", "<f4"),
            ("Low", "<f4"),
            ("Close", "<f4"),
            ("Trades", "<i4"),
            ("Volume", "<i4"),
            ("BidVolume", "<i4"),
            ("AskVolume", "<i4"),
        ]
    )
    df = pd.DataFrame(
        data=np.memmap(f, dtype=sciddtype, offset=offset, mode="r"), copy=False
    )
    df.dropna(inplace=True)
    df["Time"] = df["Time"] - 2209161600000000
    df.drop(df[(df.Time < 1)].index, inplace=True)
    df.set_index("Time", inplace=True)
    df.index = pd.to_datetime(df.index, unit="us")
    df.index = df.index.tz_localize(tz="utc")
    return df


def resample_scdf(scdf_or_filename, period="1Min", tz="UTC", limitsize=sys.maxsize):
    df = (
        get_scid_df(scdf_or_filename, limitsize)
        if isinstance(scdf_or_filename, str)
        else scdf_or_filename
    )
    assert isinstance(
        df, pd.DataFrame
    ), f"{scdf_or_filename} has to be SC df or valid scid file"
    df = df.resample(period).agg(
        {
            "Open": "first",
            "High": "max",
            "Low": "min",
            "Close": "last",
            "Trades": "sum",
            "Volume": "sum",
            "BidVolume": "sum",
            "AskVolume": "sum",
        }
    )
    if tz != "UTC":
        tz = "America/New_York" if tz == "EST" else tz
        tz = "America/Los_Angeles" if tz == "PST" else tz
        df.index = df.index.tz_convert(tz)
    return df

To get a scid 1Min df in EST: print(resample_scdf("ESH24.scid", tz="EST"))
To get a scid 30Min df in EST: print(resample_scdf("ESH24.scid", "30Min", tz="EST"))
Date Time Of Last Edit: 2024-03-17 19:36:01
[2024-05-01 00:24:46]
User656492 - Posts: 157
This is fun stuff. Thank you all for sharing! I just tried the code above for NQM4 and got this result. I think something's amiss with the time format, which causes Open to be a strange value:

print(resample_scdf("NQM4.CME.scid", tz="EST"))

Open High Low Close Trades \
Time
2024-02-25 19:01:00-05:00 0.000000e+00 18223.00 18221.25 18221.25 1
2024-02-25 19:02:00-05:00 -1.999001e+35 18207.75 18202.75 18207.75 2
2024-02-25 19:03:00-05:00 0.000000e+00 18208.50 18202.50 18208.75 4
2024-02-25 19:04:00-05:00 0.000000e+00 18209.00 18200.25 18206.25 10
2024-02-25 19:05:00-05:00 0.000000e+00 18207.00 18199.00 18206.00 6

also, because maybe it helps:

df.describe()

  Open  High  Low  Close  Trades  Volume  BidVolume  AskVolume
count  5.584600e+04  55846.000000  55846.000000  55846.000000  93144.000000  93144.000000  93144.000000  93144.000000
mean  -inf  18176.611328  18170.449219  18173.523438  235.347591  256.042236  128.308200  127.734035
std  inf  360.248291  361.233368  360.750702  571.431255  625.932788  316.699976  313.838125
min  -1.999001e+35  17130.500000  17113.750000  17121.250000  0.000000  0.000000  0.000000  0.000000
25%  0.000000e+00  17912.500000  17908.750000  17910.750000  0.000000  0.000000  0.000000  0.000000
50%  0.000000e+00  18278.250000  18271.750000  18275.000000  21.000000  22.000000  10.000000  10.000000
75%  0.000000e+00  18463.500000  18458.750000  18461.250000  141.000000  154.000000  77.000000  77.000000
max  0.000000e+00  18708.500000  18702.750000  18706.000000  9990.000000  11664.000000  6341.000000  5972.000000

Date Time Of Last Edit: 2024-05-01 01:28:22
[2024-05-01 00:59:48]
rudy5 - Posts: 5
Can you please edit that to put your code output into a code block?
[2024-05-01 01:15:46]
User656492 - Posts: 157
I don't understand the suggestion. ...?
[2024-05-01 02:36:20]
rudy5 - Posts: 5
You did it thanks 👌🏼
[2024-05-05 15:10:40]
User656492 - Posts: 157
<BUMP> I'm using code from above:

import sys
from pathlib import Path
import numpy as np
import pandas as pd


def get_scid_df(filename, limitsize=sys.maxsize):
    f = Path('/Users/Me/SierraPython/NQM4.CME.scid')
    assert f.exists(), f"{f} file not found"
    stat = f.stat()
    offset = 56 if stat.st_size < limitsize else stat.st_size - ((limitsize // 40) * 40)
    sciddtype = np.dtype(
        [
            ("Time", "<u8"),
            ("Open", "<f4"),
            ("High", "<f4"),
            ("Low", "<f4"),
            ("Close", "<f4"),
            ("Trades", "<i4"),
            ("Volume", "<i4"),
            ("BidVolume", "<i4"),
            ("AskVolume", "<i4"),
        ]
    )
    df = pd.DataFrame(
        data=np.memmap(f, dtype=sciddtype, offset=offset, mode="r"), copy=False
    )
    df.dropna(inplace=True)
    df["Time"] = df["Time"] - 2209161600000000
    df.drop(df[(df.Time < 1)].index, inplace=True)
    df.set_index("Time", inplace=True)
    df.index = pd.to_datetime(df.index, unit="us")
    df.index = df.index.tz_localize(tz="utc")
    return df


def resample_scdf(scdf_or_filename, period="1Min", tz="UTC", limitsize=sys.maxsize):
    df = (
        get_scid_df(scdf_or_filename, limitsize)
        if isinstance(scdf_or_filename, str)
        else scdf_or_filename
    )
    assert isinstance(
        df, pd.DataFrame
    ), f"{scdf_or_filename} has to be SC df or valid scid file"
    df = df.resample(period).agg(
        {
            "Open": "first",
            "High": "max",
            "Low": "min",
            "Close": "last",
            "Trades": "sum",
            "Volume": "sum",
            "BidVolume": "sum",
            "AskVolume": "sum",
        }
    )
    if tz != "UTC":
        tz = "America/New_York" if tz == "EST" else tz
        tz = "America/Los_Angeles" if tz == "PST" else tz
        df.index = df.index.tz_convert(tz)
    return df


print(resample_scdf("NQM4.CME.scid", tz="EST"))

And I get this:
Open High Low Close \
Time
2024-02-25 19:01:00-05:00 0.000000e+00 1822300.0 1822125.0 1822125.0
2024-02-25 19:02:00-05:00 -1.999001e+37 1820775.0 1820275.0 1820775.0
2024-02-25 19:03:00-05:00 0.000000e+00 1820850.0 1820250.0 1820875.0
2024-02-25 19:04:00-05:00 0.000000e+00 1820900.0 1820025.0 1820625.0
2024-02-25 19:05:00-05:00 0.000000e+00 1820700.0 1819900.0 1820600.0
... ... ... ... ...
2024-04-30 12:20:00-04:00 0.000000e+00 1777300.0 1776650.0 1777175.0
2024-04-30 12:21:00-04:00 0.000000e+00 1777625.0 1776950.0 1777000.0
2024-04-30 12:22:00-04:00 0.000000e+00 1777350.0 1776950.0 1777150.0
2024-04-30 12:23:00-04:00 -1.999001e+37 1777575.0 1776900.0 1777175.0
2024-04-30 12:24:00-04:00 -1.999001e+37 1777350.0 1777125.0 1777175.0

Trades Volume BidVolume AskVolume
Time
2024-02-25 19:01:00-05:00 1 1 1 0
2024-02-25 19:02:00-05:00 2 2 0 2
2024-02-25 19:03:00-05:00 4 4 0 4
2024-02-25 19:04:00-05:00 10 10 0 10
2024-02-25 19:05:00-05:00 6 6 2 4
... ... ... ... ...
2024-04-30 12:20:00-04:00 869 959 376 583
2024-04-30 12:21:00-04:00 861 1023 430 593
2024-04-30 12:22:00-04:00 462 518 225 293
2024-04-30 12:23:00-04:00 522 548 279 269
2024-04-30 12:24:00-04:00 58 58 18 40

[93144 rows x 8 columns]

As you can see, the Open field is fubar. I think the timestamp is being parsed incorrectly but have no idea how to address it. Any thoughts?
[2024-05-06 18:47:20]
sgne - Posts: 109
I think O is that way because the SCID file is 1-tick, instead of 1-second or greater. In 1-tick files, O is zero, and H and L are bid/ask values. Good luck.
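If that's the cause, one workaround when resampling a 1-tick file is to rebuild OHLC from the Close column only; a sketch on synthetic data (not the poster's file):

```python
import pandas as pd

# Synthetic 1-tick records: Open is 0, High/Low carry ask/bid, Close is the trade price.
idx = pd.to_datetime(["2024-04-30 12:00:00.1", "2024-04-30 12:00:00.2",
                      "2024-04-30 12:00:30.0"], utc=True)
ticks = pd.DataFrame({"Open": [0.0, 0.0, 0.0],
                      "High": [18210.25, 18211.25, 18212.25],  # ask
                      "Low": [18209.75, 18210.75, 18211.75],   # bid
                      "Close": [18210.0, 18211.0, 18212.0],    # trade price
                      "Volume": [1, 2, 1]}, index=idx)

# Aggregate the Close column four ways to recover Open/High/Low/Close.
bars = ticks.resample("1min").agg({"Close": ["first", "max", "min", "last"],
                                   "Volume": "sum"}).dropna()
bars.columns = ["Open", "High", "Low", "Close", "Volume"]
```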
[2024-05-07 21:48:56]
User150671 - Posts: 70
This is explained in the documentation here:
Intraday Data File Format
[2024-08-17 16:42:00]
User150671 - Posts: 70
Okay. So I fucked up my code to convert time to and from SCID files. I just noticed today as I was trying to do some backtesting on things in different timezones, and realised it was always converting from SCID to NYC time.

Fixed Code:

import datetime as dt
import numpy as np

# takes a scid time and returns it as a dt.datetime
def un_sierra_time(scidtime):
  sierra_epoch = dt.datetime(1899, 12, 30, tzinfo=dt.timezone.utc)
  dtg = sierra_epoch + dt.timedelta(microseconds=int(scidtime))
  return dtg

# takes a dt.datetime object and returns it as a np.uint64 for working with scid files
def re_sierra_time(dtg):
  sierra_epoch = dt.datetime(1899, 12, 30, tzinfo=dt.timezone.utc)
  delta = dtg - sierra_epoch
  scidtime = np.uint64(delta.total_seconds()*1000000)
  return scidtime

Explanation:

As per the Intraday File Format at Intraday Data File Format: DateTime The SCID DateTime is:
"...a 64-bit integer. It represents the number of microseconds since the SCDateTime epoch of December 30, 1899."


un_sierra_time is fairly straightforward:
We take the Sierra epoch date and add the SCID DateTime value to it as a dt.timedelta to get a datetime object.

re_sierra_time is also fairly straightforward, we just do the opposite:
We subtract the Sierra epoch from the datetime to get a datetime timedelta object.
That delta object stores total days, seconds, and microseconds, and if you call total_seconds() you'll get a float of seconds with the microseconds as the fraction.
We can then multiply that float by 1000000 and cast it to an int in order to store it in, or compare it with, an SCID time value (in this case I use a numpy uint64 as that's the format I'm working with SCIDs in)
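A round-trip sanity check for the pair above (re-stating the same epoch arithmetic in test form; names match the fixed code):

```python
import datetime as dt
import numpy as np

SIERRA_EPOCH = dt.datetime(1899, 12, 30, tzinfo=dt.timezone.utc)

def re_sierra_time(dtg):
    # datetime -> microseconds since the Sierra epoch, as np.uint64
    return np.uint64((dtg - SIERRA_EPOCH).total_seconds() * 1000000)

def un_sierra_time(scidtime):
    # microseconds since the Sierra epoch -> timezone-aware datetime
    return SIERRA_EPOCH + dt.timedelta(microseconds=int(scidtime))

stamp = dt.datetime(2024, 8, 17, 16, 42, tzinfo=dt.timezone.utc)
round_trip = un_sierra_time(re_sierra_time(stamp))  # equals stamp
```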

Apologies if anyone stole my code and was fucked around by it!
[2026-01-01 00:10:37]
Chad - Posts: 241
To all Python devs in this thread, have you made a function to parse only the data between specific start & end dates within a SCID file? I suppose it should be straightforward enough to filter using the Epoch Date, but any pointers would be appreciated.
[2026-01-01 09:06:00]
User150671 - Posts: 70
Hope this helps, feel free to send me a message if you have any questions

# Scid File Operations
## Imports
# Files / System
import sys
from pathlib import Path
## Data
import numpy as np
import pandas as pd
## Time / Date
import datetime as dt
from zoneinfo import ZoneInfo # needed for timezone conversion if required. Available Zones: print(zoneinfo.available_timezones()) / https://en.wikipedia.org/wiki/List_of_tz_database_time_zones

## function returns scidfile as a dataframe
def get_scid_df(scidfile, limitsize=sys.maxsize):
  f = Path(scidfile)
  assert f.exists(), "SCID file not found"
  # Check // set the size
  stat = f.stat()

  offset = 56 if stat.st_size < limitsize else stat.st_size - (
    (limitsize // 40) * 40)

  # rectype for the DataFrame
  sciddtype = np.dtype([
    ('Time', '<u8'),
    ('Open', '<f4'),
    ('High', '<f4'),
    ('Low', '<f4'),
    ('Close', '<f4'),
    ('Trades','<i4'),
    ('Volume', '<i4'),
    ('BidVolume', '<i4'),
    ('AskVolume', '<i4'),
    ])

  df = pd.DataFrame(data=np.memmap(scidfile, dtype=sciddtype, offset=offset,mode="r"),copy=False)
  
  # Set the index column and convert it to UTC time from Sierra's native time
  df.set_index('Time', inplace=True)
  df.index = pd.to_datetime(df.index - 2209161600000000, unit='us').tz_localize(tz='utc')
  df.sort_index(inplace=True)
  return df

## function returns scidfile as a numpy array - quicker, but less useful in some circumstances
def get_scid_np(scidfile, limitsize=sys.maxsize):
  # Check Path Exists
  f = Path(scidfile)
  assert f.exists(), f"SCID file not found: {scidfile}"
  # Check // set the size
  stat = f.stat()
  offset = 56 if stat.st_size < limitsize else stat.st_size - (
    (limitsize // 40) * 40)

  ### Setup the SCID dtype
  sciddtype = np.dtype([
    ('Time', '<u8'),
    ('Open', '<f4'),
    ('High', '<f4'),
    ('Low', '<f4'),
    ('Close', '<f4'),
    ('Trades','<i4'),
    ('Volume', '<i4'),
    ('BidVolume', '<i4'),
    ('AskVolume', '<i4'),
    ])

  scid = np.fromfile(scidfile, dtype=sciddtype, offset=offset)
  return scid


# takes a scid time and returns it as a dt.datetime
def time_from(sierratime):
  sierra_epoch = dt.datetime(1899, 12, 30, tzinfo=dt.timezone.utc)
  dtg = sierra_epoch + dt.timedelta(microseconds=int(sierratime))
  return dtg

# takes a dt.datetime object and returns it as a np.uint64 for working with scid files
def time_to(dtg):
  sierra_epoch = dt.datetime(1899, 12, 30, tzinfo=dt.timezone.utc)
  delta = dtg - sierra_epoch
  sierratime = np.uint64(delta.total_seconds()*1000000)
  return sierratime


## Example Usage
scidfile = 'c:/sierra/charts/data/ESH26-CME.scid'

## scidnp Operations
scidnp = get_scid_np(scidfile)

# Get the last time from the SCID file
last_sierra_time = scidnp['Time'][-1]
# Convert that into a datetime object
last_dt_time = time_from(last_sierra_time)
# get the date of that object
last_date = last_dt_time.date()

# all three operations in one line:
last_date = time_from(scidnp['Time'][-1]).date()

# Now that we have a date to work on, we need to create datetime objects representing the start/end of the period we are looking at
# I will use a full globex session as an example

# dt.datetime.combine takes a dt date object and a dt time object and converts them to a datetime object
# See documentation here: https://docs.python.org/3/library/datetime.html
# as globex open is on the prior day, we subtract 1 day from the last date using dt.timedelta(days=1)
# as globex open is also in American time, we use a timezone-aware object set to US time by specifying the timezone using tzinfo and ZoneInfo
# it is a good idea to use the correct timezone for the product you are looking at to make sure things are consistent
# e.g. if you are looking at, say, Bund I would use either Berlin or London time: tzinfo=ZoneInfo('Europe/Berlin') / tzinfo=ZoneInfo('Europe/London') - London is still in Europe according to timezones.

globex_open = dt.datetime.combine(last_date-dt.timedelta(days=1),dt.time(18,0),tzinfo=ZoneInfo("America/New_York"))
globex_close = dt.datetime.combine(last_date,dt.time(17,0),tzinfo=ZoneInfo("America/New_York"))

# Now we get every entry in the numpy array where the Time is greater than globex open and less than globex close,
# calling the time_to function to convert the datetime objects to sierra times
window = scidnp[(scidnp['Time'] > time_to(globex_open)) & (scidnp['Time'] < time_to(globex_close))]



## Data Frame Operations
sciddf = get_scid_df(scidfile)

# As our function to get the dataframe converted the Sierra time integer to a DatetimeIndex, we can index quickly and efficiently on the index
# The dataframe index is in UTC time, however you can index in any timezone provided the start and end are in the same timezone
# to get the same window as on the np array we can slice the index as follows:
window = sciddf[globex_open:globex_close]


## Example of how to resample a dataframe into a higher timeframe:
# the agg function is on the Close column instead of the Open/High/Low/Close columns due to the storage format used in scid files:
# Intraday Data File Format: Data Structures Used

df = sciddf.resample('1min', origin='start_day').agg({'Close': ['first','max', 'min', 'last'], 'Trades' : 'sum', 'Volume' : 'sum', 'BidVolume' : 'sum', 'AskVolume' : 'sum'}).dropna()
df.columns = ['Open', 'High', 'Low', 'Close', 'Trades','Volume','BidVolume','AskVolume']

# this is useful if you wanted to index off a specific time as well - but that time may not have had a trade at that exact millisecond.
# you could either iteratively set the window you want for every possible date... or you could just get every instance of that time in the dataframe:

# as an example:
cash_open_index = sciddf.loc[sciddf.index.tz_convert(ZoneInfo("America/New_York")).time == dt.time(9,30)].index # returns 9 instances and takes a while
cash_open_index = df.loc[df.index.tz_convert(ZoneInfo("America/New_York")).time == dt.time(9,30)].index # returns 23 instances very quickly

# we can then either call the dataframe with that index:
df.loc[cash_open_index]

# or we can iterate over that index and do more complex analysis:
for index in cash_open_index:
  time_end = dt.datetime.combine(index.date(),dt.time(9,35),tzinfo=ZoneInfo("America/New_York"))
  window = sciddf.loc[index:time_end]
  # do stuff here
