
Support Board


Date/Time: Thu, 02 May 2024 22:07:55 +0000



Post From: Python for Sierra Chart

[2022-12-10 21:44:27]
User726340 - Posts: 30
Here is an efficient way to load a .scid file into a pandas DataFrame, using numpy memmap.


import sys
from pathlib import Path

import numpy as np
import pandas as pd

# bar data columns
BCOLS = [
    'Time', 'Open', 'High', 'Low', 'Close', 'Volume', 'Trades', 'BidVolume',
    'AskVolume'
]


def get_scid_df(filename, limitsize=sys.maxsize):
    f = Path(filename)
    assert f.exists(), "file not found"
    stat = f.stat()
    # Skip the 56-byte header. With limitsize set, start that many bytes
    # (rounded down to whole 40-byte records) before the end of the file,
    # but never inside the header.
    offset = max(56, stat.st_size - (limitsize // 40) * 40)
    # Field order follows the record layout: Trades (BCOLS[6]) precedes
    # Volume (BCOLS[5]).
    rectype = np.dtype([
        (BCOLS[0], '<u8'), (BCOLS[1], '<f4'), (BCOLS[2], '<f4'),
        (BCOLS[3], '<f4'), (BCOLS[4], '<f4'), (BCOLS[6], '<i4'),
        (BCOLS[5], '<i4'), (BCOLS[7], '<i4'), (BCOLS[8], '<i4')
    ])
    df = pd.DataFrame(data=np.memmap(f, dtype=rectype, offset=offset,
                                     mode="r"),
                      copy=False)
    df.dropna(inplace=True)
    # Shift from the SCID epoch (1899-12-30) to the Unix epoch, in microseconds.
    df["Time"] = df["Time"] - 2209161600000000
    # Drop out-of-range timestamps. The upper bound is hard-coded
    # (2024-01-17 UTC); raise it to keep newer records.
    df.drop(df[(df.Time < 1) | (df.Time > 1705466561000000)].index, inplace=True)
    df.set_index("Time", inplace=True)
    df.index = pd.to_datetime(df.index, unit='us')
    df.index = df.index.tz_localize(tz="utc")
    return df


# Raw string so the backslashes in the Windows path are not treated as escapes.
df = get_scid_df(r'c:\SierraChart\Data\AAPL.scid')
print(df)
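
The constant 2209161600000000 subtracted from Time looks like the gap, in microseconds, between 1899-12-30 (the date SCID timestamps are assumed here to count from) and the Unix epoch. A minimal stdlib sketch to check that arithmetic; the helper name scid_time_to_datetime is made up for illustration and is not part of the code above:

```python
from datetime import datetime, timedelta, timezone

# Assumption: raw SCID timestamps count microseconds from 1899-12-30 00:00 UTC.
SCID_EPOCH = datetime(1899, 12, 30, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Distance between the two epochs, as a whole number of microseconds.
EPOCH_DELTA_US = (UNIX_EPOCH - SCID_EPOCH) // timedelta(microseconds=1)


def scid_time_to_datetime(raw_us):
    """Convert one raw SCID timestamp (hypothetical helper) to an aware datetime."""
    return SCID_EPOCH + timedelta(microseconds=raw_us)


print(EPOCH_DELTA_US)
print(scid_time_to_datetime(EPOCH_DELTA_US))
```

If the assumption holds, the first print should match the constant used in get_scid_df, and the second should be the Unix epoch itself.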



You can also read only part of an SCID file by passing a size in bytes (e.g. read the last 1 MB of data):


get_scid_df(r'c:\SierraChart\Data\AAPL.scid', 1024*1024)
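
The partial read relies on the layout the function assumes: a 56-byte header followed by fixed 40-byte records, so a start offset of 56 plus a whole number of records lands on a record boundary. A rough sketch of that arithmetic, with struct used only to confirm the record size. tail_offset is an illustrative name, the clamp to the header size is an extra guard added in this sketch, and the file is assumed to contain only whole records:

```python
import struct

HEADER_SIZE = 56  # bytes skipped at the start of the file, as in get_scid_df

# One record as described by the dtype in get_scid_df:
# u8 time, four f4 prices, four i4 counters.
RECORD = struct.Struct("<Q4f4i")


def tail_offset(file_size, limitsize):
    """Byte offset at which a read of roughly `limitsize` bytes should start."""
    # Round the requested size down to whole records.
    wanted = (limitsize // RECORD.size) * RECORD.size
    # Never start inside the header, even when the file is smaller than the limit.
    return max(HEADER_SIZE, file_size - wanted)


# Example: a file holding 100000 records, reading about the last 1 MB.
size = HEADER_SIZE + 100_000 * RECORD.size
start = tail_offset(size, 1024 * 1024)
print(RECORD.size, start, (size - start) // RECORD.size)
```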

You can then use pandas resample to get any timeframe. For example, for 1-minute bars:

df_1min = (
    df.resample("1min")
    .agg(
        {
            "Open": "first",
            "High": "max",
            "Low": "min",
            "Close": "last",
            "Volume": "sum",
            "Trades": "sum",
            "BidVolume": "sum",
            "AskVolume": "sum",
        }
    )
    .ffill()
)
print(df_1min)
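
One thing to watch with the ffill step: for a minute with no ticks, the "sum" columns come out as 0 while the OHLC columns are NaN, so forward-filling gives the empty minute the previous bar's full Open/High/Low/Close range. A possible alternative, sketched on made-up tick data, fills empty bars flat at the previous close instead. The sample values and the fill_empty_bars name are invented for illustration:

```python
import pandas as pd

# Made-up ticks: two in the 09:30 minute, none in 09:31, one in 09:32.
idx = pd.to_datetime(
    ["2022-12-09 09:30:05", "2022-12-09 09:30:40", "2022-12-09 09:32:10"],
    utc=True,
)
prices = [10.0, 10.5, 11.0]
ticks = pd.DataFrame(
    {"Open": prices, "High": prices, "Low": prices, "Close": prices,
     "Volume": [1, 2, 3]},
    index=idx,
)

bars = ticks.resample("1min").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last",
     "Volume": "sum"}
)


def fill_empty_bars(bars):
    """Fill bars that had no ticks with a flat bar at the previous close."""
    out = bars.copy()
    out["Close"] = out["Close"].ffill()
    for col in ("Open", "High", "Low"):
        # Only empty bars have NaN here; they take the carried-forward close.
        out[col] = out[col].fillna(out["Close"])
    return out


flat = fill_empty_bars(bars)
print(flat)
```

Whether flat bars or plain forward-filling is the better choice depends on what the resampled data is used for; dropping empty bars altogether is another option.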

Date Time Of Last Edit: 2022-12-10 22:01:46