kabukit.utils.cache
module kabukit.utils.cache
Functions
glob(source: str | None = None, group: str | None = None) → Iterator[Path]
Glob parquet files in the cache directory.
Parameters
source : str | None — The name of the cache subdirectory (e.g., "jquants", "edinet").
group : str | None — The name of the cache group within the source subdirectory (e.g., "info", "statements"). If None, all *.parquet files are globbed recursively.
Returns
Iterator[Path] — An iterator of Path objects for the matched parquet files.
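
The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not kabukit's implementation: the helper name `glob_cache`, the explicit `cache_dir` argument, and the layout `<cache_dir>/<source>/<group>/<name>.parquet` are assumptions made for the example.

```python
from __future__ import annotations

import tempfile
from collections.abc import Iterator
from pathlib import Path


def glob_cache(
    cache_dir: Path,
    source: str | None = None,
    group: str | None = None,
) -> Iterator[Path]:
    """Hypothetical stand-in for glob(); the directory layout is assumed."""
    if source is None:
        # No source given: search the whole cache tree recursively.
        yield from cache_dir.rglob("*.parquet")
    elif group is None:
        # Source only: search every group under that source recursively.
        yield from (cache_dir / source).rglob("*.parquet")
    else:
        # Source and group: only files directly inside that group.
        yield from (cache_dir / source / group).glob("*.parquet")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a small fake cache with empty placeholder files.
    for rel in (
        "jquants/info/a.parquet",
        "jquants/statements/b.parquet",
        "edinet/info/c.parquet",
    ):
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    all_files = sorted(p.name for p in glob_cache(root))
    jquants_files = sorted(p.name for p in glob_cache(root, "jquants"))
    info_files = sorted(p.name for p in glob_cache(root, "jquants", "info"))
```

In this sketch a `group` passed without a `source` is ignored; how the real function treats that combination is not stated here.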
read(source: str, group: str, name: str | None = None) → pl.DataFrame
Read a polars.DataFrame directly from the cache.
Parameters
source : str — The name of the cache subdirectory (e.g., "jquants", "edinet").
group : str — The name of the cache group within the source subdirectory (e.g., "info", "statements").
name : str | None — Optional. A specific filename (without extension) within the cache group. If None, the latest file in the subdirectory is read.
Returns
polars.DataFrame — The DataFrame read from the cache.
Raises
FileNotFoundError — If no data is found in the cache.
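
The file selection described above might look roughly like the following. This is a sketch under stated assumptions rather than the library's code: `resolve_cache_file` and its `cache_dir` argument are hypothetical, and "latest" is taken to mean the lexicographically greatest filename, which only matches chronological order when names are fixed-width timestamps.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_cache_file(
    cache_dir: Path,
    source: str,
    group: str,
    name: str | None = None,
) -> Path:
    """Hypothetical helper: pick the file that a read() call would load."""
    data_dir = cache_dir / source / group
    if name is not None:
        path = data_dir / f"{name}.parquet"
        if not path.exists():
            raise FileNotFoundError(f"File not found: {path}")
        return path
    # Assumption: the greatest filename is the most recent one.
    candidates = sorted(data_dir.glob("*.parquet")) if data_dir.is_dir() else []
    if not candidates:
        raise FileNotFoundError(f"No data found in {data_dir}")
    return candidates[-1]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    group_dir = root / "jquants" / "info"
    group_dir.mkdir(parents=True)
    for stem in ("20240101", "20240301", "20240215"):
        (group_dir / f"{stem}.parquet").touch()

    latest_name = resolve_cache_file(root, "jquants", "info").name
    named_name = resolve_cache_file(root, "jquants", "info", "20240101").name

    # An empty or missing group raises FileNotFoundError.
    try:
        resolve_cache_file(root, "edinet", "statements")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

With a resolved path in hand, the actual load would be a single `polars.read_parquet(path)` call.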
write(source: str, group: str, df: pl.DataFrame, name: str | None = None) → Path
Write a polars.DataFrame directly to the cache.
Parameters
source : str — The name of the cache subdirectory (e.g., "jquants", "edinet").
group : str — The name of the cache group within the source subdirectory (e.g., "info", "statements").
df : pl.DataFrame — The polars.DataFrame to write.
name : str | None — Optional. The filename (without extension) for the parquet file. If None, a timestamp is used as the filename.
Returns
Path — The path to the written Parquet file.
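
How the target path could be derived is sketched below. The helper `build_cache_path`, the `cache_dir` and `now` arguments, and the `%Y%m%d%H%M%S` timestamp format are all assumptions for illustration; the format kabukit actually uses is not given here.

```python
from __future__ import annotations

import tempfile
from datetime import datetime
from pathlib import Path


def build_cache_path(
    cache_dir: Path,
    source: str,
    group: str,
    name: str | None = None,
    now: datetime | None = None,
) -> Path:
    """Hypothetical helper: compute where a write() call would store its file."""
    data_dir = cache_dir / source / group
    data_dir.mkdir(parents=True, exist_ok=True)
    if name is None:
        # Assumed timestamp format; fixed width keeps names sortable by time.
        name = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    return data_dir / f"{name}.parquet"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fixed = datetime(2024, 1, 2, 3, 4, 5)

    stamped = build_cache_path(root, "jquants", "info", now=fixed)
    named = build_cache_path(root, "jquants", "info", name="snapshot")

    stamped_rel = stamped.relative_to(root).as_posix()
    named_rel = named.relative_to(root).as_posix()
    group_dir_created = stamped.parent.is_dir()
    # A real write would follow with: df.write_parquet(stamped)
```

Passing `now` explicitly keeps the example deterministic; in normal use it would default to the current time.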
clean(source: str | None = None, group: str | None = None) → None
Remove the entire cache directory or a specified cache group.
Parameters
source : str | None, optional — The name of the cache subdirectory (e.g., "jquants", "edinet") to remove.
group : str | None, optional — The name of the cache group within the source subdirectory (e.g., "info", "statements") to remove. If None, the entire cache directory is removed.
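
One plausible way to scope the removal is sketched below. `clean_cache` and its `cache_dir` argument are hypothetical. The sketch also assumes that passing a `source` without a `group` removes only that source's directory, a case the description above does not spell out.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def clean_cache(
    cache_dir: Path,
    source: str | None = None,
    group: str | None = None,
) -> None:
    """Hypothetical stand-in for clean(); the scoping rules are assumed."""
    if source is None:
        target = cache_dir  # whole cache
    elif group is None:
        target = cache_dir / source  # assumed: one source, all of its groups
    else:
        target = cache_dir / source / group  # a single cache group
    if target.exists():
        shutil.rmtree(target)


def remaining(root: Path) -> list[str]:
    """List the cached files left under root, relative to it."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.parquet"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "cache"
    for rel in (
        "jquants/info/a.parquet",
        "jquants/statements/b.parquet",
        "edinet/info/c.parquet",
    ):
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    clean_cache(root, "jquants", "info")
    after_group = remaining(root)

    clean_cache(root, "jquants")
    after_source = remaining(root)

    clean_cache(root)
    cache_exists = root.exists()
```

Checking `target.exists()` first makes the call a no-op when the cache has already been removed, instead of raising.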