kabukit.utils.cache

module kabukit.utils.cache

Functions

  • glob Glob parquet files in the cache directory.

  • read Read a polars.DataFrame directly from the cache.

  • write Write a polars.DataFrame directly to the cache.

  • clean Remove the entire cache directory or a specified cache group.

glob(source: str | None = None, group: str | None = None) -> Iterator[Path]

Glob parquet files in the cache directory.

Parameters

  • source : str | None The name of the cache subdirectory (e.g., "jquants", "edinet").

  • group : str | None The name of the cache group (e.g., "info", "statements"). If None, all *.parquet files are globbed recursively.

Returns

  • Iterator[Path] An iterator of Path objects for the matched parquet files.
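
The lookup described above can be sketched with the standard library alone. This is not kabukit's implementation: `glob_cache` and its `cache_dir` argument are hypothetical, and the `<cache_dir>/<source>/<group>/<name>.parquet` layout is an assumption inferred from the parameter descriptions.

```python
from __future__ import annotations

from collections.abc import Iterator
from pathlib import Path


def glob_cache(
    cache_dir: Path,
    source: str | None = None,
    group: str | None = None,
) -> Iterator[Path]:
    """Hypothetical stand-in for glob(); the directory layout is assumed."""
    if source is None:
        # No source given: search the whole cache directory recursively.
        return cache_dir.rglob("*.parquet")
    if group is None:
        # Source only: search every group below that source recursively.
        return (cache_dir / source).rglob("*.parquet")
    # Source and group: match only the files directly inside that group.
    return (cache_dir / source / group).glob("*.parquet")
```

In this sketch a group passed without a source is ignored; the reference above does not say how the real function treats that combination.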

read(source: str, group: str, name: str | None = None) -> pl.DataFrame

Read a polars.DataFrame directly from the cache.

Parameters

  • source : str The name of the cache subdirectory (e.g., "jquants", "edinet").

  • group : str The name of the cache group (e.g., "info", "statements").

  • name : str | None Optional. A specific filename (without extension) within the cache group. If None, the latest file in the subdirectory is read.

Returns

  • polars.DataFrame The DataFrame read from the cache.

Raises

  • FileNotFoundError If no data is found in the cache.
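
A rough sketch of how the file to read might be resolved, including the "latest file" fallback and the FileNotFoundError case. `resolve_cache_file` and `cache_dir` are hypothetical names, the directory layout is assumed, and "latest" is taken here to mean the lexicographically last filename, which only matches chronological order if files are named by sortable timestamps.

```python
from __future__ import annotations

from pathlib import Path


def resolve_cache_file(
    cache_dir: Path,
    source: str,
    group: str,
    name: str | None = None,
) -> Path:
    """Hypothetical helper: pick the parquet file that read() would load."""
    group_dir = cache_dir / source / group
    if name is not None:
        # A specific file was requested; it must exist.
        path = group_dir / f"{name}.parquet"
        if not path.exists():
            raise FileNotFoundError(f"No data found at {path}")
        return path
    # No name given: assume the last filename in sorted order is the latest.
    candidates = sorted(group_dir.glob("*.parquet"))
    if not candidates:
        raise FileNotFoundError(f"No data found in {group_dir}")
    return candidates[-1]
```

The resolved path could then be passed to `polars.read_parquet` to obtain the DataFrame.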

write(source: str, group: str, df: pl.DataFrame, name: str | None = None) -> Path

Write a polars.DataFrame directly to the cache.

Parameters

  • source : str The name of the cache subdirectory (e.g., "jquants", "edinet").

  • group : str The name of the cache group (e.g., "info", "statements").

  • df : pl.DataFrame The polars.DataFrame to write.

  • name : str | None Optional. The filename (without extension) for the parquet file. If None, a timestamp is used as the filename.

Returns

  • Path The path to the written Parquet file.
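
The timestamp fallback for `name` might look roughly like the following. `cache_path_for_write`, `cache_dir`, and the `now` parameter are hypothetical, and the `%Y%m%d%H%M%S` format is an assumption; the reference above only says that a timestamp is used.

```python
from __future__ import annotations

import datetime
from pathlib import Path


def cache_path_for_write(
    cache_dir: Path,
    source: str,
    group: str,
    name: str | None = None,
    now: datetime.datetime | None = None,
) -> Path:
    """Hypothetical helper: build the target path that write() would use."""
    if name is None:
        # Assumed format: a sortable timestamp, so newer files sort last.
        now = now or datetime.datetime.now()
        name = now.strftime("%Y%m%d%H%M%S")
    path = cache_dir / source / group / f"{name}.parquet"
    # Create the group directory on first use.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

A DataFrame could then be saved with `df.write_parquet(path)`. A sortable timestamp keeps filename order aligned with write order, which suits the "latest file" lookup described for read.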

clean(source: str | None = None, group: str | None = None) -> None

Remove the entire cache directory or a specified cache group.

Parameters

  • source : str | None, optional The name of the cache subdirectory (e.g., "jquants", "edinet") to remove.

  • group : str | None, optional The name of the cache group (e.g., "info", "statements") to remove. If None, the entire cache directory is removed.
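
One plausible reading of the removal logic, sketched with `shutil.rmtree`. `clean_cache` and `cache_dir` are hypothetical, the directory layout is assumed, and the sketch further assumes that passing only a source removes just that source's directory; the description above does not spell out that case.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def clean_cache(
    cache_dir: Path,
    source: str | None = None,
    group: str | None = None,
) -> None:
    """Hypothetical stand-in for clean(); scoping rules are assumed."""
    if source is None:
        target = cache_dir  # nothing specified: remove the whole cache
    elif group is None:
        target = cache_dir / source  # assumption: remove one source only
    else:
        target = cache_dir / source / group  # remove a single cache group
    if target.exists():
        # Delete the directory and everything below it.
        shutil.rmtree(target)
```

Checking `target.exists()` first makes the call a no-op when the directory has already been removed.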