A fast Rust-based tool to read text-based files in a repository or directory, chunk them, and serialize them for LLM consumption. By default, the tool:
- Uses .gitignore rules to skip unwanted files.
- Uses the Git history to infer what files are important.
- Infers additional ignore patterns (binary, large, etc.).
- Splits content into chunks based on either approximate “token” count or byte size.
- Automatically detects if output is being piped and streams content instead of writing to files.
- Supports processing multiple directories in a single command.
- Configurable via a yek.toml file.
Yek (يک) means “One” in Farsi/Persian.
Consider having a simple repo like this:
.
├── README.md
├── src
│   ├── main.rs
│   └── utils.rs
└── tests
    └── test.rs
Running yek in this directory will produce a single file and write it to the temp directory with the following content:
>>>> README.md
... content of README.md ...
>>>> tests/test.rs
... content of tests/test.rs ...
>>>> src/utils.rs
... content of src/utils.rs ...
>>>> src/main.rs
... rest of the file ...
yek orders the output so that more important files come last, which is useful for LLM consumption.
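The ordering and the `>>>>` separators described above can be sketched in a few lines of Rust. This is only an illustration and not yek's actual implementation: the `FileEntry` type, its `priority` field, and the `serialize` function are hypothetical names introduced here.

```rust
/// A file with its relative path, its content, and a priority score.
/// `FileEntry` and `priority` are hypothetical; higher means more important.
#[derive(Debug, Clone)]
struct FileEntry {
    path: String,
    content: String,
    priority: i32,
}

/// Serialize entries so that the most important files come last.
fn serialize(mut entries: Vec<FileEntry>) -> String {
    // Stable ascending sort: low priority first, high priority last.
    entries.sort_by_key(|e| e.priority);

    let mut out = String::new();
    for entry in &entries {
        // Each file starts with a ">>>> path" header line.
        out.push_str(">>>> ");
        out.push_str(&entry.path);
        out.push('\n');
        out.push_str(&entry.content);
        out.push('\n');
    }
    out
}
```

Because the sort is stable, files with equal priority keep the order in which they were collected.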
Via Homebrew (recommended for macOS):
brew tap bodo-run/yek https://github.com/bodo-run/yek.git
brew install yek
For Unix-like systems (macOS, Linux):
curl -fsSL https://bodo.run/yek.sh | bash
For Windows (PowerShell):
irm https://bodo.run/yek.ps1 | iex
- Install Rust.
- Clone this repository.
- Run make macos or make linux to build for your platform (both run cargo build --release).
- Add to your PATH:
export PATH=$(pwd)/target/release:$PATH
yek has sensible defaults: you can simply run yek in a directory to serialize the entire repository. By default, it serializes all files in the repository into chunks of 10MB. The output file is written to the temp directory, and its path is printed to the console.
Process current directory and write to temp directory:
yek
Pipe output to clipboard (macOS):
yek | pbcopy
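Detecting whether output is being piped can be done with the standard library's `IsTerminal` trait (Rust 1.70 or later). The sketch below shows one way to make that decision; `OutputMode`, `choose_output_mode`, and `emit` are hypothetical names, and yek itself may do this differently.

```rust
use std::io::{self, IsTerminal, Write};

/// Where the serialized content should go (hypothetical type).
#[derive(Debug, PartialEq)]
enum OutputMode {
    /// stdout is a pipe or file: stream the content directly.
    Stream,
    /// stdout is an interactive terminal: write chunk files instead.
    WriteFiles,
}

/// Pure decision function, kept separate so it is easy to test.
fn choose_output_mode(stdout_is_terminal: bool) -> OutputMode {
    if stdout_is_terminal {
        OutputMode::WriteFiles
    } else {
        OutputMode::Stream
    }
}

/// Emit the content according to the detected mode.
#[allow(dead_code)]
fn emit(serialized: &str) -> io::Result<()> {
    match choose_output_mode(io::stdout().is_terminal()) {
        OutputMode::Stream => io::stdout().lock().write_all(serialized.as_bytes()),
        OutputMode::WriteFiles => {
            // A real tool would write chunk files here; this sketch only reports the decision.
            eprintln!("would write {} bytes to a file", serialized.len());
            Ok(())
        }
    }
}
```

Keeping the terminal check outside `choose_output_mode` means the decision logic can be tested without a real terminal.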
Cap the max size to 128K tokens and only process the src directory:
yek --max-size 128000 --tokens src/
Cap the max size to 100KB and only process the src directory, writing to a specific directory:
yek --max-size 100KB --output-dir /tmp/yek src/
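To make the size cap concrete, here is a rough sketch of how a value such as 100KB could be parsed and then used to group files into chunks. It is not taken from yek's source: `parse_size` and `pack_chunks` are hypothetical helpers, the sketch assumes binary multiples (1KB = 1024 bytes), and it packs files greedily in order, none of which is confirmed by this document.

```rust
/// Parse strings such as "10MB", "128KB", or "1GB" into a byte count.
/// Assumption: binary multiples (1KB = 1024 bytes); the real tool may differ.
/// A bare number such as "128000" is returned unchanged.
fn parse_size(s: &str) -> Option<u64> {
    let s = s.trim().to_ascii_uppercase();
    let (digits, multiplier) = if let Some(n) = s.strip_suffix("GB") {
        (n, 1024 * 1024 * 1024)
    } else if let Some(n) = s.strip_suffix("MB") {
        (n, 1024 * 1024)
    } else if let Some(n) = s.strip_suffix("KB") {
        (n, 1024)
    } else {
        (s.as_str(), 1)
    };
    digits.trim().parse::<u64>().ok()?.checked_mul(multiplier)
}

/// Greedily group file sizes into chunks whose total stays within `max_bytes`.
/// Returns the file indices in each chunk. A file larger than `max_bytes`
/// gets a chunk of its own in this sketch.
fn pack_chunks(sizes: &[u64], max_bytes: u64) -> Vec<Vec<usize>> {
    let mut chunks: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut used = 0u64;

    for (index, &size) in sizes.iter().enumerate() {
        // Close the current chunk if adding this file would exceed the cap.
        if !current.is_empty() && used + size > max_bytes {
            chunks.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(index);
        used += size;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

With --tokens, the same packing would apply with an approximate token count per file in place of a byte size.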
Process multiple directories:
yek src/ tests/
Process multiple repositories:
yek ~/code/project1 ~/code/project2
yek --help
Repository content chunker and serializer for LLM consumption

Usage: yek [OPTIONS] [directories]...

Arguments:
  [directories]...  Directories to process [default: .]

Options:
      --max-size <max-size>      Maximum size per chunk (e.g. '10MB', '128KB', '1GB') [default: 10MB]
      --tokens                   Count size in tokens instead of bytes
      --debug                    Enable debug output
      --output-dir <output-dir>  Output directory for chunks
  -h, --help                     Print help
  -V, --version                  Print version
You can place a file called yek.toml at your project root or pass a custom path via --config. The configuration file allows you to:
- Add custom ignore patterns
- Define file priority rules for processing order
- Add additional binary file extensions to ignore (extends the built-in list)
- Configure Git-based priority boost
The configuration file is optional. If you use one, place yek.toml at the root of your project.
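As a rough illustration of how file priority rules might translate into a score per file, consider the sketch below. It makes two simplifying assumptions that are not confirmed by this document: rules are matched with plain path prefixes instead of the regex patterns used in yek.toml, and a file takes the highest score among the rules it matches. `PriorityRule` and `score_for` are hypothetical names.

```rust
/// A simplified priority rule. `prefix` stands in for the regex
/// patterns of a real yek.toml (for example "^src/lib/").
struct PriorityRule {
    score: i32,
    prefix: &'static str,
}

/// Return the highest score among matching rules, or 0 when none match.
/// Assumption: "highest match wins"; the real tool may combine scores differently.
fn score_for(path: &str, rules: &[PriorityRule]) -> i32 {
    rules
        .iter()
        .filter(|rule| path.starts_with(rule.prefix))
        .map(|rule| rule.score)
        .max()
        .unwrap_or(0)
}
```

With rules for src/lib/ (score 100) and src/ (score 90), a file under src/lib/ matches both and takes the higher score, while a file outside src/ falls back to 0.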
# Add patterns to ignore (in addition to .gitignore)
[ignore_patterns]
patterns=[
  "node_modules/",
  "\\.next/",
  "my_custom_folder/"
]

# Configure Git-based priority boost (optional)
git_boost_max=50 # Maximum score boost based on Git history (default: 100)

# Define priority rules for processing order
# Higher scores are processed first
[[priority_rules]]
score=100
patterns=["^src/lib/"]

[[priority_rules]]
score=90
patterns=["^src/"]

[[priority_rules]]
score=80
patterns=["^