Related tools
The following tools are designed to be used with LLM:
strip-tags
strip-tags is a command for stripping tags from HTML. This is useful when working with LLMs because HTML tags can use up a lot of your token budget.
Here’s how to summarize the front page of the New York Times, by both stripping tags and filtering to just the elements with class="story-wrapper":
curl -s https://www.nytimes.com/ \
| strip-tags .story-wrapper \
| llm -s 'summarize the news'
llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs describes ways to use strip-tags in more detail.
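To get a rough sense of how much a page shrinks once its tags are stripped, one option is to count tokens before and after with ttok, described in the next section. The following is a minimal sketch rather than a feature of strip-tags itself: token_savings is a hypothetical helper name, and it assumes curl, strip-tags and ttok are all installed and on your PATH.

```shell
# token_savings URL SELECTOR
# Hypothetical helper: prints ttok's count for a page before and after
# strip-tags. Assumes curl, strip-tags and ttok are installed.
token_savings() {
  html=$(curl -s "$1") || return 1
  before=$(printf '%s\n' "$html" | ttok)
  after=$(printf '%s\n' "$html" | strip-tags "$2" | ttok)
  echo "before=$before after=$after"
}

# Example call (not run here):
# token_savings https://www.nytimes.com/ .story-wrapper
```

The page is fetched once and reused for both counts, so the two numbers describe the same HTML.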
ttok
ttok is a command-line tool for counting OpenAI tokens. You can use it to check whether input is likely to fit within the token limit for GPT-3.5 or GPT-4:
cat my-file.txt | ttok
125
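Because the count is printed as a plain integer, it can drive a simple check in a script. The sketch below is one way to do that: within_budget is a hypothetical helper name, and the 4000-token budget in the usage comment is an assumed figure, not a documented model limit.

```shell
# within_budget FILE LIMIT
# Hypothetical helper: succeeds when ttok's count for FILE is at most LIMIT.
# Assumes ttok is installed and prints a single integer.
within_budget() {
  count=$(ttok < "$1") || return 2
  [ "$count" -le "$2" ]
}

# Example (not run here); 4000 is an assumed budget:
# if within_budget my-file.txt 4000; then
#   llm -s 'summarize this' < my-file.txt
# fi
```

The helper returns 2 when ttok itself fails, so a caller can tell a missing tool apart from a file that is simply too large.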
It can also truncate input down to a desired number of tokens:
ttok This is too many tokens -t 3
This is too
This is useful for truncating a large document down to a size where it can be processed by an LLM.
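In a pipeline, the truncation step sits between the input and llm. The following sketch wraps that pattern in a function. summarize_truncated is a hypothetical name, the 4000-token default is an assumption to adjust for the model you are using, and both ttok and llm are assumed to be installed.

```shell
# summarize_truncated FILE [MAX_TOKENS]
# Hypothetical helper: truncates FILE with ttok -t, then pipes the result
# to llm with a system prompt. The default of 4000 tokens is an assumption.
summarize_truncated() {
  ttok -t "${2:-4000}" < "$1" | llm -s 'summarize this document'
}

# Example call (not run here):
# summarize_truncated my-file.txt 2000
```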
Symbex
Symbex is a tool for searching for symbols in Python codebases. It’s useful for extracting just the code for a specific problem and then piping that into LLM for explanation, refactoring or other tasks.
Here’s how to use it to find all functions that match test*csv* and use those to guess what the software under test does:
symbex 'test*csv*' | \
llm --system 'based on these tests guess what this tool does'
It can also be used to export symbols in a format that can be piped to llm embed-multi in order to create embeddings:
symbex '*' '*:*' --nl | \
llm embed-multi symbols - \
--format nl --database embeddings.db --store
For more examples see Symbex: search Python code for functions and classes, then pipe them into a LLM.