Writing Natural Language Queries

Intro

bloop is designed from the ground up to support natural language searches. By combining semantic code search with Large Language Models (LLMs) such as GPT-4, bloop lets you ask questions like:

  • Is there a function that does X?
  • How does API endpoint Y work?
  • Show me how the backend and frontend work together to achieve Z

Writing queries for bloop has similar quirks to prompting ChatGPT. Below we've compiled some of the key failure cases and resolutions we've noticed, so that you can get the most out of bloop.

1. Frame queries as where, how, or why questions

bloop is best at finding code and explaining it. Your question should always be a 'where', 'how' or 'why' question.

Where is the analytics key set?
Explain how the router is set up
Why do we import the gpt2 tokeniser library?

Some of the things bloop cannot yet reliably do:

  • Write code
  • Modify existing code
  • Count lines of code
  • Provide exhaustive lists (e.g. list every instance of the log4j vulnerability, or every language used in this repo)
  • Find authors of code
  • Explain how code has changed over time
  • Describe code test coverage
  • Read issues
  • Find bugs

Note: we are working on supporting the cases above and will update this list as functionality becomes reliable.

Until then, queries like the following are unlikely to get a good answer:

How can I improve this code?
Who wrote the query parser?
Add a new endpoint to return a list of recent searches
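
If you want to sanity-check a query before sending it, the guidance in this section can be sketched as a small Python helper. This is only an illustration: `check_query` and its patterns are hypothetical, loosely derived from the lists above, and are not part of bloop or how bloop itself interprets queries.

```python
import re
from typing import List

# Hypothetical patterns based on the "cannot yet reliably do" list above.
# They are rough guesses for illustration, not rules that bloop applies.
UNSUPPORTED_PATTERNS = [
    (r"^(write|add|create|implement)\b", "asks bloop to write code"),
    (r"^(modify|change|refactor|fix)\b|\bimprove\b", "asks bloop to modify code"),
    (r"^who\b|\bauthor", "asks about authorship"),
    (r"\bhow many\b|^count\b", "asks bloop to count"),
    (r"^list (every|all)\b", "asks for an exhaustive list"),
]

QUESTION_WORDS = re.compile(r"\b(where|how|why)\b")


def check_query(query: str) -> List[str]:
    """Return a list of warnings for a query; an empty list means no issue was spotted."""
    q = query.strip().lower()
    warnings = [reason for pattern, reason in UNSUPPORTED_PATTERNS if re.search(pattern, q)]
    if not QUESTION_WORDS.search(q):
        warnings.append("not framed as a where, how or why question")
    return warnings


for example in [
    "Where is the analytics key set?",
    "Who wrote the query parser?",
    "Add a new endpoint to return a list of recent searches",
]:
    print(example, "->", check_query(example) or "looks fine")
```

A keyword check like this is deliberately crude, so treat a warning as a prompt to rephrase rather than a verdict.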

2. Don't give up

If at first you don't get a satisfying answer, try to rephrase your query.

Example A: Break down a complex question into multiple parts

Trying to do too much at once can confuse bloop. For example, this query:

Which backend API responds to the frontend when the user requests code navigation?

could be broken up, starting with an initial question:

Where do we handle user requests for code navigation in the frontend?

And once you're sure bloop knows which part of the frontend flow you're talking about, ask bloop to find the relevant backend:

Which backend API responds to this request?

Example B: Add more information to your query

Adding more context about the subject of your query often improves the chance of finding a match. For example, this query:

How does the code component work?

could be expanded to explain when the user would see the component:

How does the code component which gets shown to the user in a popup work?

3. Be specific

The context limits of LLMs mean that bloop can only look at a small piece of your codebase at a time when answering your question. For very high-level queries like "what does this repo do?" you will probably be better off reading the README.

That said, bloop is good at summarising content, so you might find that "summarise the content in the README" works well if you're trying to get an overall understanding of a project.

For the best results, ask about functions, APIs, components or specific concepts of a project.
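
To see why a context limit favours specific questions, here is a rough Python sketch of packing search results into a fixed token budget. It assumes results arrive ranked by relevance and that tokens can be approximated by counting words. The function `pack_snippets`, the file names and the budget are all hypothetical, and this is not a description of how bloop actually selects code.

```python
from typing import List, Tuple


def pack_snippets(ranked_snippets: List[Tuple[str, str]], budget_tokens: int) -> List[str]:
    """Return the names of the ranked snippets that fit within the token budget."""
    chosen: List[str] = []
    used = 0
    for name, text in ranked_snippets:
        # Crude estimate: one token per whitespace-separated word.
        cost = len(text.split())
        if used + cost <= budget_tokens:
            chosen.append(name)
            used += cost
    return chosen


# Hypothetical search results, most relevant first.
results = [
    ("server/router.rs", "fn " * 60),
    ("server/handlers.rs", "fn " * 50),
    ("README.md", "word " * 30),
]

print(pack_snippets(results, budget_tokens=100))
# prints ['server/router.rs', 'README.md']
```

With a budget of 100, the second file is skipped because it no longer fits. A narrow question tends to need only a few snippets, whereas a question about the whole repo needs far more than any budget allows.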

Example A:

Instead of: What does this repo do?
Try: Summarise how to install this project from the README

Example B:

Instead of: Explain the structure of this codebase
Try: How is the webserver organised?

Example C:

Instead of: Please explain the architecture
Try: What are the key technologies and dependencies of this project?