    1. How many grep-like ops per file?
    2. Is it interactive or run by another process?
    3. Do you know which files ahead of time?
    4. Do you have any control over that file creation?
    5. Is the JSONL append-only? Is the grep running while the file is being modified?
    6. How large is very large? 100s of MB? Few GB? 100s of GB? Whether or not it fits in memory could change the approach.
    7. You said files, plural. Would parallelizing at the file level (e.g. one thread per file) be enough?
    8. How many files are there, and how often is the search executed?
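
To make question 7 concrete, here is a rough sketch of what file-level parallelism could look like in Python. The names `grep_file` and `grep_files` are made up for illustration, and I'm assuming a plain regex match per line is close enough to what you mean by "grep-like". It streams each file line by line, so it should not matter whether a file fits in memory.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def grep_file(path, pattern):
    """Return (line_number, line) pairs for lines in `path` matching `pattern`."""
    regex = re.compile(pattern)
    matches = []
    # Stream line by line so the file never has to fit in memory.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if regex.search(line):
                matches.append((line_number, line.rstrip("\n")))
    return matches


def grep_files(paths, pattern, max_workers=None):
    """Search several files concurrently, one task per file (hypothetical helper)."""
    paths = [Path(p) for p in paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps results in the same order as `paths`.
        results = pool.map(lambda p: grep_file(p, pattern), paths)
        return {str(p): found for p, found in zip(paths, results)}
```

One caveat: in CPython, threads mostly help when the work is I/O-bound. If the regex matching dominates, swapping in `ProcessPoolExecutor` (with a picklable function instead of the lambda) is probably the better fit. Whether either is worth it depends on your answers to the questions above.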